WO2021000708A1 - Fitness teaching method and apparatus, electronic device and storage medium - Google Patents
Fitness teaching method and apparatus, electronic device and storage medium
- Publication number
- WO2021000708A1 PCT/CN2020/095369 CN2020095369W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- video frame
- coach
- action
- video
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 60
- 230000009471 action Effects 0.000 claims abstract description 340
- 239000013598 vector Substances 0.000 claims description 163
- 238000011156 evaluation Methods 0.000 claims description 108
- 238000012545 processing Methods 0.000 claims description 21
- 238000004364 calculation method Methods 0.000 claims description 11
- 230000033001 locomotion Effects 0.000 claims description 9
- 230000002194 synthesizing effect Effects 0.000 claims description 7
- 230000011218 segmentation Effects 0.000 claims description 6
- 230000015572 biosynthetic process Effects 0.000 claims description 4
- 238000003786 synthesis reaction Methods 0.000 claims description 4
- 238000001514 detection method Methods 0.000 claims description 3
- 238000004590 computer program Methods 0.000 claims description 2
- 230000000977 initiatory effect Effects 0.000 claims description 2
- 230000000875 corresponding effect Effects 0.000 description 112
- 238000005516 engineering process Methods 0.000 description 17
- 238000010586 diagram Methods 0.000 description 14
- 238000004891 communication Methods 0.000 description 10
- 230000008569 process Effects 0.000 description 7
- 230000000007 visual effect Effects 0.000 description 7
- 230000003993 interaction Effects 0.000 description 6
- 230000003190 augmentative effect Effects 0.000 description 4
- 230000009286 beneficial effect Effects 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 230000006872 improvement Effects 0.000 description 4
- 230000003287 optical effect Effects 0.000 description 4
- 230000005236 sound signal Effects 0.000 description 4
- 230000005540 biological transmission Effects 0.000 description 3
- 238000007726 management method Methods 0.000 description 3
- 230000004044 response Effects 0.000 description 3
- 230000001133 acceleration Effects 0.000 description 2
- 230000003044 adaptive effect Effects 0.000 description 2
- 239000002131 composite material Substances 0.000 description 2
- 230000002708 enhancing effect Effects 0.000 description 2
- 230000002452 interceptive effect Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 230000008450 motivation Effects 0.000 description 2
- 230000002093 peripheral effect Effects 0.000 description 2
- 230000008859 change Effects 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 238000013500 data storage Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 239000004973 liquid crystal related substance Substances 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 230000033764 rhythmic process Effects 0.000 description 1
- 230000003068 static effect Effects 0.000 description 1
- 238000005728 strengthening Methods 0.000 description 1
- 238000012549 training Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/42—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B5/00—Electrically-operated educational appliances
- G09B5/06—Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
- G09B5/065—Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23424—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/25—Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
- H04N21/254—Management at additional data server, e.g. shopping server, rights management server
- H04N21/2543—Billing, e.g. for subscription services
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/472—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
Definitions
- the present disclosure relates to the field of augmented reality technology, and in particular to a fitness teaching method, device, electronic equipment, and computer-readable storage medium.
- a fitness teaching method including:
- the action video frame includes a video frame generated when the user moves;
- the learning position is the display position of the user action in the coaching video frame
- the coach video frame is a video frame that includes a coach demonstration action; the method further includes: acquiring a coach video frame corresponding to the user's action video frame, wherein the time of acquiring the user's action video frame and the time of playing the coach video frame are the same or within a preset time difference; and comparing the user action in the user's action video frame with the coach demonstration action in the coach video frame to generate an evaluation score that measures the user action.
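The pairing of an action video frame with a coach video frame "within a preset time difference" can be illustrated with a minimal sketch. The function name `match_coach_frame`, the sorted list of play times, and the 0.1 s default are assumptions made for illustration; the source does not specify how the matching is implemented.

```python
from bisect import bisect_left
from typing import Optional, Sequence


def match_coach_frame(capture_time: float,
                      coach_play_times: Sequence[float],
                      max_diff: float = 0.1) -> Optional[int]:
    """Return the index of the coach frame played closest to capture_time.

    coach_play_times must be sorted in ascending order (seconds).
    Returns None when even the closest frame is more than max_diff away.
    """
    if not coach_play_times:
        return None
    i = bisect_left(coach_play_times, capture_time)
    # Only the neighbours of the insertion point can be the closest frames.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(coach_play_times)]
    best = min(candidates,
               key=lambda j: abs(coach_play_times[j] - capture_time))
    if abs(coach_play_times[best] - capture_time) <= max_diff:
        return best
    return None
```

A frame captured at 0.05 s would be paired with the coach frame played at 0.04 s, while a capture far outside the tolerance yields no pairing and could simply be skipped for scoring.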
- the comparing the user action in the user's action video frame with the coach demonstration action in the coaching video frame, and generating an evaluation score that measures the user action includes: obtaining the coordinate data of the user's skeleton feature points from the user's action video frame as user action data, and obtaining the coordinate data of the coach's skeleton feature points from the coach video frame as coach demonstration action data; and generating an evaluation score that measures the user action according to the difference between the user action data and the coach demonstration action data.
- the generating an evaluation score for measuring the user action according to the difference between the user action data and the coach demonstration action data includes: obtaining at least one user vector angle based on the user action data, and obtaining at least one standard vector angle based on the coach demonstration action data; wherein the user vector angle is the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points corresponding to the user, and the standard vector angle is the angle between the two vectors formed by the coordinate data of any three adjacent skeleton feature points corresponding to the coach; and determining a similarity parameter according to the at least one user vector angle and the corresponding standard vector angle, so as to determine the evaluation score corresponding to the user action; the similarity parameter includes at least a standard deviation result or a variance result of the at least one user vector angle and the corresponding standard vector angle.
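The vector-angle comparison can be sketched as follows, under stated assumptions. The sketch uses 2D coordinates and keys the skeleton feature points by hypothetical joint names. It reads "standard deviation or variance" as the deviation of each user angle about the corresponding coach angle, which is one possible reading of the text. How the similarity parameter is mapped to an evaluation score is not specified in the source and is omitted here.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]
Triple = Tuple[str, str, str]


def vector_angle(a: Point, b: Point, c: Point) -> float:
    """Angle in degrees at point b between the vectors b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("coincident skeleton feature points")
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def angle_deviation(user: Dict[str, Point],
                    coach: Dict[str, Point],
                    triples: Sequence[Triple]) -> Tuple[float, float]:
    """Return (variance, standard deviation) of the user vector angles
    about the corresponding standard (coach) vector angles."""
    if not triples:
        raise ValueError("at least one triple of adjacent points is required")
    squared = [
        (vector_angle(*(user[k] for k in t))
         - vector_angle(*(coach[k] for k in t))) ** 2
        for t in triples
    ]
    variance = sum(squared) / len(squared)
    return variance, math.sqrt(variance)
```

Because angles rather than raw coordinates are compared, the result does not depend on where the user stands in the frame or on body size, which is presumably why the method works with angles between adjacent skeleton segments.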
- the determining the similarity parameter according to the included angle of the at least one user vector and the corresponding standard vector includes:
- the method further includes: synthesizing the evaluation score into the enhanced coach video frame.
- the synthesizing the evaluation score into the enhanced coaching video frame includes: if the same coaching video frame is played for multiple users at the same time and evaluation scores are correspondingly generated based on the action video frames of the multiple users, synthesizing the multiple evaluation scores into the enhanced coaching video frame based on a ranking of the evaluation scores by magnitude.
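Ranking the evaluation scores before overlaying them can be sketched in a few lines. The function `rank_scores` and the tie-breaking by user identifier are illustrative assumptions; the source only states that the scores are combined based on their ranking by size.

```python
from typing import Dict, List, Tuple


def rank_scores(scores: Dict[str, float]) -> List[Tuple[int, str, float]]:
    """Return (rank, user_id, score) rows, highest score first.

    Ties are broken by user_id so that the order is deterministic.
    """
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, user, score)
            for rank, (user, score) in enumerate(ordered, start=1)]
```

The resulting rows could then be drawn onto the enhanced coaching video frame as a simple leaderboard.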
- the method further includes: receiving a selection request sent by a user; the selection request includes at least one of the following: an identification of the coaching video selected by the user or an identification of a learning location selected by the user.
- the coach video is a video that includes a coach demonstration action; the coach video includes a template coach video and an enhanced coach video, wherein the template coach video has no user action video frames synthesized into it, and the enhanced coach video has one or more user action video frames pre-synthesized into it.
- the acquiring the coach video frame played for the user includes: acquiring the coach video frame played for the user based on the start time and current time of playing the coach video for the user.
- each coaching video includes one or more learning positions; before said generating an enhanced coaching video frame, the method further includes: if the same coaching video is played for multiple users at the same time and there is an unselected learning position in the coaching video, selecting a user for the unselected learning position from the users who have not selected a learning position, receiving the action video frame of the selected user, and synthesizing the user action of the selected user to the unselected learning position in the coach video.
- the user corresponds to a user account
- the learning location corresponds to asset information
- the fitness teaching method further includes: acquiring asset information corresponding to the learning location selected by the user, and initiating an asset processing operation on the user account according to the asset information.
- the obtaining the user's action video frame includes: detecting a human body in a video frame shot by a camera; and, based on the detected human body, performing background segmentation on the video frame to extract the human body, and generating an action video frame that includes only the human body action.
- a fitness teaching device including:
- An action video frame acquisition module configured to acquire a user's action video frame; the action video frame includes a video frame generated when the user moves;
- An acquiring module configured to acquire the coaching video frame played for the user and the learning position corresponding to the user; the learning position is the display position of the user's action in the coaching video frame;
- An enhanced coaching video frame generation module configured to synthesize user actions in the user's action video frames to corresponding learning positions in the coaching video frames played for the user to generate enhanced coaching video frames;
- the enhanced coach video frame playback unit is used to play the enhanced coach video frame for the user.
- an electronic device including:
- a memory for storing processor executable instructions
- the processor is configured to perform the operations in the method described above.
- a computer-readable storage medium having a computer program stored thereon, which when executed by one or more processors, causes the processor to perform the operations in the method described above.
- the user action in the user's action video frame can be synthesized to the corresponding learning location in the coach video frame played for the user according to the user's corresponding learning position, thereby generating an enhanced coaching video frame.
- the present disclosure is based on AR (visual augmented reality) technology, which allows users to see their own actions and the actions of the coach at the same time, so that users can correct their actions by comparison; this increases interactivity and improves the user's motivation for exercise.
- users can freely select fitness videos and learning locations based on their own needs, thereby generating a selection request.
- the selection request includes the identification of the coach video selected by the user or the identification of the learning location selected by the user, which helps improve the user experience.
- the server may also select, from users who have not selected a learning position, a user for the unselected learning position, so as to receive the action video frame sent by the selected user and synthesize the user action of the selected user to the unselected learning position in the coach video, so that users who have not chosen a learning position can also see their own actions, which is conducive to improving the user experience.
- the user action in the action video frame can also be compared with the coach's demonstration action in the coach video frame to generate an evaluation score that measures the user action; the evaluation score is sent to the user terminal and synthesized into the enhanced coaching video frame, so that the user can clearly judge how standard his or her own actions are, which improves the user experience.
- the multiple evaluation scores are synthesized into the enhanced coaching video frame based on the ranking result of the evaluation scores, which allows users to see not only their own scores but also the scores of other users, enhancing the interactivity of fitness.
- the user corresponds to a user account
- the learning location also corresponds to asset information.
- an asset processing operation can be initiated on the user account based on the asset information, providing corresponding economic value for businesses.
- Fig. 1 is a structural diagram of a fitness teaching system according to an exemplary embodiment of the present disclosure.
- Fig. 2 is a schematic diagram showing one frame of a template coach video according to an exemplary embodiment of the present disclosure.
- Fig. 3 is a schematic diagram showing one frame of an enhanced coach video according to an exemplary embodiment of the present disclosure.
- Fig. 4 is a schematic diagram showing one frame of another enhanced coach video according to an exemplary embodiment of the present disclosure.
- Fig. 5 is a structural diagram of another fitness teaching system according to an exemplary embodiment of the present disclosure.
- Fig. 6 is a schematic diagram showing skeleton feature points and vector angles of a human body according to an exemplary embodiment of the present disclosure.
- Fig. 7 is a structural diagram of another fitness teaching system according to an exemplary embodiment of the present disclosure.
- Fig. 8 is a flowchart of a fitness teaching method according to an exemplary embodiment of the present disclosure.
- Fig. 9 is a schematic structural diagram of a fitness teaching device according to an exemplary embodiment of the present disclosure.
- Fig. 10 is a structural diagram of an electronic device according to an exemplary embodiment of the present disclosure.
- although the terms first, second, third, etc. may be used in this disclosure to describe various information, the information should not be limited to these terms. These terms are only used to distinguish the same type of information from each other.
- first information may also be referred to as second information, and similarly, the second information may also be referred to as first information.
- the word "if" as used herein can be interpreted as "when", "upon", or "in response to determining".
- fitness teaching in related technologies usually takes one of the following forms: in the first, the learner obtains a coach video, that is, a video that includes the coach's demonstration actions, and follows along while watching the coach's demonstration in the video; in the second, one or more learners and the coach are connected to the same video channel, in which the learners follow the coach's real-time demonstration, and learners can communicate with one another and with the coach.
- the inventor found that the first fitness teaching method lacks interactivity: the learner may not be able to keep up with the coach's movement rhythm, cannot tell whether the movements being learned are performed correctly, and finds learning difficult. Although the second fitness teaching method provides a certain degree of online guidance from the coach, it is difficult for the learner to get the real feeling of being at a fitness site, and the learner's schedule must match the coach's start time, so training cannot be carried out at any time, which is a limitation.
- the embodiments of the present disclosure provide a fitness teaching method, which can be applied to a local terminal, and the terminal can be an electronic device such as a computer, a tablet, a smart TV, or a mobile phone.
- the fitness teaching method can also be applied to a server side, which can be a server, a computer, or another electronic device that can provide computing services.
- the following fitness teaching system includes the server and the user terminal.
- the user terminal can be a smart TV, a smartphone, a computer, a personal digital assistant (PDA), a tablet, or another electronic device with camera and audio/video display functions.
- the following description takes the server executing the fitness teaching method as an example.
- FIG. 1 is a structural diagram of a fitness teaching system according to an exemplary embodiment of the present disclosure.
- the system includes a server and a user terminal.
- the user terminal is used to obtain the user's action video frame and send it to the server while the coach video selected by the user is played.
- the server is configured to receive the action video frame sent by the user terminal, and obtain the coach video frame played for the user terminal and the learning position corresponding to the user terminal; the learning position is the display position of the user action in the coach video frame.
- the server is also used to synthesize the user action in the action video frame to the corresponding learning position in the coach video frame played for the user terminal, to generate an enhanced coach video frame and send it to the user terminal.
- the user terminal is also used to receive and play the enhanced coach video frame sent by the server.
- a storage module (database, ROM, etc.) is provided on the server to store coach videos.
- the coach videos are videos that include coach demonstration actions. Each coach video can correspond to one or more learning positions; the specific number of positions can be set based on actual conditions, and the learning position is the display position of the user's action in the coach video. Please refer to FIG. 2 and FIG. 3.
- FIG. 2 shows a scene where the coach video corresponds to 4 learning positions
- FIG. 3 shows a schematic diagram of a user action displayed on one of the learning positions of the coach video frame.
- the coaching video includes a template coaching video and an enhanced coaching video.
- the template coach video is a video that has not synthesized any user's action video frames, as shown in FIG. 2.
- the enhanced coaching video is a video pre-synthesized with one or more user's action video frames, as shown in FIG. 3, which has been pre-synthesized with one user's action video frame.
- the server obtains the coach videos from the database and pushes them to the user terminal for display.
- the user can select, on the user terminal, the coach video to be played and the learning position to be displayed, or one of the two, based on actual needs; after detecting the user's selection of the coach video and/or the learning position, the user terminal initiates a selection request to the server, and the server receives the selection request and records it accordingly.
- the selection request may include at least one of the identification of the coaching video selected by the user or the identification of the learning location selected by the user; the user may select one or more learning locations based on their own needs.
- the disclosed embodiment does not impose any restriction on this.
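As a sketch of what such a selection request might carry, the following data structure holds an optional coach video identifier and zero or more learning position identifiers, and rejects a request that contains neither. The class name `SelectionRequest` and all field names are hypothetical; the source only states which pieces of information the request may include.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SelectionRequest:
    """Hypothetical payload sent from the user terminal to the server."""
    user_account: str
    coach_video_id: Optional[str] = None
    learning_position_ids: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The request must identify a coach video, a learning position,
        # or both ("at least one of" in the text above).
        if self.coach_video_id is None and not self.learning_position_ids:
            raise ValueError(
                "selection request needs a coach video id "
                "or at least one learning position id")
```

A request with only a coach video, for instance `SelectionRequest("u1", coach_video_id="v7")`, is valid, and the server could then choose a learning position itself.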
- the disclosed embodiment does not impose any restriction on the type of coach video selected by the user. It can be a template coach video or an enhanced coach video.
- the embodiments of the present disclosure do not impose any restrictions on the specific form of the enhanced coach video frame selected by the user.
- the enhanced coach video frame may be the enhanced coach video frame generated during the user's last exercise or fitness.
- the enhanced coaching video frame may also be an enhanced coaching video frame shared by other users.
- the user may also leave the coach video to be played and/or the learning location unselected and let the server determine them. For example, the user clicks a random play button on the user terminal, such as a virtual button or a physical button, and the user terminal sends a random play request; the server then randomly determines a coaching video and learning location for the user. Alternatively, the server can automatically determine a coaching video and learning location according to the user's historical playback data or user preferences.
- the client corresponds to a user account
- the server can associate the coach video (template coach video and/or enhanced coach video) selected and played by the user terminal, the generated enhanced coach video, and the selected learning location with the user account. This makes it convenient for the user to obtain the associated coach video, which helps the user re-learn and improves the accuracy of subsequent exercise; and if the user selects the same coach video multiple times, the learning location associated with the user account can be obtained directly, without the user having to repeat the selection, which reduces user operation steps.
- the user can also re-select the learning location based on their own needs. The embodiments of the present disclosure impose no restriction on this.
- the user terminal plays the coach video selected by the user, and the user performs corresponding actions according to the demonstration actions of the coach in the played coach video; the user terminal synchronously captures the user action through the camera to generate an action video frame including the user action, and then sends the action video frame to the server.
- the user terminal may generate action video frames including user actions through the following two possible implementation modes:
- in the first implementation, the camera may be a 2D RGB camera: the user terminal obtains RGB video frames captured by the 2D RGB camera, detects the human body in each RGB video frame, and then, based on the detected human body, performs background segmentation on the video frame to extract the human body, generating an action video frame that includes only the human body action.
- in the second implementation, the camera may be a 2D RGB camera together with a 3D depth camera: the user terminal obtains RGB video frames captured by the 2D RGB camera and depth vision field frames captured by the 3D depth camera, calculates the visual depth field based on the depth vision field frames, detects the human body in the RGB video frame, and then separates the detected human body from the background according to the visual depth field, retaining the human body foreground and thereby generating an action video frame that includes only the human body action.
- the user terminal sends motion video frames that only include human motion (i.e., user motion) to the server, which helps reduce the amount of video data transmitted, thereby speeding up the transmission.
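The depth-based variant can be illustrated with a deliberately simplified sketch. It assumes the human body occupies a known depth range `[near, far]` and keeps only pixels inside that range. Human body detection in the RGB frame, which the source also describes, is not modelled. Frames are plain nested lists so that the example has no dependencies; the function name and the use of `None` for removed background pixels are assumptions.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]


def extract_foreground(rgb: List[List[Pixel]],
                       depth: List[List[float]],
                       near: float,
                       far: float) -> List[List[Optional[Pixel]]]:
    """Keep pixels whose depth lies in [near, far]; mark the rest as None.

    rgb and depth must have identical dimensions. None stands for a
    transparent (background) pixel in the resulting action video frame.
    """
    if len(rgb) != len(depth) or any(
            len(r) != len(d) for r, d in zip(rgb, depth)):
        raise ValueError("rgb and depth frames must have the same shape")
    return [
        [px if near <= d <= far else None for px, d in zip(row, drow)]
        for row, drow in zip(rgb, depth)
    ]
```

Since most background pixels become empty, a frame produced this way compresses well, which is consistent with the reduced transmission volume mentioned above.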
- after receiving the action video frame sent by the user terminal, the server acquires the coach video frame played for the user terminal, that is, the coach video frame currently being played by the user terminal.
- for example, the user terminal plays the first coach video frame and uses the camera to capture the user's action corresponding to the coach demonstration action in that frame. Because the user performs the action after seeing the first coach video frame, and the user terminal needs to process the captured video frames and transmit them to the server, a certain amount of time may elapse.
- by then, the first coach video frame has already been played on the user terminal.
- the server therefore needs to obtain the coach video frame being played on the user terminal at this time and combine it with the action video frame, so that the user can see his or her own actions in step with the playback progress. It should be noted that the coach video frame currently played on the user terminal and the coach video frame that the user action in the action video frame corresponds to are not the same frame.
- the play request includes a timestamp so that the server can obtain
- the server may determine the coach video frame to be played by the user terminal based on the start time and current time of playing the selected coach video for the client terminal.
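Determining the currently played frame from the start time and the current time amounts to simple arithmetic if a constant frame rate is assumed. The source mentions neither a frame rate nor a frame count; the parameters `fps` and `frame_count` are assumptions added for this sketch.

```python
def current_frame_index(start_time: float,
                        current_time: float,
                        fps: float,
                        frame_count: int) -> int:
    """Index of the coach video frame being played at current_time.

    Times are in seconds on a shared clock; playback is assumed to run
    at a constant fps without pauses. The result is clamped to the last
    frame once the video has finished.
    """
    if fps <= 0 or frame_count <= 0:
        raise ValueError("fps and frame_count must be positive")
    elapsed = current_time - start_time
    if elapsed < 0:
        raise ValueError("current_time precedes start_time")
    return min(int(elapsed * fps), frame_count - 1)
```

With playback started at t = 100.0 s and a 30 fps video, a request arriving at t = 102.5 s maps to frame 75.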
- if the server detects that only one user terminal currently selects the coaching video to play, the server receives the action video frame sent by that user terminal, and then obtains the coach video frame played for the user terminal and the learning position corresponding to the user terminal, where the learning position is the display position of the user action in the coach video frame. Finally, the server synthesizes the user action in the action video frame to the corresponding learning position in the coach video frame played for the user terminal at this time, generating an enhanced coach video frame and sending it to the user terminal so that the user terminal can play it.
- the coach’s demonstration actions and the user’s own actions are present in the enhanced coach video frame.
- the coach video that the user chooses to play is a template coach video
- the embodiments of the present disclosure are based on AR technology (Visual Augmented Reality Technology) so that the user can simultaneously see the actions done by himself and the actions done by the coach.
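Synthesizing the user action to a learning position can be sketched as pasting the non-background pixels of the action video frame into the coach video frame at a given offset. The representation (nested lists, `None` for background pixels) and the function name `composite` are assumptions for illustration; a real system would more likely operate on image arrays with alpha blending.

```python
from typing import List, Optional, TypeVar

P = TypeVar("P")


def composite(coach_frame: List[List[P]],
              user_cutout: List[List[Optional[P]]],
              top: int,
              left: int) -> List[List[P]]:
    """Return a copy of coach_frame with user_cutout pasted at (top, left).

    (top, left) stands for the learning position. None pixels in the
    cutout are background and leave the coach frame untouched; pixels
    falling outside the coach frame are ignored.
    """
    out = [list(row) for row in coach_frame]
    for dy, row in enumerate(user_cutout):
        for dx, px in enumerate(row):
            y, x = top + dy, left + dx
            if px is not None and 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = px
    return out
```

Calling the function once per user, each with a different learning position, would give the multi-user case, and because the input frame is copied, the template coach video frame remains reusable.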
- the embodiments of the present disclosure do not restrict the number of action video frames uploaded by a user terminal at the same time; that is, they are not limited to a scenario where a user terminal can upload only one user's action video frame. If two or more learners in the same place learn from the same played coach video, the user terminal captures their actions through the camera, obtains at least two human actions from each captured video frame, and correspondingly generates at least two action video frames to send to the server.
- the user terminal can obtain multiple and single learning positions based on the captured video frames.
- the corresponding action video frames that only include human actions are sent to the server.
- the server receives a plurality of action video frames sent by the user terminal, and synthesizes the human body action in each of the action video frames to the corresponding learning position in the coach video frame played for the user terminal at this time to generate an enhanced coach video frame.
- the server selects one of the action video frames and synthesizes the human body action in that action video frame to the corresponding learning position in the coach video frame played for the user terminal at this time to generate an enhanced coach video frame.
- the embodiment of the present disclosure does not impose any restriction on the way the server selects the action video frame. For example, the action video frame can be selected randomly, or the action video frame corresponding to the user's face image can be identified from the multiple action video frames based on the user's face image pre-stored on the user terminal.
- before synthesizing the action video frame received from the user terminal, the server may remove the previously synthesized actions of other users, and then synthesize the user action in the action video frame of the user terminal to the corresponding learning position in the coach video frame played for the user terminal at this time.
- if the server detects that multiple user terminals are playing the same coaching video at the same time, the server receives the action video frames sent by the multiple user terminals, and then obtains the coach video frame currently played for the user terminals and the learning position information corresponding to each user terminal. Finally, the server synthesizes the user actions in the multiple action video frames to the corresponding learning positions in the coach video frame that is simultaneously played for the multiple user terminals at this time, generating an enhanced coach video frame that contains the coach's demonstration action and the corresponding actions of multiple users. Referring to FIG. 4, if the coach video that the user chooses to play is a template coach video, the generated enhanced video frame contains the coach's demonstration actions and the actions corresponding to multiple users.
- the user can also see the actions made by other users, which enhances the interaction of the exercise, provides a high-quality way of mutual encouragement and joint exercise and fitness, and improves the user's motivation for exercise.
- the action video frames sent by multiple clients correspond to the same coaching video frame.
- the server may receive only the action video frames of the clients that have selected a learning position, which improves receiving efficiency. If the server detects that the same coaching video frame is played for multiple users at the same time and there is an unselected learning position in the coaching video, the server may select, from the user terminals that have not selected a learning position, a user terminal to correspond to the unselected learning position, receive the action video frame of the selected user terminal, and synthesize the user action corresponding to the selected user terminal to the unselected learning position in the coach video, thereby enhancing user interaction.
- the embodiment of the present disclosure does not impose any restriction on the manner in which the server selects the user terminal corresponding to the unselected learning position from the user terminals that have not selected a learning position.
- the server may randomly select the user terminal corresponding to the unselected learning position from the user terminals that have not selected a learning position, or the server may select the user terminal corresponding to the unselected learning position based on a preset rule.
- the preset rule may be to select, among the user terminals that have not selected a learning position, the user terminal that has played the largest number of videos, or a user terminal that has selected a learning position for other coach videos.
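- the selection of user terminals for unselected learning positions can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the disclosed implementation: the `Client` class, its `videos_played` counter, and the `allocate_positions` function are hypothetical names, and only the random rule and the "largest number of videos played" rule are shown.

```python
import random
from dataclasses import dataclass


@dataclass
class Client:
    client_id: str
    videos_played: int = 0  # hypothetical playback counter kept by the server


def allocate_positions(unselected_positions, candidates, rule="random", rng=None):
    """Map each unselected learning position to a client that has not chosen one."""
    rng = rng or random.Random()
    pool = list(candidates)
    if rule == "most_played":
        # Preset rule: prefer the clients that have played the most videos.
        pool.sort(key=lambda c: c.videos_played, reverse=True)
    else:
        # Default: random selection among the candidates.
        rng.shuffle(pool)
    # zip stops at the shorter sequence, so surplus positions or clients stay unassigned.
    return {pos: client.client_id for pos, client in zip(unselected_positions, pool)}
```

- in this sketch, a position stays empty when there are fewer candidate clients than unselected positions; whether a real system would leave it empty is not stated in the disclosure.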
- after generating the enhanced coaching video frame, the server sends the enhanced coaching video frame to the client, so that the client receives and plays the enhanced coaching video frame.
- the electronic device of the user terminal may include a display screen, so that the user terminal can play the enhanced coach video frame through the display screen.
- the user end uploads the action video frame through streaming media technology, and the server sends the enhanced coach video frame through streaming media technology, to ensure fast transmission of the video frames.
- after the server generates the enhanced coaching video frames corresponding to the coaching video, it saves the generated enhanced coaching video to the storage module; if the user has a corresponding account, the server can also associate the generated enhanced coaching video with the user's account. This helps the user replay the video for learning, improves the accuracy of subsequent sports or fitness exercises, and enriches the coach video resources.
- the learning location may also correspond to asset information
- the asset information represents the value of the learning location.
- the asset information may be virtual currency, points, or real currency, etc.
- after the server receives the learning position selected by the user, it initiates an asset processing operation on the user account corresponding to the user terminal according to the asset information corresponding to the learning position, and the user terminal then executes the asset processing operation that the server initiated on its user account. For example, the asset processing operation can be an operation of deducting virtual currency, points, or balance in the user account, or an operation of deducting the balance in the bank card or third-party payment account bound to the user account, so as to provide businesses with corresponding economic value.
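- the asset processing operation can be pictured as a simple deduction against the user account. The following Python sketch assumes a points-based account; `UserAccount`, `InsufficientAssets`, `charge_for_position`, and the price table are hypothetical names introduced only for illustration, and real payment handling (bank cards, third-party accounts) is outside its scope.

```python
from dataclasses import dataclass


@dataclass
class UserAccount:
    user_id: str
    points: int  # could equally model virtual currency or a balance


class InsufficientAssets(Exception):
    """Raised when the account cannot cover the price of a learning position."""


def charge_for_position(account, position_prices, position_id):
    """Deduct the asset value of the selected learning position from the account."""
    price = position_prices[position_id]
    if account.points < price:
        # Leave the account untouched if the deduction cannot be completed.
        raise InsufficientAssets(f"{account.user_id} needs {price} points")
    account.points -= price
    return account.points
```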
- FIG. 5 is a structural diagram of another fitness teaching system according to an exemplary embodiment of the present disclosure.
- the system includes a server and a user terminal.
- the user terminal is used to obtain and send the user's action video frame to the server when the coach video selected by the user is played.
- the server is configured to receive the action video frame sent by the user terminal, and obtain the coach video frame played for the user and the learning position corresponding to the user; the learning position is the display position of the user action in the coach video frame.
- the server is also used to synthesize the user actions in the action video frame to the corresponding learning position in the currently playing coach video frame, generate an enhanced coach video frame and send it to the user terminal.
- the server is also used to: obtain the coach video frame corresponding to the action video frame, wherein the time when the user terminal obtains the action video frame is the same as the time when the coach video frame is played or is within a preset time difference of it; compare the user action in the action video frame with the coach demonstration action in the coach video frame to generate an evaluation score measuring the user action; and send the evaluation score to the user terminal.
- the user terminal is also used to receive the enhanced coach video frame and the evaluation score sent by the server, and synthesize the evaluation score into the enhanced coach video frame and play it.
- the server may obtain the coach video frame corresponding to the action video frame for comparison.
- the user terminal plays the first coach video frame and uses the camera to capture the user's action corresponding to the coach demonstration action in that frame; the user terminal uploads the action video frame generated from this action to the server, and the first coach video frame then corresponds to this action video frame.
- the shooting time when the user terminal uses the camera to capture the user action is the same as the time when the coach video frame is played.
- the server can determine the corresponding coach video frame based on the shooting time of the action video frame; that is, the time when the user terminal obtains the action video frame is the same as the time when the user terminal plays the coach video frame.
- the server may also determine the corresponding coaching video frame based on the shooting time of the action video frame and the preset time difference; that is, the time when the user terminal obtains the action video frame and the time when the user terminal plays the coach video frame are within a preset time difference of each other.
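- matching an action video frame to a coach video frame by time can be sketched as a nearest-timestamp lookup with a tolerance. This Python sketch assumes the server keeps a sorted list of the play times of the coach video frames; `match_coach_frame` and the 0.05 s default for the preset time difference are illustrative assumptions, not values from the disclosure.

```python
from bisect import bisect_left


def match_coach_frame(coach_play_times, capture_time, max_diff=0.05):
    """Return the index of the coach frame played closest to capture_time.

    coach_play_times must be sorted ascending. Returns None when no frame
    lies within max_diff seconds (the preset time difference).
    """
    if not coach_play_times:
        return None
    i = bisect_left(coach_play_times, capture_time)
    # Only the neighbours around the insertion point can be the closest frame.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(coach_play_times)]
    best = min(candidates, key=lambda j: abs(coach_play_times[j] - capture_time))
    if abs(coach_play_times[best] - capture_time) <= max_diff:
        return best
    return None
```

- setting `max_diff=0` reduces the lookup to the exact-match case, in which the shooting time and the play time must be identical.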
- the server may compare the user action in the action video frame with the coach demonstration action in the coach video frame to generate an evaluation score measuring the user action. Specifically, the server may obtain the coordinate data of the user's skeleton feature points from the action video frame as user action data, and obtain the coordinate data of the coach's skeleton feature points from the coach video frame as coach demonstration action data; the coordinate data of the skeleton feature points may be two-dimensional or three-dimensional coordinate data. Based on the difference between the user action data and the coach demonstration action data, the server then generates an evaluation score measuring the user action.
- the calculation process of the embodiment of the present disclosure is beneficial to improve the accuracy of the evaluation score, wherein the coach action data may be offline data obtained in advance, thereby speeding up the processing speed of the server and improving response efficiency.
- the server may obtain all user vector angles based on the user action data, and obtain all standard vector angles based on the coach demonstration action data.
- the user vector included angle may be the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points corresponding to the user, and the standard vector included angle may be the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points corresponding to the coach; the user vector included angles correspond one-to-one to the standard vector included angles.
- Figure 6 shows 14 skeleton feature points (the black mark points in Figure 6).
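- the included angle formed by three adjacent skeleton feature points can be computed with a dot product. The Python sketch below assumes the two vectors both start at the middle point of the triple (for example, elbow to shoulder and elbow to wrist); the disclosure does not state which two vectors are used, and the names `included_angle`, `skeleton_angles`, and the joint labels are hypothetical.

```python
import math


def included_angle(a, b, c):
    """Angle in degrees at point b between vectors b->a and b->c (2D or 3D)."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0 or n2 == 0:
        raise ValueError("adjacent feature points must not coincide")
    cos = sum(x * y for x, y in zip(v1, v2)) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def skeleton_angles(points, triples):
    """Compute one included angle per (first, middle, last) triple of joint names."""
    return [included_angle(points[a], points[b], points[c]) for a, b, c in triples]
```

- applying `skeleton_angles` with the same list of triples to the user's points and to the coach's points yields two angle lists that correspond one-to-one.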
- the similarity parameter is determined according to the included angle of all user vectors and the included angle of all corresponding standard vectors, so as to determine the evaluation score corresponding to the user action.
- the similarity parameter includes at least a standard deviation result or a variance result of the differences between the user vector included angles and the standard vector included angles. The smaller the standard deviation result or the variance result, the greater the similarity and the higher the evaluation score.
- the standard vector included angle may be offline data obtained in advance, thereby speeding up the processing speed of the server and improving response efficiency.
- the server may obtain only some of the user vector included angles based on the user action data, and obtain the corresponding standard vector included angles based on the coach demonstration action data.
- each user vector included angle corresponds one-to-one to a standard vector included angle.
- the similarity parameter is determined according to these user vector included angles and the corresponding standard vector included angles, thereby determining the evaluation score corresponding to the user action.
- if the server detects that a coaching video is played for a single user, the server sends the generated evaluation score corresponding to the user to the user terminal, so that the user terminal receives the evaluation score sent by the server, generates a score image based on the evaluation score, and synthesizes the score image with the enhanced coach video frame to generate and play an enhanced coach video frame including the evaluation score. The user thus has a clear judgment of how standard his or her action is, which improves the user experience.
- if the server detects that multiple clients are playing the same coaching video synchronously and generates corresponding evaluation scores based on the action video frames sent by the multiple clients, the server sends the evaluation scores of the multiple clients to each client. Each client receives the multiple evaluation scores, sorts them by size, generates a score image from the sorted evaluation scores, and synthesizes the score image with the enhanced coaching video frame to generate and play an enhanced coaching video frame including the evaluation scores. Users can thus see not only their own scores but also the scores of other users, which enhances the interactivity of fitness.
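- sorting the received evaluation scores before rendering them can be sketched as follows. This Python sketch produces a text scoreboard rather than an actual score image; `rank_scores`, `render_scoreboard`, the tie-breaking by client identifier, and the `*` marker for the user's own row are assumptions made for illustration.

```python
def rank_scores(scores):
    """Sort {client_id: score} by score, highest first; ties are ordered by id."""
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, cid, score) for rank, (cid, score) in enumerate(ordered, start=1)]


def render_scoreboard(ranking, own_id):
    """Build the text that a score image could display, marking the user's own row."""
    lines = []
    for rank, cid, score in ranking:
        marker = "*" if cid == own_id else " "
        lines.append(f"{marker}{rank}. {cid}: {score:.1f}")
    return "\n".join(lines)
```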
- the user terminal may also only receive its own corresponding evaluation scores for synthetic playback based on the user's selection, which is not limited in the embodiment of the present disclosure.
- FIG. 7 is a structural diagram of another fitness teaching system according to an exemplary embodiment of the present disclosure.
- the system includes a server and a user terminal.
- the user terminal is used to obtain and send the user's action video frame to the server when the coach video selected by the user is played.
- the server is configured to receive the action video frame sent by the user terminal, and obtain the coach video frame played for the user and the learning position corresponding to the user; the learning position is the display position of the user action in the coach video frame.
- the server is also used to synthesize the user actions in the action video frame to the corresponding learning position in the currently playing coach video frame, generate an enhanced coach video frame and send it to the user terminal.
- the server is also used to: obtain the coach video frame corresponding to the action video frame, wherein the time when the user terminal obtains the action video frame is the same as the time when the coach video frame is played or is within a preset time difference of it; compare the user action in the action video frame with the coach demonstration action in the coach video frame to generate an evaluation score measuring the user action; and synthesize the evaluation score into the enhanced coach video frame to generate an enhanced coach video frame including the evaluation score, which is sent to the user terminal.
- the user terminal is also used to receive and play the enhanced coaching video frame including the evaluation score sent by the server.
- the server may compare the user action in the action video frame with the coach demonstration action in the coach video frame to generate an evaluation score measuring the user action.
- if the server detects that a coach video is played for a single user, the server generates a score image according to the generated evaluation score corresponding to the user, and then synthesizes the score image with the enhanced coaching video frame to generate an enhanced coaching video frame including the evaluation score and sends it to the user terminal, so that the user terminal receives and plays the enhanced coaching video frame including the evaluation score sent by the server. The user thus has a clear judgment of how standard his or her own action is, and the user experience is improved.
- if the server detects that multiple clients are playing the same coaching video synchronously and generates corresponding evaluation scores based on the action video frames sent by the multiple clients, the server sorts the evaluation scores by size, generates a score image according to the sorted evaluation scores, and then superimposes the score image on the enhanced coach video frame to generate an enhanced coaching video frame including the evaluation scores. This frame is sent to the clients, so that each client receives and plays the enhanced coaching video frame including the evaluation scores sent by the server. Users can thus see not only their own scores but also the scores of other users, which enhances the interactivity of fitness.
- the server may also generate an enhanced coaching video frame that only includes the user's evaluation score based on the user's selection, and send it to the user terminal. The embodiment of the present disclosure does not impose any limitation on this.
- Fig. 8 is a flowchart of a fitness teaching method according to an exemplary embodiment of the present disclosure.
- the method includes the following steps.
- in step S101, a user's action video frame is obtained; the action video frame includes a video frame generated when the user moves.
- in step S102, a coach video frame played for the user and a learning position corresponding to the user are acquired; the learning position is the display position of the user action in the coach video frame.
- in step S103, the user action in the user's action video frame is synthesized to the corresponding learning position in the coach video frame played for the user to generate an enhanced coach video frame.
- in step S104, the enhanced coaching video frame is played for the user.
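- the synthesis in step S103 amounts to pasting the segmented user image onto the coach video frame at the learning position. The Python sketch below assumes frames are lists of pixel rows, that background pixels in the user cut-out are marked `None`, and that the learning position is the top-left corner `(row, col)` of the paste region; `composite` is a hypothetical name, and a practical system would likely use an image library instead.

```python
def composite(coach_frame, user_cutout, position):
    """Paste user_cutout onto a copy of coach_frame at position=(row, col).

    Pixels equal to None in user_cutout are treated as background and skipped;
    pixels that fall outside the coach frame are clipped.
    """
    out = [row[:] for row in coach_frame]  # keep the original coach frame intact
    top, left = position
    for r, cut_row in enumerate(user_cutout):
        for c, pixel in enumerate(cut_row):
            if pixel is None:
                continue
            y, x = top + r, left + c
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = pixel
    return out
```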
- a storage module may be provided on the terminal to store coach videos.
- the coach videos are videos that include coach demonstration actions. Each coach video may correspond to one or more learning positions, and the specific number of learning positions can be set based on the actual situation.
- the learning position is the display position of the user's actions in the coaching video; in addition, the coaching video includes a template coaching video and an enhanced coaching video.
- the template coach video is a video into which no user's action video frames have been synthesized, and the enhanced coach video is a video into which one or more users' action video frames have been synthesized in advance; the terminal can connect to a predetermined cloud to update the coach videos.
- the terminal may push coach videos to the user based on the storage module and, when the user wants to exercise, receive a selection request sent by the user; the selection request includes at least one of the following: an identification of the coach video selected by the user or an identification of the learning location selected by the user.
- the user may also leave the choice of the coaching video to be played and/or the learning location to the terminal; for example, if the user clicks a random button on the terminal (which can be a virtual button or a physical button), the terminal randomly determines a coaching video and learning location for the user; alternatively, the terminal can automatically determine a coaching video and learning location according to the user's historical playback data or user preferences.
- during exercise or fitness, the terminal plays the coach video selected by the user, the user makes the corresponding action according to the coach's demonstration action in the played coach video, and the user terminal synchronously shoots the user's action through the camera to generate an action video frame including the user action.
- the camera may be a 2D RGB camera: the user terminal obtains the RGB video frames captured by the 2D RGB camera, detects the human body in the RGB video frame, and then, based on the detected human body, performs background segmentation on the video frame to extract the human body, generating an action video frame that includes only the human body action.
- the camera may be a 2D RGB camera combined with a 3D depth camera: the user terminal obtains the RGB video frames captured by the 2D RGB camera and the depth field frames captured by the 3D depth camera, calculates the visual depth field based on the depth field frames, detects the human body in the RGB video frame, and then separates the human body detected in the RGB video frame from the background according to the visual depth field, retaining the human body foreground and thereby generating an action video frame that includes only the human body action.
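- depth-based foreground separation can be approximated by keeping only the pixels whose depth falls inside the range occupied by the person. The Python sketch below is a simplification under stated assumptions: human body detection is omitted, the `near` and `far` bounds are assumed to come from the detected body, frames are lists of rows, and `segment_foreground` is a hypothetical name.

```python
def segment_foreground(rgb_frame, depth_frame, near, far):
    """Keep pixels whose depth lies in [near, far]; mark the rest as None.

    rgb_frame and depth_frame must have the same shape (rows of equal length).
    """
    if len(rgb_frame) != len(depth_frame):
        raise ValueError("RGB and depth frames must have the same height")
    return [
        [pix if near <= d <= far else None for pix, d in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb_frame, depth_frame)
    ]
```

- marking background pixels as `None` gives a cut-out in which only the human body foreground carries pixel values.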
- the terminal obtains the coaching video frame played for the user at this time and the learning position corresponding to the user, then synthesizes the user action in the user's action video frame to the corresponding learning position in the coaching video frame played for the user to generate an enhanced coaching video frame, and plays the enhanced coaching video frame for the user. The terminal may obtain the coaching video frame played for the user at this time based on the start time of playing the coaching video for the user and the current time. The embodiment of the present disclosure is based on AR technology (visual augmented reality technology), so that the user can see his or her own actions and the coach's actions at the same time, which improves interactivity.
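- deriving the currently played coach video frame from the start time and the current time can be sketched as an elapsed-time calculation. The Python sketch assumes a constant frame rate and clamps to the last frame once the video has ended; `current_coach_frame_index`, `fps`, and `frame_count` are illustrative names not taken from the disclosure.

```python
import math


def current_coach_frame_index(start_time, now, fps, frame_count):
    """Index of the coach frame on screen at time `now` (times in seconds)."""
    elapsed = now - start_time
    if elapsed < 0:
        raise ValueError("current time precedes the playback start time")
    index = math.floor(elapsed * fps)
    # Clamp so that a finished video keeps returning its last frame.
    return min(index, frame_count - 1)
```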
- the coaching video frame is a video frame that includes a coach demonstration action
- the terminal may obtain the coaching video frame corresponding to the user's action video frame, wherein the time of acquiring the user's action video frame is the same as the time of playing the coach video frame or within a preset time difference; the terminal then compares the user action in the user's action video frame with the coach demonstration action in the coach video frame to generate an evaluation score that measures the user action, and synthesizes the evaluation score into the enhanced coaching video frame.
- after the terminal generates the enhanced coach video frames corresponding to the coach video, it saves the generated enhanced coach video to the storage module; if the user has a corresponding account, the terminal may also associate the generated enhanced coach video with the user's account. This helps the user replay the video for learning, improves the accuracy of subsequent sports or fitness exercises, and enriches the coaching video resources.
- the learning location may also correspond to asset information, and the asset information represents the value of the learning location.
- the asset information may be virtual currency, points, or actual currency, etc.
- after receiving the learning position selected by the user, the terminal initiates an asset processing operation on the user's account according to the asset information corresponding to the learning position. The asset processing operation may be an operation of deducting virtual currency, points, or balance in the user's account, or an operation of deducting the balance in a bank card or third-party payment account bound to the user's account, thereby providing the merchant with corresponding economic value.
- the terminal may obtain the coordinate data of the user's skeleton feature points from the user's action video frame as user action data, and obtain the coordinate data of the coach's skeleton feature points from the coach video frame as the coach's demonstration action data. The coordinate data of the skeleton feature points is two-dimensional coordinate data (captured by a 2D camera) or three-dimensional coordinate data (captured by a combination of 2D and 3D cameras). The terminal then generates an evaluation score that measures the user's action according to the difference between the user action data and the coach demonstration action data.
- the terminal obtains at least one user vector included angle based on the user action data, and obtains at least one standard vector included angle based on the coach demonstration action data. The user vector included angle is the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points corresponding to the user, and the standard vector included angle is the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points corresponding to the coach. The terminal then determines the similarity parameter according to the at least one user vector included angle and the corresponding standard vector included angle, thereby determining the evaluation score corresponding to the user action.
- the similarity parameter at least includes the standard deviation result or the variance result of the differences between the at least one user vector included angle and the corresponding standard vector included angle. The smaller the standard deviation result or the variance result, the greater the similarity and the higher the evaluation score.
- the terminal determining the similarity parameter may include: calculating the difference between each user vector included angle and the corresponding standard vector included angle (the difference vector), where the number of standard vector included angles and the number of user vector included angles are both N (N is an integer greater than 0). Let the i-th standard vector included angle ($1 \le i \le N$) be $\theta_i$, the i-th user vector included angle be $\theta'_i$, and the difference vector be $\Delta_i$; then $\Delta_i = \theta_i - \theta'_i$. The average difference vector is then calculated from all the difference vectors; letting the average difference vector be $\Delta_r$, then $\Delta_r = \frac{1}{N}\sum_{i=1}^{N}\Delta_i$. Finally, all the difference vectors and the average difference vector are used to calculate the similarity parameter; letting the similarity parameter be $S$, then $S = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(\Delta_i - \Delta_r)^2}$ (standard deviation result) or $S = \frac{1}{N}\sum_{i=1}^{N}(\Delta_i - \Delta_r)^2$ (variance result).
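- the calculation of the similarity parameter from the angle differences can be sketched in Python as follows. The sketch assumes a population standard deviation or variance (division by N) over the per-angle differences; the function names and the linear mapping from the similarity parameter to a 0 to 100 score (including the `scale` value) are hypothetical, since the disclosure only states that a smaller result gives a higher score.

```python
import math


def similarity_parameter(standard_angles, user_angles, use_variance=False):
    """Standard deviation (or variance) of the per-angle differences."""
    if not standard_angles or len(standard_angles) != len(user_angles):
        raise ValueError("angle lists must be non-empty and of equal length")
    diffs = [s - u for s, u in zip(standard_angles, user_angles)]
    mean = sum(diffs) / len(diffs)  # average difference
    variance = sum((d - mean) ** 2 for d in diffs) / len(diffs)
    return variance if use_variance else math.sqrt(variance)


def evaluation_score(s, scale=45.0):
    """Hypothetical mapping: 100 at S = 0, falling linearly to 0 at S >= scale."""
    return max(0.0, 100.0 * (1.0 - s / scale))
```

- because the spread is measured around the average difference, a constant offset across all angles gives a similarity parameter of 0 in this sketch.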
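- if the same coach video frame is played for multiple users at the same time and corresponding evaluation scores are generated based on the action video frames of the multiple users, the terminal may sort the evaluation scores by size and synthesize the multiple evaluation scores into the enhanced coaching video frame according to the sorting result, so that users can see not only their own scores but also the scores of other users, which enhances the interactivity of fitness.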
- FIG. 9 is a schematic structural diagram of a fitness teaching device according to an exemplary embodiment of the present disclosure.
- the device includes: an action video frame acquisition module 21, an acquisition module 22, an enhanced coach video frame generation module 23 and an enhanced coach video frame playback unit 24.
- the action video frame acquisition module 21 is used to acquire a user's action video frame; the action video frame includes a video frame generated when the user moves.
- the acquiring module 22 is configured to acquire the coaching video frame played for the user and the corresponding learning position of the user; the learning position is the display position of the user action in the coaching video frame.
- the enhanced coaching video frame generating module 23 is used for synthesizing the user actions in the user's action video frames to the corresponding learning positions in the coaching video frames played for the user to generate an enhanced coaching video frame.
- the enhanced coach video frame playing unit 24 is configured to play the enhanced coach video frame for the user.
- the coach video frame is a video frame including a coach demonstration action.
- the device further includes: an evaluation score generating module.
- the acquisition module is also used to acquire the coach video frame corresponding to the user's action video frame; wherein, the time of acquiring the user's action video frame is the same as the time of playing the coach video frame or within a preset time difference.
- the evaluation score generating module is configured to compare the user action in the user's action video frame with the coach demonstration action in the coach video frame to generate an evaluation score that measures the user action.
- the evaluation score generation module includes: an action data acquisition sub-module and an evaluation score generation sub-module.
- the action data acquisition sub-module is used to acquire the coordinate data of the user's skeleton feature points from the user's action video frame as user action data, and to acquire the coordinate data of the coach's skeleton feature points from the coach video frame as the coach's demonstration action data.
- the evaluation score generating sub-module is configured to generate an evaluation score for measuring the user action according to the difference between the user action data and the coach demonstration action data.
- the evaluation score generation sub-module includes: a vector included angle acquisition unit and an evaluation score generation unit.
- the vector included angle acquiring unit is configured to acquire at least one user vector included angle based on the user action data, and acquire at least one standard vector included angle based on the coach demonstration action data; wherein the user vector included angle is the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points corresponding to the user, and the standard vector included angle is the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points corresponding to the coach.
- the evaluation score generating unit is configured to determine a similarity parameter according to the at least one user vector included angle and the corresponding standard vector included angle, thereby determining the evaluation score corresponding to the user action; the similarity parameter includes at least the standard deviation result or the variance result of the differences between the at least one user vector included angle and the corresponding standard vector included angle.
- the evaluation score generating unit includes: a difference vector calculation subunit, an average difference vector calculation subunit, a similarity parameter calculation subunit, and an evaluation score determination subunit.
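- the average difference vector calculation subunit is used to calculate the average difference vector according to the difference vectors; wherein, if the i-th difference vector ($1 \le i \le N$) is $\Delta_i$ and the average difference vector is $\Delta_r$, then $\Delta_r = \frac{1}{N}\sum_{i=1}^{N}\Delta_i$.
- the similarity parameter calculation subunit is used to calculate the similarity parameter by using the difference vectors and the average difference vector; wherein, if the similarity parameter is $S$, then $S = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(\Delta_i - \Delta_r)^2}$ (standard deviation result) or $S = \frac{1}{N}\sum_{i=1}^{N}(\Delta_i - \Delta_r)^2$ (variance result).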
- the evaluation score determination subunit is used to determine the evaluation score corresponding to the user action according to the similarity parameter.
- the device further includes: an evaluation score synthesis module, configured to synthesize the evaluation score into the enhanced coach video frame.
- the evaluation score synthesis module is configured to: if the same coaching video frame is played for multiple users at the same time, and corresponding evaluation scores are respectively generated based on the action video frames of the multiple users, synthesize the multiple evaluation scores into the enhanced coach video frame according to the sorting result based on the size of the evaluation scores.
- the device further includes: a selection request receiving module configured to receive a selection request sent by the user; the selection request includes at least one of the following: an identification of the coaching video selected by the user or an identification of the learning location selected by the user.
- the coach video is a video that includes a coach demonstration action; the coach video includes a template coach video and an enhanced coach video, wherein the template coach video has no user action video frames synthesized into it, and the enhanced coach video has one or more user action video frames synthesized into it in advance.
- the step of acquiring the coach video frame played for the user in the acquiring module 22 includes: acquiring the coach video frame played for the user based on the start time of playing the coach video for the user and the current time.
- each coaching video includes one or more learning positions.
- the device further includes: a learning position allocation module, which is used to, before the enhanced coaching video frame is generated, if the same coaching video is played for multiple users at the same time and there is an unselected learning position in the coaching video, select a user corresponding to the unselected learning position from the users who have not selected a learning position, receive the action video frame of the selected user, and synthesize the user action corresponding to the selected user to the unselected learning position in the coach video.
- the user corresponds to a user account; the learning location corresponds to asset information.
- the device further includes: an asset operation initiation module, configured to obtain asset information corresponding to the learning location selected by the user, and initiate an asset processing operation on the user account according to the asset information.
- the action video frame acquisition module 21 includes: a human body detection sub-module and an action video frame generation sub-module.
- the human body detection sub-module is used to detect the human body in the video frame shot by the camera.
- the action video frame generation sub-module is used to perform background segmentation on the video frame based on the detected human body to extract the human body, and generate an action video frame that only includes human body actions.
- for the relevant parts of the device embodiment, refer to the corresponding description of the method embodiment.
- the device embodiments described above are merely illustrative.
- the modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical modules; that is, they may be located in one place, or they may be distributed across multiple network modules. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the present disclosure. Those of ordinary skill in the art can understand and implement it without creative work.
- the present disclosure also provides an electronic device, including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the operations in the method described above.
- Fig. 10 is a schematic structural diagram showing an electronic device to which a fitness teaching apparatus is applied according to an exemplary embodiment.
- an electronic device 300 is shown according to an exemplary embodiment.
- the electronic device 300 may be a computing device such as a computer, a server, a mobile phone, or a tablet.
- the electronic device 300 may include one or more of the following components: processing component 301, memory 302, power supply component 303, multimedia component 304, audio component 305, input/output (I/O) interface 306, sensor component 307, and communication component 308.
- the processing component 301 generally controls the overall operations of the device 300, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
- the processing component 301 may include one or more processors 309 to execute instructions to complete all or part of the steps of the foregoing method.
- the processing component 301 may include one or more modules to facilitate the interaction between the processing component 301 and other components.
- the processing component 301 may include a multimedia module to facilitate the interaction between the multimedia component 304 and the processing component 301.
- the memory 302 is configured to store various types of data to support operations in the electronic device 300. Examples of these data include instructions for any application or method operating on the electronic device 300, contact data, phone book data, messages, pictures, videos, etc.
- the memory 302 can be implemented by any type of volatile or nonvolatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
- the power supply component 303 provides power for various components of the electronic device 300.
- the power supply component 303 may include a power management system, one or more power supplies, and other components associated with the generation, management, and distribution of power for the electronic device 300.
- the multimedia component 304 includes a screen that provides an output interface between the electronic device 300 and the user.
- the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user.
- the touch panel includes one or more touch sensors to sense touch, sliding, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure related to the touch or slide operation.
- the multimedia component 304 includes a front camera and/or a rear camera. When the electronic device 300 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
- the audio component 305 is configured to output and/or input audio signals.
- the audio component 305 includes a microphone (MIC), and when the electronic device 300 is in an operation mode, such as a call mode, a recording mode, and a voice recognition mode, the microphone is configured to receive external audio signals.
- the received audio signal may be further stored in the memory 302 or transmitted via the communication component 308.
- the audio component 305 further includes a speaker for outputting audio signals.
- the I/O interface 306 provides an interface between the processing component 301 and a peripheral interface module.
- the peripheral interface module may be a keyboard, a click wheel, a button, and the like. These buttons may include but are not limited to: home button, volume button, start button, and lock button.
- the sensor component 307 includes one or more sensors for providing the electronic device 300 with various aspects of state evaluation.
- the sensor component 307 can detect the on/off status of the electronic device 300 and the relative positioning of components, for example the display and the keypad of the electronic device 300.
- the sensor component 307 can also detect a change in position of the electronic device 300 or of a component of the electronic device 300, the presence or absence of contact between the user and the electronic device 300, the orientation or acceleration/deceleration of the electronic device 300, and a temperature change of the electronic device 300.
- the sensor component 307 may include a proximity sensor configured to detect the presence of nearby objects when there is no physical contact.
- the sensor component 307 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
- the sensor component 307 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, a heart rate signal sensor, an electrocardiogram sensor, a fingerprint sensor, or a temperature sensor.
- the communication component 308 is configured to facilitate wired or wireless communication between the electronic device 300 and other devices.
- the electronic device 300 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof.
- the communication component 308 receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel.
- the communication component 308 further includes a near field communication (NFC) module to facilitate short-range communication.
- the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
- the electronic device 300 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, to perform the above methods.
- a non-transitory computer-readable storage medium including instructions is also provided, such as the memory 302 including instructions, which may be executed by the processor 309 of the electronic device 300 to complete the foregoing method.
- the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.
- the device 300 can execute the aforementioned fitness teaching method.
Abstract
Provided are a fitness teaching method and apparatus, a device, and a computer readable storage medium. The method comprises: acquiring action video frames of a user, the action video frames comprising video frames generated upon the actions of the user; acquiring coach video frames played back for the user and a learning position corresponding to the user, the learning position being the display position of the user action in the coach video frames; combining the user actions in the action video frames of the user into a corresponding learning position in the coach video frames played back for the user, so as to generate enhanced coach video frames; and playing back the enhanced coach video frames for the user. The present invention provides a high-quality exercise fitness method.
Description
Cross-reference to related applications
This patent application claims priority to Chinese patent application No. 2019105993901, filed on July 4, 2019, the entire contents of which are incorporated herein by reference.
The present disclosure relates to the field of augmented reality technology, and in particular to a fitness teaching method and apparatus, an electronic device, and a computer-readable storage medium.
As living standards improve and awareness of exercise grows, more and more people are willing to devote time and energy to fitness or to sports such as basketball and yoga. With the development of technology, learning fitness or sports through virtual teaching, such as video teaching, has also become increasingly common. Compared with in-person teaching, however, current virtual teaching offers little interactivity: learners find it hard to get the sense of actually being present at a fitness venue, and its effectiveness leaves room for improvement.
Summary of the invention
According to a first aspect of the embodiments of the present disclosure, a fitness teaching method is provided, the method including:
acquiring action video frames of a user, the action video frames including video frames generated while the user performs actions;
acquiring a coach video frame played for the user and a learning position corresponding to the user, the learning position being the display position of the user action in the coach video frame;
synthesizing the user action in the user's action video frame onto the corresponding learning position in the coach video frame played for the user, to generate an enhanced coach video frame; and
playing the enhanced coach video frame for the user.
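For illustration only, the synthesis step of the method above can be sketched as follows. This is a minimal Python sketch rather than the implementation of the disclosure: it assumes that frames are small grids of RGB tuples, that the user action has already been cut out so that background pixels are `None`, and that a learning position is given by the top-left corner `(top, left)` of the region where the user is displayed. The function name `composite_at_position` and all of these representations are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]
Frame = List[List[Pixel]]              # rows of RGB pixels
Cutout = List[List[Optional[Pixel]]]   # None marks removed background


def composite_at_position(coach: Frame, user: Cutout, top: int, left: int) -> Frame:
    """Return a copy of `coach` with the non-background pixels of `user`
    pasted so that the cutout's top-left corner lands at (top, left)."""
    enhanced = [row[:] for row in coach]  # leave the coach frame untouched
    for r, user_row in enumerate(user):
        for c, pixel in enumerate(user_row):
            y, x = top + r, left + c
            inside = 0 <= y < len(enhanced) and 0 <= x < len(enhanced[y])
            if pixel is not None and inside:
                enhanced[y][x] = pixel  # user pixel replaces coach pixel
    return enhanced


# A 3x4 black coach frame and a 2x2 user cutout with two visible pixels.
coach_frame = [[(0, 0, 0)] * 4 for _ in range(3)]
user_cutout = [[(255, 0, 0), None],
               [None, (255, 0, 0)]]
enhanced_frame = composite_at_position(coach_frame, user_cutout, top=1, left=2)
```

Copying the coach frame before pasting keeps the template reusable for other users, which matches the distinction the disclosure draws between a coach video frame and the enhanced coach video frame generated from it.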
Optionally, the coach video frame is a video frame that includes a demonstration action of a coach; the method further includes: acquiring the coach video frame corresponding to the user's action video frame, where the time at which the user's action video frame is acquired and the time at which the coach video frame is played are the same or within a preset time difference; and comparing the user action in the user's action video frame with the coach's demonstration action in the coach video frame to generate an evaluation score that measures the user action.
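As a hedged illustration of the "same time or within a preset time difference" condition, the sketch below pairs a user action video frame with the coach video frame whose play time is closest, and rejects the pair if the gap exceeds the preset difference. The function `match_coach_frame`, the use of seconds as the time unit, and the sorted list of play times are assumptions made for this example only.

```python
from bisect import bisect_left
from typing import Optional, Sequence


def match_coach_frame(coach_times: Sequence[float], user_time: float,
                      max_diff: float) -> Optional[int]:
    """Return the index of the coach frame whose play time is closest to
    `user_time`, or None if even the closest one is more than `max_diff`
    away. `coach_times` must be sorted in ascending order."""
    if not coach_times:
        return None
    i = bisect_left(coach_times, user_time)
    # Only the neighbours around the insertion point can be the closest.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(coach_times)]
    best = min(candidates, key=lambda j: abs(coach_times[j] - user_time))
    return best if abs(coach_times[best] - user_time) <= max_diff else None


coach_times = [0.0, 0.04, 0.08, 0.12]                  # play times in seconds
matched = match_coach_frame(coach_times, 0.05, 0.02)   # closest is index 1
unmatched = match_coach_frame(coach_times, 0.5, 0.02)  # too far from any frame
```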
Optionally, comparing the user action in the user's action video frame with the coach's demonstration action in the coach video frame to generate the evaluation score that measures the user action includes: obtaining, from the user's action video frame, coordinate data of the user's skeleton feature points as user action data, and obtaining, from the coach video frame, coordinate data of the coach's skeleton feature points as coach demonstration action data; and generating the evaluation score that measures the user action according to the difference between the user action data and the coach demonstration action data.
Optionally, generating the evaluation score that measures the user action according to the difference between the user action data and the coach demonstration action data includes: obtaining at least one user vector angle based on the user action data, and obtaining at least one standard vector angle based on the coach demonstration action data, where a user vector angle is the angle between the two vectors formed by the coordinate data of any three adjacent skeleton feature points of the user, and a standard vector angle is the angle between the two vectors formed by the coordinate data of any three adjacent skeleton feature points of the coach; and determining a similarity parameter according to the user vector angles and the corresponding standard vector angles, thereby determining the evaluation score corresponding to the user action, where the similarity parameter includes at least a standard deviation result or a variance result of the at least one user vector angle and the corresponding standard vector angle.
Optionally, determining the similarity parameter according to the at least one user vector angle and the corresponding standard vector angle includes:
for each user vector angle and the corresponding standard vector angle, calculating a difference vector, where the number of standard vector angles and the number of user vector angles are both $N$ ($N$ being an integer greater than 0), the $i$-th ($1 \le i \le N$) standard vector angle is $\alpha_i$, the $i$-th user vector angle is $\beta_i$, and the difference vector is $\Delta\alpha_i$, so that $\Delta\alpha_i = \alpha_i - \beta_i$;
calculating an average difference vector from the difference vectors, where, denoting the average difference vector by $\Delta r$, $\Delta r = \frac{1}{N}\sum_{i=1}^{N}\Delta\alpha_i$; and
calculating the similarity parameter from the difference vectors and the average difference vector, where, denoting the similarity parameter by $S$, $S = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\Delta\alpha_i - \Delta r\right)^2}$ (standard deviation) or $S = \frac{1}{N}\sum_{i=1}^{N}\left(\Delta\alpha_i - \Delta r\right)^2$ (variance).
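The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation of the disclosure: skeleton feature points are taken to be 2D coordinates, the vector angle of three adjacent points is taken at the middle point, angles are expressed in degrees, and the standard deviation and variance are normalized by N. The names `joint_angle` and `similarity` are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Angle in degrees at point b between the vectors b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        raise ValueError("adjacent feature points must not coincide")
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))  # clamp rounding


def similarity(standard: Sequence[float], user: Sequence[float],
               use_variance: bool = False) -> float:
    """Standard deviation (or variance) of the differences alpha_i - beta_i."""
    if not standard or len(standard) != len(user):
        raise ValueError("need the same positive number of angles")
    diffs = [alpha - beta for alpha, beta in zip(standard, user)]
    mean = sum(diffs) / len(diffs)                           # average difference
    variance = sum((d - mean) ** 2 for d in diffs) / len(diffs)
    return variance if use_variance else math.sqrt(variance)


right_angle = joint_angle((1.0, 0.0), (0.0, 0.0), (0.0, 1.0))
same_offset = similarity([90.0, 120.0, 45.0], [80.0, 110.0, 35.0])
spread = similarity([90.0, 90.0], [80.0, 100.0], use_variance=True)
```

Because S measures the spread of the differences around their mean, a user whose every angle is off by the same amount (as in `same_offset`) yields S = 0; how S is mapped to an evaluation score is not specified here and is left open in this sketch.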
Optionally, the method further includes: synthesizing the evaluation score into the enhanced coach video frame.
Optionally, synthesizing the evaluation score into the enhanced coach video frame includes: if the same coach video frame is played for multiple users at the same time and a corresponding evaluation score is generated for each of the users based on that user's action video frames, synthesizing the multiple evaluation scores into the enhanced coach video frame according to the result of sorting the evaluation scores by magnitude.
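A minimal sketch of the sorting step follows, assuming that a higher evaluation score is better and that scores are keyed by a user identifier; both are assumptions, as is the function name `rank_scores`. Rendering the ranked list into the frame is outside the scope of the sketch.

```python
from typing import Dict, List, Tuple


def rank_scores(scores: Dict[str, float]) -> List[Tuple[int, str, float]]:
    """Return (rank, user, score) tuples ordered from highest to lowest score."""
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(rank, user, score)
            for rank, (user, score) in enumerate(ordered, start=1)]


leaderboard = rank_scores({"user_a": 70.0, "user_b": 95.0, "user_c": 82.0})
```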
Optionally, the method further includes: receiving a selection request sent by the user, the selection request including at least one of: an identifier of the coach video selected by the user, or an identifier of the learning position selected by the user.
Optionally, the coach video is a video that includes a coach's demonstration actions; coach videos include template coach videos and enhanced coach videos, where a template coach video has no user's action video frames synthesized into it, and an enhanced coach video has the action video frames of one or more users synthesized into it in advance.
Optionally, acquiring the coach video frame played for the user includes: acquiring the coach video frame played for the user based on the start time at which the coach video began playing for the user and the current time.
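One way to read this step is as a lookup of the frame index from the elapsed play time. The sketch below assumes a constant frame rate `fps`, times in seconds, and uninterrupted playback (no pause or seek); none of these is stated in the disclosure, and `current_frame_index` is a hypothetical name.

```python
from typing import Optional


def current_frame_index(start_time: float, now: float,
                        fps: float, frame_count: int) -> Optional[int]:
    """Index of the coach video frame being played at `now`, or None if
    playback has not started yet or the video has already ended."""
    elapsed = now - start_time
    if elapsed < 0:
        return None
    index = int(elapsed * fps)  # frames fully elapsed since the start
    return index if index < frame_count else None


playing = current_frame_index(start_time=100.0, now=102.0, fps=25.0, frame_count=1000)
```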
Optionally, each coach video includes one or more learning positions; before generating the enhanced coach video frame, the method further includes: if the same coach video is played for multiple users at the same time and the coach video has an unselected learning position, selecting, from the users who have not selected a learning position, a user for the unselected learning position, so as to receive the action video frames of the selected user and synthesize the user action corresponding to the selected user onto the unselected learning position in the coach video.
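The allocation described here can be sketched as follows. The disclosure does not say how a user is chosen for an unselected learning position, so this sketch simply assumes that users without a position are served in list order; the function name `allocate_positions` and the dictionary representation are likewise hypothetical.

```python
from typing import Dict, Hashable, List, Sequence


def allocate_positions(all_positions: Sequence[Hashable],
                       selected: Dict[Hashable, str],
                       waiting_users: List[str]) -> Dict[Hashable, str]:
    """Return a position -> user mapping in which positions nobody selected
    are handed to users who have not selected a position, in list order."""
    free = [p for p in all_positions if p not in selected]
    assignment = dict(selected)           # keep the users' own choices
    for position, user in zip(free, waiting_users):
        assignment[position] = user       # zip stops when either list runs out
    return assignment


assignment = allocate_positions([1, 2, 3, 4], {2: "alice"}, ["bob", "carol"])
```

With four learning positions, one already chosen, and two waiting users, positions 1 and 3 are filled and position 4 stays empty.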
Optionally, the user corresponds to a user account and the learning position corresponds to asset information; the fitness teaching method further includes: acquiring the asset information corresponding to the learning position selected by the user, and initiating an asset processing operation on the user account according to the asset information.
Optionally, acquiring the user's action video frames includes: detecting a human body in a video frame captured by a camera; and, based on the detected human body, performing background segmentation on the video frame to extract the human body, and generating an action video frame that includes only the human body's action.
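The extraction step can be illustrated with the sketch below. It does not perform human body detection itself: it assumes that some detector (not specified in the disclosure) has already produced a boolean mask that is true where the person is, and shows only how the background is removed and where the person sits in the frame. The names `extract_person` and `bounding_box` and the list-based frame representation are hypothetical.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]
Mask = List[List[bool]]


def extract_person(frame: List[List[Pixel]], mask: Mask) -> List[List[Optional[Pixel]]]:
    """Keep pixels where the mask is true and replace the rest with None."""
    if len(frame) != len(mask):
        raise ValueError("frame and mask must have the same number of rows")
    return [[pixel if keep else None for pixel, keep in zip(frame_row, mask_row)]
            for frame_row, mask_row in zip(frame, mask)]


def bounding_box(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the masked region, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, keep in enumerate(row) if keep]
    if not rows:
        return None  # no human body detected in this frame
    return min(rows), min(cols), max(rows), max(cols)


camera_frame = [[(1, 1, 1), (2, 2, 2)],
                [(3, 3, 3), (4, 4, 4)]]
person_mask = [[False, True],
               [False, True]]
action_frame = extract_person(camera_frame, person_mask)
person_box = bounding_box(person_mask)
```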
According to a second aspect of the embodiments of the present disclosure, a fitness teaching apparatus is provided, the apparatus including:
an action video frame acquisition module, configured to acquire action video frames of a user, the action video frames including video frames generated while the user performs actions;
an acquisition module, configured to acquire a coach video frame played for the user and a learning position corresponding to the user, the learning position being the display position of the user action in the coach video frame;
an enhanced coach video frame generation module, configured to synthesize the user action in the user's action video frame onto the corresponding learning position in the coach video frame played for the user, to generate an enhanced coach video frame; and
an enhanced coach video frame playing unit, configured to play the enhanced coach video frame for the user.
According to a third aspect of the embodiments of the present disclosure, an electronic device is provided, including:
a processor; and
a memory for storing instructions executable by the processor;
where the processor is configured to perform the operations in the method described above.
According to a fourth aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored; when executed by one or more processors, the computer program causes the processors to perform the operations in the method described above.
The technical solutions provided by the embodiments of the present disclosure may have the following beneficial effects:
In the present disclosure, the user action in the user's action video frame can be synthesized, according to the learning position corresponding to the user, onto the corresponding learning position in the coach video frame played for the user, thereby generating an enhanced coach video frame. Based on AR (augmented reality) technology, the present disclosure lets the user see their own actions and the coach's actions at the same time, so the user can correct their own actions by comparison; it also adds interactivity and increases the user's motivation to exercise.
In the present disclosure, the user can freely choose a fitness video and a learning position according to their own needs, thereby generating a selection request that includes the identifier of the coach video selected by the user or the identifier of the learning position selected by the user, which helps improve the user experience.
In the present disclosure, if the same coach video is played for multiple users at the same time and the coach video has an unselected learning position, the server can also select, from the users who have not selected a learning position, a user for the unselected learning position, receive the action video frames sent by the selected user, and synthesize the user action corresponding to the selected user onto the unselected learning position in the coach video, so that users who have not selected a learning position can also see their own actions, which helps improve the user experience.
In the present disclosure, the user action in the action video frame can also be compared with the coach's demonstration action in the coach video frame to generate an evaluation score that measures the user action and is sent to the client, and the evaluation score can be synthesized into the enhanced coach video frame, giving the user a clear assessment of how standard their own actions are and improving the user experience.
In the present disclosure, if the same coach video frame is played for multiple users at the same time and a corresponding evaluation score is generated for each of the users based on that user's action video frames, the multiple evaluation scores are synthesized into the enhanced coach video frame according to the result of sorting the evaluation scores by magnitude, so that each user can see not only their own score but also the scores of other users, which makes the workout more interactive.
In the present disclosure, the user corresponds to a user account and the learning position also corresponds to asset information; after the asset information corresponding to the learning position selected by the user is acquired, an asset processing operation can be initiated on the user account according to the asset information, which provides corresponding economic value for merchants.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present disclosure.
The drawings herein are incorporated into and constitute a part of this specification, show embodiments consistent with the present disclosure, and together with the specification serve to explain the principles of the present disclosure.
Fig. 1 is a structural diagram of a fitness teaching system according to an exemplary embodiment of the present disclosure.
Fig. 2 is a schematic diagram of one frame of a template coach video according to an exemplary embodiment of the present disclosure.
Fig. 3 is a schematic diagram of one frame of an enhanced coach video according to an exemplary embodiment of the present disclosure.
Fig. 4 is a schematic diagram of one frame of another enhanced coach video according to an exemplary embodiment of the present disclosure.
Fig. 5 is a structural diagram of another fitness teaching system according to an exemplary embodiment of the present disclosure.
Fig. 6 is a schematic diagram of the skeleton feature points and vector angles of a human body according to an exemplary embodiment of the present disclosure.
Fig. 7 is a structural diagram of yet another fitness teaching system according to an exemplary embodiment of the present disclosure.
Fig. 8 is a flowchart of a fitness teaching method according to an exemplary embodiment of the present disclosure.
Fig. 9 is a schematic structural diagram of a fitness teaching apparatus according to an exemplary embodiment of the present disclosure.
Fig. 10 is an architecture diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Exemplary embodiments will be described in detail here, with examples shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
The terms used in the present disclosure are only for the purpose of describing specific embodiments and are not intended to limit the present disclosure. The singular forms "a", "said", and "the" used in the present disclosure and the appended claims are also intended to include plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" used herein refers to and includes any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in the present disclosure to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from each other. For example, without departing from the scope of the present disclosure, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein can be interpreted as "when", "upon", or "in response to determining".
Fitness teaching in the related art usually takes one of the following forms. In the first, the learner obtains a coach video, which is a video that includes a coach's demonstration actions, and follows along while watching the coach's demonstration in the video. In the second, one or more learners and a coach join the same video channel, in which the learners follow the coach's real-time demonstration, and learners can communicate with one another and with the coach.
In the process of implementing the embodiments of the present disclosure, the inventors found that the first form lacks interactivity: the learner may not be able to keep up with the pace of the coach's actions and cannot tell whether the actions they perform are correct, which makes learning difficult. With the second form, the learner can receive some degree of online guidance from the coach, but it is still hard to get the sense of actually being present at a fitness venue, and the learner has to fit in with the coach's class schedule and cannot train at any time, which is a limitation.
Therefore, to solve the problems in the related art, the embodiments of the present disclosure provide a fitness teaching method. The fitness teaching method can be applied to a local terminal, which may be an electronic device such as a computer, a tablet, a smart TV, or a mobile phone; it can also be applied to a server, which may be an electronic device that can provide computing services, such as a server or a computer.
The following description takes as an example a fitness teaching system that includes the server and a client, in which the server executes the fitness teaching method; the client may be an electronic device with camera and audio/video display functions, such as a smart TV, a smartphone, a computer, a personal digital assistant (PDA), or a tablet.
Referring to FIG. 1, FIG. 1 is a structural diagram of a fitness teaching system according to an exemplary embodiment of the present disclosure. In the embodiment shown in FIG. 1, the system includes a server and a client.
The client is configured to acquire action video frames of the user while playing the coach video selected by the user, and to send them to the server.
The server is configured to receive the action video frame sent by the client, and to obtain the coach video frame being played for the client and the learning position corresponding to the client. The learning position is the display position of the user action in the coach video frame.
The server is further configured to composite the user action in the action video frame onto the corresponding learning position in the coach video frame being played for the client, generate an enhanced coach video frame, and send it to the client.
The client is further configured to receive the enhanced coach video frame sent by the server and play it.
In an example, the server is provided with a storage module (a database, ROM, etc.) for storing coach videos. A coach video is a video containing the coach's demonstration actions. Each coach video may correspond to one or more learning positions; the number can be set according to actual conditions. A learning position is the display position of a user action in the coach video. Referring to FIG. 2 and FIG. 3, FIG. 2 shows a scene in which the coach video corresponds to four learning positions, and FIG. 3 shows a user action displayed at one of the learning positions of the coach video frame. Coach videos include template coach videos and enhanced coach videos. A template coach video is a video into which no user's action video frames have been composited, as shown in FIG. 2. An enhanced coach video is a video into which the action video frames of one or more users have already been composited; FIG. 3, for example, shows one with a single user's action video frames composited in.
In an embodiment, if a coach video needs to be selected before the user of the client exercises with the fitness teaching system of the present disclosure, the server obtains coach videos from the database and pushes them to the client for display. The user can select on the client the coach video to be played and the learning position to be displayed, or either one of them, according to the user's needs. The client detects the user's selection of the coach video and/or the learning position and sends a selection request to the server, which receives the request and records it.
Depending on the user's selection, the selection request may include at least one of the identifier of the coach video selected by the user or the identifier of the learning position selected by the user. The user may select one or more learning positions as needed; the embodiments of the present disclosure impose no limitation on this. Nor do they limit the type of coach video the user selects, which may be a template coach video or an enhanced coach video. It will be understood that, provided user privacy is protected, the embodiments impose no limitation on the specific form of the enhanced coach video frames selected by the user: they may, for example, be the enhanced coach video frames generated during the user's previous exercise session, or enhanced coach video frames shared by another user.
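The selection request described above can be sketched as a small data structure. This is only an illustration: the class name `SelectionRequest` and its field names are hypothetical, and the only rule taken from the text is that the request carries at least one of a coach video identifier or a learning position identifier.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SelectionRequest:
    """Hypothetical shape of the selection request sent by the client."""
    client_id: str                                   # assumed field, not named in the text
    coach_video_id: Optional[str] = None             # identifier of the selected coach video
    learning_position_ids: List[int] = field(default_factory=list)  # one or more positions

    def __post_init__(self) -> None:
        # The text requires at least one of the two identifiers to be present.
        if self.coach_video_id is None and not self.learning_position_ids:
            raise ValueError("request must name a coach video or a learning position")
```

A request that names only a learning position is valid under this rule, which matches the case where the user reuses a coach video chosen earlier.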
In another embodiment, the user need not select the coach video to be played and/or the learning position; the server determines them instead. For example, the user presses a random-play button on the client, which may be a virtual or a physical button, the client sends a random-play request, and the server randomly determines a coach video and a learning position for the user. Alternatively, the server may determine a coach video and a learning position automatically from the user's playback history, user preferences, and the like.
In a possible implementation, the client corresponds to a user account. The server can associate with that account the coach video (template and/or enhanced) selected and played on the client, the enhanced coach video generated, and the learning position selected. This makes it convenient for the user to retrieve the associated coach videos later, helps the user review past sessions, and so helps improve the accuracy of actions in subsequent exercise. Moreover, if the user selects the same coach video for several sessions, the learning position previously selected and associated with the account can be retrieved directly, so the user need not select it again, which reduces the number of user operations. The user may of course reselect the learning position as needed; the embodiments of the present disclosure impose no limitation on this.
In the embodiments of the present disclosure, during exercise the client plays the coach video selected by the user, and the user performs actions following the coach's demonstration in the video. The client simultaneously captures the user's actions through a camera, generates action video frames containing the user action, and sends them to the server.
The client may generate the action video frames containing the user action in either of the following two possible implementations:
In one implementation, the camera may be a 2D RGB camera. The client obtains the RGB video frames captured by the 2D RGB camera, detects the human body in each RGB video frame, and then, based on the detected body, performs background segmentation on the frame to extract the body, generating an action video frame that contains only the human action.
In another implementation, the cameras may be a 2D RGB camera and a 3D depth camera. The client obtains the RGB video frames captured by the 2D RGB camera and the depth frames captured by the 3D depth camera, computes a visual depth field from the depth frames, and detects the human body in the RGB video frame. It then uses the visual depth field to separate the detected body from the background, keeping the human foreground, and so generates an action video frame that contains only the human action.
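The depth-based variant can be illustrated with a minimal sketch. The text does not say how the visual depth field is used to separate the body from the background, so the fixed depth window below (`near`, `far`) is an assumption, as are the function name and the convention of marking background pixels as `None`. Frames are plain nested lists so that the example has no dependencies.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]            # (R, G, B)
Frame = List[List[Optional[Pixel]]]     # None marks a transparent (background) pixel


def segment_foreground(rgb: List[List[Pixel]],
                       depth: List[List[float]],
                       near: float,
                       far: float) -> Frame:
    """Keep RGB pixels whose depth lies in [near, far]; drop the rest.

    Assumption: the person stands inside a known depth window in front of
    the camera, so a simple threshold on the depth field isolates them.
    """
    result: Frame = []
    for rgb_row, depth_row in zip(rgb, depth):
        result.append([
            pixel if near <= d <= far else None
            for pixel, d in zip(rgb_row, depth_row)
        ])
    return result
```

Because only the foreground pixels carry data, a frame produced this way compresses well, which is consistent with the transmission saving the description attributes to sending action-only frames.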
As can be seen, in the embodiments of the present disclosure the client sends the server action video frames that contain only the human action (that is, the user action), which reduces the amount of video data transmitted and so speeds up transmission.
In the embodiments of the present disclosure, after receiving the action video frame sent by the client, the server obtains the coach video frame being played for the client, that is, the coach video frame the client is currently playing. For example, suppose the client plays the first coach video frame and captures, through the camera, the user performing the action corresponding to the coach's demonstration in that frame. The user reacts to the first frame, and the client must process the captured frame and transmit it to the server, all of which takes time. By the time the action video frame corresponding to the first coach video frame reaches the server, the first coach video frame has already been played on the client, which may now be playing, say, the third coach video frame. The server therefore needs to obtain the coach video frame being played for the client at that moment and composite the action video frame with it, so that the user sees their own action in step with playback. Note that the coach video frame being played for the client at that moment is not the same frame as the coach video frame to which the user action in the action video frame corresponds.
In one implementation, when the client plays the selected coach video it sends the server a corresponding play request that contains a timestamp, so that the server can obtain the start time at which the selected coach video began playing for the client. The server can then determine the coach video frame being played for the client at that moment from this start time and the current time.
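As a rough sketch of this lookup, the frame index can be derived from the elapsed playback time. The constant frame rate `fps`, the clamping to the last frame, and the function name are assumptions; the text only states that the start time and the current time are used, and it does not cover pauses or seeking.

```python
def current_coach_frame(start_time_s: float,
                        now_s: float,
                        fps: float,
                        total_frames: int) -> int:
    """Index of the coach video frame being played at `now_s`.

    Assumes uninterrupted playback at a constant frame rate since
    `start_time_s` (the timestamp carried by the play request).
    """
    elapsed = max(0.0, now_s - start_time_s)   # guard against clock skew
    index = int(elapsed * fps)                 # frames shown so far
    return min(index, total_frames - 1)        # stay inside the video
```

For instance, half a second into a 30 fps video the function returns frame index 15.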
A fitness teaching scenario may be one-to-one (one user selects a coach video to learn from) or many-to-one (several users select the same coach video at the same time); this embodiment therefore imposes no limitation on the number of clients.
In one possible scenario, if the server detects that only one client has currently selected the coach video for playback, the server receives the action video frame sent by the client and then obtains the coach video frame being played for the client at that moment and the learning position corresponding to the client, the learning position being the display position of the user action in the coach video frame. Finally, the server composites the user action in the action video frame onto the corresponding learning position in that coach video frame, generates an enhanced coach video frame, and sends it to the client for playback. The enhanced coach video frame contains both the coach's demonstration action and the user's own action. Referring to FIG. 3, if the coach video selected by the user is a template coach video, the generated enhanced coach video frame contains only the coach's demonstration action and the user's own action. Based on AR (visual augmented reality) technology, the embodiments of the present disclosure let the user see their own action and the coach's action at the same time.
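The compositing step can be sketched as pasting the segmented user pixels onto a copy of the coach frame at the learning position. This is an illustration under assumptions: frames are nested lists of RGB tuples, background pixels of the action frame are `None`, and the learning position is given as the (row, column) of the top-left corner. None of these representations is specified in the text.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]


def composite(coach_frame: List[List[Pixel]],
              action_frame: List[List[Optional[Pixel]]],
              position: Tuple[int, int]) -> List[List[Pixel]]:
    """Return an enhanced frame: coach frame with the user action pasted in.

    `position` is the assumed (top, left) corner of the learning position.
    The coach frame itself is left unmodified.
    """
    top, left = position
    enhanced = [row[:] for row in coach_frame]      # copy each row
    for r, row in enumerate(action_frame):
        for c, pixel in enumerate(row):
            y, x = top + r, left + c
            inside = 0 <= y < len(enhanced) and 0 <= x < len(enhanced[y])
            if pixel is not None and inside:        # skip background and overflow
                enhanced[y][x] = pixel
    return enhanced
```

Copying the coach frame rather than drawing into it keeps the template coach video intact, so the same template can serve other clients.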
The embodiments of the present disclosure impose no limitation on the number of action video frames a client uploads at the same moment; that is, they are not limited to the scenario in which one client uploads the action video frames of only one user. Two or more learners may learn from the coach video in the same place. In that case the client captures the users' actions through the camera, obtains at least two human actions from each captured video frame, generates at least two corresponding action video frames, and sends them to the server.
If, before the coach video is played, learning positions matching the number of learners have been selected on the client, the client can obtain from each captured video frame several action video frames, each corresponding to a single learner and containing only that learner's action, and send them to the server. The server receives the action video frames and composites the human action in each onto the corresponding learning position in the coach video frame being played for the client at that moment, generating an enhanced coach video frame. If only one learning position was selected on the client before the coach video is played, the server selects one of the action video frames sent by the client and composites the human action in it onto the corresponding learning position in the coach video frame being played for the client at that moment, generating an enhanced coach video frame. The embodiments of the present disclosure impose no limitation on how the server selects the action video frame: the selection may be random, or the server may, based on a face image of the user pre-stored for the client, identify among the action video frames the one that matches the user's face image.
In an example, if the user selects an enhanced coach video for playback and another user's action has already been composited at the selected learning position, then after receiving the action video frame sent by the client and before compositing, the server may remove the previously composited action of the other user, and then composite the user action in the client's action video frame onto the corresponding learning position in the coach video frame being played for the client at that moment.
In another possible scenario, if the server detects that several clients are playing the same coach video at the same time, the server receives the action video frames sent by the clients and then obtains the coach video frame being played for the clients at that moment and the learning position information corresponding to each client. Finally, the server composites the user actions in the action video frames onto the respective learning positions in the coach video frame being played synchronously for the clients, generating an enhanced coach video frame that contains the coach's demonstration action and the actions of the several users. Referring to FIG. 4, if the coach video selected is a template coach video, the generated enhanced coach video frame contains the coach's demonstration action and the actions of the several users. In the embodiments of the present disclosure the user can thus also see the actions of other users, which makes exercise more interactive, offers a high-quality way for users to motivate one another and exercise together, and increases users' enthusiasm for exercise. In an example, the action video frames sent by the several clients correspond to the same coach video frame.
In a possible implementation, the server may receive action video frames only from clients that have selected a learning position, to improve receiving efficiency. In addition, if the server detects that the same coach video frame is being played for several users at the same time and the coach video has learning positions that no one has selected, the server may pick, from the clients that have not selected a learning position, clients to fill the unselected learning positions, receive the action video frames sent by the picked clients, and composite the corresponding user actions onto those unselected learning positions in the coach video, thereby enhancing interaction among users.
The embodiments of the present disclosure impose no limitation on how the server picks, from the clients that have not selected a learning position, the clients for the unselected learning positions. For example, the server may pick them at random, or according to a preset rule. The preset rule may favor, among the clients that have not selected a learning position, the client that has played the most videos, or a client that has selected a learning position in other coach videos, and so on.
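One of the preset rules mentioned above, favoring the clients that have played the most videos, might look like the following sketch. The function name, the input shape (a mapping from client identifier to play count), and the tie-break on client identifier are assumptions made for illustration.

```python
from typing import Dict, List


def fill_unselected_positions(free_positions: List[int],
                              play_counts: Dict[str, int]) -> Dict[int, str]:
    """Assign free learning positions to clients that selected none.

    `play_counts` maps each such client to the number of videos it has
    played. Clients with more plays are served first; ties are broken by
    client id so the result is deterministic (an assumed tie-break).
    """
    ranked = sorted(play_counts, key=lambda cid: (-play_counts[cid], cid))
    # zip stops at the shorter sequence, so surplus positions or clients
    # are simply left unassigned.
    return dict(zip(free_positions, ranked))
```

Replacing the sort with `random.sample` would give the random selection that the description also allows.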
In the embodiments of the present disclosure, after generating the enhanced coach video frame, the server sends it to the client, which receives and plays it. The electronic device integrating the client may include a display screen through which the client plays the enhanced coach video frame. In an example, the client uploads the action video frames via streaming media technology and the server sends the enhanced coach video frames via streaming media technology, to ensure fast transmission of video frames.
In an embodiment, after generating the enhanced coach video frames corresponding to the coach video, the server saves the resulting enhanced coach video in the storage module. If the user has an account, the server may also associate the enhanced coach video with that account, which helps the user review past sessions, helps improve the accuracy of actions in subsequent exercise, and also enriches the coach video resources.
In an embodiment, referring to FIG. 2, a learning position may also correspond to asset information representing the value of that learning position; the asset information may be, for example, virtual coins, points, or actual currency. When the server receives the learning position selected by the user, it initiates, according to the asset information corresponding to the learning position, an asset processing operation on the user account corresponding to the client, and the client then executes the asset processing operation that the server initiated on its user account. The asset processing operation may be, for example, deducting virtual coins, points, or balance from the user account, or deducting from the balance of a bank card or third-party payment account bound to the user account. This provides corresponding economic value to the merchant.
Referring to FIG. 5, FIG. 5 is a structural diagram of another fitness teaching system according to an exemplary embodiment of the present disclosure. In the embodiment shown in FIG. 5, the system includes a server and a client.
The client is configured to acquire action video frames of the user while playing the coach video selected by the user, and to send them to the server.
The server is configured to receive the action video frame sent by the client, and to obtain the coach video frame being played for the client and the learning position corresponding to the client. The learning position is the display position of the user action in the coach video frame.
The server is further configured to composite the user action in the action video frame onto the corresponding learning position in the currently played coach video frame, generate an enhanced coach video frame, and send it to the client.
The server is further configured to: obtain the coach video frame corresponding to the action video frame, where the time at which the client acquired the action video frame is the same as, or within a preset time difference of, the time at which the coach video frame was played; compare the user action in the action video frame with the coach's demonstration action in the coach video frame to generate an evaluation score measuring the user action; and send the evaluation score to the client.
The client is further configured to receive the enhanced coach video frame and the evaluation score sent by the server, composite the evaluation score into the enhanced coach video frame, and play it.
In the embodiments of the present disclosure, the server can obtain the coach video frame corresponding to the action video frame for comparison. For example, suppose the client plays the first coach video frame, captures through the camera the user performing the action corresponding to the coach's demonstration in that frame, and uploads the action video frame generated from that action to the server; the coach video frame corresponding to this action video frame is then the first coach video frame. In one possible implementation, the time at which the client captures the user action coincides with the time at which the coach video frame is played. In this ideal case the server can determine the corresponding coach video frame from the capture time of the action video frame alone, that is, the time at which the client acquired the action video frame is the same as the time at which the client played the coach video frame. In another possible implementation, to allow for the user's reaction delay, the server may determine the corresponding coach video frame from the capture time of the action video frame together with a preset time difference, that is, the time at which the client acquired the action video frame is within the preset time difference of the time at which the client played the coach video frame.
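The two matching variants differ only in whether a reaction delay is subtracted from the capture time. The sketch below is one possible reading: it assumes constant-rate playback from a known start time and treats the preset time difference as a fixed delay `reaction_delay_s`. The names and the frame-rate model are not taken from the text.

```python
def matching_coach_frame(capture_time_s: float,
                         play_start_s: float,
                         fps: float,
                         reaction_delay_s: float = 0.0) -> int:
    """Index of the coach frame that the captured action responds to.

    With reaction_delay_s == 0 this is the ideal case (capture time equals
    play time); a positive value shifts the match back to the frame that
    was on screen when the user started reacting.
    """
    played_at = capture_time_s - reaction_delay_s - play_start_s
    return max(0, int(played_at * fps))   # never before the first frame
```

This frame is used only for scoring. The frame the action is composited into is the one being played when the action frame reaches the server, which is generally a later frame.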
In the embodiments of the present disclosure, after obtaining the coach video frame corresponding to the action video frame, the server can compare the user action in the action video frame with the coach's demonstration action in the coach video frame and generate an evaluation score measuring the user action. Specifically, the server may obtain from the action video frame the coordinate data of the user's skeleton feature points as user action data, and from the coach video frame the coordinate data of the coach's skeleton feature points as coach demonstration action data; the coordinate data may be two-dimensional or three-dimensional. It then generates the evaluation score from the difference between the user action data and the coach demonstration action data. This calculation helps improve the accuracy of the evaluation score. The coach demonstration action data may be offline data obtained in advance, which speeds up processing on the server and improves response efficiency.
In a specific implementation, the server may obtain all user vector angles from the user action data and all standard vector angles from the coach demonstration action data. A user vector angle is the angle between the two vectors formed by the coordinates of any three adjacent skeleton feature points of the user; a standard vector angle is the angle between the two vectors formed by the coordinates of any three adjacent skeleton feature points of the coach. User vector angles and standard vector angles correspond one to one. Referring to FIG. 6, FIG. 6 shows 14 skeleton feature points (the black marked points); the two vectors formed by any three adjacent skeleton feature points determine one vector angle, giving 13 vector angles in total. A similarity parameter is then determined from all user vector angles and the corresponding standard vector angles, and from it the evaluation score for the user action. The similarity parameter includes at least a standard deviation result or a variance result computed from the user vector angles and the standard vector angles. The smaller the standard deviation or variance, the greater the similarity and the higher the evaluation score. The standard vector angles may be offline data obtained in advance, which speeds up processing on the server and improves response efficiency.
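A vector angle of this kind can be computed with the dot product. The sketch below assumes the angle is measured at the middle of the three adjacent skeleton feature points, between the vectors pointing to its two neighbors; the text does not say which point serves as the vertex. It accepts 2D or 3D coordinates, as the description allows both.

```python
import math
from typing import Sequence


def vector_angle(a: Sequence[float],
                 b: Sequence[float],
                 c: Sequence[float]) -> float:
    """Angle in degrees at point b between vectors b->a and b->c."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm1 = math.sqrt(sum(x * x for x in v1))
    norm2 = math.sqrt(sum(x * x for x in v2))
    if norm1 == 0.0 or norm2 == 0.0:
        raise ValueError("adjacent feature points must not coincide")
    # Clamp to [-1, 1] so rounding error cannot push acos out of domain.
    cosine = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cosine))
```

Applied to shoulder, elbow, and wrist coordinates, for example, the function yields the elbow angle, which does not depend on where the person stands in the frame or on their size. That invariance is a plausible reason for comparing angles rather than raw coordinates.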
In another example, the server may obtain only some of the user vector angles from the user action data, and the corresponding standard vector angles from the coach demonstration action data. Each user vector angle corresponds to one standard vector angle. The similarity parameter is then determined from this subset of user vector angles and the corresponding standard vector angles, and from it the evaluation score for the user action.
In an example, the server may compute the similarity parameter as follows. For each standard vector angle and its corresponding user vector angle, compute the difference vector. Let the number of standard vector angles and the number of user vector angles both be $N$ ($N$ an integer greater than 0), let the $i$-th ($1 \le i \le N$) standard vector angle be $\alpha_i$, the $i$-th user vector angle be $\beta_i$, and the difference vector be $\Delta\alpha_i$; then $\Delta\alpha_i = \alpha_i - \beta_i$. Compute the average difference vector from all the difference vectors; denoting the average difference vector by $\Delta r$,
$$\Delta r = \frac{1}{N}\sum_{i=1}^{N}\Delta\alpha_i$$
Compute the similarity parameter from all the difference vectors and the average difference vector; denoting the similarity parameter by $S$,
$$S = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\Delta\alpha_i - \Delta r\right)^2}$$
or
$$S = \frac{1}{N}\sum_{i=1}^{N}\left(\Delta\alpha_i - \Delta r\right)^2$$
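The calculation can be sketched in a few lines. The formulas in the source did not survive extraction, so this example assumes the similarity parameter is the population standard deviation (or, optionally, the population variance) of the per-angle differences, normalized by N; that choice is consistent with the description of a "standard deviation result or variance result" but is an assumption. The function name is hypothetical.

```python
import math
from typing import Sequence


def similarity_parameter(standard_angles: Sequence[float],
                         user_angles: Sequence[float],
                         use_variance: bool = False) -> float:
    """Spread of the angle differences; smaller means more similar."""
    n = len(standard_angles)
    if n == 0 or n != len(user_angles):
        raise ValueError("need equal, non-empty angle lists")
    # Difference for each pair of corresponding angles.
    diffs = [a - b for a, b in zip(standard_angles, user_angles)]
    mean_diff = sum(diffs) / n                       # average difference
    variance = sum((d - mean_diff) ** 2 for d in diffs) / n
    return variance if use_variance else math.sqrt(variance)
```

Note that, under this reading, a user whose every angle is off by the same amount gets a parameter of zero, because the mean difference is subtracted out. How the parameter is mapped to an evaluation score is not specified in the text beyond "smaller is better".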
In one possible scenario, if the server detects that the coach video is being played for a single user, the server sends the evaluation score generated for the user terminal to that user terminal. The user terminal receives the evaluation score, generates a score image from it, and synthesizes the score image with the enhanced coach video frame to generate and play an enhanced coach video frame that includes the evaluation score. The user thus gets a clear assessment of how closely their actions match the standard, which improves the user experience.
In another possible scenario, if the server detects that the same coach video is being played synchronously for multiple user terminals, and has generated a corresponding evaluation score from the action video frames sent by each of them, the server sends the evaluation scores of the multiple user terminals to each user terminal. Each user terminal receives the evaluation scores, sorts them by value, generates a score image from the sorted scores, and synthesizes it with the enhanced coach video frame to generate and play an enhanced coach video frame that includes the evaluation scores. Users can thus see not only their own scores but also those of other users, which makes the workout more interactive. In another example, the user terminal may, based on the user's selection, receive only its own evaluation score for synthesis and playback; the embodiments of the present disclosure place no limitation on this.
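The sorting of multiple evaluation scores before the score image is generated might look like the following sketch. Descending order (highest score first), the text layout of each line, and the names `rank_scores` and `score_overlay_lines` are assumptions made for illustration; rendering the lines into an actual image is omitted.

```python
def rank_scores(scores_by_user):
    """Return (user, score) pairs sorted from highest to lowest score.

    Python's sort is stable, so users with equal scores keep their
    original relative order.
    """
    return sorted(scores_by_user.items(), key=lambda kv: kv[1], reverse=True)

def score_overlay_lines(scores_by_user, only_user=None):
    """Build the text lines that a score image could display.

    If only_user is given, only that user's score is listed, matching
    the case where the user chooses to see just their own score.
    """
    if only_user is not None:
        return [f"{only_user}: {scores_by_user[only_user]:.1f}"]
    return [f"{rank}. {user}: {score:.1f}"
            for rank, (user, score)
            in enumerate(rank_scores(scores_by_user), start=1)]
```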
Referring to FIG. 7, FIG. 7 is a structural diagram of yet another fitness teaching system according to an exemplary embodiment of the present disclosure. In the embodiment shown in FIG. 7, the system includes a server and a user terminal.
The user terminal is configured to acquire the user's action video frame while the coach video selected by the user is playing, and to send it to the server.
The server is configured to receive the action video frame sent by the user terminal, and to obtain the coach video frame played for the user terminal and the learning position corresponding to the user terminal; the learning position is the display position of the user action in the coach video frame.
The server is further configured to synthesize the user action in the action video frame onto the corresponding learning position in the currently playing coach video frame, generate an enhanced coach video frame, and send it to the user terminal.
The server is further configured to: obtain the coach video frame corresponding to the action video frame, where the time at which the user terminal acquires the action video frame is the same as the time at which the coach video frame is played, or within a preset time difference of it; compare the user action in the action video frame with the coach demonstration action in the coach video frame to generate an evaluation score that measures the user action; and synthesize the evaluation score into the enhanced coach video frame, generating an enhanced coach video frame that includes the evaluation score and sending it to the user terminal.
The user terminal is further configured to receive and play the enhanced coach video frame including the evaluation score sent by the server.
In the embodiments of the present disclosure, after obtaining the coach video frame corresponding to the action video frame, the server may compare the user action in the action video frame with the coach demonstration action in the coach video frame to generate an evaluation score that measures the user action.
In one possible implementation scenario, if the server detects that the coach video is being played for a single user, the server generates a score image from the evaluation score generated for the user terminal, synthesizes the score image with the enhanced coach video frame to generate an enhanced coach video frame that includes the evaluation score, and sends it to the user terminal. The user terminal receives and plays the enhanced coach video frame including the evaluation score, so that the user gets a clear assessment of how closely their actions match the standard, which improves the user experience.
In another possible implementation scenario, if the server detects that the same coach video is being played synchronously for multiple user terminals, and has generated a corresponding evaluation score from the action video frames sent by each of them, the server sorts the evaluation scores by value, generates a score image from the sorted scores, and overlays the score image on the enhanced coach video frame to generate an enhanced coach video frame that includes the evaluation scores, which it sends to the user terminals. Each user terminal receives and plays this frame, so that users can see not only their own scores but also those of other users, which makes the workout more interactive. In another example, the server may, based on a user's selection, generate an enhanced coach video frame that includes only that user's evaluation score and send it to the user terminal; the embodiments of the present disclosure place no limitation on this.
The following description takes the fitness teaching method applied to a local terminal as an example:
FIG. 8 is a flowchart of a fitness teaching method according to an exemplary embodiment of the present disclosure.
In the embodiment shown in FIG. 8, the method includes the following steps.
In step S101, an action video frame of the user is acquired; the action video frame includes a video frame generated while the user performs an action.
In step S102, the coach video frame played for the user and the learning position corresponding to the user are obtained; the learning position is the display position of the user action in the coach video frame.
In step S103, the user action in the user's action video frame is synthesized onto the corresponding learning position in the coach video frame played for the user, generating an enhanced coach video frame.
In step S104, the enhanced coach video frame is played for the user.
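The synthesis in step S103 can be pictured with the following minimal sketch. It assumes that frames are lists of pixel rows, that background pixels of the segmented action video frame are marked `None`, and that the learning position is given as a `(top, left)` offset in the coach video frame; all of these representations, and the name `composite`, are illustrative assumptions rather than details from the disclosure.

```python
def composite(coach_frame, action_frame, learning_position):
    """Paste the user's foreground pixels onto a copy of the coach frame.

    coach_frame: list of rows of pixel values.
    action_frame: list of rows where None marks removed background.
    learning_position: (top, left) offset of the user action in the
    coach frame (an assumed representation).
    """
    top, left = learning_position
    # Copy each row so the original coach frame is left unchanged.
    out = [row[:] for row in coach_frame]
    for r, row in enumerate(action_frame):
        for c, pixel in enumerate(row):
            y, x = top + r, left + c
            # Skip background pixels and anything outside the coach frame.
            if pixel is not None and 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = pixel
    return out
```

The returned frame corresponds to the enhanced coach video frame that is then played in step S104.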
In an embodiment, the terminal may be provided with a storage module (a database, ROM, etc.) for storing coach videos. A coach video is a video that includes coach demonstration actions. Each coach video may correspond to one or more learning positions, the number of which may be set according to actual conditions; a learning position is the display position of a user action in the coach video. In addition, the coach videos include template coach videos and enhanced coach videos: a template coach video is a video into which no user's action video frames have been synthesized, and an enhanced coach video is a video into which the action video frames of one or more users have been synthesized in advance. The terminal may connect to a predetermined cloud to update the coach videos periodically.
In an embodiment, the terminal may push coach videos to the user based on the storage module and, when the user wants to exercise, receive a selection request sent by the user. The selection request includes at least one of the following: an identifier of the coach video selected by the user, or an identifier of the learning position selected by the user.
Optionally, the user may leave the choice of the coach video to be played and/or the learning position to the terminal. For example, the user presses a shuffle-play button on the terminal (which may be a virtual button or a physical button), and the terminal randomly determines a coach video and a learning position for the user; alternatively, the terminal may automatically determine a coach video and a learning position according to the user's playback history, user preferences, or the like.
In the embodiments of the present disclosure, during exercise or a workout, the terminal plays the coach video selected by the user, and the user performs the corresponding actions by following the coach's demonstration actions in the video. The user terminal simultaneously captures the user's actions through a camera to generate action video frames that include the user's actions.
In one implementation, the camera may be a 2D RGB camera. The user terminal obtains the RGB video frames captured by the 2D RGB camera, detects the human body in the RGB video frames, and then, based on the detected human body, performs background segmentation on the video frames to extract the human body, generating action video frames that include only the human body action.
In another implementation, the cameras may be a 2D RGB camera and a 3D depth camera. The user terminal obtains the RGB video frames captured by the 2D RGB camera and the depth field frames captured by the 3D depth camera, calculates a visual depth field from the depth field frames, and detects the human body in the RGB video frames. It then uses the visual depth field to separate the detected human body from the background in the RGB video frames, keeping the human foreground, and thereby generates action video frames that include only the human body action.
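The depth-assisted separation of the human foreground from the background can be illustrated with a deliberately simplified sketch. It replaces human body detection with a plain depth threshold, which is a stand-in technique and not the method of the disclosure; the per-pixel depth map in metres, the `None` marker for missing depth or removed background, and the name `segment_foreground` are all assumptions.

```python
def segment_foreground(rgb_frame, depth_frame, max_depth_m):
    """Keep pixels closer than max_depth_m; mark everything else as None.

    rgb_frame and depth_frame are equally sized lists of rows. A depth
    of None means the depth camera returned no reading for that pixel,
    and such pixels are treated as background.
    """
    return [
        [pixel if depth is not None and depth <= max_depth_m else None
         for pixel, depth in zip(rgb_row, depth_row)]
        for rgb_row, depth_row in zip(rgb_frame, depth_frame)
    ]
```

In a fuller pipeline the threshold would presumably be restricted to the region where a human body was detected, so that nearby furniture is not kept as foreground.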
Next, after generating the action video frame, the terminal obtains the coach video frame being played for the user at that moment and the learning position corresponding to the user, synthesizes the user action in the user's action video frame onto the corresponding learning position in the coach video frame played for the user to generate an enhanced coach video frame, and plays the enhanced coach video frame for the user. The terminal may obtain the coach video frame being played for the user at that moment based on the start time at which the coach video began playing for the user and the current time. Based on AR (augmented reality) technology, the embodiments of the present disclosure let the user see their own actions and the coach's actions at the same time, which improves interactivity.
In an embodiment, the coach video frame is a video frame that includes a coach demonstration action, and the terminal may obtain the coach video frame corresponding to the user's action video frame, where the time at which the user's action video frame is acquired is the same as the time at which the coach video frame is played, or within a preset time difference of it. The terminal then compares the user action in the user's action video frame with the coach demonstration action in the coach video frame to generate an evaluation score that measures the user action, and synthesizes the evaluation score into the enhanced coach video frame.
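Locating the coach video frame from the playback start time and the current time, and checking that an action video frame falls within the preset time difference, could be sketched as follows. A constant frame rate, timestamps in seconds, and the names `current_frame_index` and `frames_correspond` are assumptions of this sketch.

```python
def current_frame_index(start_time_s, now_s, fps, frame_count):
    """Index of the coach video frame being played at time now_s.

    Assumes the video has played continuously at a constant frame rate
    since start_time_s; the index is clamped to the last frame.
    """
    elapsed = now_s - start_time_s
    if elapsed < 0:
        raise ValueError("current time precedes playback start")
    return min(int(elapsed * fps), frame_count - 1)

def frames_correspond(capture_time_s, play_time_s, max_diff_s):
    """True if the action frame was captured at the same time as the
    coach frame was played, or within the preset time difference."""
    return abs(capture_time_s - play_time_s) <= max_diff_s
```

Only action video frames for which `frames_correspond` holds would then be compared with the coach demonstration action to produce the evaluation score.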
In one example, after generating the enhanced coach video frames corresponding to the coach video, the terminal saves the resulting enhanced coach video to the storage module and, if the user has an account, may also associate the enhanced coach video with that account. This helps the user review and learn from their performance, makes it easier to improve the accuracy of their actions in later exercise or workout sessions, and enriches the pool of coach videos.
In an embodiment, a learning position may also correspond to asset information that represents the value of the learning position; for example, the asset information may be virtual currency, points, or actual currency. When the terminal receives the learning position selected by the user, it initiates an asset processing operation on the user's account according to the asset information corresponding to that learning position. For example, the asset processing operation may be deducting virtual currency, points, or balance from the user's account, or deducting the balance of a bank card or third-party payment account bound to the user's account, thereby providing the merchant with corresponding economic value.
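A minimal sketch of an asset processing operation that deducts points for a selected learning position is given below. The `UserAccount` type, the use of integer points, and the rule of refusing the deduction when the balance is insufficient are all assumptions for illustration; the text does not specify how insufficient balances are handled.

```python
from dataclasses import dataclass

@dataclass
class UserAccount:
    """Hypothetical account holding a points balance."""
    user_id: str
    points: int

def charge_for_learning_position(account, position_price):
    """Deduct the price of the selected learning position from the account.

    Returns True if the deduction succeeded, or False (leaving the
    balance untouched) if the account cannot cover the price.
    """
    if position_price < 0:
        raise ValueError("price must be non-negative")
    if account.points < position_price:
        return False
    account.points -= position_price
    return True
```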
In one implementation, the terminal may obtain the coordinate data of the user's skeleton feature points from the user's action video frame as user action data, and obtain the coordinate data of the coach's skeleton feature points from the coach video frame as coach demonstration action data. The coordinate data of the skeleton feature points is two-dimensional (captured with a 2D camera) or three-dimensional (captured with a combination of 2D and 3D cameras). The terminal then generates an evaluation score that measures the user action according to the difference between the user action data and the coach demonstration action data.
In one implementation, the terminal obtains at least one user vector angle based on the user action data and at least one standard vector angle based on the coach demonstration action data. A user vector angle is the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points of the user, and a standard vector angle is the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points of the coach. The terminal then determines a similarity parameter from the at least one user vector angle and the corresponding standard vector angle, and from it the evaluation score corresponding to the user action. The similarity parameter includes at least a standard deviation result or a variance result of the at least one user vector angle and the corresponding standard vector angle; the smaller the standard deviation result or the variance result, the greater the similarity and the higher the evaluation score.
In one implementation, the terminal may determine the similarity parameter as follows. For each user vector angle and the corresponding standard vector angle, a difference vector is calculated: let the number of standard vector angles and the number of user vector angles both be $N$ ($N$ being an integer greater than 0), let the $i$-th ($1 \le i \le N$) standard vector angle be $\alpha_i$, the $i$-th user vector angle be $\beta_i$, and the difference vector be $\Delta\alpha_i$; then $\Delta\alpha_i = \alpha_i - \beta_i$. An average difference vector is calculated from all the difference vectors: letting the average difference vector be $\Delta r$, then $\Delta r = \frac{1}{N}\sum_{i=1}^{N}\Delta\alpha_i$. The similarity parameter is calculated from all the difference vectors and the average difference vector: letting the similarity parameter be $S$, then $S = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\Delta\alpha_i - \Delta r\right)^2}$ (the standard deviation result) or $S = \frac{1}{N}\sum_{i=1}^{N}\left(\Delta\alpha_i - \Delta r\right)^2$ (the variance result).
In an embodiment, if the same coach video frame is played for multiple users at the same time, and a corresponding evaluation score is generated from each user's action video frame, the terminal may synthesize the multiple evaluation scores into the enhanced coach video frame in order of their values, so that users can see not only their own scores but also those of other users, which makes the workout more interactive.
FIG. 9 is a schematic structural diagram of a fitness teaching device according to an exemplary embodiment of the present disclosure.
In the embodiment shown in FIG. 9, the device includes an action video frame acquisition module 21, an acquisition module 22, an enhanced coach video frame generation module 23, and an enhanced coach video frame playback unit 24.
The action video frame acquisition module 21 is configured to acquire an action video frame of the user; the action video frame includes a video frame generated while the user performs an action.
The acquisition module 22 is configured to obtain the coach video frame played for the user and the learning position corresponding to the user; the learning position is the display position of the user action in the coach video frame.
The enhanced coach video frame generation module 23 is configured to synthesize the user action in the user's action video frame onto the corresponding learning position in the coach video frame played for the user, generating an enhanced coach video frame.
The enhanced coach video frame playback unit 24 is configured to play the enhanced coach video frame for the user.
Optionally, the coach video frame is a video frame that includes a coach demonstration action, and the device further includes an evaluation score generation module.
The acquisition module is further configured to obtain the coach video frame corresponding to the user's action video frame, where the time at which the user's action video frame is acquired is the same as the time at which the coach video frame is played, or within a preset time difference of it.
The evaluation score generation module is configured to compare the user action in the user's action video frame with the coach demonstration action in the coach video frame to generate an evaluation score that measures the user action.
Optionally, the evaluation score generation module includes an action data acquisition sub-module and an evaluation score generation sub-module.
The action data acquisition sub-module is configured to obtain the coordinate data of the user's skeleton feature points from the user's action video frame as user action data, and to obtain the coordinate data of the coach's skeleton feature points from the coach video frame as coach demonstration action data.
The evaluation score generation sub-module is configured to generate an evaluation score that measures the user action according to the difference between the user action data and the coach demonstration action data.
Optionally, the evaluation score generation sub-module includes a vector angle acquisition unit and an evaluation score generation unit.
The vector angle acquisition unit is configured to obtain at least one user vector angle based on the user action data and at least one standard vector angle based on the coach demonstration action data. A user vector angle is the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points of the user; a standard vector angle is the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points of the coach.
The evaluation score generation unit is configured to determine a similarity parameter from the at least one user vector angle and the corresponding standard vector angle, and from it the evaluation score corresponding to the user action. The similarity parameter includes at least a standard deviation result or a variance result of the at least one user vector angle and the corresponding standard vector angle.
Optionally, the evaluation score generation unit includes a difference vector calculation subunit, an average difference vector calculation subunit, a similarity parameter calculation subunit, and an evaluation score determination subunit.
The difference vector calculation subunit is configured to calculate, for each user vector angle and the corresponding standard vector angle, a difference vector: let the number of standard vector angles and the number of user vector angles both be $N$ ($N$ being an integer greater than 0), let the $i$-th ($1 \le i \le N$) standard vector angle be $\alpha_i$, the $i$-th user vector angle be $\beta_i$, and the difference vector be $\Delta\alpha_i$; then $\Delta\alpha_i = \alpha_i - \beta_i$.
The average difference vector calculation subunit is configured to calculate an average difference vector from the difference vectors: letting the average difference vector be $\Delta r$, then $\Delta r = \frac{1}{N}\sum_{i=1}^{N}\Delta\alpha_i$.
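The similarity parameter calculation subunit is configured to calculate the similarity parameter from the difference vectors and the average difference vector: letting the similarity parameter be $S$, then $S = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\Delta\alpha_i - \Delta r\right)^2}$ (the standard deviation result) or $S = \frac{1}{N}\sum_{i=1}^{N}\left(\Delta\alpha_i - \Delta r\right)^2$ (the variance result).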
The evaluation score determination subunit is configured to determine the evaluation score corresponding to the user action according to the similarity parameter.
Optionally, the device further includes an evaluation score synthesis module configured to synthesize the evaluation score into the enhanced coach video frame.
Optionally, the evaluation score synthesis module is configured to: if the same coach video frame is played for multiple users at the same time, and a corresponding evaluation score is generated from each user's action video frame, synthesize the multiple evaluation scores into the enhanced coach video frame in order of their values.
Optionally, the device further includes a selection request receiving module configured to receive a selection request sent by the user. The selection request includes at least one of the following: an identifier of the coach video selected by the user, or an identifier of the learning position selected by the user.
Optionally, the coach video is a video that includes coach demonstration actions, and the coach videos include template coach videos and enhanced coach videos: a template coach video has no user's action video frames synthesized into it, and an enhanced coach video has the action video frames of one or more users synthesized into it in advance.
Optionally, obtaining the coach video frame played for the user in the acquisition module 22 includes: obtaining the coach video frame played for the user based on the start time at which the coach video began playing for the user and the current time.
Optionally, each coach video includes one or more learning positions. For use before the enhanced coach video frame is generated, the device further includes a learning position allocation module configured to: if the same coach video is played for multiple users at the same time and the coach video has learning positions that have not been selected, select, from the users who have not selected a learning position, users for the unselected learning positions, so as to receive the action video frames of the selected users and synthesize their user actions onto the unselected learning positions in the coach video.
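One way the learning position allocation could work is sketched below. Assigning free positions to waiting users in first-come order, and the name `allocate_positions`, are assumptions; the text only states that users without a position are selected for the unselected positions.

```python
def allocate_positions(all_positions, chosen, waiting_users):
    """Assign unselected learning positions to users who chose none.

    all_positions: ordered list of position identifiers in the coach video.
    chosen: dict mapping user -> position already selected by that user.
    waiting_users: ordered list of users who have not selected a position.
    Returns a new dict with the original choices plus the new assignments;
    waiting users beyond the number of free positions stay unassigned.
    """
    taken = set(chosen.values())
    free = [p for p in all_positions if p not in taken]
    assignment = dict(chosen)
    # zip stops at the shorter list, so surplus users or positions are skipped.
    for user, position in zip(waiting_users, free):
        assignment[user] = position
    return assignment
```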
Optionally, the user corresponds to a user account and the learning position corresponds to asset information. The device then further includes an asset operation initiation module configured to obtain the asset information corresponding to the learning position selected by the user and, according to the asset information, initiate an asset processing operation on the user account.
Optionally, the action video frame acquisition module 21 includes a human body detection sub-module and an action video frame generation sub-module.
The human body detection sub-module is configured to detect the human body in the video frames captured by the camera.
The action video frame generation sub-module is configured to perform background segmentation on the video frames, based on the detected human body, to extract the human body and generate action video frames that include only the human body action.
For how the functions of each module in the above fitness teaching device are implemented, see the implementation of the corresponding steps in the above fitness teaching system or fitness teaching method; details are not repeated here.
Since the device embodiments essentially correspond to the method embodiments, the relevant parts of the description of the method embodiments apply. The device embodiments described above are merely illustrative: the modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules; that is, they may be located in one place or distributed across multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the present disclosure. Those of ordinary skill in the art can understand and implement this without creative effort.
Correspondingly, the present disclosure also provides an electronic device, including a processor and a memory for storing instructions executable by the processor, where the processor is configured to perform the operations of the method described above.
FIG. 10 is a schematic structural diagram of an electronic device to which a fitness teaching device is applied, according to an exemplary embodiment.
FIG. 10 shows an electronic device 300 according to an exemplary embodiment. The electronic device 300 may be a computing device such as a computer, a server, a mobile phone, or a tablet.
Referring to FIG. 10, the electronic device 300 may include one or more of the following components: a processing component 301, a memory 302, a power supply component 303, a multimedia component 304, an audio component 305, an input/output (I/O) interface 306, a sensor component 307, and a communication component 308.
The processing component 301 generally controls the overall operation of the electronic device 300, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 301 may include one or more processors 309 that execute instructions to complete all or some of the steps of the method described above. In addition, the processing component 301 may include one or more modules that facilitate interaction between the processing component 301 and other components. For example, the processing component 301 may include a multimedia module to facilitate interaction between the multimedia component 304 and the processing component 301.
The memory 302 is configured to store various types of data to support operation of the electronic device 300. Examples of such data include instructions for any application or method operating on the electronic device 300, contact data, phone book data, messages, pictures, videos, and so on. The memory 302 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
The power supply component 303 provides power to the various components of the electronic device 300. The power supply component 303 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device 300.
The multimedia component 304 includes a screen that provides an output interface between the electronic device 300 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action but also the duration and pressure associated with it. In some embodiments, the multimedia component 304 includes a front camera and/or a rear camera. When the electronic device 300 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or may have focal length and optical zoom capability.
The audio component 305 is configured to output and/or input audio signals. For example, the audio component 305 includes a microphone (MIC) that is configured to receive external audio signals when the electronic device 300 is in an operating mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signal may be further stored in the memory 302 or transmitted via the communication component 308. In some embodiments, the audio component 305 further includes a speaker for outputting audio signals.
The I/O interface 306 provides an interface between the processing component 301 and a peripheral interface module. The peripheral interface module may be a keyboard, a click wheel, buttons, and the like. The buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
The sensor component 307 includes one or more sensors that provide state assessments of various aspects of the electronic device 300. For example, the sensor component 307 can detect the on/off state of the electronic device 300 and the relative positioning of components, such as the display and keypad of the electronic device 300. The sensor component 307 can also detect a change in position of the electronic device 300 or of one of its components, the presence or absence of user contact with the electronic device 300, the orientation or acceleration/deceleration of the electronic device 300, and a change in its temperature. The sensor component 307 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 307 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 307 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, a heart rate signal sensor, an electrocardiogram sensor, a fingerprint sensor, or a temperature sensor.
The communication component 308 is configured to facilitate wired or wireless communication between the electronic device 300 and other devices. The electronic device 300 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 308 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 308 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module can be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 300 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the method described above.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as the memory 302 including instructions, which can be executed by the processor 309 of the electronic device 300 to complete the method described above. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
When the instructions in the storage medium are executed by the processor 309, the electronic device 300 is enabled to perform the aforementioned fitness teaching method.
Those skilled in the art will readily conceive of other embodiments of the present disclosure after considering the specification and practicing the invention disclosed herein. The present disclosure is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and embodiments are to be regarded as exemplary only, and the true scope and spirit of the present disclosure are indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structure described above and shown in the drawings, and that various modifications and changes can be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
The above are only preferred embodiments of the present disclosure and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.
Claims (28)
- A fitness teaching method, comprising:
  acquiring an action video frame of a user, the action video frame including a video frame generated while the user performs an action;
  acquiring a coach video frame played for the user and a learning position corresponding to the user, the learning position being the display position of the user action in the coach video frame;
  synthesizing the user action in the action video frame of the user onto the corresponding learning position in the coach video frame played for the user, to generate an enhanced coach video frame; and
  playing the enhanced coach video frame for the user.
- The fitness teaching method according to claim 1, wherein the coach video frame is a video frame including a coach demonstration action, and the method further comprises:
  acquiring the coach video frame corresponding to the action video frame of the user, wherein the time at which the action video frame of the user is acquired is the same as the time at which the coach video frame is played, or is within a preset time difference of it; and
  comparing the user action in the action video frame of the user with the coach demonstration action in the coach video frame, to generate an evaluation score measuring the user action.
- The fitness teaching method according to claim 2, wherein comparing the user action in the action video frame of the user with the coach demonstration action in the coach video frame to generate the evaluation score measuring the user action comprises:
  acquiring coordinate data of skeleton feature points of the user from the action video frame of the user as user action data, and acquiring coordinate data of skeleton feature points of the coach from the coach video frame as coach demonstration action data; and
  generating the evaluation score measuring the user action according to the difference between the user action data and the coach demonstration action data.
- The fitness teaching method according to claim 3, wherein generating the evaluation score measuring the user action according to the difference between the user action data and the coach demonstration action data comprises:
  acquiring at least one user vector included angle based on the user action data, and acquiring at least one standard vector included angle based on the coach demonstration action data; wherein the user vector included angle is the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points of the user, and the standard vector included angle is the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points of the coach; and
  determining a similarity parameter according to the at least one user vector included angle and the corresponding standard vector included angle, and thereby determining the evaluation score corresponding to the user action; wherein the similarity parameter includes at least a standard deviation result or a variance result of the at least one user vector included angle and the corresponding standard vector included angle.
- The fitness teaching method according to claim 4, wherein determining the similarity parameter according to the at least one user vector included angle and the corresponding standard vector included angle comprises:
  for each user vector included angle and the corresponding standard vector included angle, calculating a difference vector; wherein, letting the number of standard vector included angles and the number of user vector included angles both be $N$ ($N$ being an integer greater than 0), the $i$-th ($1 \le i \le N$) standard vector included angle be $\alpha_i$, the $i$-th user vector included angle be $\beta_i$, and the difference vector be $\Delta\alpha_i$, then $\Delta\alpha_i = \alpha_i - \beta_i$;
  calculating an average difference vector according to the difference vectors; wherein, letting the average difference vector be $\Delta r$, then [formula omitted in source]
- The fitness teaching method according to claim 2, further comprising:
  synthesizing the evaluation score into the enhanced coach video frame.
- The fitness teaching method according to claim 6, wherein synthesizing the evaluation score into the enhanced coach video frame comprises:
  if the same coach video frame is played for multiple users at the same time, and corresponding evaluation scores are generated from the action video frames of the multiple users respectively, synthesizing the multiple evaluation scores into the enhanced coach video frame in an order based on ranking the evaluation scores by magnitude.
- The fitness teaching method according to claim 1, further comprising:
  receiving a selection request sent by the user, the selection request including at least one of the following: an identifier of a coach video selected by the user, or an identifier of a learning position selected by the user.
- The fitness teaching method according to claim 8, wherein the coach video is a video including coach demonstration actions; the coach video includes a template coach video and an enhanced coach video; wherein no user's action video frames have been synthesized into the template coach video, and the action video frames of one or more users have been synthesized into the enhanced coach video in advance.
- The fitness teaching method according to claim 1, wherein acquiring the coach video frame played for the user comprises:
  acquiring the coach video frame played for the user based on the start time at which the coach video began playing for the user and the current time.
- The fitness teaching method according to claim 9, wherein each coach video includes one or more learning positions, and the method further comprises, before generating the enhanced coach video frame:
  if the same coach video is played for multiple users at the same time and the coach video has an unselected learning position, selecting, from the users who have not selected a learning position, a user for the unselected learning position, so as to receive the action video frame of the selected user and synthesize the user action of the selected user onto the unselected learning position in the coach video.
- The fitness teaching method according to claim 1, wherein the user corresponds to a user account and the learning position corresponds to asset information, and the fitness teaching method further comprises:
  acquiring the asset information corresponding to the learning position selected by the user, and initiating an asset processing operation on the user account according to the asset information.
- The fitness teaching method according to claim 1, wherein acquiring the action video frame of the user comprises:
  detecting a human body in a video frame captured by a camera; and
  based on the detected human body, performing background segmentation on the video frame to extract the human body, and generating an action video frame that includes only the human body action.
- A fitness teaching device, comprising:
  an action video frame acquisition module, configured to acquire an action video frame of a user, the action video frame including a video frame generated while the user performs an action;
  an acquisition module, configured to acquire a coach video frame played for the user and a learning position corresponding to the user, the learning position being the display position of the user action in the coach video frame;
  an enhanced coach video frame generation module, configured to synthesize the user action in the action video frame of the user onto the corresponding learning position in the coach video frame played for the user, to generate an enhanced coach video frame; and
  an enhanced coach video frame playback unit, configured to play the enhanced coach video frame for the user.
- The fitness teaching device according to claim 14, wherein the coach video frame is a video frame including a coach demonstration action, and wherein:
  the acquisition module is further configured to acquire the coach video frame corresponding to the action video frame of the user, wherein the time at which the action video frame of the user is acquired is the same as the time at which the coach video frame is played, or is within a preset time difference of it; and
  the device further comprises an evaluation score generation module, configured to compare the user action in the action video frame of the user with the coach demonstration action in the coach video frame, to generate an evaluation score measuring the user action.
- The fitness teaching device according to claim 15, wherein the evaluation score generation module comprises:
  an action data acquisition sub-module, configured to acquire coordinate data of skeleton feature points of the user from the action video frame of the user as user action data, and to acquire coordinate data of skeleton feature points of the coach from the coach video frame as coach demonstration action data; and
  an evaluation score generation sub-module, configured to generate the evaluation score measuring the user action according to the difference between the user action data and the coach demonstration action data.
- The fitness teaching device according to claim 16, wherein the evaluation score generation sub-module comprises:
  a vector included angle acquisition unit, configured to acquire at least one user vector included angle based on the user action data, and to acquire at least one standard vector included angle based on the coach demonstration action data; wherein the user vector included angle is the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points of the user, and the standard vector included angle is the angle between two vectors formed by the coordinate data of any three adjacent skeleton feature points of the coach; and
  an evaluation score generation unit, configured to determine a similarity parameter according to the at least one user vector included angle and the corresponding standard vector included angle, and thereby determine the evaluation score corresponding to the user action; wherein the similarity parameter includes at least a standard deviation result or a variance result of the at least one user vector included angle and the corresponding standard vector included angle.
- The fitness teaching device according to claim 17, wherein the evaluation score generation unit comprises:
  a difference vector calculation subunit, configured to calculate, for each user vector included angle and the corresponding standard vector included angle, a difference vector; wherein, letting the number of standard vector included angles and the number of user vector included angles both be $N$ ($N$ being an integer greater than 0), the $i$-th ($1 \le i \le N$) standard vector included angle be $\alpha_i$, the $i$-th user vector included angle be $\beta_i$, and the difference vector be $\Delta\alpha_i$, then $\Delta\alpha_i = \alpha_i - \beta_i$;
  an average difference vector calculation subunit, configured to calculate an average difference vector according to the difference vectors; wherein, letting the average difference vector be $\Delta r$, then [formula omitted in source];
  a similarity parameter calculation subunit, configured to calculate the similarity parameter using the difference vectors and the average difference vector; wherein, letting the similarity parameter be $S$, then [formula omitted in source] or [formula omitted in source]; and
  an evaluation score determination subunit, configured to determine the evaluation score corresponding to the user action according to the similarity parameter.
- The fitness teaching device according to claim 15, further comprising:
  an evaluation score synthesis module, configured to synthesize the evaluation score into the enhanced coach video frame.
- The fitness teaching device according to claim 19, wherein the evaluation score synthesis module is configured to:
  if the same coach video frame is played for multiple users at the same time, and corresponding evaluation scores are generated from the action video frames of the multiple users respectively, synthesize the multiple evaluation scores into the enhanced coach video frame in an order based on ranking the evaluation scores by magnitude.
- The fitness teaching device according to claim 14, further comprising:
  a selection request receiving module, configured to receive a selection request sent by the user, the selection request including at least one of the following: an identifier of a coach video selected by the user, or an identifier of a learning position selected by the user.
- The fitness teaching device according to claim 21, wherein the coach video is a video including coach demonstration actions; the coach video includes a template coach video and an enhanced coach video; wherein no user's action video frames have been synthesized into the template coach video, and the action video frames of one or more users have been synthesized into the enhanced coach video in advance.
- The fitness teaching device according to claim 14, wherein the acquisition module is configured to:
  acquire the coach video frame played for the user based on the start time at which the coach video began playing for the user and the current time.
- The fitness teaching device according to claim 21, wherein each coach video includes one or more learning positions, and the device further comprises:
  a learning position allocation module, configured, before the enhanced coach video frame is generated and if the same coach video is played for multiple users at the same time and the coach video has an unselected learning position, to select, from the users who have not selected a learning position, a user for the unselected learning position, so as to receive the action video frame of the selected user and synthesize the user action of the selected user onto the unselected learning position in the coach video.
- The fitness teaching device according to claim 14, wherein the user corresponds to a user account and the learning position corresponds to asset information, and the fitness teaching device further comprises:
  an asset operation initiation module, configured to acquire the asset information corresponding to the learning position selected by the user, and to initiate an asset processing operation on the user account according to the asset information.
- The fitness teaching device according to claim 14, wherein the action video frame acquisition module comprises:
  a human body detection sub-module, configured to detect a human body in a video frame captured by a camera; and
  an action video frame generation sub-module, configured to perform, based on the detected human body, background segmentation on the video frame to extract the human body, and to generate an action video frame that includes only the human body action.
- An electronic device, comprising:
  a processor; and
  a memory for storing instructions executable by the processor;
  wherein the processor is configured to perform the fitness teaching method according to any one of claims 1 to 13.
- A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by one or more processors, causes the processors to perform the fitness teaching method according to any one of claims 1 to 13.
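
The angle-based scoring recited in claims 4, 5, 17, and 18 can be illustrated with a short Python sketch. It is not the applicant's implementation. The formula images for the average difference and the similarity parameter are not reproduced in this text, so the sketch assumes the arithmetic mean for Δr and the population (1/N) standard deviation or variance of the differences Δα_i about Δr for S. The function names `joint_angle`, `similarity`, and `evaluation_score`, the use of 2D coordinates, and the linear mapping from S to a 0-100 score (including the `max_deviation` cutoff) are all hypothetical.

```python
import math


def joint_angle(a, b, c):
    """Included angle in degrees at point b, between vectors b->a and b->c.

    a, b, c are (x, y) coordinates of three adjacent skeleton feature points.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("adjacent feature points must not coincide")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


def similarity(standard_angles, user_angles, use_variance=False):
    """Similarity parameter S from N pairs of angles (alpha_i, beta_i).

    Assumption: S is the population standard deviation (or variance) of the
    differences delta_i = alpha_i - beta_i about their arithmetic mean.
    """
    if not standard_angles or len(standard_angles) != len(user_angles):
        raise ValueError("need N > 0 angles on both sides")
    diffs = [alpha - beta for alpha, beta in zip(standard_angles, user_angles)]
    mean_diff = sum(diffs) / len(diffs)
    variance = sum((d - mean_diff) ** 2 for d in diffs) / len(diffs)
    return variance if use_variance else math.sqrt(variance)


def evaluation_score(s, max_deviation=45.0):
    """Hypothetical mapping: S = 0 scores 100, S >= max_deviation scores 0."""
    return round(100 * max(0.0, 1.0 - s / max_deviation))


if __name__ == "__main__":
    # Elbow angle from shoulder, elbow, wrist coordinates (made-up values).
    coach_elbow = joint_angle((0, 0), (1, 0), (1, 1))
    user_elbow = joint_angle((0, 0), (1, 0), (2, 1))
    s = similarity([coach_elbow, 180.0], [user_elbow, 175.0])
    print(coach_elbow, user_elbow, s, evaluation_score(s))
```

Under this assumed definition, a user whose joint angles all differ from the coach's by the same amount yields S = 0, because only the spread of the differences about their mean is measured.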
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910599390.1 | 2019-07-04 | ||
CN201910599390.1A CN110418205A (en) | 2019-07-04 | 2019-07-04 | Body-building teaching method, device, equipment, system and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021000708A1 true WO2021000708A1 (en) | 2021-01-07 |
Family
ID=68360288
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/095369 WO2021000708A1 (en) | 2019-07-04 | 2020-06-10 | Fitness teaching method and apparatus, electronic device and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110418205A (en) |
WO (1) | WO2021000708A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113239849A (en) * | 2021-05-27 | 2021-08-10 | 数智引力(厦门)运动科技有限公司 | Fitness action quality evaluation method and system, terminal equipment and storage medium |
CN114220300A (en) * | 2021-02-01 | 2022-03-22 | 黄华 | Visual intelligent interactive teaching and examination system and method by utilizing augmented reality wearing equipment |
CN114666639A (en) * | 2022-03-18 | 2022-06-24 | 海信集团控股股份有限公司 | Video playing method and display device |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110418205A (en) * | 2019-07-04 | 2019-11-05 | 安徽华米信息科技有限公司 | Body-building teaching method, device, equipment, system and storage medium |
CN113128283A (en) * | 2019-12-31 | 2021-07-16 | 沸腾时刻智能科技(深圳)有限公司 | Evaluation method, model construction method, teaching machine, teaching system and electronic equipment |
CN111522522B (en) * | 2020-04-22 | 2023-12-08 | 咪咕互动娱乐有限公司 | Demonstration video display method, system, device and storage medium |
CN111445738B (en) * | 2020-04-30 | 2021-06-08 | 北京打铁师体育文化产业有限公司 | Online motion action tutoring method and system |
CN111652078A (en) * | 2020-05-11 | 2020-09-11 | 浙江大学 | Yoga action guidance system and method based on computer vision |
CN112348942B (en) * | 2020-09-18 | 2024-03-19 | 当趣网络科技(杭州)有限公司 | Body-building interaction method and system |
CN113992957A (en) * | 2020-09-30 | 2022-01-28 | 深度练习(杭州)智能科技有限公司 | Motion synchronization system and method in video file suitable for intelligent terminal |
CN114973066A (en) * | 2022-04-29 | 2022-08-30 | 浙江运动家体育发展有限公司 | Online and offline fitness interaction method and system |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104598867A (en) * | 2013-10-30 | 2015-05-06 | 中国艺术科技研究所 | Automatic evaluation method of human body action and dance scoring system |
CN104882036A (en) * | 2015-05-27 | 2015-09-02 | 江西理工大学 | Digital fitness teaching system |
US20160283783A1 (en) * | 2015-03-27 | 2016-09-29 | Intel Corporation | Gesture Recognition Mechanism |
CN207913143U (en) * | 2017-12-18 | 2018-09-28 | 郑州特瑞通节能技术有限公司 | A kind of athletic performance correction smart home body-building system |
CN108764120A (en) * | 2018-05-24 | 2018-11-06 | 杭州师范大学 | A kind of human body specification action evaluation method |
CN110298309A (en) * | 2019-06-28 | 2019-10-01 | 腾讯科技(深圳)有限公司 | Motion characteristic processing method, device, terminal and storage medium based on image |
CN110418205A (en) * | 2019-07-04 | 2019-11-05 | 安徽华米信息科技有限公司 | Body-building teaching method, device, equipment, system and storage medium |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB201404990D0 (en) * | 2014-03-20 | 2014-05-07 | Appeartome Ltd | Augmented reality apparatus and method |
CN106411889A (en) * | 2016-09-29 | 2017-02-15 | 宇龙计算机通信科技(深圳)有限公司 | Grouped movement method and system, and terminal |
CN107833283A (en) * | 2017-10-30 | 2018-03-23 | 努比亚技术有限公司 | Teaching method and mobile terminal |
CN108734104B (en) * | 2018-04-20 | 2021-04-13 | 杭州易舞科技有限公司 | Body-building action error correction method and system based on deep learning image recognition |
CN108777081B (en) * | 2018-05-31 | 2021-02-02 | 华中师范大学 | Virtual dance teaching method and system |
CN109191588B (en) * | 2018-08-27 | 2020-04-07 | 百度在线网络技术(北京)有限公司 | Motion teaching method, motion teaching device, storage medium and electronic equipment |
CN109432753B (en) * | 2018-09-26 | 2020-12-29 | Oppo广东移动通信有限公司 | Action correcting method, device, storage medium and electronic equipment |
- 2019-07-04: CN application CN201910599390.1A, published as CN110418205A; status: active, Pending
- 2020-06-10: WO application PCT/CN2020/095369, published as WO2021000708A1; status: active, Application Filing
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114220300A (en) * | 2021-02-01 | 2022-03-22 | 黄华 | Visual intelligent interactive teaching and examination system and method by utilizing augmented reality wearing equipment |
CN113239849A (en) * | 2021-05-27 | 2021-08-10 | 数智引力(厦门)运动科技有限公司 | Fitness action quality evaluation method and system, terminal equipment and storage medium |
CN113239849B (en) * | 2021-05-27 | 2023-12-19 | 数智引力(厦门)运动科技有限公司 | Body-building action quality assessment method, body-building action quality assessment system, terminal equipment and storage medium |
CN114666639A (en) * | 2022-03-18 | 2022-06-24 | 海信集团控股股份有限公司 | Video playing method and display device |
CN114666639B (en) * | 2022-03-18 | 2023-11-03 | 海信集团控股股份有限公司 | Video playing method and display device |
Also Published As
Publication number | Publication date |
---|---|
CN110418205A (en) | 2019-11-05 |
Similar Documents
Publication | Title |
---|---|
WO2021000708A1 (en) | Fitness teaching method and apparatus, electronic device and storage medium |
US11503377B2 (en) | Method and electronic device for processing data |
WO2021008158A1 (en) | Method and apparatus for detecting key points of human body, electronic device and storage medium |
RU2715797C1 (en) | Method and apparatus for synthesis of virtual reality objects |
US20190130650A1 (en) | Smart head-mounted device, interactive exercise method and system |
WO2021184952A1 (en) | Augmented reality processing method and apparatus, storage medium, and electronic device |
CN109729372B (en) | Live broadcast room switching method, device, terminal, server and storage medium |
KR101670815B1 (en) | Method for providing real-time contents sharing service based on virtual reality and augment reality |
CN112560605B (en) | Interaction method, device, terminal, server and storage medium |
TWI255141B (en) | Method and system for real-time interactive video |
EP3937154A1 (en) | Method for video interaction and electronic device |
WO2022227393A1 (en) | Image photographing method and apparatus, electronic device, and computer readable storage medium |
US9324158B2 (en) | Image processing device for performing image processing on moving image |
US20130265448A1 (en) | Analyzing Human Gestural Commands |
WO2022068479A1 (en) | Image processing method and apparatus, and electronic device and computer-readable storage medium |
KR20130114893A (en) | Apparatus and method for taking a picture continuously |
KR102161034B1 (en) | System for providing exercise lecture and method for providing exercise lecture using the same |
CN110210045B (en) | Method and device for estimating number of people in target area and storage medium |
JP6165815B2 (en) | Learning system, learning method, program, recording medium |
WO2021036954A1 (en) | Intelligent speech playing method and device |
WO2022161037A1 (en) | User determination method, electronic device, and computer-readable storage medium |
US20180247419A1 (en) | Object tracking method |
CN116520982B (en) | Virtual character switching method and system based on multi-mode data |
CN112581571A (en) | Control method and device of virtual image model, electronic equipment and storage medium |
CN113743237A (en) | Follow-up action accuracy determination method and device, electronic device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20834575; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 20834575; Country of ref document: EP; Kind code of ref document: A1 |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 20834575; Country of ref document: EP; Kind code of ref document: A1 |