CN113678137B - Display apparatus


Info

Publication number
CN113678137B
CN113678137B
Authority
CN
China
Prior art keywords
video
user
local
interface
frame
Prior art date
Legal status
Active
Application number
CN202080024736.6A
Other languages
Chinese (zh)
Other versions
CN113678137A (en)
Inventor
王光强
徐孝春
刘哲哲
吴相升
李园园
陈胤旸
谢尧
张凡文
Current Assignee
Juhaokan Technology Co Ltd
Original Assignee
Juhaokan Technology Co Ltd
Priority date
Filing date
Publication date
Priority claimed from CN202010386547.5A external-priority patent/CN112399234B/en
Priority claimed from CN202010412358.0A external-priority patent/CN113596590B/en
Application filed by Juhaokan Technology Co Ltd filed Critical Juhaokan Technology Co Ltd
Priority claimed from PCT/CN2020/109859 external-priority patent/WO2021032092A1/en
Publication of CN113678137A publication Critical patent/CN113678137A/en
Application granted granted Critical
Publication of CN113678137B publication Critical patent/CN113678137B/en


Abstract

The application discloses a display device. In response to a preset instruction, the display device collects local images to generate a local video stream, plays the local video picture, and displays, in a floating layer above the local video picture, a graphic element identifying a preset expected position. When no moving target exists in the local video picture, or a moving target exists but the deviation of its target position from the expected position is greater than a preset threshold, a prompt control for guiding the moving target to the expected position is presented in a floating layer above the local video picture according to the deviation of the target position from the expected position, so that the user moves to the expected position as prompted.

Description

Display apparatus
The present application claims priority to the following Chinese patent applications filed with the Chinese Patent Office, the entire contents of which are incorporated herein by reference: No. 201910761455.8, filed on August 18, 2019; No. 202010386547.5; No. 202010364203.4; No. 202010412358.0; No. 202010429705.0; No. 202010459886.1, entitled "display device and interface display method", filed on May 27, 2020; No. 202010440465.4, entitled "display device and information display method", filed on May 22, 2020; No. 202010444296.1, entitled "display device and experience value update method", filed on May 22, 2020; No. 202010444212.4, entitled "display device and play speed method", filed on May 22, 2020; No. 202010479491.8, entitled "display device and information display method", filed on May 29, 2020; and No. 202010673469.7, entitled "display device and information display method", filed on July 13, 2020.
Technical Field
The present application relates to the technical field of display devices, and in particular to a display device and an information display method.
Background
The continuous development of communication technology has made terminal devices such as computers, smart phones and display devices increasingly popular, and users' demands on the functions and services provided by such devices keep growing. Display devices such as smart televisions, which can present audio, video, pictures and other content to users, are therefore receiving wide attention.
With the popularity of smart display devices, users increasingly wish to carry out recreational activities through the large screen of the display device. Given the growing time and money that households invest in cultivating and training action-related interests such as dance, gymnastics and fitness, it can be seen that such interest cultivation and training are important to users.
Therefore, how to provide interest-cultivation and training functions related to action activities through a display device, so as to meet users' needs, is a technical problem to be solved.
Disclosure of Invention
In a first aspect, some embodiments of the present application provide a display device, including:
A display for displaying a user interface in which at least one video window can be displayed, at least one floating layer being displayable on top of the video window;
an image collector for collecting local images to generate a local video stream;
a controller for:
in response to an input preset instruction, controlling the image collector to collect local images so as to generate the local video stream;
playing the local video picture in the video window, and displaying, in a floating layer above the local video picture, a graphic element identifying a preset expected position;
when no moving target exists in the local video picture, or a moving target exists but the deviation of the target position of the moving target from the expected position is greater than a preset threshold, presenting, in a floating layer above the local video picture and according to the deviation of the target position from the expected position, a prompt control for guiding the moving target to move to the expected position;
and when a moving target exists in the local video picture and the deviation of the target position of the moving target from the expected position is not greater than the preset threshold, canceling the display of the graphic element and the prompt control.
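For illustration only, the positioning logic of this first aspect can be sketched as follows. The coordinate normalization, the threshold value and all names (Rect, update_prompt, the direction wording) are assumptions of this sketch, not details from the patent:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rect:
    x: float  # target center, normalized to [0, 1]
    y: float

DEVIATION_THRESHOLD = 0.1  # stands in for the "preset threshold"

def update_prompt(target: Optional[Rect], expected: Rect) -> Optional[str]:
    """Return the prompt text for the floating layer, or None to cancel
    both the graphic element and the prompt control."""
    if target is None:
        # No moving target in the local video picture.
        return "Please step into the camera view"
    dx, dy = target.x - expected.x, target.y - expected.y
    deviation = (dx * dx + dy * dy) ** 0.5
    if deviation <= DEVIATION_THRESHOLD:
        return None  # user has reached the expected position
    # Derive a direction hint from the sign of the deviation (assumed mapping).
    hint = "left" if dx > 0 else "right"
    return f"Please move {hint} toward the marked position"
```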
In a second aspect, some embodiments of the present application further provide a display device, including:
a display for displaying a user interface, the user interface including a window for playing video;
a controller for:
in response to an input instruction for playing a demonstration video, acquiring the demonstration video, wherein the demonstration video comprises a plurality of key segments, and each key segment, when played, shows a key action to be practiced by the user;
playing the demonstration video in the window at a first speed;
when playing of a key segment starts, adjusting the speed of playing the demonstration video from the first speed to a second speed;
when playing of the key segment ends, adjusting the speed of playing the demonstration video from the second speed back to the first speed;
wherein the second speed is different from the first speed.
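A minimal sketch of this speed switching, assuming the player exposes a position/speed API and that key segments are given as time intervals (both assumptions; the patent does not specify how key segments are marked):

```python
KEY_SEGMENTS = [(12.0, 18.5), (40.0, 47.0)]  # (start, end) seconds, hypothetical

FIRST_SPEED = 1.0
SECOND_SPEED = 0.5  # e.g. slower during key actions

def speed_at(position: float) -> float:
    """Playback speed for the current position of the demonstration video."""
    in_key_segment = any(start <= position < end for start, end in KEY_SEGMENTS)
    return SECOND_SPEED if in_key_segment else FIRST_SPEED

# A player tick would then apply:
#   player.set_speed(speed_at(player.position()))
```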
In a third aspect, some embodiments of the present application provide a display device, including:
an image collector for collecting a local video stream;
a display for displaying a user interface, wherein the user interface comprises a first playing window for playing a demonstration video and a second playing window for playing the local video stream;
a controller for:
in response to an input instruction for playing a demonstration video, acquiring the demonstration video, wherein the demonstration video comprises a key segment and other segments different from the key segment, and the key segment, when played, shows a key action to be practiced by the user;
playing the demonstration video in the first playing window and playing the local video stream in the second playing window;
wherein the other segments are played in the first playing window at a first speed, the key segment is played at a second speed lower than the first speed, and the local video stream is played in the second playing window at a fixed preset speed.
In a fourth aspect, some embodiments of the present application provide a display device, including:
a display for displaying a user interface including a window for playing a demonstration video;
a controller for:
in response to an input instruction for playing a demonstration video, acquiring the demonstration video, wherein the demonstration video comprises a plurality of key segments, and each key segment, when played, shows a key action to be practiced by the user;
playing the demonstration video in the window at a first speed, and acquiring the age of the user;
when the age of the user is lower than a preset age, playing the other segments of the demonstration video at the first speed and playing the key segments at a second speed lower than the first speed;
and when the age of the user is not lower than the preset age, playing all segments of the demonstration video at the first speed.
In a fifth aspect, some embodiments of the present application provide a display device, including:
a display for playing video;
a controller for:
in response to an input instruction indicating to play a demonstration video, acquiring the demonstration video, wherein the demonstration video, when played, shows demonstration actions to be practiced by the user;
when the age of the user is within a first age range, playing the demonstration video at a first speed;
when the age of the user is within a second age range, playing the demonstration video at a second speed;
wherein the second speed is different from the first speed.
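The age-dependent selection of the fourth aspect reduces to a small decision function, sketched below; the fifth aspect's per-age-range speed is the degenerate case where the chosen speed applies to the whole video. The concrete age boundary and speed values are placeholders:

```python
PRESET_AGE = 10      # placeholder boundary between the two age ranges
FIRST_SPEED = 1.0
SECOND_SPEED = 0.5   # slower speed, e.g. for younger users

def playback_speed(user_age: int, in_key_segment: bool) -> float:
    """Fourth aspect: only users below the preset age get slowed-down key
    segments; everyone else sees every segment at the first speed."""
    if user_age < PRESET_AGE and in_key_segment:
        return SECOND_SPEED
    return FIRST_SPEED
```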
In a sixth aspect, some embodiments of the present application provide a display device, including:
an image collector for collecting local images to obtain a local video stream;
a display for displaying a demonstration video, the local video stream and/or a follow-up result interface;
a controller for:
in response to an input instruction for following a demonstration video, acquiring the demonstration video and acquiring the local video stream, wherein the demonstration video, when played, shows demonstration actions for the user to follow;
performing action matching between the demonstration video and the local video stream, and generating a score for the follow-up process according to the degree of matching between the local video and the demonstration video;
and after the demonstration video finishes playing, generating the follow-up result interface according to the score, wherein the follow-up result interface contains an experience value control for displaying an experience value; when the score is higher than the user's highest historical score for following this demonstration video, the experience value updated according to the score is displayed in the experience value control, and when the score is not higher than the highest historical score, the experience value from before the follow-up process is displayed in the experience value control.
In a seventh aspect, some embodiments of the present application provide a display device, including:
an image collector for collecting local images to obtain a local video stream;
a display;
a controller for:
in response to an input instruction to play a demonstration video, acquiring the demonstration video and acquiring the local video stream, wherein the demonstration video comprises first video frames showing demonstration actions for the user to follow, and the local video stream comprises second video frames showing the user's actions;
matching corresponding first and second video frames, and generating a score for the follow-up process according to the matching result;
and in response to the end of playing of the demonstration video, generating a follow-up result interface according to the score, wherein the follow-up result interface contains an experience value control for displaying an experience value; when the score is higher than the highest historical score for following this demonstration video, the experience value updated according to the score is displayed in the experience value control, and when the score is not higher than the highest historical score, the experience value from before the follow-up process is displayed in the experience value control.
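The experience-value rule shared by the sixth and seventh aspects can be sketched as below. The patent only states that the experience value is "updated according to the score" when the new score beats the historical best; crediting the improvement over the previous best is one plausible reading, chosen here for illustration:

```python
def settle_follow_up(score: int, best_score: int, experience: int) -> tuple:
    """Return (experience_to_display, new_best_score) for the result interface."""
    if score > best_score:
        experience += score - best_score  # assumed update rule, see lead-in
        best_score = score
    # Otherwise the experience value from before this follow-up is shown unchanged.
    return experience, best_score
```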
In an eighth aspect, some embodiments of the present application provide a display device, including:
A display for displaying a user interface, the user interface including a window for playing video;
an image collector for collecting local images;
a controller for:
in response to an instruction indicating to pause a demonstration video being played in the window, pausing the demonstration video and displaying a target key frame, wherein the target key frame is a video frame showing a key action in the demonstration video;
after the demonstration video is paused, acquiring a local image through the image collector;
determining whether the user action in the local image matches the key action shown in the target key frame;
when the user action in the local image matches the key action shown in the target key frame, resuming playing of the demonstration video;
and when the user action in the local image does not match the key action shown in the target key frame, keeping the demonstration video paused.
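A sketch of this pause gate, with match_actions() as a stub for the pose-matching step (the patent describes matching but not a concrete algorithm; the threshold and all names here are assumptions):

```python
MATCH_THRESHOLD = 0.8  # assumed similarity threshold

def match_actions(local_image, key_frame) -> float:
    """Stub for the pose-matching step; a real implementation would extract
    joint points from both images and compare them (cf. FIGS. 31-33)."""
    raise NotImplementedError

def on_pause_tick(player, camera, target_key_frame) -> None:
    """Called periodically while the demonstration video is paused."""
    local_image = camera.capture()
    if match_actions(local_image, target_key_frame) >= MATCH_THRESHOLD:
        player.resume()  # user reproduced the key action
    # Otherwise stay paused and check again on the next tick.
```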
In a ninth aspect, some embodiments of the present application provide a display device, including:
a display for displaying a history page;
a controller for:
in response to a user-input instruction indicating to display a follow-up record page, sending a data acquisition request containing a user identifier to a server, wherein the data acquisition request is used to make the server return at least one piece of historical follow-up record data according to the user identifier, and the historical follow-up record data comprises either data of a specified picture or specified identification data indicating that no such picture exists;
receiving the at least one piece of historical follow-up record data;
generating a follow-up record page according to the received historical follow-up record data, wherein, when a piece of historical follow-up record data contains data of a specified picture, a follow-up record containing a first picture control is generated in the follow-up record page, the first picture control being used for displaying the specified picture; and when a piece of historical follow-up record data contains the specified identification data, a follow-up record containing a first identification control is generated in the follow-up record page, the first identification control being used for displaying a preset identification element identifying that the specified picture does not exist.
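The per-record rendering decision of this ninth aspect amounts to a branch on the returned data; the JSON field names and placeholder asset below are assumptions for illustration:

```python
PLACEHOLDER_ELEMENT = "no_screenshot.png"  # preset identification element (assumed)

def build_record_controls(records: list) -> list:
    """Map each piece of historical follow-up record data to a UI control."""
    controls = []
    for record in records:
        if record.get("picture_url"):  # data of a specified picture
            controls.append({"type": "picture", "src": record["picture_url"]})
        else:                          # specified identification data: no picture
            controls.append({"type": "identification", "src": PLACEHOLDER_ELEMENT})
    return controls
```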
In a tenth aspect, some embodiments of the present application provide a display device, including:
an image collector for collecting local images to obtain a local video stream;
a display for displaying a user interface including a first video playing window for playing a demonstration video and a second video playing window for playing the local video stream;
a controller for:
in response to an input instruction for playing a demonstration video, acquiring the demonstration video, wherein the demonstration video comprises a preset number of key frames, and each key frame shows a key action to be followed;
playing the demonstration video, and acquiring, from the local video stream, the local video frame corresponding to each key frame according to the playing time of that key frame;
performing action matching between each local video frame and its corresponding key frame, and obtaining a matching score for the local video frame according to the degree of action matching;
and in response to the end of playing of the demonstration video, displaying a follow-up result interface, wherein a total score is calculated according to the matching scores of the local video frames; when the total score is higher than a preset value, the local video frames displayed in the follow-up result interface are those with higher matching scores, and when the total score is not higher than the preset value, the local video frames displayed are those with lower matching scores.
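A compact sketch of this selection rule; the preset value, the number of displayed frames and the frame representation are placeholders:

```python
PRESET_TOTAL = 60   # assumed preset value
SHOWN_FRAMES = 3    # assumed number of frames on the result interface

def pick_result_frames(scored_frames: list) -> list:
    """scored_frames: list of (matching_score, local_video_frame) pairs.
    Show the best moments after a good session, the weakest ones otherwise."""
    total = sum(score for score, _ in scored_frames)
    ranked = sorted(scored_frames, key=lambda f: f[0], reverse=total > PRESET_TOTAL)
    return [frame for _, frame in ranked[:SHOWN_FRAMES]]
```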
In an eleventh aspect, some embodiments of the present application provide a display device, including:
a display for displaying a page of an application;
a controller for:
acquiring a first experience value and a second experience value, wherein the first experience value is the experience value obtained by the logged-in user of the application in the current statistical period, and the second experience value is the sum of the experience values obtained by the logged-in user in each statistical period before the current one;
displaying an application home page according to the first experience value and the second experience value, wherein the application home page comprises a control for displaying the first experience value and the second experience value.
In a twelfth aspect, some embodiments of the present application provide a display device, including:
a display for displaying a page of an application;
a controller for:
acquiring the experience value obtained by the logged-in user of the application in the current statistical period and the total amount of experience values obtained by the logged-in user;
displaying an application home page according to the experience value obtained in the current statistical period and the total experience value, wherein the application home page comprises a control for displaying both values.
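The bookkeeping behind the eleventh and twelfth aspects is a per-period tally plus an aggregate; the period key format and in-memory storage below are assumptions of this sketch:

```python
from collections import defaultdict

class ExperienceLedger:
    def __init__(self) -> None:
        self.by_period = defaultdict(int)  # e.g. "2020-W21" -> points (assumed key)

    def add(self, period: str, points: int) -> None:
        self.by_period[period] += points

    def home_page_values(self, current_period: str) -> tuple:
        """(first value, second value) of the eleventh aspect: experience gained
        in the current statistical period, and the sum over all earlier periods."""
        first = self.by_period[current_period]
        second = sum(v for p, v in self.by_period.items() if p != current_period)
        return first, second
```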
Drawings
In order to illustrate the technical solutions of the present application more clearly, the drawings needed in the embodiments are briefly described below; it is obvious that other drawings can be obtained from these drawings by those skilled in the art without inventive effort.
A schematic diagram of an operation scenario between a display device and a control apparatus according to an embodiment is exemplarily shown in fig. 1;
A hardware configuration block diagram of the display device 200 in accordance with the embodiment is exemplarily shown in fig. 2;
a hardware configuration block diagram of the control device 100 in accordance with the embodiment is exemplarily shown in fig. 3;
a functional configuration diagram of the display device 200 according to the embodiment is exemplarily shown in fig. 4;
a schematic diagram of the software configuration in the display device 200 according to an embodiment is exemplarily shown in fig. 5;
a schematic configuration of an application program in the display device 200 according to an embodiment is exemplarily shown in fig. 6;
a schematic diagram of a user interface in a display device 200 according to an embodiment is exemplarily shown in fig. 7;
a user interface is exemplarily shown in fig. 8;
one target application home page is illustrated in FIG. 9;
one type of user interface is exemplarily shown in fig. 10 a;
another user interface is exemplarily shown in fig. 10 b;
a user interface is exemplarily shown in fig. 11;
a user interface is exemplarily shown in fig. 12;
one type of user interface is illustrated schematically in FIG. 13;
a user interface is exemplarily shown in fig. 14;
a user interface is exemplarily shown in fig. 15;
one type of pause interface is shown schematically in fig. 16;
one user interface for presenting the save information is exemplarily shown in fig. 17;
A user interface for presenting a follow-up prompt is exemplarily shown in fig. 18;
one user interface for presenting scoring information is exemplarily shown in fig. 19A;
a user interface presenting follow-up result information is exemplarily shown in FIG. 19B;
a user interface for presenting follow-up result information is exemplarily shown in FIG. 19C;
one user interface for presenting experience value detail data is illustrated schematically in FIG. 19D;
a user interface for presenting experience value detail data is illustrated schematically in FIG. 19E;
a user interface for presenting detailed performance information is exemplarily shown in fig. 20;
a user interface for viewing an original follow-up screenshot file is illustrated schematically in fig. 21;
another user interface for presenting detailed performance information is exemplarily shown in fig. 22;
a detailed performance information page displayed on the mobile terminal device is exemplarily shown in fig. 23;
a user interface for displaying an automatic play prompt is exemplarily shown in fig. 24;
a user interface for displaying a user exercise record is exemplarily shown in fig. 25;
FIG. 26 is a schematic diagram of a first interface shown according to some embodiments;
FIG. 27 is a schematic diagram of a first interface shown according to some embodiments;
FIG. 28 is a schematic diagram of a prompt interface shown, according to some embodiments;
FIG. 29 is a schematic diagram of a prompt interface shown, according to some embodiments;
FIG. 30 is a schematic diagram of a second display interface shown according to some embodiments;
FIG. 31 is a local image annotated with 13 joint points, shown according to some embodiments;
FIG. 32 is a local image shown annotated with joints according to some embodiments;
FIG. 33 is a local image shown color annotated according to some embodiments;
FIG. 34 is a exercise evaluation interface shown according to some embodiments;
fig. 35 is a schematic diagram of a second display interface shown in accordance with some embodiments.
Detailed Description
In order to better understand the technical solutions in the present application, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present application; all other embodiments obtained by those of ordinary skill in the art based on the embodiments herein without inventive effort shall fall within the scope of the present application. Furthermore, while the disclosure is presented in terms of one or more exemplary embodiments, it should be understood that individual aspects of the disclosure can also be practiced separately as a complete technical solution.
It should be understood that the terms "first," "second," "third," and the like in the description, the claims and the above drawings are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that such terms may be interchanged where appropriate, so that the embodiments of the present application can be implemented in orders other than those illustrated or described herein.
Furthermore, the terms "comprise" and "have," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to those elements expressly listed, but may include other elements not expressly listed or inherent to such product or apparatus.
The term "module" as used in this application refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code that is capable of performing the function associated with that element.
The term "remote control" as used in this application refers to a component of an electronic device (such as a display device as disclosed in this application) that can typically be controlled wirelessly over a relatively short distance. Typically, the electronic device is connected to the electronic device using infrared and/or Radio Frequency (RF) signals and/or bluetooth, and may also include functional modules such as WiFi, wireless USB, bluetooth, motion sensors, etc. For example: the hand-held touch remote controller replaces most of the physical built-in hard keys in a general remote control device with a touch screen user interface.
The term "gesture" as used herein refers to a user behavior by which a user expresses an intended idea, action, purpose, and/or result through a change in hand shape or movement of a hand, etc.
A schematic diagram of an operation scenario between a display device and a control apparatus according to an embodiment is exemplarily shown in fig. 1. As shown in fig. 1, a user may operate the display apparatus 200 through the mobile terminal 300 and the control device 100.
The control device 100 may control the display apparatus 200 wirelessly or by other wired means, for example using a remote controller supporting infrared protocol communication, Bluetooth protocol communication or other short-distance communication. The user may control the display device 200 by inputting user instructions through keys on the remote control, voice input, control panel input, etc. For example, the user can input corresponding control instructions through the volume up/down keys, channel control keys, up/down/left/right movement keys, voice input keys, menu keys, power key, etc. on the remote controller to control the functions of the display device 200.
In some embodiments, mobile terminals, tablet computers, notebook computers, and other smart devices may also be used to control the display device 200. For example, the display device 200 is controlled using an application running on a smart device. The application program, by configuration, can provide various controls to the user in an intuitive User Interface (UI) on a screen associated with the smart device.
In some embodiments, the mobile terminal 300 may install a software application with the display device 200, implement connection communication through a network communication protocol, and achieve the purpose of one-to-one control operation and data communication. Such as: it is possible to implement a control command protocol established between the mobile terminal 300 and the display device 200, synchronize a remote control keyboard to the mobile terminal 300, and implement a function of controlling the display device 200 by controlling a user interface on the mobile terminal 300. The audio/video content displayed on the mobile terminal 300 can also be transmitted to the display device 200, so as to realize the synchronous display function.
As also shown in fig. 1, the display device 200 is also in data communication with the server 400 via a variety of communication means. The display device 200 may be permitted to make communication connections via a local area network (LAN), a wireless local area network (WLAN) and other networks. The server 400 may provide various contents and interactions to the display device 200. In some embodiments, the display device 200 receives software program updates, or accesses a remotely stored digital media library, by sending and receiving information and interacting with an Electronic Program Guide (EPG). The server 400 may be one group or multiple groups of servers, and may be one or more types of servers. Other web service content, such as video on demand and advertising services, is provided through the server 400.
The display device 200 may be a liquid crystal display, an OLED display, a projection display device. The particular display device type, size, resolution, etc. are not limited, and those skilled in the art will appreciate that the display device 200 may be modified in performance and configuration as desired.
In addition to the broadcast receiving television function, the display device 200 may additionally provide an intelligent network television function with computer support, including, in some embodiments, web TV, smart TV, Internet Protocol TV (IPTV), etc.
A hardware configuration block diagram of the display device 200 according to an exemplary embodiment is illustrated in fig. 2. As shown in fig. 2, the display device 200 includes therein at least one of a controller 210, a modem 220, a communication interface 230, a detector 240, an input/output interface 250, a video processor 260-1, an audio processor 260-2, a display 280, an audio output 270, a memory 290, a power supply, and an infrared receiver.
A display 280 for receiving image signals from the video processor 260-1, and for displaying video content and images as well as components of a menu manipulation interface. The display 280 includes a display screen assembly for presenting pictures and a drive assembly for driving the display of images. The displayed video content may come from broadcast television content, or from various broadcast signals receivable via wired or wireless communication protocols; alternatively, various image contents sent from a network server via a network communication protocol may be displayed.
Meanwhile, the display 280 also displays a user manipulation UI interface that is generated in the display device 200 and used to control the display device 200.
Depending on the type of the display 280, a drive assembly for driving the display is also included; if the display 280 is a projection display, a projection device and a projection screen may further be included.
The communication interface 230 is a component for communicating with an external device or an external server according to various communication protocol types. For example: the communication interface 230 may be a Wifi module 231, a bluetooth module 232, a wired ethernet module 233, or other network communication protocol chip or near field communication protocol chip, and an infrared receiver (not shown in the figure).
The display device 200 may establish the transmission and reception of control signals and data signals with an external control device or a content providing device through the communication interface 230. An infrared receiver is an interface device for receiving infrared control signals from the control device 100 (such as an infrared remote controller).
The detector 240 is a component used by the display device 200 to collect signals from the external environment or from interaction with the outside. The detector 240 includes a light receiver 242, a sensor for collecting ambient light intensity, by means of which display parameter changes can be adapted to the ambient light, etc.
An image collector 241, such as a camera or video camera, can be used for collecting external environment scenes and collecting attributes of the user or gestures used for interaction with the user; it can adaptively change display parameters and can also recognize the user's gestures, so as to realize the interaction function with the user.
In other exemplary embodiments, the detector 240 may also include a temperature sensor or the like; for example, by sensing the ambient temperature, the display device 200 may adaptively adjust the display color temperature of the image, e.g. toward a cooler color temperature when the ambient temperature is high, and toward a warmer color temperature when it is low.
In other exemplary embodiments, the detector 240 may also include a sound collector or the like, such as a microphone, which may be used to receive the user's sound, including voice signals of the control instructions by which the user controls the display device 200, or to collect ambient sound for identifying the type of the surrounding scene, so that the display device 200 can adapt to ambient noise.
An input/output interface 250, for data transmission between the controller 210 of the display device 200 and other external devices, such as receiving video signals, audio signals or command instructions from an external device.
The input/output interface 250 may include, but is not limited to, any one or more of the following: a high definition multimedia interface (HDMI) 251, an analog or data high-definition component input interface 253, a composite video input interface 252, a USB input interface 254, an RGB port (not shown in the figures), etc.
In other exemplary embodiments, the input/output interface 250 may also form a composite input/output interface from the plurality of interfaces described above.
The modem 220 receives broadcast television signals by wired or wireless means, may perform modulation and demodulation processing such as amplification, mixing and resonance, and demodulates, from among a plurality of wireless or wired broadcast television signals, the television audio/video signals carried in the frequency of the television channel selected by the user, as well as the EPG data signals.
The modem 220 responds, under the control of the controller 210, to the television signal frequency selected by the user and the television signal carried by that frequency.
The modem 220 can receive signals in various ways according to the broadcasting system of the television signals, such as terrestrial broadcast, cable broadcast, satellite broadcast or internet broadcast; and, depending on the modulation type, the modulation may be digital or analog. Depending on the type of television signal received, both analog and digital signals may be processed.
In other exemplary embodiments, the modem 220 may also be in an external device, such as an external set-top box, or the like. Thus, the set-top box outputs television audio and video signals after modulation and demodulation, and inputs the television audio and video signals to the display device 200 through the input/output interface 250.
The video processor 260-1 is configured to receive an external video signal and to perform video processing such as decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion and image composition according to the standard codec protocol of the input signal, so as to obtain a signal that can be displayed or played directly on the display device 200.
In some embodiments, the video processor 260-1 includes at least one of a demultiplexing module, a video decoding module, an image compositing module, a frame rate conversion module, a display formatting module, and the like.
The demultiplexing module is used for demultiplexing the input audio/video data stream; for example, an input MPEG-2 stream is demultiplexed into a video signal, an audio signal, etc.
And the video decoding module is used for processing the demultiplexed video signals, including decoding, scaling and the like.
An image synthesis module, such as an image synthesizer, for superimposing and mixing the GUI signal, input by the user or generated by a graphics generator, with the scaled video image, so as to generate an image signal for display.
The frame rate conversion module is configured to convert the frame rate of the input video, for example converting a 60 Hz frame rate into a 120 Hz or 240 Hz frame rate; the common approach is frame insertion.
The display formatting module is used for converting the received video signal at the converted frame rate into a signal conforming to the display format, such as outputting RGB data signals.
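For illustration, the frame-insertion approach mentioned for the frame rate conversion module above can be sketched as below; real modules use motion-compensated interpolation, while this toy version simply blends neighboring frames (frames are assumed to be numeric arrays):

```python
def upconvert_60_to_120(frames: list) -> list:
    """Double the frame rate by inserting one blended frame between neighbors."""
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        out.append((a + b) / 2)  # inserted intermediate frame (naive blend)
    out.append(frames[-1])
    return out
```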
The audio processor 260-2 is configured to receive an external audio signal, decompress and decode the external audio signal according to a standard codec protocol of an input signal, and perform noise reduction, digital-to-analog conversion, amplification processing, and the like, to obtain a sound signal that can be played in a speaker.
In other exemplary embodiments, video processor 260-1 may include one or more chip components. The audio processor 260-2 may also include one or more chips.
And, in other exemplary embodiments, the video processor 260-1 and the audio processor 260-2 may be separate chips or integrated together in one or more chips with the controller 210.
An audio output 270, which under the control of the controller 210 receives the sound signal output by the audio processor 260-2, such as the speaker 272; besides the speaker 272 carried by the display device 200 itself, it may include an external sound output terminal 274 that outputs to a sound-producing device of an external device, such as an external sound interface or an earphone interface.
And a power supply, which under the control of the controller 210 provides power support for the display device 200 from power input by an external power source. The power supply may include a built-in power circuit installed inside the display device 200, or an external power source installed on the display device 200, with a power interface providing the external power source to the display device 200.
A user input interface for receiving an input signal of a user and then transmitting the received user input signal to the controller 210. The user input signal may be a remote control signal received through an infrared receiver, and various user control signals may be received through a network communication module.
In some embodiments, a user inputs a user command through the remote controller 100 or the mobile terminal 300, the user input interface receives the input, and the controller 210 responds to it accordingly.
In some embodiments, a user may input a user command through a Graphical User Interface (GUI) displayed on the display 280, and the user input interface receives the user input command through the Graphical User Interface (GUI). Alternatively, the user may input the user command by inputting a specific sound or gesture, and the user input interface recognizes the sound or gesture through the sensor to receive the user input command.
The controller 210 controls the operation of the display device 200 and responds to the user's operations through various software control programs stored on the memory 290.
As shown in fig. 2, the controller 210 includes a RAM 213, a ROM 214, a graphics processor 216, a CPU processor 212, communication interfaces such as a first interface 218-1 through an nth interface 218-n, and a communication bus. The RAM 213, the ROM 214, the graphics processor 216, the CPU processor 212 and the communication interfaces are connected via the bus.
The ROM 214 is used for storing various system boot instructions. When the display device 200 is powered on upon receiving a power-on signal, the CPU processor 212 executes the system boot instructions in the ROM 214 and copies the operating system stored in the memory 290 into the RAM 213 to start running the operating system. After the operating system is started, the CPU processor 212 copies the various applications in the memory 290 into the RAM 213 and then starts running these applications.
A graphics processor 216 for generating various graphical objects, such as icons, operation menus and graphics displayed for user input instructions. It comprises an arithmetic unit, which performs operations on the various interaction instructions input by the user and displays various objects according to their display attributes, and a renderer, which generates the various objects based on the results of the arithmetic unit and displays the rendered results on the display 280.
CPU processor 212 is operative to execute operating system and application program instructions stored in memory 290. And executing various application programs, data and contents according to various interactive instructions received from the outside, so as to finally display and play various audio and video contents.
In some exemplary embodiments, the CPU processor 212 may include multiple processors, for example one main processor and one or more sub-processors: the main processor performs some operations of the display apparatus 200 in the pre-power-up mode and/or displays pictures in the normal mode, while the sub-processor(s) handle operations in standby mode and the like.
The controller 210 may control the overall operation of the display apparatus 200. For example: in response to receiving a user command for selecting a UI object to be displayed on the display 280, the controller 210 may perform the operation related to the object selected by the user command.
Wherein the object may be any one of the selectable objects, such as a hyperlink or an icon. Operations related to the selected object include, for example, displaying an operation of connecting to a hyperlinked page, document or image, or executing the program corresponding to the icon. The user command for selecting the UI object may be a command input through various input means (e.g., mouse, keyboard, touch pad, etc.) connected to the display device 200, or a voice command corresponding to a voice uttered by the user.
Memory 290 includes storage for various software modules for driving display device 200. Such as: various software modules stored in memory 290, including: a basic module, a detection module, a communication module, a display control module, a browser module, various service modules and the like.
The base module is a bottom-level software module for signal communication between the various hardware components in the display device 200 and for sending processing and control signals to upper-level modules. The detection module is used for collecting various information from the various sensors or the user input interface, and for performing digital-to-analog conversion and analysis management.
For example, the voice recognition module comprises a voice analysis module and a voice instruction database module. The display control module is used for controlling the display 280 to display image content, and can be used for playing multimedia image content, UI interfaces and other information. The communication module is used for control and data communication with external devices. The browser module is used for performing data communication with browsing servers. The service module is used for providing various services and various application programs.
Meanwhile, the memory 290 also stores received external data and user data, images of various items in various user interfaces, visual effect maps of focus objects, and the like.
A block diagram of the configuration of the control device 100 according to an exemplary embodiment is illustrated in fig. 3. As shown in fig. 3, the control device 100 includes a controller 110, a communication interface 130, a user input/output interface 140, a memory 190, and a power supply 180.
The control device 100 is configured to control the display device 200; it may receive input operation instructions from the user and convert them into instructions that the display device 200 can recognize and respond to, acting as an intermediary between the user and the display device 200. For example, when the user operates the channel up/down keys on the control device 100, the display device 200 responds to the channel up/down operation.
In some embodiments, the control device 100 may be a smart device. Such as: the control apparatus 100 may install various applications for controlling the display apparatus 200 according to user's needs.
In some embodiments, as shown in fig. 1, the mobile terminal 300 or another intelligent electronic device may function similarly to the control device 100 after installing an application that manipulates the display device 200. For example, the user may implement the functions of the physical keys of the control device 100 through various function keys or virtual buttons of a graphical user interface available on the mobile terminal 300 or other intelligent electronic device.
The controller 110 includes a processor 112 and RAM113 and ROM114, a communication interface, and a communication bus. The controller 110 is used to control the operation and operation of the control device 100, as well as the communication collaboration among the internal components and the external and internal data processing functions.
The communication interface 130 enables communication of control signals and data signals with the display device 200 under the control of the controller 110. Such as: the received user input signal is transmitted to the display device 200. The communication interface 130 may include at least one of a WiFi chip, a bluetooth module, an NFC module, and other near field communication modules.
A user input/output interface 140, wherein the input interface includes at least one of a microphone 141, a touchpad 142, a sensor 143, keys 144 and other input interfaces. For example, the user can implement user instruction input through actions such as voice, touch, gesture and press; the input interface converts a received analog signal into a digital signal, converts the digital signal into a corresponding instruction signal, and sends it to the display device 200.
The output interface includes an interface that transmits the received user instruction to the display device 200. In some embodiments, it may be an infrared interface or a radio frequency interface. For example, when an infrared signal interface is used, the user input instruction needs to be converted into an infrared control signal according to the infrared control protocol and sent to the display device 200 through the infrared sending module. For another example, when a radio frequency signal interface is used, the user input instruction is converted into a digital signal, modulated according to the radio frequency control signal modulation protocol, and then transmitted to the display device 200 through the radio frequency transmitting terminal.
In some embodiments, the control device 100 includes at least one of the communication interface 130 and an output interface. The control device 100 is provided with a communication interface 130, such as WiFi, Bluetooth or NFC modules, which may encode the user input instruction using a WiFi protocol, Bluetooth protocol or NFC protocol and send it to the display device 200.
A memory 190, for storing various operation programs, data and applications for driving and controlling the control device 100 under the control of the controller 110. The memory 190 may store various control signal instructions input by the user.
A power supply 180, for providing operating power support for the elements of the control device 100 under the control of the controller 110; it may be a battery and associated control circuitry.
A schematic diagram of the functional configuration of the display device 200 according to an exemplary embodiment is illustrated in fig. 4. As shown in fig. 4, the memory 290 is used to store an operating system, application programs, contents, user data, and the like, and performs system operations for driving the display device 200 and various operations in response to a user under the control of the controller 210. Memory 290 may include volatile and/or nonvolatile memory.
The memory 290 is specifically used for storing an operation program for driving the controller 210 in the display device 200, and storing various application programs built in the display device 200, various application programs downloaded by a user from an external device, various graphical user interfaces related to the application, various objects related to the graphical user interfaces, user data information, and various internal data supporting the application. The memory 290 is used to store system software such as OS kernel, middleware and applications, and to store input video data and audio data, and other user data.
The memory 290 is specifically used to store drivers and related data for the video processor 260-1, the audio processor 260-2, the display 280, the communication interface 230, the modem 220, the detector 240, the input/output interface, and the like.
In some embodiments, memory 290 may store software and/or programs, the software programs used to represent an Operating System (OS) including, for example: a kernel, middleware, an Application Programming Interface (API), and/or an application program. For example, the kernel may control or manage system resources, or functions implemented by other programs (such as the middleware, APIs, or application programs), and the kernel may provide interfaces to allow the middleware and APIs, or applications to access the controller to implement control or management of system resources.
In some embodiments, memory 290 includes at least one of a broadcast receiving module 2901, a channel control module 2902, a volume control module 2903, an image control module 2904, a display control module 2905, an audio control module 2906, an external instruction recognition module 2907, a communication control module 2908, a light receiving module 2909, a power control module 2910, an operating system 2911, and other applications 2912, a browser module, and so forth. The controller 210 executes various software programs in the memory 290 such as: broadcast television signal receiving and demodulating functions, television channel selection control functions, volume selection control functions, image control functions, display control functions, audio control functions, external instruction recognition functions, communication control functions, optical signal receiving functions, power control functions, software control platforms supporting various functions, browser functions and other applications.
A block diagram of the configuration of the software system in the display device 200 according to an exemplary embodiment is illustrated in fig. 5.
As shown in FIG. 5, the operating system 2911 includes operating software for handling various basic system services and performing hardware-related tasks, and acts as a medium for data processing between applications and hardware components. In some embodiments, portions of the operating system kernel may contain a series of software to manage display device hardware resources and to serve other programs or software code.
In other embodiments, portions of the operating system kernel may contain one or more device drivers, which may be a set of software code in the operating system that helps operate or control the devices or hardware associated with the display device. The driver may contain code to operate video, audio and/or other multimedia components. In some embodiments, a display screen, camera, flash, wiFi, and audio drivers are included.
Wherein, accessibility module 2911-1 is configured to modify or access an application program to realize accessibility of the application program and operability of display content thereof.
The communication module 2911-2 is used for connecting with other peripheral devices via related communication interfaces and communication networks.
User interface module 2911-3 is configured to provide an object for displaying a user interface, so that the user interface can be accessed by each application program, and user operability can be achieved.
Control applications 2911-4 are used for controllable process management, including runtime applications, and the like.
The event transmission system 2914 may be implemented within the operating system 2911 or within the application 2912. In some embodiments it is implemented partly within the operating system 2911 and simultaneously within the application 2912; it listens for various user input events and, in response to the recognition results of various types of events or sub-events, executes one or more sets of predefined operation handlers according to the various event references.
The event monitoring module 2914-1 is configured to monitor a user input interface to input an event or a sub-event.
The event recognition module 2914-2 is configured to input the definitions of various events for the various user input interfaces, recognize the events or sub-events, and transmit them to the processes executing their corresponding one or more sets of handlers.
An event or sub-event refers to an input detected by one or more sensors in the display device 200 or an input from an external control device (e.g., the control device 100), such as: sub-events input through voice, gesture inputs through gesture recognition, sub-events of remote control key instruction inputs from a control device, etc. In some embodiments, the sub-events in the remote control include various forms, including but not limited to one of, or a combination of, pressing the up/down/left/right keys, the OK key, pressing and holding a key, etc., as well as operations of non-physical keys, such as move, hold and release.
Interface layout manager 2913 directly or indirectly receives user input events or sub-events from event delivery system 2914 for updating the layout of the user interface, including but not limited to the location of controls or sub-controls in the interface, and various execution operations associated with the interface layout, such as the size or location of the container, the hierarchy, etc.
As shown in fig. 6, the application layer 2912 contains various applications that may also be executed on the display device 200. Applications may include, but are not limited to, one or more applications such as: at least one of a live television application, a video on demand application, a media center application, an application center, a gaming application, and the like.
Live television applications can provide live television through different signal sources. For example, a live television application may provide television signals using inputs from cable television, radio broadcast, satellite services, or other types of live television services. And, the live television application may display video of the live television signal on the display device 200.
Video on demand applications may provide video from different storage sources. Unlike live television applications, video-on-demand provides video displays from some storage sources. For example, video-on-demand may come from the server side of cloud storage, from a local hard disk storage containing stored video programs.
The media center application may provide various applications for playing multimedia content. For example, a media center may be a different service than live television or video on demand, and a user may access various images or audio through a media center application.
An application center may be provided to store various applications. The application may be a game, an application, or some other application associated with a computer system or other device but which may be run in a smart television. The application center may obtain these applications from different sources, store them in local storage, and then be run on the display device 200.
A schematic diagram of a user interface in a display device 200 according to an exemplary embodiment is illustrated in fig. 7. As shown in fig. 7, the user interface includes a plurality of view display areas, in some embodiments a first view display area 201 and a play screen 202, where the play screen includes a layout of one or more different items. The user interface also includes a selector indicating that an item is selected; the position of the selector can be moved by user input to change the selection to a different item.
It should be noted that the multiple view display areas may present different levels of display images. For example, the first view display region may present video chat item content and the second view display region may present application layer item content (e.g., web page video, VOD presentation, application screen, etc.).
Optionally, different view display areas have different presentation priorities, and view display areas with different priorities differ in display priority. For example, if the priority of the system layer is higher than that of the application layer, then when the user uses the selector and switches pictures in the application layer, the picture displayed in the system-layer view display area is not blocked; and when the size and position of the application-layer view display area change according to the user's selection, the size and position of the system-layer view display area are unaffected.
Display images of the same level may also be presented. In that case, the selector may switch between the first view display area and the second view display area, and the size and position of the second view display area may change as the size and position of the first view display area change.
In some embodiments, any one of the areas in fig. 7 may display a screen acquired by the camera.
In some embodiments, controller 210 controls the operation of display device 200 and responds to user operations associated with display 280 by running various software control programs (e.g., an operating system and/or various application programs) stored on memory 290. For example, control presents a user interface on a display, the user interface including a number of UI objects thereon; in response to a received user command for a UI object on the user interface, the controller 210 may perform an operation related to the object selected by the user command.
In some embodiments, some or all of the steps involved in embodiments of the present application are implemented within an operating system and in a target application. In some embodiments, the target application program for implementing some or all of the steps of the embodiments of the present application is referred to as "baby exercise work", which is stored in the memory 290, and the controller 210 controls the operation of the display device 200 and responds to user operations related to the application program by running the application program in an operating system.
In some embodiments, the display device obtains the target application, various graphical user interfaces associated with the target application, various objects associated with the graphical user interfaces, user data information, and various internal data supporting the application from the server and stores the aforementioned data information in the memory.
In some embodiments, the display device obtains media resources, such as picture files and audio video files, from the server in response to the launching of the target application or a user operation on a UI object associated with the target application.
It should be noted that the target application is not limited to running on the display devices shown in fig. 1-7. It may also run on other handheld devices that provide voice and data connectivity and have wireless connection capability, or on other processing devices connected to a wireless modem, such as mobile phones (or "cellular" phones) and computers with mobile terminals, as well as portable, pocket-sized, handheld, computer-built-in, or vehicle-mounted mobile devices that exchange data with a radio access network.
FIG. 8 exemplarily shows a user interface that is one implementation of the system home page of the display device. As shown in fig. 8, the user interface displays a plurality of items (controls), including a target item for launching the target application; here, the target item is the item "baby exercise work". When the display presents the user interface shown in fig. 8, the user can operate the target item "baby exercise work" by operating a control device (such as the remote controller 100), and in response to the operation on the target item, the controller starts the target application.
In some embodiments, the target application refers to a functional module that plays an exemplary video in a first video window on a display screen. Wherein the demonstration video refers to a video exhibiting demonstration actions and/or demonstration sounds. In some embodiments, the target application may also play the local video captured by the camera in a second video window on the display screen.
When the controller receives an input instruction indicating to start the target application, the controller presents a target application homepage on the display in response to the instruction. On the application homepage, various interface elements such as icons, windows, controls and the like can be displayed on the interface, including but not limited to a login account information display area (column frame control), a user data (experience value/dance power value) display area, a window control for playing recommended videos, a related user list display area and a media asset display area.
In some embodiments, at least one of the user's nickname, avatar, member identification, and membership validity period may be presented in the login account information display area; the user data display area may show the user's data related to the target application, such as the experience value/dance power value and/or the corresponding star-level identification; the related user list display area may show a ranking list (such as an experience-value ranking) of users within a preset geographical range over a preset time period, or the user's friend list, where the ranking list or friend list may show each user's experience value/dance power value and/or corresponding star-level identification; and the media asset display area shows media assets by category. In some embodiments, a plurality of controls can be displayed in the media asset display area, different controls corresponding to different types of media assets, and the user can trigger display of the media asset list of the corresponding type by operating a control.
In some embodiments, the user data display area and the login account information display area may be a single display area; for example, the user data related to the target application is shown in the login account information display area.
FIG. 9 shows one implementation of the above target application home page. As shown in fig. 9, the login account information display area shows the user's nickname, avatar, member identification, and membership validity period; the user data display area shows the user's dance power value and star-level identification; the related user list display area shows the "dance master ranking (this week)"; and the media asset display area provides media asset type controls such as "loving course", "happy course", "dazzling course", and "my dance works". By operating the control device, the user can operate a type control to view the media asset list of the corresponding type, and can select the media asset video to practice from the media asset list under any type. For example, when the focus is moved to the "initiating class" control and a confirmation operation from the user is received, the "initiating class" media asset list interface is displayed, and the corresponding media asset file is loaded and played according to the media asset control the user selects in that interface.
In addition, the interface shown in FIG. 9 includes a window control and an ad spot control for playing recommended videos. The recommended video may be automatically played in a window control as shown in fig. 9, or may be played in response to a play instruction input by the user. For example, the user can move the position of the selector (focus) by operating the control device so that the selector falls into a window control for playing the recommended video, and in the case where the selector falls into the window control, the user operates an "OK" key on the control device to input an instruction indicating that the recommended video is played.
In some embodiments, the controller obtains information from the server for display in a page as shown in FIG. 9, such as login account information, user data, related user list data, recommended videos, and the like, in response to an instruction to launch the above-described target application. The controller draws an interface as shown in fig. 9 through the graphic processor according to the acquired information and controls presentation on the display.
In some embodiments, according to the media asset control selected by the user, the controller obtains the media asset ID corresponding to the control and/or the user identifier of the display device and sends a loading request to the server; the server queries the corresponding video data according to the media asset ID and/or determines the rights of the display device according to the user identifier, and feeds the obtained video data and/or rights information back to the display device. The controller then plays the video according to the video data and/or prompts the user about rights according to the rights information.
In some embodiments, the target application is not a separate application but a part of the application shown in fig. 8, that is, a functional module of that application. In some embodiments, in addition to title controls such as "my", "movie", "child", "VIP", "education", "mall", and "application", the TAB bar of the interactive interface further includes a "dance power" title control. The user can display the corresponding title interface by moving the focus to a different title control; for example, after the focus is moved to the "dance power" title control, the interface shown in fig. 9 is entered.
With the popularity of intelligent display devices, users increasingly seek entertainment through the large screen and invest more time and money in cultivating their interests. Through the target application, the present application provides a follow-up experience for action and/or sound skills (such as the actions in dance, gymnastics, fitness, and karaoke), so that users can learn these skills at home at any time.
In some embodiments, the media asset videos presented in the media asset list interfaces (e.g., the "loving course" and "initiating class" list interfaces in the above examples) include exemplary videos, which include but are not limited to videos demonstrating dance actions, videos demonstrating fitness actions, videos demonstrating gymnastic actions, song MV videos played by the display device in a karaoke scene, or videos of exemplary avatar actions. In the embodiments of the present application, a teaching video is also called a demonstration video; while watching it, the user can synchronously make the same actions as those demonstrated in the video, so as to use the display device for home dancing or home fitness. This function can be vividly described as "watch-while-training".
In some embodiments, the "look-at-edge" scenario is as follows: a user (such as a child or teenager) can watch dance teaching video and exercise dance, a user (such as an adult) can watch body-building teaching video and exercise body-building action, the user can connect K songs with friend video, the user can sing songs and then follow MV video or virtual image to make actions, and the like. For convenience of explanation and distinction, in the scene of "look-while-training", the action made by the user is referred to as a user action or a follow-up action, the demonstration action in the video is referred to as a demonstration action, the video showing the demonstration action is a demonstration video, and the action made by the user is a local video acquired after the camera is shown.
In some embodiments, if the display device has an image collector (or camera), the image collector may collect images or a video stream of the user's follow-up actions, so that the user's follow-up process is recorded with pictures or video as the carrier. Further, the user's follow-up actions are recognized from the pictures or video, compared with the corresponding demonstration actions, and the user's follow-up performance is evaluated according to the comparison.
In some embodiments, time tags corresponding to standard action frames may be preset in the demonstration video; the image frames at and/or adjacent to the time tag positions in the local video are matched and compared against the standard action frames, and an evaluation is then made according to the degree of action matching.
In some embodiments, time tags corresponding to standard audio clips may be preset in the demonstration video; the audio clips at and/or adjacent to the time tag positions in the local video are matched and compared against the standard audio clips, and an evaluation is then made according to the degree of matching.
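A minimal sketch of this tag-driven comparison follows, assuming hypothetical helpers `get_local_frames` (returning the local frames captured in a time window) and `pose_similarity` (scoring a local frame against a standard action frame); neither name comes from the actual implementation:

```python
# Sketch of tag-driven action scoring: for each preset time tag, the local
# frames at and near the tagged timestamp are compared against the standard
# action frame, and the best match contributes to the overall evaluation.

def score_follow_up(time_tags, get_local_frames, pose_similarity, window_s=0.5):
    """time_tags: list of (timestamp_s, standard_frame) preset in the video."""
    scores = []
    for timestamp, standard_frame in time_tags:
        # Frames at the tag position and/or adjacent positions.
        candidates = get_local_frames(timestamp - window_s, timestamp + window_s)
        if not candidates:
            continue  # no local frames captured in this window
        best = max(pose_similarity(f, standard_frame) for f in candidates)
        scores.append(best)
    return sum(scores) / len(scores) if scores else 0.0
```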
In some embodiments, the display interface presents the local video stream (or local photos) collected by the camera and the demonstration video that the user follows, synchronously on the display. A first video window and a second video window are arranged in the display interface; the first video window plays the demonstration video and the second video window plays the local video, so that the user can directly watch his or her own follow-up actions and intuitively see their shortcomings, and thus improve in time.
When the display shows the interface in fig. 9, or the media asset list interface displayed after an operation on that interface is received, the user may select and play the media asset video to practice by operating the control device. For ease of explanation and distinction, the media asset video the user selects to practice is collectively referred to as the target video (i.e., the demonstration video corresponding to the selected control).
In some embodiments, in response to an instruction input by the user to follow the target video, the display device controller acquires the target video from the server according to the media asset ID corresponding to the selected control and detects whether a camera is connected. If a camera is detected, the camera is controlled to rise and start, so that it begins to collect the local video stream, and the loaded target video and the local video stream are displayed on the display at the same time; if no camera is detected, only the target video is played on the display.
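The branch on camera detection might look like the following sketch, where `controller`, its windows, and the camera object are placeholder names for device-side APIs rather than the actual implementation:

```python
# Sketch of the camera-detection branch when follow-up starts.

def start_follow_up(controller, media_asset_id):
    target_video = controller.fetch_video(media_asset_id)  # from server by asset ID
    if controller.detect_camera():
        controller.camera.rise()               # extend the concealed camera
        controller.camera.start_capture()      # begin collecting the local stream
        controller.first_window.play(target_video)
        controller.second_window.play(controller.camera.local_stream())
    else:
        # No camera: play the target video alone; the second window shows a
        # preset prompt instead of the local picture.
        controller.first_window.play(target_video)
        controller.second_window.show_text("No camera detected")
```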
In some embodiments, a first play window and a second play window are set in the display interface used during follow-up (i.e., the follow-up interface). After loading of the target video is completed, in response to no camera being detected, the target video is played in the first play window and a preset prompt or a black screen is displayed in the second play window. In some embodiments, when no camera is detected, a no-camera reminder is displayed in a floating layer above the follow-up interface; after the user confirms, the follow-up interface is entered and the target video is played, and when the user inputs a disagreement instruction, the target application exits or returns to the previous interface.
In the case that the camera is detected, the controller sets a first playing window on a first layer of the user interface, sets a second playing window on a second layer of the user interface, plays the acquired target video in the first playing window, and plays the picture of the local video stream in the second playing window. The first playing window and the second playing window can be tiled, wherein the tiled display means that a plurality of windows divide a screen according to a certain proportion, and no superposition exists between the windows.
In some embodiments, the first playback window and the second playback window are formed by window components tiled on the same layer that occupy different positions.
Fig. 10a illustrates a user interface showing one implementation of the first play window and the second play window. The target video picture is displayed in the first play window, and a picture of the local video stream is displayed in the second play window. As shown in fig. 10a, the first play window and the second play window are tiled in the display area of the display; in some embodiments, the two windows have different sizes.
In the case that no camera is detected, the controller plays the acquired target video in the first play window and displays an occlusion layer or a preset picture file in the second play window. The first play window and the second play window can be tiled, where tiled display means that multiple windows divide the screen in a certain proportion with no overlap between them.
Fig. 10b illustrates another user interface showing another implementation of the first play window and the second play window. Unlike in fig. 10a, the target video picture is displayed in the first play window while an occlusion layer is displayed in the second play window, and a preset text element "no camera detected" is shown in the occlusion layer.
In some other embodiments, in the event that no camera is detected, the controller sets a first play window at a first layer of the user interface, the first play window being displayed full screen within a display area of the display.
In some embodiments, in the case that the display device has a camera, after receiving an instruction input by a user to instruct to follow a certain exemplary video, the controller enters a follow-up interface to directly play the exemplary video and the local video stream.
In other embodiments, the controller, upon receiving an instruction to follow-up with the demonstration video, first enters a guiding interface in which only the local video frames are presented, without playing the demonstration video frames.
In some embodiments, the camera is a concealable camera that is hidden inside or behind the display when not in use. When the camera is called up, the controller controls the camera to rise and turn on, where rising extends the camera out of the frame of the display, and turning on causes the camera to begin capturing images.
In some embodiments, to increase the camera angle of the camera, the camera may be rotated in a lateral direction or a longitudinal direction, where the lateral direction refers to a horizontal direction when the video is being viewed normally, and the longitudinal direction refers to a vertical direction when the video is being viewed normally. The acquired image can be adjusted by adjusting the focal length of the camera in the depth direction perpendicular to the display screen.
In some embodiments, when there is no moving object (i.e., human body) in the local video frame, or when there is a moving object in the local video frame and the deviation of the target position where the moving object is located from the preset desired position is greater than the preset threshold, a graphic element for identifying the preset desired position is presented above the local video frame, and a prompt control for guiding the moving object to move to the desired position is presented above the local video frame according to the deviation of the target position from the desired position.
The moving target (human body) is a local user; in different situations, there may be one or more moving targets in the local video picture. The desired position is a position set according to the collection area of the image collector; when the moving target (i.e., the user) is at the desired position, the local images collected by the image collector are most favorable for analyzing and comparing the user actions in the images.
In some embodiments, the alert control graphic for directing movement of the moving object to the desired location includes an arrow graphic indicating a direction with an arrow pointing toward the desired location.
In some embodiments, the desired position refers to a graphic frame displayed on the display, and the controller sets the graphic frame in a floating layer above the local video frame according to the position and angle of the camera and a preset mapping relationship, so that a user can intuitively see to what position the user needs to move.
In use, the appropriate position for the user to stand in front of the display device varies with the camera's lifting height and/or rotation angle, since these differences change the images the camera collects. The preset position of the graphic frame therefore needs to be adjusted adaptively, so that the position the user is guided to stand at in front of the display device remains a reasonable one.
In some embodiments, the position mapping of the graphic frame is described in the following embodiments.
in some embodiments, the video window for playing the local video picture is located at a first layer, the prompt control and/or the graphical frame is located at a second layer, and the second layer is located above the first layer.
In some embodiments, the controller may display the video window for playing the local video picture in the second layer of the display interface; at this time, the follow-up interface has not been loaded, or the follow-up interface sits in a background page stack.
In some embodiments, the prompt for guiding the moving target to the desired position may be an interface prompt identifying the target movement direction, and/or a voice prompt announcing the target movement direction may be played.
The target movement direction is obtained according to the offset of the target position relative to the desired position. When there is one moving target in the local video picture, the target movement direction is obtained from the offset of that target's position relative to the desired position; when there are multiple moving targets in the local video picture, the target movement direction is obtained from the smallest of the offsets corresponding to the multiple moving targets.
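A sketch of deriving the prompted direction from the offsets, using the smallest offset when several moving targets are present; the sign conventions (positive dx meaning the desired position lies to the user's right, positive dy meaning farther from the camera) are assumptions for illustration only:

```python
# Sketch: choose the prompted movement direction from the offsets of the
# desired position relative to each detected moving target.

import math

def movement_direction(offsets):
    """offsets: list of (dx, dy) = desired position minus target position."""
    dx, dy = min(offsets, key=lambda o: math.hypot(o[0], o[1]))
    hints = []
    if dx > 0:
        hints.append("move a little to the right")
    elif dx < 0:
        hints.append("move a little to the left")
    # Treating the second component as distance from the camera is an assumption.
    if dy > 0:
        hints.append("move a little backward")
    elif dy < 0:
        hints.append("move a little forward")
    return " and ".join(hints) or "hold position"

print(movement_direction([(120, 0), (-40, 10)]))  # smallest offset wins
```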
In some embodiments, the prompt control may be an arrow prompt, and the direction of the arrow may be determined according to the target movement direction so as to point toward the graphic element 112.
In some embodiments, a floating layer, such as a semi-transparent floating layer, having a transparency greater than a preset transparency (e.g., 50%) is presented above the local video frame, and a graphic element for identifying a desired position is displayed in the floating layer, so that a user can view the local video frame of the local video through the floating layer.
In some embodiments, another floating layer with transparency greater than a preset transparency (such as 50%) is presented above the local video frame, and a graphic element for identifying the target moving direction is displayed in the floating layer as a prompt control for guiding the user to move the position.
In some embodiments, the graphical element for identifying the desired location and the reminder control for identifying the direction of movement of the target are displayed in the same float layer.
Fig. 11 illustrates a user interface in which a local video picture is displayed substantially full screen, and a semi-transparent floating layer is displayed over it. As shown in fig. 11, the target movement direction is identified in the floating layer by graphic element 111, and the desired position by graphic element 112; the position of graphic element 111 does not coincide with that of graphic element 112. The moving target (user) can gradually move to the desired position according to the target movement direction identified by graphic element 111. When the moving target in the local video picture has moved to the desired position, its contour coincides with graphic element 112 to the greatest extent. In some embodiments, graphic element 112 is a graphic frame.
In some embodiments, the direction of movement of the target may also be identified by an interface text element, such as "move a little to the left" as exemplarily shown in FIG. 11, or the like.
In some embodiments, the display device controller receives a preset instruction, such as an instruction to follow-up with an exemplary video, in response to which the image collector is controlled to collect local images to generate a local video stream; presenting a local video screen in a user interface; detecting whether a moving target exists in a local video picture; and when the moving object exists in the local video picture, respectively acquiring the position coordinates of the moving object and the expected position in a preset coordinate system, wherein the position coordinates of the moving object in the preset coordinate system are quantized representations of the target position of the moving object, and the position coordinates of the expected position in the preset coordinate system are quantized representations of the expected position. Further, the shift of the target position with respect to the desired position is calculated from the position coordinates of the moving target and the desired position in the preset coordinate system.
In some embodiments, the display device controller receives an instruction to follow a target video and, in response, starts the image collector to collect a local video stream; presents a preview picture of the local video stream in the user interface; detects whether a moving target exists in the preview picture; and, when a moving target exists, acquires the position coordinates of the moving target in the preset coordinate system, which are a quantized representation of the target position. Further, the offset of the target position relative to the desired position is calculated from the position coordinates of the moving target and of the desired position in the preset coordinate system, where the position coordinates of the desired position are a quantized representation of the desired position.
In some embodiments, the position coordinates of the moving object in the preset coordinate system may be a set of position coordinate points of the contour of the moving object (i.e., the object contour) in the preset coordinate system. By way of example, a target profile 121 is shown in fig. 12.
In some embodiments, the target contour includes a torso portion and/or a target reference point, wherein the target reference point may be a midpoint of the torso portion or a center point of the target contour. Illustratively, torso portion 1211 and target reference point 1212 are shown in fig. 12. In these embodiments, acquiring the position coordinates of the moving object in the preset coordinate system includes: identifying a target contour from the preview picture, the target contour including a torso portion and/or a target reference point; and acquiring position coordinates of the trunk part and/or the target reference point in a preset coordinate system.
In some embodiments, the graphical element for identifying the desired location includes a graphical torso portion and/or a graphical reference point that corresponds to the target reference point in the above embodiments, i.e., if the target reference point is the midpoint of the torso portion, the graphical reference point is the midpoint of the graphical torso portion, and if the target reference point is the center point of the target contour, the graphical reference point is the center point of the graphical element. By way of example, a graphical torso portion 1221 and a graphical reference point 1222 are shown in fig. 12. In these embodiments, the obtaining of the position coordinates of the desired position in the preset coordinate system is obtaining the position coordinates of the torso portion of the figure and/or the reference point of the figure in the preset coordinate system.
In some embodiments, the offset of the target position relative to the desired position is calculated from the position coordinates of the torso portion in the preset coordinate system and the position coordinates of the torso portion of the graphic in the preset coordinate system.
In some embodiments, the origin of the preset coordinate system may be any preset point. Taking the origin as the pixel at the lower left corner of the display screen as an example, the torso part can be identified by the coordinates of two diagonal points or of at least two other points. If the coordinates of the target torso part are (X1, Y1; X2, Y2) and the coordinates of the graphic torso part are (X3, Y3; X4, Y4), the positional offset between the two is (X3-X1, Y3-Y1; X4-X2, Y4-Y2), and the user can be reminded according to the correspondence between the offset and the prompt, so that the overlap of the target torso part and the graphic torso part reaches the preset requirement.
In some embodiments, the offset of the target torso part from the graphic torso part may be measured by the overlapping area of the two graphics; the user is reminded that the position adjustment succeeded when the overlapping area, or its proportion, reaches a preset threshold.
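A minimal sketch of this overlap test, with both torso parts represented as axis-aligned rectangles (x1, y1, x2, y2) in the preset coordinate system (origin at the lower left corner of the screen); the 0.8 threshold is illustrative, not a value from the present application:

```python
# Sketch of the overlap measure between the target torso part and the
# graphic torso part, each given as a rectangle (x1, y1, x2, y2).

def overlap_ratio(target, graphic):
    x1 = max(target[0], graphic[0]); y1 = max(target[1], graphic[1])
    x2 = min(target[2], graphic[2]); y2 = min(target[3], graphic[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    graphic_area = (graphic[2] - graphic[0]) * (graphic[3] - graphic[1])
    return inter / graphic_area if graphic_area else 0.0

def position_adjusted(target, graphic, threshold=0.8):
    """Remind the user of success when the overlap share reaches the threshold."""
    return overlap_ratio(target, graphic) >= threshold

print(position_adjusted((420, 200, 840, 900), (400, 180, 880, 920)))  # True
```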
In some embodiments, when the user moves to the left, the user is reminded that the position adjustment succeeded once the target torso part completes its overlap with the right-side border of the graphic torso part. This ensures that the user has fully entered the recognition area.
In some embodiments, when the user moves to the right, the user is reminded that the position adjustment succeeded once the target torso part completes its overlap with the left-side border of the graphic torso part. This ensures that the user has fully entered the recognition area.
In other embodiments, the offset of the target position relative to the desired position is calculated based on the position coordinates of the target reference point in the preset coordinate system and the position coordinates of the graphic reference point in the preset coordinate system.
In some embodiments, the origin of the preset coordinate system may be any preset point. Taking the origin as the pixel at the lower left corner of the display screen as an example, if the coordinates of the target reference point 1212 are (X1, Y1) and the coordinates of the graphic reference point 1222 are (X2, Y2), the positional offset between the two is (X2-X1, Y2-Y1). When X2-X1 is positive, a prompt is presented to the left of the graphic element 112 and/or the user is prompted to "move a little to the right"; when X2-X1 is negative, a prompt is presented to the right of the graphic element 112 and/or the user is prompted to "move a little to the left".
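The left/right prompt from the reference-point offset can be sketched as follows; the tolerance value is an illustrative stand-in for the preset threshold:

```python
# Sketch of the left/right prompt described above: a positive X2 - X1 means
# the graphic frame lies to the user's right, so the user is told to move
# right, and vice versa.

def horizontal_prompt(target_point, graphic_point, tolerance_px=10):
    dx = graphic_point[0] - target_point[0]   # X2 - X1, in pixels
    if dx > tolerance_px:
        return "move a little to the right"
    if dx < -tolerance_px:
        return "move a little to the left"
    return None  # within tolerance: position adjustment succeeded

print(horizontal_prompt((300, 540), (960, 540)))  # move a little to the right
```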
In some embodiments, the controller also obtains the focal distance at the position of the human body and, by comparison with a preset focal distance, prompts the user to "move a little forward" or "move a little backward".
In some embodiments, the controller further gives a specific distance for the user to move, where the magnitude is determined by the proportional relationship between the focal distance at the position of the human body and the preset focal distance, and the direction is determined by the sign of the positional offset. For example, when the ratio is 0.8 the user is reminded to move 10 cm, and when the ratio is 1.2 the user is reminded to move 15 cm; when the offset is positive (e.g., positive 800 pix) the movement is to the right, and when the offset is negative (e.g., negative 800 pix) the movement is to the left.
In some embodiments, the user is alerted to the successful position adjustment when the offset value is less than a preset threshold.
In some embodiments, the preset coordinate system is a three-dimensional coordinate system, and further, the position coordinates of the moving object and the desired position in the preset coordinate system are three-dimensional coordinates, and the offset of the object position relative to the desired position is a three-dimensional offset vector.
In some embodiments, assuming the position coordinates of the target reference point in the preset coordinate system are (X1, Y1, Z1) and those of the graphic reference point are (X2, Y2, Z2), the offset vector of the target position relative to the desired position is calculated as (X2-X1, Y2-Y1, Z2-Z1).
In some embodiments, when the deviation of the target position from the expected position is not greater than a preset threshold, then the display of the graphic element for identifying the expected position or the interface prompt for identifying the moving direction of the target is canceled, and a first video window for playing the demonstration video and a second video window for playing the local video are set in the user interface, wherein the second video window and the first video window are tiled in the user interface; the local video frames are played in the second video window and the exemplary video is played in the first video window simultaneously, such as the user interface shown in fig. 10.
It should be noted that, in the above example, the case where the target position is offset from the desired position may be the case where the offset amount between the two is larger than the preset offset amount, and accordingly, the case where the target position is not offset from the desired position may be the case where the offset amount between the two is smaller than the preset offset amount.
In the above embodiments, after receiving the instruction to follow the demonstration video, the controller does not directly play the demonstration video to start the follow-up process; it first displays only the local video picture and, by presenting above it the graphic element identifying the preset desired position and the prompt guiding the moving target to the desired position, guides the moving target (user) to move to the desired position, so that in the subsequent follow-up process the image collector can collect the images most favorable for analyzing and comparing the user's actions.
In some embodiments, the display device may control the camera's rotation in the horizontal or longitudinal direction according to whether the display device is placed horizontally or wall-mounted; for the same requirement, the rotation angles of the camera differ between the two placement states.
The human body is continuously detected. In some embodiments, when the deviation between the position coordinates of the target reference point and those of the graphic reference point in the preset coordinate system meets the preset requirement, or the overlap of the target torso part and the graphic torso part meets the preset requirement, the controller withdraws the guiding interface and displays the follow-up interface.
In some embodiments, the display shows the interface of fig. 10a while the user follows a media asset video. When the display shows the interface of fig. 10a, the user may trigger display of a floating layer containing controls by operating a designated key on the control device (which may be a down key in some embodiments). In response to the user operation, as shown in fig. 13 or 14, a control floating layer is presented on top of the follow-up interface; it includes at least one of a control for selecting a media asset video, a control for adjusting the playing speed, and a control for adjusting sharpness. The user can move the focus position by operating the control device to select a control in the control floating layer. When the focus falls on a control, a sub-floating layer corresponding to that control is presented, with at least one sub-control displayed in it. For example, when the focus falls on the control for selecting media asset videos, a sub-floating layer corresponding to that control is presented, showing a plurality of different media asset video controls. A sub-floating layer is a floating layer located above the control floating layer. In some embodiments, the controls in the sub-floating layer may be implemented by adding new controls on the control floating layer.
Fig. 13 illustrates an application interface (play control interface) in which a control floating layer is displayed above a layer where a first play window and a second play window are located, where the control floating layer includes a selection control, a double-speed play control, and a sharpness control, and because a focus is located in the selection control, a sub-floating layer corresponding to the selection control is also presented in the interface, where a plurality of controls of other media videos are displayed. In the interface shown in fig. 13, the user can select other media videos to play and follow by moving the focus position.
In some embodiments, when the display displays an interface such as that of fig. 13, the user may move the focus to select the double-speed play control, and in response to the focus falling into the double-speed play control, a sub-floating layer corresponding to the double-speed play control is presented, as shown in fig. 14. And displaying a plurality of sub-controls in the sub-floating layer corresponding to the double-speed playing control, wherein the sub-controls are used for adjusting the playing speed of the target video, and when a certain sub-control is operated, responding to the operation of a user, adjusting the playing speed to the speed corresponding to the operated control. For example, in the interface shown in fig. 14, "0.5 times", "0.75 times", and "1 times" are displayed.
In other embodiments, when the display shows an interface as in fig. 13 or 14, the user may move the focus to select the sharpness control; in response to the focus falling on the sharpness control, a sub-floating layer corresponding to it is presented, as shown in fig. 15. A plurality of controls for adjusting the sharpness of the target video are displayed in this sub-floating layer; when one of them is operated, the sharpness is adjusted, in response to the user operation, to the sharpness corresponding to the operated control. For example, "720P high definition" and "1080P ultra high definition" are displayed in the interface shown in fig. 15.
In some embodiments, when the control float layer is presented in response to a user operation, the focus is displayed on a default control set in advance, which may be any one of a plurality of controls in the control float layer. For example, as shown in fig. 13, the default control set in advance is a selection control.
In some embodiments, other media videos displayed in the sub-floating layer corresponding to the selection control are sent to the display device by the server. For example, in response to a user selection of a selection control, the display device requests media asset resource information, such as a resource name or a resource cover, etc., to be displayed in the selection list from the server. And after receiving the media resource information returned by the server, the display equipment controls the media resource information to be displayed in the selection list.
In some embodiments, to help the user distinguish the media asset resources in the selection list, after receiving the request from the display device, the server queries the user's historical follow-up records according to the user ID to obtain the media asset videos the user has practiced. If the media asset information issued to the display device includes a media asset video the user has practiced, an identifier indicating that the user has practiced the video is added to the corresponding media asset information. Accordingly, when the display device displays the selection list, the practiced media asset videos are identified, for example by the "trained" mark displayed in the interface shown in fig. 13.
In some embodiments, to help the user distinguish the media asset resources in the selection list, after receiving the request from the display device, the server determines whether any resources in the requested selection list are newly added. For example, the server may compare the selection list resources previously sent to the display device with the current selection list resources; if a resource is newly added, the server adds, in the resource information corresponding to that media asset, an identifier indicating that the video is newly added. Accordingly, when the display device displays the selection list, the newly added media asset videos are identified, for example by the "update" mark shown in fig. 13.
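A sketch of how the server side might derive the "update" flags by diffing the previously issued selection list against the current one; this is purely illustrative and not an implementation prescribed by the present application:

```python
# Sketch: mark newly added selection-list entries by comparing the list
# previously issued to the display device with the current list.

def mark_new_assets(previous_ids, current_assets):
    previous = set(previous_ids)
    for asset in current_assets:              # asset: dict with at least "id"
        asset["is_new"] = asset["id"] not in previous
    return current_assets

assets = mark_new_assets(
    previous_ids=["a1", "a2"],
    current_assets=[{"id": "a2", "name": "course-1"},
                    {"id": "a3", "name": "course-2"}],
)
# "a3" is flagged is_new=True; the client renders the "update" mark for it.
```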
In some embodiments, the controller is responsive to an instruction entered by the user to follow-up with the demonstration video, to obtain the demonstration video from the server or to obtain the pre-downloaded demonstration video from the local storage according to the resource identification of the demonstration video.
In some embodiments, the exemplary video includes the image data and audio data described above. The image data comprises a video frame sequence which shows a plurality of actions that a user needs to exercise, such as leg lifting actions, squat actions and the like. The audio data may then be narrative audio of the exemplary action and/or background sound audio (e.g., background music).
In some embodiments, the controller processes the exemplary video by controlling the video processor to parse displayable image signals and audio signals therefrom, the audio signals being processed by the audio processor and played in synchronization with the image signals.
In some embodiments, the exemplary video includes the above-mentioned image data, audio data, and subtitle data corresponding to the audio data, and the controller plays the image, the audio, and the subtitle synchronously when the exemplary video is played.
As previously described, a demonstration video includes a sequence of video frames; under the playback control of the controller, the frames in the sequence are displayed over time to present to the user the change in limb form required by each action. Completing each action requires the user's limbs to go through the same changes in form, and embodiments of the present application analyze and evaluate how well an action is completed according to the recorded limb forms. In some embodiments, continuous joint points are extracted from the local video during the follow-up process and compared with an action model of joint points acquired in advance from the video frame sequence of the demonstration video, so as to determine the degree of action matching.
In some embodiments, the limb-form change process (i.e., the movement trajectory of the limbs) that a key action requires is described as going from an incomplete-state action to a completed-state action and then to a released action; that is, the incomplete-state action occurs before the completed-state action, the released action occurs after it, and the completed-state action is the key action to be completed. In some embodiments, the completed-state action is also called a key demonstration action or key action. In some embodiments, labels may be added to identify the limb change process, and different labels may be preset for the action frames of actions at different nodes.
Based on this, in some embodiments, frames showing key actions in a video frame sequence included in the media video are called key frames, and key labels corresponding to each key frame are identified on a time axis of the media video, that is, a time point represented by the key labels is a time point when the corresponding key frame is played. In addition, key frames in a sequence of video frames constitute a sequence of key frames.
Further, for an exemplary video, it may include a sequence of key frames including a number of key frames, one key frame corresponding to each key label on the time axis, one key frame exhibiting each key action. In some embodiments, the key frame sequence is also referred to as a first key frame sequence.
In some embodiments, N sets of start-stop labels are further preset on the time axis of the media asset video (including the demonstration video), corresponding to N video segments, where each video segment shows one action (also called a completed-state action or key action). Each set of start-stop labels includes a start label and an end label: while the media asset video is playing, when the progress mark on the time axis reaches a start label, the demonstration process of the corresponding action starts playing, and when the progress mark reaches the end label, the demonstration process of that action finishes playing.
Some users (e.g., children) act very slowly due to differences in individual factors such as learning ability and physical coordination, and find it difficult to keep up with the playing speed of the demonstration video.
To solve this problem, in some embodiments, while the demonstration video is playing, the playing speed is automatically reduced when the demonstration process of an action starts playing, so that the user can better learn and practice the key action, avoid missing it, and correct it in time; when the demonstration process of the action (i.e., the video segment showing the action) finishes playing, the original playing speed is automatically restored.
In some embodiments, the video segments showing key actions are called key segments, and a demonstration video generally includes several key segments and at least one non-key segment (or other segments). A non-key segment is a segment of the demonstration video containing non-key actions, for example, a segment in which the action demonstrator remains standing while explaining the action to the audience.
In some embodiments, the controller controls the display to show a user interface that includes a window for playing video. In response to an input instruction to play a demonstration video, the controller acquires the demonstration video, which includes several key segments that, when played, show the key actions the user needs to practice. In some embodiments, the demonstration video that the user instructs to play is also called the target video. The controller controls playing the demonstration video in the window at a first speed; when a key segment starts playing, the speed is adjusted from the first speed to a second speed; when the key segment finishes playing, the speed is adjusted from the second speed back to the first speed, where the second speed is different from the first speed.
In some embodiments, the controller plays the demonstration video and detects the start labels and end labels on its time axis; when a start label is detected, the playing speed is adjusted from the first speed to the second speed, and when the end label is detected, it is adjusted from the second speed back to the first speed. A start label represents the start of a key segment's playback, and an end label represents its completion.
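A sketch of this label-driven speed selection: given the key segments as (start, end) intervals taken from the start and end labels on the time axis, the playing speed for any progress position follows directly (the speed values 1.0 and 0.5 are illustrative):

```python
# Sketch of label-driven speed selection along the playback timeline.

def speed_for_position(position_s, key_segments, first=1.0, second=0.5):
    """key_segments: list of (start_s, end_s) pairs from the start/end labels."""
    for start_s, end_s in key_segments:
        if start_s <= position_s < end_s:
            return second   # inside a key segment: slow playback
    return first            # outside all key segments: normal speed

print(speed_for_position(8.0, [(5.0, 15.0)]))   # 0.5
print(speed_for_position(20.0, [(5.0, 15.0)]))  # 1.0
```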
In some embodiments, the second speed is lower than the first speed.
In the above example, since the second speed is lower than the first speed, automatic slow playback is achieved when a start label is detected (i.e., when the progress mark on the time axis reaches the start label), adapting the playing speed of the demonstration video to the user's action speed; when the end label is detected, the first speed is automatically restored.
In some embodiments, the first speed is a normal playing speed, i.e. 1 time speed, and the second speed may be a preset 0.75 time speed or 0.5 time speed.
In some embodiments, the exemplary video file includes video frame data and audio data, and when playing the exemplary video, the same sampling rate is used to implement reading and processing of the video frame data and the audio data, so when the playing speed of the exemplary video needs to be adjusted, not only the playing speed of the video frame will be adjusted, but also the playing speed of the audio signal will be adjusted, that is, synchronous playing of audio and video is implemented.
In other embodiments, the exemplary video file includes video frame data and audio data, and the sampling rate of the video frame data and the sampling rate of the audio data are independently adjusted and controlled when the exemplary video is played, so that when the playing speed of the exemplary video needs to be adjusted, only the sampling rate of the video frame data can be changed to adjust the playing speed of the video picture, and the sampling rate of the audio data is not changed to keep the playing speed of the audio signal unchanged. For example, when the play speed needs to be reduced, the play speed of the audio is not reduced, so that the user can normally receive the description of the audio and watch the slowed action presentation.
In some embodiments, a key segment includes its video data and its audio data. When the key segment starts playing, the speed of playing its video data is adjusted to the second speed while the speed of playing its audio data is kept at the first speed; when the key segment finishes playing, the video data of the next segment is played at the first speed and its audio data is played synchronously at the first speed, where the next segment is the file segment in the demonstration video that follows and is adjacent to the key segment, such as an adjacent other segment.
In some embodiments, during slow playback of the video frames, it is detected whether the key segment has finished playing (for example, by detecting the end label). If the end label has not been detected when the audio data for the corresponding period finishes playing, that audio data may be played repeatedly; for example, when the video frames play at 0.5x speed, the audio data for the period may be played twice. After the video frame data of the period has finished playing, i.e., after the end label is detected, the audio data and video frame data of the next period are played synchronously.
In other embodiments, during slow playback of the video frames, it is detected whether the key segment has finished playing (for example, by detecting the end label). If the end label has not been detected when the audio data for the corresponding period finishes playing, the audio is paused until the video frame data of the period finishes playing, i.e., until the end label is detected, after which the audio data and video frame data of the next period play synchronously. For example, suppose the start label is at 0:05 and the end label at 0:15 on the time axis. When the video frames play at 0.5x speed, the video frame data for the period 0:05-0:15 takes 20 s to play, while the audio data for that period takes 10 s. To keep audio and video synchronized after 0:15, the audio is paused when the progress mark on the time axis reaches 0:10 and resumes when the progress mark reaches 0:15.
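The pause point in the example above can be computed directly: while the video plays a key segment at a reduced speed, the audio at normal speed finishes early, at the media time start + segment_length x video_speed. A sketch:

```python
# Sketch of the audio pause interval for a slowed key segment, following the
# example above (start label at 5 s, end label at 15 s, video at 0.5x).

def audio_pause_interval(start_s, end_s, video_speed):
    """Return (pause_at, resume_at) in media time for the audio track."""
    segment = end_s - start_s                  # media length of the key segment
    # Audio at 1x finishes after `segment` seconds of wall time; by then the
    # slowed video has advanced only segment * video_speed of media time.
    pause_at = start_s + segment * video_speed
    return pause_at, end_s                     # audio resumes at the end label

print(audio_pause_interval(5, 15, 0.5))  # (10.0, 15): pause at 0:10, resume at 0:15
```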
In some embodiments, during the user follow-up, automatic adjustment is achieved for the playback speed of the exemplary video only, and not for the local video stream.
In some embodiments, the controller controls to display a user interface on the display, the user interface including a first playback window for playing the demonstration video and a second playback window for playing the local video stream; responding to an input instruction for playing the demonstration video, and acquiring the demonstration video; playing the demonstration video in a first playing window, and playing the local video stream in a second playing window; the speed when other fragments of the demonstration video are played in the first playing window is a first speed, and the speed when key fragments of the demonstration video are played is a second speed which is lower than the first speed; the speed of playing the local video stream in the second playing window is a fixed preset speed.
In some embodiments, the fixed preset speed may be a first speed.
In some embodiments, considering young users' weaker learning ability and physical coordination, if the user's age falls within a preset age range, the speed is automatically reduced when the demonstration process of a key action starts playing.
In some embodiments, if the user's age is within a first age range, playing the exemplary video at a first speed; if the user's age is within a second age interval, the exemplary video is played at a second speed, wherein the second speed is different from the first speed.
In some embodiments, the first age interval and the second age interval are age intervals divided by a predetermined age, e.g., an age interval above the predetermined age is defined as the first age interval, and an age interval below the predetermined age (including the predetermined age) is defined as the second age interval. For example, the first age interval or the second age interval may be an age interval of preschool children (e.g., 1-7 years), an age interval of school children, an age interval of young people, an age interval of middle-aged people, or an age interval of elderly people.
It should be noted that a person skilled in the art may set the first speed and the second speed according to the specific value ranges of the first age interval and the second age interval, following the principle that the playback speed of the demonstration video should match the learning ability and motor ability of the user as closely as possible.
It should be further noted that the first age interval and the second age interval are merely exemplary, and in other embodiments, the corresponding playing speed may be set for more age intervals according to need, and when the user's age is located in the corresponding age interval, the exemplary video may be played at the corresponding playing speed. For example, the demonstration video is played at a third speed when the age of the user is in a third age range, at a fourth speed when the age of the user is in a fourth age range, and so on.
In some embodiments, the user is in a first age range when the user's age is greater than a first starting age and less than a first ending age, and the user's age is in a second age range when the user's age is greater than a second starting age and less than a second ending age.
In some embodiments there may be two age intervals, demarcated by a preset age.
In some embodiments, when the age of the user is above a preset age, controlling the display to play the demonstration video at a first speed; when the age of the user is not higher than the preset age, controlling the display to play the demonstration video at a second speed; wherein the second speed is lower than the first speed.
In some embodiments, if the age of the user is not higher than the preset age or is in the second age range, when the key segment starts playing, the playing speed of playing the demonstration video is adjusted to the second speed; and when the playing of the key fragment is finished, adjusting the playing speed of the played demonstration video from the second speed to the first speed.
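A minimal sketch of this age rule; the threshold and speed values are assumptions chosen for illustration:

```python
PRESET_AGE = 7       # assumed boundary between the two age intervals
FIRST_SPEED = 1.0    # normal playback speed
SECOND_SPEED = 0.5   # reduced speed for key segments

def key_segment_speed(user_age: int) -> float:
    """Speed used while a key segment plays: users at or below the
    preset age get the reduced second speed."""
    return FIRST_SPEED if user_age > PRESET_AGE else SECOND_SPEED
```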
In some embodiments, when the key clip begins to play, the speed of the display playing the video data of the key clip is adjusted from the first speed to the second speed, and the speed of the audio output unit playing the audio data of the key clip is maintained at the first speed; after the playing of the audio data of the key fragments is completed, controlling the audio output unit to pause playing of the audio data of the key fragments or controlling the audio output unit to circularly play the audio data of the key fragments. Wherein the audio output unit is display device hardware, such as a speaker, for playing audio data.
In some embodiments, when the key segment ends playing, the display is controlled to play the video data of the next segment at the first speed, and the audio output unit is controlled to synchronously play the audio data of the next segment at the first speed, wherein the next segment is a segment located after the key segment in the exemplary video.
In some embodiments, if the age of the user is not higher than the preset age, controlling the display to play the video data of the demonstration video at the second speed; controlling the audio output unit to play the audio data of the exemplary video at the first speed.
In specific implementation, the controller acquires the age of the user; judging whether the age of the user is lower than a preset age; in the case that the age of the user is lower than the preset age, in the process of playing the demonstration video, detecting a start-stop label on a time axis, adjusting the playing speed of the demonstration video from a first speed to a second speed when the start label is detected, and adjusting the playing speed of the demonstration video from the second speed to the first speed when the end label is detected.
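The controller flow just described might be sketched as follows; `player` is a hypothetical playback interface with `position()`, `set_speed()`, and `is_playing()` methods, not an API defined by this application:

```python
import time

def watch_start_stop_labels(player, labels, first_speed=1.0, second_speed=0.5):
    """labels: list of (time_s, kind) with kind in {"start", "stop"},
    marking the key segments on the demonstration video's time axis."""
    pending = sorted(labels)
    player.set_speed(first_speed)
    while player.is_playing():
        now = player.position()   # current point on the time axis
        while pending and pending[0][0] <= now:
            _, kind = pending.pop(0)
            # slow down at a start label, restore speed at a stop label
            player.set_speed(second_speed if kind == "start" else first_speed)
        time.sleep(0.05)          # poll the time axis periodically
```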
In some embodiments, the controller obtains user information from the user ID, and obtains age information of the user from the user information.
In other embodiments, the controller activates the image collector in response to a user-input instruction to play the demonstration video; identifies the person in the local image acquired by the image collector; and identifies the user's age from the identified person image using a preset age identification model.
In some embodiments, different low-speed parameters may be set for different age ranges; for example, if the user is 3-5 years old, the second speed is 0.5× speed, and if the user is 6-7 years old, the second speed is 0.75× speed.
As previously described, the demonstration video has a specified genre, such as the aforementioned "loving lessons", "Leaction lessons", etc., which may be characterized by a type identifier. In view of the differences in audience and exercise difficulty among different types of videos, in some embodiments, if the type of the demonstration video is a preset type, the speed is automatically reduced when the demonstration of a key action begins to play; if the type is not the preset type, the whole video is played normally unless the user manually adjusts the speed.
In some embodiments, the controller obtains a type identifier of the demonstration video, and if the demonstration video is determined to be of a preset type according to the type identifier, in the process of playing the demonstration video, a start-stop label on a time axis is detected, when a start label is detected, the playing speed of the demonstration video is adjusted from a first speed to a second speed, and when a stop label is detected, the playing speed of the demonstration video is adjusted from the second speed to the first speed.
In some embodiments, the resource information issued by the server to the display device includes a type identifier of the resource, so that the display device can determine from the type identifier whether the demonstration video is of a preset type, where the preset type includes, but is not limited to, the type of some or all of the resources provided by a children's channel, or children's resources provided by other channels.
In some embodiments, different low-speed parameters may be set for different types; for example, if the demonstration video belongs to the "germination class", the second speed is 0.5× speed, and if it belongs to the "music lesson", the second speed is 0.75× speed.
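A sketch of such type-based parameters; the type identifiers and speed values below are illustrative stand-ins for whatever the server's resource information actually carries:

```python
# Hypothetical mapping from a resource type identifier to the reduced
# speed used for its key segments; unmapped types play normally.
SECOND_SPEED_BY_TYPE = {
    "beginner_course": 0.50,
    "music_course": 0.75,
}

def second_speed_for(type_id: str, default: float = 1.0) -> float:
    return SECOND_SPEED_BY_TYPE.get(type_id, default)
```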
In some embodiments, the playing speed may be automatically adjusted according to the user's follow-up situation, so that the low-speed playing mechanism adapts to different users: the parts of the demonstration video that the user can follow smoothly are played at normal speed, and the parts that the user has difficulty following smoothly are played at low speed.
For convenience of explanation and distinction, the video frame sequence included in the demonstration video is referred to as a first video frame sequence. The first video frame sequence includes first key frames for displaying completion-state actions, and the N first key frames corresponding to N completion-state actions form a first key frame sequence; of course, the first video frame sequence also includes non-key frames for displaying incomplete-state actions and release actions.
In some embodiments, in response to an instruction indicating a follow-up demonstration video, the controller activates the image collector and obtains a follow-up video stream of the user from a local video stream collected by the image collector, the follow-up video stream containing some or all of the video frames in the local video stream. The present application refers to a sequence of video frames in a follow-up video stream as a second sequence of video frames, which includes second video frames for exhibiting (recording) user actions.
In some embodiments, user actions are analyzed according to the follow-up video stream. If it is detected that the user does not make the corresponding completion-state action at one or more time points (or time periods) at which the completion-state action should be made, that is, the user's actions are incomplete-state actions, the actions are relatively difficult for the user to follow, and the display device may reduce the playing speed of the demonstration video. If it is detected that the user has already completed the corresponding completion-state action at one or more consecutive time points (or time periods) at which the completion-state action should be made, that is, the user's actions are release actions, the actions are relatively easy for the user to follow, and the display device may increase the playing speed of the demonstration video.
In some embodiments, in response to an input instruction indicating to follow-up an exemplary video, the controller obtains the exemplary video including a first sequence of key frames for exhibiting a completion state action, and obtains a follow-up video stream of the user from a local video stream collected by the image collector, the follow-up video stream including a second sequence of video frames for exhibiting a user action; the controller plays the demonstration video on the display, and adjusts the playing speed of the demonstration video when the user action in the second video frame corresponding to the first key frame is not matched with the completion state action displayed by the first key frame.
The second video frame corresponding to the first key frame is extracted from the second video frame sequence according to the time information of the played first key frame.
In some embodiments, the time information of the first key frame may be a time when the display device plays the frame, and according to the time when the display device plays the first key frame, the second video frame corresponding to the time is extracted from the second video frame sequence, that is, the second video frame corresponding to the first key frame. The second video frame corresponding to a certain time may be a second video frame whose time stamp is the time, or a second video frame whose time shown by the time stamp is closest to the time.
In some embodiments, the same body position may be passed through both during preparation for an action and during its release; therefore, the second video frame and other video frames adjacent to it may be extracted, and after the joint point data of the consecutive frames is extracted, it can be determined whether the action belongs to the preparation phase or the release phase.
In some embodiments, the controller extracts a corresponding second video frame from the second video frame sequence according to the played first key frame, and sends the extracted second video frame (and the corresponding first key frame) to the server; and the server judges whether the user action in the second video frame is matched with the completion state action displayed by the first key frame or not by comparing the corresponding first key frame with the second video frame. And when the server judges that the user action in the second video frame is not matched with the completion state action displayed by the corresponding first key frame, returning a speed adjustment instruction to the display equipment.
In some embodiments, the controller controls the joint point identification (i.e., user action identification) of the second video frame and/or other video frames to be accomplished locally at the display device, and uploads the joint point data and the corresponding time points to the server. The server determines the corresponding target demonstration video frame according to the received time point, compares the received joint point data with the joint point data of the target demonstration video frame, and feeds the comparison result back to the controller.
In some embodiments, the case where the user action in the second video frame does not match the completion status action exhibited by the corresponding first keyframe includes: the user action in the second video frame is an incomplete state action before the complete state action; the user action in the second video frame is a release action following the completion state action. Based on this, if the server determines that the user action in the second video frame is an incomplete state action, an instruction indicating a speed reduction is returned to the display device to cause the display device to reduce the play speed of the target video; if the server determines that the user action in the second video frame is a release action, an instruction for indicating to increase the speed is returned to the display device, so that the display device increases the playing speed of the target video.
Of course, in other embodiments, the display device independently determines whether the user action in the second video frame matches the completion state action shown in the first keyframe, without interaction with the server, which is not described herein.
It should be noted that, in the above implementations that adjust the playing speed in real time according to the user's exercise situation, once the playing speed reaches the preset maximum or minimum value, it is not adjusted up or down any further.
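A minimal sketch of this adaptive rule, including the clamping just described; the classification labels, step size, and bounds are assumptions for illustration:

```python
MIN_SPEED, MAX_SPEED, STEP = 0.5, 1.5, 0.25   # assumed bounds and step size

def adjusted_speed(current: float, user_action: str) -> float:
    """user_action classifies the user's pose in the second video frame
    relative to the completion-state action of the first key frame."""
    if user_action == "incomplete":
        current -= STEP   # user lags behind: slow the demonstration down
    elif user_action == "release":
        current += STEP   # user is already past the action: speed up
    # once the preset maximum or minimum is reached, adjust no further
    return max(MIN_SPEED, min(MAX_SPEED, current))
```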
In some embodiments, the user may pause video playback by operating a key or by voice input, and resume it in the same way. For example, during follow-up of the target video, the user may pause the target video through a key operation or voice input on the control device: when the display shows the interface in fig. 10, the user may press the "OK" key, and the controller, in response to the key input, pauses playback of the target video and presents a pause state identifier, as shown in fig. 16, on the layer above the play screen.
During follow-up of the target video, the controller acquires local images through the image collector and detects whether a user target, i.e., a person (user), is present in the local image. When the display device controller (or the server) detects no moving target in the local image, the display device automatically pauses playback of the target video, or the server instructs the display device to pause playback, and a pause state identifier is presented on the layer above the play screen as shown in fig. 16.
In the above-described embodiments, the pause control performed by the controller does not affect the display of the local video picture.
In the pause play state as shown in fig. 16, the user can resume playing the target video by operating a key or voice input on the control device, for example, the user can press an "OK" key to resume playing the target video, and the controller resumes playing the target video and cancels the display of the pause state identification in fig. 16 in response to the key input of the user.
It can be seen that, in the above example, the user must operate the control device to make the display device resume playing the target video, which makes the follow-up experience unfriendly.
To address this problem, in some embodiments, in response to pause control of target video playback, the controller presents a pause interface on the display and displays a target key frame in the pause interface, where the target video includes a number of key frames, each key frame showing a key action that the user needs to follow, and the target key frame is a designated one of those key frames. After playback of the target video is paused, the image collector is controlled to keep working, and it is judged whether the user action in the local images collected after the pause matches the key action shown by the target key frame. When the user action in the local image matches the key action shown by the target key frame, playback of the target video is resumed; when it does not match, the target video remains paused.
In the above embodiment, the target key frame may be the key frame showing the last key action, that is, the last key action played before the control target video pauses, or may be a representative one of several key frames.
It should be noted that, the target video referred to in the above example refers to a video that is paused, including but not limited to an exemplary video of dance motion, an exemplary video of exercise motion, an exemplary video of gymnastics motion, an MV video played in a K-song scene, or a video of an exemplary avatar motion.
As some possible implementations, a plurality of key labels are identified in advance on the time axis of the target video, where one key label corresponds to one key frame; that is, the time point represented by the key label is the time point at which the corresponding key frame is played. In response to receiving pause control of target video playback, the controller detects the target key label on the time axis according to the time point on the time axis at the moment of pausing, acquires the target key frame according to the target key label, and displays the acquired target key frame in the pause interface, where the time point corresponding to the label of the target key frame precedes the time point on the time axis at the moment of pausing. In this way, a video frame whose action the user has already practiced is used to trigger resumption from the pause, which adds interest to the follow-up.
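The pause-and-resume mechanism described above might be sketched as follows; `player`, `camera`, and `matcher` are hypothetical interfaces standing in for the display device's playback, image collection, and action matching components:

```python
def pause_and_wait_for_match(player, key_label_times, camera, matcher,
                             threshold=0.8):
    """On pause: show the key frame of the nearest earlier key label,
    then resume only when the user's live action matches it."""
    pause_t = player.position()
    # target key label: the latest label at or before the pause point
    target_t = max((t for t in key_label_times if t <= pause_t), default=0.0)
    target_key_frame = player.frame_at(target_t)
    player.pause(show=target_key_frame)   # pause interface with the frame
    while True:
        local = camera.capture()          # the collector keeps working
        if matcher.similarity(local, target_key_frame) >= threshold:
            player.resume(from_s=target_t)   # or from pause_t, per embodiment
            return
```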
In other possible implementations, the controller responds to pause control over playing of the target video, and pauses the target video after the target video is controlled to fall back to the moment of the target key label, so that the target key frame corresponding to the target key label is displayed on a pause interface.
In some embodiments, the target key label is a key label earlier than the current time and closest to the current time on the time axis, and the corresponding target key frame is a key frame showing the previous key action.
In the above example, when or after pause control of target video playback is executed, the target key frame showing a key action is presented in the pause interface as a prompt action for resuming playback. Further, in the paused state, the user can resume playback of the target video simply by performing the prompt action, without operating the control device, which improves the user's follow-up experience.
In some embodiments, displaying the obtained target key frame in the pause interface may proceed as follows: after the time axis is rolled back to the time point corresponding to the target key label, playback of the demonstration video is stopped and a pause control is added to the demonstration video playing window. The controller acquires the target key frame or the joint points of the target key frame; meanwhile, the camera continues to collect local video data and the human body in the video data is detected, and when the matching degree between the human body's action in the video data and the action in the target key frame reaches a preset threshold, the demonstration video is played.
In other possible implementations, the controller responds to receiving pause control over playing of the target video, and controls the target video to pause after rewinding to the moment of the target key label so as to display the target key frame corresponding to the target key label on the pause interface.
In some embodiments, the controller responds to receiving pause control of target video playback by rolling the time axis back to the time point corresponding to the target key label, then stopping playback of the target video and adding a pause control to the video playing window. The controller acquires the target key frame or the joint point data (i.e., action data) of the target key frame; meanwhile, the camera continues to collect local video data and the human body in the video data is detected, and when the matching degree between the human body's action in the video data and the action in the target key frame reaches a preset threshold, playback of the target video is resumed.
In some embodiments, resuming playing the video includes starting at a time point corresponding to the target key label after the rewinding, and continuing playing the target video.
In other embodiments, resuming playing the video includes starting at a point in time when the pause control is received, and continuing to play the target video.
In some embodiments, displaying the obtained target key frame in the pause interface may be that the playback of the target video is stopped without performing the rollback of the time axis, a pause control is added in the video playback window, and the obtained target key frame is displayed in a floating layer above the video playback window. The controller acquires the target key frame or the joint point data of the target key frame, the camera continuously acquires the local video data and detects the human body in the video data, and when the matching degree of the human body action in the video data and the action in the target key frame reaches a preset threshold value, the demonstration video is played and the floating layer of the target key frame is cancelled.
In some embodiments, the target key frame displayed at the time of pause may be any video frame in the video being played.
In some embodiments, the display device may itself complete the comparison between the image frame and the local video frames during the pause, or it may upload the frames to the server so that the server completes the comparison during the pause.
In some embodiments, playing the video may be to continue playing the exemplary video from the point in time corresponding to the backed-up key label.
In some embodiments, continued playback of the exemplary video may be performed at the point in time when the pause control is received.
In some embodiments, displaying the obtained target key frame in the pause interface may be that the playback of the exemplary video is stopped without performing the time-axis rollback, a pause control is added in the exemplary video playing window, and the obtained target key frame is displayed in a floating layer above the exemplary video playing window. The controller acquires the target key frame or the joint point of the target key frame, and simultaneously, the camera continuously acquires the local video data and detects the human body in the video data, and when the matching degree of the action of the human body in the video data and the action in the target key frame reaches a preset threshold value, the demonstration video is played and the floating layer of the target key frame is cancelled.
In some embodiments, the target key frame displayed at the time of pause may be any video frame in the exemplary video.
In some embodiments, the follow-up process ends automatically when the target video being followed finishes playing. In response to the completion of target video playback, the controller turns off the image collector, closes the follow-up interface where the first playback window and the second playback window are located, as shown in fig. 10, and presents an interface containing evaluation information.
In some embodiments, the user may end the follow-up procedure by operating a key or voice input on the control device before completing the follow-up procedure, e.g., the user may operate a "back" key input on the control device to indicate an instruction to end the follow-up. In response to the instruction, the controller pauses playing the target video and presents an interface including the save information, such as the save page exemplarily shown in fig. 17.
When the display shows the save interface of fig. 17, the user can operate the control for returning to the follow-up interface to continue the follow-up, or operate the control for confirming exit from the follow-up to end the follow-up process.
In some embodiments, in response to a user-input instruction to exit the follow-up, the played duration of the target video is determined for use in subsequent playback.
In some embodiments, if the played duration of the target video is not less than a preset duration (e.g., 30 s), the playback position of the target video is saved so that playback can be resumed from it next time; if the played duration is less than the preset duration (e.g., 30 s), the position is not saved and the target video restarts from the beginning the next time it is played.
In some embodiments, if the played duration of the target video is not less than a preset duration (e.g., 30 s), the local image frames corresponding to the target key frames are saved for display in a subsequent evaluation interface or in the playing history. If the played duration is less than the preset duration (e.g., 30 s), the local image frames corresponding to the target key frames are not saved. The local image frames corresponding to the target key frames are the video frames determined from the local video when the target key labels are detected.
In some embodiments, the video frames determined from the local video when a target key label is detected may be the local image frames captured by the camera at the time point when the target key label is detected, or the local image frames, captured at that time point or adjacent time points, that match the target key frame most closely.
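A minimal sketch of this save-on-exit rule; the container names are illustrative, and the 30 s threshold is the example value from the text:

```python
MIN_SAVE_S = 30   # preset duration below which progress is discarded

def on_exit_follow_up(video_id, played_s, progress, snapshots, frames):
    """Persist the playback position and the captured key-frame images
    only if the user played at least MIN_SAVE_S seconds."""
    if played_s >= MIN_SAVE_S:
        progress[video_id] = played_s   # offer "continue" on the next play
        snapshots[video_id] = frames    # local frames for later evaluation
    else:
        progress.pop(video_id, None)    # restart from the beginning next time
        snapshots.pop(video_id, None)
```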
In some embodiments, when the user selects for follow-up a video that was previously played but not finished, an interface including a resume prompt is presented in response to the user's instruction to play the video. The resume prompt interface displays the last playback position and controls for the user to choose whether to resume, so that the user can operate the controls on the interface to decide autonomously. Fig. 18 illustrates a resume prompt interface; as shown in fig. 18, the last playback position (1 minute 30 seconds), a control for restarting playback ("restart"), and a control for resuming ("continue follow-up") are displayed.
In some embodiments, in response to an instruction input by the user in the resume prompt interface shown in fig. 18, playback of the demonstration video is restarted from the beginning, e.g., from 0 minutes 0 seconds; or, in response to such an instruction, playback of the demonstration video is resumed from the last playback position, e.g., from 1 minute 30 seconds.
In some embodiments, the experience value is user data related to level advancement that the user accumulates through behavior in the target application; that is, the user can increase the experience value by following more demonstration videos. The experience value is also a quantitative characterization of the user's proficiency: the higher the experience value, the more proficient the user's practiced actions, and when the experience value accumulates to a certain amount, the user's level is raised.
In some embodiments, the server or the display device counts the experience value increment generated within one statistical period and, after entering the next statistical period, updates the user's total experience value based on the increment generated in the period that just ended.
For example, three, five, or seven days may be preset as one statistical period; accordingly, when the time reaches zero o'clock on the fourth, sixth, or eighth day, the next statistical period begins. As another example, assume one week (from Monday zero o'clock to the next Monday zero o'clock) is one statistical period; when the time reaches Monday zero o'clock, the next statistical period begins.
Based on the above-described experience value statistics method, in some embodiments, the experience value (increment) obtained by the user during the current statistics period is referred to as a first experience value, and the sum of the experience values obtained by the user during each statistics period preceding the current statistics period is referred to as a second experience value. It will be appreciated that the sum of the first experience value and the second experience value is the total amount of experience value of the user at the current time, and the second experience value does not include the first experience value since the current time does not reach the time of updating the total amount of experience value.
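The period rollover described above can be sketched as follows (a minimal illustration, not the application's actual bookkeeping):

```python
def roll_over_period(first_exp: int, second_exp: int) -> tuple:
    """Entering a new statistical period: fold the increment earned in the
    period that just ended into the running total, then reset it."""
    return 0, second_exp + first_exp   # new (first, second) experience values

# e.g. 10 points earned this week, 10012 accumulated from earlier weeks
print(roll_over_period(10, 10012))     # -> (0, 10022)
```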
In some embodiments, when it is desired to display an application homepage, the controller obtains the first experience value and the second experience value, and displays the application homepage including controls for displaying the first experience value and the second experience value according to the obtained first experience value and second experience value.
In some embodiments, the controls for presenting the first experience value and the second experience value comprise a first control for presenting the first experience value and a second control for presenting the second experience value. Illustratively, the first control is the control in which "current week +10" is located in fig. 9, and the second control is the control in which "work value 10012" is located in fig. 9.
In some embodiments, the first or second experience value obtained by the controller is data returned by the server in real time, while in other embodiments it is locally saved data that was last returned by the server.
In some implementation scenarios, when returning from the follow-up results page to the application homepage, the display device controller obtains the latest first experience value from the server and updates the first control in the application homepage accordingly; since the second experience value has not been updated at this time, there is no need to obtain it, i.e., no need to update the second control in the application homepage.
In some implementations, in response to the launching of the target application, the display device controller obtains the latest first and second experience values from the server, and displays the first experience value in the first control of the application homepage and the second experience value in the second control, based on the obtained values.
In some implementation scenarios, when entering the next statistical period, the display device controller obtains the latest second experience value from the server and stores the latest second experience value in the local cache data; when the application homepage is loaded for the first time after the latest second experience value is obtained, updating the second control in the application homepage according to the latest second experience value stored in the local cache data, namely displaying the latest second experience value stored in the local cache data in the second control.
In some implementations, after the server updates the first experience value or the second experience value, returning the updated first experience value or the second experience value to the display device; after receiving the updated first experience value or second experience value returned by the server, the display device stores the updated first experience value or second experience value in local cache data, and when the application homepage needs to be displayed, the first experience value and the second experience value are respectively displayed in a first control and a second control of the application homepage according to the first experience value and the second experience value in the cache data.
In other embodiments, the total amount of experience values of the user at the current time is referred to as a third experience value, and it is understood that the third experience value is a sum of the first experience value and the second experience value.
In some embodiments, when the application homepage needs to be displayed, the controller obtains the first experience value and the third experience value, and displays the application homepage including controls for displaying them.
In some embodiments, the controls in the application homepage for presenting the first experience value and the third experience value include a first control for presenting the first experience value and a third control for presenting the third experience value. When the application homepage is displayed according to the first and third experience values, the first experience value is displayed in the first control and the third experience value in the third control.
It should be noted that the second control and the third control may be the same control or may be different controls. When the second control and the third control are not the same control, they may be displayed simultaneously on the application home page.
According to the above embodiment, one or more of the first experience value, the second experience value and the third experience value may be displayed on the application homepage.
In some embodiments, the display device controller sends a data request to the server for obtaining the user's experience values in response to the request to display the application homepage, the data request including at least user information such as a user identification. In response to the data request, the server judges whether the second experience value has been updated by comparing the currently stored second experience value with the one last returned to the display device; if it has been updated, the server returns the updated second experience value together with the latest first experience value, and if not, it returns only the latest first experience value to the display device. The latest first experience value is the one updated according to the result of the user's most recent follow-up process.
In some embodiments, when the server receives the above data request from the display device, it determines whether the second experience value needs to be updated, and if so, updates it and returns the updated value to the display device. Specifically, in response to the data request, the server acquires the time at which the second experience value was last updated and judges whether the interval since then has reached the duration of a statistical period. If it has, the server obtains the first experience value corresponding to the previous statistical period and updates the second experience value by accumulating that first experience value into it; if it has not, the server does not update the second experience value and directly returns the current first and second experience values, or only the current first experience value, to the display device.
In other embodiments, the server periodically and automatically updates the second experience value based on the corresponding first experience value. For example, at preset intervals (one statistical period), the first experience value corresponding to the previous statistical period is added to the second experience value to obtain a new second experience value.
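A sketch of the lazy, request-driven variant; the user record's field names and the one-week period are assumptions for illustration:

```python
from datetime import datetime, timedelta

PERIOD = timedelta(days=7)   # assumed one-week statistical period

def second_exp_on_request(user: dict, now=None):
    """Fold the previous period's first experience value into the second
    value only if a full period has elapsed since the last update."""
    now = now or datetime.utcnow()
    if now - user["second_updated_at"] >= PERIOD:
        user["second_exp"] += user["last_period_first_exp"]
        user["second_updated_at"] = now
        return user["first_exp"], user["second_exp"]   # both returned
    return user["first_exp"], None                     # only the first value
```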
If the display device controller receives both the first experience value and the second experience value returned by the server, it draws the first control and the second control in the application homepage according to them; if it receives only the first experience value, it draws the first and second controls according to the received first experience value and the second experience value in the local cache data, where the cached second experience value is the one last returned by the server.
In some embodiments, the first control and the second control partially overlap so that a user can intuitively see both controls simultaneously.
In some embodiments, the first control is displayed superimposed over the second control, e.g., in fig. 9, the control at "current week +10" is displayed superimposed over the control at "work value 10012".
In some embodiments, the first control and the second control are different in color, so that a user can intuitively see the two controls at the same time, and the user can conveniently distinguish the two controls.
In some embodiments, the first control is located in the upper right corner of the second control.
In some embodiments, when the controller receives the user's confirmation to exit the follow-up, it turns off the image collector, closes the first playback window and the second playback window in the follow-up interface shown in fig. 10a, and presents a follow-up results page showing the follow-up results.
In some embodiments, in response to the end of the follow-up process, a follow-up results page is presented on the display according to the results of that process, the results including at least one of a star grade score, a score, an experience value increment, the experience value already obtained in the current statistical period (i.e., the first experience value), the sum of the experience values obtained in the statistical periods preceding the current one (i.e., the second experience value), and the total experience value obtained up to the current time.
In some embodiments, the star grade score, the score, and the experience value increment obtained in the follow-up process are determined according to the number of follow-up actions of target key frames completed during target video playback and the action matching degree achieved when completing them; both are positively correlated with the score obtained in the follow-up process, and the star grade score and the experience value increment can be calculated from the score according to a preset calculation rule.
It should be noted that, in some embodiments, if the user exits the follow-up early, the controller, in response to the user's exit instruction, determines whether the played duration of the target video exceeds a preset value; if it does, the controller generates scoring information and detailed score information from the follow-up data generated so far (such as the collected local video stream and the scores of some of the user's actions); if it does not, the controller deletes the generated follow-up data.
Fig. 19A illustrates a follow-up results page. As shown in fig. 19A, the star grade score (four stars), the experience value increment (+4), the first experience value (current week +10), and the second experience value (work value 10012) obtained in the current follow-up process are presented in the form of items or controls, where the first control presenting the first experience value and the second control presenting the second experience value are identical to those shown in fig. 10. In addition, to let the user view detailed results, fig. 19A also shows a control "view results immediately", through which the user can enter an interface presenting detailed result information as shown in fig. 19B, 19D, or 19E.
In the follow-up results page shown in fig. 19A, the star grade score (191D), the experience value increment (192D), the first experience value (193D), and the second experience value (194D) obtained in the follow-up process are displayed by a third element combination (192D), where the displayed experience value increment is the increment determined according to the score obtained in the follow-up process. An element combination refers to one interface element or a combination of several interface elements such as items, text boxes, icons, and controls.
To prevent users from maliciously earning experience values by repeatedly following the same demonstration video, in some embodiments, during the user's follow-up of the demonstration video, the user's follow-up performance is scored according to the local video stream collected by the image collector, and the scoring result is associated with the demonstration video, so that the server can query, by the demonstration video's ID and the user ID, the user's recorded historical highest score for following that video. If the score obtained in the current follow-up process is higher than the recorded historical highest score, the experience value increment for the process is calculated from the score; if it is not higher, the increment is determined to be zero. The recorded historical highest score is the highest score the user has obtained when following the demonstration video over time.
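A minimal sketch of this gating rule; the score-to-increment mapping is a placeholder, as the text leaves the concrete calculation rule open:

```python
def increment_for(score: int, best: dict, user_id: str, video_id: str) -> int:
    """Zero increment unless the user beats their recorded historical
    highest score for this demonstration video."""
    key = (user_id, video_id)
    if score > best.get(key, 0):
        best[key] = score      # record the new historical highest score
        return score // 10     # placeholder rule mapping score to increment
    return 0
```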
In some embodiments, after each follow-up process ends, it is determined whether the user's number of follow-ups in the current statistical period has reached a preset number, and if so, an encouragement experience value increment is generated.
For example, assume one week (from Monday zero o'clock to Sunday zero o'clock) is a statistical period and the preset number is 10. After each follow-up ends, the recorded follow-up count of the current statistical period is incremented by 1, and it is judged whether the latest recorded count has reached 10; if it has, 5 experience value points are generated to encourage the user. Each time Sunday zero o'clock arrives, the next statistical period begins and the recorded follow-up count is cleared. Optionally, several preset numbers may be set, generating different experience values as the follow-up count of the current statistical period reaches each of them; for example, 10 experience value points are generated when the count reaches 20, 15 points when it reaches 30, and so on.
In some embodiments, after each follow-up process ends, it is determined whether the total score obtained by the user in the current statistical period has reached a preset value, and if so, a reward experience value increment is generated.
For example, assume one week (from Monday to Sunday) is a statistical period and the preset score value is 30 points. After each follow-up, the score obtained in the follow-up process is accumulated into the recorded total score of the current statistical period, and it is judged whether the latest recorded total has reached 30 points; if it has, 5 experience value points are generated to reward the user. Each time a period reaches its zero o'clock boundary, the next statistical period begins and the recorded total score is cleared. Alternatively, several preset score values may be set, generating different numbers of experience value points as the total score of the current statistical period reaches each of them; for example, 10 points are generated when the total reaches 40, 15 points when it reaches 50, and so on.
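Both bonus rules can be sketched together; the threshold tables reuse the example values above, and awarding exactly at the crossing moment is an assumption of this sketch:

```python
COUNT_BONUS = {10: 5, 20: 10, 30: 15}   # follow-up count -> bonus points
SCORE_BONUS = {30: 5, 40: 10, 50: 15}   # period total score -> bonus points

def bonus_experience(period_count: int, period_total: int) -> int:
    """Bonus awarded when the current period's follow-up count or total
    score has just reached one of the preset thresholds."""
    return COUNT_BONUS.get(period_count, 0) + SCORE_BONUS.get(period_total, 0)
```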
In some embodiments, after the follow-up process ends, a follow-up results page is presented according to the follow-up results, which include the score obtained in the process, the star grade score, the experience value increment, and so on. The star grade score is determined from the score; the experience value increment includes the increment determined from the score, the increment generated when the user's follow-up count in the current statistical period reaches a preset number, and/or the increment generated when the user's total score in the current statistical period reaches a preset score value.
In some embodiments, the results page is presented differently depending on the source of the experience value increment. Specifically, if, after the follow-up process ends, the user's follow-up count in the current statistical period has reached the preset number, a follow-up results page containing a first element combination is presented, the first element combination showing the increment determined from the follow-up score of the process and the increment determined from reaching the preset number. If, after the process ends, the logged-in user's total follow-up score in the current statistical period is greater than the preset value, a results page containing a second element combination is presented, showing the increment determined from the follow-up score and the increment determined from reaching the preset value. If, after the process ends, neither the preset number of follow-ups nor the preset total score has been reached, a results page containing a third element combination is presented, showing only the increment determined from the follow-up score of the process.
The first element combination, the second element combination, and the third element combination may be one interface element or a combination of multiple interface elements, such as an item, a text box, an icon, and a control.
Fig. 19B is a schematic view of a follow-up results page according to an exemplary embodiment of the present application, specifically the page presented when the user's follow-up count in the current statistical period reaches the preset number. As shown in fig. 19B, the page shows, through a first element combination (202D and 203D), the star grade score (201D) obtained in the follow-up process, the experience value increment determined from the score obtained in the process (202D), the experience value increment determined from the user reaching the preset follow-up count (203D), the first experience value (204D), and the second experience value (205D).
Fig. 19C is a schematic view of a follow-up results page according to an exemplary embodiment of the present application, specifically the page presented when the total score obtained by the user in the current statistical period reaches the preset score value. As shown in fig. 19C, the page shows, through a second element combination (212D and 213D), the star grade score (211D) obtained in the follow-up process, the experience value increment determined from the score obtained in the process (212D), the experience value increment determined from the user reaching the preset score value (213D), the first experience value (214D), and the second experience value (215D).
According to the above embodiments, when the user's follow-up count in the current statistical period reaches the preset number and/or the total score obtained in the current statistical period reaches the preset score value, a certain amount of experience value is awarded to the user as reward or encouragement and displayed on the follow-up results page, which improves the user's enthusiasm for exercise and the user experience.
In some embodiments, while the follow-up results page is displayed, a voice prompt corresponding to the page's content may also be played under the controller's control.
In some embodiments, during playback of the demonstration video (i.e., during the follow-up process), action matching is performed between the demonstration video and the local video stream to obtain the score corresponding to the process; after playback finishes (i.e., after the follow-up process ends), the corresponding star grade score, experience value increment, and so on are determined from the obtained score, and the follow-up results interface is generated.
In some embodiments, the controller obtains the demonstration video in response to an input instruction to play (follow) the demonstration video, and collects the local video stream through the image collector, where the demonstration video includes first video frames showing the demonstration actions the user needs to follow and the local video stream includes second video frames showing the user's actions. The corresponding first and second video frames are matched to obtain a score based on the matching result; if the score is higher than the recorded historical highest score, the experience value increment is determined from the score, and if it is not higher, the increment is determined to be 0.
In some embodiments, when the controller receives the user's confirmation to exit the follow-up, it turns off the image collector, closes the first playback window and the second playback window in the follow-up interface shown in fig. 10a, and presents an interface containing evaluation information.
In some embodiments, in response to the end of the follow-up procedure, an interface is presented on the display containing rating information including at least one of star grade achievements, scoring achievements, experience value increments, and experience value totals.
In some embodiments, the star grade score, the score, and the experience value increment are determined according to the number of follow-up actions of target key frames completed during target video playback and the action matching degree achieved when completing them, both of which are positively correlated with the star grade score, the score, and the experience value increment.
It should be noted that, in some embodiments, if the user exits the follow-up early, the controller, in response to the user's exit instruction, determines whether the played duration of the target video exceeds a preset value; if it does, the controller generates scoring information and detailed score information from the follow-up data generated so far (such as the collected local video stream and the scores of some of the user's actions); if it does not, the controller deletes the generated follow-up data.
FIG. 19A illustrates an interface for presenting scoring information. As shown in FIG. 19A, star grade scores, experience value increments, and the total experience value are presented in the form of items or controls, where the control presenting the total experience value is consistent with that shown in fig. 10. In addition, to let the user view the detailed score, FIG. 19A also shows a control "view the score immediately"; by operating this control, the user can enter an interface presenting detailed score information as shown in any one of fig. 19B to 19E.
In some embodiments, the experience value is user data related to level advancement that the user accumulates through behavior in the target application; that is, the user can increase the experience value by following more demonstration videos. The experience value is also a quantitative characterization of the user's proficiency: the higher the experience value, the more proficient the user's practiced actions, and when the experience value accumulates to a certain amount, the user's level is raised.
To prevent users from maliciously earning experience values by repeatedly following the same demonstration video, in some embodiments, during the user's follow-up of the demonstration video, the user's follow-up performance is scored according to the local video stream collected by the image collector, and a mapping relation exists between the score and the demonstration video. The server can query, by the demonstration video's ID and the user ID, the recorded historical highest score for following that video; if the score is higher than the recorded historical highest score, the new experience value obtained from the score is displayed, and if it is not higher, the original experience value is displayed. The recorded historical highest score is the highest score the user has obtained when following the demonstration video over time.
In some embodiments, for the scoring of the follow-up process, when the follow-up results interface of the process is presented, the score and the new experience value derived from it are presented in that interface.
In some embodiments, during playback of the demonstration video (i.e., during the follow-up process), action matching is performed between the demonstration video and the local video stream to obtain the score corresponding to the process. After playback finishes (i.e., after the follow-up process ends), the follow-up results interface is generated from the obtained score, and an experience value control for displaying the experience value is set in the interface: when the score is higher than the user's historical highest score for the demonstration video, the control displays the experience value updated according to the score, and when it is not higher, the control displays the experience value from before the follow-up process.
In some embodiments, the controller obtains the demonstration video in response to an input instruction to play (follow) the demonstration video, and collects the local video stream through the image collector, where the demonstration video includes first video frames showing the demonstration actions the user needs to follow and the local video stream includes second video frames showing the user's actions. The corresponding first and second video frames are matched to obtain a score based on the matching result; if the score is higher than the recorded historical highest score, the new experience value obtained from the score is loaded into the experience value control, and if it is not higher, the original experience value, i.e., the experience value from before the follow-up process, is loaded and displayed in the control.
In some embodiments, key tags on the timeline are detected while the demonstration video is playing. Each time a key tag is detected, a second key frame corresponding to the first key frame is obtained from the second video frames according to the time information represented by the key tag, where the second key frame shows the user's key follow-up action; a matching result is then obtained for the first key frame and the second key frame that correspond to the same key tag. For example, the first key frame and the second key frame corresponding to the key tag may be uploaded to the server, so that the server performs skeleton point matching between the key demonstration action shown in the first key frame and the key user action shown in the second key frame, after which the display device receives the matching result returned by the server. As another example, the display device controller may identify the key demonstration action in the first key frame and the key follow-up action in the second key frame, and then perform skeleton point matching on the identified actions to obtain the matching result. Thus each second key frame corresponds to one matching result, which represents the matching degree or similarity between the user action in the second key frame and the key action in the corresponding first key frame: a low matching degree/similarity indicates that the user's action is not standard enough, while a high matching degree/similarity indicates that the user's action is standard.
In some embodiments, the display device may obtain the joint point data of the second key frame from the local video data and upload that joint point data to the server, so as to reduce the data transmission load.
In some embodiments, the display device may upload the key tag identifier to the server instead of transmitting the first key frame, which likewise reduces the data transmission load.
In some embodiments, key tags on the timeline are detected while the demonstration video is playing; each time a key tag is detected, a second key frame corresponding to that key tag is acquired from the second video frames according to the key tag's time information, the second key frame showing the user's follow-up action.
In some embodiments, the second key frame is the image frame of the local video at the time indicated by the key tag.
In the embodiments of the present application, since the time point represented by a key tag is the time point corresponding to a first key frame, and the second key frame is extracted from the second video frame sequence according to the time information of that first key frame, one key tag corresponds to one pair of a first key frame and a second key frame.
In some embodiments, the second key frame is selected from the image frames of the local video at and adjacent to the time of the key tag; the image used for the evaluation presentation may then be whichever of those frames matches the first key frame to the highest degree.
In some embodiments, the time information of the first key frame may be the time at which the display device plays that frame; according to this time, the second video frame corresponding to it is extracted from the second video frame sequence, i.e., the second key frame corresponding to the first key frame. The video frame corresponding to a certain time may be the video frame whose timestamp equals that time, or the video frame whose timestamp is closest to it.
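As a minimal sketch of the nearest-timestamp selection just described (the function name and the tuple layout are illustrative, not part of the embodiment):

```python
def pick_second_key_frame(local_frames, key_time):
    """Return the local video frame whose timestamp is closest to key_time.

    local_frames: list of (timestamp_seconds, frame) tuples from the
    local video stream captured by the image collector.
    """
    return min(local_frames, key=lambda tf: abs(tf[0] - key_time))[1]
```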
In some embodiments, the matching result is specifically a matching score, and the score calculated from the matching result or matching scores may also be referred to as a total score.
In some embodiments, a target video includes M first key frames showing M key actions, and the target video has M key tags on its time axis. During the follow-up process, M corresponding second key frames can be extracted from the local video stream according to the M first key frames. The M first key frames (showing the M key actions) are matched one-to-one against the M second key frames (showing the M user key actions) to obtain M matching scores, and the total score of the follow-up process is calculated by summation, weighted summation, averaging, or weighted averaging of the M matching scores.
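The aggregation step admits a very small sketch. The weighting scheme below is an assumption for illustration; the embodiment only requires one of the listed aggregation methods:

```python
def total_score(matching_scores, weights=None):
    """Aggregate the M per-key-frame matching scores into one total score.

    With weights=None this is a plain average; passing a weight list of
    the same length gives the weighted-average variant mentioned above.
    """
    if weights is None:
        return sum(matching_scores) / len(matching_scores)
    return sum(w * s for w, s in zip(weights, matching_scores)) / sum(weights)
```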
In some embodiments, the display device determines a frame extraction range in the local video stream according to the time information of a first key frame in the target video, extracts a preset number of local video frames within that range, identifies the user's follow-up action in each extracted frame and compares across them, then matches the identified key follow-up action against the corresponding key action to obtain the matching score; after the follow-up ends, it calculates the total score of the follow-up process.
In other embodiments, the display device sends the extracted local video frames to the server; the server identifies the user's follow-up action in each frame, matches it against the corresponding key action to obtain the matching score, calculates the total score after the follow-up ends, and returns the total score to the display device.
In some embodiments, after the server obtains the matching score for a key follow-up action, it sends a grade identifier corresponding to that score to the display device; upon receipt, the display device displays the grade identifier, e.g., GOOD, GREAT, or PERFECT, in real time in a floating layer above the local picture, so as to give the user immediate feedback on the follow-up effect. In addition, if the display device determines the matching score itself, it directly displays the corresponding grade identifier in the floating layer above the local picture.
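A plausible mapping from matching score to grade identifier is sketched below. The threshold values are assumptions for illustration only; the embodiment specifies only that different matching scores map to different grade identifiers:

```python
def grade_identifier(matching_score):
    """Map a per-action matching score (assumed 0-100) to a feedback label."""
    if matching_score >= 90:
        return "PERFECT"
    if matching_score >= 75:
        return "GREAT"
    return "GOOD"
```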
In some embodiments, for the total score of practicing a demonstration video, if the score is higher than the recorded highest score, the difference between the score and the recorded highest score is obtained, and the total experience value is increased by that difference to obtain a new total. This prevents users from repeatedly replaying familiar videos to inflate the total, improving the application's fairness.
In some embodiments, if the total score is higher than the recorded highest score, a corresponding experience value increment is derived from it; a new experience value is obtained by adding the increment to the original experience value, and the new experience value is presented on the display when the target video finishes playing. For example, if the total score is 85 and the historical highest score is 80, an experience value increment of 5 is obtained; if the original experience value is 10005, adding the increment of 5 yields the new experience value 10010. Conversely, if the total score is not higher than the recorded highest score, the experience value increment is 0, i.e., no experience value is accumulated, and the original experience value is presented on the display.
Further, if the total score is higher than the recorded highest score, the original experience value is replaced with the new experience value; otherwise, the original experience value is not updated.
It should be noted that the terms "first" and "second" are used herein to distinguish similar objects and do not necessarily describe a particular order or sequence. In other embodiments, the first key frame may also be referred to simply as a key frame, and the second key frame as a local video frame or a follow-up screenshot.
In the above embodiments, while the user follows the target video, the user's performance is scored according to the local video stream acquired by the image acquisition device. If the score is higher than the recorded highest score, a new experience value is derived from the score and displayed; if it is not, the experience value is not updated and the original experience value is displayed. This prevents the user from maliciously earning experience values by repeatedly practicing the same demonstration video.
In some embodiments, the first control for displaying the first experience value and the second control for displaying the second experience value are child controls of the experience value control. As child controls, the first and second controls are configured so that they cannot receive focus, i.e., they cannot be operated individually, whereas the experience value control is configured to receive focus, i.e., it can be operated by the user.
In some embodiments, the user may operate the experience value control, e.g., by a click operation, to enter an experience value detail page. Specifically, the display device controller is configured to display the experience value detail page in response to an operation on the experience value control. The detail page displays a plurality of time points within a preset time period together with the experience value detail data for each time point, which comprises the first experience value at that time point, the second experience value at that time point, and/or the experience value generated in the sub-period between that time point and the previous one. The preset time period is, for example, a period covering at least one statistical cycle, or a period determined from the current time and a predetermined duration.
In some embodiments, the experience value detail page is a small-window page smaller than the application home page, which floats above the application home page for display.
In some implementations, when the experience value detail page floats above the application home page, the experience value control in the home page continues to be displayed, with the first control still displayed on top of the second control.
FIG. 19D shows an experience value detail page according to an exemplary embodiment of the present application. As shown in FIG. 19D, the detail page lists a plurality of time points from Monday of week n to Wednesday of week n+1, together with the experience value detail data for each time point: the first experience value at that time point, the second experience value at that time point, and the experience value generated in the sub-period between each two adjacent time points. It can also be seen from FIG. 19D that the detail page is a small-window page smaller than the application home page and displayed above it, while the home page still displays the controls presenting the first and second experience values.
FIG. 19E shows another experience value detail page according to an exemplary embodiment of the present application. It differs from the page shown in FIG. 19D in that the detail page of FIG. 19E is a full-screen page that includes the controls for displaying the first and second experience values.
In some embodiments, the server or the display device counts the experience value increment generated within a preset period and, upon entering the next period, updates the user's experience value according to the increment generated in the previous period. The preset period may be, e.g., three days or seven days.
In some embodiments, in response to the target application starting, the display device controller sends the server a request for the user experience value, the request including at least the user information. The server obtains the time at which the user experience value was last updated and judges whether the interval since that update has reached the preset period length. If it has, the server obtains the experience value increment generated in the previous period, updates the user experience value by adding that increment to the total, and returns the updated value to the display device. If it has not, the server does not update the experience value and directly returns the current value, or notifies the display device to read the most recently delivered experience value data from its own cache.
Accordingly, the display device receives the user experience value returned by the server and draws a user data display area in the interface to present it. If an updated experience value is received, the display device also updates the experience value held in its cache.
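The server-side period check can be sketched as follows. All names and the storage layer are assumptions for illustration; the embodiment does not prescribe an implementation:

```python
import time

PERIOD_SECONDS = 7 * 24 * 3600  # e.g. a seven-day statistical period

def handle_experience_request(user, store):
    """Return the user's experience value, rolling the period over if due.

    `store` is a hypothetical key-value layer holding, per user, the
    total experience value, the pending increment of the last period,
    and the timestamp of the last update.
    """
    record = store.get(user)
    if time.time() - record["last_update"] >= PERIOD_SECONDS:
        record["total"] += record["pending_increment"]  # fold last period in
        record["pending_increment"] = 0
        record["last_update"] = time.time()
        store.put(user, record)
    return record["total"]
```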
In some embodiments, the experience value control includes an identification bit set in the user data display area as in FIG. 9, which identifies the experience value increment generated in the current period, such as the "current week +10" shown in FIG. 9.
In some embodiments, the experience value control comprises a first sub-control presenting the total experience value at the end of the last statistical period, and a second sub-control presenting the experience value increment generated so far in the current statistical period. In FIG. 9, the first sub-control is the control showing "dance power value 10012", and the second sub-control is the control showing "current week +10".
In some embodiments, the first sub-control and the second sub-control partially overlap so that a user can intuitively see both sub-controls simultaneously.
In some embodiments, the first sub-control and the second sub-control are different in color so that a user can intuitively see both sub-controls simultaneously.
In some embodiments, the second child control is located in the upper right corner of the first child control.
In some embodiments, the user selects the identification bit in the user data display area to enter a detail page showing the experience value total; after entering the detail page, the second sub-control remains at the upper right corner of the first sub-control and displays the score newly added in the current statistical period.
In some embodiments, the follow-up result interface is further provided with a follow-up evaluation control, which displays a target state determined from the score; different scores correspond to different target states.
In some embodiments, the target state presented in the follow-up evaluation control is a star rating identification as shown in fig. 9.
In some embodiments, a correspondence between experience value ranges and star levels is established in advance, for example 0-20000 (an experience value range) corresponds to 1 star, 20001-40000 corresponds to 2 stars, and so on. On this basis, while the user data display area as in FIG. 9 presents the user experience value, the follow-up evaluation control may also present the star level corresponding to that experience value, e.g., the 1 star shown in FIG. 9.
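A sketch of this range-to-star lookup follows; the band width of 20000 comes from the example above, and the function name is illustrative:

```python
def star_level(experience_value, band=20000):
    """Map an experience value to a star level:
    0-20000 -> 1 star, 20001-40000 -> 2 stars, and so on."""
    return max(experience_value - 1, 0) // band + 1
```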
After the follow-up ends, the interface presenting the scoring information shown in FIG. 19A is presented on the display. While this interface is shown, the user may enter the interface presenting detailed performance information by operating the control for viewing the detailed score.
In some embodiments, the detailed performance information may also be referred to as follow-up result information, and the user interface presenting it is referred to as the follow-up result interface.
In some embodiments, in response to an instruction input by the user to view the detailed score, the display device sends the server a request for the detailed score information interface, and presents the detailed score information on the display according to the interface data delivered by the server. The detailed score information comprises at least one of the logged-in user's information, star rating information, an evaluation remark, and a plurality of follow-up screenshots, where the follow-up screenshots are local video frames, captured by the camera during the user's follow-up, that show the user's follow-up actions.
FIG. 20 illustrates an interface for presenting detailed performance information. As shown in FIG. 20, the logged-in user's information (e.g., user avatar and experience value), star rating information, evaluation remark, and 4 follow-up screenshots are presented in the form of items or controls.
In some embodiments, the follow-up screenshots are arranged as thumbnails in the interface shown in FIG. 20. The user may move the selector via the control device to select a screenshot and view its original image; while the display shows the original image file of the selected picture, the user may view the originals of the other screenshots by operating the left and/or right direction keys.
In some embodiments, when the user moves the selector via the control device to select the first follow-up screenshot for viewing, the original image file corresponding to the selected screenshot is obtained and presented on the display, as shown in FIG. 21. In FIG. 21, the user can view the originals of the other follow-up screenshots by operating the left and/or right direction keys.
FIG. 22 illustrates another interface for presenting detailed performance information. Unlike the interface in FIG. 20, the interface in FIG. 22 also displays a sharing code picture (e.g., a two-dimensional code) that encodes the detailed score access address; the user can scan it with a mobile terminal to view the detailed performance information.
FIG. 23 illustrates the detailed performance information page displayed on the mobile terminal device. As shown in FIG. 23, it presents the logged-in user's information, star rating, evaluation remark, and at least one follow-up screenshot. By operating the sharing control on the page, the user can share the page link with other users (i.e., other terminal devices), and can save the follow-up screenshots shown on the page and/or their original image files locally on the terminal device.
To motivate and prompt the user, in some embodiments, if the total score of a follow-up process is higher than a preset value, the N local video frames with the highest matching scores (TopN) are displayed in the detailed performance information page (or follow-up result interface) to showcase the highlights of the follow-up; if the total score is not higher than the preset value, the N local video frames with the lowest matching scores are displayed instead, to showcase the moments most in need of improvement.
In some embodiments, after receiving the request for the detailed score information interface, the server derives the user's follow-up score from the matching degree between the corresponding key frames and local video frames. When the score is higher than a first value, the server sends a certain number (e.g., N ≥ 1) of key frames and/or corresponding local video frames with the highest matching degrees to the display device as interface data; when the score is lower than a second value, it sends a certain number with the lowest matching degrees instead. In some embodiments, the first and second values may be the same value; in other embodiments they are different values.
In some embodiments, the controller obtains the demonstration video in response to a user-input instruction to follow it; the demonstration video includes a key frame sequence of a predetermined number (M) of key frames ordered by time, each key frame showing a key action the user must follow.
In some embodiments, after receiving the request for the detailed score information interface, the server determines the user's follow-up score from the comparison between the target key frames and the corresponding local video frames. When the score is higher than a first value, the server sends a preset number of the target key frames and/or corresponding local video frames with the highest matching degrees determined during matching to the display device as interface data; when the score is lower than a second value, it sends a preset number with the lowest matching degrees instead.
In some embodiments, the controller plays the target video in the follow-up interface, and during playback of the demonstration video obtains from the local video stream the local video frames corresponding to the key frames, where the local video frames show the user's actions.
In some embodiments, the comparison between key frames and local video is performed in the display device. During the follow-up, the controller matches the key actions shown in the corresponding key frames against the user actions shown in the local video frames to obtain a matching score for each local video frame, derives the total score from these matching scores, and selects the target video frames to display as the follow-up result according to the total score: if the total score is higher than the preset value, the N local video frames with the highest matching scores (TopN) are selected as target video frames; otherwise the N with the lowest matching scores are selected, where N is the preset number of target video frames, e.g., N = 4 in FIG. 19A. Finally, the follow-up result including the total score and the target video frames is presented, i.e., they are shown in the detailed score page as in FIG. 18.
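The selection step can be sketched as follows; the function name and data layout are illustrative assumptions:

```python
def pick_target_frames(frames_with_scores, total, threshold, n=4):
    """Pick the frames shown in the follow-up result interface.

    frames_with_scores: list of (matching_score, frame) pairs.
    Above the threshold the N best frames (highlights) are shown;
    otherwise the N worst frames (moments to improve) are shown.
    """
    ordered = sorted(frames_with_scores, key=lambda p: p[0], reverse=True)
    chosen = ordered[:n] if total > threshold else ordered[-n:]
    return [frame for _, frame in chosen]
```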
In some embodiments, the total score is derived from summing, weighted summing, averaging, or weighted averaging calculations of the matching scores for each local video frame.
In some embodiments, the controller detects key tags on the timeline while controlling playback of the demonstration video. Each time a key tag is detected, the local video frame corresponding in time to the key frame is extracted from the local video stream according to the tag's time information, and a local video frame sequence is generated from the extracted frames, the sequence containing some or all of the local video frames arranged in descending order of matching score.
In some embodiments, the first N local video frames in the sequence are taken as first local video frames and the last N as second local video frames; the first local video frames are displayed in the follow-up result interface when the total score is higher than the preset value, and the second local video frames are displayed when it is not. In some embodiments, the preset value may be the first value or the second value of the foregoing embodiments.
In some embodiments, the step of generating the local video frame sequence may comprise: when a new local video frame is acquired, if the first local video frames and the second local video frames share overlapping frames, the newly acquired frame is inserted into the sequence according to its matching score to obtain a new sequence; if they share no overlapping frames, the newly acquired frame is inserted according to its matching score and the frame whose matching score sits at the middle position is deleted, yielding a new sequence.
In some embodiments, if the total score is higher than the preset value, the N first local video frames of the sequence are selected as target video frames and displayed in the follow-up result interface; otherwise the N second local video frames are selected and displayed.
It should be noted that the first and second local video frames having overlapping frames means that some frames in the sequence are both first and second local video frames, in which case the number of frames in the sequence is less than 2N.
It should further be noted that the first and second local video frames having no overlapping frames means that no frame in the sequence is both a first and a second local video frame, in which case the number of frames in the sequence is greater than or equal to 2N. In some embodiments, the display device (when it performs the sequence generation) or the server (when it does) may use a bubble-sort-style algorithm when generating the photo sequence used for the detailed performance information interface data.
The algorithm proceeds as follows: after a key frame and a local video frame are compared, their matching degree is determined.

When the number of data frames in the sequence is smaller than a preset value, the key frames and/or local video frames are added to the sequence according to matching degree, the preset value being the sum of the number of image frames to display when the score is above the threshold and the number to display when it is below. For example, if 4 frames (groups) are to be displayed when the score is above the threshold and 4 frames (groups) when it is below, the preset value for the sequence is 8 frames (groups).

When the number of data frames in the sequence is greater than or equal to the preset value, a new sequence is formed according to the matching degree of each frame (group): the 4 frames (groups) with the highest matching degree and the 4 with the lowest are retained, and the middle frames (groups) are deleted so the sequence stays at 8 frames (groups). This avoids storing excessive photos in the cache and improves processing efficiency.
In some embodiments, a "frame" refers to a sequence containing only local video frames, while a "group" means each local video frame and its corresponding key frame enter the sequence together as one set of parameters.
In some embodiments, the comparison between the key frame and the local video frame is performed in the server; the comparison process may refer to the descriptions of the other embodiments of this application.
The server derives the total score from the matching scores of the local video frames and selects the target video frames to display as the follow-up result according to the total score: if the total score is higher than the preset value, the N local video frames with the highest matching scores (TopN) are selected as target video frames and sent to the display device; otherwise the N with the lowest matching scores are sent, where N is the preset number of target video frames, e.g., N = 4 in FIG. 19A. Finally, the display device presents the follow-up result including the total score and the target video frames from the received data, i.e., shows them in the detailed score page as in FIG. 18.
In the case where the local video frame sequence contains all of the extracted local video frames, each extracted frame is inserted into the sequence according to its matching score, so that the number of frames grows from 0 to M (the number of key frames in the demonstration video), with the frames arranged in descending order of matching score. When the N frames with the highest matching scores are needed for display, positions 1 to N of the sequence are taken; when the N with the lowest are needed, positions (M-N+1) to M are taken.
In the case where the local video frame sequence contains only part of the extracted local video frames, an initial sequence is generated from the 1st through 2N-th acquired local video frames (corresponding to the 1st through 2N-th key frames), arranged in descending order of matching score. From the (2N+1)-th frame onward, each time a local video frame (the (2N+i)-th) is acquired, it is inserted into the sequence according to its matching score and the frame at position N+1 is deleted, until the last frame has been inserted (i.e., 2N+i equals the preset number), yielding the final local video frame sequence, where 2N is smaller than M and i ∈ (1, M-2N).
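The bounded sequence maintenance described above can be sketched in a few lines; the names are illustrative, and the insertion uses Python's bisect rather than the bubble ordering mentioned earlier, which is an equivalent substitute for this sketch:

```python
import bisect

def insert_bounded(seq, score, frame, n=4):
    """Keep a descending-ordered list of (score, frame) pairs bounded
    so that only the n best and n worst frames survive.

    Once the list exceeds 2n entries, the middle entry (position n,
    i.e. the (n+1)-th) is dropped, preserving the top n and bottom n.
    """
    keys = [-s for s, _ in seq]              # ascending keys for bisect
    pos = bisect.bisect_right(keys, -score)
    seq.insert(pos, (score, frame))
    if len(seq) > 2 * n:
        del seq[n]                           # drop a middle-ranked frame
    return seq
```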
It should be noted that, in some embodiments, if the user exits the follow-up early, the number of local video frames actually extracted may be smaller than the number N of target video frames to display; in that case the controller need not select target video frames by total score, and simply displays all the actually extracted local video frames as target video frames.
In some embodiments, after receiving the user's confirmed-exit operation, the device determines whether the number of video frames in the current sequence is greater than the number to display; if so, it selects that number of frames from the front or rear of the sequence according to the score for display, and if not, it displays all of them.
In some embodiments, after receiving the confirmed-exit operation and before judging whether the number of frames in the current sequence exceeds the number to display, the device judges whether the follow-up duration and/or number of actions meets a preset requirement; only if it does is the frame-count judgment performed, otherwise it is not.
In some embodiments, the display device uploads the local video frames, sorted according to the total score, to the server so that the server adds them to the user's exercise record information.
In some embodiments, the display device uploads the joint point data of each local video frame together with the frame's identifier to the server, and the server exchanges matching degree information with the display device through these parameters, so that the follow-up pictures can later be displayed in the usage history. After receiving the detailed score page data, the display device draws the graphical score from the score data, displays the comments from the comment data, and retrieves the cached local video frames by their identifiers to display the follow-up pictures. At the same time, it uploads the local video frames, with their identifiers and the detailed score page identifier, to the server; the server combines the received frames and the page data into one follow-up record according to the page identifier, for delivery to the display device when the follow-up history is later queried.
In some embodiments, in response to the end of the follow-up process, the device detects whether user input is received; if none is received within a preset duration, it presents an automatic-play prompt interface and starts a countdown. The interface displays countdown prompt information (at least including the remaining time), automatic-play video information (the cover and/or name of the video to be played when the countdown ends), and several controls, such as a replay control, a control to exit the current interface, and a control to play the next video in a preset media asset list. While the countdown runs, the device keeps detecting user input: if none is received before the countdown completes, the video shown in the interface is played; if input is received first, e.g., the user operates a control in the interface via the control device, the countdown stops and the control logic corresponding to the input is executed.
In some embodiments, the second value is less than or equal to the first value. Where the second value is less than the first value and the score falls between them, a preset number of key frames and/or corresponding local video frames are allocated from each matching degree interval according to matching degree and sent to the display device as follow-up screenshots.
FIG. 24 illustrates a user interface that is one implementation of the automatic-play prompt interface. As shown in FIG. 24, the interface displays countdown prompt information, i.e., "will play automatically for you in 5s"; automatic-play video information, i.e., the video name "loving kindergarten" and the video's cover picture; and "play again", "exit", and "play next" controls.
In some embodiments, the user may, by operating the control device, bring up the user's follow-up record page (or exercise record page), which contains a plurality of exercise record entries, each comprising demonstration video information, scoring information, exercise time information, and/or at least one follow-up screenshot. The demonstration video information includes at least one of the video's cover, name, category, type, and duration; the scoring information includes at least one of a star rating, a score, and an experience value increment; the exercise time information includes the exercise start time and/or end time; and the follow-up screenshot may be the one displayed in the detailed score information interface.
In some embodiments, while the display shows the application home page as in FIG. 9, the user may operate the "My work" control in the page via the control device to input an instruction to display the exercise record page. On receiving the instruction, the controller sends the server a request for exercise record information containing at least the user identification (ID). In response, the server looks up the corresponding exercise record information by the user identification and returns it to the display device; the information comprises at least one piece of historical exercise record data, each piece comprising demonstration video information, scoring information, exercise time information, and either at least one follow-up screenshot or a special identifier indicating that no follow-up screenshot exists. The display device generates the exercise record page from the returned information and presents it on the display.
A follow-up screenshot is displayed when the display device has acquired an image showing the user's actions.
In some embodiments, the server, in response to the display device's request, looks up the exercise record information by the user identifier and checks whether each piece of historical exercise record data contains a follow-up screenshot; for items without one, it adds the special identifier to indicate that no camera was detected during the corresponding follow-up. On the display device side, if a returned record contains follow-up screenshot data, e.g., the screenshot's file data or identifier, the corresponding screenshot is displayed in the record entry on the follow-up record page; if it instead contains the special identifier, a preset identification element indicating that no camera was detected is displayed in the entry.
In some embodiments, the display device receives the data delivered by the server and draws the follow-up record page, which comprises one or more follow-up record entries; each entry includes a first picture control for showing the follow-up screenshot or a first identification control for showing the preset identification element, and further includes a second control for showing the demonstration video information and a third control for showing the scoring information and exercise time information.
While drawing the follow-up record page, if the first piece of historical record data does not contain the special identifier, a follow-up screenshot is loaded into the first picture control of the first record entry, demonstration video information into the second control, and scoring and exercise time information into the third control; if it does contain the special identifier, the preset identification element is loaded into the first identification control of the first record entry to indicate that no camera was detected during that follow-up.
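A minimal sketch of this per-entry rendering decision follows; all names are illustrative, and `NO_CAMERA` stands in for the special identifier:

```python
NO_CAMERA = "NO_CAMERA"  # stand-in for the special identifier

def render_record_entry(entry):
    """Decide what the first control of a follow-up record entry shows."""
    first = ("camera_not_detected_icon"
             if entry.get("special_id") == NO_CAMERA
             else entry["screenshot"])
    return {"first_control": first,
            "video_info": entry["video_info"],         # second control
            "score_and_time": entry["score_and_time"]}  # third control
```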
In some embodiments, the follow-up screenshot displayed in an exercise entry is the one displayed in the corresponding detailed performance information page; the specific implementation may refer to the above embodiments and is not repeated here.
In some embodiments, the follow-up screenshots displayed in the record entries are also referred to as designated pictures.
In some embodiments, the designated-picture data contained in the historical record data is either the picture's file data or its identifier, the identifier enabling the controller to obtain the corresponding file data from the display device's local cache or from the server.
FIG. 25 illustrates an interface displaying a user's exercise records; it may be the interface entered after the user operates the "My work" control of FIG. 9. As shown in FIG. 25, 3 exercise entries are displayed, and each entry's display area shows demonstration video information, scoring information, exercise time information, and either follow-up screenshots or an identifier indicating that no camera was detected. The demonstration video information includes the video's cover picture, type (loving lessons), and name (standing slightly); the scoring information includes an experience value increment (e.g., +4) and a star rating; and the exercise time information is, e.g., 2010-10-10 10:10.
In the above examples, the user can review past follow-up situations from the exercise records, such as which demonstration videos were followed at what time and what the results were. This makes it convenient to decide what to practice next or to discover the action types the user is good at: for example, the user may follow a lower-scoring demonstration video again, or focus on videos of the action type they excel at for further refinement.
FIG. 26 shows a first interface of the display device in an exercise scenario. As shown in the schematic diagram of the first interface 200A, the interface may display a plurality of demonstration videos in a scrolling manner so that the user can select a target demonstration video among them.
In some embodiments, a fitness video is also one of the follow-up videos, i.e., a demonstration video.
Referring to FIG. 26, a display window is shown for displaying the demonstration video selected by the user. A confirmation instruction input by the user is received when the user, according to the position of the selector (focus) in the first interface, selects the "start training" control in the first interface 200A. In response to the user selecting the start training control, the controller may retrieve and load the corresponding demonstration video clip source from the server via an API (Application Programming Interface).
Referring to FIG. 27, FIG. 27 is a schematic diagram of an exercise video first interface, which may also be called a detail interface, according to some embodiments. The first interface may display a plurality of coaching videos in a scrolling manner for the user to select a target demonstration video among them, for example: squat with high knee lift, reverse lunge, four-point kickback, and so on. The user selects one target demonstration video among the coaching videos.
In the first interface, a play window is provided that plays a default training video or the most recently played training video from the play history. To the right of the play window an introduction display control is provided, along with at least one of a "start training" control (i.e., a play control) and a "collect" (favorite) control. The interface further comprises a training list control in which the display controls of a plurality of training videos are shown.
In some embodiments, the demonstration video may also be retrieved after the start training control or the play window is selected. Specifically, pre-downloaded demonstration videos are displayed and stored, and a mapping is established between each demonstration video and a check code. In response to the user selecting the start training control, a check code is generated based on the selected demonstration video, and the controller obtains the corresponding demonstration video from the stored videos based on that check code. Because the demonstration video is pre-stored, the controller can invoke it directly once it has the check code. Retrieving the demonstration video this way avoids stalling caused by network and similar factors: the video can be downloaded in advance of being needed, improving playback smoothness.
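A minimal sketch of the check-code lookup follows; the hashing choice and all names are assumptions for illustration, since the embodiment does not specify how the check code is derived:

```python
import hashlib

video_store = {}  # check code -> path of a pre-downloaded demonstration video

def register_video(video_id, path):
    """Map a pre-downloaded video to an illustrative check code."""
    code = hashlib.md5(video_id.encode()).hexdigest()
    video_store[code] = path
    return code

def fetch_video(code):
    """Return the pre-stored demonstration video for a check code,
    avoiding a network fetch at playback time."""
    return video_store.get(code)
```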
The camera is used to collect local images or local video. When not in use it sits in a hidden position so that the edge of the display device stays smooth; once activated, it rises above the edge of the display device so that the display screen does not block its view while it acquires image data.
In some embodiments, in response to the user selecting the start training control, the camera is raised and started to acquire image data; it remains on throughout training, collecting local video in real time and sending it to the controller so that the user's actions are shown on the follow-up interface. The user can thus watch their own actions and the demonstration video's actions in real time.
In some embodiments, in response to the user selecting the start training control, the camera is raised but kept in standby; each time the demonstration video reaches a preset time point, one local image is acquired and sent to the controller. This relieves processor load, and the local image remains on the display screen until the next time point.
A controller is configured to receive an input confirmation operation on the play control, start the camera, and load the video data of the demonstration video.
In one implementation, the confirmation operation may be the selection of the start training control. In response to the confirmation operation, the controller is further configured to control the display to show a prompt interface 200C (a guide interface) instructing the user to enter a predetermined area; specific prompt interfaces are shown in FIGS. 28 and 29. FIG. 28 is a schematic diagram of a prompt interface according to some embodiments, by which the user adjusts their position. When the user has entered the predetermined area, the controller controls the display to show the second interface. Because the camera's acquisition area has margins, the camera captures the current image so as to better acquire local data; during display, a floating layer is created above the layer showing the current image, the optimal acquisition area is determined in the floating layer according to the camera's position and angle, and an optimal position frame is drawn in the floating layer accordingly. The user is guided to move so that their captured position in the current image overlaps the optimal position frame in the floating layer; when the overlap reaches a preset threshold, the display device shows a success message, removes the floating layer, and jumps to the follow-up interface shown in FIG. 30.
For example, in some embodiments, if the character in the prompt interface 200C is located to the left of the frame area 200C1, the user is prompted to move right; if the person appears on the right side of the frame area in the display picture, the user is prompted to move left, so that the user enters the predetermined area, i.e., the area the camera can capture. The embodiments of this application guide the user into the predetermined area in this manner. In some embodiments, the prompt interface also displays a prompt message. Referring to FIG. 29, a schematic diagram of the prompt interface according to some embodiments, the message is, e.g., "Please face the screen. Keep your body upright." The message prompting the user to move may be text displayed on the floating layer, a voice prompt, or an indicator pointing toward the optimal position frame.
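The overlap check driving this guidance could be sketched as follows; the rectangular boxes, the overlap measure, and all names are assumptions for illustration:

```python
def overlap_ratio(user_box, target_box):
    """Fraction of the user's bounding box that falls inside the optimal
    position frame; boxes are (x1, y1, x2, y2) in pixels."""
    ix1, iy1 = max(user_box[0], target_box[0]), max(user_box[1], target_box[1])
    ix2, iy2 = min(user_box[2], target_box[2]), min(user_box[3], target_box[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    user_area = (user_box[2] - user_box[0]) * (user_box[3] - user_box[1])
    return inter / user_area if user_area else 0.0

def guidance(user_box, target_box, threshold=0.8):
    """Return a movement hint, or None once the overlap is sufficient."""
    if overlap_ratio(user_box, target_box) >= threshold:
        return None  # success: dismiss the floating layer
    user_cx = (user_box[0] + user_box[2]) / 2
    target_cx = (target_box[0] + target_box[2]) / 2
    return "move right" if user_cx < target_cx else "move left"
```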
In one implementation, the controller may instead directly present the second interface in response to the confirmation operation, playing the demonstration video in the first play window and the local image in the second play window; the user can then adjust their position according to the image shown in the second play window.
In one implementation, the controller may, in response to the confirmation operation, determine how many times the position guide interface has appeared: if the display count has not reached a preset value, the guide interface is shown; if it has, the second interface is shown directly, with the demonstration video playing in the first play window and the local image in the second play window, and the user adjusts their position according to the second play window's image.
Referring specifically to FIG. 30, FIG. 30 is a schematic diagram of a second display interface 200B according to some embodiments; the second display interface 200B includes a first play window 200B1 for playing the demonstration video and a second play window 200B2 for playing the local image collected by the camera.
In some embodiments, the demonstration video is played in the first play window without joint points shown, while playing the local image data in the second play window includes: the controller obtains the joint point positions along with the local image data, superimposes joint point marks on the local image data according to those positions, and displays the result in the second play window.
In some embodiments, superimposing the local image data and the joint point marks may be done by adding the marks onto the local image according to the joint point positions and outputting the result on a single layer, or by displaying the camera's local image in one layer, adding a floating layer above it in which the joint point marks are drawn at the joint positions, and displaying the two layers superimposed.
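A minimal sketch of the single-layer variant follows, assuming OpenCV-style drawing; the library choice, names, and mark style are assumptions:

```python
import cv2  # assumed drawing library

def overlay_joints(frame, joints, radius=6, color=(0, 255, 0)):
    """Draw joint point marks onto a local video frame.

    frame: BGR image from the camera; joints: list of (x, y) pixel
    positions reported with the local image data.
    """
    out = frame.copy()
    for x, y in joints:
        cv2.circle(out, (int(x), int(y)), radius, color, thickness=-1)
    return out
```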
In some embodiments, the second playing window directly plays the local video acquired by the camera.
The embodiments of this application show a display device comprising a display, a camera, and a controller. The controller is configured to, in response to the user selecting the start training control in the display interface, obtain the demonstration video and raise and start the camera, which collects the local picture; and to control the first play window of the display to play the demonstration video while the second play window displays the local image. With this scheme, the demonstration video is shown in the first play window and the local picture in the second, so that during training the user can compare the contents of the two windows and adjust their movements in time, improving the user experience.
The camera collects local video, which is a set of continuous local images. If every frame of image were compared during the comparison process, the data processing load on the controller would be large.
Based on the above problem, in some feasible embodiments, the controller may compare the local image with an exemplary video frame to generate a comparison result; after exercising, the user can view the comparison result through the display interface, which helps the user better understand his or her motion defects so as to overcome them in subsequent exercise. The exemplary video frame is the frame in the demonstration video that corresponds to the local image.
In some embodiments, there are a variety of implementations of capturing local images.
For example, the controller may control the camera to capture a local image when the demonstration video is played to a preset time point, and then compare the captured local image with a pre-stored exemplary video frame to obtain a comparison result. In some embodiments this works as follows: when the demonstration video is played to a preset time point, the controller controls the camera to collect a local image. The preset time points may be set every interval T from the time the first image appears in the demonstration video to the time the last frame appears. The preset time points may also be generated based on the content of the demonstration video, with each action node in the content serving as a preset time point. For example, for a demonstration video of 53S in length in which the first image appears at 3S and T is 10S, the preset time points are 3S, 13S, 23S, 33S, 43S and 53S, and the controller controls the camera to capture a local image when the demonstration video plays to each of those points. Labels may be added to the demonstration video at the preset time nodes, and a local image is collected whenever a label is encountered during playback.
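A small sketch of how the interval-based preset time points might be enumerated, reproducing the 3S/10S/53S example above (function and parameter names are illustrative, not from the embodiment):

```python
def preset_time_points(first_frame_s: float, length_s: float, interval_s: float):
    """Enumerate preset capture points every interval_s seconds, from the
    moment the first demonstration image appears up to the video length."""
    points, t = [], first_frame_s
    while t <= length_s + 1e-9:
        points.append(t)
        t += interval_s
    return points

# Reproduces the example: first image at 3S, T = 10S, video length 53S
print(preset_time_points(3, 53, 10))  # [3, 13, 23, 33, 43, 53]
```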
For another example, the camera may be always on, recording local video in real time and sending it to the controller. The controller can then extract the corresponding local image from the collected local video at each preset time point, and compare the extracted local image with a pre-stored exemplary video frame to obtain a comparison result. The specific implementation process is: when the demonstration video is played to a preset time point, the controller extracts one or more local images from the local video collected by the camera. The preset time points may be set every interval T from the time the first image appears in the demonstration video to the time the last frame appears (i.e., the times at which demonstration actions appear). The preset time points may also be generated, or pre-marked, based on the content of the demonstration video, with each action node in the content serving as a preset time point. For example, for a demonstration video in which the first image appears at 3S and the preset time points are 3S, 16S, 23S, 45S and 53S, the controller captures a local image from the local video when the demonstration video plays to each of those points. Since the demonstration actions can occur at arbitrary times, the acquisition of the images to be compared is triggered by the labels or time points that identify the demonstration actions.
Typically, after watching a coaching action in the demonstration video, the user imitates it to make the corresponding action, so there is naturally a certain delay between seeing the demonstration action and making the corresponding action. To compensate for this delay, the technical solution shown in the embodiment of the application provides a "delayed image acquisition" method and introduces the concept of a delayed acquisition time point, where delayed acquisition time point = preset time point + preset reaction duration. When the demonstration video is played to the delayed acquisition time point, the controller controls the camera to collect the local video.
According to statistics over a large amount of experimental data, the user's reaction time from seeing the demonstration action to making the corresponding action is about 1S, so the technical solution shown in the embodiment of the application configures the preset reaction duration as 1S. For example, if the preset time points are 3S, 13S, 23S, 33S, 43S and 53S, the corresponding delayed acquisition time points are 4S, 14S, 24S, 34S, 44S and 54S. Taking the first occurrence of an image frame in the demonstration video as the starting point, the controller controls the camera to collect a local image at each of those delayed time points.
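The delayed acquisition rule can be expressed directly; the sketch below also returns the mapping from each delayed capture time back to the preset time point whose exemplary frame the captured image should be compared against (names are illustrative, not from the embodiment):

```python
REACTION_DURATION_S = 1.0  # statistical value; adjustable per user as noted below

def delayed_capture_map(preset_points_s, reaction_duration_s=REACTION_DURATION_S):
    """delayed acquisition time = preset time point + preset reaction duration.
    The returned dict pairs each delayed capture time with the preset point
    whose exemplary frame the captured local image is compared against."""
    return {p + reaction_duration_s: p for p in preset_points_s}

# Preset points 3S..53S yield delayed capture points 4S..54S; e.g. a frame
# captured at 4S is compared with the exemplary frame time-stamped 3S.
print(delayed_capture_map([3, 13, 23, 33, 43, 53]))
```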
The controller compares the local image with the exemplary video frame to generate a comparison result, where the exemplary video frame is the image in the demonstration video corresponding to the local image, i.e., the standard image frame corresponding to the preset time point in the demonstration video. In the technical solution shown in the embodiment of the application, images in the demonstration video may carry a flag; the flag may be, but is not limited to, a time stamp. The correspondence between the local image and the exemplary video frame can then be determined based on the flag. For example, when the demonstration video plays to 4S, the controller controls the camera to collect the local image; the time stamp corresponding to that local image is 3S, and it is compared with the exemplary video frame whose time stamp is 3S.
In some embodiments, the demonstration video stores the joint point data of the exemplary video frames, preset in advance; image frames other than the exemplary video frames need not be preset, since they are not compared.
In the technical solution shown in the above embodiment, the preset reaction duration is configured as 1S. However, 1S is only a statistical value: a user's reaction duration is usually about 1S, but 1S is not suitable for all users, and the preset reaction duration can be set as needed in the actual application process.
Action comparison by full-image comparison imposes a large processing burden. To further reduce the data processing load on the controller, the technical solution shown in the embodiment of the application may compare only some key parts of the local image and the demonstration video; that is, the comparison of the motions is accomplished by comparing the joint points.
The controller is further configured to, before the second video window plays the local image: identify the joint points in the local image, and compare the joint points in the local image with the joint points in the demonstration video.
In some feasible embodiments, in response to the user selecting the training start control in the first display interface, the controller controls the camera to start so that it collects the local image, and the camera transmits the collected local images to the controller. The controller identifies the joint points in the local image. In some embodiments, the controller identifies the joint points according to a preset model, where the joint points are the points corresponding to the joints of the human body and the point corresponding to the head; the human body is generally described by 13 joint positions, and the controller marks these 13 important skeletal joint points of the whole body. A local image labeled with the 13 joint positions can be seen in fig. 31. The 13 joint positions are: left wrist, left elbow, left shoulder, chest, waist, left knee, left ankle, head, right wrist, right elbow, right shoulder, right knee and right ankle. In some collected local images, however, part of the human body may be missing, in which case only the part of the body present in the image can be identified.
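For reference, the 13 joint positions can be held as a simple constant; the small helper below reflects the partial-body case, where only the joints actually found in the image are kept (the pose model itself is unspecified in the embodiment, so it is left abstract here):

```python
# The 13 joint positions enumerated above.
JOINT_NAMES = (
    "left_wrist", "left_elbow", "left_shoulder", "chest", "waist",
    "left_knee", "left_ankle", "head", "right_wrist", "right_elbow",
    "right_shoulder", "right_knee", "right_ankle",
)

def visible_joints(detections: dict) -> dict:
    """Keep only recognized joints, covering the partial-body case where
    part of the human body lies outside the collected image."""
    return {name: xy for name, xy in detections.items()
            if name in JOINT_NAMES and xy is not None}

# Example: the model located the head but not the left ankle
print(visible_joints({"head": (640, 120), "left_ankle": None}))  # {'head': (640, 120)}
```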
The controller is further used for comparing the joint points in the local image with the joint points in the demonstration video/exemplary video frame to determine the difference degree between the human body action in the local image and that in the demonstration video, and for marking the identified joint points in the collected local image, where joint points with different action difference degrees are marked in different colors.
There are various ways of determining the difference degree between the human body action in the local image and that in the demonstration video:
for example, the comparison may compare the position of each joint point of the human body in the local image with the relative position of the corresponding joint point in the demonstration video, obtain a comparison result based on the difference in relative positions, and mark different comparison results in different colors.
For illustration: if the position of the left wrist of the human body in the local image differs from that in the demonstration video by 10 standard values, the left wrist joint point may be marked red; if the position of the right wrist differs by 1 standard value, the right wrist joint point may be marked green.
For another example, the above comparison may calculate the matching degree of the two joint positions and generate a corresponding result according to the matching degree; or the matching degree of the actions may be determined according to the relative positional relationship among the joint points themselves.
In some embodiments, the identification and matching of the joint points may also be replaced by other realizable means in the related art.
In some embodiments, the exemplary joint positions are marked in advance in the demonstration video and stored with it in a local data list. The labeling process of the exemplary joint positions is similar to that shown in some of the embodiments described above and is not detailed here.
In some embodiments, the controller compares a first angle in the local image with the corresponding standard angle to generate a comparison result. The first angle is the included angle, in the local image, between the line connecting each joint position with its adjacent joint position and the line of the trunk; the standard angle is the same included angle in the demonstration video.
The correspondence between the first angle and the standard angle may be generated based on the time stamp. For example, if the local image is collected at 10S, the standard angle corresponding to the first angle of the left ankle is the included angle between the line connecting the left ankle with its adjacent joint and the line of the trunk, in the image appearing at 10S in the demonstration video.
For example, referring to fig. 32, which illustrates a local image annotated with joint positions according to some embodiments: for the left wrist 1A, the adjacent joint is the left elbow 1B, and the included angle between the line connecting the left wrist 1A with the left elbow 1B and the line of the trunk is called the first angle 1A. In the same way, the first angles corresponding to the left elbow, left shoulder, left knee, left ankle, head, right wrist, right elbow, right shoulder, right knee and right ankle can be calculated.
The generating manner of the standard angle may refer to the generating manner of the first angle, which is not described herein.
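A possible computation of the first angle, assuming the trunk line is taken as the chest-to-waist segment (the embodiment does not pin down the trunk endpoints, so that choice is an assumption):

```python
import numpy as np

def first_angle(joint_xy, adjacent_xy, chest_xy, waist_xy) -> float:
    """Included angle (degrees) between the joint-to-adjacent-joint line
    and the trunk line, the latter taken here as chest -> waist."""
    limb = np.asarray(adjacent_xy, float) - np.asarray(joint_xy, float)
    trunk = np.asarray(waist_xy, float) - np.asarray(chest_xy, float)
    cos = np.dot(limb, trunk) / (np.linalg.norm(limb) * np.linalg.norm(trunk))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Example: left wrist (300, 400) to left elbow (320, 330), trunk from
# chest (400, 250) to waist (400, 420); prints the angle in degrees.
print(round(first_angle((300, 400), (320, 330), (400, 250), (400, 420)), 1))
```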
The controller calculates the matching degree between the first angle and the corresponding standard angle, and the completion degree of the user's action can be evaluated according to the matching degree.
Thus, the technical solution shown in the embodiment of the application can calculate the difference between the position of each joint and the standard position, helping the user understand how each body part has performed and improving the user experience.
To further help the user understand how the action of each part is completed, in the technical scheme shown in the embodiment of the application the controller calculates the matching degree between the first angle and the corresponding standard angle, and marks each joint point with the color corresponding to the range in which its matching degree falls.
For illustration: in some embodiments the matching degree may be represented by the angular deviation, with the result judged against preset deviation thresholds. For angular deviations greater than 15 degrees, the corresponding joint position may be marked red; for deviations of 10-15 degrees, the corresponding joint position may be marked yellow; for deviations below 10 degrees, the corresponding joint position may be marked green.
For example, if the first angle of the left wrist joint in the local image collected at 10S differs from the standard angle corresponding to 10S in the demonstration video by 20 degrees, the left wrist joint may be marked red; if the first angle of the left ankle joint differs by 12 degrees, the left ankle joint may be marked yellow; if the first angle of the head differs by 6 degrees, the head may be marked green. The correspondingly marked local image can be seen in fig. 33.
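The deviation-to-color rule above is simple enough to state as code; the thresholds follow the text, while the function name is illustrative:

```python
def joint_color(angle_deviation_deg: float) -> str:
    """Map the deviation between the first angle and the standard angle
    to the marker color, using the thresholds from the text."""
    if angle_deviation_deg > 15:
        return "red"
    if angle_deviation_deg >= 10:
        return "yellow"
    return "green"

# The 10S example: wrist off by 20 degrees, ankle by 12, head by 6
print([joint_color(d) for d in (20, 12, 6)])  # ['red', 'yellow', 'green']
```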
Because the user's actions may not align exactly in time with the demonstration, the technical solution shown in the embodiment of the application provides a "range" comparison mode. That is, when the demonstration video is played to the exemplary video frame, the display device acquires several image frames adjacent to that time point from the local video. In some embodiments, when the demonstration video is played to the preset time point, the controller selects several image frames adjacent to that time point in the local video as a first image set, where the first image set includes at least a first local image and a second local image: the first local image is the local image corresponding to the preset time point, and the second local image is a local image adjacent to the preset time point.
In some embodiments, the controller calculates the matching degree between each local image in the first image set and the exemplary video frame, takes the comparison result of the local image with the highest matching degree as the comparison result for the time point, and takes the local image that best matches the exemplary video frame as the local image corresponding to the time point.
In some embodiments it may also work as follows: the controller calculates the matching degree (also called the human body action difference degree) between the first local image and the exemplary video frame; when the difference degree is greater than a preset threshold, the controller selects from the first image set the image with the highest matching degree with the exemplary video frame as a replacement image, and marks the replacement image according to its comparison result with the exemplary video frame.
For example, for the local image collected at 10S, the first angle of the wrist joint matches the standard angle corresponding to 10S in the demonstration video at 20%, while the preset matching degree (preset threshold) is 25%. In this case the controller determines the first image set of the target data set: the local images contained within the target data set over the period 1S-13S. It calculates the matching degree between the first angle of the wrist joint in each local image and the standard angle of the wrist joint in the 10S exemplary video frame; the result is that the data corresponding to 8S has the highest matching degree, at 80%. The comparison result of the wrist joint corresponding to 10S is then adjusted to 80%, the wrist joint is marked with the color corresponding to 80%, and the controller caches the marked local video.
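A sketch of this "range" substitution, assuming per-frame matching degrees have already been computed as described above (the window bounds and data layout are illustrative):

```python
def range_comparison(match_by_time: dict, preset_s: float,
                     window_s: float, threshold: float):
    """If the frame at the preset time matches below the threshold, pick the
    best-matching frame within +/- window_s seconds as the replacement image.
    `match_by_time` maps a local-frame timestamp to its matching degree."""
    if match_by_time.get(preset_s, 0.0) >= threshold:
        return preset_s, match_by_time[preset_s]
    window = {t: m for t, m in match_by_time.items()
              if abs(t - preset_s) <= window_s}
    best_t = max(window, key=window.get)
    return best_t, window[best_t]

# The 10S wrist example: 20% at 10S is below the 25% threshold, while the
# 8S frame matches at 80% and becomes the replacement image.
matches = {8.0: 0.80, 9.0: 0.40, 10.0: 0.20, 11.0: 0.35}
print(range_comparison(matches, 10.0, 3.0, 0.25))  # (8.0, 0.8)
```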
In some embodiments, when the demonstration video finishes playing, the controller may control the display to show an exercise evaluation interface for displaying the annotated local pictures. Referring to fig. 34, the exercise evaluation interface may display the user's scoring level, the user's actions and the standard actions, where the scoring level may be generated based on the matching degree between the local images and the exemplary video frames.
In some embodiments, the exercise evaluation interface may scroll through the user's actions and the corresponding standard actions at multiple time points, displayed in order of score from low to high, where a higher matching degree with the exemplary video frame gives a higher score.
In other embodiments, as shown in fig. 34, the exercise evaluation interface may have two display windows: one for displaying the local image of the user's action corresponding to a time point, and one for displaying the exemplary video frame of the corresponding standard action.
In some embodiments, to further reduce the data processing load on the controller, the joint point comparison process may be performed at the server. The specific implementation process is as follows:
In some embodiments, before the second video window plays the local image, the controller is further configured to: identify the joint points in the local image and transmit them to a server. The server compares the joint points in the local image with the joint points of the exemplary video frames in the demonstration video, determines the difference degree between the human body action in the local image and that in the exemplary video frame, and generates feedback information to the display device.
In some embodiments, a joint point identifying unit in the display device identifies and marks the joint points in all images collected by the camera and displays them in the second playing window. When the demonstration video plays to an exemplary video frame, the display device uploads the joint point data of the local image collected at that moment and/or of the local images collected at adjacent moments to the server to judge the matching degree.
For the method of comparing the human body action difference degree between the local image and the demonstration video, reference may be made to the above embodiments, which is not repeated here.
The controller is further configured to receive the feedback message sent by the server and to mark the identified joint points in the local image according to it, with joint points of different action difference degrees marked in different colors.
Further, the technical scheme shown in the embodiment of the application marks the completion of the action at each joint position in different colors. Using different colors to distinguish how each joint of the user performs makes the result conspicuous, further helping the user understand how the action of each part is completed.
In some embodiments, as shown in fig. 35, in the second display interface, if the matching degree between the user's action and the exemplary video frame at a time point is high, a floating layer is added in the second playing window to display a prompt sentence encouraging the user.
In some embodiments, as shown in fig. 35, a training progress control is further arranged above the second playing window in the second display interface to show the completion degree of the user's actions. When the controller detects that the matching degree between the user's action and the demonstration action frame is higher than a preset value, it increases the completion value displayed in the training progress control; when the matching degree is lower than the preset value, the displayed completion value remains unchanged.
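The progress-control rule reduces to a small update function; the threshold value used here is a placeholder, not a value taken from the embodiment:

```python
def update_completion(completion: int, match_degree: float,
                      preset_value: float, step: int = 1) -> int:
    """Increase the completion value shown in the training progress control
    only when the user's action matches the demonstration action frame
    above the preset value; otherwise leave it unchanged."""
    return completion + step if match_degree > preset_value else completion

progress = 0
for match in (0.9, 0.4, 0.8):            # match degree per demonstration frame
    progress = update_completion(progress, match, preset_value=0.6)
print(progress)  # 2
```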
To reduce the data processing load on the server, in some embodiments the server may process only the local images corresponding to the preset time points. The specific implementation may be: the controller sends the joint points in the local image to the server as follows: when the playing time of the demonstration video reaches a preset time point, the controller caches the local images collected within a preset period before and after that point, identifies the joint point data of the cached local images, and transmits the identified joint point data to the server.
The process of buffering the local image may be referred to the above implementation and will not be described herein.
It is worth noting that, since pictures occupy large bandwidth during transmission, the solution shown in this embodiment transmits the joint points of the local video to the server in order to reduce the bandwidth occupied by data transmission.
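For instance, a joint payload serialized as JSON (an assumed wire format; the embodiment does not specify one) is only tens of bytes per frame, versus hundreds of kilobytes for an encoded image:

```python
import json

def joint_payload(preset_time_s: float, joints: dict) -> bytes:
    """Serialize only the joint coordinates plus the preset time point,
    rather than the picture itself, before uploading to the server."""
    return json.dumps({"t": preset_time_s, "joints": joints}).encode("utf-8")

payload = joint_payload(10.0, {"left_wrist": [300, 400], "left_elbow": [320, 330]})
print(len(payload), "bytes")  # far smaller than an encoded image frame
```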
In some embodiments, the controller may be further configured to send the identified joint point data to the server together with the preset time point, so that the server determines, according to the preset time point, the image frame of the demonstration video (i.e., the target image) to compare against.
In some feasible embodiments, when the human action difference degree is greater than the preset threshold, the controller marks the local image and caches the marked picture of the local image together with the exemplary video frame corresponding to the preset time point, so that local video with a large action difference can be called up when the demonstration video finishes playing.
In some feasible embodiments, after playback ends, the controller controls the display to show the exercise evaluation interface and displays on it the cached pictures of the marked local images and the exemplary video frames corresponding to the preset time points.
In some embodiments, the exemplary video frames and the corresponding local images at the preset time points are ranked by matching degree (or score), and after the demonstration video finishes playing, the exemplary video frames and corresponding local images at a preset number of time points with low matching degree (or score) are selected for display. For example, exemplary video frames and corresponding local images at 5 time points are cached by matching degree, and after playback ends, the 3 time points with the lowest matching degree (or score) are selected for display.
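The selection of the weakest moments for the evaluation interface can be sketched as a sort over the cached matching degrees (names and values illustrative):

```python
def weakest_time_points(cached_matches: dict, n: int):
    """Rank the cached time points by matching degree (or score) and return
    the n weakest for display on the exercise evaluation interface."""
    return sorted(cached_matches, key=cached_matches.get)[:n]

# The example: 5 cached time points, the 3 lowest-scoring are displayed
cached = {3.0: 0.91, 13.0: 0.42, 23.0: 0.77, 33.0: 0.35, 43.0: 0.58}
print(weakest_time_points(cached, 3))  # [33.0, 13.0, 43.0]
```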
For the display mode of the exercise evaluation interface, reference may be made to the above embodiments.
The embodiment of the application also shows a display device, including:
the display screen is used for displaying a first display interface and the second display interface, the first display interface comprises a play control used for controlling the playing of the demonstration video, and the second display interface comprises a first play window used for playing the demonstration video and a second play window used for playing the local image acquired by the camera;
the camera is used for collecting local images;
a controller configured to:
receiving input confirmation operation of the play control, starting a camera, and loading video data of the demonstration video;
responsive to the confirmation operation, displaying the second interface;
in the playing process of the demonstration video, when a label representing that the playing time of the demonstration video reaches a preset time point is detected, capturing a current video frame of the acquired local video and a neighboring video frame which is adjacent to the current video frame in time;
identifying a joint point of the current video frame and a joint point of the adjacent video frame;
comparing the joint point of the current video frame with the joint point of the demonstration video frame corresponding to the preset time point in the demonstration video; comparing the joint points of the adjacent video frames with the joint points of the demonstration video frames corresponding to the preset time point in the demonstration video;
labeling the human body action difference degree of the current video frame or the adjacent video frame according to the comparison result;
and caching, after marking, the current video frame or the adjacent video frame whose human body action difference degree is lower than a difference threshold, together with the exemplary video frame, for use in displaying an exercise evaluation interface.
The embodiment of the application also shows a display device, including:
the display screen is used for displaying a first display interface and the second display interface, the first display interface comprises a play control used for controlling the playing of the demonstration video, and the second display interface comprises a first play window used for playing the demonstration video and a second play window used for playing the local image acquired by the camera;
the camera is used for collecting local images;
a controller configured to:
receiving input confirmation operation of the play control, starting a camera, and loading video data of the demonstration video;
and in response to the confirmation operation, displaying the second interface, playing the demonstration video in the first playing window and playing the local image marked with joint points in the second video window, wherein in the marked local image a first joint point is marked in a first color and a second joint point is marked in a second color, and the action difference degree between the body part corresponding to the first joint point and the demonstration video is greater than the action difference degree between the body part corresponding to the second joint point and the demonstration video.
The embodiment of the application also discloses an interface display method, which comprises the following steps:
when a first interface is displayed, receiving input confirmation operation of a play control in the first interface, starting a camera, and loading video data of the demonstration video;
and responding to the confirmation operation, displaying the second interface, playing the demonstration video in a first playing window in the second interface, and playing the local image in a second video window in the second interface.
The embodiment of the application also discloses an interface display method, which comprises the following steps:
when the first interface is displayed, receiving input confirmation operation of the play control, starting a camera, and loading video data of the demonstration video;
responsive to the confirmation operation, displaying the second interface;
in the playing process of the demonstration video, when a label representing that the playing time of the demonstration video reaches a preset time point is detected, capturing a current video frame of the collected local video and a neighboring video frame (the video frame can also be called an image in the scheme shown in the embodiment of the application) which is adjacent to the current video frame in time;
identifying a joint point of the current video frame and a joint point of the adjacent video frame;
Comparing the joint point of the current video frame with the joint point of the demonstration video frame corresponding to the preset time point in the demonstration video; comparing the joint points of the adjacent video frames with the joint points of the demonstration video frames corresponding to the preset time point in the demonstration video;
labeling the human body action difference degree of the current video frame or the adjacent video frame according to the comparison result;
caching the current video frame or the adjacent video frame and the demonstration video frame with the human body action difference degree lower than a difference threshold value after labeling;
and in response to the end of the demonstration video playback, displaying an exercise evaluation interface that displays the marked current video frame or adjacent video frame with the lower human body action difference degree, together with the exemplary video frame.
The embodiment of the application also discloses an interface display method, which comprises the following steps:
when a first interface is displayed, receiving input confirmation operation of a play control in the first interface, starting a camera, and loading video data of the demonstration video;
and in response to the confirmation operation, displaying the second interface, playing the demonstration video in a first playing window in the second interface and playing the local image marked with joint points in a second video window in the second interface, wherein in the marked local image a first joint point is marked in a first color and a second joint point is marked in a second color, and the action difference degree between the body part corresponding to the first joint point and the demonstration video is greater than the action difference degree between the body part corresponding to the second joint point and the demonstration video.
In a specific implementation, the present application further provides a computer storage medium, which may store a program; when executed, the program may perform some or all of the steps in the embodiments of the methods provided in the present application. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
It will be apparent to those skilled in the art that the techniques in the embodiments of the present application may be implemented by software plus the necessary general hardware platform. Based on such understanding, the essence of the technical solutions in the embodiments of the present application, or the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium such as a ROM/RAM, magnetic disk or optical disk, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in all or part of the embodiments of the present application.
The various embodiments in this specification refer to each other for the same or similar parts. In particular, the method embodiments are described relatively simply since they are substantially similar to the display device embodiments; for relevant matters, refer to the description in the display device embodiments.
The above-described embodiments of the present application are not intended to limit the scope of the present application.

Claims (10)

1. A display device, characterized by comprising:
a display;
a controller for:
responding to an input instruction for follow-along training with the demonstration video, and displaying a follow-along training interface, wherein the follow-along training interface is provided with a demonstration video window and a local video window;
receiving first video stream data corresponding to the demonstration video from a server according to the media asset ID corresponding to the demonstration video so as to play the demonstration video in the demonstration video window; receiving a local video stream which is acquired and generated in real time by the image acquisition device from the image acquisition device connected with the controller so as to be played in the local video window; wherein, key frames are arranged in the demonstration video;
determining a delay acquisition time according to a first time corresponding to the key frame when the demonstration video is played to the key frame, wherein the delay acquisition time is the first time plus a preset reaction time length;
according to the delay acquisition time and the time stamp of the video frame in the local video, obtaining a local video frame to be compared corresponding to the delay acquisition time in the local video stream, and comparing the action of the key frame with that of the local video frame to be compared to determine a grade identifier, wherein the grade identifier represents the action difference degree of the human action in the local video frame to be compared and the key frame, and the corresponding action matching degrees of different grade identifiers are different;
Displaying the grade identification on the local video window;
responding to the end of the follow-along training process, detecting whether user input is received, and when no user input is received within a preset duration, presenting an automatic playing prompt interface and starting a countdown, wherein the automatic playing prompt interface displays countdown prompt information and information on the video to be automatically played;
continuously detecting whether user input is received in the countdown process;
if the user input is not received before the countdown is completed, playing the video to be automatically played after the countdown is completed;
and stopping the countdown if the user input is detected before the countdown is finished, and executing control logic corresponding to the detected user input.
2. The display device of claim 1, wherein the display device comprises a display device,
the step of performing action comparison on the key frame and the local video frame to be compared comprises the following steps:
performing action comparison on the key frame and the local video frame to be compared to obtain a score for generating current action matching according to the matching degree of the local video frame to be compared and the key frame;
and determining a grade identification corresponding to the score according to the score.
3. The display device of claim 2, wherein the action comparing the key frame to the local video frame to be compared comprises:
and performing skeleton point matching on the key frame and the local video frame to be compared so as to determine the similarity degree of actions.
4. The display device of claim 1, wherein the display device is a smart television, wherein the demonstration video window and the local video window are arranged side by side in the follow-along training interface, and wherein, from the user's viewing perspective, the demonstration video window is positioned to the left of the local video window.
5. The display device of claim 4, wherein the exemplary video window has a lateral width that is greater than a lateral width of the local video window.
6. The display device of claim 1, wherein the receiving, from an image collector connected to the controller, a local video stream collected and generated in real time by the image collector to play in the local video window comprises:
receiving a local video stream which is acquired and generated in real time by an image acquisition device from the image acquisition device connected with the controller;
Determining skeleton points of image frames in the local video stream according to the local video stream;
and displaying the skeleton point in a superimposed manner while playing the video stream in the local video window.
7. The display device of claim 6, wherein the display device comprises a display device,
and the colors of the bone points displayed in a superimposed manner are different in limb positions with different limb motion matching degrees on the local video window.
8. The display device of claim 1, further comprising an image collector that is hidden by default, the controller being further configured to, after responding to the input instruction for follow-along training with the demonstration video and before receiving the local video stream generated by the image collector:
controlling the image collector to lift from the hidden position so that the image collector extends out of the frame of the display device; and starting the image collector so that the image collector starts to collect images.
9. An interface display method, characterized in that the method comprises:
responding to an input instruction for follow-along training with the demonstration video, and displaying a follow-along training interface, wherein the follow-along training interface is provided with a demonstration video window and a local video window;
Receiving first video stream data corresponding to the demonstration video from a server according to the media asset ID corresponding to the demonstration video so as to play the demonstration video in the demonstration video window; receiving a local video stream which is acquired and generated in real time by the image acquisition device from the image acquisition device connected with the controller so as to be played in the local video window; wherein, key frames are arranged in the demonstration video;
determining a delay acquisition time according to a first time corresponding to the key frame when the demonstration video is played to the key frame, wherein the delay acquisition time is the first time plus a preset reaction time length;
according to the delay acquisition time and the time stamp of the video frame in the local video, obtaining a local video frame to be compared corresponding to the delay acquisition time in the local video, and comparing the action of the key frame with that of the local video frame to be compared to determine a grade identifier, wherein the grade identifier represents the action difference degree of the human action in the local video frame and the key frame, and the corresponding action matching degree of different grade identifiers is different;
displaying the grade identification on the local video window;
responding to the end of the follow-along training process, detecting whether user input is received, and when no user input is received within a preset duration, presenting an automatic playing prompt interface and starting a countdown, wherein the automatic playing prompt interface displays countdown prompt information and information on the video to be automatically played;
continuously detecting whether user input is received in the countdown process;
if the user input is not received before the countdown is completed, playing the video to be automatically played after the countdown is completed;
and stopping the countdown if the user input is detected before the countdown is finished, and executing control logic corresponding to the detected user input.
10. The method of claim 9, wherein:
before responding to the input instruction for follow-along training with the demonstration video, the method further comprises:
displaying a detail interface, wherein the detail interface comprises a play window and a play control, the play window is used for playing the demonstration video, and the input instruction for follow-along training with the demonstration video is an input selection instruction for the play control or the play window.
CN202080024736.6A 2019-08-18 2020-08-18 Display apparatus Active CN113678137B (en)

Applications Claiming Priority (23)

Application Number Priority Date Filing Date Title
CN201910761455 2019-08-18
CN2019107614558 2019-08-18
CN2020103642034 2020-04-30
CN202010364203 2020-04-30
CN2020103865475 2020-05-09
CN202010386547.5A CN112399234B (en) 2019-08-18 2020-05-09 Interface display method and display equipment
CN202010412358.0A CN113596590B (en) 2020-04-30 2020-05-15 Display device and play control method
CN2020104123580 2020-05-15
CN2020104297050 2020-05-20
CN202010429705.0A CN113596551B (en) 2020-04-30 2020-05-20 Display device and play speed adjusting method
CN202010440465.4A CN113596536B (en) 2020-04-30 2020-05-22 Display device and information display method
CN2020104442124 2020-05-22
CN202010444212.4A CN113596537B (en) 2020-04-30 2020-05-22 Display device and playing speed method
CN2020104442961 2020-05-22
CN202010444296.1A CN113591523B (en) 2020-04-30 2020-05-22 Display device and experience value updating method
CN2020104404654 2020-05-22
CN2020104598861 2020-05-27
CN202010459886.1A CN113591524A (en) 2020-04-30 2020-05-27 Display device and interface display method
CN2020104794918 2020-05-29
CN202010479491.8A CN113596552B (en) 2020-04-30 2020-05-29 Display device and information display method
CN2020106734697 2020-07-13
CN202010673469.7A CN111787375B (en) 2020-04-30 2020-07-13 Display device and information display method
PCT/CN2020/109859 WO2021032092A1 (en) 2019-08-18 2020-08-18 Display device

Publications (2)

Publication Number Publication Date
CN113678137A CN113678137A (en) 2021-11-19
CN113678137B (en) 2024-03-12

Family

ID=78538555

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202080024736.6A Active CN113678137B (en) 2019-08-18 2020-08-18 Display apparatus

Country Status (1)

Country Link
CN (1) CN113678137B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114513694A (en) * 2022-02-17 2022-05-17 平安国际智慧城市科技股份有限公司 Scoring determination method and device, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8437506B2 (en) * 2010-09-07 2013-05-07 Microsoft Corporation System for fast, probabilistic skeletal tracking
KR101711488B1 (en) * 2015-01-28 2017-03-03 한국전자통신연구원 Method and System for Motion Based Interactive Service

Patent Citations (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102724449A (en) * 2011-03-31 2012-10-10 青岛海信电器股份有限公司 Interactive TV and method for realizing interaction with user by utilizing display device
CN103764235A (en) * 2011-08-31 2014-04-30 英派尔科技开发有限公司 Position-setup for gesture-based game system
CN103164024A (en) * 2011-12-15 2013-06-19 西安天动数字科技有限公司 Somatosensory interactive system
CN105228708A (en) * 2013-04-02 2016-01-06 日本电气方案创新株式会社 Body action scoring apparatus, dancing scoring apparatus, Caraok device and game device
CN103327356A (en) * 2013-06-28 2013-09-25 Tcl集团股份有限公司 Video matching method and device
CN105898133A (en) * 2015-08-19 2016-08-24 乐视网信息技术(北京)股份有限公司 Video shooting method and device
CN106570719A (en) * 2016-08-24 2017-04-19 阿里巴巴集团控股有限公司 Data processing method and apparatus
CN107153812A (en) * 2017-03-31 2017-09-12 深圳先进技术研究院 A kind of exercising support method and system based on machine vision
CN107349594A (en) * 2017-08-31 2017-11-17 华中师范大学 A kind of action evaluation method of virtual Dance System
CN107952238A (en) * 2017-11-23 2018-04-24 乐蜜有限公司 Video generation method, device and electronic equipment
CN107920269A (en) * 2017-11-23 2018-04-17 乐蜜有限公司 Video generation method, device and electronic equipment
CN108260016A (en) * 2018-03-13 2018-07-06 北京小米移动软件有限公司 Processing method, device, equipment, system and storage medium is broadcast live
CN108537284A (en) * 2018-04-13 2018-09-14 东莞松山湖国际机器人研究院有限公司 Posture assessment scoring method based on computer vision deep learning algorithm and system
CN108615055A (en) * 2018-04-19 2018-10-02 咪咕动漫有限公司 A kind of similarity calculating method, device and computer readable storage medium
CN208174836U (en) * 2018-06-11 2018-11-30 石家庄科翔电子科技有限公司 A kind of concealed structure of placement camera
CN109144247A (en) * 2018-07-17 2019-01-04 尚晟 The method of video interactive and based on can interactive video motion assistant system
CN109389035A (en) * 2018-08-30 2019-02-26 南京理工大学 Low latency video actions detection method based on multiple features and frame confidence score
CN109621425A (en) * 2018-12-25 2019-04-16 广州华多网络科技有限公司 A kind of video generation method, device, equipment and storage medium
CN109859324A (en) * 2018-12-29 2019-06-07 北京光年无限科技有限公司 A kind of motion teaching method and device based on visual human
CN109815930A (en) * 2019-02-01 2019-05-28 中国人民解放军总医院第六医学中心 A kind of action imitation degree of fitting evaluation method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
An Approach to Ballet Dance Training through MS Kinect and Visualization in a CAVE Virtual Reality Environment; Matthew Kyan et al.; ACM Transactions; 1-38 *
Research on motion video analysis software in badminton technique teaching; Zhao Huayu; Contemporary Sports Technology; Vol. 8, No. 17; 132, 134 *

Also Published As

Publication number Publication date
CN113678137A (en) 2021-11-19

Similar Documents

Publication Publication Date Title
CN113591523B (en) Display device and experience value updating method
WO2021032092A1 (en) Display device
CN112399234B (en) Interface display method and display equipment
CN112272324B (en) Follow-up mode control method and display device
WO2021088320A1 (en) Display device and content display method
US11706485B2 (en) Display device and content recommendation method
CN112533037B (en) Method for generating Lian-Mai chorus works and display equipment
CN112333499A (en) Method for searching target equipment and display equipment
WO2021088888A1 (en) Focus switching method, and display device and system
CN112399212A (en) Display device, file sharing method and server
CN112073770B (en) Display device and video communication data processing method
CN112040272A (en) Intelligent explanation method for sports events, server and display equipment
CN113678137B (en) Display apparatus
WO2022037224A1 (en) Display device and volume control method
CN112533023B (en) Method for generating Lian-Mai chorus works and display equipment
CN112533030B (en) Display method, display equipment and server of singing interface
CN112073777B (en) Voice interaction method and display device
CN112463267B (en) Method for presenting screen saver information on display device screen and display device
CN112839254A (en) Display apparatus and content display method
EP4245387A1 (en) Control method and electronic device
CN114339346A (en) Display device and image recognition result display method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant