CN113596552B - Display device and information display method - Google Patents

Display device and information display method

Info

Publication number: CN113596552B (granted; earlier published as CN113596552A)
Application number: CN202010479491.8A, filed by Juhaokan Technology Co., Ltd.
Authority: CN (China)
Other languages: Chinese (zh)
Legal status: Active (granted)
Prior art keywords: target, video, user, local, local video
Inventor: 王光强 (Wang Guangqiang)
Original and current assignee: Juhaokan Technology Co., Ltd.
Related applications: PCT/CN2020/109859 (WO2021032092A1), CN202080024736.6A (CN113678137B), US17/455,575 (US11924513B2)

Classifications

    All classes below fall under H04N 21/00 (selective content distribution, e.g. interactive television or video on demand [VOD]):

    • H04N21/4312: Generation of visual interfaces for content selection or interaction, involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N21/2387: Stream processing in response to a playback request from an end-user, e.g. for trick-play
    • H04N21/4122: Peripherals receiving signals from specially adapted client devices, additional display device, e.g. video projector
    • H04N21/42221: Remote control devices, transmission circuitry, e.g. infrared [IR] or radio frequency [RF]
    • H04N21/426: Internal components of the client; characteristics thereof
    • H04N21/4316: Generation of visual interfaces for displaying supplemental content in a region of the screen, e.g. an advertisement in a separate window
    • H04N21/43637: Adapting the video or multiplex stream to a specific local network, involving a wireless protocol, e.g. Bluetooth, RF or wireless LAN [IEEE 802.11]
    • H04N21/443: OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
    • H04N21/472: End-user interface for requesting content, additional data or services, or for interacting with content, e.g. for content reservation or setting reminders
    • H04N21/478: Supplemental services, e.g. displaying phone caller identification, shopping application

Abstract

The application discloses a display device and an information display method. In response to a preset instruction, the display device captures local images to generate a local video stream, plays the local video picture, and displays, in a floating layer above the picture, a graphic element identifying a preset desired position. When no moving target is present in the local video picture, or when a moving target is present but the offset of its position relative to the desired position exceeds a preset threshold, a prompt control guiding the moving target to the desired position is presented in the floating layer according to that offset. The user can thus move to the desired position as prompted, so that during subsequent follow-along exercise the device captures the local images best suited to analyzing and comparing the user's actions.

Description

Display device and information display method
This application claims priority to Chinese patent application No. 202010364203.4, filed on April 30, 2020 and entitled "Display device and playback control method", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of display device technologies, and in particular, to a display device and an information display method.
Background
With the continuous development of communication technology, terminal devices such as computers, smartphones, and display devices are becoming increasingly widespread, and users' demands on the functions and services these devices can provide keep growing. Display devices such as smart televisions, which can present audio, video, pictures, and other content to users, are attracting wide attention today.
As smart display devices become more popular, users increasingly want to pursue leisure and entertainment activities on the device's large screen. The growing amounts of time and money that families spend on interest cultivation and training in action-based activities, such as dance, gymnastics, and fitness, show how important these activities are to users.
How to provide interest-cultivation and training functions for action-based activities through the display device, so as to meet these user needs, has therefore become an urgent technical problem.
Disclosure of Invention
The present application provides a display device and an information display method, aiming to provide users, through the display device, with at least one of the interest-cultivation and training functions for action-based activities.
In a first aspect, the present application provides a display device comprising:
a display, configured to present a user interface in which at least one video window can be displayed, with at least one floating layer displayable above the video window;
an image collector, configured to capture local images to generate a local video stream;
a controller, configured to:
in response to an input preset instruction, control the image collector to capture local images to generate a local video stream;
play the local video picture in the video window, and display, in a floating layer above the local video picture, a graphic element identifying a preset desired position;
when no moving target is present in the local video picture, or when a moving target is present but the offset of its position in the local video picture relative to the desired position exceeds a preset threshold, present, in a floating layer above the local video picture and according to that offset, a prompt control for guiding the moving target to the desired position;
when a moving target is present in the local video picture and the offset of its position relative to the desired position does not exceed the preset threshold, cancel the display of the graphic element and the prompt control.
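To make the guidance logic concrete, here is a minimal Kotlin sketch of the offset check described above. Everything in it (the Point and PositionGuide names, the per-axis threshold test, the hint strings, and the coordinate convention) is a hypothetical illustration; the patent does not specify an implementation.

```kotlin
import kotlin.math.abs

/** A point in normalized frame coordinates (0.0 .. 1.0 on each axis); hypothetical. */
data class Point(val x: Double, val y: Double)

class PositionGuide(
    private val desired: Point,    // the preset desired position
    private val threshold: Double  // the preset offset threshold
) {
    /** Returns the hint to show in the floating layer, or null to cancel the prompt. */
    fun evaluate(target: Point?): String? {
        // No moving target detected in the local video picture.
        if (target == null) return "Please step into the camera view"
        val dx = target.x - desired.x
        val dy = target.y - desired.y
        return when {
            // Offset within the threshold: cancel the graphic element and prompt.
            abs(dx) <= threshold && abs(dy) <= threshold -> null
            // Otherwise guide along the axis with the larger deviation.
            abs(dx) >= abs(dy) && dx > 0 -> "Move left"
            abs(dx) >= abs(dy)           -> "Move right"
            dy > 0                       -> "Step back"
            else                         -> "Step forward"
        }
    }
}

fun main() {
    val guide = PositionGuide(desired = Point(0.5, 0.5), threshold = 0.1)
    println(guide.evaluate(null))               // Please step into the camera view
    println(guide.evaluate(Point(0.9, 0.5)))    // Move left
    println(guide.evaluate(Point(0.52, 0.48)))  // null -> cancel the prompt control
}
```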
In a second aspect, the present application further provides an information display method, including:
displaying a user interface in which at least one video window can be displayed, with at least one floating layer displayable above the video window;
in response to an input preset instruction, capturing local images to generate a local video stream;
playing the local video picture in the video window, and displaying, in a floating layer above the local video picture, a graphic element identifying a preset desired position;
when no moving target is present in the local video picture, or when a moving target is present but the offset of its position in the local video picture relative to the desired position exceeds a preset threshold, presenting, in a floating layer above the local video picture and according to that offset, a prompt control for guiding the moving target to the desired position;
when a moving target is present in the local video picture and the offset of its position relative to the desired position does not exceed the preset threshold, cancelling the display of the graphic element and the prompt control.
As can be seen from the above technical solutions, the embodiments of the present application provide a display device and an information display method. In response to a preset instruction, the display device captures local images to generate a local video stream, plays the local video picture, and displays, in a floating layer above the picture, a graphic element identifying a preset desired position. When no moving target is present, or when a moving target is present but the offset of its position in the local video picture relative to the desired position exceeds a preset threshold, a prompt control guiding the moving target to the desired position is presented in the floating layer according to that offset. The user can thus move to the desired position as prompted, so that during subsequent follow-along exercise the device captures the local images best suited to analyzing and comparing the user's actions.
Drawings
To explain the technical solutions of the present application more clearly, the drawings needed in the embodiments are briefly introduced below. Those skilled in the art can obviously derive other drawings from these drawings without creative effort.
Fig. 1 is a schematic diagram of an operation scenario between a display device and a control apparatus according to an embodiment;
Fig. 2 is a block diagram of a hardware configuration of the display device 200 according to an embodiment;
Fig. 3 is a block diagram of a hardware configuration of the control apparatus 100 according to an embodiment;
Fig. 4 is a schematic diagram of a functional configuration of the display device 200 according to an embodiment;
Fig. 5 is a schematic diagram of a software configuration in the display device 200 according to an embodiment;
Fig. 6 is a schematic diagram of the configuration of application programs in the display device 200 according to an embodiment;
Fig. 7 is a schematic diagram of a user interface in the display device 200 according to an embodiment;
Fig. 8 is an exemplary user interface;
Fig. 9 is an exemplary home page of the target application;
Fig. 10a is an exemplary user interface;
Fig. 10b is another exemplary user interface;
Fig. 11 is an exemplary user interface;
Fig. 12 is an exemplary user interface;
Fig. 13 is an exemplary user interface;
Fig. 14 is an exemplary user interface;
Fig. 15 is an exemplary user interface;
Fig. 16 is an exemplary pause interface;
Fig. 17 is an exemplary user interface presenting saving information;
Fig. 18 is an exemplary user interface presenting a resume prompt;
Fig. 19 is an exemplary user interface presenting scoring information;
Fig. 20 is an exemplary user interface presenting detailed performance information;
Fig. 21 is an exemplary user interface for viewing the original image file of a follow-along screenshot;
Fig. 22 is another exemplary user interface presenting detailed performance information;
Fig. 23 is an exemplary detailed-performance information page displayed on a mobile terminal device;
Fig. 24 is an exemplary user interface displaying an automatic-play prompt;
Fig. 25 is an exemplary user interface displaying a user's exercise records.
Detailed Description
To help those skilled in the art better understand the technical solutions in the present application, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present application; all other embodiments derived from them by a person of ordinary skill in the art without creative effort fall within the protection scope of the present application. Moreover, while the disclosure is presented in terms of one or more exemplary examples, it should be understood that each aspect of the disclosure can also be used independently of the other aspects.
It should be understood that the terms "first", "second", "third", and the like in the description, the claims, and the drawings of the present application are used to distinguish similar objects, not necessarily to describe a particular order or sequence. Data so identified are interchangeable where appropriate, so that the embodiments described herein can, for example, be implemented in orders other than those illustrated or described.
Furthermore, the terms "comprises" and "comprising," as well as any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or device that comprises a list of elements is not necessarily limited to those elements explicitly listed, but may include other elements not expressly listed or inherent to such product or device.
The term "module" as used herein refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware or/and software code that is capable of performing the functionality associated with that element.
The term "remote control" as used in this application refers to a component of an electronic device, such as the display device disclosed in this application, that is typically wirelessly controllable over a short range of distances. Typically using infrared and/or Radio Frequency (RF) signals and/or bluetooth to interface with the electronic device, and may also include WiFi, wireless USB, bluetooth, motion sensor, etc. functional modules. For example: the hand-held touch remote controller replaces most of the physical built-in hard keys in a common remote control device with a user interface in a touch screen.
The term "gesture" as used in this application refers to a user's behavior through a change in hand type or an action such as hand motion to convey an intended idea, action, purpose, or result.
Fig. 1 is a schematic diagram illustrating an operation scenario between a display device and a control apparatus according to an embodiment. As shown in fig. 1, a user may operate the display device 200 through the mobile terminal 300 and the control apparatus 100.
The control apparatus 100 may be a remote controller that controls the display device 200 wirelessly or through other wired means, including infrared protocol communication, Bluetooth protocol communication, and other short-range communication methods. The user may control the display device 200 by inputting user commands through keys on the remote controller, voice input, control panel input, and so on. For example, the user can input corresponding control commands through the volume up/down keys, channel control keys, up/down/left/right movement keys, voice input key, menu key, and power key on the remote controller to control the functions of the display device 200.
In some embodiments, mobile terminals, tablets, computers, laptops, and other smart devices may also be used to control the display device 200. For example, the display device 200 is controlled using an application program running on the smart device. The application, through configuration, may provide the user with various controls in an intuitive User Interface (UI) on a screen associated with the smart device.
In some embodiments, the mobile terminal 300 and the display device 200 may each install a software application, so that connection and communication between them can be implemented through a network communication protocol, achieving one-to-one control operation and data communication. For example, a control-instruction protocol can be established between the mobile terminal 300 and the display device 200, the remote-control keyboard can be synchronized onto the mobile terminal 300, and the functions of the display device 200 can be controlled through the user interface on the mobile terminal 300. The audio and video content displayed on the mobile terminal 300 may also be transmitted to the display device 200 to implement a synchronous display function.
As also shown in fig. 1, the display device 200 performs data communication with the server 400 through various communication means. The display device 200 may connect through a local area network (LAN), a wireless local area network (WLAN), or other networks. The server 400 may provide various contents and interactions to the display device 200. In some embodiments, the display device 200 receives software program updates or accesses a remotely stored digital media library by sending and receiving information and exchanging Electronic Program Guide (EPG) data. The server 400 may be one server or a group of servers, and may also provide other web service content such as video on demand and advertising services.
The display device 200 may be a liquid crystal display, an OLED display, or a projection display device. The specific type, size, and resolution of the display device are not limited here; those skilled in the art will appreciate that the display device 200 may vary in performance and configuration as needed.
In addition to the broadcast-receiving television function, the display device 200 may provide a smart network TV function with computer support. In some embodiments, this includes network TV, smart TV, Internet Protocol TV (IPTV), and the like.
A hardware configuration block diagram of the display device 200 according to an exemplary embodiment is shown in fig. 2. As shown in fig. 2, the display device 200 includes at least one of a controller 210, a tuner-demodulator 220, a communication interface 230, a detector 240, an input/output interface 250, a video processor 260-1, an audio processor 260-2, a display 280, an audio output 270, a memory 290, a power supply, and an infrared receiver.
The display 280 receives image signals from the video processor 260-1 and displays video content, images, and the components of the menu manipulation interface. The display 280 includes a display screen assembly for presenting pictures and a driving assembly that drives image display. The displayed video content may come from broadcast television, from broadcast signals received via wired or wireless communication protocols, or from various image contents sent by a network server over a network communication protocol.
The display 280 also presents the user-manipulation UI interface that is generated in the display device 200 and used to control the display device 200. The driving assembly depends on the type of the display 280; when the display 280 is a projection display, it may also include a projection device and a projection screen.
The communication interface 230 is a component for communicating with an external device or an external server according to various communication protocol types. For example: the communication interface 230 may be a Wifi chip 231, a bluetooth communication protocol chip 232, a wired ethernet communication protocol chip 233, or other network communication protocol chips or near field communication protocol chips, and an infrared receiver (not shown).
The display device 200 may establish transmission and reception of control signals and data signals with an external control apparatus or content-providing apparatus through the communication interface 230. The display device may also include an infrared receiver, an interface device for receiving infrared control signals from the control apparatus 100 (e.g., an infrared remote controller).
The detector 240 is a component the display device 200 uses to collect signals from the external environment or from interaction with the outside. The detector 240 includes a light receiver 242, a sensor that collects ambient light intensity so that display parameters can adapt to changes in ambient light.
The detector 240 also includes an image acquisition device 241, such as a camera, which may be used to capture external environment scenes and to collect user attributes or user gestures, so as to adaptively change display parameters and to recognize user gestures for interaction with the user.
In some other exemplary embodiments, the detector 240 may further include a temperature sensor. By sensing the ambient temperature, the display device 200 can adaptively adjust the display color temperature of images, for example toward a cooler tone in a high-temperature environment and toward a warmer tone in a low-temperature environment.
In other exemplary embodiments, the detector 240 may further include a sound collector, such as a microphone, used to receive the user's voice, including voice signals carrying the user's control instructions for the display device 200, or to collect ambient sounds for identifying the ambient scene type so that the display device 200 can adapt to ambient noise.
The input/output interface 250 enables data transmission between the controller 210 of the display device 200 and other external devices, such as receiving video and audio signals or command instructions from an external device.
Input/output interface 250 may include, but is not limited to, the following: any one or more of high definition multimedia interface HDMI interface 251, analog or data high definition component input interface 253, composite video input interface 252, USB input interface 254, RGB ports (not shown in the figures), etc.
In some other exemplary embodiments, the input/output interface 250 may also form a composite input/output interface with the above-mentioned plurality of interfaces.
The tuner-demodulator 220 receives broadcast television signals in a wired or wireless manner, performs modulation and demodulation processing such as amplification, mixing, and resonance, and demodulates, from among multiple wireless or wired broadcast television signals, the television audio/video signals and EPG data signals carried on the television channel frequency selected by the user.
Under the control of the controller 210, the tuner-demodulator 220 responds to the television signal frequency selected by the user and the television signal carried on that frequency.
The tuner-demodulator 220 may receive signals in various ways according to the broadcasting system of the television signal, such as: terrestrial broadcast, cable broadcast, satellite broadcast, internet broadcast signals, or the like; and according to different modulation types, the modulation mode can be digital modulation or analog modulation. Depending on the type of television signal received, both analog and digital signals are possible.
In other exemplary embodiments, the tuner/demodulator 220 may be in an external device, such as an external set-top box. Thus, the set-top box outputs television audio and video signals after modulation and demodulation, and the television audio and video signals are input into the display device 200 through the input/output interface 250.
The video processor 260-1 receives external video signals and performs video processing such as decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion, and image synthesis according to the standard codec protocol of the input signal, to obtain a signal that can be displayed or played directly on the display device 200.
In some embodiments, the video processor 260-1 includes at least one of a demultiplexing module, a video decoding module, an image synthesizing module, a frame rate conversion module, a display formatting module, and the like.
The demultiplexing module demultiplexes the input audio/video data stream; for example, an input MPEG-2 stream is demultiplexed into a video signal and an audio signal.
The video decoding module processes the demultiplexed video signal, including decoding and scaling.
The image synthesis module superimposes and mixes the GUI signal, input by the user or generated by the graphic generator, with the scaled video image, to generate an image signal for display.
The frame rate conversion module converts the frame rate of the input video, for example from 60 Hz to 120 Hz or 240 Hz, typically by frame interpolation.
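As an illustration of the interpolation idea, the following sketch doubles a 60 Hz sequence to 120 Hz with a naive per-pixel average. Real frame rate converters use motion-compensated interpolation; the function and frame representation here are assumptions, not the module's actual algorithm.

```kotlin
/** One video frame as flattened per-pixel luma values (illustrative). */
typealias Frame = DoubleArray

/**
 * Doubles the frame rate (e.g. 60 Hz -> 120 Hz) by inserting one extra frame
 * between each pair of source frames. The naive average below stands in for
 * the motion-compensated interpolation a real converter would use.
 */
fun doubleFrameRate(src: List<Frame>): List<Frame> {
    val out = mutableListOf<Frame>()
    for (i in src.indices) {
        out += src[i]
        out += if (i + 1 < src.size) {
            // Interpolated frame: per-pixel midpoint of the two neighbours.
            DoubleArray(src[i].size) { p -> (src[i][p] + src[i + 1][p]) / 2.0 }
        } else {
            src[i]  // repeat the last frame so the output rate stays constant
        }
    }
    return out
}

fun main() {
    val frames = listOf(doubleArrayOf(0.0, 0.0), doubleArrayOf(1.0, 1.0))
    val doubled = doubleFrameRate(frames)
    println(doubled.size)         // 4 frames out of 2
    println(doubled[1].toList())  // [0.5, 0.5] - the interpolated frame
}
```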
The display formatting module converts the frame-rate-converted video output signal into a signal that conforms to the display format, for example an RGB data signal.
The audio processor 260-2 is configured to receive an external audio signal, decompress and decode the received audio signal according to a standard codec protocol of the input signal, and perform noise reduction, digital-to-analog conversion, amplification processing, and the like to obtain an audio signal that can be played in the speaker.
In other exemplary embodiments, the video processor 260-1 may comprise one or more chips. The audio processor 260-2 may also comprise one or more chips.
And, in other exemplary embodiments, the video processor 260-1 and the audio processor 260-2 may be separate chips or may be integrated together with the controller 210 in one or more chips.
The audio output 270 receives the sound signal output by the audio processor 260-2 under the control of the controller 210. Besides the speaker 272 carried by the display device 200 itself, it includes an external sound output terminal 274 that can output to a sound-producing device of an external device, such as an external sound interface or an earphone interface.
The power supply provides power support for the display device 200 using power input from an external power source, under the control of the controller 210. The power supply may be a built-in power supply circuit installed inside the display device 200, or a power supply interface installed outside the display device 200 that supplies external power to it.
The user input interface receives the user's input signals and forwards them to the controller 210. The user input signal may be a remote controller signal received through the infrared receiver; various other user control signals may be received through the network communication module.
In some embodiments, the user inputs a user command through the remote controller 100 or the mobile terminal 300, the user input interface forwards the input to the controller 210, and the display device 200 responds to the user's input.
In some embodiments, a user may enter a user command on a Graphical User Interface (GUI) displayed on the display 280, and the user input interface receives the user input command through the Graphical User Interface (GUI). Alternatively, the user may input the user command by inputting a specific sound or gesture, and the user input interface receives the user input command by recognizing the sound or gesture through the sensor.
The controller 210 controls the operation of the display device 200 and responds to the user's operation through various software control programs stored on the memory 290.
As shown in fig. 2, the controller 210 includes RAM 213 and ROM 214, a graphics processor 216, a CPU processor 212, a communication interface 218 (a first interface 218-1 through an nth interface 218-n), and a communication bus. The RAM 213, the ROM 214, the graphics processor 216, the CPU processor 212, and the communication interface 218 are connected via the bus.
The ROM 214 stores instructions for various system boots. When the display device 200 is powered on upon receiving a power-on signal, the CPU processor 212 executes the system boot instructions in the ROM and copies the operating system stored in the memory 290 into the RAM 213 to start running the operating system. After the operating system has started, the CPU processor 212 copies the various application programs in the memory 290 into the RAM 213 and then starts running and launching them.
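The boot flow just described can be sketched schematically as follows. The Rom, Memory, and Ram types are hypothetical stand-ins for the components named above; actual firmware is far more involved.

```kotlin
/** Hypothetical stand-ins for the boot participants described above. */
class Rom(val bootInstructions: List<String>)
class Memory(val osImage: ByteArray, val apps: List<String>)
class Ram {
    var os: ByteArray? = null
    val running = mutableListOf<String>()
}

fun powerOn(rom: Rom, memory: Memory, ram: Ram) {
    // The CPU executes the system boot instructions held in ROM...
    rom.bootInstructions.forEach { step -> println("boot: $step") }
    // ...copies the operating system from memory into RAM and starts it...
    ram.os = memory.osImage.copyOf()
    println("operating system started")
    // ...then copies and launches the application programs.
    memory.apps.forEach { app ->
        ram.running += app
        println("launched: $app")
    }
}

fun main() {
    powerOn(
        Rom(listOf("init clocks", "check memory")),
        Memory(osImage = ByteArray(16), apps = listOf("launcher", "baby dance")),
        Ram()
    )
}
```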
The graphics processor 216 generates various graphics objects, such as icons, operation menus, and graphics displayed in response to user input instructions. It includes an arithmetic unit, which operates on the interactive instructions input by the user and displays objects according to their display attributes, and a renderer, which generates the various objects produced by the arithmetic unit and displays the rendered results on the display 280.
The CPU processor 212 executes the operating system and application program instructions stored in the memory 290, and runs the various applications, data, and content according to interactive instructions received from the outside, so as to finally display and play various audio and video content.
In some exemplary embodiments, the CPU processor 212 may include multiple processors: one main processor and one or more sub-processors. The main processor performs some operations of the display device 200 in the pre-power-up mode and/or displays pictures in the normal mode; the sub-processor(s) handle operations in standby and similar modes.
The controller 210 may control the overall operation of the display device 200. For example, in response to receiving a user command for selecting a UI object displayed on the display 280, the controller 210 may perform the operation related to the object selected by that command.
The object may be any selectable object, such as a hyperlink or an icon. Operations related to the selected object include, for example, displaying the page, document, or image connected to a hyperlink, or running the program corresponding to an icon. The user command for selecting a UI object may be input through various input devices connected to the display device 200 (e.g., a mouse, keyboard, or touch pad) or may be a voice command corresponding to speech uttered by the user.
The memory 290 includes a memory for storing various software modules for driving the display device 200. Such as: various software modules stored in memory 290, including: the system comprises a basic module, a detection module, a communication module, a display control module, a browser module, various service modules and the like.
The basic module is the bottom-layer software module used for signal communication among the hardware components of the display device 200 and for sending processing and control signals to the upper-layer modules. The detection module collects various kinds of information from sensors or the user input interface and performs digital-to-analog conversion and analysis management.
For example, the voice recognition module comprises a voice parsing module and a voice instruction database module. The display control module controls the display 280 to display image content and can be used to play multimedia image content, UI interfaces, and other information. The communication module performs control and data communication with external devices. The browser module performs data communication with browsing servers. The service module provides various services and comprises various application programs.
Meanwhile, the memory 290 also stores received external data and user data, images of the items in various user interfaces, visual-effect maps of the focus object, and the like.
A block diagram of the configuration of the control apparatus 100 according to an exemplary embodiment is exemplarily shown in fig. 3. As shown in fig. 3, the control apparatus 100 includes a controller 110, a communication interface 130, a user input/output interface 140, a memory 190, and a power supply 180.
The control device 100 is configured to control the display device 200 and may receive an input operation instruction of a user and convert the operation instruction into an instruction recognizable and responsive by the display device 200, serving as an interaction intermediary between the user and the display device 200. Such as: the user operates the channel up/down keys on the control device 100, and the display device 200 responds to the channel up/down operation.
In some embodiments, the control device 100 may be a smart device. Such as: the control apparatus 100 may install various applications that control the display apparatus 200 according to user demands.
In some embodiments, as shown in fig. 1, the mobile terminal 300 or another intelligent electronic device may perform a function similar to the control device 100 after installing an application for manipulating the display device 200. For example, by installing such an application, the user can use the various function keys or virtual buttons of a graphical user interface provided on the mobile terminal 300 or other intelligent electronic device to implement the functions of the physical keys of the control device 100.
The controller 110 includes a processor 112, RAM 113, ROM 114, a communication interface, and a communication bus. The controller 110 controls the running of the control device 100, the communication and coordination among its internal components, and the external and internal data processing functions.
The communication interface 130 enables communication of control signals and data signals with the display apparatus 200 under the control of the controller 110. Such as: the received user input signal is transmitted to the display apparatus 200. The communication interface 130 may include at least one of a WiFi chip, a bluetooth module, an NFC module, and other near field communication modules.
A user input/output interface 140, wherein the input interface includes at least one of a microphone 141, a touch pad 142, a sensor 143, keys 144, and other input interfaces. Such as: the user can realize a user instruction input function through actions such as voice, touch, gesture, pressing, and the like, and the input interface converts the received analog signal into a digital signal and converts the digital signal into a corresponding instruction signal, and sends the instruction signal to the display device 200.
The output interface includes an interface that transmits the received user instruction to the display device 200. In some embodiments, it may be an infrared interface or a radio-frequency interface. For example, when the infrared signal interface is used, the user input instruction is converted into an infrared control signal according to an infrared control protocol and sent to the display device 200 through the infrared sending module. When the radio-frequency signal interface is used, the user input instruction is converted into a digital signal, modulated according to a radio-frequency control signal modulation protocol, and then sent to the display device 200 through the radio-frequency sending terminal.
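As an illustration of converting a key press into a protocol-framed signal, the sketch below encodes a command using NEC-style infrared framing. NEC is a common consumer IR protocol, but this document does not name the protocol it uses, and the key code below is invented.

```kotlin
/**
 * Builds a 32-bit NEC-style infrared frame: address, inverted address,
 * command, inverted command (8 bits each). The redundancy lets the
 * receiver validate the frame.
 */
fun encodeIrNec(address: Int, command: Int): Long {
    val a = address and 0xFF
    val c = command and 0xFF
    return (a.toLong() shl 24) or
            ((a.inv() and 0xFF).toLong() shl 16) or
            (c.toLong() shl 8) or
            (c.inv() and 0xFF).toLong()
}

fun main() {
    val volumeUpCode = 0x18  // hypothetical key code for the volume-up key
    val frame = encodeIrNec(address = 0x00, command = volumeUpCode)
    // The resulting frame would be handed to the infrared sending module.
    println("IR frame: 0x%08X".format(frame))
}
```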
In some embodiments, the control device 100 includes at least one of the communication interface 130 and the output interface. With the communication interface 130 configured, for example with a WiFi, Bluetooth, or NFC module, user input commands can be encoded and sent to the display device 200 via the WiFi, Bluetooth, or NFC protocol.
The memory 190 stores the various operation programs, data, and applications that drive and control the control device 100, under the control of the controller 110. The memory 190 may store various control signal commands input by the user.
The power supply 180 provides operational power support to the elements of the control device 100 under the control of the controller 110, and may comprise a battery and associated control circuitry.
A schematic diagram of a functional configuration of the display device 200 according to an exemplary embodiment is exemplarily shown in fig. 4. As shown in fig. 4, the memory 290 is used to store an operating system, an application program, contents, user data, and the like, and performs system operations for driving the display device 200 and various operations in response to a user under the control of the controller 210. The memory 290 may include volatile and/or nonvolatile memory.
The memory 290 is specifically configured to store an operating program for driving the controller 210 in the display device 200, and to store various application programs installed in the display device 200, various application programs downloaded by a user from an external device, various graphical user interfaces related to the applications, various objects related to the graphical user interfaces, user data information, and internal data of various supported applications. The memory 290 is used to store system software such as an OS kernel, middleware, and applications, and to store input video data and audio data, and other user data.
The memory 290 is specifically used for storing the drivers and related data for the video processor 260-1 and the audio processor 260-2, the display 280, the communication interface 230, the tuner-demodulator 220, the input/output interface of the detector 240, and the like.
In some embodiments, memory 290 may store software and/or programs representing software programs for an Operating System (OS) including, for example: a kernel, middleware, an Application Programming Interface (API), and/or an application program. For example, the kernel may control or manage system resources, or functions implemented by other programs (e.g., the middleware, APIs, or applications), and the kernel may provide interfaces to allow the middleware and APIs, or applications, to access the controller to implement controlling or managing system resources.
In some embodiments, the memory 290 includes at least one of a broadcast receiving module 2901, a channel control module 2902, a volume control module 2903, an image control module 2904, a display control module 2905, an audio control module 2906, an external instruction recognition module 2907, a communication control module 2908, a light receiving module 2909, a power control module 2910, an operating system 2911, and other applications 2912, a browser module, and the like. The controller 210 performs functions such as: a broadcast television signal reception demodulation function, a television channel selection control function, a volume selection control function, an image control function, a display control function, an audio control function, an external instruction recognition function, a communication control function, an optical signal reception function, an electric power control function, a software control platform supporting various functions, a browser function, and the like.
Fig. 5 is a block diagram illustrating a configuration of a software system in the display device 200 according to an exemplary embodiment.
As shown in fig. 5, the operating system 2911 includes the operating software that handles various basic system services and carries out hardware-related tasks, acting as an intermediary for data processing between application programs and hardware components. In some embodiments, parts of the operating system kernel may contain a series of software to manage the display device's hardware resources and provide services to other programs or software code.
In other embodiments, portions of the operating system kernel may include one or more device drivers, which may be a set of software code in the operating system that assists in operating or controlling the devices or hardware associated with the display device. The drivers may contain code that operates the video, audio, and/or other multimedia components. In some embodiments, a display screen, a camera, Flash, WiFi, and audio drivers are included.
The accessibility module 2911-1 is configured to modify or access the application program to achieve accessibility and operability of the application program for displaying content.
A communication module 2911-2 for connection to other peripherals via associated communication interfaces and a communication network.
User interface modules 2911-3, which are used to provide objects for displaying user interfaces for access by various applications, enable user operability.
Control applications 2911-4 for controlling process management, including runtime applications and the like.
The event transmission system 2914 may be implemented within the operating system 2911 or within the application program 2912; in some embodiments it is implemented partly in the operating system 2911 and partly in the application program 2912. It is configured to listen for various user input events and, according to the recognition results for the various kinds of events or sub-events, invoke the handlers that perform one or more sets of predefined operations.
The event monitoring module 2914-1 is configured to monitor an event or a sub-event input by the user input interface.
The event identification module 2914-2 is configured to hold the definitions of the various kinds of events for the various user input interfaces, recognize the various events or sub-events, and dispatch them to the processes that execute their corresponding one or more sets of handlers.
An event or sub-event refers to an input detected by one or more sensors in the display device 200 or an input from an external control device (e.g., the control apparatus 100), such as sub-events input by voice, gesture sub-events input through gesture recognition, and sub-events input through remote-control key commands of the control device. In some embodiments, the sub-events from the remote control take multiple forms, including but not limited to up/down/left/right key presses, the OK key, and press-and-hold, as well as non-physical key operations such as move, hold, and release.
The interface layout manager 2913 receives, directly or indirectly, the user input events or sub-events monitored by the event transmission system 2914 and updates the layout of the user interface, including but not limited to the position of each control or sub-control in the interface and the size, position, and hierarchy of containers, along with other operations related to interface layout.
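The sketch below models the chain just described: input events are monitored, recognized by type, and dispatched to a handler that updates the layout. All names (InputEvent, EventSystem, the handler) are invented for illustration; they are not the actual modules 2914-1/2914-2.

```kotlin
/** Illustrative event and sub-event types. */
sealed class InputEvent
data class KeyPress(val key: String) : InputEvent()
data class Gesture(val name: String) : InputEvent()

/**
 * Stands in for the monitoring/identification modules: events are matched
 * against known types and handed to a handler, here a layout update.
 */
class EventSystem(private val onRecognized: (String) -> Unit) {
    fun dispatch(event: InputEvent) = when (event) {
        is KeyPress -> onRecognized("key:${event.key}")
        is Gesture  -> onRecognized("gesture:${event.name}")
    }
}

fun main() {
    // The interface layout manager reacts to the recognized events.
    val layoutManager = { id: String -> println("update layout for $id") }
    val events = EventSystem(layoutManager)
    events.dispatch(KeyPress("ok"))         // e.g. the remote control's OK key
    events.dispatch(Gesture("swipe-left"))  // e.g. a recognized hand gesture
}
```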
As shown in fig. 6, the application layer 2912 contains various applications that may also be executed at the display device 200. The application may include, but is not limited to, one or more applications such as: at least one of a live television application, a video-on-demand application, a media center application, an application center, a gaming application, and the like.
The live television application program can provide live television through different signal sources. For example, a live television application may provide television signals using input from cable television, radio broadcasts, satellite services, or other types of live television services. And, the live television application may display video of the live television signal on the display device 200.
A video-on-demand application may provide video from different storage sources. Unlike live television applications, video on demand provides video displays from some storage source. For example, the video on demand may come from a server side of cloud storage, from a local hard disk storage containing stored video programs.
The media center application program can provide various applications for playing multimedia contents. For example, a media center, which may be other than live television or video on demand, may provide services for a user to access various images or audio through a media center application.
The application program center can provide and store various application programs. The application may be a game, an application, or some other application associated with a computer system or other device that may be run on the smart television. The application center may obtain these applications from different sources, store them in local storage, and then be executable on the display device 200.
A schematic diagram of a user interface in a display device 200 according to an exemplary embodiment is illustrated in fig. 7. As shown in fig. 7, the user interface includes a plurality of view display areas, in some embodiments, a first view display area 201 and a play screen 202, wherein the play screen includes a layout of one or more different items. And a selector is included in the user interface indicating that an item is selected, the position of the selector being movable by user input to change the selection of a different item.
It should be noted that the multiple view display areas may present display screens of different hierarchies. For example, a first view display area may present video chat project content and a second view display area may present application layer project content (e.g., web page video, VOD presentations, application screens, etc.).
Optionally, different view display areas have different priorities, and view display areas with different priorities differ in display priority. For example, since the priority of the system layer is higher than that of the application layer, when the user uses the selector and switches pictures within the application layer, the picture displayed in the system layer's view display area is not blocked; and when the size and position of the application layer's view display area change according to the user's selection, the size and position of the system layer's view display area are unaffected.
Display pictures of the same level may also be presented. In that case, the selector can switch between the first view display area and the second view display area, and when the size and position of the first view display area change, the size and position of the second view display area may change accordingly.
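The layering rule can be sketched as follows; the ViewArea model and layer constants are assumptions for illustration, not an actual windowing API.

```kotlin
/** A view display area on a given layer; a higher layer means higher display priority. */
data class ViewArea(val name: String, val layer: Int, var width: Int, var height: Int)

const val APPLICATION_LAYER = 0
const val SYSTEM_LAYER = 1

/** Resizing an area re-lays-out only its own layer; higher layers stay untouched. */
fun resize(area: ViewArea, all: List<ViewArea>, w: Int, h: Int) {
    area.width = w
    area.height = h
    all.filter { it !== area && it.layer == area.layer }
        .forEach { peer -> println("re-layout same-layer peer: ${peer.name}") }
    all.filter { it.layer > area.layer }
        .forEach { upper -> println("unaffected (higher priority): ${upper.name}") }
}

fun main() {
    val chat = ViewArea("video-chat", SYSTEM_LAYER, 320, 180)
    val vod = ViewArea("vod", APPLICATION_LAYER, 1280, 720)
    val web = ViewArea("web-video", APPLICATION_LAYER, 640, 360)
    resize(vod, listOf(chat, vod, web), 960, 540)
}
```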
In some embodiments, any one of the regions in fig. 7 may display a picture captured by the camera.
In some embodiments, controller 210 controls the operation of display device 200 and responds to user operations associated with display 280 by running various software control programs (e.g., an operating system and/or various application programs) stored on memory 290. For example, control presents a user interface on the display, the user interface including a number of UI objects thereon; in response to a received user command for a UI object on the user interface, the controller 210 may perform an operation related to the object selected by the user command.
In some embodiments, some or all of the steps involved in the embodiments of the present application are implemented within the operating system and within the target application. In some embodiments, a target application, referred to as "baby dance", which implements some or all of these steps, is stored in the memory 290; the controller 210 runs this application in the operating system to control the operation of the display device 200 and respond to user operations related to the application.
In some embodiments, the display device obtains the target application, various graphical user interfaces associated with the target application, various objects associated with the graphical user interfaces, user data information, and internal data of various supported applications from a server and stores the aforementioned data information in a memory.
In some embodiments, the display device retrieves media assets, such as picture files and audio-video files, from a server in response to the launch of a target application or user manipulation of a UI object associated with the target application.
It should be noted that the target application is not limited to running on a display device as shown in figs. 1-7; it may also run on other handheld devices that provide voice and data connectivity and have wireless connection capability, or on other processing devices connected to a wireless modem, such as a mobile phone (or "cellular" phone) or a computer with a mobile terminal, and may also be a portable, pocket-sized, hand-held, computer-built-in, or vehicle-mounted mobile device that exchanges data with a radio access network.
Fig. 8 exemplarily shows a user interface, which is one implementation of the display device's system home page. As shown in fig. 8, the user interface displays a plurality of items (controls), including a target item for launching the target application; here, the target item is the "baby dance" item used for exercise. When the display shows the user interface of fig. 8, the user can operate the target item "baby dance" by operating a control device (e.g., the remote control 100), and the controller launches the target application in response to the operation on the target item.
In some embodiments, the target application refers to a functional module that plays an exemplary video in a first video window on the display screen. Wherein the exemplary video refers to a video showing an exemplary action and/or an exemplary sound. In some embodiments, the target application may also play the local video captured by the camera in a second video window on the display screen.
When the controller receives an input instruction indicating that the target application should be launched, it presents the target application's home page on the display in response. Various interface elements such as icons, windows, and controls can be displayed on the home page, including but not limited to a login account information display area (a column box control), a user data (experience value / dance value) display area, a window control for playing recommended videos, a related user list display area, and a media resource display area.
In some embodiments, at least one of the user's nickname, avatar, membership identifier, and membership validity period may be displayed in the login account information display area; data relating the user to the target application, such as experience values / dance values and/or the corresponding star-level identifiers, may be displayed in the user data display area; the related user list display area may display a ranking list (such as an experience value ranking) of users in a predetermined geographic area within a predetermined time period, or a friend list of the user, with each user's experience value / dance value and/or corresponding star-level identifier shown in the ranking list or friend list; and in the media resource display area, media resources are displayed by category. In some embodiments, a plurality of controls can be displayed in the media resource display area, different controls corresponding to different types of media resources, and the user can trigger display of the corresponding type's resource list by operating a control.
In some embodiments, the user data display area and the login account information display area may be a single display area; for example, the data relating the user to the target application is displayed in the login account information display area.
Fig. 9 illustrates one implementation of the target application's home page. As shown in fig. 9, the user's nickname, avatar, membership identifier, and membership validity period are displayed in the login account information display area; the user's dance value and star-level identifier are displayed in the user data display area; the related user list display area displays a "dance master ranking (this week)"; and media resource type controls such as "sprout lesson", "joy lesson", "dazzle lesson", and "my dance works" are displayed in the media resource display area. The user can view the resource list of a given type by operating the corresponding type control with the control device, and can select a resource video to follow and practice from the list of any type. Illustratively, the focus is moved to the "sprout lesson" control; after the user's confirmation operation is received, the "sprout lesson" media resource list interface is displayed, and the corresponding media resource file is loaded and played according to the media resource control the user selects in that list interface.
In addition, the interface shown in fig. 9 includes a window control and an on-demand control for playing the recommended video. The recommended video may be played automatically in the window control shown in fig. 9, or may be played in response to a play instruction input by the user. For example, the user can move the position of the selector (focus) by operating the control device so that the selector falls on the window control for playing the recommended video; with the selector on the window control, the user presses the "OK" key on the control device to input an instruction indicating that the recommended video should be played.
In some embodiments, in response to an instruction indicating launch of the target application, the controller obtains from the server the information to be displayed in the page shown in fig. 9, such as login account information, user data, related user list data, and recommended videos. The controller draws the interface shown in fig. 9 through the graphics processor according to the acquired information and controls its presentation on the display.
In some embodiments, according to the media resource control selected by the user, the controller acquires the media asset ID corresponding to that control and/or a user identifier of the display device, and sends a loading request to the server. The server queries the corresponding video data according to the media asset ID and/or determines the display device's permissions according to the user identifier, and feeds the obtained video data and/or permission information back to the display device. The controller then plays the video according to the video data and/or prompts the user about their permissions according to the permission information.
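As a minimal sketch of this load flow (the patent does not specify a wire format, so the field names below are hypothetical), the display device's request might carry the media asset ID and user identifier as follows:

```python
import json

def build_load_request(asset_id, user_id):
    """Hypothetical request body: the server looks up video data by the
    media asset ID and checks the display device's permissions by user ID."""
    return json.dumps({"assetId": asset_id, "userId": user_id})

# A server response might then carry both pieces of information, e.g.
# {"videoUrl": "...", "permitted": false}; the controller plays the video
# and/or prompts the user about their permissions accordingly.
print(build_load_request("asset-123", "user-456"))
```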
In some embodiments, the target application is not a separate application but a part of the Juhaokan application shown in fig. 8, that is, a functional module of the Juhaokan application. In some embodiments, in addition to title controls such as "my", "movie", "kids", "VIP", "education", and "mall" in the TAB bar of the interactive interface, a "baby dance" title control may also be included. The user can display the corresponding title interface by moving the focus to a different title control; for example, after the focus is moved to the "baby dance" title control, the interface shown in fig. 9 is entered.
With the popularization of intelligent display devices, users increasingly want to be entertained through a large screen, and cultivating interests demands ever more time and money. Through the target application, the present application provides users with a follow-along experience for motion and/or sound skills (such as the movements in dance, gymnastics, fitness, and karaoke scenarios), so that users can learn motion and/or sound skills at home at any time.
In some embodiments, the resource videos presented in a resource list interface (e.g., the "sprout lesson" or "joy lesson" list interfaces in the examples above) include exemplary videos, such as, but not limited to, videos demonstrating dance movements, videos demonstrating fitness movements, videos demonstrating gymnastic movements, videos of song MVs played by the display device in karaoke scenarios, or videos of exemplary avatar movements. In the embodiments of the present application, the user may watch such a teaching or demonstration video and synchronously make the same motions as those demonstrated in it, realizing home dance or home fitness with the display device. Vividly, this function can be called "practice while watching".
In some embodiments, "practice while watching" scenarios are as follows: a user (such as a child or teenager) watches a dance teaching video and practices dance movements; a user (such as an adult) watches a fitness teaching video and practices fitness movements; a user video-connects with friends to sing karaoke; a user sings while following an MV video or an avatar's movements; and so on. For convenience of explanation and distinction, in the "practice while watching" scenario, the action made by the user is called a user action or follow-up action, the action demonstrated in the video is called a demonstration action, the video showing the demonstration action is the demonstration video, and the video of the user's actions captured by the camera is the local video.
In some embodiments, if the display device has an image collector (or camera), the image collector can capture images or a video stream of the user's follow-up actions, so that the user's follow-up process is recorded with pictures or videos as the carrier. The user's follow-up actions are then identified from those pictures or videos, compared with the corresponding demonstration actions, and the user's follow-up performance is evaluated according to the comparison.
In some embodiments, time tags corresponding to standard action frames may be preset in the demonstration video; action matching is then performed by comparing the image frames at and/or near each tag position in the local video against the corresponding standard action frame, and the evaluation is made according to the degree of action matching.

In some embodiments, time tags corresponding to standard audio segments may be preset in the demonstration video; sound matching is then performed by comparing the audio segments at and/or near each tag position in the local audio against the corresponding standard audio segment, and the evaluation is made according to the degree of sound matching.
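The following sketch illustrates one way such tag-based matching could work (the patent does not prescribe a concrete similarity measure, so the joint-distance score and all names below are illustrative): for each preset time tag, the local frames captured at and/or near the tag position are compared against the standard action frame, and the best score is kept.

```python
import math

def similarity(standard_joints, local_joints):
    """Toy similarity: each frame is represented by a list of (x, y) joint
    coordinates; the score decays with the mean joint distance."""
    dists = [math.dist(s, l) for s, l in zip(standard_joints, local_joints)]
    return 1.0 / (1.0 + sum(dists) / len(dists))

def match_at_tags(time_tags, standard_frames, local_frames, window=0.5):
    """time_tags: timestamps (s) preset in the demonstration video;
    standard_frames: tag timestamp -> standard action frame;
    local_frames: list of (timestamp, frame) pairs from the local video;
    window: how far around each tag (s) to look for local frames."""
    scores = []
    for tag in time_tags:
        nearby = [f for t, f in local_frames if abs(t - tag) <= window]
        # a missed action scores zero; otherwise keep the best nearby match
        scores.append(max((similarity(standard_frames[tag], f) for f in nearby),
                          default=0.0))
    return sum(scores) / len(scores)  # overall action matching degree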
In some embodiments, the display interface synchronously presents the local video stream (or local photos) collected by the camera and the demonstration video the user is following. A first video window and a second video window are arranged in the display interface, the first video window for playing the demonstration video and the second video window for playing the local video.
When the display shows the interface of fig. 9, or shows a media resource list interface opened from it, the user can select and play the resource video to be practiced by operating the control device. For convenience of explanation and distinction, the resource video the user selects to practice is referred to as the target video (i.e., the demonstration video corresponding to the selected control).
In some embodiments, in response to an instruction input by the user to follow a target video, the display device's controller acquires the target video from the server according to the media asset ID corresponding to the selected control and detects whether a camera is connected. If a camera is detected, the controller controls the camera to rise and starts it so that it begins collecting the local video stream, and displays the loaded target video and the local video stream on the display simultaneously; if no camera is detected, only the target video is played on the display. In some embodiments, a first playing window and a second playing window are arranged in the display interface used during follow-up (the follow-up interface); after the target video is loaded, in response to no camera being detected, the target video is played in the first playing window and a preset prompt or a black screen is displayed in the second playing window. In some embodiments, when no camera is detected, a no-camera reminder is displayed in a floating layer above the follow-up interface; after the user confirms, the follow-up interface is entered and the target video is played, and when the user inputs an instruction of disagreement, the target application is exited or the previous interface is returned to.
When a camera is detected, the controller sets the first playing window on a first layer of the user interface and the second playing window on a second layer, plays the acquired target video in the first playing window, and plays the picture of the local video stream in the second playing window. The first and second playing windows can be displayed tiled, where tiled display means that multiple windows divide the screen in a certain proportion without overlapping one another.
In some embodiments, the first playing window and the second playing window are formed by window components which are tiled on the same layer and occupy different positions.
Fig. 10a illustrates a user interface showing one implementation of the first and second playing windows. As shown in fig. 10a, the first playing window displays the target video picture and the second playing window displays the picture of the local video stream; the two windows are tiled in the display area of the display, and in some embodiments they have different window sizes.
When no camera is detected, the controller plays the acquired target video in the first playing window and displays a shielding layer or a preset picture file in the second playing window. The first and second playing windows can likewise be displayed tiled.
Fig. 10b illustrates another user interface showing another implementation of the first and second playing windows. Unlike in fig. 10a, in fig. 10b the first playing window displays the target video picture while the second playing window displays a shielding layer, in which the preset text element "no camera detected" is shown.
In some other embodiments, when no camera is detected, the controller sets the first playing window on a first layer of the user interface, and the first playing window is displayed full screen in the display area of the display.
In some embodiments, in the case of a display device having a camera, the controller receives a command from a user to follow a demonstration video, and enters a follow-up interface to directly play the demonstration video and the local video stream.
In other embodiments, the controller, upon receiving an instruction to follow the demonstration video, first enters the guidance interface, and only displays the local video frame in the guidance interface without playing the demonstration video frame.
In some embodiments, since the camera is a concealable camera that is hidden within or behind the display when not in use, the controller controls the raising and opening of the camera when the camera is invoked, where raising means extending the camera out of the frame of the display so that it can begin capturing images.
In some embodiments, to increase the camera's shooting angle, the camera may be rotated in the lateral or longitudinal direction, where lateral refers to the horizontal direction when video is viewed normally and longitudinal to the vertical direction. The captured image can also be adjusted by adjusting the camera's focal length along the depth direction perpendicular to the display screen.
In some embodiments, when no moving target (i.e., human body) exists in the local video picture, or when a moving target exists but the offset of its target position relative to a preset desired position is greater than a preset threshold, a graphic element identifying the preset desired position is presented above the local video picture, and a prompt control guiding the moving target toward the desired position is presented above the local video picture according to the offset of the target position relative to the desired position.
The moving target (human body) is the local user, and in different scenarios there may be one or more moving targets in the local video picture. The desired position is a position set according to the collection region of the image collector; when the moving target (i.e., the user) is at the desired position, the local image collected by the image collector is the most useful for analyzing and comparing the user's actions.
In some embodiments, the prompt control graphic for guiding the moving target to the desired position contains an arrow indicating a direction, with the arrow pointing toward the desired position.
In some embodiments, the desired position is represented by a graphic frame displayed on the display; the controller places the graphic frame in a floating layer above the local video picture according to the camera's position and angle and a preset mapping relation, so that the user can intuitively see where to move.
In use, the user should stand at a reasonable, preset position in front of the display device. Because the images collected by the camera differ with its lifting height and/or rotation angle, the preset position of the graphic frame needs to be adjusted adaptively, so that under its guidance the user stands at a reasonable preset position in front of the display device.
In some embodiments, the position of the graphic frame is mapped from the camera's lifting height and/or rotation angle according to a preset mapping relation. (The mapping table is reproduced only as images in the original publication.)
in some embodiments, the video window for playing the local video picture is located on a first layer, the prompt control and/or the graphic frame is located on a second layer, and the second layer is located above the first layer.
In some embodiments, the controller may display the video window playing the local video picture in a second layer of the display interface; at this point the follow-up interface either has not been loaded or sits in the page stack in the background.
In some embodiments, the prompt control guiding the moving target to the desired position may be an interface prompt identifying the target moving direction and/or a voice prompt announcing the target moving direction.
The target moving direction is obtained from the offset of the target position relative to the desired position. It should be noted that when one moving target exists in the local video picture, the target moving direction is obtained from the offset of that target's position relative to the desired position; when multiple moving targets exist, the target moving direction is obtained from the smallest of the offsets corresponding to the multiple moving targets.
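A minimal sketch of this selection rule (names and threshold are illustrative): with several detected targets, the direction prompt is derived from the offset with the smallest magnitude.

```python
import math

THRESHOLD = 10  # pixels; illustrative dead zone around the desired position

def moving_direction(offsets):
    """offsets: list of (dx, dy) per detected moving target, where
    dx = desired_x - target_x in the preset coordinate system."""
    dx, dy = min(offsets, key=lambda o: math.hypot(o[0], o[1]))
    if math.hypot(dx, dy) <= THRESHOLD:
        return None  # already close enough to the desired position
    return "move a little to the right" if dx > 0 else "move a little to the left"

print(moving_direction([(120, 4), (-300, 10)]))  # -> "move a little to the right"
```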
In some embodiments, the prompt control may be an arrow prompt, whose direction may be determined from the target moving direction so as to point toward the graphic element 112.
In some embodiments, a floating layer with a transparency greater than a preset transparency (e.g., 50%) is presented above the local video frame, such as a semi-transparent floating layer, and a graphic element for identifying a desired position is displayed in the floating layer, so that a user can view the local video frame of the local video through the floating layer.
In some embodiments, another floating layer with transparency greater than a preset transparency (e.g., 50%) is presented above the local video picture, and a graphic element for identifying a target moving direction is displayed in the floating layer as a prompt control for guiding the user to move the position.
In some embodiments, the graphical element used to identify the desired position and the cueing control used to identify the direction of movement of the target are displayed in the same floating layer.
Fig. 11 illustrates a user interface in which the local video picture is displayed substantially full screen, with a semi-transparent floating layer above it. In the floating layer, the target moving direction is identified by graphic element 111 and the desired position by graphic element 112, and the two elements do not coincide in position. The moving target (user) can gradually move to the desired position following the direction identified by graphic element 111; when the moving target in the local video picture reaches the desired position, its outline maximally overlaps graphic element 112. In some embodiments, graphic element 112 is a graphic frame.
In some embodiments, the target movement direction may also be identified by an interface text element, such as "move a little to the left" as exemplarily shown in fig. 11.
In some embodiments, the display device's controller receives a preset instruction, such as an instruction to follow a demonstration video, and in response controls the image collector to collect local images to generate a local video stream; presents the local video picture in the user interface; and detects whether a moving target exists in the local video picture. When a moving target exists, the position coordinates of the moving target and of the desired position in a preset coordinate system are obtained respectively, where the moving target's coordinates are a quantized representation of the target position and the desired position's coordinates are a quantized representation of the desired position. The offset of the target position relative to the desired position is then calculated from these two sets of coordinates.
In some embodiments, the position coordinates of the moving object in the preset coordinate system may be a position coordinate point set of the contour of the moving object (i.e., the object contour) in the preset coordinate system. Illustratively, the target profile 121 is shown in FIG. 12.
In some embodiments, the target contour includes a torso portion and/or a target reference point, where the target reference point may be a midpoint of the torso portion or a center point of the target contour. Illustratively, the torso portion 1211 and the target reference point 1212 are shown in fig. 12. In these embodiments, acquiring the position coordinates of the moving object in the preset coordinate system includes: identifying a target contour from the preview picture, wherein the target contour comprises a torso part and/or a target reference point; and acquiring the position coordinates of the trunk part and/or the target reference point in a preset coordinate system.
In some embodiments, the graphic element used to identify the desired position includes a graphic torso part and/or a graphic reference point corresponding to the target reference point in the above embodiments; that is, if the target reference point is the midpoint of the torso part, the graphic reference point is the midpoint of the graphic torso part, and if the target reference point is the center point of the target contour, the graphic reference point is the center point of the graphic element. Illustratively, the graphic torso part 1221 and the graphic reference point 1222 are shown in fig. 12. In these embodiments, obtaining the position coordinates of the desired position in the preset coordinate system means obtaining the position coordinates of the graphic torso part and/or the graphic reference point in the preset coordinate system.
In some embodiments, the offset of the target position relative to the desired position is calculated from the position coordinates of the target torso part and of the graphic torso part in the preset coordinate system.
In some embodiments, the origin of the preset coordinate system may be any point set in advance. Taking the origin as the pixel at the lower-left corner of the display screen as an example, the torso part can be identified by the coordinates of two diagonal points, or by the coordinates of at least two other points. If the target torso part's coordinates are (X1, Y1; X2, Y2) and the graphic torso part's coordinates are (X3, Y3; X4, Y4), the position offset between the two is (X3 - X1, Y3 - Y1; X4 - X2, Y4 - Y2). The user can then be reminded according to the correspondence between offsets and prompts, so that the overlap of the target torso part and the graphic torso part meets the preset requirement.
In some embodiments, the offset between the target torso part and the graphic torso part may be measured by the graphics' overlap area, and the user may be alerted that the position adjustment succeeded when the overlap area, or the overlap area ratio, reaches a preset threshold.
In some embodiments, when the user moves to the left, the user is alerted that the position adjustment succeeded once the target torso part completes its overlap with the right border of the graphic torso part. This ensures the user has entered the identification area in full.
Likewise, in some embodiments, when the user moves to the right, the user is alerted that the position adjustment succeeded once the target torso part completes its overlap with the left border of the graphic torso part.
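The overlap-based check in the preceding paragraphs might be sketched as follows (the rectangle representation and the 0.9 threshold are assumptions; the patent only states that a preset threshold is used):

```python
def overlap_ratio(target, graphic):
    """target, graphic: axis-aligned rectangles ((x1, y1), (x2, y2))
    with x1 < x2 and y1 < y2, in the preset coordinate system."""
    (tx1, ty1), (tx2, ty2) = target
    (gx1, gy1), (gx2, gy2) = graphic
    w = max(0.0, min(tx2, gx2) - max(tx1, gx1))
    h = max(0.0, min(ty2, gy2) - max(ty1, gy1))
    graphic_area = (gx2 - gx1) * (gy2 - gy1)
    return (w * h) / graphic_area  # share of the graphic torso part covered

def adjustment_successful(target, graphic, threshold=0.9):
    return overlap_ratio(target, graphic) >= threshold
```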
In other embodiments, the offset of the target position relative to the desired position is calculated based on the position coordinates of the target reference point in the preset coordinate system and the position coordinates of the graphic reference point in the preset coordinate system.
In some embodiments, the origin of the preset coordinate system may be any point set in advance. Taking the origin as the pixel at the lower-left corner of the display screen as an example, if the target reference point 1212 has coordinates (X1, Y1) and the graphic reference point 1222 has coordinates (X2, Y2), the position offset between the two is (X2 - X1, Y2 - Y1). When X2 - X1 is positive, a cue is given on the left side of graphic element 112 and/or a "move a little to the right" prompt is shown; when X2 - X1 is negative, a cue is given on the right side of graphic element 112 and/or a "move a little to the left" prompt is shown.
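A sketch of the reference-point variant just described, with the origin at the lower-left pixel of the display (function names are illustrative):

```python
def reference_point_prompt(target_pt, graphic_pt):
    """target_pt = (x1, y1): target reference point 1212;
    graphic_pt = (x2, y2): graphic reference point 1222."""
    dx = graphic_pt[0] - target_pt[0]  # X2 - X1
    if dx > 0:
        return "move a little to the right"  # cue shown left of element 112
    if dx < 0:
        return "move a little to the left"   # cue shown right of element 112
    return None  # horizontally aligned with the desired position

print(reference_point_prompt((400, 300), (640, 300)))  # dx > 0 -> move right
```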
In some embodiments, the controller further obtains the focal distance at which the human body is located and, by comparison with a preset focal distance, prompts the user to move forward or backward a little.
In some embodiments, the controller further tells the user the specific distance to move left or right, according to the ratio between the focal distance at the human body's position and the preset focal distance, together with the user's offset value in the X direction. For example, when the ratio is 0.8 and the X-direction offset is +800 px, the user is reminded to move 10 centimeters to the right; when the ratio is 1.2 and the offset is +800 px, to move 15 centimeters to the right; when the ratio is 0.8 and the offset is -800 px, to move 10 centimeters to the left; and when the ratio is 1.2 and the offset is -800 px, to move 15 centimeters to the left.
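The following reproduces the worked examples above as a small conversion sketch; the two-entry lookup mirrors only the ratios given in the text, whereas a real system would presumably interpolate over a continuous calibration.

```python
CM_PER_800PX = {0.8: 10, 1.2: 15}  # focal-distance ratio -> cm per 800 px

def move_prompt(focal_ratio, dx_px):
    """focal_ratio: focal distance at the body / preset focal distance;
    dx_px: user's offset in the X direction, in pixels."""
    cm = CM_PER_800PX[focal_ratio] * abs(dx_px) / 800
    side = "right" if dx_px > 0 else "left"
    return f"please move {cm:.0f} centimeters to the {side}"

print(move_prompt(0.8, 800))    # -> please move 10 centimeters to the right
print(move_prompt(1.2, -800))   # -> please move 15 centimeters to the left
```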
In some embodiments, when the offset value is smaller than a preset threshold, the user is reminded that the position adjustment succeeded.
In some embodiments, the predetermined coordinate system is a three-dimensional coordinate system, and the position coordinates of the moving object and the desired position in the predetermined coordinate system are three-dimensional coordinates, and the offset of the object position relative to the desired position is a three-dimensional offset vector.
In some embodiments, assuming the target reference point has position coordinates (X1, Y1, Z1) in the preset coordinate system and the graphic reference point has position coordinates (X2, Y2, Z2), the offset vector of the target position relative to the desired position is calculated as (X2 - X1, Y2 - Y1, Z2 - Z1).
In some embodiments, when the offset of the target position relative to the desired position is not greater than the preset threshold, the display of the graphic element identifying the desired position, or of the interface prompt identifying the target moving direction, is cancelled; a first video window for playing the demonstration video and a second video window for playing the local video picture are arranged in the user interface, tiled with each other; and the local video picture is played in the second video window while the demonstration video plays in the first, as in the user interface shown in fig. 10a.
It should be noted that, in the above examples, the target position being offset from the desired position may mean that the offset between them is greater than a preset offset; correspondingly, the target position not being offset from the desired position may mean that the offset between them is smaller than the preset offset.
In the above embodiments, after receiving the instruction to follow the practice video, the controller does not immediately play it to start the follow-up process. It first displays only the local video picture and, by presenting above it the graphic element identifying the preset desired position and the prompt guiding the moving target there, moves the moving target (user) to the desired position, so that during the subsequent follow-up the image collector can capture the images that best support analyzing and comparing the user's actions.
In some embodiments, the display device may control the camera's rotation in the horizontal or vertical direction according to whether the device is standing or wall-mounted; to meet the same requirement, the rotation angles differ between the two placement states.
The human body is detected continuously. In some embodiments, when the deviation between the position coordinates of the target reference point and of the graphic reference point in the preset coordinate system meets the preset requirement, and/or the overlap between the target torso part and the graphic torso part meets the preset requirement, the controller dismisses the guidance interface and displays the follow-up interface.
In some embodiments, the display shows the interface of fig. 10a while the user follows a resource video. When the display shows the interface of fig. 10a, the user can trigger display of a floating layer containing controls by operating a designated key on the control device (in some embodiments, the down key); in response to the user operation, a control floating layer is presented on the follow-up interface, as shown in fig. 13 or 14, including at least one of a control for selecting a resource video, a control for adjusting the playing speed, and a control for adjusting the definition. The user can move the focus position by operating the control device to select a control in the control floating layer. When the focus falls on a control, a sub-floating layer corresponding to that control is presented, displaying at least one sub-control; for example, when the focus falls on the control for selecting a resource video, the corresponding sub-floating layer presents controls for several different resource videos. A sub-floating layer is a floating layer positioned above the control floating layer. In some embodiments, the controls of the sub-floating layer may instead be implemented by adding controls to the control floating layer.
Fig. 13 exemplarily shows an application interface (a play control interface) in which a control floating layer is displayed above the layer containing the first and second playing windows. The control floating layer includes a selection control, a double-speed play control, and a definition control; since the focus is on the selection control, the sub-floating layer corresponding to it is also presented, displaying controls for several other resource videos. In the interface shown in fig. 13, the user can select another resource video to play and follow by moving the focus position.
In some embodiments, when the display shows the interface of fig. 13, the user may move the focus to select the double-speed play control; in response to the focus falling on it, the corresponding sub-floating layer is presented, as shown in fig. 14. That sub-floating layer displays several sub-controls for adjusting the playing speed of the target video; when one of them is operated, the playing speed is adjusted to the speed corresponding to the operated control in response to the user's operation. For example, the interface shown in fig. 14 displays "0.5x", "0.75x", and "1x".
In another embodiment, when the display shows the interface of fig. 13 or fig. 14, the user may move the focus to select the definition control; in response to the focus falling on it, the corresponding sub-floating layer is presented, as shown in fig. 15. That sub-floating layer displays several controls for adjusting the definition of the target video; when one of them is operated, the definition is adjusted to the definition corresponding to the operated control in response to the user's operation. For example, the interface shown in fig. 15 displays "720P high definition" and "1080P ultra definition".
In some embodiments, when the control floating layer is presented in response to a user operation, the focus is displayed on a preset default control, which may be any one of the controls in the control floating layer. For example, as shown in fig. 13, the preset default control is the selection control.
In some embodiments, the other resource videos displayed in the sub-floating layer corresponding to the selection control are sent to the display device by the server. For example, in response to the user selecting the selection control, the display device requests from the server the media resource information to be displayed in the selection list, such as resource names or resource covers. After receiving the media resource information returned by the server, the display device displays it in the selection list.
In some embodiments, to help the user distinguish the resources in the selection list, the server, after receiving the display device's request, queries the user's historical follow-up records by user ID to obtain the resource videos the user has practiced. If the media resource information sent to the display device includes a resource video the user has practiced, an identifier indicating that the user has practiced that video is added to its resource information. Accordingly, when the display device displays the selection list, the practiced resource videos are marked, such as the "learned" mark displayed in the interface shown in fig. 13.
In some embodiments, to help the user distinguish the resources in the selection list, the server, after receiving the display device's request, determines whether the requested selection list resources include newly added ones, for example by comparing the selection list resources last sent to the display device with the current ones; if so, an identifier indicating a newly added video is attached to the corresponding resource information. Accordingly, when the display device displays the selection list, the newly added resource videos are marked, such as the "update" mark displayed in the interface shown in fig. 13.
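A compact sketch of how the server might attach these two marks when assembling the selection list (all field names are hypothetical; the patent specifies only the behavior):

```python
def decorate_assets(assets, practiced_ids, previously_sent_ids):
    """assets: list of dicts with an 'id' key, the resource info to send;
    practiced_ids: asset IDs found in the user's follow-up history;
    previously_sent_ids: asset IDs in the list last sent to this device."""
    for asset in assets:
        asset["learned"] = asset["id"] in practiced_ids        # "learned" mark
        asset["new"] = asset["id"] not in previously_sent_ids  # "update" mark
    return assets

lst = decorate_assets([{"id": "a1"}, {"id": "a2"}], {"a1"}, {"a1"})
print(lst)  # a1 -> learned, a2 -> newly added
```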
In some embodiments, in response to a user-input instruction to follow a demonstration video, the controller obtains the demonstration video from the server, or obtains a pre-downloaded copy from local storage, according to the demonstration video's resource identifier.
In some embodiments, an exemplary video includes the image data and audio data described above. The image data comprises a video frame sequence showing the movements the user needs to follow, such as leg-lifting or squatting movements. The audio data may be narration audio for the exemplary actions and/or background audio (e.g., background music).
In some embodiments, the controller processes the demonstration video by controlling the video processor to parse out displayable image signals and audio signals; the audio signals are processed by the audio processor and then played in synchronization with the image signals.
In some embodiments, the exemplary video includes the image data, the audio data and the subtitle data corresponding to the audio data, and the controller plays the image, the audio and the subtitle synchronously when playing the exemplary video.
As previously described, an exemplary video comprises a video frame sequence whose frames are displayed over time under the controller's play control, showing the user the changes in limb form that make up each action. The user needs to go through those changes in limb form to complete each action, and the embodiments of the present application analyze and evaluate the user's action completion from the recorded limb forms. In some embodiments, continuous joint data is extracted from the local video during the follow-up process and compared with a motion model of the joints obtained in advance from the video frame sequence of the exemplary video, so as to determine the degree of action matching.
In some embodiments, the change of limb form required to complete a key action (i.e., the limb's motion trajectory) is described as the progression from an incomplete-state action to a complete-state action and then to a release action; that is, the incomplete-state action occurs before the complete-state action, the release action follows it, and the complete-state action is the key action to be completed. In some embodiments, complete-state actions are also called key demonstration actions or key actions. In some embodiments, tags may be added to identify this limb change process, with different tags preset in the action frames of the different stages.
On this basis, in some embodiments, the frames showing key actions in a resource video's video frame sequence are called key frames, and key tags corresponding to these key frames are marked on the resource video's time axis; that is, the time point represented by a key tag is the time point at which the corresponding key frame is played. The key frames in the video frame sequence constitute the key frame sequence.
Further, an exemplary video may include a key frame sequence containing a number of key frames, each key frame corresponding to one key tag on the time axis and showing one key action. In some embodiments, this key frame sequence is also called the first key frame sequence.
In some embodiments, N groups of start-stop tags are preset on the time axis of a resource video (including a demonstration video), corresponding to N video clips, each clip showing one action (also called a complete-state action or key action). Each group of start-stop tags includes a start tag and an end tag: while the resource video plays, when the progress mark on the time axis reaches a start tag, the demonstration of an action begins playing, and when it reaches the corresponding end tag, the demonstration of that action finishes playing.
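An illustrative data shape for these start-stop tags (action names and timestamps are made up), together with a lookup that tells whether the progress mark currently sits inside a key segment:

```python
# N groups of start-stop tags on the time axis, one per demonstrated action
key_segments = [
    {"action": "leg lift", "start": 5.0, "end": 15.0},
    {"action": "squat",    "start": 22.0, "end": 30.0},
]

def current_segment(position, segments=key_segments):
    """Return the key segment the progress mark is inside, or None."""
    for seg in segments:
        if seg["start"] <= position < seg["end"]:
            return seg
    return None

print(current_segment(8.0)["action"])  # -> leg lift
print(current_segment(18.0))           # -> None (a non-key passage)
```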
Because individual factors such as learning ability and body coordination differ between users, some users (such as children) move slowly and find it difficult to stay synchronized with the demonstration video's playing speed.
To solve this problem, in some embodiments, during playback of the demonstration video, the playing speed is automatically reduced when the demonstration of an action begins playing, so that the user can better learn and practice the key action, avoid missing it, and correct their own action in time; when the demonstration of the action (i.e., the video clip showing it) finishes playing, the original playing speed is automatically restored.
In some embodiments, video segments showing key actions are called key segments, and an exemplary video generally includes a number of key segments and at least one non-key segment (or other segment). A non-key segment is a video segment in the demonstration video that does not show a key action, for example a passage in which the demonstrator stands still while explaining the action to the audience.
In some embodiments, the controller controls display of a user interface on the display, the user interface including a window for playing video. In response to an input instruction to play a demonstration video, it acquires the demonstration video, which comprises several key segments, each showing a key action the user needs to practice when played; in some embodiments, the exemplary video the user indicates should be played is also called the target video. The controller plays the exemplary video in the window at a first speed; when a key segment begins playing, it adjusts the playing speed from the first speed to a second speed; when the key segment finishes playing, it adjusts the speed from the second speed back to the first speed; the second speed is different from the first speed.
In some embodiments, the controller plays the demonstration video while detecting the start and end tags on its time axis: when a start tag is detected, the speed of playing the demonstration video is adjusted from the first speed to the second speed; when the end tag is detected, it is adjusted from the second speed back to the first speed. A start tag marks the beginning of a key segment's playback, and an end tag marks its completion.
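A minimal sketch of this tag-driven speed policy (speed values follow the examples in the surrounding text; tag positions are illustrative):

```python
FIRST_SPEED = 1.0   # normal playback speed
SECOND_SPEED = 0.5  # slowed speed while a key segment plays

def speed_for_position(position, tag_pairs):
    """tag_pairs: list of (start, end) tag positions on the demo video's
    time axis; returns the speed the video should be playing at now."""
    in_key_segment = any(start <= position < end for start, end in tag_pairs)
    return SECOND_SPEED if in_key_segment else FIRST_SPEED

tags = [(5.0, 15.0), (22.0, 30.0)]
assert speed_for_position(8.0, tags) == SECOND_SPEED   # inside a key segment
assert speed_for_position(16.0, tags) == FIRST_SPEED   # back to normal speed
```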
In some embodiments, the second speed is lower than the first speed.
In the above example, since the second speed is lower than the first, playback automatically slows when a start tag is detected (i.e., when the progress mark on the time axis reaches it), adapting the demonstration video's playing speed to the user's action speed, and automatically returns to the first speed when the end tag is detected.
In some embodiments, the first speed is the normal play speed, i.e., 1x, and the second speed may be a preset 0.75x or 0.5x.
In some embodiments, the exemplary video file includes video frame data and audio data, and the same sampling rate is used to read and process both during playback. Therefore, when the playing speed needs to be adjusted, the playing speed of the video frames and that of the audio signal are adjusted together, keeping sound and picture in synchronized playback.
In other embodiments, the exemplary video file comprises video frame data and audio data whose sampling rates are regulated independently during playback, so that when the playing speed needs to be adjusted, only the sampling rate of the video frame data is changed to adjust the video frame playing speed, while the audio data's sampling rate is left unchanged and the audio signal's playing speed stays the same. For example, when the playing speed needs to be reduced, the audio's playing speed is not reduced, so the user can still hear the narration normally while watching the slowed action demonstration.
In some embodiments, a key segment includes its video data and its audio data. When the key segment begins playing, the speed of playing its video data is adjusted to the second speed while the speed of playing its audio data stays at the first speed; when the key segment finishes playing, the speed of playing the next segment's video data is adjusted to the first speed, and the next segment's audio data plays synchronously at the first speed, the next segment being the file segment immediately following the key segment in the exemplary video, for example an adjacent other segment.
In some embodiments, while the video picture plays at the low speed, it is detected whether the key segment has finished playing (for example, whether the end tag has been detected). If the end tag has not yet been detected when the audio data for the corresponding period finishes playing, that audio data may be played repeatedly; for example, when the video picture plays at 0.5x, the audio data for the period may be played twice. Once the video frame data of the period finishes playing, that is, after the end tag is detected, the audio data and video frame data of the next period can be played synchronously.
In other embodiments, while the video picture plays at the low speed, it is detected whether the key segment has finished playing (for example, whether the end tag has been detected). If the end tag has not yet been detected when the audio data for the corresponding period finishes playing, audio playback is paused until the video frame data of the period finishes playing; that is, after the end tag is detected, the audio data and video frame data of the next period can be played synchronously. For example, suppose the start tag is at 0:05 and the end tag at 0:15 on the time axis. When the video picture plays at 0.5x, the video frame data for the 0:05-0:15 period takes 20 s to play, while the corresponding audio data takes 10 s; therefore, to keep sound and picture synchronized in the period after 0:15, audio playback is paused when the progress mark on the time axis reaches 0:10 and continues when the progress mark reaches 0:15.
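The pause point in this example can be computed directly, as the following sketch shows (the helper name is illustrative): the segment's audio plays at 1x and therefore runs out after a wall-clock time equal to the segment's media length, during which the 0.5x video has advanced only half as far.

```python
def audio_pause_position(start, end, video_speed):
    """Position on the demo video's own time axis at which the key
    segment's 1x audio finishes and must pause to wait for the video."""
    media_span = end - start            # e.g. 10 s between 0:05 and 0:15
    wall_time_for_audio = media_span    # the audio plays the span at 1x
    return start + wall_time_for_audio * video_speed

# start tag at 0:05, end tag at 0:15, video at 0.5x:
print(audio_pause_position(5.0, 15.0, 0.5))  # -> 10.0: pause audio at 0:10
# audio then resumes when the progress mark reaches the end tag at 0:15
```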
In some embodiments, during the user's follow-up, only the playing speed of the exemplary video is adjusted automatically; the playing speed of the local video stream is not adjusted.
In some embodiments, the controller controls display of a user interface on the display, the user interface including a first playing window for playing the exemplary video and a second playing window for playing the local video stream. In response to an input instruction indicating that a demonstration video should be played, it acquires the demonstration video, plays it in the first playing window, and plays the local video stream in the second playing window. The other segments of the demonstration video play in the first window at the first speed and its key segments at the second speed, the second speed being lower than the first; the local video stream plays in the second window at a fixed preset speed.
In some embodiments, the fixed preset speed may be the first speed. In some embodiments, if the user's age falls within a predetermined age range, the speed is automatically reduced when the demonstration of a key action begins playing, taking into account the weaker learning ability and body coordination of young users.
In some embodiments, if the user's age is in a first age interval, the exemplary video is played at a first speed; if the user's age is in a second age interval, the exemplary video is played at a second speed, wherein the second speed is different from the first speed.
In some embodiments, the first and second age intervals are divided by a predetermined age; for example, the interval above the predetermined age is defined as the first age interval, and the interval at or below it as the second age interval. For example, the first or second age interval may be the age interval of preschool children (e.g., 1-7 years old), of school-age children, of young adults, of the middle-aged, or of the elderly.
It should be noted that those skilled in the art can set the first and second speeds according to the specific value ranges of the first and second age intervals, on the principle of adapting the exemplary video's playing speed to the user's learning and action ability.
It should be noted that the first and second age intervals are only exemplary; in other embodiments, corresponding playing speeds may be set for more age intervals as needed, with the exemplary video played at the corresponding speed when the user's age falls in the corresponding interval. For example, the exemplary video is played at a third speed when the user's age is in a third age interval, at a fourth speed when it is in a fourth age interval, and so on.
In some embodiments, the user's age is in the first age interval when it lies between a first starting age and a first ending age, and in the second age interval when it lies between a second starting age and a second ending age.
In some embodiments, there may be two age intervals, bounded by a predetermined age.
In some embodiments, when the age of the user is higher than a preset age, controlling the display to play the demonstration video at a first speed; when the age of the user is not higher than a preset age, controlling the display to play the demonstration video at a second speed; wherein the second speed is lower than the first speed.
In some embodiments, if the user's age is not higher than the preset age, or is in the second age interval, the playing speed of the demonstration video is adjusted to the second speed when a key segment starts playing, and adjusted from the second speed back to the first speed when the key segment finishes playing.
In some embodiments, when a key segment starts playing, the speed at which the display plays the key segment's video data is adjusted from the first speed to the second speed, while the speed at which the audio output unit plays its audio data stays at the first speed; after the key segment's audio data finishes playing, the audio output unit is controlled either to pause that audio or to play it in a loop. The audio output unit is display device hardware, such as a speaker, for playing audio data.
In some embodiments, when the key segment is finished playing, the display is controlled to play the video data of the next segment at the first speed, and the audio output unit is controlled to synchronously play the audio data of the next segment at the first speed, wherein the next segment is the segment of the exemplary video after the key segment.
In some embodiments, if the age of the user is not higher than a preset age, controlling the display to play the video data of the exemplary video at a second speed; and controlling the audio output unit to play the audio data of the exemplary video at the first speed.
In a specific implementation, the controller acquires the user's age and judges whether it is lower than a preset age. If so, it detects the start-stop tags on the time axis while the demonstration video plays, adjusting the playing speed from the first speed to the second speed when a start tag is detected and from the second speed back to the first speed when the end tag is detected.
In some embodiments, the controller acquires user information from the user ID, and acquires age information of the user from the user information.
In other embodiments, the controller activates the image collector in response to a user instruction to play a demonstration video, identifies a person image in the local image acquired by the image collector, and identifies the age of the user from the identified person image using a preset age identification model.
In some embodiments, different low speed parameters may be set for different age ranges, e.g., if the user is "3-5 years old", then the second speed is 0.5 times speed; if the user is "6-7 years old", the second speed is 0.75 times speed.
As previously mentioned, the demonstration video has a specified type, such as the aforementioned "sprout lesson", "music lesson", etc., which can be characterized by a type identifier. In view of the differences in audience and exercise difficulty between different types of videos, in some embodiments, if the type of the demonstration video is a preset type, the speed is automatically reduced when the demonstration of a key action starts to play; if it is not the preset type, the whole video is played at normal speed unless the user manually adjusts the speed.
In some embodiments, the controller obtains a type identifier of the demonstration video, detects a start-stop tag on a time axis during playing the demonstration video if the demonstration video is determined to be a preset type according to the type identifier, adjusts the playing speed of the demonstration video from a first speed to a second speed when the start tag is detected, and adjusts the playing speed of the demonstration video from the second speed to the first speed when the end tag is detected.
In some embodiments, the resource information sent by the server to the display device includes the type identifier of the resource, so that the display device can determine whether the demonstration video is of a preset type according to its type identifier, where the preset type includes, but is not limited to, the types of part or all of the resources provided by the juvenile channel, or juvenile resources provided by other channels.
In some embodiments, different low speed parameters may be set for different types, e.g., if the exemplary video belongs to a "sprout class," then the second speed is 0.5 times speed; if the exemplary video belongs to a "happy lesson," then the second speed is 0.75 times speed.
In some embodiments, the playing speed can be adjusted automatically according to how well the user follows along, so that the low-speed playing mechanism adapts to different users: the parts of the demonstration video that the user can follow easily are played at normal speed, and the parts that the user cannot follow smoothly are played at low speed.
For convenience of illustration and distinction, the present application refers to the video frame sequence comprised by the demonstration video as a first video frame sequence, where the first video frame sequence comprises first key frames for displaying completed-state actions, N first key frames corresponding to N completed-state actions constituting a first key frame sequence; of course, the first video frame sequence also comprises non-key frames for displaying uncompleted-state actions and release actions.
In some embodiments, in response to an instruction indicating follow-up of the demonstration video, the controller starts the image collector and acquires a follow-up video stream of the user from the local video stream collected by the image collector, wherein the follow-up video stream comprises part or all of the video frames in the local video stream. For distinction, the present application refers to the sequence of video frames in the follow-up video stream as a second video frame sequence, comprising second video frames for recording user actions.
In some embodiments, the user actions are analyzed from the follow-up video stream. If it is detected, at one or several continuous time points (or time periods) at which a completed-state action should be made, that the user has not made the corresponding completed-state action, that is, the user action is regarded as an uncompleted-state action, this indicates that the action is relatively difficult for the user to follow, and the playing speed of the demonstration video can be reduced. If it is detected, at such time points, that the user has already completed the corresponding completed-state action, that is, the user action is regarded as a release action, this indicates that the action is easy for the user to follow, and the playing speed of the demonstration video can be increased.
In some embodiments, in response to an input instruction indicating to follow a demonstration video, the controller acquires the demonstration video and acquires a follow-up video stream of the user from the local video stream collected by the image collector, wherein the demonstration video comprises a first key frame sequence for displaying completed-state actions, and the follow-up video stream comprises a second video frame sequence for displaying user actions; the controller plays the demonstration video on the display and adjusts its playing speed when the user action in the second video frame corresponding to a first key frame does not match the completed-state action displayed by that key frame.
The second video frame corresponding to the first key frame is extracted from the second video frame sequence according to the time information of the played first key frame.
In some embodiments, the time information of the first key frame may be a time when the display device plays the frame, and the second video frame corresponding to the time is extracted from the second video frame sequence according to the time when the display device plays the first key frame, that is, the second video frame corresponding to the first key frame. The second video frame corresponding to a certain time may be the second video frame with the timestamp of the time, or the second video frame with the time shown by the timestamp closest to the time.
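A minimal sketch of this timestamp-based extraction, assuming the local frames carry timestamps in seconds:

```python
# Pick the second video frame whose timestamp is closest to the play time
# of the first key frame; the (timestamp, frame) representation is assumed.
def find_corresponding_frame(local_frames, key_frame_time_s):
    """local_frames: list of (timestamp_s, frame) pairs."""
    timestamp, frame = min(local_frames,
                           key=lambda tf: abs(tf[0] - key_frame_time_s))
    return frame
```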
In some embodiments, the body may pass through the same position both while preparing for an action and while releasing it, so the second video frame and other adjacent video frames can be extracted, and after the joint data of successive frames is extracted, it can be determined whether the action is a preparation action or a release action.
In some embodiments, the controller extracts the corresponding second video frame from the second video frame sequence according to the played first key frame, and sends the extracted second video frame (and the corresponding first key frame) to the server; and the server judges whether the user action in the second video frame is matched with the completion state action displayed by the first key frame or not by comparing the corresponding first key frame with the second video frame. And when the server judges that the user action in the second video frame is not matched with the completion state action displayed by the corresponding first key frame, returning a speed adjusting instruction to the display equipment.
In some embodiments, the controller performs joint point identification (i.e., user motion identification) on the second video frame and/or other video frames locally at the display device and uploads the joint point data and the corresponding time points to the server. The server determines the corresponding target demonstration video frame according to the received time point, compares the received joint point data with the joint point data of the target demonstration video frame, and feeds the comparison result back to the controller.
In some embodiments, the case where the user action in the second video frame does not match the completion state action presented by the corresponding first keyframe comprises: the user action in the second video frame is used as an unfinished state action before the finished state action; the user action in the second video frame is a release action after the completion state action. Based on this, if the server determines that the user action in the second video frame is an uncompleted state action, returning an instruction indicating a speed reduction to the display device to cause the display device to reduce the playing speed of the target video; and if the server judges the user action in the second video frame as the release action, returning an instruction indicating speed increase to the display equipment so as to enable the display equipment to increase the playing speed of the target video.
Of course, in some other implementation cases, the display device independently determines whether the user action in the second video frame matches the completed action displayed by the first key frame, and does not need to interact with the server, which is not described herein.
It should be noted that, in the above implementations that adjust the playing speed in real time according to the user's exercise condition, once the playing speed reaches the preset highest value or the preset lowest value, it is not adjusted any higher or lower.
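A minimal sketch of the resulting adaptation loop, combining the slow-down/speed-up rule above with this clamping; the step size and limits are illustrative assumptions:

```python
MIN_SPEED, MAX_SPEED, STEP = 0.5, 1.5, 0.25  # assumed limits and step

def adapt_speed(current_speed: float, user_state: str) -> float:
    """user_state: 'uncompleted', 'completed', or 'released'."""
    if user_state == "uncompleted":   # user cannot keep up: slow down
        current_speed -= STEP
    elif user_state == "released":    # user is ahead: speed up
        current_speed += STEP
    # clamp: never go above the preset highest or below the preset lowest value
    return max(MIN_SPEED, min(MAX_SPEED, current_speed))

print(adapt_speed(1.0, "uncompleted"))  # 0.75
print(adapt_speed(1.5, "released"))     # 1.5 (already at the highest value)
```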
In some embodiments, the user may pause and resume video playing by operating a key or by voice input. For example, during follow-up of the target video, the user may pause the target video through a key on the control device or through voice input; when the display displays the interface shown in fig. 10, the user may press the "OK" key to pause the playing, and the controller pauses the playing of the target video in response to the key input and presents a pause state identifier, as shown in fig. 16, on the upper layer of the playing screen.
In the process of following the target video, the controller acquires local images through the image collector and detects whether a user target, i.e., a person (user), exists in the local image. When the display device controller (or the server) does not detect a moving target in the local image, the display device automatically pauses the playing of the target video, or the server instructs the display device to pause it, and a pause state identifier, as shown in fig. 16, is presented on the upper layer of the playing picture.
In the above-described embodiment, the pause control performed by the controller does not affect the display of the local video picture.
In the paused state shown in fig. 16, the user may resume playing the target video by operating a key on the control device or by voice input, for example, the user may press an "OK" key to resume playing the target video, and the controller resumes playing the target video in response to the user's key input and cancels the display of the pause state flag in fig. 16.
As can be seen, in the above example the user has to operate the control device to make the display device resume playing the target video, which interrupts the follow-up process and degrades the user experience.
To address this issue, in some embodiments, in response to a pause control for the playing of the target video, the controller presents a pause interface on the display and displays a target key frame in the pause interface, wherein the target video includes a number of key frames, each key frame showing a key action that requires follow-up, and the target key frame is a designated one of these key frames. After the target video is paused, the image collector continues working, and it is judged whether the user action in the local images collected after the pause matches the key action displayed by the target key frame: when it matches, the playing of the target video is resumed; when it does not match, the target video remains paused.
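A minimal sketch of this gesture-based resume, assuming a pose matcher that returns a matching degree in [0, 1] and an illustrative resume threshold:

```python
MATCH_THRESHOLD = 0.8  # assumed minimum matching degree needed to resume

def paused_loop(capture_local_image, match_degree, target_key_frame, resume):
    """Poll local images while paused; resume playing on a matching action."""
    while True:
        image = capture_local_image()
        if match_degree(image, target_key_frame) >= MATCH_THRESHOLD:
            resume()  # the user reproduced the key action: resume the video
            return
        # otherwise keep the target video paused and keep polling
```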
In the above embodiment, the target key frame may be a key frame showing a previous key action, i.e. the last key action played before the control target video is paused, or may be a representative one of several key frames.
It should be noted that the target video referred to in the above example is the video whose playing is paused, and includes, but is not limited to, a video demonstrating dance movements, a video demonstrating fitness movements, a video demonstrating gymnastic movements, an MV played in a karaoke scene, or a video demonstrating avatar movements.
As one possible implementation, a plurality of key tags are identified in advance on the time axis of the target video, one key tag corresponding to one key frame; that is, the time point represented by a key tag is the time point at which the corresponding key frame is played. In response to the pause control of the target video playing, the controller detects the target key tag on the time axis according to the time point at which the pause occurs, acquires the target key frame according to that tag, and displays the acquired target key frame in the pause interface, wherein the time point corresponding to the tag of the target key frame lies before the pause time point on the time axis. In this way, the pause is anchored to a video frame the user has already practiced, which adds interest.
In other possible implementations, in response to a pause control over the playing of the target video, the controller rewinds the target video to the moment of the target key tag before pausing, so that the target key frame corresponding to the target key tag is displayed in the pause interface.
In some embodiments, the target key tag is a key tag that is earlier than the current time on the time axis and is closest to the current time, and correspondingly, the target key frame is a key frame showing the last key action.
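A minimal sketch of locating this target key tag on the time axis, assuming the tag time points are kept sorted:

```python
import bisect

def target_key_tag(key_tag_times, pause_time_s):
    """Return the tag earlier than and closest to the pause position."""
    i = bisect.bisect_left(key_tag_times, pause_time_s)
    return key_tag_times[i - 1] if i > 0 else None  # None: no earlier tag

print(target_key_tag([10.0, 25.0, 40.0], 30.0))  # 25.0
```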
In the above example, when or after the pause control is applied to the playing of the target video, the target key frame showing a key action is presented in the pause interface as a prompt action for resuming playing. Thus, in the pause state, the user can resume the playing of the target video simply by making the prompted action, without operating the control device, which improves the follow-up experience.
In some embodiments, displaying the acquired target key frame in the pause interface may mean that, after the time axis is rolled back to the time point corresponding to the target key tag, the playing of the demonstration video is stopped and a pause control is added to the demonstration video playing window. The controller acquires the target key frame, or its joint points, while the camera continues to collect local video data in which a human body is detected; when the matching degree between the human body's action in the video data and the action in the target key frame reaches a preset threshold, the demonstration video is played.
In some embodiments, playing may continue from the time point corresponding to the key tag after the rollback. In other embodiments, playing may continue from the time point at which the pause control was received.
In some embodiments, displaying the acquired target key frame in the pause interface may instead be done without rolling back the time axis: the playing of the demonstration video is stopped, a pause control is added to the demonstration video playing window, and the acquired target key frame is displayed in a floating layer above the playing window. The controller acquires the target key frame, or its joint points, while the camera continues to collect local video data in which a human body is detected; when the matching degree between the human body's action in the video data and the action in the target key frame reaches a preset threshold, the demonstration video is played and the floating layer with the target key frame is dismissed.
In some embodiments, the frame being displayed at the moment of pause may be any video frame of the demonstration video.
In some embodiments, the follow-up process automatically ends when the user finishes playing the target video for follow-up. The controller closes the image collector in response to the completion of the playing of the target video, closes the follow-up interface where the first playing window and the second playing window are located as shown in fig. 10, and presents an interface containing the evaluation information.
In some embodiments, the user may end the follow-up process before completing it by operating a key on the control device or by voice input; e.g., the user may input an instruction indicating the end of the follow-up by operating the "back" key on the control device. In response to that instruction, the controller pauses the playing of the target video and presents an interface containing save prompt information, such as the save page exemplarily shown in fig. 17.
When the display displays the save interface shown in fig. 17, the user can either operate the control for returning to the follow-up interface, to go back and continue the follow-up, or operate the control for confirming the quit, to end the follow-up process.
In some embodiments, in response to a user-entered instruction to quit the follow-up, the playing duration of the target video is determined, in order to decide whether to save the progress for continued play.
In some embodiments, if the playing duration of the target video is not less than a preset duration (e.g., 30 s), the playing duration is saved so that playing can continue from there next time; if the playing duration is less than the preset duration, it is not saved, and the next playing starts over from the beginning.
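A minimal sketch of this save-on-exit rule; the persistence callback is an assumed stand-in for the device's actual storage interface:

```python
PRESET_DURATION_S = 30.0  # the preset duration from the example above

def on_quit_follow_up(play_duration_s: float, save_progress) -> None:
    """Save the playing position only if the follow-up lasted long enough."""
    if play_duration_s >= PRESET_DURATION_S:
        save_progress(play_duration_s)  # next playing resumes from here
    # otherwise nothing is saved and the next playing starts from the beginning
```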
In some embodiments, if the playing duration of the target video is not less than the preset duration (e.g., 30 s), the local image frames corresponding to the target key frames are saved for presentation in a subsequent evaluation interface or play history; if it is less than the preset duration, they are not saved. The local image frame corresponding to a target key frame refers to the local video frame determined when the target key tag is detected.
In some embodiments, the local video frame determined when the target key tag is detected may be the local image frame captured by the camera at the time point of the target key tag, or the local image frame captured at or near that time point that has the highest matching degree with the target key frame.
In some embodiments, when a user selects a video that was previously followed but not played to the end, in response to the user's instruction to play such a demonstration video, an interface including resume prompt information is presented; the resume prompt interface displays the duration played last time and controls for the user to choose whether to resume, so that the user can decide autonomously. Fig. 18 exemplarily shows a resume prompt interface, which shows the duration played last time (1 minute 30 seconds), a control for replaying from the beginning ("replay"), and a control for resuming the follow-up ("resume follow-up").
In some embodiments, in response to an instruction input in the resume prompt interface shown in fig. 18 indicating replay, the demonstration video is controlled to play again from the beginning, e.g., from 0 minutes 0 seconds; in response to an instruction indicating resumption, the demonstration video is controlled to continue from the duration played last time, e.g., from 1 minute 30 seconds.
In some embodiments, when the controller receives an operation that the user determines to quit the follow-up, the image collector is closed, the first playing window and the second playing window in the follow-up interface shown in fig. 10a are closed, and the interface containing the evaluation information is presented.
In some embodiments, in response to the completion of the follow-up process, an interface is presented on the display containing rating information including at least one of star achievements, rating achievements, experience value increments, and experience value totals.
In some embodiments, the star-grade score, the rating score, and the experience value increment are determined according to the number of target key frame follow-up actions completed during the playing of the target video and the action matching degree achieved when completing them, both of which are positively correlated with the star-grade score, the rating score, and the experience value increment.
It should be noted that, in some embodiments, if the user exits the follow-up in advance, the controller, in response to the quit instruction input by the user, judges whether the playing duration of the target video is longer than a preset value; if so, scoring information and detailed score information are generated from the follow-up data produced so far (such as the collected local video stream and the scores of some user actions); if not, the generated follow-up data is deleted.
Fig. 19 illustrates an interface presenting scoring information, as shown in fig. 19, in which star achievements, experience value increments, and experience value totals are presented in the form of items or controls, wherein the controls presenting experience value totals are consistent with those shown in fig. 10. In addition, in order to facilitate the user to view the detailed achievements, fig. 19 also shows a control "view achievements immediately" for viewing the detailed achievements, and the user can enter an interface for presenting detailed achievement information as shown in fig. 20 or fig. 22 by operating the control.
In some embodiments, the experience value is user data related to level-up; it is accumulated through the user's behavior in the target application, i.e., the user can increase the experience value by following more demonstration videos. It is also a quantitative representation of the user's proficiency: a higher experience value means higher proficiency in the practiced actions, and when enough experience value has accumulated, the user's level rises.
In order to prevent a user from maliciously earning experience value by repeatedly practicing the same demonstration video, in some embodiments, while the user follows the demonstration video, the follow-up is scored according to the local video stream collected by the image collector, and the score is mapped to that demonstration video. Using the ID of the demonstration video and the user ID, the server can query the recorded historical highest score of that user for that demonstration video; if the new score is higher than the recorded historical highest score, the new experience value obtained from the score is displayed, and if it is not higher, the original experience value is displayed. The recorded historical highest score is the highest score the user obtained when following the demonstration video in the past.
In some embodiments, the score of the follow-up exercise process and the new experience value obtained according to the score are displayed in the follow-up exercise result interface when the follow-up exercise result interface of the follow-up exercise process is displayed.
In some embodiments, in the process of playing a demonstration video (i.e., during follow-up), action matching is performed between the demonstration video and the local video stream to obtain a score for the follow-up process; after the demonstration video finishes playing (i.e., after the follow-up ends), a follow-up result interface is generated according to the score, with an experience value control that displays the experience value: when the score is higher than the user's historical highest score for this demonstration video, the control shows the experience value updated according to the score, and otherwise it shows the experience value from before the follow-up process.
In some embodiments, the controller, in response to an input instruction indicating to play (follow) the demonstration video, acquires the demonstration video and collects a local video stream through the image collector, wherein the demonstration video comprises first video frames showing the demonstration actions the user is to follow, and the local video stream comprises second video frames recording the user's actions; corresponding first and second video frames are matched, and a score is obtained from the matching results. If the score is higher than the recorded historical highest score, a new experience value derived from the score is loaded into the experience value control; if not, the original experience value, i.e., the experience value before this follow-up process, is loaded and displayed.
In some embodiments, while the demonstration video is playing, the key tags on the time axis are detected; when a key tag is detected, a second key frame corresponding to the first key frame is acquired from the second video frames according to the time information represented by the key tag, the second key frame recording the user's key follow-up action; a matching result is then obtained for the first key frame and second key frame that correspond to the same key tag. For example, the first key frame and second key frame corresponding to the key tag may be uploaded to a server, so that the server performs skeleton point matching between the key demonstration action in the first key frame and the key user action in the second key frame, after which the matching result returned by the server is received. As another example, the display device controller may itself identify the key demonstration action in the first key frame and the key follow-up action in the second key frame, and perform skeleton point matching between them to obtain the matching result. Each second key frame thus corresponds to a matching result, which represents the matching degree or similarity between the user action in the second key frame and the key action in the corresponding first key frame: a low matching degree/similarity means the user action was not standard enough, and a high one means the user action was relatively standard.
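A minimal sketch of this per-tag matching step; the flat joint-coordinate representation and the cosine-similarity measure are assumptions standing in for whatever skeleton-point comparison the server or controller actually applies:

```python
import math

def joint_similarity(joints_a, joints_b):
    """Cosine similarity of two flat lists of normalized joint coordinates."""
    dot = sum(a * b for a, b in zip(joints_a, joints_b))
    norm_a = math.sqrt(sum(a * a for a in joints_a))
    norm_b = math.sqrt(sum(b * b for b in joints_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def match_on_key_tag(extract_joints, first_key_frame, second_key_frame):
    """Matching result for the frame pair that shares one key tag."""
    return joint_similarity(extract_joints(first_key_frame),
                            extract_joints(second_key_frame))
```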
In some embodiments, the display device may acquire joint point data of a second key frame in the local video according to the local video data, and upload the joint point data to the server, so as to reduce the pressure of data transmission.
In some embodiments, the display device may upload the key tag identification to the server to reduce data transmission stress from transmitting the first key frame.
In some embodiments, while a demonstration video is playing, the key tags on the time axis are detected; when a key tag is detected, the corresponding second key frame is acquired from the second video frames according to the time information of the key tag, the second key frame being used for displaying the user's follow-up action.
In some embodiments, the second key frame is the image frame in the local video at the time of the key tag. In the embodiments of the present application, since the time point characterized by the key tag is the time point corresponding to the first key frame, and the second key frame is the frame extracted from the second video frame sequence according to the time information of the first key frame, one key tag corresponds to one pair of first key frame and second key frame.
In some embodiments, the second key frame comprises the image frames in the local video at and adjacent to the time of the first key frame; the image used in the evaluation presentation may then be whichever of these image frames matches the first key frame to the highest degree.
In some embodiments, the time information of the first key frame may be a time when the display device plays the frame, and a second video frame corresponding to the time is extracted from the second video frame sequence according to the time when the display device plays the first key frame, that is, the second key frame corresponding to the first key frame. The video frame corresponding to a certain time may be a video frame with a timestamp of the time, or a video frame with a time shown by the timestamp closest to the time.
In some embodiments, the matching result is specifically a matching score, and the score calculated based on the matching result or the matching score may also be referred to as a total score.
In some embodiments, a target video includes M first key frames showing M key actions and has M key tags on its time axis. During the follow-up, M corresponding second key frames can be extracted from the local video stream according to the M first key frames; the M first key frames (the M key actions shown) are matched one-to-one with the M second key frames (the M user key actions shown) to obtain M matching scores, and the M matching scores are summed, weighted-summed, averaged, or weighted-averaged to obtain the total score of the follow-up process.
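A minimal sketch of this total-score computation; since any of the listed aggregations is allowed, a weighted average with uniform default weights is shown:

```python
def total_score(matching_scores, weights=None):
    """Aggregate the M matching scores into one total score."""
    weights = weights or [1.0] * len(matching_scores)
    weighted = sum(s * w for s, w in zip(matching_scores, weights))
    return weighted / sum(weights)

print(total_score([80, 90, 70, 100]))  # 85.0
```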
In some embodiments, the display device determines a frame extraction range in the local video stream according to the time information of a first key frame (key frame) in the target video, extracts a preset number of local video frames from that range, identifies the user's follow-up action in each extracted local video frame, compares these actions across frames to determine the key follow-up action, matches it against the corresponding key action to obtain a matching score, and calculates the total score of the follow-up process after the follow-up ends.
In other embodiments, the display device sends the extracted local video frames to the server; the server identifies the user's follow-up action in each frame, compares these actions across frames to determine the key follow-up action, matches it with the corresponding key action to obtain a matching score, calculates the total score of the follow-up process after the follow-up ends, and returns the total score to the display device.
In some embodiments, after the server obtains the matching score of a key follow-up action, it sends the level identifier corresponding to that score to the display device; upon receipt, the level identifier, such as "GOOD", "GREAT", or "PERFECT", is displayed in real time in a floating layer above the local picture, so that the follow-up effect is fed back to the user in real time. If the matching score is determined by the display device itself, the display device directly displays the corresponding level identifier in the floating layer above the local picture.
In some embodiments, for the total score of each practiced demonstration video, if the score is higher than the recorded highest score, the difference between the score and the recorded highest score is obtained, and this difference is added to the original total to obtain a new total; thus the user cannot raise the total simply by repeatedly replaying familiar videos, which improves the fairness of the application.
In some embodiments, if the total score is higher than the highest score documented, a corresponding increment of empirical value is derived from the total score; accumulating the experience value increment to the original experience value to obtain a new experience value; further, at the end of the target video playback, the new experience value is presented on the display. For example, if the total score is 85 points and the historical highest score is 80 points, the experience value increment 5 is obtained according to the total score of 85 points and the historical highest score of 80 points, and if the original experience value is 10005, a new experience value 10010 is obtained by accumulating the experience value increment 5 in 10005. Conversely, if the total score is not higher than the highest score recorded, the experience value increment is 0, i.e., the experience values are not accumulated, at which point the original experience values are presented on the display.
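A minimal sketch of this update rule, reproducing the worked example above:

```python
def update_experience(total, highest, experience):
    """Return (new experience value, new recorded highest score)."""
    increment = max(0, total - highest)  # 0 when the record is not beaten
    return experience + increment, max(total, highest)

print(update_experience(85, 80, 10005))  # (10010, 85)
print(update_experience(75, 80, 10005))  # (10005, 80)
```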
Further, if the total score is higher than the recorded highest score, the original experience value is replaced with the new experience value; if the total score is not higher than the recorded highest score, the experience value is not updated.
It is noted that the terms first and second in the description of the present application are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. In further embodiments, the first key frame may also be referred to as a key frame and the second key frame may also be referred to as a local video frame or a follow-through screenshot.
In the above embodiments, while the user practices the target video, the follow-up is scored according to the local video stream collected by the image collector; if the score is higher than the recorded highest score, a new experience value is derived from the score and displayed, and if it is not higher, the experience value is not updated and the original value is displayed, which prevents the user from maliciously earning experience value by repeating the same demonstration video.
In some embodiments, the server or the display device counts the experience value increment generated in a preset period, and when the next period is entered, the experience value of the user is updated according to the counted experience value increment generated in the previous period. Wherein the preset period may be three days, seven days, etc.
In some embodiments, the display device controller sends the server a request for obtaining the user experience value in response to the launching of the target application, the request including at least the user information. From the request, the server obtains the time at which the user experience value was last updated and judges whether the interval since then reaches the duration of the preset period. If it does, the server obtains the experience value increment generated in the previous period, updates the user experience value by adding that increment to the total, and returns the updated value to the display device; if it does not, the server does not update the value and either returns the current user experience value directly or notifies the display device to use the most recently issued experience value data from its cache.
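A minimal sketch of this periodic settlement on the server side, assuming a seven-day preset period:

```python
from datetime import datetime, timedelta

PERIOD = timedelta(days=7)  # assumed preset period

def settle_experience(last_update, total, pending, now):
    """Fold the previous period's increment into the total once per period."""
    if now - last_update >= PERIOD:
        return total + pending, 0, now  # new total, cleared increment, stamp
    return total, pending, last_update  # period not yet elapsed: unchanged
```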
Accordingly, the display device receives the user experience value returned by the server and displays it in the user data display area of the interface; if an updated user experience value is received, the value in the display device's cache is updated at the same time.
In some embodiments, the experience value control includes an identification flag in the user data display area, as in fig. 9, for showing the experience value increment generated in the current cycle, such as the "this week +10" shown in fig. 9.
In some embodiments, the experience value control includes a first sub-control in which the total experience value at the end of the last statistical period is shown, and a second sub-control in which the experience value increment that has been generated in the current statistical period is shown. The first sub-control is the control showing the position of "dancing value 10012" in fig. 9, and the second sub-control is the control showing the position of "this week + 10" in fig. 9.
In some embodiments, the first sub-control and the second sub-control partially overlap such that a user can visually see both sub-controls at the same time.
In some embodiments, the first and second sub-controls are different colors so that the user can intuitively see both sub-controls at the same time.
In some embodiments, the second child control is located in the upper right corner of the first child control.
In some embodiments, the user can select the identification flag in the user data display area to open a detail page displaying the experience value total; after the detail page is entered, the second sub-control is still located at the upper right corner of the first sub-control and displays the score newly added in the current statistical period.
In some embodiments, the follow-up exercise result interface is further provided with a follow-up exercise evaluation control, and the follow-up exercise evaluation control is used for displaying the target states determined according to the scores, and the target states corresponding to different scores are different.
In some embodiments, the target state presented in the follow-up rating control is a star rating as shown in fig. 9.
In some embodiments, a correspondence between experience value ranges and star levels is pre-established, for example 0-20000 (experience value range) corresponding to 1 star, 20001-40000 to 2 stars, and so on. On this basis, while the user data display area of fig. 9 shows the user experience value, the follow-up evaluation control can also show the star level identifier corresponding to that value, e.g., the 1 star shown in fig. 9.
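A minimal sketch of this lookup with the example ranges above (0-20000 is 1 star, 20001-40000 is 2 stars, and so on):

```python
def star_level(experience: int) -> int:
    """Map an experience value to its star level per the example ranges."""
    if experience <= 0:
        return 1
    return (experience - 1) // 20000 + 1

print(star_level(10012))  # 1 star, as in the fig. 9 example
print(star_level(20001))  # 2 stars
```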
After the follow-up is completed, an interface for presenting the rating information as shown in fig. 19 is presented on the display. When the display displays the interface, the user can enter the interface for presenting detailed achievement information by operating the control for viewing detailed achievements.
In some embodiments, the detailed performance information may also be referred to as follow-up result information, and the user interface displaying the follow-up result information is referred to as a follow-up result interface.
In some embodiments, in response to a user instruction to view detailed achievements, the display device sends a detailed achievement information interface acquisition request to the server and presents the detailed achievement information on the display according to the interface data returned by the server. The detailed achievement information comprises at least one of login user information, star-level achievement information, an evaluation statement, and several follow-up screenshots, where a follow-up screenshot is a local video frame from the follow-up video collected by the camera and shows a follow-up action of the user.
Fig. 20 exemplarily shows an interface for presenting detailed performance information, and as shown in fig. 20, login user information (such as a user head portrait and a user experience value), star performance information, evaluation words, and 4 follow-up screenshots are displayed in the form of items or controls.
In some embodiments, the follow-up screenshots are displayed as thumbnails arranged in the interface shown in fig. 20. The user can operate the control device to move the selector onto one follow-up screenshot to view its original image, and while the original file of the selected picture is displayed on the display, can view the originals of the other follow-up screenshots by operating the left and/or right direction keys.
In some embodiments, when the user selects the first follow-up screenshot for viewing by operating the control device to move the selector, the original image file corresponding to the selected screenshot is obtained and presented on the display, as shown in fig. 21. In fig. 21, the user can view other original drawings corresponding to the training screenshot by operating the left and/or right direction keys.
Fig. 22 illustrates another interface for presenting detailed result information, which is different from the interface illustrated in fig. 20 in that a sharing code picture (e.g., a two-dimensional code) including a detailed result access address is further displayed in the interface illustrated in fig. 22, and a user can scan the sharing code picture by using a mobile terminal to view the detailed result information.
Fig. 23 exemplarily shows a detailed achievement information page displayed on the mobile terminal device, as shown in fig. 23, in which login user information, star achievement, comment and at least one follow-up screenshot are displayed. The user can share the page link to other users (namely other terminal devices) by operating the sharing control in the page, and can also store the exercise-following screenshot displayed in the page and/or the original image file corresponding to the exercise-following screenshot in the local of the terminal device.
To motivate and urge the user, in some embodiments, if the total score of a follow-up process is higher than a preset value, the N local video frames with the highest matching scores (TopN) are displayed in the detailed score information page (or follow-up result interface) to show the highlight moments of the follow-up; if the total score is not higher than the preset value, the N local video frames with the lowest matching scores are displayed to show the moments of the follow-up that need improvement.
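A minimal sketch of this selection; N and the preset value are illustrative assumptions:

```python
def select_target_frames(scored_frames, total, preset_value=80.0, n=4):
    """scored_frames: list of (matching_score, frame) pairs."""
    ranked = sorted(scored_frames, key=lambda sf: sf[0], reverse=True)
    if total > preset_value:
        return ranked[:n]   # TopN: the highlight moments of the follow-up
    return ranked[-n:]      # BottomN: the moments that need improvement
```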
In some embodiments, after receiving the detailed achievement information interface acquisition request, the server derives the user's score for following the demonstration video from the matching degree between the actions in corresponding key frames and local video frames. When the score is higher than a first value, the server issues a certain number (N, N ≥ 1) of key frames and/or corresponding local video frames with the higher matching degrees to the display device as detailed achievement interface data; when the score is lower than a second value, it issues those with the lower matching degrees instead. In some embodiments, the first value and the second value may be the same value; in other embodiments they are different values.

In some embodiments, the controller, in response to a user input instructing follow-up of a demonstration video, obtains the demonstration video comprising a key frame sequence, the sequence containing a preset number (M) of key frames in temporal order, each key frame showing a key action requiring the user's follow-up.
In some embodiments, the controller plays the target video at the follow-up interface and acquires local video frames corresponding to the key frames from the local video stream during the playing of the demonstration video, the local video frames showing the user actions.
In some embodiments, the comparison between key frames and local video frames is performed on the display device. During the follow-up, the controller matches the key action shown in each key frame with the user action shown in the corresponding local video frame to obtain a matching score per local video frame, derives a total score from these matching scores, and selects the target video frames to display in the follow-up result according to the total score: if the total score is higher than a preset value, the N local video frames with the highest matching scores (TopN) are selected as target video frames; otherwise the N with the lowest matching scores are selected, where N is the preset number of target video frames (N = 4 in fig. 19). Finally, the follow-up result including the total score and the target video frames is displayed, e.g., in a detailed score page as shown in fig. 18. In some embodiments, the total score is obtained by summing, weighted summing, averaging, or weighted averaging the matching scores of the individual local video frames.
In some embodiments, the controller detects the key tags on the time axis while controlling the playing of the demonstration video; when a key tag is detected, the local video frame temporally corresponding to the key frame is extracted from the local video stream according to the time information of the key tag, and a local video frame sequence is generated from the extracted frames, the sequence comprising part or all of the local video frames arranged in descending order of matching score.
In some embodiments, the first N local video frames in the sequence are used as first local video frames and the last N as second local video frames, where the first local video frames are displayed in the follow-up result interface when the total score is higher than the preset value, and the second local video frames are displayed when it is not. In some embodiments, the preset value may be the first value or the second value of the foregoing embodiments.
In some embodiments, generating the local video frame sequence may comprise: when a new local video frame is obtained, if the first local video frames and the second local video frames overlap, the newly obtained frame is inserted into the sequence according to its matching score to obtain the new sequence; if they do not overlap, the newly obtained frame is inserted according to its matching score and the frame whose matching score sits in the middle is deleted, yielding the new sequence.
In some embodiments, if the total score is higher than a preset value, N first local video frames are selected from the local video frame sequence as target video frames to be displayed in the follow-up result interface, and if the total score is not higher than the preset value, N second local video frames are selected from the local video frame sequence as target video frames to be displayed in the follow-up result interface.
It should be noted that the existence of an overlapping video frame among the first and second local video frames means that some frame in the sequence serves as both a first local video frame and a second local video frame; in this case the number of frames in the sequence is less than 2N.
It should be further noted that the absence of an overlapping video frame means that no frame in the sequence is both a first local video frame and a second local video frame; in this case the number of frames in the sequence is greater than or equal to 2N.

In some embodiments, when generating the photo sequence for the detailed achievement interface data, a bubble-sort-like algorithm may be adopted, either on the display device side (when the display device generates the sequence) or on the server (when the server does).
The algorithm process is as follows: after the key frames are compared with the local video frames, the matching degree of the key frames and the local video frames is determined.
When the number of data frames in the sequence is less than a preset value, the key frame and/or local video frame is added to the sequence according to the matching degree, where the preset value is the sum of the number of image frames to be displayed when the score is higher than the preset score and the number to be displayed when it is lower. For example, if 4 frames (groups) are to be displayed when the score is higher and 4 frames (groups) when it is lower, the preset value for the sequence is 8 frames (groups).
When the number of data frames in the sequence is greater than or equal to the preset value, a new ordering is formed from the current matching degree and the matching degrees of the frames already in the sequence; the 4 frames (groups) with the highest matching degrees and the 4 with the lowest are retained, and the middle frame (group) is deleted, keeping the sequence at 8 frames (groups). This prevents excessive photos from accumulating in the cache and improves service processing efficiency.
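A minimal sketch of this maintenance scheme for N = 4 (an 8-entry sequence); the (score, frame) entry layout is an assumption:

```python
N = 4  # frames (groups) retained at each end of the sequence

def insert_entry(sequence, score, frame):
    """Insert into a descending-order sequence, then drop a middle entry."""
    pos = 0
    while pos < len(sequence) and sequence[pos][0] >= score:
        pos += 1
    sequence.insert(pos, (score, frame))
    if len(sequence) > 2 * N:
        del sequence[N]  # keep only the N best and the N worst candidates
    return sequence
```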
Here, "frame" refers to the case where the sequence contains only local video frames, and "group" refers to the case where a local video frame and its corresponding key frame together form one entry of the sequence.
In some embodiments, the comparison between the key frame and the local video frame is performed in the server, and the comparison process may refer to the description of other embodiments in this application.
The server derives a total score from the matching scores of the individual local video frames and selects the target video frames to display as the follow-up result according to the total score: if the total score is higher than the preset value, the N local video frames with the highest matching scores (TopN) are issued to the display device as target video frames; otherwise the N with the lowest matching scores are issued, where N is the preset number of target video frames (N = 4 in fig. 19). Finally, the display device displays the follow-up result, including the total score and the target video frames, according to the received data, e.g., in a detailed score page as shown in fig. 18.
In the case that the local video frame sequence includes all extracted local video frames, each extracted frame is inserted into the sequence according to its matching score, so the number of frames grows from 0 to M (the number of key frames in the demonstration video), with the frames arranged in descending order of matching score. When the N frames with the highest matching scores are needed, the frames ranked 1 through N are extracted from the sequence; when the N frames with the lowest matching scores are needed, the frames ranked (M-N+1) through M are extracted.
In the case that the local video frame sequence includes only some of the extracted local video frames, an initial sequence is generated from the first 2N local video frames obtained, which correspond to the first 2N key frames respectively and are arranged in descending order of matching score. From the (2N+1)-th frame onward, each newly obtained frame (the (2N+i)-th) is inserted into the sequence according to its matching score and the frame ranked (N+1) is deleted, until 2N+i equals the preset number, i.e., the last frame has been inserted, yielding the final local video frame sequence, where 2N < M and i ∈ (1, M-2N).
It should be noted that, in some embodiments, if the user quits the follow-up early, the number of local video frames actually extracted may be smaller than the number N of target video frames to display; in that case the controller does not need to select target video frames according to the total score, and simply displays all the actually extracted local video frames as target video frames.
In some embodiments, after receiving the user's confirmation to exit, it is judged whether the number of video frames in the current sequence is greater than the number of video frames to display; if so, that many video frames are selected from the front or rear section of the sequence according to the score and displayed; if not, all the video frames are displayed.
In some embodiments, after receiving the user's confirmation to exit and before judging whether the number of video frames in the current sequence is greater than the number to display, it is further necessary to judge whether the duration and/or the number of actions of the follow-up meet a preset requirement; only if they do is the frame-count judgment performed, and otherwise it is not.
In some embodiments, the display device uploads the local video frames selected according to the total score to the server, so that the server can add these frames to the user's exercise record information.
In some embodiments, the display device uploads the joint point data of each local video frame, together with the identification of the corresponding frame, to the server, and the server returns the matching information to the display device keyed by these parameters, so that the follow-up pictures can be displayed later in the usage history. After receiving the detailed score page data, the display device draws the graphic score according to the score values, displays the comments according to the comment data, and retrieves the cached local video frames according to their identifications to display the follow-up pictures; it then uploads the local video frames corresponding to those identifications, together with the detailed score page identification, to the server, and the server combines the received local video frames and the detailed score page data into follow-up data according to the detailed score page identification, to be sent to the display device when the follow-up history is viewed.
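One way the server-side merge described above might look, keyed by the detailed score page identification; every structure and field name here is an assumption, as the patent does not specify a data format:

```python
def merge_follow_up_data(score_pages: dict, uploaded_frames: dict) -> dict:
    """Join uploaded local video frames to detailed score page data by the
    detailed score page identification, producing the follow-up data that
    is later sent back to the display device."""
    follow_up = {}
    for page_id, page in score_pages.items():
        follow_up[page_id] = {
            "score_page": page,                           # scores and comments
            "frames": uploaded_frames.get(page_id, []),   # follow-up pictures
        }
    return follow_up
```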
In some embodiments, in response to the end of the follow-up process, the display device detects whether a user input is received; when no user input is received within a preset time period, an auto-play prompt interface is presented and a countdown is started. Countdown prompt information, auto-play video information, and a plurality of controls are displayed in the auto-play prompt interface: the countdown prompt information includes at least the countdown duration; the auto-play video information includes the cover and/or the name of the video to be played after the countdown ends; and the plurality of controls may be, for example, a control for replaying, a control for exiting the current interface, and/or a control for playing the next video in a preset media asset list. While the countdown runs, the device continues to detect whether user input is received, for example the user operating a control in the interface through the control device. If no user input is received before the countdown ends, the video displayed in the interface is played; if user input is received before the countdown ends, the countdown is stopped and the control logic corresponding to the user input is executed.
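A sketch of the countdown loop described above; the four callbacks are illustrative placeholders for the display device's actual UI and input machinery, not part of the patent:

```python
def autoplay_countdown(seconds: int, on_tick, poll_user_input,
                       play_next, handle_input) -> None:
    """Tick once per second, abort on user input, otherwise auto-play."""
    for remaining in range(seconds, 0, -1):
        on_tick(remaining)                    # e.g. refresh the countdown text
        event = poll_user_input(timeout=1.0)  # wait up to one second for input
        if event is not None:
            handle_input(event)               # stop countdown, run the control's logic
            return
    play_next()                               # countdown finished with no input
```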
In some embodiments, the second value is less than or equal to the first value. When the second value is smaller than the first value and the score is higher than the second value but lower than the first value, a preset number of key frames and/or corresponding local video frames are allocated, according to the matching degree, to each matching-degree interval as follow-up screenshots.
FIG. 24 illustrates a user interface of one implementation of the auto-play prompt interface described above. As shown in FIG. 24, the interface displays countdown prompt information, i.e., "Play you after 5 s"; auto-play video information, i.e., the video name "love kindergarten" and the cover picture of the video; and a "Play again" control, an "Exit" control, and a "Play Next" control.
In some embodiments, the user may, by operating the control device, cause the display device to display the user's exercise record. The exercise record includes a number of exercise entries, each comprising demonstration video information, scoring information, exercise time information, and/or at least one follow-up screenshot. The demonstration video information comprises at least one of the cover, name, category, type, and duration of the demonstration video; the scoring information comprises at least one of a star level, a score, and an experience value increment; the exercise time information comprises an exercise start time and/or an exercise end time; and the follow-up screenshot may be the follow-up screenshot displayed in the detailed score information interface.
In some embodiments, when the display presents the application home page shown in FIG. 9, the user can operate the "My dance" control in the page via the control device to input an instruction to display the exercise record. On receiving the instruction, the controller sends a request for the exercise record information to the server, the request containing at least a user identification (ID). In response, the server searches for the corresponding exercise record information according to the user identification in the request and returns it to the display device; the exercise record information includes a plurality of exercise entries, each comprising demonstration video information, scoring information, exercise time information, and/or at least one follow-up screenshot. The display device then generates a page containing the exercise record from the returned information and presents it on the display.
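A minimal sketch of the record-fetch exchange described above; the endpoint path, query parameter, and JSON layout are assumptions, since the patent does not specify a wire format:

```python
import json
import urllib.request

def fetch_exercise_records(server_url: str, user_id: str) -> list:
    """Request the user's exercise record information from the server."""
    req = urllib.request.Request(
        f"{server_url}/exercise_records?user_id={user_id}",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # each entry: demonstration video info, scoring info, exercise time,
        # and/or follow-up screenshots
        return json.load(resp)
```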
It should be noted that a follow-up screenshot can be displayed only when the display device has captured images showing the user's actions.
In some embodiments, in response to a request sent by the display device, the server searches for the corresponding exercise record information according to the user identifier in the request and determines whether each exercise entry in the record includes a follow-up screenshot; for entry information that does not include a follow-up screenshot, a special identifier is added to the entry information to indicate that no camera was detected during the follow-up session corresponding to that entry. On the display device side, if an exercise entry returned by the server contains follow-up screenshots, the corresponding screenshots are displayed in the exercise record; if the entry contains no follow-up screenshot but contains the special identifier, the exercise record indicates that no camera was detected.
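The server-side marking rule might be sketched as follows; the field names ("screenshots", "no_camera") are assumptions standing in for the patent's "special identifier":

```python
def annotate_exercise_records(entries: list) -> list:
    """Mark each exercise entry that carries no follow-up screenshots."""
    for entry in entries:
        if not entry.get("screenshots"):
            entry["no_camera"] = True  # the 'special identifier'
    return entries
```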
The display device receives the data sent by the server and draws an exercise record list in which each exercise record comprises a first control for displaying the demonstration video information, a second control for displaying the scoring information and the exercise time information, and a third control for displaying the follow-up screenshots. While drawing the list, if the data of a given exercise record does not contain the special identifier, the demonstration video information is loaded into its first control, the scoring information and the exercise time information into its second control, and the follow-up screenshots into its third control; if the data does contain the special identifier, the first and second controls are loaded as before, and a prompt indicating that no camera was detected during that exercise is loaded into the third control.
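On the display device side, the per-record drawing rule above might look like the following sketch; the ui object, its three controls, and the field names are assumptions:

```python
def draw_exercise_record(entry: dict, ui) -> None:
    """Load one exercise record into its three display controls."""
    ui.first_control.load(entry["demo_video_info"])                 # cover, name, type...
    ui.second_control.load(entry["score_info"], entry["time_info"])  # score and time
    if entry.get("no_camera"):                                       # special identifier set
        ui.third_control.show_hint("No camera was detected during this exercise")
    else:
        ui.third_control.load(entry["screenshots"])                  # follow-up screenshots
```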
In some embodiments, the follow-up screenshot displayed in the exercise entry is the follow-up screenshot displayed in the corresponding detailed achievement information page, and the specific implementation process of the follow-up screenshot can refer to the above embodiments, which are not described herein again.
FIG. 25 illustrates an interface displaying a user's exercise record, which may be the interface entered after the user operates the "My dance" control of FIG. 9. As shown in FIG. 25, three exercise entries are displayed in the interface; the display area of each entry shows the demonstration video information, the scoring information, the exercise time information, and either a follow-up screenshot or an identifier indicating that no camera was detected. The demonstration video information comprises the cover picture, the type (a beginner course) and the name ("standing right and well after a little rest") of the demonstration video; the scoring information comprises an experience value increment (e.g., +4) and a star-level identifier; and the exercise time information is, for example, 2010-10-10 10:10.
In the above examples, the user can review past follow-up sessions by looking at the exercise records, such as which demonstration videos were followed at what time and how each session was scored, so that the user can plan subsequent exercises based on past performance or discover the action types he or she is good at. For example, the user may follow a demonstration video with a lower score again, or, based on the action types he or she is good at, focus on videos of the corresponding type for further practice.
In a specific implementation, the present invention further provides a computer storage medium, where the computer storage medium may store a program which, when executed, performs some or all of the steps of the method embodiments provided by the present invention. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), or a random access memory (RAM).
Those skilled in the art will readily appreciate that the techniques of the embodiments of the present invention may be implemented by means of software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solutions of the embodiments of the present invention may be embodied, in essence or in part, in the form of a software product. The software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments, or in some parts of the embodiments, of the present invention.
The same and similar parts in the various embodiments in this specification may be referred to each other. In particular, as for the method embodiment, since it is substantially similar to the display apparatus embodiment, the description is simple, and the relevant points can be referred to the description in the display apparatus embodiment.
The above-described embodiments of the present invention should not be construed as limiting the scope of the present invention.

Claims (10)

1. A display device, comprising:
a display, used for displaying a user interface, wherein at least one video window can be displayed in the user interface, and at least one floating layer can be displayed above the video window;
an image collector, used for collecting local images to generate a local video stream;
a controller to:
in response to an input preset instruction, controlling the image collector to collect a local image to generate a local video stream;
playing a local video picture in the video window, and displaying a graphic element for identifying a preset expected position in a floating layer above the local video picture;
detecting whether a moving target exists in the local video picture;
when no moving target exists in the local video picture, presenting, in a floating layer above the local video picture, a prompt control for guiding the moving target to move to the expected position;
when a moving target exists in the local video picture, determining the offset of the target position of the moving target in the local video picture relative to the expected position, based on respectively acquired position coordinates, in a preset coordinate system, of the torso part and/or the target reference point of the moving target and of the expected position;
judging the relation between the offset of the target position of the moving target in the local video picture relative to the expected position and a preset threshold; when the offset is larger than the preset threshold, presenting, in a floating layer above the local video picture and according to the offset of the target position relative to the expected position, a prompt control for guiding the moving target to move to the expected position; and when the offset of the target position of the moving target relative to the expected position is not larger than the preset threshold, cancelling display of the graphic element and the prompt control.
2. The display device according to claim 1, wherein presenting, in a floating layer above the local video picture, a prompt control for guiding the moving target to move to the expected position according to the offset of the target position relative to the expected position comprises:
determining a target moving direction according to the offset of the target position relative to the expected position, wherein the target moving direction points to the expected position;
and according to the target moving direction, presenting an interface prompt for identifying the target moving direction in a floating layer above the local video picture, and/or playing a voice prompt of the target moving direction.
3. The display device according to claim 2, wherein the determining a target moving direction according to the offset of the target position relative to the expected position comprises:
when a single moving target exists in the local video picture, obtaining the target moving direction according to the offset of the target position of the moving target relative to the expected position;
and when a plurality of moving targets exist in the local video picture, obtaining the target moving direction according to the minimum offset among the plurality of offsets corresponding to the plurality of moving targets.
4. The display device according to claim 1, wherein the target moving direction is obtained according to the offset of the target position relative to the expected position, and the controller is further configured to set a graphic frame in a floating layer above the local video picture according to the position and angle of a camera and a preset mapping relationship, wherein the graphic frame is used for representing the expected position.
5. The display device according to claim 1, wherein the offset of the target position of the moving target in the local video picture relative to the expected position is determined based on respectively acquired position coordinates of the moving target and of the target reference point in the preset coordinate system, the target reference point being a center point of a torso portion or a center point of a target contour.
6. The display device of any of claims 1-5, wherein prior to controlling the image collector to collect the local image to generate the local video stream, the controller is further configured to:
in response to an input preset instruction, acquiring a demonstration video, wherein the demonstration video, when played, shows the actions that the moving target needs to follow;
after the cancelling of the display of the graphic element and the prompt control, the controller is further configured to:
setting a first video window for playing the demonstration video and a second video window for playing the local video picture in a user interface, wherein the second video window and the first video window are tiled in the user interface;
and playing the local video picture in the second video window and simultaneously playing the demonstration video in the first video window.
7. An information display method, characterized in that the method comprises:
displaying a user interface, wherein at least one video window can be displayed in the user interface, and at least one floating layer can be displayed above the video window;
in response to an input preset instruction, acquiring a local image to generate a local video stream;
playing a local video picture in the video window, and displaying a graphic element for identifying a preset expected position in a floating layer above the local video picture;
detecting whether a moving target exists in the local video picture;
when no moving target exists in the local video picture, presenting, in a floating layer above the local video picture, a prompt control for guiding the moving target to move to the expected position;
when a moving target exists in the local video picture, determining the offset of the target position of the moving target in the local video picture relative to the expected position, based on respectively acquired position coordinates, in a preset coordinate system, of the torso part and/or the target reference point of the moving target and of the expected position;
judging the relation between the offset of the target position of the moving target in the local video picture relative to the expected position and a preset threshold; when the offset is larger than the preset threshold, presenting, in a floating layer above the local video picture and according to the offset of the target position relative to the expected position, a prompt control for guiding the moving target to move to the expected position; and when the offset of the target position of the moving target relative to the expected position is not larger than the preset threshold, cancelling display of the graphic element and the prompt control.
8. The method of claim 7, wherein presenting, in a floating layer above the local video picture, a prompt control for guiding the moving target to move to the expected position according to the offset of the target position relative to the expected position comprises:
determining a target moving direction according to the offset of the target position relative to the expected position, wherein the target moving direction points to the expected position;
and according to the target moving direction, presenting an interface prompt for identifying the target moving direction in a floating layer above the local video picture, and/or playing a voice prompt of the target moving direction.
9. The method of claim 8, wherein the determining a target moving direction according to the offset of the target position relative to the expected position comprises:
when a single moving target exists in the local video picture, obtaining the target moving direction according to the offset of the target position of the moving target relative to the expected position;
and when a plurality of moving targets exist in the local video picture, obtaining the target moving direction according to the minimum offset among the plurality of offsets corresponding to the plurality of moving targets.
10. The method of any of claims 7-9, further comprising, prior to acquiring the local image to generate the local video stream:
in response to an input preset instruction, acquiring a demonstration video, wherein the demonstration video, when played, shows the actions that the moving target needs to follow;
after the cancelling of the display of the graphic element and the prompt control, the method further comprises:
setting a first video window for playing the demonstration video and a second video window for playing the local video picture in a user interface, wherein the second video window and the first video window are tiled in the user interface;
and playing the local video picture in the second video window and simultaneously playing the demonstration video in the first video window.
CN202010479491.8A 2019-08-18 2020-05-29 Display device and information display method Active CN113596552B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
PCT/CN2020/109859 WO2021032092A1 (en) 2019-08-18 2020-08-18 Display device
CN202080024736.6A CN113678137B (en) 2019-08-18 2020-08-18 Display apparatus
US17/455,575 US11924513B2 (en) 2019-08-18 2021-11-18 Display apparatus and method for display user interface

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2020103642034 2020-04-30
CN202010364203 2020-04-30

Publications (2)

Publication Number Publication Date
CN113596552A CN113596552A (en) 2021-11-02
CN113596552B true CN113596552B (en) 2022-08-19

Family

ID=72767099

Family Applications (8)

Application Number Title Priority Date Filing Date
CN202010412358.0A Active CN113596590B (en) 2019-08-18 2020-05-15 Display device and play control method
CN202010429705.0A Active CN113596551B (en) 2019-08-18 2020-05-20 Display device and play speed adjusting method
CN202010444212.4A Active CN113596537B (en) 2019-08-18 2020-05-22 Display device and playing speed method
CN202010440465.4A Active CN113596536B (en) 2019-08-18 2020-05-22 Display device and information display method
CN202010444296.1A Active CN113591523B (en) 2019-08-18 2020-05-22 Display device and experience value updating method
CN202010459886.1A Pending CN113591524A (en) 2019-08-18 2020-05-27 Display device and interface display method
CN202010479491.8A Active CN113596552B (en) 2019-08-18 2020-05-29 Display device and information display method
CN202010673469.7A Active CN111787375B (en) 2019-08-18 2020-07-13 Display device and information display method

Family Applications Before (6)

Application Number Title Priority Date Filing Date
CN202010412358.0A Active CN113596590B (en) 2019-08-18 2020-05-15 Display device and play control method
CN202010429705.0A Active CN113596551B (en) 2019-08-18 2020-05-20 Display device and play speed adjusting method
CN202010444212.4A Active CN113596537B (en) 2019-08-18 2020-05-22 Display device and playing speed method
CN202010440465.4A Active CN113596536B (en) 2019-08-18 2020-05-22 Display device and information display method
CN202010444296.1A Active CN113591523B (en) 2019-08-18 2020-05-22 Display device and experience value updating method
CN202010459886.1A Pending CN113591524A (en) 2019-08-18 2020-05-27 Display device and interface display method

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202010673469.7A Active CN111787375B (en) 2019-08-18 2020-07-13 Display device and information display method

Country Status (1)

Country Link
CN (8) CN113596590B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112990132B (en) * 2021-04-27 2023-01-03 成都中轨轨道设备有限公司 Positioning and identifying method for track number plate
CN113794917A (en) * 2021-09-15 2021-12-14 海信视像科技股份有限公司 Display device and display control method
CN114049931A (en) * 2021-11-18 2022-02-15 河北体育学院 Aerobics exercises training electron auxiliary system
CN115202530B (en) * 2022-05-26 2024-04-09 当趣网络科技(杭州)有限公司 Gesture interaction method and system of user interface
CN115202531A (en) * 2022-05-27 2022-10-18 当趣网络科技(杭州)有限公司 Interface interaction method and system and electronic device
CN114880060B (en) * 2022-05-27 2023-12-22 度小满科技(北京)有限公司 Information display method and device
CN115695889A (en) * 2022-09-30 2023-02-03 聚好看科技股份有限公司 Display device and floating window display method
CN116980654B (en) * 2023-09-22 2024-01-19 北京小糖科技有限责任公司 Interaction method, device, equipment and storage medium based on video teaching
CN117455956A (en) * 2023-12-22 2024-01-26 天津众合智控科技有限公司 AI technology-based man-package association tracking method and system

Family Cites Families (61)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005185830A (en) * 2003-12-02 2005-07-14 Matsushita Electric Ind Co Ltd Audio-visual history information storage record medium and actual experience service system
US8529409B1 (en) * 2006-02-22 2013-09-10 Jennifer Lesea-Ames Mobile personal fitness training
JP5112136B2 (en) * 2008-03-27 2013-01-09 株式会社エクシング Karaoke game system, karaoke apparatus and program
JP5624625B2 (en) * 2010-09-22 2014-11-12 パナソニック株式会社 Exercise support system
CN102185731B (en) * 2011-02-22 2014-04-02 北京星网锐捷网络技术有限公司 Network health degree testing method and system
CN102724449A (en) * 2011-03-31 2012-10-10 青岛海信电器股份有限公司 Interactive TV and method for realizing interaction with user by utilizing display device
KR101233861B1 (en) * 2011-05-25 2013-02-15 임강준 A running machine
US9390318B2 (en) * 2011-08-31 2016-07-12 Empire Technology Development Llc Position-setup for gesture-based game system
US20130158685A1 (en) * 2011-12-20 2013-06-20 Wanamaker Corporation Competitive Activity Social Networking And Event Management
KR101304111B1 (en) * 2012-03-20 2013-09-05 김영대 A dancing karaoke system
US10139985B2 (en) * 2012-06-22 2018-11-27 Matterport, Inc. Defining, displaying and interacting with tags in a three-dimensional model
US20180130373A1 (en) * 2012-11-29 2018-05-10 Corinne BERNARD-PAROLY Exercise mangement system with body sensor
CN105228708B (en) * 2013-04-02 2017-08-29 日本电气方案创新株式会社 Body action scoring apparatus, dancing scoring apparatus, Caraok device and game device
KR101477486B1 (en) * 2013-07-24 2014-12-30 (주) 프람트 An apparatus of providing a user interface for playing and editing moving pictures and the method thereof
WO2015013338A2 (en) * 2013-07-26 2015-01-29 Cv Studios Entertainment, Inc. Enhanced mobile video platform
US10134296B2 (en) * 2013-10-03 2018-11-20 Autodesk, Inc. Enhancing movement training with an augmented reality mirror
US10713494B2 (en) * 2014-02-28 2020-07-14 Second Spectrum, Inc. Data processing systems and methods for generating and interactive user interfaces and interactive game systems based on spatiotemporal analysis of video content
CN105451330B (en) * 2014-09-25 2019-07-30 阿里巴巴集团控股有限公司 Mobile terminal locating method and its device based on electromagnetic signal
CN104239420B (en) * 2014-10-20 2017-06-06 北京畅景立达软件技术有限公司 A kind of video Similarity Match Method based on video finger print
KR101711488B1 (en) * 2015-01-28 2017-03-03 한국전자통신연구원 Method and System for Motion Based Interactive Service
CN104882147A (en) * 2015-06-05 2015-09-02 福建星网视易信息系统有限公司 Method, device and system for displaying singing score
CN104978762B (en) * 2015-07-13 2017-12-08 北京航空航天大学 Clothes threedimensional model generation method and system
US10282914B1 (en) * 2015-07-17 2019-05-07 Bao Tran Systems and methods for computer assisted operation
CN105898133A (en) * 2015-08-19 2016-08-24 乐视网信息技术(北京)股份有限公司 Video shooting method and device
US20180308524A1 (en) * 2015-09-07 2018-10-25 Bigvu Inc. System and method for preparing and capturing a video file embedded with an image file
CN106610993B (en) * 2015-10-23 2019-12-10 北京国双科技有限公司 Video preview display method and device
CN106919890A (en) * 2015-12-25 2017-07-04 中国移动通信集团公司 A kind of method and device for evaluating user action standard
CN107045502A (en) * 2016-02-05 2017-08-15 腾讯科技(深圳)有限公司 Play history page generation method and device
CN106570719B (en) * 2016-08-24 2018-05-11 阿里巴巴集团控股有限公司 A kind of data processing method and device
CN108668123A (en) * 2017-03-31 2018-10-16 华为技术有限公司 A kind of acquisition methods and network element device of video tastes evaluation result
CN108960002A (en) * 2017-05-17 2018-12-07 中兴通讯股份有限公司 A kind of movement adjustment information reminding method and device
CN108932731B (en) * 2017-05-24 2021-02-05 上海云从企业发展有限公司 Target tracking method and system based on prior information
CN107170456A (en) * 2017-06-28 2017-09-15 北京云知声信息技术有限公司 Method of speech processing and device
CN107423389A (en) * 2017-07-20 2017-12-01 努比亚技术有限公司 A kind of webpage reduced graph generating method, device and computer-readable recording medium
CN107393519B (en) * 2017-08-03 2020-09-15 腾讯音乐娱乐(深圳)有限公司 Display method, device and storage medium for singing scores
CN107679742A (en) * 2017-09-28 2018-02-09 中国平安人寿保险股份有限公司 Public good contributes book review and divides method, apparatus, equipment and computer-readable recording medium
CN109725784B (en) * 2017-10-30 2022-04-12 华为技术有限公司 Information display method and terminal equipment
CN109756749A (en) * 2017-11-07 2019-05-14 阿里巴巴集团控股有限公司 Video data handling procedure, device, server and storage medium
CN107968921B (en) * 2017-11-23 2020-02-28 香港乐蜜有限公司 Video generation method and device and electronic equipment
CN107909060A (en) * 2017-12-05 2018-04-13 前海健匠智能科技(深圳)有限公司 Gymnasium body-building action identification method and device based on deep learning
CN107959881A (en) * 2017-12-06 2018-04-24 安徽省科普产品工程研究中心有限责任公司 A kind of video teaching system based on children's mood
CN108133739B (en) * 2017-12-14 2021-06-04 咪咕互动娱乐有限公司 Motion path pushing method and device and storage medium
WO2019178555A1 (en) * 2018-03-15 2019-09-19 Rovi Guides, Inc. Methods and systems for selecting a destination for storage of a media asset based on trick-play likelihood
CN108615055B (en) * 2018-04-19 2021-04-27 咪咕动漫有限公司 Similarity calculation method and device and computer readable storage medium
CN108616774A (en) * 2018-05-18 2018-10-02 中版云教育科技(北京)有限公司 Playback method, device, mobile terminal and the storage medium of calligraphy teaching video
CN108853946A (en) * 2018-07-10 2018-11-23 燕山大学 A kind of exercise guide training system and method based on Kinect
CN110753246A (en) * 2018-07-23 2020-02-04 优视科技有限公司 Video playing method, client, server and system
CN109168062B (en) * 2018-08-28 2020-11-24 北京达佳互联信息技术有限公司 Video playing display method and device, terminal equipment and storage medium
CN109344715A (en) * 2018-08-31 2019-02-15 北京达佳互联信息技术有限公司 Intelligent composition control method, device, electronic equipment and storage medium
CN108961876B (en) * 2018-09-18 2021-02-05 榫卯科技服务(温州)有限公司 Network platform for dance online learning and teaching
CN109284402B (en) * 2018-09-20 2021-06-29 咪咕互动娱乐有限公司 Information recommendation method and device and storage medium
CN109376705A (en) * 2018-11-30 2019-02-22 努比亚技术有限公司 Dance training methods of marking, device and computer readable storage medium
CN109886526A (en) * 2018-12-27 2019-06-14 东软集团股份有限公司 Method, apparatus, storage medium and the electronic equipment of attendance evaluation
CN109741744B (en) * 2019-01-14 2021-03-09 博拉网络股份有限公司 AI robot conversation control method and system based on big data search
CN109767786B (en) * 2019-01-29 2020-10-16 广州势必可赢网络科技有限公司 Online voice real-time detection method and device
CN110012311B (en) * 2019-05-08 2021-04-06 江苏康兮运动健康研究院有限公司 Method, device and system for playing audio and video through action guidance
CN110213612B (en) * 2019-07-10 2021-05-25 广州酷狗计算机科技有限公司 Live broadcast interaction method and device and storage medium
CN110471529A (en) * 2019-08-07 2019-11-19 北京卡路里信息技术有限公司 Act methods of marking and device
CN110751026B (en) * 2019-09-09 2023-10-27 深圳追一科技有限公司 Video processing method and related device
CN110856042B (en) * 2019-11-18 2022-02-11 腾讯科技(深圳)有限公司 Video playing method and device, computer readable storage medium and computer equipment
CN111054078B (en) * 2019-12-20 2023-05-16 网易(杭州)网络有限公司 Object information acquisition method and device

Also Published As

Publication number Publication date
CN113591523A (en) 2021-11-02
CN113596536B (en) 2022-09-09
CN113596551B (en) 2022-08-12
CN113596537A (en) 2021-11-02
CN113591524A (en) 2021-11-02
CN113596536A (en) 2021-11-02
CN111787375B (en) 2023-02-28
CN113591523B (en) 2023-11-24
CN113596590B (en) 2022-08-26
CN111787375A (en) 2020-10-16
CN113596590A (en) 2021-11-02
CN113596537B (en) 2022-09-02
CN113596551A (en) 2021-11-02
CN113596552A (en) 2021-11-02

Similar Documents

Publication Publication Date Title
CN113596552B (en) Display device and information display method
CN111163274B (en) Video recording method and display equipment
CN112399234B (en) Interface display method and display equipment
US11924513B2 (en) Display apparatus and method for display user interface
CN112272324B (en) Follow-up mode control method and display device
WO2021088320A1 (en) Display device and content display method
CN112533037B (en) Method for generating Lian-Mai chorus works and display equipment
CN112333499A (en) Method for searching target equipment and display equipment
CN112399212A (en) Display device, file sharing method and server
CN112040272A (en) Intelligent explanation method for sports events, server and display equipment
WO2020248697A1 (en) Display device and video communication data processing method
CN111291219A (en) Method for changing interface background color and display equipment
CN112788422A (en) Display device
WO2022037224A1 (en) Display device and volume control method
CN113678137B (en) Display apparatus
CN112533023B (en) Method for generating Lian-Mai chorus works and display equipment
CN112839254A (en) Display apparatus and content display method
CN112399235A (en) Method for enhancing photographing effect of camera of smart television and display device
CN112073777A (en) Voice interaction method and display device
CN112565892B (en) Method for identifying roles of video programs and related equipment
CN114339346A (en) Display device and image recognition result display method
CN112786036A (en) Display apparatus and content display method
CN117156208A (en) Terminal device, display device, and focus adjustment method
CN112073763A (en) Display device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant