CN113141527B - Voice playing method based on content and display equipment

Info

Publication number
CN113141527B
CN113141527B · CN202010054165.2A
Authority
CN
China
Prior art keywords
selector
playing
progress
user
voice
Prior art date
Legal status
Active
Application number
CN202010054165.2A
Other languages
Chinese (zh)
Other versions
CN113141527A (en)
Inventor
Li Jinbo (李金波)
Zhang Mingshan (张明山)
Zhao Tongqing (赵同庆)
Current Assignee
Vidaa Netherlands International Holdings BV
Original Assignee
Qingdao Hisense Media Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Qingdao Hisense Media Network Technology Co Ltd
Priority application: CN202010054165.2A (CN113141527B)
Priority application: PCT/CN2020/091876 (WO2021142999A1)
Publication of CN113141527A
Priority application: US17/684,562 (US20220188069A1)
Application granted
Publication of CN113141527B
Legal status: Active


Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 — Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/4104 — Structure of client peripherals: peripherals receiving signals from specially adapted client devices
    • H04N 21/42204 — User interfaces specially adapted for controlling a client device through a remote control device; remote control devices therefor
    • H04N 21/47217 — End-user interface for manipulating displayed content, e.g. controlling playback functions for recorded or on-demand content using progress bars, mode or play-point indicators or bookmarks
    • H04N 21/8456 — Structuring of content by decomposing the content in the time domain, e.g. in time segments

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Controls And Circuits For Display Device (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Embodiments of the invention relate to the field of display technology, and in particular to a content-based voice playing method and display device, used to reduce the interference of frequently played timestamps, subtitles and the like with the sound of the normal video. The display device includes: a display for presenting a user interface for playing audio/video content, the user interface further including a time progress bar indicating the playing progress of the audio/video content and a selector, the position of the selector in the user interface being movable by user input; a user input interface for receiving instructions input by a user; and a controller configured to: in response to a first input instruction, position the selector on the time progress bar and simultaneously play first voice content corresponding to the first progress indicated by the selector; and after the first voice content finishes playing and a preset time interval elapses, play second voice content corresponding to the second progress indicated by the selector.

Description

Voice playing method based on content and display device
Technical Field
The invention relates to the field of display technology, and in particular to a content-based voice playing method and display device.
Background
As the internet enters thousands of households, network-based content services are becoming more and more important, and accessibility aids related to content services are also receiving increasing attention. To meet the needs of visually impaired people, some countries require that related devices support a voice playing function.
The voice playing function mainly serves visually impaired users, who control the interface according to the feedback of voice prompts. Devices supporting voice playback are common on the market today, but when a video is playing and the focus is on the time progress bar, the time is typically accurate to the second (or even the millisecond) and updates every second. This causes the seconds to be read out too frequently and the time to be read out incompletely. For example, with the common precision of one second, as the focus rests on the time progress bar at "00:26:05", "00:26:06", "00:26:07" and "00:26:08" in sequence, "00:26:05" can hardly be spoken in full within one second before the updated time is read again, so the playback keeps repeating "twenty-six, twenty-six, twenty-six", and the user experience is very poor. Yet reading the time is necessary: for example, a user who jumps to a previously played scene using the time bar needs to hear the time-bar feedback before continuing with the next operation.
Disclosure of Invention
The application provides a content-based voice playing method and display device, which reduce the interference of frequently played timestamps, subtitles and the like with normal video sound during video playback and improve the user experience.
In a first aspect, a display device is provided, comprising:
a display for presenting a user interface for playing audio/video content, the user interface further comprising a time progress bar indicating the playing progress of the audio/video content and a selector, the position of the selector in the user interface being movable by user input;
a user input interface for receiving instructions input by a user;
a controller configured to perform:
in response to a first input instruction, positioning the selector on the time progress bar and simultaneously playing first voice content corresponding to the first progress indicated by the selector;
after the first voice content finishes playing and a preset time interval elapses, playing second voice content corresponding to the second progress indicated by the selector.
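Purely as an illustration (not part of the claimed subject matter), the first-aspect behaviour can be sketched as an announcement loop, assuming a hypothetical asynchronous speak() text-to-speech helper that resolves when playback finishes; all identifiers below are illustrative assumptions:

```typescript
// Sketch of the first-aspect announcement loop (illustrative names only):
// the next progress is voiced only after the previous broadcast completes
// AND a preset interval elapses; updates in between are silently dropped.

const PRESET_INTERVAL_MS = 2000; // the "preset time interval"

async function announceTimelineProgress(
  speak: (text: string) => Promise<void>, // assumed TTS helper; resolves when speech ends
  currentProgress: () => string,          // e.g. returns "00:10:05"
  selectorOnTimeline: () => boolean,      // true while the selector stays on the bar
): Promise<void> {
  while (selectorOnTimeline()) {
    await speak(currentProgress()); // first voice content
    await new Promise<void>((resolve) => setTimeout(resolve, PRESET_INTERVAL_MS));
    // The next iteration reads the progress indicated at that moment (the
    // "second progress"); every update during speech plus interval was skipped.
  }
}
```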
In a second aspect, a display device is provided, comprising:
a display for presenting a user interface for playing audio/video content, the user interface further comprising a time progress bar indicating the playing progress of the audio/video content and a selector, the position of the selector in the user interface being movable by user input;
a user input interface for receiving instructions input by a user;
a controller configured to perform:
in response to a first input instruction, positioning the selector on the time progress bar and simultaneously playing first voice content corresponding to the first progress indicated by the selector;
in response to a second input instruction, moving the position of the selector on the time progress bar, so that the selector changes from indicating the first progress to indicating a second progress;
in response to a third input instruction, moving the position of the selector on the time progress bar, so that the selector changes from indicating the second progress to indicating a third progress, and simultaneously playing third voice content corresponding to the third progress indicated by the selector;
wherein the input time interval between the second input instruction and the third input instruction is smaller than a preset value.
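One way to realize this second aspect, again purely as an illustration, is a debounce on the selector-move instructions: a move arriving within the preset value cancels the pending announcement of the intermediate progress, so only the latest progress is spoken. The speak() helper and all names are assumptions, as above:

```typescript
// Sketch of the second-aspect behaviour (a debounce, not the patent's code):
// if another move arrives within PRESET_VALUE_MS, the intermediate progress
// is never voiced and only the latest indicated progress is announced.

const PRESET_VALUE_MS = 400; // assumed threshold between input instructions

function makeSelectorMoveHandler(speak: (text: string) => void) {
  let pending: ReturnType<typeof setTimeout> | undefined;
  return (indicatedProgress: string): void => {
    if (pending !== undefined) clearTimeout(pending); // drop superseded progress
    pending = setTimeout(() => speak(indicatedProgress), PRESET_VALUE_MS);
  };
}

// Usage: three rapid key presses announce only the final progress.
// const onMove = makeSelectorMoveHandler(console.log);
// onMove("00:10:05"); onMove("00:11:05"); onMove("00:12:05"); // speaks "00:12:05"
```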
In a third aspect, a content-based voice playing method is provided, comprising:
in response to a first input instruction, positioning the selector on the time progress bar and simultaneously playing first voice content corresponding to the first progress indicated by the selector;
after the first voice content finishes playing and a preset time interval elapses, playing second voice content corresponding to the second progress indicated by the selector.
In a fourth aspect, a content-based voice playing method is provided, comprising:
in response to a first input instruction, positioning the selector on the time progress bar and simultaneously playing first voice content corresponding to the first progress indicated by the selector;
in response to a second input instruction, moving the position of the selector on the time progress bar, so that the selector changes from indicating the first progress to indicating a second progress;
in response to a third input instruction, moving the position of the selector on the time progress bar, so that the selector changes from indicating the second progress to indicating a third progress, and simultaneously playing third voice content corresponding to the third progress indicated by the selector;
wherein the input time interval between the second input instruction and the third input instruction is smaller than a preset value.
In the above embodiments, a user interface for playing audio/video content is displayed on the display device, and the position of the selector in the user interface is moved by user input. When responding to a user input instruction, the next voice broadcast starts only after the previous broadcast has completed and a set period of time has elapsed between the two, which effectively reduces the interference of frequently played timestamps, subtitles and the like with normal video sound and improves the user experience.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1A is a schematic diagram illustrating an operation scenario between a display device and a control apparatus;
fig. 1B is a block diagram schematically illustrating a configuration of the control apparatus 100 in fig. 1A;
fig. 1C is a block diagram schematically illustrating a configuration of the display device 200 in fig. 1A;
FIG. 1D is a block diagram illustrating an architectural configuration of an operating system in memory of display device 200;
fig. 2 is a diagram exemplarily illustrating a voice guide opening screen provided by the display apparatus 200;
figs. 3A-3B schematically show a GUI 400 provided by the display device 200 by operating the control apparatus 100;
figs. 4A-4F illustrate flow charts of a content-based voice playing method;
figs. 5A to 5D are diagrams exemplarily showing a GUI provided by the display device 200 by operating the control apparatus 100;
figs. 6A-6B illustrate flow charts of another content-based voice playing method;
fig. 7 illustrates a flow chart of yet another content-based voice playing method.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings. Apparently, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
The term "user interface" in this application is a media interface for interaction and information exchange between an application or operating system and a user that enables conversion between an internal form of information and a form acceptable to the user. A common presentation form of a user interface is a Graphical User Interface (GUI), which refers to a user interface related to computer operations and displayed in a graphical manner. It may be an interface element such as an icon, a window, a control, etc. displayed in a display screen of the display device, where the control may include a visual interface element such as an icon, a button, a menu, a tab, a text box, a dialog box, a status bar, a navigation bar, a Widget, etc.
Fig. 1A is a schematic diagram illustrating an operation scenario between a display device and a control apparatus. As shown in fig. 1A, the control apparatus 100 and the display device 200 may communicate with each other in a wired or wireless manner.
The control apparatus 100 is configured to control the display device 200: it receives operation instructions input by the user and converts them into instructions that the display device 200 can recognize and respond to, acting as an intermediary between the user and the display device 200. For example: the user operates the channel up/down keys on the control apparatus 100, and the display device 200 responds to the channel up/down operation.
The control apparatus 100 may be a remote controller 100A, which communicates with the display apparatus 200 through infrared protocol communication, Bluetooth protocol communication or other short-distance communication methods, controlling the display apparatus 200 wirelessly or by other wired means. The user may input user instructions through keys on the remote controller, voice input, control panel input and the like to control the display apparatus 200. For example: the user can input corresponding control commands through the volume up/down keys, channel control keys, up/down/left/right movement keys, voice input key, menu key, power on/off key and the like on the remote controller, to implement control of the display device 200.
The control apparatus 100 may also be an intelligent device, such as a mobile terminal 100B, a tablet computer, a notebook computer, and the like. For example, the display device 200 is controlled using an application program running on the smart device. Through configuration, the application program may provide various controls to the user on an intuitive user interface (UI) on a screen associated with the smart device.
For example, the mobile terminal 100B and the display device 200 may each install a software application, implementing connection and communication through a network communication protocol for the purpose of one-to-one control operation and data communication. For example: the mobile terminal 100B may establish a control instruction protocol with the display device 200, so that operating the various function keys or virtual buttons of the user interface provided on the mobile terminal 100B implements the functions of the physical keys arranged on the remote control 100A. The audio and video content displayed on the mobile terminal 100B may also be transmitted to the display device 200 to implement a synchronous display function.
The display apparatus 200 may provide a network television function combining a broadcast receiving function with computer support functions. The display device may be implemented as a digital television, a web television, an Internet Protocol television (IPTV), or the like.
The display device 200 may be a liquid crystal display, an organic light emitting display, a projection device. The specific display device type, size, resolution, etc. are not limited.
The display apparatus 200 also performs data communication with the server 300 through various communication means. Here, the display apparatus 200 may be communicatively connected through a Local Area Network (LAN), a Wireless Local Area Network (WLAN) or other networks. The server 300 may provide various contents and interactions to the display apparatus 200. By way of example, the display device 200 may send and receive information, such as: receiving Electronic Program Guide (EPG) data, receiving software program updates, or accessing a remotely stored digital media library. The server 300 may be a group or groups of servers, and may be one or more types of servers. Other web service contents, such as video on demand and advertisement services, are provided through the server 300.
Fig. 1B is a block diagram illustrating the configuration of the control device 100. As shown in fig. 1B, the control device 100 includes a controller 110, a memory 120, a communicator 130, a user input interface 140, an output interface 150, and a power supply 160.
The controller 110 includes a random access memory (RAM) 111, a read-only memory (ROM) 112, a processor 113, a communication interface, and a communication bus. The controller 110 is used to control the operation and running of the control device 100, the communication and cooperation among the internal components, and the external and internal data processing functions.
Illustratively, when an interaction of a user pressing a key disposed on the remote controller 100A or an interaction of touching a touch panel disposed on the remote controller 100A is detected, the controller 110 may control to generate a signal corresponding to the detected interaction and transmit the signal to the display device 200.
And a memory 120 for storing various operation programs, data and applications for driving and controlling the control apparatus 100 under the control of the controller 110. The memory 120 may store various control signal commands input by a user.
The communicator 130 enables communication of control signals and data signals with the display apparatus 200 under the control of the controller 110. For example: the control apparatus 100 transmits a control signal (e.g., a touch signal or a button signal) to the display device 200 via the communicator 130, and the control apparatus 100 may receive signals transmitted by the display device 200 via the communicator 130. The communicator 130 may include an infrared signal interface 131 and a radio frequency signal interface 132. For example: when the infrared signal interface is used, a user input instruction needs to be converted into an infrared control signal according to the infrared control protocol and sent to the display device 200 through the infrared sending module. As another example: when the radio frequency signal interface is used, a user input instruction needs to be converted into a digital signal, modulated according to the radio frequency control signal modulation protocol, and then transmitted to the display device 200 through the radio frequency sending terminal.
The user input interface 140 may include at least one of a microphone 141, a touch pad 142, a sensor 143, a key 144, and the like, so that a user can input a user instruction regarding controlling the display apparatus 200 to the control apparatus 100 through voice, touch, gesture, press, and the like.
The output interface 150 outputs a user instruction received by the user input interface 140 to the display apparatus 200, or outputs an image or voice signal received by the display apparatus 200. Here, the output interface 150 may include an LED interface 151, a vibration interface 152 generating vibration, a sound output interface 153 outputting sound, a display 154 outputting an image, and the like. For example, the remote controller 100A may receive an output signal such as audio, video, or data from the output interface 150, and display the output signal in the form of an image on the display 154, in the form of audio on the sound output interface 153, or in the form of vibration on the vibration interface 152.
And a power supply 160 for providing operational power support for each element of the control device 100 under the control of the controller 110. The power supply 160 may take the form of a battery and associated control circuitry.
A hardware configuration block diagram of the display device 200 is exemplarily illustrated in fig. 1C. As shown in fig. 1C, the display apparatus 200 may include a tuner demodulator 210, a communicator 220, a detector 230, an external device interface 240, a controller 250, a memory 260, a user interface 265, a video processor 270, a display 275, an audio processor 280, an audio output interface 285, and a power supply 290.
The tuner demodulator 210 receives broadcast television signals in a wired or wireless manner, may perform processing such as amplification, mixing and resonance, and demodulates, from among a plurality of wireless or wired broadcast television signals, the audio/video signal carried on the frequency of the television channel selected by the user, as well as additional information (e.g., EPG data).
The tuner demodulator 210 responds to the television channel frequency selected by the user and the television signal carried on that frequency, as selected by the user and controlled by the controller 250.
The tuner demodulator 210 can receive a television signal in various ways according to the broadcasting system of the television signal, such as: terrestrial broadcasting, cable broadcasting, satellite broadcasting, internet broadcasting, or the like; and according to different modulation types, a digital modulation mode or an analog modulation mode can be adopted; and can demodulate the analog signal and the digital signal according to the different kinds of the received television signals.
In other exemplary embodiments, the tuning demodulator 210 may also be in an external device, such as an external set-top box. In this way, the set-top box outputs a television signal after modulation and demodulation, and inputs the television signal into the display apparatus 200 through the external device interface 240.
The communicator 220 is a component for communicating with an external device or an external server according to various communication protocol types. For example, the display apparatus 200 may transmit content data to an external apparatus connected via the communicator 220, or browse and download content data from an external apparatus connected via the communicator 220. The communicator 220 may include a network communication protocol module or a near field communication protocol module, such as a WIFI module 221, a bluetooth communication protocol module 222, and a wired ethernet communication protocol module 223, so that the communicator 220 may receive a control signal of the control device 100 according to the control of the controller 250 and implement the control signal as a WIFI signal, a bluetooth signal, a radio frequency signal, and the like.
The detector 230 is a component of the display apparatus 200 for collecting signals of an external environment or interaction with the outside. The detector 230 may include a sound collector 231, such as a microphone, which may be used to receive a user's sound, such as a voice signal of a control instruction of the user to control the display device 200; alternatively, ambient sounds may be collected that identify the type of ambient scene, enabling the display device 200 to adapt to ambient noise.
In some other exemplary embodiments, the detector 230, which may further include an image collector 232, such as a camera, a video camera, etc., may be configured to collect external environment scenes to adaptively change the display parameters of the display device 200; and the function of acquiring the attribute of the user or interacting gestures with the user so as to realize the interaction between the display equipment and the user.
In some other exemplary embodiments, the detector 230 may further include a light receiver for collecting the intensity of the ambient light to adapt to the display parameter variation of the display device 200.
In some other exemplary embodiments, the detector 230 may further include a temperature sensor, such as by sensing an ambient temperature, and the display device 200 may adaptively adjust a display color temperature of the image. For example, when the temperature is higher, the display apparatus 200 may be adjusted to display a color temperature of an image that is cooler; when the temperature is lower, the display device 200 may be adjusted to display a warmer color temperature of the image.
The external device interface 240 is a component for providing the controller 250 to control data transmission between the display apparatus 200 and an external apparatus. The external device interface 240 may be connected to an external apparatus such as a set-top box, a game device, a notebook computer, etc. in a wired/wireless manner, and may receive data such as a video signal (e.g., moving image), an audio signal (e.g., music), additional information (e.g., EPG), etc. of the external apparatus.
The external device interface 240 may include: a High Definition Multimedia Interface (HDMI) terminal 241, a Composite Video Blanking Sync (CVBS) terminal 242, an analog or digital Component terminal 243, a Universal Serial Bus (USB) terminal 244, a Component terminal (not shown), a red, green, blue (RGB) terminal (not shown), and the like.
The controller 250 controls the operation of the display device 200 and responds to user operations by running various software control programs (such as an operating system and various application programs) stored in the memory 260. For example, the controller may be implemented as a System on Chip (SoC).
As shown in fig. 1C, the controller 250 includes a Random Access Memory (RAM)251, a Read Only Memory (ROM)252, a graphics processor 253, a CPU processor 254, a communication interface 255, and a communication bus 256. The RAM251, the ROM252, the graphic processor 253, and the CPU processor 254 are connected to each other through a communication bus 256 through a communication interface 255.
The ROM252 stores various system boot instructions. When the display apparatus 200 starts power-on upon receiving the power-on signal, the CPU processor 254 executes a system boot instruction in the ROM252, copies the operating system stored in the memory 260 to the RAM251, and starts running the boot operating system. After the start of the operating system is completed, the CPU processor 254 copies the various application programs in the memory 260 to the RAM251 and then starts running and starting the various application programs.
And a graphic processor 253 for generating various graphic objects such as icons, operation menus, and user input instruction display graphics, etc. The graphic processor 253 may include an operator for performing an operation by receiving various interactive instructions input by a user, and further displaying various objects according to display attributes; and a renderer for generating various objects based on the operator and displaying the rendered result on the display 275.
A CPU processor 254 for executing operating system and application program instructions stored in memory 260. And according to the received user input instruction, processing of various application programs, data and contents is executed so as to finally display and play various audio-video contents.
In some example embodiments, the CPU processor 254 may comprise a plurality of processors. The plurality of processors may include one main processor and a plurality of or one sub-processor. A main processor for performing some initialization operations of the display apparatus 200 in the display apparatus preload mode and/or operations of displaying a screen in the normal mode. A plurality of or one sub-processor for performing an operation in a state of a standby mode or the like of the display apparatus.
The communication interface 255 may include a first interface to an nth interface. These interfaces may be network interfaces that are connected to external devices via a network.
The controller 250 may control the overall operation of the display apparatus 200. For example: in response to receiving a user input command for selecting a GUI object displayed on the display 275, the controller 250 may perform an operation related to the object selected by the user input command. For example, the controller may be implemented as an SOC (System on Chip) or an MCU (Micro Control Unit).
Where the object may be any one of the selectable objects, such as a hyperlink or an icon. The operation related to the selected object is, for example, an operation of displaying a link to a hyperlink page, document, image, or the like, or an operation of executing a program corresponding to the object. The user input command for selecting the GUI object may be a command input through various input means (e.g., a mouse, a keyboard, a touch panel, etc.) connected to the display apparatus 200 or a voice command corresponding to a voice spoken by the user.
A memory 260 for storing various types of data, software programs, or applications for driving and controlling the operation of the display device 200. The memory 260 may include volatile and/or nonvolatile memory. And the term "memory" includes the memory 260, the RAM251 and the ROM252 of the controller 250, or a memory card in the display device 200.
In some embodiments, the memory 260 is specifically used for storing an operating program for driving the controller 250 of the display device 200; storing various application programs built in the display apparatus 200 and downloaded by a user from an external apparatus; data such as visual effect images for configuring various GUIs provided by the display 275, various objects related to the GUIs, and selectors for selecting GUI objects are stored.
In some embodiments, memory 260 is specifically configured to store drivers for tuner demodulator 210, communicator 220, detector 230, external device interface 240, video processor 270, display 275, audio processor 280, etc., and related data, such as external data (e.g., audio-visual data) received from the external device interface or user data (e.g., key information, voice information, touch information, etc.) received by the user interface.
In some embodiments, memory 260 specifically stores software and/or programs representing an Operating System (OS), which may include, for example: a kernel, middleware, an Application Programming Interface (API), and/or an application program. Illustratively, the kernel may control or manage system resources, as well as functions implemented by other programs (e.g., the middleware, APIs, or applications); at the same time, the kernel may provide an interface to allow middleware, APIs, or applications to access the controller to enable control or management of system resources.
A block diagram of the architectural configuration of the operating system in the memory of the display device 200 is illustrated in fig. 1D. The operating system architecture comprises an application layer, a middleware layer and a kernel layer from top to bottom.
The application layer: applications built into the system and non-system-level applications belong to the application layer, which is responsible for direct interaction with the user. The application layer may include multiple applications, such as a settings application, a post application, a media center application, and the like. These applications may be implemented as Web applications executed on a WebKit engine; in particular, they may be developed and executed based on HTML5, Cascading Style Sheets (CSS) and JavaScript.
Here, HTML (HyperText Markup Language) is the standard markup language for creating web pages. It describes web pages with markup tags, where HTML tags are used to describe text, graphics, animation, sound, tables, links, etc.; a browser reads an HTML document, interprets the content of the tags in the document, and displays it in the form of a web page.
CSS (Cascading Style Sheets) is a computer language used to express the style of HTML documents, and may be used to define style structures such as fonts, colors and positions. A CSS style can be stored directly in an HTML web page or in a separate style file, enabling control over the styles in the web page.
JavaScript is a language used in web page programming; it can be inserted into an HTML page and is interpreted and executed by the browser. The interaction logic of a Web application is implemented in JavaScript, which can encapsulate a JavaScript extension interface through the browser to communicate with the kernel layer.
the middleware layer may provide some standardized interfaces to support the operation of various environments and systems. For example, the middleware layer may be implemented as multimedia and hypermedia information coding experts group (MHEG) middleware related to data broadcasting, DLNA middleware which is middleware related to communication with an external device, middleware which provides a browser environment in which each application program in the display device operates, and the like.
The kernel layer provides core system services, such as: file management, memory management, process management, network management, system security authority management and the like. The kernel layer may be implemented as a kernel based on various operating systems, for example, a kernel based on the Linux operating system.
The kernel layer also provides communication between system software and hardware, and provides device driver services for various hardware, such as: a display driver for the display, a camera driver for the camera, a key driver for the remote controller, a WiFi driver for the WIFI module, an audio driver for the audio output interface, a power management driver for the power management (PM) module, and the like.
A user interface 265 receives various user interactions. Specifically, it is used to transmit an input signal of a user to the controller 250 or transmit an output signal from the controller 250 to the user. For example, the remote controller 100A may transmit an input signal, such as a power switch signal, a channel selection signal, a volume adjustment signal, etc., input by the user to the user interface 265, and then the input signal is transferred to the controller 250 through the user interface 265; alternatively, the remote controller 100A may receive an output signal such as audio, video, or data output from the user interface 265 via the controller 250, and display the received output signal or output the received output signal in audio or vibration form.
In some embodiments, the user may enter user commands in a Graphical User Interface (GUI) displayed on the display 275, and the user interface 265 receives the user input commands through the GUI. Specifically, the user interface 265 may receive user input commands for controlling the position of a selector in the GUI to select different objects or items.
Alternatively, the user may input a user command by inputting a specific sound or gesture, and the user interface 265 receives the user input command by recognizing the sound or gesture through the sensor.
The video processor 270 is configured to receive an external video signal, and perform video data processing such as decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion, and image synthesis according to a standard codec protocol of the input signal, so as to obtain a video signal that is directly displayed or played on the display 275.
Illustratively, the video processor 270 includes a demultiplexing module, a video decoding module, an image synthesizing module, a frame rate conversion module, a display formatting module, and the like.
The demultiplexing module is configured to demultiplex an input audio/video data stream: for an input MPEG-2 stream (a compression standard for digital storage media moving images and audio), for example, the demultiplexing module demultiplexes it into a video signal and an audio signal.
And the video decoding module is used for processing the video signal after demultiplexing, including decoding, scaling and the like.
The image synthesis module is used to superimpose and mix the GUI signal, input by the user or generated by the graphics generator, with the scaled video image, so as to generate an image signal for display.
The frame rate conversion module is configured to convert the frame rate of the input video, for example converting an input 60 Hz video to a frame rate of 120 Hz or 240 Hz, usually by means of frame interpolation.
And a display formatting module for converting the signal output by the frame rate conversion module into a signal conforming to a display format of a display, such as converting the format of the signal output by the frame rate conversion module to output an RGB data signal.
The display 275 is used for receiving the image signal from the video processor 270 and displaying video content, images and the menu manipulation interface. The displayed video content may come from the video content in the broadcast signal received by the tuner demodulator 210, or from video content input via the communicator 220 or the external device interface 240. Meanwhile, the display 275 displays a user manipulation interface (UI) generated in the display apparatus 200 and used for controlling the display apparatus 200.
And, the display 275 may include a display screen assembly for presenting a picture and a driving assembly for driving the display of an image. Alternatively, a projection device and projection screen may be included, provided display 275 is a projection display.
The audio processor 280 is configured to receive an external audio signal, decompress and decode the received audio signal according to a standard codec protocol of the input signal, and perform audio data processing such as noise reduction, digital-to-analog conversion, and amplification processing to obtain an audio signal that can be played by the speaker 286.
Illustratively, audio processor 280 may support various audio formats. Such as MPEG-2, MPEG-4, Advanced Audio Coding (AAC), high efficiency AAC (HE-AAC), and the like.
The audio output interface 285 is used for receiving the audio signal output by the audio processor 280 under the control of the controller 250. The audio output interface 285 may include a speaker 286, or an external sound output terminal 287 such as an earphone output terminal, for output to the sound-producing device of an external apparatus.
In other exemplary embodiments, video processor 270 may comprise one or more chips. Audio processor 280 may also comprise one or more chips.
And, in other exemplary embodiments, the video processor 270 and the audio processor 280 may be separate chips or may be integrated with the controller 250 in one or more chips.
And a power supply 290 for supplying power supply support to the display apparatus 200 from the power input from the external power source under the control of the controller 250. The power supply 290 may be a built-in power supply circuit installed inside the display apparatus 200 or may be a power supply installed outside the display apparatus 200.
Fig. 2 is a diagram exemplarily illustrating a voice guide opening screen provided by the display apparatus 200.
As shown in fig. 2, the display apparatus may provide a voice guide setting screen on the display for turning the function on or off based on user input. Blind or visually impaired users need to turn on the voice guide function before using the display device, thereby enabling the voice playing function.
Figs. 3A-3B schematically show a GUI 400 provided by the display apparatus 200 by operating the control device 100.
In some embodiments, as shown in fig. 3A, the display device may provide a GUI 400 to the display, the GUI 400 including a plurality of presentation areas providing different image content, each presentation area including one or more different items arranged therein. For example, items 411-416 are disposed in the first presentation area 41, and items 421-426 are disposed in the second presentation area 42. The GUI further comprises a selector 43 indicating that an item is selected; the position of the selector in the GUI, or the positions of the items in the GUI, can be moved by input from the user operating the control means, to change the selection of different items. For example, the selector 43 indicates that item 411 within the first presentation area 41 is selected.
It should be noted that the items refer to visual objects displayed in respective presentation areas of the GUI in the display apparatus 200 to represent corresponding contents such as icons, thumbnails, video clips, links, etc., and the items may provide the user with various conventional program contents received through data broadcasting, and various application and service contents set by a content manufacturer, etc.
The presentation forms of items are often diverse. For example, the items may include text content and/or images for displaying thumbnails related to the text content. As another example, the item may be text and/or an icon of an application.
It should also be noted that the display form of the selector may be the focus object. The movement of the object of focus displayed in the display apparatus 200 may be controlled according to an input of a user through the control device 100 to select or control an item. Such as: the user may select and control items by controlling the movement of the focus object between items through the direction keys on the control device 100. The identification form of the focus object is not limited. For example, the position of the focus object may be realized or identified by setting a background color of the item, or may be identified by changing a border line, a size, a transparency, and an outline of text or an image of the focused item, and/or a font, etc.
In fig. 3A, when the user operates the control means to instruct the selector 43 to select the item 411 and then, for example, presses a direction key on the control means, as shown in fig. 3B the display device responds to the key input instruction by instructing the selector 43 to select the item 412, and plays the voice content corresponding to the item 412. If the user presses the voice playing interpretation key on the control means, the display device responds to the key input instruction by playing the first voice interpretation related to the voice content corresponding to the item 412.
Illustratively, when the item 412 is a movie, the user presses the right direction key on the control means; when the movie is selected by the selector 43, the voice of the movie name is played. After the user presses the voice playing interpretation key on the control means, the synopsis of the movie continues to be played: for example, movie A is a tribute film for the 70th anniversary of the founding of the nation, telling, through a sequence of linked segments on "historic moments, national memory and head-on encounters", the extraordinary stories of ordinary people of different professions, backgrounds and identities against the background of that era……
Illustratively, if the item 412 is an application, the user presses the right direction key on the control means; when the application is selected by the selector 43, the voice of the application name is played. After the user presses the voice playing interpretation key on the control means, the software introduction of the application continues to be played: for example, application B is a large video website with a massive library of high-quality, high-definition online videos, a professional online video playing platform……
In some embodiments, the first voice interpretation related to the voice content corresponding to the item 412 includes voice information prompting the existence of a second voice interpretation. After the first voice interpretation has played, the user presses the voice playing interpretation key on the control means again, and the display device responds to the key input instruction by playing the second voice interpretation related to the voice content corresponding to the item 412.
Illustratively, when the item 412 is an application store, the user presses the right direction key on the control means; when the application store is selected by the selector 43, the voice of the application store name is played. After the user presses the voice playing interpretation key on the control means, the function profile of the application store continues to be played: for example, the application store is an electronic platform dedicated to providing free and paid games and software application download services for mobile phones, tablet computers and the like…… At the end of the interpretation voice, "if you want to know the application software contained in the application store, please press the voice playing interpretation key again within 2 s" is played. The user presses the voice playing interpretation key on the control means again, and the voice of the applications downloadable in the application store is played, such as application A, application B and application C……
If a blind or visually impaired user wants to know whether the application store contains the application software they want, they can find out without clicking into the application store and having all the applications broadcast one by one. The invention can directly play the interpretation voice of the application store, letting the user know whether the application store has the desired application software; the corresponding information is obtained without the user clicking into the corresponding interface and traversing it, which reduces unnecessary user operations and improves the user experience.
In some embodiments, in combination with the method shown in fig. 4A, before step S51, the content-based voice playing method further includes:
Step S501: collecting the broadcast texts of all items.
The broadcast texts include, but are not limited to, application store names, application software names, video (e.g., movie/TV series) names, system setting names, and the like.
Step S502: writing an interpretation text corresponding to the broadcast text of each item.
This makes it convenient for a user who does not understand the voice corresponding to a broadcast text to learn more about the specific content. The invention further writes interpretation texts for the broadcast texts so that the user can understand the specific content and function of each broadcast text. Each broadcast text has at least one corresponding interpretation text.
For example, when the broadcast text is an application store name or an application software name, the interpretation text may be a specific introduction to the functions of the application store or the application software, or the names of the application software contained in the application store. When the broadcast text is a video (e.g., movie/TV series) name, the interpretation text may be a synopsis of the video. When the broadcast text is a system setting name, the interpretation text may be the specific contents inside, such as sound settings, time settings, and the like.
Step S503: storing the broadcast text and the corresponding interpretation text of each item in a configuration file or database.
The invention collects the broadcast texts of all required items, writes the corresponding interpretations of the broadcast texts, and stores the broadcast texts and their interpretations in an associated configuration file or database in one-to-one correspondence.
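A possible shape for such a store, sketched as a typed structure; the field names and sample entries are illustrative assumptions, not taken from the patent:

```typescript
// Illustrative structure for steps S501-S503: each item's broadcast text
// stored together with one or more interpretation texts.

interface VoiceEntry {
  broadcastText: string;     // e.g. an application, movie or setting name
  interpretations: string[]; // first interpretation text, second, ...
}

const voiceConfig: Record<string, VoiceEntry> = {
  appStore: {
    broadcastText: "Application Store",
    interpretations: [
      "A platform providing free and paid application downloads.",
      "Downloadable applications include application A, application B and application C.",
    ],
  },
  movieA: {
    broadcastText: "Movie A",
    interpretations: ["Synopsis of Movie A.", "Introduction of the director and cast."],
  },
};
```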
Referring to the method shown in fig. 4B, a content-based voice playing method includes the following steps S51-S54:
Step S51: receiving a first input instruction input by the user through the control apparatus.
When the voice guide function of the display device is turned on, the position of the selector in the user interface is moved by the control apparatus to select different items.
Step S52: in response to the first input instruction, moving the selector from a first item to a second item, and playing the voice content corresponding to the second item.
In some embodiments, with reference to the method shown in fig. 4C, in step S52, playing the voice content corresponding to the second item specifically includes:
Step S521: searching for the broadcast text of the second item in the configuration file or database;
Step S522: playing the voice content corresponding to the broadcast text.
For example, if the second item is a movie, the movie name corresponding to the movie is found in the configuration file or database, and the voice of the movie name is played.
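Continuing the earlier sketch, steps S521-S522 reduce to a lookup followed by playback, reusing the assumed voiceConfig store and speak() helper:

```typescript
// Sketch of steps S521-S522: find the second item's broadcast text in the
// configuration/database and play the corresponding voice content.

function playBroadcastText(itemId: string, speak: (text: string) => void): void {
  const entry = voiceConfig[itemId]; // S521: search the configuration file or database
  if (entry !== undefined) {
    speak(entry.broadcastText);      // S522: play the corresponding voice content
  }
}
```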
Step S53: receiving a second input instruction input by the user through the control apparatus.
During voice playback on the display device, the user may encounter a broadcast for the second item whose content they do not understand. Without moving the focus, the user presses a certain key on the control apparatus, such as the voice playing interpretation key specially provided for voice interpretation, and the display device broadcasts the interpretation voice of the second item.
Step S54: in response to the second input instruction input by the user, playing the first voice interpretation related to the voice content.
In some embodiments, with reference to the method shown in fig. 4D, in step S54, playing the first voice interpretation related to the voice content specifically includes:
Step S541: searching for the first interpretation text corresponding to the second item in the configuration file or database;
Step S542: playing the voice content corresponding to the first interpretation text.
Illustratively, when the user does not understand the name of a movie (the second item) being broadcast, the user inputs the second input instruction; the first interpretation text corresponding to the movie, here the synopsis of the movie, is found in the configuration file or database, and the voice of the movie's synopsis is played.
The content-based voice playing method provided by the invention can directly broadcast both the voice of the selected item and its first voice interpretation, avoiding the situation where a user who does not understand the broadcast content has no way to learn its specific meaning. The user directly obtains the content they want to know, and the corresponding information can be obtained without clicking into a corresponding interface, which reduces unnecessary user operations and improves the user experience.
In some embodiments, in connection with the method shown in fig. 4E, the content-based voice playing method further includes:
The first voice interpretation includes voice information prompting the existence of a second voice interpretation.
It should be noted that each broadcast text has at least one corresponding interpretation text, that is, the focus broadcast voice has at least one corresponding voice interpretation. When there are two or more voice interpretations, each voice interpretation needs to prompt the user about the existence of the next one, and may give a prompt on how to operate. For example, the end of the first voice interpretation may include: "if you want to know the second voice interpretation of the selected item, please press the voice playing interpretation key again within a preset time". If a voice interpretation does not prompt the user about a second voice interpretation, the selected item has only one voice interpretation.
Step S55: receiving a third input instruction input by the user through the control apparatus.
When the user knows that there is a second voice interpretation and wants to continue learning about the selected item, the user may input a third input instruction. Specifically, the user presses a certain key on the control apparatus without moving the focus, such as the voice playing interpretation key specially provided for voice playback, to play the second voice interpretation.
Step S56: in response to the third input instruction, playing the second voice interpretation related to the voice content.
In some embodiments, referring to the method shown in fig. 4F, in step S56, playing the second voice interpretation related to the voice content specifically includes:
Step S561: searching for the second interpretation text corresponding to the second item in the configuration file or database;
Step S562: playing the voice content corresponding to the second interpretation text.
For example, after learning the synopsis of a movie, the user may input a third input instruction; the second interpretation text corresponding to the movie, here an introduction of the movie's director and cast, is found in the configuration file or database, and the voice of that introduction is broadcast.
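The chain of interpretations across steps S54-S56 can be sketched as a per-item counter that advances one interpretation text per key press; the reset-on-focus-move rule and all names are assumptions, reusing the voiceConfig store above:

```typescript
// Sketch of steps S54/S56: each press of the voice playing interpretation
// key plays the next stored interpretation text for the selected item.

function makeInterpretationKeyHandler(speak: (text: string) => void) {
  let currentItem: string | undefined;
  let level = 0; // 0 -> first interpretation, 1 -> second interpretation, ...
  return (selectedItemId: string): void => {
    if (selectedItemId !== currentItem) {
      currentItem = selectedItemId; // focus moved: restart at the first interpretation
      level = 0;
    }
    const texts = voiceConfig[selectedItemId]?.interpretations ?? [];
    if (level < texts.length) {
      speak(texts[level]); // S541/S561: look up and play the interpretation text
      level += 1;
    }
  };
}
```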
As described in the above embodiments, the position of the selector in the user interface is moved by user input to obtain the voice content of the selected item, and when that voice content is not understood, the display device can respond to a user input instruction by broadcasting the specific interpretation corresponding to the selected item. A user relying on the voice playing function can thus clearly understand the meaning of the played content. This makes the television more convenient and friendly for blind and visually impaired users, and gives full play to the significance of the voice playing function on the television. The method is practical and improves the user experience.
Figs. 5A to 5D are schematic views exemplarily showing a GUI provided by the display apparatus 200 when the control device 100 is operated.
For example, as shown in fig. 5A, the display apparatus may provide a GUI on the display containing a play screen 61, a plurality of items 601 to 604, a time progress bar 605, and a selector 63. In response to a key input instruction generated when the user presses a direction key on the control device, the position of the selector in the GUI moves to prompt the user.
In figs. 5A to 5D, while a play screen of audio-visual content is displayed on the display of the display apparatus, the user may input an instruction by operating the control device. In response, the display apparatus positions the selector on the time progress bar and simultaneously plays the voice content corresponding to the current playing progress indicated by the selector.
It should be noted that when the selector 63 moves to the time progress bar 605, it indicates the current playing progress of the audio-visual content, for example 00:10:05. Because the current playing progress on the time progress bar updates continuously, the display device would otherwise play voice content for every new progress value. In particular, when the progress updates once per second, the voice content corresponding to the current playing progress would be broadcast every second.
For example, suppose the current playing progress is 00:26:05. Reading out 00:26:05 can hardly be completed within one second, and the updated time would be read again one second later, so the broadcast keeps restarting ("twenty-six, twenty-six, twenty-six..."), which makes for a very poor user experience. Yet the user still needs to know the current playing progress; for instance, in a fast-forward/fast-backward scenario the user must hear the current progress on the time progress bar before deciding on the next operation.
In some embodiments, in fig. 5A, when the user operates the control device to instruct the selector 63 to select the item 603, for example by pressing an up direction key on the control device, as shown in fig. 5B the display device responds to the key input instruction by moving the selector 63 from the item 603 to the time progress bar 605 and broadcasting the voice corresponding to the current playing progress, that is, 00:10:05.
Assume the current playing progress on the time progress bar updates once per second. When the progress indicated by the selector updates to 00:10:06, the voice for that progress may be skipped, avoiding the situation where the voice for 00:10:05 has not finished within one second when the voice for 00:10:06 would begin. Similarly, when the progress updates to 00:10:07, its voice may also be skipped. Here it may be specified that the voice for the next progress is broadcast only once 2 seconds have passed since the voice for 00:10:05 finished. Thus, if the progress indicated by the selector updates to 00:10:08 and it is determined that 2 seconds have passed since the voice for 00:10:05 finished, the voice for 00:10:08 may be broadcast.
In this way the display device reduces the frequency of the voice broadcasts for the current playing progress on the time progress bar, avoiding frequent broadcasts and improving the user experience.
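As an illustration only, the following minimal sketch implements this rate-limiting rule. It assumes progress updates arrive once per second, and a print statement stands in for the device's actual text-to-speech engine; all names are illustrative rather than taken from the patent.

```python
import time

MIN_GAP_S = 2.0            # quiet gap required after a broadcast finishes
_last_finish = -MIN_GAP_S  # pretend a broadcast finished long ago

def speak(text: str) -> None:
    """Stand-in for a blocking TTS call; records when speech ended."""
    global _last_finish
    print(f"[TTS] {text}")
    _last_finish = time.monotonic()

def on_progress_update(progress: str) -> None:
    # Broadcast only if the previous broadcast ended at least MIN_GAP_S
    # seconds ago; otherwise skip this one-second update.
    if time.monotonic() - _last_finish >= MIN_GAP_S:
        speak(progress)

# With an instantaneous stand-in voice, 00:10:05 and 00:10:07 are spoken;
# with a real broadcast lasting about 1 s, 00:10:07 would still fall inside
# the gap and 00:10:08 would be the next progress spoken, as in the example.
for p in ("00:10:05", "00:10:06", "00:10:07", "00:10:08"):
    on_progress_update(p)
    time.sleep(1)
```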
In other embodiments, in fig. 5A, when the user operates the control device to instruct the selector 63 to select the item 603, for example by pressing an up direction key on the control device, as shown in fig. 5B the display device responds to the key input instruction by moving the selector 63 from the item 603 to the time progress bar 605 and playing the voice corresponding to the current playing progress, that is, the voice corresponding to the first progress, 00:10:05.
If the user then briefly presses the right direction key on the control device, as shown in fig. 5C, the display device moves the selector 63 on the time progress bar 605 to the second progress in response to the key input instruction. However, because the voice content for the first progress has not finished playing, or has finished but the preset time interval has not yet elapsed, the voice for the second progress indicated by the selector, 00:11:05, is not played.
If the user again briefly presses the right direction key on the control device, as shown in fig. 5D, the display device moves the selector 63 on the time progress bar 605 to the third progress in response to the key input instruction. After the voice content for the first progress finishes playing and the preset time interval elapses, the voice content for the third progress indicated by the selector, that is, 00:12:05, is played.
In the above example, when the user wants to view the content corresponding to 00:12:05, the user briefly presses the right direction key twice in succession, moving the selector from the first progress to the second progress and then to the third progress. When the time interval between the selector reaching the second progress and reaching the third progress is smaller than the preset value, the instruction to play the voice for the second progress is discarded and the voice for the third progress, 00:12:05, is played directly. When that interval is not smaller than the preset value, the voice for the second progress, 00:11:05, is played.
Therefore, when the user fast-forwards or rewinds the current playing progress, only the voice corresponding to the latest progress may be broadcast, which likewise avoids frequent broadcasts and improves the user experience.
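The following minimal sketch illustrates this "latest progress wins" behaviour. The polling `tick` function and all names are illustrative assumptions, not the patent's stated implementation: every selector move overwrites the pending progress, and a pending progress is voiced only once the previous broadcast has finished and the preset gap has elapsed.

```python
import time

PRESET_GAP_S = 1.0           # assumed gap required after the previous voice
_pending = None              # latest progress awaiting broadcast (newest wins)
_last_finish = -PRESET_GAP_S

def speak(progress: str) -> None:
    global _last_finish
    print(f"[TTS] {progress}")
    _last_finish = time.monotonic()

def on_selector_moved(progress: str) -> None:
    """Each move overwrites the pending progress, discarding intermediates."""
    global _pending
    _pending = progress

def tick() -> None:
    """Called periodically by the UI loop; voices the pending progress only
    once the previous broadcast has finished and the gap has elapsed."""
    global _pending
    if _pending and time.monotonic() - _last_finish >= PRESET_GAP_S:
        speak(_pending)
        _pending = None

on_selector_moved("00:11:05")  # second progress
on_selector_moved("00:12:05")  # third progress replaces it at once
tick()                         # only 00:12:05 is voiced
```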
It should be noted that during a voice broadcast, the broadcast volume may be set higher than the volume of the audio-visual content in order to draw the user's attention; when the broadcast ends, the playing volume of the audio-visual content is restored to its level before the broadcast so that the user can continue watching.
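One way to realize this relative-volume behaviour is to duck the content audio while the broadcast plays and restore it afterwards; this is a sketch of that equivalent technique, assuming a hypothetical `set_volume` player API, since the patent does not specify the actual interfaces.

```python
class FakePlayer:
    """Illustrative stand-in for the device's media player."""
    def __init__(self) -> None:
        self.volume = 1.0
    def set_volume(self, v: float) -> None:
        self.volume = v
        print(f"[player] volume -> {v:.2f}")

def broadcast_with_ducking(player, speak, text: str, duck_to: float = 0.3) -> None:
    """Lower the content volume while the voice broadcast plays, then
    restore the volume that was in effect before the broadcast."""
    original = player.volume
    player.set_volume(original * duck_to)  # let the broadcast stand out
    speak(text)                            # assumed to block until finished
    player.set_volume(original)            # resume normal viewing volume

broadcast_with_ducking(FakePlayer(), lambda t: print(f"[TTS] {t}"), "00:10:05")
```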
Fig. 6A-6B illustrate a flow chart of another method for content-based voice playback.
In connection with the method shown in FIG. 6B, the method includes the following steps S71-S73:
step S71: a first input instruction input by a user through a control device is received. For example, the user presses a menu key, a direction key, an OK key, etc. on the control device.
With reference to the method shown in fig. 6A, after step S71, the method further includes:
Step S710: identifying the video playing scene.
Specifically, the playing scene is identified as follows. The player exposes an interface for querying its playing state, which returns one of three states: PLAYING, PAUSED and INVALID. PLAYING indicates the player is currently playing, PAUSED indicates the player is paused, and INVALID indicates the player has not been created. To identify the playing scene, the playing state of the video is queried: if the video is playing, PLAYING is returned; if it is paused, PAUSED is returned; if the player has not been created, INVALID is returned.
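A minimal sketch of this query interface, assuming the player is represented by an object that may not yet exist and exposes an `is_playing` flag (both assumptions made for illustration):

```python
from enum import Enum

class PlayState(Enum):
    PLAYING = "PLAYING"  # video is currently playing
    PAUSED = "PAUSED"    # player exists but is paused
    INVALID = "INVALID"  # player has not been created

def query_play_state(player) -> PlayState:
    """Mirror of the three-state query interface described above."""
    if player is None:
        return PlayState.INVALID
    return PlayState.PLAYING if player.is_playing else PlayState.PAUSED

class Player:
    def __init__(self, is_playing: bool) -> None:
        self.is_playing = is_playing

print(query_play_state(None))           # PlayState.INVALID
print(query_play_state(Player(False)))  # PlayState.PAUSED
print(query_play_state(Player(True)))   # PlayState.PLAYING
```

Steps S711 to S713 below branch on this returned state.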
Step S711: in response to the video playing scene being that the player is paused, and to the first input instruction, positioning the selector on the time progress bar and simultaneously playing the first voice content corresponding to the first progress indicated by the selector.
Illustratively, after the first input instruction is received, it is determined whether a video is currently playing; when the player is paused, the first voice content corresponding to the first progress indicated by the selector is played directly.
When the player is paused, the video produces no sound that could interfere with the voice broadcast, so the first voice content corresponding to the first progress can be played directly.
Step S712: in response to the video playing scene being that the player has not been created, and to the first input instruction, positioning the selector on the time progress bar and simultaneously playing the first voice content corresponding to the first progress indicated by the selector.
Illustratively, after the first input instruction is received, it is determined whether a video is currently playing; when the player has not been created, the first voice content corresponding to the first progress indicated by the selector is played directly.
When the player has not been created, there is no video sound that could interfere with the voice broadcast, so the first voice content corresponding to the first progress can be played directly.
Step S713: in response to the video playing scene being that the player is in a playing state, step S72 is executed.
Step S72: in response to the first input instruction, positioning the selector on the time progress bar and simultaneously playing the first voice content corresponding to the first progress indicated by the selector.
Specifically, when the user presses a menu key, a direction key, an OK key, or the like on the control device, the first input instruction triggers playing of the first voice content corresponding to the first progress indicated by the selector.
Step S73: after the playing of the first voice content ends and a preset time interval elapses, playing the second voice content corresponding to the second progress indicated by the selector.
Specifically, the user is provided with a setting menu for setting the frequency of voice playing, i.e., the preset time interval. The user can set an appropriate voice playing frequency as needed.
Illustratively, suppose the preset time interval set by the user is 2 s. After the first input instruction is received, the first voice content corresponding to the first progress indicated by the selector is played. The display device then judges whether the interval between the time the second input instruction is issued and the time the first voice content finished playing is greater than 2 s. Suppose the first voice content finished playing at 12:00:00. If the second input instruction is received at 12:00:03, the interval is 3 s, which is greater than 2 s; since the interval exceeds the preset time interval, the second voice content corresponding to the second progress indicated by the selector in the second input instruction is played.
If instead the second input instruction is received at 12:00:01, the interval is 1 s, which is less than 2 s; since the interval is smaller than the preset time interval, the second voice content corresponding to the second progress indicated by the selector in the second input instruction is not played.
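The worked example above reduces to a simple interval comparison. The sketch below reproduces it, with `hms` and the variable names introduced purely for illustration:

```python
PRESET_INTERVAL_S = 2.0

def hms(h: int, m: int, s: int) -> int:
    """Clock time expressed in seconds since midnight."""
    return h * 3600 + m * 60 + s

first_voice_finished = hms(12, 0, 0)   # first voice content finished at 12:00:00

for second_input in (hms(12, 0, 3), hms(12, 0, 1)):
    gap = second_input - first_voice_finished
    if gap > PRESET_INTERVAL_S:
        print(f"gap {gap} s > {PRESET_INTERVAL_S} s: play second voice content")
    else:
        print(f"gap {gap} s <= {PRESET_INTERVAL_S} s: skip second voice content")
```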
Fig. 7 illustrates a flow chart of yet another content-based voice playback method.
Referring to the method shown in FIG. 7, the method includes the following steps S81-S83:
step S81: a first input instruction input by a user through a control device is received. For example, the user presses a menu key, a direction key, an OK key, etc. on the control device.
It should be noted that after the first input instruction input by the user is received, the video playing scene needs to be identified; the specific steps are the same as S710 to S712. When the video playing scene is that the player is in a playing state, step S82 is executed.
Step S82: in response to the first input instruction, positioning the selector on the time progress bar and simultaneously playing the first voice content corresponding to the first progress indicated by the selector.
Step S83: receiving a second input instruction input by the user through the control device. For example, the user presses a menu key, a direction key, an OK key, or the like on the control device.
Step S84: in response to the second input instruction, moving the position of the selector on the time progress bar so that the selector changes from indicating the first progress to indicating a second progress.
Step S85: receiving a third input instruction input by the user through the control device. For example, the user presses a menu key, a direction key, an OK key, or the like on the control device.
Step S86: in response to the third input instruction, moving the position of the selector on the time progress bar so that the selector changes from indicating the second progress to indicating a third progress, and simultaneously playing the third voice content corresponding to the third progress indicated by the selector, where the input time interval between the second input instruction and the third input instruction is smaller than a preset value.
Specifically, the user is provided with a setting menu for setting the preset value, which the user can adjust as needed.
Illustratively, suppose the preset value set by the user is 1.5 s. After the first input instruction is received, the first voice content corresponding to the first progress (00:10:05) indicated by the selector is played. A second input instruction input by the user through the control device is received at 12:00:00, moving the position of the selector on the time progress bar so that it changes from indicating the first progress (00:10:05) to indicating the second progress (00:11:05). A third input instruction is then received at 12:00:01, moving the selector so that it changes from indicating the second progress (00:11:05) to indicating the third progress (00:12:05). The input time interval between the second and third input instructions is therefore 1 s, which is less than 1.5 s, i.e., smaller than the preset value, so the third voice content corresponding to the third progress indicated by the selector is played directly.
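The following small simulation reproduces this example. The event list, times, and names are illustrative assumptions; a progress is voiced only when no further input instruction arrives within the preset value:

```python
PRESET_VALUE_S = 1.5

events = [              # (arrival time in seconds, progress indicated)
    (0.0, "00:10:05"),  # first input instruction
    (5.0, "00:11:05"),  # second input instruction (12:00:00 in the example)
    (6.0, "00:12:05"),  # third input instruction (12:00:01 in the example)
]

spoken = []
for i, (t, progress) in enumerate(events):
    next_t = events[i + 1][0] if i + 1 < len(events) else None
    # a progress is voiced unless the next instruction follows within
    # PRESET_VALUE_S seconds, in which case it is discarded
    if next_t is None or next_t - t >= PRESET_VALUE_S:
        spoken.append(progress)

print(spoken)  # ['00:10:05', '00:12:05'] - the second progress is discarded
```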
As described in the above embodiments, a user interface for playing audio-visual content is displayed on the display device, and the position of the selector in the user interface is moved by user input. When responding to a user input instruction, the next voice is broadcast only after the previous voice has finished playing and a set period of time has elapsed, which effectively reduces the interference that frequent broadcasts of timestamps, subtitles, and the like would cause to the normal audio of the content, improving the user experience.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (6)

1. A display device, comprising:
the display is used for displaying a user interface for playing the audio and video contents; the user interface also comprises a time progress bar for indicating the playing progress of the audio and video contents and a selector, and the position of the selector in the user interface can be moved through user input;
the user input interface is used for receiving instructions input by the user;
a controller for performing:
in response to a first input instruction, positioning the selector on the time progress bar and simultaneously playing first voice content corresponding to a first progress indicated by the selector;
in response to a second input instruction, moving the position of the selector on the time progress bar, so that the selector changes from indicating the first progress to indicating a second progress;
and if the interval between the playing end time of the first voice content and the input time of the second input instruction is greater than the preset time interval, playing second voice content corresponding to the second progress indicated by the selector.
2. The display device of claim 1, wherein the audiovisual content is in a play state.
3. The display device according to claim 1, wherein the playing progress on the time progress bar is updated in real time, and the position of the selector on the time progress bar changes as the playing progress is updated in real time.
4. A method for content-based voice broadcast, the method comprising:
displaying a user interface for playing the audio and video content; the user interface also comprises a time progress bar for indicating the playing progress of the audio and video contents and a selector, and the position of the selector in the user interface can be moved through user input;
in response to a first input instruction, positioning the selector on the time progress bar and simultaneously playing first voice content corresponding to a first progress indicated by the selector;
in response to a second input instruction, moving the position of the selector on the time progress bar, so that the selector changes from indicating the first progress to indicating a second progress;
and if the interval between the playing end time of the first voice content and the input time of the second input instruction is greater than the preset time interval, playing second voice content corresponding to the second progress indicated by the selector.
5. The method of claim 4, wherein the audiovisual content is in a play state.
6. The method of claim 4, wherein the playing progress on the time progress bar is updated in real time, and the position of the selector on the time progress bar changes as the playing progress is updated in real time.
CN202010054165.2A 2020-01-17 2020-01-17 Voice playing method based on content and display equipment Active CN113141527B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202010054165.2A CN113141527B (en) 2020-01-17 2020-01-17 Voice playing method based on content and display equipment
PCT/CN2020/091876 WO2021142999A1 (en) 2020-01-17 2020-05-22 Content-based voice broadcasting method and display device
US17/684,562 US20220188069A1 (en) 2020-01-17 2022-03-02 Content-based voice output method and display apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010054165.2A CN113141527B (en) 2020-01-17 2020-01-17 Voice playing method based on content and display equipment

Publications (2)

Publication Number Publication Date
CN113141527A CN113141527A (en) 2021-07-20
CN113141527B true CN113141527B (en) 2022-06-14

Family

ID=76808421

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010054165.2A Active CN113141527B (en) 2020-01-17 2020-01-17 Voice playing method based on content and display equipment

Country Status (1)

Country Link
CN (1) CN113141527B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09258946A (en) * 1996-03-26 1997-10-03 Fujitsu Ltd Information processor
US5809447A (en) * 1995-04-04 1998-09-15 Aisin Aw Co., Ltd. Voice navigation by sequential phrase readout
JP2007087104A (en) * 2005-09-22 2007-04-05 Sony Corp Voice output controller and voice output control method, recording medium and program
CN101246020A (en) * 2008-03-14 2008-08-20 凯立德欣技术(深圳)有限公司 Voice broadcasting device and navigation system using the same and its method
CN102566748A (en) * 2010-09-30 2012-07-11 佳能株式会社 Character input apparatus equipped with auto-complete function, and method of controlling the same
CN102687525A (en) * 2009-12-28 2012-09-19 松下电器产业株式会社 Operation sound guide device and operation sound guide method
CN104093083A (en) * 2014-07-23 2014-10-08 上海天脉聚源文化传媒有限公司 Video intercepting method and device
CN104394137A (en) * 2014-11-18 2015-03-04 小米科技有限责任公司 Voice call reminding method and device
CN104636959A (en) * 2015-03-04 2015-05-20 北京嘀嘀无限科技发展有限公司 Method and equipment for setting order broadcasting time
CN105282119A (en) * 2014-07-21 2016-01-27 腾讯科技(深圳)有限公司 Voice broadcasting method and device
CN109308897A (en) * 2018-08-27 2019-02-05 广东美的制冷设备有限公司 Sound control method, module, household appliance, system and computer storage medium
CN109756616A (en) * 2017-11-02 2019-05-14 腾讯科技(深圳)有限公司 The treating method and apparatus of message, storage medium, electronic device
CN110264760A (en) * 2019-06-21 2019-09-20 腾讯科技(深圳)有限公司 A kind of navigation speech playing method, device and electronic equipment
CN111026354A (en) * 2019-11-29 2020-04-17 珠海优特智厨科技有限公司 Voice broadcast method and device of menu, storage medium and terminal

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8949902B1 (en) * 2001-02-06 2015-02-03 Rovi Guides, Inc. Systems and methods for providing audio-based guidance
US10877718B2 (en) * 2018-05-07 2020-12-29 Spotify Ab Adaptive voice communication

Also Published As

Publication number Publication date
CN113141527A (en) 2021-07-20

Similar Documents

Publication Publication Date Title
CN111314789B (en) Display device and channel positioning method
CN111182345B (en) Display method and display equipment of control
CN111654739A (en) Content display method and display equipment
CN111427643A (en) Display device and display method of operation guide based on display device
CN111414216A (en) Display device and display method of operation guide based on display device
CN111246309A (en) Method for displaying channel list in display device and display device
CN111294633B (en) EPG user interface display method and display equipment
CN111669634A (en) Video file preview method and display equipment
CN111726673A (en) Channel switching method and display device
CN111901653A (en) Configuration method of external sound equipment of display equipment and display equipment
CN111417027A (en) Method for switching small window playing of full-screen playing of webpage video and display equipment
CN111225262A (en) Function setting method of display equipment and display equipment
CN111093106B (en) Display device
CN112040308A (en) HDMI channel switching method and display device
CN112272373A (en) Bluetooth device type switching method and display device
CN112004126A (en) Search result display method and display device
CN111541929A (en) Multimedia data display method and display equipment
CN111857363A (en) Input method interaction method and display equipment
CN111182339A (en) Method for playing media item and display equipment
CN113115092A (en) Display device and detail page display method
CN111050197B (en) Display device
CN113115093B (en) Display device and detail page display method
CN111405329B (en) Display device and control method for EPG user interface display
CN113141527B (en) Voice playing method based on content and display equipment
CN113329246A (en) Display device and shutdown method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20221025

Address after: 83 Intekte Street, Devon, Netherlands

Patentee after: VIDAA (Netherlands) International Holdings Ltd.

Address before: 266061 room 131, 248 Hong Kong East Road, Laoshan District, Qingdao City, Shandong Province

Patentee before: QINGDAO HISENSE MEDIA NETWORKS Ltd.