CN117056622A - Voice control method and display device - Google Patents

Voice control method and display device

Info

Publication number
CN117056622A
CN117056622A (application CN202310856095.6A)
Authority
CN
China
Prior art keywords
display
voice
target search
server
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310856095.6A
Other languages
Chinese (zh)
Inventor
任晓楠
李霞
张大钊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hisense Visual Technology Co Ltd
Original Assignee
Hisense Visual Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hisense Visual Technology Co Ltd filed Critical Hisense Visual Technology Co Ltd
Priority to CN202310856095.6A priority Critical patent/CN117056622A/en
Publication of CN117056622A publication Critical patent/CN117056622A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Abstract

The application provides a voice control method and a display device. The method includes: receiving voice input from an audio receiving element and generating a voice search instruction according to the voice; sending the voice search instruction input by the user to a server, where the voice search instruction carries a target search word and the target search word is used to adjust the display order of labels when there are at least two service types; receiving a first display instruction returned by the server based on the voice search instruction, where the first display instruction is generated according to the display order of the labels; and, in response to the first display instruction, displaying the labels in sequence, in the adjusted display order, in the label display area of the resource display interface. In this way, resource information of the service type most closely associated with the target search word is displayed first, which reduces the likelihood that the display order of the resource information fails to match the user's search intent and makes it easier for the user to find the target resource information.

Description

Voice control method and display device
This application is a divisional application of Chinese patent application 201911008347.X, entitled "Voice control method and display device" and filed on October 22, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
The embodiment of the invention relates to the technical field of voice recognition, in particular to a voice control method and display equipment.
Background
Smart televisions support an ever-growing number of service scenarios, such as video, education, music, applications, and shopping. Because these services frequently overlap and are interrelated, the same target search word may correspond to resource information of multiple service types.
In the prior art, a voice search instruction input by a user is mainly parsed and searched through semantic analysis, to obtain the resource information corresponding to the target search word in the voice search instruction. Because the same target search word may correspond to resource information of multiple service types, when the resource information is displayed to the user, it can be classified by service type into the corresponding tag pages of a preset tag page list. The order of the tag pages in the list is preset, and the tag pages display the resource information of their corresponding service types in that fixed order, no matter what target search word the user's voice search instruction contains.
However, although a target search word can correspond to resource information of multiple service types, different target search words are biased toward different services. When the tag pages display resource information in a fixed order, the display order often fails to match the user's search intent: the user's target resources may be placed in later positions, which greatly increases the time the user needs to find the target resources in the tag page list and degrades the user experience.
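The fixed-order behavior described above can be sketched in a few lines. This is a hypothetical illustration: the service types, the `FIXED_TAG_ORDER` list, and the classification rule are placeholders, not taken from any actual television software.

```python
# Prior-art behavior: tag pages always appear in a preset, fixed order,
# regardless of which target search word the user spoke.
FIXED_TAG_ORDER = ["video", "education", "music", "application", "shopping"]

def classify_results(results):
    """Group resource information into tag pages by service type."""
    pages = {tag: [] for tag in FIXED_TAG_ORDER}
    for item in results:
        if item["service_type"] in pages:
            pages[item["service_type"]].append(item)
    return pages

def display_order(results):
    """Tag pages are shown in the preset order, even when the search
    intent clearly favors a different service type."""
    pages = classify_results(results)
    return [tag for tag in FIXED_TAG_ORDER if pages[tag]]

results = [
    {"title": "Piano Basics", "service_type": "education"},
    {"title": "Moonlight Sonata", "service_type": "music"},
]
print(display_order(results))  # ['education', 'music']
```

Whatever the user searched for, `display_order` always returns the tag pages in the same preset sequence; this is the intent-blind behavior that the application sets out to fix.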
Disclosure of Invention
The embodiment of the invention provides a voice control method and a display device, to solve the prior-art problem that the display order of resource information often fails to match the user's search intent.
A first aspect of an embodiment of the present invention provides a voice control method, including:
receiving voice input from an audio receiving element and generating a voice search instruction according to the voice;
sending the voice search instruction input by the user to a server, where the voice search instruction carries a target search word, the target search word is used to adjust the display order of labels when there are at least two service types, and different target search words correspond to different label display orders;
receiving a first display instruction returned by the server based on the voice search instruction, where the first display instruction is generated according to the display order of the labels; and
in response to the first display instruction, displaying the labels in sequence, in the adjusted display order, in the label display area of the resource display interface.
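The device-side steps of the first aspect can be sketched as follows. The message format, the transport, and the stand-in `EchoOrderServer` are assumptions made for illustration only; the real server-side logic is the subject of the second aspect.

```python
import json

class EchoOrderServer:
    """A stand-in server: returns a label order from a canned rule.
    Hypothetical; it only exists so the client sketch is runnable."""
    def handle(self, request_json):
        query = json.loads(request_json)["query"]
        order = ["music", "video"] if "song" in query else ["video", "music"]
        return json.dumps({"label_order": order})

class DisplayDeviceClient:
    def __init__(self, server):
        self.server = server  # any object with a handle(request) method

    def on_voice_input(self, recognized_text):
        # 1. Generate a voice search instruction from the received voice input.
        instruction = {"type": "voice_search", "query": recognized_text}
        # 2. Send it to the server; the target search word it carries drives
        #    the label ordering on the server side.
        reply = self.server.handle(json.dumps(instruction))
        # 3-4. Respond to the returned first display instruction by showing
        #      the labels, in the adjusted order, in the label display area.
        return json.loads(reply)["label_order"]

client = DisplayDeviceClient(EchoOrderServer())
print(client.on_voice_input("play a song for me"))  # ['music', 'video']
```

A real device would render each label in the returned sequence rather than returning the list.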
A second aspect of an embodiment of the present invention provides a voice control method, including:
receiving a voice search instruction sent by a display device, where the voice search instruction is generated according to voice input by an audio receiving element in the display device and carries a target search word;
acquiring resource information of the service type corresponding to the target search word;
in response to there being at least two service types, adjusting the display order of labels according to the target search word, where each label is used to load resource information of one service type;
generating a first display instruction according to the display order of the labels; and
pushing the first display instruction to the display device, where the first display instruction is used to instruct the display device to display the labels in sequence, in the adjusted display order, in a label display area of a resource display interface.
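The server-side steps above can be sketched as follows. The keyword-overlap relevance scoring and the `SERVICE_KEYWORDS` table are placeholder assumptions: the application does not specify how the association between a target search word and a service type is computed.

```python
# Placeholder association data: which words suggest which service type.
SERVICE_KEYWORDS = {
    "music": {"song", "album", "sonata"},
    "video": {"movie", "episode", "trailer"},
    "education": {"course", "lesson", "tutorial"},
}

def adjust_label_order(target_search_word, service_types):
    """Reorder labels so service types more closely associated with the
    target search word come first; each label loads one service type."""
    def relevance(service):
        words = set(target_search_word.lower().split())
        return len(words & SERVICE_KEYWORDS.get(service, set()))
    # sorted() is stable, so ties keep their original relative order.
    return sorted(service_types, key=relevance, reverse=True)

def build_first_display_instruction(target_search_word, service_types):
    # Adjust the display order only when there are at least two service types.
    if len(service_types) >= 2:
        service_types = adjust_label_order(target_search_word, service_types)
    return {"label_order": service_types}

print(build_first_display_instruction("moonlight sonata",
                                      ["video", "music", "education"]))
# {'label_order': ['music', 'video', 'education']}
```

The instruction built here is what the server pushes back to the display device; the device then renders the labels in the `label_order` given.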
A third aspect of an embodiment of the present invention provides a display device, including:
a display configured to present a user interface, the user interface including a selector indicating that an item is selected, the position of the selector in the user interface being movable by user input so as to select a different item; and
a controller in communication with the display, the controller configured to:
receive voice input from an audio receiving element and generate a voice search instruction according to the voice;
send the voice search instruction input by the user to a server, where the voice search instruction carries a target search word, the target search word is used to adjust the display order of labels when there are at least two service types, and different target search words correspond to different label display orders;
receive a first display instruction returned by the server based on the voice search instruction, where the first display instruction is generated according to the display order of the labels; and
in response to the first display instruction, display the labels in sequence, in the adjusted display order, in the label display area of the resource display interface.
A fourth aspect of an embodiment of the present invention provides a server, including:
a memory and a processor;
the memory is used for storing executable instructions of the processor;
the processor is configured to: receive a voice search instruction sent by a display device, where the voice search instruction is generated according to voice input by an audio receiving element in the display device and carries a target search word;
acquire resource information of the service type corresponding to the target search word;
in response to there being at least two service types, adjust the display order of labels according to the target search word, where each label is used to load resource information of one service type;
generate a first display instruction according to the display order of the labels; and
push the first display instruction to the display device, where the first display instruction is used to instruct the display device to display the labels in sequence, in the adjusted display order, in a label display area of a resource display interface.
A fifth aspect of the present invention provides a storage medium having stored therein a computer program for executing the method of the first aspect.
A sixth aspect of the present invention provides a storage medium having stored therein a computer program for executing the method of the second aspect.
According to the voice control method and the display device provided by the embodiment of the invention, the display device receives voice input from an audio receiving element and generates a voice search instruction according to the voice. The display device then sends the voice search instruction input by the user to the server, where the voice search instruction carries a target search word, the target search word is used to adjust the display order of labels when there are at least two service types, and different target search words correspond to different label display orders. The display device then receives a first display instruction returned by the server based on the voice search instruction and, in response to the first display instruction, displays the labels in sequence, in the adjusted display order, in the label display area of the resource display interface. In this way, the display order of the tag pages can be adjusted according to the target search word, so that resource information of the service type most closely associated with the target search word is displayed first. This reduces the likelihood that the display order of the resource information fails to match the user's search intent, shortens the time the user needs to find the target resources in the tag page list, and improves the user experience.
Drawings
To describe the technical solutions of the application, or of the prior art, more clearly, the drawings needed for the description of the embodiments or of the prior art are briefly introduced below. It is obvious that the drawings described below show only some embodiments of the application, and that a person of ordinary skill in the art can derive other drawings from them without inventive effort.
Fig. 1 is a schematic diagram of an operation scenario between a display device and a control device according to an embodiment of the present application;
fig. 2 is a block diagram of a hardware configuration of a display device 200 according to an embodiment of the present application;
fig. 3 is a block diagram of a hardware configuration of a control device 100 according to an embodiment of the present application;
fig. 4 is a schematic functional configuration diagram of a display device 200 according to an embodiment of the present application;
fig. 5a is a schematic diagram of software configuration in a display device 200 according to an embodiment of the present application;
fig. 5b is a schematic diagram illustrating a configuration of an application program in the display device 200 according to an embodiment of the present application;
fig. 6 is a signaling interaction diagram of a voice control method according to an embodiment of the present application;
FIG. 7a is a schematic diagram of a voice wake interface according to an embodiment of the present application;
FIG. 7b is a diagram illustrating a search results interface according to an embodiment of the present application;
fig. 8 is a schematic diagram of a display principle of a label display area according to an embodiment of the present application;
fig. 9a is an interface schematic diagram of a display device according to an embodiment of the present application;
FIG. 9b is a schematic diagram of an interface of another display device according to an embodiment of the present application;
Fig. 10 is a schematic flow chart of a voice control method according to an embodiment of the present application;
fig. 11 is a signaling interaction diagram of another voice control method according to an embodiment of the present application;
fig. 12 is a signaling interaction diagram of yet another voice control method according to an embodiment of the present application;
fig. 13 is a schematic diagram of a text display principle according to an embodiment of the present application;
FIG. 14 is an interface diagram of a display device according to an embodiment of the present application;
fig. 15 is a schematic structural diagram of a display device according to an embodiment of the present application;
fig. 16 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of exemplary embodiments of the present application more apparent, the technical solutions of exemplary embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the exemplary embodiments of the present application, and it is apparent that the described exemplary embodiments are only some embodiments of the present application, not all embodiments.
All other embodiments, which can be made by a person skilled in the art without inventive effort, based on the exemplary embodiments shown in the present application are intended to fall within the scope of the present application. Furthermore, while the present disclosure has been described in terms of an exemplary embodiment or embodiments, it should be understood that each aspect of the disclosure may be separately implemented as a complete solution.
It should be understood that the terms "first," "second," "third," and the like in the description, in the claims, and in the above-described figures are used to distinguish between similar objects and are not necessarily intended to describe a particular sequential or chronological order. It is to be understood that data so used may be interchanged where appropriate, so that the embodiments of the application described herein can be implemented in orders other than those illustrated or described.
Furthermore, the terms "comprise" and "have," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to those elements expressly listed, but may include other elements not expressly listed or inherent to such product or apparatus.
The term "module" as used in this disclosure refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code that is capable of performing the function associated with that element.
The term "remote control" as used herein refers to a component of an electronic device (such as the display device disclosed herein) that can control the device wirelessly, typically over a relatively short distance. The remote control is typically connected to the electronic device using infrared and/or radio-frequency (RF) signals and/or Bluetooth, and may also include functional modules such as WiFi, wireless USB, Bluetooth, and motion sensors. For example, a hand-held touch remote control replaces most of the physical built-in hard keys of a conventional remote control device with a touch-screen user interface.
The term "gesture" as used herein refers to a user behavior by which a user expresses an intended idea, action, purpose, and/or result through a change in hand shape or movement of a hand, etc.
Fig. 1 is a schematic diagram of an operation scenario between a display device and a control device according to an embodiment of the present application. As shown in fig. 1, a user may operate the display device 200 through the mobile terminal 300 and the control device 100.
The control device 100 may control the display device 200 wirelessly or by other short-range communication means, for example through a remote controller using infrared protocol communication or Bluetooth protocol communication. The user may control the display device 200 by inputting user instructions through keys on a remote control, voice input, control-panel input, and the like. For example, the user can input corresponding control instructions through the volume up/down keys, channel control keys, up/down/left/right movement keys, voice input key, menu key, and power key on the remote controller to control the functions of the display device 200.
In some embodiments, mobile terminals, tablet computers, notebook computers, and other smart devices may also be used to control the display device 200. For example, the display device 200 is controlled using an application running on a smart device. The application program, by configuration, can provide various controls to the user in an intuitive User Interface (UI) on a screen associated with the smart device.
By way of example, the mobile terminal 300 and the display device 200 may each install a software application, so that connection and communication are established between them through a network communication protocol, achieving one-to-one control operation and data communication. For example, a control command protocol can be established between the mobile terminal 300 and the display device 200, the remote-control keyboard can be synchronized to the mobile terminal 300, and the display device 200 can be controlled by controlling a user interface on the mobile terminal 300. The audio and video content displayed on the mobile terminal 300 can also be transmitted to the display device 200 to realize a synchronized display function.
As also shown in fig. 1, the display device 200 is also in data communication with the server 400 via a variety of communication means. The display device 200 may be allowed to make communication connections via a local area network (LAN), a wireless local area network (WLAN), and other networks. The server 400 may provide various contents and interactions to the display device 200. By way of example, the display device 200 receives software program updates, accesses a remotely stored digital media library, or exchanges Electronic Program Guide (EPG) information by sending and receiving data. The server 400 may be one group or multiple groups of servers, and of one or more types. The server 400 also provides other web service content such as video on demand and advertising services.
The display device 200 may be a liquid crystal display, an OLED display, a projection display device. The particular display device type, size, resolution, etc. are not limited, and those skilled in the art will appreciate that the display device 200 may be modified in performance and configuration as desired.
The display device 200 may additionally provide an intelligent network television function of a computer support function in addition to the broadcast receiving television function. Examples include web tv, smart tv, internet Protocol Tv (IPTV), etc.
Fig. 2 is a block diagram of a hardware configuration of a display device 200 according to an embodiment of the present application. As shown in fig. 2, the display device 200 includes a controller 210, a modem 220, a communication interface 230, a detector 240, an input/output interface 250, a video processor 260-1, an audio processor 260-2, a display 280, an audio output 270, a memory 290, a power supply, and an infrared receiver.
The display 280 receives image signals from the video processor 260-1 and displays video content, images, and the components of the menu manipulation interface. The display 280 includes a display screen assembly for presenting pictures and a drive assembly for driving the display of images. The video content displayed may come from broadcast television content or from various broadcast signals receivable via wired or wireless communication protocols; alternatively, various image contents sent from a network server over a network communication protocol may be displayed.
Meanwhile, the display 280 simultaneously displays a user manipulation UI interface generated in the display device 200 and used to control the display device 200.
The drive assembly used depends on the type of display 280. For example, if the display 280 is a projection display, a projection device and a projection screen may also be included.
The communication interface 230 is a component for communicating with an external device or an external server according to various communication protocol types. For example: the communication interface 230 may be a Wifi chip 231, a bluetooth communication protocol chip 232, a wired ethernet communication protocol chip 233, or other network communication protocol chips or near field communication protocol chips, and an infrared receiver (not shown in the figure).
The display device 200 may establish control signal and data signal transmission and reception with an external control device or a content providing device through the communication interface 230. And an infrared receiver, which is an interface for receiving an infrared control signal of the control device 100 (e.g., an infrared remote controller, etc.).
The detector 240 is a component the display device 200 uses to collect signals from the external environment or from interaction with the outside. The detector 240 includes a light receiver 242, a sensor that collects ambient-light intensity, so that display parameters can adapt to the ambient light.
The image collector 241, such as a camera or video camera, can collect the external environment scene, collect attributes of the user, or capture gestures the user makes, so that display parameters can adapt accordingly and user gestures can be recognized to enable interaction with the user.
In other exemplary embodiments, the detector 240 may also be a temperature sensor or the like; for example, by sensing ambient temperature, the display device 200 may adaptively adjust the display color temperature of the image, shifting toward a cooler color temperature when the ambient temperature is high and toward a warmer one when it is low.
In other exemplary embodiments, the detector 240 may also be a sound collector such as a microphone, used to receive the user's sound, including the voice signal of a control instruction with which the user controls the display device 200, or to collect ambient sound to identify the type of the ambient scene, so that the display device 200 can adapt to ambient noise.
The input/output interface 250 is used for data transmission, under the control of the controller 210, between the display device 200 and other external devices, for example receiving video signals, audio signals, or command instructions from an external device.
The input/output interface 250 may include, but is not limited to, the following: any one or more of a high definition multimedia interface HDMI interface 251, an analog or data high definition component input interface 253, a composite video input interface 252, a USB input interface 254, an RGB port (not shown in the figures), etc. may be used.
In other exemplary embodiments, the input/output interface 250 may also form a composite input/output interface from the plurality of interfaces described above.
The modem 220 receives broadcast television signals via wired or wireless reception, performs modulation and demodulation processing such as amplification, mixing, and resonance, and demodulates, from among a plurality of wireless or wired broadcast television signals, the television audio/video signals carried on the television channel frequency selected by the user, as well as EPG data signals.
The modem 220 responds, under the control of the controller 210 and according to the user's selection, to the television-signal frequency selected by the user and the television signal carried on that frequency.
The modem 220 can receive signals in various ways depending on the broadcasting system of the television signal, such as terrestrial broadcast, cable broadcast, satellite broadcast, or internet broadcast; depending on the modulation type, digital or analog modulation may be used; and depending on the type of television signal received, both analog and digital signals may be processed.
In other exemplary embodiments, the modem 220 may also be in an external device, such as an external set-top box, or the like. Thus, the set-top box outputs television audio and video signals after modulation and demodulation, and inputs the television audio and video signals to the display device 200 through the input/output interface 250.
The video processor 260-1 is configured to receive an external video signal and perform video processing such as decompression, decoding, scaling, noise reduction, frame-rate conversion, resolution conversion, and image composition according to the standard codec protocol of the input signal, to obtain a signal that can be displayed or played directly on the display device 200.
The video processor 260-1, by way of example, includes a demultiplexing module, a video decoding module, an image compositing module, a frame rate conversion module, a display formatting module, and the like.
The demultiplexing module is used for demultiplexing the input audio/video data stream, such as the input MPEG-2, and demultiplexes the input audio/video data stream into video signals, audio signals and the like.
And the video decoding module is used for processing the demultiplexed video signals, including decoding, scaling and the like.
The image synthesis module, such as an image synthesizer, superimposes and mixes the GUI signal, input by the user or generated by a graphic generator, with the scaled video image, to generate an image signal for display.
The frame rate conversion module converts the frame rate of the input video, for example converting a 60 Hz frame rate into a 120 Hz or 240 Hz frame rate, commonly by means of frame interpolation.
The display formatting module converts the signal received from the frame rate conversion module into a video output signal conforming to the display format, for example an RGB data signal.
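The module chain described above can be sketched schematically. All function bodies here are illustrative stand-ins (a real video processor performs these stages in dedicated hardware or firmware); only the order of the stages reflects the description.

```python
# Schematic sketch of the video processor 260-1 module chain:
# demultiplex -> decode/scale -> GUI composition -> frame-rate
# conversion -> display formatting.
def demultiplex(stream):
    # Split the input A/V data stream (e.g. MPEG-2) into video and audio.
    return stream["video"], stream["audio"]

def decode_and_scale(video):
    # Video decoding module: decoding, scaling, etc.
    return {"frames": video, "scaled": True}

def compose_with_gui(video, gui_layer):
    # Image synthesis module: superimpose the GUI signal on the scaled video.
    video["overlay"] = gui_layer
    return video

def convert_frame_rate(video, target_hz=120):
    # Frame rate conversion module: e.g. 60 Hz -> 120 Hz, by interpolation.
    video["rate_hz"] = target_hz
    return video

def format_for_display(video):
    # Display formatting module: emit a signal in the display format, e.g. RGB.
    video["format"] = "RGB"
    return video

stream = {"video": "mpeg2-video-es", "audio": "mpeg2-audio-es"}
video, audio = demultiplex(stream)
out = format_for_display(convert_frame_rate(
    compose_with_gui(decode_and_scale(video), gui_layer="menu")))
print(out["rate_hz"], out["format"])  # 120 RGB
```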
The audio processor 260-2 is configured to receive an external audio signal, decompress and decode the external audio signal according to a standard codec protocol of an input signal, and perform noise reduction, digital-to-analog conversion, amplification processing, and the like, to obtain a sound signal that can be played in a speaker.
In other exemplary embodiments, video processor 260-1 may include one or more chip components. The audio processor 260-2 may also include one or more chips.
And, in other exemplary embodiments, the video processor 260-1 and the audio processor 260-2 may be separate chips or integrated together in one or more chips with the controller 210.
The audio output 270 receives the sound signal output by the audio processor 260-2 under the control of the controller 210. It includes the speaker 272 carried by the display device 200 itself, as well as an external sound output terminal 274, such as an external sound interface or an earphone interface, that can output to a sound-generating device of an external device.
The power supply provides power to the display device 200, under the control of the controller 210, from power input from an external power source. The power supply may include a built-in power circuit installed inside the display device 200, or may be an external power source connected to the display device 200 through a power interface.
A user input interface for receiving an input signal of a user and then transmitting the received user input signal to the controller 210. The user input signal may be a remote control signal received through an infrared receiver, and various user control signals may be received through a network communication module.
By way of example, a user inputs a user command through the remote controller or the mobile terminal 300, the user input interface responds to the user input through the controller 210, and the display device 200 responds to the user input.
In some embodiments, a user may input a user command through a Graphical User Interface (GUI) displayed on the display 280, and the user input interface receives the user input command through the Graphical User Interface (GUI). Alternatively, the user may input the user command by inputting a specific sound or gesture, and the user input interface recognizes the sound or gesture through the sensor to receive the user input command.
The controller 210 controls the operation of the display device 200 and responds to the user's operations through various software control programs stored on the memory 290.
As shown in fig. 2, the controller 210 includes RAM213 and ROM214, and a graphics processor 216, CPU processor 212, communication interface 218, such as: first interface 218-1 through nth interfaces 218-n, and a communication bus. The RAM213 and the ROM214 are connected to the graphics processor 216, the CPU processor 212, and the communication interface 218 via buses.
A ROM214 for storing instructions for various system starts. When the display device 200 starts upon receiving a power-on signal, the CPU processor 212 executes the system start instructions in the ROM and copies the operating system stored in the memory 290 into the RAM213, so that the boot operating system begins running. After the operating system is started, the CPU processor 212 copies the various applications in the memory 290 into the RAM213 and then starts running them.
A graphics processor 216 for generating various graphical objects, such as: icons, operation menus, graphics displayed for user input instructions, and the like. It includes an operator, which performs operations on the various interaction instructions input by the user and displays various objects according to their display attributes, and a renderer, which generates the various objects based on the operator's results and renders the result for display on the display 280.
CPU processor 212 is operative to execute operating system and application program instructions stored in memory 290, and to execute various applications, data, and content according to the various interactive instructions received from the outside, so as to finally display and play various audio and video content.
In some exemplary embodiments, the CPU processor 212 may include multiple processors. The multiple processors may include one main processor and one or more sub-processors: a main processor for performing some operations of the display apparatus 200 in the pre-power-up mode and/or for displaying pictures in the normal mode, and one or more sub-processors for operations in a standby mode and the like.
The controller 210 may control the overall operation of the display apparatus 200. For example: in response to receiving a user command to select a UI object to be displayed on the display 280, the controller 210 may perform an operation related to the object selected by the user command.
Wherein the object may be any one of the selectable objects, such as a hyperlink or an icon. Operations related to the selected object include, for example: displaying the operation of connecting to a hyperlinked page, document, or image, or running the program corresponding to an icon. The user command for selecting the UI object may be a command input through various input means (e.g., a mouse, keyboard, touch pad, etc.) connected to the display device 200, or a voice command corresponding to a voice uttered by the user.
Memory 290 includes storage for various software modules for driving display device 200. Such as: various software modules stored in memory 290, including: a basic module, a detection module, a communication module, a display control module, a browser module, various service modules and the like.
The base module is a bottom-layer software module for communicating signals between the various hardware components in the display device 200 and sending processed signals and control signals to the upper-layer modules. The detection module is used for collecting various information from various sensors or the user input interface, and performing digital-to-analog conversion and analysis management.
For example: the voice recognition module comprises a voice analysis module and a voice instruction database module. The display control module is used for controlling the display 280 to display image content, and can be used for playing multimedia image content, UI interfaces, and other information. The communication module is used for control and data communication with external devices. The browser module is used for performing data communication with browsing servers. The service modules are used for providing various services and various applications.
Meanwhile, the memory 290 also stores received external data and user data, images of various items in various user interfaces, visual effect maps of focus objects, and the like.
Fig. 3 is a block diagram of a configuration of a control device 100 according to an embodiment of the present application. As shown in fig. 3, the control device 100 includes a controller 110, a communication interface 130, a user input/output interface 140, a memory 190, and a power supply 180.
The control device 100 is configured to control the display device 200: it may receive a user's input operation instruction and convert the operation instruction into an instruction that the display device 200 can recognize and respond to, acting as an intermediary for interaction between the user and the display device 200. Such as: the user operates the channel up/down keys on the control apparatus 100, and the display apparatus 200 responds to the channel up/down operation.
In some embodiments, the control device 100 may be a smart device. Such as: the control apparatus 100 may install various applications for controlling the display apparatus 200 according to user's needs.
In some embodiments, as shown in fig. 1, a mobile terminal 300 or other intelligent electronic device may perform a function similar to the control device 100 after installing an application that manipulates the display device 200. Such as: the user may implement the functions of the physical keys of the control device 100 through the various function keys or virtual buttons of the graphical user interface available on the mobile terminal 300 or other intelligent electronic device.
The controller 110 includes a processor 112, RAM113 and ROM114, a communication interface 130, and a communication bus. The controller 110 is used to control the running and operation of the control device 100, the communication collaboration among the internal components, and the external and internal data processing functions.
The communication interface 130 enables communication of control signals and data signals with the display device 200 under the control of the controller 110. Such as: the received user input signal is transmitted to the display device 200. The communication interface 130 may include at least one of other near field communication modules such as a WiFi chip 131, a bluetooth module 132, an NFC module 133, and the like.
A user input/output interface 140, wherein the input interface includes at least one of a microphone 141, a touchpad 142, a sensor 143, keys 144, and other input interfaces. Such as: the user can implement a user instruction input function through actions such as voice, touch, gesture, press, and the like, and the input interface converts a received analog signal into a digital signal and converts the digital signal into a corresponding instruction signal, and sends the corresponding instruction signal to the display device 200.
The output interface includes an interface that transmits the received user instruction to the display device 200. In some embodiments, an infrared interface may be used, as well as a radio frequency interface. Such as: when the infrared signal interface is used, the user input instruction needs to be converted into an infrared control signal according to an infrared control protocol and sent to the display device 200 through the infrared sending module. As another example: when the radio frequency signal interface is used, the user input instruction is converted into a digital signal, modulated according to a radio frequency control signal modulation protocol, and then transmitted to the display device 200 through the radio frequency transmission terminal.
In some embodiments, the control device 100 includes at least one of a communication interface 130 and an output interface. The control device 100 is provided with a communication interface 130 such as: the WiFi, bluetooth, NFC, etc. modules may send the user input instruction to the display device 200 through a WiFi protocol, or a bluetooth protocol, or an NFC protocol code.
A memory 190 for storing various operating programs, data and applications for driving and controlling the display device 200 under the control of the controller 110. The memory 190 may store various control signal instructions input by a user.
A power supply 180 for providing operating power support for the various elements of the control device 100 under the control of the controller 110. May be a battery and associated control circuitry.
Fig. 4 is a schematic functional configuration diagram of a display device 200 according to an embodiment of the present application. As shown in fig. 4, the memory 290 is used to store an operating system, application programs, contents, user data, and the like, and performs system operations for driving the display device 200 and various operations in response to a user under the control of the controller 210. Memory 290 may include volatile and/or nonvolatile memory.
The memory 290 is specifically used for storing an operation program for driving the controller 210 in the display device 200, and storing various application programs built in the display device 200, various application programs downloaded by a user from an external device, various graphical user interfaces related to the application, various objects related to the graphical user interfaces, user data information, and various internal data supporting the application. The memory 290 is used to store system software such as OS kernel, middleware and applications, and to store input video data and audio data, and other user data.
Memory 290 is specifically used to store drivers and related data for the audio and video processors 260-1 and 260-2, the display 280, the communication interface 230, the modem 220, the detector 240, the input/output interface, and the like.
In some embodiments, memory 290 may store software and/or programs, the software programs used to represent an Operating System (OS) including, for example: a kernel, middleware, an Application Programming Interface (API), and/or an application program. For example, the kernel may control or manage system resources, or functions implemented by other programs (e.g., middleware, APIs, or application programs), and the kernel may provide interfaces to allow the middleware and APIs, or applications to access the controller to implement control or management of system resources.
By way of example, memory 290 includes a broadcast receiving module 2901, a channel control module 2902, a volume control module 2903, an image control module 2904, a display control module 2905, an audio control module 2906, an external instruction recognition module 2907, a communication control module 2908, a light receiving module 2909, a power control module 2910, an operating system 2911, and other applications 2912, a browser module, and the like, wherein the external instruction recognition module 2907 includes a graphics recognition module 2907-1, a voice recognition module 2907-2, a key instruction recognition module 2907-3, and the like. The controller 210 executes various software programs in the memory 290 such as: broadcast television signal receiving and demodulating functions, television channel selection control functions, volume selection control functions, image control functions, display control functions, audio control functions, external instruction recognition functions, communication control functions, optical signal receiving functions, power control functions, software control platforms supporting various functions, browser functions and other applications.
Fig. 5a is a block diagram illustrating a configuration of a software system in a display device 200 according to an embodiment of the present application.
As shown in FIG. 5a, the operating system 2911 includes operating software for handling various basic system services and performing hardware-related tasks, and acts as a medium for the data processing completed between applications and hardware components. In some embodiments, portions of the operating system kernel may contain a series of software to manage display device hardware resources and to serve other programs or software code.
In other embodiments, portions of the operating system kernel may contain one or more device drivers, which may be a set of software code in the operating system that helps operate or control the devices or hardware associated with the display device. The driver may contain code to operate video, audio and/or other multimedia components. Examples include a display screen, camera, flash, WiFi, and audio drivers.
Wherein, accessibility module 2911-1 is configured to modify or access an application program to realize accessibility of the application program and operability of display content thereof.
The communication module 2911-2 is used for connecting with other peripheral devices via related communication interfaces and communication networks.
User interface module 2911-3 is configured to provide an object for displaying a user interface, so that the user interface can be accessed by each application program, and user operability can be achieved.
Control applications 2911-4 are used for controllable process management, including runtime applications, and the like.
The event transmission system 2914 may be implemented within the operating system 2911 or within the application 2912. In some embodiments it is implemented in the operating system 2911 and, at the same time, in the application 2912. It listens for various user input events and, in response to the recognition results of various types of events or sub-events, executes one or more sets of predefined operation handlers based on the various event references.
The event monitoring module 2914-1 is configured to monitor a user input interface to input an event or a sub-event.
The event recognition module 2914-2 is configured to define the various events input through the various user input interfaces, recognize the various events or sub-events, and transmit them to the processes that execute their corresponding one or more sets of handlers.
The event or sub-event refers to an input detected by one or more sensors in the display device 200, or an input from an external control device (e.g., the control device 100). Such as: various sub-events input through voice, gesture input through gesture recognition, sub-events of remote control key instruction input from a control device, and the like. By way of example, one or more sub-events in the remote control may take a variety of forms, including but not limited to one or a combination of pressing the up/down/left/right keys, the OK key, and long key presses, as well as operations of non-physical keys, such as move, hold, and release.
Interface layout manager 2913 directly or indirectly receives user input events or sub-events from event delivery system 2914 for updating the layout of the user interface, including but not limited to the location of controls or sub-controls in the interface, and various execution operations associated with the interface layout, such as the size or location of the container, the hierarchy, etc.
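The listening-and-dispatch flow of the event transmission system described above can be sketched as follows. This is an illustrative Python sketch under assumed names, not the patent's implementation: the class, event types, and handler signatures are all hypothetical.

```python
# Illustrative sketch of an event transmission system: a monitoring module
# receives raw input events, and a recognition module dispatches each event
# to the set of handlers predefined for its type. All names are hypothetical.
from collections import defaultdict


class EventTransmissionSystem:
    def __init__(self):
        # event type -> list of predefined operation handlers
        self._handlers = defaultdict(list)

    def register(self, event_type, handler):
        """Predefine an operation handler for an event or sub-event type."""
        self._handlers[event_type].append(handler)

    def dispatch(self, event_type, payload):
        """Recognize the event type and run all handlers registered for it."""
        return [handler(payload) for handler in self._handlers.get(event_type, [])]


system = EventTransmissionSystem()
system.register("remote_key", lambda key: f"key:{key}")   # remote key sub-events
system.register("voice", lambda text: f"voice:{text}")    # voice input sub-events
```

A consumer such as an interface layout manager could then receive events from `dispatch` to update the user interface layout.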
Fig. 5b is a schematic diagram illustrating a configuration of an application program in the display device 200 according to an embodiment of the present application. As shown in fig. 5b, application 2912 includes various applications that may also be executed on display device 200. Applications may include, but are not limited to, one or more applications such as: live television applications, video on demand applications, media center applications, application centers, gaming applications, etc.
Live television applications can provide live television through different signal sources. For example, a live television application may provide television signals using inputs from cable television, radio broadcast, satellite services, or other types of live television services. And, the live television application may display video of the live television signal on the display device 200.
Video on demand applications may provide video from different storage sources. Unlike live television applications, video-on-demand provides video displays from some storage sources. For example, video-on-demand may come from the server side of cloud storage, from a local hard disk storage containing stored video programs.
The media center application may provide various applications for playing multimedia content. For example, a media center may be a different service than live television or video on demand, and a user may access various images or audio through a media center application.
An application center may be provided to store various applications. The application may be a game, an application, or some other application associated with a computer system or other device but which may be run in a smart television. The application center may obtain these applications from different sources, store them in local storage, and then be run on the display device 200.
During the use of each program, the user inevitably needs to search for resources. In some embodiments the user's search instruction may be entered via an audio receiving element (e.g., a microphone) in the user input interface 140, and in some embodiments via keys 144 in the user input interface 140, such as remote control keys. The following embodiments take speech input through a microphone as an example.
Further, the item may represent an interface or an interface set display in which the display device 200 is connected to an external device, or may represent an external device name or the like connected to the display device. Such as: a signal source input interface set, an HDMI interface, a USB interface, a PC terminal interface, and the like.
Taking voice input as an example, current smart televisions support more and more service scenarios, such as video, education, music, application, and shopping services. Because there is considerable cross-over and relevance among services, the same target search word may correspond to resource information of multiple service types. In the prior art, a voice search instruction input by a user is mainly parsed and searched through semantic analysis to obtain the resource information corresponding to the target search word in the voice search instruction. Since the same target search word may correspond to resource information of multiple service types, when the resource information is displayed to the user it can be classified into the corresponding tag pages in a preset tag page list according to service type. The order of the tag pages in the tag page list is preset, and the tag pages display the resource information of the corresponding service types in that fixed order no matter what target search words the user's voice search instruction contains. However, although a target search word can correspond to resource information of multiple service types, different target search words lean toward different services. Because the tag pages display resource information in a fixed order, the display order often does not match the user's search intention, and the target resources are arranged in later positions. This greatly increases the time the user needs to find the target search resources in the tag page list and degrades the user experience.
In order to solve the above-mentioned problems, an embodiment of the present application provides a voice control method to reduce the possibility that the display order of the resource information does not conform to the search intention of the user.
The technical scheme of the embodiment of the application is described in detail below by specific embodiments. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.
Fig. 6 is a signaling interaction diagram of a voice control method according to an embodiment of the present application. The present embodiment relates to a process of determining resource information according to a voice search instruction. The embodiment of the application takes a display device and a server as examples to describe the method of the embodiment. As shown in fig. 6, the method includes:
step S101, the display equipment receives voice input by a user and generates a voice search instruction according to the voice.
Wherein the display device receives a voice input of the user through an audio receiving element, wherein the audio receiving element may be a microphone.
In some embodiments, the user inputs speech through the sound collector, the display device generates a speech search instruction from the input speech, and in some embodiments, the display device converts the input speech into text data and then sends the text data to the speech server for parsing through the communication interface 230.
Fig. 7a is a schematic diagram of a voice wake-up interface according to an embodiment of the present application. As shown in fig. 7a, in some embodiments, after a user presses a voice key of a remote controller, the remote controller sends a first key value and/or a first bluetooth command to the display device, and the television brings up a first voice interaction interface according to the received first key value and/or first bluetooth command, where the first voice interaction interface may be overlaid on the previous interface as a floating layer.
Fig. 7b is a schematic diagram of a search result interface provided in an embodiment of the present application. After the user inputs voice, the search interface presented in response to the voice is shown in fig. 7b, which schematically illustrates the user interface presented by the interface layout manager 2913 in the display device 200 when displaying a search result, according to an exemplary embodiment. As shown in fig. 7b, the user interface includes a plurality of view display areas, for example a search term presentation area, a tag presentation area, and a resource display area arranged from top to bottom, each view display area including a layout of one or more different items. The user interface also includes a selector indicating that an item is selected; the position of the selector can be moved by user input to change the selection of different items.
The plurality of view display areas may be visible or invisible. Such as: different view display areas may be marked by different background colors, and visible marks such as boundary lines, or invisible boundaries, may also be provided. There may also be no visible or invisible border at all; instead, only the related items within a certain area are displayed on the screen with the same changing properties of size and/or arrangement, and that area is regarded as being bounded as the same view partition.
In some embodiments, the search term display area is used to display the search instruction input by the user; it may also display the search instruction input by the user (or content including it) on the left side of the search term display area and display recommended search terms on the right side, or vice versa. The recommended search terms are obtained by the server according to the user's search instruction.
In some embodiments, in the interactive interface, the tag display area is located below the search term display area (if any), or above the resource display area, and is used to display the tags in the tag order determined by the server. The tag with the greatest weight is displayed on the left side of the area, and all tags are arranged from left to right in descending order of weight. When the tags exceed one row, the next row shows tags with smaller weights than the first row.
In some embodiments, the resource display area is located at the bottom and contains matrix-distributed slots, where the slots load the resources of the service corresponding to the selected tag. When tags are switched, the original resources are released, and the resources of the service corresponding to the newly selected tag are loaded again.
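The release-and-reload behavior of the resource display area on a tag switch can be sketched as follows. This is a minimal illustrative sketch; the class name, the tag names, and the resource items are assumptions, not from the patent.

```python
# Illustrative sketch: when the selected tag switches, the previously loaded
# resources are released and the slots are reloaded with the service data of
# the newly selected tag. All names and data here are hypothetical.
class ResourceDisplayArea:
    def __init__(self, results_by_tag):
        self._results = results_by_tag   # tag -> list of resource items
        self.loaded = []                 # resources currently in the slots

    def select_tag(self, tag):
        self.loaded.clear()                              # release original resources
        self.loaded.extend(self._results.get(tag, []))   # reload for the new tag


area = ResourceDisplayArea({"movie": ["m1", "m2"], "shopping": ["s1"]})
area.select_tag("movie")      # slots now hold the movie-service resources
area.select_tag("shopping")   # movie resources released, shopping loaded
```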
Step S102, the display device sends a voice search instruction to the server.
The voice search instruction carries target search words.
In this embodiment, the display device and the server both have a communication function and can interact with each other. The display device may record the voice input by the user, generate a voice search instruction, and transmit the voice search instruction to the server through the communication interface. For example, the display device may accept voice entry through a cell phone connected to it; or through a remote controller connected to it; or through a recording component of its own.
In some embodiments, for entered speech, the display device may generate a speech search instruction using a local database, or may first generate text locally from the speech and send the text to a speech server to generate the speech instruction via the speech server.
The language used by the voice search instruction input by the user is not limited, and can be Chinese, english, french and the like by way of example.
The target search term, which may also be understood as a search keyword, is a keyword term in a voice search instruction, and may be a noun, for example: "yoga", "cancrina", etc., may also be place names, such as: "Beijing", "Moscow", etc. may also be the name of the audiovisual work. The voice search instruction may include one target search word or may include a plurality of target search words, and the number of target search words is not limited in the embodiment of the present application.
Step S103, the server acquires the resource information of the service type corresponding to the target search word.
In this step, the server may access a resource library in which resource information of different service types is stored, and each target search word corresponds to resource information of at least one service type. For example, "yoga" corresponds to resource information of the shopping service and of the movie service, and "apple" corresponds to resource information of the shopping, movie, and music services. The service types corresponding to the target search words can be stored in advance in the server's memory or in a storage device; alternatively, the probability of the services corresponding to a target search word can be output by a voice model, where the voice model is generated in advance by training on a corpus of many target search words and the service types corresponding to those words. The storage device is connected to the server and may be arranged inside or outside the server.
In some embodiments, the server may store in advance a correspondence between target search words and the service types of resources, and may determine the resource information of the service type corresponding to the target search word according to that correspondence. In some embodiments, for uncertain user input, the service type corresponding to the target search word may be determined through a deep learning model mapping target search words to the service types of resources.
Because of the considerable cross-over and relevance between services, the same target search word may correspond to different service types. "Apple" may correspond to resource information of the movie, music, and shopping types; "yoga" may correspond to resource information of the movie type or of the shopping type. It should be noted that the number of resource information items acquired for each service type may be one or more; the embodiment of the present application does not limit the number of resource information items.
In some embodiments, determining the service type corresponding to the target search word according to the target search word and searching in the resource library according to the target search word may be two parallel threads, all media resources in the resource library include media resources of a plurality of service types, and the media resources corresponding to the target search word may correspond to only one service type or may correspond to two or more service types.
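The pre-stored correspondence between target search words and service types described above can be sketched as a simple lookup table. This is a minimal sketch using only the examples given in the text; the table structure, function name, and fallback behavior are assumptions (a real system might fall back to the trained classification model instead of an empty result).

```python
# Hypothetical correspondence table between target search words and the
# service types of resources, populated with the examples from the text.
SERVICE_TYPES = {
    "yoga": ["movie", "shopping"],
    "apple": ["movie", "music", "shopping"],
}


def service_types_for(term):
    """Return the service types pre-stored for a target search word.

    For an unknown word this sketch returns an empty list; the embodiments
    suggest a deep learning model could handle such uncertain input instead.
    """
    return SERVICE_TYPES.get(term, [])
```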
Step S104, in response to there being no fewer than two service types, the server adjusts the display order of the tags according to the target search word, where each tag is used to load resource information of one service type.
In this step, resources of different service types may be placed under different service tag types or each resource may be provided with a tag of a service type. The server acquires resource information corresponding to the target search word from the resources of the business type. If the target search word corresponds to at least two service types, the server also needs to adjust the display sequence of the labels according to the target search word.
The tags can be divided according to service type; for example, they can be divided into music tags, shopping tags, and novel tags, and the resource information of the service types corresponding to the target search word obtained by the server can be mapped into the corresponding tags according to the corresponding service types. For example: movie-type resource information may be mapped into a movie tag, music-type resource information into a music tag, shopping-type resource information into a shopping tag, and novel-type resource information into a novel tag.
In some embodiments, the server may adjust the display order of the labels according to the weight of the target search term for at least two service types.
In some embodiments, the mapping relationship may be preset, and the server adjusts the display sequence of the labels according to the preset mapping relationship.
For example, the target keyword "yoga" corresponds to resource information of two service types, the movie type and the shopping type. The weight of the movie type is preset to 5 and the weight of the shopping type to 3; accordingly, because the weight of the movie type is greater than the weight of the shopping type, the movie tag can be arranged in front of the shopping tag.
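The weight-based ordering in the yoga example above can be sketched in a few lines. The weight table and tag names here are illustrative assumptions (only the movie=5 and shopping=3 values come from the text).

```python
# Preset per-service weights; movie=5 and shopping=3 match the example in
# the text, the other entries are hypothetical.
TAG_WEIGHTS = {"movie": 5, "shopping": 3, "music": 2, "novel": 1}


def order_tags(service_types):
    """Return the tags sorted left-to-right by descending weight."""
    return sorted(service_types, key=lambda t: TAG_WEIGHTS.get(t, 0), reverse=True)


order_tags(["shopping", "movie"])  # movie outweighs shopping, so it comes first
```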
In other embodiments, the server may also determine the weight of the service type corresponding to the target search term through a dependency relationship between the non-target keyword and the target keyword in the voice search instruction.
Step S105, the server generates a first display instruction according to the display sequence of the labels.
In this step, after the server adjusts the display sequence of the labels, a first display instruction may be generated, where the first display instruction is sent to the display device to instruct the display device to display the labels, and the first display instruction includes the display sequence of the labels.
In some embodiments, the server may further obtain an address corresponding to the resource information of the service type corresponding to the tag, and generate the first display instruction according to the tag after the display order is adjusted and the address corresponding to the resource information of the service type corresponding to the tag.
Step S106, the server pushes a first display instruction to the display device, where the first display instruction is used to instruct the label display area of the display device in the resource display interface to display the labels in sequence according to the adjusted display sequence.
In step S106, after the server obtains the resource information of the service type corresponding to the target search term and adjusts the display sequence of the tag page, a first display instruction may be pushed to the display device, so that the display device sequentially displays the tags according to the adjusted display sequence through the tag page area.
Fig. 8 is a schematic diagram of a display principle of a tag display area provided in an embodiment of the present application. In some embodiments, as shown in fig. 8, the data obtained by the server in response to the target search term includes TAB data and search result data. The TAB data includes the tags to be returned and the order of the tags; the search result data includes the service data corresponding to all tags in the TAB data and the mapping relations between the data and the tags. The server may package the search result data and the TAB data into JavaScript Object Notation (JSON) format and send them to the display device. After receiving the JSON data, the display device parses the TAB data and the search result data, displays the tags in the tag display area in the order determined by the server, and loads the service data corresponding to the default-focused tag (including, for example, poster information) into the slots of the resource display area, so as to display different resources under different types of services.
In some embodiments, the search result data contains address information corresponding to each resource, and the slots can be filled by loading the address information so as to display the poster of the resource, where the poster includes a display picture and a title of the resource.
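A minimal sketch of the JSON exchange described above; the field names and example URLs are assumptions, since the actual TAB/search-result schema is not specified in the text:

```python
import json

# Server side: package TAB data and search result data into one JSON payload.
tab_data = {"tabs": ["film", "shopping"]}  # tags in the server-determined order
search_result_data = {
    "film": [{"title": "Kung Fu Yoga", "poster": "http://example.com/p1.jpg"}],
    "shopping": [{"title": "yoga mat", "poster": "http://example.com/p2.jpg"}],
}
payload = json.dumps({"tab": tab_data, "results": search_result_data})

# Display-device side: parse the JSON, take the first tag as the default
# selection, and collect the poster addresses to load into the slots.
parsed = json.loads(payload)
default_tab = parsed["tab"]["tabs"][0]
posters = [item["poster"] for item in parsed["results"][default_tab]]
print(default_tab, posters)
```

Because the order lives in the TAB data rather than in the result map, the device can reorder tabs without re-fetching any resource data.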
Step S107, the display device displays the labels in the label display area in the resource display interface according to the adjusted display sequence.
The resource presentation interface may include a tag display area and a resource display area. The tag at the top of the order is set as the default selected tag.
In an embodiment, the tag display area displayed by the display device may include only the tags of the service types for which the voice search obtained resources; in another embodiment, the tag page list displayed by the display device may include the tags of all service types, where a tag page for which resource information was acquired displays the corresponding number of resources, and a tag page for which no resource information was acquired displays zero resources.
For example, taking the target search term "yoga" as an example, "yoga" corresponds to resource information of the film and television type and also to resource information of the shopping type. Correspondingly, the server respectively acquires the film-and-television-type resource information and the shopping-type resource information corresponding to "yoga", maps the shopping-type resource information into the shopping tag and the film-and-television-type resource information into the film-and-television tag, and arranges the film-and-television tag before the shopping tag according to the weights. Then, the server may send a first display instruction to the display device; the display device displays the resource information corresponding to "yoga" through the tag display area, with the film-and-television tag in the tag page list arranged before the shopping tag.
Fig. 9a is an interface schematic diagram of a display device according to an embodiment of the present application. Fig. 9b is an interface schematic diagram of another display device according to an embodiment of the present application. As shown in fig. 9a and 9b, taking the target search words as "yoga" and "yoga teaching" as an example, resource information related to the target search words "yoga" and "yoga teaching" is displayed on an interface of the display device through a tag display area, and the tags in the tag display area may be, for example: film, education, shopping platform 1, application, shopping platform 2, etc., through the user clicking different labels, the display device can display the resource information corresponding to the labels in the resource display area.
In some embodiments, a tag may use the same field as its service type, or a different field. For example, when the target search term is "yoga", the corresponding media assets in the media asset library include multiple entries: "Kung Fu Yoga" belonging to the movie service, "yoga tutoring" and "follow me yoga" belonging to the education service, "yoga suit" and "yoga mat" belonging to the gathering shopping service, the APP resources "daily yoga" and "yoga entrance" belonging to the application service, and "yoga suit" and "yoga mat" belonging to the panning shopping service.
In some embodiments, the first display instruction includes the tag data, the search result data of the media assets, and the order of the tags, where the tag data is determined by the server based on the service positioning of the target search term. For example, when the target search term is positioned in the video service, or when the probability of the target search term belonging to the video service is the largest, the tag corresponding to the video service is arranged at the first position. The other positions may be arranged randomly, or by service probability (the smaller the probability, the further back), or by the user's historical search habits (the less frequently used, the further back); for example, the service type most frequently used by the user may be arranged at the second position without affecting the first-position tag determined by service positioning. In some embodiments, the display sequence corresponding to "yoga teaching" is: application, education, store, movie, and treasury. The tags corresponding to different target search words may be the same or different, and different target search words may correspond to different tag orders.
In some alternative embodiments, if the first display instruction includes an address corresponding to the resource information of the service type corresponding to the tag, the display device displays, in a resource display area in the resource display interface, the resource information of the service type corresponding to the selected tag according to the address and the selected tag.
Correspondingly, the display device may also receive an instruction input by the user and play or display the corresponding resource according to the identifier in the instruction. Taking a smart TV as an example, resource information is displayed on the smart TV through the tag display area; the user can switch between different tags through a remote controller, and the resource display area loads the resource information corresponding to the selected tag. If there is a resource to be played or displayed, the user can input an instruction, such as a down key value, to the smart TV through the remote controller, controlling the focus to move from the tag display area to a slot of the resource display area. If the smart TV then receives a confirmation instruction input by the user, it can obtain the resource corresponding to the slot at the focus from the server according to the identifier in the instruction, and play or display that resource.
According to the voice control method provided by the embodiments of the present application, the display device receives voice input from the audio receiving element and generates a voice search instruction according to the voice. The display device then sends the voice search instruction input by the user to the server, where the voice search instruction carries a target search word; the target search word is used to adjust the display sequence of the labels when there are no fewer than two service types, and different target search words correspond to different display sequences of the labels. The display device then receives a first display instruction returned by the server based on the voice search instruction and, in response to the first display instruction, sequentially displays the labels in the label display area of the resource display interface according to the adjusted display sequence. In this way, the display sequence of the tag pages can be adjusted according to the target search word, so that the resource information of the service type with a higher degree of association with the target search word is displayed preferentially; this reduces the possibility that the display sequence of the resource information does not match the user's search intention, shortens the time for the user to find the target resource in the tag page list, and improves the user experience.
On the basis of the above-described embodiments, a description will be given below of how the server adjusts the display order of the tabbed pages. Fig. 10 is a flowchart of a voice control method according to an embodiment of the present application. This embodiment relates to a specific procedure of how the server adjusts the display order of the tabbed pages. The method of the embodiment of the application is described by taking the server as an execution main body. As shown in fig. 10, on the basis of the above embodiment, the method includes:
Step S201, receiving a voice search instruction sent by the display device, where the voice search instruction carries a target search word.
Step S202, obtaining the resource information of the service type corresponding to the target search word.
The technical terms, effects, features, and alternative embodiments of steps S201-S202 may be understood with reference to steps S102-S103 shown in fig. 6, and will not be described in detail herein for repeated matters.
Step S203, in response to there being at least two service types, acquiring the weight of each service type corresponding to the target search word.
In some embodiments, the server may obtain the weights of the target search word corresponding to each service type according to the target search word and a preset mapping relationship between the search word and the weights of the service types.
In this embodiment, a mapping relationship between a preset search term and a weight of a service type may be stored in advance in the server, and when performing voice search, the weight of each service type corresponding to the target search term may be found from the pre-stored mapping relationship.
For example, if the voice search instruction includes the target search word "computer", the mapping relationship between the search word "computer" and the weights of the movie type, the education type and the shopping type is stored in the server in advance. Based on the above, the server can directly obtain the weight 1 of the movie type, the weight 2 of the education type and the weight 3 of the shopping type corresponding to the target search word "computer" from the database.
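The pre-stored lookup in this example might be sketched as follows; the table structure is an assumption, with contents mirroring the "computer" example above:

```python
# Hypothetical pre-stored mapping: search word -> weight per service type.
WEIGHT_TABLE = {
    "computer": {"movie": 1, "education": 2, "shopping": 3},
}

def lookup_weights(target_search_word):
    """Fetch the per-service-type weights for a target search word."""
    return WEIGHT_TABLE.get(target_search_word, {})

print(lookup_weights("computer"))  # shopping (3) outweighs education (2) and movie (1)
```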
In this embodiment, the mapping relationship between the target search word and the weights of the service types may be used as attribute information of the target search word. In some embodiments, for an uncertain user input, the weight of each service type corresponding to the target search word may be determined through a target-search-word-to-service-type deep learning model, and when the target search word corresponds to a plurality of service types, the corresponding service types may be ranked by weight.
In some embodiments, the voice search instruction includes at least one non-target search word in addition to the target search word, and the server obtains a weight of the target search word corresponding to each service type based on a dependency relationship between the target search word and the at least one non-target search word.
In this embodiment, determining the weights of the service types from the target search word alone may produce a deviation, so the non-target search words can be used to assist in determining the weights of the service types.
The above non-target search words may be words in the voice search instruction other than the target search word, and a non-target search word can generally assist in locating the service type of the target search word. Non-target search words may be verbs such as "watch", "buy", "learn", etc., and may also be nouns such as "director", "concert", "handbag", etc.
The above-mentioned dependency relationship may be set in advance. In an alternative embodiment, attributive (modifier) rules may be configured so that a non-target search word forms an attributive relationship with the target search word, and this attributive relationship is used as the dependency relationship between the target search word and the non-target search word.
Illustratively, the voice search instruction is "cheap mobile phone", where the target search word is "mobile phone" and the non-target search word is "cheap", and the shopping service weight is determined to be greater than the video service weight according to the dependency between "mobile phone" and "cheap". Accordingly, the server may set the weight of the shopping service to 2 and the weight of the movie service to 1. In addition, the non-target search word "cheap" may also assist the target keyword in searching for relevant resource information under the corresponding service type.
In some embodiments, the target search word refers to a word whose match with the title of a resource exceeds a preset threshold. Non-target search words refer to the nouns, verbs, or adjectives in the voice instruction other than the target search word.
In another alternative embodiment, verb rules may be configured so that a non-target search word forms a verb-object relationship with the target search word, and this verb-object relationship is used as the dependency between the target search word and the non-target search word.
Illustratively, the voice search instruction is "listen to apples", wherein the target search word is "apples", the non-target search word is "listen", and the music service weight can be determined to be greater than the video service weight according to the dependence between "apples" and "listen". Accordingly, the server may set the weight of the music service to 2 and the weight of the movie service to 1.
In this embodiment, the voice search instruction may include a plurality of non-target search words. Accordingly, the weight of each service type may be determined by considering the dependency relationships between the plurality of non-target search words and the target search word.
Illustratively, the voice search instruction is "I want to watch Zhong Hanliang's Sheng Xiao Mo". First, it is determined that the target search word "Sheng Xiao Mo" corresponds to the video service. Then, the dependency relationship of each of the non-target search words "I want", "watch", and "Zhong Hanliang" on the target search word is determined; weights a1, a2, and a3 of the video service are determined according to each dependency relationship, and the three weights are summed to obtain the total weight a of the video service. Correspondingly, it is determined that the target search word "Sheng Xiao Mo" also corresponds to the music service; weights b1, b2, and b3 of the music service are determined according to the dependency relationship of each non-target search word on the target search word, and the three weights are summed to obtain the total weight b of the music service.
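The per-word weight summation above (a = a1 + a2 + a3, b = b1 + b2 + b3) can be sketched as follows; the dependency-weight values and the word list are illustrative assumptions:

```python
# Hypothetical per-word dependency weights toward each service type.
DEPENDENCY_WEIGHTS = {
    "I want": {"video": 1, "music": 1},
    "watch": {"video": 2, "music": 0},
    "Zhong Hanliang": {"video": 2, "music": 1},
}

def total_weight(non_target_words, service_type):
    """Sum each non-target word's contribution toward one service type."""
    return sum(DEPENDENCY_WEIGHTS.get(w, {}).get(service_type, 0)
               for w in non_target_words)

words = ["I want", "watch", "Zhong Hanliang"]
a = total_weight(words, "video")  # a1 + a2 + a3
b = total_weight(words, "music")  # b1 + b2 + b3
print(a, b)
```

With these assumed values the video service accumulates the larger total, so its tag would be ranked ahead of the music tag.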
In some embodiments, the correspondence between a non-target search word and the service types may be used as attribute information of the non-target search word. In some embodiments, for an uncertain user input, the weight of the service type corresponding to the non-target search word may be determined through a non-target-search-word-to-service-type deep learning model, and when the non-target search word corresponds to a plurality of service types, the corresponding service types may be ranked by weight.
In some embodiments, the non-target search word and the corresponding click data may also be counted in a big data manner to determine the weight of the service type corresponding to the non-target search word.
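One way to derive such weights from click statistics might look like the following sketch; the click log and the normalization scheme are assumptions:

```python
from collections import Counter

# Hypothetical click log: (non_target_word, service_type_clicked) pairs.
CLICK_LOG = [
    ("buy", "shopping"), ("buy", "shopping"), ("buy", "video"),
    ("watch", "video"), ("watch", "video"),
]

def click_weights(non_target_word):
    """Normalize per-service-type click counts into weights in [0, 1]."""
    counts = Counter(st for w, st in CLICK_LOG if w == non_target_word)
    total = sum(counts.values())
    return {st: c / total for st, c in counts.items()}

print(click_weights("buy"))
```

In a production system the log would come from aggregated user behavior data rather than a literal list, but the counting step is the same.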
Step S204, according to the weight of each service type corresponding to the target search word, the display sequence of the labels is adjusted.
In this step, when the server determines that the target search term corresponds to the weight of each service type, the labels may be ordered according to the size of the weight of the corresponding service type, so as to adjust the display sequence of the label pages according to the ordering result.
In some embodiments, the server may locate the target service type according to the weights of the target search terms corresponding to the service types; wherein the weight of the target service type is the largest; and according to the target service type, adjusting the display sequence of the tag page so that the tag corresponding to the target service type is positioned at the first position of the tag page list.
For example, if the music type corresponding to "apple" has a weight of 3, the video type a weight of 2, and the shopping type a weight of 1, then, since the weight of the music type is highest, the music type can be positioned as the target type and its tag arranged at the first position of the tag list.
The present embodiment does not limit the arrangement order of the orders other than the first order of the tag display area. In some embodiments, after ranking the tags of the target type first, the subsequent tags may also be ranked from large to small in weight. In some embodiments, after ranking the tags of the target type first, the subsequent tags may be randomly arranged. In some embodiments, after the labels of the target types are ranked first, the labels corresponding to the service types to be recommended may also be used as the second position.
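The first-position rule plus the optional recommended-second-position rule from the paragraph above might be sketched as; the function and its arguments are illustrative assumptions:

```python
def order_tags(weights, recommended=None):
    """Place the largest-weight (target) tag first; optionally promote a
    to-be-recommended service type to the second position."""
    ordered = sorted(weights, key=lambda t: weights[t], reverse=True)
    if recommended in ordered and ordered.index(recommended) > 0:
        # Promote the recommended tag without disturbing the first position.
        ordered.remove(recommended)
        ordered.insert(1, recommended)
    return ordered

print(order_tags({"music": 3, "video": 2, "shopping": 1}, recommended="shopping"))
```

Without a `recommended` argument this reduces to the plain weight-descending ordering, one of the other options the paragraph allows.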
For example, in fig. 9a and 9b, in the tag display area of fig. 9a, the "movie" tag is located at the first position, followed in sequence by tags such as "education", "focused communication", "application" and "panned shopping". As shown in fig. 9a, the default focus after the search is located on the first-position tag, so the resource display area loads the resource information corresponding to the "movie" tag into each slot.
Step S205, a first display instruction is generated according to the display sequence of the labels.
Step S206, pushing a first display instruction to the display device, wherein the first display instruction is used for indicating the label display area of the display device in the resource display interface to display labels in sequence according to the adjusted display sequence.
The technical terms, effects, features, and alternative embodiments of steps S205-S206 may be understood with reference to steps S105-S106 shown in fig. 6, and will not be described again here for repeated matters.
According to the voice control method provided by the embodiment of the application, the server acquires the weight of each service type corresponding to the target search word; and adjusting the display sequence of the labels according to the weight of the target search word corresponding to each service type. The labels of the target service types with the largest weight are arranged on the home page, so that the resource information of the service types with higher association degree with the target search word can be displayed preferentially, the possibility that the display sequence of the resource information does not accord with the search intention of the user is reduced, the time for the user to find the target search resource from the label display area is shortened, and the user experience is improved.
If the target search word in the voice search instruction received by the server corresponds to the resource information of one service type, the display device can directly display the resource information of the service type corresponding to the target search word. Fig. 11 is a signaling interaction diagram of another voice control method according to an embodiment of the present application. The embodiment relates to a specific process of how a server obtains resource information corresponding to a target search word. The embodiment of the application takes a server as an example, and the method of the embodiment of the application is described. As shown in fig. 11, on the basis of the above embodiment, the method includes:
Step S301, the display device receives the voice input from the audio receiving element, and generates a voice search instruction according to the voice.
Step S302, the display device sends a voice search instruction to the server.
Step S303, the server acquires the resource information of the service type corresponding to the target search word.
The technical terms, effects, features, and alternative embodiments of steps S301-S303 can be understood with reference to steps S101-S103 shown in fig. 6, and will not be described again here for repeated matters.
Step S304, in response to there being one service type, the server pushes a second display instruction to the display device, where the second display instruction is used to instruct the display device to display the label in the label display area of the resource display interface, and to display the resource information of the service type corresponding to the label in the resource display area of the resource display interface according to the address and the label.
In this step, if the target search word in the voice search instruction corresponds to a service type, the second display instruction can be directly pushed to the display device without adjusting the display sequence of the tag page, so that the display device displays the resource information of the service type corresponding to the target search word.
For example, if the target search term in the voice search instruction is "yoga", and the attribute information of the target search term "yoga" only corresponds to the video service, the server may directly push the second display instruction to the display device, so that the display device displays the resource of the video service related to "yoga".
In some embodiments, if the target search word corresponds to only one service type, after the display device parses the received JSON data, the TAB data only includes one tag, so that only one tag is displayed in the tag display area, and since the focus defaults on the tag, the space in the resource display area loads resource information corresponding to the tag.
In some embodiments, if the target search word corresponds to only one service type and only one resource information exists in the service type, the display device may also directly display or play the resource corresponding to the resource information.
According to the voice control method provided by the embodiment of the application, the server responds to the fact that the service types are one, and pushes the second display instruction to the display device, wherein the second display instruction is used for indicating the display device to display the resource information of the service type corresponding to the target search word through the label corresponding to the service type, so that when the target search word corresponds to only one service type, the display device can directly display the resource information of the service type corresponding to the target search word to the user.
After receiving the voice search instruction sent by the display device, the server can process the voice search instruction so as to obtain the target search word. Fig. 12 is a signaling interaction diagram of yet another voice control method according to an embodiment of the present application. The embodiment relates to a specific process of how to accurately acquire target search words. The method of the embodiment of the application is described by taking the display device, the voice server and the data server as examples. As shown in fig. 12, on the basis of the above embodiment, the method includes:
step S401, the display device receives the voice input from the audio receiving element and transmits the voice to the voice server.
Step S402, the voice server generates a text corresponding to the voice according to the voice.
In steps S401 and S402, the display device may acquire voice input by the user through a microphone on the remote controller or a microphone on the display device body. The natural language input by the user acquired by the display device is then sent to a voice server, which converts the voice into corresponding text.
It should be noted that, the embodiment of the present application does not limit how to convert the voice into the corresponding text, and may be any one of the existing conversion methods.
Step S403, the voice server pushes a third display instruction to the display device, wherein the third display instruction is used for indicating the display device to display the text corresponding to the voice.
Fig. 13 is a schematic diagram of a text display principle provided by an embodiment of the present application, and fig. 14 is an interface schematic diagram of another display device provided by an embodiment of the present application. For example, as shown in fig. 13, after receiving the third display instruction sent by the voice server, the display device may create a layout file of the text corresponding to the voice search instruction, then load the layout file and initialize text control in the layout file, and finally display the text corresponding to the voice search instruction. For example, as shown in fig. 14, if the voice search instruction is "yoga", after the voice server obtains the text corresponding to the voice search instruction, the voice server may push a third display instruction to the display device, and after receiving the third display instruction, the display device may display the text "yoga" corresponding to the voice search instruction in the label display area on the interface.
In some embodiments, the voice server and the data server may be the same server; in this case, the third display instruction is sent to the display device together with the first display instruction after the search is completed, and the data server may obtain the text directly from the voice server to perform the search.
The embodiment of the present application does not limit on which page the display device displays the text corresponding to the voice search instruction: the text may be displayed on the voice search page, or on the tag display area page after the search is completed.
By displaying the text corresponding to the voice search instruction on the display device, the user can judge whether the voice recognition is accurate or not. When the voice recognition is inaccurate, the display device may re-transmit the voice search instruction to the server after receiving the re-recognition instruction input by the user.
Step S404, the display device generates a voice search instruction according to the text and sends the voice search instruction to the data server.
In some embodiments, the data server and the voice server may be different servers, and the user's voice needs to be parsed at the voice server, and then text is returned to the display device and then sent to the data server by the display device.
In some embodiments, the voice search instruction further includes information such as an ID of the display device, so that the data server can accurately feed back the search result to the display device.
Step S405, the data server performs word segmentation processing on the text to obtain target search words.
In this step, after the data server obtains the text corresponding to the voice search instruction, the text may be subjected to word segmentation processing, so as to obtain the target search word.
It should be noted that, the word segmentation method is not limited in the embodiment of the present application, and in an alternative implementation, a forward maximum matching method may be selected.
The voice search instruction input by the user is "i want to see yoga", and after the data server receives the voice search instruction and converts the voice search instruction into text, the "i want to see yoga" can be decomposed into "i", "want", "see", "yoga" by using a maximum matching method. Then, from a target search word list pre-stored in the data server, the target search word corresponding to the voice search instruction can be determined to be yoga.
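A word-level sketch of the forward maximum matching method mentioned above; real Chinese segmentation operates on characters, so the dictionary and token granularity here are illustrative assumptions:

```python
# Hypothetical segmentation dictionary; the longest entry spans two tokens.
DICTIONARY = {"i", "want", "to", "see", "yoga", "yoga teaching"}
MAX_SPAN = 2

def forward_max_match(tokens):
    """Greedily match the longest dictionary entry starting at each position."""
    result, i = [], 0
    while i < len(tokens):
        for span in range(min(MAX_SPAN, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + span])
            if candidate in DICTIONARY or span == 1:
                result.append(candidate)
                i += span
                break
    return result

print(forward_max_match("i want to see yoga teaching".split()))
```

Because matching is greedy from the left, "yoga teaching" is kept as one unit instead of being split, which is exactly why the method suits extracting multi-word target search terms.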
Step S406, the data server acquires the resource information of the service type corresponding to the target search word.
In step S407, the data server responds to at least two service types, adjusts the display sequence of the labels according to the target search word, and each label is used for displaying the resource information of one service type.
Step S408, the data server generates a first display instruction according to the display sequence of the labels.
Step S409, the data server pushes a first display instruction to the display device, where the first display instruction is used to instruct the display device to display the labels in the label display area in the resource display interface in sequence according to the adjusted display sequence.
Step S410, the display device displays the labels in the label display area in the resource display interface according to the adjusted display sequence.
The technical terms, effects, features, and alternative embodiments of steps S406 to S410 may be understood with reference to steps S103 to S107 shown in fig. 6, and are not repeated here.
According to the voice control method provided by the embodiment of the application, the text corresponding to the voice search instruction is obtained and segmented into words, so that the target search word and its attribute information are obtained, and the resource information set is determined accordingly. By improving the accuracy of the target search word, the likelihood that the search results do not match the user's intent is reduced.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be completed by hardware related to program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media capable of storing program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
Fig. 15 is a schematic structural diagram of a display device according to an embodiment of the present application. The display device may be implemented by software, hardware, or a combination of both to perform the above voice control method. As shown in fig. 15, the display device includes:
a display 51 configured to present a user interface including a selector indicating that an item is selected, the position of the selector in the user interface being movable by user input to cause a different item to be selected;
a controller 52 in communication with the display, the controller configured to:
receive voice input from the audio receiving element and generate a voice search instruction according to the voice;
send the voice search instruction input by the user to a server, wherein the voice search instruction carries a target search word; when there are at least two service types, the target search word is used to adjust the display order of the labels, and different target search words correspond to different label display orders;
receiving a first display instruction returned by the server based on the voice search instruction, wherein the first display instruction is generated according to the display sequence of the labels;
and responding to the first display instruction, and sequentially displaying the labels in the label display area in the resource display interface according to the adjusted display sequence.
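The controller-side behaviour above can be sketched as follows. The `send_to_server` and `render` callbacks are hypothetical stand-ins for the server round trip and the display, not APIs from the embodiment.

```python
def handle_voice_input(voice, send_to_server, render):
    """Forward the voice search instruction, then display the labels in the
    adjusted order carried by the first display instruction."""
    instruction = send_to_server(voice)        # voice search -> first display instruction
    labels = instruction["labels"]             # adjusted display order
    selected = labels[0] if labels else None   # first-ranked label selected by default
    render(labels, selected)
    return selected

def fake_server(_voice):                       # hypothetical server stub
    return {"labels": ["fitness", "video", "music"]}

rendered = []
selected = handle_voice_input("i want to see yoga", fake_server,
                              lambda labels, sel: rendered.append((labels, sel)))
print(selected)  # → fitness
```

Because the server has already sorted the labels, the device does no ranking of its own; it simply renders the list and parks the selector on the first entry.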
In an alternative embodiment, the first display instruction includes an address corresponding to the resource information of the service type corresponding to the label;
the controller 52 is specifically configured to: display, according to the address and the selected label, the resource information of the service type corresponding to the selected label in the resource display area of the resource display interface.
In an alternative embodiment, if there is one service type, the controller 52 is further configured to:
receive a second display instruction pushed by the server; in response to the second display instruction, display the label in the label display area of the resource display interface, and display, according to the address and the label, the resource information of the service type corresponding to the label in the resource display area of the resource display interface.
In an alternative embodiment, controller 52 is specifically configured to:
send the voice to a voice server;
receive the text returned by the voice server, wherein the text is generated by the voice server according to the voice;
and generate a voice search instruction according to the text.
In an alternative embodiment, the controller 52 is further configured to:
receive a third display instruction pushed by the voice server; in response to the third display instruction, the display device displays the text corresponding to the voice in a search word display area of the resource display interface, wherein the search word display area, the label display area, and the resource display area are arranged in order from top to bottom.
In an alternative embodiment, the first-ranked label is set as the default selected label.
The display device provided by the embodiment of the application can execute the actions of the display device in the embodiment of the method, and the implementation principle and the technical effect are similar, and are not repeated here.
Fig. 16 is a schematic structural diagram of a server according to an embodiment of the present application. As shown in fig. 16, the server may include: at least one processor 61 and a memory 62. Fig. 16 takes a server with one processor as an example.
The memory 62 is used for storing a program. Specifically, the program may include program code, and the program code includes computer operation instructions.
The memory 62 may include high-speed RAM, and may further include a non-volatile memory, such as at least one magnetic disk memory.
The processor 61 is configured to execute computer-executable instructions stored in the memory 62 to implement the above-described server-side voice control method.
The processor 61 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
Optionally, in a specific implementation, if the communication interface, the memory 62, and the processor 61 are implemented independently, they may be connected to each other through a bus and communicate with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on; although drawn as a single line, this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the communication interface, the memory 62, and the processor 61 are integrated on one chip, they may communicate through an internal interface.
The present invention also provides a computer-readable storage medium, which may include a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code. Specifically, the computer-readable storage medium stores program instructions for the method on the first terminal side or the method on the second terminal side.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents; such modifications and replacements do not depart from the spirit of the present invention.

Claims (9)

1. A display device, comprising:
a display configured to present a user interface including a selector indicating that an item is selected, the selector being configured to receive user input to move its position in the user interface so as to select a different item;
a communicator configured to communicate data and information with a server;
a controller configured to:
in response to a voice input of a user, send voice text data generated by processing the voice input to the server, wherein the voice text data carries a target search word, so that the server acquires corresponding media resource data and tag data according to the target search word and sorts the tag data according to the target search word; wherein the tag data corresponds to service types, and each piece of media resource data corresponds to one piece of tag data;
receive a first display instruction fed back by the server, wherein the first display instruction comprises the media resource data, the tag data, and sorting information of the tags;
control the display to present a resource display interface, wherein the tags are presented according to the sorting information of the tags, the selector stays on the first-ranked tag by default, and the media resource data corresponding to the first-ranked tag is presented by default.
2. The display device of claim 1, wherein, when executing the sending of the voice text data generated by processing to the server in response to the voice input of the user, the controller is further configured to:
present the voice text data generated by processing on the display in response to the voice input of the user.
3. The display device of claim 1, wherein, when executing the controlling of the display to present the resource display interface, in which the tags are presented according to the sorting information of each tag, the selector stays on the first-ranked tag by default, and the media resource data corresponding to the first-ranked tag is presented by default, the controller is further configured to:
display the tags in a tag display area of the resource display interface according to the sorting of the tags, wherein the resource display interface comprises the tag display area and a resource display area;
display, according to the tag selected by the selector, the resource information of the service type corresponding to the selected tag in the resource display area of the resource display interface;
wherein the resource display area comprises slots distributed in a matrix, and the slots are used for loading, according to the selected tag, the media resources of the service corresponding to the selected tag;
and when the tag is switched, the slots are further used for releasing the original resources and reloading, according to the newly selected tag, the resources of the service corresponding to the newly selected tag.
4. The display device of claim 1, wherein the server comprises a voice server and a data server, and wherein, when executing the sending of the voice text data generated by processing to the server in response to the voice input of the user, the controller is further configured to:
send the voice to the voice server;
receive the voice text data returned by the voice server, wherein the voice text data is generated by the voice server according to the voice;
and send the voice text data to the data server.
5. A voice control method, applied to a display device comprising a display, a communicator, and a controller, wherein the display is configured to present a user interface comprising a selector indicating that an item is selected, the selector being configured to receive user input to move its position in the user interface so as to select a different item, and the communicator is configured to communicate data and information with a server;
the method comprises the following steps:
in response to a voice input of a user, sending voice text data generated by processing the voice input to the server, wherein the voice text data carries a target search word, so that the server acquires corresponding media resource data and tag data according to the target search word and sorts the tag data according to the target search word; wherein the tag data corresponds to service types, and each piece of media resource data corresponds to one piece of tag data;
receiving a first display instruction fed back by the server, wherein the first display instruction comprises the media resource data, the tag data, and sorting information of the tags;
controlling the display to present a resource display interface, wherein the tags are presented according to the sorting information of the tags, the selector stays on the first-ranked tag by default, and the media resource data corresponding to the first-ranked tag is presented by default.
6. A voice control method, applied to a server communicatively connected with a display device, the method comprising:
receiving voice text data sent by the display device, and performing semantic analysis on the voice text data;
when the voice text data carries a target search word, acquiring corresponding media resource data and tag data according to the target search word, and sorting the tag data according to the target search word; wherein the tag data corresponds to service types, and each piece of media resource data corresponds to one piece of tag data;
generating a first display instruction and sending the first display instruction to the display device, wherein the first display instruction comprises the media resource data, the tag data, and sorting information of the tags, so that the display device presents the tags on the user interface according to the sorting information of each tag.
7. The method of claim 6, wherein the acquiring of the corresponding media resource data and tag data according to the target search word and the sorting of the tag data according to the target search word specifically comprise:
acquiring, by the server, resource information corresponding to the target search word from a resource library;
and if the resource information corresponds to at least two service types, adjusting the display order of the tags according to the target search word.
8. The method of claim 7, wherein the sorting of the tag data according to the target search word comprises:
pre-storing, in the server, a mapping relationship between preset search words and weights of service types;
and when the voice text data is parsed to include the target search word, finding, from the pre-stored mapping relationship, the weight of each service type corresponding to the target search word.
9. The method of claim 7, wherein the sorting of the tag data according to the target search word comprises:
parsing the target search word and at least one non-target search word included in the voice text data;
and acquiring, based on a dependency relationship between the target search word and the at least one non-target search word, the weight of each service type corresponding to the target search word.
CN202310856095.6A 2019-10-22 2019-10-22 Voice control method and display device Pending CN117056622A (en)


Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202310856095.6A CN117056622A (en) 2019-10-22 2019-10-22 Voice control method and display device
CN201911008347.XA CN110737840B (en) 2019-10-22 2019-10-22 Voice control method and display device

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201911008347.XA Division CN110737840B (en) 2019-10-22 2019-10-22 Voice control method and display device

Publications (1)

Publication Number Publication Date
CN117056622A true CN117056622A (en) 2023-11-14






Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination