CN112135170A - Display device, server and video recommendation method - Google Patents

Display device, server and video recommendation method

Info

Publication number: CN112135170A
Authority: CN (China)
Prior art keywords: user, recommendation, data, server, video
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN202011002330.6A
Other languages: Chinese (zh)
Inventor: 杨云龙
Current Assignee: Qingdao Jukanyun Technology Co ltd (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original Assignee: Qingdao Jukanyun Technology Co ltd
Application filed by Qingdao Jukanyun Technology Co ltd; priority to CN202011002330.6A; published as CN112135170A

Classifications

    • H — ELECTRICITY
      • H04 — ELECTRIC COMMUNICATION TECHNIQUE
        • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
          • H04N 21/00 — Selective content distribution, e.g. interactive television or video on demand [VOD]
            • H04N 21/20 — Servers specifically adapted for the distribution of content, e.g. VOD servers; operations thereof
              • H04N 21/25 — Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
                • H04N 21/251 — Learning process for intelligent management, e.g. learning user preferences for recommending movies
              • H04N 21/23 — Processing of content or additional data; elementary server operations; server middleware
                • H04N 21/239 — Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests
                  • H04N 21/2393 — Interfacing the upstream path involving handling client requests
            • H04N 21/40 — Client devices specifically adapted for the reception of or interaction with content, e.g. set-top box [STB]; operations thereof
              • H04N 21/41 — Structure of client; structure of client peripherals
                • H04N 21/422 — Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
                  • H04N 21/42203 — Sound input device, e.g. microphone
              • H04N 21/45 — Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
                • H04N 21/4508 — Management of client data or end-user data
                  • H04N 21/4532 — Management of client data involving end-user characteristics, e.g. viewer profile, preferences
              • H04N 21/47 — End-user applications
                • H04N 21/482 — End-user interface for program selection
                  • H04N 21/4826 — End-user interface for program selection using recommendation lists, e.g. of programs or channels sorted according to their score

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present application relates to the field of communications technologies, and in particular to a display device, a server, and a video recommendation method. It addresses, at least in part, the problem that the accuracy of recommended videos is reduced when video recommendation cannot target an individual user and when the television's usage-trace information is lost. The display device includes: a display; and a first controller configured to: send a recommendation request carrying voiceprint information to a server, the recommendation request containing a search instruction used to cause the server to feed back recommendation data; receive the recommendation data, where, for the same search instruction, the recommendation data differ depending on whether the recommendation request carries the voiceprint information; and present the recommendation data on the display.

Description

Display device, server and video recommendation method
Technical Field
The present application relates to the field of communications technologies, and in particular, to a display device, a server, and a video recommendation method.
Background
A video recommendation system automatically recommends and presents videos related to those a user has watched, based on the user's viewing history and the search terms the user enters; it helps the user find videos of interest and increases the time the user spends on a video website.
In some video recommendation implementations, a user watches videos through an application pre-installed on a television. The application collects the user's viewing records, searches for related videos, determines the similarity between videos, and recommends videos to the user according to that similarity.
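As a rough illustration of this similarity-based approach, the sketch below scores each catalog video by its tag-set overlap (Jaccard similarity) with the user's watched videos. The tag sets and the scoring scheme are illustrative assumptions for this sketch, not the algorithm actually used by any particular recommendation system.

```python
# Minimal sketch of similarity-based recommendation from viewing records.
# Video tags and the max-over-watched scoring rule are assumptions.

def jaccard_similarity(tags_a: set, tags_b: set) -> float:
    """Jaccard similarity between two videos' tag sets."""
    if not tags_a or not tags_b:
        return 0.0
    return len(tags_a & tags_b) / len(tags_a | tags_b)

def recommend(watched: dict, catalog: dict, top_n: int = 3) -> list:
    """Rank unwatched catalog videos by best similarity to any watched video."""
    scores = {}
    for vid, tags in catalog.items():
        if vid in watched:
            continue  # do not re-recommend already-watched videos
        scores[vid] = max(
            (jaccard_similarity(tags, wtags) for wtags in watched.values()),
            default=0.0,
        )
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

watched = {"v1": {"sci-fi", "space", "drama"}}
catalog = {
    "v1": {"sci-fi", "space", "drama"},
    "v2": {"sci-fi", "space"},       # high overlap with v1
    "v3": {"comedy", "romance"},     # no overlap
    "v4": {"drama", "history"},      # partial overlap
}
print(recommend(watched, catalog, top_n=2))  # → ['v2', 'v4']
```

Because the viewing records belong to the television as a whole, every household member shares one `watched` dict — which is exactly the shortcoming the next paragraph describes.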
However, when the household in which the television is located has multiple users, it often happens that the recommended videos fetched for user A are actually results that user B likes; moreover, the accuracy of the recommended videos may decrease when the television's usage traces are accidentally cleared.
Disclosure of Invention
To address the problem that the accuracy of recommended videos is reduced when video recommendation cannot target an individual user and when the television's usage-trace information is lost, the present application provides a display device, a server, and a video recommendation method.
The embodiments of the present application are realized as follows:
a first aspect of an embodiment of the present application provides a display device, including: a display; a first controller configured to: sending a recommendation request containing voiceprint information to a server, wherein the recommendation request contains a search instruction, and the search instruction is used for enabling the server to feed back recommendation data and receive recommendation data, and for the same search instruction, when the recommendation request contains voiceprint information or does not contain voiceprint information, the recommendation data are different; presenting the recommendation data on the display.
A second aspect of the embodiments of the present application provides a server, including: a second controller configured to: receive a recommendation request, carrying voiceprint information and a search instruction, sent by a display device; determine recommendation data according to the search instruction contained in the recommendation request, where, for the same search instruction, the recommendation data differ depending on whether the recommendation request carries the voiceprint information; and send the recommendation data to the display device.
A third aspect of the embodiments of the present application provides a video recommendation method, including: sending a recommendation request carrying voiceprint information to a server, the recommendation request containing a search instruction used to cause the server to feed back recommendation data; receiving the recommendation data, where, for the same search instruction, the recommendation data differ depending on whether the recommendation request carries the voiceprint information; and displaying the recommendation data.
A fourth aspect of the embodiments of the present application provides a video recommendation method, including: receiving a recommendation request carrying voiceprint information and a search instruction; determining recommendation data according to the search instruction contained in the recommendation request, where, for the same search instruction, the recommendation data differ depending on whether the recommendation request carries the voiceprint information; and sending the recommendation data to a display device.
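The server-side branching described in the fourth aspect — the same search instruction yielding different recommendation data depending on whether the request carries voiceprint information — can be sketched as follows. All names here (`handle_request`, the profile store, the result lists) are illustrative assumptions, not the patent's actual implementation.

```python
# Hedged sketch: a server handler that returns personalized results when the
# recommendation request carries a recognized voiceprint, and generic results
# otherwise. Data stores are hard-coded stand-ins for real services.

GENERIC_RESULTS = {"football": ["match highlights", "league review"]}
PERSONAL_PROFILES = {
    "vp-001": "kids",  # voiceprint id -> preference dimension (assumed)
}
PERSONALIZED_RESULTS = {
    ("football", "kids"): ["cartoon football", "junior league"],
}

def handle_request(request: dict) -> list:
    """Determine recommendation data for a recommendation request."""
    query = request["search_instruction"]
    voiceprint = request.get("voiceprint")  # may be absent from the request
    if voiceprint in PERSONAL_PROFILES:
        # Personalized path: same search instruction, per-user results.
        genre = PERSONAL_PROFILES[voiceprint]
        return PERSONALIZED_RESULTS.get(
            (query, genre), GENERIC_RESULTS.get(query, [])
        )
    # No usable voiceprint: fall back to generic, non-personalized results.
    return GENERIC_RESULTS.get(query, [])

# Same search instruction, different recommendation data:
print(handle_request({"search_instruction": "football"}))
print(handle_request({"search_instruction": "football", "voiceprint": "vp-001"}))
```

The two calls at the end demonstrate the defining property of the claimed method: the search instruction is identical, and only the presence of voiceprint information changes what the server feeds back.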
Beneficial effects of the present application: by constructing a recommendation request containing a search instruction, the recommendation data fed back by the server can be acquired; further, by attaching voiceprint information, recommendation data targeted at an individual can be acquired; further, by constructing a user identification, a base portrait of the individual user can be determined; and further, by constructing a unique voice portrait for each individual user, fine-grained depiction and expression of the user's multi-dimensional information can be realized, personalized video recommendation for the individual user can be achieved, the user voice portrait can be applied in text-search scenarios, and the pertinence and accuracy of the video recommendation data are improved.
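One way to picture such a per-user voice portrait is as a small record keyed by the user identification resolved from the voiceprint, carrying multi-dimensional preference data that re-ranks text-search results. The field names below are assumptions chosen for illustration; the patent does not specify this schema.

```python
# Illustrative sketch of a per-user "voice portrait": a user id resolved from
# voiceprint matching plus preference dimensions used to boost search results.
# age_group and genre_weights are assumed fields, not the patent's schema.

from dataclasses import dataclass, field

@dataclass
class UserVoicePortrait:
    user_id: str                       # resolved by matching the voiceprint
    age_group: str = "unknown"         # coarse demographic dimension
    genre_weights: dict = field(default_factory=dict)  # fine-grained tastes

    def boost(self, video_genre: str) -> float:
        """Score multiplier applied when ranking search results."""
        return 1.0 + self.genre_weights.get(video_genre, 0.0)

portrait = UserVoicePortrait("user-42", "child", {"animation": 0.8})
base_score = 10.0
print(base_score * portrait.boost("animation"))  # boosted for this user
print(base_score * portrait.boost("news"))       # left unchanged
```

Multiplying a query-relevance score by `boost` is one simple way a ranking model could fold the portrait into a text-search scenario, as the effects above describe.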
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without inventive effort.
Fig. 1 is a schematic diagram illustrating an operation scenario between a display device and a control apparatus according to an embodiment;
fig. 2 is a block diagram exemplarily showing a hardware configuration of a display device 200 according to an embodiment;
fig. 3 is a block diagram exemplarily showing a hardware configuration of the control apparatus 100 according to the embodiment;
fig. 4 is a diagram exemplarily showing a functional configuration of the display device 200 according to the embodiment;
fig. 5a schematically shows a software configuration in the display device 200 according to an embodiment;
fig. 5b schematically shows a configuration of an application in the display device 200 according to an embodiment;
FIG. 6A is a schematic UI diagram of a television playing a program according to an embodiment of the present application;
FIG. 6B is a schematic UI diagram of a television acquiring a user voice instruction according to an embodiment of the present application;
FIG. 6C is a schematic UI diagram of a television displaying a user voice instruction according to an embodiment of the present application;
FIG. 6D is a schematic UI diagram of a television displaying video highlights according to an embodiment of the present application;
FIG. 7 is a schematic diagram of the main information dimensions of a user voice portrait according to an embodiment of the present application;
FIG. 8 is a schematic diagram of video recommendation data acquisition according to an embodiment of the present application;
FIG. 9 is a schematic timing diagram of video recommendation data acquisition according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a model for recognizing a user voice portrait according to an embodiment of the present application;
FIG. 11 is a schematic diagram of a ranking model for video recommendation data according to an embodiment of the present application;
FIG. 12 is a flowchart of a video recommendation method according to an embodiment of the present application;
FIG. 13 is a flowchart of a video recommendation method according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the exemplary embodiments of the present application clearer, the technical solutions in the exemplary embodiments of the present application will be clearly and completely described below with reference to the drawings in the exemplary embodiments of the present application, and it is obvious that the described exemplary embodiments are only a part of the embodiments of the present application, but not all the embodiments.
All other embodiments obtained by a person skilled in the art from the exemplary embodiments shown in the present application without inventive effort shall fall within the scope of protection of the present application. Moreover, while the disclosure herein has been presented in terms of one or more exemplary examples, it should be understood that each aspect of the disclosure can also be utilized independently and separately from the other aspects of the disclosure.
It should be understood that the terms "first," "second," "third," and the like in the description, in the claims, and in the drawings of the present application are used to distinguish between similar elements and are not necessarily intended to describe a particular order or sequence. It should be understood that terms so used are interchangeable under appropriate circumstances, so that, for example, the embodiments of the application described herein can be implemented in sequences other than those illustrated or otherwise described herein.
Furthermore, the terms "comprises" and "comprising," as well as any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or device that comprises a list of elements is not necessarily limited to those elements explicitly listed, but may include other elements not expressly listed or inherent to such product or device.
The term "module" as used herein refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code that is capable of performing the functionality associated with that element.
Reference throughout this specification to "embodiments," "some embodiments," "one embodiment," or "an embodiment," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, appearances of the phrases "in various embodiments," "in some embodiments," "in at least one other embodiment," or "in an embodiment" or the like throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. Thus, the particular features, structures, or characteristics shown or described in connection with one embodiment may be combined, in whole or in part, with the features, structures, or characteristics of one or more other embodiments, without limitation. Such modifications and variations are intended to be included within the scope of the present application.
The term "remote control" as used in this application refers to a component of an electronic device, such as the display device disclosed in this application, that can typically control the device wirelessly over a short range. It typically connects to the electronic device using infrared and/or radio frequency (RF) signals and/or Bluetooth, and may also include functional modules such as WiFi, wireless USB, Bluetooth, and motion sensors. For example, a handheld touch remote control replaces most of the physical built-in hard keys of a conventional remote control device with a user interface on a touch screen.
The term "gesture" as used in this application refers to a user's behavior through a change in hand shape or an action such as hand motion to convey a desired idea, action, purpose, or result.
Fig. 1 is a schematic diagram illustrating an operation scenario between a display device and a control apparatus according to an embodiment. As shown in fig. 1, a user may operate the display device 200 through the mobile terminal 300 and the control apparatus 100.
The control apparatus 100 may be a remote controller that controls the display device 200 wirelessly or by other wired means, including infrared protocol communication, Bluetooth protocol communication, and other short-distance communication methods. The user may input user commands through keys on the remote controller, voice input, control-panel input, and the like to control the display device 200. For example, the user can input corresponding control commands through the volume up/down keys, channel control keys, up/down/left/right movement keys, voice input key, menu key, and power key on the remote controller to control the functions of the display device 200.
In some embodiments, mobile terminals, tablets, computers, laptops, and other smart devices may also be used to control the display device 200. For example, the display device 200 is controlled using an application program running on the smart device. The application, through configuration, may provide the user with various controls in an intuitive User Interface (UI) on a screen associated with the smart device.
For example, the mobile terminal 300 and the display device 200 may each install a software application, so that connection and communication can be implemented through a network communication protocol, achieving one-to-one control operation and data communication. For example, a control instruction protocol can be established between the mobile terminal 300 and the display device 200, the remote control keyboard can be synchronized to the mobile terminal 300, and the display device 200 can be controlled by operating the user interface on the mobile terminal 300. The audio and video content displayed on the mobile terminal 300 can also be transmitted to the display device 200 to realize a synchronous display function.
As also shown in fig. 1, the display device 200 also performs data communication with the server 400 through multiple communication methods. The display device 200 may be communicatively connected through a local area network (LAN), a wireless local area network (WLAN), or other networks. The server 400 may provide various content and interactions to the display device 200. Illustratively, the display device 200 receives software program updates or accesses a remotely stored digital media library by sending and receiving information and interacting with an electronic program guide (EPG). The server 400 may be one group or multiple groups of servers, and may be one or more types of servers. Other web service content, such as video on demand and advertisement services, is provided through the server 400.
The display device 200 may be a liquid crystal display, an OLED display, or a projection display device. The specific display device type, size, resolution, and the like are not limited; those skilled in the art will appreciate that the display device 200 may be changed in performance and configuration as needed.
The display device 200 may additionally provide a smart network TV function that offers computer support functions in addition to the broadcast-receiving TV function, for example, a network TV, a smart TV, an Internet Protocol TV (IPTV), and the like.
A hardware configuration block diagram of the display device 200 according to an exemplary embodiment is exemplarily shown in fig. 2. As shown in fig. 2, the display device 200 includes a controller 210, a tuning demodulator 220, a communication interface 230, a detector 240, an input/output interface 250, a video processor 260-1, an audio processor 260-2, a display 280, an audio output 270, a memory 290, a power supply, and an infrared receiver.
The display 280 receives image signals from the video processor 260-1 and displays video content, images, and components of the menu manipulation interface. The display 280 includes a display screen assembly for presenting pictures and a driving assembly for driving image display. The displayed video content may come from broadcast television content or from broadcast signals received via wired or wireless communication protocols. Alternatively, various image contents sent from a network server and received via network communication protocols may be displayed.
Meanwhile, the display 280 also displays a user manipulation UI interface that is generated in the display device 200 and used to control the display device 200.
The display 280 further includes a driving component for driving display, depending on the type of the display 280. Alternatively, if the display 280 is a projection display, it may further include a projection device and a projection screen.
The communication interface 230 is a component for communicating with an external device or an external server according to various communication protocol types. For example: the communication interface 230 may be a Wifi chip 231, a bluetooth communication protocol chip 232, a wired ethernet communication protocol chip 233, or other network communication protocol chips or near field communication protocol chips, and an infrared receiver (not shown).
The display device 200 may establish transmission and reception of control signals and data signals with an external control apparatus or a content providing apparatus through the communication interface 230. The infrared receiver is an interface device for receiving infrared control signals from the control apparatus 100 (e.g., an infrared remote controller).
The detector 240 is a component used by the display device 200 to collect signals from the external environment or to interact with the outside. The detector 240 includes a light receiver 242, a sensor for collecting the intensity of ambient light, so that display parameters can be adaptively adjusted according to the collected ambient light.
The image collector 241, such as a camera, may be used to collect external environment scenes, to collect user attributes or user gestures for interaction, to adaptively change display parameters, and to recognize user gestures, so as to implement interaction with the user.
In some other exemplary embodiments, the detector 240 may further include a temperature sensor: by sensing the ambient temperature, the display device 200 may adaptively adjust the display color temperature of the image. For example, the display device 200 may be adjusted to display a cooler tone when the ambient temperature is high, or a warmer tone when the ambient temperature is low.
In other exemplary embodiments, the detector 240 may further include a sound collector, such as a microphone, which may be used to receive the user's voice, including voice signals carrying control instructions with which the user controls the display device 200, or to collect ambient sound for identifying the type of the ambient scene, so that the display device 200 can adapt to ambient noise.
The input/output interface 250, under the control of the controller 210, performs data transmission between the display device 200 and other external devices, such as receiving video and audio signals or command instructions from an external device.
Input/output interface 250 may include, but is not limited to, any one or more of the following: a high-definition multimedia interface (HDMI) 251, an analog or data high-definition component input interface 253, a composite video input interface 252, a USB input interface 254, RGB ports (not shown in the figures), and the like.
In some other exemplary embodiments, the input/output interface 250 may also form a composite input/output interface with the above-mentioned plurality of interfaces.
The tuning demodulator 220 receives broadcast television signals in a wired or wireless manner, may perform modulation and demodulation processing such as amplification, mixing, and resonance, and demodulates, from among multiple wireless or wired broadcast television signals, the television audio/video signals carried on the channel frequency selected by the user, as well as EPG data signals.
The tuning demodulator 220, as controlled by the controller 210, responds to the television signal frequency selected by the user and the television signal carried on that frequency.
The tuning demodulator 220 may receive signals in various ways according to the broadcasting system of the television signal, such as terrestrial broadcasting, cable broadcasting, satellite broadcasting, or Internet broadcasting; and, according to the modulation type, the modulation may be digital or analog. Depending on the type of television signal received, both analog and digital signals can be processed.
In other exemplary embodiments, the tuning demodulator 220 may be located in an external device, such as an external set-top box. In this way, the set-top box outputs television audio/video signals after modulation and demodulation, which are input into the display device 200 through the input/output interface 250.
The video processor 260-1 is configured to receive an external video signal and perform video processing such as decompression, decoding, scaling, noise reduction, frame rate conversion, resolution conversion, and image synthesis according to the standard codec protocol of the input signal, so as to obtain a signal that can be displayed or played directly on the display device 200.
Illustratively, the video processor 260-1 includes a demultiplexing module, a video decoding module, an image synthesizing module, a frame rate conversion module, a display formatting module, and the like.
The demultiplexing module is used to demultiplex the input audio/video data stream; for example, if an MPEG-2 stream is input, the demultiplexing module demultiplexes it into a video signal and an audio signal.
The video decoding module is used to process the demultiplexed video signal, including decoding, scaling, and the like.
The image synthesis module is used to superimpose and mix the GUI signal, generated by the graphics generator according to user input, with the scaled video image, so as to generate an image signal for display.
The frame rate conversion module is configured to convert the frame rate of the input video, for example converting a 60 Hz frame rate into a 120 Hz or 240 Hz frame rate, typically by frame interpolation.
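The idea of frame interpolation can be sketched very simply: to double the frame rate, insert a synthesized frame between each pair of consecutive frames. Below, frames are modeled as lists of pixel luminance values and the inserted frame is a per-pixel midpoint; real frame rate conversion hardware uses motion-compensated interpolation, so this averaging is only an illustrative stand-in.

```python
# Naive sketch of frame-rate doubling by frame interpolation (60 Hz -> 120 Hz).
# Frames are lists of pixel values; the midpoint average is an assumption,
# not how a motion-compensated converter actually synthesizes frames.

def double_frame_rate(frames: list[list[float]]) -> list[list[float]]:
    """Insert an averaged frame between each pair of consecutive frames."""
    if len(frames) < 2:
        return list(frames)
    out = []
    for prev, nxt in zip(frames, frames[1:]):
        out.append(prev)
        # Interpolated frame: per-pixel midpoint of the two neighbors.
        out.append([(a + b) / 2 for a, b in zip(prev, nxt)])
    out.append(frames[-1])
    return out

clip = [[0.0, 0.0], [1.0, 1.0]]   # two frames, two pixels each
print(double_frame_rate(clip))    # three frames after conversion
```

For an n-frame clip this yields 2n-1 frames, which approaches an exact doubling of the frame rate for long streams.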
The display formatting module is used to convert the received video output signal after frame rate conversion into a signal conforming to the display format, such as an RGB data signal output.
The audio processor 260-2 is configured to receive an external audio signal, decompress and decode it according to the standard codec protocol of the input signal, and perform noise reduction, digital-to-analog conversion, amplification, and the like, to obtain an audio signal that can be played by the speaker.
In other exemplary embodiments, video processor 260-1 may comprise one or more chips. The audio processor 260-2 may also comprise one or more chips.
And, in other exemplary embodiments, the video processor 260-1 and the audio processor 260-2 may be separate chips or may be integrated together with the controller 210 in one or more chips.
The audio output 270, under the control of the controller 210, receives the sound signal output by the audio processor 260-2. Besides the speaker 272 carried by the display device 200 itself, it includes an external sound output terminal 274 that can output to a sound-producing device of an external device, such as an external sound interface or an earphone interface.
The power supply, under the control of the controller 210, provides power supply support for the display device 200 with power input from an external power source. The power supply may include a built-in power supply circuit installed inside the display device 200, or a power supply interface installed outside the display device 200 to provide an external power supply for the display device 200.
The user input interface is used to receive a user's input signal and then transmit the received user input signal to the controller 210. The user input signal may be a remote controller signal received through the infrared receiver, and various user control signals may be received through the network communication module.
For example, the user inputs a user command through the remote controller 100 or the mobile terminal 300; the user input interface receives the input, and the display device 200 responds to the user input through the controller 210.
In some embodiments, a user may enter a user command on a Graphical User Interface (GUI) displayed on the display 280, and the user input interface receives the user input command through the Graphical User Interface (GUI). Alternatively, the user may input the user command by inputting a specific sound or gesture, and the user input interface receives the user input command by recognizing the sound or gesture through the sensor.
The controller 210 controls the operation of the display apparatus 200 and responds to the user's operation through various software control programs stored in the memory 290.
As shown in fig. 2, the controller 210 includes a RAM213 and a ROM214, and a graphic processor 216, a CPU processor 212, a communication interface 218, such as: a first interface 218-1 through an nth interface 218-n, and a communication bus. The RAM213 and the ROM214, the graphic processor 216, the CPU processor 212, and the communication interface 218 are connected via a bus.
A ROM214 for storing instructions for various system boots. When the display device 200 is powered on upon receipt of the power-on signal, the CPU processor 212 executes the system boot instructions in the ROM, copies the operating system stored in the memory 290 to the RAM213, and begins running to boot the operating system. After the operating system has finished starting, the CPU processor 212 copies the various application programs in the memory 290 to the RAM213 and then launches them.
A graphics processor 216 for generating various graphics objects, such as icons, operation menus, and user-input instruction display graphics. The graphics processor comprises an arithmetic unit, which performs operations on the various interactive instructions input by the user and displays the various objects according to their display attributes, and a renderer, which generates the various objects based on the arithmetic unit's results and displays the rendered result on the display 280.
A CPU processor 212 for executing operating system and application program instructions stored in the memory 290, and for executing various application programs, data, and content according to the various interactive instructions received from outside, so as to finally display and play various audio and video content.
In some exemplary embodiments, the CPU processor 212 may include a plurality of processors: one main processor and one or more sub-processors. The main processor performs some operations of the display apparatus 200 in a pre-power-up mode and/or displays a screen in normal mode. The one or more sub-processors perform operations in standby mode and the like.
The controller 210 may control the overall operation of the display apparatus 200. For example: in response to receiving a user command for selecting a UI object to be displayed on the display 280, the controller 210 may perform an operation related to the object selected by the user command.
Wherein the object may be any one of selectable objects, such as a hyperlink or an icon. Operations related to the selected object, such as: displaying an operation connected to a hyperlink page, document, image, or the like, or performing an operation of a program corresponding to the icon. The user command for selecting the UI object may be a command input through various input means (e.g., a mouse, a keyboard, a touch pad, etc.) connected to the display apparatus 200 or a voice command corresponding to a voice spoken by the user.
The memory 290 stores various software modules for driving the display device 200, including a basic module, a detection module, a communication module, a display control module, a browser module, various service modules, and the like.
The basic module is a bottom-layer software module for signal communication among the various hardware components in the display device 200 and for sending processing and control signals to the upper-layer modules. The detection module collects various information from sensors or the user input interface and performs digital-to-analog conversion and analysis management.
For example: the voice recognition module comprises a voice analysis module and a voice instruction database module. The display control module is a module for controlling the display 280 to display image content, and may be used to play information such as multimedia image content and UI interface. And the communication module is used for carrying out control and data communication with external equipment. And the browser module is used for executing a module for data communication between browsing servers. And the service module is used for providing various services and modules including various application programs.
Meanwhile, the memory 290 also stores received external data and user data, images of the items in various user interfaces, focus objects, visual-effect maps, and the like.
A block diagram of the configuration of the control apparatus 100 according to an exemplary embodiment is exemplarily shown in fig. 3. As shown in fig. 3, the control apparatus 100 includes a controller 110, a communication interface 130, a user input/output interface 140, a memory 190, and a power supply 180.
The control device 100 is configured to control the display device 200: it receives the user's input operation instructions and converts them into instructions that the display device 200 can recognize and respond to, serving as an interaction intermediary between the user and the display device 200. For example, when the user operates the channel up/down keys on the control device 100, the display device 200 responds to the channel up/down operation.
In some embodiments, the control device 100 may be a smart device. Such as: the control apparatus 100 may install various applications that control the display apparatus 200 according to user demands.
In some embodiments, as shown in fig. 1, a mobile terminal 300 or other intelligent electronic device may perform a function similar to the control device 100 after installing an application that manipulates the display device 200. For example, by installing such an application, the user may use the various function keys or virtual buttons of a graphical user interface available on the mobile terminal 300 or other intelligent electronic device to implement the functions of the physical keys of the control device 100.
The controller 110 includes a processor 112, a RAM113 and a ROM114, a communication interface 218, and a communication bus. The controller 110 controls the operation of the control device 100, the communication and coordination among its internal components, and external and internal data processing.
The communication interface 130 enables communication of control signals and data signals with the display apparatus 200 under the control of the controller 110. Such as: the received user input signal is transmitted to the display apparatus 200. The communication interface 130 may include at least one of a WiFi chip, a bluetooth module, an NFC module, and other near field communication modules.
A user input/output interface 140, wherein the input interface includes at least one of a microphone 141, a touch pad 142, a sensor 143, keys 144, and other input interfaces. Such as: the user can realize a user instruction input function through actions such as voice, touch, gesture, pressing, and the like, and the input interface converts the received analog signal into a digital signal and converts the digital signal into a corresponding instruction signal, and sends the instruction signal to the display device 200.
The output interface includes an interface that transmits the received user instruction to the display apparatus 200. In some embodiments, the interface may be an infrared interface or a radio frequency interface. For example, when the infrared signal interface is used, the user input instruction needs to be converted into an infrared control signal according to an infrared control protocol and sent to the display device 200 through the infrared sending module. As another example, when the rf signal interface is used, the user input command needs to be converted into a digital signal, modulated according to the rf control signal modulation protocol, and then transmitted to the display device 200 through the rf transmitting terminal.
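As a concrete illustration of the infrared case, the sketch below packs a key command into an NEC-style frame (an 8-bit address and an 8-bit command, each followed by its bitwise inverse for error checking). The frame layout is a common IR convention used here for illustration only, not necessarily the exact protocol of the control device.

```python
# Hedged sketch: pack a user key command into an NEC-style infrared frame.
# Each byte is followed by its bitwise inverse so the receiver can verify
# the transmission. Illustrative only; not the control device's protocol.

def encode_ir_frame(address: int, command: int) -> bytes:
    if not (0 <= address <= 0xFF and 0 <= command <= 0xFF):
        raise ValueError("address and command must fit in one byte")
    return bytes([address, address ^ 0xFF, command, command ^ 0xFF])

frame = encode_ir_frame(0x04, 0x08)  # e.g. device 0x04, hypothetical "channel up" 0x08
```

The infrared sending module would then modulate these bytes onto the carrier defined by the IR control protocol.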
In some embodiments, the control device 100 includes at least one of a communication interface 130 and an output interface. When the control device 100 is provided with a communication interface 130, such as a WiFi, Bluetooth, or NFC module, it may encode and transmit the user input command to the display device 200 through the WiFi, Bluetooth, or NFC protocol.
A memory 190 for storing various operation programs, data, and applications for driving and controlling the control device 100 under the control of the controller 110. The memory 190 may store various control signal commands input by the user.
And a power supply 180 for providing operational power support to the various elements of the control device 100 under the control of the controller 110. The power supply 180 may include a battery and associated control circuitry.
Fig. 4 is a diagram schematically illustrating a functional configuration of the display device 200 according to an exemplary embodiment. As shown in fig. 4, the memory 290 is used to store an operating system, an application program, contents, user data, and the like, and performs system operations for driving the display device 200 and various operations in response to a user under the control of the controller 210. The memory 290 may include volatile and/or nonvolatile memory.
The memory 290 is specifically configured to store an operating program for driving the controller 210 in the display device 200, and to store various application programs installed in the display device 200, various application programs downloaded by a user from an external device, various graphical user interfaces related to the applications, various objects related to the graphical user interfaces, user data information, and internal data of various supported applications. The memory 290 is used to store system software such as an OS kernel, middleware, and applications, and to store input video data and audio data, and other user data.
The memory 290 is specifically used for storing drivers and related data such as the audio/video processors 260-1 and 260-2, the display 280, the communication interface 230, the tuning demodulator 220, the input/output interface of the detector 240, and the like.
In some embodiments, memory 290 may store software and/or programs, software programs for representing an Operating System (OS) including, for example: a kernel, middleware, an Application Programming Interface (API), and/or an application program. For example, the kernel may control or manage system resources, or functions implemented by other programs (e.g., the middleware, APIs, or applications), and the kernel may provide interfaces to allow the middleware and APIs, or applications, to access the controller to implement controlling or managing system resources.
The memory 290, for example, includes a broadcast receiving module 2901, a channel control module 2902, a volume control module 2903, an image control module 2904, a display control module 2905, an audio control module 2906, an external instruction recognition module 2907, a communication control module 2908, a light receiving module 2909, a power control module 2910, an operating system 2911, and other applications 2912, a browser module, and the like. The controller 210 performs functions such as: a broadcast television signal reception demodulation function, a television channel selection control function, a volume selection control function, an image control function, a display control function, an audio control function, an external instruction recognition function, a communication control function, an optical signal reception function, an electric power control function, a software control platform supporting various functions, a browser function, and the like.
A block diagram of a configuration of a software system in a display device 200 according to an exemplary embodiment is exemplarily shown in fig. 5 a.
As shown in fig. 5a, the operating system 2911 includes executing operating software for handling various basic system services and performing hardware-related tasks, acting as an intermediary for data processing between application programs and hardware components. In some embodiments, portions of the operating system kernel may contain a series of software to manage the display device hardware resources and provide services to other programs or software code.
In other embodiments, portions of the operating system kernel may include one or more device drivers, which may be a set of software code in the operating system that assists in operating or controlling the devices or hardware associated with the display device. The drivers may contain code that operates the video, audio, and/or other multimedia components. Examples include a display screen, a camera, Flash, WiFi, and audio drivers.
The accessibility module 2911-1 is configured to modify or access the application program to achieve accessibility and operability of the application program for displaying content.
A communication module 2911-2 for connection to other peripherals via associated communication interfaces and a communication network.
The user interface module 2911-3 is configured to provide an object for displaying a user interface, so that each application program can access the object, and user operability can be achieved.
Control applications 2911-4 for controllable process management, including runtime applications and the like.
The event transmission system 2914 may be implemented within the operating system 2911 or within the application program 2912; in some embodiments it is implemented partly within the operating system 2911 and partly within the application program 2912. It is configured to listen for various user input events and, according to the event, to invoke handlers that perform one or more sets of predefined operations in response to the recognition of each type of event or sub-event.
The event monitoring module 2914-1 is configured to monitor an event or a sub-event input by the user input interface.
The event identification module 2914-2 is configured with definitions of the various types of events for the various user input interfaces; it identifies the events or sub-events and dispatches them to the processes that execute the corresponding one or more sets of handlers.
An event or sub-event refers to an input detected by one or more sensors in the display device 200 or an input from an external control device (e.g., the control device 100), such as sub-events input by voice, gesture inputs recognized by gesture recognition, and sub-events input by remote-control key commands of the control device. Illustratively, the sub-events from the remote control take a variety of forms, including but not limited to one or a combination of pressing the up/down/left/right keys, the OK key, and long key presses, as well as non-physical key operations such as move, hold, and release.
The interface layout manager 2913 receives, directly or indirectly, the user input events or sub-events monitored by the event transmission system 2914 and updates the layout of the user interface accordingly, including but not limited to the position of each control or sub-control in the interface, the size, position, and level of containers, and other operations related to interface layout.
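The monitoring, identification, and dispatch flow described above can be pictured as a small dispatcher. The class, event names, and handler interfaces in this sketch are illustrative assumptions, not the patent's actual module interfaces.

```python
# Minimal sketch of the event transmission system: raw inputs are
# identified as event types, and registered handlers run in response.
# All names here are illustrative assumptions.

class EventTransmissionSystem:
    def __init__(self):
        self._handlers = {}  # event type -> list of handler callables

    def register(self, event_type, handler):
        """Define a handler for one type of event/sub-event."""
        self._handlers.setdefault(event_type, []).append(handler)

    def identify(self, raw_event):
        """Event identification: map a raw input to a known event type."""
        if raw_event.get("source") == "remote" and raw_event.get("key") == "OK":
            return "confirm"
        if raw_event.get("source") == "voice":
            return "voice_command"
        return "unknown"

    def dispatch(self, raw_event):
        """Event monitoring + dispatch: run every handler for the type."""
        event_type = self.identify(raw_event)
        results = [h(raw_event) for h in self._handlers.get(event_type, [])]
        return event_type, results

system = EventTransmissionSystem()
system.register("confirm", lambda e: "select focused UI object")
event_type, actions = system.dispatch({"source": "remote", "key": "OK"})
```

An interface layout manager would subscribe in the same way, receiving the dispatched events and updating control positions and container levels.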
As shown in fig. 5b, the application layer 2912 contains various applications that may also be executed at the display device 200. The application may include, but is not limited to, one or more applications such as: live television applications, video-on-demand applications, media center applications, application centers, gaming applications, and the like.
The live television application program can provide live television through different signal sources. For example, a live television application may provide television signals using input from cable television, radio broadcasts, satellite services, or other types of live television services. And, the live television application may display video of the live television signal on the display device 200.
A video-on-demand application may provide video from different storage sources. Unlike live television applications, video on demand provides video playback from a storage source. For example, the video on demand may come from a cloud-storage server side or from local hard disk storage containing stored video programs.
The media center application program can provide various applications for playing multimedia contents. For example, a media center may provide services other than live television or video on demand, allowing a user to access various images or audio through the media center application.
The application program center can provide and store various application programs. The application may be a game, an application, or some other application associated with a computer system or other device that may be run on the smart television. The application center may obtain these applications from different sources, store them in local storage, and then be operable on the display device 200.
The embodiment of the application can be applied to various types of display devices (including but not limited to smart televisions, set-top boxes and the like). The technical solution will be explained below with respect to the relevant UI for generating video recommendation data at the tv end.
Fig. 6A shows a UI diagram of a tv playing program according to an embodiment of the present application.
The television is in a program playing state, in which the user can perform the operation of acquiring video recommendation data. It should be noted that when the television is in other interfaces, such as the system main UI, the video call UI, the movie playing UI, or the UI of another application program, the user may likewise operate the display device to obtain the video recommendation data provided by the present application.
In some embodiments, the display device, while playing a program, may also be configured to present other interactive elements, which may include, for example, television home page controls, search controls, message button controls, mailbox controls, browser controls, favorites controls, signal bar controls, voice controls, and the like.
In order to improve the convenience and appearance of the UI of a display device, the display device provided by the present application comprises a display, a sound collector, and a first controller. The display presents the user interface; the sound collector acquires the voice instructions issued by the user and may be implemented as, for example, a microphone; the first controller controls the display device and its UI in response to operations on the interactive elements. For example, when a user clicks the search control with a controller such as a remote control, the search UI is shown on top of the other UIs; that is, the UI of the application component mapped to an interactive element of the display device can be enlarged, reduced, or displayed full-screen.
In some embodiments, the interactive elements of the display device may also be operated through a sensor, which may be, but is not limited to, an acoustic input sensor, such as the sound collector provided with the display device of the present application, which can detect a voice command indicating the desired interactive element. For example, after activating voice control by pressing a shortcut button on the remote control of the display device, the user operates the browser control of the display device by saying "open browser" or any other suitable indication.
Fig. 6B shows a UI diagram of a television acquiring a user voice instruction according to an embodiment of the present application.
The following description will take the example where the tv user wants to get a movie video recommendation.
While a television program or any of the various UI interfaces is in use, the display device can receive a recommendation request from the user through a microphone or a remote controller. The recommendation request contains voiceprint information: the display device receives the request containing the voiceprint information from the user and sends it to the server. The request also contains a search instruction, which is used to make the server feed back recommendation data. It should be noted that the server may feed back video recommendation data, e-book recommendation data, APP recommendation data, or other types of resources according to the user's recommendation request.
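Conceptually, the recommendation request described here is a structured message carrying the search instruction, the voiceprint data, and optionally a user identifier. The field names in this sketch are assumptions for illustration only.

```python
# Illustrative shape of a recommendation request sent from the display
# device to the server. All field names are assumptions for this sketch.

def build_recommendation_request(search_text, voiceprint_bytes, user_id=None):
    request = {
        "search_instruction": search_text,   # semantics the server responds to
        "voiceprint": voiceprint_bytes,      # voice features used for identification
    }
    if user_id is not None:
        request["user_id"] = user_id         # optional: lets the server find a basic portrait
    return request

req = build_recommendation_request("I want to see some good movies", b"\x01\x02")
```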
In some embodiments, a user recommendation request containing voiceprint information may be input through an audio receiving element, such as a microphone, in the user input interface 140. A key input on the remote control triggers the television to begin detecting the user's voice command; the first controller then recognizes the voice command from the microphone and submits data characterizing the interaction to the UI or its processing component or engine. It should be noted that in some embodiments the microphone may be disposed in the remote controller; in other embodiments, the microphone may be disposed in the body of the television.
In some embodiments, the user operates the remote control to trigger the television UI to display the voice control, and when the user triggers voice input, the first controller displays the voice control on the top layer of the current television UI to prompt the user to perform voice input in time. For example, the voice control contains prompt information, which is displayed as "please talk" in the UI, as shown in fig. 6B, the user can make a recommendation request to the television in time after seeing the prompt of the voice control.
In some embodiments, the first controller presents voice instruction prompt information in a standard format at the top layer of the television UI, and by imitating this voice instruction format the user can improve the rate at which the television recognizes the recommendation request. For example, the television UI may prompt "you can try to say: I want to watch a TV show", as shown in fig. 6B.
Based on the television UI shown in fig. 6B, after seeing the voice control prompt, the user makes a recommendation request, e.g., "i want to see some good movies", to the microphone of the television, which the microphone of the display device will receive and send to the first controller of the display device for parsing.
In some embodiments, the first controller parses the search instruction contained in the recommendation request into a computer-readable format, such as a text format, and displays it on the television UI so that the user can view the search instruction issued by the user as text, as shown in fig. 6C.
Fig. 6C is a UI diagram illustrating a television displaying a user voice command according to an embodiment of the present application.
The first controller parses the search instruction contained in the recommendation request sent by the user into its semantics, which are displayed on the television UI as text information; by reading the text, the user can judge how accurately the television captured the voice instruction.
For example, if the text information matches the semantics of the recommendation request sent by the user, the television is considered to have fully recognized the user's recommendation request. If the text information differs from the recommendation request sent by the user, the television is considered to have misunderstood the request due to factors such as the user's accent or dialect, the loudness of the voice, or ambient noise around the television. By reading the text information, the user decides whether the recommendation request needs to be sent again for correction.
For the same search instruction, the recommendation data differ depending on whether or not the recommendation request contains voiceprint information.
In some embodiments, the recommendation request further includes a user identifier. The user identifier enables the server to determine a user basic portrait, which in turn enables the server to feed back the recommendation data based on the user basic portrait and/or the voiceprint information in response to the search instruction.
When the recommendation request contains a user identifier, the server determines the user basic portrait according to the user identifier, and then determines the recommendation data to provide according to the basic portrait together with the voiceprint information and the search instruction contained in the recommendation request.
When the recommendation request does not contain the user identifier, the server cannot determine a user basic portrait from an identifier, and instead determines the video recommendation data to provide according to the search instruction and the voiceprint information contained in the recommendation request.
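The two cases, with and without a user identifier, reduce to a simple branch on the server side. A minimal sketch, with hypothetical lookup and recommendation functions:

```python
# Sketch of the server-side branch described above; the portrait lookup
# and the recommend() callback are hypothetical placeholders.

def determine_recommendation(request, portrait_db, recommend):
    """recommend(search, voiceprint, base_portrait) -> recommendation data."""
    search = request["search_instruction"]
    voiceprint = request.get("voiceprint")
    user_id = request.get("user_id")
    if user_id is not None:
        # Identifier present: resolve the user basic portrait first.
        base_portrait = portrait_db.get(user_id)
    else:
        # No identifier: rely on the search instruction and voiceprint only.
        base_portrait = None
    return recommend(search, voiceprint, base_portrait)

portraits = {"family-01": {"preference": "movies"}}

def fake_recommend(search, voiceprint, base_portrait):
    return {"search": search, "portrait": base_portrait}

with_id = determine_recommendation(
    {"search_instruction": "movies", "voiceprint": b"x", "user_id": "family-01"},
    portraits, fake_recommend)
without_id = determine_recommendation(
    {"search_instruction": "movies", "voiceprint": b"x"},
    portraits, fake_recommend)
```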
In some embodiments, the television differs from mobile devices such as computers and mobile phones in implementing video recommendation: a television is commonly shared by several members of a family, and if the television does not collect the user's voiceprint information during interaction, it cannot distinguish among the different users in the household. In other embodiments, the television collects the user identifier during interaction, and the first controller of the television determines the user basic portrait by recognizing the user identifier, so the identities of different users in the family can be accurately acquired and the accuracy of video recommendation improved.
In some embodiments, for the same search instruction, the recommendation data differ when the voiceprint information in the recommendation request differs. The voiceprint information further enables the server to identify one or a combination of the user's gender, age range, and voiceprint ID, so as to obtain the user voice portrait by matching.
Identification of the voiceprint information mainly covers recognizing the speaker's gender, age range, voiceprint ID, emotional state, intonation, and the like. In a voice video-recommendation scenario at the television end, the speaker's gender and age range are important factors influencing the recommendation, and the voiceprint ID corresponds to a unique user.
In some embodiments, the age range may be implemented as the following intervals: 0-6 years old: young child; 7-14 years old: teenager; 15-35 years old: young adult; 36-60 years old: middle-aged; 61 years old and above: elderly.
Since family members differ in gender and age range, identifying the user's gender and age range from the voiceprint information in the recommendation request and using them as reference factors for the recommendation data can improve the accuracy and match rate of the recommendations. It should be noted that users are divided into age ranges in order to better profile groups with distinct characteristics; using precise age information alone would make the granularity too fine and is not suitable for constructing the user voice portrait.
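The age intervals above map naturally onto a small bucketing function; a minimal sketch:

```python
# Map an estimated age to the coarse ranges used for the user voice
# portrait (0-6 young child, 7-14 teenager, 15-35 young adult,
# 36-60 middle-aged, 61+ elderly).

def age_range_label(age):
    if age <= 6:
        return "young child"
    if age <= 14:
        return "teenager"
    if age <= 35:
        return "young adult"
    if age <= 60:
        return "middle-aged"
    return "elderly"
```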
In some embodiments, the first controller of the display device builds a basic portrait for the family by adding tags derived from the historical operation records of the display device. The basic portrait describes features of the family, including one or a combination of device basic information, consumption capability information, and family preference information. The basic portrait characterizes how the display device is used by the household as a whole; the user basic portrait characterizes how the display device is used by an individual in the household; and the user voice portrait combines the user basic portrait with the voiceprint information.
The basic portrait is mainly depicted from the operation records of different users in the family, such as historical browsing, historical clicking, and historical searching behaviors. Tags are added according to the usage characteristics of the whole family through tools such as machine learning and network modeling, and the tags are mapped to a unified tag library on the server for storage.
The basic portrait mainly comprises device basic information, consumption capability information, family preference information, and similar features. The device basic information mainly includes the display device's model, geographical location, and related details; it can be obtained from the place of purchase or from logs reported when the television goes online after startup, so it is easy to acquire. The consumption capability information is mainly derived from the VIP packages the family has purchased on the television, such as whether the family is a paying user, whether it is a continuous-monthly subscriber, which packages were purchased, and whether there are single-purchase records; these records are captured as soon as consumption behavior occurs on the television, so they are also easy to acquire. The family preference information mainly reflects user interests, both short-term and long-term; it can be obtained by modeling user behavior with tools such as machine learning or neural network models and adding interest tags to the family. For example, a tag such as "movie buff" may be defined for the family preference information, with the condition that the number of films watched in the past N months reaches M. As another example, a heavy-VIP-purchaser tag may be defined, satisfied by those who purchased several VIP packages in the past N months and watched paid videos more than M times.
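Preference tags of the kind just described ("watched at least M items of a category in the past N months") can be evaluated as simple predicates over the viewing history. In this sketch, the record format and the thresholds are illustrative assumptions:

```python
from datetime import datetime, timedelta

# Illustrative evaluation of a family-preference tag rule such as
# "watched at least M movies in the past N months". The record format
# and thresholds are assumptions for this sketch.

def has_tag(history, category, min_count, months, now=None):
    now = now or datetime.now()
    cutoff = now - timedelta(days=30 * months)  # approximate month length
    recent = [r for r in history
              if r["category"] == category and r["watched_at"] >= cutoff]
    return len(recent) >= min_count

now = datetime(2020, 9, 1)
history = [{"category": "movie", "watched_at": datetime(2020, 8, d)}
           for d in range(1, 6)]  # five movie views in August
tagged = has_tag(history, "movie", min_count=3, months=2, now=now)
```

Rules like this one would run offline over the reported logs, and the resulting tags would be stored in the server's unified tag library.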
In some embodiments, before sending the recommendation request containing voiceprint information to the server, the first controller of the display device receives voice data from the user and generates the recommendation request from the voice data, the recommendation request comprising a search instruction and voiceprint information.
For example, voice data sent by a user is acquired through a user voice input device such as a television remote controller or a microphone, and the voice data is received in a form of a voice instruction; the first controller analyzes the voice command, obtains a search command containing semantics and contained voiceprint information, and stores the identification result to the database.
In some embodiments, the first controller of the display apparatus receives a key input from the remote controller and generates a recommendation request from the key input, the recommendation request comprising a search instruction.
For example, a user sends a search instruction to the television through a television remote controller or another control terminal; the first controller parses the user's key input to generate a recommendation request containing the search instruction, and the server determines the recommendation data sent to the display device according to the semantics contained in the search instruction.

The present application also provides a server for providing video recommendation data to the display device as above, the server comprising a second controller configured to: receive a recommendation request containing voiceprint information and a search instruction sent by a display device; determine recommendation data according to the search instruction contained in the recommendation request, wherein for the same search instruction, the recommendation data differs depending on whether the recommendation request contains voiceprint information; and finally, send the recommendation data to the display device.
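A minimal sketch of how a recommendation request of either kind (voice-originated with voiceprint information, or key-input without it) might be assembled on the display-device side. The field names (`search_instruction`, `voiceprint_info`, `user_id`) are illustrative assumptions, not the patent's actual message format.

```python
# Hypothetical request assembly: voiceprint information is attached only
# when the request originated from voice input; a key-input search carries
# only the search instruction (and optionally a user identifier).

def build_recommendation_request(search_text, voiceprint_bytes=None, user_id=None):
    request = {"search_instruction": search_text}
    if voiceprint_bytes is not None:
        request["voiceprint_info"] = voiceprint_bytes
    if user_id is not None:
        request["user_id"] = user_id
    return request
```

The server can then branch on the presence of `voiceprint_info`, which is exactly the distinction the embodiments above draw between voice and key-input requests.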
In some embodiments, the second controller of the server matches the voiceprint information with the user speech representation to determine recommendation data that may be provided based on search instructions contained in the recommendation request. The second controller analyzes the voiceprint information contained in the recommendation request to identify one or more combinations of gender, age range and voiceprint ID of the user; and the second controller constructs a user voice portrait according to the voiceprint information of the user and the user basic portrait identified by the user identification, the user voice portrait is specific to a single user, the user voice portrait corresponds to a unique voiceprint ID, and the second controller determines video recommendation data which can be provided according to the semantics contained in the search instruction and the user voice portrait.
The second controller identifies the received voiceprint information through the voiceprint identification service, and can obtain personal information of the user sending the recommendation request, such as one or more combinations of the gender, the age range and the voiceprint ID of the user.
In some embodiments, the recommendation request received by the server further includes a user identification, and the second controller determines a user base representation based on the user identification, the user base representation being used to cause the second controller to determine the recommendation data based on the user base representation and/or the voiceprint information in response to the search instruction.
The second controller can obtain a basic portrait of the display device through tools such as machine learning or neural network models, the basic portrait being used to describe family characteristics and comprising one or more combinations of device basic information, consumption capability information and family preference information. The second controller determines a user basic portrait according to the user identification, and then constructs a user voice portrait for a specific user based on the user basic portrait and the voiceprint information; the user voice portrait provides a basis for the server's video recommendation search algorithm.
In some embodiments, based on the voiceprint information, the user base representation, the second controller of the server may construct a user voice representation for the particular user, generating a more granular representation tag, including identity, interest, mental state, etc. characteristics of each of the family members.
For example, the second controller identifies, through the user identifier included in the recommendation request, that two different voiceprint IDs exist in the family, corresponding to male user A and female user B respectively, so it can be inferred that there is no child in the family. After further determining that male user A and female user B are young adults, the second controller may construct a user voice portrait for male user A and a user voice portrait for female user B in conjunction with the basic portrait of the display device. It should be noted that the user identifier in the recommendation request may also be implemented as voiceprint information, that is, the server identifies the individual user in the family by identifying the user's voiceprint information.
The second controller of the server determines recommendation data based on the user voice portrait. For example, the recommendation data the server recommends to this family is biased toward popular movies and television shows that young people prefer, reducing recommendations and advertisements for products such as children's and educational content. As another example, when male user A issues a recommendation request to search for movies, the recommendation data provided by the server may be implemented as war or science fiction movies; when female user B issues a recommendation request to search for movies, the recommendation data provided by the server may be implemented as romance or art-house movies.
It can be found that constructing a user voice portrait for a specific user based on voiceprint information and a user basic portrait can represent different interest points and tag weights for each member in a family.
In some embodiments, the user speech representation further comprises information characterizing the search habits, interests, and mental states that the user possesses during a particular time period for the server to determine video recommendation data that may be provided.
The user voice portrait reflects the search habit information, interest information and psychological state information that a specific user exhibits at the display device end during a specific time period, which provides an important reference for the video recommendation algorithm. The second controller of the server performs statistical analysis on the time points at which the user triggers recommendation requests containing voiceprint information, and can thereby obtain the search and browsing habit information of a specific user during a specific time period.
For example, in a family of three, the male head of household has a habit of watching news broadcasts at 6 am, a child has a habit of watching cartoons after coming home at 5 pm, and the female head of household has a habit of watching television dramas at 8 pm; as another example, the elderly in the home have a habit of watching drama programs at 10 am. This information can further enrich and expand the user voice portrait: combining the voiceprint information with the user basic portrait yields a unique user voice portrait for each user in the family, identified by a unique voiceprint ID.
FIG. 7 is a diagram illustrating the primary information-containing dimension of a user speech representation according to an embodiment of the present application.
The gender, the age range and the voiceprint ID of the user can be acquired by identifying the voiceprint information in the recommendation request; the basic information, the consumption capability information and the family preference information of the equipment can be obtained by processing the historical use trace of the user of the display equipment through tool methods such as machine learning, neural network models and the like to obtain a basic portrait; the time information, search habit information, interest information and psychological state information can be obtained by using information contained in the user basic portrait and the user voice portrait.
In some embodiments, the second controller of the server determines that video recommendation data can be provided according to the recommendation request containing a search instruction and voiceprint information, and the second controller parses the voiceprint information contained in the recommendation request to identify one or more combinations of user gender, age range, and voiceprint ID; and the second controller determines the video recommendation data which can be provided according to the semantic and voiceprint information of the search instruction.
That is, when a recommendation request sent by a display device does not contain a user identifier, for example when the television loses its basic portrait due to a software failure, the server cannot acquire the user basic portrait; the recommendation request then causes the server to determine available recommendation data according to the semantics of the search instruction it contains together with the voiceprint information. This addresses the problem that the accuracy and matching degree of the server's recommendation data decline after the display device's basic portrait information is lost: the server recommends based on the semantics of the recommendation request, and performs video recommendation based on information contained in the voiceprint information such as user gender, age range, interests and psychological state.
For example, suppose basic portrait information is lost due to an accidental failure of the display device in a user's home, and the recommendation request sent by the user is "I want to watch a movie". The second controller of the server identifies the user's voiceprint information; if it judges the user's gender to be female, it can preferentially provide the user with movies related to love and romance labels, while for a male user it can preferentially recommend movies related to labels such as war and fantasy.
Fig. 8 shows a schematic diagram of video recommendation data acquisition according to an embodiment of the present application.
The user interacts with the television via a remote control or microphone device; for example, the user issues a video recommendation request containing voiceprint information to the display device. In this embodiment, a description is given taking an example in which the recommendation request is implemented as a video recommendation request and the recommendation data is implemented as video recommendation data.
A first controller of a display device receives a video recommendation request sent by a user and sends the video recommendation request to a server, and the server judges whether the video recommendation request contains voiceprint information or not;
if the video recommendation request contains voiceprint information, firstly carrying out voiceprint information recognition, and then sending the recognition information to a server video recommendation engine to carry out user voice portrait recognition;
If the video recommendation request does not contain voiceprint information, the video recommendation request is sent directly to the server's video recommendation engine for voice portrait recognition. It should be noted that, to simplify the logical operation, the user's voiceprint information may in this case be considered to be 0 or null. Non-conforming media assets are then filtered out through the user basic portrait information contained in the user voice portrait, such as the device basic information, model, broadcast license, and the system capability information indicating which media assets the device supports playing.
After the second controller of the server calculates the user voice portrait according to the video recommendation request relevant characteristics input by the user, the user voice portrait and the relevant characteristics of the user are issued to the personalized ranking model for personalized ranking, and it should be noted that the user voice portrait is constructed by the user basic portrait and the voiceprint information.
And the personalized sorting model filters the media assets according to the system capability information provided by the display equipment and returns the result obtained after the video recommendation data is reordered to the display equipment.
Finally, the display device presents the final video recommendation data on the user interface.
Fig. 9 is a schematic diagram illustrating a time sequence of video recommendation data acquisition according to an embodiment of the present application.
In this embodiment, a description is given by taking an example in which a recommendation request is implemented as a video recommendation request and recommendation data is implemented as video recommendation data.
In step 901, the user turns on the television terminal, and inputs a video recommendation request including voiceprint information in a voice manner through a microphone, or inputs a video recommendation request not including voiceprint information in a text manner by invoking a search through a keyboard or a remote controller.
In step 902, when receiving the video recommendation request, the television packages the user's key input, voice information, and the terminal's version information into a data packet and sends it to the system App on the server side. It should be noted that, in practice, a recommendation search session usually contains only voice information or key-input text information; even in cases where the user searches with microphone voice input and key-input text at the same time, the server treats them as two separate video recommendation requests.
The system App at the server side judges whether the voice information of the user, namely the voiceprint information, is contained in the received data packet or not and executes different processing strategies; and for the same search instruction, when the voiceprint information in the recommendation request is different, the recommendation data is also different.
In step 903-1, when the voice information in the data packet is not empty, i.e. it contains voiceprint information, the server invokes the voiceprint information recognition service: the system App calls the voiceprint information recognition service via an http request to identify and return the user's voiceprint ID. In some embodiments, when the voice information in the data packet is empty, the system App combines the user's video recommendation request, including semantic user text information, geographical location information, user request time, a voiceprint ID identified as empty, and the media types (i.e. system capabilities) the current terminal version supports and is allowed to play, into an http request and sends it to the server's video recommendation engine. Note that a voiceprint ID identified as empty here means the voiceprint ID is set to null hereinafter.
In step 903-2, the server converts the semantics of the video recommendation request into a user text, encapsulates the voiceprint ID, the user text information, the geographical location information, the user request time, and the system capability of the user into a message body, and sends the message body to the video recommendation engine through an http request.
In step 904, the video recommendation engine receives the request from the server-side App and obtains the user voice portrait. The video recommendation engine extracts the user's voiceprint features according to the transmitted voiceprint ID; when the transmitted voiceprint ID is null, null is used as a special voiceprint feature and can be regarded as an anonymous user feature. The video recommendation engine acquires the features of the media assets the user has historically viewed through the user text information containing the semantics of the search instruction, and then calls the user voice portrait prediction service to predict the user voice portrait by combining the voiceprint, media asset, geographical location and request time features.
In step 905, video recommendation data is obtained according to the user voice portrait related features, video features, context features and cross features. After the user voice portrait prediction is finished, the video recommendation engine extracts relevant features from the user voice portrait and passes the relevant features, video features, context features, cross features and so on to the personalized ranking model via an RPC call. Because the ranking model must compute over combinations of features to obtain the final result, this step is time-consuming; RPC is chosen because its high transmission efficiency reduces performance overhead and saves time.
In step 906, the ranking model predicts the click rate of the user to recall the assets according to the characteristic parameters transmitted by the video recommendation engine. And the sequencing model predicts and sequences, and sends a sequencing result to the video recommendation engine in an RPC calling mode.
In step 907, the video recommendation engine filters the media assets according to system capabilities. The video recommendation engine matches the ranking model's result against the display device's system capabilities, filters out media assets the device cannot play, and sends the remaining video recommendation data to the server-side system App via an http request. After the system App acquires the video recommendation data, it matches and acquires corresponding information such as recommendation positions and posters and transmits them to the display device terminal, which presents the final typeset data to the user according to the returned media asset list and the corresponding posters and other information.
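The capability-filtering step in step 907 can be sketched as follows. The asset and capability representations are assumptions for illustration; a real implementation would match against the terminal's reported supported media types.

```python
# Hypothetical sketch: drop ranked assets whose media type the display
# device's system capabilities do not support, preserving the ranking order.

def filter_by_system_capability(ranked_assets, supported_types):
    supported = set(supported_types)
    return [asset for asset in ranked_assets if asset["media_type"] in supported]
```

Because the list comprehension preserves order, the personalized ranking produced in step 906 is unchanged for the assets that survive filtering.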
In some embodiments, the first controller of the display device controls the user interface to display a video recommendation interface containing recommendation data such as a recommendation place, a poster, and the like, the poster being configured to jump to a play operation interface of a recommended video after receiving a confirmation operation by the user, as shown in fig. 6D.
For example, as shown in the UI in fig. 6D, the video recommendation interface displayed by the first controller according to the received recommendation data includes a list of video recommendation positions on the left side and, on the right side, a first play window displaying a poster. The first page includes a first video recommendation position "baby plan", a second video recommendation position "twelve Chinese zodiac", a third video recommendation position "machine blood", and a fourth video recommendation position "dragon puzzle". The user interface focus is located at the first video recommendation position, and the first controller controls the first play window to play the poster of the first video recommendation position "baby plan". In some embodiments, the first controller controls the video recommendation positions and the first play window to be displayed simultaneously in the video recommendation interface.
In some embodiments, the video recommendation position and the first playing window displaying the poster are displayed in a video recommendation interface in parallel, or the video recommendation position is displayed by being superposed on the first playing window. The positions of the video recommendation position and the first playing window in the playing interface can be configured according to actual conditions, or the first controller displays the video recommendation position above the first playing window, so that an overlapping display effect is obtained.
In some embodiments, when the video recommendation position is superimposed on the first play window for display, the video recommendation position is hidden in response to no instruction being received within a preset time period. If the user performs no operation within the preset time length after receiving the video recommendation data, i.e. the display device receives no user feedback, the first controller hides the video recommendation position to highlight the display of the first play window's poster.
In some embodiments, a recommendation request is entered by microphone speech when the user turns on the television, such as "recommend me some good-looking movies"; the television terminal acquires the voice information of the user in time and transmits the voice information of the user to the server.
After obtaining the recommendation request sent by the terminal, the server finds that it contains the user's voiceprint information and initiates a request to invoke the voiceprint information recognition service, e.g. identifying the user's age range, gender, voiceprint ID, etc., and converting the voice information into user text information with the semantics "recommend me some good-looking movies". In some embodiments, the trigger time information of the recommendation request is sent to the server's video recommendation engine together.
The video recommendation engine of the server predicts the user's portrait according to the voiceprint features, the time information, the user basic portrait and the like, and matches it to the voice portrait of the young male user in the family. For example, one possible user voice portrait for this user is: likes science fiction films; the server will then preferentially recommend a recently popular science fiction movie to this user.
Finally, a video recommendation engine of the server extracts several kinds of characteristic information including portrait characteristics, video characteristics, context characteristics and cross characteristics according to the user voice portrait and sends the characteristic information to a sequencing model for sequencing; and the sequencing model returns the sequenced video recommendation data result to the video recommendation engine and further sends the video recommendation data result to the display equipment.
The following description will be made with respect to an algorithm for the server to obtain video recommendation data based on the speech portrayal.
In some embodiments, the second controller of the server performs personalized video recommendations to each user in the household by utilizing a deep learning model in conjunction with information from a user voice representation including a user base representation, voiceprint information, and search instructions.
First, the second controller of the server will recognize the user speech representation.
When the user uses the television, voice input is not used in all situations: statistics show that the ratio of users performing voice search to users entering search text with the remote controller each day is approximately 2:3, and this excludes direct browsing behaviors such as clicking a recommendation position directly, so the actual share of voice use is even lower. If the user voice portrait were recognized solely from the voiceprint ID identified from voiceprint information, more than half of users could not obtain personalized data. In addition, when using voice input, the user may be searching for content on behalf of someone else, and directly using the recognized voiceprint ID may yield poor results. For example, an adult in the family looks for a movie a child wants to see: the real requirement is to search for an animated movie, but since the voiceprint recognition result is a young male, the server may recommend movies such as Wolf Warrior and Star Wars. Moreover, when the user repeats the input, the types of results obtained are basically the same, so the wrong content is repeatedly recommended to the user and the user experience deteriorates.
The display device and server provided by the present application predict the user voice portrait by extracting the voiceprint information and time information of the user's voice together with the user basic portrait, i.e. the features of the media assets the user has clicked. Although a television is shared by a family, generally only one user is using it at any given moment, regardless of whether that user is finding videos for himself or for a child in the family, which makes predicting the user voice portrait feasible.
In some embodiments, a user session is defined as 5 minutes, i.e. all behavior within a 5-minute time window is assumed to come from the same person. The session length can be defined as required; here the duration is configured according to methods such as statistics on users' search habits and questionnaires. The media assets clicked by the user within the 5 minutes are matched against the voice portraits of all users in the family, and a user voice portrait is considered matched if the matching degree exceeds a preset threshold. If there are multiple click behaviors within the 5 minutes, the user voice portrait with the most matches within those 5 minutes is taken as the positive sample, and the other non-matching user voice portraits as negative samples.
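The session-based matching described above can be sketched as follows. The matching function, threshold, and data shapes are assumptions for illustration; the patent only fixes the 5-minute window and the majority-match rule.

```python
# Hypothetical sketch: within one session window, count how many clicked
# assets match each family member's voice portrait; the portrait with the
# most matches becomes the positive sample, the rest become negative samples.

SESSION_SECONDS = 5 * 60  # the 5-minute session window from the text

def label_session(clicks, portraits, match_fn, threshold=0.5):
    """clicks: list of clicked-asset records in one session.
    portraits: dict mapping member name -> voice portrait.
    match_fn(asset, portrait) -> matching degree in [0, 1].
    Returns (positive_member, list_of_negative_members)."""
    match_counts = {name: 0 for name in portraits}
    for asset in clicks:
        for name, portrait in portraits.items():
            if match_fn(asset, portrait) >= threshold:
                match_counts[name] += 1
    positive = max(match_counts, key=match_counts.get)
    negatives = [name for name in portraits if name != positive]
    return positive, negatives
```

A simple tag-overlap `match_fn` already demonstrates the majority rule: whichever portrait matches the most clicks in the window is attributed the session.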
The voiceprint information characteristics of the user mainly comprise a voiceprint ID, an age range and a user gender of the user, and if the user only carries out keyboard search and does not have related voiceprint information, a special null value is defined as the voiceprint information characteristics of the user; the time information mainly comprises the triggering request time of the current user and the geographic position (longitude and latitude and place); the characteristics of the media assets mainly comprise the classification, title and label of the media assets, whether the media assets are paid media assets, the duration of the media assets, whether the media assets are positive films, whether famous actors exist, whether the media assets are recent popular movies, TV plays and the like.
In some embodiments, the server employs a Logistic Regression (LR) model to identify the user's voice representation of the user.
The core of the logistic regression model is to predict the probability that the category Y is 1 for an event x, with the formula:

P(Y=1 | x) = 1 / (1 + e^(-(w·x + b)))
where w and b are parameters to be learned. At prediction time, the model outputs the probability of the current user's voice portrait, and a preset threshold must be set. For example, the preset threshold may be implemented as 0.8 and needs to be configured according to the model's performance and the actual situation, as schematically shown in fig. 10, to determine the current user's voice portrait.
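A minimal sketch of the logistic regression scoring and thresholding described above, assuming a plain dot-product over a feature vector; in practice w and b would come from model training rather than being supplied by hand.

```python
import math

# Sketch of P(Y=1|x) = 1 / (1 + e^-(w.x + b)) and the preset-threshold
# decision (e.g. 0.8) used to accept a candidate user voice portrait.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def portrait_probability(w, x, b):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def accept_portrait(w, x, b, threshold=0.8):
    return portrait_probability(w, x, b) >= threshold
```

The threshold trades recall for precision: raising it toward 1.0 accepts only high-confidence portrait matches, which is why the text says it must be tuned to the model's actual performance.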
And secondly, the server acquires video recommendation data based on the voice images through a video recommendation model.
And the video recommendation engine of the server takes the browsing and clicking data of the user as effective user behavior data and processes the effective user behavior data to obtain a user log.
Positive samples use data clicked after a user request; the data is cleaned by filtering out records whose watching time is below a threshold, e.g. less than 1 minute (this threshold must also be adjusted according to the actual situation). Negative samples use data the user browsed but whose exposure time is below a certain threshold, e.g. less than 1 minute or 30 seconds; this threshold likewise needs to be chosen according to the actual situation.
Suppose a user clicks a video but watches for only 1 minute: it can be judged that the user may not be interested in the video and that the browsing behavior was merely an accidental click. Such behavior data does not help in obtaining users' video recommendation data and can therefore be removed. In some embodiments, because the television also carries short videos whose total duration may be only 2 minutes, two strategies are actually used to filter positive samples: the ratio of the user's watch time to the total duration being below a threshold, and the watch time being below a threshold. However, as long as the user has clicked a video, the data is not treated as a negative sample, so that the proportions of positive and negative samples stay as balanced as possible.
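The dual cleaning strategy for positive samples can be sketched as below. The 60-second and 50% thresholds are illustrative defaults only; as the text notes, both must be tuned to the actual data.

```python
# Hypothetical sketch: a clicked record counts as a valid positive sample
# if EITHER its absolute watch time passes the threshold OR its
# watch-time/total-duration ratio does. The ratio rule keeps genuinely
# watched short videos (e.g. a 2-minute clip) that the absolute rule
# alone would discard.

def is_valid_positive(watch_seconds, total_seconds,
                      min_watch=60, min_ratio=0.5):
    if total_seconds <= 0:
        return False
    return (watch_seconds >= min_watch
            or watch_seconds / total_seconds >= min_ratio)
```

Accidental clicks (brief watch time on a long video) fail both rules and are dropped from the positive set without being turned into negatives.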
Next, the features required for model training are extracted from the user data constructed from the positive and negative samples: user portrait features, video features, context features, and video-portrait cross features.
The user portrait features mainly comprise attributes acquired from the user voice portrait, including the user's voiceprint ID, gender, age range and occupation; user behavior features, mainly the user's browsing duration and click and search counts within different time windows; and behavioral preferences, mainly user interest labels, user payment preferences, etc.
The video characteristics mainly comprise the characteristics of the video, and mainly comprise the attributes of the video, such as the title of the video, description information, video classification, whether the video is a positive film or not, and whether a hot actor exists or not; video semantic related features, mainly label and keyword information; video behavior characteristics such as the playing amount of the last 7 days, the exposure times of the last 7 days, whether operation manual recommendation is performed, and the like.
The context features refer to combined features of user voice portraits. For example, when the current user request is recognized, the probability of an elderly voice portrait may be 70% and of a child's voice portrait 30%; the user voice portraits can be combined at this ratio to obtain context-dependent features. Such features are particularly helpful for exposing long-tail video data: when a user is actually searching on behalf of someone else, for example searching for opera for the elderly at home, the opera the elderly like to watch is provided first, and because the portrait also recognizes young-person features, opera-related content that young people like can be ranked afterwards, effectively improving the user's exposure rate and click-through rate.
One possible combination condition is represented by a linear combination as follows:
f = α·User1 + β·User2

It can also be expressed as a combination of squares:

f = α·(User1)² + β·(User2)²
it should be noted that the present application does not limit the way of combination, where the weights are parameters of the last model training, and the initial value may be configured as a random number between 0 and 1 by default.
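Under the formulas above, the two combination modes can be sketched as follows. The weights α and β would normally be learned during model training, as the text notes; the element-wise vector treatment is an assumption for illustration.

```python
# Hypothetical sketch of combining two candidate user voice portraits
# (as feature vectors) by probability weight, linearly or by squares.

def combine_linear(user1, user2, alpha, beta):
    # f = alpha * User1 + beta * User2
    return [alpha * u1 + beta * u2 for u1, u2 in zip(user1, user2)]

def combine_squared(user1, user2, alpha, beta):
    # f = alpha * User1^2 + beta * User2^2
    return [alpha * u1 ** 2 + beta * u2 ** 2 for u1, u2 in zip(user1, user2)]
```

With α = 0.7 and β = 0.3 this realizes the "70% elderly / 30% child" mixture from the example above; initializing the weights as random values in [0, 1] matches the default the text describes.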
The video and portrait cross feature is mainly the matching degree and correlation between the video label and the user label preference.
And finally, the server constructs a sequencing model based on the deep learning model to sequence the video recommendation data, and finally feeds the video recommendation data back to the display equipment.
In some embodiments, the server constructs a ranking model based on a CNN (Convolutional Neural Network) + LSTM (Long Short-Term Memory) deep learning model. The ranking model combines the various features and generates the final personalized video recommendation ranking; its structure is schematically shown in fig. 11.
The server learns the video and portrait features using the CNN, and the combined features and context features using the LSTM. Deep connections exist within the context and combined features, and sequence learning with the LSTM better represents their semantics; the portrait and video features are less interconnected, so important information is more easily extracted by the CNN's convolution and pooling operations. It should be noted that the ranking of the video recommendation data in the present application is not limited to these two models and may be optimized and adjusted according to the actual data and modeling situation.
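The routing of feature groups into the two branches can be sketched as follows, with simple pure-Python stand-ins for the CNN and LSTM branches (the functions, fixed weights, and example data are illustrative assumptions; the actual model of fig. 11 is a trained deep network, not these toys):

```python
def cnn_branch(features):
    """Stand-in for the CNN branch over portrait and video features:
    a 1-D convolution (kernel [0.5, 0.5]) followed by global max pooling."""
    conv = [0.5 * a + 0.5 * b for a, b in zip(features, features[1:])]
    return max(conv)  # max pooling keeps the most salient local signal

def lstm_branch(sequence):
    """Stand-in for the LSTM branch over context and combined features:
    a simple recurrent accumulation over the sequence (not a real LSTM cell)."""
    state = 0.0
    for x in sequence:
        state = 0.8 * state + 0.2 * x  # toy recurrence with fixed gates
    return state

def rank_score(video_portrait_feats, context_feats):
    """Fuse both branches into a single ranking score."""
    return cnn_branch(video_portrait_feats) + lstm_branch(context_feats)

# Each candidate carries (video/portrait features, context features).
videos = {
    "opera_a": ([0.9, 0.8, 0.1], [0.7, 0.3]),
    "drama_b": ([0.2, 0.3, 0.4], [0.1, 0.2]),
}
ranked = sorted(videos, key=lambda v: rank_score(*videos[v]), reverse=True)
```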
In some embodiments, the recommendation request sent by the first controller to the server does not include voiceprint information.
For example, when the user turns on the television and inputs "ZZX" through remote-control keys, the television transmits the user's search information to the server with the voiceprint information empty. After receiving the recommendation request, the second controller of the server finds that the user's voiceprint information is empty; it therefore does not call the voiceprint recognition service, but sets the user's voiceprint ID to null and directly sends the voiceprint ID, the request trigger time, the user text "ZZX", the geographic location, and other information to the video recommendation engine.
When the recommendation engine of the server finds that the user voiceprint ID is null, it treats null as a special user voiceprint feature; meanwhile, attributes normally obtained by recognition, such as age range and gender, are unavailable. Assuming the predicted user portrait at this time is a 20-year-old young man, the most likely interpretation of the "ZZX" search is Spider-Man, so Spider-Man-related movies may be preferentially recommended.
It should be noted that when the user uses voice search, the server calls the voiceprint recognition service upon receiving the recommendation request and then calls the video recommendation engine to obtain the video recommendation data, so the response time is slightly longer than in the remote-control input scenario. In the remote-control scenario, after receiving a video recommendation request the server sets the voiceprint ID to null and sends it directly to the video recommendation engine; because the voiceprint ID is null, the engine need not extract voiceprint-related features when predicting the user portrait, so the video recommendation data is obtained relatively quickly.
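The two request-handling paths described above can be sketched as follows (the field names and the service interfaces are assumptions for illustration, not the application's actual API):

```python
def handle_recommendation_request(request, recognize_voiceprint, recommend):
    """Dispatch a recommendation request according to its input source.

    request: dict with 'text', 'timestamp', 'location', and optionally 'voiceprint'.
    recognize_voiceprint: callable audio -> voiceprint ID (the recognition service).
    recommend: callable (voiceprint_id, request) -> recommendation data (the engine).
    """
    audio = request.get("voiceprint")
    if audio is None:
        # Remote-control text input: skip the recognition service entirely
        # and pass a null voiceprint ID straight to the engine (faster path).
        voiceprint_id = None
    else:
        # Voice input: call the recognition service first (slightly slower path).
        voiceprint_id = recognize_voiceprint(audio)
    return recommend(voiceprint_id, request)

# Toy services standing in for the real recognition and recommendation engines.
result = handle_recommendation_request(
    {"text": "ZZX", "timestamp": 0, "location": "CN"},
    recognize_voiceprint=lambda audio: "vp-123",
    recommend=lambda vp, req: ("generic" if vp is None else "personal", req["text"]),
)
```

With no voiceprint in the request, the engine receives a null ID and falls back to non-personalized prediction, mirroring the "ZZX" example above.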
Based on the foregoing description of the display device, the UI, and the server's technical scheme for realizing video recommendation, the present application further provides a video recommendation method on the display device side.
Fig. 12 is a flowchart illustrating a video recommendation method according to an embodiment of the present application.
In step 1201, a recommendation request containing voiceprint information is sent to a server, wherein the recommendation request contains a search instruction, and the search instruction is used for enabling the server to feed back recommendation data.
In step 1202, recommendation data is received, where the recommendation data is different when the recommendation request includes voiceprint information or does not include voiceprint information for the same search instruction.
In step 1203, the recommendation data is presented. The specific operations and implementation methods related to the above steps have been described in detail in the above UI and the implementation of the corresponding display device, and are not described herein again.
Based on the foregoing description of the display device, the UI, and the server's technical scheme for realizing video recommendation, the present application further provides a video recommendation method on the server side.
Fig. 13 is a flowchart illustrating a video recommendation method according to an embodiment of the present application.
In step 1301, a recommendation request containing voiceprint information and a search instruction is received.
In step 1302, recommendation data is determined according to a search instruction included in a recommendation request, where the recommendation data is different for the same search instruction when the recommendation request includes voiceprint information or does not include voiceprint information.
In step 1303, the recommended data is sent to a display device.
The specific operations and implementation methods related to the above steps have been described in detail in the above UI and the implementation of the corresponding display device, and are not described herein again.
In some embodiments, determining the user voice portrait matching the voiceprint information, based on the search instruction and the user base portrait determined from the user identification contained in the recommendation request, so as to provide video recommendation data, comprises: parsing the voiceprint information contained in the video recommendation request to identify one or more of the user's gender, age range, and voiceprint ID; constructing a user voice portrait from the user's voiceprint information, wherein the user voice portrait is specific to a single user and corresponds to a unique voiceprint ID; and determining the recommendation data that can be provided according to the semantics contained in the search instruction and the user voice portrait. The specific operations and implementations of these steps have been described in detail above for the UI and the corresponding display device, and are not repeated here.
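A minimal sketch of the portrait-construction step (the attribute names and the ID-derivation scheme are assumptions; the application only requires that each portrait be specific to a single user and correspond to a unique voiceprint ID):

```python
import hashlib

def build_voice_portrait(voiceprint_info):
    """Construct a user voice portrait from recognized voiceprint attributes.

    voiceprint_info: dict possibly containing 'gender', 'age_range', and raw
    'voiceprint' bytes, from which a stable unique ID is derived here.
    """
    raw = voiceprint_info.get("voiceprint", b"")
    # Derive a deterministic ID so the same voiceprint always maps to the
    # same portrait; a null voiceprint yields a null ID (remote-control path).
    voiceprint_id = hashlib.sha256(raw).hexdigest()[:16] if raw else None
    return {
        "voiceprint_id": voiceprint_id,
        "gender": voiceprint_info.get("gender"),
        "age_range": voiceprint_info.get("age_range"),
    }

portrait = build_voice_portrait(
    {"voiceprint": b"sample", "gender": "male", "age_range": "18-25"}
)
```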
The method has the following advantages: by constructing a recommendation request containing a search instruction, the recommendation data sent by the server can be acquired; further, by including voiceprint information, recommendation data for an individual can be acquired; further, by including a user identification, a user base portrait for the individual can be determined; and by constructing a unique user voice portrait for each individual user, fine-grained depiction and expression of the user's multi-dimensional information can be realized, personalized video data recommendation for the individual user can be achieved, and the user voice portrait can also be located in text search scenarios, improving the pertinence and accuracy of the video recommendation data.
Moreover, those skilled in the art will appreciate that aspects of the present application may be illustrated and described in terms of several patentable species or situations, including any new and useful combination of processes, machines, manufacture, or materials, or any new and useful improvement thereon. Accordingly, various aspects of the present application may be embodied entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.) or in a combination of hardware and software. The above hardware or software may be referred to as "data block", "controller", "engine", "unit", "component", or "system". Furthermore, aspects of the present application may be represented as a computer product, including computer readable program code, embodied in one or more computer readable media.
The computer storage medium may comprise a propagated data signal with the computer program code embodied therewith, for example, on baseband or as part of a carrier wave. The propagated signal may take any of a variety of forms, including electromagnetic, optical, etc., or any suitable combination. A computer storage medium may be any computer-readable medium that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code located on a computer storage medium may be propagated over any suitable medium, including radio, cable, fiber optic cable, RF, or the like, or any combination of the preceding.
Computer program code required for the operation of various portions of the present application may be written in any one or more programming languages, including object-oriented programming languages such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB.NET, and Python; conventional procedural programming languages such as C, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, and ABAP; dynamic programming languages such as Python, Ruby, and Groovy; or other programming languages. The program code may execute entirely on the user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any form of network, such as a local area network (LAN) or a wide area network (WAN), or to an external computer (for example, through the Internet), or in a cloud computing environment, or as a service such as software as a service (SaaS).
Additionally, the order in which elements and sequences of the processes described herein are processed, the use of alphanumeric characters, or the use of other designations, is not intended to limit the order of the processes and methods described herein, unless explicitly claimed. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that in the preceding description of embodiments of the application, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not to be interpreted as requiring more features than are expressly recited in each claim. Indeed, claimed embodiments may have fewer than all the features of a single embodiment disclosed above.
The entire contents of each patent, patent application, patent application publication, and other material cited in this application, such as articles, books, specifications, publications, and documents, are hereby incorporated by reference, except for application history documents that are inconsistent with or conflict with the contents of this application, and except for documents (whether currently or later appended) that limit the broadest scope of the claims of this application. It is noted that if the descriptions, definitions, and/or use of terms in the material accompanying this application are inconsistent with or contrary to those stated in this application, the descriptions, definitions, and/or use of terms in this application shall control.

Claims (10)

1. A display device, comprising:
a display;
a first controller configured to:
sending a recommendation request containing voiceprint information to a server, wherein the recommendation request contains a search instruction, and the search instruction is used for enabling the server to feed back recommendation data;
Receiving recommendation data, wherein for the same search instruction, the recommendation data are different when the recommendation request contains voiceprint information or does not contain voiceprint information;
presenting the recommendation data on the display.
2. The display device of claim 1, wherein the recommendation data is different when the voiceprint information is different in the recommendation request for the same search instruction.
3. The display device of claim 1, wherein the recommendation request further includes a user identification, the user identification for causing the server to determine a user base representation based on the user identification, the user base representation for causing the server to feed back the recommendation data based on the user base representation and/or the voiceprint information in response to the search instruction.
4. The display device of claim 1, wherein sending, by the first controller, the recommendation request containing voiceprint information to the server comprises:
Receiving voice data of a user;
and generating the recommendation request according to the voice data, wherein the recommendation request comprises a search instruction and voiceprint information.
5. The display device of claim 1, wherein sending, by the first controller, the recommendation request containing voiceprint information to the server comprises:
Receiving key input of a remote controller;
and generating a recommendation request according to the key input, wherein the recommendation request comprises a search instruction.
6. A server, comprising:
a second controller configured to:
receiving a recommendation request containing voiceprint information and a search instruction sent by display equipment;
determining recommendation data according to a search instruction contained in a recommendation request, wherein for the same search instruction, the recommendation data are different when the recommendation request contains voiceprint information or does not contain the voiceprint information;
and sending the recommended data to a display device.
7. The server according to claim 6, wherein the recommendation data is different when the voiceprint information is different in the recommendation request for the same search instruction.
8. The server of claim 6, wherein the recommendation request further includes a user identification, the second controller determining a user base representation based on the user identification, the user base representation being used to cause the second controller to determine the recommendation data based on the user base representation and/or the voiceprint information in response to the search instruction.
9. A method for video recommendation, the method comprising:
sending a recommendation request containing voiceprint information to a server, wherein the recommendation request contains a search instruction, and the search instruction is used for enabling the server to feed back recommendation data;
receiving recommendation data, wherein for the same search instruction, the recommendation data are different when the recommendation request contains voiceprint information or does not contain voiceprint information;
and displaying the recommended data.
10. A method for video recommendation, the method comprising:
receiving a recommendation request containing voiceprint information and a search instruction;
determining recommendation data according to a search instruction contained in a recommendation request, wherein for the same search instruction, the recommendation data are different when the recommendation request contains voiceprint information or does not contain the voiceprint information;
and sending the recommended data to a display device.
CN202011002330.6A 2020-09-22 2020-09-22 Display device, server and video recommendation method Pending CN112135170A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011002330.6A CN112135170A (en) 2020-09-22 2020-09-22 Display device, server and video recommendation method


Publications (1)

Publication Number Publication Date
CN112135170A (en) 2020-12-25

Family

ID=73842285



Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112804567A (en) * 2021-01-04 2021-05-14 青岛聚看云科技有限公司 Display device, server and video recommendation method
CN113593559A (en) * 2021-07-29 2021-11-02 海信视像科技股份有限公司 Content display method, display equipment and server
CN113938741A (en) * 2021-12-08 2022-01-14 聚好看科技股份有限公司 Server and media asset playing exception handling method
CN114822005A (en) * 2022-06-28 2022-07-29 深圳市矽昊智能科技有限公司 Remote control intention prediction method, device, equipment and medium based on artificial intelligence

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103634687A (en) * 2013-12-23 2014-03-12 乐视致新电子科技(天津)有限公司 Method and system of providing video retrieval results in intelligent television
CN103686371A (en) * 2013-12-02 2014-03-26 Tcl集团股份有限公司 Smart television service pushing method and system based on age groups
CN104462573A (en) * 2014-12-29 2015-03-25 北京奇艺世纪科技有限公司 Method and device for displaying video retrieval results
CN105049882A (en) * 2015-08-28 2015-11-11 北京奇艺世纪科技有限公司 Method and device for video recommendation
CN105260432A (en) * 2015-09-30 2016-01-20 北京奇虎科技有限公司 Network searching result screening method and electronic device
CN107885889A (en) * 2017-12-13 2018-04-06 聚好看科技股份有限公司 Feedback method, methods of exhibiting and the device of search result
US20180174475A1 (en) * 2016-11-23 2018-06-21 Broadband Education Pte. Ltd. Application for interactive learning in real-time
CN110691264A (en) * 2019-10-09 2020-01-14 山东三木众合信息科技股份有限公司 Television program pushing system and method based on user browsing habits
CN111414512A (en) * 2020-03-02 2020-07-14 北京声智科技有限公司 Resource recommendation method and device based on voice search and electronic equipment
CN111613217A (en) * 2020-04-02 2020-09-01 深圳创维-Rgb电子有限公司 Equipment recommendation method and device, electronic equipment and readable storage medium




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201225