US20140196092A1 - Dialog-type interface apparatus and method for controlling the same
- Publication number: US20140196092A1
- Application number: US14/149,397
- Authority
- US
- United States
- Prior art keywords
- contents
- item
- server
- voice
- user
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/26—Pre-filtering or post-filtering
- G10L19/265—Pre-filtering, e.g. high frequency emphasis prior to encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/422—Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
- H04N21/4227—Providing Remote input by a user located remotely from the client device, e.g. at work
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/482—End-user interface for program selection
- H04N21/4828—End-user interface for program selection for searching program descriptors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/45—Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
Abstract
A dialog type interface apparatus which provides contents corresponding to a voice signal received from a display apparatus is disclosed. The dialog type interface apparatus includes a communicator which receives a voice signal corresponding to a user's voice collected in the display apparatus, and a controller which determines the user's utterance intentions using the voice signal and controls to generate a query for searching contents corresponding to the determined utterance intentions, to divide metadata on the contents, and to transmit the divided metadata to an external server. The controller extracts an utterance element for determining the utterance intentions from the voice signal, and converts the extracted utterance element to correspond to contents dividing criteria of each item to generate the query.
Description
- This application claims priority from Korean Patent Application No. 2013-1838, filed in the Korean Intellectual Property Office on Jan. 7, 2013, the disclosure of which is incorporated herein by reference.
- 1. Field
- Methods and apparatuses consistent with the exemplary embodiments relate to a dialog type interface apparatus and a method for controlling the same, and more particularly, to a dialog type interface apparatus which configures a dialog type system, and a method for controlling the same.
- 2. Description of the Prior Art
- Due to the development of electronic technologies, various kinds of display apparatuses have been developed and provided, and have come to include various functions. Recently, in the case of televisions (TVs), display apparatuses have been able to connect to the Internet and provide Internet services, and users have been able to view numerous digital broadcasting channels through TVs.
- Recently, technologies have been developed which control display apparatuses through a user's voice, so that display apparatuses may be controlled more conveniently and intuitively. For example, TVs have become able to recognize a user's voice and perform functions corresponding to the user's voice, such as volume adjustment and channel change.
- However, related art TVs are limited in that they cannot search contents according to a user's voice and provide the searched contents to the user.
- Therefore, the purpose of the present disclosure is to provide a dialog type interface apparatus which may efficiently search contents when configuring a dialog type system through a server, and a method of controlling the same.
- According to an exemplary embodiment of the present disclosure, a dialog type interface apparatus providing contents corresponding to a voice signal received from a display apparatus may include a communicator configured to receive a voice signal corresponding to a user's voice collected in the display apparatus; and a controller configured to determine the user's utterance intentions using the voice signal, generate a query for searching contents corresponding to the determined utterance intentions, divide metadata on the contents, and transmit the divided metadata to an external server, wherein the controller is configured to extract an utterance element for determining the utterance intentions from the voice signal, and convert the extracted utterance element to correspond to contents dividing criteria of each item to generate the query.
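To make the extraction step above concrete, a minimal sketch follows. The keyword sets and the simple word-matching strategy are assumptions invented for illustration; they are not the patent's actual extraction method:

```python
# Hypothetical utterance-element extraction: scan the recognized text for
# words known to indicate an item of the contents metadata.
# These keyword sets are illustrative assumptions only.
RATING_KEYWORDS = {"children", "kids"}
GENRE_KEYWORDS = {"fantasy", "drama", "comedy"}

def extract_utterance_elements(text):
    """Return (element, item) pairs found in the recognized utterance."""
    elements = []
    for word in text.lower().replace(",", " ").split():
        if word in RATING_KEYWORDS:
            elements.append((word, "viewing_rating"))
        elif word in GENRE_KEYWORDS:
            elements.append((word, "genre"))
    return elements

# Stemming is ignored in this sketch, so the singular "fantasy" is used.
elements = extract_utterance_elements("recommend fantasy for children")
# elements == [("fantasy", "genre"), ("children", "viewing_rating")]
```

A real implementation would rely on the second server's natural language analysis rather than keyword lookup, but the output shape is the same: utterance elements paired with the metadata item they indicate.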
- The dialog type interface apparatus may further include a storage which is configured to store an item table which includes a plurality of items having different contents dividing criteria according to at least one of criteria related to a nation and criteria related to a language.
- The controller may correspond the extracted utterance element to at least one item of a plurality of items of the item table, and convert the extracted utterance element to correspond to the contents dividing criteria of the at least one item to generate a query for searching the contents.
- The controller may correspond the extracted utterance element to at least one item of a plurality of items of the item table, and convert the extracted utterance element to correspond to the contents dividing criteria of the at least one item to generate a query for searching the contents, based on user preference.
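The item table referred to above can be pictured as a small lookup structure in which the same item carries different contents dividing criteria depending on a nation-related or language-related criterion. All names and rating values below are illustrative assumptions, not the patent's actual table:

```python
# Hypothetical item table: the dividing criteria for the "viewing_rating"
# item differ according to the nation-related criterion.
ITEM_TABLE = {
    "US": {"viewing_rating": ["TV-Y", "TV-PG", "TV-14", "TV-MA"]},
    "KR": {"viewing_rating": ["All", "7+", "12+", "15+", "19+"]},
}

def dividing_criteria(nation, item):
    """Look up the contents dividing criteria for an item under a nation."""
    return ITEM_TABLE.get(nation, {}).get(item, [])
```

A query generator would first pick the criteria list matching the user's nation or language, and only then convert an extracted utterance element into one of those criteria.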
- The external server may divide the metadata on the contents per at least one item of a title, cast, producer, contents type, genre, and viewing rating.
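A sketch of how the external server might store metadata divided per the items listed above. The records and field values are invented for illustration and are not the patent's actual schema:

```python
# Hypothetical structured metadata: each record is divided per item, with
# values normalized to the dividing criteria of that item.
structured_metadata = [
    {
        "title": "Example Adventure",
        "cast": ["Actor A"],
        "producer": "Producer X",
        "contents_type": "movie",
        "genre": "fantasy",
        "viewing_rating": "under 7 years",
    },
    {
        "title": "Example Thriller",
        "cast": ["Actor B"],
        "producer": "Producer Y",
        "contents_type": "movie",
        "genre": "drama",
        "viewing_rating": "over 18 years",
    },
]
```

Because every record is normalized to the same dividing criteria, a query expressed in those criteria can be matched against the store by simple field comparison.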
- According to an exemplary embodiment of the present disclosure, a method of controlling a dialog type interface apparatus which provides contents corresponding to a voice signal received from a display apparatus may include receiving a voice signal corresponding to a user's voice collected from the display apparatus; determining the user's utterance intentions based on the received voice signal, and generating a query for searching contents corresponding to the determined utterance intentions; and transmitting the generated query to an external server which divides and stores metadata on the contents per item, wherein the generating extracts an utterance element for determining the utterance intentions in the voice signal, and converts the extracted utterance element to correspond to contents dividing criteria in each item to generate the query.
- The dialog type interface apparatus may store an item table which includes a plurality of items having different contents dividing criteria according to at least one of criteria of a nation and criteria of a language.
- The generating may correspond the extracted utterance element to at least one item of a plurality of items of the item table, and convert the extracted utterance element to correspond to the contents dividing criteria of the at least one item to generate a query for searching the contents.
- The generating may correspond the extracted utterance element to at least one item of a plurality of items of the item table, and convert the extracted utterance element to correspond to the contents dividing criteria of the at least one item to generate a query for searching the contents, based on user preference.
- The external server may divide the metadata on the contents per at least one item of a title, cast, producer, contents type, genre, and viewing rating.
- The above and/or other aspects of the present disclosure will be more apparent by describing certain exemplary embodiments with reference to the accompanying drawings, in which:
- FIG. 1 is a view for illustrating a dialog type system according to an exemplary embodiment;
- FIG. 2 is a block diagram of a display apparatus according to an exemplary embodiment;
- FIG. 3 is a block diagram of a first server illustrated in FIG. 1;
- FIG. 4 is a block diagram of a second server illustrated in FIG. 3;
- FIGS. 5 to 11 are views for explaining various exemplary embodiments;
- FIGS. 12A and 12B are views illustrating an example of a system response output in a display apparatus according to an exemplary embodiment; and
- FIG. 13 is a flowchart for explaining a control method of a dialog type interface apparatus according to an exemplary embodiment.
- Certain exemplary embodiments are described in greater detail below with reference to the accompanying drawings.
-
FIG. 1 is a view for explaining a dialog type system according to an exemplary embodiment. As illustrated in FIG. 1, a dialog type system 1000 includes a display apparatus 100, a first server 200, a second server 300, and an external server 400. Herein, the second server 300 may be embodied as the dialog type interface apparatus of the present disclosure.
- The display apparatus 100 may be controlled by a remote control (not illustrated). More specifically, the display apparatus 100 may perform operations corresponding to a remote control signal received from the remote control (not illustrated). For example, when the display apparatus 100 is embodied as a TV as in FIG. 1, the display apparatus 100 may perform operations such as power on/off, channel conversion, and volume change according to the remote control signal received from the remote control (not illustrated).
- In addition, the display apparatus 100 may perform various operations corresponding to a user's voice.
- More specifically, the display apparatus 100 may perform functions corresponding to the user's voice or output a system response corresponding to the user's voice.
- To this end, the display apparatus 100 transmits the collected user's voice to the first server 200. When the user's voice is received from the display apparatus 100, the first server 200 converts the received user's voice into text information (that is, text) and transmits the text information to the display apparatus 100.
- In addition, the display apparatus 100 transmits a signal corresponding to the user's voice to the second server 300. Herein, the signal corresponding to the user's voice may be the text information received from the first server 200 or an actual voice signal. When the voice signal or the text information is received from the display apparatus 100, the second server 300 generates response information corresponding to the received voice signal or text information and transmits the response information to the display apparatus 100.
- The display apparatus 100 may perform various operations based on the response information received from the second server 300. Herein, the response information may include at least one of a control command for the display apparatus 100 to perform a particular function or to output a system response, and various information regarding the system response output from the display apparatus 100.
- More specifically, the
display apparatus 100 may perform functions corresponding to the user's voice. That is, the display apparatus 100 may execute, among the functions that may be provided, various functions corresponding to the user's voice. For example, when the user's voice "turn to channel ◯ (channel number)" is input, the display apparatus may select and output channel ◯ based on the control command received from the second server 300.
- In addition, the display apparatus 100 may output a system response corresponding to the user's voice. For example, when the user's voice "recommend movies for children" is input, the display apparatus 100 may output a searched result corresponding to the user's voice based on the control command received from the second server 300.
- In this case, the second server 300 may transmit various information for outputting the system response to the display apparatus 100. For example, the second server 300 may transmit information on the contents searched according to the user's voice "recommend movies for children" to the display apparatus 100.
- As such, the display apparatus 100 may perform various operations corresponding to the user's voice based on the response information received from the second server 300.
- In a case where the voice signal is related to a contents search or recommendation, the second server 300 may search contents which correspond to the user's utterance intentions and transmit the searched results to the display apparatus 100.
- To this end, the second server 300 may generate a query for searching contents corresponding to the user's utterance intentions, transmit the query to the external server 400, and receive the searched results from the external server 400.
- Herein, the external server 400 may structure metadata and store the structured metadata. More specifically, the external server 400 may divide the metadata on the contents per item (or field), structure the metadata on the contents according to the contents dividing criteria in each item, and store the structured metadata. Herein, items are characteristics for dividing the metadata, and contents dividing criteria are detailed characteristics for dividing contents in each item. For example, in a case where an item is viewing rating, the contents dividing criteria may be criteria which subdivide the viewing rating, such as All, under 7 years, under 13 years, over 18 years, etc. As another example, in a case where the item is genre, the contents dividing criteria may be criteria for subdividing the genre, such as "drama", "comedy", "fantasy", etc.
- More specifically, the
second server 300 may extract an utterance element for determining the user's utterance intentions from the voice signal, convert the extracted utterance element to correspond to the contents dividing criteria in each item to generate a query for contents search, and transmit the generated query to the external server 400. The external server 400 may search contents according to the query received from the second server 300, and transmit the searched results to the second server 300.
- For example, the second server 300 may extract "children" and "fantasy" as utterance elements from the voice signal "recommend fantasies for children", wherein "children" may indicate the viewing rating in the metadata on the contents, and "fantasy" may indicate the genre in the metadata on the contents. Accordingly, the second server 300 may correspond "children" to the viewing rating of the metadata and correspond "fantasy" to the genre of the metadata.
- In addition, the second server 300 may convert the extracted "children" into "under 7 years" of the contents dividing criteria in the viewing rating, convert "fantasy" into "fantasy" of the contents dividing criteria in the genre, generate a search query using viewing rating: under 7 years and genre: fantasy, and transmit the generated query to the external server 400.
- Accordingly, the external server 400 searches contents which satisfy "under 7 years" in the viewing rating of the structured metadata and "fantasy" in the genre, and transmits the searched results to the second server 300.
- The
second server 300 may transmit the control command and the information on the searched results for outputting the system response corresponding to "recommend fantasies for children" to the display apparatus 100, and the display apparatus 100 may output the system response corresponding to the user's voice based on the response information received from the second server 300. In the aforementioned example, the display apparatus may output a list of the searched contents as a system response to "recommend fantasies for children".
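The flow just described — convert each extracted utterance element into a dividing criterion, build a query, and filter the structured metadata — can be sketched in a few lines. The mapping table and content records are illustrative assumptions, not the servers' actual implementation:

```python
# Hypothetical mapping from utterance elements to (item, dividing criterion).
ELEMENT_TO_CRITERION = {
    "children": ("viewing_rating", "under 7 years"),
    "fantasy": ("genre", "fantasy"),
}

def generate_query(elements):
    """Build a search query mapping each item to a dividing criterion."""
    query = {}
    for element in elements:
        if element in ELEMENT_TO_CRITERION:
            item, criterion = ELEMENT_TO_CRITERION[element]
            query[item] = criterion
    return query

def search(metadata, query):
    """Return records whose metadata satisfies every item in the query."""
    return [record for record in metadata
            if all(record.get(item) == value for item, value in query.items())]

metadata = [
    {"title": "Movie A", "viewing_rating": "under 7 years", "genre": "fantasy"},
    {"title": "Movie B", "viewing_rating": "over 18 years", "genre": "drama"},
]

query = generate_query(["children", "fantasy"])
# query == {"viewing_rating": "under 7 years", "genre": "fantasy"}
results = search(metadata, query)  # only "Movie A" satisfies both criteria
```

Because the query is expressed in the same dividing criteria as the structured store, the external server's search reduces to field-by-field comparison.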
-
FIG. 1 illustrates the display apparatus 100 as being a TV, but this is just exemplary. That is, the display apparatus 100 may not only be a TV, but may also be embodied as various electronic apparatuses, such as a mobile phone (e.g., a smart phone), a desktop PC, a notebook computer, a navigation device, etc. - In addition,
FIG. 1 illustrates that the first server 200 and second server 300 are embodied as separate servers, but this is also just exemplary. That is, the first server 200 and second server 300 may be embodied as one dialog type server. In the case where the first server 200 and the second server 300 are embodied as one dialog type server, the dialog type server may receive the user's voice from the display apparatus, convert it into text information, and generate response information corresponding to the user's utterance intentions. -
FIG. 2 is a block diagram of a display apparatus according to an exemplary embodiment. As shown in FIG. 2, the display apparatus 100 may include an outputter 110, a voice collector 120, a first communicator 130, a second communicator 135, a storage 140, a receiver 150, a signal processor 160, a remote control signal receiver 171, an inputter 173, an interface 175, and a controller 180. -
FIG. 2 illustrates various configurative elements which may be included in the display apparatus 100, but the display apparatus 100 may not necessarily include all of these configurative elements, nor is it limited to only these configurative elements. That is, depending on the product type of the display apparatus 100, some of the configurative elements may be omitted or added, or may be replaced by other configurative elements. - The
outputter 110 outputs at least one of voice and image. More specifically, the outputter 110 may output a system response corresponding to the user's voice collected through the voice collector 120 in the format of at least one of a voice and a user interface (UI) screen.
- Herein, in the UI screen, the system response corresponding to the user's voice may be expressed in a text format, or the results searched according to the user's voice may be expressed in a list format.
- To this end, the outputter may have a displayer 111 and an audio outputter 113.
- More specifically, the
displayer 111 may be embodied as a Liquid Crystal Display, an Organic Light Emitting Display, or a Plasma Display Panel, but is not limited thereto.
- The displayer 111 may provide various display screens which may be provided through the display apparatus 100. Specifically, the displayer 111 may configure the system response corresponding to the user's voice in the UI screen and display the UI screen.
- The audio outputter 113 may be embodied as an output port, such as a jack, or a speaker, and may output the system response corresponding to the user's voice in voice format.
- In addition, the outputter 110 may output various contents. Herein, the contents may include broadcast contents, video on demand (VOD) contents, etc. For example, the displayer 111 may output images configuring the contents, and the audio outputter 113 may output audio configuring the contents.
- The
voice collector 120 collects a user's voice. For example, the voice collector 120 may be embodied as a microphone for collecting the user's voice, and may be integrated into the display apparatus 100 or separated from the display apparatus 100. When the voice collector 120 is separated from the display apparatus 100, the voice collector 120 may be embodied to be held by the user or placed on a table, and may be connected to the display apparatus 100 through a wireless or wired network to transmit the collected user's voice to the display apparatus 100.
- The voice collector 120 may determine whether or not the collected voice is a user's voice, and filter background noise (for example, air conditioner sound, vacuum cleaner sound, music sound, etc.) from the user's voice.
- For example, when an analog type user's voice is input, the voice collector 120 samples the user's voice and converts it into a digital signal. In addition, the voice collector 120 calculates the energy of the converted digital signal, and determines whether or not the energy of the digital signal is equal to or greater than a predetermined value.
- When the energy of the digital signal is equal to or greater than the predetermined value, the voice collector 120 removes noise elements from the digital signal and transmits the result to the first communicator 130. Herein, the noise elements may be abrupt noises which may occur in household environments, such as air conditioner sounds, vacuum cleaner sounds, and music sounds. When the energy of the digital signal is less than the predetermined value, the voice collector 120 does not perform additional processing on the digital signal and waits for another input.
-
- The
first communicator 130 performs communication with the first server (200 in FIG. 1). More specifically, the first communicator 130 may transmit the user's voice to the first server 200, and receive the sound signal corresponding to the user's voice from the first server 200. - The
second communicator 135 performs communication with the second server (300 of FIG. 1). More specifically, the second communicator 135 may transmit the received voice signal or text information to the second server 300, and receive the response information corresponding to the sound signal from the second server 300.
first server 200 andsecond server 300 are embodied as one dialog type server, thedisplay apparatus 100 may transmit the user's voice collected through thevoice collector 120 to the dialog type server, and may receive the response information corresponding to the user's utterance intentions from the dialog type server. - To this end, the
first communicator 130 and second communicator 135 may perform communication with the first server 200 and second server 300 using various communication methods. - For example, the
first communicator 130 and the second communicator 135 may perform communication with the first server 200 and the second server 300 using wired/wireless LAN (Local Area Network), wide area network (WAN), Ethernet, Bluetooth, Zigbee, Universal Serial Bus (USB), IEEE 1394, and Wi-Fi. To this end, the first communicator 130 and the second communicator 135 may have a chip or input port corresponding to each communication method. For example, in the case of performing communication in the wired LAN method, the first communicator 130 and the second communicator 135 may have a wired LAN card (not illustrated) and an input port (not illustrated). - In the aforementioned example, it has been explained that the
display apparatus 100 has additional communicators 130, 135 to perform communication with the first server 200 and the second server 300, but this is merely exemplary. That is, the display apparatus 100 may of course communicate with the first server 200 and second server 300 through one communication module. - In the aforementioned example, it has been explained that the
first communicator 130 and second communicator 135 perform communication with the first server 200 and second server 300, but this is merely exemplary. That is, the first communicator 130 and second communicator 135 may be connected to a web server (not illustrated) and perform web browsing. - The
storage 140 is a storage medium where various programs necessary for operating the display apparatus 100 are stored, and may be embodied as a memory, an HDD (Hard Disk Drive), etc. For example, the storage 140 may have a ROM for storing programs for performing operations of the controller 180, and a RAM for temporarily storing data according to operations of the controller 180. In addition, an Electrically Erasable and Programmable ROM (EEPROM) for storing various reference data may be further included. - The
receiver 150 receives various contents. Herein, the contents may include broadcast contents, VOD contents, etc. - More specifically, the
receiver 150 may receive contents from a broadcasting station which transmits broadcasting programs using the broadcasting network, or from a web server which transmits contents using the Internet. In addition, the receiver 150 may receive contents from various record medium reproduce apparatuses provided in the display apparatus 100 or connected to the display apparatus 100. A record medium reproduce apparatus refers to an apparatus which reproduces contents stored in various types of record medium, such as a CD, DVD, hard disk, Blu-ray disk, memory card, and USB memory. - In the case of an exemplary embodiment of receiving contents from a broadcasting station, the
receiver 150 may be embodied in a form which includes configurations such as a tuner (not illustrated), a demodulator (not illustrated), and an equalizer (not illustrated). In the case of an exemplary embodiment which receives contents from a source such as a web server, the receiver 150 may be embodied as a network interface card (not illustrated). Otherwise, in the case of an exemplary embodiment of receiving contents from the aforementioned various record medium reproduce apparatuses, the receiver 150 may be embodied as an interface (not illustrated) connected to the record medium reproduce apparatus. As such, the receiver 150 may be embodied as various devices according to the exemplary embodiments. - The
signal processor 160 performs signal processing on contents so that contents received through the receiver 150 can be output through the outputter 110. - More specifically, the
signal processor 160 may perform operations such as decoding, scaling, and frame rate conversion on the images included in the contents, and convert the result into a format which may be output in the displayer 111. In addition, the signal processor 160 may perform signal processing such as decoding on the audio signal included in the contents, and convert the result into a format which may be output in the audio outputter 113. - The remote
control signal receiver 171 receives a remote control signal from an external remote control. The controller 180 may execute various operations based on the remote control signal received in the remote control signal receiver 171. For example, the controller 180 may execute operations such as power on/off, channel change, and volume adjustment according to the control signal received through the remote control signal receiver 171. - The
inputter 173 receives various user commands. The controller 180 may execute operations corresponding to the user command input in the inputter 173. For example, the controller 180 may execute power on/off, channel change, volume adjustment, etc. according to the user command input in the inputter 173. - To this end, the
inputter 173 may be embodied as an input panel. The input panel may be embodied in a key pad or touch screen method, having various function keys, number keys, special keys, letter keys, etc. - The
interface 175 performs communication with an external apparatus (not illustrated). Herein, the external apparatus (not illustrated) may be various electronic apparatuses. For example, in the case where the display apparatus 100 is embodied as a TV, the external apparatus (not illustrated) may be embodied as various formats of electronic apparatuses which may be connected to the display apparatus 100 to perform various functions, such as a set top box, a sound device, and a game device. - For example, the
interface 175 may perform communication with the external apparatus (not illustrated) according to various wired communication methods such as HDMI and USB, and wireless communication methods such as Bluetooth and Zigbee. To this end, the interface 175 may have a chip or input port corresponding to each of the various communication methods. For example, in the case of performing communication with the external apparatus (not illustrated) according to the HDMI communication method, the interface 175 may have an HDMI port. - The
controller 180 controls the overall operations of the display apparatus 100. That is, the controller 180 may control operations of the outputter 110, voice collector 120, first communicator 130, second communicator 135, storage 140, receiver 150, signal processor 160, remote control signal receiver 171, inputter 173, and interface 175. The controller 180 may include a central processing unit (CPU), and Read Only Memory (ROM) and Random Access Memory (RAM) for storing modules and data for controlling the display apparatus 100. - More specifically, the controller 180 collects a user's voice through the voice collector 120, and may control the first communicator 130 to transmit the collected user's voice to the first server 200. In addition, when the voice signal corresponding to the user's voice is received, the controller 180 may control the second communicator 135 to transmit the received voice signal to the second server 300. - When the response information corresponding to the user's voice is received from the
second server 300, the controller 180 may perform various operations based on the received response information. - More specifically, the controller 180 may perform functions corresponding to the user's voice or output the system response corresponding to the user's voice based on the received response information. - To this end, the response information may include a control command for controlling the functions of the display apparatus 100. Herein, the control command may include a command for executing at least one function corresponding to the user's voice among the functions executable in the display apparatus 100. Accordingly, the controller 180 may control various configurative elements of the display apparatus 100 so that the functions corresponding to the user's voice may be executed, based on the control command received from the second server 300. - For example, when the
display apparatus 100 embodied as a TV collects the user's voice "turn to channel "◯" (channel number)", the second server 300 determines that the utterance intention included in the user's voice "turn to channel "◯" (channel number)" is a request for a channel change to channel "◯" (channel number), and may transmit the control command for changing the channel to channel "◯" (channel number) to the display apparatus 100 according to the determined utterance intention. - Accordingly, the controller 180 may control the receiver 150 to select channel "◯" (channel number) based on the received control command, and control so that the broadcast contents received through the receiver 150 may be output through the outputter 110. - However, this is merely exemplary, and thus the controller 180 may control each configurative element of the display apparatus 100 so that various operations such as power on/off and volume adjustment etc. may be performed according to the collected user's voice. - In addition, the response information may include various information for outputting the system response corresponding to the user's voice.
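The control-command handling described above can be sketched roughly as follows. The command fields and the DisplayApparatus class are illustrative assumptions for this sketch, not names from the present disclosure.

```python
# Hypothetical sketch: dispatching a control command received from the
# second server to a display-apparatus function. All names are illustrative.
class DisplayApparatus:
    def __init__(self):
        self.power = False
        self.channel = 1
        self.volume = 10

    def execute(self, command: dict):
        """Execute the function named in the control command, if supported."""
        action = command.get("action")
        if action == "change_channel":
            self.channel = command["channel"]
        elif action == "power":
            self.power = command["on"]
        elif action == "volume":
            self.volume = command["level"]
        else:
            # Corresponds to the "this function is not provided" case below.
            raise ValueError("this function is not provided")

tv = DisplayApparatus()
tv.execute({"action": "change_channel", "channel": 7})
print(tv.channel)  # 7
```

An unrecognized command raises an error, which mirrors the case, described later, where the apparatus cannot execute the received control command.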
- More specifically, when the user's voice for content search is collected in the
display apparatus 100, the second server 300 may determine the user's utterance intentions and search the contents corresponding thereto. In addition, the second server 300 may transmit the control command for outputting the information on the searched contents as the system response to the display apparatus 100. In this case, the second server 300 may transmit the information (for example, at least one of title, thumbnail, broadcasting time, cast and producer etc.) on the searched contents to the display apparatus 100 together with the control command. - Accordingly, the controller 180 may control so that the system response corresponding to the user's voice is output based on the response information received from the second server 300. - For example, hereinbelow is an explanation of a case where the
display apparatus 100 embodied as a TV collects the user's voice “recommend fantasies for children”. - In this case, the
second server 300 determines that the utterance intentions included in the user's voice "recommend fantasies for children" are a request to search for fantasy contents for children, and searches contents corresponding to such utterance intentions. - In addition, the second server 300 may transmit the control command for displaying a list of the searched contents to the display apparatus 100. In this case, the controller 180 may search contents corresponding to the control command from web browsing or an Electronic Program Guide (EPG), and control the displayer 111 to output the UI screen forming the list of the searched contents. - The second server 300 may transmit the control command for displaying the list of the searched contents and the information on the searched contents to the display apparatus 100. In this case, the controller 180 may control the displayer 111 to use the information on the received contents to output the UI screen configuring the list of the contents. - In the aforementioned examples, the
controller 180 may control to output the UI screen which includes a search list including at least one of the title, thumbnail, broadcasting time, and producers etc. on the contents corresponding to the utterance intentions. - The response information may include system response information for outputting the system response.
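As a rough illustration, composing such a search list from the received contents information might look like the following. The field names are assumptions for illustration only.

```python
# Illustrative sketch: building the search-list lines for the UI screen from
# the information on the searched contents. Field names are assumed.
def build_search_list(search_results):
    """Return one display line per searched content, keeping only the
    fields that are present."""
    lines = []
    for content in search_results:
        parts = [content.get(k, "") for k in ("title", "broadcasting_time", "producer")]
        lines.append(" | ".join(p for p in parts if p))
    return lines

results = [
    {"title": "ZZZ", "broadcasting_time": "20:00", "producer": "Studio A"},
    {"title": "YYY", "producer": "Studio B"},
]
print(build_search_list(results))  # ['ZZZ | 20:00 | Studio A', 'YYY | Studio B']
```

Missing fields are simply skipped, since the patent only requires at least one of the listed pieces of information to be shown.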
- Herein, the system response information may be a text-format expression of the system response to be output from the display apparatus 100 regarding the user's voice. Accordingly, the controller 180 may control the outputter 110 to output the system response corresponding to the user's voice in a format of at least one of voice and UI screen based on the system response information. - For example, the controller 180 may use a Text to Speech (TTS) engine to convert the text format response message information into voice, and output the result through the audio outputter 113. Herein, the TTS engine is a module for converting text into voice, and it is possible to convert text into voice using various TTS algorithms of the related art. In addition, the controller 180 may configure the UI screen so as to include text configuring the system response information and output the UI screen through the displayer 111. - For example, when the
display apparatus 100 embodied as a TV collects the user's voice "recommend fantasies for children", the second server 300 may express "◯◯◯ (fantasy movie title) is a fantasy for children" in a text format and transmit it to the display apparatus 100. In this case, the controller 180 may control so that "◯◯◯ (fantasy movie title) is a fantasy for children" is expressed as a voice and output through the audio outputter 113, or control so that a UI screen is configured to include the text "◯◯◯ (fantasy movie title) is a fantasy for children" and the text is output through the displayer 111. - In addition, the response information may include the system response information related to the functions executed according to the control command. In this case, the controller 180 may control to perform the functions according to the control command, and to output the system response related to the executed functions based on the system response information in a format of at least one of voice and UI screen. - For example, in the case where the display apparatus 100 embodied as a TV collects the user's voice "turn to channel "◯" (channel number)", the second server 300 may transmit the control command for changing the channel of the display apparatus 100 to channel "◯" (channel number), together with the text "channel has been changed to channel "◯" (channel number)", to the display apparatus 100. - In such a case, the controller 180 may control the receiver 150 to select channel "◯" (channel number) based on the control command, to output the contents provided through channel "◯" (channel number). In addition, the controller 180 may control so that "channel has been changed to channel "◯" (channel number)" can be converted into voice and be output through the audio outputter 113, or so that a UI screen is configured to include the text "channel has been changed to channel "◯" (channel number)" and is output through the displayer 111. - As aforementioned, the
controller 180 may execute the functions corresponding to the user's voice or output the system response corresponding to the user's voice based on the response information of various formats received from the second server 300. - In the case of outputting the system response corresponding to the user's voice without execution of additional functions in the display apparatus 100, a case where the user's voice intends to execute a function that may not be executed in the display apparatus 100 may be further included. - For example, hereinbelow is an explanation of a case where the display apparatus 100 is embodied as a TV in which a video call function is not provided. In this case, when the user's voice "call XXX" is collected in the display apparatus 100, the second server 300 may transmit the control command for performing a video call to the display apparatus 100. However, since the function corresponding to the control command is not provided in the display apparatus 100, the controller 180 is incapable of recognizing the control command received from the second server 300. In this case, the controller 180 may output the system response "this function is not provided" in a format of at least one of voice and UI screen through the outputter 110. - In the aforementioned exemplary embodiment, it was explained that the system response information transmitted from the
second server 300 expresses the system response in a text format, but this is merely exemplary. That is, the system response information may be the voice data itself which configures the system response output in the display apparatus 100, a part of the voice data configuring the corresponding system response, or a control signal format for outputting the corresponding system response using the voice or text prestored in the display apparatus 100. - Accordingly, the controller 180 may output the system response considering the format of the system response information. - More specifically, when the voice data itself for configuring the system response is received, the controller 180 may process the corresponding data into a format outputtable in the audio outputter 113 and output the processed data in a voice format. - When the control signal for outputting the system response is received, the controller 180 may search data matching the control signal from among the prestored data, process the searched voice or text data into an outputtable format, and output the processed voice or text data through the outputter 110. To this end, the display apparatus 100 may store voice or text data for providing the system response. For example, the display apparatus 100 may store data of a complete sentence format such as "channel change has been completed", or may store partial data forming the sentence such as "changed to channel . . . number". In this case, the channel information which completes the sentence may be received from the second server 300. -
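A minimal sketch of this prestored-data lookup follows, assuming hypothetical template names and a build_response helper; neither name is from the present disclosure.

```python
# Hypothetical sketch: completing a prestored partial sentence with the
# fragment received from the server. Template names are illustrative.
RESPONSE_TEMPLATES = {
    "CHANNEL_CHANGED": "Changed to channel {channel}",
    "VOLUME_SET": "Volume set to {level}",
}

def build_response(control_signal: str, **server_fields) -> str:
    """Look up the template matching the control signal and complete it
    with the field (e.g. channel number) received from the server."""
    template = RESPONSE_TEMPLATES.get(control_signal)
    if template is None:
        return "This function is not provided"
    return template.format(**server_fields)

print(build_response("CHANNEL_CHANGED", channel=7))  # Changed to channel 7
print(build_response("VIDEO_CALL"))                  # This function is not provided
```

The fallback string models the earlier example in which an unsupported command yields the "this function is not provided" response.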
FIG. 3 is a block diagram of the first server 200 illustrated in FIG. 1 . As illustrated in FIG. 3 , the first server 200 includes a communicator 210 and a controller 220. - The communicator 210 performs communication with the display apparatus 100. More specifically, the communicator 210 may receive a user's voice from the display apparatus 100, and transmit the voice signal corresponding to the user's voice to the display apparatus 100. To this end, the communicator 210 may include various communication modules. - The controller 220 controls the overall operations of the first server 200. Especially, when the user's voice is received from the display apparatus 100, the controller 220 generates the voice signal corresponding to the user's voice, and controls the communicator 210 to transmit the generated voice signal to the display apparatus 100. Herein, the voice signal may be text information converted from the user's voice. - More specifically, the controller 220 may use a Speech to Text (STT) engine to generate the voice signal corresponding to the user's voice. Herein, the STT engine is a module for converting a voice signal into text, and may convert the voice signal into text using various STT algorithms of the related art. - For example, the controller 220 detects the start and end of the voice that the user uttered in the received user's voice to determine the voice section. More specifically, the controller 220 may calculate the energy of the received voice signal, classify the energy level of the voice signal according to the calculated energy, and detect the voice section of the voice signal through dynamic programming. In addition, the controller 220 may detect a phoneme, which is the smallest unit of voice, based on an acoustic model in the detected voice section to generate phoneme data, and apply a Hidden Markov Model (HMM) probability model to the generated phoneme data to convert the user's voice into text. -
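A much-simplified sketch of the energy-based voice-section detection step is given below. The frame size and threshold are arbitrary assumptions; the actual engine would use dynamic programming and an acoustic model as described above.

```python
# Illustrative sketch: energy-based voice-section (endpoint) detection.
# Frame size and threshold are arbitrary assumptions for this sketch.
def detect_voice_section(samples, frame_size=4, threshold=1.0):
    """Return (start_frame, end_frame) spanning the frames whose average
    energy exceeds the threshold, or None if every frame is silent."""
    frames = [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]
    energies = [sum(s * s for s in f) / len(f) for f in frames]
    voiced = [i for i, e in enumerate(energies) if e > threshold]
    if not voiced:
        return None
    return voiced[0], voiced[-1]

signal = [0.0, 0.1, 0.0, 0.1,  2.0, -2.5, 3.0, -1.5,  0.1, 0.0, 0.1, 0.0]
print(detect_voice_section(signal))  # (1, 1)
```

Only the middle frame carries enough energy to count as speech, so the detected section covers frame 1 alone.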
FIG. 4 is a block diagram of the second server 300 illustrated in FIG. 1 . As illustrated in FIG. 4 , the second server 300 includes a communicator 310, storage 320, and a controller 330. Herein, the second server 300 may be a dialog-type interface apparatus in the present disclosure. - The communicator 310 performs communication with the display apparatus 100. More specifically, the communicator 310 may receive the voice signal corresponding to the user's voice collected in the display apparatus 100 from the display apparatus 100. Herein, the voice signal may be text information converted from the user's voice. - In addition, the communicator 310 may transmit the response information corresponding to the received voice signal to the display apparatus 100. - In addition, the communicator 310 performs communication with the external server ( 400 of FIG. 1 ). More specifically, the communicator 310 may transmit the query for contents search to the external server 400, and receive the search results from the external server 400. To this end, the communicator 310 may include various communication modules. - To this end, the communicator 310 may perform communication with the display apparatus 100 and the external server 400 through communication methods such as wired/wireless LAN (Local Area Network), Ethernet, Bluetooth, Zigbee, USB (Universal Serial Bus), IEEE 1394, and WiFi. To this end, the communicator 310 may have a chip or input port etc. corresponding to each communication method. For example, in the case of performing communication in the wired LAN method, the communicator 310 may have a wired LAN card (not illustrated) and input port (not illustrated). - However, this is exemplary, and thus the communicator 310 may have additional communication modules for performing communication with each of the display apparatus 100 and the external server 400. - The
storage 320 may store various information used to determine the user's utterance intentions from the voice signal received from the display apparatus 100. - More specifically, the
storage 320 may use the voice signal received from thedisplay apparatus 100, to store various information for analyzing the purpose domain (domain), purpose function (user action), and major characteristics (slot) etc. in the user's voice. - Herein, the purpose domain may be divided according to the themes belonging to the user's voice uttered such as “broadcast”, and “device control” etc. In addition, the purpose function represents the user's utterance intentions such as “information output” and “device control” etc., and the major characteristics represent information which may specify the user's utterance intentions intended in the purpose domain.
- More specifically, the
storage 320 may store keywords for analyzing the purpose function in the purpose domain and extracting the major characteristics. - For example, the storage 320 may store information that, in the broadcast purpose domain, keywords such as "recommend", "search", "find", and "show" etc. are requests for information search, and that various keywords related to contents such as producer of contents, genre of contents, and viewing rating of contents etc. are major characteristics. As a specific example, the information that the term "children" is a keyword related to viewing rating and belongs to major characteristics may be stored. As another example, the storage 320 may store information that, in the device control purpose domain, keywords such as "turn on", "turn up", "turn down", "turn off", and "execute" etc. are requests regarding the purpose function device control, and that various keywords related to device control such as channel title, channel number, volume, and power etc. belong to major characteristics. - In addition, the storage 320 may have a corpus database. Herein, the corpus database may be embodied in a format of storing example sentences and answers thereto. - That is, the storage 320 may store a plurality of example sentences and answers thereto for each purpose domain. In this case, the storage 320 may tag information for interpreting each example sentence and the answer to each example sentence, and store the tagged information. - For example, the storage 320 may store the example sentence "recommend fantasies for children" in the broadcast purpose domain. In this case, the storage 320 may tag information for interpreting the example sentence "recommend fantasies for children" to the corresponding example sentence and store the information. - More specifically, the storage 320 may tag, to the corresponding example sentence, the information that in the example sentence "recommend fantasies for children", "children" is a major characteristic representing the viewing rating of the contents, "fantasy" is a major characteristic representing the genre of the contents, and "recommend" represents an information search request for the contents, and store the tagged information. - As another example, in the device control purpose domain, the storage 320 may store the example sentence "turn to channel "◯"". In this case, the storage 320 may tag the information for interpreting the example sentence "turn to channel "◯"" to the corresponding example sentence and store the information. - The
controller 330 controls the overall operations of the second server 300. When a voice signal is received from the display apparatus 100, the controller 330 uses the received voice signal to determine the user's utterance intentions. - More specifically, the controller 330 may perform natural language processing on the voice signal, and may determine the user's utterance intentions by analyzing the purpose domain, purpose function, and major characteristics in the voice signal using various information stored in the storage 320. - In addition, the controller 330 may convert the voice signal received from the display apparatus 100 into a structured meaning frame format based on the determined utterance intentions. In the structured meaning frame, major characteristics may have a feature concept independent of a particular language, that is, a format of an execution language which may be interpreted in the external server 400. - For example, hereinbelow is explanation on a case where the voice signal "recommend fantasies for children" is received from the display apparatus 100. Herein, the storage 320 may store the information that the term "children" is a major characteristic related to a viewing rating, and that "fantasy" belongs to major characteristics related to genre. - Accordingly, the controller 330 may use the information stored in the storage 320 to determine that "recommend fantasies for children" belongs to the broadcast purpose domain, that "children" is a major characteristic that indicates the viewing rating of the contents, that "fantasy" is a major characteristic that indicates the genre of the contents, and that "recommend" is an information search request in the purpose function. - Accordingly, the controller 330 may determine that the voice signal "recommend fantasies for children" is requesting a contents search for "children" and "fantasies", and generate a structured meaning frame as in Table 1 below. -
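Such a frame construction might be sketched as follows. The keyword tables and function names are illustrative assumptions, while the $kids$/$fantasy$ feature concepts follow the notation of Table 1.

```python
# Illustrative sketch: building a structured meaning frame from an utterance.
# The keyword-to-concept tables are assumptions for this sketch.
FUNCTION_KEYWORDS = {"recommend": "information search", "search": "information search"}
FEATURE_CONCEPTS = {"children": "$kids$", "fantasies": "$fantasy$"}

def to_meaning_frame(utterance: str) -> dict:
    """Map keywords in the utterance to a purpose function and to
    language-independent feature concepts."""
    words = utterance.lower().split()
    function = next((FUNCTION_KEYWORDS[w] for w in words if w in FUNCTION_KEYWORDS), None)
    concepts = [FEATURE_CONCEPTS[w] for w in words if w in FEATURE_CONCEPTS]
    return {"voice signal": utterance,
            "purpose function": function,
            "feature concept": concepts}

frame = to_meaning_frame("Recommend fantasies for children")
print(frame["purpose function"])  # information search
print(frame["feature concept"])   # ['$fantasy$', '$kids$']
```

The resulting dictionary mirrors the three columns of Table 1.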
TABLE 1
Voice signal: Recommend fantasies for children (kids, fantasy)
Purpose function: Information search
Feature concept: Request information: $kids$, $fantasy$
- The
controller 330 may determine the user's utterance intentions using the voice signal, and control to generate a query for searching the contents corresponding to the determined utterance intentions and to transmit the generated query to the external server 400, which divides the metadata on the contents per item and stores the divided metadata. That is, in the case where the user's utterance intention is a contents search, the controller 330 may generate a query for searching the contents corresponding to the utterance intentions, transmit the generated query to the external server 400, and receive search results from the external server 400. - Herein, the controller 330 may convert the extracted utterance elements so as to be mapped to the contents dividing criteria dividing each item of the structured metadata stored in the external server 400, and transmit the converted utterance elements to the external server 400, instead of transmitting the extracted utterance elements themselves for searching the contents. In this case, the controller 330 may convert the extracted utterance element through a regularized phrase, that is, an application programming interface (API), to be mapped to the contents dividing criteria. - More specifically, the
controller 330 may correspond the extracted utterance element to at least one item of the plurality of items of the item table, and convert the extracted utterance element to correspond to the contents dividing criteria of the at least one item to generate a query for searching contents. - Herein, the utterance element is a term classifying the characteristics of the contents that the user intends to search, and major characteristics may be utterance elements. Therefore, hereinbelow, utterance elements may be interpreted as the same meaning as the major characteristics.
- For more specific explanation, reference is made to
FIGS. 5 to 11 . - First of all,
FIG. 5 is a view illustrating an example where metadata is structured and stored in the external server according to an exemplary embodiment. - The
external server 400 may divide the metadata regarding the contents per item. Herein, the item may be various contents information included in the metadata. That is, the external server 400 may divide the metadata on the contents by at least one item of the title, cast, producer, contents type, genre, and viewing rating, and store the metadata. - In addition, the external server 400 may structure the metadata on the contents according to the contents dividing criteria within each item and store the structured metadata. - For example, as in FIG. 5 , the external server 400 may divide the metadata into items of title, cast, producer, contents type, genre, and viewing rating, and structure the metadata according to the contents dividing criteria within each item and store the structured metadata. - That is, the external server 400 may structure the metadata on the contents by dividing it into the title item 510 where the metadata on the contents has been divided based on the title as the contents dividing criteria, the cast item 520 where the metadata on the contents has been divided based on the cast as the contents dividing criteria, the producer item 530 where the metadata on the contents has been divided based on the producer as the contents dividing criteria, the contents type item 540 where the metadata on the contents has been divided based on the contents type as the dividing criteria, the genre item 550 where the metadata on the contents has been divided based on the genre as the dividing criteria, and the viewing rating item 560 where the metadata on the contents has been divided based on the viewing rating as the contents dividing criteria. - However, this is merely exemplary, and thus the external server 400 may use other information which configures the metadata, such as preference and broadcasting time etc., to structure and store the metadata on the contents based on the items and contents dividing criteria. - Hereinbelow is explanation on a method of generating a query for contents search in the
controller 330 in a case where the structured data as in FIG. 5 is stored in the external server 400. - More specifically, the controller 330 extracts major characteristics from the structured meaning frame generated based on the voice signal received from the display apparatus 100, and converts the extracted major characteristics to be mapped to the contents dividing criteria in the data structured in the external server 400. In this case, the controller 330 uses the feature concept to extract the major characteristics. - However, this is merely exemplary, and the controller 330 may extract the major characteristics from the received voice signal using the information stored in the storage 320, without generating an additional structured meaning frame. - For example, in the case where the voice signal "recommend fantasies for children" is received, the
controller 330 may extract "children" and "fantasies" from the structured meaning frame generated as in Table 1. - In this case, since "children" is a major characteristic related to the viewing rating of the contents, the controller 330 may correspond "children" to the viewing rating of the contents, and convert "children" to be mapped to the contents dividing criteria corresponding thereto, considering the contents dividing criteria of the viewing rating in the structured data stored in the external server 400. That is, since in the structured data stored in the external server 400 the viewing rating is divided according to the contents dividing criteria "All", "under 7 years", and "under 13 years", the controller 330 may map "children" to "under 7 years" of these contents dividing criteria. - Since "fantasies" is a major characteristic related to the genre of the contents, the controller 330 may correspond "fantasies" to the genre of the contents, and convert "fantasies" to be mapped to the contents dividing criteria corresponding thereto, considering the contents dividing criteria of the genre in the structured data stored in the external server 400. That is, since the genre is divided according to the contents dividing criteria such as "comedy", "drama", and "fantasy" in the structured data stored in the external server 400, the controller 330 may map "fantasies" to "fantasy" of these contents dividing criteria. - To this end, the
storage 320 may store the item table. That is, the storage 320 may store an item table which includes the items of the metadata structured and stored in the external server 400 to which the major characteristics are mapped, and information on the contents dividing criteria mapped to the items. For example, in the case where the external server 400 structures and stores the metadata as in FIG. 5 , the storage 320 may store an item table containing the items to which the major characteristics are mapped as in FIG. 5 , and information on the contents dividing criteria mapped to the items. - For example, the storage 320 may store the item table where the major characteristic "children", related to the viewing rating of the contents, is mapped to "under 7 years" in the contents viewing rating item of the metadata structured as in FIG. 5 , and the major characteristic "fantasies", related to the genre of the contents, is mapped to "fantasy" in the contents genre item of the metadata structured as in FIG. 5 . - Accordingly, the controller 330 may convert the major characteristics extracted from the received voice signal to be mapped to the contents dividing criteria, with reference to the item table. - The
controller 330 may use the utterance elements converted to correspond to the contents dividing criteria to generate a query for contents search, transmit the generated query to the external server 400, and control the communicator 310 to receive the search results from the external server 400. - In addition, the controller 330 may use the search results received from the external server 400 to generate a control command for outputting the system response corresponding to the user's voice, and transmit the generated control command to the display apparatus 100. In this case, the controller 330 may transmit the information on the search results together with the control command to the display apparatus 100. - For example, the
controller 330 may generate a query for contents search using viewing rating: under 7 years and genre: fantasy, which are the utterance elements converted according to the contents dividing criteria, and transmit the generated query to the external server 400. In this case, the external server 400 may search the contents satisfying under 7 years in the viewing rating item 560 of the structured metadata and satisfying fantasy in the genre item 550, and transmit the search results to the second server 300. - Herein, the external server 400 may transmit the information on the searched contents (for example, at least one of title, thumbnail, broadcasting time, cast, and producers) to the second server 300. For example, the external server 400 may transmit Title ZZZ, which is title information on the contents which satisfy under 7 years in the viewing rating item 560 and fantasy in the genre item 550, to the second server 300. - The controller 330 may use the information on the received search results to generate the control command for outputting the system response to "recommend fantasies for children" in the display apparatus 100, and transmit the control command to the display apparatus 100. That is, the controller 330 may transmit the control command (for example, a system command in a script format) for searching and outputting the contents of which the title is ZZZ to the display apparatus 100. - Accordingly, the display apparatus 100 may output the system response corresponding to the voice of the user based on the control command received from the second server 300. For example, the display apparatus 100 may search the contents of which the title is ZZZ from a web search or EPG information based on the control command received from the second server 300, and may output a UI screen which includes at least one of the title, cast, producer, contents type, genre, and viewing rating of the searched contents. - The controller 330 may transmit the information on the search results received from the external server 400 to the display apparatus 100. That is, the controller 330 may transmit information on at least one of the title, cast, producers, contents type, genre, and viewing rating etc. of the searched contents ZZZ, together with the control command, to the display apparatus 100. - The same utterance element may correspond to different contents dividing criteria according to the country and language that the
external server 400 provides metadata service for. Herein, the utterance element which may be interpreted differently from each other according to the characteristics of theexternal server 400 may include at least one of a genre, viewing rating and preference etc. - For example, in the case of the major characteristics related to the viewing rating “adult”, the starting age of an adult may differ depending on the countries, and thus the
external server 400 may divide “adult” based on different contents dividing criteria for the nation which provides the metadata server. - In addition, in the case of major characteristics related to the genre “fantasy”, the languages defining fantasy may be different from each other, and thus the
external server 400 may divide the “fantasy” based on different contents dividing criteria for the language in which it provides the metadata service. - Accordingly, the
storage 320 may store an item table which includes a plurality of items having different contents dividing criteria according to at least one of the nation and language used. In addition, the controller 330 may use the item table to map the major characteristics to the different contents dividing criteria. - For example, hereinbelow is an explanation of a case where identical utterance elements are divided based on different contents dividing criteria, as in
FIGS. 6 and 7 . - That is, as illustrated in
FIG. 6 , the first external server 400-1 may divide the genre item 650 based on the contents dividing criteria of “comedy”, “drama” and “fantasy”, divide the viewing rating item 660 based on the contents dividing criteria of “under 7 years”, “under 13 years”, and “over 18 years”, and structure and store the metadata. - However, as in
FIG. 7 , the second external server 400-2 may divide the genre item 750 based on the contents dividing criteria of “comedy”, “drama”, and “science fiction”, and may divide the viewing rating item 760 based on the contents dividing criteria of “under 7 years”, “under 13 years”, and “over 19 years” to structure and store the metadata. - In such a case, the
controller 330 may enable the same utterance element to be mapped to different contents dividing criteria according to the characteristics of the external server to which the query for contents search is transmitted. - For example, hereinbelow is an explanation of a case where a voice signal “recommend fantasies for adults” is received from the
display apparatus 100. - Herein, the
storage 320 may store the information that the term “adult” is a major characteristic related to the viewing rating, and that the term “fantasy” is a major characteristic related to the genre. - In addition, the
storage 320 may store the item table where the major characteristic related to the viewing rating “adult” is mapped to “over 18 years” in the viewing rating item in the metadata structured as in FIG. 6 , and where the major characteristic related to the genre “fantasy” is mapped to “fantasy” in the genre item in the metadata structured as in FIG. 6 . - In addition, the
storage 320 may store the item table where the major characteristic related to the viewing rating “adult” is mapped to “over 19 years” in the viewing rating item in the metadata structured as in FIG. 7 , and where the major characteristic related to the genre “fantasy” is mapped to “science fiction” in the genre item in the metadata structured as in FIG. 7 . - The
controller 330 may extract the major characteristic “adult” related to the viewing rating of the contents and the major characteristic “fantasy” related to the genre of the contents, and generate a query for contents search using the extracted “adult” and “fantasy”. - Herein, the
controller 330 may use the item table stored in the storage 320 to map “adult” and “fantasy” to different contents dividing criteria according to the characteristics of the external server. - First of all, hereinbelow is an explanation of a case where a query for contents search regarding the first external server 400-1 is generated.
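The item-table mappings just described can be sketched in code. This is a minimal illustrative sketch only: every name and table value below is an assumption drawn from the FIGS. 6 and 7 discussion, not the patented implementation.

```python
# Illustrative sketch: the same major characteristics ("adult", "fantasy")
# map to different contents dividing criteria depending on which external
# metadata server is being queried.
ITEM_TABLE = {
    "first_external_server": {   # metadata structured as in FIG. 6
        "viewing_rating": {"adult": "over 18 years"},
        "genre": {"fantasy": "fantasy"},
    },
    "second_external_server": {  # metadata structured as in FIG. 7
        "viewing_rating": {"adult": "over 19 years"},
        "genre": {"fantasy": "science fiction"},
    },
}

def build_query(server: str, characteristics: dict) -> dict:
    """Map each extracted major characteristic to the dividing criteria
    of the given server and return the resulting search query."""
    table = ITEM_TABLE[server]
    return {item: table[item][value] for item, value in characteristics.items()}

extracted = {"viewing_rating": "adult", "genre": "fantasy"}
print(build_query("first_external_server", extracted))
# {'viewing_rating': 'over 18 years', 'genre': 'fantasy'}
print(build_query("second_external_server", extracted))
# {'viewing_rating': 'over 19 years', 'genre': 'science fiction'}
```

The prose that follows walks through exactly these two mappings, first for the first external server 400-1 and then for the second external server 400-2.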
- In this case, since in the structured data stored in the first external server 400-1 the viewing ratings “under 7 years”, “under 13 years”, and “over 18 years” are divided according to the contents dividing criteria, the major characteristic “adult” related to the viewing rating of the contents is mapped to “over 18 years”. In addition, since in the structured data stored in the first external server 400-1 the genre is divided based on the contents dividing criteria “comedy”, “drama”, and “fantasy”, the
controller 330 maps the major characteristic related to the genre of the contents to “fantasy”. - Accordingly, the
controller 330 may use the viewing rating: over 18 years and genre: fantasy to generate a query for contents search, and transmit the generated query to the first external server 400-1. The first external server 400-1 searches for contents satisfying “over 18 years” in the viewing rating item 660 of the structured metadata and satisfying “fantasy” in the genre item 650, and transmits the title information Title_CCC on the searched contents to the second server 300. - Hereinbelow is an explanation of searching the contents based on the second external server 400-2.
- In this case, since the viewing rating of the structured data stored in the second external server 400-2 is divided based on the contents dividing criteria “under 7 years”, “under 13 years”, “over 19 years”, the
controller 330 maps the “adult” which is the major characteristic related to the viewing rating of the contents to the “over 19 years”. In addition, since the genre of the structured data stored in the second external server 400-2 is divided based on the contents dividing criteria “comedy”, “drama”, “science fiction”, the major characteristic “fantasy” related to the genre of the contents is mapped to the “science fiction” criteria. - Accordingly, the
controller 330 uses the viewing rating: over 19 years and genre: science fiction to generate a query for contents search, and transmits the generated query to the second external server 400-2. The second external server 400-2 searches for contents satisfying the over 19 years viewing rating in the viewing rating item 760 of the structured metadata and satisfying the science fiction criteria in the genre item 750, and transmits the title information Title_CCC on the searched contents to the second server 300. - As such, the
controller 330 generates a query for a contents search so as to correspond to the characteristics of the external server. Accordingly, even when contents are divided based on different contents dividing criteria, the controller 330 can easily search for the contents that the user wants. - One utterance element may include a plurality of utterance elements which may classify the characteristics of the contents.
- In this case, the
controller 330 may determine the plurality of utterance elements which may classify the characteristics of the contents in one utterance element, and map each utterance element to the contents dividing criteria within its item. In addition, the controller 330 may generate a query using each utterance element mapped to the contents dividing criteria, and transmit the generated query to the external server 400 to perform a contents search. - To this end, the
storage 320 may store information on the utterance element which includes a plurality of utterance elements which may classify the characteristics of the contents. In addition, the storage 320 may store the items to which each utterance element is mapped in the metadata and information on the contents dividing criteria. - Hereinbelow is an explanation of an example where the voice signal “find what we can watch with family members” is received from the
display apparatus 100. - In this case, when a keyword related to the major characteristics “with family members” is stored, the
controller 330 may extract “with the family members” from the received voice signal as a major characteristic. - In addition, in a case where information on a plurality of utterance elements, that is, “comedy” related to the genre and “all ages” related to the viewing rating, is stored in the
storage 320, the controller 330 may extract the utterance elements “comedy” related to the genre and “all ages” related to the viewing rating from “with the family members” with reference thereto. - In this case, the controller may determine the contents dividing criteria of the viewing rating and the genre in the structured data stored in the
external server 400, convert “all ages” related to the viewing rating, which is an extracted utterance element, so as to be mapped to the contents dividing criteria within the viewing rating item, and convert “comedy” related to the genre, which is an extracted utterance element, to be mapped to the contents dividing criteria within the genre item. - For example, when the structured data stored in the
external server 400 is as in FIG. 5 , the controller 330 may map “with the family members” to “all” of the contents dividing criteria in the viewing rating item with reference to the item table stored in the storage 320 , and to “comedy” of the contents dividing criteria in the genre item. - In this case, the
storage 320 may store the item table where the major characteristic related to the viewing rating “all ages” is mapped to “all” in the viewing rating item and where the major characteristic related to the genre “comedy” is mapped to “comedy” in the genre item. - Accordingly, the
controller 330 may use the viewing rating: all and genre: comedy to generate a query for contents search, and transmit the generated query to the external server 400 . - The
controller 330 may consider the preference of the user when generating a query for contents search. - More specifically, the
controller 330 may correspond the utterance element extracted based on the user's preference to at least one item of the plurality of items, and convert the extracted utterance element to correspond to at least one contents dividing criteria to generate a query for searching contents. - To this end, the
storage 320 may store information on the user preference. Herein, the user preference may include at least one of the genre of contents and viewing rating that the user prefers. - In addition, the
storage 320 may store information on the utterance elements for which the user's preference is considered when generating a query for contents search. For example, the storage 320 may store “fun” as an utterance element for which the user's preference is considered. - For example, hereinbelow is an explanation of a case where the voice signal “find something fun” has been received from the
display apparatus 100. Herein, when “fun” is stored as a keyword related to the genre, the controller 330 may extract “fun” as a major characteristic from the received voice signal. - In this case, the
controller 330 may consider the user's preference when mapping the extracted major characteristic to the contents dividing criteria inside the structured data. - For example, when the genre of the contents that the user prefers is “drama”, and the structured data stored in the
external server 400 is as in FIG. 5 , the controller 330 may use the item table stored in the storage 320 and the user preference to correspond “fun” to the genre item, and to “drama” of the contents dividing criteria in the genre item. - In this case, the
storage 320 may store the item table which includes information where the major characteristic related to the genre “drama” is mapped to “drama” in the genre item in the metadata structured as illustrated in FIG. 5 . - In addition, the
controller 330 may use the genre: drama to generate a query for contents search and transmit the generated query to the external server 400 . - Meanwhile, in the aforementioned example, it has been explained that only the user preference on the genre is considered, but this is merely exemplary. In the aforementioned example, when the viewing rating that the user prefers is all ages, the
controller 330 may further map “fun” to “all” of the contents dividing criteria in the viewing rating item. Accordingly, the controller 330 may transmit the query for contents search such as genre: drama and viewing rating: all to the external server 400 , and receive information on the searched contents. -
FIG. 8 is a view illustrating a dialog type system according to an exemplary embodiment. More specifically, FIG. 8 is a view specifically illustrating the functions that the apparatus and servers configuring the dialog type system 1000 illustrated in FIG. 1 perform. - Since the
display apparatus 100 , first server 200 , second server 300 and external server 400 configuring the dialog type system 1000 have been specifically explained with reference to FIGS. 1 to 7 , detailed explanation of the repeated portions will be omitted. - First of all, the
display apparatus 100 collects the user's voice, and transmits the collected voice to the first server 200 . The first server 200 converts the user's voice into a voice signal and transmits the voice signal to the display apparatus 100 . Herein, the first server 200 may be embodied as an Automatic Speech Recognition (ASR) server which includes an ASR engine. - The
display apparatus 100 transmits the voice signal received from the first server 200 to the second server 300 . Herein, the second server 300 may be embodied as a dialog server. - The
second server 300 may perform natural language processing on the received voice signal, and determine the user's utterance intentions. More specifically, the second server 300 may analyze the purpose domain, purpose function, and major characteristics in the user's voice and determine the user's utterance intentions. In addition, the second server 300 may use the analysis results to generate a structured meaning frame regarding the received voice signal. - Next, the
second server 300 may perform scheduling regarding the function execution based on the determined utterance intentions. Herein, scheduling may mean a process of determining an order of operations in a case where there are additional operations that the second server 300 must perform in order to generate response information corresponding to the determined utterance intentions. - For example, in a case where the utterance intentions include a search request for the contents, the
second server 300 must search for the contents corresponding to the user's utterance intentions, and thus the second server 300 must perform a contents search through the external server 400 before generating the response information. In this case, when it is necessary to perform additional operations such as a contents search, the second server 300 may perform scheduling so as to perform the contents search before generating a control command. - When the utterance intentions include a search request, the
second server 300 generates a search query. In this case, the second server 300 may generate a query for a contents search considering the user's preference (that is, context). The method by which the second server 300 generates a query for contents search has been described above with reference to FIGS. 1 to 7 , and thus a detailed explanation is omitted. - In addition, the
second server 300 transmits the generated query to the external server 400 and may receive the search results from the external server 400 . - Herein, the
external server 400 may be embodied as a metadata server which structures and stores the metadata information regarding EPG, Music, VOD, Photo, Applications, etc. Although FIG. 8 illustrates that metadata information regarding EPG, Music, VOD, Photo, Application, etc. is included in the external server 400 , it is not limited thereto, and thus not all have to be included. - The
second server 300 may generate response information using the received search results. That is, the second server 300 may generate a control command (for example, a system command of a script format) for outputting the system response corresponding to the user's voice. - In addition, the
second server 300 transmits the generated control command to the display apparatus 100 . In this case, the second server 300 may transmit the information on the contents search received from the external server 400 together with the generated control command to the display apparatus 100 . - Accordingly, the display apparatus may interpret the control command and perform operations corresponding to the user's voice. For example, when the user's voice is related to the contents search, the
display apparatus 100 may output the list of the searched contents as a system response. -
FIGS. 9 to 11 are views explaining processes for generating a query according to an exemplary embodiment. - For example, the
second server 300 may extract a phrase indicating the characteristics of the contents from the voice signal and convert the phrase into a regularized phrase. That is, the second server 300 may convert the term indicating the characteristics of the contents so as to be mapped to the contents dividing criteria divided in the external server 400 which provides metadata service. -
FIGS. 10 and 11 are views illustrating an example of a process of converting a phrase indicating the characteristics of the extracted contents. The extracted phrase “fantasy” corresponds to the genre item of the various items configuring the metadata in that it is an utterance element related to the genre of the contents. - Herein, in that the server which provides the metadata service divides the contents of which the genre is fantasy based on the contents dividing criteria such as “fantasy, sci-fi”, “fantasy” is mapped to “fantasy, sci-fi” to generate a query for contents search.
- In addition, as in the lower section of
FIG. 10 , in the user utterance “Show me all the kids programs”, “kids” is extracted as an utterance element. The extracted phrase “kids” corresponds to the viewing rating item of the various items configuring the metadata in that it is an utterance element related to the viewing rating of the contents. - Herein, in that the server which provides the metadata service divides the viewing rating based on the contents dividing criteria such as “7”, “kids” is mapped to “7”, generating a query for contents search.
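The phrase regularization just described can be sketched as a small lookup. This is a hedged sketch: the mapping values come from the FIG. 10 discussion, while the function and variable names are assumptions for illustration only.

```python
# Illustrative sketch: an extracted phrase is converted into the regularized
# phrase used by the metadata service, together with the metadata item it
# belongs to ("fantasy" -> genre, "kids" -> viewing rating).
REGULARIZED = {
    "fantasy": ("genre", "fantasy, sci-fi"),
    "kids": ("viewing_rating", "7"),
}

def regularize(phrase: str) -> tuple:
    """Return the (metadata item, regularized phrase) pair for a phrase."""
    return REGULARIZED[phrase]

print(regularize("fantasy"))  # ('genre', 'fantasy, sci-fi')
print(regularize("kids"))     # ('viewing_rating', '7')
```

In practice such a table would be one part of the item table stored in the storage 320, keyed per external metadata server as described earlier.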
- As illustrated in
FIG. 11 , from the user utterance “Show me something funny”, “funny” is extracted as an utterance element. The extracted phrase “funny” may be considered together with the user preference when mapping it to the items configuring the metadata. - For example, when the user prefers the crime drama genre as the genre of the contents and 14 years as the viewing rating, the extracted “funny” may be corresponded to the genre and viewing rating items of the various items configuring the metadata.
- Herein, in that the server which provides the metadata service divides the contents of which the genre is crime drama based on the contents dividing criteria such as “crime drama” and divides the viewing rating based on the contents dividing criteria such as “14”, “funny” is mapped to “crime drama” and “14” to generate a query for contents search.
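The preference-dependent mapping of “funny” can be sketched as follows. All names here are assumptions for illustration; the point is only that the phrase itself carries no dividing criteria, so the stored user preference supplies them.

```python
# Illustrative sketch: "funny" is resolved through the stored user preference
# (crime drama genre, 14 years viewing rating) before the query is generated.
USER_PREFERENCE = {"genre": "crime drama", "viewing_rating": "14"}
PREFERENCE_DEPENDENT = {"funny"}  # phrases resolved via user preference

def build_preference_query(phrase: str) -> dict:
    """Resolve a preference-dependent phrase into a contents search query."""
    if phrase not in PREFERENCE_DEPENDENT:
        raise ValueError(f"{phrase!r} is not resolved via user preference")
    return dict(USER_PREFERENCE)

print(build_preference_query("funny"))
# {'genre': 'crime drama', 'viewing_rating': '14'}
```

A different stored preference (for example, drama and all ages, as in the earlier “find something fun” example) would yield a different query from the same utterance.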
-
FIGS. 12A and 12B are views illustrating an example of a system response being output in the display apparatus according to an exemplary embodiment. - Herein, as in
FIGS. 12A and 12B , hereinbelow is an explanation of the case where the user utters “recommend fantasies for children”. - In this case, the
display apparatus 100 may output the system response corresponding to “recommend fantasies for children” based on the response information received from the second server 300 . For example, the display apparatus 100 may display a list 810 of the fantasy movies of which the viewing rating is 7 or under. In this case, the list 810 may include at least one of the title, thumbnail, broadcast time, cast, and producer information, etc. -
FIG. 13 is a flowchart for explaining a method of controlling the dialog type interface apparatus according to an exemplary embodiment. - First of all, a voice signal corresponding to the user's voice collected in the display apparatus is received from the display apparatus (operation S1310).
- Then, the user's utterance intention is determined using the voice signal, and a query for searching the contents corresponding to the determined utterance intentions is generated (operation S1320). That is, the utterance element for determining the utterance intention is extracted from the voice signal, and the extracted utterance element is converted to correspond to the contents dividing criteria in each item to generate a query.
- More specifically, it is possible to correspond the extracted utterance element to at least one item of the plurality of items of the item table, and to convert the extracted utterance element to correspond to the contents dividing criteria of the at least one item, to generate a query for searching the contents.
- In addition, it is possible to generate a query for corresponding the extracted utterance element to at least one item of the plurality of items of the item table, and for converting the extracted utterance element to correspond to the contents dividing criteria of the at least one item to search contents.
- Next, the generated query is transmitted to the external server which stores the metadata on the contents per item (operation S1330).
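The three operations above (S1310 to S1330) can be sketched as one pipeline. Every function body below is a stand-in under assumed names, not a real server API; it only mirrors the order of operations in the flowchart.

```python
# Illustrative sketch of the controlling method of FIG. 13.
def receive_voice_signal() -> str:                     # operation S1310
    return "recommend fantasies for children"          # stand-in ASR result

def generate_query(voice_signal: str) -> dict:         # operation S1320
    # Extract utterance elements and convert each one to the contents
    # dividing criteria of its item (assumed example values).
    elements = {"genre": "fantasy", "viewing_rating": "children"}
    criteria = {"genre": {"fantasy": "fantasy"},
                "viewing_rating": {"children": "under 7 years"}}
    return {item: criteria[item][value] for item, value in elements.items()}

def transmit_to_external_server(query: dict) -> list:  # operation S1330
    return ["Title_CCC"]                               # stand-in search result

query = generate_query(receive_voice_signal())
print(query)                               # {'genre': 'fantasy', 'viewing_rating': 'under 7 years'}
print(transmit_to_external_server(query))  # ['Title_CCC']
```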
- The dialog type interface apparatus may store an item table which includes a plurality of items having different contents dividing criteria according to at least one of nation and language.
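An item table keyed by nation and language, as described in the paragraph above, might be sketched like this. The locale keys and age thresholds are assumptions for illustration only.

```python
# Illustrative sketch: contents dividing criteria keyed by (nation, language).
ITEM_TABLE_BY_LOCALE = {
    ("US", "en"): {"viewing_rating": {"adult": "over 18 years"}},
    ("KR", "ko"): {"viewing_rating": {"adult": "over 19 years"}},
}

def criteria_for(nation: str, language: str, item: str, element: str) -> str:
    """Look up the dividing criteria for an utterance element in one locale."""
    return ITEM_TABLE_BY_LOCALE[(nation, language)][item][element]

print(criteria_for("US", "en", "viewing_rating", "adult"))  # over 18 years
print(criteria_for("KR", "ko", "viewing_rating", "adult"))  # over 19 years
```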
- In addition, the external server may divide the metadata on the contents per at least one item of the title, cast, producer, contents type, genre and viewing rating etc. and store the divided metadata.
- Detailed explanation thereof will be omitted since it has been explained with reference to
FIGS. 1 to 12 . - In addition, a non-transitory computer readable medium which stores a program consecutively performing a controlling method according to the present disclosure may be provided.
- A non-transitory computer readable medium refers to a computer readable medium which may store data semi-permanently, not a medium which stores data for a short period of time such as a register, cache, or memory. More specifically, the aforementioned various applications or programs may be stored in a non-transitory readable medium such as a CD, DVD, hard disk, Blu-ray disc, USB device, memory card, or ROM.
- In addition, in the block diagrams illustrated regarding the display apparatus and server, a bus was not illustrated, but communication among the configurative elements in the display apparatus and server may be performed through a bus. In addition, each device may further include a CPU performing the various stages described above and a processor such as a microprocessor.
- Although a few exemplary embodiments have been shown and described, it would be appreciated by those skilled in the art that changes may be made in the exemplary embodiments without departing from the principles and spirit of the application, the scope of which is defined in the claims and their equivalents.
Claims (13)
1. A dialog type interface apparatus which provides contents corresponding to a voice signal received from a display apparatus, the dialog type interface apparatus comprising:
a communicator configured to receive a voice signal corresponding to a user's voice collected in the display apparatus; and
a controller configured to determine the user's utterance intentions based on the received voice signal, and configured to generate a query for searching contents corresponding to the determined utterance intentions, divide metadata on the contents, and transmit the divided metadata to an external server,
wherein the controller is configured to extract an utterance element for determining the utterance intentions from the voice signal, and convert the extracted utterance element to correspond to contents dividing criteria of at least one item of an item table to generate the query.
2. The dialog type interface apparatus according to claim 1 ,
further comprising a storage configured to store the item table which includes a plurality of items which have different contents dividing criteria according to at least one of a nation and a language.
3. The dialog type interface apparatus according to claim 1 ,
wherein the controller is configured to correspond the extracted utterance element to at least one item of a plurality of items of the item table, and convert the extracted utterance element to correspond to the contents dividing criteria of the at least one item of the item table to generate a query for searching the contents.
4. The dialog type interface apparatus according to claim 1 ,
wherein the controller is configured to correspond the extracted utterance element to the at least one item of a plurality of items of the item table, and convert the extracted utterance element to correspond to the contents dividing criteria of the at least one item of the item table to generate a query for searching the contents, based on at least one user preference.
5. The dialog type interface apparatus according to claim 1 ,
wherein the external server divides the metadata on the contents per at least one item of a title, a cast, a producer, a contents type, a genre, and a viewing rating.
6. A method of controlling a dialog type interface apparatus which provides contents corresponding to a voice signal received from a display apparatus, the method comprising:
receiving a voice signal corresponding to a user's voice collected from the display apparatus;
determining the user's utterance intentions based on the received voice signal, and generating a query for searching contents corresponding to the determined utterance intentions; and
transmitting the generated query to an external server which divides and stores metadata on the contents per at least one item of an item table,
wherein the generating extracts an utterance element for determining the utterance intentions in the voice signal, and converts the extracted utterance element to correspond to contents dividing criteria in the at least one item of the item table to generate the query.
7. The method according to claim 6 ,
wherein the dialog type interface apparatus stores the item table which includes a plurality of items having different contents dividing criteria according to at least one of a nation and a language.
8. The method according to claim 6 ,
wherein the generating corresponds the extracted utterance element to at least one item of a plurality of items of the item table, and converts the extracted utterance element to correspond to the contents dividing criteria of the at least one item to generate a query for searching the contents.
9. The method according to claim 6 ,
wherein the generating corresponds the extracted utterance element to at least one item of a plurality of items of the item table, and converts the extracted utterance element to correspond to the contents dividing criteria of the at least one item to generate a query for searching the contents, based on at least one user preference.
10. The method according to claim 6 ,
wherein the external server divides the metadata on the contents per at least one item of a title, a cast, a producer, a contents type, a genre, and a viewing rating.
11. A method for searching contents in a dialog type system, the method comprising:
collecting a user's voice at a display apparatus and transmitting the user's voice to a first server;
converting, at the first server, the user's voice to text information, and transmitting the text information to the display apparatus,
transmitting, by the display apparatus, at least one of the text information and a voice signal to a second server, and
generating, by the second server, response information corresponding to the received at least one of the text information and the voice signal.
12. The method of claim 11 , wherein the second server generates a query to search for content based on utterances extracted from the received voice signal or the received text information.
13. The method of claim 12 , wherein the query is transmitted to an external server, and searched results are received from the external server and transmitted to the display apparatus by the second server.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020130001838A KR20140089876A (en) | 2013-01-07 | 2013-01-07 | interactive interface apparatus and method for comtrolling the server |
KR10-2013-0001838 | 2013-01-07 |
Publications (1)
Publication Number | Publication Date |
---|---|
US20140196092A1 true US20140196092A1 (en) | 2014-07-10 |
Family
ID=49917565
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US14/149,397 Abandoned US20140196092A1 (en) | 2013-01-07 | 2014-01-07 | Dialog-type interface apparatus and method for controlling the same |
Country Status (6)
Country | Link |
---|---|
US (1) | US20140196092A1 (en) |
EP (1) | EP2752846A1 (en) |
JP (1) | JP2014132464A (en) |
KR (1) | KR20140089876A (en) |
CN (1) | CN103916704A (en) |
AU (1) | AU2014200034A1 (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150248886A1 (en) * | 2014-03-03 | 2015-09-03 | Microsoft Corporation | Model Based Approach for On-Screen Item Selection and Disambiguation |
US9405741B1 (en) * | 2014-03-24 | 2016-08-02 | Amazon Technologies, Inc. | Controlling offensive content in output |
US9691070B2 (en) * | 2015-09-01 | 2017-06-27 | Echostar Technologies L.L.C. | Automated voice-based customer service |
EP3197168A1 (en) * | 2016-01-21 | 2017-07-26 | Samsung Electronics Co., Ltd. | Display apparatus and controlling method thereof |
US9886958B2 (en) | 2015-12-11 | 2018-02-06 | Microsoft Technology Licensing, Llc | Language and domain independent model based approach for on-screen item selection |
JP2018511115A (en) * | 2015-02-22 | 2018-04-19 | グーグル エルエルシー | Identify content suitable for children algorithmically without human intervention |
US10089985B2 (en) * | 2014-05-01 | 2018-10-02 | At&T Intellectual Property I, L.P. | Smart interactive media content guide |
US10140985B2 (en) | 2013-07-02 | 2018-11-27 | Samsung Electronics Co., Ltd. | Server for processing speech, control method thereof, image processing apparatus, and control method thereof |
CN110851470A (en) * | 2015-05-27 | 2020-02-28 | 谷歌有限责任公司 | Providing suggested voice-based action queries |
US10929446B2 (en) | 2015-03-20 | 2021-02-23 | Kabushiki Kaisha Toshiba | Document search apparatus and method |
CN112750438A (en) * | 2019-10-30 | 2021-05-04 | Lg 电子株式会社 | Artificial intelligence device |
US20210392403A1 (en) * | 2019-11-29 | 2021-12-16 | Hisense Visual Technology Co., Ltd. | Smart Television And Server |
US11599536B2 (en) | 2019-10-23 | 2023-03-07 | Soppra Corporation | Information output device, information output method, and information output program |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10630751B2 (en) | 2016-12-30 | 2020-04-21 | Google Llc | Sequence dependent data message consolidation in a voice activated computer network environment |
US10956485B2 (en) | 2011-08-31 | 2021-03-23 | Google Llc | Retargeting in a search environment |
US10431209B2 (en) | 2016-12-30 | 2019-10-01 | Google Llc | Feedback controller for data transmissions |
US9703757B2 (en) | 2013-09-30 | 2017-07-11 | Google Inc. | Automatically determining a size for a content item for a web page |
US10614153B2 (en) | 2013-09-30 | 2020-04-07 | Google Llc | Resource size-based content item selection |
KR101602828B1 (en) * | 2014-09-01 | 2016-03-14 | 유형근 | The interactive digital signage |
CN106250474B (en) * | 2016-07-29 | 2020-06-23 | Tcl科技集团股份有限公司 | Voice control processing method and system |
JP7020799B2 (en) * | 2017-05-16 | 2022-02-16 | ソニーグループ株式会社 | Information processing equipment and information processing method |
JP7269221B2 (en) * | 2018-03-27 | 2023-05-08 | ソニーグループ株式会社 | Information processing device, information processing method, transmission device, and transmission method |
CN109036417B (en) * | 2018-07-03 | 2020-06-23 | 百度在线网络技术(北京)有限公司 | Method and apparatus for processing voice request |
WO2020234948A1 (en) * | 2019-05-17 | 2020-11-26 | ソプラ株式会社 | Information output device, information output method, and program |
WO2020258082A1 (en) * | 2019-06-26 | 2020-12-30 | 深圳市欢太科技有限公司 | Information recommendation method and apparatus, electronic device and storage medium |
CN111081252A (en) * | 2019-12-03 | 2020-04-28 | Shenzhen Zhuiyi Technology Co., Ltd. | Voice data processing method and device, computer equipment and storage medium |
US20230229681A1 (en) * | 2020-05-15 | 2023-07-20 | Sanofi | Information system and electronic device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1772854A1 (en) * | 2005-10-04 | 2007-04-11 | Robert Bosch Corporation | Method and apparatus for organizing and optimizing content in dialog systems |
US20080103780A1 (en) * | 2006-10-31 | 2008-05-01 | Dacosta Behram Mario | Speech recognition for internet video search and navigation |
US7774343B2 (en) * | 2002-04-15 | 2010-08-10 | Microsoft Corporation | Multiple media vendor support |
US20130268956A1 (en) * | 2012-04-05 | 2013-10-10 | X2Tv S.R.L. | Real-time collection of audience feedback of a television or radio show |
US20140068649A1 (en) * | 2012-08-31 | 2014-03-06 | Gregory Joseph Badros | Sharing Television and Video Programming Through Social Networking |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8910232B2 (en) * | 2009-09-14 | 2014-12-09 | At&T Intellectual Property I, Lp | System and method of analyzing internet protocol television content for closed-captioning information |
JP2012043000A (en) * | 2010-08-12 | 2012-03-01 | Sony Corp | Retrieval device, retrieval method, and program |
CN102665129B (en) * | 2012-04-27 | 2015-05-27 | Beijing People's Radio Station | Played content-customizing method and played content-customizing device |
2013

- 2013-01-07 KR KR1020130001838A patent/KR20140089876A/en not_active Application Discontinuation

2014

- 2014-01-02 EP EP14150032.2A patent/EP2752846A1/en not_active Withdrawn
- 2014-01-06 JP JP2014000671A patent/JP2014132464A/en active Pending
- 2014-01-06 AU AU2014200034A patent/AU2014200034A1/en not_active Abandoned
- 2014-01-07 CN CN201410006680.8A patent/CN103916704A/en active Pending
- 2014-01-07 US US14/149,397 patent/US20140196092A1/en not_active Abandoned
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10140985B2 (en) | 2013-07-02 | 2018-11-27 | Samsung Electronics Co., Ltd. | Server for processing speech, control method thereof, image processing apparatus, and control method thereof |
US9412363B2 (en) * | 2014-03-03 | 2016-08-09 | Microsoft Technology Licensing, Llc | Model based approach for on-screen item selection and disambiguation |
US20150248886A1 (en) * | 2014-03-03 | 2015-09-03 | Microsoft Corporation | Model Based Approach for On-Screen Item Selection and Disambiguation |
US9405741B1 (en) * | 2014-03-24 | 2016-08-02 | Amazon Technologies, Inc. | Controlling offensive content in output |
US10089985B2 (en) * | 2014-05-01 | 2018-10-02 | At&T Intellectual Property I, L.P. | Smart interactive media content guide |
US11594225B2 (en) * | 2014-05-01 | 2023-02-28 | At&T Intellectual Property I, L.P. | Smart interactive media content guide |
US20180358017A1 (en) * | 2014-05-01 | 2018-12-13 | At&T Intellectual Property I, L.P. | Smart interactive media content guide |
US11049029B2 (en) | 2015-02-22 | 2021-06-29 | Google Llc | Identifying content appropriate for children algorithmically without human intervention |
JP2018511115A (en) * | 2015-02-22 | 2018-04-19 | Google LLC | Identify content suitable for children algorithmically without human intervention |
JP2019145178A (en) * | 2015-02-22 | 2019-08-29 | Google LLC | Identifying content appropriate for children algorithmically without human intervention |
US10929446B2 (en) | 2015-03-20 | 2021-02-23 | Kabushiki Kaisha Toshiba | Document search apparatus and method |
US11869489B2 (en) | 2015-05-27 | 2024-01-09 | Google Llc | Providing suggested voice-based action queries |
CN110851470A (en) * | 2015-05-27 | 2020-02-28 | Google LLC | Providing suggested voice-based action queries |
US9691070B2 (en) * | 2015-09-01 | 2017-06-27 | Echostar Technologies L.L.C. | Automated voice-based customer service |
US9886958B2 (en) | 2015-12-11 | 2018-02-06 | Microsoft Technology Licensing, Llc | Language and domain independent model based approach for on-screen item selection |
KR20170087712A (en) * | 2016-01-21 | 2017-07-31 | 삼성전자주식회사 | Display apparatus and controlling method thereof |
KR102499124B1 (en) * | 2016-01-21 | 2023-02-15 | 삼성전자주식회사 | Display apparatus and controlling method thereof |
US10779030B2 (en) | 2016-01-21 | 2020-09-15 | Samsung Electronics Co., Ltd. | Display apparatus and controlling method thereof |
EP3197168A1 (en) * | 2016-01-21 | 2017-07-26 | Samsung Electronics Co., Ltd. | Display apparatus and controlling method thereof |
US11599536B2 (en) | 2019-10-23 | 2023-03-07 | Soppra Corporation | Information output device, information output method, and information output program |
CN112750438A (en) * | 2019-10-30 | 2021-05-04 | LG Electronics Inc. | Artificial intelligence device |
US20210392403A1 (en) * | 2019-11-29 | 2021-12-16 | Hisense Visual Technology Co., Ltd. | Smart Television And Server |
Also Published As
Publication number | Publication date |
---|---|
EP2752846A1 (en) | 2014-07-09 |
CN103916704A (en) | 2014-07-09 |
JP2014132464A (en) | 2014-07-17 |
AU2014200034A1 (en) | 2014-07-24 |
KR20140089876A (en) | 2014-07-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20140196092A1 (en) | | Dialog-type interface apparatus and method for controlling the same |
US11700409B2 (en) | | Server and method for controlling server |
US9520133B2 (en) | | Display apparatus and method for controlling the display apparatus |
US20190333515A1 (en) | | Display apparatus, method for controlling the display apparatus, server and method for controlling the server |
US20140195230A1 (en) | | Display apparatus and method for controlling the same |
US20140195244A1 (en) | | Display apparatus and method of controlling display apparatus |
KR102118195B1 (en) | | Server and method for comtrolling the server |
KR102182689B1 (en) | | Server and method for comtrolling the server |
KR102379049B1 (en) | | Server and method for comtrolling the server |
KR102091006B1 (en) | | Display apparatus and method for controlling the display apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: SAMSUNG ELECTRONICS CO., LTD, KOREA, REPUBLIC OF; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: CHUNG, JI-HYE; SONG, IN-JEE; SHIN, YONG-WOOK; REEL/FRAME: 031908/0127; Effective date: 20131230 |
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |