WO2018016760A1 - Electronic device and control method thereof

Info

Publication number: WO2018016760A1
Application number: PCT/KR2017/006790
Authority: WO
Inventors: 송영석; 김한기; 임동현; 박해광; 손준호; 이우정
Original assignee: 삼성전자 주식회사
Priority date: 2016-07-21
Filing date: 2017-06-27
Publication date: 2018-01-25
Also published as: KR20180010955A; KR102403149B1

Abstract

An electronic device is disclosed. The electronic device comprises: a communication unit for communicating with a server storing information on a plurality of short clips and storing keywords by the plurality of short clips; an output unit; an input unit; and a processor which, when a voice uttered by a user is received via the input unit, transmits a short clip request signal to the server, on the basis of a keyword included in the received uttered voice and information on content outputted from the output unit, and outputs a short clip via the output unit, on the basis of information on the short clip received from the server in response to the request signal.

Description

Electronic device and its control method

The present invention relates to an electronic device and a control method thereof, and more particularly, to an electronic device providing a short clip and a control method thereof.

Recently, with the development of electronic technology, various types of multimedia devices have been developed. In particular, multimedia devices such as TVs, PCs, laptop computers, tablet PCs, smartphones, and the like are widely used in most homes.

In addition, in order to meet the needs of users who desire various functions, efforts are being made to develop a new type of personal assistant service (Smart Assistant) incorporating speech recognition into a multimedia device.

However, according to the related art, text-based search results for a user's question are merely provided in an unnatural voice using a TTS.

In addition, when a search result includes video or audio content, the content is provided as it is. In this case, the original content contains many parts that are irrelevant to the user's question, so the user ends up receiving search results that are largely meaningless.

Therefore, there is a need to provide, as a search result, only the section of the original content that is related to the user's question.

Summary of the invention

The present invention has been made to solve the above-described problem, and an object of the present invention is to provide an electronic device and a control method thereof for providing a short clip of original content based on a keyword.

An electronic device according to an embodiment of the present disclosure includes: a communication unit for communicating with a server that stores information about a plurality of short clips and keywords for each of the plurality of short clips; an output unit; an input unit; and a processor configured to, when a user's spoken voice is received through the input unit, transmit a short clip request signal to the server based on a keyword included in the received spoken voice and information on the content output from the output unit, and to output the short clip through the output unit based on the information about the short clip received from the server in response to the request signal.

Here, the information on the plurality of short clips includes at least one of information on a location where the plurality of short clips are stored and information on a time interval including the keyword, and when the information about the short clip is received from the server in response to the request signal, the processor may output the short clip based on the received information.

In addition, each of the plurality of short clips may be video content or sound content generated by editing a portion including a specific keyword in specific content.

The processor may provide additional information about the short clip when such additional information is received. The additional information about the short clip may include at least one of a title and a genre of the original content, a broadcast time of the original content, a generation time of the short clip, broadcasting station information of the original content, and the keyword.

The output unit may include at least one of a display and a speaker.

In an electronic device according to another embodiment of the present disclosure, the output unit is implemented to include only a speaker, and the processor may provide additional information about the short clip as audio through the speaker.

The output unit may include at least one of a display and a speaker, and the processor may additionally transmit, to the server, a short clip request signal associated with a keyword that is repeated at least a predetermined number of times within a predetermined time in the audio output through the speaker.

In addition, the processor may provide additional response information for the spoken voice together with the short clip based on the keyword included in the received spoken voice.

The processor may transmit the request signal including the keyword and the user information to the server, and receive a short clip associated with the keyword and the user information from the server.

In addition, when the spoken voice is received, the processor may transmit the received spoken voice to a voice recognition server or the server, and may transmit the short clip request signal to the server based on the keyword received from the voice recognition server or the server and the information about the content.

Meanwhile, a control method of an electronic device that communicates with a server storing information about a plurality of short clips and keywords for each of the plurality of short clips, according to an embodiment of the present disclosure, includes: outputting content; receiving a spoken voice of a user; when the spoken voice is received, transmitting a short clip request signal to the server based on a keyword included in the received spoken voice and information on the content; and outputting the short clip based on the information about the short clip received from the server in response to the request signal.

The information on the plurality of short clips may include at least one of information on a location where the plurality of short clips are stored and information on a time interval including the keyword, and the outputting may include, when the information about the short clip is received from the server, outputting the short clip based on the received information.

The outputting of the short clip may include providing additional information about the short clip when such additional information is received, and the additional information about the short clip may include at least one of a title and a genre of the original content, a broadcast time of the original content, a generation time of the short clip, broadcasting station information of the original content, and the keyword.

The outputting of the short clip may provide additional information about the short clip as audio through a speaker.

The electronic device may include at least one of a display and a speaker, and the transmitting may include additionally transmitting, to the server, a short clip request signal associated with a keyword that is repeated at least a predetermined number of times within a predetermined time in the audio output through the speaker.

The outputting of the short clip may provide additional response information for the spoken voice together with the short clip based on a keyword included in the received spoken voice.

The transmitting may include transmitting the request signal including the keyword and the user information to the server, and the outputting of the short clip may include receiving the short clip associated with the keyword and the user information from the server and outputting it.

The transmitting may include transmitting the received spoken voice to a voice recognition server or the server, and transmitting the short clip request signal to the server based on the keyword received from the voice recognition server or the server and the information about the content.

Meanwhile, a system including an electronic device and a server according to an embodiment of the present disclosure includes: a server that generates information on a plurality of short clips based on keywords of a plurality of original contents and stores the information on the plurality of short clips and keywords for each of the plurality of short clips; and an electronic device that, when a spoken voice of a user is received, generates a short clip request signal based on a keyword included in the received spoken voice and information about the content output by the electronic device, transmits the request signal to the server, and outputs a short clip based on information about the short clip received from the server in response to the request signal.

According to various embodiments of the present disclosure as described above, since a short clip for the original content is provided based on a keyword included in the spoken voice of the user, user convenience may be increased.

FIG. 1 is a view for explaining a system for providing a short clip according to an embodiment of the present invention.

FIGS. 2A and 2B are block diagrams illustrating a configuration of an electronic device according to an embodiment of the present disclosure.

FIG. 3 is a block diagram illustrating a configuration of a server according to an exemplary embodiment.

FIG. 4 is a diagram for describing a method of outputting a short clip associated with a keyword according to an exemplary embodiment.

FIG. 5 is a diagram for describing a method of outputting a short clip associated with output content according to an exemplary embodiment.

FIG. 6 is a diagram for describing a method of obtaining a keyword by analyzing an audio signal according to an exemplary embodiment.

FIG. 7 is a diagram for describing additional information about a short clip according to one embodiment of the present invention.

FIG. 8 is a diagram for describing additional response information provided with a short clip according to an exemplary embodiment.

FIG. 9 is a flowchart illustrating a short clip providing method according to an exemplary embodiment.

FIG. 10 is a flowchart illustrating a system for providing a short clip according to an exemplary embodiment.

FIG. 11 is a diagram for describing a method of providing a short clip through a speaker according to another embodiment of the present disclosure.

Hereinafter, the present invention will be described in more detail with reference to the drawings. In describing the present invention, when it is determined that a detailed description of a related known function or configuration may unnecessarily obscure the subject matter of the present invention, the detailed description thereof will be omitted. In addition, the following examples may be modified in many different forms, and the scope of the technical spirit of the present disclosure is not limited to the following examples. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the inventive concept to those skilled in the art.

In addition, the term 'comprising' of an element means that the element may further include other elements, not to exclude other elements unless specifically stated otherwise. Furthermore, various elements and regions in the drawings are schematically drawn. Therefore, the technical spirit of the present disclosure is not limited by the relative size or the interval drawn in the accompanying drawings.

The electronic device 100 may be implemented as various types of devices that output content using at least one of a display and a speaker. Accordingly, the electronic device 100 may be implemented as a digital TV, but is not limited thereto. The electronic device 100 may be implemented as various types of devices having a display function such as a PC, a mobile phone, a tablet PC, a PMP, a PDA, a navigation device, and the like. In addition, the electronic device 100 may be implemented as a sound output device having no display function. In this case, the content may be output as an audio signal through the speaker. However, hereinafter, it is assumed that the electronic device 100 is implemented as a digital TV for convenience of description. An embodiment in which the electronic device 100 includes only a speaker without a display function will be described in detail with reference to FIG. 11.

The electronic device 100 according to an embodiment of the present disclosure may receive a spoken voice of a user and obtain a keyword included in the received spoken voice. In detail, the electronic device 100 may transmit the received spoken voice to a voice recognition server (not shown) and receive a keyword included in the spoken voice from the voice recognition server. However, the present invention is not limited thereto, and the electronic device 100 may obtain a keyword by analyzing a user's spoken voice.

In addition, the server 200 that provides the short clip according to an embodiment of the present disclosure may, of course, also serve as a voice recognition server that analyzes the spoken voice and transmits a keyword included in the spoken voice to the electronic device 100.

The electronic device 100 may transmit a short clip request signal to the server 200 based on the keyword included in the received spoken voice and information on the content output by the electronic device 100. In this case, the electronic device 100 may receive information about the short clip from the server 200 in response to the request signal, and output the short clip based on the received information. Here, the information about the short clip may be at least one of information about a location where the short clip is stored and information about a time interval including the keyword. As an example, when a time interval including a keyword is received, the electronic device 100 may reproduce and output only the time interval of the content that includes the keyword.

The server 200 may store information about the plurality of short clips and keywords for each of the plurality of short clips. In detail, the server 200 may receive content from the content provider 300 and generate a short clip from the received content. For example, the server 200 may receive broadcast content from a broadcaster and generate a plurality of short clips from the received broadcast content. Hereinafter, for convenience of description, the content received from the content provider 300 is referred to as the original content.

The short clip refers to an image obtained by editing a specific portion or part of the received original content, and in some cases, a plurality of contents may be combined. For example, a specific part or part may be obtained from each of the plurality of contents, and the obtained parts may be combined to generate a short clip.

According to an embodiment of the present invention, the server 200 may analyze the audio signal of the original content and edit the original content in units of endpoint detection (EPD). Here, EPD refers to an algorithm that detects a start point and an end point of a voice in real time by analyzing an audio signal of an original content.
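As a rough illustration of editing in EPD units, the following is a minimal sketch of an energy-threshold endpoint detector. The disclosure does not specify the EPD algorithm, so the per-frame energy input, the threshold, and the `min_silence` parameter are assumptions made only for this example.

```python
from typing import List, Tuple


def detect_endpoints(
    energies: List[float], threshold: float, min_silence: int
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of voiced segments; end is exclusive.

    Hypothetical helper: a frame counts as speech when its energy is at or
    above `threshold`; a segment closes after `min_silence` consecutive
    frames below the threshold.
    """
    segments: List[Tuple[int, int]] = []
    start = None       # first frame of the open segment, or None in silence
    silence_run = 0    # consecutive low-energy frames inside an open segment
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= min_silence:
                # The segment ends right after the last speech frame.
                segments.append((start, i - silence_run + 1))
                start = None
                silence_run = 0
    if start is not None:
        # Close a segment still open at the end, trimming trailing silence.
        segments.append((start, len(energies) - silence_run))
    return segments
```

Each returned pair marks one candidate editing unit; converting frame indices to time offsets depends on the frame length, which is not stated in the source.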

In addition, the server 200 may obtain a keyword by analyzing the voice included in each of the edited images in EPD units. Accordingly, the server 200 may obtain and store a plurality of edited images and keywords corresponding to each of the plurality of edited images edited in EPD units from one original content. Here, at least one keyword matching the edited video may be provided.

According to an embodiment of the present disclosure, when the server 200 acquires a plurality of keywords by analyzing an audio signal included in the edited video, the plurality of keywords may be matched to one edited video and stored in the server. Meanwhile, the original content is not necessarily edited in EPD units, and the server 200 may generate a plurality of short clips by editing the original content based on various voice detection algorithms. The short clip and the keyword generation method for each short clip of the server 200 will be described in detail with reference to FIG. 3. In the following description, an edited video obtained from original content is referred to as a short clip for convenience of description.

The short clip may be a video in which a specific part of the original content, for example, a part including a specific keyword, is edited to fit within a predetermined time (for example, within 3 minutes). However, the short clip is of course not limited to video content and may also be generated by editing audio content. In addition, since the playback time of the short clip may vary according to settings and the voice detection algorithm, it is of course not limited to 3 minutes or less.

Meanwhile, the server 200 may generate and store information about the short clip at the time of generating the short clip. Here, the information on the short clip may include at least one of information on a location where the short clip is stored and information on a time interval including a specific keyword. In detail, the server 200 may obtain a keyword by analyzing an audio signal included in the short clip, and store the short clip and a keyword matching the short clip. Therefore, the server 200 may store a plurality of short clips and keywords for each of the plurality of short clips. In addition, the server 200 according to an embodiment of the present invention may store, together with the short clip, the title, genre, and broadcast time of the original content, the creation time of the short clip, broadcast station information of the original content, and the like, based on metadata about the original content.
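The storage described above can be pictured as a keyword-to-clip index. The following is a minimal in-memory sketch; the `ShortClip` fields and the `ClipIndex` class are hypothetical names chosen for illustration, and the disclosure does not describe the server's actual data layout.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ShortClip:
    clip_id: str
    location: str               # where the clip or its original content is stored
    start_sec: float            # start of the time interval containing the keyword
    end_sec: float              # end of that interval
    keywords: Tuple[str, ...]   # one or more keywords matched to this clip


class ClipIndex:
    """Maps each keyword to the short clips it was matched to (assumed layout)."""

    def __init__(self) -> None:
        self._by_keyword: Dict[str, List[ShortClip]] = defaultdict(list)

    def add(self, clip: ShortClip) -> None:
        # A clip with several keywords is registered under each of them.
        for keyword in clip.keywords:
            self._by_keyword[keyword.lower()].append(clip)

    def find(self, keyword: str) -> List[ShortClip]:
        # Return a copy so callers cannot mutate the index.
        return list(self._by_keyword.get(keyword.lower(), []))
```

Lower-casing the keyword is a design choice of this sketch only; how keywords are normalized, if at all, is not stated in the source.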

The electronic device 100 according to an embodiment of the present disclosure may analyze a user's spoken voice and transmit a short clip request signal related to a keyword included in the spoken voice to the server 200, and the server 200 may transmit a short clip for the keyword included in the received request signal to the electronic device 100. In addition, the electronic device 100 may display the received short clip and provide it to the user.

Meanwhile, as described above, the electronic device 100 according to an embodiment of the present disclosure may transmit a user's spoken voice to a voice recognition server and receive a keyword included in the spoken voice from the voice recognition server. In addition, the server 200 providing the short clip may be configured to receive the user's spoken voice and transmit the keyword included in the spoken voice to the electronic device 100. That is, the voice recognition server or the server 200 may be implemented to perform voice recognition of converting the received voice into text and acquiring a keyword from the converted text when the user's spoken voice is received.

Hereinafter, various embodiments of the present disclosure will be described with reference to a block diagram illustrating a specific configuration of the electronic device 100.

FIGS. 2A and 2B are block diagrams illustrating a configuration of an electronic device according to an exemplary embodiment.

According to FIG. 2A, the electronic device 100 includes a communication unit 110, an input unit 120, an output unit 130, and a processor 140.

The communication unit 110 communicates with an external device according to various types of communication methods.

In particular, the communication unit 110 may communicate with the server 200 which stores a plurality of short clips and keywords for each of the plurality of short clips using at least one wired / wireless method. In addition, the communication unit 110 may communicate with the voice recognition server. Here, the communication unit 110 may include various communication chips such as a Wi-Fi chip, a Bluetooth chip, a wireless communication chip, an NFC chip.

When the user's spoken voice is received through the input unit 120 as described below, the communication unit 110 may transmit the received spoken voice to the voice recognition server and receive a keyword included in the spoken voice. Meanwhile, when the server 200 is used as a voice recognition server, the communication unit 110 may transmit the received spoken voice to the server 200 and receive a keyword from the server 200. However, the present invention is not limited thereto, and the electronic device 100 may obtain a keyword by performing voice recognition on the spoken voice of the user without communicating with the voice recognition server or the server 200.

Meanwhile, the communication unit 110 according to an embodiment of the present invention may transmit a signal requesting a short clip to the server 200 and receive a short clip from the server 200 in response to the request signal. Here, the request signal is a signal based on a keyword included in the user's spoken voice and information on content. For example, the request signal may be a signal including a keyword and information on content being output by the electronic device 100. As another example, the request signal may be transmitted to the server 200 continuously or simultaneously with a separate signal including a keyword and information on content being output by the electronic device 100.

The request signal according to an embodiment of the present disclosure may be a signal including information on content displayed on the electronic device 100, a keyword repeatedly output from the content, information on a user of the electronic device 100, and the like. Here, the keyword repeatedly output from the content may mean a keyword that is repeated more than a predetermined number of times during a predetermined time in the content output from the electronic device 100. Hereinafter, for the convenience of description, the content displayed on the electronic device 100 or the output content will be referred to as output content.
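The request signal described above can be sketched as a small serialized message. The disclosure does not define a wire format, so the JSON encoding and every field name below (`type`, `keyword`, `content`, `user`) are assumptions made for illustration.

```python
import json
from typing import Any, Dict, Optional


def build_request(
    keyword: str,
    content_info: Dict[str, Any],
    user_info: Optional[Dict[str, Any]] = None,
) -> str:
    """Serialize a hypothetical short clip request signal as JSON text."""
    payload: Dict[str, Any] = {
        "type": "short_clip_request",   # assumed message type label
        "keyword": keyword,             # keyword from the spoken voice
        "content": content_info,        # e.g. title, genre, broadcast time
    }
    if user_info is not None:
        # User information is optional in the described embodiments.
        payload["user"] = user_info
    return json.dumps(payload, ensure_ascii=False, sort_keys=True)
```

A call such as `build_request("goal", {"title": "Evening Match"})` yields a single message carrying both the keyword and the output content information, which corresponds to the first of the two transmission options described above.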

In addition, the communication unit 110 may receive a short clip from the server 200 in response to the above-described request signal.

In detail, when the short clip request signal is received from the electronic device 100, the server 200 may transmit a short clip corresponding to the request signal to the electronic device 100. However, the present invention is not limited thereto, and the server 200 may store information on a location where original content corresponding to the request signal is stored and time information corresponding to a short clip among the original content. For example, the server 200 may transmit the web address for playing the original content and the time information corresponding to the short clip among the original content to the electronic device 100. In this case, the electronic device 100 may access the server where the original content is stored based on the received web address, and play the section corresponding to the time information.

For example, the electronic device 100 may receive a web address for receiving specific content from the server 200 and time information on a section including a corresponding keyword in the specific content. In this case, the electronic device 100 may access the received web address to receive specific content, and reproduce and output only a specific section of the specific content based on time information.
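The handling of a received web address and time information can be sketched as follows. This is a minimal sketch under stated assumptions: `ClipInfo` and `playback_plan` are hypothetical names, and actual fetching and playback of the content are left out because the source does not describe the player.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ClipInfo:
    location: str                      # web address of the clip or original content
    start_sec: Optional[float] = None  # start of the section containing the keyword
    end_sec: Optional[float] = None    # end of that section


def playback_plan(info: ClipInfo) -> Dict[str, Any]:
    """Decide which part of the content at `info.location` to play."""
    if info.start_sec is None or info.end_sec is None:
        # No time information: the address is assumed to point at a
        # ready-made short clip, so it is played from the beginning.
        return {"url": info.location, "seek_to": 0.0, "stop_at": None}
    if info.end_sec <= info.start_sec:
        raise ValueError("time interval must have positive length")
    # Time information present: play only the specified section.
    return {"url": info.location, "seek_to": info.start_sec, "stop_at": info.end_sec}
```

The two branches correspond to the two cases in the text: receiving the short clip itself, or receiving the location of the original content together with the time interval to reproduce.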

The input unit 120 is a component for receiving a spoken voice of a user and converting it into audio data. In detail, the input unit 120 may be implemented as a microphone to receive a spoken voice of a user. However, the present invention is not limited thereto, and the input unit 120 may be provided in a remote control device (not shown) for controlling the electronic device 100 instead of the electronic device 100 to receive a spoken voice of a user.

In addition, when the electronic device 100 is implemented as a touch-based electronic device, the input unit 120 may be implemented in the form of a touch screen that forms a mutual layer structure with the touch pad. In this case, the input unit 120 may receive a keyword input through a touch screen in addition to the spoken voice.

The output unit 130 may output at least one of various contents and short clips. In more detail, the output unit 130 may include at least one of a display and a speaker. When the output unit 130 includes a display, the output unit 130 may display various content playback screens such as images, videos, text, and music, application execution screens including various contents, web browser screens, graphical user interface (GUI) screens, and the like.

In this case, the display may be implemented as a liquid crystal display panel (LCD), organic light emitting diodes (OLED), or the like, but is not limited thereto. In some cases, the display may be implemented as a flexible display or a transparent display.

In particular, the display may display the short clip received from the server 200.

Meanwhile, when the output unit 130 according to another embodiment of the present invention is implemented to include only a speaker, the output unit 130 may provide the received short clip as audio through the speaker. For example, when the electronic device 100 is implemented as a sound output device that does not have a display function, the output unit 130 may provide additional information about the received short clip as audio together with the audio signal of the short clip, or may provide only the audio signal of the short clip.

The processor 140 controls the overall operation of the electronic device 100.

In particular, when the user's spoken voice is received through the input unit 120, the processor 140 may transmit, through the communication unit 110, a signal requesting a short clip to the server 200 based on a keyword included in the received spoken voice and information on the content. In addition, the short clip received from the server 200 in response to the request signal may be output through the output unit 130.

In detail, the processor 140 may transmit information on the output content to the server 200. Here, the information on the output content may include a title, genre, broadcast time, broadcasting station information, and the like of the output content. Therefore, when the processor 140 transmits a short clip request signal to the server 200 based on at least one of the keyword and the information about the output content, the processor 140 may receive and provide the short clip associated with the keyword and the output content.

In this case, when the processor 140 transmits the short clip request signal to the server 200, the processor 140 may be provided with the short clip previously generated. Here, the pre-generated short clip may be a short clip generated from content different from the output content. For example, the content may be pre-generated content that is broadcast before the output content broadcast time.

However, the present invention is not limited thereto, and a short clip generated from the corresponding output content may also be received. According to an embodiment of the present disclosure, when the output content is broadcast content received in real time, the server 200 may also receive the broadcast content. If a short clip of the output content has already been generated when the processor 140 transmits the request signal, that short clip may also be a candidate. For example, if the broadcast of the output content started at least a preset time before the user's request, a short clip for the output content may already have been generated.

Meanwhile, the processor 140 may receive additional information about the short clip. In detail, the processor 140 may receive and provide a short clip and additional information about the short clip from the server 200. Here, the additional information about the short clip may be information including at least one of a title, a genre of the original content of the short clip, a broadcast time of the original content, a creation time of the short clip, a broadcaster of the original content, and a keyword.

In addition, the processor 140 may analyze the audio signal of the output content and transmit a signal for requesting a short clip associated with the keyword to the server 200 based on a keyword that is repeated more than a predetermined number of times for a predetermined time. Accordingly, the processor 140 may obtain a word repeated in the output content as a keyword, and transmit the keyword to the server 200 to receive a short clip associated with the keyword.
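The repeated-keyword condition can be sketched with a sliding time window. The following assumes the audio has already been converted into time-stamped words by some recognizer, which is outside this example; `repeated_keywords` and its parameters are hypothetical, and the sketch treats "at least `min_count` occurrences within `window_sec` seconds" as the trigger, which is one reading of the predetermined number and time.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Set, Tuple


def repeated_keywords(
    words: Iterable[Tuple[float, str]], window_sec: float, min_count: int
) -> Set[str]:
    """Return words seen at least `min_count` times within any `window_sec` span.

    `words` is an iterable of (timestamp_sec, word) pairs sorted by time.
    """
    recent: Dict[str, Deque[float]] = defaultdict(deque)
    hits: Set[str] = set()
    for timestamp, word in words:
        times = recent[word]
        times.append(timestamp)
        # Drop occurrences that fell out of the window ending at `timestamp`.
        while times and timestamp - times[0] > window_sec:
            times.popleft()
        if len(times) >= min_count:
            hits.add(word)
    return hits
```

Each word in the returned set would then be used as the keyword of an additional short clip request signal.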

In addition, the electronic device 100 may include a storage unit (not shown) for storing user information, and the processor 140 may transmit a request signal including the user information stored in the storage unit to the server 200. In this case, the processor 140 may receive and display a short clip associated with the user information. Here, the user information is information about a user of the electronic device 100 and may include an age group, a favorite genre, preferred content, a preferred broadcasting station, and the like. Therefore, when a plurality of short clips are associated with the keyword, the electronic device 100 may receive and display the short clip more suitable for the user based on the keyword and the user information.
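One way to picture selecting a clip that suits the user is a simple preference score. The disclosure does not state how user information is weighed, so the scoring rule and the dictionary keys below (`genre`, `station`, `favorite_genres`, `preferred_stations`) are assumptions made purely for illustration.

```python
from typing import Any, Dict, List


def rank_clips(
    clips: List[Dict[str, Any]], user: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Order candidate clips so those matching more user preferences come first."""

    def score(clip: Dict[str, Any]) -> int:
        points = 0
        if clip.get("genre") in user.get("favorite_genres", ()):
            points += 1
        if clip.get("station") in user.get("preferred_stations", ()):
            points += 1
        return points

    # sorted() is stable, so clips with equal scores keep their input order.
    return sorted(clips, key=score, reverse=True)
```

In this sketch the first element of the returned list would be the clip provided to the user; whether the ranking runs on the server 200 or on the electronic device 100 is left open here.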

FIG. 2B is a block diagram illustrating a detailed configuration of an electronic device 100 according to another embodiment of the present disclosure. According to FIG. 2B, the electronic device 100 includes the communication unit 110, the input unit 120, the output unit 130, the processor 140, the storage unit 150, the audio processor 160, and the video processor 170. A detailed description of parts overlapping with those shown in FIG. 2A among the elements shown in FIG. 2B will be omitted.

The processor 140 controls overall operations of the electronic device 100 using various programs stored in the storage unit 150. The processor 140 may include one or more of a central processing unit (CPU), a controller, an application processor (AP), a communication processor (CP), and an ARM processor, or may be defined by the corresponding term. In addition, the processor 140 may be implemented as a digital signal processor (DSP), as an SoC incorporating a content processing algorithm, or in the form of a field programmable gate array (FPGA).

In detail, the processor 140 includes the RAM 141, the ROM 142, the main CPU 143, the graphics processor 144, the first to n-th interfaces 145-1 to 145-n, and the bus 146.

The RAM 141, the ROM 142, the main CPU 143, the graphics processor 144, the first to n-th interfaces 145-1 to 145-n, and the like may be connected to each other through the bus 146.

The first to n-th interfaces 145-1 to 145-n are connected to the various components described above. One of the interfaces may be a network interface connected to an external device via a network.

The main CPU 143 accesses the storage 150 and performs booting using the operating system stored in the storage 150. Then, various operations are performed using various programs, contents, data, etc. stored in the storage 150.

The ROM 142 stores a command set for system booting. When a turn-on command is input and power is supplied, the main CPU 143 copies the O/S stored in the storage unit 150 to the RAM 141 according to the command stored in the ROM 142 and executes the O/S to boot the system. When booting is completed, the main CPU 143 copies various application programs stored in the storage unit 150 to the RAM 141 and executes the application programs copied to the RAM 141 to perform various operations.

The graphics processor 144 generates a screen including various objects such as icons, images, and text by using a calculator (not shown) and a renderer (not shown). The calculator (not shown) calculates attribute values such as the coordinate values, shapes, sizes, and colors with which each object is to be displayed according to the layout of the screen, based on the received control command. The renderer (not shown) generates screens of various layouts including the objects based on the attribute values calculated by the calculator. The screen generated by the renderer is displayed in the display area of the output unit 130.

The storage unit 150 stores various data such as an operating system (O/S) software module for driving the electronic device 100, various multimedia contents, various applications, various contents input or set during application execution, and the like. In particular, the storage unit 150 may store user information, for example, user preference information, age group, user profile information, and the like.

The audio processor 160 is a component that performs processing on audio data. The audio processor 160 may perform various processing such as decoding, amplification, noise filtering, and the like on the audio data. For example, the audio processor 160 may generate and provide a feedback sound corresponding to a case where the user preference information displayed at the channel zapping satisfies a predetermined criterion.

The video processor 170 is a component that performs processing on video data. The video processor 170 may perform various image processing such as decoding, scaling, noise filtering, frame rate conversion, resolution conversion, and the like on the video data.

FIG. 3 is a block diagram showing the configuration of a server 200 according to an embodiment of the present invention.

According to FIG. 3, the server 200 includes a communication unit 210, a storage unit 220, and a processor 230.

The communication unit 210 communicates with an external device according to various types of communication methods.

In particular, the communication unit 210 may communicate with the content provider 300 using at least one of wired and wireless methods. In detail, the communication unit 210 may receive content from the content provider 300. Here, the communication unit 210 may include various communication chips such as a Wi-Fi chip, a Bluetooth chip, a wireless communication chip, an NFC chip, and a tuner.

In addition, the communication unit 210 according to an embodiment of the present disclosure may communicate with the electronic device 100. In detail, the communication unit 210 may receive a short clip request signal transmitted by the electronic device 100 and transmit a short clip to the electronic device 100 in response thereto.

The storage unit 220 stores various data such as an operating system (O/S) software module for driving the server 200, various multimedia contents, various applications, various contents input or set during application execution, and the like.

In particular, the storage unit 220 may store original content, a plurality of short clips generated from the original content, and a plurality of keywords for each of the short clips.

According to an embodiment of the present invention, when the server 200 edits original content to generate a plurality of short clips, the server 200 may obtain at least one keyword according to audio signals included in the plurality of short clips. In this case, the server 200 may store the short clip and a keyword obtained from the short clip in the storage 220. For example, when the first and second keywords are obtained by analyzing audio signals included in the first short clip, the server 200 may store the first and second keywords together with the first short clip.

According to an embodiment of the present invention, the server 200 may group and store short clips by keyword. In this case, the short clips including an audio signal corresponding to the first keyword may be grouped and stored in the storage 220. Therefore, if the first keyword is included in the short clip request signal received from the electronic device 100, the server 200 may transmit the plurality of short clips grouped under the first keyword to the electronic device 100.
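The keyword-based grouping described above can be illustrated with a minimal sketch. It assumes a simple in-memory mapping from keyword to clip identifiers; the class name `ClipStore`, its methods, and the clip identifiers are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict


class ClipStore:
    """Hypothetical in-memory stand-in for the storage unit 220."""

    def __init__(self):
        self._by_keyword = defaultdict(list)  # keyword -> list of clip ids

    def add_clip(self, clip_id, keywords):
        # Group the clip under every keyword obtained from its audio signal.
        for keyword in keywords:
            self._by_keyword[keyword].append(clip_id)

    def clips_for(self, keyword):
        # Return every clip grouped under the requested keyword.
        return list(self._by_keyword.get(keyword, []))


store = ClipStore()
store.add_clip("clip-1", ["traffic information", "Seoul"])
store.add_clip("clip-2", ["traffic information"])
store.add_clip("clip-3", ["weather"])
print(store.clips_for("traffic information"))
```

With such a grouping, a request containing one keyword can be answered with a single lookup rather than a scan over all stored short clips.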

The processor 230 controls the overall operation of the server 200.

First, when the server 200 according to an embodiment of the present invention performs a voice recognition function, the processor 230 may analyze a spoken voice received from the electronic device 100 and obtain a keyword included in the spoken voice. The server 200 may transmit the keyword to the electronic device 100.

In addition, when the original content is received through the communication unit 210, the processor 230 may edit the received original content to generate a plurality of short clips. In detail, the processor 230 may edit only a specific section of the original content based on the voice detection algorithm. Here, the voice detection algorithm refers to an algorithm for detecting an audio signal including at least one keyword.

For example, the processor 230 may analyze the audio signal of the original content to detect a start point and an end point of the voice, and edit a section (EPD unit) between the start point and the end point to generate a short clip.

However, the present disclosure is not limited thereto, and the server 200 may also edit the original content to generate a short clip based on a preset time interval, a specific interval set by the content provider, a time interval set by an administrator of the server 200, or a user-requested time interval included in the short clip request signal.

According to an embodiment, if it is determined that the voice is terminated after the first detection of the voice in the broadcast content received in real time, the processor 230 may generate a short clip by editing the corresponding section in real time. In this case, the processor 230 may determine that the voice is terminated when the voice is not detected for more than a preset time or when a machine sound or noise is detected for more than the preset time. Thereafter, the processor 230 may store the generated short clip and the acquired keyword together in the storage 220. Therefore, the processor 230 may transmit a short clip to the electronic device 100 in response to the short clip request signal received from the electronic device 100.
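The end-of-voice decision described above can be sketched as follows. This is only an illustration under stated assumptions: the per-frame voice decision is assumed to come from some voice detector outside the sketch, the preset time is expressed as a number of frames, and the function name `detect_voice_sections` is hypothetical.

```python
def detect_voice_sections(frame_is_voice, max_silence_frames):
    """Return (start, end) frame index pairs of voiced sections (end exclusive).

    frame_is_voice: list of booleans, one per audio frame (assumed input).
    max_silence_frames: the preset time, expressed in frames (assumed unit).
    """
    sections = []
    start = None       # first voiced frame of the section currently open
    last_voice = None  # most recent voiced frame seen in that section
    for i, voiced in enumerate(frame_is_voice):
        if voiced:
            if start is None:
                start = i
            last_voice = i
        elif start is not None and i - last_voice > max_silence_frames:
            # Voice has been absent for more than the preset time: close the section.
            sections.append((start, last_voice + 1))
            start = None
    if start is not None:
        # The stream ended while a section was still open.
        sections.append((start, last_voice + 1))
    return sections


frames = [False, True, True, False, True, False, False, False, True, True]
print(detect_voice_sections(frames, max_silence_frames=2))
```

Each returned pair corresponds to one candidate section between a start point and an end point of the voice, from which a short clip could be cut.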

Meanwhile, the server 200 according to an embodiment of the present invention may, without generating a short clip from the original content, store in a database a web address from which the original content can be received and time information on a section of the original content that includes a specific keyword. In this case, when the short clip request signal is received from the electronic device 100, the server 200 may transmit, to the electronic device 100, the web address corresponding to the short clip request signal and the information on the section of the original content that includes the specific keyword. Therefore, instead of receiving the short clip from the server 200, the electronic device 100 may provide the short clip by outputting only the section of the original content that includes the specific keyword, based on the web address and the time information.
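A minimal sketch of this address-plus-time-information variant is given below. The record layout `SectionInfo`, the dictionary `SECTION_DB`, the example address, and both function names are assumptions introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SectionInfo:
    web_address: str  # where the original content can be received
    start_sec: float  # start of the section containing the keyword
    end_sec: float    # end of that section


# Hypothetical database: keyword -> sections of original content (no clip files).
SECTION_DB = {
    "traffic information": [
        SectionInfo("https://example.com/news/morning", 120.0, 165.0),
    ],
}


def lookup_sections(keyword):
    """Server side: answer a request with addresses and time information only."""
    return SECTION_DB.get(keyword, [])


def playback_plan(sections):
    """Device side: (address, seek position, duration) for each keyword section."""
    return [(s.web_address, s.start_sec, s.end_sec - s.start_sec) for s in sections]


print(playback_plan(lookup_sections("traffic information")))
```

In this arrangement the server stores no edited clip files; the device seeks into the original content and plays only the indicated section.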

Hereinafter, methods of providing a short clip according to various embodiments of the present invention will be described. For convenience of description, FIGS. 4 to 8 illustrate an embodiment in which the electronic device 100 includes a display, and the output content and the short clip are output through the display.

FIG. 4 is a diagram for describing a method of displaying a short clip associated with a keyword according to an exemplary embodiment.

According to FIG. 4, the electronic device 100 may receive a spoken voice of a user. In this case, the electronic device 100 may analyze the spoken voice of the user and obtain a keyword included in the spoken voice. For example, if the received speech of the user is 'tell me the current traffic information', the electronic device 100 may obtain 'traffic information' as a keyword. On the other hand, the electronic device 100 according to another embodiment of the present invention can also obtain a keyword included in the spoken voice by communicating with the voice recognition server or server 200.

In addition, the electronic device 100 may transmit a signal for requesting a short clip for the acquired keyword to the server 200. In this case, the server 200 may transmit a short clip for the keyword to the electronic device 100. In detail, the server 200 may transmit a specific short clip to the electronic device 100 based on the short clips generated from the original content up to the time the request signal is received from the electronic device 100 and the keywords for each short clip. For example, if the keyword included in the short clip request signal is 'traffic information', the server 200 transmits only the short clips having 'traffic information' as a keyword to the electronic device 100. In this case, the electronic device 100 may receive a short clip that was generated by editing a specific section of a news program transmitted from a content provider, that is, a broadcaster, and that has 'traffic information' as a keyword. Therefore, the received short clip may be image content including an audio signal corresponding to 'traffic information'.

Meanwhile, according to an embodiment of the present disclosure, the electronic device 100 may transmit a short clip request signal including user information to the server 200. In this case, the server 200 may transmit a short clip related to the keyword and the user information to the electronic device 100. For example, if the location of the electronic device 100 corresponds to 'Seoul' according to the user information, the server 200 may transmit, from among a plurality of short clips having 'traffic information' as a keyword, a short clip satisfying both 'traffic information' and 'Seoul' to the electronic device 100. Therefore, the electronic device 100 may display the short clip best suited to the user among the short clips generated in real time.

Meanwhile, according to an embodiment of the present disclosure, the electronic device 100 may provide an output mode and a short clip mode. The output mode may be a mode for continuously outputting only the output content regardless of whether a short clip is received from the server 200. In addition, the short clip mode may be a mode for displaying a short clip received from the server 200. The electronic device 100 may display the short clip by switching from the output mode to the short clip mode at the end of the output content (for example, during a commercial (CF) broadcast). However, the present invention is not limited thereto, and the switching between the output mode and the short clip mode may be performed in response to a user input. For example, when the user's spoken voice is received in the output mode, the electronic device 100 may automatically switch to the short clip mode and display the short clip received from the server 200. Also, the output mode and the short clip mode may be executed at the same time. For example, when a short clip is received from the server 200, the received short clip may be displayed on a portion of the output unit 130 so as to overlap the output content.

Hereinafter, a method of receiving a short clip based on the output content will be described.

FIG. 5 is a diagram illustrating a method of displaying a short clip associated with output content according to an exemplary embodiment.

According to FIG. 5, the electronic device 100 may include, in the short clip request signal, information about the output content in addition to a keyword obtained from a spoken voice of a user, and transmit the request signal to the server 200. In this case, the server 200 may transmit a specific short clip to the electronic device 100 based on the keyword and the information about the output content included in the short clip request signal.

Specifically, the information about the output content means information about the content that is output to the electronic device 100 and may be obtained from metadata about the output content. For example, the information on the output content may include a title, genre, broadcast time, broadcast station information, and the like of the output content. However, the present invention is not limited thereto, and the information on the content may be obtained through various methods. For example, additional information may be obtained by receiving information on content from an external server or performing OCR on a screen.

As illustrated in FIG. 5, when the user's spoken voice is “tell me about the batter of Team A,” the electronic device 100 may obtain at least one of 'Team A' and 'batter' as keywords. In addition, if the output content is a baseball game, the electronic device 100 may transmit, to the server 200, a short clip request signal including information about the output content (e.g., 'sports', 'baseball') and the keywords (e.g., 'Team A', 'batter'). In this case, the server 200 may transmit, to the electronic device 100, the short clips having 'sports', 'baseball', 'Team A', and 'batter' as keywords among the plurality of short clips. Accordingly, the electronic device 100 may receive from the server 200 and display an interview video of a batter of Team A, sports news about Team A, and the like. Meanwhile, as described above, the plurality of short clips received by the electronic device 100 may be image contents generated by editing a specific section of the original content transmitted by the broadcaster and received by the server 200.
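The combination of spoken keywords and output content information can be sketched as below. The request layout, the rule that a clip must carry every requested term, and the names `build_request` and `select_clips` are assumptions made for illustration; the disclosure does not fix a matching rule.

```python
def build_request(spoken_keywords, content_info):
    """Combine spoken keywords with output-content metadata into one request."""
    return {"keywords": list(spoken_keywords), "content_info": list(content_info)}


def select_clips(clips, request):
    """Return ids of clips whose keyword set covers every term in the request."""
    wanted = set(request["keywords"]) | set(request["content_info"])
    return [clip_id for clip_id, keywords in clips.items() if wanted <= set(keywords)]


# Hypothetical clip catalogue: clip id -> keywords stored for that clip.
clips = {
    "interview": {"sports", "baseball", "Team A", "batter"},
    "highlights": {"sports", "baseball", "Team B", "pitcher"},
    "drama-recap": {"drama", "Team A"},
}
request = build_request(["Team A", "batter"], ["sports", "baseball"])
print(select_clips(clips, request))
```

Requiring all terms narrows the result to clips related to both the question and the content being watched; a looser rule, such as ranking by the number of matching terms, would be an equally plausible reading.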

Hereinafter, a method of obtaining a keyword from an audio signal output by the electronic device 100 and receiving a short clip for the acquired keyword will be described.

According to FIG. 6, it may be assumed that the content being output by the electronic device 100 repeatedly outputs a specific word. In this case, in addition to the keyword obtained from the spoken voice of the user, the electronic device 100 may include the word repeatedly output in the output content in the short clip request signal and transmit the request signal to the server 200.

In detail, the electronic device 100 may transmit a keyword, which is repeated more than a predetermined number of times for a predetermined time, from the audio output through the speaker provided in the electronic device 100 to the server 200.

For example, if the output content is a travel information program about 'Spain', the electronic device 100 may analyze an audio signal of the output content and obtain, as keywords, 'Spain', 'Barcelona', and the like, which are repeatedly output. In this case, the server 200 may transmit short clips matching 'Spain' and 'Barcelona' among the plurality of short clips to the electronic device 100. Accordingly, the electronic device 100 may receive and display short clips about 'Spain' and 'Barcelona' from the server 200. Meanwhile, as described above, the electronic device 100 may include the information on the output content in the short clip request signal and transmit the request signal to the server 200. In this case, the electronic device 100 may receive a short clip generated by editing a specific section of a travel information program about 'Spain' and 'Barcelona'.
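Detecting a word that is repeated within a time window can be sketched as follows. The input is assumed to be a list of timestamped words produced by some speech recognizer applied to the output audio, and the threshold is treated here as inclusive; both points, like the function name `repeated_keywords`, are assumptions.

```python
from collections import Counter


def repeated_keywords(timed_words, now, window_sec, min_count):
    """Return words heard at least min_count times in the last window_sec seconds.

    timed_words: iterable of (timestamp_sec, word) pairs (assumed recognizer output).
    """
    recent = Counter(
        word for t, word in timed_words if now - window_sec <= t <= now
    )
    return sorted(word for word, count in recent.items() if count >= min_count)


heard = [
    (1, "Spain"), (5, "Barcelona"), (9, "Spain"), (12, "paella"),
    (15, "Barcelona"), (20, "Spain"), (70, "Spain"),
]
print(repeated_keywords(heard, now=30, window_sec=30, min_count=2))
```

The words returned by such a check could then be added to the short clip request signal alongside any keyword obtained from the spoken voice.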

Meanwhile, the electronic device 100 according to an embodiment of the present disclosure may display the short clip received from the server 200 as a thumbnail image. In this case, the short clip corresponding to the thumbnail image selected according to the user's input may be played.

Hereinafter, a specific method of displaying a short clip on the electronic device 100 will be described.

According to FIG. 7, the electronic device 100 may additionally receive information on the short clip from the server 200 and provide the received information together with the short clip.

Specifically, the additional information about the short clip may include at least one of the title 710 of the original content, the genre, the broadcast time 720 of the original content, the station information 730 of the original content, the creation time of the short clip, and a keyword. Here, the broadcast time of the original content may mean a time when the server 200 receives the content from the content provider 300, a time when the original content was generated, a time when the broadcast station transmits the original content, and the like. Also, the keyword of the short clip may mean, among the at least one keyword matched with the corresponding short clip, a keyword that matches a keyword included in the short clip request signal.

Meanwhile, as illustrated in FIG. 7, additional information about the short clip may be displayed when the selected short clip is reproduced according to a user input. However, the present invention is not limited thereto, and the electronic device 100 may display a plurality of short clips received from the server 200 as thumbnail images and simultaneously display additional information on the short clips.

Hereinafter, a method of displaying additional response information about a keyword included in a user's spoken voice will be described.

According to FIG. 8, the electronic device 100 may receive additional response information about a keyword acquired in the spoken voice of the user from an external server and display the additional response information together with the short clip. Here, the additional response information may include a search result 810 for the keyword, information on the keyword, and the like. However, the present invention is not limited thereto, and of course, additional response information regarding at least one of information on output content, user information, and a keyword repeated in the output content may be received and displayed from an external server.

For example, a search result obtained by using the genre of the output content as a search word may be received from an external server and displayed together with the short clip. Other additional response information may likewise be received from an external server and displayed.

According to the control method of the electronic device illustrated in FIG. 9, first, content is output (S910).

Subsequently, the user's spoken voice is received (S920).

Subsequently, when the spoken voice is received, the short clip request signal is transmitted to the server based on a keyword included in the received spoken voice and information about the content (S930).

Subsequently, the short clip is output based on the information about the short clip received from the server according to the request signal (S940).

Here, the information on the short clip includes at least one of a location where the short clip is stored and information on a time interval including the keyword. In operation S940, when the information about the short clip is received from the server according to the request signal, the short clip may be output based on the received information.

In addition, in operation S940, when additional information about the short clip is received, the additional information about the short clip is provided, wherein the additional information about the short clip may include at least one of a title of the original content, a genre of the original content, a broadcast time of the original content, a generation time of the short clip, broadcast station information of the original content, and a keyword.

In operation S940, additional information about the short clip may be provided as audio through a speaker.

Also, the electronic device may include at least one of a display and a speaker. In operation S930, a short clip request signal associated with a keyword may additionally be transmitted to the server based on the keyword being repeated at least a predetermined number of times for a predetermined time in the audio output through the speaker.

In operation S940, additional response information regarding the spoken voice may be provided together with the short clip based on the keyword included in the received spoken voice.

In operation S930, the request signal including the keyword and the user information may be transmitted to the server. In operation S940, a short clip related to the keyword and the user information may be received from the server and output.

In operation S930, the received spoken voice may be transmitted to the voice recognition server or the server described above, and the short clip request signal may be transmitted to the server based on the keyword received from the voice recognition server or the server and the information about the content.
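Operations S920 to S940 can be traced in a short sketch. Everything below is illustrative: keyword extraction is reduced to a lookup against a small hypothetical vocabulary, the server is replaced by the stub `fake_server`, and the request and response layouts are assumptions rather than a format given in the disclosure.

```python
def extract_keywords(utterance, vocabulary):
    """Naive stand-in for speech analysis: keep vocabulary terms found in the text."""
    lowered = utterance.lower()
    return [term for term in vocabulary if term.lower() in lowered]


def control_method(utterance, content_info, server, vocabulary):
    # S910 (content output) is assumed to be under way; content_info describes it.
    keywords = extract_keywords(utterance, vocabulary)              # S920
    request = {"keywords": keywords, "content_info": content_info}  # S930
    clip_info = server(request)                                     # reply to S930
    # S940: turn the received clip information into playback actions.
    return [
        f"play {info['location']} [{info['start']}-{info['end']}s]"
        for info in clip_info
    ]


def fake_server(request):
    """Hypothetical server stub that knows a single traffic clip."""
    if "traffic information" in request["keywords"]:
        return [{"location": "clips/traffic-0800", "start": 0, "end": 45}]
    return []


actions = control_method(
    "Tell me the current traffic information",
    ["news"],
    fake_server,
    vocabulary=["traffic information", "weather"],
)
print(actions)
```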

According to FIG. 10, first, the server 200 receives content from the content provider 300 (S1010). Hereinafter, the content received from the content provider 300 will be referred to as the original content. Meanwhile, the server 200 may receive the content from the content provider 300 in real time. If the content provider 300 is a broadcast station, the server 200 may receive a broadcast program broadcast in real time from the broadcast station as original content.

Subsequently, the server 200 generates a plurality of short clips based on the keywords of each of the received original contents (S1020).

Subsequently, the server 200 stores a plurality of generated short clips and keywords for each of the plurality of short clips (S1030).

In operation S1040, the electronic device 100 receives a user spoken voice.

Subsequently, the short clip request signal associated with the keyword included in the received speech voice is transmitted to the server 200 (S1050).

Subsequently, the electronic device 100 receives a short clip from the server (S1060).

Subsequently, the electronic device 100 outputs the received short clip (S1070).

Hereinafter, when the electronic device 100 does not have a display function, a method of providing a short clip through a speaker will be described.

FIG. 11 is a diagram for describing a method of providing a short clip through a speaker according to another embodiment of the present disclosure.

According to FIG. 11, the electronic device 100 may include only a speaker and no display as an output unit. In this case, the electronic device 100 may output and provide an audio signal of a short clip from the server 200. For example, when the short clip includes both a video signal and an audio signal as moving image content, the electronic device 100 may provide only an audio signal in the received short clip.

As illustrated in FIG. 11, when 'tell me the current weather' is received as a spoken voice, a short clip having 'current weather' as a keyword may be provided. In this case, as described above, location information of the electronic device 100 may additionally be taken into account to provide a short clip about the current weather of a specific region (for example, the current weather in New York). Also, since the electronic device 100 may not have a display, only the audio signal of the received short clip may be output.

In addition, when the additional information on the short clip is received as described above, the additional information on the short clip may be converted into an audio signal and provided. For example, when additional information about the short clip and the short clip is received from the server 200, the additional information about the short clip may be output first, and the audio signal included in the short clip may be sequentially output.

According to an embodiment of the present disclosure, the electronic device 100 may output only part of the additional information about the received short clip as audio. For example, when the title, genre, broadcast time, etc. of the original content are received as additional information about the short clip, the electronic device 100 may provide only the title of the original content as an audio signal and then output the audio signal of the received short clip.

In addition, when a plurality of short clips are received from the server 200, the electronic device 100 according to an embodiment of the present disclosure may sequentially provide a plurality of short clips based on a predetermined priority. For example, the electronic device 100 may output audio signals included in the plurality of short clips through the speaker in the order of generating the short clips.
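The speaker-only output order described above can be sketched as a playlist builder. The clip record layout, the choice of creation time as the priority, and the function name `audio_playlist` are assumptions; the spoken entries would still have to be passed to some text-to-speech step that is outside this sketch.

```python
def audio_playlist(clips, info_fields=("title",)):
    """Order clips by creation time and announce selected info before each one."""
    playlist = []
    for clip in sorted(clips, key=lambda c: c["created"]):
        spoken = ", ".join(
            str(clip["info"][field]) for field in info_fields if field in clip["info"]
        )
        if spoken:
            playlist.append(("speak", spoken))  # additional information first
        playlist.append(("audio", clip["id"]))  # then the clip's audio signal only
    return playlist


received = [
    {"id": "b", "created": 20, "info": {"title": "Evening News", "genre": "news"}},
    {"id": "a", "created": 10, "info": {"title": "Morning Weather"}},
]
print(audio_playlist(received))
```

Passing a different `info_fields` tuple, for example `("title", "genre")`, would announce more of the additional information before each clip.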

Therefore, even if the electronic device 100 does not have a display function, the user may receive the short clip and additional information about the short clip as an audio signal.

Meanwhile, the above-described methods according to various embodiments of the present disclosure may be implemented in the form of software, a program, or an application that can be installed in an existing electronic device, a server, or the like.

In addition, the above-described methods according to various embodiments of the present disclosure may be implemented by software upgrade or hardware upgrade of an existing electronic device or server.

Meanwhile, the above-described control method of an electronic device according to various embodiments of the present disclosure may be implemented as computer-executable program code that is stored in various non-transitory computer-readable media and provided to each server or device so as to be executed by a processor.

In addition, the method for controlling an electronic device according to various embodiments of the present disclosure described above may be performed by a computer program product including a computer-readable medium that contains a computer-readable program executed by a computer device. In addition, the computer-readable program may be stored in a computer-readable storage medium in a server, and the program may be implemented in a form downloadable to a computer device through a network.

The non-transitory readable medium refers to a medium that stores data semi-permanently and is readable by a device, not a medium storing data for a short time such as a register, a cache, a memory, and the like. Specifically, the various applications or programs described above may be stored and provided in a non-transitory readable medium such as a CD, a DVD, a hard disk, a Blu-ray disk, a USB, a memory card, a ROM, or the like.

In addition, although preferred embodiments of the present invention have been shown and described above, the present invention is not limited to the specific embodiments described above. Various modifications can of course be made by those of ordinary skill in the technical field to which the invention belongs without departing from the spirit of the invention as claimed in the claims, and such modifications should not be understood separately from the technical spirit or prospect of the present invention.

Claims

An electronic device comprising:

a communication unit configured to communicate with a server that stores information about a plurality of short clips and keywords for each of the plurality of short clips;

an output unit;

an input unit; and

a processor configured to, when a spoken voice of a user is received through the input unit, transmit a short clip request signal to the server based on a keyword included in the received spoken voice and information on content output from the output unit, and to output a short clip through the output unit based on information about the short clip received from the server in response to the request signal.
The electronic device of claim 1,

wherein the information about the plurality of short clips includes at least one of a location where the plurality of short clips are stored and information on a time interval including the keyword, and

wherein the processor is configured to, when the information about the short clip is received from the server in response to the request signal, output the short clip based on the received information.
The electronic device of claim 1,

wherein each of the plurality of short clips is video content or sound content generated by editing a section of specific content that includes a specific keyword.
The electronic device of claim 1,

wherein the processor is configured to, when additional information about the short clip is received, provide the additional information about the short clip, and

wherein the additional information about the short clip includes at least one of a title of original content, a genre of the original content, a broadcast time of the original content, a creation time of the short clip, broadcaster information of the original content, and the keyword.
The electronic device of claim 1,

wherein the output unit includes at least one of a display and a speaker.
The electronic device of claim 4,

wherein the output unit is implemented to include only a speaker, and

wherein the processor is configured to provide the additional information about the short clip as audio through the speaker.
The electronic device of claim 1,

wherein the output unit includes at least one of a display and a speaker, and

wherein the processor is configured to transmit, to the server, a short clip request signal associated with a keyword based on the keyword being repeated more than a preset number of times for a predetermined time in audio output through the speaker.
The electronic device of claim 1,

wherein the processor is configured to provide additional response information about the spoken voice together with the short clip based on the keyword included in the received spoken voice.
The electronic device of claim 1,

wherein the processor is configured to transmit the request signal including the keyword and user information to the server, and to receive a short clip associated with the keyword and the user information from the server.
The electronic device of claim 1,

wherein the processor is configured to, when the spoken voice is received, transmit the received spoken voice to a voice recognition server or the server, and transmit the short clip request signal to the server based on the keyword received from the voice recognition server or the server and the information on the content.
A control method of an electronic device communicating with a server that stores information on a plurality of short clips and keywords for each of the plurality of short clips, the method comprising:

outputting content;

receiving a spoken voice of a user;

when the spoken voice is received, transmitting a short clip request signal to the server based on a keyword included in the received spoken voice and information on the content; and

outputting a short clip based on information about the short clip received from the server in response to the request signal.
The method of claim 11,

wherein the information about the plurality of short clips includes at least one of a location where the plurality of short clips are stored and information on a time interval including the keyword, and

wherein the outputting comprises, when the information about the short clip is received from the server in response to the request signal, outputting the short clip based on the received information.
The method of claim 11,

wherein each of the plurality of short clips is video content or sound content generated by editing a section of specific content that includes a specific keyword.
The method of claim 11,

wherein the outputting of the short clip comprises, when additional information about the short clip is received, providing the additional information about the short clip, and

wherein the additional information about the short clip includes at least one of a title of original content, a genre of the original content, a broadcast time of the original content, a creation time of the short clip, broadcasting station information of the original content, and the keyword.
The method of claim 14,

wherein the outputting of the short clip comprises providing the additional information about the short clip as audio through a speaker.