KR20140053760A

KR20140053760A - Image processing apparatus and control method thereof, image processing system

Info

Publication number: KR20140053760A
Application number: KR1020130057290A
Authority: KR
Inventors: 이주영; 반석호; 박상신
Original assignee: 삼성전자주식회사
Priority date: 2013-05-21
Filing date: 2013-05-21
Publication date: 2014-05-08

Abstract

An image processing apparatus according to an embodiment of the present invention includes: an image processing unit that processes an image signal so that it is displayed as an image; a voice input unit that receives a user's utterance; a voice processing unit that performs a preset corresponding operation according to the voice command corresponding to the utterance; and a control unit that, when the corresponding operation performed by the voice processing unit does not match the utterance input to the voice input unit, provides the corresponding operation for the voice command in an adjustable form and controls the apparatus so that the corresponding operation matched to the utterance is performed according to the result of the adjustment.

Description

Image processing apparatus, control method thereof, and image processing system

The present invention relates to an image processing apparatus that processes an image signal, such as a broadcast signal received from outside, so that it is displayed as an image, to a control method thereof, and to an image processing system. More particularly, it relates to an image processing apparatus structured to recognize a user's voice command and execute the function or operation corresponding to that command, a control method thereof, and an image processing system.

An image processing apparatus processes an image signal or image data received from outside according to various image processing processes. It may display the processed image signal as an image on its own display panel, or output the processed image signal to another display apparatus that has a panel so that the image is displayed there. That is, an image processing apparatus is any apparatus capable of processing an image signal, whether or not it includes a panel capable of displaying an image. A TV is an example of the former, and a set-top box is an example of the latter.

As technology develops, functions continue to be added to and extended in image processing apparatuses, and in line with this trend various structures and methods have been proposed for inputting commands that reflect the user's intention. Conventionally, for example, when the user pressed a key or button on a remote controller, the remote controller transmitted a control signal to the image processing apparatus so that the operation desired by the user was executed. More recently, various configurations have been proposed that control the image processing apparatus in accordance with the user's intention, for example by having the apparatus detect the user's motion or utterance, analyze what was detected, and execute a corresponding operation.

An image processing apparatus according to an embodiment of the present invention includes: an image processing unit that processes an image signal so that it is displayed as an image; a voice input unit to which a user's utterance is input; a voice processing unit that processes a preset corresponding operation to be performed according to a voice command corresponding to the utterance; and a control unit that, when the corresponding operation by the voice processing unit does not match the utterance input to the voice input unit, provides the corresponding operation for the voice command in an adjustable manner and controls the corresponding operation matched to the utterance to be performed according to the adjustment result.

Here, the control unit may provide a UI image for adjusting the designated state of the corresponding operation for the voice command. When the corresponding operation designated for the voice command of a given utterance is adjusted from a first operation to a second operation through the UI image, the control unit may control the second operation to be performed when that utterance is input.

Here, the UI image may guide the user to speak, and when an utterance is input to the voice input unit following the guidance of the UI image, the control unit may select, as the second operation, the one of a plurality of preset operations that corresponds to the voice command of that utterance.

Here, the UI image may guide the user to operate a plurality of input buttons provided on a user input unit operated by the user, and the control unit may select, as the second operation, the one of the preset operations that is pre-assigned to the input button operated according to the guidance.

In addition, the UI image may include a list of the plurality of preset operations, and the control unit may select the operation chosen from the list as the second operation.
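
The adjustment from a first operation to a second operation can be pictured as rewriting one entry in a table that maps voice commands to preset operations. The following is a minimal sketch under that assumption. The class name `CommandMap`, its methods, and the command and operation strings are all hypothetical and do not come from the specification.

```python
class CommandMap:
    """Maps a recognized voice command (text) to one preset operation.

    Hypothetical illustration only; the specification does not define
    this structure or these names.
    """

    def __init__(self, preset_operations, defaults):
        self._preset = set(preset_operations)
        self._ops = {}
        for command, operation in defaults.items():
            self.assign(command, operation)

    def assign(self, command, operation):
        # Only operations from the preset list may be designated.
        if operation not in self._preset:
            raise ValueError(f"unknown operation: {operation!r}")
        self._ops[command] = operation

    def lookup(self, command):
        # Returns None when no operation is designated for the command.
        return self._ops.get(command)


command_map = CommandMap(
    preset_operations=["volume_up", "channel_up", "mute"],
    defaults={"louder": "channel_up"},  # mismatched first operation
)
# The user picks the second operation, for example from the on-screen list.
command_map.assign("louder", "volume_up")
```

After the reassignment, a later lookup of the same command yields the second operation, which mirrors the behavior described for the control unit.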

In addition, the UI image may allow the user to set a macro instruction that, in response to a single utterance, sequentially executes a plurality of operations respectively corresponding to a plurality of utterances.

Here, the control unit may execute the macro instruction when the utterance corresponding to the first of the plurality of preset operations included in the macro instruction is input.
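
The macro behavior can be sketched as follows: a macro is stored as an ordered sequence of voice commands, and the command for the first operation in the sequence acts as the trigger for the whole sequence. This is only an illustration under assumed names. `MacroRunner`, `define_macro`, `handle`, and the sample commands are hypothetical.

```python
class MacroRunner:
    """Runs single operations or whole macros from a recognized command.

    Hypothetical illustration only; names and data are assumptions.
    """

    def __init__(self, command_to_operation):
        self._ops = dict(command_to_operation)
        self._macros = {}  # trigger command -> ordered list of commands

    def define_macro(self, commands):
        commands = list(commands)
        if not commands:
            raise ValueError("a macro needs at least one command")
        for command in commands:
            if command not in self._ops:
                raise ValueError(f"unknown command: {command!r}")
        # The command of the first operation triggers the macro.
        self._macros[commands[0]] = commands

    def handle(self, command):
        """Return the operations to execute, in order, for this command."""
        if command in self._macros:
            return [self._ops[c] for c in self._macros[command]]
        if command in self._ops:
            return [self._ops[command]]
        return []


runner = MacroRunner({
    "tv on": "power_on",
    "channel seven": "tune_channel_7",
    "volume ten": "set_volume_10",
})
runner.define_macro(["tv on", "channel seven", "volume ten"])
```

With this setup, handling "tv on" yields all three operations in sequence, while handling "channel seven" alone yields only its own operation.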

In addition, the apparatus may further include a communication unit communicably connected to a server, and when the utterance is input the control unit may control the voice command corresponding to the utterance to be processed by either the voice processing unit or the server.

Here, the communication unit may communicate with a speech-to-text (STT) server that converts the utterance into a voice command in text form. When the utterance is input to the voice input unit, the control unit may transmit the voice signal of the utterance to the STT server and receive the voice command corresponding to the utterance from the STT server.

In addition, the control unit may control the voice command to be processed by the voice processing unit when the voice command is a short sentence, and by the server when the voice command is a conversational sentence.

In addition, the apparatus may further include a display unit that displays, as an image, the image signal processed by the image processing unit.

In addition, a control method of an image processing apparatus according to an embodiment of the present invention includes:

receiving a user's utterance;

performing a preset corresponding operation according to a voice command corresponding to the utterance; and

when the corresponding operation does not match the utterance, providing the corresponding operation for the voice command in an adjustable manner and setting the corresponding operation matched to the utterance to be performed according to the adjustment result.

Here, the setting step may include: providing a UI image for adjusting the designated state of the corresponding operation for the voice command; and, when the corresponding operation designated for the voice command of a given utterance is adjusted from a first operation to a second operation through the UI image, setting the second operation to be performed when that utterance is input.

Here, the UI image may guide the user to speak, and the setting step may include, when an utterance is input following the guidance of the UI image, selecting as the second operation the one of a plurality of preset operations that corresponds to the voice command of that utterance.

Here, the UI image may guide the user to operate a plurality of input buttons provided on a user input unit of the image processing apparatus, and the setting step may include selecting, as the second operation, the one of the preset operations that is pre-assigned to the input button operated according to the guidance.

In addition, the UI image may include a list of the plurality of preset operations, and the setting step may include selecting the operation chosen from the list as the second operation.

In addition, the UI image may allow the user to set a macro instruction that, in response to a single utterance, sequentially executes a plurality of operations respectively corresponding to a plurality of utterances.

Here, the method may further include executing the macro instruction when the utterance corresponding to the first of the plurality of preset operations included in the macro instruction is input.

In addition, the image processing apparatus may communicate with a server, and the step of performing the preset corresponding operation may include controlling the voice command corresponding to the utterance to be processed by either the image processing apparatus or the server.

Here, the image processing apparatus may communicate with an STT server that converts the utterance into the voice command in text form, and the step of receiving the user's utterance may include: transmitting the voice signal of the utterance to the STT server; and receiving the voice command corresponding to the utterance from the STT server.

In addition, the controlling step may include controlling the voice command to be processed by the image processing apparatus when the voice command is a short sentence, and by the server when the voice command is a conversational sentence.

In addition, an image processing system according to an embodiment of the present invention includes: an image processing apparatus that processes an image signal so that it is displayed as an image; and a server that communicates with the image processing apparatus. The image processing apparatus includes: a voice input unit to which a user's utterance is input; a voice processing unit that processes a preset corresponding operation to be performed according to a voice command corresponding to the utterance; and a control unit that, when the utterance is input through the voice input unit, controls the voice command corresponding to the utterance to be processed by either the voice processing unit or the server. When the corresponding operation by the voice processing unit does not match the utterance input to the voice input unit, the control unit provides the corresponding operation for the voice command in an adjustable manner and controls the corresponding operation matched to the utterance to be performed according to the adjustment result.

FIG. 1 is a block diagram of a display apparatus according to a first embodiment of the present invention;
FIG. 2 is a block diagram showing the interaction structure of the display apparatus of FIG. 1 and servers;
FIG. 3 is an example of a database, stored in the display apparatus or the interactive server of FIG. 2, of operations corresponding to voice commands;
FIGS. 4 to 6 are examples of UI images for setting voice commands in the display apparatus of FIG. 2;
FIG. 7 is an example of a sequence of a macro instruction that can be set in a display apparatus according to a second embodiment of the present invention; and
FIGS. 8 to 12 are examples of UI images for setting the macro instruction of FIG. 7.

Hereinafter, the present invention is described in detail with reference to the accompanying drawings. The following embodiments describe only the configurations directly related to the idea of the present invention and omit description of other configurations. This does not mean, however, that the omitted configurations are unnecessary when implementing an apparatus or system to which the idea of the present invention is applied.

FIG. 1 is a block diagram of an image processing apparatus 100 according to an embodiment of the present invention.

The following embodiment describes the case where the image processing apparatus 100 is a display apparatus structured to display images itself. However, the idea of the present invention also applies where the image processing apparatus 100 does not display images itself but outputs image signals or control signals to another display apparatus, and is therefore not limited to the embodiment described below. Although this embodiment describes the case where the image processing apparatus 100 is a TV, the implementation may accordingly be varied in many ways.

As shown in FIG. 1, the image processing apparatus 100, or display apparatus 100, according to this embodiment receives an image signal from an image source (not shown). The type and characteristics of the image signal that the display apparatus 100 can receive are not limited. For example, the display apparatus 100 may receive a broadcast signal transmitted from the transmission equipment (not shown) of a broadcasting station and tune the broadcast signal to display a broadcast image.

The display apparatus 100 includes: an image receiving unit 110 that receives an image signal from an image source (not shown); an image processing unit 120 that processes the image signal received by the image receiving unit 110 according to a preset image processing process; a display unit 130 that displays an image based on the image signal processed by the image processing unit 120; a communication unit 140 that communicates with an external device such as a server 10; a user input unit 150 operated by the user; a voice input unit 160 to which voice or sound from outside is input; a voice processing unit 170 that interprets and processes the voice or sound input to the voice input unit 160; a storage unit 180 in which data and information are stored; and a control unit 190 that controls the overall operation of the display apparatus 100.

The image receiving unit 110 receives an image signal or image data by wire or wirelessly and passes it to the image processing unit 120. The image receiving unit 110 may be provided in various forms according to the standard of the received image signal and the implementation of the display apparatus 100. For example, it may receive a radio frequency (RF) signal, or an image signal according to a standard such as composite video, component video, super video, SCART, high definition multimedia interface (HDMI), DisplayPort, unified display interface (UDI), or wireless HD. When the image signal is a broadcast signal, the image receiving unit 110 includes a tuner that tunes the broadcast signal by channel.

The image processing unit 120 performs various image processing processes on the image signal received by the image receiving unit 110. It outputs the processed image signal to the display unit 130 so that an image based on that signal is displayed on the display unit 130. For example, when the image receiving unit 110 tunes the broadcast signal to a particular channel, the image processing unit 120 extracts the video, audio, and additional data corresponding to that channel from the broadcast signal, adjusts them to a preset resolution, and displays them on the display unit 130.

The types of image processing process performed by the image processing unit 120 are not limited and may include, for example, decoding corresponding to the image format of the image data, de-interlacing that converts interlaced image data to progressive format, scaling that adjusts image data to a preset resolution, noise reduction for improving image quality, detail enhancement, and frame refresh rate conversion.

The image processing unit 120 is implemented as a system-on-chip (SOC) integrating these functions, or as an image processing board (not shown) in which individual components capable of independently performing each process are mounted on a printed circuit board, and is built into the display apparatus 100.

The display unit 130 displays an image based on the image signal output from the image processing unit 120. Its implementation is not limited, and it may be implemented using various display types such as liquid crystal, plasma, light-emitting diode, organic light-emitting diode, surface-conduction electron-emitter, carbon nano-tube, and nano-crystal.

The display unit 130 may include additional components depending on its implementation. For example, when the display unit 130 is of the liquid crystal type, it includes a liquid crystal display panel (not shown), a backlight unit (not shown) that supplies light to the panel, and a panel driving board (not shown) that drives the panel.

The communication unit 140 transmits and receives data so that the display apparatus 100 can communicate bidirectionally with the server 10. It connects to the server 10 over a wired or wireless wide area or local area network, or by a local connection, according to the communication protocol of the server 10.

The user input unit 150 passes various preset control commands or information to the control unit 190 according to the user's operation and input. It is implemented as a menu key or input panel provided on the outside of the display apparatus 100, or as a remote controller separate from the display apparatus 100. Alternatively, the user input unit 150 may be integrated with the display unit 130: when the display unit 130 is a touch screen, the user can pass a preset command to the control unit 190 by touching an input menu (not shown) displayed on the display unit 130.

The voice input unit 160 is implemented as a microphone and detects various sounds occurring in the environment outside the display apparatus 100. The sounds it detects include utterances by the user and sounds arising from various sources other than the user.

Among the various preset processes performed in the display apparatus 100, the voice processing unit 170 performs the processes for the voice or sound input to the voice input unit 160. Here, the "voice" processed by the voice processing unit 170 means the voice input to the voice input unit 160. When the image processing unit 120 processes an image signal, that signal may include audio data, and the audio data included in the image signal is processed by the image processing unit 120.

When voice or sound is input to the voice input unit 160, the voice processing unit 170 determines whether it is an utterance by the user or a sound arising from some other source. Various approaches can be applied to this determination, so it is not limited to a specific one. For example, the voice processing unit 170 may determine whether the input falls within the wavelength or frequency band corresponding to the human voice, or whether it matches a pre-designated profile of the user's voice.
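
As one illustration of the frequency-band approach mentioned above, the sketch below estimates the dominant frequency of a signal from its zero-crossing rate and checks it against a band. Everything here is an assumption made for illustration: the specification names no algorithm, the 85 to 255 Hz band is a commonly cited range for adult fundamental frequency rather than a value from the source, and the function names are hypothetical. Zero-crossing counting is only meaningful for clean, roughly periodic input, and a practical detector would be considerably more robust.

```python
import math

# Assumed band for the fundamental frequency of adult speech (not from the
# specification).
VOICE_BAND_HZ = (85.0, 255.0)


def estimate_frequency(samples, sample_rate):
    """Estimate the dominant frequency from the zero-crossing rate."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration = len(samples) / sample_rate
    # A periodic signal crosses zero twice per cycle.
    return crossings / (2.0 * duration)


def looks_like_speech(samples, sample_rate, band=VOICE_BAND_HZ):
    """Return True when the estimated frequency lies inside the band."""
    low, high = band
    return low <= estimate_frequency(samples, sample_rate) <= high


def tone(freq, sample_rate=16000, seconds=1.0):
    """Generate a pure sine tone, used here only as test input."""
    n = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]
```

A 150 Hz tone falls inside the assumed band and a 1000 Hz tone falls outside it, so the first is accepted and the second rejected.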

When the voice processing unit 170 determines that a user's utterance has been input, it processes a preset corresponding operation to be performed according to the voice command corresponding to that utterance. Here, the voice command means the content of the user's utterance. This is described in detail later.

In this embodiment, the voice processing unit 170 is described as a component separate from the image processing unit 120. However, this is a functional classification adopted for convenience, to describe the embodiment more clearly, and does not mean that the image processing unit 120 and the voice processing unit 170 are necessarily separate in a display apparatus 100 implementing the idea of the present invention. That is, the display apparatus 100 may include a signal processing unit (not shown) that integrates the image processing unit 120 and the voice processing unit 170.

The storage unit 180 stores data, without limitation as to type, under the control of the control unit 190. It is implemented as a non-volatile memory such as a flash memory or a hard disk drive. The storage unit 180 is accessed by the control unit 190, the image processing unit 120, the voice processing unit 170, and so on, which read, write, modify, delete, and update data.

When a user's utterance is input through the voice input unit 160, the control unit 190 controls the voice processing unit 170 to process the input utterance. At this time, the control unit 190 determines whether the voice command corresponding to the utterance is a short sentence or a conversational sentence, and controls the voice command to be processed by the voice processing unit 170 or the server 10 according to the result. Specifically, if the voice command is a short sentence, the control unit 190 has it processed by the voice processing unit 170. If it is a conversational sentence, the control unit 190 transmits it to the server 10 through the communication unit 140 so that it can be processed by the server 10.

FIG. 2 is a block diagram showing the interaction structure of the display apparatus 100 and servers 20 and 30.

As shown in FIG. 2, the display apparatus 100 includes the communication unit 140, the voice input unit 160, the voice processing unit 170, and the control unit 190. These components are as described above with reference to FIG. 1. Here, the communication unit 140 is connected to a speech-to-text (STT) server 20 that converts the user's utterance into a voice command, and to an interactive server 30 that analyzes the voice command to determine the corresponding operation.

When the STT server 20 receives a voice signal, it analyzes the waveform of the signal and generates the content of the signal as text. When the STT server 20 receives the voice signal of a user's utterance from the display apparatus 100, it converts it into a voice command.

The interactive server 30 includes a database of the various operations of the display apparatus 100 that correspond to voice commands. It analyzes the voice command received from the display apparatus 100 and, according to the analysis result, transmits to the display apparatus 100 a control signal for performing the operation corresponding to that voice command.

When a user's utterance is input to the voice input unit 160, the control unit 190 transmits the voice signal of the utterance to the STT server 20 and receives the voice command corresponding to the utterance from the STT server 20.

The control unit 190 determines whether the voice command received from the STT server 20 is a short sentence or a conversational sentence. If the voice command is a short sentence, the control unit 190 has it processed by the voice processing unit 170; if it is a conversational sentence, the control unit 190 has it processed by the interactive server 30.

If the voice command is a short sentence, the voice processing unit 170, under the control of the control unit 190, searches the database stored in the storage unit 180 to identify the function or operation of the display apparatus 100 that corresponds to the voice command. The control unit 190 then controls the identified operation to be executed.

If, on the other hand, the voice command is a conversational sentence, the control unit 190 transmits the voice command to the interactive server 30. The interactive server 30 analyzes the voice command received from the display apparatus 100 to identify the operation of the display apparatus 100. The interactive server 30 then transmits a control signal indicating the identified operation to the display apparatus 100, so that the display apparatus 100 executes the operation according to the control signal.
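The routing between local processing and the interactive server can be sketched as follows. This is only an illustrative sketch: the description does not say how a short sentence is distinguished from a conversational one, so the word-count test in `is_short_sentence`, the threshold `max_words`, and the handler names are all assumptions introduced here.

```python
from typing import Callable


def is_short_sentence(command: str, max_words: int = 2) -> bool:
    """Hypothetical classifier: treat a command of at most `max_words` words as short.

    The patent does not specify the criterion; this is an assumed stand-in.
    """
    return len(command.split()) <= max_words


def route_command(
    command: str,
    local_handler: Callable[[str], object],
    server_handler: Callable[[str], object],
    classifier: Callable[[str], bool] = is_short_sentence,
) -> object:
    """Send short sentences to the local handler and everything else to the server."""
    if classifier(command):
        return local_handler(command)   # role of the voice processing unit 170
    return server_handler(command)      # role of the interactive server 30
```

Passing the classifier as a parameter keeps the routing logic independent of whichever criterion an actual implementation uses.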

In this way, the preset corresponding operation of the display apparatus 100 is executed according to the user's utterance.

The reason the processing entity for a voice command is selected according to whether the command is a short sentence or a conversational sentence may lie in the system load and processing capability of the display apparatus 100. Because a conversational sentence is natural language, it is relatively difficult to mechanically extract the operation the user wants from such a voice command. Analyzing a conversational voice command with the limited resources of the display apparatus 100 may therefore not be easy, so conversational voice commands are processed by the interactive server 30, which allows utterances with a wide variety of content to be handled.

However, this structure admits various design modifications, and the display apparatus 100 may itself perform the process of at least one of the STT server 20 and the interactive server 30. For example, the display apparatus 100 may itself, rather than the servers 20 and 30, perform the process of converting a user's utterance into a voice command or the process of analyzing a conversational voice command.

Under this structure, the control unit 190 controls the process of identifying the operation corresponding to the voice command of a user's utterance to be handled by the voice processing unit 170 or the interactive server 30. The present embodiment describes the configuration in which the control unit 190 controls the voice processing unit 170 to identify the operation of the display apparatus 100 corresponding to a voice command. The configuration in which the interactive server 30 identifies that operation can be obtained by applying the embodiment described below, so a detailed description is omitted.

FIG. 3 is an exemplary diagram of a database 210, stored in the display apparatus 100 or the interactive server 30, of operations corresponding to voice commands.

As shown in FIG. 3, the storage unit 180 stores a database 210 that associates voice commands corresponding to user utterances with the various functions or operations that the display apparatus 100 can perform. Here, an "operation" means any form of operation or function that the display apparatus 100 can perform and supports.

The control unit 190 can search the database 210 with a given voice command to determine which operation corresponds to it.

The database 210 in this embodiment illustrates only one of the possible principles or schemes for building such a database, and does not limit the spirit of the present invention. In addition, although the database 210 in the drawing shows one command corresponding to one operation, this is only for convenience in briefly describing the embodiment. In practice, a plurality of commands in the database 210 may correspond to a single operation. The numbers in the database 210 in the drawing are likewise assigned only for convenience of distinction.

For example, if the voice command corresponding to the user's utterance is "turn on", the control unit 190 searches the database 210 with the voice command "turn on" and finds that the corresponding operation is "system power on".

At this point, the control unit 190 may selectively perform the operation in consideration of the current state of the display apparatus 100. If the display apparatus 100 is currently powered on, the control unit 190 does not perform the "system power on" operation; if the display apparatus 100 is currently powered off, the control unit 190 controls the display apparatus 100 so that the system power is turned on.

As another example, if the user utters "it's noisy" while an image is being displayed on the display apparatus 100, the control unit 190 can identify from the database 210 that the operation corresponding to the voice command "it's noisy" is "mute". The control unit 190 then sets the sound volume of the currently displayed image to 0, so that the mute operation is performed.

As yet another example, if the user utters "I can't hear it" while an image is being displayed on the display apparatus 100, the control unit 190 finds from the database 210 that the operation corresponding to the command "I can't hear it" is "raise the current sound volume by 5 levels". The control unit 190 then raises the sound volume of the currently displayed image by 5 levels.

In this way, the control unit 190 can control the corresponding operation to be performed according to the user's utterance.
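The lookup described above, including the check against the current device state, might be sketched as below. The dictionary `COMMAND_DB`, the operation names, and the `DisplayState` class are hypothetical stand-ins for the database 210 and the state of the display apparatus 100; the actual layout of the database is not given in the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DisplayState:
    """Assumed minimal device state; field names are illustrative."""
    powered_on: bool = False
    volume: int = 10


# Hypothetical stand-in for database 210: voice command -> operation name.
COMMAND_DB = {
    "turn on": "power_on",
    "it's noisy": "mute",
    "I can't hear it": "volume_up_5",
}


def execute(command: str, state: DisplayState, db: dict = COMMAND_DB) -> Optional[str]:
    """Look up the command and perform its operation; return None if nothing was done."""
    operation = db.get(command)
    if operation is None:
        return None                  # command not in the database
    if operation == "power_on":
        if state.powered_on:
            return None              # already on, so the operation is skipped
        state.powered_on = True
    elif operation == "mute":
        state.volume = 0             # mute is modeled as volume 0, as in the text
    elif operation == "volume_up_5":
        state.volume += 5
    return operation
```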

However, because vocalization methods and patterns vary from user to user, the voice recognition logic of the STT server 20 or the voice processing unit 170 may not always produce an accurate result when interpreting every user's speech.

For example, when the user utters "turn on", the STT server 20 may convert the voice signal of the utterance into a voice command with different content rather than "turn on". If the converted voice command is not in the database 210, the control unit 190 cannot perform an operation corresponding to it.

Alternatively, the converted voice command may be in the database 210 but differ from the word the user actually spoke. For example, if the user utters "turn on" but the converted voice command is "turn off", the control unit 190 will determine that the corresponding operation is "system power off". As a result, the system power of the display apparatus 100 is turned off in response to the user's "turn on" utterance, contrary to the user's intention.

In view of this, the present embodiment proposes the following method.

When the corresponding operation does not match the user's utterance input to the voice input unit 160, the control unit 190 allows the user to adjust, through the user input unit 150, the operation corresponding to the voice command. Thereafter, when the same utterance is input to the voice input unit 160, the control unit 190 controls the operation matched to that utterance to be performed according to the adjustment result.

Specifically, when a preset event occurs, the control unit 190 provides a user interface (UI) image for adjusting the mapping between voice commands and corresponding operations in the database 210. Here, the preset event may be the occurrence of a command requesting the UI image, caused by the user's manipulation of the user input unit 150 or by the user's utterance.

In an initial state in which a first operation is assigned to a given first command, if the operation corresponding to the first command is adjusted through the UI image to a second operation different from the first operation, the control unit 190 updates the database 210 accordingly. Thereafter, when the voice command corresponding to an utterance from the user is the first command, the control unit 190 performs the second operation, not the first operation, based on the updated database 210.

Likewise, in an initial state in which the first operation is assigned to the first command, if a second command, which is a new voice command, is designated to correspond to the first operation, the control unit 190 updates the database 210 accordingly. Thereafter, when the voice command corresponding to an utterance from the user is the first command or the second command, the control unit 190 performs the first operation based on the updated database 210.
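The two kinds of database update described above (reassigning an existing command to a different operation, and adding a new command for an existing operation) can be illustrated with a plain dictionary. The function names `reassign` and `add_alias` and the operation strings are hypothetical; they only model the mapping, not any actual storage format of the database 210.

```python
def reassign(db: dict, command: str, new_operation: str) -> None:
    """Point an existing command (the first command) at a different (second) operation."""
    if command not in db:
        raise KeyError(f"unknown command: {command!r}")
    db[command] = new_operation


def add_alias(db: dict, new_command: str, existing_command: str) -> None:
    """Register a new (second) command for the operation of an existing (first) command."""
    db[new_command] = db[existing_command]


# Illustrative scenario: the recognizer keeps converting this user's "turn on"
# into "turn off", so the user remaps "turn off" to the power-on operation.
db = {"turn off": "power_off", "turn on": "power_on"}
reassign(db, "turn off", "power_on")
add_alias(db, "switch it on", "turn on")
```

After these updates, both the first command and the newly added command resolve to the same operation, which mirrors the behavior described for the first and second commands.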

In this way, the voice recognition operation can be corrected to properly reflect the user's intention.

A method of changing voice command settings through the UI image is described below.

FIGS. 4 to 6 are exemplary diagrams of UI images 220, 230, and 240 for setting a voice command.

As shown in FIG. 4, the user manipulates the user input unit 150 to request that the control unit 190 display the UI image 220 in order to change the setting of the voice command corresponding to an utterance. In response, the control unit 190 displays the UI image 220.

The UI image 220 includes information guiding the user to speak, in order to identify the utterance and voice command to which the user's setting is to be applied. The user accordingly utters the voice command while the UI image 220 is displayed.

When the user's utterance is input through the voice input unit 160 while the UI image 220 is displayed, the control unit 190 has the utterance converted into a voice command by the voice processing unit 170 or the STT server 20.

As shown in FIG. 5, the control unit 190 displays a UI image 230 that guides the user to designate the desired operation, in order to identify, among the various operations of the display apparatus 100 in the database 210 (see FIG. 3), the operation that should correspond to the input voice command.

The UI image 230 lets the user select the operation that corresponds to the voice command of the utterance input while the preceding UI image 220 (see FIG. 4) was displayed.

For example, consider a case where the user utters "turn on" while the UI image 220 (see FIG. 4) is displayed. Following the guidance of the UI image 230, the user presses the power button 151 on the user input unit 150, which is implemented as a remote controller. The control unit 190 associates the voice command converted from the user's utterance with the power button 151 operated by the user, and updates the database 210 (see FIG. 3).

Here, because the power button 151 is a toggle, pressing it produces one of two results: turning the power on or turning it off. In this case, the UI image 230 may additionally provide an option for selecting between power-on and power-off.

As another example, consider a case where the user utters the voice command "lower the volume" while the UI image 220 (see FIG. 4) is displayed. Following the guidance of the UI image 230, the user presses the volume-down button 152 on the user input unit 150. The control unit 190 then adjusts the setting so that the operation of the volume-down button 152 is performed in response to the voice command "lower the volume".

That is, even if the user's utterance is converted into a voice command that differs from what was actually said, the user can designate or modify the operation corresponding to that voice command, so that an operation matching the user's intention is ultimately performed.

As shown in FIG. 6, in another embodiment, when an utterance is input while the UI image 220 (see FIG. 4) is displayed, the control unit 190 may display a UI image 240 containing a list of a plurality of preset operations, from which the operation corresponding to the voice command of the utterance can be selected.

The format in which the UI image 240 displays the list can be varied. For example, a plurality of operation items may be displayed in a preset order in a scrolling format, or several representative items may be displayed so that, when one of them is selected, its sub-items pop up or branch out in a tree format.

By this method, in a state where a specific voice command is assigned to a first operation in the database, the control unit 190 may reassign that voice command to a second operation different from the first operation, or may add a new voice command that corresponds to the first operation.

The first operation and the second operation may also be operations that adjust the numerical level of the same function, differing only in the amount of adjustment.

For example, consider a case where the voice command of the user's utterance is "turn the sound down" and the corresponding operation lowers the current sound volume by 7 levels. When the existing 7 levels are adjusted to 5 levels through the UI image described above, the control unit 190 updates the database with the adjusted value.

Thereafter, when the utterance "turn the sound down" is input, the control unit 190 lowers the current sound volume by 5 levels.
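The level adjustment in this example can be modeled by storing the amount of adjustment alongside the function name. The tuple layout, the names `set_level` and `apply_command`, and the clamping at zero are assumptions made for this sketch only.

```python
def set_level(db: dict, command: str, level: int) -> None:
    """Change only the amount of adjustment for a command; the function stays the same."""
    function, _old_level = db[command]
    db[command] = (function, level)


def apply_command(db: dict, command: str, volume: int) -> int:
    """Return the new volume after applying the command's operation."""
    function, level = db[command]
    if function == "volume_down":
        return max(0, volume - level)   # assumed: volume never goes below 0
    if function == "volume_up":
        return volume + level
    raise ValueError(f"unsupported function: {function!r}")


# Hypothetical entry: "turn the sound down" initially lowers the volume by 7 levels.
level_db = {"turn the sound down": ("volume_down", 7)}
before = apply_command(level_db, "turn the sound down", 20)   # 20 - 7
set_level(level_db, "turn the sound down", 5)                 # user adjusts 7 -> 5
after = apply_command(level_db, "turn the sound down", 20)    # 20 - 5
```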

In this way, updating the database through the UI image can reflect the intentions of various users.

Meanwhile, the control unit 190 can set a macro instruction in which a plurality of operations, each corresponding to one of a plurality of utterances, are executed sequentially in response to a single utterance.

FIG. 7 is an exemplary diagram of a macro instruction sequence according to the present embodiment.

As shown in FIG. 7, the control unit 190 may provide settings for a macro instruction that links a plurality of operations and executes them sequentially, and the user can configure these settings through a UI image.

The UI image can be implemented in various forms; for example, it may let the user select and designate the operation for each step of the sequence, in order, from a list of preset operations.

Consider, as an example, a user setting a macro instruction that automatically executes a sleep reservation and an alarm setting at once. The sequence of operations in this case includes selecting the sleep reservation function (310), setting the time at which the system power of the display apparatus 100 is turned off (320), selecting the alarm setting function (330), setting the time at which the alarm sounds (340), and setting the end of the sequence (350).

When the control unit 190 receives a request to set such a macro instruction, through the user's manipulation of the user input unit 150 or an utterance through the voice input unit 160, it displays a UI image for setting the macro instruction.

FIGS. 8 to 12 are exemplary diagrams of UI images 410, 420, 430, 440, and 450 for setting a macro instruction.

As shown in FIG. 8, the control unit 190 displays a UI image 410 for selecting the first operation of the macro instruction. The UI image 410 presents a plurality of operations for selection, and the user can select the first operation of the macro instruction by manipulating the user input unit 150 or by uttering the voice command corresponding to the operation. In the present embodiment, the user selects the "sleep reservation" operation through the UI image 410.

As shown in FIG. 9, when the user selects "sleep reservation", the control unit 190 displays a UI image 420 for designating how much time should elapse before the power of the display apparatus 100 is turned off.

The UI image 420 presents a plurality of preset example times for selection. Alternatively, the UI image 420 may let the user enter the time by voice or through the user input unit 150.

As shown in FIG. 10, the control unit 190 displays a UI image 430 that lets the user choose whether to finish setting the macro instruction with the operation sequence configured so far, or to add further operations and continue setting the macro instruction.

Here the user can select "Complete" to finish setting the macro instruction, or select "Continue".

As shown in FIG. 11, when the user selects "Continue" in the UI image 430 (see FIG. 10), the control unit 190 displays a UI image 440 for selecting the next operation of the macro instruction. The format of the UI image 440 is broadly similar to that of FIG. 8.

The user selects the "alarm setting" operation through the UI image 440.

As shown in FIG. 12, following the selection of the "alarm setting" operation, the control unit 190 displays a UI image 450 for designating the time at which the alarm sounds.

With the UI image 450 displayed, the user can designate the time at which the alarm sounds by entering numbers through the user input unit 150 or by speaking the numbers.

When this series of setting operations is finished, the control unit 190 displays the UI image 430 shown in FIG. 10. When the user selects "Complete", the control unit 190 stores, in the database of the storage unit 180, a macro instruction that executes the operations in the sequence shown in FIG. 7.

Thereafter, when the user utters the voice command corresponding to the first operation of the macro instruction, that is, the voice command for "sleep reservation", the control unit 190 sequentially executes the plurality of operations as set in the macro instruction. Alternatively, a new voice command for executing the macro instruction may be set during the macro instruction setup.
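Building and running such a macro instruction might look like the sketch below. The `MacroBuilder` class, the `run_macro` function, the operation names, and the example values "30 min" and "07:00" are all hypothetical; the description specifies only that the operations are chosen step by step, stored, and later executed in order by one utterance.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class MacroBuilder:
    """Collects (operation, argument) steps, mimicking the step-by-step UI flow."""
    steps: list = field(default_factory=list)

    def add(self, operation: str, argument: Optional[str] = None) -> "MacroBuilder":
        self.steps.append((operation, argument))   # the "Continue" path adds a step
        return self

    def complete(self) -> tuple:
        return tuple(self.steps)                   # the "Complete" path freezes the sequence


def run_macro(macros: dict, command: str, perform: Callable[[str, Optional[str]], None]) -> bool:
    """Run every step of the macro triggered by `command`; return False if none exists."""
    steps = macros.get(command)
    if steps is None:
        return False
    for operation, argument in steps:
        perform(operation, argument)
    return True


# Illustrative values only: power off after 30 minutes, alarm at 07:00.
macros = {
    "sleep reservation": MacroBuilder()
    .add("sleep_reservation", "30 min")
    .add("alarm_setting", "07:00")
    .complete()
}
```

Keying the stored macro by its trigger command covers both variants in the text: the trigger can be the command of the first operation or a newly chosen command.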

In this way, a plurality of operations can be executed sequentially by a single simple voice command from the user.

The embodiments described above are merely illustrative, and those of ordinary skill in the art can make various modifications and equivalent other embodiments. Accordingly, the true scope of technical protection of the present invention should be determined by the technical idea of the invention set forth in the following claims.

100: image processing apparatus / display apparatus
110: image receiving unit
120: image processing unit
130: display unit
140: communication unit
150: user input unit
160: voice input unit
170: voice processing unit
180: storage unit
190: control unit

Claims

An image processing apparatus comprising:
an image processing unit that processes an image signal so that it is displayed as an image;
a voice input unit to which a user's utterance is input;
a voice processing unit that performs a preset corresponding operation according to a voice command corresponding to the utterance; and
a control unit that displays a UI image through which the corresponding operation for the voice command of the utterance input to the voice input unit can be adjusted, and that controls the corresponding operation matched to the utterance to be performed according to the result of the adjustment made through the UI image.

The image processing apparatus according to claim 1,
wherein, when the corresponding operation designated for the voice command of a given utterance is adjusted from a first operation to a second operation through the UI image, the control unit controls the second operation to be performed when that utterance is input.

The image processing apparatus according to claim 2,
wherein the UI image guides the user to speak, and
the control unit selects, as the second operation, one of a plurality of operations corresponding to the voice command of the utterance when the utterance is input to the voice input unit according to the guidance of the UI image.

The image processing apparatus according to claim 2,
wherein the UI image guides the user to operate one of a plurality of input buttons provided on a user input unit operated by the user, and
the control unit selects, as the second operation, the operation designated by the input button operated according to the guidance.

The image processing apparatus according to claim 2,
wherein the UI image includes a list of a plurality of preset operations, and
the control unit selects the operation chosen from the list as the second operation.

The image processing apparatus according to claim 1,
wherein the UI image allows setting a macro instruction by which a plurality of operations, each corresponding to one of a plurality of utterances, are executed sequentially in response to a single utterance.

The image processing apparatus according to claim 6,
wherein the control unit executes the macro instruction when the utterance corresponding to the first of the plurality of operations included in the macro instruction is input.

The image processing apparatus according to claim 1,
further comprising a communication unit that communicates with a server,
wherein, when the utterance is input, the control unit controls the voice command corresponding to the utterance to be processed by one of the voice processing unit and the server.

The image processing apparatus according to claim 8,
wherein the communication unit communicates with a speech-to-text (STT) server that converts the utterance into a text voice command, and
the control unit transmits the voice signal of the utterance to the STT server when the utterance is input to the voice input unit, and receives the voice command corresponding to the utterance from the STT server.

The image processing apparatus according to claim 8,
wherein the control unit controls the voice command to be processed by the voice processing unit when the voice command is a short sentence, and by the server when the voice command is a conversational sentence.

The image processing apparatus according to claim 1,
further comprising a display unit that displays, as an image, the image signal processed by the image processing unit.

A method of controlling an image processing apparatus,
Inputting a user's utterance;
Performing a predetermined corresponding operation in accordance with a voice command corresponding to the utterance;
Displaying the infant image adjustably providing the corresponding action for the voice command of the utterance and setting the corresponding action to be performed to match the utterance in accordance with the adjustment result through the infant image Wherein the control unit controls the image processing apparatus.

13. The method of claim 12,
Wherein, in the setting step,
And setting the second operation to be performed when the corresponding operation specified in correspondence with the voice command of the predetermined utterance through the infant image is adjusted from the first operation to the second operation so that the second operation is performed as the utterance is input Wherein the control unit controls the image processing apparatus.

14. The method of claim 13, wherein
the UI image guides the user to speak, and
the setting comprises, when an utterance is input in accordance with the guidance of the UI image, selecting as the second operation one of a plurality of operations corresponding to the voice command of that utterance.

15. The method of claim 13, wherein
the UI image guides the user to operate a plurality of input buttons provided on a user input unit of the image processing apparatus, and
the setting comprises selecting, as the second operation, an operation designated for the input button operated in accordance with the guidance.

16. The method of claim 13, wherein
the UI image includes a list of a plurality of predetermined operations, and
the setting comprises selecting, as the second operation, the operation selected from the list.

17. The method of claim 12, wherein
the UI image is provided for setting a macro instruction in which a plurality of the utterances and a plurality of the operations respectively corresponding to the plurality of utterances are set in sequence, so that the plurality of operations are executed sequentially in response to a single utterance.

18. The method of claim 17, further comprising
executing the macro instruction when the utterance corresponding to the first of the plurality of predetermined operations included in the macro instruction is input.
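
The macro behavior recited in the two claims above can be sketched as follows. This is only an illustration under stated assumptions: the registry class, its method names, and the example utterances and operations are hypothetical, and the trigger is assumed to be the utterance paired with the first operation in the sequence.

```python
class MacroRegistry:
    """Hypothetical store of macros keyed by their trigger utterance."""

    def __init__(self):
        self._macros = {}  # trigger utterance -> ordered list of operations

    def register(self, pairs):
        """Register a macro from ordered (utterance, operation) pairs.
        Assumption: the utterance of the first pair triggers the whole macro."""
        if not pairs:
            raise ValueError("a macro needs at least one (utterance, operation) pair")
        trigger = pairs[0][0]
        self._macros[trigger] = [operation for _, operation in pairs]

    def run(self, utterance, execute):
        """Execute, in order, every operation of the macro triggered by
        `utterance`. Returns the list of operations that were executed
        (empty if the utterance triggers no macro)."""
        operations = self._macros.get(utterance, [])
        for operation in operations:
            execute(operation)
        return operations


# Example data is illustrative only.
log = []
macros = MacroRegistry()
macros.register([
    ("movie mode", "dim_backlight"),
    ("surround", "enable_surround"),
    ("hdmi one", "select_hdmi1"),
])
executed = macros.run("movie mode", log.append)
not_a_trigger = macros.run("surround", log.append)
```

Only the utterance tied to the first operation starts the sequence in this sketch; an utterance tied to a later operation does not.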

19. The method of claim 12, wherein
the image processing apparatus communicates with a server, and
the performing of the predetermined corresponding operation comprises controlling the voice command corresponding to the utterance to be processed by one of the image processing apparatus and the server.

20. The method of claim 19, wherein
the image processing apparatus communicates with an STT server that converts the utterance into a voice command in text form, and
the inputting of the utterance of the user comprises:
transmitting a voice signal of the utterance to the STT server; and
receiving the voice command corresponding to the utterance from the STT server.

21. The method of claim 19, wherein
the controlling comprises controlling the voice command to be processed by the image processing apparatus when the voice command is a short sentence, and to be processed by the server when the voice command is a conversational sentence.
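
The routing recited in this claim can be sketched as below. The claim does not state how a short command is told apart from a conversational one; this sketch assumes, purely for illustration, that a command counts as short when it exactly matches an entry in a locally stored command list. The function name, parameters, and example commands are all hypothetical.

```python
def route_voice_command(command, local_commands, process_locally, process_on_server):
    """Send a text voice command to the apparatus or to the server.

    Assumption (not from the claims): a command is treated as short when,
    after lowercasing and whitespace normalization, it matches an entry in
    `local_commands`; anything else is treated as conversational.
    """
    normalized = " ".join(command.lower().split())
    if normalized in local_commands:
        return process_locally(normalized)
    return process_on_server(command)


# Example data is illustrative only.
LOCAL_COMMANDS = {"volume up", "channel down", "power off"}

first = route_voice_command(
    "Volume  Up", LOCAL_COMMANDS,
    lambda c: ("device", c), lambda c: ("server", c),
)
second = route_voice_command(
    "what is on tonight", LOCAL_COMMANDS,
    lambda c: ("device", c), lambda c: ("server", c),
)
```

Passing the two handlers as callables keeps the routing decision separate from how either side actually processes the command.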

22. An image processing system comprising:
an image processing apparatus that processes an image signal to be displayed as an image; and
a server that communicates with the image processing apparatus,
wherein the image processing apparatus comprises:
a voice input unit to which an utterance of a user is input;
a voice processing unit that performs a predetermined corresponding operation according to a voice command corresponding to the utterance; and
a control unit that, when the utterance is input through the voice input unit, controls the voice command corresponding to the utterance to be processed by one of the voice processing unit and the server,
wherein the control unit displays a user interface (UI) image through which the corresponding operation for the voice command of the utterance input to the voice input unit can be adjusted, and controls the corresponding operation matched to the utterance to be performed according to a result of the adjustment made through the UI image.