KR20130088637A

KR20130088637A - Display apparatus and voice recognition method thereof

Info

Publication number: KR20130088637A
Application number: KR1020120010008A
Authority: KR
Inventors: 김정근; 고창석; 최준식
Original assignee: 삼성전자주식회사
Priority date: 2012-01-31
Filing date: 2012-01-31
Publication date: 2013-08-08

Abstract

PURPOSE: A display apparatus and a voice recognition method thereof are provided that acquire first and second voice information corresponding to a first and a second voice signal respectively, add a predetermined weight to the voice signal with the larger measured sound pressure, select whichever of the first and second voice information best matches the utterance intent of the user, and perform a control operation corresponding to the selected voice information. CONSTITUTION: A first voice acquisition unit (110) converts the voice of a user into a first electrical voice signal. A second voice acquisition unit (120) converts the voice of the user into a second electrical voice signal at the same time as the first voice acquisition unit. A sound pressure measurement unit (130) measures the sound pressure magnitude of the first and the second voice signals. A controller (140) acquires first and second voice information corresponding to the first and the second voice signals respectively, and selects whichever of the two matches the utterance intent of the user. [Reference numerals] (110) First voice acquisition unit; (120) Second voice acquisition unit; (130) Sound pressure measurement unit; (140) Controller; (141) Voice recognition unit; (143) Determination unit; (145) Voice-letter conversion unit; (150) User input unit; (160) Display unit

Description

DISPLAY APPARATUS AND VOICE RECOGNITION METHOD THEREOF

The present invention relates to a display apparatus and a voice recognition method thereof, and more particularly, to a display apparatus and a voice recognition method thereof with an improved voice recognition rate.

Display apparatuses with a built-in microphone exist. Because such a conventional display apparatus has only a single built-in microphone, the accuracy of voice recognition varies greatly with the position from which the user speaks. That is, recognition accuracy is good when the user speaks near the built-in microphone and poor when the user speaks far from it. The accuracy of voice recognition is therefore not constant across the user's speaking positions.

Accordingly, an object of the present invention is to provide a display apparatus and a voice recognition method in which the accuracy of voice recognition is consistently improved regardless of the position from which the user speaks.

According to the present invention, the above object is achieved by a display apparatus comprising: a first voice acquisition unit that receives a voice spoken by a user and converts it into a first electrical voice signal; a second voice acquisition unit that, together with the first voice acquisition unit, receives the voice spoken by the user and converts it into a second electrical voice signal; a sound pressure measurement unit that measures the sound pressure magnitude of the first voice signal and the second voice signal; and a controller that recognizes the first voice signal and the second voice signal to obtain first voice information corresponding to the first voice signal and second voice information corresponding to the second voice signal, adds a predetermined weight to the voice signal whose sound pressure, as measured by the sound pressure measurement unit, is larger, and selects whichever of the first voice information and the second voice information fits the user's utterance intent.

The first voice information may include a plurality of first candidate voice information, the second voice information may include a plurality of second candidate voice information, each candidate voice information may include a level value indicating how well it fits the user's utterance intent, and the controller may select a plurality of candidate voice information fitting the user's utterance intent based on the level values and the weight.

The display apparatus may further include a display unit, and the controller may convert the selected plurality of candidate voice information into a plurality of candidate text information and display it on the display unit.

The display apparatus may further include a user input unit for entering a user selection, and the controller may select, from the plurality of candidate text information, the candidate text information indicated by the user selection.

The controller may further include a voice recognition unit that detects voice features of the first voice signal and the second voice signal and recognizes them as voice information.

The controller may further include a voice-text conversion unit that converts the voice information into text information.

The first voice acquisition unit may be provided inside the display apparatus, and the second voice acquisition unit may be provided as an external peripheral device of the display apparatus.

According to the present invention, the above object may also be achieved by a voice recognition method of a display apparatus, the method comprising: receiving a voice spoken by a user and converting it into a first electrical voice signal; receiving the voice spoken by the user and converting it into a second electrical voice signal; measuring the sound pressure magnitude of the first voice signal and the second voice signal; recognizing the first voice signal and the second voice signal to obtain first voice information corresponding to the first voice signal and second voice information corresponding to the second voice signal; and adding a predetermined weight to the voice signal with the larger measured sound pressure and selecting whichever of the first voice information and the second voice information fits the user's utterance intent.

The first voice information may include a plurality of first candidate voice information, the second voice information may include a plurality of second candidate voice information, each candidate voice information may include a level value indicating how well it fits the user's utterance intent, and the selecting step may further include selecting a plurality of candidate voice information fitting the user's utterance intent based on the level values and the weight.

The method may further include converting the selected plurality of candidate voice information into a plurality of candidate text information, and displaying the plurality of candidate text information.

The method may further include receiving a user selection that picks one of the displayed plurality of candidate text information.

The step of obtaining the voice information may include detecting voice features of the voice signal and recognizing them as voice information.

The first voice signal may be obtained through a first voice acquisition unit provided inside the display apparatus, and the second voice signal may be obtained through a second voice acquisition unit provided as an external peripheral device of the display apparatus.

As described above, the present invention provides a display apparatus and a voice recognition method in which the accuracy of voice recognition is consistently improved regardless of the position from which the user speaks.

FIG. 1 is a schematic diagram of a display apparatus according to an embodiment of the present invention;
FIG. 2 is a control block diagram of the display apparatus of FIG. 1;
FIG. 3 shows an embodiment of a display apparatus displaying a plurality of candidate text information;
FIG. 4 is a flowchart of a voice recognition control operation of a display apparatus according to an embodiment of the present invention; and
FIG. 5 is a flowchart of a voice recognition control operation of a display apparatus according to another embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that a person of ordinary skill in the art to which the invention pertains can readily practice them. The present invention may be embodied in many different forms and is not limited to the embodiments described herein. To describe the invention clearly, parts unrelated to the description are omitted, and identical or similar components are given the same reference numerals throughout the specification.

FIG. 1 is a schematic diagram of a display apparatus according to an embodiment of the present invention.

The display apparatus 100 may be implemented as a TV that displays broadcast images based on broadcast signals, broadcast information, or broadcast data transmitted from a broadcasting station. However, the idea of the present invention is not limited to this implementation of the display apparatus 100; besides a TV, the display apparatus 100 may be any of various kinds of devices capable of displaying images.

According to an embodiment of the present invention, the display apparatus 100 may be implemented as a smart TV. A smart TV can receive and display broadcast signals in real time and has a web browser function, so that while displaying a real-time broadcast it can also search for and consume various content over the Internet, and it can provide a convenient user environment for doing so. A smart TV also includes an open software platform and can therefore provide interactive services to the user. Through the open software platform, the smart TV can offer the user various content, for example applications that provide particular services. Such applications include, for example, applications for SNS, finance, news, weather, maps, music, movies, games, and e-books.

The display apparatus 100 includes a first voice acquisition unit 110 and a second voice acquisition unit 120, so that it can acquire the user's voice and perform voice recognition. The first voice acquisition unit 110 is built into the display apparatus 100, and the second voice acquisition unit 120 may be provided as an external peripheral device of the display apparatus 100. The second voice acquisition unit 120 is therefore connected to the display apparatus 100 by wire or wirelessly, and the voice signal it acquires can be transmitted to the display apparatus 100. The voice spoken by the user is acquired simultaneously by the first voice acquisition unit 110 and the second voice acquisition unit 120. The components of the display apparatus 100 are described in detail with reference to FIG. 2.

FIG. 2 is a control block diagram of the display apparatus of FIG. 1. Referring to FIG. 2, the display apparatus 100 includes the first voice acquisition unit 110, the second voice acquisition unit 120, a sound pressure measurement unit 130, a controller 140, a display unit 150, and a user input unit 160.

The first voice acquisition unit 110 and the second voice acquisition unit 120 acquire the user's voice and may be implemented as microphones. As described above, the first voice acquisition unit 110 may be provided inside the display apparatus 100, and the second voice acquisition unit 120 may be provided as an external peripheral device of the display apparatus 100.

The first voice acquisition unit 110 receives the voice spoken by the user, converts it into a first electrical voice signal, and outputs it to the sound pressure measurement unit 130. The second voice acquisition unit 120 receives the voice spoken by the user, converts it into a second electrical voice signal, and outputs it to the sound pressure measurement unit 130.

The sound pressure measurement unit 130 measures the sound pressure of the first voice signal and the second voice signal received from the first voice acquisition unit 110 and the second voice acquisition unit 120, respectively, and outputs each measured sound pressure value to the controller 140 together with the first voice signal and the second voice signal.
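The description does not state how the sound pressure measurement unit 130 computes its value. As an illustration only, the following Python sketch assumes that each voice signal is available as a sequence of normalized samples and uses the root-mean-square level in decibels as a stand-in for the measured sound pressure. The function names `rms_level_db` and `louder_signal`, and the choice of RMS, are assumptions and do not appear in the specification.

```python
import math
from typing import Sequence


def rms_level_db(samples: Sequence[float], eps: float = 1e-12) -> float:
    """Return the RMS level of one signal in dB relative to full scale.

    Assumption: samples are normalized to the range [-1.0, 1.0].
    `eps` keeps the logarithm defined for an all-zero signal.
    """
    if not samples:
        raise ValueError("signal must contain at least one sample")
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square + eps)


def louder_signal(first: Sequence[float], second: Sequence[float]) -> str:
    """Report which signal has the larger measured level.

    Ties are resolved in favor of the first signal; the specification
    does not say how ties are handled, so this is an arbitrary choice.
    """
    return "first" if rms_level_db(first) >= rms_level_db(second) else "second"
```

In this sketch the two level values would be passed on to the controller alongside the signals themselves, mirroring the output of the sound pressure measurement unit 130 described above.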

The controller 140 controls the overall state and operation of the display apparatus 100. It may be implemented, for example, as a CPU.

The controller 140 according to an embodiment of the present invention recognizes the first voice signal and the second voice signal to obtain first voice information corresponding to the first voice signal and second voice information corresponding to the second voice signal, adds a predetermined weight to the voice signal whose sound pressure, as measured by the sound pressure measurement unit 130, is larger, and selects whichever of the first voice information and the second voice information fits the user's utterance intent. The controller 140 may then perform a control operation corresponding to the selected voice information. In this embodiment, the user can enter a command through the first and second voice acquisition units 110 and 120, and the controller 140, through the voice recognition operation, selects the voice information corresponding to the command spoken by the user and performs the control operation corresponding to that voice information.

The controller 140 according to another embodiment of the present invention may convert the selected voice information into text information and display it on the display unit 150.

The controller 140 further includes a voice recognition unit 141, a determination unit 143, and a voice-text conversion unit 145.

The voice recognition unit 141 detects the voice features of the received first voice signal and recognizes them as first voice information. It likewise detects the voice features of the received second voice signal and recognizes them as second voice information. The recognition function of the voice recognition unit 141 may be performed with a known voice recognition algorithm. The voice recognition unit 141 may obtain a plurality of first candidate voice information for the received first voice signal. Each of the first candidate voice information has a level value indicating how well it fits the user's utterance intent. For the voice spoken by the user, the voice recognition unit 141 thus obtains from the first voice signal a plurality of first candidate voice information with mutually different level values. For example, by recognizing the first voice signal, the voice recognition unit 141 obtains first candidate voice information a (fitness level 90), first candidate voice information b (fitness level 80), first candidate voice information c (fitness level 70), and so on. The same operation applies to the second voice signal, so the voice recognition unit 141 also obtains from the second voice signal a plurality of second candidate voice information with mutually different level values. The voice recognition unit 141 outputs the plurality of first candidate voice information for the first voice signal and the plurality of second candidate voice information for the second voice signal to the determination unit 143.

The determination unit 143 selects, from the plurality of first candidate voice information and the plurality of second candidate voice information output by the voice recognition unit 141, at least one voice information that fits the user's utterance intent. To judge whether a candidate fits the user's utterance intent, the determination unit 143 uses the fitness level value that the voice recognition unit 141 assigned to the candidate as a result of voice recognition, together with the sound pressure magnitude received from the sound pressure measurement unit 130. The determination unit 143 assigns a weight to whichever of the first voice signal and the second voice signal the sound pressure measurement unit 130 measured as having the larger sound pressure. The determination unit 143 may add the weight to the level value and select the voice information with the highest resulting value among the candidates, or it may select a plurality of voice information falling within a given rank.

For example, if the second voice signal is measured as having a larger sound pressure than the first voice signal, the weight is assigned to the second candidate voice information obtained for the second voice signal. The first candidate voice information a (fitness level 90), b (fitness level 80), and c (fitness level 70) corresponding to the first voice signal receive no weight, whereas for the second candidate voice information d (fitness level 95), e (fitness level 85), and f (fitness level 75) corresponding to the second voice signal, the weight is added to the fitness level value. Candidate voice information with a high value is then selected based on the fitness level value and the weight. The selection may pick the single candidate voice information with the highest value, or a plurality of voice information whose values fall within a predetermined rank.
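The selection described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the `Candidate` type, the `select_candidates` function, and the weight of 10 are hypothetical, since the specification only calls the weight "predetermined" and states that it is added to the level value of candidates from the louder signal.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Candidate:
    label: str    # identifier of the candidate voice information
    level: float  # fitness level assigned by the recognizer
    source: str   # "first" or "second" voice signal


def select_candidates(
    first: Sequence[Candidate],
    second: Sequence[Candidate],
    first_pressure: float,
    second_pressure: float,
    weight: float = 10.0,  # hypothetical value; not given in the source
    top_k: int = 1,
) -> List[Candidate]:
    """Add `weight` to candidates from the louder signal and keep the top_k."""
    louder = "second" if second_pressure > first_pressure else "first"
    scored = []
    for cand in list(first) + list(second):
        bonus = weight if cand.source == louder else 0.0
        scored.append((cand.level + bonus, cand))
    # Highest combined value first; sorted() is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cand for _, cand in scored[:top_k]]


# Candidates a-f from the example above.
first_candidates = [Candidate("a", 90, "first"), Candidate("b", 80, "first"),
                    Candidate("c", 70, "first")]
second_candidates = [Candidate("d", 95, "second"), Candidate("e", 85, "second"),
                     Candidate("f", 75, "second")]
```

With the second signal louder and the assumed weight of 10, the combined values are d = 105, e = 95, a = 90, f = 85, b = 80, c = 70, so `top_k=1` yields d and `top_k=3` yields d, e, a. If the first signal were louder instead, a would score 100 and win over d at 95, which shows how the sound pressure can change the outcome.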

The at least one voice information selected by the determination unit 143 is output to the voice-text conversion unit 145.

The voice-text conversion unit 145 converts the voice information received from the determination unit 143 into text information. That is, the voice-text conversion unit 145 obtains a plurality of candidate text information from the plurality of candidate voice information. To convert voice information into text information, the voice-text conversion unit 145 may use a table in which voice information is mapped to text information.
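A table-based conversion of this kind can be sketched as a simple lookup. The table contents, the key format, and the function name `to_text` below are placeholders invented for illustration; the specification does not describe what the mapping table contains or how voice information is keyed.

```python
from typing import Dict, List, Mapping, Sequence

# Hypothetical mapping table: voice information identifier -> text information.
VOICE_TO_TEXT: Dict[str, str] = {
    "voice_info_a": "candidate text a",
    "voice_info_b": "candidate text b",
    "voice_info_c": "candidate text c",
}


def to_text(
    voice_infos: Sequence[str],
    table: Mapping[str, str] = VOICE_TO_TEXT,
) -> List[str]:
    """Convert each voice information to text, preserving the input order.

    Entries missing from the table are skipped; the source does not say
    what happens in that case, so skipping is an assumption.
    """
    return [table[info] for info in voice_infos if info in table]
```

Because the input order is preserved, the ranking produced by the determination unit 143 would carry over to the candidate text information that is displayed.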

The controller 140 may display the text information converted by the voice-text conversion unit 145 on the display unit 150.

The display unit 150 may display the converted text information under the control of the controller 140. The display unit 150 may include a display panel (not shown) and a panel driver (not shown) that drives the display panel. The display panel may be implemented with any of various display technologies, for example liquid crystal, plasma, light-emitting diode, organic light-emitting diode, surface-conduction electron-emitter, carbon nano-tube, or nano-crystal.

The user input unit 160 may pass various preset control commands or arbitrary information to the controller 140 in response to the user's operation and input. The user input unit 160 may be implemented as a menu key and an input panel installed on the outside of the display apparatus 100, or as a remote controller separate from the display apparatus 100. Alternatively, the user input unit 160 may be integrated with the display unit 150. That is, when the display unit 150 is a touch screen, the user may send a preset command to the controller 140 through an input menu (not shown) displayed on the display unit 150.

When the single candidate text information judged to fit the user's utterance intent best is displayed on the display unit 150, the user can check the displayed candidate text information and, if it matches what the user intended to say, confirm the match with a predetermined key input.

Alternatively, when a plurality of candidate text information judged to fit the user's utterance intent is displayed on the display unit 150, the user can check the displayed candidates and, through the user input unit 160, enter a user selection that picks the one matching what the user intended to say.

The display apparatus 100 may further include a receiver and a signal processor, which are not shown in FIG. 2. The receiver (not shown) passes an image signal received from an external image source (not shown) to the signal processor (not shown), and it can receive a broadcast signal transmitted from a transmission device (not shown) of a broadcasting station. For example, the receiver may include an antenna (not shown) and/or a tuner (not shown) for receiving the broadcast signal. The receiver can also receive an image signal from an external image source (not shown). The receiver may take various forms depending on the standard of the received image signal and on how the image source and the display apparatus 100 are implemented. For example, the receiver may receive signals or data conforming to standards such as HDMI (High Definition Multimedia Interface), USB, and component, and may include a plurality of connection terminals (not shown) corresponding to each of these standards. The signal processor (not shown) processes the image signal received by the receiver and outputs it to the display unit 150. For example, it may perform at least one of various image processing operations, including de-multiplexing, which separates the signal into its constituent signals; decoding according to the image format of the image signal; de-interlacing, which converts an interlaced image signal to a progressive one; scaling, which adjusts the image signal to a preset resolution; noise reduction for improving image quality; detail enhancement; and frame refresh rate conversion.

FIG. 3 shows an embodiment of a display apparatus displaying a plurality of candidate text information.

The embodiment of FIG. 3 is an example of using the voice recognition function as a user interface. The display apparatus 100 is implemented as a smart TV with a web browser function and can therefore display a web page. To enter a particular phrase on the web page, the user speaks the phrase to the first and second voice acquisition units 110 and 120 instead of using the user input unit 160 provided on the display apparatus 100. The first and second voice acquisition units 110 and 120 simultaneously acquire the voice corresponding to the phrase spoken by the user and output it to the sound pressure measurement unit 130 as a first voice signal and a second voice signal, respectively. The sound pressure measurement unit 130 measures the sound pressure of the first voice signal and of the second voice signal and outputs the results to the controller 140.

The voice recognition unit 141 of the controller 140 obtains a plurality of first candidate voice information from the first voice signal and a plurality of second candidate voice information from the second voice signal, and outputs them to the determination unit 143. Each candidate voice information includes a level value indicating how well it fits the user's utterance intention. Based on the received sound pressure measurements, the determination unit 143 assigns a weight to the voice signal having the larger sound pressure, selects, based on the weight and the fitness level values, a plurality of candidate voice information judged to fit the user's utterance intention, and outputs them to the voice-text conversion unit 145.
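The selection performed by the determination unit 143 can be sketched as a ranking over the merged candidate lists. This is a minimal illustration under stated assumptions, not the disclosed implementation: the weight is assumed to be added to the level value of every candidate from the louder signal, duplicate phrases are assumed to be kept once at their best score, and the names `Candidate`, `select_candidates`, `weight`, and `top_n` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Candidate:
    text: str     # recognized phrase
    level: float  # fitness level value reported by the recognizer
    source: str   # "first" or "second" voice signal


def select_candidates(
    first: Sequence[Candidate],
    second: Sequence[Candidate],
    louder: str,
    weight: float = 0.1,
    top_n: int = 3,
) -> List[Candidate]:
    """Rank all candidates by level value plus weight and keep the best `top_n`."""
    scored = []
    for cand in list(first) + list(second):
        # Assumption: the weight is additive and applies only to
        # candidates derived from the louder voice signal.
        bonus = weight if cand.source == louder else 0.0
        scored.append((cand.level + bonus, cand))
    scored.sort(key=lambda pair: pair[0], reverse=True)

    selected: List[Candidate] = []
    seen = set()
    for _, cand in scored:
        if cand.text in seen:
            continue  # the same phrase was already kept with a higher score
        seen.add(cand.text)
        selected.append(cand)
        if len(selected) == top_n:
            break
    return selected
```

With an additive weight, a candidate from the quieter signal can still be selected when its level value exceeds the others by more than the weight, which matches the description's use of both the weight and the level value.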

The voice-text conversion unit 145 converts the selected plurality of candidate voice information into a corresponding plurality of character information and outputs it to the display unit 150 for display.

In FIG. 3, candidate character information a, candidate character information b, and candidate character information c are displayed on the display unit 150. Using the user input unit 160, the user can select, from among the displayed candidates, the character information that matches his or her utterance intention.

The example of FIG. 3 is only one embodiment of the present invention, and the present invention is not limited thereto.

FIG. 4 is a flowchart of a voice recognition control operation of the display apparatus according to an embodiment of the present invention.

Referring to FIG. 4, the first voice acquisition unit 110 of the display apparatus 100 receives a voice spoken by the user and converts it into an electrical first voice signal (S201), and the second voice acquisition unit 120 receives the voice spoken by the user and converts it into an electrical second voice signal (S202). Because the first voice acquisition unit 110 and the second voice acquisition unit 120 acquire a single utterance of the user at almost the same time, steps S201 and S202 occur almost simultaneously. The sound pressure measurement unit 130 measures the sound pressure magnitudes of the first voice signal and the second voice signal (S203). The voice recognition unit 141 of the controller 140 recognizes the first voice signal and the second voice signal to obtain first voice information corresponding to the first voice signal and second voice information corresponding to the second voice signal (S204), and the determination unit 143 of the controller 140 adds a predetermined weight to the voice signal having the larger measured sound pressure magnitude and selects, from the first voice information and the second voice information, the one suited to the user's utterance intention (S205). The controller 140 may then perform a control operation corresponding to the voice information selected by the determination unit 143.
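Steps S203 to S205 of FIG. 4 can be sketched as a single function. This is an illustration under assumptions rather than the disclosed implementation: the recognizer and the sound pressure measure are passed in as callables because the description does not define them, each recognizer call is assumed to return one result with a level value, the weight is assumed to be additive, and the name `recognize_command` is hypothetical.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical interfaces: a recognizer returns (voice information, level value),
# and a measure returns a sound pressure magnitude for a signal.
Recognizer = Callable[[Sequence[float]], Tuple[str, float]]
Measure = Callable[[Sequence[float]], float]


def recognize_command(
    first_signal: Sequence[float],
    second_signal: Sequence[float],
    recognize: Recognizer,
    measure: Measure,
    weight: float = 0.1,
) -> str:
    """Select one voice information from two signals of the same utterance."""
    # S203: measure the sound pressure magnitude of each voice signal.
    pressure_1 = measure(first_signal)
    pressure_2 = measure(second_signal)

    # S204: recognize each voice signal independently.
    info_1, level_1 = recognize(first_signal)
    info_2, level_2 = recognize(second_signal)

    # S205: add the weight to the result from the louder signal
    # (assumption: no weight is applied when the magnitudes are equal).
    if pressure_1 > pressure_2:
        level_1 += weight
    elif pressure_2 > pressure_1:
        level_2 += weight
    return info_1 if level_1 >= level_2 else info_2
```

Passing the recognizer and measure as parameters keeps the sketch independent of any particular speech engine or microphone hardware.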

FIG. 5 is a flowchart of a voice recognition control operation of the display apparatus according to another embodiment of the present invention.

Referring to FIG. 5, the first voice acquisition unit 110 of the display apparatus 100 receives a voice spoken by the user and converts it into an electrical first voice signal (S301), and the second voice acquisition unit 120 receives the voice spoken by the user and converts it into an electrical second voice signal (S302). Because the first voice acquisition unit 110 and the second voice acquisition unit 120 acquire a single utterance of the user at almost the same time, steps S301 and S302 occur almost simultaneously. The sound pressure measurement unit 130 measures the sound pressure magnitudes of the first voice signal and the second voice signal (S303). The voice recognition unit 141 of the controller 140 recognizes the first voice signal to obtain a plurality of first candidate voice information having mutually different level values, and recognizes the second voice signal to obtain a plurality of second candidate voice information having mutually different level values (S304). The determination unit 143 of the controller 140 selects, from among the plurality of candidate voice information, a plurality of candidate voice information suited to the user's utterance intention, based on the level values and on the weight added to the voice signal having the larger sound pressure magnitude measured by the sound pressure measurement unit 130 (S305). The voice-text conversion unit 145 of the controller 140 converts the selected plurality of candidate voice information into a corresponding plurality of character information (S306). The plurality of character information is displayed on the display unit 150 (S307), and when a user selection choosing one of the plurality of character information is input through the user input unit 160, the character information corresponding to the user selection is selected (S308).
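Steps S307 and S308 can be illustrated with a short sketch of the display and selection stage. The numbered-menu presentation, the 1-based `choice` index, and the names `numbered_menu` and `resolve_selection` are assumptions made for this example; the description does not state how candidates are laid out or how the user input unit 160 encodes a selection.

```python
from typing import List, Sequence


def numbered_menu(candidates: Sequence[str]) -> List[str]:
    """S307: build the lines shown on the display, one per candidate."""
    return [f"{i}. {text}" for i, text in enumerate(candidates, start=1)]


def resolve_selection(candidates: Sequence[str], choice: int) -> str:
    """S308: return the character information matching the user selection.

    Assumption: `choice` is the 1-based number shown in the menu.
    """
    if not 1 <= choice <= len(candidates):
        raise ValueError(f"choice must be between 1 and {len(candidates)}")
    return candidates[choice - 1]
```

Rejecting an out-of-range choice, rather than falling back to the top candidate, is a design choice of this sketch so that an input error never silently selects a phrase the user did not intend.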

Although several embodiments of the present invention have been shown and described, those of ordinary skill in the art to which the invention pertains will appreciate that these embodiments may be modified without departing from the principles or spirit of the invention. The scope of the invention is defined by the appended claims and their equivalents.

100: display apparatus 110: first voice acquisition unit
120: second voice acquisition unit 130: sound pressure measurement unit
140: controller 150: display unit
160: user input unit

Claims

A display apparatus comprising:
a first voice acquisition unit configured to receive a voice spoken by a user and convert the voice into an electrical first voice signal;
a second voice acquisition unit configured to receive the voice spoken by the user together with the first voice acquisition unit and convert the voice into an electrical second voice signal;
a sound pressure measurement unit configured to measure a sound pressure magnitude of each of the first voice signal and the second voice signal; and
a controller configured to recognize the first voice signal and the second voice signal to obtain first voice information corresponding to the first voice signal and second voice information corresponding to the second voice signal, and to select, from the first voice information and the second voice information, the one suited to the user's utterance intention by adding a predetermined weight to the voice signal having the larger sound pressure magnitude measured by the sound pressure measurement unit.

The display apparatus of claim 1, wherein
the first voice information includes a plurality of first candidate voice information,
the second voice information includes a plurality of second candidate voice information,
each candidate voice information includes a level value indicating its fitness to the user's utterance intention, and
the controller selects a plurality of candidate voice information suited to the user's utterance intention based on the level value and the weight.

The display apparatus of claim 2,
further comprising a display unit,
wherein the controller converts the selected plurality of candidate voice information into a plurality of candidate character information and displays it on the display unit.

The display apparatus of claim 3,
further comprising a user input unit for receiving a user selection,
wherein the controller selects, from among the plurality of candidate character information, the candidate character information indicated by the user selection.

The display apparatus of claim 2,
wherein the controller further comprises a voice recognition unit that detects voice features of the first voice signal and the second voice signal and recognizes them as voice information.

The display apparatus of claim 3,
wherein the controller further comprises a voice-text conversion unit that converts the voice information into character information.

The display apparatus of claim 1, wherein
the first voice acquisition unit is provided in the display apparatus, and
the second voice acquisition unit is provided as an external peripheral device of the display apparatus.

A voice recognition method of a display apparatus, the method comprising:
receiving a voice spoken by a user and converting the voice into an electrical first voice signal;
receiving the voice spoken by the user and converting the voice into an electrical second voice signal;
measuring a sound pressure magnitude of each of the first voice signal and the second voice signal;
recognizing the first voice signal and the second voice signal to obtain first voice information corresponding to the first voice signal and second voice information corresponding to the second voice signal; and
selecting, from the first voice information and the second voice information, the one suited to the user's utterance intention by adding a predetermined weight to the voice signal having the larger measured sound pressure magnitude.

The method of claim 8, wherein
the first voice information includes a plurality of first candidate voice information,
the second voice information includes a plurality of second candidate voice information,
each candidate voice information includes a level value indicating its fitness to the user's utterance intention, and
the selecting further includes selecting a plurality of candidate voice information suited to the user's utterance intention based on the level value and the weight.

The method of claim 9, further comprising:
converting the selected plurality of candidate voice information into a plurality of candidate character information; and
displaying the plurality of candidate character information.

The method of claim 10,
further comprising receiving a user selection that selects one of the displayed plurality of candidate character information.

The method of claim 9,
wherein obtaining the voice information includes detecting voice features of the voice signals and recognizing the voice features as voice information.

The method of claim 8, wherein
the first voice signal is obtained through a first voice acquisition unit provided in the display apparatus, and
the second voice signal is obtained through a second voice acquisition unit provided as an external peripheral device of the display apparatus.