KR101151571B1

KR101151571B1 - Speech recognition environment control apparatus for spoken dialog system and method thereof

Info

Publication number: KR101151571B1
Application number: KR1020100070474A
Authority: KR
Inventors: 이근배; 김경덕; 이동현; 최준휘
Original assignee: 포항공과대학교 산학협력단
Priority date: 2010-07-21
Filing date: 2010-07-21
Publication date: 2012-05-31
Also published as: KR20120009787A

Abstract

The present invention relates to a speech recognition environment control apparatus and method for a spoken dialog system, which raise the level of spoken dialog recognition by using spoken dialog processing technology to improve the surrounding environment. The apparatus is a guidance device that steers the user from varying speech recognition environments toward one in which the user's speech is recognized well. The apparatus includes an utterance and adjustment interface that recognizes a target sentence, and a user environment control unit. Such an apparatus can be connected at the front end of any device associated with a spoken dialog system to provide a smooth speech recognition environment.

Description

Speech recognition environment control apparatus for spoken dialog system and method thereof

The present invention relates to a speech recognition environment control apparatus and method for a spoken dialog system. More particularly, it relates to an apparatus and method that perform noise level evaluation and microphone input control for the spoken dialog system, request an utterance drawn from the domain used by that system, and assess and adjust the surrounding environmental conditions based on the resulting speech recognition accuracy.

In general, the goal of a spoken dialog system is for the user and the system to converse smoothly, and for the system to recognize and carry out commands from the user's utterances.

A spoken dialog system must therefore first succeed at recognizing what a person has said. Recent spoken dialog systems have advanced with a focus on the performance of the speech recognizer itself, while environment control for speech recognition has advanced mainly in terms of signal processing, which remains an area of active research.

However, existing spoken dialog systems do not take environment variables into account, even though those variables affect the accuracy of the speech recognizer. Speech recognition environment control, for its part, ought to adjust those variables with respect to recognition accuracy for the spoken dialog system, yet it operates independently of the dialog system's intent and centers on signal processing such as noise removal and acoustic effects. This limits its usefulness.

An environment control apparatus for a spoken dialog system should ultimately evaluate speech recognition accuracy, steer the environment in the direction that improves recognition, and provide that feedback to the user.

Korean Registered Patent No. 10-0940629 (2010. 01. 28) (Noise removal apparatus and method)

Accordingly, to overcome the limitations of the prior art described above, an object of the present invention is to provide a speech recognition environment control apparatus and method for a spoken dialog system that provide environment control feedback so that recognition accuracy suits the intent of the spoken dialog system.

Another object of the present invention is to provide a speech recognition environment control apparatus and method for a spoken dialog system that can determine the level of the environment variables needed for robust speech recognition and give feedback requesting adjustment.

Yet another object of the present invention is to provide a speech recognition environment control apparatus and method for a spoken dialog system that request an utterance chosen with the speech recognition domain of the spoken dialog system in mind, and determine the level of the environment variables from the uttered speech.

Yet another object of the present invention is to provide a speech recognition environment control apparatus and method for a spoken dialog system that let the user adjust the environment variables personally in response to the feedback.

Yet another object of the present invention is to provide a speech recognition environment control apparatus and method for a spoken dialog system that, as feedback for the spoken dialog system, report which environment variables must be adjusted and by how much, report how far they have been adjusted, and also include the adjustment unit itself.

Yet another object of the present invention is to provide a speech recognition environment control apparatus and method for a spoken dialog system that end speech recognition environment control once recognition accuracy is judged to be robust.

Yet another object of the present invention is to provide a speech recognition environment control apparatus and method for a spoken dialog system that store the user's profile for cases where the hardware configuration on which the environment variables depend is unchanged, so that a robust environment can be provided without performing environment control every time.

To achieve the above objects, the present invention provides a speech recognition environment control apparatus for a spoken dialog system, comprising:

a device search and environment initialization unit that searches for the user's current devices and initializes the environment in response to an environment control start command;

a noise level evaluation unit that estimates noise from the microphone input level of the detected microphone before the user is asked to speak; and

a microphone volume evaluation unit that evaluates the volume of the microphone according to the evaluated noise level and completes environment control.

Preferably, the microphone volume evaluation unit comprises:

an utterance and adjustment interface that asks the user to utter a sentence and outputs, as its result, the sentence the user is recognized to have uttered; and

a microphone environment control unit that, based on the utterance and adjustment results of the utterance and adjustment interface, determines whether the current microphone volume setting is appropriate and feeds back to the user whether to raise or lower it and by how much.

Preferably, the utterance and adjustment interface comprises:

a speech recognition unit that recognizes the utterance the user inputs through the microphone in response to an utterance request and outputs, as its result, the sentence the user is recognized to have uttered; and

a microphone volume adjustment unit with which the user can adjust the microphone input level,

wherein the interface requests an utterance by randomly drawing a prompt sentence from a first database that stores the spoken dialog system domain model used for the utterance requests of the speech recognition unit, the microphone volume is adjusted at the microphone volume adjustment unit, and, when the adjustment is complete, the adjustment value is stored in a second database that stores user environment profile data including environment control information.

Preferably, the microphone environment control unit comprises:

a speech recognition evaluation unit that evaluates a speech recognition score from the sentence obtained from the speech recognition unit of the utterance and adjustment interface;

a microphone input level evaluation unit that evaluates the microphone input level obtained during speech recognition by the speech recognition unit of the utterance and adjustment interface; and

a microphone environment evaluation unit that evaluates the current state of the microphone environment from the speech recognition score obtained from the speech recognition evaluation unit, the microphone input level obtained from the microphone input level evaluation unit, and the microphone volume adjustment value obtained from the microphone volume adjustment unit of the utterance and adjustment interface, and feeds the result back to the user.

Preferably, the speech recognition evaluation unit compares the sentence received from the speech recognition unit with the sentence currently requested for utterance, measures the actual speech recognition similarity, and outputs the resulting recognition score.

Preferably, the microphone input level evaluation unit measures the physical signal input level while the user is speaking and outputs its average as the microphone input level.

To achieve the above objects, according to another aspect of the present invention, there is provided a speech recognition environment control method for a spoken dialog system, comprising:

a device search and environment initialization step of searching for the user's current devices and initializing the environment in response to an environment control start command;

a noise level evaluation step of estimating noise from the microphone input level of the detected microphone before the user is asked to speak; and

a microphone volume evaluation step of evaluating the volume of the microphone according to the evaluated noise level and completing environment control.

Preferably, the microphone volume evaluation step comprises:

an utterance and adjustment step of asking the user to utter a sentence and outputting, as its result, the sentence the user is recognized to have uttered; and

a microphone environment control step of determining, based on the utterance and adjustment results of the utterance and adjustment step, whether the current microphone volume setting is appropriate, and feeding back to the user whether to raise or lower it and by how much.

Preferably, the utterance and adjustment step comprises:

a speech recognition step of requesting an utterance by randomly drawing a prompt sentence from a first database that stores the spoken dialog system domain model for utterance requests, recognizing the utterance the user inputs through the microphone in response to that request, and outputting, as its result, the sentence the user is recognized to have uttered; and

a microphone volume adjustment step in which the user adjusts the microphone volume and, when the adjustment is complete, the adjustment value is stored in a second database that stores user environment profile data including environment control information.

Preferably, the microphone environment control step comprises:

a speech recognition evaluation step of evaluating a speech recognition score from the sentence obtained in the speech recognition step of the utterance and adjustment step;

a microphone input level evaluation step of evaluating the microphone input level obtained during speech recognition in the speech recognition step of the utterance and adjustment step; and

a microphone environment evaluation step of evaluating the current state of the microphone environment from the speech recognition score obtained in the speech recognition evaluation step, the microphone input level obtained in the microphone input level evaluation step, and the microphone volume adjustment value obtained in the microphone volume adjustment step of the utterance and adjustment step, and feeding the result back to the user.

Preferably, the speech recognition evaluation step compares the sentence received from the speech recognition step with the sentence currently requested for utterance, measures the actual speech recognition similarity, and outputs the resulting recognition score.

Preferably, the microphone input level evaluation step measures the physical signal input level while the user is speaking and outputs its average as the microphone input level.

The speech recognition environment control apparatus and method for a spoken dialog system according to the present invention resolve problems that the environment causes for speech recognition in the spoken dialog system, thereby improving recognition performance and allowing the dialog to proceed smoothly.

Because the present invention can be applied to any environment in which a spoken dialog system can be used, it is expected to have a considerable impact on related devices and industries.

FIG. 1 is a block diagram showing the structure of a speech recognition control apparatus for a spoken dialog system according to the present invention.
FIG. 2 is a detailed block diagram of the microphone volume evaluation unit of FIG. 1.
FIG. 3 is a flowchart illustrating a speech recognition control method for a spoken dialog system according to the present invention.
FIG. 4 is a detailed flowchart of FIG. 3.
FIG. 5 is a detailed flowchart of FIG. 4.

Hereinafter, a speech recognition control apparatus for a spoken dialog system according to a preferred embodiment of the present invention is described in detail with reference to FIGS. 1 and 2.

FIG. 1 is a block diagram showing the overall structure of the speech recognition environment control apparatus for a spoken dialog system according to the invention.

In FIG. 1, the speech recognition environment control apparatus (1-1) for a spoken dialog system according to the invention comprises a device search and environment initialization unit (1-3), a noise level evaluation unit (1-4), and a microphone volume evaluation unit (1-5), which carry out their respective processes in that order.

The device search and environment initialization unit (1-3) shown in FIG. 1 responds to the environment control start command (1-2) by searching for the devices the user's spoken dialog system requires, confirming that the requirements of the spoken dialog system are met, and initializing the environment variables of the devices.

The noise level evaluation unit (1-4) shown in FIG. 1 operates as the stage preceding the evaluation of speech recognition accuracy by the microphone volume evaluation unit (1-5). Before the user is asked to speak, it estimates the noise by evaluating the noise level from the microphone input level of the signal coming in from the detected microphone.
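The patent does not specify how the noise level is computed from the microphone input level. As a minimal sketch only, assuming the captured samples are normalized floats in the range -1.0 to 1.0 and that the level is expressed as an RMS value in dBFS, the pre-prompt measurement might look like this. The function names, the -120 dBFS floor, and the two thresholds are all hypothetical.

```python
import math
from typing import Sequence

# Assumed floor reported for an empty or all-zero buffer (not from the patent).
SILENCE_FLOOR_DBFS = -120.0


def rms_level_dbfs(samples: Sequence[float]) -> float:
    """Return the RMS level, in dBFS, of samples normalized to -1.0..1.0."""
    if not samples:
        return SILENCE_FLOOR_DBFS
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square <= 0.0:
        return SILENCE_FLOOR_DBFS
    return max(SILENCE_FLOOR_DBFS, 10.0 * math.log10(mean_square))


def classify_noise(level_dbfs: float,
                   quiet_below: float = -50.0,
                   noisy_above: float = -30.0) -> str:
    """Map a background level to a coarse label using assumed thresholds."""
    if level_dbfs < quiet_below:
        return "quiet"
    if level_dbfs > noisy_above:
        return "noisy"
    return "moderate"
```

The samples passed in would be recorded while the user is silent, that is, before any utterance is requested, so that the measured level reflects background noise only.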

The microphone volume evaluation unit (1-5) shown in FIG. 1 evaluates the volume of the microphone according to the noise level evaluated by the noise level evaluation unit (1-4) and completes environment control (1-6). It is shown in more detail in FIG. 2.

FIG. 2 is a block diagram showing the microphone volume evaluation unit (1-5) of FIG. 1 in more detail. The microphone volume evaluation unit (1-5) consists of two parts: the utterance and adjustment interface (2-3), through which the user speaks and makes adjustments directly from outside, and the microphone environment control unit (2-8).

In FIG. 2, the utterance and adjustment interface (2-3) asks the user to utter a sentence and outputs, as its result, the sentence the user is recognized to have uttered. Based on the utterance and adjustment results of the utterance and adjustment interface (2-3), the microphone environment control unit (2-8) determines whether the current microphone volume setting is appropriate and feeds back to the user whether to raise or lower it and by how much.

More specifically, the utterance and adjustment interface (2-3) consists of a speech recognition unit (2-4), identical to that of the spoken dialog system, and a microphone volume adjustment unit (2-5). The microphone environment control unit (2-8) consists of a speech recognition evaluation unit (2-9), a microphone input level evaluation unit (2-10), and a microphone environment evaluation unit (2-11).

The speech recognition unit (2-4) of the utterance and adjustment interface (2-3) recognizes the utterance the user inputs through the microphone in response to an utterance request and outputs, as its result, the sentence the user is recognized to have uttered. The microphone volume adjustment unit (2-5) lets the user adjust the microphone input level.

In this case, the utterance and adjustment interface (2-3) requests an utterance by randomly drawing a prompt sentence from the first database (2-6), which stores the spoken dialog system domain model used for the utterance requests of the speech recognition unit (2-4). After the user adjusts the microphone volume at the microphone volume adjustment unit (2-5), the adjustment value is stored in the second database (2-7), which stores user environment profile data including environment control information. The first and second databases (2-6, 2-7) may be located inside or outside the utterance and adjustment interface (2-3).
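The random prompt selection and the storing of the adjustment value can be illustrated with a short sketch. It assumes, purely for illustration, that the domain model in the first database (2-6) is available as a plain list of sentences and that the second database (2-7) can be stood in for by an in-memory mapping from a user identifier to a volume value. The names `pick_prompt` and `UserEnvironmentProfiles` are hypothetical, as is the notion of a user identifier.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence


def pick_prompt(domain_sentences: Sequence[str],
                rng: Optional[random.Random] = None) -> str:
    """Draw one prompt sentence at random from the domain sentences."""
    if not domain_sentences:
        raise ValueError("the domain model contains no sentences")
    chooser = rng if rng is not None else random
    return chooser.choice(list(domain_sentences))


@dataclass
class UserEnvironmentProfiles:
    """In-memory stand-in for the second database (2-7)."""
    volumes: Dict[str, int] = field(default_factory=dict)

    def save_volume(self, user_id: str, volume: int) -> None:
        """Store the adjustment value once the adjustment is complete."""
        self.volumes[user_id] = volume

    def load_volume(self, user_id: str) -> Optional[int]:
        """Return the stored value, or None if the user has no profile."""
        return self.volumes.get(user_id)
```

Drawing the prompt from the system's own domain sentences means the calibration utterance resembles what the user will actually say to the spoken dialog system, and a stored profile lets a later session reuse the value instead of repeating the procedure.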

In addition, the speech recognition evaluation unit (2-9) of the microphone environment control unit (2-8) evaluates a speech recognition score from the sentence obtained from the speech recognition unit (2-4) of the utterance and adjustment interface (2-3). The microphone input level evaluation unit (2-10) evaluates the microphone input level obtained during speech recognition by the speech recognition unit (2-4). The microphone environment evaluation unit (2-11) evaluates the current state of the microphone environment from the speech recognition score obtained from the speech recognition evaluation unit (2-9), the microphone input level obtained from the microphone input level evaluation unit (2-10), and the microphone volume adjustment value obtained from the microphone volume adjustment unit (2-5) of the utterance and adjustment interface (2-3), and feeds the result back to the user.

Here, the speech recognition evaluation unit (2-9) compares the sentence received from the speech recognition unit (2-4) with the sentence currently requested for utterance, measures the actual speech recognition similarity, and outputs the resulting recognition score. The microphone input level evaluation unit (2-10) measures the physical signal input level while the user is speaking and outputs its average as the microphone input level.

Hereinafter, a speech recognition control method for a spoken dialog system according to a preferred embodiment of the present invention is described in detail with reference to FIGS. 3 to 5.

FIG. 3 is a flowchart illustrating the speech recognition control method for a spoken dialog system according to the present invention.

Referring to FIG. 3, according to another aspect of the present invention, the speech recognition environment control method for a spoken dialog system comprises: a device search and environment initialization step (32) of searching for the user's current devices and initializing the environment in response to an environment control start command; a noise level evaluation step (34) of estimating noise from the microphone input level of the detected microphone before the user is asked to speak; and a microphone volume evaluation step (36) of evaluating the volume of the microphone according to the evaluated noise level and completing environment control.
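The order of the three steps can be sketched as a small driver function. This is an illustration under stated assumptions, not the disclosed implementation: the three callables stand in for whatever device search, noise measurement, and volume evaluation a real system would provide, and the function name, the returned dictionary, and the "no_microphone" outcome are hypothetical.

```python
from typing import Callable, Dict, Optional, Union

Result = Dict[str, Union[str, float, int]]


def run_environment_control(
    find_microphone: Callable[[], Optional[str]],
    measure_noise: Callable[[str], float],
    evaluate_volume: Callable[[str, float], int],
) -> Result:
    """Run steps 32, 34, and 36 in order and report the outcome."""
    # Step 32: device search and environment initialization.
    microphone = find_microphone()
    if microphone is None:
        return {"status": "no_microphone"}

    # Step 34: noise level evaluation, before any utterance is requested.
    noise_level = measure_noise(microphone)

    # Step 36: microphone volume evaluation, given the evaluated noise level.
    volume = evaluate_volume(microphone, noise_level)

    return {
        "status": "done",
        "microphone": microphone,
        "noise_level": noise_level,
        "volume": volume,
    }
```

Passing the steps in as callables keeps the sketch independent of any particular audio library.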

FIG. 4 is a detailed flowchart of FIG. 3 and describes the microphone volume evaluation step (36) in more detail.

In FIG. 4, the microphone volume evaluation step (36) comprises: an utterance and adjustment step (362) of asking the user to utter a sentence and outputting, as its result, the sentence the user is recognized to have uttered; and a microphone environment control step (364) of determining, based on the utterance and adjustment results of the utterance and adjustment step, whether the current microphone volume setting is appropriate, and feeding back to the user whether to raise or lower it and by how much.

FIG. 5 is a detailed flowchart of FIG. 4 and describes the utterance and adjustment step (362) and the microphone environment control step (364) of FIG. 4 in detail.

In FIG. 5, the utterance and adjustment step (362) comprises: a speech recognition step (3622) of recognizing the utterance the user inputs through the microphone in response to an utterance request obtained by consulting the first database (2-6), which stores the spoken dialog system domain model for utterance requests, and outputting, as its result, the sentence the user is recognized to have uttered; and a microphone volume adjustment step (3642) in which the user adjusts the microphone input level and, once the adjustment is complete, the adjustment value is stored in the second database (2-7), which stores user environment profile data including environment control information.

The microphone environment control step (364) comprises: a speech recognition evaluation step (3642) of evaluating a speech recognition score from the sentence obtained in the speech recognition step of the utterance and adjustment step; a microphone input level evaluation step (3644) of evaluating the microphone input level obtained during speech recognition in the speech recognition step of the utterance and adjustment step; and a microphone environment evaluation step (3646) of evaluating the current state of the microphone environment from the speech recognition score obtained in the speech recognition evaluation step, the microphone input level obtained in the microphone input level evaluation step, and the microphone volume adjustment value obtained in the microphone volume adjustment step of the utterance and adjustment step, and feeding the result back to the user.

Here, the speech recognition evaluation step (3642) compares the sentence received from the speech recognition step (3622) with the sentence currently requested for utterance, measures the actual speech recognition similarity, and outputs the resulting recognition score. The microphone input level evaluation step (3644) measures the physical signal input level while the user is speaking and outputs its average as the microphone input level.

Hereinafter, the operation of the speech recognition environment control apparatus and method for a spoken dialog system according to a preferred embodiment of the present invention is described with reference to the accompanying drawings.

When the user starts the microphone volume adjustment evaluation, the utterance and adjustment interface (2-3) randomly draws a prompt sentence from the first database (2-6), which stores the spoken dialog system domain, and asks the user to utter it. The user speaks the sentence to the speech recognition unit (2-4), receives feedback, and adjusts the microphone volume at the microphone volume adjustment unit (2-5). When the adjustment is complete, the adjustment value is stored in the second database (2-7), which stores user profile data.

In the microphone environment control unit (2-8), the microphone environment evaluation unit (2-11) takes into account three inputs: the score that the internal speech recognition evaluation unit (2-9) produces by comparing the sentence sent from the speech recognition unit (2-4) with the requested sentence, the loudness of the microphone input measured at the moment of utterance by the microphone input level evaluation unit (2-10), and the adjustment value of the microphone volume adjustment unit (2-5). Based on these, it gives the user feedback such as whether the microphone is too close or too far, whether the volume is too high or too low, how much to adjust the volume, or a request to repeat the utterance.
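One way to picture how the three inputs could be turned into the kinds of feedback listed above is a small rule table. The thresholds, the 0 to 100 volume scale, the dBFS level scale, and the names `Observation` and `feedback` are assumptions for illustration; the text does not specify the decision logic.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Observation:
    recognition_score: float  # assumed scale: 0.0 (worst) to 1.0 (best)
    input_level_dbfs: float   # assumed scale: average level in dBFS
    volume: int               # assumed scale: 0 to 100


def feedback(obs: Observation, min_score: float = 0.8,
             low_db: float = -30.0, high_db: float = -6.0,
             volume_limits: Tuple[int, int] = (20, 80)) -> str:
    """Map one observation to a feedback message (illustrative rules only)."""
    in_range = low_db <= obs.input_level_dbfs <= high_db
    if obs.recognition_score >= min_score and in_range:
        return "ok"
    if obs.input_level_dbfs < low_db:
        # Quiet input: raise the volume, unless it is already near the top,
        # in which case the microphone is probably too far away.
        if obs.volume < volume_limits[1]:
            return "raise volume"
        return "move microphone closer"
    if obs.input_level_dbfs > high_db:
        # Loud input: lower the volume, unless it is already near the bottom.
        if obs.volume > volume_limits[0]:
            return "lower volume"
        return "move microphone away"
    # The level is fine but recognition was poor: ask for another utterance.
    return "please repeat the sentence"
```

For example, `feedback(Observation(0.4, -40.0, 90))` returns `"move microphone closer"`, because the input is quiet even though the volume is already high.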

The speech recognition evaluation unit (2-9) bases its evaluation on, among other things, a word error rate (WER) score obtained by extracting and comparing N-best results for the requested and recognized sentences, and the confidence score of the speech recognition unit (2-4), which indicates how closely the sentence matches the physical signal.
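The word error rate mentioned above is conventionally the word-level edit distance between the requested sentence and a hypothesis, divided by the number of words in the requested sentence. The sketch below computes it and combines it with a confidence value. Taking the lowest WER over the N-best list and mixing the two scores with an equal-weight average are assumptions; the text only says that both scores are considered.

```python
from typing import Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("the reference sentence is empty")
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,                             # deletion
                           cur[j - 1] + 1,                          # insertion
                           prev[j - 1] + (ref_word != hyp_word)))   # substitution
        prev = cur
    return prev[-1] / len(ref)


def recognition_score(reference: str, n_best: Sequence[str],
                      confidence: float, weight: float = 0.5) -> float:
    """Blend the best N-best word accuracy with the recognizer confidence.

    Assumption: confidence lies in [0, 1] and n_best is not empty.
    """
    best_wer = min(word_error_rate(reference, hyp) for hyp in n_best)
    accuracy = max(0.0, 1.0 - best_wer)
    return weight * accuracy + (1.0 - weight) * confidence
```

Because the WER can exceed 1 when the hypothesis contains many inserted words, the accuracy term is clamped at zero before it is blended.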

The microphone environment evaluation unit (2-11) first asks the user to repeat the utterance so that at least two evaluations are carried out. It then determines the correlation between the microphone input level and the microphone adjustment value, computes the setting at which the speech recognition evaluation is best, and feeds this back. The more utterances are made, the more data accumulates, and various machine learning techniques can be applied to compute the best speech recognition evaluation score.
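The requirement of at least two evaluations suggests that a relation between volume setting and input level is fitted from the trials. A minimal sketch, assuming that relation is roughly linear and that a target input level is known, uses an ordinary least-squares line. The linear model, the target level, the 0 to 100 volume range, and the name `suggest_volume` are all assumptions, and this is not the machine learning approach the text leaves open.

```python
from typing import Sequence, Tuple


def suggest_volume(trials: Sequence[Tuple[float, float]], target_level: float,
                   limits: Tuple[float, float] = (0.0, 100.0)) -> float:
    """Fit level = slope * volume + intercept and solve for the target level.

    trials holds (volume setting, measured input level) pairs.
    """
    if len(trials) < 2:
        raise ValueError("at least two utterances are needed")
    n = len(trials)
    mean_v = sum(v for v, _ in trials) / n
    mean_l = sum(level for _, level in trials) / n
    var_v = sum((v - mean_v) ** 2 for v, _ in trials)
    if var_v == 0:
        raise ValueError("the trials must use at least two different volume settings")
    slope = sum((v - mean_v) * (level - mean_l) for v, level in trials) / var_v
    if slope == 0:
        raise ValueError("the input level does not vary with volume in these trials")
    intercept = mean_l - slope * mean_v
    volume = (target_level - intercept) / slope
    # Clamp the suggestion to the range the volume control supports.
    return min(max(volume, limits[0]), limits[1])
```

With trials `[(40, -30.0), (60, -20.0)]` and a target of -15.0, the fitted line has slope 0.5 and the suggested volume is 70.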

As described above, the speech recognition environment control method and apparatus for a spoken dialog system according to the present invention resolve problems caused by the speech recognition environment of the spoken dialog system, thereby improving recognition performance and enabling smooth dialog. Because they can be applied in any environment in which a spoken dialog system may be used, they are expected to have a large ripple effect on related devices and industries.

The advantages and features of the present invention, and the methods of achieving them, will become clear from the embodiments described in detail above together with the accompanying drawings. The present invention is not, however, limited to the embodiments disclosed here and may be implemented in various different forms. These embodiments are provided only to make the disclosure complete and to fully convey the scope of the invention to those of ordinary skill in the art to which the invention pertains; the invention is defined only by the scope of the claims. Unless otherwise defined, all terms used in this specification (including technical and scientific terms) have the meanings commonly understood by those of ordinary skill in the art to which the invention pertains. Terms defined in commonly used dictionaries are not to be interpreted ideally or excessively unless they are explicitly and specifically defined.

1-1 ... Speech recognition environment control apparatus
1-2 ... Environment control start
1-3 ... Device search and environment control initialization unit
1-4 ... Noise level evaluation unit
1-5 ... Microphone volume evaluation unit
1-6 ... Environment control complete
2-1 ... User
2-3 ... Utterance and adjustment interface
2-4 ... Speech recognition unit
2-5 ... Microphone volume adjustment unit
2-6 ... First database
2-7 ... Second database
2-8 ... Microphone environment control unit
2-9 ... Speech recognition evaluation unit
2-10 ... Microphone input level evaluation unit
2-11 ... Microphone environment evaluation unit

Claims

A speech recognition environment control apparatus for a spoken dialog system, comprising:
a device search and environment initialization unit that performs a device search and initializes the user's environment in response to an environment control start command;
a noise level evaluation unit that estimates noise from the microphone input level of a microphone found through the device search, before an utterance is requested from the user;
a speech recognition unit that requests the user to utter a predetermined sentence stored in advance, recognizes the user's utterance made in response to the request, and outputs the sentence uttered by the user as the result;
a microphone volume adjustment unit that adjusts a microphone volume value according to the user's operation; and
a microphone environment control unit that determines, based on the user's utterance recognized by the speech recognition unit and the adjustment made by the microphone volume adjustment unit, whether the current microphone volume value satisfies a preset criterion, and feeds adjustment information for the microphone volume value back to the user,
wherein the microphone environment control unit comprises:
a speech recognition evaluation unit that evaluates a speech recognition score from the sentence recognized by the speech recognition unit;
a microphone input level evaluation unit that evaluates the microphone input level obtained during the speech recognition process of the speech recognition unit; and
a microphone environment evaluation unit that evaluates the current microphone environment state from the speech recognition score, the microphone input level, and the microphone volume value, and feeds adjustment information for the microphone volume value back to the user.

(Deleted)

The apparatus of claim 1, wherein
the speech recognition unit
requests an utterance by randomly extracting the predetermined sentence from a first database, in which a spoken dialog system domain model for utterance requests is stored, and presenting it to the user; and
the microphone volume adjustment unit
stores the adjusted microphone volume value as environment control information in a second database in which user environment profile data is stored.

(Deleted)

The apparatus of claim 1, wherein
the speech recognition evaluation unit
measures a speech recognition similarity by comparing the sentence uttered by the user, as provided by the speech recognition unit, with the sentence the user was requested to utter, and calculates the measurement result as the speech recognition score.

The apparatus of claim 1, wherein
the microphone input level evaluation unit
measures the physical signal input at the moment of the user's utterance, calculates its average, and outputs the calculated average as the microphone input level.

A speech recognition environment control method for a spoken dialog system, comprising:
a device search and environment initialization step of performing a device search and initializing the user's environment in response to an environment control start command;
a noise level estimation step of estimating noise from the microphone input level of a microphone found through the device search, before an utterance is requested from the user;
a speech recognition step of requesting the user to utter a predetermined sentence stored in advance, recognizing the user's utterance made in response to the request, and outputting the sentence uttered by the user as the result; and
a microphone environment control step of determining whether the utterance recognized in the speech recognition step and the microphone volume value set by the user satisfy a preset criterion, and feeding adjustment information for the microphone volume value back to the user,
wherein the microphone environment control step comprises:
a speech recognition evaluation step of evaluating a speech recognition score from the sentence recognized in the speech recognition step;
a microphone input level evaluation step of evaluating the microphone input level obtained during the speech recognition process of the speech recognition evaluation step; and
a microphone environment evaluation step of evaluating the current microphone environment state from the speech recognition score, the microphone input level, and the microphone volume value, and feeding adjustment information for the microphone volume value back to the user.

(Deleted)

The method of claim 7, wherein
the speech recognition step
requests an utterance by randomly extracting the predetermined sentence from a first database, in which a spoken dialog system domain model for utterance requests is stored, and presenting it to the user; and
the microphone volume adjustment step
stores the adjusted microphone volume value as environment control information in a second database in which user environment profile data is stored.

(Deleted)

The method of claim 7, wherein
the speech recognition evaluation step
measures a speech recognition similarity by comparing the sentence uttered by the user, as provided in the speech recognition step, with the sentence the user was requested to utter, and calculates the measurement result as the speech recognition score.

The method of claim 7, wherein
the microphone input level evaluation step
measures the physical signal input at the moment of the user's utterance, calculates its average, and outputs the calculated average as the microphone input level.