KR101074049B1

KR101074049B1 - Voice recognition processing method

Info

Publication number: KR101074049B1
Application number: KR1020090104905A
Authority: KR
Inventors: 김기현; 방영규
Original assignee: 주식회사 씨에스
Priority date: 2009-11-02
Filing date: 2009-11-02
Publication date: 2011-10-17
Also published as: KR20110048209A

Abstract

The speech recognition processing method according to the present invention comprises: (a) measuring the intensity and duration of an input sound; (b) comparing the sound intensity measured in step (a) with a predetermined intensity value required for command processing and, with reference to the comparison result and the duration, classifying the input as belonging to a dormant section, a noise section, or a silent section; and (c) processing speech recognition in a speech recognizer while selectively switching, according to the section to which the input sound was determined to belong in step (b), among a dormant state in which the speech recognition result is not approved and control is suspended (dormant section), a noise adaptation state in which speech is recognized with noise suppressed and control is performed according to the recognition result (noise section), and a normal state in which control according to the recognition result is performed normally (silent section). During the control of step (c), the elapsed time in each state is counted, and the recognizer is returned from the dormant state to the normal state when a predetermined time has elapsed, when a keyword forming part of a command is recognized, or when a predetermined specific command is recognized.

According to the present invention, the intensity and duration of the input sound are taken into account, so that when input continues without interruption, as when people nearby are conversing, the speech recognizer switches to the dormant state and does not approve a result even if speech is recognized. This markedly reduces misrecognition in noisy environments.

Speech recognition, misrecognition, noisy environment

Description

Voice recognition processing method

The present invention relates to a speech recognition processing method and, more particularly, to a speech recognition processing method that reduces misrecognition in noisy environments.

As speech recognition technology has advanced, its application has extended to home network systems, making it possible to control lighting, curtains, and various household devices by voice. Because such a system cannot know when the user will want to control a device, it must remain ready to recognize speech at all times.

However, noise occurs frequently in the home, so a home network system must handle speech recognition under a certain level of ambient noise, and it needs a rejection function that declines to approve a result when speech that is not a command is recognized as one.

Various noise reduction techniques are used to raise the recognition rate in noisy environments, but none yet works perfectly. For rejection, confidence measure techniques are used; because these rely on likelihood scores, however, a word that sounds similar to a command can still be misrecognized as that command. Methods other than confidence measures have limitations of their own.

The present invention was devised to solve the above problems. Its technical objective is to provide a speech recognition processing method that can accept speech input at all times while raising the recognition rate under a certain level of ambient noise and lowering the frequency of misrecognition by recognizing only commands.

To achieve this objective, the speech recognition processing method according to the present invention comprises:

(a) measuring the intensity and duration of an input sound;

(b) comparing the sound intensity measured in step (a) with a predetermined intensity value required for command processing and, with reference to the comparison result and the duration, classifying the input as belonging to a dormant section, a noise section, or a silent section; and

(c) processing speech recognition in a speech recognizer while selectively switching, according to the section to which the input sound was determined to belong in step (b), among a dormant state in which the speech recognition result is not approved and control is suspended (dormant section), a noise adaptation state in which speech is recognized with noise suppressed and control is performed according to the recognition result (noise section), and a normal state in which control according to the recognition result is performed normally (silent section),

wherein, during the control of step (c), the elapsed time in each state is counted, and the recognizer is returned from the dormant state to the normal state when a predetermined time has elapsed, when a keyword forming part of a command is recognized, or when a predetermined specific command is recognized.

Preferably, the method further comprises:

(d) returning to the dormant state if, after a predetermined specific command was recognized in the dormant state in step (c) and the recognizer returned to the normal state, no voice command is recognized for a certain period of time.

Preferably, step (b) is:

(b') comparing the sound intensity and duration measured in step (a) against the command-level intensity, and classifying the input as a dormant section if the measured intensity is at or above a first predetermined value, which is the command-level intensity, and the duration exceeds a first predetermined length; as a noise section if the measured intensity is below the first predetermined value but at or above a second predetermined value lower than the first, or if the measured intensity is at or above the first predetermined value and the duration is less than the first predetermined length; and as a silent section if the measured intensity is below the second predetermined value.
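The classification rule of step (b') can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function and parameter names are hypothetical, the default thresholds (60 dB, 50 dB, 1 second) are only the example values given later in the description, and the case where the duration exactly equals the first predetermined length, which the text leaves open, is assumed here to fall into the noise section.

```python
from enum import Enum


class Section(Enum):
    SILENT = "silent"
    NOISE = "noise"
    DORMANT = "dormant"


def classify_section(intensity_db, duration_s,
                     first_db=60.0, second_db=50.0, first_len_s=1.0):
    """Classify one measured sound following step (b').

    first_db    : first predetermined value (command-level intensity), assumed 60 dB
    second_db   : second predetermined value, assumed 50 dB
    first_len_s : first predetermined length, assumed 1 second
    """
    if intensity_db < second_db:
        return Section.SILENT
    if intensity_db >= first_db and duration_s > first_len_s:
        return Section.DORMANT
    # Either a moderate sound (second_db <= intensity < first_db) or a loud
    # but short one. A duration exactly equal to first_len_s also lands here,
    # which is an assumption of this sketch.
    return Section.NOISE
```

For instance, a 65 dB sound lasting 2 seconds is classified as a dormant section, while the same intensity lasting 0.5 seconds is classified as a noise section.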

According to the present invention, when applied in fields such as home networks, where the recognizer is always listening yet everyday noise can occur, the intensity and duration of the input sound are taken into account to lower the frequency of noise-induced misrecognition. When input continues without interruption, as when people nearby are conversing, the speech recognizer switches to the dormant state and does not approve a result even if speech is recognized, which markedly reduces misrecognition in noisy environments.

Preferred embodiments of the present invention are described in more detail below with reference to the accompanying drawings.

FIG. 1 is a flowchart schematically showing the main steps of the speech recognition processing method according to the present invention. Referring to FIG. 1, a sound is input (step S100), and (a) the intensity and duration of the input sound are measured (step S102). Next, (b) the sound intensity and duration measured in step (a) are compared against the command-level intensity, and the input is classified as a dormant section if the measured intensity is at or above a first predetermined value, which is the command-level intensity, and the duration exceeds a first predetermined length; as a noise section if the measured intensity is below the first predetermined value but at or above a second predetermined value lower than the first, or if the measured intensity is at or above the first predetermined value and the duration is less than the first predetermined length; and as a silent section if the measured intensity is below the second predetermined value (step S104).

FIG. 2 illustrates an example of how sections are defined according to the intensity and duration of the input speech in the speech recognition processing method according to the present invention. Referring to FIG. 2, the silent section (200) is defined as the range in which the sound intensity is, for example, below 50 dB; in the normal state, the speech recognizer judges that there is no speech input in this range. The range of the silent section (200) changes in the noise adaptation state because the input volume is lowered, so in the noise adaptation state the silent section spans 0 to 60 dB. The noise section (202) is defined, for example, as the range in which the sound intensity is 50 to 60 dB; a sound with an intensity of 60 dB whose duration is less than 1 second is also treated as a noise adaptation section. The dormant section (204) is defined as long speech whose intensity is 65 dB or more and whose duration exceeds 1 second.
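The shift of the silent section from below 50 dB to 0 to 60 dB can be read as the effect of a fixed input attenuation. The arithmetic below is a sketch under that reading: the 10 dB attenuation is inferred from the two example ranges and is not stated in the source, and the function names are hypothetical.

```python
def effective_silence_threshold_db(base_threshold_db=50.0, attenuation_db=10.0):
    """Sound level at the microphone that still measures as 'silent'
    once the input is attenuated by attenuation_db (assumed value)."""
    # A sound at L dB is measured as (L - attenuation_db) dB after the gain
    # is lowered, so it stays below the base threshold as long as
    # L < base_threshold_db + attenuation_db.
    return base_threshold_db + attenuation_db


def amplitude_scale(attenuation_db):
    """Linear amplitude factor corresponding to an attenuation in dB."""
    return 10 ** (-attenuation_db / 20)
```

With the assumed 10 dB attenuation, the 50 dB threshold corresponds to 60 dB at the microphone, and the input amplitude is scaled by a factor of roughly 0.316.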

Next, (c) speech recognition is processed in the speech recognizer (step S106) while the recognizer selectively switches, according to the section to which the input sound was determined to belong in step (b), among the dormant state in which the speech recognition result is not approved and control is suspended (dormant section), the noise adaptation state in which speech is recognized with noise suppressed and control is performed according to the recognition result (noise section), and the normal state in which control according to the recognition result is performed normally (silent section). That is, depending on whether the recognition result was approved (step S108): if the input belongs to the normal section and the recognition result is approved, the corresponding device is controlled and the return to the normal state is processed (step S110); if the input belongs to the dormant section and control was suspended without approving the result, state switching processing for the transition to the dormant state is performed (step S120).

Meanwhile, during the control of step (c), the elapsed time in each state is counted, and the recognizer is returned from the dormant state to the normal state (step S110) when a predetermined time has elapsed, when a keyword forming part of a command is recognized, or when a predetermined specific command is recognized.

FIG. 3 schematically shows the transitions among the dormant state, the noise adaptation state, and the normal state defined in the speech recognition processing method according to the present invention. Referring to FIG. 3, speech recognition is processed in the speech recognizer (step S106), and the state of the speech recognizer is defined as one of three states: a dormant state (200) in which the speech recognition result is not approved and control is suspended; a noise adaptation state (202) in which, for input belonging to the noise section, speech is recognized with noise suppressed and control is performed according to the recognition result; and a normal state (204) in which control according to the recognition result is performed normally. Recognition results are handled differently in each state.

For example, when sound with an intensity of 60 dB or more is input around the speech recognizer for a duration of 1 second or more, the recognizer switches to the dormant state. Once in the dormant state, it does not approve the recognition result even if speech mixed with noise is recognized as a command.

From the dormant state, the speech recognizer can switch to the noise adaptation state or the normal state. First, if the intensity of the input speech falls while in the dormant state so that sound is input at 50 dB to 60 dB, the recognizer switches to the noise adaptation state. Second, if the ambient noise disappears in the dormant state and silence lasts for, for example, 5 seconds or more, or if a specific command predefined as the return-to-normal command is recognized, the recognizer returns to the normal state. In the noise adaptation state, the speech recognizer lowers the input volume, for example by adjusting the gain, so that it recognizes relatively loud speech. If, in the noise adaptation state, the ambient noise grows so that sound of 60 dB or more lasting 1 second or more is input, the recognizer switches to the dormant state; if the ambient noise disappears and silence lasts 5 seconds or more, it returns to the normal state.
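The transitions described for FIG. 3 can be written as a single transition function. The following is a sketch only: the names are hypothetical, the numeric limits are the example values from the text, the transition from the normal state to the noise adaptation state on 50 to 60 dB input is inferred from the definition of the noise section, and the shift of thresholds caused by the lowered input volume in the noise adaptation state is ignored.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    NOISE_ADAPTATION = "noise_adaptation"
    DORMANT = "dormant"


def next_state(state, intensity_db, duration_s, silence_s, wake_command=False):
    """Return the next recognizer state for one observation.

    silence_s    : how long silence has lasted so far, in seconds
    wake_command : True if the predefined return-to-normal command was recognized
    """
    loud_and_long = intensity_db >= 60.0 and duration_s >= 1.0
    moderate = 50.0 <= intensity_db < 60.0

    if state is State.DORMANT:
        if wake_command or silence_s >= 5.0:
            return State.NORMAL
        if moderate:
            return State.NOISE_ADAPTATION
        return State.DORMANT

    if loud_and_long:
        return State.DORMANT

    if state is State.NOISE_ADAPTATION:
        return State.NORMAL if silence_s >= 5.0 else State.NOISE_ADAPTATION

    # Normal state: moderate noise leads to noise adaptation (inferred).
    return State.NOISE_ADAPTATION if moderate else State.NORMAL
```

A single observation drives each transition here; the preferred embodiment described with FIG. 4 instead requires several consecutive observations before switching.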

FIG. 4 shows a speech recognition processing method according to a preferred embodiment of the present invention. In this embodiment, to make transitions between states more stable, the number of consecutive occurrences of each section is checked, and the state is switched only when a certain frequency condition is met. Preferably, the section to which the input belongs is counted once per second for each section, and the recognizer switches to the corresponding state when the count reaches a predetermined value, for example 5. Referring to FIG. 4, a sound is input (step S400), the intensity and duration of the input are measured (step S402), and the section to which it belongs is determined from the measurement (step S410).

If the input is determined in step S410 to belong to the dormant section or the noise section, the silent section counter is reset (step S412), speech recognition is processed (step S414), and whether to approve the recognition result is decided according to the current state (step S416). If the result is not approved, it is checked whether the recognizer is in the temporary normal state (step S420). As described below, this embodiment additionally defines a temporary normal state.

That is, it is checked whether the current control state is the temporary normal state (S420); if so, the temporary normal counter is incremented (S450), and when its count reaches a predetermined value, for example 3, the recognizer returns to the dormant state (S454).

If the recognizer is not in the temporary normal state, the input was determined in step S410 to belong to the noise section, and the current control state is not the noise adaptation state (S430), the noise section counter assigned to the noise section is incremented (S432). When the noise section count reaches a predetermined value, for example 7, the control state of the speech recognizer is switched to the noise adaptation state (S434), which lowers the microphone volume and thereby prompts the user to speak more loudly.

Likewise, if the input was determined in step S410 to belong to the dormant section and the current control state is not the dormant state (S440), the dormant section counter assigned to the dormant section is incremented (S442), and it is checked whether the dormant section count has reached, for example, 5 (S444); if so, the recognizer switches to the dormant state (S446).

If, on the other hand, the input is determined in step S410 to belong to the silent section, the silent section counter, which counts in 1-second units, is incremented (S460). When its count reaches a predetermined value, for example 5 or more (that is, silence has been maintained for 5 seconds or more), and the current control state is the noise adaptation state (S464) or the dormant state (S466), the control state is returned to the normal state (S468) and the dormant section counter and the noise section counter are reset to "0" (S470). Note that, in this embodiment, the transition from the normal state to the dormant state or the noise adaptation state occurs only when input classified into the corresponding section repeats a certain number of times or more without silence having been maintained for the set period in between.
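The counter logic of steps S412 and S430 to S470 can be sketched as a small class that is fed one section label per second. This is an illustrative reading, not the patented implementation: the class and attribute names are hypothetical, the limits (7, 5, 5) are the example values from the text, the temporary normal state is left out, and the sketch assumes that the noise and dormant counters are cleared after sustained silence even when the recognizer is already in the normal state.

```python
class SectionCounterStateMachine:
    """Switches state only after a section has been observed repeatedly."""

    def __init__(self, noise_limit=7, dormant_limit=5, silence_limit=5):
        self.state = "normal"
        self.noise_limit = noise_limit
        self.dormant_limit = dormant_limit
        self.silence_limit = silence_limit
        self.noise_count = 0
        self.dormant_count = 0
        self.silence_count = 0

    def observe(self, section):
        """section is 'silent', 'noise' or 'dormant'; call once per second."""
        if section == "silent":
            self.silence_count += 1                      # S460
            if self.silence_count >= self.silence_limit:
                self.state = "normal"                    # S468
                self.noise_count = 0                     # S470
                self.dormant_count = 0
            return self.state

        self.silence_count = 0                           # S412
        if section == "noise" and self.state != "noise_adaptation":
            self.noise_count += 1                        # S432
            if self.noise_count >= self.noise_limit:
                self.state = "noise_adaptation"          # S434
        elif section == "dormant" and self.state != "dormant":
            self.dormant_count += 1                      # S442
            if self.dormant_count >= self.dormant_limit:
                self.state = "dormant"                   # S446
        return self.state
```

Feeding five "dormant" observations in a row moves the machine to the dormant state, and five subsequent "silent" observations bring it back to the normal state with both section counters cleared.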

Next, the return from the dormant state or the noise adaptation state to the normal state is described. When the speech recognizer has switched to the dormant state, it still performs speech recognition but rejects the recognized results rather than approving them, which can be understood as equivalent to performing no speech recognition at all.

For example, the following three conditions can be applied for leaving the dormant state and returning to the normal state.

First, as described with reference to FIG. 4, the recognizer returns to the normal state when silent-section input, with no speech, continues for a certain time, for example 5 seconds. Preferably, a counter (the silent section counter) is provided to count silent-section inputs.

Second, a predetermined return-to-normal command can be used. In the dormant state the speech recognizer does not approve recognition results for any other command, but it does approve the result for the predetermined return-to-normal command.

Third, after a "keyword" forming part of a command is recognized, the recognizer switches for a predetermined time to a temporary normal state. A command consists of a keyword, the name of the device to be controlled, and the control action for that device. For example, for the command "나비야 불 켜" ("Nabiya, turn on the light"), recognizing the keyword "나비야" ("Nabiya") switches the recognizer to the temporary normal state, and if no command is recognized within, for example, three utterances, it immediately returns to the dormant state.

Referring again to FIG. 4, if the recognition result is approved in step S416 and the current control state is not the dormant state (S480), the recognizer is in the normal state, so it checks whether the result is a device control command (S482) and performs device control (S484); if the control state at that point is the temporary normal state (S486), it is returned to the normal state (S488). If the recognition result is approved in step S416 but the current control state is the dormant state (S480), it is checked whether the result is the return-to-normal command (S490), and if so the recognizer returns to the normal state (S488). If the recognized speech is not the return-to-normal command, it is checked whether it is a keyword (S492), and if so the recognizer switches to the temporary normal state (S494).
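The handling of recognition results around the dormant and temporary normal states (steps S420, S450 to S454 and S480 to S494) can be sketched as follows. This is a simplified illustration: recognition and approval are reduced to a string comparison and a boolean flag, the class name and the return-to-normal phrase are hypothetical, the keyword "나비야" is the example from the description, and the limit of three utterances is the example value given there.

```python
RETURN_COMMAND = "wake up"   # hypothetical return-to-normal command
KEYWORD = "나비야"            # example keyword from the description
MAX_TEMP_UTTERANCES = 3      # example limit from the description


class CommandGate:
    """Decides whether a recognized utterance may control a device."""

    def __init__(self, state="dormant"):
        self.state = state
        self.temp_misses = 0

    def on_utterance(self, text, is_device_command):
        """Return True if device control should be carried out."""
        if self.state == "dormant":
            if text == RETURN_COMMAND:           # S490 -> S488
                self.state = "normal"
            elif text == KEYWORD:                # S492 -> S494
                self.state = "temporary_normal"
                self.temp_misses = 0
            return False                         # nothing else is approved

        if is_device_command:                    # S482 -> S484
            if self.state == "temporary_normal": # S486 -> S488
                self.state = "normal"
            return True

        if self.state == "temporary_normal":     # S420 -> S450
            self.temp_misses += 1
            if self.temp_misses >= MAX_TEMP_UTTERANCES:
                self.state = "dormant"           # S454
        return False
```

Starting in the dormant state, a device command alone is ignored; after the keyword is heard, the next device command is executed and the gate settles in the normal state, whereas three utterances without a command send it back to the dormant state.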

Next, the return from the noise adaptation state to the normal state is described.

In the noise adaptation state, the input volume is lowered to reduce the intensity of the incoming speech. Because the speech reaching the recognizer is weaker, the user is prompted to speak more loudly; louder speech improves the signal-to-noise ratio (SNR) of the input and thus raises the recognition rate in noisy environments. The noise adaptation state is maintained until silence has lasted 5 seconds or more, at which point the speech recognizer restores the input volume to its normal-state value and switches to the normal state.

It is desirable to inform the user of the current state whenever the speech recognizer changes state. To this end, preferably a green lamp is lit in the normal state, a red lamp when the recognizer switches to the dormant state, and a yellow lamp in the noise adaptation state.

FIG. 1 is a flowchart showing the main steps of the speech recognition processing method according to the present invention;

FIG. 2 is a diagram illustrating an example of how sections are defined according to the intensity and duration of the input speech in the speech recognition processing method according to the present invention;

FIG. 3 is a diagram illustrating the transitions among the dormant state, the noise adaptation state, and the normal state defined in the speech recognition processing method according to the present invention; and

FIG. 4 is a flowchart showing a speech recognition processing method according to a preferred embodiment of the present invention.

Claims

delete

(a) measuring the intensity and duration of the input sound;

(b) comparing the sound intensity measured in step (a) with a predetermined intensity value required for command processing and, with reference to the comparison result and the duration, classifying the input as belonging to a dormant section, a noise section, or a silent section; and

(c) processing speech recognition in a speech recognizer while selectively switching, according to the section to which the input sound was determined to belong in step (b), among a dormant state in which device control is suspended without approving the speech recognition result when the input belongs to the dormant section, a noise adaptation state in which the device is controlled according to the speech recognition result with noise suppressed when the input belongs to the noise section, and a normal state in which the device is controlled normally according to the speech recognition result when the input belongs to the silent section;

wherein, during the control of step (c), the recognizer is returned from the dormant state to the normal state when the silent section lasts for a predetermined time, when a keyword constituting a command is recognized, or when a predetermined specific command is recognized; and

(d) returning to the dormant state if no voice command is recognized for a predetermined time after the predetermined specific command was recognized in the dormant state in step (c) and the recognizer returned to the normal state, the speech recognition processing method being characterized by comprising these steps.

(a) measuring the intensity and duration of the input sound;

In step (b),

(b') comparing the sound intensity and duration measured in step (a) against the command-level intensity, and classifying the input as a dormant section if the measured intensity is at or above a first predetermined value, which is the command-level intensity, and the duration exceeds a first predetermined length; as a noise section if the measured intensity is below the first predetermined value but at or above a second predetermined value lower than the first, or if the measured intensity is at or above the first predetermined value and the duration is less than the first predetermined length; and as a silent section if the measured intensity is below the second predetermined value.

(a) measuring the intensity and duration of the input sound;

In step (c),

(c') processing speech recognition in the speech recognizer while counting, for each section, how often the input sound is determined in step (b) to belong to that section and, when a count reaches a predetermined value, selectively switching to the dormant state in which device control is suspended without approving the speech recognition result, the noise adaptation state in which the device is controlled according to the speech recognition result with noise suppressed, or the normal state in which the device is controlled normally according to the speech recognition result.

(a) measuring the intensity and duration of the input sound;

In step (c), when the speech recognizer, while processing speech recognition, recognizes a keyword constituting a command, it switches for a predetermined time to a temporary normal state, and it immediately returns to the dormant state if no command is recognized within a predetermined number of utterances.