KR20040032660A

KR20040032660A - Method for detecting voice signals in voice processor

Info

Publication number: KR20040032660A
Application number: KR1020020061872A
Authority: KR
Inventors: 권정일
Original assignee: 서울통신기술 주식회사
Priority date: 2002-10-10
Filing date: 2002-10-10
Publication date: 2004-04-17
Also published as: KR100491753B1

Abstract

PURPOSE: A method for detecting voice signals in a voice processing board is provided that promptly judges whether a voice signal has been input by starting detection as soon as energy above a certain level arrives. CONSTITUTION: A voice processing board initializes a voice count and a mute count (S1). The voice processing board judges whether the energy of a voice signal input in units of voice blocks is larger than a first speech threshold (S2). If so, the voice processing board increases the voice count (S3) and judges whether the voice count value is larger than a third speech threshold (S4). If so, the voice processing board initializes the mute count (S5) and judges whether the voice count value is larger than a second speech threshold (S6). If the energy of the voice signal is smaller than the first speech threshold, the voice processing board increases the mute count (S7) and judges whether the mute count is larger than the third speech threshold (S8). If so, the voice count is initialized (S9).

Description

Method for detecting voice signals in voice processor

The present invention relates to a method for detecting a voice signal in a voice processing board, and more particularly to such a method adapted to further improve the reliability of voice signal detection.

In general, an automatic response service (ARS) records response content in a computer's storage device; when a query arrives by telephone, the computer interprets the query, selects the corresponding response terms, and answers automatically by voice. There are two ways to produce such an automatic response voice: reading the required portions of speech previously recorded on a magnetic drum or similar medium and editing them into a response, or mechanically synthesizing speech from a few basic parameters. ARS services include telephone number guidance, automatic guidance on changed telephone numbers, telephone fault reporting and inquiry guidance, telephone weather forecasts, and the telephone time signal.

With the development of information and communication technology, however, new supplementary functions are being added to such automatic response services. One of them is detecting the user's voice signal and performing the supplementary function the user requests. For example, when the automatic response service asks the user to say something, the voice processing board provided in the service detects the incoming voice signal, analyzes the detected signal, and performs a preset function.

A voice signal detection method in a conventional voice processing board is described below with reference to the accompanying FIG. 1.

According to the prior art, when a voice signal is input, the voice processing board detects its energy level. That is, as shown in FIG. 1, the board checks the energy level of the voice signal for each voice block of 10 to 16 msec within one preset window. For each block, the board determines whether the detected energy level is greater than a preset voice reference value (see FIG. 1). If the number of blocks whose energy level exceeds the voice reference value is greater than the trigger value (see FIG. 1), the board regards voice as having been input. If the number of such blocks is smaller than the trigger value, the board regards no voice as having been input.

Having performed detection in this manner, the voice processing board repeats the same detection for the next window.
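As a rough illustration of the per-block energy check described above, the sketch below splits a sampled signal into fixed-length blocks and computes one energy value per block. The patent does not say how the energy is computed; the mean-square definition, the 8 kHz sample rate, the 10 msec default block length, and the name `block_energies` are all assumptions made for this example.

```python
from typing import List, Sequence


def block_energies(
    samples: Sequence[float],
    sample_rate_hz: int = 8000,  # assumed telephony sample rate
    block_ms: int = 10,          # assumed; the source gives 10 to 16 msec
) -> List[float]:
    """Return one mean-square energy value per complete block (assumed definition)."""
    block_len = sample_rate_hz * block_ms // 1000
    if block_len <= 0:
        raise ValueError("block length must be at least one sample")

    energies = []
    # Step through the signal one block at a time; a trailing partial block is dropped.
    for start in range(0, len(samples) - block_len + 1, block_len):
        block = samples[start:start + block_len]
        energies.append(sum(s * s for s in block) / block_len)
    return energies
```

With the defaults, each block holds 80 samples, so a signal of 200 samples yields two energy values and the last 40 samples are ignored.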

With this conventional method, however, when a voice signal roughly one window in length is input and it straddles the boundary between the current window and the next, the voice processing board may conclude that no voice signal was input. This can cause the entire communication system that employs the voice processing board to malfunction, with the serious consequence that the service users or subscribers want cannot be provided normally.
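The boundary problem can be reproduced with a small sketch of the windowed scheme. The function name `detect_voice_windowed`, the window of 8 blocks, the trigger value of 5, and the energy values are hypothetical, chosen only to show the effect; none of them comes from the patent.

```python
from typing import List, Sequence


def detect_voice_windowed(
    energies: Sequence[float],
    reference: float,   # voice reference value (energy level)
    trigger: int,       # trigger value (block count)
    window_size: int,   # blocks per window (hypothetical parameter)
) -> List[bool]:
    """Return one voice/no-voice decision per fixed, non-overlapping window."""
    decisions = []
    for start in range(0, len(energies), window_size):
        window = energies[start:start + window_size]
        # Count the blocks in this window whose energy exceeds the reference value.
        loud_blocks = sum(1 for e in window if e > reference)
        decisions.append(loud_blocks > trigger)
    return decisions


# Eight loud blocks that line up with the first window.
aligned = [5.0] * 8 + [0.0] * 8
# The same eight loud blocks, shifted so they straddle the window boundary.
straddling = [0.0] * 4 + [5.0] * 8 + [0.0] * 4

print(detect_voice_windowed(aligned, 1.0, 5, 8))     # [True, False]
print(detect_voice_windowed(straddling, 1.0, 5, 8))  # [False, False]
```

In the second case each window sees only four loud blocks, which does not exceed the trigger value, so the same burst of speech goes undetected.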

Accordingly, an object of the present invention, proposed to solve the above problems of the prior art, is to provide a voice signal detection method for a voice processing board that, when detecting whether a voice signal has been input, detects it quickly regardless of the discontinuity and magnitude of the voice signal.

Another object of the present invention is to provide a voice signal detection method for a voice processing board that, when detecting whether a voice signal has been input, detects the voice signal accurately by starting detection at the point when energy corresponding to a voice signal is input.

FIG. 1 is a waveform diagram of a voice signal in a conventional voice processing board.

FIG. 2 is a waveform diagram of a voice signal in a voice processing board according to the present invention.

FIG. 3 is a control flowchart illustrating a voice signal detection method in a voice processing board according to the present invention.

To achieve these objects, the voice signal detection method according to the present invention, performed in a voice processing board provided in a communication system, comprises: (a) determining whether the energy of a voice signal input in a predetermined unit is greater than a preset first reference value; (b) if the energy of the voice signal is greater than the first reference value, increasing a voice count and determining whether the voice count value is greater than a preset third reference value; (c) if the voice count value is determined to be greater than the third reference value, initializing a silence count value and determining whether the voice count value is greater than a preset second reference value; and (d) if the voice count value is greater than the second reference value, determining that the voice signal has been received.

The method further comprises: (e) if the energy of the voice signal is smaller than the first reference value, increasing the silence count and determining whether the silence count is greater than the third reference value; and (f) if the silence count value is not greater than the third reference value, executing step (a), and if the silence count value is greater than the third reference value, initializing the voice count value and then executing step (a).

Here, the first reference value is the energy level used to decide, block by block, whether a voice signal has been input. The second reference value is a count value specifying how many consecutive high-energy blocks must arrive before a voice signal is judged to have been input. The third reference value is a count value specifying how many consecutive high-energy or low-energy blocks must arrive before the current state is changed from the voice state to the silent state, or from the silent state to the voice state.

According to these features of the present invention, the conventional detection method is improved so that the voice processing board can detect a voice signal both faster and more accurately. The reliability of the functions and operation of the entire communication system can therefore be increased.

Hereinafter, a voice signal detection method in a voice processing board according to the present invention is described in detail with reference to the accompanying drawings.

In the voice processing board according to the present invention, three reference values, the first through the third, must be set in advance. The first reference value is the energy level used to decide, block by block, whether a voice signal has been input. The second reference value is a count value specifying how many consecutive high-energy blocks must arrive before a voice signal is judged to have been input. Finally, the third reference value is a count value specifying how many consecutive high-energy or low-energy blocks must arrive before the current state is changed from the voice state to the silent state, or from the silent state to the voice state.

First, the voice processing board initializes the voice count and the silence count (S1). The board then determines whether the energy of the voice signal, input in units of the voice blocks shown in FIG. 2, is greater than the first reference value (Speech Threshold) (S2).

If step S2 finds that the energy of the voice signal is greater than the first reference value, the voice processing board increases the voice count (S3) and determines whether the voice count value is greater than the third reference value (S4).

If step S4 finds that the voice count value is smaller than the third reference value, the voice processing board returns to step S2. If step S4 finds that the voice count value is greater than the third reference value, the board initializes the silence count (S5) and then determines whether the voice count value is greater than the second reference value (S6).

If step S6 finds that the voice count value is smaller than the second reference value, the board returns to step S2. If the voice count value is greater than the second reference value, the board determines that the voice signal has been fully received.

If, on the other hand, step S2 finds that the energy of the voice signal input in units of voice blocks is smaller than the first reference value, the board increases the silence count (S7) and then determines whether the silence count is greater than the third reference value (S8).

If step S8 finds that the silence count value is not greater than the third reference value, the board returns to step S2. If the silence count value is greater than the third reference value, the board initializes the voice count value (S9) and then returns to step S2.
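The flow of steps S1 through S9 can be sketched as a single loop over per-block energy values. This is a minimal sketch under stated assumptions, not the patented implementation: the function name `detect_voice`, the parameter names, and the return convention (index of the block at which voice is declared, or `None`) are hypothetical, and a block whose energy exactly equals the first reference value is treated as silence because the source does not specify that case.

```python
from typing import Iterable, Optional


def detect_voice(
    energies: Iterable[float],
    first_ref: float,   # first reference value: per-block energy level
    second_ref: int,    # second reference value: voice count needed to declare voice
    third_ref: int,     # third reference value: count needed to switch state
) -> Optional[int]:
    """Return the index of the block at which voice is declared, else None."""
    voice_count = 0                           # S1
    silence_count = 0                         # S1

    for index, energy in enumerate(energies):
        if energy > first_ref:                # S2: block carries high energy
            voice_count += 1                  # S3
            if voice_count > third_ref:       # S4
                silence_count = 0             # S5
                if voice_count > second_ref:  # S6
                    return index              # voice signal received
        else:                                 # assumption: equality counts as silence
            silence_count += 1                # S7
            if silence_count > third_ref:     # S8
                voice_count = 0               # S9
    return None


# Eight loud blocks surrounded by silence, not aligned to any window.
burst = [0.0] * 4 + [5.0] * 8 + [0.0] * 4
print(detect_voice(burst, first_ref=1.0, second_ref=5, third_ref=2))  # 9

# Short dropouts do not reset the voice count while the silence count stays small.
gappy = [5.0, 5.0, 0.0, 5.0, 5.0, 0.0, 5.0]
print(detect_voice(gappy, first_ref=1.0, second_ref=4, third_ref=2))  # 6
```

Because the counters run continuously rather than restarting at fixed window boundaries, the burst is detected wherever it begins, and the decision is made as soon as the voice count exceeds the second reference value.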

According to the present invention described above, the voice processing board begins voice signal detection as soon as energy above a certain level arrives, so it can determine whether a voice signal has been input faster than the prior art, which detects by applying a window.

In addition, because detection starts at the point when energy corresponding to a voice signal is input, the input of a voice signal can always be determined accurately.

Therefore, a system employing a voice processing board to which the voice signal detection method of the present invention is applied can always provide its various functions accurately, which greatly improves product reliability.

Claims

A voice signal detection method for a voice processing board provided in a communication system, the method comprising:

(a) determining whether the energy of a voice signal input in a predetermined unit is greater than a preset first reference value;

(b) if the energy of the voice signal is greater than the first reference value, increasing a voice count and determining whether the voice count value is greater than a preset third reference value;

(c) if the voice count value is determined to be greater than the third reference value, initializing a silence count value and determining whether the voice count value is greater than a preset second reference value; and

(d) if the voice count value is greater than the second reference value, determining that the voice signal has been received.

The method of claim 1, wherein the predetermined unit is a voice block unit.

The method of claim 1, further comprising:

(e) if the energy of the voice signal is smaller than the first reference value, increasing the silence count and determining whether the silence count is greater than the third reference value; and

(f) if the silence count value is not greater than the third reference value, executing step (a), and if the silence count value is greater than the third reference value, initializing the voice count value and then executing step (a).

The method of claim 1, wherein the first reference value is an energy level serving as the criterion for determining, in units of blocks, whether a voice signal has been input.

The method of claim 1, wherein the second reference value is a count value serving as the criterion for how many consecutive high-energy blocks must be input before a voice signal is determined to have been input.

The method of claim 1, wherein the third reference value is a count value serving as the criterion for how many consecutive high-energy or low-energy blocks must be input before the current state is changed from the voice state to the silent state, or from the silent state to the voice state.