JPH11126093A

JPH11126093A - Voice input adjusting method and voice input system

Info

Publication number: JPH11126093A
Application number: JP9292368A
Authority: JP
Inventors: Osahisa Okamoto; 長久岡本; Koji Aizawa; 浩二相沢; Yoshimizu Iida; 義瑞飯田
Original assignee: Hitachi Engineering and Services Co Ltd
Current assignee: Hitachi Engineering and Services Co Ltd
Priority date: 1997-10-24
Filing date: 1997-10-24
Publication date: 1999-05-11

Abstract

PROBLEM TO BE SOLVED: To improve the recognition rate of voice input and to make input work easier. SOLUTION: An A/D converting device 2 multiplies an input signal from a microphone 1 by a gain and performs A/D conversion. When the input signal is voice, a voice recognition device 3 divides it into prescribed units (syllables), pattern-matches them against the words of a syllable dictionary, and outputs the matching word code to a CPU 4. At this time, a level measurement part 32 measures the sound pressure level of each prescribed unit and outputs it to the CPU 4. A level decision part 42 in the CPU 4 decides whether or not the measured level lies within a prescribed range in which a high recognition rate is obtained; when it is out of the range, the level decision part 42 outputs an input gain control signal to a gain adjustment part 22 so that the input voice is brought within the prescribed range.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a voice input system, and more particularly to a method for adjusting the voice input gain for voice recognition.

[0002]

[Prior art] Voice data input systems, in which measurement data, control commands, and the like are input by voice, converted into coded data by a voice recognition device, and input to a computer, have begun to spread. Voice input for word processors, driver map-guidance systems, and the like has also reached the stage of practical use.

[0003] In a conventional voice input system, the gain (volume) of the voice input unit is adjusted to an appropriate range before formal voice input, according to the sound pressure level of the speaker's voice.

[0004]

[Problem to be solved by the invention] In the conventional voice input adjustment method, the input gain is fixed once it has been adjusted. Consequently, when the speaker changes the loudness (strength) of his or her voice in response to the surroundings, matching against the voice dictionary may fail and the input may not be recognized. The voice input must then be repeated each time, which makes the system inconvenient to use.

[0005] An object of the present invention is to provide a voice input adjustment method that achieves a high recognition rate regardless of the loudness (strength) of the speaker's input voice, and an easy-to-use voice input system that employs the method.

[0006]

[Means for solving the problem] The present invention is a voice input method in which an input signal from a voice input unit is A/D converted, pattern matching is performed using a voice dictionary that stores, in units such as words, voice patterns and the data of their corresponding codes, and the input signal is converted into the matching code. The method is characterized in that, when the input signal includes a pause of at least a certain duration, the signal is determined to be voice and its sound pressure level is obtained; it is then determined whether the sound pressure level is within a predetermined sound pressure range that gives a good recognition rate, and when it is outside the range, the input gain is adjusted so that the level falls within the range.

[0007] A voice input system that realizes the present invention is characterized in that an automatic adjustment function for the voice input gain is provided in the voice input unit of the A/D converter, a function for measuring the sound pressure level of the input voice signal is provided in the voice recognition device, and a level range determination function is provided in the CPU, and in that the system is configured so that, when the sound pressure level of the input voice signal is outside a preset sound pressure range, the CPU outputs an input gain adjustment signal to the automatic adjustment function.

[0008] According to the present invention, a stable, high recognition rate can be obtained even when the speaker changes and the loudness of the voice differs, or when the speaker changes the loudness of his or her voice depending on the surroundings, and the work of re-entering input after a recognition failure can be greatly reduced.

[0009]

[Embodiment of the invention] An embodiment of the present invention is described below in detail with reference to the drawings. FIG. 1 shows the functional blocks of a voice input system according to one embodiment. The input signal from the microphone 1 is a temporal change in sound pressure. The A/D converter 2 multiplies it by the gain, converts it from an analog signal to a digital signal with the A/D conversion function 21, and outputs it to the voice recognition device 3.
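The signal path of paragraph [0009], gain multiplication followed by A/D conversion, can be sketched as follows. This is only an illustration: the function name `gain_then_quantize`, the normalized analog range of -1.0 to 1.0, the 16-bit word length, and the clipping at full scale are all assumptions that do not appear in the specification.

```python
from typing import List, Sequence


def gain_then_quantize(analog: Sequence[float], gain: float, bits: int = 16) -> List[int]:
    """Multiply each analog sample by the gain, then convert it to a signed integer.

    Assumptions (not from the specification): samples are normalized to
    [-1.0, 1.0], and anything beyond full scale after the gain is clipped.
    """
    full_scale = 2 ** (bits - 1) - 1  # 32767 for 16 bits
    digital = []
    for x in analog:
        amplified = x * gain                          # gain stage
        clipped = max(-1.0, min(1.0, amplified))      # converter cannot exceed full scale
        digital.append(int(round(clipped * full_scale)))  # A/D conversion
    return digital
```

With a gain of 2.0, the inputs 0.5 and 0.9 both come out at full scale in this sketch, which is one way an unsuitable gain could distort the signal passed to the recognition stage.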

[0010] The recognition processing unit 31 of the voice recognition device 3 stores the input signal in memory in time series while dividing it into syllables and words on the basis of features such as segmental rhythm and pauses. It then performs pattern matching against a voice dictionary whose units are, for example, words (e.g., "den'atsu" (電圧, voltage)) formed by combining syllables ("a", "i", "u", ...), and outputs the recognized word code to the CPU 4.

[0011] At this time, if no word corresponds to the voice input, a code indicating "no match" is output. If the input signal is a continuous sound or the like and does not include a pause of at least a certain duration (for example, 0.4 seconds), it is regarded as noise and the recognition processing is stopped. The CPU 4 processes the input word code according to the application program 41 or stores it in memory. There are several well-known voice recognition methods besides the one described above, and the configuration of the present invention is not limited to any particular one.
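The voice/noise decision of paragraph [0011] can be sketched as a search for a sufficiently long pause. The 0.4-second figure comes from the text; the names `longest_silent_run` and `is_voice`, the amplitude threshold used to call a sample silent, and the extra requirement that the signal contain some sound at all are assumptions made for this sketch.

```python
from typing import Sequence


def longest_silent_run(samples: Sequence[float], silence_threshold: float) -> int:
    """Return the length, in samples, of the longest run below the threshold."""
    longest = current = 0
    for s in samples:
        if abs(s) < silence_threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def is_voice(samples: Sequence[float], sample_rate_hz: int,
             silence_threshold: float = 0.02, min_pause_s: float = 0.4) -> bool:
    """Treat the signal as voice only if it contains a pause of min_pause_s or more.

    silence_threshold is a hypothetical parameter; the specification does not
    say how a pause is detected. A signal with no sound at all is rejected,
    which is also an assumption.
    """
    min_pause_samples = round(min_pause_s * sample_rate_hz)
    has_sound = any(abs(s) >= silence_threshold for s in samples)
    return has_sound and longest_silent_run(samples, silence_threshold) >= min_pause_samples
```

A continuous tone with no gap returns `False` here and would be dropped as noise, while two bursts separated by 0.4 seconds of silence return `True`.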

[0012] Further, in the voice input system of this embodiment, a gain adjustment unit 22 is added to the A/D converter 2, a level measurement unit 32 to the voice recognition device 3, and a level determination unit 42 to the CPU 4, and the voice input gain is adjusted automatically as follows.

[0013] FIG. 2 shows a flowchart of the voice input gain adjustment method according to one embodiment. First, the recognition processing unit 31 determines whether the input signal is voice (s101). If it is voice, the sound pressure level (average value) of each unit delimited by pauses is measured (s102).
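Step s102 can be illustrated by splitting the signal at pauses and averaging the level of each resulting unit. The specification says only that an average value is measured per unit; using the mean absolute amplitude, the silence threshold, and the names `split_at_pauses` and `average_level` are assumptions of this sketch.

```python
from typing import List, Sequence


def split_at_pauses(samples: Sequence[float], silence_threshold: float,
                    min_pause_samples: int) -> List[List[float]]:
    """Split the signal into units separated by pauses of min_pause_samples or more.

    Only non-silent samples are kept in each unit, so short gaps inside a
    unit do not lower its average (an assumption of this sketch).
    """
    units: List[List[float]] = []
    current: List[float] = []
    silent_run = 0
    for s in samples:
        if abs(s) < silence_threshold:
            silent_run += 1
            if silent_run == min_pause_samples and current:
                units.append(current)  # the pause is long enough: close the unit
                current = []
        else:
            silent_run = 0
            current.append(s)
    if current:
        units.append(current)
    return units


def average_level(unit: Sequence[float]) -> float:
    """Mean absolute amplitude of one unit, used here as its sound pressure level."""
    return sum(abs(s) for s in unit) / len(unit)
```

For example, a quiet burst, a pause, and a loud burst yield two units whose average levels can then be passed one by one to the range check of step s103.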

[0014] Next, it is determined whether the measured level is within a predetermined range set in advance, that is, a range suitable for recognition processing (s103). If the measured level is above the predetermined range, a gain-down command corresponding to the deviation is output to the gain adjustment unit 22 (s104); if it is below the predetermined range, a gain-up command is output (s105). If the level is within the predetermined range, no gain adjustment command is output.
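The decision of steps s103 to s105 reduces to a three-way comparison. In the sketch below, the `GainCommand` type, the function name `decide_gain`, and the use of decibels are assumptions. The text states that the gain-down command depends on the deviation; sizing the gain-up command the same way is assumed here by symmetry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GainCommand:
    direction: str    # "down", "up", or "none"
    amount_db: float  # size of the requested change (hypothetical unit: dB)


def decide_gain(level_db: float, low_db: float, high_db: float) -> GainCommand:
    """Compare the measured level with the preset range [low_db, high_db]."""
    if level_db > high_db:
        # s104: above the range, request a reduction equal to the deviation
        return GainCommand("down", level_db - high_db)
    if level_db < low_db:
        # s105: below the range, request an increase (deviation-sized by assumption)
        return GainCommand("up", low_db - level_db)
    # within the range: no gain adjustment command is issued
    return GainCommand("none", 0.0)
```

With a range of -20 dB to -10 dB, a measured level of -5 dB yields a 5 dB gain-down command, and a level of -15 dB yields no command.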

[0015] FIG. 3 is a conceptual diagram of the voice input adjustment operation according to this embodiment. When the level of input voice 301 is high, the gain of the A/D converter 2 is lowered, and the subsequent input voice 302 is controlled so as to fall within the high-recognition-rate range. When the level of input voice 303 is low, the gain is raised, and the subsequent input voice 304 is controlled so as to fall within the high-recognition-rate range.
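The behavior in FIG. 3 can be reproduced with a small simulation over successive utterances. All numbers, the decibel scale, and the name `run_gain_loop` are illustrative assumptions; the model simply adds the current gain to each raw level and corrects the gain by the full deviation whenever the measured level leaves the range.

```python
from typing import List, Sequence, Tuple


def run_gain_loop(raw_levels_db: Sequence[float], low_db: float, high_db: float,
                  gain_db: float = 0.0) -> Tuple[List[float], float]:
    """Return the measured level of each utterance and the final gain.

    The gain is updated after each utterance, so a correction takes effect
    from the next utterance onward, as with inputs 302 and 304 in FIG. 3.
    """
    measured: List[float] = []
    for raw in raw_levels_db:
        level = raw + gain_db          # level seen after the gain stage
        measured.append(level)
        if level > high_db:
            gain_db -= level - high_db  # too loud: lower the gain
        elif level < low_db:
            gain_db += low_db - level   # too quiet: raise the gain
    return measured, gain_db


# Two loud utterances followed by two quiet ones, with a target range of -20 to -10 dB.
levels, final_gain = run_gain_loop([-2.0, -2.0, -30.0, -30.0], -20.0, -10.0)
```

In this run the first and third utterances fall outside the range and trigger a correction, while the second and fourth land on the edge of the range after the gain has been adjusted.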

[0016] As described above, this embodiment automatically adjusts the gain in real time according to the actual level of the input voice so that the level falls within the range in which the recognition rate is high. Therefore, even when the speaker's voice level fluctuates because the input environment changes, the recognition rate can be improved and re-entry of input can be greatly reduced.

[0017]

[Effect of the invention] According to the present invention, the input gain is optimized in real time according to the input voice, so the recognition rate of the input voice is improved and voice input work becomes easier.

[Brief description of the drawings]

FIG. 1 is a functional block diagram of a voice input system according to an embodiment of the present invention.

FIG. 2 is a flowchart illustrating a voice input adjustment method according to an embodiment of the present invention.

FIG. 3 is an explanatory diagram showing the effect of the adjustment operation of the present invention.

[Explanation of symbols]

1: microphone, 2: A/D converter, 22: gain adjustment unit, 3: voice recognition device, 31: recognition processing unit, 32: level measurement unit, 4: CPU, 41: application program, 42: level determination unit.

Claims

[Claims]

1. A voice input adjustment method for a voice input method in which an input signal from a voice input unit is A/D converted, pattern matching is performed using a voice dictionary that stores, in units such as words, voice patterns and the data of their corresponding codes, and the input signal is converted into the matching code, the voice input adjustment method comprising: determining that the input signal is voice when it includes a pause of a predetermined time or more, and obtaining its sound pressure level; determining whether the sound pressure level is within a predetermined sound pressure range having a good recognition rate; and, when the sound pressure level is outside the range, adjusting the input gain so that the level falls within the range.

2. A voice input system comprising: a microphone for voice input; an A/D converter that converts an input voice signal into a digital signal; a voice recognition device that performs recognition processing on the converted input voice signal using a voice dictionary and outputs the matching code; and a CPU that receives the code and performs predetermined application processing, wherein an automatic adjustment function for the voice input gain is provided in a voice input unit of the A/D converter, a function for measuring the sound pressure level of the input voice signal is provided in the voice recognition device, and a level range determination function is provided in the CPU, and wherein the system is configured so that, when the sound pressure level of the input voice signal is outside a preset sound pressure range, the CPU outputs an input gain adjustment signal to the automatic adjustment function.