JP2000338986A

JP2000338986A - Voice input device, control method therefor and storage medium

Info

Publication number: JP2000338986A
Application number: JP11150004A
Authority: JP
Inventor: Shigeru Nishikawa (西川 成)
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1999-05-28
Filing date: 1999-05-28
Publication date: 2000-12-08

Abstract

PROBLEM TO BE SOLVED: To remove the user's reluctance to speak to a device so that voice input can be performed casually, and to distinguish speech addressed to the device from speech addressed to other people in the ordinary way. SOLUTION: In this voice input device 1, a level comparison unit 106 in the voice input unit compares the input voice signal with a threshold for deciding whether the voice is at whisper volume, and passes only voice at or below the threshold. This removes the reluctance to speak to a device so that voice input can be performed casually, and allows speech addressed to the device to be distinguished from speech addressed to other people in the ordinary way.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a voice input device, a control method therefor, and a storage medium.

[0002]

[Prior art] Devices that control equipment by voice input include answering machines, into which greetings and messages are entered by voice, and, more recently, car navigation systems with a voice recognition function. Voice input is also used as a means of entering text in place of a keyboard. However, there are situations in which a user feels reluctant to use such a voice input device, because the user does not want others to overhear the spoken input or does not want to disturb them. It can also be difficult to speak loudly in a public place.

[0003] Proposals for entering voice without the voice information being overheard by people nearby, and without being affected by ambient noise, include the voice information processing apparatus of Japanese Patent Laid-Open No. 5-265472 and the portable mobile radio terminal of Japanese Patent Laid-Open No. 5-114880.

[0004] The voice information processing apparatus disclosed in Japanese Patent Laid-Open No. 5-265472 uses an ellipsoidal sound reflector. A transmitter-receiver (speaker and microphone) for voice input and output is placed at one focus of the ellipsoid, and the structure positions the operator's head, in particular the ears and mouth, at the other focus. The operator can thus input and output voice information without holding a handset and without the information being overheard by people queuing behind.

[0005] The portable mobile radio terminal disclosed in Japanese Patent Laid-Open No. 5-114880 aims to provide a compact terminal that does not require the user to raise their voice even where background noise is high. The speaker's voice is input to a voice recognition device having a plurality of voice dictionaries registered in advance according to the speaker's expected usage situations. The recognized word output is then input to a voice synthesis device having a plurality of voice dictionaries (II), corresponding to the tones into which the speaker wishes the voice to be converted, and synthesized speech is output. Because only the syllables recognizable as the registered speaker's voice are extracted from the microphone input, the user can talk without raising their voice in a noisy place even with a small microphone. Because voices not registered in the voice dictionary are not recognized, unauthorized use by anyone other than the owner is substantially prevented. Furthermore, a synonymous-sentence conversion function can be added between voice recognition and voice synthesis, and tone conversion can be performed during synthesis, so the impression given to the other party can be controlled, which helps conversations and negotiations go favorably.

[0006]

[Problems to be solved by the invention] In both cases, the influence of ambient noise is reduced so that voice information can be reliably input to a device even with a quiet voice. However, neither proposal focuses on removing the reluctance to speak to a device so that voice input can be performed casually, and neither distinguishes speech addressed to other people in the ordinary way from speech addressed to the device.

[0007] The present invention has been made in view of this problem. Its object is to provide a voice input device, a control method therefor, and a storage medium that remove the reluctance to speak to a device so that voice input can be performed casually, and that can distinguish speech addressed to other people in the ordinary way from speech addressed to the device.

[0008]

[Means for solving the problems] To solve this problem, a voice input device according to the present invention has, for example, the following configuration. Namely:

[0009] The device has a threshold for determining whether a voice signal input from voice input means is at whisper volume, treats voice signals exceeding the threshold as not to be processed, and processes and outputs voice signals at or below the threshold.

[0010] According to a preferred embodiment of the present invention, the device preferably further comprises voice signal detecting means for detecting whether the input voice signal is voice. This makes it possible to exclude non-voice ambient noise even when it is at low volume.

[0011] The voice signal detecting means preferably detects whether the signal is a voice signal from its power spectrum and pitch frequency. This makes it possible to distinguish mechanical sounds from voice uttered by a person.

[0012] The device preferably further comprises voice recognition means for recognizing voice from the voice signal to obtain voice data, and output means for outputting information based on the voice data obtained by the voice recognition means.

[0013] In this case, if the output means outputs the character codes obtained by voice recognition to kana-kanji conversion processing, the device can function as an input means that replaces the conventional keyboard and the like.

[0014] If the output means outputs the character codes obtained by voice recognition to conversion processing as commands, the device can also function as an operating means in place of buttons and switches.

[0015] Furthermore, the voice recognition means preferably has dictionary patterns for whispered utterances and performs voice recognition using those dictionary patterns. Using a dictionary dedicated to whispered voice raises recognition accuracy specifically for whispered voice.

[0016]

[Embodiments of the invention] An embodiment of the present invention is described in detail below with reference to the accompanying drawings.

[0017] <Configuration of the voice input device> FIG. 1 is a block diagram of the voice input device of this embodiment.

[0018] In the figure, 1 denotes the voice input device of this embodiment, which has the following configuration.

[0019] 101 is a CPU that controls the whole device based on a program in the ROM 103 described below. 102 is a RAM that stores whisper dictionary information, various other control data, and user data. 103 is a ROM that stores the control program of the voice input device 1.

[0020] 104 is a voice input unit. It comprises a high-sensitivity microphone 105 that performs acoustic-to-electric conversion with good sensitivity even for low-volume voice; a level comparison unit 106 that compares the input signal with a predetermined lower limit level and with a predetermined upper limit level; a signal amplification unit 107 that amplifies the signal; and a signal analysis unit 108 that calculates the pitch frequency and power spectrum from the input signal and detects voice signals.

[0021] 109 is a voice recognition unit that, under control of the CPU 101, performs voice recognition by referring to the whisper dictionary information in the RAM 102. 110 is a display unit for notifying the operator of voice recognition results and the like. 111 is a voice synthesis unit that generates synthesized speech for notifying the operator of voice recognition results and the like. 112 is a signal amplification unit that amplifies the synthesized speech output by the voice synthesis unit to an appropriate signal level. 113 is a speaker (earphone) that outputs the synthesized speech for notifying the operator of voice recognition results and the like. 114 is a communication unit that communicates wirelessly with an external device (for example, a personal computer). A device for communicating with the communication unit 114 is also connected on the external device side, and text is entered through it.

[0022] FIG. 2 shows one form of use of the voice input device of the embodiment. As illustrated, the microphone 105 and the speaker 113 are worn by the operator.

[0023] Next, the voice input operation of this embodiment is described with reference to the flowchart of FIG. 3.

[0024] In this embodiment, it is assumed that training has been performed in advance to adapt the whisper dictionary information in the RAM 102 to the operator's whisper.

[0025] First, in step S101, the level comparison unit 106 compares the input with a predetermined lower limit level in order to detect signal input from the high-sensitivity microphone 105. If the input is below the predetermined lower limit level, the process returns to step S101; if it is at or above the lower limit level, the process proceeds to step S102.

[0026] In step S102, the level comparison unit 106 compares the input with a predetermined upper limit level so that only low-volume voice signals are captured. If the input is above the predetermined upper limit level, the process returns to step S101; if it is at or below the upper limit level, the process proceeds to step S103.

[0027] As a result, only voice that is at or below the volume of ordinary conversation and at or above the volume needed to exclude ambient noise is passed.
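The two comparisons of steps S101 and S102 can be sketched as a level window. This is only an illustration: the patent does not say how the level is measured, so the RMS measure, the function names, and the example limits below are assumptions.

```python
import math

def rms_level(frame):
    """Root-mean-square level of one frame of samples (floats in [-1, 1])."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def passes_level_window(frame, lower, upper):
    """Return True when the frame would proceed to step S103.

    `lower` and `upper` stand for the lower and upper limit levels;
    their values are hypothetical and would be tuned by the operator.
    """
    level = rms_level(frame)
    if level < lower:   # S101: below the lower limit, keep waiting
        return False
    if level > upper:   # S102: louder than a whisper, not processed
        return False
    return True

# Example with assumed limits: only the middle frame is passed on.
LOWER, UPPER = 0.01, 0.1
for amplitude in (0.001, 0.05, 0.5):
    print(amplitude, passes_level_window([amplitude] * 100, LOWER, UPPER))
```

Keeping both limits as parameters matches the description's remark that the operator can adjust the thresholds.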

[0028] In step S103, the signal amplification unit 107 amplifies the input signal to an appropriate input level. In step S104, the signal analysis unit 108 analyzes the input signal, for example by calculating its pitch frequency and power spectrum. Then, in step S105, if the analysis result of step S104 does not allow the input to be judged a voice signal, the process returns to step S101; if it can be judged a voice signal, the process proceeds to step S106.
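As a rough illustration of steps S104 and S105, the sketch below estimates a pitch frequency by autocorrelation, computes a power spectrum with a plain DFT, and accepts a frame only if both look speech-like. The patent does not give the decision rule, so the pitch range, the speech band, the cut-off values, and all function names are assumptions.

```python
import cmath
import math

def pitch_frequency(frame, sample_rate, fmin=70.0, fmax=400.0):
    """Estimate pitch from the autocorrelation peak; 0.0 if none is clear."""
    n = len(frame)
    energy = sum(s * s for s in frame)
    if energy == 0.0:
        return 0.0
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), n - 1)
    best_lag, best_corr = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        corr = sum(frame[i] * frame[i + lag] for i in range(n - lag)) / energy
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    # 0.3 is an assumed cut-off for "clearly periodic".
    return sample_rate / best_lag if best_corr > 0.3 else 0.0

def power_spectrum(frame):
    """Power |X[k]|^2 for bins 0..n/2, computed with a naive DFT."""
    n = len(frame)
    return [abs(sum(frame[i] * cmath.exp(-2j * math.pi * k * i / n)
                    for i in range(n))) ** 2
            for k in range(n // 2 + 1)]

def is_voice(frame, sample_rate, band=(80.0, 3400.0), min_ratio=0.8):
    """S105 (assumed rule): a pitch was found and most power is in `band`."""
    if pitch_frequency(frame, sample_rate) == 0.0:
        return False
    spectrum = power_spectrum(frame)
    total = sum(spectrum)
    if total == 0.0:
        return False
    n = len(frame)
    in_band = sum(p for k, p in enumerate(spectrum)
                  if band[0] <= k * sample_rate / n <= band[1])
    return in_band / total >= min_ratio
```

Combining the two measurements is what lets a narrow high-frequency machine tone be rejected even when its samples happen to repeat at a speech-like lag.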

[0029] In step S106, the voice recognition unit 109 performs voice recognition on the input signal by referring to the whisper dictionary information in the RAM 102. In step S107, a reaction to the voice input is reported to the operator based on the recognition result of step S106. For example, the voice synthesis unit 111 outputs the voice information called for by the voice input, a predetermined sound indicating that recognition completed normally is played through the speaker (earphone), or information is displayed on the display unit 110 as the reaction.

[0030] Then, in step S108, the recognition result (a group of character codes) is sent to the external device via the communication unit 114.
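The control flow of FIG. 3 (steps S101 to S108) can be summarized in a short sketch. Each stage is passed in as a callable because the patent describes the stages as hardware units; the function name and parameter names are hypothetical.

```python
def process_frame(frame, level_ok, amplify, is_voice, recognize, notify, send):
    """Run one pass of steps S101-S108; return the recognized text or None.

    Returning None corresponds to "return to step S101" in the flowchart.
    """
    if not level_ok(frame):     # S101/S102: outside the level window
        return None
    frame = amplify(frame)      # S103: bring the signal to a usable level
    if not is_voice(frame):     # S104/S105: analysis says "not voice"
        return None
    text = recognize(frame)     # S106: recognition with the whisper dictionary
    notify(text)                # S107: reaction to the operator
    send(text)                  # S108: result to the external device
    return text

# Example with stand-in stages (all of them placeholders).
sent = []
result = process_frame(
    [0.05, -0.05],
    level_ok=lambda f: True,
    amplify=lambda f: [s * 10 for s in f],
    is_voice=lambda f: True,
    recognize=lambda f: "かな",
    notify=lambda t: None,
    send=sent.append,
)
print(result, sent)
```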

[0031] The voice input device of this embodiment can function as an input means in place of an ordinary keyboard. It can also function as a command input means in place of switches and buttons.

[0032] When the external device is, for example, a personal computer, it suffices to run on it a device driver (a kind of program) that receives the information from the communication unit 114 (a group of character codes representing kana) and passes it to a kana-kanji conversion program (called an FEP or the like) or to a command interpretation program.
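A minimal sketch of such a driver's hand-off follows. The patent does not say how the driver chooses between the conversion program and the command interpreter, so the lookup-table rule, the function name `dispatch`, and the sample entries are assumptions made for illustration.

```python
def dispatch(kana, commands, convert_kana):
    """Route a recognized kana string to a command or to kana-kanji conversion.

    `commands` maps kana strings to zero-argument handlers (hypothetical);
    anything not in the table is treated as text and handed to `convert_kana`.
    """
    handler = commands.get(kana)
    if handler is not None:
        return handler()          # command interpretation path
    return convert_kana(kana)     # kana-kanji conversion path

# Example with placeholder entries.
commands = {"ほぞん": lambda: "SAVE"}
fake_fep = {"かんじ": "漢字"}
print(dispatch("ほぞん", commands, lambda k: fake_fep.get(k, k)))
print(dispatch("かんじ", commands, lambda k: fake_fep.get(k, k)))
```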

[0033] When this device is portable, wireless communication is desirable for the communication unit 114, but it may of course be connected by a physical cable.

[0034] The embodiment has described an example in which voice is recognized and used to enter characters in place of a keyboard or to enter commands in place of switches. However, the present invention has a wide range of possible applications and is not limited to this example.

[0035] Furthermore, information processing apparatuses such as recent personal computers come with a sound board as standard, and the sound board has a microphone connection terminal. The invention may therefore be realized by connecting a microphone to this terminal and using the sound board, the CPU in the personal computer, and a program that implements the processing of FIG. 3. Alternatively, it may be realized by installing in the personal computer a dedicated expansion board to which a high-sensitivity microphone is connected.

[0036] In this case the program runs as a device driver or as part of an application program; the only difference from FIG. 3 is the output destination in step S108.

[0037] As described above, in this embodiment the voice input device is provided with: a configuration that inputs only voice signals at or below a predetermined volume; a voice signal detection unit that detects that the input signal is a voice signal; a configuration in which the voice signal detection unit detects voice signals from the power spectrum and pitch frequency; a voice recognition unit that performs voice recognition on the input signal; and a configuration in which the voice recognition unit has dictionary patterns for whispered utterances and performs voice recognition using them. This has the significant effect of providing a voice input device and method that remove the reluctance to speak to a device so that voice input can be performed casually, and that can distinguish speech addressed to other people in the ordinary way from speech addressed to the device.

[0038] In this embodiment, the high-sensitivity microphone and the main body of the voice input device may of course take forms other than the appearance shown in FIG. 2, for example a clip-type microphone worn clipped to the collar.

[0039] The thresholds compared in steps S101 and S102 of FIG. 3 can be adjusted as appropriate by the operator (by turning a volume knob, not shown). This accommodates individual differences in whisper volume and differences in ambient noise conditions.

[0040] The present invention may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device (for example, an information processing apparatus with a built-in microphone).

[0041] It goes without saying that the object of the present invention is also achieved by supplying a storage medium (or recording medium), on which software program code implementing the functions of the embodiment described above is recorded, to a system or apparatus, and having a computer (or CPU or MPU) of that system or apparatus read and execute the program code stored on the storage medium. In this case, the program code read from the storage medium itself implements the functions of the embodiment, and the storage medium storing the program code constitutes the present invention. It also goes without saying that this covers not only the case where the functions of the embodiment are implemented by the computer executing the program code it has read, but also the case where an operating system (OS) or the like running on the computer performs part or all of the actual processing based on the instructions of the program code, and the functions of the embodiment are implemented by that processing.

[0042] It further goes without saying that this also covers the case where the program code read from the storage medium is written to a memory provided on a function expansion card inserted in the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on that function expansion card or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the embodiment are implemented by that processing.

[0043]

[Effects of the invention] As described above, according to the present invention, the reluctance to speak to a device is removed so that voice input can be performed casually, and speech addressed to other people in the ordinary way can be distinguished from speech addressed to the device.

[Brief description of the drawings]

FIG. 1 is a block diagram of the voice input device.

FIG. 2 is an external view of the voice input device.

FIG. 3 is a flowchart of the voice input operation.

[Explanation of symbols]

1: voice input device
101: CPU
102: RAM
103: ROM
104: voice input unit
105: high-sensitivity microphone
106: level comparison unit
107: signal amplification unit
108: signal analysis unit
109: voice recognition unit

Claims

[Claims]

1. A voice input device that has a threshold for determining whether a voice signal input from voice input means is at whisper volume, treats a voice signal exceeding the threshold as not to be processed, and processes and outputs a voice signal at or below the threshold.

2. The voice input device according to claim 1, further comprising voice signal detecting means for detecting whether the input voice signal is voice, wherein a voice signal determined to be voice by the voice signal detecting means is treated as the signal to be processed.

3. The voice input device according to claim 2, wherein said voice signal detection means detects whether or not the voice signal is a voice signal based on a power spectrum and a pitch frequency.

4. The voice input device according to any one of claims 1 to 3, further comprising: voice recognition means for recognizing voice from the voice signal to obtain voice data; and output means for outputting information based on the voice data obtained by the voice recognition means.

5. The voice input device according to claim 4, wherein said output means outputs a character code obtained by voice recognition to a kana-kanji conversion process.

6. The voice input device according to claim 4, wherein said voice recognition means has a dictionary pattern when uttered with a whisper, and performs voice recognition processing using said dictionary pattern.

7. A method of controlling a voice input device, comprising: comparing a threshold for determining whether a voice signal is at whisper volume with the actual voice signal input from voice input means; treating a voice signal exceeding the threshold as not to be processed; and processing and outputting a voice signal at or below the threshold.

8. A storage medium storing program code which, when read and executed by a computer, causes the computer to function as a device that processes a voice signal input from voice input means, the program code having threshold data for determining whether the voice signal input from the voice input means is at whisper volume, and processing and outputting a voice signal at or below the threshold.