JPH04238398A

JPH04238398A - Voice recognition device and voice dialing device

Info

Publication number: JPH04238398A
Application number: JP3022963A
Authority: JP
Inventors: Keiichi Miyamoto; 恵一宮本
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1991-01-22
Filing date: 1991-01-22
Publication date: 1992-08-26

Abstract

PURPOSE: To reduce the number of key presses needed to register the other party's number, and to improve operability, by using voice recognition for registration as well. CONSTITUTION: A feature extraction unit 6 extracts feature quantities from the spoken name of the other party, input through a microphone 1, and stores them in a standard pattern storage unit 8. Before or after this voice registration, the other party's dial number is input by voice using a numeral speech dictionary, stored as a speaker-independent recognition dictionary in a fixed standard pattern storage unit 14, and is stored in a dial number storage unit 12 in association with the standard pattern. A command for switching between recognition/dialing mode and registration mode is also stored in the fixed standard pattern storage unit 14 and is recognized in order to switch modes.

Description

[Detailed description of the invention]

[0001]

[Technical Field] The present invention relates to a voice recognition device and, more specifically, to an improvement in voice registration for control devices that use voice recognition (for example, voice dialing devices).

[0002]

[Prior Art] So-called voice dialing devices, which automatically dial a number when the user simply speaks the name or nickname of the other party into a microphone, are coming into practical use. These devices contain a speaker-dependent voice recognition device, to which a function for dialing the number corresponding to the recognition result has been added.

[0003] FIG. 2 shows an example configuration of a conventional voice dialing device. First, the voice dialing operation is described. Assume that the device is in recognition/dialing mode. Mode switching is assumed to be performed by the control unit 11, which controls the whole device according to signals input from the key switch 3. When speech such as the other party's name is input from the handset microphone 1, the feature extraction unit 6 extracts its feature quantities. The similarity calculation unit 7 calculates the similarity between the extracted feature quantities and each of the plurality of standard patterns stored in advance in the standard pattern storage unit 8. The result selection unit 9 regards the standard pattern with the highest similarity as the recognition result and passes it to the voice response unit 10 and the control unit 11. The voice response unit 10 announces the recognition result to the person speaking, by voice, through the speaker 2 or the like. This confirmation voice may be synthesized from the standard patterns used for recognition, or from feature quantities prepared separately and exclusively for synthesis. In the latter case, when a standard pattern for recognition is registered, the output of the microphone 1 is also input to the voice response unit and stored there. Meanwhile, the control unit 11, having received the recognition result, selects from the dial number storage unit 12 the dial number corresponding to the recognized standard pattern. The selected dial number is then sent from the dialing unit 13 through the line control unit 4 to the telephone line 5.
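The recognition and dialing flow described above can be sketched as follows. This is only an illustration under stated assumptions, not the method of the patent: the patent does not specify the feature quantities or the similarity measure, so the sketch assumes fixed-length feature vectors and cosine similarity. All names (`recognize`, `dial_for`) and the sample vectors and numbers are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Similarity between two feature vectors (an assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recognize(features, standard_patterns):
    """Roles of units 7 and 9: score every stored pattern, keep the best label."""
    return max(
        standard_patterns,
        key=lambda label: cosine_similarity(features, standard_patterns[label]),
    )


def dial_for(features, standard_patterns, dial_numbers):
    """Role of unit 11: map the recognized pattern to its stored dial number."""
    label = recognize(features, standard_patterns)
    return label, dial_numbers[label]


# Hypothetical contents of the standard pattern storage (8) and dial number storage (12).
standard_patterns = {"tanaka": [0.9, 0.1, 0.3], "suzuki": [0.1, 0.8, 0.5]}
dial_numbers = {"tanaka": "0312345678", "suzuki": "0698765432"}

label, number = dial_for([0.85, 0.15, 0.25], standard_patterns, dial_numbers)
```

In a real device the features would be time-varying sequences, so a sequence-matching measure would likely replace the single-vector comparison used here.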

[0004] Next, the registration operation is described. Assume that registration mode has been set by a key operation as described above. When speech such as the other party's name is input from the handset microphone 1, the feature extraction unit 6 extracts its feature quantities. The extracted feature quantities are stored in the standard pattern storage unit 8. Before or after this voice registration, the other party's dial number is input with the key switch 3 and stored in the dial number storage unit 12 in association with that standard pattern. Once registration is done, the user can place a call simply by speaking the other party's name, so there is no need to memorize many numbers. Moreover, because there are no dial buttons to press, the device can serve as a very easy-to-use telephone for people with visual or physical disabilities. However, as described so far, registration still required entering the other party's dial number by pressing the key switch. This is not very difficult for an able-bodied person, but for a person with a visual or physical disability it is a heavy burden, and the advantages of the voice dialing device are lost.

[0005]

[Object] The present invention has been made in view of the problems described above. Its object is to reduce key presses during registration and improve usability in a control device that uses voice recognition, for example a voice dialing device, by using voice recognition also when registering the other party's number.

[0006]

[Configuration] To achieve the above object, the present invention is characterized by any of the following.

(1) A speech recognition device comprising a feature extraction unit that extracts features of input speech, a standard pattern storage unit that stores the extracted feature quantities of a plurality of utterances as standard patterns, a similarity calculation unit that calculates the similarity between the extracted speech feature quantities and the plurality of standard patterns stored in the standard pattern storage unit, and a result selection unit that selects one or more standard patterns as the recognition result in descending order of the calculated similarity, wherein part of the standard pattern storage unit consists of non-rewritable ROM and the remainder consists of rewritable RAM.

(2) A speech recognition device comprising a feature extraction unit that extracts features of input speech, a standard pattern storage unit that stores the extracted feature quantities of a plurality of utterances as standard patterns, a similarity calculation unit that calculates the similarity between the extracted speech feature quantities and the plurality of standard patterns stored in the standard pattern storage unit, and a result selection unit that selects one or more standard patterns as the recognition result in descending order of the calculated similarity, wherein some of the standard patterns stored in the standard pattern storage unit are registered in advance as speech patterns for speaker-independent recognition and the remainder is used for speech patterns for speaker-dependent recognition.

(3) A speech recognition device comprising a feature extraction unit that extracts features of input speech, a standard pattern storage unit that stores the extracted feature quantities of a plurality of utterances as standard patterns, a control data storage unit that stores device control data corresponding to the standard patterns, a similarity calculation unit that calculates the similarity between the extracted speech feature quantities and the plurality of standard patterns stored in the standard pattern storage unit, a result selection unit that selects one or more standard patterns as the recognition result in descending order of the calculated similarity, and a control data output unit that outputs the control data stored in the control data storage unit on the basis of the selection result, wherein some of the standard patterns stored in the standard pattern storage unit are registered in advance as speech patterns for speaker-independent recognition, the remainder is used for speech patterns for speaker-dependent recognition, and the control data corresponding to a speaker-dependent standard pattern is registered using speech for speaker-independent recognition.

(4) A voice recognition dialing device comprising a feature extraction unit that extracts features of input speech, a standard pattern storage unit that stores the extracted feature quantities of a plurality of utterances as standard patterns, a dial number storage unit that stores dial numbers corresponding to the standard patterns, a similarity calculation unit that calculates the similarity between the extracted speech feature quantities and the plurality of standard patterns stored in the standard pattern storage unit, a result selection unit that selects one or more standard patterns as the recognition result in descending order of the calculated similarity, and a dialing unit that outputs the dial number stored in the dial number storage unit on the basis of the selection result, wherein some of the standard patterns stored in the standard pattern storage unit are registered in advance as speech patterns for speaker-independent recognition, the remainder is used for speech patterns for speaker-dependent recognition, and the dial number corresponding to a speaker-dependent standard pattern is registered using the speaker-independent speech recognition.

(5) A voice recognition dialing device comprising a feature extraction unit that extracts features of input speech, a standard pattern storage unit that stores the extracted feature quantities of a plurality of utterances as standard patterns, a dial number storage unit that stores dial numbers corresponding to the standard patterns, a similarity calculation unit that calculates the similarity between the extracted speech feature quantities and the plurality of standard patterns stored in the standard pattern storage unit, a result selection unit that selects one or more standard patterns as the recognition result in descending order of the calculated similarity, and a dialing unit that outputs the dial number stored in the dial number storage unit on the basis of the selection result, wherein some of the standard patterns stored in the standard pattern storage unit are registered in advance as standard patterns representing numerals or control words, the remainder is used for speech patterns for other parties, and the dial number corresponding to an other-party standard pattern is registered using the standard patterns for recognizing numerals or control words.

(6) The voice dialing device of (5), wherein the standard patterns representing numerals or control words are stored in non-rewritable ROM as standard patterns for speaker-independent recognition.

The invention is described below on the basis of an embodiment.

[0007] FIG. 1 is a block diagram for explaining one embodiment of the present invention. The embodiment shown is a voice recognition dialing device comprising a feature extraction unit 6 that extracts features of input speech, a standard pattern storage unit 8 that stores the extracted feature quantities of a plurality of utterances as standard patterns, a dial number storage unit 12 that stores dial numbers corresponding to the standard patterns, a similarity calculation unit 7 that calculates the similarity between the extracted speech feature quantities and the plurality of standard patterns stored in the standard pattern storage unit 8, a result selection unit 9 that selects one or more standard patterns as the recognition result in descending order of the calculated similarity, and a dialing unit 13 that outputs the dial number stored in the dial number storage unit 12 on the basis of the selection result. It is characterized in that some of the standard patterns stored in the standard pattern storage unit 8 are registered in advance as standard patterns representing numerals or control words, the remainder is used for speech patterns for other parties, and the dial number corresponding to an other-party standard pattern is registered using the standard patterns for numeral recognition. More preferably, in such a voice dialing device, the standard patterns representing numerals or control words are stored in non-rewritable ROM as standard patterns for speaker-independent recognition.
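The split between pre-registered, non-rewritable patterns and user-registered patterns can be modeled in software as a store with a read-only part and a writable part. The following is a minimal sketch under that assumption; the patent describes ROM and RAM hardware, and the class name `StandardPatternStore`, its methods, and the sample vectors are hypothetical.

```python
from types import MappingProxyType


class StandardPatternStore:
    """Pattern store with a fixed (ROM-like) part and a writable (RAM-like) part."""

    def __init__(self, fixed_patterns):
        # Read-only view: models the numeral and control-word patterns
        # registered in advance for speaker-independent recognition.
        self._fixed = MappingProxyType(dict(fixed_patterns))
        # Writable part: speaker-dependent patterns for other parties.
        self._user = {}

    @property
    def fixed(self):
        return self._fixed

    def register(self, label, features):
        """Store a user pattern; labels of the fixed part cannot be overwritten."""
        if label in self._fixed:
            raise ValueError(f"{label!r} belongs to the fixed patterns")
        self._user[label] = list(features)

    def all_patterns(self):
        """Combined view used when one recognizer searches both parts."""
        return {**self._fixed, **self._user}


# Hypothetical feature vectors for one digit and one control word.
store = StandardPatternStore({"0": [0.1, 0.2], "touroku": [0.7, 0.3]})
store.register("tanaka", [0.9, 0.1])
```

Keeping both parts behind one `all_patterns` view reflects the case mentioned in the description where the recognition operation is the same for both kinds of pattern.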

[0008] The configuration of FIG. 1 differs from that of FIG. 2 in that a fixed standard pattern storage unit 14 has been added. Operation in recognition/dialing mode is the same as described for the prior art of FIG. 2 and is therefore omitted. Operation in registration mode is described below. When speech such as the other party's name is input from the handset microphone 1, the feature extraction unit 6 extracts its feature quantities. The extracted feature quantities are stored in the standard pattern storage unit 8. Before or after this voice registration, the other party's dial number is input by voice using the numeral speech dictionary stored in the fixed standard pattern storage unit 14, which serves as a speaker-independent recognition dictionary, and is stored in the dial number storage unit 12 in association with that standard pattern. A command for switching between recognition/dialing mode and registration mode is also stored in the fixed standard pattern storage unit 14, and modes are switched by recognizing this command. During recognition, whether to use the speaker-independent dictionary or the speaker-dependent standard patterns for other parties may be switched according to the operation history of the device. If the recognition operation is the same for speaker-independent and speaker-dependent patterns, no switching is needed. These operations are illustrated with a concrete example.

[0009]
(1) When the user says "touroku" (register) into the handset, the pattern "touroku" in the fixed standard patterns 14 is selected as the recognition result, and the device enters registration mode.
(2) Next, the user speaks the name of the other party. This is stored in the standard pattern storage unit 8.
(3) Next, the user speaks the other party's dial number one digit at a time, in order. The digits are recognized using the speaker-independent dictionary (fixed standard pattern storage unit 14), and recognition and registration of the dial number is completed.
(5) Normally, the device waits in recognition/dialing mode.

The parts directly related to the present invention have been described above. The invention does not specify or limit the algorithms for voice recognition and response (synthesis), the feature quantities, or the like. Operations such as updating and deleting standard patterns and dial numbers are already well known and are not described. Applying the invention to such update and delete operations can also be readily inferred, so that description is omitted here.
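The registration sequence above can be sketched as a small state machine in which the current mode (a simple form of operation history) decides which dictionary an utterance is matched against. This is an illustration only. The patent does not say how the end of digit entry is signaled, so the sketch assumes a hypothetical control word "owari" (end); the class `VoiceDialer`, its method names, and the mode names are likewise assumptions. Recognition itself is abstracted away: the methods receive already-recognized words.

```python
class VoiceDialer:
    """Mode handling for voice registration (recognition itself is abstracted away)."""

    DIGITS = set("0123456789")

    def __init__(self):
        self.mode = "dialing"      # normal standby: recognition/dialing mode
        self.user_patterns = {}    # role of standard pattern storage unit 8
        self.dial_numbers = {}     # role of dial number storage unit 12
        self._pending_name = None
        self._digits = []

    def on_fixed_word(self, word):
        """Handle a word recognized from the fixed dictionary (unit 14)."""
        if self.mode == "dialing" and word == "touroku":
            self.mode = "await_name"
        elif self.mode == "await_digits" and word in self.DIGITS:
            self._digits.append(word)
        elif self.mode == "await_digits" and word == "owari":
            # "owari" is an assumed end-of-number control word, not from the patent.
            self.dial_numbers[self._pending_name] = "".join(self._digits)
            self._pending_name, self._digits = None, []
            self.mode = "dialing"

    def on_name(self, name, features):
        """Store the spoken name as a speaker-dependent pattern, then expect digits."""
        if self.mode == "await_name":
            self.user_patterns[name] = list(features)
            self._pending_name = name
            self.mode = "await_digits"


dialer = VoiceDialer()
dialer.on_fixed_word("touroku")            # step (1): enter registration mode
dialer.on_name("tanaka", [0.9, 0.1, 0.3])  # step (2): register the name pattern
for digit in "0312345678":                 # step (3): one digit at a time
    dialer.on_fixed_word(digit)
dialer.on_fixed_word("owari")              # assumed terminator, back to standby
```

Words that are not valid in the current mode are ignored in this sketch; a real device would more likely answer through the voice response unit.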

[0010]

[Effects] As is clear from the above description, the present invention uses voice recognition also for registering the other party's dial number and for mode switching. This reduces key presses during registration and makes it possible to provide a voice dialing device with improved usability.

[Brief explanation of the drawing]

FIG. 1 is a block diagram for explaining an embodiment of a voice dialing device according to the present invention.

FIG. 2 is a diagram for explaining an example of a conventional voice dialing device.

[Explanation of symbols]

1 Microphone
2 Speaker
3 Key switch
4 Line control unit
5 Line
6 Feature extraction unit
7 Similarity calculation unit
8 Standard pattern storage unit
9 Result selection unit
10 Voice response unit
11 Control unit
12 Dial number storage unit
13 Dialing unit
14 Fixed standard pattern storage unit

Claims

[Claims]

1. A speech recognition device comprising: a feature extraction unit that extracts features of input speech; a standard pattern storage unit that stores the extracted feature quantities of a plurality of utterances as standard patterns; a similarity calculation unit that calculates the similarity between the extracted speech feature quantities and the plurality of standard patterns stored in the standard pattern storage unit; and a result selection unit that selects one or more standard patterns as a recognition result in descending order of the calculated similarity, wherein part of the standard pattern storage unit consists of non-rewritable ROM and the remainder consists of rewritable RAM.

2. A speech recognition device comprising: a feature extraction unit that extracts features of input speech; a standard pattern storage unit that stores the extracted feature quantities of a plurality of utterances as standard patterns; a similarity calculation unit that calculates the similarity between the extracted speech feature quantities and the plurality of standard patterns stored in the standard pattern storage unit; and a result selection unit that selects one or more standard patterns as a recognition result in descending order of the calculated similarity, wherein some of the standard patterns stored in the standard pattern storage unit are registered in advance as speech patterns for speaker-independent recognition, and the remainder is used for speech patterns for speaker-dependent recognition.

3. A voice recognition control device comprising: a feature extraction unit that extracts features of input speech; a standard pattern storage unit that stores the extracted feature quantities of a plurality of utterances as standard patterns; a control data storage unit that stores device control data corresponding to the standard patterns; a similarity calculation unit that calculates the similarity between the extracted speech feature quantities and the plurality of standard patterns stored in the standard pattern storage unit; a result selection unit that selects one or more standard patterns as a recognition result in descending order of the calculated similarity; and a control data output unit that outputs the control data stored in the control data storage unit on the basis of the selection result, wherein some of the standard patterns stored in the standard pattern storage unit are registered in advance as speech patterns for speaker-independent recognition, the remainder is used for speech patterns for speaker-dependent recognition, and the control data corresponding to a speaker-dependent standard pattern is registered using the speech for speaker-independent recognition.

4. A voice recognition control device, being a voice recognition dialing device comprising: a feature extraction unit that extracts features of input speech; a standard pattern storage unit that stores the extracted feature quantities of a plurality of utterances as standard patterns; a dial number storage unit that stores dial numbers corresponding to the standard patterns; a similarity calculation unit that calculates the similarity between the extracted speech feature quantities and the plurality of standard patterns stored in the standard pattern storage unit; a result selection unit that selects one or more standard patterns as a recognition result in descending order of the calculated similarity; and a dialing unit that outputs the dial number stored in the dial number storage unit on the basis of the selection result, wherein some of the standard patterns stored in the standard pattern storage unit are registered in advance as speech patterns for speaker-independent recognition, the remainder is used for speech patterns for speaker-dependent recognition, and the dial number corresponding to a speaker-dependent standard pattern is registered using the speaker-independent speech recognition.

5. A voice dialing device, being a voice recognition dialing device comprising: a feature extraction unit that extracts features of input speech; a standard pattern storage unit that stores the extracted feature quantities of a plurality of utterances as standard patterns; a dial number storage unit that stores dial numbers corresponding to the standard patterns; a similarity calculation unit that calculates the similarity between the extracted speech feature quantities and the plurality of standard patterns stored in the standard pattern storage unit; a result selection unit that selects one or more standard patterns as a recognition result in descending order of the calculated similarity; and a dialing unit that outputs the dial number stored in the dial number storage unit on the basis of the selection result, wherein some of the standard patterns stored in the standard pattern storage unit are registered in advance as standard patterns representing numerals or control words, the remainder is used for speech patterns for other parties, and the dial number corresponding to an other-party standard pattern is registered using the standard patterns for recognizing numerals or control words.

6. The voice dialing device according to claim 5, wherein the standard patterns representing the numerals or control words are stored in a non-rewritable ROM as standard patterns for speaker-independent recognition.