JP2003202894A

JP2003202894A - Device, method, and program for speech recognition

Info

Publication number: JP2003202894A
Application number: JP2001401851A
Authority: JP
Inventors: Mitsuyoshi Tatemori; 三慶舘森
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2001-12-28
Filing date: 2001-12-28
Publication date: 2003-07-18

Abstract

PROBLEM TO BE SOLVED: To eliminate the inconvenience in using system functions that is caused by a problematic reserved word in a system equipped with a speech recognition function.

SOLUTION: A speech recognition device that recognizes speech from input voice data by referring to a system dictionary, in which reserved words are registered in advance by the system, is equipped with a reserved word list consisting of a set of reserved words, and with a setting means for setting, according to a user's instruction, whether at least one reserved word in the reserved word list is a target of speech recognition.

COPYRIGHT: (C)2003, JPO

Description

Detailed Description of the Invention

[0001]

[Field of the Invention] The present invention relates to a speech recognition device that performs speech recognition using a system dictionary in which the recognition target vocabulary is registered in advance.

[0002]

[Description of the Related Art] A growing number of information devices, such as car navigation systems, now provide a speech recognition function in addition to their ordinary functions. In a car navigation system, for example, the speech recognition function lets a speaker such as the driver issue commands to the system, such as designating a destination or switching the display, by voice instead of by finger operation on the operation panel. This gives the user a convenient operating interface.

[0003] For speech recognition processing, the system holds its own recognition target vocabulary as the information used to identify each of the above commands by speech recognition. The recognition target vocabulary consists of a collection of words registered in the system in advance (hereinafter referred to as "reserved words").

[0004] Among the reserved words there are words that tend to be misrecognized as a different word no matter how many times they are spoken and, conversely, specific reserved words that tend to be returned as the recognition result no matter which word is uttered. Which reserved words cause such problems depends on the vocabulary already registered and on its size, and also differs from speaker to speaker. It is therefore difficult to identify such words in advance and exclude them from the reserved words.

[0005] Some systems are configured so that, in addition to the reserved words, the user can register desired words (hereinafter referred to as "user words"). Words other than the reserved words, or alternative expressions for reserved words, can be registered as user words. When a certain reserved word is easily mistaken for another word, the user can therefore register a different, more easily recognized user word so that the word is recognized smoothly. However, for a reserved word that tends to be returned as the result whichever other word is uttered, the misrecognition cannot be improved unless that reserved word itself is removed.

[0006]

[Problems to be Solved by the Invention] The present invention has been made in view of these circumstances. Its object is to provide a speech recognition device, method, and program that allow specific reserved words among the reserved words in a system to be excluded from speech recognition, thereby eliminating the inconvenience in using system functions that is caused by problematic reserved words.

[0007]

[Means for Solving the Problems] To solve the above problems and achieve the object, the present invention is configured as follows.

[0008] A first speech recognition device according to the present invention recognizes input voice data by referring to a system dictionary in which reserved words are registered in advance in the system, and comprises: a reserved word list composed of a set of the reserved words; setting means for setting, in response to an instruction from a user, whether at least one reserved word in the reserved word list is a target of speech recognition; and registration means for registering in the system dictionary, based on the settings made by the setting means, only the reserved words in the reserved word list that are targets of speech recognition, excluding those that are not.

[0009] A second speech recognition device according to the present invention recognizes input voice data by referring to a system dictionary in which reserved words are registered in advance in the system, and comprises: a reserved word list composed of a set of the reserved words; setting means for setting, in response to an instruction from a user, whether at least one reserved word in the reserved word list is a target of speech recognition; and dictionary reconstruction means for reconstructing the system dictionary, based on the settings made by the setting means, so that it consists only of the reserved words that are targets of speech recognition, with the reserved words in the reserved word list that are not targets removed.

[0010] A third speech recognition device according to the present invention comprises: speech recognition means for recognizing input voice data by referring to a system dictionary in which reserved words are registered in advance in the system; a reserved word list composed of a set of the reserved words; setting means for setting, in response to an instruction from a user, whether at least one reserved word in the reserved word list is a target of speech recognition; and selection means for selecting, based on the settings made by the setting means, from among a plurality of candidates obtained as the recognition result of the speech recognition means, only candidates corresponding to reserved words that are targets of speech recognition, excluding candidates corresponding to reserved words in the reserved word list that are not targets.

[0011] A first speech recognition method according to the present invention is a method for recognizing input voice data by referring to a system dictionary in which reserved words are registered in advance in the system, and comprises: a setting step of setting, in response to an instruction from a user, whether at least one reserved word in a reserved word list composed of a set of the reserved words is a target of speech recognition; and a registration step of registering in the system dictionary, based on the settings made in the setting step, only the reserved words in the reserved word list that are targets of speech recognition, excluding those that are not.

[0012] A second speech recognition method according to the present invention is performed in a speech recognition device that recognizes input voice data by referring to a system dictionary in which reserved words are registered in advance in the system, and comprises: a setting step of setting, in response to an instruction from a user, whether at least one reserved word in a reserved word list composed of a set of the reserved words is a target of speech recognition; and a dictionary reconstruction step of reconstructing the system dictionary, based on the settings made in the setting step, so that it consists only of the reserved words that are targets of speech recognition, with the reserved words in the reserved word list that are not targets removed.

[0013] A third speech recognition method according to the present invention comprises: a speech recognition step of recognizing input voice data by referring to a system dictionary in which reserved words are registered in advance in the system; a setting step of setting, in response to an instruction from a user, whether at least one reserved word in a reserved word list composed of a set of the reserved words is a target of speech recognition; and a selection step of selecting, based on the settings made in the setting step, from among a plurality of candidates obtained as the recognition result of the speech recognition step, only candidates corresponding to reserved words that are targets of speech recognition, excluding candidates corresponding to reserved words in the reserved word list that are not targets.

[0014] A first speech recognition program according to the present invention is a program for recognizing input voice data by referring to a system dictionary in which reserved words are registered in advance in the system, and causes a computer to execute: a setting step of setting, in response to an instruction from a user, whether at least one reserved word in a reserved word list composed of a set of the reserved words is a target of speech recognition; and a registration step of registering in the system dictionary, based on the settings made in the setting step, only the reserved words in the reserved word list that are targets of speech recognition, excluding those that are not.

[0015] A second speech recognition program according to the present invention is a program for recognizing input voice data by referring to a system dictionary in which reserved words are registered in advance in the system, and causes a computer to execute: a setting step of setting, in response to an instruction from a user, whether at least one reserved word in a reserved word list composed of a set of the reserved words is a target of speech recognition; and a dictionary reconstruction step of reconstructing the system dictionary, based on the settings made in the setting step, so that it consists only of the reserved words that are targets of speech recognition, with the reserved words in the reserved word list that are not targets removed.

[0016] A third speech recognition program according to the present invention causes a computer to execute: a speech recognition step of recognizing input voice data by referring to a system dictionary in which reserved words are registered in advance in the system; a setting step of setting, in response to an instruction from a user, whether at least one reserved word in a reserved word list composed of a set of the reserved words is a target of speech recognition; and a selection step of selecting, based on the settings made in the setting step, from among a plurality of candidates obtained as the recognition result of the speech recognition step, only candidates corresponding to reserved words that are targets of speech recognition, excluding candidates corresponding to reserved words in the reserved word list that are not targets.

[0017]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings.

[0018] (First Embodiment) FIG. 1 is a block diagram showing the schematic configuration of a speech recognition system according to a first embodiment of the present invention. The speech recognition system of this embodiment is incorporated in any of various information devices, such as a car navigation system, and is built around a computer (not shown) comprising a CPU and storage devices such as RAM and ROM connected to a system bus. In this case the invention can be implemented as a computer program: the program, stored in a secondary storage device such as a hard disk or in ROM, is read into main memory such as RAM and executed by the CPU. The computer may be one embedded in the information device or a general-purpose computer (such as a PC). For simplicity, FIG. 1 shows only the components relevant to the present invention; the others are omitted. As shown in the figure, the speech recognition system comprises an overall control unit 10, a speech recognition unit 11, a reserved word list 12, a word setting unit 14, and a display unit 13.

[0019] The overall control unit 10 is the main component of the speech recognition system and contains the speech recognition unit 11. It also has processing units such as an input/output interface, which handles input of the data to be recognized and output of the recognition results, and a user interface, which handles user operations on the system. The data to be recognized may be, for example, voice data from a speaker input through the microphone 2 shown in FIG. 1, a voice data file, or line input of recorded audio. The recognition results are passed to a higher-level function of the system, such as a voice command input function. The user interface includes a graphical user interface (GUI) that controls the display unit 13, which includes a touch panel or the like.

[0020] These interface processing units and the speech recognition unit 11 can be implemented using known techniques. The speech recognition unit 11, which recognizes and outputs phoneme text (the uttered character string) from voice data, acoustically analyzes the input voice data, computes similarities for small units such as phonemes based on the parameters (features) obtained by the analysis, and outputs word or sentence candidates by finding the optimal arrangement of phonemes. In this speech recognition processing, the reserved words registered in the speech recognition unit 11 are referred to as the recognition targets. In this embodiment, the word setting unit 14 sets and changes which reserved words are targets of speech recognition.

[0021] FIG. 2 is a diagram for explaining the word setting unit 14. The word setting unit 14 consists of a registration processing unit 140 and setting flags 141. The reserved word list 12 is a list (synonymous with "vocabulary") of the words that the system treats as speech recognition targets; it is provided from the initial state, for example in ROM. The registration processing unit 140 decides whether to register each item of the reserved word list 12 in the speech recognition unit 11 according to the setting flag 141 associated with that item. By default all setting flags 141 are set to "1", meaning that every word is a recognition target for the speech recognition unit 11. When the setting flag 141 for a particular item in the reserved word list 12 is set to "0", the word represented by that item is excluded from recognition by the speech recognition unit 11.
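The relationship between the reserved word list 12, the setting flags 141, and the registration processing unit 140 can be sketched as follows. This is a minimal illustration only: the names `ReservedWord` and `words_to_register` and the sample words are assumptions made for the example, not part of the publication.

```python
from dataclasses import dataclass

TARGET = 1    # default flag value: the word is a recognition target
EXCLUDED = 0  # the word is excluded from recognition


@dataclass
class ReservedWord:
    """One item of the reserved word list with its associated setting flag."""
    text: str
    flag: int = TARGET


def words_to_register(reserved_word_list):
    """Return the words the registration step would hand to the recognizer.

    Any flag value other than EXCLUDED is treated as a recognition target.
    """
    return [w.text for w in reserved_word_list if w.flag != EXCLUDED]


# Hypothetical sample vocabulary; every flag starts at the default of 1.
reserved = [ReservedWord("destination"), ReservedWord("supermarket"), ReservedWord("zoom in")]
reserved[1].flag = EXCLUDED  # the user marks "supermarket" as excluded
registered = words_to_register(reserved)
print(registered)
```

Keeping the flag next to each list item means the full reserved word list is never modified; only the registered subset changes when a flag is flipped.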

[0022] Needless to say, the meaning of the flag values may be reversed ("0" for a recognition target, "1" for an excluded word).

[0023] The setting flags 141 can be set by the user through the user interface of the overall control unit 10. The screen layout used for this is described with reference to FIG. 3.

[0024] FIG. 3 shows an example of a screen that displays the list of reserved words.

[0025] The reserved words are listed on the display screen of the display unit 13 in a suitable order, for example Japanese syllabary (gojuon) order or alphabetical order. Each reserved word corresponds to an item in the reserved word list 12. Next to each reserved word field 22 is an exclude-from-recognition check box 21. When the user presses it with a finger or the like, a check mark is set and the reserved word is excluded from speech recognition. If the check box 21 is pressed again, the check mark is cleared and the reserved word can again be registered as a recognition target word.

[0026] The state of the check boxes 21 on this display screen is reflected in the setting flags 141 in the word setting unit 14.

[0027] FIG. 3 shows the first five of all the reserved words. Of these, "itaria ryouri" (いたりありょうり, Italian cuisine) and "suupaa" (すーぱー, supermarket) are checked, so these two words are excluded from recognition. In this example the word field 22 is displayed in hiragana, but this is not a limitation; a notation mixing kanji and alphanumeric characters may be used depending on the functional configuration of the system. For a speech recognition system for another language, characters appropriate to that language, such as the alphabet, are used.

[0028] When there are too many words to fit on one screen, pressing the next button 24 displays the reserved words of the following screens in turn, and pressing the previous button 23 displays the reserved words of the previous screen. Pressing the end button 25 ends the reserved word setting process on the list display screen.

[0029] For specific reserved words that are essential as recognition targets, the system is preferably configured so that the user cannot inadvertently change the setting as described above. To distinguish such unchangeable reserved words from the others, the value of the setting flag 141 is set in advance to, for example, "2". For a reserved word whose setting flag 141 has this value, the word setting unit 14 prohibits user operation of the exclude-from-recognition check box 21. Alternatively, such unchangeable reserved words may be omitted from the list display of FIG. 3.
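A hedged sketch of how the check box toggle and the unchangeable flag value "2" might interact is shown below. The flag values 0, 1, and 2 follow the text; the function names `toggle` and `visible_words`, the dictionary layout, and the sample words are assumptions for illustration.

```python
TARGET, EXCLUDED, LOCKED = 1, 0, 2  # flag values as described in the text


def toggle(flags, word):
    """Flip a word between target and excluded when its check box is pressed.

    Returns False and leaves the flag untouched when the word is locked,
    which models prohibiting the user operation.
    """
    current = flags[word]
    if current == LOCKED:
        return False
    flags[word] = EXCLUDED if current == TARGET else TARGET
    return True


def visible_words(flags, hide_locked=False):
    """List the words to show on the settings screen.

    With hide_locked=True, locked words are left out of the list, which
    corresponds to the alternative of omitting them from the display.
    """
    return [w for w, f in flags.items() if not (hide_locked and f == LOCKED)]


# Hypothetical flags: "home" is essential and therefore locked.
flags = {"home": LOCKED, "supermarket": TARGET}
print(toggle(flags, "home"), toggle(flags, "supermarket"), flags)
```

Pressing the same check box twice returns the word to the recognition targets, matching the check and uncheck behaviour of the list screen.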

[0030] In addition to displaying the list of reserved words as in FIG. 3, the display unit 13 may be given a display switching function that, on the user's instruction, extracts and displays only those words in the reserved word list 12 that are currently excluded from recognition. This makes it easier to find the desired word when returning an excluded word to the recognition targets, and improves convenience.

[0031] FIG. 4 is a flowchart showing the general operation of the speech recognition system according to this embodiment.

[0032] First, after hardware initialization and system startup at power-on, the system enters a state of waiting for user operations and the like. In this wait state it also accepts user instructions other than word setting and speech recognition (such as shutting down the system), but FIG. 4 shows only the steps directly related to the present invention and omits the rest.

[0033] When a request from the user to change the reserved vocabulary settings is accepted in step S31, the process moves to step S32, where the reserved word list shown in FIG. 3 is displayed and the word setting process described above, which sets whether each word is a recognition target, is performed.

[0034] According to the settings made in step S32, the recognition target words are registered in the speech recognition unit 11, and the registered vocabulary becomes the system dictionary (step S33). Thereafter, when voice is input in step S34, speech recognition processing is executed (step S35), and processing corresponding to the recognition result is executed (step S36).
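The flow of steps S31 to S36 can be sketched as a simple event loop. This is an illustrative sketch under stated assumptions: the event tuples, the function names `run_session` and `toy_recognize`, and the exact-match "recognizer" are hypothetical stand-ins, since the publication leaves the recognizer itself to known techniques.

```python
def run_session(events, reserved_words, recognize, act):
    """Process a sequence of (kind, payload) events in the order of FIG. 4."""
    flags = {word: 1 for word in reserved_words}   # default: all targets
    dictionary = set(reserved_words)               # initial system dictionary
    for kind, payload in events:
        if kind == "settings":                     # S31: settings-change request
            flags.update(payload)                  # S32: word setting by the user
            # S33: register only the target words; they form the dictionary
            dictionary = {w for w, f in flags.items() if f != 0}
        elif kind == "voice":                      # S34: voice input
            result = recognize(payload, dictionary)  # S35: recognition
            act(result)                            # S36: act on the result
    return dictionary


def toy_recognize(utterance, dictionary):
    """Stand-in recognizer: an exact match against the dictionary, else None."""
    return utterance if utterance in dictionary else None


results = []
final = run_session(
    [("voice", "supermarket"), ("settings", {"supermarket": 0}), ("voice", "supermarket")],
    ["supermarket", "home"],
    toy_recognize,
    results.append,
)
print(results, final)
```

After the settings event, the excluded word can no longer be returned, because it is simply absent from the registered vocabulary.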

[0035] According to the embodiment described above, only the reserved words that are recognition targets are registered in the speech recognition unit 11 as the recognition target vocabulary. In other words, the user can explicitly designate reserved words to be excluded and have them removed from the system dictionary. This eliminates the inconvenience in using system functions caused by problematic reserved words, such as those that are persistently misrecognized as similar words, and improves the usability of system functions that rely on speech recognition.

[0036] (Second Embodiment) Next, a second embodiment of the present invention is described.

[0037] In some types of speech recognition system, words are not registered directly in the speech recognition unit 11 shown in FIG. 1. Instead, a system dictionary is first built by processing the recognition target words, for example by compiling them, and is then registered in the speech recognition unit 11 in that form. The second embodiment applies the present invention to such a speech recognition system.

[0038] FIG. 5 is a block diagram showing the schematic configuration of a speech recognition system according to the second embodiment of the present invention. In FIG. 5, the components denoted by reference numerals 40 to 45 correspond to the components denoted by reference numerals 10 to 15 in FIG. 1 and have equivalent functions. Reference numeral 45 denotes a dictionary creation unit and 46 a system dictionary; the second embodiment differs from the first in configuration in that these components are added.

[0039] When the user sets words as recognition targets or non-targets using, for example, the screen layout of FIG. 3 described in the first embodiment and presses the end button 25 to finish word setting, the dictionary creation unit 45 reconstructs the system dictionary 46 using only those reserved words in the reserved word list 12 that are registered as recognition target words in the word setting unit 44. The word setting unit 44 then replaces the system dictionary used until then with the reconstructed system dictionary 46.
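The rebuild-then-replace behaviour of the second embodiment can be sketched as follows. This is a minimal sketch, assuming a hypothetical `compile_dictionary` function as a stand-in for whatever compile step a real system uses; the class name `Recognizer` and its attributes are likewise illustrative.

```python
def compile_dictionary(words):
    """Hypothetical stand-in for the compile step: a sorted, immutable tuple."""
    return tuple(sorted(words))


class Recognizer:
    """Holds the setting flags and the compiled system dictionary in use."""

    def __init__(self, reserved_words):
        self.flags = {word: 1 for word in reserved_words}  # 1 = target, 0 = excluded
        self.dictionary = compile_dictionary(reserved_words)

    def on_settings_finished(self):
        """Called when the end button is pressed on the settings screen.

        The new dictionary is built completely from the target words first,
        and only then swapped in for the one used so far.
        """
        new_dictionary = compile_dictionary(
            word for word, flag in self.flags.items() if flag != 0
        )
        self.dictionary = new_dictionary


recognizer = Recognizer(["supermarket", "home", "zoom in"])
recognizer.flags["supermarket"] = 0   # the user excludes one word
recognizer.on_settings_finished()
print(recognizer.dictionary)
```

Building the replacement before assigning it keeps the old dictionary usable until the new one is complete, which is one reasonable way to realize the replacement described above.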

[0040] Naturally, the reserved words that have been excluded from recognition are not registered in the reconstructed system dictionary 46, so they never appear as recognition results. The second embodiment therefore, like the first, allows problematic words that cause misrecognition to be explicitly excluded.

[0041] In a configuration that supports connected-word recognition, such as the place-name recognition found in car navigation systems (e.g., Kanagawa Prefecture, Kawasaki City, Saiwai Ward), setting only some of the words in a connected sequence as non-targets may cause problems on the speech recognition unit 41 side. In such cases it is advisable to designate, among the reserved words, words that can never be set as non-targets. As described in the first embodiment, this can be handled by setting the setting flag to a special value ("2" in the first embodiment).

[0042] When a word in the middle of such a sequence is set as a non-target, it is preferable to automatically set the words that follow it as non-targets as well, since this simplifies the setting operation.
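One possible reading of this automatic exclusion is sketched below: when a word in a connected sequence is excluded, every later word in the same sequence is excluded too. The function name `exclude_with_followers`, the list representation of a sequence, and the choice to skip words flagged as unchangeable are assumptions for illustration, not details given in the publication.

```python
TARGET, EXCLUDED, LOCKED = 1, 0, 2


def exclude_with_followers(sequence, flags, word):
    """Exclude `word` and every word after it in the connected sequence.

    Words flagged LOCKED are skipped (an assumption of this sketch).
    Returns the list of words whose flags were set to EXCLUDED.
    """
    start = sequence.index(word)
    changed = []
    for follower in sequence[start:]:
        if flags.get(follower, TARGET) == LOCKED:
            continue
        flags[follower] = EXCLUDED
        changed.append(follower)
    return changed


place = ["Kanagawa Prefecture", "Kawasaki City", "Saiwai Ward"]
place_flags = {name: TARGET for name in place}
changed = exclude_with_followers(place, place_flags, "Kawasaki City")
print(changed, place_flags)
```

With this helper the user presses one check box and the dependent words follow, rather than having to exclude each word of the sequence by hand.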

[0043] (Third Embodiment) Next, a third embodiment of the present invention is described.

[0044] The first embodiment described above prevents excluded reserved words from being registered in the speech recognition unit 11, and the second embodiment removes excluded words when the system dictionary 46 is reconstructed. In both embodiments, the user explicitly removes problematic reserved words in advance.

【００４５】In contrast, the third embodiment of the present invention does not exclude words through processing such as re-registering words or reconstructing the system dictionary. Instead, it prevents non-target vocabulary from appearing in the recognition result.

【００４６】The present embodiment is the same as the first and second embodiments up to the point where the user sets each word as a recognition target or non-target. It differs from the other embodiments in that the settings are only stored in the word setting unit 44; the system dictionary is not changed and words are not re-registered in the voice recognition unit.

【００４７】At recognition time, the voice recognition unit of the present embodiment performs recognition using the original system dictionary as it is. The voice recognition unit of the present embodiment is assumed to be able to return a plurality of ranked recognition results for one voice input. Because the system dictionary still contains the words that were set as non-targets, the recognition results may include such words. A recognition result that includes non-target words is referred to here as a provisional recognition result.

【００４８】The voice recognition unit refers to the target/non-target flag of each word stored in the word setting unit and deletes the non-target reserved words from the provisional recognition result. The voice recognition unit then re-ranks the remaining words starting from first place and outputs them as the final recognition result, and the system performs processing according to that result. If all reserved words in the provisional recognition result are non-targets and no word remains, the user's utterance is treated as rejected (invalid). The flowchart of FIG. 6 shows this series of processes for removing non-target reserved words from the recognition result.
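
The filtering described above can be sketched as follows. This is an illustration only: the function name, the sample words, the flag values 1 (target) and 0 (non-target), and the use of `None` to signal rejection are assumptions, not details from the specification.

```python
# Hypothetical flag values: 1 = recognition target, 0 = non-target.
TARGET, NON_TARGET = 1, 0


def finalize_results(provisional, flags):
    """Drop non-target words from a ranked provisional result.

    `provisional` lists candidate words from best to worst.
    Returns the surviving words as (rank, word) pairs re-ranked from 1,
    or None when nothing survives, meaning the utterance is rejected.
    """
    remaining = [w for w in provisional if flags.get(w, TARGET) != NON_TARGET]
    if not remaining:
        return None
    return list(enumerate(remaining, start=1))


flags = {"zoom out": NON_TARGET}
print(finalize_results(["zoom out", "zoom in", "home"], flags))
print(finalize_results(["zoom out"], flags))
```

In the first call the second-ranked candidate is promoted to first place; in the second call no candidate remains and the utterance is rejected.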

【００４９】When a single recognition result consists of a word string, as in continuous word recognition, any recognition result whose word string contains at least one non-target reserved word may be deleted.
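
For word strings, the same idea applies to each candidate as a whole. The sketch below assumes each candidate is a tuple of words and uses hypothetical flag values 1 (target) and 0 (non-target); the function name and sample data are illustrative only.

```python
# Hypothetical flag values: 1 = recognition target, 0 = non-target.
TARGET, NON_TARGET = 1, 0


def filter_word_strings(provisional, flags):
    """Keep only candidates whose word string contains no non-target word.

    `provisional` is a list of word tuples ordered from best to worst;
    the relative order of the survivors is preserved.
    """
    return [
        seq for seq in provisional
        if all(flags.get(w, TARGET) != NON_TARGET for w in seq)
    ]


flags = {"Saiwai Ward": NON_TARGET}
candidates = [
    ("Kanagawa Prefecture", "Kawasaki City", "Saiwai Ward"),
    ("Kanagawa Prefecture", "Kawasaki City", "Nakahara Ward"),
]
print(filter_word_strings(candidates, flags))
```

A single non-target word is enough to remove the entire candidate, so only the second word string remains.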

【００５０】According to the third embodiment, because the system dictionary is not changed, an utterance is rejected when all of its recognition results are non-targets. On the other hand, no means for reconstructing the dictionary is required, so the system configuration is simpler than in the second embodiment and memory use and computation can be reduced.

【００５１】The present invention is not limited to the embodiments described above and can be implemented with various modifications.

【００５２】[0052]

[Effects of the Invention] As described above, the present invention provides a voice recognition device, method, and program in which specific reserved words among the reserved words of a system can be set as non-targets of voice recognition, eliminating the inconvenience in using system functions that is caused by inconvenient reserved words.

[Brief description of drawings]

FIG. 1 is a block diagram showing a schematic configuration of a voice recognition system according to a first embodiment of the present invention.

FIG. 2 is a block diagram showing the internal configuration of the word setting unit.

FIG. 3 is a diagram showing a configuration example of a screen that displays a list of reserved words.

FIG. 4 is a flowchart showing the general operation of the voice recognition system of the first embodiment.

FIG. 5 is a block diagram showing a schematic configuration of a voice recognition system according to a second embodiment of the present invention.

FIG. 6 is a flowchart showing the flow of processing for removing non-target reserved words from the recognition result in a voice recognition system according to a third embodiment of the present invention.

[Explanation of symbols]

10 ... Overall control unit
11 ... Voice recognition unit
12 ... Reserved vocabulary list
13 ... Display unit
14 ... Word setting unit

Continuation of front page: (51) Int.Cl.⁷, identification code, FI, theme code (reference): G10L 15/28

Claims

[Claims]

1. A voice recognition device that recognizes input voice data by referring to a system dictionary in which reserved words are registered in advance by a system, the device comprising: a reserved word list consisting of a set of the reserved words; setting means for setting, according to an instruction from a user, whether or not at least one reserved word in the reserved word list is a target of the voice recognition; and a registration unit that, based on the settings made by the setting means, registers in the system dictionary only the reserved words in the reserved word list that are targets of the voice recognition, excluding the reserved words that are not targets.

2. A voice recognition device that recognizes input voice data by referring to a system dictionary in which reserved words are registered in advance by a system, the device comprising: a reserved word list consisting of a set of the reserved words; setting means for setting, according to an instruction from a user, whether or not at least one reserved word in the reserved word list is a target of the voice recognition; and a dictionary reconfiguring unit that, based on the settings made by the setting means, reconfigures the system dictionary so that it consists only of the reserved words in the reserved word list that are targets of the voice recognition, excluding the reserved words that are not targets.

3. A voice recognition device comprising: voice recognition means for recognizing input voice data by referring to a system dictionary in which reserved words are registered in advance by a system; a reserved word list consisting of a set of the reserved words; setting means for setting, according to an instruction from a user, whether or not at least one reserved word in the reserved word list is a target of the voice recognition; and selection means for selecting, based on the settings made by the setting means, from among a plurality of candidates obtained as the voice recognition result of the voice recognition means, only the candidates corresponding to reserved words in the reserved word list that are targets of the voice recognition, excluding the candidates corresponding to reserved words that are not targets.

4. The voice recognition device according to claim 1, further comprising a prohibition unit that prohibits the user from setting a specific reserved word as a non-target of the voice recognition.

5. The voice recognition device according to claim 1, wherein the setting means comprises, for each reserved word in the reserved word list, a flag having at least two values that indicates whether or not the reserved word is a target of the voice recognition.

6. A voice recognition method for recognizing input voice data by referring to a system dictionary in which reserved words are registered in advance by a system, the method comprising: a setting step of setting, according to an instruction from a user, whether or not at least one reserved word in a reserved word list consisting of a set of the reserved words is a target of the voice recognition; and a registration step of registering in the system dictionary, based on the settings made in the setting step, only the reserved words in the reserved word list that are targets of the voice recognition, excluding the reserved words that are not targets.

7. A voice recognition method for recognizing input voice data by referring to a system dictionary in which reserved words are registered in advance by a system, the method comprising: a setting step of setting, according to an instruction from a user, whether or not at least one reserved word in a reserved word list consisting of a set of the reserved words is a target of the voice recognition; and a dictionary reconstructing step of reconstructing the system dictionary, based on the settings made in the setting step, so that it consists only of the reserved words in the reserved word list that are targets of the voice recognition, excluding the reserved words that are not targets.

8. A voice recognition method comprising: a voice recognition step of recognizing input voice data by referring to a system dictionary in which reserved words are registered in advance by a system; a setting step of setting, according to an instruction from a user, whether or not at least one reserved word in a reserved word list consisting of a set of the reserved words is a target of the voice recognition; and a selection step of selecting, based on the settings made in the setting step, from among a plurality of candidates obtained as the voice recognition result of the voice recognition step, only the candidates corresponding to reserved words in the reserved word list that are targets of the voice recognition, excluding the candidates corresponding to reserved words that are not targets.

9. The voice recognition method according to claim 6, further comprising a prohibition step of prohibiting the user from setting a specific reserved word as a non-target of the voice recognition.

10. A voice recognition program for recognizing input voice data by referring to a system dictionary in which reserved words are registered in advance by a system, the program causing a computer to execute: a setting step of setting, according to an instruction from a user, whether or not at least one reserved word in a reserved word list consisting of a set of the reserved words is a target of the voice recognition; and a registration step of registering in the system dictionary, based on the settings made in the setting step, only the reserved words in the reserved word list that are targets of the voice recognition, excluding the reserved words that are not targets.

11. A voice recognition program for recognizing input voice data by referring to a system dictionary in which reserved words are registered in advance by a system, the program causing a computer to execute: a setting step of setting, according to an instruction from a user, whether or not at least one reserved word in a reserved word list consisting of a set of the reserved words is a target of the voice recognition; and a dictionary reconstructing step of reconstructing the system dictionary, based on the settings made in the setting step, so that it consists only of the reserved words in the reserved word list that are targets of the voice recognition, excluding the reserved words that are not targets.

12. A voice recognition program causing a computer to execute: a voice recognition step of recognizing input voice data by referring to a system dictionary in which reserved words are registered in advance by a system; a setting step of setting, according to an instruction from a user, whether or not at least one reserved word in a reserved word list consisting of a set of the reserved words is a target of the voice recognition; and a selection step of selecting, based on the settings made in the setting step, from among a plurality of candidates obtained as the voice recognition result of the voice recognition step, only the candidates corresponding to reserved words in the reserved word list that are targets of the voice recognition, excluding the candidates corresponding to reserved words that are not targets.

13. The voice recognition program according to claim 10, further causing the computer to execute a prohibition step of prohibiting the user from setting a specific reserved word as a non-target of the voice recognition.