JP2003271183A

JP2003271183A - Device, method and program for preparing voice recognition dictionary, device and system for recognizing voice, portable terminal device and program recording medium

Info

Publication number: JP2003271183A
Application number: JP2002075595A
Authority: JP
Inventors: Hiroyuki Kanza; 浩幸勘座
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2002-03-19
Filing date: 2002-03-19
Publication date: 2003-09-25
Anticipated expiration: 2022-03-19
Also published as: JP3911178B2

Abstract

PROBLEM TO BE SOLVED: To create a voice recognition dictionary that can recognize a term even when it is uttered with an incorrect reading.

SOLUTION: A first analysis dictionary storage unit 5 stores a first analysis dictionary in which the notation, reading, and other information of everyday terms are registered. A second analysis dictionary storage unit 6 stores a second analysis dictionary in which the notation, reading, and other information of special terms are registered. A text analysis unit 1 performs morphological analysis using both analysis dictionaries. When assigning readings to the input morphemes, a reading assignment unit 2 records in a correspondence table, for each term registered in the second analysis dictionary, the pair of the term and its reading from the second analysis dictionary together with pairs of the term and alternative reading candidates. A voice recognition dictionary creation unit 3 creates a voice recognition dictionary from the contents of the correspondence table. The resulting dictionary thus associates each recognized term with both the phoneme notation from the second analysis dictionary and the alternative phoneme notation candidates. With this dictionary, the term 京終 (correctly read "kyobate") can be recognized as 京終 even when it is uttered as "kyoshu" or "kyoowari".

COPYRIGHT: (C)2003, JPO

Description

Detailed Description of the Invention

[0001]

[Field of the Invention] The present invention relates to a voice recognition dictionary that allows utterances of difficult-to-read words to be recognized correctly. More particularly, it relates to a voice recognition dictionary creation device and a voice recognition dictionary creation method, a voice recognition device using the voice recognition dictionary, a portable terminal equipped with the voice recognition device, a voice recognition system using the portable terminal, a voice recognition dictionary creation program, and a program recording medium on which the voice recognition dictionary creation program is recorded.

[0002]

[Prior Art] In voice recognition technology, input speech can be recognized only according to the readings registered in advance in a vocabulary storage unit. The user therefore needs to know beforehand which words can be recognized. When the registered vocabulary is small, the user can memorize it to some extent. When the registered vocabulary is large, however, memorizing it becomes difficult.

[0003] The following methods have been proposed to address this problem.
(1) Displaying the vocabulary subject to voice recognition on a display means (for example, Japanese Patent Laid-Open No. 7-319383).
(2) Dynamically changing the vocabulary for voice recognition as the dialogue progresses, so that the vocabulary subject to recognition is always small (for example, Japanese Patent Laid-Open No. 6-332493).
(3) Making it easy to change or add vocabulary in the vocabulary storage unit, so that readings other than those registered in advance can also be recognized (for example, Japanese Patent Laid-Open No. 8-211893).

[0004]

[Problems to Be Solved by the Invention] However, the conventional methods for dealing with a large registered vocabulary have the following problems. With the method of displaying the recognition target vocabulary on the display means, the number of words that can be displayed is limited. In addition, when place names and the like are displayed in kanji, the user may not be able to read them correctly. If, to avoid this, all place names are displayed in hiragana, they can no longer be matched against the kanji addresses written in name lists, postcards, and the like. If both kanji and hiragana are displayed to avoid that, the limited display area allows even fewer words to be shown.

[0005] Therefore, displaying the recognition target vocabulary on the display means is not sufficient to inform the user of the recognition target vocabulary together with its readings.

[0006] With the method of dynamically changing the recognition target vocabulary as the dialogue progresses so that it always remains small, the number of recognition target words in each individual scene is small. In the end, however, the user must still remember all of the dynamically changing recognition target vocabulary. Moreover, even though fewer words are targeted in each scene, a word is still not recognized unless it is uttered with exactly the reading stored in the vocabulary storage unit.

[0007] Place names in particular include many difficult-to-read words. Even when a user tries to search for a destination by voice using a map search and display device or the like, the device may be unusable because the user does not know how the name is read. Alternatively, the user may utter what they believe to be the correct reading, only for it to go unrecognized because it differs from the reading registered in the vocabulary storage unit.

[0008] As a specific example, 京終 in Nara City is read "kyobate" (きょうばて), but users who do not know this often say "kyoshu" (きょうしゅう). In that case the utterance is not recognized, so the destination 京終 cannot be searched for and no map is displayed. Even when the reading of the whole place name is unknown, the reading of each individual character may be known. For example, a user who cannot read 京終 can still read 京 as "kyo" (きょう) and 終 as "shu" (しゅう) or "owaru" (おわる). If the user, expecting that the pairs 京 and "kyo", and 終 and "shu" or "owaru", are registered in the vocabulary storage unit, could have 京終 recognized by uttering "kyoshu" or "kyoowaru", this would be a workaround for not being able to read 京終. However, there is no guarantee whatsoever that such an approach will always work.

[0009] Similarly, 新口 in Kashihara City is read "ninokuchi" (にのくち), but a user who does not know this may say "shinkuchi" (しんくち). If "shinkuchi" is not registered in the vocabulary storage unit, the utterance is not recognized. A user who believes that 新口 is read "shinkuchi" will therefore never be able to display the map of 新口.

[0010] In cases like the examples above, the problem can be solved by the method of changing or adding vocabulary in the vocabulary storage unit, that is, by additionally registering the difficult-to-read words together with the readings likely to be mistaken for them.

[0011] However, all of the conventional methods above require that the user know, or be able to predict, the recognition target vocabulary registered in the vocabulary storage unit. Consequently, when voice recognition is applied to words that are updated daily, such as television program names and music titles, those names cannot be registered in the vocabulary storage unit in advance, nor can they even be predicted, so they cannot be recognized at all.

[0012] As these examples show, it is desirable that speech be recognized correctly even when a word that is difficult to read correctly is uttered, or when the user does not know the correct reading.

[0013] Accordingly, an object of the present invention is to provide a voice recognition dictionary creation device and a voice recognition dictionary creation method capable of creating a voice recognition dictionary with which input speech can be recognized even when the user does not know the correct reading of a recognition target word, or does not know which recognition target words are registered; a voice recognition device using the voice recognition dictionary; a portable terminal equipped with the voice recognition device; a voice recognition system using the portable terminal; a voice recognition dictionary creation program; and a program recording medium on which the voice recognition dictionary creation program is recorded.

[0014]

[Means for Solving the Problems] To achieve the above object, a first invention is a voice recognition dictionary creation device in which a text analysis means analyzes input text, a reading assignment means assigns readings to the constituent words obtained by the analysis, a voice recognition dictionary creation means creates a voice recognition dictionary based on the analysis result and the reading assignment result, and the created voice recognition dictionary is stored in a voice recognition dictionary storage means. The device comprises a first analysis dictionary storage means storing a first analysis dictionary, which is referred to by the text analysis means during text analysis and consists of information including the notation and reading of vocabulary, and a second analysis dictionary storage means storing a second analysis dictionary, which is also referred to by the text analysis means during text analysis and consists of information including the notation and reading of vocabulary not stored in the first analysis dictionary storage means. When the text analysis result includes vocabulary obtained by referring to the second analysis dictionary, the reading assignment means assigns to that vocabulary not only the reading obtained from the second analysis dictionary but also other reading candidates.

[0015] With this configuration, when the text analysis result produced by the text analysis means includes vocabulary obtained by referring to the second analysis dictionary, the reading assignment means assigns to that vocabulary other reading candidates in addition to the reading obtained from the second analysis dictionary. Consequently, for vocabulary originating from the second analysis dictionary, the voice recognition dictionary created from the analysis result and the reading assignment result contains dictionary information based on the phoneme notation of the reading obtained from the second analysis dictionary and on the phoneme notations of the other reading candidates.

[0016] For example, by registering the difficult-to-read word 京終 and its correct reading "kyobate" (きょうばて) in the second analysis dictionary, a voice recognition dictionary is created that stores dictionary information based on the word 京終 and its readings "kyobate" (きょうばて), "kyo/shu" (きょう/しゅう), and "kyo/owari" (きょう/おわり). By performing voice recognition with this dictionary, the intended word 京終 is obtained as the recognition result, without rejection, even when the user mistakenly says "kyoshu".
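
As an illustration of the reading assignment described in [0014] to [0016], the following Python sketch expands one entry of the second analysis dictionary into several reading candidates and maps each of them back to the same written form. The names `CHAR_READINGS`, `SECOND_DICT`, `reading_candidates`, and `build_recognition_entries` are hypothetical, and combining per-character readings is only one assumed way of producing the "other reading candidates"; the patent text up to this point does not prescribe how they are generated.

```python
from itertools import product

# Hypothetical per-character readings (an assumed stand-in for what the
# first analysis dictionary could supply).
CHAR_READINGS = {
    "京": ["きょう"],
    "終": ["しゅう", "おわり"],
}

# Hypothetical second analysis dictionary: difficult word -> correct reading.
SECOND_DICT = {"京終": "きょうばて"}


def reading_candidates(word):
    """Return the registered reading followed by per-character alternatives."""
    readings = [SECOND_DICT[word]]
    per_char = [CHAR_READINGS.get(ch, []) for ch in word]
    # Combine readings only when every character has at least one reading.
    if all(per_char):
        for combo in product(*per_char):
            candidate = "".join(combo)
            if candidate not in readings:
                readings.append(candidate)
    return readings


def build_recognition_entries(word):
    """Map every candidate reading back to the same written form."""
    return {reading: word for reading in reading_candidates(word)}


entries = build_recognition_entries("京終")
print(entries["きょうしゅう"])  # the misreading still resolves to 京終
```

Because every candidate reading points at the same written form, a recognizer that matches on readings returns 京終 whichever of the candidates is uttered.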

[0017] In one embodiment, the voice recognition dictionary creation device of the first invention further comprises a dictionary acquisition means that acquires the contents of the second analysis dictionary, to be stored in the second analysis dictionary storage means, from a third dictionary storage means.

[0018] According to this embodiment, the contents of the second analysis dictionary are acquired from the third dictionary storage means by the dictionary acquisition means. Therefore, when a vocabulary information provider supplies a third dictionary storage means in which new vocabulary is registered, the newly appearing vocabulary is additionally registered in the second analysis dictionary storage means. Furthermore, a voice recognition dictionary with which input speech can be recognized can be created even when the user does not know the recognition target vocabulary registered in the second analysis dictionary storage means.
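
The dictionary acquisition step can be pictured with a small Python sketch. The function name `acquire_second_dictionary` and the plain-dict representation are hypothetical, and the rule that an existing entry is kept when the provider supplies the same word is an assumption made for illustration, not something the text specifies.

```python
def acquire_second_dictionary(second_dict, third_dict):
    """Return a copy of second_dict extended with new words from third_dict.

    Both arguments map a written form to its reading. Existing entries are
    left untouched (an assumed policy); only unseen words are added.
    """
    merged = dict(second_dict)
    added = []
    for surface, reading in third_dict.items():
        if surface not in merged:
            merged[surface] = reading
            added.append(surface)
    return merged, added


second = {"京終": "きょうばて"}
provider = {"新口": "にのくち"}  # hypothetical update from a provider
second, new_words = acquire_second_dictionary(second, provider)
print(new_words)  # words that became recognizable through the update
```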

[0019] A second invention is a voice recognition device that recognizes input speech by having a matching means match it against the vocabulary registered in a voice recognition dictionary, wherein the voice recognition dictionary is one created by the voice recognition dictionary creation device of the first invention.

[0020] With this configuration, voice recognition is performed using a voice recognition dictionary in which, for vocabulary registered in the second analysis dictionary, dictionary information based on the phoneme notation of the reading obtained from the second analysis dictionary and on the phoneme notations of the other reading candidates is registered. Therefore, even when the difficult-to-read word 京終, taken as an example of vocabulary registered in the second analysis dictionary, is mistakenly uttered as "kyoshu", it is not rejected, and the intended word 京終 is obtained as the recognition result.

[0021] A voice recognition device of a third invention is equipped with the voice recognition dictionary creation device of the first invention, and recognizes input speech by having a matching means match it against the vocabulary registered in the voice recognition dictionary storage means of the voice recognition dictionary creation device.

[0022] With this configuration, voice recognition is performed using a voice recognition dictionary in which, for vocabulary registered in the second analysis dictionary, dictionary information based on the phoneme notation of the reading obtained from the second analysis dictionary and on the phoneme notations of the other reading candidates is registered. Therefore, even when the difficult-to-read word 京終, taken as an example of vocabulary registered in the second analysis dictionary, is mistakenly uttered as "kyoshu", it is not rejected, and the intended word 京終 is obtained as the recognition result.

[0023] In one embodiment, the voice recognition device of the second or third invention further comprises a reading determination means that determines whether the voice recognition result includes vocabulary whose notation is the same as that of vocabulary stored in the second analysis dictionary storage means but whose reading differs, and a reading presentation means that, when the reading determination means determines that such vocabulary is included, presents the reading stored in the second analysis dictionary storage means for that vocabulary.

[0024] According to this embodiment, when the voice recognition result includes vocabulary that is stored in the second analysis dictionary but was recognized with a reading different from the one stored there, the reading presentation means presents the correct reading stored in the second analysis dictionary storage means. The user is thereby taught the correct reading of the recognized vocabulary.
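
The reading determination and presentation of [0023] and [0024] amount to a lookup followed by a comparison. The sketch below assumes that a recognition result is available as a pair of written form and matched reading; the function name `reading_to_present` and that data shape are hypothetical.

```python
def reading_to_present(recognized_word, recognized_reading, second_dict):
    """Return the correct reading to present to the user, or None.

    A reading is presented only when the word is registered in the second
    analysis dictionary and was recognized through a different reading.
    """
    correct = second_dict.get(recognized_word)
    if correct is not None and correct != recognized_reading:
        return correct
    return None


second_dict = {"京終": "きょうばて"}
hint = reading_to_present("京終", "きょうしゅう", second_dict)
if hint is not None:
    print(f"京終 is read {hint}")
```

The returned string could then be handed to a display or to a speech synthesizer, matching the synthesized-speech variant described in the following paragraphs.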

[0025] In one embodiment, in the voice recognition device of the second or third invention, the reading presentation means presents the reading stored in the second analysis dictionary storage means by synthesized speech.

[0026] According to this embodiment, the correct reading of the recognized vocabulary is taught to the user by synthesized speech.

[0027] A portable terminal of a fourth invention is equipped with the voice recognition device of the second or third invention.

[0028] A portable terminal is normally used while on the move. When voice recognition is performed on the portable terminal away from home and an utterance is rejected because of an incorrect reading, the user has no means of looking up the correct reading. As a result, the necessary information sometimes cannot be retrieved immediately.

[0029] With this configuration, the portable terminal is equipped with a voice recognition device that returns the intended word 京終 as the recognition result, without rejection, even when, for example, the difficult-to-read word 京終 is mistakenly uttered as "kyoshu". The necessary information can therefore be retrieved immediately by voice even away from home, where there is no means of looking up the correct reading.

[0030] A portable terminal of a fifth invention is equipped with either the voice recognition dictionary creation device of the first invention or the voice recognition device of the second invention.

[0031] With this configuration, a first portable terminal equipped with the voice recognition dictionary creation device of the first invention transmits the information of the created voice recognition dictionary to a second portable terminal equipped with the voice recognition device of the second invention. As a result, even when the difficult-to-read word 京終, taken as an example of vocabulary registered in the second analysis dictionary, is mistakenly uttered as "kyoshu", the voice recognition device of the second portable terminal does not reject it, and the intended word 京終 is obtained as the recognition result.

[0032] A voice recognition system of a sixth invention comprises a server provided with the voice recognition dictionary creation device of the first invention, and a portable terminal that is equipped with the voice recognition device of the second invention and has a transmission/reception means for exchanging voice recognition dictionary information with the server.

[0033] With this configuration, the voice recognition dictionary creation device of the first invention is provided in the server. The portable terminal can therefore be given a simpler configuration, and made lighter, than a portable terminal equipped with the voice recognition device of the third invention. Furthermore, by using the server as the third dictionary storage means and periodically adding to the contents of the second analysis dictionary storage means, the user of the portable terminal can have ever-increasing new words and loanwords, periodically updated television program names, and the like recognized by voice without knowing the contents of the second analysis dictionary.

[0034] A seventh invention is a voice recognition dictionary creation method that uses a text analysis means, a reading assignment means, a voice recognition dictionary creation means, and a voice recognition dictionary storage means, and that has a text analysis step of analyzing character string information and dividing it into constituent words, a reading assignment step of assigning readings to the divided constituent words, and a voice recognition dictionary creation step of creating a voice recognition dictionary based on the results of the text analysis and the reading assignment and storing it in the voice recognition dictionary storage means. The text analysis by the text analysis means is performed by referring to a first analysis dictionary, which is stored in a first analysis dictionary storage means and consists of information including the notation and reading of vocabulary, and a second analysis dictionary, which is stored in a second analysis dictionary storage means and consists of information including the notation and reading of vocabulary not stored in the first analysis dictionary storage means. In the reading assignment by the reading assignment means, when the text analysis result includes vocabulary obtained by referring to the second analysis dictionary, other reading candidates are assigned to that vocabulary in addition to the reading obtained from the second analysis dictionary.

[0035] With this configuration, for vocabulary registered in the second analysis dictionary, the voice recognition dictionary created from the text analysis result and the reading assignment result contains dictionary information based on the phoneme notation of the reading obtained from the second analysis dictionary and on the phoneme notations of the other reading candidates. Therefore, by performing voice recognition with this dictionary, the intended word 京終 is obtained as the recognition result, without rejection, even when the word 京終 (kyobate) registered in the second analysis dictionary is mistakenly uttered as "kyoshu".

[0036] A voice recognition dictionary creation program of an eighth invention causes a computer to function as the text analysis means, reading assignment means, voice recognition dictionary creation means, voice recognition dictionary storage means, first analysis dictionary storage means, and second analysis dictionary storage means of the first invention.

[0037] With this configuration, as in the first invention, the created voice recognition dictionary contains, for vocabulary registered in the second analysis dictionary, dictionary information based on the phoneme notation of the reading obtained from the second analysis dictionary and on the phoneme notations of the other reading candidates. Therefore, by performing voice recognition with this dictionary, the intended word 京終 is obtained as the recognition result, without rejection, even when the word 京終 (kyobate) registered in the second analysis dictionary is mistakenly uttered as "kyoshu".

[0038] A program recording medium of a ninth invention has the voice recognition dictionary creation program of the eighth invention recorded on it.

[0039] With this configuration, by having a computer read out and use the recorded voice recognition dictionary creation program, a voice recognition dictionary is created in which, as in the first invention, dictionary information based on the phoneme notation of the reading obtained from the second analysis dictionary and on the phoneme notations of the other reading candidates is registered for vocabulary registered in the second analysis dictionary. Therefore, by performing voice recognition with this dictionary, the intended word 京終 is obtained as the recognition result, without rejection, even when the word 京終 (kyobate) registered in the second analysis dictionary is mistakenly uttered as "kyoshu".

[0040]

[Embodiments of the Invention] The present invention is described in detail below with reference to the illustrated embodiments.

[0041] <First Embodiment> This embodiment relates to a voice recognition dictionary creation device that creates a voice recognition dictionary with which recognition is possible even when the user does not know the correct reading of the recognition target vocabulary.

[0042] FIG. 1 is a block diagram showing the configuration of the voice recognition dictionary creation device of this embodiment. A text analysis unit 1 performs language analysis (text analysis) on the input character string and divides it into its constituent morphemes. When there are multiple segmentation candidates, all of them are output. Each segmentation candidate is given a likelihood representing how probable that candidate is.
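
The behavior of the text analysis unit 1, outputting every segmentation candidate together with a likelihood, can be sketched as follows. This is a toy enumeration and not a real morphological analyzer: the function name `segmentations`, the lexicon scores, and the choice of the product of word scores as the likelihood are all assumptions made for illustration.

```python
def segmentations(text, lexicon):
    """Enumerate every split of text into lexicon words, best first.

    lexicon maps a written form to a hypothetical score in (0, 1]; the
    likelihood of a candidate is taken to be the product of its word scores.
    """
    if not text:
        return [([], 1.0)]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            for tail_words, tail_score in segmentations(text[end:], lexicon):
                results.append(([head] + tail_words, lexicon[head] * tail_score))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Hypothetical merged view of the first and second analysis dictionaries.
lexicon = {"京": 0.5, "終": 0.5, "京終": 0.8, "駅": 1.0}
for words, likelihood in segmentations("京終駅", lexicon):
    print(words, likelihood)
```

Keeping all candidates, rather than only the best one, leaves the later stages free to attach readings to each plausible segmentation.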

【００４３】読み付与部２は、分割された形態素の読み
を付与する。複数の読みが存在する場合には、複数の読
みの総てを出力する方法と可能性の度合いが最も高い読
みの一つに絞って出力する方法とがある。The reading adding unit 2 assigns a reading to each of the divided morphemes. When a morpheme has multiple readings, either all of them can be output or the output can be narrowed to the single most likely reading.

【００４４】音声認識辞書作成部３は、上記テキスト解
析部１による解析結果と読み付与部２によって付与され
た読みに基づいて、音声認識を行うために必要な音声認
識辞書を作成する。ここで、音声認識辞書には、認識語
彙とその音素表記とを対にして記憶した形式のものと、
各認識語彙の出現連鎖確率を記憶した形式のものとがあ
る。一般に、単語を発声して認識する離散単語音声認識
の場合には前者の形式の音声認識辞書のみを利用し、文
を発声して認識する連続音声認識の場合には前者と後者
との双方の音声認識辞書を利用することが多い。The voice recognition dictionary creating unit 3 creates the voice recognition dictionary required for voice recognition, based on the analysis result from the text analysis unit 1 and the readings assigned by the reading adding unit 2. A voice recognition dictionary can take two forms: one that stores each recognition vocabulary item paired with its phoneme notation, and one that stores the occurrence chain probability of each recognition vocabulary item. In general, discrete word speech recognition, in which single words are uttered and recognized, uses only the former form, whereas continuous speech recognition, in which sentences are uttered and recognized, often uses both forms.
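The two dictionary forms can be pictured as a small data structure. The following Python sketch is illustrative only: the class name `RecognitionDictionary`, its field names, and the use of kana strings as phoneme notation are assumptions made for this example, not part of the specification.

```python
from dataclasses import dataclass, field


@dataclass
class RecognitionDictionary:
    # Form 1: each recognition vocabulary item paired with its phoneme notation(s).
    pronunciations: dict = field(default_factory=dict)
    # Form 2: occurrence chain probability for (previous item, next item) pairs.
    chain_probability: dict = field(default_factory=dict)

    def add_pronunciation(self, word, phonemes):
        """Register a phoneme notation for a word, ignoring duplicates."""
        notations = self.pronunciations.setdefault(word, [])
        if phonemes not in notations:
            notations.append(phonemes)

    def forms_used(self, continuous):
        """Discrete word recognition uses form 1 only; continuous speech often uses both."""
        if continuous:
            return ("pronunciations", "chain_probability")
        return ("pronunciations",)
```

A word with several readings simply accumulates several phoneme notations under the same key, which is the property the later paragraphs rely on.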

【００４５】また、上記連続音声認識の場合に用いる各
認識語彙の出現連鎖確率として、Ｎ‐gramに代表される
統計的言語モデルを使用する場合や、連鎖するか否かの
２値で出現連鎖確率を表現して語彙の連鎖情報を文法で
記述する場合がある。上記テキスト解析結果と付与され
た読みとのデータに基づけば、上記何れの場合の出現連
鎖確率にも変換することが可能である。As the occurrence chain probability of each recognition vocabulary item used in continuous speech recognition, a statistical language model typified by the N-gram may be used, or the probability may be expressed as a binary value indicating whether or not two items can be chained, with the chaining information described as a grammar. The text analysis result and the assigned readings can be converted into the occurrence chain probability in either case.
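As an illustration of the two representations, the sketch below derives bigram probabilities from already segmented sentences and then collapses them to a binary "can chain / cannot chain" relation. The maximum-likelihood estimate and the function names `bigram_probabilities` and `binary_grammar` are assumptions for this example; the specification does not state how the probabilities are estimated.

```python
from collections import Counter


def bigram_probabilities(segmented_sentences):
    """Estimate P(next | previous) by relative frequency over adjacent word pairs."""
    pair_counts = Counter()
    previous_counts = Counter()
    for words in segmented_sentences:
        for previous, following in zip(words, words[1:]):
            pair_counts[(previous, following)] += 1
            previous_counts[previous] += 1
    return {
        pair: count / previous_counts[pair[0]]
        for pair, count in pair_counts.items()
    }


def binary_grammar(probabilities):
    """Keep only the fact that a pair was observed to chain at all."""
    return {pair for pair, probability in probabilities.items() if probability > 0}
```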

【００４６】音声認識辞書記憶部４は、上記音声認識辞
書作成部３で作成された音声認識辞書を記憶する。尚、
音声認識辞書記憶部４を構成する記憶媒体としては、フ
ラッシュメモリやハードディスク等の一般的に広く使用
されている記憶装置である。また、音声認識辞書記憶部
４への記憶形式は、先に述べたように、認識語彙とその
音素表記を対で記憶する形式と、各語彙の出現連鎖確率
を記憶する形式とである。こうして音声認識辞書記憶部
４に記憶された音声認識辞書は、後述する音声認識を行
う際に参照される。The voice recognition dictionary storage unit 4 stores the voice recognition dictionary created by the voice recognition dictionary creating unit 3. The storage medium constituting the voice recognition dictionary storage unit 4 is a commonly used storage device such as a flash memory or a hard disk. As described above, the dictionary is stored in the voice recognition dictionary storage unit 4 in two formats: one that stores each recognition vocabulary item paired with its phoneme notation, and one that stores the occurrence chain probability of each vocabulary item. The voice recognition dictionary stored in this manner is referred to when performing the voice recognition described later.

【００４７】第１解析辞書記憶部５は、上記テキスト解
析部１が上記テキスト解析を行う際に使用される解析辞
書を格納している。ここで、上記テキスト解析は形態素
解析と呼ばれる手法を用いて行われるが、この形態素解
析を行うためには解析辞書が必要になる。この解析辞書
には、日常使用される言葉に対する表記,読み,品詞情報
等の情報が記憶されている。そして、テキスト解析を行
う際には、入力テキストと上記解析辞書との照合処理を
行うことによって、テキストの単語(形態素)を同定する
のである。すなわち、第１解析辞書記憶部５には、日常
的に使用される一般的な語彙の表記,読み,品詞情報等の
情報で成る第１解析辞書を格納しているのである。The first analysis dictionary storage unit 5 stores an analysis dictionary used when the text analysis unit 1 performs the text analysis. Here, the text analysis is performed using a method called morpheme analysis, but an analysis dictionary is required to perform this morpheme analysis. This analysis dictionary stores information such as notation, reading, and part-of-speech information for everyday words. Then, when performing the text analysis, the word (morpheme) of the text is identified by performing the matching process between the input text and the analysis dictionary. That is, the first analysis dictionary storage unit 5 stores a first analysis dictionary composed of information such as notation, reading, and part-of-speech information of commonly used vocabulary.

【００４８】第２解析辞書記憶部６は、上記第１解析辞
書記憶部５に記憶されてはいない特殊な語彙の表記,読
み,品詞情報等の情報で成る第２解析辞書を格納してい
る。登録語彙が一般的であるか特殊であるかを除き、両
解析辞書記憶部５,６における構造およびテキスト解析
部からの参照方法は同一である。尚、第２解析辞書記憶
部６に登録される特殊な語彙との例として、通常の読み
方では読めない地名や人名等がある。The second analysis dictionary storage unit 6 stores a second analysis dictionary composed of information such as the notation, reading, and part of speech of special vocabulary that is not stored in the first analysis dictionary storage unit 5. Apart from whether the registered vocabulary is general or special, the two analysis dictionary storage units 5 and 6 have the same structure and are referenced by the text analysis unit in the same way. Examples of the special vocabulary registered in the second analysis dictionary storage unit 6 include place names and personal names that cannot be read with ordinary readings.

【００４９】上記構成の音声認識辞書作成装置は以下の
ように動作する。図２は、図１に示す音声認識辞書作成
装置によって行われる音声認識辞書作成処理動作のフロ
ーチャートである。以下、図２に従って、音声認識辞書
作成処理について詳細に説明する。本音声認識辞書作成
処理を行うためには、音声認識辞書を生成するための文
字列情報が必要である。文字列情報がテキスト解析部１
に入力されると、音声認識辞書作成処理動作がスタート
する。The voice recognition dictionary creating apparatus having the above configuration operates as follows. FIG. 2 is a flowchart of the voice recognition dictionary creation processing operation performed by the apparatus shown in FIG. 1. The voice recognition dictionary creation process is described in detail below with reference to FIG. 2. This process requires character string information from which the voice recognition dictionary is generated. When character string information is input to the text analysis unit 1, the voice recognition dictionary creation processing operation starts.

【００５０】ステップＳ1で、上記テキスト解析部１に
よって、入力文字列から１文の文字列が取得される。ス
テップＳ2で、テキスト解析部１によって、上記テキス
ト解析が行われる。すなわち、形態素解析処理によっ
て、第１解析辞書記憶部５に記憶されている第１解析辞
書と第２解析辞書記憶部６に記憶されている第２解析辞
書とが照合される。そして、上記１文の入力文字列情報
が単語単位に分割されるのである。上述したように、第
１,第２解析辞書記憶部５,６には単語の表記,読み,品詞
等の情報が記憶されており、第１,第２解析辞書記憶部
５,６と照合することによって入力文字列の構成単語が
何であるかを知ることができるのである。In step S1, the text analysis unit 1 acquires the character string of one sentence from the input character string. In step S2, the text analysis unit 1 performs the text analysis. That is, through morphological analysis, the sentence is collated against the first analysis dictionary stored in the first analysis dictionary storage unit 5 and the second analysis dictionary stored in the second analysis dictionary storage unit 6, and the input character string of that one sentence is divided into words. As described above, the first and second analysis dictionary storage units 5 and 6 store information such as the notation, reading, and part of speech of words, so collating the input against them reveals which words make up the input character string.

【００５１】例えば、「明日の天気」という文字列が入力
された場合、「明日(名詞)」,「の(助詞)」および「天気(名
詞)」の各形態素に分割される。また、「くるまで待つ」と
いう文字列が入力された場合、「くる(動詞)」,「まで(助
詞)」および「待つ(動詞)」の分割結果と、「くるま(名
詞)」,「で(助詞)」および「待つ(動詞)」の分割結果との２
通りの分割結果が存在し、両分割結果に対して、その確
からしさを表す尤度が与えられる。For example, when the character string "明日の天気" ("tomorrow's weather") is input, it is divided into the morphemes "明日 (noun)", "の (particle)", and "天気 (noun)". When the character string "くるまで待つ" is input, there are two division results: "くる (verb)", "まで (particle)", "待つ (verb)" ("wait until [someone] comes") and "くるま (noun)", "で (particle)", "待つ (verb)" ("wait in the car"). Each of the two division results is given a likelihood representing its plausibility.
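The ambiguity in this example can be reproduced with a toy dictionary lookup. The sketch below enumerates every way to split a string into dictionary entries; the lexicon contents and the function name `segmentations` are assumptions for illustration, and the likelihood scoring that the text analysis unit attaches to each candidate is deliberately left out because the specification does not define it.

```python
def segmentations(text, lexicon):
    """Return every split of `text` into (surface, part_of_speech) dictionary entries."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            for tail in segmentations(text[end:], lexicon):
                results.append([(head, lexicon[head])] + tail)
    return results


# Toy lexicon covering only the example sentence.
LEXICON = {
    "くる": "verb",
    "まで": "particle",
    "待つ": "verb",
    "くるま": "noun",
    "で": "particle",
}

candidates = segmentations("くるまで待つ", LEXICON)
```

Here `candidates` holds two division results, one starting with "くる" and one starting with "くるま".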

【００５２】尚、上記形態素解析処理に関しては、右方
向最長一致法や接続表を用いた方法が一般的であり、
「自然言語解析の基礎」(田中穂積著:産業図書 1989年)等
の文献に詳しい。The morphological analysis is generally performed by methods such as the rightward longest-match method or a method using a connection table, which are described in detail in references such as "Basics of Natural Language Analysis" (Hozumi Tanaka, Sangyo Tosho, 1989).

【００５３】ステップＳ3で、上記読み付与部２によっ
て、上記テキスト解析部１からのテキスト解析結果に基
づいて、分割された形態素毎に読みが付与される。尚、
読みが複数ある場合は、総ての読みを出力することも可
能であるし、読みの尤度に応じて最も可能性の高いもの
から幾つかの読みを出力することも可能である。上述の
例の場合には、分割単語「明日」には「あす」と「あした」と
の２種類の読みが存在し、夫々の読みに尤度が与えられ
るのである。In step S3, the reading adding unit 2 assigns a reading to each divided morpheme based on the text analysis result from the text analysis unit 1. When a morpheme has multiple readings, all of them may be output, or only several may be output in descending order of likelihood. In the example above, the divided word "明日" has two readings, "あす" (asu) and "あした" (ashita), and each reading is given a likelihood.

【００５４】ステップＳ4で、上記読み付与部２によっ
て、テキスト解析部１から入力された形態素の中に、第
２解析辞書に登録されている語彙が含まれているか否か
が判別される。尚、この判別は、例えば第２解析辞書に
基づくテキスト解析結果にフラグを立てること等によっ
て行われる。その結果、含まれている場合にはステップ
Ｓ5に進み、含まれていない場合にはステップＳ6に進
む。ステップＳ5で、読み付与部２によって、上記第２
解析辞書に含まれている語彙に関して、上記第２解析辞
書による分割単語と読みとの対応と、上記第１解析辞書
による解析結果をも含めた分割単語候補と読み候補との
対応とが、対応テーブルに記録される。以下に、具体例
を上げて説明する。In step S4, the reading adding unit 2 determines whether the morphemes input from the text analysis unit 1 include a vocabulary item registered in the second analysis dictionary. This determination is made, for example, by setting a flag on text analysis results that are based on the second analysis dictionary. If such an item is included, the process proceeds to step S5; otherwise it proceeds to step S6. In step S5, for the vocabulary item contained in the second analysis dictionary, the reading adding unit 2 records in a correspondence table both the pairing of the divided word and its reading according to the second analysis dictionary and the pairings of divided word candidates and reading candidates that also take into account the analysis result from the first analysis dictionary. A specific example is described below.

【００５５】例えば、上述した地名「京終」は「きょうば
て」と読む。しかしながら、一般的な単語ではないため
普通の解析辞書には登録されていないことが多い。すな
わち、本実施の形態の場合においては、第１解析辞書記
憶部５には単語「京終」は登録されておらず、第２解析辞
書記憶部６に登録されることになる。一方、第１解析辞
書記憶部５には、語彙「京」および語彙「終」が登録されて
いるものとする。For example, the place name "京終" mentioned above is read "きょうばて" (Kyobate). Because it is not a common word, however, it is often not registered in ordinary analysis dictionaries. In this embodiment, the word "京終" is therefore not registered in the first analysis dictionary storage unit 5 but is registered in the second analysis dictionary storage unit 6. The vocabulary items "京" and "終", on the other hand, are assumed to be registered in the first analysis dictionary storage unit 5.

【００５６】その場合において、上記「京終」という文字
列がテキスト解析部１に入力されると、テキスト解析部
１によって、第２解析辞書記憶部６に登録されている
「京終」と合致するために、単語「京終(名詞)」が得られ
る。そして、読み付与部２によって読み「きょうばて」が
付与される。ここで、単語「京終(きょうばて):名詞」は
上記第２解析辞書に登録された語彙であるため、上記第
２解析辞書を用いた解析結果である「京終(きょうばて):
名詞」と、上記第１解析辞書を用いた解析結果候補であ
る「京(きょう):名詞」/「終(しゅう):名詞」や「京(きょ
う):名詞」/「終(おわり):名詞」とが、図３に示すよう
に、上記対応テーブルに記録されるのである。In this case, when the character string "京終" is input to the text analysis unit 1, it matches the entry "京終" registered in the second analysis dictionary storage unit 6, so the text analysis unit 1 obtains the word "京終 (noun)". The reading adding unit 2 then assigns the reading "きょうばて". Because the word "京終 (きょうばて): noun" is a vocabulary item registered in the second analysis dictionary, the analysis result obtained with the second analysis dictionary, "京終 (きょうばて): noun", and the analysis result candidates obtained with the first analysis dictionary, "京 (きょう): noun" / "終 (しゅう): noun" and "京 (きょう): noun" / "終 (おわり): noun", are recorded in the correspondence table as shown in FIG. 3.
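One way to picture how such a correspondence table row could be assembled is sketched below. This is a minimal illustration under stated assumptions: the dictionary contents, the row layout (`word`, `reading`, `other_candidates`), and the helper names are hypothetical and are not taken from FIG. 3; part-of-speech information is omitted for brevity.

```python
from itertools import product

# Toy stand-ins for the first (general) and second (special) analysis dictionaries.
FIRST_DICTIONARY = {"京": ["きょう"], "終": ["しゅう", "おわり"]}
SECOND_DICTIONARY = {"京終": "きょうばて"}


def split_into_entries(text, lexicon):
    """Return every split of `text` into keys of `lexicon`."""
    if not text:
        return [[]]
    splits = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            splits += [[head] + rest for rest in split_into_entries(text[end:], lexicon)]
    return splits


def correspondence_row(word):
    """Pair the second-dictionary reading with candidates built from the first dictionary."""
    row = {"word": word, "reading": SECOND_DICTIONARY[word], "other_candidates": []}
    for parts in split_into_entries(word, FIRST_DICTIONARY):
        # Combine every reading of every part to form the alternative reading candidates.
        for readings in product(*(FIRST_DICTIONARY[part] for part in parts)):
            row["other_candidates"].append(("/".join(parts), "/".join(readings)))
    return row


row = correspondence_row("京終")
```

With these toy dictionaries, `row` carries the registered reading "きょうばて" together with the candidates "きょう/しゅう" and "きょう/おわり" for the split "京/終".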

【００５７】仮に、解析辞書記憶部が、上記第１解析辞
書記憶部５と第２解析辞書記憶部６とに分かれていない
場合には、「京終」という表記に対して「きょうばて」とい
う読みしか得られず、「きょうしゅう」や「きょうおわり」
は得られることはない。本実施の形態のごとく、読み付
与部２によって、第１解析辞書記憶部５を参照して得た
読みと第２解析辞書記憶部６を参照して得た読みとに基
づいて読みを生成することによって、「きょうばて」,「き
ょうしゅう」および「きょうおわり」の３通りの読みを得
る事ができるのである。If the analysis dictionary storage were not divided into the first analysis dictionary storage unit 5 and the second analysis dictionary storage unit 6, only the reading "きょうばて" would be obtained for the notation "京終", and "きょうしゅう" and "きょうおわり" would never be obtained. As in this embodiment, by having the reading adding unit 2 generate readings based on both the readings obtained from the first analysis dictionary storage unit 5 and the reading obtained from the second analysis dictionary storage unit 6, three readings can be obtained: "きょうばて", "きょうしゅう", and "きょうおわり".

【００５８】ステップＳ6で、上記音声認識辞書作成部
３によって、上記対応テーブルの内容を含む上記テキス
ト解析結果および読み付与結果の情報に基づいて音声認
識辞書が生成される。そして、生成された音声認識辞書
が音声認識辞書記憶部４に記憶される。ステップＳ7
で、テキスト解析部１によって、入力文字列に次の文が
あるか否かが判別される。その結果、ある場合には上記
ステップＳ1に戻って次の１文の文字列取得に移行す
る。また、ない場合には音声認識辞書作成処理動作を終
了する。In step S6, the voice recognition dictionary creating unit 3 generates the voice recognition dictionary based on the text analysis result and the reading assignment result, including the contents of the correspondence table. The generated voice recognition dictionary is stored in the voice recognition dictionary storage unit 4. In step S7, the text analysis unit 1 determines whether the input character string contains a next sentence. If it does, the process returns to step S1 to acquire the character string of the next sentence. If it does not, the voice recognition dictionary creation processing operation ends.
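The control flow of steps S1 to S7 can be summarized in a short sketch. Everything here is a simplification under stated assumptions: `analyze` and `other_readings` are hypothetical callables standing in for the text analysis / reading assignment and for the correspondence table lookup, and the resulting dictionary keeps only the word-to-phoneme-notation form.

```python
def create_recognition_dictionary(sentences, analyze, second_dictionary, other_readings):
    """Build {word: [phoneme notations]} by following steps S1 to S7 sentence by sentence."""
    recognition_dictionary = {}
    for sentence in sentences:                      # S1, repeated while S7 finds a next sentence
        for word, reading in analyze(sentence):     # S2 (analysis) and S3 (reading assignment)
            readings = [reading]
            if word in second_dictionary:           # S4: is the word from the second dictionary?
                readings += other_readings(word)    # S5: add the recorded alternative readings
            registered = recognition_dictionary.setdefault(word, [])  # S6: update dictionary
            for candidate in readings:
                if candidate not in registered:
                    registered.append(candidate)
    return recognition_dictionary
```

Passing the analysis and the alternative-reading lookup in as callables keeps the sketch independent of any particular morphological analyzer.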

【００５９】このように、本実施の形態においては、日
常使用される語彙の表記,読み,品詞情報等の情報が登録
された第１解析辞書を記憶する第１解析辞書記憶部５
と、特殊な語彙の表記,読み,品詞情報等の情報が登録さ
れた第２解析辞書を記憶する第２解析辞書記憶部６とを
有している。そして、テキスト解析部１は両解析辞書記
憶部５,６を用いて形態素解析を行い、読み付与部２は
上記形態素解析結果に基づいて形態素に読みを付与す
る。音声認識辞書作成部３は、上記解析結果と読みとに
基づいて音声認識辞書を作成する。As described above, this embodiment has the first analysis dictionary storage unit 5, which stores the first analysis dictionary in which information such as the notation, reading, and part of speech of everyday vocabulary is registered, and the second analysis dictionary storage unit 6, which stores the second analysis dictionary in which the same kinds of information are registered for special vocabulary. The text analysis unit 1 performs morphological analysis using both analysis dictionary storage units 5 and 6, and the reading adding unit 2 assigns readings to the morphemes based on the morphological analysis result. The voice recognition dictionary creating unit 3 creates the voice recognition dictionary based on the analysis result and the readings.

【００６０】その際に、上記読み付与部２は、入力形態
素中に上記第２解析辞書の登録語彙を含む場合には、そ
の語彙に関して、上記第２解析辞書を用いた解析による
分割単語「京終」とその読み「きょうばて」との対に加え
て、上記第１解析辞書を用いた解析結果をも含めた他の
分割単語候補「京/終」とその読み候補「きょう/しゅう」，
「きょう/おわり」との対を対応テーブルに記録する。そ
して、音声認識辞書作成部３は、上記対応テーブルの記
録内容に基づいて音声認識辞書を作成するのである。At this time, when the input morphemes include a vocabulary item registered in the second analysis dictionary, the reading adding unit 2 records in the correspondence table, for that item, not only the pair of the divided word "京終" and its reading "きょうばて" obtained by analysis with the second analysis dictionary, but also the pairs of the other divided word candidate "京/終" and its reading candidates "きょう/しゅう" and "きょう/おわり", which take into account the analysis result obtained with the first analysis dictionary. The voice recognition dictionary creating unit 3 then creates the voice recognition dictionary based on the recorded contents of the correspondence table.

【００６１】その結果、上記音声認識辞書は、例えば認
識語彙「京終」と音素表記「きょうばて」,「きょうしゅう」,
「きょうおわり」との対応付けに基づいて作成されること
になる。したがって、音声認識辞書作成部３によって作
成された音声認識辞書を用いて音声認識を行うことによ
って、表記「京終」を「きょうおわり」と発声された場合で
あっても、「京終」と正しく認識できるのである。As a result, the voice recognition dictionary is created based on, for example, the association of the recognition vocabulary item "京終" with the phoneme notations "きょうばて", "きょうしゅう", and "きょうおわり". Therefore, by performing voice recognition using the voice recognition dictionary created by the voice recognition dictionary creating unit 3, the notation "京終" is correctly recognized as "京終" even when it is uttered as "きょうおわり".
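The effect of associating several phoneme notations with one vocabulary item can be shown with a plain lookup table. This is only a sketch: the row layout and the name `build_pronunciation_lookup` are assumptions, kana strings stand in for phoneme notation, and a real recognizer would score acoustic likelihoods rather than compare strings exactly.

```python
def build_pronunciation_lookup(correspondence_table):
    """Map every phoneme notation, correct or alternative, to its vocabulary item."""
    lookup = {}
    for row in correspondence_table:
        lookup[row["reading"]] = row["word"]
        for _, reading in row["other_candidates"]:
            # Alternatives never displace an entry that is already present.
            lookup.setdefault(reading.replace("/", ""), row["word"])
    return lookup


# Hypothetical correspondence table content for the example word.
TABLE = [
    {
        "word": "京終",
        "reading": "きょうばて",
        "other_candidates": [("京/終", "きょう/しゅう"), ("京/終", "きょう/おわり")],
    }
]

lookup = build_pronunciation_lookup(TABLE)
```

All three notations resolve to "京終" in `lookup`, while a notation that was never registered is simply absent.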

【００６２】＜第２実施の形態＞本実施の形態は、上記
第１実施の形態における音声認識辞書作成装置に、上記
第２解析辞書を自動的に取得する上記所取得部を設けた
ものに関する。<Second Embodiment> This embodiment relates to the voice recognition dictionary creating apparatus of the first embodiment additionally provided with a dictionary acquisition unit that automatically acquires the second analysis dictionary.

【００６３】図４は、本実施の形態の音声認識辞書作成
装置における構成を示すブロック図である。テキスト解
析部１１,読み付与部１２,音声認識辞書作成部１３,音
声認識辞書記憶部１４,第１解析辞書記憶部１５および
第２解析辞書記憶部１６は、上記第１実施の形態におい
て図１に示すテキスト解析部１,読み付与部２,音声認識
辞書作成部３,音声認識辞書記憶部４,第１解析辞書記憶
部５および第２解析辞書記憶部６と同じであり、詳細な
説明は省略する。FIG. 4 is a block diagram showing the configuration of the voice recognition dictionary creating apparatus of this embodiment. The text analysis unit 11, reading adding unit 12, voice recognition dictionary creating unit 13, voice recognition dictionary storage unit 14, first analysis dictionary storage unit 15, and second analysis dictionary storage unit 16 are the same as the text analysis unit 1, reading adding unit 2, voice recognition dictionary creating unit 3, voice recognition dictionary storage unit 4, first analysis dictionary storage unit 5, and second analysis dictionary storage unit 6 shown in FIG. 1 for the first embodiment, and a detailed description of them is omitted.

【００６４】上記第１解析辞書記憶部１５には一般的な
語彙を登録するのに対して、第２解析辞書記憶部１６に
は特殊な語彙を登録することは、上記第１実施の形態の
場合と同様である。ここで、特殊な語彙としては、例え
ば、専門性の高い語彙、出現頻度の低い馴染みの薄い語
彙、略語、新語、難読語等である。While the general vocabulary is registered in the first analysis dictionary storage unit 15, the special vocabulary is registered in the second analysis dictionary storage unit 16 according to the first embodiment. It is similar to the case. Here, the special vocabulary includes, for example, highly specialized vocabulary, infrequently appearing unfamiliar vocabulary, abbreviations, new words, and difficult-to-read words.

【００６５】ところで、上記専門性の高い語彙,略語,新
語等は、時代の流れと共に絶えず新しい語彙が出現す
る。したがって、この新しく出現した特殊な語彙が、絶
えず第２解析辞書記憶部１６に登録されない場合には、
その新しく出現した特殊な語彙がテキスト解析部１１に
入力されても正確に読みが付与されない可能性が高くな
る。尚、正確に読みが付与できず未知語として判定した
場合に、読みを推定する技術もある。しかしながら、こ
の読み推定技術によるよみ推定の精度はそれ程高くはな
い。そのために、正確に読みを付与しようとする場合に
は、新しく出現した特殊な語彙を絶えず第２解析辞書記
憶部１６に登録しておく必要がある。New items of such highly specialized vocabulary, abbreviations, new words, and the like appear constantly as times change. If these newly appearing special vocabulary items are not continually registered in the second analysis dictionary storage unit 16, it becomes highly likely that a correct reading will not be assigned when such an item is input to the text analysis unit 11. There are techniques for estimating the reading when a reading cannot be assigned accurately and the item is judged to be an unknown word, but the accuracy of such reading estimation is not very high. To assign readings accurately, newly appearing special vocabulary therefore needs to be continually registered in the second analysis dictionary storage unit 16.

【００６６】そこで、本実施の形態においては、辞書取
得部１７を設けて、第２解析辞書記憶部１６に記憶する
特殊な語彙の表記,読み,品詞情報等の情報を、第３の辞
書記憶手段(図示せず)から辞書取得部１７によって取得
するのである。こうすることによって、新語のように新
しい言葉が出現すれば、それを第２解析辞書記憶部１６
に追加登録できるのである。In this embodiment, therefore, a dictionary acquisition unit 17 is provided, and the notation, reading, part of speech, and other information of the special vocabulary to be stored in the second analysis dictionary storage unit 16 is acquired by the dictionary acquisition unit 17 from a third dictionary storage means (not shown). In this way, whenever a new term such as a newly coined word appears, it can be additionally registered in the second analysis dictionary storage unit 16.
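The additional registration can be sketched as a merge of entries from the third dictionary storage into the second analysis dictionary. The function name `acquire_dictionary`, the entry layout, and the sample program title are hypothetical, and leaving existing entries untouched is an assumption; the specification only states that new vocabulary is added.

```python
def acquire_dictionary(second_dictionary, third_dictionary):
    """Add entries found only in the third dictionary; return the notations that were added."""
    added = []
    for notation, info in third_dictionary.items():
        if notation not in second_dictionary:
            second_dictionary[notation] = info
            added.append(notation)
    return added


second = {"京終": {"reading": "きょうばて", "pos": "noun"}}
third = {
    "京終": {"reading": "きょうばて", "pos": "noun"},
    "新番組": {"reading": "しんばんぐみ", "pos": "noun"},  # made-up example entry
}
newly_added = acquire_dictionary(second, third)
```

Running the same acquisition periodically is harmless in this sketch, because entries that are already registered are skipped.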

【００６７】また、そうすることによって、第２解析辞
書記憶部１６には、電子情報化されたテレビ番組名や音
楽タイトル名等の日々更新される言葉も登録することが
可能になる。したがって、辞書取得部１７によって、定
期的に、新しく出現した特殊な語彙を第２解析辞書記憶
部１６に登録しておけば、第２解析辞書記憶部１６に登
録されている認識対象語彙を利用者が知らない場合であ
っても、入力音声を認識可能な音声認識辞書を作成して
音声認識辞書記憶部１４に記憶することができるのであ
る。This also makes it possible to register in the second analysis dictionary storage unit 16 terms that are updated daily, such as the names of television programs and music titles available as electronic information. Therefore, if the dictionary acquisition unit 17 periodically registers newly appearing special vocabulary in the second analysis dictionary storage unit 16, a voice recognition dictionary capable of recognizing the input speech can be created and stored in the voice recognition dictionary storage unit 14 even when the user does not know which recognition target vocabulary is registered in the second analysis dictionary storage unit 16.

【００６８】ここで、上記第３の辞書記憶手段および辞
書取得部１７による上記特殊な語彙の取得方法について
は、特に限定するものではない。例えば、フロッピー
（登録商標）ディスクやＣＤ(コンパクトディスク)‐Ｒ
ＯＭ(リード・オンリ・メモリ)等のメディアから取得する
方法、ネットワークからダウンロードする方法、文字放
送等の仕組みを利用する方法等がある。何れにせよ、語
彙の情報提供者によって新しい語彙を登録した第３の辞
書記憶手段が用意されれば、その第３の辞書記憶手段か
ら辞書取得部１７によって新しい語彙を取得して利用す
ることができるのである。The third dictionary storage means and the method by which the dictionary acquisition unit 17 acquires the special vocabulary are not particularly limited. Examples include acquiring it from media such as a floppy (registered trademark) disk or a CD (compact disc)-ROM (read-only memory), downloading it from a network, and using a mechanism such as teletext broadcasting. In any case, once a vocabulary information provider prepares a third dictionary storage means in which new vocabulary is registered, the dictionary acquisition unit 17 can acquire the new vocabulary from it and use it.

【００６９】以上のごとく、本実施の形態においては、
上記辞書取得部１７を設け、この辞書取得部１７によっ
て、第２解析辞書記憶部１６に記憶する特殊な語彙を第
３の辞書記憶手段から取得するようにしている。したが
って、辞書取得部１７によって、定期的に、第３の辞書
記憶手段から新たな語彙の情報を取得して第２解析辞書
記憶部１６に登録しておけば、テレビ番組名や音楽タイ
トル名等の日々更新される単語であるために第２解析辞
書記憶部１６に登録されていることを利用者が知らない
単語であっても認識可能な音声認識辞書を作成すること
が可能になる。As described above, in this embodiment the dictionary acquisition unit 17 is provided, and it acquires the special vocabulary to be stored in the second analysis dictionary storage unit 16 from the third dictionary storage means. Therefore, if the dictionary acquisition unit 17 periodically acquires new vocabulary information from the third dictionary storage means and registers it in the second analysis dictionary storage unit 16, it becomes possible to create a voice recognition dictionary that can recognize even words that the user does not know are registered in the second analysis dictionary storage unit 16 because, like television program names and music titles, they are updated daily.

【００７０】＜第３実施の形態＞本実施の形態は、上記
第１実施の形態における音声認識辞書作成装置が搭載さ
れた音声認識装置に関する。<Third Embodiment> The present embodiment relates to a voice recognition device equipped with the voice recognition dictionary creating device according to the first embodiment.

【００７１】図５は、本実施の形態の音声認識装置にお
ける構成を示すブロック図である。テキスト解析部２
１,読み付与部２２,音声認識辞書作成部２３,第１解析
辞書記憶部２４および第２解析辞書記憶部２５は、上記
第１実施の形態において図１に示すテキスト解析部１,
読み付与部２,音声認識辞書作成部３,第１解析辞書記憶
部５および第２解析辞書記憶部６と同じであり、音声認
識辞書作成装置２６を構成している。そして、音声認識
辞書作成装置２６で作成された音声認識辞書は、音声認
識辞書記憶部２７に記憶される。尚、音声認識辞書作成
装置２６および音声認識辞書記憶部２７の詳細な説明は
省略する。FIG. 5 is a block diagram showing the configuration of the voice recognition device of this embodiment. The text analysis unit 21, reading adding unit 22, voice recognition dictionary creating unit 23, first analysis dictionary storage unit 24, and second analysis dictionary storage unit 25 are the same as the text analysis unit 1, reading adding unit 2, voice recognition dictionary creating unit 3, first analysis dictionary storage unit 5, and second analysis dictionary storage unit 6 shown in FIG. 1 for the first embodiment, and together they constitute a voice recognition dictionary creating device 26. The voice recognition dictionary created by the voice recognition dictionary creating device 26 is stored in the voice recognition dictionary storage unit 27. A detailed description of the voice recognition dictionary creating device 26 and the voice recognition dictionary storage unit 27 is omitted.

【００７２】音声認識部３１は、音響分析部２８,尤度
演算部２９および照合処理部３０で構成されて、入力音
声を音声認識辞書記憶部２７に登録されている単語との
照合を行って認識し、認識結果を出力する。以下、その
概略を説明する。The voice recognition unit 31 is composed of an acoustic analysis unit 28, a likelihood calculation unit 29, and a matching processing unit 30. It recognizes the input speech by matching it against the words registered in the voice recognition dictionary storage unit 27 and outputs the recognition result. An outline is given below.

【００７３】上記音響分析部２８は、マイク(図示せず)
から入力された音声をディジタル波形に変換し、短い時
間間隔(フレーム)毎に周波数分析し、スペクトルを表す
パラメータのベクトル系列に変換する。周波数分析には
ＬＰＣ(線形予測分析)メルケプストラムのような表現方
法が用いられる。尤度演算部２９は、上記得られた入力
音声のパラメータベクトルに対し、音響モデル(ＨＭＭ
(隠れマルコフモデル)等)を作用させて各音韻毎に尤度
を算出する。照合処理部３０は、音韻尤度(類似度)系列
に対して、音声認識辞書記憶部２７に記憶されている総
ての項目(単語)との照合を行ない、各単語のスコアを算
出する。そして、スコアが高い単語を認識結果として出
力するのである。尚、音声認識方法については、「ディ
ジタル音声処理」(古井著:東海大学出版会、1985年)等の
文献に詳しい。The acoustic analysis unit 28 converts speech input from a microphone (not shown) into a digital waveform, performs frequency analysis at short time intervals (frames), and converts the result into a sequence of parameter vectors representing the spectrum. A representation such as the LPC (linear predictive analysis) mel-cepstrum is used for the frequency analysis. The likelihood calculation unit 29 applies an acoustic model (such as an HMM (hidden Markov model)) to the resulting parameter vectors of the input speech and calculates a likelihood for each phoneme. The matching processing unit 30 matches the phoneme likelihood (similarity) sequence against all the items (words) stored in the voice recognition dictionary storage unit 27 and calculates a score for each word. A word with a high score is then output as the recognition result. Voice recognition methods are described in detail in references such as "Digital Speech Processing" (Furui, Tokai University Press, 1985).

【００７４】その場合、上記音声認識辞書記憶部２７の
音声認識辞書は、上記第１実施の形態において述べたよ
うに、例えば、難読語である地名「京終」に対して、その
正しい音素表記「きょうばて」に加えて、誤った音素表記
「きょうしゅう」,「きょうおわり」をも対応付けて作成さ
れている。したがって、発話者が上記マイクに向って
「きょうばて」と発声することによって認識結果「京終」
を得ることができる。それに加えて、「きょうしゅう」あ
るいは「きょうおわり」と誤って発声した場合であって
も、正しい認識結果「京終」を得ることができるのであ
る。In this case, as described in the first embodiment, the voice recognition dictionary in the voice recognition dictionary storage unit 27 is created so that, for example, the difficult-to-read place name "京終" is associated not only with its correct phoneme notation "きょうばて" but also with the incorrect phoneme notations "きょうしゅう" and "きょうおわり". The speaker can therefore obtain the recognition result "京終" by uttering "きょうばて" into the microphone. In addition, the correct recognition result "京終" is obtained even when the speaker mistakenly utters "きょうしゅう" or "きょうおわり".

【００７５】すなわち、本実施の形態によれば、難読語
である地名や人名の読みを誤って覚えている場合や、正
確な読みが分らない場合であっても、入力音声を目的の
語彙として認識できる。したがって、本実施の形態を、
難読語が多い地名を発声で入力して検索された地図を表
示する地図検索表示装置等に適用すれば、非常に有効に
利用することができる。That is, according to this embodiment, the input speech can be recognized as the intended vocabulary item even when the speaker has memorized the reading of a difficult-to-read place name or personal name incorrectly or does not know the exact reading. This embodiment can therefore be used very effectively when applied to, for example, a map search and display device that displays a map retrieved by spoken input of a place name, since many place names are difficult to read.

【００７６】尚、本実施の形態においては、音声認識装
置に音声認識辞書作成装置２６を搭載している。しかし
ながら、この発明はこれに限定されるものではなく、音
声認識装置を音声認識辞書作成装置２６とは独立に設
け、上記第１,第２実施の形態における音声認識辞書作
成装置によって作成された音声認識辞書を音声認識辞書
記憶部２７に記憶するようにしても差し支えない。In this embodiment, the voice recognition dictionary creating device 26 is mounted in the voice recognition device. However, the invention is not limited to this: the voice recognition device may be provided independently of the voice recognition dictionary creating device 26, and a voice recognition dictionary created by the voice recognition dictionary creating apparatus of the first or second embodiment may be stored in the voice recognition dictionary storage unit 27.

【００７７】＜第４実施の形態＞本実施の形態は、上記
第１実施の形態における音声認識辞書作成装置が搭載さ
れると共に、難読語を誤って発声した場合に正しい読み
を提示して教えてくれる音声認識装置に関する。<Fourth Embodiment> This embodiment relates to a voice recognition device that is equipped with the voice recognition dictionary creating apparatus of the first embodiment and that presents the correct reading to the user when a difficult-to-read word is uttered incorrectly.

【００７８】図６は、本実施の形態の音声認識装置にお
ける構成を示すブロック図である。テキスト解析部４
１,読み付与部４２,音声認識辞書作成部４３,第１解析
辞書記憶部４４および第２解析辞書記憶部４５は、上記
第１実施の形態において図１に示すテキスト解析部１,
読み付与部２,音声認識辞書作成部３,第１解析辞書記憶
部５および第２解析辞書記憶部６と同じであり、音声認
識辞書作成装置４６を構成している。そして、音声認識
辞書作成装置４６で作成された音声認識辞書は、音声認
識辞書記憶部４７に記憶される。尚、音声認識辞書作成
装置４６および音声認識辞書記憶部４７の詳細な説明は
省略する。FIG. 6 is a block diagram showing the configuration of the voice recognition device of this embodiment. The text analysis unit 41, reading adding unit 42, voice recognition dictionary creating unit 43, first analysis dictionary storage unit 44, and second analysis dictionary storage unit 45 are the same as the text analysis unit 1, reading adding unit 2, voice recognition dictionary creating unit 3, first analysis dictionary storage unit 5, and second analysis dictionary storage unit 6 shown in FIG. 1 for the first embodiment, and together they constitute a voice recognition dictionary creating device 46. The voice recognition dictionary created by the voice recognition dictionary creating device 46 is stored in the voice recognition dictionary storage unit 47. A detailed description of the voice recognition dictionary creating device 46 and the voice recognition dictionary storage unit 47 is omitted.

【００７９】音声認識部４８は、上記第３実施の形態に
おいて図５に示す音声認識部３１と同じ構成を有してい
る。そして、入力された音声を音響分析してパラメータ
のベクトル系列に変換し、パラメータベクトルに対して
音響モデルを作用させて各音韻毎に尤度演算し、音韻尤
度系列と音声認識辞書記憶部４７の総単語との照合を行
って各単語のスコアを算出し、最も高いスコアを呈する
単語を認識結果として出力する。The voice recognition unit 48 has the same configuration as the voice recognition unit 31 shown in FIG. 5 for the third embodiment. It acoustically analyzes the input speech and converts it into a sequence of parameter vectors, applies an acoustic model to the parameter vectors to calculate a likelihood for each phoneme, matches the phoneme likelihood sequence against all the words in the voice recognition dictionary storage unit 47 to calculate a score for each word, and outputs the word with the highest score as the recognition result.

【００８０】読み判定部４９は、上記音声認識部４８か
らの音声認識の結果を受けて、その中に、第２解析辞書
記憶部４５に記憶されている語彙と表記は同じであるが
読みは異なる語彙が含まれるか否かを判定する。読み提
示部５０は、読み判定部４９の判定結果を受けて、上記
判定結果が「真」である場合には第２解析辞書記憶部４５
に記憶されている当該語彙の読みを提示する。すなわ
ち、第２解析辞書記憶部４５に記憶された語彙の間違っ
た読みが音声認識部４８に入力(発声)された場合に、当
該語彙の正しい読みを提示して、使用者に教えるのであ
る。The reading determination unit 49 receives the voice recognition result from the voice recognition unit 48 and determines whether it contains a vocabulary item that has the same notation as a vocabulary item stored in the second analysis dictionary storage unit 45 but a different reading. The reading presentation unit 50 receives the determination result from the reading determination unit 49 and, when the result is "true", presents the reading of that vocabulary item stored in the second analysis dictionary storage unit 45. That is, when an incorrect reading of a vocabulary item stored in the second analysis dictionary storage unit 45 is input (uttered) to the voice recognition unit 48, the correct reading of that item is presented to teach it to the user.
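The decision made by the reading determination unit and the reading presentation unit can be reduced to a small function. This is a sketch under an assumption that the recognizer reports which phoneme notation matched the utterance; the name `correct_reading_to_present` and the dictionary layout are hypothetical.

```python
def correct_reading_to_present(recognized_word, matched_reading, second_dictionary):
    """Return the registered reading when the utterance used a different one, else None."""
    correct = second_dictionary.get(recognized_word)
    if correct is not None and matched_reading != correct:
        return correct      # determination is "true": present the correct reading
    return None             # correct reading used, or word not in the second dictionary


# Toy second analysis dictionary: notation -> registered reading.
SECOND_DICTIONARY = {"京終": "きょうばて"}

hint = correct_reading_to_present("京終", "きょうしゅう", SECOND_DICTIONARY)
```

In this example `hint` is the registered reading "きょうばて", which is what the device would show to the user after recognizing the mispronounced word.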

【００８１】上記構成を有する音声認識装置は、以下の
ように動作する。図７は、音声認識部４８,読み判定部
４９および読み提示部５０によって実行される音声認識
処理動作のフローチャートである。以下、図７に従っ
て、上記音声認識処理動作について説明する。マイク
(図示せず)から音声認識部４８に音声が入力されると音
声認識処理動作がスタートする。The voice recognition device having the above configuration operates as follows. FIG. 7 is a flowchart of the voice recognition processing operation executed by the voice recognition unit 48, the reading determination unit 49, and the reading presentation unit 50. The voice recognition processing operation is described below with reference to FIG. 7. When speech is input from a microphone (not shown) to the voice recognition unit 48, the voice recognition processing operation starts.

【００８２】In step S11, the voice recognition unit 48 converts the input voice into a digital waveform, performs frequency analysis frame by frame, and converts the result into a vector sequence of parameters representing the spectrum (acoustic analysis). An acoustic model is then applied to the parameter vectors to compute a likelihood for each phoneme (likelihood calculation). Finally, the phoneme likelihood sequence is matched against all the words registered in the voice recognition dictionary storage unit 47 to calculate a score for each word (matching process).
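
The matching process of step S11 can be illustrated with a minimal Python sketch. It assumes that acoustic analysis and likelihood calculation have already produced one table of phoneme log-likelihoods per frame. The function names `align_score` and `score_words`, the toy phoneme symbols, and the simple monotonic-alignment scoring are illustrative assumptions and are not taken from the specification.

```python
NEG_INF = float("-inf")


def align_score(frame_loglik, phonemes):
    """Best score for aligning `phonemes`, in order, to the frames.

    Each phoneme must cover at least one frame; the score is the sum of
    the per-frame log-likelihoods along the best alignment.
    """
    if not frame_loglik or not phonemes:
        return NEG_INF
    best = [NEG_INF] * len(phonemes)
    best[0] = frame_loglik[0].get(phonemes[0], NEG_INF)
    for frame in frame_loglik[1:]:
        new = [NEG_INF] * len(phonemes)
        for j, ph in enumerate(phonemes):
            # Either stay on the same phoneme or advance from the previous one.
            prev = best[j] if j == 0 else max(best[j], best[j - 1])
            if prev > NEG_INF:
                new[j] = prev + frame.get(ph, NEG_INF)
        best = new
    return best[-1]


def score_words(frame_loglik, dictionary):
    """Score every registered word by its best-matching phoneme notation."""
    return {
        word: max(align_score(frame_loglik, p) for p in notations)
        for word, notations in dictionary.items()
    }


# Toy input: three frames with made-up log-likelihoods for two phonemes.
frames = [
    {"a": -0.1, "b": -2.0},
    {"a": -0.2, "b": -1.5},
    {"a": -3.0, "b": -0.1},
]
toy_dictionary = {"AB": [("a", "b")], "BA": [("b", "a")]}
scores = score_words(frames, toy_dictionary)
best_word = max(scores, key=scores.get)
```

Because a word's score is the maximum over all of its registered phoneme notations, a word registered with several readings is scored by whichever reading best fits the utterance.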

【００８３】In step S12, the reading determination unit 49 determines, based on the voice recognition result from the voice recognition unit 48, whether the result contains a vocabulary item that has the same notation as a vocabulary item stored in the second analysis dictionary storage unit 45 but a different reading. That is, it determines whether the voice recognition result is included in the "other candidates" of the correspondence table shown in FIG. 3. If it is, the process proceeds to step S13; otherwise, it proceeds to step S14.

【００８４】Here, the voice recognition result being included in the "other candidates" of the correspondence table corresponds, for example, to the case where a person who sees the notation 京終 utters "kyoshu" and the utterance is recognized as "kyoshu" as is. The determination in this step can be realized, for example, as follows. When the voice recognition dictionary creation unit 43 creates the voice recognition dictionary based on the correspondence table, it sets a flag on those phoneme notations that are included in the "other candidates" of the correspondence table, namely "kyoshu" and "kyoowari", among the phoneme notations "kyobate", "kyoshu", and "kyoowari" associated with the recognition vocabulary 京終. Alternatively, it can be realized by setting a flag on those appearance chain probabilities of the recognition vocabulary that are based on phoneme notations included in the "other candidates" of the correspondence table.

【００８５】That is, when the voice recognition unit 48 performs the matching process and matches against a word whose phoneme notation or appearance chain probability in the voice recognition dictionary carries the flag, it only needs to attach information indicating this to the calculated score.
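
The flagging scheme can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Notation` class, the `build_dictionary` and `is_flagged` helpers, the layout of the correspondence table, and the romanized readings are hypothetical stand-ins, since the specification does not define a concrete data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Notation:
    reading: str           # phoneme notation (romanized here for readability)
    other_candidate: bool  # True if it came from the "other candidates" column


def build_dictionary(correspondence_table):
    """Build word -> notations, flagging every 'other candidate' reading."""
    dictionary = {}
    for word, row in correspondence_table.items():
        notations = [Notation(row["registered"], False)]
        notations += [Notation(r, True) for r in row["other_candidates"]]
        dictionary[word] = notations
    return dictionary


def is_flagged(dictionary, word, matched_reading):
    """Return True if the reading that matched carries the flag."""
    for notation in dictionary.get(word, []):
        if notation.reading == matched_reading:
            return notation.other_candidate
    return False


# Hypothetical correspondence table row for the example word.
table = {
    "京終": {"registered": "kyobate", "other_candidates": ["kyoshu", "kyoowari"]},
}
dictionary = build_dictionary(table)
```

With this layout, the information attached to the score is simply the flag of whichever notation produced the match.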

【００８６】In step S13, the reading presentation unit 50 obtains the phoneme notation "kyobate" of the second analysis dictionary vocabulary that corresponds to the phoneme notation "kyoshu" included in the "other candidates" of the correspondence table, and returns it to the voice recognition unit 48. Specifically, the phoneme notation "kyobate" is obtained by referring to the contents of the voice recognition dictionary storage unit 47 and finding, among the phoneme notations associated with the recognition result 京終, the one on which the flag is not set.

【００８７】In step S14, the voice recognition unit 48 outputs the high-scoring word calculated in step S11 as the recognition result. If the reading presentation unit 50 has returned the phoneme notation of the second analysis dictionary vocabulary in the correspondence table, that phoneme notation is output and displayed as well. By outputting the recognition result 京終 together with its original reading "kyobate" in this way, the user can be taught that the reading "kyoshu" used for the voice-input vocabulary 京終 is wrong and that the true reading is "kyobate".
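
Steps S12 to S14 can be summarized in a short Python sketch. It assumes the recognizer has already produced scored matches as (word, matched reading, score) triples and that the dictionary stores (reading, flag) pairs per word; the function name `recognize_with_reading` and both data layouts are hypothetical, chosen only for illustration.

```python
def recognize_with_reading(scored_matches, dictionary):
    """Return (recognized word, correct reading or None).

    scored_matches: list of (word, matched_reading, score) triples.
    dictionary: word -> list of (reading, is_other_candidate) pairs.
    """
    # Pick the best-scoring match produced in step S11.
    word, reading, _score = max(scored_matches, key=lambda m: m[2])
    notations = dictionary.get(word, [])

    # S12: was the matched reading one of the flagged "other candidates"?
    misread = any(r == reading and flag for r, flag in notations)

    # S13: if so, look up the unflagged (registered) reading.
    correct = None
    if misread:
        correct = next((r for r, flag in notations if not flag), None)

    # S14: output the word, plus the correct reading when one was found.
    return word, correct


dictionary = {
    "京終": [("kyobate", False), ("kyoshu", True), ("kyoowari", True)],
    "京都": [("kyoto", False)],
}
result_wrong = recognize_with_reading(
    [("京終", "kyoshu", -1.0), ("京都", "kyoto", -7.5)], dictionary
)
result_right = recognize_with_reading([("京終", "kyobate", -1.0)], dictionary)
```

When the user already utters the registered reading, the second element is `None` and only the recognition result would be displayed.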

【００８８】When the utterance "kyoshu" is input to the voice recognition unit 48 and there are multiple candidates for the recognition result, such as 京終, 郷愁 (nostalgia), and 教習 (training), the voice recognition unit 48 first displays the candidates 京終, 郷愁, and 教習 and has the user select one of them. If the recognition candidate 京終 is selected, the processing by the reading determination unit 49 and the reading presentation unit 50 described above is then performed.
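
The multiple-candidate case can be sketched in Python as follows. The helper names `candidates_for` and `resolve`, the (reading, flag) dictionary layout, and the `choose` callback that stands in for the user's on-screen selection are all assumptions made for illustration.

```python
def candidates_for(dictionary, reading):
    """All registered words that have `reading` among their notations."""
    return [
        word
        for word, notations in dictionary.items()
        if any(r == reading for r, _flag in notations)
    ]


def resolve(dictionary, reading, choose):
    """Let the user pick among homophones, then check the chosen word's reading."""
    candidates = candidates_for(dictionary, reading)
    if not candidates:
        return None, None
    word = candidates[0] if len(candidates) == 1 else choose(candidates)
    notations = dictionary[word]
    misread = any(r == reading and flag for r, flag in notations)
    correct = next((r for r, flag in notations if not flag), None) if misread else None
    return word, correct


dictionary = {
    "京終": [("kyobate", False), ("kyoshu", True), ("kyoowari", True)],
    "郷愁": [("kyoshu", False)],
    "教習": [("kyoshu", False)],
}
shown = candidates_for(dictionary, "kyoshu")
picked_place = resolve(dictionary, "kyoshu", choose=lambda c: c[0])
picked_other = resolve(dictionary, "kyoshu", choose=lambda c: c[1])
```

Only the candidate whose matched reading is flagged triggers the presentation of a correct reading; the ordinary homophones are returned unchanged.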

【００８９】As described above, in the present embodiment, the reading determination unit 49 and the reading presentation unit 50 are provided in addition to the voice recognition unit 48. The reading determination unit 49 determines whether the voice recognition result contains a vocabulary item that has the same notation as a vocabulary item stored in the second analysis dictionary storage unit 45 but a different reading. When such a vocabulary item is determined to be contained, the reading presentation unit 50 has the voice recognition unit 48 present, together with the recognition result, the phoneme notation in the second analysis dictionary of the vocabulary item whose reading was determined to differ.

【００９０】Therefore, when a vocabulary item registered in the second analysis dictionary storage unit 45 is uttered with a reading different from its correct reading and the utterance is correctly recognized by the voice recognition unit 48, the correct reading can be output and displayed together with the recognition result to teach it to the user.

【００９１】In the present embodiment, the reading presentation unit 50 obtains the phoneme notation "kyobate" of the vocabulary item in the second analysis dictionary and returns it to the voice recognition unit 48, which outputs and displays it together with the voice recognition result. However, the present invention is not limited to this. For example, the reading presentation unit 50 may be provided with voice synthesis means so that the reading is output as synthesized voice in synchronization with the output display of the voice recognition result by the voice recognition unit 48.

【００９２】In the present embodiment, the voice recognition device is equipped with the voice recognition dictionary creation device 46. However, the present invention is not limited to this. The voice recognition device may be provided independently of the voice recognition dictionary creation device 46, and a voice recognition dictionary created by the voice recognition dictionary creation device of the first or second embodiment may be stored in the voice recognition dictionary storage unit 47.

【００９３】In the present embodiment, the determination by the reading determination unit 49 and the acquisition of the correct reading by the reading presentation unit 50 are performed by setting a flag on those phoneme notations of each recognition vocabulary item in the voice recognition dictionary that are included in the "other candidates" of the correspondence table, and by referring to this flag. However, they may instead be performed by referring to the correspondence table directly. In that case, the correspondence table must be shared by the voice recognition dictionary creation device 46, the reading determination unit 49, and the reading presentation unit 50, so the voice recognition device must be equipped with the voice recognition dictionary creation device 46.

【００９４】The third and fourth embodiments have been described using the case where the voice recognition dictionary creation device of the first embodiment is installed as an example, but the voice recognition dictionary creation device of the second embodiment may be installed instead.

【００９５】The voice recognition devices of the third and fourth embodiments are particularly effective when mounted on a portable terminal device. A portable terminal device is usually used on the move. In particular, when searching for and displaying a map by voice input on the portable terminal device while away from home, a person who believes that the place name 京終 (kyobate) is read "kyoshu" will utter the incorrect reading "kyoshu". With the present portable terminal device, the utterance is not rejected even in that case, and the map of the intended place name 京終 is displayed.

【００９６】In contrast, with a portable terminal device equipped with a map search device based on a conventional voice recognition device, uttering the place name 京終 with the incorrect reading "kyoshu" results in rejection. Moreover, there is no way to look up the correct reading while away from home, so the map of 京終 cannot be displayed.

【００９７】The portable terminal device may also be configured as a first portable terminal device equipped with the voice recognition dictionary creation device of the first or second embodiment and a second portable terminal device equipped with the voice recognition dictionary storage unit, voice recognition unit, reading determination unit, and reading presentation unit of the third or fourth embodiment, with both portable terminal devices provided with a transceiver for transmitting and receiving information, including voice recognition dictionary information, between them. In this way, the voice recognition dictionary information created by the voice recognition dictionary creation device of the first portable terminal device can be transmitted to the second portable terminal device and stored in the voice recognition dictionary storage unit of the second portable terminal device.

【００９８】It is also possible to use the voice recognition dictionary creation device of the second embodiment as the device that creates the voice recognition dictionary of the voice recognition device, and to provide that voice recognition dictionary creation device in a server. The portable terminal device is then provided with a second voice recognition dictionary storage unit, a voice recognition unit (with a reading determination unit and a reading presentation unit), and a transceiver for exchanging voice recognition dictionary information with the server. By configuring a voice recognition system from the server and the portable terminal device in this way, the portable terminal device can have a simple configuration and be made lighter. Furthermore, by using the server as the third dictionary storage means, the contents of the second analysis dictionary storage unit in the server can be supplemented periodically, and a voice recognition dictionary that can handle the steadily increasing new words and loanwords, periodically updated television program names, and the like can be obtained through the transceiver.
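
The exchange of voice recognition dictionary information between the server and the portable terminal device could look like the following Python sketch. The specification does not define a transfer format, so the use of UTF-8 encoded JSON, the field names, and the `encode_dictionary` and `decode_dictionary` helpers are assumptions made purely for illustration.

```python
import json


def encode_dictionary(dictionary):
    """Server side: serialize word -> [(reading, flag), ...] to UTF-8 JSON bytes."""
    payload = {
        word: [{"reading": r, "other_candidate": flag} for r, flag in notations]
        for word, notations in dictionary.items()
    }
    # ensure_ascii=False keeps kanji readable in the transmitted text.
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


def decode_dictionary(data):
    """Terminal side: rebuild the dictionary from the received bytes."""
    payload = json.loads(data.decode("utf-8"))
    return {
        word: [(e["reading"], e["other_candidate"]) for e in entries]
        for word, entries in payload.items()
    }


server_dictionary = {
    "京終": [("kyobate", False), ("kyoshu", True), ("kyoowari", True)],
}
message = encode_dictionary(server_dictionary)
terminal_dictionary = decode_dictionary(message)
```

Sending the flags along with the readings lets the terminal perform the reading determination without access to the correspondence table held on the server.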

【００９９】The functions of the text analysis units 1, 11, 21, 41, the reading adding units 2, 12, 22, 42, the voice recognition dictionary creation units 3, 13, 23, 43, the voice recognition dictionary storage units 4, 14, 27, 47, the first analysis dictionary storage units 5, 15, 24, 44, and the second analysis dictionary storage units 6, 16, 25, 45 in the above embodiments are realized by a voice recognition dictionary creation program recorded on a program recording medium. The program recording medium in each of the above embodiments is a program medium consisting of a ROM (not shown). Alternatively, it may be a program medium that is loaded into and read by an external auxiliary storage device. In either case, the program reading means that reads the voice recognition dictionary creation program from the program medium may be configured to access the program medium directly and read it, or to download the program to a program storage area provided in a RAM (random access memory) (not shown) and read it by accessing that program storage area. The download program for downloading from the program medium to the program storage area of the RAM is assumed to be stored in the main device in advance.

【０１００】Here, the program medium is a medium that is separable from the main body and carries the program in a fixed manner. It includes tape media such as magnetic tapes and cassette tapes; disk media such as magnetic disks (floppy disks, hard disks) and optical disks (CD-ROM, MO (magneto-optical) disk, MD (MiniDisc), DVD (digital versatile disk)); card media such as IC (integrated circuit) cards and optical cards; and semiconductor memory media such as mask ROM, EPROM (ultraviolet-erasable ROM), EEPROM (electrically erasable ROM), and flash ROM.

【０１０１】When the voice recognition dictionary creation device of any of the above embodiments has a modem and is configured to be connectable to a communication network including the Internet, the program medium may be a medium that carries the program in a fluid manner, for example by downloading it from the communication network. In that case, the download program for downloading from the communication network is assumed to be stored in the main device in advance, or to be installed from another recording medium.

【０１０２】What is recorded on the recording medium is not limited to programs; data can also be recorded.

【０１０３】[0103]

[Effects of the Invention] As is apparent from the above, the voice recognition dictionary creation device of the first invention comprises, as dictionaries for text analysis, first analysis dictionary storage means storing a first analysis dictionary consisting of information including the notation and reading of vocabulary, and second analysis dictionary storage means storing a second analysis dictionary consisting of information including the notation and reading of vocabulary not stored in the first analysis dictionary storage means. When the text analysis result contains a vocabulary item obtained by referring to the second analysis dictionary, the reading adding means assigns to that vocabulary item not only the reading obtained by referring to the second analysis dictionary but also other reading candidates. Consequently, for that vocabulary item, the voice recognition dictionary created from the analysis result and the reading assignment result registers dictionary information based on the phoneme notation of the reading obtained from the second analysis dictionary and the phoneme notations of the other reading candidates. Therefore, when voice recognition is performed using this voice recognition dictionary, even if the hard-to-read word 京終 (kyobate) registered in the second analysis dictionary is mistakenly uttered as "kyoshu", for example, the utterance is not rejected and the intended vocabulary item 京終 is obtained as the recognition result.
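
How a correspondence table row with other reading candidates might be produced can be sketched in Python. This passage does not state how the other candidates are derived, so the sketch assumes, purely for illustration, that they are formed by combining per-character readings of the kind an everyday dictionary would hold. The function name `other_reading_candidates`, the `char_readings` table, and the romanized readings are hypothetical.

```python
from itertools import product


def other_reading_candidates(word, registered_reading, char_readings):
    """Combine per-character readings into candidate readings for `word`.

    The registered (correct) reading is excluded, and duplicates are dropped
    while keeping the order in which candidates are generated.
    """
    per_char = [char_readings.get(ch, []) for ch in word]
    if any(not readings for readings in per_char):
        return []  # some character has no known reading; give up
    candidates = []
    for parts in product(*per_char):
        reading = "".join(parts)
        if reading != registered_reading and reading not in candidates:
            candidates.append(reading)
    return candidates


# Assumed everyday readings of the individual characters.
char_readings = {"京": ["kyo"], "終": ["shu", "owari"]}
row = {
    "registered": "kyobate",
    "other_candidates": other_reading_candidates("京終", "kyobate", char_readings),
}
```

Every reading in the resulting row would then be registered for the same recognition vocabulary item, which is what prevents the mistaken utterance from being rejected.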

【０１０４】That is, according to this invention, a voice recognition dictionary can be created that allows input voice to be recognized even when the user does not know the correct reading of the recognition target word.

【０１０５】In the voice recognition dictionary creation device of one embodiment, dictionary acquisition means acquires the contents of the second analysis dictionary to be stored in the second analysis dictionary storage means from third dictionary storage means. Therefore, when a vocabulary information provider provides third dictionary storage means in which new vocabulary is registered, newly appearing vocabulary can always be additionally registered in the second analysis dictionary storage means. This makes it possible to create a voice recognition dictionary that allows input voice to be recognized even when the user does not know the recognition target vocabulary registered in the second analysis dictionary.

【０１０６】That is, according to this invention, a voice recognition dictionary can be created that allows input voice to be recognized even when the user does not know the registered recognition target words.

【０１０７】The voice recognition device of the second invention recognizes input voice by matching against the registered vocabulary of the voice recognition dictionary created by the voice recognition dictionary creation device of the first invention. Therefore, even if the hard-to-read word 京終 (kyobate) registered in the second analysis dictionary of the voice recognition dictionary creation device is mistakenly uttered as "kyoshu", for example, the utterance is not rejected and the intended vocabulary item 京終 is obtained as the recognition result.

【０１０８】The voice recognition device of the third invention is equipped with the voice recognition dictionary creation device of the first invention and recognizes input voice by matching against the registered vocabulary of the voice recognition dictionary created by that device. Therefore, it can create a voice recognition dictionary that allows input voice to be recognized even when the user does not know the correct reading of the recognition target word.

【０１０９】In the voice recognition device of one embodiment, reading determination means determines whether the voice recognition result contains a vocabulary item that has the same notation as a vocabulary item stored in the second analysis dictionary storage means but a different reading, and when it does, reading presentation means presents the reading stored in the second analysis dictionary storage means for that vocabulary item. Therefore, for example, a user who mistakenly uttered the hard-to-read word 京終 (kyobate) registered in the second analysis dictionary as "kyoshu" and obtained the correct recognition result 京終 can be shown and taught the correct reading "kyobate".

【０１１０】In the voice recognition device of one embodiment, the reading presentation means presents the reading stored in the second analysis dictionary storage means by synthesized voice. Therefore, the correct reading of the recognition vocabulary can be presented to the user by synthesized voice, and the displayed content of the voice recognition result can be simplified.

【０１１１】The portable terminal device of the fourth invention is equipped with the voice recognition device of the second or third invention. Therefore, necessary information can be searched for immediately and easily by voice, even away from home where there is no way to look up the correct reading.

【０１１２】The portable terminal device of the fifth invention is equipped with either the voice recognition dictionary creation device of the first invention or the voice recognition device of the second invention. Therefore, the information of the created voice recognition dictionary can be transmitted from a first portable terminal device equipped with the voice recognition dictionary creation device to a second portable terminal device equipped with the voice recognition device. Consequently, the second portable terminal device can have a simpler configuration and be lighter than a portable terminal device equipped with the voice recognition device of the third invention.

【０１１３】The voice recognition system of the sixth invention comprises a server provided with the voice recognition dictionary creation device of the first invention, and a portable terminal device equipped with the voice recognition device of the second invention and having transmitting/receiving means for exchanging voice recognition dictionary information with the server. Therefore, the portable terminal device can have a simpler configuration and be lighter than a portable terminal device equipped with the voice recognition device of the third invention.

【０１１４】Furthermore, by using the server as the third dictionary storage means and periodically supplementing the contents of the second analysis dictionary storage means from the server, the user of the portable terminal device can have steadily increasing new words and loanwords, periodically updated television program names, and the like recognized by voice without knowing the contents of the second analysis dictionary.

【０１１５】In the voice recognition dictionary creation method of the seventh invention, text analysis is performed by referring to a first analysis dictionary, which is stored in first analysis dictionary storage means and consists of information including the notation and reading of vocabulary, and a second analysis dictionary, which is stored in second analysis dictionary storage means and consists of information including the notation and reading of vocabulary not stored in the first analysis dictionary storage means. When readings are assigned to the segmented constituent words and the text analysis result contains a vocabulary item registered in the second analysis dictionary, that vocabulary item is given not only the reading obtained by referring to the second analysis dictionary but also other reading candidates. Consequently, for that vocabulary item, the created voice recognition dictionary registers dictionary information based on the phoneme notation of the reading obtained from the second analysis dictionary and the phoneme notations of the other reading candidates. Therefore, when voice recognition is performed using this voice recognition dictionary, even if the hard-to-read word 京終 (kyobate) registered in the second analysis dictionary is mistakenly uttered as "kyoshu", for example, the utterance is not rejected and the intended vocabulary item 京終 is obtained as the recognition result.

【０１１６】That is, according to this invention, a voice recognition dictionary can be created that allows input voice to be recognized even when the user does not know the correct reading of the recognition target word.

【０１１７】The voice recognition dictionary creation program of the eighth invention causes a computer to function as the text analysis means, reading adding means, voice recognition dictionary creation means, voice recognition dictionary storage means, first analysis dictionary storage means, and second analysis dictionary storage means of the first invention. Therefore, as with the first invention, a voice recognition dictionary can be created with which, even if the vocabulary item 京終 (kyobate) registered in the second analysis dictionary is mistakenly uttered as "kyoshu", the utterance is not rejected and the intended vocabulary item 京終 is obtained as the recognition result.

【０１１８】The program recording medium of the ninth invention records the voice recognition dictionary creation program of the eighth invention. Therefore, by reading out and using this program on a computer, a voice recognition dictionary can be created, as with the first invention, with which, even if the vocabulary item 京終 (kyobate) registered in the second analysis dictionary is mistakenly uttered as "kyoshu", the utterance is not rejected and the intended vocabulary item 京終 is obtained as the recognition result.

[Brief description of drawings]

【図１】この発明の音声認識辞書作成装置における構
成を示すブロック図である。FIG. 1 is a block diagram showing a configuration of a voice recognition dictionary creating device of the present invention.

【図２】図１に示す音声認識辞書作成装置によって行
われる音声認識辞書作成処理動作のフローチャートであ
る。FIG. 2 is a flowchart of a voice recognition dictionary creation processing operation performed by the voice recognition dictionary creation device shown in FIG.

【図３】図１における読み付与部によって記録される
対応テーブルの内容の一例を示す図である。FIG. 3 is a diagram showing an example of contents of a correspondence table recorded by a reading imparting unit in FIG.

FIG. 4 is a block diagram showing the configuration of a voice recognition dictionary creating device different from that of FIG. 1.

FIG. 5 is a block diagram showing the configuration of the voice recognition device of the present invention.

FIG. 6 is a block diagram showing the configuration of a voice recognition dictionary creating device different from that of FIG. 5.

FIG. 7 is a flowchart of the voice recognition processing operation executed by the voice recognition unit, the reading determination unit, and the reading presentation unit in FIG. 6.

[Explanation of symbols]

1, 11, 21, 41 ... Text analysis unit
2, 12, 22, 42 ... Reading adding unit
3, 13, 23, 43 ... Voice recognition dictionary creating unit
4, 14, 27, 47 ... Voice recognition dictionary storage unit
5, 15, 24, 44 ... First analysis dictionary storage unit
6, 16, 25, 45 ... Second analysis dictionary storage unit
17 ... Dictionary acquisition unit
26, 46 ... Voice recognition dictionary creating device
28 ... Acoustic analysis unit
29 ... Likelihood calculation unit
30 ... Collation processing unit
31, 48 ... Voice recognition unit
49 ... Reading determination unit
50 ... Reading presentation unit

Continuation of the front page: (51) Int. Cl.⁷, identification code, FI, theme code (reference): G10L 3/00 561D R

Claims

[Claims]

1. A voice recognition dictionary creating device in which an input text is analyzed by text analysis means, readings are added to the analyzed constituent words by reading adding means, a voice recognition dictionary is created by voice recognition dictionary creating means based on the analysis result and the reading addition result, and the created voice recognition dictionary is stored in voice recognition dictionary storage means, the device comprising: first analysis dictionary storage means for storing a first analysis dictionary that is referred to when the text is analyzed by the text analysis means and that is composed of information including the notation and reading of vocabulary; and second analysis dictionary storage means for storing a second analysis dictionary that is referred to when the text is analyzed by the text analysis means and that is composed of information including the notation and reading of vocabulary not stored in the first analysis dictionary storage means, wherein, when the text analysis result of the text analysis means contains a vocabulary item obtained by referring to the second analysis dictionary, the reading adding means adds to that vocabulary item, in addition to the reading obtained by referring to the second analysis dictionary, other reading candidates.

2. The voice recognition dictionary creating device according to claim 1, further comprising dictionary acquisition means for acquiring, from a third dictionary storage means, the contents of the second analysis dictionary to be stored in the second analysis dictionary storage means.

3. A voice recognition device that recognizes an input voice by collating it, by collating means, with the vocabulary registered in a voice recognition dictionary, wherein the voice recognition dictionary is a voice recognition dictionary created by the voice recognition dictionary creating device according to claim 1 or 2.

4. A voice recognition device in which the voice recognition dictionary creating device according to claim 1 or 2 is mounted, and in which an input voice is collated, by collating means, with the vocabulary registered in the voice recognition dictionary storage means of the voice recognition dictionary creating device.

5. The voice recognition device according to claim 3 or 4, comprising: reading determination means for determining whether the voice recognition result includes a vocabulary item that has the same notation as a vocabulary item stored in the second analysis dictionary storage means but a different reading; and reading presenting means for presenting, when the reading determination means determines that such a vocabulary item is included, the reading of that vocabulary item stored in the second analysis dictionary storage means.

6. The voice recognition device according to claim 5, wherein the reading presenting means presents the reading stored in the second analysis dictionary storage means by synthetic speech.

7. A mobile terminal equipped with the voice recognition device according to claim 3.

8. A mobile terminal device in which the voice recognition dictionary creating device according to claim 1 or 2 and the voice recognition device according to any one of claims 3, 5, and 6 are mounted.

9. A voice recognition system comprising: a server provided with the voice recognition dictionary creating device according to claim 1 or 2; and a mobile terminal device in which the voice recognition device according to any one of claims 3, 5, and 6 is mounted and which has transmitting/receiving means for transmitting and receiving voice recognition dictionary information to and from the server.

10. A voice recognition dictionary creating method that uses text analysis means, reading adding means, voice recognition dictionary creating means, and voice recognition dictionary storage means, the method having a text analysis step of analyzing character string information and dividing it into constituent words, a reading adding step of adding readings to the divided constituent words, and a voice recognition dictionary creating step of creating a voice recognition dictionary based on the results of the text analysis and the reading addition and storing it in the voice recognition dictionary storage means, wherein the text analysis by the text analysis means is performed by referring to a first analysis dictionary that is stored in first analysis dictionary storage means and is composed of information including the notation and reading of vocabulary, and to a second analysis dictionary that is stored in second analysis dictionary storage means and is composed of information including the notation and reading of vocabulary not stored in the first analysis dictionary storage means, and wherein, in the reading addition by the reading adding means, when the text analysis result contains a vocabulary item obtained by referring to the second analysis dictionary, other reading candidates are added to that vocabulary item in addition to the reading obtained by referring to the second analysis dictionary.

11. A voice recognition dictionary creating program that causes a computer to function as the text analysis means, reading adding means, voice recognition dictionary creating means, voice recognition dictionary storage means, first analysis dictionary storage means, and second analysis dictionary storage means according to claim 1.

12. A computer-readable program recording medium on which the voice recognition dictionary creating program according to claim 11 is recorded.