JP3935655B2

JP3935655B2 - Speech recognition device, morphological analysis device, kana-kanji conversion device, method thereof, and recording medium recording the program

Info

Publication number: JP3935655B2
Application number: JP2000051475A
Authority: JP
Inventors: 啓恭伍井; 裕三丸田; 芳春阿部
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2000-02-28
Filing date: 2000-02-28
Publication date: 2007-06-27
Anticipated expiration: 2020-02-28
Also published as: JP2001242886A

Abstract

PROBLEM TO BE SOLVED: To realize a speech recognition device that can apply a strong constraint while keeping the n-gram order small, and that can apply a stronger constraint than before when the order is the same. SOLUTION: A phoneme probability computing means 2 computes the phoneme occurrence probability for each phoneme of the input speech and generates phoneme string candidates. A word probability computing means 9 computes the word occurrence probability of each word candidate corresponding to a phoneme string candidate by referring to phoneme n-grams 7 and 8, which store the phoneme strings of the target language, the word notation strings corresponding to those phoneme strings, and their occurrence probabilities, classified by topic. An output means 6 outputs the word string candidates that are similar to the input speech, computed from the phoneme occurrence probabilities and the word occurrence probabilities.

Description

【０００１】
【発明の属する技術分野】
本発明は、自然言語の統計量を用い、対象言語の文字、あるいは単語の連接生起確率であるｎグラムに基づいて、音声認識、または形態素解析、または仮名漢字変換を行う音声認識装置、形態素解析装置、仮名漢字変換装置、およびそれらのための音声認識方法、形態素解析方法、仮名漢字変換方法、ならびにそれらのプログラムを記録した記録媒体に関し、特に、ｎグラムの統計量を話題別に扱うことによる解析精度の向上に関するものである。
【０００２】
【従来の技術】
自然言語の統計量を用いた解析技術は多くの文書処理に応用されている。例えば、音声認識による日本語の入力は文書入力の手段として有用であり、より認識精度の向上が望まれる。音声を精度よく認識するために、言語モデルとして自然言語の統計量を用い、対象言語の文字、または単語の連接生起確率であるｎグラムを用いる方式が注目されている。しかし、ｎグラムでの制約は次数ｎに影響されるため、ｎが小さくなると制約が弱くなってしまう。逆にｎグラムの次数ｎを増加させると、頻度を計数する表が巨大になってしまうという深刻な問題があるとともに、信頼性のある統計量を確保するためには非常に膨大な例文集が必要になるといった課題があった。なお、音声認識における、このようなｎグラムの表の増加を解決するための圧縮方式としては、例えば特表平１０−５０１０７８号公報に示すようなものが提案されている。
【０００３】
以下、自然言語の統計量を用いた従来の解析技術について説明する。図２６は発話された「ｓａＮｋａｉｎｏｓｅＮｓｅｅ」より認識結果「３階の先生」を得るための、従来の解析方式が適用された音声認識装置の構成例を示すブロック図である。図において、１はマイク、２は音韻確率算出手段、３は単語予測手段、４はｎグラム表（この場合には３グラム表）、５は情報を記憶するＲＡＭ、６は出力手段である。
【０００４】
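The generation of word string candidates is described below.
A word string candidate is obtained by finding the word string W that maximizes the probability P(W|Y), where W is the uttered word string and Y is the phoneme string. This probability P(W|Y) is given by equation (1).
【０００５】
【数１】
$$P(W \mid Y) = \frac{P(Y \mid W)\,P(W)}{P(Y)} \qquad (1)$$
【０００６】
As noted above, generating word string candidates only requires finding the word string W that maximizes P(W|Y). Because the term P(Y) on the right-hand side of equation (1) is common to every word string W, it can be dropped, and it suffices to find the W that maximizes P(Y|W)P(W). Here P(Y|W) is the probability that the phoneme string Y appears given the word string W, and P(W) is the probability that the word string W appears.
【０００７】
At times t = 1, 2, …, L, when the phoneme string Y corresponding to the word string W is given by equation (2), the probability P(Y|W) of the phoneme string Y can be calculated by equation (3) from the phoneme probabilities P(Y_i), which are the occurrence probabilities of the individual phonemes Y_i in equation (2).
【０００８】
【数２】
$$Y = Y_1, Y_2, \ldots, Y_L \qquad (2)$$
$$P(Y \mid W) = \prod_{i=1}^{L} P(Y_i) \qquad (3)$$
【０００９】
When a word string W consisting of m words is given by equation (4), its probability P(W) is approximated, independently of the phoneme probabilities of the phonemes Y_i in equation (2), by equation (5) using word 3-gram probabilities. In equation (5), when i is 1 or 2, the symbol (#) is substituted for any w_{i-1} or w_{i-2} that would precede the first word.
【００１０】
【数３】
$$W = w_1, w_2, \ldots, w_m \qquad (4)$$
$$P(W) \approx \prod_{i=1}^{m} P(w_i \mid w_{i-2}, w_{i-1}) \qquad (5)$$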
【００１１】
上述した計算により音韻列候補のうち３グラムインデックスに単語の列が存在するものについて、単語列確率Ｐ（Ｗ｜Ｙ）を最大にする単語列Ｗを算出する。それぞれの単語の出現確率は、図２６に示した単語の３グラム表４に予め記憶してある頻度値をもとに算出する。
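The conventional 3-gram computation described here can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names, the toy corpus, and the use of plain relative frequencies without smoothing are all hypothetical and are not taken from the patent.

```python
from collections import Counter

BOS = "#"  # sentence-start padding symbol, as in the description


def train_counts(sentences):
    """Count word 3-grams and their 2-word histories over tokenized sentences."""
    tri, hist = Counter(), Counter()
    for words in sentences:
        padded = [BOS, BOS] + list(words)
        for i in range(2, len(padded)):
            history = (padded[i - 2], padded[i - 1])
            tri[history + (padded[i],)] += 1
            hist[history] += 1
    return tri, hist


def trigram_prob(tri, hist, w2, w1, w):
    """Relative-frequency estimate of P(w | w2, w1); 0.0 for an unseen history."""
    denom = hist[(w2, w1)]
    return tri[(w2, w1, w)] / denom if denom else 0.0


def sentence_prob(tri, hist, words):
    """Product of 3-gram probabilities, i.e. the 3-gram approximation of P(W)."""
    p = 1.0
    padded = [BOS, BOS] + list(words)
    for i in range(2, len(padded)):
        p *= trigram_prob(tri, hist, padded[i - 2], padded[i - 1], padded[i])
    return p


# Hypothetical two-sentence corpus, used only to exercise the functions.
corpus = [["3階", "の", "先生"], ["3階", "の", "部屋"]]
tri, hist = train_counts(corpus)
print(sentence_prob(tri, hist, ["3階", "の", "先生"]))
```

The table of counts grows with the number of distinct 3-grams, which is the size problem the description attributes to raising the n-gram order.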
【００１２】
算出した単語列Ｗを認識結果として出力手段６より出力する。
【００１３】
次に動作について説明する。
ここで、図２７は上記従来の音声認識装置における音声認識の概略動作の流れを示すフローチャートである。この音声認識の処理は、ステップＳＴ１においてマイク１に対して発話することによって開始される。マイク１はステップＳＴ２においてこの発話された音声が入力されると、ステップＳＴ３でその入力音声を電気信号に変換する。次にステップＳＴ４において、音韻確率算出手段２はこのマイク１からの電気信号をＡ／Ｄ変換し、量子化した後、スペクトル分析を行って、音節単位に分離した認識結果を連接し、音韻列候補としてＲＡＭ５にこれを記憶する。
【００１４】
その後、単語予測手段３はステップＳＴ５で、ＲＡＭ５からその音韻列候補を１つ取り出し、先頭単語列の初期化をする。次にステップＳＴ６において、検索キーとして、対応する３グラム情報を３グラム表４より検索し、ステップＳＴ７にて、検索された３グラム情報をもとに単語３連鎖の確率値を計算する。このようにして求めた確率値に基づいて、対応する音韻列候補に対して最も確率の高い単語列Ｗを、ステップＳＴ８でＲＡＭ５に記憶する。
【００１５】
次に、ステップＳＴ９において、このＲＡＭ５に記憶されたすべての音韻列候補に対して上述の計算を行い、最も確率の高い単語列Ｗと音韻列候補を選択してそれを出力手段６から出力し、ステップＳＴ１０に進んでこの一連の音声認識処理を終了する。
【００１６】
このように、発話に対して類似する確率の高い単語列Ｗが求められる。
【００１７】
なお、従来の音声認識装置に関連する記載のある文献としては、上記特表平１０−５０１０７８号公報以外にも、音声認識時に認識結果より話題が得られた場合に、次の認識にその話題を用いる特開昭６２−１９８９９号公報、辞書検索で抽出した話題を用いて辞書を選択し、検索精度を向上させる特開平６３−２１９０６７号公報、構文的な制約を用いることでそれまでのｎグラムモデルよりも制約を強める特開平６−３４２２９８号公報などがある。
【００１８】
【発明が解決しようとする課題】
従来の音声認識装置は以上のように構成されているので、ｎグラムの次数を大きくとれば言語制約は強くなるが、ｎグラム表４が巨大化するという課題があるうえ、実用的に巨大なｎグラム表４をうめるだけの統計量をとるための例文が必要となり、また、ｎグラムの次数を小さくすると言語制約が弱まり、解析精度の低下をまねくといった課題があった。すなわち、「３号アーチの先制」という句をこの音声認識装置に入力した時、ｎグラムは、例えば大量の新聞データから統計量を抽出し、ｎの次数を２として、簡単化のため、形態素の区切りは「さんごう・あーち・の・せんせい」とした場合、「３号アーチの」までは正しく解析されると仮定しても、「の」の次の「せんせい」は新聞全体の統計量を用いてしまうと「先生」のほうが高くなってしまうため、「３号アーチの先生」といった認識誤りを起こしてしまう可能性が高くなり、ｎの次数を大きくすれば正解が得られる可能性は高くなるが、前述のｎグラム表４が巨大化し、必要な例文集も巨大化するなどの課題があった。
【００１９】
この発明は上記のような課題を解決するためになされたもので、ｎグラムの次数を小さくしたまま強い制約をかけることができ、また同じ次数であればより強い制約をかけることができる、音声認識、または形態素解析、または仮名漢字変換を行う装置、およびそれらのための方法、ならびにそれらのプログラムを記録した記録媒体を得ることを目的とする。
【００２０】
【課題を解決するための手段】
この発明に係る音声認識装置は、対象言語の音韻列と、音韻列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類されている音韻ｎグラムと、前記対象言語の音声を入力する入力手段と、前記入力手段が出力する音声信号を音韻に変換し、各音韻に対応する音韻生起確率を計算して、音韻列候補を出力する音韻確率算出手段と、それぞれの話題に対応して分類されている前記音韻ｎグラムを参照して、先行単語列候補の音韻列情報により検索し、検索した先行単語以降の音韻列候補の部分列に、前方一致する後方単語があるか否かをチェックし、前記音韻確率算出手段が出力する音韻列候補に対応する各単語候補の単語生起確率を算出する単語確率算出手段と、前記音韻確率算出手段にて計算された音韻生起確率と、前記単語確率算出手段にて計算された単語生起確率とを用いて算出した、前記入力手段より入力された音声に類似する単語列候補を出力する出力手段とを備え、上記単語確率算出手段が、後方単語のそれぞれについて話題別に確率の重みを設定するものである。
【００２３】
この発明に係る形態素解析装置は、仮名漢字混じり文字列と、仮名漢字混じり文字列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類されている漢字ｎグラムと、前記仮名漢字混じり文字列を入力する入力手段と、それぞれの話題に対応して分類されている前記漢字ｎグラムを参照して、先行単語列候補の漢字列情報により検索し、検索した先行単語以降の漢字列候補の部分列に、前方一致する後方単語があるか否かのチェックし、前記入力手段が出力する仮名漢字混じり文字列に対応する各単語候補の単語生起確率を算出する形態素確率算出手段と、前記形態素確率算出手段にて計算された単語生起確率を用いて算出した、前記入力手段より入力された文字列に適合する単語列候補を出力する出力手段とを備え、上記形態素確率算出手段が、後方単語のそれぞれについて話題別に確率の重みを設定するものである。
【００２６】
この発明に係る仮名漢字変換装置は、仮名文字列と、仮名文字列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類されている仮名ｎグラムと、前記仮名文字列を入力する入力手段と、それぞれの話題に対応して分類されている前記仮名ｎグラムを参照して、検索した先行単語以降の仮名列候補の部分列に、前方一致する後方単語があるか否かのチェックし、前記入力手段が出力する仮名文字列に対応する各単語候補の単語生起確率を算出する漢字確率算出手段と、前記漢字確率算出手段にて計算された単語生起確率を用いて算出された、前記入力手段より入力された仮名文字列に適合する単語列候補を出力する出力手段とを備え、上記漢字確率算出手段が、後方単語のそれぞれについて話題別に確率の重みを設定するものである。
【００２９】
この発明に係る音声認識方法は、入力される音声の取り込みを行うステップと、取り込まれた前記音声を音韻に変換するステップと、前記音声より変換された各音韻に対応する音韻生起確率を計算して、音韻列候補を出力するステップと、対象言語の音韻列と、音韻列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類された音韻ｎグラムを参照して、後方単語のそれぞれについて話題別に確率の重みを設定して、先行単語列候補の音韻列情報により検索し、検索した先行単語以降の音韻列候補の部分列に、前方一致する後方単語があるか否かをチェックして、算出された前記音韻列候補に対応する各単語候補の単語生起確率を算出するステップと、前記音韻生起確率と単語生起確率を用いて、入力された前記音声に類似する単語列候補を算出するステップとを備えたものである。
【００３０】
この発明に係る形態素解析方法は、入力される仮名漢字混じり文字列の取り込みを行うステップと、仮名漢字混じり文字列と、仮名漢字混じり文字列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類された漢字ｎグラムを参照して、後方単語のそれぞれについて話題別に確率の重みを設定して、先行単語列候補の漢字列情報により検索し、検索した先行単語以降の漢字列候補の部分列に、前方一致する後方単語があるか否かのチェックして、取り込まれた前記仮名漢字混じり文字列に対応する各単語候補の単語生起確率を算出するステップと、算出された前記単語生起確率を用いて、入力された前記仮名漢字混じり文字列に適合する単語列候補を算出するステップとを備えたものである。
【００３１】
この発明に係る仮名漢字変換方法は、入力される仮名文字列の取り込みを行うステップと、仮名文字列と、仮名文字列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類された仮名ｎグラムを参照して、後方単語のそれぞれについて話題別に確率の重みを設定して、検索した先行単語以降の仮名列候補の部分列に、前方一致する後方単語があるか否かのチェックして、取り込まれた前記仮名文字列に対応する各単語候補の単語生起確率を算出するステップと、算出された前記単語生起確率を用いて、入力された前記仮名文字列に適合する単語列候補を算出するステップとを備えたものである。
【００３２】
この発明に係る記録媒体は、入力される音声の取り込みを行うステップと、取り込まれた前記音声を音韻に変換するステップと、前記音声より変換された各音韻に対応する音韻生起確率を計算して、音韻列候補を出力するステップと、対象言語の音韻列と、音韻列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類された音韻ｎグラムを参照して、後方単語のそれぞれについて話題別に確率の重みを設定して、先行単語列候補の音韻列情報により検索し、検索した先行単語以降の音韻列候補の部分列に、前方一致する後方単語があるか否かをチェックして、算出された前記音韻列候補に対応する各単語候補の単語生起確率を算出するステップと、前記音韻生起確率と単語生起確率を用いて、入力された前記音声に類似する単語列候補を算出するステップとを有する音声認識方法を、コンピュータに実行させるためのプログラムを記録したコンピュータ読み取り可能に記録したものである。
【００３３】
この発明に係る記録媒体は、入力される仮名漢字混じり文字列の取り込みを行うステップと、仮名漢字混じり文字列と、仮名漢字混じり文字列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類された漢字ｎグラムを参照して、後方単語のそれぞれについて話題別に確率の重みを設定して、先行単語列候補の漢字列情報により検索し、検索した先行単語以降の漢字列候補の部分列に、前方一致する後方単語があるか否かのチェックして、取り込まれた前記仮名漢字混じり文字列に対応する各単語候補の単語生起確率を算出するステップと、算出された前記単語生起確率を用いて、入力された前記仮名漢字混じり文字列に適合する単語列候補を算出するステップとを有する形態素解析方法を、コンピュータに実行させるためのプログラムを記録したコンピュータ読み取り可能に記録したものである。
【００３４】
この発明に係る記録媒体は、入力される仮名文字列の取り込みを行うステップと、仮名文字列と、仮名文字列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類された仮名ｎグラムを参照して、後方単語のそれぞれについて話題別に確率の重みを設定して、検索した先行単語以降の仮名列候補の部分列に、前方一致する後方単語があるか否かのチェックして、取り込まれた前記仮名文字列に対応する各単語候補の単語生起確率を算出するステップと、算出された前記単語生起確率を用いて、入力された前記仮名文字列に適合する単語列候補を算出するステップとを有する仮名漢字変換方法を、コンピュータに実行させるためのプログラムを記録したコンピュータ読み取り可能に記録したものである。
【００３５】
【発明の実施の形態】
以下、この発明の実施の一形態を説明する。
実施の形態１．
図１はこの発明の実施の形態１による音声認識装置の構成を示すブロック図である。図において、１は音声を入力する入力手段としてのマイク、２はそのマイク１から入力された音声信号を音韻に変換し、各音韻に対応する音韻生起確率を算出して音韻列候補を生成する音韻確率算出手段であり、これらは図２６に同一符号を付して示した従来のそれらと同等のものである。
【００３６】
７，８は対象言語の音韻列と、音韻列に対応する単語表記列と、生起確率とを記憶する音韻ｎグラムであり、この音韻ｎグラム中では単語が、それぞれの話題に対応して分類されており、音韻ｎグラム７としては野球の話題について記憶した野球話題の音韻ｎグラムについて、音韻ｎグラム８としては一般の話題について記憶した一般話題の音韻ｎグラムについてそれぞれ例示されている。９はこれら野球話題の音韻ｎグラム７および一般話題の音韻ｎグラム８を参照して、音韻確率算出手段２の出力する音韻列候補に対応する各単語候補の単語生起確率を算出する単語確率算出手段である。
【００３７】
５は処理過程の情報を記憶するＲＡＭであり、６は音韻確率算出手段２で算出された音韻生起確率と、単語確率算出手段９で算出された単語生起確率を用いて、マイク１より入力された音声に類似する単語列候補を求めて出力する出力手段である。なお、このＲＡＭ５および出力手段６も図２６に同一符号を付して示した従来のそれらと同等のものである。
【００３８】
以下、単語列候補の生成について説明する。
この実施の形態１においても従来の場合と同様に、単語列候補は、発話された単語列をＷ、音韻列をＹとしたときの、上記従来の音声認識装置の説明で用いた式（１）で与えられる単語列Ｗの確率Ｐ（Ｗ｜Ｙ）を最大にする単語列Ｗを算出することによって得られる。このように単語列候補を生成するためには、確率Ｐ（Ｗ｜Ｙ）を最大にする単語列Ｗを求めればよいので、前述の式（１）の右辺のうち、単語列Ｗに共通な確率Ｐ（Ｙ）は省略でき、確率Ｐ（Ｙ｜Ｗ）Ｐ（Ｗ）を最大にする単語列Ｗを求めればよい。
【００３９】
時刻ｔ＝１，２，…，Ｌにおいて、単語列Ｗに対応する音韻列Ｙが、上記従来の音声認識装置の説明で用いた式（２）で決定されるとき、音韻列Ｙの出現確率Ｐ（Ｙ｜Ｗ）は当該音韻列Ｙの各音韻Ｙ_ｉの出現確率である音韻確率Ｐ（Ｙ_ｉ）より、従来の音声認識装置の説明における式（３）によって算出できる。また、単語列Ｗの出現確率Ｐ（Ｗ）は、ｍ語からなる単語列Ｗが従来の音声認識装置の説明における式（４）で決定されるとき、音韻確率Ｐ（Ｙ_ｉ）とは独立に次の式（６）から求めることができる。なお、この式（６）におけるｎは音韻ｎグラムの次数ｎである。
【００４０】
【数４】
$$P(W) \approx \prod_{i=1}^{m} P(w_i \mid w_{i-n+1}, \ldots, w_{i-1}) \qquad (6)$$
【００４１】
上述の計算により、音韻列候補のうち野球話題の音韻ｎグラム７や一般話題の音韻ｎグラム８に単語の列が存在するものについて、単語列の確率Ｐ（Ｗ｜Ｙ）を最大にする単語列Ｗを算出する。なお、組み合わせの計算については、例えば、中川聖一著：「確率モデルによる音声認識」に示されるビタビ（Ｖｉｔｅｒｂｉ）やスタックデコーディングの方法を用いて高速に行ってもよく、また、確率を対数確率として計算式を総和で計算可能としてもよい。それぞれの単語の出現確率は野球話題の音韻ｎグラム７、および一般話題の音韻ｎグラム８に予め記憶してある値を使用する。
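The remark that probabilities may be handled as log probabilities, so that the product becomes a sum, can be illustrated as follows. This is a small sketch with made-up numbers; the function names are hypothetical.

```python
import math


def product_score(probs):
    """Multiply probabilities directly; long chains can underflow to 0.0."""
    score = 1.0
    for p in probs:
        score *= p
    return score


def log_score(probs):
    """Sum of log probabilities; ranks hypotheses the same way as the product."""
    return sum(math.log(p) for p in probs)


long_chain = [0.1] * 400              # hypothetical chain of 400 n-gram probabilities
print(product_score(long_chain))      # underflows in double precision
print(log_score(long_chain))          # stays finite
```

Because the logarithm is monotonic, the word string that maximizes the sum of log probabilities is the same one that maximizes the product.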
【００４２】
ここで、図２はこの音声認識装置にて解析される例文を示す説明図であり、図において、１０がその例文である。また、図３はこの例文１０の解析に使用する音韻ｎグラムの具体例を示す説明図であり、図において、１１がその音韻ｎグラムである。なお、この音韻ｎグラム１１には野球話題の音韻ｎグラム７と一般話題の音韻ｎグラム８とが記録されている。
【００４３】
図３に示すように、この音韻ｎグラム１１内の野球話題の音韻ｎグラム７と一般話題の音韻ｎグラム８には、それぞれ２グラムと１グラムがあり、先頭の音韻列が検索のためのキーとなっている。２グラムではキーとなる各音韻列に対して、前接形態素、後接形態素、および確率が記録されている。ここに記録されている確率は、前接形態素の次に後接形態素を接続する確率であり、その２グラムの生起確率に該当する。また、１グラムではキーとなる各音韻列に対して、直接次に連接する形態素（後接続形態素）と確率が記録されている。この１グラムの確率はその形態素自身の生起確率である。なお、形態素は表記、音素表記、見出し読み、および品詞の組であらわされる。
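One way to picture the table layout described above (a phoneme-string key, a preceding morpheme, a following morpheme, and a probability, with each morpheme carrying a topic) is the following Python sketch. The class and field names are assumptions made for illustration, and only the 2-gram probability 0.01 appears in the description; the 1-gram values are invented.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Morpheme:
    topic: str      # e.g. "baseball" or "general"
    surface: str    # notation, e.g. "3号"
    phonemes: str   # phoneme notation, e.g. "saNgoo"
    reading: str    # headword reading, e.g. "さんごう"
    pos: str        # part of speech, e.g. "名詞"


@dataclass(frozen=True)
class Entry:
    preceding: Optional[Morpheme]  # None for a 1-gram
    following: Morpheme
    prob: float


BOS = Morpheme("baseball", "#", "#", "#", "文頭")
SANGOO_BASEBALL = Morpheme("baseball", "3号", "saNgoo", "さんごう", "名詞")
SANGOO_GENERAL = Morpheme("general", "3号", "saNgoo", "さんごう", "名詞")

# The leading phoneme string is the search key. In this sketch a 2-gram key
# is the preceding phonemes plus the following phonemes ("#" + "saNgoo").
TABLE: Dict[str, List[Entry]] = {
    "#saNgoo": [Entry(BOS, SANGOO_BASEBALL, 0.01)],   # value from the description
    "saNgoo": [
        Entry(None, SANGOO_BASEBALL, 0.001),          # hypothetical value
        Entry(None, SANGOO_GENERAL, 0.002),           # hypothetical value
    ],
}


def lookup(key: str) -> List[Entry]:
    """Return every entry stored under a phoneme-string key."""
    return TABLE.get(key, [])
```

Keeping the topic inside each morpheme, rather than in separate tables, is a design choice of this sketch; the description stores the baseball-topic and general-topic n-grams as separate parts of phoneme n-gram 11.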
【００４４】
算出した単語列Ｗを認識結果として出力手段６より出力する。
【００４５】
次に動作について説明する。
ここで、図４はこの実施の形態１による音声認識装置における認識処理の概略動作の流れを示すフローチャートである。この音声認識の処理はステップＳＴ１０１において、マイク１に対して発話することによって処理が開始される。マイク１はステップＳＴ１０２でこの発話された音声が入力されると、ステップＳＴ１０３でその入力音声を電気信号に変換し、アナログデータとして取り込む。
【００４６】
次にステップＳＴ１０４において、音韻確率算出手段２はこのマイク１の取り込んだアナログデータをＡ／Ｄ変換し、量子化した後、スペクトル分析を行って、音節単位に分離した認識結果を音韻列候補として出力する。なお、その処理の詳細については、例えば、中川聖一著：「確率モデルによる音声認識」などに示される種々の周知の手法によるものであるため、ここではその説明を割愛する。この音韻列候補はマイク１より取り込んだアナログデータに対応する各音韻の確からしさを確率値で表現したもので、連鎖した音韻連鎖とその連鎖の音響尤度の対で出力し、ＲＡＭ５にこれを記憶する。なお、この音響尤度は音韻列Ｙの出現確率Ｐ（Ｙ｜Ｗ）の最大値である。
【００４７】
この実施の形態１では、上記音韻連鎖と、連鎖の音響尤度として、以下が出力されたと仮定する。
＃ｓａＮｇｏｏａａｃｉｎｏｓｅＮｓｅｅ＃　０.９
＃ｓａＮｇｏｏａｃｉｎｏｓｅＮｓｅ＃　０.１
【００４８】
なお、音響尤度については、確率以外に対数確率等を用いてもよく、音韻連鎖についてはラティス等の効率的な記憶方式を用いてもよい。
【００４９】
次に単語確率算出手段９はステップＳＴ１０５において、音韻確率算出手段２の出力した音韻列候補と音響尤度をＲＡＭ５より１つ取り出すとともに、初期化処理をする。この初期化処理として、ヌル単語「｛＃＃＃文頭｝」とその確率値「１」を、先行単語列候補の初期言語尤度値としてＲＡＭ５に記憶する。ここでは、まず、音韻列候補として、「＃ｓａＮｇｏｏａａｃｉｎｏｓｅＮｓｅｅ＃」が取り出される。
【００５０】
次にステップＳＴ１０６において、単語確率算出手段９はすべての先行単語列候補が音韻列候補の末端の音韻と対応したか否かをチェックし、すべて対応していれば後述するステップＳＴ１１２の処理に移り、対応していなければステップＳＴ１０７以下の処理を行なう。
【００５１】
ステップＳＴ１０７ではＲＡＭ５から先行単語列候補を１つ取り出す。この実施の形態１では、最初に「｛＃＃＃文頭｝」が先行単語列候補として取り出される。
【００５２】
次にステップＳＴ１０８において、音韻ｎグラム１１を先行単語列候補の音韻列情報により検索する。この実施の形態１の場合、まず、初期の先行単語列である「｛＃＃＃文頭｝」を検索する。検索した先行単語以降の音韻列候補の部分列に、前方一致する後方単語があるか否かをチェックする。前方一致した後方単語が無い場合は、ステップＳＴ１０６に処理を戻し、前方一致した後方単語がある場合は、ステップＳＴ１０９以下の処理に進む。
【００５３】
ここで、この実施の形態１では、先行単語列「｛＃＃＃文頭｝」の後方単語として音韻ｎグラム１１の検索を行い、「＃」に後続する「ｓａＮｇｏｏａａ…」の先頭からの音素列が部分一致する単語を検索し後方単語とする。２グラムでは「＃ｓａＮｇｏｏ」が音韻列「＃ｓａＮｇｏｏａａ…」と前方一致するので、この２グラムの後接形態素「野球：３号ｓａＮｇｏｏさんごう名詞」を後方単語の候補の１つとする。また、１グラムの「｛野球：３号ｓａＮｇｏｏさんごう名詞｝」は後方の音素列に前方一致するので候補とする。さらに「｛一般：３号ｓａＮｇｏｏさんごう名詞｝」も候補とする。
【００５４】
なお、この実施の形態１では、説明の簡単化のために部分一致を用いたが、曖昧な音韻連鎖との類似検索に、ＤＰマッチング処理や、阿部他：「１段目の最適解と正解の差分傾向を考慮した２段階探索法」，音講論，１−Ｒ−１５，１９９８.９に示されるような他の手法を用いてもよい。
【００５５】
ステップＳＴ１０９においては、後方単語それぞれについて同様に尤度を計算し、それをＲＡＭ５に記憶するとともに先行単語列に後方単語を接続してゆき、新たに先行単語列としてＲＡＭ５に記憶する。その際、２グラムの場合は話題が先行の形態素と同じになるようにし、１グラムの場合は連接がないため話題の切り替わりがあってもよいようにする。
【００５６】
実施の形態１では、先行単語列「｛＃＃＃文頭｝」を「｛野球：＃＃＃文頭｝、｛野球：３号ｓａＮｇｏｏさんごう名詞｝」に置き換える。言語尤度は、先行単語列「｛＃＃＃文頭｝」の確率１と、野球話題の音韻ｎグラム７の「｛野球：＃＃＃文頭｝、｛野球：３号ｓａＮｇｏｏさんごう名詞｝」の２グラムの確率０.０１から次の式（７）で計算される。
【００５７】
$$\text{(probability of the preceding word string)} \times \text{(n-gram probability)} = 1 \times 0.01 = 0.01 \qquad (7)$$
【００５８】
次にステップＳＴ１１０において、音韻列全体が単語列に対応したか否かのチェックを行い、対応していればステップＳＴ１１１に進んで、最大尤度および解の先行単語列をＲＡＭ５に記憶した後、処理をステップＳＴ１０６に戻して、すべての先行単語列候補が音韻列候補の末端の音韻と対応したか否かをチェックする。一方、対応していなければ、そのまま処理をステップＳＴ１０６に戻して上記チェックを行う。
【００５９】
ステップＳＴ１０６で、すべての先行単語列候補が音韻列候補の末端の音韻と対応していると判定された場合には、ステップＳＴ１１２に移って、すべての音韻列候補に対して一致する単語が得られているか否かのチェックを行う。その結果、すべての音韻列候補に対して一致する単語が得られていなければステップＳＴ１０５に処理を戻して同様の処理を繰り返す。一方、すべての音韻列候補に対して一致する単語が得られていれば、ステップＳＴ１１３以下の処理を行う。
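The loop of steps ST105 to ST112 (start from the null word with probability 1, extend each preceding word string with following words that prefix-match the remaining phonemes, and keep the best hypothesis that covers the whole string) can be sketched as below. This is a simplified illustration, not the patented implementation: the table layout and every probability except 0.01 are hypothetical, the trailing "#" is omitted, and the search is an exhaustive stack-based expansion rather than whatever strategy a real device would use.

```python
# Toy tables. Only the 0.01 entry is taken from the description; every other
# probability is a made-up value for illustration.
BIGRAMS = [
    # (topic, preceding surface, following surface, following phonemes, probability)
    ("baseball", "#", "3号", "saNgoo", 0.01),
    ("baseball", "3号", "アーチ", "aaci", 0.1),
    ("baseball", "アーチ", "の", "no", 0.1),
    ("baseball", "の", "先制", "seNsee", 0.1),
    ("general", "#", "3号", "saNgoo", 0.001),
    ("general", "の", "先生", "seNsee", 0.2),
]
UNIGRAMS = [
    # (topic, surface, phonemes, probability)
    ("general", "の", "no", 0.01),
]


def recognize(phonemes):
    """Return (words, language likelihood) of the best full cover, or None."""
    # A hypothesis is (position, topic, last surface, words, likelihood).
    # The initial one plays the role of the null word with probability 1.
    pending = [(0, None, "#", (), 1.0)]
    best = None
    while pending:
        pos, topic, last, words, like = pending.pop()
        if pos == len(phonemes):          # the whole phoneme string is covered
            if best is None or like > best[1]:
                best = (words, like)
            continue
        for t, prev, surf, ph, p in BIGRAMS:
            # 2-gram: must connect to the last word and keep its topic
            if prev == last and topic in (None, t) and phonemes.startswith(ph, pos):
                pending.append((pos + len(ph), t, surf, words + ((t, surf),), like * p))
        for t, surf, ph, p in UNIGRAMS:
            # 1-gram: no connection constraint, so the topic may switch
            if phonemes.startswith(ph, pos):
                pending.append((pos + len(ph), t, surf, words + ((t, surf),), like * p))
    return best


words, likelihood = recognize("saNgooaacinoseNsee")
print("".join(surface for _, surface in words), likelihood)
```

With these invented numbers the baseball-topic path wins even though the general-topic 2-gram for 「の先生」 has the larger individual probability, because that 2-gram can only be reached through a lower-scoring 1-gram topic switch.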
【００６０】
この実施の形態１では、以上の処理により、音韻列候補に対応して、「｛＃＃＃文頭｝、｛野球：３号ｓａＮｇｏｏさんごう名詞｝、｛野球：アーチａａｃｉあーち名詞｝、｛野球：のｎｏの助詞｝、…」の順に先行単語列候補が得られる。
【００６１】
ステップＳＴ１１３では、ＲＡＭ５に記憶してある最大尤度を持つ解の単語列を読み出す。最大尤度は、言語尤度と音響尤度の積の最大値で近似される。この実施の形態１では、計算の結果、音韻列候補「＃ｓａＮｇｏｏａｃｉｎｏｓｅＮｓｅｅ＃」は該当する音韻ｎグラムが存在しないため捨てられる。音韻列候補「＃ｓａＮｇｏｏａａｃｉｎｏｓｅＮｓｅｅ＃」に対して、「｛＃＃＃文頭｝、｛野球：３号ｓａＮｇｏｏさんごう名詞｝、｛野球：アーチａａｃｉあーち普通名詞｝、｛野球：のｎｏの接続助詞｝、｛野球：先制ｓｅＮｓｅｅせんせいサ変名詞｝」の音声認識結果が、また最大尤度が前述の式（６）で求められる単語列確率Ｐ（Ｗ）中の最大値より、５.４×１０^−９（音響尤度；０.９、言語尤度；６×１０^−９）と得られる。
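The final selection in step ST113 multiplies the acoustic likelihood of each phoneme string candidate by the language likelihood of its best word string and keeps the maximum. A minimal sketch using the figures quoted above (0.9 and 6×10^−9); the data layout and function name are assumptions, and a candidate with no matching n-gram path is represented here by `None`.

```python
import math

# (phoneme chain, acoustic likelihood, language likelihood or None)
candidates = [
    ("#saNgooaacinoseNsee#", 0.9, 6e-9),
    ("#saNgooacinoseNsee#", 0.1, None),   # no matching n-gram path: discarded
]


def best_candidate(cands):
    """Pick the candidate maximizing acoustic likelihood x language likelihood."""
    scored = [(ac * lm, chain) for chain, ac, lm in cands if lm is not None]
    return max(scored) if scored else None


score, chain = best_candidate(candidates)
print(chain, score)  # score is approximately 5.4e-9
```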
【００６２】
次にステップＳＴ１１４において、ＲＡＭ５から読み出した解の単語列から表記のみを取り出し、それを出力手段６から出力した後、ステップＳＴ１１５に進んでこの一連の音声認識処理を終了する。このようにして、この実施の形態１では認識結果として、「３号アーチの先制」が得られる。
【００６３】
以上のように、この実施の形態１によれば、話題を分離して統計量をとって音声認識を行っているので、部分的には「の先制」よりも「の先生」の２グラム確率の方が高いにもかかわらず、「の先制」と認識され、ｎグラムの次数を大きくすることなく言語制約の強いｎグラムを構成することができ、高精度な音声認識装置を構築できるという効果が得られる。なお、本実施例では２つの話題を扱ったが、３つ以上の話題を扱うように構成しても良い。
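The effect claimed here, that statistics separated by topic can prefer 「の先制」 even when pooled statistics prefer 「の先生」, can be reproduced with a toy calculation. All counts below are invented for illustration only.

```python
from collections import Counter

# Hypothetical 2-gram counts per topic: (preceding word, following word) -> count
COUNTS = {
    "general": Counter({("の", "先生"): 90, ("の", "先制"): 10}),
    "baseball": Counter({("の", "先制"): 40, ("の", "先生"): 5}),
}


def bigram_prob(counts, prev, word):
    """Relative-frequency estimate of P(word | prev) from a count table."""
    total = sum(c for (p, _), c in counts.items() if p == prev)
    return counts[(prev, word)] / total if total else 0.0


pooled = COUNTS["general"] + COUNTS["baseball"]   # one table for all text

print(bigram_prob(pooled, "の", "先生"), bigram_prob(pooled, "の", "先制"))
print(bigram_prob(COUNTS["baseball"], "の", "先生"),
      bigram_prob(COUNTS["baseball"], "の", "先制"))
```

In the pooled table 「先生」 is the more probable continuation of 「の」, while in the baseball-topic table 「先制」 is, without any increase in the n-gram order.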
【００６４】
実施の形態２．
なお、上記実施の形態１においては、特に考慮していなかったが、単語列候補の算出時に、一連の音声に対する音韻ｎグラム中の話題がすべて一致するように単語確率算出手段を構成してもよい。図５はそのようなこの発明の実施の形態２による音声認識装置の構成を示すブロック図である。
【００６５】
図において、１はマイク、２は音韻確率算出手段、５はＲＡＭ、６は出力手段であり、これらは図１に同一符号を付して示した実施の形態１のそれらと同等の部分である。１２は図１に符号９を付して示したものに相当する単語確率算出手段であるが、単語列候補の算出時に、一連の音声に対する音韻ｎグラム中の話題がすべて一致するように構成されている点で異なっている。１３、１４は図１に符号７、８を付して示したものに相当する、野球話題の音韻ｎグラムおよび一般話題の音韻ｎグラムであるが、この場合、２グラムのみが用いられ、１グラムは用いられていない。
【００６６】
ここで、図６は音韻ｎグラムの具体例を示す説明図である。図において、１５はその音韻ｎグラムであり、この音韻ｎグラム１５は野球話題の音韻ｎグラム１３と一般話題の音韻ｎグラム１４とが記録されている。前述のように、この音韻ｎグラム１５の野球話題の音韻ｎグラム１３と一般話題の音韻ｎグラム１４には、それぞれキーとなる各音韻列に対して、前接形態素、後接形態素、および確率が記録された２グラムのみが用いられている。
【００６７】
次に動作について説明する。
図７はこのように構成された実施の形態２による音声認識装置の概略動作の流れを示すフローチャートである。この実施の形態２においても、まず、ステップＳＴ１０１からステップＳＴ１０７において、実施の形態１の場合と全く同様の処理が行われる。ステップＳＴ１０７にてＲＡＭ５から先行単語列候補の１つが取り出されると、単語確率算出手段１２はステップＳＴ１２０において、音韻ｎグラム１５を先行単語列候補の音韻列情報によって検索し、前方一致する後方単語があるか否かをチェックする。そのとき、実施の形態１では、音韻ｎグラム１１の野球話題の音韻ｎグラム７と一般話題の音韻ｎグラム８は、それぞれ２グラムと１グラムの双方が用いられていたが、この実施の形態２では、野球話題の音韻ｎグラム１３と一般話題の音韻ｎグラム１４がそれぞれ２グラムのみの音韻ｎグラム１５を用いて一致検出を行っている。チェックの結果、前方一致した後方単語がある場合にはステップＳＴ１０９に移り、以下ステップＳＴ１１５まで、実施の形態１と同様に処理を進める。
【００６８】
以上のように、この実施の形態２によれば、単語確率算出手段１２は音韻ｎグラム１５の２グラムのみを用いて一致を検査しているので、１つの発話に対する一連の形態素は同一の話題の形態素となるため、発話中に他の話題が交ざることを防止することができるという効果が得られる。
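Why restricting the table to 2-grams keeps an utterance within one topic can be seen from a short sketch: each 2-gram links a preceding and a following morpheme of the same topic, so any chain built only from such links never changes topic. The entries and function name below are hypothetical.

```python
# Hypothetical topic-tagged 2-grams: (topic, preceding surface, following surface)
BIGRAMS = [
    ("baseball", "#", "3号"),
    ("baseball", "3号", "アーチ"),
    ("general", "#", "3号"),
    ("general", "3号", "室"),
]


def paths(bigrams, length):
    """All topic-tagged word chains of `length` words built from 2-grams only."""
    frontier = [((t, prev), (t, nxt)) for t, prev, nxt in bigrams if prev == "#"]
    for _ in range(length - 1):
        frontier = [
            path + ((t, nxt),)
            for path in frontier
            for t, prev, nxt in bigrams
            if (t, prev) == path[-1]      # same topic and same preceding word
        ]
    return frontier


for path in paths(BIGRAMS, 2):
    print(path)
```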
【００６９】
実施の形態３．
なお、上記実施の形態１および実施の形態２では、音声認識において、話題ごとの確率の重み調整については特に考慮していなかったが、話題ごとに確率の重みの調整を可能にするようにしてもよい。図８はそのようなこの発明の実施の形態３による音声認識装置の構成を示すブロック図である。図において、１はマイク、２は音韻確率算出手段、５はＲＡＭ、６は出力手段、１３は野球話題の音韻ｎグラム、１４は一般話題の音韻ｎグラムであり、これらは図５に同一符号を付して示した実施の形態２のそれらと同等の部分である。１６は図５に符号１２を付して示したものに相当する単語確率算出手段であるが、話題ごとに確率の重みを調整可能に構成されている点で異なっている。
【００７０】
次に動作について説明する。
図９はこのように構成された実施の形態３による音声認識装置の概略動作の流れを示すフローチャートである。この実施の形態３においても、まず、ステップＳＴ１０１からステップＳＴ１０７、およびステップＳＴ１２０において、実施の形態２の場合と全く同様の処理が行われる。ステップＳＴ１２０における２グラムのみの音韻ｎグラム１５を用いた、前方一致する後方単語があるか否のチェックの結果、前方一致した後方単語がない場合にはステップＳＴ１０６に戻り、前方一致した後方単語がある場合にはステップＳＴ１３０に進む。ステップＳＴ１３０では単語確率算出手段１６が、後方単語のそれぞれについて分野別に重み付けを行って尤度を計算し、それをＲＡＭ５に記憶するとともに、先行単語列に後方単語を接続してゆき、新たに先行単語列としてＲＡＭ５に記憶する。以下ステップＳＴ１１０からステップＳＴ１１５まで、実施の形態２と同様に処理を進める。
【００７１】
以上のように、この実施の形態３によれば、２グラムの確率の重みを話題別にかけるように単語確率算出手段１６を構成しているので、話題別に出現確率の調節が可能になるという効果が得られる。
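The description does not say how the per-topic weight enters the likelihood. The sketch below assumes the simplest reading, a multiplicative factor applied to each 2-gram probability; the weights, probabilities, and names are all hypothetical.

```python
TOPIC_WEIGHT = {"baseball": 2.0, "general": 1.0}  # hypothetical weights


def extend_likelihood(likelihood, topic, bigram_prob, weights=None):
    """Extend a hypothesis by one word, scaling the 2-gram probability by topic."""
    weights = TOPIC_WEIGHT if weights is None else weights
    return likelihood * weights.get(topic, 1.0) * bigram_prob


prev = 1e-4  # likelihood of the preceding word string (made-up value)
unweighted = {"先制": prev * 0.1, "先生": prev * 0.15}
weighted = {
    "先制": extend_likelihood(prev, "baseball", 0.1),
    "先生": extend_likelihood(prev, "general", 0.15),
}
print(max(unweighted, key=unweighted.get), max(weighted, key=weighted.get))
```

Raising the baseball weight changes the preferred continuation from 「先生」 to 「先制」 in this toy case, which is the kind of per-topic adjustment the embodiment describes.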
【００７２】
実施の形態４．
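Embodiments 1 to 3 above described speech recognition devices, but a morphological analysis device can also be built by constructing kanji n-grams. FIG. 10 is a block diagram showing the configuration of such a morphological analysis device according to Embodiment 4 of the invention.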
【００７３】
図において、１７は仮名漢字混じり文字列（入力ファイル）を入力する入力手段としてのファイル入力装置である。１８、１９は仮名漢字混じり文字列と、仮名漢字混じり文字列に対応する単語表記列と、生起確率とを記憶する漢字ｎグラムであり、この漢字ｎグラム中では単語が、それぞれの話題に対応して分類されており、漢字ｎグラム１８としては野球の話題について記憶した野球話題の漢字ｎグラムについて、漢字ｎグラム１９としては一般の話題について記憶した一般話題の漢字ｎグラムについてそれぞれ例示されている。２０はこれら野球話題の漢字ｎグラム１８および一般話題の漢字ｎグラム１９を参照して、ファイル入力装置１７が出力する仮名漢字混じり文字列に対応する各単語候補の単語生起確率を算出する形態素確率算出手段である。５は処理過程の情報を記憶するＲＡＭであり、２１は形態素確率算出手段２０で算出された単語生起確率を用いて求めた、ファイル入力装置１７より入力された文字列に適合する単語列候補を出力する出力手段である。
【００７４】
以下、単語列候補の生成について説明する。
この実施の形態４における単語列候補の生成は、単語列の出現確率Ｐ（Ｗ）を最大にするＷを算出することで得られる。このとき、Ｗは入力された単語列である。また、単語列の出現確率Ｐ（Ｗ）は、ｍ語の単語列Ｗが前述の式（４）で決定されるとき、前述の式（６）から求める。なお、その際には野球話題の漢字ｎグラム１８、一般話題の漢字ｎグラム１９の確率が使用される。
【００７５】
上述した計算により、野球話題の漢字ｎグラム１８および一般話題の漢字ｎグラム１９に単語の列が存在するものについて、単語列確率Ｐ（Ｗ）を最大にするＷを算出する。なお、組み合わせの計算については、例えば、長尾真著：「自然言語処理」に示されるＶｉｔｅｒｂｉ方法を用いて高速に行ってもよいし、確率を対数確率として計算式を総和で計算可能としてもよい。それぞれの単語の出現確率は単語の野球話題の漢字ｎグラム１８、一般話題の漢字ｎグラム１９に予め記憶してある確率値をもとに算出する。
【００７６】
ここで、図１１は図２に示した例文１０をもとに作成した漢字ｎグラムの具体例を示す説明図であり、図において、２２がその漢字ｎグラムであり、この漢字ｎグラム２２には野球話題の漢字ｎグラム１８と一般話題の漢字ｎグラム１９とが記録されている。
【００７７】
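As shown in FIG. 11, the baseball-topic kanji n-gram 18 and the general-topic kanji n-gram 19 within this kanji n-gram 22 each contain 2-grams and 1-grams, and the leading kanji string serves as the search key. For each key kanji string, a 2-gram records a preceding morpheme, a following morpheme, and a probability. The recorded probability is the probability that the following morpheme connects after the preceding morpheme, and corresponds to the occurrence probability of that 2-gram. For each key kanji string, a 1-gram records the morpheme that directly follows (the following morpheme) and a probability. The probability of a 1-gram is the occurrence probability of the morpheme itself. A morpheme is represented as a tuple of notation, phoneme notation, headword reading, and part of speech.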
【００７８】
算出した単語列Ｗを認識結果として出力手段２１より出力する。
【００７９】
次に動作について説明する。
ここで、図１２はこの実施の形態４による形態素解析装置における解析処理の概略動作の流れを示すフローチャートである。この形態素解析の処理はステップＳＴ２０１において、ファイル入力装置１７より仮名漢字混じり文字列を入力することによって処理が開始される。ファイル入力装置１７はステップＳＴ２０２でその入力された仮名漢字交じり文字列を取り込み、形態素確率算出手段２０に入力する。形態素確率算出手段２０はファイル入力装置１７の取り込んだ仮名漢字交じり文字列が入力されると、ステップＳＴ２０３においてＲＡＭ５にこれを記憶する。この実施の形態４では、仮名漢字交じり文字列として以下が入力されたと仮定する。
３号アーチのせんせい
【００８０】
次にステップＳＴ２０４において、形態素確率算出手段２０はステップＳＴ２０３でＲＡＭ５に記憶させた漢字列候補を取り出すとともに、初期化処理をする。この初期化処理では、ヌル単語「｛＃＃＃文頭｝」とその確率値「１」を先行単語列候補の初期値としてＲＡＭ５に記憶する。従って、ここでは、漢字列候補として、「＃３号アーチのせんせい＃」が、まず取り出される。形態素確率算出手段２０はさらにステップＳＴ２０５において、すべての先行単語列候補が漢字列候補の末端の漢字と対応したか否かをチェックし、すべて対応していれば処理をステップＳＴ２１１に移し、対応していなければ処理をステップＳＴ２０６に進める。
【００８１】
ステップＳＴ２０６では、形態素確率算出手段２０はＲＡＭ５から先行単語列候補を１つ取り出す。この実施の形態４では、最初に「｛＃＃＃文頭｝」が先行単語列候補として取り出される。次にステップＳＴ２０７において、野球話題の漢字ｎグラム１８および一般話題の漢字ｎグラム１９を、先行単語列候補の漢字列情報により検索し、検索した先行単語以降の漢字列候補の部分列に、前方一致する後方単語があるか否かのチェックをする。チェックの結果、前方一致した後方単語が無い場合には、ステップＳＴ２０５に処理を戻し、前方一致した後方単語がある場合には、ステップＳＴ２０８に処理を進める。
【００８２】
従って、この実施の形態４の場合には、初期の先行単語列である「｛＃＃＃文頭｝」をまず検索する。そして、この検索した先行単語列「｛＃＃＃文頭｝」の後方単語として、野球話題の漢字ｎグラム１８および一般話題の漢字ｎグラム１９を検索し、「＃」に後続する「３号ア…」の先頭からの漢字列が部分一致する単語を検索して後方単語とする。２グラムでは「＃３号」が「＃３号ア…」の漢字列と前方一致するので、この２グラムの後接形態素「野球：３号ｓａＮｇｏｏさんごう名詞」を後方単語の候補の１つとする。また、１グラムの「｛野球：３号ｓａＮｇｏｏさんごう名詞｝」は後方の漢字列に前方一致するのでこれも候補とする。さらに「｛一般：３号ｓａＮｇｏｏさんごう名詞｝」も候補となる。
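The candidate lookup in step ST207 amounts to finding every stored word whose notation is a prefix of the not-yet-analyzed part of the input. A minimal sketch, assuming a flat list of entries and ASCII digits for simplicity; the entry layout and function name are hypothetical.

```python
LEXICON = [
    # (topic, surface, reading, part of speech)
    ("baseball", "3号", "さんごう", "名詞"),
    ("general", "3号", "さんごう", "名詞"),
    ("baseball", "アーチ", "あーち", "名詞"),
    ("baseball", "の", "の", "助詞"),
]


def following_candidates(text, pos, lexicon=LEXICON):
    """Entries whose surface is a prefix of text[pos:]."""
    return [entry for entry in lexicon if text.startswith(entry[1], pos)]


text = "3号アーチのせんせい"
print(following_candidates(text, 0))   # both topic variants of 3号
print(following_candidates(text, 2))   # アーチ
```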
【００８３】
ステップＳＴ２０８では、後方単語のそれぞれについて尤度を計算し、ＲＡＭ５に記憶するとともに、先行単語列に後方単語を接続してゆく。この際に、２グラムの場合は話題が先行の形態素と同じになるようにし、１グラムの場合は連接がないため話題の切り替わりを許すようにする。この後方単語を接続した先行単語列を、新たに先行単語列としてＲＡＭ５に記憶する。この実施の形態４では、先行単語列「｛＃＃＃文頭｝」を「｛野球：＃＃＃文頭｝、｛野球：３号ｓａＮｇｏｏさんごう名詞｝」に置き換える。言語尤度は、先行単語列「｛＃＃＃文頭｝」の確率１と、野球話題の「｛＃｝，｛３号｝」の２グラムの確率０.０１から前述の式（７）で計算される。
【００８４】
次にステップＳＴ２０９において、漢字列全体が先行単語列に対応したか否かのチェックを行い、対応していればステップＳＴ２１０に進んで、最大尤度および解の先行単語列をＲＡＭ５に記憶した後、処理をステップＳＴ２０５に戻し、すべての先行単語列候補が漢字列候補の末端の単語と対応したか否かをチェックする。一方、対応していなければ、そのまま処理をステップＳＴ２０５に戻して上記チェックを行う。
【００８５】
この実施の形態４では、以上の処理により、漢字列候補に対応して、「｛＃＃＃文頭｝、｛野球：３号ｓａＮｇｏｏさんごう名詞｝、｛野球：アーチａａｃｉあーち名詞｝、｛野球：のｎｏの助詞｝、…」の順に先行単語列候補が得られる。
【００８６】
ステップＳＴ２０５ですべての先行単語列候補が漢字列候補の末端の単語と対応していると判定された場合には、ステップＳＴ２１１に進んでＲＡＭ５に記憶してある最大尤度を持つ解の単語列を読み出す。ここで、最大尤度は言語尤度と音響尤度の積の最大値である。この実施の形態４では漢字列候補「＃３号アーチの先制＃」に対して、「｛＃＃＃文頭｝、｛３号ｓａＮｇｏｏさんごう名詞｝、｛アーチａａｃｉあーち名詞｝、｛のｎｏの接続助詞｝、｛先制ｓｅＮｓｅｅせんせいサ変名詞｝」の形態素解析結果が、また最大尤度が前述の式（６）で求められる単語列確率Ｐ（Ｗ）中の最大値より、５.４×１０^−９（音響尤度；０.９、言語尤度；６×１０^−９）と得られる。
【００８７】
次にステップＳＴ２１２において、ＲＡＭ５から読み出した解の形態素列を取り出し、それを出力手段２１から出力した後、ステップＳＴ２１３に進んでこの一連の形態素解析処理を終了する。このようにして、この実施の形態４では解析結果として、「｛３号さんごう名詞｝、｛アーチあーち名詞｝、｛のの接続助詞｝、｛せんせいせんせいサ変名詞｝」が得られる。
【００８８】
以上のように、この実施の形態４によれば、話題を分離して統計量をとって形態素解析を行っているので、部分的には「のせんせい」という曖昧な表記でも「先制」の意味で品詞がサ変であることが算出でき、ｎグラムの次数を大きくすることなく言語制約の強いｎグラムを構成することができ、高精度な形態素解析装置を構築できるという効果が得られる。なお、本実施例では２つの話題を扱ったが、３つ以上の話題を扱うように構成しても良い。
【００８９】
実施の形態５．
なお、上記実施の形態４では、特に考慮していなかったが、単語列候補の算出時に、一連の仮名漢字混じり文字列に対する漢字ｎグラム中の話題がすべて一致するように形態素確率算出手段を構成してもよい。図１３はそのようなこの発明の実施の形態５による形態素解析装置の構成を示すブロック図である。
【００９０】
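In the figure, 5 is the RAM, 17 is the file input device, and 21 is the output means; these are equivalent to the parts given the same reference numerals in Embodiment 4 shown in FIG. 10. Reference numeral 23 is a morpheme probability computing means corresponding to the one labeled 20 in FIG. 10, but it differs in that, when calculating word string candidates, it is configured so that the topics in the kanji n-gram all match for a given kana-kanji mixed character string. Reference numerals 24 and 25 are a baseball-topic kanji n-gram and a general-topic kanji n-gram corresponding to those labeled 18 and 19 in FIG. 10, but here only 2-grams are used and 1-grams are not.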
【００９１】
ここで、図１４は漢字ｎグラムの具体例を示す説明図である。図において、２６はその漢字ｎグラムであり、この漢字ｎグラム２６は野球話題の漢字ｎグラム２４と一般話題の漢字ｎグラム２５とが記録されている。前述のように、この漢字ｎグラム２６の野球話題ｎグラム２４と一般話題ｎグラム２５には、それぞれキーとなる各漢字列に対して、前接形態素、後接形態素、および確率が記録された２グラムのみが用いられている。
【００９２】
次に動作について説明する。
図１５はこのように構成された実施の形態５による形態素解析装置の概略動作の流れを示すフローチャートである。この実施の形態５においても、まず、ステップＳＴ２０１からステップＳＴ２０６において、実施の形態４の場合と全く同様の処理が行われる。ステップＳＴ２０６にてＲＡＭ５から先行単語列候補の１つが取り出されると、形態素確率算出手段２３はステップＳＴ２２０において、漢字ｎグラム２６を先行単語列候補の漢字列情報によって検索し、前方一致する後方単語があるか否かのチェックをする。そのとき、実施の形態４では、漢字ｎグラム２２の野球話題の漢字ｎグラム１８と一般話題の漢字ｎグラム１９は、それぞれ２グラムと１グラムの双方が用いられていたが、この実施の形態５では、野球話題の漢字ｎグラム２４と一般話題の漢字ｎグラム２５が、それぞれ２グラムのみの漢字ｎグラム２６を用いて一致検出を行っている。チェックの結果、前方一致した後方単語がある場合にはステップＳＴ２０８に分岐して、以下ステップＳＴ２１３まで、実施の形態４と同様に処理を進める。
【００９３】
以上のように、この実施の形態５によれば、形態素確率算出手段２３は漢字ｎグラム２６の２グラムのみを用いて一致を検査しているので、１つの仮名漢字混じり文字列に対する一連の形態素は同一の話題の形態素となるため、他の話題が交ざることを防止することができるという効果が得られる。
【００９４】
実施の形態６．
なお、上記実施の形態４および実施の形態５では、形態素解析において、話題ごとの確率の重み調整については特に考慮していなかったが、話題ごとに確率の重みの調整を可能にするように形態素確率算出手段を構成してもよい。図１６はそのようなこの発明の実施の形態６による形態素解析装置の構成を示すブロック図である。図において、５はＲＡＭ、１７はファイル入力装置、２１は出力手段、２４、２５は野球話題および一般話題の漢字ｎグラムであり、これらは図１３に同一符号を付して示した実施の形態５のそれらと同等の部分である。２７は図１３に符号２３を付して示したものに相当する形態素確率算出手段であるが、話題ごとに確率の重みを調整可能に構成されている点で異なっている。
【００９５】
次に動作について説明する。
図１７はこのように構成された実施の形態６による形態素解析装置の概略動作の流れを示すフローチャートである。この実施の形態６においても、まず、ステップＳＴ２０１からステップＳＴ２０６、およびステップＳＴ２２０において、実施の形態５の場合と全く同様の処理が行われる。ステップＳＴ２２０における２グラムのみの漢字ｎグラム２６を用いた、前方一致する後方単語があるか否のチェックの結果、前方一致した後方単語がない場合にはステップＳＴ２０５に戻り、前方一致する後方単語がある場合にはステップＳＴ２３０に進む。ステップＳＴ２３０では形態素確率算出手段２７が、後方単語のそれぞれについて分野別に重み付けを行って尤度を計算し、それをＲＡＭ５に記憶するとともに、先行単語列に後方単語を接続してゆき、新たに先行単語列としてＲＡＭ５に記憶する。以下ステップＳＴ２０９からステップＳＴ２１３まで、実施の形態５と同様に処理を進める。
【００９６】
以上のように、この実施の形態６によれば、２グラムの確率の重みを話題別にかけるように形態素確率算出手段２７を構成しているので、話題別に出現確率の調節が可能になるという効果が得られる。
【００９７】
実施の形態７．
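Embodiments 1 to 6 above described speech recognition devices and morphological analysis devices, but a kana-kanji conversion device can also be built by constructing kana n-grams. FIG. 18 is a block diagram showing the configuration of such a kana-kanji conversion device according to Embodiment 7 of the invention.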
【００９８】
図において、２８は入力文の仮名文字列を入力する入力手段としてのキーボードである。２９、３０は仮名文字列と、仮名文字列に対応する単語表記列と、生起確率とを記憶する仮名ｎグラムであり、この仮名ｎグラム中では単語が、それぞれ話題に対応して分類されており、仮名ｎグラム２９としては野球の話題について記憶した野球話題の仮名ｎグラムについて、仮名ｎグラム３０としては一般の話題について記憶した一般話題の仮名ｎグラムについてそれぞれ例示されている。３１はこれら野球話題の仮名ｎグラム２９および一般話題の仮名ｎグラム３０を参照して、キーボード２８が出力する仮名文字列に対応する各単語候補の単語生起確率を算出する漢字確率算出手段である。５は処理過程の情報を記憶するＲＡＭであり、３２は漢字確率算出手段３１で算出された単語生起確率を用いて求めた、キーボード２８より入力された仮名文字列に適合する単語列候補を求めて出力する出力手段である。
【００９９】
以下、単語列候補の生成について説明する。
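In Embodiment 7, word string candidates are generated by finding the W that maximizes the word string probability P(W). Here W is the input word string. When the m-word string W is given by equation (4) above, P(W) is obtained from equation (6) above, using the probabilities of the baseball-topic kana n-gram 29 and the general-topic kana n-gram 30.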
【０１００】
上述した計算により、野球話題の仮名ｎグラム２９及び一般話題の仮名ｎグラム３０に単語の列が存在するものについて、単語列確率Ｐ（Ｗ）を最大にするＷを算出する。なお、組み合わせの計算については、例えば、長尾真著：「自然言語処理」に示されるＶｉｔｅｒｂｉ方法を用いて高速に行ってもよいし、また、確率を対数確率として計算式を総和で計算可能としてもよい。それぞれの単語の出現確率は単語の野球話題の仮名ｎグラム２９と一般話題の仮名ｎグラム３０に予め記憶してある確率値をもとに算出する。
【０１０１】
ここで、図１９は図２に示した例文１０をもとに作成した仮名ｎグラムの具体例を示す説明図である。図において、３３がその仮名ｎグラムであり、この仮名ｎグラム３３には野球話題の仮名ｎグラム２９と一般話題の仮名ｎグラム３０とが記録されている。
【０１０２】
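As shown in FIG. 19, the baseball-topic kana n-gram 29 and the general-topic kana n-gram 30 within this kana n-gram 33 each contain 2-grams and 1-grams, and the leading kana character string serves as the search key. For each key kana character string, a 2-gram records a preceding morpheme, a following morpheme, and a probability. The recorded probability is the probability that the following morpheme connects after the preceding morpheme, and corresponds to the occurrence probability of that 2-gram. For each key kana character string, a 1-gram records the morpheme that directly follows (the following morpheme) and a probability. The probability of a 1-gram is the occurrence probability of the morpheme itself. A morpheme is represented as a tuple of notation, phoneme notation, headword reading, and part of speech.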
【０１０３】
算出した単語列Ｗを認識結果として出力手段３２より出力する。
【０１０４】
次に動作について説明する。
ここで、図２０はこの実施の形態７による仮名漢字変換装置における変換処理の概略動作の流れを示すフローチャートである。この仮名漢字変換の処理はステップＳＴ３０１において、キーボード２８が操作されることによって処理が開始される。キーボード２８の操作によって入力された仮名文字列は、ステップＳＴ３０２で漢字確率算出手段３１に取り込まれ、ステップＳＴ３０３において、ＲＡＭ５にこれを記憶する。この実施の形態７では、仮名文字列として以下が入力されたと仮定する。
さんごうあーちのせんせい
【０１０５】
次にステップＳＴ３０４において、漢字確率算出手段３１はステップＳＴ３０３でＲＡＭ５に記憶させた仮名文字列を取り出すとともに、初期化処理をする。この初期化処理では、ヌル単語「｛＃＃＃文頭｝」とその確率値「１」を先行単語列候補の初期値としてＲＡＭ５に記憶する。従って、ここでは、仮名文字列として、「＃さんごうあーちのせんせい＃」が、まず取り出される。漢字確率算出手段３１はさらにステップＳＴ３０５において、すべての先行単語列候補が仮名文字列の末端の仮名と対応したか否かをチェックし、すべて対応していれば処理をステップＳＴ３１１に移し、対応していなければ処理をステップＳＴ３０６に進める。
【０１０６】
ステップＳＴ３０６では、漢字確率算出手段３１はＲＡＭ５から先行単語列候補を１つ取り出す。この実施の形態７では、最初に「｛＃＃＃文頭｝」が先行単語列候補として取り出される。次にステップＳＴ３０７において、野球話題の仮名ｎグラム２９および一般話題の仮名ｎグラム３０を、先行単語列候補の仮名列情報により検索し、検索した先行単語以降の仮名列候補の部分列に、前方一致する後方単語があるか否かのチェックをする。チェックの結果、前方一致した後方単語が無い場合には、ステップＳＴ３０５に処理を戻し、前方一致した後方単語がある場合には、ステップＳＴ３０８に処理を進める。
【０１０７】
従って、この実施の形態７の場合には、初期の先行単語列である「｛＃＃＃文頭｝」をまず検索する。そして、この検索した先行単語列「｛＃＃＃文頭｝」の後方単語として、野球話題の仮名ｎグラム２９と一般話題の仮名ｎグラム３０を検索し、「＃」に後続する「さんごうあー…」の先頭からの仮名文字列が部分一致する単語を検索して後方単語とする。２グラムでは「＃さんごう」が「＃さんごうあー…」の仮名文字列と前方一致するので、この２グラムの後接形態素「野球：３号ｓａＮｇｏｏさんごう名詞」を後方単語の候補の１つとする。また、１グラムの「｛野球：３号ｓａＮｇｏｏさんごう名詞｝」は後方の仮名文字列に前方一致するのでこれも候補とする。さらに「｛一般：３号ｓａＮｇｏｏさんごう名詞｝」も候補となる。
【０１０８】
ステップＳＴ３０８では、後方単語それぞれについて尤度を計算し、ＲＡＭ５に記憶するとともに、先行単語列に後方単語を接続してゆき、新たに先行単語列としてＲＡＭ５にこれを記憶する。この実施の形態７では、先行単語列「｛＃＃＃文頭｝」を「｛野球：＃＃＃文頭｝、｛野球：３号ｓａＮｇｏｏさんごう名詞｝」に置き換える。言語尤度は、先行単語列「｛＃＃＃文頭｝」の確率１と、野球話題の「｛野球：＃＃＃文頭｝、｛野球：３号ｓａＮｇｏｏさんごう名詞｝」の２グラムの確率０.０１から前述の式（７）で計算される。
【０１０９】
次にステップＳＴ３０９において、仮名文字列全体が先行単語列に対応したか否かのチェックを行い、対応していればステップＳＴ３１０に進んで、最大尤度および解の先行単語列をＲＡＭ５に記憶した後、処理をステップＳＴ３０５に戻し、すべての先行単語列候補が仮名文字列候補の末端の仮名と対応したか否かをチェックする。一方、対応していなければ、そのまま処理をステップＳＴ３０５に戻して上記チェックを行う。
【０１１０】
この実施の形態７では、以上の処理により、仮名列候補に対応して、「｛＃＃＃文頭｝、｛野球：３号ｓａＮｇｏｏさんごう名詞｝、｛野球：アーチａａｃｉあーち名詞｝、｛野球：のｎｏの助詞｝、…」の順に先行単語列候補が得られる。
【０１１１】
ステップＳＴ３０５ですべての先行単語列候補が仮名文字列候補の末端の仮名と対応していると判定された場合には、ステップＳＴ３１１に進んでＲＡＭ５に記憶してある最大尤度を持つ解の単語列を読み出す。ここで、最大尤度は言語尤度と音響尤度の積の最大値である。この実施の形態７では仮名文字列候補「＃さんごうあーちのせんせい＃」に対して、「｛＃＃＃文頭｝、｛３号ｓａＮｇｏｏさんごう名詞｝、｛アーチａａｃｉあーち普通名詞｝、｛のｎｏの接続助詞｝、｛先制ｓｅＮｓｅｅせんせいサ変名詞｝」が、また最大尤度が前述の式（６）で求められる単語列確率Ｐ（Ｗ）中の最大値より、５.４×１０^−９（音響尤度；０.９、言語尤度；６×１０^−９）と得られる。
【０１１２】
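Next, in step ST312, the solution word string read from the RAM 5 is output from the output means 32, after which the process advances to step ST313 and this series of kana-kanji conversion processing ends. In this way, Embodiment 7 yields 「３号アーチの先制」 as the kana-kanji conversion result.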
【０１１３】
以上のように、この実施の形態７によれば、話題を分離して統計量をとって仮名漢字変換を行っているので、ｎグラムの次数を大きくすることなく言語制約の強いｎグラムを構成することができ、高精度な仮名漢字変換装置を構築できるという効果が得られる。なお、本実施例では２つの話題を扱ったが、３つ以上の話題を扱うように構成しても良い。
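The kana-kanji conversion of this embodiment can be sketched with a small Viterbi-style dynamic program, one of the options the description mentions for the combination search. Everything below is an illustrative assumption: the table layout, all probabilities other than 0.01, and the restriction to 2-grams (which makes it closer to the single-topic variant) are choices of this sketch, not details confirmed by the patent.

```python
import math

# Hypothetical kana 2-gram table:
# (topic, preceding surface) -> [(reading, surface, probability)]
KANA_BIGRAMS = {
    ("baseball", "#"): [("さんごう", "3号", 0.01)],
    ("baseball", "3号"): [("あーち", "アーチ", 0.1)],
    ("baseball", "アーチ"): [("の", "の", 0.1)],
    ("baseball", "の"): [("せんせい", "先制", 0.1)],
    ("general", "#"): [("さんごう", "3号", 0.001)],
    ("general", "3号"): [("の", "の", 0.1)],
    ("general", "の"): [("せんせい", "先生", 0.2)],
}


def viterbi_convert(kana, table=KANA_BIGRAMS, topics=("baseball", "general")):
    """Return (log probability, surfaces) of the best conversion, or None."""
    # best[pos][(topic, last surface)] = (log probability, surfaces so far)
    best = [dict() for _ in range(len(kana) + 1)]
    for topic in topics:
        best[0][(topic, "#")] = (0.0, ())
    for pos in range(len(kana)):
        for (topic, last), (logp, words) in best[pos].items():
            for reading, surface, p in table.get((topic, last), []):
                if kana.startswith(reading, pos):
                    end = pos + len(reading)
                    cand = (logp + math.log(p), words + (surface,))
                    key = (topic, surface)
                    # keep only the best-scoring way to reach each state
                    if key not in best[end] or cand[0] > best[end][key][0]:
                        best[end][key] = cand
    finals = list(best[len(kana)].values())
    return max(finals, key=lambda item: item[0]) if finals else None


result = viterbi_convert("さんごうあーちのせんせい")
print("".join(result[1]))
```

Log probabilities are used so that the score of a path is a running sum, as the description suggests.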
【０１１４】
実施の形態８．
なお、上記実施の形態７では、特に考慮していなかったが、仮名漢字の変換時に、一連の仮名文字列に対する仮名ｎグラム中の話題がすべて一致するように漢字確率算出手段を構成してもよい。図２１はそのようなこの発明の実施の形態８による仮名漢字変換装置の構成を示すブロック図である。
【０１１５】
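In the figure, 5 is the RAM, 28 is the keyboard, and 32 is the output means; these are equivalent to the parts given the same reference numerals in Embodiment 7 shown in FIG. 18. Reference numeral 34 is a kanji probability computing means corresponding to the one labeled 31 in FIG. 18, but it differs in that, when calculating word string candidates, it is configured so that the topics in the kana n-gram all match for a given kana character string. Reference numerals 35 and 36 are a baseball-topic kana n-gram and a general-topic kana n-gram corresponding to those labeled 29 and 30 in FIG. 18, but here only 2-grams are used and 1-grams are not.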
【０１１６】
ここで、図２２は仮名ｎグラムの具体例を示す説明図である。図において、３７はその仮名ｎグラムであり、この仮名ｎグラム３７は野球話題の仮名ｎグラム３５と一般話題の仮名ｎグラム３６とが記録されている。前述のように、この仮名ｎグラム３７の野球話題の仮名ｎグラム３５と一般話題の仮名ｎグラム３６には、それぞれキーとなる各仮名文字列に対して、前接形態素、後接形態素、および確率が記録された２グラムのみが用いられている。
【０１１７】
次に動作について説明する。
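FIG. 23 is a flowchart showing the outline of operation of the kana-kanji conversion device according to Embodiment 8 configured in this way. In Embodiment 8 as well, steps ST301 to ST306 first perform exactly the same processing as in Embodiment 7. When one preceding word string candidate has been taken out of the RAM 5 in step ST306, the kanji probability computing means 34 searches the kana n-gram 37 in step ST320 using the kana string information of the preceding word string candidate and checks whether there is a following word that prefix-matches. In Embodiment 7, both 2-grams and 1-grams were used for the baseball-topic kana n-gram 29 and the general-topic kana n-gram 30 of the kana n-gram 33; in Embodiment 8, matching is performed with the kana n-gram 37, in which the baseball-topic kana n-gram 35 and the general-topic kana n-gram 36 each contain only 2-grams. If the check finds a prefix-matching following word, the process branches to step ST308, and processing continues as in Embodiment 7 up to step ST313.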
【０１１８】
以上のように、この実施の形態８によれば、漢字確率算出手段３４は仮名ｎグラム３７の２グラムのみを用いて一致を検査しているので、１つの仮名文字列に対する一連の形態素は同じ話題の形態素となるため、他の話題が交ざることをなくすことができるという効果が得られる。
【０１１９】
実施の形態９．
なお、上記実施の形態７および実施の形態８では、仮名漢字変換において、話題ごとの確率の重み調整については特に考慮していなかったが、話題ごとに確率の重みの調整を可能に漢字確率算出手段を構成するようにしてもよい。図２４はそのようなこの発明の実施の形態９による仮名漢字変換装置の構成を示すブロック図である。図において、５はＲＡＭ、２８はキーボード、３２は出力手段、３５、３６は野球話題および一般話題の仮名ｎグラムであり、これらは図２１に同一符号を付して示した実施の形態８のそれらと同等の部分である。３８は図２１に符号３４を付して示したものに相当する漢字確率算出手段であるが、話題ごとに確率の重みを調整可能に構成されている点で異なっている。
【０１２０】
次に動作について説明する。
図２５はこのように構成された実施の形態９による仮名漢字変換装置の概略動作の流れを示すフローチャートである。この実施の形態９においても、まず、ステップＳＴ３０１からステップＳＴ３０６、およびステップＳＴ３２０において、実施の形態８の場合と全く同様の処理が行われる。ステップＳＴ３２０における２グラムのみの仮名ｎグラム３７を用いた、前方一致する後方単語があるか否のチェックの結果、前方一致した後方単語がない場合にはステップＳＴ３０５に戻り、前方一致する後方単語がある場合にはステップＳＴ３３０に進む。ステップＳＴ３３０では漢字確率算出手段３８が、後方単語のそれぞれについて分野別に重み付けを行って尤度を計算し、それをＲＡＭ５に記憶するとともに、先行単語列に後方単語を接続してゆき、新たに先行単語列としてＲＡＭ５に記憶する。以下ステップＳＴ３０９からステップＳＴ３１３まで、実施の形態８と同様に処理を進める。
【０１２１】
以上のように、この実施の形態９によれば、２グラムの確率の重みを話題別にかけるように漢字確率算出手段３８を構成しているので、話題別に出現確率の調節が可能になるという効果が得られる。
【０１２２】
【発明の効果】
以上のように、この発明によれば、この発明に係る音声認識装置は、対象言語の音韻列と、音韻列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類されている音韻ｎグラムと、前記対象言語の音声を入力する入力手段と、前記入力手段が出力する音声信号を音韻に変換し、各音韻に対応する音韻生起確率を計算して、音韻列候補を出力する音韻確率算出手段と、それぞれの話題に対応して分類されている前記音韻ｎグラムを参照して、先行単語列候補の音韻列情報により検索し、検索した先行単語以降の音韻列候補の部分列に、前方一致する後方単語があるか否かをチェックし、前記音韻確率算出手段が出力する音韻列候補に対応する各単語候補の単語生起確率を算出する単語確率算出手段と、前記音韻確率算出手段にて計算された音韻生起確率と、前記単語確率算出手段にて計算された単語生起確率とを用いて算出した、前記入力手段より入力された音声に類似する単語列候補を出力する出力手段とを備え、上記単語確率算出手段が、後方単語のそれぞれについて話題別に確率の重みを設定するものであるので、話題を分離して統計量をとることによって、ｎグラムの次数を大きくすることなく言語制約の強いｎグラムを構成することが可能となり、精度の高い音声認識装置が得られるという効果があり、話題別に出現確率を調整することが可能な音声認識装置が得られるという効果がある。
【０１２５】
この発明によれば、この発明に係る形態素解析装置は、仮名漢字混じり文字列と、仮名漢字混じり文字列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類されている漢字ｎグラムと、前記仮名漢字混じり文字列を入力する入力手段と、それぞれの話題に対応して分類されている前記漢字ｎグラムを参照して、先行単語列候補の漢字列情報により検索し、検索した先行単語以降の漢字列候補の部分列に、前方一致する後方単語があるか否かのチェックし、前記入力手段が出力する仮名漢字混じり文字列に対応する各単語候補の単語生起確率を算出する形態素確率算出手段と、前記形態素確率算出手段にて計算された単語生起確率を用いて算出した、前記入力手段より入力された文字列に適合する単語列候補を出力する出力手段とを備え、上記形態素確率算出手段が、後方単語のそれぞれについて話題別に確率の重みを設定するものであるので、話題を分離して統計量をとることによって、ｎグラムの次数を大きくすることなく言語制約の強いｎグラムを構成することが可能となり、精度の高い形態素解析装置が得られるという効果があり、話題別に出現確率を調整することが可能な形態素解析装置が得られるという効果がある。
【０１２８】
この発明によれば、この発明に係る仮名漢字変換装置は、仮名文字列と、仮名文字列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類されている仮名ｎグラムと、前記仮名文字列を入力する入力手段と、それぞれの話題に対応して分類されている前記仮名ｎグラムを参照して、検索した先行単語以降の仮名列候補の部分列に、前方一致する後方単語があるか否かのチェックし、前記入力手段が出力する仮名文字列に対応する各単語候補の単語生起確率を算出する漢字確率算出手段と、前記漢字確率算出手段にて計算された単語生起確率を用いて算出された、前記入力手段より入力された仮名文字列に適合する単語列候補を出力する出力手段とを備え、上記漢字確率算出手段が、後方単語のそれぞれについて話題別に確率の重みを設定するものであるので、話題を分離して統計量をとることによって、ｎグラムの次数を大きくすることなく言語制約の強いｎグラムを構成することが可能となり、精度の高い仮名漢字変換装置が得られるという効果があり、話題別に出現確率を調整することが可能な仮名漢字変換装置が得られるという効果がある。
【０１３１】
この発明によれば、この発明に係る音声認識方法は、入力される音声の取り込みを行うステップと、取り込まれた前記音声を音韻に変換するステップと、前記音声より変換された各音韻に対応する音韻生起確率を計算して、音韻列候補を出力するステップと、対象言語の音韻列と、音韻列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類された音韻ｎグラムを参照して、後方単語のそれぞれについて話題別に確率の重みを設定して、先行単語列候補の音韻列情報により検索し、検索した先行単語以降の音韻列候補の部分列に、前方一致する後方単語があるか否かをチェックして、算出された前記音韻列候補に対応する各単語候補の単語生起確率を算出するステップと、前記音韻生起確率と単語生起確率を用いて、入力された前記音声に類似する単語列候補を算出するステップとを備えたものであるので、話題を分離して統計量をとることによって、ｎグラムの次数を大きくすることなく言語制約の強いｎグラムを構成することができ、高精度の音声認識方法が得られるという効果がある。
【０１３２】
この発明によれば、この発明に係る形態素解析方法は、入力される仮名漢字混じり文字列の取り込みを行うステップと、仮名漢字混じり文字列と、仮名漢字混じり文字列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類された漢字ｎグラムを参照して、後方単語のそれぞれについて話題別に確率の重みを設定して、先行単語列候補の漢字列情報により検索し、検索した先行単語以降の漢字列候補の部分列に、前方一致する後方単語があるか否かのチェックして、取り込まれた前記仮名漢字混じり文字列に対応する各単語候補の単語生起確率を算出するステップと、算出された前記単語生起確率を用いて、入力された前記仮名漢字混じり文字列に適合する単語列候補を算出するステップとを備えたものであるので、ｎグラムの次数を大きくすることなく言語制約の強いｎグラムを構成することができ、高精度の形態素解析方法が得られるという効果がある。
【０１３３】
この発明によれば、この発明に係る仮名漢字変換方法は、入力される仮名文字列の取り込みを行うステップと、仮名文字列と、仮名文字列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類された仮名ｎグラムを参照して、後方単語のそれぞれについて話題別に確率の重みを設定して、検索した先行単語以降の仮名列候補の部分列に、前方一致する後方単語があるか否かのチェックして、取り込まれた前記仮名文字列に対応する各単語候補の単語生起確率を算出するステップと、算出された前記単語生起確率を用いて、入力された前記仮名文字列に適合する単語列候補を算出するステップとを備えたものであるので、ｎグラムの次数を大きくすることなく言語制約の強いｎグラムを構成することができ、高精度の仮名漢字変換方法が得られるという効果がある。
【０１３４】
この発明によれば、この発明に係る記録媒体は、入力される音声の取り込みを行うステップと、取り込まれた前記音声を音韻に変換するステップと、前記音声より変換された各音韻に対応する音韻生起確率を計算して、音韻列候補を出力するステップと、対象言語の音韻列と、音韻列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類された音韻ｎグラムを参照して、後方単語のそれぞれについて話題別に確率の重みを設定して、先行単語列候補の音韻列情報により検索し、検索した先行単語以降の音韻列候補の部分列に、前方一致する後方単語があるか否かをチェックして、算出された前記音韻列候補に対応する各単語候補の単語生起確率を算出するステップと、前記音韻生起確率と単語生起確率を用いて、入力された前記音声に類似する単語列候補を算出するステップとを有する音声認識方法を、コンピュータに実行させるためのプログラムを記録したコンピュータ読み取り可能に記録したものであるので、音声認識方法を高精度に実現するためのプログラムが記録された記録媒体が得られるという効果がある。
【０１３５】
この発明によれば、この発明に係る記録媒体は、入力される仮名漢字混じり文字列の取り込みを行うステップと、仮名漢字混じり文字列と、仮名漢字混じり文字列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類された漢字ｎグラムを参照して、後方単語のそれぞれについて話題別に確率の重みを設定して、先行単語列候補の漢字列情報により検索し、検索した先行単語以降の漢字列候補の部分列に、前方一致する後方単語があるか否かのチェックして、取り込まれた前記仮名漢字混じり文字列に対応する各単語候補の単語生起確率を算出するステップと、算出された前記単語生起確率を用いて、入力された前記仮名漢字混じり文字列に適合する単語列候補を算出するステップとを有する形態素解析方法を、コンピュータに実行させるためのプログラムを記録したコンピュータ読み取り可能に記録したものであるので、形態素解析方法を高精度に実現するためのプログラムが記録された記録媒体が得られるという効果がある。
【０１３６】
この発明によれば、この発明に係る記録媒体は、入力される仮名文字列の取り込みを行うステップと、仮名文字列と、仮名文字列に対応する単語表記列と、生起確率とを記憶し、記憶している単語がそれぞれの話題に対応して分類された仮名ｎグラムを参照して、後方単語のそれぞれについて話題別に確率の重みを設定して、検索した先行単語以降の仮名列候補の部分列に、前方一致する後方単語があるか否かのチェックして、取り込まれた前記仮名文字列に対応する各単語候補の単語生起確率を算出するステップと、算出された前記単語生起確率を用いて、入力された前記仮名文字列に適合する単語列候補を算出するステップとを有する仮名漢字変換方法を、コンピュータに実行させるためのプログラムを記録したコンピュータ読み取り可能に記録したものであるので、仮名漢字変換方法を高精度に実現するためのプログラムが記録された記録媒体が得られるという効果がある。
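[Brief Description of the Drawings]
[FIG. 1] A block diagram showing the configuration of a speech recognition device according to Embodiment 1 of the invention.
[FIG. 2] An explanatory diagram showing an example sentence analyzed by the speech recognition device of Embodiment 1.
[FIG. 3] An explanatory diagram showing a concrete example of the phoneme n-gram used for analysis in the speech recognition device of Embodiment 1.
[FIG. 4] A flowchart showing the outline of the speech recognition operation in the speech recognition device of Embodiment 1.
[FIG. 5] A block diagram showing the configuration of a speech recognition device according to Embodiment 2 of the invention.
[FIG. 6] An explanatory diagram showing a concrete example of the phoneme n-gram used for analysis in the speech recognition device of Embodiment 2.
[FIG. 7] A flowchart showing the outline of the speech recognition operation in the speech recognition device of Embodiment 2.
[FIG. 8] A block diagram showing the configuration of a speech recognition device according to Embodiment 3 of the invention.
[FIG. 9] A flowchart showing the outline of the speech recognition operation in the speech recognition device of Embodiment 3.
[FIG. 10] A block diagram showing the configuration of a morphological analysis device according to Embodiment 4 of the invention.
[FIG. 11] An explanatory diagram showing a concrete example of the kanji n-gram used for analysis in the morphological analysis device of Embodiment 4.
[FIG. 12] A flowchart showing the outline of the morphological analysis operation in the morphological analysis device of Embodiment 4.
[FIG. 13] A block diagram showing the configuration of a morphological analysis device according to Embodiment 5 of the invention.
[FIG. 14] An explanatory diagram showing a concrete example of the kanji n-gram used for analysis in the morphological analysis device of Embodiment 5.
[FIG. 15] A flowchart showing the outline of the morphological analysis operation in the morphological analysis device of Embodiment 5.
[FIG. 16] A block diagram showing the configuration of a morphological analysis device according to Embodiment 6 of the invention.
[FIG. 17] A flowchart showing the outline of the morphological analysis operation in the morphological analysis device of Embodiment 6.
[FIG. 18] A block diagram showing the configuration of a kana-kanji conversion device according to Embodiment 7 of the invention.
[FIG. 19] An explanatory diagram showing a concrete example of the kana n-gram used for analysis in the kana-kanji conversion device of Embodiment 7.
[FIG. 20] A flowchart showing the outline of the kana-kanji conversion operation in the kana-kanji conversion device of Embodiment 7.
[FIG. 21] A block diagram showing the configuration of a kana-kanji conversion device according to Embodiment 8 of the invention.
[FIG. 22] An explanatory diagram showing a concrete example of the kana n-gram used for analysis in the kana-kanji conversion device of Embodiment 8.
[FIG. 23] A flowchart showing the outline of the kana-kanji conversion operation in the kana-kanji conversion device of Embodiment 8.
[FIG. 24] A block diagram showing the configuration of a kana-kanji conversion device according to Embodiment 9 of the invention.
[FIG. 25] A flowchart showing the outline of the kana-kanji conversion operation in the kana-kanji conversion device of Embodiment 9.
[FIG. 26] A block diagram showing the configuration of a conventional speech recognition device.
[FIG. 27] A flowchart showing the outline of the speech recognition operation in the conventional speech recognition device.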
【符号の説明】
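1 microphone (input means), 2 phoneme probability calculation means, 5 RAM, 6 output means, 7 baseball-topic phoneme n-gram (phoneme n-gram), 8 general-topic phoneme n-gram (phoneme n-gram), 9 word probability calculation means, 10 example sentence, 11 phoneme n-gram, 12 word probability calculation means, 13 baseball-topic phoneme n-gram (phoneme n-gram), 14 general-topic phoneme n-gram (phoneme n-gram), 15 phoneme n-gram, 16 word probability calculation means, 17 file input device (input means), 18 baseball-topic kanji n-gram (kanji n-gram), 19 general-topic kanji n-gram (kanji n-gram), 20 morpheme probability calculation means, 21 output means, 22 kanji n-gram, 23 morpheme probability calculation means, 24 baseball-topic kanji n-gram (kanji n-gram), 25 general-topic kanji n-gram (kanji n-gram), 26 kanji n-gram, 27 morpheme probability calculation means, 28 keyboard (input means), 29 baseball-topic kana n-gram (kana n-gram), 30 general-topic kana n-gram (kana n-gram), 31 kanji probability calculation means, 32 output means, 33 kana n-gram, 34 kanji probability calculation means, 35 baseball-topic kana n-gram (kana n-gram), 36 general-topic kana n-gram (kana n-gram), 37 kana n-gram, 38 kanji probability calculation means.
[0001]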
BACKGROUND OF THE INVENTION
The present invention relates to a speech recognition device, a morphological analysis device, and a kana-kanji conversion device that perform speech recognition, morphological analysis, or kana-kanji conversion based on n-grams, that is, the joint occurrence probabilities of characters or words in a target language obtained from natural language statistics; to the corresponding speech recognition method, morphological analysis method, and kana-kanji conversion method; and to recording media on which programs for these methods are recorded. In particular, it relates to improving analysis accuracy by handling n-gram statistics separately for each topic.
[0002]
[Prior art]
Analysis techniques based on natural language statistics are applied to many kinds of document processing. For example, Japanese input by speech recognition is useful as a means of entering documents, and further improvement in recognition accuracy is desired. To recognize speech accurately, methods that use natural language statistics as the language model, in the form of n-grams (the joint occurrence probabilities of characters or words in the target language), have attracted attention. However, the constraint imposed by an n-gram depends on its order n, so the constraint weakens as n becomes smaller. Conversely, increasing the order n makes the table of frequency counts enormous, which is a serious problem, and a very large collection of example sentences is needed to obtain reliable statistics. As a compression method for coping with this growth of the n-gram table in speech recognition, the one shown in Published Japanese Translation of PCT Application No. 10-501078, for example, has been proposed.
[0003]
A conventional analysis technique using natural language statistics will be described below. FIG. 26 is a block diagram showing a configuration example of a speech recognition apparatus to which a conventional analysis method is applied in order to obtain a recognition result “Third-floor teacher” from spoken “saNkainoseNsee”. In the figure, 1 is a microphone, 2 is a phoneme probability calculation means, 3 is a word prediction means, 4 is an n-gram table (in this case, a 3-gram table), 5 is a RAM for storing information, and 6 is an output means.
[0004]
Hereinafter, generation of word string candidates will be described.
A word string candidate is obtained by finding the word string W that maximizes the probability P(W|Y) of the word string, where W is the spoken word string and Y is the phoneme string. This probability P(W|Y) is given by the following equation (1).
[0005]
[Expression 1]
$$P(W \mid Y) = \frac{P(Y \mid W)\,P(W)}{P(Y)} \qquad (1)$$
[0006]
To generate word string candidates, it suffices to find the word string W that maximizes the probability P(W|Y), as described above. Because the probability P(Y) on the right side of equation (1) is common to all word strings W, it can be omitted, and the word string W that maximizes P(Y|W)P(W) is sought instead. Here, P(Y|W) is the appearance probability of the phoneme string Y given the word string W, and P(W) is the appearance probability of the word string W.
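The decision rule described above can be illustrated with a minimal Python sketch. The candidate names and probability values are made-up placeholders, not figures from this specification, and P(Y) is approximated by summing over the listed candidates only; the sketch merely shows that dropping the common factor P(Y) leaves the selected word string unchanged.

```python
# Hypothetical candidates: (word string, P(Y|W), P(W)). Values are illustrative only.
CANDIDATES = [
    ("W_a", 0.9, 1e-4),
    ("W_b", 0.1, 5e-4),
]


def best_by_joint(candidates):
    """Pick the word string that maximizes P(Y|W) * P(W)."""
    return max(candidates, key=lambda c: c[1] * c[2])[0]


def best_by_posterior(candidates):
    """Pick the word string that maximizes P(W|Y) = P(Y|W) * P(W) / P(Y)."""
    # Assumption: P(Y) is the sum of P(Y|W) * P(W) over the listed candidates.
    p_y = sum(p_yw * p_w for _, p_yw, p_w in candidates)
    return max(candidates, key=lambda c: c[1] * c[2] / p_y)[0]
```

Both functions return the same candidate because dividing every score by the same positive constant does not change their order.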
[0007]
Here, when the phoneme string Y corresponding to the word string W at times t = 1, 2, ..., L is defined by the following equation (2), the appearance probability P(Y|W) of the phoneme string Y is obtained from the phoneme probability $P(y_i)$ of each phoneme $y_i$ shown in equation (2) by the following equation (3).
[0008]
[Expression 2]
$$Y = y_1, y_2, \ldots, y_L \qquad (2)$$
$$P(Y \mid W) = \prod_{i=1}^{L} P(y_i) \qquad (3)$$
[0009]
In addition, when the word string W consisting of m words is defined by the following equation (4), the appearance probability P(W) of the word string W is approximated by the following equation (5) based on word 3-gram probabilities, independently of the phoneme probabilities, that is, the appearance probabilities of the phonemes $y_i$ in equation (2). In equation (5), when i is 1 or 2, the symbol (#) is substituted for whichever of $w_{i-1}$ and $w_{i-2}$ falls before the start of the word string.
[0010]
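[Expression 3]
$$W = w_1, w_2, \ldots, w_m \qquad (4)$$
$$P(W) \approx \prod_{i=1}^{m} P(w_i \mid w_{i-2}, w_{i-1}) \qquad (5)$$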
[0011]
By the above calculation, the word string W that maximizes the word string probability P(W|Y) is found for those phoneme string candidates whose word strings appear in the 3-gram index. The appearance probability of each word is calculated from the frequency values stored in advance in the word 3-gram table 4 shown in FIG. 26.
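As a rough illustration of the 3-gram approximation of P(W), the following Python sketch multiplies 3-gram probabilities over a word string and pads the missing history of the first two words with "#". The table contents and the function name are hypothetical, and unseen 3-grams simply receive probability 0 (no smoothing), which is an assumption of this sketch rather than something stated in the specification.

```python
def word_string_probability(words, trigram_table, pad="#"):
    """Approximate P(W) as a product of 3-gram probabilities."""
    # Two pad symbols supply the missing history for the first and second words.
    history = [pad, pad]
    probability = 1.0
    for word in words:
        key = (history[-2], history[-1], word)
        # Assumption: an unseen 3-gram contributes probability 0 (no smoothing).
        probability *= trigram_table.get(key, 0.0)
        history.append(word)
    return probability


# Toy table with made-up probabilities, keyed by (w_{i-2}, w_{i-1}, w_i).
TOY_TRIGRAMS = {
    ("#", "#", "a"): 0.5,
    ("#", "a", "b"): 0.2,
    ("a", "b", "c"): 0.1,
}
```

With this toy table, the word string a, b, c scores 0.5 × 0.2 × 0.1, while any string containing a 3-gram absent from the table scores 0.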
[0012]
The calculated word string W is output from the output means 6 as a recognition result.
[0013]
Next, the operation will be described.
FIG. 27 is a flowchart showing the outline of the speech recognition operation in the conventional speech recognition device. The speech recognition process starts in step ST1 when the user speaks into the microphone 1. When the spoken voice is input in step ST2, the microphone 1 converts it into an electrical signal in step ST3. Next, in step ST4, the phoneme probability calculation means 2 A/D-converts and quantizes the electrical signal from the microphone 1, performs spectrum analysis, concatenates the recognition results separated into syllable units, and stores them in the RAM 5 as phoneme string candidates.
[0014]
Thereafter, in step ST5, the word prediction means 3 takes one of the phoneme string candidates out of the RAM 5 and initializes the leading word string. Next, in step ST6, the 3-gram table 4 is searched for the corresponding 3-gram information using this as the search key, and in step ST7 the probability of the three-word chain is calculated from the retrieved 3-gram information. In step ST8, the word string W with the highest probability for that phoneme string candidate, determined from these probability values, is stored in the RAM 5.
[0015]
Next, in step ST9, the above calculation is performed on all the phoneme string candidates stored in the RAM 5, and the word string W and the phoneme string candidate with the highest probability are selected and output from the output means 6. Then, the process proceeds to step ST10, and this series of speech recognition processing is terminated.
[0016]
As described above, a word string W having a high probability of similarity to an utterance is obtained.
[0017]
In addition to the above-mentioned Published Japanese Translation of PCT Application No. 10-501078, documents describing related conventional speech recognition devices include Japanese Laid-Open Patent Publication No. 62-19999, in which a topic obtained from the recognition result during speech recognition is used for the next recognition; Japanese Laid-Open Patent Publication No. 63-219067, in which a dictionary is selected using a topic extracted by dictionary search in order to improve search accuracy; and Japanese Laid-Open Patent Publication No. 6-342298, which uses syntactic constraints to impose stronger constraints than an n-gram model.
[0018]
[Problems to be solved by the invention]
Because the conventional speech recognition device is configured as described above, raising the order of the n-gram strengthens the language constraint but makes the n-gram table 4 huge and, in practice, requires an enormous collection of example sentences to gather sufficient statistics for the table. Lowering the order, on the other hand, weakens the language constraint and reduces analysis accuracy. For example, suppose that the phrase "No. 3 arch preemption" is input to this speech recognition device, that the n-gram statistics are extracted from a large amount of newspaper data, and that, for simplicity, the order n is 2 and the phrase is segmented into "No. 3", "arch", "no" (a particle), and "preemption". Even if "No. 3 arch" is analyzed correctly, statistics taken over the entire newspaper give "teacher", which has the same pronunciation (seNsee) as "preemption", a higher probability as the word following "no". A recognition error such as "No. 3 arch teacher" is therefore likely. The correct result could be obtained by increasing the order n, but this enlarges the n-gram table 4 and the required collection of example sentences, as described above.
[0019]
The present invention has been made to solve the above problems. Its object is to obtain devices that perform speech recognition, morphological analysis, or kana-kanji conversion, methods for the same, and recording media on which the corresponding programs are recorded, in which a strong constraint can be applied while the order of the n-gram is kept small, and a stronger constraint can be applied for the same order.
[0020]
[Means for Solving the Problems]
The speech recognition device according to the present invention comprises: phoneme n-grams that store phoneme strings of the target language, the word notation strings corresponding to the phoneme strings, and occurrence probabilities, and in which the stored words are classified by topic; input means for inputting speech; phoneme probability calculation means for converting the speech signal produced by the input means into phonemes, calculating the phoneme occurrence probability of each phoneme, and outputting phoneme string candidates; word probability calculation means for referring to the topic-classified phoneme n-grams, searching them with the phoneme string information of a preceding word string candidate, checking whether the subsequence of the phoneme string candidate following the retrieved preceding word contains a backward word that forward-matches it, and calculating the word occurrence probability of each word candidate corresponding to the phoneme string candidates output by the phoneme probability calculation means; and output means for outputting word string candidates similar to the speech input from the input means, calculated using the phoneme occurrence probabilities calculated by the phoneme probability calculation means and the word occurrence probabilities calculated by the word probability calculation means. The word probability calculation means sets a probability weight by topic for each backward word.
[0023]
The morphological analysis device according to the present invention comprises: kanji n-grams that store kana-kanji mixed character strings, the word notation strings corresponding to them, and occurrence probabilities, and in which the stored words are classified by topic; input means for inputting a kana-kanji mixed character string; morpheme probability calculation means for referring to the topic-classified kanji n-grams, searching them with the kanji string information of a preceding word string candidate, checking whether the subsequence of the kanji string candidate following the retrieved preceding word contains a backward word that forward-matches it, and calculating the word occurrence probability of each word candidate corresponding to the kana-kanji mixed character string output by the input means; and output means for outputting word string candidates that fit the character string input from the input means, calculated using the word occurrence probabilities calculated by the morpheme probability calculation means. The morpheme probability calculation means sets a probability weight by topic for each backward word.
[0026]
The kana-kanji conversion device according to the present invention comprises: kana n-grams that store kana character strings, the word notation strings corresponding to them, and occurrence probabilities, and in which the stored words are classified by topic; input means for inputting a kana character string; kanji probability calculation means for referring to the topic-classified kana n-grams, checking whether the subsequence of the kana string candidate following the retrieved preceding word contains a backward word that forward-matches it, and calculating the word occurrence probability of each word candidate corresponding to the kana character string output by the input means; and output means for outputting word string candidates that fit the kana character string input from the input means, calculated using the word occurrence probabilities calculated by the kanji probability calculation means. The kanji probability calculation means sets a probability weight by topic for each backward word.
[0029]
The speech recognition method according to the present invention comprises: a step of capturing input speech; a step of converting the captured speech into phonemes; a step of calculating the phoneme occurrence probability of each phoneme converted from the speech and outputting phoneme string candidates; a step of referring to phoneme n-grams, which store phoneme strings of the target language, the word notation strings corresponding to them, and occurrence probabilities, and in which the stored words are classified by topic, setting a probability weight by topic for each backward word, searching with the phoneme string information of a preceding word string candidate, checking whether the subsequence of the phoneme string candidate following the retrieved preceding word contains a backward word that forward-matches it, and calculating the word occurrence probability of each word candidate corresponding to the calculated phoneme string candidates; and a step of calculating, using the phoneme occurrence probabilities and the word occurrence probabilities, word string candidates similar to the input speech.
[0030]
The morphological analysis method according to the present invention comprises: a step of capturing an input kana-kanji mixed character string; a step of referring to kanji n-grams, which store kana-kanji mixed character strings, the word notation strings corresponding to them, and occurrence probabilities, and in which the stored words are classified by topic, setting a probability weight by topic for each backward word, searching with the kanji string information of a preceding word string candidate, checking whether the subsequence of the kanji string candidate following the retrieved preceding word contains a backward word that forward-matches it, and calculating the word occurrence probability of each word candidate corresponding to the captured kana-kanji mixed character string; and a step of calculating, using the calculated word occurrence probabilities, word string candidates that fit the input kana-kanji mixed character string.
[0031]
The kana-kanji conversion method according to the present invention comprises: a step of capturing an input kana character string; a step of referring to kana n-grams, which store kana character strings, the word notation strings corresponding to them, and occurrence probabilities, and in which the stored words are classified by topic, setting a probability weight by topic for each backward word, checking whether the subsequence of the kana string candidate following the retrieved preceding word contains a backward word that forward-matches it, and calculating the word occurrence probability of each word candidate corresponding to the captured kana character string; and a step of calculating, using the calculated word occurrence probabilities, word string candidates that fit the input kana character string.
[0032]
The recording medium according to the present invention is a computer-readable recording medium on which is recorded a program for causing a computer to execute a speech recognition method comprising: a step of capturing input speech; a step of converting the captured speech into phonemes; a step of calculating the phoneme occurrence probability of each phoneme converted from the speech and outputting phoneme string candidates; a step of referring to phoneme n-grams, which store phoneme strings of the target language, the word notation strings corresponding to them, and occurrence probabilities, and in which the stored words are classified by topic, setting a probability weight by topic for each backward word, searching with the phoneme string information of a preceding word string candidate, checking whether the subsequence of the phoneme string candidate following the retrieved preceding word contains a backward word that forward-matches it, and calculating the word occurrence probability of each word candidate corresponding to the calculated phoneme string candidates; and a step of calculating, using the phoneme occurrence probabilities and the word occurrence probabilities, word string candidates similar to the input speech.
[0033]
The recording medium according to the present invention is a computer-readable recording medium on which is recorded a program for causing a computer to execute a morphological analysis method comprising: a step of capturing an input kana-kanji mixed character string; a step of referring to kanji n-grams, which store kana-kanji mixed character strings, the word notation strings corresponding to them, and occurrence probabilities, and in which the stored words are classified by topic, setting a probability weight by topic for each backward word, searching with the kanji string information of a preceding word string candidate, checking whether the subsequence of the kanji string candidate following the retrieved preceding word contains a backward word that forward-matches it, and calculating the word occurrence probability of each word candidate corresponding to the captured kana-kanji mixed character string; and a step of calculating, using the calculated word occurrence probabilities, word string candidates that fit the input kana-kanji mixed character string.
[0034]
The recording medium according to the present invention is a computer-readable recording medium on which is recorded a program for causing a computer to execute a kana-kanji conversion method comprising: a step of capturing an input kana character string; a step of referring to kana n-grams, which store kana character strings, the word notation strings corresponding to them, and occurrence probabilities, and in which the stored words are classified by topic, setting a probability weight by topic for each backward word, checking whether the subsequence of the kana string candidate following the retrieved preceding word contains a backward word that forward-matches it, and calculating the word occurrence probability of each word candidate corresponding to the captured kana character string; and a step of calculating, using the calculated word occurrence probabilities, word string candidates that fit the input kana character string.
[0035]
DETAILED DESCRIPTION OF THE INVENTION
An embodiment of the present invention will be described below.
Embodiment 1
FIG. 1 is a block diagram showing the configuration of a speech recognition device according to Embodiment 1 of the present invention. In the figure, 1 is a microphone serving as input means for inputting speech, and 2 is phoneme probability calculation means that converts the speech signal input from the microphone 1 into phonemes, calculates the phoneme occurrence probability of each phoneme, and generates phoneme string candidates. These are equivalent to the conventional components shown with the same reference numerals in FIG. 26.
[0036]

Reference numerals 7 and 8 denote phoneme n-grams that store the phoneme strings of the target language, the word notation strings corresponding to the phoneme strings, and occurrence probabilities; in these phoneme n-grams the words are classified by topic. Here, the phoneme n-gram 7 is a baseball-topic phoneme n-gram that stores entries for the baseball topic, and the phoneme n-gram 8 is a general-topic phoneme n-gram that stores entries for general topics. Reference numeral 9 denotes word probability calculation means that refers to the baseball-topic phoneme n-gram 7 and the general-topic phoneme n-gram 8 and calculates the word occurrence probability of each word candidate corresponding to the phoneme string candidates output by the phoneme probability calculation means 2.
[0037]
Reference numeral 5 denotes a RAM that stores information on the processing in progress, and reference numeral 6 denotes output means that uses the phoneme occurrence probabilities calculated by the phoneme probability calculation means 2 and the word occurrence probabilities calculated by the word probability calculation means 9 to obtain and output word string candidates similar to the speech input from the microphone 1. The RAM 5 and the output means 6 are also equivalent to the conventional components shown with the same reference numerals in FIG. 26.
[0038]
Hereinafter, generation of word string candidates will be described.
In the first embodiment, as in the conventional case, a word string candidate is obtained by finding the word string W that maximizes the probability P(W|Y) given by equation (1) in the description of the conventional speech recognition device, where W is the spoken word string and Y is the phoneme string. Because the probability P(Y) on the right side of equation (1) is common to all word strings W, it can be omitted, and the word string W that maximizes P(Y|W)P(W) is sought instead.
[0039]
When the phoneme string Y corresponding to the word string W at times t = 1, 2, ..., L is defined by equation (2) used in the description of the conventional speech recognition device, the appearance probability P(Y|W) of the phoneme string Y is obtained from the phoneme probability $P(y_i)$ of each phoneme $y_i$ of the phoneme string Y by equation (3) in that description. When the word string W consisting of m words is defined by equation (4) in that description, the appearance probability P(W) of the word string W is obtained, independently of the phoneme probabilities $P(y_i)$, from the following equation (6). Here, n in equation (6) is the order of the phoneme n-gram.
[0040]
[Expression 4]
$$P(W) \approx \prod_{i=1}^{m} P(w_i \mid w_{i-n+1}, \ldots, w_{i-1}) \qquad (6)$$
[0041]
By the above calculation, the word string W that maximizes the word string probability P(W|Y) is found for those phoneme string candidates whose word strings appear in the baseball-topic phoneme n-gram 7 and the general-topic phoneme n-gram 8. The combinations may be evaluated at high speed using, for example, the Viterbi or stack decoding methods described in Seiichi Nakagawa, "Speech recognition by probability model", and the probabilities may be converted to logarithms so that the product in the formula is computed as a sum. The appearance probability of each word uses the values stored in advance in the baseball-topic phoneme n-gram 7 and the general-topic phoneme n-gram 8.
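The remark about taking logarithms can be illustrated with a short Python sketch. The probability values are arbitrary and the function name is hypothetical; the point is only that a long product of small probabilities underflows in double precision, whereas the equivalent sum of logarithms stays finite and preserves the ranking of candidates.

```python
import math


def log_likelihood(probabilities):
    """Sum of log-probabilities; equal to the logarithm of their product."""
    return sum(math.log(p) for p in probabilities)


# Illustrative values only: a long chain of small probabilities.
chain = [1e-5] * 100
direct_product = math.prod(chain)   # underflows to 0.0 in double precision
log_score = log_likelihood(chain)   # remains a finite negative number
```

Because the logarithm is monotonic, comparing `log_score` values selects the same candidate as comparing the products would if they could be represented.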
[0042]
Here, FIG. 2 is an explanatory diagram showing example sentences analyzed by the speech recognition apparatus, and in the figure, 10 is the example sentence. FIG. 3 is an explanatory diagram showing a specific example of a phoneme n-gram used for the analysis of the example sentence 10, and in the figure, 11 is the phoneme n-gram. The phoneme n-gram 11 contains a baseball topic phoneme n-gram 7 and a general topic phoneme n-gram 8.
[0043]
As shown in FIG. 3, the baseball-topic phoneme n-gram 7 and the general-topic phoneme n-gram 8 in the phoneme n-gram 11 each contain 2-grams and 1-grams, and in both the phoneme string serves as the key. In a 2-gram, the front morpheme, the back morpheme, and a probability are recorded for each key phoneme string. The probability recorded here is the probability that the back morpheme follows the front morpheme, that is, the 2-gram occurrence probability. In a 1-gram, the morpheme that connects next (the next-connected morpheme) and a probability are recorded for each key phoneme string. This 1-gram probability is the occurrence probability of the morpheme itself. A morpheme is represented by the combination of its notation, phoneme notation, heading reading, and part of speech.
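One way to picture the table layout described above is the following Python sketch. It is only an assumed in-memory representation, not the storage format of the specification: the class name `Morpheme`, the dictionary layout, and the 1-gram probability values are hypothetical, while the 2-gram probability 0.01 for "No. 3" following the sentence head in the baseball topic is the value used later in the first embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    notation: str        # written form
    phonemes: str        # phoneme notation
    reading: str         # heading reading
    part_of_speech: str


HEAD = Morpheme("#", "#", "#", "sentence head")
NO3 = Morpheme("No. 3", "saNgoo", "sango", "noun")

# One table per topic; each holds 2-grams and 1-grams keyed by phoneme string.
PHONEME_NGRAMS = {
    "baseball": {
        "2gram": {"#saNgoo": [(HEAD, NO3, 0.01)]},  # (front, back, probability)
        "1gram": {"saNgoo": [(NO3, 0.001)]},        # probability value assumed
    },
    "general": {
        "2gram": {},
        "1gram": {"saNgoo": [(NO3, 0.0005)]},       # probability value assumed
    },
}


def lookup(topic, order, phoneme_key):
    """Return the entries stored under a phoneme-string key, or an empty list."""
    return PHONEME_NGRAMS[topic][order].get(phoneme_key, [])
```

Keeping one table per topic lets the same phoneme key carry different probabilities in different topics, which is the property the embodiment relies on.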
[0044]
The calculated word string W is output from the output means 6 as a recognition result.
[0045]
Next, the operation will be described.
FIG. 4 is a flowchart showing a schematic operation flow of the recognition processing in the speech recognition apparatus according to the first embodiment. This voice recognition process is started by speaking to the microphone 1 in step ST101. When the spoken voice is input in step ST102, the microphone 1 converts the input voice into an electrical signal in step ST103, and takes it in as analog data.
[0046]
Next, in step ST104, the phoneme probability calculation means 2 A/D-converts and quantizes the analog data captured by the microphone 1, performs spectrum analysis, and outputs the recognition results, separated into syllable units, as phoneme string candidates. The processing relies on various well-known techniques, as described for example in Seiichi Nakagawa, "Speech recognition by probability model", so its details are omitted here. Each phoneme string candidate expresses, as a probability value, how likely each phoneme is for the analog data captured from the microphone 1. It is output as a pair consisting of the concatenated phoneme chain and the acoustic likelihood of that chain, and is stored in the RAM 5. This acoustic likelihood is the maximum value of the appearance probability P(Y|W) of the phoneme string Y.
[0047]
In the first embodiment, it is assumed that the following is output as the phoneme chain and the acoustic likelihood of the chain.
# saNgooacinoseNsee # 0.9
# saNgoocinoseNse # 0.1
[0048]
For acoustic likelihood, a logarithmic probability or the like may be used in addition to the probability, and an efficient storage method such as a lattice may be used for the phoneme chain.
[0049]
Next, in step ST105, the word probability calculation means 9 takes one phoneme string candidate and its acoustic likelihood, output by the phoneme probability calculation means 2, out of the RAM 5 and performs initialization. In this initialization, the null word "{# # # sentence head}" and its probability value 1 are stored in the RAM 5 as the preceding word string candidate and its initial language likelihood. Here, "# saNgooacinoseNsee #" is taken out first as the phoneme string candidate.
[0050]
Next, in step ST106, the word probability calculation means 9 checks whether all preceding word string candidates correspond to the terminal phonemes of the phoneme string candidates, and if they all correspond, the process proceeds to step ST112 described later. If not, the process from step ST107 is performed.
[0051]
In step ST107, one preceding word string candidate is taken out of the RAM 5. In the first embodiment, "{# # # sentence head}" is taken out first as the preceding word string candidate.
[0052]
Next, in step ST108, the phoneme n-gram 11 is searched with the phoneme string information of the preceding word string candidate. In the first embodiment, the search is made first with the initial preceding word string "{# # # sentence head}". It is then checked whether the subsequence of the phoneme string candidate following the retrieved preceding word contains a backward word that forward-matches it. If there is no such backward word, the process returns to step ST106; if there is, the process proceeds to step ST109 and the subsequent steps.
[0053]
Here, in the first embodiment, the phoneme n-gram 11 is searched for backward words of the preceding word string "{# # # sentence head}": words that partially match the phoneme string from the head of "saNgooa..." following "#" are retrieved and used as backward words. Among the 2-grams, "#saNgoo" forward-matches the phoneme string "#saNgooa...", so the back morpheme "{baseball: No. 3 saNgoo sango noun}" of that 2-gram is one of the backward word candidates. The 1-gram "{baseball: No. 3 saNgoo sango noun}" also matches the phoneme string and is therefore a candidate. Furthermore, "{general: No. 3 saNgoo sango noun}" is also a candidate.
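The forward-match check described above amounts to a prefix test on the remaining phoneme string. The following Python sketch shows one possible form, assuming a toy 1-gram lexicon; the function name, the lexicon layout, and the entries for "saN" and "seNsee" are hypothetical and are not taken from FIG. 3.

```python
def find_backward_words(remaining_phonemes, lexicon):
    """Return lexicon entries whose phoneme key is a prefix of the remaining input."""
    return [
        (key, entry)
        for key, entries in lexicon.items()
        for entry in entries
        if remaining_phonemes.startswith(key)
    ]


# Toy 1-gram lexicon: phoneme key -> list of (topic, notation). Contents are assumed.
LEXICON = {
    "saNgoo": [("baseball", "No. 3"), ("general", "No. 3")],
    "saN": [("general", "three")],
    "seNsee": [("general", "teacher")],
}

matches = find_backward_words("saNgooacinoseNsee", LEXICON)
```

Here both topic variants of "saNgoo" and the shorter "saN" are returned as backward word candidates, while "seNsee" is not, because it does not match at the start of the remaining phoneme string.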
[0054]
In the first embodiment, partial matching is used to simplify the description. For similarity search over ambiguous phoneme chains, however, DP matching or other methods may be used, such as the one shown in Abe et al., "Two-stage search method considering the tendency of differences between the first-stage optimal solution and the correct solution", Proceedings of the Acoustical Society of Japan, 1-R-15, September 1998.
[0055]
In step ST109, the likelihood is calculated for each backward word in the same way and stored in the RAM 5; the backward word is appended to the preceding word string, and the result is newly stored in the RAM 5 as a preceding word string. For a 2-gram, the topic is the same as that of the preceding morpheme. For a 1-gram, there is no connection to the preceding morpheme, so the topic may change.
[0056]
In the first embodiment, the preceding word string "{# # # sentence head}" is replaced with "{baseball: # # # sentence head}, {baseball: No. 3 saNgoo sango noun}". The language likelihood is calculated by the following equation (7) from the probability 1 of the preceding word string "{# # # sentence head}" and the 2-gram probability 0.01 of "{baseball: # # # sentence head}, {baseball: No. 3 saNgoo sango noun}" in the baseball-topic phoneme n-gram 7.
[0057]
Probability of preceding word string × probability of n-gram = 1 × 0.01 = 0.01 (7)
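The update in equation (7), together with the topic rule of step ST109, can be sketched in Python as follows. The function names and the tuple representation of a preceding word string are assumptions made for illustration; treating a sentence head that has no topic yet as free to take any topic is also an assumption of this sketch.

```python
def extend(preceding, backward_word, ngram_probability):
    """Append a backward word; the language likelihood follows equation (7)."""
    words, likelihood = preceding
    return (words + [backward_word], likelihood * ngram_probability)


def allowed_topics(current_topic, order, all_topics):
    """2-gram: stay in the preceding morpheme's topic. 1-gram: any topic."""
    # Assumption: a preceding word with no topic yet (None) may take any topic.
    if order == 2 and current_topic is not None:
        return [current_topic]
    return list(all_topics)


# Initial preceding word string: the sentence head with language likelihood 1.
start = (["{# # # sentence head}"], 1.0)
extended = extend(start, "{baseball: No. 3 saNgoo sango noun}", 0.01)
```

Applied to the example in the text, the likelihood of the extended word string is 1 × 0.01 = 0.01.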
[0058]
Next, in step ST110, it is checked whether the entire phoneme string is covered by the word string. If it is, the process proceeds to step ST111, where the maximum likelihood and the preceding word string of the solution are stored in the RAM 5, and then returns to step ST106 to check whether all preceding word string candidates have reached the terminal phoneme of the phoneme string candidate. If it is not, the process returns directly to step ST106 to perform that check.
[0059]
If it is determined in step ST106 that all the preceding word string candidates have reached the terminal phoneme of the phoneme string candidate, the process proceeds to step ST112, where it is checked whether word matching has been completed for all phoneme string candidates. If it has not, the process returns to step ST105 and the same processing is repeated. If it has, the processing from step ST113 onward is performed.
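The loop over preceding word string candidates (steps ST106 to ST111) can be pictured with the simplified sketch below. It assumes a single 2-gram table with forward matching only; the table contents, the word "coral", the phoneme string, and the function name `search` are hypothetical and are not taken from the patent figures.

```python
# Hypothetical 2-gram table: previous word -> [(word, phoneme key, probability)].
BIGRAMS = {
    "<s>": [("No. 3", "saNgoo", 0.01), ("coral", "saNgo", 0.02)],
    "No. 3": [("arch", "aaci", 0.1)],
}


def search(phonemes, bigrams):
    """Return (best word string, its likelihood) covering the whole phoneme string."""
    best_words, best_likelihood = None, 0.0
    # Each pending item: (preceding word string, phonemes consumed, likelihood).
    pending = [(("<s>",), 0, 1.0)]
    while pending:                      # ST106: any candidates left to extend?
        words, pos, likelihood = pending.pop()          # ST107
        for word, key, prob in bigrams.get(words[-1], []):
            # ST108: forward match against the remaining phoneme string.
            if not key or not phonemes.startswith(key, pos):
                continue
            new = (words + (word,), pos + len(key), likelihood * prob)  # ST109
            if new[1] == len(phonemes):                 # ST110: whole string covered
                if new[2] > best_likelihood:            # ST111: keep the maximum
                    best_words, best_likelihood = new[0], new[2]
            else:
                pending.append(new)
    return best_words, best_likelihood


words, likelihood = search("saNgooaaci", BIGRAMS)
print(words, likelihood)
```

The "coral" branch forward-matches at first but is dropped because no backward word continues it, which is the same dead-end behaviour the flowchart handles by returning to step ST106.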
[0060]
In the first embodiment, according to the above processing, “{### sentence head}, {baseball: No. 3 saNgoo noun}, {baseball: arch aaci ach noun}, {baseball: no no particle}, ...
[0061]
In step ST113, the word string of the solution having the maximum likelihood stored in the RAM 5 is read out. The maximum likelihood is approximated by the maximum value of the product of the language likelihood and the acoustic likelihood. In the first embodiment, as a result of the calculation, the phoneme string candidate “#saNgoocinoseNsee#” is discarded because no corresponding phoneme n-gram exists. For the phoneme string candidate “#saNgooacinoseNsee#”, the word string “{### sentence head}, {baseball: No. 3 saNgoo noun}, {baseball: arch aaci ah common noun}, {baseball: no no connective particle}, {baseball: preemptive seNsee sensei sa-variable noun}” is obtained, and from the maximum of the word string probability P(W) given by the above equation (6), the maximum likelihood is obtained as 5.4 × 10^-9 (acoustic likelihood: 0.9; language likelihood: 6 × 10^-9).
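The selection in step ST113 amounts to taking the candidate with the largest product of acoustic and language likelihood. The sketch below reproduces the arithmetic from the text (0.9 × 6 × 10^-9); the acoustic likelihood of 0.8 for the discarded candidate is an assumed placeholder, since the text does not give it, and the dictionary layout and function name are likewise hypothetical.

```python
candidates = [
    # No matching phoneme n-gram: language likelihood is unavailable (discarded).
    {"phonemes": "#saNgoocinoseNsee#", "acoustic": 0.8, "language": None},
    # Values stated in the text for the surviving candidate.
    {"phonemes": "#saNgooacinoseNsee#", "acoustic": 0.9, "language": 6e-9},
]


def best_candidate(cands):
    """Return (score, phoneme string) maximising acoustic x language likelihood."""
    scored = [
        (c["acoustic"] * c["language"], c["phonemes"])
        for c in cands
        if c["language"] is not None
    ]
    return max(scored) if scored else None


score, phonemes = best_candidate(candidates)
print(phonemes, score)
```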
[0062]
Next, in step ST114, only the notation is extracted from the word string of the solution read from the RAM 5, and is output from the output means 6. Then, the process proceeds to step ST115, and this series of speech recognition processing is terminated. In this way, in the first embodiment, “No. 3 arch predecessor” is obtained as the recognition result.
[0063]
As described above, according to the first embodiment, speech recognition is performed using statistics collected separately for each topic. Consequently, even though the 2-gram probability of “no sensei” in the sense of “teacher” is locally higher than that of “no sensei” in the sense of “preemptive”, the utterance is recognized as “preemptive”. n-grams with strong language constraints can thus be built without increasing the n-gram order, which provides the effect that a highly accurate speech recognition apparatus can be constructed. Although two topics are handled in this embodiment, three or more topics may be handled.
[0064]
Embodiment 2.
Although not specifically considered in the first embodiment, the word probability calculation means may be configured so that, when calculating word string candidates, the topics in the phoneme n-gram all match over a series of speech. FIG. 5 is a block diagram showing the configuration of such a speech recognition apparatus according to Embodiment 2 of the present invention.
[0065]
In the figure, 1 is a microphone, 2 is a phoneme probability calculation means, 5 is a RAM, and 6 is an output means; these are the same as the parts given the same reference numerals in the first embodiment shown in FIG. 1. Reference numeral 12 denotes a word probability calculation means corresponding to that denoted by reference numeral 9 in FIG. 1, but it differs in that it is configured so that, when calculating word string candidates, the topics in the phoneme n-gram all match over a series of speech. Reference numerals 13 and 14 denote a baseball-topic phoneme n-gram and a general-topic phoneme n-gram corresponding to those denoted by reference numerals 7 and 8 in FIG. 1, but here only 2-grams are used and 1-grams are not used.
[0066]
Here, FIG. 6 is an explanatory diagram showing a specific example of the phoneme n-gram. In the figure, reference numeral 15 denotes the phoneme n-gram, in which the baseball-topic phoneme n-gram 13 and the general-topic phoneme n-gram 14 are recorded. As described above, the baseball-topic phoneme n-gram 13 and the general-topic phoneme n-gram 14 of the phoneme n-gram 15 each use only 2-grams, in which a preceding morpheme, a succeeding morpheme, and a probability are recorded for each key phoneme string.
[0067]
Next, the operation will be described.
FIG. 7 is a flowchart showing a schematic operation flow of the speech recognition apparatus according to the second embodiment configured as described above. In the second embodiment as well, exactly the same processing as in the first embodiment is first performed from step ST101 to step ST107. When one preceding word string candidate has been extracted from the RAM 5 in step ST107, the word probability calculation means 12 searches the phoneme n-gram 15 in step ST120 using the phoneme string information of that candidate and checks whether a forward-matching backward word exists. In the first embodiment, both the 2-grams and the 1-grams of the baseball-topic phoneme n-gram 7 and the general-topic phoneme n-gram 8 of the phoneme n-gram 11 were used; in the second embodiment, matching is detected using the phoneme n-gram 15, whose baseball-topic phoneme n-gram 13 and general-topic phoneme n-gram 14 contain only 2-grams. If the check finds a forward-matching backward word, the process proceeds to step ST109, and processing continues through step ST115 in the same manner as in the first embodiment.
[0068]
As described above, according to the second embodiment, the word probability calculation means 12 checks for matches using only the 2-grams of the phoneme n-gram 15, so the series of morphemes for one utterance all belong to the same topic. This provides the effect of preventing other topics from being mixed in within an utterance.
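To illustrate why using only 2-grams keeps one utterance within one topic, the sketch below scores a fixed word string separately under each topic's 2-gram table. The tables, probabilities, and function name are hypothetical; a missing 2-gram is treated as probability 0, which is an assumption of this sketch rather than something the text specifies.

```python
# Hypothetical topic-specific 2-gram tables: (previous word, word) -> probability.
BIGRAMS = {
    "baseball": {("<s>", "No. 3"): 0.01, ("No. 3", "arch"): 0.1},
    "general": {("<s>", "No. 3"): 0.001},
}


def score_single_topic(words, topic, bigrams):
    """Multiply the 2-gram probabilities of one topic along the word string."""
    p = 1.0
    for prev, nxt in zip(words, words[1:]):
        # Every connection must exist in the same topic's table.
        p *= bigrams[topic].get((prev, nxt), 0.0)
    return p


words = ["<s>", "No. 3", "arch"]
scores = {t: score_single_topic(words, t, BIGRAMS) for t in BIGRAMS}
print(scores)
```

Because no 1-gram fallback exists, a path cannot start in one topic and finish in another; the general-topic score drops to zero as soon as a connection is missing.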
[0069]
Embodiment 3.
Although the first and second embodiments do not specifically consider adjusting the weight of the probabilities for each topic in speech recognition, the probability weight may be made adjustable for each topic. FIG. 8 is a block diagram showing the configuration of such a speech recognition apparatus according to Embodiment 3 of the present invention. In the figure, 1 is a microphone, 2 is a phoneme probability calculation means, 5 is a RAM, 6 is an output means, 13 is a phoneme n-gram of the baseball topic, and 14 is a phoneme n-gram of the general topic; these are equivalent to the parts given the same reference numerals in the second embodiment. Reference numeral 16 denotes a word probability calculation means corresponding to that denoted by reference numeral 12 in FIG. 5, but it differs in that the probability weight can be adjusted for each topic.
[0070]
Next, the operation will be described.
FIG. 9 is a flowchart showing a schematic operation flow of the speech recognition apparatus according to Embodiment 3 configured as described above. In the third embodiment as well, the same processing as in the second embodiment is first performed in steps ST101 to ST107 and step ST120. In step ST120, the phoneme n-gram 15 containing only 2-grams is used to check whether a forward-matching backward word exists; if none exists, the process returns to step ST106, and if one exists, the process proceeds to step ST130. In step ST130, the word probability calculation means 16 calculates the likelihood of each backward word with a weight applied for each topic, stores it in the RAM 5, connects the backward word to the preceding word string, and newly stores the result in the RAM 5 as a preceding word string. Thereafter, steps ST110 to ST115 are performed as in the second embodiment.
[0071]
As described above, according to the third embodiment, the word probability calculation means 16 is configured to apply a weight to the 2-gram probability for each topic, which provides the effect that the appearance probability can be adjusted for each topic.
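One possible reading of the per-topic weighting in step ST130 is a multiplicative factor applied to each 2-gram probability. The text does not specify the form of the weight, so the multiplicative scheme, the weight values, and the names below are assumptions made purely for illustration.

```python
# Hypothetical per-topic weights; 1.0 leaves the probability unchanged.
TOPIC_WEIGHTS = {"baseball": 2.0, "general": 1.0}


def weighted_likelihood(prev_likelihood, bigram_prob, topic, weights):
    """Extend the likelihood by one 2-gram, scaled by the weight of its topic."""
    return prev_likelihood * weights.get(topic, 1.0) * bigram_prob


boosted = weighted_likelihood(1.0, 0.01, "baseball", TOPIC_WEIGHTS)
neutral = weighted_likelihood(1.0, 0.01, "general", TOPIC_WEIGHTS)
print(boosted, neutral)
```

Raising the baseball weight favours baseball-topic paths when the final maximum is taken; the weighted values are scores rather than normalised probabilities in this sketch.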
[0072]
Embodiment 4.
Although Embodiments 1 to 3 described speech recognition apparatuses, a morphological analyzer can also be constructed by providing kanji n-grams. FIG. 10 is a block diagram showing the configuration of such a morphological analyzer according to Embodiment 4 of the present invention.
[0073]
In the figure, reference numeral 17 denotes a file input device serving as input means for inputting a character string (input file) in which kana and kanji are mixed. Reference numerals 18 and 19 denote kanji n-grams that store kana-kanji mixed character strings, the word notation strings corresponding to them, and occurrence probabilities, with the words classified by topic. Here, the kanji n-gram 18 is the baseball-topic kanji n-gram stored for the baseball topic, and the kanji n-gram 19 is the general-topic kanji n-gram stored for the general topic. Reference numeral 20 denotes a morpheme probability calculation means that refers to the baseball-topic kanji n-gram 18 and the general-topic kanji n-gram 19 and calculates the word occurrence probability of each word candidate corresponding to the kana-kanji mixed character string output from the file input device 17. Reference numeral 5 denotes a RAM for storing intermediate processing information, and reference numeral 21 denotes an output means for outputting the word string candidate that matches the character string input from the file input device 17, obtained using the word occurrence probability calculated by the morpheme probability calculation means 20.
[0074]
Hereinafter, generation of word string candidates will be described.
In the fourth embodiment, word string candidates are generated by calculating the W that maximizes the word string appearance probability P(W), where W is a word string for the input. When the m-word string W is defined by the above equation (4), the word string appearance probability P(W) is obtained from the above equation (6). In this case, the probabilities of the baseball-topic kanji n-gram 18 and the general-topic kanji n-gram 19 are used.
[0075]
With the above calculation, the W that maximizes the word string probability P(W) is calculated over the words for which word strings exist in the baseball-topic kanji n-gram 18 and the general-topic kanji n-gram 19. The combinations may be computed at high speed using, for example, the Viterbi method described in Nagao Makoto: “Natural Language Processing”, and the formula may be evaluated as a sum by converting the probabilities to logarithmic probabilities. The appearance probability of each word is calculated from the probability values stored in advance in the baseball-topic kanji n-gram 18 and the general-topic kanji n-gram 19.
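The two speed-ups mentioned here, a Viterbi-style dynamic programme and summing logarithmic probabilities, can be sketched generically as below. This is a textbook-style segmentation over a 2-gram table, not the procedure from the cited book or the patent; the toy alphabet, the table, and the function name are hypothetical.

```python
import math


def viterbi_segment(text, bigram):
    """Segment text into words maximising the sum of log 2-gram probabilities.

    bigram maps (previous_word, word) -> probability; "<s>" is the sentence head.
    Returns (list of words, best log probability), or (None, -inf) if no path exists.
    """
    vocab = {w for _, w in bigram}
    # best[(end position, last word)] = (log probability, back-pointer state)
    best = {(0, "<s>"): (0.0, None)}
    for end in range(1, len(text) + 1):
        for word in vocab:
            start = end - len(word)
            if start < 0 or text[start:end] != word:
                continue
            for (pos, prev), (logp, _) in list(best.items()):
                if pos != start or (prev, word) not in bigram:
                    continue
                cand = logp + math.log(bigram[(prev, word)])
                if (end, word) not in best or cand > best[(end, word)][0]:
                    best[(end, word)] = (cand, (pos, prev))
    finals = [(v[0], k) for k, v in best.items() if k[0] == len(text)]
    if not finals:
        return None, float("-inf")
    logp, state = max(finals)
    words = []
    while state != (0, "<s>"):
        words.append(state[1])
        state = best[state][1]
    return words[::-1], logp


# Toy table: "ab"+"c" (0.4 * 0.5) beats "a"+"bc" (0.5 * 0.2) and "a"+"b"+"c".
TOY = {
    ("<s>", "a"): 0.5, ("a", "bc"): 0.2,
    ("<s>", "ab"): 0.4, ("ab", "c"): 0.5,
    ("a", "b"): 0.1, ("b", "c"): 0.1,
}
words, logp = viterbi_segment("abc", TOY)
print(words, math.exp(logp))
```

Working in the log domain turns the product in P(W) into a sum, which avoids underflow for long word strings; the dynamic programme keeps only the best score per (position, last word) state instead of enumerating every combination.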
[0076]
Here, FIG. 11 is an explanatory diagram showing a specific example of a kanji n-gram created based on the example sentence 10 shown in FIG. 2. In the figure, 22 is the kanji n-gram. The baseball topic kanji n-gram 18 and the general topic kanji n-gram 19 are recorded.
[0077]
As shown in FIG. 11, the baseball-topic kanji n-gram 18 and the general-topic kanji n-gram 19 in the kanji n-gram 22 each have 2-grams and 1-grams, with the kanji character string as the key. In the 2-gram, a preceding morpheme, a succeeding morpheme, and a probability are recorded for each key kanji character string. The probability recorded here is the probability that the succeeding morpheme follows the preceding morpheme, and corresponds to the 2-gram occurrence probability. In the 1-gram, the succeeding morpheme that connects directly next and its probability are recorded for each key. This 1-gram probability is the occurrence probability of the morpheme itself. Note that a morpheme is represented by a combination of notation, phoneme notation, heading reading, and part of speech.
[0078]
The calculated word string W is output from the output means 21 as a recognition result.
[0079]
Next, the operation will be described.
Here, FIG. 12 is a flowchart showing a schematic operation flow of the analysis processing in the morphological analyzer according to the fourth embodiment. This morpheme analysis process is started by inputting a character string mixed with kana and kanji from the file input device 17 in step ST201. In step ST202, the file input device 17 takes in the input kana-kanji mixed character string and inputs it to the morpheme probability calculation means 20. When the kana-kanji mixed character string taken in by the file input device 17 is input, the morpheme probability calculating means 20 stores it in the RAM 5 in step ST203. In the fourth embodiment, it is assumed that the following is input as a kana-kanji mixed character string.
Teacher of No. 3 Arch
[0080]
Next, in step ST204, the morpheme probability calculation means 20 takes out the kanji string candidate stored in the RAM 5 in step ST203 and performs initialization processing. In this initialization, the null word “{### sentence head}” and its probability value “1” are stored in the RAM 5 as the initial value of the preceding word string candidate. Here, “# 3 Arch Teacher #” is first taken out as the kanji string candidate. In step ST205, the morpheme probability calculation means 20 then checks whether all the preceding word string candidates have reached the terminal kanji of the kanji string candidate; if they all have, the process moves to step ST211, and if not, the process proceeds to step ST206.
[0081]
In step ST206, the morpheme probability calculation means 20 takes out one preceding word string candidate from the RAM 5. In the fourth embodiment, “{### sentence head}” is first taken out as the preceding word string candidate. Next, in step ST207, the baseball-topic kanji n-gram 18 and the general-topic kanji n-gram 19 are searched using the kanji string information of the preceding word string candidate, and it is checked whether there is a backward word that forward-matches the substring of the kanji string candidate following the searched preceding word. If no forward-matching backward word exists, the process returns to step ST205; if one exists, the process proceeds to step ST208.
[0082]
Therefore, in the fourth embodiment, the initial preceding word string “{### sentence head}” is searched first. The baseball-topic kanji n-gram 18 and the general-topic kanji n-gram 19 are then searched for backward words of the preceding word string “{### sentence head}”: words that forward-match the kanji string starting at the head of “3 a...”, which follows “#”, are retrieved and used as backward words. In the 2-gram, “# 3” forward-matches the kanji string “# 3 a...”, so the succeeding morpheme of this 2-gram, “{baseball: No. 3 saNgoo sango noun}”, becomes one of the backward word candidates. The 1-gram “{baseball: No. 3 saNgoo sango noun}” is also a candidate because it matches the following kanji string. Furthermore, “{general: No. 3 saNgoo sango noun}” is also a candidate.
[0083]
In step ST208, the likelihood is calculated for each backward word and stored in the RAM 5, and the backward word is connected to the preceding word string. When the connection is made through a 2-gram, the topic is the same as that of the preceding morpheme, whereas with a 1-gram there is no connection constraint, so switching of topics is allowed. The preceding word string with the backward word connected is newly stored in the RAM 5 as a preceding word string. In the fourth embodiment, the preceding word string “{### sentence head}” is replaced with “{baseball: ### sentence head}, {baseball: No. 3 saNgoo sango noun}”. The language likelihood is calculated by the above equation (7) from the probability 1 of the preceding word string “{### sentence head}” and the 2-gram probability 0.01 of the baseball topic “{#}, {3}”.
[0084]
Next, in step ST209, it is checked whether or not the entire kanji character string corresponds to the preceding word string. If yes, the process proceeds to step ST210, and the maximum likelihood and the preceding word string of the solution are stored in the RAM 5. Then, the process returns to step ST205, and it is checked whether all preceding word string candidates correspond to the terminal word of the kanji string candidate. On the other hand, if not, the process returns to step ST205 and the above check is performed.
[0085]
In the fourth embodiment, according to the above processing, “{### sentence head}, {baseball: No. 3 saNgoo noun}, {baseball: arch aaci ach noun}, {baseball: no no particle}, ...
[0086]
If it is determined in step ST205 that all the preceding word string candidates have reached the terminal word of the kanji string candidate, the process proceeds to step ST211, and the word string of the solution having the maximum likelihood stored in the RAM 5 is read out. Here, the maximum likelihood is the maximum value of the product of the language likelihood and the acoustic likelihood. In the fourth embodiment, for the kanji string candidate “# 3 arch preemptive #”, the morphological analysis result “{### sentence head}, {No. 3 saNgoo noun}, {arch aaci ach noun}, {no no connective particle}, {preemptive seNsee sensei sa-variable noun}” is obtained, and from the maximum of the word string probability P(W) given by the above equation (6), the maximum likelihood is obtained as 5.4 × 10^-9 (acoustic likelihood: 0.9; language likelihood: 6 × 10^-9).
[0087]
Next, in step ST212, the morpheme string of the solution read from the RAM 5 is output from the output means 21. The process then proceeds to step ST213, and this series of morphological analysis processing ends. In this way, in the fourth embodiment, “{No. 3 sango noun}, {arch arch noun}, {no connective particle}, {sensei sansei noun}” is obtained as the analysis result.
[0088]
As described above, according to the fourth embodiment, morphological analysis is performed using statistics collected separately for each topic. Therefore, even for a notation that is locally ambiguous with “no sensei” in the sense of “teacher”, the analysis can determine that it means “preemptive” and that its part of speech is a sa-variable noun. n-grams with strong language constraints can thus be built without increasing the n-gram order, which provides the effect that a highly accurate morphological analyzer can be constructed. Although two topics are handled in this embodiment, three or more topics may be handled.
[0089]
Embodiment 5.
Although not specifically considered in the fourth embodiment, the morpheme probability calculation means may be configured so that, when calculating word string candidates, the topics in the kanji n-gram all match over a series of kana-kanji mixed character strings. FIG. 13 is a block diagram showing the configuration of such a morphological analyzer according to the fifth embodiment of the present invention.
[0090]
In the figure, 5 is a RAM, 17 is a file input device, and 21 is an output means; these are the same as the parts given the same reference numerals in the fourth embodiment shown in FIG. 10. Reference numeral 23 denotes a morpheme probability calculation means corresponding to that denoted by reference numeral 20 in FIG. 10, but it differs in that it is configured so that, when calculating word string candidates, the topics in the kanji n-gram all match over a series of kana-kanji mixed character strings. Reference numerals 24 and 25 denote a baseball-topic kanji n-gram and a general-topic kanji n-gram corresponding to those denoted by reference numerals 18 and 19 in FIG. 10, but here only 2-grams are used and 1-grams are not used.
[0091]
Here, FIG. 14 is an explanatory diagram showing a specific example of the kanji n-gram. In the figure, 26 is the kanji n-gram, in which the baseball-topic kanji n-gram 24 and the general-topic kanji n-gram 25 are recorded. As described above, the baseball-topic kanji n-gram 24 and the general-topic kanji n-gram 25 of the kanji n-gram 26 use only 2-grams, in which a preceding morpheme, a succeeding morpheme, and a probability are recorded for each key kanji character string.
[0092]
Next, the operation will be described.
FIG. 15 is a flowchart showing a schematic operation flow of the morphological analyzer according to Embodiment 5 configured as described above. In the fifth embodiment as well, the same processing as in the fourth embodiment is first performed in steps ST201 to ST206. When one preceding word string candidate has been taken out of the RAM 5 in step ST206, the morpheme probability calculation means 23 searches the kanji n-gram 26 in step ST220 using the kanji string information of that candidate and checks whether a forward-matching backward word exists. In the fourth embodiment, both the 2-grams and the 1-grams of the baseball-topic kanji n-gram 18 and the general-topic kanji n-gram 19 of the kanji n-gram 22 were used; in the fifth embodiment, matching is detected using the kanji n-gram 26, whose baseball-topic kanji n-gram 24 and general-topic kanji n-gram 25 contain only 2-grams. If the check finds a forward-matching backward word, the process branches to step ST208, and processing continues through step ST213 in the same manner as in the fourth embodiment.
[0093]
As described above, according to the fifth embodiment, the morpheme probability calculation means 23 checks for matches using only the 2-grams of the kanji n-gram 26, so the series of morphemes for one kana-kanji mixed character string all belong to the same topic. This provides the effect of preventing other topics from being mixed in.
[0094]
Embodiment 6.
Although the fourth and fifth embodiments do not specifically consider adjusting the weight of the probabilities for each topic in morphological analysis, the morpheme probability calculation means may be configured so that the probability weight can be adjusted for each topic. FIG. 16 is a block diagram showing the configuration of such a morphological analyzer according to Embodiment 6 of the present invention. In the figure, 5 is a RAM, 17 is a file input device, 21 is an output means, and 24 and 25 are the kanji n-grams of the baseball topic and the general topic; these are equivalent to the parts given the same reference numerals in the fifth embodiment shown in FIG. 13. Reference numeral 27 denotes a morpheme probability calculation means corresponding to that denoted by reference numeral 23 in FIG. 13, but it differs in that the probability weight can be adjusted for each topic.
[0095]
Next, the operation will be described.
FIG. 17 is a flowchart showing a schematic operation flow of the morphological analyzer according to Embodiment 6 configured as described above. In the sixth embodiment as well, exactly the same processing as in the fifth embodiment is first performed in steps ST201 to ST206 and step ST220. In step ST220, the kanji n-gram 26 containing only 2-grams is used to check whether a forward-matching backward word exists; if none exists, the process returns to step ST205, and if one exists, the process proceeds to step ST230. In step ST230, the morpheme probability calculation means 27 calculates the likelihood of each backward word with a weight applied for each topic, stores it in the RAM 5, connects the backward word to the preceding word string, and newly stores the result in the RAM 5 as a preceding word string. Thereafter, steps ST209 to ST213 are performed in the same manner as in the fifth embodiment.
[0096]
As described above, according to the sixth embodiment, the morpheme probability calculation means 27 is configured to apply a weight to the 2-gram probability for each topic, which provides the effect that the appearance probability can be adjusted for each topic.
[0097]
Embodiment 7.
Although Embodiments 1 to 6 described speech recognition apparatuses or morphological analyzers, a kana-kanji conversion apparatus can also be constructed by providing kana n-grams. FIG. 18 is a block diagram showing the configuration of such a kana-kanji conversion apparatus according to Embodiment 7 of the present invention.
[0098]
In the figure, 28 is a keyboard serving as input means for inputting the kana character string of an input sentence. Reference numerals 29 and 30 denote kana n-grams that store kana character strings, the word notation strings corresponding to them, and occurrence probabilities, with the words classified by topic. Here, the kana n-gram 29 is the baseball-topic kana n-gram stored for the baseball topic, and the kana n-gram 30 is the general-topic kana n-gram stored for the general topic. Reference numeral 31 denotes a kanji probability calculation means that refers to the baseball-topic kana n-gram 29 and the general-topic kana n-gram 30 and calculates the word occurrence probability of each word candidate corresponding to the kana character string output from the keyboard 28. Reference numeral 5 denotes a RAM for storing intermediate processing information. Reference numeral 32 denotes an output means for outputting the word string candidate that matches the kana character string input from the keyboard 28, obtained using the word occurrence probability calculated by the kanji probability calculation means 31.
[0099]
Hereinafter, generation of word string candidates will be described.
In the seventh embodiment, word string candidates are generated by calculating the W that maximizes the word string appearance probability P(W), where W is a word string for the input. When the m-word string W is defined by the above equation (4), the word string appearance probability P(W) is obtained from the above equation (6). In this case, the probabilities of the baseball-topic kana n-gram 29 and the general-topic kana n-gram 30 are used.
[0100]
Through the above calculation, the W that maximizes the word string probability P(W) is calculated over the words for which word strings exist in the baseball-topic kana n-gram 29 and the general-topic kana n-gram 30. The combinations may be computed at high speed using, for example, the Viterbi method described in Nagao Makoto: “Natural Language Processing”, and the formula may be evaluated as a sum by converting the probabilities to logarithmic probabilities. The appearance probability of each word is calculated from the probability values stored in advance in the baseball-topic kana n-gram 29 and the general-topic kana n-gram 30.
[0101]
Here, FIG. 19 is an explanatory diagram showing a specific example of the kana n-gram created based on the example sentence 10 shown in FIG. 2. In the figure, 33 is the kana n-gram, in which the baseball-topic kana n-gram 29 and the general-topic kana n-gram 30 are recorded.
[0102]
As shown in FIG. 19, the baseball-topic kana n-gram 29 and the general-topic kana n-gram 30 in the kana n-gram 33 each have 2-grams and 1-grams, with the kana character string as the key. In the 2-gram, a preceding morpheme, a succeeding morpheme, and a probability are recorded for each key kana character string. The probability recorded here is the probability that the succeeding morpheme follows the preceding morpheme, and corresponds to the 2-gram occurrence probability. In the 1-gram, the succeeding morpheme that connects directly next and its probability are recorded for each key. This 1-gram probability is the occurrence probability of the morpheme itself. Note that a morpheme is represented by a combination of notation, phoneme notation, heading reading, and part of speech.
[0103]
The calculated word string W is output from the output means 32 as a recognition result.
[0104]
Next, the operation will be described.
FIG. 20 is a flowchart showing a schematic operation flow of conversion processing in the kana-kanji conversion apparatus according to the seventh embodiment. This kana-kanji conversion process is started by operating the keyboard 28 in step ST301. The kana character string input by operating the keyboard 28 is taken into the kanji probability calculating means 31 in step ST302, and stored in the RAM 5 in step ST303. In the seventh embodiment, it is assumed that the following is input as a kana character string.
Teacher of Sango-Auch
[0105]
Next, in step ST304, the kanji probability calculating means 31 takes out the kana character string stored in the RAM 5 in step ST303 and performs an initialization process. In this initialization process, the null word “{### sentence head}” and its probability value “1” are stored in the RAM 5 as the initial value of the preceding word string candidate. Accordingly, here, “#sango au no sensei #” is first extracted as a kana character string. In step ST305, the kanji probability calculating means 31 further checks whether all preceding word string candidates correspond to the terminal kana of the kana character string, and if all correspond, moves the process to step ST311 and responds. If not, the process proceeds to step ST306.
[0106]
In step ST306, the kanji probability calculation means 31 takes out one preceding word string candidate from the RAM 5. In the seventh embodiment, “{### sentence head}” is first taken out as the preceding word string candidate. Next, in step ST307, the baseball-topic kana n-gram 29 and the general-topic kana n-gram 30 are searched using the kana string information of the preceding word string candidate, and it is checked whether there is a backward word that forward-matches the substring of the kana string candidate following the searched preceding word. If no forward-matching backward word exists, the process returns to step ST305; if one exists, the process proceeds to step ST308.
[0107]
Therefore, in the seventh embodiment, the initial preceding word string “{### sentence head}” is searched first. The baseball-topic kana n-gram 29 and the general-topic kana n-gram 30 are then searched for backward words of the preceding word string “{### sentence head}”: words that forward-match the kana character string starting at the head of “sangouaa...”, which follows “#”, are retrieved and used as backward words. In the 2-gram, “#sangou” forward-matches the kana character string “#sangouaa...”, so the succeeding morpheme of this 2-gram, “{baseball: No. 3 saNgoo sango noun}”, becomes one of the backward word candidates. The 1-gram “{baseball: No. 3 saNgoo sango noun}” is also a candidate because it matches the following kana character string. Furthermore, “{general: No. 3 saNgoo sango noun}” is also a candidate.
[0108]
In step ST308, the likelihood is calculated for each backward word and stored in the RAM 5; the backward word is connected to the preceding word string, and the result is newly stored in the RAM 5 as a preceding word string. In the seventh embodiment, the preceding word string “{### sentence head}” is replaced with “{baseball: ### sentence head}, {baseball: No. 3 saNgoo sango noun}”. The language likelihood is calculated by the above equation (7) from the probability 1 of the preceding word string “{### sentence head}” and the 2-gram probability 0.01 of the baseball topic “{baseball: ### sentence head}, {baseball: No. 3 saNgoo sango noun}”.
[0109]
Next, in step ST309, it is checked whether or not the entire kana character string corresponds to the preceding word string. If it corresponds, the process proceeds to step ST310, and the maximum likelihood and the preceding word string of the solution are stored in the RAM 5. Thereafter, the process returns to step ST305, and it is checked whether all preceding word string candidates correspond to the terminal kana of the kana character string candidates. On the other hand, if not, the process returns to step ST305 to perform the above check.
[0110]
In the seventh embodiment, the above processing yields "{### beginning of sentence}, {baseball: No. 3 saNgoo noun}, {baseball: arch aaci noun}, {baseball: no particle}, ...".
[0111]
If it is determined in step ST305 that all preceding word string candidates have reached the terminal kana of the kana character string candidate, the process proceeds to step ST311, where the solution word string having the maximum likelihood is read from the RAM 5. Here, the maximum likelihood is the maximum value of the product of the language likelihood and the acoustic likelihood. In the seventh embodiment, for the kana character string candidate "# Sango Ao no Sensei #", the solution "{### beginning of sentence}, {No. 3 saNgoo noun}, {arch aaci common noun}, {no connective particle}, {preemptive seNsee variable noun}" is obtained, and the maximum likelihood, taken as the maximum of the word string probability P(W) given by equation (6) above, is 5.4 × 10^-9 (acoustic likelihood: 0.9; language likelihood: 6 × 10^-9).
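The selection in step ST311 can be sketched as follows, using the product of language likelihood and acoustic likelihood stated above. The class `Solution`, the word labels and the second, competing solution are hypothetical; only the values 0.9 and 6 × 10^-9 come from the text.

```python
from typing import List, NamedTuple


class Solution(NamedTuple):
    words: List[str]
    language_likelihood: float
    acoustic_likelihood: float

    @property
    def likelihood(self) -> float:
        # Overall likelihood = language likelihood x acoustic likelihood.
        return self.language_likelihood * self.acoustic_likelihood


solutions = [
    # Values from the text: acoustic 0.9, language 6e-9.
    Solution(["<s>", "No. 3", "arch", "no", "preemptive"], 6e-9, 0.9),
    # Hypothetical competing solution, for illustration only.
    Solution(["<s>", "No. 3", "arch", "no", "teacher"], 2e-9, 0.9),
]

best = max(solutions, key=lambda s: s.likelihood)
print(best.words, best.likelihood)
```

With these inputs the first solution is selected, and its likelihood is 0.9 × 6 × 10^-9, which agrees with the 5.4 × 10^-9 given in the text up to floating-point rounding.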
[0112]
Next, in step ST312, the solution word string read from the RAM 5 is output from the output means 32, and the process then proceeds to step ST313 to end this series of kana-kanji conversion processing. In this way, in the seventh embodiment, "No. 3 arch predecessor" is obtained as the kana-kanji conversion result.
[0113]
As described above, according to the seventh embodiment, kana-kanji conversion is performed using statistics collected separately for each topic, so n-grams with strong language constraints can be formed without increasing the order of the n-grams, and an accurate kana-kanji conversion device can be constructed. Although two topics are handled in the present embodiment, three or more topics may be handled.
[0114]
Embodiment 8.
Although the seventh embodiment gives it no particular consideration, the kanji probability calculating means may be configured so that, in kana-kanji conversion, the topics of all kana n-grams applied to one series of kana character strings are the same. FIG. 21 is a block diagram showing the configuration of such a kana-kanji conversion apparatus according to the eighth embodiment of the present invention.
[0115]
In the figure, 5 is a RAM, 28 is a keyboard, and 32 is an output means, which are the same as those in the seventh embodiment shown in FIG. 18. Reference numeral 34 denotes a kanji probability calculating means corresponding to that indicated by reference numeral 31 in FIG. 18, but it differs in that, when calculating word string candidates, it is configured so that the topics of all kana n-grams applied to one series of kana character strings are the same. Reference numerals 35 and 36 denote the kana n-gram of the baseball topic and the kana n-gram of the general topic, which correspond to those indicated by reference numerals 29 and 30 in FIG. 18, but in this case only 2-grams are used; 1-grams are not used.
[0116]
Here, FIG. 22 is an explanatory diagram showing a specific example of the kana n-gram. In the figure, reference numeral 37 denotes the kana n-gram, in which the kana n-gram 35 of the baseball topic and the kana n-gram 36 of the general topic are recorded. As described above, the kana n-gram 35 of the baseball topic and the kana n-gram 36 of the general topic in the kana n-gram 37 use only 2-grams, each of which records an occurrence probability keyed by a front morpheme, a back morpheme, and a kana character string.
[0117]
Next, the operation will be described.
FIG. 23 is a flowchart showing the general operation flow of the kana-kanji conversion device according to Embodiment 8 configured as described above. In the eighth embodiment as well, steps ST301 to ST306 are performed exactly as in the seventh embodiment. When one preceding word string candidate is extracted from the RAM 5 in step ST306, the kanji probability calculating means 34 searches the kana n-gram 37 with the kana string information of that candidate in step ST320 and checks whether a forward-matching backward word exists. Whereas Embodiment 7 uses both the 2-grams and the 1-grams of the kana n-gram 29 of the baseball topic and the kana n-gram 30 of the general topic in the kana n-gram 33, Embodiment 8 detects matches using only the 2-grams of the kana n-gram 35 of the baseball topic and the kana n-gram 36 of the general topic in the kana n-gram 37. If the check finds a forward-matching backward word, the process branches to step ST308 and continues through step ST313 in the same manner as in the seventh embodiment.
[0118]
As described above, according to the eighth embodiment, the kanji probability calculating means 34 checks for matches using only the 2-grams of the kana n-gram 37, so the series of morphemes for one kana character string all belong to the same topic. This has the effect of preventing morphemes of other topics from being mixed in.
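One way to picture why 2-gram-only matching keeps a word string within a single topic is the sketch below. It assumes each 2-gram carries a topic label and that a word string may only be extended by 2-grams of the topic it started with; the table `BIGRAMS`, its entries and probabilities (other than 0.01) and the function `next_words` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (topic, front morpheme, back morpheme) -> (kana of back morpheme, probability)
# Hypothetical entries; only the 0.01 probability appears in the text.
BIGRAMS: Dict[Tuple[str, str, str], Tuple[str, float]] = {
    ("baseball", "<s>", "No. 3"): ("sangou", 0.01),
    ("baseball", "No. 3", "arch"): ("aachi", 0.02),
    ("general", "<s>", "No. 3"): ("sangou", 0.001),
    ("general", "No. 3", "arch"): ("aachi", 0.0001),
}


def next_words(topic: Optional[str], prev_word: str,
               remaining_kana: str) -> List[Tuple[str, str, float]]:
    """List (topic, back morpheme, probability) for matching 2-grams.

    topic is None before the first word has fixed it; afterwards only
    2-grams of that same topic are considered.
    """
    found = []
    for (t, front, back), (kana, prob) in BIGRAMS.items():
        if topic is not None and t != topic:
            continue  # a different topic cannot continue this word string
        if front == prev_word and remaining_kana.startswith(kana):
            found.append((t, back, prob))
    return found


first = next_words(None, "<s>", "sangouaachi")       # either topic may start
second = next_words("baseball", "No. 3", "aachi")    # baseball topic only
print(first)
print(second)
```

Because no 1-gram is available as a topic-independent fallback, every extension has to come from a 2-gram whose front morpheme matches the preceding word in the same topic.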
[0119]
Embodiment 9.
In the seventh and eighth embodiments, kana-kanji conversion gives no particular consideration to adjusting the probability weight of each topic, but the kanji probability calculating means may be configured so that the probability weight can be adjusted for each topic. FIG. 24 is a block diagram showing the configuration of such a kana-kanji conversion apparatus according to Embodiment 9 of the present invention. In the figure, 5 is a RAM, 28 is a keyboard, 32 is an output means, and 35 and 36 are the kana n-grams of the baseball topic and the general topic, which are the same as the corresponding parts of the eighth embodiment shown in FIG. 21. Reference numeral 38 denotes a kanji probability calculating means corresponding to that indicated by reference numeral 34 in FIG. 21, but it differs in that it is configured so that the probability weight can be adjusted for each topic.
[0120]
Next, the operation will be described.
FIG. 25 is a flowchart showing the general operation flow of the kana-kanji conversion device according to Embodiment 9 configured as described above. In the ninth embodiment as well, steps ST301 to ST306 and step ST320 are performed exactly as in the eighth embodiment. In step ST320, whether a forward-matching backward word exists is checked using only the 2-grams of the kana n-gram 37. If no forward-matching backward word exists, the process returns to step ST305; if one exists, the process proceeds to step ST330. In step ST330, the kanji probability calculating means 38 calculates the likelihood of each backward word with a weight applied for each topic, stores it in the RAM 5, appends the backward word to the preceding word string, and stores the result in the RAM 5 as a new preceding word string. Thereafter, the process proceeds from step ST309 through step ST313 in the same manner as in the eighth embodiment.
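The topic weighting of step ST330 might look like the following sketch. How the weight enters the likelihood is not spelled out in this passage, so a simple multiplicative weight is assumed here; the weight values and the names `TOPIC_WEIGHTS` and `weighted_likelihood` are hypothetical.

```python
from typing import Dict

# Hypothetical per-topic weights; the text only says they are adjustable.
TOPIC_WEIGHTS: Dict[str, float] = {"baseball": 1.0, "general": 0.5}


def weighted_likelihood(preceding_likelihood: float, topic: str,
                        bigram_prob: float,
                        weights: Dict[str, float] = TOPIC_WEIGHTS) -> float:
    """Likelihood after appending a backward word of the given topic.

    Assumption: the topic weight multiplies the 2-gram probability.
    """
    return preceding_likelihood * weights[topic] * bigram_prob


baseball = weighted_likelihood(1.0, "baseball", 0.01)
general = weighted_likelihood(1.0, "general", 0.01)
print(baseball, general)
```

Under this assumption, lowering the weight of a topic makes its backward words less likely to appear in the final solution without touching the stored 2-gram statistics.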
[0121]
As described above, according to the ninth embodiment, the kanji probability calculating means 38 is configured to apply a probability weight to the 2-grams of each topic, so the occurrence probability can be adjusted for each topic.
[0122]
[Effects of the invention]
  As described above, the speech recognition apparatus according to the present invention comprises: a phoneme n-gram that stores a phoneme string of the target language, a word notation string corresponding to the phoneme string, and an occurrence probability, with the stored words classified by topic; input means for inputting speech of the target language; phoneme probability calculating means for converting the speech signal output from the input means into phonemes, calculating a phoneme occurrence probability corresponding to each phoneme, and outputting phoneme string candidates; word probability calculating means for referring to the phoneme n-gram classified by topic, searching it with the phoneme string information of a preceding word string candidate, checking whether a forward-matching backward word exists in the substring of the phoneme string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the phoneme string candidates output by the phoneme probability calculating means; and output means for outputting word string candidates similar to the speech input from the input means, calculated using the phoneme occurrence probability calculated by the phoneme probability calculating means and the word occurrence probability calculated by the word probability calculating means, wherein the word probability calculating means sets a topic-specific probability weight for each backward word. Therefore, collecting statistics separately for each topic makes it possible to construct n-grams with strong language constraints without increasing the order of the n-grams, so that a highly accurate speech recognition apparatus is obtained. A further effect is that the speech recognition apparatus can adjust the occurrence probability for each topic.
[0125]
  The morphological analyzer according to the present invention comprises: a kanji n-gram that stores a kana-kanji mixed character string, a word notation string corresponding to the kana-kanji mixed character string, and an occurrence probability, with the stored words classified by topic; input means for inputting the kana-kanji mixed character string; morpheme probability calculating means for referring to the kanji n-gram classified by topic, searching it with the kanji string information of a preceding word string candidate, checking whether a forward-matching backward word exists in the substring of the kanji string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the kana-kanji mixed character string output by the input means; and output means for outputting word string candidates that match the character string input from the input means, calculated using the word occurrence probability calculated by the morpheme probability calculating means, wherein the morpheme probability calculating means sets a topic-specific probability weight for each backward word. Therefore, collecting statistics separately for each topic makes it possible to construct n-grams with strong language constraints without increasing the order of the n-grams, so that a highly accurate morphological analyzer is obtained. A further effect is that the morphological analyzer can adjust the occurrence probability for each topic.
[0128]
  The kana-kanji conversion device according to the present invention comprises: a kana n-gram that stores a kana character string, a word notation string corresponding to the kana character string, and an occurrence probability, with the stored words classified by topic; input means for inputting the kana character string; kanji probability calculating means for referring to the kana n-gram classified by topic, checking whether a forward-matching backward word exists in the substring of the kana string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the kana character string output by the input means; and output means for outputting word string candidates that match the kana character string input from the input means, calculated using the word occurrence probability calculated by the kanji probability calculating means, wherein the kanji probability calculating means sets a topic-specific probability weight for each backward word. Therefore, collecting statistics separately for each topic makes it possible to construct n-grams with strong language constraints without increasing the order of the n-grams, so that a highly accurate kana-kanji conversion device is obtained. A further effect is that the kana-kanji conversion device can adjust the occurrence probability for each topic.
[0131]
  The speech recognition method according to the present invention comprises: a step of capturing input speech; a step of converting the captured speech into phonemes; a step of calculating a phoneme occurrence probability corresponding to each phoneme converted from the speech and outputting phoneme string candidates; a step of referring to a phoneme n-gram that stores a phoneme string of the target language, a word notation string corresponding to the phoneme string, and an occurrence probability, with the stored words classified by topic, setting a topic-specific probability weight for each backward word, searching with the phoneme string information of a preceding word string candidate, checking whether a forward-matching backward word exists in the substring of the phoneme string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the calculated phoneme string candidates; and a step of calculating word string candidates similar to the input speech using the phoneme occurrence probability and the word occurrence probability. Therefore, collecting statistics separately for each topic makes it possible to construct n-grams with strong language constraints without increasing the order of the n-grams, so that a highly accurate speech recognition method is obtained.
[0132]
  The morphological analysis method according to the present invention comprises: a step of capturing an input kana-kanji mixed character string; a step of referring to a kanji n-gram that stores a kana-kanji mixed character string, a word notation string corresponding to the kana-kanji mixed character string, and an occurrence probability, with the stored words classified by topic, setting a topic-specific probability weight for each backward word, searching with the kanji string information of a preceding word string candidate, checking whether a forward-matching backward word exists in the substring of the kanji string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the captured kana-kanji mixed character string; and a step of calculating word string candidates that match the input kana-kanji mixed character string using the calculated word occurrence probability. Therefore, n-grams with strong language constraints can be constructed without increasing the order of the n-grams, so that a highly accurate morphological analysis method is obtained.
[0133]
  The kana-kanji conversion method according to the present invention comprises: a step of capturing an input kana character string; a step of referring to a kana n-gram that stores a kana character string, a word notation string corresponding to the kana character string, and an occurrence probability, with the stored words classified by topic, setting a topic-specific probability weight for each backward word, checking whether a forward-matching backward word exists in the substring of the kana string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the captured kana character string; and a step of calculating word string candidates that match the input kana character string using the calculated word occurrence probability. Therefore, n-grams with strong language constraints can be constructed without increasing the order of the n-grams, so that a highly accurate kana-kanji conversion method is obtained.
[0134]
  The recording medium according to the present invention is a computer-readable recording medium on which is recorded a program for causing a computer to execute a speech recognition method comprising: a step of capturing input speech; a step of converting the captured speech into phonemes; a step of calculating a phoneme occurrence probability corresponding to each phoneme converted from the speech and outputting phoneme string candidates; a step of referring to a phoneme n-gram that stores a phoneme string of the target language, a word notation string corresponding to the phoneme string, and an occurrence probability, with the stored words classified by topic, setting a topic-specific probability weight for each backward word, searching with the phoneme string information of a preceding word string candidate, checking whether a forward-matching backward word exists in the substring of the phoneme string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the calculated phoneme string candidates; and a step of calculating word string candidates similar to the input speech using the phoneme occurrence probability and the word occurrence probability. Therefore, a recording medium is obtained on which a program realizing the highly accurate speech recognition method is recorded.
[0135]
  The recording medium according to the present invention is a computer-readable recording medium on which is recorded a program for causing a computer to execute a morphological analysis method comprising: a step of capturing an input kana-kanji mixed character string; a step of referring to a kanji n-gram that stores a kana-kanji mixed character string, a word notation string corresponding to the kana-kanji mixed character string, and an occurrence probability, with the stored words classified by topic, setting a topic-specific probability weight for each backward word, searching with the kanji string information of a preceding word string candidate, checking whether a forward-matching backward word exists in the substring of the kanji string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the captured kana-kanji mixed character string; and a step of calculating word string candidates that match the input kana-kanji mixed character string using the calculated word occurrence probability. Therefore, a recording medium is obtained on which a program realizing the highly accurate morphological analysis method is recorded.
[0136]
  The recording medium according to the present invention is a computer-readable recording medium on which is recorded a program for causing a computer to execute a kana-kanji conversion method comprising: a step of capturing an input kana character string; a step of referring to a kana n-gram that stores a kana character string, a word notation string corresponding to the kana character string, and an occurrence probability, with the stored words classified by topic, setting a topic-specific probability weight for each backward word, checking whether a forward-matching backward word exists in the substring of the kana string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the captured kana character string; and a step of calculating word string candidates that match the input kana character string using the calculated word occurrence probability. Therefore, a recording medium is obtained on which a program realizing the highly accurate kana-kanji conversion method is recorded.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a speech recognition apparatus according to Embodiment 1 of the present invention.
FIG. 2 is an explanatory diagram showing example sentences analyzed by the speech recognition apparatus according to Embodiment 1.
FIG. 3 is an explanatory diagram showing a specific example of a phoneme n-gram used for analysis by the speech recognition apparatus according to the first embodiment.
FIG. 4 is a flowchart showing a general operation flow of speech recognition in the speech recognition apparatus according to the first embodiment.
FIG. 5 is a block diagram showing a configuration of a speech recognition apparatus according to Embodiment 2 of the present invention.
FIG. 6 is an explanatory diagram showing a specific example of a phoneme n-gram used for analysis in the speech recognition apparatus according to the second embodiment.
FIG. 7 is a flowchart showing a general operation flow of speech recognition in the speech recognition apparatus according to the second embodiment.
FIG. 8 is a block diagram showing a configuration of a speech recognition apparatus according to Embodiment 3 of the present invention.
FIG. 9 is a flowchart showing a general operation flow of speech recognition in the speech recognition apparatus according to Embodiment 3.
FIG. 10 is a block diagram showing a configuration of a morphological analyzer according to Embodiment 4 of the present invention.
FIG. 11 is an explanatory diagram illustrating a specific example of a kanji n-gram used for analysis by the morphological analyzer of the fourth embodiment.
FIG. 12 is a flowchart showing a general operation flow of morpheme analysis in the morphological analyzer of Embodiment 4.
FIG. 13 is a block diagram showing a configuration of a morphological analyzer according to Embodiment 5 of the present invention.
FIG. 14 is an explanatory diagram showing a specific example of a kanji n-gram used for analysis in the morphological analyzer of the fifth embodiment.
FIG. 15 is a flowchart showing a general operation flow of morpheme analysis in the morphological analyzer of the fifth embodiment.
FIG. 16 is a block diagram showing a configuration of a morphological analyzer according to a sixth embodiment of the present invention.
FIG. 17 is a flowchart showing a general operation flow of morpheme analysis in the morphological analyzer of Embodiment 6.
FIG. 18 is a block diagram showing a configuration of a kana-kanji conversion device according to Embodiment 7 of the present invention.
FIG. 19 is an explanatory diagram showing a specific example of a kana n-gram used for analysis in the kana-kanji conversion apparatus of the seventh embodiment.
FIG. 20 is a flowchart showing a schematic operation flow of kana-kanji conversion in the kana-kanji conversion apparatus of the seventh embodiment.
FIG. 21 is a block diagram showing a configuration of a kana-kanji conversion apparatus according to an eighth embodiment of the present invention.
FIG. 22 is an explanatory diagram showing a specific example of a kana n-gram used for analysis in the kana-kanji conversion apparatus of the eighth embodiment.
FIG. 23 is a flowchart showing a schematic operation flow of kana-kanji conversion in the kana-kanji conversion apparatus of the eighth embodiment.
FIG. 24 is a block diagram showing a configuration of a kana-kanji conversion device according to Embodiment 9 of the present invention.
FIG. 25 is a flowchart showing a general operation flow of kana-kanji conversion in the kana-kanji conversion apparatus of the ninth embodiment.
FIG. 26 is a block diagram showing a configuration of a conventional speech recognition apparatus.
FIG. 27 is a flowchart showing a general operation flow of speech recognition in a conventional speech recognition apparatus.
[Explanation of symbols]
1 microphone (input means), 2 phoneme probability calculating means, 5 RAM, 6 output means, 7 baseball topic phoneme n-gram (phoneme n-gram), 8 general topic phoneme n-gram (phoneme n-gram), 9 word probability calculating means, 10 example sentences, 11 phoneme n-gram, 12 word probability calculating means, 13 baseball topic phoneme n-gram (phoneme n-gram), 14 general topic phoneme n-gram (phoneme n-gram), 15 phoneme n-gram, 16 word probability calculating means, 17 file input device (input means), 18 baseball topic kanji n-gram (kanji n-gram), 19 general topic kanji n-gram (kanji n-gram), 20 morpheme probability calculating means, 21 output means, 22 kanji n-gram, 23 morpheme probability calculating means, 24 baseball topic kanji n-gram (kanji n-gram), 25 general topic kanji n-gram (kanji n-gram), 26 kanji n-gram, 27 morpheme probability calculating means, 28 keyboard (input means), 29 baseball topic kana n-gram (kana n-gram), 30 general topic kana n-gram (kana n-gram), 31 kanji probability calculating means, 32 output means, 33 kana n-gram, 34 kanji probability calculating means, 35 baseball topic kana n-gram (kana n-gram), 36 general topic kana n-gram (kana n-gram), 37 kana n-gram, 38 kanji probability calculating means.

Claims

A speech recognition apparatus comprising:
a phoneme n-gram that stores a phoneme string of the target language, a word notation string corresponding to the phoneme string, and an occurrence probability, with the stored words classified by topic;
input means for inputting speech in the target language;
phoneme probability calculating means for converting the speech signal output by the input means into phonemes, calculating a phoneme occurrence probability corresponding to each phoneme, and outputting phoneme string candidates;
word probability calculating means for referring to the phoneme n-gram classified by topic, searching it with the phoneme string information of a preceding word string candidate, checking whether a forward-matching backward word exists in the substring of the phoneme string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the phoneme string candidates output by the phoneme probability calculating means; and
output means for outputting word string candidates similar to the speech input from the input means, calculated using the phoneme occurrence probability calculated by the phoneme probability calculating means and the word occurrence probability calculated by the word probability calculating means,
wherein the word probability calculating means sets a topic-specific probability weight for each backward word.

A morphological analyzer comprising:
a kanji n-gram that stores a kana-kanji mixed character string, a word notation string corresponding to the kana-kanji mixed character string, and an occurrence probability, with the stored words classified by topic;
input means for inputting the kana-kanji mixed character string;
morpheme probability calculating means for referring to the kanji n-gram classified by topic, searching it with the kanji string information of a preceding word string candidate, checking whether a forward-matching backward word exists in the substring of the kanji string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the kana-kanji mixed character string output by the input means; and
output means for outputting word string candidates that match the character string input from the input means, calculated using the word occurrence probability calculated by the morpheme probability calculating means,
wherein the morpheme probability calculating means sets a topic-specific probability weight for each backward word.

A kana-kanji conversion apparatus comprising:
a kana n-gram that stores a kana character string, a word notation string corresponding to the kana character string, and an occurrence probability, with the stored words classified by topic;
input means for inputting the kana character string;
kanji probability calculating means for referring to the kana n-gram classified by topic, checking whether a forward-matching backward word exists in the substring of the kana string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the kana character string output by the input means; and
output means for outputting word string candidates that match the kana character string input from the input means, calculated using the word occurrence probability calculated by the kanji probability calculating means,
wherein the kanji probability calculating means sets a topic-specific probability weight for each backward word.

A speech recognition method comprising the steps of:
capturing input speech;
converting the captured speech into phonemes;
calculating a phoneme occurrence probability corresponding to each phoneme converted from the speech, and outputting phoneme string candidates;
referring to a phoneme n-gram that stores a phoneme string of the target language, a word notation string corresponding to the phoneme string, and an occurrence probability, with the stored words classified by topic, setting a topic-specific probability weight for each backward word, searching with the phoneme string information of a preceding word string candidate, checking whether a forward-matching backward word exists in the substring of the phoneme string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the calculated phoneme string candidates; and
calculating word string candidates similar to the input speech using the phoneme occurrence probability and the word occurrence probability.

A morphological analysis method comprising the steps of:
capturing an input kana-kanji mixed character string;
referring to a kanji n-gram that stores a kana-kanji mixed character string, a word notation string corresponding to the kana-kanji mixed character string, and an occurrence probability, with the stored words classified by topic, setting a topic-specific probability weight for each backward word, searching with the kanji string information of a preceding word string candidate, checking whether a forward-matching backward word exists in the substring of the kanji string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the captured kana-kanji mixed character string; and
calculating word string candidates that match the input kana-kanji mixed character string using the calculated word occurrence probability.

A kana-kanji conversion method comprising the steps of:
capturing an input kana character string;
referring to a kana n-gram that stores a kana character string, a word notation string corresponding to the kana character string, and an occurrence probability, with the stored words classified by topic, setting a topic-specific probability weight for each backward word, checking whether a forward-matching backward word exists in the substring of the kana string candidate following the retrieved preceding word, and calculating a word occurrence probability of each word candidate corresponding to the captured kana character string; and
calculating word string candidates that match the input kana character string using the calculated word occurrence probability.

Capturing the input audio;
Converting the captured speech into phonemes;
Calculating a phoneme occurrence probability corresponding to each phoneme converted from the speech, and outputting a phoneme string candidate;
Referring to a phoneme n-gram in which phoneme strings of the target language, the word notation strings corresponding to the phoneme strings, and occurrence probabilities are stored, with the stored words classified by topic; setting a probability weight for each topic for each backward word; searching the phoneme string information of a preceding word string candidate; checking whether there is a backward word that forward-matches a substring of the phoneme string candidate following the retrieved preceding word; and calculating a word occurrence probability of each word candidate corresponding to the calculated phoneme string candidate;
A computer-readable recording medium recording a program for causing a computer to execute a speech recognition method that comprises the above steps and a step of calculating, using the phoneme occurrence probability and the word occurrence probability, a word string candidate similar to the input speech.
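
As a non-limiting illustration of the final step recited above, the following Python sketch ranks word string candidates using both the phoneme occurrence probability and the word occurrence probability. The claim does not specify how the two probabilities are combined; summing their logarithms (equivalent to multiplying the probabilities), the function name `rank_candidates`, and the sample hypotheses are assumptions made for illustration only.

```python
import math
from typing import List, Tuple

# (word string, phoneme occurrence probability, word occurrence probability)
Hypothesis = Tuple[List[str], float, float]


def rank_candidates(candidates: List[Hypothesis]) -> List[Tuple[List[str], float]]:
    """Score each candidate by log P(phonemes) + log P(words), best first.

    Candidates with a non-positive probability are skipped because their
    logarithm is undefined.
    """
    scored = []
    for words, phoneme_prob, word_prob in candidates:
        if phoneme_prob <= 0 or word_prob <= 0:
            continue
        scored.append((words, math.log(phoneme_prob) + math.log(word_prob)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


# Toy hypotheses, invented for illustration only.
hypotheses = [
    (["記者", "が"], 0.6, 0.05),
    (["汽車", "が"], 0.6, 0.20),
    (["貴社", "が"], 0.5, 0.10),
]
ranked = rank_candidates(hypotheses)
```

Working in the log domain avoids numeric underflow when many small probabilities are multiplied along a long word string.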

Capturing an input kana-kanji mixed character string;
Referring to a kanji n-gram in which kana-kanji mixed character strings, the word notation strings corresponding to the kana-kanji mixed character strings, and occurrence probabilities are stored, with the stored words classified by topic; setting a probability weight for each topic for each backward word; searching the kanji string information of a preceding word string candidate; checking whether there is a backward word that forward-matches a substring of the kanji string candidate following the retrieved preceding word; and calculating a word occurrence probability of each word candidate corresponding to the captured kana-kanji mixed character string;
A computer-readable recording medium recording a program for causing a computer to execute a morphological analysis method that comprises the above steps and a step of calculating, using the calculated word occurrence probability, a word string candidate that matches the input kana-kanji mixed character string.

Capturing the input kana character string;
Referring to a kana n-gram in which kana character strings, the word notation strings corresponding to the kana character strings, and occurrence probabilities are stored, with the stored words classified by topic; setting a probability weight for each topic for each backward word; checking whether there is a backward word that forward-matches a substring of the kana string candidate following the retrieved preceding word; and calculating a word occurrence probability of each word candidate corresponding to the captured kana character string;
A computer-readable recording medium recording a program for causing a computer to execute a kana-kanji conversion method that comprises the above steps and a step of calculating, using the calculated word occurrence probability, a word string candidate that matches the input kana character string.