JP3876451B2

JP3876451B2 - Kana-kanji conversion device

Info

Publication number: JP3876451B2
Application number: JP32638995A
Authority: JP
Inventors: 泰男小山
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 1995-11-20
Filing date: 1995-11-20
Publication date: 2007-01-31
Anticipated expiration: 2015-11-20
Also published as: JPH09146944A

Description

【０００１】
【発明の属する技術分野】
本発明は、仮名漢字変換装置および仮名漢字変換方法に関し、少なくとも漢字を含む文字列を取り込んでこれを変換する仮名漢字変換装置および仮名漢字変換方法に関する。
【０００２】
【従来の技術】
日本語をコンピュータで扱うことを目的として作られた仮名漢字変換装置では、仮名文字列を入力し、これを漢字仮名混じり文に変換する。これに対して、近年、既に確定した文字列を再変換することを目的として、あるいはＯＣＲにより読み取った文字列や手書き認識により読み取った文字列を再変換することを目的として、漢字が混じった文字列を変換しようとするものが提案されている。
【０００３】
即ち、ワードプロセッサなどに入力して確定済みの文字列に変換の誤りなどを見いだした場合に、もう一度仮名文字列を入力する替わりに、入力され確定済みの文字列から仮名文字列を作成して、再変換に供するのである。
【０００４】
【発明が解決しようとする課題】
しかしながら、従来の仮名漢字変換では、変換済みの文字列から逆変換用の単語辞書（漢字見出しと読みとを有する辞書）を参照して仮名文字列を得ているので、単語が不完全な形態で入力されている場合には、再変換できないという問題があった。例えば、手書き文字を認識する場合、「飛行機」や「伊藤」といった熟語について画数の多い漢字を入力する手間を嫌って「ひ行き」「伊とー」と入力することが考えられる。こうした文字列を正しい日本語文字列に変換しようとすると、一旦これらの文字列を仮名文字列に変換しなければならないが、辞書には「飛行機」を見出しとする語は登録されていても、「ひ行き」といった語は登録されていないから、うまく再変換できないと言う問題を招致する。
【０００５】
また、一旦仮名文字列に変換できたとしても、該当する単語が見いだされない場合には、入力を最初からやり直さなければならなかった。例えば、名前などは種々の漢字を当てはめることができるので、一旦漢字を仮名に戻してから仮名漢字変換を行なっても所望の漢字を得ることができない場合があった。
【０００６】
本発明は、上記問題点を解決するためになされ、少なくとも漢字が混じった文字列をスムースに仮名漢字変換することを目的とする。
【０００７】
【課題を解決するための手段およびその作用・効果】
かかる目的を達成するためになされた本発明の仮名漢字変換装置は、
少なくとも漢字を含んだ文字列を一旦仮名文字列に変換した後、再度仮名漢字変換を行なう仮名漢字変換装置であって、
前記漢字を含む文字列を取り込む文字列入力手段と、
該取り込まれた文字列を、文節を構成する自立語について漢字見出しと読みとを記録した第１の辞書を参照して仮名文字列に逆変換する第１の逆変換手段と、
該逆変換により仮名文字列が得られれば、該仮名文字列の前方から単文節として仮名漢字変換して語尾・付属語との接続検定を行なった上で、単文節の変換候補を出力する仮名漢字変換手段と、
該仮名漢字変換手段により変換候補が得られなかった場合には、単漢字について漢字見出しと読みとを記録した第２の辞書を参照して、前記漢字を含む文字列を仮名文字列に逆変換し、該逆変換された仮名文字列を前記仮名漢字変換手段に供する第２の逆変換手段と、
前記第２の逆変換手段により得られた仮名文字列を基にして、前記仮名漢字変換手段による変換候補が得られなかった場合には、該仮名文字列に対して単漢字変換を行なう単漢字変換手段と、
前記単漢字変換手段における単漢字変換に際して、前記第２の逆変換手段が取り込んだ文字列に含まれる漢字の位置を、前記単漢字変換手段における変換の対象となる仮名文字の区切り位置とする区切り位置決定手段と
を備えることを要旨とする。
【０００８】
この仮名漢字変換装置に対応した本発明の仮名漢字変換方法は、コンピュータが、少なくとも漢字を含んだ文字列を一旦仮名文字列に変換した後、再度仮名漢字変換を行なう処理を実行することにより実現される仮名漢字変換方法であって、
前記漢字を含む文字列を、文字列を入力する手段を介して取り込み、
該取り込まれた文字列を、文節を構成する自立語について漢字見出しと読みとを記録した第１の辞書を参照して仮名文字列に逆変換し、
該逆変換により仮名文字列が得られれば、該仮名文字列の前方から単文節として仮名漢字変換を行なう手段に供して語尾・付属語との接続検定を行なった上で、単文節の変換候補を出力し、
該変換により変換候補が得られなかった場合には、単漢字について漢字見出しと読みとを記録した第２の辞書を参照して前記漢字を含む文字列を仮名文字列に逆変換し、該逆変換された仮名文字列を前記仮名漢字変換を行なう手段に供し、
前記仮名漢字変換を行なう手段が変換候補を作成できなかった場合には、前記文字列に含まれる漢字の位置を仮名文字の区切り位置として、前記逆変換して得られた仮名文字列に対して単漢字変換を行なう
ことを要旨とする。
【０００９】
かかる仮名漢字変換装置および方法は、ＯＣＲやペン入力、キーボード入力などにより取り込んだ漢字を含む文字列を、文節を構成する自立語について漢字見出しと読みとを記録した第１の辞書を用いて仮名文字列に逆変換し、得られた仮名文字列を基に仮名漢字変換を行なって、単文節の変換候補を出力する。かかる変換候補が得られなかった場合には、単漢字について漢字見出しと読みを記録した第２の辞書を参照して、前記取り込まれた文字列を仮名文字列に逆変換し、これを仮名漢字変換を行なう手段に供する。従って、この仮名漢字変換装置および方法によれば、対象となる文字列について文節を構成する語が含まれるとしてまず逆変換および仮名漢字変換を行なった後、変換候補が得られない場合（例えば漢字の熟語の一部を仮名で入力した場合など）には、単漢字についての第２の辞書を参照して逆変換を行なうことにより、所望の変換候補を得られ易くなると言う利点がある。しかも、本発明では、第２の逆変換手段により得られた仮名文字列を基にして、前記仮名漢字変換手段が変換候補を作成できなかった場合には、該仮名文字列に対して単漢字変換を行なう単漢字変換手段を備え、更に、取り込まれた文字列に含まれる漢字の位置を、単漢字変換の対象となる仮名文字の区切り位置とする区切り位置決定手段を備える。従って、仮名漢字変換用の辞書に登録がない文字列について漢字候補を出力することができ、人名などの変換の場合に有効である。単漢字変換の区切り位置が決定できるので、単漢字変換の変換精度を向上することができる。例えば「ひ行き」という文字列を取り込んだ場合、「行」を区切り位置として「ひ」と「き」について単漢字変換を行なえば良く、「ひこ（彦）」等を変換候補とする必要がない。
【００１０】
漢字などを含む文字列を再変換するといった要請は、比較的短い文字列の範囲に対して生じると考えられるから、単文節を前提とした逆変換および仮名漢字変換を起動することにより、文法解析などの手間を低減でき、短時間のうちに変換候補を得ることができる。しかも、変換候補が得られない場合には、単漢字毎に逆変換を行なってから仮名漢字変換に供するので、仮名漢字変換の基本となる仮名文字列が必ず得られるという利点がある。最終的に単漢字変換を用いれば、所望の変換結果を得ることができる可能性は高い。
【００１１】
なお、第２の逆変換手段によって得られた仮名文字列は、単文節の仮名漢字変換を行なう仮名漢字変換手段による仮名漢字変換に供しても良いが、これ以外の仮名漢字変換、例えば単漢字変換や連文節変換あるいは自動変換などに供しても差し支えない。
【００１２】
この仮名漢字変換装置において、前記仮名漢字変換手段に、第２の辞書を参照して得られた仮名文字列について仮名漢字変換を行なう際、前記取り込まれた文字列に含まれる漢字と仮名漢字変換後に得られた単文節に含まれる漢字とが一致するもののみを変換候補とする変換候補制限手段を備えるものとすることができる。例えば、「公えん」と入力されたときに、「公園」や「公演」は変換候補とするが、「講演」や「後援」などは候補としないのである。この結果、単文節変換における変換精度が各段に向上する。
【００１３】
また、この仮名漢字変換装置において、第２の逆変換手段により得られた仮名文字列を基にして、前記仮名漢字変換手段が変換候補を作成できなかった場合には、該仮名文字列に対して単漢字変換を行なう単漢字変換手段を備えるものとすることができる。この場合には、仮名漢字変換用の辞書に登録がない文字列について漢字候補を出力することができる。人名などの変換の場合に有効である。更に、取り込まれた文字列に含まれる漢字の位置を、単漢字変換の対象となる仮名文字の区切り位置とする区切り位置決定手段を備えるものとすることもできる。この場合には、単漢字変換の区切り位置が決定しやすいので、単漢字変換の変換精度を向上することができる。例えば「ひ行き」という文字列を取り込んだ場合、「行」を区切り位置として「ひ」と「き」について単漢字変換を行なえば良く、「ひこ（彦）」等を変換候補とする必要がない。
【００１４】
仮名漢字変換手段による変換候補が得られなかった場合に、最終的に単漢字変換手段による単漢字変換を行なう場合、得られた変換候補がいずれの変換手段により変換されたものであるか直ちに判断することが困難なことが考えられる。そこで、仮名漢字変換手段により出力される単文節としての変換候補と前記単漢字変換手段により出力される単漢字としての変換候補とを区別可能な態様で表示することも好適である。
【００１５】
漢字の含まれる文字列から仮名文字列を得るには、漢字から仮名を取り出す逆変換用の辞書が必要となる。この辞書は、専用に用意してもよいが、仮名文字列を漢字混じり文に変換する仮名漢字変換用の辞書から、生成することも可能である。なお、仮名漢字変換において新たな単語が登録された場合には、そのたびに逆変換用の辞書を更新するものとすることも可能である。この場合には、基本的に用意する辞書を一種類ですませることができる。
【００１６】
上述した様々な仮名漢字変換装置において、第２の辞書を単漢字変換用の辞書から生成するものとすることができる。この場合には、特別な辞書を用意する必要がなく、簡便に第２の辞書を構築することができる。
【００１７】
なお、仮名漢字変換手段によって漢字候補を出力する場合には、使用頻度に応じた順に単漢字変換による変換候補を出力するものとすることができる。単漢字変換の場合には、変換候補が多数に上ることがあり得るから、どのような手法で変換候補を出力するかは使い勝手に影響を与えやすい。
【００１８】
【発明の他の態様】
本発明の他の態様として、入力手段が手書き認識手段であるもの、スキャナなどの光学式文字読み取り装置であるもの、音声認識装置である等、種々の態様を考えることができる。更に、この仮名漢字変換装置は、単独でワードプロセッサとして用いることもできるし、仮名漢字変換を要する種々の機器に組み込んで用いることも可能である。また、コンピュータのオペレーティングシステムに組み込む形態で実現することも可能である。コンピュータを用いた機器にソフトウェアとして組み込んで実現する場合には、このソフトウェアを記録したフレキシブルディスクやＣＤ−ＲＯＭなどの形態で取り扱ったり、パソコン通信などを介して配布することも可能である。
【００１９】
例えば、コンピュータシステムのマイクロプロセッサによって実行されることによって、少なくとも漢字を含んだ文字列を一旦仮名文字列に変換した後、再度仮名漢字変換を行なうソフトウェアプログラムを格納した携帯型記憶媒体であって、
前記漢字を含む文字列を、取り込み、
該取り込まれた文字列を、辞書を参照して仮名文字列に逆変換し、
該逆変換により得られた仮名文字列を単文節として仮名漢字変換して、単文節の変換候補を出力し、
該単文節の変換候補が得られなかった場合には、前記逆変換された仮名文字列に対して単漢字変換を行ない、単漢字の変換候補を出力する
機能を実現するものが実施可能である。
【００２０】
これらの各部の機能を実現するソフトウェアプログラム（アプリケーションプログラム）は、フロッピディスクやＣＤ−ＲＯＭ等の携帯型記憶媒体（可搬型記憶媒体）に格納され、携帯型記憶媒体からコンピュータシステムのメインメモリまたは外部記憶装置に転送される。なお、通信回線を介して、これらのソフトウェアプログラムを提供する装置を設け、上記仮名漢字変換方法を実現するソフトウェアプログラムを通信回線を介して、この供給装置からコンピュータシステムのメインメモリまたは外部記憶装置に転送するものとしても良い。例えば、電話回線を介してパソコン通信のホストコンピュータからダウンロードしたり、衛星放送を用いて配信を受けることも可能である。
【００２１】
【発明の実施の形態】
次に、本発明に係る仮名漢字変換装置の好適な実施例について、図面に基づき説明する。図１は、第１本実施例の仮名漢字変換装置の機能ブロック図、図２は、かかる仮名漢字変換装置が実現されるコンピュータの概略構成を示すブロック図、である。
【００２２】
説明の便宜上、まず図２に従い、コンピュータ１０のハードウェア構成について説明する。このコンピュータ１０は、図示するように、ローカルバス２２に接続された演算処理部２０、ローカルバス２２を外部バスの一つであるＰＣＩバス３２に接続するＰＣＩブリッジ３０、ＰＣＩバス３２を介して演算処理部２０のＣＰＵ２１等によりアクセスを受けるコントローラ部４０、各種のＩ／Ｏ装置等を制御する機器が低速の外部バスであるＩＳＡバス４２に接続されたＩ／Ｏ部６０、および周辺機器であるキーボード７２，スピーカ７４，ＣＲＴ７６などから構成されている。
【００２３】
演算処理部２０は、中央演算処理装置としてのＣＰＵ２１（本実施例ではインテル社製Ｐｅｎｔｉｕｍを使用）、キャッシュメモリ２３，そのキャッシュコントローラ２４およびメインメモリ２５から構成されている。ＰＣＩブリッジ３０は、高速のＰＣＩバス３２を制御する機能を備えたコントローラである。ＣＰＵ２１が扱うメモリ空間は、ＣＰＵ２１の内部に用意された各種レジスタにより、実際の物理アドレスより広い論理アドレスに拡張されている。
【００２４】
コントローラ部４０は、モニタ（ＣＲＴ）７６への画像の表示を司るグラフィックスコントローラ（以下、ＶＧＡと呼ぶ）４４、接続されるＳＣＳＩ機器とのデータ転送を司るＳＣＳＩコントローラ４６、ＰＣＩバス３２と下位のＩＳＡバスとのインタフェースを司るＰＣＩ−ＩＳＡブリッジ４８から構成されている。ＶＧＡ４４は、ＣＲＴ７６に対して、６４０×４８０ドット、１６色表示が可能である。なお、表示用のフォントを記憶したキャラクタジェネレータや所定のコマンドを受け取って所定の図形を描画するグラフィックコントローラ、さらには描画画像を記憶するビデオメモリ等は、このＶＧＡ４４に実装されているが、これらの構成は周知のものなので、必要に応じて後述するものとし、図２では省略した。
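[0025]
The ISA bus 42, connected via the PCI-ISA bridge 48, is an input/output control bus to which various I/O devices are connected. It carries a DMA controller (hereinafter simply DMA) 50, a real-time clock (RTC) 52, two composite I/O ports 54 and 55, a sound I/O 56, a keyboard interface (hereinafter KEY) 64 that handles the interface with the keyboard 72 and the mouse 73, an interrupt controller (hereinafter PIC) 66 that performs prioritized interrupt control, a timer 68 that performs various time counts and generates beep sounds, and so on. An ISA slot 62, in which expansion boards can be mounted, is also connected to the ISA bus 42.
[0026]
The composite I/O port 54 provides parallel and serial outputs as well as ports for inputting and outputting the signals that control the floppy disk drive 82 and the hard disk 84. A printer 88 is connected to the parallel input/output via the parallel port 86, and a modem 92 is connected to the serial input/output via the serial port 90. The other composite I/O port 55 is connected to a scanner 93 and a tablet 94 that accepts handwritten input. In addition to the speaker 74 mentioned above, a microphone 96 can be connected to the sound I/O 56. Besides these components, DOS/V machines often provide standardized I/O channels, but their illustration and description are omitted in this embodiment.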
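[0027]
Next, the functions executed by the hardware configured in this way are described with reference to FIG. 1. The configuration and operation of each unit shown in FIG. 1 are outlined below; the processing is executed by the central processing unit (CPU 21) on the basis of data input from the keyboard 72, the scanner 93, or the tablet 94. All processing is performed by this CPU 21. As for kana-kanji conversion, after a character string has been input from the keyboard 72 or the like, when execution of kana-kanji conversion is instructed by a predetermined operation, a predetermined interrupt process starts and invokes a device driver that converts the captured character string (a kana character string or a character string containing kanji) and ultimately produces a mixed kana-kanji character string. Of course, on a computer capable of parallel processing, kana-kanji conversion may be performed by a single application (an input method) that hands the conversion result to whichever application needs it. In that case, the input method takes charge of all input from the keyboard 72 and the like.
[0028]
As shown in FIG. 1, the kana-kanji conversion device realized on the computer 10 receives a character string through the input unit IPU. The input unit IPU basically captures a character string that has already been entered as part of a document and has been selected by range specification, but it also covers input using the keyboard 72, which enters character strings directly as codes; the scanner 93 with character recognition software, which reads characters written on paper or the like and recognizes them as a character string; and the tablet 94 with handwriting recognition software, which accepts handwritten characters and recognizes them as a character string.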
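[0029]
A character string input in this way may contain kanji, and the portion containing kanji is converted into a kana character string by the first reverse conversion unit RC1 using the first dictionary stored on the hard disk 84. The first dictionary gives at least the reading for each kanji headword (a word that can constitute a phrase); in the embodiment, as shown in FIG. 3, each entry also carries part-of-speech information.
[0030]
Next, the kana-kanji conversion unit KKC performs single-phrase conversion on the kana character string obtained from the first reverse conversion unit RC1. Single-phrase conversion is well known: it converts the given kana character string on the assumption that it consists of one independent word plus attached words. Restricting conversion to a single phrase simplifies morphological analysis of the kana string, so conversion candidates can be obtained by simple matching. The details of the processing are described later. Although not illustrated, single-phrase conversion uses an independent-word dictionary and an attached-word dictionary for kana-kanji conversion.
[0031]
If single-phrase conversion yields conversion candidates, they are output and displayed in a predetermined area of the CRT 76. If no candidate is obtained, the second reverse conversion unit RC2 is started next, and the character string obtained by the input unit is reverse-converted into a kana character string using the second dictionary, which holds kanji headwords and readings for single kanji. As a result, even when part of a word constituting a phrase that would normally be written in kanji is written in kana, kana-kanji conversion can be performed and conversion candidates are more readily obtained. If there is still no candidate, a module that performs single-kanji conversion, not shown in FIG. 1, can additionally be started. This module uses a single-kanji conversion dictionary (not shown). With single-kanji conversion, conversion candidates can be presented for most character strings.
[0032]
The software program (application program) that realizes the functions of these units may be stored in ROM so that the CPU 21 can execute it directly, but it may also be stored on a portable storage medium such as a flexible disk, MO, or CD-ROM, transferred from that medium to the main memory 25 or the hard disk 84 of this computer system, and then executed. Alternatively, a device that supplies the software program over a communication line may be provided, with the connection to the communication line made via the modem 92, so that the software program is transferred from the supplying device to the main memory 25 or the hard disk 84 of the computer system.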
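[0033]
Next, each processing routine is described in turn. FIGS. 4 and 5 are flowcharts showing the reconversion routine, which captures a character string containing kanji and performs kana-kanji conversion on it again. While an application program such as a word processor is running, when a range of already entered and confirmed text is specified and reconversion is instructed, the kana-kanji conversion program is started and begins the processing shown in FIGS. 4 and 5.
[0034]
When this routine starts, the target character string is first acquired (step S100). The target character string is the string that is passed, by range specification, to the kana-kanji conversion program of this embodiment and that is the object of reconversion below. For example, as shown in FIG. 6(A), suppose that the confirmed string 「佐賀のは今日の北部に」 has been acquired as the target character string. Next, the search character pointer and the search count are initialized (step S110), and one search character is added (step S120). As shown in FIG. 6(B), the initial value of the search character pointer is 0, so performing step S120 once sets the pointer to the first character.
[0035]
Next, the reverse-lookup independent-word dictionary JD1 is searched (step S130). That is, JD1, which corresponds to the first dictionary, is searched for the character indicated by the search character pointer (「佐」 in FIG. 6). JD1 can be searched by kanji headword and, in the embodiment, returns the reading and part-of-speech information. After the search, the result is evaluated. JD1 is ordered by kanji headword, and one search reads out one word with the same kanji headword, so it is first determined whether the first word for that headword matches (step S140). If no matching word is found, as in the example of FIG. 6, it is then determined whether there is a prefix match (step S150). If 「佐賀」 or 「佐藤」 exists for 「佐」, a prefix-matching word exists, so the process returns to step S120, adds one search character, and searches JD1 again. In other words, JD1 is searched for 「佐賀」.
[0036]
In this case the matching word 「佐賀」 is found, so processing moves to step S160 and later in FIG. 5. If there is neither a match at the search-count-th entry nor a prefix match, it is judged that JD1 cannot yield a proper kana character string, and processing moves to the additional conversion routine shown in FIG. 8, which is described later.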
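The search loop of steps S120 to S150 can be sketched as follows. This is a minimal illustration, not the routine of FIG. 4 itself: the dictionary contents, the names `REVERSE_WORDS` and `find_independent_word`, and the use of a Python dict in place of a sorted dictionary file are all assumptions, and the search-count retry over several entries with the same headword is not modeled.

```python
# Hypothetical reverse dictionary: kanji headword -> list of (reading, part of speech).
REVERSE_WORDS = {
    "佐賀": [("さが", "place name")],
    "佐藤": [("さとう", "surname")],
    "実物": [("じつぶつ", "noun")],
    "実物大": [("じつぶつだい", "noun")],
}


def has_longer_prefix_match(key, dictionary):
    """True if some headword starts with `key` and is longer than it."""
    return any(head.startswith(key) and head != key for head in dictionary)


def find_independent_word(text, dictionary=REVERSE_WORDS):
    """Grow the search key one character at a time.

    Returns (headword, entries) for the first exact match, or None when the
    key has neither an exact match nor a prefix match, which is the point at
    which the text falls back to the additional conversion routine.
    """
    for end in range(1, len(text) + 1):
        key = text[:end]
        if key in dictionary:                      # exact match (cf. step S140)
            return key, dictionary[key]
        if not has_longer_prefix_match(key, dictionary):  # cf. step S150
            return None
    return None
```

With this toy data, `find_independent_word("佐賀のは")` stops at 「佐賀」 after one extension, while `find_independent_word("ひ行き")` returns `None` on the first character.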
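[0037]
When the search of JD1 finds exactly the search-count-th match (step S140), the word can be judged to be registered in JD1, so the search character pointer is advanced by one (step S160), and the ending/attached-word search count is initialized, that is, set to 1 (step S170). Next, one search character is added so that the pointer is placed on the first character following the independent word (「の」 in FIG. 6(C)), the reverse-lookup dictionary JD2 for endings and attached words is searched (step S190), and it is determined whether there is a match (step S200). If there is no match but a prefix match is found (step S210), a search character is added (step S180), as in the search of JD1, and the reverse ending/attached-word search is performed again (step S190). Although FIG. 6 shows only the headwords of JD2, attached words also include affixes written in kanji (such as 「化」, 「課」, 「町」, and 「下さい」), so, like JD1, JD2 consists of kanji or kana headwords together with their readings and grammatical information.
[0038]
In the example of FIG. 6, 「の」 is found as a headword in the reverse ending/attached-word dictionary, so the determination in step S200 is YES, and a connection test is performed (step S220). That is, the grammatical connection with the independent word already found in the reverse independent-word dictionary is tested to determine whether the two connect (step S230). The case particle 「の」 does connect to the place name 「佐賀」; if it did not, the search count would be incremented (step S240) and the processing from the reverse ending/attached-word search (step S190) would be repeated. When a connecting ending or attached word is found, it is appended to the independent word found earlier (step S250), and it is determined whether the target character string has been reconstructed for the whole of the acquired string (step S260). In the example of FIG. 6, when the search pointer is on 「の」 the connection test yields YES, so completion of the reconstruction is checked; because 「は」 still remains, the process returns to step S160 and repeats from advancing the search character pointer by one. This time the attached word 「のは」 is found and the whole target string has been reconstructed, so the kana string 「さがのは」 is obtained. Kana-kanji conversion (step S280), centered on an independent-word search using the reading of the independent word (step S270), is then executed. That is, as shown in FIG. 7, the independent-word dictionary is searched for the longest-matching independent words to which the attached word also connects, conversion candidates such as 「佐賀のは」, 「嵯峨のは」, and 「性のは」 are displayed, and the user's selection among the candidates is accepted (step S300). The candidate selection processing, which involves display, is described in detail later.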
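The connection test between the independent word and what follows it might look like the sketch below. It is deliberately simplified: the whole remainder of the string is looked up at once instead of being extended character by character, and the suffix entries, the part-of-speech labels, and the `CONNECTS` table are invented for illustration.

```python
# Hypothetical reverse dictionary for endings and attached words:
# headword -> list of (reading, category).
REVERSE_SUFFIXES = {
    "の": [("の", "case particle")],
    "のは": [("のは", "particle sequence")],
    "は": [("は", "binding particle")],
}

# Hypothetical connection table: (part of speech of the independent word,
# category of the attached word) pairs that are allowed to connect.
CONNECTS = {
    ("place name", "case particle"),
    ("place name", "particle sequence"),
    ("noun", "case particle"),
}


def reading_with_suffix(word_reading, word_pos, rest):
    """Return the kana reading of word + rest, or None if nothing connects."""
    for suffix_reading, suffix_category in REVERSE_SUFFIXES.get(rest, []):
        if (word_pos, suffix_category) in CONNECTS:   # cf. steps S220 and S230
            return word_reading + suffix_reading      # cf. step S250
    return None
```

For the example in the text, `reading_with_suffix("さが", "place name", "のは")` gives the kana string 「さがのは」 that is then passed to single-phrase conversion.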
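[0039]
In the ending/attached-word search, if there is no match at the search-count-th entry and no prefix match either, the problem is considered to lie in how the independent word was cut out before the ending/attached-word search began. The search count is therefore incremented (step S290), and processing returns to step S130 in FIG. 4 and resumes from the independent-word search, that is, the reverse independent-word search. If the next word under the same headword matches, it is used for the ending/attached-word search and connection test described above. For example, suppose that for the string 「実物大の」 the reverse independent-word search up to 「実物」 finds the entry 「実物, じつぶつ, noun」; an ending/attached-word search is then performed. However, JD2 has no headword corresponding to 「大」, and no prefix match is found. In this case the search count is incremented (step S290) and the search is repeated (step S130 onward); there is no search-count-th match (step S140) but a prefix match is found (step S150), so the independent word 「実物大」 is cut out next, and the ending/attached-word search begins. This time the attached word 「の」 is found and the reconstruction of the target string is complete.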
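[0040]
Next, the additional conversion processing, executed when the reverse independent-word search finds no match at all, is described with reference to FIG. 8. When the additional conversion of FIG. 8 starts, the character pointer is initialized, that is, set to 1 (step S310), and one character is taken from the pointer position (step S320). It is determined whether this character is a kanji (step S330); if it is, the reverse-lookup single-kanji dictionary KD, which corresponds to the second dictionary, is searched (step S340), and if it is not, the character itself is taken as the reading (kana) (step S350). The reading obtained is appended to the end of the kana string acquired so far (step S360), and the character pointer is advanced by 1 (step S370). It is then determined whether any characters remain (step S380); if so, the process returns to step S320 and repeats from taking one character.
[0041]
For example, for the string 「ひ行き」 shown in FIG. 9, the reconversion method of FIGS. 4 and 5 breaks down at the reverse independent-word search, and the additional conversion of FIG. 8 is started. For the character 「ひ」, taken when the character pointer is 1, the reading (kana) 「ひ」 is obtained as is; for the character 「行」, the reading 「こう」 is obtained by consulting the reverse single-kanji dictionary KD. Repeating this for all characters yields the reading 「ひこうき」.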
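The character-by-character loop of steps S310 to S380 can be sketched as below. The dictionary contents and the names `REVERSE_SINGLE_KANJI` and `to_kana` are hypothetical, the kanji test covers only the basic CJK Unified Ideographs block, and taking the first listed reading is an assumption: the text only states that 「こう」 is obtained for 「行」.

```python
# Hypothetical reverse single-kanji dictionary: kanji -> readings, preferred first.
REVERSE_SINGLE_KANJI = {
    "行": ["こう", "ぎょう"],
    "飛": ["ひ"],
    "機": ["き"],
}


def is_kanji(ch):
    """Rough test: basic CJK Unified Ideographs block only."""
    return "\u4e00" <= ch <= "\u9fff"


def to_kana(text, dictionary=REVERSE_SINGLE_KANJI):
    """Convert `text` to kana one character at a time.

    Kanji are replaced by their first dictionary reading; every other
    character is copied unchanged. Returns None if a kanji is unknown.
    """
    reading = []
    for ch in text:                        # cf. steps S320 and S380
        if is_kanji(ch):                   # cf. step S330
            readings = dictionary.get(ch)  # cf. step S340
            if not readings:
                return None
            reading.append(readings[0])
        else:
            reading.append(ch)             # cf. step S350
    return "".join(reading)                # accumulated as in step S360
```

For the example of FIG. 9 as described in the text, `to_kana("ひ行き")` yields 「ひこうき」.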
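[0042]
When no characters remain, single-phrase kana-kanji conversion is started with the reading obtained (step S390). This conversion is the same as step S280 in FIG. 5; once kana-kanji conversion yields conversion candidates, candidate selection (step S300) is executed. The candidate selection processing (step S300) is therefore described next in detail. As shown in the flowchart of FIG. 10, an instruction from the keyboard 72 or the like is first accepted (step S400); if the instruction selects a candidate, the candidates are swapped (step S410). In the reconversion result shown in FIG. 7, if the first candidate is 「嵯峨野は」, it is replaced by a next candidate such as 「佐賀の」 or 「性の」. Since reconversion is being performed on 「佐賀の」, it is also preferable either not to display the string 「佐賀の」 that was the object of reconversion or to display it as the last candidate.
[0043]
If the instruction is a correction instruction (step S400), the displayed candidate is returned to a kana string and corrected (step S420). If the intended string was 「坂野は（さかのは）」 and a misconversion arose because 「さがのは」 was entered by mistake, returning the confirmed string to kana and reconverting cannot give the correct result. In such a case, after the string is returned to kana, the reading (kana string) is corrected. After the reading is corrected, single-phrase kana-kanji conversion is executed (step S430). This conversion is the same as the single-phrase kana-kanji conversion described above (steps S280 and S390). In additional conversion too, correction of the string and the subsequent single-phrase kana-kanji conversion are performed in exactly the same way.
[0044]
If the instruction from the keyboard 72 or the like is a confirmation instruction, learning is first performed on the independent-word dictionary KD so that the selected candidate is output preferentially (step S440). This is the same as learning in ordinary kana-kanji conversion. The phrase consisting of the currently selected candidate is then output as the confirmed string (step S450), and the routine ends.
[0045]
Through the candidate selection processing above, a string that is the object of reconversion or additional conversion can easily be converted into the desired Japanese string.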
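[0046]
In the embodiment described above, the reverse independent-word dictionary JD1 and the reverse ending/attached-word dictionary JD2 were described as being generated from the independent-word dictionary KD and the ending/attached-word dictionary, respectively. An ordinary independent-word dictionary such as KD holds the reading as headword, the kanji as the converted string, and part-of-speech information, so generating the reverse dictionary of FIG. 3 from it is easy. Whether the reverse dictionary is generated from the independent-word dictionary in this way or prepared separately, a newly registered word must also be registered in the reverse dictionary. FIG. 11 is a flowchart showing the word registration routine.
[0047]
As illustrated, when this routine is started, the word to be registered is first input (step S500). The reading is then input (step S510), followed by the part of speech (step S520). The part of speech may be entered by choosing from a prepared list. After this information has been input in order, the word is registered in the independent-word dictionary or the ending/attached-word dictionary (step S530) and then in the reverse dictionary (step S540).
[0048]
By executing this processing, the reverse dictionary is updated at the same time as the word is registered, so the word becomes available for reconversion and additional conversion.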
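Registration into both dictionaries at once (steps S530 and S540) can be illustrated with a small in-memory structure. The class name `UserDictionary`, its attributes, and the duplicate check are assumptions made for this sketch; the text does not describe how the dictionaries are stored.

```python
from collections import defaultdict


class UserDictionary:
    """Keeps a forward dictionary and its reverse-lookup twin in step."""

    def __init__(self):
        self.forward = defaultdict(list)  # reading -> [(word, part of speech)]
        self.reverse = defaultdict(list)  # word -> [(reading, part of speech)]

    def register(self, word, reading, pos):
        """Register a word in the forward dictionary, then in the reverse one."""
        if (word, pos) not in self.forward[reading]:     # cf. step S530
            self.forward[reading].append((word, pos))
        if (reading, pos) not in self.reverse[word]:     # cf. step S540
            self.reverse[word].append((reading, pos))
```

After `register("貴田岡", "きたおか", "surname")`, the same entry can be found from the reading for ordinary conversion and from the kanji for reverse conversion.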
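[0049]
In the embodiment described above, reverse conversion is performed per registered word before kana-kanji conversion is applied again (reconversion), and reverse conversion is performed per single kanji before kana-kanji conversion is applied again (additional conversion); in both cases the renewed kana-kanji conversion is single-phrase conversion. This is because requests for reconversion or additional conversion rarely span multiple phrases, and conversion speed is considered to deserve priority. In single-phrase conversion, handling affixes (prefixes in particular) can be difficult, but if "affix + word" combinations are registered in advance, they can easily be converted even by single-phrase conversion. Of course, morphological analysis may be performed and multi-phrase conversion used instead.
[0050]
For personal names and the like, single-phrase conversion may fail to produce the desired characters. It is therefore also preferable, for such cases among others, to add a single-kanji conversion option to the candidate selection processing (step S300). FIG. 12 shows an example. When input from the keyboard 72 or the like instructs that single-kanji conversion be started, the reading is segmented (step S460), and single-kanji conversion is started for each segment (step S470). With this single-kanji conversion, as shown in FIG. 13, when a string such as 「ひ行き」 is specified, the presence of the kanji 「行」 causes the reading to be split before and after it, and 「ひ」 and 「き」 are each subjected to single-kanji conversion. 「行」 itself may be made a target of single-kanji conversion, or, preferably, may be judged to have been entered in kanji already and excluded. In the example of FIG. 13, the portion entered in kanji is excluded from single-kanji conversion. If the input string contains no kanji, the longest string that can serve as a single-kanji headword is cut from the beginning and passed to single-kanji conversion in turn.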
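Splitting the input at the positions of its kanji (step S460) can be sketched with `itertools.groupby`. The function names are hypothetical, and the kanji test is the same rough range check on the basic CJK Unified Ideographs block, which is an assumption rather than anything the text specifies.

```python
from itertools import groupby


def is_kanji(ch):
    """Rough test: basic CJK Unified Ideographs block only."""
    return "\u4e00" <= ch <= "\u9fff"


def split_at_kanji(text):
    """Split `text` into runs, each tagged with whether it is a kanji run."""
    return [("".join(run), kanji) for kanji, run in groupby(text, key=is_kanji)]


def kana_segments(text):
    """Return only the kana runs, i.e. the parts left for single-kanji conversion."""
    return [run for run, kanji in split_at_kanji(text) if not kanji]
```

For 「ひ行き」 this gives the runs 「ひ」, 「行」, 「き」, of which only 「ひ」 and 「き」 are handed to single-kanji conversion, so a reading such as 「ひこ」 that straddles the kanji is never proposed.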
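[0051]
Such single-kanji conversion is useful not only when part of the input is in kanji, as in 「ひ行き」, but also when a confirmed kanji string is converted, as illustrated in FIG. 14. For personal names in particular it is difficult to register every name, so using single-kanji conversion as well is useful. Even if the name 「北岡」 in FIG. 14 is reverse-converted into the kana string 「きたおか」, ordinary phrase conversion has difficulty converting it into 「貴田岡」. In such a case, single-kanji conversion is started: the kana string 「きたおか」 obtained by reverse conversion is split into combinations of the shortest headwords that have single-kanji entries, and single-kanji conversion is then started. In this example the splits 「きた」＋「おか」, 「きた」＋「お」＋「か」, 「き」＋「た」＋「おか」, and 「き」＋「た」＋「お」＋「か」 are possible; by specifying these as appropriate, single-kanji conversion can be started for 「き」＋「た」＋「おか」 to obtain the desired kanji 「貴」＋「田」＋「岡」. Candidates obtained by single-kanji conversion and those obtained by single-phrase conversion are preferably displayed in different ways. In the example of FIG. 14 the candidates are displayed split by conversion unit, so it is intuitively clear which conversion has been performed, but it is also preferable to make this explicit by, for example, changing the display color.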
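The possible splits of 「きたおか」 can be enumerated with a short recursive function. The set of headings below is an assumed sample chosen to match the splits listed in the text; a real single-kanji dictionary would contain many more readings.

```python
def segmentations(reading, headings):
    """List every way to split `reading` into pieces that are all in `headings`."""
    if not reading:
        return [[]]
    result = []
    for end in range(1, len(reading) + 1):
        head = reading[:end]
        if head in headings:
            for tail in segmentations(reading[end:], headings):
                result.append([head] + tail)
    return result


# Hypothetical readings that have single-kanji entries.
SAMPLE_HEADINGS = {"き", "た", "きた", "お", "か", "おか"}
```

With these headings, `segmentations("きたおか", SAMPLE_HEADINGS)` produces four splits, including 「き」＋「た」＋「おか」, the one that leads to 「貴」＋「田」＋「岡」.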
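[0052]
As described above, the kana-kanji conversion device of this embodiment takes a confirmed string, a string read by the scanner 93, or a string handwritten on the tablet 94 and recognized, reverse-converts it word by word, and performs kana-kanji conversion again; if this does not produce meaningful candidates, it reverse-converts each kanji individually and then performs kana-kanji conversion. Kanji candidates can therefore always be obtained, and misconversions and the like can be corrected easily.
[0053]
Embodiments of the present invention have been described above, but the invention is in no way limited to them and can of course be practiced in various forms without departing from its gist.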
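[0054]
[Brief description of the drawings]
[FIG. 1] An explanatory diagram showing the schematic configuration of kana-kanji conversion in this embodiment.
[FIG. 2] A block diagram showing the hardware configuration of the computer 10 used in the embodiment.
[FIG. 3] An explanatory diagram showing the structure of the reverse-lookup dictionary.
[FIG. 4] A flowchart showing the first half of the reconversion processing.
[FIG. 5] A flowchart showing the second half of the reconversion processing.
[FIG. 6] An explanatory diagram showing an example of the reconversion processing.
[FIG. 7] An explanatory diagram showing single-phrase conversion after the reconversion processing.
[FIG. 8] A flowchart showing the additional conversion processing.
[FIG. 9] An explanatory diagram showing an example of the additional conversion.
[FIG. 10] A flowchart showing the candidate selection processing.
[FIG. 11] A flowchart showing the word registration routine.
[FIG. 12] A flowchart showing the processing when single-kanji conversion is started within the candidate selection processing.
[FIG. 13] An explanatory diagram showing single-kanji conversion.
[FIG. 14] An explanatory diagram showing another example of single-kanji conversion.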
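[Explanation of reference signs]
10 … computer
20 … arithmetic processing unit
21 … CPU
22 … local bus
23 … cache memory
24 … cache controller
25 … main memory
30 … PCI bridge
32 … PCI bus
40 … controller unit
42 … ISA bus
44 … VGA
46 … SCSI controller
48 … PCI-ISA bridge
54, 55 … composite I/O ports
56 … sound I/O
60 … I/O unit
62 … ISA slot
68 … timer
72 … keyboard
73 … mouse
74 … speaker
76 … CRT
82 … floppy disk drive
84 … hard disk
86 … parallel port
88 … printer
90 … serial port
92 … modem
93 … scanner
94 … tablet
96 … microphone
IPU … input unit
KEY … keyboard interface
PIC … interrupt controller
RCU … reverse conversion unit
SBC … single-phrase conversion unit
SKC … single-kanji conversion unit
[0001]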
[Technical field of the invention]
The present invention relates to a kana-kanji conversion device and a kana-kanji conversion method, and more particularly to a device and method that capture a character string containing at least some kanji and convert it.
[0002]
[Prior art]
A kana-kanji conversion device, created so that Japanese can be handled on a computer, takes a kana character string as input and converts it into mixed kanji-kana text. In recent years, by contrast, devices have been proposed that convert character strings containing kanji, either to reconvert a character string that has already been confirmed or to reconvert a character string read by OCR or by handwriting recognition.
[0003]
In other words, when a conversion error or the like is found in a character string that has been entered into a word processor and confirmed, a kana character string is created from the entered and confirmed string and used for reconversion, instead of the kana character string being entered again.
[0004]
[Problems to be solved by the invention]
However, conventional kana-kanji conversion obtains the kana character string from the converted string by consulting a word dictionary for reverse conversion (a dictionary holding kanji headwords and their readings), so reconversion fails when a word has been entered in incomplete form. For example, in handwriting recognition a user may want to avoid writing kanji with many strokes and enter 「ひ行き」 or 「伊とー」 for words such as 「飛行機」 (airplane) or 「伊藤」 (the surname Ito). To convert such strings into correct Japanese they must first be converted into kana strings, but although the dictionary contains a word with the headword 「飛行機」, it contains no word such as 「ひ行き」, so reconversion does not work.
[0005]
Moreover, even if it can be converted into a kana character string once, if the corresponding word is not found, the input has to be performed again from the beginning. For example, since various kanji characters can be applied to names and the like, there is a case where a desired kanji character cannot be obtained even if kana-kanji conversion is performed after the kanji character is converted back to kana once.
[0006]
The present invention has been made to solve the above problems, and its object is to perform kana-kanji conversion smoothly on a character string that contains at least some kanji.
[0007]
[Means for solving the problems and their functions and effects]
The kana-kanji conversion device of the present invention, made to achieve this object, is
a kana-kanji conversion device that first converts a character string containing at least some kanji into a kana character string and then performs kana-kanji conversion again, and its gist is that it comprises:
character string input means for capturing the character string containing kanji;
first reverse conversion means for reverse-converting the captured character string into a kana character string by referring to a first dictionary in which kanji headwords and readings are recorded for the independent words that constitute phrases;
kana-kanji conversion means for, when a kana character string is obtained by the reverse conversion, converting it from the front as a single phrase, testing the connection with endings and attached words, and outputting single-phrase conversion candidates;
second reverse conversion means for, when the kana-kanji conversion means obtains no conversion candidate, reverse-converting the character string containing kanji into a kana character string by referring to a second dictionary in which kanji headwords and readings are recorded for single kanji, and supplying the resulting kana character string to the kana-kanji conversion means;
single-kanji conversion means for performing single-kanji conversion on that kana character string when the kana-kanji conversion means obtains no conversion candidate from the kana character string produced by the second reverse conversion means; and
delimiter position determining means for, in that single-kanji conversion, using the positions of the kanji contained in the character string captured by the second reverse conversion means as the delimiter positions of the kana characters to be converted.
[0008]
The kana-kanji conversion method of the present invention, corresponding to this device, is a kana-kanji conversion method realized by a computer executing processing that first converts a character string containing at least some kanji into a kana character string and then performs kana-kanji conversion again, and its gist is:
capturing the character string containing kanji via means for inputting character strings;
reverse-converting the captured character string into a kana character string by referring to a first dictionary in which kanji headwords and readings are recorded for the independent words that constitute phrases;
when a kana character string is obtained by the reverse conversion, supplying it from the front as a single phrase to means for performing kana-kanji conversion, testing the connection with endings and attached words, and outputting single-phrase conversion candidates;
when that conversion yields no conversion candidate, reverse-converting the character string containing kanji into a kana character string by referring to a second dictionary in which kanji headwords and readings are recorded for single kanji, and supplying the result to the means for performing kana-kanji conversion; and
when the means for performing kana-kanji conversion cannot produce a conversion candidate, performing single-kanji conversion on the kana character string obtained by the reverse conversion, using the positions of the kanji contained in the character string as the delimiter positions of the kana characters.
[0009]
This kana-kanji conversion device and method reverse-convert a character string containing kanji, captured by OCR, pen input, keyboard input, or the like, into a kana character string using the first dictionary, in which kanji headwords and readings are recorded for the independent words that constitute phrases, perform kana-kanji conversion on the resulting kana string, and output single-phrase conversion candidates. If no such candidate is obtained, the captured string is reverse-converted into a kana string by referring to the second dictionary, in which kanji headwords and readings are recorded for single kanji, and the result is supplied to the means for performing kana-kanji conversion. Thus the target string is first reverse-converted and kana-kanji converted on the assumption that it contains words constituting a phrase, and when no conversion candidate is obtained (for example, when part of a kanji compound was entered in kana), reverse conversion is performed by referring to the second dictionary for single kanji, which has the advantage of making the desired conversion candidate easier to obtain. Moreover, the invention includes single-kanji conversion means that performs single-kanji conversion on the kana string when the kana-kanji conversion means cannot produce a conversion candidate from the kana string obtained by the second reverse conversion means, and also delimiter position determining means that uses the positions of the kanji contained in the captured string as the delimiter positions of the kana characters subject to single-kanji conversion. Kanji candidates can therefore be output for strings that are not registered in the kana-kanji conversion dictionary, which is effective for converting personal names and the like. Because the delimiter positions for single-kanji conversion can be determined, the accuracy of single-kanji conversion improves. For example, when the string 「ひ行き」 is captured, single-kanji conversion need only be performed on 「ひ」 and 「き」, with 「行」 as the delimiter position, and there is no need to offer 「ひこ（彦）」 and the like as conversion candidates.
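The order in which the stages fall back on one another can be summarized in a short sketch. Everything here is illustrative: the four callables stand in for the two reverse conversion means, the single-phrase conversion means, and the single-kanji conversion means, and the toy tables and the placeholder `toy_single_kanji` are invented data, not part of the described device.

```python
def reconvert(text, word_reverse, kanji_reverse, phrase_convert, single_kanji_convert):
    """Return (stage, candidates) using the three-stage fallback."""
    # Stage 1: reverse conversion with the word-level (first) dictionary.
    kana = word_reverse(text)
    if kana is not None:
        candidates = phrase_convert(kana)
        if candidates:
            return "single-phrase", candidates
    # Stage 2: reverse conversion kanji by kanji with the second dictionary.
    kana = kanji_reverse(text)
    candidates = phrase_convert(kana)
    if candidates:
        return "single-phrase", candidates
    # Stage 3: last resort, single-kanji conversion of the kana string.
    return "single-kanji", single_kanji_convert(text, kana)


# Toy stand-ins for the dictionaries and converters (all hypothetical data).
WORD_READINGS = {"佐賀の": "さがの"}
KANJI_READINGS = {"ひ行き": "ひこうき", "伊とー": "いとー"}
PHRASES = {"さがの": ["佐賀の", "嵯峨の"], "ひこうき": ["飛行機"]}


def toy_single_kanji(text, kana):
    """Placeholder: a real converter would consult a single-kanji dictionary."""
    return [text]


def demo(text):
    return reconvert(
        text,
        WORD_READINGS.get,
        lambda t: KANJI_READINGS.get(t, t),
        lambda kana: PHRASES.get(kana, []),
        toy_single_kanji,
    )
```

Returning the stage name alongside the candidates also makes it straightforward to display single-phrase and single-kanji candidates differently.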
[0010]
Requests to reconvert a string containing kanji are expected to arise for relatively short ranges of text, so starting reverse conversion and kana-kanji conversion on the assumption of a single phrase reduces the effort of grammatical analysis and the like and yields conversion candidates quickly. Moreover, when no conversion candidate is obtained, reverse conversion is performed kanji by kanji before kana-kanji conversion, so the kana string on which kana-kanji conversion is based is always obtained. If single-kanji conversion is used as a last resort, the desired conversion result is very likely to be obtained.
[0011]
The kana character string obtained by the second reverse conversion means may be supplied to the kana-kanji conversion means that performs single-phrase kana-kanji conversion, but it may equally be supplied to other kinds of kana-kanji conversion, such as single-kanji conversion, multi-phrase conversion, or automatic conversion.
[0012]
In this kana-kanji conversion device, the kana-kanji conversion means may include conversion candidate restriction means that, when kana-kanji conversion is performed on a kana string obtained by referring to the second dictionary, accepts as conversion candidates only those single phrases whose kanji match the kanji contained in the captured string. For example, when 「公えん」 is input, 「公園」 and 「公演」 are offered as conversion candidates, but 「講演」 and 「後援」 are not. As a result, the accuracy of single-phrase conversion improves markedly.
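A candidate filter of this kind could be as simple as the sketch below. It assumes that "matching" means every kanji of the input also appears somewhere in the candidate; the text does not say whether the position of the kanji must match as well, so that stricter check is left out. The function names are hypothetical.

```python
def is_kanji(ch):
    """Rough test: basic CJK Unified Ideographs block only."""
    return "\u4e00" <= ch <= "\u9fff"


def restrict_candidates(source, candidates):
    """Keep only candidates that contain every kanji present in `source`."""
    required = [ch for ch in source if is_kanji(ch)]
    return [c for c in candidates if all(k in c for k in required)]
```

Applied to the example in the text, `restrict_candidates("公えん", ["公園", "公演", "講演", "後援"])` keeps 「公園」 and 「公演」 and drops the other two.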
[0013]
The device may also include single-kanji conversion means that performs single-kanji conversion on the kana string when the kana-kanji conversion means cannot produce a conversion candidate from the kana string obtained by the second reverse conversion means. In this case, kanji candidates can be output for strings that are not registered in the kana-kanji conversion dictionary, which is effective for converting personal names and the like. The device may further include delimiter position determining means that uses the positions of the kanji contained in the captured string as the delimiter positions of the kana characters subject to single-kanji conversion. Because the delimiter positions are then easy to determine, the accuracy of single-kanji conversion improves. For example, when the string 「ひ行き」 is captured, single-kanji conversion need only be performed on 「ひ」 and 「き」, with 「行」 as the delimiter position, and there is no need to offer 「ひこ（彦）」 and the like as conversion candidates.
[0014]
When the kana-kanji conversion means yields no conversion candidate and single-kanji conversion is finally performed by the single-kanji conversion means, it may be hard to tell at a glance which conversion means produced a given candidate. It is therefore also preferable to display the single-phrase candidates output by the kana-kanji conversion means and the single-kanji candidates output by the single-kanji conversion means in a way that distinguishes them.
[0015]
Obtaining a kana string from a string containing kanji requires a reverse conversion dictionary that yields kana from kanji. A dedicated dictionary may be prepared, but it can also be generated from the kana-kanji conversion dictionary that converts kana strings into text containing kanji. The reverse conversion dictionary may also be updated each time a new word is registered for kana-kanji conversion. In this case, basically only one kind of dictionary needs to be prepared.
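Generating the reverse dictionary from the ordinary one amounts to inverting a mapping. The sketch below assumes the forward dictionary is held as a Python dict from reading to a list of (word, part of speech) pairs; that layout and the name `build_reverse` are assumptions for illustration only.

```python
def build_reverse(forward):
    """Invert reading -> [(word, pos)] into word -> [(reading, pos)]."""
    reverse = {}
    for reading, entries in forward.items():
        for word, pos in entries:
            reverse.setdefault(word, []).append((reading, pos))
    return reverse


# Hypothetical forward entries for the reading さが.
SAMPLE_FORWARD = {
    "さが": [("佐賀", "place name"), ("嵯峨", "place name"), ("性", "noun")],
}
```

Here `build_reverse(SAMPLE_FORWARD)["佐賀"]` returns the reading 「さが」 with its part of speech, which is the information the reverse conversion step needs.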
[0016]
In the various kana-kanji conversion devices described above, the second dictionary can be generated from a dictionary for single-kanji conversion. In this case, it is not necessary to prepare a special dictionary, and the second dictionary can be easily constructed.
[0017]
In addition, when a kanji candidate is output by the kana-kanji conversion means, conversion candidates by single kanji conversion can be output in the order according to the frequency of use. In the case of single-kanji conversion, there may be a large number of conversion candidates, and the method used to output the conversion candidates tends to affect usability.
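Ordering single-kanji candidates by frequency of use can be sketched with a stable sort. The usage counts are invented sample data, and how the device actually records frequency is not stated in the text.

```python
def order_by_frequency(candidates, usage_counts):
    """Sort candidates from most to least used; unseen candidates count as 0.

    Python's sort is stable, so candidates with equal counts keep their
    original relative order.
    """
    return sorted(candidates, key=lambda c: usage_counts.get(c, 0), reverse=True)
```

For instance, with counts of 5 for 「気」 and 2 for 「機」, the candidates 「木」, 「機」, 「気」 for the reading 「き」 would be listed as 「気」, 「機」, 「木」.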
[0018]
[Other aspects of the invention]
Various other aspects of the present invention are conceivable, for example ones in which the input means is a handwriting recognition means, an optical character reader such as a scanner, or a speech recognition device. Furthermore, this kana-kanji conversion device can be used on its own as a word processor, or can be incorporated into various devices that require kana-kanji conversion. It can also be realized in a form incorporated into the operating system of a computer. When it is realized as software incorporated into a computer-based device, the software can be handled in the form of a flexible disk or CD-ROM on which it is recorded, or distributed via PC communication services or the like.
[0019]
For example, the invention can be implemented as a portable storage medium storing a software program that, when executed by a microprocessor of a computer system, first converts a character string containing at least some kanji into a kana character string and then performs kana-kanji conversion again, the program realizing the functions of:
capturing the character string containing kanji;
reverse-converting the captured character string into a kana character string by referring to a dictionary;
performing kana-kanji conversion on the kana character string obtained by the reverse conversion as a single phrase and outputting single-phrase conversion candidates; and
when no single-phrase conversion candidate is obtained, performing single-kanji conversion on the reverse-converted kana character string and outputting single-kanji conversion candidates.
[0020]
The software program (application program) that realizes the functions of these units is stored on a portable storage medium such as a floppy disk or CD-ROM and transferred from that medium to the main memory or an external storage device of the computer system. Alternatively, a device that supplies these software programs over a communication line may be provided, and the software program that realizes the kana-kanji conversion method may be transferred over the communication line from the supplying device to the main memory or external storage device of the computer system. For example, the program can be downloaded over a telephone line from the host computer of a PC communication service, or received by distribution over satellite broadcasting.
[0021]
DETAILED DESCRIPTION OF THE INVENTION
Next, a preferred embodiment of a kana-kanji conversion apparatus according to the present invention will be described with reference to the drawings. FIG. 1 is a functional block diagram of a kana-kanji conversion apparatus according to the first embodiment, and FIG. 2 is a block diagram showing a schematic configuration of a computer on which the kana-kanji conversion apparatus is realized.
[0022]
For convenience of explanation, the hardware configuration of the computer 10 will be described first with reference to FIG. 2. As shown, the computer 10 comprises an arithmetic processing unit 20 connected to a local bus 22; a PCI bridge 30 that connects the local bus 22 to a PCI bus 32, one of the external buses; a controller unit 40 that is accessed by the CPU 21 of the arithmetic processing unit 20 via the PCI bus 32; an I/O unit 60, which consists of devices that control various I/O devices and is connected to the ISA bus 42, a low-speed external bus; and peripheral devices such as a keyboard 72, a speaker 74, and a CRT 76.
[0023]
The arithmetic processing unit 20 is composed of a CPU 21 serving as the central processing unit (a Pentium manufactured by Intel Corporation in the present embodiment), a cache memory 23, its cache controller 24, and a main memory 25. The PCI bridge 30 is a controller that controls the high-speed PCI bus 32. The memory space handled by the CPU 21 is expanded into a logical address space wider than the actual physical address space by various registers provided in the CPU 21.
[0024]
The controller unit 40 comprises a graphics controller (hereinafter referred to as VGA) 44 that controls display of images on the monitor (CRT) 76, a SCSI controller 46 that controls data transfer with connected SCSI devices, and a PCI-ISA bridge 48 that provides the interface between the PCI bus 32 and the lower-level ISA bus 42. The VGA 44 can display 640 × 480 dots in 16 colors on the CRT 76. The VGA 44 carries a character generator that stores display fonts, a graphic controller that receives predetermined commands and draws predetermined graphics, and a video memory that stores the drawn image. Since this configuration is well known, it is omitted from FIG. 2 and will be described later only where necessary.
[0025]
The ISA bus 42, connected via the PCI-ISA bridge 48, is an input/output control bus to which various I/O devices are connected. Connected to it are a DMA controller (hereinafter simply referred to as DMA) 50, a real-time clock (RTC) 52, two composite I/O ports 54 and 55, a sound I/O 56, a keyboard interface (hereinafter referred to as KEY) 64 that controls the interface with the keyboard 72 and the mouse 73, an interrupt controller (hereinafter referred to as PIC) 66 that performs prioritized interrupt control, a timer 68 that generates various time counts and beep sounds, and the like. The ISA bus 42 is also connected to an ISA slot 62 in which an expansion board can be mounted.
[0026]
The composite I/O port 54 is provided with ports for inputting and outputting signals that control the floppy disk device 82 and the hard disk 84, in addition to parallel and serial input/output. A printer 88 is connected to the parallel input/output via a parallel port 86, and a modem 92 is connected to the serial input/output via a serial port 90. The other composite I/O port 55 is connected to a scanner 93 and a tablet 94 capable of handwriting input. In addition to the speaker 74 described above, a microphone 96 can be connected to the sound I/O 56. DOS/V machines often provide further standardized I/O channels beyond these, but their illustration and description are omitted in this embodiment.
[0027]
Next, the functions executed by the hardware thus configured will be described with reference to FIG. 1. The configuration and operation of each unit shown in FIG. 1 are outlined below. The processing described here is executed by the central processing unit (CPU 21) based on data input from the keyboard 72, the scanner 93, or the tablet 94; all processing is performed by the CPU 21. Kana-kanji conversion is implemented as a device driver: when a character string is input from the keyboard 72 or the like and execution of kana-kanji conversion is instructed by a predetermined operation, a predetermined interrupt process is started, and the captured character string (a kana character string or a character string including kanji) is converted into a character string of mixed kana and kanji. Of course, in a computer capable of parallel processing, kana-kanji conversion may be performed by a separate application (an input method), and the conversion result may be passed to the application that requires it. In this case, the input method collectively accepts input from the keyboard 72 or the like.
[0028]
As shown in FIG. 1, the kana-kanji conversion apparatus realized on the computer 10 receives a character string through an input unit IPU. The input unit IPU basically captures, by range specification, a character string that has already been input as a document. However, the input unit IPU also covers means that input a character string directly as character codes, the scanner 93 together with character recognition software that reads characters written on paper and recognizes them as a character string, and the tablet 94 together with handwritten character recognition software that accepts handwritten characters and recognizes them as a character string.
[0029]
The input character string may contain kanji, and the part containing kanji is converted into a kana character string by the first inverse conversion unit RC1 using the first dictionary stored on the hard disk 84. The first dictionary gives at least a reading for each kanji heading (a word that can constitute a phrase). In the embodiment, as shown in FIG. 3, the first dictionary also holds part-of-speech information.
[0030]
Next, based on the kana character string obtained from the first inverse conversion unit RC1, the kana-kanji conversion unit KKC performs single phrase conversion. Single phrase conversion is a well-known technique in which conversion assumes that the given kana character string consists of one independent word followed by attached words. Limiting conversion to a single phrase simplifies the morphological analysis of the kana character string, and conversion candidates can be obtained by simple matching. Details of the processing will be described later. Although not shown, an independent word dictionary and an attached word dictionary for kana-kanji conversion are used for the single phrase conversion.
[0031]
When a conversion candidate is obtained by single phrase conversion, it is output and displayed in a predetermined area of the CRT 76. If no conversion candidate is obtained, the second inverse conversion unit RC2 is activated next, and the character string obtained by the input unit is converted into kana using the second dictionary, which stores single kanji and their readings. As a result, kana-kanji conversion can be performed and conversion candidates can easily be obtained even when part of a word that constitutes the phrase and would normally be written in kanji has been written in kana. If a conversion candidate still cannot be obtained, a module that performs single kanji conversion (not shown in FIG. 1) can be started. In this case, a single kanji conversion dictionary (not shown) is used. With single kanji conversion, conversion candidates can be presented for most character strings.
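The fallback order described above can be sketched as follows. This is only an illustrative outline, not the implementation of the embodiment: the function name `reconvert` and its four callable parameters are hypothetical stand-ins for the units RC1, RC2, KKC and the single kanji conversion module.

```python
def reconvert(text, reverse_by_word, reverse_by_kanji,
              phrase_convert, single_kanji_convert):
    """Return (kind, candidates) following the fallback order in the text.

    All four callables are hypothetical stand-ins:
      reverse_by_word(text)   -> kana reading, or None on failure (RC1)
      reverse_by_kanji(text)  -> kana reading (RC2)
      phrase_convert(reading) -> list of single phrase candidates (KKC)
      single_kanji_convert(reading) -> list of single kanji candidates
    """
    # 1. Word-unit reverse conversion, then single phrase conversion.
    reading = reverse_by_word(text)
    if reading is not None:
        candidates = phrase_convert(reading)
        if candidates:
            return "phrase", candidates

    # 2. Kanji-by-kanji reverse conversion, then single phrase conversion.
    reading = reverse_by_kanji(text)
    candidates = phrase_convert(reading)
    if candidates:
        return "phrase", candidates

    # 3. Last resort: single kanji conversion on the same reading.
    return "single_kanji", single_kanji_convert(reading)
```

Returning the kind of conversion alongside the candidates is a design choice of this sketch; it would let a caller display the two kinds of candidates differently.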
[0032]
A software program (application program) for realizing the functions of these units can be stored in a ROM so that the CPU 21 can execute it directly. Alternatively, the program may be stored in a portable storage medium such as a flexible disk, MO, or CD-ROM, transferred from the portable storage medium to the main memory 25 or the hard disk 84 of the computer system, and executed. A supply device that provides these software programs via a communication line may also be provided; the computer 10 is connected to the communication line via the modem 92, and the software program is transferred from the supply device to the main memory 25 or the hard disk 84 of the computer system.
[0033]
Next, each processing routine will be described in turn. FIGS. 4 and 5 are flowcharts showing a re-conversion processing routine that takes in a character string including kanji and converts it to kana-kanji again. When an application program such as a word processor is running, a range of an already input and confirmed character string is designated, and re-conversion is instructed, the kana-kanji conversion program is started and the process shown in FIGS. 4 and 5 begins.
[0034]
When this routine is started, a process for acquiring the target character string is performed first (step S100). The target character string is the character string passed to the kana-kanji conversion program of the present embodiment by range specification, and is the object of the re-conversion described below. For example, as shown in FIG. 6 (A), assume that the confirmed character string “Saga's is in the north of today” is acquired as the target character string. Next, the search character pointer and the number of searches are initialized (step S110), and then one search character is added (step S120). As shown in FIG. 6 (B), since the initial value of the search character pointer is 0, performing step S120 once sets the search character pointer to the first character.
[0035]
Next, the reverse lookup independent word dictionary JD1 is searched (step S130). That is, the reverse lookup independent word dictionary JD1, which corresponds to the first dictionary, is searched for the character indicated by the search character pointer (“S” in FIG. 6). This dictionary JD1 can be searched by kanji heading, and in the embodiment it yields the reading and part-of-speech information. After the search, the search result is examined. The reverse lookup independent word dictionary JD1 is arranged in order of kanji headings, and one search reads out one of the words sharing the same kanji heading. It is therefore first determined whether the first such word matches the kanji heading exactly (step S140). If no matching word is found, as in the example of FIG. 6, it is determined whether there is a forward match (step S150). If headings such as “Saga” or “Sato” exist for “S”, there are words that match forward. In this case, the process returns to step S120, one search character is added, and the search of the reverse lookup independent word dictionary JD1 is executed again. That is, the reverse lookup independent word dictionary JD1 is searched for “Saga”.
[0036]
In this case, since the matching word “Saga” is found, the process proceeds to step S160 and the subsequent steps shown in FIG. 5. If there is neither an exact match at the current number of searches nor a forward match, it is determined that an appropriate kana character string cannot be obtained with the reverse lookup independent word dictionary JD1, and the process moves to the additional conversion processing routine shown in FIG. 8, which will be described later.
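The search loop of steps S120 to S150 can be sketched as below. This is a minimal illustration under stated assumptions: the dictionary contents are invented sample entries (the Japanese headings are guesses at what the romanized examples stand for), a plain `dict` stands in for the sorted dictionary file, and the per-heading "number of searches" is omitted, so the first exact match is returned.

```python
# Hypothetical stand-in for the reverse lookup independent word dictionary
# JD1: kanji heading -> list of (reading, part of speech). Sample data only.
JD1 = {
    "佐賀": [("さが", "noun")],
    "佐藤": [("さとう", "noun")],
    "実物": [("じつぶつ", "noun")],
    "実物大": [("じつぶつだい", "noun")],
}


def has_forward_match(prefix, dictionary):
    """True if some heading is longer than `prefix` and starts with it."""
    return any(h != prefix and h.startswith(prefix) for h in dictionary)


def find_independent_word(text, dictionary, start=0):
    """Grow the search string one character at a time (step S120) until a
    heading matches exactly (step S140). Give up when neither an exact
    match nor a forward match remains (step S150)."""
    length = 0
    while start + length < len(text):
        length += 1                                  # step S120
        key = text[start:start + length]
        if key in dictionary:                        # step S140
            return key, dictionary[key]
        if not has_forward_match(key, dictionary):   # step S150
            return None                              # go to additional conversion
    return None
```

For a string beginning with 佐賀, the first lookup (佐) has no exact match but does have forward matches, so one character is added and the second lookup succeeds. A string such as ひ行き fails on the first character, which is the case that triggers additional conversion.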
[0037]
If a lookup in the reverse lookup independent word dictionary JD1 yields an exact match at the current number of searches (step S140), it can be determined that the corresponding word is registered in the independent word dictionary JD1. The search character pointer is then advanced by one (step S160), and the number of ending / attached word searches is initialized, that is, set to 1 (step S170). Next, one search character is added (step S180), which places the pointer on the first character following the independent word (“NO” in FIG. 6 (C)), the reverse lookup ending / attached word dictionary JD2 is searched (step S190), and it is determined whether there is a match (step S200). If there is no match but a forward match is found (step S210), a search character is added (step S180), as in the search of the reverse lookup independent word dictionary JD1, and the reverse lookup ending / attached word search is performed again (step S190). In FIG. 6, only the headings are shown for the reverse lookup ending / attached word dictionary JD2. However, because some attached words are written in kanji (“K”, “Section”, “Machi”, “Please”, etc.), this dictionary, like the reverse lookup independent word dictionary JD1, consists of kanji or kana headings together with their readings and grammatical information.
[0038]
In the example shown in FIG. 6, “no” is found in the reverse lookup ending / attached word dictionary, so the determination in step S200 is “YES”. In this case, connection verification is performed (step S220). That is, the grammatical connection with the independent word already found in the reverse lookup independent word dictionary is examined to determine whether the two connect (step S230). The case particle “no” connects to the place name “Saga”; if the two did not connect, the number of searches would be increased (step S240) and the reverse lookup ending / attached word search (step S190) and the subsequent processes would be performed again. When a connecting ending or attached word is found, it is appended to the previously found independent word (step S250), and it is determined whether the entire target character string has been reconstructed from the character strings obtained so far (step S260). In the example of FIG. 6, the connection verification is “YES” when the search pointer is placed at “no”, so it is determined whether reconstruction of the target character string is complete. Since “ha” still remains, the process returns to step S160 and is repeated from the position where the search character pointer is advanced by one. This time the noun “Nanoha” is found and the whole target character string is reconstructed, so the kana character string “Saganoha” is obtained. An independent word search using this reading is then performed (step S270), and kana-kanji conversion centered on that search is executed (step S280). That is, as shown in FIG. 7, a dictionary is searched for independent words, the independent word that gives the longest match and to which an attached word connects is looked for, the conversion candidates “Saga no Ha”, “Saga no Ha”, and “Sano no Ha” are displayed, and selection of a candidate by the user is accepted (step S300). The details of the candidate selection process, including display, will be described later.
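The ending / attached word search with connection verification (steps S180 to S250) might look like the sketch below. The dictionary `JD2`, the part-of-speech tags, and the connection table `CONNECTS` are all hypothetical sample data; the source does not describe how the grammatical connection rules are actually stored.

```python
# Hypothetical stand-in for the reverse lookup ending / attached word
# dictionary JD2: heading -> list of (reading, grammatical tag).
JD2 = {
    "の": [("の", "case_particle")],
    "は": [("は", "binding_particle")],
}

# Hypothetical connection table: (tag of preceding word, tag of attached word).
CONNECTS = {
    ("place_name", "case_particle"),
    ("noun", "case_particle"),
    ("noun", "binding_particle"),
}


def find_attached_word(text, pos, prev_tag, jd2, connects):
    """Search for an ending / attached word starting at text[pos].

    Returns (reading, next_position) for the first entry that both matches
    the text and connects grammatically to the preceding word, else None.
    """
    length = 0
    while pos + length < len(text):
        length += 1                                   # step S180
        key = text[pos:pos + length]
        # Steps S190-S200; iterating over the entries plays the role of
        # increasing the number of searches (step S240).
        for reading, tag in jd2.get(key, []):
            if (prev_tag, tag) in connects:           # steps S220-S230
                return reading, pos + length          # step S250
        # Step S210: continue only while a longer heading could still match.
        if not any(h != key and h.startswith(key) for h in jd2):
            break
    return None
```

When this function returns `None`, the flow in the text goes back to the independent word search with an increased number of searches, on the assumption that the independent word was cut out at the wrong place.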
[0039]
In the search for endings / attached words, if there is no match at the current number of searches and no forward match is found, the independent word was probably cut out incorrectly before the ending / attached word search began. The number of searches is therefore increased (step S290), and the process returns to step S130 shown in FIG. 4 to resume from the reverse lookup independent word search. If the next word with the same heading matches, the ending / attached word search and the connection test described above are performed for it. For example, if the character string “actual size” is first searched up to “actual” by the reverse lookup independent word search and an entry consisting of the heading “actual”, its reading, and the part of speech “noun” is found, the ending / attached word search is performed at that point. However, the reverse lookup ending / attached word dictionary JD2 has no heading corresponding to “Large”, and no forward match is found either. In this case, the number of searches is increased (step S290) and the search is performed again (step S130 and subsequent steps). This time there is no exact match at the current number of searches (step S140) but a forward match is found (step S150), so the independent word “actual size” is cut out next and the ending / attached word search is started. The attached word “no” is then found, and reconstruction of the target character string is completed.
[0040]
Next, the additional conversion process executed when no match is found in the reverse lookup independent word search will be described with reference to FIG. 8. When the additional conversion shown in FIG. 8 is started, the character pointer is first initialized, that is, set to 1 (step S310), and one character is extracted from the position of the pointer (step S320). It is determined whether this character is a kanji (step S330). If it is, the reverse lookup single kanji dictionary KD, which corresponds to the second dictionary, is searched (step S340) and the reading (kana) is acquired (step S350). The obtained reading is appended to the kana character string acquired so far (step S360), and the character pointer is advanced by 1 (step S370). It is then determined whether any characters remain (step S380); if so, the process returns to step S320 and is repeated from the extraction of one character.
[0041]
For example, for the character string “ひ行き” shown in FIG. 9 (a partly kana spelling of 飛行機, “hikōki”, “airplane”), the re-conversion shown in FIGS. 4 and 5 fails at the reverse lookup independent word search, and the additional conversion shown in FIG. 8 is activated. For the character “ひ” (hi), extracted when the character pointer is 1, the character itself is taken as the reading (kana). For the character “行”, the reading “こう” (kō) is obtained by referring to the reverse lookup single kanji dictionary KD. Repeating this for all the characters yields the reading “ひこうき” (hikōki).
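The character-by-character loop of FIG. 8 can be sketched as follows. Two simplifications are assumptions of this sketch rather than statements from the source: each kanji is given exactly one reading (real kanji usually have several), and a character counts as kanji when it falls in the CJK Unified Ideographs block.

```python
# Hypothetical stand-in for the reverse lookup single kanji dictionary KD.
# One reading per kanji is a simplification made for this sketch.
KD = {"行": "こう", "飛": "ひ", "機": "き"}


def is_kanji(ch):
    """Rough test: CJK Unified Ideographs block only (an assumption)."""
    return "\u4e00" <= ch <= "\u9fff"


def additional_reverse_conversion(text, kanji_dict):
    """Build a kana reading one character at a time (steps S320-S380)."""
    reading = []
    for ch in text:                          # steps S320 / S370 / S380
        if is_kanji(ch):                     # step S330
            # Steps S340-S350; raises KeyError for an unregistered kanji.
            reading.append(kanji_dict[ch])
        else:
            reading.append(ch)               # kana is kept as its own reading
    return "".join(reading)                  # step S360, accumulated


print(additional_reverse_conversion("ひ行き", KD))
```

With the sample dictionary, the partly written word ひ行き yields the reading ひこうき, which is then handed to single phrase kana-kanji conversion.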
[0042]
When no characters remain, single phrase kana-kanji conversion is started with the acquired reading (step S390). This conversion is the same as step S280 shown in FIG. 5, and when a conversion candidate is obtained, the candidate selection process (step S300) is executed next. The details of the candidate selection process (step S300) are as follows. As shown in the flowchart of FIG. 10, an instruction from the keyboard 72 or the like is received first (step S400). If the instruction selects a candidate, the displayed candidate is replaced (step S410). In the example of the re-conversion result shown in FIG. 7, if the first candidate is “Sagano”, it is replaced with the next candidate, “Saga” or “Sex”. Because the character string “Sagano” is itself what is being re-converted, it is preferable either not to display the re-conversion target as a candidate or to display it as the last candidate.
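The handling of the re-conversion target among the candidates can be illustrated with a small helper. The function name `order_candidates` and its `drop_original` flag are hypothetical; the text only states the two alternatives (omit the target, or show it last).

```python
def order_candidates(candidates, original, drop_original=False):
    """Return candidates with the re-conversion target demoted.

    The string being re-converted is either removed (drop_original=True)
    or moved to the end of the list, preserving the order of the rest.
    """
    others = [c for c in candidates if c != original]
    if drop_original or original not in candidates:
        return others
    return others + [original]
```

Either policy keeps the user from being offered, as the first choice, the very string they asked to have re-converted.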
[0043]
If the instruction is a correction instruction (step S400), the displayed candidate is temporarily returned to a kana character string and corrected (step S420). Suppose the character string the user wants is “Sakanoha (Sanoha)” but “Saganoha” was input by mistake and converted accordingly. Converting the confirmed character string back to a kana character string and re-converting it then cannot produce the correct result. In such a case, the reading (kana character string) is corrected after the return to kana. After the reading is corrected, single phrase kana-kanji conversion is executed (step S430). This conversion is the same as the single phrase kana-kanji conversion described above (steps S280 and S390). In the case of additional conversion, the correction of the character string and the subsequent single phrase kana-kanji conversion are performed in exactly the same manner.
[0044]
If the instruction from the keyboard 72 or the like is a confirmation instruction, a learning process is performed first on the independent word dictionary KD so that the selected candidate is output preferentially thereafter (step S440). This is the same as learning in normal kana-kanji conversion. Then the phrase composed of the currently selected candidate is output as a confirmed character string (step S450), and this routine ends.
[0045]
With the above candidate selection process, the character string that has been subject to re-conversion or additional conversion can be easily converted into a desired Japanese character string.
[0046]
In the embodiment described above, the reverse lookup independent word dictionary JD1 and the reverse lookup ending / attached word dictionary JD2 are generated from the independent word dictionary KD and the ending / attached word dictionary, respectively. An ordinary independent word dictionary KD or the like has the reading as its heading and holds the kanji and part-of-speech information as the converted character string, so the reverse lookup dictionary shown in FIG. 3 is easy to generate from it. Whether the reverse lookup dictionary is generated from the independent word dictionary or prepared separately from it, a newly registered word must also be registered in the reverse lookup dictionary. FIG. 11 is a flowchart showing a word registration processing routine.
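Generating a reverse lookup dictionary from a reading-keyed dictionary amounts to inverting a mapping, as the sketch below shows. The in-memory layout (a `dict` of lists of tuples) and the sample entries are assumptions made for illustration; the source specifies only that headings, readings, and part-of-speech information are held.

```python
from collections import defaultdict

# Hypothetical forward dictionary: reading -> list of (written form, part of speech).
FORWARD = {
    "さが": [("佐賀", "noun"), ("性", "noun")],
    "ひこうき": [("飛行機", "noun")],
}


def build_reverse_dictionary(forward):
    """Invert reading -> (written form, pos) into written form -> (reading, pos)."""
    reverse = defaultdict(list)
    for reading, entries in forward.items():
        for written, pos in entries:
            reverse[written].append((reading, pos))
    # Keep headings in sorted order, mirroring a dictionary arranged by heading.
    return dict(sorted(reverse.items()))
```

Sorting by heading is optional for a hash-based `dict`, but it mirrors the text's description of a dictionary arranged in heading order, which is what makes forward-match (prefix) checks cheap in a sorted file.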
[0047]
As shown in the figure, when this routine is started, first, a process of inputting a word to be registered is performed (step S500). Subsequently, a process of inputting a reading is performed (step S510), and a process of inputting a part of speech is further performed (step S520). The part of speech may be input by selecting a part of speech prepared in advance. After sequentially inputting the above information, a process of registering in the independent word dictionary or the ending / attached word dictionary is performed (step S530), and further a process of registering in the reverse lookup dictionary is performed (step S540).
[0048]
By executing this process, the reverse lookup dictionary is updated at the same time as the word is registered, so the newly registered word can also be a target of re-conversion or additional conversion.
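The registration routine of FIG. 11 can be sketched as a single operation that writes to both dictionaries. The class name `ConversionDictionaries` and its methods are hypothetical, and the sample word is an invented person name.

```python
class ConversionDictionaries:
    """Hypothetical pair of forward and reverse lookup dictionaries."""

    def __init__(self):
        self.forward = {}   # reading -> list of (word, part of speech)
        self.reverse = {}   # word -> list of (reading, part of speech)

    def register(self, word, reading, part_of_speech):
        """Register one word in both dictionaries (steps S530 and S540)."""
        self.forward.setdefault(reading, []).append((word, part_of_speech))
        self.reverse.setdefault(word, []).append((reading, part_of_speech))

    def readings_of(self, word):
        """Readings available for reverse conversion of `word`."""
        return [reading for reading, _ in self.reverse.get(word, [])]


dictionaries = ConversionDictionaries()
dictionaries.register("貴多岡", "きたおか", "person_name")
```

Doing both writes in one method is the point of the routine: a word that exists only in the forward dictionary could be produced by conversion but never reverse-converted afterwards.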
[0049]
In the embodiment described above, reverse conversion is performed in units of registered words before kana-kanji conversion is performed again (re-conversion), or in units of single kanji before kana-kanji conversion is performed again (additional conversion). In both cases the kana-kanji conversion performed again is single phrase conversion. This is because re-conversion and additional conversion are rarely requested across multiple phrases, so conversion speed should be given priority. Single phrase conversion can have difficulty with affixes (especially prefixes), but if combinations of “affix” + “word” are registered in advance, they can easily be converted by single phrase conversion. Of course, continuous phrase conversion with morphological analysis can also be performed.
[0050]
In the case of a person's name or the like, the desired characters may not be obtainable by single phrase conversion. To cover such cases, it is also preferable to add a single kanji conversion option to the candidate selection process (step S300). An example is shown in FIG. 12. If the input from the keyboard 72 or the like is an instruction to start single kanji conversion, the reading is divided (step S460), and single kanji conversion is started for each divided reading (step S470). With this single kanji conversion, as shown in FIG. 13, when a character string such as “ひ行き” is designated, the reading is split into the parts before and after the kanji “行”, and “ひ” (hi) and “き” (ki) are each subjected to single kanji conversion. The kanji “行” could itself be made a target of single kanji conversion, but it is desirable to treat it as already input in kanji and exclude it. In the example of FIG. 13, the part input in kanji is excluded from single kanji conversion. If the input character string contains no kanji, the longest character string that can be a heading for single kanji conversion is cut out from the beginning and subjected to single kanji conversion, and this is repeated in sequence.
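Using the kanji already present in the input as delimiter positions (step S460) can be sketched as below. The helper names are hypothetical, and the kanji test again assumes the CJK Unified Ideographs block.

```python
from itertools import groupby


def is_kanji(ch):
    """Rough test: CJK Unified Ideographs block only (an assumption)."""
    return "\u4e00" <= ch <= "\u9fff"


def split_at_kanji(text):
    """Split text into runs, each marked as kanji (True) or not (False)."""
    return [("".join(run), kanji) for kanji, run in groupby(text, key=is_kanji)]


def single_kanji_targets(text):
    """Kana runs to convert; runs already written in kanji are excluded."""
    return [run for run, kanji in split_at_kanji(text) if not kanji]
```

For ひ行き this gives the runs ひ, 行 and き, of which only ひ and き are passed on to single kanji conversion, matching the behavior described for FIG. 13.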
[0051]
Such single kanji conversion is useful not only when part of the characters are input in kanji, as in “ひ行き”, but also when converting a confirmed kanji character string, as illustrated in FIG. 14. Person names in particular are difficult to register exhaustively, so combining single kanji conversion is useful. For the person name “Kitaoka” shown in FIG. 14, even if the kana character string “Kitaoka” is obtained by reverse conversion, it is difficult to convert it to the desired “Kitaoka” by ordinary phrase conversion. In such a case, single kanji conversion is started: the kana character string “Kitaoka” obtained by reverse conversion is divided into combinations of readings that are headings for single kanji conversion, and single kanji conversion is performed on them. In this example, the string can be divided as “Kita” + “Oka”, “Kita” + “O” + “Ka”, “Ki” + “Ta” + “Oka”, “Ki” + “Ta” + “Oka”, or “Ki” + “Ta” + “O” + “Ka”. By specifying the appropriate division, single kanji conversion is started for “Ki” + “Ta” + “Oka”, and the string can be converted into the desired kanji “Taka” + “Tada” + “Oka”. It is also desirable to display conversion candidates obtained by single kanji conversion and candidates obtained by single phrase conversion in different modes. In the example shown in FIG. 14, candidates are displayed divided into conversion units so that the kind of conversion performed can be understood intuitively, but it is also preferable to make the distinction explicit by changing the display color or the like.
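Enumerating the possible divisions of a reading into single kanji headings can be sketched with a short recursive function. The heading set below is invented sample data, and writing “Kitaoka” as きたおか is an assumption about the romanized example; the source does not say how the embodiment actually enumerates or orders the divisions.

```python
# Hypothetical set of readings that exist as headings in a single kanji
# conversion dictionary. Sample data only.
HEADINGS = {"き", "た", "お", "か", "きた", "おか"}


def segmentations(reading, headings):
    """List every way to split `reading` into consecutive headings.

    Longer headings are tried first at each position, so divisions with
    fewer pieces tend to appear earlier in the result.
    """
    if not reading:
        return [[]]
    results = []
    for end in range(len(reading), 0, -1):
        head = reading[:end]
        if head in headings:
            for rest in segmentations(reading[end:], headings):
                results.append([head] + rest)
    return results


for division in segmentations("きたおか", HEADINGS):
    print(" + ".join(division))
```

With this sample heading set the function yields four distinct divisions, from which a user would pick the one matching the intended kanji.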
[0052]
As described above, the kana-kanji conversion apparatus according to the present embodiment performs reverse conversion in word units on a confirmed character string, a character string read by the scanner 93, a character string input by handwriting from the tablet 94, or the like, and then executes kana-kanji conversion again. If no meaningful candidate is obtained by this process, it performs reverse conversion kanji by kanji using single kanji readings and then executes kana-kanji conversion. Kanji candidates can therefore always be obtained, and corrections of erroneous conversions and the like can be made easily.
[0053]
Although the embodiments of the present invention have been described above, the present invention is not limited to such embodiments, and can of course be implemented in various modes without departing from the gist of the present invention.
[0054]
[Brief description of the drawings]
FIG. 1 is an explanatory diagram showing a schematic configuration of kana-kanji conversion in the present embodiment.
FIG. 2 is a block diagram illustrating a hardware configuration of a computer 10 used in the embodiment.
FIG. 3 is an explanatory diagram showing a structure of a reverse lookup dictionary.
FIG. 4 is a flowchart showing the first half of reconversion processing.
FIG. 5 is a flowchart showing the second half of the reconversion process.
FIG. 6 is an explanatory diagram showing an example of a reconversion process.
FIG. 7 is an explanatory diagram showing a state of single phrase conversion after re-conversion processing.
FIG. 8 is a flowchart showing additional conversion processing.
FIG. 9 is an explanatory diagram showing the state of additional conversion.
FIG. 10 is a flowchart showing candidate selection processing.
FIG. 11 is a flowchart showing a word registration processing routine.
FIG. 12 is a flowchart showing processing when single kanji conversion is activated in candidate selection processing.
FIG. 13 is an explanatory diagram showing a state of single kanji conversion.
FIG. 14 is an explanatory view showing another example of single kanji conversion.
[Explanation of symbols]
10 ... Computer
20 ... arithmetic processing unit
21 ... CPU
22 ... Local bus
23 ... Cache memory
24 ... Cache controller
25 ... Main memory
30 ... PCI bridge
32 ... PCI bus
40 ... Controller section
42 ... ISA bus
44 ... VGA
46 ... SCSI controller
48 ... PCI-ISA bridge
54, 55 ... Composite I/O ports
56 ... Sound I/O
60 ... I/O section
62 ... ISA slot
68 ... Timer
72 ... Keyboard
73 ... Mouse
74 ... Speaker
76 ... CRT
82 ... Floppy disk device
84 ... Hard disk
86 ... Parallel port
88 ... Printer
90 ... Serial port
92 ... Modem
93 ... Scanner
94 ... Tablet
96 ... Microphone
IPU ... Input section
KEY ... Keyboard interface
PIC ... Interrupt controller
RCU ... Inverse conversion unit
SBC: Single phrase conversion section
SKC ... Single kanji conversion part

Claims

A kana-kanji conversion device that first converts a character string including at least kanji into a kana character string and then performs kana-kanji conversion again, comprising:
character string input means for capturing the character string including the kanji;
first reverse conversion means for reversely converting the captured character string into a kana character string with reference to a first dictionary in which kanji headings and readings are recorded for independent words constituting phrases;
kana-kanji conversion means for, if a kana character string is obtained by the reverse conversion, performing kana-kanji conversion as a single phrase from the front of the kana character string, performing a connection test with endings and attached words, and then outputting single phrase conversion candidates;
second reverse conversion means for, if no conversion candidate is obtained by the kana-kanji conversion means, reversely converting the character string including the kanji into a kana character string with reference to a second dictionary in which kanji headings and readings are recorded for single kanji, and providing the reversely converted kana character string to the kana-kanji conversion means;
single kanji conversion means for performing single kanji conversion on the kana character string if no conversion candidate is obtained by the kana-kanji conversion means on the basis of the kana character string obtained by the second reverse conversion means; and
delimiter position determining means for, when the single kanji conversion means performs single kanji conversion, setting the positions of the kanji included in the character string captured by the second reverse conversion means as the delimiter positions of the kana characters to be converted by the single kanji conversion means.

The kana-kanji conversion device according to claim 1, wherein the second reverse conversion means outputs the kana character string obtained by the reverse conversion to the kana-kanji conversion means, which outputs the single phrase conversion candidates, before single kanji conversion is performed by the single kanji conversion means.

The kana-kanji conversion device according to claim 2, further comprising conversion candidate restriction means for, when the kana-kanji conversion means performs kana-kanji conversion on the kana character string obtained by referring to the second dictionary, restricting the single phrase conversion candidates to those whose kanji match the kanji included in the captured character string.

The kana-kanji conversion device according to claim 1, further comprising display means for displaying the single phrase conversion candidates output by the kana-kanji conversion means and the single kanji conversion candidates output by the single kanji conversion means in a distinguishable manner.

The kana-kanji conversion device according to any one of claims 1 to 4,
further comprising first dictionary generating means for generating the first dictionary for reverse conversion from a kana-kanji conversion dictionary used to convert a kana character string into a sentence of mixed kanji and kana.

The kana-kanji conversion device according to any one of claims 1 to 5,
further comprising second dictionary generating means for generating the second dictionary from a dictionary for single kanji conversion.
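
Both dictionary-generating means derive a reverse-lookup dictionary (kanji heading to reading) from a forward dictionary (reading to kanji). One plausible way to do this, shown here as a sketch and not as the patented implementation, is to invert the forward mapping. The function name and the sample entries are assumptions for illustration.

```python
from collections import defaultdict


def build_reverse_dictionary(forward):
    """Invert a reading -> [headings] mapping into heading -> [readings]."""
    reverse = defaultdict(list)
    for reading, headings in forward.items():
        for heading in headings:
            # A heading may have several readings; keep each one once.
            if reading not in reverse[heading]:
                reverse[heading].append(reading)
    return dict(reverse)


forward = {"ひこうき": ["飛行機"], "いとう": ["伊藤", "伊東"]}
print(build_reverse_dictionary(forward))
```

The same inversion applies whether the forward dictionary holds words (yielding the first dictionary) or single kanji (yielding the second dictionary).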

The kana-kanji conversion device according to claim 1, wherein the kana-kanji conversion means includes means for outputting the conversion candidates obtained by single-phrase conversion in order of frequency of use.

A kana-kanji conversion method realized by a computer executing a process of first converting a character string including at least kanji into a kana character string and then performing kana-kanji conversion again, the method comprising:
Capturing the character string including the kanji via means for inputting the character string;
Reverse-converting the captured character string into a kana character string with reference to a first dictionary in which kanji headings and readings are recorded for the independent words constituting phrases;
If a kana character string is obtained by the reverse conversion, performing kana-kanji conversion on it as a single phrase from the front of the kana character string by means for performing kana-kanji conversion, performing a connection test with endings and attached words, and then outputting single-phrase conversion candidates;
If no conversion candidate is obtained by that conversion, reverse-converting the character string including the kanji into a kana character string with reference to a second dictionary in which kanji headings and readings are recorded for single kanji, and supplying the reverse-converted kana character string to the means for performing kana-kanji conversion; and
If the means for performing kana-kanji conversion again fails to produce a conversion candidate, performing single kanji conversion on the kana character string obtained by that reverse conversion, with the positions of the kanji included in the captured character string used as the delimiter positions of the kana characters.
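
The fallback order of the method can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the patented implementation: all four dictionaries are toy `dict` objects, the first dictionary is matched against the whole input string only, and the connection test with endings and attached words is omitted. The function name `reconvert` and its parameters are hypothetical.

```python
def reconvert(text, word_readings, kanji_readings, phrases, single_kanji):
    """Return (kind, candidates) following the claimed fallback order."""
    # Step 1: reverse conversion with the first (word) dictionary.
    kana = word_readings.get(text)
    # Step 2: single-phrase kana-kanji conversion (connection test omitted).
    if kana is not None and phrases.get(kana):
        return "phrase", phrases[kana]

    # Step 3: reverse conversion with the second (single kanji) dictionary,
    # remembering the kanji positions as segment boundaries.
    segments, run = [], ""
    for ch in text:
        if ch in kanji_readings:
            if run:
                segments.append(run)
                run = ""
            segments.append(kanji_readings[ch])
        else:
            run += ch
    if run:
        segments.append(run)
    kana = "".join(segments)
    if phrases.get(kana):
        return "phrase", phrases[kana]

    # Step 4: single kanji conversion, one segment at a time; a segment
    # without candidates is passed through as kana.
    return "single-kanji", [single_kanji.get(seg, [seg]) for seg in segments]


print(reconvert("ひ行き", {"飛行機": "ひこうき"}, {"行": "こう"},
                {"ひこうき": ["飛行機"]}, {}))
```

In this toy run the incomplete input 「ひ行き」 is not found in the word dictionary, so the sketch falls back to the single kanji dictionary, obtains 「ひこうき」, and recovers 「飛行機」 as a single-phrase candidate. Only when that second attempt also fails does it drop to segment-wise single kanji conversion.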