JP3728877B2 - Character string converter and program recording medium thereof

Info

Publication number: JP3728877B2
Application number: JP18600797A
Authority: JP
Inventors: 俊啓木内; 栄作中谷
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 1997-06-27
Filing date: 1997-06-27
Publication date: 2005-12-21
Anticipated expiration: 2017-06-27
Also published as: JPH1125084A

Description

【０００１】
【発明の属する技術分野】
この発明は、入力文字列を漢字変換する文字列変換装置およびそのプログラム記録媒体に関する。
【０００２】
【従来の技術】
従来、ワードプロセッサ等の文書処理装置において、通常備えられているかな漢字変換辞書（一般辞書）を用いてかな漢字変換を行う場合に、建築、医学、化学、物理等のように専門的な分野の文書ほど、変換率が極端に悪くなるため、このような専門分野の文書を作成する際には、入力操作が面倒な単漢字変換や単語変換等によって対処するようにしていた。あるいはユーザ自身が今回作成する文書の分野を判断し、その分野に合った専門用語辞書を外部記憶媒体から文書処理装置へ外部供給するようにしていた。
【０００３】
【発明が解決しようとする課題】
しかしながら、ユーザ自身がどのような分野の文書を作成するかを常に意識しなければならず、しかも外部記憶媒体をユーザ自身が保管し、必要に応じて文書処理装置に装着するという手間がかかるために、ユーザに大きな負担をかけていた。
この発明の課題は、入力文字列を漢字変換する際に、ユーザ自身がどのような分野の文書を作成するかを意識することなく、専門的な分野の文書であっても効率良く変換できるようにすることである。
【０００４】
【課題を解決するための手段】
請求項１記載の発明は、入力文字列を漢字変換する文字列変換辞書として通常使用されている一般辞書を記憶する第1の辞書記憶手段と、入力文字列を漢字変換する文字列変換辞書として、専門分野毎に分類された分野別辞書を記憶する第２の辞書記憶手段と、前記一般辞書を使用して入力文字列を一括変換する一括変換手段と、この一括変換手段によって得られた変換結果が候補表示されている状態において、逐次変換を指示する指示操作が行われた際に、前記一般辞書を使用して当該入力文字列を逐次変換する逐次変換手段と、この逐次変換手段によって得られた変換結果に対して確定操作が行われた際に、この逐次変換時に変換対象となった入力文字列を前記一般辞書に対する未知語として記憶する未知語記憶手段と、この未知語記憶手段内の未知語に基づいて前記分野別辞書をそれぞれ検索し、当該未知語に該当する表記が存在する分野別辞書を指定する辞書指定手段とを具備し、前記辞書指定手段によって指定された分野別辞書を優先的に使用して文字列変換を行うようにしたことを特徴とする。
さらに、コンピュータに対して、上述した請求項１記載の発明に示した主要機能を実現させるためのプログラムを提供する（請求項５記載の発明）。
なお、請求項１記載の発明は次のようなものであってもよい。
前記辞書指定手段は、前記未知語記憶手段に記憶されている未知語のうち、少なくとも２種類の未知語が同一分野に属すると判定した際に、その分野に対応する分野別辞書を指定する（請求項２記載の発明）。
入力文字列を漢字変換する文字列変換辞書として通常使用されている一般辞書を備えた文書処理装置と、専門分野毎に分類された分野別辞書を記憶管理する辞書管理装置とを備えた通信システムであって、前記文書処理装置は、前記一般辞書を使用して入力文字列を一括変換する一括変換手段と、この一括変換手段によって得られた変換結果が候補表示されている状態において、逐次変換を指示する指示操作が行われた際に、前記一般辞書を使用して逐次変換する逐次変換手段と、この逐次変換手段によって得られた変換結果に対して確定操作が行われた際に、この逐次変換時に変換対象となった入力文字列を前記一般辞書に対する未知語として記憶する未知語記憶手段と、この未知語記憶手段の内容を前記辞書管理装置に送信する送信手段とを具備し、前記辞書管理装置は、文書処理装置から送信されて来た未知語に基づいて前記分野別辞書をそれぞれ検索して当該未知語に該当する表記が存在する分野別辞書を指定する辞書指定手段と、この辞書指定手段によって指定された分野別辞書を文書処理装置へ送信する送信手段とを具備し、前記文書処理装置は、辞書管理装置から送信されて来た分野別辞書を優先的に使用して文字列変換を行う（請求項３記載の発明）。
前記文書処理装置は、辞書管理装置から送信されて来た分野別辞書を記憶する辞書記憶手段と、この辞書記憶手段に記憶されている分野別辞書が所定期間使用されなかったか否かを判別する判別手段と、この判別手段によって所定期間使用されなかったことが判別された際に、前記辞書記憶手段から当該分野別辞書を削除する削除手段とを具備する（請求項４記載の発明）。
【０００５】
請求項１記載の発明によれば、入力文字列(よみ文字列)を一般辞書によって一括変換した変換結果を候補表示した後に、逐次変換が指示された際には、当該入力文字列の再変換を行うが、その際、一般辞書を使用して逐次変換を行うと共に、この変換結果に対してその確定が指示された際には、当該入力文字列を一般辞書に対する未知語とし、この未知語に基づいて分野別辞書をそれぞれ検索することによって未知語に該当する表記が存在する分野別辞書を優先使用の対象辞書として指定するようにしたから、文書作成時において、ユーザは、一括変換の結果を確認し、誤り個所を発見した際には、その個所を逐次変換によって正して確定操作を行うだけで、当該文書内容に属する分野の専門辞書を優先的に使用しての文字列変換が可能となるため、ユーザ自身がどのような分野の文書を作成するかを意識することなく、専門的な分野の文書を作成する場合であっても効率良く変換できるという効果を有する。
【０００６】
【発明の実施の形態】
以下、図１〜図１２を参照してこの発明の一実施形態を説明する。
図１はクライアント・サーバ・システムを示したシステム構成図である。このシステムは、複数台の文書処理装置ＷＰとデータベースサーバＤＢＳとを構内専用回線を介して接続して成るローカルエリアネットワークシステムである。ここで、各文書処理装置ＷＰにはかな漢字変換用として通常使用される一般辞書を備えているが、専門分野毎に分類された分野別辞書はデータベースサーバＤＢＳ側で記憶管理されており、必要に応じてデータベースサーバＤＢＳは分野別辞書を文書処理装置ＷＰへダウンロードするようにしている。
【０００７】
図２（Ａ）は文書処理装置ＷＰの全体構成を示したブロック図である。
ＣＰＵ１はＲＡＭ２内にロードされている各種プログラムにしたがってこの文書処理装置ＷＰの全体動作を制御する中央演算処理装置である。記憶装置３はオペレーティングシステムや各種アプリケーションプログラム、データファイル、かな漢字変換辞書、文字フォントデータ等が予め格納されている記憶媒体４やその駆動系を有している。この記憶媒体４は固定的に設けたもの、もしくは着脱自在に装着可能なものであり、フロッピーディスク、ハードディスク、光ディスク、ＲＡＭカード等の磁気的・光学的記憶媒体、半導体メモリによって構成されている。また、記憶媒体４内のプログラムやデータは、必要に応じてＣＰＵ１の制御により、ＲＡＭ２にロードされる。更に、ＣＰＵ１は通信回線等を介して他の機器側から送信されて来たプログラム、データを受信して記憶媒体４に格納したり、他の機器側に設けられている記憶媒体に格納されているプログラム、データを通信回線等を介して使用することもできる。
そして、ＣＰＵ１にはその入出力周辺デバイスである入力装置５、表示装置６、印刷装置７がバスラインを介して接続されており、入出力プログラムにしたがってＣＰＵ１はそれらの動作を制御する。
入力装置５は文字列データを入力したり、各種コマンドを入力するキーボードの他、マウス等のポインティングデバイスを有している。表示装置６は多色表示を行う液晶表示装置やＣＲＴ表示装置あるいはプラズマ表示装置等であり、また印刷装置７はフルカラープリンタ装置で、熱転写やインクジェットなどのノンインパクトプリンタあるいはインパクトプリンタである。
【０００８】
図２（Ｂ）はＲＡＭ２の主要構成を示し、このＲＡＭ２には各種のメモリ領域が割り当てられている。文書メモリ２−１は入力された文字列がかな漢字変換処理によって確定される毎にその確定文字列が格納されるテキストメモリである。一般辞書メモリ２−２はかな漢字変換用の辞書として通常使用される一般辞書を記憶するもので、記憶装置３に常駐されている一般辞書がこのメモリ２−２にロードされる。分野別辞書エリア２−３はデータベースサーバＤＢＳからダウンロードされたかな漢字変換用の専門分野別辞書が格納されるメモリ領域である。候補バッファ２−４は一般辞書あるいは分野別辞書を用いてかな漢字変換された際に、その変換結果を候補文字列として一時記憶するもので、その内容は表示装置６のテキスト画面に候補表示される。未知語メモリ２−５は入力文字列に該当する表記が一般辞書メモリ２−２に存在しなかった場合に、単漢字変換や伸縮変換が指示されて当該入力文字列が一般辞書メモリ２−２の内容にしたがってかな漢字変換（逐次変換）されたときに、その入力文字列（読み）と変換結果（表記）が未知語として蓄積されるメモリである。ここで、未知語とは一般辞書メモリ２−２に表記されていない単語であり、単漢字変換や伸縮変換によってユーザ自身が作り出した単語を意味している。この未知語メモリ２−５に格納された未知語が一定数（例えば１０種類）に達した際に、ＣＰＵ１はこの未知語メモリ２−５の内容を自己のターミナルNo（マシンNo）と共にデータベースサーバＤＢＳへ送信する。辞書別日時メモリ２−６は専門分野別辞書毎に日時データを記憶するもので、分野別辞書を用いてかな漢字変換される毎にその分野別辞書に対応する日時データを更新するようにしている。
【０００９】
図３（Ａ）はデータベースサーバＤＢＳの全体構成を示したブロック図である。このデータベースサーバＤＢＳはＣＰＵ１１、ＲＡＭ１２、記憶装置１３、記憶媒体１４、入力装置１５、表示装置１６を有する構成で、記憶装置１３、記憶媒体１４は上述した文書処理装置ＷＰの記憶装置３、記憶媒体４と基本的には同様であるためその説明は省略するが、記憶媒体１４には図３（Ｂ）に示すように、未知語管理テーブルＵＭＴ、専門分野別辞書ＳＰＤ、未知語分野別辞書対応テーブルＣＯＴが格納されており、それらは必要に応じてＲＡＭ１２にロードされる。ここで、文書処理装置ＷＰから未知語データが送信されて来ると、ＣＰＵ１１はこれをＲＡＭ１２内の未知語管理テーブルＵＭＴに格納すると共に、この未知語管理テーブルＵＭＴと専門分野別辞書ＳＰＤとの内容を照合し、未知語に該当する表記が存在する専門分野別辞書ＳＰＤを判定し、その判定結果を未知語分野別辞書対応テーブルＣＯＴにセットする。そして、この未知語分野別辞書対応テーブルＣＯＴの内容を解析することにより専門分野別辞書ＳＰＤを特定し、特定した専門分野別辞書ＳＰＤを文書処理装置ＷＰへ送信するようにしている。
【００１０】
図４は未知語管理テーブルＵＭＴの構成を示したもので、文書処理装置ＷＰを識別するターミナルNo毎に、未知語管理テーブルＵＭＴはレコードNo（Ｍ１、Ｍ２……Ｍｎ）、未知語データとしてその読み、表記を記憶管理する構成となっている。
図５は専門分野別辞書ＳＰＤを例示したもので、（Ａ）は物理用語辞書、（Ｂ）は化学用語辞書を示し、各専門分野別辞書ＳＰＤは読み、表記、Ａｉ情報（用例）を記憶する構成となっている。
図６は未知語分野別辞書対応テーブルＣＯＴの構成を示したもので、この未知語分野別辞書対応テーブルＣＯＴはターミナルNo毎に備えられており、その行項目には未知語情報を指定する未知語管理テーブルＵＭＴのレコードNoが固定的に記述され、また、その列項目には専門分野別辞書ＳＰＤを指定する辞書Noが固定的に記述されており、このレコードNoと辞書Noとから成るマトリックス上の各交点領域には、未知語がどの分野に属するかを判定した判定結果が記述される。この場合、図中丸印はその判定結果を示している。
【００１１】
次に、文書処理装置ＷＰ、データベースサーバＤＢＳの動作を図７〜図１１に示すフローチャートにしたがって説明する。ここで、これらのフローチャートに記述されている各機能を実現するためのプログラムは、ＣＰＵ１（１１）が読み取り可能なプログラムコードの形態で記憶媒体４に記憶されており、その内容がＲＡＭ２（１２）内のワークメモリにロードされている。
図７は文書処理装置ＷＰの全体動作の概要を示したフローチャートである。先ず、ＣＰＵ１は入力待ち状態において（ステップＡ１）、何んらかの入力があると、入力解析を行う（ステップＡ２）。いま、かな入力文字列に対するかな漢字変換として通常、例えば、文節変換や複合文節変換等の一括変換が行われる。すると、ステップＡ９に進み一括かな漢字変換処理が実行される。
【００１２】
図８はこの場合の一括変換処理を示したフローチャートである。先ず、ＣＰＵ１は一般辞書メモリ２−２を参照して入力文字列を順次かな漢字変換してゆき（ステップＢ１）、第１候補としてその全てを単語や複合語等に変換することができたかを調べ（ステップＢ２）、第１候補としてその全てを変換することができた場合には、その変換結果を候補バッファ２−４に格納し（ステップＢ５）、また、その一部のみ第１候補として変換できた場合にも未変換文字列を除き、第１候補として変換された変換結果のみを候補バッファ２−４に格納する（ステップＢ５）。いま、第１候補として変換することができなかった未変換文字列が有る場合には、ステップＢ３に進み、データベースサーバＤＢＳからロードされた専門分野別辞書が１種類でも分野別辞書エリア２−３に格納されているかを調べる。ここで、分野別辞書が存在していなければ、ステップＢ４に進み、未変換文字列を一般辞書メモリ２−２を参照することによって再度かな漢字変換し、次候補として変換された変換結果を候補バッファ２−４に格納する（ステップＢ５）。
【００１３】
このように候補バッファ２−４に変換結果が格納されると、図７のステップＡ１０に進み、候補バッファ２−４の内容が候補表示されるが、その際、一括変換された結果に、変換誤りが存在する場合には、次候補変換の他に、単漢字変換や伸縮変換によって誤り箇所の再変換を指示する。いま、伸縮変換や単漢変換が指示されると、その箇所がかな文字列に戻されると共に（ステップＡ３、Ａ６）、通常と同様の伸縮漢字変換処理（ステップＡ４）や単漢字変換処理（ステップＡ７）が実行されるが、その際、伸縮変換された文字列や単漢字変換された文字列に伸縮変換や単漢字変換が行われたことを示す識別子が付加される（ステップＡ５、Ａ８）。そして、その変換結果は候補バッファ２−４に格納されたのち候補表示される（ステップＡ１０）。
いま、図１２（Ａ）に示すように、図中アンダーラインを付した文字列が一般辞書を用いてかな漢字変換された変換候補とする。この場合、この変換誤りを正すために単漢字変換や伸縮変換が指示されると、当該文字列は仮文字列に戻される（図１２（Ｂ）参照）。この状態で単漢字変換や伸縮変換が指示されると、図１２（Ｃ）に示すような化学全部の専門用語に変換されて候補表示される。
【００１４】
そして、変換確定が指示されると、ステップＡ１１に進み、未知語判定記憶処理が行われる。
図９はこの処理内容を詳述したフローチャートであり、先ず、候補バッファ２−４内の変換文字列の中に伸縮変換された文字列が含まれているかを調べると共に（ステップＣ１）、単漢字変換された文字列が含まれているかを調べる（ステップＣ２）。ここで、何れの変換方式で変換された文字列が無ければ、このフローから抜けるが、何れか一方の変換方式で変換された文字列が含まれていれば、ステップＣ３に進み、その文字列と同一の文字列が未知語として既に未知語メモリ２−５内に格納されているかを調べる。これは同一文字列を重複して未知語メモリ２−５にセットすることを避けるためであり、同一文字列が無ければその文字列の読みと表記とを対応付けて未知語メモリ２−５に格納する（ステップＣ４）。これによって未知語メモリ２−５に格納された未知語が一定数越えたかを調べる（ステップＣ５）。この場合、文書作成中において、例えば１０種類の未知語が蓄積されている場合には、一定数越えたものと判断し、未知語メモリ２−５の内容を自己のターミナルNoと共にデータベースサーバＤＢＳへ通信回線を介して送信するが（ステップＣ６）、一定数以下であればステップＣ６の送信処理は行われない。
そして、図７のステップＡ１２に進み、候補バッファ２−４の内容を確定文字列として文書メモリ２−１に格納する処理が行われる。
【００１５】
一方、未知語メモリ２−５の内容がデータベースサーバＤＢＳへ送信されると、データベースサーバＤＢＳは図１１のフローチャートにしたがった動作を行う。すなわち、ＣＰＵ１１は文書処理装置ＷＰから送信されて来た未知語をターミナルNoに対応付けて未知語管理テーブルＵＭＴに格納する（ステップＤ１）。そして、このターミナルNo領域における未知語管理テーブルＵＭＴの先頭位置に未知語ポインタをセットしておくと共に（ステップＤ２）、専門分野別辞書ＳＰＤを順次アクセスするための分類別辞書ポインタに初期値をセットしておく（ステップＤ３）。この状態において、辞書ポインタで指定された専門分野別辞書ＳＰＤを未知語ポインタで指定された未知語に基づいて検索し、当該専門分野別辞書ＳＰＤ内にその未知語が含まれているかを判定する（ステップＤ４）。この結果、未知語に該当する表記が専門分野別辞書ＳＰＤに存在していれば、未知語分野別辞書対応テーブルＣＯＴにその判定結果をセットする（ステップＤ５）。ここで、未知語分野別辞書対応テーブルＣＯＴは上述したように未知語に対応するレコードNoが記述されており、その未知語が属する分野別辞書が検索されると、それに対応する未知語分野別辞書対応テーブルＣＯＴ内の交点領域に分野判定情報がセットされる。そして、次の専門分野別辞書ＳＰＤを指定するために辞書ポインタを更新し（ステップＤ６）、全辞書を指定したか、つまり辞書終了かを調べ（ステップＤ７）、終了でなければステップＤ４に戻り、上述の動作を繰り返す。したがって、１つの未知語が複数の辞書内に存在していれば、各辞書毎に未知語分野別辞書対応テーブルＣＯＴ内に分野判定情報がセットされることになる。
【００１６】
このようにして最初の未知語と全辞書との照合が終ると、未知語管理テーブルＵＭＴ内の次の未知語を指定するために未知語ポインタを更新し（ステップＤ８）、同一のターミナルNoに対応する全ての未知語を指定し終ったかを調べ（ステップＤ９）、未知語終了でなければ、ステップＤ３に戻り、辞書ポインタに初期値をセットし、以下、未知語が属する分野を判定するための処理を辞書ポインタを更新しながら繰り返す。これによって、未知語終了が検出されると、ターミナルNoに対応する未知語分野別辞書対応テーブルＣＯＴを解析し、分野別辞書毎に未知語を計数する（ステップＤ１０）。このようにして辞書毎に求められた計数値に基づいて１つの辞書内に複数種（例えば３以上）の未知語が含まれているかによって分野別辞書を特定することができたかを調べる（ステップＤ１１）。ここで、１つでも辞書を特定することができた場合には、特定された１または２以上の専門分野別辞書ＳＰＤの内容をターミナルNoで指定される文書処理装置ＷＰ側へ送信すると共に、特定された分野別辞書が複数存在する場合には、各辞書に対応付けて未知語数（計数値）を送出する（ステップＤ１２）。このように特定された辞書に未知語計数値を付加して送出するのは、文書処理装置ＷＰ側において、複数の分野別辞書のうちどの辞書を優先的に使用してかな漢字変換を行うかを判断することができるようにするためである。そして、当該ターミナルNoに対応する未知語管理テーブルＵＭＴおよび未知語分野別辞書対応テーブルＣＯＴの内容をそれぞれクリアする処理が行われる（ステップＤ１３）。なお、辞書を特定することができなかった場合には、辞書送信処理は行わず未知語管理テーブルＵＭＴ、未知語分野別辞書対応テーブルＣＯＴをクリアする処理のみが行われる（ステップＤ１３）。
【００１７】
ここで、データベースサーバＤＢＳから送信されて来た専門分野別辞書ＳＰＤを受信した文書処理装置ＷＰにおいては、それを自己の分野別辞書エリア２−３に順次格納してゆく（ステップＡ１３）。このように分野別辞書がダウンロードされている状態において、一括かな漢字変換が指示されると、図８のステップＢ１において、かな漢字変換処理が実行され、その結果、第１候補として変換されない文字列が存在していなければ（ステップＢ２）、分野別辞書がロードされているかを調べる（ステップＢ３）、いま、分野別辞書有りが検出するため、ステップＢ６に進み、その辞書は現在使用中かを調べる。ここで、分野別辞書がダウンロードされた直後であれば、その辞書は使用中ではないので、ステップＢ１０に進み、ダウンロードされた直後の分野別辞書を用いてかな漢字変換が行われる。この場合、ダウンロードされた分野別辞書が複数存在するときには、各辞書に付加されている未知語計数値を比較し、その計数値が多い辞書を用いてかな漢字変換が行われる。そして、変換された場合には（ステップＢ１１）、その辞書を現在使用中の分野別辞書として指定しておく（ステップＢ１２）。そして、当該辞書に対応する辞書別日時メモリ２−６内の日時データを更新する処理が行われるが（ステップＢ１３）、この場合、その辞書はダウンロードされた直後であるので、現在日時をその辞書に対応付けて辞書別日時メモリ２−６に書き込まれる。また、上述のように分類別辞書を用いてかな漢字変換された変換文字列は、候補バッファ２−４に格納される（ステップＢ５）。
【００１８】
このように使用中辞書が指定された状態において、再び、この一括かな漢字変換処理に入り、ステップＢ１、Ｂ２、Ｂ３からステップＢ６に進むと、使用中分野別辞書が有ることが検出されるため、ステップＢ７に進み、その使用中分野別辞書を用いてかな漢字変換処理が行われ、変換された場合には（ステップＢ８）、その辞書に対応する辞書別日時メモリ２−６の内容を更新すると共に（ステップＢ１３）、変換文字列を候補バッファ２−４に格納する（ステップＢ５）。
一方、使用中の分野別辞書によっても変換することができなかった場合には、他の分野別辞書が存在するかを調べ（ステップＢ９）、既にデータベースサーバＤＢＳから自己の文書処理装置ＷＰにその他の分野別辞書が転送されていれば、ステップＢ１０に進み、ダウンロードされた直後の分野別辞書が他に有れば、その未知語計数値に基づいて指定された分野別辞書を用いてかな漢字変換を行い、また、辞書別日時メモリ２−６を参照し、更新日時の分野別辞書を用いてかな漢字変換を行う。ここで、変換された場合には（ステップＢ１１）、その辞書を使用中の辞書として指定すると共にその辞書に対応する辞書別日時メモリ２−６内の日時データを更新し、更に変換結果を候補バッファ２−４に格納する（ステップＢ１２、Ｂ１３、Ｂ５）。なお、ステップＢ１１で変換されなかったことが検出された場合にはステップＢ９に進み、他の分野別辞書が有るかを再度調べる。このようにして各分野別辞書を上述した優先順位にしたがって順次指定しながら変換処理を行っても、変換することができなかった場合にはステップＢ４に進み、一般辞書メモリ２−２を用いて次候補のかな漢字変換処理を行い、その変換結果を候補バッファ２−４に格納する。
【００１９】
以下、同様の動作は文書作成が終了するまで繰り返される。ここで、文書作成の終了が指示されると（図７のステップＡ１７）、その文書作成時に得られた未知語メモリ２−５の内容をクリアすると共に、使用中分野別辞書の指定をクリアする（ステップＡ１８）。
また、電源が投入されてその直後であることが検出されると（ステップＡ１４）、各種の初期化処理が行われたのち（ステップＡ１５）、辞書削除処理に移る（ステップＡ１６）。
図１０はこの辞書削除処理を示したフローチャートである。先ず、辞書別日時メモリ２−６の先頭から１レコード分のデータを読み出し、その日時データと現在日時とから経過時間を求め、この経過時間が予め設定されている一定期間を越えたかを調べる（ステップＢ２）。例えば、最新使用日時から２日（２４時間）を越えたかを調べ、越えていれば、当該辞書を分野別辞書エリア２−３から削除する（ステップＥ３）。そして、全ての辞書を指定し終ったかを調べ（ステップＥ４）、全辞書終了までステップＥ１に戻り、上述の動作が繰り返される。なお、辞書別日時メモリ２−６内に日時データが設定されていなければ、当該辞書は削除対象外とされる。
【００２０】
以上のように、このクライアント・サーバシステムにおいて、各文書処理装置ＷＰには一般辞書が備えられ、データベースサーバＤＢＳには各種の専門分野別辞書が備えられており、文書処理装置ＷＰ側で一般辞書を用いてかな漢字変換処理を行った際に、入力文字列に該当する表記が一般辞書に存在しなかった場合に、単漢字変換や伸縮変換によって当該入力文字列が一般辞書を用いて逐次変換されたか否かを判別し、逐次変換されたものであれば、その文字列を一般辞書に対する未知語として未知語メモリ２−５に蓄えられてゆく。そして、この未知語メモリ２−５に所定数分の未知語が蓄えられた際に、未知語メモリ２−５の内容がデータベースサーバＤＢＳに送信されると、データベースサーバＤＢＳはそれを解析し、その未知語が属する分野を判定する。すると、この分野に対応する分野別辞書が文書処理装置ＷＰへダウンロードされるので、文書処理装置ＷＰ側においてはこの分野別辞書を優先的に使用してかな漢字変換を行うことができる。つまり、ユーザがどのような分野の文書を入力している場合でも、その入力途中においてその分野が自動的に判断され、豊富な分野別辞書を記憶管理するデータベースサーバＤＢＳからその分野に対応する専門分野別辞書が送信されてダウンロードされるため、以降の文書入力からは分野別辞書を優先的に使用してかな漢字変換されるため、変換効率を落さずに入力を続けることができる。したがって、ユーザ自身がどのような分野の文書を作成するかを意識することなく、専門的な分野の文書であっても効率良く変換することが可能となる。
【００２１】
また、データベースサーバＤＢＳ側において、未知語が属する分野を判定する際に、未知語と分野別辞書とを照合するようにしたから、キーワードテーブル等を参照しなくても分野を判定することができると共に、その判定を確実に行うことができる。また、文書処理装置ＷＰ側において、未知語メモリ２−５内に未知語が所定数達した際に、その内容をデータベースサーバＤＢＳへ送信して分野の判定を指示すると共に、分野を判定する際にはその分野に複数の未知語が含まれていることを条件とするため、分野の判定を確実に行うことができる。
更に、文書処理装置ＷＰ側においては分野別辞書の使用状態を常時監視し、一定時間使用されない分野別辞書を削除するようにしたから、分野別辞書エリア２−３の膨大化を防ぐことができると共に、辞書の検索スピードを不必要に低下させず、効率良い変換が可能となる。
【００２２】
なお、上述した一実施形態においては、分野別辞書の使用状態を監視し、一定期間使用されていない場合にその辞書を削除するようにしたが、文書入力中の変換回数を計数し、その計数値が一定回数を越えたことを条件に分野別辞書を削除するようにしてもよい。
また、クライアント・サーバシステムについて説明したが、スタンドアロンタイプの文書処理装置ＷＰにも適用可能である。この場合、文書処理装置ＷＰ自身が分野別辞書を記憶管理していることが条件となり、現在入力されている文書について分野が特定された場合にその分野別辞書の検索順位を１番とすることで変換速度を上げることができる。
【００２３】
【発明の効果】
この発明によれば、ユーザがどのような分野の文書を入力している場合でも、その入力途中においてその分野が自動的に判定され、それに対応する分野別辞書を用いて文字列変換を行うことができるので、入力文字列を漢字変換する際に、ユーザ自身がどのような分野の文書を作成するかを意識することなく、専門的な分野の文書であっても効率良く変換することが可能となる。
【図面の簡単な説明】
【図１】クライアント・サーバシステムを示したシステム構成図。
【図２】（Ａ）は文書処理装置ＷＰの全体構成を示したブロック図、（Ｂ）はＲＡＭ２の主要構成を示した図。
【図３】（Ａ）はデータベースサーバＤＢＳの全体構成を示した図、（Ｂ）はその主要記憶内容を示した図。
【図４】未知語管理テーブルＵＭＴの構成を示した図。
【図５】専門分野別辞書ＳＰＤを例示したもので、（Ａ）は物理用語辞書、（Ｂ）は化学用語辞書を示した図。
【図６】未知語分野別辞書対応テーブルＣＯＴの構成を示した図。
【図７】文書処理装置ＷＰの全体動作を示したフローチャート。
【図８】図７のステップＡ９（一括かな漢字変換処理）を詳述したフローチャート。
【図９】図７のステップＡ１１（未知語判別記憶処理）を詳述したフローチャート。
【図１０】図７のステップＡ１６（辞書削除処理）を詳述したフローチャート。
【図１１】データベースサーバＤＢＳの動作を示したフローチャート。
【図１２】かな漢字変換の具体例を示した図で、（Ａ）は一般辞書で変換した際に専門用語の変換誤りを例示した図、（Ｂ）は単漢字変換や伸縮変換を行う際に、変換誤りの文字列がかな文字列に戻された例を示した図、（Ｃ）は単漢字変換や伸縮変換後の文字列を例示した図。
【符号の説明】
１、１１ＣＰＵ
２、１２ＲＡＭ
２−１文書メモリ
２−２一般辞書メモリ
２−３分野別辞書エリア
２−４候補バッファ
２−５未知語メモリ
２−６辞書別日時メモリ
３、１３記憶装置
４、１４記憶媒体
５入力装置
６表示装置
ＷＰ文書処理装置
ＤＢＳデータベースサーバ
ＵＭＴ未知語管理テーブル
ＳＰＤ専門分野別辞書
ＣＯＴ未知語分野別辞書対応テーブル

[0001]
[Technical field of the invention]
The present invention relates to a character string converter for converting an input character string into Kanji and a program recording medium thereof.
[0002]
[Prior art]
Conventionally, when kana-kanji conversion is performed using the kana-kanji conversion dictionary (general dictionary) normally provided in a document processing device such as a word processor, the conversion rate becomes extremely poor for documents in specialized fields such as architecture, medicine, chemistry, and physics. When creating a document in such a specialized field, users have therefore had to resort to cumbersome input operations such as single-kanji conversion or word-by-word conversion. Alternatively, the user determined the field of the document to be created and supplied a technical term dictionary suited to that field to the document processing device from an external storage medium.
[0003]
[Problems to be solved by the invention]
However, the user must always be aware of the field of the document being created, and must also keep the external storage medium and mount it in the document processing device whenever necessary, which places a heavy burden on the user.
An object of the present invention is to enable an input character string to be converted into kanji efficiently, even for a document in a specialized field, without the user having to be aware of the field of the document being created.
[0004]
[Means for Solving the Problems]
The invention according to claim 1 comprises: first dictionary storage means for storing a general dictionary normally used as a character string conversion dictionary for converting an input character string into kanji; second dictionary storage means for storing, as character string conversion dictionaries for converting an input character string into kanji, field-specific dictionaries classified by specialized field; batch conversion means for batch-converting an input character string using the general dictionary; sequential conversion means for sequentially converting the input character string using the general dictionary when an operation instructing sequential conversion is performed while the conversion result obtained by the batch conversion means is displayed as a candidate; unknown word storage means for storing, when a confirmation operation is performed on the conversion result obtained by the sequential conversion means, the input character string that was the target of the sequential conversion as an unknown word with respect to the general dictionary; and dictionary designating means for searching each of the field-specific dictionaries based on the unknown words in the unknown word storage means and designating a field-specific dictionary in which a notation corresponding to the unknown word exists, wherein character string conversion is performed by preferentially using the field-specific dictionary designated by the dictionary designating means.
Furthermore, a program for causing a computer to realize the main functions of the invention according to claim 1 is provided (the invention according to claim 5).
The invention according to claim 1 may also be configured as follows.
The dictionary designating means designates the field-specific dictionary corresponding to a field when it determines that at least two types of unknown words among those stored in the unknown word storage means belong to that same field (the invention according to claim 2).
A communication system comprising a document processing device having a general dictionary normally used as a character string conversion dictionary for converting an input character string into kanji, and a dictionary management device that stores and manages field-specific dictionaries classified by specialized field, wherein: the document processing device comprises batch conversion means for batch-converting an input character string using the general dictionary, sequential conversion means for performing sequential conversion using the general dictionary when an operation instructing sequential conversion is performed while the conversion result obtained by the batch conversion means is displayed as a candidate, unknown word storage means for storing, when a confirmation operation is performed on the conversion result obtained by the sequential conversion means, the input character string that was the target of the sequential conversion as an unknown word with respect to the general dictionary, and transmission means for transmitting the contents of the unknown word storage means to the dictionary management device; the dictionary management device comprises dictionary designating means for searching each of the field-specific dictionaries based on the unknown words transmitted from the document processing device and designating a field-specific dictionary in which a notation corresponding to the unknown word exists, and transmission means for transmitting the field-specific dictionary designated by the dictionary designating means to the document processing device; and the document processing device performs character string conversion by preferentially using the field-specific dictionary transmitted from the dictionary management device (the invention according to claim 3).
The document processing device comprises dictionary storage means for storing a field-specific dictionary transmitted from the dictionary management device, determining means for determining whether the field-specific dictionary stored in the dictionary storage means has gone unused for a predetermined period, and deleting means for deleting the field-specific dictionary from the dictionary storage means when the determining means determines that it has not been used for the predetermined period (the invention according to claim 4).
[0005]
According to the invention of claim 1, after the result of batch-converting an input character string (reading character string) with the general dictionary is displayed as a candidate, the input character string is reconverted when sequential conversion is instructed. The sequential conversion is performed using the general dictionary, and when confirmation of its result is instructed, the input character string is treated as an unknown word with respect to the general dictionary. Each field-specific dictionary is then searched based on this unknown word, and a field-specific dictionary in which a notation corresponding to the unknown word exists is designated as the dictionary to be used preferentially. Consequently, when creating a document, the user only has to check the result of batch conversion and, on finding an error, correct that portion by sequential conversion and perform a confirmation operation; character string conversion then preferentially uses the specialized dictionary of the field to which the document content belongs. The invention thus has the effect that even a document in a specialized field can be converted efficiently without the user having to be aware of the field of the document being created.
[0006]
DETAILED DESCRIPTION OF THE INVENTION
An embodiment of the present invention will be described below with reference to FIGS. 1 to 12.
FIG. 1 is a system configuration diagram showing a client-server system. This system is a local area network system in which a plurality of document processing devices WP and a database server DBS are connected via a dedicated on-premises line. Each document processing device WP is provided with a general dictionary normally used for kana-kanji conversion, whereas the field-specific dictionaries classified by specialized field are stored and managed on the database server DBS side, and the database server DBS downloads a field-specific dictionary to a document processing device WP as necessary.
[0007]
FIG. 2A is a block diagram showing the overall configuration of the document processing apparatus WP.
The CPU 1 is a central processing unit that controls the overall operation of the document processing device WP according to various programs loaded in the RAM 2. The storage device 3 has a storage medium 4, in which an operating system, various application programs, data files, a kana-kanji conversion dictionary, character font data, and the like are stored in advance, and a drive system for that medium. The storage medium 4 is either fixed or detachably mountable, and consists of a magnetic or optical storage medium such as a floppy disk, hard disk, optical disk, or RAM card, or of a semiconductor memory. Programs and data in the storage medium 4 are loaded into the RAM 2 under the control of the CPU 1 as necessary. The CPU 1 can also receive programs and data transmitted from other devices via a communication line or the like and store them in the storage medium 4, and can use programs and data stored in a storage medium of another device via a communication line or the like.
An input device 5, a display device 6, and a printing device 7, which are its input/output peripheral devices, are connected to the CPU 1 via a bus line, and the CPU 1 controls their operation according to an input/output program.
The input device 5 has a keyboard for entering character string data and various commands, as well as a pointing device such as a mouse. The display device 6 is a multicolor liquid crystal display, CRT display, plasma display, or the like. The printing device 7 is a full-color printer, either a non-impact printer such as a thermal transfer or inkjet printer, or an impact printer.
[0008]
FIG. 2B shows the main configuration of the RAM 2, to which various memory areas are allocated. The document memory 2-1 is a text memory that stores the confirmed character string each time an input character string is confirmed by kana-kanji conversion processing. The general dictionary memory 2-2 stores the general dictionary normally used for kana-kanji conversion; the general dictionary resident in the storage device 3 is loaded into this memory 2-2. The field dictionary area 2-3 is a memory area that stores the specialized field dictionaries for kana-kanji conversion downloaded from the database server DBS. The candidate buffer 2-4 temporarily stores, as candidate character strings, the results of kana-kanji conversion performed using the general dictionary or a field-specific dictionary, and its contents are displayed as candidates on the text screen of the display device 6. The unknown word memory 2-5 accumulates unknown words: when no notation corresponding to an input character string exists in the general dictionary memory 2-2, and single-kanji conversion or expansion/contraction conversion is instructed so that the input character string undergoes kana-kanji conversion (sequential conversion) according to the contents of the general dictionary memory 2-2, the input character string (reading) and the conversion result (notation) are stored as an unknown word. Here, an unknown word is a word that is not registered in the general dictionary memory 2-2, that is, a word the user has composed by single-kanji conversion or expansion/contraction conversion. When the number of unknown words stored in the unknown word memory 2-5 reaches a fixed number (for example, 10 types), the CPU 1 transmits the contents of the unknown word memory 2-5 to the database server DBS together with its own terminal No. (machine No.). The dictionary date/time memory 2-6 stores date and time data for each specialized field dictionary, and the date and time data corresponding to a field-specific dictionary is updated each time kana-kanji conversion is performed using that dictionary.
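The relationship between the field dictionary area 2-3 and the dictionary date/time memory 2-6, and the deletion of dictionaries left unused for a predetermined period (claim 4), can be illustrated roughly as follows. This is only a minimal sketch under stated assumptions: the class name `FieldDictionaryArea`, its methods, the sample entries, and the choice to update the date and time only when a lookup succeeds are all hypothetical and are not taken from the specification.

```python
from datetime import datetime, timedelta


class FieldDictionaryArea:
    """Hypothetical stand-in for field dictionary area 2-3 plus date/time memory 2-6."""

    def __init__(self):
        self.dictionaries = {}  # dictionary No. -> {reading: notation}
        self.last_used = {}     # dictionary No. -> date/time of last use

    def store(self, dict_no, entries):
        # A newly downloaded dictionary has no date/time data yet.
        self.dictionaries[dict_no] = dict(entries)

    def convert(self, dict_no, reading, now):
        # Assumption: the date/time data is updated when the lookup succeeds.
        notation = self.dictionaries[dict_no].get(reading)
        if notation is not None:
            self.last_used[dict_no] = now
        return notation

    def delete_unused(self, now, max_idle):
        # Only dictionaries that have date/time data are candidates for deletion.
        expired = [no for no, used in self.last_used.items()
                   if now - used > max_idle]
        for no in expired:
            del self.dictionaries[no]
            del self.last_used[no]
        return expired


# Illustrative data only; the entries are not from the patent figures.
area = FieldDictionaryArea()
area.store(1, {"えんたるぴー": "エンタルピー"})
area.store(2, {"りゅうさん": "硫酸"})
start = datetime(1997, 6, 27, 9, 0)
area.convert(1, "えんたるぴー", start)
removed = area.delete_unused(start + timedelta(days=3), timedelta(days=2))
```

In this sketch dictionary 1 is removed because it was last used more than the assumed idle limit ago, while dictionary 2, which has never been used and so has no date and time data, is left in place.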
[0009]
FIG. 3A is a block diagram showing the overall configuration of the database server DBS. The database server DBS comprises a CPU 11, a RAM 12, a storage device 13, a storage medium 14, an input device 15, and a display device 16. The storage device 13 and the storage medium 14 are basically the same as the storage device 3 and the storage medium 4 of the document processing device WP described above, so their description is omitted. As shown in FIG. 3B, however, the storage medium 14 stores an unknown word management table UMT, specialized field dictionaries SPD, and an unknown word field dictionary correspondence table COT, which are loaded into the RAM 12 as necessary. When unknown word data is transmitted from a document processing device WP, the CPU 11 stores it in the unknown word management table UMT in the RAM 12, collates the contents of the unknown word management table UMT with those of the specialized field dictionaries SPD, determines the specialized field dictionaries SPD in which a notation corresponding to an unknown word exists, and sets the determination result in the unknown word field dictionary correspondence table COT. By analyzing the contents of this table COT, the CPU 11 then identifies a specialized field dictionary SPD and transmits the identified dictionary to the document processing device WP.
[0010]
FIG. 4 shows the structure of the unknown word management table UMT. For each terminal No. identifying a document processing device WP, the unknown word management table UMT stores and manages record Nos. (M1, M2, ... Mn) together with the reading and notation of each unknown word as unknown word data.
FIG. 5 shows examples of the specialized field dictionaries SPD: (A) shows a physics term dictionary and (B) a chemistry term dictionary. Each specialized field dictionary SPD stores readings, notations, and Ai information (usage examples).
FIG. 6 shows the structure of the unknown word field dictionary correspondence table COT, one of which is provided for each terminal No. Its row items fixedly list the record Nos. of the unknown word management table UMT, which designate the unknown word information, and its column items fixedly list the dictionary Nos. that designate the specialized field dictionaries SPD. Each intersection area of the matrix formed by the record Nos. and dictionary Nos. holds the result of determining which field the unknown word belongs to. In the figure, a circle indicates such a determination result.
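The collation performed by the database server DBS and the matrix of FIG. 6 can be sketched as follows. This is an illustrative sketch, not the patented implementation: the function names `build_cot` and `select_dictionaries`, the representation of the table COT as a set of (record No., dictionary No.) pairs, and the sample words are assumptions. The default threshold of two unknown words per field follows claim 2; the value actually used is left as a parameter.

```python
def build_cot(unknown_words, field_dictionaries):
    """Mark each (record No., dictionary No.) intersection where a match exists.

    unknown_words: {record_no: (reading, notation)}, standing in for table UMT.
    field_dictionaries: {dict_no: set of notations}, standing in for SPD.
    """
    cot = set()
    for record_no, (_reading, notation) in unknown_words.items():
        for dict_no, notations in field_dictionaries.items():
            if notation in notations:
                cot.add((record_no, dict_no))  # a "circle" in FIG. 6
    return cot


def select_dictionaries(cot, min_words=2):
    """Count unknown words per dictionary and keep those meeting the threshold."""
    counts = {}
    for _record_no, dict_no in cot:
        counts[dict_no] = counts.get(dict_no, 0) + 1
    return {dict_no: n for dict_no, n in counts.items() if n >= min_words}


# Illustrative data only; not taken from the patent figures.
umt = {"M1": ("りゅうさん", "硫酸"),
       "M2": ("えんさん", "塩酸"),
       "M3": ("じば", "磁場")}
spd = {"physics": {"磁場", "電場"},
       "chemistry": {"硫酸", "塩酸", "酢酸"}}
cot = build_cot(umt, spd)
selected = select_dictionaries(cot)
```

With this sample data only the chemistry dictionary is selected, because it contains two of the unknown words whereas the physics dictionary contains one. Returning the count alongside each selected dictionary keeps the number of matching unknown words available to whoever receives the result.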
[0011]
Next, the operations of the document processing device WP and the database server DBS will be described with reference to the flowcharts shown in FIGS. 7 to 11. The program for realizing each function described in these flowcharts is stored in the storage medium 4 in the form of program code readable by the CPU 1 (11), and its contents are loaded into the working memory in the RAM 2 (12).
FIG. 7 is a flowchart outlining the overall operation of the document processing device WP. First, in the input waiting state (step A1), the CPU 1 performs input analysis when any input is received (step A2). Suppose that batch conversion such as phrase conversion or compound phrase conversion, which is the usual form of kana-kanji conversion for a kana input character string, is instructed. The process then proceeds to step A9, where batch kana-kanji conversion processing is executed.
[0012]
FIG. 8 is a flowchart showing the batch conversion processing in this case. First, the CPU 1 refers to the general dictionary memory 2-2 and converts the input character string into kana-kanji in sequence (step B1), then checks whether the whole string could be converted into words, compound words, and the like as first candidates (step B2). If the whole string could be converted as first candidates, the conversion result is stored in the candidate buffer 2-4 (step B5). If only part of it could be converted as first candidates, only that part of the conversion result, excluding the unconverted character string, is stored in the candidate buffer 2-4 (step B5). When there is an unconverted character string that could not be converted as a first candidate, the process proceeds to step B3 to check whether at least one specialized field dictionary loaded from the database server DBS is stored in the field dictionary area 2-3. If no field-specific dictionary exists, the process proceeds to step B4, where the unconverted character string is converted again by referring to the general dictionary memory 2-2, and the result converted as the next candidate is stored in the candidate buffer 2-4 (step B5).
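The control flow of steps B1 to B5 can be illustrated with the simplified sketch below. Several points are assumptions made only for illustration: the input is taken to be already split into readings, each dictionary is a plain reading-to-notation mapping, the next-candidate conversion of step B4 is modelled by a hypothetical `next_candidate` callback that by default leaves the reading as kana, and the handling of loaded field-specific dictionaries is reduced to a lookup in the order given.

```python
def batch_convert(segments, general_dictionary, field_dictionaries=(),
                  next_candidate=None):
    """Return the contents of a hypothetical candidate buffer 2-4."""
    if next_candidate is None:
        # Assumption: with no better candidate, the reading stays as kana.
        next_candidate = lambda reading: reading

    candidate_buffer = []
    for reading in segments:
        # Step B1: first-candidate conversion with the general dictionary.
        notation = general_dictionary.get(reading)
        # Steps B2 and B3: unconverted, so check for loaded field dictionaries.
        if notation is None and field_dictionaries:
            for dictionary in field_dictionaries:
                notation = dictionary.get(reading)
                if notation is not None:
                    break
        # Step B4: fall back to next-candidate conversion.
        if notation is None:
            notation = next_candidate(reading)
        # Step B5: store the result in the candidate buffer.
        candidate_buffer.append(notation)
    return candidate_buffer


# Illustrative data only.
general = {"きょう": "今日", "は": "は"}
without_field = batch_convert(["きょう", "は", "りゅうさん"], general)
with_field = batch_convert(["りゅうさん"], general, [{"りゅうさん": "硫酸"}])
```

Without a field-specific dictionary the specialized term stays unconverted in this sketch, which corresponds to the situation where the user would then correct it by single-kanji or expansion/contraction conversion.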
[0013]
When the conversion result has been stored in the candidate buffer 2-4 in this way, the process proceeds to step A10 in FIG. 7 and the contents of the candidate buffer 2-4 are displayed as candidates. If the batch conversion result contains a conversion error, the user instructs reconversion of the erroneous portion by next-candidate conversion or by single-kanji conversion or expansion/contraction conversion. When expansion/contraction conversion or single-kanji conversion is instructed, that portion is returned to a kana character string (steps A3, A6), and the usual expansion/contraction conversion processing (step A4) or single-kanji conversion processing (step A7) is executed. At that time, an identifier indicating that expansion/contraction conversion or single-kanji conversion has been performed is added to the converted character string (steps A5, A8). The conversion result is then stored in the candidate buffer 2-4 and displayed as a candidate (step A10).
Suppose now that, as shown in FIG. 12A, the underlined character string in the figure is a conversion candidate produced by kana-kanji conversion using the general dictionary. When single-kanji conversion or expansion/contraction conversion is instructed to correct this conversion error, the character string is returned to a kana character string (see FIG. 12B). When single-kanji conversion or expansion/contraction conversion is carried out in this state, the string is converted into chemistry technical terms as shown in FIG. 12C and displayed as a candidate.
[0014]
When confirmation of the conversion is instructed, the process proceeds to step A11, where unknown word determination and storage processing is performed.
FIG. 9 is a flowchart detailing this processing. First, it is checked whether the converted character strings in the candidate buffer 2-4 include a character string produced by expansion/contraction conversion (step C1) and whether they include a character string produced by single-kanji conversion (step C2). If no character string was converted by either method, the flow ends. If a character string converted by either method is included, the process proceeds to step C3, which checks whether the same character string is already stored as an unknown word in the unknown word memory 2-5. This avoids storing the same character string in the unknown word memory 2-5 twice. If the same character string is not present, the reading and notation of the character string are stored in association with each other in the unknown word memory 2-5 (step C4). It is then checked whether the number of unknown words stored in the unknown word memory 2-5 has reached the fixed number (step C5). If, for example, 10 types of unknown words have accumulated during document creation, the fixed number is judged to have been reached, and the contents of the unknown word memory 2-5 are transmitted together with the device's own terminal No. to the database server DBS via the communication line (step C6). If the fixed number has not been reached, the transmission in step C6 is not performed.
Then, the process proceeds to step A12 in FIG. 7, and the process of storing the contents of the candidate buffer 2-4 as a confirmed character string in the document memory 2-1 is performed.
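The unknown word storage flow of FIG. 9 can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the names `Segment`, `store_unknown_words`, `send`, and the method labels are hypothetical, the unknown word memory 2-5 is modeled as a plain list of (reading, notation) pairs, and the threshold of 10 is only the example value given in the description.

```python
from dataclasses import dataclass

# Labels for the conversion methods are assumptions made for this sketch.
SEQUENTIAL_METHODS = {"single_kanji", "expand_contract"}


@dataclass(frozen=True)
class Segment:
    reading: str   # kana reading of the confirmed string
    notation: str  # confirmed notation
    method: str    # "batch" or one of SEQUENTIAL_METHODS


def store_unknown_words(segments, unknown_memory, send, terminal_no, threshold=10):
    """Sketch of steps C1 to C6. Returns True if the memory was transmitted."""
    # C1/C2: look for strings obtained by single kanji or expansion/contraction conversion.
    sequential = [s for s in segments if s.method in SEQUENTIAL_METHODS]
    if not sequential:
        return False  # nothing was converted sequentially, so leave the flow

    for seg in sequential:
        pair = (seg.reading, seg.notation)
        # C3: do not store the same string twice.
        if pair in unknown_memory:
            continue
        # C4: store reading and notation in association with each other.
        unknown_memory.append(pair)

    # C5/C6: transmit with the terminal No. once enough unknown words have accumulated.
    if len(unknown_memory) >= threshold:
        send(terminal_no, list(unknown_memory))
        return True
    return False
```

In this sketch the memory is not cleared after transmission, because the description clears it only at the end of document creation (step A18). Whether the device retransmits on later confirmations is not stated, so a real implementation would need to decide that point.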
[0015]
On the other hand, when the contents of the unknown word memory 2-5 are transmitted to the database server DBS, the database server DBS operates according to the flowchart of FIG. 11. That is, the CPU 11 stores the unknown words transmitted from the document processing device WP in the unknown word management table UMT in association with the terminal No. (step D1). Then, an unknown word pointer is set to the head position of the area for that terminal No. in the unknown word management table UMT (step D2), and an initial value is set in the dictionary pointer used to access the specialized field dictionaries SPD in sequence (step D3). In this state, the specialized field dictionary SPD designated by the dictionary pointer is searched for the unknown word designated by the unknown word pointer, and it is determined whether the unknown word is included in that dictionary (step D4). If a notation corresponding to the unknown word exists in the specialized field dictionary SPD, the determination result is set in the unknown word field dictionary correspondence table COT (step D5). As described above, the unknown word field dictionary correspondence table COT holds the record No. corresponding to each unknown word, and when a field-specific dictionary to which the unknown word belongs is found, field determination information is set in the corresponding intersection area of the table. Then, the dictionary pointer is updated to designate the next specialized field dictionary SPD (step D6), and it is checked whether all dictionaries have been designated, that is, whether the end of the dictionaries has been reached (step D7). If not, the process returns to step D4 and the above operation is repeated.
Therefore, if one unknown word exists in a plurality of dictionaries, the field determination information is set in the unknown word field dictionary correspondence table COT for each dictionary.
[0016]
When the comparison between the first unknown word and all the dictionaries is completed in this way, the unknown word pointer is updated to designate the next unknown word in the unknown word management table UMT (step D8), and it is checked whether all the unknown words for the terminal have been designated (step D9). If unknown words remain, the process returns to step D3, the initial value is set in the dictionary pointer, and the field to which the unknown word belongs is determined while the dictionary pointer is updated. When the end of the unknown words is detected, the unknown word field dictionary correspondence table COT for the terminal No. is analyzed and the unknown words are counted for each field-specific dictionary (step D10). Based on the count value obtained for each dictionary, it is checked whether a field-specific dictionary can be specified, according to whether a plurality of types (for example, three or more) of unknown words are included in one dictionary (step D11). If at least one dictionary can be specified, the contents of the one or more specified specialized field dictionaries SPD are transmitted to the document processing device WP identified by the terminal No. If a plurality of field-specific dictionaries are specified, the number of unknown words (count value) is sent in association with each dictionary (step D12). The count value is attached to each specified dictionary so that the document processing device WP can judge which of the plural field-specific dictionaries to use preferentially for kana-kanji conversion. Then, the contents of the unknown word management table UMT and the unknown word field dictionary correspondence table COT for that terminal No. are cleared (step D13).
If the dictionary cannot be specified, the dictionary transmission process is not performed, and only the process of clearing the unknown word management table UMT and the unknown word field dictionary correspondence table COT is performed (step D13).
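The field determination performed on the database server DBS (steps D2 to D12) can be sketched as follows. This is a simplified illustration under stated assumptions: each specialized field dictionary SPD is modeled as a set of notations, the correspondence table COT as a mapping from each unknown word to the dictionaries that contain it, and the function name `designate_dictionaries` and the parameter `min_hits` are hypothetical. The default of three follows the example value in the description.

```python
from collections import Counter


def designate_dictionaries(unknown_words, field_dictionaries, min_hits=3):
    """Return {dictionary name: unknown word count} for dictionaries that qualify.

    unknown_words: notations received from one terminal.
    field_dictionaries: mapping of dictionary name to a set of notations.
    """
    # D2 to D9: for each unknown word, check every field-specific dictionary and
    # record the matches (the role played by the correspondence table COT).
    cot = {}
    for word in unknown_words:
        cot[word] = [
            name for name, entries in field_dictionaries.items() if word in entries
        ]

    # D10: count the unknown words found in each dictionary.
    counts = Counter(name for hits in cot.values() for name in hits)

    # D11: a dictionary is designated only if it contains several unknown words.
    # D12: the count travels with the dictionary so the client can rank them.
    return {name: n for name, n in counts.items() if n >= min_hits}
```

An unknown word that appears in more than one dictionary contributes to the count of each of them, which matches the behavior described for the COT. An empty result corresponds to the case in which no dictionary can be specified and nothing is transmitted.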
[0017]
The document processing device WP that has received the specialized field dictionaries SPD transmitted from the database server DBS stores them sequentially in its own field-specific dictionary area 2-3 (step A13). When kana-kanji conversion is instructed with a field-specific dictionary downloaded in this way, the kana-kanji conversion process is executed in step B1 of FIG. 8, and if, as a result, there is a character string that could not be converted as the first candidate (step B2), it is checked whether a field-specific dictionary is loaded (step B3). Since a field-specific dictionary is detected, the process proceeds to step B6 to check whether a dictionary is currently in use. If the field-specific dictionary has only just been downloaded, no dictionary is in use, so the process proceeds to step B10, and kana-kanji conversion is performed using the field-specific dictionary that has just been downloaded. When a plurality of downloaded field-specific dictionaries exist, the unknown word count values attached to the dictionaries are compared, and kana-kanji conversion is performed using the dictionary with the larger count value. If the conversion succeeds (step B11), that dictionary is designated as the field-specific dictionary currently in use (step B12). Then, the date/time data in the dictionary date/time memory 2-6 corresponding to the dictionary is updated (step B13). In this case, since the dictionary has just been downloaded, the current date/time is written in the dictionary date/time memory 2-6 in association with the dictionary. The character string converted using the field-specific dictionary in this way is stored in the candidate buffer 2-4 (step B5).
[0018]
When batch kana-kanji conversion is entered again with the in-use dictionary designated in this way, the process proceeds through steps B1, B2, and B3 to step B6, where it is detected that there is a field-specific dictionary in use. In step B7, kana-kanji conversion is performed using the in-use field-specific dictionary, and if the conversion succeeds (step B8), the contents of the dictionary date/time memory 2-6 corresponding to the dictionary are updated (step B13) and the converted character string is stored in the candidate buffer 2-4 (step B5).
On the other hand, if conversion is not possible even with the in-use field-specific dictionary, it is checked whether another field-specific dictionary exists (step B9). If another field-specific dictionary has already been transferred from the database server DBS to the document processing device WP, the process proceeds to step B10. If there is another field-specific dictionary that has just been downloaded, kana-kanji conversion is performed using the field-specific dictionary selected on the basis of the unknown word count value. In addition, the dictionary date/time memory 2-6 is referred to, and kana-kanji conversion is performed using the dictionary whose date/time was updated most recently. If the conversion succeeds (step B11), that dictionary is designated as the dictionary in use, the date/time data in the dictionary date/time memory 2-6 corresponding to it is updated, and the conversion result is stored in the candidate buffer 2-4 (steps B12, B13, B5). If it is detected in step B11 that conversion was not possible, the process returns to step B9 to check again whether there is another field-specific dictionary. If conversion is still not possible after the field-specific dictionaries have been tried in turn in accordance with the above priorities, the process proceeds to step B4, where kana-kanji conversion for the next candidate is performed using the general dictionary memory 2-2 and the conversion result is stored in the candidate buffer 2-4.
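The order in which the field-specific dictionaries are tried (steps B6 to B13, with the fallback of step B4) can be sketched as below. The sketch omits the initial general-dictionary pass of steps B1 and B2 and treats each dictionary as a simple reading-to-notation mapping. `FieldDictionary`, `convert_with_priority`, and the `state` mapping are hypothetical names, and trying never-used dictionaries before previously used ones is an assumption about how the two ranking rules combine.

```python
from datetime import datetime


class FieldDictionary:
    def __init__(self, name, entries, unknown_count=0):
        self.name = name
        self.entries = entries              # reading -> notation
        self.unknown_count = unknown_count  # count value sent with the dictionary
        self.last_used = None               # None means downloaded but not yet used


def convert_with_priority(reading, general, field_dicts, state, now=None):
    """Try the in-use dictionary, then the others by priority, then the general one."""
    now = now or datetime.now()
    in_use = state.get("in_use")

    # B6 to B8: the dictionary currently in use is tried first.
    if in_use is not None and reading in in_use.entries:
        in_use.last_used = now  # B13: refresh the date/time record
        return in_use.entries[reading]

    # B9/B10: rank the remaining dictionaries. Just-downloaded ones are ordered by
    # unknown word count, previously used ones by most recent use (assumed order).
    remaining = [d for d in field_dicts if d is not in_use]
    fresh = sorted((d for d in remaining if d.last_used is None),
                   key=lambda d: d.unknown_count, reverse=True)
    used = sorted((d for d in remaining if d.last_used is not None),
                  key=lambda d: d.last_used, reverse=True)

    for candidate in fresh + used:
        if reading in candidate.entries:  # B11: conversion succeeded
            state["in_use"] = candidate   # B12: designate as the in-use dictionary
            candidate.last_used = now     # B13
            return candidate.entries[reading]

    # B4: fall back to the general dictionary.
    return general.get(reading)
```

Keeping the in-use dictionary in a separate `state` mapping mirrors the description, where the designation is cleared at the end of document creation while the dictionaries themselves stay loaded.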
[0019]
Thereafter, the same operation is repeated until document creation is completed. When the end of document creation is instructed (step A17 in FIG. 7), the contents of the unknown word memory 2-5 obtained during document creation are cleared, and the designation of the in-use field-specific dictionary is also cleared (step A18).
When it is detected that the power has just been turned on (step A14), various initialization processes are performed (step A15), and then the process proceeds to the dictionary deletion process (step A16).
FIG. 10 is a flowchart showing the dictionary deletion process. First, data for one record is read from the head of the dictionary date/time memory 2-6, the elapsed time is obtained from the date/time data and the current date/time, and it is checked whether this elapsed time exceeds a preset fixed period (step E2). For example, it is checked whether two days (24 hours) have passed since the latest use date and time, and if so, the dictionary is deleted from the field-specific dictionary area 2-3 (step E3). Then, it is checked whether all the dictionaries have been designated (step E4), and the process returns to step E1 and repeats the above operation until the last dictionary has been processed. If no date/time data is set in the dictionary date/time memory 2-6 for a dictionary, that dictionary is excluded from deletion.
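The dictionary deletion process of FIG. 10 amounts to a scan over the date/time records. The sketch below is illustrative only: `purge_stale_dictionaries` is a hypothetical name, the field-specific dictionary area 2-3 and the dictionary date/time memory 2-6 are modeled as two mappings keyed by dictionary name, and the retention period is passed in as a parameter rather than fixed.

```python
from datetime import datetime, timedelta


def purge_stale_dictionaries(field_area, last_used, now, max_idle):
    """Delete dictionaries unused for longer than max_idle; return the deleted names.

    field_area: mapping of dictionary name to dictionary contents.
    last_used: mapping of dictionary name to a datetime, or None if never set.
    """
    deleted = []
    # E1/E4: visit every dictionary in turn (copy the keys because we delete).
    for name in list(field_area):
        stamp = last_used.get(name)
        if stamp is None:
            continue  # no date/time data set: excluded from deletion
        # E2: compare the elapsed time with the preset period.
        if now - stamp > max_idle:
            del field_area[name]  # E3: remove from the field-specific dictionary area
            deleted.append(name)
    return deleted
```

For example, calling the function at power-on with `max_idle=timedelta(days=1)` would remove only those dictionaries whose last recorded use is more than one day old and leave untouched any dictionary that has no recorded use.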
[0020]
As described above, in this client/server system, each document processing device WP is provided with a general dictionary, and the database server DBS is provided with various specialized field dictionaries. When kana-kanji conversion is performed using the general dictionary and a notation corresponding to the input character string does not exist in the general dictionary, the input character string is sequentially converted using the general dictionary by single kanji conversion or expansion/contraction conversion, and the sequentially converted character string is stored in the unknown word memory 2-5 as an unknown word for the general dictionary. When a predetermined number of unknown words have been stored in the unknown word memory 2-5, its contents are transmitted to the database server DBS, which analyzes them and determines the field to which the unknown words belong. The field-specific dictionary corresponding to this field is then downloaded to the document processing device WP, which can perform kana-kanji conversion preferentially using this field-specific dictionary. In other words, whatever field the user's document belongs to, the field is automatically determined partway through input, and the specialized field dictionary corresponding to that field is transmitted and downloaded from the database server DBS, which stores and manages a rich set of field-specific dictionaries. Kana-kanji conversion is therefore performed preferentially using the field-specific dictionary for subsequent document input, so input can be continued without a drop in conversion efficiency. Consequently, a document in a specialized field can be converted efficiently without the user being aware of the field of the document being created.
[0021]
In addition, when the field to which the unknown words belong is determined on the database server DBS side, the unknown words are collated against the field-specific dictionaries, so the field can be determined reliably without referring to a keyword table or the like. Further, when the number of unknown words in the unknown word memory 2-5 on the document processing device WP side reaches a predetermined number, its contents are transmitted to the database server DBS to request field determination, and a field is determined only on condition that a plurality of unknown words are included in that field, so the field can be determined reliably.
Furthermore, since the document processing device WP constantly monitors the usage status of the field-specific dictionaries and deletes any field-specific dictionary that has not been used for a certain period of time, the field-specific dictionary area 2-3 is prevented from becoming too large, and efficient conversion is possible without unnecessarily reducing the dictionary search speed.
[0022]
In the above-described embodiment, the usage status of the field-specific dictionaries is monitored and a dictionary is deleted when it has not been used for a certain period of time. Alternatively, a field-specific dictionary may be deleted on condition that a numerical value exceeds a certain number.
Although a client/server system has been described, the present invention can also be applied to a stand-alone document processing device WP. In this case, the document processing device WP itself must store and manage the field-specific dictionaries, and when the field of the document currently being input is specified, the conversion speed can be increased by placing the corresponding field-specific dictionary first in the search order.
[0023]
[Effect of the invention]
According to the present invention, whatever field the user's document belongs to, the field is automatically determined partway through input, and character string conversion is performed using the corresponding field-specific dictionary. Therefore, when an input character string is converted to kanji, even a document in a specialized field can be converted efficiently without the user being aware of the field of the document being created.
[Brief description of the drawings]
FIG. 1 is a system configuration diagram showing a client/server system.
FIG. 2A is a block diagram showing the overall configuration of the document processing device WP, and FIG. 2B is a diagram showing the main configuration of the RAM 2.
FIG. 3A is a diagram showing the overall configuration of the database server DBS, and FIG. 3B is a diagram showing its main storage contents.
FIG. 4 is a diagram showing the configuration of the unknown word management table UMT.
FIGS. 5A and 5B illustrate the specialized field dictionaries SPD; FIG. 5A shows a physics term dictionary and FIG. 5B shows a chemistry term dictionary.
FIG. 6 is a diagram showing the configuration of the unknown word field dictionary correspondence table COT.
FIG. 7 is a flowchart showing the overall operation of the document processing device WP.
FIG. 8 is a flowchart detailing step A9 (batch kana-kanji conversion processing) in FIG. 7.
FIG. 9 is a flowchart detailing step A11 (unknown word determination storage processing) in FIG. 7.
FIG. 10 is a flowchart detailing step A16 (dictionary deletion processing) in FIG. 7.
FIG. 11 is a flowchart showing the operation of the database server DBS.
FIG. 12 is a diagram showing a specific example of kana-kanji conversion: (A) illustrates conversion errors in technical terms converted with the general dictionary, (B) shows an example in which the erroneously converted character string is returned to a kana character string when single kanji conversion or expansion/contraction conversion is performed, and (C) illustrates the character string after single kanji conversion or expansion/contraction conversion.
[Explanation of symbols]
1, 11 CPU
2, 12 RAM
2-1 Document memory
2-2 General dictionary memory
2-3 Field-specific dictionary area
2-4 Candidate buffer
2-5 Unknown word memory
2-6 Dictionary date/time memory
3, 13 Storage device
4, 14 Storage medium
5 Input device
6 Display device
WP Document processing device
DBS Database server
UMT Unknown word management table
SPD Specialized field dictionary
COT Unknown word field dictionary correspondence table

Claims

A character string conversion device comprising:
first dictionary storage means for storing a general dictionary normally used as a character string conversion dictionary for converting an input character string into kanji;
second dictionary storage means for storing field-specific dictionaries classified by specialized field as character string conversion dictionaries for converting an input character string into kanji;
batch conversion means for batch-converting an input character string using the general dictionary;
sequential conversion means for sequentially converting the input character string using the general dictionary when an instruction operation for instructing sequential conversion is performed while the conversion result obtained by the batch conversion means is displayed as a candidate;
unknown word storage means for storing, when a confirmation operation is performed on the conversion result obtained by the sequential conversion means, the input character string that was the conversion target of the sequential conversion as an unknown word for the general dictionary; and
dictionary designation means for searching each of the field-specific dictionaries on the basis of the unknown words in the unknown word storage means and designating a field-specific dictionary in which a notation corresponding to the unknown word exists,
wherein character string conversion is performed by preferentially using the field-specific dictionary designated by the dictionary designation means.

The character string conversion device according to claim 1, wherein the dictionary designation means designates the field-specific dictionary corresponding to a field when it determines that at least two types of unknown words among the unknown words stored in the unknown word storage means belong to that same field.

A communication system comprising a document processing device having a general dictionary normally used as a character string conversion dictionary for converting an input character string into kanji, and a dictionary management device that stores and manages field-specific dictionaries classified by specialized field, wherein
the document processing device comprises: batch conversion means for batch-converting an input character string using the general dictionary; sequential conversion means for performing sequential conversion using the general dictionary when an instruction operation for instructing sequential conversion is performed while the conversion result obtained by the batch conversion means is displayed as a candidate; unknown word storage means for storing, when a confirmation operation is performed on the conversion result obtained by the sequential conversion means, the input character string that was the conversion target of the sequential conversion as an unknown word for the general dictionary; and transmission means for transmitting the contents of the unknown word storage means to the dictionary management device,
the dictionary management device comprises: dictionary designation means for searching each of the field-specific dictionaries on the basis of the unknown words transmitted from the document processing device and designating a field-specific dictionary in which a notation corresponding to the unknown word exists; and transmission means for transmitting the field-specific dictionary designated by the dictionary designation means to the document processing device, and
the document processing device performs character string conversion by preferentially using the field-specific dictionary transmitted from the dictionary management device.

The character string conversion device according to claim 1, wherein the document processing device comprises: dictionary storage means for storing the field-specific dictionary transmitted from the dictionary management device; determination means for determining whether the field-specific dictionary stored in the dictionary storage means has not been used for a predetermined period; and deletion means for deleting the field-specific dictionary from the dictionary storage means when the determination means determines that it has not been used for the predetermined period.

A recording medium on which is recorded a program for causing a computer to realize:
a function of storing and managing a general dictionary normally used as a character string conversion dictionary for converting an input character string into kanji, and of storing and managing field-specific dictionaries classified by specialized field;
a function of batch-converting an input character string using the general dictionary;
a function of sequentially converting the input character string using the general dictionary when an instruction operation for instructing sequential conversion is performed while the conversion result obtained by the batch conversion is displayed as a candidate;
a function of storing, when a confirmation operation is performed on the conversion result obtained by the sequential conversion, the input character string that was the conversion target of the sequential conversion as an unknown word for the general dictionary, searching each of the field-specific dictionaries on the basis of the unknown word, and designating the field-specific dictionary in which a notation corresponding to the unknown word exists; and
a function of performing character string conversion by preferentially using the designated field-specific dictionary.