JP3524189B2

JP3524189B2 - Character processor

Info

Publication number: JP3524189B2
Application number: JP01401395A
Authority: JP
Inventors: 仁志緩利; 聖範若井
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1995-01-31
Filing date: 1995-01-31
Publication date: 2004-05-10
Anticipated expiration: 2019-05-10
Also published as: JPH08202700A

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a character processing device that outputs mixed kanji-kana text by kana-kanji conversion.

[0002]

[Prior Art] Kana-kanji conversion converts an input reading string into kanji by referring to various dictionaries. The independent-word dictionary records part-of-speech information for each word, such as noun, sa-hen noun, adverb, verb, adjective, or adjectival verb, and the adjunct-word dictionary records the grammatical information of each adjunct word. A connection determination table is also provided that records whether adjunct words can connect to one another and whether an independent word can connect to an adjunct word. Kana-kanji conversion converts the input reading string into kanji while referring to these dictionaries and tables.

[0003] For example, for the input 「しろにとのがいる」, phrase (bunsetsu) candidates are created such as 「市」「白」「城」「白に」「城に」「白にと」「城にと」「白にとの」「城にとの」「白にとのが」「城にとのが」, 「炉」「露」「炉に」..., 「二」「似」「荷」「煮」「二と」..., 「都」「戸」「都の」「殿」「戸の」「都のが」「殿が」..., and 「いる」「居る」「要る」「煎る」..., and these are then combined. In general the combination with the fewest phrases is preferred, so two-phrase results such as 「白にとのが／いる」 and 「城にとのが／いる」 take priority, and a result such as 「城に／殿が／いる」, which consists of three phrases, was never output as the first candidate.

[0004]

[Problems to Be Solved by the Invention] However, the real problem in the above example is that phrases such as 「白にとのが」 and 「城にとのが」 are generated in the first place.

[0005] Conventionally, noun + 「に」, 「に」 + 「と」, 「と」 + 「の」, and 「の」 + 「が」 are each generally defined as connectable. Considering the phrases 「白に」, 「私にと」, 「彼との」, and 「彼女のが」, it is indeed correct to allow each of these connections. With this method, however, noun + 「に」 + 「と」 + 「の」, and even noun + 「に」 + 「と」 + 「の」 + 「が」, also become connectable.
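The effect described in this paragraph can be illustrated with a minimal Python sketch. The names `PAIR_OK`, `chain_allowed`, `ADJUNCT_STRINGS`, and `string_allowed` are hypothetical, and the contents of `ADJUNCT_STRINGS` are an assumed example rather than entries taken from the patent. The first function judges each adjacent pair in isolation, as the conventional table does; the second accepts only adjunct-word strings that are stored whole.

```python
# Conventional scheme: connectability is judged one adjacent pair at a time.
# The four pairs are the ones named in the text.
PAIR_OK = {("noun", "に"), ("に", "と"), ("と", "の"), ("の", "が")}


def chain_allowed(head: str, adjuncts: list[str]) -> bool:
    """Return True if every adjacent pair in head + adjuncts is connectable."""
    prev = head
    for adjunct in adjuncts:
        if (prev, adjunct) not in PAIR_OK:
            return False
        prev = adjunct
    return True


# Whole-string alternative: only strings stored as a unit are accepted.
# These entries are hypothetical examples, not data from the patent.
ADJUNCT_STRINGS = {"に", "にと", "との", "のが"}


def string_allowed(adjunct_string: str) -> bool:
    """Return True only if the whole adjunct-word string is stored."""
    return adjunct_string in ADJUNCT_STRINGS


# Pairwise checks accept the meaningless chain noun + に + と + の + が.
print(chain_allowed("noun", ["に", "と", "の", "が"]))
# A whole-string lookup rejects it because 「にとのが」 was never stored.
print(string_allowed("にとのが"))
```

Because the pairwise table has no memory of earlier words, it cannot reject a chain whose every link is individually valid, which is the limitation the text goes on to address.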

[0006] In short, the conventional scheme contained no definition that prohibits a connection spanning three or more words, and its structure made such a prohibition impossible, so meaningless phrases were generated.

[0007] (Object 1) Accordingly, an object of the present invention is to solve the conventional problem described above and to provide kana-kanji conversion that generates neither meaningless adjunct-word strings nor meaningless phrases, by describing the connection relations among adjunct words as adjunct-word strings, storing those strings in an adjunct dictionary, and not using a connection determination table between adjunct words.

[0008] (Object 2) A further object of the present invention is to store adjunct parts that are converted preferentially: just as adjunct-word strings are stored in the adjunct dictionary under (Object 1), an expression formed by connecting adjunct words and independent words is stored in the adjunct dictionary as a single adjunct part.

[0009] (Object 3) A further object of the present invention is to provide highly accurate kana-kanji conversion that takes into account how frequently an entry is used as an adjunct-word string or as an expression, by attaching priority information to the adjunct-word strings and expressions stored in the adjunct dictionary under (Object 1) and (Object 2).

[0010]

[Means for Solving the Problems] To solve the above problems, the character processing device of the present invention comprises: input means for inputting a kana character string; word dictionary means that stores the reading of each word in association with its notation and grammatical information such as part of speech; adjunct dictionary means that stores adjunct words such as particles and auxiliary verbs, each with its reading, notation, and grammatical information in association; and kana-kanji conversion means that converts the kana character string input by the input means into the corresponding notation by referring to the word dictionary means and the adjunct dictionary means. The adjunct dictionary means stores, along with single adjunct words, each adjunct-word string that can be used in connected form as one adjunct part, and adjunct parts are treated as non-connecting with one another in the conversion performed by the kana-kanji conversion means.

[0011] Further, in the above character processing device, the adjunct dictionary means stores, along with adjunct words and adjunct-word strings, each expression formed by connecting adjunct words and independent words as one adjunct part.

[0012] Further, in the character processing device described above, priority information is stored in association with each adjunct part stored in the adjunct dictionary means.

[0013]

[Operation] In the present invention, connectable adjunct-word strings are stored in the adjunct dictionary means as single adjunct parts and adjunct parts do not connect with one another, which prevents meaningless adjunct-word strings from being generated.

[0014] Further, in the present invention, an expression formed by connecting adjunct words and independent words is stored in the adjunct dictionary means as one adjunct part, so that the expression is converted preferentially.

[0015] Further, in the configuration described above, priority information is attached to the adjunct-word strings and expressions stored in the adjunct dictionary means, so that adjunct-word strings and expressions with high priority are converted preferentially.

[0016]

[Embodiment] The present invention is described in detail below with reference to the drawings.

[0017] FIG. 1 is a block diagram showing an example of the overall configuration of the present invention.

[0018] In the illustrated configuration, CPU is a microprocessor that performs the arithmetic operations and logical decisions needed for character processing and controls the components connected to the address bus AB, the control bus CB, and the data bus DB via those buses.

[0019] The address bus AB carries address signals that designate the component to be controlled by the microprocessor CPU. The control bus CB carries and applies the control signals for each component controlled by the microprocessor CPU. The data bus DB transfers data among the components.

[0020] ROM is a read-only fixed memory. PA, provided in the ROM, is a program area that stores the control procedures executed by the microprocessor CPU, which are described later with reference to FIGS. 8 to 11.

[0021] RAM is a writable random access memory organized as 16-bit words and is used for temporary storage of various data from each component. The RAM holds JDIC, FDIC, CTBL, IBUF, OBUF, and BTBL, which are described below.

[0022] JDIC is the independent-part dictionary, which stores the independent parts used for kana-kanji conversion. Details are given later with reference to FIG. 2.

[0023] FDIC is the adjunct dictionary, which stores the adjunct parts used for kana-kanji conversion. Details are given later with reference to FIG. 3.

[0024] CTBL is the connection determination table, which stores whether an independent part and an adjunct part can connect. Details are given later with reference to FIG. 4.

[0025] IBUF is the input buffer, which stores the key data entered from the keyboard. Details are given later with reference to FIG. 5(a).

[0026] OBUF is the output buffer, which temporarily stores the result of kana-kanji conversion. Details are given later with reference to FIG. 5(b).

[0027] BTBL is the phrase candidate table, a buffer used while the input kana string is being converted and the output is being decided. It stores the intermediate results of kana-kanji conversion. Details are given later with reference to FIGS. 6 and 7.

[0028] KB is a keyboard with character and symbol input keys, such as alphabet, hiragana, and katakana keys, as well as cursor movement keys and various function keys.

[0029] DISK is an external memory for storing document data and the like. Document data is saved as needed, and saved data is recalled when required by an instruction from the keyboard.

[0030] CR is the cursor register. The CPU can read and write its contents. The CRT controller CRTC, described below, displays a cursor at the position on the display device CRT that corresponds to the address held in this register.

[0031] DBUF is the display buffer memory, which holds the pattern of the data to be displayed.

[0032] CRTC is the CRT controller, which displays the contents held in the cursor register CR and the buffer DBUF on the display CRT.

[0033] CRT is a display device using a cathode-ray tube or the like. The CRT controller controls the display of the dot-pattern image and the cursor on the display device CRT.

[0034] CG is a character generator, which stores the patterns of the characters and symbols displayed on the display device CRT.

[0035] The character processing device of the present invention, made up of these components, operates in response to various inputs from the keyboard KB. When input arrives from the keyboard KB, an interrupt signal is first sent to the microprocessor CPU, which reads the control signals stored in the ROM and performs the various controls according to them.

[0036] FIG. 2 shows the structure of the independent-part dictionary data stored in the independent-part dictionary JDIC.

[0037] Each entry consists of reading, notation, part-of-speech, and priority fields.

[0038] The reading field stores the reading of the word, the notation field its notation, and the part-of-speech field its part of speech. The priority is the degree to which the word is used preferentially, assigned in consideration of frequency information and the like. A priority of 5 means ordinary priority; a value above 5 means the word is given higher priority than ordinary, and a value below 5 means the word is given less priority than an ordinary word.

[0039] FIG. 3 shows the structure of the adjunct dictionary data stored in the adjunct dictionary FDIC.

[0040] Each entry consists of reading, notation, grammatical-information, and priority fields.

[0041] The reading field stores the reading of the adjunct part and the notation field its notation. The grammatical-information field stores the grammatical information of the adjunct part and links to the connection determination table described later. The priority is the degree to which the adjunct word is used preferentially, assigned in consideration of frequency information and the like. A priority of 5 means ordinary priority; a value above 5 means higher priority than ordinary, and a value below 5 means the entry is given less priority than an ordinary adjunct word.

[0042] FIG. 4 shows the structure of the connection determination table CTBL. It is a table whose axes are the part of speech and the grammatical information of the adjunct part. The numbers in the table are connection strengths. A value of 5 is the ordinary connection strength; a value greater than 5 indicates a strong connection and a value less than 5 a weak one. A connection strength of 0 means that the two do not connect. For example, "noun" and "particle と" have a connection strength of 5, so their connection is of ordinary strength. "Verb, terminal form" and "auxiliary verb です" have a connection strength of 0, so they do not connect; that is, a phrase of the form "verb, terminal form" + "auxiliary verb です" is not valid.
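A lookup against such a table might be sketched in Python as follows. The two entries are the examples given in the text; the key spelling, the function names, and the choice to treat a missing pair as strength 0 are assumptions made for illustration.

```python
# Connection strengths keyed by (part of speech, adjunct grammatical information).
# Only the two pairs mentioned in the text are filled in.
CTBL = {
    ("noun", "particle-と"): 5,               # ordinary connection
    ("verb-terminal", "auxiliary-です"): 0,   # does not connect
}


def connection_strength(pos: str, grammar: str) -> int:
    """Return the stored strength; unknown pairs default to 0 (an assumption)."""
    return CTBL.get((pos, grammar), 0)


def can_connect(pos: str, grammar: str) -> bool:
    """A pair connects when its strength is greater than 0."""
    return connection_strength(pos, grammar) > 0
```

Keeping the strength as a number rather than a yes/no flag leaves room for using it in candidate scoring as well as in the connectability check.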

[0043] FIG. 5 shows the structure of the input and output buffers: (a) shows the input buffer IBUF and (b) the output buffer OBUF. IBUF and OBUF have the same structure. The first 2 bytes hold the size information of the buffer, namely the number of characters stored in it. Each character occupies 2 bytes and is stored as a JIS X 0208 code or the like.
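The layout can be sketched in Python with the standard `struct` module. The text does not state the byte order, so big-endian is assumed here, and the 7-bit JIS X 0208 code is derived from Python's `euc_jp` codec by clearing the high bit of each byte. The function names `pack_buffer` and `unpack_buffer` are hypothetical.

```python
import struct


def pack_buffer(text: str) -> bytes:
    """Pack text as a 2-byte character count followed by 2-byte JIS X 0208 codes."""
    codes = []
    for ch in text:
        euc = ch.encode("euc_jp")
        # JIS X 0208 characters are two bytes in EUC-JP, both 0xA1 or above.
        if len(euc) != 2 or euc[0] < 0xA1:
            raise ValueError(f"{ch!r} is not a JIS X 0208 character")
        codes.append(((euc[0] & 0x7F) << 8) | (euc[1] & 0x7F))
    # Big-endian byte order is an assumption; the text does not specify it.
    return struct.pack(f">H{len(codes)}H", len(codes), *codes)


def unpack_buffer(buf: bytes) -> str:
    """Inverse of pack_buffer."""
    (count,) = struct.unpack_from(">H", buf, 0)
    codes = struct.unpack_from(f">{count}H", buf, 2)
    return "".join(
        bytes(((c >> 8) | 0x80, (c & 0xFF) | 0x80)).decode("euc_jp") for c in codes
    )
```

For the eight-character reading 「しろにとのがいる」 the packed buffer is 2 + 8 × 2 = 18 bytes long.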

[0044] FIG. 6 shows the structure of the phrase candidate table BTBL. BTBL stores the intermediate results of kana-kanji conversion: the kana string stored in the input buffer is morphologically analyzed, and every conceivable analysis pattern is stored. BTBL has a tree structure whose nodes store phrases. Phrases of the same form may be stored together in a single node. The example shows the result of analyzing 「しろにとのがいる」, which contains many analysis patterns, from 「城にと／野が／いる」 to 「白／煮／都のが／要る」.
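One way to picture a tree whose nodes hold phrases is the small Python sketch below. The `Node` class and the `insert` and `paths` helpers are hypothetical, and the sketch shares a node only between analyses that begin with the same phrases; the actual node layout of BTBL is defined by FIG. 6, which is not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One phrase in the candidate tree; children are keyed by phrase text."""
    phrase: str
    children: dict = field(default_factory=dict)


def insert(root: Node, phrases: list[str]) -> None:
    """Add one analysis pattern, reusing nodes for a shared leading sequence."""
    node = root
    for phrase in phrases:
        node = node.children.setdefault(phrase, Node(phrase))


def paths(node: Node):
    """Yield every root-to-leaf phrase sequence below node."""
    if not node.children:
        yield []
        return
    for child in node.children.values():
        for rest in paths(child):
            yield [child.phrase] + rest


root = Node("")
insert(root, ["城に", "殿が", "いる"])
insert(root, ["城に", "殿が", "要る"])
insert(root, ["城にと", "野が", "いる"])
for pattern in paths(root):
    print("／".join(pattern))
```

Each root-to-leaf path is one analysis pattern, so selecting the first candidate amounts to choosing one path through the tree.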

[0045] Like FIG. 6, FIG. 7 shows the structure of the phrase candidate table BTBL, in this case for the input-buffer string 「みようとつとめる」.

[0046] The operation of the procedures stored in the program area PA of the above embodiment is described below, following the flowcharts.

[0047] FIG. 8 is a flowchart showing the operation of the character processing device of the present invention.

[0048] Step 8-1 performs the various initial settings of the character processing device, as is generally done in devices of this kind. When it finishes, processing proceeds to step 8-2.

[0049] Step 8-2 takes in data from the keyboard.

[0050] Step 8-3 determines the type of the key taken in and branches to the processing routine for that key.

[0051] When a reading key is input, processing branches to step 8-4.

[0052] When the conversion key is input, processing branches to step 8-5, which performs the conversion process described later with reference to FIG. 9.

[0053] When any other key is input, processing branches to step 8-6, which performs the other processing carried out in ordinary character processing devices, such as character input, cursor movement, insertion, and deletion. This processing is generally performed in devices of this kind and is well known, so it is not described here.

[0054] Step 8-4 is reading input processing: when a key that forms part of a reading for kana-kanji conversion is input, the character is stored in the input buffer IBUF.

[0055] When step 8-4, step 8-5, or step 8-6 finishes, processing returns to step 8-2 and waits for key input again.

[0056] FIG. 9 is a flowchart detailing the "conversion process" of step 8-5.

[0057] Step 9-1 is phrase candidate table creation, which builds a phrase candidate table BTBL like the ones shown in FIG. 6 and FIG. 7 from the reading stored in the input buffer IBUF. The creation process is described in detail with reference to FIG. 10.

[0058] Step 9-2 is first-candidate determination, which selects, from the analysis results stored in the phrase candidate table BTBL as shown in FIG. 6 or FIG. 7, the candidate that should be converted with the highest priority. Various selection methods are conceivable; here the fewest-phrases method is combined with the priorities of the independent words and of the adjunct parts. In the case of FIG. 6, the candidates are first narrowed to those with the fewest phrases. This leaves the three-phrase candidates 「｛城にと，白にと｝＋｛野が｝＋｛いる，要る｝」, 「｛城に，白に｝＋｛殿が，都のが｝＋｛いる，要る｝」, and 「｛城，白｝＋｛荷とのが，煮とのが｝＋｛いる，要る｝」.

[0059] Next, the independent parts with the higher priority are taken from each candidate. Since 「城」 > 「白」, 「いる」 > 「要る」, 「殿」 = 「都」, and 「荷」 > 「煮」, the candidates narrow to 「城にと＋野が＋いる」, 「城に＋｛殿が，都のが｝＋いる」, and 「城＋荷とのが＋いる」.

[0060] Then the adjunct parts with the higher priority are taken. Since 「が」 > 「のが」, the candidates narrow to 「城にと＋野が＋いる」, 「城に＋殿が＋いる」, and 「城＋荷とのが＋いる」.

[0061] Finally, the priority of each phrase is calculated as (priority of the independent part + priority of the adjunct part) / 2; for a phrase with no adjunct part, the priority of the independent part is the priority of the phrase. The phrase priorities of the three sentences above sum to 5 + 6 + 5 = 16 for 「城にと＋野が＋いる」, 6 + 6 + 5 = 17 for 「城に＋殿が＋いる」, and 5 + 4 + 5 = 14 for 「城＋荷とのが＋いる」, so 「城に＋殿が＋いる」 is chosen as the first candidate.

[0062] In the example of FIG. 7, 「見ようと努める」 can be chosen as the first candidate as soon as the candidates are narrowed to those with the fewest phrases.

[0063] Step 9-3 is conversion result output, which stores the first candidate determined in step 9-2 in the output buffer OBUF.

[0064] Step 9-4 is input buffer erasure, which erases the contents of the input buffer IBUF in preparation for the next reading key input. When it finishes, the process returns.

[0065] FIG. 10 is a flowchart detailing the "phrase candidate table creation" of step 9-1.

[0066] Step 10-1 initializes the counter i. The counter i points to the i-th character of the reading string in the input buffer IBUF; it is initially set to 1 so that it points to the first character.

[0067] Step 10-2 is phrase analysis. Taking the i-th reading character of the input buffer IBUF as the starting point, phrases are analyzed from the readings that follow and the phrase candidate table is created. Details are described with reference to FIG. 11.

[0068] FIG. 11 is a flowchart detailing the "phrase analysis" of step 10-2.

[0069] Step 11-1 searches the independent-part dictionary. Using the characters from the i-th reading character of the input buffer IBUF onward as the key, the independent-part dictionary JDIC is searched and the independent parts that partially match are detected.

[0070] Step 11-2 stores in the counter j the number of independent parts detected in step 11-1.

[0071] Step 11-3 assigns to the counter k the sum of the counter i and the number of characters in the reading of the j-th independent part detected in step 11-1. The counter k thus points to the k-th reading character of the input buffer IBUF.

[0072] Step 11-4 searches the adjunct dictionary. Using the characters from the k-th reading character of the input buffer IBUF onward as the key, the adjunct dictionary FDIC is searched and the adjunct parts that partially match are detected.

[0073] If no partially matching adjunct part is detected in step 11-4, processing proceeds to step 11-7; if one is detected, processing proceeds to step 11-6.

[0074] Step 11-6 determines whether the detected adjunct part can connect to the immediately preceding independent part, using the connection determination table CTBL shown in FIG. 4. If they cannot connect, processing returns to step 11-4 and the adjunct dictionary is searched again. If they can connect, processing proceeds to step 11-7.

[0075] Step 11-7 is phrase generation. The string that combines the most recently detected independent part and adjunct part is registered as a phrase in a node of the phrase candidate table.

[0076] Step 11-8 updates the position of the first reading character in the input buffer IBUF that remains to be processed: the position k obtained for the independent part just processed, plus the number of reading characters of the adjunct part, is stored in the counter i.

[0077] Step 11-9 compares the number of characters stored in the input buffer IBUF with the counter i. If the number of characters stored in IBUF is smaller than the counter i, all the readings stored in the input buffer have been processed, so the process returns. Otherwise unprocessed readings remain, so processing proceeds to step 11-10.

[0078] Step 11-10 is phrase analysis. This is precisely the process shown in FIG. 11, which can be called recursively. Calling phrase analysis at this step means analyzing the reading string from the i-th reading character of the input buffer onward. When it was called in step 10-2, i was set to 1, but when it is called at this step, i has been updated.

[0079] Step 11-11 updates the counter j. Since the processing for one independent part has finished, the counter j is decremented by 1.

[0080] Step 11-12 determines whether any unprocessed independent part remains. If one remains, processing proceeds to step 11-3. Otherwise the process returns.

[0081] The present invention is not limited to the embodiment described above.

[0082] In this embodiment, the first-candidate determination process combines the fewest-phrases method with the priorities of the independent words and of the adjunct parts, but the invention is not limited to this method. For example, the first candidate may be determined by a formula that also takes into account the connection strengths provided in the connection determination table.

[0083] The present invention can also be modified and implemented in various other ways without departing from its gist.

【００８４】[0084]

【発明の効果】以上の説明から明らかなように本発明に
よれば、（１）付属部辞書に連接可能な付属語列を１つの付属部
として格納し、付属部同士は非連接とすることにより、
無意味な付属語の連接がなくなり、精度の高いかな漢字
変換が実現できる；（２）さらに、上記に加えて、付属部辞書に付属語と自
立語との連接で構成される言い回しを格納することによ
り、精度の高いかな漢字変換が実現できる；（３）さらに、上記効果に加えて、前記付属部辞書手段
に格納される各付属部には、優先度情報を対応づけて記
憶することにより、付属部の優先度を有効に活用した、
精度の高いかな漢字変換が実現できる；という効果があ
る。As is apparent from the above description, the present invention provides the following effects: (1) by storing each connectable adjunct word string in the adjunct-part dictionary as a single adjunct part and treating adjunct parts as not connectable to one another, meaningless connections of adjunct words are eliminated and highly accurate kana-kanji conversion is achieved; (2) in addition, by storing in the adjunct-part dictionary expressions composed of adjunct words connected with independent words, highly accurate kana-kanji conversion is achieved; and (3) in addition to the above effects, by storing priority information in association with each adjunct part held in the adjunct-part dictionary means, highly accurate kana-kanji conversion that makes effective use of adjunct-part priorities is achieved.
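
Effect (1) can be illustrated with a small comparison. In this sketch the word lists and priority values are hypothetical, and the conventional side is simplified by assuming that every adjunct-to-adjunct connection is permitted, which a real connection determination table would restrict.

```python
# Hypothetical data: single adjunct words, and stored adjunct parts with priorities.
ADJUNCT_WORDS = ["に", "と", "の", "が"]
ADJUNCT_PARTS = {"に": 5, "が": 5, "には": 4, "のが": 2}

def chained_tails(reading, start):
    """Conventional style: adjunct words may follow one another freely."""
    tails = []
    def extend(pos):
        for word in ADJUNCT_WORDS:
            if reading.startswith(word, pos):
                tails.append(reading[start:pos + len(word)])
                extend(pos + len(word))
    extend(start)
    return tails

def single_part_tails(reading, start):
    """Style described above: exactly one stored adjunct part, never chained."""
    matches = [p for p in ADJUNCT_PARTS if reading.startswith(p, start)]
    # Higher priority first (assumed ordering).
    return sorted(matches, key=lambda p: -ADJUNCT_PARTS[p])

reading = "しろにとのがいる"
conventional = chained_tails(reading, 2)   # adjunct tails that can follow しろ
proposed = single_part_tails(reading, 2)
```

Free chaining yields the tails に, にと, にとの and にとのが after しろ, whereas lookup of a single stored adjunct part yields only に, so a phrase such as 白にとのが is never generated.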

[Brief description of drawings]

【図１】本発明の実施例における全体構成のブロック図
である。FIG. 1 is a block diagram of an overall configuration according to an embodiment of the present invention.

【図２】本発明の実施例における自立部辞書の構成例を
示した図である。FIG. 2 is a diagram showing a configuration example of an independent-part dictionary according to an embodiment of the present invention.

【図３】本発明の実施例における付属部辞書の構成例を
示した図である。FIG. 3 is a diagram showing a configuration example of an adjunct-part dictionary according to an embodiment of the present invention.

【図４】本発明の実施例における連接判定テーブルの構
成例を示した図である。FIG. 4 is a diagram showing a configuration example of a connection determination table in the embodiment of the present invention.

【図５】本発明の実施例における入力バッファ及び出力
バッファの構成例を示した図である。FIG. 5 is a diagram showing a configuration example of an input buffer and an output buffer in the embodiment of the present invention.

【図６】本発明の実施例における文節候補テーブルの構
成例を示した図である。FIG. 6 is a diagram showing a configuration example of a phrase candidate table in the embodiment of the present invention.

【図７】本発明の実施例における文節候補テーブルの構
成例を示した図である。FIG. 7 is a diagram showing a configuration example of a phrase candidate table in the embodiment of the present invention.

【図８】本発明の実施例における文字処理装置の動作を
示すフローチャートである。FIG. 8 is a flowchart showing an operation of the character processing device in the embodiment of the present invention.

【図９】本発明の実施例における変換処理の動作を示す
フローチャートである。FIG. 9 is a flowchart showing an operation of conversion processing in the embodiment of the present invention.

【図１０】本発明の実施例における文節候補テーブル作
成処理の動作を示すフローチャートである。FIG. 10 is a flowchart showing an operation of a phrase candidate table creating process in the embodiment of the present invention.

【図１１】本発明の実施例における文節解析処理の動作
を示すフローチャートである。FIG. 11 is a flowchart showing an operation of a phrase analysis process in the embodiment of the present invention.

[Explanation of symbols]

CPU: マイクロプロセッサ / microprocessor
AB: アドレスバス / address bus
CB: コントロールバス / control bus
DB: データバス / data bus
ROM: 読出し専用固定メモリ / read-only memory
PA: プログラムエリア / program area
RAM: ランダムアクセスメモリ / random access memory
JDIC: 自立部辞書 / independent-part dictionary
FDIC: 付属部辞書 / adjunct-part dictionary
CTBL: 連接判定テーブル / connection determination table
IBUF: 入力バッファ / input buffer
OBUF: 出力バッファ / output buffer
BTBL: 文節候補テーブル / phrase candidate table
KB: キーボード / keyboard
DISK: 外部メモリ / external memory
CR: カーソルレジスタ / cursor register
DBUF: 表示用バッファメモリ / display buffer memory

フロントページの続き
(56)参考文献
特開平５−67077（ＪＰ，Ａ）
特開平３−265061（ＪＰ，Ａ）
特開平４−256159（ＪＰ，Ａ）
特開平３−286249（ＪＰ，Ａ）
特開平２−36466（ＪＰ，Ａ）
(58)調査した分野（Int.Cl.⁷，ＤＢ名）
G06F 17/21 - 17/24

Continuation of the front page
(56) References
JP-A-5-67077 (JP, A)
JP-A-3-265061 (JP, A)
JP-A-4-256159 (JP, A)
JP-A-3-286249 (JP, A)
JP-A-2-36466 (JP, A)
(58) Fields searched (Int. Cl.⁷, DB name)
G06F 17/21 - 17/24

Claims

(57) [Claims]

1. A character processing device comprising: input means for inputting a kana character string; word dictionary means for storing, for each word, its reading, its notation, and grammatical information such as its part of speech in association with one another; adjunct-part dictionary means for storing, for adjunct words such as particles and auxiliary verbs, their readings, notations, and grammatical information in association with one another; and kana-kanji conversion means for converting the kana character string input by the input means into notation by referring to the word dictionary means and the adjunct-part dictionary means, wherein the adjunct-part dictionary means stores each single adjunct word and each adjunct word string that can be connected in use as one adjunct part, and the kana-kanji conversion means treats adjunct parts as not connectable to one another during the conversion.

2. The character processing device according to claim 1, wherein the adjunct-part dictionary means stores, in addition to adjunct words and adjunct word strings, expressions formed by connecting adjunct words with independent words, each as one adjunct part.

3. The character processing device according to claim 1 or 2, wherein priority information is stored in association with each adjunct part stored in the adjunct-part dictionary means.