JP5849675B2

JP5849675B2 - Character input program and information processing apparatus

Info

Publication number: JP5849675B2
Application number: JP2011273523A
Authority: JP
Inventors: 拓也中山
Original assignee: Omron Corp
Current assignee: Omron Corp
Priority date: 2011-12-14
Filing date: 2011-12-14
Publication date: 2016-02-03
Anticipated expiration: 2031-12-14
Also published as: JP2013125399A

Description

The present invention relates to a program that causes a computer to function as a character input device for inputting characters into a running application, and to an information processing apparatus having a character input function.

In a typical character input system (IME), the dictionary in the system is searched using the input pre-conversion character string to extract multiple candidates for the post-conversion character string; a list of the candidates is displayed and a selection operation is accepted. Some systems also provide a learning dictionary that accumulates post-conversion character strings confirmed by selection, in association with input order and part-of-speech type, so that frequently used phrases can be recalled easily. Furthermore, to improve conversion accuracy, it has been proposed to store a character string consisting of a substantive and an attached word by separating the head part (the substantive) from the ending part (the attached word), and to store a character string representing a verb in combination with its conjugated forms (see Patent Document 1).

More recently, there are information processing apparatuses that use a communication system such as the Internet to send the pre-conversion character string to an external conversion system, have that system perform the conversion, and display candidates based on the post-conversion character strings it returns (see, for example, paragraphs 0075 to 0084 of Patent Document 2 and paragraphs 0047 to 0052 of Patent Document 3). Hereinafter, this function is called the "external conversion function", and candidates obtained by it are called "external conversion candidates".

JP-A-8-69459
JP 2008-250606 A
JP 2009-20865 A

With the external conversion function, even an apparatus without a large-capacity dictionary, or a device with no dictionary at all, can obtain a considerable number of conversion candidates for a pre-conversion character string. It also makes it easy to handle the input of newly coined words.
Moreover, if a learning dictionary mechanism is used, an external conversion candidate obtained by the external conversion function, once it has been selected by the user and registered in the learning dictionary, can be recalled from the next input onward by the conversion function of the mobile terminal device itself.

However, if the part-of-speech classification used by the external conversion system differs from the one used by the apparatus, it becomes difficult to identify the part-of-speech type of an external conversion candidate accurately. Consequently, if a selected external conversion candidate is learned in association with a wrong part-of-speech type, conversion processing that uses the learning result may produce incorrect conversions.

For example, if a verb is recognized as a noun, the word is treated as one that takes attached words, so the ending of the word itself cannot be inflected, and the relationship to the following character string may also be misrecognized. For instance, if the word 「眠る」 (nemuru, "to sleep"), obtained externally for the pre-conversion character string 「ねむる」, is learned as a noun, it becomes impossible to convert a pre-conversion character string beginning 「ねむ…」 into 「眠れない」 or 「眠らない」. Likewise, when the reading character string 「ねむるもり」 is converted, the 「も」 is judged to be a binding particle attached to the noun 「ねむる」, the pre-conversion character string is segmented as 「ねむる−も／り」, and it may be converted into something like 「眠るも利」.

Focusing on the above problem, an object of the present invention is to estimate the part-of-speech type with high accuracy even for a phrase whose part-of-speech type cannot be uniquely identified, such as a word obtained from an external system, and to learn the phrase so that correct conversion can be performed.

The program according to the present invention causes a computer to function as a character input device comprising: input means that accepts an input operation of a pre-conversion character string; search means that searches for candidates for the post-conversion character string corresponding to the pre-conversion character string; confirmation processing means that, in response to an operation selecting one of the candidates extracted by the search means, confirms and outputs the post-conversion character string of the selected candidate; and learning processing means that stores the confirmed post-conversion character string in a learning dictionary in association with a part-of-speech type that fits the character string.

This character input device is provided with an ending change pattern table in which a plurality of part-of-speech types, set by classifying parts of speech whose endings change according to their ending change patterns, are registered together with those ending change patterns.

The learning processing means comprises estimation means and storage means. When a post-conversion character string whose part-of-speech type is not specified is confirmed, the estimation means collates it with the ending change pattern table and estimates that a part-of-speech type whose ending change pattern contains an ending matching the rear part of the character string is the part-of-speech type of that character string, and that the matching ending is its ending. The storage means stores the estimated part-of-speech type and ending. The estimation means further combines a post-conversion character string whose part-of-speech type and ending have been estimated with any post-conversion character string that is confirmed later and that prefix-matches the part excluding the estimated ending (the head part). For each combination it creates a set of endings and saves it in the storage means, and, among the estimated part-of-speech types, it saves in the storage means those whose ending change pattern fits the set of endings as the part-of-speech type for each post-conversion character string belonging to that set.

With the above configuration, when a word whose ending changes, such as a verb or an adjective, is confirmed in a state where its part-of-speech type cannot be uniquely identified, the part-of-speech type and the ending can be estimated by collating the word with the ending change pattern table. Even if the estimate does not narrow down to a single part-of-speech type, once the same word has been confirmed in forms with different endings and several endings have been stored, the types associated with the word can be narrowed to those whose ending change pattern fits the set of stored endings. Segmenting a reading character string and inflecting endings can then be performed accurately on the basis of the ending patterns of the narrowed part-of-speech types.
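The narrowing step described above can be sketched as a subset test: a part-of-speech type survives only if its ending patterns cover every ending observed so far for the same head part. This is a minimal sketch, not the patented implementation; the table entries, the English type names, and the function name `narrow` are illustrative assumptions.

```python
# Hypothetical excerpt of an ending change pattern table. The entries are
# illustrative assumptions, not the contents of the table in the patent.
ENDING_PATTERNS = {
    "ra-row godan verb": {"ら", "り", "る", "れ", "ろ", "った", "って"},
    "ichidan verb": {"る", "た", "て", "ない", "ます", "れば"},
}


def narrow(candidate_types, observed_endings, table=ENDING_PATTERNS):
    """Keep only the types whose pattern set covers every observed ending."""
    return [t for t in candidate_types if observed_endings <= table[t]]


# After 「走る」 alone, both types remain possible.
after_first = narrow(["ra-row godan verb", "ichidan verb"], {"る"})

# Once 「走った」 is also confirmed for the same head part 「走」,
# only the type that allows both endings survives.
after_second = narrow(after_first, {"る", "った"})

print(after_first)
print(after_second)
```

A set-inclusion test keeps the sketch independent of the order in which the endings were learned.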

The ending change pattern table may additionally register at least one part-of-speech type set by classifying parts of speech whose endings do not change according to the patterns of attached words that can be appended to words of that part of speech, with the attached-word patterns registered as the ending change pattern. This configuration makes it easy to distinguish parts of speech whose endings do not change, such as nouns, from those whose endings change. It also makes it possible to convert a word belonging to a non-inflecting part of speech into a form with an attached word suited to that part of speech.

In one embodiment of the character input device, the search means searches the learning dictionary using the pre-conversion character string and, when the search extracts a post-conversion character string associated with a part-of-speech type registered in the ending change pattern table, creates multiple candidates by combining the ending change pattern for that part-of-speech type with the head part of the extracted character string. In this way, candidates can be created not only from phrases that exactly match those stored in the learning dictionary but also from character strings in which the ending of a stored phrase has been inflected.
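As a rough illustration of this candidate creation, the sketch below attaches every ending of a stored phrase's part-of-speech type to its head part and keeps the forms whose reading starts with what the user has typed. The table entries, the reading 「はし」 for 「走」, and the function name `predict` are assumptions made for the example.

```python
# Hypothetical ending patterns for one part-of-speech type (illustrative only).
ENDING_PATTERNS = {
    "ra-row godan verb": ["ら", "り", "る", "れ", "ろ", "った", "って"],
}


def predict(typed_reading, reading_head, notation_head, pos_type,
            table=ENDING_PATTERNS):
    """Build candidates from a stored head part plus each ending of its type,
    keeping those whose reading begins with the typed reading."""
    candidates = []
    for ending in table.get(pos_type, []):
        if (reading_head + ending).startswith(typed_reading):
            candidates.append(notation_head + ending)
    return candidates


# The learning dictionary is assumed to hold 「走-る」 (reading 「はし-る」).
all_forms = predict("はし", "はし", "走", "ra-row godan verb")
narrowed = predict("はしっ", "はし", "走", "ra-row godan verb")
print(all_forms)
print(narrowed)
```

Because the endings are written in kana, the same ending string can be appended to both the reading head and the notation head.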

In another embodiment of the character input device, the search means has a function of requesting an external conversion system to perform conversion using the pre-conversion character string and obtaining candidates for the post-conversion character string from that system's response. When a candidate obtained from the external conversion system is confirmed, the estimation means collates the post-conversion character string with the ending change pattern table to estimate its part-of-speech type and ending, divides the character string into the estimated ending and the head part that remains when the ending is removed, and stores it in the learning dictionary in association with the estimated part-of-speech type.

According to this embodiment, even if the part-of-speech information of a post-conversion character string obtained from the external system and confirmed is unknown, the part-of-speech type and ending of the character string can be estimated, and on the basis of that estimate the character string can be divided into a head part and an ending and stored together with the estimated part-of-speech type. Therefore, when a character string with the same head part as a stored character string but a different ending is confirmed, the stored character string to be combined with it can easily be found and the two combined. The part-of-speech types can also be narrowed down efficiently by considering only those associated with the combined character strings.

According to the present invention, even when a post-conversion character string that does not fit the part-of-speech types used by the character input device is confirmed as an input character string for an application, it can be learned in association with the part-of-speech types that are likely to fit it. This prevents conversion processing that uses the learned character string from producing an incorrectly formed character string, and thus improves the accuracy and convenience of character input.

FIG. 1 is a functional block diagram of a character input system to which the present invention is applied.
FIG. 2 is a diagram showing an example of the data structure of the ending change pattern table.
FIG. 3 is a flowchart showing the general procedure of processing in the character input system.
FIG. 4 is a diagram explaining the structure of the data stored in each table in the learning dictionary and the relationships between the tables.
FIG. 5 is a diagram explaining the structure of the data stored in each table in the learning dictionary and the relationships between the tables.
FIG. 6 is a flowchart showing the detailed procedure of the prediction candidate list creation process.
FIG. 7 is a flowchart showing the detailed procedure of the candidate confirmation process.

FIG. 1 shows a configuration example of a character input system to which the present invention is applied.
The character input system 1 of this embodiment runs inside an information processing apparatus 100 such as a mobile phone and inputs characters, according to user operations, into an application 101 (a memo pad, a mailer, etc.) running on that apparatus. Specifically, the system 1 includes a kana-kanji conversion processing unit 10, a user interface 11, a part-of-speech estimation processing unit 12, a conversion dictionary 13, a learning dictionary 14, an ending change pattern table 15, and so on. All of these are set up by installing a dedicated program in the information processing apparatus 100.

Dictionary data for a plurality of words selected by the developer is registered in the conversion dictionary 13. The dictionary data includes each word's reading (the kana character string before conversion) and notation (the post-conversion character string), as well as information such as part-of-speech type and priority. The conversion dictionary 13 of this embodiment consists of a plurality of data files 131, but it may instead be consolidated into a single data file.

The part-of-speech types in this embodiment are classified into finer units than grammatical parts of speech. For parts of speech whose endings change, such as verbs, multiple part-of-speech types are set by classification according to the pattern of change (conjugation), for example "ka-row godan (five-grade) verb", "ra-row godan verb", "ichidan (one-grade) verb", and so on. Nouns are likewise classified by differences in the patterns of attached words that can be appended, giving part-of-speech types such as "common noun" and "sa-hen noun" (a noun that takes する). Some parts of speech, however, such as stand-alone attached words and interjections, are not subdivided, and a single part-of-speech type is set for each.

The learning dictionary 14 accumulates, up to a fixed number and in input order, the phrases that were confirmed by the user's selection operation and input to the application 101 (hereinafter, "confirmed phrases"). Specifically, the learning dictionary 14 contains a used word history table 141 for accumulating confirmed phrases, a part-of-speech information table 142 storing the part-of-speech types that fit each confirmed phrase, and a matching pattern table 143 storing the matching patterns used to identify part-of-speech types. The structure of tables 141, 142, and 143 and the data stored in them are described in detail later.

The user interface 11 launches an operation unit containing character keys on a display unit (not shown), notifies the kana-kanji conversion processing unit 10 of each key operation, and displays the results of the processing performed by that unit (assembling the reading character string, creating candidate lists, displaying the confirmed candidate, and so on). The operation unit need not be an on-screen image; an operation unit with physical keys may be provided instead.

The kana-kanji conversion processing unit 10 works with the user interface 11 to assemble reading character strings and perform conversion. The kana-kanji conversion processing unit 10 of this embodiment can also pass a reading character string to an external kana-kanji conversion service system 2 over a network 3 such as the Internet to request conversion, and obtain external conversion candidates from the response of the kana-kanji conversion service system 2.

Candidates extracted by the various searches are displayed as a list by the user interface 11. When the user interface 11 accepts an operation selecting one of the candidates, the kana-kanji conversion processing unit 10 confirms the character string of the selected candidate, outputs it to the application 101, and stores the confirmed phrase in the learning dictionary 14. It also searches the learning dictionary 14 using the confirmed phrase and extracts character strings likely to be input next as connection prediction candidates.

The part-of-speech estimation processing unit 12 estimates the part-of-speech type of character strings whose part-of-speech type cannot be identified, such as external conversion candidates provided by the kana-kanji conversion service system 2. For this estimation, the ending change pattern table 15 registers, for each type, the ending change patterns of frequently used part-of-speech types. In this embodiment, the ending change pattern table 15 is also used when the kana-kanji conversion processing unit 10 creates, from a candidate character string read from the learning dictionary 14, other candidate character strings with different endings (the prediction candidate list creation process described later).

FIG. 2 shows an example of the information registered in the ending change pattern table 15. For each part-of-speech type, the table 15 stores, as its ending change pattern, the set of ending patterns of phrases containing words of that type. Each ending pattern is written as the symbol 「〜」, which stands for the head part, followed by the character string of the ending. For part-of-speech types in which the end of the word itself changes, such as verbs, the ending patterns represent conjugational endings. For part-of-speech types in which the end of the word does not change, such as nouns, the ending patterns represent attached words that can be appended to the word.

FIG. 3 shows the general procedure of the processing executed by the character input system 1.
In this processing, each key operation is accepted (step S1) and processing corresponding to the operation is executed.

The processing for each kind of operation is as follows. When a kana character representing a reading is input ("YES" in step S2), the pre-conversion reading character string is assembled (step S6), the learning dictionary 14 is searched using that reading character string to create a prediction candidate list (step S7), and the candidates in the list are displayed (step S11).

When a conversion operation is performed while a reading character string has been assembled ("YES" in step S3), the conversion dictionary 13 and the learning dictionary 14 are searched using the reading character string to create a conversion candidate list (step S8), and the candidates in the list are displayed (step S11). In step S8, candidates for the post-conversion character string are created by, for example, extracting phrases that prefix-match the reading character string and dividing the reading character string into a head part and an ending part according to the part-of-speech type of each extracted phrase. If necessary, the system also communicates with the kana-kanji conversion service system 2 to obtain external conversion candidates. External conversion candidates may instead be obtained in response to a repeated conversion operation.

When one of the displayed candidates is selected ("YES" in step S4), the character string of the selected candidate is confirmed (step S9). A list of candidates for the phrase that follows the confirmed character string (the confirmed phrase), called the connection prediction list, is then created (step S10), and the candidates in the list are displayed (step S11).
When an end operation is performed at some point (step S5), the operation unit is removed from the display and the character input processing ends.
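The branching among steps S1 to S11 can be summarized as a small dispatch loop. The sketch below only records which steps would run for each key operation; the event names, the tuple format, and the choice to clear the reading after a confirmation are assumptions made for illustration, and the actual dictionary searches are omitted.

```python
def run_input_loop(events):
    """Dispatch each key operation to the step sequence of the FIG. 3 flow.

    `events` is a list of (kind, value) tuples; this event format is a
    hypothetical stand-in for real key operations.
    """
    trace = []
    reading = ""
    for kind, value in events:               # S1: accept one key operation
        if kind == "kana":                   # S2: a kana character was input
            reading += value                 # S6: assemble the reading string
            trace.append(("S6", "S7", "S11"))
        elif kind == "convert" and reading:  # S3: conversion operation
            trace.append(("S8", "S11"))
        elif kind == "select":               # S4: a candidate was selected
            reading = ""                     # assumed: start a fresh reading
            trace.append(("S9", "S10", "S11"))
        elif kind == "end":                  # S5: end operation
            break
    return trace


trace = run_input_loop([
    ("kana", "は"), ("kana", "し"), ("convert", None),
    ("select", 0), ("end", None), ("kana", "る"),
])
print(trace)
```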

The flow of processing shown in FIG. 3 is itself the same as in conventional systems. However, the prediction candidate list creation process (step S7) and the candidate confirmation process (step S9), drawn with thick frames, have distinctive features. The configuration and procedures related to these processes are described below.

FIG. 4 shows concretely the structure of the tables 141, 142, and 143 of the learning dictionary 14 and the data stored in them.
The used word history table 141 stores the reading and notation of each confirmed phrase in the order in which the phrases were confirmed. Both the reading and the notation are divided into a head part and an ending part, separated by a hyphen (-), although depending on the content of the confirmed phrase it may be stored without this division. In that case the whole confirmed phrase is treated as the head part and the ending is recorded as "none".

Each confirmed character string is further associated with a usage frequency, a usage order, and two kinds of codes. The usage frequency indicates how often the confirmed phrase has been selected so far, and the usage order indicates the position at which the confirmed phrase was input within a series of character inputs.

One of the two codes points to the corresponding information in the part-of-speech information table 142, and the other points to the corresponding information in the matching pattern table 143. The part-of-speech information table 142 stores the part-of-speech types that fit a confirmed phrase together with the code. The matching pattern table 143 stores, together with the code, those ending patterns registered in the ending change pattern table 15 that matched the ending of the confirmed phrase (hereinafter, "matching patterns").

The information in the part-of-speech information table 142 and the matching pattern table 143 is not associated one-to-one with confirmed phrases; instead, common information is stored for all confirmed phrases that share the same head part. The part-of-speech information table 142 holds information for every confirmed phrase, whereas the matching pattern table 143 holds matching patterns only for confirmed phrases whose part-of-speech information could not be uniquely identified when first learned, specifically phrases that were taken in as external conversion candidates and then confirmed.
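One way to picture the three tables is as a history list whose rows carry two codes, each a key into a separate lookup table. The sketch below is an assumed in-memory representation, not the storage format of the patent. The class and field names are hypothetical, the codes "001", "002" and "102" follow the values the description gives for FIG. 4, and the readings, frequencies and order values are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class HistoryEntry:
    """One row of the used word history table 141."""
    reading: str                 # head and ending separated by a hyphen
    notation: str                # head and ending separated by a hyphen
    frequency: int
    order: int
    pos_code: str                # key into the part-of-speech information table 142
    pattern_code: Optional[str]  # key into the matching pattern table 143, if any


@dataclass
class LearningDictionary:
    history: List[HistoryEntry] = field(default_factory=list)
    pos_info: Dict[str, List[str]] = field(default_factory=dict)
    match_patterns: Dict[str, List[str]] = field(default_factory=dict)

    def pos_types(self, entry: HistoryEntry) -> List[str]:
        """Part-of-speech types currently associated with an entry."""
        return self.pos_info[entry.pos_code]

    def has_match_pattern(self, entry: HistoryEntry) -> bool:
        """True when the entry was learned without a unique part of speech."""
        return entry.pattern_code is not None


dic = LearningDictionary()
dic.pos_info["001"] = ["common noun"]
dic.pos_info["002"] = ["ra-row godan verb", "ichidan verb"]
dic.match_patterns["102"] = ["〜る"]
dic.history.append(HistoryEntry("わたし-は", "私-は", 1, 1, "001", None))
dic.history.append(HistoryEntry("はし-る", "走-る", 1, 2, "002", "102"))

for entry in dic.history:
    print(entry.notation, dic.pos_types(entry), dic.has_match_pattern(entry))
```

Keeping the part-of-speech types behind a code means that narrowing them later updates every confirmed phrase that shares the code.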

The structure of the tables in FIG. 4 and the relationships between them are described below, with reference also to FIG. 2.
A and B in FIG. 4 show the data of sentences input to the application 101, with the confirmed phrases separated by slashes (/). In this example, the first phrase 『私は』 is assumed to have been extracted from the conversion dictionary 13, the second phrase 『走る』 and the third phrase 『貴方も』 to have been provided by the kana-kanji conversion service system 2, and the last phrase 『走った』 to have been created from the data for 『走る』 stored in the learning dictionary 14.

When character input processing has started and the sentence 『私は／走る』 shown at A in the figure has been input, the tables are as shown at (A1), (A2), and (A3) in the figure.
The first confirmed phrase, 『私は』, was derived by matching 「私」, which is registered in the conversion dictionary 13 as a common noun, so on the basis of the search result it is divided into 『私』 and 『は』 and stored in the used word history table 141. The part-of-speech type "common noun" associated with 『私』 is written to the part-of-speech information table 142, and the code "001" of that entry is stored in the used word history table 141. Because the part-of-speech information was uniquely identified, no matching pattern is stored in the matching pattern table 143 for the first confirmed phrase.

The second phrase, 『走る』, was an external conversion candidate, so its part-of-speech type cannot be identified. For such a character string, this embodiment refers to the ending change pattern table 15 and extracts the part-of-speech types that contain an ending pattern matching 『走る』. Specifically, a suffix-match search is performed with each ending pattern. This search is carried out by the part-of-speech estimation processing unit 12 of FIG. 1.

Referring to FIG. 2, the ending change pattern table 15 stores two part-of-speech types for which the ending pattern “〜る” is set: ra-row godan (five-grade) verb and ichidan (one-grade) verb. The search therefore extracts these two part-of-speech types, which are stored in the part-of-speech information table 142 under the common code “002”. The matched ending pattern “〜る” is stored in the matching pattern table 143 under the code “102”.

Based on this result, the confirmed phrase “走る” is divided into “走” and “る” and stored in the used word history table 141. The codes “002” and “102” of the corresponding entries in the part-of-speech information table 142 and the matching pattern table 143 are stored in association with this confirmed phrase.
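The backward-match search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `ENDING_TABLE` and `estimate_pos` are hypothetical names, and the table holds only ending patterns that the description attributes to FIG. 2 (the “no attached word” pattern is left out here for brevity).

```python
# Hypothetical excerpt of the ending change pattern table 15.
# Only endings mentioned in the description are listed; FIG. 2 itself holds more.
ENDING_TABLE = {
    "ラ行五段動詞": ["る", "った"],      # ra-row godan verb
    "一段動詞": ["る", "た", "ない"],    # ichidan verb
    "普通名詞": ["は", "も"],            # common noun
    "サ変名詞": ["は", "も"],            # sa-hen noun
}


def estimate_pos(phrase):
    """Suffix-match every ending pattern against a confirmed phrase.

    Returns {ending: (head, [part-of-speech types])} for each matching ending.
    """
    result = {}
    for pos, endings in ENDING_TABLE.items():
        for ending in endings:
            # The phrase must end with the pattern and leave a non-empty head.
            if phrase.endswith(ending) and len(phrase) > len(ending):
                head = phrase[: -len(ending)]
                result.setdefault(ending, (head, []))[1].append(pos)
    return result


print(estimate_pos("走る"))     # candidate types for the external candidate 走る
print(estimate_pos("貴方も"))   # candidate types for 貴方も
```

For “走る” the sketch yields the head “走”, the ending “る”, and both verb types, which corresponds to the information the text stores under codes “002” and “102”.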

Further, as shown at B in FIG. 4, when the sentence “貴方も／走った” is input after “私は／走る”, the tables 141, 142, and 143 become as shown at (B1), (B2), and (B3) in the figure.
Like “走る” above, “貴方も” is an external conversion candidate, so the part-of-speech types that include an ending pattern matching “貴方も” are extracted from the ending change pattern table 15 and stored in the part-of-speech information table 142. Referring again to FIG. 2, the ending change pattern table 15 stores two part-of-speech types (common noun and sa-hen noun) that include the ending pattern “〜も”, which matches “貴方も”. These two part-of-speech types are extracted and stored in the part-of-speech information table 142 under the common code “003”. The matched ending pattern “〜も” is stored in the matching pattern table 143 under the code “103”.

Based on this result, the confirmed phrase “貴方も” is divided into “貴方” and “も” and stored in the used word history table 141. The codes “003” and “103” of the corresponding entries in the part-of-speech information table 142 and the matching pattern table 143 are stored in association with this confirmed phrase.

The fourth confirmed character string “走った” is extracted as an inflected form of “走-る” by a search of the learning dictionary 14. It is therefore divided into the beginning part “走”, which matches that of “走-る”, and the ending “った”, and stored in the used word history table 141.

For “走-った”, no new codes are created in the part-of-speech information table 142 or the matching pattern table 143. Instead, the same codes as for “走-る” are stored (code “002” of the part-of-speech information table and code “102” of the matching pattern table). However, because its ending differs from that of “走-る”, the ending pattern “〜った” is added as a matching pattern.

According to FIG. 2, “ra-row godan verb” is the only part-of-speech type whose ending change patterns include both “〜った” and “〜る”, the patterns associated with the code “102”. “Ichidan verb” is therefore deleted from the information for code “002” in the part-of-speech information table 142.
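The narrowing step can be illustrated with a short Python sketch. It assumes a hypothetical two-entry excerpt of the ending change pattern table 15 and plain dictionaries standing in for tables 142 and 143; the name `narrow` is illustrative only.

```python
# Hypothetical excerpt of the ending change pattern table 15 (verbs only).
ENDING_TABLE = {
    "ラ行五段動詞": {"る", "った"},      # ra-row godan verb
    "一段動詞": {"る", "た", "ない"},    # ichidan verb
}


def narrow(pos_types, matching_patterns):
    """Keep the types whose ending change patterns contain every matching pattern."""
    return [p for p in pos_types if set(matching_patterns) <= ENDING_TABLE[p]]


pos_info = {"002": ["ラ行五段動詞", "一段動詞"]}   # stands in for table 142
matching = {"102": ["る"]}                          # stands in for table 143

# 走った is confirmed: its ending is new for code 102, so add it and re-narrow.
matching["102"].append("った")
pos_info["002"] = narrow(pos_info["002"], matching["102"])
print(pos_info["002"])
```

After the second confirmation only the ra-row godan verb remains under code “002”, as in the text.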

In the prediction candidate list creation process (step S7) shown in FIG. 3, a plurality of candidates are created by changing the endings of the phrases extracted by searching the learning dictionary 14 with the reading character string, according to the part-of-speech type of each phrase. In the conversion candidate list creation process (step S8), the reading character string to be converted is divided into a beginning part and an ending part by collation with the conversion dictionary 13 and the learning dictionary 14, and a notation (kana and kanji) is applied. Consequently, if the part-of-speech type of a word in a dictionary has not been specified, erroneous conversion occurs.

In this embodiment, however, when a character string whose part-of-speech type cannot be specified is stored in the learning dictionary 14, the beginning part and the ending part are identified by collation with the ending change pattern table 15, and the part-of-speech types that may apply to the character string can be estimated. Correctly converted candidates can therefore be extracted on the basis of the estimated part-of-speech types.

For example, in FIG. 4, when the reading character string “はし” is entered in order to input the fourth phrase “走った”, the ending patterns of the part-of-speech types stored for the previously confirmed “走る”, namely ra-row godan verb and ichidan verb, are applied to “走”. At this stage, erroneous conversion candidates based on the ichidan verb ending change patterns, such as “走ない” and “走た”, are created, but correct conversion candidates based on the ra-row godan verb ending change patterns are created as well. When the intended “走った” is selected from the created candidates, the part-of-speech type corresponding to “走” is narrowed down to ra-row godan verb alone, so erroneous conversion no longer occurs when a word beginning with “走” is input.

In this way, conversion candidates with correctly inflected endings can be created for external conversion candidate character strings provided by the Kana-Kanji conversion service system 2, just as for words registered in the conversion dictionary 13. Part-of-speech types can also be distinguished at a broad level, for example verbs from nouns.

The confirmed phrases subject to part-of-speech estimation are not limited to external conversion candidates. Any phrase not registered in the conversion dictionary 13 can be a target, including a hiragana character string confirmed directly from the reading character string, a kana-kanji character string reconverted after an operation that changes the boundary between the beginning and the ending set by the conversion process, and a katakana or alphabetic character string converted in response to a post-conversion operation.

In the example of FIG. 4, collation with the ending change pattern table 15 identified a single ending pattern matching the confirmed phrase, but this is not always the case, and a plurality of ending patterns may be extracted. In that case, information is stored in the part-of-speech information table 142 and the matching pattern table 143 under a different code for each extracted ending pattern.

FIG. 5 shows the information stored in the tables 141, 142, and 143 when the phrase “ははは” is confirmed.
The confirmed phrase “ははは” in this example is a reading character string confirmed as is, and its part-of-speech type is unknown, so the ending change pattern table 15 is used to search for ending patterns that match the confirmed phrase. According to FIG. 2, two ending patterns match the confirmed phrase “ははは”: “〜は” and “〜 (no attached word)” of the common noun and sa-hen noun types.

Accordingly, the used word history table 141 stores one form divided as “はは-は” based on the first ending pattern “〜は”, and another form in which the whole of “ははは” is the beginning part, based on the second ending pattern “〜 (no attached word)”. For the former, “common noun” and “sa-hen noun”, whose ending change patterns include “〜は”, are stored in the part-of-speech information table 142 under code 004, and “〜は” is stored in the matching pattern table 143 under code 104. For the latter, the three part-of-speech types “common noun”, “sa-hen noun”, and “interjection”, whose ending change patterns include “〜”, are stored in the part-of-speech information table 142 under code 005, and “〜” is stored in the matching pattern table 143 under code 105. In the used word history table 141, “はは-は” and “ははは” are each stored in combination with the corresponding codes in the tables 142 and 143.
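The case of several matching ending patterns can be sketched as below. The table contents are a hypothetical excerpt limited to the noun and interjection entries mentioned in the text, the empty string stands for “no attached word”, and the code numbering (a pattern code equal to the part-of-speech code plus 100) is an assumption chosen only to reproduce the codes in the example.

```python
# Hypothetical excerpt of the ending change pattern table 15.
ENDING_TABLE = {
    "普通名詞": ["は", "も", ""],   # common noun; "" means "no attached word"
    "サ変名詞": ["は", "も", ""],   # sa-hen noun
    "感動詞": [""],                 # interjection
}


def segmentations(phrase):
    """Return one (head, ending, types) tuple per ending pattern that matches."""
    by_ending = {}
    for pos, endings in ENDING_TABLE.items():
        for ending in endings:
            if phrase.endswith(ending) and len(phrase) > len(ending):
                by_ending.setdefault(ending, []).append(pos)
    return [(phrase[: len(phrase) - len(e)], e, types)
            for e, types in by_ending.items()]


def store(phrase, first_code):
    """Store each segmentation under its own pair of codes (assumed numbering)."""
    history, pos_info, matching = [], {}, {}
    for offset, (head, ending, types) in enumerate(segmentations(phrase)):
        pos_code = f"{first_code + offset:03d}"
        pattern_code = f"{100 + first_code + offset}"
        pos_info[pos_code] = types
        matching[pattern_code] = [ending]
        history.append((head, ending, pos_code, pattern_code))
    return history, pos_info, matching


history, pos_info, matching = store("ははは", first_code=4)
for record in history:
    print(record)
```

The two records correspond to the “はは-は” and “ははは” forms, each linked to its own part-of-speech and matching pattern entries.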

Other phrases can also be subject to part-of-speech estimation.
For example, when a new character input system is introduced alongside an old one that is retained, a word read from the old system's dictionary may be confirmed as input. Likewise, when a conversion dictionary containing new words is added to the system, a word read from that new dictionary may be confirmed as input. In these cases as well, the part-of-speech type of the confirmed phrase can be estimated, and the part-of-speech types of learned phrases can be narrowed down, by the same method as in the above embodiment.

Next, the detailed procedure of the prediction candidate list creation process (step S7 in FIG. 3) is described with reference to FIG. 6.
In this process, the prediction candidate list is cleared in the first step S101. As described below, the prediction candidate list consists of two lists: one is created in step S102 and the other in steps S103 to S107.

In step S102, a forward-match (prefix) search by the reading character string is executed on the words registered in the conversion dictionary 13, and the first prediction candidate list is created from the extracted candidates.

To create the second prediction candidate list, a forward-match search is first executed on the beginning parts of the phrases stored in the learning dictionary 14, extracting those whose beginning part is contained in the reading character string (step S103). Steps S105, S106, and S107 are then executed for each extracted phrase in turn.

In step S105, the beginning part that matched in the search of step S103 is extracted. In step S106, all ending patterns of the part-of-speech types associated with the phrase being processed are read out. If a plurality of part-of-speech types are associated with the phrase, the ending patterns of all of them are read out.
In step S107, a plurality of candidate character strings are created by combining each of the read ending patterns with the beginning part extracted in step S105. These candidates are then stored in the second prediction candidate list.

When all the phrases extracted in step S103 have been processed (“YES” in step S104), the process proceeds to step S108, where the first and second prediction candidate lists are merged. For example, candidates that appear in both lists are combined into one, and a priority is set for each candidate. The merged list is displayed by the processing of step S11 in FIG. 3.
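Steps S101 to S108 can be sketched in Python as follows. The dictionary contents, the record layout, and the function name `build_prediction_list` are all hypothetical; the learning dictionary search is interpreted here as "the stored beginning part's reading is a prefix of the reading character string", and the priority setting of step S108 is omitted.

```python
# Hypothetical excerpt of the ending change pattern table 15.
ENDING_TABLE = {
    "ラ行五段動詞": ["る", "った"],      # ra-row godan verb
    "一段動詞": ["る", "た", "ない"],    # ichidan verb
}
# (reading, notation) pairs; hypothetical contents of the conversion dictionary 13.
CONVERSION_DICT = [("はし", "橋"), ("はしら", "柱"), ("かわ", "川")]
# Hypothetical learning dictionary 14 entry for 走, still ambiguous between two types.
LEARNING_DICT = [
    {"head_reading": "はし", "head": "走", "pos": ["ラ行五段動詞", "一段動詞"]},
]


def build_prediction_list(reading):
    # S101/S102: start from empty lists; prefix search of the conversion dictionary.
    first = [notation for r, notation in CONVERSION_DICT if r.startswith(reading)]
    second = []
    for entry in LEARNING_DICT:
        # S103: keep entries whose beginning part is contained in the reading.
        if not reading.startswith(entry["head_reading"]):
            continue
        head = entry["head"]                                  # S105
        endings = [e for pos in entry["pos"]                  # S106: all types
                   for e in ENDING_TABLE[pos]]
        second.extend(head + e for e in endings)              # S107
    # S108: merge the two lists, collapsing duplicates while keeping order.
    return list(dict.fromkeys(first + second))


print(build_prediction_list("はし"))
```

Because “走” is still associated with both verb types in this sketch, the merged list contains the ichidan-based forms “走た” and “走ない” alongside “走った”, which is the situation the description discusses.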

Next, the detailed procedure of the candidate confirmation process (step S9 in FIG. 3) that accompanies a candidate selection operation is described with reference to FIG. 7.

In this process, the used word history table 141 of the learning dictionary 14 is first searched with the character string of the selected candidate (the confirmed phrase), and phrases whose reading and notation both forward-match the confirmed phrase are extracted (step S201). If the confirmed phrase was extracted by a search of the conversion dictionary 13 or the learning dictionary 14, it was already divided into a beginning and an ending during that search, so step S201 searches for phrases with a matching beginning part.

If the above search extracts no phrase that forward-matches the confirmed phrase (“NO” in step S202), steps S203 and S204 are executed.
In step S203, the part-of-speech types having an ending pattern that matches the confirmed phrase are extracted from the ending change pattern table 15, and the extracted part-of-speech types are stored in the part-of-speech information table 142. In step S204, the ending pattern that matched the confirmed phrase is stored in the matching pattern table 143.

The part-of-speech types and the matching pattern are each stored in association with a code. When they have been stored, the process proceeds to step S209, where the information on the confirmed phrase (including these codes) is stored in the used word history table 141.
After that, the reading character string is cleared (step S210), the confirmed phrase is output to the application (step S211), and the process ends.

If the search in step S201 extracts a phrase that forward-matches the confirmed phrase but a plurality of part-of-speech types are associated with that phrase (“YES” in step S202, “NO” in step S205), the part-of-speech types and matching patterns corresponding to the extracted phrase are identified from the codes associated with it (step S206). These are processed in the following steps S207 and S208.

In step S207, the ending pattern of the confirmed phrase is added to the matching patterns being processed. In step S208, the updated list of matching patterns is collated with the ending change patterns of the part-of-speech types being processed. Part-of-speech types whose ending change patterns are consistent with the list are kept, and those that are not are deleted. The process then proceeds to step S209 described above, and the information on the confirmed phrase is stored in the used word history table 141. This information includes the codes of the part-of-speech types and matching patterns identified in step S206.
If the ending pattern of the confirmed phrase is already included in the matching patterns, steps S207 and S208 are skipped.

If the search in step S201 extracts a phrase that forward-matches the confirmed phrase and the part-of-speech type of that phrase is already determined (“YES” in steps S202 and S205), there is no need to estimate part-of-speech types or matching patterns, so the process proceeds directly to step S209. In this case, step S209 stores the reading and notation of the confirmed phrase divided into the beginning and ending identified when the phrase was derived as a candidate, and associates with it the same part-of-speech type as the phrase hit by the search in step S201.
Then the reading character string is cleared (step S210), the confirmed phrase is output to the application 101 (step S211), and the process ends.
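The branching of FIG. 7 can be summarized in a simplified Python sketch. Everything here is an assumption made for illustration: the class `LearningDictionary`, its in-memory layout (one record per beginning part instead of three coded tables), matching on notation only rather than on reading and notation, and the rule that the longest matching ending is taken in step S203.

```python
# Hypothetical excerpt of the ending change pattern table 15.
ENDING_TABLE = {
    "ラ行五段動詞": ["る", "った"],      # ra-row godan verb
    "一段動詞": ["る", "た", "ない"],    # ichidan verb
}


class LearningDictionary:
    def __init__(self):
        self.entries = {}   # head -> {"pos": [...], "patterns": [...]}
        self.history = []   # (head, ending) pairs, standing in for table 141

    def confirm(self, phrase):
        # S201: look for a stored beginning part that the phrase starts with.
        head = next((h for h in self.entries if phrase.startswith(h)), None)
        if head is None:                                     # S202 "NO"
            head, ending, pos = self._estimate(phrase)       # S203
            self.entries[head] = {"pos": pos, "patterns": [ending]}  # S203/S204
        else:
            entry = self.entries[head]                       # S206
            ending = phrase[len(head):]
            # S205 "NO": several types remain and the ending is new.
            if len(entry["pos"]) > 1 and ending not in entry["patterns"]:
                entry["patterns"].append(ending)             # S207
                entry["pos"] = [                             # S208
                    p for p in entry["pos"]
                    if all(e in ENDING_TABLE[p] for e in entry["patterns"])
                ]
        self.history.append((head, ending))                  # S209
        return phrase                                        # S210/S211: output

    @staticmethod
    def _estimate(phrase):
        # Assumption: when several endings match, the longest one is used.
        candidates = {e for ends in ENDING_TABLE.values() for e in ends
                      if phrase.endswith(e) and len(phrase) > len(e)}
        ending = max(candidates, key=len, default="")
        pos = [p for p, ends in ENDING_TABLE.items() if ending in ends]
        return phrase[: len(phrase) - len(ending)], ending, pos


d = LearningDictionary()
d.confirm("走る")
d.confirm("走った")
print(d.entries["走"])
```

After the two confirmations the record for “走” holds only the ra-row godan verb, so a later confirmation would take the “YES” branch of step S205 and skip estimation.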

Note that a word taken in as an external conversion candidate and stored in the learning dictionary 14 may later be registered in the conversion dictionary 13, for example through an upload of the conversion dictionary 13. In that case, the part-of-speech type of that word in the learning dictionary 14 is also updated to agree with the conversion dictionary 13.

With the above processing, even for a word whose part-of-speech type is unknown or cannot be uniquely specified, the part-of-speech type can be narrowed down as the word is input repeatedly with various endings. Step S205 then becomes “YES”, just as for words registered in the conversion dictionary 13, which simplifies the learning process.

However, if step S208 of FIG. 7 narrows down the part-of-speech types so that only those whose ending change patterns match every entry in the matching pattern list remain, the following problem may arise when the user selects the wrong candidate.

For example, in FIG. 4, after “走る” has been stored in association with ra-row godan verb and ichidan verb, when the reading character string “はし” is composed in order to input “走った”, prediction candidates based on the ichidan verb ending change patterns (such as “走た”) are displayed in addition to those based on the ra-row godan verb ending change patterns (such as “走った”). If the user mistakenly selects “走た” from these candidates, the part-of-speech type for “走” is narrowed down to ichidan verb, and the association with ra-row godan verb is lost. It then becomes impossible to derive correctly any inflected form of “走” other than “走る”.

If this problem is a concern, it is desirable to relax somewhat the condition for deleting part-of-speech types in step S208. For example, a part-of-speech type whose ending change patterns do not match every entry in the matching pattern list may be kept rather than deleted as long as the number of mismatches does not exceed a predetermined tolerance. Alternatively, the part-of-speech types may be left un-narrowed until the number of patterns in the matching pattern list reaches a predetermined number, while counting, for each part-of-speech type, how often a phrase consistent with a matching pattern of that type is selected. Once the number of matching patterns reaches the predetermined number, the part-of-speech types to keep are decided from these selection frequencies.
With these measures, the correct part-of-speech type can be prevented from being deleted even if there are a few errors in the candidate selection operations.
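The first relaxation, keeping a part-of-speech type while its mismatches stay within a tolerance, might look like the following Python sketch. The table excerpt, the function name `narrow_with_tolerance`, and the tolerance values are hypothetical; the frequency-based alternative is not shown.

```python
# Hypothetical excerpt of the ending change pattern table 15 (verbs only).
ENDING_TABLE = {
    "ラ行五段動詞": {"る", "った"},      # ra-row godan verb
    "一段動詞": {"る", "た", "ない"},    # ichidan verb
}


def narrow_with_tolerance(pos_types, matching_patterns, max_mismatches):
    """Keep a type unless it fails more than max_mismatches matching patterns."""
    kept = []
    for pos in pos_types:
        mismatches = sum(1 for e in matching_patterns
                         if e not in ENDING_TABLE[pos])
        if mismatches <= max_mismatches:
            kept.append(pos)
    return kept


types = ["ラ行五段動詞", "一段動詞"]
patterns = ["る", "た"]   # the user mistakenly confirmed 走た after 走る

print(narrow_with_tolerance(types, patterns, 0))   # strict rule of step S208
print(narrow_with_tolerance(types, patterns, 1))   # one mismatch tolerated
```

With a tolerance of zero the erroneous selection removes the ra-row godan verb, while a tolerance of one keeps both types until more evidence accumulates.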

In the above embodiment, the matching pattern table 143 is provided in the learning dictionary 14, and this table groups phrases whose part-of-speech type cannot be uniquely specified by common beginning part and stores matching patterns for each group. However, ending patterns need not be accumulated in the learning dictionary 14. They may merely be held temporarily in a buffer, and the matching patterns may be erased for a phrase whose part-of-speech type has been narrowed down to one. Likewise, when the matching patterns are stored in the learning dictionary 14, the information for a phrase whose part-of-speech type has been narrowed down to one may be deleted.

Also, in the above embodiment, codes are included in the information stored in the part-of-speech information table 142 and the matching pattern table 143, and the information in the tables 142 and 143 is associated through these codes with the phrases stored in the used word history table 141. The association is not limited to this and may instead be made through the addresses at which the part-of-speech types and matching patterns are stored. Alternatively, the part-of-speech information table 142 may be omitted and the part-of-speech type included in each record of the used word history table 141.
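The code-based association used in the embodiment can be pictured with the following Python sketch. The record type `HistoryRecord`, the dictionary names, and the function `resolve` are hypothetical; the contents follow the state the description gives for the example of FIG. 4 after “走った” has been confirmed, not the exact layout of the figure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HistoryRecord:
    head: str                     # beginning part of the confirmed phrase
    ending: str                   # ending part
    pos_code: str                 # key into the part-of-speech information table
    pattern_code: Optional[str]   # key into the matching pattern table, if any


# Stand-ins for tables 142 and 143.
POS_INFO = {
    "001": ["普通名詞"],
    "002": ["ラ行五段動詞"],
    "003": ["普通名詞", "サ変名詞"],
}
MATCHING = {"102": ["る", "った"], "103": ["も"]}

# Stand-in for the used word history table 141.
HISTORY = [
    HistoryRecord("私", "は", "001", None),
    HistoryRecord("走", "る", "002", "102"),
    HistoryRecord("貴方", "も", "003", "103"),
    HistoryRecord("走", "った", "002", "102"),
]


def resolve(record):
    """Follow a record's codes to its part-of-speech types and matching patterns."""
    patterns = MATCHING.get(record.pattern_code, []) if record.pattern_code else []
    return POS_INFO[record.pos_code], patterns


for record in HISTORY:
    print(record.head + record.ending, resolve(record))
```

Because “走る” and “走った” share the codes “002” and “102”, narrowing the entry once updates both records. Embedding the part-of-speech type in each record, as the text also permits, would trade that sharing for a simpler lookup.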

Finally, the character input system 1 described above is not limited to portable information processing apparatuses. It can also be incorporated into stationary apparatuses (such as display devices and facsimile machines) and applied to personal computers. The input language is not limited to Japanese, and input in other languages such as English can be handled by a system of the same configuration.

DESCRIPTION OF SYMBOLS
1 Character input system
2 Kana-Kanji conversion service system
10 Kana-Kanji conversion processing unit
11 User interface
12 Part-of-speech estimation processing unit
13 Conversion dictionary
14 Learning dictionary
15 Ending change pattern table
101 Application
141 Used word history table
142 Part-of-speech information table
143 Matching pattern table

Claims

A character input program for causing a computer to function as a character input device comprising: input means for accepting an input operation of a pre-conversion character string; search means for searching for candidates for a post-conversion character string corresponding to the pre-conversion character string; confirmation processing means for confirming and outputting the post-conversion character string of a candidate selected by an operation of selecting one of the candidates extracted by the search means; and learning processing means for storing the confirmed post-conversion character string in a learning dictionary in association with a part-of-speech type that matches the character string,
the program including a program for incorporating into the character input device an ending change pattern table in which a plurality of part-of-speech types, set by classifying parts of speech whose endings change on the basis of their ending change patterns, are registered together with those ending change patterns,
wherein the learning processing means includes: estimation means which, in response to the confirmation of a post-conversion character string whose part-of-speech type has not been specified, collates the post-conversion character string with the ending change pattern table, estimates that a part-of-speech type having an ending change pattern that includes an ending matching the rear part of the post-conversion character string is the part-of-speech type of that character string, and estimates that the matching ending is the ending of that character string; and storage means for storing the estimated part-of-speech type and ending, and
wherein the estimation means creates a set of endings by combining the post-conversion character string whose part-of-speech type and ending have been estimated with a post-conversion character string that is confirmed after it and that forward-matches the portion of it excluding the estimated ending, stores the set in the storage means, and stores in the storage means, in association with each post-conversion character string corresponding to the set, those of the estimated part-of-speech types that have an ending change pattern consistent with the set of endings.

The character input program according to claim 1, wherein at least one part-of-speech type set by classifying parts of speech whose endings do not change, on the basis of the patterns of attached words that can follow words of that part of speech, is registered in the ending change pattern table.

The character input program according to claim 1 or 2, wherein the search means searches the learning dictionary using the pre-conversion character string and, when this search extracts a post-conversion character string associated with a part-of-speech type registered in the ending change pattern table, creates a plurality of candidates by combining the ending change patterns corresponding to that part-of-speech type with the beginning part of the extracted post-conversion character string.

The character input program according to any one of claims 1 to 3, wherein the search means has a function of requesting an external conversion system to perform conversion processing on the pre-conversion character string and obtaining candidates for the post-conversion character string from the conversion system's response to the request, and
wherein, when a candidate for the post-conversion character string obtained from the external conversion system is confirmed, the estimation means collates the post-conversion character string with the ending change pattern table to estimate its part-of-speech type and ending, divides the post-conversion character string into the estimated ending part and a beginning part excluding the ending, and stores it in the learning dictionary in association with the estimated part-of-speech type.

An information processing apparatus comprising: input means for accepting an input operation of a pre-conversion character string; search means for searching for candidates for a post-conversion character string corresponding to the pre-conversion character string; confirmation processing means for confirming the post-conversion character string of a candidate selected by an operation of selecting one of the candidates extracted by the search means and outputting it to an application running in the apparatus; and learning processing means for storing the confirmed post-conversion character string in a learning dictionary in association with a part-of-speech type that matches the character string,
the apparatus further comprising an ending change pattern table in which a plurality of part-of-speech types, set by classifying parts of speech whose endings change on the basis of their ending change patterns, are registered together with those ending change patterns,
wherein the learning processing means includes: estimation means which, in response to the confirmation of a post-conversion character string whose part-of-speech type has not been specified, collates the post-conversion character string with the ending change pattern table, estimates that a part-of-speech type having an ending change pattern that includes an ending matching the rear part of the post-conversion character string is the part-of-speech type of that character string, and estimates that the matching ending is the ending of that character string; and storage means for storing the estimated part-of-speech type and ending, and
wherein the estimation means creates a set of endings by combining the post-conversion character string whose part-of-speech type and ending have been estimated with a post-conversion character string that is confirmed after it and that forward-matches the portion of it excluding the estimated ending, stores the set in the storage means, and stores in the storage means, in association with each post-conversion character string corresponding to the set, those of the estimated part-of-speech types that have an ending change pattern consistent with the set of endings.