JP5553037B2 - Text input support system, text input support device, reference information creation device, and program - Google Patents

Publication number
JP5553037B2
Authority
JP
Japan
Prior art keywords
sentence
word
subsequent
input
appearance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
JP2011013942A
Other languages
Japanese (ja)
Other versions
JP2012155520A (en)
Inventor
基行 鷹合
大悟 杉原
洋平 山根
圭悟 服部
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujifilm Business Innovation Corp
Original Assignee
Fuji Xerox Co Ltd
Fujifilm Business Innovation Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuji Xerox Co Ltd, Fujifilm Business Innovation Corp
Priority to JP2011013942A
Publication of JP2012155520A
Application granted
Publication of JP5553037B2
Expired - Fee Related
Anticipated expiration

Landscapes

  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Description

The present invention relates to a text input support system, a text input support device, a reference information creation device, and a program.

Some document creation apparatuses that create documents using a computer provide, as one example, an input support function that reduces the user's typing burden. For the sentence being input, the function predicts the word the user is likely to insert next, presents the predictions to the user as completion candidates, and inserts the word the user selects from those candidates into the sentence.

Various inventions have been proposed regarding the input support function of such document creation apparatuses.
For example, an invention has been proposed for a pen-input computer that, during text input, predicts the character string likely to be input next and presents one or more predictions so that the user need not input that character string (see Patent Document 1).
As another example, an invention has been proposed that learns a small dictionary and a statistical language model from an existing text corpus so that predictive input works even on hardware with little memory, such as a mobile phone or PDA. It checks whether expressions similar to the last few character strings of the input text appear in the text corpus, dictionary, or language model, and predicts the character string that follows (see Patent Document 2).
As a further example, an invention has been proposed that, when the user composes a reply to an e-mail, adjusts priorities so that character strings from the message being replied to appear near the top of the predictive-input candidates (see Patent Document 3).

Patent Document 1: Japanese Patent Laid-Open No. 10-154033
Patent Document 2: JP 2006-216044 A
Patent Document 3: JP 2006-344039 A

An object of the present invention is to propose a technique that, when presenting candidates predicted to be input as the continuation of the sentence a user is inputting while composing a text, can present candidates based on the connection with the sentence preceding the sentence being input.

The present invention according to claim 1 is a program for causing a computer to realize: a specifying function that, in a text being created based on operation input by a user, analyzes the sentence preceding the sentence being input and specifies the words contained in that preceding sentence; a search function that searches a storage means, which stores, for two consecutive sentences contained in existing text, a word contained in the preceding sentence as a preceding word in association with a word contained in the following sentence as a subsequent word, for the subsequent words associated with preceding words that match the words specified by the specifying function; and a presentation function that presents the subsequent words found by the search function to the user.

The present invention according to claim 2 is the program according to claim 1, wherein: the storage means stores association information between each preceding-word group, formed by grouping sentence by sentence the preceding words contained in a preceding sentence of the existing text, and each subsequent word contained in the sentence following the sentence containing that preceding-word group, together with a sentence appearance amount indicating the degree to which sentences containing the preceding-word group appear in the existing text and a word appearance amount indicating the degree to which the subsequent word appears in sentences following a sentence containing the preceding-word group; the search function searches for the subsequent words corresponding to the preceding-word group that matches the word group contained in the sentence preceding the sentence being input, and also retrieves the sentence appearance amount for that preceding-word group and the word appearance amount for each combination of that preceding-word group and a subsequent word; and the presentation function presents the subsequent words found by the search function in descending order of the ratio of the retrieved word appearance amount to the retrieved sentence appearance amount.

The present invention according to claim 3 is the program according to claim 1, wherein: the storage means stores association information between each preceding word contained in a preceding sentence of the existing text and each subsequent word contained in the sentence following the sentence containing that preceding word, together with a sentence appearance amount indicating the degree to which sentences containing the preceding word appear in the existing text and a word appearance amount indicating the degree to which the subsequent word appears in sentences following a sentence containing the preceding word; the search function searches for the subsequent words corresponding to each preceding word that matches a word contained in the sentence preceding the sentence being input, and also retrieves the sentence appearance amount for that preceding word and the word appearance amount for each combination of that preceding word and a subsequent word; and the presentation function presents the subsequent words found by the search function in descending order of the ratio of the retrieved word appearance amount to the retrieved sentence appearance amount.

The present invention according to claim 4 is the program according to claim 3, wherein: the storage means stores information on the grammatical case of each preceding word together with the association information between that preceding word and each subsequent word contained in the sentence following the sentence containing it; the specifying function specifies the case of each word contained in the sentence preceding the sentence being input; and the search function searches for the subsequent words associated with preceding words whose word and case both match those of a word contained in the sentence preceding the sentence being input.

The present invention according to claim 5 is a program for causing a computer to realize: a specifying function that analyzes two consecutive sentences contained in existing text and specifies a word contained in the preceding sentence as a preceding word and a word contained in the following sentence as a subsequent word; and a storage function that stores the preceding words and subsequent words specified by the specifying function in a storage means in association with each other, for use in text input support processing that, for a sentence being input in a text being created based on operation input by a user, presents to the user the subsequent words corresponding to preceding words that match words contained in the sentence preceding that sentence.

The present invention according to claim 6 is the program according to claim 5, further causing the computer to realize a calculation function that calculates, for each preceding-word group formed by grouping sentence by sentence the preceding words contained in a preceding sentence of the existing text, a sentence appearance amount indicating the degree to which sentences containing that preceding-word group appear in the existing text, and that calculates, for each combination of a preceding-word group and a subsequent word contained in the sentence following a sentence containing that preceding-word group, a word appearance amount indicating the degree to which the subsequent word appears in sentences following a sentence containing the preceding-word group. So that the text input support processing can present the subsequent words corresponding to the preceding-word group matching the word group contained in the sentence preceding the sentence being input, in descending order of the ratio of the word appearance amount for the subsequent word to the sentence appearance amount for the preceding-word group, the storage function stores in the storage means the sentence appearance amounts and word appearance amounts calculated by the calculation function together with the association information between each preceding-word group and each subsequent word contained in the sentence following a sentence containing that preceding-word group.

The present invention according to claim 7 is the program according to claim 5, further causing the computer to realize a calculation function that calculates, for each preceding word contained in a preceding sentence of the existing text, a sentence appearance amount indicating the degree to which sentences containing that preceding word appear in the existing text, and that calculates, for each combination of a preceding word and a subsequent word contained in the sentence following a sentence containing that preceding word, a word appearance amount indicating the degree to which the subsequent word appears in sentences following a sentence containing the preceding word. So that the text input support processing can present the subsequent words corresponding to each preceding word matching a word contained in the sentence preceding the sentence being input, in descending order of the ratio of the word appearance amount for the subsequent word to the sentence appearance amount for the preceding word, the storage function stores in the storage means the sentence appearance amounts and word appearance amounts calculated by the calculation function together with the association information between each preceding word and each subsequent word contained in the sentence following a sentence containing that preceding word.

The present invention according to claim 8 is the program according to claim 7, wherein the calculation function calculates the sentence appearance amount for each preceding word by adding, for each sentence containing that preceding word, a value that is smaller the more words the sentence contains, and calculates the word appearance amount for each combination of a preceding word and a subsequent word by adding, for each sentence that follows a sentence containing the preceding word and that contains the subsequent word, a value that is smaller the more words that sentence contains.
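The length-dependent accumulation of claim 8 might be sketched as follows. This is only an illustration under stated assumptions: the claim requires merely that the added value decrease as the word count grows, so the reciprocal `1 / len(...)` is an assumed choice, as is applying the following sentence's word count to the word appearance amount. The function name `weighted_amounts` and the input layout are hypothetical.

```python
from collections import defaultdict


def weighted_amounts(sentence_pairs):
    """Accumulate length-weighted appearance amounts.

    sentence_pairs: iterable of (preceding_words, following_words), each a
    list of words from two consecutive sentences of existing text.
    Assumption: each occurrence contributes 1 / (word count of the sentence),
    so words in long sentences count for less.
    """
    sentence_amount = defaultdict(float)  # keyed by preceding word
    word_amount = defaultdict(float)      # keyed by (preceding word, subsequent word)
    for prev_words, next_words in sentence_pairs:
        for p in set(prev_words):
            # Sentence appearance amount: smaller increment for longer sentences.
            sentence_amount[p] += 1.0 / len(prev_words)
            for w in set(next_words):
                # Word appearance amount: weighted by the following sentence's length.
                word_amount[(p, w)] += 1.0 / len(next_words)
    return sentence_amount, word_amount
```

Dividing `word_amount[(p, w)]` by `sentence_amount[p]` then gives the ratio used for ordering candidates in claim 7.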

The present invention according to claim 9 is the program according to claim 7 or 8, wherein the specifying function specifies the case of each preceding word contained in a preceding sentence of the existing text and the case of each subsequent word contained in the sentence following it, and the calculation function increases the value added to the word appearance amount for a subsequent word whose case matches that of the specified preceding word.

The present invention according to claim 10 is the program according to any of claims 7 to 9, wherein the specifying function specifies the case of each preceding word contained in a preceding sentence of the existing text, and, so that the text input support processing can present the subsequent words corresponding to preceding words whose word and case both match those of a word contained in the sentence preceding the sentence being input, the storage function stores in the storage means information on the case of each preceding word together with the association information between that preceding word and each subsequent word contained in the sentence following the sentence containing it.

The present invention according to claim 11 is a text input support system comprising: a reference information creation unit having first specifying means that analyzes two consecutive sentences contained in existing text and specifies a word contained in the preceding sentence as a preceding word and a word contained in the following sentence as a subsequent word, and storage means that stores the preceding words and subsequent words specified by the first specifying means in association with each other; and a text input support unit having second specifying means that, in a text being created based on operation input by a user, analyzes the sentence preceding the sentence being input and specifies the words contained in that sentence, search means that searches the storage means for the subsequent words associated with preceding words matching the words specified by the second specifying means, and presentation means that presents the subsequent words found by the search means to the user.

The present invention according to claim 12 is a text input support device comprising: specifying means that, in a text being created based on operation input by a user, analyzes the sentence preceding the sentence being input and specifies the words contained in that sentence; search means that searches a storage means, which stores, for two consecutive sentences contained in existing text, a word contained in the preceding sentence as a preceding word in association with a word contained in the following sentence as a subsequent word, for the subsequent words associated with preceding words matching the words specified by the specifying means; and presentation means that presents the subsequent words found by the search means to the user.

The present invention according to claim 13 is a reference information creation device comprising: specifying means that analyzes two consecutive sentences contained in existing text and specifies a word contained in the preceding sentence as a preceding word and a word contained in the following sentence as a subsequent word; and storing means that stores the preceding words and subsequent words specified by the specifying means in a storage means in association with each other, for use in text input support processing that, for a sentence being input in a text being created based on operation input by a user, presents to the user the subsequent words corresponding to preceding words that match words contained in the sentence preceding that sentence.

According to the present invention of claims 1, 5, and 11 to 13, when presenting candidates predicted to be input as the continuation of the sentence being input by a user composing a text, it is possible, compared with the case where the present invention is not applied, to present candidate words based on the connection with the sentence preceding the sentence being input, and thereby support input.

According to the present invention of claims 2 and 6, words contained in the sentences that follow existing sentences whose word group matches that of the sentence preceding the sentence being input can be presented in an order corresponding to the conditional probability that each word appears in the sentence following such an existing sentence.

According to the present invention of claims 3 and 7, words contained in the sentences that follow existing sentences sharing a matching word with the sentence preceding the sentence being input can be presented in an order corresponding to the conditional probability that each word appears in the sentence following such an existing sentence.

According to the present invention of claims 4 and 10, candidate words predicted to be input next can be searched for while taking into account the case of the words contained in the sentence being input.

According to the present invention of claim 8, the sentence appearance amount and word appearance amount used to calculate the conditional probability can be calculated taking into account the weight of a word within the sentence on which they are based.

According to the present invention of claim 9, the word appearance amount used to calculate the conditional probability can be calculated taking into account the tendency for words of the same case to appear in consecutive sentences.

FIG. 1 is a diagram showing the functional blocks of a first configuration example of a text input support system according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating the result of morphological analysis.
FIG. 3 is a diagram illustrating the dictionary of the first configuration example.
FIG. 4 is a diagram showing the functional blocks of a second configuration example of the text input support system according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating the dictionary of the second configuration example.
FIG. 6 is a diagram showing the functional blocks of a third configuration example of the text input support system according to an embodiment of the present invention.
FIG. 7 is a diagram illustrating the result of syntactic analysis.
FIG. 8 is a diagram illustrating the dictionary of the third configuration example.
FIG. 9 is a diagram showing the functional blocks of a fourth configuration example of the text input support system according to an embodiment of the present invention.
FIG. 10 is a diagram illustrating the hardware configuration of a computer that operates as the text input support system according to an embodiment of the present invention.
FIG. 11 is a diagram explaining the text input support function.
FIG. 12 is a diagram explaining the text input support function.

Before the present invention is described in detail, the text input support function is described with reference to FIGS. 11 and 12.
In the example of FIG. 11, the character string "拡散強調像で" ("on the diffusion-weighted image") has been input to a text input section 51, into which the user inputs text to compose a document. The text input support function then estimates character strings that are likely to follow the string before the current cursor position and lists the estimated strings as completion candidates in a child window 52. By selecting one of the completion candidates presented in the child window 52 with a keyboard, mouse, or the like, the user can add that candidate's character string without typing it. In other words, the text can be composed with less effort than typing the candidate string.

Now consider the situation in which one or more sentences have already been input and the user is inputting the next sentence.
FIG. 12 illustrates a conventional text input support function estimating and presenting the character string likely to come next when the user, having input one sentence, has input the first few characters of the next. In the example of FIG. 12, the text input section 61 contains a first sentence, "右肺上部にT1強調像で著名な高信号域が見られます。" ("A prominent high-signal area is seen in the upper right lung on the T1-weighted image."), and the first few characters of a second sentence, "その部位のT" (literally "that site's T"). As completion candidates to append to the second sentence (the sentence being input), the child window 52 presents three character strings: "1強調像" ("1-weighted image"), "2強調像" ("2-weighted image"), and "1領域" ("1 region"). That is, when the user inputs "その部位のT", the text input support function refers to the two or three words or few characters the user has just input (here, "部位のT") and presents completion candidates estimated to appear next, such as "1強調像", "2強調像", and "1領域".

To present such completion candidates, a conventional text input support function uses, for example, an N-gram model: a technique that statistically estimates, from a large amount of text prepared in advance, a score (such as a probability) for the N-th word appearing given that the immediately preceding N−1 words have appeared.
However, the T1-weighted image has already been described in the sentence (first sentence) preceding the sentence being input (second sentence), and it is unlikely that a further description of the T1-weighted image follows. Moreover, in MRI (Magnetic Resonance Imaging), a T1-weighted image and a T2-weighted image are normally obtained together, and both images are often described. In this case, therefore, it is desirable that "2強調像", the latter part of "T2強調像" (T2-weighted image), be displayed at the top of the completion candidates.
The present invention therefore uses information on the words input in the sentence preceding (positioned before) the sentence being input, so that the words (character strings) the user is estimated to input can be presented more effectively.
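For reference, the conventional word N-gram scoring described above can be sketched for N = 2 as follows. This is a minimal illustration, not the method of any cited document: the function names, the pre-tokenized toy corpus, and the use of a plain relative frequency as the score are all assumptions.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count how often each word follows each preceding word.

    corpus: list of token lists (assumed to be already split into words).
    """
    context = Counter()            # occurrences of a word as the preceding word
    pairs = defaultdict(Counter)   # pairs[prev][next] = count
    for tokens in corpus:
        for prev, nxt in zip(tokens, tokens[1:]):
            context[prev] += 1
            pairs[prev][nxt] += 1
    return context, pairs


def next_word_scores(context, pairs, prev):
    """Rank candidate next words by the estimated probability P(next | prev)."""
    total = context[prev]
    if total == 0:
        return []
    scored = [(word, count / total) for word, count in pairs[prev].items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Toy corpus in which "T" is followed by "1" more often than by "2".
corpus = [["T", "1", "image"], ["T", "1", "image"],
          ["T", "2", "image"], ["T", "1", "region"]]
context, pairs = train_bigram(corpus)
ranking = next_word_scores(context, pairs, "T")
```

Because the score depends only on the words immediately before the cursor, "1" is ranked above "2" here regardless of what the preceding sentence already said, which is the limitation described above.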

A first configuration example of a text input support system according to an embodiment of the present invention is described below.
FIG. 1 shows the functional blocks of the text input support system of the first configuration example.
The text input support system of this example consists of a text input support unit 30 and a reference information creation unit 10. For a text being created based on operation input by a user, the text input support unit 30 presents completion candidates based on the words contained in the sentence preceding the sentence being input. The reference information creation unit 10 creates, from existing text, the dictionary (dictionary DB 21) that the text input support unit 30 refers to when presenting completion candidates.

The reference information creation unit 10 of the first configuration example has the following functional units: a sentence division unit 12 that divides the text contained in the multiple existing documents stored in a document DB 11 into sentences; an inter-sentence N-gram analysis unit 13 that analyzes the relationship between the sentences obtained by the sentence division unit 12 (hereinafter, the inter-sentence N-gram), using a morphological analysis unit 14 that performs morphological analysis on a target sentence to identify the words it contains; and a sentence-word count unit 16 that calculates the scores used to determine the priority order of the words listed as completion candidates. The dictionary created as a result of processing by these functional units is stored in the dictionary DB 21.
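The flow through the reference information creation unit 10 might be sketched as below. This is an illustration under several assumptions: sentences are taken to end with "." or "。", a regular-expression tokenizer stands in for the morphological analysis unit 14 (real Japanese text would need an actual morphological analyzer), a preceding sentence is identified by the set of its words, and the names `split_sentences`, `words_of`, and `build_dictionary` are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    """Sentence division (unit 12). Assumption: '.' or '。' ends a sentence."""
    return [s.strip() for s in re.findall(r"[^.。]+[.。]?", text) if s.strip()]


def words_of(sentence):
    """Stand-in for morphological analysis (unit 14): the set of word tokens."""
    return frozenset(re.findall(r"\w+", sentence))


def build_dictionary(documents):
    """Count sentences and the words of the sentences that follow them.

    Returns (cs, cw): cs[key] counts sentences whose word set is key, and
    cw[(key, word)] counts how often word occurs in the sentence that
    immediately follows such a sentence.
    """
    cs, cw = Counter(), Counter()
    for doc in documents:
        sentences = split_sentences(doc)
        for sentence in sentences:
            cs[words_of(sentence)] += 1
        # Inter-sentence analysis over consecutive sentence pairs (unit 13).
        for prev, nxt in zip(sentences, sentences[1:]):
            key = words_of(prev)
            for word in words_of(nxt):
                cw[(key, word)] += 1
    return cs, cw
```

The two counters play the role of the contents of the dictionary DB 21 in this sketch; an actual implementation would persist them in whatever storage the system uses.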

The text input support unit 30 of the first configuration example has the following functional units: a text acquisition unit 31 that acquires the text being created by the user; a preceding-sentence acquisition unit 32 that acquires, from the text acquired by the text acquisition unit 31, the sentence preceding the sentence being input (the preceding sentence); a preceding-sentence analysis unit 33 that analyzes the sentence obtained by the preceding-sentence acquisition unit 32, using the morphological analysis unit 14 that performs morphological analysis on a target sentence to identify the words it contains; a completion candidate enumeration unit 35 that searches the dictionary DB 21 based on the analysis result of the preceding-sentence analysis unit 33 and identifies the words that become completion candidates; a completion candidate evaluation unit 37 that determines the priority order of the candidate words obtained by the completion candidate enumeration unit 35; and a completion candidate presentation unit 38 that presents the candidate words in the priority order determined by the completion candidate evaluation unit 37. The unit supports the user's text input by appending the word selected from the presented candidates to the sentence being input.
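The chain from the preceding-sentence acquisition unit 32 to the completion candidate presentation unit 38 might look like the following sketch. It is not the patented implementation: the dictionary layout (a mapping from the preceding sentence's word set to pre-computed scores), the "." / "。" sentence terminators, the regular-expression tokenizer standing in for morphological analysis, and all function names are assumptions made for illustration.

```python
import re


def previous_sentence(text):
    """Unit 32: return the last completed sentence, or None if there is none.

    Assumption: a completed sentence ends with '.' or '。'; the trailing
    fragment without a terminator is the sentence being input.
    """
    pieces = [p.strip() for p in re.findall(r"[^.。]+[.。]?", text)]
    completed = [p for p in pieces if p.endswith((".", "。"))]
    return completed[-1] if completed else None


def words_of(sentence):
    """Unit 33: stand-in for morphological analysis."""
    return frozenset(re.findall(r"\w+", sentence))


def completion_candidates(text, dictionary, limit=3):
    """Units 35, 37, 38: look up, rank, and return candidate words.

    dictionary: hypothetical mapping {frozenset of preceding words:
    {subsequent word: score}}.
    """
    prev = previous_sentence(text)
    if prev is None:
        return []
    scored = dictionary.get(words_of(prev), {})
    ranked = sorted(scored.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]
```

This sketch ranks on the preceding sentence alone; filtering the candidates by the characters already typed in the current sentence is left out for brevity.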

The inter-sentence N-gram is now described. In this example, the inter-sentence N-gram calculates a score for the word W being present in the N-th sentence that follows once the (N-1)-th sentence has appeared (for example, score(W | sentence 1, ..., sentence N-1)), and uses that score to present completion candidates.
The following description covers the case N = 2 (that is, the case in which only the sentence immediately preceding the sentence being input is considered), focusing on independent words (words such as nouns and verbs that carry meaning on their own).
For example, let cs(s) be the frequency with which a certain sentence s appears in a large amount of text (the sentence appearance frequency), and let cw(s, W) be the frequency with which the word W appears in the sentence following an occurrence of s (the word appearance frequency). The score for presenting completion candidates can then be the ratio of cw(s, W) to cs(s), that is, the estimate p(W | s) = cw(s, W) / cs(s) of the conditional probability that the word W appears in the sentence following s.

The sentence appearance frequency cs(s) and the word appearance frequency cw(s, W) are obtained in advance by the functions of the reference information creation unit 10, using the following procedure.
First, a large amount of text is prepared. External resources such as text on the Web and electronic newspaper articles can be used, as can text created by the users of the text input support system of this example. In this example, this text is assumed to have been collected in advance and stored in the document DB 11. Sentences are then extracted from the text two consecutive sentences at a time, and the following processing is performed.

For example, suppose that the (i-1)-th sentence in a certain text is "A prominent high-signal area is seen in the upper right lung on the T1-weighted image." and the i-th sentence is "No abnormality is seen at that site on the T2-weighted image." When these are divided into individual sentences by the sentence division unit 12 and morphologically analyzed by the morphological analysis unit 14, the words contained in each sentence are identified, as illustrated in FIG. 2. The morphological analysis unit 14 of this example assigns parts of speech in addition to segmenting words. In FIG. 2, for sentence (i-1), the (i-1)-th sentence, and sentence (i), the i-th sentence, word boundaries are shown as spaces and independent words are underlined.

Based on the analysis result of the morphological analysis unit 14, the inter-sentence N-gram analysis unit 13 identifies a preceding word group, which collects the words appearing in sentence (i-1) (hereinafter, preceding words) on a per-sentence basis, and the words appearing in sentence (i) (hereinafter, subsequent words). The sentence-word count unit 16 then calculates, for each preceding word group, the sentence appearance frequency cs(s) of sentences containing that preceding word group across the text in the document DB 11, and, for each combination of a preceding word group and a subsequent word, the word appearance frequency cw(s, W) of that subsequent word in the sentences that follow a sentence containing that preceding word group.
The sentence-word count unit 16 of this example calculates cs(s) by adding a predetermined value (1 in this example) for each sentence containing the relevant preceding word group, and calculates cw(s, W) by adding a predetermined value (1 in this example) for each combination of the relevant preceding word group and a subsequent word.
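The counting procedure of the first configuration example can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the function name `build_dictionaries` is hypothetical, the input is assumed to be already split into sentences and reduced to independent words, the preceding word group is keyed without regard to word order, only sentences that have a following sentence are counted, and a subsequent word is counted once per sentence pair.

```python
from collections import Counter


def build_dictionaries(documents):
    """Build the two inter-sentence dictionaries (N = 2).

    documents: list of documents; each document is a list of sentences,
    and each sentence is a list of independent words (assumed input format).
    """
    cs = Counter()  # cs[s]: sentence appearance frequency of preceding word group s
    cw = Counter()  # cw[(s, W)]: frequency of subsequent word W after group s
    for sentences in documents:
        # Walk over every pair of consecutive sentences (i-1, i).
        for prev, nxt in zip(sentences, sentences[1:]):
            s = frozenset(prev)      # order-insensitive key (assumption)
            cs[s] += 1               # add the predetermined value 1
            for W in set(nxt):       # count each subsequent word once per pair
                cw[(s, W)] += 1
    return cs, cw
```

Here `cs` plays the role of the second dictionary (FIG. 3(b)) and `cw` the role of the first dictionary (FIG. 3(a)).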

As a result of the above processing, an inter-sentence N-gram dictionary such as that illustrated in FIG. 3 is created and stored in the dictionary DB 21. In this example, two dictionaries are created: a first dictionary, shown in FIG. 3(a), that holds the word appearance frequency cw(s, W) for each correspondence (combination) between a preceding word group and a subsequent word, and a second dictionary, shown in FIG. 3(b), that holds the sentence appearance frequency cs(s) for each preceding word group.

Next, as an example of text input support using the inter-sentence N-gram, consider the case in which the user has entered the text "A prominent high-signal area is seen in the upper right lung on the T1-weighted image. It is not clear, but" in the text input area.
First, from the text being created by the user and acquired by the text acquisition unit 31, the preceding-sentence acquisition unit 32 extracts the sentence that precedes the sentence at the cursor position, or the sentence in which the last character was input. In this example, the sentence "A prominent high-signal area is seen in the upper right lung on the T1-weighted image." is extracted.
The preceding-sentence analysis unit 33 then has the morphological analysis unit 14 morphologically analyze the sentence obtained by the preceding-sentence acquisition unit 32, and converts it into a representation for searching the dictionary DB 21. In this example, it is converted into the word group "right lung - upper part - T1-weighted image - prominent - high-signal area - seen".

Next, the completion candidate listing unit 35 searches the dictionary DB 21 using the word group obtained by the preceding-sentence analysis unit 33, identifies the subsequent words associated with the preceding word group that matches that word group, and acquires the sentence appearance frequency cs(s) for that preceding word group and the word appearance frequency cw(s, W) for each combination of that preceding word group and a subsequent word.
The completion candidate evaluation unit 37 then calculates score = cw(s, W) / cs(s) for each subsequent word identified by the completion candidate listing unit 35, and identifies the one or more words W whose calculated score is equal to or greater than a threshold as completion candidates.
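The lookup and scoring steps just described might look like the following minimal sketch. The function name `rank_candidates`, the dictionary layout (plain mappings keyed by a `frozenset` word group), and the default threshold of 0.5 are all assumptions made for illustration; the source does not specify a threshold value.

```python
def rank_candidates(prev_words, cs, cw, threshold=0.5):
    """Return (word, score) pairs with score = cw(s, W) / cs(s) >= threshold.

    cs: mapping from preceding word group (frozenset) to sentence frequency.
    cw: mapping from (preceding word group, subsequent word) to word frequency.
    """
    s = frozenset(prev_words)
    total = cs.get(s, 0)
    if total == 0:
        return []  # no matching preceding word group in the dictionary
    # Collect every subsequent word associated with the matching group.
    scored = [(W, count / total) for (group, W), count in cw.items() if group == s]
    kept = [(W, score) for W, score in scored if score >= threshold]
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(kept, key=lambda item: (-item[1], item[0]))
```

The returned list is already in the order in which the presentation step would show the candidates.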

After that, the completion candidate presentation unit 38 lists the completion candidate words in score order in a child window displayed near (in this example, to the lower right of) the cursor position (or the input position of the last character) in the text input area, and presents them so that the user can select one. When the user selects a word presented as a candidate in the child window, the word is appended at the cursor position (or the input position of the last character).

The database search using the word group obtained by the preceding-sentence analysis unit 33 may either ignore or take into account the order in which the words of the word group appear in the sentence. For an order-aware search, information on the order of the preceding words can be added to each preceding word group in the dictionary.
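One simple way to realize both search modes is to choose the lookup key accordingly. The helper below is a hypothetical sketch (the name `group_key` and the use of `tuple` versus `frozenset` are assumptions, not something the source prescribes):

```python
def group_key(words, ordered=False):
    """Build a dictionary key for a preceding word group.

    ordered=False ignores word order; ordered=True keeps it, so the same
    words in a different order map to a different key.
    """
    return tuple(words) if ordered else frozenset(words)
```

With `ordered=False`, the word groups of two sentences that use the same words in different orders collapse into one dictionary entry; with `ordered=True` they remain distinct.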

A second configuration example of the text input support system according to an embodiment of the present invention will now be described.
FIG. 4 shows the functional blocks of the text input support system according to the second configuration example.
The text input support system of this example has a word-word appearance count unit 17 in place of the sentence-word appearance count unit 16 in the reference information creation unit 10 of the first configuration example.

In the second configuration example, when a certain sentence s contains the words w1, ..., wk, then for every j satisfying 1 ≤ j ≤ k, cs(wj) denotes the frequency with which sentences containing wj appear in a large amount of text (the sentence appearance frequency), and cw(wj, W) denotes the frequency with which the word W appears in the sentence following such a sentence (the word appearance frequency). The score for presenting completion candidates is the sum, over j, of the ratios of cw(wj, W) to cs(wj).

In the reference information creation unit 10 according to the second configuration example, the inter-sentence N-gram analysis unit 13 identifies, based on the analysis result of the morphological analysis unit 14, each preceding word appearing in sentence (i-1) and each subsequent word appearing in sentence (i). The word-word count unit 17 calculates, for each preceding word, the sentence appearance frequency cs(w) of sentences containing that preceding word across the text in the document DB 11, and, for each combination of a preceding word and a subsequent word, the word appearance frequency cw(w, W) of that subsequent word in the sentences that follow a sentence containing that preceding word.

The word-word count unit 17 calculates cs(w) by adding, for each sentence containing the relevant preceding word, a value determined by a predetermined formula (in this example, 1 divided by the number k of words in the sentence, that is, 1/k), and calculates cw(w, W) by adding, for each combination of the relevant preceding word and a subsequent word, a value determined by a predetermined formula (in this example, likewise 1/k). In other words, the more words a sentence contains, the smaller the value added to the sentence appearance frequency and the word appearance frequency; by discounting the added value as the number of words co-occurring in the same sentence grows, the weight of each word within its sentence is reflected in both frequencies.
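The 1/k weighting can be illustrated with the following sketch. The function name `build_word_dictionaries` and the input format are hypothetical; as a further assumption, k is taken to be the number of distinct independent words in the preceding sentence, and each subsequent word is counted once per sentence pair.

```python
from collections import defaultdict


def build_word_dictionaries(documents):
    """Per-word inter-sentence counts with 1/k weighting.

    documents: list of documents; each document is a list of sentences,
    and each sentence is a list of independent words (assumed input format).
    """
    cs = defaultdict(float)  # cs[w]: weighted sentence appearance frequency
    cw = defaultdict(float)  # cw[(w, W)]: weighted word appearance frequency
    for sentences in documents:
        for prev, nxt in zip(sentences, sentences[1:]):
            prev_words = set(prev)
            k = len(prev_words)
            if k == 0:
                continue  # nothing to count for an empty preceding sentence
            weight = 1.0 / k  # longer sentences contribute less per word
            for w in prev_words:
                cs[w] += weight
                for W in set(nxt):
                    cw[(w, W)] += weight
    return cs, cw
```

A word that is the only independent word of its sentence therefore contributes a full 1.0, while a word sharing its sentence with five others contributes only 1/6.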

As a result of the above processing, an inter-sentence N-gram dictionary such as that illustrated in FIG. 5 is created and stored in the dictionary DB 21. In this example, two dictionaries are created: a first dictionary, shown in FIG. 5(a), that holds the word appearance frequency cw(w, W) for each correspondence (combination) between a preceding word and a subsequent word, and a second dictionary, shown in FIG. 5(b), that holds the sentence appearance frequency cs(w) for each preceding word.

In the text input support unit 30 according to the second configuration example, the preceding-sentence analysis unit 33 has the morphological analysis unit 14 morphologically analyze the sentence obtained by the preceding-sentence acquisition unit 32, and converts it into a representation for searching the dictionary DB 21. In this example, the words wj (1 ≤ j ≤ k) are obtained as w1 = "right lung", w2 = "upper part", w3 = "T1-weighted image", w4 = "prominent", w5 = "high-signal area", and w6 = "seen".

Next, the completion candidate listing unit 35 searches the dictionary DB 21 using each word wj obtained by the preceding-sentence analysis unit 33, identifies the subsequent words associated with each preceding word that matches a word wj, and acquires the sentence appearance frequency cs(wj) for that preceding word and the word appearance frequency cw(wj, W) for each combination of that preceding word and a subsequent word.
The completion candidate evaluation unit 37 then calculates a score for each subsequent word identified by the completion candidate listing unit 35 using (Equation 1) below, and identifies the one or more words W whose calculated score is equal to or greater than a threshold as completion candidates. After that, the completion candidate presentation unit 38 lists the completion candidate words in score order and presents them so that the user can select one.

$$\mathrm{score}(W) = \sum_{j=1}^{k} \frac{cw(w_j, W)}{cs(w_j)} \qquad \text{(Equation 1)}$$
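The score of the second configuration example, described in the text as the sum over the preceding words wj of the ratios cw(wj, W) / cs(wj), could be computed along the following lines. The function name `score_candidates`, the plain-mapping dictionary layout, and the default threshold are assumptions for illustration only.

```python
def score_candidates(prev_words, cs, cw, threshold=0.0):
    """Score each subsequent word W as the sum over wj of cw(wj, W) / cs(wj).

    cs: mapping from preceding word to its sentence appearance frequency.
    cw: mapping from (preceding word, subsequent word) to word frequency.
    """
    prev = set(prev_words)
    scores = {}
    for (w, W), count in cw.items():
        # Only preceding words that occur in the sentence before the cursor count.
        if w in prev and cs.get(w, 0) > 0:
            scores[W] = scores.get(W, 0.0) + count / cs[w]
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(W, s) for W, s in ranked if s >= threshold]
```

Because the per-word ratios are summed, a subsequent word supported by several words of the preceding sentence ranks above one supported by a single word.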

A third configuration example of the text input support system according to an embodiment of the present invention will now be described.
FIG. 6 shows the functional blocks of the text input support system according to the third configuration example.
The text input support system of this example adds a syntax analysis unit 15 to the reference information creation unit 10 of the second configuration example.

In the second configuration example described above, 1/k was added uniformly when creating the database of sentence appearance frequencies cs(wj) and word appearance frequencies cw(wj, W) (see FIG. 5). In the third configuration example, the syntax analysis unit 15 parses the two adjacent sentences to identify the dependency structure between the words in each sentence, and the value to be added is adjusted according to the result.
That is, the third configuration example takes note of the facts that (1) consecutive sentences often share a subject or other case, and (2) there are expression patterns that follow the flow of the description in a text (Example 1: "fall down" is likely to appear in the sentence after "run", but the reverse is unlikely. Example 2: "on the T2-weighted image" is likely to appear after "on the T1-weighted image"). Based on the result of parsing the two adjacent sentences, for a combination in which the preceding word and the subsequent word have the same case, the value added to the word appearance frequency cw(wj, W) for that subsequent word is increased.

In the reference information creation unit 10 according to the third configuration example, the inter-sentence N-gram analysis unit 13 identifies, based on the analysis result of the morphological analysis unit 14, each preceding word appearing in sentence (i-1) and each subsequent word appearing in sentence (i). The word-word count unit 17 calculates, for each preceding word, the sentence appearance frequency cs(w) of sentences containing that preceding word across the text in the document DB 11, and, for each combination of a preceding word and a subsequent word, the word appearance frequency cw(w, W) of that subsequent word in the sentences that follow a sentence containing that preceding word. At this time, based on the analysis result of the syntax analysis unit 15, for a combination in which the preceding word and the subsequent word have the same case, the value added to cw(wj, W) for that subsequent word is adjusted to be larger than for other combinations.

For example, the parsing results for the sentences illustrated in FIG. 2 can be represented as in FIG. 7. FIG. 7(a) illustrates the parsing result for sentence (i-1) and FIG. 7(b) that for sentence (i), with the dependency structure between words indicated by arrows. According to FIGS. 7(a) and 7(b), in sentence (i-1) the predicate "seen" has the dependency relations nominative "high-signal area", locative "upper right lung", and instrumental "T1-weighted image"; in sentence (i) the predicate "seen" has the dependency relations nominative "abnormality", locative "site", and instrumental "T2-weighted image". The detailed dependency structure is omitted from FIG. 7.

Here, the correspondences between the words constituting each case are identified, namely the predicates "seen" and "seen", the nominatives "high-signal area" and "abnormality", the locatives "upper right lung" and "site", and the instrumentals "T1-weighted image" and "T2-weighted image", and the value to be added is adjusted so that the word appearance frequency cw(wj, W) for the subsequent word in each such pair (combination of a preceding word and a subsequent word) becomes larger. In this example, as illustrated in FIG. 8 for the first dictionary, which holds the word appearance frequency cw(w, W) for each correspondence (combination) between a preceding word and a subsequent word, cw(wj, W) for a subsequent word in a case-matching combination is set to twice that of the other combinations. The adjustment is not limited to this, however; for example, the value to be added may be varied according to the type of matching case.
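The case-dependent adjustment can be sketched as a small change to the 1/k counting: when the preceding and subsequent words carry the same case label, a boosted value is added instead. Everything named here is hypothetical (`build_case_weighted`, the `(word, case)` tuple format, the label strings); the factor of 2 follows the example in the text, and the parsing that produces the case labels is assumed to have been done elsewhere.

```python
from collections import defaultdict


def build_case_weighted(pairs, boost=2.0):
    """Count word pairs across consecutive sentences, boosting case matches.

    pairs: iterable of (prev, nxt), where each sentence is a list of
    (word, case) tuples and case is None for words without a case role.
    """
    cs = defaultdict(float)
    cw = defaultdict(float)
    for prev, nxt in pairs:
        k = len(prev)
        if k == 0:
            continue
        for w, w_case in prev:
            cs[w] += 1.0 / k
            for W, W_case in nxt:
                same_case = w_case is not None and w_case == W_case
                # Case-matching combinations get `boost` times the usual 1/k.
                cw[(w, W)] += (boost if same_case else 1.0) / k
    return cs, cw
```

Varying the added value by case type, as the text suggests, would amount to replacing the single `boost` factor with a per-case lookup.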

A fourth configuration example of the text input support system according to an embodiment of the present invention will now be described.
FIG. 9 shows the functional blocks of the text input support system according to the fourth configuration example.
The text input support system of this example adds a syntax analysis unit 35 to the text input support unit 30 of the third configuration example.
The fourth configuration example presents, as completion candidates, the subsequent words associated with preceding words that match both a word contained in the sentence preceding the sentence being input and its case. In the dictionary DB 21 searched for completion candidates, the first dictionary, which holds the word appearance frequency cw(w, W) for each correspondence (combination) between a preceding word and a subsequent word, is assumed to additionally store the case of the preceding word (and the case of the subsequent word) in association with each entry, as illustrated in FIG. 8.

In the text input support unit 30 according to the fourth configuration example, the preceding-sentence analysis unit 33 uses the syntax analysis unit 35 to parse the analysis result of the morphological analysis unit 14, identifies the case of each word, and converts the result into a representation for searching the dictionary DB 21. As a result, words wj (1 ≤ j ≤ k) with case information are obtained, for example w1 = "right lung (locative)", w2 = "upper part (locative)", w3 = "T1-weighted image (instrumental)", w4 = "prominent", w5 = "high-signal area (nominative)", and w6 = "seen (predicate)".

Next, the completion candidate listing unit 35 searches the dictionary DB 21 using each word wj obtained by the preceding-sentence analysis unit 33, identifies the subsequent words associated with each preceding word that matches a word wj and its case, and acquires the sentence appearance frequency cs(wj) for that preceding word and the word appearance frequency cw(wj, W) for each combination of that preceding word and a subsequent word.
The completion candidate evaluation unit 37 then calculates a score for each subsequent word identified by the completion candidate listing unit 35 using (Equation 1) described above, and identifies the one or more words W whose calculated score is equal to or greater than a threshold as completion candidates. After that, the completion candidate presentation unit 38 lists the completion candidate words in score order and presents them so that the user can select one.

In the text input support systems according to the configuration examples described above, the reference information creation device that operates as the reference information creation unit 10 and the text input support device that operates as the text input support unit 30 are provided as separate devices, and the dictionary created by the reference information creation device is distributed to each text input support device. The configuration is not limited to this, however: for example, each text input support device may refer to the dictionary held in the reference information creation device, or the reference information creation unit 10 and the text input support unit 30 may be provided in a single device.

The configuration examples described above present completion candidates based on two consecutive sentences, but candidates may also be presented based on three or more consecutive sentences. For example, let wx be a word contained in the sentence preceding the sentence being input (the sentence one before) and wy a word contained in the sentence preceding that (the sentence two before). If the existing text contains a sentence sy containing wy, followed by a sentence sx containing wx, the words appearing in the sentence following sx are presented as completion candidates.

In the configuration examples described above, a subsequent word is identified as a completion candidate on the condition that a word contained in the sentence preceding the sentence being input matches a preceding word in the dictionary DB 21. However, words with different surface forms may be regarded as matching if they are, for example, synonyms or related words, and the subsequent words associated with the corresponding preceding word may then be identified as completion candidates.

Because the configuration examples described above are expected to identify a relatively small number of words as completion candidates, the candidates they produce may, for example, be merged with candidates obtained by a conventional method and presented together. Furthermore, the candidate scores obtained by a configuration example and those obtained by the conventional method may be summed, and the candidates presented in order of the summed score.

In the first to fourth configuration examples described above, text is collected without restricting the field, and the dictionary used for presenting completion candidates is created from it, so as to support document input across a wide range of fields. However, documents of a specific type in which formulaic sentences are used may instead be collected to create a dictionary dedicated to creating documents of that type: for example, a dictionary dedicated to creating medical documents may be created from medical documents such as medical records, or a dictionary dedicated to creating internal company documents may be created from internal documents such as reports. Each user may also create a personal dictionary from documents that the user has written.

FIG. 10 illustrates the hardware configuration of the computer of the reference information creation device that operates as the reference information creation unit 10, and of the computer of the text input support device that operates as the text input support unit 30, in the text input support systems according to the first to fourth configuration examples.
The computer of this example has hardware resources including: a CPU (Central Processing Unit) 41 that performs various arithmetic operations; main storage devices such as a RAM (Random Access Memory) 42 that serves as the work area of the CPU 41 and a ROM (Read Only Memory) 43 that records basic control programs; an auxiliary storage device such as an HDD (Hard Disk Drive) 44 that stores the program according to an embodiment of the present invention and various data; an input/output I/F 45 that interfaces with a display device for displaying various information and with input devices, such as operation buttons and a touch panel, used by the operator for input operations; and a communication I/F 46 that interfaces with other devices for wired or wireless communication.
The program according to an embodiment of the present invention is read from the auxiliary storage device 44 or the like, loaded into the RAM 42, and executed by the CPU 41, thereby realizing the functional units described above on the computer.

The program according to an embodiment of the present invention is installed on the computer of this example by, for example, reading it from an external storage medium such as a CD-ROM on which it is stored, or receiving it via a communication network or the like.
The functional units need not be realized by a software configuration as in this example; each may instead be realized by dedicated hardware resources.

11: Document DB, 12: Sentence division unit, 13: Inter-sentence N-gram analysis unit, 14: Morphological analysis unit, 15: Syntax analysis unit, 16: Word-sentence appearance count unit, 17: Word-word appearance count unit, 21: Dictionary DB, 31: Text acquisition unit, 32: Preceding sentence acquisition unit, 33: Preceding sentence analysis unit, 34: Morphological analysis unit, 35: Syntax analysis unit, 36: Completion candidate enumeration unit, 37: Completion candidate evaluation unit, 38: Completion candidate presentation unit

Claims (13)

A program for causing a computer to realize:
a specifying function of analyzing, in a text being created based on operation input by a user, the sentence preceding the sentence being input, and specifying the words contained in that preceding sentence;
a search function of searching a storage means, which stores, for two consecutive sentences contained in existing text, words contained in the preceding sentence as preceding words in association with words contained in the subsequent sentence as subsequent words, for subsequent words associated with a preceding word that matches a word specified by the specifying function; and
a presentation function of presenting the subsequent words retrieved by the search function to the user.
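The three functions of the claim above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the storage means is modeled as an in-memory dictionary, the names `REFERENCE_INFO`, `specify_words`, `search_subsequent_words`, and `present` and the sample entries are hypothetical, and whitespace splitting stands in for the morphological analysis that Japanese text would require.

```python
# Hypothetical reference information: preceding word -> subsequent words.
# The entries are invented for illustration only.
REFERENCE_INFO = {
    "printer": ["toner", "paper"],
    "error": ["restart", "toner"],
}


def specify_words(preceding_sentence):
    """Specifying function (simplified): split the preceding sentence into words."""
    return preceding_sentence.lower().split()


def search_subsequent_words(words, reference_info):
    """Search function: collect subsequent words whose preceding word matches."""
    found = []
    for word in words:
        for candidate in reference_info.get(word, []):
            if candidate not in found:  # keep first-seen order, drop duplicates
                found.append(candidate)
    return found


def present(candidates):
    """Presentation function: here, simply build a display string."""
    return ", ".join(candidates)


candidates = search_subsequent_words(
    specify_words("The printer reported an error"), REFERENCE_INFO
)
```

In this sketch the candidates are presented in lookup order; no ranking is applied.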
The program according to claim 1, wherein
the storage means stores, together with association information between each preceding word group, in which the preceding words contained in a preceding sentence of the existing text are grouped per sentence, and each subsequent word contained in the sentence following the sentence containing that preceding word group, a sentence appearance amount indicating the degree of appearance in the existing text of sentences containing that preceding word group, and a word appearance amount indicating the degree of appearance of that subsequent word in sentences following sentences containing that preceding word group,
the search function searches for subsequent words corresponding to a preceding word group that matches the word group contained in the sentence preceding the sentence being input, and also retrieves the sentence appearance amount of that preceding word group and the word appearance amount of each combination of that preceding word group and a subsequent word, and
the presentation function presents the subsequent words retrieved by the search function in descending order of the ratio of the word appearance amount to the sentence appearance amount retrieved together with each subsequent word.
The program according to claim 1, wherein
the storage means stores, together with association information between each preceding word contained in a preceding sentence of the existing text and each subsequent word contained in the sentence following the sentence containing that preceding word, a sentence appearance amount indicating the degree of appearance in the existing text of sentences containing that preceding word, and a word appearance amount indicating the degree of appearance of that subsequent word in sentences following sentences containing that preceding word,
the search function searches for subsequent words corresponding to each preceding word that matches a word contained in the sentence preceding the sentence being input, and also retrieves the sentence appearance amount of that preceding word and the word appearance amount of each combination of that preceding word and a subsequent word, and
the presentation function presents the subsequent words retrieved by the search function in descending order of the ratio of the word appearance amount to the sentence appearance amount retrieved together with each subsequent word.
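The ordering rule in the claim above, descending ratio of word appearance amount to sentence appearance amount, can be illustrated as follows. The stored amounts are invented values, the names `sentence_amount`, `word_amount`, and `rank_subsequent_words` are hypothetical, and keeping the highest ratio when several preceding words yield the same candidate is an assumption of this sketch; the claim does not say how such cases are combined.

```python
# Hypothetical stored amounts (assumed values, not taken from the patent).
# sentence_amount[p]: appearance amount of sentences containing preceding word p.
# word_amount[(p, s)]: appearance amount of subsequent word s in the sentence
# that follows a sentence containing p.
sentence_amount = {"printer": 4.0, "error": 2.0}
word_amount = {
    ("printer", "toner"): 3.0,
    ("printer", "paper"): 1.0,
    ("error", "restart"): 2.0,
}


def rank_subsequent_words(words, sentence_amount, word_amount):
    """Return (subsequent word, ratio) pairs, highest ratio first."""
    scored = {}
    for (preceding, subsequent), amount in word_amount.items():
        if preceding in words and sentence_amount.get(preceding, 0) > 0:
            ratio = amount / sentence_amount[preceding]
            # Assumption: when several preceding words give the same
            # candidate, keep the best ratio.
            scored[subsequent] = max(scored.get(subsequent, 0.0), ratio)
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


ranking = rank_subsequent_words({"printer", "error"}, sentence_amount, word_amount)
```

With these sample values, "restart" (2.0 / 2.0) ranks ahead of "toner" (3.0 / 4.0) even though "toner" has the larger raw count, which is the effect of normalizing by the sentence appearance amount.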
The program according to claim 3, wherein
the storage means stores, together with the association information between each preceding word and each subsequent word contained in the sentence following the sentence containing that preceding word, information on the case of that preceding word,
the specifying function specifies the case of each word contained in the sentence preceding the sentence being input, and
the search function searches for subsequent words associated with a preceding word whose word and case match those of a word contained in the sentence preceding the sentence being input.
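One way to picture the case-sensitive lookup in the claim above is to key the stored associations by a (word, case) pair. The sketch below is an assumption-laden illustration: the case labels, the sample entries, and the names `CASE_REFERENCE_INFO` and `search_with_case` are hypothetical, and the case of each word is supplied directly rather than obtained by syntax analysis.

```python
# Hypothetical store keyed by (preceding word, grammatical case).
CASE_REFERENCE_INFO = {
    ("printer", "nominative"): ["stopped", "jammed"],
    ("printer", "accusative"): ["restart"],
}


def search_with_case(word_case_pairs, reference_info):
    """Return subsequent words for entries matching both word and case."""
    found = []
    for word, case in word_case_pairs:
        for candidate in reference_info.get((word, case), []):
            if candidate not in found:  # drop duplicates, keep order
                found.append(candidate)
    return found


result = search_with_case(
    [("printer", "accusative"), ("cable", "nominative")], CASE_REFERENCE_INFO
)
```

Because the key includes the case, the same word used as a subject and as an object can lead to different candidates.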
A program for causing a computer to realize:
a specifying function of analyzing two consecutive sentences contained in existing text, and specifying words contained in the preceding sentence as preceding words and words contained in the subsequent sentence as subsequent words; and
a storage function of storing the preceding words and subsequent words specified by the specifying function in a storage means in association with each other, for use in text input support processing that, for a sentence being input in a text being created based on operation input by a user, presents to the user subsequent words corresponding to a preceding word that matches a word contained in the sentence preceding the sentence being input.
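The creation of the stored associations described in the claim above might look like the following sketch. It assumes English text, period-based sentence splitting, and whitespace tokenization in place of the sentence division and morphological analysis of the embodiment; `split_sentences` and `build_reference_info` are hypothetical names.

```python
from collections import defaultdict


def split_sentences(text):
    """Assumed sentence splitter: break on periods and drop empty pieces."""
    return [s.strip() for s in text.split(".") if s.strip()]


def build_reference_info(documents):
    """Associate each word of a sentence with each word of the next sentence."""
    reference_info = defaultdict(set)
    for text in documents:
        sentences = [s.lower().split() for s in split_sentences(text)]
        # zip pairs each sentence with the one that immediately follows it.
        for preceding, subsequent in zip(sentences, sentences[1:]):
            for preceding_word in preceding:
                reference_info[preceding_word].update(subsequent)
    return reference_info


info = build_reference_info(["Printer jammed. Replace toner. Restart printer."])
```

Words of the last sentence of a document never become preceding words here, since no sentence follows them.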
The program according to claim 5, wherein
the program further causes the computer to realize a calculation function of calculating, for each preceding word group in which the preceding words contained in a preceding sentence of the existing text are grouped per sentence, a sentence appearance amount indicating the degree of appearance in the existing text of sentences containing that preceding word group, and calculating, for each combination of a preceding word group and a subsequent word contained in the sentence following a sentence containing that preceding word group, a word appearance amount indicating the degree of appearance of that subsequent word in sentences following sentences containing that preceding word group, and
the storage function stores in the storage means, together with association information between each preceding word group and each subsequent word contained in the sentence following a sentence containing that preceding word group, the sentence appearance amount and the word appearance amount calculated by the calculation function, so that the text input support processing presents the subsequent words corresponding to a preceding word group that matches the word group contained in the sentence preceding the sentence being input in descending order of the ratio of the word appearance amount of each subsequent word to the sentence appearance amount of that preceding word group.
The program according to claim 5, wherein
the program further causes the computer to realize a calculation function of calculating, for each preceding word contained in a preceding sentence of the existing text, a sentence appearance amount indicating the degree of appearance in the existing text of sentences containing that preceding word, and calculating, for each combination of a preceding word and a subsequent word contained in the sentence following a sentence containing that preceding word, a word appearance amount indicating the degree of appearance of that subsequent word in sentences following sentences containing that preceding word, and
the storage function stores in the storage means, together with association information between each preceding word and each subsequent word contained in the sentence following a sentence containing that preceding word, the sentence appearance amount and the word appearance amount calculated by the calculation function, so that the text input support processing presents the subsequent words corresponding to each preceding word that matches a word contained in the sentence preceding the sentence being input in descending order of the ratio of the word appearance amount of each subsequent word to the sentence appearance amount of that preceding word.
The program according to claim 7, wherein
the calculation function calculates the sentence appearance amount of each preceding word by adding, for each sentence containing that preceding word, a value that is smaller the more words that sentence contains, and calculates the word appearance amount of each combination of a preceding word and a subsequent word by adding, for each sentence that follows a sentence containing that preceding word and that contains that subsequent word, a value that is smaller the more words that sentence contains.
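The length-dependent counting in the claim above can be illustrated with a short sketch. The claim only requires that the added value shrink as the sentence gets longer; using the reciprocal of the word count is an assumption of this example, as are the name `count_amounts` and the sample sentence pairs. Only sentences that have a following sentence are counted here, which is a simplification.

```python
from collections import defaultdict


def count_amounts(sentence_pairs):
    """Accumulate length-weighted amounts.

    sentence_pairs: list of (preceding_words, subsequent_words) word lists,
    one entry per pair of consecutive sentences.
    """
    sentence_amount = defaultdict(float)
    word_amount = defaultdict(float)
    for preceding, subsequent in sentence_pairs:
        for p in set(preceding):
            # A longer preceding sentence adds a smaller value (assumed 1/length).
            sentence_amount[p] += 1.0 / len(preceding)
            for s in set(subsequent):
                # A longer subsequent sentence adds a smaller value (assumed 1/length).
                word_amount[(p, s)] += 1.0 / len(subsequent)
    return sentence_amount, word_amount


pairs = [
    (["printer", "error"], ["replace", "toner", "now", "please"]),
    (["printer"], ["toner", "low"]),
]
sentence_amount, word_amount = count_amounts(pairs)
```

The intuition is that a word in a short sentence is more characteristic of that sentence than a word in a long one, so it contributes more to the stored amounts.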
The program according to claim 7 or claim 8, wherein
the specifying function specifies the case of each preceding word contained in a preceding sentence of the existing text and the case of each subsequent word contained in the sentence following that sentence, and
the calculation function increases the value added to the word appearance amount of a subsequent word whose case matches that of the specified preceding word.
The program according to any one of claims 7 to 9, wherein
the specifying function specifies the case of each preceding word contained in a preceding sentence of the existing text, and
the storage function stores in the storage means, together with the association information between each preceding word and each subsequent word contained in the sentence following a sentence containing that preceding word, information on the case of that preceding word, so that the text input support processing presents subsequent words corresponding to a preceding word whose word and case match those of a word contained in the sentence preceding the sentence being input.
A text input support system comprising:
a reference information creation unit having first specifying means for analyzing two consecutive sentences contained in existing text and specifying words contained in the preceding sentence as preceding words and words contained in the subsequent sentence as subsequent words, and storage means for storing the preceding words and subsequent words specified by the first specifying means in association with each other; and
a text input support unit having second specifying means for analyzing, in a text being created based on operation input by a user, the sentence preceding the sentence being input and specifying the words contained in that preceding sentence, search means for searching the storage means for subsequent words associated with a preceding word that matches a word specified by the second specifying means, and presentation means for presenting the subsequent words retrieved by the search means to the user.
A text input support device comprising:
specifying means for analyzing, in a text being created based on operation input by a user, the sentence preceding the sentence being input, and specifying the words contained in that preceding sentence;
search means for searching a storage means, which stores, for two consecutive sentences contained in existing text, words contained in the preceding sentence as preceding words in association with words contained in the subsequent sentence as subsequent words, for subsequent words associated with a preceding word that matches a word specified by the specifying means; and
presentation means for presenting the subsequent words retrieved by the search means to the user.
A reference information creation device comprising:
specifying means for analyzing two consecutive sentences contained in existing text, and specifying words contained in the preceding sentence as preceding words and words contained in the subsequent sentence as subsequent words; and
storing means for storing the preceding words and subsequent words specified by the specifying means in a storage means in association with each other, for use in text input support processing that, for a sentence being input in a text being created based on operation input by a user, presents to the user subsequent words corresponding to a preceding word that matches a word contained in the sentence preceding the sentence being input.
JP2011013942A 2011-01-26 2011-01-26 Text input support system, text input support device, reference information creation device, and program Expired - Fee Related JP5553037B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2011013942A JP5553037B2 (en) 2011-01-26 2011-01-26 Text input support system, text input support device, reference information creation device, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2011013942A JP5553037B2 (en) 2011-01-26 2011-01-26 Text input support system, text input support device, reference information creation device, and program

Publications (2)

Publication Number Publication Date
JP2012155520A JP2012155520A (en) 2012-08-16
JP5553037B2 true JP5553037B2 (en) 2014-07-16

Family

ID=46837184

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2011013942A Expired - Fee Related JP5553037B2 (en) 2011-01-26 2011-01-26 Text input support system, text input support device, reference information creation device, and program

Country Status (1)

Country Link
JP (1) JP5553037B2 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014162756A1 (en) 2013-04-04 2014-10-09 ソニー株式会社 Information processing device, data input assistance method, and program
JP7433601B2 (en) * 2018-05-15 2024-02-20 インテックス ホールディングス ピーティーワイ エルティーディー Expert report editor

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009522625A (en) * 2005-12-28 2009-06-11 ザン,ホンウェイ Computer and mobile phone input method

Also Published As

Publication number Publication date
JP2012155520A (en) 2012-08-16

Legal Events

Date Code Title Description
A621 Written request for application examination; Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20131220
A977 Report on retrieval; Free format text: JAPANESE INTERMEDIATE CODE: A971007; Effective date: 20140421
TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model); Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20140430
A61 First payment of annual fees (during grant procedure); Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20140513
R150 Certificate of patent or registration of utility model; Ref document number: 5553037; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150
S533 Written request for registration of change of name; Free format text: JAPANESE INTERMEDIATE CODE: R313533
R350 Written notification of registration of transfer; Free format text: JAPANESE INTERMEDIATE CODE: R350
LAPS Cancellation because of no payment of annual fees