JP2010086472A - Query candidate providing device

Info

Publication number: JP2010086472A
Application number: JP2008257639A
Authority: JP
Inventors: Takeyuki Aikawa; 勇之相川; Koichi Tanigaki; 宏一谷垣; Takashi Mikami; 崇志三上
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2008-10-02
Filing date: 2008-10-02
Publication date: 2010-04-15

Abstract

PROBLEM TO BE SOLVED: To improve the efficiency with which a user inputs a retrieval keyword by presenting query candidates suited to the screen size, the bias in the readings of the retrieval target data, and the like.

SOLUTION: A query candidate extracting part 106 extracts query candidate character strings from a reading trie structure index generated from the retrieval target data 102. A score calculating part 107 calculates the score of each query candidate, taking into account the number of keystrokes saved by presenting the candidate, a post-confirmation score that predicts the further reduction in keystrokes while the user inputs the remaining character string after selecting the candidate, the number of items displayable on one screen, and the like. A query candidate selecting part 109 registers query candidates according to their scores and generates a query candidate dictionary 103. When the user inputs the reading of a retrieval keyword one character at a time through an input part 112, a query candidate displaying part 113 refers to the query candidate dictionary 103 and presents query candidates based on the reading character string.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a query candidate presentation device that presents search keyword candidates when the first few characters of a reading are input.

In recent years, electronic devices of all kinds have become increasingly sophisticated, and there is a growing need to browse and search electronic manuals and similar documents on the devices themselves. From an environmental standpoint as well, there is strong demand to digitize conventional paper manuals.
However, on devices without a keyboard, such as car navigation devices and FA display units, search keywords are difficult to enter and entry is laborious, so the electronic documents on the device cannot be fully utilized.

To solve this problem, Patent Document 1 proposes a technique for reducing the effort of input operations. The character string prediction device disclosed in Patent Document 1 predicts the character string the user intends to input from the characters input so far. The device compares a first operation cost, which is the cost for the user to select a candidate character string from the candidate list, with a second operation cost, which is the cost to select the candidate character string after the list has been narrowed down by the user inputting the next character. It then excludes from the candidate list any candidate character string whose first operation cost is larger than its second operation cost, which reduces the presentation of unnecessary candidates and improves input efficiency.

JP 2008-46775 A

Because the conventional character string prediction device is configured as described above, it merely deletes from the candidate list those candidate character strings whose first operation cost is larger than the second operation cost. When many candidate character strings remain undeleted, input is still laborious.

Furthermore, when the candidate list contains a large number of candidate character strings, they cannot be grouped and entered hierarchically, so the number of candidates presented to the user becomes large.

In addition, for devices on which the number of candidates displayable on one screen is limited, the cost calculation does not take the displayable number of candidates into account, so candidate character strings suited to each individual target device or target screen cannot be presented.

The present invention has been made to solve the above problems. Its object is to present query candidates suited to the screen size, the bias in the readings of the search target data, and the like, and thereby to improve the efficiency with which the user inputs search keywords.

A query candidate presentation device according to the present invention includes a query candidate display unit that acquires, from a query candidate dictionary, query candidates beginning with the reading characters of a search keyword input one character at a time, and presents them as candidates for the search keyword. The device further includes: a query candidate extraction unit that extracts character strings to serve as query candidates from data in which words and phrases extracted from search target data are hierarchically structured according to their reading characters; a score calculation unit that calculates a score indicating how much the effort of inputting the search keyword is reduced by making a character string beginning with the reading characters of the search keyword a query candidate; and a query candidate selection unit that, for each reading character of the words and phrases extracted from the search target data, registers character strings in the query candidate dictionary as query candidates on the basis of the scores calculated by the score calculation unit.

According to the present invention, character strings to serve as query candidates are extracted from the hierarchically structured data; a score is calculated that indicates how much the effort of inputting the search keyword is reduced by making a character string beginning with the reading characters of the search keyword a query candidate; and, for each reading character of the words and phrases extracted from the search target data, character strings are registered in the query candidate dictionary as query candidates on the basis of the score. Query candidates suited to the screen size, the bias in the readings of the search target data, and the like can therefore be presented, and the efficiency with which the user inputs search keywords can be improved.

Embodiment 1.
FIG. 1 is a block diagram showing the configuration of a query candidate presentation device according to Embodiment 1 of the present invention. The query candidate presentation device of this embodiment comprises a query candidate dictionary generation unit 101, which generates a query candidate dictionary 103 from search target data 102, and a query candidate presentation unit 110, which, as the user inputs a reading one character at a time, selects from the query candidate dictionary 103 and presents the search keyword candidates that begin with that reading, that is, the query candidates.

First, the query candidate dictionary generation unit 101 is described. It includes: an important phrase extraction unit 104 that analyzes the search target data 102 and extracts important phrases to be considered as query candidates; a trie structure index generation unit 105 that generates a reading trie structure index in which the extracted important phrases are structured by their readings; a query candidate extraction unit 106 that extracts character strings to be made query candidates from the reading trie structure index; a score calculation unit 107 that calculates the scores of the extracted query candidates; and a query candidate selection unit 109 that selects the optimal combination of query candidates on the basis of the scores and generates query candidate dictionary data.
The query candidate presentation device uses data such as facility names and manuals as the search target data 102.

The important phrase extraction unit 104 analyzes the search target data 102 and extracts important phrases to be made query candidates, using clues such as screen layout information (for example, font size) and document structure information (for example, chapter and section structure). This processing can be realized with known techniques, such as the "summary sentence creation method and apparatus, and storage medium storing a summary sentence creation program" of Japanese Patent Application Laid-Open No. 2001-52032, so a detailed description is omitted.
FIG. 2 is an explanatory diagram showing examples of important phrases extracted as query candidates by the important phrase extraction unit 104. The unit extracts each important phrase that is to become a query candidate from the search target data 102 as a headword, and generates two further pieces of information: the word segmentation, in which the headword is divided into words, and the reading segmentation, in which the reading of the headword is divided word by word. It outputs important phrase data containing at least these three kinds of information to the trie structure index generation unit 105. In addition to these three kinds of information, the important phrase data may include, for example, information indicating the importance of each phrase.

The trie structure index generation unit 105 takes as input the important phrase data, which carries the headword, word segmentation, and reading segmentation, and generates a reading trie structure index in which the readings of the important phrases are organized as a tree.
FIG. 3 is an explanatory diagram showing the reading trie structure index generated by the trie structure index generation unit 105, here built from the 14 important phrases shown in FIG. 2. In the reading trie structure, the readings of the important phrases are represented by nodes N_x, and, starting from the root node N_r, the common leading parts of the readings are merged into a tree to give hierarchically structured data. It is a well-known data structure frequently used in natural language processing, for example in dictionary lookup.
The double array is well known as a fast implementation of this data structure ("Key deletion efficiency improvement method in a double array", Information Processing Society of Japan, Natural Language Processing Study Group, Vol. 2003, No. 23, 2002-NL-154), and the trie structure index generation unit 105 may generate the reading trie structure index data as a double array.
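As a rough illustration of the reading trie described above, the following Python sketch builds a tree from (headword, reading segmentation) pairs. It is a plain pointer-based trie, not a double array, and the names `TrieNode`, `build_reading_trie`, and `find_node` are hypothetical. FIG. 2 is not reproduced here, so the two sample phrases and their segmentations are assumptions chosen to match the readings used later in the description.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    """One node of the reading trie; it holds a single reading character."""
    char: str = ""
    depth: int = 0                                  # reading characters from the root
    children: dict = field(default_factory=dict)    # next character -> TrieNode
    word_end: bool = False                          # a word segment ends at this node
    headwords: list = field(default_factory=list)   # phrases whose reading ends here


def build_reading_trie(phrases):
    """phrases: iterable of (headword, [reading of word 1, reading of word 2, ...])."""
    root = TrieNode()
    for headword, reading_segments in phrases:
        node = root
        for segment in reading_segments:
            for ch in segment:
                if ch not in node.children:
                    node.children[ch] = TrieNode(ch, node.depth + 1)
                node = node.children[ch]
            node.word_end = True                    # mark the end of each word segment
        node.headwords.append(headword)
    return root


def find_node(root, reading):
    """Follow a reading from the root; return None if it is not in the trie."""
    node = root
    for ch in reading:
        node = node.children.get(ch)
        if node is None:
            return None
    return node


# Illustrative data only (assumed segmentation).
phrases = [
    ("東京国際空港", ["とうきょう", "こくさい", "くうこう"]),
    ("東京都立産業", ["とうきょう", "とりつ", "さんぎょう"]),
]
root = build_reading_trie(phrases)
shared = find_node(root, "とうきょう")   # common prefix node shared by both phrases
```

Under these assumptions the two phrases share the five nodes for "とうきょう" and branch at the sixth character, which mirrors the way node N5 precedes N6 in the description.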

The query candidate extraction unit 106 takes the reading trie structure index data as input and, in accordance with query candidate selection conditions 108, extracts for each node the query candidates to be registered in the query candidate dictionary 103. Specifically, for each node N_x of the reading trie structure index shown in FIG. 3, it extracts every descendant node N_y of N_x that is the end of a word segment, and makes a query candidate of the phrase formed by joining the readings of the nodes on the path from the root node N_r to the descendant node N_y. Note that FIG. 3 is simplified for ease of explanation and does not distinguish the nodes that are ends of word segments.

When the search target data 102 is large, the search range used by the query candidate extraction unit 106 to extract query candidates can be limited by specifying query candidate selection conditions 108 such as "descendant nodes N_y are limited to within W words of node N_x", "descendant nodes N_y are limited to within L characters of node N_x", or "the number of branches on the path from node N_x to a descendant node N_y is limited to at most B".
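The extraction step and two of the range limits can be sketched as follows. This is a minimal sketch under stated assumptions: the trie is a nested dictionary, the readings are romanized sample data, the function names `make_trie` and `extract_candidates` are hypothetical, and only the W-word and L-character limits are implemented (the branch limit B is left out).

```python
def make_trie(phrases):
    """phrases: list of reading segmentations, e.g. ["toukyou", "eki"]."""
    root = {"children": {}, "word_end": False}
    for segments in phrases:
        node = root
        for seg in segments:
            for ch in seg:
                node = node["children"].setdefault(
                    ch, {"children": {}, "word_end": False})
            node["word_end"] = True   # end of a word segment
    return root


def extract_candidates(root, prefix, max_words=None, max_chars=None):
    """Return the readings (root to N_y) of every word-end descendant N_y
    of the node N_x reached by `prefix`, within the optional limits."""
    node = root
    for ch in prefix:
        node = node["children"][ch]
    results = []
    # Each entry: (node, reading so far, characters below N_x, word ends passed)
    stack = [(node, prefix, 0, 0)]
    while stack:
        cur, reading, chars, words = stack.pop()
        for ch, child in cur["children"].items():
            c = chars + 1
            if max_chars is not None and c > max_chars:
                continue              # beyond L characters from N_x
            w = words + (1 if child["word_end"] else 0)
            if child["word_end"]:
                if max_words is not None and w > max_words:
                    continue          # beyond W words from N_x
                results.append(reading + ch)
            stack.append((child, reading + ch, c, w))
    return sorted(results)


# Illustrative romanized readings (assumed data).
trie = make_trie([
    ["toukyou", "eki"],
    ["toukyou", "eki", "mae"],
    ["toukyou", "tawaa"],
])
all_candidates = extract_candidates(trie, "tou")
one_word = extract_candidates(trie, "tou", max_words=1)
seven_chars = extract_candidates(trie, "tou", max_chars=7)
```

With no limits, every word-segment end below the prefix node becomes a candidate; the limits simply prune the depth-first walk, which is the effect the selection conditions are described as having on large data.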

Shortcuts between nodes are now described. For each node of the reading trie structure index, the query candidate dictionary generation unit 101 generates links that shortcut from a node N_x to shortcut nodes N_i, which are the shortcut destinations among the descendant nodes N_y of N_x.
FIG. 4 is an explanatory diagram showing shortcuts in the reading trie structure index. Suppose, for example, that the search keyword reading "とうきょうこ" ("toukyouko") has been input, that is, the node of interest has moved from the root node N_r to node N6 (hereinafter called reading N_r to N6). As described in detail later, the query candidate presentation device then presents "Tokyo International Airport No. 1" (東京国際空港第一), "Tokyo International Airport No. 2" (東京国際空港第二), and "Tokyo International Airport No. 3 Parking Lot" (東京国際空港第三駐車場) to the user as query candidates. The query candidate "Tokyo International Airport No. 1", for example, can be regarded as a shortcut from node N6, which corresponds to the "こ" of the input reading "とうきょうこ", to node N17.
When the user selects "Tokyo International Airport No. 1" from these query candidates, the query candidate presentation device moves the node of interest from N6 to the shortcut destination node N17 (hereinafter called shortcut N6→N17) and automatically displays the readings corresponding to the nodes N7, N8, ..., N16 on the shortcut path, reducing the user's input effort.
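The effect of following a shortcut can be illustrated with a small sketch. The table `SHORTCUTS` and the function `apply_shortcut` are hypothetical names, and the full reading of the candidate is inferred from the node numbers in the description (N6 for the typed reading, N17 for the destination), not taken from FIG. 4.

```python
# Typed reading -> {displayed candidate: full reading up to the shortcut node}.
SHORTCUTS = {
    "とうきょうこ": {
        "東京国際空港第一": "とうきょうこくさいくうこうだいいち",
    },
}


def apply_shortcut(typed, label, shortcuts=SHORTCUTS):
    """Move the node of interest to the shortcut destination.

    Returns the new reading and the characters filled in automatically."""
    target = shortcuts[typed][label]
    if not target.startswith(typed):
        raise ValueError("shortcut target must extend the typed reading")
    return target, target[len(typed):]


reading, filled = apply_shortcut("とうきょうこ", "東京国際空港第一")
# len("とうきょうこ") is 6 (node N6); len(reading) is 17 (node N17).
```

Here the reading length plays the role of the node depth, so one selection replaces typing the remaining characters of the reading one at a time.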

The score calculation unit 107 takes the query candidate data as input and, in accordance with the query candidate selection conditions 108, calculates a score that formulates how much input effort is saved by presenting the query candidates corresponding to a reading while the search keyword is still being entered. It then outputs the score data for the query candidates to the query candidate selection unit 109.
One of the query candidate selection conditions 108 input to the score calculation unit 107 is the maximum display candidate number D_max, which is the number of query candidates that can be displayed on one screen.

FIG. 5 is a block diagram showing the detailed configuration of the score calculation unit 107, which includes a keystroke number shortening score calculation unit 11, a post-confirmation score calculation unit 12, and a score totaling unit 13. FIG. 6 is an explanatory diagram showing the definitions of the scores used by the score calculation unit 107.
The keystroke number shortening score calculation unit 11 shown in FIG. 5 calculates the keystroke number shortening score C₁(N_x, N_i) of the query candidate corresponding to reading N_r to N_x (the phrase formed by joining the readings for "reading N_r to N_x" and "shortcut N_x→N_i"). The keystroke number shortening score indicates how many fewer keystrokes the user needs to enter a query candidate by selecting it when it is presented partway through entering the reading of the search keyword, compared with typing out the candidate's reading in full.

FIG. 7 is an explanatory diagram showing a score calculation example of the score calculation unit 107 for the case where the reading "とうきょうとり" ("toukyoutori"; reading N_r to N66) has been input. In FIG. 7, the maximum display candidate number D_max of the query candidate selection conditions 108 is defined as 3.
The keystroke number shortening scores C₁(N66, N_i) shown in FIG. 7 are explained here with reference to FIGS. 4 and 6. Consider the query candidate "Tokyo Metropolitan No. 1" (東京都立第一), which is shortcut N66→N71 for reading N_r to N66. Among the nodes N_j on the path N66→N71, the node N_z whose number of leaves is at most D_max and whose Depth(N_j) is smallest is N70, whose leaves are "ちゅうがく" (junior high school) and "こうこう" (high school). The keystroke number shortening score C₁(N66, N71) of "Tokyo Metropolitan No. 1" is therefore "6", from the formula for the case where N_z exists.
For the query candidate "Tokyo Metropolitan No. 2" (東京都立第二), which is shortcut N66→N81 for reading N_r to N66, no node N_z with at most D_max leaves exists on the path N66→N81. The keystroke number shortening score C₁(N66, N81) of "Tokyo Metropolitan No. 2" is therefore "6", from the formula for the case where N_z does not exist.
The keystroke number shortening score calculation unit 11 calculates the keystroke number shortening scores of the other shortcut query candidates for node N66, "Tokyo Metropolitan No. 3" (東京都立第三) and "Tokyo Metropolitan Industrial" (東京都立産業), in the same way.
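The search for the node N_z described above can be sketched as follows. The score formulas themselves are defined in FIG. 6, which is not reproduced here, so this sketch stops at locating N_z and does not attempt to compute C₁. Assumptions: the trie is a nested dictionary over romanized sample readings, a leaf is a node without children, both end nodes of the path are examined, and `build_trie`, `count_leaves`, and `find_nz_depth` are hypothetical names.

```python
def build_trie(readings):
    """Nested-dict trie: each key is one reading character."""
    root = {}
    for reading in readings:
        node = root
        for ch in reading:
            node = node.setdefault(ch, {})
    return root


def count_leaves(node):
    """Number of leaves (childless nodes) at or below `node`."""
    if not node:
        return 1
    return sum(count_leaves(child) for child in node.values())


def find_nz_depth(root, reading_x, reading_i, d_max):
    """Depth of the shallowest node on the path N_x -> N_i whose number of
    leaves is at most d_max, or None when no such node exists."""
    node = root
    for ch in reading_x:
        node = node[ch]
    depth = len(reading_x)
    if count_leaves(node) <= d_max:
        return depth
    for ch in reading_i[len(reading_x):]:
        node = node[ch]
        depth += 1
        if count_leaves(node) <= d_max:
            return depth
    return None


# Made-up sample: two leaves under "No. 1", four under "No. 2".
no1, no2 = "toritsudaiichi", "toritsudaini"
trie = build_trie(
    [no1 + s for s in ("chuugaku", "koukou")]
    + [no2 + s for s in ("chuugaku", "koukou", "shougakkou", "daigaku")]
)
nz_no1 = find_nz_depth(trie, "tori", no1, d_max=3)
nz_no2 = find_nz_depth(trie, "tori", no2, d_max=3)
nz_self = find_nz_depth(trie, no1, no1 + "koukou", d_max=3)
```

In this made-up sample the "No. 1" branch narrows to two leaves one character after the branches split, so an N_z exists there, while the "No. 2" branch still has four leaves at its shortcut node, so none exists for D_max = 3. The last call shows the situation where N_z coincides with N_x.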

The post-confirmation score calculation unit 12 shown in FIG. 5 handles the case where the user has selected the query candidate of shortcut N_x→N_i for reading N_r to N_x, confirming it as part of the search keyword. It predicts the keystroke reduction effect of the query candidates corresponding to the next shortcuts N_i→N_ii that follow this query candidate, and takes this as the post-confirmation score C₂(N_i). Here, N_ii is a shortcut destination node when the shortcut node N_i is taken as the starting point.

FIG. 8 is an explanatory diagram showing a post-confirmation score calculation example of the post-confirmation score calculation unit 12 for the case where "Tokyo International Airport" (東京国際空港) has been selected as the query candidate. If the shortcut node N13 of this query candidate is taken as the new starting node, the next shortcut nodes N_ii are, for example, N17, N31, N54, and N64. The next shortcuts N_i→N_ii to be combined are then N13→N17, N13→N41, N13→N54, and N13→N64. The post-confirmation score calculation unit 12 calculates the score S_c(N_i, {N_y}) of combinations of these shortcuts. Here, {N_y} denotes a set of next shortcut nodes N_ii, and its number of elements is at most the maximum display candidate number D_max per screen defined in the query candidate selection conditions 108. The score S_c is described later.
When there are D_max or more next shortcut nodes N_ii for N_i, there are also many combinations of shortcuts N_i→N_ii. In that case, the post-confirmation score calculation unit 12 calculates the score S_c(N_i, {N_y}) for each combination of at most D_max of the query candidates of these shortcuts N_i→N_ii, and takes the largest of them, Max(S_c(N_i, {N_y})), as the post-confirmation score C₂(N_i) of the shortcut node N_i.
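The maximization over combinations can be sketched as below. The combination score S_c is defined in FIG. 6, which is not reproduced here, so the sketch takes it as a caller-supplied function; `post_confirmation_score` and `combo_score` are hypothetical names, the per-shortcut values are made-up numbers rather than the values of FIG. 8, and returning 0 when there are no next shortcuts is an assumption.

```python
from itertools import combinations


def post_confirmation_score(next_nodes, d_max, combo_score):
    """C2(N_i) as the maximum of combo_score over every combination of at
    most d_max next shortcut nodes. Returns 0 if there are none (assumed)."""
    best = 0
    for size in range(1, min(d_max, len(next_nodes)) + 1):
        for combo in combinations(next_nodes, size):
            best = max(best, combo_score(combo))
    return best


# Made-up per-shortcut values, used only to exercise the search.
value = {"N17": 4, "N41": 3, "N54": 2, "N64": 1}


def combo_score(combo):
    """Stand-in for S_c: here simply the sum of the per-shortcut values."""
    return sum(value[n] for n in combo)


c2 = post_confirmation_score(list(value), d_max=3, combo_score=combo_score)
```

The exhaustive search grows quickly with the number of next shortcut nodes; since D_max is a small per-screen limit, that may be acceptable for dictionary generation done ahead of time, but that is a design judgment rather than something the description states.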

The post-confirmation score calculation example of FIG. 8 shows two patterns, 8A and 8B, as representative ways of choosing {N_y}. Pattern 8A is the combination of shortcuts N13→N17, N13→N41, and N13→N54; pattern 8B is the combination N13→N17, N13→N54, and N13→N64. The post-confirmation score calculation unit 12 compares the scores S_c(N13, {N_y}) of the patterns and takes the score of pattern 8A, which is the largest, as the post-confirmation score C₂(N13) for the query candidate "Tokyo International Airport" (node N13).

In the score calculation example of FIG. 7, when the next shortcut nodes N_ii are considered for a query candidate such as "Tokyo Metropolitan No. 1", the number of query candidates is at most the maximum display candidate number D_max for every shortcut node N_ii. According to the definitions in FIG. 6, the post-confirmation score C₂(N71) is given by the score S_c(N71, {N_y}). In this case, however, the node N_z is N71 itself, so every keystroke number shortening score C₁(N71, N_ii) is 0. The post-confirmation score C₂(N71) of the query candidate "Tokyo Metropolitan No. 1", which is the sum of C₁(N71, N_ii) over at most D_max next shortcut nodes N_ii, is therefore also 0.

The score totaling unit 13 shown in FIG. 5 uses the keystroke number shortening score calculated by the keystroke number shortening score calculation unit 11 and the post-confirmation score calculated by the post-confirmation score calculation unit 12 to calculate the shortcut score S_e(N_x, N_y) of each query candidate, and then uses the shortcut scores to calculate the score S_c(N_x, {N_y}) for each combination of query candidates. Here, {N_y} denotes a set of shortcut nodes N_i, and its number of elements is at most the maximum display candidate number D_max.

The score calculation example of FIG. 7 shows two patterns, 7A and 7B, as representative ways of choosing {N_y}. Pattern 7A is the case where, for reading N_r to N66, the query candidates displayed on one screen are "Tokyo Metropolitan No. 1", "Tokyo Metropolitan No. 2", and "Tokyo Metropolitan No. 3"; pattern 7B is the case where they are "Tokyo Metropolitan No. 1", "Tokyo Metropolitan No. 2", and "Tokyo Metropolitan Industrial". When the number of leaves L_z whose input the shortcuts make more efficient is the same, the score totaling unit 13 gives a higher score to pattern 7A, for which the reading length shortened by the shortcuts (Depth(N_z) − Depth(N_x) and Depth(N_i) − Depth(N_x)) is larger.
The conventional technique could not prioritize query candidates according to the reading length saved by presenting them, so it could not rank patterns 7A and 7B and could not narrow the display candidates down to an appropriate three. In contrast, this embodiment can prioritize query candidates in accordance with the maximum display candidate number D_max.
Within the same pattern, query candidates are ranked by shortcut score, with candidates having higher shortcut scores given priority.

FIG. 9 is an explanatory diagram showing a score calculation example of the score calculation unit 107 for the case where the reading "とうきょうこ" (reading N_r to N6) has been input. In FIG. 9 as well, the maximum display candidate number D_max of the query candidate selection conditions 108 is defined as 3. Pattern 9A is the case where, for reading N_r to N6, the combination of query candidates is "Tokyo International Airport No. 1", "Tokyo International Airport No. 2", and "Tokyo International Airport No. 3 Parking Lot"; pattern 9B is the case where it is "Tokyo International Airport No. 1", "Tokyo International Airport No. 3 Parking Lot", and "Tokyo International Airport No. 4 Parking Lot". In this example, when the reading length shortened by the shortcuts (Depth(N_z) − Depth(N_x) and Depth(N_i) − Depth(N_x)) is the same, the score totaling unit 13 gives a higher score to pattern 9A, which has the larger number of leaves L_z.
The conventional technique did not take the number of leaves into account when calculating scores, so it could not rank patterns 9A and 9B and could not narrow the display candidates down to an appropriate three. In contrast, this embodiment can give priority to query candidates that cover more leaves.

FIG. 10 shows an example in which the post-confirmation score contributes to the score. It is an explanatory diagram showing a score calculation example of the score calculation unit 107 for the case where the reading "とうきょう" (reading N_r to N5) has been input. Although the number of query candidates in pattern 10A is less than the maximum display candidate number D_max, the scores of patterns 10A and 10B are equal. This is because the score totaling unit 13 takes the post-confirmation score into account, so the score reflects the keystroke savings predicted after the user selects a query candidate.
Here, the post-confirmation score C₂(N13) of the query candidate "Tokyo International Airport" in patterns 10A and 10B is the post-confirmation score calculated by the post-confirmation score calculation unit 12 in FIG. 8. It corresponds to the score S_c(N13, {N17, N31, N54}) of pattern 8A.

Furthermore, the score calculation of the score calculation unit 107 described above makes it possible to select query candidates appropriately, according to the search target data 102 and the screen size defined by the maximum display candidate number D_max, even for parts of the reading trie structure index where the tree is unbalanced. For example, FIG. 11 shows a score calculation example of the score calculation unit 107 when the maximum display candidate number D_max is defined as 5. Here, the scores of patterns 11A and 11B are shown as representative combinations of {N_y} when “とうきょうとり” (toukyoutori; readings N_r to N66) is input as the reading. In this example, the score totaling unit 13 gives the higher score to pattern 11B, which covers a wider range of leaves.

The query candidate selection unit 109 in FIG. 1 takes as input the maximum display candidate number D_max of the query candidate selection condition 108 and the score data, selects the combination of query candidates with the maximum score on the basis of the score S_c(N_x, {N_y}), and generates shortcut information. The query candidate selection unit 109 then combines the generated shortcut information with the important phrase data extracted by the important phrase extraction unit 104 and the reading trie structure index data generated by the trie structure index generation unit 105 to produce query candidate dictionary data, and registers it in the query candidate dictionary 103.

FIG. 12 is a block diagram showing the configuration of the query candidate dictionary 103 of the query candidate presentation device. The query candidate dictionary 103 comprises a heading list 21, a reading trie structure index 22, and shortcut information 23. FIG. 13 shows an example of the heading list 21 registered in the query candidate dictionary 103, and FIG. 14 shows an example of the shortcut information 23 registered in the query candidate dictionary 103. Examples of the reading trie structure index 22 are shown in FIGS. 3 and 4.

The heading list 21 in FIG. 13 consists of record numbers, which are referenced from other data such as the reading trie structure index 22 and the shortcut information 23, and headings, which are presented to the user as query candidates.
The shortcut information 23a, 23b, and 23c in FIG. 14 expresses the shortcuts for a node N_x as a list. Each list element consists of four fields, (A) to (D): (A) a pointer to the next element of the list, (B) the shortcut node N_i, (C) a record number referring to the heading list, and (D) the shortcut score S_e(N_x, N_y).
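The four-field list element described for FIG. 14 can be pictured with a small Python sketch. The class name `ShortcutInfo`, the helper `iter_shortcuts`, and the score values are assumptions made for illustration; the node numbers (N17, N31, N54) and record numbers (R3, R6, R9) are the ones the description gives for node N6.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class ShortcutInfo:
    """One list element of the shortcut information linked to a node N_x."""
    next: Optional["ShortcutInfo"]  # (A) pointer to the next list element
    shortcut_node: int              # (B) shortcut node N_i
    record_number: int              # (C) record number in the heading list
    score: float                    # (D) shortcut score S_e(N_x, N_y)


def iter_shortcuts(head: Optional[ShortcutInfo]) -> Iterator[ShortcutInfo]:
    """Walk the list by following field (A) until it runs out."""
    while head is not None:
        yield head
        head = head.next


# Shortcuts linked to node N6. The scores are placeholders, not values
# taken from the figures.
info_23c = ShortcutInfo(None, 54, 9, 1.0)
info_23b = ShortcutInfo(info_23c, 31, 6, 2.0)
info_23a = ShortcutInfo(info_23b, 17, 3, 3.0)
shortcuts_by_node = {6: info_23a}
```

Keeping the list head in a per-node table mirrors the way the shortcut information is linked to node N_x. A plain Python list per node would serve equally well in practice.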

The query candidate selection unit 109 selects, from all combinations of query candidates for the node N_x scored by the score calculation unit 107, the combination with the maximum score, and links the shortcut information for those query candidates to the node N_x as shown in FIG. 14. FIG. 14 shows the shortcut information linked to node N6, which corresponds to the combination of pattern 9A in FIG. 9.
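The selection step, picking the best-scoring combination of at most D_max candidates, can be sketched as an exhaustive search. This is only an illustration: the function name is hypothetical, the scoring function is passed in as a parameter because the definition of S_c is given in the figures rather than here, and the toy score (a sum of made-up leaf counts) and node N60 are assumptions.

```python
from itertools import combinations
from typing import Callable, Sequence, Tuple


def select_best_combination(
    candidates: Sequence[int],
    d_max: int,
    score: Callable[[Tuple[int, ...]], float],
) -> Tuple[Tuple[int, ...], float]:
    """Return the combination of up to d_max candidates with the highest score."""
    best: Tuple[int, ...] = ()
    best_score = float("-inf")
    for size in range(1, min(d_max, len(candidates)) + 1):
        for combo in combinations(candidates, size):
            value = score(combo)
            if value > best_score:  # keep the first combination on ties
                best, best_score = combo, value
    return best, best_score


# Toy stand-in for S_c: total number of leaves covered (made-up counts).
leaf_counts = {17: 3, 31: 2, 54: 2, 60: 1}
best, best_score = select_best_combination(
    list(leaf_counts), 3, lambda combo: sum(leaf_counts[n] for n in combo)
)
```

Exhaustive enumeration is affordable here because D_max is bounded by what fits on one screen and the dictionary is built ahead of time, not while the user is typing.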

FIG. 15 is a flowchart showing the operation of the query candidate dictionary generation unit 101 of the query candidate presentation device according to Embodiment 1. The operation of the query candidate dictionary generation unit 101 is described below with reference to FIGS. 1 to 15 as appropriate.
Step ST1 is the important phrase extraction process. The important phrase extraction unit 104 analyzes the search target data 102, extracts important phrases using screen layout information, document structure information, and the like, and also performs word segmentation and reading assignment. The important phrase extraction unit 104 then outputs the important phrase data shown in FIG. 2 to the trie structure index generation unit 105.

Step ST2 is the trie structure index generation process. The trie structure index generation unit 105 uses the important phrase information extracted in step ST1 to generate the reading trie structure index shown in FIG. 3, and outputs the reading trie structure index data to the query candidate extraction unit 106.
Step ST3 is the query candidate extraction process. Using the query candidate selection condition 108 and the reading trie structure index generated in step ST2, the query candidate extraction unit 106 extracts all query candidates to be registered in the query candidate dictionary 103, and outputs the resulting query candidate data to the score calculation unit 107.
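As a rough picture of the reading trie structure index built in step ST2, the sketch below inserts (reading, heading) pairs into a character trie keyed by the kana of the reading. The class and function names are hypothetical, the two sample phrases are illustrative rather than a copy of FIG. 2, and the record numbers are simply assigned in input order.

```python
from typing import Dict, Iterable, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}  # reading character -> child
        self.record: Optional[int] = None          # set where a phrase ends


def build_reading_trie(phrases: Iterable[Tuple[str, str]]) -> TrieNode:
    """Insert each phrase's reading one character at a time."""
    root = TrieNode()
    for record, (reading, _heading) in enumerate(phrases, start=1):
        node = root
        for ch in reading:
            node = node.children.setdefault(ch, TrieNode())
        node.record = record
    return root


def find(root: TrieNode, reading: str) -> Optional[TrieNode]:
    """Follow a reading from the root; None if no such path exists."""
    node: Optional[TrieNode] = root
    for ch in reading:
        node = node.children.get(ch) if node is not None else None
        if node is None:
            return None
    return node


def count_phrases(node: TrieNode) -> int:
    """Number of registered phrases at or below this node."""
    own = 1 if node.record is not None else 0
    return own + sum(count_phrases(child) for child in node.children.values())


sample_phrases = [
    ("とうきょうこくさいくうこうだいいち", "Tokyo International Airport First"),
    ("とうきょうこくさいくうこうだいに", "Tokyo International Airport Second"),
]
trie = build_reading_trie(sample_phrases)
```

A count such as `count_phrases` is the kind of per-node quantity a leaf-aware score needs, which is why it is convenient to compute it directly on the trie.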

Step ST4 is the score calculation process, in which the score calculation unit 107 calculates a score for each query candidate extracted in step ST3. For each query candidate, the keystroke number shortening score calculation unit 11 calculates the keystroke number shortening score, and the post-confirmation score calculation unit 12 calculates the post-confirmation score. The score totaling unit 13 then uses the keystroke number shortening score and the post-confirmation score to calculate a score for each combination of query candidates, and outputs the score data to the query candidate selection unit 109.

Step ST5 is the query candidate selection process. Using the score data calculated in step ST4, the query candidate selection unit 109 selects the combination of query candidates with the maximum score, generates query candidate dictionary data comprising the heading list 21, the reading trie structure index 22, and the shortcut information 23 shown in FIGS. 12 to 14, and registers it in the query candidate dictionary 103.
When the same score calculation has been performed for all nodes and the maximum-score combination of query candidates has been registered for each node (step ST6 “Yes”), the query candidate dictionary generation process ends. As long as a node remains whose combination of query candidates has not been determined, the process returns to the query candidate extraction process of step ST3 (step ST6 “No”), and the query candidate extraction unit 106 extracts the query candidates for the next node.

Next, the query candidate presentation unit 110 is described. The query candidate presentation unit 110 shown in FIG. 1 comprises the query candidate dictionary 103 generated by the query candidate dictionary generation unit 101, an input unit 112 that accepts the reading character string 111 of a search keyword entered by the user, and a query candidate display unit 113 that refers to the query candidate dictionary 103 on the basis of the reading character string 111 and displays the next query candidates.
The reading character string 111 of the search keyword is either the reading character string entered by the user or the user's selection among the query candidates displayed by the query candidate display unit 113. Input is performed with a software keyboard, numeric keypad, or the like displayed on a touch panel, and the input unit 112 and the query candidate display unit 113 are both realized by this touch panel. The maximum display candidate number D_max of the query candidate selection condition 108 described above is defined according to the size of the display screen of the touch panel that realizes the query candidate display unit 113.

The input unit 112 accepts the reading character string 111 of the search keyword and the query candidate selection result 111a from the user, and outputs this information to the query candidate display unit 113.
On receiving the reading character string 111 and the query candidate selection result 111a, the query candidate display unit 113 refers to the query candidate dictionary 103 and moves the node of interest in the reading trie structure index. It then acquires the shortcut information for the destination node from the query candidate dictionary 103, and creates and presents a list of query candidates.

Specifically, in the initial state, in which no reading character string 111 has been entered, the query candidate display unit 113 takes the root node N_r of the reading trie structure index 22 in the query candidate dictionary 103 as the node of interest. When a reading is entered, the query candidate display unit 113 reads it one character at a time, moves the node of interest to the descendant node that matches the entered reading, and takes that node as the new node of interest.
On the other hand, when a query candidate selection result 111a, indicating which of the presented query candidates the user selected, is input, the query candidate display unit 113 refers to “(B) shortcut node N_i” of the shortcut information shown in FIG. 14 and moves to the corresponding node number.
Next, for every piece of shortcut information linked to the destination node number, the query candidate display unit 113 refers to “(C) record number referring to the heading list”, acquires the heading with that record number from the heading list shown in FIG. 13, and presents the heading to the user as a query candidate. In doing so, the query candidate display unit 113 controls the display order so that query candidates with a higher “(D) shortcut score S_e(N_x, N_y)” are displayed higher.
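The lookup just described (transition by reading, then shortcut lookup, then heading lookup ordered by score) can be sketched as follows. The table layout, function names, and score values are assumptions; node 0 stands in for the root N_r, and the node and record numbers follow the N6 example in the description.

```python
from typing import Dict, List, Optional, Tuple

# (shortcut node N_i, heading-list record number, shortcut score)
Shortcut = Tuple[int, int, float]


def transition(children: Dict[int, Dict[str, int]], node: int,
               reading: str) -> Optional[int]:
    """Move the node of interest one reading character at a time."""
    for ch in reading:
        nxt = children.get(node, {}).get(ch)
        if nxt is None:
            return None  # no node matches the entered reading
        node = nxt
    return node


def headings_for(node: int, shortcuts: Dict[int, List[Shortcut]],
                 heading_list: Dict[int, str]) -> List[str]:
    """Headings of the shortcuts linked to a node, highest score first."""
    ranked = sorted(shortcuts.get(node, []), key=lambda s: s[2], reverse=True)
    return [heading_list[record] for _node, record, _score in ranked]


children = {0: {"と": 1}, 1: {"う": 2}, 2: {"き": 3},
            3: {"ょ": 4}, 4: {"う": 5}, 5: {"こ": 6}}
# Stored out of order on purpose; the scores are placeholders.
shortcuts = {6: [(31, 6, 2.0), (17, 3, 3.0), (54, 9, 1.0)]}
heading_list = {
    3: "Tokyo International Airport First",
    6: "Tokyo International Airport Second",
    9: "Tokyo International Airport Third Parking Lot",
}

node_of_interest = transition(children, 0, "とうきょうこ")
shown = headings_for(node_of_interest, shortcuts, heading_list)
```

Sorting at display time keeps the sketch short. A dictionary that already stores each node's shortcuts in score order would avoid the sort on every keystroke.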

Next, the operation of the query candidate presentation unit 110 is described. FIG. 16 is a flowchart showing the operation of the query candidate presentation unit 110 of the query candidate presentation device according to Embodiment 1 of the present invention.
Step ST11 is the character input process or the query candidate selection process. Immediately after the query candidate presentation process starts, that is, in step ST11 in the initial state, the input unit 112 receives the reading character string 111 of the search keyword entered by the user.

Step ST12 is the state transition process in the reading trie structure index. In accordance with the reading character string 111 of the search keyword accepted in step ST11, the query candidate display unit 113 updates the node of interest in the reading trie structure index shown in FIG. 3. For example, when “とうきょうこ” (toukyouko; readings N_r to N6) is input, the query candidate display unit 113 moves the node of interest from the root node N_r to N6.

Step ST13 is the end determination process. If, in step ST12, none of the descendant nodes of the node of interest matches the entered reading, there is no query candidate corresponding to the reading entered by the user, so the query candidate display unit 113 ends the query candidate presentation process (step ST13 “No”).
If a destination node exists, the query candidate display unit 113 proceeds to the next step (step ST13 “Yes”). When the reading “とうきょうこ” (toukyouko) is input, the destination node N6 exists, so the process proceeds.

Step ST14 is the query candidate acquisition and display process, in which the query candidate display unit 113 refers to the shortcut information corresponding to the destination node and acquires a list of query candidates.
When the reading “とうきょうこ” (toukyouko) is input, the shortcut information corresponding to the destination node of interest N6 is as shown in FIG. 14. The query candidate display unit 113 first refers to the shortcut information 23a (shortcut N6→N17), the shortcut information 23b (shortcut N6→N31), and the shortcut information 23c (shortcut N6→N54) corresponding to N6, and acquires the heading list record numbers R3, R6, and R9. The query candidate display unit 113 then acquires the query candidates with those record numbers, “Tokyo International Airport First”, “Tokyo International Airport Second”, and “Tokyo International Airport Third Parking Lot”, from the heading list 21 of FIG. 13 and displays them on the screen.

FIG. 17 is an explanatory diagram showing examples of the query candidate presentation screen that the query candidate display unit 113 of the query candidate presentation device displays in response to user input. As shown in FIG. 17(a), the query candidates acquired by the query candidate display unit 113 are displayed on the screen as query candidate character strings.
The process then returns to step ST11.

When “と”, “う”, “き”, “ょ”, “う”, and “こ” are entered as the reading character string 111 of the search keyword, the query candidate presentation unit 110 repeats steps ST11 to ST13, and the node of interest moves from the root node N_r through N1, N2, N3, N4, and N5 to N6. At this point the display screen of the query candidate display unit 113 is in the state shown in FIG. 17(a). When the process returns to step ST11 and the user makes an input on the input unit 112 selecting the query candidate “Tokyo International Airport Second”, as shown in FIG. 17(b), the input unit 112 performs the query candidate selection process of accepting that input as the query candidate selection result 111a.

Then, in step ST12, the query candidate display unit 113 moves the node of interest to the shortcut node N31 in accordance with the shortcut information for the query candidate “Tokyo International Airport Second”. At the same time, the query candidate display unit 113 supplements the reading input with the reading character string “くさいくうこうだいに” (kusaikuukoudaini) that lies on the shortcut path (N6→N31) of the selected query candidate, and displays it on the screen (FIG. 17(c)).
Because a destination node exists, the process proceeds (step ST13 “Yes”). In the following step ST14, the query candidate display unit 113 acquires the next query candidates for the shortcut node N31 of the query candidate “Tokyo International Airport Second”, namely “Tokyo International Airport Second Terminal” and “Tokyo International Airport Second Parking Lot”, and displays them on the screen.

The process returns to step ST11 once more. When the user selects either “Tokyo International Airport Second Terminal” or “Tokyo International Airport Second Parking Lot”, the end determination in step ST13 confirms that no destination node exists, and the query candidate presentation process ends here.
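The walkthrough above (type six characters, select a candidate, have the skipped reading filled in, select again, stop) can be traced with a small state holder. This is a sketch under assumptions: the class and field names are hypothetical, each shortcut stores the reading it skips as a string, node numbers 40 and 41 and all scores are made up, and the end check is simplified to "no further candidates".

```python
from typing import Dict, List


class PresentationSession:
    """Hypothetical state holder for the loop of steps ST11 to ST14."""

    def __init__(self, children: Dict[int, Dict[str, int]],
                 shortcuts: Dict[int, List[dict]], root: int = 0) -> None:
        self.children = children
        self.shortcuts = shortcuts
        self.node = root      # node of interest, initially the root N_r
        self.reading = ""     # reading entered or supplemented so far
        self.finished = False

    def candidates(self) -> List[dict]:
        # ST14: shortcuts linked to the node of interest, best score first
        return sorted(self.shortcuts.get(self.node, []),
                      key=lambda s: s["score"], reverse=True)

    def type_char(self, ch: str) -> List[str]:
        # ST11 (character input) followed by ST12 (state transition)
        nxt = self.children.get(self.node, {}).get(ch)
        if nxt is None:       # ST13 "No": nothing matches the reading
            self.finished = True
            return []
        self.node = nxt
        self.reading += ch
        return [c["heading"] for c in self.candidates()]

    def select(self, heading: str) -> List[str]:
        # ST11 (candidate selection) followed by ST12 (jump along the shortcut)
        chosen = next(c for c in self.candidates() if c["heading"] == heading)
        self.node = chosen["node"]
        self.reading += chosen["skipped"]  # supplement the skipped reading
        following = [c["heading"] for c in self.candidates()]
        if not following:     # simplified end check for ST13
            self.finished = True
        return following


children = {0: {"と": 1}, 1: {"う": 2}, 2: {"き": 3},
            3: {"ょ": 4}, 4: {"う": 5}, 5: {"こ": 6}}
shortcuts = {
    6: [
        {"node": 17, "heading": "Tokyo International Airport First",
         "skipped": "くさいくうこうだいいち", "score": 3.0},
        {"node": 31, "heading": "Tokyo International Airport Second",
         "skipped": "くさいくうこうだいに", "score": 2.0},
    ],
    31: [
        {"node": 40, "heading": "Tokyo International Airport Second Terminal",
         "skipped": "たーみなる", "score": 2.0},
        {"node": 41, "heading": "Tokyo International Airport Second Parking Lot",
         "skipped": "ちゅうしゃじょう", "score": 1.0},
    ],
}

session = PresentationSession(children, shortcuts)
for ch in "とうきょうこ":
    shown = session.type_char(ch)
after_first = session.select("Tokyo International Airport Second")
after_second = session.select("Tokyo International Airport Second Terminal")
```

In this trace the user types six characters and makes two selections, while the session's `reading` ends up holding the full reading of the chosen heading.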

FIG. 18 is a screen example of query candidate presentation when the maximum display candidate number D_max is set to 5, and shows the case where “とうきょうとり” (toukyoutori; readings N_r to N66) is input. By performing the query candidate presentation process with the query candidate dictionary 103 generated on the basis of the score calculation example of pattern 11B shown in FIG. 11, the query candidate presentation unit 110 can display query candidates while controlling the number of items and their display order, as in the screen example of FIG. 18.

As described above, according to Embodiment 1, the query candidate dictionary 103 is generated using the score calculation unit 107, which calculates scores by taking into account the number of keystrokes saved by presenting query candidates for the user to select, the post-confirmation score indicating the keystroke reduction expected when the remaining character string is entered after the user selects a query candidate, the maximum display candidate number (the number of items that can be displayed on one screen), and the like. The query candidate presentation unit 110 is configured to present query candidates by referring to this query candidate dictionary 103. As a result, appropriate query candidates can be presented according to the screen size, the bias in the readings of the search target data, and the like, and the user can enter search keywords more efficiently.

Embodiment 2.
FIG. 19 is a block diagram showing the detailed configuration of the score calculation unit of the query candidate presentation device according to Embodiment 2 of the present invention. As shown in FIG. 19, the score calculation unit 107 of the present embodiment comprises a keystroke number shortening score calculation unit 11, a post-confirmation score calculation unit 12, a score totaling unit 13, and a candidate selection score calculation unit 14.
Apart from the internal configuration of the score calculation unit 107, the configuration of the query candidate presentation device according to the present embodiment is the same as that of Embodiment 1 shown in FIG. 1, so illustration and detailed description of everything other than the score calculation unit 107 are omitted.

FIG. 20 is an explanatory diagram showing the definitions of the scores used by the score calculation unit 107. The score calculation unit 107 of the present embodiment differs from that of Embodiment 1 in that the candidate selection score calculation unit 14 calculates a candidate selection score C_sel(N_i), and the keystroke number shortening score calculation unit 11 uses the candidate selection score C_sel(N_i) to calculate the keystroke number shortening score C₁(N_x, N_i).

In Embodiment 1, the maximum display candidate number D_max, the number of query candidates that can be displayed on one screen, was set as the query candidate selection condition 108. It is also possible, however, to set D_max to a number of query candidates greater than the number of items that fit on one screen. In that case, the query candidate presentation unit 110 displays the query candidates across several pages, and the user moves between pages with a scroll bar, page feed button, or the like to select a query candidate.
When query candidates are presented across multiple pages in this way, the candidate selection score calculation unit 14 calculates the candidate selection score C_sel(N_i), which represents the user's input effort, an effort that grows with the number of pages.

The keystroke number shortening score calculation unit 11 calculates a keystroke number shortening score that reflects the candidate selection score from the candidate selection score calculation unit 14. By extending the score definitions of the score calculation unit 107 as shown in FIG. 20, the score calculation unit 107 can calculate scores that take into account how much the query candidates displayed on the second and third pages contribute to reducing keystrokes, even when the query candidate presentation unit 110 presents query candidates across multiple pages using a scroll bar, page feed button, or the like.
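The exact definition of C_sel(N_i) is given in FIG. 20 and is not reproduced here. Purely to illustrate the idea of an effort term that grows with the page on which a candidate appears, the sketch below assumes that each page turn costs one extra operation and subtracts that cost from the keystrokes a shortcut would save. The function names, the zero-based rank, and the cost model are all assumptions, not the patent's formula.

```python
def page_turns(rank: int, items_per_page: int) -> int:
    """Page turns needed to reach the candidate at zero-based display rank."""
    return rank // items_per_page


def net_keystroke_saving(saved_keystrokes: int, rank: int,
                         items_per_page: int, turn_cost: float = 1.0) -> float:
    """Keystrokes saved by a shortcut minus the assumed paging effort."""
    return saved_keystrokes - turn_cost * page_turns(rank, items_per_page)


# With 3 items per page, the 8th candidate (rank 7) sits on the third page.
example = net_keystroke_saving(saved_keystrokes=10, rank=7, items_per_page=3)
```

Under this assumed model, a candidate far down the list is still worth registering as long as its shortcut saves more keystrokes than the paging costs.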

As described above, according to Embodiment 2, the candidate selection score calculation unit 14 is provided in the score calculation unit 107, which calculates scores by taking into account the number of keystrokes saved by presenting query candidates for the user to select, the post-confirmation score indicating the keystroke reduction expected when the remaining character string is entered after the user selects a query candidate, the maximum display candidate number (the number of items that can be displayed on one screen), and the like. The score calculation unit 107 can therefore generate the query candidate dictionary 103 with a score calculation that properly accounts for the input effort involved when the query candidate presentation unit 110 presents query candidates across multiple pages. Because the query candidate presentation unit 110 is configured to present query candidates by referring to this query candidate dictionary 103, appropriate query candidates can be presented according to the screen size, the bias in the readings of the search target data, and the like, and the user's input efficiency can be improved.

Embodiment 3.
FIG. 21 is a block diagram showing the detailed configuration of the score calculation unit of the query candidate presentation device according to Embodiment 3 of the present invention. The score calculation unit 107 of the present embodiment adds a candidate importance score calculation unit 15 to the configuration of the score calculation unit 107 of Embodiment 2 shown in FIG. 19.
Apart from the internal configuration of the score calculation unit 107, the configuration of the query candidate presentation device according to the present embodiment is the same as that of Embodiment 1 shown in FIG. 1, so drawings and detailed description of everything other than the score calculation unit 107 are omitted.

FIG. 22 is an explanatory diagram showing the definitions of the scores used by the score calculation unit 107. The score calculation unit 107 of the present embodiment differs from those of Embodiments 1 and 2 in that the candidate importance score calculation unit 15 calculates a candidate importance score C_L(N_i), and the keystroke number shortening score calculation unit 11 uses the candidate importance score C_L(N_i) to calculate the keystroke number shortening score C₁(N_x, N_i).

For example, when the query candidate dictionary 103 is generated with place names or facility names such as those shown in FIG. 2 as query candidates, the important phrase extraction unit 104 assigns a high importance to well-known place names or facility names in the important phrase extraction process (in the case of a manual, importance can be assigned using cues such as screen layout information), and the trie structure index generation unit 105 gives each node a weight reflecting the importance of the phrase (Wgt(N_j) in FIG. 22) in the trie structure index generation process.
The candidate importance score calculation unit 15 uses the importance Wgt(N_j) of the phrases corresponding to the leaf nodes N_j that are descendants of the shortcut node N_i to calculate the candidate importance score C_L(N_i), which indicates the importance of the query candidate.

The keystroke number shortening score calculation unit 11 calculates a keystroke number shortening score that reflects the candidate importance score from the candidate importance score calculation unit 15. By extending the score definitions of the score calculation unit 107 as shown in FIG. 22, particularly important phrases can be displayed preferentially as query candidates regardless of the number of reading characters or the number of leaf nodes.
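The definition of C_L(N_i) appears in FIG. 22 and is not reproduced here. As one possible reading of "uses the importance Wgt(N_j) of the descendant leaf nodes", the sketch below simply sums the weights of the leaves below a shortcut node. The summation, the default weight of 1.0, the function name, and the node numbers 40 and 41 are assumptions made for illustration.

```python
from typing import Dict, List


def candidate_importance(node: int, children: Dict[int, List[int]],
                         weight: Dict[int, float]) -> float:
    """Sum the weights Wgt(N_j) of the leaf nodes at or below `node`."""
    kids = children.get(node, [])
    if not kids:
        # A leaf contributes its own weight (1.0 if none was assigned).
        return weight.get(node, 1.0)
    return sum(candidate_importance(kid, children, weight) for kid in kids)


# Hypothetical subtree: N31 has two leaf descendants with different weights.
subtree = {31: [40, 41]}
weights = {40: 2.0, 41: 0.5}
importance_n31 = candidate_importance(31, subtree, weights)
```

With unit weights this reduces to a plain leaf count, so an importance-weighted score of this kind would extend a leaf-count-based score without replacing it.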

As described above, according to Embodiment 3, the candidate importance score calculation unit 15 is provided in the score calculation unit 107, which calculates scores by taking into account the number of keystrokes saved by presenting query candidates for the user to select, the post-confirmation score indicating the keystroke reduction expected when the remaining character string is entered after the user selects a query candidate, the maximum display candidate number (the number of items that can be displayed on one screen), and the like. The score calculation unit 107 can therefore generate the query candidate dictionary 103 with a score calculation that gives high priority to particularly important phrases regardless of the number of reading characters or the number of leaf nodes. Because the query candidate presentation unit 110 is configured to present query candidates by referring to this query candidate dictionary 103, appropriate query candidates can be presented according to the screen size, the bias in the readings of the search target data, the importance of the phrases, and the like, and the user's input efficiency can be improved.

A block diagram showing the configuration of the query candidate presentation device according to Embodiment 1 of the present invention.
An explanatory diagram showing examples of important phrases extracted by the important phrase extraction unit of the query candidate presentation device according to Embodiment 1.
An explanatory diagram showing the reading trie structure index generated by the trie structure index generation unit of the query candidate presentation device according to Embodiment 1.
An explanatory diagram showing shortcuts in the reading trie structure index of the query candidate presentation device according to Embodiment 1.
A block diagram showing the detailed configuration of the score calculation unit of the query candidate presentation device according to Embodiment 1.
An explanatory diagram showing the definition of the score used by the score calculation unit of the query candidate presentation device according to Embodiment 1.
An explanatory diagram showing an example of score calculation by the score calculation unit of the query candidate presentation device according to Embodiment 1.
An explanatory diagram showing an example of post-confirmation score calculation by the post-confirmation score calculation unit of the query candidate presentation device according to Embodiment 1.
An explanatory diagram showing an example of score calculation by the score calculation unit of the query candidate presentation device according to Embodiment 1.
An explanatory diagram showing an example of score calculation by the score calculation unit of the query candidate presentation device according to Embodiment 1.
An explanatory diagram showing an example of score calculation by the score calculation unit of the query candidate presentation device according to Embodiment 1, for the case where the maximum number of display candidates D_max is 5.
A block diagram showing the configuration of the query candidate dictionary of the query candidate presentation device according to Embodiment 1.
An explanatory diagram showing an example of the heading list registered in the query candidate dictionary of the query candidate presentation device according to Embodiment 1.
An explanatory diagram showing an example of the shortcut information registered in the query candidate dictionary of the query candidate presentation device according to Embodiment 1.
A flowchart showing the operation of the query candidate dictionary generation unit of the query candidate presentation device according to Embodiment 1.
A flowchart showing the operation of the query candidate presentation unit of the query candidate presentation device according to Embodiment 1.
An explanatory diagram showing an example of the query candidate presentation screen that the query candidate display unit of the query candidate presentation device according to Embodiment 1 displays in response to user input.
An explanatory diagram showing an example of the query candidate presentation screen displayed by the query candidate display unit of the query candidate presentation device according to Embodiment 1 when the maximum number of display candidates D_max is 5.
A block diagram showing the detailed configuration of the score calculation unit of the query candidate presentation device according to Embodiment 2 of the present invention.
An explanatory diagram showing the definition of the score used by the score calculation unit of the query candidate presentation device according to Embodiment 2.
A block diagram showing the detailed configuration of the score calculation unit of the query candidate presentation device according to Embodiment 3 of the present invention.
An explanatory diagram showing the definition of the score used by the score calculation unit of the query candidate presentation device according to Embodiment 3.

Explanation of symbols

11 keystroke number shortening score calculation unit, 12 post-confirmation score calculation unit, 13 score totaling unit, 14 candidate selection score calculation unit, 15 candidate importance score calculation unit, 21 heading list, 22 reading trie structure index, 23, 23a, 23b, 23c shortcut information, 101 query candidate dictionary generation unit, 102 search target data, 103 query candidate dictionary, 104 important phrase extraction unit, 105 trie structure index generation unit, 106 query candidate extraction unit, 107 score calculation unit, 108 query candidate selection condition, 109 query candidate selection unit, 110 query candidate presentation unit, 111 reading character string of search keyword, 111a query candidate selection result, 112 input unit, 113 query candidate display unit.

Claims

A query candidate presentation device including a query candidate display unit that acquires, from a query candidate dictionary, query candidates beginning with the reading character string of a search keyword input one character at a time, and presents them as candidates for the search keyword, the device comprising:
a query candidate extraction unit that extracts character strings serving as the query candidates from data in which phrases extracted from search target data are hierarchically structured based on the reading characters of the phrases;
a score calculation unit that calculates a score indicating the reduction in input effort for the search keyword obtained by using a character string beginning with the reading characters of the search keyword as a query candidate; and
a query candidate selection unit that, for each reading character of the phrases extracted from the search target data, registers the character string as a query candidate in the query candidate dictionary based on the score calculated by the score calculation unit.

The query candidate presentation device according to claim 1, wherein the score calculation unit comprises:
a keystroke number shortening score calculation unit that calculates a keystroke number shortening score indicating the number of keystrokes of the search keyword saved by using a character string beginning with the reading characters of the search keyword as a query candidate;
a post-confirmation score calculation unit that calculates a post-confirmation score indicating the predicted reduction in the number of keystrokes of the search keyword obtained by presenting new query candidates beginning with the character string once the character string has been confirmed as part of the search keyword; and
a score totaling unit that uses the keystroke number shortening score and the post-confirmation score to calculate the score indicating the reduction in input effort for the search keyword obtained by using the character string as a query candidate.

The query candidate presentation device according to claim 1, wherein the score calculation unit comprises:
a keystroke number shortening score calculation unit that calculates a keystroke number shortening score indicating the number of keystrokes of the search keyword saved by using a character string beginning with the reading characters of the search keyword as a query candidate;
a post-confirmation score calculation unit that calculates a post-confirmation score indicating the predicted reduction in the number of keystrokes of the search keyword obtained by presenting new query candidates beginning with the character string once the character string has been confirmed as part of the search keyword;
a candidate selection score calculation unit that calculates, based on the number of query candidates that the query candidate display unit can present simultaneously, a candidate selection score indicating the increase in input effort for the search keyword caused by presenting query candidates across a plurality of pages; and
a score totaling unit that uses the keystroke number shortening score, the post-confirmation score, and the candidate selection score to calculate the score indicating the reduction in input effort for the search keyword obtained by presenting the character string as a query candidate.

The query candidate presentation device according to claim 1, wherein the score calculation unit comprises:
a keystroke number shortening score calculation unit that calculates a keystroke number shortening score indicating the number of keystrokes of the search keyword saved by using a character string beginning with the reading characters of the search keyword as a query candidate;
a post-confirmation score calculation unit that calculates a post-confirmation score indicating the predicted reduction in the number of keystrokes of the search keyword obtained by presenting new query candidates beginning with the character string once the character string has been confirmed as part of the search keyword;
a candidate selection score calculation unit that calculates, based on the number of query candidates that the query candidate display unit can present simultaneously, a candidate selection score indicating the increase in input effort for the search keyword caused by presenting query candidates across a plurality of pages;
a candidate importance score calculation unit that calculates a candidate importance score indicating the importance of the character string based on the importance of the phrases extracted from the search target data; and
a score totaling unit that uses the keystroke number shortening score, the post-confirmation score, the candidate selection score, and the candidate importance score to calculate the score indicating the reduction in input effort for the search keyword obtained by using the character string as a query candidate.

The query candidate presentation device according to claim 1, wherein the query candidate selection unit registers in the query candidate dictionary, as query candidates for each reading character of the phrases extracted from the search target data, a number of character strings corresponding to the number of query candidates that the query candidate display unit can present simultaneously.