JP4942350B2

JP4942350B2 - Search client

Info

Publication number: JP4942350B2
Application number: JP2006017796A
Authority: JP
Inventors: 裕一小島
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2006-01-26
Filing date: 2006-01-26
Publication date: 2012-05-30
Anticipated expiration: 2026-01-26
Also published as: JP2007200019A

Description

The present invention relates to search technology, and more particularly to a search client that uses indexes in a server-client configuration.

Search technology is a technique for obtaining desired information from a large volume of documents. It compares a character string or word specified by the user with the character strings or words in the documents, and presents to the user the documents that match exactly or that match to a high degree.

In search technology, whether an index exists affects search processing speed. An index arranges the words, symbols, or the like that denote a particular piece of information in a fixed order and indicates where that information is located, so that the information can be retrieved.

For example, when searching for documents that contain word A, it suffices to look up word A in the index if one has been created in advance. If no index has been created, every document must be checked for the presence of word A, which takes time.
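The difference between the two cases can be illustrated with a minimal Python sketch. The corpus, the whitespace tokenization, and the function names are all assumptions made for illustration; the patent does not prescribe a particular index structure.

```python
from collections import defaultdict

# Hypothetical toy corpus: document id -> text.
docs = {
    "doc1": "search client sends a command",
    "doc2": "the server holds an index",
    "doc3": "the client reads the index",
}

def scan(word):
    """Without an index: inspect every document for the word."""
    return sorted(d for d, text in docs.items() if word in text.split())

def build_index(corpus):
    """With an index: map each word to the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in corpus.items():
        for w in text.split():
            index[w].add(doc_id)
    return index

index = build_index(docs)

def lookup(word):
    """Answer the query from the index alone, without touching the documents."""
    return sorted(index.get(word, set()))
```

Both functions return the same documents; the scan costs time proportional to the whole corpus per query, while the lookup is a single dictionary access once the index has been built.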

For this reason, server-client search systems that perform multiple search processes simultaneously at high speed generally maintain an index.

Meanwhile, natural languages, not only Japanese, exhibit orthographic variation and synonyms. Orthographic variation includes variation at the character level, such as uppercase/lowercase and half-width/full-width forms; grammatical variation at the word level, such as English singular/plural forms; and irregular variation at the word level, such as variation in katakana spellings of loanwords. Synonyms are words such as 「電子計算機」 ("electronic computer") and 「コンピュータ」 ("computer"), which have the same meaning but entirely different written forms. Orthographic variation and synonyms cause search omissions and adversely affect search.

For example, in a search system with an index, an index search for the term "computer" does not hit "Computer" with a capital initial. Likewise, a search for 「コンピューター」 does not hit 「コンピュータ」.

For this reason, normalization is in many cases applied when the index is created, in order to reduce such search omissions.
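As a hedged illustration of character-level normalization, the Python sketch below folds full-width and uppercase letters with Unicode NFKC normalization plus case folding. Using NFKC is an assumption made for this example; the patent does not specify how its normalization is implemented.

```python
import unicodedata

def normalize_chars(s: str) -> str:
    """Character-level normalization (assumed here to be NFKC plus case folding)."""
    return unicodedata.normalize("NFKC", s).casefold()

indexed_term = "Ｃｏｍｐｕｔｅｒ"   # full-width, capitalized form found in a document
query = "computer"              # what the user typed

raw_hit = query == indexed_term                                         # exact match misses
normalized_hit = normalize_chars(query) == normalize_chars(indexed_term)

# Word-level variation such as the trailing long-vowel mark is NOT removed
# by character-level normalization; it needs a word dictionary.
katakana_still_differs = normalize_chars("コンピューター") != normalize_chars("コンピュータ")
```

The last line shows why character-level normalization alone is not enough for variants such as 「コンピューター」 and 「コンピュータ」.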

A document collection with an index normalized in this way can be searched quickly and with few omissions. On the other hand, if the normalization method changes, the index must be rebuilt. In addition, once an index has been created with a given normalization, it cannot reflect user-specific characteristics, which makes it inconvenient.

One way to solve these problems is to skip normalization and instead expand every search term into all of its possible variants at search time. With this method, however, the expanded word set can become enormous, so it is unsuitable when processing speed is required.

Prior art documents on search technology include, for example, the following.

Japanese Patent Laid-Open No. 2001-236358 (Patent Document 1) discloses a document search method and apparatus that provide a way to specify the normalization level for search terms and index terms according to the user's needs, and a way to specify expansion of search terms into related notations, thereby making it possible to trade off search accuracy against search efficiency and recall against precision.

Japanese Patent Laid-Open No. 2002-230021 (Patent Document 2) discloses an information retrieval apparatus, an information retrieval method, and a storage medium that, in similar-document search applying a vector space model, analyze the search query according to co-occurrence information and resolve ambiguity, thereby enabling highly accurate retrieval in line with the user's intent.

Patent Document 1: JP 2001-236358 A
Patent Document 2: JP 2002-230021 A

However, the invention disclosed in Patent Document 1 cannot reflect user-specific characteristics, and the invention disclosed in Patent Document 2 cannot increase search processing speed.

The present invention has been made in view of these problems and aims to solve them. Its object is to provide a search client that executes high-accuracy search processing at high speed and can reflect user-specific characteristics.

To achieve the above object, the search client of the present invention adopts the following configuration.

The search client performs searches by accessing a search server that has a standard index and a character-unit normalized index. The standard index is generated from character strings normalized using a standard dictionary, which collects basic word data for language analysis together with its normalization results. The character-unit normalized index is generated from character strings normalized using a character normalization dictionary, which is used to apply character-unit normalization to the search character string included in a search request. The search client has an editable user dictionary, in which user-specific words can be additionally registered and which collects special word data for each individual; the standard dictionary; the character normalization dictionary; and search command generation means for generating a search command. The search command generation means looks up the search character string independently in the standard dictionary and in the user dictionary. It extracts a standard search result set, which is the set of character strings having the character string found in the standard dictionary as their representative word, and a user search result set, which is the set of character strings having the character string found in the user dictionary as their representative word. It then generates a search command by performing logical operations on the elements of the standard search result set and the elements of the user search result set, and sends the search command to the search server.

This makes it possible to provide a search client that reflects user-specific characteristics, reduces search omissions, and executes more accurate search processing while limiting any drop in search processing speed.

Further, in the search client, the search command generation means may extract a first character string set, consisting of the strings included in the standard search result set but not in the user search result set, and a second character string set, consisting of the strings included in the user search result set but not in the standard search result set, and generate the search command by performing logical operations on the first and second character string sets.

This makes it possible to provide a search client that can execute search processing reflecting user-specific characteristics.

To achieve the above object, the search client of the present invention may also be configured as follows. It performs searches by accessing a search server that has n standard indexes and a character-unit normalized index. The n standard indexes correspond to n standard dictionaries, each of which collects basic word data for language analysis together with its normalization results; each standard index is generated from character strings normalized using the corresponding standard dictionary. The character-unit normalized index is generated from character strings normalized using a character normalization dictionary, which is used to apply character-unit normalization to the search character string included in a search request. The search client has an editable user dictionary, in which user-specific words can be additionally registered and which collects special word data for each individual; the n standard dictionaries; the character normalization dictionary; search command generation means for generating a search command; and operation count detection means for detecting the number of logical operations included in a search command.

The search command generation means looks up the search character string independently in one of the n standard dictionaries and in the user dictionary. It extracts a set Xi, the set of character strings having the character string found in that standard dictionary as their representative word, and a set Yi, the set of character strings having the character string found in the user dictionary as their representative word. It then generates a search command containing a logical operation over two sets. The first set is the result of searching the standard index corresponding to that standard dictionary with the elements of Xi as search strings. The second set is the logical OR of the results of searching the character-unit normalized index with each element of the set SUBi, whose elements are in Xi but not in Yi, and the results of searching the character-unit normalized index with each element of the set ADDi, whose elements are in Yi but not in Xi. The logical operation extracts the part of the first set that is not included in the logical AND of the first and second sets. Among the search commands that the search command generation means generates for each of the n standard dictionaries, the search client sends to the search server the one for which the operation count detection means detected the fewest logical operations.

Generating a search command with few operations makes it possible to provide a search client that reduces the load on the search server.
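Selecting the command with the fewest logical operations could be sketched as follows in Python. How the operation count detection means actually counts operations is not described in this passage; counting the keywords and, or, and not outside quoted literals is an assumption made purely for illustration, as are the function names and the index name IA2.

```python
import re

def count_logical_ops(command: str) -> int:
    """Count and/or/not keywords, ignoring text inside single-quoted literals.

    Assumed counting rule for this sketch only.
    """
    without_literals = re.sub(r"'[^']*'", "''", command)
    return len(re.findall(r"\b(?:and|or|not)\b", without_literals, flags=re.IGNORECASE))

def pick_cheapest(commands):
    """Return the candidate command with the fewest logical operations."""
    return min(commands, key=count_logical_ops)

# Hypothetical candidates, one per standard dictionary.
candidates = [
    "select * from table1 where IA1 like '%a%' and not (IB like '%b%' or IB like '%c%')",
    "select * from table1 where IA2 like '%a%' or IB like '%and%'",
]
best = pick_cheapest(candidates)
```

Here the first candidate contains three operations and the second contains one, so the second would be the one sent to the server.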

To achieve the above object, the search client of the present invention may further have standard dictionary update means for updating the standard dictionary based on standard dictionary update data distributed from the search server.

Thus, even when the standard dictionary changes, it can be updated based on the standard dictionary update data, so the effects obtained with the configurations above continue to hold. That is, it is possible to provide a search client that reflects user-specific characteristics, reduces search omissions, and executes more accurate search processing while limiting any drop in search processing speed.

According to the search client of the present invention, more accurate search processing can be executed and user-specific characteristics can be reflected while limiting any drop in search processing speed.

Embodiments of the present invention are described below with reference to the drawings.

When performing a search, the search client of the present invention sends the search server a search command generated from a user-editable user dictionary and a standard dictionary, and has the search server execute the search.

FIG. 1 is an example functional block diagram of a search system having the search client of Embodiment 1 of the present invention.

The search system 10 consists of a search server 20 and a search client 30, which are connected via a network.

The search system 10 performs search processing. The search client 30 generates a search command from a search request entered by the user and sends it to the search server 20, which receives it and performs the search.

Here, a search request is entered by the user and expresses, as a character string, what the user wants the search system 10 to search for. In this embodiment, this character string is the search character string.

The search server 20 has an index A1, an index B, a document database 26 (hereinafter, document DB 26), and a standard dictionary distribution unit 28. On receiving a search command from the search client 30, the search server 20 searches index A1 and index B and thereby obtains the position information (location information) of the documents in the document DB 26 that the user wants. It then sends this position information to the search client 30 as the search result. Based on the document position information in the search result, the user can easily view the desired documents stored in the document DB 26.

Index A1 is an index generated from character strings that have undergone standard normalization using the standard dictionary. The standard dictionary is, for example, a collection of basic word data for language analysis together with its normalization results. In addition to the result of normalizing the document data stored in the document DB 26 with the standard dictionary, index A1 holds attributes of the document data, such as its name and size, and a URL (Uniform Resource Locator) or similar position information indicating where the document data is recorded.

Index B is an index generated from character strings normalized using the character normalization dictionary, which performs character-unit normalization only. In addition to the result of normalizing the document data stored in the document DB 26 with the character normalization dictionary, index B holds attributes of the document data, such as its name and size, and a URL or similar position information indicating where the document data is recorded.

Of the information held in index A1 and index B, everything other than the normalization results is identical, so that information is shared between index A1 and index B. The specific method of sharing is shown in the SQL CREATE statement described later.

The document DB 26 stores a large amount of document data, which can be reached from both index A1 and index B. In this embodiment the document DB 26 is held inside the search server 20, but it may instead be connected externally to the search server 20 by any suitable connection method.

When the normalization method for index A1 changes, for example because the standard dictionary has been modified, the standard dictionary distribution unit 28 distributes the new standard dictionary to the search client 30 over the network.

The search client 30 has a search command generation unit 32, a standard dictionary update unit 34, a user dictionary 36, a standard dictionary 38, and a character normalization dictionary 39. When a search character string is entered as a search request from the user, the search client 30 generates a search command and sends it to the search server 20.

The search command generation unit 32 expands the search character string into related terms using the user dictionary 36 and the standard dictionary 38, and generates a search command based on the result.

The standard dictionary update unit 34 updates the standard dictionary 38 based on the new standard dictionary distributed by the standard dictionary distribution unit 28.

The user dictionary 36 collects special word data for each individual; the user can additionally register user-specific words in it.

The standard dictionary 38 is the dictionary used by the search server 20 to generate index A1. It is, for example, a collection of basic word data for language analysis, and is compared against the user dictionary.

The character normalization dictionary 39 is the dictionary used by the search server 20 to generate index B. It is also used to apply character-unit normalization to the search character strings included in a search request.

In this embodiment, the search server 20 accepts SQL commands. A table holding the attributes of the document data stored in the document DB 26 and the URL giving the position of each piece of document data is created in the search server 20 in advance. The command that creates the table is “create table table1 (documentURI text, IA1 text, IB text)”. Here, “IA1” denotes index A1 and “IB” denotes index B.
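The table definition can be tried out with a small Python sketch. Using SQLite through Python's standard sqlite3 module is an assumption made so that the example runs; the patent does not name a database engine, and the inserted row and its URL are hypothetical.

```python
import sqlite3

# In-memory database standing in for the search server's storage (assumption).
conn = sqlite3.connect(":memory:")

# The CREATE statement given in the description.
conn.execute("create table table1 (documentURI text, IA1 text, IB text)")

# Hypothetical row: IA1 holds the standard-normalized text, IB the
# character-normalized text, and both share the same documentURI.
conn.execute(
    "insert into table1 values (?, ?, ?)",
    ("http://example.com/doc1", "コンピュータ manual", "コンピューター manual"),
)

# A LIKE query against the IB column, in the style of the partial search commands.
rows = conn.execute(
    "select documentURI from table1 where IB like ?", ("%manual%",)
).fetchall()
```

Keeping IA1 and IB as columns of one row is what lets the two indexes share the document URI and other attributes.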

Index A1 and index B in the search server 20 are generated as follows.

In the search server 20, the character strings to be registered in index A1 and index B are extracted as text T1 from a document entered by the user. The extracted text T1 is first normalized with the character normalization dictionary and registered in index B as a character-unit-normalized string. The text T1 after normalization with the character normalization dictionary is denoted “IB1”.

In index B, IB1 is registered in association with position information URL1, which indicates where the character string entered by the user is recorded.

IB1 is then further normalized with the standard dictionary and registered in index A1 as a string that has undergone standard normalization. The text IB1 after normalization with the standard dictionary is denoted “IA11”. In index A1, as in index B, IA11 is registered in association with the position information URL1.

As a result, index A1 and index B store the character strings that have undergone the respective normalization processes described above.
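The two-stage registration described above (T1 to IB1, then IB1 to IA11) can be sketched in Python. The dictionary contents, the use of NFKC for character normalization, and whitespace word splitting are all assumptions for illustration; real Japanese text would need morphological analysis rather than a whitespace split.

```python
import unicodedata

# Hypothetical standard dictionary: variant -> representative word.
STANDARD_DICT = {"コンピューター": "コンピュータ", "computers": "computer"}

def char_normalize(text: str) -> str:
    """Stage 1: character-unit normalization (assumed to be NFKC plus case folding)."""
    return unicodedata.normalize("NFKC", text).casefold()

def standard_normalize(text: str) -> str:
    """Stage 2: replace each word by its representative word, if it has one."""
    return " ".join(STANDARD_DICT.get(w, w) for w in text.split())

def register(url: str, t1: str, index_a1: dict, index_b: dict) -> None:
    """Register text T1 in both indexes under the same position information."""
    ib1 = char_normalize(t1)          # goes into index B
    ia11 = standard_normalize(ib1)    # derived from IB1, goes into index A1
    index_b[url] = ib1
    index_a1[url] = ia11

index_a1, index_b = {}, {}
register("URL1", "Ｃｏｍｐｕｔｅｒｓ and コンピューター", index_a1, index_b)
```

Note that IA11 is computed from IB1 rather than from T1, matching the order given in the description.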

FIG. 2 is a flowchart showing the flow of search processing in the search system 10.

When the user enters a search request into the search client 30, the search command generation unit 32 receives it and generates a search command (S210). How the search command is generated depends on the search request entered; in this embodiment, a search request consists of multiple search character strings joined by spaces.

That is, the search request in S210 consists of character string a, character string b, character string c, and character string d joined by spaces. Each of these is a search character string.

From the search request “a b c d”, the search command generation unit 32 generates the search command “select * from table1 where A and B and C and D”. Here, A, B, C, and D are partial search commands with which the search server 20 performs a search for each search character string. One partial search command is generated per search character string, so the search command generation unit 32 generates as many partial search commands as there are search character strings in the search request. Partial search commands A, B, C, and D are described in detail below.

Partial search commands A, B, C, and D indicate a search of index W using character strings a, b, c, and d, respectively, as the search character string. No index W actually exists; it is a placeholder used for convenience in explaining the partial search commands, and it is replaced by index A1 or index B in the user dictionary expansion process described later.

Partial search command A uses character string a as its search character string and is written “IW like '%nB(a)%'”. Partial search command B uses character string b and is written “IW like '%nB(b)%'”. Partial search command C uses character string c and is written “IW like '%nB(c)%'”. Partial search command D uses character string d and is written “IW like '%nB(d)%'”.

In each partial search command above, IW denotes index W, and nB(a), nB(b), nB(c), and nB(d) denote the character strings obtained by normalizing the search character strings a, b, c, and d, respectively, with the character normalization dictionary 39.
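Step S210 can be sketched in Python as follows. The function names are hypothetical, nB is approximated with NFKC plus case folding (an assumption), and the strings are interpolated directly into the SQL text only to mirror the notation in the description; a real client would need proper escaping or parameter binding.

```python
import unicodedata

def n_b(term: str) -> str:
    """Stand-in for nB(...): character-unit normalization (assumed NFKC + case folding)."""
    return unicodedata.normalize("NFKC", term).casefold()

def partial_command(term: str) -> str:
    """One partial search command per search string, against placeholder index IW."""
    return f"IW like '%{n_b(term)}%'"

def initial_command(request: str) -> str:
    """Split the request on whitespace and AND the partial commands together."""
    parts = [partial_command(term) for term in request.split()]
    return "select * from table1 where " + " and ".join(parts)

command = initial_command("Ａ b")
```

The placeholder IW stays in the output at this stage; it is only resolved to IA1 or IB by the later user dictionary expansion.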

Next, the search command generation unit 32 applies user dictionary expansion to each search character string contained in partial search commands A, B, C, and D generated in S210, and generates the outgoing search command “select * from table1 where A' and B' and C' and D'” (S220).

Here, A', B', C', and D' denote partial search commands A, B, C, and D after user dictionary expansion has been applied to each.

The processing in S220 is described in detail later.

The outgoing search command generated in this way is sent to the search server 20 over the network (S230). The search server 20 receives the command and starts searching index A1 and index B (S240). Through this search, the search server 20 obtains the position information of the search character strings in the document DB 26.

After the search finishes, the search server 20 sends this position information over the network to the search client 30 as the search result, and the search client 30 receives it (S250). The search client 30 then displays the search result to the user. Based on the position information of the search character strings in the search result, the user can easily view the desired documents stored in the document DB 26.

The processing in S220 is now described in detail with reference to FIG. 3.

FIG. 3 is a flowchart explaining the user dictionary expansion process in S220. In FIG. 3, the search character string contained in a partial search command is denoted Si (i = 0, 1, 2, ..., m).

The search command generation unit 32 takes the search character string contained in the first of the partial search commands as S0 and starts processing (S310). In this embodiment, the first partial search command is partial search command A.

Next, the search command generation unit 32 determines whether user dictionary expansion has been applied up to the search character string in the last partial search command, and starts user dictionary expansion for the search character strings in partial search commands that have not yet been processed (S320). In this embodiment, the last partial search command is partial search command D.

検索コマンド生成部３２は、まず検索文字列Ｓｉで標準辞書３８を検索する処理を行う（Ｓ３３０）。この検索結果として、標準辞書３８より得られた文字列を文字列ｎＡ（Ｓｉ）と示す。次に、検索コマンド生成部３２は、標準辞書３８の中から、文字列ｎＡ（Ｓｉ）を代表語とし、文字列ｎＡ（Ｓｉ）に関連する文字列の集合である文字列集合Ｘｉを抽出する展開処理を行う（Ｓ３４０）。 The search command generation unit 32 first searches the standard dictionary 38 with the search character string Si (S330). The character string obtained from the standard dictionary 38 as the result is denoted nA(Si). Next, the search command generation unit 32 performs an expansion process that extracts from the standard dictionary 38 the character string set Xi, which is the set of character strings related to nA(Si), with nA(Si) as the representative word (S340).

次に、検索コマンド生成部３２は、検索文字列Ｓｉでユーザ辞書３６を検索する処理を行う。そして、検索コマンド生成部３２は、ユーザ辞書３６の中から、この検索処理により得られた文字列を代表語とする文字列の集合である文字列集合Ｙｉを抽出する展開処理を行う（Ｓ３５０）。 Next, the search command generation unit 32 searches the user dictionary 36 with the search character string Si. It then performs an expansion process that extracts from the user dictionary 36 the character string set Yi, which is the set of character strings whose representative word is the character string obtained by this search (S350).
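
The lookup-then-expand steps S330 to S350 can be sketched in Python as follows. This is only an illustration under stated assumptions: the patent does not specify how the dictionaries are stored, so the mapping from a representative word to its set of related strings, the name `expand`, and the fallback for strings with no entry are all hypothetical. The sample entries are the ヘビーメタル ("heavy metal") sets used in the worked example later in this description.

```python
from typing import Dict, Set, Tuple

# Assumed layout: representative word -> set of related character strings
# (the representative is a member of its own set).
Thesaurus = Dict[str, Set[str]]

HEAVY_METAL = "ヘビーメタル"
HEAVY_METAL_VARIANT = "へビィメタル"
HEAVY_METAL_SHORT = "ヘビメタ"
TUNGSTEN_ALLOY = "タングステン合金"

STANDARD_DICTIONARY: Thesaurus = {
    HEAVY_METAL: {HEAVY_METAL, HEAVY_METAL_VARIANT, HEAVY_METAL_SHORT},
}
USER_DICTIONARY: Thesaurus = {
    TUNGSTEN_ALLOY: {TUNGSTEN_ALLOY, HEAVY_METAL, HEAVY_METAL_VARIANT},
}


def expand(dictionary: Thesaurus, search_string: str) -> Tuple[str, Set[str]]:
    """Return (representative word, related string set) for search_string."""
    for representative, related in dictionary.items():
        if search_string == representative or search_string in related:
            return representative, set(related)
    # Assumption: a string with no dictionary entry represents only itself.
    return search_string, {search_string}


n_a, x = expand(STANDARD_DICTIONARY, HEAVY_METAL)   # S330, S340
user_rep, y = expand(USER_DICTIONARY, HEAVY_METAL)  # S350
```

Here `n_a` plays the role of nA(Si), and `x` and `y` play the roles of the sets Xi and Yi.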

検索コマンド生成部３２は、文字列集合Ｘｉに含まれ、文字列集合Ｙｉに含まれない文字列を抽出し、その結果の文字列集合をＳＵＢｉとする（Ｓ３６０）。次に、検索コマンド生成部３２は、文字列集合Ｘｉに含まれず、文字列集合Ｙｉに含まれる文字列を抽出し、その結果の文字列集合をＡＤＤｉとする（Ｓ３７０）。 The search command generation unit 32 extracts character strings included in the character string set Xi but not included in the character string set Yi, and sets the resulting character string set as SUBi (S360). Next, the search command generation unit 32 extracts character strings that are not included in the character string set Xi but are included in the character string set Yi, and sets the resulting character string set as ADDi (S370).

ここで、各部分検索コマンドは、検索コマンド生成部３２により、文字列集合ＳＵＢｉと文字列集合ＡＤＤｉを用いて「（ＩＡ１ｌｉｋｅ ‘％ｎＡ（Ｓｉ）％’ ａｎｄｎｏｔ（ＩＢｌｉｋｅ ‘％ＳＵＢｉ０％’ ｏｒＩＢｌｉｋｅ ‘％ＳＵＢｉ１％’ ｏｒ・・・））ｏｒ（ＩＢｌｉｋｅ ‘％ＡＤＤｉ０％’ ｏｒＩＢｌｉｋｅ ‘％ＡＤＤｉ１％’ｏｒ・・・）」に置き換えられる。 Here, the search command generation unit 32 replaces each partial search command, using the character string sets SUBi and ADDi, with "(IA1 like '%nA(Si)%' and not (IB like '%SUBi0%' or IB like '%SUBi1%' or ...)) or (IB like '%ADDi0%' or IB like '%ADDi1%' or ...)", where SUBi0, SUBi1, ... and ADDi0, ADDi1, ... are the elements of SUBi and ADDi.

ここで、「ＩＡ１ｌｉｋｅ ‘％ｎＡ（Ｓｉ）％’」とは、文字列ｎＡ（Ｓｉ）を検索文字列として索引Ａ１を検索する処理を示す。また、「ＩＢｌｉｋｅ ‘％ＳＵＢｉ％’」、「ＩＢｌｉｋｅ ‘％ＡＤＤｉ％’」は、文字列集合ＳＵＢｉおよび文字列集合ＡＤＤｉを構成する要素となる文字列を検索文字列として、それぞれの検索文字列について索引Ｂを検索する処理を示す。 Here, "IA1 like '%nA(Si)%'" denotes a search of index A1 using the character string nA(Si) as the search character string. Likewise, "IB like '%SUBi%'" and "IB like '%ADDi%'" denote searches of index B performed once for each character string that is an element of the set SUBi or ADDi, using that element as the search character string.

検索コマンド生成部３２は、この置き換えられた部分検索コマンドの論理演算処理を行う（Ｓ３８０）。 The search command generating unit 32 performs a logical operation process of the replaced partial search command (S380).
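
As a rough sketch of S360 to S380, the replacement of one partial search command could look like this in Python. The function names, the default index names passed as parameters, and the sorting of set elements (used only to make the output deterministic) are assumptions for illustration and are not taken from the patent.

```python
from typing import Set


def like(index: str, term: str) -> str:
    """Format one index lookup, e.g. IA1 like '%term%'."""
    return f"{index} like '%{term}%'"


def build_partial(n_a: str, x: Set[str], y: Set[str],
                  standard_index: str = "IA1",
                  character_index: str = "IB") -> str:
    """Build the expanded partial search command for one search string."""
    sub = sorted(x - y)  # S360: in X (standard) but not in Y (user)
    add = sorted(y - x)  # S370: in Y (user) but not in X (standard)

    command = like(standard_index, n_a)
    if sub:
        excluded = " or ".join(like(character_index, s) for s in sub)
        command = f"({command} and not ({excluded}))"
    if add:
        included = " or ".join(like(character_index, a) for a in add)
        if len(add) > 1:
            included = f"({included})"
        command = f"{command} or {included}"
    return command
```

When X and Y hold the same elements, both differences are empty and the function returns the bare lookup on the standard index, which matches the behaviour described for identical sets. A production implementation would also need to escape quote and wildcard characters inside the terms; that is omitted here.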

Ｓ３８０において行われる論理演算処理の結果が、ユーザ辞書展開処理後の部分検索コマンドとなる。検索コマンド生成部３２は、Ｓ３８０までの処理を、個々の部分検索コマンドについて行う（Ｓ３９０）。本実施例において、個々の部分検索コマンドとは、部分検索コマンドＡ、部分検索コマンドＢ、部分検索コマンドＣおよび部分検索コマンドＤに該当する。 The result of the logical operation process performed in S380 becomes a partial search command after the user dictionary expansion process. The search command generation unit 32 performs the processing up to S380 for each partial search command (S390). In this embodiment, each partial search command corresponds to a partial search command A, a partial search command B, a partial search command C, and a partial search command D.

そして、検索コマンド生成部３２は、検索文字列にユーザ辞書展開処理が施された個々の部分検索コマンドの論理積をとる処理を行い、その結果を検索サーバ２０への送出用の検索コマンドとする。 The search command generation unit 32 then takes the logical product (AND) of the individual partial search commands whose search character strings have undergone the user dictionary expansion process, and uses the result as the search command to be sent to the search server 20.

以下に、ユーザ辞書展開処理に関し、具体例を挙げて表１、表２を参照しつつ説明する。例えば本実施例において、検索要求が「コンピュータヘビーメタル解析」なる文字列である。 The user dictionary expansion process is described below using a specific example, with reference to Tables 1 and 2. In this embodiment, the search request is assumed to be the character string "コンピュータヘビーメタル解析" ("computer heavy metal analysis").

表１は、標準辞書３８からそれぞれの検索文字列に関連する関連文字列を抽出した結果の文字列集合を示す例である。表２は、ユーザ辞書３６からそれぞれの検索文字列に関連する文字列を抽出した結果の文字列集合を示す例である。 Table 1 is an example showing a character string set as a result of extracting related character strings related to each search character string from the standard dictionary 38. Table 2 shows an example of a character string set as a result of extracting a character string related to each search character string from the user dictionary 36.

検索コマンド生成部３２は、最初に「ｓｅｌｅｃｔ＊ｆｒｏｍｔａｂｌｅ１ｗｈｅｒｅＩＷｌｉｋｅ ‘％コンピュータ％’ ａｎｄＩＷｌｉｋｅ ‘％ヘビーメタル％’ ａｎｄＩＷｌｉｋｅ ‘％解析％’」という検索コマンドを生成する。この検索コマンドにおいて、部分検索コマンドとなるコマンドは、「ＩＷｌｉｋｅ ‘％コンピュータ％’」、「ＩＷｌｉｋｅ ‘％ヘビーメタル％’」および「ＩＷｌｉｋｅ ‘％解析％’」である。

The search command generation unit 32 first generates the search command "select * from table1 where IW like '%コンピュータ%' and IW like '%ヘビーメタル%' and IW like '%解析%'" (the three search character strings mean "computer", "heavy metal", and "analysis"). The partial search commands in this search command are "IW like '%コンピュータ%'", "IW like '%ヘビーメタル%'", and "IW like '%解析%'".

検索コマンド生成部３２は、上記の各部分検索コマンドごとにユーザ辞書展開処理を行う。部分検索コマンド「ＩＷｌｉｋｅ ‘％コンピュータ％’」におけるユーザ辞書展開処理について以下に説明する。 The search command generation unit 32 performs the user dictionary expansion process on each of the above partial search commands. The process for the partial search command "IW like '%コンピュータ%'" is described below.

検索コマンド生成部３２が、検索文字列「コンピュータ」で標準辞書３８を検索する処理を行う。この結果、得られた文字列ｎＡ（Ｓｉ）は「コンピュータ」となる。検索コマンド生成部３２は、標準辞書３８の中から、文字列「コンピュータ」を代表語とし、文字列「コンピュータ」に関連する文字列の集合である文字列集合Ｘを抽出する。 The search command generation unit 32 performs a process of searching the standard dictionary 38 using the search character string “computer”. As a result, the obtained character string nA (Si) becomes “computer”. The search command generation unit 32 extracts a character string set X, which is a set of character strings related to the character string “computer”, from the standard dictionary 38 with the character string “computer” as a representative word.

次に、検索コマンド生成部３２は、検索文字列「コンピュータ」でユーザ辞書３６を検索する処理を行う。この検索処理により、検索コマンド生成部３２は、文字列「コンピュータ」を得る。検索コマンド生成部３２は、ユーザ辞書３６の中から、文字列「コンピュータ」を代表語とし、文字列「コンピュータ」に関連する文字列の集合である文字列集合Ｙを抽出する。 Next, the search command generation unit 32 performs a process of searching the user dictionary 36 using the search character string “computer”. By this search process, the search command generation unit 32 obtains the character string “computer”. The search command generation unit 32 extracts a character string set Y, which is a set of character strings related to the character string “computer”, from the user dictionary 36 with the character string “computer” as a representative word.

このとき、表１および表２に示されるように、文字列集合Ｘと文字列集合Ｙは、どちらとも「コンピューター、コンピュータ、電子計算機、ｃｏｍｐｕｔｅｒ」なる要素から構成されている。 As shown in Tables 1 and 2, the character string sets X and Y both consist of the elements "コンピューター, コンピュータ, 電子計算機, computer" (two katakana spellings of "computer", the term meaning "electronic computer", and the English word).

本実施例において、このように各文字列集合が同様の要素で構成されているとき、検索コマンド生成部３２は、ユーザ辞書展開処理後の部分検索コマンドを、「ＩＡ１ｌｉｋｅ ‘％コンピュータ％’」とする。 In this embodiment, when the two character string sets consist of the same elements, as here, SUB and ADD are both empty, so the search command generation unit 32 sets the partial search command after the user dictionary expansion process to "IA1 like '%コンピュータ%'".

部分検索コマンド「ＩＷｌｉｋｅ ‘％解析％’」も同様に、標準辞書３８の中から抽出される文字列集合Ｘと、ユーザ辞書３６の中から抽出される文字列集合Ｙのどちらとも、「解析、分析」なる要素から構成されている。 Similarly, the partial search command “IW like '% analysis%'” is “analysis” for both the character string set X extracted from the standard dictionary 38 and the character string set Y extracted from the user dictionary 36. , Analysis ".

すなわち、文字列集合Ｘと文字列集合Ｙは同一の要素で構成されているので、ユーザ辞書展開処理後の部分検索コマンドは、「ＩＡ１ｌｉｋｅ ‘％解析％’」とする。 That is, since the character string set X and the character string set Y are composed of the same elements, the partial search command after the user dictionary expansion processing is “IA1 like“% analysis% ””.

次に、部分検索コマンド「ＩＷｌｉｋｅ ‘％ヘビーメタル％’」におけるユーザ辞書展開処理について説明する。 Next, the user dictionary expansion process for the partial search command "IW like '%ヘビーメタル%'" is described.

まず、検索コマンド生成部３２が、検索文字列「ヘビーメタル」で標準辞書３８を検索する処理を行う。この結果、得られた文字列ｎＡ（Ｓｉ）は「ヘビーメタル」となる。 First, the search command generation unit 32 performs a process of searching the standard dictionary 38 using the search character string “heavy metal”. As a result, the obtained character string nA (Si) is “heavy metal”.

次に、検索コマンド生成部３２は、標準辞書３８の中から、文字列「ヘビーメタル」を代表語とし、文字列「ヘビーメタル」に関連する文字列の集合である文字列集合Ｘを抽出する。ここで、文字列集合Ｘは、「ヘビーメタル、へビィメタル、ヘビメタ」なる要素で構成されている。 Next, the search command generation unit 32 extracts from the standard dictionary 38 the character string set X, the set of character strings related to "ヘビーメタル" ("heavy metal") with "ヘビーメタル" as the representative word. Here, X consists of the elements "ヘビーメタル, へビィメタル, ヘビメタ" (the standard spelling, a spelling variant, and an abbreviation).

さらに、検索コマンド生成部３２は、検索文字列「ヘビーメタル」でユーザ辞書３６を検索する処理を行う。この結果、得られた文字列は「タングステン合金」である。次に、検索コマンド生成部３２は、ユーザ辞書３６の中から、文字列「タングステン合金」を代表語とし、文字列「タングステン合金」に関連する文字列の集合である文字列集合Ｙを抽出する。ここで、文字列集合Ｙは、「タングステン合金、ヘビーメタル、へビィメタル」なる要素で構成されている。 The search command generation unit 32 then searches the user dictionary 36 with the search character string "ヘビーメタル". The character string obtained is "タングステン合金" ("tungsten alloy"). Next, the search command generation unit 32 extracts from the user dictionary 36 the character string set Y, the set of character strings related to "タングステン合金" with "タングステン合金" as the representative word. Here, Y consists of the elements "タングステン合金, ヘビーメタル, へビィメタル".

検索コマンド生成部３２は、文字列集合Ｘに含まれ、文字列集合Ｙに含まれない文字列集合ＳＵＢを抽出する。このとき、文字列集合ＳＵＢは「ヘビメタ」である。 The search command generation unit 32 extracts the character string set SUB, which contains the character strings that are in X but not in Y. Here, SUB is "ヘビメタ" (the abbreviation of "heavy metal").

次に、検索コマンド生成部３２は、文字列集合Ｘに含まれず、文字列集合Ｙに含まれる文字列集合ＡＤＤを抽出する。このとき、文字列集合ＡＤＤは「タングステン合金」である。 Next, the search command generation unit 32 extracts a character string set ADD that is not included in the character string set X but included in the character string set Y. At this time, the character string set ADD is “tungsten alloy”.
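
The two extractions above are plain set differences. A minimal Python check using the example sets X and Y from this description (the variable names are illustrative only):

```python
HEAVY_METAL = "ヘビーメタル"
HEAVY_METAL_VARIANT = "へビィメタル"
HEAVY_METAL_SHORT = "ヘビメタ"
TUNGSTEN_ALLOY = "タングステン合金"

# X: related strings from the standard dictionary 38
X = {HEAVY_METAL, HEAVY_METAL_VARIANT, HEAVY_METAL_SHORT}
# Y: related strings from the user dictionary 36
Y = {TUNGSTEN_ALLOY, HEAVY_METAL, HEAVY_METAL_VARIANT}

SUB = X - Y  # strings the user dictionary does not treat as related
ADD = Y - X  # strings only the user dictionary treats as related

print(SUB)  # the set containing only the abbreviation ヘビメタ
print(ADD)  # the set containing only タングステン合金
```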

ここで、検索コマンド生成部３２は、文字列集合ＳＵＢと文字列集合ＡＤＤを用いて論理演算処理を行い、ユーザ辞書展開処理後の部分検索コマンドを生成する。 Here, the search command generation unit 32 performs a logical operation process using the character string set SUB and the character string set ADD, and generates a partial search command after the user dictionary expansion process.

すなわち、部分検索コマンド「ＩＷｌｉｋｅ ‘％ヘビーメタル％’」は、「（ＩＡ１ｌｉｋｅ ‘％ヘビーメタル％’ ａｎｄｎｏｔ（ＩＢｌｉｋｅ ‘％ヘビメタ％’））ｏｒＩＢｌｉｋｅ ‘％タングステン合金％’」となる。 That is, the partial search command "IW like '%ヘビーメタル%'" becomes "(IA1 like '%ヘビーメタル%' and not (IB like '%ヘビメタ%')) or IB like '%タングステン合金%'".

検索コマンド生成部３２は、各検索文字列にユーザ辞書展開処理が施された個々の部分検索コマンドの論理積をとる処理を行い、検索サーバ２０へ送出するための送出用の検索コマンドを生成する。 The search command generation unit 32 takes the logical product (AND) of the individual partial search commands whose search character strings have undergone the user dictionary expansion process, and generates the search command to be sent to the search server 20.

すなわち、検索コマンド生成部３２において最終的に生成される検索コマンドは、「ｓｅｌｅｃｔ＊ｆｒｏｍｔａｂｌｅ１ｗｈｅｒｅＩＡ１ｌｉｋｅ ‘％コンピュータ％’ ａｎｄ（（ＩＡ１ｌｉｋｅ ‘％ヘビーメタル％’ ａｎｄｎｏｔ（ＩＢｌｉｋｅ ‘％ヘビメタ％’））ｏｒＩＢｌｉｋｅ ‘％タングステン合金％’）ａｎｄＩＡ１ｌｉｋｅ ‘％解析％’」である。 That is, the search command finally generated by the search command generation unit 32 is "select * from table1 where IA1 like '%コンピュータ%' and ((IA1 like '%ヘビーメタル%' and not (IB like '%ヘビメタ%')) or IB like '%タングステン合金%') and IA1 like '%解析%'".

この検索コマンドが、検索サーバ２０への送出用の検索コマンドとなり、検索クライアント３０から検索サーバ２０に送出される。 This search command becomes a search command for sending to the search server 20 and is sent from the search client 30 to the search server 20.
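
A minimal sketch of how the expanded partial search commands might be joined into the outgoing command is shown below. The function name `assemble` and the rule for adding parentheses are assumptions; the table name `table1` comes from the example above.

```python
from typing import List


def assemble(partials: List[str], table: str = "table1") -> str:
    """Join expanded partial search commands with `and`.

    A partial command that contains an `or` is wrapped in parentheses so
    that the surrounding `and` does not change its meaning. Checking for
    the substring " or " is a simplification that can add unnecessary
    (but harmless) parentheses when a search term itself contains it.
    """
    wrapped = [f"({p})" if " or " in p else p for p in partials]
    return f"select * from {table} where " + " and ".join(wrapped)


final_command = assemble([
    "IA1 like '%コンピュータ%'",
    "(IA1 like '%ヘビーメタル%' and not (IB like '%ヘビメタ%')) "
    "or IB like '%タングステン合金%'",
    "IA1 like '%解析%'",
])
print(final_command)
```

With these three inputs the printed command has the same structure as the final search command given in the example above.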

このように、本発明の検索クライアント３０は、ユーザ固有の言葉を追加登録でき、ユーザが自由に編集することが可能なユーザ辞書３６を用いて検索コマンドを生成するため、ユーザ固有の言葉などを反映させた検索コマンドを生成することができ、ひいてはユーザがカスタマイズ可能な検索処理を実行させることができる。 Thus, the search client 30 of the present invention can additionally register user-specific words and generates a search command using the user dictionary 36 that can be freely edited by the user. The reflected search command can be generated, and as a result, a search process customizable by the user can be executed.

また、本発明の検索クライアント３０は、検索文字列ごとに、標準辞書３８とユーザ辞書３６の両方から関連する文字列集合を抽出し、その文字列集合に基づき検索サーバ２０に送出する検索コマンドを生成している。このため、検索漏れが少なく精度の高い検索を行うことが可能であり、かつ検索処理速度の低下を抑えることができる。 Further, the search client 30 of the present invention extracts a set of related character strings from both the standard dictionary 38 and the user dictionary 36 for each search character string, and sends a search command to be sent to the search server 20 based on the character string set. Is generated. For this reason, it is possible to perform a highly accurate search with few search omissions, and it is possible to suppress a decrease in search processing speed.

以下に、本発明の実施例２について図面を参照して説明する。図４は、本発明の実施例２における検索クライアントを有する検索システムの機能ブロック図の例である。 Embodiment 2 of the present invention will be described below with reference to the drawings. FIG. 4 is an example of a functional block diagram of a search system having a search client according to the second embodiment of the present invention.

図４に示す検索システム１０Ａにおいて、実施例１に記載の検索システム１０と同様の構成である部分の説明は省略する。また、図４において実施例１に記載の検索システム１０と同様の構成である部分には、図１の説明に用いた符号と同様の符号をつけた。 In the search system 10A shown in FIG. 4, description of portions having the same configuration as the search system 10 described in the first embodiment is omitted. Also, in FIG. 4, the same reference numerals as those used in the description of FIG. 1 are attached to portions having the same configuration as the search system 10 described in the first embodiment.

図４に示す検索システム１０Ａにおける検索クライアント３０Ａは、標準辞書３８、標準辞書３８Ａ、演算回数検出部３３を有するものである。演算回数検出部３３では、検索コマンド生成部３２において生成された検索コマンドに含まれる論理演算処理の回数を検出する。 A search client 30A in the search system 10A shown in FIG. 4 includes a standard dictionary 38, a standard dictionary 38A, and an operation count detection unit 33. The operation count detection unit 33 detects the number of logical operation processes included in the search command generated by the search command generation unit 32.

また、検索システム１０Ａにおける検索サーバ２０Ａは、標準辞書３８を用いて標準的な正規化処理が施された文字列により生成された索引Ａ１と、標準辞書３８Ａを用いて標準的な正規化処理が施された文字列により生成された索引Ａ２を有する。 The search server 20A in the search system 10A performs standard normalization processing using the index A1 generated by the character string that has been subjected to standard normalization processing using the standard dictionary 38 and the standard dictionary 38A. It has an index A2 generated by the applied character string.

なお、ここでは、検索クライアントにおいて保持される標準辞書の数を２つとしたが、検索クライアントにおいて保持される標準辞書の数に制限はなく、より多くの標準辞書を保持しているものであってもよい。 Although the number of standard dictionaries held in the search client is two here, the number of standard dictionaries held in the search client is not limited, and more standard dictionaries are held. Also good.

さらに、検索サーバにおいては、検索クライアントに保持された標準辞書に対応して、それぞれの標準辞書に基づき正規化処理を施された文字列により生成された索引があることが好ましいが、それに限定されるものではない。 Further, in the search server, it is preferable that there is an index generated by a character string that has been subjected to normalization processing based on each standard dictionary corresponding to the standard dictionary held in the search client. It is not something.

検索システム１０Ａでの検索処理の流れは、図２を参照して説明した検索システム１０における検索処理の流れと同様であるので説明を省略する。 The flow of the search process in the search system 10A is the same as the flow of the search process in the search system 10 described with reference to FIG.

本発明の実施例２における検索クライアント３０Ａは、検索文字列に対して複数の標準辞書およびユーザ辞書を用いてユーザ辞書展開処理を行う点で、実施例１の検索クライアント３０と異なっている。 The search client 30A of Embodiment 2 differs from the search client 30 of Embodiment 1 in that it performs the user dictionary expansion process on a search character string using a plurality of standard dictionaries together with the user dictionary.

以下に、本発明の実施例２におけるユーザ辞書展開処理について図５を参照して説明する。図５は、本発明の実施例２におけるユーザ辞書展開処理を説明するフローチャートである。 Hereinafter, user dictionary expansion processing according to the second embodiment of the present invention will be described with reference to FIG. FIG. 5 is a flowchart for explaining user dictionary expansion processing according to the second embodiment of the present invention.

図５における部分検索コマンドに含まれる検索文字列をＳｉ（ｉ＝０，１，２，・・・，ｍとする。）とし、検索クライアント３０Ａに保持された標準辞書をＤＩＣｐ（ｐ＝０，１，２，・・・，ｍとする。）とする。 In FIG. 5, the search character strings contained in the partial search commands are denoted Si (i = 0, 1, 2, ..., m), and the standard dictionaries held in the search client 30A are denoted DICp (p = 0, 1, 2, ..., m).

検索コマンド生成部３２は、複数ある部分検索コマンドのうち、最初の部分検索コマンドに含まれる検索文字列をＳ０とする（Ｓ５１０）。次に検索コマンド生成部３２は、最後の部分検索コマンドに含まれる検索文字列まで、ユーザ辞書展開処理が施されているかを判断し、ユーザ辞書展開処理が施されていない、検索文字列に対してユーザ辞書展開処理を開始する（Ｓ５１１）。 The search command generation unit 32 sets S0 as the search character string included in the first partial search command among a plurality of partial search commands (S510). Next, the search command generating unit 32 determines whether the user dictionary expansion processing has been performed up to the search character string included in the last partial search command, and for the search character string that has not been subjected to the user dictionary expansion processing. The user dictionary expansion process is started (S511).

検索コマンド生成部３２は、検索文字列Ｓｉに対して、展開処理を行う際に用いる１つ目の標準辞書をＤＩＣ０として（Ｓ５１２）、検索クライアント３０Ａに保持されている標準辞書が検索文字列Ｓｉのユーザ辞書展開処理に用いられているかを判断する。そして、検索クライアント３０Ａ内に、ユーザ辞書展開処理に用いられていない標準辞書があった場合、その標準辞書を用いたユーザ辞書展開処理を開始する（Ｓ５１３）。 The search command generation unit 32 sets DIC0 as the first standard dictionary to use when expanding the search character string Si (S512), and determines whether each standard dictionary held in the search client 30A has already been used in the user dictionary expansion process for Si. If the search client 30A holds a standard dictionary that has not yet been used, the user dictionary expansion process using that standard dictionary is started (S513).

ここで、Ｓ５１４からＳ５１９までの処理は、実施例１における図３のＳ３３０からＳ３７０までの処理と同様なので、説明を省略する。 The processing from S514 to S519 is the same as the processing from S330 to S370 of FIG. 3 in Embodiment 1, so its description is omitted.

Ｓ５１９において部分検索コマンドが生成されると、演算回数検出部３３は、前記部分検索コマンドに含まれる論理演算処理の回数を検出する。そして、検索コマンド生成部３２は、ここで検出された論理演算処理の回数と、これまで生成された部分検索コマンドにおける論理演算処理の回数のうち最小の論理演算処理の回数と、を比較する（Ｓ５２０）。 When a partial search command is generated in S519, the operation count detection unit 33 detects the number of logical operation processes included in the partial search command. Then, the search command generation unit 32 compares the number of times of the logical operation processing detected here with the minimum number of times of the logical operation processing in the number of times of the logical operation processing in the partial search commands generated so far ( S520).

検索コマンド生成部３２は、ここで部分検索コマンドに含まれる論理演算処理の回数が最小となる方の部分検索コマンドを検索文字列Ｓｉにおける部分検索コマンドとする（Ｓ５２１）。 The search command generation unit 32 sets the partial search command having the smallest number of logical operation processes included in the partial search command as the partial search command in the search character string Si (S521).

検索コマンド生成部３２は、Ｓ５１４からＳ５２１までの処理を、すべての標準辞書を用いて検索文字列Ｓｉのユーザ辞書展開処理が施されるまで繰り返す（Ｓ５２２）。 The search command generation unit 32 repeats the processing from S514 to S521 until the user dictionary expansion processing of the search character string Si is performed using all the standard dictionaries (S522).

本実施例においては、すべての標準辞書とは、標準辞書３８と標準辞書３８Ａの２つに該当する。よって、Ｓ５２２において繰り返される処理の回数は２回である。尚このとき、検索クライアントにｎ個の標準辞書が保持されていた場合には、Ｓ５２２において繰り返される処理の回数はｎ回となる。 In this embodiment, all the standard dictionaries correspond to the standard dictionary 38 and the standard dictionary 38A. Therefore, the number of processes repeated in S522 is two. At this time, if n standard dictionaries are held in the search client, the number of processes repeated in S522 is n.

検索文字列Ｓｉに対して、すべての標準辞書を用いてユーザ辞書展開処理が施され、論理演算処理回数が最小となる部分検索コマンドに生成されると、検索コマンド生成部３２は、次の部分検索コマンドに含まれる検索文字列に対してＳ５１２からＳ５２２の処理を繰り返す。 When the search character string Si is subjected to user dictionary expansion processing using all standard dictionaries and is generated into a partial search command that minimizes the number of logical operation processes, the search command generation unit 32 The processing from S512 to S522 is repeated for the search character string included in the search command.

このようにして、検索文字列にユーザ辞書展開処理が施された個々の部分検索コマンドが生成されると、検索コマンド生成部３２は、各部分検索コマンドの論理積をとる処理を行い、検索サーバへ送出する送出用の検索コマンドを生成する。 When the individual partial search commands whose search character strings have undergone the user dictionary expansion process have been generated in this way, the search command generation unit 32 takes the logical product (AND) of the partial search commands and generates the search command to be sent to the search server.
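
The selection in S520 to S522 amounts to keeping, for each search character string, the candidate partial search command with the fewest logical operations. The sketch below illustrates this in Python. The patent does not define how the operation count detection unit 33 counts operations; counting the `like` lookups is an assumption chosen because it gives 1 for a bare command such as "IA1 like '%解析%'", as in the worked example. The function names are hypothetical.

```python
import re
from typing import List


def count_operations(command: str) -> int:
    """Assumed metric: the number of `like` index lookups in the command.

    Simplification: a search term that itself contains the word "like"
    would be miscounted.
    """
    return len(re.findall(r"\blike\b", command))


def pick_cheapest(candidates: List[str]) -> str:
    """Return the candidate with the fewest operations.

    Candidates are given in the order the standard dictionaries were
    tried; on a tie, min() keeps the earliest one.
    """
    return min(candidates, key=count_operations)


candidates = [
    # Candidate from standard dictionary 38 (index A1)
    "(IA1 like '%ヘビーメタル%' and not (IB like '%ヘビメタ%')) "
    "or IB like '%タングステン合金%'",
    # Candidate from standard dictionary 38A (index A2)
    "IA2 like '%ヘビーメタル%'",
]
chosen = pick_cheapest(candidates)
```

For the two candidates shown, `chosen` is the single-lookup command on index A2.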

以下に、具体例をあげて本発明の実施例２について説明する。実施例２において、実施例１と同様に、検索要求が「コンピュータヘビーメタル解析」なる文字列である。 Embodiment 2 of the present invention is described below using a specific example. As in Embodiment 1, the search request is the character string "コンピュータヘビーメタル解析" ("computer heavy metal analysis").

表１、表２については実施例１で説明した通りである。表３は、標準辞書３８Ａからそれぞれの検索文字列に関連する文字列を抽出した結果の文字列集合を示す例である。 Tables 1 and 2 are as described in Example 1. Table 3 shows an example of a character string set as a result of extracting character strings related to each search character string from the standard dictionary 38A.

検索コマンド生成部３２は、最初に「ｓｅｌｅｃｔ＊ｆｒｏｍｔａｂｌｅ１ｗｈｅｒｅＩＷｌｉｋｅ ‘％コンピュータ％’ ａｎｄＩＷｌｉｋｅ ‘％ヘビーメタル％’ ａｎｄＩＷｌｉｋｅ ‘％解析％’」という検索コマンドを生成する。

The search command generation unit 32 first generates the search command "select * from table1 where IW like '%コンピュータ%' and IW like '%ヘビーメタル%' and IW like '%解析%'".

次に、検索コマンド生成部３２は、各部分検索コマンドごとにユーザ辞書展開処理を行う。部分検索コマンド「ＩＷｌｉｋｅ ‘％コンピュータ％’」におけるユーザ辞書展開処理について以下に説明する。 Next, the search command generation unit 32 performs a user dictionary expansion process for each partial search command. The user dictionary expansion process in the partial search command “IW like“% computer% ”” will be described below.

検索コマンド生成部３２が、検索文字列「コンピュータ」で標準辞書３８を検索する処理を行う。この結果、得られた文字列は「コンピュータ」となる。検索コマンド生成部３２は、標準辞書３８の中から、文字列「コンピュータ」を代表語とし、文字列「コンピュータ」に関連する文字列の集合である文字列集合Ｘを抽出する。 The search command generation unit 32 performs a process of searching the standard dictionary 38 using the search character string “computer”. As a result, the obtained character string is “computer”. The search command generation unit 32 extracts a character string set X, which is a set of character strings related to the character string “computer”, from the standard dictionary 38 with the character string “computer” as a representative word.

ここで、演算回数検出部３３は、前記部分検索コマンドにおける論理演算処理回数を検出する。ここでの論理演算処理回数は１回である。 Here, the operation count detection unit 33 detects the number of logical operation processes in the partial search command. Here, the number of logical operation processes is one.

次に、検索コマンド生成部３２は、検索文字列「コンピュータ」で標準辞書３８Ａを検索する処理を行う。この結果、得られた文字列は「コンピュータ」となる。検索コマンド生成部３２は、標準辞書３８Ａの中から、文字列「コンピュータ」を代表語とし、文字列「コンピュータ」に関連する文字列の集合である文字列集合Ｚを抽出する。 Next, the search command generation unit 32 performs a process of searching the standard dictionary 38A with the search character string “computer”. As a result, the obtained character string is “computer”. The search command generation unit 32 extracts, from the standard dictionary 38A, a character string set Z that is a set of character strings related to the character string “computer” with the character string “computer” as a representative word.

このとき、文字列集合Ｚは「コンピューター、コンピュータ」なる要素で構成されており、その要素においてユーザ辞書３６の中から抽出された文字列集合Ｙと異なる。よって、ここでは文字列集合Ｙと文字列集合Ｚに基づいて部分検索コマンドを生成するにあたり、実施例１で説明したように、複数回の論理演算処理を行う必要がある。 At this time, the character string set Z is composed of elements “computer, computer”, and is different from the character string set Y extracted from the user dictionary 36 in that element. Therefore, here, when generating the partial search command based on the character string set Y and the character string set Z, it is necessary to perform a plurality of logical operation processes as described in the first embodiment.

そこで、検索コマンド生成部３２は、検索文字列「コンピュータ」に関する部分検索コマンドを、演算処理回数が１回である「ＩＡ１ｌｉｋｅ ‘％コンピュータ％’」とする。 Accordingly, the search command generation unit 32 sets the partial search command related to the search character string “computer” as “IA1 like‘% computer% ’” in which the number of calculation processes is one.

部分検索コマンド「ＩＷｌｉｋｅ ‘％解析％’」も同様に、標準辞書３８とユーザ辞書３６のそれぞれから抽出される文字列集合Ｘ、文字列集合Ｙは、どちらも「解析、分析」なる要素で構成されている。すなわち、ここで生成される部分検索コマンドは、「ＩＡ１ｌｉｋｅ ‘％解析％’」であり、論理演算処理回数が１回となる。 Similarly, the partial search command “IW like“% analysis% ”” includes both the character string set X and the character string set Y extracted from the standard dictionary 38 and the user dictionary 36 as elements of “analysis and analysis”. It is configured. That is, the partial search command generated here is “IA1 like“% analysis% ””, and the number of logical operation processes is one.

次に、検索コマンド生成部３２は、検索文字列「解析」で標準辞書３８Ａを検索して得られた文字列「解析」を代表語とし、文字列「解析」に関連する文字列の集合である文字列集合Ｚを抽出する。このとき、文字列集合Ｚ、文字列集合Ｙが、どちらも「解析、分析」なる要素で構成されている。すなわち、ここで生成される部分検索コマンドの論理演算処理回数も１回となる。 Next, the search command generation unit 32 uses a character string “analysis” obtained by searching the standard dictionary 38A with the search character string “analysis” as a representative word, and is a set of character strings related to the character string “analysis”. A certain character string set Z is extracted. At this time, the character string set Z and the character string set Y are both composed of elements of “analysis and analysis”. That is, the number of logical operation processes of the partial search command generated here is also one.

よって、検索コマンド生成部３２は、検索文字列「解析」に関する部分検索コマンドを、「ＩＡ１ｌｉｋｅ ‘％解析％’」とする。 Therefore, the search command generation unit 32 sets the partial search command related to the search character string “analysis” to “IA1 like“% analysis% ””.

次に、部分検索コマンド「ＩＷｌｉｋｅ ‘％ヘビーメタル％’」のユーザ辞書展開処理について以下に説明する。 Next, the user dictionary expansion process of the partial search command “IW like“% heavy metal% ”” will be described below.

標準辞書３８とユーザ辞書３６に基づきユーザ辞書展開処理を施した際に生成される部分検索コマンドは、実施例１で説明したように、「（ＩＡ１ｌｉｋｅ ‘％ヘビーメタル％’ ａｎｄｎｏｔ（ＩＢｌｉｋｅ ‘％ヘビメタ％’））ｏｒＩＢｌｉｋｅ ‘％タングステン合金％’」である。 As described in Embodiment 1, the partial search command generated when the user dictionary expansion process is performed with the standard dictionary 38 and the user dictionary 36 is "(IA1 like '%ヘビーメタル%' and not (IB like '%ヘビメタ%')) or IB like '%タングステン合金%'".

次に、検索コマンド生成部３２は、検索文字列「ヘビーメタル」で標準辞書３８Ａを検索する処理を行う。この結果、得られた文字列は「タングステン合金」となる。検索コマンド生成部３２は、標準辞書３８Ａの中から、文字列「タングステン合金」を代表語とし、文字列「タングステン合金」に関連する文字列の集合である文字列集合Ｚを抽出する。このとき、文字列集合Ｚは「タングステン合金、ヘビーメタル、へビィメタル」なる要素で構成されており、その要素において文字列集合Ｙと同じである。 Next, the search command generation unit 32 searches the standard dictionary 38A with the search character string "ヘビーメタル". The character string obtained is "タングステン合金" ("tungsten alloy"). The search command generation unit 32 extracts from the standard dictionary 38A the character string set Z, the set of character strings related to "タングステン合金" with "タングステン合金" as the representative word. Here, Z consists of the elements "タングステン合金, ヘビーメタル, へビィメタル", which are the same elements as in the character string set Y.

すなわち、ここで文字列集合Ｙと文字列集合Ｚに基づいて生成される部分検索コマンドは、「ＩＡ２ｌｉｋｅ ‘％ヘビーメタル％’」となり、論理演算処理回数が１回である。 That is, the partial search command generated from the character string sets Y and Z is "IA2 like '%ヘビーメタル%'", which involves one logical operation.

よって、検索コマンド生成部３２は、検索文字列「ヘビーメタル」に関する部分検索コマンドを、「ＩＡ２ｌｉｋｅ ‘％ヘビーメタル％’」とする。このようにして部分検索コマンドが生成されると、検索コマンド生成部３２は、これらの論理積をとる処理を行い、検索サーバ２０Ａへ送出するための検索コマンドを生成する。 Therefore, the search command generation unit 32 sets the partial search command for the search character string "ヘビーメタル" to "IA2 like '%ヘビーメタル%'". Once the partial search commands have been generated in this way, the search command generation unit 32 takes their logical product (AND) and generates the search command to be sent to the search server 20A.

ここで、検索サーバ２０Ａに送出される検索コマンドは、「ｓｅｌｅｃｔ＊ｆｒｏｍｔａｂｌｅ１ｗｈｅｒｅＩＡ１ｌｉｋｅ ‘％コンピュータ％’ ａｎｄＩＡ２ｌｉｋｅ ‘％ヘビーメタル％’ ａｎｄＩＡ１ｌｉｋｅ ‘％解析％’」となる。 The search command sent to the search server 20A is therefore "select * from table1 where IA1 like '%コンピュータ%' and IA2 like '%ヘビーメタル%' and IA1 like '%解析%'".

このように、実施例２の検索クライアント３０Ａにおいては、演算処理回数が最も少ない部分検索コマンドを生成し、これを検索サーバ２０Ａへ送出している。これにより、検索サーバ２０Ａにかかる負荷が軽減され、より高速な検索処理を行うことができる。 As described above, the search client 30A according to the second embodiment generates a partial search command with the smallest number of calculation processes and sends it to the search server 20A. As a result, the load on the search server 20A is reduced, and faster search processing can be performed.

また、本発明の実施例１および実施例２で説明した標準辞書３８、標準辞書３８Ａに関しては、標準辞書更新部３４により更新することができる。すなわち、検索クライアント３０、検索クライアント３０Ａに保持された標準辞書に変更があった場合に、標準辞書配信部２８より、変更された標準辞書情報である標準辞書更新データが配信される。標準辞書更新部３４は、この標準辞書更新データに基づき標準辞書３８、標準辞書３８Ａを更新する。 The standard dictionary 38 and the standard dictionary 38A described in Embodiments 1 and 2 can be updated by the standard dictionary update unit 34. That is, when a standard dictionary held in the search client 30 or the search client 30A has been changed, the standard dictionary distribution unit 28 distributes standard dictionary update data, which is the changed standard dictionary information. The standard dictionary update unit 34 updates the standard dictionary 38 and the standard dictionary 38A based on this update data.

以上のように、本発明によれば、検索漏れの少ない高精度の検索処理を実行することができ、かつ検索処理速度の低下を抑えることが可能な検索クライアントを提供することができる。 As described above, according to the present invention, it is possible to provide a search client that can execute high-precision search processing with few search omissions and can suppress a decrease in search processing speed.

また、ユーザが自由に編集できるユーザ辞書を用いて検索コマンドを生成することで、ユーザ固有の特性を反映させ、ユーザ側でカスタマイズされた検索処理を実行させることが可能な検索クライアントを提供することができる。 Also, to provide a search client capable of executing a search process customized on the user side by reflecting a user-specific characteristic by generating a search command using a user dictionary that can be freely edited by the user. Can do.

また、論理演算処理回数の最も少ない検索コマンドを生成することで、検索サーバの負担を軽減でき、かつ検索処理速度の低下を抑えることが可能な検索クライアントを提供することができる。 Further, by generating a search command with the smallest number of logical operation processes, it is possible to provide a search client that can reduce the load on the search server and suppress a decrease in search processing speed.

さらに、検索クライアントに保持された標準辞書を更新することで、常に最新の情報に基づいた検索処理を実行させることが可能であり、かつその他の本発明の効果を継続させることが可能な検索クライアントを提供することができる。 Furthermore, by updating the standard dictionary held in the search client, it is possible to always execute a search process based on the latest information, and to continue other effects of the present invention. Can be provided.

以上、各実施例に基づき本発明の説明を行ってきたが、上記実施例に示した要件に本発明が限定されるものではない。これらの点に関しては、本発明の主旨をそこなわない範囲で変更することが可能であり、その応用形態に応じて適切に定めることができる。 Although the present invention has been described based on each embodiment, the present invention is not limited to the requirements shown in the above embodiment. With respect to these points, the present invention can be changed within a range that does not detract from the gist of the present invention, and can be appropriately determined according to the application form.

The present invention is applicable to a search client that forms part of a search system performing search processing.

A functional block diagram of a search system having the search client according to the first embodiment of the present invention.
A flowchart showing the flow of search processing in the search system 10.
A flowchart explaining the user dictionary expansion processing in the first embodiment of the present invention.
A functional block diagram of a search system having the search client according to the second embodiment of the present invention.
A flowchart explaining the user dictionary expansion processing in the second embodiment of the present invention.

Explanation of symbols

10, 10A Search system
20, 20A Search server
21 Standard dictionary distribution unit
30, 30A Search client
32 Search command generation unit
33 Operation count detection unit
34 Standard dictionary update unit
36 User dictionary
38, 38A Standard dictionary
39 Character normalization dictionary
A1, A2, B Index

Claims

A search client that accesses and searches a search server, the search server having
n standard indexes, each corresponding to one of n standard dictionaries that collect basic word data and normalization results for language analysis, and each generated from character strings normalized using the corresponding standard dictionary, and
a character-unit normalized index generated from character strings normalized using a character normalization dictionary that performs character-unit normalization on a search character string included in a search request,
the search client comprising:
an editable user dictionary in which user-specific words can be additionally registered and which collects special word data for each individual;
the n standard dictionaries;
the character normalization dictionary;
search command generation means for generating a search command; and
operation count detection means for detecting the number of logical operations included in the search command,
wherein the search command generation means
searches for the search character string independently in one standard dictionary among the n standard dictionaries and in the user dictionary,
extracts a set Xi, which is the set of character strings whose representative word is a character string found by the search of the one standard dictionary, and a set Yi, which is the set of character strings whose representative word is a character string found by the search of the user dictionary, and
generates a search command including a logical operation that extracts the part of a first set not included in the logical product of the first set and a second set, where
the first set is the result obtained by searching the standard index corresponding to the one standard dictionary using the elements of the set Xi as search character strings, and
the second set is the logical OR of the result obtained by searching the character-unit normalized index with each element of a set SUBi, which consists of the elements included in the set Xi and not included in the set Yi, and the result obtained by searching the character-unit normalized index with each element of a set ADDi, which consists of the elements included in the set Yi and not included in the set Xi,
and wherein the search client sends to the search server, from among the search commands generated by the search command generation means for each of the n standard dictionaries, the search command with the smallest number of operations detected by the operation count detection means.

The search client according to claim 1, further comprising a standard dictionary update unit configured to update the standard dictionary based on standard dictionary update data distributed from the search server.
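
The command generation and selection recited in claim 1 can be illustrated with a small Python sketch. It is not the patented implementation. The dictionary layout (representative word mapped to its variants), the tuple-based command format, the index names `std1`, `std2` and `char`, the rule that each OR, AND and difference node counts as one logical operation, and all sample data are assumptions made for illustration. The combination of the first and second sets follows the claim wording literally as translated, so the composition intended in the embodiments may differ.

```python
# Hypothetical sketch: every name, data shape and counting rule is an assumption.

def expand(query, dictionary):
    """Return the query plus every string sharing a representative word with it."""
    found = {query}
    for representative, variants in dictionary.items():
        group = {representative} | set(variants)
        if query in group:
            found |= group
    return found

def or_all(terms, index_name):
    """Join one index lookup per term with OR; None when there are no terms."""
    expr = None
    for term in sorted(terms):
        leaf = ("LOOKUP", index_name, term)
        expr = leaf if expr is None else ("OR", expr, leaf)
    return expr

def build_command(query, std_name, std_dict, user_dict):
    """Build the command for one standard dictionary (sets Xi, Yi, SUBi, ADDi)."""
    x = expand(query, std_dict)           # set Xi
    y = expand(query, user_dict)          # set Yi
    sub, add = x - y, y - x               # sets SUBi and ADDi
    first = or_all(x, std_name)           # lookups in the standard index
    second = or_all(sub | add, "char")    # SUBi and ADDi lookups ORed together
    if second is None:
        return first
    # Part of the first set not included in (first AND second).
    return ("DIFF", first, ("AND", first, second))

def count_ops(expr):
    """Count logical operations: every non-lookup node counts as one (assumed)."""
    if expr[0] == "LOOKUP":
        return 0
    return 1 + sum(count_ops(child) for child in expr[1:])

def evaluate(expr, indexes):
    """Stand-in for the search server: evaluate a command against toy indexes."""
    op = expr[0]
    if op == "LOOKUP":
        return set(indexes[expr[1]].get(expr[2], set()))
    left, right = evaluate(expr[1], indexes), evaluate(expr[2], indexes)
    if op == "OR":
        return left | right
    if op == "AND":
        return left & right
    return left - right                   # "DIFF"

# Toy data (assumed): two standard dictionaries and one user dictionary.
standard_dicts = {
    "std1": {"computer": {"computers", "PC"}},
    "std2": {"computer": {"computers"}},
}
user_dict = {"computer": {"computers", "workstation"}}
indexes = {
    "std1": {"computer": {1, 2}, "computers": {2, 3}, "PC": {5}},
    "std2": {"computer": {1, 2}, "computers": {2, 3}},
    "char": {"workstation": {3, 4}, "PC": {5}},
}

commands = {name: build_command("computer", name, d, user_dict)
            for name, d in standard_dicts.items()}
# Send only the command with the fewest logical operations.
best_name = min(commands, key=lambda name: count_ops(commands[name]))
best = commands[best_name]
result = evaluate(best, indexes)
```

In this toy data the command built from `std2` has fewer operations than the one built from `std1`, because `std2` leaves a smaller SUBi set and fewer variants to OR together, so it is the command that would be sent.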