JP2849263B2

JP2849263B2 - Keyword expansion search system

Info

Publication number: JP2849263B2
Application number: JP4033041A
Authority: JP
Inventors: 正樹細井
Original assignee: Fujitsu FIP Corp
Current assignee: Fujitsu FIP Corp
Priority date: 1992-02-20
Filing date: 1992-02-20
Publication date: 1999-01-20
Anticipated expiration: 2014-01-20
Also published as: JPH05233704A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a data retrieval system that retrieves data using keywords. In particular, it relates to a keyword expansion data retrieval system that is effective when searching with keywords that have several notations for the same item, as occurs in katakana notation and mixed kanji-kana notation. Examples are 「ウイスキー」 and 「ウィスキー」 (both "whisky"), or 「読み出し」 and 「読出し」 (both "read-out"): keywords with the same meaning but several different notations. Keywords of this kind are hereinafter called "keywords with ambiguity".

[0002]

[Prior Art] In recent years, database systems have been required to prevent search omissions. In conventional search processing, the input keyword was used as is and compared with the keywords held by the data in the file. However, the notation of a keyword with ambiguity, as described above, varies with the user who enters it. Consequently, even if the data the user wants exists, it cannot be retrieved unless the input keyword exactly matches the key on the data, and search omissions often occurred.

[0003] To solve this problem, a search method has conventionally been used in which a synonym dictionary is registered in the system and, when searching by keyword, the dictionary is consulted to generate keywords in alternative expressions for the search. However, searching with a synonym dictionary requires creating the dictionary, and its maintenance takes a great deal of time.

[0004] In particular, maintenance involves not only maintaining the registered synonyms but also constantly registering new words in the synonym dictionary, so that new words not yet in the dictionary can be searched. As another method for searching with keywords with ambiguity, a method is known in which keywords are normalized by predetermined rules and stored in a file; at search time, the keyword entered by the user is normalized by the same rules, and the data is searched using the normalized keyword (Japanese Patent Laid-Open No. 63-211023).

[0005] In the search method described in that publication, for example, when two katakana notations of 「日本」 (Japan) are possible, 「ニッポン」 (Nippon) and 「ニホン」 (Nihon), the file registers the word under the unified notation 「ニホン」. At search time, whichever of the two notations the user enters, the keyword is converted to 「ニホン」 and the search is performed with the converted keyword.
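The normalization approach of the cited publication can be illustrated with a minimal Python sketch. The table `NORMAL_FORM` and the function names are hypothetical and exist only for illustration; the publication's actual rules are not given in this document.

```python
# Hypothetical normalization table: every known variant maps to one unified notation.
NORMAL_FORM = {"ニッポン": "ニホン", "ニホン": "ニホン"}


def normalize(keyword: str) -> str:
    """Return the unified notation of a keyword (unchanged if no rule applies)."""
    return NORMAL_FORM.get(keyword, keyword)


def build_index(records: dict[str, list[str]]) -> dict[str, list[str]]:
    """Re-key the stored data by normalized keyword.

    This is the step that requires an existing database to be rewritten.
    """
    index: dict[str, list[str]] = {}
    for key, docs in records.items():
        index.setdefault(normalize(key), []).extend(docs)
    return index


def search(index: dict[str, list[str]], keyword: str) -> list[str]:
    """Normalize the query in the same way, then look it up."""
    return index.get(normalize(keyword), [])
```

In this sketch both 「ニッポン」 and 「ニホン」 reach the same entries, but only because `build_index` has already re-keyed the stored data.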

[0006] However, in that search method the keys held by the data registered in the file must be normalized. When an existing database is used, the keys of the data in the file have to be normalized, so the existing database cannot be used as is. To summarize, the first conventional search method described above gives no consideration to keywords with ambiguity, and search results are not guaranteed when a synonymous but different expression is used, so restrictions must be imposed on the keywords that users enter.

[0007] The second conventional search method described above has the problem that dictionary maintenance takes a great deal of time. The third conventional search method described above has the problem that an existing database cannot be used as is.

[0008]

[Problem to Be Solved by the Invention] The present invention was made to remedy the above drawbacks of the prior art. Its object is to provide a keyword expansion search system that can search data using keywords with ambiguity without using a synonym dictionary, without applying any processing to an existing database, and without imposing restrictions on keywords.

[0009]

[Means for Solving the Problem] FIG. 1 is a block diagram showing the principle of the present invention. To solve the above problem, the present invention comprises, as shown in FIG. 1: a database 1 storing keywords and the data corresponding to each keyword; a terminal 2 for entering a keyword; a keyword expansion processing unit 3 that decomposes the keyword into units of characters or character strings and generates a plurality of keywords by applying predetermined conversion rules to each decomposed unit; and a data search processing unit 4 that searches the database 1 for data based on the keywords generated by the keyword expansion processing unit 3.

[0010] When a keyword is entered from the terminal 2, the keyword expansion processing unit 3 decomposes the keyword into units of characters or character strings and applies predetermined conversion rules, consisting of rewritable character strings and exception rules, to each decomposed unit. This generates a plurality of keywords that have the same meaning as the input keyword but different notations, and the database 1 is searched for data based on the generated keywords. A keyword expansion processing unit 3 may also be provided that applies katakana notation conversion rules to a keyword in katakana notation to generate a plurality of keywords in katakana notation.

[0011] Furthermore, a keyword expansion processing unit 3 may be provided that applies mixed kanji-kana notation conversion rules to a keyword in mixed kanji-kana notation to generate a plurality of keywords in mixed kanji-kana notation.

[0012]

[Operation] When a keyword is entered from the terminal 2, the keyword expansion processing unit 3 generates from it a plurality of keywords that have the same meaning but different notations. The data search processing unit 4 searches the database 1 for data using the keywords generated by the keyword expansion processing unit 3.

[0013] The keyword is decomposed into units of characters or character strings, a plurality of keywords is generated by applying predetermined conversion rules, consisting of rewritable character strings and exception rules, to each decomposed unit, and the database 1 is searched using the generated keywords. As a result, correct search processing can be performed without imposing restrictions on keywords, even if the way a keyword is written differs somewhat from person to person.

[0014]

[Embodiment] FIG. 2 shows one embodiment of the system configuration of the keyword expansion search method of the present invention. In the figure, 11 is a terminal, 12 is a search processing unit, 12a is a keyword expansion processing unit, 12a-1 is a keyword inference/control engine, 12a-2 is a variant notation generation rule storage file, 12b is a data search processing unit, 13 is a database, 13a is an inverted file, and 13b is a data section.

[0015] In the figure, the search processing unit 12 contains the keyword expansion processing unit 12a and the data search processing unit 12b. The keyword expansion processing unit 12a generates a plurality of synonymous keywords from the keyword entered at the terminal 11. Its variant notation generation rule storage file 12a-2 stores rules for converting keyword notation. The keyword inference/control engine 12a-1 consults the file 12a-2 and expands the keyword entered at the terminal 11 to generate a plurality of keywords in variant notations.

[0016] The data search processing unit 12b in the search processing unit 12 searches the database 13 for the required data based on the keywords generated by the keyword expansion processing unit 12a. The database 13 contains an inverted file 13a and a data section 13b. The inverted file 13a stores each keyword together with the storage position, in the data section 13b, of the data corresponding to it. The data section 13b stores the data corresponding to the keywords (in the figure, documents relating to the keywords).
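The relationship between the inverted file 13a and the data section 13b can be sketched in Python as follows. This is only an illustration under simple assumptions: the variable names, the list-based data section, and the document contents are invented, and the patent does not specify the storage format.

```python
# Data section 13b: documents stored at numbered positions (contents are made up).
data_section = [
    "document 1 on ウイスキー",
    "document 1 on ウィスキー",
    "document 2 on ウィスキー",
]

# Inverted file 13a: keyword -> storage positions in the data section.
inverted_file = {"ウイスキー": [0], "ウィスキー": [1, 2]}


def lookup(keyword: str) -> list[str]:
    """Resolve one keyword to its documents via the inverted file."""
    return [data_section[pos] for pos in inverted_file.get(keyword, [])]
```

Because the lookup is an exact match on the key, a query in one notation does not reach documents indexed under another notation; that gap is what the keyword expansion is meant to close.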

[0017] FIGS. 3 and 4 show examples of the conversion rules stored in the variant notation generation rule storage file 12a-2. FIG. 3 shows an example of loanword katakana notation conversion rules. When keywords containing katakana are to be entered, the file 12a-2 stores, as shown in the figure, katakana notation conversion rules (for example, that 「チャー」 can be converted to 「チュア」 and 「チュア」 to 「チャー」) and their exception rules (for example, that the 「チ」 in 「チャ、チュ、…チォ」 does not become 「ティ」).
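One possible way to represent a conversion rule together with its exception rule is sketched below in Python. The `Rule` class, the `applicable` helper, and the set `SMALL_KANA` are hypothetical. The rule 「チ」 to 「ティ」 is an assumption inferred from the exception quoted above; the text states only the exception, not the rule itself.

```python
from dataclasses import dataclass
from typing import Callable


def never(keyword: str, pos: int) -> bool:
    """Default exception test: the rule has no exception."""
    return False


@dataclass(frozen=True)
class Rule:
    source: str  # rewritable string to look for
    target: str  # notation it may be rewritten to
    # Returns True when the rewrite must NOT be applied at `pos` of `keyword`.
    is_exception: Callable[[str, int], bool] = never


# Small kana that can follow チ, as in チャ, チュ, ... チォ (assumed set).
SMALL_KANA = set("ャュョァィゥェォ")


def chi_before_small_kana(keyword: str, pos: int) -> bool:
    """Exception from the text: the チ of チャ, チュ, ... does not become ティ."""
    return keyword[pos + 1 : pos + 2] in SMALL_KANA


RULES = [
    Rule("チャー", "チュア"),
    Rule("チュア", "チャー"),
    Rule("チ", "ティ", chi_before_small_kana),  # assumed rule, see lead-in
]


def applicable(rule: Rule, keyword: str, pos: int) -> bool:
    """True if the rule matches at `pos` and no exception blocks it."""
    return keyword.startswith(rule.source, pos) and not rule.is_exception(keyword, pos)
```

Keeping the exception as a predicate next to the rule means the matching step and the exception check can be evaluated separately, which mirrors the two-stage test described for the flowchart of FIG. 6.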

[0018] FIG. 4 shows an example of mixed kanji-kana notation conversion rules and new/old kanji notation conversion rules. When keywords in mixed kanji-kana notation are to be entered, the variant notation generation rule storage file 12a-2 stores, as shown in the figure, mixed kanji-kana notation conversion rules (for example, that 「読み出し」 can be converted to 「読出し」) and their exception rules (for example, that 「１の位」, "the ones place", cannot be converted to 「１位」, "first place").

[0019] When keywords in new or old kanji notation are to be entered, new/old kanji notation conversion rules (for example, that 「斉」 can be converted to 「斎」) are stored in the variant notation generation rule storage file 12a-2, as shown in the figure. Next, search processing in the system of FIG. 2 is described. When the user enters a search keyword (for example, 「ウィスキー」) at the terminal 11, the keyword is passed to the keyword expansion processing unit 12a of the search processing unit 12.

[0020] The keyword inference/control engine 12a-1 in the keyword expansion processing unit 12a applies the conversion rules stored in the variant notation generation rule storage file 12a-2 to the keyword entered at the terminal 11 (for example, 「ウィスキー」), generates synonymous keywords in different notations, and passes the generated keywords to the data search processing unit 12b.

[0021] For example, consulting the loanword katakana notation conversion rules of FIG. 3 for 「ウィスキー」 shows that the 「ウィ」 in 「ウィスキー」 can be converted to 「ウイ」, that the long-vowel mark of the final 「キー」 cannot be deleted, and that the conversion does not fall under any exception rule. The keyword 「ウイスキー」 is therefore generated from the keyword 「ウィスキー」.

[0022] The data search processing unit 12b consults the database 13 and searches for the data corresponding to the keywords supplied by the keyword expansion processing unit 12a. That is, it consults the inverted file 13a of the database 13 to obtain the storage positions in the data section 13b of the data corresponding to each keyword (for example, 「ウィスキー」 and 「ウイスキー」), reads the corresponding data from the data section 13b (in the figure, document 1 on 「ウイスキー」, document 1 on 「ウィスキー」, and document 2 on 「ウィスキー」), and outputs it to the terminal 11.

[0023] FIGS. 5 and 6 are flowcharts for the embodiment shown in FIG. 2. FIG. 5 is the overall flowchart of the search processing in this embodiment, and FIG. 6 is the flowchart of the "keyword expansion processing" in step S2 of FIG. 5. In FIG. 5, when the user enters a keyword at the terminal 11 of FIG. 2 to perform a search (step S1), the keyword is sent to the keyword expansion processing unit 12a of the search processing unit 12, and keyword expansion processing is performed (step S2).

[0024] In the keyword expansion processing of FIG. 6, step T1 first stores the keyword notation in the output table, and step T2 sets the search position in the keyword notation to the beginning. Step T3 determines whether the search of the keyword notation has finished. If it has not, processing goes to step T4, which sets the rule search position to the first rule.

[0025] Next, step T5 determines whether the rule search has finished. If it has not, processing goes to step T6, which consults the conversion rules stored in the variant notation generation rule storage file 12a-2 and determines whether the rule can be applied to the character string starting at the search position. If step T5 determines that the rule search has finished, processing goes to step T10, which moves the search position in the keyword notation one character toward the end, and then returns to step T3 to repeat the above processing.

[0026] If step T6 determines that the rule cannot be applied, processing goes to step T8, which moves the rule search position to the next rule, and then passes through step T5 to step T6, which again determines whether the rule can be applied to the character string starting at the search position. This is repeated. When step T6 determines that the rule can be applied, processing goes to step T7, which determines whether no exception rule applies (that is, whether the rule is a candidate for application). If an exception rule applies, processing again goes to step T8, moves the rule search position to the next rule, and repeats the above processing.

[0027] If step T7 finds no applicable exception rule, processing goes to step T9, which converts every notation in the output table according to the rule and appends the results to the output table in order, one for each entry already in the table. For example, suppose the character string "A" can be converted to "a" and the character string "B" to "b", and the character string "AB" is given as the keyword. "AB" is first recorded in the output table. The conversion rule for "A" is then applied to convert "AB" to "aB", and the converted keyword "aB" is recorded in the output table.

[0028] The output table at this point is as follows.

"AB", "aB"

Next, the conversion rule for "B" is applied to convert every notation in the output table ("AB" and "aB"), and the results are appended to the output table in order, one for each entry in the table (two in this case).

[0029] That is, the conversion rule for "B" converts "AB" to "Ab" and "aB" to "ab", and these are appended, so the output table becomes as follows.

"AB", "aB", "Ab", "ab"

Processing then goes to step T10, which moves the search position in the keyword notation one character toward the end, and returns to step T3 to repeat the above processing.
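The expansion loop of FIG. 6 (steps T1 to T10) can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `expand` and the rule format (pairs of source and target strings) are assumptions, the exception check of step T7 is omitted, and keeping one slot per original character is a design choice made here so that positions stay aligned when a rewrite changes the length of a notation.

```python
def expand(keyword: str, rules: list[tuple[str, str]]) -> list[str]:
    """Generate variant notations of `keyword` in output-table order."""
    # T1: the output table starts with the keyword itself.
    # Each entry keeps one slot per original character.
    table = [list(keyword)]
    # T2/T3/T10: scan the keyword from the first character to the last.
    for pos in range(len(keyword)):
        # T4/T5/T8: try every rule at this position.
        for source, target in rules:
            # T6: does the string starting at the search position match the rule?
            if not keyword.startswith(source, pos):
                continue
            # T7 (exception rules) is omitted in this sketch.
            span = len(source)
            new_entries = []
            # T9: convert every notation already in the table ...
            for entry in table:
                if entry[pos:pos + span] != list(source):
                    continue  # span already rewritten by an overlapping rule
                converted = entry.copy()
                converted[pos:pos + span] = [target] + [""] * (span - 1)
                new_entries.append(converted)
            # ... and append the results after the existing entries.
            table.extend(new_entries)
    return ["".join(entry) for entry in table]


print(expand("AB", [("A", "a"), ("B", "b")]))
```

With the two rules of the worked example, the table grows from "AB" to "AB", "aB" and then to "AB", "aB", "Ab", "ab", in the same order as in the description above.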

[0030] When step T3 determines that the search of the keyword notation has finished, keyword expansion processing ends. When keyword expansion processing has finished, processing goes to step S3 of FIG. 5, in which the keyword expansion processing unit 12a sets the generated keywords, one at a time in order, in the input area of the data search processing unit 12b.

[0031] Processing then goes to step S4, which determines whether any data remains to be set in the data search processing unit 12b. If data remains, processing goes to step S5 to perform the data search, step S6 outputs the search result, and processing returns to step S3 to repeat the above. If no data remains to be set in the data search processing unit 12b, search processing ends.
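The overall flow of FIG. 5 (steps S1 to S6) can be sketched in Python as a loop over the expanded keywords. The function `search_with_expansion` and its parameters are hypothetical; the expansion step is passed in as a callable so that the sketch stays independent of any particular rule set.

```python
from typing import Callable, Iterable


def search_with_expansion(
    keyword: str,                              # S1: keyword entered at the terminal
    expand: Callable[[str], Iterable[str]],    # S2: keyword expansion (assumed interface)
    inverted_file: dict[str, list[int]],
    data_section: list[str],
) -> list[str]:
    """Search once per generated keyword and collect the results in order."""
    results = []
    # S3/S4: take the generated keywords one at a time until none remain.
    for variant in expand(keyword):
        # S5: look up the variant in the inverted file.
        for pos in inverted_file.get(variant, []):
            # S6: output the matching data.
            results.append(data_section[pos])
    return results
```

Because each variant is searched separately against the unchanged index, the stored keys do not have to be normalized, which is the property the description emphasizes.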

[0032] The embodiment described above presented loanword katakana notation conversion rules, mixed kanji-kana notation conversion rules, and new/old kanji notation conversion rules. These conversion rules may be used singly or in combination. The conversion rules are also not limited to those of the embodiment; various other conversion rules can be used, such as rules for sentence-ending forms (for example, 「です」 and 「である」) or for foreign-language notation.

[0033]

[Effects of the Invention] As is clear from the above description, according to the present invention a keyword is decomposed into units of characters or character strings, a plurality of keywords is generated by applying predetermined conversion rules, consisting of rewritable character strings and exception rules, to each decomposed unit, and search processing is performed based on the generated keywords. Correct search processing can therefore be performed without imposing restrictions on keywords, even if the way a keyword is written differs somewhat from person to person, and users can search using ordinary expressions. In addition, searching with keywords with ambiguity becomes possible without maintaining a synonym dictionary and without making any change to an existing database.

[0034] In addition, searching with keywords with ambiguity becomes possible without maintaining a synonym dictionary and without making any change to an existing database.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the principle of the present invention.

FIG. 2 shows the system configuration of an embodiment of the present invention.

FIG. 3 shows loanword katakana notation conversion rules.

FIG. 4 shows mixed kanji-kana notation conversion rules and new/old kanji conversion rules.

FIG. 5 is a flowchart of the search processing of the embodiment of the present invention.

FIG. 6 is a flowchart of the keyword expansion processing of the embodiment of the present invention.

[Explanation of symbols]

1, 13: database
2, 11: terminal
3, 12a: keyword expansion processing unit
4, 12b: data search processing unit
12: search processing unit
12a-1: keyword inference/control engine
12a-2: variant notation generation rule storage file
13a: inverted file
13b: data section

Claims

(57) [Claims]

1. A keyword expansion search system comprising: a database (1) storing keywords and data corresponding to each keyword; a terminal (2) for inputting a keyword; a keyword expansion processing unit (3) that decomposes the keyword into units of characters or character strings and generates a plurality of keywords by applying, to each decomposed unit of characters or character strings, predetermined conversion rules consisting of rewritable character strings and exception rules; and a data search processing unit (4) that searches the database (1) for data based on the keywords generated by the keyword expansion processing unit (3), wherein, when a keyword is input from the terminal (2), the keyword expansion processing unit (3) generates from the input keyword a plurality of keywords having the same meaning and different notations, and the data search processing unit (4) searches the database (1) for data based on the plurality of generated keywords.

2. The keyword expansion search system according to claim 1, comprising a keyword expansion processing unit (3) that applies katakana notation conversion rules to a keyword in katakana notation to generate a plurality of keywords in katakana notation.

3. The keyword expansion search system according to claim 1, comprising a keyword expansion processing unit (3) that applies mixed kanji-kana notation conversion rules to a keyword in mixed kanji-kana notation to generate a plurality of keywords in mixed kanji-kana notation.