JP2002132809A

JP2002132809A - Character string retrieval method, its implementation device, and recording medium recorded with processing program thereof

Info

Publication number: JP2002132809A
Application number: JP2000330796A
Authority: JP
Inventors: Noriyasu Kotaki; 伯泰小瀧; Satoshi Kikuchi; 菊地　　聡; Toshiharu Nakamura; 敏治中村
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2000-10-30
Filing date: 2000-10-30
Publication date: 2002-05-10

Abstract

PROBLEM TO BE SOLVED: To provide a technique that improves the usability of character string retrieval. SOLUTION: The character string retrieval method comprises a step of extracting syllable elements from an input keyword, a step of normalizing notation variations in the extracted syllable elements, and a step of retrieving character strings that match the reading of the input keyword by referring to a reading-match index that indicates the storage locations of character strings containing the normalized syllable elements.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a character string search apparatus, and more particularly to a technique that is effective when applied to a character string search apparatus that, in a directory service, searches Japanese character strings written in Roman letters on the basis of their reading.

[0002]

2. Description of the Related Art

As a means of achieving smooth communication within and between companies, electronic mail systems, which send and receive documents created on information processing devices such as PCs (Personal Computers) over networks such as LANs (Local Area Networks), have become widespread. Directory services, typified by ITU-T Recommendation X.500 (ISO 9594), are used as a means of looking up a recipient's mail address, that is, as a so-called electronic address book function.

[0003] An X.500-compliant directory service has a data model that is managed hierarchically as a tree structure. Directory entries are placed at the positions corresponding to the branches and leaves of the tree. Each entry is uniquely identified by a name that includes its hierarchy information (DN: Distinguished Name) and can store, as attributes, various information such as the user's full name, telephone number, FAX number and photograph in addition to the user's mail address.

[0004] X.500 adopts a client-server distributed system architecture and specifies DAP (Directory Access Protocol), which follows the seven-layer structure of OSI (Open Systems Interconnection), as the communication protocol between the information processing devices that act as client and server. Meanwhile, the IETF (Internet Engineering Task Force), the standards body for the Internet, has standardized "LDAP: Lightweight Directory Access Protocol (RFC 2251)" as a directory client-server protocol over TCP/IP. From an application program on the client, a user accesses a directory server such as an X.500 server via DAP or LDAP and searches for the desired information, such as a user's mail address. DAP and LDAP also define directory update requests such as adding, deleting and updating entries and renaming entries.

[0005] One of the search functions specified in X.500 is the attribute search function. Attribute search finds the directory entries that match an attribute value specified by the user; for example, an attribute search on users' names corresponds to a so-called address book organized by the Japanese syllabary. Attribute search comes in two forms: full-match search, which finds the directory entries whose attribute value matches the user-specified value in its entirety, and partial-match search, which finds the directory entries whose attribute value contains the specified value.

[0006] Attribute search can be implemented with the sequential search method used in many information systems. However, because sequential search compares attribute values for every entry held by the directory service, its processing overhead is large and it cannot deliver adequate performance. The degradation is especially severe for partial-match search.

[0007] To solve this problem with partial-match search, index search methods have been proposed, for example the method of "4.3 Substring Matching" described on pages 4 and 5 of the document "An X.500 and LDAP Database: Design and Implementation" by Timothy A Howes, published in 1995 at the URL ftp://terminator.rs.itd.umich.edu/ldap/papers/xldbm.ps, and the method disclosed in JP-A-7-319920.

[0008] FIG. 21 shows an example of the information structure of the directory DB in which a conventional directory service stores its directory entries. In FIG. 21, rows are entries and columns are attributes. The directory DB configuration example 1501 manages, as attributes of each user entry, the DN (dn), English surname (sn), English name (givenName), mail address (mail), telephone number (tel) and FAX number (fax). To simplify processing, each directory entry is also given an entry ID (ID), which is a unique identification number.

[0009] FIG. 22 shows an example of the information structure of a pattern-match index for the English name attribute in the conventional index search method. The conventional index search method speeds up partial-match search by narrowing down the candidates with the pattern-match index table 1601 before referring to the directory DB configuration example 1501.

[0010] As shown in FIG. 22, the pattern-match index table 1601 consists of a character string storage area that holds pattern-match keys 1602 and a list storage area that holds entry ID lists 1603. The entry IDs registered in an entry ID list 1603 correspond to the entry IDs of the directory entries shown in FIG. 21. A pattern-match key 1602 is a substring of n characters in total, taken one character at a time at intervals of m characters from the character string that is the attribute value (in the figure, m is 0 and n is 3).

[0011] For example, when the entry whose DN is "cn=Hiroshi Sato,ou=Sales,o=Hitachi,c=JP" is registered, each attribute value is first registered in the directory DB configuration example 1501. Then, for the English name "Hiroshi", the five search keys "HIR", "IRO", "ROS", "OSH" and "SHI" are generated and registered, together with the corresponding entry ID "1", in the pattern-match keys 1602 and entry ID lists 1603 of the pattern-match index table 1601 for the English name shown in FIG. 22. Pattern-match index tables 1601 are generated in the same way for the other attributes.
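The key generation described in paragraphs [0010] and [0011] can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, upper-casing is assumed from the "HIR" example, and the handling of m > 0 (skipping m characters between picked characters) is one possible reading of the text.

```python
def pattern_match_keys(value: str, m: int = 0, n: int = 3) -> list[str]:
    """Return the pattern-match keys of `value` (hypothetical helper).

    Each key picks n characters, skipping m characters between picks.
    With m=0 and n=3 this yields ordinary contiguous trigrams.
    """
    text = value.upper()        # assumed: keys in the example are upper case
    step = m + 1                # distance between picked characters
    span = (n - 1) * step + 1   # number of characters one key stretches over
    keys: list[str] = []
    for start in range(len(text) - span + 1):
        key = text[start:start + span:step]
        if key not in keys:     # register each distinct key once
            keys.append(key)
    return keys


def register(index: dict[str, list[int]], value: str, entry_id: int) -> None:
    """Add entry_id to the entry ID list of every key generated from value."""
    for key in pattern_match_keys(value):
        ids = index.setdefault(key, [])
        if entry_id not in ids:
            ids.append(entry_id)


index: dict[str, list[int]] = {}
register(index, "Hiroshi", 1)
print(pattern_match_keys("Hiroshi"))  # ['HIR', 'IRO', 'ROS', 'OSH', 'SHI']
```

A value shorter than one key's span produces no keys in this sketch; how such short values are indexed is not stated in the text.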

[0012] At search time, likewise, n characters are extracted at intervals of m characters from the search string, the entry ID list 1603 corresponding to each resulting search key is extracted from the pattern-match index table 1601, and the candidates are narrowed down by selecting the entry IDs common to the zero or more extracted entry ID lists 1603. For example, to find entries whose English name contains the string "hiro", the two search keys "HIR" and "IRO" are generated, and the entry ID list 1603 corresponding to each pattern-match key 1602 is extracted from the pattern-match index table 1601 for the English name shown in FIG. 22. The two extracted entry ID lists 1603, "1,2,4,5,6" and "1,2,5,6", are then compared, and the common entry IDs "1,2,5,6" become the search result. In other words, the entries whose English names are "Hiroshi", "Shiro", "Chihiro" and "Ichiro" are returned.
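The narrowing step of paragraph [0012] amounts to intersecting entry ID lists. The sketch below hard-codes the two lists quoted in the text; which list belongs to which key is an assumption (the text does not say), and the function names are hypothetical.

```python
# Entry ID lists quoted in the text. Assigning the first list to "HIR" and
# the second to "IRO" is an assumption made for illustration only.
PATTERN_INDEX = {
    "HIR": [1, 2, 4, 5, 6],
    "IRO": [1, 2, 5, 6],
}


def trigram_keys(text: str) -> list[str]:
    """Contiguous three-character keys (the m=0, n=3 case)."""
    text = text.upper()
    return [text[i:i + 3] for i in range(len(text) - 2)]


def candidate_ids(index: dict[str, list[int]], query: str) -> list[int]:
    """Return the entry IDs common to the lists of every key of the query."""
    keys = trigram_keys(query)
    if not keys:                       # query too short to form a key
        return []
    id_sets = [set(index.get(key, ())) for key in keys]
    return sorted(set.intersection(*id_sets))


print(candidate_ids(PATTERN_INDEX, "hiro"))  # [1, 2, 5, 6]
```

The result contains every entry whose name has "hiro" as a character sequence, which is why names such as "Shiro" and "Ichiro" are returned alongside "Hiroshi" and "Chihiro".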

[0013]

PROBLEMS TO BE SOLVED BY THE INVENTION

With the conventional index search method described above, the following problems arise when a Japanese user searches attributes that were registered in Roman letters. Taking the case of FIG. 22 as an example, when a Japanese user asks the directory service for entries containing the string "hiro", the user can normally be assumed to want the entries of users whose name contains the reading "hiro" (ひろ). However, because the conventional index search method uses character string pattern matching and returns every entry that contains the search string, the search string "hiro" yields not only "Hiroshi" and "Chihiro" but also entries with the English names "Shiro" and "Ichiro", which the user does not want. In addition, Roman-letter notation varies from user to user. For example, in FIG. 21, the entry whose DN is "cn=Satiko Suzuki,ou=Sales,o=Hitachi,c=JP" has "Satiko" registered as its English name attribute. In such a case, searching the English name with the alternative romanization "Sachiko" does not hit that entry. An object of the present invention is to solve these problems and to provide a technique that can improve the usability of character string search.

[0014]

SUMMARY OF THE INVENTION

The present invention is a character string search apparatus that searches character strings on the basis of the reading of a keyword. In the present invention, when a character string to be searched is registered in a database, syllable elements are extracted from the character string, and notation variations are normalized by converting the several different notations that exist for the reading of an extracted syllable element into a single unified notation. Each normalized syllable element is then associated with the storage location of the character string and stored as a reading-match index.

[0015] When the user enters a keyword to search for a character string, syllable elements are extracted from the entered keyword and their notation variations are normalized in the same way as above. The reading-match index is then consulted, the character strings are read from the storage locations of the strings that contain the normalized syllable elements of the keyword, and the character strings that match the reading of the entered keyword are retrieved.
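To make the extraction and normalization steps of paragraphs [0014] and [0015] concrete, here is a small sketch. The splitting rule and the unification table are assumptions chosen for illustration (a simple consonants-plus-vowel rule, and a mapping of common Hepburn spellings to Kunrei-style spellings); the rules actually used by the invention are not given in this part of the text.

```python
import re

# Assumed splitting rule: a standalone "n" (not followed by a vowel or "y"),
# otherwise a run of consonants followed by one vowel, otherwise one letter.
_SYLLABLE = re.compile(r"n(?![aeiouy])|[b-df-hj-np-tv-z]*[aeiou]|[a-z]")

# Assumed unification table: one unified notation per reading.
_UNIFIED = {
    "shi": "si", "chi": "ti", "tsu": "tu", "fu": "hu", "ji": "zi",
    "sha": "sya", "shu": "syu", "sho": "syo",
    "cha": "tya", "chu": "tyu", "cho": "tyo",
    "ja": "zya", "ju": "zyu", "jo": "zyo",
}


def syllable_elements(text: str) -> list[str]:
    """Split a romanized string into syllable elements."""
    return _SYLLABLE.findall(text.lower())


def reading_keys(text: str) -> list[str]:
    """Syllable elements with notation variations mapped to one notation."""
    return [_UNIFIED.get(s, s) for s in syllable_elements(text)]


print(reading_keys("Sachiko"))  # ['sa', 'ti', 'ko']
print(reading_keys("Satiko"))   # ['sa', 'ti', 'ko']
print(reading_keys("Shiro"))    # ['si', 'ro']
```

Under these assumed rules "Sachiko" and "Satiko" produce the same keys, while "Shiro" contains no "hi" element, so a keyword read as "hi", "ro" no longer matches it.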

[0016] For example, consider searching, by Japanese reading, the attribute information that holds an English name (a Japanese name written in Roman letters) in a directory DB that stores the various attribute information provided by a directory service. First, the syllable elements obtained when the Roman-letter English name stored in the directory DB is read as Japanese are extracted and normalized. Each normalized syllable element is then associated with the entry ID, in the directory DB, of the attribute information that contains that English name, and stored as a reading-match index.

[0017] When such an English name is searched for, the English name that the user entered as a keyword is likewise decomposed into the syllable elements obtained when it is read as Japanese and normalized. The entry IDs corresponding to the normalized syllable elements are then read from the reading-match index, and the corresponding directory entries in the directory DB are accessed to retrieve the English names that match the entered keyword.
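The registration and search flow of paragraphs [0016] and [0017] might look like the sketch below. The syllable elements are written out by hand so that the example does not depend on any particular splitting rule; the entry IDs, the sample names, and the final check that the keyword's elements occur consecutively in the stored name are all illustrative assumptions.

```python
from collections import defaultdict

# Entry ID -> normalized syllable elements of the English name.
# IDs and names are illustrative, not taken from FIG. 21.
ENTRIES = {
    1: ["hi", "ro", "si"],   # Hiroshi
    2: ["si", "ro"],         # Shiro
    3: ["sa", "ti", "ko"],   # Satiko / Sachiko
    5: ["ti", "hi", "ro"],   # Chihiro
    6: ["i", "ti", "ro"],    # Ichiro
}


def build_reading_index(entries):
    """Map each syllable element to the set of entry IDs that contain it."""
    index = defaultdict(set)
    for entry_id, syllables in entries.items():
        for syllable in syllables:
            index[syllable].add(entry_id)
    return index


def contains_run(haystack, needle):
    """True if `needle` occurs as consecutive elements of `haystack`."""
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


def search(index, entries, query_syllables):
    """Narrow by index lookup, then verify against the stored entry."""
    if not query_syllables:
        return []
    candidates = set.intersection(
        *(index.get(s, set()) for s in query_syllables))
    return sorted(e for e in candidates
                  if contains_run(entries[e], query_syllables))


reading_index = build_reading_index(ENTRIES)
print(search(reading_index, ENTRIES, ["hi", "ro"]))        # [1, 5]
print(search(reading_index, ENTRIES, ["sa", "ti", "ko"]))  # [3]
```

With these sample entries, the keyword "hiro" returns only Hiroshi and Chihiro, and either spelling of "Sachiko" reaches the entry registered as "Satiko".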

[0018] As described above, according to the present invention, the reading-match index based on the reading of character strings makes it possible to eliminate search results whose reading differs from what the user intends and to obtain the desired results even when the notation varies, so that a directory search method that is easy for users to use can be provided. Thus, because the character string search apparatus of the present invention searches character strings on the basis of the reading of the keyword, the usability of character string search can be improved.

[0019]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

(Embodiment 1) The following describes the character string search apparatus of Embodiment 1, which, in a directory service, searches Japanese character strings written in Roman letters on the basis of their reading. The directory service of this embodiment provides a reading-based search function by using a reading-match index in which each piece of attribute information is decomposed into one or more syllable elements and associated with the directory entry of that attribute information. In this embodiment, a reading-match comparison is a comparison that checks whether the character strings being compared have the same pronunciation. For example, for Japanese written in Roman letters, "Hiroshi" and "Hirosi" do not match in a character string comparison, but their readings are identical, so a reading-match comparison treats them as a match.

[0020] FIG. 1 shows the schematic configuration of a system that provides the directory service of this embodiment. As shown in FIG. 1, in the directory service of this embodiment a directory server 101 and a client 102 are connected by a network 103 such as a LAN.

[0021] The directory server 101 comprises: a directory DB 104 that stores the various attribute information of directory entries; index definition information 105 that stores the types of index to be assigned to each piece of attribute information of a directory entry; a pattern-match index 106 that stores each piece of attribute information decomposed into one or more pattern elements; a reading-match index 107 that stores each piece of attribute information decomposed into one or more syllable elements; a directory control unit 108 that accepts access requests to the directory; a reading-match key extraction unit 109 that extracts the reading-match keys of the reading-match index 107 from attribute information and the like; and a pattern-match key extraction unit 110 that extracts the pattern-match keys of the pattern-match index 106.

[0022] The directory control unit 108 comprises a directory update control unit 111 that controls update processing on the directory, a directory search control unit 112 that controls search processing on the directory, an index update control unit 113 that controls index updates, and an index search control unit 114 that controls index searches.

[0023] The index update control unit 113 includes a pattern-match index update unit 115 that performs write processing on the pattern-match index 106 and a reading-match index update unit 116 that performs write processing on the reading-match index 107.

[0024] The index search control unit 114 includes a pattern-match index search unit 117 that performs search processing on the pattern-match index 106 and a reading-match index search unit 118 that performs search processing on the reading-match index 107.

[0025] The reading-match key extraction unit 109 comprises a syllable element extraction unit 119 that extracts syllable elements from an input keyword, the attribute information of a directory entry and the like, and a syllable element normalization unit 120 that normalizes notation variations in the extracted syllable elements. The pattern-match key extraction unit 110 comprises a key element extraction unit 121 that extracts, from attribute information and the like, the elements that become the pattern-match keys of the pattern-match index 106, and a pattern normalization unit 122 that normalizes patterns.

[0026] In this embodiment, the information elements of the pattern-match index 106 and the processing performed by the pattern-match index update unit 115, the pattern-match index search unit 117, the pattern-match key extraction unit 110, the key element extraction unit 121 and the pattern normalization unit 122 are the same as in the conventional example.

[0027] The program that causes the directory server 101 to function as the processing units described above is assumed to be recorded on a recording medium such as a CD-ROM, stored on a magnetic disk or the like of the directory server 101, and then loaded into the memory of the directory server 101 and executed. The recording medium on which the program is recorded may be a medium other than a CD-ROM.

[0028] FIG. 2 shows an example of the information structure of the reading-match index 107 of this embodiment. FIG. 2 shows the information elements of the reading-match index table 2001 for the English name attribute of FIG. 21; like the conventional pattern-match index 106, the table consists of reading-match keys 2002, which serve as search keys, and entry ID lists 2003. However, a reading-match key 2002 in the reading-match index table 2001 is a substring obtained by decomposing the character string that is the attribute value into syllable elements. The index generation method is described first. Indexes are generated not only when an entry is registered but also when the directory DB 104 is initially built. When an entry or attribute value is deleted, the corresponding index information is deleted, and when an attribute value is changed, the index is changed accordingly.

[0029] FIG. 3 shows an example of the information structure of the index definition information 105 of this embodiment, in which the operations administrator of the directory server 101 records the types of index to be assigned to each attribute of a directory entry. The index definition information 1701 of FIG. 3 consists of a storage area for the attribute name and three areas that record, for each index type, whether the index is required. For an attribute that needs an index, the operations administrator of the directory server 101 registers "Y", meaning that the index is to be created, in the area of each index type to be created; where no index is needed, the administrator registers "N", meaning that no index is required.

[0030] FIG. 4 is a flowchart showing the processing procedure of the directory control unit 108 of this embodiment. The directory control unit 108 of the directory server 101 reads a processing request from the directory client 102 (step 201), determines the content of the request (steps 202 to 206), performs a directory update (step 208) or a directory search (step 209), and returns the processing result to the directory client 102 (step 207). When the processing request from the directory client 102 is an entry registration request, an entry change request or an entry deletion request, the directory control unit 108 calls the directory update control unit 111 (step 208).
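The request routing of FIG. 4 can be pictured with a short sketch. The request layout (a dictionary with an "op" field), the operation names and the callback parameters are hypothetical; only the step numbers in the comments come from the text.

```python
UPDATE_OPS = {"register", "modify", "delete"}


def handle_request(request, update_directory, search_directory):
    """Route one client request to the update or search controller."""
    op = request.get("op")                    # steps 202 to 206: inspect request
    if op in UPDATE_OPS:
        result = update_directory(request)    # step 208: directory update
    elif op == "search":
        result = search_directory(request)    # step 209: directory search
    else:
        result = {"status": "unsupported operation"}
    return result                             # step 207: reply to the client


reply = handle_request(
    {"op": "search", "keyword": "hiro"},
    update_directory=lambda req: {"status": "updated"},
    search_directory=lambda req: {"status": "ok", "entries": []},
)
print(reply)  # {'status': 'ok', 'entries': []}
```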

[0031] FIG. 5 is a flowchart showing the processing procedure of the directory update control unit 111 of this embodiment. When the processing request from the directory client 102 is a directory entry registration request, the directory update control unit 111, called from the directory control unit 108, registers the attributes contained in the request, such as the entry's distinguished name, English surname, English name, mail address, telephone number and FAX number, in their respective storage areas in the directory DB 104 (step 301). Likewise, for an entry change request, it changes the storage areas in the directory DB 104 according to the change information contained in the request. Deletion requests are handled in the same way. The directory update control unit 111 then calls the index update control unit 113 to update the index of each attribute (step 302). This call takes as parameters the attribute name, the character string that is the attribute value, and the entry ID that uniquely identifies the directory entry.

[0032] FIG. 6 is a flowchart showing the processing procedure of the index update control unit 113 of this embodiment. The index update control unit 113, called from the directory update control unit 111, searches the index definition information 105 for the record whose attribute name is the same as the attribute name specified by the directory update control unit 111 (step 501). If "Y" is set in the full-match area or the partial-match area of that record (step 502), it instructs the pattern-match index update unit 115 to update the pattern-match index (step 505). This instruction takes as parameters the type of index to be updated (full match or partial match), the character string that is the attribute value, and the entry ID. Further, if "Y" is set in the reading-match area of the record (step 503), it instructs the reading-match index update unit 116 to update the reading-match index (step 506). This instruction takes as parameters the character string that is the attribute value and the entry ID. The processing from step 501 is repeated until all update information has been processed (step 504).
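A compact way to picture the index definition information and the dispatch of FIG. 6 is shown below. The attribute names follow the earlier paragraphs, but the Y/N settings, the field names and the callback signatures are assumptions made for illustration; FIG. 3 itself is not reproduced in the text.

```python
# Hypothetical index definition: one record per attribute, with a Y/N flag
# for each of the three index types named in the text.
INDEX_DEFINITION = {
    "sn":        {"full": "Y", "partial": "Y", "reading": "Y"},
    "givenName": {"full": "Y", "partial": "Y", "reading": "Y"},
    "tel":       {"full": "Y", "partial": "N", "reading": "N"},
}


def update_indexes(attribute, value, entry_id, update_pattern, update_reading):
    """Call the index updaters that the definition record asks for."""
    record = INDEX_DEFINITION.get(attribute)         # step 501: find record
    if record is None:
        return []
    updated = []
    for kind in ("full", "partial"):                 # step 502
        if record[kind] == "Y":
            update_pattern(kind, value, entry_id)    # step 505
            updated.append(kind)
    if record["reading"] == "Y":                     # step 503
        update_reading(value, entry_id)              # step 506
        updated.append("reading")
    return updated


calls = []
print(update_indexes(
    "tel", "03-0000-0000", 1,
    update_pattern=lambda kind, value, eid: calls.append(("pattern", kind)),
    update_reading=lambda value, eid: calls.append(("reading",)),
))  # ['full']
```

Keeping the per-attribute flags in data rather than in code mirrors the text's point that the administrator, not the program, decides which indexes each attribute gets.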

[0033] FIG. 7 is a flowchart showing the processing procedure of the pattern-match index update unit 115 of this embodiment. The pattern-match index update unit 115, called from the index update control unit 113, calls the pattern-match key extraction unit 110 and extracts the pattern-match keys 1602 (step 701). If the instruction is a registration request (step 702), it registers the entry ID in the entry ID list 1603 associated with each extracted pattern-match key 1602 in the pattern-match index table 1601 (step 706). If the instruction is a deletion request (step 703), it deletes the entry ID from the entry ID list 1603 associated with each extracted pattern-match key 1602 in the pattern-match index table 1601 (step 707). If the instruction is a change request (step 704), it deletes the entry ID from the entry ID list 1603 associated with each pattern-match key 1602 extracted from the pre-change information (step 708), and registers the entry ID in the entry ID list 1603 associated with each pattern-match key 1602 extracted from the post-change information (step 709). The processing from step 701 is repeated until all update information has been processed (step 705).
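The three update cases of FIG. 7 reduce to adding and removing an entry ID in per-key lists. The following sketch assumes the keys have already been extracted; the function names, the operation strings, and the choice to drop a key once its list becomes empty are illustrative assumptions.

```python
def _add(index, keys, entry_id):
    """Register entry_id in the entry ID list of each key."""
    for key in keys:
        ids = index.setdefault(key, [])
        if entry_id not in ids:
            ids.append(entry_id)


def _remove(index, keys, entry_id):
    """Delete entry_id from the entry ID list of each key."""
    for key in keys:
        ids = index.get(key)
        if ids and entry_id in ids:
            ids.remove(entry_id)
            if not ids:              # assumed: drop keys with empty lists
                del index[key]


def apply_update(index, op, entry_id, keys=(), old_keys=()):
    """Apply one registration, deletion or change to the index table."""
    if op == "register":             # steps 702 and 706
        _add(index, keys, entry_id)
    elif op == "delete":             # steps 703 and 707
        _remove(index, keys, entry_id)
    elif op == "modify":             # steps 704, 708 and 709
        _remove(index, old_keys, entry_id)
        _add(index, keys, entry_id)
    else:
        raise ValueError(f"unknown operation: {op}")


index = {}
apply_update(index, "register", 1, keys=["HIR", "IRO", "ROS", "OSH", "SHI"])
apply_update(index, "register", 2, keys=["SHI", "HIR", "IRO"])
apply_update(index, "modify", 2,
             keys=["SHI", "HIN", "INJ", "NJI"], old_keys=["SHI", "HIR", "IRO"])
print(index["HIR"])  # [1]
print(index["SHI"])  # [1, 2]
```

Because the routine only sees keys, the same code could serve a reading-match index by passing normalized syllable elements instead of pattern-match keys.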

[0034] FIG. 8 is a flowchart showing the processing procedure of the pattern-match key extraction unit 110 of this embodiment. The pattern-match key extraction unit 110, called from the pattern-match index update unit 115, first uses the pattern normalization unit 122 to normalize notation variations in the given pattern (step 1101). Next, the key element extraction unit 121 extracts n characters at intervals of m characters from the given pattern (step 1102). The processing from step 1102 is repeated until no key elements remain (step 1103). The pattern-match key extraction unit 110 returns the extracted keys to its caller.

【００３５】 FIG. 9 is a flowchart showing the processing procedure of the reading matching index updating unit 116 of this embodiment. For each syllable element extracted by the syllable element extraction unit 119 and normalized by the syllable element normalization unit 120, the reading matching index updating unit 116 registers, deletes, or changes, in the reading matching index table 2001, the entry ID of the attribute information containing that syllable element. The processing procedure of the reading matching index updating unit 116 of the directory server 101 is the same as that of the pattern matching index updating unit 115 shown in FIG. 7, except that the keys are extracted by the reading matching key extraction unit 109 (step 801). That is, when called by the index update control unit 113, the reading matching index updating unit 116 calls the reading matching key extraction unit 109 and extracts the reading matching keys 2002 (step 801). If the instruction is a registration request (step 802), the entry ID is registered in the entry ID list 2003 associated with each extracted reading matching key 2002 in the reading matching index table 2001 (step 806). If the instruction is a deletion request (step 803), the entry ID is deleted from the entry ID list 2003 associated with each extracted reading matching key 2002 in the reading matching index table 2001 (step 807). If the instruction is a change request (step 804), the entry ID is deleted from the entry ID list 2003 associated with each reading matching key 2002 extracted from the pre-change information (step 808), and is registered in the entry ID list 2003 associated with each reading matching key 2002 extracted from the post-change information (step 809). Processing is repeated from step 801 until all update information has been processed (step 805).
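The register, delete, and change branches of FIG. 9 can be sketched as follows. This is a minimal illustration under stated assumptions: the index is modelled as a Python mapping from key to a set of entry IDs, the function name `update_reading_index` and the request strings are hypothetical, and the keys are assumed to have been extracted already (step 801). The rename of entry 1 is invented for the example.

```python
from collections import defaultdict


def update_reading_index(index, request, entry_id, old_keys=(), new_keys=()):
    """Apply one update request to `index`, a mapping key -> set of entry IDs."""
    if request == "register":            # steps 802, 806
        for key in new_keys:
            index[key].add(entry_id)
    elif request == "delete":            # steps 803, 807
        for key in old_keys:
            index[key].discard(entry_id)
    elif request == "change":            # steps 804, 808, 809
        for key in old_keys:
            index[key].discard(entry_id)
        for key in new_keys:
            index[key].add(entry_id)
    else:
        raise ValueError(f"unknown request: {request}")


index = defaultdict(set)
update_reading_index(index, "register", 1, new_keys=["HI", "RO", "SHI"])
update_reading_index(index, "register", 5, new_keys=["CHI", "HI", "RO"])
# Hypothetical change: entry 1 is renamed from "Hiroshi" to "Shiro".
update_reading_index(index, "change", 1,
                     old_keys=["HI", "RO", "SHI"], new_keys=["SHI", "RO"])
print(sorted(index["HI"]), sorted(index["RO"]))  # [5] [1, 5]
```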

【００３６】 FIG. 10 is a flowchart showing the processing procedure of the reading matching key extraction unit 109 of this embodiment. When called by the reading matching index updating unit 116, the reading matching key extraction unit 109 uses the syllable element extraction unit 119 to refer to the Roman alphabet notation table and recognize the first syllable element of the given character string (step 1201). FIG. 11 shows an example of the Roman alphabet notation table of this embodiment. Although omitted from the figure, tables of the Roman alphabet notation for voiced sounds, plosive sounds, contracted sounds (yōon), and the like, as well as a table of syllables that are written differently for the same reading, are also assumed to exist. The syllable element extraction unit 119 of this embodiment compares the syllables in the Roman alphabet notation table 1801 with the character string that is the attribute value, and extracts the portions that match a syllable in the Roman alphabet notation table 1801 as the syllable elements of that character string. Next, the reading matching key extraction unit 109 of the directory server 101 uses the syllable element normalization unit 120 to refer to the Roman character normalization table and normalize the recognized syllable element (step 1202).

【００３７】 FIG. 12 shows an example of the Roman character normalization table of this embodiment. The syllable element normalization unit 120 of this embodiment compares each syllable element extracted by the syllable element extraction unit 119 with the pre-normalization notations registered in area 1902 of the Roman character normalization table 1901. If the extracted syllable element is a notation registered in area 1902, it is converted to the notation in area 1903. This normalizes notational variation by converting the several different notations that exist for one reading into a single unified notation. For example, if the syllable element recognized by the syllable element extraction unit 119 is “HU”, it is converted to “FU”, which unifies the notation for the sound “fu” (ふ).

【００３８】 Processing is repeated from step 1201 until no character string follows the recognized syllable element. When the character string is exhausted, the recognized and normalized syllable elements are returned. The Roman alphabet notation table 1801 and the Roman character normalization table 1901 are assumed to be held by the reading matching key extraction unit 109.

【００３９】 For example, if the given attribute value is "Hiroshi", the reading matching key extraction unit 109 refers to the Roman alphabet notation table 1801 and detects the syllable element "Hi" in the character string. The partial character string "Hi" obtained by syllable decomposition has no notational variation. If a decomposed partial character string does have notational variation, the reading matching key extraction unit 109 then corrects it using the Roman character normalization table 1901, and also corrects variation such as uppercase versus lowercase letters. As a result of this processing, the reading matching keys 2002 "HI", "RO", and "SHI" are obtained from the attribute value "Hiroshi". Similarly, "SA", "CHI", and "KO" are obtained from "Satiko". These reading matching keys 2002 are used by the reading matching index updating unit 116 shown in FIG. 9.
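The syllable recognition (step 1201) and normalization (step 1202) described above can be sketched as follows. This is a simplified illustration, not the tables of FIG. 11 and FIG. 12: `SYLLABLES` and `NORMALIZE` are hypothetical stand-ins for tables 1801 and 1901, of which only the HU to FU conversion and the "Satiko" example are taken from the text. Longest-match recognition and the handling of unknown characters are likewise assumptions.

```python
# Hypothetical, heavily simplified stand-in for the Roman alphabet notation
# table 1801: plain consonant+vowel syllables plus a few special spellings.
VOWELS = "AIUEO"
SYLLABLES = set(VOWELS) | {c + v for c in "KSTNHMYRWGZDBP" for v in VOWELS}
SYLLABLES |= {"SHI", "CHI", "TSU", "FU", "JI", "N"}

# Hypothetical stand-in for the Roman character normalization table 1901:
# area 1902 (notation before normalization) -> area 1903 (unified notation).
NORMALIZE = {"HU": "FU", "TI": "CHI", "SI": "SHI", "TU": "TSU", "ZI": "JI"}


def extract_reading_keys(text):
    """Split a romanized string into normalized syllable elements."""
    s = text.upper()                      # remove upper/lower-case variation
    keys = []
    i = 0
    while i < len(s):
        # Step 1201: recognize the next syllable, longest candidate first.
        for size in (3, 2, 1):
            chunk = s[i:i + size]
            if chunk in SYLLABLES:
                break
        else:
            chunk = s[i]                  # unknown character: keep it as is
        # Step 1202: unify variant notations.
        keys.append(NORMALIZE.get(chunk, chunk))
        i += len(chunk)
    return keys


print(extract_reading_keys("Hiroshi"))  # ['HI', 'RO', 'SHI']
print(extract_reading_keys("Satiko"))   # ['SA', 'CHI', 'KO']
```

With this table, "Shiro" splits into SHI and RO and therefore never produces the key HI, which is the property the reading-based search relies on.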

【００４０】 Next, the index search method will be described. FIG. 13 shows an example of the screen display for an attribute search in this embodiment, that is, the screen displayed when an end user searches the directory. The directory client 102 controls display on the screen and input from a keyboard, a mouse, or the like.

【００４１】 The screen 1301 consists of an area 1302 for selecting an attribute name, an area 1303 for specifying the search pattern character string, an area 1304 for selecting reading matching search or pattern matching search, an area 1305 for selecting partial match or exact match, an OK button 1306 for executing the search under the set conditions, and a cancel button 1307 for canceling the set information. In this embodiment, when "contains" is selected in area 1305, the wildcard "*", which represents an arbitrary character string, is assumed to be specified before and after the character string entered in area 1303.

【００４２】 When information has been entered on the screen 1301 and the OK button 1306 is pressed, the directory client 102 issues a search request, such as an LDAP search request, to the directory server 101. As described with the flowchart of FIG. 4, the directory control unit 108 of the directory server 101, on receiving the search request from the directory client 102, recognizes that the request is an entry search (step 206) and calls the directory search control unit 112 (step 209).

【００４３】 FIG. 14 is a flowchart showing the processing procedure of the directory search control unit 112 of this embodiment. On receiving the search conditions from the directory control unit 108, the directory search control unit 112 passes them to the index search control unit 114 and obtains a list of the IDs of the entries that are candidates for hit entries (step 401). While search hit candidates remain (step 402), it reads the entry information from the directory DB 104 using the entry ID as the key (step 403) and checks whether the entry information read matches the search conditions (step 404). If it matches, the entry information is returned to the directory client 102 (step 405). If it does not match, the next hit candidate is examined (step 406).
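The candidate loop of FIG. 14 can be sketched as follows. This is an illustration under stated assumptions: the directory DB 104 is modelled as a dictionary keyed by entry ID, each entry is assumed to carry its normalized syllables, and the re-check of step 404 is modelled as a test for a contiguous run of syllables. The helper names and entry 7 are hypothetical.

```python
def contains_run(syllables, run):
    """True if `run` occurs as a contiguous slice of `syllables`."""
    width = len(run)
    return any(syllables[i:i + width] == run
               for i in range(len(syllables) - width + 1))


def search_directory(directory_db, candidate_ids, matches):
    """Filter index candidates against the stored entries (steps 402-406)."""
    hits = []
    for entry_id in candidate_ids:            # step 402: next candidate
        entry = directory_db[entry_id]        # step 403: read by entry ID
        if matches(entry):                    # step 404: re-check condition
            hits.append(entry)                # step 405: return to client
    return hits


# Hypothetical directory contents; entry 7 is invented for the example.
directory_db = {
    1: {"name": "Hiroshi", "syllables": ["HI", "RO", "SHI"]},
    5: {"name": "Chihiro", "syllables": ["CHI", "HI", "RO"]},
    7: {"name": "Hikoroku", "syllables": ["HI", "KO", "RO", "KU"]},
}
hits = search_directory(directory_db, [1, 5, 7],
                        lambda e: contains_run(e["syllables"], ["HI", "RO"]))
print([e["name"] for e in hits])  # ['Hiroshi', 'Chihiro']
```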

【００４４】 FIG. 15 shows an example of the screen display of attribute search results in this embodiment. On receiving the search results, the directory client 102 outputs them to a screen 1401 such as that shown in FIG. 15.

【００４５】 FIG. 16 is a flowchart showing the processing procedure of the index search control unit 114 of this embodiment. When called by the directory search control unit 112, the index search control unit 114 calls the pattern matching index search unit 117 (step 606) if the given search condition is a pattern match (step 601), and calls the reading matching index search unit 118 if the given search condition is a reading match (step 602), thereby obtaining a list of the entry IDs of the search hit candidates for that condition. Processing continues from step 601 until all search conditions have been processed (step 603). After all search conditions have been processed, the lists of hit-candidate entry IDs found for the individual search conditions are combined by logical operation according to the logical operators between the search conditions (step 604). The resulting list of entry IDs is returned to the caller (step 605).
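The dispatch and combination steps of FIG. 16 can be sketched as follows. This is only an illustration: the function name `index_search`, the representation of a condition as a `(kind, keyword)` pair, the single `operator` applied across all conditions, and the stub searchers with their entry IDs are all assumptions, since the text does not specify how logical operators between conditions are encoded.

```python
def index_search(conditions, searchers, operator="AND"):
    """Run each condition through its searcher and combine the ID lists."""
    per_condition = []
    for kind, keyword in conditions:                  # steps 601-603
        per_condition.append(set(searchers[kind](keyword)))
    if not per_condition:
        return []
    if operator == "AND":                             # step 604
        combined = set.intersection(*per_condition)
    elif operator == "OR":
        combined = set.union(*per_condition)
    else:
        raise ValueError(f"unsupported operator: {operator}")
    return sorted(combined)                           # step 605


# Stub searchers standing in for units 117 and 118; the IDs are invented.
searchers = {
    "pattern": lambda keyword: [1, 2, 5],
    "reading": lambda keyword: [1, 5, 6],
}
conditions = [("pattern", "hiro"), ("reading", "hiro")]
print(index_search(conditions, searchers, "AND"))  # [1, 5]
print(index_search(conditions, searchers, "OR"))   # [1, 2, 5, 6]
```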

【００４６】 FIG. 17 is a flowchart showing the processing procedure of the pattern matching index search unit 117 of this embodiment. When called by the index search control unit 114, the pattern matching index search unit 117 obtains a list of pattern matching keys 1602 from the pattern matching key extraction unit 110 (step 901). In step 902, the entry ID list 1603 associated with an obtained pattern matching key 1602 is read from the pattern matching index table 1601. Step 902 is repeated until all the pattern matching keys 1602 extracted in step 901 have been processed (step 903). The logical product (intersection) of the entry ID lists 1603 that were read is then taken and returned as the index search result (step 904).

【００４７】 FIG. 18 is a flowchart showing the processing procedure of the reading matching index search unit 118 of this embodiment. As shown in FIG. 18, for each syllable element of the keyword extracted by the syllable element extraction unit 119 and normalized by the syllable element normalization unit 120, the reading matching index search unit 118 of the directory server 101 refers to the reading matching index table 2001 and searches for attribute information containing those syllable elements. When called by the index search control unit 114, the reading matching index search unit 118 calls the reading matching key extraction unit 109 and requests a list of reading matching keys 2002 (step 1001).

【００４８】 When called by the reading matching index search unit 118, the reading matching key extraction unit 109 performs the same processing as when it is called by the reading matching index updating unit 116. Using the syllable element extraction unit 119, it refers to the Roman alphabet notation table 1801 and recognizes the first syllable element of the character string given as the search keyword (step 1201). The syllable element extraction unit 119 compares the syllables in the Roman alphabet notation table 1801 with the keyword character string, and extracts the portions that match a syllable in the Roman alphabet notation table 1801 as the syllable elements of that character string.

【００４９】 Next, the reading matching key extraction unit 109 uses the syllable element normalization unit 120 to refer to the Roman character normalization table and normalize the recognized syllable element (step 1202). The syllable element normalization unit 120 compares the syllable element of the keyword extracted by the syllable element extraction unit 119 with the pre-normalization notations registered in area 1902 of the Roman character normalization table 1901 and, if the syllable element is a notation registered in area 1902, converts it to the notation in area 1903. Processing is repeated from step 1201 until no character string follows the recognized syllable element. When the character string is exhausted, the recognized and normalized syllable elements are returned as a list of reading matching keys 2002.

【００５０】 When called by the index search control unit 114, the reading matching index search unit 118 obtains the list of reading matching keys 2002 returned by the reading matching key extraction unit 109 (step 1001). Next, in step 1002, the entry ID list 2003 associated with an obtained reading matching key 2002 is read from the reading matching index table 2001. Step 1002 is repeated until all the reading matching keys 2002 extracted in step 1001 have been processed (step 1003). The logical product (intersection) of the entry ID lists 2003 that were read is then taken and returned as the index search result (step 1004).

【００５１】 In the example of FIG. 2, calling the reading matching key extraction unit 109 in step 1001 extracts "HI" and "RO" from the search condition "hiro". In step 1002, the reading matching keys 2002 are searched with "HI" as the key, and the entry IDs "1,5" are obtained from the entry ID list 2003. Similarly, the entry IDs "1,2,4,5,6" are obtained for "RO". Taking the logical product of the two in step 1003 gives "1,5", and in step 1004 "1,5" is returned as the index search result.
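The worked example above can be reproduced with a short sketch. The two index rows use the entry IDs quoted in the text for FIG. 2. The function name `reading_index_search` and the treatment of a key that is missing from the index (no hits) are assumptions made for the illustration.

```python
# Excerpt of the reading matching index table using the values quoted in the
# text for FIG. 2; the rest of the table is not reproduced here.
READING_INDEX = {"HI": {1, 5}, "RO": {1, 2, 4, 5, 6}}


def reading_index_search(index, keys):
    """Intersect the entry ID lists of all keys (steps 1002-1004)."""
    result = None
    for key in keys:                          # steps 1002-1003
        ids = index.get(key, set())
        result = set(ids) if result is None else result & ids
    return sorted(result) if result else []   # step 1004


print(reading_index_search(READING_INDEX, ["HI", "RO"]))  # [1, 5]
```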

【００５２】 The directory search method of this embodiment has been described above. With a conventional index search method, every entry containing the search character string becomes a search result, so entries the user does not want are also returned. In contrast, the directory search method of this embodiment uses a reading matching index in which attribute values are decomposed into syllable elements, and can therefore obtain search results that are faithful to the reading of the attribute value. For example, in FIG. 2, a partial match search request for "hiro" is interpreted as a partial match search request for attribute values whose reading contains "hiro" (ひろ). Entries registered with "Shiro" or "Ichiro", which contain the string "hiro" but not the reading "hiro", are excluded, and only the entries registered with "Hiroshi" and "Chihiro" become search results.

【００５３】 Furthermore, according to this embodiment, the user can find the desired entry even if the notation of attribute values varies. For example, in FIG. 2, a partial match search request for "sachi" is interpreted as a partial match search request for attribute values with the reading "sachi" (さち), so an entry that does not contain "sachi" but contains the equivalent "sati" can be returned as a search result. As described above, the character string search device of this embodiment searches for character strings based on the reading of the keyword, and can therefore improve the usability of character string search.

【００５４】 (Embodiment 2) The following describes the character string search device of Embodiment 2, which performs a full match search of Japanese character strings written in Roman characters in a directory service while taking notational variation into account. Embodiment 1 above concerns a method of searching for partial character strings based on their reading. However, because the directory search method of the present invention also takes notational variation into account, it is applicable to full match character string search as well.

【００５５】 FIG. 19 shows an example of the information structure of the reading matching index 107 of this embodiment, namely a reading matching index table 2101 that enables both full match and partial match searches based on reading. For each syllable element extracted by the syllable element extraction unit 119 and normalized by the syllable element normalization unit 120, the reading matching index updating unit 116 of this embodiment registers, deletes, or changes, in the reading matching index table 2101, the entry ID of the normalized attribute information formed by concatenating those syllable elements. Likewise, for each syllable element of the keyword extracted by the syllable element extraction unit 119 and normalized by the syllable element normalization unit 120, the reading matching index search unit 118 refers to the reading matching index table 2101 and performs a full match character string search for attribute information that matches the normalized keyword formed by concatenating those syllable elements. The rest of the configuration is the same as in Embodiment 1.

【００５６】 In the reading matching index table 2101 of FIG. 19, in addition to the character strings obtained by dividing the attribute value into syllables as in Embodiment 1 (FIG. 2), the string obtained by concatenating, in order, the syllable elements normalized by the reading matching key extraction unit 109, that is, the normalized form of the whole attribute value, is registered. To apply this embodiment, the reading matching index updating unit 116 shown in FIG. 9 need only add this concatenated, normalized string to the reading matching keys 2102 after extracting the reading matching keys (step 801).

【００５７】 Furthermore, in the index search operation of the reading matching index search unit 118 shown in FIG. 18, when the search type is a full match search based on reading, processing need only be added so that, after the reading matching keys are extracted (step 1001), the string obtained by concatenating the normalized syllable elements in order, that is, the normalized form of the whole attribute value, is used to look up the reading matching key 2102 in the reading matching index table 2101, and the corresponding entry ID list 2103 is extracted and returned.

【００５８】 Taking FIG. 19 as an example, when generating the index for the English name attribute "Satiko", the reading matching index updating unit 116 registers in the reading matching index table 2101 the reading-based full match search key "SACHIKO" and the partial match search keys "SA", "CHI", and "KO". In a full match search based on reading, the reading matching index search unit 118 normalizes the attribute value "Sachiko" in the search condition, finds the entry ID "3" corresponding to "SACHIKO", and returns it as the search result.
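The additional full match key of Embodiment 2 can be sketched as follows. This is an illustration under stated assumptions: the syllables are assumed to have been extracted and normalized already, the index is modelled as a mapping from key to a set of entry IDs, and the helper names `keys_with_full_match` and `full_match_search` are hypothetical.

```python
from collections import defaultdict


def keys_with_full_match(syllables):
    """Per-syllable keys plus the concatenated full-match key."""
    return list(syllables) + ["".join(syllables)]


index = defaultdict(set)
# "Satiko" normalizes to SA, CHI, KO; entry ID 3 follows the FIG. 19 example.
for key in keys_with_full_match(["SA", "CHI", "KO"]):
    index[key].add(3)


def full_match_search(index, syllables):
    """Look up only the concatenated key, as in a reading-based full match."""
    return sorted(index.get("".join(syllables), set()))


# A query typed as "Sachiko" normalizes to the same syllables.
print(full_match_search(index, ["SA", "CHI", "KO"]))  # [3]
print(full_match_search(index, ["SA", "CHI"]))        # []
```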

【００５９】 As described above, according to this embodiment, full match search results that are faithful to the reading of the attribute value can be obtained. For example, in FIG. 19, a full match search request for "sachiko" is interpreted as a full match search request for attribute values with the reading "sachiko" (さちこ), so an entry whose value is not "sachiko" but the equivalent "satiko" can be returned as a search result. As described above, the character string search device of this embodiment searches for character strings based on the reading of the keyword, and can therefore improve the usability of character string search.

【００６０】 (Embodiment 3) The following describes the character string search device of Embodiment 3, which searches Japanese character strings written in Roman characters in a directory service while taking the continuity of syllables into account. In Embodiment 1 above, attribute values are divided into syllables and each syllable is indexed individually. That method does not consider the continuity of syllables. For example, if the English name attribute "Hikoroku" is added to the reading matching index table 2101 of FIG. 2, that entry is also hit by a partial match search for "hiro". This problem can be resolved by the process (step 404) in which, after the candidates have been narrowed down by the index search, the information of each entry is read from the directory DB 104 and checked again against the search conditions. However, search performance improves if the candidates are narrowed down as far as possible.

【００６１】 To solve the above problem, the directory search method of this embodiment decomposes the attribute value into syllable elements and then indexes the character strings corresponding to n syllables taken at every m syllables. In this embodiment, m is 1 and n is 2.

【００６２】 FIG. 20 shows an example of the information structure of the reading matching index 107 of this embodiment. In the reading matching index table 2201 of FIG. 20, in addition to the character strings obtained by dividing the attribute value into single syllables as in Embodiment 1 (FIG. 2), character strings of two syllables extracted from the attribute value at intervals of zero syllables (that is, each pair of adjacent syllables) are registered.

【００６３】 For each syllable element extracted by the syllable element extraction unit 119 and normalized by the syllable element normalization unit 120, the reading matching index updating unit 116 of this embodiment registers, deletes, or changes, in the reading matching index table 2201, the entry ID of the attribute information containing a plurality of consecutive syllable elements among those syllable elements. Likewise, for each syllable element of the keyword extracted by the syllable element extraction unit 119 and normalized by the syllable element normalization unit 120, the reading matching index search unit 118 refers to the reading matching index table 2201 and searches for attribute information containing a plurality of consecutive syllable elements among those syllable elements. The rest of the configuration is the same as in Embodiment 1.

【００６４】 To apply this embodiment, in the index update operation of the reading matching index updating unit 116 shown in FIG. 9, after the character strings divided into single syllables have been registered in turn in the reading matching index table 2201, the character strings of two syllables extracted at intervals of zero syllables need only be registered in turn in the reading matching index table 2201 as reading matching keys 2202.

【００６５】 Furthermore, in the index search operation of the reading matching index search unit 118 shown in FIG. 18, when the attribute value in the search condition consists of multiple syllables, processing need only be added so that the value is decomposed into two-syllable character strings at intervals of zero syllables, the entry ID list 2203 corresponding to the reading matching key 2202 of each resulting two-syllable string is extracted, and the entry IDs common to those lists are returned.

【００６６】 Taking FIG. 20 as an example, when generating the index for the English name attribute "Hikoroku", the reading matching index updating unit 116 registers in the reading matching index table 2201 the one-syllable keys "HI", "KO", "RO", and "KU" and, in addition, the two-syllable keys "HIKO", "KORO", and "ROKU" taken at intervals of zero syllables. In a multi-syllable search, the reading matching index search unit 118 divides the attribute value "hiro" in the search condition into two-syllable strings at intervals of zero syllables, finds the entry IDs "1,5" corresponding to "HIRO", and returns them as the search result. For a partial match search for "hiroshi", the value is likewise divided into two-syllable strings at intervals of zero syllables, and the entry ID "1", which corresponds to both "HIRO" and "ROSHI", is found and returned as the search result.
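The two-syllable keys of Embodiment 3 can be sketched as follows. This is an illustration under stated assumptions: "two syllables at intervals of zero syllables" is read as every pair of adjacent syllables, the syllables are assumed to be extracted and normalized already, and the helper names as well as entry ID 7 for "Hikoroku" are hypothetical. Falling back to the single-syllable key for a one-syllable query is also an assumption.

```python
from collections import defaultdict


def syllable_ngrams(syllables, n=2):
    """Join every run of n adjacent syllables into one key."""
    return ["".join(syllables[i:i + n])
            for i in range(len(syllables) - n + 1)]


# Normalized syllables per entry; ID 7 for "Hikoroku" is invented.
entries = {
    1: ["HI", "RO", "SHI"],        # Hiroshi
    5: ["CHI", "HI", "RO"],        # Chihiro
    7: ["HI", "KO", "RO", "KU"],   # Hikoroku
}
index = defaultdict(set)
for entry_id, syllables in entries.items():
    for key in syllables + syllable_ngrams(syllables):
        index[key].add(entry_id)


def search_consecutive(index, syllables):
    """Return the entry IDs common to all two-syllable keys of the query."""
    keys = syllable_ngrams(syllables) or list(syllables)
    id_sets = [index.get(key, set()) for key in keys]
    return sorted(set.intersection(*id_sets)) if id_sets else []


print(syllable_ngrams(["HI", "KO", "RO", "KU"]))       # ['HIKO', 'KORO', 'ROKU']
print(search_consecutive(index, ["HI", "RO"]))         # [1, 5]
print(search_consecutive(index, ["HI", "RO", "SHI"]))  # [1]
```

Entry 7 contains both HI and RO as single-syllable keys but has no HIRO key, so it drops out of the candidates before the directory DB is consulted.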

【００６７】 As described above, this embodiment enables search processing that takes the continuity of syllables into account. For example, in FIG. 20, on receiving a partial match search request for "hiro", the entry registered with "Hikoroku", which contains "hi" and "ro" but not consecutively, is excluded, and only the entries registered with "Hiroshi" and "Chihiro" become search hit candidates. It is also clear that combining the above embodiments can provide an even more convenient directory search method.

【００６８】以上説明した様に本実施形態の文字列検索
装置によれば、キーワードの読みに基づいて文字列の検
索を行うので、文字列検索の使い勝手を向上させること
が可能である。尚、本発明を実現させる為のプログラム
を記録媒体に格納し、計算機（若しくは携帯端末等）を
用いて、前記記録媒体から前記プログラムをインストー
ルして使用しても良いし、前記記録媒体にネットワーク
等を通じてアクセスして、前記プログラムをダウンロー
ドして使用しても良い。As described above, according to the character string search device of the present embodiment, character strings are searched based on the reading of the keyword, so the usability of character string search can be improved. A program for realizing the present invention may be stored on a recording medium and installed from that recording medium for use on a computer (or a portable terminal or the like), or the recording medium may be accessed through a network or the like and the program downloaded for use.

【００６９】[0069]

【発明の効果】本発明によればキーワードの読みに基づ
いて文字列の検索を行うので、文字列検索の使い勝手を
向上させることが可能である。According to the present invention, since a character string is searched based on the reading of a keyword, the usability of the character string search can be improved.

[Brief description of the drawings]

【図１】実施形態１のディレクトリサービスを提供する
システムの概略構成を示す図である。FIG. 1 is a diagram illustrating a schematic configuration of a system that provides a directory service according to a first embodiment.

【図２】実施形態１の読み一致インデックス１０７の情
報構成例を示す図である。FIG. 2 is a diagram illustrating an example of an information configuration of a reading matching index 107 according to the first embodiment.

【図３】実施形態１のインデックス定義情報１０５の情
報構成例を示す図である。FIG. 3 is a diagram illustrating an information configuration example of index definition information 105 according to the first embodiment.

【図４】実施形態１のディレクトリ制御部１０８の処理
手順を示すフローチャートである。FIG. 4 is a flowchart illustrating a processing procedure of a directory control unit 108 according to the first embodiment.

【図５】実施形態１のディレクトリ更新制御部１１１の
処理手順を示すフローチャートである。FIG. 5 is a flowchart illustrating a processing procedure of a directory update control unit 111 according to the first embodiment.

【図６】実施形態１のインデックス更新制御部１１３の
処理手順を示すフローチャートである。FIG. 6 is a flowchart illustrating a processing procedure of an index update control unit 113 according to the first embodiment.

【図７】実施形態１のパターン一致インデックス更新部
１１５の処理手順を示すフローチャートである。FIG. 7 is a flowchart illustrating a processing procedure of a pattern matching index update unit 115 according to the first embodiment.

【図８】実施形態１のパターン一致キー抽出部１１０の
処理手順を示すフローチャートである。FIG. 8 is a flowchart illustrating a processing procedure of a pattern matching key extraction unit 110 according to the first embodiment.

【図９】実施形態１の読み一致インデックス更新部１１
６の処理手順を示すフローチャートである。FIG. 9 is a flowchart illustrating a processing procedure of a reading matching index update unit 116 according to the first embodiment.

【図１０】実施形態１の読み一致キー抽出部１０９の処
理手順を示すフローチャートである。FIG. 10 is a flowchart illustrating a processing procedure of a reading matching key extraction unit 109 according to the first embodiment.

【図１１】実施形態１のローマ字表記テーブルの一例を
示す図である。FIG. 11 is a diagram illustrating an example of a Roman alphabet notation table according to the first embodiment.

【図１２】実施形態１のローマ字正規化テーブルの一例
を示す図である。FIG. 12 is a diagram illustrating an example of a Roman character normalization table according to the first embodiment.

【図１３】実施形態１の属性検索時の画面表示例を示す
図である。FIG. 13 is a diagram illustrating a screen display example at the time of attribute search according to the first embodiment.

【図１４】実施形態１のディレクトリ検索制御部１１２
の処理手順を示すフローチャートである。FIG. 14 is a flowchart illustrating a processing procedure of a directory search control unit 112 according to the first embodiment.

【図１５】実施形態１の属性検索結果の画面表示例を示
す図である。FIG. 15 is a diagram illustrating a screen display example of an attribute search result according to the first embodiment.

【図１６】実施形態１のインデックス検索制御部１１４
の処理手順を示すフローチャートである。FIG. 16 is a flowchart illustrating a processing procedure of an index search control unit 114 according to the first embodiment.

【図１７】実施形態１のパターン一致インデックス検索
部１１７の処理手順を示すフローチャートである。FIG. 17 is a flowchart illustrating a processing procedure of a pattern matching index search unit 117 according to the first embodiment.

【図１８】実施形態１の読み一致インデックス検索部１
１８の処理手順を示すフローチャートである。FIG. 18 is a flowchart illustrating a processing procedure of a reading matching index search unit 118 according to the first embodiment.

【図１９】実施形態２の読み一致インデックス１０７の
情報構成例を示す図である。FIG. 19 is a diagram illustrating an information configuration example of a reading matching index 107 according to the second embodiment.

【図２０】実施形態３の読み一致インデックス１０７の
情報構成例を示す図である。FIG. 20 is a diagram illustrating an information configuration example of a reading matching index 107 according to the third embodiment.

【図２１】従来のディレクトリサービスが各ディレクト
リエントリを記憶するディレクトリＤＢの情報構成例を
示す図である。FIG. 21 is a diagram showing an example of the information configuration of a directory DB in which a conventional directory service stores each directory entry.

【図２２】従来のインデックス検索方法における英語名
属性に関するパターン一致インデックスの情報構成例を
示す図である。FIG. 22 is a diagram illustrating an information configuration example of a pattern matching index for an English name attribute in a conventional index search method.

[Explanation of symbols]

１０１…ディレクトリサーバ、１０２…クライアント、
１０３…ネットワーク、１０４…ディレクトリＤＢ、１
０５…インデックス定義情報、１０６…パターン一致イ
ンデックス、１０７…読み一致インデックス、１０８…
ディレクトリ制御部、１０９…読み一致キー抽出部、１
１０…パターン一致キー抽出部、１１１…ディレクトリ
更新制御部、１１２…ディレクトリ検索制御部、１１３
…インデックス更新制御部、１１４…インデックス検索
制御部、１１５…パターン一致インデックス更新部、１
１６…読み一致インデックス更新部、１１７…パターン
一致インデックス検索部、１１８…読み一致インデック
ス検索部、１１９…音節要素抽出部、１２０…音節要素
正規化部、１２１…キー要素抽出部、１２２…パターン
正規化部、２００１…読み一致インデックス表、２００
２…読み一致キー、２００３…エントリＩＤリスト、１
７０１…インデックス定義情報、１８０１…ローマ字表
記テーブル、１９０１…ローマ字正規化テーブル、１９
０２…領域、１９０３…領域、１３０１…画面、１３０
２…属性名を選択する領域、１３０３…検索パターン文
字列を指示する領域、１３０４…読み一致検索やパター
ン一致検索を選択する領域、１３０５…部分一致や完全
一致を選択する領域、１３０６…ＯＫボタン、１３０７
…キャンセルボタン、１４０１…画面、２１０１…読み
一致インデックス表、２１０２…読み一致キー、２１０
３…エントリＩＤリスト、２２０１…読み一致インデッ
クス表、２２０２…読み一致キー、２２０３…エントリ
ＩＤリスト、１５０１…ディレクトリＤＢ構成例、１６
０１…パターン一致インデックス表、１６０２…パター
ン一致キー、１６０３…エントリＩＤリスト。101: directory server, 102: client, 103: network, 104: directory DB, 105: index definition information, 106: pattern matching index, 107: reading matching index, 108: directory control unit, 109: reading matching key extraction unit, 110: pattern matching key extraction unit, 111: directory update control unit, 112: directory search control unit, 113: index update control unit, 114: index search control unit, 115: pattern matching index update unit, 116: reading matching index update unit, 117: pattern matching index search unit, 118: reading matching index search unit, 119: syllable element extraction unit, 120: syllable element normalization unit, 121: key element extraction unit, 122: pattern normalization unit, 2001: reading matching index table, 2002: reading matching key, 2003: entry ID list, 1701: index definition information, 1801: Roman alphabet notation table, 1901: Roman alphabet normalization table, 1902: area, 1903: area, 1301: screen, 1302: area for selecting an attribute name, 1303: area for designating a search pattern character string, 1304: area for selecting a reading match search or a pattern match search, 1305: area for selecting a partial match or an exact match, 1306: OK button, 1307: Cancel button, 1401: screen, 2101: reading matching index table, 2102: reading matching key, 2103: entry ID list, 2201: reading matching index table, 2202: reading matching key, 2203: entry ID list, 1501: directory DB configuration example, 1601: pattern matching index table, 1602: pattern matching key, 1603: entry ID list.

フロントページの続き (72)発明者中村敏治神奈川県横浜市戸塚区戸塚町5030番地株式会社日立製作所ソフトウェア事業部内Ｆターム(参考） 5B075 ND03 NK02 NK32 NK54 QP10 QS20 Continued from the front page: (72) Inventor Toshiharu Nakamura, c/o Software Division, Hitachi, Ltd., 5030 Totsuka-cho, Totsuka-ku, Yokohama-shi, Kanagawa Prefecture. F-terms (reference): 5B075 ND03 NK02 NK32 NK54 QP10 QS20

Claims

[Claims]

1. A character string search method for searching for a character string, comprising: a step of extracting syllable elements from an input keyword; a step of normalizing notational fluctuation of the extracted syllable elements; and a step of searching for a character string that matches the reading of the input keyword by referring to a reading matching index indicating a storage location of a character string that contains the normalized syllable elements.

2. The character string search method according to claim 1, wherein an exact-match character string search is performed to search for a character string that matches the normalized keyword obtained by concatenating the normalized syllable elements.

3. The character string search method according to claim 1, wherein a character string containing a plurality of consecutive syllable elements among the normalized syllable elements is searched for.

4. A character string search device comprising: a syllable element extraction unit that extracts syllable elements from an input keyword; a syllable element normalization unit that normalizes notational fluctuation of the extracted syllable elements; and a reading matching index search unit that searches for a character string that matches the reading of the input keyword by referring to a reading matching index indicating a storage location of a character string that contains the normalized syllable elements.

5. A computer-readable recording medium storing a program for causing a computer to function as a character string search device for searching for a character string, the program causing the computer to function as: a syllable element extraction unit that extracts syllable elements from an input keyword; a syllable element normalization unit that normalizes notational fluctuation of the extracted syllable elements; and a reading matching index search unit that searches for a character string that matches the reading of the input keyword by referring to a reading matching index indicating a storage location of a character string that contains the normalized syllable elements.