JPH03260869A

JPH03260869A - Data base retrieving system

Info

Publication number: JPH03260869A
Application number: JP2058045A
Authority: JP
Inventors: Yoshifusa Togawa; 好房外川; Takashi Tsubokura; 孝坪倉
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1990-03-12
Filing date: 1990-03-12
Publication date: 1991-11-20
Anticipated expiration: 2013-10-27
Also published as: JP2817103B2

Abstract

PURPOSE: To retrieve a database at high speed by searching the target data group on the basis of the character with the lowest occurrence frequency in the character string to be retrieved. CONSTITUTION: An occurrence frequency table 2 stores the occurrence frequencies of the characters in the search target data 1 and the addresses of their first occurrences. A next-occurrence location table 3 stores the addresses of the subsequent occurrences of those characters. A character extraction part 4 refers to the occurrence frequency table 2 and extracts the character with the lowest occurrence frequency from the input character string. A retrieval processing part 5 successively obtains the occurrence addresses of the extracted character from table 2 or table 3 and compares the data before and after the data designated by each address with the input character string, thereby retrieving the data that contains the input character string. The database retrieval speed is thus improved.

Description

[Detailed Description of the Invention]

[Summary]

The invention relates to a database search method that retrieves data using a character string entered by the user. Its purpose is to let the user search by freely entering words and the like, and to make the search faster. The system is configured to comprise: an occurrence frequency table that stores the frequency of occurrence of each character in the search target data together with the address of that character's first occurrence; a next-occurrence location table that stores the address of the next occurrence of each character in the search target data; a character extraction unit that, when a character string to be searched is input, refers to the occurrence frequency table and extracts the character with the lowest occurrence frequency in the string; and a search processing unit that sequentially obtains the occurrence addresses of the extracted character from the occurrence frequency table or the next-occurrence location table, compares the characters before and after the character of the search target data specified by each address with the character string, and retrieves the data containing the character string.

[Industrial application field]

The present invention relates to a database search method that retrieves data using a character string entered by the user.

[Conventional technology]

Storage media with large capacities, such as CD-ROMs and optical disks, have come into practical use, and they make it easy to build databases holding large amounts of data, such as dictionaries and glossaries of contemporary terms.

One way to search a database is to look directly through the data stored in it (hereinafter called the text data) for a word that matches the word entered by the user, and to retrieve the data containing the matching word.

Another way is to provide an index of keywords extracted from the words in the text data, search that index for a keyword matching the word entered by the user, and retrieve the text data to which the matching keyword points. Keyword searches of this kind include prefix search, which finds keywords that begin with the entered word; suffix search, which finds keywords that end with the entered word; and exact-match search, which finds keywords identical to the entered word.
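
The three keyword match modes described above can be sketched as follows. This is a minimal illustration only: the index contents, the addresses, and the names `KEYWORD_INDEX` and `lookup` are hypothetical and do not come from the patent.

```python
from typing import Dict, List

# Hypothetical keyword index: keyword -> storage address in the text data.
KEYWORD_INDEX: Dict[str, int] = {
    "database": 0,
    "data": 120,
    "metadata": 260,
    "search": 400,
}


def lookup(index: Dict[str, int], word: str, mode: str) -> List[int]:
    """Return the addresses of all keywords that match `word` under `mode`."""
    if mode == "prefix":      # keyword begins with the entered word
        matches = lambda kw: kw.startswith(word)
    elif mode == "suffix":    # keyword ends with the entered word
        matches = lambda kw: kw.endswith(word)
    elif mode == "exact":     # keyword is identical to the entered word
        matches = lambda kw: kw == word
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Scan the index entry by entry, as the conventional flow does.
    return [addr for kw, addr in index.items() if matches(kw)]
```

Only words that were registered as keywords can ever be found this way, which is the limitation the invention addresses.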

FIG. 9 is a flowchart of a conventional database search method that searches by keyword.

First, the search word entered by the user is read (FIG. 9, S1). Next, the length of the search word is determined (S2).

It is then determined whether a prefix search has been requested (S3). If so, the search word is compared, on the basis of its length, with the first entry of the prefix index, which stores the leading word of each phrase together with its storage address in the text data (S4).

It is then determined whether the two words match (S5).

If they match, the text data indicated by that index entry is read from the CD-ROM and displayed (S6).

If they do not match, the next index entry is read (S7), and whether any index entries remain to be searched is determined from whether the data just read is address data (S8).

If index entries remain, the process returns to step S4 and repeats the same processing for the next entry.

If step S3 determines that the search is not a prefix search, the process proceeds to step S9 and determines whether it is a suffix search. If it is, the same comparison is made against the first entry of the suffix index, which stores the final word of each phrase together with the storage address of that phrase in the text data (S10). It is then determined whether the words match (S11).

If they match, the text data indicated by the index entry is read from the CD-ROM and displayed (S12). If they do not match, the next index entry is read (S13), and whether any index entries remain is determined from whether the data just read is address data (S14). If entries remain, the process returns to step S10 and performs the same processing for the next entry.

If step S9 determines that the search is not a suffix search, the process proceeds to step S15, where the same comparison is made against the exact-match index, which stores each phrase together with its storage address in the text data. It is then determined whether the phrases match (S16).

If they match, the text data indicated by the index entry is read from the CD-ROM and displayed (S17). If they do not match, the next index entry is read (S18), and whether any index entries remain is determined from whether the data just read is address data (S19). If entries remain, the process returns to step S15 and performs the same processing for the next entry.

In this way, words (phrases) in the text are registered in advance as a keyword index, and the desired text data is retrieved by comparing the search word entered by the user against that index.

[Problem to be solved by the invention]

The former method, which searches the text data directly, has the advantage that the user can choose any search word. However, finding a word that matches the entered search word requires, for example, comparing the words in the text data one character at a time in sequence, so the search is slow.
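
For comparison, the direct method can be sketched as a plain sequential scan. This is an illustrative reading of "comparing one character at a time in sequence", not code from the patent; the function name `direct_search` is an assumption.

```python
def direct_search(text: str, word: str) -> int:
    """Return the address of the first occurrence of `word` in `text`, or -1."""
    n, m = len(text), len(word)
    # Try every possible starting address in order.
    for start in range(n - m + 1):
        i = 0
        # Compare one character at a time until a mismatch or a full match.
        while i < m and text[start + i] == word[i]:
            i += 1
        if i == m:
            return start
    return -1
```

Every starting address is visited until a match is found, so the cost grows with the size of the text data regardless of which word is searched for.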

The latter method, which searches by keyword, is faster than the former, but the searchable words are limited and the user cannot enter arbitrary words.

Moreover, with keyword search, poorly chosen keywords make the required information hard to find and the system hard to use, so keyword extraction takes care. For example, data written to a CD-ROM cannot be rewritten, so simulation software must be built for verification at the time of keyword extraction to confirm that the extracted keywords retrieve the text data correctly. If this verification is inadequate, the finished CD-ROM is unusable.

An object of the present invention is to allow the user to search by freely entering words and the like, and to make the search faster.

[Means to solve the problem]

FIG. 1 is a diagram explaining the principle of the present invention.

In the figure, the occurrence frequency table 2 stores the frequency of occurrence of each character in the search target data 1 and the address at which each character first occurs. The next-occurrence location table 3 stores the addresses at which those characters next occur.
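
One way the two tables could be built in a single pass over the text is sketched below. The patent does not fix a storage layout, so the choices here are assumptions: the frequency table is a dictionary mapping each character to `(count, first address)`, and the next-occurrence table is an array with one slot per address holding the address of the next occurrence of the same character, or -1 if there is none.

```python
from typing import Dict, List, Tuple


def build_tables(text: str) -> Tuple[Dict[str, Tuple[int, int]], List[int]]:
    """Build (frequency_table, next_table) for `text`.

    frequency_table[ch] = (number of occurrences, address of first occurrence)
    next_table[addr]    = address of the next occurrence of text[addr], or -1
    """
    freq: Dict[str, Tuple[int, int]] = {}
    next_table: List[int] = [-1] * len(text)
    last_seen: Dict[str, int] = {}   # most recent address of each character
    for addr, ch in enumerate(text):
        if ch in freq:
            count, first = freq[ch]
            freq[ch] = (count + 1, first)
            # Link the previous occurrence to this one.
            next_table[last_seen[ch]] = addr
        else:
            freq[ch] = (1, addr)
        last_seen[ch] = addr
    return freq, next_table
```

Because the text on a CD-ROM never changes, tables like these can be computed once at production time and stored alongside the text data.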

The character extraction unit 4 refers to the occurrence frequency table 2 and extracts the character with the lowest occurrence frequency in the input character string.
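
The extraction step can be sketched as a simple minimum search over the characters of the input string. The function name `rarest_char` and the table layout (character mapped to `(count, first address)`) are assumptions made for illustration; a character missing from the table is treated as having frequency 0.

```python
from typing import Dict, Optional, Tuple


def rarest_char(
    word: str, freq: Dict[str, Tuple[int, int]]
) -> Optional[Tuple[str, int, int, int]]:
    """Return (character, offset in word, count, first address) for the
    least frequent character of `word`, or None if `word` is empty."""
    best: Optional[Tuple[str, int, int, int]] = None
    for offset, ch in enumerate(word):
        # Characters absent from the text have count 0 and no address.
        count, first = freq.get(ch, (0, -1))
        if best is None or count < best[2]:
            best = (ch, offset, count, first)
    return best
```

The offset of the chosen character inside the word is returned as well, because the later comparison needs to know how many characters lie before and after it.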

The search processing unit 5 sequentially obtains the occurrence addresses of the character extracted by the character extraction unit 4 from the occurrence frequency table 2 or the next-occurrence location table 3, compares the data before and after the data specified by each address with the input character string, and retrieves the data containing the input character string.
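
The comparison of "the data before and after" an occurrence address can be sketched as follows. The helper name `match_at` and its parameters are hypothetical: `addr` is an occurrence address of the extracted character and `offset` is the position of that character within the input string.

```python
def match_at(text: str, word: str, addr: int, offset: int) -> bool:
    """True if `word` occurs in `text` with word[offset] located at text[addr]."""
    start = addr - offset   # address where the word would have to begin
    if start < 0 or start + len(word) > len(text):
        return False        # the word would run past either end of the text
    # Compare the characters before and after the anchor in one slice.
    return text[start:start + len(word)] == word
```

Only the addresses where the rare character occurs are ever tested, which is where the speed-up over a sequential scan is expected to come from.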

[Operation]

When the user inputs a character string to be searched, the character with the lowest occurrence frequency in that string is extracted, and the storage address of that character is obtained from the occurrence frequency table 2 or the next-occurrence location table 3. The characters before and after the character specified by that address are then read from the search target data 1, and the data read out is compared with the character string.

Because only the occurrences of a low-frequency character of the input string need to be examined in the search target data, the search is faster than the conventional direct search method, which compares the search target data sequentially.

In addition, the user can freely choose the character string to search for, which makes the system easier to use. Since the search does not rely on keywords, no keyword extraction is needed and, naturally, no keyword verification either.

[Embodiments]

Embodiments of the present invention are described below with reference to the drawings.

FIG. 2 is a configuration diagram of a database search device that follows the database search method of the present invention.

In the figure, the input unit 11 is a keyboard for entering the words and the like to be searched for. The display unit 12 is a display such as a CRT and shows the entered word, the retrieved text data, and so on.

The processing unit 13 is a circuit that executes the database search. It consists of a CPU 14, which among other things reads data from the CD-ROM 16 described below, and a memory 15, which temporarily stores the data read by the CPU 14.

The CD-ROM 16 holds a text data section 17 (the data registered in the database); a code sort section 18, which stores the frequency of occurrence of each character in the text data section 17 together with the address of its first occurrence there; and a next-occurrence location table 19, which stores the addresses of the next occurrences of the same character.

FIG. 3 is a configuration diagram of the code sort section 18. The characters are stored, for example, in Japanese syllabary (gojuon) order, and the occurrence frequency and first-occurrence address of each character are stored in association with it.
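
A sorted table of this kind lends itself to binary search. The sketch below is an assumption-laden illustration: entries are `(character, count, first address)` tuples sorted by Unicode code point, which stands in for syllabary order here, and the names `build_code_sort` and `lookup` are hypothetical.

```python
from bisect import bisect_left
from typing import Dict, List, Tuple

Entry = Tuple[str, int, int]   # (character, count, first-occurrence address)


def build_code_sort(text: str) -> List[Entry]:
    """Build a table of entries sorted by character code."""
    count: Dict[str, int] = {}
    first: Dict[str, int] = {}
    for addr, ch in enumerate(text):
        count[ch] = count.get(ch, 0) + 1
        first.setdefault(ch, addr)   # keep only the first address seen
    return sorted((ch, count[ch], first[ch]) for ch in count)


def lookup(entries: List[Entry], ch: str) -> Tuple[int, int]:
    """Return (count, first address) for `ch`, or (0, -1) if it never occurs."""
    keys = [entry[0] for entry in entries]
    i = bisect_left(keys, ch)        # binary search over the sorted characters
    if i < len(keys) and keys[i] == ch:
        return entries[i][1], entries[i][2]
    return 0, -1
```

In a real device the key list would be kept alongside the table rather than rebuilt on every lookup; it is rebuilt here only to keep the sketch short.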

The operation of the embodiment configured as above is described next with reference to the flowchart of FIG. 4.

First, the code sort section 18 is read from the CD-ROM 16 and stored in memory (FIG. 4, S21).

Next, the search word entered by the user is read (S22). The code sort section 18 is then consulted to find the character of the search word with the lowest occurrence frequency; that frequency is set in an occurrence frequency counter (not shown), and the first-occurrence address of the character is obtained (S23). It is then determined whether the occurrence frequency counter is "0" (S24).

If the counter is not "0", the text data indicated by the first-occurrence address in the code sort section 18 is read, and the data before and after the character in question is compared with the search word (S25).

It is then determined whether the data read out matches the search word (S26).

FIG. 5 illustrates the operation of obtaining the occurrence frequency of a character from the code sort section 18 and the operation of reading the corresponding text data from the first-occurrence address of that character.

Suppose, for example, that "あいうえお" is entered as the search word. The occurrence frequencies of the corresponding characters in the code sort section 18 are checked, and the character of the string that occurs least often in the text data is extracted. In this case the character "う" has the lowest frequency, so its frequency, "2", is set in the occurrence frequency counter. The data at the first-occurrence address stored for "う" and at the addresses before and after it is then read. Here the characters around this "う" do not match the search word, so processing to obtain the next-occurrence address of "う" is executed.

Returning to FIG. 4, when the data read from the first-occurrence address in the text data does not match the search word, the next-occurrence address of the least frequent character is read from the next-occurrence location table 19 (S27). Since one search pass is now complete, the occurrence frequency counter is decremented (S28). The process then returns to step S24, and the data at the next-occurrence address read from the next-occurrence location table 19 and at the addresses before and after it is read and compared with the search word.

Steps S24 to S28 are repeated until the occurrence frequency counter reaches "0", looking for data that matches the search word. If matching data is found, the text data from the address indicated at that point onward is read and displayed on the display unit (S29).
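
The loop of steps S21 to S29 can be sketched end to end as follows. This is a minimal illustration, not the patented implementation: the in-memory table layout, the function names, and the convention of returning the start address of the match (or -1) instead of displaying text are all assumptions.

```python
from typing import Dict, List, Tuple


def build_tables(text: str) -> Tuple[Dict[str, Tuple[int, int]], List[int]]:
    """Frequency table (count, first address) and per-address next table."""
    freq: Dict[str, Tuple[int, int]] = {}
    next_table: List[int] = [-1] * len(text)
    last: Dict[str, int] = {}
    for addr, ch in enumerate(text):
        if ch in freq:
            freq[ch] = (freq[ch][0] + 1, freq[ch][1])
            next_table[last[ch]] = addr
        else:
            freq[ch] = (1, addr)
        last[ch] = addr
    return freq, next_table


def search(text: str, word: str) -> int:
    """Return the start address of `word` in `text`, or -1 if not found."""
    freq, next_table = build_tables(text)        # S21: tables held in memory
    if not word:                                 # S22: read the search word
        return -1
    # S23: pick the least frequent character and set the counter.
    offset = min(range(len(word)), key=lambda i: freq.get(word[i], (0, -1))[0])
    counter, addr = freq.get(word[offset], (0, -1))
    while counter != 0:                          # S24: occurrences left?
        start = addr - offset                    # S25: read surrounding data
        if start >= 0 and text[start:start + len(word)] == word:   # S26
            return start                         # S29: match found
        addr = next_table[addr]                  # S27: next occurrence
        counter -= 1                             # S28: one pass done
    return -1
```

The counter starts at the frequency of the rare character, so the loop runs at most that many times, however long the text is.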

FIGS. 6 to 8 illustrate the search operation that refers to the next-occurrence location table 19.

As described above, when "あいうえお" is entered as the search word and the data around the first-occurrence address of the character "う" does not match it, the next-occurrence address of the same character is read from the next-occurrence location table 19 (FIG. 6). The next-occurrence location table 19 stores, for example, the occurrence addresses of each character in order of appearance, and reading these addresses in turn makes it possible to visit the occurrences of the same character in the text data one after another.
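
The "addresses in order of appearance" layout mentioned above can be sketched as one address list per character. This is only one possible reading of the description; the names `build_occurrence_lists` and `occurrences` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterator, List


def build_occurrence_lists(text: str) -> Dict[str, List[int]]:
    """Map each character to the list of its addresses, in order of appearance."""
    lists: Dict[str, List[int]] = defaultdict(list)
    for addr, ch in enumerate(text):
        lists[ch].append(addr)
    return dict(lists)


def occurrences(lists: Dict[str, List[int]], ch: str) -> Iterator[int]:
    """Yield the addresses of `ch` one after another; nothing if `ch` is absent."""
    yield from lists.get(ch, [])
```

With this layout the length of a character's list is its occurrence frequency, so the counter of the flowchart corresponds to the number of addresses still unread.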

Next, as shown in FIG. 7, the data before and after the character "う" specified by the next-occurrence address is read and checked against the search word. If the two match, the text data from the matched data onward is read and displayed on the display unit 12, as shown in FIG. 8.

As described above, in this embodiment the character with the lowest occurrence frequency is extracted from the entered search word, and the search proceeds by successively obtaining the addresses at which that character occurs in the text data. The search is therefore faster than the conventional method of searching the text data directly with the search word entered by the user.

Since search words are not limited to keywords extracted in advance, the user can choose any search word, which yields a search method that is easier to use. And because the search does not rely on keywords, the problem of desired data being hard to find because the extracted keywords were inappropriate does not arise.

Consequently, when producing a CD-ROM or the like, keyword verification and similar steps are unnecessary, which simplifies production, and CD-ROMs no longer have to be discarded because verification was incomplete.

The search target data is not limited to characters and may be combined with data such as pictures and sound. For example, if pointers to the memory in which audio is stored are embedded among the character data, audio can be retrieved together with the text.

The present invention is not limited to the CD-ROM described in the embodiment; it can also be applied to devices that use other recording media such as optical disks, and it can be used in word processors, personal computers, and multimedia applications such as hypertext.

[Effect of the invention]

According to the present invention, the target data group is searched using the character with the lowest occurrence frequency in the character string to be searched, so the search can be made faster. In addition, the user can freely choose the word to search for, which makes searching easier.

[Brief explanation of drawings]

FIG. 1 is a diagram explaining the principle of the present invention; FIG. 2 is a configuration diagram of the database search device of the embodiment; FIG. 3 is a configuration diagram of the code sort section of FIG. 2; FIG. 4 is a flowchart explaining the operation of the embodiment; FIGS. 5 to 8 are diagrams explaining the search operation; and FIG. 9 is a flowchart explaining a conventional search method.

1: search target data; 2: occurrence frequency table; 3: next-occurrence location table; 4: character extraction unit; 5: search processing unit.

Claims

[Claims] A database search method comprising: an occurrence frequency table (2) that stores the frequency of occurrence of each character in the search target data (1) and the address of the first occurrence of that character; a next-occurrence location table (3) that stores the address of the next occurrence of each character in the search target data (1); a character extraction unit (4) that, when a character string to be searched is input, refers to the occurrence frequency table (2) and extracts the character with the lowest occurrence frequency in the character string; and a search processing unit (5) that sequentially obtains the occurrence addresses of the character extracted by the character extraction unit (4) from the occurrence frequency table (2) or the next-occurrence location table (3), compares the characters before and after the character of the search target data (1) specified by each address with the character string, and retrieves the data containing the character string.