JPH03246764A

JPH03246764A - Data base retrieving system

Info

Publication number: JPH03246764A
Application number: JP2045246A
Authority: JP
Inventors: Teruo Akiyama; 秋山　照雄; Tomoyuki Kiyosue; 悌之清末; Haruhiko Kojima; 児島　治彦
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1990-02-26
Filing date: 1990-02-26
Publication date: 1991-11-05

Abstract

PURPOSE: To select one or more files suited to the desired content by generating, for each file or each designated set of files, a histogram of all keywords attached to the data to which a retrieval keyword is attached. CONSTITUTION: A keyword conversion part 1 looks up a word input as a retrieval word in a keyword dictionary 2 and converts it to a retrieval keyword X'. Every file constituting the databases 4, 4', ... is then searched on the keyword X', and the data to which X' is attached are extracted. Next, all keywords attached to the extracted data are extracted for each file or each designated set of files and output to a histogram calculation part 3, which arranges them per file or per designated set of files to generate histograms. The retrieval keyword X' and the histograms are displayed on a display part 5. In this way, the file best suited to the retrieval content can easily be selected from the databases 4, 4', ....

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of industrial application]

The present invention relates to a database search system with which a database user finds, in a database, the file that best matches the content being searched for.

[Conventional technology]

Conventionally, when searching a database, the task of finding and selecting, from among the many files that make up the database, the files that contain much of the content the searcher wants has depended largely on the searcher's many years of experience. Database services such as DIALOG also provide a service that displays, for a specified keyword, the number of data hit by that keyword in each file.

[Problem to be solved by the invention]

However, even this did not tell the searcher enough about the field of each hit datum. Setting the keyword also raised notation problems: for example, a search hits data written 「コンピュータ」 but misses data written 「コンピューター」 (two spellings of "computer"). There were similar problems with abbreviations, such as writing 「国体」 for 「国民体育大会」 (National Athletic Meet); with spelling errors and hyphenation in compound words, particularly in English; with synonyms such as 「コンピュータ」 and 「計算機」 (both meaning "computer"); and with derived words.

The present invention aims to solve these problems so that, for example when a layperson searches a database, the words that should be used as keywords can be found immediately even without many years of search experience, and so that, on that basis, one or more files best suited to the content the searcher wants can be selected from the database.

[Means to solve the problem]

The present invention addresses the file selection problem that beginners at database searching in particular encounter, together with the keyword setting problem that accompanies it. For keyword setting, means is provided for converting the search word entered by a beginner, by translation or the like, into a keyword that can be searched for in the database. For file selection, means is provided for creating, on the basis of the search keyword set by the keyword conversion means, a histogram of all keywords assigned to the data to which that search keyword is assigned, for each file or for each designated set of files.

[Operation]

By the above means, a word entered by the database searcher can be converted into a keyword for database searching. Furthermore, for the one or more data in a file to which the search keyword is assigned, the distribution within the file of all the keywords assigned to those data makes it easy to learn the content, field, and tendencies of the data stored in that file. Needless to say, the keyword conversion means is useful not only when looking for a file but also in ordinary searching after the file has been fixed. Once the file has been fixed, the system is used in the ordinary way: a search keyword is entered against the fixed file and the search results are shown on the display unit.

[Example]

FIG. 1 shows an embodiment of the present invention. Reference numeral 1 denotes a keyword conversion unit, which searches the keyword dictionary 2 on the basis of the word input as a search word via signal line 11 and, if necessary, converts the input word X into a search keyword X'.

Search words are input to the dictionary 2, and converted search keywords are output from it, via signal line 12. An input word that can be used as a search keyword as it stands is not converted. In the keyword dictionary 2, conversion pairs are registered, for example, between the word 「インタフェース」 ("interface"), which is registered as a keyword, and variant spellings that a database searcher might enter, such as 「インターフェース」 and 「インターフェイス」.

The keyword dictionary 2 does more than register word pairs such as synonyms, near-synonyms, abbreviations, and translations. It is also given functions to resolve derived-word problems using a thesaurus or the like; to correct misspellings and omitted characters using a dictionary, particularly for English and similar languages; to resolve hyphenation and compound-word problems such as "First Class", "First-class", and "Firstclass"; and to register new dictionary entries or delete entries as needed.

The search keyword X', obtained by converting the search word X entered by the user in the keyword conversion unit 1, is transferred via signal line 13 to the histogram calculation unit 3, database 4, and database 4'. Clearly, the search word input to the keyword conversion unit 1 may be not only a single word but also an expression in which words are combined with operators such as and and or. It is also clear that the same processing is possible when there are three or more databases.
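
The conversion from an input word X to a search keyword X' can be illustrated with a minimal Python sketch. Everything named here (`KEYWORD_DICTIONARY`, `normalize`, `convert_keyword`) is hypothetical and not taken from the patent; the dictionary entries simply echo the examples in the text. One further assumption is that case folding and hyphen handling are done by a normalization step before the lookup, so an unregistered word is passed through in normalized form.

```python
import re

# Hypothetical conversion pairs: variant form -> registered keyword.
# The entries echo the examples in the text; a real dictionary would be far larger.
KEYWORD_DICTIONARY = {
    "インターフェース": "インタフェース",
    "インターフェイス": "インタフェース",
    "国体": "国民体育大会",
    "firstclass": "first class",
}

def normalize(word: str) -> str:
    """Fold case and treat hyphens and runs of whitespace as a single space."""
    return re.sub(r"[-\s]+", " ", word.strip().lower())

def convert_keyword(word: str, dictionary: dict = KEYWORD_DICTIONARY) -> str:
    """Return the search keyword X' for an input word X.

    A word with no dictionary entry is passed through after normalization.
    """
    key = normalize(word)
    return dictionary.get(key, key)

for x in ["First Class", "First-class", "Firstclass", "インターフェイス"]:
    print(x, "->", convert_keyword(x))
```

A flat lookup table only covers registered variants. The thesaurus-based handling of derived words and the spelling correction mentioned in the text would need additional components that this sketch does not attempt.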

As shown in FIG. 1, each database is composed of a plurality of files that differ in content and other respects, and each file stores a plurality of data, each assigned one or more keywords.

In a database search, those data that are hit by the searcher's keyword are output.

Database 4 contains a plurality of files, "File 1", "File 2", ..., "File n", and database 4' contains "File 1'", "File 2'", ..., "File n'". Each database first searches each of its files on the basis of the search keyword X' sent from the keyword conversion unit 1 and extracts the data to which the search keyword X' is assigned.
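
As an illustration of this extraction step, the following Python sketch models a database as a mapping from file names to records, each record carrying a set of assigned keywords. The `Record` type, the sample data, and `extract_hits` are assumptions made for the example; the patent does not specify any data layout.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    """One piece of data stored in a file, with its assigned keywords."""
    record_id: str
    keywords: frozenset

# Hypothetical database: file name -> list of records.
database = {
    "File 1": [
        Record("r1", frozenset({"X'", "B"})),
        Record("r2", frozenset({"X'", "A", "B"})),
        Record("r3", frozenset({"C"})),
    ],
    "File 2": [
        Record("r4", frozenset({"X'", "D"})),
        Record("r5", frozenset({"A"})),
    ],
}

def extract_hits(db, search_keyword):
    """For each file, keep only the records to which search_keyword is assigned."""
    return {
        name: [rec for rec in records if search_keyword in rec.keywords]
        for name, records in db.items()
    }

hits = extract_hits(database, "X'")
for name, records in hits.items():
    print(name, [rec.record_id for rec in records])
```

Keeping the result grouped by file preserves the information the later histogram step needs, namely which file each hit came from.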

Next, all keywords assigned to the extracted data are extracted for each file, or for each designated set of files, and output to the histogram calculation unit 3 via signal line 14. Naturally, these keywords include the search keyword X'.

The histogram calculation unit 3 organizes the keywords input via signal line 14 for each file, or for each designated set of files, and creates a histogram. It then outputs the histogram via signal line 15, together with the search keyword X' input via signal line 13.
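
The histogram calculation can be sketched in Python with `collections.Counter`. The function name `keyword_histograms`, the `groups` parameter, and the sample data are hypothetical. The sketch assumes each hit record is represented by the set of keywords assigned to it, so a keyword is counted once per record.

```python
from collections import Counter

def keyword_histograms(hits_by_file, groups=None):
    """Count keyword occurrences among hit records, per file or per file group.

    hits_by_file: file name -> list of keyword sets (one set per hit record).
    groups: optional mapping of group name -> list of file names to pool;
            when omitted, each file forms its own group.
    """
    if groups is None:
        groups = {name: [name] for name in hits_by_file}
    histograms = {}
    for group_name, file_names in groups.items():
        counts = Counter()
        for file_name in file_names:
            for keywords in hits_by_file[file_name]:
                counts.update(keywords)  # each keyword counted once per record
        histograms[group_name] = counts
    return histograms

hits_by_file = {
    "File 1": [{"X'", "B"}, {"X'", "A", "B"}],
    "File 2": [{"X'", "D"}],
}
per_file = keyword_histograms(hits_by_file)
pooled = keyword_histograms(hits_by_file, {"Files 1+2": ["File 1", "File 2"]})
print(per_file)
print(pooled)
```

Because every hit record carries the search keyword X', its count in each histogram equals the number of hits in that file or group.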

The display unit 5 displays the search keyword X' and the histogram input via signal line 15.

FIG. 2 shows an example of the displayed result. In FIG. 2, the number of data containing keyword A is 3 in "File 1" and 2 in "File 2"; likewise, the number containing keyword B is 10 and 1, the number containing keyword C is 3 and 2, and the number containing keyword D is 1 and 10. The number of data containing the search keyword X' is 15 in both "File 1" and "File 2".

Here, although the number of data to which the search keyword X' is assigned is the same, 15, in both files, many of those data contain keyword B in "File 1" and many contain keyword D in "File 2", which shows that the data stored in "File 1" and "File 2" differ in nature. The database searcher can therefore compare keyword B with keyword D and choose the file that contains more keywords close to the field he or she wants to search.
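
The comparison can be made concrete with the counts described for FIG. 2. In the Python sketch below, `render` and `best_file` are hypothetical helpers. In the patent the choice is left to the searcher reading the display, so `best_file` only illustrates one possible rule: prefer the file in which a keyword of interest occurs most often among the hits.

```python
# Counts as described for FIG. 2: keyword -> number of hit data containing it.
fig2 = {
    "File 1": {"A": 3, "B": 10, "C": 3, "D": 1, "X'": 15},
    "File 2": {"A": 2, "B": 1, "C": 2, "D": 10, "X'": 15},
}

def render(histogram):
    """Return a text bar chart, one line per keyword, in descending count order."""
    rows = sorted(histogram.items(), key=lambda kv: (-kv[1], kv[0]))
    return "\n".join(f"{kw:<3}{'#' * n} {n}" for kw, n in rows)

def best_file(histograms, preferred_keyword):
    """Pick the file whose hit data most often carry preferred_keyword."""
    return max(histograms, key=lambda name: histograms[name].get(preferred_keyword, 0))

for name, histogram in fig2.items():
    print(name)
    print(render(histogram))

print(best_file(fig2, "B"))  # File 1
print(best_file(fig2, "D"))  # File 2
```

The hit count for X' alone (15 in both files) cannot separate the two files; the secondary keywords B and D do.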

In a conventional database search system, the searcher in such a case is told only that the number of data containing the search keyword X' is the same, 15, in "File 1" and "File 2", and so could not judge on the spot which file was better suited to his or her needs.

Once the file to be searched has been fixed, the search keyword output from the keyword conversion unit 1 is sent to that file, and the search results are output to the display unit 5 via signal line 16.

[Effect of the invention]

As explained above, according to the present invention, even a layperson searching a database can easily set search keywords and can easily select from the database the file best suited to the search content.

[Brief explanation of drawings]

FIG. 1 shows an embodiment of the present invention, and FIG. 2 shows a display example of a histogram calculated by the histogram calculation unit.

1: keyword conversion unit; 2: keyword dictionary; 3: histogram calculation unit; 4, 4': databases; 5: display unit; 11 to 16: signal lines.

Claims

[Claims] A database search system for a database search device that selects, from one or more databases each containing a plurality of files, one or more files containing a large amount of data with the content desired by a searcher, characterized in that each file is searched for data matching a search keyword, or a search expression in which search keywords are combined by a logical expression; the frequency of occurrence of the keywords assigned to each of the retrieved data is calculated within the same file, or within a plurality of designated files, to obtain a histogram; and the histogram is displayed.