JPS62241026A - Character string retrieving system

Character string retrieving system

Info

Publication number
JPS62241026A
JPS62241026A JP61083845A JP8384586A JPS62241026A JP S62241026 A JPS62241026 A JP S62241026A JP 61083845 A JP61083845 A JP 61083845A JP 8384586 A JP8384586 A JP 8384586A JP S62241026 A JPS62241026 A JP S62241026A
Authority
JP
Japan
Prior art keywords
character
data
search
retrieved
character string
Prior art date
Legal status
Pending
Application number
JP61083845A
Other languages
Japanese (ja)
Inventor
Hisanori Takahashi
高橋 久則
Current Assignee
NEC Corp
Original Assignee
NEC Corp
Priority date
1986-04-11
Filing date
1986-04-11
Publication date
1987-10-21
Application filed by NEC Corp
Priority to JP61083845A
Publication of JPS62241026A

Abstract

PURPOSE: To shorten the execution time of a character string search program by determining the least frequently used character in the search string and using that character to select candidates from the data to be searched, thereby reducing the number of comparisons between the search string and the data to be searched.

CONSTITUTION: A used character distribution analysis means 1 reads the data to be searched 100 and analyzes it to output used character distribution data 200, in which the frequency of occurrence of each character in the data 100 is recorded. A determining means 20 of a character string search means 2 determines, based on the data 200, the least frequently used character in the search string 300. A search means 21 then reads the data 100 and searches it for the character determined by the means 20, which is the character to be searched for first, and selects the search target data. A comparison means 22 compares the data selected by the means 21 with the search string 300 and outputs it as retrieved data 400 when the two match.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a character string search method, and in particular to a character string search method in a computer system.

[Conventional Technology]

Conventionally, to find data containing a given character string among data stored in a file, this type of character string search method compares the first character of the search string against the data to be searched, sequentially from the beginning of that data. At each position where the same character is found, it then compares the entire search string to determine whether the condition is met.
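The conventional procedure described above can be sketched roughly as follows. This is a minimal illustration, not code from the patent: the function name `naive_search` and the comparison counter are assumptions introduced here to make the cost visible.

```python
def naive_search(data: str, pattern: str) -> tuple[list[int], int]:
    """Scan for the pattern's first character, then compare the whole pattern there.

    Returns the match positions and the number of full-pattern comparisons
    that were attempted (a hypothetical cost measure for illustration).
    """
    matches: list[int] = []
    comparisons = 0
    if not pattern:
        return matches, comparisons
    first = pattern[0]
    for i in range(len(data) - len(pattern) + 1):
        if data[i] == first:
            comparisons += 1  # a full comparison is attempted at this position
            if data.startswith(pattern, i):
                matches.append(i)
    return matches, comparisons


print(naive_search("AAAAAB", "AB"))  # ([4], 5)
```

In this small example the first character "A" is common in the data, so five full comparisons are attempted to find a single match.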

[Problem to Be Solved by the Invention]

In the conventional character string search method described above, the execution time of the character string search program is proportional to the number of instructions executed and the number of characters referenced during execution. Consequently, when the first character of the search string occurs many times in the data to be searched but few character strings satisfy the search condition, unnecessary instruction execution and character references are performed, and the execution time of the character string search program becomes long.

In view of the above, an object of the present invention is to provide a character string search method that can shorten the execution time of a character string search program.

[Means for Solving the Problem]

The character string search method of the present invention is a character string search method that inputs data stored in a file, examines a specified character string under specified conditions, and retrieves the data that satisfies the conditions. It includes a used character distribution analysis means that inputs the data to be searched, analyzes the distribution of the characters used, and creates used character distribution data, and a character string search means that performs a conditional search for character strings in the data to be searched based on the used character distribution data created by the used character distribution analysis means.

[Example]

Next, the present invention will be described with reference to the drawing.

The figure is a block diagram showing one embodiment of the present invention.

The character string search method of this embodiment comprises a used character distribution analysis means 1, a character string search means 2, data to be searched 100, used character distribution data 200, a search string 300, and retrieved data 400.

The character string search means 2 comprises a determining means 20 that determines, based on the used character distribution data 200, the single character to be searched for first; a search means 21 that searches the data to be searched 100 for the character determined by the determining means 20; and a comparison means 22 that compares the data found by the search means 21 with the search string 300 and outputs the retrieved data 400.

Next, the operation of the character string search method of this embodiment, configured as described above, will be described.

First, the used character distribution analysis means 1 reads the data to be searched 100, analyzes it, and outputs the used character distribution data 200. The used character distribution data 200 records the frequency of occurrence of each character in the data to be searched 100. In this embodiment, for example, it records character A as 1%, character B as 3%, character C as 5%, and so on.
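As a rough sketch of this analysis step, the distribution can be computed by counting each character and dividing by the total length. The function name `analyze_distribution` and the choice of a dictionary of relative frequencies are assumptions made for illustration; the patent does not specify a data format.

```python
from collections import Counter


def analyze_distribution(data: str) -> dict[str, float]:
    """Return the relative frequency of every character that occurs in data."""
    total = len(data)
    if total == 0:
        return {}
    counts = Counter(data)
    return {ch: n / total for ch, n in counts.items()}


print(analyze_distribution("AABC"))  # {'A': 0.5, 'B': 0.25, 'C': 0.25}
```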

The determining means 20 determines, based on the used character distribution data 200, the least frequently used character in the search string 300. In this embodiment, for example, when the search string is "ABC", the character "A" is determined.
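The selection can be illustrated with the example frequencies given in the embodiment (A 1%, B 3%, C 5%). The helper name `choose_anchor`, and the decision to return the character's offset within the search string along with the character itself, are assumptions for this sketch.

```python
def choose_anchor(pattern: str, distribution: dict[str, float]) -> tuple[int, str]:
    """Return (offset, character) of the least frequent character in pattern.

    Characters missing from the distribution are treated as frequency 0.
    Ties are resolved in favor of the earliest position.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    return min(enumerate(pattern), key=lambda item: distribution.get(item[1], 0.0))


distribution = {"A": 0.01, "B": 0.03, "C": 0.05}
print(choose_anchor("ABC", distribution))  # (0, 'A')
```

Keeping the offset lets a later step align the whole search string around the position where the rare character is found.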

If the search string 300 specifies a character that does not appear in the used character distribution data 200, it follows that the search string 300 does not exist in the data to be searched 100, and this is known without searching the data at all.
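This early rejection amounts to a membership check against the distribution data. The sketch below assumes the distribution is held as a dictionary keyed by character; the name `may_contain` is hypothetical.

```python
def may_contain(pattern: str, distribution: dict[str, float]) -> bool:
    """Return False when pattern uses a character that never occurs in the data.

    A False result means the pattern cannot match, so the data need not be read.
    A True result only means a search is still required.
    """
    return all(ch in distribution for ch in pattern)


distribution = {"A": 0.5, "B": 0.25, "C": 0.25}
print(may_contain("ABZ", distribution))  # False
print(may_contain("CAB", distribution))  # True
```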

Next, the search means 21 reads the data to be searched 100, searches it for the character that the determining means 20 determined should be searched for first, and selects the search target data.

The comparison means 22 compares the search target data selected by the search means 21 with the search string 300 and outputs the data that satisfies the condition as the retrieved data 400.
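Putting the search and comparison steps together, one possible sketch is shown below. It assumes the data to be searched is a single in-memory string and that "search target data" means candidate positions within it; the patent does not fix these details. The name `rare_char_search` and the comparison counter are illustrative only.

```python
from collections import Counter


def rare_char_search(data: str, pattern: str) -> tuple[list[int], int]:
    """Locate pattern in data by first scanning for its least frequent character.

    Returns the match positions and the number of full-pattern comparisons.
    """
    matches: list[int] = []
    comparisons = 0
    counts = Counter(data)  # stands in for the used character distribution data
    if not pattern or any(ch not in counts for ch in pattern):
        return matches, comparisons  # rejected without scanning the data
    # Determining step: rarest character and its offset within the pattern.
    offset, anchor = min(enumerate(pattern), key=lambda item: counts[item[1]])
    # Search step: visit only the positions where the rare character occurs.
    pos = data.find(anchor, offset)
    while pos != -1:
        start = pos - offset
        if start + len(pattern) <= len(data):
            comparisons += 1  # comparison step: check the whole pattern
            if data.startswith(pattern, start):
                matches.append(start)
        pos = data.find(anchor, pos + 1)
    return matches, comparisons


print(rare_char_search("AAAAAB", "AB"))  # ([4], 1)
```

For the data "AAAAAB" and the search string "AB", scanning for the rare character "B" leads to one full comparison, whereas scanning for the first character "A" would lead to five.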

[Effect of the Invention]

As described above, the present invention analyzes the distribution of the characters present in the data to be searched, determines from the analysis result the least frequently used character in the search string, searches the data to be searched for that character to select the data that should be compared with the search string, and thereby reduces the number of comparisons between the search string and the data to be searched. This has the effect of shortening the execution time of the character string search program. It is particularly effective when the data to be searched is searched with multiple search strings.
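The benefit with multiple search strings comes from analyzing the data once and reusing the result for every query. The sketch below is an illustration under assumptions, not a measurement: `candidate_counts` is a hypothetical helper that reports, for each search string, how many positions a first-character scan would visit versus a rarest-character scan (an upper bound that ignores positions too close to the ends of the data).

```python
from collections import Counter


def candidate_counts(data: str, patterns: list[str]) -> dict[str, tuple[int, int]]:
    """Map each pattern to (first-character candidates, rarest-character candidates)."""
    counts = Counter(data)  # the distribution is computed once and shared
    result: dict[str, tuple[int, int]] = {}
    for pattern in patterns:
        if not pattern:
            continue  # an empty search string has no anchor character
        by_first = counts[pattern[0]]
        by_rarest = min(counts[ch] for ch in pattern)
        result[pattern] = (by_first, by_rarest)
    return result


print(candidate_counts("AAAAABAAAC", ["AB", "AC"]))
# {'AB': (8, 1), 'AC': (8, 1)}
```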

[Brief Description of the Drawing]

The figure is a block diagram showing one embodiment of the present invention. In the figure:

1 ... used character distribution analysis means
2 ... character string search means
20 ... determining means
21 ... search means
22 ... comparison means
100 ... data to be searched
200 ... used character distribution data
300 ... search string
400 ... retrieved data

Claims (1)

A character string search method that inputs data stored in a file, examines a specified character string under specified conditions, and retrieves the data that satisfies the conditions, characterized by comprising:
a used character distribution analysis means that inputs the data to be searched, analyzes the distribution of the characters used, and creates used character distribution data; and
a character string search means that performs a conditional search for character strings in the data to be searched based on the used character distribution data created by the used character distribution analysis means.
JP61083845A 1986-04-11 1986-04-11 Character string retrieving system Pending JPS62241026A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP61083845A JPS62241026A (en) 1986-04-11 1986-04-11 Character string retrieving system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP61083845A JPS62241026A (en) 1986-04-11 1986-04-11 Character string retrieving system

Publications (1)

Publication Number Publication Date
JPS62241026A true JPS62241026A (en) 1987-10-21

Family

ID=13814039

Family Applications (1)

Application Number Title Priority Date Filing Date
JP61083845A Pending JPS62241026A (en) 1986-04-11 1986-04-11 Character string retrieving system

Country Status (1)

Country Link
JP (1) JPS62241026A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH02109167A (en) * 1988-10-18 1990-04-20 Hitachi Ltd Method and device for retrieving character string
WO1990004826A1 (en) * 1988-10-18 1990-05-03 Hitachi, Ltd. Method and apparatus for retrieving key word sequence for concurrent processing
US5604910A (en) * 1988-10-18 1997-02-18 Hitachi, Ltd. Method of and vector processor for searching text for key words based on candidate character strings obtained from the text using parallel processing
US5168533A (en) * 1989-06-14 1992-12-01 Hitachi, Ltd. Hierarchical presearch type text search method and apparatus and magnetic disk unit used in the apparatus
US5220625A (en) * 1989-06-14 1993-06-15 Hitachi, Ltd. Information search terminal and system
US5471610A (en) * 1989-06-14 1995-11-28 Hitachi, Ltd. Method for character string collation with filtering function and apparatus
US5519857A (en) * 1989-06-14 1996-05-21 Hitachi, Ltd. Hierarchical presearch type text search method and apparatus and magnetic disk unit used in the apparatus
US5748953A (en) * 1989-06-14 1998-05-05 Hitachi, Ltd. Document search method wherein stored documents and search queries comprise segmented text data of spaced, nonconsecutive text elements and words segmented by predetermined symbols
US6094647A (en) * 1989-06-14 2000-07-25 Hitachi, Ltd. Presearch type document search method and apparatus
US5357431A (en) * 1992-01-27 1994-10-18 Fujitsu Limited Character string retrieval system using index and unit for making the index
JPH09114842A (en) * 1995-10-13 1997-05-02 Nec Software Ltd Information retrieval processor and information retrieval processing method
WO2006013126A1 (en) * 2004-07-31 2006-02-09 Robert Bosch Gmbh Method for searching character strings and device therefor
