JPH06290219A

JPH06290219A - Document processor with character retrieving function

Info

Publication number: JPH06290219A
Application number: JP5074321A
Authority: JP
Inventors: Yukio Shimizu; 裕紀夫清水
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1993-03-31
Filing date: 1993-03-31
Publication date: 1994-10-18

Abstract

PURPOSE: To improve retrieval efficiency by enabling partial-match retrieval in phrase (bunsetsu) units of a kanji-kana mixed sentence. CONSTITUTION: When a converting means 102 converts reading information entered through an input means 101 into a kanji-kana mixed sentence, a phrase information adding means 103 adds phrase delimiter information to the sentence. The delimiter information is added in one of three ways: by adding special codes that are independent of the character codes, by embedding it as a special flag in the character codes, or by storing the character string phrase by phrase. When a retrieval means 105 then performs retrieval, it searches the document stored in a document storage means 104 for text that matches one or more of the phrases of the kanji-kana mixed sentence. Because partial-match retrieval is performed in phrase units, meaningless retrieval conditions are not generated and fruitless retrieval operations are suppressed.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a document processing apparatus, such as a word processor or personal computer, capable of creating documents in Japanese, and more particularly to a document processing apparatus with a character search function that can search a document for a specified character string.

[0002]

[Prior Art] Conventionally, in this type of document processing apparatus, as shown in FIG. 7, there are two methods of searching a document for a specific character string: one that requires an exact match with the specified character string, and one that accepts a partial match in which only part of the specified character string matches.

[0003] In the exact-match method, as shown in FIG. 7(a), if the specified character string has M characters, a match is declared only when the characters match at every position from the 1st to the M-th. In other words, the match condition is the specified character string itself.

[0004] In the partial-match method, on the other hand, as shown in FIG. 7(b), if the specified character string has M characters, a match is declared when the characters match at some of the positions from the 1st to the M-th. That is, the condition is relaxed by ignoring mismatches at one or more of the positions 1 through M.
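The two conventional conditions of FIG. 7 can be sketched as a position-by-position comparison. The following is a minimal Python illustration, not code from the patent: the function names and the `max_ignored` parameter are assumptions introduced here, and positions are 0-based rather than the 1-based numbering used in the text.

```python
def matching_positions(doc, pos, query):
    """Count positions at which query and the document window at pos agree.

    Returns None when fewer than len(query) characters remain in doc.
    """
    window = doc[pos:pos + len(query)]
    if len(window) < len(query):
        return None
    return sum(1 for a, b in zip(window, query) if a == b)


def is_exact_match(doc, pos, query):
    # FIG. 7(a): every one of the M positions must agree.
    return matching_positions(doc, pos, query) == len(query)


def is_char_partial_match(doc, pos, query, max_ignored):
    # FIG. 7(b): mismatches at up to max_ignored positions are ignored.
    same = matching_positions(doc, pos, query)
    return same is not None and same >= len(query) - max_ignored


doc = "部分一致検索では"
query = "完全一致検索"
print(is_exact_match(doc, 0, query))            # False
print(is_char_partial_match(doc, 0, query, 2))  # True
```

Here the first two characters differ and the remaining four agree, so the window is rejected by the exact condition but accepted once two mismatches are tolerated.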

[0005]

[Problems to Be Solved by the Invention] However, because the conventional exact-match method searches only for character strings that match the specified string exactly, if the operator's memory is vague and the wrong search string is entered, either no matching string is found at all or an unexpected string is retrieved. This is especially likely with long documents or documents written over a long period.

[0006] The conventional partial-match method, on the other hand, not only uses the specified character string as a condition as it stands but also automatically generates relaxed conditions, so it can absorb the operator's lapses of memory or input errors, which are the weakness of the exact-match method. However, the specified string is treated merely as a collection of individual characters, so the conditions are generated automatically with every character given equal standing, that is, with each character treated as independent. Consequently, meaningless conditions that are extremely unlikely to occur in any document are also generated, resulting in pointless search work. Moreover, when the condition is relaxed with each character treated as independent, a large number of conditions count as matches, and many unexpected character strings are extracted.
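To give a feel for how quickly character-level relaxation multiplies the conditions, the sketch below enumerates every pattern that keeps at least one character of the search string and ignores at least one. It is an illustration only: whether a conventional device enumerated exactly this set is an assumption, as are the function name and the use of a full-width asterisk as the wildcard.

```python
from itertools import combinations


def char_level_conditions(query, wildcard="＊"):
    """All relaxed patterns that keep at least one character and ignore at least one."""
    m = len(query)
    patterns = []
    for kept_count in range(m - 1, 0, -1):
        for kept in combinations(range(m), kept_count):
            patterns.append("".join(ch if i in kept else wildcard
                                    for i, ch in enumerate(query)))
    return patterns


patterns = char_level_conditions("完全一致検索")
print(len(patterns))              # 62, i.e. 2**6 - 2
print("＊全＊＊検＊" in patterns)  # True
```

A six-character string already yields 62 relaxed patterns, and many of them, such as the one that keeps only 「全」 and 「検」, cut across word boundaries and carry no meaning.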

[0007] Because of these drawbacks, the conventional exact-match method demands accuracy in the operator's memory and input, while the partial-match method demands considerable working time.

[0008] This invention has been made in view of these circumstances, and provides a document processing apparatus with a character search function that improves search efficiency by enabling partial-match search in phrase (bunsetsu) units when searching for a character string consisting of a plurality of phrases.

[0009]

[Means for Solving the Problems] FIG. 1 is a block diagram showing the configuration of this invention. As shown in the figure, the invention provides a document processing apparatus with a character search function comprising: input means 101 for inputting reading information of words over a plurality of phrases; conversion means 102 which has a kana-kanji conversion dictionary and, by referring to the dictionary, converts the input reading information into a kanji-kana mixed sentence consisting of a plurality of phrases; phrase information adding means 103 for adding phrase delimiter information to the kanji-kana mixed sentence when the input reading information is converted into the kanji-kana mixed sentence; document storage means 104 storing a document containing many kanji-kana mixed sentences to which phrase delimiter information has been added; search means 105 which, by referring to the phrase delimiter information added to the kanji-kana mixed sentence, searches the document stored in the document storage means 104 for a kanji-kana mixed sentence that matches any one or more of the plurality of phrases of the kanji-kana mixed sentence converted by the conversion means 102; and output means 106 for outputting the result of the search by the search means 105.

[0010] An input device such as a keyboard or a tablet is used as the input means 101 of this invention.

[0011] For the conversion means 102, the phrase information adding means 103, and the search means 105, it is convenient to use a microcomputer system consisting of a CPU, ROM, RAM, and interfaces; the RAM in that system is normally used as the document storage means 104.

[0012] A display device such as a CRT display or an LCD (liquid crystal display), or any of various printers, is used as the output means 106.

[0013]

[Operation] According to this invention, when the reading information entered through the input means 101 is converted into a kanji-kana mixed sentence by the conversion means 102, the phrase information adding means 103 adds phrase delimiter information to the sentence. When the search means 105 then performs a search, it searches the document stored in the document storage means 104 for a kanji-kana mixed sentence that matches any one or more of the phrases of that kanji-kana mixed sentence.

[0014] Because partial-match search is thus performed in phrase units, meaningless search conditions are no longer generated; this suppresses wasted search work and makes searching simpler and more efficient.

[0015]

[Embodiment] The invention is described in detail below on the basis of the embodiment shown in the drawings. The invention is not limited to this embodiment.

[0016] FIG. 2 is a block diagram showing the configuration of an embodiment in which the invention is applied to a Japanese word processor. As shown in the figure, the Japanese word processor comprises: a CPU (central processing unit) 1 that executes programs and controls the entire system; a RAM 2 that stores document data; a ROM 3 that stores the control program executed by the CPU 1 and the dictionary data for kana-kanji conversion; a printer 5 serving as printing means; a printer controller 4 that controls the printer 5; a display device 7 consisting of a CRT, a liquid crystal display, or the like; a display controller 6 that controls the display device 7 and causes it to display data; a keyboard 9 serving as input means; a key interface 8 serving as the interface for the keyboard 9; an FD (floppy disk drive) 11 and an IC card 13 serving as means for storing data, programs, and the like; an FD controller 10 and an IC card interface 12 that control the FD 11 and the IC card 13, respectively; and an EEPROM 15 for storing system data.

[0017] The RAM 2, ROM 3, printer controller 4, display controller 6, key interface 8, FD controller 10, IC card interface 12, and system data EEPROM 15 are connected to the CPU 1 via an address bus and data bus 14.

[0018] Reading information of words is entered from the keyboard 9 over a plurality of phrases. By referring to the kana-kanji conversion dictionary stored in the ROM 3, the CPU 1 converts the entered reading information into a kanji-kana mixed sentence consisting of a plurality of phrases. During this conversion, it adds phrase delimiter information to the kanji-kana mixed sentence, and a document containing many kanji-kana mixed sentences with such delimiter information is stored in the RAM 2.

[0019] When a search string consisting of a plurality of phrases is specified from the keyboard 9, the CPU 1 refers to the phrase delimiter information added to the kanji-kana mixed sentence, searches the document stored in the RAM 2 for a character string that matches any one or more of the phrases of the search string, and displays the search result on the display device 7.

[0020] The phrase delimiter information is added as follows. During kana-kanji conversion, the CPU 1 refers to the kana-kanji conversion dictionary and converts the input string into a kanji-kana mixed sentence while automatically recognizing its phrase boundaries (so-called multi-phrase conversion, renbunsetsu henkan). When the conversion is confirmed, the phrase boundaries are added to the specified character string as delimiter information, and the result is stored in the RAM 2.

[0021] This information can be added by any of the following methods.
- Add a special code that is independent of the character codes (see FIG. 3(a)).
- Embed the information in the character codes as a special flag (see FIG. 3(b)).
- Instead of providing the delimiter information as a code or flag, store the specified character string separately for each phrase (see FIG. 3(c)).
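The three storage methods of FIG. 3 can be compared with a small sketch. The patent does not give concrete code values, so the separator code `SEP`, the flag bit `FLAG`, and all function names below are assumptions chosen for illustration; the only point is that each representation can be decoded back into the same list of phrases.

```python
SEP = "\x1f"     # assumed special code, distinct from any text character (FIG. 3(a))
FLAG = 1 << 21   # assumed flag bit, above the Unicode code point range (FIG. 3(b))


def encode_separator(phrases):
    # (a) Insert a special code between phrases.
    return SEP.join(phrases)


def decode_separator(text):
    return text.split(SEP)


def encode_flag(phrases):
    # (b) Set a flag on the code of the first character of every phrase.
    codes = []
    for phrase in phrases:
        for i, ch in enumerate(phrase):
            codes.append(ord(ch) | (FLAG if i == 0 else 0))
    return codes


def decode_flag(codes):
    phrases = []
    for code in codes:
        ch = chr(code & ~FLAG)
        if code & FLAG:
            phrases.append(ch)   # flagged character starts a new phrase
        else:
            phrases[-1] += ch
    return phrases


# (c) Store the string phrase by phrase: the list itself is the representation.
phrases = ["完全", "一致", "検索"]
print(decode_separator(encode_separator(phrases)) == phrases)  # True
print(decode_flag(encode_flag(phrases)) == phrases)            # True
```

Method (a) lengthens the stored string, method (b) keeps the length but needs a spare bit in each character code, and method (c) trades a single string for several short ones.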

[0022] Next, how these delimiters are used to relax the search condition is explained with reference to FIG. 4. Suppose the initially specified string of M characters is divided into phrase A, phrase B, and phrase C (each phrase being a string of one or more characters); partial matching is then performed phrase by phrase. An exact match means that phrases A, B, and C all match; a partial match means that any one or more of phrases A, B, and C match.

[0023] As the first relaxation, any one phrase is dropped from the match condition. This gives, for example, three conditions: phrases A and B match, phrases A and C match, or phrases B and C match. The match condition is relaxed further by increasing the number of dropped phrases one at a time. Relaxation continues only until the match condition consists of a single phrase (in the example in the figure, until only phrase A, only phrase B, or only phrase C matches); this is the limit. In the figure, "*" stands for arbitrary characters.
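The phrase-level relaxation of FIG. 4 can be sketched by enumerating, level by level, which phrases are kept. This is an illustrative sketch rather than the patent's implementation; the function name and the list-of-lists return shape are assumptions, and "*" marks a dropped phrase as in the figure.

```python
from itertools import combinations


def relaxed_conditions(phrases, wildcard="*"):
    """Return one list of conditions per relaxation level (K-1 kept, ..., 1 kept)."""
    k = len(phrases)
    levels = []
    for keep in range(k - 1, 0, -1):
        level = []
        for kept in combinations(range(k), keep):
            level.append([p if i in kept else wildcard
                          for i, p in enumerate(phrases)])
        levels.append(level)
    return levels


for level in relaxed_conditions(["A", "B", "C"]):
    print(level)
# [['A', 'B', '*'], ['A', '*', 'C'], ['*', 'B', 'C']]
# [['A', '*', '*'], ['*', 'B', '*'], ['*', '*', 'C']]
```

Three phrases give only six relaxed conditions in total, each built from whole phrases.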

[0024] The processing performed in this configuration is described with reference to the flowchart in FIG. 5. First, in step S1 the operator starts the search process, and in step S2 the operator enters from the keyboard 9 the character string to be searched for in the document (see screen 1 in FIG. 6). In step S3 the entered string is converted into a kanji-kana mixed sentence (see screen 2 in FIG. 6). That is, under the kana-kanji conversion program stored in the ROM 3, the CPU 1 refers to the kana-kanji conversion dictionary stored in the ROM 3, automatically divides the entered string into one or more phrases, and converts it phrase by phrase into kanji to produce a kanji-kana mixed sentence. The input string, consisting of one or more characters, is thereby divided into N phrases, phrase 1 through phrase N. In step S4 the converted string is confirmed, and in step S5 the delimiter information from the conversion is added and the specified string is stored. Work then begins in exact-match search mode.

[0025] In step S6 the comparison position Q is set. Its initial value is the beginning of the document (the first character: Q = 1). In step S7 the document data at comparison position Q is compared with the specified string. If the entire specified string matches, the process goes from step S8 to S11, displays the document data surrounding the matching string (see screen 3 in FIG. 6), and proceeds to step S12.

[0026] If there is no match in step S8, the process goes to step S9 and checks the search position. If the end of the document has not been reached, that is, if document data remains to be searched, the process goes to step S10 and advances the comparison position by one (Q = Q + 1). If the end of the document has been reached, the process goes to step S13.

[0027] In step S12 the operator chooses whether to continue or stop the search (see screen 4 in FIG. 6); if stopping is chosen, the process ends. If continuing is chosen, the process goes from step S12 to step S9 and the search continues.

[0028] The search up to this point is the exact-match search mode. If exact-match mode finds no matching string, or if the search is continued and reaches the end of the document, the process goes from step S9 to step S13 and automatically enters partial-match search mode.
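The exact-match scan of steps S6 to S12 amounts to sliding the search string over the document one character at a time. The sketch below is an assumption-laden simplification: the function name is invented here, positions are 0-based rather than starting at Q = 1, and the operator's continue/stop choice in step S12 is replaced by a generator that the caller may stop consuming at any point.

```python
def exact_search(doc, query):
    """Yield each 0-based position where query occurs in doc (steps S6 to S10)."""
    q = 0                                # S6: start at the first character
    while q + len(query) <= len(doc):    # S9: stop when no text remains
        if doc.startswith(query, q):     # S7/S8: compare at position q
            yield q                      # S11: the caller displays the hit
        q += 1                           # S10: advance by one


doc = "完全一致検索と部分一致検索"
print(list(exact_search(doc, "一致検索")))      # [2, 9]
print(list(exact_search(doc, "完全一致検査")))  # []
```

When the generator is exhausted, with or without hits, the flowchart moves on to the partial-match mode of step S13.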

[0029] In partial-match mode, step S13 first sets the match criterion L to K, where K is the number of phrases in the string examined in exact-match mode. Step S14 then subtracts 1 from L, and step S15 checks L. Since L is the number of matching phrases required for a match, if L < 1 relaxation is impossible and the work ends (see screen 7 in FIG. 6). In other words, if the string entered as the original condition consists of a single phrase, no partial-match search is performed.

[0030] If L is 1 or more, that is, if the condition string consists of two or more phrases, relaxation is judged possible and the process goes from step S15 to step S16. In step S16 the search position P, which represents the number of characters from the beginning of the document, is set to 1. In step S17, M, which indicates which phrase is being examined, is initialized to 1, the phrase match count N to 0, and the auxiliary pointer R to 0.

[0031] In step S18, it is checked whether the character string of the M-th phrase from the beginning matches the character string in the document data at position (P + R). That is, if characters 1 through H of the M-th phrase (H being the length of the phrase) all match the characters at positions (P + R + 1 − 1) through (P + R + H − 1) from the beginning of the document data, the process goes from step S19 to step S20. If even one character fails to match in step S19, the process goes to step S21.

[0032] In step S20, 1 is added to the phrase match count N, and the number of characters in the phrase M currently being compared is added to the auxiliary pointer R so that the next phrase can be checked in sequence. In step S21 the number of unmatched phrases (M − N) is compared with the number of unmatched phrases allowed (K − L). If (M − N) is less than or equal to (K − L), the process goes to step S22, where M is compared with the total number of phrases K; if they are equal, the process goes to step S25. If in step S21 (M − N) is greater than (K − L), the position is judged a mismatch and the process goes to step S23. If in step S22 M does not equal the total number of phrases K, the process goes to step S24, adds 1 to M, and returns to step S18 to compare the next phrase.

[0033] If M and K are equal in step S22, all phrases have been compared, so the process goes to step S25, treats the partial match as established, and displays the document data in that vicinity (see screen 5 in FIG. 6). After the result is checked, the operator chooses in step S26 whether to continue or stop (see screen 6 in FIG. 6); if stopping is chosen, the process ends (see screen 7 in FIG. 6).

[0034] If the search is to continue, the process goes to step S23 and advances the search position by one. It then goes to step S27 and checks whether any document data remains to be searched. If P is at the end of the document, the process goes to step S28 and relaxes the match condition by subtracting 1 from L, the criterion for a match. It then goes to step S15 and, with the relaxed condition, performs the partial-match search again from the beginning of the document. If the end of the document has not been reached, the process goes to step S17 and continues with the current condition at the advanced comparison position.
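The partial-match loop of steps S13 to S28 can be sketched as follows. This is one reading of the flowchart description, not the patent's own code: the function names are invented here, positions are 0-based (so position 2 below corresponds to P = 3 in the text), the operator prompts are omitted, and a dropped phrase is assumed to consume no characters of the document, which is how the worked example in the description behaves.

```python
def phrases_match_at(doc, pos, phrases, required):
    """Return the matched length if at least `required` phrases match at pos, else None."""
    k = len(phrases)
    matched = 0   # N: number of phrases matched so far
    offset = 0    # R: auxiliary pointer, characters consumed by matched phrases
    for m, phrase in enumerate(phrases, start=1):   # M = 1..K (S17, S24)
        if doc.startswith(phrase, pos + offset):    # S18/S19
            matched += 1                            # S20
            offset += len(phrase)
        elif m - matched > k - required:            # S21: too many misses
            return None
    return offset                                   # S22: all K phrases examined


def partial_search(doc, phrases):
    """Yield (required, pos) for each hit, relaxing from K-1 phrases down to 1."""
    for required in range(len(phrases) - 1, 0, -1):   # L: S14, S28
        for pos in range(len(doc)):                   # P: S16, S23, S27
            if phrases_match_at(doc, pos, phrases, required) is not None:
                yield required, pos                   # S25: display the hit


hits = list(partial_search("部分一致検索では", ["完全", "一致", "検索"]))
print(hits)  # [(2, 2), (1, 2), (1, 4)]
```

With two phrases required, only the position where 「一致検索」 begins is reported; after relaxing to one phrase, the position where 「検索」 alone begins is reported as well.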

[0035] The above processing can be broadly divided as follows.
- Steps S1 to S5: entering the search string (kana-kanji conversion)
- Steps S6 to S12: exact-match search
- Steps S13 to S28: partial-match search

[0036] An actual input example is described using the screen examples in FIG. 6. The process is started in step S1; in step S2 the search string 「かんぜんいっちけんさく」 (kanzen itchi kensaku) is entered from the keyboard 9, as in screen 1 of FIG. 6; and in step S3 it is converted by kana-kanji conversion into 「完全一致検索」 ("exact match search"), as in screen 2 of FIG. 6. Assume that this conversion automatically divided the string into three phrases: 「完全」 ("exact"), 「一致」 ("match"), and 「検索」 ("search").
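How a conversion step might yield both the converted text and its phrase boundaries can be shown with a toy example. A real kana-kanji converter relies on a large dictionary and grammatical analysis; the three-entry dictionary, the greedy longest-match rule, and the function name below are all assumptions made purely for illustration.

```python
# Hypothetical three-entry dictionary: reading -> converted phrase.
TOY_DICT = {"かんぜん": "完全", "いっち": "一致", "けんさく": "検索"}


def convert(reading, dictionary=TOY_DICT):
    """Split a reading into phrases by greedy longest match and convert each one."""
    phrases = []
    i = 0
    while i < len(reading):
        for j in range(len(reading), i, -1):    # try the longest candidate first
            word = dictionary.get(reading[i:j])
            if word is not None:
                phrases.append(word)
                i = j
                break
        else:
            phrases.append(reading[i])          # no entry: keep the kana as is
            i += 1
    return phrases


phrases = convert("かんぜんいっちけんさく")
print(phrases)           # ['完全', '一致', '検索']
print("".join(phrases))  # 完全一致検索
```

Returning a list keeps the boundaries for later use in searching, while joining the list gives the text shown to the operator.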

[0037] When the conversion is confirmed in step S4, the phrase delimiter information is embedded in the search string 「完全一致検索」 and stored. The exact-match search then begins. First, the pointer is set so that checking starts from the beginning of the document (the first character). In step S7 the first character is compared with 「完」, the second with 「全」, the third with 「一」, ..., the fifth with 「検」, and the sixth with 「索」. If all of them match, the process goes from step S8 to step S11, where that part of the document is displayed as in screen 3 of FIG. 6, and then to step S12, where screen 4 of FIG. 6 is displayed to ask whether to continue. If the operator chooses to continue, the process goes from step S9 to step S10, advances the pointer, and compares again. Steps S7, S8, S9, and S10 are repeated in this way until the next matching string is found or the end of the document is reached. If step S9 finds that the end of the document has been reached, the process goes to step S13 and enters partial-match search mode.

[0038] In partial-match search mode, the number of phrases in 「完全一致検索」, 3, is first assigned to L (step S13); since L − 1 = 2 (step S14), the process goes from step S15 to step S16. The pointer is set to 1, so that comparison starts at the beginning of the document. After the necessary values are set in step S17, suppose that in step S18 the string 「部分一致検索では」 ("in partial match search") is found at the position in the document data indicated by the pointer. First, 「部分」 is compared with 「完全」. They do not match, so the process goes to step S21 and compares (M − N) = 1 with (K − L) = 1. These are equal, so the process goes to step S22, and since M = 1 and K = 3, on to steps S24 and S18. Here the phrase 「一致」 is compared with 「部分」. They do not match, and with (M − N) = 2 and (K − L) = 1 the match criterion is no longer met, so comparison at this position is abandoned and the process goes to step S23, which advances the comparison position pointer P. Step S27 also checks whether the end of the document has been reached, and the comparison is then made against the document string 「分一致検索で」. This also fails to match, so the comparison position pointer P is advanced again and the comparison is made against the document string 「一致検索では」.

[0039] First, the phrase 「完全」 is compared with 「一致」 and does not match. Next, 「一致」 is compared with 「一致」 and matches, so N is set to 1 and the length of the phrase 「一致」, 2, is added to the auxiliary pointer R (step S20). Next, the phrase 「検索」 is compared with the text starting at character (P + 2) of the document string 「一致検索では」, that is, 「検索」 (step S18). These match, so N becomes 2 and the auxiliary pointer R becomes 4 (step S20). In step S21, (M − N) = (3 − 2) = 1 and (K − L) = (3 − 2) = 1, so (M − N) and (K − L) are equal.

[0040] Once all three phrases have been compared (step S22), the partial match is established; the process goes to step S25 and displays that part of the document as in screen 5 of FIG. 6. The operator then chooses whether to continue (step S26). If not, the work ends; if so, the same processing is repeated until a matching string is found or the comparison position reaches the end of the document.

[0041] In practice, the condition is relaxed by decrementing the match criterion L by 1 at a time. When L reaches 0, relaxation is no longer possible and the process ends.

[0042] In this way, when searching for character strings that partially match a specific string, using the phrase delimiter information obtained during conversion into a kanji-kana mixed sentence makes it possible to perform partial-match search in phrase units, which makes the search work more efficient.

[0043]

[Effects of the Invention] This invention provides the following effects. Because the relationships among the characters in the specified search string are made explicit, automatic generation of conditions the operator does not expect can be suppressed. Wasted work, namely generating and searching with conditions the operator does not expect, is reduced, so the work becomes more efficient and takes less time. Furthermore, because generation of meaningless conditions is suppressed, unnecessary strings are no longer extracted, and the work becomes more effective. Finally, because the phrase boundaries used in conversion are derived from the internal dictionary data, efficient search is possible on any apparatus that performs conversion with the same dictionary data.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of the invention.

FIG. 2 is a block diagram showing the configuration of an embodiment of the invention.

FIG. 3 is an explanatory diagram showing methods of embedding phrase delimiter information.

FIG. 4 is an explanatory diagram showing how the delimiter information is used.

FIG. 5 is a flowchart showing the operation of the embodiment.

FIG. 6 is an explanatory diagram showing examples of screen displays.

FIG. 7 is an explanatory diagram showing match determination in the prior art.

[Explanation of symbols]

1 CPU
2 RAM
3 ROM
4 Printer controller
5 Printer
6 Display controller
7 Display device
8 Key interface
9 Keyboard
10 FD controller
11 FD
12 IC card interface
13 IC card
14 Address bus and data bus
15 EEPROM for storing system data

Claims

[Claims]

1. A document processing apparatus with a character search function, comprising: input means for inputting reading information of words over a plurality of phrases; conversion means which has a kana-kanji conversion dictionary and, by referring to the dictionary, converts the input reading information into a kanji-kana mixed sentence consisting of a plurality of phrases; phrase information adding means for adding phrase delimiter information to the kanji-kana mixed sentence when the input reading information is converted into the kanji-kana mixed sentence; document storage means storing a document containing many kanji-kana mixed sentences to which phrase delimiter information has been added; search means which, by referring to the phrase delimiter information added to the kanji-kana mixed sentence, searches the document stored in the document storage means for a kanji-kana mixed sentence that matches any one or more of the plurality of phrases of the kanji-kana mixed sentence converted by the conversion means; and output means for outputting the result of the search by the search means.