JPH0689330A

JPH0689330A - Image filing system

Info

Publication number: JPH0689330A
Application number: JP4238642A
Authority: JP
Inventors: Yasuto Ishitani; 康人石谷; Shuichi Tsujimoto; 修一辻本
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1992-09-07
Filing date: 1992-09-07
Publication date: 1994-03-29

Abstract

PURPOSE: To make it easy to designate a desired character string area in an input image displayed on a screen while also obtaining the character string direction, to input a keyword used to retrieve the input image efficiently, and to lighten the operating burden on the operator. CONSTITUTION: When the position coordinates of one arbitrary point in the input image displayed on the screen of a display part 3 are given from an operation part 1, a recognition target character string extraction part 7 sets a suitable area based on the given coordinates, determines the arrangement direction of the character string components in that area, expands the area in that arrangement direction according to the result, and extracts the character string components in the expanded area. A character recognition part 8 recognizes the character string components and codes them, and the coded character string is filed on an optical disk 11 in association with the input image as a keyword.

Description

Detailed Description of the Invention

[0001]

【Field of Industrial Application】The present invention relates to an image filing system for recording and retrieving image information.

[0002]

【Prior Art】In image filing systems that record and retrieve image information, various methods have been devised for easily inputting the title used to retrieve the image information.

[0003] One known system that allows such a retrieval title to be input is disclosed in Japanese Unexamined Patent Publication No. 60-136877. In the image filing operation, the operator designates and cuts out a desired character string area on the screen displaying the input image with a pointing means such as a mouse. The content of the designated area is character-recognized as a keyword, coded, and input as a title used for subsequent image retrieval.

[0004] A method is also known in which a character string contained in an image is recognized and the recognized character codes are used to create the name or keyword to be assigned to that image, with the character string area described in a format definition language such as that disclosed in "IEICE Transactions D, J71-D, No. 10, pp. 2050-2058".

[0005]

【Problems to Be Solved by the Invention】With the first approach, however, designating a character string area with a mouse or the like requires pointing to at least two points of the area, and the operator must also set the character string direction at the same time. Moreover, this designation has to be repeated for every input image, so the operator's workload is heavy and the system is difficult to use.

[0006] In the second approach, the document structure and character string positions are described using physical quantities such as absolute coordinates in the image. Consequently, even when documents with the same structure are processed, the character string area cannot be extracted accurately if an input document is substantially displaced. In addition, writing the definition of a character string area is cumbersome and places a heavy burden on the operator, which limits the practicality of the method.

[0007] Furthermore, in systems that create a keyword by designating a keyword image in the input image, when character recognition yields a plurality of recognition candidates, one of them is selected and input as the title. If the recognition result is ambiguous and cannot be narrowed to a single candidate, an incorrect recognition result may therefore be input as the title, which makes accurate image retrieval difficult. If, to avoid this problem, the operator checks and corrects the character recognition result of the keyword image each time an image is input, the operator's burden becomes heavy.

[0008] The present invention has been made in view of the above circumstances. Its object is to provide an image filing system in which a desired character string area on an input image displayed on the screen can be designated easily, the character string direction is obtained at the same time, keywords and the like used to retrieve the input image can be input efficiently, and the operator's workload is reduced.

[0009] Another object of the present invention is to provide an image filing system that can accurately detect the character string component that should serve as a keyword, even when documents of the same type differ in the number of characters or lines or are displaced depending on their content, and that reduces the burden on the operator.

[0010] A further object of the present invention is to provide an image filing system that can perform image retrieval without the operator having to confirm the recognition result for the keyword image in the input image.

[0011]

【Means for Solving the Problems】The present invention comprises: display means for displaying an input image on a screen; means for designating the position coordinates of an arbitrary point on the screen of the display means; recognition target character string extraction means having means for setting a suitable area from the designated position coordinates, means for determining the arrangement direction of character string components from the input image within the set area, means for expanding the area along the determined arrangement direction, and means for extracting the character string components within the expanded area; and means for character-recognizing and coding the character string components extracted by the recognition target character string extraction means and filing the coded character string in association with the input image as a keyword of the input image.

[0012] The present invention also comprises: means for extracting character string components from an input image; means for detecting blocks in which the extracted character string components are integrated according to a predetermined relationship, analyzing the physical position and size of these blocks, and acquiring the semantic connection relationships between the blocks; means for assigning attribute information to each block on the basis of the semantic connection relationships between the blocks; means in which information on predetermined character strings is stored in advance; means for extracting the character string component of a block to which attribute information has been assigned, by referring to the stored information on the predetermined character strings; and means for character-recognizing and coding the extracted character string component and filing the coded character string in association with the input image as a keyword of the input image.

[0013] The present invention further comprises: means for designating a keyword area in an input image; means for cutting out the image of the designated keyword area from the input image as a keyword image; means for performing character recognition on the cut-out keyword image; means for storing the character recognition result together with the character recognition candidates; and means for judging the similarity of each character recognition candidate to a keyword given at retrieval time and preferentially presenting images with high similarity.

[0014]

【Operation】As a result, according to the present invention, when the operator designates an arbitrary point on the input image displayed on the screen, a suitable area is first set and the arrangement direction of the character string components of the input image within the area is determined. Based on this determination, the area is expanded along the arrangement direction of the character string components, character recognition and coding are performed on the character string components in the expanded area, and the input image is filed with the coded character string as a keyword. The operator can thus have a keyword, such as title information to be assigned to the input image, character-recognized, coded, and input simply by designating one arbitrary point on the input image.

[0015] Also according to the present invention, the physical position and size of blocks in which character string components extracted from the input image are integrated according to a predetermined relationship are analyzed, the semantic connection relationships between the blocks are acquired, and attribute information is assigned to each block. Using these semantic connection relationships and attribute information, a character string component that corresponds to the title or keyword of the input image can be detected accurately wherever it is written. As a result, the character string component that should serve as a keyword can be detected accurately even when the desired character string varies in the number of characters or lines or is displaced among documents of the same type.

[0016] Further according to the present invention, character recognition is performed on the keyword image cut out from the keyword area designated in the input image, and the recognition result is stored together with the character recognition candidates. At image retrieval time, the similarity between the given keyword and the character recognition candidates is judged, and images whose candidates closely resemble the keyword are presented preferentially. Thus, even when recognition of the keyword image is ambiguous, the correct recognition result is retained among the plurality of recognition candidates, and at retrieval time the candidate most similar to the keyword is selected and the corresponding image is presented, so the operator can obtain the desired image. In other words, appropriate image retrieval is possible without fixing the recognition result of the keyword image. In addition, when an image is presented, the operator may judge that the desired image has been obtained and give a confirmation instruction, whereby the recognition result for that image's keyword is uniquely fixed as the keyword used in the retrieval.
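
The candidate-based retrieval described in paragraph [0016] can be illustrated with a minimal sketch. The text does not specify how similarity is measured, so the sketch below assumes a generic string similarity (`difflib.SequenceMatcher` from the Python standard library) as a stand-in. The function name `rank_images` and the dictionary layout of `index` are hypothetical, and every image is assumed to have at least one stored candidate.

```python
from difflib import SequenceMatcher


def rank_images(index, query):
    """Rank image ids by the best similarity between `query` and any stored candidate.

    `index` maps an image id to the list of recognition candidates kept for its
    keyword image (hypothetical layout). The similarity measure is an assumption.
    """
    def best_score(candidates):
        # Keep the score of the candidate that resembles the query most closely.
        return max(SequenceMatcher(None, query, c).ratio() for c in candidates)

    scored = [(best_score(cands), image_id) for image_id, cands in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(image_id, score) for score, image_id in scored]
```

Because every candidate is retained, an image whose top-ranked recognition result was wrong can still be found, provided the correct reading is among its candidates.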

[0017]

【Example】

(First Embodiment) FIG. 1 shows the schematic configuration of this embodiment. In the figure, reference numeral 1 denotes an operation unit, which consists of a keyboard and a pointing device such as a mouse used to input instruction information for operating the apparatus. Input data from the operation unit 1 is sent to a control unit 2.

[0018] The control unit 2 controls the operation of the entire apparatus and issues control instructions to the display unit 3, image input unit 4, output unit 5, input image compression unit 6, recognition target character string extraction unit 7, character recognition unit 8, registration unit 10, and storage unit 12.

[0019] The display unit 3 consists of a CRT display or the like for presenting various information from the apparatus to the operator. The image input unit 4 inputs a document containing characters, figures, photographs, tables, graphs, and the like as image information, for example by scanning it optically.

[0020] Image filing in this system proceeds interactively, through operation of the operation unit 1 and presentation of information on the display unit 3, for an image input as image information from the image input unit 4. The document input from the image input unit 4 (the input image) is supplied to the input image compression unit 6 and the recognition target character string extraction unit 7. The input image compression unit 6 compresses the input image in a predetermined compression format and stores it in the storage unit 12. The recognition target character string extraction unit 7 extracts from the input image the character string that the user designates with the pointing device, namely a character string, written at an unspecified position in the input image, that corresponds to the title or a keyword of the input document. Conventionally, two points of the area had to be designated to acquire a character string area; in the present invention, the desired character string area can be acquired by designating only a single point that can be regarded as the center of the area. The result of extracting the recognition target character string may be displayed on the display unit 3.

[0021] The character recognition unit 8 cuts out the characters contained in the recognition target character string one at a time and performs recognition by matching each against the standard character patterns of the recognition target categories stored in advance in a dictionary unit 9; it codes each character and supplies it to the registration unit 10. The character recognition result may be displayed on the display unit 3 at this point. The user interactively corrects any errors in the displayed recognition result using the keyboard or the like of the operation unit 1, after which the registration unit 10 registers the coded character string on the optical disk 11 as the title or keyword of the input image, in association with the compressed input image stored in the storage unit 12.

[0022] The output unit 5 consists of a laser printer or the like that outputs images stored on the optical disk 11. The storage unit 12 stores the compressed input image, the intermediate and final results of each process, information on the state of the apparatus, and so on. The operation of the embodiment configured as described above will now be described.

[0023] The image filing process in this embodiment consists of the three processes shown in FIG. 2.

[0024] (a) The input image is compressed (13 in the figure).

[0025] (b) The area containing the character string that the operator has interactively designated as the title or keyword to be assigned to the input document is extracted from the input image (14 in the figure), and the character patterns contained in that area are coded by character recognition (15 in the figure).

[0026] (c) The compressed input image and the coded character string are associated with each other and registered on the optical disk (16 in the figure).

[0027] Process (a) is performed by the input image compression unit 6, process (b) by the recognition target character string extraction unit 7 and the character recognition unit 8, and process (c) by the registration unit 10.

[0028] Process (b), the essential part of the present invention, is described below using the example input image shown in FIG. 3.

[0029] Before starting the image filing work, the operator selects the item "Image input and automatic keyword addition" from the menu displayed on the display unit 3, using the keyboard or the pointing device.

[0030] The input image is then subjected to binarization, noise removal, and similar processing, and the resulting image 17 is displayed on the display unit 3 together with the message "Please specify the character string to be recognized."

[0031] In this state, the operator uses the pointing device on the image 17 displayed on the display unit 3 to indicate the character string area corresponding to the title or keyword written in the input document. As shown in FIG. 4, the operator designates a single point that can be regarded as the central portion 19 of the heading character string 18 corresponding to the title or keyword of the input document.

[0032] This yields the coordinate values (x, y) on the image corresponding to the position designated with the pointing device. Using these coordinate values (x, y), the recognition target character string is extracted by the procedure described below.

[0033] Within a limited range of the image 17 containing the coordinate values (x, y), the horizontal and vertical inter-character spaces are obtained and compared to estimate the character string direction, and the recognition target character string area is extracted by grouping characters (in practice, black connected component rectangles) in that direction.

[0034] First, Equations 1, 2, 3, and 4 shown below are applied to the coordinate values (x, y) to obtain the coordinate values (lx, ly) and (rx, ry) as shown in FIG. 5, which limit the application range 20 of the processing for extracting the recognition target character string area.

[0035] Here, W is the width of the input image, H is the height of the input image, and α is a threshold value.

[0036] Within the processing application range 20 obtained from these equations (that is, the range satisfying both lx ≤ x ≤ rx and ly ≤ y ≤ ry), a labeling process that examines the connectivity between black pixels is performed to extract black connected regions, and the circumscribed rectangle of each black pixel region is extracted (FIG. 6). Furthermore, as shown in FIGS. 7(a) and 7(b), black pixel connected components 21 that are reasonably close to one another and black pixel connected components 22 whose regions overlap are grouped together into a single character candidate rectangle 23.
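
The labeling step in paragraph [0036] can be sketched as a flood fill that records the circumscribed rectangle of each black connected region. This is a minimal illustration rather than the procedure of the embodiment: the function name `bounding_boxes`, the list-of-rows image representation, and the use of 8-connectivity are all assumptions, and the merging of nearby or overlapping components into character candidate rectangles is omitted.

```python
from collections import deque


def bounding_boxes(image):
    """Return (x1, y1, x2, y2) for each black connected region, in scan order.

    `image` is a list of rows in which 1 marks a black pixel.
    8-connectivity is assumed; the source text does not state which is used.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited black pixel.
            x1 = x2 = x
            y1 = y2 = y
            queue = deque([(x, y)])
            seen[y][x] = True
            while queue:
                px, py = queue.popleft()
                x1, x2 = min(x1, px), max(x2, px)
                y1, y2 = min(y1, py), max(y2, py)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = px + dx, py + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x1, y1, x2, y2))
    return boxes
```

Each tuple holds the upper-left and lower-right corners of one region, matching the corner convention used for the rectangles in the later paragraphs.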

[0037] Next, the character string direction of the recognition target character string is estimated from the group of character candidate rectangles 23. To do this, as shown in FIG. 8, the black connected component rectangle C closest to the coordinates (x, y) designated with the pointing device is found first. Rectangle C is the black connected component rectangle whose center of gravity is the shortest distance from the coordinates (x, y).
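
Selecting rectangle C as described in paragraph [0037] amounts to a nearest-neighbor search over the rectangles. The sketch below assumes that the center of gravity of a rectangle can be approximated by its geometric center; the function name `nearest_rectangle` and the `(x1, y1, x2, y2)` tuple layout are hypothetical.

```python
def nearest_rectangle(rects, x, y):
    """Return the rectangle whose center lies closest to the designated point (x, y).

    Each rectangle is (x1, y1, x2, y2). Using the geometric center as the
    center of gravity is an assumption of this sketch.
    """
    def squared_distance(rect):
        x1, y1, x2, y2 = rect
        gx, gy = (x1 + x2) / 2, (y1 + y2) / 2
        return (gx - x) ** 2 + (gy - y) ** 2

    return min(rects, key=squared_distance)
```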

[0038] Focusing on this black connected component rectangle C, a set CS consisting of C and the other black connected component rectangles that satisfy both Equation 5 and Equation 6 below is then extracted from all the black connected component rectangles.

[0039]

min(cx2, x2) − max(cx1, x1) > 0 … Equation 5

min(cy2, y2) − max(cy1, y1) > 0 … Equation 6

Here, (cx1, cy1) is the upper-left corner and (cx2, cy2) the lower-right corner of the black connected component rectangle C (24), and (x1, y1) is the upper-left corner and (x2, y2) the lower-right corner of another black connected component rectangle.

[0040] Here, (1) the number of rectangles satisfying Equation 5 (that is, lined up vertically with the black connected component rectangle C) is counted; this count is denoted VC.

[0041] (2) The number of rectangles satisfying Equation 6 (that is, lined up horizontally with the black connected component rectangle C) is counted; this count is denoted HC.

[0042] If VC + th < HC, where th is a threshold value, the direction of the recognition target character string is horizontal; if HC + th < VC, it is vertical. In all other cases, the following processing is also performed.
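
The counting test of paragraphs [0039] to [0042] can be sketched as follows. Equations 5 and 6 are written as overlap tests on the x and y extents of two rectangles, and the counts VC and HC are compared using the threshold th. The function names and the `(x1, y1, x2, y2)` tuple layout are hypothetical, and returning `None` for the undecided case is a convention of this sketch.

```python
def overlaps_in_x(c, r):
    """Equation 5: the x extents overlap, so r is lined up vertically with c."""
    return min(c[2], r[2]) - max(c[0], r[0]) > 0


def overlaps_in_y(c, r):
    """Equation 6: the y extents overlap, so r is lined up horizontally with c."""
    return min(c[3], r[3]) - max(c[1], r[1]) > 0


def direction_from_counts(c, others, th):
    """Return 'horizontal', 'vertical', or None when the counts are too close."""
    vc = sum(1 for r in others if overlaps_in_x(c, r))  # VC
    hc = sum(1 for r in others if overlaps_in_y(c, r))  # HC
    if vc + th < hc:
        return "horizontal"
    if hc + th < vc:
        return "vertical"
    return None  # undecided: fall back to the spacing comparison
```

A `None` result corresponds to the remaining case, in which the inter-character spaces are compared instead.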

[0043] |HC − VC| ≤ th

In this case, the average distance between horizontally adjacent black connected component rectangles in the set CS described above is obtained and taken as the horizontal inter-character space HS, and the average distance between vertically adjacent black connected component rectangles is obtained and taken as the vertical inter-character space VS.

[0044] HS can be estimated, for example, by averaging the distances between adjacent character candidate rectangles in the set CS that satisfy

min(xi2, xj2) − max(xi1, xj1) > 0 … Equation 7

Similarly, VS can be estimated by averaging the distances between adjacent character candidate rectangles in the set CS that satisfy

min(yi2, yj2) − max(yi1, yj1) > 0 … Equation 8

[0045] Here, the upper-left corner of black connected component rectangle i has coordinates (xi1, yi1) and its lower-right corner (xi2, yi2); the upper-left corner of black connected component rectangle j has coordinates (xj1, yj1) and its lower-right corner (xj2, yj2).

[0046] In printed documents the distance between characters is generally smaller than the distance between lines, so the direction with the smaller inter-character space can be taken as the character string direction. That is, the character string direction is horizontal when

HS < VS … Equation 9

holds, and vertical when

HS > VS … Equation 10

holds. When

HS = VS … Equation 11

holds, the character string direction cannot be estimated.
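
The spacing comparison of Equations 9 to 11 can be sketched as below. The helper `mean_gap` is a simplification: it assumes the caller passes rectangles that already form a single row (axis 0) or a single column (axis 1), and it measures the gap between consecutive rectangles after sorting, which is only one possible reading of "distance between adjacent rectangles". Both function names are hypothetical.

```python
def mean_gap(rects, axis):
    """Average gap between consecutive rectangles along an axis (0 = x, 1 = y).

    Rectangles are (x1, y1, x2, y2). Returns None when fewer than two are given.
    Assumes the rectangles form one row (axis 0) or one column (axis 1).
    """
    ordered = sorted(rects, key=lambda r: r[axis])
    # Gap = start of the next rectangle minus end of the previous one.
    gaps = [b[axis] - a[axis + 2] for a, b in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps) if gaps else None


def direction_from_spacing(hs, vs):
    """Equations 9 to 11: the direction with the smaller spacing wins."""
    if hs < vs:
        return "horizontal"
    if hs > vs:
        return "vertical"
    return None  # HS = VS: the direction cannot be estimated
```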

[0047] Once the recognition target character string has been obtained in this way, the processing application range 20 is further expanded in the direction of that character string, as follows.

[0048] First, when the character string direction is horizontal (vertical), the processing application range 20 is expanded as shown in FIG. 9. In this case,

lx′ = 0 (lx′ = lx) … Equation 12

ly′ = ly (ly′ = 0) … Equation 13

rx′ = W (rx′ = rx) … Equation 14

ry′ = ry (ry′ = H) … Equation 15

so that the new processing application range 201 is the region satisfying both lx′ ≤ x ≤ rx′ and ly′ ≤ y ≤ ry′.
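
Equations 12 to 15 stretch the processing range to the full image width for a horizontal character string and to the full image height for a vertical one. A minimal sketch, with a hypothetical function name and argument order:

```python
def expand_range(lx, ly, rx, ry, width, height, direction):
    """Apply Equations 12 to 15 and return the new range (lx', ly', rx', ry').

    `width` and `height` are W and H of the input image.
    """
    if direction == "horizontal":
        return 0, ly, width, ry      # span the whole width, keep the rows
    if direction == "vertical":
        return lx, 0, rx, height     # span the whole height, keep the columns
    raise ValueError("direction must be 'horizontal' or 'vertical'")
```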

[0049] The recognition target character string area is then estimated within this new processing application range 201. First, labeling is performed in the processing application range 201 to extract the black connected component rectangle C described above, and the recognition target character string area is estimated as follows.

[0050] When the character string direction is horizontal (vertical), Equation 6 (Equation 5) is first applied to the rectangles on the left (upper) side of rectangle C, that is, rectangles other than C whose lower-right x (y) coordinate is less than cx1 (cy1), and the group of rectangles CSH1 (CSV1) satisfying this condition is extracted. Rectangle C is included in CSH1 (CSV1).

[0051] Next, the rectangles in CSH1 (CSV1) are reordered in descending order of the x (y) coordinate of their upper-left corners; the result is denoted CSH1′ (CSV1′). CSH1 (CSV1) is thereby sorted from the rectangle nearest to rectangle C to the rectangle farthest from it.

[0052] Next, the following condition is applied to CSH1′ (CSV1′) sequentially from the head, and the rectangle CLH1 (CLV1) farthest from rectangle C that satisfies the condition is extracted. The condition is that both Equation 6 (Equation 5) and Equation 16 (Equation 17) are satisfied.

|xi1 − xj2| ≤ λ … Equation 16

(|yi1 − yj2| ≤ λ … Equation 17)

Here, rectangle j appears after rectangle i in the order, and λ is the height of rectangle C.

【００５３】そして、ＣＬＨ１（ＣＬＶ１）までの矩形
の左上端の座標の内、最小のｘ座標ｘｍｉｎとｙ座標ｙ
ｍｉｎを求める。Then, among the upper-left corner coordinates of the rectangles up to CLH1 (CLV1), the minimum x coordinate xmin and the minimum y coordinate ymin are obtained.

【００５４】次に、矩形Ｃから見て右（下）側にある
（すなわち矩形Ｃ以外の矩形の左上端のｘ（ｙ）座標＞
ｃｘ２（ｃｙ２）である）矩形に対して式６（式５）を
適用し、この条件を満たす矩形群ＣＳＨ２（ＣＳＶ２）
を抽出する。ここでＣＳＨ２（ＣＳＶ２）には矩形Ｃを
含める。Next, Equation 6 (Equation 5) is applied to the rectangles lying to the right of (below) rectangle C, that is, rectangles other than C whose upper-left x (y) coordinate is greater than cx2 (cy2), and the group of rectangles CSH2 (CSV2) satisfying this condition is extracted. Rectangle C is included in CSH2 (CSV2).

【００５５】次に、それぞれの矩形の左上端のｘ（ｙ）
座標について昇順にＣＳＨ２（ＣＳＶ２）において矩形
の出現順序を並べかえる。これをＣＳＨ２′（ＣＳＶ
２′）とする。この結果、ＣＳＨ２（ＣＳＶ２）は矩形
Ｃに最も近い矩形から最も遠い矩形へと並べかえられる
ことになる。Next, the rectangles in CSH2 (CSV2) are reordered in ascending order of the x (y) coordinate of their upper-left corners. The reordered group is denoted CSH2′ (CSV2′). As a result, the rectangles of CSH2 (CSV2) are arranged from the one closest to rectangle C to the one farthest from it.

【００５６】次に、ＣＳＨ２′（ＣＳＶ２′）におい
て、以下の条件を先頭から順次適用していき、条件を満
たす矩形Ｃから最も遠い矩形ＣＬＨ２（ＣＬＶ２）を抽
出する。ここでの条件は、式６（式５）と式１６（式１
７）を両方を満たすことである。さらに、ＣＬＨ２（Ｃ
ＬＶ２）までの矩形の右下端の座標の内、最大のｘ座
標：ｘｍａｘとｙ座標：ｙｍａｘを求める。Next, the following condition is applied to CSH2′ (CSV2′) in order from the head, and the rectangle CLH2 (CLV2) farthest from rectangle C that satisfies the condition is extracted. The condition is that both Equation 6 (Equation 5) and Equation 16 (Equation 17) are satisfied. Then, among the lower-right corner coordinates of the rectangles up to CLH2 (CLV2), the maximum x coordinate xmax and the maximum y coordinate ymax are obtained.

【００５７】そして、（ｘｍｉｎ，ｙｍｉｎ）と（ｘｍ
ａｘ，ｙｍａｘ）の２点の座標で特定される矩形を認識
対象文字列領域とする（すなわちｘｍｉｎ≦ｘ≦ｘｍａ
ｘとｙｍｉｎ≦ｙ≦ｙｍａｘを共に満たす領域）。Then, the rectangle specified by the two points (xmin, ymin) and (xmax, ymax) is taken as the recognition target character string region, that is, the region in which both xmin ≦ x ≦ xmax and ymin ≦ y ≦ ymax are satisfied.
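The one-point region estimation of paragraphs [0050] to [0057] can be sketched as follows for the horizontal case. This is a minimal illustration under stated assumptions, not the patented implementation: `Rect`, `rows_overlap`, and `estimate_horizontal_region` are hypothetical names, `rows_overlap` is only a stand-in for Equation 6 (which is defined earlier in the description), and the gap test is one reading of Equation 16 as a limit on the distance between neighbouring rectangles.

```python
from typing import Callable, Iterable, NamedTuple


class Rect(NamedTuple):
    x1: int  # upper-left x
    y1: int  # upper-left y
    x2: int  # lower-right x
    y2: int  # lower-right y


def rows_overlap(r: Rect, c: Rect) -> bool:
    """Stand-in for Equation 6 (assumption): r and c share some vertical extent."""
    return r.y1 <= c.y2 and c.y1 <= r.y2


def estimate_horizontal_region(
    c: Rect,
    rects: Iterable[Rect],
    same_line: Callable[[Rect, Rect], bool] = rows_overlap,
) -> Rect:
    rects = list(rects)
    lam = c.y2 - c.y1  # lambda: the height of rectangle C

    # CSH1': rectangles left of C, sorted so the one nearest to C comes first.
    left = sorted(
        (r for r in rects if r != c and r.x2 < c.x1 and same_line(r, c)),
        key=lambda r: r.x1,
        reverse=True,
    )
    left_chain = [c]
    for r in left:
        if abs(left_chain[-1].x1 - r.x2) > lam:  # gap too wide: stop at CLH1
            break
        left_chain.append(r)

    # CSH2': rectangles right of C, nearest first.
    right = sorted(
        (r for r in rects if r != c and r.x1 > c.x2 and same_line(r, c)),
        key=lambda r: r.x1,
    )
    right_chain = [c]
    for r in right:
        if abs(r.x1 - right_chain[-1].x2) > lam:  # gap too wide: stop at CLH2
            break
        right_chain.append(r)

    # (xmin, ymin) from upper-left corners on the left side,
    # (xmax, ymax) from lower-right corners on the right side.
    return Rect(
        min(r.x1 for r in left_chain),
        min(r.y1 for r in left_chain),
        max(r.x2 for r in right_chain),
        max(r.y2 for r in right_chain),
    )
```

The vertical case would follow the same pattern with the roles of x and y exchanged and Equation 5 and Equation 17 in place of Equation 6 and Equation 16.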

【００５８】ここで、表示部３で表示されている画像１
７上に文字認識対象領域を枠で囲んだり、該当部分の色
をかえて表示したりさらに文字部や背景部をそれらとは
違う色で塗りつぶして表示するようにしてもよい。Here, on the image 17 displayed on the display unit 3, the character recognition target region may be enclosed in a frame, displayed in a different color, or displayed with its character portion and background portion filled in with colors different from those of the rest of the image.

【００５９】上述した方法による一点指示により文字認
識対象領域が正しく抽出されない場合には、ポインティ
ングデバイスを用いて、表示部３で表示されている画像
１７上における所望の文字列の最左上端と最右下端の２
点をポインティングデバイスして指示する、従来の「２
点指示」を用いれば、認識対象文字列領域を指示するこ
とができる。このとき、指示された領域が縦長（ｗ＜
ｈ）の場合では、認識対象文字列は縦書きであると見な
し、横長（ｗ≧ｈ）の場合では、認識対象文字列は横書
きであると見なす。ただし、ｗ：認識対象文字列領域の
横幅、ｈ：認識対象文字列領域の縦幅とする。When the character recognition target region is not extracted correctly by the one-point designation described above, the recognition target character string region can be designated with the conventional "two-point designation", in which the pointing device is used to indicate two points, the upper-left and lower-right corners of the desired character string, on the image 17 displayed on the display unit 3. In this case, when the designated region is vertically long (w < h), the recognition target character string is regarded as vertically written, and when it is horizontally long (w ≧ h), it is regarded as horizontally written. Here, w is the width of the recognition target character string region and h is its height.
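The direction rule for two-point designation is small enough to state directly in code. The function name `writing_direction` and the string labels are hypothetical; only the comparison w < h versus w ≧ h comes from the text.

```python
def writing_direction(x1: int, y1: int, x2: int, y2: int) -> str:
    """Infer the writing direction from the two designated corner points."""
    w = abs(x2 - x1)  # width of the designated region
    h = abs(y2 - y1)  # height of the designated region
    return "vertical" if w < h else "horizontal"
```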

【００６０】次に、ポインティングデバイスとして、例
えば図１０に示すマウスを用いたときその位置座標の指
示について説明する。Next, the designation of position coordinates is described for the case where, for example, the mouse shown in FIG. 10 is used as the pointing device.

【００６１】図において、２４はマウス本体で、この本
体２４に設けられる左ボタン２５は一点指示により文字
認識対象領域を指定するときに用いられ、中ボタン２６
は指定された領域に対して文字認識を行い画像を自動的
にコード化するときに用いられ、右ボタン２７は二点指
示により文字認識対象領域を指定するときに用いられる
ようになっている。そして、このようなマウスの各ボタ
ンを押すことによって生じる制御信号に関する論理は以
下の通りである。In the figure, reference numeral 24 denotes the mouse body. The left button 25 provided on the body 24 is used to designate the character recognition target region by one-point designation, the middle button 26 is used to perform character recognition on the designated region and automatically encode the image, and the right button 27 is used to designate the character recognition target region by two-point designation. The logic of the control signals generated by pressing each mouse button is as follows.

【００６２】（１）「左ボタン２５」を押して次ぎに
「中ボタン２６」を押す。１点指示により文字認識対象
領域が指定され、その対象領域内部のパターンが認識さ
れコード化される。(1) Press the "left button 25" and then press the "middle button 26". The character recognition target area is designated by the one-point designation, and the pattern inside the target area is recognized and coded.

【００６３】（２）「左ボタン２５」を押し続ける。最
後の１点指示により確定した領域を文字認識対象領域と
する。(2) Press and hold the "left button 25". The area determined by the last one-point instruction is set as the character recognition target area.

【００６４】（３）「左ボタン２５」を押して「右ボタ
ン２７」を押す。１点指示モードが解除され２点指示モ
ードとなる。(3) Press the "left button 25" and then the "right button 27". The 1-point instruction mode is released and the 2-point instruction mode is set.

【００６５】（４）「右ボタン２７」を押して「左ボタ
ン２５」を押す。２点指示モードが解除され１点指示モ
ードとなる。(4) Press the "right button 27" and then the "left button 25". The 2-point instruction mode is released and the 1-point instruction mode is entered.

【００６６】（５）「右ボタン２７」を押し続ける。最
後の２点指示により確定した領域を文字認識対象領域と
する。(5) Continue to press the "right button 27". The area determined by the last two-point instruction is the character recognition target area.

【００６７】（６）「中ボタン２６」を押す。予め１回
以上「左ボタン２５」を押してある場合と２回以上「右
ボタン２７」を押してある場合以外は無効とする。(6) Press the "middle button 26". This is invalid unless the "left button 25" has been pressed at least once, or the "right button 27" has been pressed at least twice, beforehand.
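The button logic (1) to (6) can be read as a small state machine. The sketch below is one such reading under stated assumptions: the class name `MouseController`, the button labels, and the return value are hypothetical, and the state is kept unchanged after the middle button is pressed because the text does not say whether it is reset.

```python
class MouseController:
    """Tracks the designation mode and the points collected so far."""

    def __init__(self):
        self.mode = None   # "one_point" or "two_point"
        self.points = []   # designated points in the current mode

    def press(self, button, point=None):
        if button == "left":
            # (2), (4): enter or stay in one-point mode; the last point wins.
            self.mode = "one_point"
            self.points = [point]
            return None
        if button == "right":
            # (3), (5): enter or stay in two-point mode; keep the last two points.
            if self.mode != "two_point":
                self.mode = "two_point"
                self.points = []
            self.points = (self.points + [point])[-2:]
            return None
        if button == "middle":
            # (1), (6): valid only after one left press or two right presses.
            needed = {"one_point": 1, "two_point": 2}.get(self.mode)
            if needed is None or len(self.points) < needed:
                return None  # invalid press, nothing to recognize
            return self.mode, tuple(self.points)
        raise ValueError(f"unknown button: {button!r}")
```

A caller would pass the returned mode and points to region extraction and then to character recognition.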

【００６８】この他に、上記（２）の「左ボタン２５」
を押し続けたとき、例えば文字列領域の候補が「左ボタ
ン２５」を押した回数だけ抽出され、それを表示部３に
表示するようにしてもよい。例えば、図１１（ａ）に示
すような文字列において図中Ｐ１の位置で左ボタン２７
を一回押したとき図１１（ｂ）が文字列領域候補として
表示され、２回押したとき図１１（ｃ）が文字列領域候
補として表示されるように徐々に文字列領域を拡大して
いくようにしてもよい。このとき、逐次抽出される文字
列領域候補が所望の文字列領域と合致する時点で中ボタ
ン２６を押すことにより文字認識処理が実施されるよう
になる。In addition, when the "left button 25" in (2) above is pressed repeatedly, character string region candidates may, for example, be extracted according to the number of times the "left button 25" has been pressed and displayed on the display unit 3. For example, for a character string such as that shown in FIG. 11(a), the character string region may be enlarged gradually so that FIG. 11(b) is displayed as the candidate when the left button 25 is pressed once at position P1 in the figure, and FIG. 11(c) is displayed when it is pressed twice. Character recognition is then performed by pressing the middle button 26 at the point where the successively extracted candidate matches the desired character string region.

【００６９】このようにして文字認識処理対象文字列が
抽出され、さらにポインティングデバイスによる文字認
識機能作動の信号が発生させられた状態で、文字認識部
８が起動される。この場合、認識対象文字列領域では、
文字候補矩形が生成されているので、各矩形ごとにその
文字パターンを文字認識部８に供給する。このとき、左
右における文字候補矩形のパターン統合は基本的に行わ
ない。この結果、例えば「動」のような文字は「重」と
「力」のような２つのパターンとして抽出される。この
ようにして抽出される２つのパターンは１文字である可
能性と、２文字である可能性とを持っているので、この
２つの可能性についてそれぞれ考慮することが必要とな
る。従ってこのような場合には、例えば複数の切り出し
候補のすべてに対して文字認識を行い、最後に文脈的な
整合性を評価して何れかに決定するようにすれば良い。
また逆に、複数の文字が接触しているような文字パター
ンも存在する。このような文字パターン、例えば図１２
（ａ）に示すような文字パターンに対しては、同図
（ｂ）に示すように文字パターンを構成する黒画素の文
字列と垂直な方向の周辺分布（射影成分）ｆの極小点を
取る位置を、接触した複数の文字パターンの境界位置で
あると判定する。しかる後、この該境界位置の付近のパ
ターン形状を調査し、境界位置の間近の凹み部位を当該
文字列パターンの切断箇所であると判定し、その切断箇
所にて前記文字列パターンを切り分けることによって個
々の文字列パターンを切り出す。In this manner, the character string to be recognized is extracted, and the character recognition unit 8 is started when the pointing device generates the signal that activates the character recognition function. Since character candidate rectangles have already been generated in the recognition target character string region, the character pattern of each rectangle is supplied to the character recognition unit 8. At this point, the patterns of horizontally adjacent character candidate rectangles are basically not merged. As a result, a character such as 「動」 is extracted as two patterns, 「重」 and 「力」. Two patterns extracted in this way may be one character or two characters, so both possibilities must be considered. In such a case, for example, character recognition may be performed on all of the segmentation candidates, and the final choice made by evaluating contextual consistency.
Conversely, there are also character patterns in which several characters touch one another. For such a pattern, for example the one shown in FIG. 12(a), the positions at which the peripheral distribution (projection) f of black pixels, taken in the direction perpendicular to the character string, has local minima are judged to be the boundary positions between the touching characters, as shown in FIG. 12(b). The pattern shape near each boundary position is then examined, a concave part close to the boundary position is judged to be a cutting point of the character string pattern, and the pattern is divided at that cutting point to cut out the individual character patterns.
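The projection step for touching characters can be illustrated as below for a horizontal character string. This is a simplified sketch: the function names are hypothetical, the image is assumed to be a list of rows with 1 for black and 0 for white, only the single lowest interior point of the projection f is used, and the subsequent inspection of concave parts near the boundary is omitted.

```python
def column_projection(img):
    """Peripheral distribution f: number of black pixels in each column."""
    return [sum(col) for col in zip(*img)]


def split_touching(img):
    """Split a touching pattern at the interior minimum of the projection."""
    f = column_projection(img)
    if len(f) < 3:
        raise ValueError("pattern is too narrow to split")
    # Exclude the outermost columns so that both halves are non-empty.
    cut = min(range(1, len(f) - 1), key=lambda x: f[x])
    left = [row[:cut] for row in img]
    right = [row[cut:] for row in img]
    return cut, left, right
```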

【００７０】しかして、個々の文字パターンが切り出さ
れると、次には各文字パターンを、そのパターンが帰属
する可能性のある文字のカテゴリと対応付ける。この対
応付けは、例えば複合類似度法のように、統計的に作成
された標準パターンと未知パターンとの重ね合わせによ
る一致度を算出する等して行われる。この結果、未知パ
ターンと対応付けられる第１位から第Ｍ位までの文字カ
テゴリを、その一致度の高いものから順に当該文字カテ
ゴリのコードと上記一致度と共にリスト化して出力す
る。上記リストは表示部３に表示され、オペレータが操
作部１を操作しながら、誤認識結果を修正するようにな
る。When the individual character patterns have been cut out, each pattern is associated with the character categories to which it may belong. This association is made, for example, by computing the degree of matching obtained by superimposing the unknown pattern on statistically created standard patterns, as in the multiple similarity method. As a result, the first through M-th character categories associated with the unknown pattern are output as a list, in descending order of matching degree, together with the code of each category and its matching degree. The list is displayed on the display unit 3, and the operator corrects any misrecognition results by operating the operation unit 1.
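The ranked candidate list can be sketched as follows. Note the substitution: the code uses a plain normalized inner product (simple similarity) between feature vectors rather than the similarity method named in the text, and `simple_similarity`, `top_candidates`, and the dictionary of standard patterns are hypothetical.

```python
import math


def simple_similarity(a, b):
    """Normalized inner product of two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_candidates(unknown, standard_patterns, m):
    """Return the best m (category code, matching degree) pairs, best first."""
    scored = [
        (code, simple_similarity(unknown, pattern))
        for code, pattern in standard_patterns.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:m]
```

The resulting list corresponds to what would be shown to the operator for correction.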

【００７１】そして、このような処理により、入力文書
のタイトルあるいはキーワードが確定したならば、その
文字列コードを、記憶部１２に既に格納してある前記圧
縮された入力画像と関連づけて光ディスク１１に格納し
てファイリング処理を終了することになる。When the title or keyword of the input document has been determined by this processing, its character string code is stored on the optical disk 11 in association with the compressed input image already held in the storage unit 12, and the filing process ends.

【００７２】ここで、光ディスク１１に格納されるデー
タは入力画像そのものでなくても良く、その一部分や認
識結果等の加工データであってもよい。Here, the data stored in the optical disk 11 need not be the input image itself, but may be a part of it or processed data such as a recognition result.

【００７３】従って、このようにすれば、入力画像上の
任意の位置に記載されている所望の文字列上の一点を指
示するだけで、適当な領域の設定とともに、領域内の入
力画像の文字列成分の並び方向が判定され、この判定に
基づいて文字列成分の並び方向に沿って領域が拡大され
て、この拡大領域の文字列成分について文字認識および
コード化が実行され、このコード化された文字列をキー
ワードとして入力画像のファイリングが行われるように
なるので、入力画像の検索に用いるタイトル情報などの
キーワードを効率よく入力でき、オペレータに対する作
業負担を軽減することができるようになる。In this way, simply by indicating one point on the desired character string, wherever it appears in the input image, an appropriate region is set and the arrangement direction of the character string components in that region is determined. The region is then enlarged along that direction, the character string components in the enlarged region are recognized and encoded, and the input image is filed with the encoded character string as a keyword. Keywords such as title information used for retrieving the input image can therefore be entered efficiently, and the operator's workload is reduced.

【００７４】（第２実施例）上述した第１実施例では入
力された文書ごとに利用者が「入力文書のキーワードあ
るいはタイトルとなりうる文字列領域」をポインティン
グデバイスで指示するようにしているが、この作業は大
量の入力文書をファイリングする場合には煩雑に感じら
れることがある。(Second Embodiment) In the first embodiment described above, the user designates the "character string area which can be a keyword or title of the input document" for each input document with the pointing device. This work can be cumbersome when filing a large number of input documents.

【００７５】ところで、入力文書がほぼ同じ書式である
場合には、「入力文書のキーワードあるいはタイトルと
なりうる文字列領域」が、文書中のほぼ決まった位置に
あることが予想され、一枚の文書のその領域がわかれば
それを他の文書でも流用できると思われる。When the input documents have almost the same format, the "character string region that can serve as a keyword or title of the input document" can be expected to lie at a nearly fixed position in each document, so once that region is known for one document it should be reusable for the others.

【００７６】そこで、第２実施例では、「キーワードあ
るいはタイトルに相当する文字列」が多数の文書におい
ても大体同じ位置にある場合、最初の画像に対してポイ
ンティングデバイスで認識対象文字列を指示するだけで
他の文書においても認識対象文字列を正しく抽出できる
ようにしている。Therefore, in the second embodiment, when the "character string corresponding to a keyword or a title" is located at almost the same position in a large number of documents, the character string to be recognized is designated by the pointing device for the first image. Only by doing so, the character string to be recognized can be correctly extracted in other documents.

【００７７】この場合、１つの文書を代表として選んだ
とき、その入力画像上で、ポインティングデバイスを用
いて所望の文字列領域を指示したときに得られる矩形領
域の左上端の座標値（ｘｍｉｎ，ｙｍｉｎ）と右下端の
座標値（ｘｍａｘ，ｙｍａｘ）をそれぞれ下記の式１
８，式１９，式２０，式２１により変化させる。In this case, one document is chosen as a representative, and the coordinates of the upper-left corner (xmin, ymin) and the lower-right corner (xmax, ymax) of the rectangular region obtained when the desired character string region is designated on its input image with the pointing device are changed according to Equations 18, 19, 20, and 21 below.

【００７８】この結果、認識対象領域は図１３に示すように、最初の
指示により得られた文字列領域２８に対して左上端の座
標２９、右下端の座標３０で表される拡大した文字列領
域３１として得られ、入力文書に書式の変動や多少の位
置ズレが生じても所望の文字列を正確に抽出することを
可能にしている。As a result, as shown in FIG. 13, the recognition target region is obtained as an enlarged character string region 31, represented by the upper-left coordinates 29 and the lower-right coordinates 30, relative to the character string region 28 obtained by the first designation. This makes it possible to extract the desired character string accurately even when the input document has format variations or slight positional shifts.

【００７９】そして、他の入力文書に関しては、拡大し
た文字列領域３１に含まれる画像を抽出することにより
所望の文字列を抽出するようになる。このとき、文字列
領域３１の中で、第１実施例と同様に、黒連結成分矩形
を抽出し、その分布を調べることにより、さらに正確な
文字列領域を決定して、その内部の画像を所望の文字列
パターンとして抽出してもよい。For the other input documents, the desired character string is extracted by extracting the image contained in the enlarged character string region 31. At this time, as in the first embodiment, black connected component rectangles may be extracted within the character string region 31 and their distribution examined to determine a more accurate character string region, and the image inside it extracted as the desired character string pattern.

【００８０】また、上述した他に、例えば、最初の指示
により得られた文字列領域２８を入力画像ごとに適応的
に変化させて、所望の文字列領域を決定するようにして
もよい。例えば、入力画像が傾いているため、所望の文
字列パターンが矩形外にはみ出ている場合には、領域２
８の各辺上の座標値に対応する画像がすべて白画素（す
なわち背景部）となるように領域２８を徐々に拡大して
所望の文字列成分を囲む矩形を生成するようにしてもよ
い。また、領域２８の各辺上の座標値に対応する画像が
すでに全て白画素である場合には、徐々に領域を縮小し
て、所望の文字列成分を外接する最小の矩形を生成する
ようにしてもよい。In addition to the above, the character string region 28 obtained by the first designation may, for example, be changed adaptively for each input image to determine the desired character string region. For example, when the desired character string pattern protrudes from the rectangle because the input image is skewed, the region 28 may be enlarged gradually until the pixels at all coordinates on each side of the region 28 are white (that is, background), thereby generating a rectangle that encloses the desired character string components. Conversely, when the pixels on each side of the region 28 are already all white, the region may be reduced gradually to generate the smallest rectangle circumscribing the desired character string components.
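The adaptive adjustment of region 28 can be sketched as below. The names are hypothetical, the image is assumed to be a list of rows with 1 for black and 0 for white, coordinates are inclusive, and the one-pixel step and the side-by-side growth order are choices made for this illustration rather than details given in the text.

```python
def row_white(img, y, x1, x2):
    return all(img[y][x] == 0 for x in range(x1, x2 + 1))


def col_white(img, x, y1, y2):
    return all(img[y][x] == 0 for y in range(y1, y2 + 1))


def adapt_region(img, x1, y1, x2, y2):
    """Grow the region until its border is white, or shrink it to fit."""
    h, w = len(img), len(img[0])
    border_white = (
        row_white(img, y1, x1, x2) and row_white(img, y2, x1, x2)
        and col_white(img, x1, y1, y2) and col_white(img, x2, y1, y2)
    )
    if border_white:
        # Shrink each side while it still lies entirely on background.
        while y1 < y2 and row_white(img, y1, x1, x2):
            y1 += 1
        while y1 < y2 and row_white(img, y2, x1, x2):
            y2 -= 1
        while x1 < x2 and col_white(img, x1, y1, y2):
            x1 += 1
        while x1 < x2 and col_white(img, x2, y1, y2):
            x2 -= 1
        return x1, y1, x2, y2
    # Grow every side that still crosses black pixels, one pixel at a time.
    changed = True
    while changed:
        changed = False
        if y1 > 0 and not row_white(img, y1, x1, x2):
            y1, changed = y1 - 1, True
        if y2 < h - 1 and not row_white(img, y2, x1, x2):
            y2, changed = y2 + 1, True
        if x1 > 0 and not col_white(img, x1, y1, y2):
            x1, changed = x1 - 1, True
        if x2 < w - 1 and not col_white(img, x2, y1, y2):
            x2, changed = x2 + 1, True
    return x1, y1, x2, y2
```

In the growth case the result keeps a one-pixel white margin around the pattern, while in the shrink case the result touches the pattern on every side.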

【００８１】従って、このようにすれば、入力画像にお
けるタイトル領域が定型化しているような場合は、予め
その領域を設定するようにすれば大量の文書で自動的に
タイトル情報をコード化して入力することができるよう
になり、入力コストを大幅に低減させることができる。Therefore, in this way, when the title area in the input image is standardized, if the area is set in advance, the title information is automatically coded and input in a large number of documents. Therefore, the input cost can be significantly reduced.

【００８２】（第３実施例）上述の第１実施例では、利
用者が画像ファイリング作業中に、表示部３に表示され
た入力文書画像上で直接操作部１を用いて「入力文書に
付与すべきタイトルあるいはキーワードとなるべき文字
列（すなわち認識対象文字列）」を指示または指定する
ものであったが、ここでは、利用者が画像ファイリング
作業前に予め所望の文字列（認識対象文字列）が存在し
うる領域（例えば文書画像中の絶対位置座標）を例えば
フォーマットコントロール等で定義するようにしてい
る。(Third Embodiment) In the first embodiment described above, during the image filing operation the user uses the operation unit 1 to indicate or designate, directly on the input document image displayed on the display unit 3, the "character string to be used as the title or keyword assigned to the input document (that is, the recognition target character string)". In this embodiment, before the image filing operation, the user defines in advance, for example by format control, the region in which the desired character string (recognition target character string) may exist (for example, as absolute position coordinates in the document image).

【００８３】この場合、「所望の文字列が存在しうる領
域」は、予め利用者により例えば２点の座標情報により
矩形領域Ｓとして定義される。そして、画像ファイリン
グ作業時に文書が入力されたとき、この矩形領域Ｓの情
報を用いて以下のように動作する。In this case, the "region in which a desired character string can exist" is defined in advance by the user as a rectangular region S based on, for example, coordinate information of two points. When a document is input during the image filing operation, the information in the rectangular area S is used to operate as follows.

【００８４】まず、入力文書についてレイアウト解析処
理または文字列抽出処理を行い、文字列単位あるいは単
語単位で文字パターンを抽出し、それを矩形で表現する
（抽出された文字列矩形群をＳＳとする。）次いで、「矩形領域Ｓ」と文字列矩形群ＳＳとの間で重
ね合わせを行い、文字列矩形群ＳＳのうち「矩形領域
Ｓ」と重なりを持つ文字列矩形Ｓ′を抽出する。このと
き矩形Ｓ′は、他の矩形と比べて明示的に表示されても
良い。もちろん、文字列矩形群ＳＳすべてを表示しても
良いし、それらを原画像に重ねて表示することは有効で
ある。さらに認識結果をそれらの近傍に表示することも
できる。First, layout analysis or character string extraction is performed on the input document to extract character patterns in units of character strings or words, each represented by a rectangle (the group of extracted character string rectangles is denoted SS). Next, the "rectangular region S" is overlaid on the character string rectangle group SS, and the character string rectangles S′ in SS that overlap the "rectangular region S" are extracted. The rectangles S′ may be displayed so as to stand out from the other rectangles. Of course, the entire character string rectangle group SS may be displayed, and it is effective to display the rectangles superimposed on the original image. The recognition results can also be displayed near them.
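Selecting the rectangles S′ that overlap the predefined region S is a plain rectangle intersection test. In this sketch rectangles are assumed to be (x1, y1, x2, y2) tuples with inclusive corners, and the function names are hypothetical.

```python
def overlaps(a, b):
    """True if rectangles a and b, given as (x1, y1, x2, y2), share any point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def select_overlapping(region_s, rect_group_ss):
    """Return the members of SS that overlap the predefined region S."""
    return [r for r in rect_group_ss if overlaps(r, region_s)]
```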

【００８５】次に、オペレータが「重なりを持つ文字列
矩形Ｓ′」のうち、所望する文字列に相当する矩形を操
作部１を用いて順次指示する（指示された矩形をＰＲと
する）。そして、この「指示された矩形ＰＲ」におい
て、その内部のパターンを文字認識部８において認識し
てコード化する。認識結果は指示（ポインティング）さ
れた順に格納されることになるが、このとき、認識結果
を逐次、表示部３に表示し、オペレータが操作部１を操
作しながら、誤認識結果を修正できるようにしても良
い。Next, the operator uses the operation unit 1 to indicate, one after another, the rectangles among the "overlapping character string rectangles S′" that correspond to the desired character strings (an indicated rectangle is denoted PR). The pattern inside each "indicated rectangle PR" is recognized and encoded by the character recognition unit 8. The recognition results are stored in the order in which the rectangles were indicated. At this time, the results may be displayed successively on the display unit 3 so that the operator can correct misrecognitions by operating the operation unit 1.

【００８６】次に、「ポインティングデバイスを用いて
指示することにより得られる座標と矩形ＰＲとの対応関
係」について説明する。この対応関係は、ポインティン
グデバイスにより指示された座標Ｐは、それを含む矩形
のうち最小の矩形に対応するというものである。しか
し、その矩形内に、他の矩形Ｒ′が含まれている場合
（ただし、Ｒ′はＰを含まないこととする）には、矩形
Ｒ′は考慮されないこととする。例を挙げて説明する
と、たとえば図１４に示すように、矩形３２の内部の任
意の一点（例えば３３）を指示すると、矩形３２の内部
のパターンのみを認識するようにする。また、矩形３４
の内部の任意の一点（例えば３５）を指示すると矩形３
４の内部パターンのみを認識するようにする。そして、
矩形３２の内部および矩形３４の内部に含まれない矩形
３６の内部の任意の一点（例えば３７）を指示すると、
矩形３６内部のパターンを全て認識するようにする。こ
のとき、利用者が矩形を指示するとき識別しやすいよう
に、例えば矩形３２と矩形３４（すなわち他の矩形に含
まれる矩形）は、表示部３において実際より多少小さく
表示するようにしてもよい。Next, the "correspondence between the coordinates obtained by pointing with the pointing device and the rectangle PR" is described. A coordinate P indicated by the pointing device corresponds to the smallest of the rectangles that contain it. If that rectangle contains another rectangle R′ (where R′ does not contain P), the rectangle R′ is not taken into account. For example, as shown in FIG. 14, when an arbitrary point inside the rectangle 32 (for example, 33) is indicated, only the pattern inside the rectangle 32 is recognized. Likewise, when an arbitrary point inside the rectangle 34 (for example, 35) is indicated, only the pattern inside the rectangle 34 is recognized. When an arbitrary point (for example, 37) inside the rectangle 36 but outside both the rectangle 32 and the rectangle 34 is indicated, all patterns inside the rectangle 36 are recognized. To make rectangles easier to distinguish when the user indicates them, rectangles contained in other rectangles, such as the rectangles 32 and 34, may be displayed on the display unit 3 slightly smaller than their actual size.
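The rule that a pointed coordinate P maps to the smallest rectangle containing it can be sketched as follows. Rectangles are assumed to be (x1, y1, x2, y2) tuples, "smallest" is taken to mean smallest area, and the function names are hypothetical.

```python
def contains(rect, p):
    x1, y1, x2, y2 = rect
    return x1 <= p[0] <= x2 and y1 <= p[1] <= y2


def area(rect):
    x1, y1, x2, y2 = rect
    return (x2 - x1) * (y2 - y1)


def pick_rectangle(p, rects):
    """Return the smallest rectangle containing p, or None if there is none."""
    hits = [r for r in rects if contains(r, p)]
    return min(hits, key=area) if hits else None
```

Nested rectangles that do not contain P never enter `hits`, which matches the statement that such a rectangle R′ is not taken into account.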

【００８７】これらの処理においては、あらかじめ矩形
領域Ｓを決めておかなくてもよい。すなわち、入力画像
全体について、あるいは一部についてレイアウト解析を
行ない、その結果を画面に原画像に重ねあわせて表示
し、その中から認識対象文字列をオペレータが指示でき
るようにしてもよい。In these processes, it is not necessary to determine the rectangular area S in advance. That is, the layout analysis may be performed on the entire input image or a part of the input image, and the result may be displayed on the screen so as to be superimposed on the original image so that the operator can instruct the character string to be recognized.

【００８８】そして、このような処理により、入力文書
のタイトルあるいはキーワードが確定したならば、その
文字列コードを、記憶部１２に既に格納してある前記圧
縮された入力画像と関連づけて光ディスク１１に格納し
てファイリング処理を終了する。When the title or keyword of the input document has been determined by this processing, its character string code is stored on the optical disk 11 in association with the compressed input image already held in the storage unit 12, and the filing process ends.

【００８９】（第４実施例）図１５は同実施例の概略的
な構成を示すもので、図１と同一部分には同符号を付し
ている。(Fourth Embodiment) FIG. 15 shows a schematic structure of the same embodiment. The same parts as those in FIG. 1 are designated by the same reference numerals.

【００９０】この場合、画像入力部４から入力された文
書（入力画像）は入力画像圧縮部６と文書構造解析部４
２に与えられる。In this case, the document (input image) input from the image input unit 4 is supplied to the input image compression unit 6 and the document structure analysis unit 42.

【００９１】文書構造解析部４２は、入力画像に対して
２値化処理を施し、さらに入力文書画像の文書構造を解
析して、パラグラフや文字列、その他図形成分等を矩形
情報として抽出するようにしている。そして、文書構造
データ作成部４３で、文書構造解析部４２で抽出された
種々の矩形情報を互いに関連づけ、文書構造を表現する
データとして体系づけるようにしている。このとき、文
書構造データの作成結果を表示部３において表示しても
よい。The document structure analysis unit 42 binarizes the input image, analyzes the document structure of the input document image, and extracts paragraphs, character strings, graphic components, and the like as rectangle information. The document structure data creation unit 43 then relates the various pieces of rectangle information extracted by the document structure analysis unit 42 to one another and organizes them as data representing the document structure. The resulting document structure data may be displayed on the display unit 3.

【００９２】文書構造解析部４２および文書構造データ
作成部４３での処理は、利用者が予め操作部１の操作と
表示部３による情報提示により対話的に作成した知識ベ
ース４５に格納されている入力文書に固有の知識を参照
して実行されるようになっている。The processes in the document structure analysis unit 42 and the document structure data creation unit 43 are stored in the knowledge base 45 which is interactively created by the user in advance by the operation of the operation unit 1 and the information presentation by the display unit 3. It is designed to be executed by referring to the knowledge specific to the input document.

【００９３】そして、認識対象文字列抽出部４４におい
て、知識ベース４５に格納されている入力文書に固有の
知識に基づいて、文書構造データから入力文書のタイト
ルもしくはキーワードに相当する文字列（すなわち文字
認識対象文字列）を抽出するようにしている。このと
き、文字認識対象文字列として抽出された画像を表示部
３において表示するようにしてもよい。Then, in the recognition target character string extraction unit 44, based on the knowledge peculiar to the input document stored in the knowledge base 45, the character string (that is, the character string) corresponding to the title or keyword of the input document is extracted from the document structure data. (Recognition target character string) is extracted. At this time, the image extracted as the character recognition target character string may be displayed on the display unit 3.

【００９４】なお、４１は知識獲得支援部で、知識ベー
ス４５に格納するべき「入力文書に関する知識」と「文
字認識対象文字列情報に関する情報」を記述および改訂
する作業を支援するものである。Reference numeral 41 denotes a knowledge acquisition support unit for supporting the work of describing and revising the "knowledge about the input document" and the "information about the character string information of the character recognition target" to be stored in the knowledge base 45.

【００９５】その他の構成は、上述した図１と同様なの
で、ここでの説明は省略する。Since the other structure is the same as that of FIG. 1 described above, the description thereof is omitted here.

【００９６】次に、第４実施例の処理動作について説明
する。Next, the processing operation of the fourth embodiment will be described.

【００９７】本装置では、オペレータによって入力され
た画像から、画像に記載されてある「キーワードもしく
はタイトルとなる文字列」を自動的に抽出し、その文字
列領域内のパターンを文字認識し、自動的にコード化し
て入力画像に関連づけてファイリングするようにしてい
る。In the present apparatus, the "character string as a keyword or title" described in the image is automatically extracted from the image input by the operator, the pattern in the character string area is recognized, and the character string is automatically recognized. It is encoded so that it can be associated with the input image for filing.

【００９８】従って、オペレータが直接「キーワードも
しくはタイトル等」をキー入力する必要がないためファ
イリング作業の省力化を図ることができる。まず入力文
書画像を解析して、文字行成分、単語成分、ブロック
（パラグラフに相当）成分等を抽出し、それらを矩形で
表現する。次いでレイアウト理解処理により矩形間の論
理的な関係を調べることにより、各矩形に「表題、要
約、本文、ヘッダー、フッター、キャプション等」の属
性を割り当てる。この結果、利用者は所望の文字列すな
わち「キーワードもしくはタイトルとなる文字列」をそ
の属性や論理関係（相対的な位置）で指定することが可
能となる。Therefore, it is not necessary for the operator to directly input the "keyword, title or the like" by a key, so that the filing work can be saved. First, the input document image is analyzed to extract character line components, word components, block (corresponding to paragraphs) components, etc., and represent them with rectangles. Then, the layout understanding process examines the logical relationship between the rectangles to assign attributes of “title, abstract, body, header, footer, caption, etc.” to each rectangle. As a result, the user can specify a desired character string, that is, a “character string that is a keyword or a title” by its attribute or logical relationship (relative position).

【００９９】これにより、文字列をその画像中の絶対的
な位置座標で指定する従来方法と比べて柔軟性が高く、
入力画像に多少の位置ズレや書式の変動があっても対応
できるので、同形式の文書を大量にファイリングする場
合、逐次文字列領域を指定しなくても良いという利点が
ある。また、操作性の容易なインターフェイスを用いて
所望の文字列領域を指定することもできるため、トータ
ルな作業時間を従来より短縮することが可能となる。This approach is more flexible than the conventional method of designating a character string by its absolute position coordinates in the image, and it tolerates slight positional shifts and format variations in the input image, so when a large number of documents of the same format are filed there is no need to designate the character string region for each one. In addition, the desired character string region can be designated through an easy-to-use interface, so the total working time can be shorter than before.

【０１００】そして、同実施例における画像ファイリン
グ処理は、図１６に示す３つの処理からなっている。The image filing process in this embodiment is composed of the three processes shown in FIG.

【０１０１】（ａ）入力画像を圧縮する（図示４６）。(A) The input image is compressed (46 in the figure).

【０１０２】（ｂ）入力文書に付与すべきタイトルある
いはキーワードとなるべき文字列を含む領域を入力画像
から自動的に検出し（図示４７）、その領域に含まれる
文字パターンを文字認識処理によりコード化する（図示
４８）。(B) An area containing a character string to be given as a title or a keyword to be added to the input document is automatically detected from the input image (47), and the character pattern included in the area is coded by the character recognition processing. (Fig. 48).

【０１０３】（ｃ）圧縮入力画像とコード化されたタイ
トルあるいはキーワードを対応付けて光ディスクに登録
する（図示４９）。(C) The compressed input image and the coded title or keyword are associated with each other and registered in the optical disc (49 in the figure).

【０１０４】ここでの（ａ）の処理は、画像圧縮部６に
おいて実施され、（ｂ）の処理は、文章構造解析部４
２、文書構造データ作成部４３、認識対象文字列抽出部
４４、文字認識部８において実施され、（ｃ）の処理は
登録部１０において実施される。Process (a) is performed by the image compression unit 6. Process (b) is performed by the document structure analysis unit 42, the document structure data creation unit 43, the recognition target character string extraction unit 44, and the character recognition unit 8. Process (c) is performed by the registration unit 10.

【０１０５】以下、本発明の要部である（ｂ）の処理
を、図１７に示す処理手続きに従って説明する。The process (b) which is the main part of the present invention will be described below in accordance with the process procedure shown in FIG.

【０１０６】この場合、図１７の処理手続きにおいて、
ステップａからステップｈまでは、文書構造解析部４２
で実施され、ステップｉは文書構造データ作成部４３
で、ステップｊとステップｋは認識対象文字列抽出部４
４で、ステップｌとステップｍは文字認識部８それぞれ
実施される。In the processing procedure of FIG. 17, steps a through h are performed by the document structure analysis unit 42, step i by the document structure data creation unit 43, steps j and k by the recognition target character string extraction unit 44, and steps l and m by the character recognition unit 8.

【０１０７】いま、画像入力部４から、例えば光学スキ
ャナのような画像入力手段を用いて画像が入力される
と、この入力画像を２値化した後（ステップａ）、その
２値化画像中の微小な孤立点をノイズとして除去する
（ステップｂ）。When an image is input from the image input unit 4 using an image input means such as an optical scanner, the input image is binarized (step a), and minute isolated points in the binarized image are then removed as noise (step b).

【０１０８】次に、２値化画像から例えば各黒画素間の
連結関係を調べることにより、黒連結領域を抽出するラ
ベリング処理を行い、各黒連結領域の外接矩形を抽出す
る（ステップｃ）。このとき各黒連結成分矩形において
左上端の座標値と高さおよび幅（あるいは右下端の座標
値）などをデータとして抽出する。図１８（ａ）に入力
画像データに対する黒連結成分矩形の抽出例を示してい
る。Next, for example, by checking the connection relationship between each black pixel from the binarized image, the labeling process for extracting the black connection region is performed, and the circumscribed rectangle of each black connection region is extracted (step c). At this time, the coordinate value at the upper left end and the height and width (or the coordinate value at the lower right end) of each black connected component rectangle are extracted as data. FIG. 18A shows an example of extracting a black connected component rectangle from the input image data.
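The labeling of step c can be sketched with a breadth-first search over black pixels. This is a minimal pure-Python illustration, assuming a list-of-rows image with 1 for black, 8-connectivity (the text does not state which connectivity is used), and a hypothetical function name. Each result is the (upper-left x, upper-left y, width, height) of one black connected component.

```python
from collections import deque


def connected_component_boxes(img):
    """Return circumscribed rectangles (x, y, width, height) of black regions."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Start a new component and flood it with a queue.
            seen[y][x] = True
            queue = deque([(x, y)])
            x1 = x2 = x
            y1 = y2 = y
            while queue:
                cx, cy = queue.popleft()
                x1, x2 = min(x1, cx), max(x2, cx)
                y1, y2 = min(y1, cy), max(y2, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x1, y1, x2 - x1 + 1, y2 - y1 + 1))
    return boxes
```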

【０１０９】次に、検出された黒連結成分矩形に対し
て、その形状（例えば幅、高さ、縦横比、黒画素濃度）
を調べることにより各黒連結成分矩形の属性、すなわち
各黒連結成分矩形が、文字列領域中に含まれる矩形（以
後、文字候補矩形とする）、グラフィクス（図形、表、
枠、直線成分）に相当する矩形、写真等のイメージに相
当する矩形及びノイズに相当する矩形のいずれであるか
を決定し、それらを記憶部１２に格納する（ステップ
ｄ）。この黒連結成分矩形の識別処理は、公知の方式に
より実現されてもよい。以後の処理では、文字候補矩形
に着目する。Next, the shape of each detected black connected component rectangle (for example, its width, height, aspect ratio, and black pixel density) is examined to determine its attribute, that is, whether it is a rectangle contained in a character string region (hereinafter, a character candidate rectangle), a rectangle corresponding to graphics (a figure, table, frame, or straight line component), a rectangle corresponding to an image such as a photograph, or a rectangle corresponding to noise, and the results are stored in the storage unit 12 (step d). This classification of black connected component rectangles may be realized by a known method. The subsequent processing focuses on the character candidate rectangles.

【０１１０】Then, as shown in FIG. 18(b), a marginal distribution is obtained by projecting the leading edge position (or the trailing edge position, or the entire width) of each character candidate rectangle in the direction perpendicular to the character string direction. The shape of this distribution is analyzed to detect candidate column positions in the input document image; the positions at which the leading (or trailing) edges of the character candidate rectangles line up in the direction perpendicular to the character string direction are then detected, and the columns 50 are defined. These are stored in the storage unit 12 as part of the document structure data (step e).
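
As a rough illustration of step e, the sketch below projects the full width of each character candidate rectangle onto the horizontal axis and reports interior runs of empty positions as column gaps. It assumes horizontal text, rectangles given as `(left, top, width, height)`, and a hypothetical `min_gap` parameter; the analysis of the distribution shape in the specification may differ from this simple zero-run search.

```python
def column_gaps(rects, page_width, min_gap):
    """Return (start, end) x-ranges, end exclusive, that no rectangle covers."""
    # Marginal distribution: how many rectangles cover each x position.
    profile = [0] * page_width
    for left, _top, width, _height in rects:
        for x in range(left, left + width):
            profile[x] += 1

    gaps, start = [], None
    for x, count in enumerate(profile + [1]):  # sentinel closes a trailing run
        if count == 0 and start is None:
            start = x
        elif count != 0 and start is not None:
            # Ignore page margins; keep only sufficiently wide interior runs.
            if start > 0 and x < page_width and x - start >= min_gap:
                gaps.append((start, x))
            start = None
    return gaps
```

Each reported gap can then be treated as a boundary between two columns.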

【０１１１】Besides the above, the column position detection process may be implemented, for example, by detecting continuous blank regions extending in the direction perpendicular to the character strings. The character string direction of the input document can be obtained by referring to the "information on the document structure of the input document" stored in the knowledge base 45. Because column position detection does not need to be executed for single-column documents, the user can also decide whether to execute it. Specifically, whether to execute the process is described in the operator-defined "information on the document structure of the input document" stored in the knowledge base 45, and this knowledge is consulted after the classification of black connected component rectangles to decide whether column position detection is executed. If the knowledge base 45 contains no definition on this point, the process may always be executed.

【０１１２】Following the column positions obtained by the column position detection process, character candidate rectangles are merged in the character string direction to extract rectangles corresponding to character lines (hereinafter, character line candidate rectangles) (step f). In regions where no column has been detected, character candidate rectangles that overlap in the direction perpendicular to the character string direction and are close to one another in the character string direction are merged in the character string direction to extract character lines. In regions where columns have been detected, character candidate rectangles are merged in the same way, but without crossing a column position, to extract character line candidate rectangles.
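
The merging rule of step f for a region without columns can be sketched as follows. The sketch assumes horizontal text, rectangles given as `(left, top, width, height)`, and a hypothetical `max_gap` distance threshold; the handling of column boundaries is omitted, and the greedy left-to-right strategy is only one possible way to apply the rule.

```python
def merge_into_lines(rects, max_gap):
    """Merge rectangles into lines, returned as (left, top, right, bottom)."""
    lines = []  # each entry is a mutable [left, top, right, bottom]
    for left, top, width, height in sorted(rects):
        right, bottom = left + width, top + height
        for line in lines:
            # Vertical extents must overlap and the horizontal gap must be small.
            overlaps = top < line[3] and line[1] < bottom
            close = left - line[2] <= max_gap
            if overlaps and close:
                line[0] = min(line[0], left)
                line[1] = min(line[1], top)
                line[2] = max(line[2], right)
                line[3] = max(line[3], bottom)
                break
        else:
            lines.append([left, top, right, bottom])
    return [tuple(line) for line in lines]
```

A rectangle that overlaps an existing line vertically but lies far to its right starts a new line, which roughly mimics the effect of a column gap.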

【０１１３】FIG. 19 shows an example in which the character line extraction process has been applied to the image data shown in FIG. 18. At this stage, within each character line candidate rectangle, the distances between character candidate rectangles are also analyzed and neighboring character candidate rectangles are merged to detect words and generate word candidate rectangles (step g). The word candidate rectangle detection process may be carried out, for example, by the following procedure.

【０１１４】(1) The distances between character candidate rectangles are obtained and a histogram of them is created.

【０１１５】(2) The shape of the histogram is analyzed to obtain a threshold that separates intra-word distances from inter-word distances.

【０１１６】(3) Using this threshold, the distances between character candidate rectangles are evaluated in order, and character candidate rectangles are merged into words to generate word candidate rectangles.
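
Steps (1) to (3) can be sketched as below. The specification does not say how the histogram shape is analyzed, so this sketch substitutes a simple stand-in: the threshold is placed in the middle of the largest jump between distinct gap values. The names `gap_threshold` and `group_words` and the `(left, top, width, height)` layout are assumptions made for illustration.

```python
def gap_threshold(gaps):
    """Place a threshold in the largest jump between distinct gap values."""
    ordered = sorted(set(gaps))
    if len(ordered) < 2:
        return None  # all gaps are equal: nothing to separate
    _jump, index = max(
        (ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)
    )
    return (ordered[index] + ordered[index + 1]) / 2


def group_words(rects):
    """Group the character rectangles of one horizontal line into words."""
    if not rects:
        return []
    rects = sorted(rects)
    # (1) distances between neighboring character rectangles
    gaps = [rects[i + 1][0] - (rects[i][0] + rects[i][2]) for i in range(len(rects) - 1)]
    # (2) threshold separating intra-word from inter-word distances
    threshold = gap_threshold(gaps)
    # (3) evaluate each distance in order and merge characters into words
    words, current = [], [rects[0]]
    for rect, gap in zip(rects[1:], gaps):
        if threshold is not None and gap > threshold:
            words.append(current)
            current = [rect]
        else:
            current.append(rect)
    words.append(current)
    return words
```

Because the rectangles are sorted by their left edge, the resulting words come out in left-to-right order.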

【０１１７】As a result, the word candidate rectangles are managed in left-to-right order within each character line candidate rectangle. At this point the rectangle data is represented and managed hierarchically as character candidate rectangles, word candidate rectangles, and character line candidate rectangles.

【０１１８】Next, as shown in FIG. 20, character line candidate rectangles that lie in the same column and are adjacent in the direction perpendicular to the character lines are merged to detect blocks (corresponding to paragraphs) (step h). This block detection process may be implemented by a known method such as that described in (IEICE Technical Report, PRU87-89, pp. 51-61, 1987).

【０１１９】After the blocks have been extracted in this way, the order of the blocks is determined from the column positions and the physical positional relationships of the blocks, using a known method such as that described in (IEICE Technical Report, PRU87-89, pp. 51-61, 1987). The character line candidate rectangles are then reordered according to the block order. At this point the input document is represented by rectangles at the character, word, character line, and block levels, and is managed hierarchically through the physical positional relationships among blocks, character line candidate rectangles, word candidate rectangles, and character candidate rectangles. However, each rectangle so far carries only information such as its coordinates and size. A layout analysis process is therefore performed based on the physical positional relationships, order, and logical relationships of the blocks and character line candidate rectangles, and on the positions and sizes of black connected component rectangles whose attribute is other than character candidate (for example, figures and photographs) and their relationships to the character candidate rectangles. This yields the semantic connections between rectangles, and each rectangle is assigned an attribute such as "title", "abstract", "body", "header", "footer", or "caption" (step i).

【０１２０】In this case, a group of rectangles such as that shown in FIG. 21(a) is represented by the tree structure shown in FIG. 21(b). Within each block the character line candidate rectangles, within each character line candidate rectangle the word candidate rectangles, and within each word candidate rectangle the character candidate rectangles are likewise represented as tree structures. The data obtained in this way is stored in the storage unit 12 as document structure data.

【０１２１】Through the above processing, attribute information, semantic connection information, and physical position information are obtained from the input image for words, character strings, and blocks. The layout analysis process here may use a known method such as that described in Tsujimoto, S., and Asada, H.: "Understanding Multi-articled Documents", in Proc. 10th Int. Conf. Pattern Recognition, pp. 551-556, 1990.

【０１２２】The character string corresponding to the title or keyword to be assigned to the input image is extracted by the recognition target character string extraction unit 44, which refers to the knowledge base 45 defined in advance by the operator and extracts the rectangle corresponding to the desired character string from the tree-structured document structure data stored in the storage unit 12 (step j). In the knowledge base 45, the operator describes the character string areas that can be regarded as the title or keyword of the input document. That is, the operator defines the character string corresponding to the keyword of the desired input image in the knowledge base 45 in terms of attributes, semantic connections, and physical positions. The desired character string can therefore be extracted using any of the "attribute information", the "semantic connection information", or the "physical position information".

【０１２３】For example, to extract the title (1) from a document such as that shown in FIG. 25, the knowledge base 45 is written so as to extract the "title"; the recognition target character string extraction unit 44 then searches the tree-structured document structure data stored in the storage unit 12 and extracts every rectangle carrying the "title" attribute. At this time, the character string area to be subjected to character recognition may be displayed on the display unit 3 for each input document (step k). If the character string area has not been extracted correctly, or if several character string areas including the one to be extracted have been extracted, the operator may indicate or specify the correct character string area (by enclosing it on the actual image) using the mouse or the like of the operation unit 1.
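
The attribute-based search of step j can be pictured with a small sketch. The `Node` class, its field names, and the attribute strings below are hypothetical stand-ins for the tree-structured document structure data; the specification does not define a concrete data layout.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                 # "page", "block", "line", "word" or "char"
    rect: tuple               # (left, top, width, height)
    attribute: str = ""       # e.g. "title", "abstract", "body"
    children: list = field(default_factory=list)


def find_by_attribute(node, attribute):
    """Collect every node in the tree that carries the given attribute."""
    found = [node] if node.attribute == attribute else []
    for child in node.children:
        found.extend(find_by_attribute(child, attribute))
    return found


# Usage: a page with a title block and a body block.
page = Node("page", (0, 0, 100, 100), children=[
    Node("block", (10, 5, 80, 10), "title", [Node("line", (10, 5, 80, 10))]),
    Node("block", (10, 20, 80, 70), "body"),
])
title_rects = [node.rect for node in find_by_attribute(page, "title")]
```

A search by physical position could be written the same way by testing `rect` against a processing range instead of comparing attributes.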

【０１２４】Once the character string to be recognized has been extracted in this way, individual character patterns are cut out of the character string area. This is done by merging neighboring connected black pixel regions, already extracted in step c, into character patterns. To evaluate proximity, a distance threshold is set, and character elements that are close to one another in the vertical direction of the character are merged into one character pattern. As a result, separated characters such as "i" and "%" are each extracted as a single pattern. Character elements are basically not merged in the horizontal direction, so a character such as "動" is extracted as two patterns, "重" and "力". Two patterns extracted in this way may be one character or two characters, and both possibilities must be considered. In such a case, for example, character recognition may be performed on all of the candidate segmentations, and the final choice made by evaluating contextual consistency. Conversely, there are also character patterns in which several characters touch one another. For such a pattern, for example the one shown in FIG. 22(a), the positions at which the marginal distribution (projection) f of the black pixels of the pattern, taken in the direction perpendicular to the character string, reaches a local minimum are judged to be the boundary positions between the touching characters, as shown in FIG. 22(b). The pattern shape near each boundary position is then examined, the concave portion closest to the boundary position is judged to be the cut point of the pattern, and the pattern is split at that point to cut out the individual character patterns.
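
The projection-based split of touching characters can be sketched as follows. This simplified version assumes horizontal text and a pattern stored as rows of 0/1 pixels, and it cuts at the interior column where the projection f is smallest; the examination of concave portions near the boundary, described above, is left out. The function names are illustrative.

```python
def column_profile(pattern):
    """Projection f: number of black pixels in each column of the pattern."""
    return [sum(row[x] for row in pattern) for x in range(len(pattern[0]))]


def best_cut(pattern):
    """Return the interior column index where the projection is smallest."""
    f = column_profile(pattern)
    interior = range(1, len(f) - 1)  # never cut at the outer edges
    return min(interior, key=lambda x: f[x])


def split_at(pattern, cut):
    """Split the pattern into the parts left of and from the cut column on."""
    left = [row[:cut] for row in pattern]
    right = [row[cut:] for row in pattern]
    return left, right
```

When several columns share the minimum, `min` returns the leftmost one, so a real implementation would need the shape check to choose among them.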

【０１２５】Once the individual character patterns have been cut out, each character pattern is associated with the character categories to which it may belong. This association is made, for example as in the multiple similarity method, by computing the degree of match obtained by superimposing the unknown pattern on statistically created standard patterns (step l). As a result, the characters ranked first through M-th for the unknown pattern are output as a list, in descending order of match, each with its character category code and degree of match. The list is displayed on the "display unit" 2 (step m), and the user may correct the recognition result by operating the "operation unit" 1.
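
The ranked candidate list of step l can be illustrated with a deliberately simple stand-in. The sketch below does not implement the multiple similarity method; it scores each standard pattern by the fraction of pixels that agree with the unknown pattern and returns the top M labels with their scores. The template dictionary and all names are assumptions for illustration.

```python
def match_score(pattern, template):
    """Fraction of pixels on which two equally sized binary patterns agree."""
    cells = [(p, t) for prow, trow in zip(pattern, template) for p, t in zip(prow, trow)]
    return sum(p == t for p, t in cells) / len(cells)


def top_candidates(pattern, templates, m):
    """Return the m best (label, score) pairs, best match first."""
    scored = [(match_score(pattern, t), label) for label, t in templates.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))  # score desc, then label
    return [(label, score) for score, label in scored[:m]]


# Usage with three toy 3x3 standard patterns.
templates = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
}
unknown = [[1, 1, 1], [0, 1, 0], [0, 1, 1]]
ranking = top_candidates(unknown, templates, 2)
```

Presenting the whole ranked list rather than only the best match is what lets the operator correct a wrong first candidate.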

【０１２６】When the title or keyword of the input document has been determined by the above processing, its character string code is stored on the optical disk 11 in association with the compressed input image already held in the storage unit 12, and the filing process ends. The stored data need not be the input image itself; it may be a part of the image or processed data such as recognition results.

【０１２７】The knowledge base 45 will now be described in detail. For each group of documents sharing the same format or structure, the knowledge base 45 stores operator-created "information on the document structure" and "information on the character string corresponding to the title or keyword of the input document in the input image". Before image filing work begins, the operator starts the knowledge acquisition support unit 41 and creates this knowledge interactively, through operation of the operation unit 1 and information presented on the display unit 3. During image filing work, the operator either selects from the knowledge base 45 the previously defined "information on the document structure" and "information on the character string corresponding to the title or keyword of the input document in the input image" that match the input document, or creates them anew. The document structure analysis and understanding process and the extraction of the character string to be recognized during image input are executed with reference to this knowledge.

【０１２８】The "information on the document structure" defines, for example:
・the character string direction of the input document (vertical or horizontal)
・the structure of the input document (number of columns)
・the size of the input document (A4, B4, etc.)
・the attribute of the input document (text, table, drawing, etc.)
The "information on the character string corresponding to the title or keyword of the input document in the input image" defines the character string area in the input document that is to be subjected to character recognition.

【０１２９】Next, the knowledge acquisition support unit 41 used to create the knowledge base 45 will be described. The knowledge acquisition support unit 41 is a processing unit that the operator starts when building the knowledge base 45 before image filing; it does not operate during filing. Through it, the operator can interactively define, correct, add, change, confirm, and evaluate the "information on the document structure" and the "information on the character string corresponding to the title or keyword of the input document in the input image" for an input document, by operating the operation unit 1 and viewing information presented on the display unit 3. With this interactive design, the operator does not need to set cumbersome information such as the number of characters or the coordinates at which characters exist in order to define a character string area, and can define the document structure and character string positions easily. The external representation that the knowledge acquisition support unit 41 presents to the operator when knowledge is defined may be, for example, a window 52 on a screen 51 as shown in FIG. 23; a panel format is also possible. The window 52 is displayed at an arbitrary position on the screen when the operator designates, with the mouse or the like of the operation unit 1, an icon 53 displayed at the top (or the bottom, left edge, or right edge) of the screen 51, as shown for example in FIG. 24. The window used to define the document structure or the character recognition target area may consist of a single scrollable window or of several windows displayed one after another. With a single scrollable window, the operator defines the document structure and the character string area to be recognized in order from the top, following the instructions in the window. With several windows, the user first defines the document structure of the input document in one window and then defines the character string area to be recognized in windows that appear in sequence. In this case, each window may be generated automatically whenever the definition in the preceding stage is completed.

【０１３０】Next, the definition of the "information on the document structure" and the "information on the character string to be recognized" with the knowledge acquisition support unit 41 will be described using the actual document shown in FIG. 25 as an example.

【０１３１】FIG. 25 shows a title page from the "Transactions of the Information Processing Society of Japan". A title page always carries the title, the author names, an abstract, and so on, together with part of the body text. When this document is stored as an image, an index for retrieval (that is, the title of the input image) must be assigned to the image. To extract such a character string automatically, its character string area must be defined in advance, and the present invention makes this possible with a simple operation. Here, the document structure of the input image of FIG. 25 and the character string area of the title (1) in that figure are defined using the present invention.

【０１３２】To define the document structure of the input image and the character string to be recognized, the operator first designates the icon 53 shown in FIG. 24 with the mouse or the like of the operation unit 1. As a result, a window 52a such as that shown in FIG. 26 is first displayed on the screen of the display unit 3.

【０１３３】In this window 52a, selecting item (1) allows the operator to apply already defined "knowledge about the document structure and the character string to be recognized" to the input document. Selecting item (4) allows the operator to define new "knowledge about the document structure and the character string to be recognized" for the input document, using windows 52c through 52g shown in FIGS. 28 through 32.

【０１３４】When item (1) is selected in window 52a shown in FIG. 26, window 52a disappears from the display unit 3 and window 52b shown in FIG. 27 is displayed on the screen of the display unit 3. When item (2) of window 52b is selected, windows 52c through 52g shown in FIGS. 28 through 32 are displayed in sequence, so that some of the parameters can be changed before the knowledge is applied to the input document. When item (3) of window 52b is selected, the already defined "knowledge about the document structure and the character string to be recognized" can be applied to the input document as it is. In that case, the definition of the "knowledge about the document structure and the character string to be recognized" for the input document is complete.

【０１３５】Here, item (4) of window 52a shown in FIG. 26 is selected, and new "knowledge about the document structure and the character string to be recognized" is defined for the document of FIG. 25.

【０１３６】In each window, an item is selected by designating the hollow square mark at the head of the item with the mouse or the like of the operation unit 1. Designating a hollow square mark changes it to a black square mark; information entered in the underlined blank in this state is adopted as knowledge about the input document. Designating a black square mark changes it back to a hollow square mark and cancels the information that had been entered in the underlined blank (the underlined portion becomes blank again), so that the operator can enter new information.

【０１３７】When item (4) of window 52a shown in FIG. 26 is selected, window 52a disappears from the display unit 3 and window 52c shown in FIG. 28 is displayed on the screen of the display unit 3. The operator uses this window 52c to define the document structure of the input image.

【０１３８】In window 52c, the operator selects the items in order from the top and enters the required information in the underlined blank of each item using the keyboard or the like of the operation unit 1. Item (5) defines the character string direction of the document. Because the document shown in FIG. 25 is written horizontally, the user designates "horizontal" in item (5) with the mouse or the like. The character line extraction process (step f) in the processing procedure of FIG. 17 merges black connected component rectangles and extracts character lines on the basis of this information.

【０１３９】Item (6) defines the number of columns of the input document. Because the document shown in FIG. 25 has two columns, the operator enters "2" in the underlined blank of item (6). The column extraction process (step e) in the processing procedure of FIG. 17 is executed only when a value of "2" or more is entered in this blank.

【０１４０】Item (7) defines the size of the input document. Because the input document shown in FIG. 25 is B5 size, the operator enters "B5" in the underlined blank of item (7). Item (8) defines the attribute of the input document. Because the input document consists mainly of text data, the operator designates "text" in item (8) with the mouse or the like. If "table" is selected here, a table detection process and a table structure analysis process are executed in the document structure analysis during filing. If "drawing" is selected, the column detection process (step e) and the block detection process (step h) are not performed, and a character string extraction process suited to drawings is executed instead. Designating item (9) with the mouse or the like cancels all of the information defined in window 52c since it was opened. To finish defining the document structure in window 52c, the operator finally designates item (10) with the mouse or the like; window 52c then disappears from the screen of the display unit 3 and window 52d is displayed, so that the character recognition target area can be defined. If any of items (5) through (8) in window 52c has been left unselected or unfilled at this point, a warning message such as "The definition in the window is not appropriate. Please redo the definition." may be displayed on the display unit 3.
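
The document structure knowledge entered through items (5) to (8) can be pictured as a small record. The `DocumentStructure` class, its field names, and the string values below are hypothetical; the sketch only mirrors two rules stated in the description, namely that all four items must be filled in and that column detection runs only for two or more columns and not for drawings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocumentStructure:
    text_direction: Optional[str] = None   # item (5): "horizontal" or "vertical"
    column_count: Optional[int] = None     # item (6): number of columns
    paper_size: Optional[str] = None       # item (7): e.g. "B5"
    document_type: Optional[str] = None    # item (8): "text", "table" or "drawing"


def missing_items(structure):
    """Names of items left undefined; a non-empty result would trigger the warning."""
    return [name for name, value in vars(structure).items() if value is None]


def runs_column_detection(structure):
    """Step e runs only for two or more columns, and is skipped for drawings."""
    if structure.document_type == "drawing":
        return False
    return structure.column_count is not None and structure.column_count >= 2


# Usage: the definition for the document of FIG. 25.
fig25 = DocumentStructure("horizontal", 2, "B5", "text")
```

Keeping the definition as plain data makes it easy to store one record per document format and to reapply it to later documents of the same format.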

【０１４１】To define the character string area of the title (1) in the document shown in FIG. 25, the character string area extraction processing range is first defined in window 52d shown in FIG. 29.

【０１４２】Selecting item (41) here allows the desired character string to be defined by attribute. Window 52h shown in FIG. 33 is then displayed on the screen, and the operator selects the item corresponding to the attribute of the desired character string. For the title (1), the definition can be completed at this point, but definition by position coordinate information is also described below.

【０１４３】この場合、図２８に示すウインドウ５２ｃ
の項目（６）で段組数を「２」と定義したので、この図
２９に示すウインドウ５２ｄでは項目（１１）を選択す
ればよい。他に、上述ウインドウ５２ｃで項目（６）に
おいて段組数を「１」と設定しておけば、ウインドウ５
２ｄでは項目（１２）を選択し「全体を垂直方向に３分
割した内の上から１番目までを文字列領域抽出処理範囲
とする」と定義すると、ファイリング処理時における文
書構造解析処理は全体の３分の１の領域（上部）につい
てのみ実施される。また、図２９に示すウインドウ５２
ｄでは他に、項目（１３）を選択すると図３４のような
抽出処理範囲の定義が可能であるし、また項目（１４）
を選択すると任意の２点の絶対座標を頂点とする矩形に
よる抽出処理範囲の定義が可能である。ここでの項目選
択は以下のウインドウにおける定義に曖昧さを残さない
ために（１１）から（１４）までの内の１つの項目を１
回しか選択できない。このウインドウ５２ｄにおける定
義を終了したい場合には、最後に項目（１６）をマウス
等で指示すると、ウインドウ５２ｄが表示部３の画面上
から消滅し、図３０に示すウインドウ５２ｅが画面上に
表示され文字認識対象文字列領域について更に詳細な定
義が可能となる。オペレータは、このウインドウ５２ｅ
により文字認識対象文字列を含むブロックの領域に関す
る定義を行うようになる。In this case, since the number of columns was defined as "2" in the item (6) of the window 52c shown in FIG. 28, the item (11) may be selected in the window 52d shown in FIG. 29. Alternatively, if the number of columns has been set to "1" in the item (6) of the window 52c, selecting the item (12) in the window 52d and defining "the first part from the top, out of three parts obtained by dividing the whole vertically, is the character string area extraction processing range" causes the document structure analysis processing during filing to be carried out only on the upper third of the whole. In the window 52d shown in FIG. 29, selecting the item (13) also allows the extraction processing range to be defined as shown in FIG. 34, and selecting the item (14) allows it to be defined by a rectangle whose vertices are the absolute coordinates of two arbitrary points. So that no ambiguity is left in the definitions in the subsequent windows, only one of the items (11) to (14) can be selected, and only once. To end the definition in the window 52d, the item (16) is finally designated with a mouse or the like; the window 52d then disappears from the screen of the display unit 3, and the window 52e shown in FIG. 30 is displayed on the screen, in which the character recognition target character string area can be defined in more detail. In the window 52e, the operator defines the area of the block that contains the character recognition target character string.
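The two range definitions described for the items (12) and (14) can be illustrated with a short sketch. This is only an illustration under assumptions: the `Rect` type, the function names, and the page size in pixels are hypothetical and do not come from the specification.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def top_fraction_range(page: Rect, parts: int, keep: int) -> Rect:
    """Item (12) style: divide the page vertically into `parts` bands and keep the top `keep`."""
    if not 1 <= keep <= parts:
        raise ValueError("keep must be between 1 and parts")
    height = page.bottom - page.top
    return Rect(page.left, page.top, page.right, page.top + height * keep // parts)


def two_point_range(p1: Tuple[int, int], p2: Tuple[int, int]) -> Rect:
    """Item (14) style: rectangle whose opposite vertices are two arbitrary points."""
    (x1, y1), (x2, y2) = p1, p2
    return Rect(min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


# Hypothetical page size in pixels (an assumption, not taken from the specification).
page = Rect(0, 0, 1456, 2056)
print(top_fraction_range(page, parts=3, keep=1))  # Rect(left=0, top=0, right=1456, bottom=685)
print(two_point_range((900, 400), (100, 50)))     # Rect(left=100, top=50, right=900, bottom=400)
```

Restricting the later analysis to such a rectangle is what allows the document structure analysis to run on only the upper third of the page in the example above.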

【０１４４】図３０に示すウインドウ５２ｅの項目（１
７）は、図１７の処理手続きにおけるブロック検出処理
（ステップｈ）で参照されるパラメータを定義する項目
である。オペレータは、図２５に示す入力文書におい
て、ブロック間のセパレータと見なしうる空白領域の幅
の下限を測定し、その大きさをmm単位で項目（１７）の
下線付き空白部に記入する。The item (17) in the window 52e shown in FIG. 30 defines the parameter referred to in the block detection process (step h) of the processing procedure of FIG. 17. For the input document shown in FIG. 25, the operator measures the lower limit of the width of a blank area that can be regarded as a separator between blocks and enters that value, in mm, in the underlined blank of the item (17).

【０１４５】ここで、図２５の入力文書について６mmが
妥当であるとすると項目（１７）の下線付き空白部に
「６」を記入すればよい。項目（１８）は、文字認識対
象文字列が含まれるブロックの領域を定義するところで
ある。文書構造解析処理で自動的にブロックが検出され
順番付けされるが、利用者はどのブロックに文字認識対
象文字列が含まれるかを推測し、項目（１８）の下線付
き空白部に記入する。例えば図２５の題目（１）は２番
目のブロックに含まれるので利用者は「２」を項目（１
８）の下線付き空白部に記入する。ここでの項目選択は
以下のウインドウにおける定義に曖昧さを残さないため
に一つのブロックしか定義できないようになっている。
また、項目（１７）を選択せず項目（１８）を選択した
場合には項目（１７）の下線付き空白部にはαmmあるい
は、入力文書の平均文字行幅のβ倍がセットされる。こ
こで、項目（１８）が利用者によって選択されない場合
には、図１７の処理手続きにおけるブロック抽出処理
（ステップｈ）実施されない。あるいは、常に実施され
るようになっていてもよい。If 6 mm is appropriate for the input document shown in FIG. 25, "6" is entered in the underlined blank of the item (17). The item (18) defines the area of the block that contains the character recognition target character string. Blocks are detected and numbered automatically in the document structure analysis processing; the user estimates which block contains the character recognition target character string and enters its number in the underlined blank of the item (18). For example, the subject (1) in FIG. 25 is contained in the second block, so the user enters "2" in the underlined blank of the item (18). So that no ambiguity is left in the definitions in the subsequent windows, only one block can be defined here. When the item (18) is selected without the item (17) being selected, α mm, or β times the average character line width of the input document, is set in the underlined blank of the item (17). When the item (18) is not selected by the user, the block extraction processing (step h) in the processing procedure of FIG. 17 is not carried out. Alternatively, it may always be carried out.
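How a separator width given in mm could drive block detection, and how a block number then selects the target block, can be sketched as follows. This is a simplified one-dimensional illustration, not the procedure of step h itself: the scan resolution of 200 dpi, the line coordinates, and all names are assumptions made for the example.

```python
from typing import List, Tuple

Line = Tuple[int, int]  # (top, bottom) of one character line, in pixels


def mm_to_px(mm: float, dpi: int) -> float:
    """Convert a length in millimetres to pixels at the given resolution."""
    return mm * dpi / 25.4


def split_into_blocks(lines: List[Line], min_gap_mm: float, dpi: int = 200) -> List[List[Line]]:
    """Group character lines into blocks; a vertical gap of at least min_gap_mm starts a new block."""
    min_gap_px = mm_to_px(min_gap_mm, dpi)
    blocks: List[List[Line]] = []
    for line in sorted(lines):
        if blocks and line[0] - blocks[-1][-1][1] < min_gap_px:
            blocks[-1].append(line)   # gap too small to be a separator
        else:
            blocks.append([line])     # separator found (or first line)
    return blocks


# Hypothetical character line positions; 6 mm is about 47 pixels at 200 dpi.
lines = [(100, 130), (140, 170), (230, 260), (270, 300), (400, 430)]
blocks = split_into_blocks(lines, min_gap_mm=6)
target = blocks[2 - 1]  # item (18) = "2": the second block
print(len(blocks), target)  # 3 [(230, 260), (270, 300)]
```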

【０１４６】その後、図３０に示すウインドウ５２ｅに
おける定義を終了したい場合には、最後に項目（２０）
をマウス等で指示すると、ウインドウ５２ｅが表示部３
の画面上から消滅し、図３１に示すウインドウ５２ｆが
画面上に表示され文字認識対象文字列領域について更に
詳細な定義が可能となる。Then, to end the definition in the window 52e shown in FIG. 30, the item (20) is finally designated with a mouse or the like; the window 52e disappears from the screen of the display unit 3, and the window 52f shown in FIG. 31 is displayed on the screen, in which the character recognition target character string area can be defined in more detail.

【０１４７】このウインドウ５２ｆは，それまでに定義
された文字認識対象文字列抽出範囲の中の文字認識対象
文字列領域を定義するためのものである。図２５の題目
（１）の文字列領域を決定するためには項目（２１）か
ら（３０）までのうち、オペレータは項目（２１）を選
択すればよい。すなわち上述の題目（１）は２番目のブ
ロックに含まれる全ての文字（列）で構成されているの
で、それら全ての文字を認識してコード化する必要があ
る。This window 52f is used to define the character recognition target character string area within the character recognition target character string extraction range defined so far. To determine the character string area of the subject (1) in FIG. 25, the operator selects the item (21) from among the items (21) to (30). The subject (1) consists of all the characters (character strings) contained in the second block, so all of those characters need to be recognized and coded.

【０１４８】図２５の示す文書に関するウインドウを用
いた文字列領域の定義は、これで完了となる。This completes the window-based definition of the character string area for the document shown in FIG. 25.

【０１４９】次に、その他の項目（２２）から（３０）
について、これらで定義できる文字列領域の種類につい
て説明する。Next, the types of character string area that can be defined with the other items (22) to (30) are described.

【０１５０】項目（２２）は、ある領域（それまでに定
義した文字認識対象文字列が存在する領域）内の連続す
る複数（或いは１つ）の文字行領域を文字認識対象文字
行と定義することができる。項目（２３）は、ある領域
内の任意の罫線の前後に存在する連続した複数（或いは
１行）の文字列を文字認識対象文字列と定義することが
できる。例えば図２５の書誌事項（２）に含まれる文字
行領域を項目（２３）で定義することができる。項目
（２４）は、ある領域内に含まれる２本の罫線に囲まれ
る連続する複数（或いは一つの）文字列を認識対象文字
列領域と定義することができる。項目（２５）は、ある
領域内に含まれる文字列のうちその文字列幅（横書き文
字列なら縦幅、縦書き文字列なら横幅）が下線付き空白
部に記入した数値の範囲内に含まれるものを認識対象文
字列と定義することができる。例えば図２５の題目
（１）の文字列について、例えば、下線付き空白部に
「３．５」と「５」のように順に入力する事によりその
文字列領域を定義することができる。項目（２６）は、
ある領域に含まれるアンダーラインの引かれた複数（或
いは１つ）の文字列を認識対象文字列領域であると定義
することができる。項目（２７）は、ある領域の矩形に
囲まれた連続する複数（或いは１つ）の文字行を認識対
象文字列領域と定義することができる。項目（２８）
は、ある領域に含まれる白抜き文字列を認識対象文字列
とすることができる。この項目（２８）が選択される
と、図１７の処理手続きの文字行抽出処理（ステップ
ｆ）において抽出された文字行候補矩形のうち、（黒画
素の量／白画素の量）がγ以上である文字行を白抜き文
字行として抽出し、更に黒画素を反転させて通常の文字
行と同様に扱えるようにする。項目（２９）は、これま
での項目（２１）から（２８）までにおいて定義した認
識対象文字列領域に含まれる文字列のうち、利用者がキ
ーボード等から下線付き空白部に記入した文字列を含ん
だ文字行を認識対象文字列と定義することができる。こ
の項目（２９）が定義されると、図１７の処理手続きの
文字認識処理（ステップ１）においてそれまでに定義さ
れた対象文字列領域に含まれる全ての文字パターンを認
識して、更にその中から利用者が指定した文字列を含む
文字行を抽出する。このとき項目（２９）の（含、除
去）において「含」をマウス等で指示すれば利用者が指
定した文字列を含んだ全ての文字コードを登録すること
ができ、「除去」をマウス等で指示すれば利用者が指定
した文字列を除いた全ての文字コードを登録することが
できる。項目（３０）は、項目（２１）から（２９）ま
での何れかにおいて、入力文書のある１行を認識対象文
字列領域と定義した場合のみ選択できる項目である。す
なわち、項目（２１）から（２９）までの何れかを選択
し、更に認識対象文字列として複数行を定義した場合に
は項目（３０）を選択することはできない（このとき項
目（３０）を選択すると再定義を促す警告メッセージが
表示部３の任意の位置に表示される）。この項目（３
０）を選択すると利用者が指定した単語のみを認識対象
文字列と定義することができる。The item (22) defines one or more consecutive character line areas within a given area (the area, defined so far, in which the character recognition target character string exists) as the character recognition target character lines. The item (23) defines one or more consecutive character strings located before and after an arbitrary ruled line in the area as the character recognition target character string. For example, the character line area contained in the bibliographic item (2) of FIG. 25 can be defined with the item (23). The item (24) defines one or more consecutive character strings enclosed between two ruled lines in the area as the recognition target character string area. The item (25) defines, as the recognition target character string, those character strings in the area whose character string width (the height for a horizontally written string, the width for a vertically written string) falls within the range of values entered in the underlined blanks. For example, the character string area of the subject (1) in FIG. 25 can be defined by entering "3.5" and "5", in that order, in the underlined blanks. The item (26) defines one or more underlined character strings in the area as the recognition target character string area. The item (27) defines one or more consecutive character lines enclosed by a rectangle in the area as the recognition target character string area. The item (28) makes outline (white-on-black) character strings in the area the recognition target character string. When the item (28) is selected, those character line candidate rectangles extracted in the character line extraction process (step f) of the processing procedure of FIG. 17 whose ratio (amount of black pixels / amount of white pixels) is γ or more are extracted as outline character lines, and their black pixels are inverted so that they can be handled in the same way as ordinary character lines. The item (29) defines, as the recognition target character string, those character lines that contain a character string entered by the user from the keyboard or the like in the underlined blank, from among the character strings in the recognition target character string area defined by the items (21) to (28). When the item (29) is defined, the character recognition processing (step 1) of the processing procedure of FIG. 17 recognizes all the character patterns contained in the target character string area defined so far, and then extracts from them the character lines that contain the character string specified by the user. If "include" is designated with a mouse or the like in the (include / remove) choice of the item (29), all the character codes, including the character string specified by the user, are registered; if "remove" is designated, all the character codes except the character string specified by the user are registered. The item (30) can be selected only when a single line of the input document has been defined as the recognition target character string area by one of the items (21) to (29). That is, if one of the items (21) to (29) has been selected and several lines have been defined as the recognition target character string, the item (30) cannot be selected (if it is selected in that case, a warning message prompting redefinition is displayed at an arbitrary position on the display unit 3). When the item (30) is selected, only the word specified by the user is defined as the recognition target character string.
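The test described for the item (28), a black-to-white pixel ratio of γ or more followed by inversion, can be sketched as below. The bitmap representation, the function names, and the default value of γ are assumptions chosen for illustration; the specification does not give a concrete value for γ.

```python
from typing import List

Bitmap = List[List[int]]  # 1 = black pixel, 0 = white pixel


def is_outline_line(bitmap: Bitmap, gamma: float) -> bool:
    """True when (black pixels / white pixels) >= gamma for a character line candidate."""
    black = sum(sum(row) for row in bitmap)
    white = sum(len(row) for row in bitmap) - black
    if white == 0:
        return True  # all black: the ratio is unbounded, so treat it as exceeding gamma
    return black / white >= gamma


def invert(bitmap: Bitmap) -> Bitmap:
    """Swap black and white pixels."""
    return [[1 - p for p in row] for row in bitmap]


def normalize_line(bitmap: Bitmap, gamma: float = 1.0) -> Bitmap:
    """Invert outline character lines so they can be handled like ordinary lines."""
    return invert(bitmap) if is_outline_line(bitmap, gamma) else bitmap


outline = [[1, 1, 1, 1],
           [1, 0, 0, 1],
           [1, 1, 1, 1]]
print(normalize_line(outline))  # [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
```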

【０１５１】このように上述の各項目（２１）から（３
１）は、それ以前の項目（１）から（２０）までで定義
された各領域に含まれる文字列に対して、認識対象文字
列領域を定義することができる。従って、新たに別の領
域を定義する場合には、項目（３２）をマウス等で指示
する必要がある。項目（３２）を指示すると図２９に示
すウインドウ５２ｄが再び表示され、定義作業を続ける
ことができる。In this way, each of the items (21) to (31) defines a recognition target character string area for the character strings contained in the area defined by the preceding items (1) to (20). Therefore, to define a further, separate area, the item (32) must be designated with a mouse or the like. When the item (32) is designated, the window 52d shown in FIG. 29 is displayed again and the definition work can continue.

【０１５２】全ての定義を終了する場合には、図３１に
示すウインドウ５２ｆにおいて、まず項目（３３）をマ
ウス等で選択し、それまで定義した情報を入力文書に対
する「文書構造に関する情報」および「入力画像中に存
在する入力文書のタイトルもしくはキーワードに相当す
る文字列情報に関する情報」として知識ベース４５に格
納する。To end all the definitions, the item (33) is first selected with a mouse or the like in the window 52f shown in FIG. 31, and the information defined so far is stored in the knowledge base 45 as the "information on the document structure" of the input document and the "information on the character string information corresponding to the title or keyword of the input document existing in the input image".

【０１５３】そして、最後に項目（３４）を指示すると
ウインドウ５２ｆが表示部３の画面上から消滅し、図３
２に示すウインドウ５２ｇが画面上に表示される。ウイ
ンドウ５２ｆでは、定義済みの知識を入力文書に適用す
るときの種々の条件設定を行うもので、項目（３５）及
び項目（３６）は、複数枚の文書を連続処理する場合に
定義済みの知識を適用する文書の位置を決めるためのも
ので、項目（３５）は対象文書がまとまって入力される
ときに選択し、項目（３６）は対象文書がとびとびに入
力されるときに選択するようにしている。両方の項目と
も「複」を指示すると一回の選択で複数回の定義が可能
である。項目（３７）は、ファイリング作業中に表示部
３に表示された入力画像あるいは矩形情報で表現された
文書構造データ上に文字認識対象文字列領域の表示する
場合に選択するものであり、項目（３８）は、ファイリ
ング作業中に表示部３において文字認識結果の表示し、
修正作業を行う場合に選択するものである。そして、項
目（４０）を選択すると、図３３に示すようなファイリ
ング作業に関するメニューが表示され、ファイリング処
理のための準備が完了する。Finally, when the item (34) is designated, the window 52f disappears from the screen of the display unit 3, and the window 52g shown in FIG. 32 is displayed on the screen. The window 52f is used to set various conditions for applying the defined knowledge to input documents. The items (35) and (36) determine the positions of the documents to which the defined knowledge is applied when a plurality of documents are processed in succession: the item (35) is selected when the target documents are input together as a group, and the item (36) when they are input at scattered positions. For both items, designating "multiple" allows several definitions to be made with a single selection. The item (37) is selected when the character recognition target character string area is to be displayed, during filing work, on the input image shown on the display unit 3 or on the document structure data represented by rectangle information. The item (38) is selected when the character recognition results are to be displayed on the display unit 3 and corrected during filing work. When the item (40) is then selected, a menu for filing work as shown in FIG. 33 is displayed, and preparation for the filing processing is complete.

【０１５４】ここで、上述した各ウインドウ５２ａ〜５
２ｇは、上述した表示方法の他に、図３５に示すように
各ウインドウ５２ａ〜５２ｇがオーバラップされ、必要
に応じて各ウインドウをマウス等で指示し、該当ウイン
ドウを手前に表示し定義するようにしてもよいし、、図
３６に示すようにタイル形式で表示し、各ウインドウ５
２ａ〜５２ｇで定義するようにしてもよい。また、上述
のごとく記述された認識対象文字列の領域に関する定義
について、知識獲得支援部４１でその内容に「重複」や
「矛盾」があるかどうかをチェックするようにしてもよ
い。さらに、知識獲得支援部４１の起動等を制御するメ
ニューにおいて「テスト」を操作部１で選択すると、オ
ペレータが定義した「入力文書中に存在する入力文書の
タイトルもしくはキーワードに相当する文字列情報に関
する情報」が所望通り定義されているか否かをサンプル
文書により確認することができるようにしてもよい。Besides the display method described above, the windows 52a to 52g may be overlapped as shown in FIG. 35, with the operator designating a window with a mouse or the like as needed to bring it to the front and make definitions in it, or they may be displayed in tile form as shown in FIG. 36 and the definitions made in each of the windows 52a to 52g. The knowledge acquisition support unit 41 may also check the definitions of the recognition target character string area described above for "duplication" and "contradiction". Furthermore, when "test" is selected with the operation unit 1 in the menu that controls activation of the knowledge acquisition support unit 41 and the like, it may be made possible to confirm on a sample document whether the "information on the character string information corresponding to the title or keyword of the input document existing in the input document" defined by the operator has been defined as intended.

【０１５５】従って、このようにすれば、入力画像から
抽出した文字列成分を所定の関係で統合したブロックに
ついて物理的な位置および大きさを解析し、各ブロック
間の意味的接続関係を獲得し、さらに各ブロックに対す
る属性情報を付与して、これら意味的接続関係や属性情
報により入力画像のタイトルもしくはキーワードに相当
する任意の位置に記載されている文字列成分を正確に検
出できるようになるので、多種多様な書式を持つ文書に
ついても一括して自動的に検索タイトルを付与しながら
ファイリングできることになる。また、入力文書に記載
されているタイトル文字列に関する情報の設定は、表示
部での適切な表示により装置と対話的に行うことがで
き、オペレータのファイリング処理に関する作業を大幅
に軽減することができる。In this way, the physical position and size of each block, obtained by integrating the character string components extracted from the input image according to a predetermined relationship, are analyzed, the semantic connection relationships between the blocks are obtained, and attribute information is assigned to each block. Using these semantic connection relationships and this attribute information, a character string component corresponding to the title or keyword of the input image can be detected accurately wherever it is written, so documents in a wide variety of formats can be filed in a batch while search titles are assigned automatically. In addition, the information on the title character string written in the input document can be set interactively with the apparatus through appropriate displays on the display unit, which greatly reduces the operator's workload in filing.

【０１５６】なお、上述した第４実施例では、知識ベー
ス４５に格納されている「入力文書の文書構造に関する
情報」として（１）入力文書の文字列方向、（２）入力
文書の構造（段組等）、（３）入力文書のサイズ、
（４）入力文書の属性などが定義され、「入力文書中に
存在する入力文書のタイトルもしくはキーワードに相当
する文字列情報に関する情報」では文字認識対象列領域
が定義されているが、上記の他に入力文書の種類とそれ
に対応して、入力文書画像のキーワードもしくはタイト
ルとなるべき文字列（矩形）の属性（すなわち「表
題」、「要約」、「本文」、「ヘッダー」、「フッタ
ー」、「キャプション」の何れか）を知識として格納す
るようにしてもよい。この場合、画像ファイリング作業
時にオペレータは入力文書ごとに文書の種類を入力す
る。このとき、認識対象文字列抽出部４４では、オペレ
ータによって入力された入力文書の種類と知識ベース４
５を照合して、抽出すべき文字列（すなわち文書構造デ
ータ中の矩形）の属性を獲得した後、レイアウト解析処
理により得られた入力文書の文書構造データ群（記憶部
１２に格納されている）から妥当な矩形（文字列領域）
を抽出するようになる。In the fourth embodiment described above, (1) the character string direction of the input document, (2) the structure of the input document (columns and the like), (3) the size of the input document, and (4) the attribute of the input document are defined as the "information on the document structure of the input document" stored in the knowledge base 45, and the character recognition target character string area is defined in the "information on the character string information corresponding to the title or keyword of the input document existing in the input document". In addition to these, the type of input document and, corresponding to it, the attribute of the character string (rectangle) that is to be the keyword or title of the input document image (that is, one of "title", "summary", "body", "header", "footer", and "caption") may be stored as knowledge. In this case, the operator inputs the document type for each input document during image filing work. The recognition target character string extraction unit 44 then collates the document type input by the operator with the knowledge base 45 to obtain the attribute of the character string to be extracted (that is, of a rectangle in the document structure data), and extracts the appropriate rectangle (character string area) from the document structure data group of the input document obtained by the layout analysis processing (stored in the storage unit 12).
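The lookup described above, from document type to attribute and then to the matching rectangle, might look like the following minimal sketch. The `Region` type, the contents of `KNOWLEDGE_BASE`, and the document type names are hypothetical; only the six attribute names come from the text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Region:
    attribute: str  # "title", "summary", "body", "header", "footer" or "caption"
    left: int
    top: int
    right: int
    bottom: int


# Hypothetical knowledge: document type -> attribute of the keyword/title string.
KNOWLEDGE_BASE: Dict[str, str] = {
    "technical paper": "title",
    "figure sheet": "caption",
}


def select_keyword_regions(doc_type: str, structure: List[Region]) -> List[Region]:
    """Return the rectangles whose attribute matches the one stored for this document type."""
    wanted = KNOWLEDGE_BASE[doc_type]
    return [region for region in structure if region.attribute == wanted]


# Hypothetical document structure data produced by layout analysis.
structure = [
    Region("header", 50, 20, 700, 40),
    Region("title", 120, 80, 630, 130),
    Region("body", 50, 200, 700, 950),
]
print(select_keyword_regions("technical paper", structure))
```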

【０１５７】また、上述の第４実施例では、知識ベース
４５を構築するために知識獲得支援部４１を用いる必要
があったが、「入力文書中に存在する入力文書のタイト
ルもしくはキーワードに相当する文字列情報に関する情
報」を第４実施例よりも容易に、知識獲得支援部４１を
用いずに定義するようにもできる。この方法を以下に説
明する。この方法は同形式の文書を大量にファイリング
する場合に有効である。In the fourth embodiment described above, the knowledge acquisition support unit 41 had to be used to build the knowledge base 45. However, the "information on the character string information corresponding to the title or keyword of the input document existing in the input document" can also be defined more easily than in the fourth embodiment, without using the knowledge acquisition support unit 41. This method, which is effective when a large number of documents of the same format are filed, is described below.

【０１５８】まず、サンプル文書１枚を入力し、その文
書のレイアウト構造を理解する事により文書構造データ
を抽出し、この結果を図３７に示すように表示部３の画
面５４上に、単語単位の文書構造データ５５、文字行単
位の文書構造データ５６、ブロック単位の文書構造デー
タ５７としてそれぞれ表示する。そして、オペレータ
は、表示されたものの内「入力文書に付与すべきタイト
ルあるいはキーワードとなるべき文字列領域」と見なす
ことのできるものをマウス等で指示する事により「入力
文書中に存在する入力文書のタイトルもしくはキーワー
ドに相当する文字列情報に関する情報」を作成するよう
にしている。すなわち、マウス等で指示されたものに関
する「属性」や「位置」、「大きさ」、「順番」、「論
理関係」、「相互関係」等が自動的に抽出され、体系づ
けられて「入力文書中に存在する入力文書のタイトルも
しくはキーワードに相当する文字列情報に関する情報」
として知識ベース４５に格納されるようになる。First, one sample document is input and its layout structure is analyzed to extract document structure data. As shown in FIG. 37, the result is displayed on the screen 54 of the display unit 3 as word-level document structure data 55, character-line-level document structure data 56, and block-level document structure data 57. The operator then uses a mouse or the like to designate, among the displayed items, the one that can be regarded as "the character string area that should become the title or keyword to be assigned to the input document", thereby creating the "information on the character string information corresponding to the title or keyword of the input document existing in the input document". That is, the "attribute", "position", "size", "order", "logical relationship", "mutual relationship", and so on of the designated item are extracted automatically, organized, and stored in the knowledge base 45 as the "information on the character string information corresponding to the title or keyword of the input document existing in the input document".

【０１５９】（第５実施例）本実施例では、入力画像中
に存在するタイトルやキーワードに相当するキーワード
画像情報について文字認識により得られた各文字認識候
補を特定することなく、この文字認識候補のままで保存
しておき、画像検索の際に複数認識候補の中から最も相
応しい認識結果を画像のタイトルやキーワードとして選
択するようにしたものである。(Fifth Embodiment) In this embodiment, the character recognition candidates obtained by character recognition of the keyword image information corresponding to a title or keyword in the input image are not narrowed down to a single result but are stored as candidates, and at image retrieval the most appropriate recognition result among the multiple candidates is selected as the title or keyword of the image.

【０１６０】図３８は、キ―ワ―ド画像の登録までを説
明するもので、ここでは、入力図面に対してキ―ワ―ド
を付加して光ディスクなどの登録するものを示してい
る。FIG. 38 illustrates the procedure up to registration of a keyword image. Here, a keyword is added to an input drawing, which is then registered on an optical disk or the like.

【０１６１】この場合、入力図面６１は、スキャナ等の
画像入力装置６２より画像操作することで画像デ―タベ
―ス６３に格納されるようになる。ここで、画像はディ
スプレ６４に表示され，オペレ―タは入力画像のキ―ワ
―ドとして登録したい領域をマウスなどを用いて指示を
するようになる。In this case, the input drawing 61 is read in by an image input device 62 such as a scanner and stored in the image database 63. The image is displayed on the display 64, and the operator uses a mouse or the like to designate the area to be registered as the keyword of the input image.

【０１６２】図３９はキ―ワ―ド領域を登録する際の画
面の１例を示している。この場合、ディスプレ６４に表
示された入力図面の画像６４１に対してキ―ワ―ド領域
６４２を示している。この場合のキ―ワ―ド領域６４２
としては、あらかじめ登録されたキ―ワ―ド位置、ある
いは自動検出またはマウスによる直接指定などの操作に
よるキ―ワ―ド位置が与えられる。FIG. 39 shows an example of the screen used when registering a keyword area. Here, the keyword area 642 is shown on the image 641 of the input drawing displayed on the display 64. The keyword area 642 is given as a keyword position registered in advance, or as a keyword position obtained by an operation such as automatic detection or direct designation with a mouse.

【０１６３】画像デ―タベ―ス６３に格納され入力画像
から切り出されたキ―ワ―ド画像６５は、文字認識装置
６６に送られ、ここで文字認識が行われる。そして、そ
の認識候補は、すべて文字認識候補格納部６７に格納さ
れる。この場合、文字認識候補格納部６７は、図３８
（ｂ）に示すように認識順位の高いものから順に格納す
るようにしている。The keyword image 65, cut out from the input image stored in the image database 63, is sent to the character recognition device 66, where character recognition is performed. All of the resulting recognition candidates are stored in the character recognition candidate storage unit 67. As shown in FIG. 38(b), the character recognition candidate storage unit 67 stores them in descending order of recognition rank.

【０１６４】その後、入力図面が上述の手続きに従って
順次登録される。なお、キ―ワ―ドの登録やキ―ワ―ド
画像の切り出し処理、文字認識処理等は必ずしも画像登
録毎に行う必要はなくバッチ的に処理してもよい。Thereafter, the input drawings are sequentially registered according to the above procedure. Note that the keyword registration, the keyword image cutout processing, the character recognition processing, and the like do not necessarily have to be performed for each image registration, and may be performed in batches.

【０１６５】このようにして本実施例では、文字認識候
補から文字認識結果を一意に決定するのでなく文字認識
候補も含めてキ―ワ―ドとして登録されるようになる。As described above, in this embodiment, the character recognition result is not uniquely determined from the character recognition candidates but is registered as a keyword including the character recognition candidates.

【０１６６】図４０は、キ―ワ―ド画像の検索を説明す
るためのものである。この場合、オペレ―タにより与え
られるキ―ワ―ドに基づいて画像デ―タベ―ス６３より
図面画像を選択する。つまり、オペレ―タより与えられ
るキ―ワ―ド６８と文字認識候補格納部６７とを検索装
置６９で照合し、キ―ワ―ド６８として保存された図面
画像７０を画像デ―タベ―ス６３より取り出すようにな
る。FIG. 40 illustrates retrieval by keyword. A drawing image is selected from the image database 63 on the basis of a keyword given by the operator. That is, the retrieval device 69 collates the keyword 68 given by the operator with the character recognition candidate storage unit 67, and the drawing image 70 stored under the keyword 68 is taken out of the image database 63.

【０１６７】そして、オペレ―タによる確認操作７１に
より文字認識候補が確定されると余分な文字認識候補は
文字認識候補格納部６７から除去され，この段階で文字
認識結果が一意に決定されるようになる。When the character recognition candidate is then fixed by the operator's confirmation operation 71, the surplus character recognition candidates are removed from the character recognition candidate storage unit 67, and the character recognition result is uniquely determined at this stage.

【０１６８】次に、文字認識候補格納部６７について詳
述する。図４１（ａ）は切り出されたキ―ワ―ド画像７
２を示している。そして、このキ―ワ―ド画像７２に対
して１文字切り出し処理を施すことにより、同図（ｂ）
に示す文字画像７２１から７２７を切り出して文字認識
を行う。Next, the character recognition candidate storage unit 67 is described in detail. FIG. 41(a) shows a cut-out keyword image 72. Individual characters are segmented from this keyword image 72, yielding the character images 721 to 727 shown in FIG. 41(b), and character recognition is performed on them.

【０１６９】文字認識結果は文字認識処理で評価され
る’類似度値’の大きい順に並べ換えられる。図４１
（ｃ）図は、切り出された文字画像７２１から７２７の
文字認識結果７３を示している。ここでは、類似度値は
０から１０００まで取り得る。The character recognition results are sorted in descending order of the "similarity value" evaluated in the character recognition processing. FIG. 41(c) shows the character recognition results 73 for the cut-out character images 721 to 727. Here, the similarity value can range from 0 to 1000.

【０１７０】図４２（ａ）は、文字認識結果から３位ま
でを文字認識候補として選択した場合を示し、同図
（ｂ）は文字カテゴリ毎に異なる候補数を与えた場合を
示している。この場合、１位の文字認識結果の文字カテ
ゴリにより候補数が決まる。図４２（ｂ）図の場合、文
字’Ｔ’と’Ｙ’ならば１つ、文字’−’ならば２つ、
文字’０’、’Ｏ’、’Ｂ’、’１’ならば４つと予め
与えられている。FIG. 42(a) shows the case where the top three character recognition results are selected as character recognition candidates, and FIG. 42(b) shows the case where a different number of candidates is given for each character category. In the latter case, the number of candidates is determined by the character category of the first-ranked recognition result. In FIG. 42(b), the numbers are given in advance as one for the characters 'T' and 'Y', two for the character '-', and four for the characters '0', 'O', 'B', and '1'.
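The two rank-based selection rules of FIG. 42(a) and FIG. 42(b) can be sketched as follows. The per-category counts are those stated in the text; the similarity values in the example, the default count for unlisted characters, and the function names are assumptions for illustration.

```python
from typing import Dict, List, Tuple

Result = Tuple[str, int]  # (recognized character, similarity value in 0..1000)


def rank(results: List[Result]) -> List[Result]:
    """Sort recognition results in descending order of similarity value."""
    return sorted(results, key=lambda r: r[1], reverse=True)


def top_n(ranked: List[Result], n: int = 3) -> List[str]:
    """FIG. 42(a) style: keep the first n ranked results."""
    return [ch for ch, _ in ranked[:n]]


# Candidate counts per first-ranked character, as stated for FIG. 42(b).
CATEGORY_COUNTS: Dict[str, int] = {"T": 1, "Y": 1, "-": 2, "0": 4, "O": 4, "B": 4, "1": 4}


def by_category(ranked: List[Result], default: int = 3) -> List[str]:
    """FIG. 42(b) style: the first-ranked character decides how many candidates to keep.

    The default for characters not in the table is an assumption of this sketch.
    """
    if not ranked:
        return []
    n = CATEGORY_COUNTS.get(ranked[0][0], default)
    return [ch for ch, _ in ranked[:n]]


# Made-up similarity values for one character image.
ranked = rank([("O", 870), ("0", 910), ("D", 640), ("Q", 600), ("C", 420)])
print(top_n(ranked))        # ['0', 'O', 'D']
print(by_category(ranked))  # ['0', 'O', 'D', 'Q']
```

Keeping more candidates for easily confused characters such as '0' and 'O' trades storage for a lower risk of dropping the correct character.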

【０１７１】図４２（ｃ）は、文字認識処理における類
似度値により候補を選択する例である。類似度値が予め
与えた値よりも大きい文字認識結果のみを文字認識候補
として登録することが考えられる。あるいは１位の類似
度値が十分大きく、かつ２位との類似度値の差も大きい
のなら１位の認識結果のみを文字認識候補として登録す
るが２位との類似度値の差が余り大きくないのなら文字
認識結果の１位と２位を共に登録するような方法も適用
可能である。図４２（ｄ）は各文字のサブセット（文字
種）に関する情報を用いて文字認識結果より文字認識候
補を選択した場合である。ここで、キ―ワ―ド画像７２
のサブセット情報として１文字目：英文字大文字２文字目：英文字大文字３文字目：英文字大文字４文字目：英文字大文字５文字目：記号（’−’または’／’）６文字目：英文字大文字７文字目：数字を用いることができる。この際、”５文字目は必ず’
−’（ハイフン）である”といった制限を設定すること
も可能である。このサブセットに関する情報はキ―ワ―
ド画像登録時にオペレ―タにより設定される。FIG. 42(c) is an example of selecting candidates according to the similarity value in the character recognition processing. One option is to register as character recognition candidates only those recognition results whose similarity value is greater than a preset value. Another is to register only the first-ranked result when its similarity value is sufficiently large and its difference from the second-ranked value is also large, and to register both the first- and second-ranked results when that difference is not very large. FIG. 42(d) shows the case where character recognition candidates are selected from the recognition results using information on the subset (character type) of each character. The following subset information can be used for the keyword image 72: 1st character: uppercase letter; 2nd character: uppercase letter; 3rd character: uppercase letter; 4th character: uppercase letter; 5th character: symbol ('-' or '/'); 6th character: uppercase letter; 7th character: digit. A restriction such as "the 5th character is always '-' (hyphen)" can also be set. The subset information is set by the operator when the keyword image is registered.
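The similarity-based rules of FIG. 42(c) and the subset filter of FIG. 42(d) can be sketched in the same way. The threshold values and the candidate characters below are invented for the example, and the handling of a first-ranked result that is not "sufficiently large" (both results are kept) is an assumption, since the text does not specify it. The seven-position pattern follows the subset information listed above.

```python
import string
from typing import List, Tuple

Result = Tuple[str, int]  # (character, similarity value), sorted best first


def above_threshold(ranked: List[Result], threshold: int) -> List[str]:
    """Keep only results whose similarity value is greater than a preset value."""
    return [ch for ch, sim in ranked if sim > threshold]


def first_or_first_two(ranked: List[Result], high: int, margin: int) -> List[str]:
    """Keep rank 1 alone when it is clearly ahead; otherwise keep ranks 1 and 2."""
    if len(ranked) < 2:
        return [ch for ch, _ in ranked]
    (c1, s1), (c2, s2) = ranked[0], ranked[1]
    if s1 >= high and s1 - s2 >= margin:
        return [c1]
    return [c1, c2]  # assumption: also used when rank 1 itself is not large enough


SUBSETS = {
    "upper": set(string.ascii_uppercase),
    "digit": set(string.digits),
    "symbol": {"-", "/"},
}
# Subset (character type) of each of the seven positions, as listed in the text.
KEYWORD_PATTERN = ["upper", "upper", "upper", "upper", "symbol", "upper", "digit"]


def filter_by_subset(per_position: List[List[str]], pattern: List[str] = KEYWORD_PATTERN) -> List[List[str]]:
    """Drop candidates that do not belong to the subset allowed at their position."""
    return [[ch for ch in cands if ch in SUBSETS[kind]]
            for cands, kind in zip(per_position, pattern)]


candidates = [["T", "I", "1"], ["O", "0", "D"], ["S", "5"], ["H"],
              ["-", "1"], ["B", "8"], ["1", "I", "l"]]
print(filter_by_subset(candidates))
# [['T', 'I'], ['O', 'D'], ['S'], ['H'], ['-'], ['B'], ['1']]
```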

【０１７２】キ―ワ―ド画像７２の文字認識において１
文字として切り出す際にもいくつかの候補が存在するこ
とがある。図４１（ｂ）の場合、文字画像７２６は１文
字として切り出されたが、文字’１’と’３’が接触し
て生成されたパタ―ンである可能性もある。そこで、図
４３に示すように文字切り出しの候補も含めた文字認識
候補を出力するようにもできる。この場合、かすれある
いは分離文字の存在などを考慮した文字切り出しの候補
も考えられる。また、キ―ワ―ド登録時にキ―ワ―ド文
字数などを与えることができるのなら文字切り出しの候
補を少なくすることができる。In character recognition of the keyword image 72, there may also be several candidates for how a single character is cut out. In FIG. 41(b), the character image 726 was cut out as one character, but it may be a pattern produced by the characters '1' and '3' touching each other. Character recognition candidates that include such segmentation candidates can therefore be output, as shown in FIG. 43. Segmentation candidates that take blurring or the presence of separated characters into account are also conceivable. If the number of characters in the keyword can be given when the keyword is registered, the number of segmentation candidates can be reduced.

【０１７３】文字認識候補の選択例を上で述べたが文字
認識候補格納部６７のデ―タ量を圧縮する必要がないの
なら文字認識結果７３をそのまま用いることもできる。
また、文字認識候補の選択手法は組み合わせて用いるこ
ともできる。Although an example of selecting character recognition candidates has been described above, if it is not necessary to compress the data amount of the character recognition candidate storage unit 67, the character recognition result 73 can be used as it is.
Further, the character recognition candidate selection methods may be used in combination.

【０１７４】文字認識において文字画像へのノイズの付
加、かすれなどにより、あるいは１文字切り出しの誤り
などにより文字認識結果が却下されることがある。In character recognition, the character recognition result may be rejected due to addition of noise to the character image, blurring, or an error in cutting out one character.

【０１７５】図４４は、文字認識結果が却下された場合
の処理を説明するためのもので、この場合、図示７４か
ら８０までの処理は、図３８で述べたと同様である。そ
して、文字認識装置８０での文字認識結果が却下された
場合は（図示８１）、オペレ―タに対してキ―ボ―ドな
どを用いてキ―ワ―ドをタイプするかどうか尋ねる（図
示８２）。ここで、Ｙｅｓならオペレ―タのタイピング
によるキ―ワ―ドの登録処理８３によりキ―ワ―ドが文
字認識候補格納部８４に格納される。なお、タイピング
されたキ―ワ―ドは文字認識候補の１位の欄に登録され
る。FIG. 44 illustrates the processing performed when the character recognition result is rejected. The processing from 74 to 80 in the figure is the same as that described for FIG. 38. When the character recognition result of the character recognition device 80 is rejected (81 in the figure), the operator is asked whether to type the keyword using a keyboard or the like (82 in the figure). If the answer is Yes, the keyword is stored in the character recognition candidate storage unit 84 by the registration processing 83 for a keyword typed by the operator. The typed keyword is registered in the first-rank column of the character recognition candidates.

[0176] If, on the other hand, the operator's answer at 82 is No, character recognition is performed again with changed parameters. Concretely, the sensitivity of the image input device 75 may be changed and processing restarted from image input, or the parameters used in character recognition may be changed. It is also possible to return to the operator's designation of the registration keyword (77) and ask the operator to confirm the cut-out position. When processing returns to input of the drawing image, the input drawing must be set in the image input device 75 again. However, if the image input device 75 can handle multi-level images, the input drawing can be stored in the multi-level image memory 85 as a multi-level image and binarized (86) to obtain a drawing image equivalent to one captured with a different input sensitivity. If image filing is performed in batch mode, the confirmation and correction operation for drawings whose keyword registration was rejected is carried out after all drawings have been registered.
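As an illustration of re-binarizing a stored multi-level image instead of rescanning the drawing, the following minimal sketch may help. It is not taken from the specification: the function name `binarize`, the convention that 0 is black and 255 is white, and the threshold values are all assumptions made for the example.

```python
def binarize(gray, threshold):
    """Return a binary image: 1 (black) where the gray level is below threshold.

    Assumption for this sketch: gray levels run from 0 (black) to 255 (white).
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


# Hypothetical multi-level image kept in memory after a single scan.
stored = [[0, 100, 200],
          [250, 90, 30]]

# Re-binarizing with another threshold stands in for changing the input sensitivity.
normal = binarize(stored, 128)
lighter = binarize(stored, 64)
```

Lowering the threshold keeps only the darkest pixels, which is one plausible way to emulate a less sensitive scan before character recognition is retried.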

[0177] Here, the character recognition result of a keyword is not uniquely determined at registration; it is confirmed at the time of a later search. Accordingly, if keyword-related processing is requested before any search, for example display of a keyword list, only the first-place recognition candidate is used.

[0178] Next, retrieval of keyword images will be described with reference to FIG. 45.

[0179] In this case, keyword candidates 88 consisting of the same number of characters as the keyword 90 to be searched are extracted from the character recognition candidate storage unit 87. Each keyword candidate 88 is then evaluated for similarity to the keyword 90, and the registered keyword 89 corresponding to the keyword 90 is determined. Similarity can be judged by whether the characters of the keyword are contained in the character recognition candidates. The drawing image corresponding to the registered keyword 89 is selected from the image database and presented to the operator. After the operator's confirmation, the character recognition result of the keyword is uniquely determined, and the character recognition candidates 87 are corrected in accordance with the character recognition candidates 91 for the keyword. If the character recognition result has already been uniquely determined, the character recognition candidates 87 are not corrected.

[0180] The similarity between character recognition candidates and a keyword will be described with reference to FIGS. 46(a) and 46(b). Here, each character of the keyword is scored 10 points when it matches the first-place candidate, 9 points when it matches the second-place candidate, and so on, and similarity is evaluated by the total. In the illustrated example, the similarity of the keyword 92 to the character recognition candidates 93 is evaluated as 39 points, and its similarity to the character recognition candidates 94 as 37 points.
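The rank-based scoring described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `similarity`, the point table beyond second place, and the sample keyword and candidate lists are hypothetical and are not the contents of FIG. 46; the sample data were merely chosen so that the totals come out to 39 and 37.

```python
def similarity(keyword, candidates, points=(10, 9, 8, 7, 6)):
    """Sum rank-based points over the characters of the keyword.

    candidates[i] is the ranked candidate list (best first) for the i-th
    character. A character absent from its list scores 0 (an assumption).
    """
    if len(keyword) != len(candidates):
        return 0
    total = 0
    for ch, ranked in zip(keyword, candidates):
        if ch in ranked:
            rank = ranked.index(ch)
            if rank < len(points):
                total += points[rank]
    return total


# Hypothetical data: one ranked candidate list per character position.
keyword = "AB12"
candidates_a = [["A", "4"], ["8", "B"], ["1", "I"], ["2", "Z"]]
candidates_b = [["A"], ["8", "3", "B"], ["1"], ["Z", "2"]]

score_a = similarity(keyword, candidates_a)  # 10 + 9 + 10 + 10
score_b = similarity(keyword, candidates_b)  # 10 + 8 + 10 + 9
```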

[0181] Various scoring schemes are conceivable: weighting first place more heavily, for example 10 points for first place, 7 for second, and 5 for third; using the similarity values obtained during character recognition; or giving similar characters the same score. Under the scheme that gives similar characters the same score, the second-place candidate for the second character of the character recognition candidates 93 receives the same score as first place, so the similarity to the keyword 92 is evaluated as 40 points.
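A minimal sketch of two of these variants, a first-place-weighted point table and equal credit for similar characters, is given below. The names `similarity_with_similar`, `similar`, and the sample data are hypothetical; in particular, treating '8' as similar to 'B' is an assumption made so that the example reaches 40 points, and the actual candidates of FIG. 46 are not reproduced here.

```python
def similarity_with_similar(keyword, candidates, similar, points=(10, 9, 8)):
    """Score each character by the best-ranked candidate that is either the
    character itself or one registered as similar to it."""
    total = 0
    for ch, ranked in zip(keyword, candidates):
        accepted = {ch} | similar.get(ch, set())
        for rank, cand in enumerate(ranked[:len(points)]):
            if cand in accepted:
                total += points[rank]
                break
    return total


# Hypothetical data: 'B' appears in second place behind the look-alike '8'.
keyword = "AB12"
candidates = [["A", "4"], ["8", "B"], ["1", "I"], ["2", "Z"]]

plain = similarity_with_similar(keyword, candidates, {})
weighted = similarity_with_similar(keyword, candidates, {}, points=(10, 7, 5))
lenient = similarity_with_similar(keyword, candidates, {"B": {"8"}})
```

With no similar characters registered the total is 39; with the 10/7/5 table it drops to 37; and once '8' counts as similar to 'B', the second character earns the first-place score and the total becomes 40.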

[0182] Returning to FIG. 45, the retrieval device detects the registered keyword 89 corresponding to the keyword 90 given by the operator according to the similarity described above. The detection method will be described with reference to FIG. 47.

[0183] In the first method, a candidate whose similarity exceeds a predetermined score is detected as the registered keyword; in the example of FIG. 47(a), keyword candidate 95 is detected from among keyword candidates 95 to 97. In the second method, similarity is evaluated against all keyword candidates and the one with the highest similarity is detected as the registered keyword; in the example of FIG. 47(a), keyword candidate 97 is detected from among keyword candidates 95 to 97. With the second method, keyword candidates 98 and 99 may tie, as shown in FIG. 47(b); in that case, either the similarity evaluation method is changed or the tied candidates are presented to the operator for selection. Changing the evaluation method may mean changing the point allocation for first, second, and third place.
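The two detection methods can be contrasted with a short sketch. The function names, the scores, and the threshold of 35 are hypothetical; the specification does not give the values of FIG. 47, and reading the first method as "return the first candidate that exceeds the threshold" is an assumption.

```python
def detect_first_above(scored, threshold):
    """First method: return the first candidate whose score exceeds the threshold."""
    for name, score in scored:
        if score > threshold:
            return name
    return None


def detect_best(scored):
    """Second method: return every candidate that shares the highest score.

    More than one name in the result means a tie that must be resolved by
    rescoring or by asking the operator.
    """
    if not scored:
        return []
    best = max(score for _, score in scored)
    return [name for name, score in scored if score == best]


# Hypothetical scores for keyword candidates 95 to 97.
scored = [("candidate 95", 38), ("candidate 96", 30), ("candidate 97", 40)]
first = detect_first_above(scored, threshold=35)
best = detect_best(scored)

# Hypothetical tie between candidates 98 and 99.
tied = detect_best([("candidate 98", 36), ("candidate 99", 36)])
```

The first method can stop early but may return a candidate that is not the best one; the second examines every candidate and exposes ties explicitly.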

[0184] Next, an example in which the operator makes the selection will be described with reference to FIG. 48.

[0185] FIG. 48(a) shows drawing images 100 to 105, corresponding to the tied keyword candidates, displayed on the display 106. The operator can thereby select the drawing image that corresponds to the keyword. FIG. 48(b) shows keyword images 107 to 112, taken from the drawing images corresponding to the tied keyword candidates, displayed on the display 113; the operator selects a keyword image, and the corresponding drawing image is thereby determined. To present keyword images, the keyword position must be stored together with the character recognition candidates.

[0186] The retrieval device may fail to detect a registered keyword. In that case, similar patterns are generated. FIG. 49 illustrates this.

[0187] In the figure, the keyword image 114 has been character-recognized to obtain character recognition candidates 115. When the operator supplies the keyword 116, the retrieval device 117 searches for the registered keyword. If no character recognition candidates are detected for the keyword 116, the similar pattern generator 119 automatically generates similar patterns 120 of the keyword on the basis of similar-character data registered in advance in the similar character database 118, and the retrieval device 121 examines the similarity between those patterns and the character recognition candidates to detect the registered keyword. If the keyword still cannot be found, the operator selects directly, as shown at 122.

[0188] FIG. 50 illustrates the similar pattern generator 119. Similar characters are registered in the similar character database 118 for each character. These include not only similar characters but also character segmentation candidates. For the character 'B' in FIG. 50, '8', '0', ..., '13', ... are registered as similar characters. A similar pattern need not be a visually similar character; it may instead reflect the misrecognition tendencies of the character recognition device. That is, if the character recognition device in use misrecognizes the pattern of the character '責' as the character '貴', then '貴' may be registered as a similar pattern of '責'.
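One way to picture similar pattern generation is as substitution of each keyword character by its registered alternatives. The sketch below is an assumption-laden illustration: the function name `similar_patterns`, the `limit` parameter, and the tiny database are hypothetical, and generating every combination is only one possible strategy, not necessarily the one used by the generator 119.

```python
from itertools import islice, product


def similar_patterns(keyword, similar_db, limit=100):
    """Generate variants of keyword by replacing characters with entries from
    similar_db, which maps a character to its registered alternatives.

    Alternatives may be longer than one character, which covers segmentation
    candidates such as 'B' being read as '13'.
    """
    options = [[ch] + list(similar_db.get(ch, [])) for ch in keyword]
    combos = ("".join(parts) for parts in product(*options))
    # The first combination is the keyword itself, so take one extra and drop it.
    return [c for c in islice(combos, limit + 1) if c != keyword][:limit]


# Hypothetical database entry modeled on the 'B' example in the text.
similar_db = {"B": ["8", "13"]}
patterns = similar_patterns("AB", similar_db)
```

Each generated pattern would then be scored against the stored character recognition candidates in the same way as the original keyword. The `limit` cap is there because the number of combinations grows multiplicatively with keyword length.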

[0189] Another way to handle the case where the retrieval device cannot detect the registered keyword is to redo character recognition. This is possible if the keyword image cut-out position is stored together with the character recognition candidates. Analyzing and recognizing the keyword image with the specific aim of detecting the given keyword may yield a correct recognition.

[0190] As shown in FIG. 51, data whose character recognition result has been confirmed by a search may also be clearly distinguished from unconfirmed data. In the preceding example, data was treated as confirmed when the character recognition candidates had been narrowed to one; as shown in FIG. 51, confirmed data may instead be stored on the optical disk 123 and unconfirmed data on the magnetic disk 124. That is, when an input keyword image is character-recognized and the result is judged reliable, it is stored on the optical disk 123 as confirmed data; when the result is unreliable and multiple character recognition candidates exist, it is stored temporarily on the magnetic disk 124 as unconfirmed data.

[0191] The retrieval device 125 searches both the unconfirmed data on the magnetic disk 124 and the confirmed data on the optical disk 123. When unconfirmed data is detected as matching the search keyword, the data is confirmed through confirmation work 126; that is, the unconfirmed data on the magnetic disk 124 is moved to the optical disk 123, which holds the confirmed data.
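The split between confirmed and unconfirmed data can be sketched with two in-memory tables standing in for the two disks. Everything here is hypothetical: the class name `KeywordStore`, its methods, the image identifiers, and the simple containment test used for matching are assumptions for illustration, not the storage format of the embodiment.

```python
class KeywordStore:
    """Two tables standing in for the confirmed and unconfirmed storage areas."""

    def __init__(self):
        self.confirmed = {}    # image_id -> keyword string
        self.unconfirmed = {}  # image_id -> ranked candidate list per character

    def register(self, image_id, candidates, reliable):
        """Store the top candidates as confirmed if recognition was reliable,
        otherwise keep all candidates as unconfirmed data."""
        if reliable:
            self.confirmed[image_id] = "".join(c[0] for c in candidates)
        else:
            self.unconfirmed[image_id] = candidates

    def search(self, keyword):
        """Search confirmed keywords first, then unconfirmed candidates."""
        hits = [i for i, k in self.confirmed.items() if k == keyword]
        for image_id, cands in self.unconfirmed.items():
            if len(cands) == len(keyword) and all(
                ch in ranked for ch, ranked in zip(keyword, cands)
            ):
                hits.append(image_id)
        return hits

    def confirm(self, image_id, keyword):
        """Move an entry from unconfirmed to confirmed after operator approval."""
        if image_id in self.unconfirmed:
            del self.unconfirmed[image_id]
            self.confirmed[image_id] = keyword


store = KeywordStore()
store.register("img1", [["A"], ["B"]], reliable=True)
store.register("img2", [["A", "4"], ["8", "B"]], reliable=False)
hits = store.search("AB")
store.confirm("img2", "AB")
```

After `confirm`, the entry for the second image no longer carries its candidate lists, mirroring the move of data from the unconfirmed area to the confirmed area.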

[0192] If the retrieval device cannot detect the registered keyword by any of these methods, the drawing images, or keyword images, other than those whose character recognition results have already been confirmed by earlier searches are shown on the display as in FIG. 48, and the operator selects from them.

[0193] In this way, character recognition is performed on the keyword image cut out from the keyword area designated on the input image, the recognition result is stored together with the character recognition candidates, images are retrieved according to similarity to the keyword given at search time, and the recognition result is uniquely determined from among the candidates according to whether the presented image is the desired one. Even when recognition of the keyword image is uncertain, the correct result is retained among the multiple candidates and can be uniquely determined at search time, so there is no need to determine it uniquely at image input, and appropriate image retrieval is possible.

[0194] Depending on how many recognition candidates are stored, the correct recognition result may not be among them. It is therefore effective to let the operator, for images considered important at input time, either check, correct, and confirm the recognition result or enter the keyword directly without relying on keyword image recognition, and to apply the method of this embodiment when the operator does not do so.

【０１９５】[0195]

[Effects of the Invention] According to the present invention, when the operator designates an arbitrary point on the input image displayed on the screen, an appropriate area is first set and the arrangement direction of the character string components of the input image within the area is determined. Based on this determination, the area is expanded along the arrangement direction, character recognition and coding are performed on the character string components in the expanded area, and the input image is filed with the coded character string as a keyword. The operator can therefore enter a keyword such as title information for the input image, recognized and coded, simply by designating a single point on the input image, which significantly reduces the operator's filing workload.

[0196] Further, according to the present invention, the physical position and size of blocks formed by integrating character string components extracted from the input image in a predetermined relationship are analyzed, the semantic connection relationships between the blocks are obtained, and attribute information is assigned to each block. From these semantic connection relationships and attribute information, a character string component that corresponds to the title or a keyword of the input image can be detected accurately wherever it is written. Consequently, even when the desired character string in documents of the same type varies in number of characters or lines or is displaced, the character string component that should serve as the keyword can be detected accurately, and documents in a wide variety of formats can be filed efficiently in a batch with search titles assigned automatically. In addition, information about the title character string written in an input document can be set interactively with the device through appropriate presentation on the display unit, which greatly reduces the operator's filing workload.

[0197] Furthermore, according to the present invention, character recognition is performed on the keyword image cut out from the keyword area designated on the input image, the recognition result is stored together with the character recognition candidates, and images whose candidates are judged highly similar to the keyword given at search time are presented preferentially. Even when recognition of the keyword image is uncertain, the correct result is retained among the multiple recognition candidates, and at search time the candidates most similar to the keyword are selected and the corresponding images presented, so the operator can obtain the desired image. In other words, appropriate image retrieval is possible without fixing the recognition result of the keyword image. Moreover, when an image is presented and the operator judges it to be the desired image and gives a confirmation instruction, the recognition result for that image's keyword may be uniquely determined to be the keyword used in the search.

[Brief description of drawings]

FIG. 1 is a diagram showing the schematic configuration of a first embodiment of the present invention.

FIG. 2 is a diagram for explaining filing processing according to the first embodiment.

FIG. 3 is a diagram showing an example of a document subjected to filing processing in the first embodiment.

FIG. 4 is a diagram for explaining filing processing according to the first embodiment.

FIG. 5 is a diagram for explaining filing processing according to the first embodiment.

FIG. 6 is a diagram showing an example of generating black pixel connected components in the first embodiment.

FIG. 7 is a diagram showing an example of integrating adjacent black pixel connected components in the first embodiment.

FIG. 8 is a diagram showing an example of extracting black pixel connected components that may form a character string in the first embodiment.

FIG. 9 is a diagram showing an example of expanding the processing range in the character string direction in the first embodiment.

FIG. 10 is a diagram showing a pointing device applied to the first embodiment.

FIG. 11 is a diagram illustrating the correspondence between position coordinates obtained by designation with the pointing device of FIG. 10 and character string components.

FIG. 12 is a diagram illustrating a method of cutting out character patterns applied to the first embodiment.

FIG. 13 is a diagram for explaining expansion of a character string area in a second embodiment of the present invention.

FIG. 14 is a diagram illustrating the correspondence between position coordinates obtained by designation with a pointing device and character string components in a third embodiment of the present invention.

FIG. 15 is a diagram showing the schematic configuration of a fourth embodiment of the present invention.

FIG. 16 is a diagram for explaining filing processing according to the fourth embodiment.

FIG. 17 is a flowchart for explaining filing processing according to the fourth embodiment.

FIG. 18 is a diagram showing generation of black pixel connected components in the fourth embodiment.

FIG. 19 is a diagram showing generation of character line components in the fourth embodiment.

FIG. 20 is a diagram showing generation of blocks in the fourth embodiment.

FIG. 21 is a diagram showing an example of document structure data in the fourth embodiment.

FIG. 22 is a diagram illustrating a method of cutting out character patterns in the fourth embodiment.

FIG. 23 is a diagram for explaining a knowledge acquisition support unit used in the fourth embodiment.

FIG. 24 is a diagram for explaining a knowledge acquisition support unit used in the fourth embodiment.

FIG. 25 is a diagram showing an example of a document filed in the fourth embodiment.

FIG. 26 is a diagram showing a window representation in the fourth embodiment.

FIG. 27 is a diagram showing a window representation in the fourth embodiment.

FIG. 28 is a diagram showing a window representation in the fourth embodiment.

FIG. 29 is a diagram showing a window representation in the fourth embodiment.

FIG. 30 is a diagram showing a window representation in the fourth embodiment.

FIG. 31 is a diagram showing a window representation in the fourth embodiment.

FIG. 32 is a diagram showing a window representation in the fourth embodiment.

FIG. 33 is a diagram showing a window representation in the fourth embodiment.

FIG. 34 is a diagram for explaining a knowledge acquisition support unit used in the fourth embodiment.

FIG. 35 is a diagram showing a window representation in the fourth embodiment.

FIG. 36 is a diagram showing a window representation in the fourth embodiment.

FIG. 37 is a diagram for explaining a knowledge acquisition support unit used in another embodiment.

FIG. 38 is a diagram for explaining keyword image registration in a fifth embodiment.

FIG. 39 is a diagram showing an example of the screen displayed when a keyword area is registered in the fifth embodiment.

FIG. 40 is a diagram for explaining keyword image search in the fifth embodiment.

FIG. 41 is a diagram for explaining a character recognition candidate storage unit used in the fifth embodiment.

FIG. 42 is a diagram for explaining selection of character recognition candidates in the fifth embodiment.

FIG. 43 is a diagram showing the output of character recognition candidates, including character segmentation candidates, in the fifth embodiment.

FIG. 44 is a diagram for explaining the processing performed when the character recognition result is rejected in the fifth embodiment.

FIG. 45 is a diagram for explaining retrieval of keyword images in the fifth embodiment.

FIG. 46 is a diagram for explaining the similarity between character recognition candidates and a keyword in the fifth embodiment.

FIG. 47 is a diagram for explaining detection of registered keywords in the fifth embodiment.

FIG. 48 is a diagram for explaining selection of keyword candidates by the operator in the fifth embodiment.

FIG. 49 is a diagram for explaining automatic generation of similar patterns in the fifth embodiment.

FIG. 50 is a diagram showing the similar pattern generator used in FIG. 49.

FIG. 51 is a diagram for explaining the handling of confirmed and unconfirmed data in the fifth embodiment.

[Explanation of symbols]

1 ... operation unit, 2 ... control unit, 3 ... display unit, 4 ... image input unit, 5 ... output unit, 6 ... image compression unit, 7 ... recognition target character string extraction unit, 8 ... character recognition unit, 9 ... dictionary unit, 10 ... registration unit, 11 ... optical disk, 12 ... storage unit, 17 ... image, 18 ... title character string, 19 ... central part, 20, 201 ... processing range, 21, 22 ... black pixel connected component, 23 ... character candidate rectangle, 24 ... mouse body, 25, 26, 27 ... button, 28, 31 ... character string area, 29, 30 ... coordinates, 41 ... knowledge acquisition support unit, 42 ... document structure analysis unit, 43 ... document structure data generation unit, 44 ... recognition target character string extraction unit, 45 ... knowledge base, 50 ... column position, 51 ... screen, 52, 52a to 52g ... window, 53 ... icon, 55 ... document structure data in word units, 56 ... document structure data in character line units, 57 ... document structure data in block units, 61 ... input screen, 62 ... image input device, 63 ... image database, 64 ... display, 65 ... keyword image, 66 ... character recognition device, 67 ... character recognition candidate storage unit, 68 ... keyword, 69 ... retrieval device, 70 ... drawing image, 71 ... confirmation operation, 118 ... similar character database, 119 ... similar pattern generator, 120 ... similar pattern, 123 ... magnetic disk, 124 ... optical disk.

Claims

[Claims]

1. An image filing system comprising: a display unit that displays an input image on a screen; means for designating the position coordinates of an arbitrary point on the screen of the display unit; recognition target character string extraction means having means for setting an appropriate area from the position coordinates designated by the designating means, means for determining the arrangement direction of character string components from the input image within the area thus set, means for expanding the area along the arrangement direction thus determined, and means for extracting the character string components in the expanded area; and means for performing character recognition and coding on the character string components extracted by the recognition target character string extraction means and filing the coded character string as a keyword of the input image in association with the input image.

2. An image filing system comprising: means for extracting character string components from an input image; means for forming blocks in which the character string components extracted by this means are integrated in a predetermined relationship, analyzing the physical position and size of these blocks, and obtaining the semantic connection relationships between the blocks; means for assigning attribute information to each block from the semantic connection relationships obtained by this means; means for storing information about a predetermined character string in advance; means for extracting the character string component of a block to which the attribute information has been assigned, by referring to the information about the predetermined character string stored in the storing means; and means for performing character recognition and coding on the character string component extracted by this means and filing the coded character string as a keyword of the input image in association with the input image.

3. An image filing system comprising: means for designating a keyword area on an input image; means for cutting out, from the input image, the image of the keyword area designated by this means as a keyword image; means for performing character recognition on the keyword image cut out by this means; means for storing the result of character recognition by this means together with the character recognition candidates; and means for judging the similarity of the character recognition candidates to a keyword given at image retrieval time and preferentially presenting images with high similarity.