JPH05342408A - Document image filing device

Info

Publication number: JPH05342408A
Application number: JP3098013A
Authority: JP
Inventors: Katsuhiko Itonori (糸乗 勝彦); Noboru Shimizu (清水 昇)
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1991-04-04
Filing date: 1991-04-04
Publication date: 1993-12-24

Abstract

PURPOSE: To provide a document image filing device that files an input document image so that each area (text, photograph, graphic) is given the attributes it originally had. CONSTITUTION: The device comprises a document image input device 1 that reads a document; an area separator 3 that separates the character area, photograph area, and graphic area from the document image read by the document image input device 1; a character recognition device 42 that performs character recognition on the image of the character area separated by the area separator 3; a graphic area finishing (clean-copy) device 62 that converts the image of the graphic area into vector graphics and performs finishing; and filing means 43, 53, and 63 that file, for each separated area, characters as character codes, graphics as vector graphic data, and the photograph area as image data.

Description

【発明の詳細な説明】【０００１】【産業上の利用分野】この発明は、文字・写真・図形の
混在する文書画像のファイリング装置に関する。【０００２】【従来の技術】従来の文書画像のファイリング装置にお
いて、文書画像からいくつかの属性（文章、写真、ある
いは図形など）に従った領域を自動的に抽出し、再配置
するシステムが提案されている。たとえば、入力した画
像の各領域を対話処理により抽出し、各領域毎にファイ
リング装置に蓄積し、利用時にはその目的に合わせて、
文書のレイアウトを編集することのできるデータベース
編集装置がある（信学技法，ＰＲＬ８４−１０１，ｐ．
６５〜ｐ．７２）。【０００３】【発明が解決しようとする課題】しかし、これらの従来
のシステムは入力した画像の配置を変更するだけであ
り、各領域が画像として表現されていたので、本来文字
が持つ属性（フォント、文字サイズ等）や図形が持つ属
性（線幅、線種等）などを変更・活用することができ
ず、そのため、ファイル検索後の利用方法が、画像とし
ての非常に限れた形でしかできないという問題があっ
た。また、各領域をすべて画像として扱っているため
に、検索内容の表示は画像入力時の解像度に依存してし
まい、細かい字や図形を奇麗に再現することができなか
った。本発明は、前記問題点を解決するためになされた
もので、入力した文書画像を各属性に従った領域（文章
・写真・図形）ごとに、本来の属性を与えるようにファ
イリングすることのできる文書画像ファイリング装置を
提供することを目的とするものである。【０００４】【課題を解決するための手段】本発明の文書画像ファイ
リング装置は、前記目的を達成するために、文書を読み
取る文書画像入力手段（１）と、その文書画像入力手段
により読み取った文書画像から文字領域、写真領域、お
よび図形領域を分離する領域分離手段（３）と、その領
域分離手段（３）で分離された文字領域の画像に対して
文字認識を行う文字認識手段（４２）と、図形領域の画
像に対してベクトル図形化し、清書処理を行う図形領域
清書手段（６２）と、分離した前記各領域に対して文字
は文字コードとして、図形はベクトル図形データとし
て、写真領域は画像データとしてファイリングするファ
イリング手段（４３，５３，６３）とを備えたことを特
徴とする。【０００５】【作用】文書画像入力手段から入力されたディジタル画
像を、領域分離手段（３）で、文字領域、写真領域およ
び図形領域に分離する。分離された文字領域画像は、文
字認識手段（４２）において文字認識を行い、その認識
結果はファイリング手段（４３）により文字コードとし
てファイリングされる。上記分離された写真領域画像
は、必要に応じて情報圧縮などの処理を施した後、ファ
イリング手段（５３）により、従来と同じように画像デ
ータとしてファイリングされる。さらに、分離された図
形領域画像は、図形領域清書手段（６２）により直線近
似され整形処理された後に、図形データとしてファイリ
ングされる。これらのファイリングをすることにより、
文字領域はコードとして記録され、文書編集装置などに
より文書として自由な編集をすることができる。また、
図形は図形データとして記録されるのでＣＡＤなどの図
形データとして高度な編集をすることができるようにな
る。【０００６】【実施例】図１は、本発明の一実施例の文書画像ファイ
リング装置を示したもので、この装置は、文書画像を入
力するための画像入力装置１、入力した画像を記憶する
画像メモリ２、上記画像から文字領域・写真領域・図形
領域を分離するための領域分離装置３、その領域分離装
置３で分離した文字領域、写真領域、図形領域をそれぞ
れ記憶する文字領域画像メモリ４１，写真領域画像メモ
リ５１、図面領域画像メモリ６１を備えている。さら
に、この文書画像ファイリング装置は、文字領域画像メ
モリ４１中の文字領域画像の文字を認識する文字認識装
置４２、写真領域画像メモリ５１の中の写真領域画像を
情報圧縮する写真領域圧縮装置５２、図形領域画像メモ
リ６１中の図形領域画像を直線近似し清書を行う図形領
域清書装置６２を備えている。上記で処理された結果を
ファイリングするために、文章ファイル装置４３、写真
ファイル装置５３、図形ファイル装置６３が接続されて
いる。【０００７】画像入力装置１では、文字・写真・図形が
混在している文書画像をディジタルデータとして入力
し、画像メモリ２に記憶する。図２は、文書画像の一例
を示すものである。図２で見られるように、文書画像２
１は文字領域４１１、写真領域５１１、図形領域６１１
を混在して持っている。しかし、画像として入力しただ
けでは、前述のように本来文字が持つ属性（フォント、
文字サイズ等）や図形が持つ属性（線幅、線種等）を変
更・活用することができない。そのため、ファイル検索
後の利用方法が、画像としての非常に限れた形でしかで
きなかった。本実施例では、ファイル検索後の利用価値
を広げるために、この後の処理では文書画像２１に含ま
れる各領域に対して本来の属性を与えるような処理を行
い、さらにその属性を持たせたままで、各領域別々にフ
ァイル装置に蓄積するようにしている。【０００８】画像メモリ２中の文書画像２１は、領域分
離装置３により文字領域４１１、写真領域５１１、およ
び図形領域６１１に分割される。図３（ａ），（ｂ），
（ｃ）は、文書画像２１から分離した、文字領域、写真
領域、図形領域を示している。図４は、図形領域を分離
するための領域分離装置３の一例を示すものである。図
４に示すように、この領域分離装置３は、多値で入力さ
れた文字線画像、写真画像の混在する画像に対して２値
化を行う２値化回路３１と、２値化された画像に対して
孤立点除去を行う孤立点除去回路３２と、孤立点除去を
行った画像に対して輪郭を抽出する輪郭抽出回路３３
と、２値画像と輪郭画像をＭ×Ｎ画素のブロックで扱
い、Ｍ×Ｎ画素内に存在する２値の黒画素の数と輪郭抽
出された画素との比により中間調領域を識別する写真領
域判定回路３４と、輪郭抽出された画像をチェーン符号
化するチェーン符号化回路３５と、チェーン符号化され
た各々の黒画素連の方向の変化を計数する方向変化計数
回路３６と、その計数された方向の変化とその黒画素連
結画像の画素数の関係から文字と図形を判定する文字領
域図形領域判定回路３７と、各領域判定回路３４，３７
の判定結果により、入力画像を各領域に振り分けるため
の切替回路３８を有している。【０００９】孤立点除去回路３２および輪郭検出回路３
３の入力部には、３ラインずつ並列に順次入力画像を取
り込むためのラインメモリ３９１，３９２が設けられ、
また写真領域判定回路３４の入力部には、５ラインずつ
並列に順次画像を取り込むためのラインメモリ３９３が
設けられている。多値の入力画像データは２値化回路３
１により２値化されて、孤立点除去回路３２により、画
像データ中の孤立した画素の除去を行う。例えば、一つ
の着目画素に隣接する８個の画素を見て着目画素が孤立
した画素か否かを判定し、孤立画素であればそれを除去
する。そのためにラインメモリ９１を設けて３ライン分
のデータを参照している。輪郭検出回路３３は、３ライ
ン分のデータを参照することにより、輪郭抽出を行うこ
とができる。すなわち、輪郭抽出部３３は、原画像と、
その原画像を１画素分、右にシフトした画像との論理積
をとり、その処理結果の画像と原画像を１画素分、左に
シフトした画像との論理積をとり、同じように上，下に
１画素分シフトした画像との論理積をとると、４方向に
１画素分収縮した画像ができあがり、この画像と原画像
との排他的論理和をとることによって、輪郭抽出をす
る。写真領域判定回路３４は、写真領域（中間調領域）
と文字線画像の特性の相違に着目して領域の判定を行
う。すなわち、写真領域は画像を２値化したものと輪郭
抽出した後のものとで画素数におおきな差異があるのに
対し、文字線画像はその差異が少ない。従って、２値画
像の画素数と輪郭抽出後の画素数との比を調べ、しきい
値よりも大きい場合に写真領域と判定する。【００１０】チェーン符号化回路３５は、輪郭抽出回路
３３によって輪郭抽出された画像に対して、ラスター走
査を行い、黒画素を見つけ、その画素の８近傍を走査
し、黒画素があったならば、チェーン符号化を行い、そ
の画素を白画素に置き換え、次々とこの処理を８近傍に
黒画素がなくなるまで、くり返す。また、８近傍に複数
の黒画素が存在した場合は、注目した画素以外の黒画素
の位置情報をスタックする。メインの黒画素追跡が終了
した時点でスタックされた画素の位置情報を取り出し、
同様にチェーン符号化をくり返す。このスタックされて
いた画素情報から追跡して抽出したチェーン符号はメイ
ンのチェーン符号とつながっているので、子チェーン符
号とする。スタックにある画素情報を全て取り出した時
点で、最初に行っていたラスター走査を続ける。これに
よって、画像全てのチェーン符号化が行われる。【００１１】方向変化計数回路３６は、チェーン符号化
された各々の黒画素連（１本のメインのチェーン符号と
それにつながっている子チェーン符号の組）の方向の変
化を計数する。すなわち、方向変化計数回路３６では、
チェーン符号化回路部３５によって生成された一連のチ
ェーン符号（メインのチェーン符号とその子チェーン符
号の組）に連結している原画像の黒画素を計数する。一
連のチェーン符号に連結する黒画素を計数することは、
孤立画像の面積を計数していることになる。また、複数
の輪郭を持つ、つまり、一連のチェーン符号が複数組で
一つの孤立画像を形成している場合は、その複数組のチ
ェーン符号を新たに一連のチェーン符号として、方向変
化計数回路３６で処理している一連のチェーン符号と対
応がとれるようにする。具体的には、そのようなチェー
ン符号番号を文字領域図形領域判定回路３７へ知らせる
ようにする。【００１２】文字領域図形領域判定部３７では、“文字
は比較的小さく、複雑で輪郭の方向変化が煩雑であり、
図形は比較的大きく、簡単で輪郭の方向変化が少ない”
ことを利用して、各々のチェーン符号の方向の変化回数
とその黒連結画像の画素数（面積）の関係、たとえば次
式のような比をとり、あるしきい値で、文字と図形を分
離する。（チェーン符号の方向の変化回数）／（面積）上記領域分離装置３で分割された文字領域４１１、写真
領域５１１および図形領域６１１は、それぞれ文字領域
画像メモリ４１、写真領域画像メモリ５１、および図形
領域画像メモリ６１に記憶される【００１３】上記文字領域画像メモリ４１の文字領域４
１１は、文字認識装置４２により認識され、認識結果を
文字コードとして出力する。この認識結果は文章のデー
タとして文章ファイル装置４３に記録される。この文字
認識装置は、公知の技術によって構成してもよいが、本
出願人の出願した特願平１−３１４３０１号の発明「文
字認識装置」（発明者大住淳一）、あるいは特願平１−
３１８８２７号の発明「文字認識装置」（発明者倉持
勉）などにより構成すれば、好適である。ここでは、前
者の文字認識装置を用いた例について概略の説明をす
る。図５は本発明による文字認識装置の基本的構成を示
すブロック図である。この文字認識装置は、ストローク
方向抽出部４２１と、方向ストロークパターン形成部４
２２と、パターンぼかし部４２３と、相関処理部４２４
と、方向パターン格納部４２５−１、４２５−２、・・
・４２５−ｎと、パターン辞書格納部４２６とから構成
される。図５において、ストローク方向抽出部４２１で
は、入力文字から切り出されたパターンから文字ストロ
ークを抽出する。方向ストロークパターン形成部４２２
では、前記ストローク方向抽出部１で抽出された各方向
のストロークの内、方向ストロークごとに分ける。パタ
ーンぼかし部４２３では、各同一方向ストロークを集め
て形成されたパターンを縮小する。相関処理部４２４で
は、パターンぼかし部４２３で形成されたぼかしパター
ンと予めパターン辞書格納部４２６に格納されているパ
ターンとの相関を求める処理を行う。方向パターン格納
部４２５−１ないし４２５−ｎは、それぞれ同一方向の
ストロークパターンが格納されている。【００１４】図６はストローク方向説明図、図７はスト
ローク方向判定例説明図である。図６において、ストロ
ーク方向が８方向の例が示されている。たとえば、図３
図示のごときパターンについて、各画素がどの方向のパ
ターンに属しているかの判定方法を説明する。図７示の
各正方形が一つの画素を形成し、図中の黒画素がどの方
向ストロークに属するかを判定する。すなわち、注目し
ている画素を基にして、図７に示すように、前後左右お
よび斜め方向の各方向にそれぞれ画素を順次走査し、黒
画素をカウントして行き、走査した画素が白画素になっ
たら、その方向の走査を止める。このような走査を各方
向について行い、終了したら、黒画素のカウント数が最
も大きい方向を、その着目画素のストローク方向である
と判定する。図７に示す例では、図６のストローク方向
７と一致する方向のストロークが最も長い。したがっ
て、注目画素は、方向７のストロークに属する。同様に
他の画素全てに対してストローク方向の判定を行う。【００１５】図８は方向パターンの抽出例説明図であ
る。図８において、たとえば、「漢」の文字の８方向の
ストロークパターンを抽出した例で、便宜上縮小してあ
る。また、図８に示す番号（１）ないし（８）は、図６
のような方向性を示す番号１ないし８と対応している。
ぼかし処理は、２値画像である各方向ストロークパター
ンを縮小し濃淡のある画像に変換する。ここでは以下の
ような処理を用いることにする。方向ストロークパター
ンの大きさがＮ×Ｎ画素として、ぼかし後のパターンの
大きさをＭ×Ｍとする。ＮはＭの画素の整数ａ倍とす
る。すなわち、ａ×ａ画素の原画を１画素に投影するこ
とになる。この際、方向ストロークパターンの各画素の
内、背景である白画素を−１、パターンを形成する黒画
素を＋１とし、ａ²個の画素を加算する。したがって、ぼ
かし変換後の１画素は−ａ²から＋ａ²の間の値を持つことに
なる。ここで得られた各方向のぼかしパターンと辞書と
して持っている各文字方向のぼかしパターンとの間で各
方向ごとに相関をとる。相関はぼかした方向パターンの
各画素を要素とするＭ次元ベクトル同志の内積をそれぞ
れのベクトルのノルムで除したものになる。式で書く
と、Ｓi ＝（Ｉi ，Ｄi ）／‖Ｉi ‖・‖Ｄi ‖ となる。Ｓが相関値、Ｉが入力パターン、Ｄが辞書パタ
ーンを、添字ｉはストロークの方向を示す。８方向の場
合、各文字に対して８個の相関値が得られるので、８個
の相関値の２乗和を各文字に対する類似の度合いとす
る。辞書として持っている全文字中で、最も類似の度合
いの高いものを認識結果とする。【００１６】写真領域画像メモリ５１中の写真領域５１
１は、写真領域圧縮装置５２により情報圧縮され、画像
データとして写真ファイル装置に記録される。上記写真
領域圧縮装置は、従来の任意の情報圧縮手法を用いて実
現することができる。さらに、上記図形領域メモリ６１
中の図形領域６１１は、図形領域清書装置６２により、
線図形に対して直線近似をおこない、直線近似の際に生
じる端点や交差点のずれを直し、線図形全体を清書す
る。清書する様子を図９に示す。すなわち、図９（ａ）
のような線図形画像に対して、直線近似をすることによ
り、図９（ｂ）のような結果を得ることができる。この
結果に対して、整形処理を施し端点や交点のずれを直す
ことで、図９（ｃ）のような結果を得ることができる。
上記の図形領域清書装置の直線近似の機能は、例えば、
本出願人の先に出願した特願 − 号（FX2432
2）の発明「画像データベクトル変換装置」に記載の方
法を用いることにより、また整形処理は同じく本出願人
の先に出願した特願− 号（FX25527）の発明
「ベクトルデータ整形方式」に記載の方法を用いること
で実現することができる。【００１７】図１０は上記「画像データベクトル変換装
置」および「ベクトルデータ整形方式」を用いた図形領
域清書装置の構成例を示すものである。この図形領域清
書装置は、走査を主体とする単純な処理により、画像デ
ータをベクトルデータに変換できるものであり、図１０
に示すように、図形領域画像メモリ６１に格納された２
値画像を直交する方向（ここではＸ軸方向およびＹ軸方
向）に走査して所定の処理を行う。Ｘ軸方向の走査と処
理は、Ｘ軸方向走査部６２１、連続黒画素計数部６２
２、黒画素重心抽出部６２３、重心連結部６２４により
行い、Ｙ軸方向の走査と処理は、Ｙ軸方向走査部６２
５、連続黒画素計数部６２６、黒画素重心抽出部６２
７、重心連結部６２８により行う。各処理の結果は、ベ
クトル整形部６２９により整形される。Ｙ軸方向とＸ軸
方向の処理とは走査方向が異なるだけで実質的には同じ
ものであり、ここではＹ軸方向を例にとり説明する。図
１１は、２値画像をベクトル変換するために行う走査を
説明する図である。Ｙ軸方向走査部６２５の走査は、画
素単位に行うのではなく、幾つかの画素を飛び越して行
う。その飛び越し幅である走査線間幅Ｓは、任意の幅に
決めることができる。連続黒画素計数部６２４は、走査
をしつつ黒画素が幾つ連続しているかを計数する。その
計数結果に基づき、黒画素重心抽出部６２７は連続した
黒画素の重心を抽出する。重心連結部６２７は、黒画素
重心抽出部６２７の抽出した黒画素重心同士を連結し
て、ベクトルを形成する。一定の距離を予め定めておい
て、黒画素重心間の距離がその一定の距離より小であれ
ば、両者を連結してベクトルを形成する。しかし、上記
一定の距離より大であれば連結しない。ベクトル整形部
６２７は、ベクトル間を結合したり、接触させたり、誤
ベクトルの削除等を行い、ベクトルの整形を行う。【００１８】上記の領域分離装置を用いて例えば、図１
１に示すような構成のファイリング装置を構成すること
ができる。画像入力装置１により入力された文書画像
は、上記のように領域分離装置３により各領域に分離
後、文字認識装置４２、写真領域圧縮装置５２、図形領
域清書装置６２により処理された結果をそれぞれ文書フ
ァイル装置４３、写真ファイル装置５３、図形ファイル
装置６３に記録する。記録する際に、各ファイルに対し
適当なキーワードを付与しておく。また、１ページの完
全な文章として利用するために、文書から分離した各フ
ァイルの関係も同時に記録する。あるテーマについて文
章を作成したとき、適当なキーワードで検索装置１０を
用いて、適当な文章を文章ファイル装置４２から検索す
る。検索した結果は文章として編集することができるの
で、文章編集装置７に用いて自由に編集し利用すること
が可能である。また、あるテーマについて書いている文
章に対して表や図形を付加したい場合、適当なキーワー
ドで検索装置１０を用いて図形ファイル装置６２から適
当な図形あるいは表を検索し、検索した結果は図形とし
て編集することが可能なので、図形編集装置８により形
を変えて利用することができる。また、文章ファイル、
写真ファイル、図形ファイル間の関係を調べて１ページ
の文章を検索するように検索装置１０に指示することに
より、１ページの文章を検索でき、文章編集装置７、図
形編集装置８により検索した文章を編集し、再利用する
ことが可能となる。このような文書編集装置７、図形編
集装置８、検索装置１０は同一の処理装置で実現するこ
とも可能である。【００１９】以上のように、本発明をも用いることで、
ファイリングした結果を有効に活用できる装置を構成す
ることができる。ここで説明した実施例では、文字領
域、写真領域、および図形領域に分割しているのみであ
るが、文字領域に対する文字認識の結果から、文字の大
きさを抽出することにより、タイトルを抽出できる。す
なわち、文字のおおきさより大きな文字領域をタイトル
部分であるとすることができる。また、このタイトルと
して抽出された部分の文字認識結果をキーワードとし
て、文字領域、写真領域、図形領域をファイルに記録す
ることにより、ファイリングしようとしている画像から
自動的にキーワードを抽出することがてきる。【００２０】【発明の効果】本発明は、文書ファイリング装置におい
て、画像をファイリングする前に領域分割を行う。すな
わち、入力画像に対し文字領域、写真領域、および図形
領域に分割を施す。そして、文字領域は、文字コードと
して、図形領域は直線近似を行うようにした。従来の画
像ファイリング装置では、文字や図形も画像データとし
てファイリングされていたので、検索した結果を見るの
が主であり、画像としての利用価値しかなかった。しか
し、本発明によれば文字領域は文書として編集が可能と
なり、また図形領域は図形として編集が可能となった。
これにより、過去にファイリングした内容を自由に手直
しして利用することができるようになった。また、検索
した文章を再度出力する際、文字は文字フォントで、図
形は直線を描画するコマンドによって描画されるので、
奇麗な出力を得ることができる。

Description (English translation)

[0001] [Field of Industrial Application] The present invention relates to a filing device for document images in which characters, photographs, and graphics are mixed.

[0002] [Prior Art] For conventional document image filing devices, systems have been proposed that automatically extract areas from a document image according to attributes (text, photograph, graphic, and so on) and rearrange them. For example, there is a database editing device that extracts each area of an input image through interactive processing, stores each area in a filing device, and lets the user edit the document layout to suit the purpose at the time of use (IEICE Technical Report, PRL84-101, pp. 65-72).

[0003] [Problems to Be Solved by the Invention] However, these conventional systems merely change the arrangement of the input image, and every area is represented as an image. The attributes inherent to characters (font, character size, etc.) and to graphics (line width, line type, etc.) therefore cannot be changed or exploited, so a retrieved file can be used only in the very limited form of an image. In addition, because every area is handled as an image, the display of retrieved content depends on the resolution at which the image was input, and fine characters and graphics cannot be reproduced cleanly. The present invention was made to solve these problems. Its object is to provide a document image filing device that can file an input document image so that each area (text, photograph, graphic) is given its original attributes.

[0004] [Means for Solving the Problems] To achieve this object, the document image filing device of the present invention comprises: document image input means (1) for reading a document; area separating means (3) for separating a character area, a photograph area, and a graphic area from the document image read by the document image input means; character recognition means (42) for performing character recognition on the image of the character area separated by the area separating means (3); graphic area clean-copy means (62) for converting the image of the graphic area into vector graphics and performing clean-copy (finishing) processing; and filing means (43, 53, 63) for filing, for each separated area, characters as character codes, graphics as vector graphic data, and the photograph area as image data.

[0005] [Operation] The digital image input from the document image input means is separated by the area separating means (3) into a character area, a photograph area, and a graphic area.
The separated character area image undergoes character recognition in the character recognition means (42), and the recognition result is filed as character codes by the filing means (43). The separated photograph area image is processed as necessary, for example by information compression, and is then filed as image data by the filing means (53), as in conventional systems. The separated graphic area image is approximated by straight lines and shaped by the graphic area clean-copy means (62), and is then filed as graphic data. With this filing, the character area is recorded as codes and can be freely edited as a document with a document editing device or the like. Likewise, because graphics are recorded as graphic data, they can be edited in sophisticated ways as graphic data, for example in CAD.

[0006] [Embodiment] FIG. 1 shows a document image filing device according to one embodiment of the present invention. The device comprises an image input device 1 for inputting a document image, an image memory 2 that stores the input image, an area separation device 3 that separates the character, photograph, and graphic areas from the image, and a character area image memory 41, a photograph area image memory 51, and a graphic area image memory 61 that store the respective areas separated by the area separation device 3. The device further comprises a character recognition device 42 that recognizes the characters of the character area image in the character area image memory 41, a photograph area compression device 52 that compresses the photograph area image in the photograph area image memory 51, and a graphic area clean-copy device 62 that approximates the graphic area image in the graphic area image memory 61 by straight lines and produces a clean copy. A text file device 43, a photograph file device 53, and a graphic file device 63 are connected to file the processed results.

[0007] The image input device 1 inputs a document image containing a mixture of characters, photographs, and graphics as digital data and stores it in the image memory 2. FIG. 2 shows an example of a document image. As FIG. 2 shows, the document image 21 contains a character area 411, a photograph area 511, and a graphic area 611. As noted above, however, merely inputting the document as an image does not allow the attributes inherent to characters (font, character size, etc.) or to graphics (line width, line type, etc.) to be changed or exploited, so a retrieved file could be used only in the very limited form of an image. In this embodiment, to broaden the usefulness of retrieved files, the subsequent processing gives each area in the document image 21 its original attributes and stores each area separately in a file device with those attributes retained.

[0008] The document image 21 in the image memory 2 is divided by the area separation device 3 into the character area 411, the photograph area 511, and the graphic area 611. FIGS. 3(a), (b), and (c) show the character area, photograph area, and graphic area separated from the document image 21. FIG. 4 shows an example of the area separation device 3. As shown in FIG. 4, the area separation device 3 has: a binarization circuit 31 that binarizes a multi-level input image in which character/line images and photographic images are mixed; an isolated point removal circuit 32 that removes isolated points from the binarized image; a contour extraction circuit 33 that extracts contours from the image after isolated point removal; a photograph area determination circuit 34 that handles the binary image and the contour image in blocks of M × N pixels and identifies halftone areas from the ratio between the number of black pixels of the binary image and the number of contour-extracted pixels within each block; a chain coding circuit 35 that chain-codes the contour-extracted image; a direction change counting circuit 36 that counts the direction changes of each chain-coded connected black-pixel component; a character/graphic area determination circuit 37 that distinguishes characters from graphics based on the relationship between the counted direction changes and the number of pixels of the connected black-pixel image; and a switching circuit 38 that routes the input image to each area according to the results of the determination circuits 34 and 37.

[0009] Line memories 391 and 392, which take in the input image three lines at a time in parallel, are provided at the inputs of the isolated point removal circuit 32 and the contour extraction circuit 33, and a line memory 393, which takes in the image five lines at a time in parallel, is provided at the input of the photograph area determination circuit 34. The multi-level input image data is binarized by the binarization circuit 31, and the isolated point removal circuit 32 then removes isolated pixels from the image data. For example, it examines the eight pixels adjacent to a pixel of interest to decide whether that pixel is isolated, and removes it if so. The line memory 391 is provided for this purpose so that three lines of data can be referenced. The contour extraction circuit 33 likewise extracts contours by referring to three lines of data. Specifically, it ANDs the original image with the original image shifted one pixel to the right, ANDs that result with the original image shifted one pixel to the left, and does the same with the images shifted one pixel up and down. This yields an image shrunk by one pixel in four directions, and the exclusive OR of this image with the original image gives the contour. The photograph area determination circuit 34 determines areas by exploiting the difference in characteristics between photograph (halftone) areas and character/line images: in a photograph area the number of pixels differs greatly between the binarized image and the contour-extracted image, whereas in a character/line image the difference is small. The circuit therefore examines the ratio of the number of pixels in the binary image to the number of pixels after contour extraction and judges the block to be a photograph area when the ratio exceeds a threshold.

[0010] The chain coding circuit 35 raster-scans the image produced by the contour extraction circuit 33 to find a black pixel and examines the 8-neighborhood of that pixel. If a black pixel is found there, the circuit emits a chain code and replaces the pixel with a white pixel, and it repeats this step until no black pixel remains in the 8-neighborhood. If the 8-neighborhood contains more than one black pixel, the positions of the black pixels other than the one being followed are pushed onto a stack. When tracking of the main chain is finished, the stacked positions are taken out and chain coding is repeated from each of them in the same way. A chain code traced from a stacked pixel is connected to the main chain code and is therefore called a child chain code. Once every stacked pixel has been taken out, the original raster scan resumes. In this way the entire image is chain-coded.

[0011] The direction change counting circuit 36 counts the direction changes of each chain-coded connected black-pixel component (the set consisting of one main chain code and the child chain codes connected to it). The circuit also counts the black pixels of the original image that are connected to each series of chain codes (a main chain code and its child chain codes) generated by the chain coding circuit 35; counting the black pixels connected to a series of chain codes amounts to measuring the area of an isolated image. When an isolated image has several contours, that is, when several series of chain codes together form one isolated image, those series are treated as one new series of chain codes so that they correspond to the series being processed by the direction change counting circuit 36. In practice, the numbers of such chain codes are reported to the character/graphic area determination circuit 37.

[0012] The character/graphic area determination circuit 37 relies on the observation that "characters are relatively small and complex, with frequent changes in contour direction, whereas graphics are relatively large and simple, with few changes in contour direction."
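The pixel-level quantities used by the contour extraction circuit 33, the photograph area determination circuit 34, and the direction change counting circuit 36 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the circuits themselves: the function names are hypothetical, pixels shifted in from outside the image are assumed to be white, the handling of a block with no contour pixels is an assumption, and any threshold value is left to the caller because the specification does not give one.

```python
from typing import List, Sequence

Image = List[List[int]]  # binary image: 1 = black pixel, 0 = white pixel


def shift(img: Image, dy: int, dx: int) -> Image:
    """Shift the image by (dy, dx); pixels entering from outside are white (assumption)."""
    h, w = len(img), len(img[0])
    return [
        [img[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w else 0
         for x in range(w)]
        for y in range(h)
    ]


def extract_contour(img: Image) -> Image:
    """AND the image with its right/left/up/down shifts (a one-pixel shrink in
    four directions), then XOR the shrunken image with the original."""
    shrunk = img
    for dy, dx in ((0, 1), (0, -1), (-1, 0), (1, 0)):
        moved = shift(img, dy, dx)
        shrunk = [[a & b for a, b in zip(r, m)] for r, m in zip(shrunk, moved)]
    return [[a ^ b for a, b in zip(r, s)] for r, s in zip(img, shrunk)]


def black_to_contour_ratio(binary_block: Image, contour_block: Image) -> float:
    """Ratio examined for the photograph decision: black pixels / contour pixels."""
    black = sum(map(sum, binary_block))
    edge = sum(map(sum, contour_block))
    if edge == 0:  # assumption: no contour pixels means "infinitely solid" or empty
        return float("inf") if black else 0.0
    return black / edge


def direction_changes(chain: Sequence[int]) -> int:
    """Number of places where consecutive chain-code directions differ."""
    return sum(1 for a, b in zip(chain, chain[1:]) if a != b)


def looks_like_character(chain: Sequence[int], area: int, threshold: float) -> bool:
    """Many direction changes relative to area suggests a character.
    `threshold` is a hypothetical tuning value."""
    return direction_changes(chain) / area > threshold
```

A block would be judged a photograph area when `black_to_contour_ratio` exceeds a chosen threshold: a thin line gives a ratio near 1, while a solid filled region gives a much larger value.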
Using this observation, the circuit relates the number of direction changes of each chain code to the number of pixels (area) of the corresponding connected black image, for example by taking the ratio

(number of direction changes of the chain code) / (area)

and separates characters from graphics by applying a threshold to it. The character area 411, photograph area 511, and graphic area 611 divided by the area separation device 3 are stored in the character area image memory 41, the photograph area image memory 51, and the graphic area image memory 61, respectively.

[0013] The character area 411 in the character area image memory 41 is recognized by the character recognition device 42, which outputs the recognition result as character codes. The result is recorded as text data in the text file device 43. The character recognition device may be built with known techniques, but it is preferably built according to the invention "Character Recognition Device" of Japanese Patent Application No. Hei 1-314301 (inventor Junichi Osumi) or the invention "Character Recognition Device" of Japanese Patent Application No. Hei 1-318827 (inventor Tsutomu Kuramochi), both filed by the present applicant. An example using the former is outlined here. FIG. 5 is a block diagram showing the basic configuration of that character recognition device. It consists of a stroke direction extraction unit 421, a directional stroke pattern forming unit 422, a pattern blurring unit 423, a correlation processing unit 424, direction pattern storage units 425-1, 425-2, ..., 425-n, and a pattern dictionary storage unit 426. In FIG. 5, the stroke direction extraction unit 421 extracts character strokes from a pattern cut out of the input characters. The directional stroke pattern forming unit 422 separates the strokes extracted by the stroke direction extraction unit 421 by direction. The pattern blurring unit 423 reduces each pattern formed by collecting strokes of the same direction. The correlation processing unit 424 computes the correlation between the blurred pattern formed by the pattern blurring unit 423 and the patterns stored in advance in the pattern dictionary storage unit 426. Each of the direction pattern storage units 425-1 to 425-n stores the stroke pattern of one direction.

[0014] FIG. 6 illustrates the stroke directions, and FIG. 7 illustrates an example of stroke direction determination. FIG. 6 shows an example with eight stroke directions. The following explains how to determine, for a pattern such as the one illustrated, which directional pattern each pixel belongs to. Each square in FIG. 7 is one pixel, and the task is to determine which directional stroke each black pixel in the figure belongs to. Starting from the pixel of interest, pixels are scanned one after another in each of the horizontal, vertical, and diagonal directions as shown in FIG. 7, and black pixels are counted; scanning in a direction stops when a white pixel is reached. After all directions have been scanned, the direction with the largest black pixel count is taken as the stroke direction of the pixel of interest. In the example of FIG. 7, the stroke in the direction matching stroke direction 7 of FIG. 6 is the longest, so the pixel of interest belongs to the stroke of direction 7. The stroke direction is determined in the same way for every other pixel.

[0015] FIG. 8 illustrates an example of direction pattern extraction. It shows the stroke patterns of the character "漢" extracted in eight directions, reduced in size for convenience. The numbers (1) to (8) in FIG. 8 correspond to the direction numbers 1 to 8 of FIG. 6.
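The stroke direction determination described above (scan outward from a black pixel in each of eight directions, stop at the first white pixel, pick the direction with the most black pixels) can be sketched as follows. The direction numbering is hypothetical because FIG. 6 is not reproduced in this text; here 1 is taken to point right and the numbers increase counter-clockwise. Breaking ties in favor of the lowest direction number is also an assumption, since the specification does not say how ties are resolved.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]  # binary image: 1 = black pixel, 0 = white pixel

# Hypothetical numbering of the eight directions as (dy, dx), with y growing downward.
DIRECTIONS: Dict[int, Tuple[int, int]] = {
    1: (0, 1), 2: (-1, 1), 3: (-1, 0), 4: (-1, -1),
    5: (0, -1), 6: (1, -1), 7: (1, 0), 8: (1, 1),
}


def black_run(img: Image, y: int, x: int, dy: int, dx: int) -> int:
    """Count black pixels met when stepping from (y, x) in direction (dy, dx),
    stopping at the first white pixel or at the image border."""
    h, w = len(img), len(img[0])
    count = 0
    y, x = y + dy, x + dx
    while 0 <= y < h and 0 <= x < w and img[y][x] == 1:
        count += 1
        y, x = y + dy, x + dx
    return count


def stroke_direction(img: Image, y: int, x: int) -> int:
    """Direction number with the longest black run from the black pixel (y, x).
    Ties (including an isolated pixel) go to the lowest number (assumption)."""
    if img[y][x] != 1:
        raise ValueError("stroke direction is defined only for black pixels")
    counts = {d: black_run(img, y, x, dy, dx) for d, (dy, dx) in DIRECTIONS.items()}
    return max(counts, key=counts.get)


def direction_patterns(img: Image) -> Dict[int, Image]:
    """Split a character image into eight binary patterns, one per stroke direction."""
    h, w = len(img), len(img[0])
    patterns = {d: [[0] * w for _ in range(h)] for d in DIRECTIONS}
    for y in range(h):
        for x in range(w):
            if img[y][x] == 1:
                patterns[stroke_direction(img, y, x)][y][x] = 1
    return patterns
```

For a one-pixel-wide vertical bar, the upper pixels are assigned to the downward direction and the lower pixels to the upward direction, which mirrors how opposite directions are numbered separately in an eight-direction scheme.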
The blurring process reduces each directional stroke pattern, which is a binary image, and converts it into a gray-scale image. The following procedure is used here. Let the directional stroke pattern be N × N pixels and the blurred pattern be M × M pixels, where N is an integer multiple a of M; that is, each a × a block of the original is projected onto one pixel. White background pixels of the directional stroke pattern are given the value −1 and black pattern pixels the value +1, and the a² pixels of each block are summed. Each pixel after blurring therefore takes a value between −a² and +a². The blurred pattern of each direction obtained in this way is correlated, direction by direction, with the blurred pattern of each character held in the dictionary. The correlation is the inner product of the two M²-dimensional vectors whose elements are the pixels of the blurred direction patterns, divided by the norms of those vectors:

$$S_i = \frac{(I_i, D_i)}{\lVert I_i \rVert \, \lVert D_i \rVert}$$

where S is the correlation value, I is the input pattern, D is the dictionary pattern, and the subscript i is the stroke direction. With eight directions, eight correlation values are obtained for each character, and the sum of their squares is taken as the degree of similarity to that character. Among all characters in the dictionary, the one with the highest degree of similarity is the recognition result.

[0016] The photograph area 511 in the photograph area image memory 51 is compressed by the photograph area compression device 52 and recorded as image data in the photograph file device. The photograph area compression device can be realized with any conventional information compression method. The graphic area 611 in the graphic area image memory 61 is processed by the graphic area clean-copy device 62, which approximates the line drawing by straight lines, corrects the misalignment of end points and intersections that arises during the straight-line approximation, and produces a clean copy of the whole line drawing. FIG. 9 shows this process: straight-line approximation of a line drawing image like FIG. 9(a) gives a result like FIG. 9(b), and shaping that result to correct the misaligned end points and intersections gives a result like FIG. 9(c).
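To make the end-point correction concrete, here is a deliberately simple stand-in written in Python. It is not the shaping method the specification relies on, which is not spelled out in this text; it only shows one plausible way to pull nearly coincident end points of straight-line segments onto a common point. The function name, the greedy clustering, and the tolerance `tol` are all assumptions, and intersections in the middle of a segment are not handled.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def snap_endpoints(segments: List[Segment], tol: float) -> List[Segment]:
    """Replace end points lying within `tol` of an existing cluster center
    by the mean position of that cluster (greedy, order-dependent sketch)."""
    clusters: List[List[Point]] = []   # points grouped so far
    assignment: List[int] = []         # cluster index of every end point, in order

    def center(cluster: List[Point]) -> Point:
        n = len(cluster)
        return (sum(p[0] for p in cluster) / n, sum(p[1] for p in cluster) / n)

    for seg in segments:
        for p in seg:
            for i, cluster in enumerate(clusters):
                cx, cy = center(cluster)
                if (p[0] - cx) ** 2 + (p[1] - cy) ** 2 <= tol ** 2:
                    cluster.append(p)
                    assignment.append(i)
                    break
            else:  # no cluster close enough: start a new one
                clusters.append([p])
                assignment.append(len(clusters) - 1)

    centers = [center(c) for c in clusters]
    it = iter(assignment)
    return [(centers[next(it)], centers[next(it)]) for _ in segments]
```

With two segments whose end points miss each other slightly, for example `((0, 0), (10, 0))` and `((11, 1), (10, 10))` with `tol=2`, both near-miss end points move to `(10.5, 0.5)` so the corner closes.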
The straight-line approximation function of the graphic area clean-copy device can be realized, for example, with the method described in the invention "Image Data Vector Conversion Device" of Japanese Patent Application No. − (FX24322), filed earlier by the present applicant, and the shaping process with the method described in the invention "Vector Data Shaping Method" of Japanese Patent Application No. − (FX25527), likewise filed earlier by the present applicant.

[0017] FIG. 10 shows an example configuration of a graphic area clean-copy device using the "Image Data Vector Conversion Device" and the "Vector Data Shaping Method". This device converts image data into vector data with simple processing based mainly on scanning.
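A rough Python sketch of this kind of scan-based vector conversion is given here for the Y-axis direction: scan lines are taken every few pixels, runs of consecutive black pixels are found on each scan line, the centroid of each run is computed, and centroids closer than a preset distance are linked into vectors, as the specification describes for units 625 to 628. The function names are hypothetical, and linking only between neighboring scan lines is an assumption; the specification states only the distance criterion.

```python
from math import hypot
from typing import List, Tuple

Image = List[List[int]]          # binary image: 1 = black pixel, 0 = white pixel
Point = Tuple[float, float]      # (x, y)
Vector = Tuple[Point, Point]


def run_centroids_on_column(img: Image, x: int) -> List[Point]:
    """Centroids (midpoints) of the vertical runs of black pixels in column x."""
    points: List[Point] = []
    run_start = None
    h = len(img)
    for y in range(h + 1):  # the extra step closes a run that touches the bottom edge
        black = y < h and img[y][x] == 1
        if black and run_start is None:
            run_start = y
        elif not black and run_start is not None:
            points.append((float(x), (run_start + y - 1) / 2))
            run_start = None
    return points


def vectorize_y_scan(img: Image, step: int, max_dist: float) -> List[Vector]:
    """Scan every `step`-th column (scan line spacing S) and link run centroids on
    neighboring scan lines when they are closer than `max_dist`."""
    columns = [run_centroids_on_column(img, x) for x in range(0, len(img[0]), step)]
    vectors: List[Vector] = []
    for left, right in zip(columns, columns[1:]):
        for p in left:
            for q in right:
                if hypot(q[0] - p[0], q[1] - p[1]) < max_dist:
                    vectors.append((p, q))
    return vectors
```

Scanning along the Y axis picks up lines that run roughly horizontally; the same code applied to the transposed image would play the role of the X-axis branch.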
As shown in FIG. 10, the binary image stored in the graphic area image memory 61 is scanned in two orthogonal directions (here the X-axis and Y-axis directions) and processed in a predetermined way. Scanning and processing in the X-axis direction are performed by an X-axis direction scanning unit 621, a continuous black pixel counting unit 622, a black pixel centroid extraction unit 623, and a centroid linking unit 624; scanning and processing in the Y-axis direction are performed by a Y-axis direction scanning unit 625, a continuous black pixel counting unit 626, a black pixel centroid extraction unit 627, and a centroid linking unit 628. The results of both are shaped by a vector shaping unit 629. Processing in the Y-axis and X-axis directions differs only in the scanning direction and is otherwise essentially the same, so the Y-axis direction is described here as an example. FIG. 11 illustrates the scanning performed to convert a binary image into vectors. The Y-axis direction scanning unit 625 does not scan pixel by pixel but skips several pixels at a time; the skip width, that is, the scan line spacing S, can be set to any value. The continuous black pixel counting unit 626 counts how many black pixels occur consecutively during scanning. Based on this count, the black pixel centroid extraction unit 627 extracts the centroid of each run of consecutive black pixels. The centroid linking unit 628 links the centroids extracted by the black pixel centroid extraction unit 627 to form vectors: a fixed distance is set in advance, and two centroids are linked into a vector if the distance between them is smaller than that distance and are left unlinked if it is larger. The vector shaping unit 629 shapes the vectors by joining vectors, bringing them into contact, deleting erroneous vectors, and so on.

[0018] Using the area separation device described above, a filing device with a configuration such as that shown in FIG. 11 can be built. The document image input through the image input device 1 is separated into areas by the area separation device 3 as described above, and the results of processing by the character recognition device 42, the photograph area compression device 52, and the graphic area clean-copy device 62 are recorded in the text file device 43, the photograph file device 53, and the graphic file device 63, respectively. At recording time, suitable keywords are assigned to each file. The relationships among the files separated from one document are recorded at the same time so that they can be used together as one complete page. When writing a document on some topic, the user searches the text file device 43 for suitable text with suitable keywords using the retrieval device 10. Because the retrieved result can be edited as text, it can be freely edited and used with the text editing device 7. Likewise, to add a table or graphic to text being written on some topic, the user searches the graphic file device 63 for a suitable graphic or table with suitable keywords using the retrieval device 10; because the result can be edited as a graphic, its shape can be changed with the graphic editing device 8 before use. A whole page can also be retrieved by instructing the retrieval device 10 to look up the relationships among the text, photograph, and graphic files, and the retrieved page can then be edited with the text editing device 7 and the graphic editing device 8 and reused. The text editing device 7, the graphic editing device 8, and the retrieval device 10 can all be implemented on the same processing unit.

[0019] As described above, the present invention makes it possible to build a device that uses filed results effectively. In the embodiment described here the image is only divided into character, photograph, and graphic areas, but titles can also be extracted by deriving the character size from the character recognition results for the character area: a character area whose characters are larger than a given character size can be regarded as a title. Furthermore, if the character recognition result of the part extracted as a title is used as the keyword when the character, photograph, and graphic areas are recorded in files, keywords can be extracted automatically from the image being filed.

[0020] [Effects of the Invention] In the present invention, the document filing device divides the image into areas before filing it; that is, the input image is divided into a character area, a photograph area, and a graphic area. The character area is then converted to character codes, and the graphic area is approximated by straight lines. In conventional image filing devices, characters and graphics were also filed as image data, so retrieved results were mainly just viewed and had value only as images. With the present invention, by contrast, character areas can be edited as documents and graphic areas can be edited as graphics.
As a result, content filed in the past can be freely modified and reused. Also, when a retrieved document is output again, the characters are rendered in a character font and the figures are drawn with line-drawing commands, so a clean output can be obtained.
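The interlaced scanning and centroid connection described for the graphic area clearing device can be illustrated with a minimal Python sketch. It is not the implementation of the patent. It assumes that scan lines are image rows taken every S rows, that the centroid of a run of black pixels is its midpoint, and that only centroids on neighbouring scan lines are candidates for connection. The function names (`black_runs`, `run_centroids`, `connect_centroids`) and the parameters `s` and `max_dist` are hypothetical, and the vector shaping step is omitted.

```python
from math import hypot

def black_runs(row):
    """Return (start, length) for each run of consecutive black (1) pixels."""
    runs, start = [], None
    for x, pixel in enumerate(row):
        if pixel and start is None:
            start = x                      # a run of black pixels begins
        elif not pixel and start is not None:
            runs.append((start, x - start))
            start = None                   # the run has ended
    if start is not None:                  # run reaching the right edge
        runs.append((start, len(row) - start))
    return runs

def run_centroids(image, s):
    """Scan every s-th row (assumed scan spacing S) and return, per scanned
    row y, the centroids (cx, y) of its black runs."""
    return {
        y: [(start + (length - 1) / 2, y)
            for start, length in black_runs(image[y])]
        for y in range(0, len(image), s)
    }

def connect_centroids(centroids, max_dist):
    """Connect centroids on neighbouring scan lines into vectors when their
    distance is smaller than the preset distance max_dist."""
    vectors = []
    ys = sorted(centroids)
    for y0, y1 in zip(ys, ys[1:]):
        for p in centroids[y0]:
            for q in centroids[y1]:
                if hypot(q[0] - p[0], q[1] - p[1]) < max_dist:
                    vectors.append((p, q))
    return vectors

# A 2-pixel-wide vertical line plus one isolated black pixel in the first row.
image = [
    [0, 0, 1, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0, 0],
]
centroids = run_centroids(image, s=2)
vectors = connect_centroids(centroids, max_dist=3)
```

In this example the centroids of the vertical line on rows 0, 2, and 4 are joined into two vectors, while the isolated pixel is left unconnected because it is farther than `max_dist` from any centroid on the next scan line. A larger `s` reduces the number of rows examined at the cost of coarser vectors.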

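The keyword-based filing, the recording of relationships among the files of one page, and the automatic keyword extraction from the title can be sketched as follows. This is only an illustration under stated assumptions: the `Region` and `FilingStore` classes, the `page_id` field used to record the relationship among the files, the `size_threshold` parameter, and the splitting of the title into words on whitespace are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field

@dataclass
class Region:
    kind: str                 # "text", "photo", or "figure"
    page_id: int              # relates regions separated from the same page
    content: object           # character codes, image data, or vector data
    char_size: float = 0.0    # recognised character size (text regions only)
    keywords: set = field(default_factory=set)

def title_keywords(regions, size_threshold):
    """Collect words from text regions whose character size exceeds the
    threshold, treating those regions as the title (an assumption)."""
    words = set()
    for r in regions:
        if r.kind == "text" and r.char_size > size_threshold:
            words.update(r.content.lower().split())
    return words

class FilingStore:
    """Three separate files, standing in for the text, photo, and graphic
    file devices."""

    def __init__(self):
        self.files = {"text": [], "photo": [], "figure": []}

    def file_page(self, regions, size_threshold):
        # Give every region of the page the keywords taken from its title.
        keywords = title_keywords(regions, size_threshold)
        for r in regions:
            r.keywords = set(keywords)
            self.files[r.kind].append(r)

    def search(self, kind, keyword):
        """Retrieve regions of one kind by keyword."""
        return [r for r in self.files[kind] if keyword.lower() in r.keywords]

    def page(self, page_id):
        """Reassemble all regions recorded as belonging to one page."""
        return [r for rs in self.files.values() for r in rs
                if r.page_id == page_id]

regions = [
    Region("text", 1, "Quarterly Report", char_size=24),
    Region("text", 1, "sales rose in spring", char_size=10),
    Region("figure", 1, [((0, 0), (1, 1))]),
    Region("photo", 1, b"\x00"),
]
store = FilingStore()
store.file_page(regions, size_threshold=16)
figures = store.search("figure", "report")
whole_page = store.page(1)
```

Here the figure is found through a keyword that came from the title text, and `page(1)` returns all four regions so that the page can be reused as a whole.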
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram showing an embodiment of the present invention that separates areas from an input image and converts them into predetermined formats.
FIG. 2 is a diagram showing an example of a document image.
FIG. 3 is a diagram showing a result of area separation.
FIG. 4 is a diagram showing the configuration of the area separation device.
FIG. 5 is a diagram showing the configuration of the character recognition device.
FIG. 6 is a diagram for explaining stroke directions.
FIG. 7 is a diagram for explaining determination of a stroke direction.
FIG. 8 is a diagram for explaining an example of direction pattern extraction.
FIG. 9 is a diagram for explaining the process of line figure shaping.
FIG. 10 is a diagram showing the configuration of the graphic area clearing device.
FIG. 11 is a diagram showing a configuration example of a device.
[Explanation of Reference Numerals]
1 ... Image input device
2 ... Image memory
3 ... Area separation device
41 ... Character area image memory
42 ... Character recognition device
43 ... Text file device
51 ... Photo area image memory
52 ... Photo area compression device
53 ... Photo file device
61 ... Graphic area image memory
62 ... Graphic area clearing device
63 ... Graphic file device
21 ... Document image
411 ... Character area
511 ... Photo area
611 ... Graphic area

Claims

What is claimed is: 1. A document image filing device comprising: document image input means for reading a document; area separating means for separating a character area, a photograph area, and a graphic area from the read document image; character recognition means for performing character recognition on the image of the character area separated by the area separating means; graphic area clearing means for converting the image of the graphic area into vector graphics and performing clearing processing; and filing means for filing, for each of the separated areas, characters as character codes, graphics as vector graphic data, and the photograph area as image data.