JP2003046746A

JP2003046746A - Method and apparatus for processing image

Info

Publication number: JP2003046746A
Application number: JP2001232757A
Authority: JP
Inventors: Makoto Takaoka; 真琴高岡
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2001-07-31
Filing date: 2001-07-31
Publication date: 2003-02-14

Abstract

PROBLEM TO BE SOLVED: Obtaining an electronic form from a printed document requires first reading the document with a scanner and then replacing the character portions with background pixels, which has been a considerably difficult task. SOLUTION: Multi-valued original image data 101, based on a document image formed on a recording medium, is input and binarized by a binarization unit 102, and the layout of the binarized image is analyzed by a layout analysis unit 104. A character filling unit 105 then fills the character portions of the image data with their surrounding color, based on the character region information obtained as the layout analysis result and on the binarized image data, thereby creating a first electronic form.

Description

Detailed Description of the Invention

[0001]

Field of the Invention

The present invention relates to an image processing apparatus and method, and more particularly to an image processing method and apparatus for creating an electronic form based on a printed document.

[0002]

Description of the Related Art

In recent years, document creation techniques that use electronic business forms and electronic standard forms have become widespread. For example, among systems for placing electronic orders on the Web via the Internet, a growing number create an electronic form that satisfies all remaining format requirements from only the minimum necessary user input and, if needed, also offer printing of the laid-out form.

[0003] Similarly, when using so-called document creation software such as Microsoft Word, it has become commonplace to use a standard form as a template for the document's appearance and to create the desired document by inserting one's own text into that form.

[0004] There is therefore a very strong need for automatic creation of electronic forms, so that users do not have to lay out documents themselves.

[0005]

Problems to be Solved by the Invention

Because creating an electronic form is generally laborious, existing forms have often been reused. However, an existing form does not always match exactly what the user wants, so the form frequently has to be modified.

[0006] For example, changing the background image of an existing form into a desired image takes a great deal of work. There has therefore been a demand to use an ordinary printed document as the background of a form.

[0007] Likewise, when only a printout of an electronic form created by a form-creation application exists, there has been a demand to obtain the electronic data (electronic form data) that makes up the table frame of that form.

[0008] However, to obtain an electronic form based on a document printed on a recording medium, the document must first be read with a scanner to convert it into image data, and unnecessary portions must then be deleted from the resulting image data.

[0009] The portions that become unnecessary when creating an electronic form are usually the character portions. Characters are generally the important part of a document, but when the document is to be turned into an electronic form they are not needed; rather, the background, lines, figures, and the like are what is to be reused.

[0010] To delete only the character portions from document image data, the character portions are replaced with pixels corresponding to the background image, and this has usually been done by hand. Consequently, when the background is a halftone image, for example, deleting only the character portions from the background image is a considerably difficult task.

[0011] The present invention has been made to solve the above problems, and its object is to provide an image processing apparatus and method that make it easy to create an electronic form based on a printed document.

[0012]

Means for Solving the Problems

As one means of achieving the above object, the image processing method of the present invention comprises the following steps.

[0013] That is, an image processing method for creating an electronic form based on a document image formed on a recording medium comprises: an image input step of inputting multi-valued image data based on the document image formed on the recording medium; a binarization step of binarizing the multi-valued image data to generate binary image data; an analysis step of analyzing the layout of the document image based on the binary image data; and a character filling step of creating a first electronic form by filling character portions of the multi-valued image data with their surrounding color, based on the layout analysis result and the binary image data.

[0014] The method further comprises: a first form display step of displaying the first electronic form with its line regions marked, based on the layout analysis result; a selection step of selecting an arbitrary line region on the displayed first electronic form according to a user instruction; a vectorization step of creating line vector information by vectorizing the line portions in the selected line region; and a line filling step of creating background image data by filling the line portions of the first electronic form with their surrounding color, based on the layout analysis result and the binary image data; wherein the line vector information and the background image data constitute a second electronic form.

[0015] The method further comprises: a table region filling step of filling a table region in the second electronic form, which consists of the line vector information and the background image data, with a color similar to its background color; and a background deletion step of creating a third electronic form by deleting the background image data from the image data after the table region has been filled; wherein, in the third electronic form, the line vector information constituting the table region includes the background color information.

[0016]

Embodiments of the Invention

An embodiment of the present invention is described in detail below with reference to the drawings.

[0017] <First Embodiment>

● Apparatus configuration

FIG. 1 is a block diagram of the image processing apparatus according to this embodiment.

[0018] In the figure, 101 is an original image. A binarization unit 102 performs optimal binarization on the original image 101 and outputs a binary image 103. A layout analysis unit 104 analyzes the layout of the binary image 103 and outputs a layout analysis result 108 that includes the coordinates of the detected character regions.

[0019] A character filling unit 105 refers to the detected character region coordinates, removes from the original image data 101 the pixels corresponding to the black regions of the binary image data 103, and fills them with the surrounding color, thereby creating image data A in which the characters have been erased. This image A is the primary electronic form.

[0020] The primary electronic form is then reduced by a reduction unit 106 to create reduced image data B, which is JPEG-compressed by a JPEG compression unit 107 to create the compressed code X 109 of the primary electronic form.

[0021] With the above configuration, this embodiment detects character regions in the original image 101 and erases only the characters within those regions, leaving only the background image. The background image is treated as an electronic form (the primary electronic form, corresponding to image data A), and its compressed code X 109 is registered.

[0022] The process of creating the primary electronic form in this embodiment is described in detail below.

[0023] ● Character region detection processing

FIG. 2 is a flowchart of the character region detection processing in this embodiment. It corresponds to the processing performed from the binarization unit 102 through the layout analysis unit 104 in the above configuration.

[0024] First, in step S201, the color original image 101 is input, and luminance conversion is performed while the image is thinned out to lower its resolution, creating a luminance image J. For example, if the original image 101 consists of RGB components, all 8 bits, at 300 dpi, then computing Y = 0.299R + 0.587G + 0.114B at every fourth pixel in both the vertical and horizontal directions yields a new image J as 8-bit Y data at 75 dpi.
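Step S201 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `to_luminance_thinned`, the nested-list image representation, and the choice of sampling the top-left pixel of each 4 × 4 block are assumptions made for the example. Only the luminance formula and the factor of 4 come from the text.

```python
def to_luminance_thinned(rgb, step=4):
    """Keep every `step`-th pixel in both directions and convert it to luminance Y.

    `rgb` is a list of rows, each row a list of (R, G, B) tuples with 8-bit values.
    Sampling the first pixel of each step x step block is an assumption.
    """
    out = []
    for row in rgb[::step]:
        out.append([
            int(round(0.299 * r + 0.587 * g + 0.114 * b))
            for (r, g, b) in row[::step]
        ])
    return out


# An 8 x 8 RGB image: left half pure red, right half pure white.
image = [[(255, 0, 0)] * 4 + [(255, 255, 255)] * 4 for _ in range(8)]
J = to_luminance_thinned(image)  # a 2 x 2 luminance image
```

With `step=4`, a 300 dpi scan becomes a 75 dpi luminance image, which is the reduction described above.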

[0025] Then, in step S202, a luminance histogram of image J is computed to calculate a binarization threshold T, and in step S203 the luminance image J is binarized with threshold T to create a binary image K (the full-page binary image 103).
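Steps S202 and S203 can be sketched as follows. The text does not give the formula for T beyond saying it is derived from histogram statistics, so this sketch substitutes Otsu's method, a well-known histogram-based threshold, purely as a stand-in. The function names and the convention that pixels with luminance at or below T become black (1) are also assumptions.

```python
def histogram(gray):
    """Count how many pixels take each 8-bit luminance value."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    return hist


def otsu_threshold(hist):
    """Return the threshold that maximizes between-class variance (Otsu's method)."""
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(gray, t):
    """Return a binary image: 1 = black (luminance <= t), 0 = white."""
    return [[1 if v <= t else 0 for v in row] for row in gray]


J = [[30, 30, 220], [220, 30, 220]]
T = otsu_threshold(histogram(J))
K = binarize(J, T)
```

Any other rule that maps the histogram to a single threshold could replace `otsu_threshold` without changing the rest of the flow.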

[0026] Next, in step S204, contour tracing of black pixels is performed on the binary image K to label all black regions. In step S205, regions that appear to be characters are identified among the black regions, and in step S206 the black regions that should be merged are merged based on the shapes and positions of those character-like regions.

[0027] A concrete example of the character region detection processing of FIG. 2 follows. When the color image shown in FIG. 3 is the original image 101, the luminance histogram of image J, obtained by thinning and luminance conversion, is as shown in FIG. 4. A binarization threshold T = 150 is calculated from the mean, variance, and so on of this histogram, and binarization with this threshold yields the binary image K (binary image 103) shown in FIG. 5.

[0028] Contour tracing of black pixels is performed on the binary image K of FIG. 5 and all of them are labeled. If sets of black pixels whose width or height is at or below a predetermined threshold are taken to be characters, the groups of black pixels shown in FIG. 6 are judged to be character regions. Note that FIG. 6 merely illustrates the concept of character regions; no such image is actually created. Then, as needed, the groups of black pixels shown in FIG. 6 are grouped according to various conditions, such as their distance and matching width and/or height, whereby the 17 character regions 701 to 718 shown in FIG. 7 are detected. The coordinate data of these character regions is held as character region coordinates in a RAM or the like (not shown) in the apparatus.
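The labeling and size test of steps S204 and S205 can be sketched as follows. The text uses contour tracing; this sketch substitutes a breadth-first flood fill over 8-connected black pixels, which produces the same bounding boxes but is a different technique. The function names, the 8-connectivity, and the inclusive box layout `(x0, y0, x1, y1)` are assumptions, and the merging of step S206 is not shown.

```python
from collections import deque


def black_region_boxes(binary):
    """Return inclusive bounding boxes (x0, y0, x1, y1) of 8-connected black regions."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Start a new region and grow it with a breadth-first flood fill.
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def character_like(boxes, max_size):
    """Keep boxes whose width or height is at most `max_size` pixels."""
    return [b for b in boxes
            if (b[2] - b[0] + 1) <= max_size or (b[3] - b[1] + 1) <= max_size]
```

The "width or height" test follows the wording of the text literally. A practical implementation would probably add further conditions so that long thin ruled lines are not classified as characters.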

[0029] ● Character portion filling processing

The processing in the character filling unit 105 is described below with reference to the concrete example of FIG. 8 and the flowchart of FIG. 9.

[0030] Suppose FIG. 8(a) is the original image 101. In the course of the character region detection described above, a full-page binary image 103 containing one character region, as shown in FIG. 8(b), has been obtained for it.

[0031] In the character filling processing of this embodiment (FIG. 9), the original image is divided into regions of 32 × 32 pixels (hereinafter, parts) (S901), and processing is performed part by part. FIG. 8(c) shows the original image 101 divided into parts.

[0032] The six parts "00" through "10" shown in FIG. 8(c) contain no character region, so the branch at step S903 causes nothing to be done for them. When part "11" is processed, processing proceeds to step S904. Whether a part contains a character region is determined from the character region coordinates held as the layout analysis result 108.
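The division into parts (S901) and the branch at S903 can be sketched as follows. The 32-pixel part size comes from the text. The function names, the exclusive-end rectangle layout `(x0, y0, x1, y1)`, the row-then-column part labels, and the 160 × 96 image size in the usage lines are assumptions chosen to reproduce the example in which parts "00" through "10" are skipped.

```python
PART = 32  # part size in pixels, as given in the text


def iter_parts(width, height, size=PART):
    """Yield (row, col, x0, y0, x1, y1) for each part; x1 and y1 are exclusive."""
    for py, y0 in enumerate(range(0, height, size)):
        for px, x0 in enumerate(range(0, width, size)):
            yield py, px, x0, y0, min(x0 + size, width), min(y0 + size, height)


def overlaps(part, region):
    """True if two rectangles (x0, y0, x1, y1) with exclusive ends intersect."""
    return (part[0] < region[2] and region[0] < part[2]
            and part[1] < region[3] and region[1] < part[3])


def parts_with_characters(width, height, regions):
    """Return labels ("<row><col>") of the parts that touch any character region."""
    hits = []
    for py, px, x0, y0, x1, y1 in iter_parts(width, height):
        if any(overlaps((x0, y0, x1, y1), r) for r in regions):
            hits.append(f"{py}{px}")
    return hits


# A 160 x 96 image (5 x 3 parts) with one character region.
hits = parts_with_characters(160, 96, [(40, 40, 120, 90)])
```

Only the parts returned here would go on to steps S904 and S905; all other parts are left untouched.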

[0033] In step S904, the portion of the binary image of FIG. 8(b) at the same position as part "11" (the binary part) is referenced, and the average RGB value ave_color is calculated over the portion of the original image of FIG. 8(a) that corresponds to the white pixels in that binary part.

[0034] Next, in step S905, ave_color is assigned as the pixel value of the portion of the original image of FIG. 8(a) that corresponds to the black pixels in the binary part.

[0035] By performing the above processing on each of the parts "12", "13", "21", "22", and "23" in which the character region lies, the portions of the original image where characters were present can be filled with the average value of their surrounding pixels.
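Steps S904 and S905 for a single part can be sketched as follows. The name `ave_color` comes from the text; everything else is assumed for illustration, including the function name, the nested-list image representation, the in-place update, and the handling of a part with no white pixels. The sketch also assumes the binary mask has the same resolution as the color image, which the text does not spell out.

```python
def fill_part(rgb, mask, x0, y0, x1, y1):
    """Overwrite the black-masked pixels of one part with the mean color of the rest.

    `rgb` is a list of rows of (R, G, B) tuples and is modified in place.
    `mask` has the same size; 1 = black (character), 0 = white (background).
    The part covers x0 <= x < x1 and y0 <= y < y1.
    """
    # S904: average the original pixels under the white pixels of the binary part.
    white = [rgb[y][x]
             for y in range(y0, y1) for x in range(x0, x1)
             if mask[y][x] == 0]
    if not white:
        return  # no background sample in this part; leave it unchanged (assumption)
    n = len(white)
    ave_color = tuple(round(sum(p[c] for p in white) / n) for c in range(3))

    # S905: assign ave_color to the original pixels under the black pixels.
    for y in range(y0, y1):
        for x in range(x0, x1):
            if mask[y][x] == 1:
                rgb[y][x] = ave_color
```

Because the average is taken per 32 × 32 part rather than over the whole page, the fill color follows local changes in the background, which is what makes the approach usable on halftone backgrounds.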

[0036] The image whose character portions have been filled in this way is the primary electronic form of this embodiment.

[0037] The primary electronic form created by the character filling is reduced by the reduction unit 106, for example by simple thinning. This reduction, for example from 300 dpi to 150 dpi, is performed to make the image smaller and easier to handle.
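Reduction by simple thinning can be sketched in a few lines. The function name `thin` and the choice of keeping the first pixel of every group are assumptions. A factor of 2 corresponds to the 300 dpi to 150 dpi example in the text.

```python
def thin(image, factor=2):
    """Reduce an image by keeping every `factor`-th row and every `factor`-th column."""
    return [row[::factor] for row in image[::factor]]


# A 4 x 4 image whose pixel values are simply 0..15.
small = thin([[4 * y + x for x in range(4)] for y in range(4)])
```

Simple thinning discards pixels without averaging, so it is fast but can alias fine patterns. That trade-off is acceptable here because the result serves only as a form background.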

[0038] The character filling and the reduction may be performed in the reverse order, but if reduction is performed first, the positional misalignment between the binary image and the color image caused by the reduction must be taken into account.

[0039] With the method described above, a form image (primary electronic form) can be obtained from a color image produced by scanning a printed document, with the characters erased and only the background left. That is, the form image shown in FIG. 10 is obtained from the scan data of the printed document shown in FIG. 3. In this form image, the non-character portions of the original printed document (FIG. 3), such as ruled lines, textures, and images, remain as the background, and by writing new characters into the form image, an original document that uses the form image can be created.

[0040] Any places where layout recognition went wrong when creating the primary electronic form can be corrected with ordinary retouching software. For example, if character portions remain in the resulting form image, they can be erased by overwriting them with surrounding pixels using the pixel editing function of the retouching software. This is far easier than manually erasing all the characters in a scanned image of a printed document, and can therefore be considered sufficiently practical.

[0041] As described above, according to this embodiment, an image printed on paper (the original image) can be reused to generate a primary electronic form that makes use of its background. If it is enough for the user to treat the original image simply as a background picture, this primary electronic form can be handled as image data.

[0042] The data format of the original image 101 is not limited to RGB; the present invention is also applicable to other formats such as YUV.

[0043] Since the image data A output from the character filling unit 105 is already complete as a primary electronic form, the reduction and JPEG compression need not be performed when, for example, it is to be used directly.

[0044] <Second Embodiment>

A second embodiment of the present invention is described below.

[0045] In the second embodiment, the primary electronic form generated in the first embodiment, which consists of the background of the original image, is supplemented with information vectorized by attribute (line portions, table portions, frame portions, and so on) and registered as a secondary electronic form.

[0046] ● Apparatus configuration

FIG. 11 shows a block configuration for creating the secondary electronic form in the second embodiment. This configuration is provided following the configuration shown in FIG. 1 for the first embodiment. That is, the layout analysis result 108 and the compressed code X 109 of the primary electronic form, shown inside block 1A in FIGS. 1 and 11, are used when creating the secondary electronic form.

[0047] In FIG. 11, 1501 is a JPEG decompression unit that decompresses the compressed code X 109 to generate a primary electronic form image 1502. A rectangular frame display unit 1503 superimposes rectangular frames on the primary electronic form image 1502 based on the coordinate information of lines, frames, and tables in the layout analysis result 108. The rectangular frame display unit 1503 also lets the user select desired frames from among those displayed. The selected rectangular frames are line-vectorized by a line vectorization unit 1506, based on attribute information such as lines and frames in the layout analysis result 108, to generate line vector information 1507.

[0048] Meanwhile, 1504 is a line filling unit which, like the character filling unit 105 shown in FIG. 1, fills line regions with their surrounding pixels. In practice the line filling unit 1504 needs to refer to the binary image 103 (FIG. 1) used when creating the primary electronic form, but this is omitted from FIG. 11. A JPEG compression unit 1505 JPEG-compresses the image in which the line regions have been filled, leaving only the background, to create a compressed code Y 1508.

[0049] With the above configuration, the second embodiment registers, as the secondary electronic form, the line vector information 1507 for the desired line portions in the primary electronic form together with the compressed code Y 1508 of the background image obtained by erasing the line portions from the primary electronic form.

[0050] The process of creating the secondary electronic form in the second embodiment is described below with reference to the flowchart of FIG. 12.

[0051] First, in step S1201, the primary electronic form after JPEG decompression is displayed; that is, an image such as the one shown in FIG. 10 is displayed. Then, in step S1202, rectangles for the line, frame, and table attributes are displayed on the displayed image. In the form example of FIG. 10 only the frame attribute is detected, so only frame outlines are displayed, as shown in FIG. 13. In the example of FIG. 10, the rectangles are displayed together with the frame attribute information.

[0052] Next, in step S1203, the user freely selects the regions on which line vector conversion is to be performed. The processing of steps S1201 to S1203 is performed by the rectangular frame display unit 1503.

[0053] Then, in steps S1204 and S1205, line region filling is performed on the regions selected in step S1203, in the same way as the character region filling shown in FIG. 9. This erases the line drawings in the selected regions.

[0054] Then, in step S1206, line vectorization is performed on the selected regions using attribute information such as lines and frames, and the lines created (including frames and tables) are displayed on the form.

[0055] Here, the line attributes are line information attached to each line, frame, and table region during layout analysis, and include, for example, the following items (1) to (4).

[0056]
(1) Start and end positions of the line
(2) Line type (solid, broken, etc.)
(3) Line thickness
(4) Diagonal direction (rising to the right or rising to the left)

Strictly speaking, the line attributes are often not faithful to the original image. However, because the user makes post-corrections in step S1207, this causes little practical problem for the secondary electronic form that is finally created. Post-correction is processing for making arbitrary corrections, for example when the line vectorization was not faithful to the original image or when the user wants to make a desired change such as thickening a line. In other words, post-correction provides processing equivalent to the editing functions of an ordinary document creation application.
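The four line attributes listed above, together with a post-correction such as thickening a line, could be represented as in the following sketch. The class, field, and function names are hypothetical; the text specifies only which kinds of information are held, not how they are stored.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional, Tuple


class LineType(Enum):
    SOLID = "solid"
    BROKEN = "broken"


class Slant(Enum):
    RISING_RIGHT = "rising_right"
    RISING_LEFT = "rising_left"


@dataclass(frozen=True)
class LineAttribute:
    start: Tuple[int, int]                 # (1) start position (x, y)
    end: Tuple[int, int]                   # (1) end position (x, y)
    line_type: LineType = LineType.SOLID   # (2) line type
    thickness: int = 1                     # (3) thickness in pixels
    slant: Optional[Slant] = None          # (4) None for horizontal or vertical lines


def thicken(line, extra):
    """Post-correction example: return a copy of `line` that is `extra` pixels thicker."""
    return replace(line, thickness=line.thickness + extra)


ruled = LineAttribute(start=(10, 20), end=(200, 20))
bold = thicken(ruled, 2)
```

Keeping each line as an immutable record makes user corrections easy to apply and undo, since every edit produces a new record rather than altering the recognized one.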

[0057] Then, in step S1208, the form created through steps S1201 to S1207, by vectorizing the line portions desired by the user while making use of the background of the primary electronic form, is registered as the secondary electronic form. That is, the secondary electronic form can hold not only the background image of the primary electronic form (compressed code Y 1508) but also, as vector information (line vector information 1507), the line information that makes up frame regions and table regions.

[0058] The secondary electronic form consists of the background picture of the original image and line vectors, but if the background picture is not needed in the form, it can be erased to create an electronic form consisting only of line vectors.

[0059] As described above, according to this embodiment, a secondary electronic form in which desired line portions of the form are vectorized can be generated from the primary electronic form created in the first embodiment.

[0060] If it is enough for the user to treat the original image simply as a background picture, processing can end once the primary electronic form has been created, and that form can be used. If finer line regions are also to be used as part of the form, a secondary electronic form, which line vectorization brings closer to a true electronic form format, can be created.

[0061] If the primary electronic form need not be registered and only the secondary electronic form is to be created and registered, the image data A output from the character filling unit 105 shown in FIG. 1 may be used directly as the primary electronic form image 1502 shown in FIG. 11.

[0062] <Third Embodiment>

A third embodiment of the present invention is described below.

[0063] The first and second embodiments described examples of creating a primary electronic form, which is the background form of the original image, and a secondary electronic form that additionally includes line vectors. In the third embodiment, the entire secondary electronic form is further vectorized and registered as a tertiary electronic form.

[0064] ● Apparatus configuration

FIG. 14 shows a block configuration for creating the tertiary electronic form in the third embodiment. This configuration is provided following the configuration shown in FIG. 11 for the second embodiment. That is, the line vector information 1507 and the compressed code Y 1508 (i.e., the secondary electronic form), shown inside block 1B in FIGS. 11 and 14, are used when creating the tertiary electronic form.

[0065] In FIG. 14, 1509 is a JPEG decompression unit that decompresses the compressed code Y 1508 in the secondary electronic form to generate a secondary electronic form image 1510. A display unit 1511 displays the secondary electronic form based on the secondary electronic form image 1510 and the line vector information 1507 in the secondary electronic form.

[0066] Reference numeral 1512 denotes a filling unit which, for an area having the table attribute, fills its interior with the background color and also accepts line vector corrections from the user. The filling here is applied to an area enclosed by a frame, and either a fill with a color similar to the background color or a pattern fill is possible. Reference numeral 1513 denotes a background image erasing unit that erases the background image from the filled secondary electronic form to generate line vector information 1514.

[0067] In the third embodiment, with the above configuration, the line vector information 1514, obtained by filling the table portion of the secondary electronic form with the background color and then erasing the background image, is registered as the tertiary electronic form.

[0068] The process of creating the tertiary electronic form in the third embodiment is described below with reference to the flowchart of FIG. 15.

[0069] First, in step S1401, the secondary electronic form is displayed. FIG. 16(b) shows an example of this display. FIG. 16(a) shows a display example of the corresponding primary electronic form.

[0070] Next, in step S1402, the line vector information, in particular that having the table attribute, is corrected for layout misrecognition and edited as the user wishes, in the same manner as the line vector correction (S1207) described in the first embodiment.

[0071] Then, in step S1403, the background color of the table area is converted. That is, as shown in FIG. 16(c), the inside of the table area (the table frame and the table interior) is converted to (filled with) a color equivalent to its background color. As a result, in the secondary electronic form, not only the pixels in the table area of the background image data but also the color information held by the line vectors representing the table frame is converted to the equivalent of the background color.
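The conversion of step S1403 can be sketched as follows. This is a minimal illustration and not the disclosed implementation: the `LineVector` record, the function name `fill_table_area`, and the rule of taking the most frequent pixel value inside the table rectangle as the background color are assumptions made for the example. The text only states that a color equivalent to the background color is used.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple

Color = Tuple[int, int, int]


@dataclass
class LineVector:
    """Hypothetical line-vector record: two end points and a color."""
    x0: int
    y0: int
    x1: int
    y1: int
    color: Color


def fill_table_area(image: List[List[Color]],
                    vectors: List[LineVector],
                    rect: Tuple[int, int, int, int]) -> Color:
    """Fill rect = (left, top, right, bottom), right and bottom exclusive,
    with its dominant color and give the same color to vectors inside it."""
    left, top, right, bottom = rect
    # Assumption: the most frequent color in the rectangle is the background.
    counts = Counter(image[y][x]
                     for y in range(top, bottom)
                     for x in range(left, right))
    background = counts.most_common(1)[0][0]
    # Convert the pixels of the table frame and the table interior.
    for y in range(top, bottom):
        for x in range(left, right):
            image[y][x] = background
    # Convert the color information held by the line vectors of the table.
    for v in vectors:
        ends = ((v.x0, v.y0), (v.x1, v.y1))
        if all(left <= x < right and top <= y < bottom for x, y in ends):
            v.color = background
    return background
```

Vectors that lie outside the rectangle keep their original color, so only the table selected for conversion is affected.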

[0072] Finally, in step S1404, the background image is erased. This is because, for the tertiary electronic form, it suffices that only the color information of the background image remains on the table frame. FIG. 16(d) shows an example with the background image erased, that is, an example of the created tertiary electronic form. As the figure shows, the form consists only of line vectors.

[0073] As described above, according to the third embodiment, based on the secondary electronic form created in the second embodiment, the frame area is converted to the background color and the background image is then erased, creating a tertiary electronic form in which only the frame portion remains. Because the tertiary electronic form consists only of line vectors, its file size is small.

[0074] When only the tertiary electronic form is to be created, the secondary electronic form output from the line filling unit 1504 shown in FIG. 11 may be used directly as the secondary electronic form image 1510 shown in FIG. 14.

[0075] If the layout correction of step S1402 is not needed, the preceding display of the secondary electronic form and the correction processing (S1401, S1402) may be skipped, and the background color conversion of step S1403 may be performed directly.
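The overall flow of steps S1401 to S1404, including the optional skip described above, can be sketched as follows. This is a simplified illustration under stated assumptions: `SecondaryForm`, `create_tertiary_form`, the `correct` callback standing in for display and user correction, and the background color passed in as a parameter are all hypothetical and do not appear in the original text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Color = Tuple[int, int, int]
# Hypothetical line vector: (x0, y0, x1, y1, color).
LineVector = Tuple[int, int, int, int, Color]


@dataclass
class SecondaryForm:
    background: List[List[Color]]
    vectors: List[LineVector]


def create_tertiary_form(
    form: SecondaryForm,
    table_rect: Tuple[int, int, int, int],
    background_color: Color,
    correct: Optional[Callable[[List[LineVector]], List[LineVector]]] = None,
) -> List[LineVector]:
    """Return the line vectors that make up the tertiary electronic form."""
    left, top, right, bottom = table_rect  # right and bottom are exclusive
    vectors = list(form.vectors)
    # S1401/S1402: display and correction, skipped when no callback is given.
    if correct is not None:
        vectors = correct(vectors)
    # S1403: convert the table area of a working copy to the background color.
    working = [row[:] for row in form.background]
    for y in range(top, bottom):
        for x in range(left, right):
            working[y][x] = background_color
    recolored = []
    for x0, y0, x1, y1, color in vectors:
        inside = (left <= x0 < right and left <= x1 < right
                  and top <= y0 < bottom and top <= y1 < bottom)
        recolored.append((x0, y0, x1, y1, background_color if inside else color))
    # S1404: erase the background image; only the line vectors remain.
    del working
    return recolored
```

Calling the function without `correct` corresponds to going straight to the background color conversion, while passing a callback corresponds to the path through S1401 and S1402.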

[0076] <Other Embodiments> The present invention may be applied to a system composed of a plurality of devices (for example, a host computer, an interface device, a reader, and a printer) or to an apparatus consisting of a single device (for example, a stereoscopic image printer).

[0077] It goes without saying that the object of the present invention is also achieved by supplying a storage medium (or recording medium) on which software program code realizing the functions of the above embodiments is recorded to a system or apparatus, and having a computer (or CPU or MPU) of that system or apparatus read and execute the program code stored on the storage medium. In this case, the program code itself read from the storage medium realizes the functions of the above embodiments, and the storage medium storing the program code constitutes the present invention. It also goes without saying that the functions of the above embodiments are realized not only when the computer executes the program code it has read, but also when an operating system (OS) or the like running on the computer performs part or all of the actual processing based on the instructions of the program code, and that processing realizes the functions of the above embodiments.

[0078] It further goes without saying that the invention covers the case where the program code read from the storage medium is written into a memory provided on a function expansion card inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on that card or unit performs part or all of the actual processing based on the instructions of the program code, and that processing realizes the functions of the above embodiments.

[0079]

[Effects of the Invention] As described above, according to the present invention, an electronic form based on a printed document can be easily created.

[Brief description of drawings]

FIG. 1 is a diagram showing the block configuration of the image processing apparatus in the first embodiment.

FIG. 2 is a flowchart showing the character area detection process in the first embodiment.

FIG. 3 is an example of an original image illustrating a specific example of the character area detection process in the first embodiment.

FIG. 4 is a luminance histogram of the original image shown in FIG. 3.

FIG. 5 is a binary image created from the original image shown in FIG. 3.

FIG. 6 is a binary image of only the character areas in the original image shown in FIG. 3.

FIG. 7 is a diagram showing an example of character areas obtained by grouping the black pixels in the binary image shown in FIG. 6.

FIG. 8 is a diagram showing a specific example of the character portion filling process in the first embodiment.

FIG. 9 is a flowchart showing the character portion filling process in the first embodiment.

FIG. 10 is a diagram showing an example of the primary electronic form created in the first embodiment.

FIG. 11 is a diagram showing the block configuration for creating the secondary electronic form in the second embodiment.

FIG. 12 is a flowchart showing the secondary electronic form creation process in the second embodiment.

FIG. 13 is a display example of frames on the primary electronic form in the second embodiment.

FIG. 14 is a diagram showing the block configuration for creating the tertiary electronic form in the third embodiment.

FIG. 15 is a flowchart showing the tertiary electronic form creation process in the third embodiment.

FIG. 16 is a diagram showing a specific example of the tertiary electronic form creation process in the third embodiment.

[Explanation of symbols]

101 Original image
102 Binarization unit
103 Binary image
104 Layout analysis unit
105 Character filling unit
106 Reduction unit
107 JPEG compression unit
108 Layout analysis result
109 Compressed code X
1501 Vector decompression unit
1502 Primary electronic form image
1503 Rectangular frame display unit
1504 Line filling unit
1505 JPEG compression unit
1506 Line vectorization unit
1507 Line vector information
1508 Compressed code Y
1509 JPEG decompression unit
1510 Secondary electronic form image
1511 Display unit
1512 Line filling unit
1513 Background image erasing unit
1514 Line vector information

Continuation of front page

(51) Int.Cl.⁷: H04N 1/46; FI: H04N 1/46 Z; Theme code (reference): 5L096

F-terms (reference):
2C087 AA15 BA14 BB10 BD06 BD31 BD40 CA02
5B021 AA01 LA01
5C076 AA03 AA22 AA40 BA06
5C077 LL17 MP05 MP07 PP19 PP20 PP21 PP27 PP46 PP58 PP65 PP68 PQ08 PQ19 RR02 RR11 RR18 RR21
5C079 HB01 HB04 LA06 LA26 LA31 LA34 LA37 LA39 NA27 PA00
5L096 AA02 AA06 BA08 BA17 CA18 DA01 EA23 EA37 FA37 FA44 GA34 GA40

Claims

[Claims]

1. An image processing method for creating an electronic form based on a document image formed on a recording medium, comprising: an image input step of inputting multi-valued image data based on the document image formed on the recording medium; a binarizing step of binarizing the multi-valued image data to generate binary image data; an analyzing step of analyzing the layout of the document image based on the binary image data; and a character filling step of creating a first electronic form by filling a character portion in the multi-valued image data with its surrounding color based on the binary image data.

2. The image processing method according to claim 1, wherein the character filling step includes: a dividing step of dividing the multi-valued image data into blocks of a predetermined size; a character area detecting step of detecting a block including a character area based on the layout analysis result; and a replacing step of replacing, for the detected block, pixels of the character portion with pixels equivalent to their peripheral pixels, based on the corresponding region of the binary image data.

3. The image processing method according to claim 2, wherein the replacing step includes: an average color calculating step of calculating average color data of the multi-valued image data corresponding to white pixels of the binary image data in the block; and an average color replacing step of replacing the multi-valued image data corresponding to black pixels of the binary image data with the average color data.

4. The image processing method according to claim 1, wherein in the analyzing step, the character area information is created by grouping black pixels of the binary image data.

5. The image processing method according to claim 1, further comprising: a reduction step of reducing the first electronic form; and a compression step of compressing the reduced first electronic form.

6. The image processing method according to claim 1, further comprising an editing step of editing the first electronic form based on a user instruction.

7. The image processing method according to claim 1, further comprising: a first form displaying step of displaying the first electronic form with its line areas marked, based on the layout analysis result; a selection step of selecting an arbitrary line area on the displayed first electronic form based on a user instruction; a vectorization step of creating line vector information by vectorizing a line portion in the selected line area; and a line filling step of creating background image data by filling the line portion in the first electronic form with its surrounding color, based on the layout analysis result and the binary image data, wherein the line vector information and the background image data constitute a second electronic form.

8. The image processing method according to claim 7, wherein the line area is an area having any one of attributes of a line, a frame, and a table.

9. The image processing method according to claim 8, wherein in the form displaying step, the line area is displayed in a rectangular shape.

10. The image processing method according to claim 9, wherein in the form displaying step, a rectangle is displayed with attribute information of the line area.

11. The image processing method according to claim 7, further comprising a background removing step of deleting the background image data from the second electronic form.

12. The image processing method according to claim 7, further comprising: a table area filling step of filling a table area in the second electronic form, composed of the line vector information and the background image data, with a color similar to the background color; and a background deletion step of creating a third electronic form by deleting the background image data from the image data after the table area filling, wherein the third electronic form includes the line vector information constituting the table area.

13. The image processing method according to claim 12, wherein the line vector information forming the table area includes the background color information.

14. The image processing method according to claim 12, further comprising: a second form displaying step of displaying the second electronic form; and an editing step of editing an arbitrary line area on the displayed second electronic form based on a user instruction, wherein the frame area filling step is executed on the edited second electronic form.

15. An image processing apparatus for creating an electronic form based on a document image formed on a recording medium, comprising: image input means for inputting multi-valued image data based on the document image formed on the recording medium; binarizing means for binarizing the multi-valued image data to generate binary image data; analyzing means for analyzing the layout of the document image based on the binary image data; character filling means for filling a character portion in the multi-valued image data with its surrounding color based on the binary image data; and first holding means for holding the multi-valued image data after the character filling as a first electronic form.

16. The image processing apparatus according to claim 15, further comprising: first form displaying means for displaying the first electronic form with its line areas marked, based on the layout analysis result; selecting means for selecting an arbitrary line area on the displayed first electronic form based on a user instruction; vectorizing means for creating line vector information by vectorizing a line portion in the selected line area; line filling means for creating background image data by filling the line portion in the first electronic form with its surrounding color, based on the layout analysis result and the binary image data; and second holding means for holding the line vector information and the background image data as a second electronic form.

17. The image processing apparatus according to claim 16, further comprising: table area filling means for filling a table area in the second electronic form, composed of the line vector information and the background image data, with a color similar to the background color; background deleting means for extracting only the line vector information by deleting the background image data from the second electronic form after the table area is filled; and third holding means for holding the line vector information extracted by the background deleting means as a third electronic form, wherein the line vector information constituting the table area in the third electronic form includes the background color information.

18. A program that, when executed by a computer, carries out the image processing method according to claim 1.

19. A recording medium on which the program according to claim 18 is recorded.
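
The average color replacement recited in claims 2 and 3 can be sketched for a single block as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the function name `fill_characters_in_block`, the encoding of the binary image as 0 for white and 1 for black, the integer averaging, and the handling of a block with no white pixels are hypothetical choices. Dividing the image into blocks and detecting the blocks that contain characters are assumed to have been done already.

```python
from typing import List, Optional, Tuple

Color = Tuple[int, int, int]


def fill_characters_in_block(
    pixels: List[List[Color]],
    binary: List[List[int]],
    block: Tuple[int, int, int, int],
) -> Optional[Color]:
    """Replace character pixels in block = (left, top, right, bottom),
    right and bottom exclusive, with the block's average background color.

    binary uses 0 for white (background) and 1 for black (character).
    pixels is modified in place; the average color is returned, or None
    when the block has no white pixel to average (an assumed fallback).
    """
    left, top, right, bottom = block
    # Average color calculation: multi-valued pixels under white binary pixels.
    white = [pixels[y][x]
             for y in range(top, bottom)
             for x in range(left, right)
             if binary[y][x] == 0]
    if not white:
        return None
    n = len(white)
    average = (sum(c[0] for c in white) // n,
               sum(c[1] for c in white) // n,
               sum(c[2] for c in white) // n)
    # Average color replacement: multi-valued pixels under black binary pixels.
    for y in range(top, bottom):
        for x in range(left, right):
            if binary[y][x] == 1:
                pixels[y][x] = average
    return average
```

Working block by block keeps the replacement color local, so a character printed on a tinted cell is filled with that cell's tint instead of a page-wide average.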