JPH0528258A

JPH0528258A - Form input device with character/graphic separation device

Info

Publication number: JPH0528258A
Application number: JP3203680A
Authority: JP
Inventors: Noboru Shimizu; 昇清水; Katsuhiko Itonori; 勝彦糸乗
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1991-07-19
Filing date: 1991-07-19
Publication date: 1993-02-05

Abstract

PURPOSE: To allow a form in which characters and figures are mixed to be input easily, without the operator having to erase the character parts, by providing form processing means that performs fair-copy processing on the separated figures and converts them into form information. CONSTITUTION: An image input unit 1 digitally inputs a document image in which characters and figures are mixed, and an image memory 2 stores the input digital data. A character/graphic separation unit 3 processes the original image stored in the image memory 2, in which ruled-line figures and characters are mixed, to obtain an image consisting only of characters and an image consisting only of figures. A form image memory 4 stores the figure image separated by the character/graphic separation unit 3. A form processing unit 5 takes the figure image stored in the form image memory 4, encodes (vectorizes) the form for input to a computer, performs fair-copy processing on the vectorized data, and converts the result into a computer document.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a form input device that separates characters and figures from a document image in which characters and figures (the form part) are mixed, and inputs only the figure part as a fair copy.

[0002]

[Prior Art] When a form such as a slip is input to a computer (a word processor, etc.), the operator generally enters it through an input device such as a keyboard or mouse. In recent years, to eliminate the tedium of keyboard operation, research has been conducted on form input in which a document printed on paper, or a carefully handwritten document, is captured as an image and the line portions of the form are vectorized (for example, IEICE Technical Report PRU87-35).

[0003]

[Problems to Be Solved by the Invention] However, because these methods process images that contain only line portions, inputting an ordinary form document in which characters and figures (the form part) are mixed first required the operator to erase the character portions. Moreover, because form input handled only straight lines, it was limited to printed matter or neatly handwritten documents; roughly handwritten documents (forms) could not be input. The present invention has been made to solve these problems of the prior art. That is, its object is to provide a form input device that allows a form to be input easily, even when characters and figures are mixed in it, without the operator having to erase the character portions.

[0004]

[Means for Solving the Problems] To achieve the above object, the present invention is characterized by comprising image input means (1 in FIG. 1) for inputting a document image as digital data, an image memory (2 in FIG. 1) for storing the input image, character/graphic separation means (3 in FIG. 1) for separating the figures and characters in the image, and form processing means (5 in FIG. 1) for performing fair-copy processing on the separated figures and converting them into form information.

[0005]

[Operation] An image containing the figures and characters to be used as form information (for example, FIG. 2) is read by the image input means (1) and stored in the image memory (2). The stored image (for example, FIG. 2) is separated by the character/graphic separation means (3) into characters (for example, FIG. 3) and figures (for example, FIG. 4). The separated figures are then subjected to fair-copy processing by the form processing means (5) and converted into form information. According to the present invention, performing character/graphic separation before form processing makes it possible to extract only the form part and input it to an application device such as a computer or word processor, even for forms containing characters, which existing form input could not handle. In addition, because fair-copy processing is applied to the extracted form part, roughly handwritten forms can also be input.

[0006]

[Embodiment] As shown in FIG. 1, a form input device according to one embodiment of the present invention comprises an image input unit 1 that digitally inputs a form document, an image memory 2 that stores the input image, a character/graphic separation unit 3 that separates the image into a characters-only image and a figures-only image, a form image memory 4 that stores the figures-only (form) image separated by the character/graphic separation unit 3, and a form processing unit 5 that vectorizes the figures-only (form) image and converts it into a document format.

[0007]

More specifically, the form processing unit 5 consists of a vectorization unit 51 that vectorizes the line segments in the figures-only (form) image, a fair-copy processing unit 52 that applies shaping to the vectorized image, and an application processing unit 53 that converts the result into a document format for input to a computer (a word processor, etc.).
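The data flow between the units of FIG. 1 can be sketched roughly as below. This is only an illustration under stated assumptions: the class names, the callable signatures, and the string returned by `export` are hypothetical and do not appear in the patent; only the ordering of the stages follows the description.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Image = List[List[int]]  # binary image, 1 = black pixel (assumed representation)
Vector = Tuple[Tuple[float, float], Tuple[float, float]]


@dataclass
class FormProcessor:
    """Stand-in for form processing unit 5 (hypothetical structure)."""
    vectorize: Callable[[Image], List[Vector]]         # vectorization unit 51
    fair_copy: Callable[[List[Vector]], List[Vector]]  # fair-copy processing unit 52
    export: Callable[[List[Vector]], str]              # application processing unit 53

    def run(self, form_image: Image) -> str:
        return self.export(self.fair_copy(self.vectorize(form_image)))


@dataclass
class FormInputDevice:
    """Stand-in for the whole device of FIG. 1 (hypothetical structure)."""
    separate: Callable[[Image], Tuple[Image, Image]]   # unit 3: (characters, figures)
    processor: FormProcessor

    def run(self, scanned: Image) -> str:
        image_memory = scanned                          # image memory 2
        _characters, figures = self.separate(image_memory)
        form_image_memory = figures                     # form image memory 4
        return self.processor.run(form_image_memory)


# Stub stages, only to show that the stages chain together.
device = FormInputDevice(
    separate=lambda img: (img, img),
    processor=FormProcessor(
        vectorize=lambda img: [((0.0, 0.0), (1.0, 0.0))],
        fair_copy=lambda vs: vs,
        export=lambda vs: f"{len(vs)} line(s)",
    ),
)
result = device.run([[1, 1]])
```

Passing the stages in as callables mirrors the statement in the description that any suitable separation or vectorization method may be chosen.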

[0008]

The operation of this embodiment, configured as described above, is explained below. The image input unit 1 digitally inputs a document image in which characters and figures (the form) are mixed. FIG. 2 shows an example of such a document image (original image). The image memory 2 stores the digital data input by the image input unit 1. The character/graphic separation unit 3 processes the original image stored in the image memory 2, in which ruled-line figures (the form) and characters are mixed as illustrated in FIG. 2, to obtain a characters-only image as shown in FIG. 3 and a figures-only (form) image as shown in FIG. 4.

Various character/graphic separation methods have been proposed, and any one of them may be selected and adopted. The choice depends on the type of document to be input. For example, when the target is a form document composed only of vertical and horizontal lines, the method described in 岩城 et al., "A Study on the Introduction of a Production System in Character/Figure Separation Processing," Technical Report of the Institute of Electronics and Communication Engineers PRL83-63, pp. 67-74, can be used. When the ruled lines consist of dotted lines or the like, the method of Japanese Patent Application Laid-Open No. Sho 2-159690 can be used, for example.

The form image memory 4 stores the figure (form) image separated by the character/graphic separation unit 3. The figure image for the original image of FIG. 2 is as shown in FIG. 4.

[0009]

The form processing unit 5 takes the figure (form) image stored in the form image memory 4, encodes (vectorizes) the form for input to a computer (a word processor, etc.), applies fair-copy (shaping) processing to the vectorized result, and converts it into a computer document. Specifically, the vectorization unit 51 first vectorizes the form image by finding the two end points of each line segment. For this vectorization, the methods described in documents such as IEICE Technical Reports PRL83-8, PRL85-24, and PRU86-89 and Japanese Patent Application Laid-Open No. Hei 2-210586 can be used. The method described in Japanese Patent Application Laid-Open No. Hei 2-105265 (title of the invention: "Image Data Vector Conversion Device"), filed earlier by the present applicant, is preferable because it can perform fair-copy processing together with vectorization.

[0010]

FIG. 5 shows an example configuration for vectorization using the above "Image Data Vector Conversion Device". This vectorization unit 51 can convert image data into vector data by simple processing based mainly on scanning. As shown in FIG. 5, it scans the binary image stored in the form image memory 4 in two orthogonal directions (here, the X-axis and Y-axis directions) and performs predetermined processing. Scanning and processing in the X-axis direction are performed by the X-axis direction scanning unit 511, the continuous black pixel counting unit 512, the black pixel centroid extraction unit 513, and the centroid connecting unit 514. Scanning and processing in the Y-axis direction are performed by the Y-axis direction scanning unit 515, the continuous black pixel counting unit 516, the black pixel centroid extraction unit 517, and the centroid connecting unit 518. The results of each process are shaped by the vector shaping unit 52.

Processing in the Y-axis direction and in the X-axis direction is substantially the same, differing only in scanning direction, so the Y-axis direction is described here as an example. The Y-axis direction scanning unit 515 does not scan pixel by pixel; it skips several pixels between scans. The skip width, that is, the scan line spacing S, can be set to any value.
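As a minimal sketch of the interlaced scan described above, the following generator picks out every S-th scan line in either direction. The image representation (a list of rows with 1 for black), the function name `scan_lines`, and the convention that a Y-direction scan line is a column are assumptions made for illustration; the patent does not fix them.

```python
from typing import Iterator, List, Tuple

Image = List[List[int]]  # binary image indexed as image[y][x], 1 = black (assumed)


def scan_lines(image: Image, spacing: int, axis: str) -> Iterator[Tuple[int, List[int]]]:
    """Yield (position, pixels) for every `spacing`-th scan line.

    axis="x": scan lines run along X (rows), taken at y = 0, S, 2S, ...
    axis="y": scan lines run along Y (columns), taken at x = 0, S, 2S, ...
    """
    if spacing < 1:
        raise ValueError("spacing must be at least 1")
    height = len(image)
    width = len(image[0]) if height else 0
    if axis == "x":
        for y in range(0, height, spacing):
            yield y, list(image[y])
    elif axis == "y":
        for x in range(0, width, spacing):
            yield x, [image[y][x] for y in range(height)]
    else:
        raise ValueError("axis must be 'x' or 'y'")


image = [
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
rows = list(scan_lines(image, 2, "x"))  # scan line spacing S = 2
cols = list(scan_lines(image, 2, "y"))
```

A larger spacing means fewer scan lines to process, at the cost of sampling each stroke at fewer points.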

[0011]

The continuous black pixel counting unit 516 counts, while scanning, how many black pixels occur consecutively. Based on that count, the black pixel centroid extraction unit 517 extracts the centroid of each run of consecutive black pixels. The centroid connecting unit 518 connects the black pixel centroids extracted by the black pixel centroid extraction unit 517 to form vectors. A fixed distance is set in advance; if the distance between two black pixel centroids is smaller than that distance, the two are connected to form a vector, and if it is larger, they are not connected.

FIG. 6 illustrates how extracted centroids are connected to form vectors. The distance between black pixel centroid A and black pixel centroid C is larger than the fixed distance, so they are not connected. Consequently, the vector V3 shown by the dash-dot line is not formed (although vectors V1 and V2 may be shaped into the form of vector V3 at the shaping stage of the fair-copy processing unit 52, described later). Similarly, black pixel centroids D and E are connected to form vector V4, but black pixel centroids E and F are not connected, so vector V5 is not formed.
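The run counting, centroid extraction, and distance-limited linking described above might look like the sketch below. It assumes that the centroid of a run is its midpoint along the scan line and that only centroids on neighbouring scan lines are candidates for linking; the function names and the `max_dist` parameter are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Vector = Tuple[Point, Point]


def run_centroids(pixels: List[int]) -> List[float]:
    """Return the centre position of each run of consecutive black (1) pixels."""
    centroids = []
    start = None
    for i, p in enumerate(pixels + [0]):  # trailing 0 closes a run at the line end
        if p and start is None:
            start = i
        elif not p and start is not None:
            length = i - start            # the continuous black pixel count
            centroids.append(start + (length - 1) / 2)
            start = None
    return centroids


def link_centroids(prev: List[Point], curr: List[Point], max_dist: float) -> List[Vector]:
    """Connect centroids on neighbouring scan lines that are closer than max_dist."""
    return [(a, b) for a in prev for b in curr if math.dist(a, b) < max_dist]


# Two columns sampled at x = 0 and x = 4 (scan line spacing S = 4).
col0 = [0, 1, 1, 1, 0, 0]  # run at y = 1..3
col4 = [0, 0, 1, 1, 0, 0]  # run at y = 2..3
pts0 = [(0.0, y) for y in run_centroids(col0)]
pts4 = [(4.0, y) for y in run_centroids(col4)]
vectors = link_centroids(pts0, pts4, max_dist=5.0)
```

With these inputs the two centroids are about 4.03 apart, so they are linked; lowering `max_dist` below that value leaves them unconnected, as with centroids A and C in FIG. 6.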

[0012]

The fair-copy processing unit 52 shapes a roughly handwritten form by joining vectors, bringing vectors into contact, deleting erroneous vectors, and correcting inclination. FIG. 7 shows various cases of vector shaping performed by the fair-copy processing unit 52. V6 to V16 are vectors.

(a) Deletion of short vectors. When the length of a vector is at or below a certain threshold, it is judged to be an erroneous vector produced by an error in centroid extraction, and it is deleted.

(b) Joining of vectors. "Joining" is the process in which, when the distance between the end points of two vectors is smaller than a predetermined distance and the angle formed by the two vectors is smaller than a predetermined angle, the two vectors are represented by a single vector connecting their mutually distant end points. FIG. 7A shows a specific example. When the distance H between the nearer end points of vectors V6 and V7 is smaller than the predetermined distance HT, and the angle θ formed by the two vectors is smaller than the predetermined angle θT, the two vectors V6 and V7 are shaped into a single vector V8 connecting their mutually distant end points.

(c) Contact of vectors. "Contact" is the process in which, when the distance from either end point of one vector to the other vector is smaller than a predetermined distance, that end point is shaped so as to touch the other vector. FIGS. 7B and 7C show specific examples. FIG. 7B shows the case where vectors V9 and V10 intersect; (a1, b1) through (a4, b4) are the coordinates of the end points of the vectors. As shown in the left diagram of FIG. 7B, the end of vector V10 protrudes by K beyond the intersection point. If K is smaller than the predetermined distance, the protruding portion K is cut off, as in the right diagram of FIG. 7B, giving vector V11, which is shaped so that it just touches vector V9. The left diagram of FIG. 7C shows the case where vectors V9 and V10 are separated by a distance M; vector V10 is extended by the distance M to become vector V11, which is brought into contact with vector V9. The coordinates (c1, c2) of the contact point are obtained in the same way as in the case of FIG. 7B.

FIG. 7D depicts joining and contact being performed together. The left diagram shows the state before joining and contact, and the right diagram shows the state afterward. The vector obtained by joining V12 and V13 is brought into contact with vector V14. The resulting vectors are V15 and V16. Through the shaping performed by the fair-copy processing unit 52 as described above, erroneous vectors are deleted and breaks between vectors are repaired.
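The three shaping operations described above (short-vector deletion, joining, and contact) can be sketched as follows. All function names and threshold parameters are hypothetical. As a simplification, `touch` measures the overshoot or gap along the vector to the intersection of the two infinite lines and does not check that the intersection lies within the other segment.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Vector = Tuple[Point, Point]


def drop_short(vectors: List[Vector], min_len: float) -> List[Vector]:
    """(a) Delete vectors whose length is at or below the threshold."""
    return [v for v in vectors if math.dist(v[0], v[1]) > min_len]


def angle_between(v: Vector, w: Vector) -> float:
    """Angle in degrees between two undirected segments, in [0, 90]."""
    a = math.atan2(v[1][1] - v[0][1], v[1][0] - v[0][0])
    b = math.atan2(w[1][1] - w[0][1], w[1][0] - w[0][0])
    d = abs(a - b) % math.pi
    return math.degrees(min(d, math.pi - d))


def join(v: Vector, w: Vector, max_gap: float, max_angle: float) -> Optional[Vector]:
    """(b) Merge two nearly collinear vectors whose nearest end points are close."""
    pairs = [(math.dist(p, q), i, j) for i, p in enumerate(v) for j, q in enumerate(w)]
    gap, i, j = min(pairs)                # nearest pair of end points
    if gap < max_gap and angle_between(v, w) < max_angle:
        return (v[1 - i], w[1 - j])       # connect the end points that are far apart
    return None


def line_intersection(v: Vector, w: Vector) -> Optional[Point]:
    """Intersection of the infinite lines through v and w, or None if parallel."""
    (x1, y1), (x2, y2) = v
    (x3, y3), (x4, y4) = w
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(den) < 1e-12:
        return None
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)


def touch(v: Vector, w: Vector, max_dist: float) -> Vector:
    """(c) Trim or extend the nearer end of v so that it just touches w."""
    p = line_intersection(v, w)
    if p is None:
        return v
    ends = list(v)
    i = min((0, 1), key=lambda k: math.dist(ends[k], p))
    if math.dist(ends[i], p) < max_dist:  # overshoot K or gap M is small enough
        ends[i] = p
    return (ends[0], ends[1])


kept = drop_short([((0, 0), (0.5, 0)), ((0, 0), (4, 0))], min_len=1.0)
joined = join(((0, 0), (4, 0)), ((4.5, 0.2), (9, 0.3)), max_gap=1.0, max_angle=10.0)
trimmed = touch(((5, -0.5), (5, 8)), ((0, 0), (10, 0)), max_dist=1.0)
extended = touch(((5, 0.7), (5, 8)), ((0, 0), (10, 0)), max_dist=1.0)
```

Treating trimming and extension as the same operation (move the end point to the intersection) reflects the remark that the contact point in FIG. 7C is obtained in the same way as in FIG. 7B.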

[0013]

Viewed schematically over a whole figure, the fair-copy process described above looks like FIG. 8. That is, applying straight-line approximation to a line-figure image such as FIG. 8A yields a result such as FIG. 8B. Applying shaping to this result to correct misaligned end points and intersections yields a result such as FIG. 8C. When a printed document or a neatly handwritten form is input, the fair-copy processing unit 52 may detect this and skip this function.

[0014]

The application processing unit 53 converts the vectorized form into a computer document or into the input format of each application device (application). In this way, a document form drawn on paper can be input as a form document that a computer can handle.
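The patent does not name a target document format. Purely as an illustration of the conversion step, the sketch below serializes the shaped vectors as SVG line elements; SVG is a stand-in chosen for this example, and the function name `vectors_to_svg` is hypothetical.

```python
from typing import List, Tuple

Vector = Tuple[Tuple[float, float], Tuple[float, float]]


def vectors_to_svg(vectors: List[Vector], width: int, height: int) -> str:
    """Serialize ruled lines as SVG <line> elements (stand-in target format)."""
    lines = [
        f'  <line x1="{x1:g}" y1="{y1:g}" x2="{x2:g}" y2="{y2:g}" stroke="black"/>'
        for (x1, y1), (x2, y2) in vectors
    ]
    header = f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
    return "\n".join([header, *lines, "</svg>"])


svg = vectors_to_svg([((0, 0), (100, 0)), ((0, 0), (0, 50))], 100, 50)
```

Any format that can express line segments by their end points would serve equally well, since that is all the vectorized form contains at this stage.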

[0015]

In the embodiment described above, only the figure (form) image from the original image is the target of form input. However, if the characters-only image is also retained and character recognition is performed on it, a form can be input together with its characters. Adding this character recognition input also makes the meaning of each row and column of the form known, which facilitates later use by applications. For example, once the destination column of a form is identified, statistics by destination can be compiled.

[0016]

[Effects of the Invention] According to the present invention, performing character/graphic separation before form processing makes it possible to extract the form part automatically and input it to an application device such as a computer or word processor, even for forms containing characters, which existing form input could not handle. In addition, because fair-copy processing is applied to the extracted form part, roughly handwritten forms can also be input. The present invention therefore has the advantage that document forms can be input to an application device very easily.

[Brief description of drawings]

FIG. 1 is a block diagram of one embodiment of the present invention.

FIG. 2 shows an example of an image of an input document having a form in which characters and figures are mixed.

FIG. 3 shows an example of the character image separated from the image of FIG. 2 by character/graphic separation processing.

FIG. 4 shows an example of the figure (form) image separated from the image of FIG. 2 by character/graphic separation processing.

FIG. 5 shows an example configuration of the vectorization unit and the fair-copy processing unit.

FIG. 6 illustrates how extracted centroids are connected to form vectors.

FIGS. 7A to 7D show various cases of vector shaping performed by the fair-copy processing unit.

FIGS. 8A to 8C show the fair-copy process.

[Explanation of symbols]

1: image input unit, 2: image memory, 3: character/graphic separation unit, 4: form image memory, 5: form processing unit, 51: vectorization unit, 52: fair-copy processing unit, 53: application processing unit.

Claims

What is claimed is: 1. A form input device with a character/graphic separation device, comprising: image input means for inputting a document image as digital data; an image memory for storing the input image; character/graphic separation means for separating the figures and characters in the image; and form processing means for performing fair-copy processing on the separated figures and converting them into form information.