JP6798309B2

JP6798309B2 - Image processing apparatus, image processing method, and program

Info

Publication number: JP6798309B2
Application number: JP2016255941A
Authority: JP
Inventors: 孝子四條
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2016-03-18
Filing date: 2016-12-28
Publication date: 2020-12-09
Anticipated expiration: 2036-12-28
Also published as: JP2017175600A

Description

The present invention relates to an image processing apparatus, an image processing method, and a program.

Conventionally, a technique has been known that, when processing such as compression is applied to an image, reduces the file size by accurately determining character areas and raising the compression ratio for documents that contain only characters. Such a technique is described in, for example, Patent Document 1.
A technique is also known that extracts vertical and horizontal ruled lines from a form, extracts cells based on the extracted ruled lines, and performs character recognition for each cell. Such a technique is described in, for example, Patent Document 2.

Patent Document 3 discloses a configuration that detects the background level of an image obtained by digitally scanning a document, then detects character edges in the image using a plurality of parameters, and changes those parameters according to the background level.

However, when image area separation, such as distinguishing a character area from other areas, is attempted with the conventional methods described above, a problem arises: if the method is applied to an image scanned from a paper document with a high background level, such as a newspaper, a character area is erroneously detected as a picture area. This problem of erroneous detection of character areas is not sufficiently solved even by the technique of Patent Document 3.

An object of the present invention is to solve this problem and to enable image area separation to be performed accurately. The areas to be detected in an image are not limited to character areas.

To achieve the above object, the image processing apparatus of the present invention includes: input means for inputting image data; extraction means for extracting a predetermined feature element from the image data input by the input means; and attribute setting means that refers to a determination condition predefining, for the predetermined feature element, a condition the feature element must satisfy and the attribute of an area determined with reference to a feature element in the image data that satisfies that condition, and that, based on the feature element extracted by the extraction means and the determination condition, sets the attribute specified by the determination condition to the area of the input image data specified by the determination condition. The extraction of a feature element by the extraction means, and the setting of an attribute to the input image data by the attribute setting means with reference to the determination condition, are each performed for a plurality of feature elements. The attribute setting means aggregates the attributes set according to the respective feature elements over the plurality of feature elements, and thereby sets the attribute of each part of the input image data.
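The aggregation step described above can be illustrated with a minimal Python sketch. The text here does not state how the per-feature results are combined, so the majority vote used below is purely an assumption, as are the function name `aggregate_attributes` and the cell-to-attribute dictionaries.

```python
from collections import Counter

def aggregate_attributes(per_feature_results):
    """Combine per-feature attribute maps into one attribute per cell.

    per_feature_results: list of dicts, one per feature element, each mapping
    a cell (x, y) to an attribute string (hypothetical representation).
    The majority-vote rule is an assumption made for this sketch.
    """
    votes = {}
    for result in per_feature_results:
        for cell, attribute in result.items():
            votes.setdefault(cell, Counter())[attribute] += 1
    # Keep the attribute that received the most votes for each cell.
    return {cell: counter.most_common(1)[0][0] for cell, counter in votes.items()}

# Results set independently from three kinds of feature element.
ruled_line = {(0, 0): "character", (1, 0): "character"}
background_rule = {(0, 0): "character", (1, 0): "picture"}
specific_char = {(1, 0): "character"}
final = aggregate_attributes([ruled_line, background_rule, specific_char])
```

Any other combination rule (for example, a union of character areas) could be substituted without changing the overall structure.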

According to the above configuration, image area separation can be performed with high accuracy.

A diagram showing the hardware configuration of an MFP that is an embodiment of the image processing apparatus according to the present invention.
A diagram showing the functional configuration of the image processing unit 200 included in the MFP shown in FIG. 1.
A diagram showing the functional configuration of the scanner correction unit 210 shown in FIG. 2.
A diagram showing in detail the functional configuration of the character area determination unit 212 shown in FIG. 3.
A basic flowchart of the image processing performed by the image processing unit 200 shown in FIG. 2.
A flowchart of the more detailed processing procedure of the character area determination (S11) shown in FIG. 5.
A flowchart continuing from FIG. 6.
A diagram for explaining an example in which the area setting process for ruled lines is applied to image data read from a sheet of paper.
Another diagram for the same explanation.
Yet another diagram for the same explanation.
A diagram for explaining an example in which the area setting process for background ruled lines is applied to the same image data as in FIG. 8A.
Another diagram for the same explanation.
A diagram for explaining the area setting process for specific characters.
A diagram for explaining a method of extracting background ruled lines using the determination conditions for ruled lines.
Another diagram for the same explanation.
A diagram for explaining a method of extracting background ruled line candidates from the result of background extraction.
Another diagram for the same explanation.
A diagram for explaining a method of aggregating the area setting results based on a plurality of feature elements to finally fix the character areas.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.
FIG. 1 shows the hardware configuration of an MFP (digital multifunction peripheral) that is an embodiment of the image processing apparatus according to the present invention.
The MFP 100 shown in FIG. 1 is an image processing apparatus having functions such as scanning, printing, copying, facsimile communication, and document storage.
As shown in FIG. 1, the MFP 100 includes a CPU 101, a ROM 102, a RAM 103, an HDD (hard disk drive) 104, a communication I/F (interface) 105, an operation unit 106, a display unit 107, and an engine I/F 108. These are connected by a system bus 109 to form the MFP 100. A scanner 120, a plotter 130, and an image processing unit 200 are connected to the engine I/F 108.

The CPU 101 controls the entire MFP 100 and realizes various functions by executing required programs stored in the ROM 102 or the HDD 104, using the RAM 103 as a work area.
The ROM 102 and the HDD 104 are non-volatile storage media (storage means) and store the various programs executed by the CPU 101 and the various data described later.
The communication I/F 105 is an interface for communicating with external devices. An interface conforming to the standard of the communication path to be used may be provided.

The operation unit 106 is operation means for accepting operations from the user, and consists of various keys, buttons, a touch panel, and the like.
The display unit 107 is presentation means for presenting the operating state of the MFP 100, its settings, messages, and the like to the user, and includes a liquid crystal display, lamps, and the like.

The operation unit 106 and the display unit 107 may be external. The MFP 100 also need not accept operations directly from the user: it may accept operations from an external device connected via the communication I/F 105, and may present information to that external device. In that case, the operation unit 106 and the display unit 107 may be omitted.

The engine I/F 108 is an interface that connects the scanner 120, the plotter 130, and the image processing unit 200 to the system bus 109 so that they can be controlled by the CPU 101.
The scanner 120 is an image reading device that reads the image of a document and outputs its image data; it sends the read image data to the image processing unit 200.
The image processing unit 200 is image processing means. It determines the document type of the document read by the scanner 120, selects, according to the document type, a compression ratio for the image resolution suited to the document, compresses the image data according to that compression ratio, and outputs compressed image data.
The plotter 130 is image forming means that forms an image on paper according to image data, and can form an image based on the compressed image data output by the image processing unit 200.

FIG. 2 shows the functional configuration of the image processing unit 200. The function of each unit shown in FIG. 2 may be realized by dedicated hardware, by having a processor execute software, or by a combination of the two. The CPU 101 may also take on some of the functions.
As shown in FIG. 2, the image processing unit 200 includes a scanner correction unit 210, a compression processing unit 220, a data interface unit 230, a decompression processing unit 240, and a printer correction unit 250.

Of these, the scanner correction unit 210 has a function of classifying the image data read by the scanner 120 and applying image processing to it. Details of this function are described later with reference to FIG. 3.
The compression processing unit 220 has a function of compressing, for internal processing, the image data corrected by the scanner correction unit 210.
The data interface unit 230 is an HDD management interface used when the image data compressed by the compression processing unit 220 is temporarily stored in the HDD 104.

The decompression processing unit 240 has a function of decompressing the compressed image data for internal processing so that it can be used for image formation by the plotter 130.
The printer correction unit 250 has a function of applying corrections as necessary to the image data decompressed by the decompression processing unit 240 and then sending it to the plotter 130.

Next, FIG. 3 shows the functional configuration of the scanner correction unit 210 shown in FIG. 2.
As shown in FIG. 3, the scanner correction unit 210 includes a document type determination unit 211, a character area determination unit 212, an image area separation unit 213, a scanner γ unit 214, a filter processing unit 215, a color correction processing unit 216, and a character γ unit 217.
Of these, the document type determination unit 211 has a function of determining, from the features of documents containing characters, the features of chromatic documents, the features of photographic-paper photographs, and the features of printed photographs, whether the image data to be processed is image data of a character-only document or image data of a color document.

The character area determination unit 212 has a function of determining, from the features of the image, the character areas in the image represented by the image data. It serves to make determinations for image data that the document type determination unit 211 cannot fully determine, and to make re-determinations in order to prevent erroneous determination.
The image area separation unit 213 has a function of separating the image data to be processed into three areas, namely a black character edge area, a color character edge area, and other areas, while referring to the character area determination result from the character area determination unit 212.

The scanner γ unit 214 has a function of converting image data from reflectance-linear data to density-linear data.
The filter processing unit 215 has a function of switching among a plurality of filters and applying different filter processing to each of the three areas of the image data to be processed that were separated by the image area separation unit 213.
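As an illustration of the reflectance-linear to density-linear conversion attributed to the scanner γ unit 214, the following Python sketch builds a lookup table from the standard optical density relation D = -log10(R). The 8-bit sample range, the clipping density of 2.0, and the function name are assumptions for this sketch; the text does not specify the actual conversion curve.

```python
import math

def reflectance_to_density(value, max_value=255, max_density=2.0):
    """Map a reflectance-linear sample to a density-linear sample.

    Assumes 8-bit samples and clips densities above max_density
    (both are illustrative choices, not taken from the text).
    """
    reflectance = max(value, 1) / max_value   # avoid log10(0)
    density = -math.log10(reflectance)        # optical density D = -log10(R)
    density = min(density, max_density)
    return round(density / max_density * max_value)

# Precompute a lookup table so that conversion is one table access per sample.
lut = [reflectance_to_density(v) for v in range(256)]
```

In this mapping a bright (high reflectance) sample yields a low density value and a dark sample yields a high one.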

The color correction processing unit 216 is means for converting RGB image data into CMYK image data in areas other than the black character edge area.
The character γ unit 217 has a function of steepening the γ curve for color characters and black characters in character areas, thereby emphasizing the characters. The image data processed by the character γ unit 217 is supplied to the compression processing unit 220.

Next, FIG. 4 shows the functional configuration of the character area determination unit 212 shown in FIG. 3 in more detail.
The character area determination unit 212 is composed of a ruled line area determination unit 261, a background ruled line area determination unit 262, a specific character area determination unit 263, a determination condition acquisition unit 264, a determination condition 265, a range extraction unit 266, and a determination result setting unit 267.
The ruled line area determination unit 261 has a function of setting areas and attributes in the image data based on ruled lines when the determination condition acquired by the determination condition acquisition unit 264 contains an item relating to ruled lines.

For example, the ruled line area determination unit 261 extracts vertical and horizontal ruled lines from form data, or from image data read from a form printed on paper, and calculates, for each extracted ruled line, the parameters to be compared with the determination conditions. It then collates the parameters of each extracted ruled line with the determination conditions and, for each ruled line that matches, sets an area and its attribute in accordance with the determination condition. Possible attributes for an area include character area and picture area.

The background ruled line area determination unit 262 has a function of setting areas and attributes in the image data based on background ruled lines when the determination condition acquired by the determination condition acquisition unit 264 contains an item relating to background ruled lines. A background ruled line is a ruled line formed by a strip of blank background rather than by a drawn line.

For example, the background ruled line area determination unit 262 extracts background areas from form data, or from image data read from a form printed on paper, and calculates, for each extracted background area, the parameters to be compared with the determination conditions. It may instead extract, from the start, only elongated areas that could be background ruled line candidates. It then collates the parameters of each extracted background area with the determination conditions, treats each matching background area as a background ruled line, and sets an area and its attribute for each background ruled line in accordance with the determination condition. The method of extracting background areas is described later with reference to FIGS. 12A and 12B.
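The option of keeping only elongated background areas as background ruled line candidates might look like the following Python sketch. It is not the extraction method of FIGS. 12A and 12B; the bounding-box representation, the aspect-ratio threshold, and the function name `elongated_regions` are all assumptions made for illustration.

```python
def elongated_regions(boxes, min_aspect=10.0):
    """Keep background bounding boxes (x, y, width, height) that are long and thin.

    min_aspect is a hypothetical threshold on long side / short side.
    Returns each kept box together with its inferred direction.
    """
    candidates = []
    for x, y, w, h in boxes:
        if w <= 0 or h <= 0:
            continue  # ignore degenerate boxes
        aspect = max(w, h) / min(w, h)
        if aspect >= min_aspect:
            direction = "horizontal" if w > h else "vertical"
            candidates.append(((x, y, w, h), direction))
    return candidates

# Two thin strips of background and one roughly square blank area.
boxes = [(0, 0, 400, 6), (10, 20, 5, 300), (50, 50, 80, 60)]
candidates = elongated_regions(boxes)
```

The surviving candidates would then be compared with the background ruled line conditions in the same way as ordinary ruled lines.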

The specific character area determination unit 263 has a function of setting areas and attributes in the image data based on specific characters when the determination condition acquired by the determination condition acquisition unit 264 contains an item relating to specific characters.
For example, the specific character area determination unit 263 searches form data, or image data read from a form printed on paper, for specific characters that match the determination conditions. Then, for each specific character found, it sets an area and its attribute in accordance with the determination condition.

The determination condition acquisition unit 264 has a function of reading out and acquiring the information registered as the determination condition 265.
The determination condition 265 defines various conditions for determining various areas, including character areas, from the features of documents (or their image data) that are prone to erroneous detection.
The determination is made based on feature elements such as ruled lines, background ruled lines, and specific characters, which serve as references for the positions of areas such as character areas and picture areas. Specific examples are given later, but the determination condition 265 is data that defines the condition a feature element must satisfy, and which attribute is to be set to an area at which position with reference to a feature element in the image data that satisfies that condition.

The range extraction unit 266 fixes the ranges of the areas of each attribute in the image data, such as character areas and picture areas, from the results set by the ruled line area determination unit 261, the background ruled line area determination unit 262, and the specific character area determination unit 263.
The determination result setting unit 267 is means for setting, for the ranges of the image data fixed as character areas, a compression ratio different from that of the other parts.
Whether the ruled line area determination unit 261, the background ruled line area determination unit 262, and the specific character area determination unit 263 actually perform their determinations depends on which feature elements have conditions held in the determination condition 265.

Next, Tables 1 to 4 show some specific examples of the determination condition 265.
The determination condition 265 includes the category list of feature elements shown in Table 1 and the detail tables for each category shown in Tables 2 to 4.

In each of Tables 1 to 4, the items within the table can be given an order of priority. For example, if determinations on background ruled lines are to be made before determinations on ruled lines, the feature elements in Table 1 can be rearranged into the desired execution order so that the determinations for each feature element are performed in that order. The same applies to the individual determination conditions in Tables 2 to 4. In Tables 1 to 4, the ID is a sequential number that also expresses the priority.
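The idea of letting the row order of the category table decide the execution order can be sketched in Python as follows. The list-of-dictionaries layout and the helper `execution_order` are hypothetical; only the three category types come from the text.

```python
# Category list in the spirit of Table 1 (the row layout is an assumption).
categories = [
    {"id": "A", "type": "ruled line"},
    {"id": "B", "type": "specific character"},
    {"id": "C", "type": "background ruled line"},
]

def execution_order(rows, first_type):
    """Return the category types in execution order, with first_type moved to the front.

    sorted() is stable, so the remaining rows keep their original order.
    """
    ordered = sorted(rows, key=lambda row: row["type"] != first_type)
    return [row["type"] for row in ordered]

# Make the background ruled line determination run before the others.
order = execution_order(categories, "background ruled line")
```

Each determination unit would then be invoked by walking this list from first to last.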

Table 1 is a table consisting of the ID of each feature element category and the type of feature element that makes up that category. Types other than the ruled lines, specific characters, and background ruled lines listed in Table 1 may also be used. In that case, a determination unit corresponding to that type of feature element is provided in the character area determination unit 212.

Tables 2 to 4 are tables in which the details of the determination conditions for the feature elements of each category defined in Table 1 are registered. One detail table is created for each category defined in Table 1; here, Tables 2 to 4 correspond to the three categories defined in Table 1.
In the detail tables of Tables 2 to 4, one row of data represents the content of one determination condition. Each determination condition includes the items "ID", "element condition", "range", and "determination result".

Of these, the "element condition" defines the conditions that must be satisfied by the feature element, or by the relationship between the feature element and other elements; both the number and the content of these conditions differ by feature element category.
The "range" indicates where the area is to be set with reference to the feature element. The "determination result" indicates the attribute to be set to that area. In the examples of Tables 2 to 4 every determination result is "character" (character area), but other attributes may of course be set.
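One row of a detail table, with its "ID", "element condition", "range", and "determination result" items, could be modeled as in the Python sketch below. The class name, field names, and dictionary keys are hypothetical; the sample values follow the ID = A_001 ruled line condition described in this section.

```python
from dataclasses import dataclass, field

@dataclass
class DeterminationCondition:
    """One row of a detail table (field names are assumptions for this sketch)."""
    id: str
    element_conditions: dict = field(default_factory=dict)
    range_rule: str = ""
    result: str = "character"

a_001 = DeterminationCondition(
    id="A_001",
    element_conditions={
        "direction": "vertical",
        "line_color": "black",
        "thickness_mm": 0.1,
        "range_color": "monochrome",
        "interval": None,  # None stands for "don't care"
    },
    range_rule="between this ruled line and the adjacent ruled line",
    result="character",
)
```

Keeping the element conditions in a dictionary reflects the statement that their number and content vary by category.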

Table 2 is the table of determination conditions for ruled lines, which have ID = A in Table 1. For ruled lines, direction, line color, thickness, range color, and interval are provided as element conditions.
Of these, "direction" is the condition on the direction of the ruled line, and takes one of vertical, horizontal, or don't care (any). "Line color" is the condition on the color of the ruled line; Table 2 shows black, but color information such as RGB ratios or amounts may also be used. "Thickness" is the condition on the thickness of the ruled line; it is shown here in millimeters (mm), but any suitable unit, such as inches or pixels, may be used. The thickness need not match exactly, as long as it is within a predetermined error range.
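The thickness check, including the unit conversion and the "predetermined error range", can be sketched in Python as follows. The 600 dpi scan resolution and the one-pixel tolerance are assumptions chosen for illustration; the text gives no concrete values for either.

```python
def mm_to_pixels(mm, dpi):
    """Convert millimeters to pixels at the given resolution (25.4 mm per inch)."""
    return mm / 25.4 * dpi

def thickness_matches(measured_px, required_mm, dpi=600, tolerance_px=1.0):
    """True if the measured line thickness is within the tolerance of the required one.

    dpi and tolerance_px are hypothetical defaults, not values from the text.
    """
    return abs(measured_px - mm_to_pixels(required_mm, dpi)) <= tolerance_px

# A 0.1 mm line is about 2.4 pixels wide at 600 dpi.
ok = thickness_matches(3, 0.1)
too_thick = thickness_matches(6, 0.1)
```

Expressing the tolerance in pixels keeps the comparison in the same unit as the measurement taken from the scanned image.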

"Range color" specifies whether the feature element satisfies the element condition only when the area set with reference to that feature element under the "range" condition is monochrome, or whether the area may also contain color. "Interval" is the condition on the spacing between the ruled line and the adjacent extracted ruled line. "Range color" and "interval" are not conditions on the properties of the feature element itself.

For example, under the determination condition with ID = A_001, a ruled line satisfies the element condition if it is a vertical black ruled line 0.1 mm thick and the part between it and the adjacent ruled line (based on the "range" condition) is monochrome, regardless of the spacing to the adjacent ruled line. In this case, the condition specifies that an area is to be set in the part between the ruled line that satisfies the element condition and the adjacent ruled line (which must also satisfy the element condition of the same ID), and that the attribute of that area is to be set to "character".
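A minimal Python sketch of the ID = A_001 rule follows: given the x positions of vertical ruled lines that already satisfy the direction, color, and thickness conditions, it sets a "character" area between each adjacent pair whose interior is monochrome. The box format, the `is_monochrome` callback, and the function name are assumptions for this sketch.

```python
def areas_between_rules(rule_xs, top, bottom, is_monochrome):
    """Set a character area between each pair of adjacent vertical ruled lines.

    rule_xs: x positions of ruled lines that satisfy the A_001 element conditions.
    is_monochrome: hypothetical callback that checks the "range color" condition
    for a box given as (left, top, right, bottom).
    """
    xs = sorted(rule_xs)
    areas = []
    for left, right in zip(xs, xs[1:]):
        box = (left, top, right, bottom)
        if is_monochrome(box):
            areas.append(box + ("character",))
    return areas

# The gap starting at x = 200 contains color, so only the first gap qualifies.
areas = areas_between_rules([300, 100, 200], 0, 50, lambda box: box[0] != 200)
```

Because the interval is "don't care" in A_001, the gap width is not checked here.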

Next, Table 3 is the table of determination conditions for specific characters, which have ID = B in Table 1. For specific characters, symbol, range color, combination, symbol inclusion, and direction are provided as element conditions.
"Symbol" indicates the specific character (or character string) used in the determination, expressed by image data, a character code, a font, a size (width and height), and the like. "Range color" is the same as in Table 2. "Combination" is an item that specifies an ID when the "range" is to be defined using a feature element that satisfies the element conditions of another ID. A plurality of IDs may be specified.
"Symbol inclusion" indicates whether the "range" is set so as to include the specific character itself. "Direction" indicates the direction of the character string used in the determination, and is either horizontal or vertical.
Of these, the items other than "symbol" and "direction" are not conditions on the properties of the feature element itself.

For example, under the determination condition with ID = B_001, the specific character "Å" satisfies the element condition if it is arranged horizontally and the area that includes the character "Å" and extends from it, in the direction of the character string, to the next ruled line of ID = A_001 (based on the "range" condition) is monochrome. In this case, the condition specifies that an area based on the above "range" condition is to be set, and that the attribute of that area is to be set to "character".
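The "range" of the ID = B_001 rule can be sketched in Python as below: starting from the bounding box of the matched character, the area extends along the text direction to the next ruled line that satisfies ID = A_001. Left-to-right horizontal text, the box format, and the function name are assumptions, and the "range color" check is omitted for brevity.

```python
def area_from_character(char_box, rule_xs, include_symbol=True):
    """Return the area from a specific character to the next vertical ruled line.

    char_box: (left, top, right, bottom) of the matched character.
    rule_xs: x positions of ruled lines satisfying the combined condition (A_001).
    include_symbol: corresponds to the "symbol inclusion" item.
    Assumes horizontal, left-to-right text; returns None if no ruled line follows.
    """
    left, top, right, bottom = char_box
    following = [x for x in rule_xs if x >= right]
    if not following:
        return None
    start = left if include_symbol else right
    return (start, top, min(following), bottom)

area = area_from_character((120, 10, 140, 30), [100, 200, 300])
```

A real implementation would additionally verify that the resulting area is monochrome before setting its attribute to "character".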

Next, Table 4 is the table of determination conditions for background ruled lines, which have ID = C in Table 1. For background ruled lines, direction, thickness, range color, and interval are defined as element conditions. These determination conditions are the same as the ruled line determination conditions of Table 2, except that the element conditions do not include the line color.

Next, the procedure of the image processing performed by the image processing unit 200, including the setting of areas and attributes according to the determination conditions described above, will be explained using flowcharts.
FIG. 5 shows the basic flow of this image processing.
When the image processing unit 200 acquires image data from the scanner 120, it starts the processing shown in the flowchart of FIG. 5. The processing for acquiring this image data is the processing of the input procedure, and corresponds to the function of the input means.

First, the character area determination unit 212 of the scanner correction unit 210 determines the character areas and the other areas in the image data (S11). Details of the character area determination in step S11 are described later with reference to FIGS. 6 and 7. Next, the units of the scanner correction unit 210 from the image area separation unit 213 to the character γ unit 217 sequentially apply image processing for scanner images to the image data acquired from the scanner 120 (S12).

Next, in accordance with the determination by the character area determination unit 212, the compression processing unit 220 compresses the image data acquired from the scanner 120, using a method suited to each of the character areas and the other areas (for example, picture areas) (S13). The image processing unit 200 then saves the compressed image data to the HDD 104 through the data interface unit 230 (S14).

After that, when the image is printed by the plotter 130, the decompression processing unit 240 decompresses the image saved in step S14 (S15). Next, the printer correction unit 250 performs image processing on the decompressed image data according to the characteristics of the plotter 130, performs image processing that emphasizes the characters in the character areas according to the determination result of step S11, and outputs the processed image data (S16).
This completes the process of FIG. 5. The special image processing is applied to characters in step S16 to remedy the poor legibility that arises when the processing of steps S12 to S15 thins the edges of characters, or when show-through occurs in a document with a high background level.

Next, FIG. 6 shows a flowchart of the more detailed procedure for the character area determination in step S11 of FIG. 5. The process shown in this flowchart corresponds to the function of the character area determination unit 212 shown in FIG. 4.
In the process of FIG. 6, the character area determination unit 212 first refers to the category list (Table 1) of the determination conditions 265 acquired by the determination condition acquisition unit 264, and determines whether the determination conditions 265 include a condition relating to ruled lines (S21). If they do, then in order to set areas based on ruled lines, the ruled line area determination unit 261 extracts all ruled lines from the image data (S22) and repeats the processing of steps S23 to S25, taking each determination condition relating to ruled lines (see Table 2) in turn as the processing target.

That is, the ruled line area determination unit 261 first acquires, from among the ruled lines extracted in step S22, those that meet the element conditions in the determination condition being processed (S23). At this time, the ruled line area determination unit 261 also retains information on the ruled lines that it did not acquire.
Then, in accordance with the ruled lines acquired in step S23 and the determination condition being processed, the ruled line area determination unit 261 sets areas based on the ruled lines, and their attributes, in the image data (S24). For example, when the determination condition with ID = A_001 is used, an area is set in each portion sandwiched between ruled lines that satisfy the element conditions, and the attribute of the area is set to "character". Finally, the ruled lines that meet the element conditions are saved in association with the determination condition being processed (S25). They are saved so that they can be referenced in the processing for other feature elements.
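Steps S23 and S24 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `RuledLine` type, the one-dimensional `position_mm` coordinate, and the threshold parameters are hypothetical stand-ins, since the actual element conditions are those of Table 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuledLine:
    # Hypothetical representation of an extracted ruled line.
    orientation: str      # "horizontal" or "vertical"
    position_mm: float    # coordinate measured across the line direction
    thickness_mm: float


def select_lines(lines, orientation, max_thickness_mm):
    """S23 (sketch): split lines into those meeting the conditions and the rest."""
    kept, rest = [], []
    for line in lines:
        ok = (line.orientation == orientation
              and line.thickness_mm <= max_thickness_mm)
        (kept if ok else rest).append(line)
    return kept, rest  # the rest is retained too, as the text describes


def regions_between(lines, min_gap_mm, max_gap_mm, attribute="character"):
    """S24 (sketch): set an area between adjacent lines whose spacing is in range."""
    ordered = sorted(lines, key=lambda l: l.position_mm)
    regions = []
    for a, b in zip(ordered, ordered[1:]):
        gap = b.position_mm - a.position_mm
        if min_gap_mm <= gap <= max_gap_mm:
            regions.append((a.position_mm, b.position_mm, attribute))
    return regions
```

For example, four thin horizontal lines at 0, 30, 60, and 100 mm with an accepted spacing of 25 to 35 mm would yield two areas, one for each 30 mm gap; the 40 mm gap is rejected.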

After the loop of steps S22 to S25 ends, or if the result of step S21 is NO, the process proceeds to step S31. Here, the character area determination unit 212 determines whether the determination conditions 265 acquired by the determination condition acquisition unit 264 include a condition relating to background ruled lines (S31). Note that in the example of FIG. 6, unlike the order of the data in Table 1, the determination relating to background ruled lines is performed second.

If the result of step S31 is YES, then in order to set areas based on background ruled lines, the background ruled line area determination unit 262 extracts all background ruled lines from the image data (S32) and repeats the processing of steps S33 to S35, taking each determination condition relating to background ruled lines (see Table 4) in turn as the processing target.

That is, the background ruled line area determination unit 262 first acquires, from among the background ruled lines extracted in step S32, those that meet the element conditions in the determination condition being processed (S33). At this time, the background ruled line area determination unit 262 also retains information on the background ruled lines that it did not acquire.
Then, in accordance with the background ruled lines acquired in step S33 and the determination condition being processed, the background ruled line area determination unit 262 sets areas based on the background ruled lines, and their attributes, in the image data (S34). For example, when the determination condition with ID = C_001 is used, an area is set in each portion sandwiched between background ruled lines that satisfy the element conditions, and the attribute of the area is set to "character". Finally, the background ruled lines that meet the element conditions are saved in association with the determination condition being processed (S35). They are saved so that they can be referenced in the processing for other feature elements.

After the loop of steps S32 to S35 ends, or if the result of step S31 is NO, the process proceeds to step S41 of FIG. 7. Here, the character area determination unit 212 determines whether the determination conditions 265 acquired by the determination condition acquisition unit 264 include a condition relating to specific characters (S41).
If the result of step S41 is YES, then in order to set areas based on specific characters, the specific character area determination unit 263 repeats the processing of steps S42 to S46, taking each determination condition relating to specific characters (see Table 3) in turn as the processing target. For specific characters, the determination condition itself contains the condition for extracting the specific character, so, unlike the cases of ruled lines and background ruled lines, the specific characters are extracted inside the loop.

In the loop, the specific character area determination unit 263 first generates search data for the specific character from the determination condition being processed, according to its "symbol" and "direction" designations (S42). Next, the specific character area determination unit 263 extracts all occurrences of the specific character from the image data being processed, according to the generated data (S43).
Then, from among the extracted specific characters, it acquires those that meet the element conditions in the determination condition being processed (S44). At this time, the specific character area determination unit 263 also retains information on the specific characters that it did not acquire. The specific character area determination unit 263 also acquires the feature elements saved in association with the determination condition designated as the "combination" in the determination condition being processed (here, the ruled lines saved in step S25 or the background ruled lines saved in step S35) (S45).

Next, in accordance with the specific characters acquired in step S44, the feature elements acquired in step S45, and the determination condition being processed, the specific character area determination unit 263 sets areas based on the specific characters, and their attributes, in the image data (S46). For example, when the determination condition with ID = B_001 is used, an area is set in the range from a specific character satisfying the element conditions to the next ruled line of ID = A_001 in the character arrangement direction, and the attribute of the area is set to "character".
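The B_001 example in step S46 can be reduced to a one-dimensional sketch in Python. The function name and the idea of measuring positions along the character arrangement direction as plain numbers are assumptions made for illustration only.

```python
def region_from_character(char_pos, line_positions, attribute="character"):
    """Sketch of S46: area from a specific character to the next ruled line.

    char_pos and line_positions are coordinates along the character
    arrangement direction (hypothetical 1-D simplification).
    Returns (start, end, attribute), or None if no ruled line follows.
    """
    following = [p for p in line_positions if p > char_pos]
    if not following:
        return None
    return (char_pos, min(following), attribute)
```

With a specific character at position 12 and saved ruled lines at 0, 40, and 80, the sketch sets the area from 12 to 40. When no ruled line follows the character, no area is set.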

After the loop of steps S42 to S46 ends, or if the result of step S41 is NO, the range extraction unit 266 combines the areas set on the basis of ruled lines, background ruled lines, and specific characters, and finalizes the areas with the character attribute in the image data (S47). The character attribute is used here, but another attribute may be used.
This completes the process of FIG. 7. The details of the process in step S47 will be described later with reference to FIG. 13.

In the processes of FIGS. 6 and 7 described above, steps S22, S32, and S43 are the processes of the extraction procedure and correspond to the function of the extraction means. Steps S24, S34, S46, and S47 are the processes of the attribute setting procedure and correspond to the function of the attribute setting means.
The character area determination unit 212 in this embodiment performs the extraction of feature elements by the extraction means and the setting of attributes by the attribute setting means for the plurality of feature elements while selecting the feature element to be considered in a predetermined order, as described above.

Next, an example of applying the processes of FIGS. 6 and 7 to specific image data will be described.
First, FIGS. 8A to 8C show an example of applying the area setting process for ruled lines in steps S22 to S25 to image data obtained by reading a sheet of paper.
FIG. 8A is a simplified representation of the image data in the initial state. The thick lines 601 to 605 in the figure indicate ruled lines, and the thin lines 610 are a simplified representation of characters.
FIG. 8B shows the state in which the ruled lines satisfying the element conditions have been extracted from the thick lines 601 to 605. Assuming that the spacing of one column of text is 30 mm and the thickness of each ruled line is 0.1 mm, the processing of steps S22 and S23 of FIG. 6 with ID = A_002 as the processing target extracts ruled lines 701 to 704 as the ruled lines that satisfy the element conditions.

FIG. 8C shows the areas set according to those ruled lines. In step S24 of FIG. 6, an area is set in each portion sandwiched between extracted ruled lines, so areas 706 to 708 are set.
FIG. 8B also shows the ruled line 705, which is extracted when ID = A_001 in Table 2 is the processing target. However, the ruled line 705 is the only ruled line extracted under this condition, and no portion is sandwiched between multiple ruled lines, so no area is set.

Next, FIGS. 9A and 9B show an example of applying the area setting process for background ruled lines in steps S32 to S35 of FIG. 6 to the same image data as in FIG. 8A.
FIG. 9A shows the state in which the background satisfying the element conditions has been extracted from that image data. As shown in FIG. 9A, the areas indicated by reference numerals 801 to 804, for example, can be extracted as background areas in the image data. Of these, however, the only background areas that satisfy the element conditions of, for example, ID = C_002 in Table 4 are the background ruled lines 801 and 802, and these are extracted as background ruled lines.

FIG. 9B shows the areas set according to those ruled lines. In step S34, an area is set in the portion sandwiched between extracted background ruled lines, so the area 811 is set. The area 812 is not sandwiched between the background ruled lines of FIG. 9A. However, if A_002 is specified as the "combination" in the element conditions so that an area can also be set in the portion sandwiched between the ruled line 703 and the background ruled line 802, the area 812 can be set as well. The same result is obtained by setting the "range" to "the portion in contact with a background ruled line".

Next, FIG. 10 shows an example of applying the area setting process for specific characters in steps S42 to S46 of FIG. 7. The determination condition with ID = B_001 in Table 3 is described here.
In this determination condition, the ruled line of A_001 is designated as the "combination", so the area is set using the ruled lines 901 and 902 that satisfy the element conditions of A_001.
The specific character area determination unit 263 of FIG. 4 generates image data of the specific character from the "symbol" value "Å" and the "direction" value "horizontal" in the element conditions, and searches the input image data for a location that matches the generated image data. As a result, it finds the character "Å" indicated by reference numeral 95. It then sets the range sandwiched between the position of the character "Å" and the next ruled line 902 as the area 903.
Because searching the whole image data for matching character data is time-consuming, it is advisable to define in advance, for example, the position of the specific character relative to the ruled line defined in the "combination", and to search for the specific character only at the defined position.
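The benefit of restricting the search can be seen in a small Python sketch. It assumes a binary image held as a list of rows and uses exact pixel matching, which is a simplification chosen for illustration; the text does not specify the matching method, and `find_template` is a hypothetical name.

```python
def find_template(image, template, rows, cols):
    """Search for template with its top-left corner limited to rows x cols.

    image and template are lists of lists of 0/1 values. rows and cols are
    iterables of candidate corner coordinates, for example a narrow window
    derived from the position of a ruled line.
    """
    th, tw = len(template), len(template[0])
    hits = []
    for r in rows:
        for c in cols:
            if r + th > len(image) or c + tw > len(image[0]):
                continue  # template would fall outside the image
            if all(image[r + i][c:c + tw] == template[i] for i in range(th)):
                hits.append((r, c))
    return hits
```

Passing the full row and column ranges scans the whole image, while passing only the window next to the relevant ruled line performs far fewer comparisons. A practical matcher would also tolerate scanning noise rather than require exact equality.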

Next, with reference to FIGS. 11A and 11B, a method of extracting background ruled lines using the determination conditions for ruled lines will be described, for the case where the document contains background ruled lines but there is no determination condition for background ruled lines. This process is a modification of the background extraction in the processes of FIGS. 6 and 7.

In FIG. 11A, the ruled lines 701 to 704 are the same as those extracted in FIG. 8B. Where a ruled line breaks off partway across the image, as the ruled lines 701 and 702 do, there is often a background ruled line on its extension. Therefore, by performing background detection on the extension lines 1001 and 1002 and applying the determination conditions for ruled lines to the background areas detected there, the background ruled line areas on which area setting is based can be extracted.
Here, assuming that the ruled lines 1003 and 1004 are extracted on the extension lines 1001 and 1002 as shown in FIG. 11B, the areas 1006 and 1007 sandwiched between ruled lines are set according to the determination conditions for ruled lines.

Next, a method of extracting background ruled line candidates from the result of background extraction will be described with reference to FIGS. 12A and 12B.
Whether the characters in the image data are written vertically or horizontally can be identified by applying a known technique as appropriate. With this identified, histograms 1102 and 1103 of the black portions in the vertical and horizontal directions of the document image data 1101 are acquired as shown in FIG. 12A. Since the image data 1101 is written vertically, the line spacing is obtained from the histogram 1103. The line spacing itself, however, is not treated as a background ruled line candidate. The background in the image data 1101 is extracted at the locations corresponding to the portions 1131 and 1132, where the histogram value is low over a width equal to or greater than the line spacing, and the areas 1133 and 1134 where the background is continuous can be extracted as background ruled line candidates, as shown in FIG. 12B.

Likewise, from the histogram 1102, the background in the image data 1101 is extracted at the low-value portions 1121, 1122, and 1123, and the areas 1124, 1125, and 1126 where the background is continuous can be extracted as background ruled line candidates.
The criteria 1104 and 1105 for deciding whether a histogram value is low are set as fixed values in the system in advance.
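The histogram step can be sketched in Python as a projection profile followed by a search for low runs. This is an illustrative sketch under stated assumptions: the image is a list of rows of 0/1 values with 1 meaning black, `threshold` plays the role of the fixed criteria 1104 and 1105, and `min_width` is a hypothetical parameter chosen so that ordinary line spacing is not reported.

```python
def projection(image, axis):
    """Count black pixels per row (axis=0) or per column (axis=1)."""
    if axis == 0:
        return [sum(row) for row in image]
    return [sum(col) for col in zip(*image)]


def low_runs(hist, threshold, min_width):
    """Return (start, end) pairs, end exclusive, of runs where
    hist <= threshold for at least min_width consecutive bins."""
    runs, start = [], None
    # The appended sentinel exceeds the threshold and closes a trailing run.
    for i, value in enumerate(hist + [threshold + 1]):
        if value <= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_width:
                runs.append((start, i))
            start = None
    return runs
```

Each returned run marks a band of the page where the background would then be examined for continuity before it is accepted as a background ruled line candidate.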

Next, with reference to FIG. 13, a method of aggregating the area setting results based on the plurality of feature elements and finally determining the character areas will be described.
The hatched portions of 1201, 1202, and 1203 are the character area setting results based on ruled lines, background ruled lines, and specific characters, respectively.
1204 is the result of combining 1201 and 1202, and 1205 is the result of further combining 1203 with 1204. To explain the overlapping portions in 1205, one line is picked out, and the portions set as character-attribute areas on the basis of each feature element are shown framed and marked "1" (1211 to 1213). As a guide to the size of one line, 1214 shows data that is all "0".

The results of 1211 to 1213 are added coordinate by coordinate (1215, 1216), and the sums can be used to determine the final character areas.
1215 is an example that uses 2 as the threshold and treats areas where the sum is 2 or more as character areas. In this case, the framed range is determined to be the character area.
1216 is the case where the character area setting result based on ruled lines is given a higher priority. Here, the framed portion, where the sum is 2 or more and the determination result of 1213 is "1", is determined to be the character area.
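The two aggregation variants of FIG. 13 can be sketched in Python as follows. The dictionary keys and the `required` parameter are hypothetical names; the sketch only shows per-coordinate addition with a threshold, optionally combined with the requirement that one named result be "1".

```python
def decide(votes, threshold=2, required=None):
    """Aggregate per-coordinate 0/1 results from several feature elements.

    votes maps a feature element name to a list of 0/1 flags, one per
    coordinate. A coordinate is accepted when the sum of the flags reaches
    threshold and, if required is given, that result is also 1 there.
    """
    names = list(votes)
    length = len(votes[names[0]])
    result = []
    for i in range(length):
        total = sum(votes[name][i] for name in names)
        ok = total >= threshold
        if required is not None:
            ok = ok and votes[required][i] == 1
        result.append(ok)
    return result
```

Calling `decide(votes)` corresponds to the plain threshold of the 1215 example, and passing `required` corresponds to the stricter variant of the 1216 example.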

In this way, the character area determination unit 212 of this embodiment performs, for each of the plurality of feature elements, the extraction of the feature element by the extraction means and the setting of attributes for the input image data by the attribute setting means with reference to the determination conditions. The attribute setting means then aggregates the attributes set according to each of the plurality of feature elements across those feature elements, and sets the attribute of each portion of the input image data.
Furthermore, when there are many determination methods, the character areas can be determined by multiplying the determination result of each high-priority method by a larger number corresponding to its priority, adding the products, and checking whether the sum is equal to or greater than a reference value such as the 2 used for 1215. In either case, the thresholds and priorities are set in advance.
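The priority-weighted variant can be sketched in Python in the same way. The weight values are assumptions for illustration; the text says only that thresholds and priorities are set in advance.

```python
def weighted_decide(votes, weights, threshold):
    """Accept a coordinate when the priority-weighted sum reaches threshold.

    votes maps a method name to a list of 0/1 flags per coordinate, and
    weights maps the same names to their priority multipliers.
    """
    names = list(votes)
    length = len(votes[names[0]])
    return [
        sum(weights[name] * votes[name][i] for name in names) >= threshold
        for i in range(length)
    ]
```

Giving one method a weight of 2 with a threshold of 2 lets that method alone decide a coordinate, while the lower-priority methods still need to agree with each other.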

This concludes the description of the embodiment. In the present invention, however, the specific configuration of the apparatus, the specific processing procedures, the data formats, the specific data contents, and the contents, number, and items of the determination conditions are not limited to those described in the embodiment.
For example, when features relating to character areas are obtained from multiple feature element extractions, the final character areas may be determined from the multiple kinds of results by taking the priority of each feature into account and by imposing conditions for treating a portion as a character area. This improves extensibility even when the character area determination is carried out in processing steps other than those of the embodiment described above. Providing a UI that lets the user set the conditions also improves convenience.

When features relating to character areas are obtained from ruled line extraction, it is advisable to determine whether the ruled line under examination is, for example, a newspaper ruled line. This makes it possible to attach conditions to the discrimination attribute of the attribute information and to specify the characteristics of the ruled lines to be extracted, which improves accuracy.
When features relating to character areas are obtained from background ruled line extraction, it is advisable to assume an extension line of an extracted ruled line and to determine, from the image data on the extension line, whether it is something from which a character range can be determined. In this way, the accuracy of character area determination can be maintained even where no ruled line exists.

When features relating to character areas are obtained from specific character extraction, it is advisable to determine the character areas on the basis of font data. In this way, special characters that were prone to misdetermination in line extraction can also be determined to be character areas, which improves accuracy.
When extraction is performed on the basis of a plurality of feature elements, it is advisable to let the user decide the priority, that is, which extraction is performed first. Fixing the extraction order makes it possible to determine the character areas and picture areas as the user intends.

When the determination conditions are held as information that also includes the extraction range and the determination result, it is advisable to acquire the extraction range and determination result together with the determination condition and to decide the extraction range and determination result on the basis of the acquired information. This makes it possible to fix the extraction range and hold the determination result according to the number of items in the extraction criteria.
When the determination conditions are held as information that also includes the control method for image processing, it is advisable to acquire the control method together with the determination condition and to perform control on the basis of the acquired information. This makes it possible to vary the control according to the number of items in the extraction criteria.

It is also advisable to hold extraction criteria for identifying the document type and to determine, as a feature of the document, whether it is a text document, a photographic document, or another type. If the document type can be identified, the compression time for documents with a high proportion of text can be shortened, and image processing suited to the document type can be applied.
When the characteristics of characters and of pictures are distinguished as characteristics of the image, it is advisable to switch the image processing between character portions and picture portions. For image data in which the edges of characters have become faint and hard to read, this allows the characters to be made easier to read by darkening only the edges of the character portions or darkening the characters themselves.

When the characteristics of the image are taken to be the characteristics of the document, it is advisable to switch the image processing according to the characteristics of the document. Documents with a high background density and thin paper include, for example, newspapers and advertisements; with such documents, characters become hard to read because the background is too dark or because show-through occurs when the document is printed on both sides. For image data obtained by reading such a document, characters can be made easier to read by strengthening the background correction or removing the show-through.

An embodiment of the program of the present invention is a program for causing a computer to control the required hardware and thereby realize all or part of the functions of the MFP 100 in the embodiment described above.
Such a program may be stored from the outset in a ROM or other non-volatile storage medium (flash memory, EEPROM, etc.) provided in the computer. It can also be provided recorded on any non-volatile recording medium such as a memory card, CD, DVD, or Blu-ray disc. The procedures described above can be executed by installing the program recorded on such a recording medium on a computer and running it.
Furthermore, the program can be downloaded from an external device that is connected to a network and has a recording medium on which the program is recorded, or from an external device that stores the program in its storage means, and then installed on a computer and run.

It goes without saying that the configurations of the embodiments and modifications described above can be combined in any way as long as they do not contradict one another, and that only a part of them can be taken out and implemented.

100: MFP, 101: CPU, 102: ROM, 103: RAM, 104: HDD, 105: communication I/F, 106: operation unit, 107: display unit, 108: engine I/F, 120: scanner, 130: plotter, 200: image processing unit, 210: scanner correction unit, 220: compression processing unit, 230: data interface unit, 240: decompression processing unit, 250: printer correction unit, 261: ruled line area determination unit, 262: background ruled line area determination unit, 263: specific character area determination unit, 264: determination condition acquisition unit, 265: determination conditions, 266: range extraction unit, 267: determination result setting unit

Japanese Unexamined Patent Publication No. 2007-189275
Japanese Patent No. 3898645
Japanese Patent No. 3251119

Claims

An image processing apparatus comprising:
input means for inputting image data;
extraction means for extracting a predetermined feature element from the image data input by the input means; and
attribute setting means for referring to a determination condition that defines in advance, for the predetermined feature element, a condition to be satisfied by the feature element and an attribute of a region in the image data determined based on a feature element satisfying the condition, and for setting, based on the feature element extracted by the extraction means and the determination condition, the attribute defined by the determination condition in the region of the input image data defined by the determination condition,
wherein the extraction of the feature element by the extraction means and the setting of the attribute for the input image data by the attribute setting means with reference to the determination condition are performed for each of a plurality of feature elements, and
the attribute setting means aggregates, over the plurality of feature elements, the attributes set according to the respective feature elements, and thereby sets the attribute of each part of the input image data.

The image processing apparatus according to claim 1,
wherein the extraction of the feature elements by the extraction means and the setting of the attributes by the attribute setting means are performed for the plurality of feature elements while selecting, in a predetermined order, the feature element to be processed.

The image processing apparatus according to claim 1 or 2,
wherein one of the predetermined feature elements is a ruled line.

The image processing apparatus according to claim 1 or 2,
wherein one of the predetermined feature elements is a specific character, and the range in which the extraction means searches the input image data for the specific character is determined, in accordance with the determination condition, based on the position of another feature element that has already been extracted.

The image processing apparatus according to claim 1 or 2,
wherein the predetermined feature elements include a ruled line and a background ruled line, and the extraction means, when it has extracted a ruled line from the input image data, performs extraction of the background ruled line on a region lying on the extension of the extracted ruled line.

The image processing apparatus according to any one of claims 1 to 5,
further comprising image processing means for processing each part of the input image data by a processing method corresponding to the attribute set for that part by the attribute setting means.

The image processing apparatus according to claim 6,
wherein the processing performed on the image data by the image processing means includes processing that emphasizes characters in a part for which a character area attribute is set.

An image processing method comprising:
an input procedure of inputting image data;
an extraction procedure of extracting a predetermined feature element from the image data input in the input procedure; and
an attribute setting procedure of referring to a determination condition that defines in advance, for the predetermined feature element, a condition to be satisfied by the feature element and an attribute of a region in the image data determined based on a feature element satisfying the condition, and of setting, based on the feature element extracted in the extraction procedure and the determination condition, the attribute defined by the determination condition in the region of the input image data defined by the determination condition,
wherein the extraction of the feature element in the extraction procedure and the setting of the attribute for the input image data in the attribute setting procedure with reference to the determination condition are performed for each of a plurality of feature elements, and
in the attribute setting procedure, the attributes set according to the respective feature elements are aggregated over the plurality of feature elements, and the attribute of each part of the input image data is thereby set.

A program for causing a computer to function as the image processing apparatus according to any one of claims 1 to 7.
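
For illustration only, the flow recited in the claims (extract each feature element in a predetermined order, look up its determination condition, set the attribute of the region that condition defines, and aggregate the results) might be sketched in Python as follows. This is not the implementation of the embodiment. The row-based image representation, the names `Condition`, `extract_ruled_lines`, `extract_blank_rows`, `between_lines`, `single_rows` and `set_attributes`, the gray-level thresholds, the default attribute `"other"`, and the aggregation rule that a later condition overwrites an earlier one are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]   # assumed: rows of 0-255 gray values
Region = Tuple[int, int]  # assumed: (first_row, last_row), inclusive


@dataclass
class Condition:
    """Hypothetical determination condition for one feature element."""
    name: str
    extract: Callable[[Image], List[int]]            # rows where the feature was found
    region_of: Callable[[List[int]], List[Region]]   # regions derived from those rows
    attribute: str                                   # attribute to set in each region


def extract_ruled_lines(img: Image, dark: int = 64) -> List[int]:
    """Assumed rule: a row is a horizontal ruled line if every pixel is dark."""
    return [y for y, row in enumerate(img) if all(p < dark for p in row)]


def extract_blank_rows(img: Image, light: int = 192) -> List[int]:
    """Assumed rule: a row is blank if every pixel is light."""
    return [y for y, row in enumerate(img) if all(p >= light for p in row)]


def between_lines(rows: List[int]) -> List[Region]:
    """Assumed region rule: the rows enclosed by two consecutive ruled lines."""
    return [(a + 1, b - 1) for a, b in zip(rows, rows[1:]) if b - a > 1]


def single_rows(rows: List[int]) -> List[Region]:
    """Assumed region rule: each extracted row is its own region."""
    return [(y, y) for y in rows]


def set_attributes(img: Image, conditions: List[Condition]) -> Dict[int, str]:
    """Process the conditions in list order and aggregate one attribute per row."""
    attrs = {y: "other" for y in range(len(img))}  # assumed default attribute
    for cond in conditions:
        features = cond.extract(img)
        for first, last in cond.region_of(features):
            for y in range(first, last + 1):
                attrs[y] = cond.attribute  # assumed: later conditions win
    return attrs


# Tiny hand-made image: ruled lines at rows 0 and 3.
sample: Image = [
    [0, 0, 0, 0],
    [200, 10, 200, 200],
    [200, 200, 200, 200],
    [0, 0, 0, 0],
    [200, 200, 200, 200],
]
conditions = [
    Condition("ruled line", extract_ruled_lines, between_lines, "character"),
    Condition("blank row", extract_blank_rows, single_rows, "background"),
]
result = set_attributes(sample, conditions)
```

In this sketch the order of the `conditions` list plays the role of the predetermined order in which feature elements are selected, and the per-row dictionary stands in for the aggregated attributes of each part of the image. A real apparatus would work on two-dimensional regions and on the determination conditions described in the embodiment.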