JP2010011450A

JP2010011450A - Image-forming device and image processing method

Info

Publication number: JP2010011450A
Application number: JP2009126614A
Authority: JP
Inventors: Shunichi Mekawa; 俊一女川; Takahiro Fuchigami; 隆博渕上
Original assignee: Toshiba Corp; Toshiba TEC Corp
Current assignee: Toshiba Corp; Toshiba TEC Corp
Priority date: 2008-06-26
Filing date: 2009-05-26
Publication date: 2010-01-14
Anticipated expiration: 2029-05-26
Also published as: JP5005732B2

Abstract

PROBLEM TO BE SOLVED: To provide an image-forming device capable of reducing the risk of degrading the quality of an output image by limiting the increase in image area identification processing time that depends on the input document image.

SOLUTION: The image-forming device has: an image input means 101 for acquiring image information from an original to generate an input image; a first identification means 102 for performing first image area identification processing on the input image to output a first identification signal indicating the attribute of an image; a second identification means 104 for receiving the first identification signal and performing second image area identification processing, following the first image area identification processing, to output a second identification signal; a determination means 103 for receiving the first identification signal to output a determination signal indicating whether the second identification means should be executed; a selection means for selecting the first identification signal as a third identification signal, without performing the second image area identification processing, when the determination signal does not indicate execution, and for selecting the second identification signal as the third identification signal when the determination signal indicates execution; and an image processing means 105 for receiving the input image and the third identification signal to execute image processing without performing the second image area identification processing.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to an image forming apparatus and an image processing method, and more particularly to a technique for forming an image by performing image area identification processing on a document image.

A known technique performs image area identification processing on a document image and forms an image by switching the image processing to be applied according to the identification result. More recently, with the aim of reducing the amount of data while maintaining image quality, techniques have appeared that separate text from photographs and produce an electronic file by applying image processing and compression suited to each, emphasizing resolution for text and gradation for photographs.

For example, in the technique described in Patent Document 1, layout analysis processing is applied to the image data to be processed, the image elements of character portions are identified from the result, and the image elements identified as character portions and the remaining portions are compressed with different compression methods. This yields a highly compressed image file (highly compressed PDF) while preserving the readability of characters.

For the image area identification processing described above, the techniques disclosed in Patent Document 2 and Patent Document 3 are known.
In the technique described in Patent Document 2, the processing target is binarized, and whether black pixels are characters is then determined based only on the size of the circumscribed rectangle of each connected component of the black pixel group.
In the technique described in Patent Document 3, character strings and character string regions are extracted based on the sizes and arrangement of a set of circumscribed rectangles of a specific pixel group.

The technique described in Patent Document 2 is unlikely to cause misidentification and can run at high speed, but its range of application is narrow; for example, it cannot extract characters inside a photograph. Consequently, when the highly compressed image file described above is formed from this identification result, characters other than black characters are blurred and become less readable, and a high compression effect is difficult to obtain.

The technique described in Patent Document 3, on the other hand, can extract a wider variety of characters than the former. However, when a document image containing many halftone-dot backgrounds, or a document image with a complicated structure, is input, the number of circumscribed rectangles becomes enormous and the layout analysis processing may take a very long time. The risk of misidentification in the layout analysis processing also rises. For the long processing time, one conceivable fallback is to perform the identification processing only within a predetermined time and to skip it when that time is exceeded, but the image the user wants then cannot be formed.

The present invention has been made in view of these circumstances. Its object is to provide an image forming apparatus and an image processing method that limit the increase in image area identification processing time that depends on the input document image, reduce the risk of image quality degradation in the output image, and obtain an output image close to the user's expectations.

To solve the above problems, the present invention is an image forming apparatus comprising: image input means for acquiring image information from a document, including a paper document, and generating an input image; first identification means for performing first image area identification processing on the input image and outputting a first identification signal indicating an attribute of the image; second identification means for receiving the first identification signal, performing second image area identification processing that follows the first image area identification processing, and outputting a second identification signal indicating an attribute of the image; determination means for receiving the first identification signal and outputting a determination signal indicating whether the second identification means should be executed; selection means for selecting the first identification signal as a third identification signal when the determination signal indicates that the second identification means should not be executed, and for selecting the second identification signal as the third identification signal when the determination signal indicates that it should be executed; and image processing means for receiving the input image and the third identification signal and executing image processing.

The present invention is also an image forming apparatus comprising: image input means for acquiring image information from a document, including a paper document, and generating an input image; first identification means for performing first image area identification processing on the input image and outputting a first identification signal indicating an attribute of the image; preprocessing means for receiving the first identification signal, performing preprocessing for second image area identification processing that follows the first image area identification processing, and outputting a preprocessed signal; second identification means for receiving the preprocessed signal, performing the second image area identification processing, and outputting a second identification signal indicating an attribute of the image; determination means for receiving the preprocessed signal and outputting a determination signal indicating whether the second identification means should be executed; selection means for selecting the first identification signal as a third identification signal when the determination signal indicates that the second identification means should not be executed, and for selecting the second identification signal as the third identification signal when the determination signal indicates that it should be executed; and image processing means for receiving the input image and the third identification signal and executing image processing.

The present invention is also an image forming apparatus comprising: image input means for acquiring image information from a document, including a paper document, and generating an input image; first identification means for performing first image area identification processing on the input image and outputting a first identification signal indicating an attribute of the image; second identification means for receiving the first identification signal, performing second image area identification processing that follows the first image area identification processing, and outputting a second identification signal indicating an attribute of the image; determination means for receiving the first identification signal and outputting a determination signal indicating whether the second identification means should be executed; first image processing means for receiving the input image and the first identification signal and executing image processing when the determination signal indicates that the second identification means should not be executed; and second image processing means for receiving the input image and the second identification signal and executing image processing when the determination signal indicates that it should be executed.

The present invention is also an image processing method comprising: acquiring image information from a document, including a paper document, and generating an input image; performing first image area identification processing on the input image to generate a first identification signal indicating an attribute of the image; performing, on the first identification signal, second image area identification processing that follows the first image area identification processing to generate a second identification signal indicating an attribute of the image; generating, from the first identification signal, a determination signal indicating whether the second identification means should be executed; selecting the first identification signal as a third identification signal, without performing the second image area identification processing, when the determination signal indicates that it should not be executed, and selecting the second identification signal as the third identification signal when the determination signal indicates that it should be executed; and executing image processing on the input image and the third identification signal without performing the second image area identification processing.

According to the image forming apparatus and the image processing method of the present invention, the increase in image area identification processing time that depends on the input document image is limited, the risk of image quality degradation in the output image is reduced, and an output image close to the user's expectations can be obtained.

FIG. 1 is a diagram illustrating the configuration of the image forming apparatus according to the first embodiment.
FIG. 2 is a flowchart illustrating the general processing procedure of the image forming apparatus according to the first embodiment.
FIG. 3 is a diagram illustrating the data format of the output file of the first embodiment.
FIG. 4 is a diagram illustrating a conventional image forming apparatus.
FIG. 5 is a diagram explaining the labeling processing.
FIG. 6 is a diagram illustrating circumscribed rectangles of the connected components of an edge image, and character components.
FIG. 7 is a block diagram illustrating a configuration example of the image processing means.
FIG. 8 is a diagram illustrating the configuration of the image forming apparatus according to the second embodiment.
FIG. 9 is a flowchart illustrating the general processing procedure of the image forming apparatus according to the second embodiment.
FIG. 10 is a diagram illustrating the configuration of the image forming apparatus according to the third embodiment.
FIG. 11 is a flowchart illustrating the general processing procedure of the image forming apparatus according to the third embodiment.
FIG. 12 is a diagram explaining the data format of the PDF file generated by the first image processing means of the third embodiment.
FIG. 13 is a block diagram illustrating a configuration example of the first image processing means.
FIG. 14 is a diagram comparing the image data size of the MRC format with that of other formats.
FIG. 15 is a diagram illustrating an operation example of the character color detection means.
FIG. 16 is a diagram explaining the method of calculating a projection image.
FIG. 17 is a diagram classifying the functions of the first and second identification means.

[First embodiment]
FIG. 1 is a diagram illustrating the configuration of the image forming apparatus according to the first embodiment, and FIG. 2 is a flowchart illustrating its general processing procedure. The configuration and operation of the image forming apparatus are described below with reference to FIGS. 1 and 2.

The image input means 101 converts an input paper document into an image signal and is, for example, the scanner of an MFP. In Act 201, the image input means 101 outputs the converted input image signal 111. In Act 202, the first identification means 102 receives the input image signal 111, executes first identification processing that refers to pixels in a relatively narrow range of the input image, such as edge extraction and black pixel extraction, and outputs a first identification signal 112 indicating pixel attributes. In Act 203, the determination means 103 receives the first identification signal 112, judges the complexity of the input document image, and outputs a determination signal 113 indicating whether the input document image is complicated.

In Act 204, the second identification means 104 receives the first identification signal 112, executes second identification processing to extract character strings and character regions, and outputs a second identification signal 114. However, the second identification means performs this processing only when the determination signal 113 indicates that "the input document is not complicated (is standard)"; when the determination signal 113 indicates that "the input document is complicated", the second identification means does not perform the processing.

In Act 205, the image processing means 105 receives the input image signal 111 together with either the first identification signal 112 or the second identification signal 114 and performs image processing. When the determination signal 113 indicates that "the input document is complicated", the first identification signal 112 is input as the identification signal. When the determination signal 113 indicates that "the input document is not complicated (is standard)", the second identification signal 114 is input as the identification signal.
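The control flow of Acts 202 to 205 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_identification_signal` and the three callables passed to it are hypothetical stand-ins for the first identification means, the determination means, and the second identification means.

```python
def select_identification_signal(input_image, first_identify, judge_complex, second_identify):
    """Return the identification signal handed to image processing (Act 205).

    All three callables are hypothetical placeholders:
      first_identify(image)   -> first identification signal (Act 202)
      judge_complex(signal)   -> True if the document is judged complicated (Act 203)
      second_identify(signal) -> second identification signal (Act 204)
    """
    first_signal = first_identify(input_image)
    if judge_complex(first_signal):
        # Complicated document: skip the costly second stage entirely.
        return first_signal
    # Standard document: run the second stage on the first-stage result.
    return second_identify(first_signal)
```

The point of the structure is that the second stage is never invoked for a document judged complicated, so its cost cannot grow with the input.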

Known techniques can be used for the first identification means 102 and the second identification means 104 in the first embodiment. For example, processing similar to that of the character region extraction unit described in Japanese Patent Laid-Open No. 2000-20726 can be used.

Next, the first embodiment is described in detail.
The first embodiment takes an MFP (multifunction peripheral) as an example of the image forming apparatus and describes the generation of highly compressed PDF files, which is provided as a standard scan function of the MFP.

FIG. 3 is a diagram illustrating the data format of the output file of the first embodiment.
Even among electronic files generally called highly compressed PDF, the data format inside the file varies. In the highly compressed PDF of the first embodiment, objects such as characters and natural images are cut out of the input image 301 shown in FIG. 3, compressed with a method suited to their attributes, and combined, which achieves high compression.

That is, in this data format, 300 dpi binary character images 303, divided by color (although this cannot be shown in the figure), are superimposed on a 150 dpi background image 302. The character image 303 takes the two values "0" and "1". In the character image 303, a pixel with the value "0", corresponding to a white pixel, is drawn transparently so that the background image 302 in the layer beneath shows through, and a pixel with the value "1", corresponding to a black pixel, is drawn in a separately specified color. Because the character image is represented at 300 dpi, characters are rendered sharply, and because the resolution of the non-character portions is reduced to 150 dpi, the file size can be reduced.
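How such a layered file would be rendered can be sketched as below. This is an illustrative sketch under stated assumptions: images are plain nested lists, the 150 dpi background is brought to 300 dpi by pixel replication (the source does not say how a viewer upsamples), and the function name `compose_page` is hypothetical.

```python
def compose_page(background, text_layers):
    """Render a 300 dpi page from a 150 dpi background and binary text layers.

    background:  H x W grid of RGB tuples (150 dpi).
    text_layers: list of (color, mask) pairs, one per character color, where
                 mask is a 2H x 2W grid of 0/1 values (300 dpi).
    """
    height, width = 2 * len(background), 2 * len(background[0])
    # Assumption: upsample the background by simple pixel replication.
    page = [[background[y // 2][x // 2] for x in range(width)] for y in range(height)]
    for color, mask in text_layers:
        for y in range(height):
            for x in range(width):
                if mask[y][x] == 1:
                    page[y][x] = color  # "1": draw in the layer's color
                # "0": transparent, the background stays visible
    return page
```

For example, a single white background pixel with a black text layer whose mask is `[[1, 0], [0, 0]]` yields a 2 x 2 page in which only the top-left pixel is black.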

For comparison, the configuration and operation of a conventional image forming apparatus are described here. FIG. 4 is a diagram illustrating a conventional image forming apparatus.

The image input means 401 outputs an input image signal 411. The identification means 402 outputs an identification signal 412 based on the input image signal 411. The image processing means 406 receives the input image signal 411 and the identification signal 412, and generates and outputs a highly compressed PDF file.

The identification means 402 includes edge extraction means 403, circumscribed rectangle generation means (labeling means) 404, and character string / character region extraction means 405.

The edge extraction means 403 receives the input image signal 411 and outputs an edge image in which the edge pixels of the input image have been extracted. Various methods have been proposed for edge extraction. For example, the luminance value Y is obtained with the conversion formula of Equation (1), and edges are then extracted with the simple Sobel filter shown in the matrix below.

Y = 0.257R + 0.504G + 0.098B + 16 ... Equation (1)
(where R, G, and B are the signal values of each pixel)
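A minimal sketch of this edge extraction step is given below. Equation (1) is taken from the text; the 3 x 3 kernels are the standard Sobel kernels, supplied here as an assumption because the matrix itself is not reproduced in this text. The gradient measure (sum of absolute responses), the threshold of 128, and the function names are likewise hypothetical choices.

```python
def luminance(r, g, b):
    """Equation (1): luminance Y from the R, G, B signal values of a pixel."""
    return 0.257 * r + 0.504 * g + 0.098 * b + 16

# Assumption: the standard 3 x 3 Sobel kernels.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))

def sobel_edges(rgb_image, threshold=128.0):
    """Return a 0/1 edge image; border pixels are left at 0 for simplicity."""
    height, width = len(rgb_image), len(rgb_image[0])
    y_plane = [[luminance(*pixel) for pixel in row] for row in rgb_image]
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for ky in range(3):
                for kx in range(3):
                    value = y_plane[y + ky - 1][x + kx - 1]
                    gx += SOBEL_X[ky][kx] * value
                    gy += SOBEL_Y[ky][kx] * value
            # Hypothetical edge test: |gx| + |gy| against a fixed threshold.
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 1
    return edges
```

The binary edge image produced this way has the "0"/"1" form that the labeling step described next takes as its input.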

The circumscribed rectangle generation means (labeling means) 404 receives the edge image output from the edge extraction means 403 and outputs a circumscribed rectangle image in which the circumscribed rectangles of the connected components of edge pixels have been generated. To generate the circumscribed rectangles, the circumscribed rectangle generation means 404 executes labeling processing.

FIG. 5 is a diagram explaining an example of the labeling processing. Various labeling methods have been proposed, and any of them may of course be chosen.
The input image 501 is a binary image of "0" and "1" values. Starting from the upper-left pixel, all pixels of the input image 501 are scanned from left to right and from top to bottom and given labels. Calling the pixel currently being scanned the target pixel, the label (number) given to the target pixel is decided according to the state of the corresponding reference pixels. In the processing method defined by example 502, which shows the target pixel and the reference pixels, when the target pixel (○) is a black pixel, its label is determined by the values of the reference pixels (＊) located to the left, upper left, top, and upper right of the target pixel.

For example, labels are assigned according to the following rules:
* If none of the reference pixels is a black pixel, a new label is assigned, numbered in ascending order starting from 0.
* If the reference pixels carry one kind of label, the same label as in the reference pixels is assigned.
* If the reference pixels carry two or more kinds of labels, the lowest-numbered label among them is assigned. In addition, it is recorded that the differently numbered labels are the same label.

When the upper-left pixel (row 1, column 1) is the target pixel, the reference pixels lie outside the image and no black pixel exists among them. A new label "0" is therefore assigned. When the next pixel to the right (row 1, column 2) is the target pixel, the reference pixel at row 1, column 1 is a black pixel. The same label "0" as that reference pixel is therefore assigned.

Further to the right, the pixels at (row 1, column 3) and (row 1, column 4) are white pixels, so they are left blank and given no label. When the upper-right pixel (row 1, column 5) is the target pixel, no black pixel exists among the reference pixels. The next new label in ascending order, "1", is therefore assigned.

When the pixel at (row 4, column 4) is the target pixel, the reference pixel to its left already carries the label "0", whereas the reference pixel above it carries the label "1". The target pixel is therefore given "0", the lower of "0" and "1". In addition, it is recorded that "0" and "1" are the same label.

When the lower-left pixel (row 5, column 1) is the target pixel, no black pixel exists among the reference pixels, so a new label is assigned. Because the labels "0" and "1" are already in use, the new label is "2".

The labeling result 503 shows the state of the labels after this first scan is complete. It shows that a connected component that should carry a single label has been split into two groups: pixels labeled "0" and pixels labeled "1".

Next, label unification processing is performed. As described above, when the pixel fourth from the left and fourth from the top was the target pixel, it was separately recorded that "0" and "1" are the same label. In such a case, the labels are unified to the lower number. That is, the pixels labeled "1" in the labeling result 503 are relabeled "0". The labeling result 504 shows the outcome: all pixels of the group that should form one connected component now carry the same label.

Finally, unused labels are corrected. If the unification processing leaves unused label numbers, the labels are renumbered to close the gaps. In the example of the labeling result 504, the label "1" is no longer used, so the labels from "2" onward are shifted down. The labeling result 505 shows the labeling result finally obtained in this way.
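The three stages described above (first scan, label unification, and renumbering) can be sketched as follows. This is one possible realization under stated assumptions, not the patented implementation: the equivalence record is kept as a simple parent table, background pixels are marked with -1, and the function name `label_components` is hypothetical.

```python
def label_components(image):
    """Two-pass labeling of a 0/1 image; background pixels are returned as -1."""
    height, width = len(image), len(image[0])
    labels = [[-1] * width for _ in range(height)]
    parent = []  # parent[k] == k: label k is the lowest label of its group

    def find(label):
        # Follow the equivalence record down to the lowest equivalent label.
        while parent[label] != label:
            label = parent[label]
        return label

    # First scan: left to right, top to bottom.
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            # Reference pixels: left, upper left, top, upper right.
            seen = set()
            for dy, dx in ((0, -1), (-1, -1), (-1, 0), (-1, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < height and 0 <= nx < width and labels[ny][nx] >= 0:
                    seen.add(labels[ny][nx])
            if not seen:
                parent.append(len(parent))       # new label, ascending from 0
                labels[y][x] = len(parent) - 1
            else:
                labels[y][x] = min(seen)         # lowest label among references
                roots = {find(label) for label in seen}
                lowest = min(roots)
                for root in roots:               # record that they are the same
                    parent[root] = lowest

    # Unification to the lowest label, then renumbering to close the gaps.
    roots = sorted({find(label) for label in range(len(parent))})
    compact = {root: index for index, root in enumerate(roots)}
    return [[compact[find(v)] if v >= 0 else -1 for v in row] for row in labels]
```

On a 5 x 5 image loosely modeled on FIG. 5 (a hypothetical example, not the figure itself), two strokes that start as labels "0" and "1" and join in the fourth row end up as the single label 0, and the isolated lower-left pixel is renumbered from 2 to 1.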

For the pixels carrying the same label, the pair of minimum (x, y) coordinate values and the pair of maximum (x, y) coordinate values are the (x, y) coordinates of the upper-left and lower-right vertices of the circumscribed rectangle of the connected component. The circumscribed rectangles of the connected components of edge pixels can therefore be generated from the result of the labeling processing.
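Deriving the rectangles from a label image can be sketched as below. The representation is an assumption: the label image is a nested list in which negative values mark unlabeled pixels, and each rectangle is returned as a hypothetical `(x_min, y_min, x_max, y_max)` tuple.

```python
def bounding_boxes(labels):
    """Map each label to (x_min, y_min, x_max, y_max) of its pixels.

    Negative entries in `labels` are treated as unlabeled background.
    """
    boxes = {}
    for y, row in enumerate(labels):
        for x, label in enumerate(row):
            if label < 0:
                continue
            if label not in boxes:
                boxes[label] = [x, y, x, y]
            else:
                box = boxes[label]
                box[0] = min(box[0], x)  # upper-left vertex: minimum x and y
                box[1] = min(box[1], y)
                box[2] = max(box[2], x)  # lower-right vertex: maximum x and y
                box[3] = max(box[3], y)
    return {label: tuple(box) for label, box in boxes.items()}
```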

The character string / character region extraction means 405 receives the circumscribed rectangle image output from the circumscribed rectangle generation means 404 and outputs the identification signal 412. FIG. 6 is a diagram illustrating circumscribed rectangles of the connected components of an edge image, and character components.

Specifically, when the circumscribed rectangle image 601 of the connected components of the edge image is input (the dash-dot lines in the figure are the circumscribed rectangles), the character string / character region extraction means 405 combines rectangles that touch, intersect, or contain one another to generate character components 602. It then generates character strings by testing whether each character component has the characteristics of a character string, using criteria such as the following:
・Is its own size neither extremely large nor extremely small?
・Is the aspect ratio of the rectangle close to 1 (is the rectangle close to a square)?
・Does its size match that of the neighboring character components in the horizontal direction? Are the y coordinates aligned?
・Does its size match that of the neighboring character components in the vertical direction? Are the x coordinates aligned?
It further determines whether a region is a character region from how string-like the result is and how the character strings are arranged.
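The rectangle merging and the first two criteria can be sketched as follows. This is only an illustration of the idea: the source gives no thresholds, so `min_size`, `max_size`, and `max_aspect` are hypothetical values, "touching" is assumed to mean adjacent within one pixel, and the neighbor alignment checks are omitted.

```python
def rects_touch(a, b):
    """True if inclusive rectangles (x0, y0, x1, y1) touch, intersect, or nest.

    Assumption: rectangles whose edges are one pixel apart count as touching.
    """
    return (a[0] <= b[2] + 1 and b[0] <= a[2] + 1 and
            a[1] <= b[3] + 1 and b[1] <= a[3] + 1)

def merge_rects(rects):
    """Repeatedly combine touching rectangles until none touch."""
    rects = list(rects)
    merged = True
    while merged:
        merged = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                if rects_touch(rects[i], rects[j]):
                    a, b = rects[i], rects[j]
                    rects[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del rects[j]
                    merged = True
                    break
            if merged:
                break  # restart the scan after every merge
    return rects

def looks_like_character(rect, min_size=4, max_size=200, max_aspect=3.0):
    """Size and aspect-ratio checks; all thresholds are hypothetical."""
    width = rect[2] - rect[0] + 1
    height = rect[3] - rect[1] + 1
    if not (min_size <= width <= max_size and min_size <= height <= max_size):
        return False
    return max(width, height) / min(width, height) <= max_aspect
```

Because the scan restarts after every merge, the running time of this sketch grows quickly with the number of rectangles, which illustrates the kind of cost that repeated deletion and combination of rectangles incurs on documents with many components.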

In the course of character string and character region extraction, rectangles are deleted and combined repeatedly. When there are many character components to process, the processing time therefore becomes enormous. A large number of character components also means that there are many halftone dots and much noise that are easily misidentified as characters, so the identification accuracy is likely to be poor.

The image processing means 406 will now be described. FIG. 7 is a block diagram showing a configuration example of the image processing means. The image processing means 406 is configured to produce an image file in the data format shown in FIG. 3.

The character pixel detection means 701 executes image area identification processing on the input image signal and outputs an identification signal. Because the character pixel detection means 701 corresponds to the identification means 402 in FIG. 4, a detailed description is omitted.

First, the process of creating the multi-value background image 302 will be described. The character surrounding color detection means 702 detects the color around the characters. As one detection method, for example, pixels within a distance of 3 pixels from a character are extracted, and the average of their pixel values is taken as the character surrounding color. Next, the character filling means 703 replaces the character pixels with the detected character surrounding color, so that the characters are painted over with the surrounding background color. Finally, the image reduction means 704 executes reduction processing, and the multi-value image compression means 705 then executes multi-value image compression such as JPEG to create the background image.
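The surrounding-color detection and character filling steps can be illustrated with a small sketch. It makes several simplifying assumptions that are not stated in the specification: the image is a grayscale 2D list, "distance" is measured as Chebyshev distance, and a single average is computed for the whole image rather than per character. The function name `fill_characters` is hypothetical.

```python
def fill_characters(image, char_mask, radius=3):
    """Paint character pixels with the average of nearby non-character pixels.

    image: 2D list of gray values; char_mask: 2D list of 0/1 (1 = character).
    Assumption: one global surrounding color, Chebyshev distance <= radius.
    """
    h, w = len(image), len(image[0])
    total, count = 0, 0
    for y in range(h):
        for x in range(w):
            if char_mask[y][x]:
                continue
            # Is any character pixel within `radius` of this background pixel?
            near = any(char_mask[ny][nx]
                       for ny in range(max(0, y - radius), min(h, y + radius + 1))
                       for nx in range(max(0, x - radius), min(w, x + radius + 1)))
            if near:
                total += image[y][x]
                count += 1
    if count == 0:
        return [row[:] for row in image]  # nothing to fill
    surround = round(total / count)
    # Return a new image; the input is left unchanged.
    return [[surround if char_mask[y][x] else image[y][x] for x in range(w)]
            for y in range(h)]
```

The reduction and JPEG compression that follow are omitted here.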

The character filling means 703 paints over the characters because multi-value image compression, typified by JPEG, generally compresses images with many high spatial-frequency components poorly; erasing the characters, which contain high-frequency components, raises the compression ratio.

Next, the process of creating the character image 303 will be described. First, the character color detection means 706 detects the character color of each character pixel. Specifically, for example, character pixels of similar colors are grouped, and the average RGB value of the character pixels in each group is taken as the character color. In parallel with this processing, the binarization means 707 generates a binary image in which character pixels are “1” and non-character pixels are “0”.

The character-color binary image division means 708 divides the generated binary image into one binary image for each color grouped by the character color detection means 706. The binary image compression means 709 then applies binary image compression such as MMR to the divided binary images to create the character images.
The character pixels are grouped by color and the binary image is divided because a single binary image can express only two colors, the background and the foreground (characters).
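As a rough illustration of the grouping and per-color division described above, the following sketch groups character pixels greedily by color similarity and emits one binary layer per group. The greedy strategy, the per-channel tolerance `tol`, and the name `split_by_color` are assumptions for this example; the specification does not say how similar colors are grouped.

```python
def split_by_color(image, char_mask, tol=40):
    """Group character pixels by similar color and build one binary layer per group.

    image: 2D list of (r, g, b) tuples; char_mask: 2D list of 0/1.
    Returns a list of (average_color, binary_layer) pairs.
    Assumption: greedy grouping against each group's running mean color.
    """
    h, w = len(image), len(image[0])
    groups = []  # each group: {"sum": [r, g, b], "n": int, "pixels": [(y, x), ...]}
    for y in range(h):
        for x in range(w):
            if not char_mask[y][x]:
                continue
            c = image[y][x]
            for g in groups:
                mean = [s / g["n"] for s in g["sum"]]
                if all(abs(c[k] - mean[k]) <= tol for k in range(3)):
                    break  # pixel joins this existing group
            else:
                g = {"sum": [0, 0, 0], "n": 0, "pixels": []}
                groups.append(g)
            for k in range(3):
                g["sum"][k] += c[k]
            g["n"] += 1
            g["pixels"].append((y, x))

    layers = []
    for g in groups:
        layer = [[0] * w for _ in range(h)]
        for y, x in g["pixels"]:
            layer[y][x] = 1  # "1" marks a character pixel of this color
        color = tuple(round(s / g["n"]) for s in g["sum"])
        layers.append((color, layer))
    return layers
```

Each returned layer would then be compressed separately with a binary codec such as MMR.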

The configuration and operation of the conventional image forming apparatus have been described above. As described, when the input document image is complex, the conventional image forming apparatus runs a high risk that the processing time of the image area identification means becomes enormous or that image quality defects arise from insufficient identification accuracy.

The image forming apparatus of the first embodiment shown in FIG. 1 prevents these problems. Its configuration and operation will be described in comparison with the conventional image forming apparatus, with reference to FIG. 1 and FIG. 4.

In the image forming apparatus of the first embodiment, the identification means 402 (FIG. 4) of the conventional image forming apparatus is divided into a first identification means 102 (FIG. 1) and a second identification means 104 (FIG. 1). The first identification means 102 (FIG. 1) corresponds to the edge extraction means 403 (FIG. 4) of the conventional image forming apparatus, and the second identification means 104 corresponds to the circumscribed rectangle generation means 404 (FIG. 4) and the character string / character region extraction processing means 405 (FIG. 4) of the conventional image forming apparatus.

In the image forming apparatus of the first embodiment, the determination means 103 (FIG. 1) judges the complexity of the document. That is, the determination means 103 (FIG. 1) receives the first identification signal 112 (FIG. 1) and outputs a determination signal 113 (FIG. 1) indicating whether the document is complex.

The document image complexity determination means 103 (FIG. 1) will now be described. As noted above, in the second identification means 104, that is, the circumscribed rectangle generation means 404 (FIG. 4) and the character string / character region extraction processing means 405 (FIG. 4) of the conventional image forming apparatus, the risk that the processing time becomes enormous or that the identification accuracy deteriorates grows with the number of circumscribed rectangles in the edge image. The number of circumscribed rectangles of the edge pixels is in turn considered to depend to some extent on the number of edge pixels.

The determination means 103 (FIG. 1) therefore receives, as the first identification signal 112 (FIG. 1), the edge image signal output from the first identification means 102, that is, the edge extraction means 403 (FIG. 4) of the conventional image forming apparatus, and scans the edge image to count the number of edge pixels Ne. If Ne > Th1, it outputs a determination signal 113 indicating that the input document is complex. Conversely, if Ne <= Th1, it outputs a determination signal 113 indicating that the input document is not complex (is standard). Here, Th1 is a predetermined threshold.
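The edge-pixel test above amounts to a single counting pass followed by a comparison. A minimal sketch, assuming the edge image is a 2D list in which nonzero values mark edge pixels (the function names are hypothetical):

```python
def count_edge_pixels(edge_image):
    """Scan the edge image and count edge pixels (Ne). Nonzero = edge pixel."""
    return sum(1 for row in edge_image for p in row if p)


def document_is_complex(edge_image, th1):
    """Return True when Ne > Th1, i.e. the document is judged complex."""
    return count_edge_pixels(edge_image) > th1
```

For example, `document_is_complex([[0, 1, 1], [0, 0, 1]], th1=2)` evaluates to `True` because Ne is 3.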

When the determination signal 113 (FIG. 1) indicates that the document is complex, the second identification means 104, that is, the circumscribed rectangle generation means 404 (FIG. 4) and the character string / character region extraction processing means 405 (FIG. 4) of the conventional image forming apparatus, does not operate. The image processing means 105 (FIG. 1) therefore takes the input image signal 111 (FIG. 1) and the first identification signal 112 as inputs and outputs a high-compression PDF.

Conversely, when the determination signal 113 (FIG. 1) indicates that the document is not complex (is standard), the second identification means 104, that is, the circumscribed rectangle generation means 404 (FIG. 4) and the character string / character region extraction processing means 405 (FIG. 4) of the conventional image forming apparatus, also operates. The image processing means 105 (FIG. 1) therefore takes the input image signal 111 (FIG. 1) and the second identification signal 114 as inputs and outputs a high-compression PDF.
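The switching behavior of the two preceding paragraphs can be summarized in a few lines. In this sketch the second identification stage is passed in as a callable so that it is never invoked for a complex document; the function and parameter names are illustrative assumptions, not names from the specification.

```python
def select_identification_signal(first_signal, document_is_complex, second_identifier):
    """Choose the identification signal handed to the image processing stage.

    first_signal: output of the first identification stage (e.g. an edge image).
    document_is_complex: the determination signal (True = complex).
    second_identifier: callable implementing the second identification stage.
    """
    if document_is_complex:
        # Skip the expensive second stage and reuse the first signal as-is.
        return first_signal
    # Standard document: run the second stage on the first signal.
    return second_identifier(first_signal)
```

Passing the stage as a callable makes the saving explicit: for complex input the costly rectangle and string processing is simply never executed.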

As described above, the first embodiment can create a high-compression PDF file with little risk of image quality defects or enormous processing time even when the input document image is complex.
In addition, the document image complexity determination means 103 of the first embodiment uses only data that the second identification means 104 needs anyway and requires no new data input. The determination can therefore be made without greatly affecting the processing time. Furthermore, when the input document image is complex, a high-compression PDF can be created without generating the unnecessary signals that the conventional method would produce.

[Second Embodiment]
FIG. 8 is a diagram showing the configuration of the image forming apparatus of the second embodiment, and FIG. 9 is a flowchart showing an outline of its processing procedure. The configuration and operation of the image forming apparatus will be described with reference to FIG. 8 and FIG. 9.
The image forming apparatus of the second embodiment differs from that of the first embodiment in that it includes a second-identification preprocessing means 803.

The image input means 801 is, for example, an MFP scanner that converts an input paper document into an image signal. In Act 901, the image input means 801 outputs the converted input image signal 811. In Act 902, the first identification means 802 receives the input image signal 811, executes first identification processing that refers to pixels in a relatively narrow range of the input image, for example by edge extraction or black pixel extraction, and outputs a first identification signal 812 indicating pixel attributes. In Act 903, the second-identification preprocessing means 803 receives the first identification signal 812, performs the preprocessing for the second identification means, and outputs a preprocessing signal 813.

In Act 904, the determination means 804 receives the preprocessing signal 813, judges the complexity of the input document image, and outputs a determination signal 814 indicating whether the input document image is complex. In Act 905, the second identification means 805 receives the preprocessing signal 813, extracts character strings and character regions, and outputs a second identification signal 815. However, the second identification means 805 does not execute this processing when the determination signal 814 indicates that the input document is complex, and executes it when the determination signal 814 indicates that the input document is not complex (is standard).

In Act 906, the image processing means 806 performs image processing with the input image signal 811 and either the first identification signal 812 or the second identification signal 815 as inputs. When the determination signal 814 indicates that the input document is complex, the first identification signal 812 is input as the identification signal. When the determination signal 814 indicates that the input document is not complex (is standard), the second identification signal 815 is input as the identification signal.

The major difference between the second embodiment and the first embodiment is as follows. In the first embodiment, the identification means 402 (FIG. 4) of the conventional image forming apparatus is divided into two parts, the first identification means 102 (FIG. 1) and the second identification means 104 (FIG. 1). In the second embodiment, it is divided into three parts: the first identification means 802 (FIG. 8), the second-identification preprocessing means 803 (FIG. 8), and the second identification means 805 (FIG. 8).

The specific correspondence between the means of the second embodiment and the processing blocks of the conventional image forming apparatus is as follows.
The first identification means 802 of the second embodiment corresponds to the edge extraction means 403 (FIG. 4) of the prior art. The second-identification preprocessing means 803 of the second embodiment corresponds to the circumscribed rectangle generation means 404 (FIG. 4) of the prior art. The second identification means 805 of the second embodiment corresponds to the character string / character region extraction processing means 405 (FIG. 4) of the prior art.

The document image complexity determination means 804 (FIG. 8) will now be described. As noted above, in the character string / character region extraction processing means 405 (FIG. 4) of the prior art, which corresponds to the second identification means 805, the processing time can become enormous and the identification accuracy can deteriorate depending on the number of circumscribed rectangles in the edge image. In the first embodiment, the number of edge pixels Ne was counted, and whether Ne > Th1 was used as the criterion for judging whether the input document is complex.

In the second embodiment, the second-identification preprocessing means 803 outputs a circumscribed rectangle image as the preprocessing signal 813. The determination means 804 therefore counts the number of circumscribed rectangles Nr, which represents the complexity of the document more directly. If Nr > Th2, it judges that the input document is complex and outputs a determination signal 814 indicating that the input document is complex. Conversely, if Nr <= Th2, it outputs a determination signal 814 indicating that the input document is not complex (is standard). Here, Th2 is a predetermined threshold.
This makes it possible to judge the complexity of the document more accurately than in the first embodiment.
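To make the rectangle-count criterion concrete, the sketch below labels connected edge pixels, derives one circumscribed rectangle per component, and compares their number Nr with Th2. The use of 8-connectivity, breadth-first labeling, and the function names are assumptions for this example; the specification only states that Nr is counted and compared with the threshold.

```python
from collections import deque


def circumscribed_rects(edge_image):
    """Return one (x_min, y_min, x_max, y_max) rectangle per connected component.

    Assumption: edge pixels are nonzero and components are 8-connected.
    """
    h = len(edge_image)
    w = len(edge_image[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    rects = []
    for y in range(h):
        for x in range(w):
            if not edge_image[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cy, cx = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                # Visit the 8-neighborhood of the current pixel.
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if edge_image[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            rects.append((x0, y0, x1, y1))
    return rects


def is_complex_by_rect_count(edge_image, th2):
    """Return True when Nr > Th2."""
    return len(circumscribed_rects(edge_image)) > th2
```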

As described above, the second embodiment can judge the complexity of the input document image more accurately than the first embodiment, and can create a high-compression PDF file with little risk of image quality defects or enormous processing time even when the input document image is complex.

In addition, the document image complexity determination means 804 of the second embodiment uses only data that the second identification means 805 needs anyway and requires no new data input. The determination can therefore be made without greatly affecting the processing time.

[Third Embodiment]
FIG. 10 is a diagram showing the configuration of the image forming apparatus of the third embodiment, and FIG. 11 is a flowchart showing an outline of its processing procedure. The configuration and operation of the image forming apparatus will be described with reference to FIG. 10 and FIG. 11.
The third embodiment differs from the first embodiment in that the image processing means itself is also switched according to the complexity of the input document image; the third embodiment has a first image processing means 1005 and a second image processing means 1006.

In the third embodiment, when the complexity determination means 1003 determines in Act 1103 that the input image is complex, the first image processing means 1005 performs first image processing in Act 1106 based on the first identification signal 1012. When the input image is determined not to be complex (to be standard), the second identification means 1004 outputs a second identification signal 1013 in Act 1104 based on the first identification signal 1012. In Act 1105, the second image processing means 1006 performs second image processing based on the second identification signal 1013.

FIG. 12 is a diagram illustrating the data format of the PDF file generated by the first image processing means 1005 of the third embodiment.

The data format (MRC) shown in FIG. 12 comprises a mask image 1202, a background image 1203, and a character color image 1204. The mask image 1202 is a 300 dpi image with the binary values “0” and “1”. The background image 1203 is a 150 dpi image obtained by erasing the characters from the input image. The character color image 1204 is a 75 dpi image for specifying the character colors.

In the third embodiment, the character color image 1204 is selected and displayed at positions where the mask image 1202 has a black pixel, and the background image 1203 is selected and displayed at positions where it has a white pixel. Because the character shapes depend only on the 300 dpi mask image, the file size is reduced by lowering the resolution of everything other than the character shapes while the characters remain sharp.
The data format of the PDF file created by the second image processing means 1006 of the third embodiment is the same as that of the PDF file shown in FIG. 3, which is generated by the image processing means 105 (FIG. 1) of the first embodiment.
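The mask-based selection between the character color image and the background image can be sketched as a per-pixel lookup. The scale factors follow from the stated resolutions (300/150 = 2 and 300/75 = 4); the nearest-neighbor upsampling, the grayscale pixel values, and the name `compose_mrc` are assumptions for illustration, since an actual PDF viewer performs this composition itself.

```python
def compose_mrc(mask, background, text_color, bg_scale=2, fg_scale=4):
    """Reconstruct a page from the three MRC layers.

    mask: 2D list of 0/1 at full resolution (1 = black = character pixel).
    background: 2D list at 1/bg_scale resolution.
    text_color: 2D list at 1/fg_scale resolution.
    Assumption: nearest-neighbor upsampling of the low-resolution layers.
    """
    height, width = len(mask), len(mask[0])
    return [[text_color[y // fg_scale][x // fg_scale] if mask[y][x]
             else background[y // bg_scale][x // bg_scale]
             for x in range(width)]
            for y in range(height)]
```

Only the mask is stored at full resolution, which is why character outlines stay sharp even though both color layers are coarse.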

FIG. 13 is a block diagram showing a configuration example of the first image processing means 1005.

The character pixel detection means 1301 executes image area identification processing on the input image signal and outputs an identification signal. Because the character pixel detection means 1301 corresponds to the first identification means 1002 in FIG. 10, a detailed description is omitted.

First, the creation of the mask image 1202 will be described. The binarization means 1302 generates a binary image in which, for example, character pixels are “1” and non-character pixels are “0”. The binary image compression means 1303 applies binary compression such as MMR to the generated binary image to create the mask image 1202.

Next, the creation of the background image 1203 will be described. The character surrounding color detection means 1304 detects the color around the characters. As one detection method, for example, pixels within a distance of 3 pixels from a character are extracted, and the average of their pixel values is taken as the character surrounding color. Next, the character filling means 1305 replaces the character pixels with the detected character surrounding color, so that the characters are painted over with the surrounding background color. Finally, the first image reduction means 1306 executes reduction processing, and the first multi-value image compression means 705 then executes multi-value image compression such as JPEG to create the background image.

The character filling means 1305 paints over the characters because multi-value image compression, typified by JPEG, generally compresses images with many high spatial-frequency components poorly; erasing the characters, which contain high-frequency components, raises the compression ratio.

Next, the creation of the character color image 1204 will be described. First, the non-character erasing means 1308 erases everything other than characters from the input image. Specifically, pixels that the character pixel detection means 1301 did not detect as characters are replaced with white pixels. Next, the character pixel expansion means 1309 expands the character pixels. Specifically, the process of replacing white pixels in the 8-neighborhood of a colored pixel with that pixel's color is repeated for all colored pixels. Like the processing for the background image 1203, this character pixel expansion removes high-frequency components, here by not preserving the character shapes. Increasing the number of repetitions therefore lengthens the processing time but reduces the compressed file size.
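The character pixel expansion described above can be sketched as a repeated 8-neighborhood dilation. For brevity the example uses single gray values instead of RGB and treats 255 as white; these choices, the rule that the last writer wins when two colors compete for the same white pixel, and the function names are assumptions for illustration.

```python
WHITE = 255  # assumed value of an erased (non-character) pixel


def dilate_once(img):
    """Replace white pixels in the 8-neighborhood of each colored pixel with its color."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if img[y][x] == WHITE:
                continue
            for ny in range(max(0, y - 1), min(h, y + 2)):
                for nx in range(max(0, x - 1), min(w, x + 2)):
                    # Only pixels that were white before this pass are overwritten.
                    if img[ny][nx] == WHITE:
                        out[ny][nx] = img[y][x]
    return out


def dilate(img, iterations):
    """Apply the expansion the given number of times."""
    for _ in range(iterations):
        img = dilate_once(img)
    return img
```

Each pass grows every colored region by one pixel in all directions, so the number of useful passes is bounded by the gaps between characters.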

Once the expansion has been repeated a certain number of times, the file size stops decreasing; that number depends on the input resolution. For example, the lower the resolution, the fewer repetitions it takes for the file size reduction to reach its limit. The second image reduction means 1310 executes reduction processing on the image after character pixel expansion, and the second multi-value image compression means 1311 then executes multi-value image compression to create the character color image 1204.

Next, the reason for adopting the MRC format as the data format will be explained.
As shown in FIG. 12, the MRC format output by the first image processing means 1005 consists of a 150 dpi multi-value image for the background, a 300 dpi binary mask image for the character shapes, and a 75 dpi multi-value image for the character colors. In contrast, as shown in FIG. 3, the format output by the second image processing means 1006 consists of a 150 dpi multi-value image for the background and 300 dpi binary images for the characters.

Comparing the data sizes of the images that make up each format, as shown in FIG. 14, the background images of the former (MRC format) and the latter are almost the same size. The mask image of the former (MRC format) and the character images of the latter are also almost the same size. The file size of the former (MRC format) therefore tends to be larger by the data size of the character color image. However, the latter format requires more accurate processing than the MRC format. Specifically, the character color detection means 706 (FIG. 7), which is needed only for the latter format, must be highly accurate.

FIG. 15 is a diagram showing an example of the operation of the character color detection means.
The character image 1501 in FIG. 15 is an example of an image input to the character color detection means 706. The first line (alphabet), second line (numerals), and third line (hiragana) of this character image 1501 all have different character colors.

If, when grouping by character color, the character colors and groups correspond one to one as shown in character image 1502 and the color of each group is extracted correctly, the character image in the output electronic image file can be rendered almost identically to the input image, as shown in character image 1503. However, if characters of several colors in the first to third lines are mistakenly placed in a single character color group, as shown in character image 1504, the character image in the output electronic image file is rendered in only one color (in this example, the average color of the three lines), as shown in character image 1505.

Likewise, if each character is placed in its own group as shown in character image 1506, or if the extracted color of just one character in a string is wrong as shown in character image 1507, characters of a different color appear within a string that should be a single color, and there is a risk of outputting an image that looks very unnatural.
Furthermore, although the image data size itself is almost the same for character images 1503, 1505, and 1507, dividing too finely as in character image 1506 increases the overhead of constructing the output electronic image file, and the file size can grow as a result.

In other words, executing the second image processing means 1006 on a complex input image carries a high risk of getting the character colors wrong, so in the third embodiment the first image processing means 1005 creates an MRC-format PDF file in that case.

As described above, the third embodiment can create a high-compression PDF file with little risk of image quality defects or enormous processing time even when the input document image is complex. Furthermore, the document image complexity determination processing of the third embodiment uses only data that the image area identification processing needs anyway and requires no new data input. The determination can therefore be made without adversely affecting the processing time when a standard image is input.

The first to third embodiments used high-compression PDF as a concrete example to describe an image processing technique that can prevent problems such as enormous processing time or poor image quality caused by a complex document. The following embodiments give examples of applying this image processing technique to uses other than high-compression PDF.
[Fourth Embodiment]
The image forming apparatus of the fourth embodiment has the same configuration as the image forming apparatus shown in FIG. 1. The same reference numerals as in the first embodiment are therefore used, and a detailed description of the configuration is omitted. The image forming apparatus of this embodiment is an apparatus that detects and corrects the tilt of an input document.

Because a document image contains many characters, the tilt angle can be detected by projecting the character pixels onto the x axis while rotating the image in fixed angular steps and examining the sharpness of the projection. However, image rotation is time-consuming, so a trade-off arises: rotating in fine steps lengthens the processing time, while rotating in coarse steps lowers the accuracy.
On the other hand, for a document image that contains many line segments, such as a table, if the line segments can be extracted, the tilt angle can be detected relatively quickly simply by finding the tilt angles of the line segments.

Next, the operation of the image processing apparatus will be described with reference to FIG. 1. The first identification means 102 extracts and labels character and line segment pixels. Specifically, edge extraction and labeling of the edge pixels are performed by the same method as in the first embodiment. A circumscribed rectangle of each connected component is then generated from the maximum and minimum x and y coordinates of the pixels carrying the same label. If the circumscribed rectangle satisfies the following conditions, its diagonal is regarded as a line segment.

Condition 1: The aspect ratio is greater than or equal to a certain threshold (for example, 100).
Condition 2: (A) The upper-right and lower-left vertices of the rectangle are black pixels, and the upper-left and lower-right vertices are white pixels; or (B) the upper-left and lower-right vertices are black pixels, and the upper-right and lower-left vertices are white pixels.
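The two conditions above can be sketched in Python as follows. This is only an illustration under stated assumptions: a connected component is represented as a set of (x, y) pixel coordinates with y increasing downward, a vertex counts as "black" when it belongs to the component, and the aspect ratio is taken as the longer side divided by the shorter side. The function name `diagonal_as_line_segment` is hypothetical.

```python
def diagonal_as_line_segment(pixels, aspect_threshold=100):
    """Return (pattern, w, h) if the component's circumscribed rectangle
    qualifies as a line segment, else None.

    pixels: set of (x, y) coordinates of one labeled connected component
            (assumed representation; y grows downward as in image coordinates).
    """
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    w = x_max - x_min + 1
    h = y_max - y_min + 1

    # Condition 1: the rectangle is strongly elongated
    # (assumption: aspect ratio = longer side / shorter side).
    if max(w, h) / min(w, h) < aspect_threshold:
        return None

    # Condition 2: exactly one diagonal pair of vertices is black.
    upper_left = (x_min, y_min) in pixels
    upper_right = (x_max, y_min) in pixels
    lower_left = (x_min, y_max) in pixels
    lower_right = (x_max, y_max) in pixels
    if upper_right and lower_left and not upper_left and not lower_right:
        return ("A", w, h)
    if upper_left and lower_right and not upper_right and not lower_left:
        return ("B", w, h)
    return None
```

A perfectly horizontal or vertical run of pixels has all four vertices black and is therefore rejected by Condition 2 in this sketch; how the original implementation treats that case is not stated in the text.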

Next, the determination means 103 determines whether the skew (inclination) can be detected using the line segment information. Here it is assumed that the line segments used in a document are horizontal or vertical.
The determination means 103 obtains the mean and variance of the inclination angles of the line segments identified by the first identification means. The angle of each line segment is obtained simply as θ = tan⁻¹(w/h). For pattern (B) above, however, the sign is reversed so that θ = −tan⁻¹(w/h). Here w and h are the width and height of the rectangle, respectively. In this way, the determination means 103 calculates the inclination angle of each line segment. If the variance of the inclination angles is less than a certain threshold, it determines that the mean inclination angle θ of the line segments can be used as the skew angle of the entire document.
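The variance test described above might look like the following sketch. It takes the per-segment angles as already computed and only illustrates the decision step. The threshold value, the use of degrees, and the choice of population variance are assumptions, since the text gives no concrete numbers; `skew_from_line_angles` is a hypothetical name.

```python
from statistics import fmean, pvariance


def skew_from_line_angles(angles_deg, variance_threshold=0.25):
    """Return the mean angle if the line segments agree closely enough,
    or None if the line-based estimate should not be trusted.

    angles_deg: inclination angles of the detected line segments, in degrees.
    variance_threshold: assumed value; the source only says "a certain threshold".
    """
    if not angles_deg:
        return None  # no line segments: fall back to the second identification
    if pvariance(angles_deg) < variance_threshold:
        return fmean(angles_deg)
    return None
```

A return value of `None` corresponds to the case in which the second identification means has to run.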

On the other hand, when the determination means 103 determines that the inclination angles of the line segments cannot be used, the second identification means 104 projects the circumscribed rectangles regarded as characters onto the x-axis while rotating them in predetermined angle steps, and detects the skew angle from the shape of the resulting histogram.
Because characters are close to square in shape, a circumscribed rectangle whose aspect ratio is less than a certain threshold (for example, 2) is regarded as a character. In addition, obtaining the projection onto the x-axis while rotating the actual image would take an enormous amount of processing time, so in practice the projection is obtained by calculation.

FIG. 16 is a diagram explaining how the projected image is calculated. FIG. 16(1) shows the coordinates of a circumscribed rectangle with no inclination. Its upper-left coordinates are (x0, y0), and its width and height are (sx, sy). FIG. 16(2) shows the state in which the coordinate axes have been rotated by an angle θ. The projection of the circumscribed rectangle onto the x-axis is the range from α to β given by Equations (2) and (3) below.
α = x0 cos θ − y0 sin θ ... Equation (2)
β = (x0 + sx) cos θ − (y0 + sy) sin θ ... Equation (3)
This projection is performed for all circumscribed rectangles regarded as characters, and the angle θ at which the histogram of the projection lengths has the sharpest shape is detected as the skew angle of the document.
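The calculation in Equations (2) and (3) can be sketched as follows. The projection range follows the two equations directly. Everything else is an assumption made for illustration: the text does not define how "sharpness" of the histogram is measured, so this sketch accumulates each range [α, β] into integer bins and uses the sum of squared normalized bin counts as a stand-in measure. The names `projection_range`, `sharpness` and `detect_skew` are hypothetical.

```python
import math
from collections import Counter


def projection_range(rect, theta):
    """Project a circumscribed rectangle onto the x-axis at angle theta (radians)."""
    x0, y0, sx, sy = rect
    alpha = x0 * math.cos(theta) - y0 * math.sin(theta)                # Equation (2)
    beta = (x0 + sx) * math.cos(theta) - (y0 + sy) * math.sin(theta)   # Equation (3)
    return alpha, beta


def sharpness(rects, theta):
    """Assumed sharpness measure: sum of squared normalized bin counts."""
    hist = Counter()
    for rect in rects:
        alpha, beta = projection_range(rect, theta)
        lo, hi = sorted((round(alpha), round(beta)))
        for b in range(lo, hi + 1):
            hist[b] += 1
    total = sum(hist.values())
    if total == 0:
        return 0.0
    return sum(c * c for c in hist.values()) / (total * total)


def detect_skew(rects, candidates_deg):
    """Return the candidate angle (degrees) with the sharpest projection histogram."""
    return max(candidates_deg, key=lambda d: sharpness(rects, math.radians(d)))
```

Because only rectangle coordinates are rotated, each candidate angle costs time proportional to the number of character rectangles rather than to the number of image pixels, which is the point the text makes about obtaining the projection by calculation.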

Next, the skew correction method in the image processing means 105 will be described. Let θ be the document skew angle obtained by the first identification means or the second identification means. To correct the skew, the document only has to be rotated by the angle θ, so the pixel value of the corrected pixel (x′, y′) is the pixel value of the pixel (x, y) that satisfies the following equation, in which (x, y) is multiplied from the left by the rotation matrix for the angle θ.

x′ = x cos θ − y sin θ
y′ = x sin θ + y cos θ ... Equation (4)

Therefore, the pixel position (x, y) of the input image that must be referenced to obtain the pixel value at the corrected pixel position (x′, y′) is given by Equation (5) below, which is obtained by multiplying both sides of Equation (4) above from the left by the inverse of the rotation matrix.

x = x′ cos θ + y′ sin θ
y = −x′ sin θ + y′ cos θ ... Equation (5)

The coordinate position (x, y) of the input image obtained by Equation (5) is often not an integer when the rotation angle θ is not a multiple of 90°. That is, it often does not fall on a whole pixel. The rotation is therefore performed while interpolating the pixels of the input image (bilinear method), for example as in Equation (6), using the coordinate positions of the whole-pixel grid points surrounding the coordinate position (x, y) and their pixel values. This prevents jaggies from occurring.

Here (x₀, y₀) and (x₁, y₁) are the grid points nearest to the coordinate position (x, y) at its upper left and lower right, respectively, with the origin of the coordinate system at the upper left.
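The inverse mapping of Equation (5) combined with bilinear interpolation can be sketched as below. The interpolation shown is the standard textbook bilinear form and is not claimed to reproduce Equation (6) term for term. The image is assumed to be a list of rows of grayscale values, rotation is about the upper-left origin as in the text, and positions that map outside the input are filled with an assumed background value. `bilinear_sample` and `rotate_image` are hypothetical names.

```python
import math


def bilinear_sample(img, x, y, fill=255):
    """Interpolate img at the non-integer position (x, y)."""
    h, w = len(img), len(img[0])
    if x < 0 or y < 0 or x > w - 1 or y > h - 1:
        return fill  # outside the input image (assumed background value)
    x0, y0 = int(math.floor(x)), int(math.floor(y))      # upper-left grid point
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)      # lower-right grid point
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bottom


def rotate_image(img, theta, fill=255):
    """Build the corrected image by looking up each output pixel in the input."""
    h, w = len(img), len(img[0])
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for yp in range(h):
        row = []
        for xp in range(w):
            x = xp * c + yp * s       # Equation (5)
            y = -xp * s + yp * c
            row.append(bilinear_sample(img, x, y, fill))
        out.append(row)
    return out
```

Iterating over output pixels and sampling the input, rather than pushing input pixels forward, guarantees that every corrected pixel receives a value.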

As described above, in this embodiment the skew angle is detected from the line segment information when the document contains reliable line segments, and the second identification means detects the skew angle only when it is determined that there are no reliable line segments. Skew correction can thus be performed without wasting processing time.

[Fifth embodiment]
The image forming apparatus of the fifth embodiment has the same configuration as the image forming apparatus of the third embodiment shown in FIG. 10. Accordingly, the same reference numerals as those of the image forming apparatus of the third embodiment are used, and a detailed description of the configuration is omitted.
The image forming apparatus of the fifth embodiment generates the highly compressed PDF file described with reference to FIG. 3, and also detects and corrects the orientation and input skew of the document. In this embodiment, skew detection uses only character pixels and does not use line segments.

The skew detection and document orientation detection methods in common use today assume that the input document image contains many characters. When there are few characters, however, the detection accuracy drops.

Therefore, in the fifth embodiment, as in the third embodiment, the first identification means 1002 (FIG. 10) performs edge extraction, and the determination means 1003 executes the labeling process and counts the circumscribed rectangles. If the number of circumscribed rectangles is less than a predetermined threshold (for example, 100), it is determined that sufficient accuracy cannot be obtained for skew and document orientation detection, and the second identification means 1004 is not executed. When it is determined that skew and document orientation detection will not be performed, the first image processing means 1005 performs no skew or document orientation correction and generates a highly compressed PDF by the method described in the first embodiment.

When the determination means 1003 determines that skew and document orientation detection should be performed, the second identification means 1004 performs the skew and document orientation detection process. Skew detection can be performed in the same way as in the fourth embodiment, so its description is omitted. Document orientation detection is performed using OCR. First, an arbitrary character string is extracted, the character string image is rotated to the four orientations 0°, 90°, 180°, and 270°, and the OCR similarity is calculated for each. The angle with the highest similarity is detected as the correct document orientation. When the second identification means has detected the document skew and document orientation in this way, the second image processing means 1006 corrects the document skew and document orientation and creates the highly compressed PDF.
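The orientation search described above amounts to picking the best of four rotations. The sketch below shows only that selection step. The OCR engine is not specified in the text, so it is represented by a caller-supplied scoring function `ocr_similarity`, which is a hypothetical stand-in. The image is assumed to be a list of rows, and the angles are assumed to denote clockwise rotations applied to the extracted character string image.

```python
def rotate_cw(img):
    """Rotate a list-of-rows image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def detect_orientation(text_image, ocr_similarity):
    """Return the rotation (0, 90, 180 or 270) with the highest OCR similarity.

    ocr_similarity: hypothetical callable returning a score for an image;
                    it stands in for whatever OCR engine is actually used.
    """
    scores = {}
    rotated = text_image
    for angle in (0, 90, 180, 270):
        scores[angle] = ocr_similarity(rotated)
        rotated = rotate_cw(rotated)
    return max(scores, key=scores.get)
```

Running OCR on a single extracted character string rather than on the whole page keeps the cost of the four trials small.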

In the image forming apparatus of each embodiment described above, the image area identification process is divided into a first identification process (the first half) and a second identification process (the second half), and whether to execute the second identification means is determined according to the input document image. The first identification process is a process whose accuracy and speed do not depend on the input document image. The determination means, which determines whether to execute the second identification means, receives or generates a signal used by the second identification means and makes its determination based on that signal.

When the complexity of the document has been assessed and the document is found to be simple, the second identification process is performed based on the first identification signal output from the first identification means and the determination signal output from the determination means, and image processing is performed based on the second identification signal. When the document is determined to be complex, on the other hand, the second identification process is not performed and image processing is performed using the first identification signal.
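The control flow shared by the embodiments can be summarized in a short sketch. The three stages are passed in as functions because their contents differ between embodiments. All names here are hypothetical, and the returned value plays the role of the signal that is handed to image processing.

```python
def select_identification_signal(input_image, first_identify,
                                 should_run_second, second_identify):
    """Run the cheap first stage, then the second stage only if warranted.

    first_identify:    input image -> first identification signal
    should_run_second: first identification signal -> bool (determination signal)
    second_identify:   first identification signal -> second identification signal
    """
    first_signal = first_identify(input_image)
    if should_run_second(first_signal):
        return second_identify(first_signal)   # simple document: refine the result
    return first_signal                        # complex document: skip the costly stage
```

The decision is made from data the first stage has already produced, so a document that does go through the second stage pays no extra input cost for the check.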

Note that the functions of the first identification means and the second identification means are not classified solely by whether their accuracy and speed depend on the input document image, as described above. FIG. 17 is a diagram that classifies the functions of the first and second identification means. The first identification means and the second identification means of the present invention can be classified by the functions shown in FIG. 17, and conversely, any first identification means and second identification means classified by these functions fall within the technical scope of the present invention.

According to the image forming apparatus of these embodiments, when a complex image is input, the risk of excessive processing time or image quality defects is reduced, and an image close to what the user wants can be formed. When a non-complex (standard) image is input, the first identification means, the determination means, and the second identification means all consist only of processing that is inherently required, so the output image the user expects is obtained without affecting processing time or identification accuracy.

Note that the present invention is not limited to the above embodiments as they stand, and at the implementation stage the constituent elements can be modified and embodied without departing from the gist of the invention.
In addition, various inventions can be formed by appropriately combining the constituent elements disclosed in the above embodiments. For example, some constituent elements may be deleted from all the constituent elements shown in an embodiment. Furthermore, constituent elements from different embodiments may be combined as appropriate.

The present invention can be used in industries that manufacture image forming apparatuses capable of obtaining an output image close to the user's expectations by reducing the increase in image area identification processing time that depends on the input document image and by reducing the risk of image quality degradation in the output image.

Description of reference numerals: 101 ... image input means, 102 ... first identification means, 103 ... determination means, 104 ... second identification means, 105 ... image processing means, 301 ... input image, 302 ... background image, 302 ... multi-value image, 303 ... character image, 303 ... image for characters, 401 ... image input means, 402 ... identification means, 403 ... edge extraction means, 404 ... circumscribed rectangle generation means, 405 ... character string / character area extraction processing means, 406 ... image processing means, 801 ... image input means, 802 ... first identification means, 803 ... pre-processing means, 804 ... determination means, 805 ... second identification means, 806 ... image processing means, 1002 ... first identification means, 1003 ... determination means, 1004 ... second identification means, 1005 ... first image processing means, 1006 ... second image processing means.

JP 2005-175641 A; JP 2003-008909 A; JP 2000-020726 A

Claims

Image input means for acquiring image information from a document including a paper document and generating an input image;
First identification means for performing a first image area identification process on an input image and outputting a first identification signal indicating an attribute of the image;
Second identification means for inputting the first identification signal, performing a second image area identification process subsequent to the first image area identification process, and outputting a second identification signal indicating an attribute of the image;
Judgment means for inputting the first identification signal and outputting a judgment signal indicating whether or not to execute the second identification means;
Selection means for selecting the first identification signal as a third identification signal, without the second image area identification process being performed, when the determination signal indicates that the process should not be executed, and for selecting the second identification signal as the third identification signal when the determination signal indicates that it should be executed;
An image forming apparatus comprising: an image processing unit that inputs the input image and the third identification signal and executes image processing without performing the second image area identification processing.

Image input means for acquiring image information from a document including a paper document and generating an input image;
First identification means for performing a first image area identification process on an input image and outputting a first identification signal indicating an attribute of the image;
Pre-processing means for inputting the first identification signal, performing pre-processing of a second image area identification process following the first image area identification process, and outputting a pre-process signal;
Second identification means for inputting a pre-processing signal, performing the second image area identification process, and outputting a second identification signal indicating an attribute of the image;
Judgment means for inputting a pre-processing signal and outputting a judgment signal indicating whether or not to execute the second identification means;
Selection means for selecting the first identification signal as a third identification signal, without the second image area identification process being performed, when the determination signal indicates that the process should not be executed, and for selecting the second identification signal as the third identification signal when the determination signal indicates that it should be executed;
An image forming apparatus comprising: an image processing unit that inputs the input image and the third identification signal and executes image processing without performing the second image area identification processing.

Image input means for acquiring image information from a document including a paper document and generating an input image;
First identification means for performing a first image area identification process on an input image and outputting a first identification signal indicating an attribute of the image;
Second identification means for inputting the first identification signal, performing a second image area identification process subsequent to the first image area identification process, and outputting a second identification signal indicating an attribute of the image;
Judgment means for inputting the first identification signal and outputting a judgment signal indicating whether or not to execute the second identification means;
An image forming apparatus comprising: image processing means including first image processing means for inputting the input image and the first identification signal and executing image processing, without the second image area identification process being performed, when the determination signal indicates that the process should not be executed, and second image processing means for inputting the input image and the second identification signal and executing image processing when the determination signal indicates that it should be executed.

The image forming apparatus according to claim 1 or 2, wherein the first identifying means identifies the attribute of a pixel by referring only to the pixel and its vicinity, and the second identifying means identifies the attribute by referring to a wider area or the entire document.

The image forming apparatus according to claim 1, wherein the first identifying means is an identifying means that does not depend on the distribution of pixel values of the input image, and the second identifying means is an identifying means that depends on the distribution of pixel values of the input image.

The image forming apparatus according to claim 1, wherein the first identification means is an identification means whose processing accuracy does not depend on the distribution of pixel values of the input image, and the second identification means is an identification means whose processing accuracy depends on the distribution of pixel values of the input image.

4. The identification means according to claim 3, wherein the first identification means is an identification means whose processing time and accuracy do not depend on a distribution of pixel values of an input image composed of threshold processing of the pixel itself and filter processing of a fixed size. Item 5. The image forming apparatus according to Item 4.

The image forming apparatus according to claim 3, wherein the second identifying means is an identifying means that analyzes the mutual relationship between the sizes and positions of connected components of the pixels classified by the first identifying means, and whose processing time and accuracy depend on the distribution of pixel values of the input image.

The image forming apparatus according to claim 1, wherein the first identifying means is means for identifying an attribute for each pixel of the input image, and the second identifying means is means for identifying an attribute for each area represented by a rectangle in the input image.

Obtain image information from a manuscript including a paper manuscript to generate an input image,
Performing a first image area identification process on the input image to generate a first identification signal indicating an attribute of the image;
Performing a second image area identification process subsequent to the first image area identification process on the first identification signal to generate a second identification signal indicating an attribute of the image;
Generating a determination signal indicating whether to execute the second identification means from the first identification signal;
When the determination signal indicates that the process should not be executed, selecting the first identification signal as a third identification signal without performing the second image area identification process, and when the determination signal indicates that it should be executed, selecting the second identification signal as the third identification signal,
An image processing method, wherein image processing is performed without performing the second image area identification processing on the input image and the third identification signal.