JP2010011455A - Image-forming device

Info

Publication number: JP2010011455A
Application number: JP2009148053A
Authority: JP
Inventors: Shunichi Mekawa; 俊一女川; Masaaki Yasunaga; 真明安永
Original assignee: Toshiba Corp; Toshiba TEC Corp
Current assignee: Toshiba Corp; Toshiba TEC Corp
Priority date: 2008-06-26
Filing date: 2009-06-22
Publication date: 2010-01-14

Abstract

PROBLEM TO BE SOLVED: To provide a technique for obtaining a precise image area identification result even for a copy original of a plurality of generations, and for obtaining a high-function electronic file.

SOLUTION: An image-forming device has: a document image input means for inputting a document image to generate an input document image, namely image data; an image area information input means for inputting image area information created for each image area of the document image; a watermark image formation means for forming a watermark image in which the image area information is embedded; and a watermarked document image composite means for compositing the input document image and the watermark image to form a watermarked document image.

COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to a technique that can obtain a highly accurate image area identification result even for a multi-generation copy document, and that can obtain a highly functional electronic file.

Conventional printers, MFPs, and image forming apparatuses (hereinafter abbreviated as "copiers") perform image area identification, which identifies the type of each image, when reproducing original image data obtained by scanning a document. Based on the image area identification result, image processing suited to each image type (such as filtering of the original image data) is executed, and the result is then output to a printing device.

The image type of each area of the document image is determined because, for example, a character area and a graphic area require different filter processing even when they are on the same document. The technique disclosed in Patent Document 1 is known as a method for separating a document image into a plurality of areas and identifying the image type of each area.

Meanwhile, with the spread of network services such as e-mail and the Web, electronic files are sent and distributed over networks more and more frequently. If an electronic file is distributed in the output file format of word processing or presentation software, it can easily be tampered with. A simple way to prevent such tampering is to scan a printed copy and convert it into an image file such as JPEG, TIFF, or PDF before distribution. However, a plain image file has drawbacks: it is difficult to reuse, for example to search, and its file size is large.

Therefore, to enable reuse such as searching, a character area can be extracted from the input image using, for example, the image identification processing proposed in Patent Document 1, and OCR processing can be applied to the extracted character area to create a file that supports keyword search.
In addition, to reduce the file size, a method has been proposed in which the input image is divided into character areas and natural image areas, and the image data of each is compressed by a method suited to it, achieving image reproduction with both high compression and high image quality (see, for example, Patent Document 2).

Dividing the input image into character areas and natural image areas by image identification processing in this way makes it possible to reuse the file, for example for searching, and to compress it highly. In general, however, no identification technique is perfect, and misidentification can always occur.

In particular, it is well known that a copy document that has been copied over many generations on a copier has degraded image quality compared with the original document. Consequently, the image area identification result for a generation copy image, obtained by reading such a copy document, is less accurate than the image area identification result for the original document.

The same applies not only to copy documents from a copier but also to documents output by a printer. That is, when a printed image is scanned to create a highly functional electronic file, misidentification may occur in the image identification processing.

The present invention has been made in view of these circumstances. Its object is to provide a technique that can obtain a highly accurate image area identification result even for a multi-generation copy document, and that can obtain a highly functional electronic file.

To solve the above problem, the present invention is an image forming apparatus having: document image input means for inputting a document image and generating an input document image, which is image data; image area information input means for inputting image area information created for each image area of the document image; watermark image forming means for forming a watermark image in which the image area information is embedded; and watermarked document image composition means for compositing the input document image and the watermark image to form a watermarked document image.

The present invention is also an image forming apparatus having: document data input means for inputting document data; document image forming means for forming a document image, which is image data, from the document data; image area information input means for inputting image area information created for each image area in the document data; watermark image forming means for forming a watermark image in which the image area information is embedded; and watermarked document image composition means for compositing the document image and the watermark image to form a watermarked document image.

According to the present invention, a highly accurate image area identification result can be obtained even for a multi-generation copy document, and a highly functional electronic file can be obtained.

FIG. 1 is a block diagram showing the configuration of the image forming apparatus according to the first embodiment.
FIG. 2 is a flowchart showing the general operation of the image forming apparatus according to the first embodiment.
FIG. 3 is a diagram showing the contents of the input document image and image area information.
FIG. 4 is a diagram illustrating a conventional image forming method.
FIG. 5 is a block diagram showing the configuration of an image processing apparatus corresponding to the image forming apparatus of the first embodiment.
FIG. 6 is a diagram showing an example of the image processing performed by the image processing unit.
FIG. 7 is a block diagram showing another configuration of the image processing apparatus corresponding to the image forming apparatus of the first embodiment.
FIG. 8 is a block diagram showing the configuration of the image forming apparatus according to the second embodiment.
FIG. 9 is a flowchart showing the general operation of the image forming apparatus according to the second embodiment.
FIG. 10 is a diagram showing the input document data and the output document image data when the document image forming unit is a RIP.
FIG. 11 is a block diagram showing the configuration of the image forming apparatus according to the third embodiment.
FIG. 12 is a diagram showing the image area information obtained by the image identification unit of the third embodiment.
FIG. 13 is a block diagram showing the configuration of the image forming apparatus according to the fourth embodiment.
FIG. 14 is a diagram showing a specific extraction method and an example of the extracted information when the image area information extraction unit is implemented in a RIP.
FIG. 15 is a block diagram showing the configuration of the image forming apparatus according to the fifth embodiment.
FIG. 16 is a diagram explaining the function of the image area information editing unit.
FIG. 17 is a diagram showing an example of editing image area information.
FIG. 18 is a block diagram showing the configuration of the image forming apparatus according to the sixth embodiment.
FIG. 19 is a diagram explaining the function of the image area information editing unit.
FIG. 20 is a block diagram showing the configuration of the image forming apparatus according to the seventh embodiment.
FIG. 21 is a diagram explaining the function of the image area information editing unit.
FIG. 22 is a block diagram showing the configuration of the image processing apparatus according to the first embodiment.
FIG. 23 is a flowchart showing the general operation of the image processing apparatus according to the first embodiment.
FIG. 24 is a block diagram showing the configuration of an image forming apparatus.
FIG. 25 is a diagram showing the contents of the input document image and image area information.
FIG. 26 is a diagram illustrating an image forming method using a conventional technique.
FIG. 27 is a diagram showing the image area information extracted from the watermark image embedded in the input document image.
FIG. 28 is a diagram showing the image area information obtained by the image identification unit.
FIG. 29 is a diagram explaining the configuration and operation of an apparatus for obtaining a high-quality generation copy.
FIG. 30 is a diagram showing an example of the image processing performed by the image processing unit.
FIG. 31 is a diagram explaining the configuration and operation of an apparatus for obtaining a highly compressed electronic file.
FIG. 32 is a block diagram showing the configuration of the image processing apparatus according to the second embodiment.
FIG. 33 is a diagram showing an operation example of the image area information composition unit when the image processing apparatus of the second embodiment is used for OCR preprocessing.
FIG. 34 is a diagram showing an operation example of the image area information composition unit when the image processing apparatus of the second embodiment is used to create a highly compressed file.
FIG. 35 is a diagram showing an operation example of the image area information composition unit when the image processing apparatus of the second embodiment is used to generate identification signals for a copier.
FIG. 36 is a block diagram showing the configuration of the image processing apparatus according to the third embodiment.
FIG. 37 is a diagram showing an example of the image area information obtained by the image processing apparatus of the third embodiment.
FIG. 38 is a block diagram showing the configuration of the image processing apparatus according to the fourth embodiment.
FIG. 39 is a diagram showing an example of the image area information obtained by the image processing apparatus of the fourth embodiment.

[First embodiment]
FIG. 1 is a block diagram showing the configuration of the image forming apparatus according to the first embodiment. FIG. 2 is a flowchart showing the general operation of the image forming apparatus according to the first embodiment.
The configuration and operation of the image forming apparatus are described below with reference to FIGS. 1 and 2.

The image forming apparatus includes a document image input unit 101, an image area information input unit 102, a watermark image forming unit 103, and a watermark document composition unit 104.

In operation 201, the document image input unit 101 inputs a document image. The document image input unit 101 is, for example, a scanner mounted on an MFP or a copier, and the document image to be input is a paper document. In operation 202, the image area information input unit 102 inputs image area information created in advance from the document image to be input.

FIG. 3 is a diagram showing the contents of the input document image and image area information. The document image 301 in FIG. 3(1) contains the image areas of a title 302, body text 303, and a natural image 304. To extract image area information from the document image 301, the user, for example, looks at the screen and encloses each image area (306, 307, 308) in a rectangle with a mouse or similar device, as shown in the document image 305 of FIG. 3(2), and then specifies the attribute of each area from a keyboard or the like.

The image area information 309 shown in FIG. 3(3) is created from the rectangle coordinates and the attribute of each image area specified by the user in this way, and is input to the image area information input unit 102. The image area information consists of the attribute of each area (title / body text / natural image) and its coordinate position (310, 311, 312).
In this embodiment, the coordinate position and the title / body text / natural image attribute are input as image area information. Other information can also be input as image area information: for example, the character color and character codes for an image area with a character attribute, or a landscape image / portrait image classification for a natural image.
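For illustration only, the image area information described above (an attribute plus a rectangle per image area, with optional extra items) might be represented as in the following Python sketch. The names ImageArea and build_image_area_info, the attribute strings, and the pixel-unit rectangles are hypothetical choices made for this example and do not come from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2); units are an assumption


@dataclass
class ImageArea:
    """One image area: an attribute plus its bounding rectangle."""
    attribute: str                                        # "title", "text", or "natural_image"
    rect: Rect
    extra: Dict[str, str] = field(default_factory=dict)  # optional items, e.g. character color


def build_image_area_info(selections: List[Tuple[str, Rect]]) -> List[ImageArea]:
    """Turn user-drawn rectangles and chosen attributes into image area information."""
    allowed = {"title", "text", "natural_image"}
    areas = []
    for attribute, (xa, ya, xb, yb) in selections:
        if attribute not in allowed:
            raise ValueError(f"unknown attribute: {attribute}")
        # Normalize so that (x1, y1) is always the smaller corner,
        # whichever direction the user dragged the rectangle.
        rect = (min(xa, xb), min(ya, yb), max(xa, xb), max(ya, yb))
        areas.append(ImageArea(attribute, rect))
    return areas
```

The optional `extra` dictionary is one possible place for the additional items mentioned above, such as character color or a landscape / portrait classification.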

In operation 203, the watermark image forming unit 103 forms a watermark image in which the image area information obtained by the image area information input unit 102 is embedded. Then, in operation 204, the watermark document composition unit 104 composites the document image output from the document image input unit 101 with the watermark image output from the watermark image forming unit 103 to form an image in which the watermark information is embedded.

Various methods have already been disclosed for forming a watermark image and for forming an image in which watermark information is embedded. For example, the technique described in Japanese Patent Application Laid-Open No. 2003-101762 embeds information in a document by placing black pixels in the background of the document by a predetermined method.

FIG. 4 is a diagram illustrating an image forming method using the technique described in Japanese Patent Application Laid-Open No. 2003-101762.
The document image input unit 101 outputs a document image 401. The watermark image forming unit 103 converts the image area information output from the image area information input unit 102 into a watermark image 402. The watermark document composition unit 104 then composites the document image 401 and the watermark image 402 to obtain a document 403 in which the watermark information is embedded.
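The flow of forming a watermark image, compositing it with the document image, and later reading the information back can be sketched as follows. This is a toy scheme written for illustration, not the method of Japanese Patent Application Laid-Open No. 2003-101762: the JSON encoding, the 16-bit length prefix, the sparse dot lattice, and all function names are assumptions. It also assumes that the lattice positions fall on white background; a practical scheme would need redundancy and error correction to survive printing and scanning.

```python
import json
from typing import List

Bitmap = List[List[int]]  # 1 = black pixel, 0 = white pixel


def to_bits(payload: bytes) -> List[int]:
    """Prefix the payload with a 16-bit length, then expand to bits (MSB first)."""
    framed = len(payload).to_bytes(2, "big") + payload
    return [(byte >> shift) & 1 for byte in framed for shift in range(7, -1, -1)]


def form_watermark(info: list, height: int, width: int, pitch: int = 4) -> Bitmap:
    """Form a watermark image: one lattice point per bit, black dot for a 1."""
    bits = to_bits(json.dumps(info, separators=(",", ":")).encode("utf-8"))
    cells = [(r, c) for r in range(0, height, pitch) for c in range(0, width, pitch)]
    if len(bits) > len(cells):
        raise ValueError("page too small for this payload")
    mark = [[0] * width for _ in range(height)]
    for bit, (r, c) in zip(bits, cells):
        mark[r][c] = bit
    return mark


def composite(document: Bitmap, mark: Bitmap) -> Bitmap:
    """Overlay the watermark on the document image (a black pixel in either wins)."""
    return [[d | m for d, m in zip(drow, mrow)] for drow, mrow in zip(document, mark)]


def extract_info(page: Bitmap, pitch: int = 4) -> list:
    """Read the lattice back and decode the embedded image area information."""
    height, width = len(page), len(page[0])
    bits = [page[r][c] for r in range(0, height, pitch) for c in range(0, width, pitch)]
    data = bytes(
        int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits) - 7, 8)
    )
    length = int.from_bytes(data[:2], "big")
    return json.loads(data[2:2 + length].decode("utf-8"))
```

In this sketch, `composite(document, form_watermark(info, h, w))` plays the role of the watermarked document, and `extract_info` returns the same image area information as long as no document pixel covers a lattice point.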

Next, a method is described for obtaining a high-quality generation copy from a document with embedded watermark information that was produced by the image forming apparatus of the first embodiment.
FIG. 5 is a block diagram showing the configuration of an image processing apparatus corresponding to the image forming apparatus of the first embodiment.

The document image input unit 501 inputs a document image that has been output on paper and in which the image area information described above is embedded as a watermark image. The document image input unit 501 is, for example, the scanner of an MFP or a copier. Next, the watermark information extraction unit 502 extracts the embedded image area information. The watermark information can be extracted using, for example, the technique described in the above-mentioned Japanese Patent Application Laid-Open No. 2003-101762.
The image processing unit 503 then applies image processing suited to each image area to the document image output from the document image input unit 501, using the extracted image area information.

FIG. 6 is a diagram showing an example of the image processing performed by the image processing unit 503. Examples of image processing include filter processing and gradation processing. For example, a blur filter that smooths out the halftone dots of the input image is applied to a natural image to prevent moire caused by interference with the halftone dots used at image output, and an edge enhancement filter is applied to a character image to emphasize the edges of characters. In gradation processing, a natural image is given gradation processing with a low screen ruling because gradation matters more than resolution, and a character image is given gradation processing with a high screen ruling because resolution matters more.
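The filter selection described above can be sketched as follows, under stated assumptions: the image is an 8-bit grayscale array, each image area is an (attribute, rectangle) pair with an end-exclusive rectangle, and the 3x3 box blur and sharpening kernels are common textbook choices rather than the filters of the disclosed apparatus. The gradation (screen ruling) step is not covered. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

Gray = List[List[int]]            # 0 = black ... 255 = white
Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2), end-exclusive

BLUR = [[1 / 9] * 3 for _ in range(3)]           # smooths halftone dots
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]  # emphasizes character edges
KERNEL_FOR = {"natural_image": BLUR, "text": SHARPEN, "title": SHARPEN}


def filter_by_area(image: Gray, areas: Sequence[Tuple[str, Rect]]) -> Gray:
    """Apply a 3x3 kernel, chosen by attribute, inside each image area."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]   # pixels outside every area stay unchanged
    for attribute, (x1, y1, x2, y2) in areas:
        kernel = KERNEL_FOR[attribute]
        for y in range(max(0, y1), min(h, y2)):
            for x in range(max(0, x1), min(w, x2)):
                acc = 0.0
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        yy = min(max(y + dy, 0), h - 1)   # replicate border pixels
                        xx = min(max(x + dx, 0), w - 1)
                        acc += kernel[dy + 1][dx + 1] * image[yy][xx]
                out[y][x] = min(255, max(0, round(acc)))  # clamp to the 8-bit range
    return out
```

Because the kernel is looked up from the image area information rather than from an analysis of the scanned pixels, the choice of filter does not degrade as the copy generation increases.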

Finally, the image output unit 504 outputs the processed image. The image output unit is, for example, the print section of an MFP or a copier.

As described above, an accurate image area identification result can be used because the image area information is obtained from the information embedded as a watermark, instead of by performing image area identification on a degraded copy image. Furthermore, by feeding the image area information obtained from the watermark back into the image area information input unit 102 shown in FIG. 1, a highly accurate image area identification result can be obtained no matter how many generations of copies are made.

FIG. 7 is a block diagram showing another configuration of the image processing apparatus corresponding to the image forming apparatus of the first embodiment. This image processing apparatus obtains a highly compressed electronic file from a document in which watermark information is embedded.

The document image input unit 701 inputs a document image that has been output on paper and in which the image area information described above is embedded as a watermark image. Next, the watermark information extraction unit 702 extracts the embedded image area information. The image dividing unit 703 then uses the extracted image area information to divide the document image output from the document image input unit 701 into a character image and a natural image.

High image quality and high compression are achieved by compressing the divided character image and natural image each with a method suited to it. For example, the first compression unit 704, which compresses the character image, uses MMR compression, which can handle only binary images but is lossless (reversible). The second compression unit 705, which compresses the natural image, uses JPEG compression, which loses the high-frequency components of the image but is well suited to images with continuous tone. Finally, the image combining unit 706 combines the compressed character image and the compressed natural image.
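The divide-then-compress structure described above can be sketched with stand-in codecs, since neither MMR nor JPEG is available in the Python standard library. In this sketch, zlib on a bit-packed binary image stands in for the lossless character codec, and coarse quantization followed by zlib stands in for the lossy natural image codec. These substitutions, the threshold and step values, and all names are assumptions made for illustration only.

```python
import zlib
from typing import Dict, List, Tuple

Gray = List[List[int]]            # 0 = black ... 255 = white
Rect = Tuple[int, int, int, int]  # (x1, y1, x2, y2), end-exclusive


def crop(image: Gray, rect: Rect) -> Gray:
    """Cut one image area out of the page."""
    x1, y1, x2, y2 = rect
    return [row[x1:x2] for row in image[y1:y2]]


def compress_text(region: Gray, threshold: int = 128) -> bytes:
    """Binarize, pack 8 pixels per byte, compress losslessly (zlib stands in for MMR)."""
    packed = bytearray()
    for row in region:
        bits = [1 if p < threshold else 0 for p in row]
        bits += [0] * (-len(bits) % 8)   # pad each row to a byte boundary
        for i in range(0, len(bits), 8):
            packed.append(int("".join(map(str, bits[i:i + 8])), 2))
    return zlib.compress(bytes(packed), 9)


def compress_natural(region: Gray, step: int = 16) -> bytes:
    """Coarsely quantize gray levels, then compress (a lossy stand-in for JPEG)."""
    quantized = bytes(min(255, (p // step) * step + step // 2) for row in region for p in row)
    return zlib.compress(quantized, 9)


def compress_by_area(image: Gray, areas: List[Tuple[str, Rect]]) -> List[Dict]:
    """Split the page by image area information and compress each piece with its own method."""
    parts = []
    for attribute, rect in areas:
        region = crop(image, rect)
        codec = "natural" if attribute == "natural_image" else "text"
        data = compress_natural(region) if codec == "natural" else compress_text(region)
        parts.append({"rect": rect, "codec": codec, "data": data})
    return parts   # a combining step would store these parts in one container file
```

Keeping the rectangle and codec name with each compressed part is one simple way to let a later combining step reassemble the page.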

As described above, a high-quality, highly compressed image file can be obtained by performing image processing with the embedded image area information instead of performing image area identification on a degraded copy image.

[Second embodiment]
FIG. 8 is a block diagram showing the configuration of the image forming apparatus according to the second embodiment. FIG. 9 is a flowchart showing the general operation of the image forming apparatus according to the second embodiment.
The configuration and operation of the image forming apparatus are described below with reference to FIGS. 8 and 9.

The image forming apparatus includes a document data input unit 801, a document image forming unit 802, an image area information input unit 803, a watermark image forming unit 804, and a watermark document composition unit 805.

In operation 901, the document data input unit 801 inputs document data. The document data input here is, for example, an application file on a PC, GDI output from an application, or PDL (page description language) output from a printer driver.

In operation 902, the document image forming unit 802 forms image data such as GDI, PDL, or raster data from the input document data. The document image forming unit 802 is, for example, an application on a PC, a printer driver, or a RIP (raster image processor).

FIG. 10 is a diagram showing the input document data and the output document image data when the document image forming unit 802 is a RIP. The input document data is mainly PDL created by a printer driver; the document data 1001 in FIG. 10(1) is a specific example. The document data 1001 contains character image area data 1002 and 1003 and natural image area data 1004.

The document image forming unit 802, which is a RIP, outputs a raster image as document image data. The raster image 1005 shown in FIG. 10(2) is a specific example. The character image data 1006 and 1007 in the raster image 1005 are character document images formed from the character image area data 1002 and 1003 of the input PDL, respectively. Each is formed as a bitmap image according to information described in the PDL, such as the font name, font size, character display position, character color, character string size, and character string data (character codes).

The natural image data 1008 in the raster image 1005 is a natural image document image formed from the natural image area data 1004 of the input PDL. It is formed as a bitmap image according to information such as the image size, position, color information, bit depth, compression method, and image data (bitmap data). A raster image formed in this way is normally output to paper through a print engine.

The image area information input unit 803, the watermark image forming unit 804, and the watermark document composition unit 805 shown in FIG. 8 operate in the same way as the corresponding units of the first embodiment shown in FIG. 1. That is, in operation 903, the image area information input unit 803 inputs image area information created by the user from the document data. In operation 904, the watermark image forming unit 804 forms a watermark image in which the image area information is embedded. In operation 905, the watermark document composition unit 805 forms a composite image in which the watermark information is embedded.

With the image processing apparatuses described with reference to FIGS. 5 and 7, a high-quality image and a high-quality, highly compressed electronic file can be obtained no matter how many generations of copies are made of a document image printed by the image forming apparatus of the second embodiment.

[Third embodiment]
FIG. 11 is a block diagram showing the configuration of the image forming apparatus according to the third embodiment. The image forming apparatus includes a document image input unit 1101, an image identification unit 1102, a watermark image forming unit 1103, and a watermark document composition unit 1104.

The third embodiment differs from the first embodiment in that it has the image identification unit 1102. That is, in the third embodiment the image area information is not created by the user; instead, the image areas of the input document image are identified automatically and the image area information is extracted.

FIG. 12 is a diagram showing the image area information obtained by the image identification unit 1102 of the third embodiment. FIG. 12(1) shows the input document image 1201. FIG. 12(2) shows the image area information 1205 obtained by the image identification unit 1102. Techniques for identifying image areas such as character images and natural images in the document image 1201 are already known; for example, the technique proposed in Japanese Patent Application Laid-Open No. 2005-39430 can be used as the image identification unit 1102.

In the third embodiment, the image is composited using the result of image identification processing performed on the original document (document image), whose image quality has not degraded. A highly accurate image area identification result is therefore always available, so a highly accurate image area identification result can be obtained no matter how many generations of copies are made. A high-quality, highly compressed electronic file can also be obtained.

[Fourth embodiment]
FIG. 13 is a block diagram showing the configuration of the image forming apparatus according to the fourth embodiment.
The image forming apparatus includes a document data input unit 1301, a document image forming unit 1302, an image area information extraction unit 1303, a watermark image forming unit 1304, and a watermark document composition unit 1305.

The fourth embodiment differs from the second embodiment in that it has the image area information extraction unit 1303. That is, in the fourth embodiment the image area information is not created by the user; instead, the image areas of the input document data are identified automatically and the image area information is extracted.

像域情報抽出手段１３０３は、入力される文書データから特定の像域情報を抽出する。像域情報抽出手段１３０３は、アプリケーション内やプリンタドライバ内、ＲＩＰ内で実現される。
図１４は、像域情報抽出手段１３０３がＲＩＰ内で実装される場合の具体的な抽出方法と抽出情報の例を示す図である。図１４（１）は、入力される文書データ１４０１を示している。文書データ１４０１はＰＤＬである。文書データ１４０１には、文字列１４０２、文字列１４０３、文字列１４０４の３行の文字列と自然画像１４０５を含んでいる。 The image area information extraction unit 1303 extracts specific image area information from the input document data. The image area information extraction unit 1303 is realized in an application, a printer driver, or a RIP.
FIG. 14 is a diagram illustrating an example of a specific extraction method and of the extracted information when the image area information extraction unit 1303 is implemented in the RIP. FIG. 14 (1) shows the input document data 1401. The document data 1401 is written in a page description language (PDL). It contains three lines of character strings (1402, 1403, and 1404) and a natural image 1405.

像域抽出手段１３０３は、ＰＤＬから像域の属性・座標・文字色などを抽出する。図１４（２）は抽出された像域情報１４０６を示している。
具体的には、像域抽出手段１３０３は、文書データ１４０１中の文字列データ１４０２から、第１像域の像域情報１４０７を抽出すると共に、第１像域の属性を文字と特定している。また、像域抽出手段１３０３は、文字列データ１４０２の位置情報と文字列サイズ情報から、第１像域の左下座標（ｘ１，ｙ１）と右上座標（ｘ２，ｙ２）の組が（２０，７０）−（８０，９０）であることも特定している。更に像域抽出手段１３０３は、文字列データ１４０２中の文字色情報から第１像域の文字色も抽出する。 An image area extraction unit 1303 extracts attributes, coordinates, character colors, and the like of the image area from the PDL. FIG. 14B shows the extracted image area information 1406.
Specifically, the image area extraction unit 1303 extracts the image area information 1407 of the first image area from the character string data 1402 in the document data 1401 and identifies the attribute of the first image area as character. From the position information and the character string size information of the character string data 1402, it also determines that the lower-left coordinates (x1, y1) and upper-right coordinates (x2, y2) of the first image area are (20, 70)-(80, 90). Further, the image area extraction unit 1303 extracts the character color of the first image area from the character color information in the character string data 1402.

同様の処理を文字列１４０３、１４０４と自然画像１４０５に対しても繰り返すことで像域情報１４０６が生成される。
尚、本実施の形態で抽出した像域情報は像域の属性・座標・文字色であるが、これに限定するものではなく、文書データに記述されている情報、および、その情報に処理を加えることで得られる情報であれば抽出可能である。 Similar processing is repeated for the character strings 1403 and 1404 and the natural image 1405 to generate image area information 1406.
Note that the image area information extracted in the present embodiment consists of image area attributes, coordinates, and character colors. However, extraction is not limited to these: any information described in the document data, or obtainable by processing that information, can be extracted.
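The extraction described for the fourth embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: `PdlObject` stands in for an already-parsed PDL drawing command, and the names `PdlObject`, `ImageArea`, and `extract_image_areas` are hypothetical. Only the (20, 70)-(80, 90) coordinates of the first area come from the text; the other values are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PdlObject:
    """Hypothetical stand-in for one parsed PDL drawing command."""
    kind: str                      # "text" or "image"
    x: int                         # lower-left x of the object
    y: int                         # lower-left y of the object
    width: int
    height: int
    color: Optional[str] = None    # character color, text objects only


@dataclass
class ImageArea:
    """One entry of image area information."""
    attribute: str                  # "character" or "natural image"
    lower_left: Tuple[int, int]     # (x1, y1)
    upper_right: Tuple[int, int]    # (x2, y2)
    char_color: Optional[str] = None


def extract_image_areas(objects: List[PdlObject]) -> List[ImageArea]:
    """Derive attribute, bounding box, and character color per object."""
    areas = []
    for obj in objects:
        is_text = obj.kind == "text"
        areas.append(ImageArea(
            attribute="character" if is_text else "natural image",
            lower_left=(obj.x, obj.y),
            upper_right=(obj.x + obj.width, obj.y + obj.height),
            char_color=obj.color if is_text else None,
        ))
    return areas


# The first object reproduces the (20, 70)-(80, 90) example from the text;
# the second object is invented.
document = [
    PdlObject("text", 20, 70, 60, 20, color="black"),
    PdlObject("image", 30, 10, 40, 30),
]
areas = extract_image_areas(document)
```

Because the attribute and coordinates are read from the document data rather than inferred from pixels, no image identification step is involved in this sketch.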

第４の実施の形態では、文書データを用いて正確な像域情報を抽出する。従って、画像識別処理を用いることによる誤識別を回避することができる。この結果、何世代コピーしても高精度な像域識別結果を得ることが可能である。また、高画質・高圧縮な電子ファイルを得ることも可能となる。 In the fourth embodiment, accurate image area information is extracted using document data. Accordingly, erroneous identification due to the use of the image identification processing can be avoided. As a result, it is possible to obtain a highly accurate image area identification result regardless of how many generations are copied. Also, it is possible to obtain an electronic file with high image quality and high compression.

[第５の実施の形態]
図１５は、第５の実施の形態の画像形成装置の構成を示すブロック図である。画像形成装置は、文書画像入力手段１５０１、画像識別手段１５０２、像域情報編集手段１５０３、透かし画像形成手段１５０４、透かし文書合成手段１５０５を備えている。 [Fifth embodiment]
FIG. 15 is a block diagram illustrating a configuration of an image forming apparatus according to the fifth embodiment. The image forming apparatus includes a document image input unit 1501, an image identification unit 1502, an image area information editing unit 1503, a watermark image forming unit 1504, and a watermark document synthesizing unit 1505.

第５の実施の形態では、像域情報編集手段１５０３を有している点で、第３の実施の形態と異なっている。像域情報を組み込んで形成される画像から高機能な電子データを得る際、文字画像・写真画像の情報だけではなく更に詳細な情報が必要とされる場合がある。例えば、文字領域を識別すると共に、「タイトル」という属性まで識別できれば、タイトルの領域にＯＣＲ処理をし、その結果からファイル名を付けるといった高度な処理が可能となる。 The fifth embodiment is different from the third embodiment in that an image area information editing unit 1503 is provided. When highly functional electronic data is obtained from an image formed by incorporating image area information, not only character image / photographic image information but also more detailed information may be required. For example, if a character area can be identified and an attribute “title” can be identified, advanced processing such as performing OCR processing on the title area and assigning a file name based on the result is possible.

しかし、このような詳細な情報を得ようとした場合、画像識別手段１５０２には非常に高度な技術が要求される。このため、詳細情報を出力できなかったり、あるいは、誤識別を発生させてしまう場合がある。
また、透かし等を用いて埋め込める情報量には制限がある。そのため、優先順位を付けずに、例えば、像域情報抽出手段１３０３（図１３）により抽出された順番で情報を埋め込んでいくと、その文書を顕著に表す重要な像域情報を埋め込めない可能性がある。 However, in order to obtain such detailed information, the image identification means 1502 requires a very advanced technique. For this reason, detailed information may not be output or erroneous identification may occur.
In addition, the amount of information that can be embedded using a watermark or the like is limited. Therefore, if information is embedded without priorities, for example in the order in which it was extracted by the image area information extraction unit 1303 (FIG. 13), important image area information that characterizes the document may not fit.

図１６は、像域情報編集手段１５０３の機能を説明する図である。図１６（１）は、入力された文書画像１６０１である。図１６（２）は、画像識別手段１５０２により出力された像域情報１６０６を示している。図１６（３）は、ユーザにより必要な情報が付与されたり、あるいは別の情報に書き換えられたり、優先順位を付けられたりして編集された像域情報１６１８を示している。なお、本実施の形態では、像域情報１６０６は優先順位の高い順番に記述されるものとする。 FIG. 16 is a diagram for explaining the function of the image area information editing unit 1503. FIG. 16A shows the input document image 1601. FIG. 16B shows the image area information 1606 output by the image identification unit 1502. FIG. 16 (3) shows image area information 1618 that has been edited with necessary information added by the user, rewritten with other information, or given priority. In this embodiment, it is assumed that the image area information 1606 is described in order of priority.

画像識別手段１５０２により出力された像域情報１６０６は単純に文書を上→下、左→右の順番で走査した順に記述されている。まず、像域情報付与の例を説明する。一般に画像識別処理において像域が重なっている場合には誤識別を起こしやすい。そこで、文書画像１６０１にある自然画像１６０５の中の文字“ＭＦＰ”は誤識別による抽出漏れで抽出出来なかったとする。この場合、ユーザが像域情報１６１８を編集し、文字列“ＭＦＰ”を第５像域としてその属性情報１６２５と位置情報１６２６を付加する。 The image area information 1606 output by the image identification unit 1502 is described in the order in which the document is simply scanned in the order of top → bottom and left → right. First, an example of image area information addition will be described. In general, when image areas overlap in image identification processing, erroneous identification is likely to occur. Therefore, it is assumed that the character “MFP” in the natural image 1605 in the document image 1601 cannot be extracted due to omission of extraction due to misidentification. In this case, the user edits the image area information 1618 and adds the attribute information 1625 and the position information 1626 to the character string “MFP” as the fifth image area.

次に、像域情報削除の例を示す。本実施の形態の画像形成装置により形成した画像を再利用する際に、特に文字色の記述がない場合は黒文字であるという規則が既にある場合、“黒文字”という情報は不要である。そのため、像域情報１６０６の第１像域、第３像域の文字色情報（１６０９，１６１５）は像域情報１６１８から削除されている。 Next, an example of image area information deletion will be shown. When reusing an image formed by the image forming apparatus according to the present embodiment, if there is already a rule that the character is black unless there is a description of the character color, the information “black character” is unnecessary. Therefore, the character color information (1609, 1615) of the first image area and the third image area of the image area information 1606 is deleted from the image area information 1618.
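The deletion rule above can be illustrated with a short sketch. It assumes image area entries are plain dictionaries with a hypothetical `char_color` key; the function names are likewise invented for illustration. The point is that a value equal to the agreed default can be dropped before embedding and restored by the same rule when the image is reused.

```python
from typing import Dict, List


def drop_default_color(entries: List[Dict], default: str = "black") -> List[Dict]:
    """Return copies of the entries without character colors equal to the default."""
    edited = []
    for entry in entries:
        entry = dict(entry)                      # leave the input untouched
        if entry.get("char_color") == default:
            del entry["char_color"]
        edited.append(entry)
    return edited


def effective_color(entry: Dict, default: str = "black") -> str:
    """Apply the rule: no color description means the default color."""
    return entry.get("char_color", default)


entries = [
    {"area": 1, "attribute": "character", "char_color": "black"},
    {"area": 2, "attribute": "character", "char_color": "red"},
]
edited = drop_default_color(entries)
```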

続いて情報書き換えの例を示す。像域情報１６０６においてはどのような文字像域もその属性は単なる“文字”としか抽出されていない。それに対し編集後の像域情報１６１８では“タイトル”、“本文”と更に詳細な分類（１６１９，１６２１，１６２３）がされている。 Next, an example of information rewriting is shown. In the image area information 1606, the attribute of any character image area is simply extracted as “character”. On the other hand, the image area information 1618 after editing is further classified into “title” and “text” (1619, 1621, 1623).

最後に優先順位の書き換えの例を示す。本実施の形態では、自然画像像域より文字画像像域を優先し、更に文字色情報の優先度を最も低くしている。その結果、像域情報１６１８では自然画像像域情報１６２７，１６２８の優先順位が低く、第２像域の文字色１６２９が最も優先順位が低くなっている。 Finally, an example of rewriting priority is shown. In the present embodiment, the character image area is prioritized over the natural image area, and the priority of the character color information is the lowest. As a result, in the image area information 1618, the priority order of the natural image image area information 1627 and 1628 is low, and the character color 1629 of the second image area has the lowest priority order.

像域情報編集手段１５０３は、予め決められた優先度の順に情報を選択し、透かし画像形成手段１５０４が、限界の情報量までの情報を埋め込んだ透かし画像を作成する。例えば、４個分の像域情報しか埋めこめられない場合には、第１像域の属性情報（１６１９）、位置情報（１６２０）、第２像域の属性情報（１６２１）、位置情報（１６２２）の４つの像域情報のみを埋め込んだ透かし画像を作成する。 The image area information editing unit 1503 selects information in the predetermined order of priority, and the watermark image forming unit 1504 creates a watermark image in which information is embedded up to the limit of the embeddable amount. For example, when only four pieces of image area information can be embedded, a watermark image is created that embeds only four pieces: the attribute information (1619) and position information (1620) of the first image area, and the attribute information (1621) and position information (1622) of the second image area.
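The selection step can be sketched as below. This is an illustration only: the numeric priorities and the name `select_for_embedding` are assumptions, and capacity is counted in entries as in the four-item example, whereas a real watermark would more likely be limited in bits.

```python
from typing import List, Tuple

# (priority, label); a smaller number means a higher priority (assumed convention).
Entry = Tuple[int, str]


def select_for_embedding(entries: List[Entry], capacity: int) -> List[str]:
    """Pick the `capacity` highest-priority entries, in priority order."""
    ordered = sorted(entries, key=lambda entry: entry[0])
    return [label for _, label in ordered[:capacity]]


# Labels carry the reference numerals used in the text.
edited_info: List[Entry] = [
    (1, "area 1 attribute (1619)"),
    (2, "area 1 position (1620)"),
    (3, "area 2 attribute (1621)"),
    (4, "area 2 position (1622)"),
    (5, "area 3 attribute (1623)"),
    (9, "area 2 character color (1629)"),
]
embedded = select_for_embedding(edited_info, capacity=4)
```

With a capacity of four, the lower-priority entries, such as the character color (1629), are left out of the watermark.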

なお、像域情報編集手段１５０３は、ユーザとの間で、例えば、コントロールパネルを介して情報授受を行って、像域情報を編集する。図１７は、像域情報の編集の一例を示す図である。ユーザが文書画像１７０１中の領域（１７０２、１７０３、１７０４）のいずれかを指定して、属性の編集を行うことを入力すると、プルダウンメニュー１７０５が表示される。そこで、ユーザはこのプルダウンメニューから、タイトル、本文、文字、自然画像などを選択するとその選択された情報が像域情報１６１８に反映される。 The image area information editing unit 1503 edits image area information by exchanging information with the user via, for example, a control panel. FIG. 17 is a diagram illustrating an example of editing image area information. When the user designates one of the areas (1702, 1703, 1704) in the document image 1701 and inputs to edit the attribute, a pull-down menu 1705 is displayed. Therefore, when the user selects a title, text, character, natural image, or the like from this pull-down menu, the selected information is reflected in the image area information 1618.

第５の実施の形態では、重要な像域情報の優先度を高くしているため精度の高い像域情報を使用することが可能となる。このことにより、何世代コピーしても高精度な像域識別結果を得ることが可能である。また、高画質・高圧縮な電子ファイルを得ることも可能となる。 In the fifth embodiment, since the priority of important image area information is increased, it is possible to use highly accurate image area information. This makes it possible to obtain a highly accurate image area identification result regardless of how many generations are copied. Also, it is possible to obtain an electronic file with high image quality and high compression.

[第６の実施の形態]
図１８は、第６の実施の形態の画像形成装置の構成を示すブロック図である。
画像形成装置は、文書データ入力手段１８０１、文書画像形成手段１８０２、像域情報抽出手段１８０３、像域情報編集手段１８０４、透かし画像形成手段１８０５、透かし文書合成手段１８０６を備えている。 [Sixth embodiment]
FIG. 18 is a block diagram illustrating a configuration of an image forming apparatus according to the sixth embodiment.
The image forming apparatus includes document data input means 1801, document image forming means 1802, image area information extracting means 1803, image area information editing means 1804, watermark image forming means 1805, and watermark document composition means 1806.

第６の実施の形態は像域情報編集手段１８０４を有している点で第４の実施の形態と異なっている。 The sixth embodiment is different from the fourth embodiment in that an image area information editing unit 1804 is provided.

入力される文書データには画像を形成するためのデータのみを含んでいる場合が多い。そのため、後で高機能な電子データを得ようとしたときには別の情報が必要となる場合がある。
また、透かし等を用いて像域情報を埋め込む際、埋め込める情報量には制限がある。そのため、優先順位を付けずに、例えば、像域情報抽出手段１８０３により抽出された順番に像域情報を埋め込んでいくと、その文書を顕著に表す重要な像域情報を埋め込めない可能性がある。 In many cases, the input document data includes only data for forming an image. For this reason, when trying to obtain highly functional electronic data later, other information may be required.
Further, when image area information is embedded using a watermark or the like, the amount of information that can be embedded is limited. Therefore, if image area information is embedded without priorities, for example in the order in which it was extracted by the image area information extraction unit 1803, important image area information that characterizes the document may not fit.

図１９は、像域情報編集手段１８０４の機能を説明する図である。図１９（１）は、入力された文書データ１９０１を示している。図１９（２）は、像域情報抽出手段１８０３により出力された像域情報１９０６を示している。図１９（３）は、ユーザにより必要な情報が付与されたり、あるいは別の情報に書き換えられたり、優先順位を付けられたりして編集された像域情報１９１８を示している。なお、本実施の形態では、像域情報１９０６は優先順位の高い順番に記述されるものとする。 FIG. 19 is a diagram illustrating the function of the image area information editing unit 1804. FIG. 19 (1) shows the input document data 1901. FIG. 19 (2) shows the image area information 1906 output by the image area information extraction unit 1803. FIG. 19 (3) shows image area information 1918 that has been edited with necessary information added by the user, rewritten with other information, or given priority. In this embodiment, it is assumed that the image area information 1906 is described in order of priority.

第６の実施の形態の像域情報編集手段１８０４は、第５の実施の形態の像域情報編集手段１５０３と同様に動作する。従って、その詳細の説明は省略する。
第６の実施の形態では、重要な像域情報の優先度を高くしているため精度の高い像域情報を使用することが可能となる。このことにより、何世代コピーしても高精度な像域識別結果を得ることが可能である。また、高画質・高圧縮な電子ファイルを得ることも可能となる。 The image area information editing unit 1804 according to the sixth embodiment operates in the same manner as the image area information editing unit 1503 according to the fifth embodiment. Therefore, detailed description thereof is omitted.
In the sixth embodiment, since the priority of important image area information is increased, it is possible to use highly accurate image area information. This makes it possible to obtain a highly accurate image area identification result regardless of how many generations are copied. Also, it is possible to obtain an electronic file with high image quality and high compression.

[第７の実施の形態]
図２０は、第７の実施の形態の画像形成装置の構成を示すブロック図である。
画像形成装置は、文書データ入力手段２００１、文書画像形成手段２００２、像域情報抽出手段２００３、像域情報編集手段２００４、透かし画像形成手段２００５、透かし文書合成手段２００６を備えている。 [Seventh embodiment]
FIG. 20 is a block diagram illustrating a configuration of an image forming apparatus according to the seventh embodiment.
The image forming apparatus includes a document data input unit 2001, a document image forming unit 2002, an image area information extracting unit 2003, an image area information editing unit 2004, a watermark image forming unit 2005, and a watermark document synthesizing unit 2006.

第７の実施の形態と第６の実施の形態とでは像域情報編集手段２００４（１８０４）の機能が異なっている。第６の実施の形態の像域情報編集手段１８０４は、ユーザが直接指定した優先順位に基づいて編集するのに対し、第７の実施の形態の像域情報編集手段２００４は、事前に決めた規則に則って自動的に指定される優先順位に基づいて編集する。 The seventh embodiment differs from the sixth embodiment in the function of the image area information editing unit 2004 (1804). The image area information editing unit 1804 of the sixth embodiment edits based on priorities specified directly by the user, whereas the image area information editing unit 2004 of the seventh embodiment edits based on priorities assigned automatically according to predetermined rules.

図２１は、像域情報編集手段２００４の機能を説明する図である。図２１（１）は、入力された文書データ２１０１である。図２１（２）は、像域情報抽出手段２００３により出力された像域情報２１０６を示している。図２１（３）は、事前に決めた規則に則って自動的に指定される優先順位に基づいて編集された像域情報２１１８を示している。 FIG. 21 is a diagram for explaining the function of the image area information editing unit 2004. FIG. 21 (1) shows the input document data 2101. FIG. 21 (2) shows the image area information 2106 output by the image area information extraction unit 2003. FIG. 21 (3) shows the image area information 2118 edited based on the priority order automatically designated in accordance with a predetermined rule.

続いて、像域情報編集手段２００４の動作について説明する。
像域情報編集手段２００４は予め決められた規則に則って、例えば、像域識別する際に誤識別が起こりやすい順に、像域情報の優先順位をつける。
一般的に複数のオブジェクトに重なりがある場合、像域識別は非常に難しい。図２１（１）の文書データ２１０１の例では、像域２１０２と像域２１０５との像域識別は難しい。そこで、重なりのある像域情報の優先順位を高くする。 Next, the operation of the image area information editing unit 2004 will be described.
The image area information editing unit 2004 prioritizes the image area information according to a predetermined rule, for example, in the order in which erroneous identification is likely to occur when the image area is identified.
In general, when multiple objects overlap, image area identification is very difficult. In the example of the document data 2101 in FIG. 21 (1), it is difficult to distinguish the image area 2102 from the image area 2105. Therefore, the priority of overlapping image area information is raised.

図２１（２）の像域情報２１０６によれば、それぞれの像域の左下座標（ｘ１，ｙ１）及び右上座標（ｘ２，ｙ２）の座標値は判明している。そこで、領域Ａ及び領域Ｂの座標を調べて下記条件が成立する場合、領域Ａ及び領域Ｂには重なり有りと判断し、優先順位を高くする。 According to the image area information 2106 in FIG. 21 (2), the coordinate values of the lower left coordinates (x1, y1) and the upper right coordinates (x2, y2) of each image area are known. Therefore, if the following conditions are satisfied by examining the coordinates of the regions A and B, it is determined that there is an overlap in the regions A and B, and the priority is increased.

条件：（ｙＡ１＜ｙＢ１＜ｙＡ２またはｙＡ１＜ｙＢ２＜ｙＡ２）かつ
（ｘＡ１＜ｘＢ１＜ｘＡ２またはｘＡ１＜ｘＢ２＜ｘＡ２）
但し、領域Ａの左下座標−右上座標の組を（ｘＡ１，ｙＡ１）−（ｘＡ２，ｙＡ２）、
領域Ｂの左下座標−右上座標の組を（ｘＢ１，ｙＢ１）−（ｘＢ２，ｙＢ２）とする。 Condition: (yA1 < yB1 < yA2 or yA1 < yB2 < yA2) and
(xA1 < xB1 < xA2 or xA1 < xB2 < xA2),
where the lower-left/upper-right coordinate pair of region A is (xA1, yA1)-(xA2, yA2)
and that of region B is (xB1, yB1)-(xB2, yB2).
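The overlap condition can be written out directly. The first function below is a literal transcription of the condition as stated; the second, which applies the test in both orders, is an assumption added for illustration and is not described in the text. Note that the stated condition depends on which region is taken as A: a region B that fully encloses A is not detected by the first function alone.

```python
from typing import Tuple

# A region is (x1, y1, x2, y2): lower-left and upper-right corners.
Region = Tuple[int, int, int, int]


def overlaps_as_stated(a: Region, b: Region) -> bool:
    """Evaluate the condition from the text for regions A and B."""
    xa1, ya1, xa2, ya2 = a
    xb1, yb1, xb2, yb2 = b
    y_hit = (ya1 < yb1 < ya2) or (ya1 < yb2 < ya2)
    x_hit = (xa1 < xb1 < xa2) or (xa1 < xb2 < xa2)
    return y_hit and x_hit


def overlaps_either_order(a: Region, b: Region) -> bool:
    """Apply the stated test in both orders (an assumption, not in the text)."""
    return overlaps_as_stated(a, b) or overlaps_as_stated(b, a)


# Invented coordinates for illustration.
region_a = (20, 20, 80, 60)
region_b = (50, 40, 100, 90)     # partially overlaps region_a
far_away = (0, 70, 10, 90)       # disjoint from region_a
outer = (0, 0, 100, 100)
inner = (40, 40, 60, 60)         # fully inside outer
```

Even the two-order variant misses cross-shaped overlaps, where neither rectangle has a corner edge inside the other on both axes, so a production rule would probably use a standard interval-intersection test instead.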

また、フォントサイズが小さな文字領域も像域識別が難しい。そこで、像域情報としてフォントサイズが抽出されている場合には、そのフォントサイズがある閾値より小さい場合に優先順位を上げる。図２１（２）の像域情報２１０６に示したように、像域情報にフォントサイズが直接記述されていない場合でも、例えば、像域属性が文字であって、その像域の幅・高さいずれかが閾値より小さい場合、即ち像域座標の関係が下式に当てはまる場合は、小文字領域と判断し優先順位を上げる。 Character areas with a small font size are also difficult to identify. Therefore, when the font size has been extracted as image area information, the priority is raised if the font size is smaller than a certain threshold. Even when the font size is not directly described in the image area information, as in the image area information 2106 in FIG. 21 (2), if, for example, the image area attribute is character and either the width or the height of the image area is smaller than the threshold, that is, if the image area coordinates satisfy the following expression, the area is judged to be a small-character area and its priority is raised.

ｘ２−ｘ１＜ｔｈまたはｙ２−ｙ１＜ｔｈ
なお、ｔｈはある閾値を表す。 x2-x1 <th or y2-y1 <th
Note that th represents a certain threshold value.
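The small-character test can be sketched as a one-line predicate. The function name and the threshold values used in the example are assumptions; the text does not give a concrete value for th.

```python
def is_small_character_area(attribute: str, x1: int, y1: int,
                            x2: int, y2: int, th: int) -> bool:
    """True when a character area is narrower or shorter than th."""
    if attribute != "character":
        return False
    return (x2 - x1) < th or (y2 - y1) < th


# A 60 x 20 character area: small for th=25 (height 20 < 25), not for th=10.
small = is_small_character_area("character", 20, 70, 80, 90, th=25)
not_small = is_small_character_area("character", 20, 70, 80, 90, th=10)
```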

また、ここでは優先順位を付け替える場合について具体例を挙げて説明した。しかし、第７の実施の形態では、優先順位の付け替えのみに限定されない。第３の実施の形態と同様に、例えばある閾値よりサイズが大きな文字像域の属性はロゴに置き換える、ある閾値よりサイズが小さな自然画像の属性情報は削除するなどの規則に則り、像域情報の付与・置き換え・削除等の編集が可能である。 Further, here, the case of changing the priority order has been described with a specific example. However, the seventh embodiment is not limited to only changing the priority order. Similar to the third embodiment, the image area information conforms to the rules such as replacing the attribute of a character image area having a size larger than a certain threshold value with a logo or deleting the attribute information of a natural image having a size smaller than a certain threshold value. Editing such as assigning, replacing, and deleting is possible.

透かし画像形成手段２００５は、像域情報編集手段２００４で決めた優先度の順に情報を選択して限界の情報量までの情報を埋め込んだ透かし画像を作成する。 The watermark image forming unit 2005 selects information in the order of priority determined by the image area information editing unit 2004 and creates a watermark image in which information up to the limit information amount is embedded.

第７の実施の形態では、重要な像域情報の優先度を高くしているため精度の高い像域情報を使用することが可能となる。このことにより、何世代コピーしても高精度な像域識別結果を得ることが可能である。また、高画質・高圧縮な電子ファイルを得ることも可能となる。 In the seventh embodiment, since the priority of important image area information is increased, it is possible to use highly accurate image area information. This makes it possible to obtain a highly accurate image area identification result regardless of how many generations are copied. Also, it is possible to obtain an electronic file with high image quality and high compression.

以上説明した各実施の形態によれば、コピー出力、あるいはプリント出力を再度コピー出力する際に従来の手法と比べて高画質なコピー出力を得ることができる。あるいは、プリント出力、コピー出力をスキャン入力して再電子ファイル化する際に高機能な電子ファイルを得ることができる。 According to each embodiment described above, it is possible to obtain a high-quality copy output as compared with the conventional method when copy output or print output is copied again. Alternatively, a highly functional electronic file can be obtained when a print output and a copy output are scanned and converted into an electronic file.

続いて、上述の画像形成装置で形成された画像から透かし情報を抽出する画像処理装置について説明する。 Next, an image processing apparatus that extracts watermark information from an image formed by the above-described image forming apparatus will be described.

[第１の実施の形態]
図２２は、第１の実施の形態の画像処理装置の構成を示すブロック図である。図２３は、第１の実施の形態の画像処理装置の概略の動作を示すフローチャートである。
以下、図１及び図２を参照しつつ画像処理装置の構成と動作を説明する。 [First embodiment]
FIG. 22 is a block diagram illustrating a configuration of the image processing apparatus according to the first embodiment. FIG. 23 is a flowchart illustrating a schematic operation of the image processing apparatus according to the first embodiment.
Hereinafter, the configuration and operation of the image processing apparatus will be described with reference to FIGS. 1 and 2.

画像処理装置は、文書画像入力手段２２０１、透かし情報抽出手段２２０２、画像識別手段２２０３を備えている。 The image processing apparatus includes a document image input unit 2201, a watermark information extraction unit 2202, and an image identification unit 2203.

動作２３０１において、文書画像入力手段２２０１は文書画像を入力する。ここで文書画像入力手段２２０１は、ＭＦＰや複写機に搭載されているスキャナ等である。入力される文書画像は画像形成装置を用いて像域情報を透かし画像として合成した文書画像を紙などに出力したものである。 In operation 2301, the document image input unit 2201 inputs a document image. Here, the document image input means 2201 is a scanner or the like mounted on an MFP or a copying machine. The input document image is obtained by outputting, on paper or the like, a document image obtained by combining image area information as a watermark image using an image forming apparatus.

像域情報を透かし画像として合成する上述の画像形成装置の例を、以下に簡単に説明する。
図２４は、画像形成装置の構成を示すブロック図である。
画像形成装置は、文書画像入力手段２４０１、像域情報入力手段２４０２、透かし画像形成手段２４０３、透かし文書合成手段２４０４を備えている。 An example of the above-described image forming apparatus that synthesizes image area information as a watermark image will be briefly described below.
FIG. 24 is a block diagram illustrating a configuration of the image forming apparatus.
The image forming apparatus includes a document image input unit 2401, an image area information input unit 2402, a watermark image formation unit 2403, and a watermark document composition unit 2404.

文書画像入力手段２４０１は、文書画像を入力する。ここで文書画像入力手段２４０１は、例えば、ＭＦＰや複写機に搭載されているスキャナであり、入力される文書画像は紙原稿である。像域情報入力手段２４０２は入力される文書画像を元に予め作成されている像域情報を入力する。 The document image input unit 2401 inputs a document image. Here, the document image input unit 2401 is, for example, a scanner mounted on an MFP or a copier, and the input document image is a paper document. The image area information input unit 2402 inputs image area information created in advance based on the input document image.

図２５は、入力される文書画像及び像域情報の内容を示す図である。図２５（１）の文書画像２５０１は、タイトル２５０２，本文２５０３及び自然画像２５０４の像域を含んでいる。この文書画像２５０１から像域情報を抽出するために、例えば、図２５（２）の文書画像２５０５に示したように、ユーザが画面を見ながらマウス等で各像域（２５０６，２５０７，２５０８）を矩形で囲み、キーボード等から各属性を指定する。 FIG. 25 is a diagram showing the contents of the input document image and the image area information. The document image 2501 in FIG. 25 (1) contains the image areas of a title 2502, a body text 2503, and a natural image 2504. To extract image area information from the document image 2501, for example, as shown in the document image 2505 in FIG. 25 (2), the user, while looking at the screen, encloses each image area (2506, 2507, 2508) in a rectangle with a mouse or the like and specifies each attribute from a keyboard or the like.

このようにユーザに指定された各像域の矩形座標位置と属性とから、図２５（３）に示す像域情報２５０９が作成され、像域情報入力手段２４０２に入力される。像域情報としては、属性（タイトル／本文／自然画像）とその座標位置（２５１０、２５１１、２５１２）である。
なお、本実施例では像域情報として座標位置と、タイトル／本文／自然画像の属性を入力しているが、例えば文字属性の像域であれば文字色や文字コード、自然画像であれば風景画像／人物画像などの情報も像域情報として入力することが可能である。 Thus, image area information 2509 shown in FIG. 25 (3) is created from the rectangular coordinate position and attribute of each image area designated by the user and input to the image area information input means 2402. The image area information includes attributes (title / text / natural image) and coordinate positions (2510, 2511, 2512).
In this embodiment, the coordinate position and the title / text / natural image attribute are input as the image area information. However, other information can also be input as image area information: for example, the character color or character code for an image area with a character attribute, or a classification such as landscape image / portrait image for a natural image.

透かし画像形成手段２４０３は、像域情報入力手段２４０２により抽出された像域情報を埋め込んだ透かし画像を形成する。更に、透かし文書合成手段２４０４は、文書画像入力手段２４０１から出力された文書画像と、透かし画像形成手段２４０３から出力された透かし画像を合成し、透かし情報が埋め込まれた画像を形成する。 The watermark image forming unit 2403 forms a watermark image in which the image area information extracted by the image area information input unit 2402 is embedded. Further, the watermark document synthesizing unit 2404 synthesizes the document image output from the document image input unit 2401 and the watermark image output from the watermark image forming unit 2403 to form an image in which watermark information is embedded.

図２６は、特開２００３−１０１７６２号公報に記載の技術を用いた画像形成方法を説明する図である。
文書画像入力手段２４０１から出力された文書画像２６０１に対し、像域情報入力手段２４０２から出力された像域情報を透かし画像形成手段２４０３により透かし画像２６０２に変換し、文書画像２６０１と透かし画像２６０２を透かし文書合成手段２４０４により合成して透かし情報が埋め込まれた文書２６０３を得る。 FIG. 26 is a diagram illustrating an image forming method using the technique described in Japanese Patent Laid-Open No. 2003-101762.
For the document image 2601 output from the document image input means 2401, the watermark image forming means 2403 converts the image area information output from the image area information input means 2402 into a watermark image 2602. The watermark document synthesizing unit 2404 then combines the document image 2601 and the watermark image 2602 to obtain a document 2603 in which the watermark information is embedded.

図２３に戻り、動作２３０２において透かし情報抽出手段２２０２（図２２）が入力された文書画像から透かし画像として埋め込まれている像域情報を抽出する。なお、透かし情報を抽出するための技術としては、例えば、特開２００３−１０１７６２号公報に記載の技術等を用いることができる。 Returning to FIG. 23, in operation 2302, the watermark information extraction unit 2202 (FIG. 22) extracts image area information embedded as a watermark image from the input document image. As a technique for extracting watermark information, for example, a technique described in Japanese Patent Application Laid-Open No. 2003-101762 can be used.

図２７は、入力された文書画像に埋め込まれた透かし画像から抽出した像域情報を示す図である。図２７（１）は入力された文書画像２７０１を示している。この文書画像２７０１は、図２４に示す画像形成装置に図２５（１）の文書画像２５０１を入力し、図２５（３）の像域情報２５０９を埋め込んだ透かし画像を合成して形成したものとする。 FIG. 27 is a diagram showing image area information extracted from a watermark image embedded in an input document image. FIG. 27 (1) shows the input document image 2701. This document image 2701 is assumed to have been formed by inputting the document image 2501 of FIG. 25 (1) to the image forming apparatus shown in FIG. 24 and combining it with a watermark image in which the image area information 2509 of FIG. 25 (3) is embedded.

透かし情報抽出手段２２０２により抽出された、文書画像に埋め込まれた像域情報が図２７（２）の像域情報２７０５である。この像域情報２７０５は、画像形成装置により埋め込まれていた図２５（３）の像域情報２５０９であり、そのまま抽出できる。 Image area information embedded in the document image extracted by the watermark information extraction unit 2202 is image area information 2705 in FIG. This image area information 2705 is the image area information 2509 of FIG. 25 (3) embedded by the image forming apparatus, and can be extracted as it is.

動作２３０３においてＹｅｓの場合、すなわち入力文書画像に像域情報が透かし画像で埋め込まれていた場合には、動作２３０４において、透かし情報抽出手段２２０２により抽出された像域情報２７０５を本実施の形態の画像処理装置の出力とする。 If the result of operation 2303 is Yes, that is, if image area information was embedded in the input document image as a watermark image, then in operation 2304 the image area information 2705 extracted by the watermark information extraction unit 2202 becomes the output of the image processing apparatus of the present embodiment.

一方、動作２３０３においてＮｏの場合、すなわち入力文書画像に透かし画像が埋め込まれていなかったために、透かし情報抽出手段２２０２が像域情報を抽出できなかった場合には、動作２３０５において、画像識別手段２２０３により入力文書画像の像域識別を行い、その像域情報を本実施の形態の画像処理装置の出力とする。なお、画像識別手段としては、例えば、特開２００３−８７５６２号公報に記載の技術を用いる。 On the other hand, if the result of operation 2303 is No, that is, if the watermark information extraction unit 2202 could not extract image area information because no watermark image was embedded in the input document image, then in operation 2305 the image identification unit 2203 identifies the image areas of the input document image, and the resulting image area information becomes the output of the image processing apparatus of the present embodiment. As the image identification means, for example, the technique described in Japanese Patent Laid-Open No. 2003-87562 is used.

図２８は、画像識別手段２２０３により得られた像域情報を示す図である。図２８（１）は、入力文書画像２８０１を示している。画像識別手段２２０３は、入力画像中の文字領域・自然画像領域を識別し、それぞれの領域の属性・座標位置を図２８（２）に示す像域情報２８０５として出力する。 FIG. 28 is a diagram showing image area information obtained by the image identification unit 2203. FIG. 28 (1) shows an input document image 2801. The image identifying means 2203 identifies the character area / natural image area in the input image, and outputs the attribute / coordinate position of each area as image area information 2805 shown in FIG.

第１の実施の形態の画像処理装置によれば、透かし画像が埋め込まれていない画像に対しては従来と同精度の像域情報を得ることが出来、既に像域情報が透かし画像で埋め込まれた文書画像に対しては、画像識別を行わずに像域情報を得ることが可能である。従って、高精度な識別情報を得ることができる。 According to the image processing apparatus of the first embodiment, image area information with the same accuracy as before can be obtained for an image in which no watermark image is embedded, and for a document image in which image area information is already embedded as a watermark image, the image area information can be obtained without performing image identification. Therefore, highly accurate identification information can be obtained.
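The decision flow of operations 2302 to 2305 can be summarized in a few lines. In this sketch the watermark extractor and the image identifier are passed in as callables, and the stub functions merely stand in for the techniques cited in the text; all names here are hypothetical.

```python
from typing import Callable, List, Optional

ImageAreaInfo = List[dict]


def get_image_area_info(
    document_image: object,
    extract_watermark: Callable[[object], Optional[ImageAreaInfo]],
    identify_areas: Callable[[object], ImageAreaInfo],
) -> ImageAreaInfo:
    """Use embedded image area information if present, else identify areas."""
    embedded = extract_watermark(document_image)   # operation 2302
    if embedded is not None:                       # operation 2303: Yes
        return embedded                            # operation 2304
    return identify_areas(document_image)          # operation 2305


# Stubs standing in for the real extractor and identifier.
def stub_extract_found(image: object) -> Optional[ImageAreaInfo]:
    return [{"attribute": "title"}]


def stub_extract_missing(image: object) -> Optional[ImageAreaInfo]:
    return None


def stub_identify(image: object) -> ImageAreaInfo:
    return [{"attribute": "character"}]


with_mark = get_image_area_info("scan", stub_extract_found, stub_identify)
without_mark = get_image_area_info("scan", stub_extract_missing, stub_identify)
```

The fallback path yields only what the identifier can infer from pixels, which is why the embedded information, when present, is preferred.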

次に、第１の実施の形態の画像処理装置を用いて得られる像域情報の利用方法として、高画質な世代コピーを得る装置の構成と動作を図２９を参照しつつ説明する。
上述の画像形成装置を用いて像域情報が透かし画像として埋め込まれた文書画像を紙に出力する。文書画像入力手段２９０１が紙に出力された文書画像を入力する。ここで文書画像入力手段２９０１はＭＦＰや複写機のスキャナ等である。 Next, as a method for using the image area information obtained by using the image processing apparatus according to the first embodiment, the configuration and operation of an apparatus for obtaining a high-quality generation copy will be described with reference to FIG.
A document image in which image area information is embedded as a watermark image is output to paper using the image forming apparatus described above. A document image input unit 2901 inputs a document image output on paper. Here, the document image input means 2901 is an MFP or a scanner of a copying machine.

次に、第１の実施の形態の画像処理装置２９０２を用いて埋め込まれた像域情報を抽出する。続いて画像処理手段２９０３が、入力された文書画像に対して、抽出された像域情報を用いて像域ごとに適した画像処理を施す。 Next, the embedded image area information is extracted using the image processing apparatus 2902 of the first embodiment. Subsequently, the image processing unit 2903 performs image processing suitable for each image area on the input document image using the extracted image area information.

図３０は、画像処理手段２９０３により施される画像処理例を示す図である。画像処理の例としては、フィルタ処理や階調処理がある。例えば、自然画像に対しては画像出力時の網点との干渉によるモアレを防ぐために入力画像の網点を潰すためのぼかしフィルタ処理を施す。文字画像に対しては文字のエッジを強調するためにエッジ強調フィルタを施す。また、階調処理においても、自然画像は解像性よりも階調性を重視するため低線数の階調処理を施し、文字画像は解像性を重視するために高線数の階調処理を施す。 FIG. 30 is a diagram illustrating an example of image processing performed by the image processing unit 2903. Examples of image processing include filter processing and gradation processing. For example, a natural image is subjected to a blur filter process for crushing halftone dots of an input image in order to prevent moire due to interference with the halftone dots during image output. An edge enhancement filter is applied to the character image to enhance the edge of the character. Also, in gradation processing, natural images are subjected to gradation processing with a low number of lines to emphasize gradation rather than resolution, and character images are subjected to gradations with a high number of lines to emphasize resolution. Apply processing.
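The per-area filter choice can be illustrated with a small pure-Python sketch. The 3x3 kernels below are common textbook choices picked for illustration; the text does not specify filter coefficients, and the function names are hypothetical. Gradation (screening) processing is not modelled here.

```python
from typing import List, Tuple

Image = List[List[int]]                   # grayscale rows, values 0..255
Kernel = Tuple[List[List[int]], int]      # (3x3 integer weights, divisor)

# Illustrative kernels only; not taken from the text.
BLUR: Kernel = ([[1, 1, 1], [1, 1, 1], [1, 1, 1]], 9)
EDGE_ENHANCE: Kernel = ([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], 1)


def apply_kernel3x3(image: Image, kernel: Kernel) -> Image:
    """Filter interior pixels with a 3x3 kernel; border pixels are copied."""
    weights, divisor = kernel
    out = [row[:] for row in image]
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            acc = sum(image[r + dr][c + dc] * weights[dr + 1][dc + 1]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1))
            out[r][c] = min(255, max(0, round(acc / divisor)))
    return out


def process_area(pixels: Image, attribute: str) -> Image:
    """Blur natural images; edge-enhance every other (text-like) area."""
    kernel = BLUR if attribute == "natural image" else EDGE_ENHANCE
    return apply_kernel3x3(pixels, kernel)


# A single bright pixel: blurring spreads it out, enhancement saturates it.
spike = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]
blurred = process_area(spike, "natural image")
sharpened = process_area(spike, "character")
```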

画像処理手段２９０３による画像処理の後、画像出力手段２９０４が画像を出力する。画像出力手段２９０４としては、ＭＦＰや複写機のプリント部がある。 After the image processing by the image processing unit 2903, the image output unit 2904 outputs an image. As the image output means 2904, there is a print section of an MFP or a copier.

以上述べた通り、劣化しているコピー画像に対して像域識別するのではなく、透かしとして埋め込まれた情報から像域情報を得ることで、正確な像域識別結果を使用することが出来る。また、第１の実施の形態の画像処理装置から得られた像域情報を、再度、図２４に示す画像形成装置に入力し直すことで、何世代コピーしても高精度な像域識別結果を得ることが可能となる。 As described above, an accurate image area identification result can be used by obtaining image area information from the information embedded as a watermark, rather than by performing image area identification on a degraded copy image. Furthermore, by feeding the image area information obtained from the image processing apparatus of the first embodiment back into the image forming apparatus shown in FIG. 24, a highly accurate image area identification result can be obtained no matter how many generations of copies are made.

更に、第１の実施の形態の画像処理装置を用いて得られた像域情報の他の利用方法として、高圧縮な電子ファイルを得る装置の構成と動作を図３１を参照しつつ説明する。
文書画像入力手段３１０１が像域情報が埋め込まれた文書画像を入力し、第１の実施の形態の画像処理装置３１０２が埋め込まれた像域情報を抽出する。次に画像分割手段３１０３が抽出された像域情報を用いて入力された文書画像を文字画像と自然画像に分割する。 Furthermore, as another use of the image area information obtained with the image processing apparatus of the first embodiment, the configuration and operation of an apparatus for obtaining a highly compressed electronic file will be described with reference to FIG. 31.
The document image input means 3101 inputs a document image in which image area information is embedded, and the image processing apparatus 3102 of the first embodiment extracts the embedded image area information. Next, the image dividing means 3103 uses the extracted image area information to divide the input document image into a character image and a natural image.
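The division step can be sketched as follows. This is only an illustration under stated assumptions: the `Region` type, the attribute strings, the white background fill, and the use of row/column pixel indices (rather than the lower-left origin used elsewhere in this description) are all introduced here and do not come from the original.

```python
from dataclasses import dataclass
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows, values in 0..255


@dataclass(frozen=True)
class Region:
    """Hypothetical area record: inclusive pixel bounds plus an attribute."""
    x1: int
    y1: int
    x2: int
    y2: int
    attribute: str  # "character" or "natural"


def split_layers(img: Image, regions: List[Region],
                 background: int = 255) -> Tuple[Image, Image]:
    """Copy each area into the character layer or the natural-image layer."""
    h, w = len(img), len(img[0])
    text = [[background] * w for _ in range(h)]
    natural = [[background] * w for _ in range(h)]
    for r in regions:
        target = text if r.attribute == "character" else natural
        for y in range(max(r.y1, 0), min(r.y2, h - 1) + 1):
            for x in range(max(r.x1, 0), min(r.x2, w - 1) + 1):
                target[y][x] = img[y][x]
    return text, natural
```

Each layer can then be handed to a compressor suited to its content, with pixels outside any area left at the background value.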

分割された文字画像と自然画像をそれぞれに適した方法で画像圧縮することで、高画質・高圧縮を実現する。例えば、文字画像を圧縮するための第１の圧縮手段３１０４は、２値画像しか扱えないが劣化しない（可逆である）ＭＭＲ圧縮を用いる。また、自然画像を圧縮するための第２の圧縮手段３１０５は、画像の高周波成分は失われるが、階調性のある画像に適したＪＰＥＧ圧縮を用いる。 High image quality and high compression are achieved by compressing the divided character image and natural image with a method suited to each. For example, the first compression means 3104, which compresses the character image, uses MMR compression, which handles only binary images but is lossless (reversible). The second compression means 3105, which compresses the natural image, uses JPEG compression, which loses the high-frequency components of the image but is well suited to images with gradation.

最後に画像結合手段３１０６が、前記圧縮文字画像と前記圧縮自然画像を結合する。 Finally, an image combining unit 3106 combines the compressed character image and the compressed natural image.

以上述べたように像域識別を行わず、埋め込まれた像域情報を用いると高精度な像域情報を得られるため高画質で高圧縮な画像ファイルを得ることが可能となる。 As described above, if the embedded image area information is used without performing image area identification, high-accuracy image area information can be obtained, so that an image file with high image quality and high compression can be obtained.

[第２の実施の形態]
図３２は、第２の実施の形態の画像処理装置の構成を示すブロック図である。画像処理装置は、文書画像入力手段３２０１、透かし情報抽出手段３２０２、画像識別手段３２０３、像域情報合成手段３２０４を備えている。 [Second Embodiment]
FIG. 32 is a block diagram illustrating a configuration of an image processing apparatus according to the second embodiment. The image processing apparatus includes document image input means 3201, watermark information extraction means 3202, image identification means 3203, and image area information synthesis means 3204.

第２の実施の形態の画像処理装置は、入力文書画像の透かし情報の有無に関わらず画像識別を行うことと、像域情報合成手段３２０４を有する点で第１の実施の形態と異なっている。
一般に透かし画像に埋め込むことのできる情報量には限りがあり、必要な像域情報が全て透かし画像に埋め込まれていない場合もある。そこで、第２の実施の形態の画像処理装置は透かし画像の有無に関わらず画像識別を行い、透かし画像から得られた第１の像域情報と画像識別により生成した第２の像域情報を、像域情報合成手段３２０４により合成し出力する。 The image processing apparatus of the second embodiment differs from the first embodiment in that it performs image identification regardless of whether the input document image contains watermark information, and in that it includes the image area information synthesizing means 3204.
In general, the amount of information that can be embedded in a watermark image is limited, so not all of the necessary image area information may be embedded in the watermark image. The image processing apparatus of the second embodiment therefore performs image identification regardless of whether a watermark image is present, and the image area information synthesizing means 3204 combines the first image area information obtained from the watermark image with the second image area information generated by image identification and outputs the result.

第２の実施の形態の画像処理装置をどのような目的に使用するかによって、像域情報合成手段３２０４による最適な合成方法が異なる。
例えば、全文検索用のＯＣＲ処理の前処理として第２の実施の形態の画像処理装置を使用する場合では、文字領域を過抽出することよりも、文字領域の抽出漏れを発生させることが不具合としての度合が大きいと考えられる。そこで、像域情報合成手段３２０４は、第１の像域情報と第２の像域情報に含まれる全ての領域を出力するように動作する。 The optimum combining method by the image area information combining unit 3204 differs depending on the purpose of using the image processing apparatus of the second embodiment.
For example, when the image processing apparatus of the second embodiment is used as preprocessing for OCR in full-text search, failing to extract a character area is considered a more serious defect than over-extracting character areas. The image area information synthesizing means 3204 therefore operates so as to output all areas included in the first image area information and the second image area information.

また、高圧縮な電子ファイルを作成するために第２の実施の形態の画像処理装置を使用する場合、文字領域には２値化処理を行うので、自然画像を“文字領域”と誤って識別すると、文字抽出漏れよりも画質不具合による影響が大きくなる。そのため、像域情報合成手段３２０４は、信頼度の低い領域情報は破棄するように動作する。 When the image processing apparatus of the second embodiment is used to create a highly compressed electronic file, character areas are binarized. Misidentifying a natural image as a “character area” therefore causes an image quality defect whose impact is greater than that of a missed character area. For this reason, the image area information synthesizing means 3204 operates so as to discard area information with low reliability.

更に、文字画像を優先したコピア用の識別信号生成に第２の実施の形態の画像処理装置を使用する際には、まず領域毎に文字用の画像処理／自然画像用の画像処理とを切り替える必要がある。そのため領域に重複が許されず、かつ、自然画像領域を文字画像と間違うことよりも文字領域を自然画像領域と間違うことの方が画質不具合の度合が大きい。従って、像域情報合成手段３２０４は、領域に重複があり、かつ、その属性が異なった場合には文字属性の領域を優先するように動作する。 Furthermore, when the image processing apparatus of the second embodiment is used to generate an identification signal for a copier that prioritizes character images, the copier must switch between character image processing and natural image processing for each area. Areas therefore must not overlap, and mistaking a character area for a natural image area causes a greater image quality defect than mistaking a natural image area for a character image. Accordingly, when areas overlap and their attributes differ, the image area information synthesizing means 3204 operates so as to give priority to the area with the character attribute.

図３３は、第２の実施の形態の画像処理装置をＯＣＲ前処理として使用する際の像域情報合成手段３２０４の動作例を示す図である。図３３（１）は、入力文書画像３３０１を示している。図３３（２）は、透かし画像から抽出された第１の像域情報３３０５を示している。図３３（３）は、画像識別手段３３０３により生成された第２の像域情報３３０８を示している。図３３（４）は、第１の像域情報３３０５と第２の像域情報３３０８を合成して生成された第３の像域情報３３１１を示している。 FIG. 33 is a diagram illustrating an operation example of the image area information synthesizing means 3204 when the image processing apparatus of the second embodiment is used as OCR preprocessing. FIG. 33 (1) shows an input document image 3301. FIG. 33 (2) shows the first image area information 3305 extracted from the watermark image. FIG. 33 (3) shows the second image area information 3308 generated by the image identification means 3303. FIG. 33 (4) shows the third image area information 3311 generated by combining the first image area information 3305 and the second image area information 3308.

図３３に示した例では、像域情報合成手段３２０４は、第１の像域情報３３０５と第２の像域情報３３０８を単純に合成している。すなわち、第１の像域情報３３０５中の第１領域３３０６と第２領域３３０７が第３の像域情報３３１１の第１領域３３１２、第２領域３３１３となり、第２の像域情報３３０８中の第１領域３３０９と第２領域３３１０が第３の像域情報３３１１の第３領域３３１４、第４領域３３１５となっている。 In the example shown in FIG. 33, the image area information synthesizing means 3204 simply combines the first image area information 3305 and the second image area information 3308. That is, the first area 3306 and the second area 3307 in the first image area information 3305 become the first area 3312 and the second area 3313 of the third image area information 3311, and the first area 3309 and the second area 3310 in the second image area information 3308 become the third area 3314 and the fourth area 3315 of the third image area information 3311.
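The simple combination used for OCR preprocessing amounts to concatenating the two lists and renumbering the areas. The sketch below assumes a hypothetical `Region` record; its field names and the coordinate values in the test data are illustrative only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Region:
    """Hypothetical area record: lower-left and upper-right corners, attribute."""
    x1: float
    y1: float
    x2: float
    y2: float
    attribute: str  # "character" or "natural"


def merge_union(first: List[Region],
                second: List[Region]) -> List[Tuple[int, Region]]:
    """Keep every area: watermark areas first, then identified areas.

    Returns (area number, region) pairs numbered from 1, mirroring how the
    first to fourth areas of the third image area information are assigned.
    """
    combined = list(first) + list(second)
    return [(i + 1, r) for i, r in enumerate(combined)]
```

No overlap check is made, so an area present in both inputs appears twice, which is acceptable when a missed character area is the costlier error.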

図３４は、第２の実施の形態の画像処理装置を高圧縮ファイル作成のために使用する際の像域情報合成手段３２０４の動作例を示す図である。図３４（１）は、入力文書画像３４０１を示している。図３４（２）は、透かし画像から抽出された第１の像域情報３４０５を示している。図３４（３）は、画像識別手段３２０３により生成された第２の像域情報３４０８を示している。図３４（４）は、第１の像域情報３４０５と第２の像域情報３４０８を合成して生成された第３の像域情報３４１１を示している。 FIG. 34 is a diagram illustrating an operation example of the image area information combining unit 3204 when the image processing apparatus according to the second embodiment is used for creating a highly compressed file. FIG. 34 (1) shows an input document image 3401. FIG. 34 (2) shows the first image area information 3405 extracted from the watermark image. FIG. 34 (3) shows the second image area information 3408 generated by the image identification means 3203. FIG. 34 (4) shows third image area information 3411 generated by combining the first image area information 3405 and the second image area information 3408.

図３４に示した例では、第１の像域情報３４０５中の領域と第２の像域情報３４０８中の領域で座標位置に重なりがある場合は第１の像域情報３４０５中の領域を採用し、第２の像域情報３４０８中の領域を破棄している。例えば、第１の像域情報３４０５中の第１領域３４０６と第２像域情報３４０８中の第１領域３４０９に重なりがある。従って、第２像域情報３４０８中の第１領域３４０９を破棄して合成した第３の像域情報３４１１が出力される。 In the example shown in FIG. 34, when an area in the first image area information 3405 and an area in the second image area information 3408 overlap in coordinate position, the area in the first image area information 3405 is adopted and the area in the second image area information 3408 is discarded. For example, the first area 3406 in the first image area information 3405 overlaps the first area 3409 in the second image area information 3408. Accordingly, the first area 3409 in the second image area information 3408 is discarded, and the resulting third image area information 3411 is output.

なお、２つの領域、領域Ａと領域Ｂの重なりの有無は、例えば下式に当てはまるか否かで判定する。
（ｙＡ１＜ｙＢ１＜ｙＡ２またはｙＡ１＜ｙＢ２＜ｙＡ２）かつ
（ｘＡ１＜ｘＢ１＜ｘＡ２またはｘＡ１＜ｘＢ２＜ｘＡ２）
但し、領域Ａの左下座標−右上座標の組を（ｘＡ１，ｙＡ１）−（ｘＡ２，ｙＡ２）、領域Ｂの左下座標−右上座標の組を（ｘＢ１，ｙＢ１）−（ｘＢ２，ｙＢ２）とする。 Whether two areas, area A and area B, overlap is determined, for example, by whether the following condition holds:
(yA1 < yB1 < yA2 or yA1 < yB2 < yA2) and
(xA1 < xB1 < xA2 or xA1 < xB2 < xA2)
where the lower-left and upper-right coordinates of area A are (xA1, yA1) and (xA2, yA2), and those of area B are (xB1, yB1) and (xB2, yB2).
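The overlap condition and the watermark-priority rule of FIG. 34 can be sketched together as follows. The `Region` type and function names are assumptions made for illustration. Note that the condition, taken literally, checks whether an edge of area B falls strictly inside the extent of area A on each axis, so it reports no overlap when B encloses A or when the two rectangles are identical; `overlaps_symmetric` is the common rectangle-intersection test, offered here as an alternative and not taken from the original.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Region:
    """Hypothetical area record: lower-left (x1, y1), upper-right (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float


def overlaps_as_written(a: Region, b: Region) -> bool:
    """The example condition from the text, transcribed term by term."""
    y_hit = a.y1 < b.y1 < a.y2 or a.y1 < b.y2 < a.y2
    x_hit = a.x1 < b.x1 < a.x2 or a.x1 < b.x2 < a.x2
    return y_hit and x_hit


def overlaps_symmetric(a: Region, b: Region) -> bool:
    """Standard test: the rectangles share interior area (not from the text)."""
    return a.x1 < b.x2 and b.x1 < a.x2 and a.y1 < b.y2 and b.y1 < a.y2


def merge_watermark_priority(
    first: List[Region],
    second: List[Region],
    overlaps: Callable[[Region, Region], bool] = overlaps_as_written,
) -> List[Region]:
    """Keep all watermark areas; drop identified areas that overlap any of them."""
    kept = [s for s in second if not any(overlaps(f, s) for f in first)]
    return list(first) + kept
```

Passing `overlaps_symmetric` as the third argument covers the enclosing and identical cases if those can occur in the input.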

図３５は、第２の実施の形態の画像処理装置をコピア用識別信号生成に使用する際の像域情報合成手段３２０４の動作例を示す図である。図３５（１）は、入力文書画像３５０１を示している。図３５（２）は、透かし画像から抽出された第１の像域情報３５０５を示している。図３５（３）は、画像識別手段３２０３により生成された第２の像域情報３５０８を示している。図３５（４）は、第１の像域情報３５０５と第２の像域情報３５０８を合成して生成された第３の像域情報３５１１を示している。 FIG. 35 is a diagram illustrating an operation example of the image area information combining unit 3204 when the image processing apparatus according to the second embodiment is used for generating a copier identification signal. FIG. 35 (1) shows an input document image 3501. FIG. 35 (2) shows the first image area information 3505 extracted from the watermark image. FIG. 35 (3) shows the second image area information 3508 generated by the image identification means 3203. FIG. 35 (4) shows third image area information 3511 generated by combining the first image area information 3505 and the second image area information 3508.

図３５に示した例では、第１の像域情報３５０５中の第１領域３５０６（文字列“ＴＩＴＬＥ”）と第２の像域情報３５０８中の第１領域３５０９（文字列“ＴＩＴＬＥ”）で座標位置に重なりがある。そこで、その属性が“文字”である第２の像域情報３５０８中の領域３５０９を採用し、その属性が“自然画像”である第１の像域情報３５０５中の領域３５０６を破棄する。すなわち、第１の像域情報３５０５中の第２領域３５０７と第２の像域情報３５０８中の第１領域３５０９、第２領域３５１０を採用して第３の像域情報３５１１として出力する。 In the example shown in FIG. 35, the first area 3506 (character string “TITLE”) in the first image area information 3505 and the first area 3509 (character string “TITLE”) in the second image area information 3508 are used. There are overlapping coordinate positions. Therefore, the area 3509 in the second image area information 3508 whose attribute is “character” is adopted, and the area 3506 in the first image area information 3505 whose attribute is “natural image” is discarded. That is, the second area 3507 in the first image area information 3505 and the first area 3509 and the second area 3510 in the second image area information 3508 are adopted and output as the third image area information 3511.
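The character-priority rule can be sketched as below. The `Region` type, the `PRIORITY` table, the symmetric overlap test, and the coordinates in the usage data are assumptions for illustration; overlapping areas that share the same attribute are left untouched because the text does not say how to resolve them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    """Hypothetical area record: lower-left (x1, y1), upper-right (x2, y2)."""
    name: str
    x1: float
    y1: float
    x2: float
    y2: float
    attribute: str  # "character" or "natural"


# Assumed importance per attribute; a larger value wins on conflict.
PRIORITY = {"character": 1, "natural": 0}


def overlaps(a: Region, b: Region) -> bool:
    """Standard rectangle-intersection test (an assumption, see lead-in)."""
    return a.x1 < b.x2 and b.x1 < a.x2 and a.y1 < b.y2 and b.y1 < a.y2


def merge_attribute_priority(first: List[Region],
                             second: List[Region]) -> List[Region]:
    """On overlap with differing attributes, drop the lower-priority area."""
    drop_first, drop_second = set(), set()
    for i, f in enumerate(first):
        for j, s in enumerate(second):
            if overlaps(f, s) and f.attribute != s.attribute:
                if PRIORITY[f.attribute] < PRIORITY[s.attribute]:
                    drop_first.add(i)
                else:
                    drop_second.add(j)
    kept_first = [f for i, f in enumerate(first) if i not in drop_first]
    kept_second = [s for j, s in enumerate(second) if j not in drop_second]
    return kept_first + kept_second
```

With areas named after the reference numerals of FIG. 35 (coordinates invented), the watermark's "natural" label for the title is replaced by the identified "character" area, and the remaining areas pass through unchanged.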

なお、像域情報合成手段３２０４の動作として３例を説明したが、この例に限られるものではない。より高精度な像域情報を得るために第１の像域情報と第２の像域情報の両方に存在する領域のみを採用する等、他の動作例も考えられる。 Although three examples have been described as the operation of the image area information combining unit 3204, the present invention is not limited to this example. Other operation examples are also conceivable, such as adopting only the areas existing in both the first image area information and the second image area information in order to obtain more accurate image area information.

また、上述の第３の像域情報中に領域を記述する順序は、第１の像域情報中の領域の後に第２の像域情報中の領域を記述する単純なものである。しかし、第２の実施の形態の画像処理装置の出力をコピア用識別信号に使用する場合、コピアの画像処理は副走査の順方向にしか処理できない。従って、第３の像域情報中の領域も副走査の順方向に並べ替える必要がある。例えば、図３５（４）の第３の像域情報３５１１の例では、上を副走査の先頭とすると、第２領域、第３領域、第１領域の順に並び替える。 Further, the order of describing the areas in the third image area information is simple as describing the areas in the second image area information after the areas in the first image area information. However, when the output of the image processing apparatus of the second embodiment is used as a copier identification signal, copier image processing can be performed only in the forward direction of sub-scanning. Therefore, it is necessary to rearrange the areas in the third image area information in the sub-scanning forward direction. For example, in the example of the third image area information 3511 in FIG. 35 (4), if the top is the head of the sub-scan, the second area, the third area, and the first area are rearranged in this order.
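The reordering for sub-scanning can be sketched as a sort on the top edge of each area. The sketch assumes a lower-left coordinate origin (so the top of the page has the largest y) and that the top of the page is scanned first; the `Region` type and the coordinates in the test data are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    """Hypothetical area record: lower-left (x1, y1), upper-right (x2, y2)."""
    name: str
    x1: float
    y1: float
    x2: float
    y2: float


def sort_for_sub_scan(regions: List[Region]) -> List[Region]:
    """Order areas from the top of the page downward.

    With a lower-left origin, the area whose upper edge (y2) is largest is
    reached first in the forward sub-scanning direction. sorted() is stable,
    so areas with equal y2 keep their original relative order.
    """
    return sorted(regions, key=lambda r: -r.y2)
```

Applied to three areas laid out like the example (first area at the bottom, second at the top, third in the middle), the result is the second, third, first order given in the text.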

以上説明した第２の実施の形態によれば、第１の実施の形態と同様に、透かし画像が埋め込まれていない画像に対しては従来と同精度の像域情報を得ることが出来、既に像域情報が透かし画像で埋め込まれた文書画像に対しては、透かし画像中の像域情報では不足している像域情報を画像識別手段で生成した像域情報で補うことにより高精度な識別情報を得ることが可能である。 According to the second embodiment described above, as in the first embodiment, image area information with the same accuracy as in the past can be obtained for an image in which a watermark image is not embedded. For document images in which image area information is embedded with a watermark image, the image area information that is lacking in the image area information in the watermark image is supplemented with the image area information generated by the image identification means, thereby enabling high-precision identification. Information can be obtained.

[第３の実施の形態]
図３６は、第３の実施の形態の画像処理装置の構成を示すブロック図である。画像処理装置は、文書画像入力手段３６０１、透かし情報抽出手段３６０２、画像識別手段３６０３、像域情報合成手段３６０４を備えている。 [Third embodiment]
FIG. 36 is a block diagram illustrating a configuration of an image processing apparatus according to the third embodiment. The image processing apparatus includes a document image input unit 3601, a watermark information extraction unit 3602, an image identification unit 3603, and an image area information synthesis unit 3604.

第３の実施の形態の画像処理装置は、画像識別手段３６０３が透かし情報から抽出した第１の像域情報中に記載されていない領域のみに対して画像識別を行う点で第２の実施の形態と異なっている。 The image processing apparatus of the third embodiment differs from the second embodiment in that the image identification means 3603 performs image identification only on areas not described in the first image area information extracted from the watermark information.

図３７は、第３の実施の形態の画像処理装置により得られる像域情報の例を示す図である。図３７（１）は、入力文書画像３７０１を示している。図３７（２）は、透かし画像から抽出された第１の像域情報３７０５を示している。図３７（３）は、画像識別手段３６０３により生成された第２の像域情報３７０８を示している。図３７（４）は、第１の像域情報３７０５と第２の像域情報３７０８を合成して生成された第３の像域情報３７１０を示している。 FIG. 37 is a diagram illustrating an example of image area information obtained by the image processing apparatus of the third embodiment. FIG. 37 (1) shows an input document image 3701. FIG. 37 (2) shows the first image area information 3705 extracted from the watermark image. FIG. 37 (3) shows the second image area information 3708 generated by the image identification means 3603. FIG. 37 (4) shows the third image area information 3710 generated by combining the first image area information 3705 and the second image area information 3708.

透かし情報抽出手段３６０２が抽出した結果である第１の像域情報３７０５には第１領域３７０６として“ＴＩＴＬＥ”という文字領域と、第２領域３７０７としてＭＦＰが撮影されている自然画像が抽出されている。しかし、透かし情報抽出手段３６０２は、入力文書画像３７０１中の文字列“ＡＢＣ”を抽出していない。 The first image area information 3705, which is the result extracted by the watermark information extraction means 3602, contains the character area “TITLE” as the first area 3706 and a natural image showing an MFP as the second area 3707. However, the watermark information extraction means 3602 has not extracted the character string “ABC” in the input document image 3701.

画像識別手段３６０３は、第１の像域情報３７０５中の第１領域３７０６・第２領域３７０７で表される領域以外の箇所に対して識別処理を実行する。そして、文字列“ＡＢＣ”を第１領域３７０９として含む第２の像域情報３７０８を生成する。像域情報合成手段３６０４は、得られた第１の像域情報３７０５と第２の像域情報３７０８を合成し、第３の像域情報３７１０を生成する。 The image identification unit 3603 executes identification processing on a portion other than the regions represented by the first region 3706 and the second region 3707 in the first image region information 3705. Then, second image area information 3708 including the character string “ABC” as the first area 3709 is generated. The image area information combining unit 3604 combines the obtained first image area information 3705 and second image area information 3708 to generate third image area information 3710.
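Restricting identification to the uncovered part of the page can be sketched with a boolean mask. The `Region` type, the use of inclusive row/column pixel indices, and the function name are assumptions for illustration; the identification step itself is not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    """Hypothetical area record with inclusive pixel bounds."""
    x1: int
    y1: int
    x2: int
    y2: int


def uncovered_mask(width: int, height: int,
                   first_regions: List[Region]) -> List[List[bool]]:
    """Return True for pixels not covered by any watermark-derived area.

    Only pixels marked True would be passed to image identification.
    """
    mask = [[True] * width for _ in range(height)]
    for r in first_regions:
        for y in range(max(r.y1, 0), min(r.y2, height - 1) + 1):
            for x in range(max(r.x1, 0), min(r.x2, width - 1) + 1):
                mask[y][x] = False
    return mask
```

The fewer pixels remain True, the less work is left for identification, which is the source of the speed advantage over processing the whole page.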

以上説明した第３の実施の形態によれば、第２の実施の形態と同様に、透かし画像が埋め込まれていない画像に対しては従来と同精度の像域情報を得ることが出来、既に像域情報が透かし画像で埋め込まれた文書画像に対しては、透かし画像中の像域情報では不足している像域情報を画像識別手段で生成した像域情報で補うことにより高精度な識別情報を得ることが可能である。更に透かし画像から得られた像域情報中に領域が無かった箇所のみを識別するので第２の実施の形態と比較して高速に処理することが可能となる。 According to the third embodiment described above, as in the second embodiment, image area information with the same accuracy as the conventional technique can be obtained for an image in which a watermark image is not embedded. For document images in which image area information is embedded with a watermark image, the image area information that is lacking in the image area information in the watermark image is supplemented with the image area information generated by the image identification means, thereby enabling high-precision identification. Information can be obtained. Furthermore, since only a portion where there is no area in the image area information obtained from the watermark image is identified, it is possible to perform processing at a higher speed than in the second embodiment.

[第４の実施の形態]
図３８は、第４の実施の形態の画像処理装置の構成を示すブロック図である。画像処理装置は、文書画像入力手段３８０１、透かし情報抽出手段３８０２、画像識別手段３８０３、像域情報合成手段３８０４を備えている。 [Fourth embodiment]
FIG. 38 is a block diagram illustrating a configuration of an image processing apparatus according to the fourth embodiment. The image processing apparatus includes document image input means 3801, watermark information extraction means 3802, image identification means 3803, and image area information synthesis means 3804.

第４の実施の形態の画像処理装置は、透かし情報から抽出した第１の像域情報中に存在する領域のみを対象として画像識別手段３８０３が識別処理を行い、第１の像域情報に不足している情報を補う点で第３の実施の形態と異なっている。 The image processing apparatus of the fourth embodiment differs from the third embodiment in that the image identification means 3803 performs identification processing only on the areas present in the first image area information extracted from the watermark information, thereby supplementing information that is missing from the first image area information.

図３９は、第４の実施の形態の画像処理装置により得られる像域情報の例を示す図である。図３９（１）は、入力文書画像３９０１を示している。図３９（２）は、透かし画像から抽出された第１の像域情報３９０５を示している。図３９（３）は、画像識別手段３８０３により生成された第２の像域情報３９０９を示している。図３９（４）は、第１の像域情報３９０５と第２の像域情報３９０９を合成して生成された第３の像域情報３９１２を示している。 FIG. 39 is a diagram illustrating an example of image area information obtained by the image processing apparatus according to the fourth embodiment. FIG. 39 (1) shows an input document image 3901. FIG. 39 (2) shows the first image area information 3905 extracted from the watermark image. FIG. 39 (3) shows the second image area information 3909 generated by the image identification means 3803. FIG. 39 (4) shows third image area information 3912 generated by combining the first image area information 3905 and the second image area information 3909.

透かし情報抽出手段３８０２が抽出した結果である第１の像域情報３９０５には第１領域３９０６、第２領域３９０７、第３領域３９０８の３つの領域が含まれており、その領域を表す情報として座標位置と“文字”／“自然画像”という属性情報が存在する。例えば、第４の実施の形態の画像処理装置から出力する像域情報に文字色情報が必要な場合、画像識別手段３８０３は第１像域３９０５中の文字領域である第１領域３９０６・第２領域３９０７に対してのみ識別処理を行い、文字色情報を抽出して含む第２の像域情報３９０９を生成する。 The first image area information 3905, which is the result extracted by the watermark information extraction means 3802, contains three areas, a first area 3906, a second area 3907, and a third area 3908, each described by its coordinate position and a “character” / “natural image” attribute. For example, when the image area information output from the image processing apparatus of the fourth embodiment must include character color information, the image identification means 3803 performs identification processing only on the first area 3906 and the second area 3907, which are the character areas in the first image area information 3905, and generates second image area information 3909 containing the extracted character color information.

文字色を抽出する方法として、例えば次のような処理を行う。
１：文字領域をＧｒａｙｓｃａｌｅ化する。単純に（Ｒ＋Ｇ＋Ｂ）／３等で求めても良いし、輝度変換によりも求めても良い。
２：２値化により白黒化する。２値化の閾値は適当な固定閾値を用いてもよいし、Ｇｒａｙｓｃａｌｅ画像のヒストグラムから算出した適応的な値を用いても良い。
３：２値画像の白画素数と黒画素数を数える。数の少なかった色の画素を文字画素とする。 As a method for extracting the character color, for example, the following processing is performed.
1: Convert the character area to grayscale. The gray value may be computed simply as (R + G + B) / 3, or by a luminance conversion.
2: Binarize the grayscale image into black and white. The binarization threshold may be a suitable fixed value, or an adaptive value calculated from the histogram of the grayscale image.
3: Count the white pixels and the black pixels in the binary image. The pixels of the less numerous color are taken to be character pixels.

４：原画像中の文字画素と判断された画素のＲＧＢ値の平均値を求める。この平均値を文字色とする。 4: An average value of RGB values of pixels determined to be character pixels in the original image is obtained. This average value is used as the character color.
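The four steps above can be sketched as follows, using the simple (R + G + B) / 3 grayscale and a fixed threshold. The function name, the default threshold of 128, the tie-breaking toward dark pixels, and the `None` return for an area with only one class of pixel are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]


def extract_text_color(pixels: List[RGB],
                       threshold: float = 128.0
                       ) -> Optional[Tuple[float, float, float]]:
    """Estimate the character color of one character area.

    pixels: the RGB values of the area, in any order.
    Returns the mean RGB of the minority class, or None if that class is empty.
    """
    # Step 1: grayscale by simple averaging.
    gray = [(r + g + b) / 3 for r, g, b in pixels]
    # Step 2: binarize with a fixed threshold (dark = "black", else "white").
    dark = [p for p, g in zip(pixels, gray) if g < threshold]
    light = [p for p, g in zip(pixels, gray) if g >= threshold]
    # Step 3: the less numerous class is taken to be the character pixels
    # (on a tie this sketch picks the dark class).
    text = dark if len(dark) <= len(light) else light
    if not text:
        return None
    # Step 4: average the original RGB values of the character pixels.
    n = len(text)
    return tuple(sum(channel) / n for channel in zip(*text))
```

For an area that is mostly white with a few dark red strokes, the result is the mean of the stroke pixels, so anti-aliased edges pull the estimate slightly toward the background.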

像域情報合成手段３８０４は、得られた第１の像域情報３９０５と第２の像域情報３９０９を合成し、第３の像域情報３９１２を生成する。 The image area information combining unit 3804 combines the obtained first image area information 3905 and second image area information 3909 to generate third image area information 3912.

なお、第４の実施の形態においては、第１の像域情報中に存在する領域にのみ画像識別を行う処理について説明してきた。しかし、例えば第３の実施の形態に説明した技術と組み合わせて、第１の像域情報中に存在する領域については文字色識別のみを行い、第１の像域情報に存在しない領域に付いては、領域の座標位置・属性・属性が文字であった場合には文字色の識別を行うという処理も考えられる。 The fourth embodiment has described processing in which image identification is performed only on areas present in the first image area information. However, it may also be combined with, for example, the technique described in the third embodiment: for areas present in the first image area information, only character color identification is performed, while for areas not present in the first image area information, the coordinate position and attribute of each area are identified and, when the attribute is character, the character color is identified as well.

以上説明した第４の実施の形態によれば、第３の実施の形態と同様に、透かし画像が埋め込まれていない画像に対しては従来と同精度の像域情報を得ることが出来、既に像域情報が透かし画像で埋め込まれた文書画像に対しては、透かし画像中の像域情報では不足している像域情報を画像識別手段で生成した像域情報で補うことにより高精度な識別情報を得ることが可能である。 According to the fourth embodiment described above, as in the third embodiment, image area information with the same accuracy as the conventional technique can be obtained for an image in which a watermark image is not embedded. For document images in which image area information is embedded with a watermark image, the image area information that is lacking in the image area information in the watermark image is supplemented with the image area information generated by the image identification means, thereby enabling high-precision identification. Information can be obtained.

なお、第１乃至第４の実施の形態に係る画像処理装置では、図２４の画像形成装置で形成した透かし画像が含まれた画像を入力したが、この形態に限られず、第１乃至第７の実施の形態の画像形成装置により形成された画像を用いても良い。 In the image processing apparatuses according to the first to fourth embodiments, an image containing a watermark image formed by the image forming apparatus of FIG. 24 is input. However, the input is not limited to this form, and an image formed by the image forming apparatus of any of the first to seventh embodiments may be used.

なお、以上説明した第１乃至第４の実施の形態に係る画像処理装置は、次のように表すことができる。 The image processing apparatuses according to the first to fourth embodiments described above can be expressed as follows.

[付記１]（透かしから像域情報を抽出する）
文書画像を入力する文書画像入力手段と、
前記入力された文書画像に埋め込まれた透かし画像から像域情報を抽出する透かし情報抽出手段、
とを有することを特徴とする画像処理装置。 [Appendix 1] (Extract image area information from watermark)
An image processing apparatus comprising:
document image input means for inputting a document image; and
watermark information extraction means for extracting image area information from a watermark image embedded in the input document image.

[付記２]（透かしから像域情報を抽出する）
請求項１の画像処理装置において、
前記入力された文書画像に埋め込まれた透かし画像から抽出された像域情報は、
その座標の基準位置（左下基準，左上基準ｅｔｃ．）、長さの単位（ｃｍ，ｉｎｃｈ，ｅｔｃ．）が埋め込まれている。 [Appendix 2] (Extract image area information from watermark)
The image processing apparatus according to claim 1, wherein the image area information extracted from the watermark image embedded in the input document image has embedded in it the reference position of its coordinates (lower-left reference, upper-left reference, etc.) and the unit of length (cm, inch, etc.).

[付記３]（透かしがなかった場合は識別処理をする）
文書画像を入力する文書画像入力手段と、
前記入力された文書画像に像域情報が透かし画像として埋められていた場合、埋め込まれた透かし画像から像域情報を抽出する透かし情報抽出手段と、
前記入力された文書画像に透かし画像が埋め込まれていなかった場合に入力文書画像の像域を識別する画像識別手段、
とを有することを特徴とする画像処理装置。 [Appendix 3] (If there is no watermark, an identification process is performed)
An image processing apparatus comprising:
document image input means for inputting a document image;
watermark information extraction means for extracting image area information from the embedded watermark image when image area information is embedded as a watermark image in the input document image; and
image identification means for identifying the image areas of the input document image when no watermark image is embedded in the input document image.

[付記４]（透かし情報と識別結果とを合成する）
文書画像を入力する文書画像入力手段と、
前記入力された文書画像に像域情報が透かし画像として埋められていた場合、埋め込まれた透かし画像から第１の像域情報を抽出する透かし情報抽出手段と、
入力された文書画像の像域を識別し第２像域情報を生成する画像識別手段と、
第１の像域情報と第２の像域情報を合成し第３の像域情報を生成する像域情報合成手段、
とを有することを特徴とする画像処理装置。 [Appendix 4] (Combining watermark information and identification result)
An image processing apparatus comprising:
document image input means for inputting a document image;
watermark information extraction means for extracting first image area information from the embedded watermark image when image area information is embedded as a watermark image in the input document image;
image identification means for identifying the image areas of the input document image and generating second image area information; and
image area information synthesizing means for combining the first image area information and the second image area information to generate third image area information.

[付記５]（透かし画像から抽出した領域については像域識別を行わない）
文書画像を入力する文書画像入力手段と、
前記入力された文書画像に像域情報が透かし画像として埋められていた場合、埋め込まれた透かし画像から第１の像域情報を抽出する透かし情報抽出手段と、
前記入力された文書画像の前記第１の像域情報においてどの領域にも含まれていない位置の像域を識別し第２像域情報を生成する画像識別手段と、
第１の像域情報と第２の像域情報を合成し第３の像域情報を生成する像域情報合成手段、
とを有することを特徴とする画像処理装置。 [Appendix 5] (Image areas are not identified for areas extracted from watermark images)
An image processing apparatus comprising:
document image input means for inputting a document image;
watermark information extraction means for extracting first image area information from the embedded watermark image when image area information is embedded as a watermark image in the input document image;
image identification means for identifying the image areas at positions of the input document image that are not included in any area of the first image area information, and generating second image area information; and
image area information synthesizing means for combining the first image area information and the second image area information to generate third image area information.

[付記６]（透かしから抽出した領域は、透かしにない情報のみ（文字色等）を識別処理で生成する）
文書画像を入力する文書画像入力手段と、
前記入力された文書画像に像域情報が透かし画像として埋められていた場合、埋め込まれた透かし画像から第１の像域情報を抽出する透かし情報抽出手段と、
第１の像域情報内の領域に対して画像識別を行い、第１の像域情報にはない種類の情報である第２の像域情報を生成する画像識別手段と、
第１の像域情報と第２の像域情報を合成し第３の像域情報を生成する像域情報合成手段、
とを有することを特徴とする画像処理装置。 [Appendix 6] (Region extracted from watermark generates only information that is not in the watermark (character color, etc.) by identification processing)
An image processing apparatus comprising:
document image input means for inputting a document image;
watermark information extraction means for extracting first image area information from the embedded watermark image when image area information is embedded as a watermark image in the input document image;
image identification means for performing image identification on the areas in the first image area information and generating second image area information, which is information of a kind not contained in the first image area information; and
image area information synthesizing means for combining the first image area information and the second image area information to generate third image area information.

[付記７]（本提案画像処理装置の使い方（コピーｏｒ高圧縮ファイル）に応じて、合成方法を変える）
請求項４、請求項５、請求項６に記載の画像処理装置において、
像域情報合成手段は、前記画像処理装置の使用方法（コピー用識別／高圧縮ファイル用識別／ＯＣＲ前処理）に応じてその合成方法を変える。 [Appendix 7] (Combination method is changed depending on how to use the proposed image processing device (copy or highly compressed file))
The image processing apparatus according to claim 4, claim 5, or claim 6, wherein the image area information synthesizing means changes its combining method according to how the image processing apparatus is used (identification for copying / identification for highly compressed files / OCR preprocessing).

[付記８]（透かし情報と識別結果の重複を許して全て出力する）
請求項４、請求項５、請求項６に記載の画像処理装置において、
前記像域情報合成手段は、第１の像域情報と第２像域情報に含まれている領域情報を全て第３の像域情報に含ませる。 [Appendix 8] (Allows duplicate output of watermark information and identification results)
The image processing apparatus according to claim 4, claim 5, or claim 6, wherein the image area information synthesizing means includes in the third image area information all of the area information contained in the first image area information and the second image area information.

[付記９]（透かし情報を重視。像域識別結果と重なりがあった場合は、識別結果の方を破棄する）
請求項４、請求項５、請求項６に記載の画像処理装置において、
前記像域情報合成手段は、第１の像域情報内の領域と第２像域情報内の領域の座標位置に重なりがあった場合、該領域については第１の像域情報の領域を採用し第２の像域情報の領域を破棄する。 [Appendix 9] (Watermark information is emphasized. If there is an overlap with the image area identification result, the identification result is discarded)
The image processing apparatus according to claim 4, claim 5, or claim 6, wherein, when an area in the first image area information and an area in the second image area information overlap in coordinate position, the image area information synthesizing means adopts the area of the first image area information and discards the area of the second image area information.

[Appendix 10] (The region attribute, i.e. whether the region is text, is given priority; where regions overlap, only the region with the priority attribute is kept)
In the image processing apparatus according to claim 4, claim 5, or claim 6,
the image area information synthesizing means determines in advance a degree of importance for each region attribute, and when the coordinate positions of a region in the first image area information and a region in the second image area information overlap and their attributes differ, it discards, for that region, whichever of the first and second image area information has the lower-priority attribute.
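
The attribute-priority rule can be sketched as below. This is an illustration under stated assumptions, not the method of the specification: the `Region` record, the `PRIORITY` table and its values, and the function names are hypothetical. Overlapping regions with the same attribute are both kept, because the text only addresses the case where the attributes differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical region record: a rectangle and its attribute."""
    x: int
    y: int
    w: int
    h: int
    attr: str


# Assumed importance table: a larger value means a higher priority.
PRIORITY = {"text": 2, "photo": 1}


def overlaps(a, b):
    """True if the two rectangles share any area (assumed overlap test)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def merge_by_attribute(first, second, priority=PRIORITY):
    """Where regions overlap with different attributes, drop the one whose
    attribute has the lower priority; keep everything else."""
    drop_first, drop_second = set(), set()
    for i, a in enumerate(first):
        for j, b in enumerate(second):
            if overlaps(a, b) and a.attr != b.attr:
                if priority[a.attr] < priority[b.attr]:
                    drop_first.add(i)
                else:
                    # Assumption: on equal priority the first information wins.
                    drop_second.add(j)
    merged = [r for i, r in enumerate(first) if i not in drop_first]
    merged += [r for j, r in enumerate(second) if j not in drop_second]
    return merged


first = [Region(0, 0, 100, 100, "photo")]
second = [Region(10, 10, 50, 20, "text"), Region(200, 200, 10, 10, "photo")]
third = merge_by_attribute(first, second)
```

With text ranked above photo, the photo region of the first information is discarded in favor of the overlapping text region, and the non-overlapping photo region is kept.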

[Appendix 11] (When image area information is synthesized, the order of the regions may be changed freely)
In the image processing apparatus according to claim 4, claim 5, or claim 6,
the image area information synthesizing means can freely rearrange the order in which the region information is described in the third image area information obtained by synthesizing the first image area information and the second image area information.

[Appendix 12] (Only regions on which the watermark information and the image area identification result agree are output)
In the image processing apparatus according to claim 4, claim 5, or claim 6,
the image area information synthesizing means examines the coordinate positions of the regions in the first image area information and the regions in the second image area information, and adopts only the matching regions in the third image area information.
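
A minimal sketch of this matching rule is given below. It assumes that "matching" means identical rectangle coordinates and that the attribute is taken from the first image area information; neither detail is stated in the text, and a practical implementation would probably need a tolerance for scanning error. The `Region` record and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical region record: a rectangle and its attribute."""
    x: int
    y: int
    w: int
    h: int
    attr: str


def merge_matching_only(first, second):
    """Keep only first-information regions whose rectangle also appears
    in the second information (exact coordinate match assumed)."""
    second_boxes = {(r.x, r.y, r.w, r.h) for r in second}
    return [r for r in first if (r.x, r.y, r.w, r.h) in second_boxes]


watermark = [Region(0, 0, 100, 20, "text"), Region(0, 30, 100, 50, "photo")]
identified = [Region(0, 0, 100, 20, "text"), Region(5, 30, 100, 50, "photo")]
third = merge_matching_only(watermark, identified)
```

Only the text region survives here, because the photo rectangles differ by five pixels in their left edge.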

The present invention relates to a technique for obtaining an accurate image area identification result from a document image in which image area information is embedded. With this technique, copies can be made across many generations while still using accurate image area identification results. Electronic files with higher image quality and richer functionality can also be created.

The present invention is not limited to the above embodiments as they stand; at the implementation stage, the constituent elements can be modified and embodied without departing from the gist of the invention.
Various inventions can also be formed by appropriately combining the constituent elements disclosed in the above embodiments. For example, some constituent elements may be deleted from all of those shown in an embodiment. Furthermore, constituent elements of different embodiments may be combined as appropriate.

The present invention can be used in industries that manufacture and use image forming apparatuses and image processing apparatuses capable of obtaining a highly accurate image area identification result even from multi-generation copies and of producing highly functional electronic files.

DESCRIPTION OF SYMBOLS
101 ... Document image input means, 102 ... Image area information input means, 103 ... Image forming means, 104 ... Document composition means,
501 ... Document image input means, 502 ... Information extraction means, 503 ... Image processing means, 504 ... Image output means,
701 ... Document image input means, 702 ... Information extraction means, 703 ... Image division means, 704 ... First compression means, 705 ... Second compression means, 706 ... Image combination means,
801 ... Document data input means, 802 ... Document image forming means, 803 ... Image area information input means, 804 ... Image forming means, 805 ... Document composition means,
1101 ... Document image input means, 1102 ... Image identification means, 1103 ... Image forming means, 1104 ... Document composition means,
1301 ... Document data input means, 1302 ... Document image forming means, 1303 ... Image area information extracting means, 1304 ... Image forming means, 1305 ... Document composition means,
1501 ... Document image input means, 1502 ... Image identification means, 1503 ... Image area information editing means, 1504 ... Image forming means, 1505 ... Document composition means,
1801 ... Document data input means, 1802 ... Document image forming means, 1803 ... Image area information extracting means, 1804 ... Image area information editing means, 1805 ... Image forming means, 1806 ... Document composition means.

JP 2005-39430 A
JP 2002-77633 A

Claims

An image forming apparatus comprising:
document image input means for inputting a document image and generating an input document image as image data;
image area information input means for inputting image area information created for each image area of the document image;
watermark image forming means for forming a watermark image in which the image area information is embedded; and
watermarked document image synthesizing means for combining the input document image and the watermark image to form a watermarked document image.

An image forming apparatus comprising:
document data input means for inputting document data;
document image forming means for forming, from the document data, a document image as image data;
image area information input means for inputting image area information created for each image area in the document data;
watermark image forming means for forming a watermark image in which the image area information is embedded; and
watermarked document image synthesizing means for combining the document image and the watermark image to form a watermarked document image.

An image forming apparatus comprising:
document image input means for inputting a document image and generating an input document image as image data;
image area identifying means for identifying each image area in the input document image and generating image area information;
watermark image forming means for forming a watermark image in which the image area information is embedded; and
watermarked document image synthesizing means for combining the input document image and the watermark image to form a watermarked document image.

An image forming apparatus comprising:
document data input means for inputting document data;
document image forming means for forming, from the document data, a document image as image data;
image area information extracting means for extracting image area information for each image area in the document data;
watermark image forming means for forming a watermark image in which the image area information is embedded; and
watermarked document image synthesizing means for combining the document image and the watermark image to form a watermarked document image.

An image forming apparatus comprising:
document image input means for inputting a document image and generating an input document image as image data;
image area identifying means for identifying each image area in the input document image and generating image area information;
image area information editing means for editing the image area information according to a predetermined rule;
watermark image forming means for forming a watermark image in which the edited image area information is embedded; and
watermarked document image synthesizing means for combining the input document image and the watermark image to form a watermarked document image.

An image forming apparatus comprising:
document data input means for inputting document data;
document image forming means for forming, from the document data, a document image as image data;
image area information extracting means for extracting image area information for each image area in the document data;
image area information editing means for editing the extracted image area information according to a predetermined rule;
watermark image forming means for forming a watermark image in which the edited image area information is embedded; and
watermarked document image synthesizing means for combining the document image and the watermark image to form a watermarked document image.

The image forming apparatus according to claim 5, wherein the image area information editing means adds new image area information.

The image forming apparatus according to claim 5, wherein the image area information editing means removes at least one piece of image area information.

The image forming apparatus according to claim 5, wherein the image area information editing means replaces at least one piece of image area information with new image area information.

The image forming apparatus according to claim 5, wherein the image area information editing means prioritizes the image area information.
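
The four editing operations recited in the dependent claims above (adding, removing, replacing, and prioritizing image area information) can be pictured with the following Python sketch. It is an illustration only. The `Region` record, the function names, and the priority table are hypothetical, and reading "prioritizes" as ordering the regions by attribute importance is an assumption rather than something the claims state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical region record: a rectangle and its attribute."""
    x: int
    y: int
    w: int
    h: int
    attr: str


def add_region(info, region):
    """Return a copy of the image area information with one region added."""
    return list(info) + [region]


def remove_region(info, index):
    """Return a copy with the region at `index` removed."""
    return list(info[:index]) + list(info[index + 1:])


def replace_region(info, index, region):
    """Return a copy with the region at `index` replaced by `region`."""
    return list(info[:index]) + [region] + list(info[index + 1:])


def prioritize(info, priority):
    """Assumed reading: order regions from highest to lowest priority.

    `sorted` is stable, so regions of equal priority keep their order.
    """
    return sorted(info, key=lambda r: priority.get(r.attr, 0), reverse=True)


info = [Region(0, 30, 100, 50, "photo"), Region(0, 0, 100, 20, "text")]
added = add_region(info, Region(0, 90, 100, 10, "text"))
removed = remove_region(info, 0)
replaced = replace_region(info, 0, Region(0, 30, 100, 50, "graphic"))
ranked = prioritize(info, {"text": 2, "photo": 1})
```

Each helper returns a new list, so the edited image area information can be passed to the watermark image forming step without altering the original identification result.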