JP2004221679A

JP2004221679A - Image processing apparatus, program, and storage medium

Info

Publication number: JP2004221679A
Application number: JP2003003546A
Authority: JP
Inventors: Toshio Miyazawa; 利夫宮澤; Yasuyuki Nomizu; 泰之野水; Hiroyuki Sakuyama; 宏幸作山; Junichi Hara; 潤一原; Nekka Matsuura; 熱河松浦; Takanori Yano; 隆則矢野; Taku Kodama; 児玉　　卓; Yasuyuki Shinkai; 康行新海; Takayuki Nishimura; 隆之西村
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2003-01-09
Filing date: 2003-01-09
Publication date: 2004-08-05
Anticipated expiration: 2023-01-09
Also published as: JP4194373B2

Abstract

PROBLEM TO BE SOLVED: To provide an image processing apparatus that realizes encoding suitable for searching and managing image data and for processing image data.

SOLUTION: A multifunction peripheral, serving as an image processing apparatus that compresses an original image to generate a code stream, divides the original image into a plurality of regions (S2), performs character recognition on the images of the divided regions to generate character recognition information as additional information (S3), and embeds the generated additional information at a predetermined embedding position in the code stream (S5). Thus, for example, when required image data is to be found among a plurality of image data, the search can be performed simply by using the additional information. That is, embedding additional information in the code stream as required makes it easy to search and manage image data and to process image data.

COPYRIGHT: (C)2004, JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、画像処理装置、プログラム及び記憶媒体に関する。
【０００２】
【従来の技術】
近年、スキャナ、デジタルカメラ、パーソナルコンピュータ、プリンタ、複写機、複合機（ＭＦＰ）等の画像処理装置の普及に伴い、デジタル画像データをメモリやハードディスク等の記憶装置に保存したり、ＣＤ−ＲＯＭ等の光ディスクに保存したり、さらには、インターネット等を介して伝送したりすることが身近なものになっている。このような画像データは、通常、圧縮されて記憶装置や光ディスク等に保存されることが多い。
【０００３】
最近では、様々な技術により簡単に高精細画像を得ることができるが、高精細画像の画像データサイズは大きくなる傾向にあり、高精細画像の取扱いは困難になってきている。こうした高精細画像の取扱いを容易にする画像圧縮伸長アルゴリズムとしては、現在、ＪＰＥＧ（ＪｏｉｎｔＰｈｏｔｏｇｒａｐｈｉｃＥｘｐｅｒｔｓＧｒｏｕｐ）が最も広く用いられている。また、このＪＰＥＧで採用されているＤＣＴ（離散コサイン変換）に代わる周波数変換として、近年、ＤＷＴ（離散ウェーブレット変換）の採用が増加している。その代表例は、２００１年に国際標準となったＪＰＥＧ後継の画像圧縮伸長方式ＪＰＥＧ２０００である。
【０００４】
このような圧縮された画像データはデジタルデータであるため、インターネット等を介する伝送や記憶装置への保存等を容易にするが、一方で、作成者に無断で改変される可能性が高いものである。これを防ぐため、作成者を特定するための署名情報を付加情報として原画像に埋め込む方法が提案されている（例えば、特許文献１参照）。
【０００５】
【特許文献１】
特開２００１−４２７６８公報
【０００６】
【発明が解決しようとする課題】
しかしながら、特許文献１の技術では、原画像に付加情報として署名情報を埋め込むことはできるが、署名情報は原画像に関する付加情報ではないため、作成者を特定するため以外、例えば画像データ検索等の二次的な利用に署名情報を用いることは難しい。また、原画像に関する付加情報を原画像に埋め込んだ場合でも、その埋め込み位置によっては、付加情報を利用する際の処理時間が長くなる場合がある。
【０００７】
また、ユーザは、記憶装置に格納された複数の画像データから文字領域又は写真領域を有する画像データや類似した画像を有する画像データを容易に検索できることを要望している。例えば、従来の技術においては、ユーザが、画像データに基づいて表示装置等に表示された画像や用紙等に印字された画像等を確認することで、文字領域又は写真領域を有する画像データや類似した画像を有する画像データ等を検索する場合が多い。
【０００８】
本発明の目的は、画像データの検索や管理、画像データに対する処理等に好適な符号化を実現する画像処理装置、プログラム及び記憶媒体を提供することである。
【０００９】
【課題を解決するための手段】
請求項１記載の発明は、原画像を２次元ウェーブレット変換、量子化及び符号化という手順で圧縮してコードストリームを生成する画像処理装置において、前記原画像を複数領域に分割する領域分割手段と、前記領域分割手段により分割された前記複数領域の画像から前記原画像に関連する付加情報を生成する付加情報生成手段と、前記付加情報生成手段により生成された付加情報を前記コードストリームに埋め込む付加情報埋込手段と、を備えることを特徴とする。
【００１０】
したがって、原画像に関連する付加情報を生成し、生成した付加情報をコードストリームに埋め込むことによって、原画像の画像データが付加情報を有し、例えば、複数の画像データから必要とする画像データを検索する場合等、付加情報を利用して簡単に検索することが可能となる。すなわち、必要に応じた付加情報を埋め込むことで、画像データの検索や管理、画像データに対する処理等を容易に行うことが可能になる。
【００１１】
請求項２記載の発明は、請求項１記載の画像処理装置において、前記付加情報生成手段は、前記複数領域の画像から文字を認識することで前記付加情報として文字認識情報を生成することを特徴とする。
【００１２】
したがって、付加情報として文字認識情報を生成することによって、この文字認識情報がコードストリームに埋め込まれ、例えば、複数の画像データから文字領域を有する画像データを検索する場合等、文字認識情報により簡単に検索することが可能となる。すなわち、文字認識情報を利用することで、画像データの検索や管理等を容易に行うことが可能になる。
【００１３】
請求項３記載の発明は、請求項１又は２記載の画像処理装置において、前記付加情報生成手段は、前記複数領域の領域属性を識別することで前記付加情報として領域識別情報を生成することを特徴とする。
【００１４】
したがって、付加情報として領域識別情報を生成することによって、この領域識別情報がコードストリームに埋め込まれ、例えば、複数の画像データから写真領域を有する画像データを検索する場合等、領域識別情報により簡単に検索することが可能となる。すなわち、領域識別情報を利用することで、画像データの検索や管理等を容易に行うことが可能になる。
【００１５】
請求項４記載の発明は、請求項１、２又は３記載の画像処理装置において、前記付加情報生成手段は、前記複数領域の画像の傾きを検出することで前記付加情報として傾き検出情報を生成することを特徴とする。
【００１６】
したがって、付加情報として傾き検出情報を生成することによって、この傾き検出情報がコードストリームに埋め込まれ、例えば、画像を表示装置に表示させたり、用紙に印字させたりする場合等、傾き検出情報により簡単に画像の傾きを補正することが可能となる。すなわち、傾き検出情報を利用することで、画像データに対する処理等を容易に行うことが可能になる。
【００１７】
請求項５記載の発明は、請求項１ないし４のいずれか一記載の画像処理装置において、前記付加情報埋込手段は、前記コードストリームのメインヘッダに前記付加情報を埋め込むことを特徴とする。
【００１８】
したがって、付加情報をコードストリームのメインヘッダに埋め込むことによって、例えば、複数の画像データから必要とする画像データを検索する場合等、メインヘッダを読み取る早い段階でメインヘッダの付加情報を利用して検索することが可能となり、その結果として、処理時間を短縮することが可能となる。
【００１９】
請求項６記載の発明は、請求項１ないし４のいずれか一記載の画像処理装置において、前記付加情報埋込手段は、前記コードストリームのタイルパートヘッダに前記付加情報を埋め込むことを特徴とする。
【００２０】
したがって、付加情報をコードストリームのタイルパートヘッダに埋め込むことによって、例えば、タイル毎（領域毎）の付加情報が対応するタイルパートヘッダに埋め込まれ、複数の画像データから必要とする画像データを検索する場合等、タイルパートヘッダの付加情報を利用して簡単に検索することが可能となる。
【００２１】
請求項７記載の発明は、請求項１ないし４のいずれか一記載の画像処理装置において、前記付加情報埋込手段は、前記コードストリームにおけるレイヤの最下位ビットに前記付加情報を埋め込むことを特徴とする。
【００２２】
したがって、付加情報をコードストリームにおけるレイヤの最下位ビットに埋め込むことによって、画像サイズを増加させることなく付加情報を埋め込むことが可能である。
【００２３】
請求項８記載の発明は、請求項１ないし７のいずれか一記載の画像処理装置において、原稿から前記原画像を光学的に読み取る読取光学系を備えることを特徴とする。
【００２４】
したがって、原稿から原画像を読み取ることが可能になり、その結果として、読み取った原画像に対し画像処理等の様々な処理を実行することが可能になる。
【００２５】
請求項９記載の発明は、請求項１ないし８のいずれか一記載の画像処理装置において、圧縮された前記原画像を復号化、逆量子化及び２次元ウェーブレット逆変換という手順で伸長する伸長手段を備えることを特徴とする。
【００２６】
したがって、ＪＰＥＧ２０００アルゴリズムの伸長手段を用いることで、ＪＰＥＧ２０００アルゴリズムで圧縮された画像をＪＰＥＧ２０００の特性を活かして伸長することが可能となり、その結果として、伸長された画像の表示装置等への表示や用紙等への印字等を実行することが可能になる。
【００２７】
請求項１０記載の発明は、請求項９記載の画像処理装置において、前記伸長手段により伸長された画像を記録材に画像形成するプリンタエンジンを備えることを特徴とする。
【００２８】
したがって、伸長された画像を用紙等の記録材に形成することが可能になる。
【００２９】
請求項１１記載の発明のプログラムは、原画像を２次元ウェーブレット変換、量子化及び符号化という手順で圧縮してコードストリームを生成する画像処理装置が備えるコンピュータに解釈され、前記コンピュータに、原画像を複数領域に分割する領域分割機能と、前記領域分割手段により分割された前記複数領域の画像から前記原画像に関連する付加情報を生成する付加情報生成機能と、前記付加情報生成手段により生成された付加情報を前記コードストリームに埋め込む付加情報埋込機能と、を実行させる。
【００３０】
したがって、原画像に関連する付加情報を生成し、生成した付加情報をコードストリームに埋め込むことによって、原画像の画像データが付加情報を有し、例えば、複数の画像データから必要とする画像データを検索する場合等、付加情報を利用して簡単に検索することが可能となる。すなわち、必要に応じた付加情報を埋め込むことで、画像データの検索や管理、画像データに対する処理等を容易に行うことが可能になる。
【００３１】
請求項１２記載の発明は、請求項１１記載のプログラムにおいて、前記付加情報生成機能は、前記複数領域の画像から文字を認識することで前記付加情報として文字認識情報を生成する。
【００３２】
したがって、付加情報として文字認識情報を生成することによって、この文字認識情報がコードストリームに埋め込まれ、例えば、複数の画像データから文字領域を有する画像データを検索する場合等、文字認識情報により簡単に検索することが可能となる。すなわち、文字認識情報を利用することで、画像データの検索や管理等を容易に行うことが可能になる。
【００３３】
請求項１３記載の発明は、請求項１１又は１２記載のプログラムにおいて、前記付加情報生成機能は、前記複数領域の領域属性を識別することで前記付加情報として領域識別情報を生成する。
【００３４】
したがって、付加情報として領域識別情報を生成することによって、この領域識別情報がコードストリームに埋め込まれ、例えば、複数の画像データから写真領域を有する画像データを検索する場合等、領域識別情報により簡単に検索することが可能となる。すなわち、領域識別情報を利用することで、画像データの検索や管理等を容易に行うことが可能になる。
【００３５】
請求項１４記載の発明は、請求項１１、１２又は１３記載のプログラムにおいて、前記付加情報生成機能は、前記複数領域の画像の傾きを検出することで前記付加情報として傾き検出情報を生成する。
【００３６】
したがって、付加情報として傾き検出情報を生成することによって、この傾き検出情報がコードストリームに埋め込まれ、例えば、画像を表示装置に表示させたり、用紙に印字させたりする場合等、傾き検出情報により簡単に画像の傾きを補正することが可能となる。すなわち、傾き検出情報を利用することで、画像データに対する処理等を容易に行うことが可能になる。
【００３７】
請求項１５記載の発明は、請求項１１ないし１４のいずれか一記載のプログラムにおいて、前記付加情報埋込機能は、前記コードストリームのメインヘッダに前記付加情報を埋め込む。
【００３８】
したがって、付加情報をコードストリームのメインヘッダに埋め込むことによって、例えば、複数の画像データから必要とする画像データを検索する場合等、メインヘッダを読み取る早い段階でメインヘッダの付加情報を利用して検索することが可能となり、その結果として、処理時間を短縮することが可能となる。
【００３９】
請求項１６記載の発明は、請求項１１ないし１４のいずれか一記載のプログラムにおいて、前記付加情報埋込機能は、前記コードストリームのタイルパートヘッダに前記付加情報を埋め込む。
【００４０】
したがって、付加情報をコードストリームのタイルパートヘッダに埋め込むことによって、例えば、タイル毎（領域毎）の付加情報が対応するタイルパートヘッダに埋め込まれ、複数の画像データから必要とする画像データを検索する場合等、タイルパートヘッダの付加情報を利用して簡単に検索することが可能となる。
【００４１】
請求項１７記載の発明は、請求項１１ないし１４のいずれか一記載のプログラムにおいて、前記付加情報埋込機能は、前記コードストリームにおけるレイヤの最下位ビットに前記付加情報を埋め込む。
【００４２】
したがって、付加情報をコードストリームにおけるレイヤの最下位ビットに埋め込むことによって、画像サイズを増加させることなく付加情報を埋め込むことが可能である。
【００４３】
請求項１８記載の発明のコンピュータ読取可能な記憶媒体は、請求項１１ないし１７のいずれか一記載のプログラムを記憶している。
【００４４】
したがって、請求項１１ないし１７のいずれか一記載の発明と同様な作用を奏する。
【００４５】
【発明の実施の形態】
本発明の第一の実施の形態を図１ないし図１１に基づいて説明する。
【００４６】
本実施の形態は、「ＪＰＥＧ２０００アルゴリズム」を利用するものであるが、ＪＰＥＧ２０００アルゴリズム自体は各種文献や公報等により周知であるので、詳細は省略し、その概要について説明する。
【００４７】
図１はＪＰＥＧ２０００アルゴリズムの概要を説明するための機能ブロック図である。
ＪＰＥＧ２０００のアルゴリズムは、色空間変換・逆変換部１００、２次元ウェーブレット変換・逆変換部１０１、量子化・逆量子化部１０２、エントロピー符号化・復号化部１０３、タグ処理部１０４で構成されている。
【００４８】
ＪＰＥＧ２０００の特徴の一つは、高圧縮領域における画質が良いという長所を持つ２次元離散ウェーブレット変換（ＤＷＴ：ＤｉｓｃｒｅｔｅＷａｖｅｌｅｔＴｒａｎｓｆｏｒｍ）を用いている点である。また、もう一つの大きな特徴は、最終段に符号形成を行うためのタグ処理部１０４と呼ばれる機能ブロックが追加されており、符号列データであるコードストリームの生成や解釈が行われる点である。そして、コードストリームによって、ＪＰＥＧ２０００は様々な便利な機能を実現できるようになっている。
【００４９】
なお、画像の入出力部分には、色空間変換・逆変換部１００が用意されることが多い。この色空間変換・逆変換部１００は、例えば、原色系のＲ（赤）／Ｇ（緑）／Ｂ（青）の各コンポーネントからなるＲＧＢ表色系や、補色系のＹ（黄）／Ｍ（マゼンタ）／Ｃ（シアン）の各コンポーネントからなるＹＭＣ表色系から、ＹＣｒＣｂあるいはＹＵＶ表色系への変換又は逆の変換を行う部分である。
【００５０】
以下、ＪＰＥＧ２０００アルゴリズム、特にウェーブレット変換について説明する。
【００５１】
図２はカラー画像である原画像の分割された各コンポーネントの一例を概略的に示す模式図である。カラー画像は、一般に、図２に示すように、原画像の各コンポーネント１１０が、例えばＲＧＢ原色系によって分離される。さらに、画像の各コンポーネント１１０は、矩形をした領域であるタイル１１１によって分割される（図２の例では、各コンポーネント１１０が縦横４×４、合計１６個の矩形のタイル１１１に分割されている）。このような個々のタイル１１１、例えば、Ｒ００，Ｒ０１，…，Ｒ１５／Ｇ００，Ｇ０１，…，Ｇ１５／Ｂ００，Ｂ０１，…，Ｂ１５は、画像データの圧縮伸長プロセスを実行する際の基本単位となる。従って、画像データの圧縮伸長動作は、コンポーネント１１０毎に、また、タイル１１１毎に、独立して行われる。
【００５２】
画像データの符号化時には（図１参照）、各コンポーネント１１０の各タイル１１１のデータが色空間変換・逆変換部１００に入力され、色空間変換を施された後、２次元ウェーブレット変換・逆変換部１０１で２次元ウェーブレット変換（順変換）が適用されて周波数帯に空間分割される。
【００５３】
図３はデコンポジションレベル数が３である場合の各デコンポジションレベルにおけるサブバンドを概略的に示す模式図である。２次元ウェーブレット変換・逆変換部１０１は、画像のタイル分割によって得られたタイル画像（デコンポジションレベル０（１２０）：０ＬＬ）に対して、２次元ウェーブレット変換を施し、デコンポジションレベル１（１２１）に示すサブバンド（１ＬＬ，１ＨＬ，１ＬＨ，１ＨＨ）を分離する。引き続き、２次元ウェーブレット変換・逆変換部１０１は、この階層における低周波成分１ＬＬに対して、２次元ウェーブレット変換を施し、デコンポジションレベル２（１２２）に示すサブバンド（２ＬＬ，２ＨＬ，２ＬＨ，２ＨＨ）を分離する。そして、２次元ウェーブレット変換・逆変換部１０１は、順次同様に、低周波成分２ＬＬに対しても、２次元ウェーブレット変換を施し、デコンポジションレベル３（１２３）に示すサブバンド（３ＬＬ，３ＨＬ，３ＬＨ，３ＨＨ）を分離する。なお、図３中では、各デコンポジションレベルにおいて符号化の対象となるサブバンドはグレーで示されている。例えば、デコンポジションレベル数を３とした場合、グレーで示したサブバンド（３ＨＬ，３ＬＨ，３ＨＨ，２ＨＬ，２ＬＨ，２ＨＨ，１ＨＬ，１ＬＨ，１ＨＨ）が符号化対象となり、３ＬＬサブバンドは符号化されない。
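The decomposition described in this paragraph (each level splits the current LL band into LL, HL, LH and HH) can be illustrated with a short Python sketch that lists the subbands and their sizes for a given tile. The function name `subbands` and the assumption that the tile origin is at (0, 0) are illustrative only and do not come from the patent.

```python
def subbands(width, height, levels):
    """List (name, width, height) for every subband of a tile.

    Assumes the tile starts at coordinate (0, 0), so the low-pass half of a
    length-n signal has ceil(n / 2) samples and the high-pass half floor(n / 2).
    """
    bands = []
    w, h = width, height
    for level in range(1, levels + 1):
        lo_w, hi_w = (w + 1) // 2, w // 2
        lo_h, hi_h = (h + 1) // 2, h // 2
        bands.append((f"{level}HL", hi_w, lo_h))  # horizontal high-pass, vertical low-pass
        bands.append((f"{level}LH", lo_w, hi_h))  # horizontal low-pass, vertical high-pass
        bands.append((f"{level}HH", hi_w, hi_h))  # high-pass in both directions
        w, h = lo_w, lo_h                         # the LL band is decomposed again
    bands.append((f"{levels}LL", w, h))           # remaining low-frequency band
    return bands


if __name__ == "__main__":
    for name, w, h in subbands(16, 16, 3):
        print(name, w, h)
```

For a 16x16 tile and three levels, this yields nine high-frequency subbands plus the final 3LL band.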
【００５４】
次いで、量子化・逆量子化部１０２では（図１参照）、指定した符号化の順番で符号化の対象となるビットが定められた後、対象ビット周辺のビットからコンテキストが生成される。この量子化の処理が終わったウェーブレット係数は、個々のサブバンド毎に、「プレシンクト」と呼ばれる重複しない矩形に分割される。これは、インプリメンテーションでメモリを効率的に使うために導入されたものである。ここで、図４はプレシンクトを示す説明図である。図４に示すように、一つのプレシンクトは、空間的に一致した３つの矩形領域からなっている。さらに、個々のプレシンクトは、重複しない矩形の「コードブロック」に分けられる。これは、エントロピーコーディングを行う際の基本単位となる。
【００５５】
なお、ウェーブレット変換後の係数値は、そのまま量子化し符号化することも可能であるが、ＪＰＥＧ２０００では符号化効率を上げるために、係数値を「ビットプレーン」単位に分解し、画素あるいはコードブロック毎にビットプレーンに順位付けを行うことができる。
【００５６】
ここで、図５はビットプレーンに順位付けする手順の一例を示す説明図である。図５に示すように、この例は、原画像（３２×３２画素）を１６×１６画素のタイル４つで分割した場合で、デコンポジションレベル１のプレシンクトとコードブロックの大きさは、各々８×８画素と４×４画素としている。プレシンクトとコードブロックの番号は、ラスター順に付けられており、この例では、プレシンクトが番号０から３まで、コードブロックが番号０から３まで割り当てられている。タイル境界外に対する画素拡張にはミラーリング法を使い、可逆（５，３）フィルタでウェーブレット変換を行い、デコンポジションレベル１のウェーブレット係数値を求めている。
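The reversible (5,3) filter with mirrored boundary extension mentioned in this paragraph can be sketched in one dimension as an integer lifting scheme. This is a minimal illustration, assuming an even-length signal and whole-sample symmetric extension. The function names are hypothetical, and a real codec applies the filter along rows and columns of each tile.

```python
def dwt53_forward(x):
    """One level of the reversible 5/3 lifting transform on an even-length list."""
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("this sketch expects an even length of at least 2")
    half = n // 2

    def mirrored(i):
        # Whole-sample symmetric extension past the right edge: x[n] -> x[n - 2].
        return x[i] if i < n else x[2 * (n - 1) - i]

    # Predict step: high-pass (detail) coefficients at the odd positions.
    d = [x[2 * i + 1] - (mirrored(2 * i) + mirrored(2 * i + 2)) // 2 for i in range(half)]
    # Update step: low-pass coefficients; mirroring on the left gives d[-1] = d[0].
    s = [x[2 * i] + (d[max(i - 1, 0)] + d[i] + 2) // 4 for i in range(half)]
    return s, d


def dwt53_inverse(s, d):
    """Exactly undo dwt53_forward using the same integer arithmetic."""
    half = len(s)
    n = 2 * half
    x = [0] * n
    for i in range(half):
        x[2 * i] = s[i] - (d[max(i - 1, 0)] + d[i] + 2) // 4
    for i in range(half):
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = d[i] + (x[2 * i] + right) // 2
    return x


if __name__ == "__main__":
    samples = [10, 12, 14, 200, 3, 7, 7, 9]
    low, high = dwt53_forward(samples)
    print(low, high, dwt53_inverse(low, high) == samples)
```

Because every step uses only integer additions and floor divisions that the inverse repeats in reverse order, the original samples are recovered exactly, which is what makes the filter reversible.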
【００５７】
また、タイル０／プレシンクト３／コードブロック３について、代表的な「レイヤ」構成の概念の一例を示す説明図も図５に併せて示す。変換後のコードブロックは、サブバンド（１ＬＬ，１ＨＬ，１ＬＨ，１ＨＨ）に分割され、各サブバンドにはウェーブレット係数値が割り当てられている。
【００５８】
レイヤの構造は、ウェーブレット係数値を横方向（ビットプレーン方向）から見ると理解し易い。１つのレイヤは任意の数のビットプレーンから構成される。
この例では、レイヤ０，１，２，３は、各々、１，３，１，３のビットプレーンから成っている。そして、ＬＳＢ（ＬｅａｓｔＳｉｇｎｉｆｉｃａｎｔＢｉｔ：最下位ビット）に近いビットプレーンを含むレイヤ程、先に量子化の対象となり、逆に、ＭＳＢ（ＭｏｓｔＳｉｇｎｉｆｉｃａｎｔＢｉｔ：最上位ビット）に近いレイヤは最後まで量子化されずに残ることになる。ＬＳＢに近いレイヤから破棄する方法はトランケーションと呼ばれ、量子化率を細かく制御することが可能である。
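The layer example in this paragraph (layers 0 to 3 made of 1, 3, 1 and 3 bit planes) and the truncation of layers near the LSB can be sketched as follows. The sketch assumes an 8-bit coefficient magnitude with layer 0 holding the most significant bit plane. The function name `truncate_layers` is illustrative only.

```python
LAYER_SIZES = [1, 3, 1, 3]  # bit planes per layer, from the MSB side to the LSB side


def truncate_layers(magnitude, keep_layers, layer_sizes=LAYER_SIZES):
    """Keep only the first `keep_layers` layers of a coefficient magnitude.

    Discarding a layer zeroes the bit planes it contains, which acts as a
    coarser quantization of the coefficient.
    """
    if not 0 <= keep_layers <= len(layer_sizes):
        raise ValueError("keep_layers is out of range")
    dropped_bits = sum(layer_sizes[keep_layers:])
    return (magnitude >> dropped_bits) << dropped_bits


if __name__ == "__main__":
    value = 0b10110111
    for keep in range(4, -1, -1):
        print(keep, format(truncate_layers(value, keep), "08b"))
```

Dropping the last layer removes the three lowest bit planes, and each further dropped layer removes the next group, so the number of retained layers gives stepwise control over precision.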
【００５９】
エントロピー符号化・復号化部１０３では（図１参照）、コンテキストと対象ビットとから、確率推定によって各コンポーネント１１０の各タイル１１１に対する符号化を行う。こうして、画像の全てのコンポーネント１１０について、タイル１１１単位で符号化処理が行われる。
【００６０】
最後に、タグ処理部１０４では（図１参照）、エントロピー符号化・復号化部１０３からの全符号化データを１本のコードストリーム（符号列データ）に結合するとともに、それにタグを付加する処理を行う。ここで、図６はコードストリームの構造の一例を概略的に示す模式図である。コードストリームの先頭と各タイル１１１を構成する部分タイルの先頭には、ヘッダ（メインヘッダ（Ｍａｉｎｈｅａｄｅｒ）、タイルパートヘッダ（ｔｉｌｅｐａｒｔｈｅａｄｅｒ））と呼ばれるタグ情報が付加され、その後に、各タイル１１１の符号化データ（ｂｉｔｓｔｒｅａｍ）が続く。そして、コードストリームの終端には、再びタグ情報（ｅｎｄｏｆｃｏｄｅｓｔｒｅａｍ）が付加される。
【００６１】
一方、復号化時には、符号化時とは逆に、各コンポーネント１１０の各タイル１１１のコードストリームから画像データを生成する。この場合、図１に示すように、タグ処理部１０４は、外部より入力されたコードストリーム（符号列データ）に付加されたタグ情報を解釈し、コードストリームを各コンポーネント１１０の各タイル１１１のコードストリームに分解し、その各コンポーネント１１０の各タイル１１１のコードストリーム毎に復号化処理（伸長処理）を行う。このとき、コードストリーム内のタグ情報に基づく順番で復号化の対象となるビットの位置が定められるとともに、量子化・逆量子化部１０２において、その対象ビット位置の周辺ビット（既に復号化を終えている）の並びからコンテキストを生成する。そして、エントロピー符号化・復号化部１０３では、そのコンテキストとコードストリームとから確率推定によって復号化を行って対象ビットを生成し、それを対象ビットの位置に書き込む。このようにして復号化されたデータは、周波数帯域毎に空間分割されているため、これを２次元ウェーブレット変換・逆変換部１０１で２次元ウェーブレット逆変換を行うことにより、画像データ中の各コンポーネント１１０における各タイル１１１が復元される。復元されたデータは、色空間変換・逆変換部１００によって元の表色系のデータに変換される。ここに、伸長手段が実行される。
【００６２】
次に、本実施の形態の画像処理装置である複合機１の構成例について説明する。本実施の形態の複合機１は、複写機能、プリンタ機能、スキャナ機能、ファクシミリ機能、画像サーバ機能等の複合機能を有している。
【００６３】
図７は本実施の形態の複合機１を概略的に示す縦断面図である。複合機１は、原稿から原稿画像を読み取る画像読取部であるスキャナ２と、スキャナ２で読み取られた画像を用紙等の記録材に形成する画像形成部であるプリンタ３とを備えている。
【００６４】
スキャナ２の本体ケース４の上面には、原稿（図示せず）が載置されるコンタクトガラス５が設けられている。原稿は、原稿面をコンタクトガラス５に対向させて載置される。コンタクトガラス５の上側には、コンタクトガラス５上に載置された原稿を押える原稿圧板６（いわゆるＡＤＦであってもよい）が設けられている。
【００６５】
コンタクトガラス５の下方には、原稿画像を光学的に読み取るための読取光学系７が設けられている。この読取光学系７は、光を発光する光源８及びミラー９を搭載する第１走行体１０、２枚のミラー１１，１２を搭載する第２走行体１３、結像レンズ１４を介してミラー９，１１，１２によって導かれる光を受光するＣＣＤ（ＣｈａｒｇｅＣｏｕｐｌｅｄＤｅｖｉｃｅ）イメージセンサ１５等によって構成されている。ＣＣＤイメージセンサ１５は、ＣＣＤイメージセンサ１５上に結像される原稿からの反射光を光電変換することで光電変換データを生成する光電変換素子として機能する。光電変換データは、原稿からの反射光の強弱に応じた大きさを有する電圧値である。第１、第２走行体１０，１３は、コンタクトガラス５に沿って往復動自在に設けられており、後述する原稿画像の読取動作に際しては、図示しないモータ等の移動装置によって２：１の速度比で副走査方向にスキャニング走行する。これにより、読取光学系７による原稿読取領域の露光走査が行われる。なお、本実施の形態では、読取光学系７側がスキャニング走査を行う原稿固定型で示しているが、読取光学系７側が位置固定で原稿側が移動する原稿移動型であってもよい。
【００６６】
プリンタ３は、シート状の用紙等の記録材を保持する記録材保持部１６から電子写真方式のプリンタエンジン１７及び定着器１８を経由して排出部１９へ至る記録材経路２０を備えている。
【００６７】
プリンタエンジン１７は、感光体２１、帯電器２２、露光器２３、現像器２４、転写器２５及びクリーナー２６等を用いて、電子写真方式で感光体２１の周囲に形成したトナー像を記録材に転写し、転写したトナー像を、定着器１８によって記録材上に定着させる。なお、本実施の形態では、プリンタエンジン１７が電子写真方式で画像形成を行うが、これに限るものではなく、例えば、インクジェット方式、昇華型熱転写方式、直接感熱記録方式等の様々な画像形成方式で画像形成を行うようにしても良い。
【００６８】
このような複合機１は、複数のマイクロコンピュータで構成される制御系により制御される。図８はこれらの制御系のうち、画像処理に関わる制御系の電気的な接続を概略的に示すブロック図である。この制御系は、ＣＰＵ３０、ＲＯＭ３１、ＲＡＭ３２、操作パネル３３、ＩＰＵ（ＩｍａｇｅＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）３４、Ｉ／Ｏポート３５、通信制御部３６等がバス３７で接続され構成されている。ＣＰＵ３０は、各種演算を行い、画像処理等の処理を集中的に制御する。ＲＯＭ３１には、ＣＰＵ３０が実行する処理に関わる各種プログラムや固定データが格納されている。また、ＲＡＭ３２は、ＣＰＵ３０のワークエリアとして機能し、加えて、画像データ（例えば、画像ファイル）を一時的に記憶するメモリとして機能する。操作パネル３３には、ＬＣＤ（ＬｉｑｕｉｄＣｒｙｓｔａｌＤｉｓｐｌａｙ）等の表示器、ハードキー及びタッチパネル等によって構成される複数の操作キー（いずれも図示せず）が設けられており、操作パネル３３が表示部及び操作部として機能する。ＩＰＵ３４は各種画像処理に関わるハードウエアを備えており、ＲＯＭ３１はＥＥＰＲＯＭやフラッシュメモリ等の不揮発性メモリを備えている。ここで、ＲＯＭ３１内に格納されているプログラムは、ＣＰＵ３０の制御によりＩ／Ｏポート３５を介して外部装置（図示せず）からダウンロードされるプログラムに書換え可能である。なお、本実施の形態では、ＲＯＭ３１がプログラムを記憶する記憶媒体として機能している。通信制御部３６は、複合機１と外部装置（図示せず）との間でネットワーク等を介してデータを送受信する機能を有しており、ファクシミリのモデム機能、公衆電話回線網に接続するための網制御機能、ＬＡＮ（ＬｏｃａｌＡｒｅａＮｅｔｗｏｒｋ）制御機能等を備えている。
【００６９】
次に、本実施の形態の複合機１における画像処理の概要について図９を参照して説明する。図９は複合機１における画像処理の概要を説明するための機能ブロック図である。複合機１の画像処理は、スキャナ２で読み取った原画像を複数の領域に分割する領域分割部４０、分割した複数領域を領域属性に基づいて識別する領域識別処理を実行し、さらに、複数領域に対して画像から文字を認識する文字認識処理を実行する付加情報生成部４１と、図１を参照して説明した各機能ブロックを有する圧縮部４２と、領域識別処理により生成された領域識別情報や文字認識処理により生成された文字認識情報等を付加情報として圧縮データであるコードストリームの所定の埋め込み位置に埋め込む付加情報埋込部４３とを備える。
【００７０】
なお、画像から文字を認識する文字認識処理としては、例えばＯＣＲ（ＯｐｔｉｃａｌＣｈａｒａｃｔｅｒＲｅｃｏｇｎｉｔｉｏｎ）処理が用いられる。また、付加情報生成部４１では、スキャナ２で読み取った原画像の傾き角度を検出する傾き検出処理を実行しても良い。これにより、傾き検出結果である傾き角度補正情報等の傾き検出情報が得られる。したがって、付加情報埋込部４３では、その傾き検出情報を付加情報として埋め込むようにしても良い。
【００７１】
また、複合機１の画像処理は、基本的に、スキャナ２で読み取られた原画像の画像データから分割した領域に対応する画像データに対してＯＣＲ処理を行い、さらに、画像データをＪＰＥＧ２０００アルゴリズムにより圧縮符号化して、コードストリームを生成する。すなわち、画像を１又は複数の矩形領域（タイル１１１）に分割し、この矩形領域毎に画素値を離散ウェーブレット変換して階層的に圧縮符号化する。このとき、領域識別情報、文字認識情報、傾き検出情報等は、コードストリームの所定の埋め込み位置に埋め込まれる。このような領域分割部４０、付加情報生成部４１、圧縮部４２、付加情報埋込部４３等の機能は、ＲＯＭ３１に記憶されているプログラムに基づいてＣＰＵ３０が行う画像処理で実行されるようにしているが、これに限るものではなく、例えば、ＩＰＵ３４等によりハードウエアが行う画像処理で実行されるようにしても良い。
【００７２】
ここで、領域属性としては、例えば、文字領域、写真領域、図領域、表領域、黒ベタ領域、背景領域等の様々な領域属性がある。なお、背景領域とは、原稿の余白にあたる余白領域であるが、これに限るものではなく、例えば、余白領域に行間にあたる行間領域を加えた領域であっても良い。
【００７３】
一方、付加情報としては、例えば、ＯＣＲ処理により生成された文字認識情報（文字認識結果情報）、領域識別処理により生成された座標情報や領域属性情報等の領域識別情報（領域識別結果情報）、傾き検出処理により生成された傾き角度補正情報等の傾き検出情報（傾き検出結果情報）等がある。文字認識情報としては、文字コード、文字色、フォントサイズ、フォント情報、文字の座標位置情報、確信度情報（確からしさ情報）、タイトル（文字コード）等がある。なお、タイトルは領域識別処理によりタイトル領域を抽出し、そのタイトル領域をＯＣＲ処理して得られた文字コードである。例えば、「２００２年度会議議事録．ｊ２ｋ」等のタイトルが文字コードで所定の埋め込み位置に埋め込まれる。
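The kinds of additional information listed in this paragraph (character code, character color, font size, font information, coordinates, confidence, title, and the region coordinates and attributes) could be held in a small record structure and serialized to bytes before embedding. The patent does not specify a serialization format. The class names, field names and the use of UTF-8 JSON in this sketch are assumptions made purely for illustration.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class CharacterInfo:
    text: str          # recognized character codes
    color: str         # character color, e.g. "#000000"
    font_size: float
    font: str
    x: int             # coordinate position of the characters
    y: int
    confidence: float  # certainty of the recognition result


@dataclass
class RegionInfo:
    attribute: str     # region attribute such as "text", "photo", "table"
    bbox: tuple        # (left, top, right, bottom)
    characters: list = field(default_factory=list)


def serialize(regions, title=""):
    """Pack the additional information into bytes (hypothetical JSON layout)."""
    document = {"title": title, "regions": [asdict(r) for r in regions]}
    return json.dumps(document, ensure_ascii=False).encode("utf-8")


def deserialize(payload):
    return json.loads(payload.decode("utf-8"))


if __name__ == "__main__":
    region = RegionInfo("text", (0, 0, 100, 20),
                        [CharacterInfo("議事録", "#000000", 12.0, "Mincho", 4, 2, 0.93)])
    print(deserialize(serialize([region], title="2002年度会議議事録")))
```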
【００７４】
また、所定の埋め込み位置としては、例えば、メインヘッダやタイルパートヘッダ等のヘッダ（図６参照）、画像の最下位ビット（画像サイズを増加させることなく埋め込みが可能）、さらに、画像の文字領域に対応する最下位ビット等がある。加えて、画像サイズがタイルの整数倍でない場合には、情報を付加してタイルの整数倍の画像を形成するが、その付加する情報部分に埋め込むことも可能である。
【００７５】
次に、複合機１のＣＰＵ３０がプログラムに基づいて実行する画像処理について説明する。ここでは、例えば、複合機１を画像サーバとして用いるために画像データを蓄積する画像処理について図１０及び図１１を参照して説明する。図１０は本実施の形態の画像処理の流れを概略的に示すフローチャート、図１１はその画像処理による付加情報の埋め込み位置を概略的に示す説明図である。
【００７６】
図１０に示すように、まず、スキャナ２による原稿画像の読取に待機する（ステップＳ１のＮ）。操作者がスキャナ２の原稿圧板６を開放してコンタクトガラス５上に原稿をセットし、原稿圧板６を閉じて操作パネル３３のコピースタートキーを押下すると、スキャナ２は読取光学系７のスキャニング動作でコンタクトガラス５上にセットされた原稿から原画像を読み取る。
【００７７】
スキャナ２により原稿から原画像が読み取られると（Ｓ１のＹ）、読み取られた原画像を複数の領域に分割し、文字領域、写真領域、図領域、表領域、背景領域等の領域属性に基づいて、原画像中に混在する複数領域の属性を識別する（Ｓ２）。ここに、領域分割手段又は領域分割機能が実行され、付加情報生成手段又は付加情報生成機能が実行される。なお、領域識別の方法は、従来の方法、例えば、黒ランの密度を用いて領域識別する方法等で十分であり、その方法は公知であるため、その説明は省略する。ここで、複数領域を領域識別することによって、座標情報や領域属性情報等の領域識別情報が複数領域毎に得られる。なお、スキャナ２により読み取った原稿画像に対して本実施の形態の画像処理を実行しているが、これに限るものではなく、例えばネットワークを介して受信した原画像に対して本実施の形態の画像処理を実行しても良い。
【００７８】
次に、分割識別された複数領域に対してＯＣＲ処理を実行する（Ｓ３）。すなわち、複数領域毎の画像から文字を認識する。ここに、付加情報生成手段又は付加情報生成機能が実行される。なお、ＯＣＲ処理としては、パターン・マッチング法や構造解析法等があり、これらの方法は公知であるため、その説明は省略する。ここで、複数領域毎にＯＣＲ処理を実行することによって、文字コード、文字色、フォントサイズ、フォント情報、文字の座標位置情報、確信度情報（確からしさ情報）等の文字認識情報が複数領域毎に得られる。
【００７９】
次いで、原画像をＪＰＥＧ２０００アルゴリズムに基づいて圧縮する（Ｓ４）。これにより、原画像からＪＰＥＧ２０００アルゴリズムに基づいてコードストリームが生成される。そして、コードストリームの所定の埋め込み位置に領域識別情報及び文字認識情報を付加情報として埋め込む（Ｓ５）。ここに、付加情報埋込手段又は付加情報埋込機能が実行される。例えば、図１１に示すように、付加情報は、コードストリームにおけるレイヤの最下位ビットに埋め込まれる。これにより、画像サイズを増加させることなく埋め込むことができる。なお、付加情報が埋め込まれた領域Ｒは、原画像の文字領域に対応する領域である。
【００８０】
なお、ここでは、画像の文字領域に対応する領域Ｒの最下位ビットに埋め込んでいるが、これに限るものではなく、例えば単純に最下位ビットに埋め込むようにしても良い。また、領域識別情報及び文字認識情報を付加情報として所定の埋め込み位置に埋め込んでいるが、これに限るものではなく、例えば、領域識別情報及び文字認識情報のどちらか一方だけを所定の埋め込み位置に埋め込んでも良く、あるいは、傾き検出処理を実行した場合には、傾き検出結果の傾き角度補正情報等の傾き検出情報を所定の埋め込み位置に埋め込んでも良い。また、ステップＳ４及びステップＳ５を同時に実行するようにして、付加情報を埋め込みながら原画像を圧縮するようにしても良い。ここで、原画像はＪＰＥＧ２０００アルゴリズムに基づいて複数の解像度で圧縮されているので、解像度毎にＯＣＲ処理を実行して文字認識情報を所定の埋め込み位置に埋め込むようにしても良い。
【００８１】
最後に、付加情報が埋め込まれた画像データ（圧縮データ）をＲＡＭ３２に画像ファイルとして格納する（Ｓ６）。ここで、原稿が複数枚ある場合には、原稿毎にステップＳ１からＳ６までの処理が繰り返され、ＲＡＭ３２には原稿毎に複数の画像ファイルが保存される。
【００８２】
その後、ＲＡＭ３２に画像ファイルとして格納されている画像データは、例えば、所定のタイミングで通信制御部３６によりネットワークを介して外部装置に送信される場合がある。このとき、例えば、文字領域を有する画像の画像データだけを送信するために、ＲＡＭ３２に画像ファイルとして格納された複数の画像データから文字領域を有する画像の画像データを検索する場合には、ＣＰＵ３０は画像データ内の文字コードを検出すれば良く、簡単に検索することができる。
また、類似した画像を有する画像データだけを送信するために、ＲＡＭ３２に画像ファイルとして格納された複数の画像データから類似した画像を有する画像データを検索する場合には、ＣＰＵ３０は画像データ内の領域識別情報を検出し、他の画像データと領域毎に領域属性が一致するか否かを判断して、簡単に検索することができる。
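The similarity search described here, in which the CPU compares the region attributes of one image with those of the others region by region, can be sketched as below. The patent does not define how regions are paired between images. This sketch assumes they are ordered top to bottom, then left to right, and compared by attribute only. All names are hypothetical.

```python
def layout_signature(regions):
    """Return the sequence of region attributes in reading order.

    `regions` is an iterable of (attribute, (left, top, right, bottom)).
    """
    ordered = sorted(regions, key=lambda region: (region[1][1], region[1][0]))
    return tuple(attribute for attribute, _bbox in ordered)


def find_similar(query_regions, stored):
    """Names of stored images whose region attributes match the query, region by region."""
    target = layout_signature(query_regions)
    return [name for name, regions in stored.items()
            if layout_signature(regions) == target]


if __name__ == "__main__":
    stored = {
        "a.j2k": [("text", (0, 0, 100, 20)), ("photo", (0, 30, 100, 90))],
        "b.j2k": [("photo", (0, 40, 90, 95)), ("text", (0, 0, 90, 25))],
        "c.j2k": [("table", (0, 0, 50, 50))],
    }
    query = [("text", (5, 5, 80, 22)), ("photo", (5, 35, 80, 88))]
    print(find_similar(query, stored))
```

Only the embedded region identification information is consulted, so no image has to be decompressed to run the comparison.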
【００８３】
なお、画像データの画像は、変倍（拡大，縮小）、回転、白黒反転等の画像処理が行われる場合もある。このような画像処理では、文字領域Ｍのフォントサイズを変更することで文字領域Ｍの文字画像を変倍し、また、画像がＪＰＥＧ２０００アルゴリズムの圧縮手段により様々な解像度の画像として保持されているので、画像を高画質から低画質に自由に変化させることができる。さらに、原画像を表示装置等に表示する場合には、表示装置の解像度等に合わせて画像を伸長することができる。
【００８４】
このように本実施の形態では、原画像を複数領域に分割し、複数領域の属性を識別して領域識別情報を生成し（Ｓ２）、分割した複数領域に対してＯＣＲ処理を実行して文字認識情報を生成し（Ｓ３）、生成した領域識別情報及び文字認識情報をコードストリームの所定の埋め込み位置に埋め込むことによって（Ｓ５）、例えば、複数の画像データ（圧縮データ）から必要とする画像データを検索する場合等、付加情報を利用して簡単に検索することが可能となる。すなわち、必要に応じた付加情報を埋め込むことで、画像データの検索や管理、画像データに対する処理等を容易に行うことができる。また、原画像の複数領域毎に対応する付加情報、例えば領域識別情報や文字認識情報等が得られるので、画像の中から必要とする領域、例えば文字領域や写真領域等を簡単に検索して抽出することができる。さらに、ＪＰＥＧ２０００アルゴリズムの圧縮手段及び伸長手段を用いることで、ＪＰＥＧ２０００の特性を活かした様々な画像処理を実行することができる。
【００８５】
本発明の第二の実施の形態を図１２に基づいて説明する。図１２は本実施の形態の画像処理の流れを概略的に示すフローチャートである。なお、前述して説明した部分と同一部分は同一符号で示す。
【００８６】
本実施の形態の基本的構成は、第一の実施の形態と同様であるが、第一の実施の形態との相違点は、複合機１のＣＰＵ３０がプログラムに基づいて実行する画像処理が異なる点である。
【００８７】
本実施の形態の複合機１のＣＰＵ３０がプログラムに基づいて実行する画像処理について説明する。ここでは、例えば、複合機１を画像サーバとして用いるために画像データを蓄積する画像処理について説明する。
【００８８】
図１２に示すように、まず、スキャナ２による原稿画像の読取に待機する（ステップＳ１１のＮ）。操作者がスキャナ２の原稿圧板６を開放してコンタクトガラス５上に原稿をセットし、原稿圧板６を閉じて操作パネル３３のコピースタートキーを押下すると、スキャナ２は読取光学系７のスキャニング動作でコンタクトガラス５上にセットされた原稿から原画像を読み取る。
【００８９】
スキャナ２により原稿から原画像が読み取られると（Ｓ１１のＹ）、読み取られた原画像をタイリング処理して複数のタイル１１１（複数の領域：図１参照）に分割する（Ｓ１２）。ここに、領域分割手段又は領域分割機能が実行される。
これにより、原画像は複数のタイル１１１（複数の領域）に分割される。なお、このタイリング処理は、ＪＰＥＧ２０００アルゴリズムに基づいて実行される。
【００９０】
次に、分割された複数のタイル１１１に対してＯＣＲ処理を実行する（Ｓ１３）。すなわち、複数のタイル１１１毎の画像から文字を認識する。ここに、付加情報生成手段又は付加情報生成機能が実行される。なお、ＯＣＲ処理としては、パターン・マッチング法や構造解析法等があり、これらの方法は公知であるため、その説明は省略する。ここで、ＯＣＲ処理を実行することによって、文字コード、文字色、フォントサイズ、フォント情報、文字の座標位置情報、確信度情報（確からしさ情報）等の文字認識情報がタイル１１１毎に得られる。
【００９１】
次いで、原画像をＪＰＥＧ２０００アルゴリズムに基づいて圧縮する（Ｓ１４）。これにより、原画像からＪＰＥＧ２０００アルゴリズムに基づいてコードストリームが生成される。そして、コードストリームの所定の埋め込み位置に付加情報として文字認識情報を埋め込む（Ｓ１５）。ここに、付加情報埋込手段又は付加情報埋込機能が実行される。例えば、タイル１１１毎の付加情報は、対応するタイルパートヘッダに埋め込まれる。なお、ステップＳ１２でタイリング処理を実行しているので、ステップＳ１４でタイリング処理を実行する必要はない。
【００９２】
ここでは、文字認識情報を付加情報として所定の埋め込み位置に埋め込んでいるが、これに限るものではなく、例えば、傾き検出処理を実行した場合には、傾き検出結果の傾き角度補正情報等の傾き検出情報を所定の埋め込み位置に埋め込んでも良い。また、ステップＳ１４及びステップＳ１５を同時に実行するようにして、付加情報を埋め込みながら画像を圧縮するようにしても良い。
【００９３】
最後に、付加情報が埋め込まれた画像データ（圧縮データ）をＲＡＭ３２に画像ファイルとして格納する（Ｓ１６）。ここで、原稿が複数枚ある場合には、原稿毎にステップＳ１１からＳ１６までの処理が繰り返され、ＲＡＭ３２には原稿毎に複数の画像ファイルが保存される。
【００９４】
その後、ＲＡＭ３２に画像ファイルとして格納されている画像データは、例えば、所定のタイミングで通信制御部３６によりネットワークを介して外部装置に送信される場合がある。このとき、例えば、文字領域を有する画像の画像データだけを送信するために、ＲＡＭ３２に画像ファイルとして格納された複数の画像データから文字領域を有する画像の画像データを検索する場合には、ＣＰＵ３０は画像データ内の文字コードを検出すれば良く、簡単に検索することができる。
【００９５】
このように本実施の形態では、原画像をタイリング処理して複数のタイル１１１（複数の領域）に分割し（Ｓ１２）、分割した複数のタイル１１１に対してＯＣＲ処理を実行して文字認識情報を生成し（Ｓ１３）、生成した文字認識情報をコードストリームの所定の埋め込み位置に埋め込むことによって（Ｓ１５）、例えば、複数の画像データ（圧縮データ）から必要とする画像データを検索する場合等、付加情報を利用して簡単に検索することが可能となる。すなわち、必要に応じた付加情報を埋め込むことで、画像データの検索や管理、画像データに対する処理等を容易に行うことができる。また、原画像のタイル１１１毎に対応する付加情報、例えば文字認識情報等が得られるので、画像の中から必要とする領域、例えば文字領域を簡単に検索して抽出することができる。さらに、ＪＰＥＧ２０００アルゴリズムの圧縮手段及び伸長手段を用いることで、ＪＰＥＧ２０００の特性を活かした様々な画像処理を実行することができる。
【００９６】
なお、各実施の形態においては、画像処理装置として複合機１を用いているが、これに限るものではなく、例えば、パーソナルコンピュータ等を用いても良い。この場合、パーソナルコンピュータは、ＣＰＵ、ＲＯＭ、ＲＡＭ、各種のプログラムを記憶するＨＤＤ（ＨａｒｄＤｉｓｋＤｒｉｖｅ）、ＣＤ−ＲＯＭドライブ、スキャナ、ネットワークを介して外部装置と通信により情報を伝達するための通信制御装置、処理経過や結果等を操作者に表示する表示装置、キーボードやマウス等の入力装置等を備えている。ここで、ＨＤＤは、前述したような画像処理に関するプログラムを記憶する記憶媒体として機能する。
【００９７】
なお、一般的には、パーソナルコンピュータのＨＤＤにインストールされるプログラムは、ＣＤ−ＲＯＭやＤＶＤ−ＲＯＭ等の光情報記録メディアやＦＤ等の磁気メディア等に記録され、この記録されたプログラムがＨＤＤにインストールされる。このため、ＣＤ−ＲＯＭ等の光情報記録メディアやＦＤ等の磁気メディア等の可搬性を有する記憶媒体も、前述したような画像処理に関するプログラムを記憶する記憶媒体となり得る。さらには、このようなプログラムは、例えば通信制御装置を介して外部から取込まれ、ＨＤＤにインストールされても良い。
【００９８】
【発明の効果】
請求項１記載の発明によれば、原画像を２次元ウェーブレット変換、量子化及び符号化という手順で圧縮してコードストリームを生成する画像処理装置において、前記原画像を複数領域に分割する領域分割手段と、前記領域分割手段により分割された前記複数領域の画像から前記原画像に関連する付加情報を生成する付加情報生成手段と、前記付加情報生成手段により生成された付加情報を前記コードストリームに埋め込む付加情報埋込手段と、を備えることを特徴とすることから、原画像の画像データが付加情報を有し、例えば、複数の画像データから必要とする画像データを検索する場合等、付加情報を利用して簡単に検索することが可能となる。すなわち、必要に応じた付加情報を埋め込むことで、画像データの検索や管理、画像データに対する処理等を容易に行うことができる。
【００９９】
請求項２記載の発明によれば、請求項１記載の画像処理装置において、前記付加情報生成手段は、前記複数領域の画像から文字を認識することで前記付加情報として文字認識情報を生成することを特徴とすることから、この文字認識情報がコードストリームに埋め込まれ、例えば、複数の画像データから文字領域を有する画像データを検索する場合等、文字認識情報により簡単に検索することが可能となる。すなわち、文字認識情報を利用することで、画像データの検索や管理等を容易に行うことができる。
【０１００】
請求項３記載の発明によれば、請求項１又は２記載の画像処理装置において、前記付加情報生成手段は、前記複数領域の領域属性を識別することで前記付加情報として領域識別情報を生成することを特徴とすることから、この領域識別情報がコードストリームに埋め込まれ、例えば、複数の画像データから写真領域を有する画像データを検索する場合等、領域識別情報により簡単に検索することが可能となる。すなわち、領域識別情報を利用することで、画像データの検索や管理等を容易に行うことが可能になる。
【０１０１】
請求項４記載の発明によれば、請求項１、２又は３記載の画像処理装置において、前記付加情報生成手段は、前記複数領域の画像の傾きを検出することで前記付加情報として傾き検出情報を生成することを特徴とすることから、この傾き検出情報がコードストリームに埋め込まれ、例えば、画像を表示装置に表示させたり、用紙に印字させたりする場合等、傾き検出情報により簡単に画像の傾きを補正することが可能となる。すなわち、傾き検出情報を利用することで、画像データに対する処理等を容易に行うことができる。
【０１０２】
請求項５記載の発明によれば、請求項１ないし４のいずれか一記載の画像処理装置において、前記付加情報埋込手段は、前記コードストリームのメインヘッダに前記付加情報を埋め込むことを特徴とすることから、例えば、複数の画像データから必要とする画像データを検索する場合等、メインヘッダを読み取る早い段階でメインヘッダの付加情報を利用して検索することが可能となり、その結果として、処理時間を短縮することができる。
【０１０３】
請求項６記載の発明によれば、請求項１ないし４のいずれか一記載の画像処理装置において、前記付加情報埋込手段は、前記コードストリームのタイルパートヘッダに前記付加情報を埋め込むことを特徴とすることから、例えば、タイル毎（領域毎）の付加情報が対応するタイルパートヘッダに埋め込まれ、複数の画像データから必要とする画像データを検索する場合等、タイルパートヘッダの付加情報を利用して簡単に検索することができる。
【０１０４】
請求項７記載の発明によれば、請求項１ないし４のいずれか一記載の画像処理装置において、前記付加情報埋込手段は、前記コードストリームにおけるレイヤの最下位ビットに前記付加情報を埋め込むことを特徴とすることから、画像サイズを増加させることなく付加情報を埋め込むことができる。
【０１０５】
請求項８記載の発明によれば、請求項１ないし７のいずれか一記載の画像処理装置において、原稿から前記原画像を光学的に読み取る読取光学系を備えることを特徴とすることから、原稿から原画像を読み取ることが可能になり、その結果として、読み取った原画像に対し画像処理等の様々な処理を実行することができる。
【０１０６】
請求項９記載の発明によれば、請求項１ないし８のいずれか一記載の画像処理装置において、圧縮された前記原画像を復号化、逆量子化及び２次元ウェーブレット逆変換という手順で伸長する伸長手段を備えることを特徴とすることから、ＪＰＥＧ２０００アルゴリズムの伸長手段を用いることで、ＪＰＥＧ２０００アルゴリズムで圧縮された画像をＪＰＥＧ２０００の特性を活かして伸長することが可能となり、その結果として、伸長された画像の表示装置等への表示や用紙等への印字等を実行することができる。
【０１０７】
請求項１０記載の発明によれば、請求項９記載の画像処理装置において、前記伸長手段により伸長された画像を記録材に画像形成するプリンタエンジンを備えることを特徴とすることから、伸長された画像を用紙等の記録材に形成することができる。
【０１０８】
請求項１１記載の発明のプログラムは、原画像を２次元ウェーブレット変換、量子化及び符号化という手順で圧縮してコードストリームを生成する画像処理装置が備えるコンピュータに解釈され、前記コンピュータに、原画像を複数領域に分割する領域分割機能と、前記領域分割手段により分割された前記複数領域の画像から前記原画像に関連する付加情報を生成する付加情報生成機能と、前記付加情報生成手段により生成された付加情報を前記コードストリームに埋め込む付加情報埋込機能と、を実行させることから、原画像の画像データが付加情報を有し、例えば、複数の画像データから必要とする画像データを検索する場合等、付加情報を利用して簡単に検索することができる。すなわち、必要に応じた付加情報を埋め込むことで、画像データの検索や管理、画像データに対する処理等を容易に行うことができる。
【０１０９】
請求項１２記載の発明によれば、請求項１１記載のプログラムにおいて、前記付加情報生成機能は、前記複数領域の画像から文字を認識することで前記付加情報として文字認識情報を生成することから、この文字認識情報がコードストリームに埋め込まれ、例えば、複数の画像データから文字領域を有する画像データを検索する場合等、文字認識情報により簡単に検索することが可能となる。すなわち、文字認識情報を利用することで、画像データの検索や管理等を容易に行うことができる。
【０１１０】
請求項１３記載の発明によれば、請求項１１又は１２記載のプログラムにおいて、前記付加情報生成機能は、前記複数領域の領域属性を識別することで前記付加情報として領域識別情報を生成することから、この領域識別情報がコードストリームに埋め込まれ、例えば、複数の画像データから写真領域を有する画像データを検索する場合等、領域識別情報により簡単に検索することが可能となる。すなわち、領域識別情報を利用することで、画像データの検索や管理等を容易に行うことができる。
【０１１１】
請求項１４記載の発明によれば、請求項１１、１２又は１３記載のプログラムにおいて、前記付加情報生成機能は、前記複数領域の画像の傾きを検出することで前記付加情報として傾き検出情報を生成することから、この傾き検出情報がコードストリームに埋め込まれ、例えば、画像を表示装置に表示させたり、用紙に印字させたりする場合等、傾き検出情報により簡単に画像の傾きを補正することが可能となる。すなわち、傾き検出情報を利用することで、画像データに対する処理等を容易に行うことができる。
【０１１２】
請求項１５記載の発明によれば、請求項１１ないし１４のいずれか一記載のプログラムにおいて、前記付加情報埋込機能は、前記コードストリームのメインヘッダに前記付加情報を埋め込むことから、例えば、複数の画像データから必要とする画像データを検索する場合等、メインヘッダを読み取る早い段階でメインヘッダの付加情報を利用して検索することが可能となり、その結果として、処理時間を短縮することができる。
【０１１３】
請求項１６記載の発明によれば、請求項１１ないし１４のいずれか一記載のプログラムにおいて、前記付加情報埋込機能は、前記コードストリームのタイルパートヘッダに前記付加情報を埋め込むことから、例えば、タイル毎（領域毎）の付加情報が対応するタイルパートヘッダに埋め込まれ、複数の画像データから必要とする画像データを検索する場合等、タイルパートヘッダの付加情報を利用して簡単に検索することができる。
【０１１４】
請求項１７記載の発明によれば、請求項１１ないし１４のいずれか一記載のプログラムにおいて、前記付加情報埋込機能は、前記コードストリームにおけるレイヤの最下位ビットに前記付加情報を埋め込むことから、画像サイズを増加させることなく付加情報を埋め込むことができる。
【０１１５】
請求項１８記載の発明のコンピュータ読取可能な記憶媒体によれば、請求項１１ないし１７のいずれか一記載のプログラムを記憶していることから、請求項１１ないし１７のいずれか一記載の発明と同様な効果を奏する。
【図面の簡単な説明】
【図１】ＪＰＥＧ２０００アルゴリズムの概要を説明するための機能ブロック図である。
【図２】カラー画像である原画像の分割された各コンポーネントの一例を概略的に示す模式図である。
【図３】デコンポジションレベル数が３である場合の各デコンポジションレベルにおけるサブバンドを概略的に示す模式図である。
【図４】プレシンクトを示す説明図である。
【図５】ビットプレーンに順位付けする手順の一例を示す説明図である。
【図６】コードストリームの構造の一例を概略的に示す模式図である。
【図７】本発明の第一の実施の形態の複合機を概略的に示す縦断面図である。
【図８】複合機の制御系のうち、画像処理に関わる制御系の電気的な接続を概略的に示すブロック図である。
【図９】複合機における画像処理の概要を説明するための機能ブロック図である。
【図１０】本発明の第一の実施の形態の画像処理の流れを概略的に示すフローチャートである。
【図１１】本発明の第一の実施の形態の画像処理による付加情報の埋め込み位置を概略的に示す説明図である。
【図１２】本発明の第二の実施の形態の画像処理の流れを概略的に示すフローチャートである。
【符号の説明】
１画像処理装置（複合機）
７読取光学系
１７プリンタエンジン
３０コンピュータ（ＣＰＵ）
３１記憶媒体（ＲＯＭ）
[0001]
[Technical Field of the Invention]
The present invention relates to an image processing device, a program, and a storage medium.
[0002]
[Prior art]
In recent years, with the spread of image processing apparatuses such as scanners, digital cameras, personal computers, printers, copiers, and multifunction peripherals (MFPs), it has become commonplace to store digital image data in storage devices such as memories and hard disks, to save it on optical discs such as CD-ROMs, and to transmit it via the Internet or the like. Such image data is usually compressed before being stored in a storage device, on an optical disc, or the like.
[0003]
Recently, high-definition images can be easily obtained by various techniques, but the image data size of the high-definition images tends to be large, and handling of high-definition images has become difficult. At present, JPEG (Joint Photographic Experts Group) is most widely used as an image compression / decompression algorithm for facilitating the handling of such high-definition images. In recent years, DWT (discrete wavelet transform) has been increasingly used as a frequency transform in place of DCT (discrete cosine transform) adopted in JPEG. A representative example is JPEG2000, an image compression / decompression method succeeding JPEG, which became an international standard in 2001.
[0004]
Since such compressed image data is digital data, it can easily be transmitted via the Internet or the like and stored in a storage device; on the other hand, it is highly likely to be modified without the creator's permission. To prevent this, a method of embedding signature information for identifying the creator as additional information in an original image has been proposed (for example, see Patent Document 1).
[0005]
[Patent Document 1]
JP 2001-42768 A
[0006]
[Problems to be solved by the invention]
However, in the technique of Patent Document 1, although signature information can be embedded as additional information in an original image, the signature information is not additional information about the original image itself. It is therefore difficult to use the signature information for secondary purposes other than identifying the creator, such as image data retrieval. Further, even when additional information about the original image is embedded in the original image, the processing time required to use the additional information may be long depending on the embedding position.
[0007]
In addition, users want to be able to easily retrieve image data having a character area or a photograph area, or image data having a similar image, from a plurality of image data stored in a storage device. In the related art, for example, a user often searches for such image data by visually checking images displayed on a display device or printed on paper based on the image data.
[0008]
An object of the present invention is to provide an image processing apparatus, a program, and a storage medium that realize encoding suitable for search and management of image data, processing of image data, and the like.
[0009]
[Means for Solving the Problems]
The invention according to claim 1 is an image processing apparatus that compresses an original image by a procedure of two-dimensional wavelet transform, quantization, and encoding to generate a code stream, the apparatus comprising: an area dividing unit that divides the original image into a plurality of areas; an additional information generation unit that generates additional information related to the original image from the images of the plurality of areas divided by the area dividing unit; and an additional information embedding unit that embeds the additional information generated by the additional information generation unit in the code stream.
[0010]
Therefore, by generating additional information related to the original image and embedding the generated additional information in the code stream, the image data of the original image carries the additional information, so that, for example, when searching for required image data among a plurality of image data, the search can easily be performed using the additional information. That is, by embedding additional information as needed, it becomes possible to easily perform search and management of image data, processing of image data, and the like.
[0011]
According to a second aspect of the present invention, in the image processing apparatus according to the first aspect, the additional information generation unit generates character recognition information as the additional information by recognizing characters from the images of the plurality of areas.
[0012]
Therefore, by generating the character recognition information as additional information, the character recognition information is embedded in the code stream, so that, for example, when searching for image data having a character area among a plurality of image data, the search can easily be performed using the character recognition information. That is, by using the character recognition information, it is possible to easily search and manage image data.
[0013]
According to a third aspect of the present invention, in the image processing apparatus according to the first or second aspect, the additional information generation unit generates area identification information as the additional information by identifying area attributes of the plurality of areas.
[0014]
Therefore, by generating the area identification information as additional information, the area identification information is embedded in the code stream, so that, for example, when searching for image data having a photograph area among a plurality of image data, the search can easily be performed using the area identification information. That is, by using the area identification information, it is possible to easily search and manage the image data.
[0015]
According to a fourth aspect of the present invention, in the image processing apparatus according to the first, second or third aspect, the additional information generation unit generates tilt detection information as the additional information by detecting a tilt of the images of the plurality of areas.
[0016]
Therefore, by generating the tilt detection information as additional information, the tilt detection information is embedded in the code stream, so that, for example, when an image is displayed on a display device or printed on paper, the tilt of the image can easily be corrected using the tilt detection information. That is, by using the tilt detection information, it is possible to easily perform processing on image data.
[0017]
According to a fifth aspect of the present invention, in the image processing apparatus according to any one of the first to fourth aspects, the additional information embedding unit embeds the additional information in a main header of the code stream.
[0018]
Therefore, by embedding the additional information in the main header of the code stream, for example, when searching for required image data from a plurality of image data, a search is performed using the additional information of the main header at an early stage of reading the main header. As a result, the processing time can be reduced.
[0019]
According to a sixth aspect of the present invention, in the image processing apparatus according to any one of the first to fourth aspects, the additional information embedding unit embeds the additional information in a tile part header of the code stream.
[0020]
Therefore, the additional information is embedded in the tile part header of the code stream; for example, the additional information for each tile (each area) is embedded in the corresponding tile part header. When required image data is searched for among a plurality of image data, the search can be performed easily by using the additional information in the tile part header.
[0021]
According to a seventh aspect of the present invention, in the image processing apparatus according to any one of the first to fourth aspects, the additional information embedding unit embeds the additional information in a least significant bit of a layer in the code stream.
[0022]
Therefore, by embedding the additional information in the least significant bit of the layer in the code stream, it is possible to embed the additional information without increasing the image size.
[0023]
An eighth aspect of the present invention is the image processing apparatus according to any one of the first to seventh aspects, further comprising a reading optical system that optically reads the original image from a document.
[0024]
Therefore, the original image can be read from the document, and as a result, various processes such as image processing can be performed on the read original image.
[0025]
According to a ninth aspect of the present invention, the image processing apparatus according to any one of the first to eighth aspects further comprises decompression means for decompressing the compressed original image in a sequence of decoding, inverse quantization, and inverse two-dimensional wavelet transform.
[0026]
Therefore, by using decompression means based on the JPEG2000 algorithm, an image compressed by the JPEG2000 algorithm can be decompressed by utilizing the characteristics of JPEG2000, and as a result, operations such as displaying the decompressed image on a display device or printing it on paper can be executed.
[0027]
According to a tenth aspect of the present invention, there is provided the image processing apparatus according to the ninth aspect, further comprising a printer engine for forming an image decompressed by the decompression means on a recording material.
[0028]
Therefore, it is possible to form a decompressed image on a recording material such as paper.
[0029]
The program according to claim 11 is interpreted by a computer provided in an image processing apparatus that generates a code stream by compressing an original image by a procedure of two-dimensional wavelet transform, quantization, and encoding, and causes the computer to execute: an area dividing function of dividing the original image into a plurality of areas; an additional information generating function of generating additional information related to the original image from the images of the plurality of areas divided by the area dividing function; and an additional information embedding function of embedding the additional information generated by the additional information generating function into the code stream.
[0030]
Therefore, because additional information related to the original image is generated and embedded in the code stream, the image data of the original image carries the additional information. For example, when image data is searched, the search can be performed easily by using the additional information. That is, by embedding additional information as needed, it becomes possible to easily perform search and management of image data, processing of image data, and the like.
[0031]
According to a twelfth aspect of the present invention, in the program according to the eleventh aspect, the additional information generation function generates character recognition information as the additional information by recognizing characters from the images of the plurality of areas.
[0032]
Therefore, because the character recognition information is generated as the additional information, it is embedded in the code stream. For example, when image data having a character area is searched for among a plurality of image data, the search can be performed easily by using the character recognition information. That is, the character recognition information makes it easy to search and manage image data.
[0033]
The invention according to claim 13 is the program according to claim 11 or 12, wherein the additional information generation function generates area identification information as the additional information by identifying area attributes of the plurality of areas.
[0034]
Therefore, because the area identification information is generated as the additional information, it is embedded in the code stream. For example, when image data having a photograph area is searched for among a plurality of image data, the search can be performed more easily by using the area identification information. That is, the area identification information makes it easy to search and manage image data.
[0035]
According to a fourteenth aspect of the present invention, in the program according to the eleventh, twelfth, or thirteenth aspect, the additional information generation function generates tilt detection information as the additional information by detecting a tilt of an image in the plurality of regions.
[0036]
Therefore, because the tilt detection information is generated as the additional information, it is embedded in the code stream. For example, when an image is displayed on a display device or printed on paper, the tilt of the image can be corrected more easily by using the tilt detection information. That is, the tilt detection information makes it easy to perform processing on image data.
[0037]
The invention according to claim 15 is the program according to any one of claims 11 to 14, wherein the additional information embedding function embeds the additional information in a main header of the code stream.
[0038]
Therefore, by embedding the additional information in the main header of the code stream, for example, when searching for required image data from a plurality of image data, a search is performed using the additional information of the main header at an early stage of reading the main header. As a result, the processing time can be reduced.
[0039]
The invention according to claim 16 is the program according to any one of claims 11 to 14, wherein the additional information embedding function embeds the additional information in a tile part header of the code stream.
[0040]
Therefore, the additional information is embedded in the tile part header of the code stream; for example, the additional information for each tile (each area) is embedded in the corresponding tile part header. When required image data is searched for among a plurality of image data, the search can be performed easily by using the additional information in the tile part header.
[0041]
The invention according to claim 17 is the program according to any one of claims 11 to 14, wherein the additional information embedding function embeds the additional information in the least significant bit of a layer in the code stream.
[0042]
Therefore, by embedding the additional information in the least significant bit of the layer in the code stream, it is possible to embed the additional information without increasing the image size.
[0043]
A computer readable storage medium according to the invention of claim 18 stores the program according to any one of claims 11 to 17.
[0044]
Therefore, the same operation as the invention according to any one of claims 11 to 17 is achieved.
[0045]
BEST MODE FOR CARRYING OUT THE INVENTION
A first embodiment of the present invention will be described with reference to FIGS.
[0046]
In the present embodiment, the “JPEG2000 algorithm” is used. However, since the JPEG2000 algorithm itself is well known in various documents and gazettes, the details are omitted, and the outline is described.
[0047]
FIG. 1 is a functional block diagram for explaining the outline of the JPEG2000 algorithm.
The JPEG2000 algorithm includes a color space conversion / inverse conversion unit 100, a two-dimensional wavelet conversion / inverse conversion unit 101, a quantization / inverse quantization unit 102, an entropy encoding / decoding unit 103, and a tag processing unit 104.
[0048]
One of the features of JPEG2000 is that a two-dimensional discrete wavelet transform (DWT: Discrete Wavelet Transform) having an advantage of high image quality in a high compression area is used. Another major feature is that a functional block called a tag processing unit 104 for performing code formation is added at the last stage, and a code stream as code string data is generated and interpreted. JPEG2000 can realize various convenient functions by the code stream.
[0049]
Note that a color space conversion / inverse conversion unit 100 is often provided at the input / output part of the image. The color space conversion / inverse conversion unit 100 performs, for example, conversion from the RGB color system, composed of the R (red) / G (green) / B (blue) components of the primary color system, or from the YMC color system, composed of the Y (yellow) / M (magenta) / C (cyan) components of the complementary color system, to the YCrCb or YUV color system, and the inverse conversion.
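As an illustration of such a conversion, the following Python sketch implements the reversible component transform defined in JPEG2000 Part 1, which maps integer R, G, B samples to one luminance-like component and two color-difference components and back without loss. The text does not state which conversion the unit 100 actually uses, so the choice of this transform and the function names rct_forward and rct_inverse are assumptions made for illustration.

```python
def rct_forward(r, g, b):
    """Reversible component transform: integer RGB -> (Y, Cb, Cr)."""
    y = (r + 2 * g + b) // 4   # floor division keeps the transform integer-only
    cb = b - g
    cr = r - g
    return y, cb, cr


def rct_inverse(y, cb, cr):
    """Exact inverse of rct_forward."""
    g = y - (cb + cr) // 4     # floor((cb + cr) / 4) equals y - g
    r = cr + g
    b = cb + g
    return r, g, b


# Hypothetical sample pixel used only to show the round trip.
pixel = (200, 120, 30)
restored = rct_inverse(*rct_forward(*pixel))
```

Because only integer additions and floor divisions are used, the inverse recovers the original samples exactly, which is what a lossless path requires.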
[0050]
Hereinafter, the JPEG2000 algorithm, particularly the wavelet transform, will be described.
[0051]
FIG. 2 is a schematic diagram showing an example of the divided components of an original image that is a color image. In a color image, as shown in FIG. 2, each component 110 of the original image is generally separated according to, for example, the RGB primary color system. Further, each component 110 of the image is divided into tiles 111, which are rectangular areas (in the example of FIG. 2, each component 110 is divided into a total of 16 rectangular tiles 111, 4 × 4 vertically and horizontally). Each of these tiles 111 (for example, R00, R01, ..., R15 / G00, G01, ..., G15 / B00, B01, ..., B15) is the basic unit when the compression / decompression process is executed on the image data. Therefore, the compression / decompression operation of the image data is performed independently for each component 110 and for each tile 111.
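The tile division described above can be sketched in Python as follows. This is a minimal illustration, assuming that a component is held as a list of rows and that its width and height are exact multiples of the number of tiles per side; the function name split_into_tiles and the 8 × 8 sample component are hypothetical.

```python
def split_into_tiles(component, tiles_per_side=4):
    """Split a component (list of rows) into square-grid tiles in raster order."""
    height = len(component)
    width = len(component[0])
    tile_h = height // tiles_per_side
    tile_w = width // tiles_per_side
    tiles = []
    for ty in range(tiles_per_side):          # tile rows, top to bottom
        for tx in range(tiles_per_side):      # tile columns, left to right
            rows = component[ty * tile_h:(ty + 1) * tile_h]
            tiles.append([row[tx * tile_w:(tx + 1) * tile_w] for row in rows])
    return tiles


# Hypothetical 8 x 8 R component whose sample value encodes its position.
component_r = [[y * 8 + x for x in range(8)] for y in range(8)]
tiles_r = split_into_tiles(component_r)       # tiles_r[0] plays the role of R00
```

Each element of tiles_r can then be compressed or decompressed independently of the others, as the paragraph above states.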
[0052]
At the time of encoding image data (see FIG. 1), the data of each tile 111 of each component 110 is input to the color space conversion / inverse conversion unit 100 and subjected to color space conversion; the two-dimensional wavelet transform / inverse transform unit 101 then applies a two-dimensional wavelet transform (forward transform) and spatially divides the data into frequency bands.
[0053]
FIG. 3 is a schematic diagram showing the subbands at each decomposition level when the number of decomposition levels is three. The two-dimensional wavelet transform / inverse transform unit 101 performs a two-dimensional wavelet transform on the tile image (decomposition level 0 (120): 0LL) obtained by dividing the image into tiles, and separates it into the subbands (1LL, 1HL, 1LH, 1HH) indicated by decomposition level 1 (121). Subsequently, the unit performs a two-dimensional wavelet transform on the low-frequency component 1LL at this level, and separates it into the subbands (2LL, 2HL, 2LH, 2HH) indicated by decomposition level 2 (122). Similarly, it then performs the two-dimensional wavelet transform on the low-frequency component 2LL as well, and separates it into the subbands (3LL, 3HL, 3LH, 3HH) indicated by decomposition level 3 (123). In FIG. 3, the subbands to be encoded at each decomposition level are shown in gray. For example, when the number of decomposition levels is three, the gray subbands (3HL, 3LH, 3HH, 2HL, 2LH, 2HH, 1HL, 1LH, 1HH) are to be encoded, and the 3LL subband is not encoded.
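The recursive structure of the decomposition can be made concrete by listing the size of every subband. The following Python sketch is an illustration only: it assumes that the tile starts at an even coordinate, so that the low-pass band receives the rounded-up half of the samples, and the function name subband_sizes is hypothetical. The final LL band is listed as well so that the sizes account for the whole tile.

```python
def subband_sizes(width, height, levels):
    """Return {subband name: (width, height)} for a dyadic decomposition."""
    sizes = {}
    w, h = width, height
    for level in range(1, levels + 1):
        low_w, high_w = (w + 1) // 2, w // 2
        low_h, high_h = (h + 1) // 2, h // 2
        sizes[f"{level}HL"] = (high_w, low_h)   # high-pass horizontally, low-pass vertically
        sizes[f"{level}LH"] = (low_w, high_h)   # low-pass horizontally, high-pass vertically
        sizes[f"{level}HH"] = (high_w, high_h)
        w, h = low_w, low_h                     # the LL band is decomposed again
    sizes[f"{levels}LL"] = (w, h)
    return sizes


# A 16 x 16 tile with three decomposition levels, as in the example of FIG. 3.
sizes = subband_sizes(16, 16, 3)
```

Only the LL band of each level is passed to the next level, which is why each additional level quarters the area still to be decomposed.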
[0054]
Next, in the quantization / dequantization unit 102 (see FIG. 1), after bits to be encoded are determined in the specified encoding order, a context is generated from bits around the target bits. The wavelet coefficients after the quantization process are divided into non-overlapping rectangles called “precincts” for each subband. This was introduced to make efficient use of memory in the implementation. Here, FIG. 4 is an explanatory diagram showing a precinct. As shown in FIG. 4, one precinct is composed of three rectangular regions that spatially match. Further, each precinct is divided into non-overlapping rectangular "code blocks". This is a basic unit when performing entropy coding.
[0055]
Although the coefficient values after the wavelet transform can be quantized and encoded as they are, in JPEG2000, in order to increase the encoding efficiency, the coefficient values are decomposed into "bit planes", and the bit planes can be prioritized for each pixel or code block.
[0056]
Here, FIG. 5 is an explanatory diagram showing an example of a procedure for prioritizing bit planes. As shown in FIG. 5, in this example, the original image (32 × 32 pixels) is divided into four 16 × 16 pixel tiles, and the precinct size and the code block size at decomposition level 1 are 8 × 8 pixels and 4 × 4 pixels, respectively. The precincts and code blocks are numbered in raster order; in this example, the precincts are assigned numbers 0 to 3 and the code blocks are assigned numbers 0 to 3. Pixels outside the tile boundary are extended by mirroring, a wavelet transform is performed with the reversible (5, 3) filter, and the wavelet coefficient values of decomposition level 1 are obtained.
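The mirroring and the reversible (5, 3) filter can be sketched in one dimension as follows. The sketch uses the lifting form of the filter as defined in JPEG2000 Part 1 and, to stay short, is restricted to signals of even length; a two-dimensional transform would apply it along the rows and the columns of a tile. The function names mirror, forward_53 and inverse_53 and the sample signal are hypothetical.

```python
def mirror(i, n):
    """Map any index onto 0..n-1 by whole-sample symmetric extension."""
    if n == 1:
        return 0
    period = 2 * (n - 1)
    i %= period                      # Python's % is non-negative for positive period
    return period - i if i >= n else i


def forward_53(x):
    """One level of the reversible 5/3 lifting transform (even length only)."""
    n = len(x)
    if n % 2:
        raise ValueError("this sketch handles even-length signals only")
    half = n // 2
    # Predict step: high-pass coefficients from the odd samples.
    d = [x[2 * k + 1] - (x[2 * k] + x[mirror(2 * k + 2, n)]) // 2 for k in range(half)]
    # Update step: low-pass coefficients; mirroring implies d[-1] == d[0].
    s = [x[2 * k] + (d[max(k - 1, 0)] + d[k] + 2) // 4 for k in range(half)]
    return s, d


def inverse_53(s, d):
    """Exact inverse of forward_53."""
    half = len(s)
    n = 2 * half
    x = [0] * n
    for k in range(half):            # undo the update step (even samples)
        x[2 * k] = s[k] - (d[max(k - 1, 0)] + d[k] + 2) // 4
    for k in range(half):            # undo the predict step (odd samples)
        x[2 * k + 1] = d[k] + (x[2 * k] + x[mirror(2 * k + 2, n)]) // 2
    return x


signal = [12, 200, 37, 37, 0, 255, 90, 14]
low, high = forward_53(signal)
restored = inverse_53(low, high)
```

Since every lifting step adds or subtracts an integer that the decoder can recompute, the original samples are recovered exactly.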
[0057]
FIG. 5 also shows an explanatory diagram illustrating an example of a typical “layer” configuration concept for tile 0 / precinct 3 / code block 3. The converted code block is divided into subbands (1LL, 1HL, 1LH, 1HH), and each subband is assigned a wavelet coefficient value.
[0058]
The layer structure is easy to understand when the wavelet coefficient value is viewed from the horizontal direction (bit plane direction). One layer is composed of an arbitrary number of bit planes.
In this example, layers 0, 1, 2, and 3 are made up of 1, 3, 1, and 3 bit planes, respectively. A layer containing bit planes closer to the LSB (Least Significant Bit) is subjected to quantization first; conversely, a layer closer to the MSB (Most Significant Bit) remains unquantized until the end. Discarding layers starting from those close to the LSB is called truncation, and it allows fine control of the quantization rate.
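The effect of truncation on a single coefficient can be sketched as follows. The sketch assumes an 8-bit coefficient magnitude and assumes that layer 0 holds the bit plane nearest the MSB, so that layers are discarded from the highest layer number downward; the function name truncate_layers is hypothetical.

```python
LAYER_BITPLANES = [1, 3, 1, 3]   # bit planes per layer, from the example above


def truncate_layers(magnitude, layers_kept, layer_bitplanes=LAYER_BITPLANES):
    """Keep only the first `layers_kept` layers of a coefficient magnitude."""
    total_planes = sum(layer_bitplanes)
    kept_planes = sum(layer_bitplanes[:layers_kept])
    dropped = total_planes - kept_planes
    # Clearing the lowest `dropped` bits discards the layers nearest the LSB.
    return (magnitude >> dropped) << dropped


coefficient = 0b10110111
versions = [truncate_layers(coefficient, kept) for kept in range(5)]
```

Discarding layer 3 alone clears three bit planes, while discarding layers 2 and 3 clears four, which shows how the layer boundaries set the available quantization steps.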
[0059]
The entropy coding / decoding unit 103 (see FIG. 1) performs coding on each tile 111 of each component 110 by probability estimation from the context and the target bit. Thus, the encoding process is performed on all the components 110 of the image in units of the tiles 111.
[0060]
Finally, the tag processing unit 104 (see FIG. 1) combines all the coded data from the entropy coding / decoding unit 103 into one code stream (code string data) and adds tags to it. Here, FIG. 6 is a schematic diagram showing an example of the structure of the code stream. Tag information called a header (main header, tile part header) is added to the beginning of the code stream and to the beginning of each of the partial tiles constituting each tile 111, and is followed by the coded data (bit stream). Then, at the end of the code stream, tag information (end of codestream) is added again.
[0061]
On the other hand, at the time of decoding, image data is generated from the code stream of each tile 111 of each component 110, in the reverse of the encoding procedure. In this case, as shown in FIG. 1, the tag processing unit 104 interprets the tag information added to the code stream (code string data) input from the outside, decomposes the code stream into the code streams of the tiles 111 of the components 110, and a decoding process (decompression process) is performed on the code stream of each tile 111 of each component 110. At this time, the positions of the bits to be decoded are determined in the order based on the tag information in the code stream, and the quantization / inverse quantization unit 102 generates a context from the arrangement of the peripheral bits (already decoded) of the target bit position. Then, the entropy coding / decoding unit 103 generates a target bit by performing decoding by probability estimation from the context and the code stream, and writes it at the position of the target bit. Since the data decoded in this manner is spatially divided into frequency bands, the two-dimensional wavelet transform / inverse transform unit 101 performs an inverse two-dimensional wavelet transform on it, thereby restoring each tile 111 of each component 110 of the image data. The restored data is converted by the color space conversion / inverse conversion unit 100 into data of the original color system. Here, the decompression means is executed.
[0062]
Next, a configuration example of the multifunction peripheral 1, which is the image processing apparatus of the present embodiment, will be described. The multifunction peripheral 1 according to the present embodiment has multiple functions such as a copy function, a printer function, a scanner function, a facsimile function, and an image server function.
[0063]
FIG. 7 is a longitudinal sectional view schematically showing the multifunction peripheral 1 of the present embodiment. The MFP 1 includes a scanner 2 that is an image reading unit that reads a document image from a document, and a printer 3 that is an image forming unit that forms an image read by the scanner 2 on a recording material such as paper.
[0064]
A contact glass 5 on which a document (not shown) is placed is provided on the upper surface of the main body case 4 of the scanner 2. The original is placed with the original surface facing the contact glass 5. On the upper side of the contact glass 5, a document pressure plate 6 (which may be a so-called ADF) for pressing a document placed on the contact glass 5 is provided.
[0065]
Below the contact glass 5, a reading optical system 7 for optically reading a document image is provided. The reading optical system 7 includes a first traveling body 10 on which a light source 8 for emitting light and a mirror 9 are mounted, a second traveling body 13 on which two mirrors 11 and 12 are mounted, a CCD (Charge Coupled Device) image sensor 15 that receives, via an imaging lens 14, the light guided by the mirrors 9, 11, and 12, and the like. The CCD image sensor 15 functions as a photoelectric conversion element that generates photoelectric conversion data by photoelectrically converting the light reflected from the document and imaged on the CCD image sensor 15. The photoelectric conversion data is a voltage value whose magnitude corresponds to the intensity of the light reflected from the document. The first and second traveling bodies 10 and 13 are provided so as to be able to reciprocate along the contact glass 5, and during the document image reading operation described later, they are driven by a moving device such as a motor (not shown) to scan in the sub-scanning direction at a speed ratio of 2:1. As a result, exposure scanning of the document reading area by the reading optical system 7 is performed. In the present embodiment, a fixed-document type in which the reading optical system 7 performs the scanning is shown, but a moving-document type in which the reading optical system 7 is fixed in position and the document moves may be used instead.
[0066]
The printer 3 includes a recording material path 20 from a recording material holding unit 16 for holding a recording material such as a sheet of paper to an output unit 19 via an electrophotographic printer engine 17 and a fixing unit 18.
[0067]
The printer engine 17 uses a photoconductor 21, a charger 22, an exposure unit 23, a developing unit 24, a transfer unit 25, a cleaner 26, and the like to transfer a toner image, formed around the photoconductor 21 by electrophotography, onto a recording material, and the transferred toner image is fixed on the recording material by the fixing unit 18. In the present embodiment, the printer engine 17 forms an image by an electrophotographic method. However, the present invention is not limited to this, and an image may be formed by various other image forming methods such as an ink jet method, a sublimation thermal transfer method, or a direct thermal recording method.
[0068]
The MFP 1 is controlled by a control system including a plurality of microcomputers. FIG. 8 is a block diagram schematically showing the electrical connections of the part of the control system that relates to image processing. The control system includes a CPU 30, a ROM 31, a RAM 32, an operation panel 33, an IPU (Image Processing Unit) 34, an I / O port 35, a communication control unit 36, and the like, which are connected by a bus 37. The CPU 30 performs various calculations and centrally controls processing such as image processing. The ROM 31 stores various programs and fixed data related to processing executed by the CPU 30. The RAM 32 functions as a work area for the CPU 30 and also functions as a memory for temporarily storing image data (for example, image files). The operation panel 33 is provided with a display such as an LCD (Liquid Crystal Display), a plurality of operation keys including hard keys, a touch panel, and the like (none of which are shown), and functions as an operation unit. The IPU 34 includes hardware related to various types of image processing. The ROM 31 includes a nonvolatile memory such as an EEPROM or a flash memory, and the program stored in the ROM 31 can be rewritten, under the control of the CPU 30, with a program downloaded from an external device (not shown) via the I / O port 35. In the present embodiment, the ROM 31 functions as a storage medium for storing a program. The communication control unit 36 has a function of transmitting and receiving data between the MFP 1 and an external device (not shown) via a network or the like, and includes a facsimile modem function, a network control function for connecting to a public telephone network, a LAN (Local Area Network) control function, and the like.
[0069]
Next, an outline of image processing in the multifunction peripheral 1 of the present embodiment will be described. FIG. 9 is a functional block diagram for explaining this outline. The image processing of the multifunction peripheral 1 involves an area dividing unit 40 that divides the original image read by the scanner 2 into a plurality of areas; an additional information generation unit 41 that performs an area identification process for identifying the divided areas based on area attributes and a character recognition process for recognizing characters from an image; a compression unit 42 having each of the functional blocks described with reference to FIG. 1; and an additional information embedding unit 43 that embeds the area identification information generated by the area identification process, the character recognition information generated by the character recognition process, and the like as additional information at a predetermined embedding position of the code stream, which is the compressed data.
[0070]
As a character recognition process for recognizing characters from an image, for example, an OCR (Optical Character Recognition) process is used. Further, the additional information generation unit 41 may execute a tilt detection process for detecting a tilt angle of the original image read by the scanner 2. As a result, tilt detection information such as tilt angle correction information, which is a tilt detection result, is obtained. Therefore, the additional information embedding section 43 may embed the inclination detection information as additional information.
[0071]
The image processing of the multifunction device 1 basically performs OCR processing on image data corresponding to an area divided from the image data of the original image read by the scanner 2, and further converts the image data by the JPEG2000 algorithm. A code stream is generated by compression encoding. That is, the image is divided into one or a plurality of rectangular regions (tiles 111), and the pixel values are discrete wavelet-transformed for each of the rectangular regions and compression-coded hierarchically. At this time, the area identification information, character recognition information, inclination detection information, and the like are embedded at predetermined embedding positions in the code stream. The functions of the area dividing unit 40, the additional information generating unit 41, the compressing unit 42, the additional information embedding unit 43, and the like are executed by image processing performed by the CPU 30 based on a program stored in the ROM 31. However, the present invention is not limited to this, and may be executed, for example, by image processing performed by hardware by the IPU 34 or the like.
[0072]
Here, examples of the area attribute include various area attributes such as a character area, a photograph area, a figure area, a table area, a solid black area, and a background area. Note that the background area is a margin area corresponding to a margin of a document, but is not limited to this, and may be, for example, an area obtained by adding a space between lines to the margin area.
[0073]
On the other hand, the additional information includes, for example, character recognition information (character recognition result information) generated by the OCR process, area identification information (area identification result information) such as coordinate information and area attribute information generated by the area identification process, and tilt detection information (tilt detection result information) such as tilt angle correction information generated by the tilt detection process. The character recognition information includes a character code, a character color, a font size, font information, character coordinate position information, certainty information (reliability information), and a title (character code). The title is a character code obtained by extracting the title area through the area identification process and performing OCR processing on the title area. For example, a title such as “2002 meeting minutes.j2k” is embedded as a character code at a predetermined embedding position.
[0074]
Examples of the predetermined embedding position include headers such as the main header and the tile part header (see FIG. 6), the least significant bit of the image (embedding is possible without increasing the image size), and the least significant bit of a character area of the image. In addition, when the image size is not an integral multiple of the tile size, information is added to pad the image to an integral multiple of the tile size, and the additional information can also be embedded in this added portion.
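Embedding in the main header can be sketched with the comment (COM) marker segment that JPEG2000 Part 1 provides for free-form data. The text does not state which header field the embodiment uses, so the use of the COM segment is an assumption, and the helper names make_com_segment and read_main_header_comments are hypothetical. The toy stream below omits the mandatory SIZ, COD and QCD segments, so it illustrates only the layout and is not a decodable code stream.

```python
import struct

# Marker codes from JPEG2000 Part 1.
SOC, COM, SOT, SOD, EOC = 0xFF4F, 0xFF64, 0xFF90, 0xFF93, 0xFFD9


def make_com_segment(text):
    """Build a COM marker segment carrying text (Rcom = 1, ISO 8859-15)."""
    data = text.encode("iso-8859-15")
    # Lcom counts itself (2 bytes), Rcom (2 bytes) and the data, not the marker.
    return struct.pack(">HHH", COM, 4 + len(data), 1) + data


def read_main_header_comments(stream):
    """Collect COM texts from the main header, stopping at the first SOT."""
    if struct.unpack_from(">H", stream, 0)[0] != SOC:
        raise ValueError("not a code stream: SOC marker missing")
    pos, comments = 2, []
    while pos + 4 <= len(stream):
        marker, length = struct.unpack_from(">HH", stream, pos)
        if marker == SOT:            # the tile part header starts here
            break
        if marker == COM:
            comments.append(stream[pos + 6:pos + 2 + length].decode("iso-8859-15"))
        pos += 2 + length            # skip to the next marker segment
    return comments


# Toy layout: SOC, one COM segment, one tile part (SOT, SOD, data), EOC.
toy_stream = (
    struct.pack(">H", SOC)
    + make_com_segment("title=2002 meeting minutes.j2k")
    + struct.pack(">HH", SOT, 10) + bytes(8)
    + struct.pack(">H", SOD) + b"\x00\x01\x02"
    + struct.pack(">H", EOC)
)
found = read_main_header_comments(toy_stream)
```

The reader stops at the first SOT marker, so the additional information is available before any tile data is touched.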
[0075]
Next, image processing executed by the CPU 30 of the MFP 1 based on a program will be described. Here, for example, image processing for accumulating image data in order to use the MFP 1 as an image server will be described with reference to FIGS. 10 and 11. FIG. 10 is a flowchart schematically showing the flow of image processing according to the present embodiment, and FIG. 11 is an explanatory diagram schematically showing the embedding position of additional information by the image processing.
[0076]
As shown in FIG. 10, first, the process waits for reading of a document image by the scanner 2 (N in step S1). When the operator opens the document pressure plate 6 of the scanner 2, sets a document on the contact glass 5, closes the document pressure plate 6, and presses a copy start key on the operation panel 33, the scanner 2 performs a scanning operation of the reading optical system 7 and reads the original image from the document set on the contact glass 5.
[0077]
When an original image is read from a document by the scanner 2 (Y in S1), the read original image is divided into a plurality of areas, and the attributes of the areas mixed in the original image are identified based on area attributes such as a character area, a photograph area, a figure area, a table area, and a background area (S2). Here, the area dividing means or the area dividing function is executed, and the additional information generating means or the additional information generating function is executed. As the area identification method, a conventional method such as one using the density of black runs is sufficient; since such methods are well known, their description is omitted. By identifying the plurality of areas, area identification information such as coordinate information and area attribute information is obtained for each of the areas. Note that the image processing of the present embodiment is performed on the original image read by the scanner 2, but the present invention is not limited to this. For example, the image processing of the present embodiment may be performed on an original image received via a network.
[0078]
Next, an OCR process is performed on the plurality of divided areas (S3). That is, characters are recognized from the image of each of the areas. Here, the additional information generating means or the additional information generating function is executed. Note that OCR methods include a pattern matching method, a structure analysis method, and the like. These methods are known, and thus their description is omitted. By performing the OCR process on each of the areas, character recognition information such as the character code, the character color, the font size, the font information, the coordinate position information of the character, and the certainty information (reliability information) is obtained for each of the areas.
[0079]
Next, the original image is compressed based on the JPEG2000 algorithm (S4). As a result, a code stream is generated from the original image based on the JPEG2000 algorithm. Then, the area identification information and the character recognition information are embedded as additional information at a predetermined embedding position of the code stream (S5). Here, additional information embedding means or additional information embedding function is executed. For example, as shown in FIG. 11, the additional information is embedded in the least significant bit of the layer in the code stream. As a result, embedding can be performed without increasing the image size. The area R in which the additional information is embedded is an area corresponding to the character area of the original image.
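The least-significant-bit embedding of step S5 can be sketched as follows. The embodiment embeds into the least significant bit of a layer in the code stream; this sketch instead operates on a small array of non-negative integer coefficients standing in for the area R, which is a simplifying assumption, and the names embed_lsb, extract_lsb and region_r are hypothetical.

```python
def bytes_to_bits(payload):
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]


def region_positions(region):
    """List (row, column) positions of a (top, left, bottom, right) rectangle."""
    top, left, bottom, right = region
    return [(y, x) for y in range(top, bottom) for x in range(left, right)]


def embed_lsb(coeffs, region, payload):
    """Return a copy of coeffs with payload bits written into the LSBs of region."""
    bits = bytes_to_bits(payload)
    positions = region_positions(region)
    if len(bits) > len(positions):
        raise ValueError("region R is too small for the payload")
    out = [row[:] for row in coeffs]
    for bit, (y, x) in zip(bits, positions):
        out[y][x] = (out[y][x] & ~1) | bit     # replace only the lowest bit
    return out


def extract_lsb(coeffs, region, num_bytes):
    """Read num_bytes back from the LSBs of region."""
    positions = region_positions(region)[:num_bytes * 8]
    bits = [coeffs[y][x] & 1 for y, x in positions]
    data = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        data.append(value)
    return bytes(data)


# Hypothetical 8 x 8 coefficient array; region R covers rows 2..5 (32 positions).
coeffs = [[(3 * y + 5 * x) % 64 for x in range(8)] for y in range(8)]
region_r = (2, 0, 6, 8)
marked = embed_lsb(coeffs, region_r, b"OCR!")
recovered = extract_lsb(marked, region_r, 4)
```

The array keeps its dimensions and each value changes by at most one, which corresponds to the statement that the embedding does not increase the image size.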
[0080]
Here, the additional information is embedded in the least significant bit of the area R corresponding to the character area of the image. However, the present invention is not limited to this. For example, the additional information may simply be embedded in the least significant bit without restricting it to the area R. Further, although both the area identification information and the character recognition information are embedded as additional information at a predetermined embedding position, the present invention is not limited to this. For example, only one of the area identification information and the character recognition information may be embedded at a predetermined embedding position. When inclination detection processing is executed, inclination detection information, such as inclination angle correction information obtained from the inclination detection result, may be embedded at a predetermined embedding position. Alternatively, the original image may be compressed while the additional information is embedded, by executing steps S4 and S5 simultaneously. In addition, since the original image is compressed at a plurality of resolutions based on the JPEG2000 algorithm, OCR processing may be performed for each resolution and the resulting character recognition information embedded at a predetermined embedding position.
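The idea of carrying additional information in least significant bits without adding to the data size can be illustrated on a plain list of integers. This is a sketch under stated assumptions: the list stands in for the non-negative coefficient magnitudes belonging to the area R, and the function names are hypothetical. It does not reproduce the actual bit-plane and layer structure of a JPEG2000 code stream.

```python
from typing import List


def embed_bits_lsb(coefficients: List[int], payload: bytes) -> List[int]:
    """Overwrite the least significant bit of successive values with payload bits."""
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    if len(bits) > len(coefficients):
        raise ValueError("payload does not fit in the available values")
    out = list(coefficients)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit   # clear the LSB, then set it to the payload bit
    return out


def extract_bits_lsb(coefficients: List[int], num_bytes: int) -> bytes:
    """Read num_bytes back out of the least significant bits."""
    out = bytearray()
    for b in range(num_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (coefficients[b * 8 + i] & 1)
        out.append(value)
    return bytes(out)


coeffs = list(range(100, 140))          # stand-in for values in area R (assumed)
stego = embed_bits_lsb(coeffs, b"OCR")  # 3 bytes need 24 values
recovered = extract_bits_lsb(stego, 3)
```

The number of values is unchanged and each value moves by at most 1, which is the property the text relies on when it says the image size does not increase.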
[0081]
Finally, the image data (compressed data) in which the additional information is embedded is stored in the RAM 32 as an image file (S6). Here, when there are a plurality of originals, the processing from steps S1 to S6 is repeated for each original, and a plurality of image files are stored in the RAM 32 for each original.
[0082]
After that, the image data stored in the RAM 32 as an image file may be transmitted to an external device via a network by the communication control unit 36 at a predetermined timing, for example. At this time, for example, to transmit only image data of images having a character area, such image data must be found among the plurality of image data stored as image files in the RAM 32. The CPU 30 need only detect the character codes in the image data, so the search is easy.
Likewise, to transmit only image data of similar images, the CPU 30 can easily find such image data among the plurality of image data stored as image files in the RAM 32 by detecting the area identification information in the image data and determining, for each area, whether the area attribute matches that of the other image data.
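The two searches described above can be sketched as simple filters over additional information that has already been read back from each stored file. The dictionary layout, file names, and function names below are hypothetical, and "similar" is reduced to "same sequence of area attributes", which is only one possible reading of the attribute comparison.

```python
from typing import Callable, Dict, List

AdditionalInfo = List[dict]  # one dict per area, e.g. {"attribute": "character"}


def has_character_region(info: AdditionalInfo) -> bool:
    """True if any area of the image is identified as a character area."""
    return any(area["attribute"] == "character" for area in info)


def same_layout(a: AdditionalInfo, b: AdditionalInfo) -> bool:
    """True if the area attributes match area by area."""
    return [area["attribute"] for area in a] == [area["attribute"] for area in b]


def search(files: Dict[str, AdditionalInfo],
           predicate: Callable[[AdditionalInfo], bool]) -> List[str]:
    """Return the names of stored image files whose additional info satisfies predicate."""
    return [name for name, info in files.items() if predicate(info)]


def similar_to(files: Dict[str, AdditionalInfo], query: str) -> List[str]:
    """Return the other files whose area attributes match those of the query file."""
    return [name for name, info in files.items()
            if name != query and same_layout(info, files[query])]


stored = {
    "a.jp2": [{"attribute": "character"}, {"attribute": "photograph"}],
    "b.jp2": [{"attribute": "photograph"}],
    "c.jp2": [{"attribute": "character"}, {"attribute": "photograph"}],
}
with_text = search(stored, has_character_region)
like_a = similar_to(stored, "a.jp2")
```

Neither filter needs to decode pixel data, which is why the search is described as easy.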
[0083]
The image of the image data may be subjected to image processing such as scaling (enlargement and reduction), rotation, and black-and-white inversion. In such image processing, the character image in the character area M can be scaled by changing the font size of the character area M. In addition, because the compression means of the JPEG2000 algorithm holds the image at various resolutions, the image can be freely changed from high image quality to low image quality. Further, when the original image is displayed on a display device or the like, the image can be decompressed at a resolution suited to that of the display device or the like.
[0084]
As described above, in the present embodiment, the original image is divided into a plurality of regions and the attributes of the regions are identified to generate region identification information (S2), OCR processing is performed on the divided regions to generate character recognition information (S3), and the generated region identification information and character recognition information are embedded at predetermined embedding positions of the code stream (S5). Consequently, when, for example, required image data is searched for among a plurality of image data (compressed data), the search can easily be performed using the additional information. That is, by embedding the additional information as needed, search and management of image data, processing of image data, and the like can be easily performed. In addition, since additional information corresponding to each of the plurality of regions of the original image, such as region identification information and character recognition information, is obtained, a necessary region, such as a character region or a photograph region, can easily be searched for in the image and extracted. Further, by using the compression means and the decompression means of the JPEG2000 algorithm, it is possible to execute various image processing utilizing the characteristics of JPEG2000.
[0085]
A second embodiment of the present invention will be described with reference to FIG. FIG. 12 is a flowchart schematically showing the flow of image processing according to the present embodiment. The same parts as those described above are denoted by the same reference numerals.
[0086]
The basic configuration of this embodiment is the same as that of the first embodiment. The difference from the first embodiment lies in the image processing that the CPU 30 of the MFP 1 executes based on the program.
[0087]
Image processing executed by the CPU 30 of the MFP 1 according to the present embodiment based on a program will be described. Here, for example, image processing for accumulating image data in order to use the MFP 1 as an image server will be described.
[0088]
As shown in FIG. 12, first, the process waits for reading of a document image by the scanner 2 (N in step S11). When the operator opens the document pressing plate 6 of the scanner 2, sets a document on the contact glass 5, closes the document pressing plate 6, and presses a copy start key on the operation panel 33, the scanner 2 performs a scanning operation of the reading optical system 7 to read the original image from the document set on the contact glass 5.
[0089]
When an original image is read from a document by the scanner 2 (Y in S11), the read original image is divided into a plurality of tiles 111 (a plurality of areas: see FIG. 1) by tiling processing (S12). Here, the area dividing means or the area dividing function is executed.
This tiling process is performed based on the JPEG2000 algorithm.
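Tiling splits the image into non-overlapping rectangles on a regular grid. The sketch below computes those rectangles; the function name, the tile size, and the assumption that the grid starts at the top-left corner of the image are illustrative simplifications and are not taken from this description.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height)


def tile_grid(width: int, height: int, tile_w: int, tile_h: int) -> List[Rect]:
    """Return tile rectangles in raster order; tiles on the right and bottom edges may be smaller."""
    tiles = []
    for ty in range(0, height, tile_h):
        for tx in range(0, width, tile_w):
            tiles.append((tx, ty,
                          min(tile_w, width - tx),
                          min(tile_h, height - ty)))
    return tiles


tiles = tile_grid(300, 200, 128, 128)  # 3 columns x 2 rows
```

Each rectangle can then be handed to the OCR step independently, which is what allows additional information to be kept per tile.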
[0090]
Next, an OCR process is performed on the plurality of divided tiles 111 (S13). That is, characters are recognized from the image of each of the plurality of tiles 111. Here, the additional information generating means or the additional information generating function is executed. Note that the OCR processing includes a pattern matching method, a structure analysis method, and the like. These methods are known, and thus description thereof is omitted. Here, by executing the OCR process, character recognition information such as a character code, a character color, a font size, font information, character coordinate position information, and certainty information (reliability information) can be obtained for each tile 111.
[0091]
Next, the original image is compressed based on the JPEG2000 algorithm (S14). As a result, a code stream is generated from the original image based on the JPEG2000 algorithm. Then, character recognition information is embedded as additional information at a predetermined embedding position of the code stream (S15). Here, additional information embedding means or additional information embedding function is executed. For example, additional information for each tile 111 is embedded in a corresponding tile part header. Since the tiling process is executed in step S12, it is not necessary to execute the tiling process in step S14.
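One way to place per-tile additional information in a tile part header is a comment (COM) marker segment, which the JPEG2000 code stream syntax permits in both the main header and tile part headers. The text above does not say which marker segment is used, so choosing COM is an assumption of this sketch, and the function name and payload are hypothetical.

```python
import struct

COM_MARKER = 0xFF64  # JPEG2000 comment marker


def build_com_segment(payload: bytes, is_text: bool = False) -> bytes:
    """Build a COM marker segment: marker, Lcom, Rcom, then the data bytes.

    Lcom counts itself (2 bytes), Rcom (2 bytes), and the data, but not the marker.
    Rcom is 0 for binary data and 1 for Latin (ISO 8859-15) text.
    """
    length = 2 + 2 + len(payload)
    if length > 0xFFFF:
        raise ValueError("payload too large for a single COM segment")
    rcom = 1 if is_text else 0
    return struct.pack(">HHH", COM_MARKER, length, rcom) + payload


segment = build_com_segment(b"tile0:ABC")
```

Unlike least-significant-bit embedding, this approach adds bytes to the code stream, but a decoder that does not use the comment can skip the segment by its length field.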
[0092]
Here, the character recognition information is embedded at a predetermined embedding position as additional information. However, the present invention is not limited to this. For example, when inclination detection processing is executed, inclination detection information, such as inclination angle correction information obtained from the inclination detection result, may be embedded at a predetermined embedding position. Alternatively, the image may be compressed while the additional information is embedded, by executing step S14 and step S15 simultaneously.
[0093]
Finally, the image data (compressed data) in which the additional information is embedded is stored as an image file in the RAM 32 (S16). Here, when there are a plurality of originals, the processing from steps S11 to S16 is repeated for each original, and a plurality of image files are stored in the RAM 32 for each original.
[0094]
After that, the image data stored in the RAM 32 as an image file may be transmitted to an external device via a network by the communication control unit 36 at a predetermined timing, for example. At this time, for example, to transmit only image data of images having a character area, such image data must be found among the plurality of image data stored as image files in the RAM 32. The CPU 30 need only detect the character codes in the image data, so the search is easy.
[0095]
As described above, in the present embodiment, the original image is divided into a plurality of tiles 111 (a plurality of areas) by tiling processing (S12), OCR processing is performed on the divided tiles 111 to generate character recognition information (S13), and the generated character recognition information is embedded at a predetermined embedding position of the code stream (S15). Consequently, when, for example, required image data is searched for among a plurality of image data (compressed data), the search can easily be performed using the additional information. That is, by embedding the additional information as needed, search and management of image data, processing of image data, and the like can be easily performed. Further, since additional information corresponding to each tile 111 of the original image, for example character recognition information, is obtained, a necessary area, for example a character area, can easily be searched for in the image and extracted. Further, by using the compression means and the decompression means of the JPEG2000 algorithm, it is possible to execute various image processing utilizing the characteristics of JPEG2000.
[0096]
In each of the embodiments, the MFP 1 is used as an image processing apparatus. However, the present invention is not limited to this. For example, a personal computer or the like may be used. In this case, the personal computer includes a CPU, a ROM, a RAM, an HDD (Hard Disk Drive) storing various programs, a CD-ROM drive, a scanner, a communication control device for transmitting information by communication with an external device via a network, a display device for displaying the progress and results of processing to the operator, and input devices such as a keyboard and a mouse. Here, the HDD functions as a storage medium for storing a program related to the image processing as described above.
[0097]
Generally, a program installed in the HDD of a personal computer is recorded on an optical information recording medium such as a CD-ROM or a DVD-ROM, or on a magnetic medium such as an FD, and the recorded program is installed in the HDD from that medium. Therefore, a portable storage medium such as an optical information recording medium such as a CD-ROM or a magnetic medium such as an FD can also be a storage medium for storing the above-described program related to image processing. Further, such a program may be fetched from outside via a communication control device, for example, and installed in the HDD.
[0098]
[Effects of the invention]
According to the first aspect of the present invention, an image processing apparatus that generates a code stream by compressing an original image in a procedure of two-dimensional wavelet transform, quantization, and encoding comprises area dividing means for dividing the original image into a plurality of areas, additional information generating means for generating additional information related to the original image from the images of the plurality of areas divided by the area dividing means, and additional information embedding means for embedding the additional information generated by the additional information generating means in the code stream. Because the image data of the original image thus carries additional information, required image data can easily be found among a plurality of image data by using the additional information, for example. That is, by embedding the additional information as needed, search and management of image data, processing of image data, and the like can be easily performed.
[0099]
According to the second aspect of the present invention, in the image processing apparatus according to the first aspect, the additional information generating unit generates character recognition information as the additional information by recognizing characters from the images of the plurality of regions. Since the character recognition information is embedded in the code stream, image data having a character area can easily be found among a plurality of image data by using the character recognition information, for example. That is, by using the character recognition information, search and management of image data can be easily performed.
[0100]
According to the third aspect of the present invention, in the image processing apparatus according to the first or second aspect, the additional information generation unit generates area identification information as the additional information by identifying area attributes of the plurality of areas. Since the area identification information is embedded in the code stream, image data having a photograph area can easily be found among a plurality of image data by using the area identification information, for example. That is, by using the area identification information, it is possible to easily search and manage the image data.
[0101]
According to a fourth aspect of the present invention, in the image processing apparatus according to the first, second or third aspect, the additional information generating unit generates inclination detection information as the additional information by detecting the inclination of the images of the plurality of regions. Since the inclination detection information is embedded in the code stream, the inclination of the image can easily be corrected based on the inclination detection information when, for example, the image is displayed on a display device or printed on paper. That is, by using the inclination detection information, it is possible to easily perform processing on image data.
[0102]
According to a fifth aspect of the present invention, in the image processing apparatus according to any one of the first to fourth aspects, the additional information embedding unit embeds the additional information in a main header of the code stream. Therefore, for example, when searching for required image data among a plurality of image data, the search can use the additional information as soon as the main header has been read, and as a result the processing time can be reduced.
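The time saving comes from being able to stop reading once the main header ends. The sketch below walks the marker segments that follow the SOC marker and stops at the first SOT marker, collecting the data of any COM segments on the way. Treating COM segments as the carrier of the additional information is an assumption, the function name is hypothetical, and the byte string used here is a toy stand-in rather than a valid code stream.

```python
import struct
from typing import List

SOC, SOT, COM = 0xFF4F, 0xFF90, 0xFF64  # start of code stream, start of tile part, comment


def main_header_comments(codestream: bytes) -> List[bytes]:
    """Collect COM segment data from the main header without touching tile data."""
    if struct.unpack_from(">H", codestream, 0)[0] != SOC:
        raise ValueError("code stream does not start with SOC")
    pos = 2
    comments = []
    while pos + 4 <= len(codestream):
        marker, length = struct.unpack_from(">HH", codestream, pos)
        if marker == SOT:          # the main header ends where the first tile part begins
            break
        if marker == COM:          # skip the 2-byte Rcom field, keep the data
            comments.append(codestream[pos + 6: pos + 2 + length])
        pos += 2 + length          # the length field excludes the marker itself
    return comments


# Toy byte string: SOC, a dummy segment, one COM segment, then an SOT segment.
toy = (struct.pack(">H", SOC)
       + struct.pack(">HH", 0xFF51, 4) + b"\x00\x00"
       + struct.pack(">HHH", COM, 9, 0) + b"hello"
       + struct.pack(">HH", SOT, 10) + bytes(8))
found = main_header_comments(toy)
```

Only the bytes before the first SOT marker are examined, so the cost of the search does not depend on the size of the compressed image data that follows.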
[0103]
According to the invention described in claim 6, in the image processing apparatus according to any one of claims 1 to 4, the additional information embedding unit embeds the additional information in a tile part header of the code stream. Therefore, for example, the additional information of each tile (each area) is embedded in the corresponding tile part header, and when required image data is searched for among a plurality of image data, the search can easily be performed using the additional information of the tile part headers.
[0104]
According to a seventh aspect of the present invention, in the image processing apparatus according to any one of the first to fourth aspects, the additional information embedding unit embeds the additional information in a least significant bit of a layer in the code stream, so that the additional information can be embedded without increasing the image size.
[0105]
According to an eighth aspect of the present invention, the image processing apparatus according to any one of the first to seventh aspects further comprises a reading optical system for optically reading the original image from a document. The original image can therefore be read from the document, and as a result various processes such as image processing can be executed on the read original image.
[0106]
According to a ninth aspect of the present invention, the image processing apparatus according to any one of the first to eighth aspects further comprises expansion means for decompressing the compressed original image in a procedure of decoding, inverse quantization, and inverse two-dimensional wavelet transform. By using the decompression means of the JPEG2000 algorithm, an image compressed by the JPEG2000 algorithm can be decompressed in a way that utilizes the characteristics of JPEG2000, and the image can then be displayed on a display device or the like, or printed on paper or the like.
[0107]
According to a tenth aspect of the present invention, the image processing apparatus of the ninth aspect is provided with a printer engine for forming the image expanded by the expansion means on a recording material, so that an image can be formed on a recording material such as paper.
[0108]
The program according to claim 11 is interpreted by a computer provided in an image processing apparatus that generates a code stream by compressing an original image by a procedure of two-dimensional wavelet transform, quantization, and encoding, and causes the computer to execute an area division function of dividing the original image into a plurality of areas, an additional information generating function of generating additional information related to the original image from the images of the plurality of areas divided by the area division function, and an additional information embedding function of embedding the additional information generated by the additional information generating function in the code stream. Because the image data of the original image thus carries additional information, required image data can easily be found among a plurality of image data by using the additional information, for example. That is, by embedding the additional information as needed, search and management of image data, processing of image data, and the like can be easily performed.
[0109]
According to the twelfth aspect of the present invention, in the program according to the eleventh aspect, the additional information generation function generates character recognition information as the additional information by recognizing characters from the images of the plurality of regions. This character recognition information is embedded in the code stream. For example, when searching for image data having a character area from a plurality of image data, it is possible to easily search using the character recognition information. That is, by using the character recognition information, search and management of image data can be easily performed.
[0110]
According to a thirteenth aspect of the present invention, in the program according to the eleventh or twelfth aspect, the additional information generation function generates area identification information as the additional information by identifying area attributes of the plurality of areas. The area identification information is embedded in the code stream. For example, when searching for image data having a photographic area from a plurality of pieces of image data, it is possible to easily search using the area identification information. That is, by using the area identification information, search and management of image data can be easily performed.
[0111]
According to a fourteenth aspect of the present invention, in the program according to the eleventh, twelfth or thirteenth aspect, the additional information generation function generates tilt detection information as the additional information by detecting a tilt of an image in the plurality of regions. Since the tilt detection information is embedded in the code stream, the tilt of the image can easily be corrected using the tilt detection information when, for example, the image is displayed on a display device or printed on paper. That is, by using the tilt detection information, it is possible to easily perform processing on image data.
[0112]
According to the invention described in claim 15, in the program according to any one of claims 11 to 14, the additional information embedding function embeds the additional information in a main header of the code stream. Therefore, for example, when searching for required image data among a plurality of image data, the search can use the additional information as soon as the main header has been read, and as a result the processing time can be reduced.
[0113]
According to the invention of claim 16, in the program according to any one of claims 11 to 14, the additional information embedding function embeds the additional information in a tile part header of the code stream. Therefore, the additional information of each tile (each area) is embedded in the corresponding tile part header, and when required image data is searched for among a plurality of image data, the search can easily be performed using the additional information of the tile part headers.
[0114]
According to the invention described in claim 17, in the program according to any one of claims 11 to 14, the additional information embedding function embeds the additional information in a least significant bit of a layer in the code stream. Additional information can be embedded without increasing the image size.
[0115]
According to the computer-readable storage medium of the invention of claim 18, the program of any one of claims 11 to 17 is stored, so that the same effects as those of the invention of any one of claims 11 to 17 are achieved.
[Brief description of the drawings]
FIG. 1 is a functional block diagram for explaining an outline of a JPEG2000 algorithm.
FIG. 2 is a schematic diagram schematically showing an example of each divided component of an original image that is a color image.
FIG. 3 is a schematic diagram schematically showing subbands at each decomposition level when the number of decomposition levels is three.
FIG. 4 is an explanatory diagram showing a precinct.
FIG. 5 is an explanatory diagram showing an example of a procedure for ranking bit planes.
FIG. 6 is a schematic diagram schematically showing an example of a structure of a code stream.
FIG. 7 is a longitudinal sectional view schematically showing the multifunction peripheral according to the first embodiment of the present invention.
FIG. 8 is a block diagram schematically showing an electrical connection of a control system related to image processing in a control system of the multifunction peripheral.
FIG. 9 is a functional block diagram for describing an outline of image processing in the multifunction peripheral.
FIG. 10 is a flowchart schematically showing a flow of image processing according to the first embodiment of the present invention.
FIG. 11 is an explanatory diagram schematically showing an embedding position of additional information by image processing according to the first embodiment of the present invention.
FIG. 12 is a flowchart schematically showing a flow of image processing according to the second embodiment of the present invention.
[Explanation of symbols]
1 Image processing device (compound machine)
7 Reading optical system
17 Printer Engine
30 Computer (CPU)
31 Storage media (ROM)

Claims

1. An image processing apparatus that generates a code stream by compressing an original image by a procedure of two-dimensional wavelet transform, quantization, and encoding, the image processing apparatus comprising:
area dividing means for dividing the original image into a plurality of areas;
additional information generating means for generating additional information related to the original image from the images of the plurality of areas divided by the area dividing means; and
additional information embedding means for embedding the additional information generated by the additional information generating means in the code stream.

2. The image processing apparatus according to claim 1, wherein the additional information generation unit generates character recognition information as the additional information by recognizing characters from the images of the plurality of areas.

3. The image processing apparatus according to claim 1, wherein the additional information generation unit generates area identification information as the additional information by identifying area attributes of the plurality of areas.

4. The image processing apparatus according to claim 1, wherein the additional information generation unit generates tilt detection information as the additional information by detecting a tilt of an image in the plurality of regions.

5. The image processing apparatus according to claim 1, wherein the additional information embedding unit embeds the additional information in a main header of the code stream.

6. The image processing apparatus according to claim 1, wherein the additional information embedding unit embeds the additional information in a tile part header of the code stream.

7. The image processing apparatus according to claim 1, wherein the additional information embedding unit embeds the additional information in a least significant bit of a layer in the code stream.

8. The image processing apparatus according to claim 1, further comprising a reading optical system that optically reads the original image from a document.

9. The image processing apparatus according to claim 1, further comprising an expansion unit configured to expand the compressed original image in a sequence of decoding, inverse quantization, and two-dimensional inverse wavelet transform.

10. The image processing apparatus according to claim 9, further comprising a printer engine configured to form an image expanded by the expansion unit on a recording material.

11. A program that is interpreted by a computer provided in an image processing apparatus that generates a code stream by compressing an original image in a procedure of two-dimensional wavelet transform, quantization, and encoding, the program causing the computer to execute:
an area division function of dividing the original image into a plurality of areas;
an additional information generation function of generating additional information related to the original image from the images of the plurality of areas divided by the area division function; and
an additional information embedding function of embedding the additional information generated by the additional information generation function in the code stream.

12. The program according to claim 11, wherein the additional information generation function generates character recognition information as the additional information by recognizing characters from the images of the plurality of regions.

13. The program according to claim 11, wherein the additional information generation function generates area identification information as the additional information by identifying area attributes of the plurality of areas.

14. The program according to claim 11, wherein the additional information generation function generates tilt detection information as the additional information by detecting a tilt of an image in the plurality of regions.

15. The program according to claim 11, wherein the additional information embedding function embeds the additional information in a main header of the code stream.

16. The program according to claim 11, wherein the additional information embedding function embeds the additional information in a tile part header of the code stream.

17. The program according to claim 11, wherein the additional information embedding function embeds the additional information in a least significant bit of a layer in the code stream.

18. A computer-readable storage medium storing the program according to any one of claims 11 to 17.