JP4194373B2 - Image processing apparatus, program, and storage medium

Info

Publication number: JP4194373B2
Application number: JP2003003546A
Authority: JP
Inventors: 利夫宮澤; 泰之野水; 宏幸作山; 潤一原; 熱河松浦; 隆則矢野; 児玉　　卓; 康行新海; 隆之西村
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2003-01-09
Filing date: 2003-01-09
Publication date: 2008-12-10
Anticipated expiration: 2023-01-09
Also published as: JP2004221679A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像処理装置、プログラム及び記憶媒体に関する。
【０００２】
【従来の技術】
近年、スキャナ、デジタルカメラ、パーソナルコンピュータ、プリンタ、複写機、複合機（ＭＦＰ）等の画像処理装置の普及に伴い、デジタル画像データをメモリやハードディスク等の記憶装置に保存したり、ＣＤ−ＲＯＭ等の光ディスクに保存したり、さらには、インターネット等を介して伝送したりすることが身近なものになっている。このような画像データは、通常、圧縮されて記憶装置や光ディスク等に保存されることが多い。
【０００３】
最近では、様々な技術により簡単に高精細画像を得ることができるが、高精細画像の画像データサイズは大きくなる傾向にあり、高精細画像の取扱いは困難になってきている。こうした高精細画像の取扱いを容易にする画像圧縮伸長アルゴリズムとしては、現在、JPEG（Joint Photographic Experts Group）が最も広く用いられている。また、このJPEGで採用されているＤＣＴ（離散コサイン変換）に代わる周波数変換として、近年、ＤＷＴ（離散ウェーブレット変換）の採用が増加している。その代表例は、２００１年に国際標準となったJPEG後継の画像圧縮伸長方式JPEG2000である。
【０００４】
このような圧縮された画像データはデジタルデータであるため、インターネット等を介する伝送や記憶装置への保存等を容易にするが、一方で、作成者に無断で改変される可能性が高いものである。これを防ぐため、作成者を特定するための署名情報を付加情報として原画像に埋め込む方法が提案されている（例えば、特許文献１参照）。
【０００５】
【特許文献１】
特開２００１−４２７６８公報
【０００６】
【発明が解決しようとする課題】
しかしながら、特許文献１の技術では、原画像に付加情報として署名情報を埋め込むことはできるが、署名情報は原画像に関する付加情報ではないため、作成者を特定するため以外、例えば画像データ検索等の二次的な利用に署名情報を用いることは難しい。また、原画像に関する付加情報を原画像に埋め込んだ場合でも、その埋め込み位置によっては、付加情報を利用する際の処理時間が長くなる場合がある。
【０００７】
また、ユーザは、記憶装置に格納された複数の画像データから文字領域又は写真領域を有する画像データや類似した画像を有する画像データを容易に検索できることを要望している。例えば、従来の技術においては、ユーザが、画像データに基づいて表示装置等に表示された画像や用紙等に印字された画像等を確認することで、文字領域又は写真領域を有する画像データや類似した画像を有する画像データ等を検索する場合が多い。
【０００８】
本発明の目的は、画像データの検索や管理、画像データに対する処理等に好適な符号化を実現する画像処理装置、プログラム及び記憶媒体を提供することである。
【０００９】
【課題を解決するための手段】
本発明の画像処理装置は、原画像を矩形領域に分割し、該矩形領域毎に２次元ウェーブレット変換、量子化及び符号化という手順で圧縮して、前記矩形領域毎の符号化データを有するコードストリームを生成する画像処理装置において、前記原画像を複数の矩形領域に分割する領域分割手段と、前記領域分割手段により分割された矩形領域毎に該矩形領域の画像に関連する検索のための付加情報を生成する付加情報生成手段と、前記付加情報生成手段により生成された付加情報を前記コードストリームにおける該付加情報が対応する矩形領域の符号化データに埋め込む付加情報埋込手段と、を備えることを特徴とする。
【００１０】
したがって、原画像に関連する付加情報を生成し、生成した付加情報をコードストリームに埋め込むことによって、原画像の画像データが付加情報を有し、例えば、複数の画像データから必要とする画像データを検索する場合等、付加情報を利用して簡単に検索することが可能となるコードストリームを生成する。すなわち、必要に応じた付加情報を埋め込むことで、画像データの検索や管理、画像データに対する処理等を容易に行うことが可能なコードストリームを生成する。
【００４５】
【発明の実施の形態】
本発明の第一の実施の形態を図１ないし図１１に基づいて説明する。
【００４６】
本実施の形態は、「JPEG2000アルゴリズム」を利用するものであるが、JPEG2000アルゴリズム自体は各種文献や公報等により周知であるので、詳細は省略し、その概要について説明する。
【００４７】
図１はJPEG2000アルゴリズムの概要を説明するための機能ブロック図である。JPEG2000のアルゴリズムは、色空間変換・逆変換部１００、２次元ウェーブレット変換・逆変換部１０１、量子化・逆量子化部１０２、エントロピー符号化・復号化部１０３、タグ処理部１０４で構成されている。
【００４８】
JPEG2000の特徴の一つは、高圧縮領域における画質が良いという長所を持つ２次元離散ウェーブレット変換（ＤＷＴ：Discrete Wavelet Transform）を用いている点である。また、もう一つの大きな特徴は、最終段に符号形成を行うためのタグ処理部１０４と呼ばれる機能ブロックが追加されており、符号列データであるコードストリームの生成や解釈が行われる点である。そして、コードストリームによって、JPEG2000は様々な便利な機能を実現できるようになっている。
【００４９】
なお、画像の入出力部分には、色空間変換・逆変換部１００が用意されることが多い。この色空間変換・逆変換部１００は、例えば、原色系のＲ（赤）／Ｇ（緑）／Ｂ（青）の各コンポーネントからなるＲＧＢ表色系や、補色系のＹ（黄）／Ｍ（マゼンタ）／Ｃ（シアン）の各コンポーネントからなるＹＭＣ表色系から、ＹＣｒＣｂあるいはＹＵＶ表色系への変換又は逆の変換を行う部分である。
【００５０】
以下、JPEG2000アルゴリズム、特にウェーブレット変換について説明する。
【００５１】
図２はカラー画像である原画像の分割された各コンポーネントの一例を概略的に示す模式図である。カラー画像は、一般に、図２に示すように、原画像の各コンポーネント１１０が、例えばＲＧＢ原色系によって分離される。さらに、画像の各コンポーネント１１０は、矩形をした領域であるタイル１１１によって分割される（図２の例では、各コンポーネント１１０が縦横４×４、合計１６個の矩形のタイル１１１に分割されている）。このような個々のタイル１１１、例えば、Ｒ００，Ｒ０１，…，Ｒ１５／Ｇ００，Ｇ０１，…，Ｇ１５／Ｂ００，Ｂ０１，…，Ｂ１５は、画像データの圧縮伸長プロセスを実行する際の基本単位となる。従って、画像データの圧縮伸長動作は、コンポーネント１１０毎に、また、タイル１１１毎に、独立して行われる。
【００５２】
画像データの符号化時には（図１参照）、各コンポーネント１１０の各タイル１１１のデータが色空間変換・逆変換部１００に入力され、色空間変換を施された後、２次元ウェーブレット変換・逆変換部１０１で２次元ウェーブレット変換（順変換）が適用されて周波数帯に空間分割される。
【００５３】
図３はデコンポジションレベル数が３である場合の各デコンポジションレベルにおけるサブバンドを概略的に示す模式図である。２次元ウェーブレット変換・逆変換部１０１は、画像のタイル分割によって得られたタイル画像（デコンポジションレベル０（１２０）：０ＬＬ）に対して、２次元ウェーブレット変換を施し、デコンポジションレベル１（１２１）に示すサブバンド（１ＬＬ，１ＨＬ，１ＬＨ，１ＨＨ）を分離する。引き続き、２次元ウェーブレット変換・逆変換部１０１は、この階層における低周波成分１ＬＬに対して、２次元ウェーブレット変換を施し、デコンポジションレベル２（１２２）に示すサブバンド（２ＬＬ，２ＨＬ，２ＬＨ，２ＨＨ）を分離する。そして、２次元ウェーブレット変換・逆変換部１０１は、順次同様に、低周波成分２ＬＬに対しても、２次元ウェーブレット変換を施し、デコンポジションレベル３（１２３）に示すサブバンド（３ＬＬ，３ＨＬ，３ＬＨ，３ＨＨ）を分離する。なお、図３中では、各デコンポジションレベルにおいて符号化の対象となるサブバンドはグレーで示されている。例えば、デコンポジションレベル数を３とした場合、グレーで示したサブバンド（３ＨＬ，３ＬＨ，３ＨＨ，２ＨＬ，２ＬＨ，２ＨＨ，１ＨＬ，１ＬＨ，１ＨＨ）が符号化対象となり、３ＬＬサブバンドは符号化されない。
【００５４】
次いで、量子化・逆量子化部１０２では（図１参照）、指定した符号化の順番で符号化の対象となるビットが定められた後、対象ビット周辺のビットからコンテキストが生成される。この量子化の処理が終わったウェーブレット係数は、個々のサブバンド毎に、「プレシンクト」と呼ばれる重複しない矩形に分割される。これは、インプリメンテーションでメモリを効率的に使うために導入されたものである。ここで、図４はプレシンクトを示す説明図である。図４に示すように、一つのプレシンクトは、空間的に一致した３つの矩形領域からなっている。さらに、個々のプリシンクトは、重複しない矩形の「コードブロック」に分けられる。これは、エントロピーコーディングを行う際の基本単位となる。
【００５５】
なお、ウェーブレット変換後の係数値は、そのまま量子化し符号化することも可能であるが、JPEG2000では符号化効率を上げるために、係数値を「ビットプレーン」単位に分解し、画素あるいはコードブロック毎にビットプレーンに順位付けを行うことができる。
【００５６】
ここで、図５はビットプレーンに順位付けする手順の一例を示す説明図である。図５に示すように、この例は、原画像（３２×３２画素）を１６×１６画素のタイル４つで分割した場合で、デコンポジションレベル１のプレシンクトとコードブロックの大きさは、各々８×８画素と４×４画素としている。プレシンクトとコードブロックの番号は、ラスター順に付けられており、この例では、プレンシクトが番号０から３まで、コードブロックが番号０から３まで割り当てられている。タイル境界外に対する画素拡張にはミラーリング法を使い、可逆（５，３）フィルタでウェーブレット変換を行い、デコンポジションレベル１のウェーブレット係数値を求めている。
【００５７】
また、タイル０／プレシンクト３／コードブロック３について、代表的な「レイヤ」構成の概念の一例を示す説明図も図５に併せて示す。変換後のコードブロックは、サブバンド（１ＬＬ，１ＨＬ，１ＬＨ，１ＨＨ）に分割され、各サブバンドにはウェーブレット係数値が割り当てられている。
【００５８】
レイヤの構造は、ウェーブレット係数値を横方向（ビットプレーン方向）から見ると理解し易い。１つのレイヤは任意の数のビットプレーンから構成される。この例では、レイヤ０，１，２，３は、各々、１，３，１，３のビットプレーンから成っている。そして、ＬＳＢ（Least Significant Bit：最下位ビット）に近いビットプレーンを含むレイヤ程、先に量子化の対象となり、逆に、ＭＳＢ（Most Significant Bit：最上位ビット）に近いレイヤは最後まで量子化されずに残ることになる。ＬＳＢに近いレイヤから破棄する方法はトランケーションと呼ばれ、量子化率を細かく制御することが可能である。
【００５９】
エントロピー符号化・復号化部１０３では（図１参照）、コンテキストと対象ビットとから、確率推定によって各コンポーネント１１０の各タイル１１１に対する符号化を行う。こうして、画像の全てのコンポーネント１１０について、タイル１１１単位で符号化処理が行われる。
【００６０】
最後に、タグ処理部１０４では（図１参照）、エントロピー符号化・復号化部１０３からの全符号化データを１本のコードストリーム（符号列データ）に結合するとともに、それにタグを付加する処理を行う。ここで、図６はコードストリームの構造の一例を概略的に示す模式図である。コードストリームの先頭と各タイル１１１を構成する部分タイルの先頭には、ヘッダ（メインヘッダ（Main header）、タイルパートヘッダ（tile part header））と呼ばれるタグ情報が付加され、その後に、各タイル１１１の符号化データ（bit stream）が続く。そして、コードストリームの終端には、再びタグ情報（end of codestream）が付加される。
【００６１】
一方、復号化時には、符号化時とは逆に、各コンポーネント１１０の各タイル１１１のコードストリームから画像データを生成する。この場合、図１に示すように、タグ処理部１０４は、外部より入力されたコードストリーム（符号列データ）に付加されたタグ情報を解釈し、コードストリームを各コンポーネント１１０の各タイル１１１のコードストリームに分解し、その各コンポーネント１１０の各タイル１１１のコードストリーム毎に復号化処理（伸長処理）を行う。このとき、コードストリーム内のタグ情報に基づく順番で復号化の対象となるビットの位置が定められるとともに、量子化・逆量子化部１０２において、その対象ビット位置の周辺ビット（既に復号化を終えている）の並びからコンテキストを生成する。そして、エントロピー符号化・復号化部１０３では、そのコンテキストとコードストリームとから確率推定によって復号化を行って対象ビットを生成し、それを対象ビットの位置に書き込む。このようにして復号化されたデータは、周波数帯域毎に空間分割されているため、これを２次元ウェーブレット変換・逆変換部１０１で２次元ウェーブレット逆変換を行うことにより、画像データ中の各コンポーネント１１０における各タイル１１１が復元される。復元されたデータは、色空間変換・逆変換部１００によって元の表色系のデータに変換される。ここに、伸長手段が実行される。
【００６２】
次に、本実施の形態の画像処理装置である複合機１の構成例について説明する。本実施の形態の複合機１は、複写機能、プリンタ機能、スキャナ機能、ファクシミリ機能、画像サーバ機能等の複合機能を有している。
【００６３】
図７は本実施の形態の複合機１を概略的に示す縦断面図である。複合機１は、原稿から原稿画像を読み取る画像読取部であるスキャナ２と、スキャナ２で読み取られた画像を用紙等の記録材に形成する画像形成部であるプリンタ３とを備えている。
【００６４】
スキャナ２の本体ケース４の上面には、原稿（図示せず）が載置されるコンタクトガラス５が設けられている。原稿は、原稿面をコンタクトガラス５に対向させて載置される。コンタクトガラス５の上側には、コンタクトガラス５上に載置された原稿を押える原稿圧板６（いわゆるＡＤＦであってもよい）が設けられている。
【００６５】
コンタクトガラス５の下方には、原稿画像を光学的に読み取るための読取光学系７が設けられている。この読取光学系７は、光を発光する光源８及びミラー９を搭載する第１走行体１０、２枚のミラー１１，１２を搭載する第２走行体１３、結像レンズ１４を介してミラー９，１１，１２によって導かれる光を受光するＣＣＤ（Charge Coupled Device）イメージセンサ１５等によって構成されている。ＣＣＤイメージセンサ１５は、ＣＣＤイメージセンサ１５上に結像される原稿からの反射光を光電変換することで光電変換データを生成する光電変換素子として機能する。光電変換データは、原稿からの反射光の強弱に応じた大きさを有する電圧値である。第１、第２走行体１０，１３は、コンタクトガラス５に沿って往復動自在に設けられており、後述する原稿画像の読取動作に際しては、図示しないモータ等の移動装置によって２：１の速度比で副走査方向にスキャニング走行する。これにより、読取光学系７による原稿読取領域の露光走査が行われる。なお、本実施の形態では、読取光学系７側がスキャニング走査を行う原稿固定型で示しているが、読取光学系７側が位置固定で原稿側が移動する原稿移動型であってもよい。
【００６６】
プリンタ３は、シート状の用紙等の記録材を保持する記録材保持部１６から電子写真方式のプリンタエンジン１７及び定着器１８を経由して排出部１９へ至る記録材経路２０を備えている。
【００６７】
プリンタエンジン１７は、感光体２１、帯電器２２、露光器２３、現像器２４、転写器２５及びクリーナー２６等を用いて、電子写真方式で感光体２１の周囲に形成したトナー像を記録材に転写し、転写したトナー像を、定着器１８によって記録材上に定着させる。なお、本実施の形態では、プリンタエンジン１７が電子写真方式で画像形成を行うが、これに限るものではなく、例えば、インクジェット方式、昇華型熱転写方式、直接感熱記録方式等の様々な画像形成方式で画像形成を行うようにしても良い。
【００６８】
このような複合機１は、複数のマイクロコンピュータで構成される制御系により制御される。図８はこれらの制御系のうち、画像処理に関わる制御系の電気的な接続を概略的に示すブロック図である。この制御系は、ＣＰＵ３０、ＲＯＭ３１、ＲＡＭ３２、操作パネル３３、ＩＰＵ（Image Processing Unit）３４、Ｉ／Ｏポート３５、通信制御部３６等がバス３７で接続され構成されている。ＣＰＵ３０は、各種演算を行い、画像処理等の処理を集中的に制御する。ＲＯＭ３１には、ＣＰＵ３０が実行する処理に関わる各種プログラムや固定データが格納されている。また、ＲＡＭ３２は、ＣＰＵ３０のワークエリアとして機能し、加えて、画像データ（例えば、画像ファイル）を一時的に記憶するメモリとして機能する。操作パネル３３には、ＬＣＤ(Liquid Crystal Display)等の表示器、ハードキー及びタッチパネル等によって構成される複数の操作キー(いずれも図示せず)が設けられており、操作パネル３３が表示部及び操作部として機能する。ＩＰＵ３４は各種画像処理に関わるハードウエアを備えており、ＲＯＭ３１はＥＥＰＲＯＭやフラッシュメモリ等の不揮発性メモリを備えている。ここで、ＲＯＭ３１内に格納されているプログラムは、ＣＰＵ３０の制御によりＩ／Ｏポート３５を介して外部装置（図示せず）からダウンロードされるプログラムに書換え可能である。なお、本実施の形態では、ＲＯＭ３１がプログラムを記憶する記憶媒体として機能している。通信制御部３６は、複合機１と外部装置（図示せず）との間でネットワーク等を介してデータを送受信する機能を有しており、ファクシミリのモデム機能、公衆電話回線網に接続するための網制御機能、ＬＡＮ（Local Area Network）制御機能等を備えている。
【００６９】
次に、本実施の形態の複合機１における画像処理の概要について図９を参照して説明する。図９は複合機１における画像処理の概要を説明するための機能ブロック図である。複合機１の画像処理は、スキャナ２で読み取った原画像を複数の領域に分割する領域分割部４０、分割した複数領域を領域属性に基づいて識別する領域識別処理を実行し、さらに、複数領域に対して画像から文字を認識する文字認識処理を実行する付加情報生成部４１と、図１を参照して説明した各機能ブロックを有する圧縮部４２と、領域識別処理により生成された領域識別情報や文字認識処理により生成された文字認識情報等を付加情報として圧縮データであるコードストリームの所定の埋め込み位置に埋め込む付加情報埋込部４３とを備える。
【００７０】
なお、画像から文字を認識する文字認識処理としては、例えばＯＣＲ（Optical Character Recognition）処理が用いられる。また、付加情報生成部４１では、スキャナ２で読み取った原画像の傾き角度を検出する傾き検出処理を実行しても良い。これにより、傾き検出結果である傾き角度補正情報等の傾き検出情報が得られる。したがって、付加情報埋込部４３では、その傾き検出情報を付加情報として埋め込むようにしても良い。
【００７１】
また、複合機１の画像処理は、基本的に、スキャナ２で読み取られた原画像の画像データから分割した領域に対応する画像データに対してＯＣＲ処理を行い、さらに、画像データをJPEG2000アルゴリズムにより圧縮符号化して、コードストリームを生成する。すなわち、画像を１又は複数の矩形領域（タイル１１１）に分割し、この矩形領域毎に画素値を離散ウェーブレット変換して階層的に圧縮符号化する。このとき、領域識別情報、文字認識情報、傾き検出情報等は、コードストリームの所定の埋め込み位置に埋め込まれる。このような領域分割部４０、付加情報生成部４１、圧縮部４２、付加情報埋込部４３等の機能は、ＲＯＭ３１に記憶されているプログラムに基づいてＣＰＵ３０が行う画像処理で実行されるようにしているが、これに限るものではなく、例えば、ＩＰＵ３４等によりハードウエアが行う画像処理で実行されるようにしても良い。
【００７２】
ここで、領域属性としては、例えば、文字領域、写真領域、図領域、表領域、黒ベタ領域、背景領域等の様々な領域属性がある。なお、背景領域とは、原稿の余白にあたる余白領域であるが、これに限るものではなく、例えば、余白領域に行間にあたる行間領域を加えた領域であっても良い。
【００７３】
一方、付加情報としては、例えば、ＯＣＲ処理により生成された文字認識情報（文字認識結果情報）、領域識別処理により生成された座標情報や領域属性情報等の領域識別情報領域（領域識別結果情報）、傾き検出処理により生成された傾き角度補正情報等の傾き検出情報（傾き検出結果情報）等がある。文字認識情報としては、文字コード、文字色、フォントサイズ、フォント情報、文字の座標位置情報、確信度情報（確からしさ情報）、タイトル（文字コード）等がある。なお、タイトルは領域識別処理によりタイトル領域を抽出し、そのタイトル領域をＯＣＲ処理して得られた文字コードである。例えば、「２００２年度会議議事録．ｊ２ｋ」等のタイトルが文字コードで所定の埋め込み位置に埋め込まれる。
【００７４】
また、所定の埋め込み位置としては、例えば、メインヘッダやタイルパートヘッダ等のヘッダ（図６参照）、画像の最下位ビット（画像サイズを増加させることなく埋め込みが可能）、さらに、画像の文字領域に対応する最下位ビット等がある。加えて、画像サイズがタイルの整数倍でない場合には、情報を付加してタイルの整数倍の画像を形成するが、その付加する情報部分に埋め込むことも可能である。
【００７５】
次に、複合機１のＣＰＵ３０がプログラムに基づいて実行する画像処理について説明する。ここでは、例えば、複合機１を画像サーバとして用いるために画像データを蓄積する画像処理について図１０及び図１１を参照して説明する。図１０は本実施の形態の画像処理の流れを概略的に示すフローチャート、図１１はその画像処理による付加情報の埋め込み位置を概略的に示す説明図である。
【００７６】
図１０に示すように、まず、スキャナ２による原稿画像の読取に待機する（ステップＳ１のＮ）。操作者がスキャナ２の原稿圧板６を開放してコンタクトガラス５上に原稿をセットし、原稿圧板６を閉じて操作パネル３３のコピースタートキーを押下すると、スキャナ２は読取光学系７のスキャニング動作でコンタクトガラス５上にセットされた原稿から原画像を読み取る。
【００７７】
スキャナ２により原稿から原画像が読み取られると（Ｓ１のＹ）、読み取られた原画像を複数の領域に分割し、文字領域、写真領域、図領域、表領域、背景領域等の領域属性に基づいて、原画像中に混在する複数領域の属性を識別する（Ｓ２）。ここに、領域分割手段又は領域分割機能が実行され、付加情報生成手段又は付加情報生成機能が実行される。なお、領域識別の方法は、従来の方法、例えば、黒ランの密度を用いて領域識別する方法等で十分であり、その方法は公知であるため、その説明は省略する。ここで、複数領域を領域識別することによって、座標情報や領域属性情報等の領域識別情報が複数領域毎に得られる。なお、スキャナ２により読み取った原稿画像に対して本実施の形態の画像処理を実行しているが、これに限るものではなく、例えばネットワークを介して受信した原画像に対して本実施の形態の画像処理を実行しても良い。
【００７８】
次に、分割識別された複数領域に対してＯＣＲ処理を実行する（Ｓ３）。すなわち、複数領域毎の画像から文字を認識する。ここに、付加情報生成手段又は付加情報生成機能が実行される。なお、ＯＣＲ処理としては、パターン・マッチング法や構造解析法等があり、これらの方法は公知であるため、その説明は省略する。ここで、複数領域毎にＯＣＲ処理を実行することによって、文字コード、文字色、フォントサイズ、フォント情報、文字の座標位置情報、確信度情報（確からしさ情報）等の文字認識情報が複数領域毎に得られる。
【００７９】
次いで、原画像をJPEG2000アルゴリズムに基づいて圧縮する（Ｓ４）。これにより、原画像からJPEG2000アルゴリズムに基づいてコードストリームが生成される。そして、コードストリームの所定の埋め込み位置に領域識別情報及び文字認識情報を付加情報として埋め込む（Ｓ５）。ここに、付加情報埋込手段又は付加情報埋込機能が実行される。例えば、図１１に示すように、付加情報は、コードストリームにおけるレイヤの最下位ビットに埋め込まれる。これにより、画像サイズを増加させることなく埋め込むことができる。なお、付加情報が埋め込まれた領域Ｒは、原画像の文字領域に対応する領域である。
【００８０】
なお、ここでは、画像の文字領域に対応する領域Ｒの最下位ビットに埋め込んでいるが、これに限るものではなく、例えば単純に最下位ビットに埋め込むようにしても良い。また、領域識別情報及び文字認識情報を付加情報として所定の埋め込み位置に埋め込んでいるが、これに限るものではなく、例えば、領域識別情報及び文字認識情報のどちらか一方だけを所定の埋め込み位置に埋め込んでも良く、あるいは、傾き検出処理を実行した場合には、傾き検出結果の傾き角度補正情報等の傾き検出情報を所定の埋め込み位置に埋め込んでも良い。また、ステップＳ４及びステップＳ５を同時に実行するようにして、付加情報を埋め込みながら原画像を圧縮するようにしても良い。ここで、原画像はJPEG2000アルゴリズムに基づいて複数の解像度で圧縮されているので、解像度毎にＯＣＲ処理を実行して文字認識情報を所定の埋め込み位置に埋め込むようにしても良い。
【００８１】
最後に、付加情報が埋め込まれた画像データ（圧縮データ）をＲＡＭ３２に画像ファイルとして格納する（Ｓ６）。ここで、原稿が複数枚ある場合には、原稿毎にステップＳ１からＳ６までの処理が繰り返され、ＲＡＭ３２には原稿毎に複数の画像ファイルが保存される。
【００８２】
その後、ＲＡＭ３２に画像ファイルとして格納されている画像データは、例えば、所定のタイミングで通信制御部３６によりネットワークを介して外部装置に送信される場合がある。このとき、例えば、文字領域を有する画像の画像データだけを送信するために、ＲＡＭ３２に画像ファイルとして格納された複数の画像データから文字領域を有する画像の画像データを検索する場合には、ＣＰＵ３０は画像データ内の文字コードを検出すれば良く、簡単に検索することができる。また、類似した画像を有する画像データだけを送信するために、ＲＡＭ３２に画像ファイルとして格納された複数の画像データから類似した画像を有する画像データを検索する場合には、ＣＰＵ３０は画像データ内の領域識別情報を検出し、他の画像データと領域毎に領域属性が一致するか否かを判断して、簡単に検索することができる。
【００８３】
なお、画像データの画像は、変倍（拡大，縮小）、回転、白黒反転等の画像処理が行われる場合もある。このような画像処理では、文字領域Ｍのフォントサイズを変更することで文字領域Ｍの文字画像を変倍し、また、画像がJPEG2000アルゴリズムの圧縮手段により様々な解像度の画像として保持されているので、画像を高画質から低画質に自由に変化させることができる。さらに、原画像を表示装置等に表示する場合には、表示装置の解像度等に合わせて画像を伸長することができる。
【００８４】
このように本実施の形態では、原画像を複数領域に分割し、複数領域の属性を識別して領域識別情報を生成し（Ｓ２）、分割した複数領域に対してＯＣＲ処理を実行して文字認識情報を生成し（Ｓ３）、生成した領域識別情報及び文字認識情報をコードストリームの所定の埋め込み位置に埋め込むことによって（Ｓ５）、例えば、複数の画像データ（圧縮データ）から必要とする画像データを検索する場合等、付加情報を利用して簡単に検索することが可能となる。すなわち、必要に応じた付加情報を埋め込むことで、画像データの検索や管理、画像データに対する処理等を容易に行うことができる。また、原画像の複数領域毎に対応する付加情報、例えば領域識別情報や文字認識情報等が得られるので、画像の中から必要とする領域、例えば文字領域や写真領域等を簡単に検索して抽出することができる。さらに、JPEG2000アルゴリズムの圧縮手段及び伸長手段を用いることで、JPEG2000の特性を活かした様々な画像処理を実行することができる。
【００８５】
本発明の第二の実施の形態を図１２に基づいて説明する。図１２は本実施の形態の画像処理の流れを概略的に示すフローチャートである。なお、前述して説明した部分と同一部分は同一符号で示す。
【００８６】
本実施の形態の基本的構成は、第一の実施の形態と同様であるが、第一の実施の形態との相違点は、複合機１のＣＰＵ３０がプログラムに基づいて実行する画像処理が異なる点である。
【００８７】
本実施の形態の複合機１のＣＰＵ３０がプログラムに基づいて実行する画像処理について説明する。ここでは、例えば、複合機１を画像サーバとして用いるために画像データを蓄積する画像処理について説明する。
【００８８】
図１２に示すように、まず、スキャナ２による原稿画像の読取に待機する（ステップＳ１１のＮ）。操作者がスキャナ２の原稿圧板６を開放してコンタクトガラス５上に原稿をセットし、原稿圧板６を閉じて操作パネル３３のコピースタートキーを押下すると、スキャナ２は読取光学系７のスキャニング動作でコンタクトガラス５上にセットされた原稿から原画像を読み取る。
【００８９】
スキャナ２により原稿から原画像が読み取られると（Ｓ１１のＹ）、読み取られた原画像をタイリング処理して複数のタイル１１１（複数の領域：図１参照）に分割する（Ｓ１２）。ここに、領域分割手段又は領域分割機能が実行される。これにより、原画像は複数のタイル１１１（複数の領域）に分割される。なお、このタイリング処理は、JPEG2000アルゴリズムに基づいて実行される。
【００９０】
次に、分割された複数のタイル１１１に対してＯＣＲ処理を実行する（Ｓ１３）。すなわち、複数のタイル１１１毎の画像から文字を認識する。ここに、付加情報生成手段又は付加情報生成機能が実行される。なお、ＯＣＲ処理としては、パターン・マッチング法や構造解析法等があり、これらの方法は公知であるため、その説明は省略する。ここで、ＯＣＲ処理を実行することによって、文字コード、文字色、フォントサイズ、フォント情報、文字の座標位置情報、確信度情報（確からしさ情報）等の文字認識情報がタイル１１１毎に得られる。
【００９１】
次いで、原画像をJPEG2000アルゴリズムに基づいて圧縮する（Ｓ１４）。これにより、原画像からJPEG2000アルゴリズムに基づいてコードストリームが生成される。そして、コードストリームの所定の埋め込み位置に付加情報として文字認識情報を埋め込む（Ｓ１５）。ここに、付加情報埋込手段又は付加情報埋込機能が実行される。例えば、タイル１１１毎の付加情報は、対応するタイルパートヘッダに埋め込まれる。なお、ステップＳ１２でタイリング処理を実行しているので、ステップＳ１４でタイリング処理を実行する必要はない。
【００９２】
ここでは、文字認識情報を付加情報として所定の埋め込み位置に埋め込んでいるが、これに限るものではなく、例えば、傾き検出処理を実行した場合には、傾き検出結果の傾き角度補正情報等の傾き検出情報を所定の埋め込み位置に埋め込んでも良い。また、ステップＳ１４及びステップＳ１５を同時に実行するようにして、付加情報を埋め込みながら画像を圧縮するようにしても良い。
【００９３】
最後に、付加情報が埋め込まれた画像データ（圧縮データ）をＲＡＭ３２に画像ファイルとして格納する（Ｓ１６）。ここで、原稿が複数枚ある場合には、原稿毎にステップＳ１１からＳ１６までの処理が繰り返され、ＲＡＭ３２には原稿毎に複数の画像ファイルが保存される。
【００９４】
その後、ＲＡＭ３２に画像ファイルとして格納されている画像データは、例えば、所定のタイミングで通信制御部３６によりネットワークを介して外部装置に送信される場合がある。このとき、例えば、文字領域を有する画像の画像データだけを送信するために、ＲＡＭ３２に画像ファイルとして格納された複数の画像データから文字領域を有する画像の画像データを検索する場合には、ＣＰＵ３０は画像データ内の文字コードを検出すれば良く、簡単に検索することができる。
【００９５】
このように本実施の形態では、原画像をタイリング処理して複数のタイル１１１（複数の領域）に分割し（Ｓ１２）、分割した複数のタイル１１１に対してＯＣＲ処理を実行して文字認識情報を生成し（Ｓ１３）、生成した文字認識情報をコードストリームの所定の埋め込み位置に埋め込むことによって（Ｓ１５）、例えば、複数の画像データ（圧縮データ）から必要とする画像データを検索する場合等、付加情報を利用して簡単に検索することが可能となる。すなわち、必要に応じた付加情報を埋め込むことで、画像データの検索や管理、画像データに対する処理等を容易に行うことができる。また、原画像のタイル１１毎に対応する付加情報、例えば文字認識情報等が得られるので、画像の中から必要とする領域、例えば文字領域を簡単に検索して抽出することができる。さらに、JPEG2000アルゴリズムの圧縮手段及び伸長手段を用いることで、JPEG2000の特性を活かした様々な画像処理を実行することができる。
【００９６】
なお、各実施の形態においては、画像処理装置として複合機１を用いているが、これに限るものではなく、例えば、パーソナルコンピュータ等を用いても良い。この場合、パーソナルコンピュータは、ＣＰＵ、ＲＯＭ、ＲＡＭ、各種のプログラムを記憶するＨＤＤ（Hard Disk Drive）、ＣＤ−ＲＯＭドライブ、スキャナ、ネットワークを介して外部装置と通信により情報を伝達するための通信制御装置、処理経過や結果等を操作者に表示する表示装置、キーボードやマウス等の入力装置等を備えている。ここで、ＨＤＤは、前述したような画像処理に関するプログラムを記憶する記憶媒体として機能する。
【００９７】
なお、一般的には、パーソナルコンピュータのＨＤＤにインストールされるプログラムは、ＣＤ−ＲＯＭやＤＶＤ−ＲＯＭ等の光情報記録メディアやＦＤ等の磁気メディア等に記録され、この記録されたプログラムがＨＤＤにインストールされる。このため、ＣＤ−ＲＯＭ等の光情報記録メディアやＦＤ等の磁気メディア等の可搬性を有する記憶媒体も、前述したような画像処理に関するプログラムを記憶する記憶媒体となり得る。さらには、このようなプログラムは、例えば通信制御装置を介して外部から取込まれ、ＨＤＤにインストールされても良い。
本実施の形態によれば、原画像を２次元ウェーブレット変換、量子化及び符号化という手順で圧縮してコードストリームを生成する画像処理装置において、前記原画像を複数領域に分割する領域分割手段と、前記領域分割手段により分割された前記複数領域の画像から前記原画像に関連する付加情報を生成する付加情報生成手段と、前記付加情報生成手段により生成された付加情報を前記コードストリームに埋め込む付加情報埋込手段と、を備えることを特徴とすることから、原画像の画像データが付加情報を有し、例えば、複数の画像データから必要とする画像データを検索する場合等、付加情報を利用して簡単に検索することが可能となる。すなわち、必要に応じた付加情報を埋め込むことで、画像データの検索や管理、画像データに対する処理等を容易に行うことができる。
本実施の形態の画像処理装置において、前記付加情報生成手段は、前記複数領域の画像から文字を認識することで前記付加情報として文字認識情報を生成することを特徴とすることから、この文字認識情報がコードストリームに埋め込まれ、例えば、複数の画像データから文字領域を有する画像データを検索する場合等、文字認識情報により簡単に検索することが可能となる。すなわち、文字認識情報を利用することで、画像データの検索や管理等を容易に行うことができる。
本実施の形態の画像処理装置において、前記付加情報生成手段は、前記複数領域の領域属性を識別することで前記付加情報として領域識別情報を生成することを特徴とすることから、この領域識別情報がコードストリームに埋め込まれ、例えば、複数の画像データから写真領域を有する画像データを検索する場合等、領域識別情報により簡単に検索することが可能となる。すなわち、領域識別情報を利用することで、画像データの検索や管理等を容易に行うことが可能になる。
本実施の形態の画像処理装置において、前記付加情報生成手段は、前記複数領域の画像の傾きを検出することで前記付加情報として傾き検出情報を生成することを特徴とすることから、この傾き検出情報がコードストリームに埋め込まれ、例えば、画像を表示装置に表示させたり、用紙に印字させたりする場合等、傾き検出情報により簡単に画像の傾きを補正することが可能となる。すなわち、傾き検出情報を利用することで、画像データに対する処理等を容易に行うことができる。
本実施の形態の画像処理装置において、前記付加情報埋込手段は、前記コードストリームのメインヘッダに前記付加情報を埋め込むことを特徴とすることから、例えば、複数の画像データから必要とする画像データを検索する場合等、メインヘッダを読み取る早い段階でメインヘッダの付加情報を利用して検索することが可能となり、その結果として、処理時間を短縮することができる。
本実施の形態の画像処理装置において、前記付加情報埋込手段は、前記コードストリームのタイルパートヘッダに前記付加情報を埋め込むことを特徴とすることから、例えば、タイル毎（領域毎）の付加情報が対応するタイルパートヘッダに埋め込まれ、複数の画像データから必要とする画像データを検索する場合等、タイルパートヘッダの付加情報を利用して簡単に検索することができる。
本実施の形態の画像処理装置において、前記付加情報埋込手段は、前記コードストリームにおけるレイヤの最下位ビットに前記付加情報を埋め込むことを特徴とすることから、画像サイズを増加させることなく付加情報を埋め込むことができる。
本実施の形態の画像処理装置において、原稿から前記原画像を光学的に読み取る読取光学系を備えることを特徴とすることから、原稿から原画像を読み取ることが可能になり、その結果として、読み取った原画像に対し画像処理等の様々な処理を実行することができる。
本実施の形態の画像処理装置において、圧縮された前記原画像を復号化、逆量子化及び２次元ウェーブレット逆変換という手順で伸長する伸長手段を備えることを特徴とすることから、ＪＰＥＧ２０００アルゴリズムの伸長手段を用いることで、ＪＰＥＧ２０００アルゴリズムで圧縮された画像をＪＰＥＧ２０００の特性を活かして伸長することが可能となり、その結果として、伸長された画像の表示装置等への表示や用紙等への印字等を実行することができる。
本実施の形態の画像処理装置において、前記伸長手段により伸長された画像を記録材に画像形成するプリンタエンジンを備えることを特徴とすることから、伸長された画像を用紙等の記録材に形成することができる。
本実施の形態のプログラムは、原画像を２次元ウェーブレット変換、量子化及び符号化という手順で圧縮してコードストリームを生成する画像処理装置が備えるコンピュータに解釈され、前記コンピュータに、原画像を複数領域に分割する領域分割機能と、前記領域分割手段により分割された前記複数領域の画像から前記原画像に関連する付加情報を生成する付加情報生成機能と、前記付加情報生成手段により生成された付加情報を前記コードストリームに埋め込む付加情報埋込機能と、を実行させることから、原画像の画像データが付加情報を有し、例えば、複数の画像データから必要とする画像データを検索する場合等、付加情報を利用して簡単に検索することができる。すなわち、必要に応じた付加情報を埋め込むことで、画像データの検索や管理、画像データに対する処理等を容易に行うことができる。
本実施の形態のプログラムにおいて、前記付加情報生成機能は、前記複数領域の画像から文字を認識することで前記付加情報として文字認識情報を生成することから、この文字認識情報がコードストリームに埋め込まれ、例えば、複数の画像データから文字領域を有する画像データを検索する場合等、文字認識情報により簡単に検索することが可能となる。すなわち、文字認識情報を利用することで、画像データの検索や管理等を容易に行うことができる。
本実施の形態のプログラムにおいて、前記付加情報生成機能は、前記複数領域の領域属性を識別することで前記付加情報として領域識別情報を生成することから、この領域識別情報がコードストリームに埋め込まれ、例えば、複数の画像データから写真領域を有する画像データを検索する場合等、領域識別情報により簡単に検索することが可能となる。すなわち、領域識別情報を利用することで、画像データの検索や管理等を容易に行うことができる。
本実施の形態のプログラムにおいて、前記付加情報生成機能は、前記複数領域の画像の傾きを検出することで前記付加情報として傾き検出情報を生成することから、この傾き検出情報がコードストリームに埋め込まれ、例えば、画像を表示装置に表示させたり、用紙に印字させたりする場合等、傾き検出情報により簡単に画像の傾きを補正することが可能となる。すなわち、傾き検出情報を利用することで、画像データに対する処理等を容易に行うことができる。
本実施の形態のプログラムにおいて、前記付加情報埋込機能は、前記コードストリームのメインヘッダに前記付加情報を埋め込むことから、例えば、複数の画像データから必要とする画像データを検索する場合等、メインヘッダを読み取る早い段階でメインヘッダの付加情報を利用して検索することが可能となり、その結果として、処理時間を短縮することができる。
本実施の形態のプログラムにおいて、前記付加情報埋込機能は、前記コードストリームのタイルパートヘッダに前記付加情報を埋め込むことから、例えば、タイル毎（領域毎）の付加情報が対応するタイルパートヘッダに埋め込まれ、複数の画像データから必要とする画像データを検索する場合等、タイルパートヘッダの付加情報を利用して簡単に検索することができる。
本実施の形態のプログラムにおいて、前記付加情報埋込機能は、前記コードストリームにおけるレイヤの最下位ビットに前記付加情報を埋め込むことから、画像サイズを増加させることなく付加情報を埋め込むことができる。
本実施の形態の記憶媒体によれば、本実施の形態のプログラムを記憶していることから、本実施の形態プログラムと同様な効果を奏する。
【００９８】
【発明の効果】
本発明によれば、原画像に関連する付加情報を生成し、生成した付加情報をコードストリームに埋め込むことによって、原画像の画像データが付加情報を有し、例えば、複数の画像データから必要とする画像データを検索する場合等、付加情報を利用して簡単に検索することが可能となるコードストリームを生成する。すなわち、必要に応じた付加情報を埋め込むことで、画像データの検索や管理、画像データに対する処理等を容易に行うことが可能なコードストリームを生成する。
【図面の簡単な説明】
【図１】 JPEG2000アルゴリズムの概要を説明するための機能ブロック図である。
【図２】カラー画像である原画像の分割された各コンポーネントの一例を概略的に示す模式図である。
【図３】デコンポジションレベル数が３である場合の各デコンポジションレベルにおけるサブバンドを概略的に示す模式図である。
【図４】プレシンクトを示す説明図である。
【図５】ビットプレーンに順位付けする手順の一例を示す説明図である。
【図６】コードストリームの構造の一例を概略的に示す模式図である。
【図７】本発明の第一の実施の形態の複合機を概略的に示す縦断面図である。
【図８】複合機の制御系のうち、画像処理に関わる制御系の電気的な接続を概略的に示すブロック図である。
【図９】複合機における画像処理の概要を説明するための機能ブロック図である。
【図１０】本発明の第一の実施の形態の画像処理の流れを概略的に示すフローチャートである。
【図１１】本発明の第一の実施の形態の画像処理による付加情報の埋め込み位置を概略的に示す説明図である。
【図１２】本発明の第二の実施の形態の画像処理の流れを概略的に示すフローチャートである。
【符号の説明】
１画像処理装置（複合機）
７読取光学系
１７プリンタエンジン
３０コンピュータ（ＣＰＵ）
３１記憶媒体（ＲＯＭ）
[0001]
[Technical Field of the Invention]
The present invention relates to an image processing apparatus, a program, and a storage medium.
[0002]
[Prior art]
In recent years, with the spread of image processing apparatuses such as scanners, digital cameras, personal computers, printers, copiers, and multifunction peripherals (MFPs), it has become commonplace to store digital image data in a storage device such as a memory or a hard disk, to store it on an optical disc such as a CD-ROM, or to transmit it via the Internet or the like. Such image data is usually compressed before being stored in a storage device, on an optical disc, or the like.
[0003]
Recently, high-definition images can be easily obtained by various techniques, but the image data size of high-definition images tends to increase, and handling of high-definition images has become difficult. Currently, JPEG (Joint Photographic Experts Group) is most widely used as an image compression / decompression algorithm that facilitates handling of such high-definition images. In recent years, the use of DWT (Discrete Wavelet Transform) is increasing as a frequency transform in place of DCT (Discrete Cosine Transform) adopted in JPEG. A representative example is JPEG2000, an image compression / decompression method succeeding JPEG, which became an international standard in 2001.
[0004]
Since such compressed image data is digital data, it is easy to transmit over the Internet or the like and to store in a storage device; on the other hand, it is highly likely to be altered without the creator's permission. To prevent this, a method has been proposed in which signature information for identifying the creator is embedded in the original image as additional information (see, for example, Patent Document 1).
[0005]
[Patent Document 1]
JP 2001-42768 A
[0006]
[Problems to be solved by the invention]
However, although the technique of Patent Document 1 can embed signature information in the original image as additional information, the signature information is not additional information related to the original image itself. It is therefore difficult to use the signature information for secondary purposes other than identifying the creator, such as image data search. Moreover, even when additional information related to the original image is embedded in the original image, the processing time needed to use that information may become long depending on where it is embedded.
[0007]
Further, users want to be able to easily search a plurality of image data stored in a storage device for image data having a character area or a photographic area, or for image data having similar images. In the conventional technology, for example, the user often searches for such image data by visually checking images displayed on a display device or the like based on the image data, or images printed on paper or the like.
[0008]
An object of the present invention is to provide an image processing apparatus, a program, and a storage medium that realize encoding suitable for image data search and management, processing on image data, and the like.
[0009]
[Means for Solving the Problems]
The image processing apparatus of the present invention divides an original image into rectangular areas, compresses each rectangular area by the procedure of two-dimensional wavelet transform, quantization, and encoding, and generates a code stream having encoded data for each rectangular area. The apparatus comprises: area dividing means for dividing the original image into a plurality of rectangular areas; additional information generating means for generating, for each rectangular area divided by the area dividing means, additional information for search related to the image of that rectangular area; and additional information embedding means for embedding the additional information generated by the additional information generating means in the encoded data of the rectangular area to which the additional information corresponds in the code stream.
[0010]
Therefore, by generating additional information related to the original image and embedding it in the code stream, the apparatus generates a code stream in which the image data of the original image carries the additional information, so that, for example, when required image data is searched for among a plurality of image data, the search can be performed easily using the additional information. In other words, by embedding additional information as needed, the apparatus generates a code stream that makes it easy to search and manage image data, process image data, and so on.
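The arrangement described above (divide into rectangular areas, generate search-oriented additional information per area, attach it to that area's encoded data) can be pictured with a minimal Python sketch. This is only an illustration under stated assumptions: the names TileRecord, embed_additional_info, and find_tiles are hypothetical, the encoded data is an empty placeholder, and the additional information is kept next to each area's data rather than written into a real code stream.

```python
from dataclasses import dataclass, field


@dataclass
class TileRecord:
    """Hypothetical container: one rectangular area and its additional information."""
    index: int
    encoded_data: bytes  # placeholder for the area's compressed data
    additional_info: dict = field(default_factory=dict)


def embed_additional_info(tiles, info_by_tile):
    """Attach each area's additional information to the record of that same area."""
    for tile in tiles:
        tile.additional_info = dict(info_by_tile.get(tile.index, {}))
    return tiles


def find_tiles(tiles, key, value):
    """Search using only the additional information, without decoding any image data."""
    return [t.index for t in tiles if t.additional_info.get(key) == value]


# Four areas; area 0 is assumed to contain text and area 3 a photograph.
tiles = [TileRecord(i, b"") for i in range(4)]
embed_additional_info(tiles, {0: {"attribute": "text"}, 3: {"attribute": "photo"}})
text_tiles = find_tiles(tiles, "attribute", "text")
```

Because the search reads only the per-area metadata, it does not need to decompress the image, which is the benefit the paragraph above points to.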
[0045]
DETAILED DESCRIPTION OF THE INVENTION
A first embodiment of the present invention will be described with reference to FIGS. 1 to 11.
[0046]
Although the present embodiment uses the “JPEG2000 algorithm”, the JPEG2000 algorithm itself is well known from various documents, publications, and the like, so the details will be omitted and the outline will be described.
[0047]
FIG. 1 is a functional block diagram for explaining the outline of the JPEG2000 algorithm. The JPEG2000 algorithm is composed of a color space conversion / inverse conversion unit 100, a two-dimensional wavelet transform / inverse transform unit 101, a quantization / inverse quantization unit 102, an entropy encoding / decoding unit 103, and a tag processing unit 104.
[0048]
One of the features of JPEG2000 is that it uses the two-dimensional discrete wavelet transform (DWT), which has the advantage of good image quality at high compression ratios. Another major feature is that a functional block called the tag processing unit 104, which performs code formation, is added at the final stage, where the code stream (code string data) is generated and interpreted. The code stream is what allows JPEG2000 to provide a variety of convenient functions.
[0049]
Note that a color space conversion / inverse conversion unit 100 is often prepared for an input / output portion of an image. This color space conversion / inverse conversion unit 100 is, for example, an RGB color system composed of R (red) / G (green) / B (blue) components of the primary color system, or Y (yellow) / M of the complementary color system. This is a part that performs conversion from the YMC color system composed of the components of (magenta) / C (cyan) to the YCrCb or YUV color system or vice versa.
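The paragraph above names the target color systems but gives no formulas. As one concrete possibility, the following Python sketch uses the reversible component transform defined in JPEG 2000 Part 1; choosing that particular transform is an assumption here, not something the description states, and the function names are hypothetical.

```python
def rct_forward(r, g, b):
    """Integer RGB -> (Y, U, V) using the reversible component transform."""
    y = (r + 2 * g + b) // 4  # floor division keeps everything in integers
    u = b - g
    v = r - g
    return y, u, v


def rct_inverse(y, u, v):
    """Exact inverse of rct_forward."""
    g = y - (u + v) // 4
    r = v + g
    b = u + g
    return r, g, b


sample = (200, 120, 30)
restored = rct_inverse(*rct_forward(*sample))
```

Since both directions use only integer additions and floor divisions, the round trip is lossless, which suits the reversible wavelet filter mentioned later in the description.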
[0050]
Hereinafter, the JPEG2000 algorithm, particularly the wavelet transform will be described.
[0051]
FIG. 2 is a schematic diagram showing an example of the components into which an original color image is separated. In a color image, as shown in FIG. 2, the original image is generally separated into components 110 according to, for example, the RGB primary color system. Each component 110 of the image is further divided into tiles 111, which are rectangular areas (in the example of FIG. 2, each component 110 is divided into 4 × 4, that is, 16 rectangular tiles 111 in total). The individual tiles 111, for example R00, R01, ..., R15 / G00, G01, ..., G15 / B00, B01, ..., B15, are the basic units on which the image data compression / decompression process is executed. Therefore, the compression / decompression operation of the image data is performed independently for each component 110 and for each tile 111.
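The tile division described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a component stored as a list of rows whose dimensions are exact multiples of the tile size; the function name split_into_tiles and the 8 × 8 sample component are hypothetical.

```python
def split_into_tiles(component, tile_h, tile_w):
    """Split a 2-D component (list of rows) into tiles, returned in raster order."""
    tiles = []
    for top in range(0, len(component), tile_h):
        for left in range(0, len(component[0]), tile_w):
            # Each tile is an independent copy of a rectangular block of samples.
            tile = [row[left:left + tile_w] for row in component[top:top + tile_h]]
            tiles.append(tile)
    return tiles


# An 8 x 8 component cut into 2 x 2 tiles gives a 4 x 4 grid of 16 tiles,
# mirroring the 16 tiles per component in the FIG. 2 example.
component = [[y * 8 + x for x in range(8)] for y in range(8)]
tiles = split_into_tiles(component, 2, 2)
```

Each element of tiles can then be compressed or decompressed without reference to its neighbors, which is the independence the paragraph describes.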
[0052]
When image data is encoded (see FIG. 1), the data of each tile 111 of each component 110 is input to the color space conversion / inverse conversion unit 100 and undergoes color space conversion. The two-dimensional wavelet transform / inverse transform unit 101 then applies a two-dimensional wavelet transform (forward transform) to divide the data spatially into frequency bands.
[0053]
FIG. 3 is a schematic diagram showing the subbands at each decomposition level when the number of decomposition levels is three. The two-dimensional wavelet transform / inverse transform unit 101 applies a two-dimensional wavelet transform to the tile image obtained by tile division of the image (decomposition level 0 (120): 0LL) and separates it into the subbands shown at decomposition level 1 (121) (1LL, 1HL, 1LH, 1HH). Next, the two-dimensional wavelet transform / inverse transform unit 101 applies a two-dimensional wavelet transform to the low-frequency component 1LL of this level and separates it into the subbands shown at decomposition level 2 (122) (2LL, 2HL, 2LH, 2HH). In the same manner, the two-dimensional wavelet transform / inverse transform unit 101 then applies a two-dimensional wavelet transform to the low-frequency component 2LL and separates it into the subbands shown at decomposition level 3 (123) (3LL, 3HL, 3LH, 3HH). In FIG. 3, the subbands to be encoded at each decomposition level are shown in gray. For example, when the number of decomposition levels is 3, the subbands shown in gray (3HL, 3LH, 3HH, 2HL, 2LH, 2HH, 1HL, 1LH, 1HH) are to be encoded, and the 3LL subband is not encoded.
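The recursive structure described above, in which only the LL band of one level is decomposed again at the next level, can be tabulated with a short Python sketch. It assumes a square tile whose side is divisible by 2 raised to the number of levels; the function name decomposition_subbands is hypothetical, and the 16-sample tile side is chosen only for illustration.

```python
def decomposition_subbands(tile_size, levels):
    """Map each subband name (e.g. '2HL') to its (height, width) in samples."""
    result = {}
    size = tile_size
    for level in range(1, levels + 1):
        size //= 2  # each level halves the LL band of the previous level
        for band in ("LL", "HL", "LH", "HH"):
            result[f"{level}{band}"] = (size, size)
    return result


bands = decomposition_subbands(16, 3)
# The subbands drawn in gray in the description are the HL, LH, and HH bands of every level.
gray = [name for name in bands if not name.endswith("LL")]
```

For a 16 × 16 tile this gives 8 × 8 bands at level 1, 4 × 4 at level 2, and 2 × 2 at level 3, and the gray list contains the nine names enumerated in the paragraph.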
[0054]
Next, in the quantization / inverse quantization unit 102 (see FIG. 1), after the bits to be encoded are determined in the designated encoding order, a context is generated from the bits around the target bits. The wavelet coefficients that have undergone the quantization process are divided into non-overlapping rectangles called “precincts” for each subband. This was introduced to use memory efficiently in implementation. Here, FIG. 4 is an explanatory view showing a precinct. As shown in FIG. 4, one precinct consists of three rectangular regions that are spatially matched. Furthermore, each precinct is divided into rectangular “code blocks” that do not overlap. This is a basic unit for entropy coding.
[0055]
The coefficient values after the wavelet transform can be quantized and encoded as they are. In JPEG2000, however, in order to increase the encoding efficiency, the coefficient values can be decomposed into "bit plane" units, and the bit planes can be ranked for each pixel or code block.
[0056]
Here, FIG. 5 is an explanatory diagram showing an example of a procedure for ranking the bit planes. As shown in FIG. 5, in this example the original image (32 × 32 pixels) is divided into four tiles of 16 × 16 pixels, and the sizes of the precinct and the code block at decomposition level 1 are 8 × 8 pixels and 4 × 4 pixels, respectively. The precincts and the code blocks are numbered in raster order; in this example, the precincts are numbered 0 to 3 and the code blocks are numbered 0 to 3. A mirroring method is used for pixel extension beyond the tile boundary, the wavelet transform is performed with the reversible (5, 3) filter, and the wavelet coefficient values of decomposition level 1 are obtained.
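To make the reversible (5, 3) filter with mirroring more concrete, here is a minimal one-dimensional Python sketch in lifting form. It is an illustration under stated assumptions, not the implementation of the embodiment: it handles only even-length integer sequences, the function names are hypothetical, and a two-dimensional transform would apply the same steps along the rows and then the columns of a tile.

```python
def dwt53_forward(x):
    """One level of the reversible 5/3 lifting transform; returns (low, high)."""
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("this sketch handles even-length inputs only")
    half = n // 2
    high = []
    for i in range(half):
        left = x[2 * i]
        # Mirror the signal at the right edge when the neighbor is out of range.
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        high.append(x[2 * i + 1] - (left + right) // 2)
    low = []
    for i in range(half):
        prev = high[i - 1] if i > 0 else high[0]  # mirror at the left edge
        low.append(x[2 * i] + (prev + high[i] + 2) // 4)
    return low, high


def dwt53_inverse(low, high):
    """Undo dwt53_forward exactly by reversing the two lifting steps."""
    half = len(low)
    n = 2 * half
    x = [0] * n
    for i in range(half):
        prev = high[i - 1] if i > 0 else high[0]
        x[2 * i] = low[i] - (prev + high[i] + 2) // 4
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = high[i] + (left + right) // 2
    return x


row = [10, 12, 14, 200, 202, 15, 13, 11]
low, high = dwt53_forward(row)
restored = dwt53_inverse(low, high)
```

Every step uses integer arithmetic, and the inverse subtracts exactly what the forward pass added, so the original samples are recovered bit for bit.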
[0057]
An explanatory diagram showing an example of the concept of a typical “layer” configuration for tile 0 / precinct 3 / code block 3 is also shown in FIG. The converted code block is divided into subbands (1LL, 1HL, 1LH, 1HH), and wavelet coefficient values are assigned to the subbands.
[0058]
The layer structure is easy to understand when the wavelet coefficient values are viewed from the horizontal direction (bit plane direction). One layer consists of an arbitrary number of bit planes. In this example, layers 0, 1, 2, and 3 consist of 1, 3, 1, and 3 bit planes, respectively. A layer containing bit planes closer to the LSB (Least Significant Bit) is subjected to quantization earlier; conversely, a layer closer to the MSB (Most Significant Bit) remains unquantized until the end. Discarding layers starting from those close to the LSB is called truncation, and it allows the quantization rate to be finely controlled.
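The effect of truncation can be shown on a single coefficient magnitude. The Python sketch below assumes 8-bit magnitudes and assumes that layer 0 holds the bit plane nearest the MSB, so that the last layer holds the three planes nearest the LSB; neither assumption is stated explicitly in the description, and the names LAYER_PLANES and truncate_layers are hypothetical.

```python
LAYER_PLANES = [1, 3, 1, 3]  # bit planes per layer, from the example (MSB side first, assumed)


def truncate_layers(magnitude, layers_kept, layer_planes=LAYER_PLANES):
    """Keep the first `layers_kept` layers and zero the bit planes of the discarded ones."""
    total = sum(layer_planes)
    kept = sum(layer_planes[:layers_kept])
    dropped = total - kept  # number of least significant planes thrown away
    return (magnitude >> dropped) << dropped


coefficient = 0b10110111
after_one_layer_dropped = truncate_layers(coefficient, 3)
```

Dropping one more layer at a time removes 3, then 1, then 3 further bit planes, which is how the layer boundaries give stepwise control over the quantization rate.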
[0059]
The entropy encoding / decoding unit 103 (see FIG. 1) performs encoding on each tile 111 of each component 110 by probability estimation from the context and the target bit. In this way, the encoding process is performed in units of tiles 111 for all the components 110 of the image.
[0060]
Finally, the tag processing unit 104 (see FIG. 1) combines all the encoded data from the entropy encoding / decoding unit 103 into a single code stream (code string data) and adds tags to it. Here, FIG. 6 is a schematic diagram showing an example of the structure of the code stream. Tag information called headers (a main header and tile part headers) is added to the beginning of the code stream and to the beginning of each of the partial tiles (tile parts) that make up each tile 111, and the encoded data (bit stream) of each tile 111 follows. Tag information (end of codestream) is again added at the end of the code stream.
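The ordering of the code stream described above can be modeled with a toy Python structure. This is a conceptual sketch only: it records the order of the segments as labeled tuples and does not reproduce the actual JPEG2000 marker syntax; the function names and the sample payloads are hypothetical.

```python
def build_codestream(main_header, tile_parts):
    """Assemble segments in the described order: main header, then for each tile part
    a tile part header followed by its bit stream, then the end-of-codestream tag."""
    segments = [("main_header", main_header)]
    for tile_header, bit_stream in tile_parts:
        segments.append(("tile_part_header", tile_header))
        segments.append(("bit_stream", bit_stream))
    segments.append(("end_of_codestream", b""))
    return segments


def tile_part_headers(segments):
    """Collect only the tile part headers, skipping the encoded data entirely."""
    return [payload for kind, payload in segments if kind == "tile_part_header"]


stream = build_codestream(b"MH", [(b"T0", b"data0"), (b"T1", b"data1")])
kinds = [kind for kind, _ in stream]
```

Because the headers are separate segments, a reader can inspect them without touching the bit streams that follow.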
[0061]
On the other hand, at the time of decoding, contrary to the time of encoding, image data is generated from the code stream of each tile 111 of each component 110. In this case, as shown in FIG. 1, the tag processing unit 104 interprets the tag information added to a code stream (code string data) input from the outside, decomposes the code stream into the code streams of the tiles 111 of each component 110, and decoding processing (decompression processing) is performed for each code stream of each tile 111 of each component 110. At this time, the positions of the bits to be decoded are determined in the order based on the tag information in the code stream, and the quantization / inverse quantization unit 102 generates a context from the arrangement of the peripheral bits (already decoded) around the target bit position. The entropy encoding / decoding unit 103 then performs decoding by probability estimation from the context and the code stream, generates the target bit, and writes it at the position of the target bit. Since the data decoded in this way is spatially divided into frequency bands, each tile 111 of each component 110 in the image data is restored by performing a two-dimensional inverse wavelet transform in the two-dimensional wavelet transform / inverse transform unit 101. The restored data is converted into data of the original color system by the color space conversion / inverse conversion unit 100. Here, the decompression means is executed.
[0062]
Next, a configuration example of the multifunction machine 1 that is the image processing apparatus according to the present embodiment will be described. The multifunction device 1 of the present embodiment has complex functions such as a copying function, a printer function, a scanner function, a facsimile function, and an image server function.
[0063]
FIG. 7 is a longitudinal sectional view schematically showing the multifunction machine 1 according to the present embodiment. The multifunction machine 1 includes a scanner 2 that is an image reading unit that reads a document image from a document, and a printer 3 that is an image forming unit that forms an image read by the scanner 2 on a recording material such as paper.
[0064]
A contact glass 5 on which a document (not shown) is placed is provided on the upper surface of the main body case 4 of the scanner 2. The document is placed with the document surface facing the contact glass 5. On the upper side of the contact glass 5, a document pressure plate 6 (which may be a so-called ADF) for pressing a document placed on the contact glass 5 is provided.
[0065]
Below the contact glass 5, a reading optical system 7 for optically reading a document image is provided. The reading optical system 7 includes a first traveling body 10 on which a light source 8 that emits light and a mirror 9 are mounted, a second traveling body 13 on which two mirrors 11 and 12 are mounted, a CCD (Charge Coupled Device) image sensor 15 that receives, via an imaging lens 14, the light guided by the mirrors 9, 11, and 12, and the like. The CCD image sensor 15 functions as a photoelectric conversion element that generates photoelectric conversion data by photoelectrically converting the reflected light from the document that forms an image on the CCD image sensor 15. The photoelectric conversion data is a voltage value whose magnitude corresponds to the intensity of the reflected light from the document. The first and second traveling bodies 10 and 13 are provided so as to reciprocate freely along the contact glass 5, and during the document image reading operation described later they are moved in the sub-scanning direction at a speed ratio of 2:1 by a moving device such as a motor (not shown). The reading optical system 7 thereby performs exposure scanning of the document reading area. In the present embodiment, a fixed-document type in which the reading optical system 7 performs the scanning is shown, but a moving-document type in which the position of the reading optical system 7 is fixed and the document moves may also be used.
[0066]
The printer 3 includes a recording material path 20 that extends from a recording material holding unit 16 that holds a recording material such as sheet-like paper to a discharge unit 19 via an electrophotographic printer engine 17 and a fixing device 18.
[0067]
The printer engine 17 uses a photosensitive member 21, a charger 22, an exposure unit 23, a developing unit 24, a transfer unit 25, a cleaner 26, and the like to form a toner image on the photosensitive member 21 by electrophotography and to transfer that toner image onto the recording material. The transferred toner image is fixed on the recording material by the fixing device 18. In this embodiment, the printer engine 17 forms an image by an electrophotographic method, but the present invention is not limited to this. For example, the image may be formed by any of various image forming methods such as an ink jet method, a sublimation type thermal transfer method, or a direct thermal recording method.
[0068]
Such a multifunction machine 1 is controlled by a control system including a plurality of microcomputers. FIG. 8 is a block diagram schematically showing the electrical connections of the part of this control system that relates to image processing. The control system includes a CPU 30, a ROM 31, a RAM 32, an operation panel 33, an IPU (Image Processing Unit) 34, an I / O port 35, a communication control unit 36, and the like connected by a bus 37. The CPU 30 performs various calculations and centrally controls processing such as image processing. The ROM 31 stores various programs and fixed data related to the processing executed by the CPU 30. The RAM 32 functions as a work area for the CPU 30 and also as a memory that temporarily stores image data (for example, image files). The operation panel 33 is provided with a display device such as an LCD (Liquid Crystal Display) and a plurality of operation keys such as hard keys and a touch panel (none of which are shown), and functions as an operation unit. The IPU 34 includes hardware related to various kinds of image processing. The ROM 31 includes a nonvolatile memory such as an EEPROM or a flash memory, and the program stored in the ROM 31 can be rewritten, under the control of the CPU 30, with a program downloaded from an external device (not shown) via the I / O port 35. In the present embodiment, the ROM 31 functions as a storage medium for storing the program. The communication control unit 36 has a function of transmitting and receiving data between the multifunction device 1 and an external device (not shown) via a network or the like, and provides a facsimile modem function, a network control function for connecting to a public telephone line network, a LAN (Local Area Network) control function, and the like.
[0069]
Next, an overview of image processing in the multifunction machine 1 of the present embodiment will be described with reference to FIG. 9. FIG. 9 is a functional block diagram for explaining an overview of image processing in the multifunction machine 1. The image processing of the multifunction machine 1 involves an area dividing unit 40 that divides an original image read by the scanner 2 into a plurality of areas; an additional information generation unit 41 that executes area identification processing for identifying the divided areas based on area attributes and character recognition processing for recognizing characters from the images of the areas; a compression unit 42 having the functional blocks described with reference to FIG. 1; and an additional information embedding unit 43 that embeds the region identification information generated by the area identification processing, the character recognition information generated by the character recognition processing, and the like as additional information at a predetermined embedding position of the code stream, which is the compressed data.
[0070]
As character recognition processing for recognizing characters from an image, for example, OCR (Optical Character Recognition) processing is used. Further, the additional information generation unit 41 may execute an inclination detection process for detecting the inclination angle of the original image read by the scanner 2. Thereby, tilt detection information such as tilt angle correction information, which is a tilt detection result, is obtained. Therefore, the additional information embedding unit 43 may embed the inclination detection information as additional information.
[0071]
The image processing of the multifunction device 1 basically performs OCR processing on the image data of each area divided from the image data of the original image read by the scanner 2, and then generates a code stream by compression-encoding the image data with the JPEG2000 algorithm. That is, the image is divided into one or a plurality of rectangular areas (tiles 111), and the pixel values of each rectangular area are subjected to a discrete wavelet transform and hierarchically encoded. At this time, area identification information, character recognition information, inclination detection information, and the like are embedded at a predetermined embedding position of the code stream. The functions of the area dividing unit 40, the additional information generating unit 41, the compressing unit 42, the additional information embedding unit 43, and the like are realized by image processing that the CPU 30 performs based on a program stored in the ROM 31. However, the present invention is not limited to this, and the functions may instead be realized by image processing performed in hardware using the IPU 34 or the like.
[0072]
Here, the area attributes include various attributes such as a character area, a photograph area, a figure area, a table area, a black solid area, and a background area. The background area is a margin area corresponding to the margin of the document. However, the background area is not limited to this. For example, the background area may also include inter-line areas corresponding to the spaces between lines.
[0073]
On the other hand, the additional information includes, for example, character recognition information (character recognition result information) generated by OCR processing, region identification information (region identification result information) such as coordinate information and region attribute information generated by region identification processing, and tilt detection information (tilt detection result information) such as tilt angle correction information generated by the tilt detection process. Character recognition information includes character code, character color, font size, font information, character coordinate position information, certainty information (probability information), title (character code), and the like. The title is a character code obtained by extracting a title area by area identification processing and performing OCR processing on the title area. For example, a title such as “2002 meeting minutes.j2k” is embedded as character codes at a predetermined embedding position.
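To make the kinds of additional information listed above concrete, here is a minimal Python sketch of one possible record layout. The class names, field names, and the JSON serialization are assumptions for illustration only; the text does not specify how the additional information is encoded before embedding.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RecognizedCharacter:
    code: str          # character code from OCR
    x: int             # character coordinate position
    y: int
    font_size: int
    certainty: float   # certainty (probability) of the recognition result


@dataclass
class RegionInfo:
    attribute: str     # e.g. "character", "photo", "figure", "table", "background"
    box: tuple         # (left, top, width, height) coordinate information
    characters: list = field(default_factory=list)


@dataclass
class AdditionalInfo:
    regions: list = field(default_factory=list)
    tilt_angle_deg: float = 0.0   # tilt angle correction information
    title: str = ""               # title obtained by OCR on the title area


def to_bytes(info):
    """Serialize the record to bytes so that it can be embedded (hypothetical format)."""
    return json.dumps(asdict(info), ensure_ascii=False).encode("utf-8")
```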
[0074]
The predetermined embedding position is, for example, a header such as the main header or a tile part header (see FIG. 6), the least significant bit of the image (which allows embedding without increasing the image size), or the least significant bit corresponding to a character area of the image. In addition, when the image size is not an integral multiple of the tile size, information is added to make the image an integral multiple of the tile size, and the additional information can also be embedded in that added portion.
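The size of the added portion mentioned above follows from simple arithmetic, sketched below in Python. The function name `padded_size` is hypothetical, and the sketch assumes padding is added on the right and bottom edges only.

```python
def padded_size(width, height, tile_w, tile_h):
    """Return the image size rounded up to an integral multiple of the tile size."""
    pad_w = -width % tile_w    # columns that must be added (0 if already a multiple)
    pad_h = -height % tile_h   # rows that must be added
    return width + pad_w, height + pad_h
```

For a 100 × 70 pixel image and 16 × 16 pixel tiles this gives 112 × 80 pixels, so 12 columns and 10 rows of added information would be available as an embedding area.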
[0075]
Next, image processing executed by the CPU 30 of the multifunction device 1 based on a program will be described. Here, for example, image processing for storing image data in order to use the multifunction device 1 as an image server will be described with reference to FIGS. 10 and 11. FIG. 10 is a flowchart schematically showing the flow of image processing according to the present embodiment, and FIG. 11 is an explanatory diagram schematically showing the embedded position of additional information by the image processing.
[0076]
As shown in FIG. 10, the process first waits for the scanner 2 to read a document image (N in step S1). When the operator opens the document pressure plate 6 of the scanner 2, sets a document on the contact glass 5, closes the document pressure plate 6, and presses the copy start key on the operation panel 33, the scanner 2 scans with the reading optical system 7 and reads the original image from the document set on the contact glass 5.
[0077]
When the original image is read from the document by the scanner 2 (Y in S1), the read original image is divided into a plurality of areas, and the attributes of the areas mixed in the original image are identified based on area attributes such as a character area, a photo area, a figure area, a table area, and a background area (S2). Here, the area dividing means or the area dividing function is executed, and the additional information generating means or the additional information generating function is executed. A conventional method, for example a region identification method using the density of black runs, is sufficient for region identification; since such methods are known, their description is omitted. By identifying the areas, region identification information such as coordinate information and region attribute information is obtained for each of them. Note that in the present embodiment the image processing is performed on the document image read by the scanner 2, but the present invention is not limited to this. For example, the image processing of the present embodiment may be performed on an original image received via a network.
[0078]
Next, OCR processing is executed for the plurality of areas identified and divided (S3). That is, characters are recognized from the image of each of the areas. Here, the additional information generating means or the additional information generating function is executed. OCR methods include a pattern matching method, a structure analysis method, and the like. Since these methods are publicly known, their description is omitted. By executing the OCR process, character recognition information such as the character code, the character color, the font size, the font information, the character coordinate position information, and the certainty information is obtained.
[0079]
Next, the original image is compressed based on the JPEG2000 algorithm (S4). As a result, a code stream is generated from the original image based on the JPEG2000 algorithm. Then, the area identification information and the character recognition information are embedded as additional information at a predetermined embedding position of the code stream (S5). Here, the additional information embedding means or the additional information embedding function is executed. For example, as shown in FIG. 11, the additional information is embedded in the least significant bit of the layer in the code stream. Thereby, it is possible to embed without increasing the image size. The region R in which the additional information is embedded is a region corresponding to the character region of the original image.
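The idea of embedding in the least significant bit without increasing the data size can be sketched as follows. This Python example is a simplification: it operates on a plain list of integer coefficients for a region R rather than on the layers of an actual code stream, and the names `embed_lsb` and `extract_lsb` are hypothetical.

```python
def embed_lsb(coeffs, payload):
    """Overwrite the least significant bit of successive coefficients with payload bits."""
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    if len(bits) > len(coeffs):
        raise ValueError("region too small for payload")
    out = list(coeffs)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit   # clear the LSB, then set it to the payload bit
    return out


def extract_lsb(coeffs, num_bytes):
    """Read num_bytes back from the least significant bits, MSB of each byte first."""
    result = bytearray()
    for b in range(num_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (coeffs[8 * b + i] & 1)
        result.append(value)
    return bytes(result)
```

The number of coefficients is unchanged and each value moves by at most 1, which is why this kind of embedding does not enlarge the image data.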
[0080]
Here, the additional information is embedded in the least significant bit of the region R corresponding to the character region of the image, but the present invention is not limited to this. For example, it may simply be embedded in the least significant bit. In addition, both the area identification information and the character recognition information are embedded at the predetermined embedding position as additional information, but the present invention is not limited to this. For example, only one of the area identification information and the character recognition information may be embedded at the predetermined embedding position. Alternatively, when tilt detection processing is executed, tilt detection information such as tilt angle correction information obtained as the tilt detection result may be embedded at a predetermined embedding position. Further, steps S4 and S5 may be executed simultaneously so that the original image is compressed while the additional information is embedded. Since the original image is compressed at a plurality of resolutions based on the JPEG2000 algorithm, OCR processing may also be executed for each resolution and the resulting character recognition information embedded at a predetermined embedding position.
[0081]
Finally, the image data (compressed data) in which the additional information is embedded is stored in the RAM 32 as an image file (S6). Here, when there are a plurality of documents, the processing from steps S1 to S6 is repeated for each document, and a plurality of image files are stored in the RAM 32 for each document.
[0082]
Thereafter, the image data stored as an image file in the RAM 32 may be transmitted to an external device via a network by the communication control unit 36 at a predetermined timing, for example. At this time, when image data of an image having a character area is to be searched for among the plurality of image data stored as image files in the RAM 32, for example in order to transmit only such image data, the CPU 30 only needs to detect a character code in the image data, so the search can be performed easily. Likewise, when image data having similar images is to be searched for among the plurality of image data stored in the RAM 32 in order to transmit only such image data, the CPU 30 only needs to detect the area identification information in the image data and determine whether the region attributes match those of the other image data, so the search can be performed easily.
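The two searches described above can be sketched in Python as follows. The sketch assumes the additional information of each stored image file has already been extracted into a dictionary with a `regions` list; that layout, the function names, and the use of the sorted attribute list as the matching criterion are illustrative assumptions.

```python
def has_character_codes(info):
    """True if any region of the image carries recognized character codes."""
    return any(region.get("characters") for region in info["regions"])


def attribute_signature(info):
    """Sorted list of region attributes, used here as a crude similarity key."""
    return sorted(region["attribute"] for region in info["regions"])


def find_with_characters(store):
    """Names of stored image files that have a character area."""
    return [name for name, info in store.items() if has_character_codes(info)]


def find_similar(store, query_name):
    """Names of stored image files whose region attributes match the query image."""
    target = attribute_signature(store[query_name])
    return [name for name, info in store.items()
            if name != query_name and attribute_signature(info) == target]
```

Neither search decodes any pixel data; both work only on the embedded additional information.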
[0083]
Note that the image of the image data may be subjected to image processing such as scaling (enlargement / reduction), rotation, and black / white reversal. In such image processing, the character image in the character area M is scaled by changing the font size in the character area M, and because the compression means of the JPEG2000 algorithm holds the image at various resolutions, the image can be freely changed from high image quality to low image quality. Furthermore, when an original image is displayed on a display device or the like, the image can be expanded in accordance with the resolution or the like of the display device.
[0084]
As described above, in this embodiment, the original image is divided into a plurality of regions, the attributes of the regions are identified to generate region identification information (S2), character recognition information is generated by executing the OCR process on the divided regions (S3), and the generated region identification information and character recognition information are embedded at a predetermined embedding position of the code stream (S5). As a result, for example, when required image data is searched for among a plurality of image data (compressed data), the search can be performed easily using the additional information. That is, by embedding additional information as required, image data search and management, processing of image data, and the like can be performed easily. Further, since additional information corresponding to each of the regions of the original image, such as area identification information and character recognition information, is available, a necessary region such as a character area or a photo area can easily be found in the image and extracted. Furthermore, by using the compression means and decompression means of the JPEG2000 algorithm, various kinds of image processing that exploit the characteristics of JPEG2000 can be executed.
[0085]
A second embodiment of the present invention will be described with reference to FIG. FIG. 12 is a flowchart schematically showing a flow of image processing according to the present embodiment. Parts identical to those described above are denoted by the same reference numerals.
[0086]
The basic configuration of the present embodiment is the same as that of the first embodiment; it differs from the first embodiment in the image processing that the CPU 30 of the multifunction device 1 executes based on the program.
[0087]
Image processing executed by the CPU 30 of the multifunction machine 1 according to the present embodiment based on a program will be described. Here, for example, image processing for storing image data in order to use the multifunction device 1 as an image server will be described.
[0088]
As shown in FIG. 12, the process first waits for the scanner 2 to read an original image (N in step S11). When the operator opens the document pressure plate 6 of the scanner 2, sets a document on the contact glass 5, closes the document pressure plate 6, and presses the copy start key on the operation panel 33, the scanner 2 scans with the reading optical system 7 and reads the original image from the document set on the contact glass 5.
[0089]
When the original image is read from the document by the scanner 2 (Y in S11), the read original image is subjected to tiling and divided into a plurality of tiles 111 (a plurality of areas: see FIG. 1) (S12). Here, the area dividing means or the area dividing function is executed. This tiling process is executed based on the JPEG2000 algorithm.
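The tiling step (S12) can be sketched in Python as below. The sketch assumes the image is a list of pixel rows and returns the tiles in raster order together with their upper-left coordinates; the function name `split_into_tiles` is hypothetical.

```python
def split_into_tiles(image, tile_w, tile_h):
    """Divide an image (list of rows) into rectangular tiles in raster order.

    Returns a list of (x0, y0, tile) where tile is a list of rows.
    Tiles on the right or bottom edge may be smaller than tile_w x tile_h.
    """
    height = len(image)
    width = len(image[0])
    tiles = []
    for y0 in range(0, height, tile_h):
        for x0 in range(0, width, tile_w):
            tile = [row[x0:x0 + tile_w] for row in image[y0:y0 + tile_h]]
            tiles.append((x0, y0, tile))
    return tiles
```

For example, a 32 × 32 pixel image with 16 × 16 pixel tiles yields four tiles, each of which can then be passed separately to OCR processing and compression.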
[0090]
Next, OCR processing is executed for the plurality of divided tiles 111 (S13). That is, a character is recognized from the image for each of the plurality of tiles 111. Here, the additional information generating means or the additional information generating function is executed. The OCR process includes a pattern matching method, a structure analysis method, and the like. Since these methods are publicly known, description thereof is omitted. Here, by executing the OCR process, character recognition information such as a character code, a character color, a font size, font information, character coordinate position information, and certainty information (certainty information) is obtained for each tile 111.
[0091]
Next, the original image is compressed based on the JPEG2000 algorithm (S14). As a result, a code stream is generated from the original image based on the JPEG2000 algorithm. Then, character recognition information is embedded as additional information at a predetermined embedding position of the code stream (S15). Here, the additional information embedding means or the additional information embedding function is executed. For example, the additional information for each tile 111 is embedded in the corresponding tile part header. Since the tiling process is executed in step S12, it is not necessary to execute the tiling process in step S14.
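One way to carry per-tile additional information in a tile part header is the JPEG 2000 comment (COM) marker segment, which the standard allows in both the main header and tile part headers. The text does not state which header field is used, so the choice of the COM segment here is an assumption, and the helper names are hypothetical. The Python sketch also omits updating the tile-part length field (Psot) of the SOT segment, which a real encoder would need to do after inserting bytes.

```python
import struct

COM = 0xFF64  # comment marker
SOD = 0xFF93  # start of data: marks the end of a tile part header


def com_segment(payload):
    """Build a COM marker segment carrying binary payload (Rcom = 0)."""
    length = 2 + 2 + len(payload)   # Lcom counts itself, Rcom and the data
    if length > 0xFFFF:
        raise ValueError("payload too large for one COM segment")
    return struct.pack(">HHH", COM, length, 0) + payload


def insert_before_sod(tile_part_header, segment):
    """Insert a marker segment just before the SOD marker that ends the header."""
    if tile_part_header[-2:] != struct.pack(">H", SOD):
        raise ValueError("header must end with the SOD marker")
    return tile_part_header[:-2] + segment + tile_part_header[-2:]


def read_com_payloads(tile_part_header):
    """Walk the marker segments of a tile part header and collect COM payloads."""
    payloads, pos = [], 0
    while pos + 4 <= len(tile_part_header):
        marker, length = struct.unpack_from(">HH", tile_part_header, pos)
        if marker == SOD:
            break
        if marker == COM:
            payloads.append(tile_part_header[pos + 6:pos + 2 + length])
        pos += 2 + length           # skip the marker and the whole segment
    return payloads
```

Because the payload sits in the header, a reader can recover it by walking the marker segments alone, without decoding the bit stream of the tile.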
[0092]
Here, the character recognition information is embedded as the additional information at a predetermined embedding position, but the present invention is not limited to this. For example, when the inclination detection process is executed, inclination detection information such as inclination angle correction information obtained as the inclination detection result may be embedded at a predetermined embedding position. Further, step S14 and step S15 may be executed simultaneously, and the image may be compressed while embedding additional information.
[0093]
Finally, the image data (compressed data) in which the additional information is embedded is stored in the RAM 32 as an image file (S16). Here, when there are a plurality of documents, the processing from steps S11 to S16 is repeated for each document, and a plurality of image files are stored in the RAM 32 for each document.
[0094]
Thereafter, the image data stored as an image file in the RAM 32 may be transmitted to an external device via a network by the communication control unit 36 at a predetermined timing, for example. At this time, when image data of an image having a character area is to be searched for among the plurality of image data stored as image files in the RAM 32, for example in order to transmit only such image data, the CPU 30 only needs to detect a character code in the image data, so the search can be performed easily.
[0095]
As described above, in the present embodiment, the original image is tiled and divided into a plurality of tiles 111 (a plurality of areas) (S12), character recognition information is generated by executing the OCR process on the divided tiles 111 (S13), and the generated character recognition information is embedded at a predetermined embedding position of the code stream (S15). As a result, for example, when required image data is searched for among a plurality of image data (compressed data), the search can be performed easily using the additional information. That is, by embedding additional information as required, image data search and management, processing of image data, and the like can be performed easily. Further, since additional information corresponding to each tile 111 of the original image, such as character recognition information, is available, a necessary region, for example a character region, can easily be found in the image and extracted. Furthermore, by using the compression means and decompression means of the JPEG2000 algorithm, various kinds of image processing that exploit the characteristics of JPEG2000 can be executed.
[0096]
In each embodiment, the multifunction device 1 is used as the image processing apparatus. However, the present invention is not limited to this. For example, a personal computer or the like may be used. In this case, the personal computer includes a CPU, a ROM, a RAM, an HDD (Hard Disk Drive) storing various programs, a CD-ROM drive, a scanner, a communication control device that communicates with external devices via a network to exchange information, a display device that displays the progress and results of processing to the operator, input devices such as a keyboard and a mouse, and the like. Here, the HDD functions as a storage medium for storing the program related to the image processing described above.
[0097]
Generally, a program to be installed in the HDD of a personal computer is recorded on an optical information recording medium such as a CD-ROM or DVD-ROM or on a magnetic medium such as an FD, and the recorded program is installed in the HDD. Therefore, a portable storage medium such as an optical information recording medium (for example, a CD-ROM) or a magnetic medium (for example, an FD) can also be a storage medium that stores the program related to the image processing described above. Furthermore, such a program may be taken in from outside via a communication control device, for example, and installed in the HDD.
The image processing apparatus of this embodiment, which generates a code stream by compressing an original image through two-dimensional wavelet transform, quantization, and encoding, includes area dividing means that divides the original image into a plurality of areas, additional information generating means that generates additional information related to the original image from the images of the areas divided by the area dividing means, and additional information embedding means that embeds the additional information generated by the additional information generating means in the code stream. The image data of the original image therefore carries additional information, so that, for example, when required image data is searched for among a plurality of image data, the search can be performed easily using the additional information. That is, by embedding additional information as required, image data search and management, processing of image data, and the like can be performed easily.
In the image processing apparatus of this embodiment, the additional information generation unit generates character recognition information as the additional information by recognizing characters from the images of the plurality of regions. Therefore, for example, when image data having a character area is searched for among a plurality of image data, the search can be performed easily using the character recognition information. That is, by using the character recognition information, image data can be searched and managed easily.
In the image processing apparatus of this embodiment, the additional information generation unit generates region identification information as the additional information by identifying the region attributes of the plurality of regions. Therefore, for example, when image data having a photo area is searched for among a plurality of image data, the search can be performed easily using the region identification information. That is, by using the region identification information, image data can be searched and managed easily.
In the image processing apparatus of this embodiment, the additional information generation unit generates inclination detection information as the additional information by detecting the inclination of the images of the plurality of regions. Therefore, for example, when the image is displayed on a display device or printed on paper, the inclination of the image can be corrected easily using the inclination detection information. That is, by using the inclination detection information, the image data can be processed easily.
In the image processing apparatus of this embodiment, the additional information embedding unit embeds the additional information in the main header of the code stream. Therefore, for example, when required image data is searched for among a plurality of image data, the search can use the additional information at the early stage of reading the main header, and as a result the processing time can be shortened.
In the image processing apparatus of this embodiment, the additional information embedding unit embeds the additional information in the tile part header of the code stream. Therefore, for example, when required image data is searched for among a plurality of image data in which the additional information for each tile (each region) is embedded in the corresponding tile part header, the search can be performed easily using the additional information in the tile part headers.
In the image processing apparatus of this embodiment, the additional information embedding unit embeds the additional information in the least significant bit of the layer in the code stream, so that the additional information can be embedded without increasing the image size.
The image processing apparatus of this embodiment is provided with a reading optical system that optically reads the original image from a document. Therefore, the original image can be read from the document, and various processes such as image processing can be executed on it.
The image processing apparatus of this embodiment is provided with decompression means that decompresses the compressed original image through decoding, inverse quantization, and two-dimensional inverse wavelet transform. Therefore, by using the decompression means of the JPEG2000 algorithm, an image compressed by the JPEG2000 algorithm can be expanded in a way that exploits the characteristics of JPEG2000, and as a result the expanded image can be displayed on a display device or the like, printed on paper, and so on.
The image processing apparatus of this embodiment includes a printer engine that forms the image expanded by the decompression means on a recording material. Therefore, the expanded image can be formed on a recording material such as paper.
The program of this embodiment is interpreted by a computer included in an image processing apparatus that generates a code stream by compressing an original image through two-dimensional wavelet transform, quantization, and encoding, and causes the computer to execute an area dividing function that divides the original image into a plurality of regions, an additional information generating function that generates additional information related to the original image from the images of the regions divided by the area dividing function, and an additional information embedding function that embeds the additional information generated by the additional information generating function in the code stream. The image data of the original image therefore carries additional information, so that, for example, when required image data is searched for among a plurality of image data, the search can be performed easily using the additional information. That is, by embedding additional information as required, image data search and management, processing of image data, and the like can be performed easily.
In the program of this embodiment, the additional information generating function generates character recognition information as the additional information by recognizing characters in the images of the plurality of areas. For example, when searching a plurality of image data for image data having a character area, the search can easily be performed using the character recognition information. In other words, the character recognition information makes it easy to search and manage image data.
In the program of this embodiment, the additional information generating function generates area identification information as the additional information by identifying the area attributes of the plurality of areas. Because the area identification information is embedded in the code stream, a search of a plurality of image data for image data having, for example, a photograph area can easily be performed based on the area identification information. In other words, the area identification information makes it easy to search and manage image data.
In the program of this embodiment, the additional information generating function generates tilt detection information as the additional information by detecting the tilt of the images of the plurality of areas. Because the tilt detection information is embedded in the code stream, the tilt of the image can easily be corrected using the tilt detection information when, for example, the image is displayed on a display device or printed on paper. In other words, the tilt detection information makes it easy to process the image data.
In the program of this embodiment, the additional information embedding function embeds the additional information in the main header of the code stream. For example, when searching a plurality of image data for required image data, the search can use the additional information in the main header at the early stage at which the main header is read, which shortens the processing time.
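Why reading only the main header is fast can be sketched as follows. This passage does not state which marker segment carries the additional information; the sketch assumes the JPEG2000 comment (COM) marker segment and a hypothetical `attribute=text` payload format. The marker codes are those of JPEG2000 Part 1, while the function names are illustrative only.

```python
import struct

SOC = 0xFF4F  # start of code stream
SOT = 0xFF90  # start of tile-part; the main header ends before it
COM = 0xFF64  # comment marker segment


def make_com_segment(text):
    """Build a COM marker segment carrying Latin text (Rcom = 1)."""
    payload = text.encode("iso8859_15")
    length = 2 + 2 + len(payload)  # Lcom counts itself, Rcom, and the payload
    return struct.pack(">HHH", COM, length, 1) + payload


def read_main_header_comments(stream):
    """Return the COM payloads found before the first SOT marker."""
    if struct.unpack_from(">H", stream, 0)[0] != SOC:
        raise ValueError("not a JPEG2000 code stream")
    pos, comments = 2, []
    while pos + 4 <= len(stream):
        marker, length = struct.unpack_from(">HH", stream, pos)
        if marker == SOT:
            break  # main header is finished; tile data is never touched
        if marker == COM:
            payload = stream[pos + 6:pos + 2 + length]
            comments.append(payload.decode("iso8859_15"))
        pos += 2 + length  # skip to the next marker segment
    return comments
```

The loop stops at the first SOT marker, so the cost of the search depends on the size of the main header rather than on the size of the coded image data.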
In the program of this embodiment, the additional information embedding function embeds the additional information in the tile part header of the code stream. For example, the additional information for each tile (each area) is embedded in the corresponding tile part header, so that when searching a plurality of image data for required image data, the search can easily be performed using the additional information in the tile part headers.
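A per-tile search of this kind might look like the sketch below. It again assumes, without confirmation from this passage, that the additional information sits in a COM marker segment of each tile-part header and uses a hypothetical `attribute=...` payload. The SOT layout (Lsot, Isot, Psot, TPsot, TNsot) follows JPEG2000 Part 1; the special case Psot = 0 is not handled beyond stopping the scan.

```python
import struct

SOT, SOD, COM, EOC = 0xFF90, 0xFF93, 0xFF64, 0xFFD9


def make_tile_part(tile_index, comment, data):
    """Build one tile-part: SOT segment, one COM segment, SOD, coded data."""
    body = comment.encode("iso8859_15")
    com = struct.pack(">HHH", COM, 4 + len(body), 1) + body
    psot = 12 + len(com) + 2 + len(data)  # whole tile-part, SOT included
    sot = struct.pack(">HHHIBB", SOT, 10, tile_index, psot, 0, 1)
    return sot + com + struct.pack(">H", SOD) + data


def find_tiles(tile_parts, wanted):
    """Return indices of tiles whose header comment equals `wanted`."""
    pos, hits = 0, []
    while pos + 12 <= len(tile_parts):
        marker, _, tile_index, psot, _, _ = struct.unpack_from(
            ">HHHIBB", tile_parts, pos)
        if marker != SOT or psot == 0:
            break  # no further tile-part that this sketch can step over
        cursor = pos + 12
        while struct.unpack_from(">H", tile_parts, cursor)[0] != SOD:
            seg_marker, seg_len = struct.unpack_from(">HH", tile_parts, cursor)
            if seg_marker == COM:
                text = tile_parts[cursor + 6:cursor + 2 + seg_len]
                if text.decode("iso8859_15") == wanted:
                    hits.append(tile_index)
            cursor += 2 + seg_len
        pos += psot  # jump over the coded data without decoding it
    return hits
```

Because Psot gives the length of each tile-part, the scan moves from header to header and never decodes the tile data itself.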
In the program of this embodiment, the additional information embedding function embeds the additional information in the least significant bit of the layer in the code stream, so the additional information can be embedded without increasing the image size.
The storage medium of this embodiment stores the program of this embodiment and therefore provides the same effects as that program.
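The size-preserving property of least-significant-bit embedding can be shown with a simplified sketch. It operates on a list of integer coefficients rather than on actual JPEG2000 layers or bit planes, which is a deliberate simplification; `embed_bits` and `extract_bits` are hypothetical helper names.

```python
def embed_bits(coefficients, payload):
    """Write the payload bits into the LSB of each coefficient magnitude."""
    bits = [(byte >> shift) & 1
            for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(coefficients):
        raise ValueError("payload does not fit")
    out = list(coefficients)
    for i, bit in enumerate(bits):
        sign = -1 if out[i] < 0 else 1
        out[i] = sign * ((abs(out[i]) & ~1) | bit)  # replace only the LSB
    return out


def extract_bits(coefficients, n_bytes):
    """Read n_bytes back from the LSBs, most significant bit first."""
    data = bytearray()
    for b in range(n_bytes):
        value = 0
        for c in coefficients[8 * b:8 * b + 8]:
            value = (value << 1) | (abs(c) & 1)
        data.append(value)
    return bytes(data)
```

The number of coefficients is unchanged and each magnitude changes by at most 1, so the payload is carried without adding any data to the stream.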
[0098]
[Effects of the invention]
According to the present invention, additional information related to the original image is generated and embedded in the code stream, so that the image data of the original image carries additional information. For example, when searching a plurality of image data for required image data, a code stream can be generated that is easy to search using the additional information. In other words, embedding additional information as needed yields a code stream that makes it easy to search and manage image data and to process it.
[Brief description of the drawings]
FIG. 1 is a functional block diagram for explaining an outline of a JPEG2000 algorithm.
FIG. 2 is a schematic diagram showing an example of each component obtained by dividing an original image that is a color image.
FIG. 3 is a schematic diagram showing the subbands at each decomposition level when the number of decomposition levels is 3.
FIG. 4 is an explanatory diagram showing a precinct.
FIG. 5 is an explanatory diagram showing an example of a procedure for ranking bit planes.
FIG. 6 is a schematic diagram schematically showing an example of the structure of a code stream.
FIG. 7 is a longitudinal sectional view schematically showing the multifunction peripheral according to the first embodiment of the present invention.
FIG. 8 is a block diagram schematically showing an electrical connection of a control system related to image processing in the control system of the multifunction peripheral.
FIG. 9 is a functional block diagram for explaining an overview of image processing in a multifunction peripheral.
FIG. 10 is a flowchart schematically showing a flow of image processing according to the first embodiment of the present invention.
FIG. 11 is an explanatory diagram schematically showing a position where additional information is embedded by image processing according to the first embodiment of the present invention.
FIG. 12 is a flowchart schematically showing a flow of image processing according to the second embodiment of the present invention.
[Explanation of symbols]
1 Image processing apparatus (multifunction peripheral)
7 Reading optical system
17 Printer Engine
30 Computer (CPU)
31 Storage medium (ROM)

Claims

An image processing apparatus that divides an original image into rectangular areas and compresses each rectangular area by a procedure of two-dimensional wavelet transform, quantization, and encoding to generate a code stream having encoded data for each rectangular area, the image processing apparatus comprising:
area dividing means for dividing the original image into a plurality of rectangular areas;
additional information generating means for generating, for each rectangular area divided by the area dividing means, additional information for searching that relates to the image of that rectangular area; and
additional information embedding means for embedding the additional information generated by the additional information generating means in the encoded data, in the code stream, of the rectangular area to which the additional information corresponds.

The image processing apparatus according to claim 1, wherein the additional information generation unit generates character recognition information as the additional information by recognizing a character from an image of each rectangular area.

The image processing apparatus according to claim 1, wherein the additional information generation unit generates area identification information as the additional information by identifying an area attribute for each rectangular area.

The image processing apparatus according to any one of claims 1 to 3, further comprising a reading optical system that optically reads the original image from an original.

An image processing apparatus comprising means for searching for and extracting a required rectangular area, based on the additional information for searching, from the code stream generated by the image processing apparatus according to any one of claims 1 to 4.

The image processing apparatus according to any one of claims 1 to 5, further comprising expansion means for expanding the compressed original image by a procedure of decoding, inverse quantization, and two-dimensional inverse wavelet transform.

The image processing apparatus according to claim 6, further comprising a printer engine that forms, on a recording material, the image expanded by the expansion means.

A program interpreted by a computer included in an image processing apparatus that divides an original image into rectangular areas and compresses each rectangular area by a procedure of two-dimensional wavelet transform, quantization, and encoding to generate a code stream having encoded data for each rectangular area, the program causing the computer to execute:
an area dividing function of dividing the original image into a plurality of rectangular areas;
an additional information generating function of generating, for each rectangular area divided by the area dividing function, additional information for searching that relates to the image of that rectangular area; and
an additional information embedding function of embedding the additional information generated by the additional information generating function in the encoded data, in the code stream, of the rectangular area to which the additional information corresponds.

The program according to claim 8, wherein the additional information generation function generates character recognition information as the additional information by recognizing a character from an image for each rectangular area.

The program according to claim 8 or 9, wherein the additional information generation function generates area identification information as the additional information by identifying an area attribute for each rectangular area.

A program that causes a computer to execute a function of searching for and extracting a required rectangular area, based on the additional information for searching, from a code stream generated by executing the program according to any one of claims 8 to 10 on the computer.

A computer-readable storage medium storing the program according to any one of claims 8 to 11.