JP2728663B2

JP2728663B2 - Image processing device

Info

Publication number: JP2728663B2
Application number: JP62033178A
Authority: JP
Inventors: 幸榎田; 良信三田; 良弘石田; 尚登河村
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1987-02-18
Filing date: 1987-02-18
Publication date: 1998-03-18
Anticipated expiration: 2013-03-18
Also published as: JPS63201780A

Description

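[Detailed Description of the Invention]

[Field of Industrial Application]

The present invention relates to an image processing apparatus, and in particular to an image processing apparatus that performs high-speed decompression of image data.

[Prior Art]

When images are processed on a computer, the processing is generally done in software, but as the volume of image data grows, higher speed becomes necessary. There are two approaches to speeding up the processing. One uses sequential-processing hardware and is known as the pipeline method; the other, known as the parallel-processing method, uses a plurality of processors. In the former, the clock frequency rises as the image data is processed faster, which imposes a limit. In the latter, the speed can be raised as far as desired by increasing the number of processors arranged in parallel. In the extreme case, the maximum speed is obtained by providing as many processors as there are pixels, which is why this technique is currently attracting attention.

In this case, communication between the pixels becomes important, and the processing must proceed while the processors communicate with one another. With such a parallel-processing method, it is not possible to provide one processor per pixel when handling high-resolution data. For example, an A4 image read at 16 pixels/mm (pel) contains about 16M pixels, and it is not feasible to have that many processors at the same time.

[Problem to Be Solved by the Invention]

The present invention provides an image processing apparatus that compresses an image by high-speed parallel processing in units of divided regions.

[Means for Solving the Problem]

To solve this problem, the image processing apparatus of the present invention compresses an image in units of regions into which the image is divided at a predetermined size, and is characterized by comprising: an image memory that consists of a plurality of memory elements corresponding to the number of pixels in the region and stores image data (corresponding to 291 in Fig. 21 in the embodiment); and a processor that consists of a plurality of processor elements corresponding to the number of pixels in the region and that, in order to compress the image data read in parallel from the plurality of memory elements in units of the region, transfers image data computed in the processor elements among the plurality of processor elements and performs arithmetic processing on the transferred image data (in the embodiment, corresponding to 292 in Fig. 21; the transfer of image data corresponds to Figs. 22 and 23 and to page 47, line 4 through page 49, line 3).

[Embodiments]

An embodiment of the present invention is described below.

The image processing apparatus of this embodiment consists of an image memory 1 for one page, a processor unit 2, and a peripheral section 3 that includes input/output devices. Fig. 1 shows the principle of the basic part only: the processor unit 2 is connected to the image memory 1. Image data of n×m pixels at an arbitrary position in the image memory 1 is transferred to the processor unit 2, which is an array of n×m processor elements 2a, processed at high speed, and then returned to the image memory 1. The processing in the individual processor elements 2a of the n×m array is performed simultaneously; that is, the architecture is of the so-called parallel-processing type.

Figs. 9(a) and 9(b) show other configurations. In Fig. 9(a), under the control of a control circuit 94, image data from the input-side image memory is processed in a processor unit 92 made up of a plurality of processor elements, with a plurality of pixels handled in parallel, and is stored in an output-side image memory 93. In Fig. 9(b), the image memory 91 or 93, the processor unit 92, an input device 96, and an output device are connected by a common bus.

The image memory 1 is described in detail below.

For simplicity, the discussion assumes an image memory holding an image of 1024×1024 pixels with 8 bits of data per pixel.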
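A different image size only requires extending the architecture of this embodiment. The processor unit 2 is assumed to consist of 4×4 = 16 processor elements 2a.

Fig. 2 shows the configuration of the image memory 1. If the image consists of 1024×1024 pixels as shown, dividing it into units of 4×4 pixels yields 256×256 = 64K (= 65,536) blocks. These are reorganized in units of 4×4 pixels as in Fig. 3 and regarded as 64K sets of 4×4 pixels (each pixel having 8 bits of data). The memory address space is therefore addressed in three dimensions, 4×4×64K. If one memory chip is to hold the 64K pixels that share one position within the 4×4 unit, a memory chip with a 64K address space, each address being 8 bits deep, is needed. This calls for a memory chip of 512 Kbits (= 64 Kbytes), but this embodiment uses two 256-Kbit dynamic RAMs (D-RAMs) in combination. That is, two 256-Kbit D-RAMs organized as 64K×4 bits are used together as 64K×8 bits. This pair of memory chips is hereinafter called a memory element 1a.

Corresponding to the 4×4 matrix, the image memory 1 consists of 16 memory elements 1a. Fig. 4 shows the configuration of these 4×4 memory elements 1a.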
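The organization above can be illustrated with a minimal sketch that maps a pixel position to one of the 16 memory elements and to a block address. The function names and the row-major linear address are assumptions made for illustration; the text only states that each element holds 64K pixels addressed by a row address and a column address.

```python
BLOCK = 4           # 4x4 memory elements, as in the embodiment
IMAGE_SIZE = 1024   # 1024x1024 pixels, as in the embodiment


def locate_pixel(y, x):
    """Return ((element_row, element_col), (row_addr, col_addr)) for pixel (y, x)."""
    element = (y % BLOCK, x % BLOCK)      # position inside the 4x4 unit selects the element
    address = (y // BLOCK, x // BLOCK)    # block index, 0..255 in each direction
    return element, address


def linear_address(row_addr, col_addr):
    """Hypothetical row-major address inside one 64K-deep memory element."""
    blocks_per_side = IMAGE_SIZE // BLOCK  # 256
    return row_addr * blocks_per_side + col_addr
```

Under these assumptions, any 4×4 block aligned to the grid touches each of the 16 elements exactly once, which is what allows all 16 pixels to be read in a single access.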
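Each memory element 1a is given a row address and a column address and inputs or outputs the image data in the 64K address space of one of the 4×4 pixels. A row address generator 4 and a column address generator 5 supply addresses to each of the 4×4 memory elements 1a. If the memory elements 1a are D-RAMs that receive the row address and the column address in a time-shared manner, a single address generator suffices.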
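In that case, time-division switching control between the row address and the column address is required.

By supplying the respective addresses from these address generators, the 4×4 memory elements 1a can be read or written. That is, a single address specification makes the image data of 4×4 pixels accessible at the same time. For this purpose, an 8-bit data line is assumed to come directly out of each memory element 1a.

If data with row address A (0 ≤ A ≤ 255) and column address B (0 ≤ B ≤ 255) is called from the image memory 1, the 8-bit image data of the 4×4 pixels corresponding to address (A, B) in Fig. 2 is read out.

Simultaneous access to a plurality of pixels is now described in more general terms.

Fig. 10 shows one page of an image as it is. The image data is divided into blocks of k×l contiguous adjacent pixels as illustrated, and these are associated with k×l memory elements 1a as in Fig. 11. The k×l-pixel blocks are numbered from the edge as (0,0), (0,1), (0,2), (0,3), ... and correspond to the memory unit 1 made up of k×l memory elements 1a as in Fig. 12. Fig. 13 shows the memory unit 1 two-dimensionally. Because the memory is accessed in units of the k×l-pixel block size, even when a k×l-pixel block R at an arbitrary position is accessed, all k×l memory elements 1a are accessed, and exactly one address is accessed in each memory element 1a.

In this way, the image data of k×l adjacent pixels at an arbitrary position in the image is accessed at once and read, and is then processed in the processor unit 2. The image data processed by the processor unit 2 can then be written, with a block size of k′×l′ pixels, at an arbitrary position. In the following description, k′ = k and l′ = l.

To supplement the above mention of accessing only k′×l′ pixels of memory: when the processing in the processor unit 2 is spatial filtering or the like, the block size accessed on the write side may be smaller than the block size k×l accessed on the read side. In general, there are many processes for which the write-side block size k′×l′ is 1×1. When the processing in the processor unit 2 is image reduction, the block size accessed on the write side is likewise smaller than the block size k×l accessed on the read side.

In general, when the vertical and horizontal reduction ratios are α and β, the write-side block size k′×l′ is given by the smallest integers k′ and l′ that satisfy k′ ≥ αk and l′ ≥ βl. If the read and write memories are the same, or have the same k×l memory configuration, then for the two kinds of processing above the write must be made to a size k′×l′ smaller than the configuration size k×l of the write-side memory unit 1. In this case, not all k×l memory elements 1a are accessed; the memory elements 1a that are not to be written must be masked so that they are not accessed. The image memory 1 made up of k×l memory elements 1a can read at most k×l adjacent image data in one access, but with this masking, adjacent k′×l′ image data of a smaller size can also be accessed freely. Masking so that only k′×l′ elements are accessed simultaneously is easily achieved by controlling the chip enable of the memory elements 1a.

Next, embodiments of memory access to given pixels at an arbitrary position are described in order, for a 4×4 memory unit configuration and for a k×l configuration, and the chip-enable control used for the masking is also described.

First, an embodiment with block size k×l = 4×4 is shown.

Fig. 5 is an enlarged view of part of Fig. 2. The following describes the case where the image data of an arbitrary 4×4 block S in the image memory 1 is read, processed in the processor unit 2, and then transferred to an arbitrary 4×4 block T. The 4×4 grid cells in Figs. 5 and 6 delimit the 4×4 = 16 memory elements 1a. These 16 memory elements 1a are provisionally named Aa, Ab, ..., Ba, Bb, ..., Ca, ..., Dc, Dd. First, when the 4×4 block S is read, memory element Dd of the 16 memory elements 1a is given (N, M) as its (row address, column address).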
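Memory elements Da, Db, and Dc are given (N, M+1), memory elements Ad, Bd, and Cd are given (N+1, M), and the remaining memory elements are given (N+1, M+1). These addresses are generated by the row address generator 4 and the column address generator 5 described above. Once the position of the end point u of the 4×4 block S is fixed, its vertical and horizontal position addresses are divided by 4, and the remainders n and m uniquely determine the row and column addresses assigned to memory elements Aa through Dd. If the position address of u is u(Y, X), then

$$Y = 4N + n \quad (n = 0, 1, 2, 3)$$
$$X = 4M + m \quad (m = 0, 1, 2, 3)$$

For example, the address generators 4 and 5 may be configured so that the values M, N and m, n are input to a look-up table or the like, which outputs the addresses to be given to memory elements Aa through Dd. It is clear from the above that each output is one of M, N, M+1, and N+1. Using this property, as in Fig. 7, n or m may be input to a look-up table that outputs 0 or 1 according to the value, thereby controlling whether the address N or M given to each of memory elements Aa through Dd is incremented. The row address generator 4 uses n and N, and the column address generator 5 uses m and M.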
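The address generation just described can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the look-up table is modelled as a list of 0/1 increments indexed by the element's row or column position, which is one plausible reading of Fig. 7.

```python
def increment_lut(remainder, size=4):
    """Table in the spirit of Fig. 7: entry i is 1 if element index i lies in the next block."""
    return [1 if i < remainder else 0 for i in range(size)]


def read_addresses(Y, X, size=4):
    """(row address, column address) for every memory element when the block
    whose end point is at (Y, X) is read. Index [0][0] is Aa, [3][3] is Dd."""
    N, n = divmod(Y, size)
    M, m = divmod(X, size)
    row_inc = increment_lut(n, size)   # used by the row address generator
    col_inc = increment_lut(m, size)   # used by the column address generator
    return [[(N + row_inc[r], M + col_inc[c]) for c in range(size)]
            for r in range(size)]


# Case of Fig. 5 with n = m = 3, taking N = 10 and M = 20 as arbitrary values.
addresses = read_addresses(4 * 10 + 3, 4 * 20 + 3)
```

With n = m = 3, element Dd keeps (N, M), elements Da to Dc get (N, M+1), elements Ad to Cd get (N+1, M), and the rest get (N+1, M+1), in agreement with the text.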
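In this way, addresses are supplied to the sixteen 4×4 memory elements from the address generators 4 and 5 as described above, and 16 data items are obtained simultaneously.

These 16 data items undergo some processing in the processor unit 2, or no processing at all, and are then transferred to the 4×4 block T shown in Fig. 5. However, the image data read from each of the 16 memory elements Aa through Dd is not necessarily transferred to the same memory element. When the 4×4 memory block S of Fig. 5 is transferred to the 4×4 memory block T, the data read from memory element Aa of block S must be transferred to memory element Dc.

The following explains, for 4×4 memory blocks S and T whose end points u and v are at arbitrary positions (Y, X) and (Y′, X′), to which of the memory elements Aa through Dd each of the 16 read data items should be written. When, as in Fig. 5,

$$Y = 4N + n \quad (n = 0, 1, 2, 3)$$
$$X = 4M + m \quad (m = 0, 1, 2, 3)$$
$$Y' = 4P + p \quad (p = 0, 1, 2, 3)$$
$$X' = 4Q + q \quad (q = 0, 1, 2, 3)$$

find the x and y that satisfy

$$p - n = 4y' + y \quad (y' = -1, 0;\ y = 0, 1, 2, 3)$$
$$q - m = 4x' + x \quad (x' = -1, 0;\ x = 0, 1, 2, 3)$$

First, the row array A consisting of (Aa, Ab, Ac, Ad) is rotated to the right x times. The result is named row array A′. Similarly, the row arrays B, C, and D rotated to the right x times are named row arrays B′, C′, and D′.

Next, the array (ABCD)′ consisting of row arrays A′, B′, C′, and D′ is rotated downward y times.

In the case of Fig. 5, it is clear from the figure that n, m, p, q are 3, 3, 2, 1, so the equations above give y′ = −1, y = 3, x′ = −1, x = 2. Rotating to the right twice gives the row arrays

A′ = (Ac, Ad, Aa, Ab)
B′ = (Bc, Bd, Ba, Bb)
C′ = (Cc, Cd, Ca, Cb)
D′ = (Dc, Dd, Da, Db)

and rotating downward three times gives the rotated matrix

(Bc, Bd, Ba, Bb)
(Cc, Cd, Ca, Cb)
(Dc, Dd, Da, Db)
(Ac, Ad, Aa, Ab)

Compare this matrix with the basic array:

Aa, Ab, Ac, Ad
Ba, Bb, Bc, Bd
Ca, Cb, Cc, Cd
Da, Db, Dc, Dd

The basic array is simply the read data of memory elements Aa through Dd arranged in two dimensions from left to right and top to bottom, and the rotated matrix is the data to be written to memory elements Aa through Dd, arranged in the same order. For example, the data read from memory element Aa appears in row 4, column 3 of the rotated matrix. Row 4, column 3 of the basic array is Dc, so the data read from memory element Aa should be written to memory element Dc.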
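The two rotations can be checked with a short sketch. The helper names are hypothetical; the only facts taken from the text are that x and y are the remainders of q − m and p − n modulo 4, that each row is rotated right x times, and that the rows are then rotated downward y times.

```python
def rotate_right(items, count):
    """Rotate a list to the right (or a list of rows downward) by count positions."""
    count %= len(items)
    return items[-count:] + items[:-count] if count else list(items)


def route(read_grid, n, m, p, q):
    """Rearrange data read from block S so that entry [r][c] is what must be
    written to the memory element at row r, column c for block T."""
    y = (p - n) % len(read_grid)       # Python's % already yields 0..size-1
    x = (q - m) % len(read_grid[0])
    rows = [rotate_right(row, x) for row in read_grid]   # x-displacement
    return rotate_right(rows, y)                          # y-displacement


names = [[r + c for c in "abcd"] for r in "ABCD"]   # basic array Aa..Dd
routed = route(names, n=3, m=3, p=2, q=1)           # values read off Fig. 5
```

In this sketch `routed[3][2]` holds the data read from Aa, and row 4, column 3 of the basic array is Dc, which matches the worked example.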
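As a supplementary note, it is easy to see from Fig. 5 that the data read from memory element Aa should be written at the position of Dc; this displacement from Aa to Dc is equal to the displacement of the position address from u to v. Because the memory elements 1a are arranged 4×4, the remainders obtained by dividing the horizontal and vertical displacements by 4 can be regarded as the memory-element displacements x and y. For example, if the displacement between u and v is a multiple of 4, x and y are 0, and data read from a given memory element is, after processing, written to the same memory element.

The hardware implementation of the above processing is briefly described. In Fig. 8, the data read simultaneously from the memory element array 10, made up of 4×4 = 16 memory elements 1a, is processed by the processor unit 2, and the resulting data is rotated by x, four elements at a time, in the x-displacement rotators 81. It is then rotated by y in the y-displacement rotator 82, and the results are written to the memory elements 1a Aa to Ad, Ba to Bd, Ca to Cd, and Da to Dd.

Since each input to the y-displacement rotator 82 is four-element data, it can be built from four units identical to the x-displacement rotator 81. The rotators may have a bit depth equal to that of the memory data, or 1-bit-deep rotators may be used in a number equal to the memory data depth. Shift registers, barrel shifters, and the like can be used as the rotators.

Generalizing further, when the memory block size is k×l, the memory element array 10 also has a k×l configuration. When a k×l memory block S at an arbitrary position is processed by the processor unit 2 and then transferred to a k×l memory block T at an arbitrary position, let the position address of the end point of S be (Y, X) and that of T be (Y′, X′), and find n, m, p, q such that

$$Y = kN + n \quad (n = 0, 1, \ldots, k-1)$$
$$X = lM + m \quad (m = 0, 1, \ldots, l-1)$$
$$Y' = kP + p \quad (p = 0, 1, \ldots, k-1)$$
$$X' = lQ + q \quad (q = 0, 1, \ldots, l-1) \qquad (10)$$

where N, M, P, Q = 0, 1, 2, 3, .... Then, using the x and y that satisfy

$$p - n = ky' + y \quad (y' = -1, 0;\ y = 0, 1, \ldots, k-1)$$
$$q - m = lx' + x \quad (x' = -1, 0;\ x = 0, 1, \ldots, l-1) \qquad (11)$$

the processing can be performed with, for example, the x-displacement rotators 81 and the y-displacement rotator 82 of Fig. 8. In this case the x-displacement rotator 81 has l inputs and can shift by 0 to l−1. The y-displacement rotator 82 has k inputs and can shift by 0 to k−1. Moreover, because each of the k inputs of the y-displacement rotator 82 has l elements, it is made up of l single-element rotators.

Access control of the memory elements for simultaneous access to the k′×l′ block mentioned earlier, as shown in Fig. 10, is now described.

Assume that the position address of the end point i of the k′×l′ block is (f, g). When the memory to be accessed is read, f and g are substituted for Y and X in equation (10); when it is written, f and g are substituted for Y′ and X′. Substituting the results into equation (11) to obtain y and x, the embodiment shown in Figs. 7 and 8, generalized to k×l, can be applied as is.

At this time, only the k′×l′ memory elements among the k×l memory elements are chip-enabled. Once the position address (f, g) of the end point i of the k′×l′ block is fixed, n, m or p, q are uniquely determined from equation (10), and the k′×l′ memory elements to be accessed are also uniquely determined.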
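As an illustration of how the enabled elements follow from (f, g), the sketch below computes a chip-enable mask for a k′×l′ block inside a k×l array of memory elements. The function name and the boolean-grid representation are assumptions; the patent realizes this selection with a look-up table.

```python
def chip_enable_mask(f, g, k, l, k_sub, l_sub):
    """Return a k x l grid of booleans, True where the memory element is enabled
    for a k_sub x l_sub block whose end point is at position address (f, g)."""
    n, m = f % k, g % l                                  # remainders from equation (10)
    rows = {(n + i) % k for i in range(k_sub)}           # element rows touched by the block
    cols = {(m + j) % l for j in range(l_sub)}           # element columns touched by the block
    return [[r in rows and c in cols for c in range(l)] for r in range(k)]


# A 2x2 block starting at (f, g) = (3, 2) in a 4x4 element array wraps around
# the bottom edge of the array, so element rows 3 and 0 are enabled.
mask = chip_enable_mask(3, 2, k=4, l=4, k_sub=2, l_sub=2)
```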
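In a memory configuration made up of k×l memory elements as described so far, it is also conceivable that a k′×l′ block is accessed simultaneously on the read side and a k″×l″ block on the write side (where 0 ≤ k″ ≤ k, 0 ≤ l″ ≤ l); this is handled in the same way as above. Fig. 14 shows an embodiment of the chip-enable control applied to the memory elements in this case.

When the position addresses of the end points of the k′×l′ and k″×l″ blocks are (Y, X) and (Y′, X′), n, m and p, q are obtained from equation (10). These values are applied to the data inputs of a selector. The memory-access read/write signal R/W is applied as the selector's selection control signal, so that n, m are selected and output on a read and p, q on a write.

Similarly, the block sizes k′, l′ and k″, l″ are input to a selector, with the R/W signal as the selection control signal. On a read, k′, l′ are selected and output; on a write, k″, l″ are selected and output. Since the memory elements to be accessed are uniquely determined once n, m, k′, l′ on the read side or p, q, k″, l″ on the write side are fixed, the data output from the selectors is input to a look-up table, which outputs signals that control which of the k×l memory elements are accessed.

When the image memory 1 before processing by the processor unit 2 and the image memory after it are separate memories with configurations k×l and K×L respectively, two look-up tables can be used as in Fig. 15. In this case, look-up table 151 and look-up table 152 have different contents.

There is no problem if k = K and l = L. With the configuration described above, it is possible to mask some of the memory elements rather than accessing all k×l of them. The k×l memory element configuration should be set to the largest k×l size required.

Next, how the memory elements are accessed so as to process all the image data of the whole screen, that is, the scan method for accessing all memory data, is described.

Let the position address of the end point u of the adjacent k×l block to be accessed be (Y, X), where Y is the number counted from 0 from the edge in the vertical direction and X is the number counted from 0 from the edge in the horizontal direction. How the memory is accessed once Y and X are fixed has already been described. Embodiments of the order in which X and Y are scanned to process the whole image are now described.

(Example 1)
The position addresses Y and X of the image data used to access the k×l memory elements are increased or decreased in integer multiples of k and l, respectively. For example, Y and X are first set to 0 and X is increased by l at a time. When X reaches the horizontal end, X is reset to 0, Y is increased by k, and X is again increased by l at a time. This is repeated sequentially to scan the whole screen or part of it. This is tentatively called the first sequential scan method.

(Example 2)
X and Y are not changed sequentially as above; instead, contiguous k×l blocks here and there over the whole image are accessed in scattered order. When the Y and X at each access are displaced by integer multiples of k and l, this is tentatively called the first random scan method.

(Example 3)
The position addresses Y and X of the image data used to access the k×l memory elements are increased or decreased by integer steps. For example, Y and X are first set to 0 and X is increased by 1 at a time. When X reaches the horizontal end, X is reset to 0, Y is increased by 1, and X is again increased by 1 at a time. This is repeated sequentially to scan the whole screen or part of it.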
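This is tentatively called the second sequential scan method. In this case, the same memory data is accessed many times.

(Example 4)
X and Y are not changed sequentially as above; instead, k×l blocks here and there over the whole image are accessed in scattered order, and this is done for all Y and X, or for all X and Y of a contiguous part of the whole screen. When the order is random, this is tentatively called the second random scan method.

(Example 5)
In a memory configuration having k×l memory elements, when the memory block accessed is k′×l′ (1 ≤ k′ ≤ k, 1 ≤ l′ ≤ l), the method of increasing or decreasing the position addresses Y and X in integer multiples of k′ and l′ and repeating this sequentially to scan the whole screen is called the block-wise sequential scan method, to distinguish it from the first sequential scan method.

(Example 6)
X and Y are not changed sequentially as in Example 5; instead, contiguous k′×l′ blocks here and there over the whole image are accessed in scattered order. When Y and X are displaced by integer multiples of k′ and l′, this is tentatively called the block-wise random scan method.

(Example 7)
A method that scans sequentially regardless of the k×l configuration of the memory elements, for example by changing X and Y at arbitrary intervals d′ and f′, is simply called the sequential scan method.

(Example 8)
When the scan of Example 7 is performed in random order, or when in Example 4 memory access is not performed for all combinations of X and Y, the method is simply called the random scan method.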
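The sequential and random scan orders differ only in the step sizes and in whether the positions are visited in raster order. The following sketch generates the end-point addresses (Y, X); the function names, the seed parameter, and the choice to ignore blocks that would run past the image edge are assumptions made for illustration.

```python
import random


def sequential_scan(height, width, step_y, step_x):
    """Yield end-point addresses (Y, X) in raster order.
    step = (k, l) gives the first sequential scan, step = (1, 1) the second,
    and step = (k', l') the block-wise sequential scan."""
    for Y in range(0, height, step_y):
        for X in range(0, width, step_x):
            yield (Y, X)


def random_scan(height, width, step_y, step_x, seed=0):
    """Visit the same positions as sequential_scan, but in shuffled order."""
    positions = list(sequential_scan(height, width, step_y, step_x))
    random.Random(seed).shuffle(positions)
    return positions


first_sequential = list(sequential_scan(1024, 1024, 4, 4))   # one access per 4x4 block
```

For the 1024×1024 image and 4×4 blocks of the embodiment, the first sequential scan visits 256×256 positions, one per block, whereas a step of 1 revisits every pixel several times, as noted for the second sequential scan.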
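Various scan methods are thus conceivable. Apart from this, memory access takes place on both the read side and the write side, and the scan method for read-side memory access and the scan method for write-side memory access do not necessarily coincide.

Once the read-side scan method is fixed, the X′ and Y′ accessed on the write side are determined by the processing performed in the processor unit 2. Alternatively, the write-side scan method may be fixed first, in which case the read-side scan is determined by the processing.

The block sizes k′, l′ accessed may differ between the read side and the write side, and the sizes k×l of the memory element configurations may also differ.

In the embodiments of the present invention described below, the processing performed in the processor unit 2 is image compression, but the numbers of memory elements k×l and K×L that make up the read-side and write-side image memories are not limited to those used there.

Likewise, the pixel block sizes k′×l′ and K′×L′ accessed in the read-side and write-side memories are not limited to the examples below, provided that 1 ≤ k′ ≤ k, 1 ≤ l′ ≤ l, 1 ≤ K′ ≤ K, and 1 ≤ L′ ≤ L.

As a supplementary note on the memory access scan in the embodiments below: on either the read side or the write side, if the pixel size accessed equals the size of the memory element configuration of that image memory, it follows from Example 1 that each image memory can be scanned by the first sequential scan method. If, in the read-side and write-side image memories, a pixel size smaller than the memory element configuration is accessed, the block-wise sequential scan method of Example 5 can be used.

The following describes the process in which, as described above, the image data corresponding to a rectangular region of m×n pixels in the original image memory is accessed simultaneously and taken simultaneously into a processor unit made up of fewer processor elements (arithmetic elements) than the block size m×n of the rectangular region; the processor elements then each perform their processing while exchanging information such as image data with one another, thereby compressing the input image data; and the result is output to a rectangular region in the output-side image memory that is smaller than the block size m×n, so that the input original image data is compressed. For simplicity, the block size of the rectangular region in the input-side image memory is m = n = 4, the number of processor elements is two, and the block size of the rectangular region in the output-side image memory is 1×1 = 1.

Fig. 18 shows the relationship among an input pixel block 261 and its pixels 261a in the input-side original image memory 260, a processor unit 262 serving as the arithmetic section and its constituent processor elements 263a and 263b, and an output pixel 264a in the output-side image memory 264 to which the compressed data is output. A control signal from a control section 265 is input to the processor unit 262 and to the input-side original image memory 260; the corresponding image data block 261 of 16 pixels in the original image memory 260 is accessed simultaneously, and the required image data is taken into each of the processor elements 263a and 263b in the processor unit 262.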
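From the image data of the 16 pixels, the processor unit 262 computes representative density information 271 and detail information 272 as shown in Fig. 19, and outputs the compressed image data as an output pixel 264a to the corresponding position in the output-side image memory 264.

Of the two processor elements 263a and 263b in the processor unit 262, processor element 263a is dedicated to computing the representative density information 271 of the 16-pixel image data, and processor element 263b is dedicated to computing the detail information 272, which is obtained from image information such as a fixed threshold suited to the characteristics of the input image. The above is an outline of the apparatus and processing flow for compressing the input raw image data. The detailed processing in each of the processor elements 263a and 263b is described below.

Processor element 263a, dedicated to the representative density information 271, consists of a buffer 281 that temporarily stores the 16-pixel image data and an arithmetic section 282, as shown in Fig. 20. It computes the mean density of the 16-pixel image data and outputs this value to the output-side image memory 264 as the representative density information. Processor element 263b, dedicated to the detail information 272, likewise consists of a 16-pixel buffer 281 and an arithmetic section 282 as in Fig. 20. It outputs the detail information 272 to the output-side image memory 264 together with the representative density information 271. The detail information 272 consists of pattern information within the block, obtained by binarizing the gradation information of the 16 pixels with a threshold determined in advance by a device (not shown) to suit the characteristics of the input original image, and of variance information and the like, obtained from the threshold and the image data of each pixel in the block.

The two processor elements 263a and 263b can operate in parallel, so the compression is performed at high speed.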
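A minimal sketch of the per-block computation follows. The mean and the thresholded pattern come from the text; the exact form of the "variance information" is not given, so the mean absolute deviation from the threshold used here is purely an assumption, as are the function name and the `>=` comparison.

```python
def compress_block(block, threshold=None):
    """Compress one 4x4 block into (representative density, pattern, spread).

    threshold=None uses the block mean as the threshold, the variation the
    text mentions; otherwise a fixed, predetermined threshold is used."""
    pixels = [value for row in block for value in row]
    mean = sum(pixels) / len(pixels)                  # role of processor element 263a
    t = mean if threshold is None else threshold
    # Role of processor element 263b: binarized pattern plus a spread measure.
    pattern = [[1 if value >= t else 0 for value in row] for row in block]
    spread = sum(abs(value - t) for value in pixels) / len(pixels)  # assumed statistic
    return mean, pattern, spread


block = [[10, 10, 10, 10],
         [10, 10, 10, 10],
         [90, 90, 90, 90],
         [90, 90, 90, 90]]
mean, pattern, spread = compress_block(block, threshold=50)
```

In the apparatus the two results are computed by separate processor elements working in parallel; the sketch computes them one after the other only because it is ordinary sequential code.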
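The above compression is repeated while the input-side image memory is accessed sequentially in units of 4×4-pixel memory blocks, until the last 4×4 memory block of the original image memory has been processed, whereby one page of the original image is compressed.

In the description, a predetermined fixed threshold was used to compute the detail information in the compressed data, but the mean density value output by the other processor element 263a may be used instead. The number of processor elements in the processor unit 262 may also be one.

As described above, according to this embodiment, the raw data of the input original image is accessed sequentially one m×n (for example 4×4) memory block at a time, so no pixel in the input-side image memory is accessed more than once, and because the image data of m×n pixels can be accessed simultaneously, the image data can be transferred at high speed.

Moreover, when image data is encoded in blocks of m×n pixels, making the input-side memory block size the same m×n allows one encoding operation per memory access, so the processing is fast and the apparatus configuration is simple. Furthermore, by making the number of processor elements in the processor unit a number m′×n′ smaller than the m×n pixels in the input-side memory block and having each processor element perform different processing, the cost of the arithmetic section is reduced while the processing speed is improved by parallel processing.

Another embodiment is described next.

The following describes the process in which the image data in the image memory corresponding to a rectangular region of m×n pixels in the original image is accessed simultaneously and taken into a processor unit made up of m×n processor elements (arithmetic elements), one per pixel; each processor element then performs compression processing on the image data, and the result is output to an image memory. For simplicity, m = n = 4.

Fig. 21 shows the relationship among an input pixel block 291 and its pixels 291a in the original image 290, a processor unit 292 serving as the arithmetic section and its constituent processor elements 292a, and output image data 293a in an output image memory 293. In accordance with a control signal from a control section 294, the corresponding 16-pixel image data 291 in the input-side original image memory 290 is accessed simultaneously and taken into the respective processor elements 292a in the processor unit 292. From the 16-pixel image data 291, the processor unit 292 computes the representative density information 271 and the detail information 272 of the 16 pixels as in Fig. 19 and outputs them to the output-side image memory 293.

The processor elements 292a in the processor unit 292 correspond one-to-one to the 4×4 pixels and are arranged in a square grid of 4×4 = 16. The above is an outline of the image data compression. The detailed processing in each processor element 292a is described below.

The processor elements 292a in the processor unit 292 are numbered in the row and column directions, and each is identified by the combination of these numbers as shown in Figs. 22 and 23.

First, the process of producing the representative density information 271 from the 16-pixel image data is described. Assume that the corresponding image data has been taken into each of the 16 processor elements 292a shown in Fig. 22. Processor elements (1,1), ..., (4,4) each compute 1/16 of the density data of their own pixel in parallel; all the results are added into processor element (1,1) to obtain the mean of the density information of the 16 pixels, and this value is output to the output image memory as the representative density information 271 in the compressed data shown in Fig. 19.

Next, the process of obtaining the detail information 272 in the compressed data of Fig. 19 is described. The processor elements shown in Fig. 23 are the same as those in Fig. 22.

The detail information consists of the pattern information of each pixel, obtained by binarizing the gradation information held by the 16 processor elements 292a with the mean density information output by processor element (1,1) of Fig. 22, and of the variance information, obtained from the mean density information and the data of each pixel in the block. To obtain these at high speed, the 4×4 processor elements are divided into four 2×2 blocks as shown by the solid lines in Fig. 23. Computation is performed in parallel within the four 2×2 blocks, and the results are stored as intermediate results in the four central processor elements (2,2), (2,3), (3,2), and (3,3). The same computation is then applied within the central 2×2 block, the final result is obtained in processor element (2,2), and that value is output to the output image memory as the detail information 272 of the 16 pixels.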
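The two reduction patterns described above, accumulation into processor element (1,1) and the two-stage 2×2 reduction toward the centre, can be modelled in ordinary sequential code. This is a sketch only: the function names are hypothetical, the per-element work is emulated with list comprehensions rather than real parallel hardware, and the `combine` operation for the detail information is left as a parameter because the text does not fix it.

```python
from functools import reduce


def mean_in_pe11(block):
    """Each of the 16 processor elements computes value / 16; the shares are
    then summed, as if accumulated into processor element (1,1)."""
    shares = [[value / 16 for value in row] for row in block]
    return sum(sum(row) for row in shares)


def two_stage_reduce(values, combine):
    """Stage 1: each 2x2 quadrant is reduced (conceptually into its central
    element). Stage 2: the four central results are reduced into one value."""
    centre = [[reduce(combine, [values[2 * qr + i][2 * qc + j]
                                for i in range(2) for j in range(2)])
               for qc in range(2)]
              for qr in range(2)]
    return reduce(combine, [centre[r][c] for r in range(2) for c in range(2)])


block = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
mean = mean_in_pe11(block)
deviations = [[abs(value - mean) for value in row] for row in block]
total_deviation = two_stage_reduce(deviations, lambda a, b: a + b)
```

The two-stage form needs only two combining steps across the array instead of fifteen sequential additions, which is the speed advantage the text attributes to splitting the elements into 2×2 blocks.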
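The above processing is repeated while the input-side original image memory is accessed sequentially in units of 4×4 blocks, until the last 4×4 block of the original image memory has been compressed, yielding the compressed data for one page of the original image.

As described above, according to this embodiment, the raw data of the input original image is accessed sequentially one m×n (for example 4×4) memory block at a time, so no pixel in the input-side image memory is accessed more than once, and because the image data of m×n pixels can be accessed simultaneously, the image data can be transferred at high speed.

Moreover, when image data is encoded in blocks of m×n pixels, making the input-side memory block size the same m×n allows one encoding operation per memory access, so the processing is fast and the apparatus configuration is simple.

Furthermore, the m×n processor elements in the processor unit, equal in number to the input-side memory block size, can process in parallel, so the processing speed of the arithmetic section can also be raised.

Yet another embodiment is described.

A plurality of image data for a rectangular region of m×n pixels in the input-side original image memory is accessed simultaneously and taken into the processor elements serving as the arithmetic section; after compression, the result is output to the corresponding position in the output-side image memory in a size m′×n′ smaller than the input block size (m > m′, n > n′). Conversely, a block of m′×n′ pixels in the input-side image memory is accessed simultaneously and taken into the processor unit; after decompression, the image data of all pixels of a size m×n larger than the input block size (m > m′, n > n′) is output simultaneously to the output-side image memory. The memory block sizes m×n and m′×n′ are fixed, and depending on the processing, such as compression or decompression, the apparatus switches between m×n on the input side with m′×n′ on the output side and m′×n′ on the input side with m×n on the output side (where m > m′, n > n′). The compression and decompression processes are described below; for simplicity, m = n = 4 and m′ = n′ = 1, and the number of processor elements in the processor unit is two.

In compression, as in Fig. 18 described earlier, a control signal from a control section 244 is input to a processor unit 241, the input-side block size is determined to be 4×4 and the output-side size to be 1, the corresponding 16-pixel image data in the input-side original image memory 240 is accessed simultaneously, and the required image data is taken into each processor element 241a in the processor unit 241. From the 16-pixel image data, the processor unit 241 computes the representative density information 271 and the detail information 272 as shown in Fig. 19 and outputs the compressed image data to the corresponding position in the output-side image memory 242.

The processing for decompressing the compressed encoded data is as follows.

Fig. 24 shows the relationship among encoded data 240a in the input-side image memory 240, the processor unit 241 made up of processor elements 241a, and an output pixel block 243 and output pixels 243a in the output-side reproduced image memory 242. A control signal from the control section 244 shown in the figure is input to the processor unit 241, the input-side block size is determined to be 1 and the output-side block size to be 4×4, one item of encoded data as shown in Fig. 19 is input from the input-side image memory to the processor unit 241, each processor element 241a performs its processing, and the reproduced 16-pixel image data is output simultaneously to the corresponding 4×4 rectangular region in the output-side image memory 242. Here, each processor element 241a in the processor unit 241 performs the reverse of the compression processing, for example obtaining the density information of the 16-pixel image data from the representative density information 271 and from the variance information and the like in the detail information 272 of the encoded data, and the 16-pixel image data is reproduced simultaneously. The plurality of processor elements 241a can operate in parallel, so the decompression is performed at high speed.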
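The text leaves the reconstruction rule open ("for example" from the representative density and the variance information). One possible reading, shown here strictly as an assumption, is a two-level reconstruction in which each pixel is set to the representative density plus or minus a spread value according to its pattern bit. The function name and the rule itself are hypothetical.

```python
def decompress_block(mean, pattern, spread):
    """Rebuild a 4x4 block from one encoded item.
    Assumed rule: pattern bit 1 -> mean + spread, pattern bit 0 -> mean - spread."""
    return [[mean + spread if bit else mean - spread for bit in row]
            for row in pattern]


# Encoded item for a block whose upper half is dark and lower half is bright.
pattern = [[0, 0, 0, 0],
           [0, 0, 0, 0],
           [1, 1, 1, 1],
           [1, 1, 1, 1]]
restored = decompress_block(mean=50.0, pattern=pattern, spread=40.0)
```

Because every output pixel depends only on the shared mean and spread and on its own pattern bit, the 16 pixels can be produced independently, which is consistent with the parallel operation of the processor elements described above.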
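The above decompression is performed by accessing the input-side encoded data sequentially and outputting to the output-side image data sequentially in units of 4×4 blocks, until the input-side encoded data is exhausted, so that reproduced image data is produced from one page of encoded data.

In the description, a predetermined fixed threshold was used to compute the detail information in the compressed data, but the mean density value output by the other processor element may be used instead. The number of processor elements in the processor unit may also be one.

The number of processor elements may also equal the larger of the input-side and output-side block sizes, which in this embodiment means 4×4 elements arranged in a square grid. Although the block sizes in this embodiment are 4×4 and 1, these sizes may take any values.

As described above, according to this embodiment, the raw data of the input original image or the reproduced image data on the output side is accessed sequentially one m×n (for example 4×4) memory block at a time, so no pixel in the image memory is accessed more than once, and because the image data of m×n pixels can be accessed simultaneously, the image data can be transferred at high speed.

Because the block sizes of the input-side and output-side image memories can be switched, as (4×4 and 1) or (1 and 4×4), a single apparatus serves as both compressor and decompressor, and the hardware required is kept to a minimum.

Furthermore, because the block sizes can be set to different sizes, masking to avoid reading or overwriting unneeded image data is unnecessary.

In addition, the processor elements in the processor unit can process in parallel, so the processing speed of the arithmetic section can also be raised.

[Second Embodiment]

A second embodiment of the allocation of image data to k×l memory elements for accessing k×l data simultaneously is described. Fig. 16 shows the information of one screen of an image replaced by data. This is divided into l equal parts horizontally and k equal parts vertically. Calling the k×l areas (0,0), (0,1), ..., (0,l−1), ..., (k−1,l−1) for the purposes of explanation, each area is allocated to one memory element as shown in Fig. 17. The allocation is done as follows: the hatched portions with dashed outlines shown in Fig. 16 are allocated to address 0 of each memory element, the neighbouring image data to address 1 of each memory element, and so on; when one whole line in the area has been allocated, the second line is allocated likewise from left to right, until all the image data has been allocated. Then, when the addresses given by the row address generator 4 and the column address generator 5 of Fig. 4 to all k×l memory elements are the same, image data at spaced-apart positions, like the hatched portions in Fig. 16, can be accessed at once.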
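The allocation of the second embodiment can be sketched as a pair of mappings. The names and the row-major address within each area are assumptions for illustration; K and L denote the height and width of one area in pixels, and k and l the number of areas vertically and horizontally.

```python
def locate_pixel_by_area(y, x, K, L):
    """Second-embodiment mapping: the area containing the pixel selects the
    memory element; the position inside the area gives the address."""
    element = (y // K, x // L)
    address = (y % K) * L + (x % L)   # lines of the area stored left to right, top to bottom
    return element, address


def pixels_at_address(address, k, l, K, L):
    """Pixels fetched when the same address is applied to all k x l elements."""
    dy, dx = divmod(address, L)
    return [(r * K + dy, c * L + dx) for r in range(k) for c in range(l)]


# A 1024x1024 image split into 4x4 areas of 256x256 pixels each.
same_address_pixels = pixels_at_address(0, k=4, l=4, K=256, L=256)
```

Applying one address to every element returns pixels spaced K apart vertically and L apart horizontally, which is the scattered access pattern indicated by the hatching in Fig. 16.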
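With this configuration, after the image memory 1 is read at a given address and the data is processed in the processor unit 2, it may be possible to write the data to the k×l memory elements 1a without changing the address. For example, when each area consists of K×L pixel data as shown in Fig. 16, and part of the screen is to be displaced, moved, or transferred by an integer multiple of L horizontally and an integer multiple of K vertically, the read address and the write address can be the same. This greatly reduces the load on address control, such as the row address generator 4 and the column address generator 5.

This move or transfer is performed in the processor unit 2. As indicated by the dashed hatching in Fig. 16, k×l image data spanning the whole screen is input to the processor unit 2, and the data items are displaced from one another by integer multiples of L and K in the horizontal and vertical directions. The k×l data items can therefore be converted, moved, or transferred within the processor unit 2, and the processing executed sequentially from address 0 over all addresses of the memory elements. As a result, the whole screen is processed.

In this embodiment, by making the k×l memory configuration, for example, 1×l or k×1 and allocating one horizontal line or one vertical line of the screen to each memory element, the processing in the processor unit 2 can be adapted to various kinds of image processing, such as histogram computation for one line of the image or a one-dimensional Fourier transform. For simultaneous access to a plurality of pixels, there is no limitation on which address of which memory element the data of one screen is allocated to.

[Effects of the Invention]

The present invention provides an image processing apparatus that compresses an image by high-speed parallel processing in units of divided regions. That is, the plurality of memory elements and the plurality of processor elements each correspond to the divided region, image data computed in the processor elements is transferred among the processor elements, and the transferred image data is processed arithmetically, so image compression is performed by high-speed parallel processing. In particular, because the image data is transferred between the processor elements, no special structure is needed for the transfer and the configuration can be kept simple.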
The present invention relates to an image processing apparatus that performs the following. [Prior art] Generally, when processing images at high speed,
Software processing is used as the processing.
However, as image data becomes enormous, higher speed is required.
Come. There are two methods for speeding up,
One is a sequential processing type hardware called a pipeline method.
One method is to use hardware, and the other is to place multiple processors.
It is called parallel processing type. The former is image data
The clock frequency increases with the high-speed processing of
is there. On the other hand, the latter requires more processors to be placed in parallel.
As a result, the speed can be increased as much as possible. Extremely
In other words, by placing processors for the number of images,
Attention now because it is possible to get great speed
This is one of the technologies that are being used. By the way, at this time, communication processing between pixels is important.
It is necessary to proceed with the process while performing mutual communication.
In such a parallel processing method, the processor is used for each pixel.
Is not possible when dealing with high-resolution data.
Become. For example, take an image of A4 read at 16 pixels / mm (pel).
When handling, the number of pixels is about 16M pixels (pixels),
It is impossible to have all these processors at the same time. [Problems to be Solved by the Invention] The present invention provides high-speed compression of an image in units of divided areas.
An image processing apparatus for performing parallel processing is provided. [Means for Solving the Problem] To solve the problem, an image processing apparatus according to the present invention is used.
Compresses an image in units of areas divided into predetermined sizes
An image processing apparatus, comprising: a plurality of pixels corresponding to the number of pixels in the area.
Number of memory elements to store image data.
Image memory (corresponding to 291 in FIG. 21 in the embodiment)
And a plurality of processor elements corresponding to the number of pixels in the area.
Element from the plurality of memory elements.
The image data read in parallel is compressed for each area.
In order to achieve this, the processor
Transfers image data calculated in the monitor element
Processor that processes the transferred image data
(In the embodiment, FIG.
And FIG. 23, page 47, line 4 to page 49, line 3)
And characterized in that: Example An example of the present invention will be described below. The configuration of the image processing apparatus of the present embodiment is a
Peripheral parts such as the resource 1 and the processor unit 2 and input / output devices
It consists of part 3. Fig. 1 shows the principle configuration of only the basic part.
The processor unit 2 is stored in the image memory 1.
Have been contacted. N × m at an arbitrary position on the image memory 1
The image data is stored in an n × m processor element 2a.
To the processor unit 2 consisting of
After the high-speed processing, the image data is returned to the image memory 1 again. n
Each processing in an array of × m processor elements 2a
Are performed simultaneously, so-called parallel processing architecture
It is. 9 (a) and 9 (b) show other configurations.
did. In FIG. 9 (a), under the control of the control circuit 94,
The image data from the input side image memory
Multiple in processor unit 92 consisting of support elements
The pixels are subjected to predetermined processing in parallel and stored in the output side image memory 93.
Is stored. On the other hand, in FIG. 9B, there is an image memory 91.
I and 93, the processor unit 92, and the input device 96
The output devices are connected by a common bus. Hereinafter, the image memory 1 will be described in detail. For the sake of simplicity, assume that the image size is 1024 x 1024 pixels,
We will proceed with an image memory having bit / pixel data.
Changing the image size expands the architecture of this embodiment.
Only need to be stretched. Also, processor unit 2 is 4 ×
4 consisting of a total of 16 processor elements 2a
And FIG. 2 is a diagram showing the configuration of the image memory 1. Image
If the configuration is made of 1024 x 1024 pixels as shown in the figure,
If this is divided into 4 × 4 units, 256 × 256 total 64
It is divided into K (= 65536) blocks. Now, this
As shown in Fig. 3, reorganization is performed in 4x4 pixel units, and 4x4 pixels are 64
Assume that there are K (8 bits long data for each pixel
Yes). Therefore, the address space of the memory is 4 × 4 × 64K
Is a three-dimensional address specification. One 64K image in 4x4
Assuming that one memory chip handles the element, 64K
Memory with each address being 8 bits deep in the address space of
・ A chip is required. This is a 512K bit (= 64K byte)
G) requires a memory chip with a capacity of
Is a set of two 256K-bit dynamic RAM (D-RAM)
Use together. That is, 64K of 256K bit D-RAM
A 64K × 8 bit using two × 4 bit configuration
Used. These two memory chips will be recorded in the future.
Called Re-Elent 1a. The above image memory 1 corresponds to 4 × 4 matrix.
Is composed of 16 memory elements 1a. Fig. 4
Shows the configuration of such a 4 × 4 memory element 1a.
Each memory element 1a has a row address and a column address.
Address is specified and the 64k address of one of the 4x4 pixels
Input / output image data of dress space. Row address
A generator 4 and a column address generator 5
Gives an address to each 4 × 4 memory element 1a
I can. Note that memory element 1a is loaded in the DRAM
Address and column address given by time sharing
In this case, only one address generator is required.
At this time, time division of row address and column address
Exchange control is required. Each address from the address generator
4 × 4 pixel memory element
1a can be read / written. That is, once
Simultaneous image data of 4 × 4 pixels by address specification
It can be driven. Therefore, the data line
An 8-bit data line directly from each memory element 1a
Assume that the row address is A (0 ≦ A ≦ 255) and the column address is
Data is B (0 ≦ B ≦ 255) from the image memory 1.
If it is called, the image data
4 × 4 pixels corresponding to the address (A, B) in the figure
Image data having a length of 8 bits is read. Generalize the simultaneous access of multiple pixels and explain
I will tell. FIG. 10 shows one page of the image as it is.
Of the image data of k × l
Divided by the block of pixels, as shown in Fig. 11,
Corresponds to the re-element 1a. Also, a block of k × l pixels
From the end, (0,0), (0,1), (0,2), (0,3) ...
Numbered k × l memory elements as shown in FIG.
This corresponds to the memory unit 1 composed of the element 1a. Thirteenth
The figure shows the memory unit 1 in two dimensions.
You. The memory size to be accessed is a block of k × l pixels.
Since it is a unit of the check size, a block of k × l pixels at an arbitrary position
Even when the lock R is accessed, k × l memories
All elements 1a are accessed and one memo
Access to one address for each re-element 1a
Becomes In this manner, a plurality of k × l adjacent pixels at an arbitrary position in an image
After accessing and reading pixel image data at once
Processing is performed by the processor unit 2. Processor Yu
The image data processed in nit 2 is again k ′ ×
1 'pixel block size, and any position
You can access and write. Where k '= k, l' =
It will be described in the future as l. The above-mentioned memory access of only k ′ × l ′ pixels
To further explain, the processing in processor unit 2
If the processing is spatial filtering, etc.
Access on the writing side than the block size k × l
Block size may be reduced. General
Generally, the block size k ′ × l ′ on the writing side is 1 × 1.
There are many processes that become Also in processor unit 2
Even if the processing to reduce the image is
Access on the write side of the block size k × l
Block size is reduced. Generally, the block size k '× l' on the light side is vertical and horizontal.
Let k '≧ αk, l ′ ≧ βl be the reduction ratio of α and β
The minimum integer to be satisfied is k ', l'. Read and write
Same k × l memory configuration with the same memory
In the case of performing the processing as in the above two examples,
Smaller than the configuration size k × l of the memory unit 1 on the remote side
Must be written to a small size k '× l'.
No. In this case, k × l squares of the memory element 1a are used.
Do not give access to everything, and write
Mask the memory element 1a so that it is not accessed
Must. However, k × l memories
・ The image memory 1 composed of the elements 1a
Data that can be accessed and read
The maximum number is k × l, but smaller neighboring
K ′ × l ′ image data is also
Can be freely accessed. Mask and only k '× l'
Accessing at the same time depends on the chip of memory element 1a.
It is easily possible by operating the enable of the loop. Next, the memory access of a predetermined pixel at an arbitrary position is sequentially performed.
In the embodiment of the process, the memory unit configuration is 4 × 4
And k × l are explained, and the mask mask
The control of the chip enable for this is also described. First, when the block size k × l is 4 × 4
An example is shown. FIG. 5 is an enlarged view of a part of FIG. Image
Reads image data of an arbitrary 4 × 4 block S in the memory 1.
After processing it in the processor unit 2
In the process for transferring to an arbitrary 4 × 4 block T,
explain about. 4 × 4 squares in FIGS. 5 and 6
Separates 4x4 16 memory elements 1a
Eyes. Aa, A are temporarily stored in these 16 memory elements 1a.
Name them b,…, Ba, Bb,… Ca,… Dc, Dd. First of all
When reading a 4 × 4 block S, 16 memory
Of the element 1a, the memory element Dd (low load
(N, M) is given as the address, column address).
(N, M + 1) is assigned to memory elements Db, Dc, and Dd.
(N + 1, M) remaining memory for elements Ad, Bd, Cd
The elements are given (N + 1, M + 1). this is
Row address generator 4 and column address described above
Generated by the generator 5 Also, 4 × 4 block
Once the position of the end point u of the stick S is determined, its horizontal direction and vertical
Divide the position address in the direction by 4 and calculate the remaining numbers n and m.
Lower row allocated to memory elements Aa to Dd
It is clear that dress and column addresses are uniquely determined.
is there. Assuming that the position address of u is u (Y, X), Y = 4N + n (n = 0,1,2,3) X = 4N + m (n = 0,1,2,3) For example, the address generators 4,5 Then, the information of M and N and m,
n information into a lookup table, etc.
Output address given to elements Aa to Dd
It is conceivable. At this time, the output is M, N, M + 1, N + 1
Is clear from the above description. Also this
Utilizing the properties, as shown in Fig. 7,
Input n or m to the device and output 0,1 according to this value
The address N given to the memory elements Aa to Dd or
Should control whether M is incremented or not.
No. The row address generator 4 uses n and N,
In the column address generator 5, m and M are used. Thus, 4 × 4 16 memory elements
As described above, address generators 4 and 5
Address, you can get 16 data at the same time
These 16 pieces of data are processed in processor unit 2, or passed through without any processing, and transferred to a 4 × 4 block T. However, the image data read from the 16 memory elements Aa to Dd is not necessarily written back to the same memory elements Aa to Dd. When the 4 × 4 memory block S in FIG. 5 is transferred to the 4 × 4 memory block T, the data read from memory element Aa must be transferred to memory element Dc.
The following explains, for 4 × 4 memory blocks S and T whose end points u and v are at arbitrary positions (Y, X) and (Y', X'), which of memory elements Aa to Dd each of the 16 pieces of read data should be written to. As shown in FIG. 5, let
Y = 4N + n (n = 0, 1, 2, 3)
X = 4M + m (m = 0, 1, 2, 3)
Y' = 4P + p (p = 0, 1, 2, 3)
X' = 4Q + q (q = 0, 1, 2, 3)
and obtain x and y from
p - n = 4y' + y (y' = -1, 0; y = 0, 1, 2, 3)
q - m = 4x' + x (x' = -1, 0; x = 0, 1, 2, 3)
First, the row array A consisting of (Aa, Ab, Ac, Ad) is rotated rightward x times; the result is named row array A'. Similarly, row arrays B, C, and D rotated rightward x times are named row arrays B', C', and D'. Next, the array (ABCD)' composed of row arrays A', B', C', and D' is rotated downward y times.
In the case of FIG. 5, n, m, p, q are clearly 3, 3, 2, 1, so y' = -1, y = 3, x' = -1, x = 2. Rotating each row array rightward twice gives
A' = (Ac, Ad, Aa, Ab)
B' = (Bc, Bd, Ba, Bb)
C' = (Cc, Cd, Ca, Cb)
D' = (Dc, Dd, Da, Db)
and rotating downward three times gives the matrix
(Bc, Bd, Ba, Bb)
(Cc, Cd, Ca, Cb)
(Dc, Dd, Da, Db)
(Ac, Ad, Aa, Ab)
Compare this matrix with the basic array below.
Aa, Ab, Ac, Ad
Ba, Bb, Bc, Bd
Ca, Cb, Cc, Cd
Da, Db, Dc, Dd
The basic array shows the read data of memory elements Aa to Dd arranged two-dimensionally from left to right and top to bottom. The rotated matrix corresponds to the two-dimensional arrangement of the data to be written to memory elements Aa to Dd. For example, the data read from memory element Aa appears in the fourth row, third column of the rotated matrix. The basic array has Dc in the fourth row, third column, so the read data of memory element Aa should be written to memory element Dc.
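The two-step rotation of this worked example (rotate every row array rightward x times, then rotate the rows downward y times) can be checked with a short Python sketch. The helper names `rotate` and `route_block` are hypothetical and serve only to illustrate the rule; they are not part of the described apparatus.

```python
def rotate(seq, count):
    """Rotate a list toward higher indices (rightward or downward) by count."""
    count %= len(seq)
    return seq[-count:] + seq[:-count] if count else list(seq)


def route_block(read_data, x, y):
    """Rotate each row rightward x times, then rotate the rows downward y times."""
    rotated_rows = [rotate(list(row), x) for row in read_data]
    return rotate(rotated_rows, y)


# Basic array: read data of memory elements Aa..Dd, row by row.
BASIC_ARRAY = [[r + c for c in "abcd"] for r in "ABCD"]

# FIG. 5 case from the text: x = 2, y = 3.
ROUTED = route_block(BASIC_ARRAY, x=2, y=3)
```

With x = 2 and y = 3, `ROUTED[3][2]` holds the data read from Aa, and the same position in `BASIC_ARRAY` is Dc, matching the conclusion of the worked example.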
As a supplementary explanation, the read data of memory element Aa shown in the figure need only be written to the location of Dc. As is easily noticed, the displacement from Aa to Dc equals the displacement from u to v in terms of addresses. Also, since the configuration of memory elements 1a is 4 × 4, the remainders obtained by dividing the horizontal and vertical displacements by 4 can be regarded as the displacements x and y between memory elements. For example, if the displacement from u to v is a multiple of 4 in both directions, x and y become 0, and the data read from a given memory element is written back to the same memory element. The hardware implementation of the above processing is briefly described next.
FIG. 8 shows the configuration. The data read simultaneously from the 4 × 4 = 16 memory elements 1a is processed by processor unit 2, rotated by x in the x-displacement rotator 81 for each group of four elements, then rotated by y in the y-displacement rotator 82, and written to memory elements 1a (Aa to Ad, Ba to Bd, Ca to Cd, Da to Dd). Since each of the four inputs of the y-displacement rotator 82 carries four pieces of data, it goes without saying that it can be built from four rotators identical to the x-displacement rotator 81. Each rotator need only be as many bits wide as the memory data is deep; needless to say, one-bit rotators may equally be used in a number equal to the depth of the memory data. It is also easy to see that the rotators can be implemented with shift registers, barrel shifters, or the like.
Generalizing further, let the memory block size be k × l, so that the memory elements 1a are also arranged k × l. When a k × l memory block S at an arbitrary position is processed by processor unit 2 and then transferred to a k × l memory block T at an arbitrary position, let the position address of the end point of S be (Y, X) and that of T be (Y', X'), with
Y = kN + n (n = 0, 1, ..., k-1)
X = lM + m (m = 0, 1, ..., l-1)
Y' = kP + p (p = 0, 1, ..., k-1)
X' = lQ + q (q = 0, 1, ..., l-1) ... (10)
where N, M, P, Q = 0, 1, 2, 3, .... From these, n, m, p, q are obtained, and then x and y from
p - n = ky' + y (y' = -1, 0; y = 0, 1, 2, ..., k-1)
q - m = lx' + x (x' = -1, 0; x = 0, 1, 2, ..., l-1) ... (11)
Using x and y, the processing is performed with, for example, the x-displacement rotator 81 and the y-displacement rotator 82 shown in FIG. 8.
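Equation (11) is simply a division with a non-negative remainder, which Python's `divmod` performs directly for a positive divisor. The sketch below assumes a hypothetical helper name `displacement` and takes the end-point addresses of S and T as plain integers; it returns the rotation counts (x, y) together with the quotients (x', y').

```python
def displacement(Y, X, Y_dst, X_dst, k=4, l=4):
    """Return ((x, y), (x_quot, y_quot)) for a transfer from (Y, X) to (Y_dst, X_dst).

    Implements p - n = k*y' + y and q - m = l*x' + x with 0 <= y < k, 0 <= x < l.
    """
    n, m = Y % k, X % l          # residues of the source end point, eq. (10)
    p, q = Y_dst % k, X_dst % l  # residues of the destination end point, eq. (10)
    y_quot, y = divmod(p - n, k)  # y_quot is -1 or 0
    x_quot, x = divmod(q - m, l)  # x_quot is -1 or 0
    return (x, y), (x_quot, y_quot)
```

For the 4 × 4 case with n, m, p, q = 3, 3, 2, 1, for example `displacement(7, 7, 6, 5)`, this yields x = 2 and y = 3 with both quotients equal to -1.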
In this case, the x-displacement rotator 81 has l inputs and can shift by 0 to l-1. The y-displacement rotator 82 has k inputs and can shift by 0 to k-1. Moreover, since each of the k inputs of the y-displacement rotator 82 carries l elements, it consists of l rotators, each handling one element per input.
Next, control of the memory elements for simultaneous access to a k' × l' block, as shown in FIG. 10, is described. Let the position address of the end point i of the k' × l' block be (f, g). Access follows equation (10) above: for a memory read, substitute f and g for Y and X; for a memory write, substitute f and g for Y' and X'. By substituting the result into equation (11) to obtain y and x, the k × l generalization of the embodiment shown in FIGS. 7 and 8 can be applied as is. At this time, only k' × l' of the k × l memory elements are chip-enabled. Once the position address (f, g) of the end point i of the k' × l' block is determined, n, m or p, q are uniquely determined from equation (10), and so are the k' × l' memory elements to be chip-enabled.
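As a rough illustration of how the chip-enabled elements follow from (f, g), the following Python sketch enables element (i, j) when it lies within k' rows and l' columns of the residues (n, m), counted cyclically. The function name `enabled_elements` and the parameter names `k_sub` and `l_sub` (standing for k' and l') are hypothetical, and the pixel-to-element mapping by residues is an assumption consistent with equation (10).

```python
def enabled_elements(f, g, k_sub, l_sub, k=4, l=4):
    """Return the set of (i, j) memory elements to chip-enable.

    (f, g) is the end point of the k_sub x l_sub block; k x l is the element grid.
    """
    n, m = f % k, g % l  # residues from equation (10)
    return {
        (i, j)
        for i in range(k)
        for j in range(l)
        # Cyclic distance from the starting residue must fall inside the block.
        if (i - n) % k < k_sub and (j - m) % l < l_sub
    }
```

With a 4 × 4 element grid and a 2 × 2 block whose end point is at (3, 3), the enabled set would be the four corner elements (3, 3), (3, 0), (0, 3), and (0, 0).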
The same applies when, in a memory configuration consisting of k × l memory elements, the read side accesses a k' × l' block simultaneously and the write side accesses a k″ × l″ block simultaneously (where 0 ≤ k″ ≤ k, 0 ≤ l″ ≤ l). FIG. 14 shows an example of the control of the chip enable given to the memory elements in this case. From the block position addresses (Y, X) and (Y', X') of the k' × l' and k″ × l″ blocks, n, m and p, q are obtained as described above. These n, m and p, q are fed to the data inputs of a selector. The memory-access read/write signal R/W serves as the selection control signal of the selector, which selects and outputs n and m when reading and p and q when writing. Similarly, the block sizes k', l' and k″, l″ are fed to a selector whose selection control signal is the R/W signal; k' and l' are selected and output during reading, and k″ and l″ during writing. The memory elements to be accessed are clearly determined uniquely once n, m, k', l' are fixed on the read side, or p, q, k″, l″ on the write side. The selector outputs are therefore fed to a lookup table, which outputs signals that control which memory elements are accessed.
When the image memories 1 before and after processing by processor unit 2 are separate memories with k × l and K × L configurations respectively, it is easy to see that lookup tables can be used as shown in FIG. 15. In this case, lookup table 151 and lookup table 152 have different contents. There is no problem even if k = K and l = L. With the configuration described above, the memory elements to be accessed need not be all k × l memory elements; some can be masked. The k × l configuration of memory elements need only be set to the largest size required.
Next, how to scan all the memory data, that is, how to access the memory elements so as to process the image data of the entire screen, is explained.
For example, let the position address of the end point u of a contiguous k × l block to be accessed be (Y, X), where Y is the count from the edge in the vertical direction and X is the count from the edge in the horizontal direction, each starting from 0. The method of accessing the memory once Y and X are determined has already been described. The following examples show how X and Y can be scanned in some order to process the entire image.
(First example) The position addresses Y and X of the image data accessed through the k × l memory elements are increased or decreased in integer multiples of k and l, respectively. For example, Y and X are first set to 0, and X is increased by l at a time. After X reaches the end point in the horizontal direction, X is reset to 0, Y is increased by k, and X is again increased by l at a time. This is repeated sequentially to scan the entire screen or part of it. This is tentatively called the first sequential scan method.
(Second example) X and Y are not increased or decreased sequentially as above; instead, contiguous k × l blocks are accessed at scattered locations across the whole image, with the displacements in Y and X being integer multiples of k and l. This is called the first random scan method.
(Third example) The position addresses Y and X of the image data accessed through the k × l memory elements are increased or decreased by integers. For example, Y and X are first set to 0, and X is increased by 1 at a time. After X reaches the end point in the horizontal direction, X is reset to 0, Y is increased by 1, and X is again increased by 1 at a time. This is repeated sequentially to scan the entire screen or part of it. This is tentatively called the second sequential scan method. In this case, the same memory data is accessed many times.
(Fourth example) X and Y are not increased or decreased sequentially as above; k × l blocks are accessed here and there on the whole image, with Y and X jumping among arbitrary values so that they range over all contiguous positions of the entire screen. When the order is random, this is tentatively called the second random scan method.
(Fifth example) In a memory configuration with k × l memory elements, when the memory block to be accessed is k' × l' (1 ≤ k' ≤ k, 1 ≤ l' ≤ l), the position addresses Y and X are increased or decreased in integer multiples of k' and l', and the entire screen is scanned sequentially. To distinguish it from the first sequential scan method, this is called the blockwise sequential scan method.
(Sixth example) In contrast to the fifth example, X and Y are not changed sequentially; k' × l' blocks are accessed discretely across the whole image, with the displacements in Y and X being integer multiples of k' and l'. This is called the blockwise random scan method.
(Seventh example) For a k × l memory-element configuration, a sequential scan in which X and Y are changed in arbitrary steps d' and f' is simply called the sequential scan method.
(Eighth example) In the setting of the seventh example, when the combinations of X and Y are accessed in non-sequential order, this is simply called the random scan method.
As described above, various scanning methods are conceivable.
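The sequential and random scan orders listed above differ only in the step sizes and in whether the visiting order is shuffled. A minimal Python sketch follows, assuming hypothetical function names and assuming that a block must lie entirely inside an image of `height` × `width` pixels; neither assumption comes from the source.

```python
import random


def sequential_scan(height, width, k, l, step_y, step_x):
    """Yield block end points (Y, X) row by row, X increasing first.

    step_y = k, step_x = l gives the first sequential scan method;
    step_y = step_x = 1 gives the second (overlapping blocks);
    passing the sub-block size k', l' as both the block size and the step
    gives the blockwise sequential scan method.
    """
    for Y in range(0, height - k + 1, step_y):
        for X in range(0, width - l + 1, step_x):
            yield (Y, X)


def random_scan(height, width, k, l, step_y, step_x, seed=0):
    """Visit the same end points as sequential_scan, but in shuffled order."""
    points = list(sequential_scan(height, width, k, l, step_y, step_x))
    random.Random(seed).shuffle(points)  # seeded only to keep the sketch repeatable
    return points
```

On an 8 × 8 image with 4 × 4 blocks, stepping by (4, 4) visits four non-overlapping blocks, whereas stepping by (1, 1) visits 25 overlapping positions, which is why the third example notes that the same memory data is accessed many times.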
Apart from this, the scan method for memory access on the read side and the scan method for memory access on the write side are not necessarily the same. Once the scan method on the read side is determined, the X' and Y' to be accessed on the write side are determined by the processing performed by processor unit 2. Alternatively, the scan method on the write side may be decided first, in which case the scan on the read side is determined by the processing content. In addition, the block sizes k', l' accessed on the read side and the write side may differ, and so may the sizes k × l of the memory-element configurations. In the embodiments of the present invention described later, the processing performed in processor unit 2 is image compression, and the numbers k × l and K × L of memory elements constituting the image memories on the read and write sides are not limited to those used there.
Likewise, the pixel block sizes k' × l' and K' × L' accessed within the memories on the read side and the write side are not limited to the examples described below, where 1 ≤ k' ≤ k, 1 ≤ l' ≤ l, 1 ≤ K' ≤ K, 1 ≤ L' ≤ L. As a supplementary note on memory-access scanning in the embodiments described later: when the pixel size accessed on each of the read and write sides equals the size of the memory-element configuration of the respective image memory, it follows readily from the first example above that each image memory can be scanned by the first sequential scan method. When a pixel size smaller than the size of the memory-element configuration is accessed in the read-side or write-side image memory, it goes without saying that the blockwise sequential scan method of the fifth example can be used.
The following describes a process that, as outlined above, simultaneously accesses the image data in the image memory corresponding to a rectangular area of m × n pixels, simultaneously takes that image data into a processor unit consisting of fewer processor elements than the block size m × n of the rectangular area in the original image memory, compresses the input image data while the processor elements exchange information such as image data with one another, and outputs the result to a rectangular area in the output-side image memory that is smaller than the block in the input-side original image memory, thereby compressing the original image data on the input side. For simplicity, the block size of the rectangular area of the input-side image memory is m = n = 4, the number of processor elements (arithmetic units) is 2, and the block size of the rectangular area of the output-side image memory is 1 × 1 = 1.
FIG. 18 shows the relationship among the input pixel block 261 and its pixels 261a in the input-side original image memory 260, the processor unit 262 and its constituent processor elements 263a and 263b, and the output pixel 264a, which is the output compressed data, in the output-side image memory 264. A control signal from the control unit 265 is input to the processor unit 262 and the input-side original image memory 260; the 16-pixel image data block 261 in the input-side original image memory 260 is accessed simultaneously, and the necessary image data is taken into each of the processor elements 263a and 263b in the processor unit 262. From the 16-pixel image data, the processor unit 262 calculates the representative density information 271 and the detailed information 272 shown in the figure and outputs them as compressed image data, output pixel 264a, to the corresponding position in the output-side image memory 264.
Here, the processor unit 262 contains two processor elements, 263a and 263b. One is processor element 263a, dedicated to calculating the representative density information 271 of the 16-pixel image data; the other is processor element 263b, which calculates the detailed information 272 on the basis of information such as a fixed threshold matched to the image characteristics. The above is an outline of the apparatus for compressing the input original image data and of the processing flow. The processing in each of processor elements 263a and 263b is described in detail below.
Processor element 263a, which exclusively calculates the representative density information 271, consists, as shown in the figure, of a buffer 281 that temporarily stores the 16 pixels of image data and an arithmetic unit 282. It calculates the average density value of the original image data and outputs this value as the representative density information to the output-side image memory 264. Processor element 263b, dedicated to the detailed information 272, likewise has a buffer 281 for 16 pixels and an arithmetic unit 282, as shown in FIG. 20. Using a fixed threshold supplied, according to the characteristics of the input original image, from a device not shown, it binarizes the gradation information of the 16 pixels to obtain in-block pattern information, and outputs the detailed information 272, consisting of this pattern information, the threshold, dispersion information obtained from the image data of each pixel in the block, and the like, to the output-side image memory 264 together with the representative density information 271. Because the two processor elements 263a and 263b can operate in parallel, compression can be performed at high speed.
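The division of labor between the two processor elements can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the function names are hypothetical, the average is truncated to an integer, a pixel equal to the threshold is binarized to 1, and only the pattern part of the detailed information is produced (the threshold and dispersion fields of the format in FIG. 19 are not modeled, since their layout is not given in the text).

```python
def representative_density(block):
    """Role of processor element 263a: average density of the 16 pixels."""
    pixels = [value for row in block for value in row]
    return sum(pixels) // len(pixels)  # integer average (assumption)


def pattern_information(block, threshold):
    """Role of processor element 263b: binarize each pixel against a fixed threshold."""
    return [[1 if value >= threshold else 0 for value in row] for row in block]


def compress_block(block, threshold):
    """Combine both results for one 4 x 4 block into one compressed item."""
    return representative_density(block), pattern_information(block, threshold)
```

The two functions do not depend on each other, which mirrors the statement that the two processor elements can run in parallel; a block whose left half is 10 and right half is 200 would give an average of 105 and, with threshold 128, the pattern 0, 0, 1, 1 in every row.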
The compression processing described above is repeated, accessing the input-side image memory sequentially in units of 4 × 4 pixel memory blocks, until the last 4 × 4 memory block of the original image memory has been processed; in this way one page of the original image can be compressed.
In the description, a predetermined fixed threshold was used to calculate the detailed information in the compressed data, but it is easy to see that the average density value output by the other processor element 263a can be used as this value instead. It is also easy to see that the number of processor elements in the processor unit 262 may be one.
As described above, according to this embodiment, the original image data on the input side is accessed sequentially in m × n (for example, 4 × 4) memory blocks, so no pixel in the input-side image memory is accessed more than once, and because the image data of m × n pixels can be accessed simultaneously, image data can be transferred at high speed. Also, because the image data is encoded in block units of m × n pixels and the input-side memory block size is likewise m × n, one memory access suffices for one encoding operation, so processing is fast and the apparatus configuration can be simplified. Furthermore, by making the number of processor elements in the processor unit (the arithmetic unit) a number m' × n' smaller than the number of pixels m × n in the input-side memory block and having each processor element perform a different process, the cost of the arithmetic unit is reduced while parallel processing improves the processing speed.
Next, another embodiment is described. In this embodiment, the image data in the image memory corresponding to a rectangular area of m × n pixels on the original image is accessed simultaneously and taken into a processor unit consisting of m × n processor elements, one processor element (arithmetic unit) per pixel; the processor elements then apply compression processing, and the result is output to the image memory. For simplicity, m = n = 4 in the description.
FIG. 21 shows the relationship among the input pixel block 291 and its pixels 291a in the original image memory 290, the processor unit 292 (the arithmetic unit) and its constituent processor elements 292a, and the output image data 293a in the output image memory 293. In accordance with a control signal from the control unit 294 in the figure, the corresponding 16 pixels of image data 291 in the input-side original image memory 290 are accessed simultaneously and taken into the respective processor elements 292a in the processor unit 292. From the 16-pixel image data 291, the processor unit 292 calculates the representative density information 271 and the detailed information 272 of the 16 pixels as shown in FIG. 19 and outputs them to the image memory 293. Here, the processor elements 292a in the processor unit 292 correspond one-to-one to the 4 × 4 pixels and are arranged as 4 × 4 = 16 elements in a square lattice. The above is an outline of the image data compression processing. The processing in processor elements 292a is described in detail below.
Each processor element 292a in the processor unit 292 is numbered in the row and column directions, as shown in FIGS. 22 and 23, to distinguish the individual processor elements 292a.
First, the process of creating the representative density information 271 from the 16-pixel image data is described. The image data corresponding to each of the 16 processor elements 292a is assumed to have been taken in. The processor elements (1,1), ..., (4,4) each calculate 1/16 of the density data of their pixel in parallel; processor element (1,1) then adds all these values to obtain the average of the density information of the 16 pixels and outputs it to the output image memory as the value of the representative density information 271 in the compressed data shown in FIG. 19.
Next, the process of obtaining the detailed information 272 in the compressed data shown in FIG. 19 is described. The processor elements shown in FIG. 23 are the same as those used for the representative density calculation. First, the 16 processor elements 292a binarize the gradation information of their pixels against the average density information output by processor element (1,1) to obtain the pattern information of each pixel. To obtain the dispersion information from the pixel data in the block at high speed, the 4 × 4 processor elements are divided into four 2 × 2 blocks, indicated by the solid lines in FIG. 23; calculation proceeds in parallel within each of the four 2 × 2 blocks, and the intermediate results are transferred to the four central processor elements (2,2), (2,3), (3,2), and (3,3). The same kind of operation is then performed within this central 2 × 2 block, the final result is obtained at processor element (2,2), and it is output to the output image memory as the detailed information 272 of the 16 pixels.
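The two-stage flow just described (four 2 × 2 blocks working in parallel, then the four central processor elements combining the intermediate results) has the shape of a tree reduction. The Python sketch below models only that data flow on a 4 × 4 grid. The function names are hypothetical, and the combining operation is left as a parameter because the text does not specify the arithmetic used for the dispersion information; `sum` is used in the usage note purely as a placeholder.

```python
def two_stage_reduce(grid, combine):
    """Reduce a 4 x 4 grid in two stages of 2 x 2 blocks.

    Stage 1: each 2 x 2 quadrant is combined independently (parallel in hardware);
             the four results stand for the values held by the central elements
             (2,2), (2,3), (3,2), (3,3).
    Stage 2: the four intermediate results are combined into one value,
             standing for the final result at element (2,2).
    """
    central = []
    for block_row in range(2):
        for block_col in range(2):
            quadrant = [
                grid[2 * block_row + dy][2 * block_col + dx]
                for dy in range(2)
                for dx in range(2)
            ]
            central.append(combine(quadrant))
    return combine(central)


def average_density(grid):
    """Each element contributes 1/16 of its pixel; one element sums the 16 terms."""
    return sum(value / 16 for row in grid for value in row)
```

For a grid holding the values 0 to 15, `two_stage_reduce(grid, sum)` gives 120 and `average_density(grid)` gives 7.5, the same totals a single sequential pass would produce, but each stage involves only four-input operations.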
The above processing is repeated, accessing the input original image memory sequentially in units of 4 × 4 blocks, until compression of the last 4 × 4 block of the image memory is complete; compressed data for one page of the original image is thereby obtained.
As described above, according to this embodiment, the original image data on the input side is accessed sequentially in m × n (for example, 4 × 4) memory blocks, so no pixel in the input-side image memory is accessed more than once, and because the image data of m × n pixels can be accessed simultaneously, image data can be transferred at high speed.
Also, because the image data is encoded in block units of m × n pixels and the input-side memory block size is likewise m × n, one memory access suffices for one encoding operation, so processing is fast and the apparatus configuration can be simplified. Furthermore, because the m × n processor elements in the processor unit, equal in number to the size of the input-side memory block, can operate in parallel, the processing speed of the arithmetic unit can be increased.
Another embodiment is described next. In this embodiment, a plurality of image data items corresponding to a rectangular area of m × n pixels in the input-side original image memory are accessed simultaneously and taken into the processor elements; after image data compression, the result is output to the corresponding position in the output-side image memory in a block of size m' × n' smaller than the input block (m > m', n > n'). Conversely, m' × n' pixels, the block size of the input-side image memory, are accessed simultaneously and taken into the processor unit; after decompression, the image data of all pixels of a block of size m × n, larger than the input block size (m > m', n > n'), is output to the output-side image memory simultaneously. The memory block sizes m × n and m' × n' are fixed, and processing is performed by switching, according to whether compression or decompression is being done, between m × n on the input side with m' × n' on the output side and, conversely, m' × n' on the input side with m × n on the output side (where m > m', n > n').
The compression and decompression processes are described below. For simplicity, let m = n = 4 and m' = n' = 1, and let the number of processor elements in the processor unit be two.
First, at the time of compression, as in the case described earlier, a control signal is input to the processor unit 241, the input-side block size is set to 4 × 4 and the output-side block size to 1, the corresponding 16 pixels of image data in the input-side image memory are accessed, and the necessary image data is taken into each processor element 241a in the processor unit 241. From the 16-pixel image data, the processor unit 241 calculates the representative density information 271 and the detailed information 272 shown in FIG. 19 and outputs them as compressed image data to the corresponding position in the output-side image memory 242.
Next, the processing for decompressing the compressed, encoded data is described. FIG. 24 shows the relationship among the encoded data 240a in the input-side image memory 240, the processor unit 241 and its processor elements 241a, and the output pixel block 243 and output pixels 243a in the output-side reproduced image memory 242.
A control signal from the control unit 244 shown in the figure is input to the processor unit 241, and the input-side block size is set to 1 and the output-side block size to 4 × 4. One item of encoded data, in the format shown in FIG. 19, is input from the input-side image memory to the processor unit 241; the processor elements 241a perform their respective processing, and the reproduced 16-pixel image data is output simultaneously to the corresponding 4 × 4 rectangular area in the output-side image memory 242. Here, each processor element 241a in the processor unit 241 (the arithmetic unit) performs the reverse of the compression processing described above: for example, it obtains the density of the 16 pixels of image data from the representative density information 271 and the detailed information 272 in the encoded data, and the 16 pixels of image data are reproduced simultaneously. Because the plurality of processor elements 241a can operate in parallel, decompression can be performed at high speed.
The above decompression processing is performed by accessing the encoded data on the input side sequentially and outputting the image data on the output side sequentially in 4 × 4 block units until no encoded data remains on the input side; reproduced image data for one page is thereby created from the encoded data.
In the description, a predetermined fixed threshold was used to calculate the detailed information in the compressed data, but it is easy to see that the average density value output by the other processor element can be used as this value. It is also easy to see that the number of processor elements in the processor unit (the arithmetic unit) may be one, and that the processor elements may instead be as many as the larger of the input-side and output-side block sizes, in this embodiment 4 × 4 arranged in a square lattice. In this embodiment the block sizes are 4 × 4 and 1, but it is easy to see that other sizes may be used.
As described above, according to this embodiment, the original image data on the input side or the reproduced image data on the output side is accessed sequentially in m × n (for example, 4 × 4) memory blocks, so no pixel in the input-side image memory is accessed more than once, and because the image data of m × n pixels can be accessed simultaneously, image data can be transferred at high speed. Also, by switching the block sizes of the input-side and output-side image memories between (4 × 4 and 1) and (1 and 4 × 4), separate compression and decompression devices are unnecessary; a single device suffices, which makes the apparatus smaller. In addition, making the input-side and output-side block sizes different eliminates the need for masking to avoid reading unnecessary image data or overwriting. Furthermore, because the processor elements in the processor unit can operate in parallel, the processing speed can be increased.
(Second Embodiment) A second example of allocating image data to k × l memory elements so that k × l pieces of data can be accessed simultaneously is described. FIG. 16 shows one image screen divided into l equal parts in the horizontal direction and k equal parts in the vertical direction. For explanation, the k × l areas so obtained are labeled (0,0), (0,1), ..., (0,l), ..., (k,l). As shown in FIG. 17, each area is assigned to its own memory element. In the assignment, the hatched portions shown by broken lines in FIG. 16 correspond to address 0 of each memory element, the next image data to address 1 of each memory element, and so on; when one line of an area has been assigned, assignment continues with the second line, so that all image data is assigned from left to right. If the addresses given by the row address generator 4 and the column address generator 5 are then the same for all k × l memory elements, discrete image data, as indicated by the hatched portions, can be accessed at once.
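This region-based allocation can be sketched in Python as a mapping from pixel coordinates to a memory element and an address inside it. The names `locate` and `pixels_at_address` are hypothetical, and the sketch assumes that the image height and width are exact multiples of k and l and that addresses run left to right, line by line, within each area, as the description of the assignment order suggests.

```python
def locate(r, c, height, width, k, l):
    """Map pixel (r, c) to (element, address) under the region-based allocation."""
    K, L = height // k, width // l   # each area holds K x L pixels
    element = (r // K, c // L)       # which area, hence which memory element
    address = (r % K) * L + (c % L)  # row-major offset inside the area
    return element, address


def pixels_at_address(address, height, width, k, l):
    """List the k x l pixels reached when every element is given the same address."""
    K, L = height // k, width // l
    dy, dx = divmod(address, L)
    return [(i * K + dy, j * L + dx) for i in range(k) for j in range(l)]
```

On an 8 × 8 image with k = l = 2, address 0 reaches pixels (0, 0), (0, 4), (4, 0), and (4, 4): one pixel per area, spaced K rows and L columns apart, which is the discrete access pattern the text describes.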
With this configuration, when image memory 1 is read at a given address, processed by processor unit 2, and written back to the k × l memory elements, the data can be written without changing the address. For example, when each area consists of K × L pixel data as shown in FIG. 16, processing such as displacing, moving, or transferring part of one image screen by an integer multiple of L horizontally and an integer multiple of K vertically leaves the read address and the write address unchanged. The load of address control on the row address generator 4, the column address generator 5, and the like is therefore greatly reduced. The movement and transfer themselves are performed in processor unit 2. As indicated by the hatched broken-line portions in FIG. 16, k × l pieces of image data spanning the entire screen are input to processor unit 2, and each is given horizontal and vertical displacements of integer multiples of L and K; processor unit 2 exchanges and transfers the data in k × l units, and this is simply repeated for all addresses in sequence from 0. The entire screen is thereby processed.
In this embodiment, by changing the k × l memory configuration to, for example, 1 × l, k × 1, or another configuration and assigning one horizontal line or one vertical line of an image to each memory element, the processing in processor unit 2 can presumably be applied to various kinds of image processing such as histogram calculation and one-dimensional Fourier transforms. Also, for simultaneous access to multiple pixels, there is no restriction on which address of which memory element each piece of data in the screen is assigned to.
[Effects of the Invention] According to the present invention, an image processing apparatus can be provided that compresses an image at high speed by parallel processing in units of divided regions. That is, the plurality of memory elements and the plurality of processor elements each correspond to the divided region, image data computed in the processor elements is transferred between the plurality of processor elements, and the transferred image data is processed, so image compression can be carried out by high-speed parallel processing. In particular, because image data is transferred between the processor elements themselves, no special configuration is needed for such transfer, and a simple configuration can be achieved.

BRIEF DESCRIPTION OF THE DRAWINGS FIG. 1 is a diagram showing the configuration of the image processing apparatus of the present embodiment; FIG. 2 is a diagram in which one image screen is mapped to the addresses of memory elements; FIG. 3 is a diagram showing the entire memory consisting of 4 × 4 memory elements; FIG. 4 is a diagram showing a memory and the address generators that supply addresses to it; FIG. 5 is a diagram showing a portion of an image; FIG. 6 is a diagram showing the memory allocation of a portion of an image; FIG. 7 is a diagram showing a memory address control circuit; FIG. 8 is a block diagram of pixel data control; FIGS. 9(a) and 9(b) are diagrams showing the configurations of other image processing apparatuses of the present embodiment; FIG. 10 is a diagram showing one image screen; FIG. 11 is a diagram showing k × l memory elements; FIGS. 12 and 13 are diagrams showing one memory element; FIGS. 14 and 15 are diagrams showing control circuits for memory element access; FIG. 16 is a diagram showing one image screen; FIG. 17 is a diagram showing k × l memory elements; FIG. 18 is a diagram showing the relationship among the input-side image memory, the processor unit, and the output-side image memory in the present embodiment; FIG. 19 is a format diagram of the compressed image data used in the present embodiment; FIG. 20 is a functional diagram of each processor element in the present embodiment; FIG. 21 is a diagram showing the relationship among the input-side image memory, the processor unit, and the output-side image memory in the present embodiment; FIGS. 22 and 23 are schematic diagrams of the operation of each processor element in the present embodiment; and FIG. 24 is a diagram showing the relationship among the input image memory, the processor unit, and the output image memory during decompression processing in the present embodiment.
In the figures: 1, image memory; 1a, 1b, memory elements; 2, processor unit; 2a, processor element; 3, peripheral part; 4, row address generator; 5, column address generator; 91, input-side image memory; 92, processor unit; 93, output-side image memory; 94, control circuit; 95, input device; 96, output device; 240, 260, 290, input-side image memory; 261, 291, input image block; 240a, 261a, 291a, input pixel; 241, 262, 292, processor unit; 241a, 263a, 263b, 292a, processor element; 242, 264, 293, output-side image memory; 243, output image block; 243a, 264a, 293a, output pixel; 244, 265, 294, control unit.

Continuation of front page: (72) Inventor: Naoto Kawamura, c/o Canon Inc., 3-30-2 Shimomaruko, Ota-ku, Tokyo. (56) References: JP-A-62-13165 (JP, A); JP-A-60-185987 (JP, A); JP-A-61-107475 (JP, A); JP-A-61-16369 (JP, A).

Claims

(57) [Claims] What is claimed is: 1. An image processing apparatus that compresses an image in units of regions obtained by dividing the image into a predetermined size, comprising: an image memory that consists of a plurality of memory elements corresponding to the number of pixels in the region and that stores image data; and a processor that consists of a plurality of processor elements corresponding to the number of pixels in the region and that, in order to compress in units of the region the image data read out in parallel from the plurality of memory elements, transfers image data computed in the processor elements between the plurality of processor elements and performs arithmetic processing on the transferred image data.