JP3486871B2

JP3486871B2 - High-performance code compression system for video information

Info

Publication number: JP3486871B2
Application number: JP2000361681A
Authority: JP
Inventors: 國枝博昭; 一色剛; 李冬菊; 伊藤和人; 大塚友彦; トリオ・アディノ; チャワリット・ホンサワイック
Original assignee: 財団法人理工学振興会
Priority date: 2000-11-28
Filing date: 2000-11-28
Publication date: 2004-01-13
Anticipated expiration: 2020-11-28
Also published as: CN1691782A; JP2002165222A

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a high-performance code compression system for moving image information that reduces the bit rate, as required for videophones and the like. Specifically, it discloses a system that distinguishes the highly important parts of the image information, for example the face including the lip movements that accompany speech on a videophone, from the less important parts, for example the background other than the person's face, and weights the processing of this information accordingly. This improves transmission efficiency and realizes, with the minimum amount of transmitted image information compressed to suit conventional telephone lines of limited transmission capacity, expressions close to natural conversation, in particular lip movements that match the speech, that is, lip synchronization.

[0002]

[Prior Art] Images on conventional videophones and the like are degraded by thinning out image information so that they fit within the amount of information that can be transmitted under the constraints of a telephone line. They are closer to a sequence of still pictures conveying intermittent changes than to television-style moving pictures; it felt, so to speak, as if a photograph of the speaker's face were being delivered by facsimile slightly behind the voice on the telephone. This is because maintaining the image quality of each frame was given priority over motion. To preserve the quality of each still picture, frames were dropped from the television image, which should be sent at 25 frames per second (PAL and SECAM television in Europe and Russia) or 30 frames per second (NTSC television in Japan, the United States, and other countries), and the motion capability inherent in television was greatly sacrificed. Moreover, even with most frames dropped, the image still carries far more information than the voice and takes time to process and transmit, so it is received late and the lip movements do not match the speech. Conversely, if the audio is forcibly delayed to synchronize with the late-arriving image, responses in the conversation are delayed and the conversation becomes very disjointed, much as in a satellite relay on television.

[0003]

[Problems to Be Solved by the Invention] However, the images used in videophones and the like do not necessarily need to aim at the quality of high-definition films for theatrical screening. Rather, as long as lip movements matching the speaker's speech can be seen, the original purpose of the videophone can be achieved without pursuing, for the parts other than the face and its lip movements, motion reproduction as faithful as that of film at 24 frames per second or television at 25 or 30 frames per second.

[0004] On the videophone screen, the importance of each specific region, such as the region containing the outline of the subject's face, can be assessed in light of the videophone's original purpose, and the signal processing of each part can be weighted accordingly. There has been a demand for a high-performance code compression system for moving image information that, even when the information is compressed to the necessary minimum in this way and even within the transmission limits of the existing telephone network, still fulfils that purpose. Such a system guarantees lip synchronization, in which the image showing the speaker's lip movements and the voice conveying the speech are reproduced in agreement with reality, together with acceptable image quality; it has a simple configuration; and it does not spoil the atmosphere of a videophone conversation.

[0005] The problem to be solved by the present invention concerns a code compression system that gives preferential information processing only to the conspicuous face portion that readily attracts interest, treated as a specific region (window). For such a system, the aim is to provide a high-performance code compression system for moving image information that realizes expressions close to natural conversation, in particular lip movements matching the speech (lip synchronization), with the minimum amount of transmitted image information compressed to suit conventional telephone lines of limited transmission capacity. In other words, the aim is to raise the performance of the videophone while keeping it extremely simple and inexpensive. To that end, the specific tasks are to improve the relationship between the information compression ratio and noise, the handling of B-frame difference image information, and, in particular, the rate control mechanism for information compression in the encoder.

[0006] The rate control mechanism needs improvement for two reasons. The first is to cope with the limited information transmission capacity of the transmission path. The second is to equalize the bit length of each image frame as far as possible so that the decoder can reproduce the moving image at a constant rate.

[0007] Conventional rate control schemes include the several schemes adopted in Video Codec Test Model, Near-Term, Version 11 (hereinafter TMN-11), a moving image compression software program that conforms to the H.263 Plus standard and is published by the International Telecommunication Union (ITU).

[0008] However, the conventional rate control schemes have no function for strictly controlling delay so as to minimize the delay time and frame dropping that arise between the camera input image and the decoded output image as the image passes through the encoder, the transmission path, and the decoder. This delay is a problem: because the moving image of the mouth lags behind the undelayed audio, mouth movement and voice do not match, which is the lip synchronization problem. Furthermore, equalizing the bit length of each image frame with high precision requires very complicated calculations, and the calculation itself unavoidably adds further delay.

[0009] Accordingly, in addition to addressing the relationship between the information compression ratio and noise and the handling of B-frame difference image information, the present invention aims in particular to provide a system that can realize lip synchronization. It does so by replacing the very complicated calculations hitherto considered indispensable for equalizing the bit length of each image frame with high precision by simple calculations, thereby reducing the delay time required for that calculation.

[0010]

[Means for Solving the Problems] To achieve the above object, the invention of claim 1 is a high-performance code compression system for moving image information comprising: a memory (1) that stores image data; a window (21), that is, a specific region that can be moved freely over the screen of the moving image being recognized and that receives preferential information processing; sequential processing means that divides the entire image making up the window (21) into small rectangular blocks and sequentially inputs the search data and reference data for each small block from the memory (1) to execute window parallel processing; a motion vector search unit (10) that searches for the motion vector accompanying the motion of the moving image in each small block; and a window position control program that uses the motion vectors found by the motion vector search unit (10) to estimate the position of the face window (21) in the next frame and makes the window (21) follow the motion of the main subject.

The system includes an encoder that performs B-frame processing and comprises: a motion prediction function unit that receives the current frame image, the preceding reference frame image, and the following reference frame image and performs motion prediction, motion compensation, and prediction mode decision; an all-pixel zeroing function unit that receives the difference information between the predicted image output by the motion prediction function unit and the current frame image and forcibly sets every pixel value of that difference information to zero; and a code generation unit that receives the zeroed all-pixel information output by the all-pixel zeroing function unit and encodes it while predicting the next motion of the moving image using the prediction mode decided by the motion prediction function unit.

The system also includes a decoder that performs B-frame processing and comprises: a decoding unit that receives and decodes the moving image information compressed and transmitted by the encoder and delivered over the transmission path; an inverse quantization unit that inversely quantizes the decoded signal output by the decoding unit; an inverse discrete cosine transform unit that applies the inverse discrete cosine transform to the inversely quantized signal to restore a difference image; and an adder that adds the restored difference image to the predicted image obtained with the prediction mode and outputs the decoded image.

The system further includes noise reduction means that reduces noise in the color difference signals while the same quantization level is applied to the luminance signal and the color difference signals. Only for intra macroblocks, in which the current macroblock image is encoded directly without using the predicted image, both the luminance signal and the color difference signals are quantized by rounding to the nearest integer. For all other macroblocks, the luminance signal is quantized by rounding down and the color difference signals are quantized by rounding to the nearest integer.

The noise reduction means includes quantization means that computes the absolute value (|L|) of the quantized data as follows. The sum of the inverse quantization correction value (p) and the rounding correction parameter (f) is multiplied by the quantization level (Q) and the quantization level correction value (s). That product is subtracted from the absolute value (|C|) of the original data, which consists of discrete-cosine-transformed frequency components. The difference is divided by the product of the quantization level (Q) and the quantization level correction value (s), and the resulting real number is rounded down to an integer. The quantization means comprises first quantization means and second quantization means. The first sets the correction parameter (f) to 0.5 so that, only for intra macroblocks, the luminance signal and the color difference signals are quantized by rounding to the nearest integer. The second, for all other macroblocks, sets the correction parameter (f) to 0 so that the luminance signal is quantized by rounding down, and sets it to 0.5 so that the color difference signals are quantized by rounding to the nearest integer.
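Written out with the claim's own symbols, the quantization rule stated above reads as follows. This is only a transcription of the wording; the value and sign convention of the inverse quantization correction value p are not given in this passage, so how a particular f maps onto rounding or truncation depends on p.

```latex
|L| = \left\lfloor \frac{|C| - (p + f)\,Q\,s}{Q\,s} \right\rfloor,
\qquad
f =
\begin{cases}
0.5 & \text{intra macroblock (luminance and color difference)} \\
0   & \text{other macroblocks, luminance} \\
0.5 & \text{other macroblocks, color difference}
\end{cases}
```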

[0011] By the operation of the invention of claim 1, a B-frame information compression method has been established. In the B-frame coding scheme, the current frame is compressed as difference information relative to the image information of the preceding and following reference frames. Here the difference information itself is not transmitted, as it conventionally was; only the type of difference calculation, that is, forward prediction, backward prediction, or bidirectional prediction, is transmitted. When the motion of the moving image is rapid, the amount of information in the difference image grows. By forcibly setting all of its pixel values to zero, the amount of difference image information is minimized, and the source image can be restored at the destination from the least possible information.
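The mode-only B-frame scheme described in this paragraph can be illustrated with a minimal sketch. It assumes co-located blocks with no motion compensation, flat lists of pixel values, and the sum of absolute differences as the selection criterion. The function names are hypothetical and nothing here is taken from the patent's implementation.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def candidate_predictions(prev_block, next_block):
    """Three B-frame predictions built from the two reference blocks."""
    return {
        "forward": list(prev_block),
        "backward": list(next_block),
        # Rounded average of both references.
        "bidirectional": [(p + n + 1) // 2 for p, n in zip(prev_block, next_block)],
    }


def encode_b_block(cur_block, prev_block, next_block):
    """Return only the prediction mode; the residual is treated as all zeros."""
    cands = candidate_predictions(prev_block, next_block)
    return min(cands, key=lambda mode: sad(cur_block, cands[mode]))


def decode_b_block(mode, prev_block, next_block):
    """With a zero residual, the decoded block is the prediction itself."""
    return candidate_predictions(prev_block, next_block)[mode]
```

Because the residual is never sent, the decoded block equals the chosen prediction, so any mismatch between the prediction and the true block remains as reconstruction error.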

[0012] In addition, a quantization method has been established in which the encoder quantizes the luminance signal and the two color difference signals differently. When the decoder decodes them at the same quantization level, the noise in the color difference signals is reduced and the perceived quality of the decoded image improves. This achieves an effective reduction of color difference noise with a simpler configuration than before. Given the characteristics of human vision, it therefore also yields visually higher image quality.

[0013] The invention of claim 2 is the high-performance code compression system for moving image information according to claim 1, comprising: comparison means that compares the bit amount of the encoded moving image information with the residual bit amount of the communication buffer; control means that, according to the result of that comparison, controls the target bit amount of a frame so that the residual bit amount of the communication buffer is not exhausted; and means for calculating the target bit amount of a frame in frame-level rate control, which uses the result of that control to minimize the delay time and frame dropping that arise between the camera input image and the decoded output image as the image passes through the encoder, the transmission path, and the decoder.
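The buffer bookkeeping behind this comparison can be sketched as follows, using the symbols that the later claim on frame-level rate control defines (W, B, U, R, C, F, G). The clamp at zero in `update_buffer` is an assumption, since the text only says that W is updated after comparing W + B with U; the function names and the numbers in the usage note are illustrative.

```python
def transmitted_bits(B, R, C, F, G):
    """Bits U sent from the communication buffer while the current frame is handled.

    B: bit amount of the current frame (0 means the frame was skipped)
    R: transmittable bits per second
    C: pictures per output image frame
    F: output pictures per second
    G: input pictures per second
    """
    if B > 0:
        return R * C / F   # coded frame: occupies C/F seconds
    return R / G           # skipped frame: one input picture period, 1/G seconds


def update_buffer(W, B, U):
    """New residual bit amount W after adding B and draining U.

    Assumption: the buffer level cannot fall below zero.
    """
    return max(W + B - U, 0.0)
```

For example, with R = 60000, C = 1, F = 10 and G = 30, a 9000-bit frame leaves 3000 bits in an initially empty buffer, and a following skipped frame drains it to 1000 bits.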

[0014] The invention of claim 3 is the high-performance code compression system for moving image information according to claim 2, comprising: first calculation means that calculates the quantization level applied to the first macroblock of a frame from a weighted average of the quantization levels of the macroblocks of the previous frame; and second calculation means that calculates the fine adjustment of the quantization level applied to the second and subsequent macroblocks from the target bit amount, the actual code amount up to the current macroblock, and the quantization level applied to the first macroblock.
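A minimal sketch of the two calculations named in this claim follows. The claim gives their inputs but not their formulas, so the uniform default weights, the linear adjustment rule, and the `gain` parameter are all hypothetical. The clamping range 1 to 31 is the quantizer range of H.263.

```python
QP_MIN, QP_MAX = 1, 31  # H.263 quantizer range


def initial_qp(prev_qps, weights=None):
    """Quantization level for the first macroblock: weighted mean of the previous frame's levels."""
    if weights is None:
        weights = [1.0] * len(prev_qps)  # assumption: uniform weights
    avg = sum(q * w for q, w in zip(prev_qps, weights)) / sum(weights)
    return min(QP_MAX, max(QP_MIN, int(avg + 0.5)))


def adjusted_qp(qp_first, target_bits, bits_so_far, mbs_done, mbs_total, gain=20.0):
    """Quantization level for a later macroblock (hypothetical linear rule)."""
    expected = target_bits * mbs_done / mbs_total    # budget share if bits were spread evenly
    error = (bits_so_far - expected) / target_bits   # positive when over budget
    qp = qp_first + int(round(gain * error))
    return min(QP_MAX, max(QP_MIN, qp))
```

Spending more bits than the even share raises the quantization level (coarser quantization), and spending fewer lowers it, which is the general behaviour a macroblock-level controller needs whatever the exact rule.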

[0015] By the operation of the inventions of claims 2 and 3, the optimum target code bit amount is determined for each frame so as to minimize the number of dropped frames while minimizing the delay required to encode the image and transfer the information (hereinafter, frame-level rate control). The quantization level is then optimally adjusted in each subsequent macroblock (hereinafter, macroblock-level rate control). This method achieves excellent control despite its small computational load. Replacing the very complicated calculations hitherto considered indispensable for equalizing the bit length of each image frame with high precision by simple calculations reduces the delay required for that calculation and makes lip synchronization achievable.

[0016] The invention of claim 4 is the high-performance code compression system for moving image information according to any one of claims 1 to 3, comprising an encoder and/or a decoder built on a system architecture in which a memory (1) that stores image frame information and hardware modules that each operate independently are connected via a data bus (2), and a central control unit (3) that controls the data flow and the operation schedule between the memory (1) and the hardware modules is connected to each hardware module via a control bus (4).

[0017] As a result, small-scale hardware modules that are flexible, fast, and low in power consumption are connected by buses, and an AGU module controls their operation schedule and the flow of data to and from the memory. This establishes a system architecture well suited to moving image compression.

[0018] The invention of claim 5 is the high-performance code compression system for moving image information according to claim 3 or claim 4, comprising: a memory (1) whose memory area has an address structure suited to a block scheme, in which an image is divided into a plurality of blocks and information is processed in units of block coordinates; and a central control unit (3) having a ROM (9) that stores an instruction program. That program contains instructions that allow addresses for accessing the memory (1) to be generated in units of macroblock coordinates, blocks, and pixels, together with instructions that control the execution of each hardware module.

[0019] As the processor that performs centralized control of the overall operation, such as starting and stopping individual modules and controlling data input to and output from the memory (1), the central control unit can generate memory addresses in units of macroblocks, blocks, and pixels. This also makes the design of each hardware module independent and greatly reduces design constraints, so that several designers can each take charge of a module and the time required to design the whole system is shortened.
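Address generation at macroblock, block, and pixel granularity can be sketched as below. The memory layout is an assumption made for illustration: luminance only, 16 x 16 macroblocks stored one after another, each holding four 8 x 8 blocks with pixels in row-major order, as in the H.263 macroblock structure. The patent's actual address format is not given in this passage, and the function names are hypothetical.

```python
BLOCK_SIZE = 8                              # each luminance block is 8 x 8 pixels
BLOCKS_PER_MB = 4                           # a 16 x 16 macroblock holds four such blocks
PIXELS_PER_BLOCK = BLOCK_SIZE * BLOCK_SIZE


def pixel_address(mb_x, mb_y, block, px, py, mbs_per_row, base=0):
    """Linear address of one luminance pixel in a block-ordered frame store (assumed layout)."""
    mb_index = mb_y * mbs_per_row + mb_x
    block_index = mb_index * BLOCKS_PER_MB + block
    return base + block_index * PIXELS_PER_BLOCK + py * BLOCK_SIZE + px


def block_address(mb_x, mb_y, block, mbs_per_row, base=0):
    """Start address of one 8 x 8 block."""
    return pixel_address(mb_x, mb_y, block, 0, 0, mbs_per_row, base)


def macroblock_address(mb_x, mb_y, mbs_per_row, base=0):
    """Start address of one macroblock."""
    return block_address(mb_x, mb_y, 0, mbs_per_row, base)
```

With such a layout a module can request "macroblock (2, 1)" or "block 3 of macroblock (2, 1)" without knowing the frame width in pixels, which is one way the modules could be kept independent of each other. For a QCIF frame (176 pixels wide), `mbs_per_row` would be 11.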

[0020] The invention of claim 6 is the high-performance code compression system for moving image information according to any one of claims 1 to 5, and concerns array processing means that processes data in parallel on a plurality of processors. The conventional procedure focuses on a specific pixel, moves a narrow window (21) exhaustively up, down, left, and right, executes a predetermined calculation once for each position that the specific pixel takes within the window (21), and repeats that calculation each time the position changes. As a substitute for that procedure, the system comprises: a memory (1) that stores image data; a buffer (39) for data format conversion that sequentially inputs macroblock data from the memory (1) one macroblock at a time; a shared-memory array of a plurality of processor elements (101 to 132) connected as a parallel array and supplied with the multi-port data output by the buffer (39); and a motion vector search circuit comprising an adder (44). Using arithmetic means that performs very high-speed parallel operations on the data of the processor elements (101 to 132), this circuit searches for the motion vector that indicates from which position in the previous frame the macroblock of the current frame has moved.

[0021] By the operation of the invention of claim 6, a configuration has been established that runs a plurality of processors efficiently, as required by the motion vector search circuit, which among the hardware modules accounts for 80% or more of the total processing load. As a result, the search data and reference data can be input sequentially from the external memory (1) and window parallel processing can be executed without loss of the high parallel efficiency.
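The search that the processor array accelerates can be sketched sequentially as a full-search block match. The sum of absolute differences (SAD) is assumed as the matching criterion, since the passage does not name one, and the function names are hypothetical. In hardware each candidate displacement would be evaluated by a different processor element in parallel rather than in a loop.

```python
def block_sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of cur at (bx, by) and the block of ref displaced by (dx, dy)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n, search_range):
    """Return ((dx, dy), sad) of the best match in ref for the block of cur at (bx, by)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue  # candidate block falls outside the reference frame
            cost = block_sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best
```

The nested loops visit (2r + 1)^2 candidate positions per block, each costing n^2 absolute differences, which is why this step dominates the processing load and benefits most from a parallel array.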

[0022] Next, the invention of claim 7 is the high-performance code compression system for moving image information according to any one of claims 1 to 6, comprising: two sets of data format conversion means (55), in which the input data from the memory (1) is input continuously through two sets of registers (62), (63) that are switched alternately by an input selector switch (66), forming macroblock data of 8 rows by 8 columns, 64 pixels in all, that is input sequentially one macroblock at a time, so that sequential data is converted to parallel data without lowering the data transfer rate; processor elements (101 to 132) that perform a two-dimensional discrete cosine transform on the parallel data received through the data format conversion means (55); and a quantization module (12) (58) that receives the two-dimensional discrete cosine transform output from the processor elements (101 to 132), quantizes it, and outputs the data for sequential storage in the memory (1).

[0023] In the discrete cosine transform and quantization circuit, which is the module requiring the most computation after motion vector search, and in the inverse quantization circuit and inverse discrete cosine transform module, which perform the reverse operations, the data held in the external memory is read sequentially, processed in parallel, and written back without impairing high-speed operation. This establishes a means of sequentially inputting pixel data from the external memory and executing parallel processing without lowering the high data transfer rate or the parallel efficiency.

【００２４】又、請求項８に係る発明は、画像データを
記憶しておくメモリ（１）と、認識処理中の動画像の画
面上で移動自在の特定領域即ち優先的な情報処理を施さ
れるウインドウ（２１）と、前記ウインドウ（２１）を
構成する全画像を矩形の小ブロックに分割し、前記小ブ
ロックごとの探索データと参照データを前記メモリ
（１）から逐次的に入力してウインドウ並列処理を実行
する逐次処理手段と、前記小ブロックの動画像の動きに
伴った動きベクトルを探索する動きベクトル探索部（１
０）と、前記動きベクトル探索部（１０）が探索した動
きベクトルを利用して、次のフレームの顔のウインドウ
（２１）の位置を推定し、主なる被写体の動きにウイン
ドウ（２１）を追随させるウインドウ位置制御プログラ
ムと、を備えた動画像情報の高性能符号圧縮システムで
あって、カメラ入力画像が符号器、伝送経路及び復号器
を経由して復号画像となって出力されるまでの間に発生
する遅延時間とコマ落ちを最小限にする、フレームレベ
ルレート制御のプログラムとして、通信バッファ残留ビ
ット量（Ｗ）を０に設定し、始めの入力画面を１フレー
ムとして符号化する処理（Ｓ６１）と、現フレームのビ
ット量（Ｂ）が正値かどうかを判断しビット量（Ｂ）が
正値ならば１出力画像フレーム当りの画面数（Ｃ）を出
力画像の１秒当りの画面数（Ｆ）で除した（Ｃ／Ｆ）秒
を現フレームの符号化処理時間とし、この符号化処理時
間（Ｃ／Ｆ）秒に１秒当りの送信可能なビット量（Ｒ）
を乗じこの間に送信されるビット量（Ｕ）を所定値（Ｒ
・Ｃ／Ｆ）に設定し、前記ビット量（Ｂ）が０ならば現
フレームはスキップされていると判断し、前記符号化処
理時間（Ｃ／Ｆ）秒を入力画像の１秒当りの画面数
（Ｇ）の逆数である入力画面周期と同じ（１／Ｇ）秒と
しこの入力画面周期（１／Ｇ）秒の間に送信されるビッ
ト量（Ｕ）を（Ｒ／Ｇ）に設定する処理（Ｓ６２）と、
前記通信バッファ残留ビット量（Ｗ）に現フレームのビ
ット量（Ｂ）を加算した値と現フレームの符号化処理時
間（Ｃ／Ｆ）秒の間に通信バッファから送信されたビッ
ト量（Ｕ）を比べ前記通信バッファ残留ビット量（Ｗ）
値を更新する処理（Ｓ６３）と、保証する最大フレーム
遅延時間（Ｄ）に１を加えた値（Ｄ+１）に１秒当りの
送信可能なビット量（Ｒ）と入力画面周期（１／Ｇ）秒
を乗じた値（Ｒ／Ｇ）・（Ｄ+１）から、前記１出力画
像フレーム当りの画面数（Ｃ）から１を減じた値（Ｃ-
１）に前記送信可能なビット量（Ｒ）を乗じて出力画像
の１秒当りの画面数（Ｆ）で除した値（Ｒ／Ｆ）・（Ｃ
-１）を減じて得られる変数（Ｌ）を設定し、その変数
（Ｌ）から前記所定値（Ｒ・Ｃ／Ｆ）に設定された通信
路の帯域を１００％使用するための最小ビット量（Ｅ）
を減じた値と、前記通信バッファ残留ビット量（Ｗ）と
の大小関係の比較及び／又は復号器の１秒当りの再生可
能な最大画面数（Ｈ）と出力画像の１秒当りの画面数
（Ｆ）との大小関係を比較し同時に次フレームの目標ビ
ット量（Ｔ）を演算処理する比較処理（Ｓ６４）と、前
記比較処理（Ｓ６４）した結果が前記通信バッファ残留
ビット量（Ｗ）又は出力画像の１秒当りの画面数（Ｆ）
の方が小さい場合（Ｗ＜Ｌ−Ｅ）は１フレーム当りの割
り当てビット量（Ｋ）から通信バッファ残留ビット量
（Ｗ）を引いたものを前記次フレームの目標ビット量
（Ｔ）に設定し、目標ビット量（Ｔ）が正値の場合は、
通常処理として、次の（Ｇ／Ｆ−１）個のフレームをス
キップし、その後の１又は２個の入力画面を符号化し、
前記目標ビット量（Ｔ）値が０又は負の場合は次の入力
画面一つを符号化せずにスキップしスキップした場合の
現フレームのビット量（Ｂ）を零となす処理（Ｓ６５）
と、現フレームのビット量（Ｂ）を前記通信バッファ残
留ビット量（Ｗ）と比較する比較段階（Ｓ６２）（Ｓ６
３）と、前記比較段階（Ｓ６２）（Ｓ６３）の比較結果
により前記通信バッファ残留ビット量（Ｗ）が枯渇しな
いように前記次フレームの目標ビット量（Ｔ）を制御す
る段階（Ｓ６４）（Ｓ６５）と、を備えたことを特徴と
する動画像情報の高性能符号圧縮システムである。
[0024] The invention according to claim 8 is a high-performance code compression system for moving image information comprising: a memory (1) that stores image data; a window (21), which is a specific region that can move on the screen of the moving image under recognition processing and that receives preferential information processing; sequential processing means that divides all the images forming the window (21) into small rectangular blocks and sequentially inputs search data and reference data for each small block from the memory (1) to carry out parallel processing; a motion vector search unit (10) that searches for the motion vector accompanying the motion of the moving image in each small block; and a window position control program that uses the motion vectors found by the motion vector search unit (10) to estimate the position of the face window (21) in the next frame and makes the window (21) follow the motion of the main subject. In order to minimize the delay time and the dropped frames that arise between the camera input image and the decoded output image, which passes through the encoder, the transmission path and the decoder, the system further comprises, as a frame-level rate control program:
a process (S61) that sets the communication buffer residual bit amount (W) to 0 and encodes the first input screen as one frame;
a process (S62) that determines whether the bit amount (B) of the current frame is positive; if B is positive, takes (C/F) seconds, the number of screens per output image frame (C) divided by the number of screens per second of the output image (F), as the encoding processing time of the current frame, and sets the bit amount (U) transmitted during that time to the predetermined value (R·C/F), the product of the transmittable bit amount per second (R) and the encoding processing time; and, if B is 0, judges that the current frame was skipped, sets the encoding processing time to the input screen period of (1/G) seconds, the reciprocal of the number of screens per second of the input image (G), and sets the bit amount (U) transmitted during that period to (R/G);
a process (S63) that updates the communication buffer residual bit amount (W) to the value obtained by adding the bit amount (B) of the current frame to W and deducting the bit amount (U) transmitted from the communication buffer during the encoding processing time of the current frame;
a comparison process (S64) that obtains a variable (L) from (R/G)·(D+1), the product of the guaranteed maximum frame delay time plus one (D+1) and the bit amount (R/G) transmittable in one input screen period, and (R/F)·(C−1), the value obtained by multiplying one less than the number of screens per output image frame (C−1) by the transmittable bit amount (R) and dividing by the number of screens per second of the output image (F); that compares the communication buffer residual bit amount (W) with the variable (L) less the minimum bit amount (E), set to the predetermined value (R·C/F), for using 100% of the bandwidth of the communication path, and compares the maximum number of screens per second that the decoder can reproduce (H) with the number of screens per second of the output image (F); and that calculates the target bit amount (T) of the next frame at the same time, so that when the comparison shows the communication buffer residual bit amount (W), or the number of screens per second of the output image (F), to be the smaller (W < L − E), the target bit amount (T) of the next frame is set to the allocated bit amount per frame (K) minus W;
a process (S65) that, if the target bit amount (T) is positive, as normal processing skips the next (G/F − 1) input screens and encodes the following one or two input screens, and, if T is zero or negative, skips the next input screen without encoding it and sets the bit amount (B) of the current frame to zero for the skipped screen;
a comparison stage (S62) (S63) that compares the bit amount (B) of the current frame with the communication buffer residual bit amount (W); and
a stage (S64) (S65) that controls the target bit amount (T) of the next frame, according to the result of the comparison stage (S62) (S63), so that the communication buffer residual bit amount (W) is not exhausted.
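The buffer bookkeeping in steps S62 and S63 can be illustrated with a minimal Python sketch. The function names are hypothetical, the use of integer division is an assumption, and the update W + B − U is an interpretation of the translated claim rather than a confirmed formula. The guard condition of step S64 is not modelled; `next_target` only shows the branch in which T = K − W applies.

```python
def transmitted_bits(B, R, C, F, G):
    """Bits U drained from the channel buffer while the current frame is handled (S62)."""
    if B > 0:
        # The frame was encoded: the coding time is C/F seconds, so U = R*C/F.
        return (R * C) // F
    # B == 0 means the frame was skipped: one input period of 1/G seconds, so U = R/G.
    return R // G


def update_buffer(W, B, U):
    """Residual buffer bits after the frame is added and U is sent (S63, assumed W + B - U)."""
    return W + B - U


def next_target(K, W):
    """Target bits T for the next frame in the branch where T = K - W.

    A result of zero or less means the next input screen is skipped (S65).
    """
    return K - W
```

For example, with a hypothetical 64 kbit/s channel (R = 64000), C = 1, F = 10 and G = 30, an encoded frame of 8000 bits drains 6400 bits and leaves 1600 bits in a buffer that started empty.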

【００２５】又、請求項９に係る発明は、前記フレーム
レベルレート制御以降の縦８横８都合６４画素でなる各
マクロブロックにおいて量子化レベルを最適調整するた
めに符号ビット量を指標としたマクロブロックレベルレ
ート制御機構を構成するプログラムとして、前フレーム
の状態を調べ現フレームの最初のマクロブロックに適用
する量子化レベルの初期値（Ｑ）を、前フレームの符号
化マクロブロックの量子化レベルの平均値（Ｑａ）を用
いて算出する第１の計算手段（Ｓ１）と、現フレームの
目標ビット量（Ｔ）が正値であれば、実際に現フレーム
の符号化処理を行い、そうでなければ全てのマクロブロ
ックの処理が終了する段階（Ｓ６）に行く判断（Ｓ２）
と、前記全てのうちの各マクロブロック（ｉ）における
動画像情報の符号化圧縮処理の前半部であり、現フレー
ムの直前の復号されたＩフレーム又はＰフレームの画像
をもとに、動き予測器で求めた各マクロブロックの動き
ベクトルによって動き補償された予測画像を生成し、こ
の予測画像と現入力フレームの画像との差分画像に対し
離散コサイン変換を行い、離散コサイン変換された周波
数成分を量子化レベル（ｑ）を用いて量子化を行う処理
（Ｓ３）と、量子化レベル（ｑ）を適切な値に更新する
処理（Ｓ４，Ｓ７〜１１）と、前記各マクロブロック
（ｉ）における動画像情報の符号化圧縮処理の後半部で
あり、可変長符号化を行いＢの値をマクロブロック
（ｉ）の符号を含むものに更新し、逆量子化して、離散
コサイン復元画像を生成する処理（Ｓ５）と、前記全て
のマクロブロックの処理が終わり、現フレームの符号が
完成した後に、前フレームの量子化レベルの初期値
（Ｑ’）、前フレームのビット量（Ｂ’）、前フレーム
の目標ビット量（Ｔ’）の値に置き換え、次フレームの
処理に備える段階（Ｓ６）と、マクロブロックが符号化
されない条件は、イントラマクロブロックでなく、量子
化周波数成分及び動きベクトル成分が全てゼロであるこ
ととし、符号化されない場合は、前記量子化レベル
（ｑ）の更新を行わないという、各マクロブロック
（ｉ）が符号化されるかどうかの判断（Ｓ７）と、マク
ロブロック（ｉ）が符号化される場合に量子化レベル
（ｑ）の更新計算の指標となる４つの変数であるところ
のマクロブロック（ｉ）のビット量の予測値（ｄ）、残
量ビット予測値（ｈ）、残量ビット許容値（ａ）及び残
量ビット目標値（ｅ）を計算する処理（Ｓ８）と、現在
の前記量子化レベル（ｑ）が最初のマクロブロックに適
用された量子化レベルの初期値（Ｑ）よりも大きい時に
前記量子化レベル（ｑ）がそれ以上大きくなり難くする
ように作用するバイアスであるパラメータ（ｂ１）と、
前記量子化レベル（ｑ）が前記初期値（Ｑ）よりも小さ
くなり難くするように作用するバイアスであるパラメー
タ（ｂ２）を求める処理（Ｓ９）と、パラメータ（ｂ
１）に０以上の定数（ｇ）を乗じて１以上の定数（ｆ）
を加えたパラメータ（ｃ１）及びパラメータ（ｂ２）に
０以上の定数（ｇ）を乗じて１以上の定数（ｆ）を加え
たパラメータ（ｃ２）を求める処理（Ｓ１０）と、正方
向の更新値（ｑ１）が最大量子化レベル（ｑｍａｘ）に
達しない正値でありしかも負方向の更新値（ｑ２）が最
小量子化レベル（ｑｍｉｎ）に達しない負値の条件下
で、前記現在の量子化レベル（ｑ）が前記初期値（Ｑ）
よりも小さくなりしかも前記残量ビット許容値（ａ）が
前記残量ビット予測値（ｈ）よりも小さい第１の条件が
真の場合は前記量子化レベル（ｑ）に前記更新値（ｑ
１）を足した値を変数（ｑ'）の値とし、偽の場合は第
２の条件を評価し、前記残量ビット許容値（ａ）と前記
パラメータ（ｃ１）の積が前記残量ビット目標値（ｅ）
よりも小さくなる前記第２の条件が真の場合は前記量子
化レベル（ｑ）にｑ１を足した値を変数（ｑ'）の値と
し、偽の場合は第３の条件を評価し、前記残量ビット目
標値（ｅ）とパラメータ（ｃ２）の積が前記残量ビット
許容値（ａ）より小さくなる第３の条件が真の場合は第
４の条件を評価し、偽の場合は前記量子化レベル（ｑ）
を変数（ｑ'）の値とし、通信バッファ残留ビット量
（Ｗ）と現フレームのビット量（Ｂ）の和が入力画面周
期（１／Ｇ）秒の間に送信されるビット量（Ｕ）より小
さくなる第４の条件が真の場合は前記量子化レベル
（ｑ）に前記更新値（ｑ２）を足しそれ以外の場合は前
記量子化レベル（ｑ）をそのままにし、偽の場合は前記
量子化レベル（ｑ）を変数（ｑ'）の値として実際の量
子化レベル（ｑ）の更新を行う処理（Ｓ１１）と、を備
え、前記第１〜４の条件の評価によって計算した変数
（ｑ'）の値を許容限度内に収納する関数（ＣＱ）によ
りクリッピングした値を前記量子化レベル（ｑ）の更新
値とするようにしたことを特徴とする請求項２〜８の何
れか１項に記載の動画像情報の高性能符号圧縮システ
ム。
The invention according to claim 9 is the high-performance code compression system for moving image information according to any one of claims 2 to 8, comprising, as a program constituting a macroblock-level rate control mechanism that uses the coded bit amount as an index in order to optimally adjust the quantization level in each macroblock of 8 × 8 (64 in total) pixels after the frame-level rate control:
first calculation means (S1) that examines the state of the previous frame and calculates the initial value (Q) of the quantization level applied to the first macroblock of the current frame, using the average value (Qa) of the quantization levels of the coded macroblocks of the previous frame;
a decision (S2) that actually encodes the current frame if the target bit amount (T) of the current frame is positive, and otherwise proceeds to the stage (S6) at which processing of all macroblocks ends;
a process (S3), the first half of the coding compression of the moving image information in each macroblock (i), that generates a predicted image, motion-compensated by the motion vector of each macroblock found by the motion predictor, from the decoded I frame or P frame immediately preceding the current frame, applies a discrete cosine transform to the difference image between this predicted image and the image of the current input frame, and quantizes the transformed frequency components using the quantization level (q);
a process (S4, S7 to S11) that updates the quantization level (q) to an appropriate value;
a process (S5), the second half of the coding compression of the moving image information in each macroblock (i), that performs variable-length coding, updates the value of B to include the code of macroblock (i), and performs inverse quantization to generate the discrete-cosine restored image;
a stage (S6) that, after all macroblocks have been processed and the code of the current frame is complete, replaces the values of the previous-frame initial quantization level (Q'), the previous-frame bit amount (B') and the previous-frame target bit amount (T') in preparation for processing the next frame;
a decision (S7) on whether each macroblock (i) is coded, in which a macroblock is not coded when it is not an intra macroblock and its quantized frequency components and motion vector components are all zero, and in which the quantization level (q) is not updated when the macroblock is not coded;
a process (S8) that, when macroblock (i) is coded, calculates the four variables serving as indices for the update of the quantization level (q): the predicted bit amount (d) of macroblock (i), the predicted remaining bits (h), the allowable remaining bits (a) and the target remaining bits (e);
a process (S9) that obtains a parameter (b1), a bias that makes the quantization level (q) less likely to grow further when the current q is larger than the initial value (Q) applied to the first macroblock, and a parameter (b2), a bias that makes q less likely to fall below the initial value (Q);
a process (S10) that obtains a parameter (c1) by multiplying b1 by a constant (g) of 0 or more and adding a constant (f) of 1 or more, and a parameter (c2) by multiplying b2 by the constant (g) and adding the constant (f); and
a process (S11) that actually updates the quantization level (q), under the condition that the positive update value (q1) is a positive value that does not reach the maximum quantization level (qmax) and the negative update value (q2) is a negative value that does not reach the minimum quantization level (qmin), as follows. If the first condition, that the current q is smaller than the initial value (Q) and the allowable remaining bits (a) are smaller than the predicted remaining bits (h), is true, the variable (q') is set to q plus q1; if it is false, the second condition is evaluated. If the second condition, that the product of a and c1 is smaller than the target remaining bits (e), is true, q' is set to q plus q1; if it is false, the third condition is evaluated. If the third condition, that the product of e and c2 is smaller than a, is true, the fourth condition is evaluated; if it is false, q' is set to q. If the fourth condition, that the sum of the communication buffer residual bit amount (W) and the bit amount (B) of the current frame is smaller than the bit amount (U) transmitted during the input screen period of (1/G) seconds, is true, q' is set to q plus q2; if it is false, q' is set to q.
The value of the variable (q') calculated by evaluating the first to fourth conditions is clipped by a function (CQ) that keeps it within the permitted limits, and the clipped value is used as the updated quantization level (q).
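The four-condition update of step S11 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the default step sizes q1 = 1 and q2 = −1 and the defaults for f and g are placeholders, and the default range 1 to 31 for the clipping function CQ is an assumption borrowed from H.263-style quantizers. The claim does not say how b1, b2, a, h and e are computed, so they are taken as inputs.

```python
def clip_q(q, qmin, qmax):
    """CQ: keep the quantization level inside the permitted range."""
    return max(qmin, min(qmax, q))


def update_q(q, Q, a, h, e, b1, b2, W, B, U,
             q1=1, q2=-1, f=1.0, g=0.0, qmin=1, qmax=31):
    """Return the updated quantization level for a coded macroblock (S10, S11)."""
    c1 = g * b1 + f  # S10: bias against raising q further above Q
    c2 = g * b2 + f  # S10: bias against lowering q below Q
    if q < Q and a < h:      # condition 1: q below its initial value and bits are short
        q_new = q + q1
    elif a * c1 < e:         # condition 2: allowance is below the target
        q_new = q + q1
    elif e * c2 < a:         # condition 3: allowance is above the target
        # condition 4: lower q only if the buffer would otherwise run dry
        q_new = q + q2 if W + B < U else q
    else:
        q_new = q
    return clip_q(q_new, qmin, qmax)
```

Because q is lowered only when both the third and fourth conditions hold, this sketch raises the quantization level more readily than it lowers it, which matches the claim's emphasis on not exceeding the bit budget.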

【００２６】これらの請求項８〜９に係る発明の作用に
より、前記フレームレベルレート制御と、それ以降の前
記マクロブロックレベルレート制御を、少ない計算量に
も拘らず、優れた制御能力を発揮できる。このように、
前記各画像フレームのビット長を高精度で平均化するた
めに、従来は不可欠とされていた非常に複雑な計算を、
簡単な計算に置き換えることにより、前記の計算処理の
ためにする前記遅延時間を減少させ、リップ同期を実現
できる。
Through the operation of the inventions according to claims 8 and 9, the frame-level rate control and the subsequent macroblock-level rate control achieve excellent control performance despite a small amount of computation. The very complex calculations that were previously considered indispensable for averaging the bit length of each image frame with high precision are replaced with simple calculations, which reduces the delay time spent on that computation and makes lip synchronization achievable.

【００２７】さらに、通信バッファ残留ビット量
（Ｗ）、現フレームのビット量（Ｂ）、１出力画像フレ
ーム当りの画面数（Ｃ）、出力画像の１秒当りの画面数
（Ｆ）、現フレームの符号化処理時間（Ｃ／Ｆ）秒の間
に通信バッファから送信されたビット量（Ｕ）、次フレ
ームの目標ビット量（Ｔ）、１秒当りの送信可能なビッ
ト量（Ｒ）、入力画像の１秒当りの画面数（Ｇ）、出力
画像の１秒当りの画面数（Ｆ）、復号器の１秒当りの再
生可能な最大画面数（Ｈ）、保証する最大フレーム遅延
時間（Ｄ）、前記変数（Ｌ）から前記所定値（Ｒ・Ｃ／
Ｆ）に設定された通信路の帯域を１００％使用するため
の最小ビット量（Ｅ）、１フレーム当りの割り当てビッ
ト量（Ｋ）は全て整数値へ近似し、フレームレベルレー
ト制御の処理が全て整数演算で行うことも可能となり、
ハードウェアの実現を容易にする。
Furthermore, the communication buffer residual bit amount (W), the bit amount of the current frame (B), the number of screens per output image frame (C), the number of screens per second of the output image (F), the bit amount (U) transmitted from the communication buffer during the encoding processing time of (C/F) seconds of the current frame, the target bit amount of the next frame (T), the transmittable bit amount per second (R), the number of screens per second of the input image (G), the maximum number of screens per second that the decoder can reproduce (H), the guaranteed maximum frame delay time (D), the minimum bit amount (E), set to the predetermined value (R·C/F) and deducted from the variable (L), for using 100% of the bandwidth of the communication path, and the allocated bit amount per frame (K) are all approximated to integer values. All frame-level rate control processing can therefore be carried out with integer arithmetic, which simplifies hardware implementation.

【００２８】[0028]

【発明の実施の形態】以下、図１乃至図２１に沿って、
本発明による実施形態について説明する。図１はウイン
ドウの説明図であり、顔のウインドウ２１と、背景画像
２２に識別して、画像情報に重み付けし、顔のウインド
ウ２１は手厚く、逆に背景画像２２はわざと画質を落と
して情報量を少なくしている。又、顔のウインドウ２１
は揺動する顔の動きに追随する機構により、最新位置に
更新され続ける。
BEST MODE FOR CARRYING OUT THE INVENTION Embodiments of the present invention are described below with reference to FIGS. 1 to 21. FIG. 1 is an explanatory diagram of the window. The image is separated into a face window 21 and a background image 22, and the image information is weighted: the face window 21 is treated generously, whereas the image quality of the background image 22 is deliberately lowered to reduce its amount of information. The face window 21 is continuously updated to the latest position by a mechanism that follows the swaying movement of the face.

【００２９】認識処理中の動画像の画面上で移動自在の
特定領域即ち優先的な情報処理を施される顔のウインド
ウ２１を構成する全画像を矩形の小ブロックに分割して
逐次処理を行う方式の下でブロックの動画像の動きに伴
った動きベクトルを利用して、次のフレームの顔のウイ
ンドウの位置を推定し、主なる被写体の動きに顔のウイ
ンドウ２１を追随させるようにした。
All the images making up the face window 21, a specific region that can move freely on the screen of the moving image under recognition processing and that receives preferential information processing, are divided into small rectangular blocks and processed sequentially. Under this scheme, the motion vectors accompanying the motion of each block are used to estimate the position of the face window in the next frame, so that the face window 21 follows the motion of the main subject.

【００３０】又、通信当初は中央位置に固定形状の顔の
ウインドウ２１を設定し、そのウインドウ２１の内部の
マクロブロックの動きベクトルの中で、ゼロでないマク
ロブロックの平均値を計算して、それをウインドウ２１
の動き方向と見なして、ウインドウ２１の位置を更新す
る。当初、人物の顔が画面の中央に位置せず、ウインド
ウ２１の位置が人物の顔と一致していない場合にも、一
旦動きベクトルをもつ顔の一部がウインドウ２１の内部
に入ると、ウインドウ２１の動きベクトルの作用によっ
て、ウインドウ２１は逐次人物の顔に近づき、最終的に
は顔と一致する。
At the start of communication, a face window 21 of fixed shape is set at the center of the screen. Among the motion vectors of the macroblocks inside the window 21, the average of the non-zero ones is calculated and regarded as the direction of movement of the window 21, and the position of the window 21 is updated accordingly. Even if the person's face is not initially at the center of the screen and the window 21 does not coincide with the face, once a part of the face that has a motion vector enters the window 21, the window's motion vector moves the window 21 step by step toward the face until it finally coincides with it.
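The window update described in this paragraph can be sketched in Python as follows. The function name, the (x, y) tuple layout and the rounding to whole pixels are assumptions made for illustration, and each motion vector is assumed to point in the direction in which the image content moves.

```python
def update_window(top_left, motion_vectors):
    """Shift the face window by the mean of the non-zero macroblock motion vectors.

    top_left: (x, y) position of the window.
    motion_vectors: list of (dx, dy), one per macroblock inside the window.
    """
    moving = [(dx, dy) for dx, dy in motion_vectors if (dx, dy) != (0, 0)]
    if not moving:
        # No macroblock moved, so the window stays where it is.
        return top_left
    mean_dx = sum(dx for dx, _ in moving) / len(moving)
    mean_dy = sum(dy for _, dy in moving) / len(moving)
    x, y = top_left
    return (x + round(mean_dx), y + round(mean_dy))
```

Ignoring the zero vectors keeps static macroblocks inside the window from diluting the estimate, which is what lets a window that only partly overlaps the face drift toward it.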

【００３１】このようにしたことにより、動画像情報を
重み付けするための選別基準をウインドウ２１として明
確にし、かつそのウインドウ２１が顔の動きに追随する
ので、確実にその動画像情報に重み付けできる。
In this way, the window 21 provides a clear selection criterion for weighting the moving image information, and because the window 21 follows the movement of the face, the weighting can be applied reliably.

【００３２】ここで、前記ＩＴＵ標準で要求されてい
る、ブロック単位の処理に関して定義しておく。尚、以
下の数字は縦×横の画素数を示す。動画像の１フレーム
は、８×８画素でなるブロックと呼ばれる小単位から構
成され、さらに１６×１６画素の輝度信号Ｙと８×８画
素の２つの色差信号Ｃｒ，Ｃｂからなる領域をマクロブ
ロックと呼ぶ。従って、マクロブロックとは４つの隣接
する輝度信号Ｙと、１つづつの色差信号Ｃｒ，Ｃｂの合
計６つのブロックで構成されている。
The block-based processing required by the ITU standard is defined here. The figures below give pixel counts as height × width. One frame of a moving image is made up of small units called blocks, each of 8 × 8 pixels. A region consisting of a 16 × 16 pixel luminance signal Y and two 8 × 8 pixel color-difference signals Cr and Cb is called a macroblock. A macroblock therefore consists of six blocks in total: four adjacent luminance (Y) blocks and one block each for the color-difference signals Cr and Cb.

【００３３】前記ＩＴＵ標準で採用されている１フレー
ムの標準の画素数は、１４４×１７６画素の輝度信号Ｙ
と、７２×８８画素の２つの色差信号Ｃｒ，Ｃｂでな
る、４分の１共通中間フォーマットＱＣＩＦ（Ｑｕａｒ
ｔｅｒＣｏｍｍｏｎＩｎｔｅｒｍｅｄｉａｔｅＦ
ｏｒｍａｔ）を始め、ＣＩＦ（Ｙ：２８８×３５２画
素，Ｃｒ／Ｃｂ：１４４×１７６画素）や、４ＣＩＦ
（Ｙ：５７６×７０４画素，Ｃｒ／Ｃｂ：２８８×３５
２画素）等の数種類が存在しており、本発明は前記ＩＴ
Ｕ標準で使用が認められている全てのフレームサイズを
対象とし、前記マクロブロック及びブロック単位で信号
処理する規定を適用又は準用する制約の範囲内で実施す
る。このような制約条件の範囲内での画質向上こそが新
規な本発明の要旨である。
Several standard frame sizes are adopted in the ITU standard, including the Quarter Common Intermediate Format, QCIF (a 144 × 176 pixel luminance signal Y and two 72 × 88 pixel color-difference signals Cr and Cb), CIF (Y: 288 × 352 pixels; Cr/Cb: 144 × 176 pixels) and 4CIF (Y: 576 × 704 pixels; Cr/Cb: 288 × 352 pixels). The present invention covers all frame sizes permitted by the ITU standard and is carried out within the constraint that the rules for signal processing in units of macroblocks and blocks are applied, either directly or mutatis mutandis. Improving image quality within these constraints is the gist of this novel invention.
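As a quick arithmetic check of the block structure and frame sizes given above, the following Python sketch counts the macroblocks in each format. The dictionary and function names are hypothetical; the sizes are the height × width luminance dimensions quoted in the text.

```python
# Luminance (Y) frame sizes as height x width, taken from the text.
FRAME_SIZES = {"QCIF": (144, 176), "CIF": (288, 352), "4CIF": (576, 704)}

MACROBLOCK = 16            # a macroblock covers 16 x 16 luminance pixels
BLOCKS_PER_MACROBLOCK = 6  # four Y blocks plus one Cr and one Cb block


def macroblock_grid(height, width, size=MACROBLOCK):
    """Return (rows, columns) of macroblocks for a luminance frame."""
    if height % size or width % size:
        raise ValueError("frame size is not a multiple of the macroblock size")
    return height // size, width // size


def macroblock_count(name):
    """Total number of macroblocks in the named frame format."""
    rows, cols = macroblock_grid(*FRAME_SIZES[name])
    return rows * cols
```

Under these definitions QCIF has a 9 × 11 grid, or 99 macroblocks, CIF has 396 and 4CIF has 1584.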

【００３４】次に、図２は顔のウインドウと周辺動きウ
インドウの説明図であり、これは、比較的顔の動きが少
ない時、対話者の関心が顔の周辺の手などに集まるの
で、その関心に答えようとしたものである。具体的動作
は、前フレーム画像と現フレーム画像の差分を求め、そ
の差分値が所定の閾値以上となるような動作を伴い、顔
に次ぐ重要性を持つ腕等の対象部を囲んで１６×１６画
素の大きさの画像ブロック（以下、マクロブロックと呼
ぶ）又は８×８画素の大きさの画像ブロック（以下、ブ
ロックと呼ぶ）からなり、変動自在の領域である周辺動
きウインドウ５１が、優先的な情報処理を施される。
Next, FIG. 2 is an explanatory diagram of the face window and the peripheral motion window. When the face moves relatively little, the other party's attention shifts to the hands and other areas around the face, and the peripheral motion window is intended to respond to that interest. Specifically, the difference between the previous frame image and the current frame image is computed. The peripheral motion window 51 is a variable region that encloses target parts such as the arms, which are second in importance only to the face and whose motion makes the difference value equal to or greater than a predetermined threshold. It consists of image blocks of 16 × 16 pixels (hereinafter called macroblocks) or image blocks of 8 × 8 pixels (hereinafter called blocks), and it receives preferential information processing.

【００３５】そして、その周辺動きウインドウ５１を構
成する前記全画像を矩形の小ブロックに分割して逐次処
理を行う方式の下でブロックの動画像に伴った動きベク
トルを利用して、次のフレームの周辺動きウインドウの
位置及び領域を推定し、推定した位置及び及び領域に周
辺動きウインドウ５１を追随させるようにした。周辺動
きウインドウ５１は、顔のウインドウ２１とは異なる別
ものであり、前フレーム画像と現フレーム画像の差分が
ある閾値より大きな領域という原理で算出し、しかも任
意の形状に適宜変化するものである。
All the images making up the peripheral motion window 51 are divided into small rectangular blocks and processed sequentially. Under this scheme, the motion vectors of the blocks are used to estimate the position and extent of the peripheral motion window in the next frame, and the peripheral motion window 51 is made to follow the estimated position and extent. The peripheral motion window 51 is distinct from the face window 21: it is computed as the region in which the difference between the previous frame image and the current frame image exceeds a threshold, and it changes into an arbitrary shape as needed.
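The frame-difference selection behind the peripheral motion window can be sketched in Python as follows. The text does not say how the difference value of a block is measured, so the sum of absolute differences used here is an assumption, as are the function name and the representation of a frame as a list of pixel rows.

```python
def moving_macroblocks(prev, curr, threshold, size=16):
    """Return the set of (row, col) macroblock indices whose frame difference
    is at least `threshold`.

    prev, curr: luminance frames as equally sized lists of pixel rows.
    The difference measure is the sum of absolute differences per macroblock.
    """
    rows, cols = len(curr), len(curr[0])
    selected = set()
    for top in range(0, rows, size):
        for left in range(0, cols, size):
            sad = sum(
                abs(curr[y][x] - prev[y][x])
                for y in range(top, min(top + size, rows))
                for x in range(left, min(left + size, cols))
            )
            if sad >= threshold:
                selected.add((top // size, left // size))
    return selected
```

Because the result is a set of individual macroblocks rather than a rectangle, the selected region can take an arbitrary shape, as the paragraph describes.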

【００３６】この周辺動きウインドウ５１は顔の動きが
激しい時は顔のウインドウ２１に近づき、顔の動きがほ
とんど無い時は顔の周辺をカバーするように、閾値を適
応的に変える。特に手を動かした場合などは、周辺動き
ウインドウ５１がこの手の動きをカバーすることができ
る。このようにしたことにより、人の顔を主体とするの
みならず、身振り手振りの動作を伴う対象部までの動画
像情報に対して重み付けするための選別基準を、周辺動
きウインドウ５１として明確にし、なおかつ周辺動きウ
インドウ５１がその人の手などの動きに追随させ、人の
顔を主体として身振り手振りの動作を伴う対象部まで確
実にその動画像情報に重み付けできる。
The threshold is changed adaptively so that the peripheral motion window 51 approaches the face window 21 when the face is moving vigorously and covers the area around the face when the face is hardly moving. In particular, when a hand is moved, the peripheral motion window 51 can cover that movement. The peripheral motion window 51 thus provides a clear selection criterion for weighting the moving image information not only of the face but also of the parts that move with gestures. Because the peripheral motion window 51 also follows the movement of the person's hands and the like, the moving image information can be weighted reliably from the face, as the main subject, out to the gesturing parts.

【００３７】次に、図３は背景の情報量を削減する方法
の説明図であり、顔のウインドウ２１の他、さらに周辺
動きウインドウ５１の枠でも背景を区別しているが、そ
の点に関しては図２でも説明した通りである。そして、
その背景の動きが激しく動画像情報量の多い場合には前
記背景の動き量を低減して背景画質をわざと劣化される
演算、即ち現フレームのマクロブロック画像に、そのマ
クロブロック画像と同一の位置にある前画像フレームの
データを所定の割合で加算し混合する時間方向フィルタ
（図示せず）を備え、前記背景の動画像情報量を削減す
る。
Next, FIG. 3 is an explanatory diagram of a method for reducing the amount of background information. The background is distinguished by the frame of the peripheral motion window 51 as well as by the face window 21, as already explained for FIG. 2. When the background moves vigorously and carries a large amount of moving image information, the amount of background motion is reduced and the background image quality is deliberately degraded. For this purpose a temporal filter (not shown) is provided, which adds the data of the previous image frame at the same position to the macroblock image of the current frame in a predetermined ratio and blends them, thereby reducing the amount of background moving image information.

【００３８】顔のウインドウ２１と周辺動きウインドウ
５１を合わせて、両方の領域は手厚く、逆にそれ以外の
背景画像２２（図１参照）はわざと画質を落として情報
量を少なくさせるために、時間方向のフィルタを施すの
である。ここでは、現フレームのマクロブロックそのも
のを処理する代わりに、同位置の前画像フレーム（以
下、「前フレーム」又は「前参照フレーム」とも称す）
のマクロブロックと現フレームのマクロブロックから、
重みをつけて平均化して導出されるマクロブロックを利
用する。平均的な重みでは、同位置の前フレームのマク
ロブロックと現フレームの画素値の平均値が考えられ
る。極端な場合は前フレームのマクロブロックだけに置
き換えた場合は、平均画面は静止画のようになる。この
ように背景画像のノイズや動きを抑制することにより、
少ない情報割り当てを背景画面に限定し、顔及び周辺領
域に多くの情報割り当てが可能になる。
The face window 21 and the peripheral motion window 51 together are treated generously, whereas the remaining background image 22 (see FIG. 1) is passed through the temporal filter to deliberately lower its image quality and reduce its amount of information. Here, instead of processing the macroblock of the current frame as it is, the system uses a macroblock derived as a weighted average of the macroblock at the same position in the previous image frame (hereinafter also called the "previous frame" or "previous reference frame") and the macroblock of the current frame. With equal weights, the result is the mean of the pixel values of the co-located previous-frame macroblock and the current-frame macroblock. In the extreme case where the macroblock is replaced entirely by that of the previous frame, the averaged picture looks like a still image. Suppressing noise and motion in the background in this way confines the background to a small allocation of information and allows more information to be allocated to the face and peripheral regions.
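The weighted averaging performed by the temporal filter can be sketched in Python as follows. The function name, the parameter `alpha` (the weight given to the previous frame) and the rounding to integer pixel values are assumptions for illustration.

```python
def temporal_filter(curr_mb, prev_mb, alpha):
    """Blend a background macroblock with the co-located previous-frame macroblock.

    alpha is the weight of the previous frame:
      alpha = 0.0 keeps the current macroblock unchanged,
      alpha = 0.5 gives the plain average of the two,
      alpha = 1.0 replaces it with the previous macroblock (still-image effect).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return [
        [round((1.0 - alpha) * c + alpha * p) for c, p in zip(curr_row, prev_row)]
        for curr_row, prev_row in zip(curr_mb, prev_mb)
    ]
```

A larger `alpha` makes the filtered background change more slowly from frame to frame, so the difference image that has to be coded becomes smaller.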

【００３９】このようにして、伝送情報量を多く費やす
激しい動画の画像情報は時間フィルタを通過することに
より、画質を適度に劣化させる代わりに情報量を激減で
きる。この情報量激減処理を経た画像を復号して再現す
ると、早い動きで変化する場面でのみ少し画質劣化した
印象を受ける程度で済む。具体的には、走っている自動
車内でこのテレビ電話を用いて送信した場合に、送話者
の背景として写りこむ車窓から流れて見える風景が少し
ぼやける程度である。このようにしたことにより、背景
の動きが激しくその情報量を大幅に削減したい場合を特
定し、かつ適切に処理し、重要でない背景の動画像の情
報量を見苦しくならない程度に適宜に間引きできる。
In this way, image information from rapidly moving video, which would otherwise consume a large share of the transmitted information, passes through the temporal filter, so that its amount of information is drastically reduced in exchange for a moderate loss of image quality. When an image processed in this way is decoded and reproduced, the viewer notices only a slight loss of quality, and only in scenes that change with fast motion. For example, when this videophone is used to transmit from a moving car, the scenery streaming past the car window behind the speaker is merely slightly blurred. Cases in which the background moves vigorously and its information should be greatly reduced are thus identified and handled appropriately, and the information in the unimportant background can be thinned out to an extent that does not become unsightly.

【００４０】ここで、予めＩフレーム、Ｐフレーム及び
Ｂフレームの関係を説明しておく。Ｉフレームは、現入
力フレームの画像情報のみを使って符号化するため、そ
の画像は、それ以前の復号画像の画質に依存しない。Ｐ
フレームは、前参照フレーム（現フレームの直前の復号
されたＩフレーム又はＰフレーム）の画像をもとに、動
き予測器で求めた各マクロブロックの動きベクトルによ
って動き補償された予測画像を生成し、この予測画像と
現入力フレームの画像との差分画像の情報を圧縮（離散
コサイン変換、量子化、可変長符号化）して復号器に送
信する。
The relationship among I frames, P frames and B frames is explained first. An I frame is encoded using only the image information of the current input frame, so its image does not depend on the quality of previously decoded images. For a P frame, a predicted image is generated from the previous reference frame (the decoded I frame or P frame immediately preceding the current frame), motion-compensated by the motion vector of each macroblock found by the motion predictor. The difference image between this predicted image and the image of the current input frame is then compressed (discrete cosine transform, quantization, variable-length coding) and transmitted to the decoder.

【００４１】復号器では、この差分画像情報の圧縮符号
を復号（離散コサイン逆変換、逆量子化、可変長復号
化）し、これとは別に符号器から送信される動きベクト
ルによって動き補償された予測画像（符号器側で生成す
るものと同一）に足し合わせることによって、現Ｐフレ
ームを復号する（復号Ｐフレーム＝予測画像＋復号差分
画像）。ただし、復号差分画像には圧縮符号化に伴うノ
イズが混入している。復号されたＰフレームの画質は、
予測画像の予測精度（予測画像と現入力フレームの画像
の類似度）に大きく依存するので、予測画像のもととな
る前参照フレームの画質の直接的な影響を受ける。
The decoder decodes the compressed code of this difference image information (inverse discrete cosine transform, inverse quantization, variable-length decoding) and adds the result to the predicted image (identical to the one generated on the encoder side), which is motion-compensated by the motion vectors sent separately from the encoder. This yields the current P frame (decoded P frame = predicted image + decoded difference image). The decoded difference image, however, contains noise introduced by the compression coding. The quality of the decoded P frame depends heavily on the prediction accuracy of the predicted image (the similarity between the predicted image and the image of the current input frame), and is therefore directly affected by the quality of the previous reference frame from which the predicted image is derived.
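The relation "decoded P frame = predicted image + decoded difference image" can be illustrated for a single block with a minimal Python sketch. The function name, the (dy, dx) motion vector layout, and the clipping of the result to the 8-bit range 0 to 255 are assumptions; bounds checking of the displaced block is omitted for brevity.

```python
def reconstruct_p_block(reference, top, left, mv, residual):
    """Rebuild one block of a P frame.

    reference: previous decoded frame as a list of pixel rows.
    top, left: position of the block in the current frame.
    mv: (dy, dx) displacement of the matching block in the reference frame.
    residual: decoded difference block (square list of rows).
    """
    dy, dx = mv
    size = len(residual)
    block = []
    for y in range(size):
        row = []
        for x in range(size):
            predicted = reference[top + dy + y][left + dx + x]  # motion compensation
            value = predicted + residual[y][x]                  # add decoded difference
            row.append(max(0, min(255, value)))                 # assumed 8-bit clipping
        block.append(row)
    return block
```

Since every reconstructed pixel starts from a reference pixel, any error in the reference frame carries straight through to the decoded P frame, which is the dependence the paragraph describes.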

【００４２】つまり、前参照フレームの画質が悪いと、
復号される現Ｐフレームの画質も悪くなり、さらにそれ
以降のＰフレームの画質も連鎖的に悪影響を及ぼす。逆
に、Ｐフレームの画質の向上は、それを参照するフレー
ムの画質をも向上させ、さらにそれ以降のフレームの画
質にも連鎖的に向上させる。Ｂフレームは、時間的に前
後する２つの参照フレーム（現フレームの直前及び直後
に復号されたＩフレーム又はＰフレーム）の画像をもと
に、前記同様の予測画像の生成、更には現入力フレーム
との差分画像情報の圧縮符号を復号器に送信する。
In other words, if the quality of the previous reference frame is poor, the quality of the decoded current P frame is also poor, and the degradation propagates to the subsequent P frames. Conversely, improving the quality of a P frame improves the quality of the frames that reference it, and the improvement propagates to later frames as well. For a B frame, a predicted image is generated in the same way from two reference frames, one earlier and one later in time (the I frames or P frames decoded immediately before and immediately after the current frame), and the compressed code of the difference image from the current input frame is transmitted to the decoder.

【００４３】復号器では前記同様の復号化された差分情
報を符号器から別に送信される動きベクトル及び予測方
式（順方向予測、逆方向予測、両方向予測）情報をもと
に生成される予測画像（符号器で生成されるものと同
一）に足し合わせることで現Ｂフレームを復号する。Ｂ
フレームがＰフレームと違う点は、２つの参照フレーム
をもとに予測画像を生成するため、Ｐフレームに比べ予
測の精度が上がり、差分画像の圧縮符号の量が小さいと
いう点の他に、Ｂフレーム自体は参照フレームにはなら
ないため、Ｂフレームの画質の劣化は他の（それ以降
の）フレーム画質に悪影響を及ぼさないという点であ
る。ただし、Ｂフレームの画質自体は、Ｐフレーム同
様、その参照フレームの画質の影響を大きく受ける。
The decoder decodes the current B frame by adding the difference information, decoded as described above, to the predicted image (identical to the one generated by the encoder) that is generated from the motion vectors and the prediction mode information (forward, backward or bidirectional prediction) sent separately from the encoder. A B frame differs from a P frame in two respects. First, because the predicted image is generated from two reference frames, prediction is more accurate than for a P frame and the compressed code of the difference image is smaller. Second, because a B frame never serves as a reference frame, degradation of its quality does not harm the quality of other (subsequent) frames. The quality of the B frame itself, however, is strongly affected by the quality of its reference frames, just as with a P frame.

【００４４】次に、図４はＢフレーム処理による情報量
を削減する方法の説明図であり、各処理過程における信
号及び処理の名称を記載し、必ずしも各信号処理の機能
部だけの名称を記載したものではない。従って各処理に
関して、説明の必要に応じて機能部の名称にだけ符号を
つけている。先ず図４（ａ）に示す符号器においては、
現フレーム画像と前参照フレーム画像と後参照フレーム
画像を動き予測機能部４１へ入力する。この動き予測機
能部４１では動き予測、動き補償及び予測方式決定を行
う。そして、その動き予測機能部４１から出力される予
測画像と現フレーム画像との差分画像の情報を全画素値
ゼロ化機能部４２へ入力し、その差分情報の全画素値を
強制的にゼロにする。
Next, FIG. 4 is an explanatory diagram of a method for reducing the amount of information through B-frame processing. It labels the signals and operations at each processing stage, not only the functional units that perform the signal processing. Reference numerals are therefore attached only to the names of functional units, as the explanation requires. First, in the encoder shown in FIG. 4(a), the current frame image, the previous reference frame image and the subsequent reference frame image are input to the motion prediction function unit 41, which performs motion prediction, motion compensation and selection of the prediction mode. The difference image between the predicted image output from the motion prediction function unit 41 and the current frame image is then input to the all-pixel-value zeroing function unit 42, which forces all pixel values of the difference information to zero.

【００４５】そして、その全画素値ゼロ化機能部４２か
ら出力されるゼロ化全画素情報は離散コサイン変換部４
３を経由し、量子化部４９を経て符号生成部４５に入力
される。符号生成部４５は動き予測機能部４１が決定し
た予測方式により動画像の次の動きを予測しながら前記
ゼロ化全画素情報を符号化する。尚、符号器全体の構成
は図１１に示すブロック図に沿って後述し、図１２で復
号器のブロック図を、図１３で符号器／復号器兼用一体
のブロック図を示し、それらに沿って後述するので、こ
こではＢフレーム処理における信号の流れを示して、そ
の作用を説明する。
The zeroed all-pixel information output from the all-pixel-value zeroing function unit 42 passes through the discrete cosine transform unit 43 and the quantization unit 49 and is input to the code generation unit 45. The code generation unit 45 encodes the zeroed all-pixel information while predicting the next motion of the moving image using the prediction mode determined by the motion prediction function unit 41. The overall configuration of the encoder is described later with reference to the block diagram of FIG. 11; FIG. 12 shows a block diagram of the decoder and FIG. 13 a block diagram of a combined encoder/decoder, both of which are also described later. Here, only the signal flow in B-frame processing is shown and its operation explained.

【００４６】次に図４（ｂ）に示す復号器では、符号器
により符号圧縮して送信され伝送経路１００を経て送り
届けられた動画像情報を受信して復号する復号部４６を
経て、その復号部４６から復号信号を逆量子化部４７へ
入力して逆量子化する。そして、その逆量子化部４７か
ら出力される逆量子化信号を離散コサイン逆変換部４８
へ入力して離散コサイン逆変換することにより前記差分
画像に復元し、その復元差分画像と前記予測方式により
予測された予測画像で足し合わせて復号画像を出力す
る。
Next, in the decoder shown in FIG. 4(b), the moving image information that was code-compressed and transmitted by the encoder and delivered over the transmission path 100 is received and decoded by the decoding unit 46, and the decoded signal from the decoding unit 46 is input to the inverse quantization unit 47 and inversely quantized. The inversely quantized signal output from the inverse quantization unit 47 is input to the inverse discrete cosine transform unit 48, where the inverse discrete cosine transform restores the difference image. The restored difference image is added to the predicted image obtained with the prediction mode, and the decoded image is output.

[0047] Commonly used moving image compression schemes have three frame types: the I frame, which directly encodes the image information of the current input frame; the P frame, which predicts the image information of the current frame from that of the preceding reference frame and encodes the difference image from the predicted image; and the B frame, which encodes the difference image from a predicted image synthesized from the image information of two reference frames, one temporally before and one after. A "reference frame" here is an I frame or P frame that has already been decoded. A "predicted image" is an image obtained by predicting, macroblock by macroblock, the motion that occurred between the reference frame and the current input frame and compensating the reference frame for that motion. More recently, the H.263 and H.263 Plus standards have also proposed a PB-frame coding method in which a B frame and the P frame that follows it are encoded together in macroblock units.

[0048] The conventional B frame, or the B frame in the PB-frame coding method, is predicted by one of the following three methods.
(1) Forward prediction, which predicts from the preceding reference frame
(2) Backward prediction, which predicts from the following reference frame
(3) Bidirectional prediction, which predicts from both the preceding and following reference frames
Conventionally, the difference image between the current input frame image and the predicted image was encoded and sent to the decoder in order to correct the mismatch between the two. As shown in FIG. 4(a), the method described here instead forces this difference image to an all-zero image, so that in effect only the choice among the three B-frame prediction methods is transmitted as information, which drastically reduces the amount of B-frame information.
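As an illustration of the zero-residual B frame described above, the following minimal Python sketch picks one of the three prediction methods for a block and reconstructs the block at the decoder from the prediction alone. It is a sketch under assumptions, not the patented implementation: the function names, the use of the sum of absolute differences (SAD) as the selection criterion, the rounded average used for bidirectional prediction, and the omission of motion search are all hypothetical choices. Blocks are flat lists of pixel values that are assumed to be already motion-compensated.

```python
MODES = ("forward", "backward", "bidirectional")


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))


def predict(mode, prev_ref, next_ref):
    """Build the predicted block for one of the three B-frame methods."""
    if mode == "forward":
        return list(prev_ref)
    if mode == "backward":
        return list(next_ref)
    # Bidirectional: rounded average of both references (assumed rule).
    return [(p + n + 1) // 2 for p, n in zip(prev_ref, next_ref)]


def choose_b_mode(cur, prev_ref, next_ref):
    """Encoder side: only the chosen mode is sent; the residual is all zero."""
    return min(MODES, key=lambda m: sad(cur, predict(m, prev_ref, next_ref)))


def decode_b_block(mode, prev_ref, next_ref):
    """Decoder side: prediction plus an all-zero difference image."""
    prediction = predict(mode, prev_ref, next_ref)
    residual = [0] * len(prediction)  # forced to zero by the encoder
    return [p + r for p, r in zip(prediction, residual)]
```

Because the decoder still adds a (zero) difference image to the prediction, its structure is unchanged, which matches the compatibility claim in the description.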

[0049] As explained earlier, a B frame never serves as a reference frame for encoding other frames, so even if the image quality of a B frame degrades somewhat, the encoding of subsequent frames is unaffected. In addition, the bits saved by drastically reducing the B-frame information can be reallocated to the P frames, which improves P-frame image quality and, as also explained earlier, the image quality of the moving image as a whole.

[0050] Furthermore, because the B-frame difference image is forced to an all-zero image, the discrete cosine transform unit 43 and the quantization unit 49 enclosed by the broken line in FIG. 4(a) are not needed, which reduces hardware. Introducing this new B-frame encoding method requires no change to the configuration of the decoder, and compatibility with the conventional standards is fully maintained. When the motion of the moving image is rapid and intense, the amount of information in the difference image would normally grow. Forcing all pixel values to zero keeps the difference-image information to a minimum, so that the source image can be restored at the destination from the least possible information.

[0051] Next, a quantization method that reduces noise in the color difference signals of an image and improves the visual quality of the decoded image is described. The quality of the image reproduced by the decoder is generally determined by the amount of noise introduced while the luminance signal and the two color difference signals are compression-encoded. Human vision is particularly sensitive to the color difference signals, and reducing the noise in these signals is known to improve visual image quality.

[0052] The conventional quantization method converts the discrete-cosine-transformed frequency component data according to the specified quantization level using the following formula.
|L| = [(|C| − Q·s·(p + f)) / (Q·s)]
Here |L| is the absolute value of the quantized data, |C| is the absolute value of the original data, Q is the quantization level, s is the quantization level correction value, p is the inverse quantization correction value, f is the rounding correction parameter, and [ ] denotes conversion from a real number to an integer by rounding down.

[0053] The inverse quantization correction value p and the quantization level correction value s are specified by each moving image coding standard. In the H.263 and H.263 Plus standards, p takes the value 0 (for the DC component of an intra macroblock) or 0.5 (otherwise), depending on the macroblock coding mode and the type of frequency component data, and s is fixed at 2. An intra macroblock here is a macroblock whose current image is encoded directly, without using a predicted image.

[0054] The rounding correction parameter f, on the other hand, is not defined by the standards and can be set freely. f = 0.5 corresponds to quantization by rounding to nearest, and f = 0 to quantization by rounding down. In the conventional quantization method adopted in TMF-11, f is set to 0.5 (intra macroblocks) or 0.25 (otherwise) according to the coding mode of each macroblock, and the same f value is applied to the luminance signal and the two color difference signals.

[0055] In the present invention, the f value is set differently for the luminance signal and the color difference signals, which greatly reduces the noise in the color difference signals while the same quantization level is still applied to both. Specifically, f is determined as follows.
(1) For intra macroblocks, f = 0.5 (quantization by rounding to nearest), as before.
(2) Otherwise, f = 0 (rounding down) is applied to the luminance signal and f = 0.5 (rounding to nearest) to the color difference signals.
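The selection rule for f can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the patent's exact quantizer: it omits the inverse quantization correction value p and applies f directly as a rounding offset, floor(|C| / (Q·s) + f), so that f = 0.5 rounds to nearest and f = 0 rounds down, as the description states. The function names and the `is_intra` and `is_chroma` flags are hypothetical.

```python
from fractions import Fraction
from math import floor

ROUND_NEAREST = Fraction(1, 2)  # f = 0.5
ROUND_DOWN = Fraction(0)        # f = 0


def rounding_parameter(is_intra: bool, is_chroma: bool) -> Fraction:
    """Pick f per macroblock type and signal type."""
    if is_intra:
        return ROUND_NEAREST          # rule (1): intra macroblocks
    # rule (2): luminance rounds down, color difference rounds to nearest
    return ROUND_NEAREST if is_chroma else ROUND_DOWN


def quantize(coeff: int, q: int, f: Fraction, s: int = 2) -> int:
    """Simplified quantizer: floor(|C| / (Q*s) + f), sign restored.

    s = 2 follows the H.263 value quoted in the description; leaving out
    the correction value p is an assumption made to keep the sketch short.
    """
    level = floor(Fraction(abs(coeff), q * s) + f)
    return level if coeff >= 0 else -level
```

With Q = 5, a luminance coefficient of 9 in a non-intra macroblock quantizes to 0, while a color difference coefficient of the same size quantizes to 1, which shows how rounding down produces more zero-valued components.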

[0056] Rounding down in the quantization of the luminance signal increases quantization noise, so it was never considered in conventional practice. However, rounding down the luminance signal dramatically increases the number of quantized frequency components that take the value 0, which markedly improves the compression ratio and brings about a relatively lower quantization level. The lower quantization level offsets the increase in quantization noise to some extent, and the final quantization noise has been confirmed to differ from that of the conventional method by only about 0.0 dB to 0.3 dB. Conversely, applying rounding to nearest in the quantization of the color difference signals improves their noise by about 1.2 dB to 1.3 dB, and the overall noise ratio also improves by about 0.3 dB. Moreover, the large reduction in color difference noise makes color reproduction more accurate and greatly improves visual image quality. A configuration simpler than the conventional one therefore reduces color difference noise efficiently, and because of the characteristics of the human eye the perceived improvement in image quality exceeds the numerical improvement in S/N ratio.

[0057] Next, the DC coefficient of an intra macroblock (the value in the upper-left corner of an 8×8 block) is quantized by dividing it by 8 and rounding the result to the nearest integer, and the DC coefficient is recovered by multiplying the received quantized value by 8. Thus [inverse quantized value] = 8 × [quantized value]. The luminance and color difference signals are both quantized by rounding to nearest only for intra macroblocks, which directly encode the current macroblock image (16×16 or 8×8) without using the predicted image. For all other macroblocks, the luminance signal is quantized by rounding down and the color difference signals by rounding to nearest.
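The intra DC rule (divide by 8 with rounding, multiply by 8 to reconstruct) is small enough to show directly. The following Python sketch uses integer arithmetic only; the function names are hypothetical, and a non-negative DC coefficient is assumed.

```python
def quantize_intra_dc(dc: int) -> int:
    """Divide the DC coefficient by 8 and round to nearest (dc >= 0 assumed)."""
    return (dc + 4) // 8


def dequantize_intra_dc(level: int) -> int:
    """[inverse quantized value] = 8 x [quantized value]."""
    return 8 * level
```

For example, a DC coefficient of 12 quantizes to 2 and reconstructs to 16, so the reconstruction error never exceeds 4.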

[0058] This reduces the noise in the color difference signals while the same quantization level is applied to the luminance and color difference signals. For the same measured amount of noise reduction, reducing noise in the color difference signals has a greater visual effect than reducing it in the luminance signal. A configuration simpler than the conventional one therefore reduces color difference noise efficiently and, because of the characteristics of the human eye, contributes a perceived improvement in image quality beyond the numerical improvement in S/N ratio.

[0059] FIG. 5 lists the conditions and variables used in frame-level rate control, and FIG. 6 is a flowchart of the frame-level rate control processing. FIG. 5(a) shows the conditions taken into account when calculating the target bit amount in frame-level rate control. R is the number of bits that can be transmitted per second. G, the number of pictures per second of the input image, is determined by the moving image input source (camera or video) on the encoder side and generally takes a value of 25 to 30. F, the number of pictures per second of the output image, is at most G, and F and G are assumed to be in an integer ratio.

[0060] C, the number of pictures per output image frame, is C = 2 when the next output frame uses the PB-frame coding method and C = 1 when it is an I frame or P frame. H, the maximum number of pictures per second that the decoder can reproduce, is assumed to be at least F. D, the guaranteed maximum frame delay, is the delay from image input on the encoder side to image reproduction on the decoder side, expressed in units of the input picture period (1/G second). This delay depends on the time needed to transmit the coded information of each picture and also on the processing delays of the encoder and decoder, but those processing delays are ignored here. The condition that D must satisfy is explained in the next paragraph.

[0061] FIG. 5(b) shows three variables determined by the conditions of FIG. 5(a). E is the minimum bit amount needed to use 100% of the channel bandwidth. If the actual code bit amount falls below E, the communication buffer that holds the output information empties, a phenomenon called underflow, which lowers the effective communication speed and consequently degrades image quality. L is the maximum bit amount for which the frame delay D can be guaranteed. If the actual code bit amount exceeds L, the delay of the reproduced image at the decoder exceeds D. To guarantee the frame delay D while maintaining stable moving image communication with little underflow, L must be at least E, which gives the condition on D in FIG. 5(a): D ≥ (2·C − 1)·G/F − 1. K is the bit amount allocated per frame and is set to a value from E to L inclusive. The s used here is a constant that takes a value between 0 and 1.
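The condition D ≥ (2·C − 1)·G/F − 1 can be checked with a few lines of integer arithmetic. The Python sketch below is only an illustration: the function names are hypothetical, and it assumes, as the description does, that G is an integer multiple of F. The definitions of E, L, and K appear only in FIG. 5 and are not reproduced here.

```python
def min_guaranteed_delay(c: int, g: int, f: int) -> int:
    """Smallest D (in input picture periods) satisfying D >= (2C - 1) * G / F - 1."""
    if g % f != 0:
        raise ValueError("G is assumed to be an integer multiple of F")
    return (2 * c - 1) * (g // f) - 1


def delay_is_sufficient(d: int, c: int, g: int, f: int) -> bool:
    """True if the guaranteed delay D meets the condition."""
    return d >= min_guaranteed_delay(c, g, f)
```

With G = 30 and F = 10, the bound is 2 picture periods for I or P frames (C = 1) and 8 picture periods for PB frames (C = 2).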

[0062] FIG. 5(c) shows the variables that depend on the code bit amount of each image frame. The variables R, G, F, C, H, D, E, L, K, W, B, U, and T appearing in FIG. 5(a), (b), and (c) are all approximated by integer values. As a result, all of the frame-level rate control processing described below can be performed with integer arithmetic, which simplifies hardware implementation.

[0063] FIG. 6 is a flowchart of the frame-level rate control processing. In FIG. 6 (S61), the communication buffer residual bit amount W is set to 0 and the first input picture is encoded as an I frame. In FIG. 6 (S62), it is determined whether the bit amount B of the current frame is positive. If B is positive, the encoding time of the current frame is taken to be C/F seconds and the bit amount U transmitted during that time is set to R·C/F. If B is 0, the current frame is judged to have been skipped, the encoding time is taken to be 1/G second, equal to the input picture period, and U is set to R/G. In FIG. 6 (S63), the sum of the communication buffer residual bit amount W and the bit amount B of the current frame is compared with the bit amount U transmitted from the communication buffer during the encoding time of the current frame, and W is updated.
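Steps S62 and S63 can be sketched as follows. The text gives U explicitly but leaves the exact update of W to FIG. 6, so the rule used here, W = max(W + B − U, 0), is an assumption, as are the function names and the use of truncating integer division.

```python
def bits_sent_during_frame(b: int, r: int, c: int, f: int, g: int) -> int:
    """S62: bits U drained from the buffer while the current frame is handled."""
    if b > 0:
        return r * c // f   # frame was coded: C/F seconds at R bits per second
    return r // g           # frame was skipped: one input picture period


def update_buffer(w: int, b: int, u: int) -> int:
    """S63 (assumed form): add the new frame's bits, remove the sent bits."""
    return max(w + b - u, 0)
```

For R = 64000, F = 10, G = 30, and C = 1, a coded frame of 8000 bits leaves 1600 bits in an initially empty buffer, and a following skipped frame drains it to 0.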

[0064] In FIG. 6 (S64), the target bit amount T of the next frame is calculated, and for this the relation between W and L − E and the relation between F and H are examined. Normally W < L − E holds, and T is set to the per-frame allocated bit amount K minus the communication buffer residual bit amount W. If W ≥ L − E, however, the current frame cannot be reproduced by the decoder within the guaranteed frame delay, and a delay exceeding the guaranteed delay occurs temporarily (hereinafter called the "excess delay state"). K is set so that this temporary excess delay state is subsequently resolved on the encoder side, and if the decoder's maximum number of reproducible pictures per second H is greater than the output picture rate F, the excess delay state is resolved on the decoder side as well.

[0065] If H is equal to F, however, the excess delay state cannot be resolved on the decoder side even after it has been resolved on the encoder side, and it may persist. A temporary excess delay state is barely perceptible, but a persistent one often causes a problematic delay, so T = 0 is set when W ≥ L − E and F = H. In FIG. 6 (S65), if T is positive, normal processing is performed: the next (G/F − 1) input pictures are skipped and the following C input pictures are encoded (C = 1 if the next frame is an I or P frame, C = 2 if it is a PB frame). If T is zero or negative, the next single input picture is skipped without being encoded, and the bit amount of the skipped frame is B = 0.
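Steps S64 and S65 can be sketched as follows. The description states T = K − W when W < L − E and T = 0 when W ≥ L − E and F = H; it does not state T for W ≥ L − E with H > F, so this sketch assumes T = K − W in that case as well. The function names and the dictionary returned by `next_action` are hypothetical.

```python
def target_bits(w: int, k: int, l: int, e: int, f: int, h: int) -> int:
    """S64: target bit amount T for the next frame."""
    if w >= l - e and f == h:
        return 0            # persistent excess delay must be avoided
    return k - w            # normal case (also assumed when H > F)


def next_action(t: int, g: int, f: int, c: int) -> dict:
    """S65: how many input pictures to skip and how many to encode next."""
    if t > 0:
        return {"skip": g // f - 1, "encode": c}
    # T <= 0: drop exactly one input picture; its bit amount B is 0.
    return {"skip": 1, "encode": 0}
```

The second branch of `next_action` is what limits a forced drop to a single input picture rather than C·G/F pictures.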

[0066] In this way, the delay from image input on the encoder side to image reproduction on the decoder side can be controlled strictly. Moreover, when output pictures have to be dropped, the conventional method drops C·G/F pictures, whereas the present invention drops only one picture, so fewer pictures are dropped overall and stable moving image communication becomes possible.

[0067] The comparison means that compares the bit amount B of the current frame of the encoded moving image information with the communication buffer residual bit amount W is shown in FIG. 6 (S62) and (S63), and the control means that, based on the comparison result, controls the target bit amount T of the next frame so that the residual bit amount W is not exhausted is shown in FIG. 6 (S64) and (S65). Using the result of this control, a means of calculating the target bit amount T of the next frame in frame-level rate control is established that minimizes the delay and frame dropping occurring between camera input and output of the decoded image via the encoder, the transmission path 100, and the decoder. Phenomena that conventional methods could not control adequately, such as communication buffer underflow, which lowers the effective communication speed, and delay of the reproduced image, are thereby controlled effectively with very simple calculations, and lip synchronization can be achieved.

[0068] Next, a macroblock-level rate control method that uses the code bit amount as an index is described with reference to FIGS. 7, 8, 9, and 10. FIG. 7 lists the variables and constants used in macroblock-level rate control. In particular, the average quantization level Qa of the coded macroblocks of the previous frame is the sum of the quantization levels applied to the coded macroblocks divided by the number of coded macroblocks (a macroblock that is not an intra macroblock and whose quantized frequency components and motion vector components are all zero is regarded as not coded).

[0069] As shown in FIG. 8, step (S1) constitutes a first calculation means that computes the initial quantization level Q, applied to the first (i = 1) macroblock of the current frame, from the weighted average Qa of the quantization levels of the coded macroblocks of the previous frame. The second calculation means, used in FIG. 8 (S3) to compute the fine adjustment of the quantization level applied to the second and subsequent macroblocks (i = 2, 3, ..., N) from the target bit amount, the actual code amount up to the current macroblock, and the quantization level applied to the first (i = 1) macroblock, is configured as shown in FIGS. 9 and 10.

[0070] FIG. 8 is a flowchart of the macroblock-level rate control processing. This control is not applied to the I-frame encoding of the first input picture; when it is not applied, the quantization level is fixed at a specific value. In FIG. 8 (S1), the state of the previous frame is examined. If the previous frame was the first I frame or was skipped (B' = 0), the initial quantization level Q' of the previous frame is used unchanged as the initial quantization level Q of the current frame. Otherwise, the average quantization level Qa of the coded macroblocks of the previous frame is multiplied by B'/T', the ratio of the previous frame's actual coded bit amount B' to its target bit amount T'; the product is passed as the argument to the function CQ, and the output of CQ becomes the initial quantization level Q of the current frame.

[0071] FIG. 9 illustrates the definition of the function CQ(x) used in macroblock-level rate control. As shown in FIG. 9, CQ is a "clipping" function: it outputs Qmin when the argument x is below the allowable minimum quantization level Qmin and outputs x unchanged when x is within the allowable range. When the actual code bit amount of the previous frame exceeded the target bit amount, the initial quantization level is set higher than Qa; conversely, when it fell short of the target, the initial level is set lower than Qa. This adaptively adjusts the "control speed" of the macroblock-level rate control and improves its control capability.
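The computation of the initial quantization level in step (S1) can be sketched as follows. This is an illustration under assumptions: the clipping bounds 1 and 31 are the H.263 quantizer range and are supplied as defaults rather than taken from FIG. 9, the upper clamp is assumed to mirror the lower one, the product Qa·B'/T' is truncated by integer division, and all function and parameter names are hypothetical.

```python
def clip_q(x: int, q_min: int = 1, q_max: int = 31) -> int:
    """CQ(x): clamp a quantization level to the allowable range (bounds assumed)."""
    return max(q_min, min(q_max, x))


def initial_q(prev_q: int, qa: int, prev_bits: int, prev_target: int,
              prev_was_first_i: bool) -> int:
    """S1: initial quantization level Q for the current frame."""
    if prev_was_first_i or prev_bits == 0:
        return prev_q                       # reuse Q' unchanged
    # Scale the previous average level by how far the previous frame
    # overshot (B'/T' > 1) or undershot (B'/T' < 1) its target.
    return clip_q(qa * prev_bits // prev_target)
```

An overshoot of 20% (B' = 6000, T' = 5000) raises an average level of 12 to 14, while a 20% undershoot lowers it to 9.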

[0072] FIG. 8 (S1) also computes the predicted value J of the average bit amount per non-zero quantized frequency component, which is needed later. The calculation uses J', the total number of bits spent on frequency components in the previous frame divided by the number of non-zero quantized frequency components (abbreviated in FIG. 7 as "average bit amount per non-zero quantized frequency component of the previous frame"). J and J' are combined in the ratio t : (1 − t), and the result becomes the new value of J. Here t is a constant taking a real value from 0 to 1 inclusive. In FIG. 8 (S2), if the target bit amount T of the current frame is positive, the current frame is actually encoded; otherwise processing goes directly to FIG. 8 (S6).
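The update of J is a weighted average of the old prediction and the previous frame's measurement, J ← t·J + (1 − t)·J'. The Python sketch below keeps to integer arithmetic by passing t as a fraction t_num/t_den; that representation, the handling of a frame with no non-zero components, and the function name are assumptions.

```python
def update_j(j: int, prev_coeff_bits: int, prev_nonzero_count: int,
             t_num: int, t_den: int) -> int:
    """New J = t * J + (1 - t) * J', with t = t_num / t_den in [0, 1]."""
    if prev_nonzero_count == 0:
        return j                      # nothing measured; keep the old prediction
    j_prev = prev_coeff_bits // prev_nonzero_count   # J' of the previous frame
    return (t_num * j + (t_den - t_num) * j_prev) // t_den
```

With t = 1/2, an old prediction of 8 bits and a measured 12 bits per non-zero component give a new prediction of 10 bits.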

[0073] FIG. 8 (S3) is the first half of the coding and compression of the moving image information for each macroblock i (i = 1, 2, ..., N): the discrete cosine transform is applied to the difference image described above (see FIG. 4), and the resulting frequency components are quantized using the quantization level q. In FIG. 8 (S4), the quantization level q is updated to an appropriate value; the details of this step are shown in FIG. 10 and described later. FIG. 8 (S5) is the second half of the coding and compression for each macroblock i: variable-length coding is performed and the value of B is updated to include the code of macroblock i. Inverse quantization is then performed to generate the discrete-cosine restored image. In FIG. 8 (S6), after all macroblocks have been processed and the code of the current frame is complete, Q', B', and T' are replaced with the current frame's values in preparation for the next frame. All of the above processing is implemented with integer arithmetic.

[0074] FIG. 10 is a flowchart of the update calculation for the quantization level q in each macroblock i; it details the processing of FIG. 8 (S4), although this takes the steps out of order. In FIG. 10 (S7), it is first determined whether macroblock i is coded. A macroblock is not coded when it is not an intra macroblock and its quantized frequency components and motion vector components are all zero. If the macroblock is not coded, the quantization level q is not updated.

[0075] In FIG. 10 (S8), when macroblock i is coded, four variables d, h, a, and e that serve as indices for updating the quantization level q are computed first. d is the predicted bit amount of the current macroblock i: the number z of non-zero quantized frequency components in the current macroblock multiplied by the predicted average bit amount J per non-zero quantized frequency component computed in (S1), plus the expected bit amount V for everything other than frequency components. A value determined experimentally in advance is used for V. h is the predicted bit amount that the remaining (unprocessed) macroblocks will consume, that is, the predicted remaining bits. a is the number of bits that can still be consumed from macroblock i onward, that is, the allowable remaining bits. e is the target bit amount for the remaining (unprocessed) macroblocks under the assumption that every macroblock generates the same number of code bits, that is, the target remaining bits. When a is much larger than h, the quantization level q is raised to increase the amount of information generated; conversely, when a is much smaller than h, q is lowered to suppress the amount of information generated.

[0076] In FIG. 10 (S9), the parameters b1 and b2 are obtained. b1 is a bias that, when the current quantization level q is larger than the initial quantization level Q applied to the first macroblock, makes it harder for q to grow further; b2 is a bias that makes it harder for q to fall below Q. In FIG. 10 (S10), the parameters c1 and c2 are obtained.
c1 = f + g·b1
c2 = f + g·b2
The constant f used here is a parameter that adjusts the sensitivity of the rate control and is normally 1 or greater. The constant g is a parameter that adjusts the strength of the biases b1 and b2 and is normally 0 or greater. Step (S10) therefore obtains c1 by multiplying b1 by the constant g (0 or greater) and adding the constant f (1 or greater), and obtains c2 from b2 in the same way.

【００７７】[0077]

In FIG. 10 (S11), the actual quantization level q is updated. The following conditions 1 to 4 are considered here.

Condition 1: q < Q and a < h. If true, q' is set to q plus q1. If false, condition 2 is evaluated.
Condition 2: e > a · c1. If true, q' is set to q plus q1. If false, condition 3 is evaluated.
Condition 3: e · c2 < a. If true, condition 4 is evaluated. If false, q' is set to q.
Condition 4: W + B < U. If true, q' is set to q plus q2. If false, q' is set to q.

Here 0 < q1 ≤ qmax and qmin ≤ q2 < 0 are assumed to hold. W is the amount of bits remaining in the communication buffer shown in FIG. 5(c), and U is the amount of bits transmitted during the encoding time of the current frame shown in FIG. 5(c).

【００７８】[0078]

Finally, the value of q' calculated by evaluating the four conditions is clipped by the function CQ, and the clipped value becomes the updated value of q. In H.263 and H.263 Plus, the maximum allowable quantization level Qmax is set to 31 and the minimum Qmin to 1; the maximum allowable change qmax in quantization level between two consecutive macroblocks is set to +2 and the minimum allowable change qmin to -2. All of the above processing is implemented in integer or fixed-point arithmetic. In this way, the very complicated calculation that was conventionally considered indispensable for equalizing the bit length of each image frame with high accuracy is replaced by a simple calculation, which reduces the delay spent on that calculation and makes lip synchronization achievable.
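
The update rule of steps S10 and S11, followed by the clipping function CQ, can be sketched as below. This is a minimal illustration rather than the patented implementation: the function names `update_q` and `clip_q` are hypothetical, the default values f = 1, g = 0, q1 = 2 and q2 = -2 are merely chosen to satisfy the stated ranges, and b1, b2 and B are taken as inputs computed by the earlier steps.

```python
def clip_q(q, q_min=1, q_max=31):
    """Function CQ: clip a quantization level to the range [Qmin, Qmax]."""
    return max(q_min, min(q_max, q))


def update_q(q, Q, a, h, e, W, B, U, b1, b2, f=1, g=0, q1=2, q2=-2):
    """Return the updated quantization level after steps S10 and S11."""
    # S10: parameters built from the biases b1 and b2.
    c1 = f + g * b1
    c2 = f + g * b2

    # S11: conditions 1 to 4, evaluated in the order given in the text.
    if q < Q and a < h:          # condition 1
        q_new = q + q1
    elif e > a * c1:             # condition 2
        q_new = q + q1
    elif e * c2 < a:             # condition 3 holds, so test condition 4
        q_new = q + q2 if W + B < U else q
    else:                        # condition 3 fails
        q_new = q

    # Final step: clip q' with CQ to obtain the new q.
    return clip_q(q_new)
```

Only additions, multiplications and comparisons appear, so the sketch stays within integer arithmetic as long as its inputs are integers.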

【００７９】[0079]

As one embodiment, the storage memory and the system for low-bit-rate video coding are described with reference to the drawings. Both are based on the architecture of the System Memory Sharing Processor Array, that is, a systematized memory-sharing processor array (hereinafter also called system MSPA), which constitutes the high-performance code compression system for moving image information.

【００８０】[0080]

FIG. 11 is a block diagram of the encoder, FIG. 12 of the decoder, and FIG. 13 of the integrated encoder/decoder. In FIG. 13, reference numeral 5 denotes a television camera, which inputs a video signal of a moving image consisting mainly of a person's face to the data bus 2 via the camera interface 6. The host computer 8 is connected to the data bus 2 via the host interface 7, and data is exchanged with the Address Generation Unit, that is, the centralized control unit (hereinafter AGU) 3, and with the DRAM @ 13.5 MHz, 256 × 16 bit memory (hereinafter simply "memory") 1. Two memories 1 appear in FIG. 13 because it is a block diagram of the integrated encoder/decoder, with one memory 1 provided for the encoder and one for the decoder. In FIG. 13, the AGU 3 can also be run for testing in cooperation with the memory 1, which makes operation tests at the design and prototyping stages easier to carry out. Apart from the memory 1, the AGU 3 is also connected to the ROM 9, which stores the program for basic operation, and it sends control signals via the control bus 4 to each of the hardware modules (hereinafter also called "modules", "the elements", or "elements", and shown in the figures by element name) so that the high-performance code compression system for moving images functions.

【００８１】[0081]

A D (dynamic) RAM is used as the memory 1 in this embodiment with satisfactory results, and the figures are labeled accordingly. The present invention, however, is not limited to DRAM: it can be implemented with any memory means to which encoded moving image information can be written as needed and from which it can be read back immediately, and such embodiments are naturally regarded as falling within the gist of the present invention.

【００８２】[0082]

The elements share the image information of the moving image via the data bus 2. Data is transferred between elements only through the memory 1, which is connected to the control bus 4 outside the chip, and data transfer between each element and the memory 1 is controlled by the AGU 3 and the ROM 9. This is the most important feature of the system MSPA: because every transfer between elements goes through memory, the processing dependencies among the elements can be determined solely by the time-division timetable that the AGU 3 follows when accessing the memory 1.

【００８３】[0083]

The elements are those shown in FIGS. 11 to 13. Each is designed independently, without regard to the processing dependencies among the elements, and the elements are connected in parallel via the data bus 2 and the control bus 4 to form the whole. Because a different designer can work on each element at the same time, the design time is short even though the program structure of the system as a whole is large. If the design of an element changes, the change can be accommodated flexibly by modifying the program of the AGU 3. The motion vector search unit 10 includes a prediction determination unit (not shown); it calculates the average amount of movement of the image in the face window 21 and moves the face window 21 to follow that amount. Then, so that the delay-free, high-quality region moves with the swaying of the photographed person's face, the near-future motion is predicted from the average amount of movement and the face window 21 is moved ahead of the face.
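
The window tracking described above can be sketched as follows. The text does not say how the near-future motion is predicted, so this sketch simply assumes that the average motion continues unchanged for `lookahead` further frames. The function names, the `(x, y, width, height)` window tuple and the `(dx, dy)` vector format are all hypothetical.

```python
def average_motion(vectors):
    """Mean of the (dx, dy) motion vectors found inside the face window."""
    n = len(vectors)
    avg_dx = sum(dx for dx, _ in vectors) / n
    avg_dy = sum(dy for _, dy in vectors) / n
    return avg_dx, avg_dy


def move_window(window, vectors, lookahead=1):
    """Shift the window by the average motion plus an extrapolated lead.

    Assumption: the same average motion persists for `lookahead` more
    frames, so the window is moved (1 + lookahead) times the average.
    """
    x, y, w, h = window
    avg_dx, avg_dy = average_motion(vectors)
    factor = 1 + lookahead
    return (round(x + avg_dx * factor), round(y + avg_dy * factor), w, h)
```

With `lookahead=0` the window merely follows the face; a positive value places it ahead of the face in the direction of travel.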

【００８４】[0084]

The window MSPA is mainly described below. The encoder and the decoder are built on a system architecture in which the memory 1, which stores image frame information, and hardware modules that each operate independently are coupled via the data bus 2, and the AGU 3, which controls the data flow and the operation schedule between the memory 1 and the hardware modules, is coupled to each hardware module via the control bus 4.

【００８５】[0085]

The videophone of this embodiment uses the integrated encoder/decoder shown in the block diagram of FIG. 13, that is, a configuration in which the devices for both transmission and reception are combined into one unit. The encoder and the decoder are also shown separately as stand-alone configurations in FIG. 11 and FIG. 12 respectively, and description of the parts they share is omitted where appropriate. The encoder system shown in FIG. 11 has the external memory 1, which has an 18-bit address and 16-bit data, a capacity of 4 megabits, and an access time of 40 nanoseconds, and which can hold four frames of QCIF (176 × 144 pixels) data. In FIG. 11, data from the television camera 5 is accumulated in the memory 1 while being compressed sequentially, one macroblock of 16 × 16 pixels at a time.
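
The capacity figures can be checked with a short calculation. The sketch assumes 8-bit samples and 4:2:0 chroma subsampling (two chroma planes at half resolution in each direction); the variable names are illustrative only.

```python
ADDRESS_BITS = 18
DATA_BITS = 16

# 2**18 addressable words of 16 bits each.
memory_bits = (1 << ADDRESS_BITS) * DATA_BITS

# One QCIF frame: a full-resolution luminance plane plus two
# quarter-size chrominance planes (assumed 4:2:0), 8 bits per sample.
WIDTH, HEIGHT = 176, 144
luma_samples = WIDTH * HEIGHT
chroma_samples = 2 * (WIDTH // 2) * (HEIGHT // 2)
frame_bits = (luma_samples + chroma_samples) * 8

four_frames_bits = 4 * frame_bits
fits = four_frames_bits <= memory_bits
```

Under these assumptions the memory holds 4,194,304 bits (4 megabits) and four QCIF frames occupy 1,216,512 bits, so four frames fit with room to spare.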

【００８６】[0086]

First, the motion vector search unit 10 searches for the position in the previous frame from which the macroblock being processed has moved, and outputs the result as a motion vector. At this time, for macroblocks that belong to neither the face window 21 nor the peripheral motion window 51, the image information is degraded by a temporal filter (not shown). When image information of a still picture is input to the temporal filter, it is output unchanged; conversely, when image information of rapidly moving video is input, it is processed so that the motion appears smoothed. Image information of vigorous motion, which consumes a large share of the transmitted information, thus has its quality moderately degraded by passing through the temporal filter, and in exchange its information amount is cut sharply. When an image that has gone through this reduction is decoded and reproduced, the viewer gets only an impression of slightly degraded image quality, and only in scenes that change quickly. Concretely, when this videophone is used to transmit from a moving car, the scenery streaming past the car window behind the speaker is slightly blurred.
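
The behavior described for the temporal filter (still input passes unchanged, fast motion is smoothed) is what a first-order recursive filter along the time axis produces. The filter type is not given in the text, so the following is only one possible sketch; the function name, the blend weight `alpha` and the flat pixel-list representation are assumptions.

```python
def temporal_filter(prev_out, cur_in, alpha=0.5):
    """Blend each input pixel with the previous filtered output.

    prev_out: filtered pixels of the previous frame (flat list)
    cur_in:   pixels of the current input frame (flat list)
    alpha:    weight of the current frame (assumed value)
    """
    return [round(alpha * c + (1 - alpha) * p)
            for p, c in zip(prev_out, cur_in)]
```

A still picture satisfies `cur_in == prev_out`, so the output equals the input; a pixel that jumps from 0 to 100 is output as 50 with the assumed weight, which is the smoothing of rapid change that the paragraph describes.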

【００８７】[0087]

The motion compensation unit 11 uses the obtained motion vector to generate difference data between the macroblock of the current frame being processed and the region of the previous frame from which it is thought to have moved, and writes the data to the memory 1. The discrete cosine (inverse) transform unit / (inverse) quantization unit 12 reads the difference data from the memory 1 and, for each 8 × 8 pixel block (one quarter of a macroblock), obtains the 8 × 8 frequency components by a discrete cosine transform. It then performs, at high speed and for each macroblock, a quantization operation that changes the number of representation bits according to the quantization step, and outputs the result to the memory 1.

【００８８】[0088]

The variable length encoder 13 assigns appropriate codes to the quantized, bit-reduced frequency components of the difference data read from the memory 1 and stores them in an internal buffer (not shown). The internal buffer outputs the encoded data to the outside at a constant transmission rate. The discrete cosine (inverse) transform unit / (inverse) quantization unit 12 divides one image of, for example, 352 × 288 pixels into pixel blocks of 8 × 8 pixels, decomposes each block into frequency components (an orthogonal transform) by the DCT (Discrete Cosine Transform; hereinafter also called the DCT transform), and compresses the information by rounding off the high-frequency terms: each coefficient after the DCT is divided by a divisor and the remainder is rounded off. These operations run in the forward direction during encoding and in the reverse direction during decoding. Because the decoder of FIG. 12 has no forward transform, it is provided with the discrete cosine inverse transform unit / inverse quantization unit 12b.
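
The forward path of unit 12 (an 8 × 8 DCT followed by division by a divisor) can be sketched in plain Python as below. This is a direct, unoptimized textbook DCT-II, not the high-speed hardware of the embodiment. The orthonormal scaling, the use of a single divisor for all coefficients and rounding to the nearest integer are assumptions of the sketch.

```python
import math

N = 8  # block size in pixels


def dct_2d(block):
    """Two-dimensional DCT-II of an 8x8 block with orthonormal scaling."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * s
    return out


def quantize(coeffs, divisor):
    """Divide every coefficient by the divisor and round to an integer."""
    return [[round(c / divisor) for c in row] for row in coeffs]
```

A flat block has no spatial variation, so after quantization only the top-left (DC) level is non-zero; small high-frequency coefficients likewise collapse to zero, which is where the bit reduction comes from.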

【００８９】[0089]

The discrete cosine (inverse) transform unit / (inverse) quantization unit 12, the P macroblock reconstruction unit 14, the B macroblock estimation unit 15a, and the block distortion removal filter 16 operate in the process of reconstructing the current macroblock from the quantized frequency components, and the reconstructed macroblock is used for compressing the next frame. As the inverse of the discrete cosine transform and quantization, the discrete cosine inverse transform unit / inverse quantization unit 12 reads the quantized frequency components of the difference data from the memory 1, restores the original bit width by inverse quantization, restores the original difference data by an inverse discrete cosine transform, and stores the result in the memory 1.

【００９０】[0090]

The P macroblock reconstruction unit 14 is common to FIGS. 11, 12, and 13. It reads from the memory 1 the difference data and the previous-frame data located by the motion vector, adds them to restore the macroblock data of the current frame, and writes the result to the memory 1.
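
The difference generation of the motion compensation unit 11 and the addition performed by the P macroblock reconstruction unit 14 are inverse operations, which the following sketch makes explicit. It works on integer-pixel motion vectors only and does no bounds checking. The function names and the convention that the vector `(dy, dx)` points from the current block position to its reference region in the previous frame are assumptions.

```python
def fetch_region(frame, top, left, size):
    """Cut a size x size region out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def residual(cur_block, prev_frame, top, left, mv):
    """Difference between the current block and its motion-shifted reference."""
    dy, dx = mv
    ref = fetch_region(prev_frame, top + dy, left + dx, len(cur_block))
    return [[c - r for c, r in zip(c_row, r_row)]
            for c_row, r_row in zip(cur_block, ref)]


def reconstruct(diff, prev_frame, top, left, mv):
    """Add the difference data back onto the same reference region."""
    dy, dx = mv
    ref = fetch_region(prev_frame, top + dy, left + dx, len(diff))
    return [[max(0, min(255, d + r)) for d, r in zip(d_row, r_row)]
            for d_row, r_row in zip(diff, ref)]
```

When the difference data passes through unchanged, `reconstruct` returns exactly the block that `residual` started from; in the actual codec the difference is quantized in between, so the reconstruction is only approximate.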

【００９１】[0091]

FIG. 13 includes the B macroblock estimation unit / reconstruction unit 15, which process the P frame and the B frame respectively; the encoder shown in FIG. 11 is provided with the B macroblock estimation unit 15a, and the decoder shown in FIG. 12 with the B macroblock reconstruction unit 15b. In the encoder, the B macroblock estimation unit 15a reads from the memory 1 the reconstructed P macroblock and the previous-frame macroblock from which it moved, constructs the B macroblock data estimated forward from the previous frame and the B macroblock estimated by mixing the two, and compares each for similarity with the B macroblock actually input from the television camera 5. It decides for each macroblock which of these is best, and only that decision is transmitted via the variable length encoder 13. In the decoder, the B macroblock reconstruction unit 15b actually reproduces the B frame data.
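
The per-macroblock decision made by the B macroblock estimation unit 15a can be sketched as a choice between candidate predictions. The text does not name the similarity measure or the mixing rule, so the sketch assumes the sum of absolute differences and a rounded average of the two sources; the mode names and function names are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two blocks (lists of rows)."""
    return sum(abs(x - y)
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


def choose_b_mode(actual_b, prev_block, p_block):
    """Pick the candidate prediction closest to the actual B macroblock."""
    forward = prev_block
    mixed = [[(x + y + 1) // 2 for x, y in zip(row_prev, row_p)]
             for row_prev, row_p in zip(prev_block, p_block)]
    candidates = {"forward": forward, "mixed": mixed}
    # Only the winning mode name would need to be transmitted.
    return min(candidates, key=lambda name: sad(actual_b, candidates[name]))
```

Because the decoder holds the same previous-frame and P macroblock data, sending only the chosen mode is enough for it to rebuild the same prediction.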

【００９２】[0092]

The block distortion removal filter 16 is common to FIGS. 11, 12, and 13. It reads the restored P macroblock from the memory 1, removes the unsightly grid-like noise that appears at the seams between macroblocks, and writes the result to the memory 1.

【００９３】[0093]

The AGU 3 controls the execution of each module and the exchange of data between each module and the memory 1. The AGU 3 operates according to the program stored in the ROM 9, whose instructions include address generation for the memory 1 and access control of the memory 1.

【００９４】[0094]

The host interface 7 feeds instructions from the external host computer 8 into the system in place of the ROM 9 of the AGU 3, and performs execution control of each module and data transfer between the memory 1 and the host computer 8. In the window MSPA, then, the hardware modules (elements), each of which executes its own function at high speed, are all connected to the external memory 1 via the data bus 2. That is, each element receives its input data from the external memory 1, processes it, and writes the result back to the external memory 1. Data is therefore never passed directly between elements; it always goes through the memory 1. The AGU 3 controls the data transfer between each element and the memory 1 according to instructions from the ROM 9. Although two memories 1 are allocated, one for the encoder and one for the decoder, there need not be exactly two as long as the response capacity of two memories is available in total.

【００９５】[0095]

This is the most important feature of the window MSPA. Because all data transfer between elements goes through memory, the processing dependencies among the elements can be determined solely by the AGU 3's timetable of accesses to the memory 1. Since those dependencies are fixed by the AGU 3, independently designed elements are connected in parallel via the data bus 2 and the control bus 4 to form the whole. Each element can thus be designed independently, design constraints are greatly reduced, and several designers can each take an element and design in parallel, so the design time is short even though the system as a whole is large.
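
The idea that elements communicate only through the memory, with their ordering fixed by the controller's timetable, can be illustrated with a toy model. A dictionary stands in for the memory 1 and a list of functions for the AGU timetable; the two elements and their arithmetic are placeholders, not the actual modules.

```python
def motion_compensation(mem):
    """Toy element: reads two regions of memory and writes their difference."""
    mem["diff"] = [c - p for c, p in zip(mem["current"], mem["previous"])]


def quantizer(mem):
    """Toy element: reads the difference region and writes quantized levels."""
    mem["levels"] = [round(d / 4) for d in mem["diff"]]


def run_timetable(timetable, mem):
    """Toy controller: runs the elements in timetable order on shared memory."""
    for element in timetable:
        element(mem)
    return mem


memory = {"current": [10, 20, 30], "previous": [2, 4, 6]}
run_timetable([motion_compensation, quantizer], memory)
```

Neither element calls the other or knows that it exists. The only thing that encodes the dependency between them is the order of the timetable, which is why a design change in one element can be absorbed by editing the timetable alone.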

【００９６】[0096]

The processing dependencies among the elements can be designed flexibly through the program of the AGU 3. In the actual design, the processing of the macroblocks of each P frame and B frame must finish within 15,625 clocks. The motion vector search, however, takes a long time to compute even though its memory 1 access time for data input and output is short: the motion vector search as a whole needs 13,000 clocks. The other processes are each implemented as hardware mechanisms, so all of them together can be completed within 15,625 clocks. The motion vector search and the other processes are therefore run as a pipeline, in parallel, macroblock by macroblock. That is, the 100 macroblocks contained in one frame are compressed one after another, and once the motion vector search for one macroblock has run within its 15,625 clocks, the motion vector search for the next 15,625 clocks begins.

【００９７】[0097]

In most periods, therefore, the motion vector search and the other processes run at the same time. With the architecture of the present invention, this pipelining too can be designed flexibly through the program of the AGU 3.
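
The macroblock pipeline can be sketched as a two-stage schedule with one slot of 15,625 clocks per step. The assumption here is that while the motion vector search works on macroblock k, all other processing works on macroblock k - 1; the function names are hypothetical.

```python
SLOT_CLOCKS = 15_625      # clocks available per macroblock slot
SEARCH_CLOCKS = 13_000    # clocks used by the motion vector search


def pipeline_schedule(num_macroblocks):
    """Return, per slot, (macroblock in search stage, macroblock in other stage)."""
    slots = []
    for k in range(num_macroblocks + 1):
        search = k if k < num_macroblocks else None   # idle in the last slot
        other = k - 1 if k >= 1 else None             # idle in the first slot
        slots.append((search, other))
    return slots


def total_clocks(num_macroblocks):
    """Clocks needed to push all macroblocks through both stages."""
    return (num_macroblocks + 1) * SLOT_CLOCKS
```

Only the first and last slots have an idle stage, which matches the statement that both kinds of processing run simultaneously in most periods. For 100 macroblocks the schedule takes 101 slots.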

【００９８】[0098]

Because all input and output data of every element is stored in the memory 1, testing from outside is simple. In practice, by issuing AGU 3 instructions from the host computer 8 via the host interface 7, the host computer 8 can set data in the memory 1 and, after an element has executed its processing, read the data in the memory 1 back for inspection. In addition, an inspection program that lets the host computer 8 read out the internal data of each element is provided, so the configuration is easy to inspect.

【００９９】[0099]

Since the operations of the elements do not depend on one another, the clock is supplied only to the elements that are operating and the others need not run, which reduces the power consumption of the whole system.

【０１００】[0100]

Next, the parts specific to the decoder are described with reference to FIG. 12. From the frequency-domain difference data decoded by the variable length decoder 17, the B macroblock reconstruction unit 15b of the encoder and the block distortion removal filter 16 actually restore the B frame data. As in the encoder, each independently operating element receives data from the memory 1, processes it, and stores it back in the memory 1, all under the control of the AGU 3. The final reproduced image is displayed, through the LCD interface 18, on the externally connected LCD 19.

【０１０１】[0101]

FIG. 14 is a block diagram of the centralized control unit (AGU 3). The run/test mode selector switch 30 is set to either run mode or test mode by an instruction from the host computer 8 via the host interface 7 (shown as "PC instruction" in the figure). The instruction decode and execution control device 31 directs execution of the contents of the instruction program in the ROM 9, and the memory read/write repeat instruction control unit 32 and the repeat instruction start address register 33 carry out address-controlled data exchange with the memory 1. Together these generate the addresses for the memory 1, the memory access control signals, and the computation start and stop signals for each element. The repeat instruction start address register 33 and the register file 34 consist mainly of general-purpose registers and comprise two sets of eight 4-bit registers, one set of eight 2-bit registers, and one set of eight 1-bit registers (details not shown).

【０１０２】[0102]

FIG. 15 shows the layout of the memory area of the external memory 1. In keeping with the instructions in the ROM 9, which stores the instruction program for execution control of the hardware modules, the address layout is suited to a block method in which the image is divided into a plurality of blocks and information is processed in units of block coordinates. The memory read/write repeat instruction control unit 32 of the AGU 3 performs control so that the row address and column address shown in FIG. 15 are specified by the memory address.

【０１０３】[0103]

The instructions that make address generation for memory access possible in macroblock coordinate units, block units, and pixel units for the external memory 1 comprising this memory area are stored in the ROM 9, which may be placed outside the AGU 3 as shown in FIGS. 11, 12, and 13. The actual hardware layout normally differs from the layout in these block diagrams.

【０１０４】[0104]

As shown in FIG. 15, the 18-bit address of the memory space of the memory 1 is divided into a 9-bit row address, made up of the upper 1 bit of the frame number and 4 bits each for the X and Y coordinates of the macroblock position, and a 10-bit column address, made up of the lower bit of the frame number, 2 bits and 1 bit for the X and Y coordinates of the block position, and 2 bits and 3 bits for the X and Y coordinates of the pixel position. In addition to the four regions (0,0), (0,1), (1,0), and (1,1) that hold the luminance signal Y, blocks for the color difference signals Cr and Cb are assigned to the regions (0,2) and (1,2). Because the data width is 16 bits, one address is assigned to the 8-bit data of two pixels adjacent in the X direction. This is why only 2 bits are allocated to the X coordinate of the pixel position even though a block is 8 pixels wide.
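
A possible way of packing these fields into an 18-bit word address is sketched below. It should be read as an illustration and not as the layout of FIG. 15. The order of the fields inside the row and column parts is assumed, and the lower frame field is assumed to be 1 bit wide, which makes the column part 9 bits here so that row and column together give 18 bits. The function name and the returned byte-lane value are also hypothetical.

```python
def pack_address(frame, mb_x, mb_y, blk_x, blk_y, pix_x, pix_y):
    """Return (18-bit word address, byte lane) for one pixel.

    Assumed field widths: frame 2 bits (split into upper and lower bit),
    macroblock X/Y 4 bits each, block X 2 bits, block Y 1 bit,
    pixel X 3 bits (2 address bits plus 1 lane bit), pixel Y 3 bits.
    """
    limits = ((frame, 4), (mb_x, 16), (mb_y, 16),
              (blk_x, 4), (blk_y, 2), (pix_x, 8), (pix_y, 8))
    for value, limit in limits:
        if not 0 <= value < limit:
            raise ValueError("field out of range")

    # Row part: upper frame bit, macroblock X, macroblock Y (9 bits).
    row = ((frame >> 1) << 8) | (mb_x << 4) | mb_y

    # Two horizontally adjacent 8-bit pixels share one 16-bit word,
    # so only pix_x // 2 enters the address.
    word_x = pix_x >> 1
    col = (((frame & 1) << 8) | (blk_x << 6) | (blk_y << 5)
           | (word_x << 3) | pix_y)

    return (row << 9) | col, pix_x & 1
```

Pixels 0 and 1 of a row map to the same address and differ only in the byte lane, which is the reason the text gives for the X coordinate of the pixel position needing just 2 address bits.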

【０１０５】[0105]

Each instruction is 27 bits long, and the instruction set comprises:

(1) memory access start address instructions
(2) memory read loop instructions
(3) memory write loop instructions
(4) AGU register control instructions
(5) subroutine and conditional branch instructions
(6) special instructions issued by the host computer

The memory access start address instruction (1) is issued before a loop instruction for memory reading or writing and sets the start address for that loop instruction. It can specify an absolute address, a position relative to the macroblock currently being processed, or an offset of one motion vector from the macroblock currently being processed. The memory read/write loop instructions (2) and (3) have loop mechanisms for repetition over a rectangular region at the macroblock level, at the block level, and at the pixel level, so loop control that would normally require elaborate iteration statements is done simply with these instructions. For example, reading or writing a frame, a macroblock, or the pixels of a rectangular region can each be executed with a single memory read/write loop instruction. The AGU register control instructions (4) set, clear, and otherwise manipulate the data of the auxiliary registers used by the AGU 3 to control the execution order of the modules (elements). The subroutine and conditional branch instructions (5) are for program control of the AGU 3. The special instructions (6) issued by the host computer 8 are valid only in test mode, in which instructions are accepted from the host computer 8, as opposed to run mode, in which operation follows the program ROM 9. They include, among others, a step execution instruction that executes a specified number of steps. In test mode, all of the instructions (1) to (3) can be issued from the host computer 8.

【０１０６】従って、テストモードで、ホストコンピュ
ータ８からメモリ１へ画像データを書き込み、実行モー
ドに変換してある要素を動作させ、再びテストモードに
戻して、メモリ１に書き込まれた演算結果をホストコン
ピュータ８に読み出すことができる。このようにして、
前記各要素レベルでの動作検証が可能になる。ホストコ
ンピュータ８発行の特殊命令には、前記各要素の内部状
態を読み出す命令もあり、同様な方法によって、前記各
要素の回路レベルでの動作検証も可能にしている。Accordingly, in test mode, image data is written from the host computer 8 to the memory 1; the system is switched to execution mode and a given element is operated; the system is then returned to test mode and the result written to the memory 1 is read out by the host computer 8. In this way, operation can be verified at the level of each element. The special instructions issued by the host computer 8 also include instructions for reading out the internal state of each element, so that, by the same method, the operation of each element can also be verified at the circuit level.

【０１０７】ここで、「動きベクトル探索モジュール
（要素）」について説明する。動きベクトル探索は、処
理する現フレームのマクロブロック毎に、前のフレーム
におけるマクロブロックの位置の周辺で、一番類似する
１６×１６の画素領域を探索する。実際の探索は、現在
の位置から上下、左右方向に夫々最大で１６画素移動す
る４８×４８の画素範囲で、補間処理を利用して画素半
分の分解能で行っている。この範囲内の任意の１６×１
６画素領域と、現フレームのマクロブロックの画素との
間で、画素毎の差分から、それらの絶対値の総和ＳＡＤ
（ＳｕｍｏｆＡｂｓｏｌｕｔｅＤｉｆｆｅｒｅｎｃ
ｅ）（以下、ＳＡＤと称す）を求める。この操作を全て
の可能性について行い、その中で最小のものも見出し
て、その領域を処理マクロブロックが移動してきた前フ
レームでの位置とする。又、処理マクロブロックの位置
とその位置から動きベクトルを求める。この操作は、４
８×４８の探索領域に１６×１６のウインドウ領域を設
定して、前記ＳＡＤを求めるウインドウ処理を、探索領
域全体に施すというもので、画像処理特有のウインドウ
処理の一つの形と言える。The "motion vector search module (element)" will now be described. For each macroblock of the current frame being processed, the motion vector search looks around the position of that macroblock in the previous frame for the most similar 16 × 16 pixel area. The actual search covers a 48 × 48 pixel range, reached by moving up to 16 pixels up, down, left, and right from the current position, and uses interpolation to achieve half-pixel resolution. For any 16 × 16 pixel area within this range, the sum of absolute differences (hereinafter SAD) is computed from the pixel-by-pixel differences between that area and the pixels of the macroblock of the current frame. This is done for every candidate, the minimum is found, and the corresponding area is taken as the position in the previous frame from which the macroblock being processed has moved. The motion vector is then obtained from the position of the macroblock being processed and that position. This operation sets a 16 × 16 window in the 48 × 48 search area and applies the SAD window computation over the entire search area; it is one form of the window processing characteristic of image processing.
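The exhaustive search described above can be sketched in Python. This is only an illustrative model under stated assumptions, not the circuit of the invention: it tries integer-pixel displacements only (the half-pixel interpolation is not modelled), the names `sad` and `full_search` are hypothetical, and the 48 × 48 search area is assumed to be centred on the macroblock's own position.

```python
import random

def sad(cur_block, search_area, top, left, size=16):
    """Sum of absolute differences between the current macroblock and the
    size x size window of the search area whose top-left corner is (top, left)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur_block[y][x] - search_area[top + y][left + x])
    return total

def full_search(cur_block, search_area, size=16, max_disp=16):
    """Try every integer displacement in [-max_disp, +max_disp] on both axes
    and return (min_sad, dx, dy). Assumes search_area is a square of side
    size + 2 * max_disp centred on the macroblock position (hypothetical layout)."""
    best = None
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            cost = sad(cur_block, search_area, max_disp + dy, max_disp + dx, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

# Synthetic demo: cut the "current" macroblock out of a random 48 x 48 area
# at displacement (dx, dy) = (-5, 3) and let the search look for it.
rng = random.Random(0)
area = [[rng.randrange(256) for _ in range(48)] for _ in range(48)]
block = [row[16 - 5:16 - 5 + 16] for row in area[16 + 3:16 + 3 + 16]]
result = full_search(block, area)
```

With integer displacements from -16 to +16 on each axis, the loop visits 33 × 33 = 1,089 window positions. A real encoder would add the half-pixel refinement and a rule for breaking ties between equal SAD values, neither of which is specified here.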

【０１０８】従来、このウインドウ処理に適する構成法
に、ウインドウＭＳＰＡがあり、本発明ではそのウイン
ドウＭＳＰＡの特徴である高い並列効率を落とすことな
く、メモリから探索データと参照データを逐次的に入力
し、ウインドウ並列処理を行う。このようにして、従来
の動画像圧縮処理のハードウェア実現における処理の高
速化の問題を解決している。The window MSPA is an existing architecture suited to this window processing. In the present invention, search data and reference data are input sequentially from memory and window parallel processing is performed without sacrificing the high parallel efficiency that characterizes the window MSPA. This solves the processing speed problem of conventional hardware implementations of video compression.

【０１０９】次に、図１６は動きベクトル探索回路図で
あり、画像データを記憶しておく外部のメモリ１と、そ
のメモリ１から前記マクロブロックのデータをマクロブ
ロック毎に逐次入力するデータ形式変換用のバッファ３
９がデータバス２に接続され、相互にデータの交換を行
う。そして、そのバッファ３９から内部バス４０を介し
て出力される３ポートのデータは、３２個の並列アレー
状に接続されたプロセッサ要素１０１，１０２乃至１３
２でなるウインドウＭＳＰＡに供給される。Next, FIG. 16 is a circuit diagram of the motion vector search. An external memory 1 that stores image data and a data format conversion buffer 39, which receives macroblock data from the memory 1 sequentially, one macroblock at a time, are connected to the data bus 2 and exchange data with each other. The three-port data output from the buffer 39 via the internal bus 40 is supplied to the window MSPA, which consists of 32 processor elements 101, 102, ..., 132 connected as a parallel array.

【０１１０】そのプロセッサ要素１０１乃至１３２のデ
ータを超高速並列演算する加算器４４により、現フレー
ムのマクロブロックが前フレームのどの位置から移動し
てきたものかを表す動きベクトルを探索する動きベクト
ル探索回路を構成している。即ち、動きベクトル探索回
路は、図１６に示すように、データバス２に接続された
メモリ１、データ形式変換用のバッファ３９、内部バス
４０を介したプロセッサ要素１０１，１０２乃至１３２
及び加算器４４により構成されている。例えば、プロセ
ッサ要素１０１乃至１３２には探索データと参照データ
の２つの画素値を入力して、減算と絶対値操作を行い、
ある地点までのＳＡＤに加算する操作を行っており、こ
の３２個のプロセッサ要素１０１乃至１３２でなるウイ
ンドウＭＳＰＡは、並列に動作して１０８９回のウイン
ドウ処理を行う。１回のウインドウ処理で１６×１６＝
２５６回のＳＡＤ処理が必要であるため、全部の探索に
は、１０８９×２５６＝２７８，７８４回のＳＡＤ操作
が必要である。Together with an adder 44 that operates on the data of the processor elements 101 to 132 in parallel at very high speed, these form a motion vector search circuit, which searches for the motion vector indicating the position in the previous frame from which a macroblock of the current frame has moved. That is, as shown in FIG. 16, the motion vector search circuit consists of the memory 1 connected to the data bus 2, the data format conversion buffer 39, the processor elements 101, 102, ..., 132 reached via the internal bus 40, and the adder 44. Each of the processor elements 101 to 132 receives two pixel values, one of search data and one of reference data, performs a subtraction and an absolute value operation, and adds the result to the SAD accumulated so far. The window MSPA formed by these 32 processor elements operates in parallel and performs 1,089 window operations. Since one window operation requires 16 × 16 = 256 SAD operations, the entire search requires 1,089 × 256 = 278,784 SAD operations.

【０１１１】実際は、３２個のプロセッサ要素（プロセ
ッサアレーとも称す）が１クロックでＳＡＤ操作を並列
に行っているので、速くとも２７８，７８４回／３２個
＝８，７１２クロックが必要である。常に３２個のプロ
セッサアレーが動作することは不可能であるが、このシ
ステムでは１３，０００クロックの時間で動きベクトル
探索を実行することを可能にしている。In practice, the 32 processor elements (also called the processor array) perform SAD operations in parallel in one clock, so at least 278,784 / 32 = 8,712 clocks are required. It is not possible to keep all 32 processor elements busy at all times, but this system completes a motion vector search in 13,000 clocks.
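The operation counts quoted in paragraphs 0110 and 0111 can be reproduced with a few lines of arithmetic. The sketch below is an interpretation rather than part of the disclosure: it reads the 1,089 window operations as 33 × 33 integer displacements (from -16 to +16 on each axis), and it treats the ratio of the ideal clock count to the 13,000 clocks stated in the text as a rough measure of processor utilization. Both readings are assumptions.

```python
SEARCH_RANGE = 16          # maximum displacement in pixels, from the text
BLOCK = 16                 # macroblock side in pixels
PROCESSORS = 32            # processor elements in the window MSPA
REPORTED_CLOCKS = 13_000   # clock count stated for the actual system

positions = (2 * SEARCH_RANGE + 1) ** 2   # 33 * 33 integer displacements (assumed reading)
ops_per_window = BLOCK * BLOCK            # SAD terms per window position
total_ops = positions * ops_per_window    # SAD operations for one macroblock
ideal_clocks = total_ops // PROCESSORS    # lower bound with every processor busy
utilization = ideal_clocks / REPORTED_CLOCKS

print(positions, ops_per_window, total_ops, ideal_clocks, round(utilization, 2))
# prints: 1089 256 278784 8712 0.67
```

Under these assumptions the array is busy for roughly two thirds of the search time, which is consistent with the statement that all 32 processors cannot be kept active at every clock.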

【０１１２】このようにしたことにより、高い並列効率
を落とすことなく、メモリ１から探索データと参照デー
タを逐次的に入力し、「ウインドウ並列処理」を実行す
る手段が確立した。その「ウインドウ並列処理」とは、
画像処理で広く利用される処理で、ある広い画面領域に
おいて、狭いウインドウ２１をくまなく上下左右に動か
して、画像の認識及び情報加工処理することを意味する
が、ある特定の画素に着目すると、ウインドウ２１の中
でいろいろな位置に含まれるので、その画素の所属する
位置毎に１回づつ所定の計算処理を実行するものとすれ
ば、その所定計算処理を前記位置が変わる毎にやり直す
重複の無駄があった。This establishes a means of inputting search data and reference data sequentially from the memory 1 and executing "window parallel processing" without sacrificing high parallel efficiency. "Window parallel processing" is widely used in image processing: a narrow window 21 is moved up, down, left, and right across a wide screen area to perform image recognition and information processing. A given pixel, however, falls at many different positions within the window 21, so if the prescribed calculation is executed once for each position the pixel occupies, the calculation is redone every time the position changes, which is wasteful duplication.

【０１１３】そこで、前記重複の無駄を省くべく、ウイ
ンドウ２１を上下左右に動かして何度も所定計算処理を
やり直す代わりに、複数のプロセッサで並列に同時処理
することにより、重複計算しないようにして並列効率を
高めるアレー処理法がウインドウＭＳＰＡにより確立さ
れている。尚、ここで言う「並列効率」とは、ｎ個のプ
ロセッサで並列に同時処理する時に１個で処理する場合
の１／ｎ倍に処理時間の短縮された場合を理想的な１０
０％として、そうならない度合いを効率としたものであ
る。又、前記重複計算しないようにして「並列効率」を
高めるアレー処理法に関しては、既に公表済みなので説
明を省略する。To eliminate this duplication, instead of moving the window 21 up, down, left, and right and redoing the prescribed calculation many times, the window MSPA establishes an array processing method in which multiple processors operate simultaneously in parallel, avoiding duplicate calculation and raising parallel efficiency. "Parallel efficiency" here takes as the ideal 100% the case where n processors working simultaneously in parallel reduce the processing time to 1/n of that for a single processor, and expresses how far the actual case falls short of that ideal. The array processing method that raises parallel efficiency by avoiding duplicate calculation has already been published, so its description is omitted.

【０１１４】次に、図１６に示した動きベクトル探索回
路図は、前記「ウインドウ並列処理」に用いられていた
「既成のウインドウＭＳＰＡ」を「動きベクトル探索」
に適用したことこそが、従来例の無い新規な発明であ
る。この「動きベクトル探索」とは、１６×１６画素の
マクロブロック単位で、輝度信号Ｙについてのみ行うも
ので、あるマクロブロックが前フレームのどの位置から
移動してきたかを探索する。The motion vector search circuit shown in FIG. 16 applies the existing window MSPA, previously used for "window parallel processing", to "motion vector search"; it is this application that is novel and without precedent. The "motion vector search" is performed in units of 16 × 16 pixel macroblocks on the luminance signal Y only, and searches for the position in the previous frame from which a given macroblock has moved.

【０１１５】次に、図１７は離散コサイン変換器及び量
子化器のブロック図であり、データバス２には、ＤＲＡ
Ｍ１の外、逆量子化器５３の入力端、３者選択式の切り
替えスイッチ５４の３つの入力端の一端、２者選択式の
切り替えスイッチ５９の出力端が接続されている。切り
替えスイッチ５４の残る２つの入力端には、逆量子化器
５３の出力端と、転置操作バッファ５７の出力端が接続
されている。切り替えスイッチ５４の出力端には、デー
タ形式変換器５５の入力端が接続されている。データ形
式変換器５５の出力端には、離散コサイン変換／逆変換
器５６の入力端が接続されている。離散コサイン変換／
逆変換器５６の出力端には、転置操作バッファ５７の入
力端と、量子化器５８の入力端と、切り替えスイッチ５
９の２つの入力端のうちの一端が接続されている。切り
替えスイッチ５９の他方の入力端には量子化器５８の出
力端が接続されている。ここで、離散コサイン変換器及
び量子化器の順動作と逆動作を説明する。メモリ１に格
納されている８×８＝６４画素のブロックデータは、デ
ータバス２を介して逐次的に逆量子化器５３及び離散コ
サイン変換／逆変換器５６に入力される。逆量子化器５
３及び量子化器５８は、１クロック周期で１データを処
理できる程高度にパイプライン化されたモジュールであ
り、高速動作を損なうことのないように設計されてい
る。Next, FIG. 17 is a block diagram of the discrete cosine transformer and quantizer. In addition to the DRAM 1, the data bus 2 is connected to the input of the inverse quantizer 53, to one of the three inputs of the three-way selector switch 54, and to the output of the two-way selector switch 59. The remaining two inputs of the selector switch 54 are connected to the output of the inverse quantizer 53 and the output of the transposition buffer 57. The output of the selector switch 54 is connected to the input of the data format converter 55, whose output is connected to the input of the discrete cosine transform/inverse transform unit 56. The output of the unit 56 is connected to the input of the transposition buffer 57, the input of the quantizer 58, and one of the two inputs of the selector switch 59. The other input of the selector switch 59 is connected to the output of the quantizer 58. The forward and inverse operation of the discrete cosine transformer and quantizer will now be described. Block data of 8 × 8 = 64 pixels stored in the memory 1 is input sequentially via the data bus 2 to the inverse quantizer 53 and the discrete cosine transform/inverse transform unit 56. The inverse quantizer 53 and the quantizer 58 are modules pipelined deeply enough to process one data item per clock cycle, and are designed so as not to impair high-speed operation.

【０１１６】又、データ形式変換器５５は、離散コサイ
ン変換／逆変換器５６のためのビット並列データ形式か
らビット直列データ形式への変換器である。離散コサイ
ン変換／逆変換５６は、通常の二次元処理を２回の一次
元処理に分解し、転置換操作バッファ５７はこの際に必
要となる８×８データの転置換操作を行う。そこで、デ
ータ形式変換器５５の詳細を図１８に示し、動作の説明
をする。先ず入力線６１から１６ビットでなるビット並
列データが１クロック毎に入力され、レジスタ６２及び
レジスタ６３で合計の１６のレジスタに格納されて行
く。レジスタ６２又はレジスタ６３に８つのデータが揃
った時点で、これらのレジスタから同時にビット直列形
式でデータが接続線６４，６５を通して出力され、入力
切替スイッチ６６によってレジスタ６２又はレジスタ６
３の出力データが選択される。The data format converter 55 converts from bit-parallel to bit-serial data format for the discrete cosine transform/inverse transform unit 56. The unit 56 decomposes the usual two-dimensional processing into two passes of one-dimensional processing, and the transposition buffer 57 performs the transposition of the 8 × 8 data that this requires. Details of the data format converter 55 are shown in FIG. 18, and its operation is as follows. First, 16-bit bit-parallel data is input from the input line 61 on every clock and stored in the registers 62 and 63, which hold 16 data items in total. When eight data items have been collected in the register 62 or the register 63, the data is output from those registers simultaneously in bit-serial form through the connection lines 64 and 65, and the input selector switch 66 selects the output data of the register 62 or the register 63.
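The decomposition of a two-dimensional transform into two one-dimensional passes separated by a transposition can be illustrated in software. The sketch below is a floating-point model under stated assumptions, not the bit-serial hardware of FIG. 19: it uses the orthonormal DCT-II, shows the forward transform only, and adds a final transposition to restore the usual row/column orientation, which the text does not specify. The function names are hypothetical.

```python
import math

N = 8  # block side, matching the 8 x 8 blocks in the text

def dct_1d(v):
    """Orthonormal one-dimensional DCT-II of a length-N sequence."""
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        s = sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for n in range(N))
        out.append(scale * s)
    return out

def transpose(block):
    """Swap rows and columns, the job of the transposition buffer."""
    return [list(col) for col in zip(*block)]

def dct_2d(block):
    """Two-dimensional DCT as two one-dimensional passes with a transposition."""
    rows_done = [dct_1d(row) for row in block]           # first pass: along rows
    cols_as_rows = transpose(rows_done)                  # transposition step
    cols_done = [dct_1d(row) for row in cols_as_rows]    # second pass: along columns
    return transpose(cols_done)                          # restore orientation (assumption)

flat = [[128] * N for _ in range(N)]
coeffs = dct_2d(flat)   # a constant block has only a DC coefficient, 8 * 128 = 1024
```

Because the same one-dimensional routine is reused for both passes, only one transform core is needed, which is presumably why the circuit routes the intermediate result back through the transposition buffer.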

【０１１７】ここで、データ形式変換器５５に２組のレ
ジスタ６２，６３を備えている点に着目して、さらに説
明する。２組のレジスタ６２，６３を用いることによ
り、データ形式変換器５５への入力データの受入れ、シ
フト操作を伴う下位桁から上位桁へのデータ群の転送と
いう２つの処理を同時に並列実行させ、データ処理の高
速化を図っている。図１８において、動作タイミングで
８クロックに区切って、動作を説明する。１）最初の８
クロックでは、１６ビットのデータが逐次レジスタ６２
にデータ０乃至７の順で入力される。２）次の８クロッ
クでは、レジスタ６２へのデータ入力ではなく、８区画
のレジスタの最下位だけを集めた８個のデータがレジス
タから出力され、それと同時に各区画のレジスタの夫々
のデータは下位方向へとシフトされる。このようにして
入力された並列データを８つまとめて、下位ビットのデ
ータから上位ビットのデータに逐次出力される。この
間、レジスタ６２へのデータ格納を禁止しているとすれ
ば、次の８クロック分をレジスタ６３が代わりに格納す
る必要がある。このように、データ形式変換器５５へ、
２組のレジスタ６２，６３を入力切替スイッチ６６の作
用で交互に用い、メモリ１からの入力データを連続的に
受入れて、パイプライン並列によるデータ処理の高速
化、即ちデータ転送レートを落とすことなく、逐次的な
データを並列データに変換する２組のデータ形式変換手
段を構成している。The fact that the data format converter 55 has two sets of registers 62 and 63 deserves further explanation. Using two sets of registers allows two processes, accepting input data into the data format converter 55 and transferring the data group from the low-order to the high-order bits with a shift operation, to run simultaneously in parallel, which speeds up data processing. The operation in FIG. 18 is described in segments of eight clocks. 1) In the first eight clocks, 16-bit data is input sequentially to the register 62 in the order data 0 to 7. 2) In the next eight clocks, no data is input to the register 62; instead, eight data bits gathered from the least significant position of each of the eight register sections are output from the register, and at the same time the data in each section is shifted toward the low-order end. In this way, the eight parallel data items that were input are output together, sequentially from the low-order bits to the high-order bits. Since data cannot be stored in the register 62 during this period, the register 63 must store the next eight clocks' worth of input instead. Thus the two sets of registers 62 and 63 are used alternately in the data format converter 55 through the action of the input selector switch 66, input data from the memory 1 is accepted continuously, and data processing is accelerated by pipeline parallelism; that is, the two sets form a data format conversion means that converts sequential data into parallel data without lowering the data transfer rate.
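The bit-parallel to bit-serial conversion and the alternating use of two register sets can be modelled as follows. This is a functional sketch only: it treats each data item as an unsigned 16-bit value, emits one bit per item per step, and does not reproduce the exact clock timing of FIG. 18. The function names and the `bank` index are hypothetical.

```python
def to_bit_serial(words, width=16):
    """Emit, step by step, the least significant bit of every word at once,
    shifting each word toward the low-order end after each step."""
    regs = list(words)
    steps = []
    for _ in range(width):
        steps.append([r & 1 for r in regs])   # LSB of every register section
        regs = [r >> 1 for r in regs]          # shift toward the low-order end
    return steps

def from_bit_serial(steps):
    """Rebuild the original words from the bit-serial steps (for checking)."""
    words = [0] * len(steps[0])
    for bit_pos, bits in enumerate(steps):
        for i, b in enumerate(bits):
            words[i] |= b << bit_pos
    return words

def ping_pong(stream, group=8, width=16):
    """Fill two register banks alternately; each full bank is serialized
    while, in hardware, the other bank would be accepting new input."""
    bank = 0
    pending = []
    out = []
    for word in stream:
        pending.append(word)
        if len(pending) == group:
            out.append((bank, to_bit_serial(pending, width)))
            pending = []
            bank ^= 1   # switch to the other register set
    return out

words = [0x0001, 0x8000, 0x1234, 0xFFFF, 0x0000, 0xABCD, 0x00FF, 0x7F00]
steps = to_bit_serial(words)
```

In software the two banks run one after the other, so the model shows only the alternation; the speed benefit in the circuit comes from the two banks working in the same clock cycles.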

【０１１８】ここで言うデータ転送レートとは、毎秒何
ビットのデータを転送できるか、その速度のことであ
る。又、逐次的なデータでは、ある１本の線を介して１
ビットづつ転送するが、例えば１００万ビット／秒のデ
ータ転送レートを必要とする場合、１０万ビット／秒の
データ転送レートの線を１０本並列に用いれば良いこと
になる。The data transfer rate here means the speed at which data can be transferred, in bits per second. Sequential data is transferred one bit at a time over a single line, but if, for example, a data transfer rate of 1,000,000 bits per second is required, ten lines each with a data transfer rate of 100,000 bits per second can be used in parallel.

【０１１９】又、図１９で離散コサイン変換／逆変換器
のブロック図に示すように、８つのビット順列データが
入力処理部７１に入力し、８つの１６ビット累積加算器
７２を通って、最終的に出力処理部７３に入力する。As shown in the block diagram of the discrete cosine transform/inverse transform unit in FIG. 19, eight bit-serial data streams are input to the input processing unit 71, pass through eight 16-bit accumulators 72, and are finally input to the output processing unit 73.

【０１２０】又、図２０で離散コサイン変換／逆変換器
における入力処理部のブロック図に詳細を示すように８
つのビット直列入力データは、ビット直列加算器８１及
びビット直列減算器８２に入力し、これらの出力は８つ
のビット分散計算用ＲＯＭ８３に入力する。さらに、８
つの入力データは直接別の８つのビット分散計算用ＲＯ
Ｍ８４に入力する。これら二組のＲＯＭ出力は切替信号
８５によって制御される切替スイッチ８６によって選択
される。この切替信号８５としては離散コサイン変換が
実行されている場合は「０」が、離散コサイン逆変換が
実行されている場合は「１」が入力される。As detailed in the block diagram of the input processing unit of the discrete cosine transform/inverse transform unit in FIG. 20, the eight bit-serial inputs are fed to the bit-serial adders 81 and the bit-serial subtractors 82, whose outputs are input to eight ROMs 83 for bit-distributed computation. The eight inputs are also fed directly to another eight ROMs 84 for bit-distributed computation. The outputs of these two sets of ROMs are selected by the selector switch 86, which is controlled by the switching signal 85. The switching signal 85 is "0" when the discrete cosine transform is being executed and "1" when the inverse discrete cosine transform is being executed.

【０１２１】又、図２１で離散コサイン変換／逆変換器
における出力処理部のブロック図に詳細を示すように８
つのビット並列入力データは、ビット並列加算器９１及
びビット並列減算器９２に入力する。これら二組のＲＯ
Ｍ出力は切替信号９３によって制御される切替スイッチ
９４によって選択される。切替信号９３は離散コサイン
変換が実行されている場合は「０」が、離散コサイン逆
変換が実行されている場合は「１」が入力される。切替
スイッチ９４の出力は８つのレジスタ９５に入力し、こ
れらのレジスタから逐次的に出力線９６を通して出力さ
れる。As detailed in the block diagram of the output processing unit of the discrete cosine transform/inverse transform unit in FIG. 21, the eight bit-parallel inputs are fed to the bit-parallel adders 91 and the bit-parallel subtractors 92. These two sets of ROM outputs are selected by the selector switch 94, which is controlled by the switching signal 93. The switching signal 93 is "0" when the discrete cosine transform is being executed and "1" when the inverse discrete cosine transform is being executed. The output of the selector switch 94 is input to eight registers 95 and is output sequentially from these registers through the output line 96.

【０１２２】ここで、テレビ電話に付属するテレビカメ
ラで撮影された送話者の顔を主体とする映像信号を、電
話回線により受話者へ伝送する前に以下の信号処理を説
明する。尚、双方向通信のため、送信と受信の往復分の
情報伝送が同一経路で同時になされるが、ここでその双
方向通信に関する説明は省略する。従来からあるＩＴＵ
国際規格Ｈ．２６３準拠の低ビットレート動画像伝送の
信号処理回路がベースになっている。そのベースに追加
する機能として、映像信号から顔の部分を抽出して顔の
ウインドウ２１を構成し、顔の動きに追随して顔のウイ
ンドウ２１を移動させる機構と、顔以外の背景部分の動
きを抑制する時間フィルタ機能と、伝送遅延の小さな新
たな伝送レート制御のための機構があり、これらにより
動画の画像情報に対して適切な圧縮の信号処理を施す。The signal processing applied to the video signal, which consists mainly of the speaker's face captured by the television camera attached to the videophone, before it is transmitted to the receiver over the telephone line will now be described. Because communication is bidirectional, transmission and reception take place simultaneously over the same path, but the bidirectional aspect is not described here. The system is based on a conventional signal processing circuit for low-bit-rate video transmission conforming to the ITU international standard H.263. The functions added to this base are a mechanism that extracts the face from the video signal to form the face window 21 and moves the face window 21 to follow the movement of the face, a temporal filter function that suppresses motion in the background outside the face, and a new transmission rate control mechanism with small transmission delay. Together these apply appropriate compression to the video image information.

【０１２３】前記ベースに追加する機能として、まずテ
レビカメラ等により入力された映像信号を認識処理し、
その動画像の画面上で移動自在の特定領域でなる顔のウ
インドウ２１を構成する画素が顔の動きに伴って生じる
動きベクトルの平均値を算出する演算プログラムがあ
る。話し手を聴き手がその肉眼で視野に捕らえている場
合はその話し手の顔の動きに聴き手が視線を追随させる
事により、話し手の顔を常に聴き手の視野の中心に位置
付ける。この行為は人の肉眼における視野の中心付近が
最も解像力に優れているので、興味ある対象物を鮮明に
見るために視野の中心に位置付けて、価値ある有効な情
報を漏らさず最大限に収集しようとする、本能的かつ無
意識又は必要を感じて意識した行動である。このように
話し手の顔を、聴き手の目が解像力に優れている顔のウ
インドウ２１の中心に（画面の中心ではない）位置付け
ようとする行動を、それに該当する第１の知能を備えた
電子機械装置に置き換えるために、その人の顔を含む顔
のウインドウ２１の動きベクトルの平均値を算出する演
算プログラム、前記平均値に対応して顔のウインドウ２
１を移動させるウインドウ位置制御プログラムが機能す
る。The first function added to the base is a calculation program that performs recognition processing on the video signal input from a television camera or the like and computes the average of the motion vectors that arise, as the face moves, in the pixels making up the face window 21, a specific region that can move freely on the screen. When a listener has a speaker in view with the naked eye, the listener's gaze follows the movement of the speaker's face, keeping the face at the center of the listener's field of view. Because the resolution of the human eye is highest near the center of the field of view, placing an object of interest there to see it clearly and gather as much useful information as possible is an instinctive behavior, whether unconscious or deliberate. To replace this behavior of placing the speaker's face at the center of the face window 21 (not the center of the screen), where the listener's eye resolves best, with an electronic apparatus having a corresponding first intelligence, two programs are used: the calculation program that computes the average of the motion vectors of the face window 21 containing the person's face, and a window position control program that moves the face window 21 in accordance with that average.
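The two programs described in paragraph 0123 can be illustrated together in Python. This is a sketch under assumptions that the text does not fix: motion vectors are taken as (dx, dy) displacements from the previous frame to the current one, the window is moved by the rounded mean, and the window is clamped to the frame boundary. The function names and the QCIF frame size in the example are hypothetical.

```python
def mean_motion_vector(vectors):
    """Average a list of (dx, dy) motion vectors; (0.0, 0.0) if the list is empty."""
    if not vectors:
        return (0.0, 0.0)
    n = len(vectors)
    return (sum(dx for dx, _ in vectors) / n,
            sum(dy for _, dy in vectors) / n)

def move_window(window, vectors, frame_w, frame_h):
    """Shift window = (x, y, w, h) by the mean motion vector of the macroblocks
    inside it, keeping it within a frame_w x frame_h frame (clamping is an assumption)."""
    x, y, w, h = window
    dx, dy = mean_motion_vector(vectors)
    new_x = min(max(x + round(dx), 0), frame_w - w)
    new_y = min(max(y + round(dy), 0), frame_h - h)
    return (new_x, new_y, w, h)

# Example on a 176 x 144 frame (an assumed size): the face drifts right and down.
window = move_window((40, 30, 64, 64), [(2, 0), (4, 2), (3, 1)], 176, 144)
```

Averaging over the whole window makes the window position insensitive to local movements such as the lips, so the window follows the head rather than individual facial features.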

【０１２４】又、第２の知能として顔のウインドウ２１
以外の背景画像２２の動きが激しく動画像情報の多い場
合には、その背景画像２２の動き量を相当に低減して画
質を落とす演算プログラムでなる符号化アルゴリズムが
ある。この演算プログラムは順次継続する前後のフレー
ムで同一位置の各画素を示す情報をそれぞれ足して２で
割った値が後のフレームの画像情報に置き換えることに
より、符号化された動画像情報量を抑制している。情報
量を抑制する事により、伝送容量に制約のある伝送経路
での画像伝送遅延を最小化する出力伝送レートの制御機
構を持つ顔強調型Ｈ．２６３プラス符号化アルゴリズム
と、実行順序の決定及びデータの授受のタイミングを変
更可能なプログラムの機能により、見た目に歴然と顔を
鮮明でかつ自然な動きにし、そうでありながらも背景画
像２２の画質劣化は極端なものでなく、僅かにぼやける
程度に済ませられる。As the second intelligence, there is an encoding algorithm, implemented as a calculation program, that substantially reduces the amount of motion in the background image 22 outside the face window 21, lowering its quality, when that background moves vigorously and carries a large amount of video information. For each pixel position, this program adds the values of the pixel in two consecutive frames, divides the sum by 2, and substitutes the result for the image information of the later frame, thereby suppressing the amount of encoded video information. With the amount of information suppressed, a face-emphasizing H.263 Plus encoding algorithm, which has an output transmission rate control mechanism that minimizes image transmission delay over a transmission path of limited capacity, together with a program function that can change the execution order and the timing of data exchange, makes the face visibly sharp and natural in motion, while the degradation of the background image 22 is not extreme and amounts only to slight blurring.
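The two-frame averaging described in paragraph 0124 can be sketched as a simple temporal filter. The sketch assumes frames stored as lists of rows of integer luminance values, a face window given as (x, y, w, h), and integer division when halving; none of these representation details comes from the text, and the function name is hypothetical.

```python
def filter_background(prev, curr, window):
    """Return a copy of curr in which every pixel outside the face window is
    replaced by the average of the co-located pixels in prev and curr."""
    x0, y0, w, h = window
    out = []
    for y, (prev_row, curr_row) in enumerate(zip(prev, curr)):
        row = []
        for x, (p, c) in enumerate(zip(prev_row, curr_row)):
            inside = x0 <= x < x0 + w and y0 <= y < y0 + h
            row.append(c if inside else (p + c) // 2)   # average only the background
        out.append(row)
    return out

# Tiny 4 x 4 example with a 2 x 2 face window in the middle.
prev = [[100] * 4 for _ in range(4)]
curr = [[50] * 4 for _ in range(4)]
filtered = filter_background(prev, curr, (1, 1, 2, 2))
```

Averaging halves the frame-to-frame change of every background pixel, so the prediction residual that the encoder has to code shrinks, at the cost of a slight blur on moving background objects.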

【０１２５】顔のウインドウ２１は送話者の顔の輪郭を
略中心に補足し、顔のウインドウ２１以外の背景画像２
２の情報量を粗にし、顔のウインドウ２１内の情報量を
蜜に維持し、これら映像信号の持つ画像情報量の総和を
極小にする。これにより、電話回線の制約条件にも当て
はまる情報量となる。又、送話者の喋りに伴って揺動す
る顔の動きに追随して顔のウインドウ２１を移動させる
顔のウインドウ位置制御プログラムにより、ウインドウ
２１が最新位置に更新され続ける。従って、常に顔だけ
は家庭用白黒テレビで視聴している人物の顔に近い鮮明
でかつ自然な動きを伴った映像であり、なおかつ背景の
無駄な情報に関しては極限まで圧縮できるので伝送効率
が高く維持できる。The face window 21 captures the outline of the speaker's face approximately at its center. The amount of information in the background image 22 outside the face window 21 is made coarse, the amount of information inside the face window 21 is kept dense, and the total image information carried by the video signal is minimized. The amount of information thus fits within the constraints of the telephone line. The face window position control program, which moves the face window 21 to follow the movement of the face as the speaker talks, keeps the window 21 updated to the latest position. As a result, the face alone is always shown with a sharpness and naturalness of movement close to that of a person's face viewed on a home black-and-white television, while useless background information is compressed to the limit, so transmission efficiency stays high.

【０１２６】次に、テレビ電話では送信できる情報量が
決まっているので、視聴者に影響の無い動画像情報は徹
底的に圧縮する必要がある。そして、部分的には無視で
きない情報でも、それが全体に及ぼす影響が少ない場合
は、あえて削除した方が、他の重要情報を増加でき、結
果的には全体の性能を向上できる。Since the amount of information a videophone can transmit is fixed, video information that has no effect on the viewer must be compressed thoroughly. Even information that cannot be ignored locally is better deleted deliberately if its effect on the whole is small, since other, more important information can then be increased and overall performance improves as a result.

【０１２７】又、Ｂフレーム情報における差分全画素情
報のゼロ化や、イントラマクロブロック以外の輝度信号
のデータの切り下げによる量子化も、動画像情報の大幅
な情報量の削減になる。これらの操作はそれ自体は画像
の劣化の原因になるが、それよりは削減された情報量の
分だけ他のデータの情報量を増加できるため、結果的に
は画質の向上になっている。Zeroing all the pixel difference information in B-frame information, and quantizing by rounding down the luminance signal data of macroblocks other than intra macroblocks, also greatly reduce the amount of video information. These operations in themselves degrade the image, but because the information allotted to other data can be increased by the amount saved, the net result is an improvement in image quality.

【０１２８】次に、唇の動きと音声を一致させるリップ
同期の方法であるが、新しく開発した「レート制御方
式」により、１フレーム即ち動画の構成単位画像（テレ
ビでは毎秒２５枚若しくは３０枚）当りの符号量を平準
化し、その所定の符号量例えば１１２バイトと、単位時
間当たりの伝達可能情報量例えば伝送レート２７Ｋｂｓ
から予測できる通信遅延時間を「レート制御機構」に折
り込み済であり、例えば常に１０フレーム分の通信遅延
時間が発生すると予測されていれば、その分を過去から
現在までの変化量から未来を予測して情報加工する。多
少の予想外れが発生しても実用上何ら問題ない程度の範
囲内で、未来を予測しているので、本発明の一実施形態
での加工済情報では、遅延無く受信される音声に対して
常に３フレーム以内の画像の遅延に抑えることに成功し
た。Next, the lip synchronization method, which matches lip movement to the audio, is described. The newly developed "rate control method" levels the code amount per frame, that is, per constituent image of the video (25 or 30 per second in television). The communication delay that can be predicted from this fixed code amount, for example 112 bytes, and from the amount of information that can be transmitted per unit time, for example a transmission rate of 27 Kbs, is built into the "rate control mechanism". If, for example, a communication delay of 10 frames is always expected, the information is processed by predicting that far into the future from the changes observed from the past to the present. Because the prediction is made within a range where some misprediction causes no practical problem, the processed information in one embodiment of the present invention succeeded in keeping the image delay to within three frames at all times relative to audio received without delay.
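The example figures in paragraph 0128 can be related to one another with simple arithmetic. The sketch below is an interpretation, not part of the disclosure: it reads "27 Kbs" as 27,000 bits per second and assumes the whole rate is available for video, neither of which the text states.

```python
BYTES_PER_FRAME = 112       # example code amount per frame, from the text
LINK_BITS_PER_S = 27_000    # "27 Kbs", read here as 27,000 bit/s (assumption)
DELAY_FRAMES = 10           # example communication delay, from the text

bits_per_frame = BYTES_PER_FRAME * 8                 # 896 bits
frames_per_second = LINK_BITS_PER_S / bits_per_frame # frames the link can carry
delay_seconds = DELAY_FRAMES * bits_per_frame / LINK_BITS_PER_S

print(bits_per_frame, round(frames_per_second, 1), round(delay_seconds, 2))
# prints: 896 30.1 0.33
```

Under these assumptions a levelled code amount of 112 bytes per frame corresponds to about 30 frames per second, and a delay of 10 frames to about a third of a second, which is the interval the prediction has to bridge.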

【０１２９】これらの知能を多く取り入れたテレビ電話
であっても、その知能を実現するためのハードウェアが
高速で動作しない場合は、１秒間のフレーム数を落とす
しか方法が無い。そうしたコマ落ちしたテレビ画像で
は、それだけで自然な雰囲気の対談でなくなってしま
う。新しく開発したウインドウＭＳＰＡや、それの構成
要素であるＡＧＵ集中制御装置、動きベクトル探索回
路、離散コサイン変換と量子化回路及び逆量子化回路と
離散コサイン逆変換は、目的のハードウェアを回路規模
が小さい低コストな回路で目的を達成する回路を示して
いる。Even a videophone that incorporates many of these intelligent functions has no option but to reduce the number of frames per second if the hardware implementing them does not run fast enough. Television images with dropped frames are enough by themselves to make a conversation feel unnatural. The newly developed window MSPA and its constituent elements, namely the AGU centralized controller, the motion vector search circuit, the discrete cosine transform and quantization circuit, and the inverse quantization and inverse discrete cosine transform circuit, show how the target hardware can be realized with small, low-cost circuits.

【０１３０】要するに、テレビ電話を鮮明かつ自然に見
せるために、（ａ）顔という見たい部分だけを強調する、オブジェク
ト別の重要性に関する重み付け。（ｂ）全体の画面を鮮明にするための、無駄な情報量の
削減。（ｃ）画面と音声とを同期させて自然な雰囲気を作り出
すための、画像処理で発生する遅延時間の最小化。（ｄ）自然な速度で動作する画面を作り出すための、低
コストでしかも高速の動作するハードウェア構成。からなる画像情報伝送システムなのである。In short, to make the videophone look sharp and natural, the image information transmission system consists of: (a) weighting objects by importance, emphasizing only the part one wants to see, the face; (b) reducing the amount of useless information so that the overall picture is sharp; (c) minimizing the delay introduced by image processing so that picture and sound are synchronized and the atmosphere is natural; and (d) a low-cost, high-speed hardware configuration that produces a picture moving at natural speed.

【０１３１】尚、前記テレビ電話は本発明の一実施形態
に過ぎず、その効率良い動画像情報の符号圧縮システム
の適用範囲は無限にある。そして、その利用者にとって
興味関心の薄いとみなされる部分の画質を劣化させ、そ
の代わりに興味関心の集中する部分の画質を維持するよ
うにし、目くるめく流れ過ぎる無駄な部分の画質を必要
最小限の情報量にまで圧縮してもなお、本来の画像伝送
の目的に沿った、画像の雰囲気が損なわれないような、
自然の動きを実現したものは、本発明の要件に含まれる
ものと看做し得る。The videophone is merely one embodiment of the present invention, and the efficient code compression system for video information has an unlimited range of application. Anything that degrades the image quality of the parts regarded as of little interest to the user, maintains instead the image quality of the parts on which interest is concentrated, and, even while compressing rapidly changing, wasteful parts to the minimum necessary amount of information, achieves natural movement that serves the original purpose of image transmission without spoiling the atmosphere of the image can be regarded as falling within the scope of the present invention.

【０１３２】[0132]

【発明の効果】本発明によれば、興味関心をもたれやす
く目立つ顔の部分だけを特定領域（ウインドウ）として
優先的な情報処理を施すようにした符号圧縮システムに
おいて、伝送容量に制約のある在来の電話回線に対応す
べく圧縮した最小限の画像情報伝送量で、自然の会話に
近い表情、特に喋りに合った唇の動き即ちリップ同期を
実現するようにした動画像情報の高性能符号圧縮システ
ムを提供すること、即ちテレビ電話を、極めて簡素かつ
安価なままで、高性能化できる。According to the present invention, in a code compression system that gives priority processing to a specific region (window) containing only the conspicuous part that readily attracts interest, the face, it is possible to provide a high-performance code compression system for video information that achieves facial expressions close to natural conversation, and in particular lip movement matching speech, that is, lip synchronization, with a minimal amount of transmitted image information compressed to suit conventional telephone lines of limited transmission capacity. In other words, a videophone can be given high performance while remaining extremely simple and inexpensive.

【０１３３】本発明は、以上説明したように構成した動
画像情報の高性能符号圧縮システムであるため、光ファ
イバー以前の既成の電話回線等に用いて最大の効果を発
揮できる。即ち、周波数帯域を限定した話し声のみを効
率的に伝送することを前提に構築されている、無線も含
めた従来の電話回線網の情報伝送容量の制約の範囲内で
も、十分に実用性を維持したテレビ画像の伝送を、それ
と同伴して伝送される音声とリップ同期させるのみなら
ず、光速度で届く電磁波の絶対的速度からも、画像伝送
の遅延を略無くし、天文学的距離で無い限りは、略自然
なテレビ対談を可能ならしめた。Because the present invention is a high-performance code compression system for video information configured as described above, it is most effective when used on established telephone lines that predate optical fiber. That is, even within the information transmission capacity of the conventional telephone network, including wireless, which was built on the premise of efficiently transmitting only speech in a limited frequency band, it not only lip-synchronizes sufficiently practical television images with the accompanying audio but also nearly eliminates image transmission delay, given that electromagnetic waves travel at the speed of light, making a nearly natural television conversation possible at anything short of astronomical distances.

【０１３４】尚、前記した従来の電話回線網で効果的な
本発明は、将来の光ファイバー網に対しても有効であ
る。即ち、高速大容量化する伝送経路においては、この
発明を用いたテレビ電話による通信の加入件数の増大を
可能ならしめる。従って、より一般大衆にまでその利便
性をもたらす。The present invention, effective on the conventional telephone network as described, is also effective on future optical fiber networks. On transmission paths of higher speed and capacity, it allows the number of videophone subscribers using the invention to increase, and thus brings its convenience to the wider public.

[0135] If the system of the present invention is applied to a recording and playback device combined with, for example, a 128 Mbit DRAM, a moving image including audio can be compressed to a data rate of 34 kbit per second, so that one hour can be recorded and played back. The tape drive of a conventional video tape recorder becomes unnecessary, and the recording and playback device can be manufactured at very low cost. For the same reason, ROM-based image playback devices can also be readily popularized. The range of applications extends from children's toys to audiovisual teaching materials, daily necessities, and public facilities, and the resulting convenience is immeasurable.
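
The one-hour figure can be checked with simple arithmetic. The sketch below is only a sanity check under stated assumptions: the variable names are illustrative, "K" is read as 1000 bits, and the DRAM capacity is tried under both the decimal (10^6) and binary (2^20) reading of "M", since the text does not say which is meant.

```python
# Sanity check of the storage claim: 34 kbit/s for one hour in a 128 Mbit DRAM.
# All names are illustrative; the unit interpretations are assumptions.
BITRATE_BITS_PER_S = 34_000          # 34 kbit/s, audio included (from the text)
DURATION_S = 60 * 60                 # one hour

required_bits = BITRATE_BITS_PER_S * DURATION_S

dram_bits_decimal = 128 * 10**6      # "M" read as 10^6
dram_bits_binary = 128 * 2**20       # "M" read as 2^20

fits_decimal = required_bits <= dram_bits_decimal
fits_binary = required_bits <= dram_bits_binary

print(f"required: {required_bits / 1e6:.1f} Mbit")
print(f"fits in 128 Mbit (decimal): {fits_decimal}, (binary): {fits_binary}")
```

Under either reading the hour of material fits, with a few megabits to spare.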

[Brief description of drawings]

FIG. 1 is an explanatory diagram of the face window.

FIG. 2 is an explanatory diagram of the face window and the peripheral motion window.

FIG. 3 is an explanatory diagram of a method for reducing the amount of background information.

FIG. 4 is an explanatory diagram of a method for reducing the amount of information by B-frame processing.

FIG. 5 is a list of the conditions and variables relating to the frame-level rate.

FIG. 6 is a processing flowchart of frame-level rate control.

FIG. 7 is a list of the variables and constants relating to the macroblock-level rate.

FIG. 8 is a processing flowchart of macroblock-level rate control.

FIG. 9 is a diagram explaining the definition of the function CQ(x) in macroblock-level rate control.

FIG. 10 is a flowchart of the update calculation of the quantization level q in macroblock (i).

FIG. 11 is a block diagram of the encoder.

FIG. 12 is a block diagram of the decoder.

FIG. 13 is a block diagram of an integrated encoder/decoder.

FIG. 14 is a block diagram of the centralized control unit (AGU).

FIG. 15 is a configuration diagram of the memory areas of the external memory.

FIG. 16 is a diagram of the motion vector search circuit.

FIG. 17 is a block diagram of the discrete cosine transformer and the quantizer.

FIG. 18 is a block diagram of the data format converter.

FIG. 19 is a block diagram of the discrete cosine transform / inverse transformer.

FIG. 20 is a block diagram of the input processing unit of the discrete cosine transform / inverse transformer.

FIG. 21 is a block diagram of the output processing unit of the discrete cosine transform / inverse transformer.

[Explanation of symbols]

1 Memory
2 Data bus
3 AGU
4 Control bus
8 Host computer
9 Program ROM
10 Motion vector search unit
21 Face window
22 Background image
39, 57 Buffers
41 Motion prediction function unit
42 All-pixel-value zeroization function unit
45 Code generation unit
46 Decoding unit
47 Inverse quantization unit
48 Inverse discrete cosine transform unit
44 Addition unit
49 Quantization unit
51 Peripheral motion window
55 Data format converter
56 Discrete cosine transform / inverse transformer
101, 102, ..., 132 Processor elements

Continuation of the front page

(72) Inventor: Kazuto Ito, Department of Electrical and Electronic Systems Engineering, Saitama University, 255 Shimo-Okubo, Urawa-shi, Saitama
(72) Inventor: Tomohiko Otsuka, Department of Electrical Engineering, Tokyo National College of Technology, 1220-2 Kunugida-machi, Hachioji-shi, Tokyo
(72) Inventor: Trio Adino, Department of Electrical and Electronic Engineering, Graduate School of Science and Engineering, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo
(72) Inventor: Chawalit Honsawaik, Department of Electrical and Electronic Engineering, Graduate School of Science and Engineering, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo

(56) References:
JP-A-6-168330 (JP, A)
JP-A-10-275237 (JP, A)
JP-A-10-63855 (JP, A)
JP-A-10-51755 (JP, A)
JP-A-9-185708 (JP, A)
JP-A-2000-295600 (JP, A)
Dongju Li (and three others), Multimedia LSI Design Based on Window-MSPA Architecture, ISPACS '99, USA, IEEE, December 8, 1999, pp. 187-190

(58) Fields searched (Int. Cl.7, DB name): H04N 7/24 - 7/68; H03M 7/30

Claims

(57) [Claims]

1. A high-performance code compression system for moving image information, comprising:
a memory (1) that stores image data;
a window (21) that is recognized on the screen of the moving image and is movable, that is, a window (21) within which information is processed with priority;
sequential processing means that divides the entire image forming the window (21) into small rectangular blocks and sequentially inputs search data and reference data for each small block from the memory (1) to execute window parallel processing;
a motion vector search unit (10) that searches for the motion vector associated with the motion of the moving image in each small block; and
a window position control program that uses the motion vectors found by the motion vector search unit (10) to estimate the position of the face window (21) in the next frame and makes the window (21) follow it,
the system including an encoder for B-frame processing that comprises: a motion prediction function unit that inputs a current frame image, a preceding reference frame image, and a following reference frame image and performs motion prediction, motion compensation, and prediction method determination; an all-pixel-value zeroization function unit that inputs the difference information between the predicted image output from the motion prediction function unit and the current frame image and forcibly sets all pixel values of the difference information to zero; and a code generation unit that inputs the zeroed all-pixel information output from the all-pixel-value zeroization function unit and encodes it while predicting the next motion of the moving image by the prediction method determined by the motion prediction function unit,
the system including a decoder for B-frame processing that comprises: a decoding unit that receives and decodes the compressed moving image information encoded by the encoder and sent over a transmission path; an inverse quantization unit that inputs and inverse-quantizes the decoded signal output from the decoding unit; an inverse discrete cosine transform unit that inputs the inverse-quantized signal output from the inverse quantization unit and applies the inverse discrete cosine transform to restore a difference image; and an addition unit that adds the predicted image, predicted by the prediction method, to the restored difference image and outputs the decoded image,
the system further including noise reduction means that reduces noise in the color difference signal while applying the same quantization level to the luminance signal and the color difference signal, by quantizing both the luminance signal and the color difference signal with rounding only for an intra macroblock, which encodes the current macroblock image directly without using a predicted image, and otherwise quantizing the luminance signal with truncation and the color difference signal with rounding,
wherein the noise reduction means comprises quantization means that obtains the absolute value (|L|) of the quantized data by multiplying the sum of the inverse quantization correction value (p) and the rounding correction parameter (f) by the quantization level (Q) and the quantization level correction value (s), subtracting that product from the absolute value (|C|) of the original data consisting of the discrete-cosine-transformed frequency components, dividing the difference by the product of the quantization level (Q) and the quantization level correction value (s), and truncating the resulting real value, and
wherein the quantization means comprises: first quantization means that sets the correction parameter (f) to 0.5 so that the luminance signal and the color difference signal are quantized with rounding only for an intra macroblock, which encodes the current macroblock image directly without using a predicted image; and second quantization means that, for macroblocks other than intra macroblocks, sets the correction parameter (f) to 0 so that the luminance signal is quantized with truncation and sets the correction parameter (f) to 0.5 so that the color difference signal is quantized with rounding.

2. The high-performance code compression system for moving image information according to claim 1, further comprising: comparison means that compares the bit amount of the encoded moving image information with the communication buffer residual bit amount; control means that controls the target bit amount of a frame, according to the comparison result of the comparison means, so that the communication buffer residual bit amount is not exhausted; and calculation means that uses the control result of the control means to calculate, in frame-level rate control, the target bit amount of a frame that minimizes frame dropping and the delay time arising until a camera input image is output as a decoded image via the encoder, the transmission path, and the decoder.

3. The high-performance code compression system for moving image information according to claim 2, comprising: first calculation means that calculates the quantization level applied to the first macroblock of a frame using a weighted average of the quantization levels of the macroblocks of the previous frame; and second calculation means that calculates the fine adjustment of the quantization level applied to the second and subsequent macroblocks using the target bit amount, and the actual code amount and the quantization level applied from the first macroblock up to the current macroblock.

4. The high-performance code compression system for moving image information according to any one of claims 1 to 3, comprising an encoder and/or a decoder configured with a system architecture in which a memory (1) that stores image frame information is coupled via a data bus (2) to hardware modules that operate independently of one another, and a centralized control device (3), which controls the flow of data between the memory (1) and the hardware modules and the operation schedule of each hardware module, is coupled to each of the hardware modules via a control bus (4).

5. The high-performance code compression system for moving image information according to claim 3 or 4, comprising: a memory (1) whose areas have an address configuration adapted to a block scheme in which an image is divided into a plurality of blocks and information is processed in units of block coordinates; and a centralized control device (3) provided with a ROM (9) that stores instructions enabling address generation for accessing the memory (1) in coordinate units of macroblocks, blocks, and pixels, together with an instruction program that controls the execution of each hardware module.

6. The high-performance code compression system for moving image information according to any one of claims 1 to 5, comprising array processing means in which a plurality of processors operate simultaneously in parallel and which, focusing on a specific pixel and moving the narrow window (21) all over the image vertically and horizontally, executes a predetermined calculation once for each of the various positions that the specific pixel takes within the window (21), as an alternative to repeating the predetermined calculation many times by moving the window (21) up, down, left, and right every time the position changes, the array processing means comprising: a buffer (39) for data format conversion that sequentially inputs the image data stored in the memory (1) from the memory (1), macroblock by macroblock; a memory-sharing array configuration consisting of a plurality of processor elements (101 to 132) that are connected in parallel and supplied with the data output from a plurality of ports of the buffer (39); and a motion vector search circuit consisting of an adder (44) serving as operation means that operates on the data of the processor elements (101 to 132) in parallel at ultra-high speed and searches for the motion vector indicating from which position in the previous frame the macroblock of the current frame has moved.
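
For illustration only, the motion vector search named in claim 6 can be pictured as ordinary block matching. The sketch below is a plain sequential full search, not the parallel processor-element array of the claim; the sum-of-absolute-differences (SAD) matching criterion, the function names, and the synthetic test frames are all assumptions, since the claim does not state the matching criterion.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy). Assumed criterion."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def search_motion_vector(cur, ref, bx, by, n=8, search=2):
    """Full search over displacements in [-search, search].
    Returns ((dx, dy), cost) for the best-matching block in `ref`."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + n <= height
                      and 0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue  # candidate block would leave the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Synthetic frames: the content at (x, y) in the current frame came from
# (x + 2, y + 1) in the previous frame.
prev_frame = [[7 * x + 13 * y for x in range(16)] for y in range(16)]
cur_frame = [[prev_frame[min(y + 1, 15)][min(x + 2, 15)] for x in range(16)]
             for y in range(16)]

mv, cost = search_motion_vector(cur_frame, prev_frame, bx=4, by=4, n=4, search=2)
print(mv, cost)
```

Each candidate displacement is independent of the others, which is what makes this kind of search suitable for an array of processor elements working in parallel.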

7. The high-performance code compression system for moving image information according to any one of claims 1 to 6, comprising: two sets of data format conversion means (55) that continuously receive the input data from the memory (1) through two sets of registers (62), (63) switched alternately by an input selector switch (66), and convert the sequential data, input continuously macroblock by macroblock as 64 pixels (8 vertical by 8 horizontal) per macroblock, into parallel data without lowering the data transfer rate; processor elements (101 to 132) that perform a two-dimensional discrete cosine transform on the parallel data received through the data format conversion means (55); and a quantization module (12) (58) for data output that inputs and quantizes the two-dimensional discrete cosine transform output of the processor elements (101 to 132) and sequentially stores the data in the memory (1).
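
For illustration only, the transform-and-quantize stage of claim 7 can be sketched on a single 8 x 8 block. This is a direct, unoptimized orthonormal DCT-II followed by a uniform quantizer with rounding; it is not the processor-element implementation of the claim, and the function names, the step size, and the choice of rounding are assumptions.

```python
import math


def dct_2d(block):
    """Orthonormal two-dimensional DCT-II of a square block (list of rows)."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical frequency index
        for v in range(n):      # horizontal frequency index
            acc = 0.0
            for y in range(n):
                for x in range(n):
                    acc += (block[y][x]
                            * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                            * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * acc
    return out


def quantize(coeffs, step):
    """Uniform quantizer with rounding to nearest (assumed); keeps the sign."""
    def q(c):
        level = int(math.floor(abs(c) / step + 0.5))
        return level if c >= 0 else -level
    return [[q(c) for c in row] for row in coeffs]


flat_block = [[10] * 8 for _ in range(8)]   # a uniform 8 x 8 block
coeffs = dct_2d(flat_block)
levels = quantize(coeffs, step=16)
print(levels[0][0], sum(abs(v) for row in levels for v in row))
```

A uniform block concentrates all of its energy in the DC coefficient, so after quantization only one non-zero level remains to be coded.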

8. A high-performance code compression system for moving image information, comprising:
a memory (1) that stores image data;
a window (21) that is recognized on the screen of the moving image and is movable, that is, a window (21) within which information is processed with priority;
sequential processing means that divides the entire image forming the window (21) into small rectangular blocks and sequentially inputs search data and reference data for each small block from the memory (1) to execute window parallel processing;
a motion vector search unit (10) that searches for the motion vector associated with the motion of the moving image in each small block; and
a window position control program that uses the motion vectors found by the motion vector search unit (10) to estimate the position of the face window (21) in the next frame and makes the window (21) follow it,
the system comprising, as a frame-level rate control program that minimizes frame dropping and the delay time arising until a camera input image is output as a decoded image through the encoder, the transmission path, and the decoder:
a process (S61) of setting the communication buffer residual bit amount (W) to 0 and encoding the first input screen as one frame;
a process (S62) of determining whether the bit amount (B) of the current frame is a positive value; if the bit amount (B) is positive, taking (C/F) seconds, obtained by dividing the number of screens per output image frame (C) by the number of screens per second of the output image (F), as the encoding processing time of the current frame, and setting the bit amount (U) transmitted during this time to the predetermined value (R·C/F) obtained by multiplying the encoding processing time (C/F) seconds by the bit amount (R) that can be transmitted per second; and, if the bit amount (B) is 0, judging that the current frame is skipped, taking the encoding processing time to be the input screen period (1/G) seconds, which is the reciprocal of the number of screens per second of the input image (G), and setting the bit amount (U) transmitted during this input screen period to (R/G);
a process (S63) of comparing the value obtained by adding the bit amount (B) of the current frame to the communication buffer residual bit amount (W) with the bit amount (U) transmitted from the communication buffer during the encoding processing time of the current frame, and updating the value of the communication buffer residual bit amount (W);
a process (S64) of obtaining a variable (L) by subtracting, from the value (R/G)·(D+1), which is the product of the guaranteed maximum frame delay time (D) plus 1 and the bit amount (R) that can be transmitted per second multiplied by the input screen period (1/G) seconds, the value (R/F)·(C-1), which is the number of screens per output image frame (C) minus 1, multiplied by the bit amount (R) that can be transmitted per second and divided by the number of screens per second of the output image (F); comparing the communication buffer residual bit amount (W) with the value obtained by subtracting, from the predetermined value (R·C/F), the minimum bit amount (E) needed to use 100% of the communication channel bandwidth, and/or comparing the maximum number of screens per second that the decoder can reproduce (H) with the number of screens per second of the output image (F); and at the same time calculating the target bit amount (T) of the next frame;
a process (S65) of, when the comparison (S64) finds the communication buffer residual bit amount (W) or the number of screens per second of the output image (F) to be the smaller (W < LE), setting the target bit amount (T) of the next frame to the value obtained by subtracting the communication buffer residual bit amount (W) from the allocated bit amount per frame (K); when the target bit amount (T) is a positive value, as normal processing, skipping the next (G/F-1) frames and then encoding one or two input screens; and when the target bit amount (T) is zero or negative, skipping the next one input screen without encoding it and setting the bit amount (B) of the current frame to zero;
a comparison step (S62) (S63) of comparing the bit amount (B) of the current frame with the communication buffer residual bit amount (W); and
a step (S64) (S65) of controlling the target bit amount (T) of the next frame, according to the comparison result of the comparison step (S62) (S63), so that the communication buffer residual bit amount (W) is not exhausted.
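
For illustration only, the bookkeeping of steps S62 to S64 can be sketched as three small functions. The symbols B, W, U, R, C, F, G, D, and L are those of claim 8; the function names and the sample numbers are hypothetical. The claim only says that W + B is compared with U and that W is updated, so the update rule W = max(W + B - U, 0) used here is an assumption, and the target-bit calculation of S64 and S65 is deliberately left out.

```python
def transmitted_bits(B, R, C, F, G):
    """S62: bits U drained from the communication buffer while the current
    frame is handled. A coded frame (B > 0) takes C/F seconds; a skipped
    frame (B == 0) takes one input screen period, 1/G seconds."""
    if B > 0:
        return R * C / F
    return R / G


def update_buffer(W, B, U):
    """S63 (assumed form): add the new frame's bits, remove the transmitted
    bits, and never let the residual amount go below zero."""
    return max(W + B - U, 0.0)


def delay_bound(R, G, F, C, D):
    """S64: variable L = (R/G)*(D+1) - (R/F)*(C-1)."""
    return (R / G) * (D + 1) - (R / F) * (C - 1)


# Hypothetical numbers: 60 kbit/s channel, 30 input screens/s,
# 10 output screens/s, 1 screen per output frame, delay guarantee D = 2.
R, C, F, G, D = 60_000, 1, 10, 30, 2

U_coded = transmitted_bits(8_000, R, C, F, G)   # a coded frame of 8000 bits
W = update_buffer(0.0, 8_000, U_coded)          # buffer starts empty (S61)
U_skip = transmitted_bits(0, R, C, F, G)        # the next frame is skipped
W_after_skip = update_buffer(W, 0, U_skip)
L = delay_bound(R, G, F, C, D)
print(U_coded, W, U_skip, W_after_skip, L)
```

In this example the coded frame overshoots the channel by 2000 bits, and a single skipped input screen is enough to drain the buffer again.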

9. The high-performance code compression system for moving image information according to any one of claims 2 to 8, comprising, as a macroblock-level rate control program that, after the frame-level rate control, optimally adjusts the quantization level in each macroblock consisting of 64 pixels (8 vertical by 8 horizontal) using the code bit amount as an index:
a process of examining the state of the previous frame and calculating the initial value (Q) of the quantization level applied to the first macroblock of the current frame using the average value (Qa) of the quantization levels of the coded macroblocks of the previous frame;
a process (S2) of actually encoding the current frame if the target bit amount (T) of the current frame is a positive value, and otherwise skipping the processing of all of its macroblocks;
a process (S3), forming the first half of the coding and compression of the moving image information in each macroblock (i) of all the macroblocks, of generating a motion-compensated predicted image from the image of the immediately preceding decoded I frame or P frame and the motion vector of each macroblock obtained by the motion estimator, applying the discrete cosine transform to the difference image between the predicted image and the image of the current input frame, and quantizing the discrete-cosine-transformed frequency components using the quantization level (q);
a process (S4, S7 to S11) of updating the quantization level (q) to an appropriate value;
a process (S5), forming the latter half of the coding and compression of the moving image information in each macroblock (i), of performing variable-length coding, updating the value of B so that it includes the code of macroblock (i), and generating a restored image by inverse quantization and inverse discrete cosine transform;
a process (S6), performed after all macroblocks have been processed and the coding of the current frame is complete, of preparing for the processing of the next frame by replacing the values of the quantization level initial value (Q′) of the previous frame, the bit amount (B′) of the previous frame, and the target bit amount (T′) of the previous frame;
a process (S7) of determining whether each macroblock (i) is encoded, where a macroblock is not encoded on the condition that it is not an intra macroblock and its quantized frequency components and motion vector components are all zero, and the quantization level (q) is not updated when the macroblock is not encoded;
a process (S8) of calculating, when macroblock (i) is encoded, four variables that serve as indexes for updating the quantization level (q): the predicted value (d) of the bit amount of macroblock (i), the remaining bit predicted value (h), the remaining bit allowable value (a), and the remaining bit target value (e);
a process (S9) of obtaining a parameter (b1), which is a bias that makes the quantization level (q) less likely to become larger when the current quantization level (q) is larger than the initial value (Q) of the quantization level applied to the first macroblock, and a parameter (b2), which is a bias that makes the quantization level (q) less likely to become smaller when it is smaller than the initial value (Q);
a process (S10) of obtaining a parameter (c1) by multiplying the parameter (b1) by a constant (g) of 0 or more and adding a constant (f) of 1 or more, and a parameter (c2) by multiplying the parameter (b2) by the constant (g) of 0 or more and adding the constant (f) of 1 or more; and
a process (S11) of updating the actual quantization level (q), under the condition that the update value in the positive direction (q1) is a positive value that does not reach the maximum quantization level (qmax) and the update value in the negative direction (q2) is a negative value that does not reach the minimum quantization level (qmin), by: evaluating a first condition that the current quantization level (q) is smaller than the initial value (Q) and the remaining bit allowable value (a) is smaller than the remaining bit predicted value (h); if the first condition is true, taking the value obtained by adding the update value (q1) to the quantization level (q) as the value of a variable (q′); if it is false, evaluating a second condition that the product of the remaining bit allowable value (a) and the parameter (c1) is smaller than the remaining bit target value (e); if the second condition is true, taking the value obtained by adding q1 to the quantization level (q) as the value of the variable (q′); if it is false, evaluating a third condition that the product of the remaining bit target value (e) and the parameter (c2) is smaller than the remaining bit allowable value (a); if the third condition is true, evaluating a fourth condition, and if it is false, taking the quantization level (q) as the value of the variable (q′); and, the fourth condition being that the sum of the communication buffer residual bit amount (W) and the bit amount (B) of the current frame is smaller than the bit amount (U) transmitted during the input screen period (1/G) seconds, taking the value obtained by adding the update value (q2) to the quantization level (q) as the value of the variable (q′) if the fourth condition is true, and taking the quantization level (q) unchanged as the value of the variable (q′) if it is false,
wherein the value of the variable (q′) calculated by evaluating the first to fourth conditions is clipped by a function (CQ) that keeps it within allowable limits, and the clipped value is used as the updated value of the quantization level (q).