JP3346448B2 - Video encoding device

Info

Publication number: JP3346448B2
Application number: JP2138196A
Authority: JP
Inventors: 淳嵯峨田; 裕尚如沢; 一人上倉; 久茨木
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1996-02-07
Filing date: 1996-02-07
Publication date: 2002-11-18
Anticipated expiration: 2016-02-07
Also published as: JPH09214974A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a digital compression coding method for image signals used in image communication, image recording, and the like, and more specifically to a moving picture coding device used when image signals are transmitted and received between terminals or stored in a file.

[0002]

[Prior Art] ITU-T (formerly CCITT) Recommendation H.261, entitled "Video coding scheme for audiovisual services at p × 64 kb/s", is a video coding standard for communications that uses bit rates from 64 kb/s (p = 1) to 2 Mb/s (p = 30). Standardization work began in December 1984, and the recommendation was approved in December 1990. Applications include videophones and videoconferencing. H.261 suppresses the temporal redundancy of a moving picture signal by motion-compensated prediction and suppresses the spatial redundancy of each frame by discrete cosine transform (DCT) coding.

[0003] Below, an outline of an image communication apparatus using H.261 is given with reference to FIG. 4, and the H.261 coding algorithm is briefly described with reference to FIG. 5.

[0004] The image communication apparatus shown in FIG. 4 comprises a camera 40, an A/D converter 41, a format converter 42, a video encoder 43, and a multiplexer 44. The video signal 45 is the video signal output from the camera 40; it is an analog signal in a format such as NTSC (National Television System Committee), PAL (Phase Alternating Line), or SECAM (Séquentiel Couleur à Mémoire). The digital signal 46 is the input to the format converter 42; it is the digital video signal obtained by converting the NTSC, PAL, SECAM, or similar signal in the A/D converter 41. The signal 47 is in the Common Intermediate Format (CIF) or Quarter CIF (QCIF). The signal 48 is the video signal compressed by the video encoder 43 and is the input to the multiplexer 44.

[0005] The camera 40 captures a person, a landscape, a document, or the like. Its video signal 45 is output from the camera 40 in an analog format such as NTSC, PAL, or SECAM, is converted into the digital signal 46 by the A/D converter 41, and is input to the format converter 42. Because a video encoder 43 conforming to the ITU-T international standard recommendations is used, the format converter 42 converts the input digital signal 46 into a CIF or QCIF signal 47, which is then compressed by the video encoder 43 in accordance with the high-performance coding scheme specified in ITU-T Recommendation H.261. The compressed digital video signal 48 is then multiplexed by the multiplexer 44, together with digitized audio and data signals, into the frame structure specified in ITU-T Recommendation H.221, and is sent to the remote device over a circuit-switched network such as ISDN (Integrated Services Digital Network) or over a leased line such as a high-speed digital line.

[0006] As for the choice between the CIF and QCIF formats described above, the format that the user of the device has selected in advance is adopted through negotiation between the devices.

[0007] FIG. 5 shows a configuration example of the video encoder 43 of FIG. 4, that is, a video encoding device conforming to the coding scheme specified in ITU-T Recommendation H.261.

[0008] First, the image to be encoded 1 is input to the motion detection unit 3' together with the square pattern 2' and is divided into square blocks of 16 pixels × 16 lines called macroblocks. For each macroblock in the image to be encoded 1, the motion detection unit 3' detects the amount of motion relative to the reference image and sends the resulting motion vector 18 to the block motion compensation unit 4'. The motion vector 18 of each macroblock is expressed as the displacement between the coordinates of the block in the reference image that best matches the macroblock of interest and the coordinates of the macroblock of interest. The search range for the motion vector 18 is limited to ±15 pixels × ±15 lines around the coordinates of the macroblock of interest.

[0009] Next, the block motion compensation unit 4' generates the motion-compensated predicted image 7 from the motion vector 18 of each macroblock and the locally decoded image 6 of the immediately preceding frame stored in the frame memory 5. The motion-compensated predicted image 7 is input to the subtractor 8 together with the image to be encoded 1. Their difference, the motion-compensated prediction error 9, is DCT-transformed and then quantized in the DCT/quantization unit 10' to become the compressed difference data 11. The DCT block size is 8 × 8. The compressed difference data 11 (quantization indices) are compressed in the difference data encoding unit 12 to become the difference image encoded data 13. Meanwhile, the motion vector 18 is encoded in the motion vector encoding unit 19, and the resulting motion vector encoded data 20 are multiplexed with the difference image encoded data 13 in the multiplexing unit 21' and transmitted as the multiplexed data 22'.

[0010] To obtain inside the encoding device the same decoded image as the decoding device, the compressed difference data 11 (quantization indices) are restored to quantization representative values and then inverse-DCT-transformed in the inverse quantization/inverse DCT unit 14', yielding the decompressed difference image 15. The decompressed difference image 15 and the motion-compensated predicted image 7 are added by the adder 16 to form the locally decoded image 17. The locally decoded image 17 is stored in the frame memory 5 and used as the reference image when the next frame is encoded.
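The transform, quantization, and local decoding steps of paragraphs [0009] and [0010] can be sketched for a single 8 × 8 block as follows. This is a minimal illustration, not the H.261 quantizer: the uniform quantizer with a single step size `step` and all function names are assumptions introduced here.

```python
import math

N = 8  # DCT block size stated in the text


def _c(k):
    # Normalisation factors of the orthonormal DCT-II
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def dct2(block):
    """Orthonormal 2-D DCT-II; result is indexed [v][u] (vertical, horizontal)."""
    return [[_c(u) * _c(v) * sum(
                block[y][x]
                * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                for y in range(N) for x in range(N))
             for u in range(N)] for v in range(N)]


def idct2(coef):
    """Inverse of dct2."""
    return [[sum(_c(u) * _c(v) * coef[v][u]
                 * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                 * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                 for v in range(N) for u in range(N))
             for x in range(N)] for y in range(N)]


def encode_block(cur, pred, step):
    """Prediction error -> DCT -> quantization indices (assumed uniform quantizer)."""
    error = [[cur[y][x] - pred[y][x] for x in range(N)] for y in range(N)]
    return [[round(c / step) for c in row] for row in dct2(error)]


def local_decode(indices, pred, step):
    """Indices -> representative values -> inverse DCT -> add the prediction."""
    coef = [[q * step for q in row] for row in indices]
    diff = idct2(coef)
    return [[pred[y][x] + diff[y][x] for x in range(N)] for y in range(N)]
```

Because the encoder reconstructs the block from the quantization indices rather than from the original pixels, its stored reference matches what a decoder would reconstruct from the same indices.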

[0011]

[Problems to Be Solved by the Invention] In the prior art described above, encoding uses an input image signal of a fixed resolution determined by the devices in use, such as the CIF or QCIF video format agreed between the terminals. As a result, the resolution is insufficient when encoding the video signal of a subject that requires high resolution, such as a document or a photograph. One could switch to an image coding scheme other than H.261 when encoding video of documents, photographs, and the like, but the user of the terminal would then have to switch coding schemes according to the type of subject, which is inconvenient.

[0012] Furthermore, in the prior art, in portions where the image signal being encoded moves rapidly because of camera panning, tilting, zooming, or the like, fine detail cannot be perceived visually, yet encoding is possible only at resolutions down to about QCIF, which makes it difficult to encode the motion information that is actually important.

[0013] Moreover, because the prior art uses an input image of fixed resolution regardless of the amount of camera motion, for images in which the camera pans or tilts rapidly a sufficient number of frames cannot be transmitted within the limited bandwidth of the transmission path; conversely, attempting to encode and transmit a sufficient number of frames degrades image quality markedly.

[0014] An object of the present invention is to solve the above problems and to provide a moving picture coding device that can encode a moving picture signal with good quality. When the camera is not moving and the motion in the image signal is not very rapid, the device normally uses a high-resolution input image; when the input signal contains camera panning, tilting, zooming, or the like, it automatically lowers the resolution of the input image to be encoded, to CIF, QCIF, or the like, without the user having to be aware of differences in the subject.

[0015]

[0016]

[0017]

[0018]

[0019]

[0020]

[0021]

[0022]

[0023]

[Means for Solving the Problems] The moving picture coding device of the present invention comprises: a motion detection unit that receives an image to be encoded, or a spatial-resolution-converted image to be encoded, together with a polygon pattern and obtains motion vectors; a camera parameter detection unit that detects panning, tilting, zoom-in, or zoom-out of the image to be encoded from the motion vectors of the image to be encoded; an input image spatial resolution selection unit that selects a spatial resolution for the image to be encoded from the detected camera parameters and outputs it as an input image spatial resolution; a spatial resolution conversion unit that converts the image to be encoded into an image of the spatial resolution indicated by the input image spatial resolution and outputs it as the spatial-resolution-converted image to be encoded; a frame memory that stores a locally decoded image; a motion-compensation spatial resolution conversion unit that receives the input image spatial resolution, converts the locally decoded image of the immediately preceding frame stored in the frame memory into an image of the resolution indicated by the input image spatial resolution, and outputs it as a spatial-resolution-converted locally decoded image; a block motion compensation unit that generates a motion-compensated predicted image from the motion vectors of the spatial-resolution-converted image to be encoded, output by the motion detection unit, and the spatial-resolution-converted locally decoded image; a subtractor that takes the difference between the spatial-resolution-converted image to be encoded and the motion-compensated predicted image and outputs a motion-compensated prediction error; a spatial redundancy compression unit that suppresses spatial redundancy in the motion-compensated prediction error and outputs compressed difference data; a difference data decompression unit that decodes the compressed difference data into a decompressed difference image; an adder that adds the motion-compensated predicted image and the decompressed difference image and stores the result in the frame memory as the locally decoded image; a difference data encoding unit that compression-encodes the compressed difference data and outputs difference image encoded data; a motion vector encoding unit that compression-encodes the motion vectors of the spatial-resolution-converted image to be encoded and outputs motion vector encoded data; a selected spatial resolution identifier encoding unit that compression-encodes the input image spatial resolution and outputs selected spatial resolution identifier encoded data; and a multiplexing unit that multiplexes the difference image encoded data, the motion vector encoded data, and the selected spatial resolution identifier encoded data.

[0024] By changing the resolution of the input image frame by frame according to the detected camera motion, a sufficient number of frames can be transmitted even when the camera moves rapidly, and fine detail can be displayed when the camera is stationary. Perceived image quality can therefore be improved greatly compared with conventional coding methods.

[0025]

[Embodiments of the Invention] Next, an embodiment of the present invention is described with reference to the drawings.

[0026] FIG. 1 is a block diagram of a moving picture coding device according to an embodiment of the present invention. Components with the same reference numerals as in FIG. 5 are the same components.

[0027] This moving picture coding device is newly provided with: a camera parameter detection unit 24 that detects the camera parameters 25 of the image to be encoded 1 from the motion vectors 18₁ of the image to be encoded 1, which are output from the motion detection unit 3 via the switch 23; an input image resolution selection unit 26 that selects a resolution for the image to be encoded 1 from the camera parameters 25 and outputs it as the input image resolution 27; a resolution conversion unit 28 that converts the image to be encoded 1 into an image of the resolution indicated by the input image resolution 27 and outputs it as the resolution-converted image to be encoded 29; a motion-compensation resolution conversion unit 30 that converts the locally decoded image 6 stored in the frame memory 5 into an image of the resolution indicated by the input image resolution 27 and outputs it as the resolution-converted locally decoded image 31; a motion vector encoding unit 19 that compression-encodes the motion vectors 18₂ of the resolution-converted image to be encoded 29, output from the motion detection unit 3, and outputs the motion vector encoded data 20; a block motion compensation unit 4 that generates the motion-compensated predicted image 7 from the resolution-converted locally decoded image 31 and the motion vectors 18₂; a selected resolution identifier encoding unit 33 that compression-encodes the input image resolution 27 and outputs the selected resolution identifier encoded data 34; a multiplexing unit 21 that multiplexes the difference image encoded data 13, the motion vector encoded data 20, and the selected resolution identifier encoded data 34 and transmits them as the multiplexed data 22; and switches 23, 32, and 35.

[0028] Next, the operation of this embodiment is described.

[0029] First, the image to be encoded 1 is input to the motion detection unit 3 together with the polygon pattern 2, and the motion vector 18₁ (amount of motion) of each patch is obtained. A square is commonly used as the polygon pattern 2. The motion vector 18₁ can be obtained by a full search of the neighborhood of the position in the reference image corresponding to the macroblock of interest, using the mean squared error, the sum of absolute differences, or the like as the error measure. It is expressed as the displacement between the coordinates of the block that best matches the macroblock of interest and the coordinates of the macroblock of interest.
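A full search of this kind can be sketched as below, using the sum of absolute differences as the error measure. The function names, the list-of-rows image representation, the skipping of candidates that fall outside the reference image, and the tie-breaking order are assumptions made for illustration; the default block size and search range follow the 16 × 16 macroblock and ±15 range given for H.261.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced from it by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n=16, search=15):
    """Return the displacement (dx, dy) within +/-search that minimises SAD.
    Candidates whose block would fall outside `ref` are skipped (assumption)."""
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best
```

Replacing `abs(...)` with a squared difference gives the mean-squared-error variant mentioned in the text, up to a constant factor that does not change the minimiser.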

[0030] Next, the motion vectors 18₁ are input to the camera parameter detection unit 24, which detects the camera parameters 25, such as panning, tilting, zoom-in, and zoom-out.

[0031] The camera parameters 25 are detected as follows.

[0032] In FIG. 2, take the center of the picture as the origin (0, 0). The motion vector at a point (i, j) due to panning and tilting is independent of position; it is a constant determined by the horizontal panning speed and the vertical tilting speed. Call this (H, V). The motion vector due to zooming grows with distance from the center and can be written (Z × i, Z × j), where Z is a constant determined by the zoom ratio. The motion vector 18₁ due to camera motion under both effects can therefore be written (Z × i + H, Z × j + V). Obtaining the three constants Z, H, and V gives the camera parameters 25 of the moving picture.

[0033] Using the center coordinates (i, j) and (−i, −j) of two blocks located symmetrically about the center of the picture, and their motion vectors (V1_x, V1_y) and (V2_x, V2_y), the three constants Z, H, and V are obtained from the following equation.
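Under the model (Z × i + H, Z × j + V), the two symmetric blocks give four scalar relations, and adding or subtracting them isolates each constant. The following is a derivation sketch under that model only; which component is used for Z, or how the two estimates are combined, is an assumption, since the text here does not state it.

```latex
\begin{aligned}
(V1_x,\ V1_y) &= (Z i + H,\ Z j + V), \\
(V2_x,\ V2_y) &= (-Z i + H,\ -Z j + V), \\[4pt]
H &= \tfrac{1}{2}\,(V1_x + V2_x), \qquad V = \tfrac{1}{2}\,(V1_y + V2_y), \\
Z &= \frac{V1_x - V2_x}{2i} \ \ (i \neq 0)
\quad\text{or}\quad
Z = \frac{V1_y - V2_y}{2j} \ \ (j \neq 0).
\end{aligned}
```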

[0034]

[Equation 1]

Similarly, Z, H, and V are obtained for every pair of blocks at symmetric positions in the picture, and for each of Z, H, and V the most frequent value is taken as its final value.
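The most-frequent-value selection over all symmetric pairs can be sketched as follows. The per-pair estimates assume the model (Z × i + H, Z × j + V); the function name, the dictionary input, the rounding used to bin values before counting, and the use of both components as votes for Z are all assumptions introduced here.

```python
from collections import Counter


def estimate_camera_parameters(vectors, precision=2):
    """`vectors` maps a block centre (i, j), measured from the picture centre,
    to its motion vector (vx, vy). Returns (Z, H, V), each chosen as the most
    frequent per-pair estimate after rounding to `precision` decimals."""
    z_votes, h_votes, v_votes = Counter(), Counter(), Counter()
    for (i, j), (v1x, v1y) in vectors.items():
        mirror = (-i, -j)
        if (i, j) <= mirror or mirror not in vectors:
            continue  # visit each symmetric pair once; skips the centre block
        v2x, v2y = vectors[mirror]
        h_votes[round((v1x + v2x) / 2, precision)] += 1
        v_votes[round((v1y + v2y) / 2, precision)] += 1
        if i != 0:
            z_votes[round((v1x - v2x) / (2 * i), precision)] += 1
        if j != 0:
            z_votes[round((v1y - v2y) / (2 * j), precision)] += 1
    if not h_votes:
        raise ValueError("no symmetric block pairs available")
    return tuple(votes.most_common(1)[0][0] for votes in (z_votes, h_votes, v_votes))
```

Taking the mode rather than the mean keeps blocks that cover independently moving objects from pulling the estimate away from the camera motion shared by the rest of the picture.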

[0035] The obtained camera parameters 25 are input to the input image resolution selection unit 26, which selects the resolution of the input image, such as ITU-R BT.601, CIF, or QCIF, according to the degree of zooming, panning, and tilting, that is, the values of (Z, H, V). In general, the more rapid the camera motion, the lower the resolution of the input image used. The selected input image resolution 27 is input to the resolution conversion unit 28 and the motion-compensation resolution conversion unit 30.

[0036] The image to be encoded 1 is converted by the resolution conversion unit 28 to the resolution indicated by the input image resolution 27, yielding the resolution-converted image to be encoded 29. The resolution-converted image to be encoded 29 is input via the switch 32, together with the polygon pattern 2, to the motion detection unit 3 once more. The motion detection unit 3 detects, for each macroblock at that resolution, the amount of motion relative to the reference image and sends the resulting motion vectors 18₂ to the block motion compensation unit 4.

[0037] Likewise, to match the resolution of the resolution-converted image to be encoded 29, the input image resolution 27 is input to the motion-compensation resolution conversion unit 30, and the locally decoded image 6 of the immediately preceding frame stored in the frame memory 5 is converted to the resolution indicated by the input image resolution 27, yielding the resolution-converted locally decoded image 31.

[0038] FIG. 3 shows an example of the resolution conversion method. For example, CIF has a resolution of 352 pixels × 288 lines, and QCIF has a resolution of 176 pixels × 144 lines. Between these two formats, conversion from CIF to QCIF can therefore be done by discarding every other pixel and every other line, as shown in FIG. 3(a). Conversely, conversion from QCIF to CIF can be done by replacing each pixel with four pixels, as shown in FIG. 3(b). If still higher resolution is needed, conversion between CIF and an image format with four times its resolution can also be used. These resolution conversions can be implemented by simple decimation and enlargement. In encoding, moreover, the resolution of the frame used for motion prediction matches that of the input frame, so prediction is straightforward.
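The decimation and enlargement described for FIG. 3 can be sketched as follows, with an image held as a list of rows. The function names are introduced here, and keeping the even-indexed pixels and lines during decimation is an assumption, since the text does not say which sampling phase is retained.

```python
def downsample_by_two(image):
    """Keep every other pixel on every other line (e.g. CIF -> QCIF).
    Retaining the even-indexed samples is an assumption."""
    return [row[::2] for row in image[::2]]


def upsample_by_two(image):
    """Replace each pixel with a 2 x 2 block of copies (e.g. QCIF -> CIF)."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(2)]  # repeat horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat vertically
    return out
```

Applied to a 352 × 288 CIF image, `downsample_by_two` yields 176 × 144 samples, and `upsample_by_two` restores the CIF dimensions, although not the discarded detail.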

[0039] The description above mainly used QCIF and CIF as examples, but other formats may of course be used. For example, conversion to smaller images, such as 1/3 or 1/4 of QCIF or a further 1/4 of that, may be combined with conversion to larger images, such as 3 times or 4 times CIF or a further 4 times that.

[0040] Next, the block motion compensation unit 4 generates the motion-compensated predicted image 7 from the motion vector 18₂ of each macroblock and the locally decoded image of the immediately preceding frame converted to the resolution indicated by the input image resolution 27, that is, the resolution-converted locally decoded image 31. The motion-compensated predicted image 7 is input to the subtractor 8 together with the resolution-converted image to be encoded 29. Their difference, the motion-compensated prediction error 9, has its spatial redundancy suppressed in the spatial redundancy compression unit 10. To obtain the locally decoded image 17 of the current resolution-converted image to be encoded 29, the compressed difference data 11 output from the spatial redundancy compression unit 10 are decoded into the decompressed difference image 15 by the difference data decompression unit 14. The decompressed difference image 15 is the motion-compensated prediction error signal with its spatial redundancy suppressed. The adder 16 adds the decompressed difference image 15 to the motion-compensated predicted image 7 to give the locally decoded image 17 of the current resolution-converted image to be encoded 29. The locally decoded image 17 is stored in the frame memory 5 and referred to when subsequent frames are encoded.
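The prediction and subtraction steps can be sketched as below for a reference image that has already been converted to the selected resolution. The function names and the dictionary of per-block vectors are assumptions for illustration, and the sketch requires every displaced block to lie inside the reference image.

```python
def motion_compensate(ref, vectors, n=16):
    """Build a predicted image by copying, for each n x n block at (bx, by),
    the block of `ref` displaced by that block's vector (dx, dy).
    Each displaced block must lie inside `ref` (assumption of this sketch)."""
    height, width = len(ref), len(ref[0])
    pred = [[0] * width for _ in range(height)]
    for (bx, by), (dx, dy) in vectors.items():
        for y in range(n):
            for x in range(n):
                pred[by + y][bx + x] = ref[by + dy + y][bx + dx + x]
    return pred


def prediction_error(cur, pred):
    """Pixel-wise difference between the current image and its prediction."""
    return [[c - p for c, p in zip(crow, prow)] for crow, prow in zip(cur, pred)]
```

Both inputs to `prediction_error` must have the same dimensions, which is why the stored reference is resolution-converted before motion compensation.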

[0041] Meanwhile, the compressed difference data 11 for the motion-compensated prediction error 9 are compression-encoded by the difference data encoding unit 12 to become the difference image encoded data 13. The motion vectors 18₂ are compression-encoded by the motion vector encoding unit 19 to become the motion vector encoded data 20. The input image resolution 27 is compression-encoded by the selected resolution identifier encoding unit 33 to become the selected resolution identifier encoded data 34. The difference image encoded data 13, the motion vector encoded data 20, and the selected resolution identifier encoded data 34 are multiplexed in the multiplexing unit 21 and transmitted or stored as the multiplexed data 22.
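The idea of carrying a resolution identifier alongside the motion vector and difference data can be illustrated with a toy container. The layout below (a 1-byte identifier followed by two 4-byte lengths), the identifier codes, and the function names are hypothetical; they are not the H.261 or H.221 syntax, and the text does not specify the multiplexed format.

```python
import struct

# Hypothetical identifier codes, chosen only for this sketch
RESOLUTION_IDS = {"QCIF": 0, "CIF": 1, "ITU-R BT.601": 2}
HEADER = ">BII"  # resolution id, motion-data length, difference-data length


def multiplex(resolution, motion_data, difference_data):
    """Pack one frame: header, then motion vector bytes, then difference bytes."""
    header = struct.pack(HEADER, RESOLUTION_IDS[resolution],
                         len(motion_data), len(difference_data))
    return header + motion_data + difference_data


def demultiplex(packet):
    """Inverse of multiplex; returns (resolution, motion_data, difference_data)."""
    res_id, mv_len, diff_len = struct.unpack_from(HEADER, packet, 0)
    offset = struct.calcsize(HEADER)
    names = {code: name for name, code in RESOLUTION_IDS.items()}
    motion_data = packet[offset:offset + mv_len]
    difference_data = packet[offset + mv_len:offset + mv_len + diff_len]
    return names[res_id], motion_data, difference_data
```

A receiver that reads the identifier first knows which resolution to use for its own reference-frame conversion before it decodes the rest of the frame.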

[0042] The coding device shown in this embodiment addresses the case where the resolution of the input image is not known. When the camera parameters of the input image signal are known, as when camera parameters are recorded at the time the image is captured, the camera parameter detection unit 24 is unnecessary, and the device can be realized by feeding the recorded camera parameters directly into the input image resolution selection unit 26.

[0043]

[Effects of the Invention] As described above, according to the present invention, a high-resolution image signal is encoded automatically while the camera is stationary, and once the camera starts to move the resolution of the input image signal is lowered automatically so that the motion in the moving picture is represented well. The image signal can thus be encoded efficiently without the user having to be aware of it.

[0044] This is consistent with the fact that images requiring high resolution are often still images such as photographs and documents, and with the characteristic of human vision that the ability to resolve detail degrades when motion is large and improves when motion is small.

[Brief description of the drawings]

FIG. 1 is a configuration diagram of a moving picture coding device according to an embodiment of the present invention.

FIG. 2 is an explanatory diagram of the operation of detecting camera parameters from a moving picture.

FIG. 3 is a diagram showing an example of resolution conversion.

FIG. 4 is a diagram showing an outline of a conventional image communication apparatus.

FIG. 5 is a diagram showing the configuration of a coding device based on a conventional block-based motion-compensated predictive coding method.

[Explanation of symbols]

1 Image to be encoded
2 Polygon pattern
3 Motion detection unit
4 Block motion compensation unit
5 Frame memory
6 Locally decoded image
7 Motion-compensated predicted image
8 Subtractor
9 Motion-compensated prediction error
10 Spatial redundancy compression unit
11 Compressed difference data
12 Difference data encoding unit
13 Difference image encoded data
14 Difference data decompression unit
15 Decompressed difference image
16 Adder
17 Locally decoded image
18₁, 18₂ Motion vectors
19 Motion vector encoding unit
20 Motion vector encoded data
21 Multiplexing unit
22 Multiplexed data
23 Switch
24 Camera parameter detection unit
25 Camera parameters
26 Input image resolution selection unit
27 Input image resolution
28 Resolution conversion unit
29 Resolution-converted image to be encoded
30 Motion-compensation resolution conversion unit
31 Resolution-converted locally decoded image
32 Switch
33 Selected resolution identifier encoding unit
34 Selected resolution identifier encoded data
35 Switch

Continuation of front page

(72) Inventor: Hisashi Ibaraki, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo

(56) References cited:
JP-A-7-250322 (JP, A)
JP-A-6-197333 (JP, A)
JP-A-9-116802 (JP, A)
Kazuto Uekura and Yutaka Watanabe, "Global Motion Compensation Method in Video Coding," IEICE Transactions B-I, The Institute of Electronics, Information and Communication Engineers, Japan, December 25, 1993, Vol. J76-B-I, No. 12, pp. 944-952

(58) Field searched (Int. Cl.⁷, DB name): H04N 7/24 - 7/68

Claims

(57) [Claims]

1. A moving picture coding device that receives and encodes an image signal, comprising:
a motion detection unit that receives an image to be encoded, or a spatial-resolution-converted image to be encoded, together with a polygon pattern and obtains motion vectors;
a camera parameter detection unit that detects panning, tilting, zoom-in, or zoom-out of the image to be encoded from the motion vectors of the image to be encoded;
an input image spatial resolution selection unit that selects a spatial resolution for the image to be encoded from the detected camera parameters and outputs it as an input image spatial resolution;
a spatial resolution conversion unit that converts the image to be encoded into an image of the spatial resolution indicated by the input image spatial resolution and outputs it as the spatial-resolution-converted image to be encoded;
a frame memory that stores a locally decoded image;
a motion-compensation spatial resolution conversion unit that receives the input image spatial resolution, converts the locally decoded image of the immediately preceding frame stored in the frame memory into an image of the spatial resolution indicated by the input image spatial resolution, and outputs it as a spatial-resolution-converted locally decoded image;
a block motion compensation unit that generates a motion-compensated predicted image from the motion vectors of the spatial-resolution-converted image to be encoded, output from the motion detection unit, and the spatial-resolution-converted locally decoded image;
a subtractor that takes the difference between the spatial-resolution-converted image to be encoded and the motion-compensated predicted image and outputs a motion-compensated prediction error;
a spatial redundancy compression unit that suppresses spatial redundancy in the motion-compensated prediction error and outputs compressed difference data;
a difference data decompression unit that decodes the compressed difference data into a decompressed difference image;
an adder that adds the motion-compensated predicted image and the decompressed difference image and stores the result in the frame memory as the locally decoded image;
a difference data encoding unit that compression-encodes the compressed difference data and outputs difference image encoded data;
a motion vector encoding unit that compression-encodes the motion vectors of the spatial-resolution-converted image to be encoded and outputs motion vector encoded data;
a selected spatial resolution identifier encoding unit that compression-encodes the input image spatial resolution and outputs selected spatial resolution identifier encoded data; and
a multiplexing unit that multiplexes the difference image encoded data, the motion vector encoded data, and the selected spatial resolution identifier encoded data.