JP4503959B2

JP4503959B2 - Image coding method

Info

Publication number: JP4503959B2
Application number: JP2003313473A
Authority: JP
Inventors: 京子内林; 眞也角野; 陽司能登屋
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 2003-09-05
Filing date: 2003-09-05
Publication date: 2010-07-14
Anticipated expiration: 2023-09-05
Also published as: JP2005086290A

Description

The present invention relates to a moving picture coding method, and more particularly to motion-compensated inter-picture predictive coding, in which a picture to be coded is coded using one or more pictures that are temporally different from it.

In recent years, with the development of multimedia applications, it has become common to handle information from all media, such as images, audio, and text, in a unified manner. Digitizing all media makes this unified handling possible. However, because digitized images carry an enormous amount of data, image compression technology is indispensable for storage and transmission. At the same time, standardization of compression technology is important so that compressed image data can be interoperable. Image compression standards include H.261 and H.263 of ITU-T (International Telecommunication Union, Telecommunication Standardization Sector); MPEG (Moving Picture Experts Group)-1, MPEG-2, and MPEG-4 of ISO/IEC (International Organization for Standardization / International Electrotechnical Commission); and H.264 (MPEG-4 AVC), currently being standardized by the JVT (Joint Video Team), a joint effort of ITU-T and MPEG.

In general, moving picture coding compresses the amount of information by reducing redundancy in the temporal and spatial directions. In inter-picture predictive coding, which aims to reduce temporal redundancy, motion is detected and a predicted image is created block by block with reference to a preceding or following picture, and the difference between the resulting predicted image and the picture to be coded is then coded. Here, "picture" is a term for a single screen: it means a frame in a progressive image and a frame or a field in an interlaced image. An interlaced image is an image in which one frame consists of two fields captured at different times. In coding and decoding an interlaced image, one frame can be processed as a frame, processed as two fields, or processed with a frame structure or a field structure selected for each block in the frame.

A picture that has no reference image and is coded by intra-picture prediction is called an I picture. A picture coded by inter-picture prediction with reference to only one reference image is called a P picture. A picture that can be coded by inter-picture prediction with reference to two reference images at the same time is called a B picture. A B picture can refer to two pictures in any combination of pictures that precede or follow it in display time. The reference image (reference picture) can be specified for each macroblock, the basic unit of coding; the reference picture described first in the coded bitstream is called the first reference picture, and the one described later is called the second reference picture. As a condition for coding these pictures, however, the pictures to be referenced must already have been coded.

Motion-compensated inter-picture predictive coding is used to code P pictures and B pictures. It is a coding scheme that applies motion compensation to inter-picture predictive coding. Rather than predicting simply from the pixel values of the reference frame, motion compensation detects the amount of motion of each part of the picture (hereinafter called a motion vector) and performs prediction that takes this motion into account, which improves prediction accuracy and reduces the amount of data. For example, the amount of data is reduced by detecting the motion vector of the picture to be coded and coding the prediction residual between that picture and a predicted value shifted by the motion vector. Because this scheme needs the motion vector information at decoding time, the motion vector is also coded and then recorded or transmitted.

Motion vectors are detected in units of macroblocks. Specifically, the macroblock on the coding target picture side (the base block) is held fixed, the macroblock on the reference picture side (the reference block) is moved within the search range, and the motion vector is detected by finding the position of the reference block that most closely resembles the base block.
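The block-matching search described above can be sketched as follows. This is a minimal illustration, assuming an exhaustive (full) search and the sum of absolute differences (SAD) as the similarity measure; the names `sad` and `full_search`, the representation of a picture as a list of rows, and the default search range are hypothetical choices made for the example.

```python
def sad(cur, ref, bx, by, rx, ry, size):
    """Sum of absolute differences between the size x size block of `cur`
    at (bx, by) and the size x size block of `ref` at (rx, ry)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )


def full_search(cur, ref, bx, by, size=16, search_range=4):
    """Return ((dx, dy), cost) for the best integer-pixel match.

    The base block in `cur` stays fixed at (bx, by); the reference block
    is moved over every offset within +/- search_range (an assumed window).
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

A practical encoder would usually replace the exhaustive loop with a faster search pattern; the full search is used here only because it is the simplest correct form of the technique.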

FIG. 13 is a block diagram showing the configuration of a conventional image coding apparatus 1000. The image coding apparatus 1000 includes a subtractor 1001, an image coding unit 1002, a variable length coding unit 1003, an image decoding unit 1004, an adder 1005, an image memory 1006, a picture memory 1007, a motion compensation coding unit 1008, a motion vector detection unit 1009, a prediction direction determination unit 1010, and a motion vector storage unit 1011. Coding is performed in units of 16x16-pixel blocks called macroblocks. As for the motion compensation block size, H.264, the draft standard currently being formulated, provides seven sizes (4x4, 4x8, 8x4, 8x8, 8x16, 16x8, and 16x16), and an appropriate one is selected for each macroblock and used for coding.

The picture memory 1007 stores image data Img representing a moving picture, input picture by picture in display order. The subtractor 1001 computes the difference between the image data Img read from the picture memory 1007 and the predicted image data Pred input from the motion compensation coding unit 1008, and generates prediction difference image data Res. The image coding unit 1002 applies coding processes such as frequency transformation and quantization to the input prediction difference image data Res and generates coded difference image data CodedRes. In intra-picture coding, motion compensation between pictures is not performed, so the value of the predicted image data Pred is regarded as "0".

The motion vector detection unit 1009 uses the reference image data Ref, which is already coded and decoded image data stored in the image memory 1006, as a reference picture. It detects and outputs a motion vector MotionVector indicating the position predicted to be optimal within the search area of that picture, together with the reference picture number RefIdx. One motion vector MotionVector is detected per reference image data Ref.

FIG. 14 is a diagram for explaining the processing of the motion vector detection unit 1009 in more detail. As an example, the case described here is the detection of motion vectors for a B picture when the motion compensation block in the coding target macroblock (16x16 pixels) has a size of 16x16 pixels.

The motion vector detection unit 1009 acquires the pixels of the coding target macroblock MB0 from the picture memory 1007, which stores the picture to be coded, and the pixels of the search areas P3 and P5 in the reference pictures RefIdx1 and RefIdx2 from the image memory 1006, which stores the reference pictures. Let MB0a be the position of the coding target macroblock within the reference pictures RefIdx1 and RefIdx2. First, the position most similar to the pixels of the coding target macroblock MB0 is detected within the search area P3 (suppose MB1 is detected). The displacement of the detected similar position from the position MB0a is expressed as the motion vector MV1. Likewise, a similar position is detected within the search area P5 (suppose MB2 is detected), and its displacement is expressed as the motion vector MV2.

The motion vectors MV1 and MV2 detected by the motion vector detection unit, and the reference picture numbers RefIdx they refer to (RefIdx1, RefIdx2, and so on), are output to the prediction direction determination unit 1010.

The prediction direction determination unit 1010 determines the prediction direction Dir of the coding target block (forward unidirectional (Fwd), backward unidirectional (Bwd), bidirectional (Bid), and so on), and inputs the determined direction Dir, the reference picture number RefIdx, and the motion vector MotionVector to the motion compensation coding unit 1008. The motion compensation coding unit 1008 generates the predicted image data Pred from these inputs. The reference picture number RefIdx and the motion vector MotionVector used for the determined prediction direction Dir are stored in the motion vector storage unit 1011.

When a motion vector points to a fractional pixel position such as a 1/2 or 1/4 pixel, the motion compensation coding unit 1008 interpolates the pixel values at those fractional positions using a low-pass filter or the like. The variable length coding unit 1003 applies variable length coding and the like to the input coded difference image data CodedRes and to the motion parameter MotionParam obtained by the motion vector detection unit 1009, and generates the coded data Bitstream by further adding the prediction direction Dir output from the motion compensation coding unit 1008 and the reference picture number RefIdx of the picture to which the referenced block belongs. Here, the motion parameter MotionParam is the motion vector difference between the motion vector MotionVector and a motion vector predicted from already coded motion vectors of the coding target picture (see the motion vector storage unit 1011).
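The motion vector difference carried in MotionParam can be illustrated with a short sketch. The text does not say how the predicted motion vector is derived from the stored vectors, so the component-wise median of three neighbouring vectors used below is an assumption borrowed from common H.264 practice; `median_predictor` and `motion_vector_difference` are hypothetical names.

```python
def median_predictor(neighbors):
    """Component-wise median of already coded neighbouring motion vectors.

    ASSUMPTION: the predictor rule is not given in the text; a median over
    an odd number of neighbours is used purely for illustration.
    """
    xs = sorted(mv[0] for mv in neighbors)
    ys = sorted(mv[1] for mv in neighbors)
    mid = len(neighbors) // 2
    return xs[mid], ys[mid]


def motion_vector_difference(mv, neighbors):
    """Difference between the detected vector and its prediction.

    Only this difference, not the full vector, is passed on to
    variable length coding.
    """
    px, py = median_predictor(neighbors)
    return mv[0] - px, mv[1] - py
```

Because neighbouring blocks tend to move together, the difference is usually small and therefore cheaper to code than the vector itself.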

The image decoding unit 1004 applies decoding processes such as inverse quantization and inverse frequency transformation to the input coded difference image data CodedRes and generates decoded difference image data ReconRes. The adder 1005 adds the decoded difference image data ReconRes output from the image decoding unit 1004 to the predicted image data Pred input from the motion compensation coding unit 1008 and generates decoded image data Recon. The image memory 1006 stores the generated decoded image data Recon.

Depending on the motion of the subject, prediction is sometimes more effective when it uses motion in units smaller than an integer pixel. In general, pixel interpolation is used to compute the pixel values of a predicted image whose motion is finer than an integer pixel. This interpolation is carried out by filtering the pixel values of the reference image with a linear filter (low-pass filter). Increasing the number of taps of this linear filter yields a filter with good frequency characteristics and a better prediction effect, but the amount of processing grows. Conversely, a filter with few taps has poorer frequency characteristics and a weaker prediction effect, but requires less processing.

H.264, the draft standard currently being formulated, permits motion compensation in units as fine as 1/4 pixel (MPEG-4 Simple Profile permits up to 1/2 pixel) and adopts a six-tap filter as its linear-filter pixel interpolation method. Pixel interpolation with this six-tap filter is described with reference to FIG. 15.
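A one-dimensional sketch of six-tap half-pixel interpolation followed by quarter-pixel averaging is given here. The interpolation formulas themselves are not reproduced in this text, so the tap weights (1, -5, 20, 20, -5, 1), the rounding offset 16, the right shift by 5, and the clipping to 8 bits are taken from the published H.264 luma interpolation rather than from this document. The function names are hypothetical, and the two-dimensional centre position is deliberately left out.

```python
def clip255(value):
    """Clamp a sample to the 8-bit range [0, 255]."""
    return max(0, min(255, value))


def half_pel(f1, f2, f3, f4, f5, f6):
    """Half-pixel sample midway between f3 and f4.

    f1..f6 are six consecutive integer-position samples on one row or
    column. Taps (1, -5, 20, 20, -5, 1) sum to 32, hence the shift by 5;
    the +16 rounds to nearest before the shift.
    """
    acc = f1 - 5 * f2 + 20 * f3 + 20 * f4 - 5 * f5 + f6
    return clip255((acc + 16) >> 5)


def quarter_pel(a, b):
    """Quarter-pixel sample as the rounded average of the two nearest
    integer- or half-pixel samples a and b."""
    return (a + b + 1) >> 1
```

On a flat area the filter reproduces the input (`half_pel(10, 10, 10, 10, 10, 10)` is 10), and across a sharp edge the negative taps can overshoot the 8-bit range, which is why the result is clipped.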

FIG. 15 is a diagram for explaining the pixel interpolation method for the luminance component in H.264. F1 to F36 represent pixel values at integer pixel positions, H1 to H7 represent pixel values at 1/2-pixel positions, and Q1 and Q2 represent pixel values at 1/4-pixel positions. For example, to obtain the pixel value at the 1/2-pixel position H1, the value is generated by a calculation using the six surrounding integer pixel values (F1 to F6), as shown in Expression A (H2 to H6 are obtained in the same way). To obtain the pixel value of H7, the value is generated by a calculation using the six surrounding 1/2-pixel values (H1 to H6), as shown in Expression B. To obtain the pixel values at the 1/4-pixel positions Q1 and Q2, the values are generated by calculations using the neighboring 1/2-pixel values, as in Expressions C and D. The constant α in Expressions A and B is a value determined by the rounding coefficient. (See Non-Patent Document 1.)
"Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification", Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG, JVT-G050r1, 27 May 2003

However, depending on the motion of the subject, motion compensation limited to 1/4-pixel positions cannot reduce the prediction error sufficiently. In other words, coding cannot be performed efficiently, and the image quality deteriorates when the coded stream is decoded.

The present invention has been made to solve this problem, and its object is to provide an image coding method that achieves high coding efficiency through highly accurate motion compensation.

To achieve the above object, the image coding method according to the present invention determines two motion vectors of fractional pixel precision for a coding target block, and codes, together with the two motion vectors, the difference between the average of the pixel values of the reference blocks indicated by the two motion vectors and the pixel values of the coding target block. The method comprises: a motion detection step of performing motion detection with integer pixel precision on a reference picture; a motion vector determination step of detecting motion vectors with 1/2n-pixel precision in the neighborhood centered on the reference position of the motion detected in the motion detection step, and determining two motion vectors; and a coding step of coding, for the coding target block, the two motion vectors determined in the motion vector determination step and the difference between the average of the pixel values of the reference blocks indicated by the two motion vectors and the pixel values of the coding target block.

Here, in the image coding method, the average of the pixel values of the reference blocks indicated by the two motion vectors is preferably a 1/2 average, obtained by dividing the sum of the pixel values of those reference blocks by 2.
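The 1/2 average and the resulting difference values can be sketched as follows. The text only says that the sum is divided by 2; adding 1 before the shift so that the result rounds to nearest is an assumption, and `average_prediction` and `residual` are hypothetical names. Blocks are represented as lists of rows.

```python
def average_prediction(ref_a, ref_b):
    """Pixel-wise 1/2 average of the two reference blocks.

    ASSUMPTION: (a + b + 1) >> 1 rounds half up; the text does not
    specify the rounding rule.
    """
    return [
        [(a + b + 1) >> 1 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(ref_a, ref_b)
    ]


def residual(target, pred):
    """Difference values between the coding target block and the
    averaged prediction; these are what gets coded along with the
    two motion vectors."""
    return [
        [t - p for t, p in zip(row_t, row_p)]
        for row_t, row_p in zip(target, pred)
    ]
```

For example, averaging `[[10, 20], [30, 40]]` with `[[12, 21], [30, 45]]` gives `[[11, 21], [30, 43]]` under this rounding rule.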

The present invention can be realized not only as such an image coding method but also as an image coding apparatus. That is, it can be realized as an image coding apparatus that determines two motion vectors of 1/2n-pixel precision for a coding target block, and codes, together with the two motion vectors, the difference between the average of the pixel values of the reference blocks indicated by the two motion vectors and the pixel values of the coding target block.

Similarly, the present invention can be realized as a program for such an image coding apparatus. That is, it can be realized as a program that causes an image coding apparatus to carry out an image coding method that determines two motion vectors of 1/2n-pixel precision for a coding target block, and codes, together with the two motion vectors, the difference between the average of the pixel values of the reference blocks indicated by the two motion vectors and the pixel values of the coding target block.

When coding is performed with the motion-compensated inter-picture predictive coding method and coding apparatus according to the present invention, motion compensation more accurate than in the conventional method can be performed when coding images with fine temporal motion; that is, coding with small prediction error and high coding efficiency becomes possible. In other words, higher-quality coding can be achieved at the same bit rate.

Further, when the image coding method according to the present invention is implemented as a program, in hardware, or the like, temporarily holding the 1/2-pixel values in memory within the program or hardware during the motion interpolation that generates fractional pixels avoids computing the same 1/2-pixel values multiple times. This suppresses the increase in computation and keeps power consumption low.

Furthermore, with the image coding method and apparatus according to the present invention, the amount of computation and the amount of memory required for motion compensation do not increase relative to conventional bi-prediction of B pictures.

Embodiments of the present invention are described below with reference to the drawings.

(Embodiment 1)
A first embodiment of the present invention is described in detail below with reference to the drawings.
FIG. 1 is a block diagram showing the configuration of an image coding apparatus 100 that uses the image coding method according to the present invention. The image coding apparatus 100 includes a subtractor 1001, an image coding unit 1002, a variable length coding unit 1003, an image decoding unit 1004, an adder 1005, an image memory 1006, a picture memory 1007, a motion compensation coding unit 1008, a mode determination unit 102, a motion vector candidate detection unit 101, and a motion vector storage unit 1011. As in the conventional example, coding is performed in units of 16x16-pixel blocks called macroblocks. The processing units of the image coding apparatus 100 other than the motion vector candidate detection unit 101 and the mode determination unit 102 are the same as those described with reference to FIG. 13 in the background art. Accordingly, only the processing of the motion vector candidate detection unit 101 and the mode determination unit 102 is described in detail here, and the other processing units are omitted.

FIG. 2 is a diagram for explaining how the motion vector candidate detection unit 101 detects motion vectors when the motion compensation block size is 4x4. As in the background art described with reference to FIG. 14, the most similar position (with either integer or fractional precision) is detected within the search area. However, whereas the conventional example determines a single similar position, the present invention is characterized by detecting two or more similar positions.

First, as shown in FIG. 2(a), motion detection is performed with integer pixel precision, and the position within the reference area P3 shown in FIG. 14 at which the sum of absolute differences (SAD) of pixel values against the coding target block MB0 is smallest (the most similar position) is determined. Suppose MB1 is chosen, and let the integer-precision pixels contained in MB1 be F1 to F16.

Next, as shown in FIG. 2(b), the SAD of pixel values is evaluated with fractional pixel (1/2, 1/4 pixel) precision around the position MB1 that was most similar at integer pixel precision. The two positions with the smallest SAD are determined (the pixel position Best1 with the smallest SAD and the pixel position Best2 with the second smallest SAD), and their motion vectors are obtained. For example, suppose the 1/2-pixel position marked with x in FIG. 2(b) becomes Best1 and the 1/4-pixel position marked with + becomes Best2. The pixel values at fractional precision may be generated by the same method as described with reference to FIG. 15 in the background art, or by a different method. Also, although the most similar position is determined here by the SAD between the pixel values of the coding target block and those of the reference area, other criteria may be used.
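Selecting Best1 and Best2 around the integer-precision result can be sketched as follows. Vectors are expressed in quarter-pixel units, and `cost_at` is a hypothetical callback that returns the SAD for a given quarter-pixel vector (interpolation is assumed to happen inside it). Including the integer position itself among the candidates and using a window of three quarter-pixel steps are assumptions made for the example.

```python
def two_best_fractional(cost_at, center, radius=3):
    """Return ((best1, cost1), (best2, cost2)) around `center`.

    center  -- integer-precision best vector, in quarter-pixel units
    cost_at -- callable (mvx, mvy) -> cost (e.g. SAD) for that vector
    radius  -- half-width of the refinement window in quarter-pixel
               steps (ASSUMPTION: 3 covers all positions short of the
               next integer pixel)
    """
    cx, cy = center
    scored = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            mv = (cx + dx, cy + dy)
            scored.append((cost_at(*mv), mv))
    # Sort by cost; ties are broken deterministically by the vector.
    scored.sort()
    (cost1, best1), (cost2, best2) = scored[0], scored[1]
    return (best1, cost1), (best2, cost2)
```

Both vectors come out of a single pass over the refinement window, so keeping the runner-up adds only a sort to the conventional single-best search.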

Motion detection is performed in the same way for motion compensation block sizes other than the 4x4 size shown in FIG. 2 and for reference areas other than the reference area P3 (reference picture RefIdx1 in FIG. 14), for example P5 in FIG. 14. The Best1 and Best2 motion vectors are determined and output to the mode determination unit 102 together with the reference picture number RefIdx. The output information is passed in a form grouped per macroblock (16x16 pixels), as shown in FIG. 3. For example, when the motion compensation block is 8x16, the macroblock contains two motion compensation blocks as shown in FIG. 3(b), and for each block all candidates of reference picture number RefIdx and motion vector are stored (the Best1 vector is denoted MV1 and the Best2 vector MV2). FIG. 3(a) shows the case where the motion compensation block unit is 16x16, and FIG. 3(c) shows the case where it is 8x8.

In the method described with reference to FIG. 2, the most similar positions Best1 and Best2 are determined by first finding the most similar position at integer pixel precision and then detecting similar positions at fractional precision around it to pick the two most similar positions. Other methods of determining the two most similar positions may also be used. For example, the two most similar points may be chosen from among all integer and fractional (1/2, 1/4, and so on) pixel positions in the search area. Alternatively, the most similar position may be determined at integer pixel precision, then the most similar position at 1/2-pixel precision around it, and finally the two most similar positions at 1/4-pixel precision around that.
≪Determine the partition and direction with the best coding efficiency from the candidates≫
Next, the processing of the mode determination unit 102 is described in detail. The mode determination unit 102 receives from the motion vector candidate detection unit 101, per macroblock and for each motion compensation block size, all combinations of motion vector MotionVector and corresponding reference picture number RefIdx.

First, the mode determination unit 102 determines from this input the mode that can be coded most efficiently. A mode here means the motion compensation block size, the prediction type of each block for that size (forward unidirectional, backward unidirectional, bidirectional, and so on), and the combination of reference picture and motion vector. For a B picture, a mode may use, for one block, motion vectors of one or two different reference picture numbers, and in H.264 and similar schemes it may additionally use motion vectors of the same reference picture. The mode is therefore determined using a method such as the one described below.

FIG. 4 is a diagram for explaining the mode determination method concretely. To simplify the explanation, the candidate combinations (the combinations of reference picture number RefIdx and motion vector supplied by the motion vector candidate detection unit 101) are limited to two motion compensation block types, 16x16 and 8x16, and two reference picture numbers, RefIdx1 and RefIdx2. Each reference picture has two motion vectors, MV1 and MV2. Here MV1 and MV2 are the motion vectors representing the positions Best1 and Best2 determined by the motion vector candidate detection unit 101. To make the combination of reference picture number and motion vector explicit, FIG. 4(a) writes them in the form MV1#Ref1 (MV1 of reference picture number 1) and so on.

First, an evaluation value is derived for each of mode candidates 1 to 5 of every motion compensation block, as shown in FIG. 4(a). (For motion compensation block type 8x16 the macroblock is split into left and right halves, written 8x16#Left and 8x16#Right.) FIG. 4(b) shows an example of how the evaluation value of one mode candidate of a block is derived. It is first determined whether the mode candidate is a unidirectional one-vector candidate (S301). If it is, the evaluation value is the sum of absolute differences between the pixels of the single reference block and the pixels of the block to be encoded, plus a value α (S302). If it is not (that is, it is a unidirectional two-vector candidate or a bidirectional candidate with one vector per direction), processing proceeds to S303, where the two referenced images are averaged pixel by pixel to form the prediction pixels. The evaluation value is then the sum of absolute differences between the prediction pixels from S303 and the block to be encoded, plus a value β (S304). The values α and β account for the bits needed to represent the motion vectors. The prediction pixels need not be obtained by averaging the two referenced images; other methods may be used.
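The derivation in steps S301 to S304 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the default values of `alpha` and `beta`, and the round-half-up averaging are assumptions made for the example.

```python
import numpy as np

def evaluation_value(target, refs, alpha=4, beta=8):
    """Evaluation value of one mode candidate: SAD plus a motion-vector offset.

    target: 2-D array holding the block to be encoded.
    refs:   list of one or two 2-D reference blocks with the same shape.
    alpha, beta: hypothetical offsets for one-vector and two-vector candidates.
    """
    target = target.astype(np.int64)
    if len(refs) == 1:
        # S301 yes -> S302: one vector, the reference block is the prediction.
        pred = refs[0].astype(np.int64)
        offset = alpha
    else:
        # S301 no -> S303: average the two reference blocks (rounding half up,
        # an assumption), then S304: add beta instead of alpha.
        pred = (refs[0].astype(np.int64) + refs[1].astype(np.int64) + 1) // 2
        offset = beta
    return int(np.abs(target - pred).sum()) + offset

target = np.array([[10, 20], [30, 40]])
ref_a = np.array([[12, 18], [30, 44]])
ref_b = np.array([[8, 22], [30, 36]])
one_vector_cost = evaluation_value(target, [ref_a])         # SAD 8 + alpha 4
two_vector_cost = evaluation_value(target, [ref_a, ref_b])  # SAD 0 + beta 8
```

In this toy input the averaged prediction matches the target exactly, so the two-vector candidate wins despite its larger offset.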

In this way an evaluation value is assigned to every candidate of each block for each motion compensation block type, and for each block the single mode candidate with the smallest evaluation value is selected.

Next, evaluation values are computed per macroblock to decide which motion compensation block type to use within the macroblock. FIG. 5 shows an example in which evaluation values have been assigned, by the method of FIG. 4(b), to each mode candidate of each block in FIG. 4(a). In this example, mode candidate 1 (unidirectional, one vector, MV1#Ref1) with evaluation value 480 is selected for the 16x16 block; mode candidate 5 (bidirectional, one vector per direction, MV1#Ref1, MV1#Ref2) with evaluation value 232 is selected for 8x16#Left (the left block when the macroblock is split); and mode candidate 3 (unidirectional, two vectors, MV1#Ref1, MV2#Ref1) with evaluation value 210 is selected for 8x16#Right. Which motion compensation unit gives the best coding efficiency for the macroblock is decided by comparing these evaluation values at the macroblock level. The minimum evaluation values from FIG. 5 give 480 for the 16x16 unit and 442 for the 8x16 unit (232 + 210 = 442), so the 8x16 unit, with the smaller value, is chosen. As a result, the coding mode uses motion compensation block type 8x16: the 8x16#Left block is encoded by bi-prediction with one vector per direction (MV1#Ref1, MV1#Ref2), and the 8x16#Right block by bi-prediction with two vectors in one direction (MV1#Ref1, MV2#Ref1).
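The two-stage selection (best candidate per block, then best block type per macroblock) can be sketched as below. The function name and data layout are hypothetical. The winning values 480, 232, and 210 come from the FIG. 5 example in the text; all other evaluation values are made up for illustration.

```python
def choose_partition(costs):
    """Pick the block type whose per-block minima sum to the smallest total.

    costs: {block_type: {block: {candidate_id: evaluation_value}}}
    Returns (block_type, total, {block: best candidate_id}).
    """
    results = []
    for block_type, blocks in costs.items():
        # Stage 1: within each block keep the candidate with the smallest value.
        picks = {blk: min(cands, key=cands.get) for blk, cands in blocks.items()}
        # Stage 2 input: the macroblock-level total for this block type.
        total = sum(blocks[blk][cand] for blk, cand in picks.items())
        results.append((total, block_type, picks))
    total, block_type, picks = min(results, key=lambda r: r[0])
    return block_type, total, picks

# Only 480, 232 and 210 are taken from the text; the rest are invented.
fig5 = {
    "16x16": {"16x16": {1: 480, 2: 495, 5: 510}},
    "8x16": {
        "Left": {1: 260, 3: 250, 5: 232},
        "Right": {1: 240, 3: 210, 5: 225},
    },
}
block_type, total, picks = choose_partition(fig5)
print(block_type, total, picks)  # 8x16 442 {'Left': 5, 'Right': 3}
```

The 8x16 total is 232 + 210 = 442, which is below the 16x16 value of 480, so the split block type is selected.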

Evaluation values are assigned in the same way to any other motion compensation block types, and the coding mode of the macroblock to be encoded is determined.

Although the sum of absolute differences between the block to be encoded and the reference block (or the computed prediction pixels) is used here as part of the evaluation value, the evaluation value is not limited to this; any value that allows the mode with the best coding efficiency to be determined may be used.

In accordance with the coding mode determined by the mode determination unit 102, a flag Flag indicating whether the prediction is unidirectional, the reference picture numbers, and the motion vectors are input to the motion compensation encoding unit 1008, and subsequent processing is the same as in conventional encoding.

When encoding is performed by an image encoding apparatus that includes the motion vector candidate detection unit 101 and the mode determination unit 102 of Embodiment 1, the possibility that the two reference blocks referred to by a macroblock in bi-predictive coding of a B picture lie in the same reference picture is taken into account, so the prediction with the highest coding efficiency can be selected regardless of whether the two reference blocks belong to the same picture. In particular, referring to two positions that are adjacent at 1/4-pixel precision within the same reference picture is substantially equivalent to generating a pixel at a 1/8-pixel position, so improved coding efficiency can be expected for images with finer motion.
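The 1/8-pixel remark can be checked numerically in a simplified setting. The sketch below assumes a 1-D signal and plain linear interpolation between integer samples, in which case the average of two adjacent 1/4-pixel samples equals the 1/8-pixel sample exactly. Real codecs use longer interpolation filters, for which the equivalence is only approximate. The helper `sample` is hypothetical.

```python
def sample(signal, pos):
    """Linearly interpolate a 1-D signal at a fractional position."""
    i = int(pos)
    frac = pos - i
    return (1 - frac) * signal[i] + frac * signal[i + 1]

signal = [100, 140, 90]

quarter_a = sample(signal, 0.25)      # first 1/4-pixel position
quarter_b = sample(signal, 0.50)      # adjacent 1/4-pixel position
averaged = (quarter_a + quarter_b) / 2

eighth = sample(signal, 0.375)        # the 1/8-pixel position between them
print(averaged, eighth)               # 115.0 115.0
```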

Furthermore, the image encoding method and apparatus of Embodiment 1 can be realized merely by changing the inputs to a program or hardware that implements conventional bidirectional-prediction motion compensation coding (one vector from each of two different reference pictures, that is, one vector per direction), and require no additional memory.

In Embodiment 1 described above, when the macroblock to be encoded is a bi-predictive block that refers to two images, whether the two reference images are permitted to come from the same picture can be switched with a simple flag.

FIG. 6 illustrates the image encoding process when a simple flag is used to switch whether the two reference images used in bi-predictive coding are permitted to come from the same picture. FIG. 6 differs from the block diagram of FIG. 1 only in the processing and input of the motion vector candidate detection unit, which is unit 103 in FIG. 6 in place of unit 101. A 1-bit flag BIPflag, indicating whether the two reference images in bi-prediction are permitted to come from the same picture, is input to the motion vector candidate detection unit 103 from outside. BIPflag is 1 when this is permitted and 0 when it is not. When BIPflag is 1, the motion vector candidate detection unit 103 determines, as described above with reference to FIG. 2, the two points (Best1, Best2) in the reference area of each reference picture that are most similar to the block to be encoded. When BIPflag is 0, it determines only the single most similar point (Best1) in each reference area and, as above, outputs the reference picture number RefIdx and the motion vector MV1 for that point to the mode determination unit 102. The mode determination unit 102 then determines the mode with the best coding efficiency among the input mode candidates, as described with reference to FIGS. 4 and 5. When BIPflag is 0, only one position per reference picture is input to the mode determination unit 102, so no bi-predictive (two-vector) mode candidate that refers to two points in the same reference picture is generated. Bi-predictive coding that refers to the same picture is thereby prohibited.
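The effect of BIPflag on the candidate set can be sketched as follows. This is an illustration under stated assumptions: the function names, the use of a per-position cost dictionary, and the sample cost values are all hypothetical; only the rule "two best points when the flag is 1, one when it is 0" comes from the text.

```python
from itertools import combinations

def detect_points(cost_by_mv, bip_flag):
    """Best two motion vectors (bip_flag=1) or best one (bip_flag=0)
    for a single reference picture, ranked by matching cost."""
    ranked = sorted(cost_by_mv, key=cost_by_mv.get)
    return ranked[: 2 if bip_flag else 1]

def two_vector_candidates(points_per_ref):
    """All pairs of (ref_idx, mv) available to a two-vector mode."""
    flat = [(ref, mv) for ref, mvs in points_per_ref.items() for mv in mvs]
    return list(combinations(flat, 2))

# Hypothetical matching costs per motion vector, for two reference pictures.
search = {
    1: {(0, 0): 50, (1, 0): 40, (0, 1): 70},
    2: {(0, 0): 60, (2, 1): 45},
}

def same_picture_pairs(bip_flag):
    points = {ref: detect_points(costs, bip_flag) for ref, costs in search.items()}
    pairs = two_vector_candidates(points)
    return [p for p in pairs if p[0][0] == p[1][0]]

print(len(same_picture_pairs(1)))  # 2: one same-picture pair per reference
print(len(same_picture_pairs(0)))  # 0: same-picture bi-prediction cannot occur
```

With the flag at 0 the mode determination stage needs no extra check, because the forbidden candidates are never generated in the first place.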

Switching with a simple flag whether references to the same picture are permitted makes it easy to build an image encoding apparatus that supports both kinds of coding standard: those, such as MPEG-2, that do not permit the two reference images used for encoding to come from the same picture, and those, such as H.264, that do.

A bitstream encoded by the image encoding method and apparatus described above is characterized in that a macroblock in a B picture may refer to two positions in the same reference picture.

(Embodiment 2)
Embodiment 2 of the present invention is described below. It concerns the processing of the motion compensation encoding unit 1008 and the mode determination unit 102 shown in FIG. 1 in the image encoding method and apparatus of Embodiment 1; the other processing in FIG. 1 is unchanged.

The pixel interpolation performed by the motion compensation encoding unit 1008 and the mode determination unit 102 of FIG. 1 is described below. As explained for the conventional example with reference to FIG. 15, the motion compensation encoding unit 1008 performs pixel interpolation in order to carry out motion compensation with fractional-pixel precision. The mode determination unit 102 also needs the prediction pixels (pixels of the reference picture) to compute evaluation values, and must therefore perform pixel interpolation as well.

For the mode candidates in the mode determination unit 102 and the prediction mode (optimum mode) input to the motion compensation encoding unit 1008 in Embodiment 1 (one-vector or two-vector prediction), the number of reference images is represented by a binary flag: 0 when there is one reference image (one-vector prediction) and 1 when there are two (two-vector prediction).

FIG. 7 shows the pixel interpolation flow in the motion compensation encoding unit 1008 and the mode determination unit 102. First, the input prediction mode is checked (S501). If the flag is 1 (two vectors), the two motion vectors (MV1 and MV2, with horizontal and vertical components mv1#x, mv1#y, mv2#x, and mv2#y) and their reference picture numbers (MV1#RefIdx, MV2#RefIdx) are examined (S502). If the two reference picture numbers are equal (MV1#RefIdx == MV2#RefIdx) and the absolute differences of the horizontal and vertical components are smaller than the constants α and β respectively, that is, if the two reference images lie in the same picture and overlap (YES at S502), the 1/2-pixel values computed during pixel interpolation of the first reference image (MV1) are temporarily stored in memory (S503). Otherwise they are not stored (S504). The values α and β used in S502 are non-negative thresholds determined by the number of reference pixels (taps) used in pixel interpolation; they are distinct from the offsets α and β added to the evaluation value in FIG. 4(b).
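The decision of steps S501 and S502 reduces to a single predicate, sketched below. The function name, argument layout, and the threshold values used in the examples are assumptions; the conditions themselves (flag equal to 1, equal reference picture numbers, component differences strictly below α and β) follow the text.

```python
def should_store_half_pel(flag, mv1, mv2, ref_idx1, ref_idx2, alpha, beta):
    """Return True when the half-pel values of the first reference image
    should be kept for reuse (S503), False otherwise (S504).

    flag: 1 for two-vector prediction, 0 for one-vector prediction.
    mv1, mv2: (x, y) motion vector components.
    alpha, beta: non-negative horizontal and vertical thresholds.
    """
    if flag != 1:                       # S501: one vector, nothing to reuse
        return False
    same_picture = ref_idx1 == ref_idx2
    overlapping = (abs(mv1[0] - mv2[0]) < alpha and
                   abs(mv1[1] - mv2[1]) < beta)
    return same_picture and overlapping  # S502

# Hypothetical thresholds of 8 in both directions.
print(should_store_half_pel(1, (4, 2), (6, 3), 1, 1, 8, 8))   # True
print(should_store_half_pel(1, (4, 2), (6, 3), 1, 2, 8, 8))   # False: different pictures
print(should_store_half_pel(1, (4, 2), (40, 3), 1, 1, 8, 8))  # False: too far apart
```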

When two images are referenced (bi-prediction) and both refer to overlapping pixels in the same reference picture, the 1/2-pixel values computed during pixel interpolation of the first reference image are temporarily stored in memory and read back during pixel interpolation of the second reference image, so the corresponding computation for the second interpolation can be skipped. This reduces power consumption when the motion compensation encoding unit 1008 and the mode determination unit 102 are implemented in hardware.
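The saving from reusing stored half-pel values can be illustrated with a small cache. This is a simplified sketch: it works on a single row of pixels and uses a 2-tap rounded average as a stand-in for the actual interpolation filter, and the class and method names are hypothetical.

```python
class HalfPelCache:
    """Computes half-pel samples of one picture row and remembers them."""

    def __init__(self, row):
        self.row = row
        self.stored = {}     # integer position -> half-pel value
        self.computed = 0    # number of filter evaluations actually performed

    def half_pel(self, x):
        """Half-pel sample between integer positions x and x + 1."""
        if x not in self.stored:
            # Stand-in filter: rounded average of the two neighbours.
            self.stored[x] = (self.row[x] + self.row[x + 1] + 1) // 2
            self.computed += 1
        return self.stored[x]

    def block(self, start, width):
        return [self.half_pel(x) for x in range(start, start + width)]

row = list(range(0, 100, 10))        # 0, 10, ..., 90
cache = HalfPelCache(row)
first = cache.block(2, 4)            # first reference: 4 samples computed
second = cache.block(3, 4)           # overlapping second reference: 1 new sample
print(first, second, cache.computed) # [25, 35, 45, 55] [35, 45, 55, 65] 5
```

Without the cache the two blocks would need eight filter evaluations; with it, the three overlapping samples are read back and only five are computed.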

(Embodiment 3)
Furthermore, by recording a program that implements the image encoding method of each of the above embodiments on a storage medium such as a flexible disk, the processing described in those embodiments can easily be carried out on an independent computer system.

FIG. 8 illustrates the case where the image encoding method of Embodiments 1 and 2 is carried out by a computer system using a flexible disk on which it is stored.
FIG. 8(b) shows the external appearance of the flexible disk as seen from the front, its cross-sectional structure, and the flexible disk itself, and FIG. 8(a) shows an example of the physical format of the flexible disk, which is the recording medium proper. The flexible disk FD is housed in a case F. On the surface of the disk, a plurality of tracks Tr are formed concentrically from the outer circumference toward the inner circumference, and each track is divided in the angular direction into 16 sectors Se. On a flexible disk storing the program, the image encoding method is thus recorded as the program in an area allocated on the flexible disk FD.

FIG. 8(c) shows a configuration for recording and reproducing the program on the flexible disk FD. To record the program on the flexible disk FD, the image encoding method or image decoding method is written as the program from the computer system Cs via a flexible disk drive. To build the image encoding method in the computer system from the program on the flexible disk, the program is read from the flexible disk by the flexible disk drive and transferred to the computer system.

Although a flexible disk is used as the recording medium in the above description, an optical disk can be used in the same way. The recording medium is not limited to these; anything on which a program can be recorded, such as an IC card or a ROM cassette, can be used likewise.

Application examples of the motion compensation method and image encoding method described in the above embodiments, and a system using them, are described next.

FIG. 9 is a block diagram showing the overall configuration of a content supply system ex100 that realizes a content distribution service. The area in which the communication service is provided is divided into cells of a desired size, and base stations ex107 to ex110, which are fixed radio stations, are installed in the respective cells.

In the content supply system ex100, devices such as a computer ex111, a PDA (personal digital assistant) ex112, a camera ex113, a mobile phone ex114, and a camera-equipped mobile phone ex115 are connected to the Internet ex101 via, for example, an Internet service provider ex102, a telephone network ex104, and the base stations ex107 to ex110.

However, the content supply system ex100 is not limited to the combination shown in FIG. 9, and any of these elements may be combined and connected. Each device may also be connected directly to the telephone network ex104 without going through the base stations ex107 to ex110, which are fixed radio stations.

The camera ex113 is a device capable of shooting moving images, such as a digital video camera. The mobile phone may be any of a PDC (Personal Digital Communications), CDMA (Code Division Multiple Access), W-CDMA (Wideband-Code Division Multiple Access), or GSM (Global System for Mobile Communications) mobile phone, a PHS (Personal Handyphone System) terminal, or the like.

The streaming server ex103 is connected to the camera ex113 via the base station ex109 and the telephone network ex104, which enables live distribution and the like based on encoded data transmitted by a user with the camera ex113. The captured data may be encoded by the camera ex113 or by a server or other device that handles data transmission. Moving image data shot by a camera ex116 may also be transmitted to the streaming server ex103 via the computer ex111. The camera ex116 is a device, such as a digital camera, that can shoot still images and moving images. In this case, the moving image data may be encoded by either the camera ex116 or the computer ex111. The encoding is performed by an LSI ex117 included in the computer ex111 or the camera ex116. The image encoding and decoding software may be stored on any storage medium (CD-ROM, flexible disk, hard disk, and so on) that is readable by the computer ex111 or the like. Moving image data may also be transmitted by the camera-equipped mobile phone ex115; in that case the data has been encoded by an LSI included in the mobile phone ex115.

In the content supply system ex100, content that a user is shooting with the camera ex113, the camera ex116, or the like (for example, video of a live music performance) is encoded as in the above embodiments and transmitted to the streaming server ex103, and the streaming server ex103 streams the content data to clients that request it. Clients include the computer ex111, the PDA ex112, the camera ex113, the mobile phone ex114, and other devices capable of decoding the encoded data. The content supply system ex100 thus allows clients to receive and reproduce the encoded data, and, because clients can receive, decode, and reproduce it in real time, it also makes personal broadcasting possible.

The image encoding apparatus described in the above embodiments may be used for encoding in each device of this system.
A mobile phone is described as an example.
FIG. 10 shows the mobile phone ex115 that uses the motion compensation method, image encoding method, and image decoding method described in the above embodiments. The mobile phone ex115 has: an antenna ex201 for transmitting and receiving radio waves to and from the base station ex110; a camera unit ex203, such as a CCD camera, capable of capturing video and still images; a display unit ex202, such as a liquid crystal display, that displays decoded data such as video captured by the camera unit ex203 and video received by the antenna ex201; a main body including a set of operation keys ex204; an audio output unit ex208, such as a speaker, for outputting audio; an audio input unit ex205, such as a microphone, for inputting audio; a recording medium ex207 for storing encoded or decoded data, such as captured moving or still image data, received e-mail data, and moving or still image data; and a slot ex206 that allows the recording medium ex207 to be attached to the mobile phone ex115. The recording medium ex207 is, like an SD card, a plastic case housing a flash memory element, a type of EEPROM (Electrically Erasable and Programmable Read Only Memory), which is a nonvolatile memory that can be electrically rewritten and erased.

The mobile phone ex115 is further described with reference to FIG. 11. In the mobile phone ex115, a power supply circuit unit ex310, an operation input control unit ex304, an image encoding unit ex312, a camera interface unit ex303, an LCD (Liquid Crystal Display) control unit ex302, an image decoding unit ex309, a multiplexing/demultiplexing unit ex308, a recording/reproducing unit ex307, a modulation/demodulation circuit unit ex306, and an audio processing unit ex305 are connected to one another via a synchronous bus ex313, together with a main control unit ex311 that performs overall control of each part of the main body including the display unit ex202 and the operation keys ex204.

When the end-call/power key is turned on by a user operation, the power supply circuit unit ex310 supplies power from the battery pack to each unit, thereby starting the camera-equipped digital mobile phone ex115 in an operable state.

Under the control of the main control unit ex311, which comprises a CPU, ROM, RAM, and so on, the mobile phone ex115 in voice call mode converts the audio signal collected by the audio input unit ex205 into digital audio data in the audio processing unit ex305, applies spread spectrum processing in the modulation/demodulation circuit unit ex306, applies digital-to-analog conversion and frequency conversion in the transmission/reception circuit unit ex301, and transmits the result via the antenna ex201. Also in voice call mode, the mobile phone ex115 amplifies the signal received by the antenna ex201, applies frequency conversion and analog-to-digital conversion, applies inverse spread spectrum processing in the modulation/demodulation circuit unit ex306, converts the result into an analog audio signal in the audio processing unit ex305, and outputs it via the audio output unit ex208.

When an e-mail is sent in data communication mode, the text data of the e-mail entered with the operation keys ex204 of the main body is sent to the main control unit ex311 via the operation input control unit ex304. The main control unit ex311 has the text data spread-spectrum processed in the modulation/demodulation circuit unit ex306 and digital-to-analog converted and frequency converted in the transmission/reception circuit unit ex301, and then transmits it to the base station ex110 via the antenna ex201.

When image data is transmitted in data communication mode, the image data captured by the camera unit ex203 is supplied to the image encoding unit ex312 via the camera interface unit ex303. When image data is not transmitted, the image data captured by the camera unit ex203 can instead be displayed directly on the display unit ex202 via the camera interface unit ex303 and the LCD control unit ex302.

The image encoding unit ex312 includes the image encoding apparatus described in the present invention. It compresses the image data supplied from the camera unit ex203 using the encoding method of the image encoding apparatus described in the above embodiments, converting it into encoded image data, which it sends to the multiplexing/demultiplexing unit ex308. At the same time, the mobile phone ex115 sends the audio collected by the audio input unit ex205 during capture by the camera unit ex203 to the multiplexing/demultiplexing unit ex308 as digital audio data via the audio processing unit ex305.

The multiplexing/demultiplexing unit ex308 multiplexes the encoded image data supplied from the image encoding unit ex312 and the audio data supplied from the audio processing unit ex305 by a predetermined method. The resulting multiplexed data is spread-spectrum processed in the modulation/demodulation circuit unit ex306, digital-to-analog converted and frequency converted in the transmission/reception circuit unit ex301, and then transmitted via the antenna ex201.

When data of a moving image file linked to a web page or the like is received in data communication mode, the signal received from the base station ex110 via the antenna ex201 is inverse spread-spectrum processed in the modulation/demodulation circuit unit ex306, and the resulting multiplexed data is sent to the multiplexing/demultiplexing unit ex308.

To decode the multiplexed data received via the antenna ex201, the multiplexing/demultiplexing unit ex308 separates it into an encoded bitstream of image data and an encoded bitstream of audio data, and supplies the encoded image data to the image decoding unit ex309 and the audio data to the audio processing unit ex305 via the synchronous bus ex313.

Next, the image decoding unit ex309 decodes the encoded bitstream of image data to generate reproduced moving image data and supplies it to the display unit ex202 via the LCD control unit ex302. As a result, for example, the video data contained in a moving image file linked to a web page is displayed. At the same time, the audio processing unit ex305 converts the audio data into an analog audio signal and supplies it to the audio output unit ex208. As a result, for example, the audio data contained in a moving image file linked to a web page is reproduced.
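The description only states that the image and audio data are multiplexed "by a predetermined method" and later separated again. As a minimal sketch of that round trip, the following assumes a simple tag-and-length packet layout. The tags `VIDEO` and `AUDIO`, the packet layout, and the function names `multiplex` and `demultiplex` are illustrative assumptions and are not taken from the patent.

```python
import struct

# Hypothetical stream tags; the patent does not specify the multiplexing format.
VIDEO, AUDIO = 0x01, 0x02


def multiplex(video_chunks, audio_chunks):
    """Interleave chunks as packets: [tag: 1 byte][length: 4 bytes, big-endian][payload]."""
    out = bytearray()
    for i in range(max(len(video_chunks), len(audio_chunks))):
        for tag, chunks in ((VIDEO, video_chunks), (AUDIO, audio_chunks)):
            if i < len(chunks):
                out += struct.pack(">BI", tag, len(chunks[i])) + chunks[i]
    return bytes(out)


def demultiplex(data):
    """Split multiplexed data back into (video_bitstream, audio_bitstream)."""
    streams = {VIDEO: bytearray(), AUDIO: bytearray()}
    pos = 0
    while pos < len(data):
        tag, length = struct.unpack_from(">BI", data, pos)
        pos += 5  # size of the tag + length header
        streams[tag] += data[pos:pos + length]
        pos += length
    return bytes(streams[VIDEO]), bytes(streams[AUDIO])
```

In this sketch, the video bitstream returned by `demultiplex` would correspond to the data handed to the image decoding unit ex309, and the audio bitstream to the data handed to the audio processing unit ex305.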

The present invention is not limited to the system example above. Digital broadcasting via satellite and terrestrial waves has recently attracted attention, and as shown in FIG. 12, the image encoding device of the above embodiments can also be incorporated into a digital broadcasting system. Specifically, at the broadcasting station ex409, an encoded bitstream of video information is transmitted by radio waves to a communication or broadcasting satellite ex410. The broadcasting satellite ex410 that receives it emits broadcast radio waves, which are received by a home antenna ex406 equipped with satellite broadcast reception facilities, and a device such as a television (receiver) ex401 or a set-top box (STB) ex407 decodes the encoded bitstream and reproduces it. In addition, an encoded bitstream recorded on a storage medium ex402 such as a CD or DVD can be read and decoded by the playback device ex403.

In this case, the reproduced video signal is displayed on the monitor ex404. A configuration is also conceivable in which the image decoding device is mounted in a set-top box ex407 connected to a cable ex405 for cable television or to an antenna ex406 for satellite/terrestrial broadcasting, and the output is reproduced on the television monitor ex408. The image decoding device may be incorporated in the television instead of the set-top box. It is also possible for a car ex412 having an antenna ex411 to receive a signal from the satellite ex410, the base station ex107, or the like, and to reproduce moving images on a display device such as the car navigation system ex413 in the car ex412.

Furthermore, an image signal can be encoded by the image encoding device described in the above embodiments and recorded on a recording medium. Specific examples include a recorder ex420 such as a DVD recorder that records image signals on a DVD disc ex421 or a disk recorder that records them on a hard disk. The signals can also be recorded on an SD card ex422. If the recorder ex420 includes the image decoding device described in the above embodiments, the image signals recorded on the DVD disc ex421 or the SD card ex422 can be reproduced and displayed on the monitor ex408.

The car navigation system ex413 may, for example, have the configuration shown in FIG. 11 without the camera unit ex203, the camera interface unit ex303, and the image encoding unit ex312. The same applies to the computer ex111, the television (receiver) ex401, and the like.

Terminals such as the mobile phone ex114 can be implemented in three forms: a transmitting and receiving terminal that has both an encoder and a decoder, a transmitting terminal that has only an encoder, and a receiving terminal that has only a decoder.

In this way, the motion compensation method and the image encoding method described in the above embodiments can be used in any of the devices and systems described above, and doing so yields the effects described in the above embodiments.

The image encoding method and the image encoding device according to the present invention can encode images with high coding efficiency, and can be applied in fields that handle moving images, such as content distribution, mobile phones, and digital broadcasting.

FIG. 1 is a block diagram showing the image encoding device according to Embodiment 1 of the present invention.
FIG. 2 is a pixel diagram explaining the processing of the motion vector candidate detection unit of FIG. 1.
FIG. 3 is a diagram explaining the motion compensation block types and the information output from the motion vector candidate detection unit of FIG. 1.
FIG. 4 is a table and a flowchart explaining the processing of the mode determination unit of FIG. 1.
FIG. 5 is a table and a flowchart explaining the processing of the mode determination unit of FIG. 1.
FIG. 6 is a block diagram showing the image encoding device according to Embodiment 1 of the present invention.
FIG. 7 is a flowchart explaining the processing of the motion compensation encoding unit according to Embodiment 2 of the present invention.
FIG. 8 is an explanatory diagram for the case where the image encoding method or image decoding method of Embodiments 1 and 2 is implemented by a computer system using a flexible disk on which it is stored.
FIG. 9 is a block diagram showing the overall configuration of a content supply system that implements a content distribution service.
FIG. 10 is a diagram showing a mobile phone that uses the motion compensation method, the image encoding method, and the image decoding method.
FIG. 11 is a block diagram showing the configuration of the mobile phone.
FIG. 12 is a block diagram showing the overall configuration of a digital broadcasting system.
FIG. 13 is a block diagram showing the configuration of a conventional image encoding device.
FIG. 14 is a diagram explaining the processing of the motion vector detection unit in conventional image encoding.
FIG. 15 is a diagram explaining the interpolation method for fractional-precision pixels during motion compensation in conventional image encoding.

Explanation of symbols

100 Image encoding device
101 Motion vector candidate detection unit
102 Mode determination unit
103 Motion vector candidate detection unit
1000 Image encoding device
1001 Subtractor
1002 Image encoding unit
1003 Variable-length encoding unit
1004 Image decoding unit
1005 Adder
1006 Image memory
1007 Picture memory
1008 Motion compensation encoding unit
1009 Motion vector detection unit
1010 Prediction direction determination unit
1011 Motion vector storage unit
Img Image data
Res Difference image data
CodedRes Encoded difference image data
Bitstream Encoded data
Recon Decoded image data
Ref Reference image data
MotionParam Motion parameters
Mod Encoding mode
MotionVector Motion vector
RefIdx Reference picture number
Dir Prediction mode

Claims

1. An image encoding method in which two motion vectors with fractional-pixel precision are determined for an encoding target block, and a difference value between a value obtained by averaging the pixel values of the reference blocks indicated by the two motion vectors and the pixel values of the encoding target block is encoded together with the two motion vectors, the method comprising:
a motion detection step of performing motion detection with integer-pixel precision on a reference picture;
a motion vector determination step of determining two motion vectors by detecting motion vectors with a pixel precision of 1/2n around the reference position of the motion detected in the motion detection step; and
an encoding step of encoding, for the encoding target block, the two motion vectors determined in the motion vector determination step and the difference value between the value obtained by averaging the pixel values of the reference blocks indicated by the two motion vectors and the pixel values of the encoding target block.

2. The image encoding method according to claim 1, wherein the average of the pixel values of the reference blocks indicated by the two motion vectors is a one-half average obtained by dividing the sum of the pixel values of the reference blocks indicated by the two motion vectors by 2.

3. An image encoding device in which two motion vectors with fractional-pixel precision are determined for an encoding target block, and a difference value between the pixel values of the reference blocks indicated by the two motion vectors and the pixel values of the encoding target block is encoded together with the two motion vectors, the device comprising:
motion detection means for performing motion detection with integer-pixel precision on a reference picture;
motion vector determination means for determining two motion vectors by detecting motion vectors with a pixel precision of 1/2n around the reference position of the motion detected by the motion detection means; and
encoding means for encoding, for the encoding target block, the two motion vectors determined by the motion vector determination means and a difference value between a value obtained by averaging the pixel values of the reference blocks indicated by the two motion vectors and the pixel values of the encoding target block.

4. A program for causing an image encoding device to carry out an image encoding method in which two motion vectors with fractional-pixel precision are determined for an encoding target block, and a difference value between the pixel values of the reference blocks indicated by the two motion vectors and the pixel values of the encoding target block is encoded together with the two motion vectors, the program causing the image encoding device to execute:
a motion detection step in which the image encoding device performs motion detection with integer-pixel precision on a reference picture;
a motion vector determination step in which the image encoding device determines two motion vectors by detecting motion vectors with a pixel precision of 1/2n around the reference position of the motion detected in the motion detection step; and
an encoding step in which the image encoding device encodes, for the encoding target block, the two motion vectors determined in the motion vector determination step and a difference value between a value obtained by averaging the pixel values of the reference blocks indicated by the two motion vectors and the pixel values of the encoding target block.
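
The three steps recited in the claims (integer-precision motion detection, fractional-precision refinement around the detected position, and encoding of the difference from the averaged reference blocks) can be illustrated with the following minimal sketch. It rests on several assumptions that are not taken from the claims: one motion vector is found per reference picture, half-pixel precision is used as one instance of fractional precision, fractional samples come from a rounded bilinear average rather than any particular interpolation filter, the matching cost is the sum of absolute differences, picture edges are clamped, and the sum is halved by truncating integer division. All function names are hypothetical.

```python
def sample(ref, y2, x2):
    """Pixel at half-pel position (y2/2, x2/2): rounded bilinear average, edges clamped."""
    h, w = len(ref), len(ref[0])
    clamp = lambda v, n: min(max(v, 0), n - 1)
    y0, x0 = y2 // 2, x2 // 2
    ys = (clamp(y0, h), clamp(y0 + y2 % 2, h))
    xs = (clamp(x0, w), clamp(x0 + x2 % 2, w))
    return (sum(ref[y][x] for y in ys for x in xs) + 2) // 4


def block(ref, y2, x2, size):
    """size x size block whose top-left corner is at half-pel position (y2, x2)."""
    return [[sample(ref, y2 + 2 * r, x2 + 2 * c) for c in range(size)]
            for r in range(size)]


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def search(cur, ref, by, bx, size, centre, step, radius):
    """Lowest-SAD vector (in half-pel units) on a grid of spacing `step` around `centre`."""
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            mv = (centre[0] + step * dy, centre[1] + step * dx)
            cost = sad(cur, block(ref, 2 * by + mv[0], 2 * bx + mv[1], size))
            if best is None or cost < best[0]:
                best = (cost, mv)
    return best[1]


def motion_vector(cur, ref, by, bx, size, radius=2):
    """Integer-pel search first, then half-pel refinement around the result."""
    coarse = search(cur, ref, by, bx, size, (0, 0), 2, radius)  # step 2 = one full pixel
    return search(cur, ref, by, bx, size, coarse, 1, 1)         # step 1 = half a pixel


def encode_block(cur_pic, ref0, ref1, by, bx, size):
    """Return the two motion vectors and the residual against the averaged prediction."""
    cur = [row[bx:bx + size] for row in cur_pic[by:by + size]]
    refs = (ref0, ref1)
    mvs = [motion_vector(cur, r, by, bx, size) for r in refs]
    preds = [block(r, 2 * by + mv[0], 2 * bx + mv[1], size)
             for r, mv in zip(refs, mvs)]
    # Average of the two reference blocks: sum divided by 2 (truncated here).
    avg = [[(p + q) // 2 for p, q in zip(r0, r1)] for r0, r1 in zip(*preds)]
    residual = [[c - a for c, a in zip(rc, ra)] for rc, ra in zip(cur, avg)]
    return mvs, residual
```

In this sketch the motion vectors are expressed in half-pixel units, so a vector of `(2, -2)` means one pixel down and one pixel to the left. An actual encoder would go on to transform and entropy-code the residual together with the two vectors, which is outside the scope of this illustration.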