JP2008258769A

JP2008258769A - Image encoding device and control method thereof, and computer program

Info

Publication number: JP2008258769A
Application number: JP2007096595A
Authority: JP
Inventors: Hidekazu Tanaka; 英一田中
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2007-04-02
Filing date: 2007-04-02
Publication date: 2008-10-23

Abstract

PROBLEM TO BE SOLVED: To provide an image encoding device, an image encoding method, and a program therefor that can derive an optimum prediction value at low computational cost while maintaining a wide search range. SOLUTION: A reference image 200 and a macroblock image 201 to be encoded are each resolution-converted and binarized. Block matching is performed between the two resolution-converted images, and the search range is narrowed by a search range restriction part 206. The narrowed range is applied as-is to the binarized reference image and is narrowed further by a binary image search range restriction part 208. The range narrowed using the resolution-converted image and the binarized image is finally applied to the reference image 200, and block matching between the reference image 200 and the macroblock image 201 to be encoded is performed to detect a motion vector. COPYRIGHT: (C)2009,JPO&INPIT

Description

The present invention relates to a moving image encoding apparatus, a control method thereof, and a computer program.

At present, MPEG-2, MPEG-4, and H.264 are the mainstream methods for compressing and encoding moving images in units of macroblocks by exploiting the correlation between frames. These methods reduce the amount of data by encoding the pixel difference values between an encoding target macroblock and a reference image that is temporally earlier or later. Rather than always taking the difference between pixels at the same relative position within the frame as the encoding target macroblock, block matching is performed to search for the position at which the difference value is smallest.

Information indicating which position in the reference image was used to take the difference from the encoding target macroblock is encoded separately from the difference values, in the form of a motion vector. This encoding method is called inter-frame prediction. In other words, it is a method of encoding by predicting how an object in the encoding target macroblock has moved on the screen.

When a fast-moving subject is encoded, the encoding efficiency depends on the range over which block matching is performed. If the subject in the encoding target macroblock lies outside the block matching search range in the reference image, optimal inter-frame prediction becomes difficult. In that case, inter-frame prediction is performed against a subject different from the actual one (a portion where the difference value happens to be relatively small), or inter-frame prediction is not performed at all. Widening the block matching range raises the likelihood of detecting the position where the difference from the reference pixels is smallest, but it also raises the computational cost. Specifically, a wider search range unavoidably increases both the number of block matching calculations and the memory needed to hold the pixel data of the search range.
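The cost growth described above can be made concrete with a rough count. The following is a minimal sketch, assuming an exhaustive search in which a 16 × 16 macroblock is compared at every integer position inside a square window; the function name `full_search_cost` and the window sizes are illustrative and are not taken from the patent.

```python
def full_search_cost(window, block=16):
    """Return (candidate positions, absolute-difference operations) for an
    exhaustive search of a block x block macroblock inside a window x window area.

    Assumption: every integer position is tested and each test touches
    every pixel of the macroblock once.
    """
    if window < block:
        raise ValueError("window must be at least as large as the block")
    positions = (window - block + 1) ** 2
    return positions, positions * block * block


for window in (32, 64, 128):
    positions, operations = full_search_cost(window)
    print(window, positions, operations)
# 32 289 73984
# 64 2401 614656
# 128 12769 3268864
```

Under these assumptions, doubling the window side from 32 to 64 pixels multiplies the number of candidate positions by more than eight, which is the pressure on computation and memory that the text refers to.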

Consequently, the search range used for inter-frame prediction has to be changed according to the product to which the encoding is applied. For example, in a product such as a video camera, a large search range increases power consumption, so that long recording times cannot be supported. In addition, the amount of reference image data to be transferred becomes enormous, and the memory bandwidth becomes insufficient.

For this reason, it is common either to narrow the search range in consideration of power consumption and bus transfer capacity, or to perform coarse block matching over a wide search range. For the reasons given above, however, these approaches do not necessarily yield an optimal prediction value.

To address this, a method is known in which the reference image is converted into a plurality of resolutions and the search range is narrowed down gradually, starting from the low-resolution layer (see Patent Document 1). This technique is said to allow efficient inter-frame prediction without narrowing the search range.

Specifically, in this method, a plurality of motion vectors at local minima are detected in a low-resolution layer and used as initial motion vector search values for the next layer. For example, an image reduced to 1/64 resolution is first prepared and searched to obtain initial search values; initial search values are then obtained in the same way for an image reduced to 1/32 resolution, and so on, so that the resolution is changed gradually. Lowering the resolution in this way reduces both the image memory and the number of operations.
Patent Document 1: Japanese Unexamined Patent Publication No. 7-154801

However, the method of narrowing the search range using reduced-resolution images as described above is effective when narrowing down from a fairly wide range, but it is known that the prediction accuracy deteriorates as the range becomes narrower. Therefore, once the range has been narrowed to a certain extent, it is necessary to switch to the normal prediction method, which does not reduce the resolution. Furthermore, if the range is narrowed too far, an optimal prediction value may not be obtained.

The present invention has been made in view of these circumstances, and its object is to provide an image encoding device, an image encoding method, and a program therefor that can derive an optimal prediction value while maintaining a wide search range and keeping the computational cost low.

The present invention has been made to solve the problem described above. It is a moving image encoding apparatus that uses an image of a frame temporally different from an encoding target image as a reference image, calculates a motion vector for each macroblock image included in the encoding target image, and performs inter-frame encoding processing based on the motion vector, the apparatus comprising:
reference image binarization means for binarizing the reference image to generate a reference binarized image;
reference resolution conversion means for converting the reference image to a low resolution to generate a reference resolution-converted image;
macroblock image binarization means for binarizing the macroblock image to generate a macroblock binarized image;
macroblock resolution conversion means for converting the macroblock image to a low resolution to generate a macroblock resolution-converted image;
first range determining means for determining a first range to be cut out from the reference binarized image, based on block matching between the reference resolution-converted image and the macroblock resolution-converted image;
first cut-out means for cutting out the image of the first range from the reference binarized image as a reference binarized block image; and
motion vector calculation means for calculating the motion vector for the macroblock image, based on block matching between the reference binarized block image and the macroblock binarized image.

According to the present invention, performing prediction with two kinds of images, a resolution-converted image and a binarized image, in the inter-frame prediction calculation enables efficient moving image encoding while reducing the required memory capacity.

Preferred embodiments of the present invention are described below with reference to the accompanying drawings.

<First Embodiment>
FIG. 1 is a block diagram showing a configuration example of a moving image encoding apparatus serving as an image processing apparatus according to the first embodiment of the present invention. The moving image encoding apparatus 100 performs moving image encoding by a method such as MPEG-2 or H.264.

In FIG. 1, the moving image encoding apparatus 100 includes the following components: an imaging unit 101 including a camera section such as a lens and a CCD, a motion vector detection unit 102, a motion compensation unit 108, a subtractor 103, a DCT (orthogonal transform) unit 104, and a quantization unit 105; a variable length encoding unit 106, a recording unit 107, a recording medium 109, a code amount control unit 110, and an inverse quantization unit 111; and an IDCT (inverse orthogonal transform) unit 112, an adder 113, a display unit 117, and a frame memory 118.

Furthermore, the operation of each component, including the code amount control unit 110, is controlled by a system controller 120. The system controller 120 governs the operation of the entire apparatus 100 and can also control the entire apparatus 100 in response to instructions given by a user through an operation unit 121.

A series of image signals obtained by imaging a subject with the imaging unit 101, which serves as the imaging means in the present invention, is stored sequentially in the frame memory 118 in the order of the first frame, the second frame, the third frame, and so on. The frame memory 118 outputs the image data in the order in which encoding is performed, for example the third frame, the first frame, the second frame, and so on. The encoding methods in this embodiment include "intra coding", which encodes using only the image data within a frame, and "inter coding", which encodes using inter-frame prediction as well.

A picture subjected to intra coding is called an I picture. The encoding order differs from the frame input order in order to allow prediction from temporally future frames (backward prediction). There are two kinds of encoding target images (pictures) subjected to inter coding (inter-frame encoding processing). The first is the P picture, in which forward prediction from one reference frame is performed for each unit of motion compensation (macroblock). The second is the B picture, in which bidirectional prediction from up to two reference frames is performed for each macroblock. Thus, this embodiment describes an invention in which moving image encoding is performed using first, second, and third types of image data, namely I pictures, P pictures, and B pictures.

When intra coding is performed, image data divided into blocks serving as coding units is read from the frame memory 118 and orthogonally transformed by the orthogonal transform unit (DCT) 104. In this embodiment, the orthogonal transform is a DCT. The transform coefficients output from the orthogonal transform unit 104 are quantized by the quantization unit (Q) 105. The quantized transform coefficients output from the quantization unit 105 are variable-length coded by the variable length encoding unit 106, after which the recording unit 107 generates a recording signal and records it on the recording medium 109.

The quantization coefficient used in the quantization unit 105 is calculated by the code amount control unit 110 from, for example, feedback of the code amount generated by the variable length encoding unit (VLC) 106. The quantized transform coefficients output from the quantization unit 105 are also inversely quantized by the inverse quantization unit (IQ) 111 and then inversely orthogonally transformed by the inverse orthogonal transform unit (IDCT) 112 to yield a decoded image signal, which is stored in the frame memory 118.

When inter coding is performed, on the other hand, image data divided into macroblocks serving as coding units is read from the frame memory 118 and input to the motion vector detection unit 102. At the same time, the motion vector detection unit 102 reads a reference image, which is a frame image at a different time, from the frame memory 118 and detects a motion vector using the image to be encoded and the reference image.

The motion compensation unit 108 performs motion compensation according to the motion vector and generates a predicted image. The subtractor 103 calculates the difference between the image to be encoded and the predicted image, generating a difference image. The difference image is output to the orthogonal transform unit 104 and orthogonally transformed. The processing from the orthogonal transform unit 104 onward is the same as in the intra coding case described above, so its description is omitted.
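The relationship between the motion vector, the predicted image, and the difference image can be sketched as follows. This is only an illustration under stated assumptions: images are lists of rows, positions and vectors are (row, column) pairs, the vector is assumed to keep the block inside the reference frame, and the helper names `predict_block` and `difference_block` are hypothetical rather than taken from the patent.

```python
def predict_block(reference, mb_pos, mv, size=16):
    """Cut the size x size block that the motion vector points to out of the reference frame.

    Assumption: mb_pos + mv stays inside the reference frame (no bounds handling).
    """
    top, left = mb_pos[0] + mv[0], mb_pos[1] + mv[1]
    return [row[left:left + size] for row in reference[top:top + size]]


def difference_block(current, predicted):
    """Pixel-wise difference, as computed by a subtractor."""
    return [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(current, predicted)]


# Tiny example with a 2 x 2 "macroblock" for readability.
reference = [[r * 10 + c for c in range(6)] for r in range(6)]
current = [[32, 33], [42, 43]]          # block located at (2, 2) in the current frame
predicted = predict_block(reference, (2, 2), (1, -1), size=2)
print(predicted)                         # [[31, 32], [41, 42]]
print(difference_block(current, predicted))  # [[1, 1], [1, 1]]
```

A good motion vector leaves small difference values, which is what makes the subsequent transform and quantization stages efficient.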

Next, the operation of the motion vector detection unit 102 is described in detail. FIG. 2 shows an example of a specific configuration of the motion vector detection unit 102.

The components in FIG. 2 are as follows. Reference numeral 200 denotes a reference image read from the frame memory 118. Reference numeral 201 denotes a macroblock (hereinafter, MB) image cut out from the encoding target image at a predetermined size (for example, 16 × 16 pixels). Reference numeral 202 denotes a reference image binarization unit that binarizes the reference image with an arbitrary threshold to generate a reference binarized image, and 203 denotes a reference image resolution conversion unit that generates a reduced-resolution version of the reference image. Reference numeral 204 denotes an MB resolution conversion unit (macroblock image resolution conversion unit) that generates a reduced-resolution MB image (macroblock resolution-converted image) from the encoding target MB image. Reference numeral 205 denotes an MB image binarization unit (macroblock image binarization unit) that binarizes the encoding target MB image with the same arbitrary threshold to generate an MB binarized image (macroblock binarized image).

Reference numeral 206 denotes a search range limiting unit (first range determination unit). It narrows the search range by performing block matching between the reference image obtained from the reference image resolution conversion unit 203 and the encoding target MB obtained from the MB image resolution conversion unit 204. Based on the result, the search range limiting unit 206 determines the range (first range) to be cut out from the reference binarized image.

Reference numeral 207 denotes a binary image cut-out unit (first cut-out unit) that cuts out a reference binarized block image from the reference binarized image generated by the reference image binarization unit 202, based on the range determined by the narrowing performed in the search range limiting unit 206.

Reference numeral 208 denotes a binary image search range limiting unit (second range determination unit). It first narrows the search range by performing block matching between the reference binarized block image and the MB binarized image of the encoding target MB supplied from the MB image binarization unit 205. Based on the result, the binary image search range limiting unit 208 determines the range (second range) to be cut out from the reference image 200.

Reference numeral 209 denotes a reference image cut-out unit (second cut-out unit) that cuts out a reference block image from the reference image 200, based on the range determined by the narrowing performed in the binary image search range limiting unit 208. Reference numeral 210 denotes a prediction generation unit that calculates a motion vector by performing block matching between the encoding target MB image 201 and the reference block image. Reference numeral 211 denotes the predicted value, in the form of a motion vector, generated and output by the prediction generation unit 210.

The operation of the motion vector detection unit 102 in the first embodiment, based on the above configuration, is described with additional reference to FIG. 3. FIG. 3 shows an example of the macroblock sizes and search range sizes in this embodiment.

The reference image 200 is binarized by the reference image binarization unit 202 with its resolution unchanged. Consider an 8-bit image with 256 gradations: if the threshold is set to, for example, 125, pixels with values from 0 to 124 take the value 0, and pixels with values from 125 to 255 take the value 255. This binarization reduces the data amount to 1/8. This binarization method is only an example; binarization with a different threshold or with a filter is also conceivable, and the binarization method itself is not limited.
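The thresholding rule just described can be sketched in a few lines. This is a minimal illustration, assuming images are lists of rows of 8-bit values; the function names are hypothetical, and the 1/8 figure is reproduced under the assumption that each binarized pixel is stored as a single packed bit.

```python
def binarize(image, threshold=125):
    """Map pixels below the threshold to 0 and all others to 255."""
    return [[255 if pixel >= threshold else 0 for pixel in row] for row in image]


def packed_size_bytes(width, height, bits_per_pixel):
    """Storage needed when pixels are packed at the given bit depth (assumption)."""
    return width * height * bits_per_pixel // 8


print(binarize([[0, 124, 125, 255]]))      # [[0, 0, 255, 255]]
print(packed_size_bytes(32, 32, 8))        # 1024
print(packed_size_bytes(32, 32, 1))        # 128
```

For a 32 × 32 area, the packed binary form needs 128 bytes instead of 1024, which matches the 1/8 reduction stated in the text.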

At the same time, the reference image 200 is converted to an arbitrary resolution by the reference image resolution conversion unit 203. The resolution conversion keeps the gradation unchanged and only lowers the resolution (reduces the number of pixels), which also reduces the amount of image information. As a conversion method, pixels may, for example, simply be thinned out; the conversion method itself is not particularly limited. For example, when the image is converted to 1/64 resolution (1/8 horizontally, 1/8 vertically), the amount of information is also reduced to 1/64. An image generated by resolution conversion in this way is called a "hierarchical image" in this embodiment. In this embodiment, however, resolution conversion is performed in only one stage, so there is only one layer.
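Simple pixel thinning, the example conversion mentioned above, can be sketched as follows. The function name `decimate` is hypothetical, and keeping the top-left pixel of every 8 × 8 cell is an assumption; the text does not specify which pixel is kept or whether any filtering precedes the thinning.

```python
def decimate(image, factor=8):
    """Keep every factor-th pixel in both directions (no filtering)."""
    return [row[::factor] for row in image[::factor]]


# A 16 x 16 block whose pixel value encodes its own position.
block = [[y * 16 + x for x in range(16)] for y in range(16)]
small = decimate(block)
print(small)                                    # [[0, 8], [128, 136]]
print(len(block) * len(block[0]) // (len(small) * len(small[0])))  # 64
```

With a factor of 8 in each direction, a 16 × 16 block shrinks to 2 × 2 pixels, that is, to 1/64 of the original pixel count.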

Similarly, the encoding target MB image 201 is binarized by the MB image binarization unit 205 using the same method as above, and is resolution-converted by the MB image resolution conversion unit 204 at the same resolution conversion ratio as above. In the following description, the resolution conversion ratio in this embodiment is taken to be 1/64 as an example. With 1/64 resolution conversion, a 16 × 16 pixel macroblock is reduced to a size of 2 × 2 pixels. The resulting macroblock size is shown at 301 in FIG. 3.

Next, the search range limiting unit 206 narrows the search range by performing block matching between the reference resolution-converted image generated by the reference image resolution conversion unit 203 and the MB resolution-converted image generated by the MB image resolution conversion unit 204. The range over which block matching is performed in the reference resolution-converted image can be, for example, the 8 × 8 pixel area 302 in FIG. 3. Because the resolution conversion ratio is 1/64, the area 302 corresponds to a 64 × 64 pixel area at the original resolution of the reference image. The area 302 is set based on the position in the reference image that corresponds to the macroblock position in the encoding target image. The size of the area 302 used for block matching is only an example and is not limited to the above.

The narrowing of the search range in the search range limiting unit 206 can be performed, for example, as follows. The unit searches the reference resolution-converted image for the area most strongly correlated with the MB resolution-converted image and outputs a search range of a predetermined size based on the position of that area. This search range can be, for example, a 4 × 4 pixel range around the center of the most strongly correlated area, but this is only an example, and the range may be narrowed to other sizes. The strength of the correlation can be judged using, for example, the value obtained by accumulating the absolute differences of each pixel between the reference resolution-converted image and the MB resolution-converted image (MAE); the center position of the area where the MAE is smallest within the detection range is determined.
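The search just described can be sketched as an exhaustive comparison over the low-resolution area. This is a minimal illustration, not the patent's implementation: images are lists of rows, the helper names are hypothetical, ties are broken toward the smallest row and then column (an assumption), and "MAE" follows the text's usage, meaning the accumulated absolute difference.

```python
def mae(area, block, top, left):
    """Accumulated absolute difference between block and the window of area at (top, left)."""
    return sum(
        abs(area[top + y][left + x] - block[y][x])
        for y in range(len(block))
        for x in range(len(block[0]))
    )


def best_match(area, block):
    """Return (mae, top, left) of the window with the smallest accumulated difference."""
    rows = len(area) - len(block) + 1
    cols = len(area[0]) - len(block[0]) + 1
    return min((mae(area, block, t, l), t, l) for t in range(rows) for l in range(cols))


# 8 x 8 low-resolution search area containing the 2 x 2 MB pattern at row 3, column 5.
area = [[0] * 8 for _ in range(8)]
block = [[10, 20], [30, 40]]
for y in range(2):
    for x in range(2):
        area[3 + y][5 + x] = block[y][x]

cost, top, left = best_match(area, block)
center = (top + len(block) // 2, left + len(block[0]) // 2)
print(cost, (top, left), center)   # 0 (3, 5) (4, 6)
```

The returned center is what a range of predetermined size, such as the 4 × 4 pixel example in the text, would then be placed around.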

The range narrowed down by the search range limiting unit 206 is passed to the binary image cut-out unit 207. The information passed from the search range limiting unit 206 includes, for example, the size to be cut out of the binarized reference image and the center position of the cut-out area. Based on the resolution conversion ratio of 1/64, the 4 × 4 pixel range corresponds to a cut-out size of 32 × 32 pixels. If the cut-out size is fixed, only the center position information needs to be passed.
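The conversion from the low-resolution range to a cut-out in the full-resolution binarized image can be sketched as below. The text states only that 4 × 4 pixels become 32 × 32 pixels; mapping the center by multiplying its coordinates by 8 and shifting the cut-out to stay inside the image are assumptions made for this illustration, and the function names are hypothetical.

```python
def to_full_resolution(center, size, factor=8):
    """Scale a (row, col) center and a square size from the reduced image to the original resolution."""
    return (center[0] * factor, center[1] * factor), size * factor


def cut_out(image, center, size):
    """Cut a size x size block around center, shifted as needed to stay inside the image."""
    top = min(max(center[0] - size // 2, 0), len(image) - size)
    left = min(max(center[1] - size // 2, 0), len(image[0]) - size)
    return [row[left:left + size] for row in image[top:top + size]], (top, left)


binary_reference = [[0] * 64 for _ in range(64)]   # stand-in for the binarized reference
center, size = to_full_resolution((4, 6), 4)
block, origin = cut_out(binary_reference, center, size)
print(center, size, origin, len(block), len(block[0]))
# (32, 48) 32 (16, 32) 32 32
```

Returning the origin of the cut-out alongside the pixels keeps later stages able to convert positions inside the block back to reference-frame coordinates.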

The binary image cut-out unit 207 cuts out, from the binarized reference image, the reference binarized block image of the area corresponding to that range. The reference binarized block image is as shown, for example, at 304 in FIG. 3 and has a size of 32 × 32 pixels. The resolution of the image 304 is the original resolution of the encoding target image, but its pixel values are expressed in binary.

Next, the binary image search range limiting unit 208 narrows the search range in the binarized image by performing block matching between the reference binarized block image generated by the binary image cut-out unit 207 and the MB binarized image generated by the MB image binarization unit 205. Reference numeral 303 in FIG. 3 denotes the MB binarized image.

This narrowing of the search range can be performed, for example, as follows. The unit searches the reference binarized block image for the area most strongly correlated with the MB binarized image and outputs a search range of a predetermined size based on the position of that area. This search range can be, for example, an 18 × 18 pixel range around the center of the most strongly correlated area, but this is only an example, and the range may be narrowed to other sizes. The strength of the correlation can be judged using, for example, the MAE obtained by accumulating the absolute differences of each pixel between the reference binarized block image 304 and the MB binarized image 303; the center position of the area where the MAE is smallest within the detection range is determined.
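For 0/255 images the accumulated absolute difference is simply 255 times the number of mismatching pixels, so the matching can be done on packed bits. The sketch below assumes each row is packed into an integer with one bit per pixel and counts mismatches with XOR; this packing and the helper names are illustrative choices, not something the patent specifies.

```python
def pack_rows(binary_image):
    """Pack each row of a 0/255 image into an int (leftmost pixel = most significant bit)."""
    rows = []
    for row in binary_image:
        bits = 0
        for pixel in row:
            bits = (bits << 1) | (1 if pixel else 0)
        rows.append(bits)
    return rows


def mismatches(ref_rows, ref_width, mb_rows, mb_width, top, left):
    """Count differing pixels between the MB and the reference window at (top, left)."""
    shift = ref_width - left - mb_width      # moves the window's rightmost pixel to bit 0
    mask = (1 << mb_width) - 1
    return sum(
        bin(((ref_rows[top + y] >> shift) & mask) ^ mb_rows[y]).count("1")
        for y in range(len(mb_rows))
    )


def best_binary_match(ref, mb):
    """Return (mismatch count, top, left) of the best window in the binarized reference block."""
    ref_rows, mb_rows = pack_rows(ref), pack_rows(mb)
    ref_w, mb_w = len(ref[0]), len(mb[0])
    return min(
        (mismatches(ref_rows, ref_w, mb_rows, mb_w, t, l), t, l)
        for t in range(len(ref) - len(mb) + 1)
        for l in range(ref_w - mb_w + 1)
    )


ref = [[0] * 6 for _ in range(6)]
mb = [[255, 0], [0, 255]]
ref[2][3], ref[3][4] = 255, 255      # place the MB pattern at row 2, column 3
print(best_binary_match(ref, mb))    # (0, 2, 3)
```

Each row comparison then costs one XOR and a bit count instead of one subtraction per pixel, which is one way the reduced data amount of the binary image can translate into cheaper matching.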

The range narrowed down by the binary image search range limiting unit 208 is passed to the reference image cut-out unit 209. The information passed from the binary image search range limiting unit 208 includes, for example, the size to be cut out of the reference image 200 and the center position of the cut-out area. Here, the cut-out size is 18 × 18 pixels. If the cut-out size is fixed, only the center position information needs to be passed.

The reference image cut-out unit 209 cuts out the area corresponding to that range from the reference image 200 as a reference block image. The reference block image is as shown, for example, at 305 in FIG. 3 and has a size of 18 × 18 pixels. The resolution of the reference block image 305 is the original resolution of the encoding target image, and its pixel values are expressed with 256 gradations (8 bits).

The prediction generation unit 210 performs block matching between the reference block image 306 and the MB image 201. Specifically, it searches the reference block image 306 for the area most strongly correlated with the MB image 201, determines the center position of that area, and determines the motion vector based on that center position and the position of the MB image 201 in the encoding target image. The strength of the correlation can be judged using, for example, the MAE obtained by accumulating the absolute differences of each pixel between the reference block image 306 and the MB image 201; the center position of the area where the MAE is smallest within the detection range is determined. In this way, the motion vector is finally obtained as the predicted value 211 and output to the motion compensation unit 108.
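The final step can be sketched as a coordinate calculation. The text works with center positions; the sketch below uses top-left corners instead, which gives the same vector because the matched area and the macroblock have the same size. The function names, the (row, column) convention, and the sample coordinates are assumptions for illustration.

```python
def candidate_positions(area_size=18, block_size=16):
    """Number of integer positions a block can take inside a square search area."""
    return (area_size - block_size + 1) ** 2


def motion_vector(mb_pos, area_origin, match_offset):
    """Displacement (d_row, d_col) from the MB position to the matched block.

    mb_pos:       top-left of the MB in the encoding target image
    area_origin:  top-left of the cut-out reference block in the reference image
    match_offset: top-left of the best match inside the cut-out reference block
    """
    return (area_origin[0] + match_offset[0] - mb_pos[0],
            area_origin[1] + match_offset[1] - mb_pos[1])


print(candidate_positions())                       # 9
print(motion_vector((48, 80), (50, 77), (1, 2)))   # (3, -1)
```

With an 18 × 18 reference block and a 16 × 16 macroblock, only nine full-resolution, full-gradation comparisons remain at this stage.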

A characteristic of this embodiment is that the range determined by narrowing the search range becomes smaller, in terms of the original resolution of the reference image, each time processing passes through the search range limiting unit 206 and the binary image search range limiting unit 208. For example, as shown in FIG. 3, the area 302 is a 64 × 64 pixel area at the original resolution, whereas the area 304 narrowed down by the search range limiting unit 206 is smaller, at 32 × 32 pixels. The area 305 narrowed down by the binary image search range limiting unit 208 is smaller still, at 18 × 18 pixels.
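The effect of this staged narrowing on the amount of reference data held per macroblock can be estimated with simple arithmetic. The sketch below uses the example sizes from the text and assumes that binary pixels are packed at 1 bit each; it counts only the search-area pixels at each stage and ignores the macroblock data and the cost of producing the converted images.

```python
def size_bytes(width, height, bits_per_pixel):
    """Bytes needed for a width x height area at the given bit depth (packed)."""
    return width * height * bits_per_pixel // 8


# (description, width, height, bits per pixel) for the three stages in FIG. 3
stages = [
    ("area 302, resolution-converted (covers 64 x 64)", 8, 8, 8),
    ("area 304, binarized", 32, 32, 1),
    ("area 305, original resolution and gradation", 18, 18, 8),
]

sizes = [size_bytes(w, h, bits) for _, w, h, bits in stages]
total = sum(sizes)
direct = size_bytes(64, 64, 8)   # holding the whole 64 x 64 area at 8 bits

print(sizes)     # [64, 128, 324]
print(total)     # 516
print(direct)    # 4096
```

Under these assumptions the three stages together touch roughly one eighth of the data that a direct 64 × 64 full-gradation search area would require.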

Note that FIG. 2 shows only one configuration example of the motion vector detection unit 102. For example, the processing in the prediction generation unit 210 may be omitted, and the binary image search range limiting unit 208 may instead calculate the motion vector based on block matching between the reference binarized image and the MB binarized image. The motion vector is calculated in the same way as described for the prediction generation unit 210.

For the same amount of memory, lowering the resolution makes it possible to search a wider range, but the low resolution makes a detailed search difficult. Using a binary image (an image with reduced gradation) also reduces the amount of data, but with only 1 bit of information per pixel, false detections become more likely when it is applied over a wide range.

In contrast, this embodiment searches a wide area at low resolution and, once the range has been narrowed to some extent, narrows it further at the original resolution using a binary image to keep the amount of data small. This makes it possible to efficiently narrow down the range that is searched at the original resolution and gradation.
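To illustrate why a binary image keeps the data volume small, the sketch below reduces each pixel to 1 bit and compares blocks by counting differing bits. The fixed-threshold binarization rule and the function names are assumptions for illustration; this passage does not state how the binarization units derive their output.

```python
def binarize(image, threshold):
    """Reduce each pixel to 1 bit: 1 if it is at or above threshold, else 0.
    (A fixed threshold is an assumed rule, chosen only for this example.)"""
    return [[1 if p >= threshold else 0 for p in row] for row in image]


def mismatch_count(ref_bits, mb_bits, top, left):
    """Count the pixels where the two 1-bit images differ (XOR), for the
    window of ref_bits whose top-left corner is (top, left)."""
    return sum(
        ref_bits[top + y][left + x] ^ mb_bits[y][x]
        for y in range(len(mb_bits))
        for x in range(len(mb_bits[0]))
    )
```

With 1-bit pixels the matching cost is a simple count of mismatches, which is cheap to compute but carries less information than a full-gradation comparison. This is consistent with the text's choice to use it only after the range has been narrowed.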

As described above, it is possible to detect an optimal motion vector while maintaining a wide search range and keeping down both the memory transfer volume for the reference image and the computation cost.

<Second Embodiment>
Next, a second embodiment of the invention will be described with reference to FIGS. 4 and 5. In the first embodiment described above, there was only one resolution conversion layer, that is, a single resolution conversion rate. This embodiment describes an example with multiple (n) layers that use multiple (n) resolution conversion rates.

In FIG. 4, the components with reference numbers 200 to 211 correspond to the components with the same reference numbers in FIG. 2. 400a is the first-layer reference image resolution conversion unit, 400b is the second-layer reference image resolution conversion unit, and 400c is the n-th layer reference image resolution conversion unit. Similarly, 401a is the first-layer MB image resolution conversion unit, 401b is the second-layer MB image resolution conversion unit, and 401c is the n-th layer MB image resolution conversion unit. 402a is the first-layer search range limiting unit, 402b is the second-layer search range limiting unit, and 402c is the n-th layer search range limiting unit.

Among 400a to 400c and 401a to 401c, a smaller layer number indicates a higher resolution (a larger resolution conversion rate). Even in the first layer, however, the image is not converted to a resolution higher than the original resolution of the encoding target image or the reference image. Here, "n" in "n-th layer" is a positive integer. The reference image 200 is converted to a predetermined resolution by the reference image resolution conversion units 400a to 400c, using the resolution conversion rate specified for each layer.

FIG. 4 is drawn as though n must be 3 or more, but this is only an example; in practice n need only be 2 or more. Narrowing of the search range starts from the n-th layer, that is, the lowest-resolution layer. Block matching is performed between the outputs of the n-th layer reference image resolution conversion unit 400c and the n-th layer MB image resolution conversion unit 401c, and the n-th layer search range limiting unit 402c narrows the search range at the n-th layer.

In the first embodiment, this result was sent to the binary image cutout unit 207; in this embodiment, it is passed to the reference image resolution conversion unit of the next layer, that is, the (n-1)-th layer. The (n-1)-th layer reference image resolution conversion unit converts the resolution of the reference image for the search range obtained by the narrowing and outputs the result to the (n-1)-th layer search range limiting unit. The (n-1)-th layer search range limiting unit uses this image and the MB image of the same resolution to narrow the search range further, and the result is passed on to the (n-2)-th layer. This is repeated, narrowing the range layer by layer down to the first layer.

Note that the processing in the search range limiting unit 402 of each layer is the same as that of the search range limiting unit 206 of the first embodiment, apart from the difference in resolution.

The first-layer search range limiting unit 402a performs block matching between the reference resolution converted image generated by the first-layer reference image resolution conversion unit 400a and the MB resolution converted image generated by the first-layer MB image resolution conversion unit 401a. The result of this block matching is sent to the binary image cutout unit 207. Subsequent operation is the same as in the first embodiment.
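The layer-by-layer narrowing from the n-th layer down to the first layer can be sketched as a loop, shown below under stated assumptions. The box filter, the per-side downsampling factors, the `margin` parameter that pads the best match to form the next range, and all function names are hypothetical choices made for this example and are not taken from the embodiment.

```python
def downsample(image, factor):
    """Shrink an image by averaging each factor x factor tile (box filter)."""
    rows, cols = len(image) // factor, len(image[0]) // factor
    return [
        [
            sum(
                image[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ) / (factor * factor)
            for x in range(cols)
        ]
        for y in range(rows)
    ]


def best_match(ref, mb):
    """Return (top, left) of the window in ref with the smallest accumulated
    absolute difference from mb."""
    h, w = len(mb), len(mb[0])
    best_pos, best_cost = None, None
    for top in range(len(ref) - h + 1):
        for left in range(len(ref[0]) - w + 1):
            cost = sum(
                abs(ref[top + y][left + x] - mb[y][x])
                for y in range(h)
                for x in range(w)
            )
            if best_cost is None or cost < best_cost:
                best_pos, best_cost = (top, left), cost
    return best_pos


def hierarchical_narrowing(ref, mb, factors, margin):
    """Narrow the search range layer by layer, coarsest layer first.

    factors: per-side downsampling factor of each layer, e.g. [4, 2].
    margin:  pixels (at the layer's resolution) kept around the best match.
    Returns (top, left, bottom, right) in original-resolution coordinates.
    """
    top, left, bottom, right = 0, 0, len(ref), len(ref[0])
    for factor in factors:
        # Each layer converts the current range of the ORIGINAL reference image.
        region = [row[left:right] for row in ref[top:bottom]]
        small_ref = downsample(region, factor)
        small_mb = downsample(mb, factor)
        my, mx = best_match(small_ref, small_mb)
        mb_h, mb_w = len(small_mb), len(small_mb[0])
        # Map the padded match back to original coordinates, clipped to the
        # current range so the range can only shrink.
        new_top = max(top, top + (my - margin) * factor)
        new_left = max(left, left + (mx - margin) * factor)
        new_bottom = min(bottom, top + (my + mb_h + margin) * factor)
        new_right = min(right, left + (mx + mb_w + margin) * factor)
        top, left, bottom, right = new_top, new_left, new_bottom, new_right
    return top, left, bottom, right
```

The range returned by the last layer would then be handed to the binary-image stage. Note that this sketch expresses each layer by a per-side factor, whereas the rates quoted for FIG. 5 (1/64 and 1/16) appear to be ratios of pixel counts.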

Next, the hierarchical search procedure of this embodiment will be described with reference to FIG. 5. FIG. 5 shows an example of the relationship between the reference resolution converted image and the MB resolution converted image in the search range limiting units when n = 2, that is, when a two-level resolution hierarchy is set.

In FIG. 5, the resolution conversion rate of the second layer is 1/64 (the second resolution conversion rate) and that of the first layer is 1/16 (the first resolution conversion rate). This corresponds to the case of n = 2 in FIG. 4.

In FIG. 5, 501 denotes the MB resolution converted image (second macroblock resolution converted image) in the second layer, and 502 denotes the reference resolution converted image (second reference resolution converted image) in the second layer. The MB resolution converted image 501 is a 16 × 16 pixel macroblock image converted to 1/64 resolution and has a size of 2 × 2 pixels. The reference resolution converted image 502 is a 64 × 64 pixel region converted to 1/64 resolution and has a size of 8 × 8 pixels.

Using the MB resolution converted image 501 and the reference resolution converted image 502 described above, the second-layer search range limiting unit (third range determining unit) 402b narrows the search range (third range). One possible way to set the resulting range is to set a range of a predetermined size (for example, 6 × 6 pixels) based on the center position of the region where the MAE is smallest. In this case, the search range in the next layer has a constant size, which makes hardware implementation easy. This example is described below.
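A fixed-size range around the best match could be placed as in the following sketch. The clamping at the image border and the use of an integer center coordinate are assumptions; the text only states that a range of predetermined size is set based on the center position.

```python
def fixed_range(center_y, center_x, size, height, width):
    """Return (top, left) of a size x size window centered as closely as
    possible on (center_y, center_x), shifted so that it stays inside a
    height x width image (border handling is an assumed policy)."""
    top = min(max(center_y - size // 2, 0), height - size)
    left = min(max(center_x - size // 2, 0), width - size)
    return top, left
```

For instance, in an 8 × 8 reference resolution converted image, a 6 × 6 window centered on (4, 5) would start at (1, 2). Because the window size never changes, the amount of data passed to the next layer is constant, which is the property the text associates with easy hardware implementation.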

It is also possible to set the range so that it includes every region where the MAE is at or below a certain threshold, or where the correlation between the matching targets is at or above a certain threshold. The range can also be set to include an arbitrary number of regions in ascending order of MAE, or to include regions that have at least a certain degree of correlation with the macroblock. In these cases the search range in the next layer is variable in size, but a more accurate search range can be determined. The search range for the next layer is limited in this way; whichever method is used, the search range in the next layer is smaller than the search range in the current layer. Other details of the narrowing process have already been described in the first embodiment and are omitted here.
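The threshold-based alternative can be sketched as a bounding box over all candidate positions whose cost is at or below the threshold. The cost-map layout (`costs[top][left]`), the bounding-box policy, and the function name are assumptions made for this illustration.

```python
def threshold_range(costs, block_h, block_w, threshold):
    """costs[top][left] is the matching cost of the block placed at (top, left).
    Return (top, left, bottom, right) covering every block whose cost is at or
    below threshold, or None if no position qualifies."""
    hits = [
        (t, l)
        for t, row in enumerate(costs)
        for l, cost in enumerate(row)
        if cost <= threshold
    ]
    if not hits:
        return None
    top = min(t for t, _ in hits)
    left = min(l for _, l in hits)
    # Extend by the block size so each qualifying block is fully covered.
    bottom = max(t for t, _ in hits) + block_h
    right = max(l for _, l in hits) + block_w
    return top, left, bottom, right
```

Unlike the fixed-size approach, the size of the returned range depends on the image content, which matches the text's remark that the next layer's search range becomes variable.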

Based on the search range obtained from the second-layer search range limiting unit 402b, the first-layer reference image resolution conversion unit 400a cuts out a predetermined region from the reference image, converts its resolution, and outputs the resulting block image to the first-layer search range limiting unit 402a. The search range obtained from the second-layer search range limiting unit 402b is a 48 × 48 pixel region in terms of the original resolution, and it is converted at a resolution conversion rate of 1/16. A first-layer reference resolution converted image of 12 × 12 pixels is thus obtained.

In FIG. 5, 503 denotes the MB resolution converted image (first macroblock resolution converted image) in the first layer, and 504 denotes the reference resolution converted image (first reference resolution converted image) in the first layer. The MB resolution converted image 503 is a 16 × 16 pixel macroblock image converted to 1/16 resolution and has a size of 4 × 4 pixels. The reference resolution converted image 504 has a size of 12 × 12 pixels, as described above.

Using the MB resolution converted image 503 and the reference resolution converted image 504 described above, the first-layer search range limiting unit 402a narrows the search range. The range may be narrowed to, for example, an 8 × 8 pixel search range, but this is only an example and the size of the search range is not limited to this. The details of the narrowing process have already been described in the first embodiment and are omitted here.

Based on the search range obtained from the first-layer search range limiting unit 402a, the binary image cutout unit 207 cuts out a predetermined region from the binarized reference image and outputs it to the binary image search range limiting unit 208. The search range obtained from the first-layer search range limiting unit 402a is a 32 × 32 pixel region in terms of the original resolution.
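The sizes quoted for FIG. 5 can be checked with a few lines of arithmetic. The sketch assumes that a resolution conversion rate of 1/64 or 1/16 refers to the ratio of pixel counts, so that each side shrinks by the square root of the denominator; this reading is inferred from the quoted sizes (16 × 16 becoming 2 × 2 at 1/64) and the function names are hypothetical.

```python
from math import isqrt


def converted_side(side, rate_denominator):
    """Side length of a square region after its pixel count is reduced to
    1/rate_denominator (assumes the denominator is a perfect square)."""
    per_side = isqrt(rate_denominator)
    if per_side * per_side != rate_denominator:
        raise ValueError("rate denominator must be a perfect square")
    return side // per_side


def original_side(side, rate_denominator):
    """Inverse mapping: side length at the original resolution."""
    return side * isqrt(rate_denominator)


# Second layer (1/64): macroblock 16 -> 2, reference region 64 -> 8,
# and the 6 x 6 narrowed range corresponds to 48 x 48 original pixels.
second_layer = (converted_side(16, 64), converted_side(64, 64), original_side(6, 64))
# First layer (1/16): macroblock 16 -> 4, 48 x 48 region -> 12,
# and the 8 x 8 narrowed range corresponds to 32 x 32 original pixels.
first_layer = (converted_side(16, 16), converted_side(48, 16), original_side(8, 16))
```

Under this reading, the range shrinks from 64 × 64 to 48 × 48 to 32 × 32 in original-resolution terms before the binary-image stage begins.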

As described above, according to this embodiment, making the resolution variable allows an optimal motion vector to be detected while maintaining a wide search range and keeping down both the memory transfer volume for the reference image and the computation cost.

<Third Embodiment>
Next, a third embodiment of the invention will be described with reference to FIG. 6. FIG. 6 shows an example of the configuration of the motion vector detection unit 102 according to the third embodiment of the invention.

In the second embodiment described above, each resolution conversion unit generated the image for its layer separately from the reference image at its original resolution. While this allows conversion to arbitrary resolutions, it requires a separate filter to prepare each resolution converted image, so an increase in computation cost is unavoidable. In this embodiment, therefore, the first-layer reference image resolution conversion unit 600a first performs the resolution conversion that retains the highest resolution, and the next layer performs its resolution conversion on that output. This is repeated up to the n-th layer, and each resolution conversion unit holds its resolution converted image in advance.

Keeping the resolution conversion ratio between layers constant removes the need to prepare separate filters; a single type of filter suffices. For example, if the first-layer reference image resolution conversion unit 600a converts to 1/2 resolution and the second-layer reference image resolution conversion unit 600b converts to 1/4 resolution, the only filter needed is one that converts the resolution to 1/2. In the same way, resolution conversions of 1/8, 1/16, and so on up to 1/2ⁿ can be performed as the layers progress. The MB image resolution conversion units 601a to 601c likewise perform resolution conversion using the same single filter. Since a macroblock contains 256 (16 × 16) pixels, resolution conversion is not performed beyond 1/256.
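Reusing one filter across layers can be sketched as follows, with each layer computed from the previous layer's output. The 2 × 2 box filter is an assumed stand-in for the single filter: it halves each side (so the pixel count falls to 1/4 per layer), which may differ from how the text counts a 1/2 step. Only the chaining of one filter is the point of the example.

```python
def halve(image):
    """Single shared filter: average each 2 x 2 tile (assumed box filter)."""
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) / 4
            for x in range(0, len(image[0]) - 1, 2)
        ]
        for y in range(0, len(image) - 1, 2)
    ]


def build_pyramid(image, layers):
    """Return [layer 1, layer 2, ..., layer n]; each layer is produced by
    applying the same filter to the previous layer's output."""
    pyramid = []
    current = image
    for _ in range(layers):
        current = halve(current)
        pyramid.append(current)
    return pyramid
```

Because every layer is stored, a later stage can cut its search block out of the already converted image instead of filtering the original reference image again.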

FIG. 6 is drawn as though n must be 3 or more, but this is only an example; in practice n need only be 2 or more. Narrowing of the search range starts from the n-th layer, that is, the lowest-resolution layer. Block matching is performed between the outputs of the n-th layer reference image resolution conversion unit 600c and the n-th layer MB image resolution conversion unit 601c, and the n-th layer search range limiting unit 402c narrows the search range at the n-th layer.

The result of this narrowing is passed to the reference image resolution conversion unit of the next layer, that is, the (n-1)-th layer. Based on the search range obtained by the narrowing, the (n-1)-th layer reference image resolution conversion unit cuts out a block image from the (n-1)-th layer reference resolution converted image and outputs it to the (n-1)-th layer search range limiting unit. The (n-1)-th layer search range limiting unit uses this image and the MB image of the same resolution to narrow the search range further, and the result is passed on to the (n-2)-th layer. This is repeated, narrowing the range layer by layer down to the first layer.

In FIG. 6, the components with reference numbers 200 to 211 correspond to the components with the same reference numbers in FIG. 2.

600a is the first-layer reference image resolution conversion unit, 600b is the second-layer reference image resolution conversion unit, and 600c is the n-th layer reference image resolution conversion unit. Similarly, 601a is the first-layer MB image resolution conversion unit, 601b is the second-layer MB image resolution conversion unit, and 601c is the n-th layer MB image resolution conversion unit. 402a is the first-layer search range limiting unit, 402b is the second-layer search range limiting unit, and 402c is the n-th layer search range limiting unit. The components with reference numbers 402a to 402c correspond to the components with the same reference numbers in FIG. 4.

As described above, each of the resolution conversion units 600a to 600c and 601a to 601c in FIG. 6 uses a common filter and performs its resolution conversion on the resolution conversion result of the resolution conversion unit one layer above it.

As described above, resolution conversion can be performed while keeping the increase in computation cost small.

<Other Embodiments>
In each of the embodiments above, the horizontal and vertical directions were described with the same scaling factor and number of pixels, but this is not a requirement; the horizontal and vertical scaling factors and pixel counts may differ.

In each of the embodiments described above, the processing in the components shown in FIGS. 1, 2, 4, and 6 is not limited to hardware. The functions may be realized by having a CPU (central processing unit) read a program for realizing the functions of each processing unit from memory and execute it.

The invention is also not limited to the configuration described above. All or part of the functions of the processing in the components shown in FIGS. 1, 2, 4, and 6 may be realized by dedicated hardware. The memory from which the CPU reads the program may be a non-volatile memory such as an HDD, a magneto-optical disk device, or a flash memory, or a read-only recording medium such as a CD-ROM. It may also be a volatile memory other than RAM, or a computer-readable and writable recording medium formed from a combination of these.

The term "computer-readable and writable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into computer systems. It also includes volatile memory (RAM) inside a computer system that serves as a server or client when the program is transmitted over a network such as the Internet or a communication line such as a telephone line. It further includes anything that holds the program for a certain period of time.

The program may be transmitted from a computer system that stores it in a storage device or the like to another computer system via a transmission medium, or by transmission waves in the transmission medium. Here, the "transmission medium" that carries the program refers to a medium capable of transmitting information, such as a network (communication network) like the Internet or a communication line such as a telephone line.

The program may realize only part of the functions described above. It may also be a so-called difference file (difference program), which realizes the functions described above in combination with a program already recorded in the computer system.

A program product such as a computer-readable recording medium on which the above program is recorded can also be applied as an embodiment of the present invention. The above program, recording medium, transmission medium, and program product are included in the scope of the present invention.

The embodiments of the present invention have been described in detail above with reference to the drawings. However, the specific configuration is not limited to these embodiments, and designs and the like that do not depart from the gist of the invention are also included.

FIG. 1 is a diagram showing an example of the configuration of an image encoding apparatus according to the first embodiment of the present invention.
FIG. 2 is a diagram showing an example of the configuration of the motion vector detection unit 102 according to the first embodiment of the present invention.
FIG. 3 is a diagram showing an example of the macroblock size and the search range sizes according to the first embodiment of the present invention.
FIG. 4 is a diagram showing an example of the configuration of the motion vector detection unit 102 according to the second embodiment of the present invention.
FIG. 5 is a diagram showing an example of the macroblock size and the search range sizes according to the second embodiment of the present invention.
FIG. 6 is a diagram showing an example of the configuration of the motion vector detection unit 102 according to the third embodiment of the present invention.

Explanation of symbols

200 Reference image
201 Macroblock image to be encoded
202 Reference image binarization unit
203 Reference image resolution conversion unit
204 MB image resolution conversion unit
205 MB image binarization unit
206 Search range limiting unit
207 Binary image cutout unit
208 Binary image search range limiting unit
209 Reference image cutout unit
210 Prediction generation unit
211 Predicted value

Claims

A moving image encoding apparatus that uses a frame image temporally different from an encoding target image as a reference image, calculates a motion vector for each macroblock image included in the encoding target image, and performs inter-frame encoding based on the motion vector, the apparatus comprising:
reference image binarization means for binarizing the reference image to generate a reference binarized image;
reference resolution conversion means for converting the reference image to a lower resolution to generate a reference resolution converted image;
macroblock image binarization means for binarizing the macroblock image to generate a macroblock binarized image;
macroblock resolution conversion means for converting the macroblock image to a lower resolution to generate a macroblock resolution converted image;
first range determining means for determining a first range to be cut out from the reference binarized image, based on block matching between the reference resolution converted image and the macroblock resolution converted image;
first cutout means for cutting out the image of the first range from the reference binarized image as a reference binarized block image; and
motion vector calculating means for calculating the motion vector for the macroblock image based on block matching between the reference binarized block image and the macroblock binarized image.

The moving image encoding apparatus according to claim 1, wherein the motion vector calculating means includes:
second range determining means for determining, based on block matching between the reference binarized block image and the macroblock binarized image, a second range to be cut out from the reference image, the second range being included in the first range and narrower than the first range; and
second cutout means for cutting out the image of the second range from the reference image as a reference block image,
and calculates the motion vector for the macroblock image based on block matching between the reference block image and the macroblock image.

The moving image encoding apparatus according to claim 1, wherein
the reference resolution conversion means includes:
first reference resolution conversion means for converting the reference image into a first reference resolution converted image having a first resolution lower than the resolution of the reference image, based on a first resolution conversion rate; and
second reference resolution conversion means for converting the reference image into a second reference resolution converted image having a second resolution lower than the first resolution, based on a second resolution conversion rate smaller than the first resolution conversion rate,
the macroblock resolution conversion means includes:
first macroblock resolution conversion means for converting the macroblock image into a first macroblock resolution converted image of the first resolution, based on the first resolution conversion rate; and
second macroblock resolution conversion means for converting the macroblock image into a second macroblock resolution converted image of the second resolution, based on the second resolution conversion rate,
the moving image encoding apparatus further comprises third range determining means for determining, based on block matching between the second reference resolution converted image and the second macroblock resolution converted image, a third range of the reference image that the first reference resolution conversion means is to convert to the first resolution,
the third range corresponds to a range wider than the first range in terms of the resolution of the reference image, and
the first range determining means determines the first range based on block matching between a reference block image of the first resolution, generated by the first reference resolution conversion means converting the resolution of the third range of the reference image, and the first macroblock resolution converted image.

The moving image encoding apparatus according to claim 1, wherein
the reference resolution conversion means includes:
first reference resolution conversion means for converting the reference image into a first reference resolution converted image having a first resolution lower than the resolution of the reference image, based on a first resolution conversion rate; and
second reference resolution conversion means for converting the first reference resolution converted image into a second reference resolution converted image having a second resolution lower than the first resolution, based on the first resolution conversion rate,
the macroblock resolution conversion means includes:
first macroblock resolution conversion means for converting the macroblock image into a first macroblock resolution converted image of the first resolution, based on the first resolution conversion rate; and
second macroblock resolution conversion means for converting the first macroblock resolution converted image into a second macroblock resolution converted image of the second resolution, based on the first resolution conversion rate,
the moving image encoding apparatus further comprises third range determining means for determining a third range to be cut out from the first reference resolution converted image, based on block matching between the second reference resolution converted image and the second macroblock resolution converted image,
the third range corresponds to a range wider than the first range in terms of the resolution of the reference image, and
the first range determining means determines the first range based on block matching between a reference block image of the first resolution, cut out from the third range of the first reference resolution converted image by the first reference resolution conversion means, and the first macroblock resolution converted image.

At least one of the first to third ranges is determined as a predetermined range that includes the center of the area judged to have the strongest correlation between the matching targets in the block matching. The moving image encoding apparatus according to claim 3 or 4.

At least one of the first to third ranges is determined so as to include an area in which the correlation between matching targets is determined to be stronger than a predetermined threshold in the block matching. The moving image encoding apparatus according to claim 3 or 4.
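
The two range-selection policies in the preceding two claims can be contrasted in a short sketch. It assumes SAD as the correlation measure (lower cost means stronger correlation); the function names, the `half_size` and `max_cost` parameters, and the (y0, y1, x0, x1) pixel-box convention are hypothetical, and clipping to the image bounds is omitted for brevity.

```python
def sad_map(ref, blk):
    """SAD cost for every top-left position of blk inside ref."""
    bh, bw = len(blk), len(blk[0])
    return {(y, x): sum(abs(ref[y + dy][x + dx] - blk[dy][dx])
                        for dy in range(bh) for dx in range(bw))
            for y in range(len(ref) - bh + 1)
            for x in range(len(ref[0]) - bw + 1)}

def range_around_best(costs, blk_h, blk_w, half_size):
    """Predetermined-size range centered on the center of the best-matching area."""
    y, x = min(costs, key=costs.get)
    cy, cx = y + blk_h // 2, x + blk_w // 2
    return (cy - half_size, cy + half_size, cx - half_size, cx + half_size)

def range_by_threshold(costs, blk_h, blk_w, max_cost):
    """Bounding box of every area whose cost is below max_cost, or None if there is none."""
    hits = [pos for pos, cost in costs.items() if cost < max_cost]
    if not hits:
        return None
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    return (min(ys), max(ys) + blk_h - 1, min(xs), max(xs) + blk_w - 1)
```

The first policy always yields a range of fixed size, so the cost of the following stage is predictable. The second adapts the range to the image content and can retain several candidate areas, at the price of a variable search cost.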

A method for controlling a moving image encoding apparatus that uses a frame image temporally different from an encoding target image as a reference image, calculates a motion vector for each macroblock image included in the encoding target image, and performs interframe encoding processing based on the motion vector, the method comprising:
A reference image binarization step for binarizing the reference image to generate a reference binarized image;
A reference image resolution conversion step of converting the reference image to a low resolution to generate a reference resolution conversion image;
A macroblock image binarization step for binarizing the macroblock image to generate a macroblock binarized image;
A macroblock image resolution conversion step of converting the macroblock image to a low resolution to generate a macroblock resolution conversion image;
A first range determining step for determining a first range cut out from the reference binarized image based on block matching between the reference resolution converted image and the macroblock resolution converted image;
A first cutout step of cutting out the image of the first range as a reference binarized block image from the reference binarized image; and
A motion vector calculating step of calculating the motion vector for the macroblock image based on block matching between the reference binarized block image and the macroblock binarized image.
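
The steps of this method can be sketched end to end as follows. This is an illustration under stated assumptions, not the claimed implementation: binarization uses a fixed threshold of 128, resolution conversion averages `factor` x `factor` blocks, the matching cost is SAD, and all function and parameter names are hypothetical. Instead of copying the first range out of the reference binarized image, the sketch restricts the search positions to that range, which visits the same candidates.

```python
def binarize(img, threshold=128):
    """Binarization step: 1 where the pixel is >= threshold, else 0 (threshold assumed)."""
    return [[1 if p >= threshold else 0 for p in row] for row in img]

def downsample(img, factor=2):
    """Resolution conversion step: average each factor x factor block."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [[sum(img[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) // (factor * factor)
             for x in range(w)] for y in range(h)]

def sad(ref, blk, top, left):
    """Sum of absolute differences between blk and the ref window at (top, left)."""
    return sum(abs(ref[top + y][left + x] - blk[y][x])
               for y in range(len(blk)) for x in range(len(blk[0])))

def best_match(ref, blk, y0, y1, x0, x1):
    """Exhaustive block matching over top-left positions in [y0, y1] x [x0, x1], clipped to ref."""
    y0, x0 = max(y0, 0), max(x0, 0)
    y1 = min(y1, len(ref) - len(blk))
    x1 = min(x1, len(ref[0]) - len(blk[0]))
    _, y, x = min((sad(ref, blk, y, x), y, x)
                  for y in range(y0, y1 + 1) for x in range(x0, x1 + 1))
    return y, x

def motion_vector(ref, mb, mb_y, mb_x, factor=2, margin=1):
    """Return (dy, dx) for the macroblock whose top-left corner is (mb_y, mb_x)."""
    ref_bin, mb_bin = binarize(ref), binarize(mb)
    ref_low, mb_low = downsample(ref, factor), downsample(mb, factor)
    # First range determining step: coarse match on the low-resolution images.
    cy, cx = best_match(ref_low, mb_low, 0, len(ref_low), 0, len(ref_low[0]))
    y0, y1 = cy * factor - margin, cy * factor + margin
    x0, x1 = cx * factor - margin, cx * factor + margin
    # First cutout and motion vector calculating steps: match the binarized
    # macroblock only against positions inside the first range.
    by, bx = best_match(ref_bin, mb_bin, y0, y1, x0, x1)
    return by - mb_y, bx - mb_x
```

The coarse match touches far fewer pixels than a full-resolution search, and the binarized match replaces multi-bit differences with single-bit comparisons, which is where the reduction in operation cost would come from in a sketch of this kind.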

A computer program for causing a computer to function as the moving image encoding apparatus according to any one of claims 1 to 6.