JP4921780B2 - Method and apparatus for motion prediction

Info

Publication number: JP4921780B2
Application number: JP2005344110A
Authority: JP
Inventors: 黄朝宗; 曽博志
Original assignee: 聯詠科技股份有限公司
Priority date: 2005-07-20
Filing date: 2005-11-29
Publication date: 2012-04-25
Anticipated expiration: 2025-11-29
Also published as: US7949194B2; JP2007028574A; TW200706000A; TWI280803B; US20070019732A1

Description

The present invention relates to a method and apparatus for motion prediction, and more particularly to a method and apparatus that use a hierarchical search for motion vector prediction.

Motion prediction is the most computationally demanding part of a video compression encoder, and it also has the greatest influence on the compression result. Many fast algorithms have therefore been proposed to reduce the amount of computation and memory usage while maintaining adequate compression quality.

Among the various fast algorithms, hierarchical search effectively reduces both the amount of computation and the memory usage. The hierarchical search method shown in the flow diagram of FIG. 1 was proposed by J. H. Lee et al. [1] and can produce variable blocks.

Referring to FIG. 1, in steps 101 and 102, low-pass filtering and subsampling are first performed on the original frame data 111 and the reference frame data 112, yielding three layers of different resolution: an original resolution layer, an intermediate resolution layer, and a low resolution layer. The original resolution layer contains the original frame data 111 and the reference frame data 112, the intermediate resolution layer contains the data generated in step 101, and the low resolution layer contains the data generated in step 102.

Next, in step 103, a wide-range search is performed on the low resolution layer to obtain three motion vectors: two optimal motion vectors and one predicted motion vector obtained from the video standard. In step 104, a local search is then performed on the intermediate resolution layer. Both the search on the low resolution layer and the search on the intermediate resolution layer use a block size of 16×16. Finally, in step 105, a local search is performed on the original resolution layer in the neighborhood of the motion vectors obtained from the intermediate resolution layer. At this stage the original 16×16 block is also divided into four smaller 8×8 blocks, and the optimal block mode and motion vector 113 are finally selected. The drawback of this method is that the motion vectors of the small blocks are confined to a very small range, so effective prediction is not possible when the true motion vectors of the small blocks differ greatly from one another.

[1] J. H. Lee et al., "A Fast Multi-Resolution Block Matching Algorithm and its LSI Architecture for Low Bit-Rate Video Coding", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 11, No. 12, December 2001, pp. 1289-1301.

Accordingly, an object of the present invention is to provide a method for motion prediction that retains the advantages of low computation and low memory usage while accurately predicting the optimal variable block mode and motion vectors.

Another object of the present invention is to provide an apparatus for motion prediction that can produce a highly efficient combination of variable-block motion vectors.

To achieve the above and other objects, the present invention provides a motion prediction method comprising the following steps:
(a) forming a hierarchical data structure from original frame data and reference frame data, the hierarchical data structure comprising N layers, where the N-th layer contains the original frame data and the reference frame data, each remaining i-th layer contains data generated from the original frame data and the reference frame data, the image resolution of the i-th layer is lower than that of the (i+1)-th layer, N is an integer of at least 2, and 1 ≤ i < N;
(b) selecting at least one candidate set from a plurality of candidate sets of a macroblock on the first layer according to the costs of those candidate sets, and supplying the selected candidate set(s) to the second layer, where each candidate set consists of a variable block mode of the macroblock together with a motion vector for each block of that variable block mode;
(c) when N is greater than 2, performing the following two sub-steps on each i-th layer in turn, starting from the second layer, for 2 ≤ i < N:
(c1) performing a local search based on the candidate set(s) supplied by the (i-1)-th layer;
(c2) selecting at least one candidate set from the candidate sets obtained by the local search according to their costs after the local search, and supplying the selected candidate set(s) to the (i+1)-th layer; and
(d) performing the following two sub-steps on the N-th layer:
(d1) performing a local search based on the candidate set(s) supplied by the (N-1)-th layer;
(d2) selecting one candidate set from the candidate sets obtained by the local search according to their costs after the local search.

In the motion prediction method according to one embodiment of the present invention, all data of each i-th layer are generated by low-pass filtering and subsampling the data of the (i+1)-th layer.

In the motion prediction method according to one embodiment of the present invention, step (c1) or (d1) further includes the following sub-step: deriving a plurality of derived candidate sets from one of the candidate sets and adding the derived candidate sets to the selection of the next step, where each derived candidate set has the same variable block mode as the candidate set it was derived from but different motion vectors.

In the motion prediction method according to one embodiment of the present invention, step (c1) or (d1) further includes the following sub-step: deriving a plurality of derived candidate sets from one of the candidate sets and adding the derived candidate sets to the selection of the next step, where the variable block mode of each derived candidate set is the result of splitting the variable block mode of the candidate set it was derived from.

According to another aspect, the present invention further provides a motion prediction apparatus comprising a layer generator, a full search unit, and a final search unit. The layer generator forms a hierarchical data structure from original frame data and reference frame data. The hierarchical data structure comprises two layers: the second layer contains the original frame data and the reference frame data, the first layer contains data generated from the original frame data and the reference frame data, and the image resolution of the first layer is lower than that of the second layer. The full search unit supplies at least one candidate set selected from a plurality of candidate sets of a macroblock on the first layer according to the costs of those candidate sets. The final search unit performs a local search on the second layer based on the candidate set(s) supplied by the full search unit, and selects one candidate set from the candidate sets obtained by the local search according to their costs after the local search.

According to another aspect, the present invention further provides a motion prediction apparatus comprising a layer generator, a full search unit, N-2 local search units, and a final search unit, where N is an integer greater than 2. The layer generator forms a hierarchical data structure from original frame data and reference frame data. The hierarchical data structure comprises N layers: the N-th layer contains the original frame data and the reference frame data, each remaining i-th layer contains data generated from the original frame data and the reference frame data, and the image resolution of the i-th layer is lower than that of the (i+1)-th layer, where i is an integer and 1 ≤ i < N. The full search unit supplies at least one candidate set selected from a plurality of candidate sets of a macroblock on the first layer according to the costs of those candidate sets. Of the N-2 local search units, the first local search unit corresponds to the second layer and accepts the candidate set(s) supplied by the full search unit, and the k-th local search unit corresponds to the (k+1)-th layer and accepts at least one candidate set supplied by the (k-1)-th local search unit, where k is an integer and 1 ≤ k ≤ N-2. Each local search unit performs a local search on its corresponding layer based on the accepted candidate set(s), and supplies at least one candidate set chosen from the candidate sets obtained by the local search according to their costs after the local search. The final search unit performs a local search on the N-th layer based on the candidate set(s) supplied by the (N-2)-th local search unit, and selects one candidate set from the candidate sets obtained by the local search according to their costs after the local search.

The accompanying drawings are included to provide a further understanding of the invention, and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments of the invention and, together with the description, serve to explain its principles.

In the present invention, the optimal block mode may already be partitioned at the low-resolution first layer, and the local searches performed on the higher-resolution layers can split the blocks further. Because the invention provides this additional flexibility, the advantages of low computation and low memory usage are retained while the optimal variable block mode and motion vectors are predicted accurately.

A motion prediction method according to one embodiment of the present invention is described in more detail below with reference to FIG. 2. FIG. 2 is a flow diagram of this embodiment, which begins at step 210.

First, in step 210, an N-layer hierarchical data structure is formed from the original frame data and the reference frame data, where N is an integer of at least 2. The N-th layer consists of the original frame data and the reference frame data, and the data of each remaining i-th layer are generated entirely by low-pass filtering and subsampling the data of the (i+1)-th layer, where i is an integer and 1 ≤ i < N. It follows that the N-th layer is the original resolution layer with the highest image resolution, and that the resolution decreases layer by layer down to the first layer, which has the lowest resolution.
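The layer construction of step 210 can be sketched as follows. This is a minimal illustration, not the patented implementation: the 2×2 box average used as the low-pass filter, the subsampling factor of 2, and the names `downsample_2x` and `build_layers` are all assumptions, since the description only requires low-pass filtering followed by subsampling.

```python
from typing import List

Frame = List[List[float]]  # rows of samples (hypothetical representation)


def downsample_2x(frame: Frame) -> Frame:
    """Low-pass filter with a 2x2 box average, then keep one sample per 2x2 cell.

    The box filter and the factor of 2 are assumptions made for this sketch.
    """
    h = len(frame) // 2 * 2
    w = len(frame[0]) // 2 * 2
    return [
        [
            (frame[y][x] + frame[y][x + 1] + frame[y + 1][x] + frame[y + 1][x + 1]) / 4.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def build_layers(frame: Frame, n_layers: int) -> List[Frame]:
    """Return [layer 1, ..., layer N]; layer N is the input, layer 1 the coarsest."""
    if n_layers < 2:
        raise ValueError("N must be an integer of at least 2")
    layers = [frame]
    for _ in range(n_layers - 1):
        # Each layer i is generated from layer i+1.
        layers.append(downsample_2x(layers[-1]))
    layers.reverse()
    return layers
```

The same function would be applied once to the original frame and once to the reference frame, so that each layer holds a matching pair at the same resolution.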

Next, in step 220, a full search, also known as a wide-range search, is performed on the lowest-resolution first layer. This step selects at least one candidate set from a plurality of candidate sets of a macroblock according to the costs of those candidate sets, and supplies the selected candidate set(s) to the second layer. Each candidate set consists of a variable block mode of the macroblock together with a motion vector for each block of that variable block mode. The candidate set is also the data structure that is ultimately supplied to the video compression encoder.
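A candidate set and the full search of step 220 might look like the sketch below. Several things here are assumptions rather than details from the description: the sum of absolute differences (SAD) as the cost, the independent per-block exhaustive search, the block geometry passed in `modes`, and the names `CandidateSet`, `sad`, `in_bounds`, and `full_search`.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Frame = Sequence[Sequence[int]]
Block = Tuple[int, int, int, int]   # (x, y, width, height) in layer coordinates
MotionVector = Tuple[int, int]      # (dx, dy)


@dataclass(frozen=True)
class CandidateSet:
    mode: str                                 # label of the variable block mode
    motion_vectors: Tuple[MotionVector, ...]  # one vector per block of the mode
    cost: int


def sad(cur: Frame, ref: Frame, block: Block, mv: MotionVector) -> int:
    """Sum of absolute differences; the cost measure is an assumption."""
    x, y, w, h = block
    dx, dy = mv
    return sum(
        abs(cur[y + j][x + i] - ref[y + dy + j][x + dx + i])
        for j in range(h)
        for i in range(w)
    )


def in_bounds(ref: Frame, block: Block, mv: MotionVector) -> bool:
    x, y, w, h = block
    dx, dy = mv
    return 0 <= x + dx and x + dx + w <= len(ref[0]) and 0 <= y + dy and y + dy + h <= len(ref)


def full_search(cur: Frame, ref: Frame, modes: Dict[str, List[Block]],
                search_range: int, keep: int) -> List[CandidateSet]:
    """Build one candidate set per block mode and return the `keep` cheapest."""
    candidates = []
    for mode, blocks in modes.items():
        vectors, total = [], 0
        for block in blocks:
            # Exhaustively test every displacement inside the search window.
            cost, mv = min(
                (sad(cur, ref, block, (dx, dy)), (dx, dy))
                for dy in range(-search_range, search_range + 1)
                for dx in range(-search_range, search_range + 1)
                if in_bounds(ref, block, (dx, dy))
            )
            total += cost
            vectors.append(mv)
        candidates.append(CandidateSet(mode, tuple(vectors), total))
    candidates.sort(key=lambda c: c.cost)
    return candidates[:keep]
```

Keeping more than one candidate set (`keep > 1`) is what lets a split mode survive to the next layer even when the unsplit mode is slightly cheaper at low resolution.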

As for the selection itself, the method generally first calculates a cost for each candidate set and then compares the costs to make the selection. For example, it may select a certain number of candidate sets with the lowest costs, or select the candidate sets whose costs are below a given predetermined value. Such cost calculation and selection techniques are well known to those of ordinary skill in the art, so their details are omitted here.
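The two selection rules mentioned above can be sketched in a few lines. The function names `select_lowest` and `select_below` and the dictionary representation of costs are hypothetical, chosen only for illustration.

```python
from typing import Dict, List


def select_lowest(costs: Dict[str, float], keep: int = 1) -> List[str]:
    """Return the `keep` candidate names with the lowest cost, cheapest first."""
    return sorted(costs, key=costs.get)[:keep]


def select_below(costs: Dict[str, float], threshold: float) -> List[str]:
    """Return every candidate name whose cost is below the threshold."""
    return [name for name, cost in costs.items() if cost < threshold]
```

For example, with costs `{"a": 120, "b": 95, "c": 300}`, keeping the two cheapest gives `b` and `a`, while a threshold of 100 keeps only `b`.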

Step 220 differs from the prior art in that it selects not only the motion vector of the macroblock but also the variable block mode of the macroblock. In other words, in step 220 the macroblock may be kept whole or divided into several smaller blocks, and the result is supplied to the second layer. In practical applications, if a macroblock should not be divided because the resulting blocks would be too small, the cost of the candidate sets with small blocks can be adjusted appropriately so that they are not selected. Cost adjustment can of course also be used to prevent the selection of other kinds of candidate sets.
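One way to realize this cost adjustment is to add a per-mode penalty before comparing costs, as in the sketch below. The additive penalty, the use of infinity to exclude a mode entirely, and the name `pick_mode` are assumptions; the description does not say how the adjustment is computed.

```python
import math
from typing import Dict


def pick_mode(raw_costs: Dict[str, float], penalties: Dict[str, float]) -> str:
    """Return the block mode with the lowest adjusted cost.

    A penalty of math.inf effectively removes a mode from the selection.
    """
    adjusted = {
        mode: cost + penalties.get(mode, 0.0)  # modes without a penalty are unchanged
        for mode, cost in raw_costs.items()
    }
    return min(adjusted, key=adjusted.get)
```

With raw costs `{"16x16": 500.0, "8x8": 420.0, "4x4": 400.0}`, an empty penalty table selects `4x4`, whereas `{"4x4": math.inf}` forces the choice to `8x8`.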

There are two possible continuations, chosen according to the hierarchical data structure formed in step 210. If N equals 2, the process proceeds directly to step 240, where the final search is performed on the N-th layer. If N is greater than 2, the process first proceeds to step 230, where a local search is performed on each layer from the second layer to the (N-1)-th layer, and then proceeds to step 240.
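The branching between steps 220, 230 and 240 amounts to a loop that is simply empty when N equals 2. The sketch below shows only this control flow; the search and selection routines are passed in as callables, and the name `hierarchical_search` and its parameter names are hypothetical.

```python
from typing import Callable, List, Sequence, TypeVar

L = TypeVar("L")  # a layer of the hierarchical data structure
C = TypeVar("C")  # a candidate set


def hierarchical_search(
    layers: Sequence[L],
    full_search: Callable[[L], List[C]],
    local_search: Callable[[L, List[C]], List[C]],
    select_some: Callable[[List[C]], List[C]],
    select_one: Callable[[List[C]], C],
) -> C:
    """Run steps 220, 230 and 240 over layers[0] (layer 1) .. layers[-1] (layer N)."""
    n = len(layers)
    if n < 2:
        raise ValueError("N must be an integer of at least 2")
    # Step 220: full search on layer 1, keep at least one candidate set.
    candidates = select_some(full_search(layers[0]))
    # Step 230: layers 2 .. N-1; this range is empty when N == 2.
    for i in range(2, n):
        candidates = select_some(local_search(layers[i - 1], candidates))
    # Step 240: local search on layer N, keep exactly one candidate set.
    return select_one(local_search(layers[n - 1], candidates))
```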

As stated above, if N is greater than 2 the process first proceeds to step 230. Starting from the second layer, a local search is performed on each i-th layer, where 2 ≤ i < N, based on the candidate set(s) supplied by the (i-1)-th layer. At least one candidate set is then selected from the candidate sets obtained by the local search according to their costs after the local search, and the selected candidate set(s) are supplied to the (i+1)-th layer.

In this local search, the higher-resolution data of the i-th layer are used together with the candidate set(s) supplied by the (i-1)-th layer to re-predict the motion vectors and recalculate the costs for the subsequent selection. During the local search, a plurality of derived candidate sets may be derived from one candidate set and added to the selection. For example, the derived sets may have the same variable block mode as the original but different motion vectors. Alternatively, the variable block mode supplied by the (i-1)-th layer may be split further. To achieve the best compression quality, every possible option is derived as an independent candidate set and added to the selection that feeds the (i+1)-th layer. As in step 220, the cost of one or more particular candidate sets can be adjusted in step 230 to modify the outcome of the selection.
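The two kinds of derivation described above can be sketched as follows. The joint offset applied to all vectors, the inheritance of the parent vector by the child blocks, the `SPLIT_TABLE` contents, and the function names are assumptions; the description states only that derived sets either keep the mode and change the vectors, or split the mode.

```python
from typing import Dict, List, Tuple

MotionVector = Tuple[int, int]
Candidate = Tuple[str, Tuple[MotionVector, ...]]  # (block mode, one vector per block)

# Hypothetical split table: parent mode -> (child mode, child blocks per parent block).
SPLIT_TABLE: Dict[str, Tuple[str, int]] = {"16x16": ("8x8", 4)}


def derive_same_mode(candidate: Candidate, radius: int = 1) -> List[Candidate]:
    """Derive sets with the same block mode but shifted motion vectors."""
    mode, vectors = candidate
    derived = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if (dx, dy) != (0, 0):  # the unshifted set is the original candidate
                derived.append((mode, tuple((mx + dx, my + dy) for mx, my in vectors)))
    return derived


def derive_split(candidate: Candidate) -> List[Candidate]:
    """Derive a set whose block mode splits the original mode further."""
    mode, vectors = candidate
    if mode not in SPLIT_TABLE:
        return []
    child_mode, fanout = SPLIT_TABLE[mode]
    # Each child block starts from its parent's vector; local search refines it later.
    child_vectors = tuple(mv for mv in vectors for _ in range(fanout))
    return [(child_mode, child_vectors)]
```

All derived sets would then be costed on the current layer and passed through the same selection as the original candidates.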

The last step, step 240, follows step 220 (when N equals 2) or step 230 (when N is greater than 2). In step 240, a local search is first performed on the N-th layer, which has the original resolution, based on the candidate set(s) supplied by the (N-1)-th layer. One candidate set is then selected from the candidate sets obtained by the local search according to their costs after the local search.

In practice, step 240 is similar to step 230. The main differences are that the local search is performed on a different layer and that in step 240 only one candidate set is finally selected, as the input to the video compression encoder. The local search of step 240 may likewise derive candidate sets for selection, and the costs obtained from the local search can be adjusted to prevent one or more particular candidate sets from being selected.

In addition to the motion prediction method described above, the present invention further provides a motion prediction apparatus that implements the method. FIG. 3 is a schematic diagram of a motion prediction apparatus 300 according to another embodiment of the present invention. The motion prediction apparatus 300 implements the motion prediction method shown in FIG. 2 for the case where N equals 2.

As shown in FIG. 3, the motion prediction apparatus 300 includes a layer generator 301, a full search unit 302, and a final search unit 303. The layer generator 301 forms a hierarchical data structure from the original frame data 311 and the reference frame data 312, as in step 210. In this embodiment, however, the hierarchical data structure contains only two layers: a low-resolution first layer and an original-resolution second layer. The full search unit 302 performs on the first layer the same full search as in step 220 and supplies at least one candidate set to the final search unit 303. The final search unit 303 then performs the same final search as in step 240 based on the supplied candidate set(s), thereby selecting the optimal candidate set 313.

FIG. 4 is a schematic diagram of a motion prediction apparatus 400 according to another embodiment of the present invention. The motion prediction apparatus 400 implements the motion prediction method shown in FIG. 2 for the case where N is greater than 2.

As shown in FIG. 4, the motion prediction apparatus 400 includes a layer generator 401, a full search unit 402, N-2 local search units (only two of them, local search units 403 and 404, are shown in FIG. 4), and a final search unit 405. The layer generator 401 forms an N-layer hierarchical data structure from the original frame data 411 and the reference frame data 412, as in step 210, where N is greater than 2. The full search unit 402 performs on the first layer of the hierarchical data structure the same full search as in step 220, and supplies at least one candidate set to the first of the N-2 local search units.

Of the N-2 local search units of the motion prediction apparatus 400, the first local search unit 403 corresponds to the second layer of the hierarchical data structure and accepts the candidate set(s) supplied by the full search unit 402. Each subsequent k-th local search unit corresponds to the (k+1)-th layer and accepts the candidate set(s) supplied by the (k-1)-th local search unit, where k is an integer and 1 ≤ k ≤ N-2. Each local search unit performs a local search on its corresponding layer based on the accepted candidate set(s), in the same manner as the local search of step 230, and thereby selects at least one candidate set.
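The correspondence between units and layers can be made explicit with a small helper that lists the processing chain for a given N. The function name `unit_chain` and the unit labels are hypothetical; the mapping itself (k-th local search unit on layer k+1, final search unit on layer N) follows the description.

```python
from typing import List, Tuple


def unit_chain(n_layers: int) -> List[Tuple[str, int]]:
    """Return (unit name, layer number) pairs in processing order."""
    if n_layers < 2:
        raise ValueError("N must be an integer of at least 2")
    chain = [("full search unit", 1)]
    # The k-th local search unit works on layer k+1, for k = 1 .. N-2.
    chain += [(f"local search unit {k}", k + 1) for k in range(1, n_layers - 1)]
    chain.append(("final search unit", n_layers))
    return chain
```

For N equal to 2 the chain contains no local search units, which matches apparatus 300; for larger N each unit hands its selected candidate sets to the next entry in the list, as in apparatus 400.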

After the searches and selections described above, the final search unit 405 performs a local search on the original-resolution N-th layer based on the candidate set(s) supplied by the last local search unit 404, in the same manner as the local search of step 240, and thereby selects the optimal candidate set 413.

As the above embodiments show, in the present invention the optimal block mode may already be partitioned at the lowest-resolution first layer, and the local searches performed on the higher-resolution layers can split the blocks further. Because the invention provides this additional flexibility, the advantages of low computation and low memory usage are retained while the optimal variable block mode and motion vectors are predicted accurately.

Although the present invention has been described with reference to particular embodiments, it will be apparent to those of ordinary skill in the art that modifications may be made to the described embodiments without departing from the spirit of the invention. The scope of the invention is therefore defined by the appended claims rather than by the foregoing detailed description.

FIG. 1 is a schematic flow diagram illustrating a motion prediction method in the prior art.
FIG. 2 is a schematic flow diagram illustrating a motion prediction method according to one embodiment of the present invention.
FIG. 3 is a schematic diagram of a motion prediction apparatus according to one embodiment of the present invention.
FIG. 4 is a schematic diagram of a motion prediction apparatus according to one embodiment of the present invention.

Explanation of symbols

111 Original frame data
112 Reference frame data
113 Motion vector
300 Motion prediction apparatus
301 Layer generator
302 Full search unit
303 Final search unit
311 Original frame data
312 Reference frame data
313 Optimal candidate set
400 Motion prediction apparatus
401 Layer generator
402 Full search unit
403 Local search unit
404 Last local search unit
405 Final search unit
411 Original frame data
412 Reference frame data
413 Optimal candidate set

Claims

A method for motion prediction, comprising:
(a) forming a hierarchical data structure from original frame data and reference frame data, the hierarchical data structure comprising N layers, wherein the N-th layer contains the original frame data and the reference frame data, each remaining i-th layer contains data generated from the original frame data and the reference frame data, the image resolution of the i-th layer is lower than the image resolution of the (i+1)-th layer, N is an integer of at least 2, and 1 ≤ i < N;
(b) selecting at least one candidate set from a plurality of candidate sets of a macroblock on the first layer according to the costs of the candidate sets, and supplying the selected candidate set to the second layer, wherein each candidate set is a set of a variable block mode of the macroblock and a motion vector of each block of the variable block mode;
(c) when N is greater than 2, performing the following two sub-steps on each i-th layer in turn, starting from the second layer, for 2 ≤ i < N:
(c1) performing a local search based on the candidate set supplied by the (i-1)-th layer;
(c2) selecting at least one candidate set from the candidate sets obtained by the local search according to the costs of the candidate sets after the local search, and supplying the selected candidate set to the (i+1)-th layer; and
(d) performing the following two sub-steps on the N-th layer:
(d1) performing a local search based on the candidate set supplied by the (N-1)-th layer;
(d2) selecting one candidate set from the candidate sets obtained by the local search according to the costs of the candidate sets after the local search.

The method for motion prediction according to claim 1, wherein the data of each i-th layer are generated by performing low-pass filtering and subsampling on the data of the (i+1)-th layer.

The method for motion prediction according to claim 1, wherein step (b) further comprises:
adjusting one of the costs so that at least one candidate set is not selected.

The method for motion prediction according to claim 1, wherein step (c2) or (d2) further comprises:
adjusting the cost, or one of the costs, so that at least one candidate set is not selected.

The method for motion prediction according to claim 1, wherein step (c1) or (d1) further comprises:
deriving a plurality of derived candidate sets from the candidate set, or from one of the candidate sets, and adding the derived candidate sets to the selection of the next step, wherein each derived candidate set has the same variable block mode as the candidate set from which it is derived but different motion vectors.

The method for motion prediction according to claim 1, wherein step (C1) or (D1) further comprises:
deriving a plurality of derived candidate sets from the candidate set, or from one of the candidate sets, and adding the derived candidate sets to the selection of the following step, wherein the variable block mode of each derived candidate set is obtained by splitting the variable block mode of the candidate set from which it is derived.
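
Splitting a variable block mode might look like the sketch below, in which a 16x16 candidate set is split into 16x8, 8x16 and 8x8 candidates and every new block starts from the parent's motion vector. The split table, the mode names, and the inheritance of the parent vector are assumptions; the claim only requires that the derived mode result from splitting the source mode.

```python
from typing import Dict, List, Tuple

MotionVector = Tuple[int, int]
Candidate = Tuple[str, Tuple[MotionVector, ...]]  # (block mode, motion vectors)

# Hypothetical split table: source mode -> (derived mode, number of blocks).
SPLITS: Dict[str, Tuple[Tuple[str, int], ...]] = {
    "16x16": (("16x8", 2), ("8x16", 2), ("8x8", 4)),
}


def derive_split_variants(candidate: Candidate) -> List[Candidate]:
    """Return candidates whose block mode is a split of the source block mode."""
    mode, mvs = candidate
    parent_mv = mvs[0]  # a 16x16 candidate carries a single motion vector
    return [(new_mode, (parent_mv,) * block_count)
            for new_mode, block_count in SPLITS.get(mode, ())]


variants = derive_split_variants(("16x16", ((6, -4),)))
```

The later local search can then move the vectors of the smaller blocks independently, which is how finer partitions could emerge without being searched on the lowest-resolution layer.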

A device for motion prediction, comprising:
a layer generator for forming a hierarchical data structure using original frame data and reference frame data, wherein the hierarchical data structure includes two layers, the second layer includes the original frame data and the reference frame data, the first layer includes data generated from the original frame data and the reference frame data, and the image resolution of the first layer is lower than the image resolution of the second layer;
a full range search unit for providing at least one candidate set, selected according to the costs of the candidate sets, from a plurality of candidate sets of a macroblock on the first layer, wherein each candidate set is a set consisting of a variable block mode of the macroblock and the respective motion vectors of the blocks of that variable block mode; and
a final search unit for performing a local search on the second layer based on the candidate set provided by the full range search unit and, after the local search, selecting a candidate set from the candidate sets obtained by the local search according to the costs of those candidate sets.
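
The two-layer arrangement can be sketched end to end on pixel data. The sketch is deliberately simplified: it assumes a sum of absolute differences (SAD) as the cost, a 2x2 box average for layer generation, and a single block mode, so each candidate set reduces to one motion vector. None of these choices, nor the function names, come from the claim.

```python
from typing import List, Tuple

Frame = List[List[int]]


def downsample(frame: Frame) -> Frame:
    """Layer generator (assumed): 2x2 box average followed by 2:1 sub-sampling."""
    return [[(frame[y][x] + frame[y][x + 1]
              + frame[y + 1][x] + frame[y + 1][x + 1]) // 4
             for x in range(0, len(frame[0]) - 1, 2)]
            for y in range(0, len(frame) - 1, 2)]


def sad(cur: Frame, ref: Frame, bx: int, by: int, size: int,
        mvx: int, mvy: int) -> float:
    """Cost (assumed): SAD between a block of `cur` and the displaced block of `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            ry, rx = by + y + mvy, bx + x + mvx
            if not (0 <= ry < len(ref) and 0 <= rx < len(ref[0])):
                return float("inf")  # displaced block leaves the frame
            total += abs(cur[by + y][bx + x] - ref[ry][rx])
    return total


def two_layer_search(cur: Frame, ref: Frame, bx: int, by: int, size: int,
                     search: int = 2, keep: int = 2) -> Tuple[int, int]:
    cur1, ref1 = downsample(cur), downsample(ref)
    # Full range search unit: every vector in the window on the first layer.
    scored = [(sad(cur1, ref1, bx // 2, by // 2, size // 2, mx, my), (mx, my))
              for my in range(-search, search + 1)
              for mx in range(-search, search + 1)]
    scored.sort(key=lambda pair: pair[0])
    seeds = [mv for _, mv in scored[:keep]]
    # Final search unit: 3x3 local search on the second layer around each seed.
    best_cost, best_mv = float("inf"), (0, 0)
    for sx, sy in seeds:
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                mv = (2 * sx + dx, 2 * sy + dy)
                cost = sad(cur, ref, bx, by, size, mv[0], mv[1])
                if cost < best_cost:
                    best_cost, best_mv = cost, mv
    return best_mv


# Toy frames (hypothetical): a linear ramp, with `cur` equal to `ref` displaced by (3, 2).
ref = [[16 * y + x for x in range(16)] for y in range(16)]
cur = [[16 * y + x + 35 for x in range(16)] for y in range(16)]
mv = two_layer_search(cur, ref, bx=4, by=4, size=8)
```

On these toy frames the first layer cannot represent the odd horizontal displacement exactly, so two neighbouring seeds tie; keeping both and refining them on the second layer recovers the displacement (3, 2).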

The motion prediction apparatus according to claim 7, wherein the data of the first layer is generated by performing low-pass filtering and sub-sampling on the data of the second layer.

The motion prediction apparatus according to claim 7, wherein the full range search unit further adjusts one of the costs so that at least one candidate set is not selected.

The motion prediction apparatus according to claim 7, wherein the final search unit further adjusts the cost, or one of the costs, so that at least one candidate set is not selected.

The motion prediction apparatus according to claim 7, wherein the final search unit further derives a plurality of derived candidate sets from the candidate set, or from one of the candidate sets, and adds the derived candidate sets to the selection after the local search, wherein each derived candidate set has the same variable block mode as the candidate set from which it is derived but different motion vectors.

The motion prediction apparatus according to claim 7, wherein the final search unit further derives a plurality of derived candidate sets from the candidate set, or from one of the candidate sets, and adds the derived candidate sets to the selection after the local search, wherein the variable block mode of each derived candidate set is obtained by splitting the variable block mode of the candidate set from which it is derived.

A device for motion prediction, comprising:
a layer generator for forming a hierarchical data structure using original frame data and reference frame data, wherein the hierarchical data structure includes N layers, the N-th layer includes the original frame data and the reference frame data, each remaining i-th layer includes data generated from the original frame data and the reference frame data, the image resolution of the i-th layer is lower than the image resolution of the (i+1)-th layer, N is a positive integer greater than 2, and 1 ≦ i < N;
a full range object search unit for providing at least one candidate set, selected according to the costs of the candidate sets, from a plurality of candidate sets of a macroblock on the first layer, wherein each candidate set includes a variable block mode of the macroblock and the respective motion vectors of the blocks of that variable block mode;
N-2 local search units, wherein the first local search unit corresponds to the second layer and accepts the candidate set provided by the full range object search unit, the k-th local search unit corresponds to the (k+1)-th layer and accepts at least one candidate set provided by the (k-1)-th local search unit, k is an integer with 1 ≦ k ≦ N-2, and each local search unit performs a local search on its corresponding layer based on the accepted candidate set and, after the local search, provides at least one candidate set selected from the candidate sets obtained from the local search according to the costs of those candidate sets; and
a final search unit for performing a local search on the N-th layer based on the candidate set provided by the (N-2)-th local search unit and, after the local search, selecting one candidate set from the candidate sets obtained by the local search according to the costs of those candidate sets.

The motion prediction apparatus according to claim 13, wherein the data of each i-th layer is generated by performing low-pass filtering and sub-sampling on the data of the (i+1)-th layer.

The motion prediction apparatus according to claim 13, wherein the full range object search unit further adjusts one of the costs so that at least one candidate set is not selected.

The motion prediction apparatus according to claim 13, wherein one of the local search units or the final search unit further adjusts the cost, or one of the costs, so that at least one candidate set is not selected.

The motion prediction apparatus according to claim 13, wherein one of the local search units or the final search unit further derives a plurality of derived candidate sets from the candidate set, or from one of the candidate sets, and adds the derived candidate sets to the selection after the local search, wherein each derived candidate set has the same variable block mode as the candidate set from which it is derived but different motion vectors.

The motion prediction apparatus according to claim 13, wherein one of the local search units or the final search unit further derives a plurality of derived candidate sets from the candidate set, or from one of the candidate sets, and adds the derived candidate sets to the selection after the local search, wherein the variable block mode of each derived candidate set is obtained by splitting the variable block mode of the candidate set from which it is derived.