JP5331643B2 - Motion vector detection apparatus and program

Info

Publication number: JP5331643B2
Application number: JP2009231913A
Authority: JP
Inventors: 康孝松尾; 善明鹿喰; 慎一境田
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2009-10-05
Filing date: 2009-10-05
Publication date: 2013-10-30
Anticipated expiration: 2029-10-05
Also published as: JP2011082700A

Abstract

PROBLEM TO BE SOLVED: To provide a motion vector detection apparatus and a program for detecting motion vectors of a moving image.

SOLUTION: The motion vector detection apparatus (1) includes: resolution determining means (11, 12, 13, 14) that detects the proportion of power in the high-frequency region in the temporal direction and/or the proportion of power in the low-frequency region in the spatial direction and determines a resolution value; and hierarchical motion vector detecting means (15) that starts motion vector detection from the layer whose block size and motion search range correspond to that resolution value, and progressively moves to layers with smaller block sizes and smaller search ranges.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a motion vector detection apparatus and program that hierarchically analyze the amount of motion in a moving image using its spectral power in the temporal and/or spatial direction, and detect motion vectors of the moving image with high accuracy.

In recent years, imaging and display devices have become increasingly high-definition, and techniques for increasing the resolution of moving images, known as super-resolution, have been studied (see, for example, Patent Document 1). Ultra-high-definition video such as Super Hi-Vision (SHV), the so-called 8K system, and high-definition video such as digital cinema, the so-called 4K system, now offer 4 to 16 times the resolution of conventional Hi-Vision (HV) video.

However, the higher the definition of the screen on which a moving image is displayed, the larger the motion blur per pixel in a moving region for footage shot at the same angle of view.

For example, as shown in FIG. 13(a), a Hi-Vision (HV) screen has 1920 pixels × 1080 lines, whereas a Super Hi-Vision (SHV) screen has 7680 pixels × 4320 lines, as shown in FIG. 13(b). When a moving image captured with the same field of view (FOV) as one intended for an HV screen is viewed on an SHV screen, both the horizontal and vertical resolutions are four times higher, so the motion of a moving subject, measured in pixels, is four times larger, and the motion blur per pixel in the moving region is likewise four times larger. The visual impression of blur is especially pronounced in fast-motion scenes, such as sports, where the whole screen changes substantially.
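The factor of four can be checked with a small calculation. The sketch below is illustrative only: the function name and the example speed (a subject crossing 1/128 of the picture width per frame) are assumptions, not values from the source.

```python
# Picture widths stated in the text (pixels per line).
HV_WIDTH = 1920
SHV_WIDTH = 7680

def motion_pixels_per_frame(fraction_of_width_per_frame, width_px):
    """Motion amount in pixels/frame for a subject that crosses a given
    fraction of the picture width per frame (same field of view)."""
    return fraction_of_width_per_frame * width_px

# Hypothetical subject speed: 1/128 of the picture width per frame.
speed = 1 / 128
hv_motion = motion_pixels_per_frame(speed, HV_WIDTH)
shv_motion = motion_pixels_per_frame(speed, SHV_WIDTH)
ratio = shv_motion / hv_motion
```

With the same field of view, the ratio depends only on the two picture widths, so any subject speed gives the same factor.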

A known technique sets up, for one picture of a moving image, a hierarchy of image data at several different resolutions, computes an evaluation value for the motion amount from the image data at one resolution and another evaluation value from image data at a different resolution, and determines the final motion amount from the sum of these evaluation values (see, for example, Patent Document 2).

Another known technique applies a multi-level wavelet transform in the spatial direction to one picture of a moving image and extracts contour information that lowers the priority of levels containing many high-frequency components. It also divides the picture into blocks and assigns the same priority to the blocks indicated by the contour information, such that the smaller the activity (a local property of the image) in those blocks, the higher the priority. The amount of encoded data is then controlled by truncating blocks in ascending order of priority (see, for example, Patent Document 3).

Patent Document 1: JP 2009-105490 A
Patent Document 2: Japanese Patent No. 3334271
Patent Document 3: Japanese Patent No. 4195978

As described above, the SHV video format has 7680 horizontal pixels, 4320 vertical lines, and 60 frames per second. Compared with the HV video format, its horizontal and vertical sampling frequencies have increased relative to its temporal sampling frequency.

Consequently, when moving images captured at the same angle of view are compared, a moving region on an SHV screen exhibits a larger motion amount (the number of pixels moved per frame, in pixels/frame) than on an HV screen. In a moving region, the inter-frame correlation is expected to fall, so the power in the temporal high-frequency region rises; at the same time, the blur in the moving region grows, so the power in the spatial high-frequency region is expected to fall. Processing based on these assumptions is effective for estimating the motion amount between frames in encoding and super-resolution processing. The motion amount in HDTV-standard moving images is generally known to range from a few pixels/frame to a few tens of pixels/frame.

In general, motion vector detection with a small block size (for example, 2 × 2 pixels) suits pictures rich in spatial high-frequency components. For pictures with few spatial high-frequency components, a small block size is more likely to produce erroneous motion vectors, so detection with a large block size (for example, 16 × 16 pixels) is effective. Moreover, motion vector detection with a large block size may be affected by blur caused by large motion, so it requires a large motion search range.
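The block-size trade-off can be illustrated with a minimal block-matching sketch. It assumes a sum-of-absolute-differences (SAD) cost and an exhaustive search, neither of which is specified by the source, and all function names are hypothetical. The function returns every displacement that ties for the minimum cost, so an ambiguous match shows up as more than one result.

```python
def sad(ref, cur, bx, by, dx, dy, size):
    """SAD between the block at (bx, by) in `cur` and the block displaced
    by (dx, dy) in `ref`. Images are lists of rows."""
    return sum(abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
               for y in range(size) for x in range(size))

def best_matches(ref, cur, bx, by, size, search):
    """All displacements within +/-search that reach the minimum SAD."""
    h, w = len(ref), len(ref[0])
    costs = {}
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= bx + dx and bx + dx + size <= w and
                      0 <= by + dy and by + dy + size <= h)
            if inside:
                costs[(dx, dy)] = sad(ref, cur, bx, by, dx, dy, size)
    lowest = min(costs.values())
    return [d for d, c in costs.items() if c == lowest]

# A mostly flat picture (few spatial high-frequency components) with one
# bright pixel; `cur` shows the same content displaced by (dx, dy) = (2, 1).
ref = [[0] * 16 for _ in range(16)]
ref[8][8] = 255
cur = [[ref[(y + 1) % 16][(x + 2) % 16] for x in range(16)] for y in range(16)]

small = best_matches(ref, cur, 4, 4, size=2, search=2)  # block sees only flat area
large = best_matches(ref, cur, 4, 4, size=8, search=2)  # block contains the feature
```

In this toy picture the 2 × 2 block lies entirely in the flat area, so every candidate displacement ties, while the 8 × 8 block contains the feature and yields a single displacement.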

However, applying the technique of Patent Document 2 to the estimation of motion between frames in such encoding or super-resolution processing does not build the hierarchy from the power in the spatio-temporal frequency domain; it processes a predetermined number of layers. The processing load is therefore large, and a motion amount that reflects the effect of temporal blur cannot be detected.

Likewise, applying the technique of Patent Document 3 to the estimation of motion between frames in such encoding or super-resolution processing makes it possible to determine block priorities for selecting which encoded data to keep or discard, depending on the levels that contain many high-frequency components. Because it does not build the hierarchy from the power in the spatio-temporal frequency domain, however, it cannot improve the accuracy of the motion vectors.

An object of the present invention is therefore to provide a motion vector detection apparatus and program that hierarchically analyze the amount of motion in a moving image using its spectral power in the temporal and/or spatial direction, and detect motion vectors of the moving image with high accuracy.

As noted above, the motion amount often correlates with the power in the high-frequency regions in the temporal and spatial directions. It is therefore effective to analyze the power of the temporal and spatial spectra of the moving image, estimate an appropriate block size for motion vector detection, and detect motion vectors by arranging the estimated block sizes in a spatial hierarchy.

Considering the spatio-temporal spectrum of a moving image, the power in the spatial high-frequency region of a moving region decreases as the area in motion grows. That is, a moving image in which a large moving object has a large motion amount tends to have low power in the spatial high-frequency region but high power in the temporal high-frequency region. This is because, when an object occupying a large area of the screen moves substantially, the variation in the temporal direction becomes large.

Accordingly, to estimate the area of the moving region and the motion amount in a moving image, the motion vector detection apparatus and program of the present invention analyze the spectral power of the moving image in the temporal and/or spatial direction, and detect motion vectors while varying the block size and the motion search range hierarchically according to that power.

That is, the motion vector detection apparatus of the present invention detects motion vectors of a moving image and holds a table that associates the proportion of power in the temporal high-frequency region with a spatial-frequency decomposition level, such that the larger the proportion of temporal high-frequency power over a plurality of frames of the moving image, the larger the block size and the motion search range, and/or a table that associates the proportion of power in the spatial low-frequency region with the spatial-frequency decomposition level at which motion vector detection starts, such that the larger the proportion of spatial low-frequency power in one frame of the moving image, the larger the block size and the motion search range. The apparatus comprises: resolution determining means for detecting the proportion of temporal high-frequency power and/or the proportion of spatial low-frequency power and determining the spatial-frequency decomposition level; and hierarchical motion vector detecting means for referring to the table, starting motion vector detection from the layer whose block size and motion search range correspond to that decomposition level, and successively decrementing the layer so as to move progressively to motion vector detection in layers with smaller block sizes and smaller motion search ranges.
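The table lookup and the layer-by-layer descent described above can be sketched as follows. The thresholds, block sizes, and search ranges are hypothetical placeholders chosen for illustration (the source gives only 2 × 2 and 16 × 16 pixels as example block sizes), and the function names are assumptions.

```python
# Hypothetical table: (minimum power proportion, decomposition level).
# A larger proportion maps to a higher level, i.e. a larger block size
# and a larger motion search range.
LEVEL_TABLE = [(0.50, 3), (0.25, 2), (0.10, 1)]

# Hypothetical per-layer parameters: level -> (block size, search range), in pixels.
LAYER_PARAMS = {0: (2, 1), 1: (4, 2), 2: (8, 4), 3: (16, 8)}

def start_level(power_proportion):
    """Look up the decomposition level at which detection starts."""
    for threshold, level in LEVEL_TABLE:
        if power_proportion >= threshold:
            return level
    return 0

def detection_schedule(power_proportion):
    """Layers visited in order, from the starting layer down to layer 0."""
    first = start_level(power_proportion)
    return [LAYER_PARAMS[level] for level in range(first, -1, -1)]
```

For instance, under these placeholder values `detection_schedule(0.6)` starts with 16-pixel blocks and ends with 2-pixel blocks, whereas `detection_schedule(0.05)` visits only the finest layer.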

In the motion vector detection apparatus of the present invention, in order to determine the spatial-frequency decomposition level from the proportion of temporal high-frequency power alone, the resolution determining means detects the power in each temporal frequency region over a predetermined plurality of frames and judges that the larger the detected proportion of temporal high-frequency power, the larger the moving-region area and the motion amount.

In the motion vector detection apparatus of the present invention, in order to determine the spatial-frequency decomposition level from the proportion of power in the spatial high-frequency region alone, the resolution determining means detects the power in each spatial frequency region at a predetermined resolution and judges that the larger the detected proportion of spatial low-frequency power, the larger the moving-region area and the motion amount.

In the motion vector detection apparatus of the present invention, when determining the spatial-frequency decomposition level by detecting the proportions of power in the high-frequency regions in the temporal and spatial directions, the resolution determining means detects the power in each temporal frequency region over a predetermined plurality of frames, judges that the larger the detected proportion of temporal high-frequency power, the larger the moving-region area and the motion amount, and determines the spatial-frequency decomposition level accordingly. It then detects the proportion of spatial low-frequency power according to the determined decomposition level and judges that the larger this proportion, the larger the moving-region area and the motion amount.

In the motion vector detection apparatus of the present invention, the resolution determining means comprises: a spatial-frequency decomposition level determination unit that, for every pixel of the base frame of the moving image, decomposes the plurality of frames in the temporal direction into frequency regions up to a predetermined maximum level, calculates the power in each temporal frequency band over all pixels, and calculates from these band powers the proportion of power in the temporal high-frequency region; a spatial-direction decomposition level determination unit that judges, from the proportion of temporal high-frequency power over the plurality of frames, that the larger this proportion, the larger the moving-region area and the motion amount, and determines the spatial-frequency decomposition level (Ns) for the base frame according to that proportion; a spatial-direction low-frequency-region power calculation unit that, based on the decomposition level (Ns), performs a spatial Ns-level discrete wavelet decomposition on one frame of the moving image, calculates the power in each spatial frequency band of that frame, and calculates from these band powers the proportion of power in the spatial low-frequency region of the base frame; and a motion detection start resolution determination unit that judges that the larger the proportion of spatial low-frequency power in the one frame, the larger the moving-region area and the motion amount in the moving image, and determines, according to that proportion, a motion detection start resolution representing the spatial-frequency decomposition level at which motion vector detection starts. The hierarchical motion vector detecting means comprises a hierarchical motion detection unit that divides the base frame into blocks of the size corresponding to the motion detection start resolution, detects a motion vector for each block within the motion search range corresponding to that resolution, and repeats the detection while successively decrementing the level until it reaches the top level, which corresponds to the original base frame, thereby determining the final motion vectors.
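The coarse-to-fine procedure of the hierarchical motion detection unit can be sketched as below. This is a simplified illustration under stated assumptions: 2 × 2 averaging stands in for the wavelet low-frequency band, the matching cost is SAD with exhaustive search, the block size and search radius are fixed per layer (so they correspond to larger sizes in the original picture at coarser layers), and all names are hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 block
    (a stand-in for the wavelet low-frequency band)."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4
             for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]

def sad(ref, cur, bx, by, dx, dy, size):
    return sum(abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
               for y in range(size) for x in range(size))

def refine(ref, cur, bx, by, size, center, radius):
    """Best displacement within `radius` of `center` by exhaustive SAD search."""
    h, w = len(ref), len(ref[0])
    best, best_cost = center, None
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            if not (0 <= bx + dx and bx + dx + size <= w and
                    0 <= by + dy and by + dy + size <= h):
                continue
            cost = sad(ref, cur, bx, by, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best

def hierarchical_mv(ref, cur, bx, by, size, radius, start_level):
    """Detect the motion vector of the block at (bx, by) in `cur`, starting
    at `start_level` and decrementing the level down to the original picture."""
    pyr_ref, pyr_cur = [ref], [cur]
    for _ in range(start_level):
        pyr_ref.append(downsample(pyr_ref[-1]))
        pyr_cur.append(downsample(pyr_cur[-1]))
    mv = (0, 0)
    for level in range(start_level, -1, -1):
        scale = 2 ** level
        mv = refine(pyr_ref[level], pyr_cur[level],
                    bx // scale, by // scale, size, mv, radius)
        if level > 0:
            mv = (mv[0] * 2, mv[1] * 2)  # carry the estimate to the next finer level
    return mv
```

Starting one level up doubles the reach of the same search radius in original-picture pixels, which is why a displacement outside the radius at level 0 can still be found when detection starts at a coarser layer.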

In the motion vector detection apparatus of the present invention, the proportion of spatial low-frequency power in the one frame is the proportion in the base frame, in which motion vectors are detected; the proportion in the reference frame, which is used for the motion search; or the larger of the proportions in the base frame and the reference frame.

The present invention is further characterized as a program for a computer configured as a motion vector detection apparatus that detects motion vectors of a moving image. The computer holds a table that associates the proportion of power in the temporal high-frequency region with a spatial-frequency decomposition level, such that the larger the proportion of temporal high-frequency power over a plurality of frames of the moving image, the larger the block size and the motion search range, and/or a table that associates the proportion of power in the spatial low-frequency region with the spatial-frequency decomposition level at which motion vector detection starts, such that the larger the proportion of spatial low-frequency power in one frame of the moving image, the larger the block size and the motion search range. The program causes the computer to execute: a step of detecting the proportion of temporal high-frequency power and/or the proportion of spatial low-frequency power and determining the spatial-frequency decomposition level; and a step of referring to the table, starting motion vector detection from the layer whose block size and motion search range correspond to that decomposition level, and successively decrementing the layer so as to move progressively to motion vector detection in layers with smaller block sizes and smaller motion search ranges.

According to the present invention, motion vector detection in a moving image can start from an appropriately estimated block size. In particular, by estimating the motion amount and the moving region from the spectral power in the temporal and spatial directions, determining the block size and the motion search range from them, and detecting motion vectors hierarchically, the computational cost of noise-robust, high-accuracy motion vector detection can be reduced. For example, motion vectors can be detected with an accurate estimate for objects that move substantially on an SHV screen.

FIG. 1 is a schematic diagram of a motion vector detection apparatus according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of the time-direction high-frequency-region power calculation unit in the motion vector detection apparatus of the embodiment.
FIG. 3 is a schematic diagram of the spatial-direction low-frequency-region power calculation unit in the motion vector detection apparatus of the embodiment.
FIG. 4 is a schematic diagram of the motion detection start resolution determination unit in the motion vector detection apparatus of the embodiment.
FIG. 5 is a schematic diagram of the hierarchical motion detection unit in the motion vector detection apparatus of the embodiment.
FIG. 6 is a flowchart showing the operation of the motion vector detection apparatus of the embodiment.
FIG. 7 is a diagram showing a frame image sequence in the motion vector detection apparatus of the embodiment.
FIG. 8 is a diagram explaining the operation of the time-direction high-frequency-region power calculation unit in the motion vector detection apparatus of the embodiment.
FIG. 9 is a diagram explaining the two-dimensional, two-level discrete wavelet decomposition used in the motion vector detection apparatus of the embodiment.
FIG. 10 is a diagram explaining the operation of the motion detection start resolution determination unit in the motion vector detection apparatus of the embodiment.
FIG. 11 is a diagram explaining the block sizes used by the motion detection start resolution determination unit in the motion vector detection apparatus of the embodiment.
FIG. 12 is a diagram explaining block matching at fractional pixel positions by quadratic function approximation in the motion vector detection apparatus of the embodiment.
FIG. 13 is a diagram showing how the motion blur per pixel in a moving region changes with the video format.

In overview, the motion vector detection apparatus of the present invention performs hierarchical motion vector detection. Detection starts at a resolution defined hierarchically such that the larger the proportion of temporal high-frequency power over a plurality of frames including the base frame of the moving image, the larger the block size and the motion search range, and/or at a resolution defined hierarchically such that the larger the proportion of spatial low-frequency power in one frame of the moving image, the larger the block size and the motion search range. To this end, the apparatus detects the proportion of temporal high-frequency power and/or the proportion of spatial low-frequency power, determines a resolution value, starts motion vector detection from the layer whose block size and motion search range correspond to that value, and progressively moves to motion vector detection in layers with smaller block sizes and smaller motion search ranges.

A motion vector detection apparatus according to an embodiment of the present invention is described below.

The motion vector detection apparatus 1 of the embodiment is described for the case where the wavelet transform is used for frequency analysis in the temporal and spatial directions. Other orthogonal transforms or the FFT (fast Fourier transform) can be used for this frequency analysis instead of the wavelet transform.

[Device configuration]
FIG. 1 shows a motion vector detection apparatus 1 according to an embodiment of the present invention. The motion vector detection apparatus 1 of this embodiment comprises a time-direction high-frequency-region power calculation unit 11, a spatial decomposition level determination unit 12, a spatial-direction low-frequency-region power calculation unit 13, a motion detection start resolution determination unit 14, and a hierarchical motion detection unit 15. The image data that each component needs for its processing can be stored in, and read from, a storage unit (not shown) of the motion vector detection apparatus 1 as appropriate.

The time-direction high-frequency-region power calculation unit 11 receives a frame image sequence F(t_0), ..., F(t_C), ..., F(t_R), ..., F(t_m) of a plurality of frames at times t = t_0, ..., t_m, which includes the base frame F(t_C), in which motion vectors are detected, and the reference frame F(t_R), which is used for the motion search. For every pixel of the base frame F(t_C), it decomposes the frames in the temporal direction into frequency regions up to a predetermined maximum level, calculates the power in each temporal frequency band over all pixels, calculates from these band powers the proportion of power in the temporal high-frequency region, and sends it to the spatial decomposition level determination unit 12.

For example, as shown in FIG. 2, the time-direction high-frequency-region power calculation unit 11 can be composed of two parts. The first is a time-direction one-dimensional Nmax-level discrete wavelet decomposition processing unit 111, which receives the frame image sequence F(t_0), ..., F(t_C), ..., F(t_R), ..., F(t_m) at times t = t_0, ..., t_m, including the base frame F(t_C) and the reference frame F(t_R), and performs a discrete wavelet decomposition of a predetermined level Nmax (for example, four levels) in the temporal direction for every pixel of the base frame F(t_C). The second is a time-direction per-band power calculation unit 112, which calculates the power in each temporal frequency band over all pixels of the base frame F(t_C), calculates from these band powers the proportion of power in the temporal high-frequency region, and sends it to the spatial decomposition level determination unit 12.
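The per-pixel temporal decomposition and the power proportion can be sketched as follows. The source does not name a wavelet or say which bands form the "high-frequency region", so the orthonormal Haar wavelet and the choice of the finest `n_high` detail bands are assumptions, as are all function names.

```python
def haar_step(signal):
    """One level of the orthonormal Haar transform of an even-length sequence."""
    s = 2 ** 0.5
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def temporal_band_powers(frames, n_max):
    """Power per temporal band, summed over all pixels.

    `frames` is a list of equally sized 2-D frames whose count is a multiple
    of 2**n_max. Returns [detail level 1 (highest frequency), ...,
    detail level n_max, final approximation (lowest frequency)]."""
    height, width = len(frames[0]), len(frames[0][0])
    powers = [0.0] * (n_max + 1)
    for y in range(height):
        for x in range(width):
            approx = [frame[y][x] for frame in frames]  # one pixel over time
            for level in range(n_max):
                approx, detail = haar_step(approx)
                powers[level] += sum(d * d for d in detail)
            powers[n_max] += sum(a * a for a in approx)
    return powers

def high_freq_proportion(powers, n_high=1):
    """Share of total power held by the `n_high` finest detail bands."""
    total = sum(powers)
    return sum(powers[:n_high]) / total if total else 0.0
```

A static sequence puts all its power in the final approximation band, giving a proportion of 0, while a pixel that flips sign every frame puts all its power in the finest detail band, giving a proportion of 1.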

From the proportion of temporal high-frequency power calculated by the time-direction high-frequency-region power calculation unit 11, the spatial decomposition level determination unit 12 judges that the larger this proportion, the larger the moving-region area and the motion amount. It determines the spatial-frequency decomposition level Ns (that is, the resolution in the spatial direction) according to that proportion and sends the determined level Ns to the spatial-direction low-frequency-region power calculation unit 13.

The spatial-direction low-frequency region power calculation unit 13 receives the base frame F(t_C), the reference frame F(t_R), and information on the spatial-frequency decomposition level Ns. Based on Ns, it performs a spatial Ns-level discrete wavelet decomposition on the base frame F(t_C) and/or the reference frame F(t_R), calculates the power of each spatial frequency band in that frame, and from these band powers calculates the proportion of power in the spatial-direction low-frequency region of that frame. It sends the calculated proportion, together with the spatial Ns-level discrete wavelet decomposition data, to the motion detection start resolution determination unit 14.

Performing the spatial Ns-level discrete wavelet decomposition on both the base frame F(t_C) and the reference frame F(t_R) is advantageous for the later processing: when hierarchical motion vector detection is performed with a fixed block size and search range, carrying out the wavelet reconstruction hierarchically amounts to hierarchical motion vector detection with a variable block size and search range relative to the original image. In particular, for determining the resolution at which hierarchical motion vector detection is performed, it is preferable to select whichever of the base frame F(t_C) and the reference frame F(t_R) has the larger proportion of power in the spatial-direction low-frequency region. The following description uses an example in which the spatial Ns-level discrete wavelet decomposition is performed on both the base frame F(t_C) and the reference frame F(t_R).

For example, as shown in FIG. 3, the spatial-direction low-frequency region power calculation unit 13 can be composed of two parts. The first is a spatial-direction two-dimensional Ns-level discrete wavelet decomposition processing unit 131, which receives the base frame F(t_C) and the reference frame F(t_R) and, based on the spatial-frequency decomposition level Ns determined by the spatial decomposition level determination unit 12, performs a spatial Ns-level discrete wavelet decomposition on all pixels of each of the two frames. The second is a spatial-direction per-frequency-band power calculation unit 132, which calculates the power of each spatial frequency band in each of the base frame F(t_C) and the reference frame F(t_R), calculates from these band powers the proportion of power in the spatial-direction low-frequency region of each frame, and sends the larger of the two proportions to the motion detection start resolution determination unit 14.

The motion detection start resolution determination unit 14 judges, from the proportion of power in the spatial-direction low-frequency region calculated by the spatial-direction low-frequency region power calculation unit 13, that the larger this proportion is (that is, the smaller the proportion of power in the spatial-direction high-frequency region), the larger the moving-region area and the larger the amount of motion. It determines the resolution at which hierarchical motion vector detection starts (hereinafter, the "motion detection start resolution") according to this proportion, so that a larger proportion yields a smaller value (that is, a lower-resolution image), and sends information on the determined hierarchical motion detection start resolution to the hierarchical motion detection unit 15.

For example, as shown in FIG. 4, the motion detection start resolution determination unit 14 can be configured as a motion detection start level determination unit 141. This unit judges, from the proportion of power in the spatial-direction low-frequency region calculated by the spatial-direction low-frequency region power calculation unit 13, that the larger this proportion is (that is, the smaller the proportion of power in the spatial-direction high-frequency region), the larger the moving-region area and the larger the amount of motion. It determines the level at which motion vector detection starts (hereinafter, the "motion detection start level") ns according to this proportion, so that a larger proportion yields a larger ns, and sends information on the determined motion detection start level ns to the hierarchical motion detection unit 15.

To obtain images of the base frame F(t_C) and the reference frame F(t_R) restricted to the spatial low-frequency region corresponding to the motion detection start level ns, the hierarchical motion detection unit 15 performs a spatial ns-level wavelet reconstruction, according to ns, on the spatial-direction two-dimensional Ns-level discrete wavelet decomposition data of the two frames, and executes motion vector detection with a predetermined block size and search range. It then performs a spatial (ns-1)-level wavelet reconstruction and executes motion vector detection again with the same predetermined block size and search range, and repeats this while decrementing the level until motion vector detection has been performed at the top layer (that is, the original image level) with that block size and search range. This hierarchical motion vector detection is similar to receiving the base frame F(t_C) and the reference frame F(t_R) and, starting from the motion detection start resolution, performing motion vector detection on the base frame F(t_C) while successively reducing the block size and search range. However, because the hierarchical detection proceeds by successive repetition through spatial ns-level wavelet decomposition and reconstruction, there is no need to prepare block sizes and search ranges that vary from layer to layer; they can be fixed. In addition, because motion vector detection starts from a motion detection start level ns suited to the image scene, higher accuracy can be expected.

For example, as shown in FIG. 5, the hierarchical motion detection unit 15 can be composed of two parts. The first is a motion detection unit 151, which performs a spatial ns-level wavelet reconstruction according to the motion detection start level ns in order to obtain the spatial low-frequency images of the base frame F(t_C) and the reference frame F(t_R), and performs motion vector detection by block matching with fractional-pixel accuracy using a predetermined block size and search range. The second is a spatial one-level wavelet reconstruction unit 152. So that the motion vector detection can be repeated while decrementing the level until the original image level corresponding to the top level is reached, this unit performs a wavelet reconstruction one level up in the spatial direction on the spatial Ns-level discrete wavelet decomposition data calculated by the spatial-direction low-frequency region power calculation unit 13, producing an image of a level above the motion detection start level ns, and sends it to the motion detection unit 151. The motion detection unit 151 can therefore use the reconstructed images of the base frame F(t_C) and the reference frame F(t_R) obtained from the spatial one-level wavelet reconstruction unit 152 to repeat the motion vector detection hierarchically, and determine and output the final motion vector.

The operation of the motion vector detection device 1 according to one embodiment of the present invention is described below in more detail.

[Device operation]
FIG. 6 is an operation flow showing the operation of the motion vector detection device according to one embodiment of the present invention.

In step S1, the motion vector detection device 1 receives the frame image sequence F(t_0), ..., F(t_C), ..., F(t_R), ..., F(t_m) at times t = t_0 ... t_m, including the base frame F(t_C) and the reference frame F(t_R), and stores it in a storage unit (not shown) of the motion vector detection device 1 so that it can be read out as needed.

In step S2, the time-direction high-frequency region power calculation unit 11 receives the frame image sequence F(t_0), ..., F(t_C), ..., F(t_R), ..., F(t_m), decomposes every pixel of the base frame F(t_C) in the time direction into frequency regions up to a predefined maximum level, calculates the power of each time-direction frequency band over all pixels, and from these band powers calculates the proportion of power in the time-direction high-frequency region.

For example, as shown in FIG. 7, a given pixel R(k, l) in the frame image sequence F(t_0), ..., F(t_C), ..., F(t_R), ..., F(t_m) is decomposed in the time direction into frequency regions up to the predefined maximum level (Nmax), after which the power of each time-direction frequency band over all pixels can be calculated. For example, as shown in FIG. 8, when a 16-frame image sequence F(t) is decomposed in the time direction to Nmax levels, Nmax = 1 divides it into a low-frequency region L^1 and a high-frequency region H^1 (see FIG. 8(a)); Nmax = 2 divides it into a low-frequency region L^2 and high-frequency regions H^1 and H^2 (see FIG. 8(b)); Nmax = 3 divides it into a low-frequency region L^3 and high-frequency regions H^1, H^2, and H^3 (see FIG. 8(c)); and Nmax = 4 divides it into a low-frequency region L^4 and high-frequency regions H^1, H^2, H^3, and H^4 (see FIG. 8(d)).
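The temporal decomposition of FIG. 8 and the power proportion of step S2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: it uses an orthonormal Haar wavelet (the text does not name a wavelet), it counts all detail bands H^1 to H^Nmax as the "high-frequency region", and the function names are hypothetical.

```python
import numpy as np

def haar_temporal_band_powers(frames, n_levels):
    """Return ([power of H^1 .. H^n], power of L^n) for a (T, H, W) frame stack.

    Assumption: orthonormal Haar wavelet applied along the time axis of every pixel.
    """
    low = np.asarray(frames, dtype=np.float64)
    if low.shape[0] % (2 ** n_levels) != 0:
        raise ValueError("frame count must be divisible by 2**n_levels")
    high_powers = []
    for _ in range(n_levels):
        even, odd = low[0::2], low[1::2]
        high = (even - odd) / np.sqrt(2.0)  # detail band H^k
        low = (even + odd) / np.sqrt(2.0)   # approximation band L^k
        high_powers.append(float(np.sum(high ** 2)))
    return high_powers, float(np.sum(low ** 2))

def temporal_high_frequency_ratio(frames, n_levels=4):
    """Proportion of total power that falls in the detail bands (assumed definition)."""
    high_powers, low_power = haar_temporal_band_powers(frames, n_levels)
    total = sum(high_powers) + low_power
    return 0.0 if total == 0.0 else sum(high_powers) / total

# A static 16-frame sequence versus one that alternates sign every frame.
static = np.ones((16, 4, 4))
flicker = static * np.where(np.arange(16) % 2 == 0, 1.0, -1.0)[:, None, None]
static_ratio = temporal_high_frequency_ratio(static)
flicker_ratio = temporal_high_frequency_ratio(flicker)
```

With these inputs the static sequence puts all of its power in L^4, while the alternating sequence puts all of its power in H^1, which is the behaviour that step S3 relies on when it treats a large high-frequency proportion as a sign of large motion.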

In step S3, the spatial decomposition level determination unit 12 judges, from the calculated proportion of power in the time-direction high-frequency region, that the larger this proportion is, the larger the moving-region area and the larger the amount of motion. It determines the spatial-frequency decomposition level Ns of the base frame F(t_C) and the reference frame F(t_R) according to this proportion, so that a larger proportion yields a larger Ns.

For example, a table such as Table 1, which defines the correspondence between the proportion of power in the time-direction high-frequency region and the spatial-frequency decomposition level Ns, is held in advance.

In step S4, the spatial-direction low-frequency region power calculation unit 13 receives the base frame F(t_C) and the reference frame F(t_R) and, based on the spatial-frequency decomposition level Ns determined by the spatial decomposition level determination unit 12, performs a spatial Ns-level discrete wavelet decomposition on both frames and calculates the power of each spatial frequency band in each of them. To determine the motion detection start resolution (or the motion detection start level ns), it then selects, from the calculated band powers, the larger of the per-band power proportions in the base frame F(t_C) and the reference frame F(t_R). Alternatively, the per-band power proportion may be calculated for only the base frame F(t_C) or only the reference frame F(t_R).

For example, as shown in FIG. 9(a), a two-dimensional two-level discrete wavelet decomposition is performed in the spatial direction on all pixels of the base frame F(t_C), the power of each frequency region is calculated, and the proportion of power in the spatial-direction low-frequency region is calculated from the power of each spatial frequency band. Also, as shown in FIG. 9(b), an image containing only the low-frequency region of the base frame F(t_C) can be reconstructed by extracting only the spatial low-frequency region (for example, LL^2) of the base frame F(t_C).
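The spatial counterpart of this calculation can be sketched as follows. Again this is only an illustration under assumptions: an orthonormal 2-D Haar wavelet stands in for the unspecified wavelet, the "low-frequency region" is taken to be the final LL band, and the function names are hypothetical. Because the transform is orthonormal, the total power of all bands equals the power of the input frame, so only the LL band has to be tracked.

```python
import numpy as np

def haar2d_step(img):
    """One orthonormal 2-D Haar level: returns (LL, tuple of three detail bands)."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    ll = (a + b + c + d) / 2.0
    details = ((a - b + c - d) / 2.0,
               (a + b - c - d) / 2.0,
               (a - b - c + d) / 2.0)
    return ll, details

def low_frequency_ratio(frame, n_levels):
    """Proportion of the frame's power that remains in LL^n after n levels."""
    ll = np.asarray(frame, dtype=np.float64)
    if ll.shape[0] % (2 ** n_levels) or ll.shape[1] % (2 ** n_levels):
        raise ValueError("frame sides must be divisible by 2**n_levels")
    total = float(np.sum(ll ** 2))
    for _ in range(n_levels):
        ll, _ = haar2d_step(ll)
    # Convention chosen for this sketch: an all-zero frame reports 0.0.
    return 0.0 if total == 0.0 else float(np.sum(ll ** 2)) / total

flat = np.full((16, 16), 3.0)                    # no spatial detail at all
yy, xx = np.indices((16, 16))
checker = np.where((yy + xx) % 2 == 0, 1.0, -1.0)  # finest possible detail
flat_ratio = low_frequency_ratio(flat, 2)
checker_ratio = low_frequency_ratio(checker, 2)
```

A flat frame keeps all of its power in LL^2 and a one-pixel checkerboard keeps none, which brackets the range of proportions that step S5 maps to a motion detection start level.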

In step S5, the motion detection start resolution determination unit 14 judges, from the spatial Ns-level discrete wavelet decomposition data and the proportion of power in the spatial-direction low-frequency region, that the larger this proportion is (that is, the smaller the proportion of power in the spatial-direction high-frequency region), the larger the moving-region area and the larger the amount of motion. It determines the hierarchical motion detection start resolution (or the motion detection start level ns) according to this proportion, so that a larger proportion yields a smaller motion detection start resolution (a larger motion detection start level ns). Here, ns ≤ Ns.

For example, a table such as Table 2, which defines the correspondence between the proportion of power in the spatial-direction low-frequency region and the motion detection start resolution (or the motion detection start level ns), is held in advance. A larger motion detection start level means that the original image is reduced to a lower resolution, and therefore that the block size and motion search range become larger relative to the original image. For example, for spatial resolutions of 1/16, 1/8, 1/4, and 1/2, the (block size, motion search range) pairs are (16×16, 16 pixels horizontally and vertically), (8×8, 8 pixels horizontally and vertically), (4×4, 4 pixels horizontally and vertically), and (2×2, 2 pixels horizontally and vertically), respectively. Here, a spatial resolution of 1/16, for example, means a resolution reduction in which 16 pixels are sampled as one pixel with respect to the horizontal sampling frequency Hs and the vertical sampling frequency Vs of the original image.

That is, as shown in FIG. 10, the motion detection start level ns can be associated with the proportion of power in the spatial-direction low-frequency region. For example, when Ns = 4, the power of the low-frequency region (LL^4) and of the high-frequency region (everything other than LL^4) is calculated. If the proportion of the low-frequency region (LL^4) in the whole is 99.5% or more, the motion detection start level is set to ns = 4 and an image containing only the four-level low-frequency region can be reconstructed (see FIG. 10(d)). If the proportion is 98.0% or more and less than 99.5%, the motion detection start level is set to ns = 3 and an image containing only the three-level low-frequency region (in this case, LL^3) can be reconstructed (see FIG. 10(c)). Similarly, if the proportion is 95.0% or more and less than 98.0%, the motion detection start level is set to ns = 2 and an image containing only the two-level low-frequency region (in this case, LL^2) can be reconstructed (see FIG. 10(b)); and if the proportion is less than 95.0%, the motion detection start level is set to ns = 1 and an image containing only the one-level low-frequency region (in this case, LL^1) can be reconstructed.
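The threshold rule just described for Ns = 4 can be written as a small lookup. The thresholds (99.5%, 98.0%, 95.0%) are the ones stated in the text; the function name and the clamping to ns ≤ Ns for other values of Ns are assumptions of this sketch, since the text gives thresholds only for Ns = 4.

```python
def select_start_level(ll_ratio_percent, n_s=4):
    """Map the LL^4 power proportion (in percent) to a motion detection start level ns."""
    if ll_ratio_percent >= 99.5:
        ns = 4
    elif ll_ratio_percent >= 98.0:
        ns = 3
    elif ll_ratio_percent >= 95.0:
        ns = 2
    else:
        ns = 1
    # Step S5 requires ns <= Ns; clamping here is an assumption for n_s < 4.
    return min(ns, n_s)

levels = [select_start_level(p) for p in (99.7, 98.0, 97.0, 50.0)]
```

For the four sample proportions this gives start levels 4, 3, 2, and 1, so a frame dominated by low spatial frequencies starts the search at the coarsest layer.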

In step S6, the hierarchical motion detection unit 15 takes the low-frequency images of the base frame F(t_C) and the reference frame F(t_R) obtained according to the motion detection start resolution (motion detection start level ns), divides the low-frequency image of the base frame F(t_C) into blocks of a predetermined size, and for each block performs motion vector detection by block matching with fractional-pixel accuracy within a motion search range of a predetermined size.

In step S7, to obtain the effect of repeating motion vector detection while reducing the block size and motion search range for the base frame F(t_C) and the reference frame F(t_R), the hierarchical motion detection unit 15 performs a wavelet reconstruction one level up in the spatial direction on the previously calculated spatial ns-level discrete wavelet decomposition data of the base frame F(t_C) and the reference frame F(t_R), producing an image of a level above the motion detection start level ns.

Using the reconstructed images of the base frame F(t_C) and the reference frame F(t_R) obtained from the spatial one-level wavelet reconstruction unit 152, the hierarchical motion detection unit 15 repeats the motion vector detection while successively decrementing the level until the top layer (that is, motion vector detection at the original image level) is reached (step S8), and can then determine and output the final motion vector (step S9).

For example, FIGS. 11(a) to 11(d) show how the block size increases as the motion detection start level ns increases; the larger the block size, the larger the motion search range in the reference frame F(t_R).

That is, the ns-level low-frequency image of the base frame F(t_C) is divided into blocks B^ns of horizontal and vertical size (Bx^ns, By^ns) (the superscript denotes the level), each block is searched for in the reference frame F(t_R) within a range of ±Sx^ns, ±Sy^ns (for example, ±2 blocks), and a motion vector v^ns is calculated for each block. Next, the (ns-1)-level low-frequency image of the base frame F(t_C) is divided with block size (Bx^ns, By^ns), and a search is made on the ns-level low-frequency image of the reference frame F(t_R) within ±Sx^ns and ±Sy^ns pixels horizontally and vertically, centered on the location displaced by 2×v^ns from the same position. The vector 2×v^ns is then added to the motion vector obtained, giving the motion vector v^(ns-1) in the (ns-1)-level low-frequency image. Repeating motion vector detection in this way up to the top level (that is, level 1) improves accuracy.
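The coarse-to-fine procedure above can be sketched in Python as follows. This is a simplified illustration, not the embodiment itself. It assumes that the low-frequency image at each level can be approximated by repeated 2×2 averaging (which equals the Haar LL band up to a scale factor) rather than by wavelet reconstruction, it works at integer-pixel accuracy only, it iterates down to the original image, and it requires frame sides divisible by block × 2^start_level. All function names are hypothetical.

```python
import numpy as np

def downsample2(img):
    """Halve the resolution by 2x2 averaging (Haar LL band up to a scale factor)."""
    return (img[0::2, 0::2] + img[0::2, 1::2] +
            img[1::2, 0::2] + img[1::2, 1::2]) / 4.0

def block_match(base, ref, block, search, init):
    """Full-search SSD block matching around a predicted (dy, dx) for each block."""
    h, w = base.shape
    nby, nbx = h // block, w // block
    vectors = np.zeros((nby, nbx, 2), dtype=int)
    for by in range(nby):
        for bx in range(nbx):
            y0, x0 = by * block, bx * block
            target = base[y0:y0 + block, x0:x0 + block]
            py, px = init[by, bx]
            best_ssd, best = np.inf, (py, px)
            for dy in range(py - search, py + search + 1):
                for dx in range(px - search, px + search + 1):
                    y, x = y0 + dy, x0 + dx
                    if y < 0 or x < 0 or y + block > h or x + block > w:
                        continue  # candidate falls outside the reference frame
                    ssd = np.sum((target - ref[y:y + block, x:x + block]) ** 2)
                    if ssd < best_ssd:
                        best_ssd, best = ssd, (dy, dx)
            vectors[by, bx] = best
    return vectors

def hierarchical_motion(base, ref, start_level, block=4, search=2):
    """Detect motion from the coarsest level down with a fixed block and search size."""
    pyramid = [(np.asarray(base, float), np.asarray(ref, float))]
    for _ in range(start_level):
        b, r = pyramid[-1]
        pyramid.append((downsample2(b), downsample2(r)))
    vectors = None
    for level in range(start_level, -1, -1):
        b, r = pyramid[level]
        nby, nbx = b.shape[0] // block, b.shape[1] // block
        if vectors is None:
            init = np.zeros((nby, nbx, 2), dtype=int)
        else:
            # Each finer block starts from twice the vector of its coarser parent.
            init = 2 * np.repeat(np.repeat(vectors, 2, axis=0), 2, axis=1)[:nby, :nbx]
        vectors = block_match(b, r, block, search, init)
    return vectors

rng = np.random.default_rng(0)
base = rng.random((64, 64))
ref = np.roll(base, shift=(8, -4), axis=(0, 1))  # known global displacement
field = hierarchical_motion(base, ref, start_level=2)
center_vector = tuple(int(c) for c in field[7, 7])
```

Searching absolute displacements in a window centered on the doubled coarse vector is equivalent to searching a residual within ±S and then adding 2×v^ns, as the text describes. With a ±2 search at the original resolution alone, the displacement of 8 rows in this example would be out of reach; starting two levels down brings it within range while the block size and search range stay fixed.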

In the motion vector detection, block matching at fractional-pixel positions by quadratic function approximation is preferably performed only at the top level (that is, level 1); the fractional-pixel position is given by Equation (1).

Here, with x denoting the pixel position at a search position, SSD(x) is the SSD value (sum of squared errors) at that pixel position. More specifically, SSD(0) is the SSD value at the center position, SSD(-1) is the SSD value at the position -Sx (Sy) pixels from the center position, and SSD(1) is the SSD value at the position +Sx (Sy) pixels from the center position. Equation (1) yields the pixel position with fractional-pixel accuracy (the fractional-pixel position) in the horizontal or vertical direction. For example, as shown in FIG. 12, the quadratic function approximation of Equation (1) can give a fractional-pixel position of, say, -0.33.
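Equation (1) itself is not reproduced in this text. A commonly used quadratic fit that is consistent with the definitions of SSD(-1), SSD(0), and SSD(1) above places the minimum of the parabola through the three samples at x = (SSD(-1) - SSD(1)) / (2·(SSD(-1) - 2·SSD(0) + SSD(1))). The sketch below implements that form as an assumption; it may differ from the exact expression of Equation (1), and the function name and the sample SSD values are hypothetical.

```python
def subpixel_offset(ssd_minus, ssd_center, ssd_plus):
    """Vertex of the parabola through (-1, ssd_minus), (0, ssd_center), (1, ssd_plus).

    The result is in units of the sample spacing; 0.0 means the center position.
    """
    denom = 2.0 * (ssd_minus - 2.0 * ssd_center + ssd_plus)
    if denom <= 0.0:
        return 0.0  # flat or non-convex samples: keep the integer position
    return (ssd_minus - ssd_plus) / denom

# Hypothetical SSD samples chosen so that the minimum lies left of the center.
offset = subpixel_offset(2.0, 1.0, 6.0)
```

With these sample values the offset is -1/3, a value of the same kind as the -0.33 read from FIG. 12; the lower SSD on the negative side pulls the estimated minimum toward it.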

As described above, the motion vector detection device of the embodiment can estimate an appropriate block size before starting motion vector detection in a moving image. In particular, by estimating the amount of motion and the motion region from the spectral power in the temporal and spatial directions, determining the block size and motion search range accordingly, and performing hierarchical motion vector detection in the spatial direction, it can reduce the computational cost of noise-robust, high-accuracy motion vector detection. For example, it enables motion vector detection that accurately estimates objects with large motion in an SHV picture.

Furthermore, applying motion vectors obtained with such accuracy and efficiency to the encoding processing of an existing encoder, or to super-resolution processing, can be expected to improve quality further.

In the embodiment above, an example was described in which the motion vector detection device 1, when detecting the power of the high-frequency regions in the time and spatial directions and determining the resolution value, detects the power of each frequency region in the time direction over multiple frames, judges that the larger the detected proportion of power in the time-direction high-frequency region, the larger the moving-region area and the amount of motion, and determines the spatial-direction resolution. It then detects the proportion of power in the spatial-direction low-frequency region according to the determined spatial-direction resolution, and judges that the larger the detected power in the spatial-direction low-frequency region, the larger the moving-region area and the amount of motion.

As another example, a modified motion vector detection device can be configured to determine the resolution value from the proportion of power in the time-direction high-frequency region alone: it detects the power of each frequency region in the time direction over multiple frames, and judges that the larger the detected proportion of power in the time-direction high-frequency region, the larger the moving-region area and the amount of motion.

As yet another example, a further modified motion vector detection device can be configured to determine the resolution value from the proportion of power in the spatial-direction high-frequency region alone: it detects the power of each frequency region in the spatial direction at a predefined resolution, and judges that the larger the detected power in the spatial-direction low-frequency region, the larger the moving-region area and the amount of motion.

The embodiment above uses the wavelet transform to calculate the power of each frequency region, but the power of each frequency region can also be calculated with a known orthogonal transform such as the discrete cosine transform, or with a filter bank or the like. The wavelet transform has the advantage of capturing frequency changes in the image with high accuracy; the discrete cosine transform simplifies the device configuration when the existing system already uses it.

Accordingly, the motion vector detection device of the present invention can be configured to perform hierarchical motion vector detection as follows: it detects the power of each frequency region in the time direction and/or the spatial direction to determine a resolution value, starts motion vector detection from the layer whose block size and motion search range correspond to the determined resolution value, and progressively moves to motion vector detection in layers with a smaller block size and a smaller motion search range.

Here, determining first the resolution defined by the proportion of power in the time-direction high-frequency region, and then the hierarchically defined resolution from the proportion of power in the spatial-direction high-frequency region, allows the device to be configured with excellent processing efficiency. Alternatively, the resolutions defined for combinations of the proportion of power in the time-direction high-frequency region and the proportion of power in the spatial-direction high-frequency region can be held as a table; the two proportions are then calculated in parallel, and the resolution value defined for their combination is determined.

Furthermore, as one aspect of the present invention, the motion vector detection device of the present invention can be configured as a computer. A program for causing the computer to implement each component of the motion vector detection device described above is stored in a storage unit provided inside or outside the computer. Such a storage unit can be realized by an external storage device such as an external hard disk, or by an internal storage device such as ROM or RAM. The control unit provided in the computer can be realized under the control of a central processing unit (CPU) or the like. That is, the CPU reads from the storage unit, as needed, a program describing the processing for realizing the function of each component, and thereby realizes the function of each component on the computer. The function of each component may also be realized by part of the hardware.

A program describing this processing can be distributed by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM. Such a program can also be distributed by storing it in the storage unit of a server on a network and transferring it from the server to other computers via the network.

A computer that executes such a program can, for example, first store the program recorded on the portable recording medium, or the program transferred from the server, in its own storage unit. As another embodiment of the program, the computer may read the program directly from the portable recording medium and execute processing according to it. Alternatively, each time a program is transferred from the server to the computer, the computer may execute processing according to the received program.

The embodiments of the present invention have been described in detail above with specific examples, but it will be apparent to those skilled in the art that any modifications and changes can be made without departing from the scope of the claims of the present invention.

According to the present invention, the accuracy of motion vector detection used in super-resolution processing and video encoding of moving images, or in other image processing, can be improved. Moving image processing has become increasingly high-definition in recent years, and motion vector detection according to the present invention is useful for any moving image processing application that requires highly accurate motion vector detection.

DESCRIPTION OF SYMBOLS
1 Motion vector detection device
11 Time-direction high-frequency region power calculation unit
12 Spatial frequency decomposition rank determination unit
13 Spatial-direction low-frequency region power calculation unit
14 Motion detection start resolution determination unit
15 Hierarchical motion detection unit
111 Time-direction one-dimensional Nmax-rank discrete wavelet decomposition processing unit
112 Time-direction per-frequency-band power calculation unit
131 Spatial-direction two-dimensional Ns-rank discrete wavelet decomposition processing unit
132 Spatial-direction per-frequency-band power calculation unit
141 Motion detection start rank determination unit
151 Motion detection unit
152 Spatial one-rank wavelet reconstruction unit

Claims

A motion vector detection device for detecting a motion vector of a moving image,
the device holding a table that associates the ratio of power in the high-frequency region in the time direction with the decomposition rank of the spatial frequency such that the larger the ratio of power in the high-frequency region in the time direction over a plurality of frames in the moving image, the larger the block size and the larger the motion search range, and/or that associates the ratio of power in the low-frequency region in the spatial direction with the decomposition rank of the spatial frequency at which motion vector detection starts such that the larger the ratio of power in the low-frequency region in the spatial direction of one frame in the moving image, the larger the block size and the larger the motion search range,
the device comprising:
resolution determining means for detecting the ratio of power in the high-frequency region in the time direction and/or the ratio of power in the low-frequency region in the spatial direction to determine the decomposition rank of the spatial frequency; and
hierarchical motion vector detection means for, with reference to the table, starting motion vector detection from the layer having the block size and the motion search range corresponding to the decomposition rank of the spatial frequency and, by sequentially decrementing the layer, moving progressively to motion vector detection in layers having a block size smaller than said block size and a motion search range smaller than said motion search range.

The motion vector detection device according to claim 1, wherein, in order to determine the decomposition rank of the spatial frequency from only the ratio of power in the high-frequency region in the time direction, the resolution determining means detects the power of each frequency region in the time direction over a predetermined plurality of frames, and determines that the larger the detected ratio of power in the high-frequency region in the time direction, the larger the moving region area and the larger the amount of motion.

The motion vector detection device according to claim 1, wherein, in order to determine the decomposition rank of the spatial frequency from only the ratio of power in the high-frequency region in the spatial direction, the resolution determining means detects the power of each frequency region in the spatial direction at a predetermined resolution, and determines that the larger the detected ratio of power in the low-frequency region in the spatial direction, the larger the moving region area and the larger the amount of motion.

The motion vector detection device, wherein, in order to determine the decomposition rank of the spatial frequency by detecting the ratios of power in the high-frequency region in the time direction and in the spatial direction, the resolution determining means detects the power of each frequency region in the time direction over a predetermined plurality of frames, determines that the larger the detected ratio of power in the high-frequency region in the time direction, the larger the moving region area and the larger the amount of motion, and thereby determines a decomposition rank of the spatial frequency; and detects the ratio of power in the low-frequency region in the spatial direction at the determined decomposition rank of the spatial frequency, and determines that the larger the detected ratio of power in the low-frequency region in the spatial direction, the larger the moving region area and the larger the amount of motion.

The motion vector detection device according to claim 1, wherein
the resolution determining means includes:
a spatial frequency decomposition rank determining unit that, for all pixels of the reference frame of the moving image, decomposes the plurality of frames in the time direction into frequency regions up to a maximum rank specified in advance, calculates the power for each frequency band in the time direction at all pixels, and calculates the ratio of power in the high-frequency region in the time direction from the power for each frequency band in the time direction;
a spatial direction decomposition rank determining unit that determines, from the ratio of power in the high-frequency region in the time direction over the plurality of frames, that the larger this ratio, the larger the moving region area and the larger the amount of motion, and determines the decomposition rank (Ns) of the spatial frequency in the reference frame according to this ratio;
a spatial direction low-frequency region power calculation unit that, based on the decomposition rank (Ns) of the spatial frequency, performs a spatial Ns-rank discrete wavelet decomposition on one frame in the moving image, calculates the power for each spatial frequency band in the one frame, and calculates the ratio of power in the low-frequency region in the spatial direction in the reference frame from the calculated power for each spatial frequency band; and
a motion detection start resolution determining unit that determines that the larger the ratio of power in the low-frequency region in the spatial direction in the one frame, the larger the moving region area in the moving image and the larger the amount of motion, and determines, according to this ratio, a motion detection start resolution representing the decomposition rank of the spatial frequency at which the motion vector detection is started; and
the hierarchical motion vector detection means includes:
a hierarchical motion detection unit that divides the reference frame into blocks of the size corresponding to the motion detection start resolution, performs motion vector detection for each of the divided blocks with the motion search range corresponding to the motion detection start resolution, repeats motion vector detection while sequentially decrementing the rank until the highest-layer decomposition rank corresponding to the original reference frame is reached, and determines the final motion vector.

The motion vector detection device according to any one of claims 1 to 5, wherein the ratio of power in the low-frequency region in the spatial direction in the one frame is the ratio of power in the low-frequency region in the spatial direction in the frame that serves as the base for motion vector detection (the base frame), the ratio of power in the low-frequency region in the spatial direction in the reference frame used for motion search, or the larger of the ratios of power in the low-frequency region in the spatial direction of the base frame and the reference frame.

A program for causing a computer configured as a motion vector detection device for detecting a motion vector of a moving image,
the computer holding a table that associates the ratio of power in the high-frequency region in the time direction with the decomposition rank of the spatial frequency such that the larger the ratio of power in the high-frequency region in the time direction over a plurality of frames in the moving image, the larger the block size and the larger the motion search range, and/or that associates the ratio of power in the low-frequency region in the spatial direction with the decomposition rank of the spatial frequency at which motion vector detection starts such that the larger the ratio of power in the low-frequency region in the spatial direction of one frame in the moving image, the larger the block size and the larger the motion search range,
to execute:
a step of detecting the ratio of power in the high-frequency region in the time direction and/or the ratio of power in the low-frequency region in the spatial direction to determine the decomposition rank of the spatial frequency; and
a step of, with reference to the table, starting motion vector detection from the layer having the block size and the motion search range corresponding to the decomposition rank of the spatial frequency and, by sequentially decrementing the layer, moving progressively to motion vector detection in layers having a block size smaller than said block size and a motion search range smaller than said motion search range.