JP2015084186A - Information processing device, control method therefor, and program

Info

Publication number: JP2015084186A
Application number: JP2013222685A
Authority: JP
Inventors: Masaaki Kobayashi (小林 正明)
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2013-10-25
Filing date: 2013-10-25
Publication date: 2015-04-30
Anticipated expiration: 2033-10-25
Also published as: JP6235860B2

Abstract

PROBLEM TO BE SOLVED: To improve the speed and accuracy of image processing. SOLUTION: An information processing device calculates, for each divided region obtained by dividing a processing target image, a first motion parameter indicating motion from a reference image; extracts, from the motion vectors in each divided region, the motion vectors that correspond to the first motion parameter; and uses the extracted motion vectors to calculate a second motion parameter of the processing target image that represents its positional deviation from the reference image.

Description

The present invention relates to an information processing technique for processing temporally continuous images.

As computing performance has improved, image processing techniques in the field known as computer vision, such as region segmentation and image alignment, have become increasingly practical.

In image alignment, a plurality of motion vectors are calculated from temporally continuous images, and a motion parameter expressing the positional deviation of the image (the motion of the image as a whole) is calculated from those motion vectors. Motion vectors can be calculated by, for example, a block-matching motion search, or by detecting feature points, computing their correspondences, and using the coordinates of each corresponding point pair as a motion vector (see Patent Document 1).

A motion parameter can be expressed as a two-dimensional vector, a homography matrix, a rotation matrix (see Non-Patent Document 1), or the like. However, detected motion vectors are not all necessarily correct and may include erroneous ones, so a robust estimation technique that estimates a model from data containing errors is needed. RANSAC is a representative robust estimation algorithm (see Non-Patent Document 2). RANSAC estimates the best model by repeating a computation. However, the more errors the data contains, or the more elements the parameter to be estimated has, the more repetitions (hereinafter, iterations) RANSAC requires. Hereinafter, erroneous data are referred to as outliers and correct data as inliers.
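The dependence of the iteration count on the outlier rate and on the model size can be illustrated with the iteration estimate commonly used with RANSAC. The sketch below is not taken from this publication; the function name and the `confidence` parameter are assumptions introduced for illustration.

```python
import math


def ransac_iterations(outlier_rate: float, sample_size: int,
                      confidence: float = 0.99) -> int:
    """Estimate how many random samples are needed so that, with probability
    `confidence`, at least one sample contains only inliers.

    outlier_rate: fraction of the data that is wrong (0 <= rate < 1).
    sample_size:  number of data points needed to fit one model hypothesis.
    """
    if not 0.0 <= outlier_rate < 1.0:
        raise ValueError("outlier_rate must be in [0, 1)")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    # Probability that one random sample consists of inliers only.
    all_inlier_prob = (1.0 - outlier_rate) ** sample_size
    if all_inlier_prob >= 1.0:
        return 1  # no outliers: a single sample is enough
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - all_inlier_prob))
```

With half of the data being outliers, a one-point model needs far fewer samples than a four-point model, which is the cost the background section describes.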

Image alignment can be applied in many ways, including image shake correction (electronic image stabilization), image composition, encoding, and free-viewpoint generation. For example, for each of a plurality of consecutive frames, the inverse of the matrix expressing the inter-frame motion is created by the above method. By smoothing with these inverse matrices and geometrically transforming each image with the smoothed matrix, shake caused by motion can be corrected. The matrix smoothing can be computed as a moving geometric mean of the matrices. The matrix root needed for the geometric mean can be computed, for example, with the method in Non-Patent Document 3.

Patent Document 1: JP 2007-334625 A

Non-Patent Document 1: Toru Tamaki, "Attitude estimation and rotation matrix" (姿勢推定と回転行列), IEICE Technical Report SIP2009-48, SIS2009-23 (2009-09)

Non-Patent Document 2: M.A. Fischler and R.C. Bolles, "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography", Communications of the ACM, 24(6): 381-395, 1981

Non-Patent Document 3: Dario A. Bini, Nicholas J. Higham, and Beatrice Meini, "Algorithms for the matrix pth root", Numerical Algorithms (2005) 39: 349-378

When a matrix expressing the motion of the entire image is calculated as a motion parameter from motion vectors that include outliers, robust estimation such as RANSAC is required. When the outlier rate of the data is high, RANSAC needs many iterations and processing takes a long time.

The present invention has been made to solve the above problem, and its object is to provide an information processing technique capable of improving the processing speed and processing accuracy of image processing.

To achieve the above object, an information processing apparatus according to the present invention has the following configuration. That is, it is an information processing apparatus that processes temporally continuous images, comprising:
first calculation means for calculating, with divided regions obtained by dividing a processing target image as the processing unit, a first motion parameter indicating motion from a reference image;
extraction means for extracting, from the motion vectors in the divided region, the motion vectors corresponding to the first motion parameter; and
second calculation means for calculating, using the motion vectors extracted by the extraction means from the motion vectors in the divided region, a second motion parameter for the processing target image that expresses positional deviation from the reference image.

According to the present invention, the processing speed and processing accuracy of image processing can be improved.

A diagram explaining the apparatus configuration.
A flowchart showing the electronic image stabilization process.
A flowchart showing details of the transformation matrix estimation process.
A diagram explaining an example of image division.
A flowchart showing details of the representative vector calculation process.
A flowchart showing details of the rotation matrix estimation process.
A flowchart showing details of the similar motion vector extraction process.
A flowchart showing details of the similar motion vector extraction process.
A flowchart showing details of the representative vector calculation process.
A flowchart showing details of the transformation matrix estimation process that uses object-based region division.
A diagram showing an example of an image divided by object and the numbers of the divided regions.

Embodiments of the present invention are described in detail below with reference to the drawings.

<Embodiment 1>
Embodiment 1 describes a configuration that, in order to align images, calculates a plurality of motion vectors from temporally continuous images and calculates from them a motion parameter expressing the motion of the entire image. In this embodiment, a rotation matrix (see Non-Patent Document 1) is calculated as the motion parameter expressing the motion of the entire image (for example, positional deviation), and the description takes as its application example estimating the matrix from consecutive images and applying electronic image stabilization to the images.

In this embodiment, a CPU (central processing unit) in a PC (personal computer), which serves as an information processing apparatus to which a display (display device) is connected, performs the processes for analyzing captured images and detecting feature points. The configuration of the PC and the operation of each module are described below with reference to FIG. 1(a). FIG. 1(a) is a diagram explaining the internal configuration of the PC.

Reference numeral 101 denotes a bus, which interconnects the components in the PC 100 and serves as the data communication path between them. Reference numeral 102 denotes a RAM (writable memory), which functions as a storage area such as the work area of the CPU 105. Reference numeral 103 denotes a graphics processor, which performs the computation needed to display images on the display 104. The graphics processor 103 can perform matrix operations and can apply geometric transformations such as rotation to an image according to a matrix.

Reference numeral 104 denotes a display, a display device that shows information such as commands input from the user I/F 106 and the PC 100's response output to them. Reference numeral 105 denotes a CPU, which cooperates with the other components based on computer programs such as an operating system (OS) and application programs, and controls the operation of the PC 100 as a whole. This embodiment is described as having one CPU, but it is not limited to this, and a configuration with a plurality of CPUs may be adopted. In that case, the processes can run in parallel through multithreading. Reference numeral 106 denotes a user I/F, which accepts instructions and commands from the user, in response to which programs are started. The user I/F 106 is a touch panel, a pointing device, a keyboard, or the like, but is not limited to a specific device. When the user I/F 106 is a touch panel or a pointing device, it can input information on whether a touch occurred at an arbitrary coordinate position on the display 104.

Reference numeral 107 denotes a nonvolatile external storage that functions as a large-capacity memory. In this embodiment it is realized by a hard disk device (hereinafter, HD), but another storage device such as an SSD (a solid state drive using flash memory) may be used. Reference numeral 108 denotes a network I/F, which relays data transmission to and reception from external devices. Reference numeral 109 denotes an external imaging unit such as a camera, which can image a subject and acquire a captured image.

In this embodiment, the programs to be executed and the data are recorded in the external storage 107, loaded into the RAM 102, and executed and processed by the CPU 105. Programs and data are input and output via the bus 101. Unless otherwise stated, image data is input from the external storage 107 and converted on input into an internal image format for processing inside the PC 100. Image data can also be input from the external imaging unit 109 or the network I/F 108.

The internal image format in this embodiment is an RGB image, but it is not limited to this and may be a YUV image or a monochrome luminance image. The motion detection described later is performed on a luminance image; when the internal image format is an RGB or YUV image, the image is converted before motion detection is performed. UI (user interface) screens and processed image results can be displayed on the display 104 via the graphics processor 103. The graphics processor 103 can geometrically transform input image data, and can store the transformed image data in the RAM 102 or output it directly to the display 104. Processed data can be recorded in the external storage 107 or stored in the RAM 102 and shared with other programs.

This embodiment describes an example in which the information processing apparatus that performs electronic image stabilization is realized by a PC, but it is not limited to this. The electronic image stabilization of this embodiment can be carried out with information devices such as an imaging apparatus, an embedded system, a tablet terminal, or a smartphone. A configuration in which hardware executes all or part of the processing may also be adopted. For example, FIG. 1(b) shows the configuration of an imaging apparatus 200. Reference numeral 110 denotes an imaging unit, and 111 denotes a motion detection unit. Components identical to those in FIG. 1(a) are given the same reference numerals, and their description is omitted. In FIG. 1(b), images are input from the imaging unit 110, and the motion detection unit 111 detects motion vectors. The processing described in this embodiment can thus also be executed by an imaging apparatus.

A method of estimating, from consecutive images, a matrix expressing the motion of the entire image and executing electronic image stabilization is described concretely with reference to FIG. 2. FIG. 2 is a flowchart showing the electronic image stabilization process. The flowchart of FIG. 2 is realized by the CPU 105 reading and executing a program recorded in the external storage 107. Depending on the processing, the CPU 105 executes it in cooperation with the components shown in FIG. 1(a) or FIG. 1(b).

In the flowchart descriptions below, unless otherwise stated, the steps are executed in order of step number. Independent processes with no mutual dependency need not be executed in the order described; their order may be changed, and when a plurality of CPUs are present they may be executed in parallel. Likewise, there is no restriction on which subroutine a step belongs to: as long as the processing result is the same, the processing may be executed in a different subroutine, and the subroutine structure is not restricted either.

In S2010, the CPU 105 or the motion detection unit 111 inputs images in order and performs motion detection. In this embodiment, the luminance images of frame c-1 and frame c are input, and motion vectors representing the transformation (motion) from frame c-1 (the reference image) to frame c (the processing target image) are detected. Frame numbers of input images start from 0, frame numbers of processing targets start from 1, and the value of c is incremented each time this step is executed.

Motion detection is performed by detecting feature points, matching their feature descriptors between images, and treating each resulting correspondence as a motion vector. The motion detection algorithm is not limited to this; for example, the luminance image may be divided into blocks (divided regions) of 32 × 32 pixels, and a block-matching motion search may be performed per block (per divided region). In this embodiment, one motion vector is a directed line segment defined by the coordinates of a start point and an end point, and is written as

[equation omitted in source]

where A and B denote the start point and the end point of the motion vector. The pure vector component of the motion vector is written as

[equation omitted in source]

Let X be the set of motion vectors and i an index that identifies an individual motion vector in X. Each motion vector is then written as

[equation omitted in source]

and X is written as

[equation omitted in source]

Hereinafter, unless otherwise stated, symbols that share a subscript denote the same motion vector and its elements. In this embodiment, each numerical value is treated as floating point, but fixed-point computation may be used instead. When a pixel of an image is referenced, unless otherwise stated, the coordinate values with the fractional part discarded are used to reference the pixel. In this embodiment, a set is implemented as an array, and indexed notation is used to refer to a motion vector that is an element of the set or to its vector component.

The number of elements in a set is written by enclosing the set in vertical bars. For example, the number of elements in the set X is |X|. A set need not be implemented as an array; it may, for example, be implemented as a list.
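The data model described above can be sketched as follows. This is a minimal illustration, not the publication's implementation: the class name, the field names, and the assumption that the vector component is the end point minus the start point are all introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionVector:
    """A directed line segment from start point A to end point B."""
    start: tuple[float, float]  # A (x, y), floating point as in the text
    end: tuple[float, float]    # B (x, y)

    @property
    def component(self) -> tuple[float, float]:
        # Assumed definition of the "pure vector component": B - A.
        return (self.end[0] - self.start[0], self.end[1] - self.start[1])

    def end_pixel(self) -> tuple[int, int]:
        # Pixel lookup discards the fractional part of the coordinates.
        return (int(self.end[0]), int(self.end[1]))


# The set X implemented as an array (a Python list); |X| is len(X).
X = [
    MotionVector(start=(10.0, 20.0), end=(13.5, 18.25)),
    MotionVector(start=(40.0, 40.0), end=(41.0, 40.0)),
]
```

Indexing such as `X[0].component` then plays the role of the subscripted notation used in the text.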

In S2020, the CPU 105 estimates a transformation matrix from the motion detection result. The estimation method is described in detail later with reference to FIG. 3. In this embodiment, the transformation matrix representing the change from frame c-1 to frame c is denoted H_c. In this embodiment, H_c is described as a rotation matrix (see Non-Patent Document 1), which is a 3 × 3 matrix. The form of the matrix is not restricted, however, and another matrix such as an affine transformation matrix or a homography matrix may be used.

In S2030, the CPU 105 determines whether at least as many transformation matrices as the stabilization frame period, which are needed to generate a stabilization matrix, have been estimated. Letting p be the stabilization frame period, the process proceeds to S2040 if c ≥ p is true (YES in S2030) and returns to S2010 if it is false (NO in S2030). The value of p is, for example, 16, but it is not restricted: p is set large to suppress long-period shake and small to suppress only short-period shake.

In S2040, the CPU 105 generates a stabilization matrix from the estimated transformation matrices. The purpose of stabilization is to suppress high-frequency shake, and the stabilization matrix is the transformation matrix smoothed over a plurality of frames. In this embodiment, it is calculated from the transformation matrices of past frames and the immediately preceding stabilization matrix. For example, letting S_c be the stabilization matrix of frame c, S_c is calculated as

[equation omitted in source]

The matrix root may be computed approximately, and can be computed by the method described in Non-Patent Document 3. Because a matrix may have more than one root, a constraint is imposed so that a unique matrix is determined. In this embodiment the matrix is a rotation matrix, so the root with the smallest amount of rotation is selected. If the matrix root cannot be computed, processing continues with the stabilization matrix S_c taken to be the identity matrix. The matrix smoothing method is not limited to this.
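For the special case of a 3 × 3 rotation matrix, the root with the smallest rotation can be obtained in closed form from the axis-angle representation. The sketch below uses that closed form rather than the general algorithms of Non-Patent Document 3, and it does not reproduce the formula for S_c, which is missing from this text. The function name and the `eps` threshold are assumptions.

```python
import math


def rotation_root(R, n, eps=1e-9):
    """Return the n-th root of the 3x3 rotation matrix R that has the smallest
    rotation angle, or None when the rotation axis cannot be recovered
    (rotation of about 180 degrees)."""
    cos_t = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    theta = math.acos(max(-1.0, min(1.0, cos_t)))  # rotation angle in [0, pi]
    if theta < eps:
        # No rotation: the root is the identity matrix.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    sin_t = math.sin(theta)
    if sin_t < eps:
        return None  # the root is not unique near 180 degrees
    # Rotation axis from the antisymmetric part of R.
    ax = (R[2][1] - R[1][2]) / (2.0 * sin_t)
    ay = (R[0][2] - R[2][0]) / (2.0 * sin_t)
    az = (R[1][0] - R[0][1]) / (2.0 * sin_t)
    # Rodrigues' formula for a rotation by theta / n about the same axis.
    phi = theta / n
    c, s = math.cos(phi), math.sin(phi)
    t = 1.0 - c
    return [
        [c + t * ax * ax, t * ax * ay - s * az, t * ax * az + s * ay],
        [t * ax * ay + s * az, c + t * ay * ay, t * ay * az - s * ax],
        [t * ax * az - s * ay, t * ay * az + s * ax, c + t * az * az],
    ]
```

A caller could substitute the identity matrix when `None` is returned, mirroring the fallback described above for the case where the root cannot be computed.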

In S2050, the CPU 105 geometrically transforms the image using the stabilization matrix. In this embodiment, the RGB image of frame c-p+1 is input, and each of the R, G, and B channels is processed.

Here, let (x_out, y_out) be a pixel position in the output image, which is the image after geometric transformation, let (x_in, y_in) be a pixel position in the input image, and let the transformation matrix from the output image to the input image be

[equation omitted in source]

Then the proj function that calculates (x_in, y_in) from (x_out, y_out) can be expressed as follows.

[equation omitted in source]

In this step, the pixels of the output image are scanned one at a time, and the proj function with M = S⁻¹ is used to calculate the position of the input-image pixel corresponding to each scanned output pixel. The pixel value of that corresponding pixel is used as the pixel value of the scanned pixel, and all pixel values of the output image are determined in this way. Because (x_in, y_in) has fractional values, a method that interpolates with bilinear, bicubic, or a similar technique to compute more accurate pixel values may be adopted. The CPU 105 displays the transformed image on the display 104, or encodes it and records it in the external storage 107.
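The output-to-input scan described above can be sketched as follows. Because the definition of the proj function is missing from this text, the sketch assumes the usual homogeneous mapping with a 3 × 3 matrix M (divide by the third coordinate). The function names, the single-channel nested-list image, and the `fill` value for positions that map outside the input are likewise assumptions.

```python
def proj(M, x_out, y_out):
    """Map an output pixel position to an input position (assumed form)."""
    w = M[2][0] * x_out + M[2][1] * y_out + M[2][2]
    x_in = (M[0][0] * x_out + M[0][1] * y_out + M[0][2]) / w
    y_in = (M[1][0] * x_out + M[1][1] * y_out + M[1][2]) / w
    return x_in, y_in


def warp(image, M, fill=0):
    """Inverse-map every output pixel through M and copy the input pixel.

    image is a list of rows of one channel; call once per R, G, B channel.
    """
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            x_in, y_in = proj(M, x, y)
            if 0 <= x_in < width and 0 <= y_in < height:
                # Discard the fractional part to pick the source pixel.
                out[y][x] = image[int(y_in)][int(x_in)]
    return out
```

Scanning the output image rather than the input guarantees that every output pixel receives exactly one value; replacing the truncation with bilinear or bicubic interpolation would follow the refinement mentioned in the text.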

In S2060, the CPU 105 determines whether all input images have been processed. If so (YES in S2060), the process ends; if not (NO in S2060), the process returns to S2010, and S2010 through S2060 are repeated. In this embodiment, the end condition is whether all input images have been processed, but it is not limited to this. For example, the process may end when it is determined that the user has performed a UI operation instructing it to end.
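The control flow of steps S2010 to S2060 can be summarized in a short skeleton. This is only a sketch of the loop structure: the four callables are hypothetical placeholders for the processing of each step, and passing the whole history of transformation matrices to `make_stabilizer` is an assumption, since the formula for S_c is not reproduced in this text.

```python
def run_stabilization(frames, p, detect_motion, estimate_transform,
                      make_stabilizer, warp_frame):
    """Loop skeleton for the electronic image stabilization flowchart."""
    transforms = []  # H_1, H_2, ... estimated so far
    outputs = []
    for c in range(1, len(frames)):                            # c starts at 1
        vectors = detect_motion(frames[c - 1], frames[c])      # S2010
        transforms.append(estimate_transform(vectors))         # S2020: H_c
        if c < p:                                              # S2030: NO
            continue
        s_c = make_stabilizer(transforms)                      # S2040: S_c
        outputs.append(warp_frame(frames[c - p + 1], s_c))     # S2050
    return outputs                                             # S2060: done
```

With 20 input frames and p = 16, the first stabilized output is produced at c = 16 from frame c-p+1 = 1, and one more frame is emitted per iteration after that.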

Next, the transformation matrix estimation process of S2020 is described in detail with reference to FIG. 3.

FIG. 3 is a flowchart showing details of the transformation matrix estimation process.

In S3010, the CPU 105 scans the divided regions and, for each target divided region, inputs the set of motion vectors in that region.

The method of scanning the divided regions is described in detail below with reference to FIG. 4. FIG. 4 is a diagram explaining an example of image division. FIG. 4(a) illustrates how the image is divided and how the divided regions are numbered. In this embodiment, the divided regions are scanned in raster order, following the numbers shown in FIG. 4(a). That is, the first time S3010 is executed, the divided region numbered 1 is the processing target, and on the second and third executions the divided regions numbered 2 and 3 become the processing target, and so on. In this step, the motion vectors whose end point (the point with the arrowhead) lies in the target divided region are input. In this embodiment, the number of the divided region is denoted d, and a region is referred to as divided region d. The maximum number of divisions is denoted d_max. In this embodiment, d_max = 20. d starts at 1 and is incremented each time S3010 is executed.

FIG. 4(b) illustrates the state of the motion vectors. As FIG. 4(b) shows, a motion vector may straddle divided regions, so in this embodiment a motion vector is treated as belonging to the divided region that contains its end point. Let in(d, B) be the function that determines whether the end point B of a motion vector v is contained in divided region d. The set Y_d of vectors contained in divided region d is then written as

Y_d = { v ∈ X | in(d, B) = true }

This notation means that the elements of the set X are scanned, the elements satisfying the condition written after "|" are extracted, and Y_d, the subset consisting of the extracted elements, is generated. Subset generation is described below using the same notation. Instead of generating a new array or list for the subset, each element may be given a flag indicating whether it belongs to the subset, with the flag set when the element is extracted. In that configuration, the elements of the superset are scanned in each process and the flag of each scanned element is checked, so that only the elements belonging to the subset are obtained.
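The construction of Y_d can be sketched as a filter over X. The sketch assumes a grid of 5 columns by 4 rows numbered in raster order from 1; this layout is inferred from d_max = 20 and from the later statement that region 9 has eight neighbors inside the image, and is not stated explicitly. The names `region_of` and `in_region`, and the representation of a motion vector as a pair (A, B), are assumptions.

```python
COLS, ROWS = 5, 4  # assumed grid layout: 5 columns x 4 rows = 20 regions


def region_of(B, width, height):
    """Return the 1-based raster-order region number containing point B,
    or None when B lies outside the image."""
    x, y = B
    if not (0 <= x < width and 0 <= y < height):
        return None
    col = min(int(x * COLS / width), COLS - 1)
    row = min(int(y * ROWS / height), ROWS - 1)
    return row * COLS + col + 1


def in_region(d, B, width, height):
    """in(d, B): True when end point B is contained in divided region d."""
    return region_of(B, width, height) == d


def subset_for_region(X, d, width, height):
    """Y_d: the motion vectors (A, B) in X whose end point B lies in region d."""
    return [(A, B) for (A, B) in X if in_region(d, B, width, height)]
```

The flag-based alternative mentioned above would keep X intact and store a per-element boolean instead of building a new list.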

Y_d may be created in advance and merely input in this step. This embodiment describes the image as divided into 20 regions as in FIG. 4, but the division method is not limited to this. This embodiment also describes inputting the motion vectors whose end point lies in the divided region, but the motion vectors whose start point lies in the divided region may be input instead. The scanning order of the divided regions is not limited to raster order either. Furthermore, letting neighbour(d, B) be the function that determines whether the end point B of a motion vector v is contained in divided region d itself or in one of its 8 neighboring divided regions, the set Y_d may be created with the expression

Y_d = { v ∈ X | neighbour(d, B) = true }

For example, when the divided region number is 9, the 8 neighboring divided regions together with the region itself are the nine divided regions enclosed by the bold line in FIG. 4(a). When a neighboring region would lie outside the image, the set Y_d is created from the regions inside the image.
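The neighbour(d, B) test can be sketched in the same style. As before, the 5-column by 4-row raster grid is an inferred assumption, and the helper names are introduced for illustration. Neighbors that would fall outside the image need no special handling here, because a point outside the image belongs to no region.

```python
COLS, ROWS = 5, 4  # assumed grid layout: 5 columns x 4 rows = 20 regions


def region_of(B, width, height):
    """1-based raster-order region number containing B, or None if outside."""
    x, y = B
    if not (0 <= x < width and 0 <= y < height):
        return None
    col = min(int(x * COLS / width), COLS - 1)
    row = min(int(y * ROWS / height), ROWS - 1)
    return row * COLS + col + 1


def neighbour(d, B, width, height):
    """True when B lies in region d or in one of its 8 neighboring regions."""
    e = region_of(B, width, height)
    if e is None:
        return False
    row_d, col_d = divmod(d - 1, COLS)
    row_e, col_e = divmod(e - 1, COLS)
    return abs(row_d - row_e) <= 1 and abs(col_d - col_e) <= 1
```

Using the neighborhood enlarges each Y_d, which gives the per-region estimate more vectors to work with at the cost of less locality.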

In S3020, the CPU 105 calculates a representative vector (first motion parameter) from the input motion vectors. In this embodiment, a subroutine that calculates the representative vector is executed; its operation is described later with reference to FIG. 5. Although this embodiment is described using the flowchart of FIG. 5, which uses robust estimation, other estimation methods such as M-estimation or the least median of squares method may be used. Hereinafter, the representative vector is denoted as follows.

In S3030, the CPU 105 extracts, from the set of input motion vectors, the motion vectors corresponding to the representative vector (similar motion vectors). In this embodiment, an allowable error for the robust estimation is set, and each input motion vector for which the absolute value of the difference between it and the representative vector is within the allowable error is extracted as a similar motion vector corresponding to the representative vector. The allowable error e₂ is calculated from the allowable error e_h of the RANSAC processing used in S3050 described later (set to 3 in this embodiment) by the following expression.

e₂ = k × e_h

In this embodiment k = 5, but k is not limited to this value. In this way, the allowable error e₂ in S3020 is set to a value larger than the allowable error e_h in S3050.

Alternatively, with image_height denoting the image height and div a divisor, the allowable error e_h may be calculated from image_height and div. In this embodiment the calculation uses k = 5 and div = 360, but the values are not limited to these.

At this time, the set V_d of motion vectors extracted from the divided region d of the c-th frame consists of the motion vectors in Y_d for which the absolute value of the difference from the representative vector is within e₂. In this embodiment, the vectors are extracted by applying the allowable error test to the absolute value of the vector difference, but other methods are possible. For example, the allowable error test may be applied to each component of the vector difference separately, with separate allowable errors e_2x and e_2y for the x and y components. These allowable errors take values such as e_2x = 5e_h and e_2y = 3e_h.
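The two extraction criteria can be illustrated with a short sketch. It assumes that a motion vector is represented by its (x, y) components and that e₂ is obtained as k times e_h; the function names and default arguments are hypothetical and are not taken from the embodiment.

```python
import math


def extract_similar(Y_d, rep, e_h=3.0, k=5.0):
    """Keep vectors whose difference from rep has magnitude within e2 = k * e_h."""
    e2 = k * e_h
    return [v for v in Y_d
            if math.hypot(v[0] - rep[0], v[1] - rep[1]) <= e2]


def extract_similar_per_component(Y_d, rep, e_h=3.0, kx=5.0, ky=3.0):
    """Keep vectors whose x and y differences are each within their own limit."""
    e_2x, e_2y = kx * e_h, ky * e_h
    return [v for v in Y_d
            if abs(v[0] - rep[0]) <= e_2x and abs(v[1] - rep[1]) <= e_2y]
```

The per-component test accepts a rectangular range around the representative vector instead of a circular one, so the two functions can return different sets for the same input.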

In S3040, the CPU 105 determines whether all the divided regions have been processed. If they have (YES in S3040), the process proceeds to S3050. If not (NO in S3040), the process returns to S3010, and the steps from S3010 to S3040 are repeated.

In S3050, the CPU 105 takes the motion vectors extracted from all the divided regions as input and estimates a rotation matrix (second motion parameter). In this embodiment, a subroutine that executes RANSAC with the allowable error e_h set to 3 is called. The operation of the subroutine is described later with reference to FIG. 6.

The operation of the subroutine that calculates the representative vector for each divided region in S3020 of FIG. 3 is described below with reference to FIG. 5. FIG. 5 is a flowchart showing details of the representative vector calculation process of Embodiment 1.

In S5000, the CPU 105 increments the iteration count. The iteration count is assumed to have been initialized to 0 beforehand.

In S5010, the CPU 105 randomly acquires a motion vector from the whole set of samples to be compared. In this subroutine, the whole set of samples means all the motion vectors contained in at least one target divided region. In this embodiment, one motion vector v_r is acquired.

In S5020, the CPU 105 calculates the difference between the acquired motion vector and each motion vector of the whole sample set, and counts the number of motion vectors whose difference is within the allowable error as the number of inliers. This number of inliers is hereinafter denoted c_inlier.

In S5030, the CPU 105 determines whether the number of inliers is the largest among the iterations so far. If it is (YES in S5030), the process proceeds to S5040; if not (NO in S5030), the process proceeds to S5050. As an exception, the first execution of S5030 always proceeds to S5040.

In S5040, the CPU 105 stores the acquired motion vector as the best parameter. In this embodiment, the best parameter is updated by assigning to it the motion vector v_r acquired in S5010.

In S5050, the CPU 105 determines whether the iteration count has reached the upper limit. If it has (YES in S5050), the process proceeds to S5070. If it has not (NO in S5050), the process proceeds to S5060.

In this embodiment, the upper limit is 50 iterations, but it is not limited to this number. For example, when the frame rate of the input images is 60 fps, the flowchart of FIG. 2 must complete within 16 ms. The optimal value is therefore determined according to the specifications and number of CPUs 105.

In S5060, the CPU 105 determines whether the iteration count is sufficient. If it is sufficient (YES in S5060), the process proceeds to S5070; if it is insufficient (NO in S5060), the process returns to S5000. The iteration count is judged sufficient when it exceeds the value N calculated by (Equation 11).

p_sample is the probability that at least one correct sample (motion vector) exists. In this embodiment, a correct sample is assumed to exist with a probability of 99%, so p_sample = 0.99. m is the number of motion vectors required to calculate the parameter: m = 1 when a two-dimensional vector is calculated, and m = 4 for a homography matrix or a rotation matrix. For the same r_inlier, (Equation 11) yields a smaller N as m becomes smaller. In other words, in the flowchart of FIG. 5, the smaller m is, the fewer iterations are needed. r_inlier in (Equation 11) is calculated by the following expression.

r_inlier = c_inlier / |V_d|

Here, c_inlier is the number of inliers calculated in S5020, and |V_d| is the number of elements of the set of motion vectors extracted in S3030.

In S5070, the CPU 105 returns the best parameter as the return value. In this embodiment, the best parameter held at the time S5070 is executed is returned, and this is the representative vector of the divided region.
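The flow of FIG. 5 can be sketched roughly as below. This is a sketch under stated assumptions, not the embodiment's implementation: (Equation 11) is not reproduced in this text, so the widely used RANSAC iteration-count formula N = log(1 - p_sample) / log(1 - r_inlier^m) is assumed in its place, the inlier ratio is taken over the vectors passed to the function, the input list is assumed to be non-empty, and all function and parameter names are hypothetical.

```python
import math
import random


def ransac_iterations(p_sample, r_inlier, m):
    """Assumed stand-in for (Equation 11): standard RANSAC iteration count."""
    if r_inlier <= 0.0:
        return math.inf
    if r_inlier >= 1.0:
        return 0.0
    return math.log(1.0 - p_sample) / math.log(1.0 - r_inlier ** m)


def representative_vector(vectors, tol, max_iter=50, p_sample=0.99, rng=None):
    """Pick the sampled vector that has the most vectors within tol of it."""
    rng = rng or random.Random()
    best, best_count = None, -1
    iteration = 0
    while True:
        iteration += 1                                   # S5000
        v_r = rng.choice(vectors)                        # S5010
        c_inlier = sum(                                  # S5020
            1 for v in vectors
            if math.hypot(v[0] - v_r[0], v[1] - v_r[1]) <= tol)
        if c_inlier > best_count:                        # S5030
            best, best_count = v_r, c_inlier             # S5040
        if iteration >= max_iter:                        # S5050
            break
        r_inlier = c_inlier / len(vectors)
        if iteration > ransac_iterations(p_sample, r_inlier, m=1):  # S5060
            break
    return best                                          # S5070
```

With m = 1 the assumed formula gives N of about 6.6 for an inlier ratio of 0.5, whereas m = 4 gives about 71, which is consistent with the statement that a smaller m needs fewer iterations.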

The operation of the subroutine that estimates the rotation matrix in S3050 of FIG. 3 is described below with reference to FIG. 6.

FIG. 6 is a flowchart showing details of the rotation matrix estimation process using RANSAC. The basic flow is the same as in FIG. 5, except that in FIG. 6 S6011 is added and S5010 and S5020 are replaced by S6010 and S6020.

In S6000, the CPU 105 increments the iteration count.

In S6010, the CPU 105 acquires four motion vectors from the whole set of input samples. In this embodiment, the whole set of input samples is the collection, over all divided regions, of the motion vectors extracted in the flowchart of FIG. 3. That is, the whole input sample set Z is the union of the sets V_d over all divided regions d.

In S6011, the CPU 105 calculates a matrix from the four motion vectors. The acquired motion vectors are denoted v_j (where j runs from 1 to 4). The matrix to be calculated is the matrix referred to in S2020. In this step, equations are solved to calculate each element of the rotation matrix that satisfies the following.

There are various methods of calculating the rotation matrix; for example, the method described in Non-Patent Document 1 can be used, so a detailed description is omitted here. Depending on the calculation method, the matrix may not be computable for some choices of samples. Failure of the matrix calculation is detected, and when failure is detected the process transitions to S5000 and is performed again.

In S6020, for every motion vector in the whole sample set, the CPU 105 calculates the distance between the end point and the point obtained by projecting the start point with the calculated matrix, and counts the number of motion vectors whose distance is within the allowable error as the number of inliers. This allowable error is e_h described above. The number of inliers in the rotation matrix estimation is denoted c_Hinlier.
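The inlier count of S6020 can be illustrated as follows. The sketch assumes that the matrix is a 3×3 matrix applied to the start point in homogeneous coordinates, that the homogeneous divisor is never zero, and that each motion vector is a pair (A, B) of start and end points; the function names are hypothetical.

```python
import math


def project(H, p):
    """Project point p = (x, y) with the 3x3 matrix H (homogeneous coordinates)."""
    x, y = p
    w = H[2][0] * x + H[2][1] * y + H[2][2]  # assumed to be non-zero
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def count_inliers(H, Z, e_h=3.0):
    """Count vectors (A, B) whose projected start point lies within e_h of B."""
    count = 0
    for A, B in Z:
        px, py = project(H, A)
        if math.hypot(px - B[0], py - B[1]) <= e_h:
            count += 1
    return count
```

For a matrix that only translates by (2, 1), a vector from (0, 0) to (2, 1) is an exact inlier, while a vector from (5, 5) to (20, 20) is rejected.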

Thereafter, as in FIG. 5, the rotation matrix is estimated by repeating the iterations.

In general, when motion vectors with a high proportion of outliers (hereinafter, the outlier ratio) are input and a rotation matrix is estimated using RANSAC, the number of iterations becomes large. In addition, M-estimation, a robust estimation technique regarded as comparatively lightweight and fast, cannot deliver sufficient estimation performance when the outlier ratio is high.

In this embodiment, therefore, a representative vector is calculated for each divided region before the matrix estimation, and the similar motion vectors corresponding to the representative vector are extracted. This removes outliers from the motion vectors that are input to the estimation of the rotation matrix expressing the motion of the entire image, which reduces the number of RANSAC iterations or improves the estimation performance of M-estimation. Calculating the representative vector requires very little computation because it involves no complex matrix operations. Consequently, when the method is applied to RANSAC, for example, the total processing time can be shortened considerably even after accounting for the overhead of extracting the similar motion vectors as preprocessing.

Also, for 60 fps images, for example, the processing of one frame must complete within 16 ms, so an upper limit must be placed on the number of iterations. Even then, with this embodiment the number of iterations rarely reaches the upper limit and the matrix can be estimated stably. Consequently, when images are stabilized using the matrix, the probability of matrix estimation failure decreases, enabling more stable and natural stabilization. This embodiment has described, taking image stabilization as an example, a method of calculating the matrix expressing the motion of the entire image after determining the representative vectors and extracting the similar motion vectors. However, applications that use the calculated motion information of the entire image are not limited to this; it can also be applied to image compositing, encoding, free-viewpoint synthesis, and the like.

This embodiment has described a configuration that calculates two-dimensional representative vectors and then calculates a rotation matrix. However, the invention is also applicable to a configuration that calculates a two-dimensional vector representing each region and then an affine transformation matrix expressing the motion of the entire frame (processing target image), or to a configuration that calculates an affine transformation matrix representing each region and then a rotation matrix expressing the motion of the entire image. The affine transformation matrix T is expressed as follows.

The affine transformation matrix can be calculated from a plurality of motion vectors in the same way as the rotation matrix. Compared with the rotation matrix, however, it has fewer elements (fewer degrees of freedom), so the matrix computation is lighter and fewer RANSAC iterations are required. It is therefore well suited to estimating motion information that represents a region.

In this embodiment, RANSAC is used to calculate the representative vector, but the calculation is not limited to this; M-estimation, the least median of squares method, the least squares method, averaging, or an exhaustive-search robust estimation may be used. The exhaustive search follows the ordinary RANSAC flowchart, except that the step of selecting a sample at random becomes a step of taking the samples in order, one per iteration, so that at most as many iterations as there are samples are executed. Even with an exhaustive search, calculating the representative vector involves no complex matrix operations, so the processing load is small compared with calculating a 3×3 matrix by robust estimation. In addition, if the process is terminated early based on the ratio of inliers falling within the allowable error, iterating over every sample is unnecessary except in the worst case. Even such an exhaustive approach retains the robustness needed to estimate parameters from input data containing outliers.
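The exhaustive-search variant with early termination might look like the following sketch. The stopping threshold stop_ratio and the function name are hypothetical; the text only states that processing may be cut off based on the inlier ratio, without giving a value.

```python
import math


def representative_vector_exhaustive(vectors, tol, stop_ratio=0.5):
    """Try each vector in order as the candidate; stop early on a good one."""
    best, best_count = None, -1
    for v_s in vectors:  # samples are taken in order instead of at random
        count = sum(1 for v in vectors
                    if math.hypot(v[0] - v_s[0], v[1] - v_s[1]) <= tol)
        if count > best_count:
            best, best_count = v_s, count
        if count / len(vectors) >= stop_ratio:  # early termination
            break
    return best
```

In the worst case the loop visits every sample once, so the cost is bounded by the square of the number of motion vectors in the region.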

As described above, according to Embodiment 1, when the motion parameter of the entire image is calculated by robust estimation, inliers are extracted beforehand, which lowers the outlier ratio and shortens the processing time. Furthermore, when an upper limit on the number of iterations is set so that processing such as video processing completes within a fixed time, the proportion of cases in which the iterations reach the upper limit decreases, and the estimation performance is stabilized.

<Embodiment 2>
In Embodiment 2, a configuration that estimates a rotation matrix from consecutive images and performs electronic image stabilization is described. In this embodiment, instead of executing the similar motion vector extraction step of S3030 in FIG. 3 of Embodiment 1, the similar motion vector extraction process shown in FIG. 7 is executed as a subroutine. The other processing conforms to Embodiment 1.

The similar motion vector extraction operation is described below with reference to FIG. 7. FIG. 7 is a flowchart showing details of the similar motion vector extraction process.

In S7000, the CPU 105 extracts the motion vectors in the region whose difference from the representative vector of the region (the best parameter) is within the allowable error (similar motion vectors). The initial value of the allowable error is 1.

In S7010, the number of extracted motion vectors is counted.

In S7020, the CPU 105 determines whether the number of extracted motion vectors is at least a predetermined proportion of the motion vectors in the region. The predetermined proportion is, for example, 25%. In this embodiment, the ratio of the number of motion vectors whose difference from the representative vector is within the allowable error to the total number of motion vectors in the region is called the inlier ratio. It is calculated by dividing the count obtained in S7010 by the total number of motion vectors in the region. If the determination is true (YES in S7020), the process proceeds to S7040; if false (NO in S7020), the process proceeds to S7030.

In S7030, the CPU 105 enlarges the allowable error by a predetermined factor (for example, 1.5). The process then returns to S7000.

In S7040, the CPU 105 returns the extracted motion vectors to the caller of the subroutine. The returned set is the set of motion vectors extracted at the time the iterations end.

In this embodiment, the motion vectors input to the RANSAC that calculates the rotation matrix are the sets of motion vectors returned in S7040.
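The loop of FIG. 7 can be sketched as below, assuming motion vectors given as (x, y) components. The function name and the max_rounds guard against an endless loop are hypothetical additions; the initial allowable error of 1, the 25% proportion and the factor 1.5 are the example values given in the text.

```python
import math


def extract_adaptive_widening(vectors, rep, tol=1.0, target_ratio=0.25,
                              grow=1.5, max_rounds=64):
    """Widen the allowable error until enough similar vectors are extracted."""
    extracted = []
    for _ in range(max_rounds):
        extracted = [v for v in vectors                      # S7000
                     if math.hypot(v[0] - rep[0], v[1] - rep[1]) <= tol]
        count = len(extracted)                               # S7010
        if vectors and count / len(vectors) >= target_ratio:  # S7020
            break
        tol *= grow                                          # S7030
    return extracted, tol                                    # S7040
```

The function also returns the final allowable error so that a caller can inspect how far it had to be widened.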

As described above, according to Embodiment 2, a small value is set as the initial value of the allowable error, and the similar motion vectors are extracted by repeating iterations while increasing the allowable error. In other words, the allowable error for extracting the similar motion vectors is set adaptively. This avoids the problem that the allowable error is too small for a sufficient number of motion vectors to be extracted. Although this embodiment repeats the iterations while increasing the allowable error, a configuration that conversely sets a large initial value and decreases it may be adopted. When this embodiment is applied to image stabilization, the initial value in that case is the maximum correction amount of the stabilization, and a value such as 10% of the image height is used.

Furthermore, a configuration combining the two may be adopted. For example, the allowable error may be increased and decreased while searching for the optimal allowable error at which the inlier ratio falls within the range of 20% to 30%. In this embodiment the target inlier ratio is 25%, but it is not limited to this. For example, the target inlier ratio is set to an appropriate value according to the target application, the processing speed, the characteristics of the input images, and the robust estimation algorithm used to estimate the matrix.

<Embodiment 3>
In Embodiment 3, a configuration that estimates a rotation matrix from consecutive images and performs electronic image stabilization is described. This embodiment is a modification of the similar motion vector extraction process of Embodiment 2.

In this embodiment, the motion detection in S2020 uses feature point detection. Whether stabilization is performed from motion detected by feature point detection or by block matching, the basic processing does not differ apart from the motion detection itself. With feature point detection, however, the number of motion vectors generated from the feature points tends not to be constant, and the distribution of the motion vectors may be non-uniform.

The similar motion vector extraction operation is described below with reference to FIG. 8. FIG. 8 is a flowchart showing details of the similar motion vector extraction process. Because FIG. 8 is a modification of the flowchart of FIG. 7 of Embodiment 2, steps identical to those in FIG. 7 carry the same reference numbers and their details are omitted.

In S8020, the CPU 105 determines whether the number of extracted motion vectors is less than or equal to the extraction target number. The target number is, for example, 15. If the determination is true (YES in S8020), the process proceeds to S7040; if false (NO in S8020), the process proceeds to S8030.

In S8030, the CPU 105 reduces the allowable error, for example by multiplying it by a predetermined factor (for example, 0.75). The process then returns to S7000. The initial value of the allowable error is assumed to be set to a large value beforehand, for example 10% of the image height.
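The narrowing loop of FIG. 8 can be sketched in the same style. It assumes that a failed S8020 test leads to S8030 before extraction is retried; the function name and the max_rounds guard are hypothetical, while the target of 15 and the factor 0.75 are the example values from the text.

```python
import math


def extract_adaptive_narrowing(vectors, rep, tol, target_count=15,
                               shrink=0.75, max_rounds=64):
    """Shrink the allowable error until at most target_count vectors remain."""
    extracted = list(vectors)
    for _ in range(max_rounds):
        extracted = [v for v in vectors                      # S7000
                     if math.hypot(v[0] - rep[0], v[1] - rep[1]) <= tol]
        if len(extracted) <= target_count:                   # S7010, S8020
            break
        tol *= shrink                                        # S8030
    return extracted, tol                                    # S7040
```

A caller would typically pass an initial tol of about 10% of the image height, as the text suggests.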

As described above, according to Embodiment 3, the similar motion vectors are extracted by repeating iterations while decreasing the allowable error. In other words, the allowable error for extracting the similar motion vectors is set adaptively. This avoids the problem that the allowable error is too large, so that too many unnecessary motion vectors with a high outlier ratio are extracted.

Although this embodiment repeats the iterations while decreasing the allowable error, a configuration that conversely sets a small initial value and increases the allowable error, or a configuration combining the two, may be adopted. For example, the allowable error may be increased and decreased while searching for the optimal allowable error at which the number of motion vectors falls within the range of 15 to 30.

In this embodiment the extraction target number of motion vectors is 15, but it is not limited to this. The extraction target number is set to an appropriate value according to the target application, the processing speed, the characteristics of the input images, and the robust estimation algorithm used to estimate the matrix. For RANSAC that estimates a 3×3 matrix, 300 samples are sufficient, so this embodiment uses 15, the value obtained by dividing 300 by 20, the number of divided regions.

<Embodiment 4>
In Embodiment 4, a configuration that calculates the representative vector without using RANSAC is described.

The operation of the subroutine that calculates the representative vector is described below with reference to FIG. 9. FIG. 9 is a flowchart showing details of the representative vector calculation process. FIG. 9 is executed in place of the flowchart of FIG. 5 of Embodiment 1.

In S9010, the CPU 105 calculates the average motion vector from all the motion vectors in the region, that is, the sum of the motion vectors in the region divided by their number.

In S9020, the CPU 105 extracts the motion vectors whose distance from the average motion vector is within the allowable error e_avg. The set of extracted motion vectors is denoted V_avg.

In S9030, the CPU 105 again calculates an average motion vector, this time of the motion vectors extracted in S9020 (second average motion vector).

In S9040, the CPU 105 returns the second average motion vector as the return value.

By using this return value as the representative vector and executing the processing described in Embodiment 1, image stabilization becomes possible.
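The two-pass averaging of FIG. 9 can be sketched as follows, with motion vectors given as (x, y) components. The function names are hypothetical, the input is assumed to be non-empty, and returning the first average when no vector survives the S9020 filter is an assumption added only to keep the sketch well defined.

```python
import math


def mean_vector(vectors):
    """Component-wise average of a non-empty list of (x, y) vectors."""
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)


def representative_vector_mean(vectors, e_avg):
    """Average, discard vectors far from the average, then average again."""
    first = mean_vector(vectors)                              # S9010
    V_avg = [v for v in vectors                               # S9020
             if math.hypot(v[0] - first[0], v[1] - first[1]) <= e_avg]
    if not V_avg:
        return first  # assumed fallback, not described in the text
    return mean_vector(V_avg)                                 # S9030, S9040
```

Because the first average is itself pulled toward any outliers, the result is only reliable when the outlier ratio in the region is low.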

As described above, according to Embodiment 4, the representative vector is calculated without using RANSAC. Although use is therefore limited to cases where the outlier ratio of the motion vectors in the divided region is low, the representative vector can be calculated at high speed.

<Embodiment 5>
In Embodiment 5, a configuration that calculates a representative vector for each divided region using intelligent region division is described.

FIG. 10 is a flowchart showing details of the transformation matrix estimation process using object-based region division. FIG. 10 adds S10000 to the flowchart of FIG. 3 of Embodiment 1.

In S10000, the CPU 105 divides the input image into regions in units of objects. Various region division methods exist; in this embodiment, the k-means method is used and the number of divisions is 8. Each divided region is assigned a number, in an arbitrary order. The division algorithm and the number of divisions are not limited to these, and other methods and numbers of divisions may be used. An image divided in this way looks, for example, like FIG. 11, which shows an image divided in units of objects together with example numbers of the divided regions.

From S3010 onward, the same processing as in Embodiment 1 is executed, except that divided regions of arbitrary shape are processed instead of divided regions laid out in a grid.

As described above, according to Embodiment 5, the image is divided in units of objects and a representative vector is calculated for each object region. Because the motion vectors contained in the same object are likely to have the same vector components, the inlier ratio when extracting vectors similar to the representative vector can be raised even with the same allowable error. As a result, in the rotation matrix estimation, the motion vectors belonging to the largest-area group of objects sharing the same motion are more likely to become the main component of the motion of the entire image. When stabilization is performed using the estimated rotation matrix, a large area is therefore stabilized reliably, which improves the stability of the image stabilization.

The present invention can also be realized by the following processing: software (a program) that implements the functions of the above-described embodiments is supplied to a system or apparatus via a network or various storage media, and a computer (or a CPU, MPU, or the like) of the system or apparatus reads out and executes the program.

Claims

An information processing apparatus for processing temporally continuous images, comprising:
first calculation means for calculating, using a divided region obtained by dividing a processing target image as a processing unit, a first motion parameter indicating motion from a reference image;
extraction means for extracting, from the motion vectors in the divided region, a motion vector corresponding to the first motion parameter; and
second calculation means for calculating, using the motion vectors extracted by the extraction means from the motion vectors in the divided region, a second motion parameter of the processing target image that expresses a positional deviation from the reference image.

The information processing apparatus according to claim 1, wherein the first motion parameter and the second motion parameter are calculated by robust estimation, which estimates parameters while excluding the influence of outliers.

The information processing apparatus according to claim 1, wherein the first motion parameter and the second motion parameter are calculated by robust estimation that uses the number of data for which the difference between each of the motion vectors included in at least one divided region serving as a sample in the processing target image and a motion vector acquired from that divided region is within an allowable error.

The information processing apparatus according to claim 3, wherein the allowable error used in the robust estimation in the first calculation means is set to a value larger than the allowable error used in the robust estimation in the second calculation means.

The information processing apparatus according to claim 3 or 4, wherein the extraction means sets the allowable error used in the robust estimation in the first calculation means so that the number of motion vectors extracted from the divided region is equal to or greater than a predetermined ratio of the motion vectors in the divided region.

The information processing apparatus according to claim 4, wherein the extraction means sets the allowable error used in the robust estimation in the first calculation means so that the number of motion vectors extracted from the divided region is equal to or less than a target extraction number.

The information processing apparatus according to any one of claims 1 to 6, wherein the first motion parameter is a matrix, and
the extraction means extracts, as the corresponding motion vector, from among the motion vectors in the divided region, a motion vector for which the distance between the point obtained by projecting the start point of the motion vector with the matrix indicated by the first motion parameter and the end point of the motion vector is within an allowable error.

The information processing apparatus according to any one of claims 1 to 6, wherein the first motion parameter is a vector, and
the extraction means extracts, as the corresponding motion vector, from among the motion vectors in the divided region, a motion vector whose distance from the vector indicated by the first motion parameter is within an allowable error.

The information processing apparatus according to claim 1, wherein the first motion parameter is the average motion vector of all motion vectors in the divided region, and
the extraction means extracts, as the corresponding motion vector, from among the motion vectors in the divided region, a motion vector whose distance from the average motion vector indicated by the first motion parameter is within an allowable error.

The information processing apparatus according to claim 1, wherein a degree of freedom of the first motion parameter is smaller than a degree of freedom of the second motion parameter.

The information processing apparatus according to any one of claims 1 to 10, wherein the second motion parameter is a motion parameter for the entire processing target image.

The information processing apparatus according to claim 1, wherein the divided area is an area obtained by dividing the processing target image in units of objects.

A method of controlling an information processing apparatus that processes temporally continuous images, the method comprising:
a first calculation step of calculating, using a divided region obtained by dividing a processing target image as a processing unit, a first motion parameter indicating motion from a reference image;
an extraction step of extracting, from the motion vectors in the divided region, a motion vector corresponding to the first motion parameter; and
a second calculation step of calculating, using the motion vectors extracted in the extraction step from the motion vectors in the divided region, a second motion parameter of the processing target image that expresses a positional deviation from the reference image.

A program for causing a computer to control an information processing apparatus that processes temporally continuous images, the program causing the computer to function as:
first calculation means for calculating, using a divided region obtained by dividing a processing target image as a processing unit, a first motion parameter indicating motion from a reference image;
extraction means for extracting, from the motion vectors in the divided region, a motion vector corresponding to the first motion parameter; and
second calculation means for calculating, using the motion vectors extracted by the extraction means from the motion vectors in the divided region, a second motion parameter of the processing target image that expresses a positional deviation from the reference image.