JP2020120362A - Image processing device, image processing method, and program

Info

Publication number: JP2020120362A
Application number: JP2019012484A
Authority: JP
Inventors: Masaru Narita (成田 優); Mitsuhiro Saito (齊藤 光洋)
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-01-28
Filing date: 2019-01-28
Publication date: 2020-08-06
Anticipated expiration: 2039-01-28
Also published as: JP7191711B2

Abstract

To provide an image processing device capable of stabilizing a moving image in consideration of the position variation of a moving body when generating a stabilized moving image from a captured moving image. SOLUTION: The image processing device includes: a trajectory estimation part that estimates an imaging trajectory from a captured moving image; a trajectory correction part that corrects the estimated imaging trajectory; and an image generation part that generates, from the captured moving image, a stabilized moving image according to the corrected imaging trajectory. The trajectory correction part detects a moving body area based on motion vectors detected from the captured moving image, and corrects the imaging trajectory so as to reduce the variation, calculated from the detected moving body area, of the position of the moving body in each frame of the generated moving image. A stabilized moving image in which the position variation of the moving body is reduced is thereby generated. SELECTED DRAWING: Figure 1

Description

The present invention relates to an image processing device, an image processing method, and a program.

As a technique for electronically stabilizing fluctuations in the shooting angle of view caused by camera shake and the like, a technique has been proposed that estimates an imaging trajectory from a captured moving image, corrects the estimated imaging trajectory so as to reduce its fluctuation, and generates, from the captured moving image, a moving image corresponding to the corrected imaging trajectory. For example, Patent Document 1 proposes a technique for reducing trajectory fluctuation by thinning out or interpolating frames so that the amount of subject movement between frames of the captured moving image is constant. Patent Document 2 proposes a technique for reducing trajectory fluctuation by controlling the shooting interval so that the amount of movement of a specific subject between frames becomes a desired amount. More recently, stabilized moving images are also generated so as to play back N times faster than the captured moving image, so that a long moving image shot with an action camera can be enjoyed as a short digest moving image.

JP 2012-60258 A
JP 2011-66873 A

However, the technique described in Patent Document 1 takes into account the motion of the entire screen caused by motion of the imaging device in order to stabilize the moving image, but does not take into account the position variation of a moving body. As a result, the motion of the moving body is not stabilized, and the moving body appears at an irregular position in each frame. In particular, when a short digest moving image is generated from a long moving image shot with an ordinary digital camera, frames are thinned out, so unless the motion of the moving body is stabilized it becomes irregular and discontinuous. Therefore, even if the motion of the entire screen is stabilized, the moving body appears to flicker within the screen, resulting in an unsightly image.

The technique described in Patent Document 2 does take the position variation of the moving body into account, but it only controls the shooting interval and does not control the position of the moving body on the image itself. Therefore, the motion of the moving body cannot be stabilized except when the moving body passes through the desired position.

The present invention has been made in view of such circumstances, and an object thereof is to provide an image processing apparatus capable of stabilizing a moving image in consideration of the position variation of a moving body when generating a stabilized N-times-speed moving image from a captured moving image.

An image processing apparatus according to the present invention includes: trajectory estimation means for estimating an imaging trajectory from a first moving image; trajectory correction means for correcting the estimated imaging trajectory; and image generation means for generating, from the first moving image, a second moving image corresponding to the corrected imaging trajectory. The trajectory correction means detects a moving body region based on motion vectors detected from the first moving image, and corrects the imaging trajectory so as to reduce the variation, calculated from the detected moving body region, of the moving body position in each frame of the second moving image.

According to the present invention, when a stabilized N-times-speed moving image is generated from a captured moving image, it is possible to generate a stabilized moving image in which the unsightliness caused by position variation of a moving body is reduced.

A diagram showing a configuration example of the image processing apparatus in the first embodiment.
A diagram showing a configuration example of the trajectory estimation unit in the first embodiment.
A diagram showing a configuration example of the trajectory correction unit in the first embodiment.
A flowchart showing an operation example of the image processing apparatus in the first embodiment.
A flowchart showing an example of the trajectory estimation process in the first embodiment.
A diagram explaining the matching process in the first embodiment.
A diagram explaining the method of estimating the camera position and orientation in the first embodiment.
A diagram explaining an example of variation of the imaging trajectory in the first embodiment.
A flowchart showing an example of the trajectory correction process in the first embodiment.
A diagram explaining the method of detecting a moving body region in the first embodiment.
A diagram explaining the optimization of the imaging trajectory in the first embodiment.
A diagram explaining the optimization of the imaging trajectory in the first embodiment.
A diagram showing a configuration example of the trajectory correction unit in the second embodiment.
A flowchart showing an example of the trajectory correction process in the second embodiment.
A diagram explaining the weight control of the imaging trajectory correction in the second embodiment.
A diagram showing a hardware configuration example of the image processing apparatus in the present embodiment.

Embodiments of the present invention will be described below with reference to the drawings.

(First embodiment)
A first embodiment of the present invention will be described. FIG. 1 is a block diagram showing a configuration example of an image processing apparatus 100 according to the first embodiment. The image processing apparatus 100 includes an image input unit 101, an image memory 102, a trajectory estimation unit 103, a trajectory correction unit 104, and an image generation unit 105. The image input unit 101 receives an image consisting of a plurality of frames (a captured moving image). The image memory 102 temporarily stores one or more frames of the image input through the image input unit 101.

The trajectory estimation unit 103 estimates the trajectory of the camera at the time the captured moving image was shot, using images of different frames: the image input through the image input unit 101 and the image stored in the image memory 102. FIG. 2 is a block diagram showing a configuration example of the trajectory estimation unit 103. As shown in FIG. 2, the trajectory estimation unit 103 includes an image matching unit 201, a variation amount calculation unit 202, and a variation amount accumulation unit 203.

The image matching unit 201 performs matching between the current frame (frame number n) and the next frame (frame number n+1). For example, the image of the current frame (frame number n) is input from the image memory 102, and the image of the next frame (frame number n+1) is input from the image input unit 101. The variation amount calculation unit 202 calculates the variation amount between images using the result of the matching performed by the image matching unit 201. The variation amount accumulation unit 203 accumulates the variation amounts between images calculated by the variation amount calculation unit 202 to calculate the imaging trajectory.

The trajectory correction unit 104 performs correction processing on the imaging trajectory at the time of imaging, as estimated by the trajectory estimation unit 103, so as to reduce the variation amount between images, and calculates a stabilized imaging trajectory. FIG. 3 is a block diagram showing a configuration example of the trajectory correction unit 104. As shown in FIG. 3, the trajectory correction unit 104 includes a motion vector detection unit 301, a moving body region detection unit 302, a moving body position calculation unit 303, and a trajectory optimization unit 304.

The motion vector detection unit 301 detects motion vectors on a pixel-by-pixel basis for each frame. The moving body region detection unit 302 detects a moving body region for each frame using the motion vectors detected by the motion vector detection unit 301. The moving body position calculation unit 303 calculates coordinates representing the moving body region as the moving body position, based on the moving body region detected by the moving body region detection unit 302. The trajectory optimization unit 304 corrects the imaging trajectory estimated by the trajectory estimation unit 103, using the per-frame moving body position calculated by the moving body position calculation unit 303.

The image generation unit 105 reads, from the captured moving image, the image of the angle-of-view region corresponding to the imaging trajectory corrected by the trajectory correction unit 104, and generates a stabilized moving image at N times the speed of the captured moving image.

The operation of the image processing apparatus 100 according to the first embodiment will be described. FIG. 4 is a flowchart showing an operation example of the image processing apparatus 100 according to the first embodiment.

In step S401, a captured image is input to the image processing apparatus 100 through the image input unit 101. The input image is stored in the image memory 102. Next, in step S402, the trajectory estimation unit 103 performs trajectory estimation processing, which estimates the imaging trajectory at the time the captured moving image was shot, using the image input in step S401 and the image stored in the image memory 102.

In the subsequent step S403, the trajectory correction unit 104 performs trajectory correction processing, which corrects the imaging trajectory estimated in step S402. The trajectory correction unit 104 generates a corrected imaging trajectory by processing the estimated imaging trajectory so that its frame-to-frame change becomes smooth. Next, in step S404, the image generation unit 105 generates a stabilized moving image by reading from the captured moving image, for each frame, the image of the angle-of-view region corresponding to the imaging trajectory corrected in step S403.

The trajectory estimation processing performed by the trajectory estimation unit 103 in step S402 of FIG. 4 will be described. FIG. 5 is a flowchart showing an example of the trajectory estimation processing in the first embodiment.

In step S501, the image matching unit 201 of the trajectory estimation unit 103 performs matching between the current frame (frame number n) in FIG. 6(a) and the next frame (frame number n+1) in FIG. 6(b), as illustrated in FIG. 6. An example of the matching result is shown in FIG. 6(c). Any matching method may be used as long as it can calculate corresponding points in the two frames. For example, as in SIFT (Scale Invariant Feature Transform), feature points may be calculated separately in the current frame and the next frame, and a correspondence search performed between the calculated feature points. Alternatively, as in a KLT (Kanade-Lucas-Tomasi feature tracker) tracker, feature points may be calculated in the current frame and then tracked into the subsequent frame.

In FIG. 6(a), the positions 601, 602, and 603 indicated by circles are the positions of feature points in the current frame, and the positions 604, 605, and 606 shown in FIG. 6(b) are the positions of feature points in the next frame. In the matching result, corresponding feature point positions are connected by the lines 607, 608, and 609 in FIG. 6(c). The matching process thus calculates the correspondence of each feature point, for example that the position 601 in FIG. 6(a) corresponds to the position 604 in FIG. 6(b). Although FIG. 6 shows only three corresponding points, the number of feature points is not limited to this. However, since the fundamental matrix F calculated later has eight parameters, it is preferable to have eight or more feature points so that it can be calculated accurately. In addition, using only feature points with high feature amounts or high matching accuracy improves the accuracy of the subsequent camera position and orientation calculation, so processing such as using only corresponding points whose correlation value at matching or whose feature amount exceeds a certain threshold may be performed.

In step S502, the variation amount calculation unit 202 of the trajectory estimation unit 103 uses the result of the matching performed in step S501 to calculate the variation amount of the camera position and orientation between images. A method of estimating the camera position and orientation using Structure from Motion will now be described with reference to FIG. 7. Corresponding points between two frames are the same three-dimensional point seen from two viewpoints. In FIG. 7, the camera centers of the two viewpoints are C1 and C2, and the common point is the three-dimensional coordinate point X. The corresponding points at which the three-dimensional coordinate point X, as seen from the camera centers C1 and C2, is projected onto each frame are denoted x1 and x2.

Here, the camera matrices P1 and P2 at the camera centers C1 and C2 can be expressed as (Equation 1) and (Equation 2), respectively.

Here, the coordinate axes are aligned with the camera center C1 as the origin, I is the identity matrix, and K is the internal calibration matrix. R is the rotation matrix representing the orientation of the camera at the position of the camera center C2, and T is the three-dimensional translation vector. In the present embodiment, the same camera is used at the camera centers C1 and C2, so the internal calibration matrix K is the same for both.
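The equation images for (Equation 1) and (Equation 2) are not reproduced in this text. Under the definitions given above (coordinate origin at C1, identity matrix I, calibration matrix K, rotation R, translation T), they would take the standard pinhole form below. This is a reconstruction from the surrounding definitions, not a copy of the published equations.

```latex
\begin{align}
P_1 &= K \, [\, I \mid 0 \,] \tag{1} \\
P_2 &= K \, [\, R \mid T \,] \tag{2}
\end{align}
```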

The relative camera pose R and T, the camera characteristics K, and the position of the three-dimensional point impose a geometric constraint on the image points x1 and x2. Using the camera matrices above, the condition under which the three-dimensional coordinate point X projects onto the image points x1 and x2 can therefore be derived. In this case, the following (Equation 3) must be satisfied.

(Equation 3) is called the epipolar constraint, and the matrix F in it is called the fundamental matrix. The matrix F expresses the rotation and translation between the two cameras C1 and C2, and (Equation 3) is written as an equation relating the point x1 to the point x2. (Equation 4) expresses the matrix F in terms of the relative camera pose R and T, the camera characteristics K, and St. St is the skew-symmetric matrix formed from the three-dimensional translation vector T.
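The epipolar constraint can be checked numerically. The sketch below assumes the standard forms x2ᵀ F x1 = 0 for (Equation 3) and F = K⁻ᵀ St R K⁻¹ for (Equation 4), which are consistent with the description but are not copied from the published equation images. The function names, the calibration values, and the sample pose are all illustrative assumptions.

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix S_t such that S_t @ v equals cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def fundamental_from_pose(K, R, T):
    """Assumed form of (Equation 4): F = K^-T S_t R K^-1."""
    K_inv = np.linalg.inv(K)
    return K_inv.T @ skew(T) @ R @ K_inv

def project(P, X):
    """Project a 3D point with camera matrix P; returns homogeneous (u, v, 1)."""
    x = P @ np.append(X, 1.0)
    return x / x[2]

# Illustrative camera and pose (assumed values).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
a = np.deg2rad(5.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
T = np.array([0.2, 0.0, 0.05])

P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # camera at C1
P2 = K @ np.hstack([R, T.reshape(3, 1)])           # camera at C2

X = np.array([0.3, -0.1, 4.0])                     # a 3D point in front of both cameras
x1, x2 = project(P1, X), project(P2, X)

F = fundamental_from_pose(K, R, T)
residual = x2 @ F @ x1                             # should vanish up to rounding error
```

The residual is zero (up to floating-point error) for any point X, which is what makes the constraint usable for recovering F from matched points alone.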

Therefore, the fundamental matrix F is obtained by solving (Equation 3) using the set of corresponding points between the two frames obtained in step S501. Methods for obtaining the fundamental matrix F include the eight-point method, which uses eight pairs of corresponding points, and a least squares method using SVD (singular value decomposition). After the fundamental matrix F is obtained, the camera matrices are recovered using (Equation 4), which yields the rotation matrix R and the translation vector T between the two frames.
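As an illustration of the SVD-based least squares approach, the following is a minimal sketch of a normalized eight-point estimate. It assumes the constraint x2ᵀ F x1 = 0 and adds Hartley-style coordinate normalization, which the source does not mention; the function names and the synthetic scene are assumptions made for the example.

```python
import numpy as np

def normalize(pts):
    """Shift points to zero mean and scale to mean distance sqrt(2)."""
    centroid = pts.mean(axis=0)
    scale = np.sqrt(2.0) / np.mean(np.linalg.norm(pts - centroid, axis=1))
    Tn = np.array([[scale, 0.0, -scale * centroid[0]],
                   [0.0, scale, -scale * centroid[1]],
                   [0.0, 0.0, 1.0]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return homog @ Tn.T, Tn

def estimate_fundamental(pts1, pts2):
    """Estimate F from N >= 8 correspondences so that x2^T F x1 is near 0."""
    p1, T1 = normalize(np.asarray(pts1, dtype=float))
    p2, T2 = normalize(np.asarray(pts2, dtype=float))
    # One row of the linear system A f = 0 per correspondence (f is F row-major).
    A = np.column_stack([
        p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
        p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
        p1[:, 0], p1[:, 1], np.ones(len(p1))])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)          # least squares solution: smallest singular vector
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                        # enforce rank 2
    F = U @ np.diag(S) @ Vt
    F = T2.T @ F @ T1                 # undo the normalization
    return F / np.linalg.norm(F)

# Synthetic, noise-free scene (assumed values) to exercise the estimator.
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(-1, 1, 12), rng.uniform(-1, 1, 12), rng.uniform(3, 8, 12)])
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(5.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)], [0.0, 1.0, 0.0], [-np.sin(a), 0.0, np.cos(a)]])
T = np.array([0.2, 0.0, 0.05])

def project(K, R, T, X):
    pix = (X @ R.T + T) @ K.T
    return pix[:, :2] / pix[:, 2:3]

pts1 = project(K, np.eye(3), np.zeros(3), X)
pts2 = project(K, R, T, X)
F_est = estimate_fundamental(pts1, pts2)

h1 = np.column_stack([pts1, np.ones(len(pts1))])
h2 = np.column_stack([pts2, np.ones(len(pts2))])
residuals = np.abs(np.sum(h2 * (h1 @ F_est.T), axis=1))  # |x2^T F x1| per pair
```

With real, noisy matches the residuals will not vanish, and a robust wrapper such as RANSAC is commonly placed around this step; that is outside what the source describes.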

FIG. 8(a) shows the variation amounts of the translation vector T, among the calculated variation amounts, arranged in time series (in order of frame number). In FIG. 8(a), the vertical axis represents the image variation amount, the horizontal axis represents the frame number, and the vector in each frame is indicated by an arrow 801. The image variation amount actually has components in the three directions x, y, and z, but for convenience of explanation FIG. 8(a) shows only one of these components.

Next, in step S503, the variation amount accumulation unit 203 of the trajectory estimation unit 103 accumulates the variation amounts calculated in step S502 in the time direction to calculate the imaging trajectory. The variation amount accumulation unit 203 accumulates the vectors indicated by the arrows 801 in FIG. 8(a) in the time direction, thereby calculating the polygonal line 802 in FIG. 8(b), which is the imaging trajectory.
This concludes the description of the trajectory estimation processing in step S402.
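The accumulation in step S503 amounts to a running sum of the per-frame variation amounts. A minimal sketch with NumPy follows, assuming each variation is a translation vector; the function name and sample values are illustrative, and rotation components would need to be composed rather than summed.

```python
import numpy as np

def accumulate_trajectory(variations):
    """Cumulatively sum per-frame translation variations into a trajectory.

    variations: array-like of shape (num_frames - 1, 3), one (x, y, z)
    variation per consecutive frame pair.
    Returns an array of shape (num_frames, 3) that starts at the origin.
    """
    variations = np.asarray(variations, dtype=float)
    origin = np.zeros((1, variations.shape[1]))
    return np.vstack([origin, np.cumsum(variations, axis=0)])

# Three frame-to-frame variations give a four-point trajectory.
trajectory = accumulate_trajectory([[1.0, 0.0, 0.0],
                                    [0.5, -0.2, 0.0],
                                    [-0.3, 0.1, 0.0]])
```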

Next, the trajectory correction processing performed by the trajectory correction unit 104 in step S403 of FIG. 4 will be described. FIG. 9 is a flowchart showing an example of the trajectory correction processing in the first embodiment.

In step S901, the motion vector detection unit 301 of the trajectory correction unit 104 detects motion vectors on a pixel-by-pixel basis for each frame. One method of detecting motion vectors per pixel is the gradient method. The gradient method assumes that the amount of movement of an object between two consecutive frames is minute and that the brightness of a point on the object does not change after the movement. Under this assumption, let I(x, y, t) be the value of the pixel (x, y) on the image at time t, and suppose that the pixel moves by (Δx, Δy) after a time Δt. Then the following (Equation 5) holds.

Taking the Taylor expansion of the right-hand side of (Equation 5) yields (Equation 6).

In (Equation 6), ε is a higher-order term in Δx, Δy, and Δt. Assuming ε is small enough to be ignored, both sides are divided by Δt. Taking the limit Δt → 0 then yields the following (Equation 7).
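The equation images for (Equation 5) to (Equation 7) are not reproduced in this text. Based on the description, they would correspond to the usual brightness-constancy derivation below, with u and v denoting the components of the motion vector. This is a reconstruction under that assumption; the exact arrangement of terms in the published equations may differ.

```latex
\begin{align}
I(x, y, t) &= I(x + \Delta x,\; y + \Delta y,\; t + \Delta t) \tag{5} \\
I(x + \Delta x,\; y + \Delta y,\; t + \Delta t) &= I(x, y, t)
  + \frac{\partial I}{\partial x}\Delta x
  + \frac{\partial I}{\partial y}\Delta y
  + \frac{\partial I}{\partial t}\Delta t + \varepsilon \tag{6} \\
\frac{\partial I}{\partial x}\,u + \frac{\partial I}{\partial y}\,v + \frac{\partial I}{\partial t} &= 0,
  \qquad u = \frac{dx}{dt},\quad v = \frac{dy}{dt} \tag{7}
\end{align}
```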

(Equation 7) alone cannot uniquely determine the two unknowns u and v. Therefore, for example, the motion vector is assumed to be equal within a local region around the pixel of interest, and u and v are obtained by solving simultaneously the multiple constraint equations of the form (Equation 7) obtained within that local region.

Methods other than the gradient method, such as the known template matching method, may also be used to detect motion vectors per pixel. The motion vectors may be obtained for every point, treating all pixels in the image as the pixel of interest, or only for pixels at regular intervals.

Next, in step S902, the moving body region detection unit 302 of the trajectory correction unit 104 detects a moving body region for each frame using the per-pixel motion vectors detected in step S901. The method of detecting the moving body region will be described with reference to FIG. 10.

First, consider calculating motion vectors between two consecutive frames, a frame 1001 and a frame 1002. As shown in FIG. 10(a), the frame 1001 contains a stationary subject 1003 and a moving subject 1004. As shown in FIG. 10(b), in the next frame 1002 the stationary subject 1003 has moved to 1005 because of the camera motion, and the moving subject 1004 has moved to 1006 because of both the camera motion and its own motion.

FIG. 10(c) shows the motion vectors between the frames 1001 and 1002. These motion vectors are a mixture of the motion 1007 of the entire screen caused by the change in camera position and orientation, and the motion 1008 in which the motion of the moving body is added to it, and the two cannot be distinguished from the motion vector information alone. The fact that the change (variation amount) in camera position and orientation is already known from step S502 of FIG. 5 is therefore used. For example, from the rotation R and translation T of the camera position and orientation, the movement amount 1009 of each pixel between frames caused solely by the change in camera position and orientation is obtained, as shown in FIG. 10(d). FIG. 10(e) shows the result of removing the per-pixel movement amounts shown in FIG. 10(d) from the motion vectors shown in FIG. 10(c). The stationary region is detected as zero vectors 1010, and the moving body region as non-zero vectors 1011. FIG. 10(f) shows the regions having non-zero vectors, and the set of these regions corresponds to the moving body region 1012.

In practice, the camera position and orientation and the per-pixel motion vectors are calculated with some error, so the result may not be exactly zero even in non-moving regions. However, if the camera position and orientation and the per-pixel motion vectors are calculated with reasonable accuracy, the non-moving regions show vector values close to zero. It is therefore sufficient to detect regions with larger vector values as the moving body region. Alternatively, a binarization method such as the discriminant analysis method may be used to binarize the vector values into zero vectors and non-zero vectors, and the non-zero vectors detected as the moving body region.
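The subtraction and binarization described above can be sketched as follows. The example assumes the camera-induced flow is already available as a per-pixel array and implements the discriminant analysis method as Otsu's between-class variance criterion on the residual magnitudes. The function names, bin count, and synthetic flow field are assumptions for illustration.

```python
import numpy as np

def otsu_threshold(values, bins=64):
    """Threshold that maximizes between-class variance (discriminant analysis)."""
    hist, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    weights = hist / hist.sum()
    w0 = np.cumsum(weights)               # probability of the low class up to each bin
    w1 = 1.0 - w0
    mu = np.cumsum(weights * centers)     # cumulative mean of the low class
    mu_total = mu[-1]
    valid = (w0 > 0) & (w1 > 1e-12)
    between = np.zeros_like(w0)
    between[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    # Use the upper edge of the best bin so the whole bin falls in the low class.
    return edges[np.argmax(between) + 1]

def detect_moving_region(flow, camera_flow):
    """Remove camera-induced motion and binarize the residual magnitude."""
    residual = flow - camera_flow
    magnitude = np.linalg.norm(residual, axis=-1)
    return magnitude > otsu_threshold(magnitude.ravel())

# Synthetic field: uniform camera motion, noise, and one moving patch.
rng = np.random.default_rng(0)
camera_flow = np.tile(np.array([2.0, -1.0]), (32, 32, 1))
flow = camera_flow + rng.normal(0.0, 0.05, size=(32, 32, 2))
flow[10:20, 5:15] += np.array([4.0, 3.0])    # extra motion of the moving body
mask = detect_moving_region(flow, camera_flow)
```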

Alternatively, immediately before the processing of step S901, the image may be geometrically transformed so as to cancel the change in camera position and orientation between frames, using the rotation R and translation T of the camera position and orientation calculated in step S502, and the motion vector detection of step S901 may then be executed. In this case, motion vector detection is performed with no change in camera position and orientation between frames, so only the moving body region has non-zero vector values.

In the subsequent step S903, the moving body position calculation unit 303 of the trajectory correction unit 104 calculates coordinates representing the moving body region as the moving body position, based on the moving body region detected in step S902. For example, as shown in FIG. 10(g), the moving body position calculation unit 303 obtains a rectangle 1013 circumscribing the moving body region 1012 and calculates its center of gravity 1014 as the moving body position. Since the moving body position only needs to be the coordinates of a point representing the moving body region, the moving body position calculation unit 303 may instead calculate, for example, the average of the coordinates of the pixels constituting the moving body region as the moving body position. The former method, which uses the center of gravity of the circumscribed rectangle, requires less computation. The latter method, which averages the pixel coordinates, can calculate the moving body position accurately even for a moving body with a complicated shape.
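The two ways of deriving a moving body position can be compared on a small binary mask. This is a sketch with illustrative function names and an assumed L-shaped region, chosen so that the two results differ.

```python
import numpy as np

def bbox_center(mask):
    """Center of the rectangle circumscribing the region, as (x, y)."""
    ys, xs = np.nonzero(mask)
    return (xs.min() + xs.max()) / 2.0, (ys.min() + ys.max()) / 2.0

def pixel_mean(mask):
    """Mean coordinate of all pixels in the region, as (x, y)."""
    ys, xs = np.nonzero(mask)
    return xs.mean(), ys.mean()

# L-shaped region: a vertical bar plus a horizontal bar along its bottom.
mask = np.zeros((10, 10), dtype=bool)
mask[2:8, 2:4] = True
mask[6:8, 2:9] = True

center_bbox = bbox_center(mask)
center_mean = pixel_mean(mask)
```

For this shape the pixel mean is pulled toward the corner of the L where most pixels lie, while the bounding-box center stays at the middle of the rectangle, which illustrates the accuracy trade-off described above.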

In step S904, the moving body position calculation unit 303 calculates the magnitude of the motion of the moving body. Even when a moving body is present, there is considered to be little need for stabilization if its motion is small. It is therefore preferable to adjust the degree to which the position variation of the moving body is reduced, according to the amount of motion of the moving body, when the imaging trajectory is corrected in the later step S905. As the amount of motion of the moving body, for example, the absolute value of the motion vector at the moving body position calculated in step S903 may be used. Alternatively, the average of the motion vectors of the pixels constituting the moving body region detected in step S902 may be obtained and its absolute value used.
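Both definitions of the motion amount reduce to a vector norm. The sketch below assumes the motion vectors are held in an array of shape (height, width, 2) and that the region is a boolean mask; the function names and sample values are illustrative.

```python
import numpy as np

def motion_amount_at(flow, position):
    """Magnitude of the motion vector at the moving body position (x, y)."""
    x, y = (int(round(c)) for c in position)
    return float(np.linalg.norm(flow[y, x]))

def mean_motion_amount(flow, mask):
    """Magnitude of the average motion vector over the moving body region."""
    return float(np.linalg.norm(flow[mask].mean(axis=0)))

# A 3x3 moving region with a uniform motion vector of (3, 4) pixels.
flow = np.zeros((10, 10, 2))
mask = np.zeros((10, 10), dtype=bool)
mask[2:5, 3:6] = True
flow[mask] = (3.0, 4.0)

amount_at_position = motion_amount_at(flow, (4.0, 3.0))
amount_mean = mean_motion_amount(flow, mask)
```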

ステップＳ９０５では、軌跡補正部１０４の軌跡適正化部３０４は、ステップＳ４０２において推定された撮像時の撮像軌跡を、ステップＳ９０３において算出した動体位置を使用して補正し撮像軌跡の適正化を行う。撮像軌跡適正化の式は、例えば次の（式８）から（式１３）で表すことができる。 In step S905, the trajectory optimization unit 304 of the trajectory correction unit 104 corrects the imaging trajectory at the time of imaging estimated in step S402 using the moving body position calculated in step S903 to optimize the imaging trajectory. The equation for optimizing the imaging trajectory can be expressed by the following (Equation 8) to (Equation 13), for example.

基準位置は、動体位置を固定したい点として画像上の任意の座標を設定すれば良い。例えば、画像の中心や最初に処理するフレームの動体位置を基準位置として設定する。 As the reference position, an arbitrary coordinate on the image may be set as a point at which the moving body position is desired to be fixed. For example, the center of the image or the moving body position of the frame to be processed first is set as the reference position.

ここで、図１１を参照して、撮像軌跡の補正の効果について説明する。図１１（ａ）〜図１１（ｄ）に示したグラフは、縦軸が三次元上のカメラ位置姿勢の変動量、横軸がフレーム番号を表している。図１１（ａ）〜図１１（ｄ）に示す折れ線１１０１は、入力された撮像軌跡（撮像時の撮像軌跡）を示している。 Here, with reference to FIG. 11, the effect of the correction of the imaging trajectory will be described. In the graphs shown in FIGS. 11A to 11D, the vertical axis represents the variation amount of the camera position and orientation in three dimensions, and the horizontal axis represents the frame number. A polygonal line 1101 shown in FIGS. 11A to 11D indicates an input imaging locus (imaging locus at the time of imaging).

（式８）は、補正後の撮像軌跡の長さを調整する式である。図１１（ｂ）に示す折れ線１１０２は、折れ線１１０１に対して補正後の撮像軌跡の長さが短くなるように調整している図である。 (Equation 8) is an equation for adjusting the length of the corrected imaging trajectory. A polygonal line 1102 shown in FIG. 11B is a diagram in which the length of the corrected imaging locus is shorter than that of the polygonal line 1101.

（式９）は、補正後のカメラ位置姿勢の１フレーム毎の変化が滑らかになるように調整する式である。図１１（ｃ）に示す曲線１１０３は、折れ線１１０２に対して補正後の撮像軌跡の１フレーム毎の変化が滑らかになるように調整している図である。 (Equation 9) is an equation for adjusting so that the change in the corrected camera position and orientation for each frame becomes smooth. A curve 1103 shown in FIG. 11C is a diagram in which the polygonal line 1102 is adjusted so that the change in the corrected image pickup trajectory for each frame becomes smooth.

（式１０）は、入力カメラ位置姿勢と補正後のカメラ位置姿勢との差分を調整する式である。図１１（ｄ）に示す曲線１１０４は、曲線１１０３が入力撮像軌跡との差分が少なくなるように調整している図である。 (Formula 10) is a formula for adjusting the difference between the input camera position and orientation and the corrected camera position and orientation. A curve 1104 illustrated in FIG. 11D is a diagram in which the curve 1103 is adjusted so that the difference from the input imaging trajectory is reduced.

（式１１）は、出力画像内の動体位置と基準位置の差分を調整する式であり、詳細は後述する。（式１２）は、（式１１）を変換した撮像軌跡を調整する式である。 (Formula 11) is a formula for adjusting the difference between the moving body position and the reference position in the output image, and details will be described later. (Equation 12) is an equation for adjusting the imaging trajectory obtained by converting (Equation 11).

（式１３）は、（式８）〜（式１０）と（式１２）の各項に対して、パラメータλ１〜λ４で重み付け加算したものであり、本実施形態では、これを撮像軌跡補正の式とする。そして、（式１３）に示される値Ｅを最小とするようにｐ（ｔ）とｆ（ｔ）について非線形の最小化問題を解くことで、補正された撮像軌跡を算出することができる。 (Equation 13) is the weighted sum of the terms of (Equation 8) to (Equation 10) and (Equation 12), weighted by the parameters λ1 to λ4; in the present embodiment, this is used as the equation for imaging trajectory correction. The corrected imaging trajectory can then be calculated by solving a nonlinear minimization problem in p(t) and f(t) so as to minimize the value E given by (Equation 13).
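The structure of such a weighted minimization can be illustrated with a simplified stand-in. The sketch below is not (Equation 8) to (Equation 13) themselves, whose exact forms are not reproduced in this text. It assumes a one-dimensional trajectory, replaces each term with a quadratic penalty (first differences for length, second differences for smoothness, squared distance to the input trajectory, and squared distance to a target `p_c` standing in for pc(t)), omits f(t), and uses plain gradient descent. All names are hypothetical.

```python
def energy(p, p_in, p_c, lam):
    """Weighted sum E of four quadratic penalty terms (illustrative only)."""
    l1, l2, l3, l4 = lam
    n = len(p)
    e_len = sum((p[t + 1] - p[t]) ** 2 for t in range(n - 1))
    e_smooth = sum((p[t + 1] - 2 * p[t] + p[t - 1]) ** 2 for t in range(1, n - 1))
    e_in = sum((p[t] - p_in[t]) ** 2 for t in range(n))
    e_obj = sum((p[t] - p_c[t]) ** 2 for t in range(n))
    return l1 * e_len + l2 * e_smooth + l3 * e_in + l4 * e_obj


def gradient(p, p_in, p_c, lam):
    """Gradient of `energy` with respect to each p[t]."""
    l1, l2, l3, l4 = lam
    n = len(p)
    g = [2 * l3 * (p[t] - p_in[t]) + 2 * l4 * (p[t] - p_c[t]) for t in range(n)]
    for t in range(n - 1):
        d = p[t + 1] - p[t]
        g[t + 1] += 2 * l1 * d
        g[t] -= 2 * l1 * d
    for t in range(1, n - 1):
        a = p[t + 1] - 2 * p[t] + p[t - 1]
        g[t + 1] += 2 * l2 * a
        g[t] -= 4 * l2 * a
        g[t - 1] += 2 * l2 * a
    return g


def optimize(p_in, p_c, lam, iters=2000):
    """Minimize E by gradient descent, starting from the input trajectory."""
    l1, l2, l3, l4 = lam
    # Step size chosen from an upper bound on the curvature of E.
    step = 1.0 / (2 * (4 * l1 + 16 * l2 + l3 + l4))
    p = list(p_in)
    for _ in range(iters):
        g = gradient(p, p_in, p_c, lam)
        p = [pt - step * gt for pt, gt in zip(p, g)]
    return p


p_in = [0.0, 2.0, -1.0, 3.0, 0.0, 2.0]   # shaky input trajectory
p_c = [1.0] * 6                          # poses that keep the body at the reference
p = optimize(p_in, p_c, (1.0, 1.0, 1.0, 1.0))
print(energy(p, p_in, p_c, (1.0, 1.0, 1.0, 1.0)) < energy(p_in, p_in, p_c, (1.0, 1.0, 1.0, 1.0)))
```

Because every term in this stand-in is quadratic, the problem is convex and could equally be solved as a linear system; gradient descent is used only to keep the sketch short and dependency-free.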

（式１１）及び（式１２）について、図１２を参照して説明する。（式１１）は、二次元画像上での動体位置ｏｂｊ（ｔ）と基準位置ｃとの差分Ｄを最小化する式である。（式１２）は、（式１１）を三次元のカメラ位置姿勢の表現に変換した式である。（式１２）は、二次元画像上での差分Ｄが０となるような三次元空間上のカメラ位置姿勢をｐｃ（ｔ）として、カメラ位置姿勢ｐ（ｔ）がｐｃ（ｔ）に近づくように補正を行う。 (Equation 11) and (Equation 12) will be described with reference to FIG. 12. (Equation 11) minimizes the difference D between the moving body position obj(t) on the two-dimensional image and the reference position c. (Equation 12) is (Equation 11) converted into a representation in terms of the three-dimensional camera position and orientation. In (Equation 12), the camera position and orientation in three-dimensional space for which the difference D on the two-dimensional image becomes 0 is denoted pc(t), and correction is performed so that the camera position and orientation p(t) approaches pc(t).

図１２（ａ）は、三次元空間におけるシーン及びカメラ位置を示している。図１２（ａ）において、１２０１は入力のカメラ位置姿勢ｐ_inである。三次元空間中の動体座標１２０２を、各カメラ位置姿勢から撮像することで、二次元画像上の動体位置ｏｂｊ（ｔ）１２０３が得られる。 FIG. 12(a) shows a scene and camera positions in three-dimensional space. In FIG. 12(a), 1201 denotes the input camera position and orientation p_in. By imaging the moving body coordinates 1202 in three-dimensional space from each camera position and orientation, the moving body position obj(t) 1203 on the two-dimensional image is obtained.

入力のカメラ位置姿勢１２０１を、動体位置ｏｂｊ（ｔ）を考慮せずに補正した補正後のカメラ位置姿勢ｐ（ｔ）を１２０４とする。カメラ位置姿勢１２０４からの出力画角を１２０５として、図１２（ｂ）にその様子を示す。ここで、出力画角１２０５の中央を基準位置ｃ１２０６として設定している。 Let 1204 denote the corrected camera position and orientation p(t) obtained by correcting the input camera position and orientation 1201 without considering the moving body position obj(t). FIG. 12(b) shows this case, with 1205 denoting the output angle of view from the camera position and orientation 1204. Here, the center of the output angle of view 1205 is set as the reference position c 1206.

カメラ位置姿勢ｐ（ｔ）は動体位置ｏｂｊ（ｔ）を考慮していないため、カメラ位置姿勢ｐ（ｔ）の出力画角１２０５では、動体位置１２０３と基準位置１２０６のずれが大きい。このように、動体位置を考慮せずに入力のカメラ位置姿勢を補正した場合、補正後のカメラ位置姿勢の出力画角において、動体位置１２０３は基準位置１２０６とは無関係に存在することになる。その結果、フレーム毎に動体位置が変動するため、動体位置が安定しない動画が生成されることになる。 Since the camera position/orientation p(t) does not consider the moving object position obj(t), the deviation between the moving object position 1203 and the reference position 1206 is large at the output angle of view 1205 of the camera position/orientation p(t). As described above, when the input camera position/orientation is corrected without considering the moving object position, the moving object position 1203 exists regardless of the reference position 1206 in the corrected output angle of view of the camera position/orientation. As a result, since the moving body position changes for each frame, a moving image in which the moving body position is not stable is generated.

入力のカメラ位置姿勢１２０１を、（式１２）に基づいて動体位置ｏｂｊ（ｔ）を考慮して補正した補正後のカメラ位置姿勢ｐ（ｔ）を１２０７とする。カメラ位置姿勢１２０７からの出力画角を１２０８として、図１２（ｃ）にその様子を示す。 Let 1207 denote the corrected camera position and orientation p(t) obtained by correcting the input camera position and orientation 1201 based on (Equation 12) in consideration of the moving body position obj(t). FIG. 12(c) shows this case, with 1208 denoting the output angle of view from the camera position and orientation 1207.

カメラ位置姿勢ｐ（ｔ）は動体位置ｏｂｊ（ｔ）を考慮しているため、カメラ位置姿勢ｐ（ｔ）の出力画角１２０８では、動体位置１２０３と基準位置１２０６がほぼ重なっている。このように、動体位置を考慮して入力のカメラ位置姿勢を補正した場合、補正後のカメラ位置姿勢の出力画角において、動体位置１２０３は基準位置１２０６に近づくようになる。その結果、フレーム毎で動体位置が基準位置付近に固定されるため、動体位置が安定した動画が生成されることになる。 Since the camera position/posture p(t) considers the moving object position obj(t), the moving object position 1203 and the reference position 1206 substantially overlap with each other in the output angle of view 1208 of the camera position/posture p(t). In this way, when the input camera position/orientation is corrected in consideration of the moving object position, the moving object position 1203 comes closer to the reference position 1206 at the corrected output angle of view of the camera position/orientation. As a result, since the moving body position is fixed near the reference position for each frame, a moving image with a stable moving body position is generated.
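The idea of a camera orientation for which the difference D becomes 0 can be illustrated under strong simplifying assumptions. The sketch below assumes a pinhole camera with a focal length given in pixels, a rotation-only correction (no translation), and a reference position equal to the image center, as in FIG. 12. It is one way to obtain a pose in the role of pc(t), not the conversion used in (Equation 12); the function names are hypothetical.

```python
import math


def centering_angles(obj, center, focal):
    """Yaw and pitch (radians) that move the moving body onto the image center."""
    x = obj[0] - center[0]
    y = obj[1] - center[1]
    yaw = math.atan2(x, focal)                    # removes the horizontal offset
    pitch = math.atan2(y, math.hypot(x, focal))   # then removes the vertical offset
    return yaw, pitch


def reproject(obj, center, focal, yaw, pitch):
    """Image position of the moving body after rotating the camera by yaw, then pitch."""
    x, y, z = obj[0] - center[0], obj[1] - center[1], focal
    # Rotation about the vertical axis.
    x1 = x * math.cos(yaw) - z * math.sin(yaw)
    z1 = x * math.sin(yaw) + z * math.cos(yaw)
    # Rotation about the horizontal axis.
    y2 = y * math.cos(pitch) - z1 * math.sin(pitch)
    z2 = y * math.sin(pitch) + z1 * math.cos(pitch)
    return (center[0] + focal * x1 / z2, center[1] + focal * y2 / z2)


obj, center, focal = (800.0, 300.0), (640.0, 360.0), 1000.0
yaw, pitch = centering_angles(obj, center, focal)
u, v = reproject(obj, center, focal, yaw, pitch)
# After the rotation, the difference D between (u, v) and the center is numerically zero.
print(math.hypot(u - center[0], v - center[1]) < 1e-9)  # True
```

In the optimization itself the corrected pose is only pulled toward such a target with weight λ4, so the moving body ends up near the reference position rather than exactly on it.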

ここで、（式１３）は、パラメータλによって各項の重みを変更することが可能である。これらの重みづけは任意で指定することが可能である。例えば、パラメータλ２の値を大きくし、パラメータλ３の値を小さくすると、図１１（ｃ）に示した曲線１１０３のように入力された撮像軌跡から離れた撮像軌跡になっても１フレーム毎の変化を滑らかすることが優先される。また、ステップＳ９０４において算出された動体の動き量が大きくなるほど、動体の位置変動をより低減することが好ましい。そのため、動体の動き量が大きいほど、パラメータλ４の値を大きくするのが良い。 Here, in (Equation 13), the weight of each term can be changed through the parameters λ. These weights can be specified arbitrarily. For example, if the value of the parameter λ2 is increased and the value of the parameter λ3 is decreased, smoothing the frame-to-frame change is prioritized even if the resulting imaging trajectory departs from the input imaging trajectory, as with the curve 1103 shown in FIG. 11(c). In addition, the larger the motion amount of the moving body calculated in step S904, the more the position variation of the moving body should preferably be reduced. Therefore, the value of the parameter λ4 should be increased as the motion amount of the moving body increases.

ステップＳ９０６では、軌跡補正部１０４は、処理フレームが最終フレームであるか否かの判定を行う。処理フレームが最終フレームであると軌跡補正部１０４が判定した場合（ＹＥＳ）、軌跡補正処理を終了し、図４に示すフローチャートに戻る。一方、処理フレームが最終フレームでないと軌跡補正部１０４が判定した場合、ステップＳ９０１へ進む。 In step S906, the trajectory correction unit 104 determines whether the frame being processed is the final frame. When the trajectory correction unit 104 determines that the frame being processed is the final frame (YES), the trajectory correction processing ends, and the process returns to the flowchart shown in FIG. 4. On the other hand, when the trajectory correction unit 104 determines that the frame being processed is not the final frame, the process returns to step S901.

以上説明したように、第１の実施形態によれば、動体位置を考慮してフレーム毎に動体位置が基準位置付近に固定されるように、推定された撮像時の撮像軌跡を補正する。そして、補正後の撮像軌跡に応じた画角領域に対応する画像を撮影動画から読み出すことで、撮影画角の変動を安定化させた動画を生成することができる。したがって、動体位置の変動を低減した安定化された動画の生成が可能となる。 As described above, according to the first embodiment, the estimated imaging trajectory at the time of imaging is corrected so that the moving body position is fixed near the reference position for each frame in consideration of the moving body position. Then, by reading the image corresponding to the angle-of-view area corresponding to the corrected imaging trajectory from the captured moving image, it is possible to generate a moving image in which the variation in the captured angle of view is stabilized. Therefore, it is possible to generate a stabilized moving image in which the fluctuation of the moving body position is reduced.

（第２の実施形態） (Second embodiment)
次に、本発明の第２の実施形態について説明する。第２の実施形態における画像処理装置及びそれが有する軌跡推定部は、第１の実施形態における画像処理装置１００及び軌跡推定部１０３と同様であるので、その説明は省略する。図１３は、第２の実施形態における軌跡補正部１０４の構成例を示すブロック図である。図１３において、図３に示した構成要素と同様の機能を有する構成要素には同一の符号を付し、重複する説明は省略する。 Next, a second embodiment of the present invention will be described. The image processing apparatus according to the second embodiment and the trajectory estimation unit it includes are the same as the image processing apparatus 100 and the trajectory estimation unit 103 according to the first embodiment, and their description is therefore omitted. FIG. 13 is a block diagram showing a configuration example of the trajectory correction unit 104 in the second embodiment. In FIG. 13, constituent elements having the same functions as those shown in FIG. 3 are designated by the same reference numerals, and redundant description is omitted.

第２の実施形態における軌跡補正部１０４は、動きベクトル検出部３０１、動体領域検出部３０２、動体位置算出部３０３、及び軌跡適正化部３０４に加え、動体サイズ算出部３０５及び距離情報取得部３０６を有する。なお、図１３においては、動体サイズ算出部３０５及び距離情報取得部３０６を有する構成を示しているが、動体サイズ算出部３０５又は距離情報取得部３０６を有するように構成しても良い。 The trajectory correction unit 104 according to the second embodiment includes a moving body size calculation unit 305 and a distance information acquisition unit 306 in addition to the motion vector detection unit 301, the moving body region detection unit 302, the moving body position calculation unit 303, and the trajectory optimization unit 304. Although FIG. 13 shows a configuration including both the moving body size calculation unit 305 and the distance information acquisition unit 306, the trajectory correction unit 104 may be configured to include only one of them.

第２の実施形態が第１の実施形態と異なるのは、撮像軌跡を補正する際に動体の大きさ及び距離情報の少なくとも一方をさらに考慮する点である。第１の実施形態では、フレーム間の動体位置の変動が低減するように撮像軌跡の補正を行う。これにより、動体の動きが注目されるシーンにおいては、動体の変動が低減することで安定化された動画が得られる。 The second embodiment is different from the first embodiment in that at least one of the size of the moving body and the distance information is further taken into consideration when correcting the imaging trajectory. In the first embodiment, the imaging trajectory is corrected so that the variation in the moving body position between frames is reduced. As a result, in a scene in which the motion of the moving body is noticed, a moving image that is stabilized by reducing the variation of the moving body can be obtained.

ここで、動体の動きが注目されるシーンとしては、例えば、人が歩いているところを後ろから追いながら撮影するシーンや、歩きながら自分の顔を撮影するシーンが考えられる。これらのシーンでは、動体の大きさが大きいことや、動体の距離が近くて並進の移動量が相対的に大きくなることから、動体の動きが注目されやすくなる。しかしながら、これに当てはまらないシーンでは、たとえ動体の移動量が大きい場合でも、動体の動きが注目されにくく、動体の変動を低減する必要性は低い。そのため本実施形態では、シーンにおける動体の注目度合いを、動体の大きさ及び距離情報の少なくとも一方を用いて判断し、その結果を撮像軌跡の適正化に反映する。 Here, examples of scenes in which the motion of a moving body attracts attention include a scene shot while following a walking person from behind and a scene in which the photographer shoots his or her own face while walking. In these scenes, the motion of the moving body tends to attract attention because the moving body is large, or because the moving body is close and its translational movement is therefore relatively large. In scenes where this is not the case, however, the motion of the moving body is unlikely to attract attention even if its movement amount is large, and there is little need to reduce the variation of the moving body. In the present embodiment, therefore, the degree of attention paid to the moving body in the scene is determined using at least one of the size of the moving body and the distance information, and the result is reflected in the optimization of the imaging trajectory.

第２の実施形態における軌跡補正部１０４による軌跡補正処理について説明する。図１４は、第２の実施形態における軌跡補正処理の例を示すフローチャートである。ステップＳ１４０１〜Ｓ１４０４の処理は、図９に示したステップＳ９０１〜Ｓ９０４での処理とそれぞれ同様であるので、説明は省略する。 The locus correction processing by the locus correction unit 104 according to the second embodiment will be described. FIG. 14 is a flowchart showing an example of the trajectory correction process in the second embodiment. The processing in steps S1401 to S1404 is the same as the processing in steps S901 to S904 shown in FIG. 9, and thus the description thereof will be omitted.

ステップＳ１４０５では、軌跡補正部１０４の動体サイズ算出部３０５は、ステップＳ１４０２において検出された動体領域の大きさを算出する。動体領域の大きさとしては、例えば、動体領域を構成する画素の数や、動体領域に外接する矩形の面積を用いれば良い。 In step S1405, the moving body size calculation unit 305 of the trajectory correction unit 104 calculates the size of the moving body region detected in step S1402. As the size of the moving body region, for example, the number of pixels forming the moving body region or the area of a rectangle circumscribing the moving body region may be used.
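The two size measures mentioned for step S1405 can be sketched as follows, assuming the moving body region is given as a binary mask (a list of rows); the function names are hypothetical.

```python
def pixel_count(mask):
    """Number of pixels forming the moving body region."""
    return sum(1 for row in mask for v in row if v)


def bounding_box_area(mask):
    """Area, in pixels, of the rectangle circumscribing the moving body region."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        return 0
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (max(xs) - min(xs) + 1) * (max(ys) - min(ys) + 1)


mask = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]
print(pixel_count(mask))        # 6
print(bounding_box_area(mask))  # 12
```

To compare sizes across videos of different resolution, either value could be divided by the total number of pixels in the frame; that normalization is an assumption and is not stated in the description.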

ステップＳ１４０６では、軌跡補正部１０４の距離情報取得部３０６は、撮像時におけるカメラと被写体との距離（動体距離）を取得する。例えば、動体距離として、位相差検出方式のオートフォーカス制御で算出できる測距情報を使用する。フォーカスを合わせている領域は撮影者が注目している領域であるので、その領域の距離が近く、そこに動体が存在するとすれば、動体の動きが注目されるシーンとして判断することができる。 In step S1406, the distance information acquisition unit 306 of the trajectory correction unit 104 acquires the distance between the camera and the subject at the time of imaging (the moving body distance). For example, distance measurement information that can be calculated by phase-difference autofocus control is used as the moving body distance. Since the in-focus region is the region to which the photographer is paying attention, if that region is at a short distance and a moving body is present there, the scene can be judged to be one in which the motion of the moving body attracts attention.

ここで、ステップＳ１４０３〜Ｓ１４０６の処理を行う順番は任意であり、入れ替えても良い。 Here, the processes of steps S1403 to S1406 may be performed in any order, and their order may be interchanged.

ステップＳ１４０７では、軌跡補正部１０４の撮像軌跡適正化部３０４は、ステップＳ４０２において推定された撮像時の撮像軌跡を補正し撮像軌跡の適正化を行う。第２の実施形態では、ステップＳ１４０３において算出した動体位置と、ステップＳ１４０５において算出した動体の大きさ及びステップＳ１４０６において取得した距離情報の少なくとも一方を使用し、撮像軌跡の補正を行う。撮像軌跡適正化の式は、第１の実施形態で示した（式８）から（式１３）と同じである。ここでは、動体の動きが注目されるシーンほど、（式１３）の重みλ４を大きくし、動体位置の変動が低減されるように制御する。 In step S1407, the imaging trajectory optimization unit 304 of the trajectory correction unit 104 corrects the imaging trajectory at the time of imaging estimated in step S402 to optimize the imaging trajectory. In the second embodiment, the imaging trajectory is corrected using the moving body position calculated in step S1403 together with at least one of the size of the moving body calculated in step S1405 and the distance information acquired in step S1406. The equations for optimizing the imaging trajectory are the same as (Equation 8) to (Equation 13) shown in the first embodiment. Here, the more the motion of the moving body attracts attention in a scene, the larger the weight λ4 of (Equation 13) is made, so that the variation of the moving body position is reduced.

前述の通り、動体の動きが注目されるシーンでは、動体の大きさが大きいことや、動体の距離が近いことが考えられる。これを考慮し、図１５に重みλ４の制御の一例を示す。図１５（ａ）では、動体の大きさが大きくなるほど、重みλ４を大きくしている。また図１５（ｂ）では、動体の距離が遠くなるほど、重みλ４を小さくしている。なお、図１５では重みを線形に変化させる例を示しているが、重みは非線形に変化させても良い。 As described above, in a scene where the movement of the moving body is noticed, it is possible that the size of the moving body is large or the distance of the moving body is short. Considering this, FIG. 15 shows an example of control of the weight λ4. In FIG. 15A, the weight λ4 is increased as the size of the moving body increases. Further, in FIG. 15B, the weight λ4 is reduced as the distance of the moving body increases. Although FIG. 15 shows an example in which the weight is changed linearly, the weight may be changed non-linearly.

動体の大きさ及び距離情報によって決定される重みλ４を、それぞれλ４ａ、λ４ｂとする。最終的な重みλ４は、重みλ４ａと重みλ４ｂを合成して得られる。合成する方法としては、例えば両者を足し合わせれば良い。なお、動体の大きさと距離情報は両方用いた方が、動体の動きが注目されるシーンを的確に判断できるため好ましいが、いずれか一方の情報を用いるのでも良い。その場合、重みλ４ａ又は重みλ４ｂのいずれかを、最終的な重みλ４として採用すればよい。 The weights λ4 determined by the size of the moving body and the distance information are λ4a and λ4b, respectively. The final weight λ4 is obtained by combining the weight λ4a and the weight λ4b. As a method of combining, for example, both may be added. It is preferable to use both the size of the moving body and the distance information because it is possible to accurately determine the scene in which the movement of the moving body is noted, but it is also possible to use either one of the information. In that case, either the weight λ4a or the weight λ4b may be adopted as the final weight λ4.
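The weight control described above can be sketched as follows. The linear ramps mirror the trend shown in FIG. 15, but the breakpoints (`size_range`, `dist_range`) and the maximum weight `w_max` are hypothetical values chosen for illustration, as is the convention that the size is expressed as a fraction of the frame area and the distance in meters.

```python
def ramp(value, lo, hi, out_lo, out_hi):
    """Linear interpolation from out_lo to out_hi over [lo, hi], clamped outside."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    t = min(1.0, max(0.0, (value - lo) / (hi - lo)))
    return out_lo + t * (out_hi - out_lo)


def lambda4(size=None, distance=None,
            size_range=(0.0, 0.5), dist_range=(1.0, 10.0), w_max=1.0):
    """Combine the size-based weight and the distance-based weight by addition.

    If only one of `size` and `distance` is given, that weight alone is used.
    """
    parts = []
    if size is not None:
        # Larger moving body -> larger weight (lambda4a).
        parts.append(ramp(size, size_range[0], size_range[1], 0.0, w_max))
    if distance is not None:
        # Farther moving body -> smaller weight (lambda4b).
        parts.append(ramp(distance, dist_range[0], dist_range[1], w_max, 0.0))
    if not parts:
        raise ValueError("at least one of size and distance is required")
    return sum(parts)


print(lambda4(size=0.5, distance=1.0))  # 2.0: large and close
print(lambda4(size=0.25))               # 0.5: size only
print(lambda4(distance=10.0))           # 0.0: far away
```

Addition is the combination named in the description; because the summed weight ranges up to twice the single-source weight, the other weights λ1 to λ3 may need to be scaled accordingly when both sources are used.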

第２の実施形態によれば、第１の実施形態と同様に、動体位置の変動を低減した安定化された動画の生成が可能となる。また、動体の大きさ及び距離情報の少なくとも一方に応じて決定した重みλ４を用いて撮像軌跡の適正化を行うことで、シーンに応じて適切に動体の変動を低減した安定化された動画を生成することが可能になる。 According to the second embodiment, as in the first embodiment, a stabilized moving image with reduced variation of the moving body position can be generated. Furthermore, by optimizing the imaging trajectory using the weight λ4 determined according to at least one of the size of the moving body and the distance information, it becomes possible to generate a stabilized moving image in which the variation of the moving body is reduced appropriately for the scene.

（本発明の他の実施形態） (Other Embodiments of the Present Invention)
本発明は、前述の実施形態の１以上の機能を実現するプログラムを、ネットワーク又は記憶媒体を介してシステム又は装置に供給し、そのシステム又は装置のコンピュータにおける１つ以上のプロセッサがプログラムを読み出し実行する処理でも実現可能である。また、１以上の機能を実現する回路（例えば、ＡＳＩＣ）によっても実現可能である。 The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

図１６は、本実施形態における画像処理装置のハードウェア構成の一例を示すブロック図である。図１６に示すように、本実施形態における画像処理装置は、ＣＰＵ１６０１、ＲＡＭ１６０２、ＲＯＭ１６０３、入力部１６０４、出力部１６０５、記憶部１６０６、及び通信インターフェース（ＩＦ）１６０７を有する。ＣＰＵ１６０１、ＲＡＭ１６０２、ＲＯＭ１６０３、入力部１６０４、出力部１６０５、記憶部１６０６、及び通信ＩＦ１６０７は、システムバス１６０８を介して通信可能に接続される。 FIG. 16 is a block diagram showing an example of the hardware configuration of the image processing apparatus according to this embodiment. As shown in FIG. 16, the image processing apparatus according to this embodiment has a CPU 1601, a RAM 1602, a ROM 1603, an input unit 1604, an output unit 1605, a storage unit 1606, and a communication interface (IF) 1607. The CPU 1601, RAM 1602, ROM 1603, input unit 1604, output unit 1605, storage unit 1606, and communication IF 1607 are communicatively connected via a system bus 1608.

ＣＰＵ（Central Processing Unit）１６０１は、システムバス１６０８に接続された各部の制御を行う。ＲＡＭ（Random Access Memory）１６０２は、ＣＰＵ１６０１の主記憶装置として使用される。ＲＯＭ（Read Only Memory）１６０３は、装置の起動プログラム等を記憶する。ＣＰＵ１６０１が、記憶部１６０６からプログラムを読み出して実行することで、例えば軌跡推定部１０３、軌跡補正部１０４、及び画像生成部１０５等の機能が実現される。 A CPU (Central Processing Unit) 1601 controls each unit connected to the system bus 1608. A RAM (Random Access Memory) 1602 is used as a main storage device of the CPU 1601. A ROM (Read Only Memory) 1603 stores a boot program of the apparatus. The functions of, for example, the trajectory estimation unit 103, the trajectory correction unit 104, and the image generation unit 105 are realized by the CPU 1601 reading the program from the storage unit 1606 and executing the program.

入力部１６０４は、ユーザによる入力等を受け付けたり、画像データを入力したりする。出力部１６０５は、画像データやＣＰＵ１６０１における処理結果等を出力する。記憶部１６０６は、装置の動作や処理に係る制御プログラム等を記憶する不揮発性の記憶装置である。通信ＩＦ１６０７は、本装置と他の装置（中継器等）との情報通信を制御する。 The input unit 1604 receives input from the user and inputs image data. The output unit 1605 outputs image data, processing results in the CPU 1601, and the like. The storage unit 1606 is a non-volatile storage device that stores a control program related to the operation and processing of the device. The communication IF 1607 controls information communication between this device and another device (such as a repeater).

前述のように構成された装置において、装置に電源が投入されると、ＣＰＵ１６０１は、ＲＯＭ１６０３に格納された起動プログラムに従って、記憶部１６０６から制御プログラム等をＲＡＭ１６０２に読み込む。ＣＰＵ１６０１は、ＲＡＭ１６０２に読み込んだ制御プログラム等に従い処理を実行することによって、画像処理装置の機能を実現する。つまり、画像処理装置のＣＰＵ１６０１が制御プログラム等に基づき処理を実行することによって、画像処理装置の機能構成及び動作が実現される。 In the device configured as described above, when the device is powered on, the CPU 1601 reads the control program and the like from the storage unit 1606 into the RAM 1602 according to the startup program stored in the ROM 1603. The CPU 1601 realizes the function of the image processing apparatus by executing processing according to the control program read into the RAM 1602. That is, the CPU 1601 of the image processing apparatus executes processing based on a control program or the like, whereby the functional configuration and operation of the image processing apparatus are realized.

なお、前記実施形態は、何れも本発明を実施するにあたっての具体化のほんの一例を示したものに過ぎず、これらによって本発明の技術的範囲が限定的に解釈されてはならないものである。すなわち、本発明はその技術思想、又はその主要な特徴から逸脱することなく、様々な形で実施することができる。 It should be noted that each of the above-described embodiments is merely an example of the embodiment in carrying out the present invention, and the technical scope of the present invention should not be limitedly interpreted by these. That is, the present invention can be implemented in various forms without departing from the technical idea or the main features thereof.

１００：画像処理装置１０１：画像入力部１０２：画像メモリ１０３：軌跡推定部１０４：軌跡補正部１０５：画像生成部２０１：画像マッチング部２０２：変動量算出部２０３：変動量累積部３０１：動きベクトル検出部３０２：動体領域検出部３０３：動体位置算出部３０４：軌跡適正化部３０５：動体サイズ算出部３０６：距離情報取得部 100: image processing device 101: image input unit 102: image memory 103: trajectory estimation unit 104: trajectory correction unit 105: image generation unit 201: image matching unit 202: variation amount calculation unit 203: variation amount accumulation unit 301: motion vector Detection unit 302: Moving body region detection unit 303: Moving body position calculation unit 304: Trajectory optimization unit 305: Moving body size calculation unit 306: Distance information acquisition unit

Claims

An image processing apparatus comprising:
trajectory estimation means for estimating an imaging trajectory from a first moving image;
trajectory correction means for correcting the estimated imaging trajectory; and
image generation means for generating, from the first moving image, a second moving image corresponding to the corrected imaging trajectory,
wherein the trajectory correction means detects a moving body region based on a motion vector detected from the first moving image, and corrects the imaging trajectory so as to reduce variation, in each frame of the second moving image, of a moving body position calculated from the detected moving body region.

The image processing apparatus according to claim 1, wherein the trajectory correction unit detects the moving body region based on a motion vector detected from the first moving image and the estimated imaging trajectory.

The image processing apparatus according to claim 2, wherein the trajectory correction means detects the moving body region based on a motion vector obtained by removing, from the motion vector detected from the first moving image, a variation amount calculated based on the imaging trajectory.

The image processing apparatus according to claim 1, wherein the trajectory correction unit further corrects the imaging trajectory according to at least one of the size of the moving body region and the distance to the moving body.

The image processing apparatus according to claim 1, wherein the trajectory correction unit calculates a center of gravity of a rectangle circumscribing the detected moving body region as the moving body position.

The image processing apparatus according to any one of claims 1 to 4, wherein the trajectory correction unit calculates an average value of the coordinates of the detected pixels forming the moving body region as the moving body position.

The image processing apparatus according to any one of claims 1 to 6, wherein the trajectory correction means corrects the imaging trajectory by adjusting a camera position and orientation in a three-dimensional space so that the moving body position approaches a reference position on a two-dimensional image.

An image processing method comprising:
a trajectory estimation step of estimating an imaging trajectory from a first moving image;
a trajectory correction step of correcting the estimated imaging trajectory; and
an image generation step of generating, from the first moving image, a second moving image corresponding to the corrected imaging trajectory,
wherein, in the trajectory correction step, a moving body region is detected based on a motion vector detected from the first moving image, and the imaging trajectory is corrected so as to reduce variation, in each frame of the second moving image, of a moving body position calculated from the detected moving body region.

A program for causing a computer to execute:
a trajectory estimation step of estimating an imaging trajectory from a first moving image;
a trajectory correction step of correcting the estimated imaging trajectory; and
an image generation step of generating, from the first moving image, a second moving image corresponding to the corrected imaging trajectory,
wherein, in the trajectory correction step, a moving body region is detected based on a motion vector detected from the first moving image, and the imaging trajectory is corrected so as to reduce variation, in each frame of the second moving image, of a moving body position calculated from the detected moving body region.