JP2019139310A

JP2019139310A - Information processing apparatus, method and program

Info

Publication number: JP2019139310A
Application number: JP2018019335A
Authority: JP
Inventors: Haruhisa Kato (加藤 晴久)
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2018-02-06
Filing date: 2018-02-06
Publication date: 2019-08-22
Anticipated expiration: 2038-02-06
Also published as: JP6901978B2

Abstract

To provide an information processing apparatus capable of stabilizing a tracking result. SOLUTION: An information processing apparatus 10 comprises: a search unit 6 that, for a plurality of frames obtained by imaging at a first frame rate and arranged on a time axis so as to span a unit interval of a second frame rate lower than the first frame rate, searches for the motion of an object within the frames using all or some of the frame pairs; and an estimation unit 7 that obtains the motion at the second frame rate by combining the searched motions. The apparatus further comprises an output unit 8 that performs display at the second frame rate based on the combined motion. When display is performed at the second frame rate, the frames obtained by imaging at the first frame rate are displayed at the second frame rate. SELECTED DRAWING: Figure 1

Description

The present invention relates to an information processing apparatus, method, and program capable of stabilizing a tracking result.

Technology for identifying and recognizing an object in an image can, as one application, convert analog information printed on media that are easy to distribute and present into digital information, which improves user convenience. One such technique is disclosed in Non-Patent Document 1. In Non-Patent Document 1, feature points are detected in an image, feature amounts are calculated from the neighborhood of each feature point, and these are compared with feature amounts stored in advance, thereby identifying the type of the object and its relative positional relationship.

Meanwhile, as a technique for further stabilizing the accuracy of determination based on such feature points and feature amounts, Patent Document 1, for example, has been published. Patent Document 1 discloses a method of dynamically updating a tracking template of an object. After the object is detected by the method of Non-Patent Document 1 or the like, an image obtained by correcting the input image is used as the tracking template; feature point coordinates are obtained by tracking the feature points through template matching with the tracking template, and the pose of the object is estimated. Because the tracking template is generated from an actual image of the object being tracked in subsequent images, effective and stable tracking may be possible.

JP 2016-28331 A

D. G. Lowe, "Object recognition from local scale-invariant features," Proc. of IEEE International Conference on Computer Vision (ICCV), pp. 1150-1157, 1999.

However, conventional techniques such as Non-Patent Document 1 and Patent Document 1 have the problem that the recognition or tracking result may become unstable.

Specifically, in Non-Patent Document 1, when an image is captured from an oblique angle, the feature amounts change because of projective distortion, and the determination process fails entirely if they cannot be matched with the feature amounts registered in advance. Likewise, when the object is imaged from a distance, the region used for calculating a feature amount becomes large relative to the region occupied by the object to be detected, so the feature amounts change and, again, the determination process fails entirely if they cannot be matched with the registered feature amounts.

Patent Document 1 tracks an object once it has been detected, so if the object was detected from the front, its pose can be estimated even after it moves to an oblique angle; this solves part of the problem above. However, if the pose information used for the correction contains an error, not only does the template also contain an error, but the errors from tracking with that template accumulate, so that accuracy drops sharply or tracking fails.

In view of the above problems of the prior art, an object of the present invention is to provide an information processing apparatus, method, and program capable of stabilizing a tracking result.

To achieve the above object, the present invention is an information processing apparatus comprising: a search unit that, for a plurality of frames obtained by imaging at a first frame rate and arranged on the time axis so as to span a unit interval of a second frame rate lower than the first frame rate, searches for the motion of an object within the frames using all or some of the frame pairs; and an estimation unit that combines the searched motions to obtain the motion at the second frame rate. The present invention is also a method and a program corresponding to the apparatus.

According to the present invention, a motion search is performed on the premise of the high first frame rate, and the results are combined to obtain the motion at the low second frame rate, whereby the tracking result can be stabilized.

FIG. 1 is a functional block diagram of an information processing apparatus according to an embodiment.
FIG. 2 is a diagram schematically showing how the processing of the search unit, the estimation unit, and the output unit relate to one another in an embodiment.
FIG. 3 is a flowchart of the operation of the information processing apparatus according to an embodiment that uses sequential processing.
FIG. 4 is a flowchart of the operation of the information processing apparatus according to an embodiment that uses batch processing.
FIG. 5 is a diagram showing a schematic example of setting the search range according to the interval between frames when a motion is obtained.
FIG. 6 is a diagram showing a schematic example that explains the advantage of also searching frames farther away than the adjacent frame.

FIG. 1 is a functional block diagram of an information processing apparatus according to an embodiment. As illustrated, the information processing apparatus 10 includes an imaging unit 1, a buffer unit 2, a switch unit 3, a detection unit 4, a storage unit 5, a search unit 6, an estimation unit 7, and an output unit 8. Any information terminal having the imaging unit 1 can serve as the hardware that realizes the information processing apparatus 10: a mobile terminal, a tablet terminal, a desktop or laptop computer, or the like. Some or all of the functional units other than the imaging unit 1 may be installed on a server, with the exchange of information between the functional units shown in FIG. 1 realized by communication over a network or the like. Conversely, the imaging unit 1 may reside on the network, and the captured images acquired from it may be processed in the information processing apparatus 10. The processing performed by each functional unit in FIG. 1 is outlined below.

The imaging unit 1 images an object in response to the user's camera operation and outputs the captured images to the buffer unit 2. Imaging by the imaging unit 1 is performed at the first frame rate, and video, that is, a captured image as the frame at each time in the time series, is output to the buffer unit 2. In FIG. 1, "frame rate" is abbreviated as "rate". The digital camera that is nowadays standard equipment on most mobile terminals can be used as the hardware that realizes the imaging unit 1.

In the following description, the time whose order is specified by an integer i is written "time ti", and the captured image obtained by the imaging unit 1 at time ti is written "frame Fi". "Frame Fi" is thus the same as "captured image Fi", but expressions such as "frame Fi" are mainly used below when referring to processing on the time axis.

The buffer unit 2 temporarily stores the frame Fi of each time ti obtained from the imaging unit 1, so that the frame Fi can be referenced both by the detection unit 4 and the search unit 6 (via the switch unit 3) and by the output unit 8. In one embodiment, the detection unit 4 and the search unit 6 (via the switch unit 3) reference the frames Fi at the first frame rate, the same as the imaging rate of the imaging unit 1, while the output unit 8 references the frames Fi at the second frame rate, which is lower than the first frame rate.

The period for which the buffer unit 2 temporarily stores the frame Fi of each time ti need only be long enough to allow the processing of the detection unit 4, the search unit 6, and the output unit 8 described below. The buffer unit 2 may discard any frame Fi that is no longer needed for reference once that period has elapsed.
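As a non-limiting illustration, the buffering described above could be sketched in Python as follows. The class name `FrameBuffer`, its `capacity` parameter, and the method names are hypothetical and do not appear in this disclosure; a fixed capacity is assumed here as one simple way of discarding frames that are no longer needed.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class FrameBuffer:
    """Hypothetical bounded buffer holding (index i, frame Fi) pairs."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        # A deque with maxlen silently drops the oldest entry when full.
        self._frames: Deque[Tuple[int, object]] = deque(maxlen=capacity)

    def push(self, index: int, frame: object) -> None:
        """Store the frame captured at time index `index`."""
        self._frames.append((index, frame))

    def get(self, index: int) -> Optional[object]:
        """Return frame Fi, or None if it was never stored or was discarded."""
        for i, frame in self._frames:
            if i == index:
                return frame
        return None

    def every(self, step: int) -> List[object]:
        """Return the buffered frames whose index is a multiple of `step`.

        This mimics a reader that references frames at a lower rate
        than the rate at which they were pushed.
        """
        return [frame for i, frame in self._frames if i % step == 0]
```

In this sketch, a reader working at the first frame rate would call `get` for every index, while a reader working at the second frame rate would use `every` with the ratio of the two rates as `step`.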

The switch unit 3 performs switching as follows. While the predetermined detection target has not yet been detected in a frame Fi by the detection unit 4, the switch unit 3 supplies each frame Fi in the buffer unit 2 to the detection unit 4. Once the detection unit 4 has detected the target, the switch unit 3 supplies each subsequent frame Fi to the search unit 6 so that the target continues to be tracked within the frames. When the search unit 6 can no longer continue tracking the target, the switch unit 3 returns to the state of supplying each frame Fi to the detection unit 4.
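The switching behavior can be pictured as a two-state loop. The following Python sketch is an illustration only: the function `route_frames`, the `Region` tuple layout, and the convention that `detect` and `track` return `None` on failure are assumptions made for this example.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Region = Tuple[int, int, int, int]  # hypothetical (x, y, width, height) layout


def route_frames(
    frames: Iterable[object],
    detect: Callable[[object], Optional[Region]],
    track: Callable[[object, Region], Optional[Region]],
) -> List[Tuple[str, Optional[Region]]]:
    """Send each frame to `detect` until a region is found, then to `track`.

    When `track` returns None (tracking lost), the next frame goes back
    to `detect`. The returned log records which path each frame took.
    """
    region: Optional[Region] = None
    log: List[Tuple[str, Optional[Region]]] = []
    for frame in frames:
        if region is None:
            # Target not yet detected (or lost): detection path.
            region = detect(frame)
            log.append(("detect", region))
        else:
            # Target known from an earlier frame: tracking path.
            region = track(frame, region)
            log.append(("track", region))
    return log
```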

While the predetermined target is undetected, the detection unit 4 extracts an image feature amount from the frame Fi of each time ti supplied from the switch unit 3 and attempts to detect the target by comparing it with the image feature amounts of predetermined targets stored in the storage unit 5. If detection succeeds at some time t0, the detection unit 4 outputs to the search unit 6, as the detection result, information on the region R0 that the predetermined target occupies in the frame F0 at time t0.

Any existing type of image feature amount may be used for detection in the detection unit 4: an image feature amount based on any type of feature point detection (key points such as corners of the target) and any type of local feature amount extracted from those feature points. For feature point detection and local feature extraction, existing methods such as SIFT (Scale-Invariant Feature Transform) and SURF (Speeded Up Robust Features), as disclosed in Non-Patent Document 1 and elsewhere, can be used. The image feature amount may be defined to include the coordinate information of the feature points (in particular, the coordinates of the relative arrangement of multiple feature points), or it may be defined by the local feature amounts alone, excluding coordinate information. For example, the individual local feature amounts of the target (each a real-valued vector) may be quantized into visual words, and the image feature amount may be defined as a bag of visual words, that is, a histogram of the visual words over the whole target.

The storage unit 5 stores in advance, for each of one or more predetermined detection targets, an image feature amount of the same type as that extracted by the detection unit 4, extracted from an image of the target. When threshold determination finds the image feature amount extracted from the frame Fi to be similar to one stored in the storage unit 5, the detection unit 4 can determine that the predetermined target corresponding to the stored image feature amount has been detected in the frame Fi. For the similarity determination, the distance between image feature amounts is calculated, and feature amounts whose distance is small are judged to be similar. When the image feature amount is defined to include the coordinate information of feature points, matching between feature points may also take into account the geometric consistency of the feature point coordinates, evaluated for example by computing a homography matrix, and the distance between image feature amounts may be calculated so as to reflect the degree of geometric consistency.
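A minimal sketch of the threshold-based similarity determination, assuming the image feature amount is a fixed-length histogram (for example a bag of visual words) and assuming Euclidean distance as the distance measure. The function names, the dictionary layout of the stored features, and the choice of distance are illustrative assumptions; geometric consistency is not modeled.

```python
import math
from typing import Dict, Optional, Sequence


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def detect_target(
    query: Sequence[float],
    stored: Dict[str, Sequence[float]],
    threshold: float,
) -> Optional[str]:
    """Return the name of the closest stored target within `threshold`.

    Returns None when no stored feature is closer than `threshold`,
    which corresponds to the target remaining undetected in this frame.
    """
    best_name: Optional[str] = None
    best_dist = threshold
    for name, feature in stored.items():
        dist = l2_distance(query, feature)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name
```

The threshold trades false detections against missed detections and would need to be tuned for the feature type actually used.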

The detection unit 4 may also perform detection by evaluating image features more generally rather than by extracting image feature amounts as above. For example, the target may be detected in the frame Fi by template matching. In this case, a template image is stored in advance in the storage unit 5 for each of one or more detection targets, and the detection unit 4 searches the frame Fi using the template image.

For the frames F1, F2, ... at times t1, t2, ... after the detection unit 4 has succeeded in detecting the predetermined target at time t0, the search unit 6 tracks the target by searching, between frames of different times, for the motion of the region that the target occupies within the frame. The motion search result, which is the tracking result, is output to the estimation unit 7.

When the predetermined target has already been detected or found as region Ri in frame Fi at some time ti, the search unit 6 uses the information on region Ri to search for the target's region Ri+k in frame Fi+k at another time ti+k (k may be positive or negative), and thereby obtains information on the motion M(i,i+k) from region Ri of frame Fi to region Ri+k of frame Fi+k. Various embodiments are possible as to how the search-source frame Fi and the search-destination frame Fi+k are set and in what order the searches are performed; these are described later with reference to FIGS. 3 and 4.

In one embodiment, the search unit 6 may use template matching, an existing region tracking method, for the search. As the template image, a predetermined template image stored in advance in the storage unit 5 for the detection target may be read from the storage unit 5 when the detection unit 4 succeeds in detection, or the region Ri of the search-source frame Fi itself may be used as a template image that is updated as needed. Alternatively, only the region R0 detected by the detection unit 4 at time t0 may be used as a fixed template image. In another embodiment, the search unit 6 may use a particle filter, another existing region tracking method, to estimate the region Ri+k from the region Ri.
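The template matching search can be illustrated with a small pure-Python sketch. It assumes grayscale frames stored as nested lists, a sum-of-squared-differences score, and a square search window of hypothetical size `radius` around the previous position; none of these choices is specified in this disclosure.

```python
from typing import List, Optional, Sequence, Tuple

Image = Sequence[Sequence[float]]  # rows of pixel intensities


def match_template(
    frame: Image,
    template: Image,
    top_left: Tuple[int, int],
    radius: int,
) -> Tuple[int, int]:
    """Find the template near `top_left` and return the motion (dy, dx).

    `top_left` is the (row, column) of the region in the source frame.
    Candidate positions within `radius` pixels are scored by the sum of
    squared differences (SSD); the lowest score wins.
    """
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    y0, x0 = top_left
    best: Optional[Tuple[float, int, int]] = None
    for y in range(max(0, y0 - radius), min(fh - th, y0 + radius) + 1):
        for x in range(max(0, x0 - radius), min(fw - tw, x0 + radius) + 1):
            ssd = sum(
                (frame[y + dy][x + dx] - template[dy][dx]) ** 2
                for dy in range(th)
                for dx in range(tw)
            )
            if best is None or ssd < best[0]:
                best = (ssd, y, x)
    if best is None:
        raise ValueError("search window does not overlap the frame")
    _, by, bx = best
    # The motion is expressed as a translation of the region's corner.
    return (by - y0, bx - x0)
```

The returned translation plays the role of a motion M(i,i+k) in its simplest, translation-only form.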

The estimation unit 7 combines, on the time axis, the motion search results obtained from the search unit 6 at the first frame rate to obtain a motion estimation result at the second frame rate, and outputs that result to the output unit 8.

When the information on the motion M(i,i+k) is given as a coordinate transformation matrix M(i,i+k) between the regions Ri and Ri+k, the motion D at the second frame rate can be obtained by combining these matrices as their product, as in equation (1).

When the information on the motion M(i,i+k) is instead given as a translation vector M(i,i+k) of coordinates between the regions Ri and Ri+k, the motion D can be obtained in simplified form as the sum of the vectors rather than a product, as in equation (2).

In equations (1) and (2), the product or sum is taken over all motions M(i,i+k) that lie within the period of the second-frame-rate motion D and were found by the search unit 6 on the premise of the first frame rate; the two equations denote these by the index i. Which motions the index i covers depends on the specific search performed by the search unit 6, so the various embodiments are likewise described later with reference to FIGS. 3 and 4. When a product is taken, the matrices are multiplied in the time order of the acquired motions.
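Equations (1) and (2) are not reproduced in this text. One form consistent with the surrounding description, given here as an assumption rather than as the published equations, is the following. The third line writes out (1) for seven adjacent-frame motions (k = 1) and places later motions on the left, which assumes that the matrices act on column vectors of coordinates.

```latex
\[ D = \prod_{i} M(i,\, i+k) \tag{1} \]
\[ D = \sum_{i} M(i,\, i+k) \tag{2} \]
\[ D = M(6,7)\, M(5,6) \cdots M(1,2)\, M(0,1) \]
```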

The output unit 8 applies predetermined processing, based on the target motion estimation result at the second frame rate obtained from the estimation unit 7, to each frame Fi obtained from the buffer unit 2 at the second frame rate, and thereby obtains and displays output information at the second frame rate. (The motion estimation result also corresponds to an estimate of the target's position within the frame at each time after detection by the detection unit 4.) The update interval of the output information displayed by the output unit 8 is therefore the reciprocal of the second frame rate.

The output unit 8 can further read, from the storage unit 5 where it is stored in advance, predetermined information corresponding to the recognition result that identifies what the predetermined target is when it is detected by the detection unit 4 (accompanying information for the detection target), and use it in the processing. In one embodiment, the output unit 8 can obtain and display, as the processed output information at the second frame rate, the captured images Fi obtained by the imaging unit 1 at the first frame rate with augmented reality display applied at the second frame rate.

Using existing augmented reality display technology, the output unit 8 can also obtain output information in which the augmented reality display for the detected and tracked target is aligned with the three-dimensional structure of the real space captured in the image Fi. In this case, information on the relative spatial positional relationship between the camera, as the hardware constituting the imaging unit 1, and the predetermined target detected and tracked in the captured image Fi may be used.

Here, the output unit 8 can estimate the relative positional relationship between the camera and the predetermined target, or between cameras, by applying an existing method such as DLT (direct linear transformation) or the eight-point algorithm to the corresponding points obtained from each piece of output motion information (that is, each motion D in the time series obtained at the second frame rate). First, when the detection unit 4 detects the predetermined target at time t0 using an image feature amount that includes the coordinates of individual feature points, the relative positional relationship between the camera and the target at detection time t0 can be obtained in the form of a homography matrix H0. If motions D1, D2, D3, ... are then obtained at the second frame rate after detection, the relative positional relationships corresponding to these motions can be obtained by the above existing methods as homography matrices H1, H2, H3, .... The output unit 8 can therefore obtain the relative positional relationship at the second frame rate after detection as the products H1·H0, H2·H1·H0, H3·H2·H1·H0, ....
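The chained products H1·H0, H2·H1·H0, ... can be accumulated incrementally by left-multiplying each new matrix onto the running result. The sketch below assumes 3x3 matrices stored as nested lists; the function names `matmul3` and `accumulate_poses` are hypothetical, and estimating each Hn from corresponding points is outside its scope.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def matmul3(a: Sequence[Sequence[float]], b: Sequence[Sequence[float]]) -> Matrix:
    """Return the 3x3 matrix product a·b."""
    return [
        [sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
        for r in range(3)
    ]


def accumulate_poses(
    h0: Sequence[Sequence[float]],
    steps: Sequence[Sequence[Sequence[float]]],
) -> List[Matrix]:
    """Return [H1·H0, H2·H1·H0, ...] for steps = [H1, H2, ...].

    Each new step is multiplied on the left of the running product,
    matching the order in which the products are written in the text.
    """
    poses: List[Matrix] = []
    current: Matrix = [list(row) for row in h0]
    for h in steps:
        current = matmul3(h, current)
        poses.append(current)
    return poses
```

Because matrix multiplication is not commutative, swapping the operands of `matmul3` in the loop would generally give a different result.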

The output unit 8 can be realized in hardware by a display. The display may be a non-see-through type such as an ordinary liquid crystal monitor, or a see-through type using a liquid crystal or organic EL monitor or the like. When the display is a see-through type and the output unit 8 provides augmented reality display, the output unit 8 may omit receiving the captured image Fi from the buffer unit 2 and displaying it. When the see-through display takes the form of, for example, a head-mounted display (HMD), it is desirable to calibrate the camera and the see-through HMD in advance so that, when augmented reality content is superimposed on the see-through HMD at the position corresponding to the target's detected position in the image Fi captured by the imaging unit 1 as the camera, the superimposed display position matches the position at which the user sees the actual target in the real world.

The above describes, as a suitable application of the information processing apparatus 10, an example in which the output unit 8 obtains vision-related output information such as augmented reality display. Instead of or in addition to this, output information related to any other sense may be obtained as the result of processing based on the motion D. For example, audio output corresponding to the motion D may be obtained. The motion D may also be output as it is, in the form of text information or the like. (This corresponds to a configuration in which the output unit 8 is omitted from the information processing apparatus 10 and the output of the estimation unit 7 is the output of the apparatus.) A control output for controlling actuators or other devices according to the motion D may also be obtained. In these cases, the output interval of the audio, text, control, or other output information corresponds to the second frame rate.

To enable detection by the detection unit 4, the storage unit 5 stores feature information on the detection targets, in the form of image feature amounts or the like. To enable the output unit 8 to generate output information by processing, it also stores accompanying information for each type of detection target. The storage unit 5 makes each kind of information available for reference by the detection unit 4 and the output unit 8. The information stored in the storage unit 5 may be prepared in advance by an administrator or the like.

FIG. 2 schematically shows, for one embodiment, how the processing of the search unit 6, the estimation unit 7, and the output unit 8 relate to one another for the frames Fi after the detection unit 4 detects the target in frame F0 at time t0. In FIG. 2, [1] schematically shows the search processing of the search unit 6, and [2] schematically shows the estimation and output processing of the estimation unit 7 and the output unit 8 that operate in conjunction with the search processing.

As shown at [1] in FIG. 2, the search unit 6 searches, at the first frame rate (the same as the imaging frame rate of the imaging unit 1), for the motion M(i,i+1) (i=0,1,2,...,6) of the detection target between the adjacent frames Fi and Fi+1 (i=0,1,2,...,6) at adjacent times ti and ti+1. In the figure, the detection target is drawn schematically as a white circle (○) within each frame and each motion as a curved arrow. Then, as shown at [2] in FIG. 2, the estimation unit 7 combines the seven motions M(i,i+1) (i=0,1,2,...,6) found by the search unit 6 using equation (1) or (2) to obtain the second-frame-rate motion M(0,7) defined between frames F0 and F7, and the output unit 8 uses the motion M(0,7) to produce output such as augmented reality display at the second frame rate (drawn schematically as a star (☆) in the figure).

The schematic example of FIG. 2 is one in which the first frame rate is seven times the second frame rate. For example, the first frame rate may be 210 fps (frames per second) and the second frame rate 30 fps.
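For the translation-vector case, combining adjacent-frame motions into second-frame-rate motions reduces to summing each run of consecutive vectors. The sketch below assumes motions given as (dx, dy) tuples and an integer ratio between the two frame rates; the function name `compose_translations` and the handling of a trailing incomplete run (it is dropped) are choices made for this example.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, float]


def compose_translations(motions: Sequence[Vector], ratio: int) -> List[Vector]:
    """Sum every `ratio` consecutive adjacent-frame translations.

    `motions[i]` is the translation from frame Fi to frame Fi+1.
    Each output element is the translation over one unit interval of
    the lower frame rate, e.g. from F0 to F7 when ratio is 7.
    A trailing run shorter than `ratio` is ignored.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    combined: List[Vector] = []
    for start in range(0, len(motions) - ratio + 1, ratio):
        chunk = motions[start:start + ratio]
        combined.append((sum(dx for dx, _ in chunk), sum(dy for _, dy in chunk)))
    return combined


# With 210 fps capture and 30 fps output, seven motions form one output motion.
RATIO = 210 // 30
```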

FIG. 3 is a flowchart of the operation of the information processing apparatus 10 according to an embodiment. The embodiment of FIG. 3 is one in which the search unit 6 performs the motion search by sequential processing on the captured images that the imaging unit 1 obtains at the first frame rate, and it is one example of how the operation of the schematic example of FIG. 2 can be realized.

Note that, in FIG. 3, being inside the loop formed by steps S1, S2, and S20 corresponds to the state in the functional block representation of FIG. 1 in which the captured image Fi is sent from the switch unit 3 to the detection unit 4, and being outside that loop corresponds to the state in which the captured image Fi is sent from the switch unit 3 to the search unit 6. In other words, the switch unit 3 is what expresses, within the functional block configuration of FIG. 1, an operation flow structure such as the one exemplified in FIG. 3, so the description of each step of FIG. 3 omits a redundant description of the operation of the switch unit 3. Accordingly, the operation of FIG. 3 is described without specific reference to the switch unit 3, on the understanding that the detection unit 4 obtains the captured image Fi by referring directly to the buffer unit 2 while inside the loop, and the search unit 6 does so while outside the loop. (The same applies to FIG. 4, described later.) Each step of FIG. 3 is described below.

The flow of FIG. 3 is assumed to start in a state where the target has not yet been detected, for example immediately after the camera of the imaging unit 1 is started and imaging begins; once started, the flow proceeds to step S1. In step S1, the detection unit 4 attempts to detect the target in the captured image Fi acquired by the imaging unit 1 at the current time ti, and the flow proceeds to step S2. In step S2, it is determined whether the detection by the detection unit 4 in the most recent step S1 succeeded; if it succeeded, the flow proceeds to step S3, and if it failed, to step S20.

In step S20, the time ti is updated to the next time ti+1, the imaging unit 1 acquires the captured image Fi+1 at the updated current time ti+1, and the flow returns to step S1. The time update in step S20 may be an update at the first frame rate, from time ti to time ti+1 as just described, or an update at the second frame rate (whose value is 1/N of the first frame rate, where N is an integer of 2 or more), from time ti to time ti+N. That is, the detection by the detection unit 4 in step S1 may be performed at either the first or the second frame rate.

In step S3, based on the detection result that succeeded in the most recent step S1, the detection unit 4 outputs information on the region Ri of the target detected in frame Fi to the search unit 6, and outputs the accompanying information for the detected target from the storage unit 5 (the accompanying information that enables the aforementioned augmented reality display and the like) to the output unit 8. The flow then proceeds to step S4.

In step S4, the time ti is updated to the next time ti+1 at the first frame rate, the imaging unit 1 acquires the captured image Fi+1 at the updated current time ti+1, and the flow proceeds to step S5. Because the time ti is updated to the next time ti+1 in steps S4 and S20 in this way, it always denotes the latest time (the current time); on that premise, each step of FIG. 3 is described with the current time written as ti. (The steps of FIG. 4, described later, are described on the same premise.)

In step S5, the search unit 6 refers to the buffer unit 2 and attempts to track the region. Using the target region Ri-k in the frame Fi-k of the most recent past time ti-k (k≥1) at which detection by the detection unit 4 or search by the search unit 6 succeeded, it searches for the target region Ri within the frame Fi of the current time ti. The flow then proceeds to step S6. In step S6, it is determined whether the tracking in the most recent step S5 succeeded; if it succeeded, the flow proceeds to step S7, and if it failed, to step S70.

In step S7, the search unit 6 acquires the information on the successfully found region Ri in the frame Fi of the current time ti as information to be used for tracking in step S5 at the next time ti+1, and the flow proceeds to step S8. In step S70, the search unit 6 records that tracking failed for time ti, and the flow proceeds to step S9. The record made in step S70 is what allows step S5 to determine the most recent past time ti-k (k≥1). Also, because step S7 is skipped in this way when tracking fails for a time ti, no region Ri is acquired for that time as a tracking or search source for step S5.

In step S8, the search unit 6 acquires the motion M(i,i-k) resulting from the successful tracking in the most recent step S5, and the flow proceeds to step S9. In step S9, it is determined whether the current time ti (which step S4 advances by one at the first frame rate) has reached an update timing of the second frame rate; if so, the flow proceeds to step S10, and if not, it returns to step S4.

In step S10, an attempt is made to obtain the motion D at the second frame rate (and to have the output unit 8 output the output information based on the motion D) by combining the motions M(i,i-k) acquired in the series of executions of step S8 during the period between the current time ti, which is an update timing of the second frame rate, and the time ti-N, which is the immediately preceding update timing of the second frame rate. (As stated above, N is an integer of 2 or more, and N updates at the first frame rate interval correspond to one update at the second frame rate interval.) The flow then proceeds to step S11.

In step S11, it is determined whether the motion D attempted in step S10 could be obtained (and the output information based on it output). If it could, the flow returns to step S4, so that the search by the search unit 6 described above continues for the updated latest time ti. If step S11 determines that it could not, the flow returns to step S1, so that detection by the detection unit 4 resumes for the updated latest time ti.

Whether the motion D could be obtained may be determined as follows: when all of the following conditions, or any subset of them, hold, the individual motions M(i,i-k) needed to obtain the motion D are regarded as insufficiently acquired, and it is determined that the motion D cannot be obtained.
(Condition 1) At the update timing of the second frame rate, that is, when step S9 gives an affirmative determination, the most recent step S6 (that is, step S6 at the same time ti as the current time ti of that step S9) gave a negative determination.
(Condition 2) Among the N first-frame-rate steps that make up one period of the second frame rate, the number of times step S6 gave a negative determination exceeds a predetermined threshold.
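The two conditions can be sketched as a small predicate. This is a minimal sketch under assumptions: the function name, the list-of-booleans representation of the step S6 outcomes, and the `require_all` switch (which selects between "all conditions" and "any condition") are hypothetical and not part of the description.

```python
def motion_d_unobtainable(track_ok, max_failures, require_all=False):
    """Decide whether motion D cannot be obtained for one second-frame-rate period.

    track_ok: the N outcomes of step S6 in this period, oldest first
              (True = tracking succeeded at that first-frame-rate step).
    max_failures: the predetermined threshold of Condition 2.
    require_all: if True, both conditions must hold; otherwise either suffices.
    """
    cond1 = not track_ok[-1]                       # Condition 1: failure at the update timing
    cond2 = track_ok.count(False) > max_failures   # Condition 2: too many failures
    return (cond1 and cond2) if require_all else (cond1 or cond2)
```

For instance, with N=4 and a threshold of 2, a period whose last step failed is rejected under the "any condition" reading even if the other three steps succeeded.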

An example of acquiring and combining motions according to the flow of FIG. 3 follows. Here N=4, so the first frame rate is four times the second frame rate, and the second-frame-rate motion M(0,4) is composed from motions found among the 4+1=5 first-frame-rate frames F0, F1, F2, F3, and F4.
(Example 1) When the searches for all of the motions M(0,1), M(1,2), M(2,3), and M(3,4) succeed, these four motions are combined to obtain the motion M(0,4).
(Example 2) When the search for the motion M(1,2) fails and the search for the motion M(1,3) then succeeds, so that the searches for M(0,1), M(1,3), and M(3,4) have succeeded, these three motions are combined to obtain the motion M(0,4).
(Example 3) When the search for M(1,2) fails, the search for the motion M(1,3) also fails, and the search for the motion M(1,4) then succeeds, so that the searches for M(0,1) and M(1,4) have succeeded, these two motions are combined to obtain the motion M(0,4).
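The chains in Examples 1 to 3 can be reproduced with a short sketch of the sequential tracking loop. The function name `sequential_chain` and the callback `search_ok(src, dst)`, which stands in for the actual motion search of step S5, are hypothetical.

```python
def sequential_chain(n, search_ok):
    """Walk frames 1..n, always tracking from the last frame whose region is known.

    Returns the list of (src, dst) frame pairs whose motions are to be combined,
    or None if the final frame n could not be reached.
    """
    chain = []
    last = 0  # frame F0 holds the detected region
    for i in range(1, n + 1):
        if search_ok(last, i):      # steps S5/S6: track from F_last to F_i
            chain.append((last, i)) # step S8: keep the motion
            last = i                # step S7: F_i becomes the new search source
    return chain if last == n else None

# Example 2: only the search for M(1,2) fails.
example2 = sequential_chain(4, lambda a, b: (a, b) not in {(1, 2)})
# Example 3: the searches for M(1,2) and M(1,3) both fail.
example3 = sequential_chain(4, lambda a, b: (a, b) not in {(1, 2), (1, 3)})
```

Returning `None` when the last frame is not reached mirrors Condition 1 above; a threshold check such as Condition 2 could be layered on top.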

Although the first frame rate is described as N times the second frame rate, by so-called fencepost counting, N consecutive first-frame-rate frames plus one more frame, N+1 frames in total, correspond to one period of the second frame rate including the frames at both ends.

The flow of FIG. 3 above is an embodiment in which the search unit 6 performs the motion search by sequential processing on the first-frame-rate captured images Fi from the imaging unit 1. FIG. 4 is a flowchart of the operation of the information processing apparatus 10 according to another embodiment. As in FIG. 3, the search unit 6 performs its search on captured images Fi supplied at the first frame rate, but it does not search sequentially. Instead, it waits until N+1 first-frame-rate captured images Fi, corresponding to one period (including both ends) of the second frame rate, have accumulated in the buffer unit 2, and then searches over those N+1 images, or some of them, as a batch. Each step of FIG. 4 is described below.

Steps S21, S22, S23, and S220, which follow the start of the flow of FIG. 4, are identical to steps S1, S2, S3, and S20 of FIG. 3, respectively, so their description is omitted. Thus, as in steps S2 and S3 of FIG. 3, once step S22 of FIG. 4 determines that detection by the detection unit 4 succeeded and step S23 produces the output and the like corresponding to the detection result, the flow proceeds to step S31.

In step S31, the current time ti is updated to the next time ti+1 at the first frame rate, the imaging unit 1 acquires the frame Fi+1 at that latest time ti+1, and the flow proceeds to step S32. In step S32, it is determined whether the current time ti has reached an update timing of the second frame rate; if so, the flow proceeds to step S33, and if not, it returns to step S31.

In step S33, of the N+1 frames Fi-N, Fi-N+1, …, Fi-1, Fi covering one period (including both ends) of the second frame rate, which are available at the current time ti after the loop of steps S31 and S32, the search unit 6 attempts to track the target by attempting a motion search between the two end frames Fi and Fi-N; the flow then proceeds to step S34. The motion search in step S33 may use the region Ri-N already detected or found in the search-source frame Fi-N (that is, the frame in which detection succeeded in the most recent steps S21 and S22, or in which the search succeeded in the most recent steps "S33 and S34" or "S35 and S36").

In step S34, it is determined whether the end-frame search of step S33 succeeded; if so, the flow proceeds to step S37, and if not, to step S35. In step S35, in response to the failed search between the end frames Fi and Fi-N, the search unit 6 attempts to obtain the equivalent of the end-to-end motion search result by recursively performing multiple motion searches according to a predetermined rule, also using the interior frames among the N+1 frames; the flow then proceeds to step S36. Step S35 can be implemented in various ways, which are described later.

In step S36, it is determined whether the search of step S35 succeeded; if so, the flow proceeds to step S37, and if not, it returns to step S21, so that the processing of the detection unit 4 resumes. In step S37, the estimation unit 7 obtains the motion estimation result between the end frames Fi and Fi-N and passes it to the output unit 8 to produce the output information, and the search unit 6 acquires and holds the information on the region Ri at the current time ti for the motion estimation that continues thereafter; the flow then returns to step S31. In step S37, when the motion estimation between the end frames Fi and Fi-N in steps S33 and S34 succeeded, the estimation unit 7 adopts that single motion as the motion estimation result M(i,i-N); when step S37 was reached via steps S35 and S36, it combines the multiple motions tracked recursively in step S35 to obtain the motion estimation result M(i,i-N).

Embodiments of step S35 are described below. In one embodiment, the sequential processing described with FIG. 3 (the loop of S4, S5, S6, S7, S70, S8, and S9) can be adopted as is as the procedure that realizes step S35. In that case, the time update of step S4 is simply read as setting the next adjacent frame Fi to be tracked, and the determination of step S9 is read as determining whether all N+1 frames have been processed.

In the embodiment that reuses the sequential processing of FIG. 3 as is, all N+1 frames are used as motion search targets (as long as no search fails). In another embodiment, only some of the frames may be used as motion search targets by following a predetermined procedure. For the case of searching for the motion M(0,N) over the N+1 frames F0, F1, …, FN corresponding to the second frame rate, one example of such a procedure is as follows.
(Procedure 1) Set the variable i to the initial value i=1 and go to Procedure 2.
(Procedure 2) Search for the motion M(0,i); on success go to Procedure 3, and on failure go to Procedure 4.
(Procedure 3) Search for the motion M(i,N); on success finish, and on failure go to Procedure 4.
(Procedure 4) Increment the variable i by 1 and return to Procedure 2.

In Procedure 2, if one or more motions M(0,i-j) (j≥1) have already been found successfully, M(i-j,i) may be searched for instead; the candidates are tried starting from the smallest j, and if any of them succeeds, Procedure 2 may be judged to have succeeded. If the value of i reaches N in Procedure 4, the search by the above procedure is judged to have failed.
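Procedures 1 to 4 can be sketched as a simple loop. The sketch covers only the basic form, without the optional M(i-j,i) variant of Procedure 2; the function name `two_hop_search` and the callback `search_ok(src, dst)` are hypothetical stand-ins for the motion search.

```python
def two_hop_search(n, search_ok):
    """Find an intermediate frame i such that M(0,i) and M(i,n) are both found.

    Returns [(0, i), (i, n)] on success, or None when i reaches n.
    """
    i = 1                                   # Procedure 1
    while i < n:
        # Procedure 2, then Procedure 3 (evaluated only if Procedure 2 succeeded)
        if search_ok(0, i) and search_ok(i, n):
            return [(0, i), (i, n)]
        i += 1                              # Procedure 4
    return None                             # i reached N: the search failed
```

Because `and` short-circuits, the search for M(i,N) is attempted only after M(0,i) has been found, as in the procedure.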

The specific procedure is not limited to the above. Any procedure may be used that can, when successful, find a sequence of motions M(0,i1), M(i1,i2), M(i2,i3), …, M(im,N) (0<i1<i2<i3<…<im<N). The estimation unit 7 can then obtain the motion M(0,N) by combining them.

For example, the initial value in Procedure 1 may be N-1, with the variable i decremented by 1 in Procedure 4. The initial value in Procedure 1 and the increment or decrement in Procedure 4 may each be a predetermined value other than 1 (for example, a predetermined value that depends on N). Alternatively, a procedure may be used that recursively splits the N+1 frames in half and tests whether a motion search succeeds between the two ends of each part; similarly, the N+1 frames may be split recursively at the position given by the ratio "a:(1-a)" (where 0<a<1). In such a recursive search, intervals for which the search has succeeded are excluded from subsequent searches.
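The recursive halving variant might look like the following sketch. It is one possible reading, not a definitive implementation: the function name `bisect_search` and the callback `search_ok(src, dst)` are hypothetical, and the split point is the integer midpoint.

```python
def bisect_search(a, b, search_ok):
    """Cover frames a..b with a chain of successful (src, dst) searches.

    Tries the two end frames first; on failure, splits the interval in half
    and recurses. Returns the chain, or None if some part cannot be covered.
    """
    if search_ok(a, b):
        return [(a, b)]          # success: this interval needs no further search
    if b - a <= 1:
        return None              # adjacent frames failed: nothing left to split
    mid = (a + b) // 2
    left = bisect_search(a, mid, search_ok)
    if left is None:
        return None
    right = bisect_search(mid, b, search_ok)
    return None if right is None else left + right
```

Replacing the midpoint with `a + max(1, round(ratio * (b - a)))`, clamped below `b`, would give the "a:(1-a)" split; that detail is left out here to keep the sketch short.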

As described above, according to the present invention, combining multiple motions calculated between captured images Fi makes it possible to estimate the motion over the update interval of the output unit 8 with high accuracy. When the motion estimated in this way is used to realize, for example, an augmented reality display, the relative positional relationship between the camera of the imaging unit 1 and the imaged target can also be estimated with high accuracy, which enables a highly accurate augmented reality display and the like. The present invention achieves highly accurate motion estimation from the following viewpoints.

(Viewpoint 1) Speedup from the smaller motion between captured images Fi that a shorter imaging interval brings. For example, with an imaging interval of 1/240 second, the motion between captured images Fi is reduced to 1/8 of that at an imaging interval of 1/30 second, so controlling the motion search range according to the imaging interval yields a speedup. In general, when the imaging interval becomes 1/N, the number of intervals between captured images increases N-fold because there are more captured images, but the motion itself becomes 1/N, so the motion search range can be reduced to (1/N)*(1/N). The number of search operations for the whole process therefore falls to as little as 1/N, so shortening the imaging interval leads to a speedup.
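The arithmetic behind the 1/N figure can be checked with a few lines. This is only an illustration of the counting argument; the function name is hypothetical, and it assumes the search cost is proportional to the number of frame pairs times the area of a two-dimensional search window.

```python
def relative_search_cost(n):
    """Search cost at 1/n of the imaging interval, relative to the original interval."""
    pairs = n                  # n times as many frame pairs per output period
    window = (1 / n) ** 2      # window shrinks by 1/n along each of two axes
    return pairs * window

# 1/240 s versus 1/30 s imaging interval corresponds to n = 8.
cost_240_vs_30 = relative_search_cost(8)
```

For n = 8 the product is 8 × (1/64), that is, 1/8 of the original search cost.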

(Viewpoint 2) Higher accuracy because the motion between captured images Fi is better approximated linearly at a shorter imaging interval. When motion other than translation occurs, or the appearance of a three-dimensional shape changes, methods that assume translation, such as template matching, are more likely to approximate the motion well as the imaging interval becomes shorter, which improves the accuracy of the motion search.

(Viewpoint 3) Higher accuracy from the smaller motion between captured images Fi at a shorter imaging interval. When large motion is present, a shorter imaging interval means a smaller motion per frame, so the search is more likely to be able to continue, which improves the accuracy of the motion search. Conversely, when small motion is present, the motion between captured images Fi is suppressed, so shortening the imaging interval leads to stabilization. For example, when the motion is 1 pixel per 1/30 second, the motion per 1/240-second interval averages 1/8 pixel, and the motion between each pair of captured images may be estimated as 0. As a result, the output motion is also estimated as 0. In augmented reality applications, however, a motion of about 1 pixel is often a disturbance such as camera shake while stationary, so shortening the imaging interval helps stabilize the output.
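The stabilizing effect in the 1-pixel example can be illustrated numerically. The sketch assumes, purely for illustration, a motion search with integer-pixel precision that rounds each per-frame motion to the nearest pixel, and a motion spread evenly over the period; the function name is hypothetical.

```python
def accumulated_integer_motion(pixels_per_output_frame, n):
    """Sum of per-frame motions when each is rounded to whole pixels.

    pixels_per_output_frame: true motion over one second-frame-rate period.
    n: number of first-frame-rate intervals in that period.
    """
    per_step = pixels_per_output_frame / n
    estimates = [round(per_step) for _ in range(n)]  # integer-pixel search (assumed)
    return sum(estimates)

jitter = accumulated_integer_motion(1.0, 8)    # 1/8 pixel per step rounds to 0
real_move = accumulated_integer_motion(16.0, 8)  # 2 pixels per step is preserved
```

Under these assumptions a 1-pixel jitter is suppressed entirely, while a larger motion of 16 pixels per period is still recovered in full.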

(Viewpoint 4) Robustness against disturbances such as fluorescent lamp flicker and flashes. The motion search results are used only by the estimation unit 7 to estimate the output motion, so even if the search fails for some of the motions between captured images, it suffices that enough motions are found to estimate the output motion. This contributes to the stability of the system as a whole.

Supplementary matters, such as modifications of the present invention, are described below.

(1) In relation to Viewpoint 1 above, when the search unit 6 obtains a motion M(i,i+k), it may set a wider search range for template matching or the like as the interval k≥1 over which the motion is obtained becomes larger. FIG. 5 shows a schematic example of this setting. Suppose the region R10 in frame F10 at time t10 is the search-source region. To obtain the motion M(10,11) in frame F11 at time t11 (k=1), the region SR11, obtained by expanding the region R10 (whose same position in frame F11 is the region T11) to, for example, 110%, is set as the search range. To obtain the motion M(10,12) in frame F12 at time t12 (k=2), the region SR12, obtained by expanding the region R10 (whose same position in frame F12 is the region T12) to, for example, 120%, may be set as the search range.

In the example of FIG. 5, the regions T11 and T12, which form the basis of the search ranges, are centered at the same position as the search source R10, but the center may instead be set to the displaced position predicted from the motion estimated so far.
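A search range that widens with the frame interval k could be computed as in the following sketch. The linear growth of 10% per frame is an assumption extrapolated from the two example values (110% for k=1, 120% for k=2), as is applying the percentage to the side lengths; the function name and the (center x, center y, width, height) region format are hypothetical.

```python
def search_range(region, k, growth_per_step=0.10, predicted_shift=(0.0, 0.0)):
    """Expand a search-source region into a search range for frame interval k.

    region: (cx, cy, width, height) of the search source, e.g. R10.
    k: number of first-frame-rate intervals between source and target frame.
    predicted_shift: optional displacement of the center predicted from past motion.
    """
    cx, cy, w, h = region
    scale = 1.0 + growth_per_step * k        # 110% for k=1, 120% for k=2 (assumed linear)
    return (cx + predicted_shift[0], cy + predicted_shift[1], w * scale, h * scale)

sr11 = search_range((50.0, 40.0, 100.0, 60.0), k=1)
sr12 = search_range((50.0, 40.0, 100.0, 60.0), k=2)
```

Passing a nonzero `predicted_shift` corresponds to centering the range on the position predicted from the motion estimated so far.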

(2) Output such as display at the second frame rate requires that the processing of the present invention, such as motion estimation, be completed. Suppose frames F0, FN, F2N, … are obtained at the second-frame-rate times t0, tN, t2N, …. On the premise that the motion estimation and related processing complete within a predetermined time Δ, output such as display becomes possible at the times t0+Δ, tN+Δ, t2N+Δ, …, which in practice follow the second frame rate. The descriptions of FIG. 3 and FIG. 4 above assumed that computation completes instantly once each frame Fi is obtained at time ti, without mentioning the time the computation takes. As explained here, output such as display that maintains the second frame rate is actually possible provided that the series of computations completes within the predetermined time Δ.

(3) In relation to Viewpoint 4 above, the present invention is not limited to the motion M(i,i+1) between the (most closely) adjacent frames Fi and Fi+1. The advantage of deliberately also searching for the motion M(i,i+k) between more distant frames Fi and Fi+k (k≥2) is explained with reference to FIG. 6. FIG. 6 shows an example in which, at three consecutive times t20, t21, and t22, a pattern of roughly one pixel in size (shown hatched) moves in the (u,v)=(+1,+1) direction within the small range u,v=0,1 of the image coordinates (u,v) (a 2×2 pixel range). The pattern occupies only position (0,0) at time t20 and only position (1,1) at time t22, but at the intermediate time t21 it is spread across the whole 2×2 pixel range. Under such conditions (where the way a distribution such as an edge appears changes at pixel boundaries), the detection accuracy of the motions M(20,21) and M(21,22) for this pattern inevitably falls in principle, so they are not necessarily desirable search targets, whereas the motion M(20,22) can be detected with high accuracy. Searching for the motion M(i,i+k) between distant frames Fi and Fi+k (k≥2) secures this possibility of highly accurate detection.
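The spreading of the pattern at the intermediate time can be made concrete by computing how much of a one-pixel pattern falls on each pixel of the 2×2 range. The sketch assumes the pattern is an axis-aligned 1×1 square that has moved exactly half a pixel in each direction at the intermediate time; the function name `coverage` is hypothetical.

```python
def coverage(offset_u, offset_v):
    """Fraction of a 1x1 pattern falling on each pixel of a 2x2 grid.

    (offset_u, offset_v) is the pattern's top-left corner; pixel (u, v)
    covers the square [u, u+1) x [v, v+1).
    """
    def overlap(lo, pixel):
        # Length of the overlap between [lo, lo+1] and [pixel, pixel+1].
        return max(0.0, min(lo + 1.0, pixel + 1.0) - max(lo, float(pixel)))
    return {(u, v): overlap(offset_u, u) * overlap(offset_v, v)
            for u in (0, 1) for v in (0, 1)}

first = coverage(0.0, 0.0)   # pattern sits entirely on pixel (0,0)
middle = coverage(0.5, 0.5)  # intermediate time: a quarter on each of the four pixels
last = coverage(1.0, 1.0)    # pattern sits entirely on pixel (1,1)
```

At the first and last times the pattern is a crisp single pixel, so matching between them is well defined, whereas the intermediate frame shows only a blurred quarter-intensity contribution in each pixel.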

(4) The detection unit 4 may detect two or more targets. For example, when two targets A and B are detected, the motion estimation of the present invention described above can be performed for A and B independently (concurrently), in which case which inter-frame motions M(i,i+k) are searched for, and in what order, is also independent between A and B. The times at which A and B are detected, as preprocessing for the search, may differ. Furthermore, the detection unit 4 may detect each individual target as a whole, for example as a rectangular region, and the search unit 6 may then search separately for each part of the detected target; the order and other details of the search for the motions M(i,i+k) are then independent for each part. Each such part may be formed as a predetermined neighborhood (for example, a rectangular range of a predetermined size) of one or more of the feature points found when the detection unit 4 performed detection based on image features.

(5) The second frame rate may be set as a variable value that can be changed manually or automatically, either by accepting a manual designation or the like from the user, or according to the results of analyzing properties of the images Fi captured at the first frame rate (such as brightness or the overall amount of motion).
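How a change in the second frame rate alters the group of frames handled per unit interval can be sketched as follows. The sketch assumes that the first frame rate is an integer multiple of the second, and the values 120, 60, and 30 are illustrative only. The function name `unit_interval_groups` is hypothetical.

```python
from typing import List

def unit_interval_groups(first_fps: int, second_fps: int, num_frames: int) -> List[List[int]]:
    """Group first-rate frame indices so each group spans one second-rate interval.

    Consecutive groups share their boundary frame, because the end of one
    unit interval is the start of the next.
    """
    if second_fps <= 0 or first_fps % second_fps != 0:
        raise ValueError("assumed: first_fps is a positive integer multiple of second_fps")
    step = first_fps // second_fps   # first-rate frame intervals per unit interval
    return [list(range(start, start + step + 1))
            for start in range(0, num_frames - step, step)]

print(unit_interval_groups(120, 30, 9))  # [[0, 1, 2, 3, 4], [4, 5, 6, 7, 8]]
print(unit_interval_groups(120, 60, 9))  # [[0, 1, 2], [2, 3, 4], [4, 5, 6], [6, 7, 8]]
```

Lowering the second frame rate lengthens each group, which gives the search more candidate frame pairs (i, i+k) per output motion.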

(6) The information processing apparatus 10 can be realized as a computer of general configuration. That is, the information processing apparatus 10 can be configured by a general computer comprising a CPU (central processing unit), a main storage device that provides a work area for the CPU, an auxiliary storage device that may consist of a hard disk, SSD, or the like, an input interface that receives input from the user such as a keyboard, mouse, or touch panel, a communication interface for connecting to a network and communicating, a display, a camera, and a bus connecting these. Furthermore, the processing of each unit of the information processing apparatus 10 shown in FIG. 1 can be realized by the CPU reading and executing a program that causes that processing to be executed, although any part of the processing may instead be implemented in a separate dedicated circuit or the like. The imaging unit 1 can be realized by the camera as that hardware.

DESCRIPTION OF SYMBOLS: 10 ... information processing apparatus, 1 ... imaging unit, 2 ... buffer unit, 3 ... switch unit, 4 ... detection unit, 5 ... storage unit, 6 ... search unit, 7 ... estimation unit, 8 ... output unit

Claims

1. An information processing apparatus comprising: a search unit that, for a plurality of frames that are obtained by imaging at a first frame rate and are arranged on the time axis so as to span a unit interval of a second frame rate lower than the first frame rate, searches for the motion of a target in the frames between all or some of the frames; and
an estimation unit that combines the searched motions to obtain the motion at the second frame rate.

2. The information processing apparatus according to claim 1, further comprising an output unit that performs output at the second frame rate based on the combined motion.

3. The information processing apparatus according to claim 2, wherein the output unit performs the output at the second frame rate by display.

4. The information processing apparatus as described, wherein, when performing display at the second frame rate, the output unit further displays, at the second frame rate, the frames obtained by imaging at the first frame rate.

5. The information processing apparatus according to claim 1, wherein, when searching for the motion of the target in the frames between the frames, the search unit performs the search within a search range corresponding to the interval between the frames.

6. The information processing apparatus according to claim 1, wherein the search unit searches for the motion of the target in the frames between the frames by region tracking.

7. The information processing apparatus according to claim 1, wherein the search unit first searches for the motion of the target between the frames at both ends of the plurality of frames,
when the first search succeeds, the estimation unit adopts the successfully searched motion as the motion at the second frame rate, and
when the first search fails, the search unit searches for the motion of the target in the frames between the frames at both ends of the plurality of frames and frames other than those at both ends, and the estimation unit combines the searched motions to obtain the motion at the second frame rate.

8. The information processing apparatus according to any one of claims 1 to 7, further comprising a detection unit that detects a target, based on feature information, from a frame obtained by imaging at the first frame rate,
wherein, for frames after the frame in which the target was detected, the search unit searches for the motion of the target in the frames between the frames based on the detection result.

9. The information processing apparatus according to claim 2, further comprising a detection unit that detects a target, based on feature information, from a frame obtained by imaging at the first frame rate,
wherein, for frames after the frame in which the target was detected, the search unit searches for the motion of the target in the frames between the frames based on the detection result, and
the output unit outputs, at the second frame rate, incidental information corresponding to the detected target.

10. An information processing method comprising: a search step of, for a plurality of frames that are obtained by imaging at a first frame rate and are arranged on the time axis so as to span a unit interval of a second frame rate lower than the first frame rate, searching for the motion of a target in the frames between all or some of the frames; and
an estimation step of combining the searched motions to obtain the motion at the second frame rate.

11. A program that causes a computer to function as the information processing apparatus according to claim 1.