JP2008250999A

JP2008250999A - Object tracing method, object tracing device and object tracing program

Info

Publication number: JP2008250999A
Application number: JP2008057801A
Authority: JP
Inventors: Hiroshi Saito; 宏斉藤; Naoteru Maeda; 直輝前田
Original assignee: Omron Corp; Omron Tateisi Electronics Co
Current assignee: Omron Corp
Priority date: 2007-03-08
Filing date: 2008-03-07
Publication date: 2008-10-16
Anticipated expiration: 2028-03-07
Also published as: JP5035035B2

Abstract

PROBLEM TO BE SOLVED: To provide an object tracing device with high tracing accuracy in object tracing processing that acquires a trajectory.

SOLUTION: The object tracing device determines the trajectory of a tracing object in a moving image by repeatedly acquiring position candidates for the tracing object in the current frame that correspond to its position candidates in the preceding frame. The device acquires one or more position candidates in the current frame for each position candidate in the preceding frame. When the preceding frame has a plurality of position candidates, the device acquires the current-frame position candidates corresponding to each of them, thereby obtaining a plurality of trajectory candidates for the tracing object, and selects a plausible trajectory from among them as the tracing result. The plausible trajectory is preferably selected on the basis of the reliability of each position candidate.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a technique for tracking a tracking target in a moving image.

Techniques for tracking a human face or the like in a moving image have long been in use. Such tracking techniques fall into two approaches. One performs face detection on every frame of the moving image and obtains a tracking result by linking the detection history. The other uses the tracking result of the previous frame to select one face-like region in the current frame, thereby obtaining a tracking result from the initial frame to the current frame.

The approach that uses the tracking result of the previous frame to acquire the face position in the current frame can limit the area searched for the face in the current frame. It can therefore track the face faster and more reliably than the approach that performs face detection every time.

However, this approach has the problem that once tracking goes wrong, it continues to go wrong in subsequent frames. For example, when the face being tracked crosses an object that resembles a face and the face-like object is recognized as the tracking target, that object is tracked in subsequent frames, which leads to a malfunction. This malfunction is called "noriutsuri" (the tracker transferring to another object).

For example, consider the case where a human face crosses a vehicle wheel, as shown in FIG. 15. In this example, while the face detected in FIG. 15(a) is being tracked, the face and the wheel overlap in FIG. 15(d). If the wheel is then misrecognized as the face in FIG. 15(e), the wheel is tracked in the frames from FIG. 15(f) onward. As a result, as shown in FIG. 16, a route different from that of the actual face is determined to be the trajectory of the face.

To solve this problem, a method has been proposed that detects feature points in each frame of a moving image, enumerates multiple time-series associations of these feature points, and takes the most reasonable association among them as the tracking result (Patent Document 1). Here, the association of feature points is determined in consideration of the similarity of the feature quantities at the corresponding feature points, whether the movement path (trajectory) is consistent, and so on. With this method, temporally suitable feature point tracking can be achieved without being affected by interruptions in feature point association or temporary mismatches that occur partway through the moving image.
Patent Document 1: JP 2004-5089 A

However, the method described in Patent Document 1 performs feature point detection over the entire image area of every frame. It therefore cannot fully obtain the benefit of the approach that uses the tracking result up to the previous frame to obtain the tracking result up to the current frame.

An object of the present invention is to improve tracking accuracy in object tracking processing that acquires the trajectory of a tracking target by using the tracking result up to the previous frame to acquire the position of the target in the current frame.

To achieve the above object, the object tracking method according to the present invention determines the trajectory of a tracking target in a moving image by repeating a step of acquiring position candidates of the tracking target in the current frame that correspond to position candidates of the tracking target in the previous frame. The method is characterized by acquiring one or more position candidates in the current frame corresponding to one position candidate in the previous frame; when the previous frame has a plurality of position candidates, acquiring the corresponding current-frame position candidates for each of them, thereby acquiring a plurality of trajectory candidates for the tracking target; and determining the trajectory of the tracking target from among the acquired trajectory candidates.

That is, the object tracking method according to the present invention does not limit the trajectory of the tracking target to one. It allows the trajectory to branch partway, acquires a plurality of trajectory candidates, and selects the trajectory of the tracking target from among them.

With this configuration, when the tracking target crosses an object that resembles it (hereinafter, a similar object), the trajectory candidate is branched and both branches are tracked. Because the trajectory of the target is determined after both trajectory candidates have been tracked, erroneous detection caused by the tracker transferring to the similar object ("noriutsuri") can be prevented even when the tracking target and a similar object cross.

The object tracking method according to the present invention preferably calculates a reliability, which is the likelihood that the tracking target exists in a predetermined region of the current frame corresponding to a position candidate of the previous frame, and acquires that region as a position candidate of the tracking target in the current frame when the calculated reliability is equal to or greater than a predetermined threshold.

Here, the reliability can be obtained as a detection score from template matching. The reliability may also be calculated (corrected) by performing behavior prediction based on the movement direction of the tracking target up to the previous frame. The reliability may also be calculated using a color histogram, a matching technique, or the like. In short, the reliability may be calculated by any existing technique that obtains a position candidate in the current frame based on a position candidate in the previous frame.
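The thresholding described above can be sketched as follows. This is only an illustration: the function name `select_candidates`, the (x, y) representation of a region, and the threshold value are hypothetical, and the reliability scores are assumed to have been produced already by some matcher.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # hypothetical representation: (x, y) of a search region


def select_candidates(scored: List[Tuple[Region, float]], threshold: float) -> List[Region]:
    """Keep every region whose reliability is >= threshold (zero, one, or many)."""
    return [region for region, reliability in scored if reliability >= threshold]


# Made-up scores for three search regions around one previous-frame candidate.
scored_regions = [((10, 20), 0.91), ((12, 20), 0.85), ((40, 5), 0.30)]
candidates = select_candidates(scored_regions, threshold=0.8)
```

Because every region that passes the threshold is kept, one previous-frame candidate can yield several current-frame candidates, which is what lets a trajectory branch.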

With this configuration, position candidates of the tracking target in the current frame can be acquired quickly and reliably. In addition, when the tracking target crosses a similar object and the trajectory candidate branches, it later becomes clear that the branch toward the similar object was wrong (that is, no region resembling the tracking target is found in its vicinity), so tracking of the similar object can be terminated.

The object tracking method according to the present invention preferably stores the reliability of each position candidate of the tracking target in each frame and determines the trajectory of the tracking target based on the reliabilities associated with the trajectory candidates.

The trajectory may be determined by any specific method as long as the determination is based on the reliabilities associated with the trajectory candidates. For example, the minimum reliability of each trajectory candidate (the reliability varies from frame to frame) can be obtained, and the trajectory candidate with the largest minimum can be determined as the trajectory of the tracking target. Alternatively, the average or cumulative reliability of each trajectory candidate can be obtained, and the trajectory candidate with a large value can be determined as the trajectory of the tracking target. Trajectory candidates whose reliability has fallen to or below a predetermined threshold even once may be excluded, or the selection may be made from among trajectory candidates whose reliability has reached or exceeded a predetermined threshold. The trajectory may also be determined based on the maximum, variance, standard deviation, or the like of the reliability of each trajectory candidate. The methods described here may also be used in combination.
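A minimal sketch of some of these selection rules is given below. The function `choose_trajectory`, its parameters, and the sample reliabilities are hypothetical; the sketch only shows how the largest-minimum rule, the average rule, and the exclusion threshold might be combined.

```python
from statistics import mean
from typing import Callable, Dict, List, Optional


def choose_trajectory(
    reliabilities: Dict[str, List[float]],
    score: Callable[[List[float]], float] = min,
    floor: Optional[float] = None,
) -> Optional[str]:
    """Pick the trajectory candidate with the best score.

    score: summary of the per-frame reliabilities (min, mean, sum, max, ...).
    floor: if given, drop candidates whose reliability ever fell to or below it.
    """
    eligible = {
        tid: confs
        for tid, confs in reliabilities.items()
        if floor is None or min(confs) > floor
    }
    if not eligible:
        return None
    return max(eligible, key=lambda tid: score(eligible[tid]))


# Made-up per-frame reliabilities for two trajectory candidates.
candidates = {"A": [0.9, 0.4, 0.8], "B": [0.7, 0.6, 0.7]}
by_min = choose_trajectory(candidates)                # largest-minimum rule
by_mean = choose_trajectory(candidates, score=mean)   # average rule
```

With these sample values the two rules disagree: candidate A has the higher average but dips lower in one frame, so the largest-minimum rule prefers B. Which rule suits an application is a design choice the text leaves open.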

With this configuration, even when the tracker transfers to a similar object ("noriutsuri") and the trajectory candidate branches, the reliability of the trajectory candidate that tracks the similar object decreases over time, so the correct trajectory can be selected from among the plurality of trajectory candidates.

In the object tracking method according to the present invention, when a plurality of position candidates in the previous frame correspond to the same position candidate in the current frame, it is preferable that only one of these position candidates be tracked in subsequent frames.

With this configuration, when trajectory candidates that have branched merge again, it is no longer necessary to track a plurality of position candidates, so processing is faster.
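The merging of candidates that land on the same position might look like the following sketch. The text does not say which of the merged candidates is kept; keeping the one with the highest reliability is an assumption made here, and `merge_duplicates` and the tuple layout are hypothetical.

```python
from typing import Dict, List, Tuple

Candidate = Tuple[int, int, float]  # hypothetical layout: (x, y, reliability)


def merge_duplicates(candidates: List[Candidate]) -> List[Candidate]:
    """Collapse candidates sharing the same (x, y) into one.

    Assumption: the candidate with the highest reliability survives.
    """
    best: Dict[Tuple[int, int], Candidate] = {}
    for x, y, conf in candidates:
        key = (x, y)
        if key not in best or conf > best[key][2]:
            best[key] = (x, y, conf)
    return list(best.values())


# Two branches arrive at (5, 5) in the current frame; a third is elsewhere.
merged = merge_duplicates([(5, 5, 0.7), (5, 5, 0.9), (8, 5, 0.6)])
```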

In the object tracking method according to the present invention, it is preferable that an upper limit be set on the number of position candidates to be tracked and that, when the number of position candidates acquired in the current frame exceeds the upper limit, the position candidates to be tracked in subsequent frames be selected based on the reliabilities of the acquired position candidates.

With this configuration, narrowing down the number of trajectory candidates to be tracked can be expected to speed up processing.
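One way to apply such an upper limit is to keep only the most reliable candidates, as in the sketch below. The function name `prune`, the tuple layout, and the choice of keeping the top entries by reliability are assumptions; the text only requires that the selection be based on reliability.

```python
import heapq
from typing import List, Tuple

Candidate = Tuple[int, int, float]  # hypothetical layout: (x, y, reliability)


def prune(candidates: List[Candidate], limit: int) -> List[Candidate]:
    """Return at most `limit` candidates, preferring higher reliability."""
    if len(candidates) <= limit:
        return list(candidates)
    # heapq.nlargest returns the entries in descending order of the key.
    return heapq.nlargest(limit, candidates, key=lambda c: c[2])


kept = prune([(0, 0, 0.5), (1, 0, 0.9), (2, 0, 0.7)], limit=2)
```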

The object tracking method according to the present invention preferably detects a tracking target from the moving image and tracks the detected tracking target in subsequent frames.

With this configuration, a tracking target that appears partway through the moving image can be tracked. A plurality of tracking targets can also be tracked simultaneously.

In the object tracking method according to the present invention, when the position of a detected tracking target is equal to a current-frame position candidate of a tracking target that is already being tracked, it is preferable to regard the detected tracking target as the one already being tracked and not to start new tracking.

With this configuration, when tracking targets are detected from the moving image and then tracked, a target that is already being tracked is not tracked a second time even if it is detected again, so processing becomes more efficient.
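The duplicate check can be sketched as below. The text speaks of positions being equal; the `tolerance` parameter, which defaults to exact equality, is a hypothetical extension, as are the function name and the (x, y) representation.

```python
from typing import Iterable, Tuple

Position = Tuple[int, int]  # hypothetical representation: (x, y)


def is_already_tracked(
    detection: Position, tracked: Iterable[Position], tolerance: int = 0
) -> bool:
    """True if the detection coincides with a current-frame candidate of a tracked target.

    tolerance=0 means exact equality, as in the text; larger values are an assumption.
    """
    dx, dy = detection
    return any(
        abs(dx - tx) <= tolerance and abs(dy - ty) <= tolerance
        for tx, ty in tracked
    )


current_candidates = [(100, 80), (104, 80)]
start_new_track = not is_already_tracked((200, 40), current_candidates)
```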

The object tracking method according to the present invention can track various objects as targets. For example, a human face can be used as the tracking target. Other parts of a person (the head, eyes, or nose) or the whole body can also be used. The tracking target may be any object.

The present invention can also be regarded as an object tracking method including at least part of the above processing, or as a program for implementing such a method. The present invention can also be regarded as an object tracking device that executes the above method. The above means and processes can be combined with one another wherever possible to constitute the present invention.

For example, an object tracking device as one aspect of the present invention comprises: moving image input means for receiving input of a moving image; trajectory candidate acquisition means for acquiring a plurality of trajectory candidates of a tracking target by using position candidate search means; and trajectory determination means for determining the trajectory of the tracking target from among the acquired trajectory candidates. The position candidate search means acquires position candidates of the tracking target in the current frame corresponding to position candidates of the tracking target in the previous frame; it acquires one or more current-frame position candidates for one position candidate in the previous frame and, when the previous frame has a plurality of position candidates, acquires the corresponding current-frame position candidates for each of them.

In the object tracking device according to the present invention, the position candidate search means preferably calculates a reliability, which is the likelihood that the tracking target exists in a predetermined region of the current frame corresponding to a position candidate of the previous frame, and acquires the region as a position candidate of the tracking target in the current frame when the calculated reliability is equal to or greater than a predetermined threshold.

In the object tracking device according to the present invention, it is preferable that the trajectory candidate acquisition means store the reliability of each position candidate of the tracking target in each frame and that the trajectory determination means determine the trajectory of the tracking target based on the reliabilities associated with the trajectory candidates.

In the object tracking device according to the present invention, the tracking target is preferably a human face.

An object tracking program as one aspect of the present invention is a program for determining a tracking target in a moving image by causing an information processing device to repeat a step of acquiring position candidates of the tracking target in the current frame corresponding to position candidates of the tracking target in the previous frame. The program is characterized by causing the information processing device to acquire one or more current-frame position candidates corresponding to one position candidate in the previous frame; when the previous frame has a plurality of position candidates, to acquire the corresponding current-frame position candidates for each of them, thereby acquiring a plurality of trajectory candidates for the tracking target; and to determine the trajectory of the tracking target from among the acquired trajectory candidates.

The object tracking program according to the present invention preferably causes the information processing device to calculate a reliability, which is the likelihood that the tracking target exists in a predetermined region of the current frame corresponding to a position candidate of the previous frame, and to acquire the region as a position candidate of the tracking target in the current frame when the calculated reliability is equal to or greater than a predetermined threshold.

The object tracking program according to the present invention preferably causes the information processing device to store the reliability of each position candidate of the tracking target in each frame and to determine the trajectory of the tracking target based on the reliabilities associated with the trajectory candidates.

In the object tracking program according to the present invention, the tracking target is preferably a human face.

According to the present invention, tracking accuracy can be improved in object tracking processing that determines the trajectory of a tracking target by using the tracking result up to the previous frame to acquire the position of the target in the current frame.

Preferred embodiments of the present invention are described in detail below by way of example with reference to the drawings. The following description uses as an example a face tracking device whose tracking target is a human face, but the tracking target may be something other than a human face.

(First embodiment)
The face tracking device according to the first embodiment performs tracking processing on all frames of an input moving image and then obtains the movement trajectory of the face.

<Configuration>
In terms of hardware, the face tracking device 1 according to this embodiment is an ordinary computer (information processing device) comprising a CPU (central processing unit), a main storage device (RAM), an auxiliary storage device, and the like connected via a bus. In this case, the face tracking device 1 is realized by the CPU executing a program.

FIG. 1 is a diagram showing an example of the functional blocks of the face tracking device 1. Various programs (OS, applications, etc.) stored in the auxiliary storage device are loaded into the main storage device and executed by the CPU, whereby the face tracking device 1 functions as a moving image input unit 2, a face detection unit 3, a trajectory candidate acquisition unit 4, a position information DB 5, a trajectory information DB 6, a trajectory determination unit 7, and an image composition/output unit 8. All or part of the face tracking device 1 may instead be implemented as a dedicated chip.

Next, each functional unit of the face tracking device 1 is described.

The moving image input unit 2 functions as an interface for inputting moving image data to the face tracking device 1. The moving image input unit 2 may be configured using any existing technique for inputting moving image data to the face tracking device 1.

For example, face image data may be input to the face tracking device 1 via a network (a LAN or the Internet). A moving image may also be input to the face tracking device 1 from a digital video camera or a recording device (for example, a hard disk drive, CD-ROM, DVD-ROM, or any of various flash memories). The face tracking device 1 may also be built into an imaging device such as a digital video camera, or into any of various devices equipped with such an imaging device, so that the captured moving image is input to the face tracking device 1.

The face detection unit 3 detects a face in one frame of the input moving image. The face detection unit 3 may be configured, for example, to detect a face by template matching using a reference template corresponding to the outline of the whole face. It may also be configured to detect a face by template matching based on facial components (eyes, nose, ears, etc.). It may also be configured to be trained with teacher signals using a neural network and to detect whether a face is present in the search region. The face detection unit 3 may be realized by applying any other existing technique.

A face detected by the face detection unit 3 is input to the trajectory candidate acquisition unit 4 and becomes a tracking target. Hereinafter, a face detected by the face detection unit 3 is called a "basic face".

The trajectory candidate acquisition unit 4 includes a position candidate search unit 41. The position candidate search unit 41 acquires face position candidates in the current frame that correspond to a face position candidate in the previous frame, using a statistical pattern recognition technique such as template matching. A plurality of search regions are set in advance in the area surrounding the face position candidate of the previous frame, and the position candidate search unit 41 acquires the face position candidates in the current frame from among these regions by the statistical pattern recognition technique.

As shown in FIG. 2, a plurality of search regions As are defined in advance at and around the face position candidate Ac of the previous frame. The number and arrangement of the search regions As are arbitrary. The position candidate search unit 41 uses the image within the face position candidate Ac of the previous frame as a template image vector μ, calculates the distance d between the image vector x of a search region in the current frame and the template image vector μ, and determines whether a face is present in the search region As. The distance d between the search image vector x and the template image vector μ is calculated according to the following equation.

A search region for which the distance d between the search region image vector x and the template image vector μ of the candidate region in the previous frame is smaller than a predetermined threshold is taken as a candidate region in the current frame.

The smaller the distance between the search image vector x and the template image vector μ, the higher the probability that the search image matches the template. In other words, the distance d between the search image vector x and the template image vector μ can be regarded as a measure of the similarity between the search image and the template, or of the probability (reliability) that they match, with a smaller distance indicating a higher value.

In the above description, the image of the previous frame is used as the template image vector μ, but an image from a frame other than the immediately preceding one (for example, the image in which the face was first detected), an average image of human faces, or the like may be used instead.

The position candidate search unit 41 is configured so that it can acquire two or more position candidates for one position candidate in the previous frame. That is, when there are a plurality of search regions for which the distance between the search image vector x and the template image vector μ is equal to or less than the threshold (or for which the similarity or reliability is equal to or greater than the threshold), all of those search regions are acquired as position information in the current frame. When the previous frame has a plurality of position candidates, the corresponding position candidates are acquired from the current frame for each of them.
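The search over predefined regions can be sketched as follows. The distance equation is not reproduced in this text, so the Euclidean distance used here is an assumption, as are the names `euclidean` and `search_candidates`, the flattened-pixel representation, and the sample values.

```python
import math
from typing import Dict, List, Sequence, Tuple

Position = Tuple[int, int]  # hypothetical representation: (x, y) of a search region


def euclidean(x: Sequence[float], mu: Sequence[float]) -> float:
    """Distance between a search image vector x and the template vector mu.

    Assumption: plain Euclidean distance; the source's own formula is not shown.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, mu)))


def search_candidates(
    template: Sequence[float],
    regions: Dict[Position, Sequence[float]],
    max_distance: float,
) -> List[Position]:
    """Return every search region whose distance to the template is <= max_distance."""
    return [
        pos for pos, pixels in regions.items()
        if euclidean(pixels, template) <= max_distance
    ]


# Toy 3-pixel "images": two regions resemble the template, one does not.
template = [1.0, 2.0, 3.0]
regions = {(0, 0): [1.0, 2.0, 3.0], (1, 0): [1.0, 2.0, 4.0], (5, 5): [9.0, 9.0, 9.0]}
found = search_candidates(template, regions, max_distance=1.5)
```

Returning a list rather than a single best match is the point of the sketch: both nearby regions survive and each becomes a position candidate.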

The trajectory candidate acquisition unit 4 applies the face tracking performed by the position candidate search unit 41 to each frame of the input moving image, and can thereby acquire the trajectory of the face detected by the face detection unit 3. Because the position candidate search unit 41 may acquire a plurality of position candidates in the current frame, the trajectory candidate acquisition unit 4 acquires a plurality of trajectory candidates as the trajectory of the face.
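The frame-by-frame branching can be illustrated with the sketch below. The function `extend_trajectories` and the toy search function are hypothetical. Dropping a path that finds no next candidate is also an assumption; the text only says elsewhere that tracking of a similar object can be terminated.

```python
from typing import Callable, List, Tuple

Position = Tuple[int, int]
Trajectory = List[Position]


def extend_trajectories(
    trajectories: List[Trajectory],
    find_next: Callable[[Position], List[Position]],
) -> List[Trajectory]:
    """Advance every trajectory candidate by one frame.

    A path branches when find_next returns several positions and is
    dropped (assumption) when it returns none.
    """
    extended: List[Trajectory] = []
    for path in trajectories:
        for pos in find_next(path[-1]):
            extended.append(path + [pos])
    return extended


def toy_find_next(pos: Position) -> List[Position]:
    """Stand-in for the position candidate search; branches once, at x == 1."""
    x, y = pos
    if x == 1:
        return [(x + 1, y), (x + 1, y + 1)]
    return [(x + 1, y)]


paths = [[(0, 0)]]             # one basic face detected in the first frame
for _ in range(2):             # process two further frames
    paths = extend_trajectories(paths, toy_find_next)
```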

The position information DB 5 stores information on the face position candidates acquired by the position candidate search unit 41 (see, for example, FIG. 8(b)). For each position candidate, the position information DB 5 stores a position candidate ID 51, a frame number 52, an X coordinate 53, a Y coordinate 54, and a reliability 55. The position candidate ID 51 is an identifier that identifies the position candidate. The frame number 52 indicates the frame to which the position candidate belongs. The X coordinate 53 and the Y coordinate 54 indicate the position of the position candidate within the frame. The reliability 55 indicates the likelihood that the position candidate is a face (the likelihood that a face exists in the region of the position candidate).

The trajectory information DB 6 stores information on the face trajectory candidates acquired by the trajectory candidate acquisition unit 4 (see, for example, FIG. 8(c)). The trajectory information DB 6 stores a trajectory ID 61, a basic face ID 62, and a path 63. The trajectory ID 61 is an identifier that identifies the trajectory candidate. The basic face ID 62 is an identifier that identifies the basic face at the starting point of the trajectory. The path 63 indicates the position of the trajectory in each frame and is stored as a sequence of the position candidate IDs of the position candidates held in the position information DB 5.
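The two stores described above could be modeled with records such as the following. The field names mirror the listed items, but the class names, types, the dictionary used as the DB, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PositionCandidate:
    candidate_id: int   # position candidate ID 51
    frame: int          # frame number 52
    x: int              # X coordinate 53
    y: int              # Y coordinate 54
    reliability: float  # reliability 55


@dataclass
class TrajectoryCandidate:
    trajectory_id: int   # trajectory ID 61
    basic_face_id: int   # basic face ID 62
    # path 63: position candidate IDs in frame order
    path: List[int] = field(default_factory=list)


def path_reliabilities(
    traj: TrajectoryCandidate, position_db: Dict[int, PositionCandidate]
) -> List[float]:
    """Look up the reliability of each position candidate along a path."""
    return [position_db[cid].reliability for cid in traj.path]


# Made-up rows: one candidate in frame 0 that branches into two in frame 1.
position_db = {
    1: PositionCandidate(1, 0, 100, 80, 0.95),
    2: PositionCandidate(2, 1, 104, 80, 0.90),
    3: PositionCandidate(3, 1, 110, 86, 0.60),
}
route_a = TrajectoryCandidate(trajectory_id=1, basic_face_id=1, path=[1, 2])
route_b = TrajectoryCandidate(trajectory_id=2, basic_face_id=1, path=[1, 3])
```

Storing only candidate IDs in the path means two branches that share early frames reference the same position rows instead of duplicating them.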

The trajectory determination unit 7 judges which of the plurality of trajectory candidates acquired by the trajectory candidate acquisition unit 4 is the most plausible and settles on a single trajectory for the face. The choice among the trajectory candidates is made on the basis of the reliability obtained at each search.

For example, the trajectory candidate whose reliability, averaged over the searches, is highest can be selected as the tracking result. That is, letting conf_i be the reliability at time i, the route that maximizes the average value Ave_conf given by the following equation is found.
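
Assuming that Ave_conf is the ordinary arithmetic mean of the per-search reliabilities of one route, it can be written as follows. The symbol N (the number of searches along the route) and the index range 1 to N are assumptions introduced here for illustration.

```latex
\mathrm{Ave\_conf} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{conf}_i
```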

For example, when two trajectory candidates are acquired as shown in FIG. 3 and the reliability at each search takes the values shown in the figure, the average reliability is 645 for route A and 753 for route B. In the case of FIG. 3, route B therefore has the higher reliability and is selected as the tracking result.
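
The selection by average reliability can be sketched in a few lines. The per-search values below are made up (the values of FIG. 3 are not reproduced in the text) and were chosen only so that the averages come out to the 645 and 753 stated above. The function name `select_by_mean` is likewise hypothetical.

```python
from statistics import mean

def select_by_mean(routes):
    """Return the name of the route whose mean reliability is highest."""
    return max(routes, key=lambda name: mean(routes[name]))

# Made-up per-search reliabilities, chosen only so that the means are 645 and 753.
routes = {
    "A": [700, 600, 650, 630],
    "B": [610, 800, 790, 812],
}
best = select_by_mean(routes)  # route "B" has the higher mean
```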

In the description above, the trajectory candidate with the highest average reliability is selected as the tracking result, but other selection methods may be used. For example, the selection may be based on the maximum or minimum reliability within each trajectory (that is, the trajectory whose maximum or minimum value is largest is selected), or it may be made by a statistical classification method that uses the direction of movement or the like.

The image composition/output unit 8 composites a display that highlights the trajectory determined by the trajectory determination unit 7 onto the input image and outputs the result as the tracking result.

<Operation example>
The processing performed by each functional unit is described in detail below, following an operation example of the face tracking device 1 according to the present embodiment.

FIG. 4 is a flowchart showing an operation example of the face tracking device 1. The flowchart of FIG. 4 shows the processing performed after a face has been detected in the moving image.

[Initial processing at face detection]
When a face is detected in the moving image, the following initial processing is performed (S10). As shown in FIG. 5(a), when a new face is detected in the moving image, information on the position of that face is stored in the position information DB 5. FIG. 5(b) shows the contents of the position information DB 5, in which the position of the newly detected face and related information are stored. Because the face was detected by the face detection unit 3, the reliability 55, which represents face likeness, is set to 1000 points (the maximum value). In addition, because a face has been newly detected, a new record is added to the trajectory information DB 6 so that a trajectory can be obtained for that face. The path (trajectory) 63 of the newly added record holds only a single point, the detection position.

When the initial processing is complete, the next frame becomes the processing target (S11), and processing to acquire face position candidates in that frame is executed.

[Acquisition of face position candidates in the current frame]
Specifically, the position candidate search unit 41 first acquires one position candidate of the previous frame (S12 to S13). Position candidates of the previous frame can be acquired by searching the position information DB 5 for entries whose frame number 52 is that of the previous frame.

Next, the position candidate search unit 41 acquires the position candidates of the current frame that correspond to the one position candidate of the previous frame acquired in S13 (S14). Specifically, the position candidate search unit 41 acquires the face position candidates in the current frame as follows.

First, as shown in FIG. 6, search areas in the current frame are determined in advance according to the position of the face candidate in the previous frame. In the example of FIG. 6, a total of nine search areas are defined: an area 401 identical to that of the face candidate in the previous frame and the eight surrounding areas 402 to 409. The search areas may, however, be defined in various other ways.

For each of these search areas, the position candidate search unit 41 calculates a reliability, which is the probability that a face exists in the area. When calculating the reliability, a correction based on motion prediction, which gives weight to the direction of movement up to the previous frame, may be applied. A correction that takes into account face matching techniques or the color histogram around the face may also be applied. More generally, any existing technique for searching the surroundings in the current frame for a face on the basis of the face position in the previous frame may be used.

An area whose calculated reliability is equal to or greater than a predetermined threshold (for example, 500) is then acquired as a face position candidate in the current frame. Because every area whose reliability is at or above the threshold becomes a face position candidate, a single face candidate in the previous frame may yield a plurality of face position candidates, or none at all.
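
The search over the nine areas and the threshold test can be sketched as follows. This is an illustration under stated assumptions: the areas are taken to be the previous position shifted by a fixed `step` in each direction (the actual layout of FIG. 6 is not given in the text), and `score_fn` is a hypothetical stand-in for whatever face-likeness measure computes the reliability.

```python
def search_regions(x, y, step):
    """Nine search positions: the previous position and its eight neighbours."""
    return [(x + dx * step, y + dy * step) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def find_candidates(prev_xy, score_fn, step=16, threshold=500):
    """Return (position, reliability) for every region at or above the threshold."""
    x, y = prev_xy
    scored = [(pos, score_fn(pos)) for pos in search_regions(x, y, step)]
    return [(pos, s) for pos, s in scored if s >= threshold]

# Toy scoring function (made up): two regions look like a face, the rest do not.
toy_score = lambda pos: {(100, 100): 600, (116, 100): 900}.get(pos, 100)
candidates = find_candidates((100, 100), toy_score)
```

With this toy scoring function two areas pass the threshold, so the single candidate of the previous frame yields two candidates in the current frame. A scoring function that stays below 500 everywhere yields an empty list.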

Next, it is determined how many position candidates the position candidate search unit 41 acquired (S15), that is, how many search areas had a reliability equal to or greater than the threshold. The update processing of the position information DB 5 and the trajectory information DB 6 differs depending on the number of position candidates acquired.

[Update processing of the position information DB and the trajectory information DB]
1. When there is one position candidate
When the position candidate search unit 41 acquires one position candidate, the process proceeds to S16, and information on the new position candidate is added to the position information DB 5. The new position candidate is then added to the path 63 in the trajectory information DB 6.

FIG. 7 shows an example in which the state of FIG. 5 is the previous frame and only one face position candidate is acquired in the current frame for the face position candidate of FIG. 5. As shown in FIG. 7(b), information on the face position candidate in the current frame (f_1) is added to the position information DB 5 (position candidate ID: 2) (S16). As shown in FIG. 7(c), the position candidate ID of the newly added position candidate is appended to the path 63 of the trajectory candidate. The trajectory is thereby extended in the trajectory information DB 6 (S17).

2. When there are two or more position candidates
When the position candidate search unit 41 acquires two or more position candidates, the process proceeds to S18, and information on these position candidates is added to the position information DB 5. The trajectory information DB 6 is then updated (S19). Because a plurality of position candidates were acquired in the current frame, the trajectory candidate branches. The trajectory information in the trajectory information DB 6 is therefore duplicated, and one of the new position candidates is appended to the path 63 of each copy.

FIG. 8 shows an example in which the state of FIG. 7 is the previous frame and two face positions are acquired in the current frame for the face position of FIG. 7. As shown in FIG. 8(b), information on the two face position candidates in the current frame (f_2) is added to the position information DB 5 (position candidate IDs: 3 and 4). As shown in FIG. 8(c), the trajectory candidate is duplicated into two, and one of the newly added position candidates is appended to the path 63 of each trajectory.

3. When there is no position candidate
When the current frame contains no position candidate corresponding to the position candidate of the previous frame, the trajectory is interrupted. Because there is no corresponding position candidate in the current frame, the position information DB 5 is not updated. Because the trajectory ends, the trajectory information DB 6 is updated by attaching to the corresponding trajectory a mark (flag) indicating that it has ended, thereby recording that the trajectory was interrupted (S20).
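
The three cases above (one candidate, two or more candidates, no candidate) can be sketched as a single update of the trajectory records. The dictionary layout with `path` and `finished` keys and the function name `extend_trajectories` are assumptions made for illustration. The candidate IDs follow the examples of FIG. 7 and FIG. 8.

```python
import copy

def extend_trajectories(trajectories, prev_id, new_ids):
    """Update every open trajectory whose path ends at prev_id.

    One new ID extends the path, several new IDs duplicate the trajectory
    (a branch), and no new ID marks the trajectory as finished.
    """
    result = []
    for traj in trajectories:
        if traj["finished"] or traj["path"][-1] != prev_id:
            result.append(traj)          # not affected by this candidate
            continue
        if not new_ids:
            traj["finished"] = True      # case 3: the trajectory is interrupted
            result.append(traj)
            continue
        for new_id in new_ids:           # cases 1 and 2: extend or branch
            branch = copy.deepcopy(traj)
            branch["path"].append(new_id)
            result.append(branch)
    return result

trajs = [{"path": [1], "finished": False}]
trajs = extend_trajectories(trajs, prev_id=1, new_ids=[2])     # path 1-2
trajs = extend_trajectories(trajs, prev_id=2, new_ids=[3, 4])  # paths 1-2-3 and 1-2-4
trajs = extend_trajectories(trajs, prev_id=3, new_ids=[])      # 1-2-3 is flagged as ended
```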

When the database update processing above is complete, the process proceeds to S21, where it is determined whether the processing has been completed for all face candidates of the previous frame. If it has not, the process returns to S13 and the processing is performed for the next face candidate. If the processing has been completed for all face candidates, the process proceeds to S22.

In S22, it is determined whether one or more face candidates exist in the current frame. If a face candidate exists in the current frame, face tracking must continue, so the process returns to S11 and the same processing is executed for the next frame. If no face candidate exists in the current frame, the face is no longer present in the moving image, so the process proceeds to S23 and the most plausible face trajectory is determined.
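
The loop formed by steps S11 to S22 can be sketched as follows. It is a simplification under stated assumptions: `candidates_for` is a hypothetical callback standing in for the position candidate search of S14, trajectories are kept as plain lists of candidate IDs, and the loop also stops when the frames run out, a case the flowchart description does not spell out.

```python
def track(first_id, n_frames, candidates_for):
    """Follow every branch frame by frame and return all trajectory candidates.

    candidates_for(frame, prev_id) returns the candidate IDs found in `frame`
    for the previous-frame candidate `prev_id` (hypothetical callback).
    """
    open_paths = [[first_id]]   # trajectories still being tracked
    closed_paths = []           # trajectories that were interrupted
    for frame in range(1, n_frames):                   # S11: next frame
        next_open = []
        for path in open_paths:                        # S12, S13, S21
            new_ids = candidates_for(frame, path[-1])  # S14
            if not new_ids:                            # S20: trajectory ends
                closed_paths.append(path)
            for new_id in new_ids:                     # S16 to S19: extend or branch
                next_open.append(path + [new_id])
        open_paths = next_open
        if not open_paths:                             # S22: no candidate left
            break
    return closed_paths + open_paths

# Made-up correspondence table: (frame, previous ID) -> IDs found in that frame.
table = {(1, 1): [2], (2, 2): [3, 4], (3, 3): [5]}
all_paths = track(1, 5, lambda frame, prev: table.get((frame, prev), []))
```

The returned list is the input to the trajectory determination of S23, which scores each path and keeps the most plausible one.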

[Trajectory determination processing]
The trajectory determination processing in S23 is described in detail below, using as an example the case where the trajectory candidate acquisition unit 4 has acquired three trajectory candidates as shown in FIG. 9A. FIG. 9A shows the trajectory of each trajectory candidate, FIG. 9B shows the contents of the position information DB 5, and FIG. 9C shows the contents of the trajectory information DB 6. As shown in FIG. 9A, two position candidates are detected in frame f_1 for the face detected in frame f_0. Furthermore, on one of the trajectories, two position candidates are detected in frame f_2 for the position candidate detected in frame f_1. Three trajectory candidates are therefore obtained for the face detected in frame f_0.

Here, the trajectory determination unit 7 calculates the average reliability (face likeness) of each trajectory and determines the trajectory with the highest average reliability to be the trajectory of the face. In the situation shown in FIGS. 9A to 9C, the reliability of each trajectory changes over time as shown in FIG. 10. When the reliability of each trajectory is averaged over all frames, trajectory 1 has the largest average, so the trajectory determination unit 7 determines trajectory 1 to be the trajectory of the face.

The trajectory may also be determined in other ways, for example by selecting the trajectory whose minimum reliability is largest, or by selecting the trajectory whose cumulative reliability is largest. Statistics such as the variance or the standard deviation may also be used. Moreover, the trajectory may be determined using not only statistics over all frames but also statistics over only the most recent few frames, or using the reliability of the final frame alone.
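
The alternative criteria listed above differ only in which statistic is taken and over which frames. A minimal sketch, assuming a hypothetical `score` function with a `method` switch and an optional `last_n` window, and using made-up reliability values:

```python
from statistics import mean

def score(reliabilities, method="mean", last_n=None):
    """Score one trajectory from its per-frame reliabilities.

    last_n limits the statistic to the most recent frames (None means all
    frames, 1 means the final frame only).
    """
    window = reliabilities if last_n is None else reliabilities[-last_n:]
    if method == "mean":
        return mean(window)
    if method == "min":
        return min(window)
    if method == "sum":
        return sum(window)
    raise ValueError(f"unknown method: {method}")

# Made-up reliabilities for two trajectory candidates.
t1 = [1000, 700, 650, 800, 850]
t2 = [1000, 900, 880, 600, 520]
```

With these values trajectory `t1` scores higher under every criterion shown, but over the first three frames alone `t2` has the higher mean. This is why the choice of window matters when a decision is taken before the sequence ends.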

<Effect of this embodiment>
With the face tracking device 1 according to the present embodiment, when a plurality of faces corresponding to the face of the previous frame are acquired in the current frame, the trajectory is branched and each branch is tracked. Consequently, when the face being tracked crosses an object that resembles a face, for example, the trajectory need not be decided at that moment; both are tracked for the time being, and which trajectory is correct can be judged later. For example, in the situation shown in FIGS. 9A to 9C and FIG. 10, a method that does not allow branching and always tracks the face with the highest reliability would follow trajectory 2 from point A (see FIG. 10). Judging from what happens afterwards, however, trajectory 2 turns out to be wrong and trajectory 1 correct. Because the face tracking device according to the present embodiment allows branching and tracks a plurality of trajectories, it can obtain trajectory 1 as the correct tracking result. In other words, the face tracking device according to the present embodiment can track faces with high reliability.

(Second Embodiment)
The first embodiment adopts a configuration in which a single trajectory is determined as the tracking result after the tracking process has finished for all frames of the input moving image. In the present embodiment, one trajectory is determined for every frame as the moving image is input. With this configuration, the face tracking result can be output in real time while the moving image is being received.

FIG. 11 is a flowchart showing the flow of the face tracking process in the present embodiment. The only difference from the face tracking process of the first embodiment is that the order of steps S22 and S23 is reversed. In the present embodiment, the most plausible of the trajectory candidates is thus judged in every frame and acquired as the tracking result, so the tracking result is obtained in real time.

Note that in the present embodiment the branch adopted as the tracking result may change as time passes, in which case the tracking result being output changes at that point. For example, when a moving image like that of FIGS. 9A to 9C and FIG. 10 is input, trajectory 2 is output as the tracking result from time A to time B in FIG. 10. After time B, trajectory 1 is acquired as the tracking result. The tracking result may therefore change substantially when the selected trajectory candidate switches in this way.
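
The switch of the reported trajectory over time can be reproduced with a short sketch that re-evaluates the average reliability after every frame. The function names and the reliability values are made up for illustration. They are not the values of FIG. 10.

```python
from statistics import mean

def best_so_far(histories):
    """Pick the trajectory with the highest mean reliability seen so far."""
    return max(histories, key=lambda name: mean(histories[name]))

def realtime_choices(full):
    """Return the trajectory reported after each frame."""
    n_frames = min(len(values) for values in full.values())
    return [
        best_so_far({name: values[: t + 1] for name, values in full.items()})
        for t in range(n_frames)
    ]

# Made-up reliabilities: trajectory 2 looks better at first, then fades.
full = {
    "trajectory 1": [700, 650, 800, 850],
    "trajectory 2": [800, 820, 600, 520],
}
choices = realtime_choices(full)
```

Here the reported result is trajectory 2 for the first three frames and trajectory 1 from the fourth frame on, which mirrors the switch described above.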

A concrete explanation follows, taking the situation of FIG. 15 as an example. FIGS. 12(a) to 12(d) correspond to FIGS. 15(e) to 15(h). At the time of FIG. 12(a), the average reliability calculated according to Equation 2 is 725 for route A (the wheel) and 717 for route B (the face), so route A has the higher calculated reliability. Route A is therefore displayed as the tracking result in FIG. 12(a). Tracking of route B nevertheless continues at this point.

Next, at the time of FIG. 12(b), the average reliability calculated according to Equation 2 is higher for route B. Route B is therefore displayed as the tracking result in FIG. 12(b). Because route B continues to have the higher calculated reliability in FIGS. 12(c) and 12(d), the correct tracking result is obtained from then on.

According to the present embodiment, the tracking result is output in real time, and even when mistracking occurs because the track jumps to another object or the like (trajectory 2 is selected between times A and B in FIG. 10), the device can recover from the mistracking and obtain the correct tracking result (trajectory 1 can be selected after time B in FIG. 10). The face tracking device according to the present embodiment can thus track faces in real time with high reliability.

(Third Embodiment)
In the present embodiment, when a plurality of trajectory candidates that have branched merge again, processing is performed to reduce the number of position candidates to be tracked. In the embodiments above, when a trajectory branches, the tracking process is performed for each branch. However, trajectories that have branched may arrive at the same face position, that is, the same area, in the next frame. In that case, these trajectory candidates follow the same trajectory in all subsequent frames. The number of trajectories to be tracked can therefore be reduced by merging them instead of tracking them redundantly, which can be expected to improve processing efficiency.

FIG. 13 illustrates an example in which two trajectories that have branched merge. In FIG. 13, the trajectories that branched at position candidate 2 merge at position candidate 7. When the position candidates of the next frame corresponding to a plurality of position candidates (5 and 6) are the same in this way, only one position candidate is tracked in subsequent frames.

When the two trajectories merge at position candidate 7, the trajectory information DB 6 holds two trajectories, 1-2-3-5-7 and 1-2-4-6-7. These two trajectories can be handled in either of the following two ways.

The first method is to update both trajectories together whenever a position candidate is acquired in a subsequent frame. For example, if position candidate 8 is acquired in the next frame for position candidate 7, both trajectories are updated, and the trajectories 1-2-3-5-7-8 and 1-2-4-6-7-8 are stored in the trajectory information DB 6. The second method is to judge, at the moment the two trajectories merge, which trajectory candidate is the more plausible and to delete the other trajectory candidates from the trajectory information DB 6.
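
The second method can be sketched as a pruning step applied when paths end at the same position candidate. The text only says that the more plausible trajectory is kept. Using the mean reliability as the measure of plausibility is an assumption here, as are the function name `prune_merged` and the reliability values.

```python
from statistics import mean

def prune_merged(paths, reliability):
    """Among paths ending at the same candidate, keep the one with the highest mean reliability."""
    def path_score(path):
        return mean([reliability[c] for c in path])

    best = {}  # last candidate ID -> best path ending there
    for path in paths:
        end = path[-1]
        if end not in best or path_score(path) > path_score(best[end]):
            best[end] = path
    return list(best.values())

# The two trajectories of the merge example, with made-up reliabilities per candidate ID.
paths = [[1, 2, 3, 5, 7], [1, 2, 4, 6, 7]]
reliability = {1: 1000, 2: 800, 3: 700, 4: 600, 5: 750, 6: 550, 7: 900}
kept = prune_merged(paths, reliability)
```

With these values the path 1-2-3-5-7 has the higher mean (830 against 770) and is the only one kept, so a single trajectory has to be extended from position candidate 7 onward.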

The face tracking device according to the present embodiment can speed up processing.

(Fourth Embodiment)
In the present embodiment, an upper limit is placed on the number of branches. That is, when the number of position candidates acquired exceeds the upper limit, the position candidates in excess of the limit are not tracked in subsequent frames; their branches are treated as ending at the next frame.

When more position candidates than the upper limit are acquired in the current frame, the position candidates to be tracked from the next frame onward can be selected on the basis of reliability. For example, candidates with a high reliability in the current frame, as acquired by the position candidate search unit 41 (the reliability 55 in the position information DB 5), can be selected preferentially. Alternatively, as in the trajectory determination processing by the trajectory determination unit 7, the trajectories (position candidates) to keep may be chosen on the basis of the reliability of the trajectory (such as the average, minimum, or maximum of the reliability over the frames of the trajectory).
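
The first selection rule, keeping the candidates with the highest current-frame reliability, can be sketched as follows. The function name `limit_candidates`, the tuple layout, and the sample values are assumptions made for illustration.

```python
def limit_candidates(candidates, max_branches):
    """Split (candidate_id, reliability) pairs into those kept and those dropped.

    The max_branches candidates with the highest reliability are kept. The
    branches of the dropped candidates are treated as ending at the next frame.
    """
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    return ranked[:max_branches], ranked[max_branches:]

# Made-up candidates of the current frame with an upper limit of two branches.
kept, dropped = limit_candidates([(3, 620), (4, 910), (5, 540), (6, 780)], max_branches=2)
```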

Because the face tracking device according to the present embodiment sets an upper limit on the number of position candidates to be tracked and selects the trajectories to track on the basis of reliability, it can speed up processing without a significant loss of tracking accuracy.

(Fifth Embodiment)
The embodiments above were described for the case of tracking one face in a moving image, but the face tracking device 1 can track a plurality of faces in a moving image simultaneously. That is, the face detection unit 3 detects faces in the input moving image in every frame or at intervals of a predetermined number of frames, and the face tracking process shown in FIG. 4 is executed for each detected face, so that a plurality of faces are tracked simultaneously.

A problem that can arise here is that, when the face detection unit 3 detects a face that is already being tracked, the same face is tracked twice. This problem can be avoided by a technique similar to that of the fourth embodiment. The processing performed when a face is detected is described in detail below, together with the way this problem is avoided.

FIG. 14 is a flowchart showing the processing performed when the face detection unit 3 detects a face. First, the face detection unit 3 performs face detection on the input moving image (S101). When the face detection unit 3 detects a face (S102-YES), the face detection result is merged with the face position candidates. Specifically, it is determined whether the position of the face detected by the face detection unit 3 matches the position of a face candidate acquired by the position candidate search unit 41 (S103). If they match (S103-YES), the detected face is regarded as identical to a face already being tracked, and no new tracking process is started. If the face detected by the face detection unit 3 does not match any face already being tracked (S103-NO), the position information DB and the trajectory information DB are updated (S104). More specifically, initialization processing is performed in which the position of the detected face is stored in the position information DB 5 and one record for the trajectory of the new basic face is added to the trajectory information DB 6. The tracking process (the flowchart of FIG. 4) is then started for that face (S105).
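
The check of S103 and the branch into S104 and S105 can be sketched as follows. The text only says that the positions are compared for a match. The `tolerance` parameter, the function names, and the `start_tracking` callback are assumptions introduced for illustration.

```python
def is_already_tracked(detected_xy, tracked_positions, tolerance=0):
    """S103: does the detected face match the position of a candidate being tracked?"""
    dx, dy = detected_xy
    return any(
        abs(dx - x) <= tolerance and abs(dy - y) <= tolerance
        for x, y in tracked_positions
    )

def handle_detection(detected_xy, tracked_positions, start_tracking, tolerance=0):
    """Start a new track only for a face that is not yet being tracked.

    Returns True when a new track was started (S104, S105), False otherwise.
    """
    if is_already_tracked(detected_xy, tracked_positions, tolerance):
        return False
    start_tracking(detected_xy)
    return True

started = []                 # stands in for the initialization of S104 and S105
tracked = [(100, 100)]       # made-up position of a face already being tracked
first = handle_detection((100, 100), tracked, started.append)   # same face, no new track
second = handle_detection((300, 50), tracked, started.append)   # new face, track started
```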

According to the present embodiment, faces are detected in the moving image and each of the detected faces is tracked, while a face already being tracked is prevented from being tracked twice.

The embodiments above merely illustrate specific examples of the present invention. The scope of the present invention is not limited to these embodiments, and various modifications are possible within the scope of its technical idea.

FIG. 1 is a diagram showing the functional blocks of the face tracking device.
FIG. 2 is a diagram explaining the search areas in the face detection process.
FIG. 3 is a diagram explaining trajectory candidates and the reliability at each search.
FIG. 4 is a flowchart showing the flow of the face tracking process.
FIG. 5 is a diagram explaining the update processing of the position information DB and the trajectory information DB when a new face is detected.
FIG. 6 is a diagram explaining the surrounding areas used to search for the face position in the current frame corresponding to a face position candidate of the previous frame.
FIG. 7 is a diagram explaining the update processing of the position information DB and the trajectory information DB when one face position candidate is acquired in the current frame for a face position candidate of the previous frame.
FIG. 8 is a diagram explaining the update processing of the position information DB and the trajectory information DB when two face position candidates are acquired in the current frame for a face position candidate of the previous frame.
FIG. 9A is a diagram showing each trajectory candidate when a trajectory branches and a plurality of trajectory candidates are acquired.
FIG. 9B is a diagram showing the contents of the position information DB corresponding to FIG. 9A.
FIG. 9C is a diagram showing the contents of the trajectory information DB corresponding to FIG. 9A.
FIG. 10 is a diagram showing an example of the change over time of the reliability of a plurality of trajectory candidates.
FIG. 11 is a flowchart showing the flow of the face tracking process when the tracking result is output in real time.
FIG. 12 is a diagram explaining the reliability of each trajectory candidate at each search when the tracking result is output in real time.
FIG. 13 is a diagram showing an example in which two branched trajectory candidates merge.
FIG. 14 is a diagram showing the flow of processing at face detection when a plurality of faces are tracked simultaneously.
FIG. 15 is a diagram showing a face tracking process according to the prior art.
FIG. 16 is a diagram showing detection results of a face tracking process according to the prior art.

Explanation of symbols

1 Face tracking device
2 Moving image input unit
3 Face detection unit
4 Trajectory candidate acquisition unit
41 Position candidate search unit
5 Position information DB
6 Trajectory information DB
7 Trajectory determination unit
8 Image composition/output unit

Claims

An object tracking method for acquiring a trajectory of a tracking object in a moving image by repeating a step of acquiring a position candidate of the tracking object in a current frame corresponding to a position candidate of the tracking object in a previous frame, wherein
one or more position candidates of the current frame are acquired for one position candidate in the previous frame, and, when there are a plurality of position candidates in the previous frame, the corresponding position candidates of the current frame are acquired for each of those position candidates, whereby a plurality of trajectory candidates of the tracking object are acquired, and
a trajectory of the tracking object is determined from among the plurality of acquired trajectory candidates.
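The branching step in the claim above can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `extend_trajectories` and `toy_search`, the representation of a position as an `(x, y)` tuple, and the choice to drop a trajectory candidate when no current-frame position is found are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Position = Tuple[int, int]    # hypothetical (x, y) of a candidate region
Trajectory = List[Position]   # one position per frame


def extend_trajectories(
    trajectories: List[Trajectory],
    search: Callable[[Position], List[Position]],
) -> List[Trajectory]:
    """Extend every trajectory candidate by one frame.

    `search` is a hypothetical stand-in for the position candidate search:
    given a previous-frame position it returns zero or more current-frame
    positions. A trajectory branches when more than one position is returned.
    Dropping a trajectory when nothing is returned is an assumption of this
    sketch, not something the claim states.
    """
    extended: List[Trajectory] = []
    for trajectory in trajectories:
        for position in search(trajectory[-1]):
            extended.append(trajectory + [position])
    return extended


def toy_search(prev: Position) -> List[Position]:
    """Toy search: branch into two candidates only at x == 0."""
    x, y = prev
    if x == 0:
        return [(x + 1, y), (x + 1, y + 1)]
    return [(x + 1, y)]


step1 = extend_trajectories([[(0, 0)]], toy_search)  # one candidate becomes two
step2 = extend_trajectories(step1, toy_search)       # both branches continue
```

Each call corresponds to processing one frame, so repeating it over the frames of a moving image yields the set of trajectory candidates from which the final trajectory is later chosen.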

The object tracking method according to claim 1, wherein a reliability, which is the probability that the tracking object exists in a predetermined region of the current frame corresponding to a position candidate of the previous frame, is calculated, and, when the calculated reliability is equal to or greater than a predetermined threshold, the region is acquired as a position candidate of the tracking object in the current frame.
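As a rough illustration of the threshold test in the claim above, the following Python sketch keeps only the regions whose reliability reaches a threshold. The function name `acquire_candidates`, the `reliability` callback, and the default threshold of 0.5 are hypothetical; the claim does not specify how the reliability is computed or what threshold value is used.

```python
from typing import Callable, List, Tuple

Region = Tuple[int, int]  # hypothetical identifier of a region, e.g. its (x, y)


def acquire_candidates(
    regions: List[Region],
    reliability: Callable[[Region], float],
    threshold: float = 0.5,  # assumed value, for illustration only
) -> List[Tuple[Region, float]]:
    """Return (region, reliability) pairs whose reliability is >= threshold."""
    scored = [(region, reliability(region)) for region in regions]
    return [(region, score) for region, score in scored if score >= threshold]


# Toy reliabilities standing in for the output of a face-likeness evaluator.
toy_scores = {(0, 0): 0.2, (1, 0): 0.5, (2, 0): 0.9}
accepted = acquire_candidates(list(toy_scores), toy_scores.get, threshold=0.5)
```

Because the comparison is "equal to or greater than", a region whose reliability is exactly at the threshold is accepted.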

The object tracking method according to claim 2, wherein the reliability of the position candidate of the tracking object in each frame is stored, and
the trajectory of the tracking object is determined based on the reliabilities associated with each trajectory candidate.
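One possible way to determine a trajectory from stored reliabilities is sketched below in Python. The claim only states that the decision is based on the reliabilities of the trajectory candidates; using the mean reliability over all frames, and the name `select_trajectory`, are assumptions of this sketch.

```python
from typing import List, Tuple

Position = Tuple[int, int]
# A trajectory candidate: its positions and the reliability stored for each one.
Candidate = Tuple[List[Position], List[float]]


def select_trajectory(candidates: List[Candidate]) -> Candidate:
    """Pick the candidate with the highest mean reliability.

    Assumes every candidate has at least one stored reliability. The mean is
    an illustrative criterion only; other criteria (sum, minimum, most recent
    value) would fit the wording of the claim equally well.
    """
    return max(candidates, key=lambda c: sum(c[1]) / len(c[1]))


branch_a = ([(0, 0), (1, 0)], [0.9, 0.3])  # mean 0.60
branch_b = ([(0, 0), (1, 1)], [0.8, 0.7])  # mean 0.75
chosen = select_trajectory([branch_a, branch_b])
```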

The object tracking method according to any one of claims 1 to 3, wherein, when the position candidates of the current frame corresponding to a plurality of position candidates in the previous frame are the same, only one of the plurality of position candidates is tracked in subsequent frames.
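The merging behavior in the claim above might be sketched as follows in Python. The claim says only that one candidate continues to be tracked when several converge on the same current-frame position; keeping the candidate with the larger total reliability, and the name `merge_converged`, are assumptions of this sketch.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]
Candidate = Tuple[List[Position], List[float]]  # positions and their reliabilities


def merge_converged(candidates: List[Candidate]) -> List[Candidate]:
    """Keep one trajectory candidate per current-frame position.

    Candidates whose last position is identical are treated as merged. The
    survivor is the one with the larger summed reliability, which is an
    illustrative tie-breaking rule rather than one taken from the claim.
    """
    best: Dict[Position, Candidate] = {}
    for positions, reliabilities in candidates:
        key = positions[-1]
        if key not in best or sum(reliabilities) > sum(best[key][1]):
            best[key] = (positions, reliabilities)
    return list(best.values())


a = ([(0, 0), (1, 1)], [0.9, 0.6])  # ends at (1, 1), total 1.5
b = ([(0, 1), (1, 1)], [0.8, 0.9])  # ends at (1, 1), total 1.7
c = ([(0, 2), (1, 3)], [0.7, 0.7])  # ends elsewhere, unaffected
merged = merge_converged([a, b, c])
```

Merging keeps the number of tracked candidates from growing when branches reconverge, since candidates sharing a position would otherwise be searched redundantly in every later frame.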

The object tracking method according to any one of the preceding claims through claim 4, wherein an upper limit is set on the number of position candidates to be tracked, and
when the number of position candidates acquired in the current frame exceeds the upper limit, the position candidates to be tracked in subsequent frames are selected based on the reliability associated with the acquired position candidates.
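A minimal Python sketch of the upper-limit rule in the claim above is given here. Keeping the candidates with the highest reliability is one natural reading of "selected based on the reliability"; that reading, the name `limit_candidates`, and the parameter `max_candidates` are assumptions.

```python
import heapq
from typing import List, Tuple

Position = Tuple[int, int]
Scored = Tuple[Position, float]  # position candidate and its reliability


def limit_candidates(candidates: List[Scored], max_candidates: int) -> List[Scored]:
    """Keep at most `max_candidates` candidates, preferring higher reliability.

    If the limit is not exceeded the candidates are returned unchanged;
    otherwise the most reliable ones are returned in descending order.
    """
    if len(candidates) <= max_candidates:
        return list(candidates)
    return heapq.nlargest(max_candidates, candidates, key=lambda c: c[1])


found = [((0, 0), 0.4), ((1, 1), 0.9), ((2, 2), 0.7)]
kept = limit_candidates(found, max_candidates=2)
```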

The object tracking method according to claim 1, wherein a tracking object is detected from the moving image, and
the detected tracking object is tracked in subsequent frames.

The object tracking method according to claim 6, wherein, when the position of the detected tracking object is equal to a position candidate in the current frame of a tracking object that is already being tracked, the detected tracking object is regarded as the tracking object already being tracked, and no new tracking is started.
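The duplicate check in the claim above can be sketched in Python as a simple filter. Exact equality of `(x, y)` tuples is used here to mirror the word "equal" in the claim; the function name `start_new_tracks` and the position representation are hypothetical.

```python
from typing import List, Tuple

Position = Tuple[int, int]


def start_new_tracks(
    detections: List[Position],
    tracked_positions: List[Position],
) -> List[Position]:
    """Return the detections that should start a new track.

    A detection whose position equals a current-frame position candidate of
    an object that is already being tracked is treated as that same object,
    so no new track is started for it.
    """
    tracked = set(tracked_positions)
    return [d for d in detections if d not in tracked]


new_tracks = start_new_tracks(
    detections=[(10, 10), (40, 20)],
    tracked_positions=[(10, 10), (11, 12)],
)
```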

The object tracking method according to claim 1, wherein the tracking object is a human face.

An object tracking device comprising:
moving image input means for receiving input of a moving image;
trajectory candidate acquisition means for acquiring a plurality of trajectory candidates of a tracking object by using position candidate search means, the position candidate search means acquiring a position candidate of the tracking object in a current frame corresponding to a position candidate of the tracking object in a previous frame, acquiring one or more position candidates of the current frame for one position candidate in the previous frame, and, when there are a plurality of position candidates in the previous frame, acquiring the corresponding position candidates of the current frame for each of those position candidates; and
trajectory determination means for determining a trajectory of the tracking object from among the plurality of acquired trajectory candidates.

The object tracking device according to claim 9, wherein the position candidate search means calculates a reliability, which is the probability that the tracking object exists in a predetermined region of the current frame corresponding to a position candidate of the previous frame, and, when the calculated reliability is equal to or greater than a predetermined threshold, acquires the region as a position candidate of the tracking object in the current frame.

The object tracking device according to claim 10, wherein the trajectory candidate acquisition means stores the reliability of the position candidate of the tracking object in each frame, and
the trajectory determination means determines the trajectory of the tracking object based on the reliabilities associated with each trajectory candidate.

The object tracking device according to claim 9, wherein the tracking object is a human face.

An object tracking program for causing an information processing apparatus to acquire a trajectory of a tracking object in a moving image by repeating a step of acquiring a position candidate of the tracking object in a current frame corresponding to a position candidate of the tracking object in a previous frame,
the program causing the information processing apparatus to:
acquire one or more position candidates of the current frame for one position candidate in the previous frame, and, when there are a plurality of position candidates in the previous frame, acquire the corresponding position candidates of the current frame for each of those position candidates, thereby acquiring a plurality of trajectory candidates of the tracking object; and
determine a trajectory of the tracking object from among the plurality of acquired trajectory candidates.

The object tracking program according to claim 13, wherein a reliability, which is the probability that the tracking object exists in a predetermined region of the current frame corresponding to a position candidate of the previous frame, is calculated, and, when the calculated reliability is equal to or greater than a predetermined threshold, the region is acquired as a position candidate of the tracking object in the current frame.

The object tracking program according to claim 14, wherein the reliability of the position candidate of the tracking object in each frame is stored, and
the trajectory of the tracking object is determined based on the reliabilities associated with each trajectory candidate.

The object tracking program according to claim 13, wherein the tracking object is a human face.