JP2020057424A

JP2020057424A - Device and method for tracking mobile object and program

Info

Publication number: JP2020057424A
Application number: JP2019230789A
Authority: JP
Inventors: ヴェトクォクファン; Viet Quoc Pham
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2019-12-20
Filing date: 2019-12-20
Publication date: 2020-04-09

Abstract

PROBLEM TO BE SOLVED: To provide a moving object tracking device, method, and program that can track a plurality of moving objects at low computational cost even when the objects move in complex ways. SOLUTION: An object tracking device comprises an acquisition unit, a detection unit, an extraction unit, and an association unit. The acquisition unit acquires a plurality of frames. The detection unit detects an object from the plurality of frames. The extraction unit extracts, for the detected object, a first movement trajectory including a first frame, a second movement trajectory composed only of frames preceding the first frame, and a third movement trajectory composed only of frames after the first frame. The association unit associates a third movement trajectory with a second movement trajectory when the similarity between the second and third movement trajectories is greater than or equal to the similarity between the second movement trajectory and any other third movement trajectory, and is also greater than or equal to the similarity between the first and third movement trajectories. SELECTED DRAWING: Figure 1

Description

Embodiments of the present invention relate to a moving object tracking device, method, and program.

A moving object tracking system has been disclosed that detects a plurality of moving objects contained in a plurality of frames of a time series of images, tracks the moving objects by associating identical objects with one another across frames, and records the tracking results or identifies the moving objects based on the tracking results.

However, such a system examines combinations of branches (paths) whose nodes are the appearance, disappearance, and detection failure of each face detected in the time-series images, so the computational cost increases.

JP 2011-170711 A

The problem to be solved by the present invention is to provide a moving object tracking system and method that keep the computational cost low even when a plurality of moving objects move in complex ways.

The object tracking device of the embodiment includes an acquisition unit, a detection unit, an extraction unit, and an association unit. The acquisition unit acquires a plurality of frames. The detection unit detects an object from the plurality of frames. The extraction unit extracts, for the detected object, a first movement trajectory that includes a first frame, a second movement trajectory composed only of frames before the first frame, and a third movement trajectory composed only of frames subsequent to the first frame. The association unit associates a third movement trajectory with a second movement trajectory when two conditions hold: the similarity between the second and third movement trajectories is equal to or greater than the similarity between the second movement trajectory and any other third movement trajectory, and it is equal to or greater than the similarity between the first and third movement trajectories.

FIG. 1 is a configuration diagram illustrating an example of the moving object tracking system according to the first embodiment.
FIG. 2 is a flowchart illustrating an example of the processing of the moving object tracking system according to the first embodiment.
FIG. 3 is an explanatory diagram showing an example of dividing movement trajectories.
FIG. 4 is an explanatory diagram showing an example of calculation using the Euclidean distance.
FIG. 5 is a configuration diagram illustrating an example of the moving object tracking system according to the second embodiment.
FIG. 6 is a schematic diagram illustrating occlusion in the second embodiment.

Hereinafter, embodiments will be described in detail with reference to the accompanying drawings.

(First embodiment)
FIG. 1 is a configuration diagram illustrating an example of the moving object tracking system 1 according to the first embodiment. As shown in FIG. 1, the system includes an acquisition unit 10, a detection unit 11, an extraction unit 12, a management unit 13, a setting unit 14, a division unit 15, an association unit 16, an integration unit 17, and an output unit 18. The moving object tracking system may be realized in software, that is, by causing a processing device such as a CPU (Central Processing Unit) to execute a program; in hardware such as an IC (Integrated Circuit); or by a combination of software and hardware. The moving image acquired by the acquisition unit 10 may be one stored in a storage device.

The storage device can be realized by at least one of magnetically, optically, or electrically writable storage devices such as an HDD (Hard Disk Drive), an SSD (Solid State Drive), a ROM (Read Only Memory), or a memory card.

FIG. 2 is a flowchart illustrating an example of the processing flow of the moving object tracking system 1 according to the first embodiment.

First, the moving object tracking system 1 acquires a moving image captured by an imaging device or stored in a storage device (step S101). The moving image includes a plurality of frames (images).

Next, the detection unit 11 detects a plurality of moving objects from the moving image acquired by the acquisition unit 10 (step S102). A moving object is, for example, a person or a vehicle. The following description uses a person as the example. As a concrete method for detecting a person, the method described in N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection," IEEE Computer Vision and Pattern Recognition, pp. 886-893, 2005, can be applied. In addition, the accuracy of person detection can be improved by using a visual tracking technique: the object detected in the frame immediately preceding the frame being processed is tracked in order to estimate the position of the same object in the frame being processed. As the visual tracking technique, the method described in K. Zhang, L. Zhang, and M. H. Yang, "Real-time Compressive Tracking," European Conference on Computer Vision, pp. 866-879, 2012, can be applied.

Next, the extraction unit 12 associates each person across frames and extracts the movement trajectory (hereinafter referred to as a tracklet) of each associated person (step S103). As a method for extracting tracklets, the method described in H. Pirsiavash, D. Ramanan, and C. C. Fowlkes, "Globally-Optimal Greedy Algorithms for Tracking a Variable Number of Objects," IEEE Computer Vision and Pattern Recognition, pp. 1201-1208, 2012, can be applied.

Next, the management unit 13 manages the tracklet of each person (step S104). For a tracklet, it suffices to manage the time span of the moving image in which the moving person appears; this may be managed by image frame numbers, playback or recording time, or the like. The position and size of the person detected by the detection unit 11 may be stored together. This information may be stored in the storage unit 19. The tracklet of each person may be stored in the storage unit 19 together with the person's ID. The IDs need only be assigned so that persons can be distinguished from one another; they need not identify who each person actually is.
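As a minimal sketch of what the management unit might keep for each tracklet, the record below stores a person ID together with the per-frame position and size. The names `Observation` and `Tracklet` and their fields are illustrative assumptions, not identifiers from the embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Observation:
    """Detected position and size of a person in one frame."""
    position: Tuple[float, float]  # (x, y) in image coordinates
    size: Tuple[float, float]      # (h, w) of the detected region


@dataclass
class Tracklet:
    """Per-person movement trajectory keyed by frame number (hypothetical layout)."""
    person_id: int  # only needs to distinguish persons, not identify them
    observations: Dict[int, Observation] = field(default_factory=dict)

    def add(self, frame: int, obs: Observation) -> None:
        self.observations[frame] = obs

    @property
    def start(self) -> int:
        """Number of the first frame in which the person appears."""
        return min(self.observations)

    @property
    def end(self) -> int:
        """Number of the last frame in which the person appears."""
        return max(self.observations)


t = Tracklet(person_id=7)
t.add(10, Observation((100.0, 50.0), (80.0, 30.0)))
t.add(11, Observation((102.0, 50.0), (80.0, 30.0)))
print(t.person_id, t.start, t.end)
```

Keying observations by frame number is one way to satisfy the requirement that the time span of each person be recoverable; a playback timestamp could serve as the key instead.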

Next, the setting unit 14 sets the frame for which positions are to be calculated as the frame of interest (step S105).

Next, the division unit 15 divides the extracted tracklets into a first block that includes at least the frame of interest, a second block located before the first block in time-series order, and a third block located after the first block in time-series order (step S105). For example, as shown in FIG. 3, when the frame at time t (the i-th frame) among N frames of images is set as the frame of interest, the block containing the i-th frame is the first block, the block preceding the first block in time-series order is the second block, and the block following the first block in time-series order is the third block. More specifically, Equation 1 below is used.

Here, start(t) is the number of the start frame of tracklet t, and end(t) is the number of the end frame of tracklet t.
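A minimal sketch of the division step, assuming that Equation 1 partitions the tracklets by comparing start(t) and end(t) with the frame of interest i. The partition rule, the `Span` type, and the function name `split_blocks` are assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Span(NamedTuple):
    """Hypothetical stand-in for a tracklet: only its frame range is needed here."""
    name: str
    start: int  # start(t): number of the first frame
    end: int    # end(t): number of the last frame


def split_blocks(tracklets: List[Span], i: int) -> Tuple[List[Span], List[Span], List[Span]]:
    """Partition tracklets around the frame of interest i (assumed rule)."""
    first = [t for t in tracklets if t.start <= i <= t.end]  # contains frame i
    second = [t for t in tracklets if t.end < i]             # entirely before frame i
    third = [t for t in tracklets if t.start > i]            # entirely after frame i
    return first, second, third


tracklets = [Span("a", 0, 9), Span("b", 5, 14), Span("c", 12, 20)]
first, second, third = split_blocks(tracklets, i=10)
print([t.name for t in first], [t.name for t in second], [t.name for t in third])
# prints ['b'] ['a'] ['c']
```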

Next, the association unit 16 acquires the tracklets included in the second block from the management unit 13 and associates them with the tracklets included in the first and third blocks based on tracklet similarity (step S106).

The association is performed in two stages: tracklet selection in the second block and tracklet selection in the third block.

First, in the tracklet selection for the second block, tracklets ap satisfying end(ap) = i-1 are selected from the second block, where i is the frame of interest. More specifically, it is preferable to use second-block tracklets whose length is within a predetermined threshold. Tracklets that end before frame i-1 in time-series order are excluded because they were already processed before this processing flow. This considerably reduces the number of association candidates and therefore substantially shortens the processing time.

In the tracklet selection for the third block, each tracklet in the set {ap} selected in the previous stage is associated with a tracklet bq in the third block that satisfies the condition of Equation (2) below.

Here, D(,) indicates the degree of correspondence between two tracklets and is calculated from the motion similarity and the appearance similarity between the tracklets.

The motion similarity MotionMatch(t1, t2) assumes that the person moves linearly during the short interval between tracklets t1 and t2. Tracklet t1 is extended to t1' up to the start time of t2, and the similarity is calculated using the Euclidean distance, as shown in FIG. 4.
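The motion similarity can be sketched as follows. The text only states that t1 is extended linearly to the start of t2 and compared by Euclidean distance; estimating the velocity from the last two points of t1, the exponential mapping from distance to a score, and the parameter `sigma` are assumptions of this sketch.

```python
import math
from typing import List, Tuple

Point = Tuple[int, float, float]  # (frame, x, y)


def motion_match(t1: List[Point], t2: List[Point], sigma: float = 20.0) -> float:
    """Extrapolate t1 linearly to the start of t2 and score the gap.

    t1 must contain at least two points so that a velocity can be estimated.
    """
    (f0, x0, y0), (f1, x1, y1) = t1[-2], t1[-1]
    f2, x2, y2 = t2[0]
    # Velocity per frame estimated from the last two points of t1.
    vx = (x1 - x0) / (f1 - f0)
    vy = (y1 - y0) / (f1 - f0)
    # Predicted end point of the extended tracklet t1' at the start frame of t2.
    px = x1 + vx * (f2 - f1)
    py = y1 + vy * (f2 - f1)
    dist = math.hypot(px - x2, py - y2)  # Euclidean distance
    return math.exp(-dist / sigma)       # assumption: map distance into (0, 1]


t1 = [(8, 0.0, 0.0), (9, 2.0, 0.0)]
on_path = [(12, 8.0, 0.0)]    # exactly where linear motion predicts
off_path = [(12, 8.0, 30.0)]  # 30 pixels away from the prediction
print(motion_match(t1, on_path), motion_match(t1, off_path))
```

A tracklet that starts where the linear extension predicts receives the maximum score, and the score decays as the start point moves away from the prediction.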

The appearance similarity AppearanceMatch(t1, t2) is calculated by selecting a representative appearance of the person from each tracklet, extracting features from the two appearances, and comparing them. As the representative appearance, the person located between the tracklets may be selected, as shown in FIG. 4. For feature extraction and comparison, the method described in K. Zhang, L. Zhang, and M. H. Yang, "Real-time Compressive Tracking," European Conference on Computer Vision, pp. 866-879, 2012, can be applied.
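The sketch below shows one way the two similarities could be combined into D. The product form follows the dependent claim that uses the product of the object similarity and the motion similarity. The cosine comparison of feature vectors is a stand-in chosen for brevity, not the compressive-tracking features cited above, and the function names are hypothetical.

```python
import math
from typing import Sequence


def appearance_match(f1: Sequence[float], f2: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors (stand-in comparison)."""
    dot = sum(a * b for a, b in zip(f1, f2))
    norm = math.sqrt(sum(a * a for a in f1)) * math.sqrt(sum(b * b for b in f2))
    return dot / norm if norm > 0 else 0.0


def correspondence(motion: float, appearance: float) -> float:
    """D as the product of motion similarity and appearance similarity."""
    return motion * appearance


same = appearance_match([1.0, 0.0], [1.0, 0.0])
different = appearance_match([1.0, 0.0], [0.0, 1.0])
print(same, different, correspondence(0.5, same))
```

With a product, a pair of tracklets scores highly only when both the motion and the appearance agree.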

If the second block contains no tracklet extraction result corresponding to a person detection result, persons and tracklets can be extracted by performing the same processing as steps S101 to S104 described with reference to FIG. 2.
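The two-stage selection performed by the association unit can be sketched as follows, assuming the condition stated in the summary and claims: a third-block tracklet bq is accepted for ap when D(ap, bq) is at least D(ap, b) for every other third-block tracklet b, and at least D(a, bq) for every first-block tracklet a. Tracklets are represented by hypothetical string IDs, and the similarity values are made-up illustration data.

```python
from typing import Callable, Dict, List, Optional

Similarity = Callable[[str, str], float]


def associate(second: List[str], first: List[str], third: List[str],
              end_frame: Dict[str, int], i: int, D: Similarity) -> Dict[str, Optional[str]]:
    """Two-stage selection around the frame of interest i."""
    # Stage 1: keep only second-block tracklets that end right before frame i.
    candidates = [a for a in second if end_frame[a] == i - 1]
    result: Dict[str, Optional[str]] = {}
    for a in candidates:
        result[a] = None
        if not third:
            continue
        # Stage 2a: the third-block tracklet most similar to a.
        b = max(third, key=lambda q: D(a, q))
        # Stage 2b: no first-block tracklet may match b better than a does.
        if all(D(a, b) >= D(f, b) for f in first):
            result[a] = b
    return result


# Made-up similarity table for illustration only.
scores = {("a1", "b1"): 0.9, ("a1", "b2"): 0.2,
          ("a2", "b1"): 0.1, ("a2", "b2"): 0.5,
          ("f1", "b1"): 0.4, ("f1", "b2"): 0.8}


def D(x: str, y: str) -> float:
    return scores.get((x, y), 0.0)


matches = associate(second=["a0", "a1", "a2"], first=["f1"], third=["b1", "b2"],
                    end_frame={"a0": 3, "a1": 9, "a2": 9}, i=10, D=D)
print(matches)  # prints {'a1': 'b1', 'a2': None}
```

Here a0 is skipped because it ended well before frame i-1, and a2 stays unmatched because the first-block tracklet f1 fits b2 better. The sketch does not enforce a one-to-one assignment between second-block and third-block tracklets.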

The integration unit 17 integrates each pair of tracklets associated by the association unit 16 into a new tracklet. The management unit 13 manages the detection results of the detection unit 11 and the tracklets integrated by the integration unit 17.

The output unit 18 outputs the results for the associated persons and tracklets. The tracklet results of the persons may be displayed superimposed on the moving image, or only the tracklet result of a desired person may be output. Superimposing the tracklet results is useful because it makes even complex trajectories clear to the user. Alternatively, the ID and position of each person may be output for the frame of interest only. As described above, the IDs need only be assigned so that persons can be distinguished from one another; they need not identify who each person actually is.

As described above, the moving object tracking system 1 according to the first embodiment can keep the computational cost low even when a plurality of moving objects move in complex ways. In particular, because the association unit 16 uses the tracklets of the second block and integrates them with the tracklets of the first and third blocks, no redundant calculation is needed when associating tracklets, which reduces the computational cost.

(Second embodiment)
FIG. 5 is a block diagram showing the moving object tracking system 2 according to the second embodiment. The moving object tracking system 2 according to this embodiment includes an acquisition unit 10, a detection unit 11, an extraction unit 12, a management unit 13, a setting unit 14, a division unit 15, an association unit 16, an interpolation unit 20, an integration unit 17, and an output unit 18. It differs from the first embodiment in that it includes the interpolation unit 20.

The interpolation unit 20 interpolates the position information of each person in the first block from the tracklets of that person associated between the second block and the third block. Here, position information includes any one of the position of the person in the frame, the size of the person in the frame, and the movement trajectory (tracklet) preceding the frame of interest in time-series order.

Specifically, the interpolation unit 20 fills in the frames in which the person was not detected between the tracklets associated by the association unit 16. Let the associated tracklets be (t1, t2), let the position, size, and frame number of the person at the end of t1 be ([x1,y1],[h1,w1],f1), and let those at the start of t2 be ([x2,y2],[h2,w2],f2). With df = f2-f1, the position and size ([xs,ys],[hs,ws]) of the person in each frame f1+s in [f1+1, ..., f1+df-1] can be estimated by Equation 5.
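A minimal sketch of the interpolation, assuming that Equation 5 is plain linear interpolation between the end state of t1 and the start state of t2, that is, each quantity v is estimated as v1 + (s/df)(v2 - v1). The function name and the flat (x, y, h, w) tuple layout are assumptions for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, h, w)


def interpolate(box1: Box, f1: int, box2: Box, f2: int) -> List[Tuple[int, Box]]:
    """Linearly interpolate position and size for frames f1+1 .. f2-1."""
    df = f2 - f1
    out = []
    for s in range(1, df):
        r = s / df  # fraction of the gap covered at frame f1+s
        box = tuple(a + r * (b - a) for a, b in zip(box1, box2))
        out.append((f1 + s, box))
    return out


# Person last seen at frame 10 and seen again at frame 14.
filled = interpolate((0.0, 0.0, 80.0, 30.0), 10, (8.0, 4.0, 80.0, 30.0), 14)
for frame, box in filled:
    print(frame, box)
```

For the example above, the three missing frames 11 to 13 receive positions evenly spaced between the two detections, while the size stays constant because it is the same at both ends.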

The output unit 18 outputs the position of each person in the frame of interest using the association result of the association unit 16 and the estimation result of the interpolation unit 20. The size and the association results for frames preceding the frame of interest in time-series order may be output together.

The integration unit 17 may use the association result of the association unit 16 and the interpolation result of the interpolation unit 20 to integrate the associated tracklets and the interpolated person regions into a new tracklet.

Cases in which interpolation is specifically required are now described with reference to FIG. 6.

FIG. 6 is a schematic diagram showing the occlusion that occurs when a person is hidden behind a building (upper row) and when persons pass each other (lower row). Each row shows the change over time from the left drawing to the right.

For example, when persons pass each other (lower row of FIG. 6), the tracklets before and after the occlusion can be associated based on walking speed and the like. The occlusion time can be estimated by determining whether the moving object is a person or a vehicle and learning the average speed of each in advance.

The same applies to occlusion by a building (upper row of FIG. 6). For example, the occlusion time can be estimated by acquiring in advance whether the imaged area contains a building large enough to hide a person (or, if the moving object is a vehicle, large enough to hide the vehicle), or by acquiring the position information of such buildings.

An appropriate length for the first block may be set from the estimated or assumed occlusion time. For example, for typical surveillance video, the length M is set to twice the frame rate (that is, the number of frames in two seconds). Setting the occlusion time in this way makes it possible to adapt to environmental changes such as buildings and traffic volume, which makes person tracking more robust.
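The arithmetic behind the block length can be sketched as below. The default of two seconds reproduces the example of M being twice the frame rate; estimating the occlusion time as occluder width divided by a learned average speed is an assumption of this sketch, as are the function and parameter names.

```python
def occlusion_time(occluder_width_m: float, average_speed_mps: float) -> float:
    """Seconds needed to pass an occluder at a learned average speed (assumed model)."""
    return occluder_width_m / average_speed_mps


def first_block_length(frame_rate: float, seconds: float = 2.0) -> int:
    """Number of frames M that covers the assumed occlusion time."""
    return round(frame_rate * seconds)


m_default = first_block_length(30.0)                             # 2 s at 30 fps
m_building = first_block_length(25.0, occlusion_time(6.0, 1.5))  # 4 s at 25 fps
print(m_default, m_building)  # prints 60 100
```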

As described above, the moving object tracking system 2 according to the second embodiment can keep the computational cost low even when a plurality of moving objects move in complex ways. In particular, interpolation makes it possible to output the tracking result for the frame of interest even when the overall tracking result is not yet known.

(Hardware configuration)
The moving object tracking system of the above embodiments includes a control device such as a CPU (Central Processing Unit), storage devices such as a ROM and a RAM, an external storage device such as an HDD or an SSD, a display device such as a display, input devices such as a mouse and a keyboard, and an imaging device such as a camera, and can be realized with the hardware configuration of an ordinary computer.

The program executed by the device of the above embodiments is provided by being incorporated in advance in a ROM or the like.

The program executed by the device of the above embodiments may also be provided as a file in an installable or executable format stored on a computer-readable storage medium such as a CD-ROM, a CD-R, a memory card, a DVD, or a flexible disk (FD).

The program executed by the device of the above embodiments may also be stored on a computer connected to a network such as the Internet and provided by being downloaded via the network. The program may also be provided or distributed via a network such as the Internet.

The program executed by the device of the above embodiments has a module configuration for realizing each of the units described above on a computer. In terms of actual hardware, for example, the control device reads the program from the external storage device into the storage device and executes it, whereby each of the units is realized on the computer.

The present invention is not limited to the above embodiments as they are; at the implementation stage, the constituent elements can be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the constituent elements disclosed in the above embodiments. For example, some constituent elements may be removed from the full set shown in an embodiment. Constituent elements of different embodiments may also be combined as appropriate.

For example, the steps in the flowcharts of the above embodiments may be executed in a different order, several may be executed simultaneously, or they may be executed in a different order on each run, as long as this does not conflict with their nature.

As described above, the moving object tracking system 1 according to the embodiment can keep the computational cost low even when a plurality of moving objects move in complex ways. In particular, because the association unit 16 uses the tracklets of the second block and integrates them with the tracklets of the first and third blocks, no redundant calculation is needed when associating tracklets.

1, 2 ... moving object tracking system, 10 ... acquisition unit, 11 ... detection unit, 12 ... extraction unit, 13 ... management unit, 14 ... setting unit, 15 ... division unit, 16 ... association unit, 17 ... integration unit, 18 ... output unit, 19 ... storage unit, 20 ... interpolation unit

Claims

An object tracking device comprising:
an acquisition unit that acquires a plurality of frames;
a detection unit that detects an object from the plurality of frames;
an extraction unit that extracts, for the detected object, a first movement trajectory including a first frame, a second movement trajectory composed only of frames before the first frame, and a third movement trajectory composed only of frames subsequent to the first frame; and
an association unit that associates the third movement trajectory with the second movement trajectory when a condition is satisfied that the similarity between the second movement trajectory and the third movement trajectory is equal to or greater than the similarity between the second movement trajectory and any other third movement trajectory, and is equal to or greater than the similarity between the first movement trajectory and the third movement trajectory.

The object tracking device according to claim 1, further comprising an interpolation unit that interpolates position information of the object in and after the first frame from the second movement trajectory and the third movement trajectory associated by the association unit.

The object tracking device according to claim 1, wherein the association unit uses, as the similarity, a similarity of the object and a similarity of the motion of the movement trajectories.

The object tracking device according to claim 3, wherein the association unit uses, as the similarity, the product of the similarity of the object and the similarity of the motion of the movement trajectories.

The object tracking device according to claim 1, wherein the length of the second movement trajectory associated by the association unit is within a predetermined threshold.

An object tracking method comprising:
acquiring a plurality of frames;
detecting an object from the plurality of frames;
extracting, for the detected object, a first movement trajectory including a first frame, a second movement trajectory composed only of frames before the first frame, and a third movement trajectory composed only of frames subsequent to the first frame; and
associating the third movement trajectory with the second movement trajectory when a condition is satisfied that the similarity between the second movement trajectory and the third movement trajectory is equal to or greater than the similarity between the second movement trajectory and any other third movement trajectory, and is equal to or greater than the similarity between the first movement trajectory and the third movement trajectory.

A program for causing a computer to execute:
acquiring a plurality of frames;
detecting an object from the plurality of frames;
extracting, for the detected object, a first movement trajectory including a first frame, a second movement trajectory composed only of frames before the first frame, and a third movement trajectory composed only of frames subsequent to the first frame; and
associating the third movement trajectory with the second movement trajectory when a condition is satisfied that the similarity between the second movement trajectory and the third movement trajectory is equal to or greater than the similarity between the second movement trajectory and any other third movement trajectory, and is equal to or greater than the similarity between the first movement trajectory and the third movement trajectory.