JP2008211781A

JP2008211781A - System for modeling movement of objects in certain environment and method executed by computer

Info

Publication number: JP2008211781A
Application number: JP2008017789A
Authority: JP
Inventors: Christopher R. Wren; Yuri A. Ivanov; Alexander Sorokin; Ishwinder Kaur Banga
Original assignee: Mitsubishi Electric Research Laboratories Inc
Current assignee: Mitsubishi Electric Research Laboratories Inc
Priority date: 2007-02-05
Filing date: 2008-01-29
Publication date: 2008-09-11

Abstract

PROBLEM TO BE SOLVED: To provide a method and system for modeling and predicting the movement of objects in an environment.

SOLUTION: Sequences of temporally and spatially adjacent events sensed by a set of sensors are linked to form a set of tracklets. Each tracklet has an associated starting and terminating location. The tracklets are used to construct a directed graph including starting nodes, terminating nodes, and intermediate nodes connected by edges. The intermediate nodes can be split nodes, where tracklets diverge onto different tracks, and join nodes, where multiple tracklets converge onto a single track. Probabilities are assigned to the edges to model and predict movement of the objects in the environment.

COPYRIGHT: (C)2008, JPO&INPIT

Description

[Related applications]
This application is a continuation-in-part of U.S. patent application Ser. No. 11/565,264, "Surveillance System and Method for Tracking and Identifying Objects in an Environment," filed November 30, 2006, by Ivanov et al.

The present invention relates generally to surveillance systems and object tracking methods, and more particularly to modeling the movement of objects from surveillance data.

Video cameras and relatively simple sensors make it possible to construct mixed-modality surveillance systems for large environments. The sensors cannot identify objects, but they can detect objects within a relatively small area. Identification can be performed from the images of video acquired by the cameras, when such images are available.

The video acquired by such systems can exceed many terabytes of stored data. Clearly, searching stored data collected over many months for a specific object, in a matter of moments, is practically impossible. It is therefore desirable to locate objects using the discrete motion events detected by simple sensors, and furthermore to model the movement of objects from those events so that the movement can be predicted.

In conventional surveillance systems, objects such as people, animals, and vehicles are usually tracked by means of image and video processing. The disadvantage of such systems is that a specific object must be observed by a camera whenever it needs to be tracked and identified. Many surveillance environments, however, require a large number of video cameras to provide the complete coverage needed for accurate operation, and the resulting large number of video streams increases the computational burden on the surveillance system.

It is an object of the present invention to provide a system and method for tracking and identifying moving objects (e.g., people) using a mixed network of various sensors, cameras, and a surveillance database.

A small number of PTZ cameras are arranged in the environment to be placed under surveillance. Even though the number of cameras is relatively small, the amount of video data can exceed many terabytes of storage.

The video cameras can acquire images of only part of the environment. This makes it difficult to track and identify objects with the cameras alone. Even if the camera coverage were complete, the time needed to search the video data would be impractical.

Therefore, the environment also includes a dense arrangement of sensors, which cover essentially all public areas. Each event has an associated sensor identification and time. This makes the total amount of sensor data quite small and easy to process. Sensor activation events are spatially and temporally correlated with the video images to track a specific individual, even when that individual is not continuously seen by any camera.

Directed graphs representing the movement of objects in the environment can be constructed. The graphs include start nodes, end nodes, and intermediate nodes. The intermediate nodes lead to ambiguities. Probabilities are assigned to the edges connecting the nodes to indicate the likelihood that objects start and end their movement at particular locations. The movement of an object can be predicted from these probabilities even when there are ambiguities in the movement and in the graph that represents it.

The embodiments of the invention provide a mixed-modality surveillance system. The system includes a large number of relatively simple sensors and a relatively small number of movable cameras. Compared with conventional surveillance systems, this reduces cost, complexity, network bandwidth, storage, and processing time.

Objects in an environment are tracked by the cameras using the context information available from the sensors. The context information collected over many months can be searched to determine the track of a specific object in a matter of moments, and the corresponding images of the object can then be used to identify it. This is practically impossible with conventional surveillance systems, which must search enormous amounts of video data.

The embodiments also provide a method and system for modeling and predicting the movement of objects by determining probabilities for tracks. Notably, the invention does not require complete tracks prior to analysis. Most conventional surveillance tracking systems require a reliable preprocessing step, which implies high-fidelity sensors and computationally complex methods to resolve all ambiguities.

In contrast, the invention can work with relatively simple (binary) motion sensors and with incomplete tracks that contain ambiguities. The invention resolves the ambiguities by assigning probabilities to the edges of the graphs that represent the tracks.
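As a rough illustration of assigning probabilities to edges, the following sketch estimates them as the relative frequencies of the transitions observed out of each node. The estimator itself, the function names `edge_probabilities` and `most_likely_next`, and the node labels are assumptions made for illustration; the text above does not state how the probabilities are computed.

```python
from collections import Counter, defaultdict


def edge_probabilities(transitions):
    """Estimate P(dst | src) from observed (src, dst) transitions.

    Assumption: a plain relative-frequency estimate, not the patent's method.
    """
    counts = defaultdict(Counter)
    for src, dst in transitions:
        counts[src][dst] += 1
    return {
        src: {dst: n / sum(c.values()) for dst, n in c.items()}
        for src, c in counts.items()
    }


def most_likely_next(probabilities, node):
    """Predict the most probable successor of `node`, or None if unknown."""
    outgoing = probabilities.get(node)
    if not outgoing:
        return None
    return max(outgoing, key=outgoing.get)


# Hypothetical history of which way objects went at one split node.
observed = [
    ("split_1", "office"),
    ("split_1", "office"),
    ("split_1", "exit"),
    ("split_1", "office"),
]
probs = edge_probabilities(observed)
print(probs["split_1"])
print(most_likely_next(probs, "split_1"))
```

With such a table, a prediction at an ambiguous node reduces to a lookup of the outgoing edge with the highest estimated probability.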

Surveillance System
As shown in FIG. 1, a surveillance system in which a tracking module is implemented according to an embodiment of the invention includes a relatively large set of wirelessly networked sensors (dots) 101 and a relatively small set of pan-tilt-zoom (PTZ) cameras (triangles) 102. The ratio of sensors to cameras can be very large, e.g., 30:1 or greater.

Sensors
The sensors can be motion sensors, as well as door sensors, elevator sensors, heat sensors, pressure sensors, and acoustic sensors. Motion sensors, such as infrared sensors, can detect the movement of objects in the vicinity of the sensor. Door sensors can detect door opening and closing events, which typically indicate a person passing through the doorway. Elevator sensors can similarly indicate the arrival or departure of people in an environment. Acoustic sensors, e.g., transducers and microphones, can also detect activity in an area. Sensors can be mounted on light switches in the environment, or on the power switches of office equipment. Pressure sensors in mats can also indicate passing traffic. Security sensors, such as badge readers at entryways into the environment, can also be incorporated.

Each sensor is relatively small, e.g., 3 × 5 × 6 cm for a motion sensor. In a preferred embodiment, the sensors are densely arranged in the public areas, spaced about ten meters or less apart, and mounted on ceilings, walls, or floors. It should be noted that the spatial arrangement and density of the sensors can be adapted to suit a particular environment and the traffic flow in that environment. For example, high-traffic areas can be populated with sensors more densely than low-traffic areas.

In one embodiment of the invention, the set of sensors communicates with a processor 110 (see FIG. 1) using radio signals according to the industry standard IEEE 802.15.4. This is the physical layer typically used by Zigbee-type devices. Each battery-operated sensor consumes approximately 50 μA in detection mode and 46 mA when communicating. The communication interval caused by an activation is about 16 ms. It should be noted that the sensors can also be hard-wired or use other communication techniques.

When an event is detected by any of the sensors 101, a sensor identification (SID) and a timestamp (TS) corresponding to the event are broadcast, or otherwise sent, to the processor 110. The processor stores the sensor data in a memory as a surveillance database. The identification inherently indicates the location of the sensor, and therefore the location of the event that caused the activation. Recording an event takes only a small number of bytes. Therefore, the total amount of sensor data collected over a long period of time is essentially negligible when compared with the video data.
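To make the "small number of bytes" concrete, the sketch below shows one possible event record. The field widths (a 2-byte SID and an 8-byte millisecond timestamp), the byte order, and the names `SensorEvent`, `RECORD`, `pack_event`, and `unpack_event` are hypothetical; the text above specifies only that an event carries a sensor identification and a timestamp.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout: little-endian, unsigned 16-bit SID + unsigned 64-bit TS.
RECORD = struct.Struct("<HQ")


@dataclass(frozen=True)
class SensorEvent:
    sensor_id: int     # SID: identifies the sensor, and hence its location
    timestamp_ms: int  # TS: activation time in milliseconds (assumed unit)


def pack_event(event: SensorEvent) -> bytes:
    """Serialize one event into a fixed-size record."""
    return RECORD.pack(event.sensor_id, event.timestamp_ms)


def unpack_event(record: bytes) -> SensorEvent:
    """Restore an event from its fixed-size record."""
    sensor_id, timestamp_ms = RECORD.unpack(record)
    return SensorEvent(sensor_id, timestamp_ms)


event = SensorEvent(sensor_id=101, timestamp_ms=1_200_000_000_000)
record = pack_event(event)
print(len(record), unpack_event(record) == event)
```

Under this assumed layout each event occupies 10 bytes, so one million events take about 10 MB, which is tiny next to the terabytes of video mentioned earlier.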

The set of cameras is used to acquire video data (image sequences). Each image has a unique camera identification (CID, or location) and a frame number (FN). As used herein, the frame number is synonymous with time; that is, the time can be computed directly from the frame number. In addition, every time instant is associated with a set of pan-tilt-zoom parameters for each camera, so that the visible portion of the scene in the vicinity of a sensor at any time instant can be computed during a database query.

Cameras are typically ceiling-mounted at strategic locations to provide maximum surveillance coverage, for example, at locations where all traffic in the environment must pass at some time. The PTZ cameras 102 can be oriented and focused in any general direction. Although not required, the detection of an event can cause any nearby video camera to be directed at the scene in the vicinity of the sensor to acquire video images. The ID and TS of the associated sensor can later be used to retrieve a small sequence of images, i.e., a video clip, related to the event. It should also be noted that image acquisition can be suspended to reduce the required storage when no events are detected in the vicinity of the sensors near a particular camera.

It is a challenge to review video data acquired over many months of operation in order to locate specific events or the tracks of specific objects, and to identify those objects.

Tracklets and Tracklet Graphs
As shown in FIG. 2, one embodiment of the invention uses a set of tracklets 210. A corresponding tracklet graph 200 is aggregated from the set of tracklets 210. A tracklet is formed by linking a sequence of temporally adjacent discrete events at a sequence of spatially adjacent sensors 101. The tracklet is the basic building block of the tracklet graph 200. The term "discrete event" is used to indicate that motion near a sensor can be signaled with a single time-stamped bit. This differs from a camera, which streams a continuous signal in the form of video.

We refer to the process of finding the immediate predecessor or successor event of a current event as linking. To improve the performance of the system, the tracklets can be linked and stored periodically, for example at the end of each business day or every hour. The pre-stored tracklets are then readily available when a search needs to be performed.
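The linking of events into tracklets can be sketched as follows. This is a minimal illustration under stated assumptions: events are `(timestamp, sensor_id)` tuples, `adjacent` is a hypothetical map from each sensor to its spatial neighbors, and `max_gap` stands in for the predetermined time interval. A chain is extended only while each event has exactly one successor and that successor has exactly one predecessor, so ambiguous points end a tracklet.

```python
def link_events(events, adjacent, max_gap):
    """Find predecessor/successor relations between events.

    An event j succeeds event i if it occurs later, within max_gap,
    at a sensor adjacent to the sensor of i.
    """
    events = sorted(events)
    pred = [[] for _ in events]
    succ = [[] for _ in events]
    for i, (t1, s1) in enumerate(events):
        for j in range(i + 1, len(events)):
            t2, s2 = events[j]
            if t2 - t1 > max_gap:
                break  # events are time-sorted, so no later event can link
            if t2 > t1 and s2 in adjacent.get(s1, ()):
                succ[i].append(j)
                pred[j].append(i)
    return events, pred, succ


def build_tracklets(events, pred, succ):
    """Group events into maximal unambiguous chains (tracklets)."""
    tracklets = []
    for i in range(len(events)):
        # Skip events that are the interior or tail of an existing chain.
        if len(pred[i]) == 1 and len(succ[pred[i][0]]) == 1:
            continue
        chain = [i]
        while len(succ[chain[-1]]) == 1 and len(pred[succ[chain[-1]][0]]) == 1:
            chain.append(succ[chain[-1]][0])
        tracklets.append([events[k] for k in chain])
    return tracklets


# Hypothetical corridor: sensors A and C both neighbor B, and B neighbors D.
adjacent = {"A": {"B"}, "C": {"B"}, "B": {"A", "C", "D"}, "D": {"B"}}
events = [(0, "A"), (0, "C"), (2, "B"), (4, "D")]
print(build_tracklets(*link_events(events, adjacent, max_gap=3)))
```

In this example the two single-event tracklets at A and C converge at B, where a new tracklet begins; the point of convergence is exactly the kind of ambiguity that the graph nodes encode.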

In the assembled tracklet graph 200, the tracklets are directed edges connected at the nodes of the graph. The nodes encode the relationship of each tracklet to its immediate successor or predecessor tracklets. A node can have one of four types: track start 201, track join 202, track split 203, and track end 204.

Track Start
A track start node represents the first event in a tracklet, for which no preceding event can be associated with that sensor within a predetermined time interval. As used herein, "preceding" means an earlier event at an adjacent sensor. The time interval can be bounded approximately by the time it takes a pedestrian to walk from one sensor to the next adjacent sensor.

Track Join
A track join node represents an event in the tracklet graph for which there are multiple preceding events that can be associated with that sensor within the predetermined time interval. That is, the track join node represents the convergence of multiple predecessor tracklets onto a single successor tracklet. A single valid predecessor tracklet cannot exist at such a node, because it would already have been linked into the current tracklet.

Track Split
A track split node represents an event in a tracklet for which there are multiple successor tracklets that can be associated with that sensor within the predetermined time interval. That is, the track split node represents the divergence of a single predecessor tracklet into multiple successor tracklets. A single valid successor tracklet cannot exist at such a node, because it would already have been linked into the current tracklet.

Track End
A track end node represents the last event in a tracklet, one that cannot be associated with any succeeding event within the predetermined time interval. All tracklets together form a set of graphs, each of which represents the inherent ambiguity about the actual tracks traveled by the objects.
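The four node types described above follow from how many tracklets enter and leave a node. The sketch below is an illustrative reading of those definitions; the names `NodeType` and `classify_node` are hypothetical, and returning a set (so that a node can be both a join and a split) is an assumption not stated in the text.

```python
from enum import Enum


class NodeType(Enum):
    START = "track start"
    JOIN = "track join"
    SPLIT = "track split"
    END = "track end"


def classify_node(num_predecessors, num_successors):
    """Return the set of node types implied by the tracklet counts.

    A point with exactly one predecessor and one successor is not a
    node at all: it lies in the interior of a single tracklet.
    """
    types = set()
    if num_predecessors == 0:
        types.add(NodeType.START)
    elif num_predecessors > 1:
        types.add(NodeType.JOIN)
    if num_successors == 0:
        types.add(NodeType.END)
    elif num_successors > 1:
        types.add(NodeType.SPLIT)
    return types


print(classify_node(0, 1))  # first event of a tracklet
print(classify_node(2, 1))  # two tracklets converge onto one
print(classify_node(1, 2))  # one tracklet diverges into two
print(classify_node(1, 0))  # last event of a tracklet
```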

A tracklet graph is a set of tracklets associated with events that can be aggregated according to temporal and spatial constraints. The constraints can either be imposed by the user or be "learned" over time.

The tracklet graph in FIG. 2 has two starting tracklets, which subsequently converge onto a single track. The converged tracklet then splits twice, resulting in four end points. The tracklet graph is the core representation of the events that we use for tracking objects.

Extended Tracklet Graph
For extended tracking, in cases where an object disappears from the view of the sensor network, two tracklet graphs that are spatially and temporally adjacent can also be aggregated. This situation occurs frequently in environments where tracked people leave a public area, such as a hallway, and enter an area such as an office. The event of entering the office terminates the preceding tracklet at a track end node when the person is no longer sensed or observed. On leaving the office, the person can be tracked again in a successor graph. It can be assumed that a person who enters an office must eventually leave it, even if only after an extended period of, say, several hours. In this case, the spatial constraint can be strictly enforced, while the temporal constraint can be relaxed.

The graphs are aggregated under the condition that one of the track end nodes of a tracklet in the preceding graph has a timestamp that is earlier than the timestamp of at least one track start node of a tracklet in the successor graph.
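The aggregation condition can be sketched as a simple predicate. The representation of nodes as `(timestamp, sensor_id)` tuples, the function name `can_aggregate`, and the `same_place` callback used to model the strictly enforced spatial constraint are assumptions made for illustration.

```python
def can_aggregate(end_nodes, start_nodes, same_place=lambda a, b: a == b):
    """Decide whether a successor graph may be appended to a preceding graph.

    end_nodes:   (timestamp, sensor_id) of track end nodes in the preceding graph
    start_nodes: (timestamp, sensor_id) of track start nodes in the successor graph
    same_place:  hypothetical spatial test; by default the sensors must match
    """
    return any(
        t_end < t_start and same_place(s_end, s_start)
        for t_end, s_end in end_nodes
        for t_start, s_start in start_nodes
    )


# A person enters an office at 13:05 and leaves at 16:40 (minutes since midnight).
entered = [(785, "door_12")]
left = [(1000, "door_12")]
print(can_aggregate(entered, left))   # True: later start at the same door
print(can_aggregate(left, entered))   # False: the start precedes the end
```

Note that the predicate places no upper bound on the elapsed time, which reflects the relaxed temporal constraint described above.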

Determining Sensor Visibility
One goal of the invention is to determine from which of the cameras the area in the vicinity of a given sensor can be seen. This minimizes the amount of irrelevant images presented to the user.

To achieve this goal, all cameras in the system are calibrated to the locations of the sensors. In our system, each sensor is associated with a range of pan, tilt, and zoom parameters for each camera from which the event that caused the sensor activation is visible. If the PTZ parameters of each camera are stored in the surveillance database every time the camera orientation changes, then, when a tracklet is retrieved from the database, the "visible" range of each sensor activation can be compared with the PTZ parameters of each camera at the corresponding time. If the PTZ parameters of a camera fall within the visible range of the sensor, the sensor activation (event) is considered visible, and a sequence of images from the corresponding camera is retrieved as video evidence. As described below, this evidence is subsequently displayed to the user during the tracklet selection process using a user interface.
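The visibility test described above can be sketched as a lookup in a per-camera log of PTZ changes followed by a range check. The log format, the `(low, high)` range representation, and the names `ptz_at` and `is_visible` are hypothetical; only the idea of comparing stored PTZ parameters with a per-sensor visible range comes from the text.

```python
from bisect import bisect_right


def ptz_at(ptz_log, t):
    """Return the (pan, tilt, zoom) setting in effect at time t.

    ptz_log is a time-sorted list of (time, (pan, tilt, zoom)) entries,
    one per change of camera orientation. Returns None before the first entry.
    """
    times = [entry_time for entry_time, _ in ptz_log]
    i = bisect_right(times, t)
    return ptz_log[i - 1][1] if i else None


def is_visible(ptz, visible_range):
    """Check whether a PTZ setting falls inside a sensor's visible range.

    visible_range holds one (low, high) pair each for pan, tilt, and zoom.
    """
    if ptz is None:
        return False
    return all(low <= value <= high for value, (low, high) in zip(ptz, visible_range))


# Hypothetical camera log and calibration for one sensor.
camera_log = [(0, (10.0, 0.0, 1.0)), (100, (90.0, 0.0, 1.0))]
sensor_range = ((0.0, 30.0), (-10.0, 10.0), (1.0, 4.0))

print(is_visible(ptz_at(camera_log, 50), sensor_range))   # camera faces the sensor
print(is_visible(ptz_at(camera_log, 150), sensor_range))  # camera has turned away
```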

Human-Guided Tracking
The human-guided tracking and search task that we solve with our system can be illustrated with a simple scenario.

A laptop was reported stolen from an office between 1:00 pm and 2:00 pm. No direct camera coverage was available for the office. The user needs to find all the people who could have passed through the office during that time and, if possible, identify them and collect evidence connecting an individual with the event. In such a situation, the operator would want to identify all tracks that originated at the door of the office, and identify the individuals by examining all available video evidence.

General Principles of Object Tracking with a Mixed-Modality Sensor Network
Track start and track end nodes are the unambiguous beginnings and ends of complete tracks. However, the ambiguities of track splits and track joins cannot be resolved automatically using the sensed events alone. The split and join ambiguities are due to the perceptual limitations of the sensor network, which cannot sense any feature other than an event at or near a sensor.

In such a situation, an event in which two people cross paths in a hallway causes the system to generate at least four tracklets, containing the events for each person before and after the possible crossing point. Without further information, there is an inherent ambiguity in the interpretation of this set of tracklets. For example, the two people could either pass each other, or meet and return the way they came. Mapping identities to these tracks and maintaining their continuity with absolute certainty is impossible from the events alone.

In view of these ambiguities, we make the following two simplifying observations.

The user does not need to disambiguate the entire graph. The user only needs to disambiguate the track split node that ends the selected tracklet, for forward graph traversal, or the track join node that starts the selected tracklet, for backward graph traversal.

Resolving track join and track split ambiguities can be simplified by considering the video clips associated with each candidate track.

The first observation significantly reduces the number of tracklets that need to be considered as possible candidates for aggregation into a track. In one embodiment, the user tracks only one person at a time. The system therefore only needs to resolve the movement of that person, while effectively ignoring other events. For the example of two people crossing paths, we assume that one tracklet is selected before the crossing point, so that only two tracklets, rather than all four, need to be considered as possible continuations. With this iterative, focused approach to tracking and track disambiguation, we can reduce the complexity of the problem from potentially exponential to linear.

The second observation implies that, when a split or join ambiguity occurs, the system can correlate the times and locations of the tracklets with video from the nearest cameras, and display the corresponding video clips to the user so that the user can decide which tracklet is the likely continuation of the aggregated track.

It may be possible to develop an automatic tracking procedure that attempts to infer the dynamics of object movement using only the network of sensors. However, any such procedure will inevitably make mistakes. In surveillance applications, committing to the results of a tracking process that is even slightly inaccurate can be quite costly.

Therefore, our tracking method uses a human-guided technique, with the tracklet graph as the underlying context information representing the tracking data. It should be noted that the sensor data on which the tracking and searching are based is very small, and can therefore be processed quickly, particularly in comparison with conventional searching of video data.

The main focus of our system is to search a large amount of video data efficiently, in a very short time, by using the events. For this purpose, we are primarily concerned with keeping the false negative rate low; a low false positive rate is a distant secondary goal. To achieve these goals, we adopt the mechanism for track aggregation described below.

Tracklet Aggregation Process
The human-guided tracking process of our system begins with selecting a subset of one or more sensors at which we expect a track to begin and, optionally, a time interval. For example, in our system, where the sensors are placed in public areas outside the offices, the user can use the floor plan to select the subset of sensors that would likely be activated when a person leaves a particular office.

By performing a fast search in the database of events, we can identify every instance of a tracklet that starts at one of the selected sensors. At this point, the user can select a single instance of a tracklet to explore in greater detail. The search can be expedited by specifying the approximate time at which the track starts.
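The search for candidate tracklets can be sketched as a filter over pre-stored tracklets. The in-memory list, the `(timestamp, sensor_id)` event tuples, and the function name `find_tracklets` are assumptions for illustration; a deployed system would presumably query an indexed database instead.

```python
def find_tracklets(tracklets, start_sensors, interval=None):
    """Return tracklets whose first event is at one of the selected sensors.

    tracklets:     list of tracklets, each a time-ordered list of
                   (timestamp, sensor_id) events
    start_sensors: set of sensor IDs chosen by the user on the floor plan
    interval:      optional (earliest, latest) bounds on the start time
    """
    matches = []
    for tracklet in tracklets:
        start_time, start_sensor = tracklet[0]
        if start_sensor not in start_sensors:
            continue
        if interval is not None and not interval[0] <= start_time <= interval[1]:
            continue
        matches.append(tracklet)
    return matches


# Hypothetical stored tracklets; times are minutes since midnight.
stored = [
    [(790, "door_12"), (791, "hall_3"), (792, "hall_4")],
    [(800, "hall_9"), (801, "hall_8")],
    [(905, "door_12"), (906, "hall_3")],
]
# Tracks that started at the office door between 1:00 pm and 2:00 pm.
print(find_tracklets(stored, {"door_12"}, interval=(780, 840)))
```

In the stolen laptop scenario above, this corresponds to selecting the sensors outside the office door and restricting the search to the reported hour.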

When the first tracklet is selected, the corresponding tracklet graph is constructed. The graph of the aggregated track includes multiple tracklets associated with a sequence of temporally and spatially adjacent events. The selected tracklet is drawn on the floor plan up to the point where there is an end, a split, or a join, as shown in FIG. 3. When an end point is reached, the track 300 is complete. The location of the person along the track 300 on the floor plan is indicated visually in the user interface by a thick line 301 on the track 300.

If the end of the tracklet has a split node or a join node, the track is not terminated, and the tracklet aggregation process is repeated, using the tracklet graphs, to aggregate candidate tracklets into a coherent track. During this process, at each ambiguity (split node or join node) in the graph, the user selects the subgraph to traverse further. To identify people and select the correct successor tracklet, the available video images from cameras directed at any of the sensor activations belonging to the corresponding tracklets can be displayed. Automated techniques, such as object and face recognition, can also be used for this identification.

This process is shown in FIG. 4 using a selection graph. In the selection graph, the video images 401 represent the video clips available from cameras oriented toward sensors contained in the corresponding tracklets. A diamond 410 indicates an ambiguity and the possible competing tracklets that follow it. The edges in the graph indicate that a tracklet exists.

Note that the tracklet selection graph of FIG. 4 is related to the tracklet graph of FIG. 2 but is not identical to it. In fact, the graph of FIG. 4 represents a general selection graph that can be used to traverse the tracklet graph either forward in time (as shown) or backward. In the forward case, the start and end nodes of the selection graph in FIG. 4 have the same meaning as in the tracklet graph, but the diamonds represent only splits. Joins are irrelevant to forward selection because they present no alternatives when moving forward. In contrast, when the selection graph is used for backward traversal, its start and end nodes have the opposite meaning to those of the tracklet graph, and the diamonds represent only joins.

In either case, the tracklet selection graph represents the set of tracks through the tracklet graph that begin with the initially selected tracklet, indicated at start node 201, and that can be traversed with the available camera frames 401. Because the ambiguities are known, at each such point the system can present the set of ambiguous tracklets to the user for disambiguation.

For example, in the first step, the ambiguity 410 represents a three-way split from the current node. The leftmost tracklet leads to two camera views 431. The middle tracklet ends without any camera view. The third tracklet has one camera view and then leads to a two-way split. Each of these tracklets can be drawn on the floor plan. After a selection is made, the rejected tracklets are removed from the floor plan. The process continues until a track end 204 is encountered.

When the end of a track is encountered, the track aggregation process can terminate. However, if the user has reason to believe that the actual track continues beyond the end point, the tracklet graph extension mechanism described above is used. The system searches the database within a predetermined time interval for new tracklets that begin at the location where the track terminated. If such tracklets are found, the corresponding video clips are identified and displayed to the user in the tracklet selection control panel, as described below. When the user selects an initial tracklet for the extended segment of the track, that tracklet is appended to the end of the aggregated track, and a new tracklet graph starting with the selected tracklet is constructed. The selection process then continues iteratively, as before, to extend the complete track of the object further. In a complete track, all join and split nodes have been removed, and the track contains only a single start tracklet and a single end tracklet.

User Interface
As shown in FIG. 5, in one embodiment the user interface includes five main panels: a floor plan 501, a timeline 502, a video clip bin 503, a tracklet selector 504, and a camera view panel 505.

The floor plan is as shown in FIG. 3. The location of the person along the track 300 on the floor plan is indicated by a "bulge" 301 (dashed line) in the track 300. For each sensor, the timeline 502 shows the events. Each row in the timeline corresponds to one sensor, and time progresses from left to right. The vertical line 510 indicates the "current" playback time. Menus and icons 520 can be used to set the current time. A "knob" 521 can be used to adjust the playback speed. The timeline can be moved forward and backward by dragging the line with the mouse. Short line segments 200 represent tracklets, and the line 300 represents the resolved track; see FIG. 3.

The video clip bin shows images from the clips (image sequences) selected for object identification. In essence, the series of images collected in the video clip bin for a track is the video evidence associated with that track and object.

The tracklet selection control shows the current state of the decision graph of FIG. 4.

Images corresponding to the current time and the selected location are shown in the camera view panel 505. The images can be selected by the user or selected automatically by a camera scheduling procedure. The scheduling procedure can be invoked during playback of the clips to form the video clip bin 503.

Tracking Method
In an embodiment of the invention, the tracking process has two phases: a phase that records surveillance data, and a phase that retrieves the data in order to track objects.

The recording phase is shown in FIG. 6, which illustrates a method for storing sensor data in a surveillance database 611. The surveillance database stores events 103 acquired by a set of sensors 101. Sequences of events that are temporally and spatially adjacent, for a selected subset of the sensors, are linked 630 to form a set of tracklets 631. Each tracklet has a tracklet start node and a tracklet end node. The tracklets are also stored in the surveillance database.
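The linking step can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of FIG. 6: the `Event` record, the `neighbors` adjacency map, and the `max_gap` threshold are hypothetical names introduced here, and the greedy rule ignores the split and join ambiguities that the tracklet graph is designed to represent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    sensor: str  # sensor id; it doubles as the location of the event
    t: float     # activation time in seconds


def link_tracklets(events, neighbors, max_gap=3.0):
    """Greedily chain events that are adjacent in both space and time.

    neighbors maps a sensor id to the set of sensor ids assumed to be
    spatially adjacent to it. max_gap is an assumed time threshold.
    """
    tracklets = []  # each tracklet is a list of events in time order
    for ev in sorted(events, key=lambda e: e.t):
        for tr in tracklets:
            last = tr[-1]
            close_in_time = 0 < ev.t - last.t <= max_gap
            close_in_space = (ev.sensor == last.sensor
                              or ev.sensor in neighbors.get(last.sensor, ()))
            if close_in_time and close_in_space:
                tr.append(ev)  # extend the first matching tracklet
                break
        else:
            tracklets.append([ev])  # no match: the event starts a new tracklet
    return tracklets


events = [Event("s1", 0.0), Event("s2", 1.0), Event("s9", 1.5), Event("s3", 2.0)]
neighbors = {"s1": {"s2"}, "s2": {"s1", "s3"}, "s3": {"s2"}}
tracklets = link_tracklets(events, neighbors)
for tr in tracklets:
    # The first and last events play the role of tracklet start and end nodes.
    print(tr[0].sensor, "->", tr[-1].sensor, [e.sensor for e in tr])
```

In this sketch the isolated activation at `s9` becomes its own one-event tracklet, because it is not adjacent to any sensor in the first chain.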

Concurrently with the sensor activations, sequences of images 104 acquired by a set of cameras 102 are recorded in computer storage 612. Each event and image is associated with a camera (location) and a time. As noted above, the PTZ parameters of the cameras can also be determined.

The tracking phase is shown in FIG. 7. This phase includes selecting 620 a subset of sensors where a track is expected to start, finding 625 tracklets that can be used as the starts of tracks, selecting 640 a first tracklet as the start of the track, and performing track aggregation 680.

Track aggregation begins by constructing 650 a tracklet graph 651 for the selected tracklet. The tracklet graph 651 has possible tracklet join nodes, where multiple preceding tracklets merge into a single successor tracklet, and possible tracklet split nodes, where a single preceding tracklet diverges into multiple successor tracklets.

The tracklet graph 651 is traversed iteratively, starting from the initially selected tracklet. Following the graph, the next ambiguous node is identified; images that are temporally and spatially correlated with the sensor activations (events) contained in the candidate tracklets are retrieved from the computer storage 612 and displayed 660; and the next tracklet to be joined to the aggregated track 661 is selected 670.

The process ends when the aggregated track 661 terminates with a tracklet that has a track end node as its end point and all join and split nodes have been removed from the graph.
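The forward traversal described above can be sketched as a loop that follows the graph and asks for a decision only at split nodes. The dictionary layout of `graph` and the `choose` callback, which stands in for the user inspecting the displayed video clips, are assumptions made for this illustration; the graph is assumed to be acyclic because tracklets are ordered in time.

```python
def assemble_track(graph, start, choose):
    """Walk a tracklet graph forward from start until a track end node.

    graph maps a node to a list of (tracklet_id, next_node) pairs; end
    nodes map to an empty list. choose(node, candidates) returns the
    selected pair and is called only when there is real ambiguity.
    """
    track, node = [], start
    while graph.get(node):  # end nodes have no outgoing tracklets
        candidates = graph[node]
        if len(candidates) == 1:
            pick = candidates[0]            # nothing to disambiguate
        else:
            pick = choose(node, candidates)  # split node: ask the user
        tracklet_id, node = pick
        track.append(tracklet_id)
    return track, node


# Hypothetical graph: one start node, one two-way split, two end nodes.
graph = {
    "start": [("t1", "split1")],
    "split1": [("t2", "endY"), ("t3", "endZ")],
    "endY": [],
    "endZ": [],
}
track, end_node = assemble_track(graph, "start", lambda node, cands: cands[1])
print(track, end_node)
```

A join needs no special handling in this forward sketch, since a join node has a single outgoing tracklet and therefore offers no choice.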

Modeling Object Tracks and Predicting Object Movement
So far, we have described a method for determining where an object (e.g., a person) has been, based on discrete motion events. We now want to predict where objects may go, and to identify ambiguous movements of objects, from surveillance data acquired in an environment. In particular, we want to predict movements that may indicate abnormal or suspicious behavior.

We describe methods for modeling the movement of objects in an environment even when high-quality (video) tracking is impossible or impractical. The methods apply to any objects of interest (people, vehicles, and so on) moving in a complex environment. By "complex environment" we mean a large office building, manufacturing plant, or parking structure with perhaps hundreds or thousands of sensors and a similar number of moving objects, such as workers or vehicles. In a complex environment, the likelihood that objects interact with one another is relatively high, and the identities of the objects may be unknown.

First, we contrast high-quality and low-quality tracking. High-quality tracking is the act of processing a continuous stream of sensor data from a collection of sensors, such as cameras or motion sensors, to produce a model of the movement of a particular individual. The model can be used to answer questions about the start, intermediate, and end locations of that individual along the track the individual traveled. In general, tracking is thought of as producing a deterministic track model, that is, a best guess about the individual's track through the environment.

Ambiguities in such a model are regarded as errors, tracking failures, or abnormal or unusual movements. A high-quality tracking system can become very complex in its attempt to resolve every ambiguity. When there are multiple individuals in the environment, ambiguities arise inevitably.

In a surveillance system whose goal is to produce a deterministic movement model for each object, these ambiguities are represented as a set of hypotheses about the true configuration of the objects and the environment. The hypotheses must be propagated, tested against future events, and, ideally, pruned until only one true hypothesis remains.

Because ambiguities arise from interactions between objects or individuals, the number of hypotheses that must be considered can grow exponentially with the number of individuals in the environment.

Low-quality tracking is the case in which ambiguities are tolerated because resolving them is impractical or effectively impossible.

High-quality tracking systems generally require high-quality sensors, because the events must carry enough distinguishing information to resolve ambiguities. A system that tracks people might resolve such ambiguities by recognizing faces, matching clothing colors, or relying on key cards or biometric scans.

Where such high-quality sensors are not available, it can still be useful to draw inferences about patterns of object movement in an environment equipped with simple sensors that can only signal discrete events (for example, the sensor is off when there is no motion and on when there is motion). Rather than relying on complete tracking output, we describe how to obtain these inferences from a collection of imperfect tracking models obtained with low-quality motion sensors that can only signal discrete motion events.

Tracklets
We use the concept of tracklets, embedding their representation in a probabilistic framework. The basic idea of a tracklet is to aggregate discrete events that are unambiguously related to one another. As described above, the tracklet concept was developed as a way to represent ambiguity in object tracking for surveillance applications, so that the system can interact efficiently with its user to refine the imperfect tracking model in a tracklet graph into a high-quality track model.

We now embed tracklets and tracklet graphs in a probabilistic model so that inferences can be drawn from a set of probabilistic graphs. In this way, the invention makes it possible to understand patterns of movement of a population of individuals or other objects over long periods, rather than the specific movements of a single individual or a small group over a short period.

As shown in highly simplified form in FIG. 8, a set of events 801 is arranged in a line through space and time. As described above, the events can be, for example, motion events detected by simple, low-quality motion sensors.

In the following, events are synonymous with locations in the environment. A sensor can only detect motion in a relatively small area of the environment, for example the motion of an object within an area of about 5 to 10 meters. Because each sensor detects moving objects only in a relatively small area, the detected events can be related directly to the locations of the sensors. If the sensors are placed in hallways and adjacent to offices assigned to particular individuals, the traffic patterns of particular individuals can be predicted with relatively high probability.

If events are spatially close to one another, ordered sequentially in time, and isolated from other events, a simple model of object movement in the environment lets us aggregate the events into a single track 802, shown by the solid arrow through the events. The arrow abstracts the sequence of discrete events into a tracklet, a model expressing, with very high probability, that all the events in the set were generated by the same individual or by a small group of individuals.

Furthermore, if the sequence of events is not followed by any event that is near the final event in space and time, we say that the tracklet has ended. This is represented by the box node 803. Similarly, if the tracklet is not preceded by any event that appears to connect it to other events in space and time, the tracklet starts at that place and time, which is indicated by the triangle node 804. We can then discard the events themselves and show only the high-level tracklet abstraction 805 from start node A to end node Z.

The result is a tracklet graph. We call the simple graph 805 of FIG. 8 the graph $\gamma_0$. The graph $\gamma_0$ indicates that there is a high probability $P$ that the end event detected at node Z was generated by the same individual who generated the event at start node A.

Without loss of generality, given the graph $\gamma_0$, the probability 800

$$P(A \rightarrow Z \mid \gamma_0) = 1$$

means that the probability that the event detected at node Z follows the event at node A is unambiguously 1. Here, $\rightarrow$ means "is followed by".

In general, the probability $P$ is 0 if the object is not at a known location, $P$ is 1 if the object is at a known location, and otherwise $0 < P < 1$.

Tracklet Graphs
Ambiguities in the environment produce more complex graphs. Our graphs contain nodes connected by directed edges, or tracklets. All events that share a possible common cause are tied into the same connected graph. The nodes in a graph can include start nodes, intermediate nodes, and end nodes, where the intermediate nodes can be split nodes or join nodes. For purposes of this description, a graph includes at least one intermediate node; otherwise there would be no ambiguity, and the problem of tracking the object would be deterministic and trivial.

Probabilities are assigned to the edges (tracklets) to indicate the likelihood that an object starts at one particular node and ends at another. In other words, the probability is the likelihood that an event at the end node follows an event at the start node and that both events are caused by the same object. A track extends from a start node to an end node and includes the intermediate nodes that give rise to ambiguities.

FIG. 9 shows a more complex graph representing two or more individuals crossing tracks in the environment.

The chevrons 901 and 902 represent a join j and a split s, respectively. That is, several events follow the split s within a sufficiently small temporal and spatial neighborhood, and an ambiguity arises. In the case of a split, several individuals were moving together, or so close to one another, that they were treated as joint actors in a single tracklet. After the split, the track of each individual who traversed the tracklet from j to s is ambiguous: it is unclear whether a given individual moved along a tracklet from the split s to end node Z or from the split s to end node Y.

Given this graph, which we call $\gamma_1$, we do not know the track of an individual who starts at node A. The probabilities of this ambiguity are given by equation (1):

$$P(A \rightarrow Z \mid \gamma_1) = p, \qquad P(A \rightarrow Y \mid \gamma_1) = q. \tag{1}$$

Here, we can set $p = q$, so that both outcomes are equally likely, or we can learn movement patterns that bias our prediction toward one outcome or the other. The patterns can be learned manually by a user of the system or learned automatically over time. Because the individual or object at node A must go somewhere, the probabilities in the graph must sum to 1, that is, $p + q = 1$.
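The constraint $p + q = 1$ can be illustrated by pushing a unit of probability through a small graph shaped like the one described for FIG. 9. This is a sketch under stated assumptions: the nested-dictionary encoding, the second start node `B`, and the biased values 0.8 and 0.2 are hypothetical and are not taken from the figure.

```python
def end_probabilities(edges, start):
    """Return {end_node: P(start -> end_node | graph)}.

    edges maps a node to {successor: branch probability}; the branch
    probabilities leaving a node are assumed to sum to 1. The total for
    an end node is the sum over paths of the product of branch values.
    """
    totals = {}

    def walk(node, prob):
        branches = edges.get(node)
        if not branches:  # an end node: accumulate the path probability
            totals[node] = totals.get(node, 0.0) + prob
            return
        for successor, branch_prob in branches.items():
            walk(successor, prob * branch_prob)

    walk(start, 1.0)
    return totals


# Two start nodes join at j, travel together to s, then split to Z or Y.
gamma_1 = {
    "A": {"j": 1.0},
    "B": {"j": 1.0},
    "j": {"s": 1.0},
    "s": {"Z": 0.5, "Y": 0.5},  # p = q: both outcomes equally likely
}
equal = end_probabilities(gamma_1, "A")

# A learned movement pattern can bias the split instead.
biased_graph = dict(gamma_1, s={"Z": 0.8, "Y": 0.2})
biased = end_probabilities(biased_graph, "A")
print(equal, biased)
```

In both cases the values over the end nodes sum to 1, which reflects the statement that the individual at node A must go somewhere.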

Graphs can be arbitrarily complex. The graph $\gamma_2$ of FIG. 10 is a far more complex graph with several start nodes A, B, and C, several splits s and joins j, and end nodes W, X, Y, and Z.

By tracing directed edges in the graph, it is possible to connect start node A to any of the end nodes {W, X, Y, Z}. For example, the probability is given by the following expression.

[equation not recoverable from the source]

Because an individual starting at node A can end at any of those locations with non-zero probability, we express the probabilities as equations (2) to (5):

$$P(A \rightarrow W \mid \gamma_2) > 0 \tag{2}$$
$$P(A \rightarrow X \mid \gamma_2) > 0 \tag{3}$$
$$P(A \rightarrow Y \mid \gamma_2) > 0 \tag{4}$$
$$P(A \rightarrow Z \mid \gamma_2) > 0 \tag{5}$$

We can decide that all of these outcomes are equally likely, or we can bias the probabilities based on previously learned information, such as the habitual movements of individuals.

A more interesting case is that individuals starting at node C do not have as much ambiguity in their final destinations. Because the edge is directed from $s_1$ to $j_2$, that is, because

$$P(j_2 \rightarrow s_1 \mid \gamma_2) = 0$$

holds, there is no plausible explanation for an individual who starts at node C and ends at node W or node X. Equations (6) to (9) therefore hold:

$$P(C \rightarrow W \mid \gamma_2) = 0 \tag{6}$$
$$P(C \rightarrow X \mid \gamma_2) = 0 \tag{7}$$
$$P(C \rightarrow Y \mid \gamma_2) > 0 \tag{8}$$
$$P(C \rightarrow Z \mid \gamma_2) > 0 \tag{9}$$

It is worth noting that the graph is complete. That is, if there is any possible sequence of events relating an object that starts at node C to an end node, that end node is in the graph. Thus, if a node V is not an end node in the graph, the probability is

$$P(C \rightarrow V \mid \gamma_2) = 0.$$

Similarly, for a start node that is not found in a particular graph, the graph contains no possible connection between that start node and any end node during the time spanned by the graph, so the corresponding probability is likewise zero.
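Which probabilities are zero and which are non-zero follows from reachability along the directed edges alone. The sketch below checks this on a hypothetical layout that is merely consistent with the description of FIG. 10 (start nodes A, B, and C; joins j1 and j2; splits s1 and s2; an edge from s1 to j2); the actual topology of the figure is not given in the text, so the edge list is an assumption.

```python
def reachable_ends(edges, start):
    """Return the set of end nodes reachable from start along directed edges.

    edges maps a node to a list of successor nodes; nodes with no
    successors are end nodes. A start node that is absent from the graph
    reaches nothing, which corresponds to a probability of zero.
    """
    if start not in edges:
        return set()
    seen, stack, ends = set(), [start], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        successors = edges.get(node, [])
        if not successors:
            ends.add(node)
        stack.extend(successors)
    return ends


# Hypothetical stand-in for the graph of FIG. 10.
gamma_2 = {
    "A": ["j1"],
    "B": ["j1"],
    "j1": ["s1"],
    "s1": ["W", "X", "j2"],  # the edge s1 -> j2 is directed one way only
    "C": ["j2"],
    "j2": ["s2"],
    "s2": ["Y", "Z"],
}
print(sorted(reachable_ends(gamma_2, "A")))
print(sorted(reachable_ends(gamma_2, "C")))
```

In this layout node C enters downstream of s1, so W and X are unreachable from it, while node A can reach all four end nodes.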

Sets of Graphs
At any given moment, several objects or individuals moving about the environment in apparently uncoordinated ways may happen to cross tracks and create ambiguities; they thereby become the cohort of individuals that generated a particular tracklet graph instance $\gamma$. The analysis above makes only fairly weak assertions about the tracks of those individuals.

We now consider a set of graphs $\Gamma$ that describes the movement of a population of objects over a longer period.

We can ask questions of the form: "What does this set of graphs tell us about the probability that individuals move from one place to another in the building?" We can answer such a question by, for example, searching for all graphs that contain node A as a start node and counting the graphs that contain a possible connection between nodes A and Z, optionally weighting each by its individual probability. This is expressed as

$$P(A \rightarrow Z \mid \Gamma) = \frac{1}{M} \sum_{i=1}^{N} P(A \rightarrow Z \mid \gamma_i),$$

where $N$ is the number of graphs in the set $\Gamma$, and $M$ is the number of graphs in $\Gamma$ that include node A as a start node, which serves as a normalization factor. The evidence for a connection, that is, the probability $P(A \rightarrow Z \mid \Gamma)$, can be accumulated as events are received, by constructing tracklet graphs and then accumulating the evidence for each sensor pair together with the count $M$.
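The accumulation over a set of graphs can be sketched as a normalized sum. The input layout is an assumption made for this example: each graph is reduced to a dictionary of per-graph probabilities keyed by start node and then end node, and a graph that does not contain the start node contributes nothing.

```python
def connection_probability(graph_probs, start, end):
    """Average the per-graph evidence for a start -> end connection.

    graph_probs has one entry per graph in the set; each entry maps
    start_node -> {end_node: probability in that graph}. The sum is
    normalized by M, the number of graphs that contain start as a
    start node.
    """
    containing = [g for g in graph_probs if start in g]
    m = len(containing)
    if m == 0:
        return 0.0  # the start node never appears: no evidence at all
    return sum(g[start].get(end, 0.0) for g in containing) / m


# Three hypothetical graphs: A -> Z is unambiguous in the first,
# ambiguous in the second, and A is absent from the third.
gamma_set = [
    {"A": {"Z": 1.0}},
    {"A": {"Z": 0.5, "Y": 0.5}},
    {"B": {"Z": 1.0}},
]
p_az = connection_probability(gamma_set, "A", "Z")
p_ay = connection_probability(gamma_set, "A", "Y")
print(p_az, p_ay)
```

Keeping a running sum and a running count per sensor pair would allow the same value to be updated incrementally as new tracklet graphs are built.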

Evidence for recurring movement can be accumulated even when there is no unambiguous tracklet that shows the object's movement in its entirety. If traffic that starts at node A always ends at node Z, then every graph in the set $\Gamma$ that includes node A as a start node contains a plausible track from node A to node Z. This amounts to a considerable body of evidence.

The probability $P(A \rightarrow Z \mid \Gamma)$ is the mean of the evidence probabilities $P(A \rightarrow Z \mid \gamma_i)$. It is smaller than in the case where every instance of the movement is completely unambiguous, in which case the probability $P(A \rightarrow Z \mid \Gamma)$ is likewise 1.

Conversely, accidental or very rare movements that are never repeated collect little evidence in the numerator, while the denominator is the number of graphs that contain the incidental start node, so the resulting value is very small for the set of graphs.

By further aggregating the evidence, as probabilities, across groups of sensors and across graphs, the traffic patterns in an environment can be quantified as

$$P(A \rightarrow \Theta \mid \Gamma) = \frac{1}{M} \sum_{i=1}^{N} \sum_{Z \in \Theta} P(A \rightarrow Z \mid \gamma_i),$$

where $\Theta$ is a set of end nodes, and the normalization factor $M$ is the number of graphs in the set $\Gamma$ that include, as a start node, a particular node A of interest, such as a bottleneck in the environment through which all traffic must pass. This lets us measure the "connectivity" between different areas of the environment.

A more general solution aggregates the probabilities over sets of start and end nodes across the graphs:

$$P(\Phi \rightarrow \Theta \mid \Gamma) = \frac{1}{M} \sum_{i=1}^{N} \sum_{A \in \Phi} \sum_{Z \in \Theta} P(A \rightarrow Z \mid \gamma_i),$$

where $\Phi$ is a set of start nodes. The sets $\Phi$ and $\Theta$ need not be contiguous, and they measure the relationship between groups of individuals who may be distributed across the environment.

In either case, the probability $P(\Phi \rightarrow \Theta \mid \Gamma)$ represents the relative likelihood that a track starting at some node in the set $\Phi$ ends at some node in the set $\Theta$.

Connectivity Graph
The probabilities $P(A \rightarrow Z \mid \Gamma)$ between all pairs of start and end nodes A and Z in the environment represent estimates of the probability that there is a recurring flow of people from one place in the environment to another. In an environment such as an office building, individuals are typically associated with particular offices. It is therefore usually possible to associate an individual with tracks that start outside a particular office. Similarly, if a track ends at an office, the individual can again be recognized with relatively high probability. In addition, there are usually a small number of shared resources at known locations, such as restrooms, kitchens, copiers, and printers. The probabilities for the nodes corresponding to those locations give an indirect measure of the potential connectivity between individuals and places.

These probabilities can be grouped into two or more classes to produce a deterministic graph of the connections between individuals and of the quality of those connections.
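The grouping step can be illustrated with simple thresholds. This is only a sketch: the class names `"strong"`, `"weak"`, and `"none"`, the threshold values, and the function name `classify_connections` are assumptions made for illustration, and the text does not say how the classes are chosen.

```python
def classify_connections(probabilities, strong=0.5, weak=0.1):
    """Map each (start, end) probability to a discrete connection class.

    The thresholds are hypothetical; any grouping into two or more
    classes would fit the description.
    """
    classes = {}
    for pair, p in probabilities.items():
        if p >= strong:
            classes[pair] = "strong"
        elif p >= weak:
            classes[pair] = "weak"
        else:
            classes[pair] = "none"
    return classes


example = {
    ("office_1", "kitchen"): 0.7,
    ("office_1", "printer"): 0.2,
    ("office_2", "kitchen"): 0.05,
}
print(classify_connections(example))
```

The resulting dictionary is a deterministic graph: each pair of locations carries a label instead of a probability.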

In the prior art, social "networks" of this kind are measured using sensors worn on the body, such as RFID tags and badges. Such sensors are easily lost or damaged, and may be uncomfortable or inconvenient to wear and to detect. Inferring the network from simple, indirect sensors placed in the environment is an important advantage of the present invention.

Prediction
If a tracklet starts at sensor (node) A, the set, or a subset, of all end probabilities for that node is given by the following probabilities:

It is then possible to predict the most likely end node for the tracklet as follows:

It is also possible to order all end nodes by rank according to their probabilities, and to use the relative (conditional) probabilities as weights in a probabilistic decision system, such as a system that predicts demand for elevators, lighting, heating, or other limited environmental resources.
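The prediction and ranking steps described above can be sketched as follows. The sketch assumes that the end probabilities for a given start node are already available as a dictionary; the function names and the example values are hypothetical.

```python
def predict_end_node(end_probs):
    """Return the most likely end node for a tracklet."""
    return max(end_probs, key=end_probs.get)


def rank_end_nodes(end_probs):
    """Return all end nodes ordered from most to least likely."""
    return sorted(end_probs, key=end_probs.get, reverse=True)


def demand_weights(end_probs):
    """Normalize the probabilities so they can serve as relative weights."""
    total = sum(end_probs.values())
    if total == 0:
        return {node: 0.0 for node in end_probs}
    return {node: p / total for node, p in end_probs.items()}


# Hypothetical end probabilities for tracklets that start at node A.
end_probs_for_a = {"elevator": 0.5, "kitchen": 0.375, "printer": 0.125}
print(predict_end_node(end_probs_for_a))
print(rank_end_nodes(end_probs_for_a))
print(demand_weights(end_probs_for_a))
```

A decision system, for example an elevator controller, could consume the weights directly rather than committing to the single most likely destination.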

This technique also generalizes the probability to include a subset of nodes, such as "A, R, S, Z", by constructing a more constrained graph, as follows:

Nodes A and R need not be sequential. They need only be part of a plausible portion of the graph. That is, there may be intermediate nodes between node A and node R, and between node R and node Z. This is useful, for example, when there are many equivalent tracks between node A and node R, and node R represents a particularly informative node in the graph.

This formulation can be extended to include a subset containing one or more intermediate nodes. The prediction then takes the following probabilistic form:

FIG. 11 shows the steps of the general method. Events 1111 caused by moving objects are detected 1110 by a set of sensors 101 disposed at known locations in an environment. Sequences of temporally and spatially adjacent events are linked 1120 to form a set of tracklets 1121. A set of directed graphs 1131 is constructed 1130 from the set of tracklets. Each graph includes at least one start node, at least one end node, and one or more intermediate nodes at which multiple tracklets connect, giving rise to ambiguity about the movement of the objects. Probabilities 1141 are assigned 1140 to the edges, indicating the likelihood that an object was at a particular location, in order to model the movement of the objects in the environment. The probabilities 1141 can be refined based on information such as movement patterns 1146 learned 1145 over time. The graphs annotated with these probabilities can then be used to predict the movement of objects.
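The flow of FIG. 11 can be illustrated with a much simplified sketch. Everything here is an assumption made for illustration: the event format `(time, sensor_id)`, the `neighbors` adjacency map, the `max_gap` time limit, the rule that an ambiguous event starts a new tracklet, and the uniform probabilities on outgoing edges. The sketch only handles the simplest case, a join where two tracklets converge, and does not use learned constraints or learned movement patterns.

```python
from collections import defaultdict


def link_events(events, neighbors, max_gap):
    """Link temporally and spatially adjacent events into tracklets."""
    tracklets = []  # each tracklet is a list of (time, sensor_id)
    for t, s in events:  # events are assumed sorted by time
        candidates = [
            tr for tr in tracklets
            if 0 < t - tr[-1][0] <= max_gap
            and s in neighbors.get(tr[-1][1], set())
        ]
        if len(candidates) == 1:
            candidates[0].append((t, s))  # unambiguous continuation
        else:
            tracklets.append([(t, s)])  # no match or ambiguous: new tracklet
    return tracklets


def build_graph(tracklets, neighbors, max_gap):
    """Connect tracklet i to tracklet j when j starts right after i ends."""
    edges = defaultdict(list)
    for i, a in enumerate(tracklets):
        end_t, end_s = a[-1]
        for j, b in enumerate(tracklets):
            start_t, start_s = b[0]
            if (i != j and 0 < start_t - end_t <= max_gap
                    and start_s in neighbors.get(end_s, set())):
                edges[i].append(j)
    return dict(edges)


def assign_probabilities(edges):
    """Assumed rule: spread probability uniformly over outgoing edges."""
    return {(i, j): 1.0 / len(js) for i, js in edges.items() for j in js}


# Two corridors (1-2 and 5-4) that meet at sensor 3, which leads to sensor 6.
neighbors = {1: {2}, 2: {1, 3}, 3: {2, 4, 6}, 4: {3, 5}, 5: {4}, 6: {3}}
events = [(0, 1), (0, 5), (1, 2), (1, 4), (2, 3), (3, 6)]
tracklets = link_events(events, neighbors, max_gap=1.5)
edges = build_graph(tracklets, neighbors, max_gap=1.5)
print(tracklets)
print(edges)
print(assign_probabilities(edges))
```

In this example the event at sensor 3 could continue either corridor, so it starts a third tracklet, and both earlier tracklets receive an edge into it. In a fuller treatment the edge probabilities would be estimated and refined from observed movement rather than set uniformly.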

Although the invention has been described by way of examples of preferred embodiments, it is to be understood that various other adaptations and modifications can be made within the spirit and scope of the invention. Accordingly, the object of the appended claims is to cover all such variations and modifications as come within the true spirit and scope of the invention.

An explanatory diagram showing an environment in which a tracking system is implemented according to an embodiment of the present invention.
An explanatory diagram showing a tracklet graph according to an embodiment of the present invention.
A block diagram showing the track of a tracked object in the environment of FIG. 1 according to an embodiment of the present invention.
An explanatory diagram showing a decision graph according to an embodiment of the present invention.
An image showing a user interface according to an embodiment of the present invention.
A flowchart showing a method for recording surveillance data according to an embodiment of the present invention.
A flowchart showing a method for retrieving surveillance data in order to track an object according to an embodiment of the present invention.
An explanatory diagram showing discrete events detected by motion sensors, and tracklets.
An explanatory diagram showing a tracklet graph according to an embodiment of the present invention.
An explanatory diagram showing a tracklet graph according to an embodiment of the present invention.
A flowchart showing a method for performing modeling and prediction.

Claims

A computer-implemented method for modeling the movement of an object in an environment, comprising:
detecting, with a set of sensors disposed at known locations in the environment, discrete events caused by an object moving in the environment;
linking a series of temporally and spatially adjacent discrete events to form a set of tracklets;
creating, from the set of tracklets, a set of directed graphs comprising at least one start node, at least one end node, and one or more intermediate nodes at which a plurality of tracklets connect while giving rise to ambiguity in the movement of the object; and
assigning to the nodes a probability indicating the likelihood that the object was at a particular known location, in order to model the movement of the object in the environment.

The computer-implemented method for modeling object movement in an environment as recited in claim 1, wherein the sensor is a motion sensor.

The computer-implemented method for modeling object movement in an environment as recited in claim 1, wherein the sensor uses a wireless transmitter to transmit the event.

The computer-implemented method for modeling object movement in an environment as recited in claim 1, wherein the linking is performed according to temporal and spatial constraints.

The computer-implemented method for modeling object movement in an environment as recited in claim 4, wherein the temporal and spatial constraints are learned over time.

The computer-implemented method for modeling object movement in an environment as recited in claim 1, wherein a particular probability is associated with a particular object.

The computer-implemented method for modeling object movement in an environment as recited in claim 1, wherein, when the start node is node A, the end node is node Z, and a graph γ₀ models the movement of a specific object from node A to node Z such that the specific object starting at node A always ends at node Z in the graph γ₀, the probability is given by the following expression:

The computer-implemented method for modeling object movement in an environment as recited in claim 1, wherein the intermediate nodes include a split node at which a plurality of tracklets diverge and a join node at which a plurality of tracklets converge.

When the start node is node A, the end node is node Z, and Γ is a set of graphs γi, the probability is

The computer-implemented method for modeling object movement in an environment as recited in claim 9, wherein, when the set of end nodes is a set Θ, the probability is given by the following expression:

The computer-implemented method for modeling object movement in an environment as recited in claim 10, wherein, when the set of start nodes is a set Φ, the probability is given by the following expression:

The computer-implemented method for modeling object movement in an environment as recited in claim 11, wherein the set Φ and the set Θ can be non-contiguous.

The computer-implemented method for modeling object movement in an environment as recited in claim 1, further comprising predicting movement of the object from the set of graphs with the assigned probabilities.

The computer-implemented method for modeling object movement in an environment as recited in claim 1, comprising ordering the probabilities of the end nodes by rank.

The computer-implemented method for modeling object movement in an environment as recited in claim 1, wherein the environment is complex.

The computer-implemented method for modeling object movement in an environment as recited in claim 1, wherein the probability P is 0 if the particular object is not at a particular known location, the probability P is 1 if the particular object is at the known location, and otherwise 0 < P < 1.

A system for modeling the movement of an object in an environment, comprising:
a set of sensors disposed at known locations in an environment and configured to detect discrete events caused by an object moving in the environment;
means for linking a series of temporally and spatially adjacent discrete events to form a set of tracklets;
means for creating, from the set of tracklets, a set of directed graphs comprising at least one start node, at least one end node, and one or more intermediate nodes at which a plurality of tracklets connect while giving rise to ambiguity in the movement of the object; and
means for assigning to the nodes a probability indicating the likelihood that the object was at a particular known location, in order to model the movement of the object in the environment.