JP2011100175A

JP2011100175A - Device and program for deciding personal action

Info

Publication number: JP2011100175A
Application number: JP2009252559A
Authority: JP
Inventors: Masaki Takahashi; 正樹高橋; Masato Fujii; 真人藤井; Masahiro Shibata; 正啓柴田
Original assignee: Nippon Hoso Kyokai NHK; Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2009-11-04
Filing date: 2009-11-04
Publication date: 2011-05-19
Anticipated expiration: 2029-11-04
Also published as: JP5285575B2

Abstract

PROBLEM TO BE SOLVED: To provide a personal action decision device capable of deciding the action of a person included in a crowded scene by means of video processing only. SOLUTION: A personal action decision device 1 includes: a personal area detection means 10 for detecting, by machine learning, one or more personal areas included in a video image at predetermined frame intervals; a personal track generation means 20 for calculating a feature amount for each personal area, for deciding that personal areas having similar feature amounts across a plurality of frame images are the personal areas of an identical person, and for generating a personal track by connecting the positions of the centers of gravity of the personal areas of the identical person; and a personal action decision means 30 for calculating a feature amount for each personal track, for deciding whether the feature amount of the personal track satisfies an action condition, and, if it does, for deciding that the person is performing the action corresponding to that action condition. COPYRIGHT: (C)2011,JPO&INPIT

Description

The present invention relates to a technique for determining the behavior of a person included in a video.

Recognizing human motion through video analysis is a useful technique that can be applied to, for example, automatic recognition of human behavior, personal authentication, and video retrieval keyed on motion. In particular, there is strong demand for technology that automatically detects specific actions of a person in video from a fixed surveillance camera for the purpose of crime prevention. For example, systems have been developed that automatically analyze video from fixed surveillance cameras installed in elevators or bookstores and raise an alarm when a person behaves abnormally.

Human motion recognition by video analysis can be broadly classified into model-based techniques and feature-based techniques.
One model-based technique, for example, represents each part of the human body, such as the torso and arms, as a geometric model and describes the posture of the person with parameters such as joint angles (see Patent Document 1). Model-based techniques focus on analyzing the fine movements of body parts such as arms and legs.

Among feature-based techniques, on the other hand, one processes two-dimensional moving images, determines a person region from image feature amounts, and performs gesture recognition, people counting, and calculation of the moving direction (see Patent Document 2).
Another obtains movement vectors from time-series data of a person's position and recognizes the person's behavior from the speed and the number of times the person stands still (see Patent Document 3).

Patent Document 1: JP 2000-149025 A
Patent Document 2: JP-A-5-46583
Patent Document 3: JP 2006-221379 A

Human motion recognition by video analysis presupposes that a person region can be cut out of the video and that the outline of the person can be acquired accurately. The conventional techniques target scenes with a flat background in which one to a few persons are scattered; they do not target crowded scenes, in which cutting out a region for each person is difficult. For example, the inventions described in Patent Documents 1 and 2 appear to target video that contains only one person.

The invention described in Patent Document 3 targets relatively crowded scenes, but it uses RFID (Radio Frequency IDentification) to measure the positions of persons. It therefore does not determine behavior by video processing alone, which leads to a larger apparatus and higher cost.

Therefore, an object of the present invention is to provide a human behavior determination device, and a program for it, that can accurately determine the behavior of a person even in a crowded scene by video processing alone.

The human behavior determination device according to the first invention of the present application receives a video consisting of a plurality of consecutive frame images and determines the behavior of one or more persons included in the video. It is characterized by comprising a person region detection means, a person trajectory generation means, and a human behavior determination means.

With this configuration, the human behavior determination device uses the person region detection means to detect, by machine learning, one or more person regions included in the video at predetermined frame intervals. Using the person trajectory generation means, the device calculates a feature amount for each person region detected by the person region detection means, determines that person regions with similar feature amounts across a plurality of frame images belong to the same person, and generates a person trajectory by connecting the centroid positions of the person regions of that person. In other words, the device generates a person trajectory over a longer period (for example, 30 frames) than, for example, a motion vector detected from two consecutive frame images.

Using the human behavior determination means, the device also calculates a feature amount for each person trajectory generated by the person trajectory generation means and determines whether that feature amount satisfies a preset behavior condition. When it does, the device determines that the person is performing the behavior corresponding to the behavior condition. In other words, the device determines the behavior of a person from a person trajectory obtained over a longer period than a conventional motion vector.

In the human behavior determination device according to the second invention of the present application, the person region detection means further comprises a representative person region selection unit. For a plurality of person regions detected in the same frame image, this unit calculates, as a representative feature amount, any one of the distance between the regions, the distance between their color histogram distributions, or the difference in their sizes. It determines that the representative feature amounts are similar when the representative feature amount is equal to or less than a preset threshold, and it selects, as the person region, the one representative person region located closest to the center among the similar person regions.

Within a single frame image, similar image features exist around the same person, so a plurality of person regions are detected around a given person. Likewise, for another person at a distant position in the same frame image, a plurality of person regions are detected around that other person.
With this configuration, even when a plurality of person regions are detected around the same person in the same frame image, the human behavior determination device selects the one person region located closest to the center among them.
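The selection of one central region per person can be sketched as follows. This is a minimal illustration, not the method of the patent: it assumes the distance between box centers as the representative feature amount (one of the three options named above), a simple greedy grouping, and the mean center of each group as the meaning of "central". The names `center` and `select_representatives` and the `(x, y, width, height)` box layout are hypothetical.

```python
from math import hypot

def center(box):
    """Return the centre (x, y) of a box given as (x, y, width, height)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def select_representatives(boxes, threshold):
    """Group boxes whose centres lie within `threshold`, keep one box per group."""
    groups = []
    for box in boxes:
        cx, cy = center(box)
        for group in groups:
            # Compare against the first member of each existing group.
            gx, gy = center(group[0])
            if hypot(cx - gx, cy - gy) <= threshold:
                group.append(box)
                break
        else:
            groups.append([box])

    representatives = []
    for group in groups:
        centers = [center(b) for b in group]
        mx = sum(c[0] for c in centers) / len(centers)
        my = sum(c[1] for c in centers) / len(centers)
        # Assumption: the box nearest the group's mean centre is "central".
        best = min(group, key=lambda b: hypot(center(b)[0] - mx, center(b)[1] - my))
        representatives.append(best)
    return representatives
```

For example, three overlapping detections around one person and one detection around a distant person would reduce to two boxes, one per person.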

The human behavior determination device according to the third invention of the present application further comprises a person prediction unit that uses a prediction filter to predict, from the person trajectory generated by the person trajectory generation means, a prediction region for the person. The person trajectory generation means generates the person trajectory when the centroid position of a person region lies within the prediction region.
With this configuration, even when a person region is erroneously detected because of occlusion, the device avoids generating a person trajectory from the erroneously detected region.
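The gating idea can be sketched as below. The text says only "prediction filter" and does not name one, so this sketch substitutes a plain constant-velocity extrapolation as a stand-in. The function names and the fixed rectangular region size are hypothetical.

```python
def predict_region(trajectory, half_width, half_height):
    """Predict the next search region from the last two trajectory points."""
    (x1, y1), (x2, y2) = trajectory[-2], trajectory[-1]
    # Constant-velocity assumption: next = last + (last - previous).
    px, py = 2 * x2 - x1, 2 * y2 - y1
    return (px - half_width, py - half_height, px + half_width, py + half_height)

def extend_if_predicted(trajectory, centroid, half_width=15, half_height=15):
    """Append `centroid` only if it falls inside the predicted region."""
    left, top, right, bottom = predict_region(trajectory, half_width, half_height)
    x, y = centroid
    if left <= x <= right and top <= y <= bottom:
        trajectory.append(centroid)
        return True
    # A centroid outside the region is treated as a false detection and ignored.
    return False
```

A detection far from the predicted position, such as one caused by another person passing in front, leaves the trajectory unchanged.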

In the human behavior determination device according to the fourth invention of the present application, the human behavior determination means determines the behavior of a person using behavior conditions in which the correspondence between feature amounts of the person trajectory and thresholds preset for each behavior is defined in advance.
With this configuration, the human behavior determination device can determine the behavior of a person by simple threshold processing.
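A table of behavior conditions of this kind might look like the following sketch. The feature names, comparison directions, and threshold values are hypothetical placeholders chosen for illustration and do not come from the patent.

```python
# Hypothetical behavior conditions: behavior -> (feature name, comparison, threshold).
ACTION_CONDITIONS = {
    "run": ("average_speed", ">=", 8.0),
    "meet a person": ("acceleration", "<=", -2.0),
}

def judge_actions(features, conditions=ACTION_CONDITIONS):
    """Return every behavior whose threshold condition the features satisfy."""
    actions = []
    for action, (name, op, threshold) in conditions.items():
        value = features[name]
        satisfied = value >= threshold if op == ">=" else value <= threshold
        if satisfied:
            actions.append(action)
    return actions
```

Because each behavior reduces to one comparison, adding a behavior means adding one row to the table rather than changing the decision logic.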

In the human behavior determination device according to the fifth invention of the present application, the human behavior determination means outputs a reverse playback command, that is, a command to play the video in the reverse direction, when it determines that a person is performing a preset behavior to be verified. The device further comprises a video reverse playback means that receives the reverse playback command from the human behavior determination means and, based on that command, plays the video in the reverse direction and outputs it to the person region detection means.
With this configuration, the human behavior determination device verifies the determination result using the reverse-played video, so the accuracy of the determination result can be further increased.

The human behavior determination program according to the sixth invention of the present application causes a computer, which receives a video consisting of a plurality of consecutive frame images, to function as a person region detection means, a person trajectory generation means, and a human behavior determination means in order to determine the behavior of one or more persons included in the video.

With this configuration, the human behavior determination program uses the person region detection means to detect, by machine learning, one or more person regions included in the video at predetermined frame intervals. Using the person trajectory generation means, the program calculates a feature amount for each person region detected by the person region detection means, determines that person regions with similar feature amounts across a plurality of frame images belong to the same person, and generates a person trajectory by connecting the centroid positions of the person regions of that person. In other words, the program generates a person trajectory over a longer period than, for example, a motion vector detected from two consecutive frame images.

Using the human behavior determination means, the program also calculates a feature amount for each person trajectory generated by the person trajectory generation means and determines whether that feature amount satisfies a preset behavior condition. When it does, the program determines that the person is performing the behavior corresponding to the behavior condition. In other words, the program determines the behavior of a person from a person trajectory obtained over a longer period than a conventional motion vector.

The present invention provides the following advantageous effects.
According to the first and sixth inventions of the present application, the behavior of a person can be determined from a person trajectory obtained over a longer period than a conventional motion vector, so the behavior of a person can be determined accurately even in a crowded scene by video processing alone. Therefore, according to the first and fifth inventions of the present application, an accurate technique for determining human behavior can be provided at low cost.

According to the second invention of the present application, even when a plurality of person regions are detected around the same person in the same frame image, the one person region located closest to the center among them can be selected. Therefore, according to the second invention, an accurate person trajectory can be generated and the behavior of a person can be determined more accurately.

According to the third invention of the present application, even when a person region is erroneously detected, generating a person trajectory from the erroneously detected region can be avoided. Therefore, according to the third invention, an accurate person trajectory can be generated and the behavior of a person can be determined more accurately.

According to the fourth invention of the present application, the behavior of a person can be determined by simple threshold processing, so the processing can be made faster.
According to the fifth invention of the present application, the determination result is verified using the reverse-played video, so the accuracy of the determination result can be further increased.

FIG. 1 is a block diagram showing the overall configuration of the human behavior determination device according to the first embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the person region detection means of FIG. 1.
FIG. 3 shows training data stored in the person image database of FIG. 2; (a) is an example of positive images, and (b) is an example of negative images.
FIG. 4 illustrates the HOG feature amount in the present invention; (a) is an example image, (b) illustrates cells and blocks, and (c) shows a luminance histogram.
FIG. 5 illustrates difference images in the present invention; (a) shows a video, and (b) shows a background difference image generated from the video.
FIG. 6 illustrates clustering of representative person regions by the representative person region selection unit of FIG. 2.
FIG. 7 is a block diagram showing the configuration of the person trajectory generation means of FIG. 1.
FIG. 8 illustrates similarity of color histograms in the present invention; (a) shows an example in which the color histograms are not similar, and (b) shows an example in which they are similar.
FIG. 9 illustrates a person trajectory in the present invention; (a) shows the centroid positions of person regions, and (b) shows the person trajectory generated by connecting the centroid positions.
FIG. 10 illustrates prediction by the person prediction unit of FIG. 7; (a) shows a case in which no occlusion occurs, and (b) shows a case in which occlusion occurs.
FIG. 11 is a block diagram showing the configuration of the human behavior determination means of FIG. 1.
FIG. 12 illustrates the person trajectory feature amount normalization unit of FIG. 11; (a) shows the average motion vector for each block, and (b) shows the normalization coefficient for each block.
FIG. 13 is a flowchart showing the overall operation of the human behavior determination device of FIG. 1.
FIG. 14 is a flowchart showing the operation of the person region detection means of FIG. 2.
FIG. 15 is a flowchart showing the operation of the person trajectory generation means of FIG. 7.
FIG. 16 is a flowchart showing the operation of the human behavior determination means of FIG. 11.
FIG. 17 is a block diagram showing the configuration of the human behavior determination means in the second embodiment of the present invention.
FIG. 18 is a block diagram showing the configuration of the human behavior determination means in the third embodiment of the present invention.
FIG. 19 illustrates the feature space in the third embodiment of the present invention.
FIG. 20 is a block diagram showing the overall configuration of the human behavior determination device according to the fourth embodiment of the present invention.

(First embodiment)
Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings as appropriate. In each embodiment, means having the same function are given the same reference numerals, and their description is omitted.

[Overall configuration of the human behavior determination device]
The overall configuration of the human behavior determination device 1 according to the first embodiment of the present invention is described with reference to FIG. 1. As shown in FIG. 1, the human behavior determination device 1 determines the behavior of one or more persons included in a video and comprises a person region detection means 10, a person trajectory generation means 20, and a human behavior determination means 30.

The human behavior determination device 1 receives, for example, video captured by a single camera (not shown). The camera is fixed at a crowded place such as an event venue, an elevator, a bookstore, a station, or an airport. The video captured by the camera is therefore often a crowded scene containing two or more persons.

The person region detection means 10 calculates HOG (Histograms of Oriented Gradients) feature amounts in change regions, that is, regions where a difference such as an inter-frame difference or a background difference has occurred, and uses machine learning such as a support vector machine to determine whether each change region is a person region. When a plurality of person regions are detected for the same person, the person region detection means 10 performs clustering and assigns one person region to each person.

The person trajectory generation means 20 matches the person regions detected by the person region detection means 10, taking distance and color histograms into account. It then generates a person trajectory by tracking a person region over a predetermined number of frames. To track robustly even in crowded scenes, the person trajectory generation means 20 predicts the position of the person using a prediction filter. As a result, even when tracking fails because of occlusion, the person trajectory generation means 20 can resume tracking when the person region reappears. Occlusion here refers to a state in which, for example, a person located farther from the camera is hidden behind an obstacle such as another person located nearer to the camera.
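Matching by distance and color histogram can be sketched as a greedy frame-to-frame association. This is an illustration under assumptions, not the procedure of the patent: the detection layout (a dict with `centroid` and `hist`), the sum-of-absolute-differences histogram distance, and both thresholds are hypothetical, and the prediction filter is omitted for brevity.

```python
from math import hypot

def histogram_distance(h1, h2):
    """Sum of absolute bin differences between two normalised histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def link_detections(frames, max_distance=30.0, max_hist_distance=0.5):
    """Link per-frame detections into trajectories (lists of centroids).

    frames: list of detection lists; each detection is
            {"centroid": (x, y), "hist": [...]}.
    """
    tracks = []  # each: {"points": [...], "hist": [...], "last_frame": int}
    for index, detections in enumerate(frames):
        for det in detections:
            best = None
            for track in tracks:
                if track["last_frame"] == index:
                    continue  # this track was already extended in this frame
                tx, ty = track["points"][-1]
                dist = hypot(det["centroid"][0] - tx, det["centroid"][1] - ty)
                dh = histogram_distance(det["hist"], track["hist"])
                if dist <= max_distance and dh <= max_hist_distance:
                    if best is None or dist < best[0]:
                        best = (dist, track)
            if best is None:
                # No similar track nearby: start a new trajectory.
                tracks.append({"points": [det["centroid"]],
                               "hist": det["hist"], "last_frame": index})
            else:
                track = best[1]
                track["points"].append(det["centroid"])
                track["hist"] = det["hist"]
                track["last_frame"] = index
    return [t["points"] for t in tracks]
```

Each returned list is the sequence of connected centroid positions for one person.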

The human behavior determination means 30 calculates feature amounts of the person trajectory, such as average speed, tracking time, and moving direction, from the person trajectory generated by the person trajectory generation means 20. From these feature amounts, it determines behaviors such as "running", "meeting a person", and "putting down an object". For example, when a person is running, the amount of movement between frames is large; when a person meets someone, the acceleration becomes large in the direction opposite to the direction of movement (the negative direction).
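The trajectory feature amounts named here can be computed from the list of centroids, as in the following sketch. The exact definitions (speed in pixels per frame, direction as the angle of the net displacement, acceleration as the mean change in speed per step) and the dictionary keys are assumptions made for illustration.

```python
from math import atan2, degrees, hypot

def trajectory_features(points, frame_interval=1):
    """Compute simple features from a trajectory of at least two (x, y) centroids."""
    steps = [(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in zip(points, points[1:])]
    speeds = [hypot(dx, dy) / frame_interval for dx, dy in steps]
    total_dx = points[-1][0] - points[0][0]
    total_dy = points[-1][1] - points[0][1]
    return {
        # Number of frames over which the person was tracked.
        "tracking_time": len(steps) * frame_interval,
        "average_speed": sum(speeds) / len(speeds),
        # Angle of the net displacement, in degrees, in image coordinates.
        "direction_deg": degrees(atan2(total_dy, total_dx)),
        # Mean change in speed per step; negative values mean slowing down.
        "acceleration": (speeds[-1] - speeds[0])
                        / (max(len(speeds) - 1, 1) * frame_interval),
    }
```

A person who slows down sharply, as when stopping to meet someone, yields a negative `acceleration` under this definition.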

[Configuration of the person region detection means]
The configuration of the person region detection means 10 is described below with reference to FIG. 2. As shown in FIG. 2, the person region detection means 10 comprises a person image database 11, a training data feature amount calculation unit 12, a change region detection unit 13, a change region feature amount calculation unit 14, a person region determination unit 15, and a representative person region selection unit 16.

The person image database 11 stores in advance, as training data, positive images in which persons have been cut out and negative images of things other than persons.
The detection accuracy for person regions depends on the training data in the person image database 11. For this reason, the positive images stored in the person image database 11 are preferably cut out from the video of the camera that will actually be used for capture. Alternatively, the person image database 11 can store positive and negative images cut out from a variety of videos as training data, which makes the human behavior determination device 1 more general-purpose.

As shown in FIG. 3(a), the positive images are cut-out images of persons of various races, postures, clothing, ages, and genders. As shown in FIG. 3(b), the negative images are cut-out images of various things other than persons, for example, electronic devices such as facsimile machines, personal computers, and mobile phones; vehicles such as automobiles and motorcycles; clothing items such as shoes; and animals.

The training data feature amount calculation unit 12 refers to the person image database 11, calculates a feature amount for each item of training data stored there, and outputs the feature amounts to the person region determination unit 15. Here, the training data feature amount calculation unit 12 calculates the HOG feature amount as the feature amount of the training data.

<HOG feature amount>
The HOG feature amount is described below with reference to FIG. 4 (see also FIG. 2 as appropriate). The HOG feature amount is a histogram of the luminance gradient directions in a local region (cell) of an image. In this example, as shown in FIG. 4(b), one cell consists of, for example, 5 pixels vertically by 5 pixels horizontally, and one block consists of, for example, 3 cells vertically by 3 cells horizontally.

First, the training data feature amount calculation unit 12 obtains the luminance gradient strength and the luminance gradient direction for every pixel in, for example, the image of FIG. 4(a). In the cell of FIG. 4(b), the luminance gradient strength and luminance gradient direction at each pixel are drawn as arrows: the length of an arrow indicates the gradient strength, and its orientation indicates the gradient direction.

Next, as shown for example in FIG. 4(c), the training data feature amount calculation unit 12 generates a luminance histogram for each cell by dividing the luminance gradient direction into 9 bins of 20° each over the range 0° to 180°. In this histogram, the vertical axis is the luminance gradient strength and the horizontal axis is the luminance gradient direction.

The training data feature amount calculation unit 12 then normalizes the histograms generated for each cell in units of one block. It performs this normalization for all blocks while shifting the block to be normalized by one cell at a time. In other words, each cell is normalized repeatedly as part of different blocks.
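The cell histogram and block normalization steps can be sketched in plain Python as follows. The 5 by 5 pixel cell and the 9 bins of 20° follow the example above; the central-difference gradient, the clamping at the image border, and the L2 norm used for block normalization are assumptions, since the text does not specify them. The function names are hypothetical.

```python
from math import atan2, degrees, hypot, sqrt

def cell_histogram(image, top, left, cell=5, bins=9):
    """Gradient-orientation histogram, weighted by magnitude, for one cell.

    image: 2-D list of luminance values; (top, left) is the cell's first pixel.
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    for y in range(top, top + cell):
        for x in range(left, left + cell):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = hypot(gx, gy)
            angle = degrees(atan2(gy, gx)) % 180.0  # unsigned direction, 0 to 180
            index = min(int(angle // (180.0 / bins)), bins - 1)
            hist[index] += magnitude
    return hist

def normalize_block(cell_histograms, eps=1e-6):
    """L2-normalise the concatenated histograms of the cells in one block."""
    vector = [v for hist in cell_histograms for v in hist]
    norm = sqrt(sum(v * v for v in vector) + eps * eps)
    return [v / norm for v in vector]
```

Sliding a 3 by 3 cell block one cell at a time and concatenating the normalized vectors would give the full descriptor, with each cell contributing to several blocks.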

Although the training data feature amount calculation unit 12 has been described as using the HOG feature amount, it is not limited to this. For example, it can use any feature amount, such as a SIFT (Scale Invariant Feature Transform) feature amount or an edge feature amount.

Returning to FIG. 2, the description of the configuration of the person region detection means 10 continues.
The change region detection unit 13 receives video from the camera and finds the change regions, that is, the regions of the video in which a change has occurred. Specifically, the change region detection unit 13 generates difference images, such as inter-frame difference images or background difference images, from the video at intervals of, for example, 3 frames.

For example, suppose that, as shown in FIG. 5(a), the input video contains moving persons H1 to H4, a rolling ball Ob1, and a background. In this case, the change region detection unit 13 generates a background difference image like that of FIG. 5(b). In this background difference image, the parts without change (motion), namely the background, are black, and the parts with change (motion), namely persons H1 to H4 and ball Ob1, are white or gray. The change region detection unit 13 detects the rectangular regions of persons H1 to H4 and the rectangular region of ball Ob1 from the background difference image as the parts with change. Then, as shown in FIG. 5(a), it finds the same regions in the input video and outputs them to the change region feature amount calculation unit 14 as change regions R1 to R5.

Although the change area detection unit 13 has been described as generating difference images at intervals of three frames, it is not limited to this. For example, the change area detection unit 13 can generate difference images at intervals of one frame, two frames, or four or more frames.
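The change area detection described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes grayscale frames held as NumPy arrays, and the function names, the difference threshold, and the minimum blob size are all hypothetical choices. It thresholds an inter-frame difference, groups changed pixels into 4-connected blobs, and returns one bounding rectangle per blob.

```python
from collections import deque

import numpy as np


def change_regions(prev_frame, cur_frame, diff_threshold=30, min_pixels=4):
    """Return bounding boxes (top, left, bottom, right) of changed areas.

    prev_frame and cur_frame are 2-D grayscale arrays taken a fixed number
    of frames apart (three frames in the description above).
    """
    # Cast to a signed type so uint8 subtraction cannot wrap around.
    diff = np.abs(cur_frame.astype(np.int32) - prev_frame.astype(np.int32))
    changed = diff > diff_threshold
    visited = np.zeros(changed.shape, dtype=bool)
    height, width = changed.shape
    boxes = []
    for y in range(height):
        for x in range(width):
            if not changed[y, x] or visited[y, x]:
                continue
            # Flood-fill one 4-connected blob of changed pixels.
            queue = deque([(y, x)])
            visited[y, x] = True
            top, left, bottom, right, count = y, x, y, x, 0
            while queue:
                cy, cx = queue.popleft()
                count += 1
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and changed[ny, nx] and not visited[ny, nx]):
                        visited[ny, nx] = True
                        queue.append((ny, nx))
            if count >= min_pixels:  # ignore tiny blobs (assumed noise filter)
                boxes.append((top, left, bottom, right))
    return boxes


def crop(frame, box):
    """Cut the same rectangle out of the input frame (the change area)."""
    top, left, bottom, right = box
    return frame[top:bottom + 1, left:right + 1]
```

Each cropped rectangle corresponds to one change area handed to the feature amount calculation stage; a background difference image could be substituted by passing a background frame as `prev_frame`.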

Returning to FIG. 2, the description of the configuration of the person area detection means 10 continues.
The change area feature amount calculation unit 14 receives the change areas from the change area detection unit 13 and calculates the feature amount of each change area. Here, the change area feature amount calculation unit 14 preferably uses the same type of feature amount (for example, HOG feature amounts) as the training data feature amount calculation unit 12.

Because the change area feature amount calculation unit 14 calculates HOG feature amounts only from the change areas, rather than from the entire frame image, processing can be sped up.

Although the change area feature amount calculation unit 14 has been described as using HOG feature amounts, it is not limited to this. For example, the change area feature amount calculation unit 14 can use any feature amount, such as a SIFT feature amount or an edge feature amount.

The person area determination unit 15 receives the HOG feature amounts of the training data from the training data feature amount calculation unit 12 and performs machine learning using them. Here, the person area determination unit 15 uses a support vector machine (SVM) as the supervised machine learning method.

A support vector machine is a machine learning method that can discriminate between two classes (for example, a person class and a non-person class). Specifically, it uses a discriminant function that outputs 1 if the inner product of the training data and the synaptic weights is greater than or equal to a threshold, and outputs -1 if it is less than the threshold. To minimize recognition errors on the training data, the support vector machine learns its parameters so that the boundary does not lie right at the edge of the training data of either class but keeps a margin from each training sample. In other words, the support vector machine obtains the margin between training data belonging to different classes and learns the parameters that maximize this margin. The person area determination unit 15 preferably completes this machine learning before the change area feature amounts are input from the change area feature amount calculation unit 14.

Thereafter, the person area determination unit 15 receives the HOG feature amount of each change area from the change area feature amount calculation unit 14. Using the learned parameters, the person area determination unit 15 determines whether or not the HOG feature amount of the change area belongs to the person class. If it does, the person area determination unit 15 determines that the change area is a person area and outputs it as a person area to the representative person area selection unit 16.

On the other hand, if the HOG feature amount of the change area belongs to the non-person class, the person area determination unit 15 determines that the change area is not a person area and discards it. A person area is a rectangular area of the video that contains a person.
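The decision step above can be illustrated with a linear discriminant function. In this sketch the weight vector and threshold are assumed to come from SVM training that has already finished; the function names and the toy two-dimensional features used in the usage note are hypothetical, and real HOG features would have many more dimensions.

```python
import numpy as np


def discriminant(feature, weights, threshold):
    """Return 1 (person class) if w . x >= threshold, otherwise -1."""
    return 1 if float(np.dot(weights, feature)) >= threshold else -1


def keep_person_areas(change_areas, features, weights, threshold):
    """Keep the change areas classified as persons and discard the rest."""
    return [area for area, feature in zip(change_areas, features)
            if discriminant(feature, weights, threshold) == 1]
```

For example, with `weights = np.array([1.0, -1.0])` and `threshold = 0.5`, a change area whose feature is `[2.0, 0.0]` is kept, while one whose feature is `[0.0, 2.0]` is discarded.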

Although the person area determination unit 15 has been described as performing machine learning with a support vector machine, it is not limited to this. The person area determination unit 15 can use any machine learning method capable of discriminating between two classes, for example a decision tree or a neural network. Further, although the person area determination unit 15 has been described as performing supervised machine learning, it may instead perform unsupervised machine learning.

Returning to FIG. 2, the description of the configuration of the person area detection means 10 continues.
Here, the person action determination device 1 preferably detects only one person area per person. However, because similar image features exist around a person within a single frame image, multiple person areas are detected around that person. Multiple person areas are likewise detected for other persons located elsewhere in the same frame image. If multiple person areas are detected around the same person in a single frame image in this way, the person action determination device 1 has difficulty determining that person's action.

For this reason, the representative person area selection unit 16 receives the person areas from the person area determination unit 15 and, within a single frame image, selects one person area for each person.
The process of grouping the person areas of the same person from among multiple person areas and selecting one person area from each group is sometimes called clustering.

Below, the clustering performed by the representative person area selection unit 16 is described with reference to FIG. 6, in order from the first example to the third example. In each example, as shown in FIG. 6, it is assumed that four person areas R1 to R4 have been detected for person H1, three person areas R5 to R7 for person H2, and three person areas R8 to R10 for person H3.

<First example of selection: color histogram>
In this first example, the representative person area selection unit 16 uses a color histogram as the representative feature amount that serves as the criterion for selecting the representative person area.

The representative person area selection unit 16 calculates a color histogram distribution for each of the person areas R1 to R10 and obtains the distances between these distributions. The color histogram distributions of the same person are expected to be similar. The representative person area selection unit 16 therefore groups the person areas whose color histogram distances are at or below a preset threshold as the person areas of the same person. It then selects, from each group, the person area located toward the center as the representative person area for that person.

In the example of FIG. 6, the representative person area selection unit 16 groups the person areas into R1 to R4 for person H1, R5 to R7 for person H2, and R8 to R10 for person H3. From R1 to R4 it selects the centrally located representative person area R3 as the person area of person H1. From R5 to R7 it selects the centrally located representative person area R6 as the person area of person H2. From R8 to R10 it selects the centrally located representative person area R9 as the person area of person H3. The representative person area selection unit 16 then outputs the selected person areas R3, R6, and R9 to the person trajectory generation means 20 in FIG. 2. In FIG. 6, the representative person areas R3, R6, and R9 are drawn with bold lines.

<Second example: distance between person areas>
In this second example, the representative person area selection unit 16 uses the distance between person areas as the representative feature amount that serves as the criterion for selecting the representative person area.

The representative person area selection unit 16 calculates, for example, the distance between centroid positions as the distance between the input person areas R1 to R10. Person areas belonging to the same person are expected to be detected close to one another. The representative person area selection unit 16 therefore groups the person areas whose mutual distances are at or below a preset threshold as the person areas of the same person. It then selects, from each group, the person area located toward the center as the representative person area and outputs it.

In the example of FIG. 6, person areas R1 to R4 lie close together, person areas R5 to R7 lie close together, and person areas R8 to R10 lie close together. The representative person area selection unit 16 therefore groups the person areas into R1 to R4 for person H1, R5 to R7 for person H2, and R8 to R10 for person H3. It then selects person areas R3, R6, and R9 as in the first example and outputs them to the person trajectory generation means 20 in FIG. 2.
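The second example can be sketched as a threshold-based grouping of centroids followed by picking one area per group. The text does not specify the grouping algorithm or how "located toward the center" is measured, so this sketch makes two labeled assumptions: areas are linked transitively (single linkage via union-find), and the representative is the area whose centroid is nearest the mean centroid of its group. The box format `(x, y, width, height)` and all function names are hypothetical.

```python
import math


def centroid(box):
    """Centroid (x, y) of a box given as (x, y, width, height)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def cluster_by_centroid(boxes, max_dist):
    """Group box indices whose centroids are within max_dist (single linkage)."""
    parent = list(range(len(boxes)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    cents = [centroid(b) for b in boxes]
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if math.dist(cents[i], cents[j]) <= max_dist:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def representative(boxes, group):
    """Index of the box whose centroid is nearest the group's mean centroid."""
    cents = [centroid(boxes[i]) for i in group]
    mean = (sum(c[0] for c in cents) / len(cents),
            sum(c[1] for c in cents) / len(cents))
    return min(group, key=lambda i: math.dist(centroid(boxes[i]), mean))
```

The same structure would apply to the color histogram and size examples by swapping the distance function used inside `cluster_by_centroid`.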

<Third example: size of person area>
In this third example, the representative person area selection unit 16 uses the size of the person area as the representative feature amount that serves as the criterion for selecting the representative person area.

The representative person area selection unit 16 calculates the size (for example, the height or width) of each of the input person areas R1 to R10. Person areas belonging to the same person are expected to be detected with similar sizes. The representative person area selection unit 16 therefore groups the person areas whose size differences are at or below a preset threshold as the person areas of the same person. It then selects, from each group, the person area located toward the center as the representative person area and outputs it.

In the example of FIG. 6, person areas R1 to R4 are similar in size, person areas R5 to R7 are similar in size, and person areas R8 to R10 are similar in size. The representative person area selection unit 16 therefore groups the person areas into R1 to R4 for person H1, R5 to R7 for person H2, and R8 to R10 for person H3. It then selects person areas R3, R6, and R9 as in the first example and outputs them to the person trajectory generation means 20 in FIG. 2.

In this way, the representative person area selection unit 16 can use any of the first to third examples to assign one person area to each person in a single frame image. For example, the operator sets in advance which of the first to third examples is used. When only one person area is input, the representative person area selection unit 16 preferably outputs that person area to the person trajectory generation means 20 without performing clustering.

[Configuration of person trajectory generation means]
The configuration of the person trajectory generation means 20 is described below with reference to FIG. 7.
As shown in FIG. 7, the person trajectory generation means 20 includes a person area feature amount calculation unit 21, a feature amount database 22, a person area feature amount matching unit 23, a person trajectory generation unit 24, and a person prediction unit 25.

The person area feature amount calculation unit 21 receives a person area from the person area detection means 10 (see FIG. 2) and calculates the feature amount of that person area. It then outputs the calculated feature amount to the person area feature amount matching unit 23 as the feature amount of the person area in the frame image being processed (the current frame image). At the same time, the person area feature amount calculation unit 21 registers this feature amount in the feature amount database 22 described later.

Here, the person area feature amount calculation unit 21 may calculate any one of the HOG feature amount, the color histogram, or the person position as the feature amount of the person area. When the person position is used, the person area feature amount calculation unit 21 takes, for example, the centroid position of the person area as the person position.
The person area feature amount calculation unit 21 may also calculate, as the feature amount of the person area, a combination of the HOG feature amount and the color histogram, a combination of the HOG feature amount and the person position, or a combination of the color histogram and the person position.
Furthermore, the person area feature amount calculation unit 21 may calculate a combination of the HOG feature amount, the color histogram, and the person position as the feature amount of the person area.
The operator sets in advance which of these the person action determination device 1 uses as the feature amount of the person area.

The feature amount database 22 stores the feature amounts of the person areas calculated by the person area feature amount calculation unit 21. In other words, the feature amount database 22 stores the feature amounts of the person areas detected in past frame images.

The person area feature amount matching unit 23 refers to the feature amount database 22 and reads out the feature amounts of the person areas in past frame images. It also receives, from the person area feature amount calculation unit 21, the feature amount of the person area in the frame image being processed. In addition, it receives the predicted position and predicted area of the person from the person prediction unit 25 described later.

The person area feature amount matching unit 23 then compares the feature amount of the person area in the frame image being processed with the feature amounts of the person areas in past frame images and determines whether or not they are similar. Below, the matching of person area feature amounts by the person area feature amount matching unit 23 is described in order from the first example to the fifth example.

<First example: color histogram>
A first example of the matching of person area feature amounts by the person area feature amount matching unit 23 is described below with reference to FIG. 8 (see also FIG. 7 as appropriate). In this first example, the person area feature amount matching unit 23 uses a color histogram as the feature amount of the person area.

In FIG. 8, the color histogram on the left is that of the person area in the frame image being processed, and the color histogram on the right is that of the person area in a past frame image. Although FIG. 8 illustrates the red-blue range as the color histogram, other colors may be included in the color histogram in the present invention.

For different persons, as shown in FIG. 8(a), the color histogram of the person area in the frame image being processed differs greatly from that of the person area in the past frame image. For the same person, as shown in FIG. 8(b), the two color histograms are similar.

Accordingly, the person area feature amount matching unit 23 uses equation (1) to obtain the distance BC(p, q) between the color histogram distribution p of the person area in the frame image being processed and the color histogram distribution q of the person area in the past frame image. If the distance BC(p, q) calculated by equation (1) is at or below a preset threshold, the person area feature amount matching unit 23 determines that the color histograms are similar. If the distance BC(p, q) exceeds the preset threshold, it determines that the color histograms are not similar. Here, the person area feature amount matching unit 23 uses the Bhattacharyya distance as the distance BC(p, q).
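The histogram comparison above can be sketched as follows. Equation (1) itself is not reproduced in this text, so the sketch assumes the standard definition of the Bhattacharyya distance, D(p, q) = -ln Σ sqrt(p_i q_i), computed on normalized histograms; the bin count, the joint RGB layout, and the function names are hypothetical.

```python
import numpy as np


def color_histogram(pixels, bins=8):
    """Normalized joint RGB histogram of an (N, 3) array of 0-255 values."""
    hist, _ = np.histogramdd(pixels.astype(np.float64),
                             bins=(bins, bins, bins),
                             range=((0, 256), (0, 256), (0, 256)))
    total = hist.sum()
    return hist.ravel() / total if total > 0 else hist.ravel()


def bhattacharyya_distance(p, q, eps=1e-12):
    """Standard Bhattacharyya distance between two normalized histograms."""
    coefficient = float(np.sum(np.sqrt(p * q)))
    # Clamp so that histograms with no overlap give a large finite distance.
    return -float(np.log(max(coefficient, eps)))


def histograms_similar(p, q, threshold):
    """Similar when the distance is at or below the preset threshold."""
    return bhattacharyya_distance(p, q) <= threshold
```

Identical histograms give a distance of zero, and the distance grows as the overlap between the two distributions shrinks, which matches the "at or below the threshold means similar" rule in the text.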

The person area feature amount matching unit 23 also holds a person area ID list that stores person area IDs, each of which uniquely identifies a person area. When the color histograms are similar (the same person), the person area feature amount matching unit 23 refers to the person area ID list and assigns to the person area detected in the frame image being processed the person area ID that was assigned to that person area in the past frame image. That is, the person areas of the same person have the same person area ID in the past frame image and in the frame image being processed.

On the other hand, when the color histograms are not similar (a different person), the person area feature amount matching unit 23 assigns to the person area detected in the frame image being processed a new person area ID that is not yet registered in the person area ID list, and registers the newly assigned person area ID in the list. The person area feature amount matching unit 23 then outputs the person area, with its person area ID, to the person trajectory generation unit 24.
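The ID assignment described above can be sketched as a lookup against stored features. The data layout here (a dictionary from ID to the most recent feature, integer IDs, and choosing the nearest match when several past areas fall under the threshold) is an assumption for illustration; the text only states that a matching area inherits the past ID and a non-matching area receives a new, unregistered one.

```python
def assign_person_area_id(feature, known, id_list, distance, threshold):
    """Return the person area ID for a newly detected person area.

    known    : dict mapping person area ID -> feature from a past frame
    id_list  : list of all IDs issued so far (the person area ID list)
    distance : function(feature_a, feature_b) -> non-negative number
    """
    best_id, best_dist = None, None
    for area_id, past_feature in known.items():
        d = distance(feature, past_feature)
        if d <= threshold and (best_dist is None or d < best_dist):
            best_id, best_dist = area_id, d
    if best_id is None:
        # No similar past area: issue an ID that is not yet in the list.
        best_id = max(id_list, default=0) + 1
        id_list.append(best_id)
    known[best_id] = feature  # remember the latest feature for this ID
    return best_id
```

Any of the distances in the examples (color histogram, person position, HOG feature amount) could be passed as `distance`.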

<Second example: person position>
A second example of the matching of person area feature amounts by the person area feature amount matching unit 23 is described below. In this second example, the person area feature amount matching unit 23 uses the person position as the feature amount of the person area.

The person area feature amount matching unit 23 obtains the distance between the person position in the frame image being processed and the person position predicted from past frame images. Here, the person position in the frame image being processed is the person position input from the person area feature amount calculation unit 21, and the person position predicted from past frame images is the predicted position input from the person prediction unit 25.

If this distance is within a preset threshold, the person area feature amount matching unit 23 determines that the person positions are similar. If the distance exceeds the preset threshold, it determines that the person positions are not similar. Then, as in the first example described above, the person area feature amount matching unit 23 refers to the person area ID list, assigns a person area ID to the person area, and outputs the person area to the person trajectory generation unit 24.

Here, the person area feature amount matching unit 23 preferably outputs the person area to the person trajectory generation unit 24 only when the person position in the frame image being processed lies within the predicted area. In other words, when the person position in the frame image being processed does not lie within the predicted area, the person area feature amount matching unit 23 treats the person area as a false detection and skips the matching process. This prevents the person action determination device 1 from generating a person trajectory from a falsely detected person area.
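The position check and the predicted-area gate can be sketched together as follows. The representation of the predicted area as an axis-aligned rectangle `(x, y, width, height)`, the use of Euclidean distance, and the three-valued return are assumptions made for this illustration.

```python
import math


def inside(point, area):
    """True if point (x, y) lies within area (x, y, width, height)."""
    px, py = point
    ax, ay, w, h = area
    return ax <= px <= ax + w and ay <= py <= ay + h


def match_position(current, predicted, predicted_area, max_dist):
    """Return None to skip a suspected false detection, else True or False.

    current        : person position in the frame being processed
    predicted      : position predicted from past frames
    predicted_area : rectangle in which the person is expected to appear
    """
    if not inside(current, predicted_area):
        return None  # outside the predicted area: treat as false detection
    return math.dist(current, predicted) <= max_dist
```

A caller would forward the person area to trajectory generation only when the result is not `None`.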

<Third example: HOG feature amount>
A third example of the matching of person area feature amounts by the person area feature amount matching unit 23 is described below. In this third example, the person area feature amount matching unit 23 uses the HOG feature amount as the feature amount of the person area.

The person area feature amount matching unit 23 calculates, for each dimension of the HOG feature amount, the distance between the HOG feature amount in the frame image being processed and the HOG feature amount in the past frame image. It then calculates the sum of these per-dimension distances and, if the sum is at or below a preset threshold, determines that the HOG feature amounts are similar. If the sum exceeds the preset threshold, it determines that the HOG feature amounts are not similar. Then, as in the first example described above, the person area feature amount matching unit 23 refers to the person area ID list, assigns a person area ID to each person area, and outputs the person area to the person trajectory generation unit 24.
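The per-dimension comparison can be sketched in a few lines. The text does not say how the distance in each dimension is measured; this sketch assumes the absolute difference, which makes the sum an L1 distance. The function names are hypothetical.

```python
import numpy as np


def hog_distance(current, past):
    """Sum over dimensions of the absolute per-dimension difference."""
    a = np.asarray(current, dtype=float)
    b = np.asarray(past, dtype=float)
    return float(np.sum(np.abs(a - b)))


def hog_similar(current, past, threshold):
    """Similar when the summed distance is at or below the threshold."""
    return hog_distance(current, past) <= threshold
```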

<Fourth example: similarity>
A fourth example of the matching of person area feature amounts by the person area feature amount matching unit 23 is described below. In this fourth example, the person area feature amount matching unit 23 uses, as the feature amount of the person area, a similarity that combines the color histogram, the person position, and the HOG feature amount.

In this case, the person area feature amount matching unit 23 multiplies the person position, the color histogram, and the HOG feature amount each by a weight and adds the results to calculate the similarity. If the calculated similarity is at or below a preset threshold, the person area feature amount matching unit 23 determines that the person areas are similar. If the similarity exceeds the preset threshold, it determines that they are not similar. Then, as in the first example described above, the person area feature amount matching unit 23 refers to the person area ID list, assigns a person area ID to each person area, and outputs the person area to the person trajectory generation unit 24.

Specifically, the person area feature amount matching unit 23 calculates the similarity using equation (2). The person position, the color histogram, and the HOG feature amount can be calculated in the same manner as in the first to third examples described above, so their description is omitted.

In equation (2), Sim is the similarity, α is the weight of the person position, and Loc is the person position. β is the weight of the color histogram, and His is the color histogram. γ is the weight of the HOG feature amount, and Hog is the HOG feature amount.

The three weights α, β, and γ in equation (2) are set in advance. For any of the person position, the color histogram, and the HOG feature amount that is not used in calculating the similarity, the corresponding weight is set to zero.

Because the person position, the color histogram, and the HOG feature amount have different units, they are standardized. Specifically, the person area feature amount matching unit 23 standardizes each feature amount (person position, color histogram, and HOG feature amount) using equation (3) so that its mean is zero and its variance is one. This allows the person area feature amount matching unit 23 to eliminate the influence of the difference in units. In equation (3), X is the original feature amount, μ is the mean of X, σ is the variance of X, and Z is the standardized feature amount.
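The weighted combination and the standardization can be sketched as follows. Equations (2) and (3) are not reproduced in this text, so the forms used here, Sim = α·Loc + β·His + γ·Hog and Z = (X - μ) / s, are reconstructions from the symbol definitions above and should be read as assumptions. In particular, the sketch divides by the standard deviation s, since that is what yields unit variance. It also assumes that Loc, His, and Hog are the distance values from the first to third examples, collected over a set of candidate pairs.

```python
import numpy as np


def standardize(values):
    """Rescale values to zero mean and unit variance."""
    values = np.asarray(values, dtype=float)
    spread = values.std()
    if spread == 0.0:
        return np.zeros_like(values)  # constant input carries no information
    return (values - values.mean()) / spread


def similarity(loc, his, hog, alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted sum of the standardized distances for each candidate pair.

    loc, his, hog : position, color histogram, and HOG distances, one
                    entry per candidate. A weight of zero drops that term.
    """
    return (alpha * standardize(loc)
            + beta * standardize(his)
            + gamma * standardize(hog))
```

Because each term is a distance, a smaller combined value indicates a closer match, consistent with the "at or below the threshold means similar" rule stated for this example.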

<Fifth example: combination of threshold determinations>
A fifth example of the matching of person region feature amounts by the person region feature amount matching unit 23 is described below. In this fifth example, the person region feature amount matching unit 23 calculates the color histogram of the first example, the person position of the second example, and the HOG feature amount of the third example. If any one of the color histogram, the person position, and the HOG feature amount exceeds its respective threshold, the person region feature amount matching unit 23 determines that the person regions are not similar. If all of the color histogram, the person position, and the HOG feature amount are equal to or less than their respective thresholds, it determines that the person regions are similar. Thereafter, as in the first example described above, the person region feature amount matching unit 23 refers to the person region ID list, assigns a person region ID to each person region, and outputs the result to the person trajectory generation unit 24.
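The rule of the fifth example can be sketched in Python as follows. This is a minimal sketch, assuming that each of the three quantities is expressed as a difference between two person regions (smaller means more alike); the function name, dictionary keys, and threshold values are hypothetical.

```python
def is_similar(color_hist_diff, position_diff, hog_diff, thresholds):
    """Similar only when every quantity is at or below its own threshold."""
    values = {
        "color": color_hist_diff,
        "position": position_diff,
        "hog": hog_diff,
    }
    return all(values[name] <= thresholds[name] for name in values)


# Hypothetical thresholds, one per feature amount.
thresholds = {"color": 0.3, "position": 50.0, "hog": 0.4}

same_person = is_similar(0.2, 30.0, 0.4, thresholds)    # all within limits
other_person = is_similar(0.2, 60.0, 0.1, thresholds)   # position exceeds 50.0
```

A value exactly equal to its threshold counts as similar, matching the "equal to or less than" wording of the example.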

In this way, the person region feature amount matching unit 23 can select the representative person region using any one of the first to fifth examples. In this case, for example, an operator sets in advance which of the first to fifth examples is to be used.

Returning to FIG. 7, the description of the configuration of the person trajectory generation means 20 continues.
The person trajectory generation unit 24 receives, from the person region feature amount matching unit 23, the person regions to which person region IDs have been assigned. The person trajectory generation unit 24 then obtains the centroid position of the person region in each of the past frame images and the frame image being processed, and connects the centroid positions of the person regions assigned the same person region ID to generate a person trajectory.

The person trajectory is described in detail below with reference to FIG. 9 (see also FIG. 7 as appropriate). In FIG. 9, #1 to #9 denote the frame image numbers from oldest to newest: #1 corresponds to the oldest frame image, and #9 corresponds to the frame image being processed (the current frame image). Reference signs R1 to R9 denote the person regions of the same person detected in the respective frame images, and reference signs P1 to P9 denote the centroid positions of the person regions R1 to R9. To simplify the explanation, only some of the reference signs are shown in FIG. 9. The broken lines in FIG. 9(b) are described later.

As shown in FIG. 9(a), the person trajectory generation unit 24 obtains the centroid positions P1 to P8 of the person regions R1 to R8 detected in the past frame images (#1 to #8). The person trajectory generation unit 24 also obtains the centroid position P9 from the person region R9 detected in the frame image being processed (#9). The person trajectory generation unit 24 then connects the centroid positions P1 to P9 to generate a person trajectory as shown in FIG. 9(b). Thereafter, the person trajectory generation unit 24 outputs this person trajectory to the person prediction unit 25 and to the person action determination means 30 (see FIG. 11).

Here, each line segment connecting adjacent centroid positions in the person trajectory (P1 to P2, ..., P8 to P9) corresponds to a motion vector of that person. Hereinafter, a motion vector included in a person trajectory is referred to as a person trajectory motion vector. In the example of FIG. 9(b), the person trajectory contains eight person trajectory motion vectors V1 to V8.
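The relationship between person regions, centroid positions, and person trajectory motion vectors can be sketched as follows. This is an illustrative sketch only: it assumes that each person region is an axis-aligned bounding box (x, y, width, height) and takes the box center as the centroid position; the function names and sample coordinates are hypothetical.

```python
def centroid(region):
    """Center of a bounding box given as (x, y, width, height) (assumption)."""
    x, y, w, h = region
    return (x + w / 2.0, y + h / 2.0)


def motion_vectors(points):
    """Differences between consecutive centroid positions (V1, V2, ...)."""
    return [
        (x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]


# Hypothetical person regions R1..R9 of one person, oldest first.
regions = [(10 * i, 5 * i, 20, 40) for i in range(9)]

trajectory = [centroid(r) for r in regions]   # P1..P9
vectors = motion_vectors(trajectory)          # V1..V8
```

Nine centroid positions yield eight motion vectors, consistent with V1 to V8 in FIG. 9(b).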

Although FIG. 9 illustrates the person trajectory of a single person, when the video contains a plurality of persons, the person trajectory generation unit 24 preferably generates a person trajectory for each person.
The person trajectory generation unit 24 may also end the generation of the person trajectory for a given person once a preset number of frames (for example, 30 frames) has elapsed since the person region of that person was first detected.
When the person region of a given person is detected for the first time, the person trajectory generation unit 24 outputs the centroid position of the person region to the person prediction unit 25 as the current position of the person, instead of a person trajectory.

The person prediction unit 25 is described in detail below with reference to FIG. 10 (see also FIG. 7 as appropriate). In FIG. 10, the predicted position of the person is indicated by reference sign Yb.
The person prediction unit 25 receives the person trajectory from the person trajectory generation unit 24 and, from this person trajectory, predicts the prediction region Ya and the predicted position Yb of the person by means of a prediction filter. Here, the person prediction unit 25 uses a prediction filter such as a Kalman filter or a particle filter.

The Kalman filter predicts the prediction region Ya and the predicted position Yb from a plurality of observed positions (the centroid positions P1 to P9 in FIG. 9). Specifically, the Kalman filter calculates the Kalman gain from the error between the predicted position Yb and the observed position, and adjusts the amount of correction according to the magnitude of the gain. In the Kalman filter, the error covariance changes according to the detection status of the person region in each frame image, in addition to the amount of error.

As shown in FIG. 10(a), the Kalman filter shrinks the prediction region Ya when the person region is detected successfully. As shown in FIG. 10(b), the Kalman filter enlarges the prediction region Ya when the amount of error is large or when detection of the person region is unstable. The person prediction unit 25 then outputs the prediction region Ya and the predicted position Yb predicted by the Kalman filter to the person region feature amount matching unit 23.

In this way, even when detection of the person region fails because of occlusion, the Kalman filter can update the predicted position and the prediction region according to a motion model specified in advance. The person action determination device 1 can therefore re-detect the person region of the same person once the occlusion has passed, which improves reliability.
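The behavior described above can be illustrated with a small constant-velocity Kalman filter for a single image axis. This is a sketch under stated assumptions, not the filter of the embodiment: the constant-velocity motion model, the noise parameters q and r, the initial covariance, and the idea of deriving the size of the prediction region Ya from the position variance are all assumptions, and the class and method names are hypothetical.

```python
class ConstantVelocityKalman1D:
    """Kalman filter for one axis; the state is (position, velocity)."""

    def __init__(self, pos, q=1.0, r=4.0):
        self.x = [pos, 0.0]                      # state estimate
        self.P = [[100.0, 0.0], [0.0, 100.0]]    # error covariance
        self.q = q                               # process noise (assumed)
        self.r = r                               # measurement noise (assumed)

    def predict(self, dt=1.0):
        """Advance the state by the motion model; uncertainty grows."""
        p, v = self.x
        self.x = [p + dt * v, v]
        (a, b), (c, d) = self.P
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]
        return self.x[0]

    def update(self, z):
        """Correct the prediction with an observed position z."""
        (a, b), (c, d) = self.P
        s = a + self.r                 # innovation variance
        k0, k1 = a / s, c / s          # Kalman gain for H = [1, 0]
        y = z - self.x[0]              # error between observation and prediction
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        # P <- (I - K H) P
        self.P = [
            [(1 - k0) * a, (1 - k0) * b],
            [c - k1 * a, d - k1 * b],
        ]

    def position_variance(self):
        return self.P[0][0]


kf = ConstantVelocityKalman1D(pos=0.0)
for z in [2.0, 4.0, 6.0, 8.0]:         # frames with a successful detection
    kf.predict()
    kf.update(z)
var_tracked = kf.position_variance()

kf.predict()                            # occluded frame: prediction only
var_one_miss = kf.position_variance()
predicted = kf.predict()                # second occluded frame
var_two_miss = kf.position_variance()
```

In this sketch the position variance shrinks after each successful update and grows with every frame in which only the prediction step runs, which corresponds to the shrinking and enlarging of the prediction region Ya in FIG. 10.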

When the current position of a person, rather than a person trajectory, is input from the person trajectory generation unit 24, the person prediction unit 25 cannot predict the prediction region Ya and the predicted position Yb. In that case, the person prediction unit 25 outputs this current position to the person region feature amount matching unit 23 as the predicted position Yb.

[Configuration of person action determination means]
The configuration of the person action determination means 30 is described below with reference to FIG. 11.
As shown in FIG. 11, the person action determination means 30 includes a person trajectory feature amount calculation unit 31, a person trajectory feature amount normalization unit 32, and an action determination unit 33.

The person trajectory feature amount calculation unit 31 receives the person trajectories from the person trajectory generation means 20 and calculates a feature amount for each person trajectory. Here, the person trajectory feature amount calculation unit 31 calculates the feature amount of a person trajectory as a multidimensional feature amount that includes the tracking time, initial detection position, current detection position, movement direction, movement distance, average speed, average acceleration, and linearity.

Returning to FIG. 9, specific examples of calculating the feature amount of a person trajectory are described (see also FIG. 11 as appropriate). It is assumed here that the person trajectory of FIG. 9(b) has been input to the person trajectory feature amount calculation unit 31.

<First example: tracking time>
The person trajectory feature amount calculation unit 31 can calculate the tracking time from Equation (4) below.

Tracking time = (1 / frame rate) × frame interval × number of centroid positions ... Equation (4)

For example, consider a case in which the person trajectory contains nine centroid positions P1 to P9, the frame rate of the video is 30 frames per second, and person regions are detected at intervals of three frames. In this case, the person trajectory feature amount calculation unit 31 calculates a tracking time of 0.9 seconds, as shown in Equation (5).

0.9 = (1 / 30) × 3 × 9 ... Equation (5)
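Equations (4) and (5) can be checked with a few lines of Python. The function and parameter names are hypothetical; the arithmetic follows Equation (4) directly.

```python
def tracking_time(frame_rate, frame_interval, num_positions):
    """Equation (4): seconds covered by the detected centroid positions."""
    return frame_interval * num_positions / frame_rate


# Values from the example: 30 frames per second, detection every
# 3 frames, nine centroid positions P1..P9.
seconds = tracking_time(frame_rate=30, frame_interval=3, num_positions=9)
print(seconds)  # 0.9
```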

<Second example: initial detection position>
The person trajectory feature amount calculation unit 31 takes the centroid position P1 in the oldest frame image #1 as the initial detection position.

<Third example: current detection position>
The person trajectory feature amount calculation unit 31 takes the centroid position P9 in the frame image #9 being processed as the current detection position.

<Fourth example: movement direction>
The person trajectory feature amount calculation unit 31 calculates the movement direction by averaging the directions of all the person trajectory motion vectors V1 to V8.

<Fifth example: movement distance>
The person trajectory feature amount calculation unit 31 calculates the movement distance by summing the lengths of the line segments that connect the centroid positions in order from the initial detection position (centroid position P1) to the current detection position (P9).

<Sixth example: average speed>
The person trajectory feature amount calculation unit 31 calculates the average speed by averaging the magnitudes of all the person trajectory motion vectors V1 to V8.
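The fourth to sixth examples can be sketched in Python as follows. This is illustrative only. The specification does not state how directions are averaged, so the sketch assumes a circular mean (the angle of the sum of unit vectors), which avoids the wrap-around problem of averaging raw angles. Speeds are in pixels per detection interval. All function names are hypothetical.

```python
import math


def motion_vectors(points):
    """Differences between consecutive centroid positions."""
    return [
        (x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    ]


def movement_direction(vectors):
    """Mean direction in degrees, taken as a circular mean (assumption)."""
    sx = sy = 0.0
    for dx, dy in vectors:
        length = math.hypot(dx, dy)
        if length > 0:
            sx += dx / length
            sy += dy / length
    return math.degrees(math.atan2(sy, sx))


def movement_distance(points):
    """Total length of the polyline through the centroid positions."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def average_speed(vectors):
    """Mean magnitude of the motion vectors."""
    return sum(math.hypot(dx, dy) for dx, dy in vectors) / len(vectors)


# Hypothetical trajectory with three centroid positions.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
vectors = motion_vectors(points)

direction = movement_direction(vectors)
distance = movement_distance(points)
speed = average_speed(vectors)
```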

<Seventh example: average acceleration>
For each of the frames #1 to #9, the person trajectory feature amount calculation unit 31 obtains the difference between adjacent person trajectory motion vectors among V1 to V8. For example, as the acceleration at frame #4, the person trajectory feature amount calculation unit 31 obtains the difference between the person trajectory motion vector V5 and the person trajectory motion vector V4. The person trajectory feature amount calculation unit 31 then averages the accelerations of all the frames #1 to #9.

<Eighth example: linearity>
The person trajectory feature amount calculation unit 31 obtains an approximate straight line connecting the initial detection position (centroid position P1) to the current detection position (P9). The person trajectory feature amount calculation unit 31 then obtains the perpendicular from each detection position other than the initial and current detection positions (centroid positions P2 to P8) to the approximate straight line. Finally, the person trajectory feature amount calculation unit 31 calculates the average of the lengths of all the perpendiculars and uses this average as the linearity. In FIG. 9(b), the approximate straight line and the perpendicular from the centroid position P2 to the approximate straight line are shown by broken lines.
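The seventh and eighth examples can be sketched as follows. This is a minimal sketch under stated assumptions: the average acceleration is taken as the mean magnitude of the differences between consecutive motion vectors (n motion vectors give n − 1 differences), and the approximate straight line is taken as the line through the first and last centroid positions. The function names and sample points are hypothetical.

```python
import math


def average_acceleration(vectors):
    """Mean magnitude of differences between consecutive motion vectors."""
    diffs = [
        (bx - ax, by - ay)
        for (ax, ay), (bx, by) in zip(vectors, vectors[1:])
    ]
    return sum(math.hypot(dx, dy) for dx, dy in diffs) / len(diffs)


def linearity(points):
    """Mean perpendicular distance of interior points from the line
    through the first and last points (assumption for the 'approximate
    straight line')."""
    (x1, y1), (x2, y2) = points[0], points[-1]
    length = math.hypot(x2 - x1, y2 - y1)
    interior = points[1:-1]
    if length == 0 or not interior:
        return 0.0
    total = sum(
        abs((x2 - x1) * (y1 - y) - (x1 - x) * (y2 - y1)) / length
        for x, y in interior
    )
    return total / len(interior)


# Hypothetical data: a bent path and a perfectly straight path.
bent = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

bent_value = linearity(bent)
straight_value = linearity(straight)
accel = average_acceleration([(1.0, 0.0), (1.0, 0.0), (2.0, 0.0)])
```

With this definition a perfectly straight trajectory gives a value of zero, and the value increases as the trajectory deviates from the line.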

Thereafter, the person trajectory feature amount calculation unit 31 outputs the calculated feature amount of the person trajectory to the person trajectory feature amount normalization unit 32. The person trajectory feature amount calculation unit 31 need not calculate all of the tracking time, initial detection position, current detection position, movement direction, movement distance, average speed, average acceleration, and linearity; it may calculate only those used by the action determination unit 33 described later.

The normalization performed by the person trajectory feature amount normalization unit 32 is described below with reference to FIG. 12, divided into a preparation stage and a normalization stage (see also FIG. 11 as appropriate).

The movement distance between frame images, measured in pixels, differs depending on the detection position within the two-dimensional image coordinates. For example, consider a case in which the camera is installed so as to look down on people obliquely from above. In this case, even if a person moves at a constant speed, in the video captured by that camera the person trajectory motion vectors are larger on the near side and smaller on the far side. The person trajectory feature amount normalization unit 32 therefore corrects the error in the person trajectory motion vectors (that is, the error in speed) caused by the difference in detection position.

<Preparation stage>
First, a normalization video of a certain duration (for example, about two hours) is captured in advance with the camera and input to the person trajectory feature amount normalization unit 32. As shown in FIG. 12(a), the person trajectory feature amount normalization unit 32 divides this normalization video into a predetermined number of blocks (for example, three blocks vertically by four blocks horizontally). For each block, the person trajectory feature amount normalization unit 32 detects motion vectors within that block from the start to the end of the normalization video. It then averages all the motion vectors detected in each block to calculate an average motion vector for that block.

In FIG. 12(a), the average motion vector of each block is shown as an arrow. The length of the arrow represents the magnitude of the average motion vector, and the direction of the arrow represents its direction. As shown in FIG. 12(a), the average motion vector is larger on the near side of the camera (the lower side of FIG. 12) and smaller on the far side of the camera (the upper side of FIG. 12).

The person trajectory feature amount normalization unit 32 then calculates a normalization coefficient for each block so that the average motion vectors become equal across the blocks. For example, as shown in FIG. 12(b), because the average motion vector is larger on the near side of the camera, the person trajectory feature amount normalization unit 32 calculates a normalization coefficient of less than 1, such as 0.6 to 0.7, for the blocks in the lower row. Because the average motion vector is smaller on the far side of the camera, it calculates a normalization coefficient greater than 1, such as 1.2 to 1.5, for the blocks in the upper row. For the blocks in the middle row, it calculates a normalization coefficient close to 1.
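One way to obtain coefficients with these properties is sketched below. The specification does not give the formula, so the sketch assumes that each coefficient is the mean magnitude over all blocks divided by the mean magnitude of the block in question; multiplying a block's average magnitude by its coefficient then gives the same value for every block. The function name and the sample grid are hypothetical.

```python
def normalization_coefficients(block_mean_magnitudes):
    """Per-block coefficient = overall mean / block mean (assumption)."""
    flat = [m for row in block_mean_magnitudes for m in row]
    target = sum(flat) / len(flat)
    return [
        [target / m if m > 0 else 1.0 for m in row]
        for row in block_mean_magnitudes
    ]


# Hypothetical average motion vector magnitudes for a 3 x 4 grid,
# listed from the upper row (far side) to the lower row (near side).
means = [
    [4.0, 4.0, 4.0, 4.0],
    [6.0, 6.0, 6.0, 6.0],
    [8.0, 8.0, 8.0, 8.0],
]
coeffs = normalization_coefficients(means)
# Upper row -> 1.5, middle row -> 1.0, lower row -> 0.75
```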

To perform the normalization accurately, the normalization video and the video in which person regions are detected are preferably captured at the same location using the same camera.

<Normalization stage>
The person trajectory feature amount normalization unit 32 receives the feature amount of the person trajectory from the person trajectory feature amount calculation unit 31 and normalizes it. First, from the position at which a person trajectory motion vector was detected (for example, the start point or the end point of the person trajectory motion vector), the person trajectory feature amount normalization unit 32 determines which block contains that motion vector. It then multiplies the magnitude of the person trajectory motion vector by the normalization coefficient of the corresponding block. Finally, the person trajectory feature amount normalization unit 32 outputs the normalized feature amount of the person trajectory to the action determination unit 33.
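The block lookup and multiplication of the normalization stage can be sketched as follows. This is illustrative only: it assumes the start point of the motion vector decides the block, a 640 × 480 frame, and a hypothetical coefficient grid; the function names are not taken from the specification.

```python
import math


def block_index(point, frame_size, grid):
    """Row and column of the block that contains the given pixel position."""
    x, y = point
    width, height = frame_size
    rows, cols = grid
    col = min(int(x * cols / width), cols - 1)
    row = min(int(y * rows / height), rows - 1)
    return row, col


def normalized_magnitude(start, end, coeffs, frame_size):
    """Motion vector magnitude scaled by the coefficient of its block.
    The start point is used to choose the block (assumption)."""
    grid = (len(coeffs), len(coeffs[0]))
    row, col = block_index(start, frame_size, grid)
    magnitude = math.hypot(end[0] - start[0], end[1] - start[1])
    return magnitude * coeffs[row][col]


# Hypothetical coefficients: upper row (far) to lower row (near).
coeffs = [[1.5] * 4, [1.0] * 4, [0.75] * 4]
frame_size = (640, 480)

far = normalized_magnitude((100, 50), (103, 54), coeffs, frame_size)
near = normalized_magnitude((100, 400), (103, 404), coeffs, frame_size)
```

Both vectors have a raw magnitude of 5 pixels; after normalization the far-side vector is scaled up and the near-side vector is scaled down.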

This normalization corrects the magnitude of the person trajectory motion vectors, that is, the speed-related feature amounts, for differences in detection position. The person trajectory feature amount normalization unit 32 therefore need not normalize feature amounts that are unrelated to speed, such as the initial detection position, the current detection position, and the movement direction.

Returning to FIG. 11, the description of the configuration of the person action determination means 30 continues.
The action determination unit 33 receives the feature amount of the person trajectory from the person trajectory feature amount normalization unit 32 and determines the action of the person based on this feature amount. Although the action determination unit 33 is described here as performing rule-based determination by simple threshold processing, it may instead perform determination based on machine learning.

To determine each action of a person individually from the feature amount of the person trajectory, the action determination unit 33 includes a determination unit for each type of action. In the example of FIG. 11, the action determination unit 33 includes a running determination unit 33a, a meeting determination unit 33b, and an opposite-direction walking determination unit 33c. First to third examples of person action determination are described below in order.

<First example: "run">
The running determination unit 33a determines whether the feature amount of the person trajectory satisfies the action condition for "run". When the action condition is satisfied, the running determination unit 33a determines that the action of the person is "run" and outputs "run" as the determination result. When the action condition is not satisfied, the running determination unit 33a does not determine the action of the person to be "run" and outputs nothing. The action condition for "run" is set, for example, as follows.

<<Action condition for "run">>
Average speed > average speed threshold AND
Linearity > linearity threshold AND
Movement distance > movement distance threshold

In this action condition for "run", the average speed, movement distance, and linearity are the values calculated as the feature amount of the person trajectory. The average speed threshold, the movement distance threshold, and the linearity threshold are either set in advance or determined by machine learning.

Although the action condition for "run" uses AND, at least one of the ANDs may be replaced with OR. Likewise, although the action condition for "run" uses "greater than" (>), "greater than or equal to" (≥), "less than or equal to" (≤), or "less than" (<) may be used instead.
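The rule-based determination for "run" can be sketched in Python as follows. This is an illustrative sketch only; the dictionary keys, the function name, and the threshold values are hypothetical, and the comparison operators follow the condition as listed above.

```python
def determine_run(features, thresholds):
    """Return "run" when all three conditions hold, otherwise None
    (corresponding to outputting nothing)."""
    satisfied = (
        features["average_speed"] > thresholds["average_speed"]
        and features["linearity"] > thresholds["linearity"]
        and features["movement_distance"] > thresholds["movement_distance"]
    )
    return "run" if satisfied else None


# Hypothetical thresholds, set in advance.
thresholds = {
    "average_speed": 8.0,
    "linearity": 0.5,
    "movement_distance": 100.0,
}

fast = {"average_speed": 12.0, "linearity": 0.9, "movement_distance": 150.0}
slow = {"average_speed": 3.0, "linearity": 0.9, "movement_distance": 150.0}

fast_result = determine_run(fast, thresholds)
slow_result = determine_run(slow, thresholds)
```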

<Second example: "meet a person">
The meeting determination unit 33b determines whether the feature amount of the person trajectory satisfies the action condition for "meet a person". When the action condition is satisfied, the meeting determination unit 33b determines that the action of the person is "meet a person" and outputs "meet a person" as the determination result. When the action condition is not satisfied, the meeting determination unit 33b does not determine the action to be "meet a person" and outputs nothing. The action condition for "meet a person" is set, for example, as follows.

<<Action condition for "meet a person">>
Upper threshold of initial detection position > initial detection position > lower threshold of initial detection position AND
Upper threshold of current detection position > current detection position > lower threshold of current detection position AND
Movement distance > movement distance threshold AND
Average speed > average speed threshold AND
Average acceleration > average acceleration threshold AND
Linearity > linearity threshold

In this action condition for "meet a person", the initial detection position, current detection position, movement distance, average speed, average acceleration, and linearity are the values calculated as the feature amount of the person trajectory. The upper and lower thresholds of the initial detection position, the upper and lower thresholds of the current detection position, the movement distance threshold, the average speed threshold, the average acceleration threshold, and the linearity threshold are either set in advance or determined by machine learning.

Although the action condition for "meet a person" uses AND, at least one of the ANDs may be replaced with OR. Likewise, although the action condition for "meet a person" uses "greater than" (>), "greater than or equal to" (≥), "less than or equal to" (≤), or "less than" (<) may be used instead.

<Third example: "walk in the opposite direction">
The opposite-direction walking determination unit 33c determines whether the feature amount of the person trajectory satisfies the action condition for "walk in the opposite direction". When the action condition is satisfied, the opposite-direction walking determination unit 33c determines that the action of the person is "walk in the opposite direction" and outputs "walk in the opposite direction" as the determination result. When the action condition is not satisfied, the opposite-direction walking determination unit 33c does not determine the action to be "walk in the opposite direction" and outputs nothing. The action condition for "walk in the opposite direction" is set, for example, as follows.

<<Action condition for "walk in the opposite direction">>
Movement distance > movement distance threshold AND
Upper threshold of movement direction > movement direction > lower threshold of movement direction

In this action condition for "walk in the opposite direction", the movement distance and movement direction are the values calculated as the feature amount of the person trajectory. The movement distance threshold and the upper and lower thresholds of the movement direction are either set in advance or determined by machine learning.

Although the action condition for "walk in the opposite direction" uses AND, OR may be used instead. Likewise, although the action condition for "walk in the opposite direction" uses "greater than" (>), "greater than or equal to" (≥), "less than or equal to" (≤), or "less than" (<) may be used instead.
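The range condition on the movement direction can be sketched as follows. This is a minimal sketch, assuming that the movement direction is expressed in degrees and that the permitted range does not wrap around ±180 degrees; the function name, parameter names, and threshold values are hypothetical.

```python
def determine_opposite_walk(features, distance_threshold,
                            direction_lower, direction_upper):
    """Return the label when both conditions hold, otherwise None."""
    satisfied = (
        features["movement_distance"] > distance_threshold
        and direction_lower < features["movement_direction"] < direction_upper
    )
    return "walk in the opposite direction" if satisfied else None


# Hypothetical setup: most people move roughly toward -90 degrees, so
# directions between 60 and 120 degrees are treated as "opposite".
against_flow = {"movement_distance": 120.0, "movement_direction": 95.0}
with_flow = {"movement_distance": 120.0, "movement_direction": -88.0}

against_result = determine_opposite_walk(against_flow, 50.0, 60.0, 120.0)
with_result = determine_opposite_walk(with_flow, 50.0, 60.0, 120.0)
```

A range that straddles the ±180 degree boundary would need separate handling, which this sketch omits.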

Needless to say, the actions of a person determined by the action determination unit 33 are not limited to the first to third examples described above. For example, the action determination unit 33 may include an object placement determination unit (not shown) and determine "place an object". In this case, the object placement determination unit may determine "place an object" when the current detection position hardly changes and there is downward motion.

The action determination unit 33 is not limited to a determination method in which the feature amounts of the person trajectory and the thresholds used for action determination are set manually; it may use a determination method based on principal component analysis. The details of this determination method based on principal component analysis are described later as a third embodiment.

Here, "meet a person" and "place an object", for example, may well be determined at the same time. For this reason, the action determination unit 33 is configured so that the individual determination units, such as the meeting determination unit 33b and the object placement determination unit, are independent of one another. With this configuration, the action determination unit 33 could output contradictory determination results. However, by adding to the action condition for "place an object" a condition such as a low average speed, the action determination unit 33 can avoid simultaneously determining the mutually contradictory actions "place an object" and "run".
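The independent determination units can be sketched as a set of rules evaluated separately on the same feature amount. This is illustrative only: the feature names "position_change" and "downward_motion", all threshold values, and the function name are hypothetical and do not appear in the specification.

```python
def determine_actions(features, determiners):
    """Evaluate every determiner independently and collect the labels
    whose conditions are satisfied."""
    return [label for label, rule in determiners.items() if rule(features)]


determiners = {
    "run": lambda f: f["average_speed"] > 8.0,
    # The low-average-speed condition keeps "place an object" from
    # being output together with "run".
    "place an object": lambda f: (
        f["position_change"] < 2.0
        and f["downward_motion"] > 1.0
        and f["average_speed"] < 1.0
    ),
}

fast = {"average_speed": 9.0, "position_change": 1.0, "downward_motion": 2.0}
still = {"average_speed": 0.5, "position_change": 1.0, "downward_motion": 2.0}

fast_labels = determine_actions(fast, determiners)
still_labels = determine_actions(still, determiners)
```

Because each rule is evaluated on its own, compatible actions can still be reported together, while the added speed condition excludes the contradictory pair.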

[Overall operation of person action determination device]
The overall operation of the person action determination device 1 is described with reference to FIG. 13 (see also FIG. 1 as appropriate). As shown in FIG. 13, the person action determination device 1 receives video from the camera through the person region detection means 10 and detects one or more person regions contained in the video by machine learning (step S1).

Next, in the human behavior determination device 1, the person trajectory generation means 20 receives the person regions from the person region detection means 10 and calculates a feature amount for each person region. The person trajectory generation means 20 then determines whether the feature amounts of the person regions are similar. Finally, the person trajectory generation means 20 generates a person trajectory by connecting the centroid positions of the person regions of the same person (step S2).

In the human behavior determination device 1, the person behavior determination means 30 then receives the person trajectories from the person trajectory generation means 20 and calculates a feature amount for each person trajectory. The person behavior determination means 30 determines whether the feature amount of the person trajectory satisfies an action condition. When it does, the person behavior determination means 30 determines that the person is performing the action corresponding to that action condition (step S3).

[Operation of the person region detection means]
The operation of the person region detection means 10 is described below with reference to FIG. 14 (see also FIG. 2 as appropriate). In the person region detection means 10, the training data feature amount calculation unit 12 calculates the feature amounts of the training data (step S11).

The person region determination unit 15 then performs machine learning on the training data to obtain the learning parameters (step S12). Next, the change region detection unit 13 obtains change regions from the video (step S13).

The change region feature amount calculation unit 14 calculates the feature amount of each change region (step S14). The person region determination unit 15 then uses the learning parameters to determine whether each change region is a person region (step S15).

When a plurality of person regions are input from the person region determination unit 15, the representative person region selection unit 16 performs clustering and selects one person region (step S16).
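Step S16 can be illustrated with a short sketch. It assumes that regions are boxes `(x, y, width, height)`, that similarity is judged only by centroid distance against a hypothetical threshold `max_dist`, and that the region “nearest the center” is the one closest to the mean centroid of its group; the grouping procedure itself is a simple stand-in, since the clustering method is not specified here.

```python
import math


def centroid(box):
    # box = (x, y, width, height)
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def group_regions(boxes, max_dist):
    """Group boxes whose centroids lie within max_dist of a group member."""
    groups = []
    for box in boxes:
        c = centroid(box)
        near, far = [], []
        for g in groups:
            close = any(math.dist(c, centroid(o)) <= max_dist for o in g)
            (near if close else far).append(g)
        # Merge the new box with every group it is close to.
        merged = [box] + [o for g in near for o in g]
        groups = far + [merged]
    return groups


def select_representative(group):
    """Pick the box whose centroid is closest to the group's mean centroid."""
    cs = [centroid(b) for b in group]
    mx = sum(c[0] for c in cs) / len(cs)
    my = sum(c[1] for c in cs) / len(cs)
    return min(group, key=lambda b: math.dist(centroid(b), (mx, my)))


def representative_regions(boxes, max_dist):
    return [select_representative(g) for g in group_regions(boxes, max_dist)]
```

For three overlapping detections around one person and one isolated detection, `representative_regions` returns two boxes: the middle one of the overlapping group and the isolated one.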

[Operation of the person trajectory generation means]
The operation of the person trajectory generation means 20 is described below with reference to FIG. 15 (see also FIG. 7 as appropriate). In the person trajectory generation means 20, the person region feature amount calculation unit 21 calculates the feature amount of each person region and registers it in the feature amount database 22 (step S21).

The person region feature amount matching unit 23 then compares the feature amount of a person region in the current frame image with the feature amounts of person regions in past frame images, and determines whether they are person regions of the same person (step S22).

The person trajectory generation unit 24 obtains the centroid positions of the person regions of the same person in the past frame images and in the frame image being processed, and connects them to generate a person trajectory (step S23).
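Steps S21 to S23 can be sketched as below. The sketch assumes that each person region is summarized by a numeric feature vector (for instance a normalized color histogram), that similarity is a Euclidean distance compared against a hypothetical threshold, and that an in-memory dictionary stands in for the feature amount database 22.

```python
import math


class TrajectoryBuilder:
    def __init__(self, max_feature_dist):
        self.max_feature_dist = max_feature_dist  # hypothetical similarity threshold
        self.features = {}      # person id -> most recent feature vector
        self.trajectories = {}  # person id -> list of centroid positions
        self._next_id = 0

    def update(self, centroid, feature):
        """Match one detected region against past regions and extend a trajectory."""
        best_id, best_dist = None, self.max_feature_dist
        for pid, past in self.features.items():
            d = math.dist(feature, past)
            if d <= best_dist:
                best_id, best_dist = pid, d
        if best_id is None:
            # No sufficiently similar past region: treat as a new person.
            best_id = self._next_id
            self._next_id += 1
            self.trajectories[best_id] = []
        self.features[best_id] = feature
        self.trajectories[best_id].append(centroid)
        return best_id
```

Calling `update` once per detected region in each frame yields, for every person id, the list of connected centroid positions that forms the person trajectory.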

The person prediction unit 25 (a Kalman filter) then predicts the person's position from the person trajectory and outputs a prediction region and a predicted position (step S24).
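As an illustration of step S24, the following is a minimal constant-velocity Kalman filter applied independently to each image axis. The motion model, the noise variances, and the square prediction region of half-width `half_size` are assumptions for the example; the state model actually used by the person prediction unit 25 is not given here.

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one axis; state = (position, velocity)."""

    def __init__(self, position, pos_var=1000.0, vel_var=1000.0,
                 process_var=1e-3, meas_var=1.0):
        self.p, self.v = float(position), 0.0
        self.P = [[pos_var, 0.0], [0.0, vel_var]]  # state covariance
        self.q, self.r = process_var, meas_var

    def predict(self, dt=1.0):
        # x <- F x and P <- F P F^T + Q, with F = [[1, dt], [0, 1]].
        self.p += self.v * dt
        P = self.P
        a00 = P[0][0] + dt * P[1][0]
        a01 = P[0][1] + dt * P[1][1]
        self.P = [[a00 + dt * a01 + self.q, a01],
                  [P[1][0] + dt * P[1][1], P[1][1] + self.q]]
        return self.p

    def update(self, z):
        # Measurement is the position only, H = [1, 0].
        P = self.P
        s = P[0][0] + self.r
        k0, k1 = P[0][0] / s, P[1][0] / s
        y = z - self.p
        self.p += k0 * y
        self.v += k1 * y
        self.P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
                  [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]


def predict_region(kx, ky, half_size):
    """Advance both filters one frame; return predicted position and a box around it."""
    px, py = kx.predict(), ky.predict()
    region = (px - half_size, py - half_size, px + half_size, py + half_size)
    return (px, py), region
```

In use, each frame calls `predict` and then, when a matching person region is found, `update` with the measured centroid; the returned region can serve as the search area for the next frame.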

[Operation of the person behavior determination means]
The operation of the person behavior determination means 30 is described below with reference to FIG. 16 (see also FIG. 11 as appropriate). In the person behavior determination means 30, the person trajectory feature amount calculation unit 31 calculates a feature amount for each person trajectory. Here, the person trajectory feature amount calculation unit 31 calculates the feature amount of the person trajectory as a multidimensional feature amount that includes the tracking time, initial detection position, current detection position, movement direction, movement distance, average speed, average acceleration, and linearity (step S31).
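Step S31 can be sketched as a function that turns a list of centroid positions into the listed feature amounts. The exact definitions are assumptions made for illustration: direction is the angle from the first to the current position, distance is the accumulated path length, and linearity is the ratio of straight-line displacement to path length.

```python
import math


def trajectory_features(points, frame_interval=1.0):
    """points: centroid positions (x, y) of one person, oldest first."""
    step_lengths = [math.dist(a, b) for a, b in zip(points, points[1:])]
    tracking_time = (len(points) - 1) * frame_interval
    path_length = sum(step_lengths)
    first, current = points[0], points[-1]
    displacement = math.dist(first, current)
    speeds = [d / frame_interval for d in step_lengths]
    accels = [(b - a) / frame_interval for a, b in zip(speeds, speeds[1:])]
    return {
        "tracking_time": tracking_time,
        "first_position": first,
        "current_position": current,
        "direction": math.atan2(current[1] - first[1], current[0] - first[0]),
        "distance": path_length,
        "avg_speed": path_length / tracking_time if tracking_time else 0.0,
        "avg_acceleration": sum(accels) / len(accels) if accels else 0.0,
        # 1.0 for a perfectly straight path, smaller for a winding one.
        "linearity": displacement / path_length if path_length else 0.0,
    }
```

A straight trajectory gives a linearity of 1.0, while a trajectory that returns to its starting point gives a linearity of 0.0.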

The person trajectory feature amount normalization unit 32 then normalizes the feature amount of the person trajectory using preset normalization coefficients (step S32).

The behavior determination unit 33 determines the person's behavior based on the normalized feature amount of the person trajectory (step S33), and then outputs the determination result (step S34).

As described above, the human behavior determination device 1 according to the first embodiment of the present invention tracks a specific person in the video over a long period (for example, 30 frames) and generates a person trajectory. The device then calculates various feature amounts from this person trajectory and determines the person's behavior based on them. As a result, the human behavior determination device 1 can accurately determine a person's behavior even in a crowded scene in which occlusion occurs.

In addition, even when a plurality of person regions are detected around the same person in the same frame image, the representative person region selection unit 16 selects one person region, so the human behavior determination device 1 can determine that person's behavior more accurately. Because the person trajectory feature amount normalization unit 32 performs normalization, the device can determine the person's behavior more accurately regardless of the position of the camera. Furthermore, because the person behavior determination means 30 performs a simple rule-based determination, processing can be made faster.

(Second Embodiment)
With reference to FIG. 17, the following describes how the human behavior determination device 1B according to the second embodiment of the present invention differs from the first embodiment. In the human behavior determination device 1B, the person behavior determination means 30B includes the person trajectory feature amount calculation unit 31, the person trajectory feature amount normalization unit 32, the behavior determination unit 33, and a behavior presentation unit 34.

The behavior presentation unit 34 receives the video from the camera and adds the determination result of the behavior determination unit 33 to this video. The behavior presentation unit 34 may instead add the person trajectory to the video, or may add both the determination result and the person trajectory. The behavior presentation unit 34 then presents the video with the added determination result and/or person trajectory to the outside (for example, to a person monitoring the video).

As described above, the human behavior determination device 1B according to the second embodiment of the present invention presents to the outside a video to which the person's behavior and person trajectory have been added. The determination result and person trajectory can therefore be checked together with the actual video, which makes video-based monitoring easier.

(Third Embodiment)
With reference to FIG. 18, the following describes how the human behavior determination device 1C according to the third embodiment of the present invention differs from the first embodiment. The human behavior determination device 1C performs principal component analysis on the feature amounts of person trajectories, reduces the number of dimensions to about three, and creates a feature space. Within this feature space, the device creates, for example, a “run” class, a “walk in the opposite direction” class, and a “meet a person” class. It can then determine a person's behavior by evaluating which class the feature amount of a person trajectory falls into when projected into the feature space.

As shown in FIG. 18, in the human behavior determination device 1C, the person behavior determination means 30C includes the person trajectory feature amount calculation unit 31, the person trajectory feature amount normalization unit 32, a behavior determination unit 33C, a feature space database 35, and a person trajectory feature amount projection unit 36.

First, the human behavior determination device 1C generates person trajectories from training video prepared in advance, in the same way as in the first embodiment. The device then performs principal component analysis on the person trajectories generated from the training video to generate a feature space, and stores the generated feature space in the feature space database 35.

Here, principal component analysis is a method that combines (compresses) several correlated factors into a small number of components in order to obtain their overall strength and characteristics. The feature space contains an action class (mean coordinates and variance) for each action of a person.

The person trajectory feature amount projection unit 36 receives the feature amount of a person trajectory from the person trajectory feature amount normalization unit 32 and multiplies it by the eigenvectors to obtain the coordinates of the person trajectory in the feature space. The person trajectory feature amount projection unit 36 then outputs these coordinates to the behavior determination unit 33C.

The behavior determination unit 33C receives the coordinates of the person trajectory from the person trajectory feature amount projection unit 36 and calculates the distance between those coordinates and the action class obtained for each action. The behavior determination unit 33C then determines the person's behavior based on this distance (the action condition).
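The projection and the distance-based determination can be sketched as follows. The sketch assumes that the eigenvectors and the training mean are already stored (here passed in as plain lists), that the feature vector is centered on the training mean before projection (a common PCA convention, not stated above), and that each action class is described by mean coordinates and a hypothetical acceptance radius derived from its variance.

```python
import math


def project(feature, mean, eigenvectors):
    """Project a normalized trajectory feature vector into the feature space."""
    centered = [f - m for f, m in zip(feature, mean)]
    # One coordinate per eigenvector: the dot product with the centered feature.
    return [sum(c * e for c, e in zip(centered, vec)) for vec in eigenvectors]


def nearest_class(coords, classes):
    """classes: name -> {"mean": [...], "radius": float} (radius is hypothetical)."""
    distances = {name: math.dist(coords, c["mean"]) for name, c in classes.items()}
    name = min(distances, key=distances.get)
    if distances[name] <= classes[name]["radius"]:
        return name, distances[name]
    # The closest class is still too far away: no action is determined.
    return None, distances[name]
```

With three eigenvectors, `project` returns three coordinates, which correspond to a point in a three-dimensional feature space.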

The determination of person behavior using the feature space is described below with reference to FIG. 19 (see also FIG. 18 as appropriate). As described above, the feature amount of a person trajectory is a multidimensional feature amount that includes the tracking time, initial detection position, current detection position, movement direction, movement distance, average speed, average acceleration, and linearity. Compressing the dimensions of this multidimensional feature amount allows a given feature amount to be evaluated in a low-dimensional feature space.

As shown in FIG. 19, the feature amount of a person trajectory is compressed into a three-dimensional feature space expressed by the P1, P2, and P3 axes. As a result, the distance between feature amounts of person regions can be evaluated as the distance between points in the three-dimensional feature space. In FIG. 19, each point in the feature space represents the feature amount of a person trajectory.

Here, in the feature space of FIG. 19, suppose, for example, that the “run” class lies within the range denoted by reference sign CL1 and the “meet a person” class lies within the range denoted by reference sign CL2. In this case, the “run” determination unit 33a determines “run” if the coordinates of the person trajectory are within the range CL1 of the “run” class. Likewise, the “meet a person” determination unit 33b determines “meet a person” if the coordinates of the person trajectory are within the range CL2 of the “meet a person” class. The ranges CL1 and CL2 may be set in advance, or may be given by the variance of each action class.

Since typical feature points tend to gather at the center of the feature space, the behavior determination unit 33C may also regard feature points far from the center as abnormal values and determine whether a behavior is “normal behavior” or “abnormal behavior”.
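The normal/abnormal determination can be sketched as a distance test against the center of the feature space. How the center and the distance limit are chosen is not specified above; this example assumes they are estimated from training coordinates as the mean point and the mean distance plus `k` standard deviations, where `k` is a hypothetical tuning parameter.

```python
import math
import statistics


def fit_normal_range(training_coords, k=3.0):
    """Estimate the center of the feature space and a distance limit."""
    dims = len(training_coords[0])
    center = [statistics.fmean(c[i] for c in training_coords) for i in range(dims)]
    dists = [math.dist(c, center) for c in training_coords]
    limit = statistics.fmean(dists) + k * statistics.pstdev(dists)
    return center, limit


def classify(coords, center, limit):
    # Points far from the center are treated as abnormal values.
    if math.dist(coords, center) > limit:
        return "abnormal behavior"
    return "normal behavior"
```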

As described above, the human behavior determination device 1C according to the third embodiment of the present invention can reduce the number of dimensions of the feature amount of the person region, and can therefore reduce the processing load. Although compression to three dimensions was described in the third embodiment, the invention is not limited to this; for example, compression to five dimensions may be performed.

(Fourth Embodiment)
With reference to FIG. 20, the following describes how the human behavior determination device 1D according to the fourth embodiment of the present invention differs from the first embodiment. As shown in FIG. 20, the human behavior determination device 1D includes person region detection means 10D, the person trajectory generation means 20, person behavior determination means 30D, and video reverse playback means 40.

For the forward video, the person behavior determination means 30D determines a person's behavior in the same way as the person behavior determination means 30 of FIG. 1. When it determines a verification target action such as “run”, the person behavior determination means 30D outputs a reverse playback command to the video reverse playback means 40. The verification target actions are set in advance, for example, by an operator.

The video reverse playback means 40 includes a frame memory (not shown) that temporarily stores the video input from the camera. When a reverse playback command is input from the person behavior determination means 30D, the video reverse playback means 40 outputs reverse playback video to the person region detection means 10D. The reverse playback video is the video stored in the frame memory played back in reverse order.

The person region detection means 10D performs the same processing as the person region detection means 10 of FIG. 1, using the reverse playback video from the video reverse playback means 40.
The person trajectory generation means 20 performs the same processing as the person trajectory generation means 20 of FIG. 1, based on the person regions generated from the reverse playback video.

The person behavior determination means 30D then determines the person's behavior based on the person trajectory generated from the reverse playback video, in the same way as the person behavior determination means 30 of FIG. 1. When the same verification target action is determined for the same person in both the forward video and the reverse playback video, the person behavior determination means 30D outputs that determination result. Conversely, when the same verification target action cannot be determined for the same person in both the forward video and the reverse playback video, the person behavior determination means 30D outputs no determination result.
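The verification flow can be sketched with a bounded frame buffer and a check that the forward and reverse determinations agree. In this sketch `determine` is a hypothetical stand-in for the whole chain of region detection, trajectory generation, and behavior determination, and passing non-target actions through unverified is an assumption.

```python
from collections import deque


class ReversePlaybackBuffer:
    """Stand-in for the frame memory: keeps only the most recent frames."""

    def __init__(self, capacity):
        self.frames = deque(maxlen=capacity)

    def push(self, frame):
        self.frames.append(frame)

    def reversed_frames(self):
        # Stored frames in reverse order (newest first).
        return list(reversed(self.frames))


def verified_result(forward_action, buffer, determine, targets):
    """Return the action to output, or None when verification fails.

    determine: callable mapping a list of frames to an action label
               (hypothetical placeholder for the full processing chain).
    targets:   set of verification target actions.
    """
    if forward_action not in targets:
        # Assumption: actions that are not verification targets pass through.
        return forward_action
    reverse_action = determine(buffer.reversed_frames())
    return forward_action if reverse_action == forward_action else None
```

A result is output only when both playback directions yield the same verification target action; otherwise nothing is output for that person.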

As described above, the human behavior determination device 1D according to the fourth embodiment of the present invention verifies the determination result using reverse playback video, and can therefore further improve the accuracy of the determination result.

In each embodiment, the human behavior determination device according to the present invention has been described as a standalone device. However, the present invention can also be realized by a program that causes a general-purpose computer to function as each of the means described above. This program may be distributed via a communication line, or may be written to a recording medium such as a CD-ROM or flash memory and distributed.

The human behavior determination device according to the present invention can be used in the security field, in sports video analysis, in personal authentication, and in video search devices that use a person's behavior as a key. The device can also present the information it obtains in order to help viewers understand the content, and can therefore be used in fields that support video production for television broadcasting, the Internet, and similar media.

1, 1B, 1C, 1D  Human behavior determination device
10, 10D  Person region detection means
11  Person image database
12  Training data feature amount calculation unit
13  Change region detection unit
14  Change region feature amount calculation unit
15  Person region determination unit
16  Representative person region selection unit
20  Person trajectory generation means
21  Person region feature amount calculation unit
22  Feature amount database
23  Person region feature amount matching unit
24  Person trajectory generation unit
25  Person prediction unit
30, 30B, 30C, 30D  Person behavior determination means
31  Person trajectory feature amount calculation unit
32  Person trajectory feature amount normalization unit
33, 33C  Behavior determination unit
33a  “Run” determination unit
33b  “Meet a person” determination unit
33c  “Walk in the opposite direction” determination unit
34  Behavior presentation unit
35  Feature space database
36  Person trajectory feature amount projection unit
40  Video reverse playback means

Claims

A human behavior determination device that receives a video consisting of a plurality of consecutive frame images and determines the behavior of one or more persons included in the video, the device comprising:
person region detection means for detecting, by machine learning and at predetermined frame intervals, one or more person regions included in the video;
person trajectory generation means for calculating a feature amount for each person region detected by the person region detection means, determining person regions whose feature amounts are similar across the plurality of frame images to be person regions of the same person, and generating a person trajectory by connecting the centroid positions of the person regions of the same person; and
person behavior determination means for calculating a feature amount for each person trajectory generated by the person trajectory generation means, determining whether the feature amount of the person trajectory satisfies a preset action condition, and, when the feature amount of the person trajectory satisfies the action condition, determining that the person is performing the action corresponding to the action condition.

The human behavior determination device according to claim 1, wherein the person region detection means includes a representative person region selection unit that, for a plurality of person regions detected from the same frame image, calculates one of the distance between the regions, the distance between their color histogram distributions, or their difference in size as a representative feature amount, determines that the representative feature amounts are similar when the representative feature amount is equal to or less than a preset threshold, and selects, as the person region, the one representative person region located nearest the center among the person regions whose representative feature amounts are similar.

The human behavior determination device according to claim 1, further comprising a person prediction unit that predicts a prediction region of the person, using a prediction filter, from the person trajectory generated by the person trajectory generation means, wherein the person trajectory generation means generates the person trajectory when the centroid position of the person region is included in the prediction region.

The human behavior determination device according to claim 3, wherein the person behavior determination means determines the behavior of the person based on the action condition, in which a correspondence between feature amounts of the person trajectory and thresholds is set in advance for each action of the person.

The human behavior determination device according to any one of claims 1 to 4, wherein the person behavior determination means, upon determining that the person is performing a preset verification target action, outputs a reverse playback command, which is a command to play the video in the reverse direction, the device further comprising video reverse playback means that receives the reverse playback command from the person behavior determination means, plays the video in the reverse direction based on the reverse playback command, and outputs the video to the person region detection means.

A human behavior determination program for causing a computer, which receives a video consisting of a plurality of consecutive frame images and determines the behavior of one or more persons included in the video, to function as:
person region detection means for detecting, by machine learning and at predetermined frame intervals, one or more person regions included in the video;
person trajectory generation means for calculating a feature amount for each person region detected by the person region detection means, determining person regions whose feature amounts are similar across the plurality of frame images to be person regions of the same person, and generating a person trajectory by connecting the centroid positions of the person regions of the same person; and
person behavior determination means for calculating a feature amount for each person trajectory generated by the person trajectory generation means, determining whether the feature amount of the person trajectory satisfies a preset action condition, and, when the feature amount of the person trajectory satisfies the action condition, determining that the person is performing the action corresponding to the action condition.