JP2012033142A - Number-of-person measuring device, number-of-person measuring method and program

Info

Publication number: JP2012033142A
Application number: JP2011003958A
Authority: JP
Inventors: Isamu Igarashi; Naoki Ito; Hiroyuki Arai; Hideki Koike
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2010-02-18
Filing date: 2011-01-12
Publication date: 2012-02-16
Anticipated expiration: 2031-01-12
Also published as: JP5478520B2

Abstract

PROBLEM TO BE SOLVED: To provide a number-of-person measuring device that calculates the number of persons passing through an area shown in a video by estimating the number of persons shown in the video and their velocity as a group. SOLUTION: A number-of-passing-person measuring device that receives a motion vector for each pixel of an image included in a video and a foreground area of the image, and measures moving persons, comprises: projection means for projecting the motion vectors in the foreground area onto one dimension to generate a projection image; and extremal point extraction means for extracting extremal points among the pixels of the projection image generated by the projection means to detect the moving persons.

Description

The present invention relates to a people counting device, a people counting method, and a program that measure, by image processing, the number of people shown in a video captured by an imaging device and the number of people passing through it.

A known method measures the number of people passing through the region shown in a video by detecting and tracking people in a video composed of time-sequential frames (for example, Non-Patent Document 1). Another known method detects retrograde movement (movement against the normal flow) by observing the motion of moving regions in a video composed of time-sequential frames (Non-Patent Document 2).

Non-Patent Document 1: Oliver Sidla, Yuriy Lypetskyy, Norbert Brandle, and Stefan Seer, “Pedestrian detection and tracking for counting applications in crowded situations”, Proc. IEEE International Conference on Video and Signal Based Surveillance (AVSS06), p. 70, 2006
Non-Patent Document 2: Hiroyuki Arai, Takayuki Yasuno, Midori Mizukami, Miki Haseyama, “Method of Detecting Retrograde Persons from Video”, IEICE Technical Report, pp. 29-34, February 2006

However, in the passing person counting method of Non-Patent Document 1, counting accuracy depends heavily on tracking performance, and sufficient accuracy can be difficult to obtain, particularly in crowded scenes. The retrograde person detection method of Non-Patent Document 2 does not necessarily observe the movement of every person shown in the video, and therefore cannot measure the number of people passing.

In view of the above problems of the conventional methods, an object of the present invention is to provide a people counting device, a people counting method, and a program that determine the number of people passing through the region shown in a video by estimating the number of people shown in the video and their speed as a group.

The present invention is a people counting device that receives a motion vector for each pixel of an image included in a video and a foreground region of the image, and measures moving persons, the device comprising: projection means for projecting the motion vectors in the foreground region onto one dimension to generate a projection image; and extreme point extraction means for extracting extreme points among the pixels of the projection image generated by the projection means to detect the moving persons.

In the present invention, the extreme point extraction means extracts extreme points among the pixels of the projection image generated by the projection means and detects the positions of the moving persons on the projection axis.

The present invention comprises real space mapping means for generating a real space image by mapping, based on the image region corresponding to a person included in the image, from coordinates in the image to pixels in real space coordinates that indicate the position of that person in real space, wherein the projection processing means generates the projection image from the real space image generated by the real space mapping means.

In the present invention, the real space mapping means generates the real space image by performing the mapping such that a predetermined value is added to the pixels that fall within the range, in a two-dimensional space, of possible standing positions of a person image that includes the target pixel on the projection image.

In the present invention, the real space mapping means generates the real space image by performing a first mapping process in which a predetermined value is added to the pixels that fall within the range, in a logarithmic space, of possible standing positions of a person image that includes the target pixel on the projection image, and then mapping the result of the first mapping process onto a two-dimensional space.

The present invention comprises: cumulative image generation means for generating a cumulative image by accumulating, for each elapsed time, the extreme value images extracted by the extreme point extraction means; and number-of-people calculation means for measuring the number of moving persons based on the cumulative image generated by the cumulative image generation means.

The present invention comprises: cumulative image generation means for generating a cumulative image by accumulating, for each elapsed time, the extreme value images extracted by the extreme point extraction means; and representative speed calculation means for calculating, based on the cumulative image generated by the cumulative image generation means, a representative moving speed of the moving persons from the moving speed of each extreme point.

The present invention comprises: number-of-people calculation means for detecting the number of people included in the image; and passing number calculation means for calculating the number of people who pass through a region in the image within a fixed time, based on the number of people detected by the number-of-people calculation means and the representative speed calculated by the representative speed calculation means.

The present invention comprises: foreground image generation means for detecting a foreground portion in an input video and generating a foreground image; real space image generation means for generating the real space image by performing the mapping such that a predetermined value is added to the pixels that fall within the range, in a two-dimensional space, of possible standing positions of a person image that includes the target pixel on the foreground image; and in-region number calculation means for calculating the number of persons present in the real space region by summing the pixel values of the real space image.

The present invention comprises: foreground image generation means for detecting a foreground portion in an input video and generating a foreground image; real space image generation means for generating the real space image by performing a first mapping process in which a predetermined value is added to the pixels that fall within the range, in a logarithmic space, of possible standing positions of a person image that includes the target pixel on the foreground image, and then performing a second mapping process that maps the result of the first mapping process onto a two-dimensional space; and in-region number calculation means for calculating the number of persons present in the real space region by summing the pixel values of the real space image.

The present invention is a people counting method that uses a computer serving as a people counting device that receives a motion vector for each pixel of an image included in a video and a foreground region of the image, and measures moving persons, the method executing: a step in which projection means of the computer projects the motion vectors in the foreground region onto one dimension to generate a projection image; and a step in which extreme point extraction means of the computer extracts extreme points among the pixels of the projection image generated by the projection means to detect the moving persons.

The present invention is a program that causes a computer to function as each means of the people counting device according to any one of claims 1 to 10.

As described above, according to the present invention, the number of passing people can be measured without individually tracking each person shown in the video. This makes it possible to measure the number of passing people even in situations where tracking individuals is difficult. In addition, by extracting only the persons moving in the direction in which the flow of people is observed, the number of passing people can be measured for each direction even when people move in various directions.

FIG. 1 is a schematic block diagram showing the configuration of a passing person counting device according to an embodiment of the present invention.
FIG. 2 is a functional block diagram showing the functions of the passing person counting device 1.
FIG. 3 is a diagram explaining the process of generating a projection image.
FIG. 4 is a diagram explaining the process of generating a cumulative image.
FIG. 5 is a flowchart explaining the operation of the passing person counting device 1.
FIG. 6 is a diagram showing an example of calculated motion vectors.
FIG. 7 is a functional block diagram showing the configuration of the passing person counting device 1 in the second embodiment.
FIG. 8 is a functional block diagram showing the configuration of the passing person counting device 1 in the third embodiment.
FIG. 9 is a functional block diagram showing the configuration of the passing person counting device 1 in the fourth embodiment.
FIG. 10 is a functional block diagram showing the configuration of the passing person counting device 1 in the fifth embodiment.
FIG. 11 is a flowchart explaining the operation of the passing person counting device 1.
FIG. 12 is a diagram explaining how the mapping range is derived.
FIG. 13 is a diagram explaining how the mapping range is derived.
FIG. 14 is a diagram explaining an example of a method for realizing the mapping.
FIG. 15 is a diagram explaining an example of a method for realizing the mapping.
FIG. 16 is a diagram showing an example of mapping onto a real space image.
FIG. 17 is a diagram explaining the process of generating a projection image.
FIG. 18 is a diagram explaining the process of generating a cumulative image.
FIG. 19 is a functional block diagram showing the configuration of the passing person counting device 1 in the sixth embodiment.
FIG. 20 is a functional block diagram showing the configuration of the passing person counting device 1 in the seventh embodiment.
FIG. 21 is a functional block diagram showing the configuration of the passing person counting device 1 in the eighth embodiment.
FIG. 22 is a functional block diagram showing the configuration of the passing person counting device 1 in the ninth embodiment.
FIG. 23 is a flowchart showing the processing operation of the device shown in FIG. 22.
FIG. 24 is an explanatory diagram showing the processing operation of the passing person counting device 1 in the tenth embodiment.
FIG. 25 is an explanatory diagram showing the processing operation of the passing person counting device 1 in the tenth embodiment.

<First Embodiment>
Hereinafter, a passing person counting device according to a first embodiment of the present invention will be described with reference to the drawings. FIG. 1 is a schematic block diagram showing the configuration of a passing person counting device according to an embodiment of the present invention. The passing person counting device 1 is a computer to which a camera 2, which is an imaging device, and a display device 3 such as a liquid crystal display are externally connected. It has an input device 11, a ROM (read-only memory) 12, a CPU (central processing unit) 13, a RAM (random access memory) 14, an I/F (interface) 15, an external storage device 16, a recording medium driving device 17, and a recording medium 18, and it measures the number of people passing through the region shown in the video captured by the camera 2.

In the passing person counting device 1, the input device 11 is an input device such as a mouse or a keyboard. The ROM 12 stores predetermined programs and data. The CPU 13 reads out and executes the programs stored in the ROM 12 and controls each unit in the passing person counting device 1. The RAM 14 is accessed by the CPU 13 and temporarily stores various data. The I/F 15 connects the camera 2 to the passing person counting device 1 and receives the data output from the camera 2. The external storage device 16 is a storage device connected externally to the passing person counting device 1, for example a hard disk. The recording medium driving device 17 drives the recording medium 18 according to instructions from the CPU 13 and stores various data in the recording medium 18. The recording medium 18 stores various data.

The camera 2 is fixed at a predetermined position, images the region in which the number of passing people is to be measured, and outputs the imaging result to the passing person counting device 1. The camera 2 inputs the captured images (frame images) to the passing person counting device 1 in time series. The input data is, for example, a still image sequence or a video stream. The display device 3 displays various data on its screen in accordance with instructions from the passing person counting device 1.

Next, an example of an embodiment of the present invention will be described further. FIG. 2 is a functional block diagram showing the functions of the passing person counting device 1. The motion vector extraction unit 101 receives images from the camera 2 and outputs motion vectors. The motion vector extraction unit 101 detects optical flow (see, for example, Non-Patent Document 3) from the current frame and the immediately preceding frame of the video, and calculates, for the current input image captured by the camera, the motion vectors and the time t (number of frames) over which optical flow detection has continued. With u running from left to right and v from top to bottom, a motion vector is represented, for example, by the pixel position (u, v) of the optical flow detection start point and the motion direction (U, V).

Non-Patent Document 3: “Image Processing Standard Text Book”, Association for Promotion of Image Information Education, p. 79-280, 1997

The foreground detection unit 102 receives images from the camera 2 and generates a foreground image for the current input image. Here, the foreground image is an image in which points of the input image where a moving object is present (foreground points) are set to 1 and all other points (background points) are set to 0. Various methods of detecting the foreground image are known, and any of them may be applied. Examples include the methods of Non-Patent Documents 4 and 5 below.

Non-Patent Document 4: Hitoshi Habe, Toshikazu Wada, Takashi Matsuyama, “Background Subtraction Method Robust against Illumination Changes”, IPSJ SIG Technical Report (1998-CVIM115), Vol. 1999, No. 29, pp. 17-24, March 1999
Non-Patent Document 5: Kedar A. Patwardhan, Guillermo Sapiro, Vassilios Morellas, “Robust Foreground Detection in Video Using Pixel Layers”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 30, No. 4, April 2008
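
As an illustration of the foreground image defined above (1 for foreground, 0 for background), the following minimal Python sketch uses plain background differencing with a fixed threshold. It is a simple stand-in and not the method of Non-Patent Document 4 or 5; the function name `foreground_mask`, the use of NumPy, and the threshold value 25 are assumptions made for this example.

```python
import numpy as np

def foreground_mask(frame, background, diff_threshold=25):
    """Return a binary foreground image: 1 where the frame differs from the
    background by more than diff_threshold (an assumed value), else 0."""
    # Work in a signed type so the subtraction of uint8 images cannot wrap.
    diff = np.abs(frame.astype(np.int32) - background.astype(np.int32))
    return (diff > diff_threshold).astype(np.uint8)

# Toy grayscale data: an empty background and a frame with one bright object.
background = np.zeros((4, 6), dtype=np.uint8)
frame = background.copy()
frame[1:3, 2:4] = 200
mask = foreground_mask(frame, background)
```

In practice a fixed threshold is sensitive to lighting changes, which is the motivation for more robust methods such as those cited.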

The projection processing unit 103 receives the motion vectors output from the motion vector extraction unit 101 and the foreground image output from the foreground detection unit 102, and generates and outputs the projected motion vectors. The projection processing unit 103 first generates a binary image in which a point is set to 1 if the direction of its motion vector lies within a certain range, the absolute value of the motion vector is at or above a certain threshold, and the value of the foreground image at the same coordinates is 1, and is set to 0 otherwise. Then, as shown in FIG. 3, it takes a coordinate axis x in a certain direction and generates a one-dimensional projection image by summing the values of the pixels of the binary image that share the same position in the x direction.

In FIG. 3, the horizontal direction is taken as the coordinate axis x, and the sum of the values of the pixels that share the same position on the x axis is calculated. This calculation is performed at every position on the x axis. The figure shows, as an example, the case in which the data indicated by reference sign (a) is obtained as the projection image.

In this embodiment, the process of summing the values of the pixels that share the same position in the x direction of the image is called projection onto the coordinate axis x. When the binary image is generated, processing such as a morphological operation may be applied to remove the influence of local false detections of motion vectors.
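
The projection onto the coordinate axis x described above amounts to a column sum when the x axis coincides with the image columns. The following minimal sketch assumes that layout; the function name `project_onto_x` and the toy binary image are illustrative and not taken from the patent.

```python
import numpy as np

def project_onto_x(binary_image):
    """Sum the binary image over all pixels that share the same x position.
    Rows run top to bottom and columns run along x, so this is a column sum."""
    return binary_image.sum(axis=0)

# Toy binary image: 3 rows, 5 positions along x.
binary = np.array([[0, 1, 0, 0, 1],
                   [0, 1, 1, 0, 1],
                   [0, 1, 0, 0, 0]], dtype=np.uint8)
projection = project_onto_x(binary)   # one value per x position
```

For a vertical observation axis, the same idea would apply with `axis=1` instead.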

Here the coordinate axis x is taken in the horizontal direction, but the invention is not limited to this. The x axis can be taken along the direction of the flow of people to be measured, for example the u direction when observing flow in the left-right direction and the v direction when observing flow in the up-down direction. The method of determining the range of motion vector directions and the threshold is not limited here; for example, the following approach is conceivable.

(1) The range of motion vector directions is the range in which the x component of the vector points in the direction in which passage is to be measured. For example, when the x axis is taken equal to the u axis and rightward passage is measured, the range is that in which the x component of the motion vector is positive.
(2) The threshold on the absolute value is set to the smallest value that still discards the influence of motion detected from sources other than the movement of persons or of the objects whose passage is to be measured.
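
Criteria (1) and (2), combined with the foreground condition, can be sketched as a single mask operation. The example below assumes rightward measurement with the x axis equal to the u axis, and it interprets the "absolute value" as the Euclidean magnitude of (U, V); the function name `moving_pixel_mask` and the threshold value 1.0 are assumptions for illustration.

```python
import numpy as np

def moving_pixel_mask(flow_u, flow_v, foreground, min_magnitude=1.0):
    """Binary image: 1 where the motion vector points rightward (U > 0),
    its magnitude is at least min_magnitude, and the pixel is foreground."""
    magnitude = np.hypot(flow_u, flow_v)
    rightward = flow_u > 0
    keep = rightward & (magnitude >= min_magnitude) & (foreground == 1)
    return keep.astype(np.uint8)

# Toy per-pixel flow components and foreground image.
flow_u = np.array([[2.0, -2.0, 0.5],
                   [2.0,  2.0, 0.0]])
flow_v = np.zeros_like(flow_u)
foreground = np.array([[1, 1, 1],
                       [0, 1, 1]], dtype=np.uint8)
binary = moving_pixel_mask(flow_u, flow_v, foreground)
```

Only the pixels that satisfy all three conditions survive, so leftward motion, weak motion, and background pixels are all set to 0.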

The extreme point extraction unit 104 receives the projected motion vectors output from the projection processing unit 103 and outputs an extreme point image in which points where the value of the projection image takes a local maximum are set to 1 and all other points are set to 0.

The cumulative image generation unit 105 receives the extreme point images output from the extreme point extraction unit 104, and generates and outputs a cumulative image, which is an image showing the positions of the extreme points. FIG. 4 is a diagram for explaining the cumulative image. The cumulative image generation unit 105 generates the cumulative image from the local maximum images generated by the extreme point extraction unit 104. The extreme point images produced by the extreme point extraction unit 104 are stored in memory in correspondence with time t (FIG. 4(a), FIG. 4(b)). The cumulative image generation unit 105 then generates a two-dimensional cumulative image (x, t) showing the trajectories of the extreme points (FIG. 4(c)). That is, extreme value images, in which pixels at positions on the coordinate axis x that represent a local maximum are assigned 1 and all other pixels are assigned 0, are accumulated along time t to form the two-dimensional cumulative image. In FIG. 4, pixels that are local maxima and pixels that are not are shown in different colors.

The cumulative image is generated so that its size in the x direction matches that of the extreme point image. Its size in the t direction is made as small as possible while remaining large enough for trajectory candidates to be detected by the Hough transform described later.
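
The accumulation of one-dimensional extreme point images into a two-dimensional (x, t) image can be sketched with a fixed-length buffer. The class name `CumulativeImage`, the `max_frames` parameter, and the choice to place the oldest row first are assumptions for this example; the patent does not prescribe a data structure.

```python
from collections import deque
import numpy as np

class CumulativeImage:
    """Keeps the most recent extreme point rows as a (t, x) binary image."""

    def __init__(self, width, max_frames):
        self.width = width
        # deque drops the oldest row automatically once max_frames is reached.
        self.rows = deque(maxlen=max_frames)

    def push(self, extreme_row):
        row = np.asarray(extreme_row, dtype=np.uint8)
        if row.shape != (self.width,):
            raise ValueError("row width must match the extreme point image")
        self.rows.append(row)

    def image(self):
        if not self.rows:
            return np.zeros((0, self.width), dtype=np.uint8)
        return np.stack(self.rows)   # oldest row first (assumed ordering)

acc = CumulativeImage(width=5, max_frames=3)
for row in ([0, 1, 0, 0, 0], [0, 0, 1, 0, 0], [0, 0, 0, 1, 0], [0, 0, 0, 0, 1]):
    acc.push(row)
cumulative = acc.image()
```

The width is fixed to that of the extreme point image, and `max_frames` plays the role of the size in the t direction, which the text keeps as small as the later Hough transform allows.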

The representative speed calculation unit 106 receives the cumulative image output from the cumulative image generation unit 105, and generates and outputs the representative moving speed of the people in the image. There are various ways of generating the representative moving speed; they are described later.

The in-screen number calculation unit 107 receives images from the camera 2, calculates the number of people shown on the screen, and outputs the result. Specifically, from the video captured by the camera 2, it calculates how many of the people shown in the current input image are moving in the direction in which the flow of people is observed.

The passing number calculation unit 108 calculates the number of passing people based on the representative moving speed calculated by the representative speed calculation unit 106 and the number of people shown in the image calculated by the in-screen number calculation unit 107.

Next, the operation of the passing person counting device 1 configured as above will be described. FIG. 5 is a flowchart explaining the operation of the passing person counting device 1. First, when processing starts and image data is output from the camera 2, the motion vector extraction unit 101 detects optical flow from the current frame and the immediately preceding frame of the video output from the camera 2, and calculates, for the current input image captured by the camera, the motion vectors and the time t over which optical flow detection has continued (step S1). Here, a part of the image data from the camera 2, rather than the entire image, may be extracted and used as the processing target.

The method of calculating the optical flow is not limited here, but it is preferable to select a method that yields stable flow for the scene in question. For example, methods based on luminance gradients and methods based on region matching are conceivable; the luminance gradient approach is more effective when the degree of congestion is high, and the region matching approach is more effective when the degree of congestion is low. FIG. 6 shows an example in which motion vectors were calculated from actual video. FIG. 6(a) shows an example frame image extracted at a certain moment from the image data output by the camera 2. In this example, the image is divided into blocks of 8 × 8 pixels and the motion vectors are obtained using a method based on region matching. FIG. 6(b) shows the absolute value of the horizontal component of the motion vectors in this frame, represented as pixel brightness.
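
As a rough illustration of the region matching approach with 8 × 8 pixel blocks, the sketch below performs an exhaustive search that minimizes the sum of absolute differences (SAD) for each block. The SAD criterion, the search radius, the tie-breaking in favor of zero displacement, and the function name `block_matching_flow` are assumptions for this example; the patent does not specify these details.

```python
import numpy as np

def block_matching_flow(prev, curr, block=8, search=4):
    """Estimate one (U, V) displacement per block by exhaustive SAD search."""
    h, w = prev.shape
    prev = prev.astype(np.int32)
    curr = curr.astype(np.int32)
    rows, cols = h // block, w // block
    flow = np.zeros((rows, cols, 2), dtype=np.int32)
    for by in range(rows):
        for bx in range(cols):
            v0, u0 = by * block, bx * block
            ref = prev[v0:v0 + block, u0:u0 + block]
            # Start from zero displacement so flat regions report no motion.
            best = (0, 0)
            best_cost = np.abs(curr[v0:v0 + block, u0:u0 + block] - ref).sum()
            for dv in range(-search, search + 1):
                for du in range(-search, search + 1):
                    v1, u1 = v0 + dv, u0 + du
                    if v1 < 0 or u1 < 0 or v1 + block > h or u1 + block > w:
                        continue   # candidate window falls outside the image
                    cost = np.abs(curr[v1:v1 + block, u1:u1 + block] - ref).sum()
                    if cost < best_cost:
                        best_cost, best = cost, (du, dv)
            flow[by, bx] = best
    return flow

# Toy frames: a textured 8x8 patch that moves 2 pixels to the right.
prev = np.zeros((24, 24), dtype=np.uint8)
prev[8:16, 8:16] = (np.arange(64).reshape(8, 8) * 3 + 50).astype(np.uint8)
curr = np.roll(prev, 2, axis=1)
flow = block_matching_flow(prev, curr)
```

Here `flow[by, bx]` holds (U, V) for the block whose top-left corner is at row `by * 8`, column `bx * 8`. Blocks into which the patch moves can report spurious vectors, which is one reason the text combines motion vectors with a foreground image and a magnitude threshold.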

Meanwhile, the foreground detection unit 102 generates the foreground image for the current input image (step S2). Once the motion vectors and the foreground image have been generated, the projection processing unit 103 generates a binary image in which a point is set to 1 if the direction of its motion vector lies within a certain range, the absolute value is at or above a certain threshold, and the value of the foreground image at the same coordinates is 1, and is set to 0 otherwise, and from it generates a one-dimensional projection image (step S3). FIG. 6(c) shows, as a graph, the projection image onto the u axis generated from the motion vectors of FIG. 6(b).

Once the projection image has been generated, the extreme point extraction unit 104 generates an extreme point image in which points where the value of the projection image takes a local maximum are set to 1 and all other points are set to 0 (step S4). The number of local maxima indicates the number of persons in the image, provided that no two persons are at the same position in the x direction. Each point where the projection image takes a local maximum corresponds to a candidate person position in the x direction. Before the extreme values are extracted, processing such as applying a smoothing filter to the projection image may be performed.
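
Step S4, including the optional smoothing, can be sketched as follows. The moving average filter, the strict comparison with both neighbors, and the function name `extract_extreme_points` are assumptions for this example; the patent does not fix the smoothing filter or the treatment of flat peaks.

```python
import numpy as np

def extract_extreme_points(projection, smooth_width=1):
    """Return a binary row with 1 at strict local maxima of the projection.
    smooth_width > 1 applies a moving average first (assumed filter)."""
    p = np.asarray(projection, dtype=float)
    if smooth_width > 1:
        kernel = np.ones(smooth_width) / smooth_width
        p = np.convolve(p, kernel, mode="same")
    # Pad with -inf so the end points can be compared with a single neighbor.
    padded = np.concatenate(([-np.inf], p, [-np.inf]))
    left, centre, right = padded[:-2], padded[1:-1], padded[2:]
    # Strict comparison: flat-topped peaks are not marked in this sketch.
    is_peak = (centre > left) & (centre > right) & (centre > 0)
    return is_peak.astype(np.uint8)

projection = [0, 3, 1, 0, 2, 5, 2, 0]
extreme_row = extract_extreme_points(projection)
candidate_count = int(extreme_row.sum())   # number of candidate person positions
```

Smoothing reduces spurious maxima caused by noise in the projection, at the cost of merging peaks that lie close together.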

Once the extreme point image has been generated, the cumulative image generation unit 105 stores the extreme point images in memory in association with time t and generates a two-dimensional cumulative image (x, t) showing the trajectories of the extreme points (step S5). FIG. 6(d) is an example of an extreme point trajectory image generated from an actual video. The top edge of the image corresponds to FIG. 6(c).
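The accumulation of step S5 can be sketched with a fixed-length buffer. The class name `TrajectoryImage`, the bounded history, and the choice to place the newest row at the top are assumptions for illustration only.

```python
from collections import deque

class TrajectoryImage:
    """Keeps the most recent extreme point rows as an (x, t) image."""

    def __init__(self, max_frames):
        # Oldest rows fall off the bottom once max_frames rows are stored.
        self.rows = deque(maxlen=max_frames)

    def push(self, extreme_row):
        # Assumption: the newest frame is stored as the top row.
        self.rows.appendleft(list(extreme_row))

    def as_image(self):
        # Row index is time (newest first), column index is x.
        return [list(row) for row in self.rows]
```

With `max_frames=60` and rows of 80 values, the result has the 60 × 80 size used as an example for the cumulative image of FIG. 6(d).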

Once the cumulative image has been generated, the representative speed calculation unit 106 obtains, from the cumulative image, the representative moving speed of the people shown in the image (step S6). The representative moving speed can be obtained in various ways; the following three methods (Method 1 to Method 3) are examples.

(Method 1)
First, a θ-ρ Hough transform is applied to the cumulative image, and points in θ-ρ space whose pixel value (number of votes) is equal to or greater than a certain threshold are extracted (step S6-1-1). The method of determining the threshold is not limited here; for example, it is chosen according to how many extreme points must lie on a straight line in the cumulative image for that line to be regarded as an extreme point trajectory. When the cumulative image of FIG. 6(d) (for example, 60 pixels high × 80 pixels wide) is Hough-transformed with a θ step of 1 degree and a ρ step of 1 pixel, a threshold of about 10 to 15 is preferable. This value was determined experimentally so that both missed detections of line candidates that correspond to extreme point trajectories and false detections of line candidates that do not are few. To prevent false detection of line candidates that are not extreme point trajectories and to shorten computation time, the range of θ may be limited in the Hough transform. For example, under the assumption that a person's moving speed is 3 [pixel/frame] or less, the range can be set to 90° ≤ θ ≤ 109° when observing rightward movement.

Next, each straight line in the cumulative image that corresponds to an extracted point in θ-ρ space is taken as an extreme point trajectory, and the slope of the line is taken as the moving speed of that extreme point in the image (step S6-1-2). The slope of the line indicates how many pixels the corresponding extreme point moves in the x direction per frame.

Next, the representative moving speed of the people shown in the image is obtained from the estimated moving speeds of the individual extreme points (step S6-1-3). The method of calculating the representative moving speed is not limited here; for example, the mean or the median of the moving speeds may be taken, and it is preferable to choose whichever leads to more accurate people-count estimation for the scene. Taking the mean is more effective when people are few enough that each extreme point trajectory corresponds to a single person in the video; otherwise, taking the median is more effective.
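The choice between mean and median in step S6-1-3 can be sketched as follows. The `crowded` flag and the fallback value for an empty list are hypothetical; how a scene is judged crowded is left open by the source.

```python
from statistics import mean, median

def representative_speed(track_speeds, crowded):
    """Combine per-trajectory speeds (pixels/frame) into one value.

    crowded: hypothetical flag; True when trajectories may not map
    one-to-one to persons, in which case the median is used.
    """
    if not track_speeds:
        return 0.0  # assumption: no trajectories means no observed motion
    return median(track_speeds) if crowded else mean(track_speeds)
```

For speeds `[1.0, 1.2, 5.0]`, the median stays at 1.2 while the mean is pulled to about 2.4 by the outlier, which shows why the median is the safer choice when some detected lines may be spurious.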

(Method 2)
First, as in step S6-1-1, points in θ-ρ space whose pixel value is equal to or greater than a certain threshold are extracted (step S6-2-1). Next, in the binary image (θ, ρ) in which the points extracted in step S6-2-1 are 1 and all other points are 0, a projection with respect to the coordinate axis ρ is performed (see, for example, step S3 and FIG. 3) to generate the angular distribution of the extreme point trajectories (step S6-2-2).

Next, a representative angle θ̄ is calculated from the angular distribution of the extreme point trajectories, and the slope in x-t space corresponding to the angle θ̄ in θ-ρ space is taken as the representative moving speed of the people shown in the image (step S6-2-3). The method of calculating the representative angle θ̄ is not limited here; for example, the mean of the angular distribution may be taken, or the angle at which the distribution is largest may be selected. A process such as applying a smoothing filter to the angular distribution may be performed before the representative angle θ̄ is calculated.

(Method 3)
First, each pixel value of the θ-ρ image obtained by applying the θ-ρ Hough transform to the cumulative image is squared in order to emphasize points with many votes (step S6-3-1). Next, a projection with respect to the coordinate axis ρ is performed on the squared θ-ρ image (see step S3 and FIG. 3) to generate the angular distribution of the extreme point trajectories (step S6-3-2). Unlike in step S3, the θ-ρ image projected here is a multi-valued image. Next, as in step S6-2-3, the representative moving speed is calculated from the representative angle (step S6-3-3).
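The angular distributions of Method 2 and Method 3 differ only in how each Hough cell contributes before the sum over ρ. The sketch below assumes the accumulator is already available as `accumulator[theta_index][rho_index]` and picks the representative angle as the index with the largest value (one of the options named in step S6-2-3); the function names are hypothetical, and converting the angle index to a speed is left out because it depends on the Hough convention in use.

```python
def angle_distribution(accumulator, vote_threshold=None):
    """Collapse a theta-rho vote table into one value per theta.

    With vote_threshold set, cells are binarized first (Method 2).
    Without it, votes are squared before summing (Method 3).
    """
    dist = []
    for row in accumulator:
        if vote_threshold is not None:
            dist.append(sum(1 for v in row if v >= vote_threshold))
        else:
            dist.append(sum(v * v for v in row))
    return dist

def representative_angle_index(dist):
    # Index of the first theta bin with the largest value.
    return max(range(len(dist)), key=dist.__getitem__)
```

The two methods can disagree: a single cell with very many votes dominates the squared sum of Method 3, whereas Method 2 favors the angle at which the most cells pass the threshold.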

Next, the in-screen number calculation unit 107 calculates, from the video captured by the camera 2, the number of people in the current input image who are moving in the direction in which the flow of people is observed (step S7). The method of calculating the number of people is not limited here; for example, the method of Non-Patent Document 6 below is used, with the binary image generated in step S3 as its foreground image. Because the method of Non-Patent Document 6 can estimate the number of people stably even when many people are in the image, it is a preferable method for accurately measuring the number of passing people in such situations. In addition, using the binary image generated in step S3 as the foreground image makes it possible to calculate the number of people moving in the observed direction even when some people are moving in other directions.

Non-Patent Document 6: Hiroyuki ARAI, Isao MIYAGAWA, Hideki KOIKE, and Miki ASEYAMA, Members, "Estimating Number of People Using Calibrated Monocular Camera Based on Geometrical Analysis of Surface Area", IEICE Trans. Fundamentals, Vol. E92-A, No. 8, pp. 1932-1938, August 2009

The number of people moving in the observed direction obtained in step S7 may instead be the number obtained from the count of local maxima in step S4. However, in crowded situations where multiple people overlap in the image as they move, so that the method of step S4 would count them as one person, it is useful to use the method of Non-Patent Document 6.

Next, the passing number calculation unit 108 calculates the instantaneous number of passing people for the current input image from the representative moving speed calculated in step S6 and the number of people calculated in step S7 (step S8). The method of calculating the instantaneous number of passing people is not limited here; for example, the representative moving speed is divided by the length of the image in the x direction and multiplied by the number of people moving in the x direction in the image. In other words, one person moving by the image length in the x direction is regarded as one person passing, and the calculation gives how many people moved by what multiple of the image length in the x direction between the previous frame and the current frame. As an example, when three people move by 0.1 times the image width in the x direction during one frame, the instantaneous number of passing people is 3 × 0.1 = 0.3.
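The example calculation of step S8 can be written out as a short sketch. The function and parameter names are hypothetical, and the 80-pixel width is borrowed from the example size of the cumulative image only to make the numbers concrete.

```python
def instantaneous_passing(speed_px_per_frame, image_width_px, people_in_view):
    """Instantaneous passing count for one frame (step S8 example).

    speed / width is the fraction of the image length covered in one
    frame; multiplying by the head count gives person-passages per frame.
    """
    return people_in_view * speed_px_per_frame / image_width_px

# Three people moving 8 px/frame across an 80 px wide image:
# each covers 0.1 image widths per frame, so the result is about 0.3.
example = instantaneous_passing(8.0, 80, 3)
```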

The passing number calculation unit 108 determines whether the number of passing people has been calculated for all directions. When there are multiple directions in which passage is to be measured and the calculation has not been completed for all of them, the processing from step S3 to step S8 is repeated for each direction to be measured to calculate the number of passing people (step S9).

Next, the passing number calculation unit 108 calculates the cumulative number of passing people in each direction by cumulatively adding the instantaneous number of passing people for each direction in which passage is measured. The cumulative number of passing people is the number of people who have passed through the video in each direction between the frame at which processing started and the current frame. For example, in the same situation as the example given for step S8, if the people move at constant speed, the three people take 10 frames to move from one edge of the image to the opposite edge. Accumulating the instantaneous number of passing people, 0.3, over 10 frames gives 0.3 × 10 = 3, so the cumulative number of passing people increases by 3. A cumulative value of 3 indicates either that three people moved by the image length in the x direction or, equivalently, that six people moved by 1/2 of the image length in the x direction. In this way, an approximate number of people who moved by the image length in the x direction within a given time can be calculated from the images alone, without using other sensors.
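The accumulation described above amounts to a running sum of the per-frame values. The following sketch uses a hypothetical function name and reproduces the 10-frame example; because the values are floating point, the final total is compared with a tolerance rather than for exact equality.

```python
def accumulate_passing(instantaneous_counts):
    """Return the running total of instantaneous passing counts."""
    total = 0.0
    history = []
    for count in instantaneous_counts:
        total += count
        history.append(total)
    return history

# Ten frames at 0.3 person-passages per frame accumulate to about 3.
history = accumulate_passing([0.3] * 10)
```

In a multi-direction setup, one such running total would be kept per measured direction.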

The passing number calculation unit 108 then determines whether the end condition is satisfied (step S10). If it is satisfied, the processing ends; if not, one new frame of the video is acquired (step S11) and the processing from step S1 to step S10 is repeated.

In the embodiment described above, after step S9, the passing number calculation unit 108 may determine whether the image contains multiple ranges in which passage is to be measured; if so, the processing from step S1 to step S9 is performed on the partial image of each range to be measured, and once all the partial images have been processed, the processing moves to step S10.

In the embodiment described above, the processing performed by the passing number measuring device 1 on the images obtained from the camera 2 may or may not be performed in real time.

In the embodiment described above, the processing of steps S1 and S2 (motion vector calculation and foreground extraction) may be omitted; in that case the image of the input frame, the motion vector of each pixel of that image, and the foreground region of that image are taken as input, and the processing from step S3 onward is performed.

<Second Embodiment>
FIG. 7 is a functional block diagram showing the configuration of the passing number measuring device 1 in the second embodiment. Parts corresponding to those in FIG. 2 are given the same reference numerals, and only the differences are described. The projection processing unit 103 receives the image data output from the camera 2, the motion vector of each pixel, and the foreground image of the image. The image data, motion vectors, and foreground image may, for example, be stored in a storage device and read out by the projection processing unit 103. The passing number calculation unit 108 calculates the number of passing people from the cumulative image generated by the cumulative image generation unit 105. For example, the number of moving persons is calculated from the local maxima of the projected images contained in the cumulative image.

<Third Embodiment>
FIG. 8 is a functional block diagram showing the configuration of the passing number measuring device 1 in the third embodiment. Parts corresponding to those in FIG. 2 are given the same reference numerals, and only the differences are described. The projection processing unit 103 receives the image data output from the camera 2, the motion vector of each pixel, and the foreground image of the image. The image data, motion vectors, and foreground image may, for example, be stored in a storage device and read out by the projection processing unit 103. Here, for example, the image data, motion vectors, and foreground image are treated as known, and the speed of the moving persons is calculated by obtaining the representative moving speed from them.

<Fourth Embodiment>
FIG. 9 is a functional block diagram showing the configuration of the passing number measuring device 1 in the fourth embodiment. Parts corresponding to those in FIG. 2 are given the same reference numerals, and only the differences are described. The projection processing unit 103 receives the image data output from the camera 2, the motion vector of each pixel, and the foreground image of the image. The image data, motion vectors, and foreground image may, for example, be stored in a storage device and read out by the projection processing unit 103. Here, for example, the image data, motion vectors, and foreground image are treated as known, and from them the cumulative number of people who move, within a given time, the distance of the region shown in the image (the image length), that is, the cumulative number of passing people, is calculated.

<Fifth Embodiment>
Next, the passing number measuring device 1 in the fifth embodiment is described. FIG. 10 is a schematic block diagram showing the configuration of a passing number measuring device according to an embodiment of the present invention. The motion vector extraction unit 300 receives an image from the camera 2 and outputs motion vectors. The foreground detection unit 301 receives an image from the camera 2 and outputs a foreground image. The real space mapping unit 302 receives the motion vectors output from the motion vector extraction unit 300 and the foreground image output from the foreground detection unit 301, and outputs a real space image and the number of persons shown on the screen. The projection processing unit 303 receives the real space image output from the real space mapping unit 302, and generates and outputs projected motion vectors. The extreme point extraction unit 304 receives the projected motion information output from the projection processing unit 303, and calculates and outputs the extreme points of the motion information.

The cumulative image generation unit 305 receives the extreme points of the motion information output from the extreme point extraction unit 304 and outputs an image showing the positions of the extreme points. The representative speed calculation unit 306 receives the cumulative image output from the cumulative image generation unit 305, and generates and outputs the representative moving speed of the people in the image. The representative moving speed can be generated in various ways, which are described later. The passing number calculation unit 307 receives the number of persons shown in the image, output from the real space mapping unit 302, and the representative moving speed of the people in the image, output from the representative speed calculation unit 306, and calculates and outputs, from the video captured by the camera 2, the number of people in the current input image who are moving in the direction in which the flow of people is observed.

Next, the operation of the passing number measuring device 1 configured as described above is described. FIG. 11 is a flowchart explaining the operation of the passing number measuring device 1. First, when processing starts and image data is output from the camera 2, the foreground detection unit 301 generates a partial image by extracting a preset range from the current input image output from the camera 2 (step S21). The range to be extracted may be determined, for example, as the region of the input image showing the place where passage is to be counted, or as the region that remains after excluding parts where people clearly cannot appear (walls, fixed equipment, and the like). The range may also be the entire input image.

Next, the foreground detection unit 301 generates a foreground image for the extracted partial image (step S22). Here, the foreground image is an image in which points of the partial image where a moving object is present, that is, foreground points, are 1 and all other points, that is, background points, are 0. Various methods of detecting the foreground are known, and any of them may be applied. Examples include the methods of Non-Patent Documents 4 and 5 mentioned above.

Meanwhile, the motion vector extraction unit 300 detects the optical flow (see Non-Patent Document 3 mentioned above) in the foreground portion from the partial image of the current frame of the video and the partial image of the previous frame, and calculates, for the current partial image captured by the camera, the motion vectors and the time t (number of frames) over which optical flow detection has continued (step S23). With x running from left to right and y from top to bottom, for example, the motion vector is expressed as the amount of motion (V_x, V_y) at each pixel position (x, y).

The method of calculating the optical flow is not limited here, but it is preferable to select a method that yields a stable flow for the scene at hand. For example, a method based on luminance gradients or a method based on region matching may be used; the luminance-gradient method is more effective when the density of people in the imaged region is high, and the region-matching method is more effective when the density is low.

Next, the real space mapping unit 302 generates a binary image in which points whose motion vector direction lies within a chosen range and whose motion vector absolute value is equal to or greater than a chosen threshold are 1 and all other points are 0 (step S24). When generating the binary image, processing such as a morphological operation may be applied to remove the effect of local false detections of motion vectors. The method of determining the direction range and the threshold is not limited here; for example, the following may be used.

(1) The range of motion vector directions is the range in which the vector's component along the axis of measurement points in the direction in which passage is to be measured. For example, when measuring rightward passage, it is the range in which the x component of the motion vector is positive.

(2) The threshold on the absolute value is determined experimentally as the smallest value that is still large enough to discard motion detected from causes other than the movement of people.
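Step S24 with rule (1) for rightward passage, followed by a cleanup pass, can be sketched as below. The source only says that "a morphological operation" may be used; the 3 × 3 opening (erosion followed by dilation) shown here is one common choice and is an assumption, as are all function names.

```python
import math

def rightward_mask(vx, vy, min_speed):
    """1 where the x component is positive and the magnitude passes the threshold."""
    return [[1 if vx[r][c] > 0 and math.hypot(vx[r][c], vy[r][c]) >= min_speed else 0
             for c in range(len(vx[0]))] for r in range(len(vx))]

def _window(img, r, c):
    # 3x3 neighbourhood; pixels outside the image are treated as 0.
    rows, cols = len(img), len(img[0])
    return [img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
            for rr in (r - 1, r, r + 1) for cc in (c - 1, c, c + 1)]

def erode(img):
    return [[int(all(_window(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]

def dilate(img):
    return [[int(any(_window(img, r, c))) for c in range(len(img[0]))]
            for r in range(len(img))]

def opening(img):
    # Removes blobs smaller than the 3x3 window, such as isolated pixels.
    return dilate(erode(img))
```

An opening discards isolated false detections while restoring the extent of regions large enough to survive the erosion; the window size would need tuning against the apparent size of a person in the image.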

Next, the real space mapping unit 302 creates a real space image representing the positions of people in real space (step S25). This is done as follows: for each pixel whose value in the binary image generated in step S24 is 1, a predetermined value is added to the pixels of the real space image that fall within the range on the two-dimensional real space X-Y that could be the "standing position of the person image containing that pixel". Hereinafter, this addition of pixel values is called mapping.

The derivation of the mapping range is described with reference to FIGS. 12 and 13. As a premise, a person is modeled as a rectangular flat plate of height h and width w facing the camera (h and w are determined in advance; for example, h = 1.7 [m] and w = 0.3 [m]). It is also assumed that there is no rotation about the optical axis of the camera.

Suppose that the pixel at coordinates (x, y) is the center of the bottom edge (feet) of the person rectangle, and alternatively that it is the center of the top edge (top of the head), and consider the corresponding standing positions P(x, y) and P'(x, y) on the two-dimensional real space X-Y, respectively (see FIG. 13). The coordinates of P(x, y) and P'(x, y) can be obtained from (x, y) using, for example, equation (12) of Non-Patent Document 6 mentioned above.

In this case, the range of real space coordinates that applies when the pixel in question is an arbitrary point on the central axis of the person is the line segment connecting the two points P(x, y) and P'(x, y). Furthermore, when P'(x, y) is sufficiently far from the camera position (hereinafter taken as the origin) and the width of the person in the image is sufficiently large relative to the pixel width, the range of real space coordinates that applies when the pixel in question is an arbitrary point on the person, that is, the mapping range sought, can be approximated by giving this line segment a width of w: a rectangle whose vertices are the four points obtained by moving the two points by ±w/2 in the direction perpendicular to the line segment (see FIG. 13).

Pixel values are mapped onto the range derived above. An example of a specific way to implement the mapping is described below with reference to FIGS. 14 and 15.

First, a real space image in which every pixel has the value 0 is created as the mapping destination (step S25-1). The real space image is sized to cover the mapping ranges of all pixels of the binary image generated in step S24. As shown in FIG. 14, the range to be covered can be calculated by considering the range that covers the mapping ranges of the pixels at the four corners of the binary image. For example, when the aspect ratio of the captured image is 4:3, the height of the camera above the ground is 7.1 m, the depression angle from the horizontal is 17°, the diagonal angle of view is 46.77°, and the person model is 1.7 m high and 0.3 m wide, the range to be covered by the real space image can be set to -55 [m] to 55 [m] in the X direction and 8 [m] to 163 [m] in the Y direction (see FIG. 14 for how the X and Y axes are taken).

The coordinate step of the real space image is set to the largest step that is still sufficient for observing the movement of people (for example, 10 cm). With the coverage range of the preceding example and a step of 10 cm, the size of the real space image is 1101 pixels wide × 1551 pixels high.
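The image size quoted above follows from the coverage range and the step, counting both end points of each axis. A short sketch with hypothetical function names:

```python
def real_space_image_size(x_range_m, y_range_m, step_m):
    """Pixel size of a grid covering both ranges, end points included."""
    width = round((x_range_m[1] - x_range_m[0]) / step_m) + 1
    height = round((y_range_m[1] - y_range_m[0]) / step_m) + 1
    return width, height

def to_index(value_m, lower_m, step_m):
    """Grid index of a real space coordinate along one axis."""
    return round((value_m - lower_m) / step_m)

# X from -55 m to 55 m, Y from 8 m to 163 m, 10 cm step.
size = real_space_image_size((-55.0, 55.0), (8.0, 163.0), 0.1)
```

The X extent of 110 m gives 1100 intervals and therefore 1101 samples; the Y extent of 155 m gives 1551.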

Next, for the coordinates (x, y) of each pixel whose value in the binary image is 1, 1 is added to the value of the pixel of the real space image corresponding to P'(x, y) (step S25-2, FIG. 15(a)). FIG. 15 illustrates the processing from step S25-2 to step S25-4, described below, for a single pixel.

Next, a filter whose length, direction, and coefficients vary with the distance of the pixel to which it is applied from the origin is applied to the real space image, thereby spreading the pixel values along the straight line connecting the origin and the point of application (step S25-3, FIG. 15(b)). With r the distance from the origin to the point of application, the filter is oriented along the straight line connecting the origin and the point of application (hereinafter called the r direction at that point), its length in the r direction is |P − P'| = hr/T_z, its width is 1, and each filter coefficient is T_z ŝ(x, y)/(hr). Here, T_z is the height of the camera above the floor. ŝ(x, y) is a weight applied to the pixels of the binary image, chosen so that the sum of the weighted values of the binary-image pixels that are 1 because a given person appears in the captured image is 1 wherever that person appears in the image. It is the same as ŝ(x, y) in Non-Patent Document 5.

Next, at each pixel, a filter is applied to the real space image in the direction perpendicular to the r direction at that pixel's position, spreading the pixel values in the direction perpendicular to the r direction (step S25-4, FIG. 15(c)). The filter length is w (the width of the person model), the width is 1, and each filter coefficient is 1/w.
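The weight bookkeeping of steps S25-3 and S25-4 can be checked with a small sketch. The function names and the continuous (non-discretized) treatment of the filter lengths are assumptions made for illustration; the text specifies only the length, width, and coefficient of each filter.

```python
def radial_filter(r, h, t_z, s_hat):
    """Length and coefficient of the r-direction filter (step S25-3)."""
    length = h * r / t_z            # |P - P'| = h r / T_z
    coeff = t_z * s_hat / (h * r)   # T_z s^(x, y) / (h r)
    return length, coeff


def perpendicular_filter(w):
    """Length and coefficient of the filter perpendicular to r (step S25-4)."""
    return w, 1.0 / w


def total_weight(r, h, t_z, s_hat, w):
    """Total value spread into the real space image by one binary-image pixel."""
    len_r, c_r = radial_filter(r, h, t_z, s_hat)
    len_p, c_p = perpendicular_filter(w)
    # Length times coefficient is the mass of each filter (continuous approximation).
    return (len_r * c_r) * (len_p * c_p)
```

Under these assumptions the r-direction filter carries a total weight of ŝ(x, y) and the perpendicular filter a total weight of 1, so each binary-image pixel contributes ŝ(x, y) to the real space image regardless of r.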

FIG. 16 shows an example of an actual mapping onto a real space image. When the image in FIG. 16(a) is mapped to real space as the motion-detected portion of a captured image, the result is, for example, the image in FIG. 16(b).

Next, the real space mapping unit 302 calculates the sum of the pixel values of the generated real space image (step S26). When mapping is performed by the method of step S25 described above, the components of the real space image pixel values that originate from the pixel (x, y) of the binary image sum to ŝ(x, y). The calculated sum is therefore equal to the sum of the pixels of the image obtained by multiplying the binary image calculated in step S24 by ŝ(x, y). In other words, it corresponds to the number of people in the frame who are moving in the direction for which movement is being detected.

Next, for the generated real space image, the projection processing unit 303 takes a coordinate axis u in a certain direction, as shown in FIG. 17, and generates a one-dimensional projection image by summing the values of the pixels of the real space image that share the same position in the u direction (step S27).

In FIG. 17, the horizontal direction is taken as the coordinate axis u, and the sum of the values of the pixels that share the same position on the u axis is calculated. This calculation is performed at each position on the u axis. The figure shows, as an example, the case where the data indicated by (a) is obtained as the projection image.

In this embodiment, the process of summing the values of pixels that share the same position in the u direction in an image is called projection processing with respect to the coordinate axis u. The direction of the coordinate axis u is not limited. For example, it may be set equal to whichever of the X and Y directions of the real space image is closer to the direction of the flow of people to be measured: the X direction when observing flow in the left-right direction, and the Y direction when observing flow between the near side and the far side.
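The projection processing described above can be sketched as follows. The sketch assumes that the real space image is stored as a list of rows and that u is the horizontal (X) direction; the function name is hypothetical.

```python
def project_onto_u(image):
    """Sum the values of all pixels that share the same u (column) position."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[u] for row in image) for u in range(width)]
```

To project onto the Y direction instead, the same function could be applied to the transposed image.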

Once the projection image has been generated, the extreme point extraction unit 304 generates an extreme point image in which points where the projection image takes a local maximum are set to 1 and all other points are set to 0, as shown in FIG. 17 (step S28). The points where the projection image takes a local maximum correspond to candidate positions of people in the u direction. Processing such as applying a smoothing filter to the projection image may be performed before the extreme values are extracted.
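A minimal sketch of step S28 is shown below. The handling of plateaus (only the first sample of a flat peak is marked) and of the two end points (never marked) is an assumption; the text does not specify either case.

```python
def extreme_point_image(projection):
    """Return 1 where the 1-D projection has a local maximum, 0 elsewhere."""
    n = len(projection)
    out = [0] * n
    for i in range(1, n - 1):
        rises = projection[i] > projection[i - 1]
        does_not_rise_after = projection[i] >= projection[i + 1]
        if rises and does_not_rise_after:
            out[i] = 1
    return out
```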

Once the extreme point image has been generated, the cumulative image generation unit 305 receives the extreme point image output from the extreme point extraction unit 304, and generates and outputs a cumulative image, which is an image indicating the positions of the extreme points (step S29). FIG. 18 is a diagram for explaining the cumulative image. The extreme point extraction unit 304 stores the extreme point images in memory in association with time t (FIG. 18(a), FIG. 18(b)). The cumulative image generation unit 305 generates a two-dimensional cumulative image (u, t) showing the trajectories of the extreme points (FIG. 18(c)). The two-dimensional cumulative image is generated by accumulating over time t the extreme point images, in which pixels at positions on the coordinate axis u that represent a local maximum are assigned 1 and all other pixels are assigned 0. In FIG. 18, pixels that are local maxima and pixels that are not are shown in different colors.

The cumulative image is generated so that its size in the u direction matches the size of the extreme point image in the u direction. Its size in the t direction is set to the minimum value at which trajectory candidates can be detected by the Hough transform described later.
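One way to hold a cumulative image of fixed size in the t direction is a sliding window over the most recent extreme point images. The class below is a sketch under that assumption; the text does not say how older rows are discarded, and the class and method names are hypothetical.

```python
from collections import deque


class CumulativeImage:
    """Keeps the most recent t_size extreme point images as rows of a (u, t) image."""

    def __init__(self, t_size):
        # deque drops the oldest row automatically once t_size rows are stored.
        self.rows = deque(maxlen=t_size)

    def push(self, extreme_points):
        """Append the extreme point image of the current frame."""
        self.rows.append(list(extreme_points))

    def as_image(self):
        """Return the cumulative image as a list of rows, oldest frame first."""
        return [list(row) for row in self.rows]
```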

Once the cumulative image has been generated, the representative speed calculation unit 306 obtains, from the cumulative image, the representative moving speed of the people appearing in the captured image (step S30). There are various ways of obtaining the representative moving speed; the following three methods (methods 4 to 6) are given as examples.

(Method 4)
First, a θ-ρ Hough transform is applied to the cumulative image, and the points in θ-ρ space whose pixel value (number of votes) is equal to or greater than a threshold are extracted (step S30-1-1). To prevent false detection of line candidates that are not extreme point trajectories and to shorten computation time, the range of θ in the Hough transform may be limited. For example, if the moving speed of a person is assumed to be 3 [pixel/frame] or less, the range can be set to 90° ≤ θ ≤ 109° when observing rightward movement. The method of determining the threshold is not limited here. For example, it may be determined according to how many extreme points must lie on a straight line in the cumulative image for that line to be regarded as an extreme point trajectory, or it may be determined experimentally so that both missed detections of line candidates that correspond to extreme point trajectories and false detections of line candidates that do not are few.

Next, the straight line in the cumulative image corresponding to each extracted point in θ-ρ space is taken as an extreme point trajectory, and the slope of the line is taken as the moving speed of that extreme point in the image (step S30-1-2).
The slope of the line represents how many pixels the corresponding extreme point moves in the u direction per frame.
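Steps S30-1-1 and S30-1-2 can be sketched with a plain voting loop. The line parameterization u·cos θ + t·sin θ = ρ, the integer rounding of ρ, and all function names are assumptions made for this sketch; the angle convention used in the text may differ, so the θ range quoted above does not necessarily carry over to this parameterization.

```python
import math
from collections import Counter


def hough_votes(cumulative, theta_degrees):
    """Vote in (theta, rho) space for every extreme point of the (u, t) image."""
    votes = Counter()
    for t, row in enumerate(cumulative):
        for u, value in enumerate(row):
            if value != 1:
                continue
            for theta in theta_degrees:
                rad = math.radians(theta)
                rho = round(u * math.cos(rad) + t * math.sin(rad))
                votes[(theta, rho)] += 1
    return votes


def lines_above(votes, threshold):
    """Return the (theta, rho) cells whose vote count reaches the threshold."""
    return [cell for cell, n in votes.items() if n >= threshold]


def speed_from_theta(theta):
    """Slope du/dt (pixels per frame) of the line u cos(theta) + t sin(theta) = rho."""
    return -math.tan(math.radians(theta))
```

For example, an extreme point that moves one pixel per frame produces a line at θ = 135° in this parameterization, and `speed_from_theta(135)` returns a value of approximately 1.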

Next, the representative moving speed of the people appearing in the captured image is obtained from the estimated moving speeds of the individual extreme points (step S30-1-3). The method of calculating the representative moving speed is not limited here; for example, the mean or the median of the moving speeds may be taken, and it is preferable to select whichever method leads to more accurate estimation of the number of people for the scene. When there are few enough people that each extreme point trajectory corresponds to one person in the video, taking the mean is more effective; otherwise, taking the median is more effective.
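The two aggregation options of step S30-1-3 can be sketched with the standard library. The function name and the `method` parameter are hypothetical.

```python
from statistics import mean, median


def representative_speed(speeds, method="median"):
    """Aggregate per-trajectory speeds (pixels per frame) into one value."""
    if not speeds:
        raise ValueError("no extreme point trajectories were detected")
    if method == "mean":
        return mean(speeds)
    return median(speeds)
```

The median is less affected by a single spurious trajectory, which is consistent with the text's preference for it in crowded scenes.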

(Method 5)
First, as in step S30-1-1, the points in θ-ρ space whose pixel value is equal to or greater than a threshold are extracted (step S30-2-1). Next, projection processing with respect to the coordinate axis ρ (see step S27 and FIG. 17) is applied to the binary image (θ, ρ) in which the points extracted in step S30-2-1 are set to 1 and all other points are set to 0, generating the angular distribution of the extreme point trajectories (step S30-2-2).

Next, a representative angle θ̄ is calculated from the angular distribution of the extreme point trajectories, and the slope in u-t space corresponding to the angle θ̄ in θ-ρ space is taken as the representative moving speed of the people appearing in the captured image (step S30-2-3). The method of calculating the representative angle θ̄ is not limited here; for example, the mean of the angular distribution may be taken, or the angle at which the distribution is largest may be selected. Processing such as applying a smoothing filter to the angular distribution may be performed before the representative angle θ̄ is calculated.

(Method 6)
First, each pixel value of the θ-ρ image obtained by applying the θ-ρ Hough transform to the cumulative image is squared in order to emphasize points with many votes (step S30-3-1). Next, projection processing with respect to the coordinate axis ρ (see step S27 and FIG. 17) is applied to the squared θ-ρ image, generating the angular distribution of the extreme point trajectories (step S30-3-2). Unlike the case of step S27, the θ-ρ image subjected to projection processing here is a multi-valued image. Next, as in step S30-2-3, the representative moving speed is calculated from the representative angle (step S30-3-3).
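The angular distributions of methods 5 and 6 can be sketched as below. The sketch assumes that the projection collapses the θ-ρ image over ρ so that one value remains per θ, and that the votes are held in a dictionary keyed by (θ, ρ); both the data layout and the function names are assumptions.

```python
from collections import defaultdict


def angle_distribution(votes, threshold=None, square=False):
    """Collapse theta-rho votes over rho to obtain one value per theta.

    threshold given: count the cells with votes >= threshold (method 5).
    square=True:     sum the squared vote counts (method 6).
    """
    dist = defaultdict(float)
    for (theta, _rho), n in votes.items():
        if threshold is not None:
            dist[theta] += 1.0 if n >= threshold else 0.0
        elif square:
            dist[theta] += float(n * n)
        else:
            dist[theta] += float(n)
    return dict(dist)


def representative_angle(dist, method="max"):
    """Pick the angle with the largest value, or the distribution-weighted mean."""
    if method == "max":
        return max(dist, key=dist.get)
    total = sum(dist.values())
    if total == 0:
        raise ValueError("angular distribution is empty")
    return sum(theta * value for theta, value in dist.items()) / total
```

The two methods can disagree: thresholding counts how many lines share an angle, whereas squaring favors angles that contain a few strongly supported lines.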

Once the representative moving speed has been calculated, the passing number calculation unit 307 calculates the instantaneous number of passing people in the current partial image from the representative moving speed calculated in step S30 and the number of people calculated in step S26 (step S31). The method of calculating the instantaneous number of passing people is not limited here; for example, it may be the representative moving speed divided by the length of the real space image in the u direction, multiplied by the number of people moving in the u direction in the image. That is, one person moving by the image length in the u direction is regarded as one person passing, and the calculation gives how many people moved, and by what fraction of the image length in the u direction, between the previous frame and the current frame. For example, when three people each move 1/10 of the image length in the u direction during one frame, the instantaneous number of passing people is 3 × 1/10 = 0.3.

Next, the passing number calculation unit 307 determines whether the number of passing people has been calculated for all directions. If there are multiple directions in which passage is to be measured and the calculation has not been completed for all of them, the processing from step S24 to step S31 is repeated for each direction to calculate the number of passing people (step S32).

Next, the passing number calculation unit 307 cumulatively adds the instantaneous number of passing people for each direction in which passage is measured, giving the cumulative number of passing people in each direction (step S33). The cumulative number of passing people is the number of people who have passed through the video in each direction between the frame at which processing started and the current frame. For example, in the same situation as the example given for step S31, if the speed of the people is constant, the three people take 10 frames to move from one edge of the image to the opposite edge. Accumulating an instantaneous number of passing people of 0.3 over 10 frames gives 0.3 × 10 = 3, so the cumulative number of passing people increases by 3.
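The arithmetic of steps S31 and S33 can be reproduced with the short sketch below. The function names are hypothetical, and the example image length of 100 pixels is an assumed value chosen so that the speed equals 1/10 of the image length per frame.

```python
def instantaneous_passing(speed, image_length_u, people):
    """Step S31: people moving in u times the fraction of the image length covered per frame."""
    return people * speed / image_length_u


def cumulative_passing(per_frame_counts):
    """Step S33: running total of the instantaneous passing counts."""
    total = 0.0
    for count in per_frame_counts:
        total += count
    return total


# Three people moving 10 pixels per frame across an image 100 pixels long in u.
per_frame = instantaneous_passing(speed=10.0, image_length_u=100.0, people=3.0)
after_ten_frames = cumulative_passing([per_frame] * 10)
```

With these inputs `per_frame` is 0.3 and `after_ten_frames` is 3 up to floating-point rounding, matching the worked example in the text.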

Next, the passing number calculation unit 307 determines whether there are multiple ranges in the image for which passage is to be measured. If there are, the processing from step S22 to step S33 is performed for the partial image of each range to be measured (step S34). When all the partial images of the respective ranges have been processed, the processing proceeds to step S35.

Next, the passing number calculation unit 307 determines whether the end condition is satisfied (step S35). If the end condition is satisfied, the processing ends. If not, one new frame of the video is acquired (step S36), and the processing from step S21 to step S35 is repeated.

In this way, an approximate value of the number of people who have moved by the image length in the u direction within a given time can be calculated from the images alone, without using any other sensor.

<Sixth Embodiment>
FIG. 19 is a functional block diagram showing the configuration of the passing number measuring device 1 in the sixth embodiment. Parts corresponding to those in FIG. 10 are given the same reference numerals, and only the differences are described. The real space mapping unit 302 receives motion vectors and a foreground image, and generates and outputs a real space image and the number of people appearing in the frame. The motion vectors and the foreground image may, for example, be stored in a storage device and read out by the real space mapping unit 302. The projection processing unit 303 receives the real space image output from the real space mapping unit 302. The passing number calculation unit 307 calculates the number of passing people from the cumulative image generated by the cumulative image generation unit 305. For example, the number of moving people is calculated from the local maxima of the projection images contained in the cumulative image.

<Seventh Embodiment>
FIG. 20 is a functional block diagram showing the configuration of the passing number measuring device 1 in the seventh embodiment. Parts corresponding to those in FIG. 10 are given the same reference numerals, and only the differences are described. The real space mapping unit 302 receives motion vectors and a foreground image, and generates and outputs a real space image and the number of people appearing in the frame. The projection processing unit 303 receives the real space image output from the real space mapping unit 302. Here, for example, the image data, the motion vectors, and the foreground image are treated as known, and the speed of the moving people is calculated by obtaining the representative moving speed from them.

<Eighth Embodiment>
FIG. 21 is a functional block diagram showing the configuration of the passing number measuring device 1 in the eighth embodiment. Parts corresponding to those in FIG. 10 are given the same reference numerals, and only the differences are described. The real space mapping unit 302 receives motion vectors and a foreground image, and generates and outputs a real space image and the number of people appearing in the frame. For the real space image generated by the real space mapping unit 302, the projection processing unit 303 takes a coordinate axis u in a certain direction and generates a one-dimensional projection image by summing the values of the pixels of the real space image that share the same position in the u direction. Here, for example, the image data, the motion vectors, and the foreground image are treated as known, and from them the cumulative value of the number of people who move, within a given time, the distance of the area shown in the image (the distance of the image length) is calculated (the cumulative number of passing people).

<Ninth Embodiment>
In Non-Patent Document 1, the area in which the number of people is measured at each moment is set in image space. Because the correspondence between image-space coordinates and real-space coordinates is not uniquely determined, this method cannot be used as is when the counting area is to be defined in real space. One way to solve this problem is to measure the number of people in real space by mapping the foreground portion of the image into real space. However, if the mapping table is created in advance, a mapping function to real space must be prepared for each pixel, so the data size of the mapping function is on the order of (number of pixels in the captured image) × (number of points in real space), and it is extremely difficult to hold this data in computer memory. The ninth embodiment realizes the mapping from image coordinates to real-space coordinates with low memory usage and performs the analysis in real space, thereby estimating the number of people present in an arbitrary real-space area shown in the video. It is a modification of the fifth embodiment described above.

Next, the ninth embodiment is described in detail. The purpose of the ninth embodiment is to measure, at each moment, the number of people present in an arbitrary real-space area shown in the video. The device configuration in the ninth embodiment is the same as the device configuration shown in FIG. 1, so a detailed description is omitted. The camera 2 is assumed to be a camera fixed at a predetermined position. The following description assumes a situation in which images (frame images) captured by the camera 2 are input in time series. The input images may be a still image sequence, a video stream, or the like, and the processing does not necessarily have to be performed in real time.

FIG. 22 is a functional block diagram showing the configuration of the passing number measuring device 1 in the ninth embodiment. In FIG. 22, reference numeral 401 denotes a foreground detection unit that receives an image from the camera 2, detects the foreground in the input image, and outputs a foreground image. Reference numeral 402 denotes a real space mapping unit that receives the foreground image, performs real space mapping processing, and outputs a real space image. Reference numeral 403 denotes an in-area number calculation unit that receives the real space image, calculates the number of people in the area, and outputs in-area number information.

Next, the operation of the device shown in FIG. 22 is described with reference to FIG. 23. First, when processing starts, a partial image is generated by extracting a preset range from the current input image. The range to be extracted may be determined, for example, as the portion of the input image that can correspond to the real-space positions where people are to be counted, or as the portion remaining after excluding areas where people clearly cannot appear (such as walls and fixed equipment). The range may also be the entire input image.

Next, the foreground detection unit 401 detects the foreground portion of the extracted partial image, and generates and outputs a foreground image (step S41). The foreground image is an image in which points of the partial image where a moving object exists, that is, foreground points, are set to 1, and all other points, that is, background points, are set to 0. Various methods of detecting the foreground image are known; for example, the techniques of Non-Patent Documents 4 and 5 mentioned above can be applied.

Next, the real space mapping unit 402 creates and outputs a real space image representing the positions of people in real space (step S42). For each pixel whose value is 1 in the generated foreground image, a predetermined value is added to the pixels of the real space image that fall within the range on the two-dimensional real space X-Y that could be the "standing position of the person image containing that pixel". Hereinafter, this addition of pixel values is referred to as mapping. The mapping range is derived as follows.

As a premise, a person is modeled as a rectangular flat plate of height h and width w directly facing the camera. The height h and the width w are determined in advance, for example, h = 1.7 [m] and w = 0.3 [m]. It is also assumed that the camera is not rotated about its optical axis.

Suppose that the pixel at a given coordinate (x, y) is the center of the bottom edge (feet) of the person rectangle, and alternatively that it is the center of the top edge (top of the head), and consider the corresponding standing positions P(x, y) and P'(x, y) on the two-dimensional real space X-Y, respectively (see FIG. 12). The coordinates of P(x, y) and P'(x, y) can be obtained from (x, y) using, for example, an equation described in Non-Patent Document 6 (for example, equation (12)).

In this case, the range of real-space coordinates corresponding to the case where the pixel in question is an arbitrary point on the central axis of the person is the line segment connecting the two points P(x, y) and P'(x, y). Furthermore, when P'(x, y) is sufficiently far from the camera position (hereinafter, the camera position is taken as the origin) and the width of the person in the image is sufficiently large relative to the pixel width (pixel width / person width is the error in the width of the mapping range), the range of real-space coordinates corresponding to the case where the pixel in question is an arbitrary point on the person, that is, the mapping range to be obtained, can be approximated by giving this line segment a width of w. In other words, it can be approximated by the rectangle whose vertices are the four points obtained by moving the two points by ±w/2 in the direction perpendicular to the line segment (see FIG. 13). The pixel value is mapped to the range derived in this way.
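The rectangle approximation of the mapping range can be sketched as follows. The sketch assumes that P and P' are already available as (X, Y) coordinates in metres (how they are obtained from the image coordinates is left to the cited equation), and the function name is hypothetical.

```python
import math


def mapping_rectangle(p, p_prime, w):
    """Four corners of the rectangle obtained by widening segment P-P' to width w."""
    dx = p_prime[0] - p[0]
    dy = p_prime[1] - p[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("P and P' coincide, so the segment has no direction")
    # Unit vector perpendicular to the segment P-P'.
    nx, ny = -dy / length, dx / length
    half = w / 2.0
    return [
        (p[0] + nx * half, p[1] + ny * half),
        (p[0] - nx * half, p[1] - ny * half),
        (p_prime[0] - nx * half, p_prime[1] - ny * half),
        (p_prime[0] + nx * half, p_prime[1] + ny * half),
    ]
```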

A specific method for realizing the mapping (step S42) is now described. First, a real space image in which every pixel has the value 0 is created as the mapping destination space (step S42-1-1). The size of the real space image is set so that it covers all of the mapping ranges corresponding to the pixels of the foreground image generated in step S41. As shown in FIG. 14, the range to be covered can be calculated by considering the range that covers the mapping ranges corresponding to the pixels at the four corners of the foreground image (binary image). For example, when the aspect ratio of the captured image is 4:3, the height of the camera above the ground is 7.1 m, the depression angle from the horizontal is 17°, the diagonal angle of view is 46.77°, the height of the person model is 1.7 m, and its width is 0.3 m, and the number of people is measured in the area corresponding to the entire image, the range to be covered by the real space image can be set to -55 [m] to 55 [m] in the X direction and 8 [m] to 163 [m] in the Y direction. The coordinate step of the real space image is set to the largest width that is still sufficient to observe the movement of people (for example, 10 cm). With a step of 10 cm in the coverage example above, the size of the real space image is 1101 pixels wide × 1551 pixels high.
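The image size in the example above follows from the coverage range and the coordinate step. The sketch below assumes that both end points of each range are sampled, which is the counting that reproduces the stated size; the function name is hypothetical.

```python
def real_space_image_size(x_range_m, y_range_m, step_m):
    """Number of pixels needed to cover each range, sampling both end points."""

    def samples(lo, hi):
        return round((hi - lo) / step_m) + 1

    return samples(*x_range_m), samples(*y_range_m)


width, height = real_space_image_size((-55.0, 55.0), (8.0, 163.0), 0.1)
```

With these inputs the result is 1101 × 1551 pixels, as stated in the text.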

Next, for the coordinates (x, y) of each pixel whose value is 1 in the foreground image, 1 is added to the value of the real space image pixel corresponding to P'(x, y) (step S42-1-2, FIG. 15(a)). In this step, 1 is added to the real space image pixel value for every foreground pixel, and the pixels are weighted by the filtering in the subsequent steps.

Next, a filter whose length, direction, and coefficients vary with the distance of the target pixel from the origin is applied to the real space image, spreading the pixel values along the straight line connecting the origin and the filter application point (step S42-1-3, FIG. 15(b)). Let r be the distance from the origin to the filter application point. The filter direction is the direction of the straight line connecting the origin and the filter application point (the r direction), the filter length in the r direction is |P - P'| = hr/T_z, the width is 1, and each filter coefficient is T_z ŝ(x, y)/(hr). Here, T_z is the height of the camera above the floor. ŝ(x, y) is a weight applied to the pixels of the foreground image. It is defined so that, over the pixels of the foreground image that take the value 1 because one particular person appears in the captured image, the sum of the weighted values is 1 regardless of where in the captured image that person appears. This is the same ŝ(x, y) as in Non-Patent Document 4.

Next, at each pixel, a filter is applied to the real space image in the direction perpendicular to the r direction at that pixel's position, spreading the pixel values in the direction perpendicular to the r direction (step S42-1-4, FIG. 15(c)). The filter length is w (the width of the person model), the width is 1, and each filter coefficient is 1/w. FIG. 16 shows an example of an actual mapping onto a real space image. When the image in FIG. 16(a) is mapped to real space as the motion-detected portion of a captured image, the result is, for example, the image in FIG. 16(b). The sum of the pixel values of a real space image created by the method described above represents the number of people present in the area covered by that image. Likewise, the sum of the pixel values contained in an arbitrary partial area of the real space image is the number of people present in the real-space area corresponding to that partial area.

Returning to FIG. 23, the in-area number calculation unit 403 then sums the pixel values of the real space image generated by the real space mapping unit 402, and thereby calculates the number of people in the screen as an estimate of the number of people present in the corresponding real space area (step S43). When there are multiple areas in which the number of people is to be measured, the entire captured image may be processed and the required real space areas then extracted individually, or multiple partial areas may be extracted from the captured image and steps S41 to S43 repeated for each of them. It is then determined whether the end condition is satisfied (step S44). If it is not, one new frame of the video is acquired and the process returns to step S41.
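The counting in step S43 amounts to summing pixel values over a region of the real space image. A minimal sketch follows, assuming the real space image is held as a list of rows and the area of interest is an axis-aligned rectangle. The function name, the rectangle convention, and the sample image are hypothetical.

```python
def count_people(real_space_image, top, left, bottom, right):
    """Sum pixel values in rows top..bottom-1 and columns left..right-1."""
    return sum(sum(row[left:right]) for row in real_space_image[top:bottom])


# Hypothetical 4 x 4 real space image containing two persons,
# each spread over several pixels whose values sum to 1.
real_space_image = [
    [0.0, 0.25, 0.25, 0.0],
    [0.0, 0.25, 0.25, 0.0],
    [0.5, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0, 0.0],
]

whole_image = count_people(real_space_image, 0, 0, 4, 4)  # 2.0
upper_half = count_people(real_space_image, 0, 0, 2, 4)   # 1.0
```

Processing the whole image once and then summing several rectangles corresponds to the first of the two options described for measuring multiple areas.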

<Tenth Embodiment>
Next, a tenth embodiment will be described. In the tenth embodiment, the processing (steps S25 and S42) executed by the real space mapping unit 302 in the fifth embodiment and by the real space mapping unit 402 in the ninth embodiment is replaced with the processing described below.

First, an x-logY space image in which every pixel value is 0 is created as the first mapping destination space. The x-logY space image is sized to cover the mapping ranges corresponding to all pixels of the foreground image generated in steps S22 and S41. As in step S42-2-1, the range to be covered can be calculated by considering a range that covers the mapping ranges corresponding to the pixels at the four corners of the foreground image. For example, when the aspect ratio of the captured image is 4:3, the camera is 7.1 m above the ground, the depression angle from the horizontal is 17°, the diagonal angle of view is 46.77°, and the person model is 1.7 m tall and 0.3 m wide, the logY range of the x-logY space image can be set to 2.07 to 5.10 in order to count the people in the area corresponding to the entire image.

The step size of the coordinates in the logY direction of the x-logY space image is determined according to how much positional error can be tolerated. For example, with a step of 0.01, the error in the pixel distribution range is at most about 1%. With a step size of 0.01 in the coverage example above, the size of the real space image in the logY direction is 304 pixels.
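The figures of 304 pixels and about 1% can be reproduced with a short calculation. The sketch below rests on two assumptions that the text does not state explicitly: that both end points of the range 2.07 to 5.10 are counted as pixel rows, and that the logarithm is the natural logarithm, in which case one step of 0.01 corresponds to a ratio of e^0.01 between adjacent rows. The variable names are hypothetical.

```python
import math

log_y_min, log_y_max = 2.07, 5.10  # logY range from the example in the text
step = 0.01                        # step size from the example in the text

# Number of rows when both end points of the range are counted (assumption).
num_rows = round((log_y_max - log_y_min) / step) + 1

# With a natural logarithm (assumption), adjacent rows differ in Y by a
# factor of exp(step), so the relative position error is at most about 1%.
max_relative_error = math.exp(step) - 1.0
```

Under these assumptions num_rows evaluates to 304 and max_relative_error to roughly 0.01, which agrees with the values given in the text.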

Next, for the coordinates (x, y) of each pixel of the foreground image whose value is 1, T_z/(hr) is added to the value of the pixel corresponding to P' on the x-logY space image. T_z/(hr) is the same value as described in step S42-1-3. Because this value does not change over time, it may be calculated in advance and held in memory before the repeated processing of imaging, mapping, and people count estimation begins.

Next, the x-logY space image is filtered to distribute pixels in the logY direction. This corresponds to distributing pixels in the r direction as described above. The length of the filter in the logY direction is log r_P − log r_P' = log(r_P / r_P') = log(T_z / (T_z − h)), which is constant. Its width in the x direction is 1. Here, r_P and r_P' are the values of r at points P and P', respectively. Each filter coefficient is 1.
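Because the filter length log(T_z / (T_z − h)) does not depend on r, the same fixed-length filter can be used for every pixel of the x-logY space image. The sketch below computes that length for T_z = 7.1 and h = 1.7 and applies a filter with coefficients of 1 to one column. It is an illustration under assumptions: the logarithm is taken as the natural logarithm, the step size is 0.01, the filter is assumed to extend from the pixel at P' toward larger logY, and the function names and the sample column are hypothetical.

```python
import math


def log_filter_length(t_z, h):
    """Filter length in the logY direction: log(T_z / (T_z - h))."""
    return math.log(t_z / (t_z - h))


def spread_log_y(column, taps):
    """Apply a filter of `taps` coefficients, each equal to 1, along logY."""
    out = [0.0] * len(column)
    for i, value in enumerate(column):
        if value == 0:
            continue
        # Assumed direction: from the pixel at P' toward larger logY.
        for j in range(i, min(i + taps, len(column))):
            out[j] += value
    return out


length = log_filter_length(t_z=7.1, h=1.7)  # about 0.274
taps = round(length / 0.01)                 # filter length in pixels

column = [0.0] * 40   # one hypothetical column of the x-logY space image
column[5] = 0.5       # hypothetical value added at P'
spread = spread_log_y(column, taps)
```

Since the number of taps is the same everywhere, the filtering can be done with a single fixed kernel, in contrast to the position-dependent filter of step S42-1-3.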

Next, an x-Y space image is created as the second mapping destination space. The size of the x-Y space image equals that of the x-logY space image in the x direction and that of the real space image in step S42-1-1 in the Y direction. Each pixel value of the x-Y space image equals the pixel value at the corresponding coordinates in the x-logY space image.

Next, a real space image is created as the second mapping destination space in the same manner as in step S42-1-1, and the value of each pixel of the x-Y space image is added to the pixel at the corresponding coordinates on the real space image. The real space image is then filtered to distribute pixels in the X direction (see FIG. 24). The filter's size in the X direction is the width of the person model, and its size in the Y direction is 1. As shown in FIG. 25, this processing corresponds to distributing pixels in the X direction using an approximation of the range in which a person may be present. When mapping onto a real space image is actually performed by this method, the result is as shown in FIG. 16, as in the fifth and ninth embodiments.

The method of the tenth embodiment has the drawback that approximating the pixel distribution range introduces an error in the people count. On the other hand, because the size and direction of the filter are constant regardless of the coordinates, it can be processed faster than the method of the ninth embodiment. The sum of the pixel values of a real space image created by the method described above represents the number of persons present in the area covered by that image. Likewise, the sum of the pixel values in any partial area of the real space image is the number of persons present in the corresponding real space area.

As described above, the number of persons in an arbitrary area of the real space shown in the video can be measured directly, so the range to be observed on the image can be set intuitively.

A program for realizing the functions of the passing-person counting device 1 in FIGS. 1, 2, 7 to 10, and 19 to 22 may also be recorded on a computer-readable recording medium, and the number of passing persons may be measured by having a computer system read and execute the program recorded on that medium. The term "computer system" as used here includes an OS and hardware such as peripheral devices.

When a WWW system is used, the "computer system" also includes a homepage providing environment (or display environment). The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. The "computer-readable recording medium" further includes media that hold a program dynamically for a short time, such as a communication line used when a program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and media that hold a program for a certain period, such as the volatile memory inside a computer system serving as the server or client in that case. The program may realize only part of the functions described above, or may realize those functions in combination with a program already recorded in the computer system.

Embodiments of the present invention have been described above in detail with reference to the drawings. The specific configuration is not limited to these embodiments, however, and also includes designs and the like that do not depart from the gist of the present invention.

The invention can be applied to uses in which it is essential to measure, by image processing, the number of people shown in a video captured by an imaging device and the number of people passing through it.

Description of symbols: 1: passing-person counting device; 2: camera; 3: display device; 101, 300: motion vector extraction unit; 102, 301, 401: foreground detection unit; 103, 303: projection processing unit; 104, 304: extreme point extraction unit; 105, 305: cumulative image generation unit; 106, 306: representative speed calculation unit; 107: in-screen number calculation unit; 108, 307: passing number calculation unit; 302, 402: real space mapping unit; 403: in-area number calculation unit

Claims

A person counting device that receives as input a motion vector of each pixel in an image included in a video and a foreground area in the image, and measures moving persons, the device comprising:
projection means for projecting the motion vectors in the foreground area onto one dimension to generate a projected image; and
extreme point extraction means for detecting moving persons by extracting extreme points from the pixels included in the projected image generated by the projection means.

The person counting device, wherein the extreme point extraction means extracts extreme points from the pixels included in the projected image generated by the projection means and detects the position of the moving person on the projection axis.

The person counting device according to claim 1, further comprising real space mapping means for generating a real space image by mapping, based on an image corresponding to a person included in the image, coordinates in the image to pixels on real space coordinates indicating the position of the person in real space,
wherein the projection means generates the projected image from the real space image generated by the real space mapping means.

The person counting device according to claim 3, wherein the real space mapping means generates the real space image by performing the mapping in which a predetermined value is added to pixels corresponding to a range in two-dimensional space that can be the standing position of a person image including a target pixel on the projected image.

The person counting device according to claim 3, wherein the real space mapping means performs a first mapping process in which a predetermined value is added to pixels corresponding to a range in logarithmic space that can be the standing position of a person image including a target pixel on the projected image, and generates the real space image by mapping the result of the first mapping process onto a two-dimensional space.

The person counting device according to any one of claims 1 to 5, further comprising:
cumulative image generation means for accumulating, over elapsed time, the extreme value images extracted by the extreme point extraction means to generate a cumulative image; and
number calculation means for measuring the number of moving persons based on the cumulative image generated by the cumulative image generation means.

The person counting device according to any one of claims 1 to 5, further comprising:
cumulative image generation means for accumulating, over elapsed time, the extreme value images extracted by the extreme point extraction means to generate a cumulative image; and
representative speed calculation means for calculating a representative moving speed of the moving persons from the moving speed of each extreme point based on the cumulative image generated by the cumulative image generation means.

The person counting device according to any one of claims 1 to 5, further comprising:
number calculation means for detecting the number of persons included in the image; and
passing number calculation means for calculating the number of persons who pass through an area in the image within a predetermined time, based on the number of persons detected by the number calculation means and the representative speed calculated by the representative speed calculation means.

A person counting device comprising:
foreground image generation means for detecting a foreground portion from an input video and generating a foreground image;
real space image generation means for generating a real space image by performing mapping in which a predetermined value is added to pixels corresponding to a range in two-dimensional space that can be the standing position of a person image including a target pixel on the foreground image; and
in-area number calculation means for calculating the number of persons present in a real space area by summing the pixel values of the real space image.

A person counting device comprising:
foreground image generation means for detecting a foreground portion from an input video and generating a foreground image;
real space image generation means for performing a first mapping process in which a predetermined value is added to pixels corresponding to a range in logarithmic space that can be the standing position of a person image including a target pixel on the foreground image, and for generating a real space image by performing a second mapping process that maps the result of the first mapping process onto a two-dimensional space; and
in-area number calculation means for calculating the number of persons present in a real space area by summing the pixel values of the real space image.

A person counting method using a computer that serves as a person counting device that receives as input a motion vector of each pixel in an image included in a video and a foreground area in the image, and measures moving persons, the method comprising:
a step in which projection means of the computer projects the motion vectors in the foreground area onto one dimension to generate a projected image; and
a step in which extreme point extraction means of the computer detects moving persons by extracting extreme points from the pixels included in the projected image generated by the projection means.

A computer program for causing a computer to function as each means of the person counting device according to any one of claims 1 to 10.