JP2015184810A

JP2015184810A - Image processing apparatus, image processing method, and image processing program

Info

Publication number: JP2015184810A
Application number: JP2014058924A
Authority: JP
Inventors: 麻由奥村; Mayu Okumura; 友樹渡辺; Yuki Watanabe; 学西山; Manabu Nishiyama
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2014-03-20
Filing date: 2014-03-20
Publication date: 2015-10-22
Anticipated expiration: 2034-03-20
Also published as: JP6139447B2

Abstract

PROBLEM TO BE SOLVED: To track an object with higher accuracy by use of a plurality of cameras. SOLUTION: An image processing apparatus in the first embodiment acquires time-series images from each of a plurality of imaging apparatuses, tracks positions of an object captured in the time-series images, and acquires a moving direction of the object from the time-series images. On the basis of a predetermined reference direction of the object and the moving direction, the image processing apparatus acquires a common area on a first object captured in a first time-series image and a second object captured in a second time-series image, among the plurality of time-series images acquired from the imaging apparatuses; the common area is an area of the first object and the second object that is included in common in the first time-series image and the second time-series image. The image processing apparatus extracts a feature quantity from an image corresponding to the common area and, by use of the feature quantity, associates the tracking results of a tracking section between the time-series images acquired from the imaging apparatuses.

Description

The present invention relates to an image processing apparatus, an image processing method, and an image processing program.

Techniques for tracking a target across a plurality of cameras that do not share a field of view are conventionally known. For example, a known technique compares the target frame with the overall, occlusion-free shape of a moving object and, when the moving object is occluded, performs association using the non-occluded region of the moving object, which makes it possible to track the moving object with high accuracy.

JP 2007-142527 A

Non-Patent Document 1: Tomoki Watanabe, Satoshi Ito and Kentaro Yokoi: “Co-occurrence Histograms of Oriented Gradients for Human Detection”, IPSJ Transactions on Computer Vision and Applications, Vol. 2, pp. 39-47 (2010).
Non-Patent Document 2: Zdenek Kalal, Krystian Mikolajczyk and Jiri Matas: “Tracking-Learning-Detection”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 6, No. 1, January 2010.
Non-Patent Document 3: B. Prosser, S. Gong and T. Xiang: “Multi-camera Matching using Bi-Directional Cumulative Brightness Transfer Functions”, BMVC (2008).

However, a tracking target does not always face the same direction relative to the camera. For example, the orientation in which the tracking target appears in each camera may differ because of differences in the installation orientations of the cameras or in the moving direction of the tracking target. Conventional tracking techniques are effective when the tracking target is occluded by another object, but they have difficulty coping with cases in which the tracking target appears in different orientations in different cameras.

The problem to be solved by the present invention is to provide an image processing apparatus, an image processing method, and an image processing program capable of tracking a target with higher accuracy using a plurality of cameras.

The image processing apparatus according to the first embodiment acquires time-series images from each of a plurality of imaging devices, tracks the position of a target object captured in the time-series images, and acquires the moving direction of the target object from the time-series images. From a predetermined reference direction, which serves as the reference for the direction of the target object, and the moving direction, the image processing apparatus acquires a common area on a first target object and a second target object. The first target object is captured in a first time-series image and the second target object is captured in a second time-series image, both among the plurality of time-series images acquired from the plurality of imaging devices, and the common area is the area of these objects that is included in common in the first time-series image and the second time-series image. The image processing apparatus extracts a feature amount from the image corresponding to the common area and uses the feature amount to associate the tracking results of the tracking unit between the plurality of time-series images acquired from the plurality of imaging devices.

FIG. 1 is a block diagram illustrating an example configuration of an image processing system applicable to the first embodiment.
FIG. 2 is a diagram for explaining the installation locations of the cameras according to the first embodiment.
FIG. 3 is a block diagram illustrating an example configuration of the image processing apparatus according to the first embodiment.
FIG. 4 is an example functional block diagram for explaining the functions of the image processing apparatus according to the first embodiment.
FIG. 5 is an example flowchart illustrating image processing according to the first embodiment.
FIG. 6 is a diagram illustrating an example of a frame image including a plurality of target objects.
FIG. 7 is a diagram for explaining a plurality of time-series images.
FIG. 8 is a diagram for explaining an example of acquiring the moving direction of the target object according to the first embodiment.
FIG. 9 is a diagram illustrating an example of a model corresponding to a target object that is a person, according to the first embodiment.
FIG. 10 is a diagram for explaining a method of selecting the common area according to the first embodiment.
FIG. 11 is a diagram illustrating an example of the model according to the first embodiment viewed from above.
FIG. 12 is an example functional block diagram for explaining the functions of the image processing apparatus according to the second embodiment.
FIG. 13 is an example flowchart illustrating image processing according to the second embodiment.
FIG. 14 is a diagram illustrating an example of a conversion function used for color correction.

Hereinafter, an image processing apparatus, an image processing method, and an image processing program according to embodiments will be described.

(First embodiment)
FIG. 1 shows an example configuration of an image processing system applicable to the first embodiment. In FIG. 1, the image processing system includes an image processing apparatus 10 according to the first embodiment and a plurality of cameras 11₁, 11₂, 11₃, ….

Each of the cameras 11₁, 11₂, 11₃, … outputs a time-series image, that is, images captured at a plurality of timings along a time series, each camera keeping the same direction and imaging range. The time-series image is, for example, a moving image including frame images captured at predetermined time intervals. The cameras 11₁, 11₂, 11₃, … are installed, for example, indoors or outdoors, each at an angle that looks down on the observation target. The imaging ranges of the cameras 11₁, 11₂, 11₃, … do not need to overlap one another.

Each of the cameras 11₁, 11₂, 11₃, … is, for example, a camera that captures visible light. The cameras are not limited to this, and infrared cameras that capture infrared light may be used as the cameras 11₁, 11₂, 11₃, …. The imaging direction of each of the cameras 11₁, 11₂, 11₃, … in the horizontal plane is not particularly limited. For example, as illustrated in FIG. 2, the cameras 11₁, 11₂, 11₃, … may be arranged so as to capture images in mutually different directions. In the example of FIG. 2, the observation target 2 moving in the direction indicated by arrow 21 is imaged by the cameras 11₁, 11₂, 11₃, … from the front, the side, and the back, respectively.

The observation target 2 is a moving object whose position changes over time, for example, a person. Hereinafter, the moving object serving as the observation target 2 is referred to as a target object.

The time-series images output from the cameras 11₁, 11₂, 11₃, … are supplied to the image processing apparatus 10. The image processing apparatus 10 performs image processing on the time-series images supplied from the cameras 11₁, 11₂, 11₃, …, associates images of the same target object between the time-series images, and tracks the same target object along the time series. In doing so, the image processing apparatus 10 according to the first embodiment uses a three-dimensional model to estimate the region of the image of the target object that is included in common in the time-series images supplied from the cameras 11₁, 11₂, 11₃, …. The image processing apparatus 10 then uses the estimated region to associate the target object between the time-series images and track it.

As a result, the target object can be tracked with high accuracy even when the imaging directions of the cameras 11₁, 11₂, 11₃, … or the moving direction of the target object differ, so that the time-series images contain images of the target object in different orientations.

FIG. 3 shows an example configuration of the image processing apparatus 10 according to the first embodiment. The image processing apparatus 10 includes a CPU (Central Processing Unit) 101, a ROM (Read Only Memory) 102, a RAM (Random Access Memory) 103, a storage 104, an input/output I/F 105, a communication I/F 106, a display control unit 107, and a camera I/F 109, and these units are communicably connected to one another via a bus 100. The image processing apparatus 10 can thus be realized with a configuration similar to that of a general computer.

The CPU 101 operates according to a program stored in advance in the ROM 102 or the storage 104, using the RAM 103 as a work memory, and controls the overall operation of the image processing apparatus 10. The storage 104 is a hard disk drive or a nonvolatile semiconductor memory (flash memory), and stores the program for operating the CPU 101 and various data.

The input/output I/F 105 is, for example, a USB (Universal Serial Bus) interface for transmitting data to and receiving data from external devices. Input devices such as a keyboard and a pointing device (such as a mouse) can be connected to the input/output I/F 105. A drive device that reads disk storage media such as a CD (Compact Disk) or a DVD (Digital Versatile Disk) may also be connected to the input/output I/F 105. The communication I/F 106 controls communication with a network such as a LAN (Local Area Network) or the Internet. The display control unit 107 converts a display control signal, generated by the CPU 101 according to the program, into a display signal that can be displayed by a display device 108 using an LCD (Liquid Crystal Display) or the like as its display device, and outputs the display signal.

The camera I/F 109 takes in the image signals output from the cameras 11₁, 11₂, 11₃, … and outputs them to the bus 100 as the time-series images of the cameras 11₁, 11₂, 11₃, …, each including a plurality of frame images along the time series as described above.

FIG. 4 is an example functional block diagram for explaining the functions of the image processing apparatus 10 according to the first embodiment. In FIG. 4, the image processing apparatus 10 includes an image acquisition unit 120, a storage unit 121, a tracking unit 122, a moving direction acquisition unit 123, a region selection unit 124, a feature amount extraction unit 125, a connecting unit 126, and an output unit 127. A single image acquisition unit 120 may be provided in the image processing apparatus 10, as illustrated in FIG. 4, or one may be provided for each of the cameras 11₁, 11₂, 11₃, ….

The image acquisition unit 120 acquires the time-series images supplied from the cameras 11₁, 11₂, 11₃, …. The image acquisition unit 120 also acquires identification information (referred to as image identification information) that identifies the camera from which each time-series image was acquired. The image acquisition unit 120 adds the image identification information to the time-series image and supplies it to the tracking unit 122 and the storage unit 121. The storage unit 121 stores the supplied time-series image in the RAM 103 or the storage 104 in association with the image identification information.
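The tagging and storage described above can be pictured with a minimal Python sketch. The names `TaggedFrame`, `acquire`, and `store` are hypothetical and do not appear in the publication, and a plain dictionary stands in for the storage unit 121.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class TaggedFrame:
    camera_id: str    # image identification information, e.g. "A01"
    timestamp: float  # capture time of the frame
    pixels: Any       # frame data; its concrete type is left open in this sketch


def acquire(camera_id: str, frames: List[Tuple[float, Any]]) -> List[TaggedFrame]:
    """Attach the camera ID to every (timestamp, pixels) pair of one time-series image."""
    return [TaggedFrame(camera_id, t, px) for t, px in frames]


def store(storage: Dict[str, List[TaggedFrame]], tagged: List[TaggedFrame]) -> None:
    """Keep frames grouped by image identification information (assumed layout)."""
    for frame in tagged:
        storage.setdefault(frame.camera_id, []).append(frame)
```

Keying the store by camera ID is only one possible layout; it is chosen here because later steps look up tracking results per time-series image.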

The tracking unit 122 detects the target object captured in each frame image included in the time-series image supplied from the image acquisition unit 120, and tracks the position of the image of the target object within the frame images from frame to frame. In the following, unless otherwise specified, the "image of the target object" is simply referred to as the "target object". The tracking unit 122 adds moving body identification information, which identifies the target object, to the tracking result, that is, the information indicating the position of the target object in each frame image, and supplies it to the moving direction acquisition unit 123.

The moving direction acquisition unit 123 estimates and acquires the moving direction of the target object included in the frame images supplied from the tracking unit 122. The moving direction acquisition unit 123 supplies the moving direction acquired for the target object to the region selection unit 124 in association with the image identification information and the moving body identification information, and causes the storage unit 121 to store it.

Based on a reference direction determined in advance for the target object and the moving direction acquired by the moving direction acquisition unit 123, the region selection unit 124 estimates a region on the first target object and the second target object that is common to a first target object captured in a first time-series image acquired, for example, from the camera 11₁ and a second target object captured in a second time-series image acquired from the camera 11₂, and selects the estimated region as the common region.

The feature amount extraction unit 125 applies the common region selected by the region selection unit 124 to each of the first target object and the second target object described above, and extracts feature amounts from the images within the common region. The feature amount extraction unit 125 supplies the extracted feature amounts to the connecting unit 126 in association with the moving body identification information of the target object and the image identification information of the time-series image.

Based on the feature amounts extracted by the feature amount extraction unit 125, the connecting unit 126 associates the tracking results of each target object between the time-series images acquired from the cameras 11₁, 11₂, 11₃, …. Information indicating the association result is stored in the storage unit 121 and supplied to the output unit 127.

The output unit 127 outputs the information indicating the association result. As one example, the output unit 127 may generate an image in which the moving body identification information associated by the connecting unit 126 is added to and displayed with the image of the target object included in each time-series image. This image is displayed, for example, on the display device 108.

The image acquisition unit 120, the storage unit 121, the tracking unit 122, the moving direction acquisition unit 123, the region selection unit 124, the feature amount extraction unit 125, the connecting unit 126, and the output unit 127 are realized by an image processing program that runs on the CPU 101 described above. Alternatively, some or all of these units may be configured as hardware components that operate in cooperation with one another.

FIG. 5 is an example flowchart illustrating image processing in the image processing apparatus 10 according to the first embodiment. The image processing according to the first embodiment is outlined below with reference to the flowchart of FIG. 5. In the following description, the target object is assumed to be a person. The target object is of course not limited to a person; it may be a part of a person, such as a face or a limb, or an animal other than a person. The target object may also be a machine such as an automobile or a bicycle.

In step S10, the image acquisition unit 120 acquires the time-series images supplied from the cameras 11₁, 11₂, 11₃, … and adds to each acquired time-series image the image identification information for identifying the camera 11₁, 11₂, 11₃, … that supplied it. The time-series images acquired by the image acquisition unit 120 are stored by the storage unit 121 in association with the image identification information and are supplied to the tracking unit 122.

In the next step S11, the tracking unit 122 detects the target object from the time-series image acquired by the image acquisition unit 120 and tracks the position of the detected target object between the frame images within the same time-series image. In the next step S12, the moving direction acquisition unit 123 estimates and acquires the moving direction of the target object based on the positions of the target object across the frames of the same time-series image, as acquired by the tracking unit 122 in step S11.

In the next step S13, the region selection unit 124 selects the common region, which is a region on the first target object and the second target object that is common to the first target object captured in the first time-series image and the second target object captured in the second time-series image.

In the next step S14, the feature amount extraction unit 125 determines whether a common region was selected in step S13. If the feature amount extraction unit 125 determines that a common region was selected, the process proceeds to step S15, where the selected common region is applied to each corresponding target object and a feature amount is extracted from each. If the feature amount extraction unit 125 determines that no common region was selected, the process proceeds to step S16, where a feature amount is extracted from the whole of each target object.
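The branch in steps S14 to S16 can be sketched as follows. This is an illustration under stated assumptions: the image is a list of rows of grayscale values, the common region is an axis-aligned box, and `mean_intensity` is only a placeholder for whatever feature amount is actually used. The function names are hypothetical.

```python
def crop(image, box):
    """Cut the (top, left, bottom, right) box out of a list-of-rows image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def mean_intensity(patch):
    """Placeholder feature: the average pixel value of the patch."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values)


def extract_feature(object_image, common_region, feature_fn=mean_intensity):
    """Use the common region when one was selected (S15), else the whole object (S16)."""
    if common_region is not None:
        return feature_fn(crop(object_image, common_region))
    return feature_fn(object_image)
```

For example, with `image = [[0, 0, 10, 10], [0, 0, 10, 10]]`, calling `extract_feature(image, None)` averages the whole object, while `extract_feature(image, (0, 2, 2, 4))` averages only the right half.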

When the feature amounts have been extracted in step S15 or step S16, the process proceeds to step S17. In step S17, for each target object in the time-series image of interest, the connecting unit 126 collates its feature amount with those of the target objects in the other time-series images. The connecting unit 126 regards target objects whose feature amounts are highly similar between the time-series images as the same object, associates them, and links the time-series images. In the next step S18, the connecting unit 126 causes the storage unit 121 to store the result of associating the target objects between the time-series images in step S17, in association with the time-series image, the moving direction, the moving body identification information, and the image identification information.
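One way to picture the collation in step S17 is the sketch below. It assumes that feature amounts are numeric vectors, that similarity is measured by cosine similarity, and that a fixed threshold decides whether two objects are the same; none of these choices is stated in this passage, and the names `cosine_similarity` and `link_objects` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def link_objects(features_a, features_b, threshold=0.9):
    """Map each object ID in one time-series image to its most similar object
    in another, keeping only matches at or above the (assumed) threshold."""
    links = {}
    for id_a, feat_a in features_a.items():
        best_id, best_sim = None, threshold
        for id_b, feat_b in features_b.items():
            sim = cosine_similarity(feat_a, feat_b)
            if sim >= best_sim:
                best_id, best_sim = id_b, sim
        if best_id is not None:
            links[id_a] = best_id
    return links
```

This greedy matching does not enforce a one-to-one assignment, so two objects in one image could be linked to the same object in the other; a practical system would likely add such a constraint.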

In the next step S19, the output unit 127 outputs the result of the series of image processing. For example, the output unit 127 adds to each time-series image identification information that identifies the same target object among the target objects included in the time-series images, and outputs the result.

(Details of processing of the first embodiment)
Next, the image processing performed by the image processing apparatus 10 of the first embodiment will be described in more detail with reference to the flowchart of FIG. 5 described above. In step S10, the image acquisition unit 120 acquires the time-series images supplied from the cameras 11₁, 11₂, 11₃, … and adds to each acquired time-series image the image identification information for identifying the supplying camera 11₁, 11₂, 11₃, …. The image acquisition unit 120 causes the storage unit 121 to store the time-series images to which the image identification information has been added, and supplies them to the tracking unit 122.

In step S11, the tracking unit 122 detects the target object from each frame image included in each of the time-series images captured by the cameras 11₁, 11₂, 11₃, … and acquired by the image acquisition unit 120 in step S10, and tracks the detected target object between the frame images within the time-series image. In the following, for the sake of explanation and unless otherwise specified, the description focuses on the time-series image acquired from the camera 11₁ among the cameras 11₁, 11₂, 11₃, ….

The tracking unit 122 detects the target object from a frame image using, for example, the following method. As illustrated in FIG. 6, assume that three target objects 20a, 20b, and 20c are included in a frame image 200, and that each of the target objects 20a, 20b, and 20c is a person. The tracking unit 122 sets a detection window region in the frame image 200 and calculates a feature amount from the image within the detection window region. For example, the tracking unit 122 may calculate, as the feature amount, an HOG (Histograms of Oriented Gradients) feature amount, which is a histogram of the luminance gradients and their strengths in the detection window region. Alternatively, the tracking unit 122 may calculate a CoHOG (Co-occurrence HOG) feature amount (see Non-Patent Document 1), which improves on the HOG feature amount in terms of discrimination performance.
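As a rough illustration of the idea behind an HOG feature amount, the following sketch builds a single magnitude-weighted histogram of gradient orientations over one window. It is a simplification under stated assumptions: the window is a list of rows of luminance values, gradients are central differences, orientations are unsigned (0 to 180 degrees), and the cell and block normalization used by full HOG implementations is omitted. The function name and the default of nine bins are assumptions, not taken from the publication.

```python
import math


def orientation_histogram(window, bins=9):
    """Histogram of unsigned gradient orientations, weighted by gradient strength."""
    hist = [0.0] * bins
    height, width = len(window), len(window[0])
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = window[y][x + 1] - window[y][x - 1]  # horizontal gradient
            gy = window[y + 1][x] - window[y - 1][x]  # vertical gradient
            strength = math.hypot(gx, gy)
            if strength == 0:
                continue  # flat pixels carry no orientation
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(int(angle / bin_width), bins - 1)] += strength
    total = sum(hist)
    # Normalize so that windows with different contrast remain comparable.
    return [v / total for v in hist] if total else hist
```

A window containing only a vertical edge puts all of its weight in the first bin (orientation near 0 degrees), whereas a horizontal edge falls in the middle bin (near 90 degrees).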

The tracking unit 122 detects the target object by, for example, using the feature amount calculated as described above to discriminate whether the image within the detection window region is a target object. To track the detected target object within the time-series image, the technique disclosed in Non-Patent Document 2 may be used, for example.

The tracking unit 122 adds the moving body identification information, which identifies the target object, to the information obtained as a result of the tracking, which indicates the position of the target object in each frame image, and supplies it to the moving direction acquisition unit 123. In the example of FIG. 6, ID = 01, 02, and 03 are added as moving body identification information to the target objects 20a, 20b, and 20c detected from the frame image 200, respectively.

Here, as illustrated in FIG. 7, the tracking unit 122 tracks the target object between frame images for each of the time-series images from the plurality of cameras 11₁, 11₂, 11₃, …. In the example of FIG. 7, the image identification information is shown as a camera ID; for example, the camera IDs of the time-series images acquired from the cameras 11₁, 11₂, and 11₃ are camera ID = A01, A02, and A03, respectively.

As illustrated in FIG. 7(a), in the time-series image with camera ID = A01, the target object 20b is included in the frame images 200₀, 200₁, 200₂, and 200₃ acquired at times t₀, t₁, t₂, and t₃, respectively. As illustrated in FIG. 7(b), in the time-series image with camera ID = A02, the target object 20b is included in the frame images 200₋₄, 200₋₃, 200₋₂, and 200₋₁ acquired at times t₋₄, t₋₃, t₋₂, and t₋₁, respectively, which precede the time t₀. Similarly, as illustrated in FIG. 7(c), in the time-series image with camera ID = A03, the target object 20b is included in the frame images 200₅, 200₆, 200₇, and 200₈ acquired at times t₅, t₆, t₇, and t₈, respectively, which follow the time t₃. In this way, the same target object 20b may be captured by the cameras at different timings.

Note that the target objects 20b included in the time-series images with camera ID = A01, A02, and A03 are not recognized as the same target object 20b as long as the connecting unit 126 has not associated them across the time-series images. Therefore, different identifiers ID = 02, ID = 12, and ID = 22 are attached to the target objects 20b included in the time-series images with camera ID = A01, A02, and A03, respectively.

The moving direction acquisition unit 123 estimates and acquires the moving direction of the target object between frame images (step S12 in FIG. 5). It estimates the moving direction using at least one of the following: the transition of the position of the target object over a predetermined number of frame images before and/or after the frame image of interest in the time-series image acquired by the image acquisition unit 120, and the orientation of the target object.

More specifically, the moving direction acquisition unit 123 can estimate the moving direction using the movement trajectory of the target object between frame images, as given by the tracking result of the tracking unit 122. Alternatively, based on that tracking result, the moving direction acquisition unit 123 can estimate the moving direction from the transition of the coordinates of an arbitrary shape enclosing the target object, for example the center coordinates of a rectangle, or the coordinates of its four corners or of any one corner.

An example of how the moving direction acquisition unit 123 according to the first embodiment acquires the moving direction of the target object 20 is described with reference to FIG. 8. In FIG. 8, the upper left corner of the frame image 200 is the coordinate origin, the x coordinate increases toward the right, and the y coordinate increases downward.

As a first example, when the position of the target object 20 in the frame image has a constant x coordinate and only an increasing y coordinate over a predetermined number of frames before and after the frame of interest, including the frame of interest, the moving direction acquisition unit 123 estimates that the target object 20 is moving toward the camera (see arrow 211 in FIG. 8) and is facing the camera frontally.

As a second example, when the position of the target object 20 in the frame image 200 has only a decreasing x coordinate and a constant y coordinate over a predetermined number of frames before and after the frame of interest, including the frame of interest, the moving direction acquisition unit 123 estimates that the target object 20 is moving rightward as seen facing the camera (see arrow 213 in FIG. 8) and is oriented rightward as seen facing the camera.

As a third example, when the position of the target object 20 in the frame image 200 has a constant x coordinate and only a decreasing y coordinate over a predetermined number of frames before and after the frame of interest, including the frame of interest, the moving direction acquisition unit 123 estimates that the target object 20 is moving away from the camera (see arrow 210 in FIG. 8) and has its back to the camera.

As a fourth example, when the position of the target object 20 in the frame image 200 has only an increasing x coordinate and a constant y coordinate over a predetermined number of frames before and after the frame of interest, including the frame of interest, the moving direction acquisition unit 123 estimates that the target object 20 is moving leftward as seen facing the camera (see arrow 214 in FIG. 8) and is oriented leftward as seen facing the camera.

The description above has the moving direction acquisition unit 123 estimate one of four moving directions (forward, backward, left, and right relative to the camera) as the moving direction of the target object 20, but this is not limiting. The moving direction acquisition unit 123 can also estimate the moving direction of the target object 20 in finer detail, at an arbitrary angle such as 30° or 60°, from the amounts of change in the x and y coordinates of the position of the target object 20 in the frame image 200.
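The four examples and the finer-angle variant can be sketched as follows. This is a minimal illustration, not the apparatus's actual implementation: the function name `estimate_direction`, the tolerance `tol`, the label strings, and the "stationary" and "oblique" cases are assumptions introduced here, and the sketch compares only the first and last positions of the window rather than checking every frame. Note that a decreasing x coordinate (leftward motion in the image) corresponds to "rightward" in the text's camera-relative sense.

```python
import math

def estimate_direction(track, tol=1.0):
    """Classify motion from the first to the last (x, y) position in `track`.

    Image coordinates: origin at top-left, x grows rightward, y grows downward.
    Returns (label, angle_deg); the angle is atan2(dy, dx) in image axes.
    """
    (x0, y0), (x1, y1) = track[0], track[-1]
    dx, dy = x1 - x0, y1 - y0
    if abs(dx) <= tol and abs(dy) <= tol:
        return "stationary", None          # assumed extra case, not in the text
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    if abs(dx) <= tol:                     # x constant, only y changes
        label = "toward_camera" if dy > 0 else "away_from_camera"
    elif abs(dy) <= tol:                   # y constant, only x changes
        label = "rightward" if dx < 0 else "leftward"
    else:
        label = "oblique"                  # use `angle` for a finer estimate
    return label, angle
```

For instance, a track whose y coordinate grows while x stays fixed is labeled `toward_camera`, matching the first example.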

Also, in the description above, the moving direction acquisition unit 123 estimates the moving direction of the target object 20 using a predetermined number of frame images before and after the frame of interest, but this is not limiting. The moving direction acquisition unit 123 may instead use a predetermined number of frame images that temporally follow the frame of interest, or a predetermined number of frame images that temporally precede it.

When the moving direction acquisition unit 123 detects a change in moving direction equal to or greater than a threshold within the predetermined number of frames used to estimate the moving direction, it can determine that the target object before the change and the target object after the change are different objects.
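A minimal sketch of this threshold check, assuming the per-frame moving directions are available as angles in degrees. The helper names `angular_difference` and `find_direction_break` and the frame-to-frame comparison are illustrative assumptions; the text does not specify how the change is measured.

```python
def angular_difference(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)

def find_direction_break(angles_deg, threshold_deg):
    """Return the index of the first frame whose direction differs from the
    previous frame's by at least threshold_deg, or None if there is none."""
    for i in range(1, len(angles_deg)):
        if angular_difference(angles_deg[i], angles_deg[i - 1]) >= threshold_deg:
            return i
    return None
```

A caller could treat the frames before and after the returned index as belonging to different target objects.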

The description above has the moving direction acquisition unit 123 assume that the target object 20 faces its moving direction, but this is not limiting. For example, the moving direction acquisition unit 123 can extract a feature quantity from the image of the target object in an arbitrary frame and, using a classifier trained on that feature quantity with target objects facing each direction, calculate the likelihood that the target object faces each direction.

In this case, an SVM (Support Vector Machine) can be used as the classifier. The moving direction acquisition unit 123 applies a threshold test to the likelihood of each direction calculated with the classifier and determines which direction the target object is facing. As the feature quantity for calculating the likelihood of each direction, the CoHOG feature quantity disclosed in Non-Patent Document 1 can be used, for example.
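The threshold test on the per-direction likelihoods might look like the sketch below. The classifier itself is out of scope here; `likelihoods` stands in for its output, and the function name, the dictionary layout, and the default threshold of 0.5 are assumptions made for illustration.

```python
def decide_orientation(likelihoods, threshold=0.5):
    """Pick the direction with the highest likelihood, or None when even the
    best score falls below the threshold.

    likelihoods: dict mapping a direction label to a classifier score.
    """
    best = max(likelihoods, key=likelihoods.get)
    return best if likelihoods[best] >= threshold else None
```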

The moving direction acquisition unit 123 stores the moving direction estimated for the target object as described above in the storage unit 121, in association with the moving body identification information identifying the target object, the frame images 200 included in the time-series image used to estimate the moving direction, and the image identification information identifying that time-series image.

For each piece of moving body identification information stored in the storage unit 121, the region selection unit 124 compares the moving direction corresponding to a first target object captured in a first time-series image, which is the time-series image of interest acquired from, for example, the camera 11₁, with the moving direction corresponding to a second target object captured in a second time-series image acquired from the camera 11₂. Based on the result of this comparison, the region selection unit 124 uses a predetermined model to estimate the area on the first target object and the second target object that is included in common in the first time-series image and the second time-series image. The region selection unit 124 selects this estimated area as the common area from which the feature quantities of the first target object and the second target object are extracted (step S13 in FIG. 5).

The region selection unit 124 can acquire the moving direction corresponding to the first target object from, for example, the moving direction acquisition unit 123. It can acquire the moving direction corresponding to the second target object from, for example, the storage unit 121.

A method by which the region selection unit 124 uses a model to select the common area for feature quantity extraction is described with reference to FIGS. 9 to 11. FIG. 9 shows an example of a model corresponding to a target object that is a person, according to the first embodiment. In the first embodiment, as illustrated in FIG. 9, a cylinder, which is a three-dimensional shape, is used as the model 300. The shape of the model is not limited to this example; any shape that abstracts the shape of the target object may be used, such as a rectangular parallelepiped, an ellipsoid, or a sphere.

In selecting the area for feature quantity extraction, a reference direction is determined for the model 300 in advance, and it is defined which range on the model 300 is included in the frame image when the model 300 points its reference direction at the front of the camera. In the first embodiment, the reference direction is the front direction of the model 300. When the target object faces frontally in front of the camera, the range of the model 300 included in the frame image captured by the camera is defined as a range 310 extending 60° to each of the left and right sides as seen from the camera, taking the front of the model 300, that is, the reference direction, as 0°.

When angles are defined as increasing counterclockwise, the range 310 is the minor-angle range bounded by 60° and 300°, with the front at 0°. Hereinafter, angles are assumed to increase counterclockwise, and an angle range refers to the minor-angle range delimited by two angles.

Next, the region selection unit 124 compares the moving direction of the first target object included in the first time-series image of interest, acquired from, for example, the camera 11₁, with the moving direction of the second target object included in the second time-series image, acquired from, for example, the camera 11₂, and determines on the model 300 the area on the first target object and the second target object that is included in common in the first time-series image and the second time-series image.

As an example, consider with reference to FIG. 10 the case where the moving direction of the first target object is frontal and the moving direction of the second target object is rightward. FIGS. 10(a) to 10(c) show the model 300 viewed from above.

For the first target object, which faces the front, the range 310 of 60° to 300° on the model 300 is included in the first time-series image, as illustrated in FIG. 10(a). The second target object, on the other hand, is moving rightward, so its front faces leftward in the second time-series image. Therefore, as illustrated in FIG. 10(b), the range 310 included in the second time-series image is the range of 30° to 150° of the model 300. Consequently, the common area on the first target object and the second target object that is included in common in the first time-series image and the second time-series image is the range 321 of 30° to 60° of the model 300, as illustrated in FIG. 10(c).

FIG. 11 shows an example of the model 300 according to the first embodiment viewed from above. In FIG. 11, when the reference direction of the model 300 points toward the camera 11, the range 310 of the model 300 included in the time-series image is 60° to 300° with respect to the reference direction (60° on each side of the reference direction). When the reference direction of the model 300 points rightward with respect to the camera 11, the reference direction appears to point leftward in the time-series image, and the range 310 of the model 300 included in the time-series image is 30° to 150° with respect to the reference direction. The range 321 common to the range 310 and the range 320 is estimated to be the common area on the first target object and the second target object that is included in common in the first time-series image and the second time-series image.

For example, when the range 321 obtained is at least as wide as a threshold, the region selection unit 124 selects the range 321 as the area from which the feature quantity extraction unit 125, described later, extracts feature quantities. The region selection unit 124 can select the common area by focusing on a predetermined frame image of each of the first time-series image and the second time-series image, for example the last frame image used to acquire the moving direction.
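The overlap computation on the cylindrical model can be sketched as a set intersection over angles around the cylinder. This is an assumed, simplified formulation at 1° resolution: `visible_arc`, `select_common_area`, the center angles, and the default `min_extent_deg` are hypothetical names and values. Centers of 0° and 90° are chosen only because they reproduce the ranges given in the example (60° to 300° through 0°, and 30° to 150°).

```python
def visible_arc(center_deg, half_width_deg=60):
    """Integer degrees on the model visible to the camera, for an arc centered
    at center_deg and extending half_width_deg to each side (wraps at 360)."""
    return {(center_deg + k) % 360
            for k in range(-half_width_deg, half_width_deg + 1)}

def select_common_area(center_a_deg, center_b_deg, min_extent_deg=10):
    """Return the set of degrees visible in both views, or None when the
    overlap is narrower than min_extent_deg.

    The extent is approximated as (number of shared degrees - 1), which
    assumes the overlap is a single contiguous arc.
    """
    common = visible_arc(center_a_deg) & visible_arc(center_b_deg)
    extent = max(len(common) - 1, 0)
    return common if extent >= min_extent_deg else None
```

With centers 0 and 90, the result is the 30° to 60° arc, which corresponds to the range 321 in the example; with centers 0 and 180 the views do not overlap and no common area is selected.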

For each target object captured in the first time-series image, the region selection unit 124 selects a common area with each second target object captured in the second time-series image.

In the first embodiment, the feature quantity extraction unit 125 applies the range 321 to each of the first target object and the second target object from which the range 321 was derived, and extracts the feature quantity of each of the first target object and the second target object using the image included in the range 321.

When the region selection unit 124 has selected a common area, the feature quantity extraction unit 125 applies weighting according to the distance from the common area and extracts feature quantities from the images corresponding to the first target object and the second target object (step S15 in FIG. 5). The feature quantity extraction unit 125 extracts, for example, values that are intuitive and effective for representing a person's individuality. A color histogram in an arbitrary color space is one example of such a feature quantity. Texture information such as a clothing pattern may also be used. The feature quantity is extracted, for example, as a multidimensional value with a plurality of parameters.
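One way to picture distance-dependent weighting is a histogram in which each pixel's vote shrinks as the pixel lies farther from the common area. The sketch below is an assumption-laden illustration: the text does not give the weighting function, so the exponential decay, the parameter `sigma`, the single-channel values, and the one-dimensional column interval used for the common area are all hypothetical.

```python
import math

def weighted_histogram(pixels, region, bins=4, sigma=5.0):
    """Build a normalized histogram with distance-based weights.

    pixels: iterable of (x, value), x being the pixel column and value an
            integer intensity in 0..255 (one channel, for brevity).
    region: (x_min, x_max) columns of the common area in the object image.
    Pixels inside the region get weight 1; outside, the weight decays
    exponentially with the column distance to the region (assumed form).
    """
    x_min, x_max = region
    hist = [0.0] * bins
    for x, value in pixels:
        distance = max(x_min - x, 0, x - x_max)
        weight = math.exp(-distance / sigma)
        hist[min(value * bins // 256, bins - 1)] += weight
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist
```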

The feature quantity extraction unit 125 applies the respective common area to each first target object captured in the first time-series image and to each second target object captured in the second time-series image, and extracts a feature quantity for each target object captured in the first time-series image and the second time-series image.

For each pair of target objects to which the feature quantity extraction unit 125 applied the common area, the connecting unit 126 calculates a similarity from the extracted feature quantities and determines whether the pair can be associated. The connecting unit 126 can obtain the similarity using the L1 norm or the Bhattacharyya distance between the feature quantities.

The connecting unit 126 may also determine whether association is possible based on the output obtained by combining the two feature quantities of a pair of target objects to which the common area was applied into one and classifying the result with a classifier. An SVM, for example, can be used as the classifier. The connecting unit 126 can, for example, calculate the similarity between one first target object and each of a plurality of second target objects, and determine that the second target object yielding the highest similarity is the one associated with the first target object. Alternatively, the connecting unit 126 may determine that any combination of target objects whose similarity is equal to or greater than a threshold is associated.
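The distance-based matching can be sketched as below, assuming the feature quantities are normalized histograms. The function names and the optional `max_distance` cutoff are illustrative assumptions. The Bhattacharyya distance is written in one common bounded form, sqrt(1 - BC), where BC is the Bhattacharyya coefficient; other definitions exist. A smaller distance is treated as a higher similarity.

```python
import math

def l1_distance(p, q):
    """L1 norm of the difference between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))

def bhattacharyya_distance(p, q):
    """sqrt(1 - BC) for normalized histograms p and q (0 = identical)."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - bc))

def best_match(first_feature, second_features, max_distance=None):
    """Return the ID in second_features closest to first_feature.

    second_features: dict mapping a moving body ID to its histogram.
    If max_distance is given and even the best candidate exceeds it,
    return None (no association).
    """
    best_id = min(
        second_features,
        key=lambda k: bhattacharyya_distance(first_feature, second_features[k]),
    )
    best = bhattacharyya_distance(first_feature, second_features[best_id])
    if max_distance is not None and best > max_distance:
        return None
    return best_id
```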

The connecting unit 126 can output the result of associating target objects across the time-series images, obtained as described above, to the outside via the output unit 127. The connecting unit 126 also adds the association result to, or updates it in, the information stored in the storage unit 121 for the associated target objects: the moving body identification information, the time-series image, the moving direction, and the image identification information identifying the camera that acquired the time-series image.

For example, referring to FIG. 7, when the target objects 20b with moving body identification information ID = 02, 12, and 22, included in the time-series images identified by camera ID = A01, A02, and A03, respectively, are associated with one another, information indicating their mutual association, such as "ID12 = ID02" and "ID22 = ID02", is added to and stored in the storage unit 121. When further processing changes the information indicating this association, the connecting unit 126 updates the contents stored in the storage unit 121.
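Association records such as "ID12 = ID02" can be kept in a small mapping that resolves any ID to a representative ID. The class below is a hypothetical sketch of such bookkeeping; `AssociationStore` and its methods are not part of the described storage unit 121.

```python
class AssociationStore:
    """Keeps 'ID a = ID b' links and resolves an ID to its representative."""

    def __init__(self):
        self._link = {}  # maps an ID to the ID it was associated with

    def resolve(self, object_id):
        """Follow stored links until an ID with no further link is reached."""
        while object_id in self._link:
            object_id = self._link[object_id]
        return object_id

    def associate(self, object_id, other_id):
        """Record that object_id and other_id denote the same target object."""
        a, b = self.resolve(object_id), self.resolve(other_id)
        if a != b:
            self._link[a] = b
```

After `associate("12", "02")` and `associate("22", "02")`, both `resolve("12")` and `resolve("22")` return `"02"`.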

Thus, the image processing apparatus 10 of the first embodiment uses information on the moving direction of the target object to estimate a common area included in common in images acquired by different cameras, and uses the estimated common area to associate images of the target object across images acquired by different cameras. As a result, the first embodiment can track the target object with higher accuracy even when the installation direction of each camera and the moving direction of the target object differ from camera to camera, so that the orientation in which the target object appears differs from one time-series image to another.

(Modification of the first embodiment)
Next, a modification of the first embodiment is described. In the first embodiment described above, the storage unit 121 stores, in association with one another, each time-series image, the image identification information identifying each camera (that is, each time-series image), the moving body identification information identifying the target object included in the time-series image, and the moving direction of the target object identified by the moving body identification information.

In contrast, in the modification of the first embodiment, the storage unit 121 stores, instead of the time-series images, feature quantities extracted from the image corresponding to the entire target object. For example, another feature quantity extraction unit, which extracts the feature quantity of the image corresponding to the entire target object detected by the tracking unit 122, is added to the configuration of FIG. 4. This other feature quantity extraction unit extracts the feature quantity from the image corresponding to the entire target object detected by the tracking unit 122 and stores it in the storage unit 121 in association with the moving body identification information.

The region selection unit 124 selects the common area using the model 300 described above, based on the moving direction of the target object acquired by the moving direction acquisition unit 123 and stored in the storage unit 121. The feature quantity extraction unit 125 extracts, from the feature quantity of the entire target object stored in the storage unit 121, a feature quantity in which the common area is weighted. Alternatively, the feature quantity extraction unit 125 may extract, from the feature quantity of the entire target object stored in the storage unit 121, only the feature quantity included in the common area.

With this method, as in the first embodiment described above, information on the moving direction of the target object is used to estimate a common area included in common in images acquired by different cameras, and the estimated common area is used to associate the target object across those images, so the target object can be tracked with high accuracy. In addition, because the time-series images are not stored in the storage unit 121, the capacity of the storage unit 121 can be saved.

(Second Embodiment)
Next, a second embodiment is described. FIG. 12 is a functional block diagram illustrating an example of the functions of the image processing apparatus 10' according to the second embodiment. In FIG. 12, parts in common with FIG. 4 described above are given the same reference numerals, and detailed description of them is omitted. The image processing apparatus 10' according to the second embodiment adds a color correction unit 130 to the image processing apparatus 10 according to the first embodiment shown in FIG. 4.

In the second embodiment, the color tones are aligned among the time-series images that the image acquisition unit 120 acquires from the cameras 11₁, 11₂, 11₃, …. In the example of FIG. 12, the color correction unit 130 is provided between the region selection unit 124 and the feature quantity extraction unit 125, and aligns, across the time-series images, the color tone of the image included in the common area selected by the region selection unit 124 or the color tone of the entire target object.

That is, depending on how and where the cameras 11₁, 11₂, 11₃, … are installed, time-series images with different color tones may be acquired even when the cameras 11₁, 11₂, 11₃, … capture the same subject. For example, when the cameras 11₁, 11₂, 11₃, … are installed in rooms lit with different color temperatures, or when some of the cameras 11₁, 11₂, 11₃, … are installed indoors and the others outdoors, the color tone of the acquired time-series images differs depending on where each camera is installed.

In the second embodiment, color correction is applied to the time-series images acquired from the cameras 11₁, 11₂, 11₃, … so that the color tone of each time-series image matches that of the time-series image acquired by one of the cameras 11₁, 11₂, 11₃, …. Here, the time-series image acquired by the camera 11₁ is taken as the reference, and the color correction unit 130 corrects the color tone of the time-series images acquired by the other cameras 11₂, 11₃, … to match the color tone of the time-series image acquired by the camera 11₁.

FIG. 13 is a flowchart illustrating an example of image processing in the image processing apparatus 10' according to the second embodiment. In FIG. 13, parts in common with FIG. 5 described above are given the same reference numerals, and detailed description of them is omitted. As illustrated in FIG. 13, in the second embodiment, after the determination in step S14 as to whether a common area exists, the color correction unit 130 performs color correction (step S25 or step S26), and the feature quantity extraction of step S15 or step S16 is then performed on the color-corrected image.

More specifically, when it is determined in step S14 that a common area exists, the process proceeds to step S25. In step S25, the color correction unit 130 applies color correction to, for example, the images of the first target object and the second target object within the range to which the common area is applied, so as to match the color tone of a reference image acquired in advance. In the next step S15, the feature quantity extraction unit 125 extracts, for the first target object and the second target object, the feature quantities of the color-corrected images within the common area.

On the other hand, when it is determined in step S14 that no common area exists, the process proceeds to step S26. In step S26, the color correction unit 130 applies color correction to, for example, the whole of each of the first target object and the second target object, so as to match the color tone of the reference image acquired in advance. In the next step S16, the feature quantity extraction unit 125 extracts feature quantities from the color-corrected first target object and second target object.

For color correction between images, the technique disclosed in Non-Patent Document 3 can be applied, for example. According to Non-Patent Document 3, the same subject is imaged in advance by each of the cameras 11₁, 11₂, 11₃, …, and the difference in color of the same subject between the cameras 11₁, 11₂, 11₃, … is learned as a conversion function (luminance transfer function cf_ij) as illustrated in FIG. 14. In FIG. 14, the characteristic line 330 indicating the luminance transfer function cf_ij shows, for example, the correspondence between the luminance Y_a of a first image and the luminance Y_b of a second image acquired by a first camera and a second camera, respectively.
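As an illustration of how a luminance correspondence of this kind might be obtained, the sketch below estimates a lookup table from luminance samples of the same subject seen by two cameras. The text states only that the function is learned in advance; the use of cumulative-histogram matching as the estimator, the function names, and the 256-level luminance range are assumptions made for this example.

```python
import numpy as np

def learn_luminance_transfer(y_a, y_b, levels=256):
    """Estimate a lookup table mapping camera-a luminance to camera-b luminance.

    Assumption: cumulative-histogram matching is used as the learning method.
    y_a, y_b: integer luminance samples (0 .. levels-1) of the same subject
    as observed by the first and the second camera.
    """
    hist_a = np.bincount(np.ravel(y_a), minlength=levels).astype(np.float64)
    hist_b = np.bincount(np.ravel(y_b), minlength=levels).astype(np.float64)
    cdf_a = np.cumsum(hist_a) / hist_a.sum()
    cdf_b = np.cumsum(hist_b) / hist_b.sum()
    # For each level of camera a, take the first level of camera b whose
    # cumulative frequency reaches the same value.
    lut = np.searchsorted(cdf_b, cdf_a, side="left")
    return np.clip(lut, 0, levels - 1).astype(np.int64)

def apply_luminance_transfer(y_a, lut):
    """Convert camera-a luminance values with the learned lookup table."""
    return lut[y_a]
```

Plotting the lookup table with Y_a on one axis and Y_b on the other gives a curve of the same kind as the characteristic line 330. Applying it to an image from the first camera brings its luminance distribution close to that of the second camera.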

As described above, according to the second embodiment, color correction is performed to align the color tone between the images acquired from the cameras 11₁, 11₂, 11₃, …. Therefore, even when the color temperature or the like differs between the installation environments of the cameras 11₁, 11₂, 11₃, …, the target object can be tracked with higher accuracy.

(Other embodiments)
The image processing program for executing the image processing according to each embodiment and modification is provided as a file in an installable or executable format, recorded on a computer-readable recording medium such as a CD (Compact Disk) or a DVD (Digital Versatile Disk). The provision is not limited to this; the image processing program may instead be stored in advance in the ROM 102.

Furthermore, the image processing program for executing the image processing according to each embodiment and modification may be stored on a computer connected to a communication network such as the Internet and provided by being downloaded via the communication network. The image processing program may also be provided or distributed via a communication network such as the Internet.

The image processing program for executing the image processing according to each embodiment and modification has, for example, a module configuration including the units described above (the image acquisition unit 120, the storage unit 121, the tracking unit 122, the moving direction acquisition unit 123, the region selection unit 124, the feature amount extraction unit 125, the connection unit 126, and the output unit 127, and additionally the color correction unit 130 in the second embodiment). As actual hardware, the CPU 101 reads the image processing program from, for example, the storage 104 and executes it, whereby the above units are loaded onto a main storage device (for example, the RAM 103) and generated on the main storage device.

Note that the present embodiment is not limited to the embodiments exactly as described above; at the implementation stage, the constituent elements can be modified and embodied without departing from the gist of the invention. Various inventions can be formed by appropriately combining the plurality of constituent elements disclosed in the above embodiments. For example, some constituent elements may be deleted from all the constituent elements shown in an embodiment. Furthermore, constituent elements from different embodiments may be combined as appropriate.

10, 10' Image processing apparatus
11, 11₁, 11₂, 11₃ Camera
20, 20a, 20b, 20c Target object
101 CPU
109 Camera I/F
120 Image acquisition unit
121 Storage unit
122 Tracking unit
123 Moving direction acquisition unit
124 Region selection unit
125 Feature amount extraction unit
126 Connection unit
127 Output unit
130 Color correction unit
200, 200₋₄, 200₋₃, 200₋₂, 200₋₁, 200₀, 200₁, 200₂, 200₃, 200₅, 200₆, 200₇, 200₈ Frame image
300 Model

Claims

An image processing apparatus comprising:
an image acquisition unit that acquires time-series images from each of a plurality of imaging devices;
a tracking unit that tracks the position of a target object captured in the time-series images;
a moving direction acquisition unit that acquires the moving direction of the target object from the time-series images;
a region selection unit that acquires, based on the moving direction and a predetermined reference direction serving as a reference for the orientation of the target object, a common area on a first target object captured in a first time-series image among the plurality of time-series images acquired from the plurality of imaging devices and on a second target object captured in a second time-series image among the plurality of time-series images, the common area being an area in which the first target object and the second target object are included in common in the first time-series image and the second time-series image; and
a connection unit that extracts a feature amount from an image corresponding to the common area and, using the feature amount, associates the tracking results of the tracking unit between the plurality of time-series images acquired from the plurality of imaging devices.

The image processing apparatus according to claim 1, wherein the moving direction acquisition unit acquires the moving direction using at least one of: transition information of the target object obtained as a result of tracking the target object in a predetermined number of the frame images at least before and after a target frame image in the time series, among the frame images included in the time-series image; transition information of a circumscribed rectangle including the target object captured in the time-series image; and an identification result identified by a direction identifier.

The image processing apparatus according to claim 1, wherein the region selection unit acquires the common area using a model that associates the reference direction with a region corresponding to the region of the target object captured in the time-series image acquired when the target object moves in the reference direction in front of the imaging device.

The image processing apparatus according to claim 1, wherein the connection unit performs the association using the second feature amount weighted based on a result of comparing a first feature amount extracted from the entire target object included in the time-series image with a second feature amount extracted from the common area.

The image processing apparatus according to any one of claims 1 to 3, wherein the connection unit performs the association using a feature amount extracted from the common area.

The image processing apparatus according to claim 1, further comprising a color correction unit that corrects the first time-series image so as to convert its color tone into the color tone of the second time-series image, wherein the region selection unit acquires the common area using the first time-series image whose color tone has been converted by the color correction unit.

An image processing method comprising:
an image acquisition step of acquiring time-series images from each of a plurality of imaging devices;
a tracking step of tracking the position of a target object captured in the time-series images;
a moving direction acquisition step of acquiring the moving direction of the target object from the time-series images;
a region selection step of acquiring, based on the moving direction and a predetermined reference direction serving as a reference for the orientation of the target object, a common area on a first target object captured in a first time-series image among the plurality of time-series images acquired from the plurality of imaging devices and on a second target object captured in a second time-series image among the plurality of time-series images, the common area being an area in which the first target object and the second target object are included in common in the first time-series image and the second time-series image; and
a connecting step of extracting a feature amount from an image corresponding to the common area and, using the feature amount, associating the tracking results of the tracking step between the plurality of time-series images acquired from the plurality of imaging devices.

An image processing program for causing a computer to execute:
an image acquisition step of acquiring time-series images from each of a plurality of imaging devices;
a tracking step of tracking the position of a target object captured in the time-series images;
a moving direction acquisition step of acquiring the moving direction of the target object from the time-series images;
a region selection step of acquiring, based on the moving direction and a predetermined reference direction serving as a reference for the orientation of the target object, a common area on a first target object captured in a first time-series image among the plurality of time-series images acquired from the plurality of imaging devices and on a second target object captured in a second time-series image among the plurality of time-series images, the common area being an area in which the first target object and the second target object are included in common in the first time-series image and the second time-series image; and
a connecting step of extracting a feature amount from an image corresponding to the common area and, using the feature amount, associating the tracking results of the tracking step between the plurality of time-series images acquired from the plurality of imaging devices.