JP2019128791A - Image processing device and control method thereof

Info

Publication number: JP2019128791A
Application number: JP2018009941A
Authority: JP
Inventors: 大貴平賀; Daiki Hiraga; 弥志河林; Hisashi Kawabayashi
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2018-01-24
Filing date: 2018-01-24
Publication date: 2019-08-01

Abstract

To make it possible to efficiently set parameters for detecting an object included in an image.

SOLUTION: An image processing device sets a detection area for detecting a predetermined moving object in an image captured by an imaging device, and a moving object size of the predetermined moving object in the detection area. The device has: acquisition means for acquiring a moving image from the imaging device; first extraction means for extracting a moving object image from each of a plurality of images included in the moving image; generation means for generating a superimposed image in which the plurality of moving object images extracted by the first extraction means are superimposed; first setting means for setting the detection area on the basis of the superimposed image; second extraction means for extracting, from among the plurality of moving object images extracted by the first extraction means, the moving object images included in the detection area; determination means for determining the moving object image having the maximum size and the moving object image having the minimum size among the moving object images extracted by the second extraction means; and second setting means for setting the moving object size in the detection area on the basis of the maximum-size and minimum-size moving object images determined by the determination means.

SELECTED DRAWING: Figure 5

Description

The present invention relates to an image processing technique for moving object detection.

Conventionally, techniques are known for analyzing video captured by a camera to detect and count objects such as human bodies. To speed up human body detection, it is effective to limit, for the video, both the detection area (the target region in which detection is performed) and the human body size to be detected. One conceivable method is to actually capture people standing in the real space and have the user manually set the detection area and the human body size while viewing the captured video. However, actually capturing people standing in the real space requires mobilizing many people, which takes time and incurs cost.

It is therefore conceivable to create video that imitates people standing in the scene. Patent Document 1 discloses a display device that generates an output image by superimposing the trajectory of a moving object on video composed of a plurality of frame images sequentially obtained by imaging means controlled so that the moving object falls within an area. Patent Document 2 discloses a technique for highlighting the trajectory of a moving object that moved during a time period specified by the user, and a technique for highlighting moving objects that move faster or slower than a moving speed specified by the user.

JP 2011-254289 A
U.S. Patent Application Publication No. 2012/0163657

However, with the conventional techniques described above, the user must visually check the video and set the maximum size and the minimum size used to limit the human body size. In particular, to speed up human body detection, the minimum and maximum human body sizes must be set for each detection area in the image. Setting therefore requires cumbersome operations and is not cost-efficient.

The present invention has been made in view of this problem, and its object is to provide a technique that makes it possible to efficiently set parameters for detecting an object included in an image.

In order to solve the above-described problem, an image processing apparatus according to the present invention has the following configuration. That is, an image processing apparatus that sets a detection area for detecting a predetermined moving object in video captured by an imaging device, and a moving object size of the predetermined moving object in the detection area, has:
acquisition means for acquiring a moving image from the imaging device;
first extraction means for extracting a moving object image from each of a plurality of images included in the moving image;
generation means for generating a superimposed image in which the plurality of moving object images extracted by the first extraction means are superimposed and displayed;
first setting means for setting the detection area based on the superimposed image;
second extraction means for extracting, from among the plurality of moving object images extracted by the first extraction means, the moving object images included in the detection area;
determination means for determining the moving object images of maximum size and minimum size among the moving object images extracted by the second extraction means; and
second setting means for setting the moving object size in the detection area based on the maximum-size and minimum-size moving object images determined by the determination means.

According to the present invention, it is possible to provide a technique that makes it possible to efficiently set parameters for detecting an object included in an image.

FIG. 1 is a diagram showing the overall configuration of the image processing system in the first embodiment.
FIG. 2 is a diagram showing the hardware configuration and functional configuration of the imaging device.
FIG. 3 is a diagram showing the hardware configuration and functional configuration of the client device.
FIG. 4 is a diagram explaining the setting of a detection area and a human body size for a captured image.
FIG. 5 is a flowchart showing the operation of the image processing system in the first embodiment.
FIG. 6 is a diagram explaining the movement of a moving object and the setting of a detection area.
FIG. 7 is a diagram explaining the acquisition of the maximum and minimum sizes.
FIG. 8 is a diagram explaining the setting of the human body size.
FIG. 9 is a flowchart showing the processing for acquiring the maximum-size and minimum-size frames in the second embodiment.
FIG. 10 is a diagram showing examples of various GUIs for displaying candidate frames.
FIG. 11 is a diagram explaining the adjustment of the number of superimposed frames.

Hereinafter, an example of an embodiment of the present invention will be described in detail with reference to the drawings. The following embodiment is merely an example and is not intended to limit the scope of the present invention.

(First Embodiment)
As a first embodiment of the image processing apparatus according to the present invention, an image processing system that detects a moving object (human body) included in video acquired by an imaging device is described below as an example. In particular, a method for making the setting of the human body size in a detection area within the video more efficient is described.

<Configuration of the System and Each Device>
FIG. 1 is a diagram showing the overall configuration of the image processing system in the first embodiment.

The imaging device 110 is an imaging device, such as a network camera, that captures images. The client device 120 is an information processing device, such as a personal computer, a server device, or a tablet device, that drives the imaging device 110 and displays the images captured by the imaging device 110.

The input device 130 consists of a mouse, a keyboard, and the like, and is used for user input to the client device 120. The display device 140 consists of a display or the like, and displays the images output by the client device 120.

Here, the client device 120, the input device 130, and the display device 140 are shown as independent devices. However, the client device 120 and the display device 140 may be integrated, or the input device 130 and the display device 140 may be integrated. The client device 120, the input device 130, and the display device 140 may also all be integrated.

The network 150 connects the imaging device 110 and the client device 120 so that they can communicate with each other. The network 150 consists of, for example, a plurality of routers, switches, cables, and the like that conform to a communication standard such as that of a local network. Any network that allows communication between the imaging device 110 and the client device 120 may be used, regardless of its communication standard, scale, or configuration. For example, the network 150 may be the Internet, a wired LAN (Local Area Network), a wireless LAN, a WAN (Wide Area Network), or the like. The number of imaging devices 110 connected to the client device 120 is not limited to one and may be more than one.

FIG. 2 is a diagram showing the hardware configuration and functional configuration of the imaging device. FIG. 2(a) shows an example of the hardware configuration, and FIG. 2(b) shows an example of the functional configuration.

The imaging device 110 includes a CPU 211, a main storage device 212, an auxiliary storage device 213, a drive unit 214, an imaging unit 215, and a network I/F 216. These elements are connected via a system bus 217 so that they can communicate with one another.

The CPU 211 is a central processing unit that controls the operation of the imaging device 110. The main storage device 212 is a storage device, such as a random access memory (RAM), that serves as a work area for the CPU 211 and a temporary storage location for data. The auxiliary storage device 213 is a storage device, such as a hard disk drive (HDD), a read-only memory (ROM), or a solid-state drive (SSD), that stores various programs, various setting data, and the like. The drive unit 214 drives the imaging device 110 to change its orientation and the like, thereby changing the shooting direction and the angle of view of the imaging unit 215.

The imaging unit 215 is a functional unit for acquiring an image of a subject, and has an image sensor and an optical system. Examples of the image sensor include a CMOS (Complementary Metal-Oxide-Semiconductor) sensor and a CCD (Charge-Coupled Device). The network I/F 216 is an interface used for communication with external devices, such as the client device 120, via the network 150.

The functions and processing of the imaging device 110 described later are realized by the CPU 211 executing processing based on a program stored in the auxiliary storage device 213.

As its functional configuration, the imaging device 110 includes an imaging control unit 231, a signal processing unit 232, a drive control unit 233, and a communication control unit 234.

The imaging control unit 231 captures the surrounding environment via the imaging unit 215. The signal processing unit 232 processes the images captured by the imaging control unit 231; for example, it encodes them. For a still image, the signal processing unit 232 encodes the image using an encoding method such as JPEG (Joint Photographic Experts Group). For a moving image, the signal processing unit 232 encodes the image using an encoding method such as H.264/MPEG-4 AVC (hereinafter simply H.264) or HEVC (High Efficiency Video Coding). The signal processing unit 232 may also encode the image using an encoding method selected by the user, for example via an operation unit of the imaging device 110, from among a plurality of preset encoding methods.

The drive control unit 233 performs control, via the drive unit 214, to change the shooting direction and the angle of view of the imaging control unit 231. However, a configuration in which only one of the shooting direction and the angle of view is changed is also possible, and the shooting direction and the angle of view of the imaging control unit 231 may be fixed. The communication control unit 234 transmits the images captured by the imaging control unit 231 and processed by the signal processing unit 232 to the client device 120 via the network I/F 216. The communication control unit 234 also receives control commands for the imaging device 110 from the client device 120 via the network I/F 216.

FIG. 3 is a diagram showing the hardware configuration and functional configuration of the client device. FIG. 3(a) shows an example of the hardware configuration, and FIG. 3(b) shows an example of the functional configuration.

The client device 120 includes a CPU 221, a main storage device 222, an auxiliary storage device 223, an input I/F 224, an output I/F 225, and a network I/F 226. These elements are connected via a system bus 227 so that they can communicate with one another.

The CPU 221 is a central processing unit that controls the operation of the client device 120. The main storage device 222 is a storage device, such as a RAM, that serves as a work area for the CPU 221 and a temporary storage location for data. The auxiliary storage device 223 is a storage device, such as an HDD, a ROM, or an SSD, that stores various programs, various setting data, and the like. The input I/F 224 is an interface used to accept input from the input device 130 and the like. The output I/F 225 is an interface used to output information to the display device 140 and the like. The network I/F 226 is an interface used for communication with external devices, such as the imaging device 110, via the network 150.

The functions and processing of the client device 120 described later are realized by the CPU 221 executing processing based on a program stored in the auxiliary storage device 223.

As its functional configuration, the client device 120 includes an input information acquisition unit 241, a communication control unit 242, an image acquisition unit 243, a detection unit 244, a video recording unit 245, a moving object acquisition unit 246, a display image generation unit 247 (drawing unit), and a display control unit 248.

The input information acquisition unit 241 accepts input from the user via the input device 130. The communication control unit 242 receives images transmitted from the imaging device 110 via the network 150, and transmits control commands to the imaging device 110 via the network 150. The image acquisition unit 243 acquires, via the communication control unit 242, an image captured by the imaging device 110 as the image to be subjected to subject detection. The image acquisition unit 243 may also acquire an image stored in the auxiliary storage device 223 as the image to be subjected to subject detection. Here, an image may be a moving image (a plurality of frame images) or a still image.

The detection unit 244 performs subject detection based on image features on the image acquired by the image acquisition unit 243. The video recording unit 245 records the real-time video being captured by the imaging device 110 in response to input from the user. The moving object acquisition unit 246 cuts out moving objects from the video acquired by the video recording unit 245 by moving object differencing. However, the processing is not limited to this video and may also be performed on video recorded in advance and stored in the auxiliary storage device 223.

The display image generation unit 247 generates an image in which the moving object frame images acquired by the moving object acquisition unit 246 at predetermined intervals are superimposed. The display image generation unit 247 also extracts the frames contained in the detection area, and from those frames extracts the largest and the smallest. It then generates an image in which the acquired maximum-size and minimum-size frames are highlighted and superimposed on the background image. The display control unit 248 outputs subject detection results and the results of superimposing moving object frames to the display device 140 in accordance with instructions from the CPU 221.

Although a form in which the client device 120 acquires a captured image from the imaging device 110 and processes it is described here, the processing may instead be performed within the imaging device 110.

<Operation of the System>
As described above, the client device 120 performs human body detection on the captured image acquired from the imaging device 110. Specifically, one or more detection areas are set, the maximum and minimum human body sizes are set for each detection area, and human body detection is performed according to those settings.
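The role of these per-area settings can be illustrated with a minimal Python sketch. It is not taken from the publication: the `DetectionArea` class, the `count_in_area` function, the use of box height in pixels as the "size", and the centre-point containment test are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class DetectionArea:
    """Hypothetical per-area setting: a rectangle plus size limits (pixels)."""
    x: int
    y: int
    w: int
    h: int
    min_size: int  # smallest accepted body height (assumed unit: pixels)
    max_size: int  # largest accepted body height (assumed unit: pixels)


def count_in_area(area, candidates):
    """Count candidate boxes (x, y, w, h) that lie in the area and fit its size limits."""
    count = 0
    for (x, y, w, h) in candidates:
        cx, cy = x + w / 2, y + h / 2  # centre of the candidate box
        inside = (area.x <= cx < area.x + area.w
                  and area.y <= cy < area.y + area.h)
        if inside and area.min_size <= h <= area.max_size:
            count += 1
    return count
```

Restricting candidates by area and by size in this way is what allows the detector to skip implausible regions and scales, which is the speed-up the description refers to.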

FIG. 4 is a diagram explaining the setting of a detection area and a human body size for a captured image. Specifically, a reference captured image is acquired, and the detection area and the human body size are set with reference to that image.

FIG. 4(a) shows the imaging device 200 capturing subjects. Subjects 210, 211, 220, and 221 represent four people. That is, FIG. 4(a) shows a scene in which subjects 210 and 211 are relatively close to the imaging device 200 and subjects 220 and 221 are relatively far from it.

FIG. 4(b) shows an example of the display image captured by the imaging device 200 in the state shown in FIG. 4(a) and displayed on the client device. Objects 240, 241, 250, and 251 on the display image 230 are images of the human bodies that appear in the captured image. Object 240 is the captured image of subject 210 in FIG. 4(a); likewise, object 241 is the captured image of subject 211, object 250 that of subject 220, and object 251 that of subject 221.

The user sets the detection area while viewing the display image 230 displayed on the client device. For example, the detection area is set by drawing a frame that encloses the upper bodies of the human body images 240, 241, 250, and 251. The upper body is included because this is convenient for the matching process, which is based on feature images of the human body, at detection time.

Rectangular regions 260 and 270 in FIG. 4(b) are detection areas set by the user. The size of a detection area can be changed freely according to the extent of the region in which people are to be counted, so the number of subjects required may vary with the size of the detection area. Furthermore, although two detection areas are specified in FIG. 4(b), the number can be increased or decreased according to the number of regions to be counted.

FIG. 4(c) shows a user interface (UI) for setting the human body size. Through this UI, the user sets the human body size for each detection area.

Human body models 280, 281, 290, and 291 are models for setting the human body size. The user can set the human body size by manipulating the human body models 280, 281, 290, and 291 with the mouse.

For example, for detection area 260, the moving object size (human body size) can be set by manipulating human body models 280 and 281. Here it is assumed that human body model 280 is manipulated to set the maximum human body size and human body model 281 to set the minimum human body size. Similarly, for detection area 270, the maximum and minimum human body sizes can be set by manipulating human body models 290 and 291.

However, the method described above requires people to stand in the real space, so labor costs increase. In addition, the user must go to the trouble of visually finding the largest and smallest human bodies in each detection area. The first embodiment therefore describes a method of acquiring and highlighting the maximum and minimum human body sizes within a detection area in order to make detection area setting and human body size setting more efficient.

FIG. 5 is a flowchart showing the operation of the image processing system in the first embodiment. That is, it shows the flow of processing for extracting and highlighting the maximum and minimum sizes of the human body.

In S301, the imaging device 110 captures a moving person, and the resulting moving image is stored in the memory 223 of the client device 120.

FIG. 6 is a diagram explaining the movement of a moving object and the setting of a detection area. FIG. 6(a) shows an example of recorded video captured by the imaging device 110. The video shows a human body 501, the subject, moving through the space over time in the direction indicated by the arrow.

In S302, the moving object acquisition unit 246 uses the recorded video acquired in S301 to cut out moving object images (frames containing the moving subject) at predetermined time intervals by moving object differencing. The display image generation unit 247 then creates an image in which a frame 512 is superimposed on the background image.
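The cut-out step in S302 can be sketched as follows. The publication does not specify the differencing algorithm, so this example assumes simple per-pixel differencing against a static background on grayscale images stored as lists of rows; the function names `moving_object_box` and `sample_boxes` and the `threshold` parameter are hypothetical.

```python
def moving_object_box(background, frame, threshold=30):
    """Return the minimal bounding rectangle (x, y, w, h) of the pixels that
    differ from the background by more than `threshold`, or None if none do."""
    xs, ys = [], []
    for y, (bg_row, row) in enumerate(zip(background, frame)):
        for x, (bg_pixel, pixel) in enumerate(zip(bg_row, row)):
            if abs(pixel - bg_pixel) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1)


def sample_boxes(background, frames, step):
    """Cut out one moving-object rectangle every `step` video frames."""
    boxes = []
    for index in range(0, len(frames), step):
        box = moving_object_box(background, frames[index])
        if box is not None:
            boxes.append((index, box))
    return boxes
```

A real implementation would more likely rely on an image processing library and handle noise and multiple moving objects; this sketch only shows the idea of sampling at a fixed interval and keeping the minimal rectangle around the changed pixels.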

FIG. 6(b) shows frames 512 acquired at predetermined time intervals and all superimposed as a frame group 513. A plurality of frames is superimposed on a single image to improve visibility for the user when, at detection area setting time, moving object frames of similar size are to be grouped into one detection area. Finally, the display control unit 248 displays the superimposed image created by the display image generation unit 247 on the display device 140. On the display device 140, each moving object frame 512 is displayed as the smallest rectangle that contains the moving object. The frames are superimposed in chronological order, as shown in FIG. 6(b), so when a moving object moves from the back of the scene toward the front, each newer frame is displayed overlapping in front of the older ones. Each frame is handled as a PNG (Portable Network Graphics) image so that its transmittance can be changed, as described later.
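The chronological superimposition described for FIG. 6(b) can be sketched as pasting the cut-out images onto a copy of the background, oldest first, so that newer frames end up in front. The data layout (grayscale lists of rows, crops given as `(timestamp, x, y, pixels)` tuples that fit inside the canvas) and the name `superimpose` are assumptions for illustration.

```python
def superimpose(background, crops):
    """Paste crops onto a copy of the background in chronological order.

    crops: list of (timestamp, x, y, pixels), where pixels is a list of rows.
    Later crops overwrite earlier ones, so the newest frame appears in front.
    Crops are assumed to lie fully inside the background.
    """
    canvas = [row[:] for row in background]  # leave the background untouched
    for _, x0, y0, pixels in sorted(crops, key=lambda crop: crop[0]):
        for dy, row in enumerate(pixels):
            for dx, value in enumerate(row):
                canvas[y0 + dy][x0 + dx] = value
    return canvas
```

Sorting by timestamp inside the function means the caller does not need to supply the crops in order.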

In S303, the input information acquisition unit 241 acquires the detection area information set by the user. That is, the user manually sets a detection area while viewing the superimposed image obtained in S302, and the input information acquisition unit 241 accepts the setting.

FIG. 6(c) shows the state in which the user has set a detection area 601. The detection area is set as described above. Although only one detection area is specified here, more than one may be specified; that is, a plurality of detection areas can be set and analyzed individually. Each detection area can also be divided into a plurality of small regions. In that case, even if the small regions are scattered, as long as they belong to the same detection area, analysis is performed with the settings applied to that detection area.

In S304, the display image generation unit 247 extracts the frame images in which the moving object (human body) is contained in the detection area set by the user in S303. The extraction is performed, for example, by determining whether the upper body in the frame falls within the range of the detection area. FIG. 6(d) shows the frame images contained in the detection area 601 being extracted. Here, frames 611, 612, and 613, which lie within the detection area 601, are extracted. Frame images outside the detection area are thereby excluded from subsequent processing.
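One possible reading of the S304 test is sketched below. The publication only says that the upper body should fall within the detection area; treating the upper half of each bounding rectangle as the "upper body" and requiring it to lie entirely inside a rectangular area are assumptions, as are the function names.

```python
def upper_body_in_area(box, area):
    """True if the upper half of box (x, y, w, h) lies inside area (x, y, w, h).

    y grows downward, so the upper half spans y .. y + h / 2.
    """
    x, y, w, h = box
    ax, ay, aw, ah = area
    upper_bottom = y + h / 2
    return (ax <= x and x + w <= ax + aw
            and ay <= y and upper_bottom <= ay + ah)


def frames_in_area(boxes, area):
    """Keep only the moving-object rectangles whose upper half is in the area."""
    return [box for box in boxes if upper_body_in_area(box, area)]
```

For example, with `area = (0, 0, 100, 50)`, a box `(10, 10, 20, 60)` is kept because its upper half ends at y = 40, while `(10, 40, 20, 60)` is dropped because its upper half extends to y = 70.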

In S305, the display image generation unit 247 extracts the maximum-size and minimum-size frames from the detection area. Specifically, since each frame in the detection area has already been cut out as a rectangle, the largest rectangular frame is extracted as the maximum size and the smallest rectangular frame as the minimum size. Here, the frame 611 is extracted as the maximum-size frame and the frame 613 as the minimum-size frame.
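The selection in S305 can be sketched as below. The specification does not state how rectangle size is measured, so this sketch assumes the pixel area (width times height). The tuple layout `(x, y, w, h)` and the function name are hypothetical.

```python
def select_extremes(frames):
    """Return (largest, smallest) frame by rectangle area.

    Each frame is an (x, y, w, h) tuple. Using area as the size
    measure is an assumption, not something the specification fixes.
    """
    if not frames:
        raise ValueError("no frames in the detection area")

    def area(frame):
        _, _, w, h = frame
        return w * h

    return max(frames, key=area), min(frames, key=area)
```

A caller would pass the frames that survived the S304 extraction and receive the two frames to be highlighted in S306.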

In S306, the display image generation unit 247 generates an image in which the highlighted frames are superimposed on the background image. The display control unit 248 then displays the image generated by the display image generation unit 247 on the display device 140.

FIG. 7 is a diagram for explaining acquisition of the maximum and minimum sizes. It shows an example image in which, among the frame images belonging to the detection area 601, the maximum-size frame 701 and the minimum-size frame 702 are highlighted.

Here, the frames are highlighted by lowering the transmittance of the maximum-size frame 701 and the minimum-size frame 702 and raising the transmittance of the other frames. The user can set arbitrary transmittance values for the maximum-size and minimum-size frames and for the other frames. For example, to focus only on the maximum-size and minimum-size frames, the transmittance of those frames may be set to "0" and the transmittance of the other frames to "100".
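One way to read the transmittance setting is as ordinary alpha blending of a frame pixel over the background pixel. The sketch below assumes that transmittance is a percentage from 0 (fully opaque frame) to 100 (frame invisible) and that blending is linear; neither is stated in the specification, and the function names are hypothetical.

```python
def opacity(transmittance_percent):
    """Convert a transmittance in [0, 100] to an opacity in [0.0, 1.0]."""
    if not 0 <= transmittance_percent <= 100:
        raise ValueError("transmittance must be between 0 and 100")
    return 1.0 - transmittance_percent / 100.0


def blend_pixel(background, frame, transmittance_percent):
    """Blend one RGB frame pixel over one RGB background pixel."""
    a = opacity(transmittance_percent)
    return tuple(round(a * f + (1.0 - a) * b)
                 for f, b in zip(frame, background))
```

With these assumptions, the example in the text corresponds to calling `blend_pixel` with 0 for the maximum-size and minimum-size frames and with 100 for all other frames, so that only the two extreme frames remain visible over the background.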

Methods other than the transmittance control described above can also be used to highlight frames. For example, the maximum-size frame 701 and the minimum-size frame 702 can be highlighted by thickening their outlines. In addition, since the frames are superimposed in chronological order as described above, the maximum-size and minimum-size frames can also be highlighted by displaying them in the foreground.

In S307, the display image generation unit 247 generates a human body size setting screen in which human body models for setting the human body size are superimposed on the background image. This process is executed, for example, when the user presses a human body size setting button (not shown).

FIG. 8 is a diagram for explaining the setting of the human body size. It shows an example of the human body size setting screen generated by the display image generation unit 247. The human body models 801 and 802 are models imitating the upper half of a human body. As described above, the user can set the human body size by enlarging or reducing the human body models 801 and 802 with the mouse. The method of setting the human body size is not limited to this; for example, the size may be adjusted by selecting a human body model and using the Up and Down buttons on the keyboard. It is also possible to determine the human body size automatically from the maximum-size and minimum-size frames acquired in S305 and to superimpose the resulting models near the detection area. When the user sets the human body size, the input information acquisition unit 241 acquires the maximum and minimum human body sizes set by the user.

In S308, the input information acquisition unit 241 determines whether there is another detection area. If no other detection area has been set, the process ends; if another detection area has been set, the process returns to S303 and is repeated.

As described above, according to the first embodiment, using the superimposed video of moving-object frames output to the display device 140 makes setting the detection area more efficient. In addition, highlighting the maximum-size and minimum-size frames included in the detection area in the superimposed video makes setting the human body size more efficient. That is, the manpower and effort previously required to set the human body size can be greatly reduced.

(Second Embodiment)
In the second embodiment, the acquisition processing (S305) of the maximum-size and minimum-size frames is described in more detail. Specifically, the processing performed when the video contains an object other than a human body and that object is acquired as the maximum-size or minimum-size frame is described. The configuration of the system and of each device is the same as in the first embodiment (FIGS. 1 to 3), and its description is therefore omitted.

<System operation>
The overall operation of the system is the same as in the first embodiment (FIG. 5), and a detailed description is therefore omitted. As noted above, the second embodiment differs from the first embodiment in the acquisition processing (S305) of the maximum-size and minimum-size frames.

When the imaging device 110 captures moving humans and subjects other than human bodies and the result is stored as recorded video in the memory 223 of the client device 120, the same processing as in the first embodiment is started. S301 to S304 are the same as in the first embodiment. However, S304 differs from the first embodiment in that objects other than human bodies are also extracted.

FIG. 10 shows examples of various GUIs that display candidate frames. FIG. 10(a) shows an example of the superimposed video of the frames extracted in S304 in the second embodiment. The video includes the human bodies 611 to 613 and an image of a bird, which is a subject 901 other than a human body.

FIG. 9 is a flowchart showing the acquisition processing of the maximum-size and minimum-size frames in the second embodiment.

In S401, the display image generation unit 247 acquires the minimum-size frame as a candidate for the minimum human body size and generates a maximum/minimum size confirmation screen 920. The display control unit 248 then displays the maximum/minimum size confirmation screen 920 on the display device 140. The maximum/minimum size confirmation screen 920 is a UI that lets the user judge whether a frame contains a human body. It is displayed, for example, adjacent to the superimposed video of the frames extracted in S304.

FIG. 10(b) shows an example of a UI in which the superimposed video of the frames extracted in S304 and the maximum/minimum size confirmation screen 920 are displayed. The maximum/minimum size confirmation screen 920 consists of a minimum human body size candidate frame 921 and a maximum human body size candidate frame 922.

The minimum human body size candidate frame 921 displays the acquired minimum-size frame, and the maximum human body size candidate frame 922 displays the acquired maximum-size frame. Here, the subject 901 is displayed as the object 923 in the minimum human body size candidate frame 921 as the minimum frame candidate.

In S402, the display image generation unit 247 acquires the user's judgment as to whether the frame contains a human body. Specifically, the user visually checks the image displayed in the minimum human body size candidate frame 921. If the user judges that the displayed image contains a human body, the user presses a confirm button (not shown). If the user judges that the displayed image does not contain a human body, the user presses the arrow button 924.

In S403, the display image generation unit 247 analyzes the user's judgment from S402. If the analysis shows that the user pressed the arrow button 924, it is determined that an object has been erroneously detected, and the process proceeds to S404. If it is determined that the confirm button was pressed, it is determined that no erroneous detection has occurred, and the process proceeds to S405.

In S404, the display image generation unit 247 displays the next candidate, that is, the next-smallest subject after the currently selected subject 911 (here, the subject 931). FIG. 10(c) shows the state in which the next candidate is displayed. Here, a new minimum human body size candidate 943 is displayed in the minimum human body size candidate frame 921. The process then returns to S402, and the user judges whether the frame contains a human body. Through this process, frames that do not contain a human body are excluded from the minimum human body size candidates.

In S405, the display image generation unit 247 acquires the maximum-size frame as a candidate for the maximum human body size and generates the maximum/minimum size confirmation screen 920. The acquisition processing of the maximum-size frame in S405 to S408 is the same as the acquisition processing of the minimum-size frame (S401 to S404), and its description is therefore omitted.

FIG. 10(d) shows an example of a UI in which the superimposed video of the frames extracted in S304 and the maximum/minimum size confirmation screen 920 are displayed. Here, the subject 611 is displayed as the object 964 in the maximum human body size candidate frame 922 as the maximum frame candidate.

With the processing described above, the maximum and minimum sizes of the human body can be acquired appropriately even when the video contains objects other than human bodies. Thereafter, the process proceeds to S306. That is, the display image generation unit 247 generates an image in which the highlighted frames are superimposed on the background image, and the display control unit 248 displays the superimposed image on the display device 140.
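The confirmation loop of S401 to S408 can be sketched as follows. This is an illustrative outline only: the callback `is_human` stands in for the user's response (pressing the confirm button returns `True`, pressing the arrow button returns `False`), and the function names and the `size` key function are hypothetical.

```python
def confirm_candidate(candidates, is_human):
    """Offer candidates in order; return the first one the user confirms.

    Returns None if the user rejects every candidate.
    """
    for frame in candidates:
        if is_human(frame):   # user pressed the confirm button
            return frame
        # user pressed the arrow button: move on to the next candidate
    return None


def confirm_min_and_max(frames, size, is_human):
    """Return (minimum, maximum) frames confirmed to contain a human body."""
    ascending = sorted(frames, key=size)
    smallest = confirm_candidate(ascending, is_human)            # S401 to S404
    largest = confirm_candidate(reversed(ascending), is_human)   # S405 to S408
    return smallest, largest
```

In the example of FIG. 10, the bird would be offered first as the minimum-size candidate and rejected, after which the next-smallest frame is offered.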

As described above, according to the second embodiment, the detection area and the human body size can be set efficiently even when the video contains objects other than human bodies.

(Modification)
In the above description, the moving object acquisition unit 246 superimposes and displays a plurality of frames at a predetermined time interval in S302, but other display methods may be used. For example, a display method in which the time interval between frames can be changed may be used.

FIG. 11 is a diagram for explaining adjustment of the number of superimposed frames. Specifically, it shows the relationship between the amount of movement of a moving object (here, a human body) per unit time and the group of frames displayed accordingly. The amount of movement per unit time is calculated from the amount of pixel movement per fixed time.

The frame display video 1001 is an example of frame superimposition when the amount of movement of the moving object per unit time is small. In this case, too many frames are superimposed, which makes setting the detection area and the human body size cumbersome for the user. The frame display video 1021 is an example of frame superimposition when the amount of movement of the moving object per unit time is large. In this case, it is difficult to identify the detection area in which people are to be counted, and setting the detection area and the human body size is difficult. It is therefore desirable to display a video such as the frame display video 1011, in which the amount of movement per unit time is appropriate and the frames are arranged so that each is easy to see.

The moving object acquisition unit 246 therefore adjusts the number of frames to be displayed according to the moving speed of the moving object. The moving speed is calculated as the amount of movement in pixels per fixed time. For example, in the situation shown in the frame display video 1001, the number of frames is large, so frames are thinned out. In the situation shown in the frame display video 1021, on the other hand, the number of frames is small and the spacing between frames is wide, so the number of frames is increased. These adjustments make it possible to provide a superimposed display better suited to the user's visibility.
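A possible way to realize this adjustment is to derive a sampling stride from the measured speed. The sketch below assumes the speed is given in pixels per recorded frame and introduces a hypothetical parameter `target_spacing_px` for the desired on-screen distance between consecutive superimposed frames; the specification defines neither the parameter nor the formula.

```python
def sampling_stride(pixels_per_frame, target_spacing_px=80.0):
    """Number of recorded frames to step over between displayed frames.

    A slow object yields a large stride (frames are thinned out); a fast
    object yields a stride of 1 (every recorded frame is used).
    """
    if pixels_per_frame <= 0:
        raise ValueError("pixels_per_frame must be positive")
    return max(1, round(target_spacing_px / pixels_per_frame))


def thin_frames(frames, pixels_per_frame, target_spacing_px=80.0):
    """Select the frames to superimpose for the given moving speed."""
    return frames[::sampling_stride(pixels_per_frame, target_spacing_px)]
```

Note that in this sketch the number of frames for a fast object can only be raised up to the recorded frame rate, since the stride cannot go below 1.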

The moving object acquisition unit 246 may be configured to perform this adjustment automatically or to accept an adjustment instruction from the user. For example, the display image generation unit 247 may display a display interval setting screen (not shown) so that the user can set the frame display interval manually.

(Other embodiments)
The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

110 imaging device; 120 client device; 130 input device; 140 display device; 150 network

Claims

An image processing apparatus for setting a detection area for detecting a predetermined moving object in an image captured by an imaging device and a moving object size of the predetermined moving object in the detection area, the image processing apparatus comprising:
acquisition means for acquiring a moving image from the imaging device;
first extraction means for extracting a moving object image from each of a plurality of images included in the moving image;
generation means for generating a superimposed image in which a plurality of moving object images extracted by the first extraction means are superimposed and displayed;
first setting means for setting the detection area based on the superimposed image;
second extraction means for extracting, from among the plurality of moving object images extracted by the first extraction means, the moving object images included in the detection area;
determination means for determining a moving object image of maximum size and a moving object image of minimum size among the moving object images extracted by the second extraction means; and
second setting means for setting the moving object size in the detection area based on the moving object images of the maximum size and the minimum size determined by the determination means.

The image processing apparatus according to claim 1, wherein the first setting means includes first display means that displays the superimposed image and displays a user interface (UI) for receiving a setting of the detection area from a user.

The image processing apparatus according to claim 1, wherein the second setting means includes second display means that displays the moving object images of the maximum size and the minimum size determined by the determination means and displays a user interface (UI) for receiving a setting of the moving object size from a user.

The image processing apparatus according to claim 3, wherein the second display means highlights the moving object images of the maximum size and the minimum size determined by the determination means.

The image processing apparatus according to claim 4, wherein the second display means performs the highlighting by making the transmittance of the moving object images of the maximum size and the minimum size determined by the determination means lower than the transmittance of the other moving object images.

The image processing apparatus according to any one of claims 1 to 5, further comprising:
judgment means for judging whether the moving object images of the maximum size and the minimum size determined by the determination means include the predetermined moving object; and
excluding means for excluding, from the moving object images of the maximum size and the minimum size, a moving object image judged by the judgment means not to include the predetermined moving object.

The image processing apparatus according to claim 1, further comprising control means for controlling the number of moving object images superimposed in the superimposed image.

The image processing apparatus according to claim 7, wherein the control means controls the number of superimposed moving object images based on an amount of movement of a moving object per unit time in the moving image.

The image processing apparatus according to any one of claims 1 to 8, wherein the predetermined moving object is a human body.

A control method of an image processing apparatus for setting a detection area for detecting a predetermined moving object in an image captured by an imaging device and a moving object size of the predetermined moving object in the detection area, the control method comprising:
an acquisition step of acquiring a moving image from the imaging device;
a first extraction step of extracting a moving object image from each of a plurality of images included in the moving image;
a generation step of generating a superimposed image in which a plurality of moving object images extracted in the first extraction step are superimposed and displayed;
a first setting step of setting the detection area based on the superimposed image;
a second extraction step of extracting, from among the plurality of moving object images extracted in the first extraction step, the moving object images included in the detection area;
a determination step of determining a moving object image of maximum size and a moving object image of minimum size among the moving object images extracted in the second extraction step; and
a second setting step of setting the moving object size in the detection area based on the moving object images of the maximum size and the minimum size determined in the determination step.

A program for causing a computer to function as each means of the image processing apparatus according to any one of claims 1 to 9.