JP7418074B2

JP7418074B2 - Image processing device, image processing method, and program

Info

Publication number: JP7418074B2
Application number: JP2019030373A
Authority: JP
Inventors: 稔日下部
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2018-12-26
Filing date: 2019-02-22
Publication date: 2024-01-19
Anticipated expiration: 2039-02-22
Also published as: JP2020107297A

Description

The present invention relates to image processing technology.

In recent years, surveillance cameras have become widely installed. Systems that use such cameras are useful because, by identifying the movements of the people captured, they support crime prevention, marketing analysis, and service improvement. At the same time, it is important that such systems protect the privacy of the people captured.

Patent Document 1 discloses a technique that superimposes, on a background image captured in advance, a human-shaped icon whose color differs depending on whether the person detected from the image is moving.

Patent Document 1: International Publication No. 2017/141454

However, in Patent Document 1, when an object that is not a person is detected as a person, a human-shaped icon corresponding to that object may be superimposed on the background image. In other words, a false detection can produce an inappropriate output image in which a human-shaped icon corresponding to a non-person object is superimposed on the background image.

Therefore, an object of the present invention is to provide a technique capable of generating an image in which an appropriate region of a captured image is concealed.

An image processing device according to one aspect of the present invention includes: extraction means for extracting a foreground region from a captured input image; detection means for detecting a specific object from the input image; setting means for setting, based on the position of the specific object detected by the detection means, a specific area at least part of whose shape is curved; generation means for generating an output image in which a silhouette image, obtained by abstracting the foreground region within the specific area set by the setting means, is superimposed on a predetermined image corresponding to the input image, while the foreground region outside the specific area is not superimposed on the predetermined image; and determination means for determining a crowded area in the input image. For a person located in the crowded area determined by the determination means, the setting means sets a specific area whose horizontal width is reduced by a predetermined factor.

According to the present invention, it is possible to provide a technique capable of generating an image in which an appropriate region of a captured image is concealed.

FIG. 1 is a diagram showing an example of a system configuration.
FIG. 2 is a diagram showing functional blocks of an image processing device.
FIG. 3 is a flowchart showing the flow of processing for generating an output image.
FIG. 4 is a diagram for explaining processing for generating an output image.
FIG. 5 is a diagram for explaining the shape of a person area.
FIG. 6 is a diagram for explaining processing for generating an output image.
FIG. 7 is a diagram showing functional blocks of an image processing device.
FIG. 8 is a diagram for explaining processing for generating an output image.
FIG. 9 is a diagram for explaining processing for generating an output image.
FIG. 10 is a diagram showing functional blocks of an image processing device.
FIG. 11 is a flowchart showing the flow of processing for generating an output image.
FIG. 12 is a diagram for explaining processing for generating an output image.
FIG. 13 is a flowchart showing the flow of processing for generating an output image.
FIG. 14 is a diagram showing the hardware configuration of each device.

Embodiments of the present invention will be described below with reference to the accompanying drawings. The configurations shown in the following embodiments are merely examples, and the invention is not limited to the illustrated configurations.

(Embodiment 1)
FIG. 1 is a diagram showing the system configuration in this embodiment. The system in this embodiment includes an image processing device 100, an imaging device 110, a recording device 120, and a display 130.

The image processing device 100, the imaging device 110, and the recording device 120 are connected to one another via a network 140. The network 140 is realized, for example, by a plurality of routers, switches, cables, and the like that comply with a communication standard such as Ethernet (registered trademark).

Note that the network 140 may be realized by the Internet, a wired LAN (Local Area Network), a wireless LAN, a WAN (Wide Area Network), or the like.

The image processing device 100 is realized, for example, by a personal computer or the like on which a program for realizing the image processing functions described below is installed.

The imaging device 110 is a device that captures images. The imaging device 110 associates image data based on a captured image, information on the time at which the image was captured, and identification information that identifies the imaging device 110, and transmits them via the network 140 to external devices such as the image processing device 100 and the recording device 120. Note that the system according to this embodiment has one imaging device 110, but it may have a plurality of them.

The recording device 120 records, in association with one another, the image data of an image captured by the imaging device 110, information on the time at which the image was captured, and identification information that identifies the imaging device 110. Then, in response to a request from the image processing device 100, the recording device 120 transmits the recorded data (images, identification information, and so on) to the image processing device 100.

The display 130 is composed of an LCD (Liquid Crystal Display) or the like, and displays the results of image processing by the image processing device 100, images captured by the imaging device 110, and the like. The display 130 is connected to the image processing device 100 via a display cable compliant with a communication standard such as HDMI (registered trademark) (High Definition Multimedia Interface).

The display 130 also functions as display means, and displays images captured by the imaging device 110, setting screens for the image processing described later, and the like. Note that at least two, or all, of the display 130, the image processing device 100, and the recording device 120 may be provided in a single housing.

Note that the results of image processing by the image processing device 100 and the images captured by the imaging device 110 need not be displayed only on the display 130 connected to the image processing device 100 via the display cable. They may also be displayed on a display of an external device, for example the display of a mobile device such as a smartphone or tablet terminal connected via the network 140.

Next, the image processing of the image processing device 100 according to this embodiment will be described with reference to the functional blocks of the image processing device 100 shown in FIG. 2.

In this embodiment, each function shown in FIG. 2 is realized as follows using a ROM (Read Only Memory) 1420 and a CPU (Central Processing Unit) 1400, which are described later with reference to FIG. 14. Each function shown in FIG. 2 is realized by the CPU 1400 of the image processing device 100 executing a computer program stored in the ROM 1420 of the image processing device 100. In the following description, the specific object is assumed to be a person.

The communication unit 200 can be realized by an I/F (Interface) 1440, described later with reference to FIG. 14, and communicates with the imaging device 110 and the recording device 120 via the network 140. For example, the communication unit 200 receives image data of images captured by the imaging device 110 and transmits control commands for controlling the imaging device 110 to the imaging device 110. The control commands include, for example, a command that instructs the imaging device 110 to capture an image.

The storage unit 201 can be realized by a RAM (Random Access Memory) 1410, an HDD (Hard Disk Drive) 1430, and the like, described later with reference to FIG. 14, and stores information and data related to image processing by the image processing device 100. For example, the storage unit 201 stores information on the positions of persons detected from an image.

The display control unit 202 causes the display 130 to display images captured by the imaging device 110, a setting screen for making settings related to the image processing according to this embodiment, information indicating the results of image processing, and the like. For example, the display control unit 202 causes the display 130 to display an output image generated by the generation unit 207 described later. The operation reception unit 203 receives operations performed by a user via an input device (not shown) such as a keyboard or a mouse.

The extraction unit 204 extracts a foreground region from a captured input image. The extraction unit 204 extracts the foreground region by comparing the input image received by the communication unit 200 with a background image. The background image is assumed to be an image that does not contain the specific object (a person in this embodiment) whose privacy is to be protected, that is, the object to be concealed.

For example, the extraction unit 204 generates a binary image in which the foreground region, that is, the region that differs from the background image, is set to "1" and all other regions are set to "0". In this case, the extraction unit 204 can generate the binary image by, for example, calculating a difference value between each pixel of the background image and the corresponding pixel of the input image, setting "1" (foreground) for pixels whose difference value is greater than or equal to a threshold, and setting "0" for pixels whose difference value is less than the threshold. Note that the extraction unit 204 is not limited to background subtraction and may identify the foreground region by other methods.
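As a non-limiting illustration, the thresholded background difference described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the function name `extract_foreground`, the default threshold of 30, and the use of the largest per-channel difference for color images are all hypothetical choices.

```python
import numpy as np

def extract_foreground(input_image, background_image, threshold=30):
    """Return a binary image: 1 where the input differs from the background.

    Hypothetical helper. The threshold value and the per-channel maximum
    are assumptions made for this sketch.
    """
    # Use a signed type so that the subtraction of uint8 pixels cannot wrap.
    diff = np.abs(input_image.astype(np.int32) - background_image.astype(np.int32))
    if diff.ndim == 3:
        # Color image: take the largest difference over the channels.
        diff = diff.max(axis=2)
    # "1" for difference >= threshold (foreground), "0" otherwise.
    return (diff >= threshold).astype(np.uint8)
```

A pixel whose difference equals the threshold is treated as foreground, matching the "greater than or equal to" condition in the text.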

The detection unit 205 detects a specific object from an image. For example, the detection unit 205 detects the specific object from the image using a matching pattern (dictionary). When detecting a person from an image, improved detection accuracy can be expected by using both a matching pattern for a person facing the front and a matching pattern for a person facing sideways. For example, the detection unit 205 may hold a matching pattern for matching against images of a human body facing the front (or back) and a matching pattern for matching against images of a person facing sideways, and use both according to the installation state of the imaging device 110 or a designation by the user.

Matching patterns for other angles, such as from an oblique direction or from above, may also be prepared. Furthermore, when detecting a person, it is not always necessary to prepare a matching pattern (dictionary) representing the features of the whole body; matching patterns may be prepared for parts of a person such as the upper body, lower body, head, face, or feet. Note that the detection unit 205 only needs to have a function of detecting a person as the specific object from an image, and is not limited to pattern matching processing.

In the following description, the detection unit 205 in this embodiment is assumed to detect the upper body of a person from an image using a matching pattern for the upper body. The detection unit 205 then identifies the image area of the person's whole body from the area of the image (image area) showing the detected upper body. For example, the detection unit 205 identifies, as the image area of the person's whole body, an image area obtained by enlarging the vertical size of the upper-body image area downward in the vertical direction of the image by a predetermined factor (for example, a factor of 2). The position of a person detected by the detection unit 205 is assumed to be the coordinates of the centroid of the person's whole-body image area, with the upper-left corner of the image as the origin.
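As a non-limiting illustration, the derivation of the whole-body image area and the person position from a detected upper-body area can be sketched as follows. This is a minimal sketch, assuming the detected areas are axis-aligned rectangles; the names `Box`, `full_body_box`, and `person_position` are hypothetical and do not appear in the embodiment.

```python
from typing import NamedTuple, Tuple

class Box(NamedTuple):
    """Axis-aligned rectangle; origin at the upper-left corner of the image."""
    left: float
    top: float
    width: float
    height: float

def full_body_box(upper_body: Box, scale: float = 2.0) -> Box:
    # Keep the top edge fixed and extend the height downward by `scale`.
    return Box(upper_body.left, upper_body.top,
               upper_body.width, upper_body.height * scale)

def person_position(body: Box) -> Tuple[float, float]:
    # The centroid of a rectangle is its center point (x, y).
    return (body.left + body.width / 2, body.top + body.height / 2)
```

For example, an upper-body area at (10, 20) with width 40 and height 50 yields a whole-body area of height 100 and a person position of (30, 70).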

Based on the position in the image of the specific object detected by the detection unit 205, the setting unit 206 sets an area (specific area) having a shape corresponding to that specific object. The setting unit 206 in this embodiment sets a specific area, which is an area having a shape corresponding to a person, based on the position of the person in the image detected by the detection unit 205. For example, the setting unit 206 sets the specific area so as to surround the whole-body image area of the person detected by the detection unit 205. In the following description, a person area refers to the specific area set for a detected person.

The generation unit 207 generates a silhouette image by abstracting the foreground region within the person area set by the setting unit 206. In other words, of the foreground regions extracted by the extraction unit 204, the generation unit 207 excludes those outside the specific area set by the setting unit 206 from the targets of concealment (abstraction). The generation unit 207 then generates a silhouette image in which the foreground region within the person area set by the setting unit 206 is abstracted (concealed), for example by filling it with an arbitrary color (RGB value). The silhouette image obtained by abstracting the foreground region within the person area may also be, for example, an image to which a texture is applied, a mosaic image produced by mosaic processing, or a blurred image produced by blurring processing.

The generation unit 207 then generates an output image in which the silhouette image, obtained by abstracting the foreground region within the person area set by the setting unit 206, is superimposed on a predetermined image. The generation unit 207 in this embodiment superimposes the silhouette image on the background image, but it may instead superimpose it on a display image, that is, an image prepared in advance for display purposes, or on the image (input image) received by the communication unit 200.

Next, the image processing of the image processing device 100 according to this embodiment will be described in more detail with reference to FIGS. 3 and 4. FIG. 3 is a flowchart showing the flow of image processing by the image processing device 100 according to this embodiment. FIG. 4 is a diagram for explaining the image processing of the image processing device 100 according to this embodiment.

By executing the flow shown in FIG. 3, an output image can be generated in which a silhouette image, obtained by abstracting the foreground region within the set person area, is superimposed on the background image. The processing of the flow shown in FIG. 3 starts or ends, for example, in accordance with an instruction from the user. The processing of the flowchart shown in FIG. 3 is assumed to be executed by the functional blocks shown in FIG. 2, which are realized by the CPU 1400 of the imaging device 110 executing a computer program stored in the ROM 1420 of the imaging device 110.

First, in S301, the communication unit 200 receives an image (input image) captured by the imaging device 110. FIG. 4(a) is a diagram showing an example of the input image received by the communication unit 200. The image shown in FIG. 4(a) contains a door 400 that can be opened and closed, and persons 401 and 402.

Next, in S302, the extraction unit 204 acquires a background image. The extraction unit 204 may, for example, acquire a background image set in advance. Alternatively, the extraction unit 204 may acquire a background image generated from the input images of the past N frames counted from the current frame. For example, the generation unit 207 generates the background image by averaging each pixel value over the input images of the five most recent frames counted from the current frame, and the extraction unit 204 may acquire the background image generated by the generation unit 207. The image shown in FIG. 4(b) is an example of the background image acquired by the extraction unit 204 in S302; it differs from the image shown in FIG. 4(a) in that persons 401 and 402 are absent.
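As a non-limiting illustration, generating a background image from the per-pixel average of the most recent N frames can be sketched as follows. This is a minimal sketch, assuming every frame is a NumPy array of the same shape with 8-bit pixel values; the class name `RunningBackground` and its methods are hypothetical.

```python
from collections import deque

import numpy as np

class RunningBackground:
    """Keeps the most recent n_frames input images and averages them."""

    def __init__(self, n_frames=5):
        # A deque with maxlen drops the oldest frame automatically.
        self.frames = deque(maxlen=n_frames)

    def update(self, frame):
        # Store as float so that the later average is not truncated.
        self.frames.append(frame.astype(np.float64))

    def background(self):
        if not self.frames:
            raise ValueError("no frames have been added yet")
        # Per-pixel mean over the stored frames, rounded back to 8 bits.
        mean = np.stack(list(self.frames)).mean(axis=0)
        return np.rint(mean).astype(np.uint8)
```

Note that a plain average lets a person who stays in view for several frames bleed into the background, which is one reason a preset background image may be preferred in some installations.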

Next, in S303, the extraction unit 204 extracts a foreground region from the captured input image. The extraction unit 204 in this embodiment extracts the foreground region by comparing the input image received by the communication unit 200 with the background image. FIG. 4(c) is an example of an input image captured at a different time from FIG. 4(a). It differs from FIG. 4(b) in the presence of persons 401 and 402 and in the open/closed state of the door 400. FIG. 4(c) shows an example in which the state of the light around the door 400 has changed because the open/closed state of the door 400 has changed. The area 403 shown in FIG. 4(c) is an area whose pixel values change greatly relative to the background image shown in FIG. 4(b) because light from a light source (not shown) behind the door 400 shines in.

The foreground regions 404 and 405 shown in FIG. 4(d) are the foreground regions extracted by the extraction unit 204 in S303 by comparing the input image shown in FIG. 4(c) with the background image shown in FIG. 4(b). In the example shown in FIG. 4(d), the foreground region 404 corresponds to the person 401, and because there are no other foreground regions around it, the shape of the person is clear. In contrast, the foreground region corresponding to the person 402 is lost within the foreground region 405, which corresponds to the area 403 of large pixel-value change around the person 402, so the shape and position of the person 402 are difficult to determine.

The image shown in FIG. 4(e) is an output image in which a silhouette image, obtained by abstracting the foreground regions extracted by the extraction unit 204 in S303, is superimposed on the background image. As FIG. 4(e) shows, in an output image where the extracted foreground regions are simply abstracted and superimposed on the background image, it can be difficult to understand the state of the environment captured by the imaging device 110.

Next, in S304, the detection unit 205 detects persons from the input image using the matching pattern (dictionary). In the example shown in FIG. 4, the detection unit 205 detects the person 401 and the person 402 from the input image shown in FIG. 4(c).

Next, in S305, the setting unit 206 sets a person area, which is an area having a shape corresponding to a person, based on the position of the person detected by the detection unit 205. The person area 406 shown in FIG. 4(f) is set by the setting unit 206 so as to surround the whole body of the person 401, based on the position of the person 401 detected by the detection unit 205. Similarly, the person area 407 shown in FIG. 4(f) is set by the setting unit 206 so as to surround the whole body of the person 402, based on the position of the person 402 detected by the detection unit 205. The person areas 406 and 407 shown in FIG. 4(f) are elliptical, and the setting unit 206 sets each at a size corresponding to the size of the detected person. Although the person areas 406 and 407 are elliptical here, a person area may instead be, for example, an area whose shape includes a first shape corresponding to at least the person's head and a second shape corresponding to the person's body. The shape of the person area may enclose the whole body of the detected person, or it may let part of the person protrude. For example, the person area may cover only part of the person, such as the head or the upper body. The shape of the person area set by the setting unit 206 is described in detail later with reference to FIG. 5.
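As a non-limiting illustration, an elliptical person area sized to a detected person can be sketched as a boolean mask. This is a minimal sketch under stated assumptions: the function name `ellipse_mask` is hypothetical, and the ellipse is simply inscribed in the whole-body rectangle, whereas an actual setting unit might enlarge it so that the whole body is fully enclosed.

```python
import numpy as np

def ellipse_mask(image_shape, box):
    """Boolean mask that is True inside an ellipse inscribed in `box`.

    image_shape: (height, width) of the image.
    box: (left, top, width, height) of the whole-body image area.
    Inscribing the ellipse in the box is an assumption of this sketch.
    """
    height, width = image_shape
    left, top, box_w, box_h = box
    cx, cy = left + box_w / 2, top + box_h / 2   # ellipse center
    rx, ry = box_w / 2, box_h / 2                # semi-axes
    ys, xs = np.mgrid[0:height, 0:width]
    # Standard ellipse inequality: ((x-cx)/rx)^2 + ((y-cy)/ry)^2 <= 1
    return ((xs - cx) / rx) ** 2 + ((ys - cy) / ry) ** 2 <= 1.0
```

Because the mask is curved, pixels near the corners of the rectangle fall outside the person area, which keeps unrelated foreground near those corners out of the silhouette.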

Next, in S306, the generation unit 207 generates a silhouette image by abstracting the foreground region within the person areas set by the setting unit 206. The image shown in FIG. 4(g) is a silhouette image obtained by abstracting, out of the foreground regions shown in FIG. 4(d), the foreground region within each of the person areas 406 and 407 set by the setting unit 206. As shown in FIG. 4(g), the only foreground region inside the person area 406 is the foreground region 404 corresponding to the person 401. Therefore, when the foreground region 404 is abstracted to generate the silhouette image, the shape of the foreground (difference) region 404 is carried over unchanged as the shape of the silhouette image. For the person area 407, on the other hand, a difference region exists around the person 402, so it is difficult to generate a silhouette image that abstracts the detailed shape of the person 402. Nevertheless, a silhouette image is generated by abstracting the foreground region 405 within the person area 407, which corresponds to the position and size of the person 402. This yields a silhouette image from which the position, size, and so on of the person 402 can be roughly grasped.

Next, in S307, the generation unit 207 generates an output image in which the silhouette image, obtained by abstracting the foreground region within the person areas set by the setting unit 206, is superimposed on the background image. FIG. 4(h) shows an example of an output image in which the silhouette image generated in S306 is superimposed on the background image. The silhouette image 408 shown in FIG. 4(h) is obtained by abstracting the foreground region within the person area 406, and the silhouette image 409 is obtained by abstracting the foreground region within the person area 407. As shown in FIG. 4(h), in areas with little environmental change, such as changes in image brightness caused by the opening and closing of the door 400, a silhouette image that reflects the shape of the person actually present can be generated, as with the silhouette image 408. The state of the imaged environment can therefore be grasped in detail. In areas with large environmental change, a silhouette image corresponding to the position and size of the person can still be generated, as with the silhouette image 409, so even in such an area it is possible to grasp whether a person is present.
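As a non-limiting illustration, the processing of S306 and S307 can be sketched as follows: the foreground is restricted to the person areas, filled with a single color, and superimposed on the background image. This is a minimal sketch, assuming the background is an H x W x 3 array of 8-bit values and all masks are H x W; the function name `compose_output` and the default fill color are hypothetical.

```python
import numpy as np

def compose_output(background, foreground_mask, person_masks, color=(0, 0, 255)):
    """Superimpose a filled silhouette on a copy of the background image.

    Only foreground pixels inside at least one person area are concealed;
    foreground outside every person area is left out of the output.
    """
    person_region = np.zeros(foreground_mask.shape, dtype=bool)
    for mask in person_masks:
        person_region |= mask.astype(bool)       # union of all person areas
    silhouette = person_region & foreground_mask.astype(bool)
    output = background.copy()                   # do not modify the input
    output[silhouette] = color                   # fill with one flat color
    return output, silhouette
```

Filling with a flat color is only one of the abstraction methods mentioned in the text; a mosaic or blur could be applied to the same `silhouette` mask instead.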

次に、Ｓ３０８にて、表示制御部２０２は、生成部２０７により生成された出力画像を出力する。なお、本実施形態において、表示制御部２０２は、生成部２０７により生成された出力画像をディスプレイ１３０に表示させる。 Next, in S308, the display control unit 202 outputs the output image generated by the generation unit 207. Note that in this embodiment, the display control unit 202 causes the display 130 to display the output image generated by the generation unit 207.

次に、Ｓ３０９にて、ユーザにより処理を終了する指示がある場合（Ｓ３０９にてＹｅｓ）、処理を終了する。一方、ユーザにより処理を終了する指示がない場合（Ｓ３０９にてＮｏ）、Ｓ３０１へ戻り、通信部２００は次のフレームの画像（入力画像）を受信する。 Next, in S309, if the user instructs to end the process (Yes in S309), the process ends. On the other hand, if there is no instruction from the user to end the process (No in S309), the process returns to S301, and the communication unit 200 receives the next frame image (input image).

以上説明したように本実施形態に係る画像処理装置１００の画像処理は、検出された人物の位置に基づいて人物領域を設定し、設定した人物領域における前景領域を抽象化したシルエット画像を背景画像に重畳した出力画像を生成する。このようにすることで、撮像された画像の適切な領域を隠蔽した画像を生成可能とする技術を提供することができる。 As explained above, the image processing of the image processing apparatus 100 according to the present embodiment sets a person area based on the detected position of the person, and generates an output image in which a silhouette image that abstracts the foreground area in the set person area is superimposed on the background image. By doing so, it is possible to provide a technique that makes it possible to generate an image in which an appropriate area of a captured image is hidden.

次に、図５を参照して、設定部２０６により設定される人物領域の形状について説明する。図４を参照して説明した上述の内容においては、設定部２０６により設定される人物領域４０６、４０７として、図５に示す楕円形状５００を例に挙げて説明したが、これに限らない。人物領域の形状としては、例えば、図５に示す矩形５０１や台形５０２といった単純な図形でもよい。 Next, with reference to FIG. 5, the shape of the human region set by the setting unit 206 will be described. In the above description with reference to FIG. 4, the elliptical shape 500 shown in FIG. 5 is used as an example of the human regions 406 and 407 set by the setting unit 206, but the present invention is not limited to this. The shape of the person area may be a simple figure such as a rectangle 501 or a trapezoid 502 shown in FIG. 5, for example.

また、人物領域の形状としては、人物の少なくとも頭部に対応する第１形状と、人物の胴体に対応する第２形状とを含む形状であってもよい。例えば、人物領域の形状として、形状５０３のように、頭部に対応する円形の第１形状５１０と、人物の胴体に対応する三角形の第２形状５１１とを含む形状であってもよい。また、人物領域の形状として、形状５０４のように、頭部に対応する円形の第１形状５２０と、人物の胴体に対応する楕円を一部である第２形状５２１とを含む形状であってもよい。また、人物領域の形状として、形状５０５のように、頭部に対応する円形の第１形状５３０と、人物の胴体に対応する略矩形の第２形状５３１とを含む形状であってもよい。 Further, the shape of the person area may include a first shape corresponding to at least the head of the person and a second shape corresponding to the torso of the person. For example, the shape of the person area may be a shape, such as the shape 503, that includes a circular first shape 510 corresponding to the head and a triangular second shape 511 corresponding to the torso of the person. The shape of the person area may also be a shape, such as the shape 504, that includes a circular first shape 520 corresponding to the head and a second shape 521 that is part of an ellipse and corresponds to the torso of the person. The shape of the person area may also be a shape, such as the shape 505, that includes a circular first shape 530 corresponding to the head and a substantially rectangular second shape 531 corresponding to the torso of the person.
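A person area composed of a first shape and a second shape can be rasterized as the union of two simple masks. The following is only a sketch loosely modeled on the shape 505 (a circle over a rectangle); the function `head_torso_mask` and its parameters are hypothetical, and the relative proportions are arbitrary assumptions.

```python
def head_torso_mask(height, width, cx, top, head_r, torso_w, torso_h):
    """Binary mask: a circle (first shape, head) stacked on a rectangle
    (second shape, torso), both centered horizontally on cx."""
    head_cy = top + head_r            # center of the head circle
    torso_top = top + 2 * head_r      # torso starts at the bottom of the head
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            in_head = (x - cx) ** 2 + (y - head_cy) ** 2 <= head_r ** 2
            in_torso = (abs(x - cx) <= torso_w / 2
                        and torso_top <= y < torso_top + torso_h)
            if in_head or in_torso:
                mask[y][x] = 1
    return mask


mask = head_torso_mask(10, 9, cx=4, top=0, head_r=2, torso_w=6, torso_h=5)
for row in mask:
    print("".join("#" if v else "." for v in row))
```

The same pattern extends to a third shape for the lower body by adding one more membership test to the union.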

なお、人物領域の形状の一例である形状５０３～５０５に示すように、上述したように、人物の頭部を模した第１形状と、人物の胴体を模した第２形状とを含む形状であるが、これに限らない。例えば、人物領域の形状としては、人物の頭部を模した第１形状と、人物の胴体を模した第２形状と、更に、例えば、人物の下半身を模した第３形状を含む形状であってもよい。 Note that the shapes 503 to 505, which are examples of the shape of the person area, each include a first shape that imitates the head of a person and a second shape that imitates the torso of the person as described above, but the shape is not limited to this. For example, the shape of the person area may include a first shape that imitates the head of a person, a second shape that imitates the torso of the person, and additionally, for example, a third shape that imitates the lower body of the person.

なお、本実施形態における人物領域の形状は、図５に示す矩形５０１や台形５０２のように曲線を有していない形状であってもよいし、楕円形５００、形状５０３～５０５のように、一部曲線を有する形状であってもよい。このように、一部曲線を有する形状の人物領域を用いることで、人物の形状により近いシルエット画像を生成することが可能となる。 Note that the shape of the person area in this embodiment may be a shape without curves, such as the rectangle 501 or the trapezoid 502 shown in FIG. 5, or a shape that is partially curved, such as the ellipse 500 or the shapes 503 to 505. By using a person area having a partially curved shape in this way, it is possible to generate a silhouette image that more closely resembles the shape of a person.

図５に示す形状５０３～５０５のように、人物の形状により近い人物領域を用いることで、より人物の形状に近いシルエット画像を生成することが可能となる。例えば、図４（ｈ）において人物４０２に対応するシルエット画像４０９は楕円形であるが、前景領域が大きい領域内にて検出された人物に対して、人物の形状により近い人物領域を用いることで、人物の存在をより分かり易くユーザに提示することが可能となる。 By using a person area closer to the shape of a person, such as the shapes 503 to 505 shown in FIG. 5, it is possible to generate a silhouette image closer to the shape of the person. For example, the silhouette image 409 corresponding to the person 402 in FIG. 4(h) is elliptical, but by using a person area closer to the shape of a person for a person detected within a large foreground area, the presence of the person can be presented to the user in a more easily understandable manner.

以上説明したように、本実施形態に係る画像処理装置１００の画像処理は、検出された人物の位置に基づいて人物領域を設定し、設定した人物領域における前景領域を抽象化したシルエット画像を背景画像に重畳した出力画像を生成する。このようにすることで、撮像された人物の位置に基づくシルエット画像を生成して表示することが可能であるため、プライバシーを確保しつつ、人の存在が分かりやすい表示を行うことが可能となる。また、画像に映る人物を覆うようにサイズが大きい前景領域が抽出された場合であっても、撮像環境の様子を把握しやすい画像の表示を行うことが可能となる。また更に、人物を検出する処理において、本来検出対象（隠蔽対象）としていない物体（例えば人物を含むポスター）を検出してしまった場合でも、前景領域が存在していなければシルエット画像が生成されない。そのため、ユーザに誤解を与えるような出力画像が生成されることを抑制し、より実際の状況を反映させた出力画像を生成することが可能となる。以上より、本実施形態に係る画像処理装置１００の画像処理によれば、撮像された画像の適切な領域を隠蔽した画像を生成可能とする技術を提供することができる。 As explained above, the image processing of the image processing apparatus 100 according to the present embodiment sets a person area based on the detected position of the person, and generates an output image in which a silhouette image that abstracts the foreground area in the set person area is superimposed on the background image. Because this makes it possible to generate and display a silhouette image based on the position of the imaged person, the presence of a person can be displayed in an easy-to-understand manner while ensuring privacy. Further, even if a large foreground area is extracted so as to cover a person appearing in the image, it is possible to display an image that makes it easy to understand the state of the imaging environment. Furthermore, even if the person detection process detects an object that is not originally a detection target (concealment target), for example a poster that includes a person, no silhouette image is generated unless a foreground area exists there. Therefore, it is possible to suppress generation of an output image that may mislead the user, and to generate an output image that more closely reflects the actual situation.
As described above, according to the image processing of the image processing apparatus 100 according to the present embodiment, it is possible to provide a technique that makes it possible to generate an image in which an appropriate area of a captured image is hidden.

（実施形態２） (Embodiment 2)
本実施形態では、駅やショッピングモールなど、多くの人物で混雑が発生するような場合であっても、プライバシーを保護しつつ、画像に映る人物の存在をユーザが把握しやすくする実施形態について説明する。以下、実施形態１と異なる部分を主に説明し、実施形態１と同一または同等の構成要素、および処理には同一の符号を付すとともに、重複する説明は省略する。 This embodiment makes it easier for the user to grasp the presence of people appearing in an image while protecting privacy, even when crowding with many people occurs, such as at a station or shopping mall. Hereinafter, parts that are different from Embodiment 1 will be mainly described, components and processes that are the same as or equivalent to those of Embodiment 1 will be denoted by the same reference numerals, and redundant explanation will be omitted.

以下、図６を参照して、本実施形態に係る画像処理について説明する。図６は、本実施形態に係る画像処理を説明するための図である。図６（ａ）は撮像装置１１０によって撮像された入力画像の一例を示している。図６（ａ）における人物６００は、複数の人物で混雑している領域６６０（混雑領域）外に単独で存在している人物を示している。また、人物６０１は領域６６０に存在する複数の人物を示している。影６０２は、領域６６０において人物が作る影の領域である。 Image processing according to this embodiment will be described below with reference to FIG. 6. FIG. 6 is a diagram for explaining image processing according to this embodiment. FIG. 6A shows an example of an input image captured by the imaging device 110. A person 600 in FIG. 6A is a person who exists alone outside an area 660 (crowded area) crowded with a plurality of people. Further, a person 601 indicates a plurality of people existing in the area 660. Shadow 602 is a shadow area created by a person in area 660.

図６（ｂ）は、本実施形態において用いられる背景画像の一例を示している。図６（ｃ）は、図６（ａ）に示す入力画像と、図６（ｂ）に示す背景画像とを比較することにより抽出部２０４により抽出された前景領域を示す。図６（ｃ）に示す例において、前景領域６０３は、単独で存在する人物６００に対応する前景領域を示している。また、図６（ｃ）に示す前景領域６０４は、領域６６０に存在する複数の人物、及び当該複数の人物が作り出す影領域６０２に対応する前景領域を示している。 FIG. 6(b) shows an example of a background image used in this embodiment. FIG. 6(c) shows a foreground region extracted by the extraction unit 204 by comparing the input image shown in FIG. 6(a) and the background image shown in FIG. 6(b). In the example shown in FIG. 6C, a foreground region 603 corresponds to a person 600 existing alone. Further, a foreground region 604 shown in FIG. 6C indicates a foreground region corresponding to a plurality of people existing in the region 660 and a shadow region 602 created by the plurality of people.

図６（ｄ）は、図６（ａ）に示す入力画像に対して検出部２０５により検出された人物の位置に基づき、設定部２０６が設定した人物領域を示す図である。単独で存在する人物６００の位置に基づいて設定された人物領域６０５では他の人物領域との重なりが無い。一方、領域６６０（混雑領域）に存在する複数の人物６０１に対応する複数の人物領域６０６は他の人物領域と重なってしまう。ここで、実施形態１で説明した画像処理と同様にして、図６（ｄ）に示す人物領域における前景領域を抽象化したシルエット画像を生成し、生成したシルエット画像を背景画像に重畳した出力画像を生成する場合を想定する。このようにして生成された出力画像の一例を図６（ｅ）に示す。図６（ｅ）では、単独で存在する人物６００に対応するシルエット画像６０７では人物を区別しながら表示できている。しかしながら、混雑領域に存在する複数の人物６０１に対応するシルエット画像６０８では、シルエット画像が重なった状態の出力画像が生成されてしまうため、撮像された画像における人物の状況を把握することが困難となる。 FIG. 6(d) is a diagram showing the person areas set by the setting unit 206 based on the positions of the people detected by the detection unit 205 in the input image shown in FIG. 6(a). The person area 605, which is set based on the position of the person 600 existing alone, does not overlap with any other person area. On the other hand, the plurality of person areas 606 corresponding to the plurality of people 601 existing in the area 660 (crowded area) overlap with other person areas. Here, assume that, in the same manner as the image processing described in Embodiment 1, silhouette images are generated by abstracting the foreground area in the person areas shown in FIG. 6(d), and an output image is generated by superimposing the generated silhouette images on the background image. An example of the output image generated in this manner is shown in FIG. 6(e). In FIG. 6(e), the silhouette image 607 corresponding to the person 600 existing alone is displayed in a way that distinguishes the person. However, for the silhouette images 608 corresponding to the plurality of people 601 existing in the crowded area, an output image in which the silhouette images overlap is generated, making it difficult to understand the situation of the people in the captured image.

そこで、複数の人物で混雑するような状況であっても、画像における人物の状況を把握しやすくするため、本実施形態に係る画像処理装置１００は、例えば、次のような処理を実行する。すなわち、本実施形態に係る生成部２０７は、設定部２０６により設定された人物領域ごとに、人物領域における前景領域を抽象化したシルエット画像の表示態様を異ならせるようにする。シルエット画像の表示形態を異ならせる方法としては、設定部２０６により設定された人物領域ごとに、人物領域における前景領域の色を異ならせたり、人物領域に対して付与するテクスチャの種類を異ならせたりする方法がある。しかしながら、人物領域ごとに、識別可能とするようにできればよく、上述した方法に限らない。 Therefore, in order to make it easier to understand the situation of the people in the image even in a situation where the image is crowded with a plurality of people, the image processing apparatus 100 according to the present embodiment performs the following process, for example. That is, the generation unit 207 according to the present embodiment changes the display mode of the silhouette image, which is an abstraction of the foreground area in the person area, for each person area set by the setting unit 206. As a method of varying the display form of the silhouette image, for each person area set by the setting unit 206, the color of the foreground area in the person area may be different, or the type of texture given to the person area may be different. There is a way to do it. However, the method is not limited to the above method as long as it can be made distinguishable for each person area.

また、上述したように混雑領域６６０における複数の人物領域が示すように人物領域同士が重なる場合において、人物領域ごとに表示形態を異ならせる際には、画像の手前側に位置する人物領域に対応するシルエット画像が優先的に表示されるようにするとよい。なお、生成部２０７は、例えば、画像の奥側に位置する人物領域に対応するシルエット画像に対し、画像の手前側に位置する人物領域に対応するシルエット画像が表側に位置するような出力画像を生成することで優先的な表示を行う。 In addition, when person areas overlap each other, as with the plurality of person areas in the crowded area 660 described above, and the display form is varied for each person area, it is preferable that the silhouette image corresponding to the person area located on the near side of the image be displayed preferentially. Note that the generation unit 207 performs this preferential display by, for example, generating an output image in which the silhouette image corresponding to a person area located on the near side of the image is placed in front of the silhouette image corresponding to a person area located on the far side of the image.

具体的には、まず、検出部２０５は、検出した人物の足元の位置を特定する。例えば、検出部２０５は、検出した人物の全身の画像領域において、最も下部に位置する点を当該人物の足元の位置として特定する。そして、生成部２０７は、より撮像装置１１０に近い人物、例えば、画像の垂直方向のより下側に足元が位置する人物に対応するシルエット画像を優先的に表示するようにする。なお、人物に対応するシルエット画像とは、当該人物に対し設定された人物領域における前景領域を抽象化したシルエット画像とする。なお、生成部２０７は、画像の垂直方向のより上側に足元が位置する人物に対応するシルエット画像に対し、より下側に足元が位置する人物に対応するシルエット画像が表側に位置するような出力画像を生成する。 Specifically, first, the detection unit 205 identifies the position of the detected person's feet. For example, the detection unit 205 identifies the lowest point in the image area of the detected person's whole body as the position of the person's feet. Then, the generation unit 207 preferentially displays the silhouette image corresponding to a person closer to the imaging device 110, for example, a person whose feet are located lower in the vertical direction of the image. Note that the silhouette image corresponding to a person is the silhouette image that abstracts the foreground area in the person area set for that person. The generation unit 207 generates an output image in which the silhouette image corresponding to a person whose feet are located lower is placed in front of the silhouette image corresponding to a person whose feet are located higher in the vertical direction of the image.
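One way to realize this preferential display is to draw the silhouettes in order from the person whose feet are highest in the image to the person whose feet are lowest, so that nearer people overwrite farther ones. The following is a minimal sketch under that assumption; the data layout (a dictionary with `silhouette` and `foot_y` keys), the function `compose_by_depth`, and the palette are hypothetical.

```python
def compose_by_depth(background, people, palette):
    """Draw each person's silhouette in a distinct color, far people first.

    people: list of dicts with
      'silhouette': binary mask with the same size as the background
      'foot_y':     vertical coordinate of the feet (larger = lower = nearer)
    """
    output = [row[:] for row in background]
    # Smaller foot_y means farther from the camera, so it is drawn first
    # and may be overwritten by silhouettes drawn later.
    ordered = sorted(people, key=lambda p: p["foot_y"])
    for index, person in enumerate(ordered):
        color = palette[index % len(palette)]
        for y, row in enumerate(person["silhouette"]):
            for x, value in enumerate(row):
                if value:
                    output[y][x] = color
    return output


background = [["bg"] * 3 for _ in range(4)]
far_person = {
    "silhouette": [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 0, 0]],
    "foot_y": 2,
}
near_person = {
    "silhouette": [[0, 0, 0], [0, 1, 1], [0, 1, 1], [0, 1, 1]],
    "foot_y": 3,
}
# The input order is deliberately near-first to show that sorting decides.
result = compose_by_depth(background, [near_person, far_person], ["red", "blue"])
```

Here the far person is drawn in "red" and the near person in "blue", and the pixels where the two silhouettes overlap take the near person's color.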

ここで、図３に示すフローを参照して、本実施形態に係る生成部２０７の処理について説明する。本実施形態に係る生成部２０７は、Ｓ３０６にて、設定部２０６により設定された人物領域ごとに、人物領域における前景領域を抽象化したシルエット画像の表示態様を異ならせつつ、シルエット画像を生成する。 Here, the processing of the generation unit 207 according to the present embodiment will be described with reference to the flow shown in FIG. 3. In S306, the generation unit 207 according to the present embodiment generates a silhouette image for each person area set by the setting unit 206, while varying the display mode of the silhouette image that abstracts the foreground area in the person area.

図６（ｆ）は、上述した本実施形態に係る生成部２０７の処理により生成されたシルエット画像を背景画像に重畳した出力画像の一例を示している。図６（ｆ）に示すように、単独で存在する人物６００に対応するシルエット画像６０９のみならず、混雑領域に存在する複数の人物６０１に対応するシルエット画像６１０であっても各シルエット画像を識別可能な出力画像を生成できる。 FIG. 6(f) shows an example of an output image in which the silhouette images generated by the processing of the generation unit 207 according to the present embodiment described above are superimposed on the background image. As shown in FIG. 6(f), an output image can be generated in which each silhouette image is identifiable, not only for the silhouette image 609 corresponding to the person 600 existing alone but also for the silhouette images 610 corresponding to the plurality of people 601 existing in the crowded area.

以上説明したように、本実施形態に係る生成部２０７は、設定部２０６により設定された人物領域ごとに、人物領域における前景領域を抽象化したシルエット画像の表示態様を異ならせた。このようにすることで、複数の人物で混雑するような状況であっても、画像に含まれる人物のプライバシーを保護しつつ、画像に含まれる人物の状況を把握しやすくする。 As described above, the generation unit 207 according to the present embodiment changes the display mode of the silhouette image, which is an abstraction of the foreground area in the person area, for each person area set by the setting unit 206. In this way, even in a situation where the image is crowded with multiple people, the privacy of the person included in the image is protected, and the situation of the person included in the image can be easily understood.

（実施形態３） (Embodiment 3)
実施形態２に係る生成部２０７は、複数の人物で混雑するような状況であっても、画像における人物の状況を把握しやすくするため、設定された人物領域ごとに、人物領域における前景領域を抽象化したシルエット画像の表示態様を異ならせた。 In order to make it easier to understand the situation of the people in the image even when the image is crowded with a plurality of people, the generation unit 207 according to Embodiment 2 varied, for each set person area, the display mode of the silhouette image that abstracts the foreground area in the person area.

本実施形態に係る生成部２０７は、複数の人物で混雑するような状況であっても、画像における人物の状況を把握しやすくするため、次のような処理を実行する。すなわち、本実施形態に係る生成部２０７は、画像における複数の人物で混雑する混雑領域の位置に基づいて、設定された人物領域ごとに、人物領域における前景領域を抽象化したシルエット画像の表示態様を異ならせる。以下、図７を参照して、本実施形態に係る生成部２０７の処理について説明する。なお、実施形態１～２と異なる部分を主に説明し、実施形態１～２と同一または同等の構成要素、および処理には同一の符号を付すとともに、重複する説明は省略する。 The generation unit 207 according to the present embodiment performs the following processing in order to make it easier to understand the situation of the people in the image even when the image is crowded with a plurality of people. That is, based on the position of a crowded area in the image that is crowded with a plurality of people, the generation unit 207 according to the present embodiment varies, for each set person area, the display mode of the silhouette image that abstracts the foreground area in the person area. The processing of the generation unit 207 according to this embodiment will be described below with reference to FIG. 7. Note that parts that are different from Embodiments 1 and 2 will be mainly explained, components and processes that are the same as or equivalent to those of Embodiments 1 and 2 will be denoted by the same reference numerals, and redundant explanation will be omitted.

図７は、本実施形態に係る画像処理装置１００の機能ブロックを示す図である。なお、図７の機能ブロックが示す各機能は、本実施形態において、ＲＯＭ１４２０とＣＰＵ１４００とを用いて、次のようにして実現されるものとする。図７に示す各機能は、画像処理装置１００のＲＯＭ１４２０に格納されたコンピュータプログラムを画像処理装置１００のＣＰＵ１４００が実行することにより実現される。 FIG. 7 is a diagram showing functional blocks of the image processing apparatus 100 according to this embodiment. In this embodiment, each function shown by the functional blocks of FIG. 7 is assumed to be realized using the ROM 1420 and the CPU 1400 as follows. Each function shown in FIG. 7 is realized by the CPU 1400 of the image processing apparatus 100 executing a computer program stored in the ROM 1420 of the image processing apparatus 100.

図７における通信部２００、記憶部２０１、表示制御部２０２、操作受付部２０３、抽出部２０４、検出部２０５、設定部２０６の処理は、実施形態１と同様であるため説明を省略する。 The processes of the communication unit 200, storage unit 201, display control unit 202, operation reception unit 203, extraction unit 204, detection unit 205, and setting unit 206 in FIG. 7 are the same as those in Embodiment 1, so description thereof will be omitted.

図７に示す判定部７０８は、画像における混雑領域を判定する。例えば、判定部７０８は、検出部２０５により検出された人物の数に基づいて、画像における混雑領域を判定する。この場合、判定部７０８は、例えば、画像の全体の領域を複数の分割領域に分割し、分割領域毎に検出された人物の数を計数し、人物の数が閾値以上の分割領域を混雑領域として判定する。 The determination unit 708 shown in FIG. 7 determines a crowded area in an image. For example, the determination unit 708 determines a crowded area in the image based on the number of people detected by the detection unit 205. In this case, the determination unit 708, for example, divides the entire area of the image into a plurality of divided areas, counts the number of people detected in each divided area, and determines a divided area in which the number of people is equal to or greater than a threshold to be a crowded area.
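The count-based determination can be sketched as follows. The sketch assumes a uniform grid of divided areas and that each detected person is represented by a single point (for example, the foot position); the function `crowded_cells` and the grid dimensions are hypothetical choices that the embodiment does not specify.

```python
def crowded_cells(person_points, image_w, image_h, cols, rows, threshold):
    """Return the set of (row, col) grid cells whose person count is
    equal to or greater than the threshold."""
    counts = [[0] * cols for _ in range(rows)]
    for x, y in person_points:
        # Map the point to a grid cell, clamping points on the far edge.
        col = min(int(x * cols / image_w), cols - 1)
        row = min(int(y * rows / image_h), rows - 1)
        counts[row][col] += 1
    return {
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if counts[r][c] >= threshold
    }


points = [(10, 10), (20, 30), (40, 45), (80, 80)]
crowded = crowded_cells(points, image_w=100, image_h=100,
                        cols=2, rows=2, threshold=3)
```

With these toy values, three people fall in the upper-left divided area and one in the lower-right, so only the upper-left cell is judged to be crowded.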

また、判定部７０８は、例えば、画像から抽出部２０４により抽出された前景領域の位置とサイズに基づいて、混雑領域を判定する。具体的には、判定部７０８は、抽出部２０４により抽出された前景領域のサイズが閾値以上である場合に当該前景領域を混雑領域として判定してもよい。なお、１つの前景領域を、２値画像において前景領域を示す「１」である画素のうち、隣接した画素を連結させることで形成される領域とする。そして、判定部７０８は、画像から複数の前景領域が抽出された場合において、当該複数の前景領域の各々について、前景領域のサイズと、閾値とを比較して、混雑領域である前景領域を判定する。 Further, the determination unit 708 may determine a crowded area based on, for example, the position and size of the foreground area extracted from the image by the extraction unit 204. Specifically, the determination unit 708 may determine that a foreground area extracted by the extraction unit 204 is a crowded area when the size of the foreground area is equal to or larger than a threshold. Note that one foreground area is defined as an area formed by connecting adjacent pixels among the pixels having the value "1", which indicates the foreground area in a binary image. When a plurality of foreground areas are extracted from the image, the determination unit 708 compares the size of each of the plurality of foreground areas with the threshold to determine which foreground areas are crowded areas.
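The size-based determination amounts to connected-component labeling of the binary foreground image followed by a threshold on each component. The sketch below assumes 4-connectivity and measures size as a pixel count; the embodiment specifies neither, and the function names are hypothetical.

```python
from collections import deque


def label_foreground_regions(binary):
    """Group pixels with value 1 into regions of 4-connected neighbors.
    Returns a list of regions, each a list of (x, y) pixel coordinates."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for sy in range(height):
        for sx in range(width):
            if binary[sy][sx] != 1 or seen[sy][sx]:
                continue
            # Breadth-first search from an unvisited foreground pixel.
            queue = deque([(sx, sy)])
            seen[sy][sx] = True
            region = []
            while queue:
                x, y = queue.popleft()
                region.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append(region)
    return regions


def crowded_regions(binary, size_threshold):
    """Keep only the foreground regions whose pixel count meets the threshold."""
    return [r for r in label_foreground_regions(binary)
            if len(r) >= size_threshold]


binary = [
    [1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
large = crowded_regions(binary, size_threshold=4)
```

In this example the isolated pixel forms a region of size 1 and is discarded, while the 6-pixel block is kept as a crowded area.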

また、判定部７０８は、例えば、画像から抽出部２０４により抽出された前景領域の位置およびサイズと、画像から検出部２０５により検出された人物の数とに基づいて、画像における混雑領域を判定してもよい。具体的には、判定部７０８は、例えば、画像から抽出されたサイズが第１閾値以上の前景領域、または、画像を複数のエリアに分割したときに検出された人物の数が第２閾値以上のエリア、を混雑領域として判定する。 Further, the determination unit 708 may determine a crowded area in the image based on, for example, the position and size of the foreground area extracted from the image by the extraction unit 204 and the number of people detected from the image by the detection unit 205. Specifically, the determination unit 708 determines, for example, a foreground area extracted from the image whose size is equal to or greater than a first threshold, or an area in which the number of detected people is equal to or greater than a second threshold when the image is divided into a plurality of areas, to be a crowded area.

そして、本実施形態に係る生成部２０７は、判定部７０８により判定された画像における混雑領域に位置する人物に対し設定された人物領域の各々について、人物領域における前景領域を抽象化したシルエット画像の表示態様を異ならせる。例えば、判定部７０８により、図６に示す人物で混雑した領域６６０を混雑領域として判定する。この場合、生成部２０７は、判定部７０８により混雑領域であると判定された領域６６０に位置する人物に対して設定された人物領域ごとに、人物領域における前景領域を抽象化したシルエット画像の表示態様を異ならせる。人物領域ごとに、人物領域における前景領域を抽象化したシルエット画像の表示態様を異ならせる方法は、実施形態２で説明した内容と同様であるため、説明を省略する。なお、判定部７０８は、例えば、検出された人物の全身の画像領域と混雑領域とが重畳している割合である重畳率が閾値以上である場合、当該人物を混雑領域に位置する人物と判定する。 Then, for each of the person areas set for the people located in the crowded area determined by the determination unit 708, the generation unit 207 according to the present embodiment varies the display mode of the silhouette image that abstracts the foreground area in the person area. For example, assume that the determination unit 708 determines the area 660 crowded with people shown in FIG. 6 to be a crowded area. In this case, for each person area set for a person located in the area 660 determined to be a crowded area by the determination unit 708, the generation unit 207 varies the display mode of the silhouette image that abstracts the foreground area in the person area. The method of varying the display mode of the silhouette image for each person area is the same as that described in Embodiment 2, so a description thereof will be omitted. Note that the determination unit 708 determines that a person is located in the crowded area when, for example, the overlap rate, which is the ratio at which the image area of the detected person's whole body overlaps the crowded area, is equal to or higher than a threshold.
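The overlap-rate test can be sketched as follows. The sketch assumes that both the whole-body image area and the crowded area are axis-aligned rectangles given as (left, top, right, bottom), and that the overlap rate is the intersection area divided by the area of the whole-body image area; the embodiment does not state the denominator, and the function names and the default threshold are hypothetical.

```python
def overlap_ratio(body_box, crowded_box):
    """Fraction of the whole-body box that lies inside the crowded box.
    Boxes are (left, top, right, bottom)."""
    bl, bt, br, bb = body_box
    cl, ct, cr, cb = crowded_box
    inter_w = max(0, min(br, cr) - max(bl, cl))
    inter_h = max(0, min(bb, cb) - max(bt, ct))
    body_area = (br - bl) * (bb - bt)
    return (inter_w * inter_h) / body_area if body_area > 0 else 0.0


def is_in_crowded_area(body_box, crowded_box, threshold=0.5):
    """A person counts as located in the crowded area when the overlap
    ratio is equal to or greater than the threshold."""
    return overlap_ratio(body_box, crowded_box) >= threshold


half_inside = overlap_ratio((0, 0, 10, 20), (5, 0, 50, 50))
mostly_outside = overlap_ratio((0, 0, 10, 20), (8, 0, 50, 50))
```

Only the people for which `is_in_crowded_area` returns true would then be given individually varied display modes.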

以上説明したように、本実施形態に係る生成部２０７は、判定部７０８により判定された画像における混雑領域の位置に基づいて、設定された人物領域ごとに、人物領域における前景領域を抽象化したシルエット画像の表示態様を異ならせる。このようにすることで、複数の人物で混雑するような状況であっても、画像に含まれる人物のプライバシーを保護しつつ、画像に含まれる人物の状況を把握しやすくする。 As described above, the generation unit 207 according to the present embodiment abstracts the foreground area in the person area for each set person area based on the position of the crowded area in the image determined by the determination unit 708. The display mode of silhouette images is made different. In this way, even in a situation where the image is crowded with multiple people, the privacy of the person included in the image is protected and the situation of the person included in the image can be easily understood.

（実施形態４） (Embodiment 4)
実施形態２に係る生成部２０７は、複数の人物で混雑するような状況であっても、画像における人物の状況を把握しやすくするため、設定された人物領域ごとに、人物領域における前景領域を抽象化したシルエット画像の表示態様を異ならせた。 In order to make it easier to understand the situation of the people in the image even when the image is crowded with a plurality of people, the generation unit 207 according to Embodiment 2 varied, for each set person area, the display mode of the silhouette image that abstracts the foreground area in the person area.

本実施形態に係る画像処理装置１００は、複数の人物で混雑するような状況であっても、画像における人物の状況を把握しやすくするため、次のような処理を実行する。すなわち、本実施形態に係る画像処理装置１００の設定部２０６は、画像に含まれる人物の混雑領域に基づき、形状が異なる人物領域の設定を行う。以下、図７を参照して、本実施形態に係る設定部２０６の処理について説明する。なお、実施形態１～３と異なる部分を主に説明し、実施形態１～３と同一または同等の構成要素、および処理には同一の符号を付すとともに、重複する説明は省略する。 The image processing apparatus 100 according to the present embodiment performs the following process in order to make it easier to understand the situation of the person in the image even in a situation where the image is crowded with a plurality of people. That is, the setting unit 206 of the image processing apparatus 100 according to the present embodiment sets human regions having different shapes based on the crowded region of the people included in the image. The processing of the setting unit 206 according to this embodiment will be described below with reference to FIG. 7. Note that parts that are different from Embodiments 1 to 3 will be mainly explained, and components and processes that are the same or equivalent to Embodiments 1 to 3 will be given the same reference numerals, and redundant explanation will be omitted.

図８は、混雑が発生している場合において設定される人物領域の形状を説明する図である。図８に示す人物領域８００は、混雑が発生していない領域において単独で存在する人物８０１に対して設定部２０６により設定された人物領域であって、人物８０１の全身が収まるサイズの人物領域である。なお、本実施形態における人物領域は、便宜的に、図５に示す楕円５００の形状とするが、これに限らない。例えば、設定される人物領域としては、図５に示す楕円５００以外の他の形状であってもよい。 FIG. 8 is a diagram illustrating the shape of a person area that is set when crowding occurs. The person area 800 shown in FIG. 8 is a person area set by the setting unit 206 for a person 801 existing alone in a non-crowded area, and is sized so that the whole body of the person 801 fits within it. Note that, for convenience, the person area in this embodiment has the shape of the ellipse 500 shown in FIG. 5, but is not limited to this. For example, the person area to be set may have a shape other than the ellipse 500 shown in FIG. 5.

ここで、人物領域８００における人物８０１の全身の画像領域を前景領域として抽出部２０４により抽出された場合を想定したとき、生成部２０７は、当該前景領域を抽象化してシルエット画像８０２を生成する。シルエット画像８０２が示すように、混雑が発生してない領域における人物に対し設定された人物領域において、前景領域を抽象化したシルエット画像は、人物の形状が比較的分かり易い。 Here, assuming that the extraction unit 204 extracts an image area of the whole body of the person 801 in the person area 800 as a foreground area, the generation unit 207 generates the silhouette image 802 by abstracting the foreground area. As shown in the silhouette image 802, in a silhouette image in which the foreground area is abstracted in a person area set for a person in a non-congested area, the shape of the person is relatively easy to understand.

一方、人物領域８１０は、混雑領域（図６に示す領域６６０等）に存在する人物８１１および人物８１２のうち人物８１２に対して設定された人物領域であり、人物領域８００と同じ形状の人物領域である。人物８１２の全身の画像領域と人物８１１の全身の画像領域とを前景領域として抽出部２０４により抽出された場合を想定したとき、生成部２０７は、人物８１２に対し設定された人物領域８１０における前景領域を抽象化してシルエット画像８１３を生成する。ここでは、人物８１２の背後に位置する人物８１１が人物領域８１０に一部入り込んでしまった結果、人物８１１の一部がシルエット画像として生成されてしまう例を示している。シルエット画像８１３が示すように、混雑が発生している領域における人物に対し設定された人物領域において、前景領域を抽象化したシルエット画像は、人物の形状として不自然な形状が表示されてしまい、ユーザによって見にくくなってしまうことがある。 On the other hand, the person area 810 is a person area set for the person 812 out of the person 811 and the person 812 existing in a crowded area (such as the area 660 shown in FIG. 6), and has the same shape as the person area 800. Assuming that the extraction unit 204 extracts the whole-body image area of the person 812 and the whole-body image area of the person 811 as foreground areas, the generation unit 207 generates the silhouette image 813 by abstracting the foreground area in the person area 810 set for the person 812. Here, an example is shown in which the person 811 located behind the person 812 partly enters the person area 810, and as a result, part of the person 811 is generated as part of the silhouette image. As shown by the silhouette image 813, a silhouette image that abstracts the foreground area in a person area set for a person in a crowded area may display a shape that is unnatural for a person, which can make the image difficult for the user to view.

人物領域８２０は、混雑領域に存在する複数の人物８２１および人物８２２のうち人物８２２に対し設定部２０６により設定された人物領域である。また、人物領域８２０は人物領域８００および人物領域８１０と比較して、水平方向の幅を所定の倍率だけ縮小した人物領域の一例を示している。人物８２１の全身の画像領域と人物８２２の全身の画像領域とを前景領域として抽出部２０４により抽出された場合を想定したとき、生成部２０７は、人物８２２に対し設定された人物領域８２０における前景領域を抽象化してシルエット画像８２３を生成する。 The person area 820 is a person area set by the setting unit 206 for the person 822 among the plurality of people 821 and 822 existing in the crowded area. Furthermore, the human region 820 is an example of a human region whose horizontal width is reduced by a predetermined magnification compared to the human region 800 and the human region 810. Assuming that the extraction unit 204 extracts the whole body image area of the person 821 and the whole body image area of the person 822 as foreground areas, the generation unit 207 generates a foreground image area in the person area 820 set for the person 822. A silhouette image 823 is generated by abstracting the region.

人物領域８２０のように、混雑領域に存在する人物８２２に対しては、本実施形態に係る設定部２０６は、水平方向の幅を所定の倍率だけ縮小した人物領域を設定する。例えば、図７に示す判定部７０８は、画像における混雑領域を判定し、設定部２０６は、判定部７０８により判定された混雑領域に存在する人物の各々に対しては、水平方向の幅を所定の倍率だけ縮小した人物領域を設定する。このとき、例えば、判定部７０８が、画像を複数のエリアに分割し、エリア毎に検出された人物の数を計数し、人物の数が閾値を超えるエリアを混雑領域として判定する場合を想定する。この場合、設定部２０６は、混雑領域として判定されていないエリアにおいて検出された人物に対しては、実施形態１と同様に予め用意した人物領域を設定する。一方、設定部２０６は、混雑領域として判定されたエリアにおいて検出された人物に対しては、水平方向の幅を所定の倍率だけ縮小した人物領域を設定する。 For a person 822 existing in a crowded area, the setting unit 206 according to the present embodiment sets a person area whose horizontal width is reduced by a predetermined magnification, as with the person area 820. For example, the determination unit 708 shown in FIG. 7 determines a crowded area in the image, and the setting unit 206 sets, for each person existing in the crowded area determined by the determination unit 708, a person area whose horizontal width is reduced by a predetermined magnification. Here, assume for example that the determination unit 708 divides the image into a plurality of areas, counts the number of people detected in each area, and determines an area in which the number of people exceeds a threshold to be a crowded area. In this case, for a person detected in an area that has not been determined to be a crowded area, the setting unit 206 sets a person area prepared in advance, as in Embodiment 1. On the other hand, for a person detected in an area that has been determined to be a crowded area, the setting unit 206 sets a person area whose horizontal width is reduced by a predetermined magnification.

また、判定部７０８が、画像から抽出部２０４により抽出された前景領域の位置とサイズに基づいて、混雑領域を判定する場合を想定する。具体的には、判定部７０８が、抽出部２０４により抽出された前景領域のサイズが閾値以上である場合に当該前景領域を混雑領域として判定する場合を想定する。この場合、設定部２０６は、閾値以上のサイズの前景領域に位置する人物（例えば、当該前景領域に包含されるよう位置する人物）に対しては、水平方向に所定の倍率縮小した人物領域を設定する。 Further, assume that the determination unit 708 determines a crowded area based on the position and size of the foreground area extracted from the image by the extraction unit 204. Specifically, assume that the determination unit 708 determines a foreground area extracted by the extraction unit 204 to be a crowded area when the size of the foreground area is equal to or larger than a threshold. In this case, for a person located in a foreground area whose size is equal to or larger than the threshold (for example, a person positioned so as to be contained in that foreground area), the setting unit 206 sets a person area reduced by a predetermined magnification in the horizontal direction.

また、判定部７０８が、画像から抽出部２０４により抽出された前景領域の位置およびサイズと、画像から検出部２０５により検出された人物の数とに基づいて、画像における混雑領域を判定する場合を想定する。具体的には、判定部７０８が、例えば、画像から抽出されたサイズが第１閾値以上の前景領域、または、画像を複数のエリアに分割したときに検出された人物の数が第２閾値以上のエリア、を混雑領域として判定する場合を想定する。この場合、設定部２０６は、そのように判定した混雑領域において検出された人物に対し、水平方向の幅を所定の倍率だけ縮小した人物領域を設定する。 Further, assume that the determination unit 708 determines a crowded area in the image based on the position and size of the foreground area extracted from the image by the extraction unit 204 and the number of people detected from the image by the detection unit 205. Specifically, assume that the determination unit 708 determines, for example, a foreground area extracted from the image whose size is equal to or greater than a first threshold, or an area in which the number of detected people is equal to or greater than a second threshold when the image is divided into a plurality of areas, to be a crowded area. In this case, the setting unit 206 sets, for a person detected in the crowded area thus determined, a person area whose horizontal width is reduced by a predetermined magnification.

As described above, for a person present in a crowded area of the image determined by the determination unit 708 (such as the person 822 shown in FIG. 8), the setting unit 206 according to the present embodiment sets a person area reduced in the horizontal direction by a predetermined magnification. A silhouette image obtained by abstracting the foreground area in a person area set in this way may lose part of the person's original shape. However, such a silhouette image is less likely to contain parts of other persons located near that person. Furthermore, by performing the same processing for each of the plurality of persons present in the crowded area, each person in the image can be grasped individually more easily, even in an area crowded with many persons.

(Embodiment 5)
Embodiment 2 described varying, for each set person area, the display mode of the silhouette image obtained by abstracting the foreground area in that person area, so that the situation of the persons in the image is easy to grasp even when the scene is crowded with many persons.

The generation unit 207 in the present embodiment varies the display mode of the silhouette image obtained by abstracting the foreground area in the person area set for a person, according to the movement status of that person in the image.

The processing of the image processing apparatus 100 according to the present embodiment is described below. Parts that differ from Embodiments 1 to 4 are mainly described; components and processes that are the same as or equivalent to those of Embodiments 1 to 4 are given the same reference numerals, and redundant description is omitted.

In addition to detecting persons from an image, the detection unit 205 according to the present embodiment also tracks the persons included in the image. For example, when the determination unit 708 detects, in the image of the frame of interest, the same person as a person detected in the image of a frame one or more frames before the current frame, it associates the persons in the respective frames with each other. That is, a person is tracked across the images of a plurality of temporally close frames.

One method by which the detection unit 205 can determine that persons in the images of a plurality of frames are the same person uses the movement vector of a detected person: if the predicted position of the person and the position of a detected person are within a certain distance of each other, they are regarded as the same person. The detection unit 205 may also associate persons that are highly correlated between the images of a plurality of frames by using the color, shape, size (number of pixels), and the like of each person. In short, it is sufficient that the detection unit 205 can determine that a person is the same across the images of a plurality of frames and track that person; the method is not limited to a specific one.
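The movement-vector method can be sketched as a simple association step. This is a sketch under assumptions rather than the method of the embodiment: the function name `match_detections`, the greedy nearest-neighbour matching, and the data layout (track id mapped to a position and a per-frame movement vector) are all hypothetical.

```python
import math


def match_detections(tracks, detections, max_distance):
    """Associate tracked persons with detections in the current frame.

    tracks:     dict of track id -> ((x, y), (vx, vy)), where (vx, vy) is the
                movement vector per frame (hypothetical layout)
    detections: list of (x, y) positions detected in the current frame
    Returns a dict of track id -> index into detections.
    """
    matches = {}
    used = set()
    for track_id, (pos, vec) in tracks.items():
        # Predicted position = last position + movement vector.
        px, py = pos[0] + vec[0], pos[1] + vec[1]
        best_index, best_dist = None, max_distance
        for i, (dx, dy) in enumerate(detections):
            if i in used:
                continue
            dist = math.hypot(dx - px, dy - py)
            if dist <= best_dist:
                best_index, best_dist = i, dist
        if best_index is not None:
            # Within the allowed distance: regard as the same person.
            matches[track_id] = best_index
            used.add(best_index)
    return matches
```

A detection that is not within `max_distance` of any predicted position is left unmatched and could be treated as a newly appearing person.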

The determination unit 708 according to the present embodiment determines the movement status of a person tracked by the detection unit 205. The movement status of a person includes, for example, the person's moving speed and whether or not the person is moving. First, assume that the determination unit 708 determines the moving speed of a person tracked by the detection unit 205. In this case, the determination unit 708 determines the moving speed of the person based on the frame rate at which the imaging device 110 captures images and on the change in the position of the tracked person in the image.

Next, assume that the determination unit 708 determines whether or not a person tracked by the detection unit 205 is moving. In this case, the determination unit 708 compares the position of the person in the current frame with the position of the person in the immediately preceding frame and calculates the distance moved by the person. The determination unit 708 then compares the calculated distance with a threshold to determine whether or not the person is moving. Specifically, the determination unit 708 determines that the person is moving when the calculated distance is equal to or greater than the threshold, and that the person is not moving when the calculated distance is less than the threshold.
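Both determinations reduce to a distance computation between two tracked positions. The sketch below assumes positions in pixel coordinates and a speed expressed in pixels per second; the function names and the `frames_apart` parameter are hypothetical.

```python
import math


def moving_speed(prev_pos, cur_pos, frame_rate, frames_apart=1):
    # Distance moved in the image, converted to pixels per second
    # using the capture frame rate.
    distance = math.hypot(cur_pos[0] - prev_pos[0], cur_pos[1] - prev_pos[1])
    return distance * frame_rate / frames_apart


def is_moving(prev_pos, cur_pos, distance_threshold):
    # Moving if the distance is equal to or greater than the threshold,
    # not moving if it is less than the threshold.
    distance = math.hypot(cur_pos[0] - prev_pos[0], cur_pos[1] - prev_pos[1])
    return distance >= distance_threshold
```

For instance, a person who moves 5 pixels between consecutive frames captured at 30 frames per second has a speed of 150 pixels per second.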

As another method of determining whether or not a person is moving, a method based on inter-frame differences between a plurality of input images may be used. For example, the determination unit 708 determines whether or not a person is moving based on the person area and the size of a moving-object area identified by the inter-frame difference. Specifically, for example, the determination unit 708 determines that the person is moving when the overlap ratio, that is, the proportion in which the person area set for the person overlaps the moving-object area, is equal to or greater than a threshold. Conversely, the determination unit 708 determines that the person is not moving when the overlap ratio between the person area set for the person and the moving-object area is less than the threshold. As a method of identifying the moving-object area, the foreground area extracted by the extraction unit 204 using the background image may be identified as the moving-object area. The process of determining whether or not a person is moving may also be performed for each of the divided areas obtained by dividing the image into a plurality of areas.
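The overlap-ratio test can be sketched with a binary moving-object mask. The description does not define the denominator of the ratio; this sketch assumes it is the size of the person area, and the function names are hypothetical.

```python
def overlap_ratio(person_area, moving_mask):
    """Fraction of the person area covered by the moving-object area.

    person_area: (x, y, width, height) in integer pixel coordinates
    moving_mask: list of rows, each a list of 0/1 values (1 = moving pixel)
    Assumption: the ratio is taken relative to the size of the person area.
    """
    x, y, width, height = person_area
    moving = sum(
        moving_mask[row][col]
        for row in range(y, y + height)
        for col in range(x, x + width)
    )
    return moving / (width * height)


def is_moving_by_overlap(person_area, moving_mask, ratio_threshold):
    # Moving if the overlap ratio is equal to or greater than the threshold.
    return overlap_ratio(person_area, moving_mask) >= ratio_threshold
```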

The generation unit 207 according to the present embodiment varies the display mode of the silhouette image obtained by abstracting the foreground area in the person area set for a person, based on the movement status of that person determined by the determination unit 708. For example, the generation unit 207 varies the display mode of the silhouette image based on the moving speed of the person determined by the determination unit 708. As another example, the generation unit 207 varies the display mode of the silhouette image based on whether or not the person is moving, as determined by the determination unit 708.

As a method of varying the display mode of the silhouette image obtained by abstracting the foreground area in a person area according to the movement status, the following processing may be performed, for example. When filling the foreground area in the person area set for a moving person with a predetermined color, the generation unit 207 selects the color from among warm colors according to the moving speed. When filling the foreground area in the person area set for a person who is not moving with a predetermined color, the generation unit 207 selects the color from among cool hues. In this way, a broad color category (warm colors, cool colors, and so on) may be assigned according to whether or not the person is moving, and the color within each broad category may then be varied according to the moving speed. The method of varying the display mode of the silhouette image is not limited to a specific method; for example, the method of varying the display mode of the silhouette image described in Embodiment 2 may be used.
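One way to realize such a color assignment is to pick a hue from a warm or cool range and convert it to RGB. The hue ranges (0 to 60 degrees for warm, 180 to 240 degrees for cool), the `max_speed` normalization, and varying the cool hue by a person index are all assumptions made for this sketch; `colorsys.hsv_to_rgb` is from the Python standard library.

```python
import colorsys


def silhouette_color(moving, speed=0.0, max_speed=1.0, person_index=0):
    """Return an (R, G, B) fill color for a silhouette image."""
    if moving:
        # Warm range: yellow (60 degrees) for slow, red (0 degrees) for fast.
        t = min(max(speed / max_speed, 0.0), 1.0)
        hue_deg = 60.0 * (1.0 - t)
    else:
        # Cool range: cyan (180 degrees) toward blue (240 degrees),
        # varied per person (an assumption of this sketch).
        hue_deg = 180.0 + (person_index % 4) * 20.0
    r, g, b = colorsys.hsv_to_rgb(hue_deg / 360.0, 1.0, 1.0)
    return (round(r * 255), round(g * 255), round(b * 255))
```

With these ranges, the fastest moving person is drawn in pure red and a stationary person with index 0 in cyan.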

As described above, the generation unit 207 of the image processing apparatus 100 according to the present embodiment varies the display mode of the silhouette image obtained by abstracting the foreground area in the person area set for a person, according to the movement status of that person in the image. Varying the display mode of the silhouette image according to the movement status of the person in this way makes the behavior of the persons in the image easier to grasp.

(Embodiment 6)
In each of the embodiments described above, even when persons overlap one another as described with reference to FIG. 6, varying the display form of each individual person makes the situation of the persons in the image easy to grasp while protecting their privacy.

The present embodiment describes a method of generating an output image that, when persons overlap one another, allows individual persons to be distinguished more clearly, while protecting the privacy of the persons in the image and making their situation easy to grasp. Parts that differ from Embodiments 1 to 5 are mainly described; components and processes that are the same as or equivalent to those of Embodiments 1 to 5 are given the same reference numerals, and redundant description is omitted.

FIG. 9 is a diagram for explaining the image processing according to the present embodiment. FIG. 9(a) shows an example of an input image captured by the imaging device 110, in which persons 900 to 903 appear as subjects.

FIG. 9(b) shows an example of a background image used in the present embodiment. The person area 990 shown in FIG. 9(c) is the person area in the present embodiment; it is large enough to fully contain a person detected by the detection unit 205 and is set by the setting unit 206 at the position of the detected person.

FIG. 9(d) is an output image in which silhouette images generated by the generation unit 207 through the image processing according to Embodiment 1 are superimposed on the background image. Specifically, the setting unit 206 sets a person area 990 for each person detected from the input image shown in FIG. 9(a). The generation unit 207 then generates the output image by superimposing, on the background image shown in FIG. 9(b), the silhouette image obtained by abstracting the foreground area in each person area 990 set by the setting unit 206. The output image generated by the generation unit 207 in this way is the image shown in FIG. 9(d).

FIG. 9(d) shows an example of an output image in which the silhouette images 905 to 907 of overlapping persons are given different display forms by using a different color for each. As shown in FIG. 9(d), this makes it possible to generate an output image in which each silhouette image can be identified, not only the silhouette image 904 corresponding to the person 900 who is alone, but also the silhouette images 905 to 907 corresponding to the persons 901 to 903 present in a crowded area.

When the output image shown in FIG. 9(d) is displayed on the display 130, similar colors may be visually indistinguishable depending on the display performance of the display 130. In that case, in an image showing many persons, the color difference between overlapping persons may be hard to see. In addition, when display forms are explicitly differentiated by color, combinations of visually distinct colors must be used, which limits the number of usable colors. If the image shows more persons than there are usable colors, the colors run out, and the same color may be applied to overlapping persons so that they can no longer be distinguished. The present embodiment therefore describes a configuration that makes it possible to distinguish persons even more clearly.

FIG. 9(e) shows a preferable example of an output image of the present embodiment. In FIG. 9(e), in addition to varying the display form for each person area, only those portions of the outer edge of a person area that overlap the foreground area are displayed in a display form different from that of the interior of the person area.

The person area 990 in the present embodiment is larger than the actual size of a detected person. Therefore, for example, when a person area 990 is set for the person 900 detected from the image, the foreground area extracted from the person 900 does not overlap the outer edge of the person area 990 set for the person 900. In contrast, when a person area 990 is set for each of the persons 901 to 903, who overlap one another, there are lines along which the foreground area extracted from the persons 901 to 903 overlaps the outer edges of the person areas 990 set for the respective persons. The line 908 shown in FIG. 9(e) is a line along which the foreground area extracted from the persons 901 to 903 overlaps the outer edge of the person area 990 set for the person 902 (hereinafter, an overlapping line). Similarly, the line 909 is an overlapping line along which the foreground area extracted from the persons 901 to 903 overlaps the outer edge of the person area 990 set for the person 903. The generation unit 207 in the present embodiment makes the color of the overlapping lines 908 and 909 different from the color of the silhouette images. For example, the generation unit 207 sets the color of the lines 908 and 909 to black and the color of the silhouette images 904 to 907 to a color other than black. Explicitly indicating the lines 908 and 909, which mark where persons overlap, in this way makes it easy to distinguish the persons from one another even when a plurality of persons overlap.
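An overlapping line can be found by checking which pixels on the outer edge of a person area are also foreground pixels. The sketch below assumes a rectangular person area and a binary foreground mask; the function name `overlapping_line_pixels` is hypothetical.

```python
def overlapping_line_pixels(foreground, person_area):
    """Return the outer-edge pixels of a person area that lie on the foreground.

    foreground:  list of rows, each a list of 0/1 values (1 = foreground)
    person_area: (x, y, width, height) in integer pixel coordinates
    Returns a sorted list of (row, col) positions.
    """
    x, y, width, height = person_area
    edge = set()
    for col in range(x, x + width):
        edge.add((y, col))                 # top edge
        edge.add((y + height - 1, col))    # bottom edge
    for row in range(y, y + height):
        edge.add((row, x))                 # left edge
        edge.add((row, x + width - 1))     # right edge
    return sorted((r, c) for r, c in edge if foreground[r][c])
```

When the foreground lies entirely inside the person area, as for the isolated person 900, the result is empty; when the foreground continues past the edge because another person overlaps, the returned pixels are the ones that would be drawn in a color different from the silhouette.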

Although the person area 990 in the present embodiment has been described as large enough to fully contain a person in the image, the present invention is not limited to this. For example, assume that, in a state where many persons are present, a person area whose horizontal width is reduced by a predetermined magnification is set as described in Embodiment 4. In such a case, even for the person 900, who does not overlap any other person, setting a person area whose horizontal width is reduced by the predetermined magnification produces an overlapping line where the foreground area extracted from the person 900 overlaps the outer edge of the person area. The generation unit 207 in the present embodiment displays this overlapping line in a color different from that of the silhouette image. For example, the generation unit 207 sets the overlapping line to black and the silhouette image to a color other than black. The display form of the overlapping lines may be the same for every person (person area) or may differ for each person (person area). Alternatively, the same display form may be applied to persons who can be classified into the same group, for example by whether or not they are moving or by their direction of movement, with a different display form for each group.

Furthermore, the above description showed an example in which the interior color is switched for each person area and, in addition, only the overlapping lines on the outer edge of each person area are given a different display form. However, when distinguishing persons by the overlapping lines on the outer edges of the person areas is sufficient, the interior color need not be switched for each person area, and the same color may be used. Alternatively, based on some classification, the same color may be applied to the interior of the person areas of persons belonging to the same group, with a different display form for each group.

(Embodiment 7)
In some cases, a user wants to judge from a still image of a station platform or the like whether it is likely to be possible to board a train. For example, when an accident or disaster occurs and the platforms or the station are overflowing with people, there is a demand to check whether one would be able to board a train on going to the station. In such a case, it can be difficult to judge whether boarding is possible from a single still image containing silhouette images that abstract the persons in the image.

Even when many persons are present in an image, the user can judge that boarding a train is likely to be possible if the user can tell that those persons are flowing. From a single still image, however, the user cannot tell whether they are flowing, and therefore cannot judge whether boarding is likely to be possible. One approach is to vary the color of each silhouette image according to whether or not the person is moving, which lets the user tell from a single still image whether the persons are flowing. However, when many persons are present in the image, the persons overlap one another and motion detection may fail; as a result, it cannot be determined whether the persons are flowing, and the user may be unable to judge whether boarding is likely to be possible. The present embodiment therefore aims to generate an output image from which it can easily be judged whether boarding a train at a station or the like is likely to be possible, even from a single still image in which persons are abstracted. The image processing in the present embodiment is described below. Parts that differ from Embodiments 1 to 6 are mainly described; components and processes that are the same as or equivalent to those of Embodiments 1 to 6 are given the same reference numerals, and redundant description is omitted.

First, the image processing of the image processing apparatus 100 according to the present embodiment is described with reference to the functional blocks shown in FIG. 10. In the present embodiment, each function shown in FIG. 10 is realized as follows, using the ROM 1420 and the CPU 1400 described later with reference to FIG. 14. Each function shown in FIG. 10 is realized by the CPU 1400 of the image processing apparatus 100 executing a computer program stored in the ROM 1420 of the image processing apparatus 100.

The determination unit 708 determines the motion of objects in images captured continuously by the imaging device 110. For example, the determination unit 708 determines the amount of motion in the image based on the inter-frame difference between input images received continuously by the communication unit 200. For example, the determination unit 708 calculates the difference between each pixel value of the previous frame and the corresponding pixel value of the current frame, treats pixels whose difference is equal to or greater than a threshold as moving pixels, and takes the total number of moving pixels as the amount of motion. The amount of motion may be determined for the image as a whole. Alternatively, the image may be divided into a plurality of areas and the amount of motion determined for each divided area. Furthermore, instead of the amount of motion, the presence or absence of motion may simply be determined, as described in Embodiment 5. For example, the determination unit 708 tracks an object included in the image and calculates the distance moved by the object from the previous frame to the current frame. The determination unit 708 then compares the calculated distance with a threshold and determines that the object is moving if the distance is equal to or greater than the threshold, and that it is not moving otherwise.
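The amount of motion based on the inter-frame difference can be sketched as a count of pixels whose value changed by at least a threshold. The sketch assumes single-channel frames stored as lists of rows; the function name `motion_amount` is hypothetical.

```python
def motion_amount(prev_frame, cur_frame, diff_threshold):
    """Count pixels whose absolute inter-frame difference meets the threshold.

    prev_frame, cur_frame: lists of rows of grayscale pixel values
    (same dimensions are assumed).
    """
    return sum(
        1
        for prev_row, cur_row in zip(prev_frame, cur_frame)
        for prev_px, cur_px in zip(prev_row, cur_row)
        if abs(cur_px - prev_px) >= diff_threshold
    )
```

Applying the same function to each divided area of the two frames gives the per-area amount of motion mentioned above.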

The determination unit 708 may also determine the motion of a specific object by, for example, tracking the position of the specific object detected by the detection unit 205 across the input images received continuously by the communication unit 200. Motion may be determined for, for example, all of the specific objects appearing in the image. Alternatively, motion may be determined for a representative object selected for each predetermined image position, and the result applied to the other specific objects appearing around the representative object. Alternatively, the image may be divided into a plurality of areas, one or more representative objects selected within each divided area, motion determined for them, and the result applied to the other specific objects appearing in that divided area. Applying the motion of a representative object to the surrounding specific objects in this way makes it possible to reduce the amount of tracking and motion-determination processing.

Based on the determination result of the determination unit 708, the adjustment unit 1000 adjusts the number of specific objects, among those detected by the detection unit 205, that are to be abstracted into silhouette images (abstraction targets). For example, if the amount of motion in the entire image obtained from the determination result of the determination unit 708 is equal to or greater than a predetermined value, only some of the persons detected by the detection unit 205, rather than all of them, are treated as abstraction targets. In other words, the adjustment unit 1000 reduces the number of persons to be abstracted.

As a method of reduction, for example, an upper limit is set based on the amount of motion, and when the number of persons detected by the detection unit 205 exceeds the upper limit, the number of persons to be rendered as silhouette images is reduced to the upper limit. The upper limit is determined such that the larger the amount of motion, the smaller the upper limit. When the amount of motion is equal to or less than a predetermined value, no upper limit need be set.
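An upper limit that decreases with the amount of motion can be sketched as follows. The description states only that a larger amount of motion yields a smaller upper limit; the linear, stepwise decrease and the parameter names (`motion_floor`, `base_limit`, `pixels_per_step`, `min_limit`) are assumptions of this sketch.

```python
def upper_limit(motion, motion_floor, base_limit, pixels_per_step, min_limit=1):
    """Return the upper limit of abstraction targets, or None for no limit."""
    if motion <= motion_floor:
        # At or below the predetermined amount of motion: no upper limit.
        return None
    # Assumption: lower the limit by one for every `pixels_per_step`
    # moving pixels above the floor.
    limit = base_limit - (motion - motion_floor) // pixels_per_step
    return max(min_limit, limit)


def cap_targets(people, limit):
    # Keep at most `limit` persons as abstraction targets.
    people = list(people)
    return people if limit is None else people[:limit]
```

Here `cap_targets` simply keeps the first persons in the list; which persons to exclude is a separate choice.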

As other preferable examples of how the adjustment unit 1000 reduces the specific objects to be abstracted, other methods may be applied, such as reducing them by a predetermined proportion or reducing them uniformly to a predetermined number. As for how the adjustment unit 1000 determines the objects that are not to be abstracted (non-abstraction targets), it is preferable to determine them so that their positions in the image are dispersed as much as possible. For example, representative objects are set in order from objects on the near side of the image, which are considered to have a large visual impact, toward distant objects, and the object closest to each representative object is treated as a non-abstraction target. Repeating this until the number of objects to be removed is reached makes it possible to reduce the number of abstraction targets so that the objects in the image are thinned out evenly. Alternatively, for example, a plurality of specific objects that are close to one another are searched for, and a non-abstraction target is determined from among those objects. Repeating this until the number of objects to be removed is reached selects non-abstraction targets from areas of the image where the object density is high, so the number of abstraction targets can be reduced in a way that lessens the unevenness of the object density in the image.
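The density-based selection can be sketched by repeatedly finding the closest pair of remaining objects and excluding one of them. This is a sketch under assumptions: the function name is hypothetical, and excluding the second object of the closest pair is an arbitrary choice, since the description does not say which of the nearby objects becomes the non-abstraction target. `math.dist` requires Python 3.8 or later.

```python
import math
from itertools import combinations


def pick_non_abstraction_targets(positions, remove_count):
    """Choose objects to exclude from abstraction, starting in dense areas.

    positions:    list of (x, y) object positions
    remove_count: number of objects to exclude
    Returns the indices of the excluded objects in selection order.
    """
    remaining = set(range(len(positions)))
    removed = []
    while len(removed) < remove_count and len(remaining) >= 2:
        # Find the closest pair among the objects still to be abstracted.
        _, j = min(
            combinations(sorted(remaining), 2),
            key=lambda pair: math.dist(positions[pair[0]], positions[pair[1]]),
        )
        # Assumption: drop the second object of the closest pair.
        removed.append(j)
        remaining.discard(j)
    return removed
```

Because the closest pair is recomputed after each removal, objects are removed from the densest cluster first, and isolated objects are kept.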

Furthermore, as a method by which the adjustment unit 1000 reduces the objects to be abstracted or determines the non-abstraction targets, the image may, for example, be divided into a plurality of areas and the processing described above applied to each divided area. This makes it possible to reduce the objects detected in a specific image area differently from those in other image areas, so that the number of objects to be abstracted can be adjusted to suit the captured scene. The method by which the adjustment unit 1000 adjusts the number of objects is not limited to those described above, and may be realized by other methods that produce a similar effect, such as selecting objects at random using random numbers.

生成部２０７は、調整部１０００による調整の結果に基づき、抽象化対象とされた特定の物体に対応する領域を抽象化したシルエット画像を生成する。本実施形態において、生成部２０７は、抽象化対象とされた特定の物体に対し設定された特定領域における前景領域を抽象化したシルエット画像を生成する。そして、生成部２０７は、生成したシルエット画像を背景画像に重畳することで出力画像を生成する。 The generation unit 207 generates a silhouette image in which a region corresponding to the specific object to be abstracted is abstracted based on the result of the adjustment by the adjustment unit 1000. In this embodiment, the generation unit 207 generates a silhouette image in which a foreground area in a specific area set for a specific object to be abstracted is abstracted. The generation unit 207 then generates an output image by superimposing the generated silhouette image on the background image.

次に、図１１および図１２を参照して、本実施形態に係る画像処理装置１００の画像処理について更に詳細に説明する。図１１は、本実施形態に係る画像処理装置１００の画像処理の流れを示すフローチャートである。また、図１２は、本実施形態に係る画像処理装置１００の画像処理を説明するための図である。 Next, image processing by the image processing apparatus 100 according to this embodiment will be described in more detail with reference to FIGS. 11 and 12. FIG. 11 is a flowchart showing the flow of image processing by the image processing apparatus 100 according to this embodiment. Further, FIG. 12 is a diagram for explaining image processing of the image processing apparatus 100 according to the present embodiment.

なお、図１１に示すフローを実行することで、設定された人物領域における前景領域を抽象化したシルエット画像を、抽象化対象の数の調整を行った後に背景画像に重畳した出力画像を生成することができる。なお、図１１に示すフローの処理は、例えば、ユーザによる指示に従って、開始又は終了するものとする。なお、図１１に示すフローチャートの処理は、撮像装置１１０のＲＯＭ１４２０に格納されたコンピュータプログラムを撮像装置１１０のＣＰＵ１４００が実行して実現される図２に示す機能ブロックにより実行されるものとする。 Note that by executing the flow shown in FIG. 11, it is possible to generate an output image in which a silhouette image, obtained by abstracting the foreground region in the set person region, is superimposed on the background image after the number of abstraction targets has been adjusted. The processing of the flow shown in FIG. 11 is started or ended, for example, according to an instruction from a user. The processing in the flowchart shown in FIG. 11 is executed by the functional blocks shown in FIG. 2, which are realized by the CPU 1400 of the imaging device 110 executing a computer program stored in the ROM 1420 of the imaging device 110.

まず、Ｓ１１０１にて、通信部２００は、撮像装置１１０が撮像した画像（入力画像）を受信する。図１２（ａ）は、通信部２００が受信する入力画像の一例を示す図である。図１２（ａ）に示す画像には、複数の人物が含まれている。 First, in S1101, the communication unit 200 receives an image (input image) captured by the imaging device 110. FIG. 12A is a diagram showing an example of an input image received by the communication unit 200. The image shown in FIG. 12(a) includes a plurality of people.

次に、Ｓ１１０２にて、抽出部２０４は、背景画像を取得する。図１２（ｂ）は、Ｓ１１０２にて抽出部２０４により取得される背景画像の一例を示す図であり、図１２（ａ）に示す画像との差異は、人物１２０１～１２０９の有無である。 Next, in S1102, the extraction unit 204 acquires a background image. FIG. 12(b) is a diagram showing an example of a background image acquired by the extraction unit 204 in S1102, and the difference from the image shown in FIG. 12(a) is the presence or absence of people 1201 to 1209.

次に、Ｓ１１０３にて、抽出部２０４は、撮像された入力画像から前景領域を抽出する。本実施形態における抽出部２０４は、通信部２００が受信した入力画像と背景画像とを比較することにより、前景領域を抽出する。図１２（ｃ）は、Ｓ１１０３にて、抽出部２０４により、図１２（ａ）に示す入力画像と、図１２（ｂ）に示す背景画像とを比較することにより抽出された前景領域を含む画像である。図１２（ｃ）に示す前景領域は黒色で示されている。 Next, in S1103, the extraction unit 204 extracts a foreground region from the captured input image. The extraction unit 204 in this embodiment extracts the foreground region by comparing the input image received by the communication unit 200 with the background image. FIG. 12(c) is an image including the foreground region that the extraction unit 204 extracted in S1103 by comparing the input image shown in FIG. 12(a) with the background image shown in FIG. 12(b). The foreground region in FIG. 12(c) is shown in black.
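As a rough illustration of extracting a foreground region by comparing the input image with the background image, the following sketch applies a per-pixel absolute difference and a threshold. The grayscale list-of-lists image representation, the function name `extract_foreground`, and the threshold value are assumptions; the text only states that the two images are compared.

```python
def extract_foreground(input_image, background_image, threshold=30):
    """Return a binary mask where 1 marks a foreground pixel.

    Both images are lists of rows of grayscale values (0 to 255) with the
    same dimensions. The threshold of 30 is an arbitrary illustrative value.
    """
    mask = []
    for in_row, bg_row in zip(input_image, background_image):
        mask.append([
            1 if abs(p - q) > threshold else 0
            for p, q in zip(in_row, bg_row)
        ])
    return mask
```

Pixels whose value differs from the background by more than the threshold are marked as foreground, which corresponds to the black region of FIG. 12(c) under these assumptions.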

次に、Ｓ１１０４にて、検出部２０５は、照合パターン（辞書）を用いて、入力画像から人物を検出する。図１２に示す例において、検出部２０５は、図１２（ａ）に示す入力画像から、人物１２０１～１２０９を検出する。 Next, in S1104, the detection unit 205 detects a person from the input image using a matching pattern (dictionary). In the example shown in FIG. 12, the detection unit 205 detects people 1201 to 1209 from the input image shown in FIG. 12(a).

次に、Ｓ１１０５にて、判定部７０８は、連続して受信した複数の入力画像から検出された人物の動き量を判定する。ここで、駅などにおいて通常の場合であれば人が歩行している場面を撮像すると、人物の動きが検出される。一方、駅に入場規制がかかっている場合、多くの人が動けない状況で行列を形成することになるため、検出部２０５では、所定未満の人物の動きしか検出されない。このように、画像内の動きの量を人の流れが滞留している可能性を示唆している情報（滞留可能性情報）であるとする。 Next, in S1105, the determination unit 708 determines the amount of movement of the person detected from the plurality of input images that are received continuously. Here, if an image of a scene where a person is normally walking at a station or the like is captured, the movement of the person is detected. On the other hand, if entry is restricted to a station, many people will form a line without being able to move, so the detection unit 205 will only detect the movement of people less than a predetermined amount. In this way, the amount of movement in the image is assumed to be information indicating the possibility that the flow of people is stagnant (retention possibility information).
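The text does not specify how the amount of movement is computed. One simple possibility, shown here purely as an assumption, is the mean displacement of persons that were detected in two consecutive frames. The per-person identifiers and the function name `mean_motion` are hypothetical and imply some form of tracking that the text does not describe.

```python
import math


def mean_motion(prev_positions, curr_positions):
    """Average displacement of persons present in both frames.

    prev_positions, curr_positions: dict mapping a hypothetical person id
    to an (x, y) position in the previous and current frame.
    """
    shared = prev_positions.keys() & curr_positions.keys()
    if not shared:
        return 0.0
    total = sum(math.dist(prev_positions[k], curr_positions[k]) for k in shared)
    return total / len(shared)
```

A small value then suggests a queue that cannot move, while a larger value suggests normal walking.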

次に、Ｓ１１０６にて、調整部１０００は、Ｓ１１０５において判定した人物の動きに基づいて、検出された人物１２０１～１２０９のうち抽象化対象とする人物の数の調整を行う。例えば、調整部１０００は、Ｓ１１０５で判定された動き量が所定値以上であれば、検出部２０５で検出された人物の全てを抽象化対象とするのでなく、一部の人物を抽象化対象とする。 Next, in S1106, the adjustment unit 1000 adjusts the number of persons to be abstracted among the detected persons 1201 to 1209, based on the movement of the persons determined in S1105. For example, if the amount of movement determined in S1105 is equal to or greater than a predetermined value, the adjustment unit 1000 sets only some of the persons detected by the detection unit 205 as abstraction targets, rather than all of them.

調整部１０００が行う調整内容の具体例としては、滞留が発生している可能性が低い場合には、後述する出力画像において人が動ける空間が十分にあることを視覚的にわかりやすくするために、抽象化対象とする人物の削減を行う。このようにすることで、混雑が発生しているものの、滞留状態にはなっていないことを、後述する出力画像に反映させることが可能となる。 As a specific example of the adjustment performed by the adjustment unit 1000, when stagnation is unlikely to have occurred, the number of persons to be abstracted is reduced so that the output image described later makes it visually clear that there is enough space for people to move. This makes it possible to reflect in the output image described later that, although the area is crowded, it has not reached a state of stagnation.

一方、滞留が発生している可能性が高い場合には、後述する出力画像において実際の状況を視覚的にわかりやすくするため、抽象化対象とする人物の削減を行わないようにする。或いは、滞留の発生している可能性に応じて削減割合を変更する。ここで、判定部７０８において、滞留が発生している可能性が高いと判断された場合であっても、実際には混雑が発生しているとは限らない。しかし、この場合であっても抽象化対象とする人物が削減されないだけであり、混雑が発生していない実際の状況を、後述する出力画像に反映させることが可能となる。 On the other hand, when stagnation is likely to have occurred, the persons to be abstracted are not reduced, so that the output image described later conveys the actual situation in a visually clear way. Alternatively, the reduction ratio is changed according to the likelihood that stagnation has occurred. Note that even when the determination unit 708 determines that stagnation is likely, congestion is not necessarily occurring in reality. Even in that case, however, the only consequence is that the persons to be abstracted are not reduced, so the actual situation, in which no congestion is occurring, can still be reflected in the output image described later.
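The idea of changing the reduction ratio according to the likelihood of stagnation can be sketched as a simple mapping from the amount of movement to a ratio. The linear ramp, the thresholds `low` and `high`, the ceiling `max_ratio`, and both function names are assumptions chosen for illustration; the text only says that the ratio may depend on the likelihood.

```python
def reduction_ratio(motion_amount, low=1.0, high=5.0, max_ratio=0.5):
    """Map an amount of movement to the fraction of persons to drop.

    Little movement (stagnation likely) gives 0.0, so nobody is removed.
    Large movement (stagnation unlikely) gives max_ratio.
    """
    if motion_amount <= low:
        return 0.0
    if motion_amount >= high:
        return max_ratio
    return max_ratio * (motion_amount - low) / (high - low)


def num_abstraction_targets(num_detected, motion_amount):
    """Number of persons kept as abstraction targets after the adjustment."""
    return num_detected - int(num_detected * reduction_ratio(motion_amount))
```

With nine detected persons, as in FIG. 12, these illustrative defaults keep all nine when people barely move and keep five when movement is large.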

次に、Ｓ１１０７にて、設定部２０６は、検出部２０５により検出され、更に調整部１０００において人数調整が行われた結果得られた人物の位置、及び大きさに基づいて、人物に対応する形状の領域である人物領域を設定する。つまり、設定部２０６は、抽象化対象とされた人物に対し人物領域を設定する。 Next, in S1107, the setting unit 206 sets a person region, which is a region having a shape corresponding to a person, based on the position and size of each person detected by the detection unit 205 and remaining after the adjustment of the number of people by the adjustment unit 1000. That is, the setting unit 206 sets a person region for each person who is an abstraction target.

次に、Ｓ１１０８にて、生成部２０７は、設定部２０６により設定された人物領域における前景領域を抽象化したシルエット画像を生成する。 Next, in S1108, the generation unit 207 generates a silhouette image that abstracts the foreground area in the person area set by the setting unit 206.
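The silhouette generation of S1108, together with the superimposition onto the background image, can be sketched as follows. Only foreground pixels that fall inside a person region are painted, and foreground outside every person region is left as background. Modeling each person region as an ellipse `(cx, cy, rx, ry)`, the single fill value, and the function names are assumptions; the text only requires a region with a shape corresponding to a person.

```python
def in_ellipse(x, y, cx, cy, rx, ry):
    """True if pixel (x, y) lies inside the ellipse centered at (cx, cy)."""
    return ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1.0


def compose_output(background, fg_mask, person_regions, silhouette_value=0):
    """Paint foreground pixels inside person regions onto a background copy.

    background: list of rows of grayscale values.
    fg_mask: binary mask of the same size (1 = foreground).
    person_regions: list of (cx, cy, rx, ry) tuples (hypothetical format).
    """
    out = [row[:] for row in background]
    for y, row in enumerate(fg_mask):
        for x, is_fg in enumerate(row):
            if is_fg and any(in_ellipse(x, y, *r) for r in person_regions):
                out[y][x] = silhouette_value
    return out
```

Because the input pixels themselves are never copied, a falsely extracted foreground blob that lies outside every person region does not appear in the output.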

次に、Ｓ１１０９にて、生成部２０７は、Ｓ１１０８において生成したシルエット画像を、背景画像に重畳した出力画像を生成する。図１２（ｄ）に示す画像は、調整部１０００において人数の調整が行われなかった場合に、生成部２０７において生成される出力画像の例であり、検出部２０５において検出された人物の動きが少なかった場合に対応している。一方、図１２（ｅ）は、調整部１０００で人数の調整を行った場合に生成部２０７において生成される出力画像の例であり、検出部２０５において検出された人物の動きが多かった場合に対応している。 Next, in S1109, the generation unit 207 generates an output image in which the silhouette image generated in S1108 is superimposed on the background image. The image shown in FIG. 12(d) is an example of the output image generated by the generation unit 207 when the adjustment unit 1000 did not adjust the number of people, and corresponds to the case where the persons detected by the detection unit 205 moved little. FIG. 12(e), on the other hand, is an example of the output image generated by the generation unit 207 when the adjustment unit 1000 adjusted the number of people, and corresponds to the case where the persons detected by the detection unit 205 moved a lot.

図１２（ｅ）に示すように、抽象化対象とする人物の数の調整が行われた結果人物が間引かれた場合、シルエット画像中の人物間に隙間が多くなる。その結果、図１２（ｅ）に示す画像からは、あまり混雑していない様子を表示することが可能になる。図１２（ｅ）に示す画像は実際の検出人数を反映していないものの、撮像領域は人が動くことが可能であることを一目でわかりやすい形で閲覧者に提示することが可能となるため、例えば駅に入場できないような状況にはなっていないことが容易に判断可能となる。 As shown in FIG. 12E, when people are thinned out as a result of adjusting the number of people to be abstracted, there are many gaps between the people in the silhouette image. As a result, from the image shown in FIG. 12(e), it becomes possible to display an appearance that is not very crowded. Although the image shown in FIG. 12(e) does not reflect the actual number of people detected, it is possible to present to the viewer in a form that is easy to understand at a glance that people can move in the imaging area. For example, it can be easily determined that the situation is not such that it is impossible to enter the station.

一方、図１２（ｄ）に示すように、抽象化対象とする人物の数の調整が行われた結果人物が間引かれなかった場合、シルエット画像中の人物間の隙間は検出部２０５の検出結果に基づいて決定される。ここで、調整部１０００において人数の変更が行われなかった場合であっても、撮像範囲内にいる人数が少なかった場合には検出部２０５で検出される人数は少ないため、結果として人物間に隙間の多いシルエット画像が生成されることになる。一方、撮像範囲内に多くの人物が存在していた場合、検出部２０５で検出される人数が多くなり、結果として人物間に隙間が少ないシルエット画像が生成されることになる。人物間に隙間が少ない画像が閲覧者に提示された場合、閲覧者は混雑している様子をシルエット画像から確認することができる。 On the other hand, as shown in FIG. 12(d), when no persons are thinned out as a result of the adjustment of the number of persons to be abstracted, the gaps between persons in the silhouette image are determined by the detection result of the detection unit 205. Even when the adjustment unit 1000 does not change the number of people, if few people are within the imaging range, the detection unit 205 detects few people, and a silhouette image with many gaps between persons is generated as a result. Conversely, if many people are within the imaging range, the detection unit 205 detects many people, and a silhouette image with few gaps between persons is generated. When an image with few gaps between persons is presented, the viewer can confirm from the silhouette image that the area is crowded.

次に、Ｓ１１１０にて、表示制御部２０２は、生成部２０７により生成された出力画像を出力する。なお、本実施形態において、表示制御部２０２は、生成部２０７により生成された出力画像をディスプレイ１３０に表示させる。 Next, in S1110, the display control unit 202 outputs the output image generated by the generation unit 207. Note that in this embodiment, the display control unit 202 causes the display 130 to display the output image generated by the generation unit 207.

次に、Ｓ１１１１にて、ユーザにより処理を終了する指示がある場合（Ｓ１１１１にてＹｅｓ）、処理を終了する。一方、ユーザにより処理を終了する指示がない場合（Ｓ１１１１にてＮｏ）、Ｓ１１０１へ戻り、通信部２００は次のフレームの画像（入力画像）を受信する。 Next, in S1111, if the user instructs to end the process (Yes in S1111), the process ends. On the other hand, if there is no instruction from the user to end the process (No in S1111), the process returns to S1101, and the communication unit 200 receives the next frame image (input image).

なお上述の説明においては、背景差分法により前景領域を抽出し、人物領域における前景領域を抽象化したシルエット画像を背景画像に重畳させることで出力画像を生成したが、これに限定されるものではない。例えば、入力画像にエッジ抽出処理を施して得られたエッジ成分を、検出部２０５により検出された人物の位置に基づいて切り出すことにより得られるエッジをシルエット画像として用いてもよい。或いは、検出部２０５により検出された人物の領域の輪郭線をシルエット画像として用いてもよい。或いは、該輪郭線の内部領域を塗りつぶした結果得られる塗りつぶし画像をシルエット画像として用いてもよい。或いは、検出部２０５により検出された入力画像における人物の位置に対応する背景画像における位置に当該人物の存在を示すアイコンを重畳してもよい。つまり、人物の存在を示すアイコンをシルエット画像として用いてもよい。その他、人数調整が適用可能なシルエット画像の生成方法であれば、いかなる方法を適用してもよい。 In the above description, the foreground region is extracted by the background subtraction method, and the output image is generated by superimposing on the background image a silhouette image in which the foreground region in the person region is abstracted, but the present invention is not limited to this. For example, edges obtained by applying edge extraction processing to the input image and cutting out the edge components based on the position of the person detected by the detection unit 205 may be used as the silhouette image. Alternatively, the outline of the region of the person detected by the detection unit 205 may be used as the silhouette image. Alternatively, a filled image obtained by filling in the inner region of the outline may be used as the silhouette image. Alternatively, an icon indicating the presence of the person may be superimposed at the position in the background image corresponding to the position of the person in the input image detected by the detection unit 205. In other words, an icon indicating the presence of a person may be used as the silhouette image. Any other method of generating a silhouette image may be applied as long as the adjustment of the number of people can be applied to it.

更に、上述の例を適用する際には、入力画像全体に同一の処理を行うのみならず、入力画像を複数領域に分割して、分割領域毎に異なる抽象化対象とする人物の数の調整を行うようにしてもよい。 Furthermore, when applying the above example, in addition to performing the same processing on the entire input image, it is also necessary to divide the input image into multiple regions and adjust the number of people to be abstracted differently for each divided region. You may also do this.

以上説明したように、本実施形態に係る画像処理装置によれば、人の動きの量に応じて閲覧者が見る画像中の様子を異ならせることが可能になる。その結果、人が動けないような混雑状況が発生している場合には混雑している様子を示す画像を提示することが可能になる。一方混雑してはいるものの、電車の乗り降りなどにより人が動いている状況であれば、人が動ける状況である様子を示す画像を提示することが可能になる。このようにすることで、本実施形態では、人物を抽象化した静止画１枚であっても、駅などで電車に乗れそうかを容易に判別できる出力画像を生成することができる。なお、画像中の人の動き方は撮像したカメラの設置場所や設置方法によって異なるため、同じ動き量が得られた場合であってもカメラの設置場所、或いは駅によって異なる対応が必要になる場合がある。そのため、Ｓ１１０５において判定した人物の動きに加え、所定の判断基準に基づいて判定を行うようにし、当該判定結果に基づいてＳ１１０６の調整を行うようにしてもよい。 As described above, the image processing apparatus according to the present embodiment makes it possible to vary what the viewer sees in the image according to the amount of movement of people. As a result, when the area is so crowded that people cannot move, an image showing that congestion can be presented. On the other hand, when the area is crowded but people are moving, for example because passengers are getting on and off trains, an image showing that people are able to move can be presented. In this way, the present embodiment can generate an output image from which it is easy to judge whether one is likely to be able to board a train at a station or the like, even from a single still image in which the persons are abstracted. Note that how people move in an image depends on where and how the camera is installed, so even when the same amount of movement is obtained, different handling may be required depending on the camera location or the station. Therefore, a determination based on a predetermined criterion may be made in addition to the movement of the persons determined in S1105, and the adjustment in S1106 may be performed based on the result of that determination.

（実施形態８） (Embodiment 8)
実施形態７においては、検出された特定の物体の動きを滞留可能性情報とし、該滞留可能性情報に基づいて、抽象化対象とする特定の物体の数の調整を行った後に、シルエット画像の生成を行う例を示した。本実施形態においては、特定の物体の動き以外の情報を滞留可能性情報として該特定の物体の数の調整を行い、シルエット画像の生成を行う例を説明する。以下、実施形態１～７と異なる部分を主に説明し、実施形態１～７と同一または同等の構成要素、および処理には同一の符号を付すとともに、重複する説明は省略する。 In the seventh embodiment, an example was shown in which the detected movement of the specific objects is used as retention possibility information, the number of specific objects to be abstracted is adjusted based on that information, and a silhouette image is then generated. In the present embodiment, an example will be described in which information other than the movement of the specific objects is used as the retention possibility information to adjust the number of specific objects before the silhouette image is generated. Hereinafter, parts that differ from Embodiments 1 to 7 will be mainly described, components and processes that are the same as or equivalent to those of Embodiments 1 to 7 will be denoted by the same reference numerals, and redundant description will be omitted.

プラットホーム、駅、博物館、美術館、ショッピングモール、スタジアムなどの人が集まる施設のような場所においては、何らかの理由により人が殺到して円滑な利用が困難になる場合がある。このような場合、円滑な利用が困難になったことを示す情報（利用情報）が発せられる。利用情報の具体例としては、電車など交通機関の運行情報（遅延の発生の有無、遅延時間、運転再開までの予想時間など）、プラットホームや駅、博物館など施設への入場規制の有無の情報、施設への入場待ち時間の情報などが挙げられる。そして、該利用情報が発信された場合には、円滑な利用ができなくなっている可能性があることから、滞留が発生する可能性がある。特に、待ち時間が長くなることを示す情報が発信された場合には、時間の経過とともに混雑度合が増えてゆき、最終的には人の滞留につながる可能性がある。そこで、本実施形態においては、該利用情報を滞留可能性情報として、調整部１０００により調整を行う例を示す。 In places where people gather, such as platforms, stations, museums, art galleries, shopping malls, and stadiums, a rush of people may for some reason make smooth use of the facility difficult. In such a case, information indicating that smooth use has become difficult (usage information) is issued. Specific examples of usage information include operation information for trains and other means of transportation (whether a delay has occurred, the delay time, the expected time until service resumes, and so on), information on whether entry to a platform, station, museum, or other facility is restricted, and information on the waiting time to enter a facility. When such usage information has been issued, smooth use may no longer be possible, so stagnation may occur. In particular, when information indicating that waiting times will become longer has been issued, the degree of congestion increases over time and may eventually lead to people being unable to move. Therefore, the present embodiment shows an example in which the adjustment unit 1000 performs the adjustment using the usage information as the retention possibility information.

以下、図１３を参照して、本実施形態に係る画像処理について説明する。なお、図１３に示すＳ１１０１～Ｓ１１０４までの処理は図１１で説明した内容と同様であるため説明を省略する。図１３においてＳ１３００にて、通信部２００は、上述の利用情報を受信する。なお、Ｓ１３００にて、本実施形態における通信部２００は、利用情報として運行情報を受信するものとする。次に、Ｓ１３０１にて、調整部１０００は、Ｓ１３００において受信した運行情報に基づいて、検出された人物１２０１～１２０９の一部を抽象化対象とすることで、抽象化対象とする人の数の調整を行う。なお、図１３に示すＳ１１０７～Ｓ１１１１までの処理は図１１で説明した内容と同様であるため説明を省略する。 Image processing according to this embodiment will be described below with reference to FIG. 13. The processing from S1101 to S1104 shown in FIG. 13 is the same as that described with reference to FIG. 11, and its description is therefore omitted. In S1300 in FIG. 13, the communication unit 200 receives the usage information described above. In S1300, the communication unit 200 in this embodiment is assumed to receive operation information as the usage information. Next, in S1301, the adjustment unit 1000 adjusts the number of people to be abstracted by setting some of the detected persons 1201 to 1209 as abstraction targets, based on the operation information received in S1300. The processing from S1107 to S1111 shown in FIG. 13 is the same as that described with reference to FIG. 11, and its description is therefore omitted.

なお、利用情報には様々な種類があるだけでなく、同じ利用情報であってもカメラの設置場所、或いは駅によって異なる対応が必要になる場合がある。例えば、同じ遅延時間の情報を受信した場合でも、駅によって発生する混雑状況は異なる。そのため、Ｓ１３００において受信した利用情報を、所定の判断基準に基づいて判定し、該判定結果に基づいてＳ１３０１の調整を行うようにしてもよい。その結果、同じ利用情報から異なる調整を行った結果に基づくシルエット画像を生成することが可能になる。 Note that not only are there various types of usage information, but even the same usage information may require different responses depending on the location where the camera is installed or the station. For example, even if information about the same delay time is received, the congestion situation that occurs differs depending on the station. Therefore, the usage information received in S1300 may be determined based on predetermined criteria, and the adjustment in S1301 may be performed based on the determination result. As a result, it becomes possible to generate a silhouette image based on the results of performing different adjustments from the same usage information.

本実施形態における利用情報に基づく調整の例としては、次のような方法がある。すなわち、利用情報が電車の遅延の発生などの運行情報、或いは入場規制中であることを示す情報であった場合、混雑している様子がシルエット画像に反映されるように、抽象化対象とする人物の数の調整は行わないようにする。一方、利用情報が発せられていない、或いは平常時を示す情報の場合には、人数の調整を行うようにする。更に、利用情報に遅延時間や運転再開までの予想時間が含まれていた場合には、該遅延時間、或いは該予想時間が長いほど混雑する可能性が高まるため、人数調整による削減の割合が少なくなるようにする。このように、人が動けないような混雑が発生する可能性が高い場合には、実際の混雑状態を反映したシルエット画像を生成し、人が動ける程度の混雑度合であった場合には人が動けることを容易に把握可能なシルエット画像を生成することが可能となる。なお、本実施例においては上述の利用情報に基づいて、抽象化対象とする人物の数の調整を行う例を示したが、これに限定するものではない。例えば、利用情報に加えて、実施形態７において図１１のＳ１１０５で説明した人物の物体の動きを併せて考慮するようにしてもよい。 Examples of adjustment based on the usage information in this embodiment include the following. When the usage information is operation information such as the occurrence of a train delay, or information indicating that entry is being restricted, the number of persons to be abstracted is not adjusted, so that the crowded state is reflected in the silhouette image. On the other hand, when no usage information has been issued, or when the information indicates normal conditions, the number of people is adjusted. Furthermore, when the usage information includes a delay time or an expected time until service resumes, the longer that delay time or expected time, the more likely congestion becomes, so the ratio of reduction by the adjustment is made smaller. In this way, when congestion in which people cannot move is likely to occur, a silhouette image reflecting the actual congestion is generated, and when the degree of congestion still allows people to move, a silhouette image from which it is easy to grasp that people can move is generated. Although this embodiment has shown an example in which the number of persons to be abstracted is adjusted based on the usage information described above, the present invention is not limited to this. For example, in addition to the usage information, the movement of the persons described for S1105 of FIG. 11 in the seventh embodiment may also be taken into account.
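The usage-information rules above can be illustrated with a small sketch. The dictionary keys `status`, `entry_restricted`, and `delay_min`, the base ratio, the 60-minute cap, and the linear scaling are all hypothetical; the text only states that no reduction is applied under delays or entry restrictions, and that a longer delay should lead to a smaller reduction ratio.

```python
def reduction_ratio_from_usage(info, base_ratio=0.5, max_delay_min=60):
    """Fraction of detected persons to drop, given usage information.

    info: None when no usage information was issued, otherwise a dict
    with hypothetical keys 'status', 'entry_restricted' and 'delay_min'.
    """
    if info is None or info.get("status") == "normal":
        return base_ratio  # normal times: thin out persons
    if info.get("entry_restricted"):
        return 0.0  # show the actual congestion
    delay = info.get("delay_min")
    if delay is None:
        return 0.0  # trouble reported without a duration: no reduction
    delay = min(max(delay, 0), max_delay_min)
    # The longer the delay, the smaller the reduction ratio.
    return base_ratio * (1 - delay / max_delay_min)
```

A per-station or per-camera criterion could be expressed by choosing different `base_ratio` and `max_delay_min` values for each installation.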

（実施形態９） (Embodiment 9)
実施形態７～８では、Ｓ１１０６、或いはＳ１３０１において行う調整部１０００による調整の処理として、抽象化対象とする人数を削減する方法について説明した。本実施形態では、他の調整方法について説明する。 In the seventh and eighth embodiments, a method of reducing the number of people to be abstracted was described as the adjustment processing performed by the adjustment unit 1000 in S1106 or S1301. In this embodiment, another adjustment method will be described.

混雑が発生して人の密度が高くなると人物同士の重なりが増えるため、Ｓ１１０４で行われる人物検出の方法によっては人としての特徴を捉えることが難しくなり、検出される人数が少なくなる場合がある。例えば、照合パターンを用いる場合に、照合する対象の人物領域が隠れてしまう場合などが挙げられる。この場合には、連続する複数の入力画像において、Ｓ１１０３で行われる前景領域の抽出の結果、前景領域の面積が減少していないにもかかわらず検出される人数が少なくなる。そこで、連続する複数の入力画像間において前景領域が減少していないにもかかわらず検出人数が減少した場合には、人物検出が困難な混雑が発生している可能性があると判定し、抽象化対象とする人物の数を増やすように調整する。このようにすることで、人物検出に失敗している人が発生しているにもかかわらず、人が動けないほどの混雑が発生している状況を示すシルエット画像の生成を行うことが可能になる。 When congestion occurs and the density of people increases, people overlap one another more, so depending on the person detection method used in S1104 it may become difficult to capture the features of a person, and fewer people may be detected. One example is when a matching pattern is used and the person region to be matched is hidden. In this case, over a plurality of consecutive input images, the number of people detected decreases even though the area of the foreground region extracted in S1103 has not decreased. Therefore, when the number of detected people decreases between consecutive input images even though the foreground region has not decreased, it is determined that congestion that makes person detection difficult may have occurred, and the number of persons to be abstracted is adjusted upward. This makes it possible to generate a silhouette image showing that the area is so crowded that people cannot move, even though detection has failed for some people.

抽象化対象とする人物の数を増やすように調整を行うかどうかの判定の具体例としては、以下のようにしてもよい。すなわち、調整部１０００は、前景領域が所定面積以上ある場合において、過去の画像における検出人数と前景領域の割合に比べ、検出人数が所定以上減少した場合に、抽象化対象とする人の数を増やす方向で調整すると判定するとよい。なお、該判定の際の検出人数と前景領域の割合は実行時に動的に決定してもよいが、予め測定した結果を固定的に保持しておくようにしてもよい。 A specific example of determining whether to adjust the number of persons to be abstracted upward is as follows. When the foreground region has a predetermined area or more, and the number of detected people has fallen by a predetermined amount or more relative to the ratio between the number of detected people and the foreground region in past images, the adjustment unit 1000 may determine that the number of people to be abstracted should be increased. The ratio between the number of detected people and the foreground region used in this determination may be determined dynamically at run time, or a result measured in advance may be held as a fixed value.
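The determination described above can be sketched as a comparison between the number of people actually detected and the number expected from the foreground area. The function name, the parameter names, and the specific values of `min_area` and `drop_ratio` are assumptions; `ref_people_per_area` stands for the reference ratio of detected people to foreground area, which the text says may be measured in advance or updated at run time.

```python
def should_increase_targets(fg_area, num_detected, ref_people_per_area,
                            min_area=1000, drop_ratio=0.7):
    """Decide whether to add abstraction targets.

    fg_area: foreground area in pixels for the current frame.
    num_detected: number of people detected in the current frame.
    ref_people_per_area: reference ratio of detected people to foreground
        area taken from past images (hypothetical parameter).
    """
    if fg_area < min_area:
        return False  # too little foreground to suspect heavy crowding
    expected = ref_people_per_area * fg_area
    # Far fewer detections than the foreground area suggests: people are
    # probably occluding one another, so detection is likely failing.
    return num_detected < expected * drop_ratio
```

For example, with a reference of 0.002 people per pixel and 5000 foreground pixels, about ten people are expected, so four detections would trigger an increase while nine would not.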

更に、上述の例においては前景領域を用いて判定を行うようにしたが、本実施形態はこれに限るものではない。例えば、背景画像と入力画像について各々画像中のエッジを検出し、検出されたエッジの類似度によって画像中に発生している変化の大きさを判定する。この変化の大きさが所定以上発生しているにもかかわらず、検出された人数が減少している場合に抽象化対象とする人数を増やす方向で調整を行うようにしてもよい。その他、画像中の時間的な変化を捉えることが可能な方法であればいかなる方法を用いてもよい。 Further, in the above example, the foreground region is used for determination, but the present embodiment is not limited to this. For example, edges in the background image and input image are detected, and the magnitude of change occurring in the image is determined based on the degree of similarity between the detected edges. If the number of people detected is decreasing even though the magnitude of this change is greater than a predetermined value, adjustments may be made to increase the number of people to be abstracted. Any other method may be used as long as it is capable of capturing temporal changes in the image.

また、抽象化対象とする人の数を増やす際の人物挿入方法としては、画像中の変化が発生している位置であって、且つ検出されている人物間の空間が埋まるように人物の位置を決定するとよい。また、増やす人物の大きさは、周囲で検出された人物の大きさに基づいて決定するとよいが、その際に撮像画角の影響で位置によって人の大きさが大きく変化する場合があるため、この変化を考慮するようにしてもよい。更に、増加させる抽象化対象の人数に関しては、前述の検出された人物の隙間が埋まるまで行うようにするとよい。或いは、所定時間内において検出された最大人数まで増加させる、最大人数に対して所定の人数を超えるまで増加させるなど、視覚的に混雑が発生している様子が表現される人数まで増加させるようにする様々な方法が適用可能である。 As for how persons are inserted when the number of people to be abstracted is increased, the positions of the added persons are preferably chosen at locations where a change has occurred in the image and so that the space between the detected persons is filled. The size of each added person is preferably determined based on the sizes of the persons detected nearby. Because the apparent size of a person can vary greatly with position owing to the imaging angle of view, this variation may also be taken into account. As for how many abstraction targets to add, persons are preferably added until the gaps between the detected persons are filled. Alternatively, various methods can be applied that increase the number until congestion is visually conveyed, such as increasing it up to the maximum number of people detected within a predetermined time, or until it exceeds that maximum by a predetermined number.

本実施形態で説明したように人数調整を行った後に、前述の実施形態において説明したシルエット画像の生成を行うことにより、様々な状況下においても汎用的に混雑状況を閲覧者が把握可能な画像を生成することが可能になる。 After adjusting the number of people as described in this embodiment, by generating the silhouette image as described in the previous embodiment, the image allows the viewer to understand the congestion situation in a general manner under various situations. It becomes possible to generate.

（その他の実施形態） (Other embodiments)
次に図１４を参照して、各実施形態の各機能を実現するための画像処理装置１００のハードウェア構成を説明する。なお、以降の説明において画像処理装置１００のハードウェア構成について説明するが、記録装置１２０および撮像装置１１０も同様のハードウェア構成によって実現されるものとする。 Next, with reference to FIG. 14, the hardware configuration of the image processing apparatus 100 for realizing each function of each embodiment will be described. Although the following description covers the hardware configuration of the image processing apparatus 100, the recording device 120 and the imaging device 110 are assumed to be realized by a similar hardware configuration.

本実施形態における画像処理装置１００は、ＣＰＵ１４００と、ＲＡＭ１４１０と、ＲＯＭ１４２０、ＨＤＤ１４３０と、Ｉ／Ｆ１４４０と、を有している。 The image processing device 100 in this embodiment includes a CPU 1400, a RAM 1410, a ROM 1420, an HDD 1430, and an I/F 1440.

ＣＰＵ１４００は画像処理装置１００を統括制御する中央処理装置である。ＲＡＭ１４１０は、ＣＰＵ１４００が実行するコンピュータプログラムを一時的に記憶する。また、ＲＡＭ１４１０は、ＣＰＵ１４００が処理を実行する際に用いるワークエリアを提供する。また、ＲＡＭ１４１０は、例えば、フレームメモリとして機能したり、バッファメモリとして機能したりする。 The CPU 1400 is a central processing unit that centrally controls the image processing apparatus 100. RAM 1410 temporarily stores computer programs executed by CPU 1400. Further, the RAM 1410 provides a work area used when the CPU 1400 executes processing. Further, the RAM 1410 functions, for example, as a frame memory or as a buffer memory.

ＲＯＭ１４２０は、ＣＰＵ１４００が画像処理装置１００を制御するためのプログラムなどを記憶する。ＨＤＤ１４３０は、画像データ等を記録する記憶装置である。 The ROM 1420 stores programs and the like for the CPU 1400 to control the image processing apparatus 100. The HDD 1430 is a storage device that records image data and the like.

Ｉ／Ｆ１４４０は、ネットワーク１４０を介して、ＴＣＰ／ＩＰやＨＴＴＰなどに従って、外部装置との通信を行う。 The I/F 1440 communicates with external devices via the network 140 according to TCP/IP, HTTP, or the like.

なお、上述した各実施形態の説明では、ＣＰＵ１４００が処理を実行する例について説明するが、ＣＰＵ１４００の処理のうち少なくとも一部を専用のハードウェアによって行うようにしてもよい。例えば、ディスプレイ１３０にＧＵＩ（ＧＲＡＰＨＩＣＡＬ　ＵＳＥＲ　ＩＮＴＥＲＦＡＣＥ）や画像データを表示する処理は、ＧＰＵ（ＧＲＡＰＨＩＣＳ　ＰＲＯＣＥＳＳＩＮＧ　ＵＮＩＴ）で実行してもよい。また、ＲＯＭ１４２０からプログラムコードを読み出してＲＡＭ１４１０に展開する処理は、転送装置として機能するＤＭＡ（ＤＩＲＥＣＴ　ＭＥＭＯＲＹ　ＡＣＣＥＳＳ）によって実行してもよい。 In the description of the embodiments above, examples in which the CPU 1400 executes the processing are described, but at least part of the processing of the CPU 1400 may be performed by dedicated hardware. For example, the processing of displaying a GUI (graphical user interface) or image data on the display 130 may be executed by a GPU (graphics processing unit). Further, the processing of reading program code from the ROM 1420 and loading it into the RAM 1410 may be executed by DMA (direct memory access) functioning as a transfer device.

なお、本発明は、上述の実施形態の１以上の機能を実現するプログラムを１つ以上のプロセッサが読出して実行する処理でも実現可能である。プログラムは、ネットワーク又は記憶媒体を介して、プロセッサを有するシステム又は装置に供給するようにしてもよい。また、本発明は、上述の実施形態の１以上の機能を実現する回路（例えば、ＡＳＩＣ）によっても実現可能である。また、画像処理装置１００の各部は、図１４に示すハードウェアにより実現してもよいし、ソフトウェアにより実現することもできる。 Note that the present invention can also be implemented by a process in which one or more processors read and execute a program that implements one or more of the functions of the embodiments described above. The program may be supplied to a system or device having a processor via a network or a storage medium. The present invention can also be implemented by a circuit (eg, an ASIC) that implements one or more of the functions of the embodiments described above. Further, each part of the image processing apparatus 100 may be realized by the hardware shown in FIG. 14, or may be realized by software.

なお、上述の各実施形態において、隠蔽対象（抽象化の対象）である特定の物体を人物として説明したが、本発明はこれに限らない。例えば、背景画像上に映っていない物体であって、撮像範囲の状況を把握するために有用な物体であれば人物検出と同様に当該物体検出を行い、上述の処理を適用してシルエット画像を生成して背景画像に重畳した出力画像を生成してもよい。状況の把握に有用な物体としては例えば、邪魔になりやすい旅行用の大きなカバンや、工事個所に設置される進入禁止を示すバリケードなど様々な物体が挙げられる。 Note that in each of the embodiments described above, the specific object that is the object of concealment (object of abstraction) is described as a person, but the present invention is not limited to this. For example, if it is an object that is not visible in the background image and is useful for understanding the situation in the imaging range, the object is detected in the same way as human detection, and the above processing is applied to create a silhouette image. An output image may be generated and superimposed on the background image. Objects useful for understanding the situation include various objects, such as large travel bags that tend to get in the way, and barricades set up at construction sites to indicate that entry is prohibited.

Another device may have one or more functions of the image processing device 100 according to each embodiment described above. For example, the imaging device 110 may have one or more functions of the image processing device 100 according to each embodiment. The embodiments described above may also be implemented in any combination.

The present invention has been described above together with its embodiments, but these embodiments are merely concrete examples of implementing the present invention, and the technical scope of the present invention should not be interpreted as limited by them. That is, the present invention can be implemented in various forms without departing from its technical idea or its main features. For example, combinations of the embodiments are also included in the disclosure of this specification.

100 Image processing device
110 Imaging device
200 Communication unit
201 Storage unit
202 Display control unit
203 Operation reception unit
204 Extraction unit
205 Detection unit
206 Setting unit
207 Generation unit
708 Determination unit
1000 Adjustment unit

Claims

1. An image processing apparatus comprising:
extracting means for extracting a foreground region from a captured input image;
detecting means for detecting a specific object from the input image;
setting means for setting a specific region, at least part of which is a curved region, based on the position of the specific object detected by the detecting means;
generating means for generating an output image in which a silhouette image, obtained by abstracting the foreground region within the specific region set by the setting means, is superimposed on a predetermined image corresponding to the input image, without the foreground region outside the specific region being superimposed on the predetermined image; and
determining means for determining a crowded region in the input image,
wherein the setting means sets, for a person located in the crowded region determined by the determining means, a specific region whose horizontal width is reduced by a predetermined magnification.
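
As a non-limiting illustration, the setting and generating operations recited above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the elliptical region shape, the function names `in_specific_region` and `compose_output`, the `crowd_scale` value of 0.5 standing in for the predetermined magnification, and the single-channel images stored as nested lists are all hypothetical choices that do not appear in the source.

```python
def in_specific_region(x, y, center_x, center_y, half_width, half_height,
                       in_crowd, crowd_scale=0.5):
    """Return True if pixel (x, y) lies inside an elliptical specific region.

    Assumption: the curved region is modeled as an ellipse. For a person in
    a crowded region, the horizontal half-axis is multiplied by crowd_scale
    (a hypothetical stand-in for the predetermined magnification).
    """
    a = half_width * crowd_scale if in_crowd else half_width
    b = half_height
    return ((x - center_x) / a) ** 2 + ((y - center_y) / b) ** 2 <= 1.0


def compose_output(background, foreground_mask, persons, silhouette_value=0):
    """Superimpose a silhouette on a copy of the background image.

    Only foreground pixels that fall inside some person's specific region
    are painted; foreground pixels outside every region are left as
    background. Each entry of persons is a tuple
    (center_x, center_y, half_width, half_height, in_crowd).
    """
    output = [row[:] for row in background]
    for y, mask_row in enumerate(foreground_mask):
        for x, is_foreground in enumerate(mask_row):
            if not is_foreground:
                continue
            if any(in_specific_region(x, y, *person) for person in persons):
                output[y][x] = silhouette_value
    return output
```

In this sketch a foreground pixel produced by a false detection source outside every specific region never reaches the output image, which corresponds to the "without the foreground region outside the specific region being superimposed" limitation.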

2. The image processing apparatus according to claim 1, wherein the determining means determines, as a crowded region, any divided region, among a plurality of divided regions obtained by dividing the entire input image, in which the number of specific objects detected by the detecting means is equal to or greater than a threshold.
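
By way of example only, the count-based determination in the claim above might be sketched in Python as follows. The uniform grid division, the function name `crowded_cells`, and the representation of each detection as an (x, y) position inside the image are assumptions made for illustration and are not taken from the source.

```python
def crowded_cells(detections, image_width, image_height, cols, rows, threshold):
    """Return the set of (row, col) grid cells judged to be crowded.

    Assumptions: the image is divided into a uniform rows x cols grid, and
    each detection is an (x, y) position with 0 <= x <= image_width and
    0 <= y <= image_height.
    """
    counts = [[0] * cols for _ in range(rows)]
    for x, y in detections:
        # Clamp so that positions on the right or bottom edge stay in range.
        col = min(int(x * cols / image_width), cols - 1)
        row = min(int(y * rows / image_height), rows - 1)
        counts[row][col] += 1
    return {
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if counts[r][c] >= threshold
    }
```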

3. The image processing apparatus according to claim 1, wherein the determining means determines that a foreground region extracted by the extracting means is a crowded region when the size of the foreground region is equal to or larger than a threshold.

4. The image processing apparatus according to any one of claims 1 to 3, wherein the generating means varies the display mode of the silhouette image, obtained by abstracting the foreground region within the specific region, for each specific region set by the setting means for a person located in the crowded region determined by the determining means.

5. The image processing apparatus according to claim 1, wherein the predetermined image is a background image used to extract the foreground region.

6. The image processing apparatus according to claim 5, wherein the extracting means extracts the foreground region by comparing the captured input image with the background image.

7. The extracting means extracts the foreground region by comparing, with a threshold value, a difference value calculated between each pixel of the input image and each pixel of the background image. The image processing apparatus described in .
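
For illustration only, the threshold comparison described above might be sketched in Python as follows. The function name `extract_foreground`, the single-channel images stored as nested lists of equal size, the use of the absolute difference, and the choice of "greater than or equal to" for the comparison are assumptions not stated in the source.

```python
def extract_foreground(input_image, background_image, threshold):
    """Return a binary mask marking pixels that differ from the background.

    A pixel is marked 1 (foreground) when the absolute difference between
    the input pixel and the background pixel at the same position is at
    least threshold, and 0 otherwise. Both images are assumed to be
    single-channel and of identical size.
    """
    mask = []
    for input_row, background_row in zip(input_image, background_image):
        mask.append([
            1 if abs(pixel - reference) >= threshold else 0
            for pixel, reference in zip(input_row, background_row)
        ])
    return mask
```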

8. The image processing apparatus according to any one of claims 1 to 7, wherein the specific object is a person.

9. The shape of the specific region set by the setting means includes at least a first shape corresponding to the head of the person and a second shape corresponding to the torso of the person. The image processing apparatus described.

10. The image processing apparatus according to any one of claims 1 to 9, wherein the generating means makes the display mode of the outer edge of the specific region that overlaps the foreground region extracted by the extracting means different from the display mode of the silhouette image obtained by abstracting the foreground region within the specific region.

11. The image processing apparatus according to claim 1, wherein the generating means generates an output image in which a silhouette image, obtained by abstracting the foreground region within the specific region for some of the plurality of specific objects detected by the detecting means, those objects being determined based on an amount of movement in the input image, is superimposed on the predetermined image.

12. A method for controlling an image processing apparatus, the method comprising:
an extraction step of extracting a foreground region from a captured input image;
a detection step of detecting a specific object from the input image;
a setting step of setting a specific region, at least part of which is a curved region, based on the position of the specific object detected in the detection step;
a generation step of generating an output image in which a silhouette image, obtained by abstracting the foreground region within the specific region set in the setting step, is superimposed on a predetermined image corresponding to the input image, without the foreground region outside the specific region being superimposed on the predetermined image; and
a determination step of determining a crowded region in the input image,
wherein the setting step sets, for a person located in the crowded region determined in the determination step, a specific region whose horizontal width is reduced by a predetermined magnification.

13. A program for causing a computer to function as each means of the image processing apparatus according to claim 1.