JP2014164579A - Information processor, program and information processing method

Info

Publication number: JP2014164579A
Application number: JP2013035826A
Authority: JP
Inventors: Shunsuke Ichihara (市原 俊介); Kohei Endo (圓藤 康平); Shigeru Tatezawa (立澤 茂)
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2013-02-26
Filing date: 2013-02-26
Publication date: 2014-09-08

Abstract

PROBLEM TO BE SOLVED: To detect a person more accurately from a distance image even when the person is not imaged from directly above. SOLUTION: The information processor includes: an acquisition part that acquires a first distance image generated by an imaging device and a second distance image generated in advance by the imaging device by imaging a background; a generation part that generates, on the basis of the first distance image and the second distance image, a third distance image corresponding to the foreground relative to the background; and a detection part that detects one or more regions from the third distance image by using a plurality of filters applied at different positions in the third distance image, the filters being capable of recognizing patterns of a person's head and body that depend on the positions at which they are applied.

Description

The present invention relates to an information processing apparatus, a program, and an information processing method.

In recent years, various techniques for detecting a desired object from an image by pattern recognition have been studied. One such technique is person detection, which detects a person from an image.

For example, Non-Patent Document 1 discloses a technique for detecting a person from a distance image generated by a distance image sensor installed on the ceiling and directed toward the floor. More specifically, according to the technique, an image containing only the foreground portion is generated from the generated distance image by background subtraction. A Haar-Like filter capable of recognizing the convex shape formed by a person's head and both shoulders is then applied to that image, and the outputs of the Haar-Like filter are clustered, whereby a person (a region corresponding to a person) is detected.

Non-Patent Document 1: Sho Ikemura, Shunsuke Kawai, Hironobu Fujiyoshi, "Human detection by Haar-Like filtering using distance information," 16th Image Sensing Symposium (SSII2010)

However, the technique disclosed in Non-Patent Document 1 assumes that the distance image sensor images the person from directly above. When the person is not imaged from directly above, erroneous detection or non-detection of the person may therefore occur.

It is therefore desirable to provide a mechanism that enables a person to be detected more accurately from a distance image even when the person is not imaged from directly above.

According to the present invention, there is provided an information processing apparatus including: an acquisition unit that acquires a first distance image generated by an imaging device and a second distance image generated in advance by the imaging device by imaging a background; a generation unit that generates, based on the first distance image and the second distance image, a third distance image corresponding to the foreground relative to the background; and a detection unit that detects one or more regions from the third distance image by using a plurality of filters applied at different positions in the third distance image, the plurality of filters being capable of recognizing patterns of a person's head and torso that depend on the position at which each filter is applied.

Each of the plurality of filters may be a filter generated based on a relative relationship between the imaging device and a person when the person is imaged by the imaging device.

The relative relationship may include the direction, as seen from the imaging device, in which the person's head is present and the direction, as seen from the imaging device, in which the person's torso is present.

Each of the plurality of filters may further be a filter generated based on characteristics of the imaging device.

The detection unit may detect the one or more regions as regions corresponding to a target portion of a person, and the information processing apparatus may further include a region determination unit that determines, based on a size corresponding to each of at least one of the one or more regions, whether each of the at least one region corresponds to the target portion.

The information processing apparatus may further include a correction unit that corrects the one or more regions by adding, to each of at least one of the one or more regions, a peripheral region of that region that satisfies a predetermined condition, based on the first distance image or the third distance image. The region determination unit may then determine, based on a size corresponding to each corrected region, whether each of the at least one region corresponds to the target portion.

The target portion may be a person's head.

The information processing apparatus may further include an identification unit that distinguishes, among the one or more regions, a high possibility region, which is highly likely to be determined to be a region corresponding to the target portion, from the other regions. The at least one region of the one or more regions may be the other regions among the one or more regions.

The high possibility region may be a region located within a predetermined range in which regions corresponding to parts of a person other than the target portion are unlikely to be detected.

Further, when the one or more regions include only one region, the high possibility region may be that one region, and when the one or more regions include two or more regions, the high possibility region may be a region that is separated from every other region included in the one or more regions by a predetermined distance or more.
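The isolation rule in the preceding paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not say how the distance between two regions is measured, so each region is represented here by a hypothetical centre point, and the function name `split_high_possibility` and the parameter `min_gap` are invented for the example.

```python
import math

def split_high_possibility(centres, min_gap):
    """Split detected regions into high possibility regions and the rest.

    centres: list of (x, y) centre points, one per detected region
             (using centre points is an assumption, not from the source).
    min_gap: the "predetermined distance" between regions (hypothetical value).
    Returns (high, others) as lists of indices into centres.
    """
    if len(centres) == 1:
        # Only one region was detected, so it is the high possibility region.
        return [0], []
    high, others = [], []
    for i, p in enumerate(centres):
        # A region qualifies only if it is far enough from every other region.
        isolated = all(
            math.dist(p, q) >= min_gap
            for j, q in enumerate(centres)
            if j != i
        )
        (high if isolated else others).append(i)
    return high, others
```

With three centres of which two lie close together, only the isolated one is returned as a high possibility region; the two neighbouring regions are left for the size-based determination described above.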

The acquisition unit may acquire a fourth distance image generated by the imaging device before the first distance image. The generation unit may generate, based on the fourth distance image and the second distance image, a fifth distance image corresponding to the foreground relative to the background. The detection unit may detect another one or more regions from the fifth distance image by using the plurality of filters. The identification unit may distinguish the high possibility region from the other regions among the another one or more regions, and the region determination unit may determine whether each of the other regions among the another one or more regions corresponds to the target portion. The identification unit may then distinguish the high possibility region from the other regions among the one or more regions, based on the high possibility region among the another one or more regions and on a determined region that has been determined to correspond to the target portion.

The identification unit may distinguish the high possibility region from the other regions among the one or more regions based on the positions of the high possibility region and the determined region among the another one or more regions and on the positions of the one or more regions.

The detection unit may detect the one or more regions by using a plurality of filter sets applied at different positions in the third distance image, the plurality of filter sets being capable of recognizing patterns of a person's head and torso that depend on the applied position. Each of the plurality of filter sets may include two or more filters capable of recognizing head and torso patterns of persons having different characteristics.

When using each of the plurality of filter sets, the detection unit may detect a plurality of regions by applying the two or more filters at the same position in the third distance image. The region determination unit may determine, based on a size corresponding to each of at least one of the plurality of regions, whether each of the at least one region corresponds to the target portion.

When using each of the plurality of filter sets, the detection unit may select at least one filter from the two or more filters and detect the one or more regions by applying the selected at least one filter.

The detection unit may select the at least one filter from the two or more filters based on a characteristic of a person estimated from a distance image generated by the imaging device or from another distance image generated based on that distance image.

The information processing apparatus may further include a non-detection determination unit that determines, based on the one or more regions and the third distance image, whether a region that should be detected, other than the one or more regions, may have gone undetected.

Each of the plurality of filters may be a Haar-like filter.

Each of the plurality of filters may be a filter capable of recognizing a pattern of a person's head and shoulders that depends on the position at which the filter is applied.

According to the present invention, there is also provided a program for causing a computer to function as: an acquisition unit that acquires a first distance image generated by an imaging device and a second distance image generated in advance by the imaging device by imaging a background; a generation unit that generates, based on the first distance image and the second distance image, a third distance image corresponding to the foreground relative to the background; and a detection unit that detects one or more regions from the third distance image by using a plurality of filters applied at different positions in the third distance image, the plurality of filters being capable of recognizing patterns of a person's head and torso that depend on the position at which each filter is applied.

According to the present invention, there is also provided an information processing method including: acquiring a first distance image generated by an imaging device and a second distance image generated in advance by the imaging device by imaging a background; generating, based on the first distance image and the second distance image, a third distance image corresponding to the foreground relative to the background; and detecting one or more regions from the third distance image by using a plurality of filters applied at different positions in the third distance image, the plurality of filters being capable of recognizing patterns of a person's head and torso that depend on the position at which each filter is applied.

As described above, according to the present invention, a person can be detected more accurately from a distance image even when the person is not imaged from directly above.

Brief description of the drawings:
Explanatory diagram of an example of a conventional Haar-Like filter.
Explanatory diagram of an example of a distance image to which the conventional Haar-Like filter is applied.
Explanatory diagram of an example of a distance image in which erroneous detection may occur when the conventional Haar-Like filter is applied.
Explanatory diagram showing an example of a schematic configuration of an information processing system according to an embodiment of the present invention.
Block diagram showing an example of the configuration of an information processing apparatus according to the first embodiment.
Explanatory diagram of an example of a generated foreground distance image.
Explanatory diagram of an example of one of the plurality of filters according to an embodiment of the present disclosure.
Explanatory diagram of an example of a target distance image generated by the imaging device.
Explanatory diagram of an example of a foreground distance image obtained from the target distance image shown in FIG. 8.
Explanatory diagram of an example of a filter used to detect an object region in a foreground distance image.
Explanatory diagram of an example of the number of pixels of a distance image generated by the imaging device.
Explanatory diagram of the angle of view of the imaging device.
Explanatory diagram of an example of a method for calculating the range of imaging angles corresponding to a person's head.
Explanatory diagram of an example of a method for calculating the range of imaging angles corresponding to a person's shoulder.
Explanatory diagram of an example of object region detection using the plurality of filters according to an embodiment of the present invention.
Flowchart showing an example of a schematic flow of information processing according to the first embodiment.
Block diagram showing an example of the configuration of an information processing apparatus according to the second embodiment.
Explanatory diagram of a first example of object region correction according to the second embodiment.
Explanatory diagram of a second example of object region correction according to the second embodiment.
Explanatory diagram of an example of a method for determining whether an object region corresponds to a person's head.
Flowchart showing an example of a schematic flow of information processing according to the second embodiment.
Block diagram showing an example of the configuration of an information processing apparatus according to the third embodiment.
Explanatory diagram of an example of the predetermined range in which regions corresponding to parts of a person other than the target portion are unlikely to be detected.
Flowchart showing an example of a schematic flow of information processing according to the third embodiment.
Block diagram showing an example of the configuration of an information processing apparatus according to the fourth embodiment.
Explanatory diagram of an example of distinguishing the high possibility region from the other regions according to the fourth embodiment.
Flowchart showing an example of a schematic flow of information processing according to the fourth embodiment.
Block diagram showing an example of the configuration of an information processing apparatus according to the fifth embodiment.
Explanatory diagram of an example of the filters included in a filter set.
Explanatory diagram of an example of object regions detected by applying two or more filters included in a filter set.
Explanatory diagram of an example of the determination made for a plurality of object regions detected by applying two or more filters included in a filter set at the same position.
Flowchart showing an example of a schematic flow of information processing according to the fifth embodiment.
Flowchart showing an example of a schematic flow of information processing according to a modification of the fifth embodiment.
Block diagram showing an example of the configuration of an information processing apparatus according to the sixth embodiment.
Explanatory diagram of an example of a foreground region detection result.
Explanatory diagram of an example of determining whether a region may have gone undetected.
Flowchart showing an example of a schematic flow of information processing according to the sixth embodiment.
Block diagram showing an example of the hardware configuration of an information processing apparatus according to an embodiment of the present disclosure.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. In this specification and the drawings, components having substantially the same functional configuration are denoted by the same reference numerals, and redundant description is omitted.

The embodiments of the present invention are described below in the following order: << 1. Introduction >>, << 2. Schematic configuration of information processing system >>, << 3. First embodiment >>, << 4. Second embodiment >>, << 5. Third embodiment >>, << 6. Fourth embodiment >>, << 7. Fifth embodiment >>, << 8. Sixth embodiment >>.

<< 1. Introduction >>
First, a conventional Haar-Like filter and the technical problem are described with reference to FIGS. 1 to 3.

(Conventional Haar-Like filter)
The non-patent document "Sho Ikemura, Shunsuke Kawai, Hironobu Fujiyoshi, Human detection by Haar-Like filtering using distance information, 16th Image Sensing Symposium (SSII2010)" discloses a technique for detecting a person from a distance image generated by a distance image sensor installed on the ceiling and directed toward the floor.

More specifically, according to the technique, an image containing only the foreground portion is generated from the generated distance image by background subtraction. A Haar-Like filter capable of recognizing the convex shape formed by a person's head and both shoulders is then applied to that image, and the outputs of the Haar-Like filter are clustered, whereby a person (a region corresponding to a person) is detected. This is described in detail below with reference to FIGS. 1 and 2.

FIG. 1 is an explanatory diagram for explaining an example of a conventional Haar-Like filter. FIG. 1 shows the Haar-Like filter 10 disclosed in the above non-patent document. The Haar-Like filter 10 is a filter for recognizing the convex shape formed by a shoulder, the head, and the other shoulder, and includes one rectangle 11 and two rectangles 13 of a different type (namely, rectangle 13A and rectangle 13B). The Haar-Like filter 10 is applied pixel by pixel to the foreground pixels, and outputs 1 when the pixel to which it is applied corresponds to the head of the convex shape and 0 otherwise. Specifically, to apply the Haar-Like filter 10 to a target pixel, the filter is placed on the distance image after background subtraction so that the target pixel is at its center. Then, for example, the difference between the sum of the distances of the foreground pixels in rectangle 11 and half the sum of the distances of the foreground pixels in rectangles 13A and 13B is calculated. The Haar-Like filter 10 compares this difference with a predetermined threshold and outputs either 1 or 0 according to the result. Clustering the pixels for which the Haar-Like filter 10 outputs 1 then yields the region corresponding to the person's head. For each target pixel, the Haar-Like filter 10 is applied in four orientations: 0°, 45°, 90°, and 135°.
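The per-pixel computation described for the Haar-Like filter 10 can be sketched as follows. This is a minimal illustration and not the implementation of the cited document. The text does not give the geometry of the rectangles or the direction of the threshold comparison, so the sketch assumes three equally sized rectangles laid out side by side (13A, 11, 13B), assumes that background pixels are stored as 0, and assumes that the filter fires when the centre rectangle is sufficiently nearer to the sensor than the side rectangles. The function name and the parameters `w`, `h` and `threshold` are hypothetical, and only the 0° orientation is shown.

```python
import numpy as np

def haar_like_response(fg, row, col, w=3, h=3, threshold=100.0):
    """Return 1 if the pixel at (row, col) looks like the top of a head, else 0.

    fg: 2-D array of distances after background subtraction (0 = background).
    Assumed layout: rectangle 13A | rectangle 11 | rectangle 13B, each h x w,
    with rectangle 11 centred on the target pixel.
    """
    top = row - h // 2
    left = col - w // 2 - w  # left edge of rectangle 13A
    if top < 0 or left < 0 or top + h > fg.shape[0] or left + 3 * w > fg.shape[1]:
        return 0  # the filter does not fit inside the image
    window = fg[top:top + h, left:left + 3 * w]
    sum_13a = window[:, :w].sum()
    sum_11 = window[:, w:2 * w].sum()
    sum_13b = window[:, 2 * w:].sum()
    # Sum over rectangle 11 minus half the sum over rectangles 13A and 13B.
    diff = sum_11 - 0.5 * (sum_13a + sum_13b)
    # Assumption: the head is nearer to the sensor than the shoulders,
    # so a head-like pattern makes diff strongly negative.
    return 1 if diff <= -threshold else 0
```

Applying the function to every foreground pixel gives a binary map; the pixels with value 1 are the ones that would then be clustered into head regions.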

FIG. 2 is an explanatory diagram for explaining an example of a distance image to which the conventional Haar-Like filter is applied. FIG. 2 shows a distance image 20A. The distance image 20A includes a region 21A and a region 21B, each corresponding to a person. The distance image 20A is generated by the distance image sensor imaging the persons from directly above. By performing background subtraction on the distance image 20A and then applying the Haar-Like filter 10, the part of each region 21 that corresponds to the head can be detected.
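The background subtraction step mentioned above can be illustrated with a short NumPy sketch. It is an assumption-laden example rather than the method of the cited document: each pixel is taken to hold a single scalar distance, a foreground object is assumed to be nearer to the sensor than the background, and the function name and the `min_diff` threshold are hypothetical.

```python
import numpy as np

def foreground_distance_image(current, background, min_diff=50.0):
    """Return a distance image that keeps only foreground pixels.

    current:    distance image of the scene (same units as background).
    background: distance image captured in advance with no person present.
    min_diff:   hypothetical threshold; a pixel counts as foreground when it
                is at least this much nearer to the sensor than the background.
    Background pixels are set to 0 in the result.
    """
    current = np.asarray(current, dtype=float)
    background = np.asarray(background, dtype=float)
    is_foreground = (background - current) >= min_diff
    return np.where(is_foreground, current, 0.0)
```

Small differences caused by sensor noise fall below `min_diff` and are treated as background, so only pixels belonging to objects that entered the scene keep their distance values.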

(Technical problem)
However, the technique disclosed in the above non-patent document has a technical problem. Specifically, the technique assumes that the distance image sensor images the person from directly above, so when the person is not imaged from directly above, erroneous detection or non-detection of the person may occur. For example, when a person is imaged from an oblique direction, a region corresponding to a part other than the head may be detected in addition to the region corresponding to the head. That is, even when only one person is present, two or more persons (two regions corresponding to persons) may be detected. A specific example of such a distance image is described below with reference to FIG. 3.

FIG. 3 is an explanatory diagram for explaining an example of a distance image in which erroneous detection may occur when the conventional Haar-Like filter is applied. FIG. 3 shows a distance image 20B. The distance image 20B includes a region 21C and a region 21D, each corresponding to a person. The distance image 20B is generated by the distance image sensor imaging the persons from an oblique direction. In this example, each person is located near the edge of the imaging range of the distance image sensor and is therefore imaged from an oblique direction. When background subtraction is performed on the distance image 20B and the Haar-Like filter 10 is then applied, a region corresponding to a part other than the head (for example, part of the torso such as a shoulder or the back) may be detected within each region 21 in addition to the region corresponding to the head. That is, even when only one person is present, two or more persons (two regions corresponding to persons) may be detected.

As described above, when a person is located near the edge of the imaging range of the distance image sensor, applying the Haar-Like filter 10 may cause erroneous detection. Such erroneous detection becomes more likely when the distance image sensor is installed at a lower position, because a person is then more likely to be imaged from an oblique direction. Naturally, it also becomes more likely when the distance image sensor is installed facing an oblique direction rather than straight down.
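The effect of installation height can be made concrete with a small trigonometric sketch. The geometry is an assumption introduced for illustration only: the sensor is taken to point straight down, and the function name, parameter names and example heights do not come from the source.

```python
import math

def viewing_angle_deg(sensor_height_m, head_height_m, horizontal_offset_m):
    """Angle between the sensor's vertical axis and its line of sight to the head.

    0 degrees means the head is seen from directly above; larger values mean
    a more oblique view.
    """
    vertical_gap = sensor_height_m - head_height_m
    return math.degrees(math.atan2(horizontal_offset_m, vertical_gap))
```

For a head at 1.7 m standing 1.0 m to the side of the point directly below the sensor, a sensor at 3.7 m sees the head at roughly 27 degrees from vertical, whereas a sensor at 2.7 m sees it at roughly 45 degrees. The lower the sensor, the more oblique the view for the same standing position.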

The embodiments of the present invention therefore make it possible to detect a person more accurately from a distance image even when the person is not imaged from directly above. The embodiments are described in detail below.

<< 2. Schematic configuration of information processing system >>
A schematic configuration of the information processing system 1 according to an embodiment of the present invention is described with reference to FIG. 4. FIG. 4 is an explanatory diagram showing an example of the schematic configuration of the information processing system 1. Referring to FIG. 4, the information processing system 1 includes an imaging device 30 and an information processing apparatus 100.

撮像装置３０は、距離画像を生成する。撮像装置３０は、例えば距離画像センサである。より具体的には、例えば、撮像装置３０は、ＴＯＦ（Time-of-Flight）方式の距離画像センサである。上記距離画像の各画素値は、撮像装置３０から被写体までの距離、又は、撮像装置３０が原点である場合の被写体の３次元座標である。 The imaging device 30 generates a distance image. The imaging device 30 is a distance image sensor, for example. More specifically, for example, the imaging device 30 is a TOF (Time-of-Flight) range image sensor. Each pixel value of the distance image is the distance from the imaging device 30 to the subject or the three-dimensional coordinates of the subject when the imaging device 30 is the origin.

例えば、撮像装置３０は、天井から床の方向（即ち、真下）に向けて天井に設置される。 For example, the imaging device 30 is installed on the ceiling from the ceiling toward the floor (that is, directly below).

また、例えば、撮像装置３０は、人物４０が撮像範囲内に位置する場合に、人物４０を撮像する。その結果、撮像装置３０により、人物４０に対応する領域を含む距離画像が生成される。 For example, the imaging device 30 captures an image of the person 40 when the person 40 is located within the imaging range. As a result, a distance image including an area corresponding to the person 40 is generated by the imaging device 30.

情報処理装置１００は、距離画像から人物を検出する。情報処理装置１００は、例えばサーバである。 The information processing apparatus 100 detects a person from the distance image. The information processing apparatus 100 is a server, for example.

Further, for example, the information processing apparatus 100 communicates with the imaging device 30 via a network 50 and acquires the distance image generated by the imaging device 30. The information processing apparatus 100 then detects, for example, the person 40 from the acquired distance image.

Hereinafter, the first to sixth embodiments of the present invention will be described on the premise of such an information processing system 1.

<< 3. First Embodiment >>
First, the first embodiment of the present invention will be described with reference to FIGS. 5 to 15. According to the first embodiment of the present invention, a plurality of filters applied at different positions in the distance image after background subtraction are used. This makes it possible to detect a person from the distance image more accurately even when the person is not imaged from directly above.

<3-1. Configuration of information processing apparatus>
An example of the configuration of the information processing apparatus according to the first embodiment will be described with reference to FIGS. 5 to 14. FIG. 5 is a block diagram illustrating an example of the configuration of the information processing apparatus 100-1 according to the first embodiment. Referring to FIG. 5, the information processing apparatus 100-1 includes a communication unit 110, a storage unit 120, and a control unit 130.

(Communication unit 110)
The communication unit 110 communicates with other devices. For example, the communication unit 110 communicates with the imaging device 30 via the network 50.

For example, the communication unit 110 receives a distance image generated by the imaging device 30. The communication unit 110 then provides the distance image to the control unit 130 (distance image acquisition unit 131).

(Storage unit 120)
The storage unit 120 stores data that is to be held temporarily or permanently in the information processing apparatus 100-1.

For example, the storage unit 120 stores a second distance image (hereinafter referred to as the "background distance image") generated in advance by the imaging device 30 by imaging the background.

Further, for example, the storage unit 120 stores information on the characteristics of the imaging device 30. For example, this information includes the number of pixels of the distance image 20 generated by the imaging device 30 and the angle of view of the imaging device 30.

Further, for example, the storage unit 120 stores filters for detecting regions. These filters are described later.

Further, for example, the storage unit 120 stores information on the installation state of the imaging device 30. This information includes, for example, the height at which the imaging device 30 is installed. As an example, the height is the height from the floor.

Further, for example, the storage unit 120 stores information on persons expected to be imaged by the imaging device 30. For example, this information includes the height, head width, and shoulder width of such a person.

(Control unit 130)
The control unit 130 provides various functions of the information processing apparatus 100-1. The control unit 130 includes a distance image acquisition unit 131, a foreground distance image generation unit 133, and an area detection unit 135.

(Distance image acquisition unit 131)
The distance image acquisition unit 131 acquires a first distance image (hereinafter referred to as the "target distance image") generated by the imaging device 30. For example, when the communication unit 110 receives a distance image generated by the imaging device 30, the distance image acquisition unit 131 acquires that distance image.

The distance image acquisition unit 131 also acquires the background distance image. For example, the distance image acquisition unit 131 acquires the background distance image stored in the storage unit 120 from the storage unit 120.

The distance image acquisition unit 131 provides the acquired target distance image and background distance image to the foreground distance image generation unit 133. Further, for example, the distance image acquisition unit 131 stores the acquired target distance image in the storage unit 120.

(Foreground distance image generation unit 133)
Based on the target distance image and the background distance image, the foreground distance image generation unit 133 generates a third distance image (hereinafter referred to as the "foreground distance image") corresponding to the foreground with respect to the background.

For example, let I(x, y) be the pixel value at coordinates (x, y) of the target distance image I, and let B(x, y) be the pixel value at coordinates (x, y) of the background distance image B. The foreground distance image, denoted here as F, is then expressed by the following equation, where th is a threshold.

$$F(x, y) = \begin{cases} I(x, y) & \text{if } |I(x, y) - B(x, y)| > th \\ 0 & \text{otherwise} \end{cases}$$

That is, the foreground distance image generation unit 133 calculates the difference between the pixel value I(x, y) of the target distance image I and the pixel value B(x, y) of the background distance image B, and determines whether the magnitude of the difference exceeds the threshold th. The foreground distance image generation unit 133 keeps the pixel value I(x, y) for each pixel whose difference magnitude exceeds the threshold th, and sets the pixel value to 0 for each pixel whose difference magnitude does not exceed the threshold th. In this way, the foreground distance image generation unit 133 generates a foreground distance image corresponding to the foreground in the target distance image I. A specific example of the foreground distance image is described below with reference to FIG. 6.
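
The thresholded background subtraction described above can be sketched as follows. This is a minimal illustration, not the implementation in this document: it assumes each pixel value is a scalar distance (the text also allows three-dimensional coordinates), represents images as nested lists, and the function name `foreground_distance_image` and the sample values are hypothetical.

```python
def foreground_distance_image(target, background, th):
    """Keep target pixels whose distance differs from the background by more than th.

    target, background: 2D lists of distances with identical shape (assumption).
    Pixels whose difference does not exceed th are set to 0.
    """
    return [
        [i if abs(i - b) > th else 0 for i, b in zip(row_i, row_b)]
        for row_i, row_b in zip(target, background)
    ]


# Hypothetical 2x3 example: floor at 3.0 m, two pixels covered by a person.
background = [[3.0, 3.0, 3.0],
              [3.0, 3.0, 3.0]]
target = [[3.0, 1.4, 3.02],
          [1.5, 3.0, 3.0]]
foreground = foreground_distance_image(target, background, th=0.1)
```

The small deviation of 0.02 in the top-right pixel stays below the threshold and is treated as background noise, which is the purpose of th.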

FIG. 6 is an explanatory diagram for describing an example of the generated foreground distance image. Referring to FIG. 6, a foreground distance image 60 is shown. In the foreground distance image 60, the pixels in the foreground area 61A and the foreground area 61B, which correspond to the foreground, keep the pixel values of the target distance image, and all other pixels have a pixel value of 0.

Note that the foreground distance image may also be called a background difference image.

(Area detection unit 135)
The area detection unit 135 detects one or more regions (hereinafter referred to as "object regions") from the foreground distance image by using a plurality of filters that are applied at different positions in the foreground distance image and that can recognize the pattern of a person's head and torso according to the position at which each filter is applied.

- Filter shape
For example, the area detection unit 135 detects one or more object regions as regions corresponding to a target part of a person. For example, the target part is the person's head. That is, the area detection unit 135 detects one or more object regions as regions corresponding to a person's head.

Further, for example, each of the plurality of filters is a Haar-Like filter. However, each of the plurality of filters has a shape different from that of the Haar-Like filter 10 shown in FIG. 1. A specific example is described below with reference to FIG. 7.

FIG. 7 is an explanatory diagram for describing an example of one of the plurality of filters according to the embodiment of the present invention. Referring to FIG. 7, a filter 70 according to the embodiment of the present invention is shown. The filter 70 is a Haar-Like filter. The filter 70 is a filter for recognizing the pattern of a person's head and torso (for example, shoulders, back, chest, and so on), and includes a smaller rectangle 71 and a larger rectangle 73. The filter 70 is applied pixel by pixel in the foreground areas 61 of the foreground distance image 60, and outputs 1 when the pixel to which it is applied corresponds to the head in the pattern and 0 otherwise. Specifically, when the filter 70 is applied to a target pixel, the filter 70 is placed in the foreground distance image 60 so that the target pixel is located at the center of the rectangle 71. Then, for example, the average of the heights corresponding to the foreground pixels included in the rectangle 73 is subtracted from the average of the heights corresponding to the foreground pixels included in the rectangle 71, which gives the difference between the average heights. The filter 70 outputs 1 when this difference exceeds a predetermined threshold and outputs 0 otherwise. Clustering the pixels for which the filter 70 outputs 1 yields the region corresponding to the person's head.
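
The filter output described above can be sketched as follows. This is an illustrative reading under stated assumptions, not the implementation in this document: rectangles are given as inclusive offsets from the target pixel, every foreground pixel inside a rectangle contributes to its mean, a rectangle with no foreground pixels yields an output of 0, and the names `mean_foreground_height`, `filter_output`, and the sample values are hypothetical.

```python
def mean_foreground_height(heights, mask, y, x, rect):
    """Mean height of foreground pixels inside rect.

    rect is (dy0, dx0, dy1, dx1): inclusive offsets from the target pixel (y, x).
    Returns None when the rectangle contains no foreground pixel.
    """
    dy0, dx0, dy1, dx1 = rect
    rows, cols = len(heights), len(heights[0])
    values = [
        heights[r][c]
        for r in range(max(y + dy0, 0), min(y + dy1, rows - 1) + 1)
        for c in range(max(x + dx0, 0), min(x + dx1, cols - 1) + 1)
        if mask[r][c]
    ]
    return sum(values) / len(values) if values else None


def filter_output(heights, mask, y, x, head_rect, body_rect, threshold):
    """Return 1 if mean(head_rect) - mean(body_rect) exceeds threshold, else 0."""
    head = mean_foreground_height(heights, mask, y, x, head_rect)
    body = mean_foreground_height(heights, mask, y, x, body_rect)
    if head is None or body is None:
        return 0  # assumption: no foreground in a rectangle means no detection
    return 1 if head - body > threshold else 0


# Hypothetical height map (metres above the floor): head at the centre, shoulders around it.
heights = [[0, 0,   0,   0,   0],
           [0, 1.4, 1.4, 1.4, 0],
           [0, 1.4, 1.7, 1.4, 0],
           [0, 1.4, 1.4, 1.4, 0],
           [0, 0,   0,   0,   0]]
mask = [[h > 0 for h in row] for row in heights]
head_rect = (0, 0, 0, 0)     # smaller rectangle, centred on the target pixel
body_rect = (-1, -1, 1, 1)   # larger rectangle

at_head = filter_output(heights, mask, 2, 2, head_rect, body_rect, threshold=0.2)
at_shoulder = filter_output(heights, mask, 1, 1, head_rect, body_rect, threshold=0.2)
```

Expressing the rectangles as offsets leaves room for the larger rectangle to be shifted relative to the target pixel, which is what a position-dependent filter shape requires.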

Note that the height corresponding to each pixel is, for example, the height from the floor of the subject corresponding to that pixel. This height can be calculated from the pixel value (that is, the distance) of each pixel, the installation state of the imaging device 30, and the information on the characteristics of the imaging device 30. A specific method for calculating the height of the subject corresponding to a pixel of interest is described below.

First, the imaging angle θ_X corresponding to the pixel of interest (for example, the angle with respect to the direction directly below the imaging device 30) can be determined from the angle of view of the imaging device 30 and the number of pixels of the distance image generated by the imaging device 30. In addition, the distance d_X from the imaging device 30 to the subject corresponding to the pixel of interest (that is, the subject in the direction of the imaging angle) is obtained from the pixel value of the pixel of interest. Multiplying the cosine of the imaging angle (cos θ_X) by the distance d_X therefore gives the distance in the height direction between the subject and the imaging device 30. Then, for example, because the height D at which the imaging device 30 is installed is stored in the storage unit 120, the height D_X of the subject (that is, the height corresponding to the pixel of interest) is calculated by subtracting the distance in the height direction between the subject and the imaging device 30 from the height D. That is, the height D_X corresponding to the pixel of interest is expressed as follows.

$$D_X = D - d_X \cos\theta_X$$
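
The height calculation above can be sketched as follows. This is a minimal one-direction sketch: it assumes the pixel coordinate is measured from the image center and is proportional to the imaging angle (the assumption this document states later for filter generation), and the function names and sample values are hypothetical.

```python
import math


def imaging_angle(p, n_pixels, fov):
    """Imaging angle theta_X (radians from straight down) of pixel coordinate p.

    p is measured from the image center; fov is the full angle of view in radians.
    Assumes the pixel coordinate is proportional to the imaging angle.
    """
    return p * fov / n_pixels


def pixel_height(d_x, theta_x, camera_height):
    """Height D_X above the floor: D - d_X * cos(theta_X)."""
    return camera_height - d_x * math.cos(theta_x)


# Hypothetical setup: camera at 3.0 m, 320 pixels across a 60-degree angle of view.
theta = imaging_angle(80, 320, math.radians(60))   # about 15 degrees
d = 1.3 / math.cos(math.radians(15))               # subject 1.3 m below the camera
height = pixel_height(d, theta, camera_height=3.0) # about 1.7 m
```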

- Multiple filters
In the first embodiment in particular, a plurality of filters applied at different positions in the foreground distance image are used.

As described above, the imaging device 30 does not necessarily image a person from directly above. For example, when a person is located near the edge of the imaging range of the imaging device 30, the person is imaged from an oblique direction. A specific example is described below with reference to FIGS. 8 and 9.

FIG. 8 is an explanatory diagram for describing an example of a target distance image generated by the imaging device 30. Referring to FIG. 8, a target distance image 20 is shown. The target distance image 20 is a distance image generated when persons are located at nine positions within the imaging range of the imaging device 30, and it includes nine foreground areas 21A to 21I corresponding to these persons. For example, the foreground area 21E corresponds to a person located directly below the imaging device 30. In contrast, each of the foreground areas 21 other than the foreground area 21E corresponds to a person located near the edge of the imaging range of the imaging device 30. Thus, the foreground area 21E includes regions corresponding to the person's head and shoulders, whereas each of the other foreground areas 21 also includes regions corresponding to parts other than the head and shoulders (for example, the back or chest).

FIG. 9 is an explanatory diagram for describing an example of the foreground distance image obtained from the target distance image shown in FIG. 8. Referring to FIG. 9, a foreground distance image 60 is shown. The foreground distance image 60 includes nine foreground areas 61A to 61I corresponding to the nine foreground areas 21A to 21I. As described above, in the foreground distance image 60, the pixels in each foreground area 61 keep the pixel values of the target distance image 20 (that is, the pixel values of the pixels in the corresponding foreground area 21), and all other pixels have a pixel value of 0.

As described above, for example, the target distance image 20 shown in FIG. 8 is acquired and the foreground distance image 60 shown in FIG. 9 is generated. In such a case, if a filter that assumes the person is imaged from directly above (for example, the Haar-Like filter 10 shown in FIG. 1) is used, then in the foreground areas 61 other than the foreground area 61E (that is, the foreground areas 61A to 61D and 61F to 61I), regions corresponding to other parts can be erroneously detected in addition to the region corresponding to the person's head. For example, a region corresponding to the person's shoulder, back, or chest can be erroneously detected. That is, even when only one person is present, two or more persons (two or more regions corresponding to persons) can be detected. Therefore, in the embodiment of the present invention, a plurality of filters applied at different positions in the foreground distance image 60 are used. A specific example is described below with reference to FIG. 10.

FIG. 10 is an explanatory diagram for describing examples of filters used for detecting object regions in the foreground distance image. Referring to FIG. 10, nine filters 70A to 70I are shown. Thus, in the embodiment of the present invention, a different filter 70 is applied depending on the position in the foreground distance image 60. As an example, a different filter 70 is used for each pixel to which a filter is applied. Each filter 70 can recognize the pattern of a person's head and torso according to the position at which it is applied. For example, to recognize that pattern, each filter 70 has a shape that depends on the position at which it is applied.

- Filter generation method
Next, the method for generating the filters 70 will be described.

-- Information used (characteristics of imaging device)
For example, each of the plurality of filters is generated further based on the characteristics of the imaging device 30.

For example, the characteristics of the imaging device 30 include the number of pixels (N_h, N_v) of the distance image generated by the imaging device 30. N_h is the number of pixels in the horizontal direction of the distance image, and N_v is the number of pixels in the vertical direction of the distance image. A specific example is described below with reference to FIG. 11.

FIG. 11 is an explanatory diagram for describing an example of the number of pixels of the distance image generated by the imaging device. Referring to FIG. 11, the target distance image 20 generated by the imaging device 30 is shown, as in FIG. 3. As shown in FIG. 11, the target distance image 20 has N_h pixels in the horizontal direction and N_v pixels in the vertical direction. Because the target distance image 20 and the foreground distance image 60 have the same number of pixels, N_h and N_v are also the numbers of pixels of the foreground distance image 60.

Further, for example, the characteristics of the imaging device 30 include the angle of view (Θ_h, Θ_v) of the imaging device 30. Θ_h is the angle of view corresponding to the horizontal direction of the target distance image 20, and Θ_v is the angle of view corresponding to the vertical direction of the target distance image 20. A specific example is described below with reference to FIG. 12.

FIG. 12 is an explanatory diagram for describing the angle of view of the imaging device. Referring to FIG. 12, the imaging device 30 and its angle of view Θ are shown. The angle of view Θ is either the angle of view Θ_h corresponding to the horizontal direction of the target distance image 20 or the angle of view Θ_v corresponding to the vertical direction of the target distance image 20. The imaging device 30 images the imaging range determined by this angle of view Θ.

-- Information used (relative relationship between imaging device and person)
For example, each of the plurality of filters 70 is generated based on the relative relationship between the imaging device 30 and a person when the person is imaged by the imaging device 30.

Specifically, for example, each of the plurality of filters 70 is applied to a different pixel. That is, a filter 70 is generated for each pixel. In this case, the relative relationship used when generating the filter 70 for a given pixel is the relative relationship between the imaging device 30 and a person who is assumed to be positioned so that the center of that pixel corresponds to the center of the person's head.

Further, for example, the relative relationship includes the direction from the imaging device 30 in which the person's head is located and the direction from the imaging device 30 in which the person's torso is located. For example, each of the plurality of filters is a filter that can recognize the pattern of a person's head and shoulders according to the position at which it is applied. In this case, the relative relationship includes the direction from the imaging device 30 in which the person's head is located and the direction from the imaging device 30 in which the person's shoulders are located.

As an example, the direction from the imaging device 30 in which the person's head is located is the range of imaging angles corresponding to the person's head. Likewise, the direction from the imaging device 30 in which the person's shoulders are located is the range of imaging angles corresponding to the person's shoulders.

Specifically, for example, the relative relationship used when generating the filter 70 for a given pixel includes the range of imaging angles corresponding to the person's head and the range of imaging angles corresponding to the person's shoulders, under the assumption that the person is positioned so that the center of that pixel corresponds to the center of the person's head. A specific example is described below with reference to FIGS. 13A and 13B.

FIG. 13A is an explanatory diagram for describing an example of a method for calculating the range of imaging angles corresponding to a person's head. Referring to FIG. 13A, the imaging device 30 and the person 40 are shown. The head width of the person 40 is W_H, and the shoulder width of the person 40 is W_S. The distance in the height direction between the imaging device 30 and the head of the person 40 is D_H, and the distance in the height direction between the imaging device 30 and the shoulders of the person 40 is D_S. Given the assumption that the pixel to which the filter 70 is applied corresponds to the center of the head of the person 40, the imaging angle θ_H of the head center corresponding to each pixel of the target distance image can be calculated from the number of pixels of the target distance image 20 generated by the imaging device 30 and the angle of view of the imaging device 30. Then, for example, using the imaging angle θ_H of the head center, the head width W_H of the person 40, and the distance D_H in the height direction between the imaging device 30 and the head of the person 40, the range of imaging angles θ_H± corresponding to the head of the person 40 can be calculated as follows.

FIG. 13B is an explanatory diagram for describing an example of a method for calculating the range of imaging angles corresponding to a person's shoulders. Referring to FIG. 13B, as in FIG. 13A, the shoulder width W_S of the person 40 and the distance D_S in the height direction between the imaging device 30 and the shoulders of the person 40 are shown. The head width W_H of the person 40, the distance D_H in the height direction between the imaging device 30 and the head of the person 40, and the imaging angle θ_H of the head center are not shown, but they are the same as in FIG. 13A. The imaging angle θ_S of the shoulder center can be calculated from the imaging angle θ_H of the head center, the distance D_H, and the distance D_S. Then, for example, using the imaging angle θ_S of the shoulder center, the shoulder width W_S of the person 40, and the distance D_S in the height direction between the imaging device 30 and the shoulders of the person 40, the range of imaging angles θ_S± corresponding to the shoulders of the person 40 can be calculated as follows.
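
The equations for θ_H± and θ_S± are not reproduced in this text. The following sketch shows one geometric reading that uses only the quantities named above; it is an assumption, not the formula of this document. It assumes an upright person whose shoulder center lies directly below the head center, and the function names and sample values are hypothetical.

```python
import math


def angle_range(center_offset, width, vertical_distance):
    """Imaging angles (radians from straight down) of the two edges of a body part.

    center_offset: horizontal offset of the part's center from the point below the camera.
    width: width of the part; vertical_distance: height-direction distance from the camera.
    """
    lo = math.atan((center_offset - width / 2) / vertical_distance)
    hi = math.atan((center_offset + width / 2) / vertical_distance)
    return lo, hi


def head_and_shoulder_ranges(theta_h, w_h, w_s, d_h, d_s):
    """Return ((theta_H-, theta_H+), (theta_S-, theta_S+)) for head-center angle theta_h."""
    # Horizontal offset of the head center, recovered from theta_H and D_H.
    offset = d_h * math.tan(theta_h)
    head = angle_range(offset, w_h, d_h)
    # Assumption: the shoulder center shares the head center's horizontal offset,
    # so theta_S = atan(offset / D_S).
    shoulder = angle_range(offset, w_s, d_s)
    return head, shoulder


# Hypothetical person directly below the camera (theta_H = 0).
head, shoulder = head_and_shoulder_ranges(0.0, w_h=0.2, w_s=0.45, d_h=1.3, d_s=1.55)
# Hypothetical person seen obliquely (head center at 45 degrees).
oblique_head, oblique_shoulder = head_and_shoulder_ranges(
    math.atan(1.0), w_h=0.2, w_s=0.45, d_h=1.0, d_s=1.25)
```

Under this reading the head and shoulder ranges are concentric when the person stands directly below the camera and shift apart as the head-center angle grows, which matches the position-dependent filter shapes of FIG. 10.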

In this way, for example, the relative relationship between the imaging device 30 and a person when the person is imaged by the imaging device 30 is calculated.

Note that the calculation method described above gives the range of imaging angles for one direction. Therefore, as the relative relationship, the range of imaging angles corresponding to the vertical direction of the target distance image 20 and the range of imaging angles corresponding to the horizontal direction of the target distance image 20 are each calculated. That is, the horizontal and vertical ranges of imaging angles corresponding to the person's head (θ_Hh±, θ_Hv±) and the horizontal and vertical ranges of imaging angles corresponding to the person's shoulders (θ_Sh±, θ_Sv±) are calculated.

-- Generation of filter
As described with reference to FIG. 7, the filter 70 includes, for example, the rectangle 71 and the rectangle 73. The horizontal and vertical pixel ranges (P_Hh±, P_Hv±) of the rectangle 71 corresponding to the head are calculated as follows, using the horizontal and vertical ranges of imaging angles corresponding to the person's head (θ_Hh±, θ_Hv±), the number of pixels of the target distance image 20 (N_h, N_v), and the angle of view of the imaging device 30 (Θ_h, Θ_v). The pixel ranges below are expressed as pixel coordinates with the center of the target distance image 20 as the origin. In view of the characteristics of the distance image, the pixel coordinates are assumed to be proportional to the imaging angle.

Similarly, the horizontal and vertical pixel ranges (P_Sh±, P_Sv±) of the rectangle 73 corresponding to the shoulders are calculated as follows.
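
The pixel-range equations are not reproduced in this text. Under the stated assumption that pixel coordinates (origin at the image center) are proportional to the imaging angle, the conversion can be sketched as follows. The proportionality constant N/Θ, the rounding to the nearest pixel, and the function names are assumptions for illustration.

```python
def angle_to_pixel(theta, n_pixels, fov):
    """Pixel coordinate (origin at the image center) for imaging angle theta.

    Assumes the full angle of view fov spans n_pixels pixels linearly.
    """
    return theta * n_pixels / fov


def pixel_range(theta_range, n_pixels, fov):
    """Convert an imaging-angle range (theta-, theta+) into a pixel range (P-, P+)."""
    lo, hi = theta_range
    # Rounding to the nearest integer pixel is an assumption of this sketch.
    return (round(angle_to_pixel(lo, n_pixels, fov)),
            round(angle_to_pixel(hi, n_pixels, fov)))


# Hypothetical values: 320 pixels across an angle of view of 1.0 rad.
head_px = pixel_range((-0.05, 0.05), n_pixels=320, fov=1.0)
shoulder_px = pixel_range((-0.10, 0.15), n_pixels=320, fov=1.0)
```

Applying the same conversion separately with (N_h, Θ_h) and (N_v, Θ_v) would give the horizontal and vertical extents of each rectangle.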

In this way, for example, a filter 70 is generated for each pixel. The per-pixel filters 70 are, for example, generated in advance and stored in the storage unit 120, and each is retrieved when it is applied to its pixel.

- Example of object region detection using the plurality of filters
An example of object region detection using the plurality of filters will be described with reference to FIG. 14. FIG. 14 is an explanatory diagram for describing an example of object region detection using the plurality of filters according to the embodiment of the present invention. Referring to FIG. 14, the object regions 81A and 81B detected when object region detection is performed on the foreground distance image 60 shown in FIG. 6 are shown, together with a binary image 80 that includes only the object regions 81A and 81B. In the binary image 80, the pixel value is 1 at pixels for which the filter 70 outputs 1 and 0 at all other pixels. That is, each of the object regions 81A and 81B is a set of pixels whose pixel value is 1. In this way, the regions corresponding to persons' heads are detected as the individual object regions 81A and 81B.

なお、領域検出部１３５は、２値画像８０に含まれる画素のうちの、画素値１を有する画素について、ラベリングを行うことにより、個々の物体領域８１（例えば、物体領域８１Ａ、物体領域８１Ｂ）を検出する。例えば、領域検出部１３５は、上下左右斜めの８近傍で連続する画素に同一のラベル（例えば番号）を付加する。その結果、同一のラベルを有する画素の集合が、個々の物体領域８１Ａ及び８１Ｂとなる。 Note that the area detection unit 135 performs labeling on the pixels having the pixel value 1 among the pixels included in the binary image 80, thereby causing the individual object areas 81 (for example, the object area 81A and the object area 81B) to be labeled. Is detected. For example, the region detection unit 135 adds the same label (for example, a number) to pixels that are continuous in the vicinity of eight diagonally up, down, left and right. As a result, a set of pixels having the same label becomes individual object regions 81A and 81B.
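The labeling step can be illustrated with a breadth-first flood fill over the 8-neighborhood. This is a minimal sketch that assumes the binary image is given as a list of rows of 0/1 values; the function name `label_regions` is hypothetical.

```python
from collections import deque


def label_regions(binary):
    """Label 8-connected regions of 1-pixels.

    binary: list of rows containing 0 or 1.
    Returns (labels, count): labels holds 0 for background pixels and
    1..count for the pixels of each detected region.
    """
    h = len(binary)
    w = len(binary[0]) if h else 0
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or labels[y][x]:
                continue
            count += 1                      # start a new region
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                # visit the 8 neighbors (up, down, left, right, diagonal)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1
                                and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

Pixels that touch only at a corner receive the same label, which is what distinguishes 8-connectivity from 4-connectivity.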

<3-2. Process flow>
Next, an example of information processing according to the first embodiment will be described with reference to FIG. 15. FIG. 15 is a flowchart illustrating an example of a schematic flow of information processing according to the first embodiment.

In step S310, the distance image acquisition unit 131 acquires the background distance image stored in the storage unit 120. In step S320, the distance image acquisition unit 131 acquires the target distance image 20 generated by the imaging device 30.

Next, in step S330, the foreground distance image generation unit 133 generates a foreground distance image based on the target distance image and the background distance image.

Then, in step S340, the region detection unit 135 detects one or more object regions from the foreground distance image using the plurality of filters 70. The process then ends.

Note that the target distance image 20 may be generated and acquired continuously. In this case, the process may return to step S320 after step S340.
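The control flow of steps S310 to S340, including the optional return to S320 for continuously acquired images, can be sketched as a generator. The four parameters are hypothetical callables and iterables standing in for the units described above; their internals are not specified here.

```python
def run_detection(load_background, frames, make_foreground, detect_regions):
    """Yield the detection result for each target distance image.

    load_background: callable returning the stored background image (S310)
    frames:          iterable of target distance images (S320, repeated)
    make_foreground: callable(target, background) -> foreground image (S330)
    detect_regions:  callable(foreground) -> detected object regions (S340)
    """
    background = load_background()                        # S310
    for target in frames:                                 # S320
        foreground = make_foreground(target, background)  # S330
        yield detect_regions(foreground)                  # S340
```

The background is loaded once, while the remaining steps repeat for every new target image.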

The first embodiment of the present invention has been described above. According to the first embodiment, a plurality of filters are used that are applied at different positions in the foreground distance image and that can recognize the pattern of a person's head and torso according to the position at which each is applied. This makes it possible to detect a person more accurately from a distance image even when the person is not imaged from directly above.

More specifically, for example, even when the person 40 is located near the edge of the imaging range of the imaging device 30, a region corresponding to a part of the person other than the head (for example, the shoulders, back, or chest) is less likely to be detected erroneously. That is, it becomes less likely that two or more persons (two regions corresponding to persons) are detected even though only one person is present, so erroneous detection of persons can be reduced.

Furthermore, using the plurality of filters 70 increases the degree of freedom in installing the imaging device 30. As an example, even when the imaging device can be installed only at a low position (for example, when the ceiling is low), the imaging device 30 can be installed at that low position and a person can still be detected more accurately from the distance image generated by the imaging device 30. As another example, the imaging device 30 can also be installed facing an oblique direction rather than facing straight down. In this case as well, the filter 70 is generated based on the installation state of the imaging device 30.

Also according to the first embodiment, the filter 70 is generated, for example, based on the relative relationship between the imaging device 30 and the person 40 when the person 40 is imaged by the imaging device 30. As a result, once information on this relative relationship is obtained, the filter 70 can be generated automatically. This reduces the effort required when the installation state of the imaging device 30 is changed or when an imaging device 30 is newly installed.

Also according to the first embodiment, the filter 70 is generated, for example, further based on the characteristics of the imaging device 30. As a result, the filter 70 can be generated automatically even when imaging devices 30 with various specifications are used. This reduces the effort required when the installation state of the imaging device 30 is changed or when an imaging device 30 is newly installed.

<<4. Second Embodiment>>
Next, a second embodiment of the present invention will be described with reference to FIGS. 16 to 20. According to the second embodiment, whether a region detected using the plurality of filters 70 corresponds to the target portion of a person (that is, the head) is determined based on the size of the region. This makes it possible to detect a person from a distance image even more accurately.

<4-1. Configuration of information processing apparatus>
An example of the configuration of the information processing apparatus according to the second embodiment will be described with reference to FIGS. 16 to 20. FIG. 16 is a block diagram illustrating an example of the configuration of the information processing apparatus 100-2 according to the second embodiment. Referring to FIG. 16, the information processing apparatus 100-2 includes a communication unit 110, a storage unit 120, and a control unit 140.

Here, the communication unit 110, the storage unit 120, and the distance image acquisition unit 131, the foreground distance image generation unit 133, and the region detection unit 135 included in the control unit 140 do not differ between the first embodiment and the second embodiment. Therefore, only the region correction unit 141 and the region determination unit 143 included in the control unit 140 will be described.

(Region correction unit 141)
Based on the target distance image 20 or the foreground distance image 60, the region correction unit 141 corrects the one or more detected object regions 81 by adding, to each of at least one object region 81 among them, a peripheral region of that object region 81 that satisfies a predetermined condition.

For example, the region correction unit 141 corrects all of the one or more detected object regions 81. That is, the region correction unit 141 corrects the one or more object regions 81 by adding, to each of them, a peripheral region of that object region 81 that satisfies the predetermined condition.

Also, for example, the region correction unit 141 adds to an object region 81 those pixels that correspond to a certain height or more, from among the pixels located around the object region 81 whose pixel value is 0 (hereinafter referred to as "peripheral pixels"). That is, the region correction unit 141 sets the pixel value of each such pixel to 1.

More specifically, for example, the region correction unit 141 first sets a threshold for each object region 81. As an example, the threshold is set to the value obtained by subtracting the assumed height of a head (that is, the difference between the distance D_S and the distance D_H shown in FIG. 12A) from the average of the heights corresponding to the pixels of the object region 81. The region correction unit 141 then compares the height corresponding to each peripheral pixel with the threshold and adds each peripheral pixel whose height exceeds the threshold to the object region 81. As an example, the region correction unit 141 selects, as peripheral pixels, the pixels located within a distance of twice the head width W_H from the center of the object region 81.
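The threshold-based correction can be sketched as follows. The sketch assumes that a per-pixel height map has already been derived from the target or foreground distance image, that the "center" of the region is its centroid, and that the head width W_H is expressed in pixels; all names are hypothetical.

```python
import math


def correct_region(region, heights, head_height, head_width):
    """Extend one object region with qualifying peripheral pixels.

    region:      set of (y, x) pixels belonging to the object region
    heights:     2D list of per-pixel heights (assumed precomputed)
    head_height: assumed head height (D_S - D_H in the description)
    head_width:  head width W_H, assumed here to be in pixels
    """
    n = len(region)
    mean_height = sum(heights[y][x] for y, x in region) / n
    threshold = mean_height - head_height
    # Assumption: the region center is taken to be its centroid.
    cy = sum(y for y, _ in region) / n
    cx = sum(x for _, x in region) / n
    corrected = set(region)
    for y, row in enumerate(heights):
        for x, h in enumerate(row):
            if (y, x) in region:
                continue
            is_peripheral = math.hypot(y - cy, x - cx) <= 2 * head_width
            if is_peripheral and h > threshold:
                corrected.add((y, x))   # same as setting the pixel to 1
    return corrected
```

Because the threshold is derived from each region's own mean height, the correction adapts to persons of different stature.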

The object region 81 is corrected as described above, for example. Specific examples of the correction of the object region 81 are described below with reference to FIGS. 17 and 18.

FIG. 17 is an explanatory diagram for describing a first example of the correction of the object region 81 according to the second embodiment. FIG. 17 shows the binary image 80 and the object regions 81A and 81B before correction, and the binary image 80 and the object regions 81A and 81B after correction. Each of the object regions 81A and 81B before correction is missing part of the region corresponding to the head. By adding those peripheral pixels of the object regions 81A and 81B that satisfy the predetermined condition, the parts of the head region that were missing before correction are added, as in the corrected object regions 81A and 81B.

FIG. 18 is an explanatory diagram for describing a second example of the correction of the object region 81 according to the second embodiment. In FIG. 18, unlike the example of FIG. 17, an object region 81C is additionally detected. The object region 81C is, for example, a region corresponding to part of a person's shoulder. When the object region 81C is detected in this way despite the use of the filter 70, the object region 81C is corrected in the same manner as the object regions 81A and 81B.

Note that the method of correcting the object region 81 is not limited to the example described above. For example, dilation of the binary image 80 may be performed with reference to the target distance image 20 or the foreground distance image 60. Specifically, for a pixel of interest whose pixel value is 0 in the binary image 80, it may be determined whether an object region 81 exists in the vicinity of the pixel of interest. If an object region 81 exists in the vicinity, the height corresponding to the pixel of interest may be calculated with reference to the target distance image 20 or the foreground distance image 60. If the calculated height exceeds the threshold, the pixel of interest may be added to the nearby object region 81.

(Region determination unit 143)
The region determination unit 143 determines, for each of at least one object region 81 among the one or more detected object regions 81, whether that object region 81 corresponds to the target portion of a person, based on the size corresponding to that object region 81.

For example, the region determination unit 143 determines, based on the size corresponding to each of the at least one corrected object region 81, whether each of the at least one object region 81 corresponds to the target portion of a person.

More specifically, for example, the target portion is the head of a person, and each of the one or more detected regions 81 is corrected. That is, the region determination unit 143 determines, based on the size corresponding to each of the one or more corrected object regions 81, whether each of them is a region corresponding to the head.

-Calculation of the size of an object region
For example, the region determination unit 143 calculates the size of an object region 81 using moment features of the binary image. Using the moment features, an ellipse whose center of gravity and second-order moments about the center of gravity are equal to those of the object region 81 (that is, the inertia-equivalent ellipse) is calculated. In other words, the lengths of the major and minor axes of this ellipse are calculated. The region determination unit 143 then regards the size of the ellipse as the size of the object region 81.

Specifically, for example, the region determination unit 143 performs the following processing for each individual object region 81. First, the region determination unit 143 focuses on one object region 81 and, in a binary image J(x, y), sets the pixel values of the object region 81 of interest to 1 and all other pixel values to 0. The region determination unit 143 then calculates the zeroth-order moment M_00, the first-order moment M_10, and the second-order moment M_20 of the object region 81 of interest as follows.

Then, the second-order moments U_xx, U_xy, and U_yy about the center of gravity are calculated as follows.

Then, the major axis L_m and the minor axis L_n of the inertia-equivalent ellipse are calculated as follows.

As described above, the region determination unit 143 calculates the lengths of the major axis L_m and the minor axis L_n of the ellipse as the size of the object region 81.
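The formulas are not reproduced in this text, so the sketch below uses the standard definitions of raw image moments and of the inertia-equivalent ellipse, which may differ in normalization from the ones in the specification. The moments M_01, M_02, and M_11 and the function name are additions made for this illustration.

```python
import math


def equivalent_ellipse_axes(region):
    """Return (L_m, L_n), the full major and minor axis lengths of the
    inertia-equivalent ellipse of a region.

    region: iterable of (x, y) pixels whose value in J(x, y) is 1.
    """
    pts = list(region)
    m00 = len(pts)                         # zeroth-order moment (area)
    m10 = sum(x for x, _ in pts)           # first-order moments
    m01 = sum(y for _, y in pts)
    m20 = sum(x * x for x, _ in pts)       # second-order moments
    m02 = sum(y * y for _, y in pts)
    m11 = sum(x * y for x, y in pts)
    gx, gy = m10 / m00, m01 / m00          # center of gravity
    # Second-order moments about the center of gravity, normalized by area
    # (the normalization is an assumption of this sketch).
    u_xx = m20 / m00 - gx * gx
    u_yy = m02 / m00 - gy * gy
    u_xy = m11 / m00 - gx * gy
    root = math.sqrt((u_xx - u_yy) ** 2 + 4 * u_xy ** 2)
    # A uniform ellipse with semi-axis a has variance a^2 / 4 along that
    # axis, so the full axis length is 4 * sqrt(eigenvalue).
    l_m = 2 * math.sqrt(2 * (u_xx + u_yy + root))
    l_n = 2 * math.sqrt(2 * max(u_xx + u_yy - root, 0.0))
    return l_m, l_n
```

The `max(..., 0.0)` guard only protects against small negative values caused by floating-point rounding in degenerate (line-like) regions.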

-Determination of whether an object region corresponds to the head
If the calculated size of the object region 81 is within a predetermined size range corresponding to the head, the region determination unit 143 determines that the object region 81 is a region corresponding to the head.

For example, the predetermined size range is the range in which the difference from the size of the rectangle 71 of the filter 70 is no more than a predetermined value. That is, where the horizontal and vertical widths of the rectangle 71 are (W_Fh, W_Fv) and the threshold is T, the object region 81 is determined to be a region corresponding to a person when the major axis L_m and the minor axis L_n satisfy the following condition.

FIG. 19 is an explanatory diagram for describing an example of a method for determining whether the object region 81 is a region corresponding to a person's head. FIG. 19 shows the rectangle 71 having the widths W_Fh and W_Fv, and the major axis L_m and the minor axis L_n of the calculated ellipse. In this way, the major axis L_m and the minor axis L_n of the ellipse are compared with the widths W_Fh and W_Fv. If the major axis L_m and the minor axis L_n do not differ greatly from the widths W_Fh and W_Fv, the object region 81 is determined to be a region corresponding to the person's head.
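The exact condition is not reproduced in this text. One plausible reading, shown here purely as an assumption, is that each ellipse axis must lie within the threshold T of the matching rectangle width, pairing the longer axis with the longer width. The function name is hypothetical.

```python
def is_head_region(l_m, l_n, w_fh, w_fv, t):
    """Assumed size test: the major axis L_m is compared with the longer
    width of rectangle 71 and the minor axis L_n with the shorter width;
    both differences must be at most the threshold T."""
    w_long, w_short = max(w_fh, w_fv), min(w_fh, w_fv)
    return abs(l_m - w_long) <= t and abs(l_n - w_short) <= t
```

A region much larger or more elongated than the rectangle 71, such as one covering a shoulder, fails this test.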

As described above, whether the object region 81 is a region corresponding to a person's head is determined. With this determination, even if an object region 81 such as the object region 81C shown in FIG. 18 is detected, such an object region 81C can ultimately be excluded.

Note that, in the determination, instead of the widths W_Fh and W_Fv of the rectangle 71, the person's height may be estimated based on the height corresponding to the object region 81 (for example, the average of the heights corresponding to its pixels), and the head width corresponding to that height may be used.

<4-2. Process flow>
Next, an example of information processing according to the second embodiment will be described with reference to FIG. 20. FIG. 20 is a flowchart illustrating an example of a schematic flow of information processing according to the second embodiment.

Here, steps S410 to S440 are the same as steps S310 to S340 according to the first embodiment described with reference to FIG. 15. Therefore, only steps S450 and S460 will be described here.

In step S450, based on the target distance image 20 or the foreground distance image 60, the region correction unit 141 corrects the one or more detected object regions 81 by adding, to each of them, a peripheral region of that object region 81 that satisfies the predetermined condition.

Then, in step S460, the region determination unit 143 determines, based on the size corresponding to each of the one or more corrected object regions 81, whether each of them is a region corresponding to the head. The process then ends.

Note that the target distance image 20 may be generated and acquired continuously. In this case, the process may return to step S420 after step S460.

The second embodiment of the present invention has been described above. According to the second embodiment, whether an object region 81 is a region corresponding to the head is determined based on the size corresponding to the object region 81. This makes it possible to detect a person from a distance image even more accurately.

More specifically, for example, when a region corresponding to a part of the person other than the target portion (for example, the shoulders, back, or chest) is still detected erroneously despite the use of the filter 70, that region can ultimately be excluded. That is, it becomes less likely that two or more persons (two regions corresponding to persons) are detected even though only one person is present, so erroneous detection of persons can be reduced.

Also, when a region corresponding to an object other than a person (for example, an animal or a transported object) is detected erroneously, that region can ultimately be excluded. That is, the possibility of detecting an object other than a person can be reduced.

Also according to the second embodiment, the object region 81 is corrected by adding a peripheral region. As a result, even when a region corresponding to part of the head is not included in the object region 81, that region can be included in the object region 81 after the fact. This reduces the possibility that an object region 81 that corresponds to the target portion of a person (for example, the head) is determined not to be a region corresponding to the head merely because it lacks a region corresponding to part of the head. Conversely, it also reduces the possibility that an object region 81 that corresponds to something other than the target portion of a person (for example, an animal or a transported object) is determined to be a region corresponding to the head.

<<5. Third Embodiment>>
Next, a third embodiment of the present invention will be described with reference to FIGS. 21 to 23. According to the third embodiment, among the one or more object regions 81, high-possibility regions, which are highly likely to be determined to be regions corresponding to the target portion of a person (for example, the head), are distinguished from the other regions. Whether a region corresponds to the target portion of a person is then determined for the other regions. This reduces the processing required until a person is finally detected.

<5-1. Configuration of information processing apparatus>
An example of the configuration of the information processing apparatus according to the third embodiment will be described with reference to FIGS. 21 and 22. FIG. 21 is a block diagram illustrating an example of the configuration of the information processing apparatus 100-3 according to the third embodiment. Referring to FIG. 21, the information processing apparatus 100-3 includes a communication unit 110, a storage unit 120, and a control unit 150.

Here, the communication unit 110, the storage unit 120, and the distance image acquisition unit 131, the foreground distance image generation unit 133, and the region detection unit 135 included in the control unit 150 do not differ between the second embodiment and the third embodiment. Therefore, only the region identification unit 151, the region correction unit 153, and the region determination unit 155 included in the control unit 150 will be described.

(Region identification unit 151)
The region identification unit 151 distinguishes, among the one or more object regions 81, high-possibility regions, which are highly likely to be determined to be regions corresponding to the target portion of a person, from the other regions.

-First example of high-possibility region
As a first example, a high-possibility region is an object region 81 located within a predetermined range in which regions corresponding to parts of a person other than the target portion are unlikely to be detected. For example, when the imaging device 30 is installed facing straight down as shown in FIG. 4, the predetermined range is a range near the center of the binary image 80. A specific example of this point will be described with reference to FIG. 22.

FIG. 22 is an explanatory diagram for describing an example of the predetermined range in which regions corresponding to parts of a person other than the target portion are unlikely to be detected. FIG. 22 shows the binary image 80 and object regions 81D and 81E. It also shows a predetermined range 85 in which regions corresponding to parts of a person other than the target portion (for example, the shoulders, back, or chest) are unlikely to be detected. As shown, the predetermined range 85 is, for example, a range near the center of the binary image 80. In this case, the region identification unit 151 identifies the object region 81D as a high-possibility region and the object region 81E as one of the other regions.

As described above, when the imaging device 30 is installed facing straight down, a region corresponding to a part of the person other than the head (for example, the shoulders, back, or chest) is highly likely to be detected. For that reason, in the range near the edges of the binary image 80, it is desirable to determine whether an object region 81 is a region corresponding to the head. In the range near the center of the binary image 80, on the other hand, a region corresponding to a part other than the head is unlikely to be detected. Therefore, when the imaging device 30 is installed facing straight down as shown in FIG. 4, the region identification unit 151 can identify an object region 81 located within the range near the center of the binary image 80 as a high-possibility region.
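The first example can be sketched as a position test on each region. The sketch assumes that the predetermined range 85 is a rectangle centered in the binary image and that a region counts as "located within" the range when its centroid falls inside it; the `fraction` parameter and the function name are hypothetical.

```python
def split_by_central_range(regions, width, height, fraction=0.5):
    """Split regions into (high_possibility, other).

    regions:  list of regions, each a list of (x, y) pixels
    width, height: size of the binary image in pixels
    fraction: assumed share of each image dimension covered by the
              centered range (stands in for the predetermined range 85)
    """
    half_w = width * fraction / 2
    half_h = height * fraction / 2
    cx0, cy0 = (width - 1) / 2, (height - 1) / 2   # image center
    high, other = [], []
    for region in regions:
        gx = sum(x for x, _ in region) / len(region)
        gy = sum(y for _, y in region) / len(region)
        inside = abs(gx - cx0) <= half_w and abs(gy - cy0) <= half_h
        (high if inside else other).append(region)
    return high, other
```

Only the regions returned in `other` would then be passed on to the correction and size-based determination.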

-Second example of high-possibility region
As a second example, when the one or more detected object regions 81 include only one object region 81, the high-possibility region is that one object region 81. When the one or more object regions 81 include two or more object regions 81, a high-possibility region is an object region 81 that is separated by at least a predetermined distance from every other object region 81 included in the one or more object regions 81.

For example, when only one object region 81 is detected, that object region 81 is identified as a high-possibility region. When two or more object regions 81 are detected, an object region 81 that has no other object region 81 nearby in the binary image 80 is identified as a high-possibility region, and an object region 81 that has another object region 81 nearby in the binary image 80 is identified as one of the other regions. In the example shown in FIG. 18, for instance, the object region 81B is identified as a high-possibility region, whereas the object regions 81A and 81C are identified as other regions.

When an object region 81 corresponding to the target portion of a person (for example, the head) and an object region 81 corresponding to another part of the same person (for example, the shoulders, back, or chest) are both detected, these object regions 81 are located near each other. Therefore, if there are object regions 81 located near each other, it is desirable to determine whether these object regions 81 are regions corresponding to the head. On the other hand, an object region 81 that has no other object region 81 nearby can be presumed not to be a region corresponding to one of the other parts. Also, even if the region corresponding to the target portion is not detected as an object region 81 and a region corresponding to one of the other parts is detected as an object region 81 instead, this does not result in two or more persons (two regions corresponding to persons) being detected when only one person is present. Therefore, the region identification unit 151 can identify an object region 81 that has no other object region 81 nearby as a high-possibility region.
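The second example can be sketched as an isolation test. The sketch assumes that the spacing between two object regions is measured between their centroids, which the description does not specify; `min_gap` stands in for the predetermined distance and the function name is hypothetical.

```python
import math


def split_by_isolation(centroids, min_gap):
    """Split region indices into (high_possibility, other).

    centroids: list of (x, y) region centers (assumed distance measure)
    min_gap:   predetermined distance; a region is high-possibility when
               every other region is at least this far away
    """
    high, other = [], []
    for i, (xi, yi) in enumerate(centroids):
        isolated = all(
            math.hypot(xi - xj, yi - yj) >= min_gap
            for j, (xj, yj) in enumerate(centroids)
            if j != i
        )
        (high if isolated else other).append(i)
    return high, other
```

When only one region is detected, the `all(...)` over an empty sequence is true, so that region is classified as high-possibility, matching the single-region case described above.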

（領域補正部１５３）
領域補正部１５３は、対象距離画像２０又は前景距離画像６０に基づいて、検出される１つ以上の物体領域８１のうちの少なくとも１つの物体領域８１の各々の周辺領域であって、所定の条件を満たす当該周辺領域を、当該少なくとも１つの物体領域８１の各々に追加することにより、当該１つ以上の物体領域８１を補正する。 (Region Correction Unit 153)
Based on the target distance image 20 or the foreground distance image 60, the region correction unit 153 corrects the one or more detected object regions 81 by adding, to each of at least one object region 81 among them, a peripheral region of that object region 81 that satisfies a predetermined condition.

とりわけ第３の実施形態では、上記少なくとも１つの領域は、上記１つ以上の領域のうちの、高可能性領域以外のその他の物体領域８１である。即ち、領域補正部１５３は、高可能性領域以外の上記その他の物体領域８１を補正する。具体的な補正の手法については、第２の実施形態において説明したとおりである。 In particular, in the third embodiment, the at least one region is an object region 81 other than the high possibility region among the one or more regions. That is, the area correction unit 153 corrects the other object area 81 other than the high possibility area. A specific correction method is as described in the second embodiment.

（領域判定部１５５）
領域判定部１５５は、検出される１つ以上の物体領域８１のうちの少なくとも１つの物体領域８１の各々に対応する大きさに基づいて、当該少なくとも１つの物体領域８１の各々が人物の対象部分に対応する領域かを判定する。 (Area determination unit 155)
Based on a size corresponding to each of at least one object region 81 among the one or more detected object regions 81, the region determination unit 155 determines whether each of the at least one object region 81 is a region corresponding to the target portion of a person.

とりわけ第３の実施形態では、上記少なくとも１つの領域は、上記１つ以上の領域のうちの、高可能性領域以外のその他の物体領域８１である。即ち、領域判定部１５５は、高可能性領域以外の上記その他の物体領域８１が人物の対象部分（例えば、頭部）に対応する領域であるかを判定する。具体的な判定の手法については、第２の実施形態において説明したとおりである。 In particular, in the third embodiment, the at least one region is an object region 81 other than the high possibility region among the one or more regions. That is, the region determination unit 155 determines whether the other object region 81 other than the high possibility region is a region corresponding to a target portion (for example, a head) of a person. The specific determination method is as described in the second embodiment.

＜５−２．処理の流れ＞
次に、図２３を参照して、第３の実施形態に係る情報処理の一例を説明する。図２３は、第３の実施形態に係る情報処理の概略的な流れの一例を示すフローチャートである。 <5-2. Process flow>
Next, an example of information processing according to the third embodiment will be described with reference to FIG. 23. FIG. 23 is a flowchart illustrating an example of a schematic flow of information processing according to the third embodiment.

ここで、ステップＳ５１０〜Ｓ５４０は、図２０を参照して説明した第２の実施形態に係るステップＳ４１０〜Ｓ４４０（即ち、図１５を参照して説明したステップＳ３１０〜Ｓ３４０）と同様である。よって、ここでは、ステップＳ５５０〜Ｓ５７０のみを説明する。 Here, steps S510 to S540 are the same as steps S410 to S440 according to the second embodiment described with reference to FIG. 20 (that is, steps S310 to S340 described with reference to FIG. 15). Therefore, only steps S550 to S570 will be described here.

ステップＳ５５０で、領域識別部１５１は、補正された１つ以上の物体領域８１のうちの、人物の頭部に対応する領域であると判定される可能性が高い高可能性領域と、その他の領域とを識別する。 In step S550, the region identification unit 151 identifies, among the corrected one or more object regions 81, a high possibility region that is highly likely to be determined to be a region corresponding to a person's head, and other regions.

ステップＳ５６０で、領域補正部１５３は、対象距離画像２０又は前景距離画像６０に基づいて、識別された上記その他の領域の各々の周辺領域であって、所定の条件を満たす当該周辺領域を、当該その他の領域の各々に追加することにより、当該その他の領域を補正する。 In step S560, based on the target distance image 20 or the foreground distance image 60, the region correction unit 153 corrects the identified other regions by adding, to each of them, a peripheral region of that region that satisfies a predetermined condition.

ステップＳ５７０で、領域判定部１５５は、補正された上記その他の領域の各々に対応する大きさに基づいて、当該その他の領域の各々が頭部に対応する領域かを判定する。そして、処理は終了する。 In step S570, the region determination unit 155 determines whether each of the other regions corresponds to the head based on the corrected size corresponding to each of the other regions. Then, the process ends.

なお、対象距離画像２０が継続的に生成され、取得されてもよい。この場合に、ステップＳ５７０の後に、処理はステップＳ５２０に戻ってもよい。 Note that the target distance image 20 may be continuously generated and acquired. In this case, after step S570, the process may return to step S520.

以上、本発明の第３の実施形態を説明した。第３の実施形態によれば、検出される１つ以上の物体領域８１のうちの高可能性領域とその他の領域とが識別される。そして、当該その他の領域について、人物の対象部分に対応する領域かが判定される。これにより、最終的に人物を検出するまでの処理を軽減することができる。 The third embodiment of the present invention has been described above. According to the third embodiment, a high possibility region and other regions are identified among the one or more detected object regions 81. Then, for the other regions, it is determined whether each is a region corresponding to the target portion of a person. This reduces the processing required before a person is finally detected.

より具体的には、例えば、判定処理（及び補正処理）の対象を減らすことが可能になる。その結果、判定処理（及び補正処理）を軽減することができる。また、誤検出の可能性はほとんど上がらないと期待される。即ち、正確な検出を維持しつつ、処理を軽減することが可能になる。 More specifically, for example, it is possible to reduce the number of determination processing (and correction processing) targets. As a result, determination processing (and correction processing) can be reduced. In addition, the possibility of false detection is expected to hardly increase. That is, it is possible to reduce processing while maintaining accurate detection.

また、例えば、上記高可能性領域は、上記対象部分以外の人物の部分に対応する領域が検出されにくい所定の範囲内に位置する物体領域８１である。これにより、判定処理（及び補正処理）の対象を、一部の範囲に位置する物体領域８１のみに絞ることができる。 Further, for example, the high possibility area is an object area 81 located within a predetermined range in which an area corresponding to a person part other than the target part is difficult to detect. Thereby, the target of the determination process (and the correction process) can be narrowed down to only the object region 81 located in a part of the range.

また、例えば、上記高可能性領域は、他の物体領域８１が近傍に位置しない物体領域８１である。これにより、判定処理（及び補正処理）の対象を、対象部分（頭部）以外の部分（例えば、肩、背中、胸、等）に対応する領域である可能性が高い領域のみに絞ることができる。 Further, for example, the high possibility region is an object region 81 with no other object region 81 nearby. Thereby, the targets of the determination process (and the correction process) can be narrowed down to only regions that are likely to correspond to a part other than the target portion (head) (for example, a shoulder, the back, or the chest).

＜＜６．第４の実施形態＞＞
続いて、図２４〜図２６を参照して、本発明の第４の実施形態を説明する。本発明の第４の実施形態によれば、第３の実施形態と同様に、１つ以上の物体領域８１のうちの、人物の対象部分（例えば、頭部）に対応する領域であると判定される可能性が高い高可能性領域と、その他の領域とが、識別される。そして、当該その他の領域について、人物の対象部分に対応する領域であるかが判定される。とりわけ第４の実施形態では、対象距離画像２０が継続的に生成され、物体領域８１が継続的に検出される場合に、前回の物体領域８１の検出結果に基づいて、今回の物体領域８１のうちの高可能性領域とその他の領域とが識別される。これにより、継続的に対象距離画像２０が生成され、物体領域８１が継続的に検出される場合に、人物を検出するまでの処理を軽減することができる。 << 6. Fourth Embodiment >>
Subsequently, a fourth embodiment of the present invention will be described with reference to FIGS. 24 to 26. According to the fourth embodiment of the present invention, as in the third embodiment, a high possibility region that is highly likely to be determined to be a region corresponding to the target portion (for example, the head) of a person, and other regions, are identified among the one or more object regions 81. Then, for the other regions, it is determined whether each is a region corresponding to the target portion of the person. In particular, in the fourth embodiment, when the target distance image 20 is continuously generated and the object regions 81 are continuously detected, the high possibility region and the other regions among the current object regions 81 are identified based on the previous detection result of the object regions 81. Thereby, when the target distance image 20 is continuously generated and the object regions 81 are continuously detected, the processing required before a person is detected can be reduced.

＜６−１．情報処理装置の構成＞
図２４及び図２５を参照して、第４の実施形態に係る情報処理装置の構成の一例を説明する。図２４は、第４の実施形態に係る情報処理装置１００−４の構成の一例を示すブロック図である。図２４を参照すると、情報処理装置１００−４は、通信部１１０、記憶部１２０及び制御部１６０を備える。 <6-1. Configuration of information processing apparatus>
With reference to FIGS. 24 and 25, an example of the configuration of the information processing apparatus according to the fourth embodiment will be described. FIG. 24 is a block diagram illustrating an example of a configuration of an information processing device 100-4 according to the fourth embodiment. Referring to FIG. 24, the information processing apparatus 100-4 includes a communication unit 110, a storage unit 120, and a control unit 160.

ここで、通信部１１０、記憶部１２０、並びに、制御部１６０に含まれる距離画像取得部１３１、前景距離画像生成部１３３、領域検出部１３５、領域補正部１５３及び領域判定部１５５については、第３の実施形態と第４の実施形態との間に差異はない。よって、制御部１６０に含まれる領域識別部１６１のみを説明する。 Here, the communication unit 110, the storage unit 120, and the distance image acquisition unit 131, the foreground distance image generation unit 133, the region detection unit 135, the region correction unit 153, and the region determination unit 155 included in the control unit 160 There is no difference between the third embodiment and the fourth embodiment. Therefore, only the area identification unit 161 included in the control unit 160 will be described.

（領域識別部１６１）
領域識別部１６１は、１つ以上の物体領域８１のうちの、人物の対象部分に対応する領域であると判定される可能性が高い高可能性領域と、その他の領域とを識別する。 (Area identification unit 161)
The area identifying unit 161 identifies a high possibility area that is highly likely to be determined to be an area corresponding to a target portion of a person, and other areas of the one or more object areas 81.

とりわけ第４の実施形態では、前提として、対象距離画像２０（即ち、今回の対象距離画像２０）の前に撮像装置３０により生成された対象距離画像２０（即ち、前回の対象距離画像２０）が取得される。そして、当該前回の対象距離画像２０及び背景距離画像に基づいて、前景距離画像６０（即ち、前回の前景距離画像６０）が生成される。さらに、複数のフィルタ７０を用いて、当該前回の前景距離画像６０から別の１つ以上の物体領域（即ち、前回の１つ以上の物体領域８１）が検出される。その後、当該前回の１つ以上の領域のうちの高可能性領域とその他の領域とが識別される。そして、当該前回の１つ以上の領域のうちの上記その他の領域の各々が上記対象部分に対応する領域かが、判定される。 In particular, the fourth embodiment presupposes that a target distance image 20 generated by the imaging device 30 before the target distance image 20 (that is, the current target distance image 20), namely the previous target distance image 20, has been acquired. Based on the previous target distance image 20 and the background distance image, a foreground distance image 60 (that is, the previous foreground distance image 60) is generated. Furthermore, using the plurality of filters 70, another one or more object regions (that is, the previous one or more object regions 81) are detected from the previous foreground distance image 60. After that, a high possibility region and other regions are identified among the previous one or more regions. Then, it is determined whether each of the other regions among the previous one or more regions is a region corresponding to the target portion.

そして、とりわけ、領域識別部１６１は、上記前回の１つ以上の物体領域８１のうちの、高可能性領域と、上記対象部分に対応する領域であると判定された判定済領域とに基づいて、今回の１つ以上の物体領域８１のうちの高可能性領域とその他の領域とを識別する。 In particular, the area identifying unit 161 is based on the high possibility area and the determined area determined to be the area corresponding to the target portion among the one or more previous object areas 81. Among the one or more object regions 81 this time, the high possibility region and other regions are identified.

換言すると、領域識別部１６１は、対象部分（例えば、頭部）に対応する領域であるとみなされた前回の物体領域８１を利用して人物をトレースすることにより、今回の物体領域８１のうちの高可能性領域とその他の領域とを識別する。 In other words, the region identification unit 161 identifies the high possibility region and the other regions among the current object regions 81 by tracing persons using the previous object regions 81 that were regarded as regions corresponding to the target portion (for example, the head).

具体的には、領域識別部１６１は、上記前回の１つ以上の物体領域８１のうちの高可能性領域及び判定済領域の位置、及び今回の１つ以上の物体領域８１の位置に基づいて、当該今回の１つ以上の物体領域８１のうちの高可能性領域とその他の領域とを識別する。 Specifically, the area identification unit 161 is based on the positions of the high possibility area and the determined area of the one or more previous object areas 81 and the positions of the one or more current object areas 81. The high possibility area and the other areas of the one or more object areas 81 of this time are identified.

例えば、対象距離画像２０は、所定の時間間隔で生成され、取得される。そして、所定の時刻Ｔ_Ｎにおいて、各対象距離画像２０が取得されて、１つ以上の物体領域８１が検出される度に、領域識別部１６１は、時刻Ｔ_Ｎにおける各物体領域８１の位置を取得する。時刻Ｔ_Ｎにおける位置は、時刻Ｔ_Ｎに取得される距離画像２０から生成される２値画像８０内での物体領域８１の位置である。 For example, the target distance image 20 is generated and acquired at predetermined time intervals. Each time a target distance image 20 is acquired at a given time T_N and one or more object regions 81 are detected, the region identification unit 161 acquires the position of each object region 81 at time T_N. The position at time T_N is the position of the object region 81 in the binary image 80 generated from the distance image 20 acquired at time T_N.

そして、領域識別部１６１は、時刻Ｔ_Ｎから上記所定の時間間隔だけ前の時刻Ｔ_Ｎ−１における各物体領域８１（即ち、前回の物体領域８１）の位置及び移動速度から、前回の物体領域８１の各々の上記所定の時間間隔後（即ち、時刻Ｔ_Ｎ）の位置を推定する。当該移動速度については後述する。そして、領域識別部１６１は、前回の物体領域８１の推定された位置と、時刻Ｔ_Ｎにおける物体領域８１（即ち、今回の物体領域８１）の実際の位置とを比較し、位置が近い前回の物体領域８１と今回の物体領域８１とを対応付ける。一例として、距離が短い順で、前回の物体領域８１と今回の物体領域８１とが対応付けられる。 Then, from the position and moving speed of each object region 81 at time T_{N-1}, which precedes time T_N by the predetermined time interval (that is, each previous object region 81), the region identification unit 161 estimates the position of each previous object region 81 after the predetermined time interval (that is, at time T_N). The moving speed will be described later. The region identification unit 161 then compares the estimated positions of the previous object regions 81 with the actual positions of the object regions 81 at time T_N (that is, the current object regions 81), and associates previous object regions 81 with current object regions 81 whose positions are close. As an example, previous object regions 81 and current object regions 81 are associated in ascending order of distance.

その後、領域識別部１６１は、前回の物体領域８１と対応付けられた今回の物体領域８１（即ち、高可能性領域）の時刻Ｔ_Ｎにおける移動速度を取得し、保持する。時刻Ｔ_Ｎにおける当該移動速度は、今回の物体領域８１の位置と、対応付けられる前回の物体領域８１の位置との差分である。同様に、前回の物体領域８１の時刻Ｔ_Ｎ−１における移動速度が取得されているので、上述したように、時刻Ｔ_Ｎ−１における移動速度が利用可能になる。なお、前回の物体領域８１と対応付けられない今回の物体領域８１の移動速度は０となる。 After that, the region identification unit 161 acquires and holds the moving speed at time T_N of each current object region 81 associated with a previous object region 81 (that is, a high possibility region). The moving speed at time T_N is the difference between the position of the current object region 81 and the position of the associated previous object region 81. Likewise, the moving speed at time T_{N-1} of each previous object region 81 has already been acquired, so the moving speed at time T_{N-1} is available as described above. The moving speed of a current object region 81 that is not associated with any previous object region 81 is 0.

そして、領域識別部１６１は、前回の物体領域８１と対応付けられなかった今回の物体領域８１を、高可能性領域以外のその他の領域として識別する。また、領域識別部１６１は、前回の物体領域８１と対応付けられた今回の物体領域８１であって、時刻Ｔ_Ｎにおける移動速度の向きが２値画像８０の外側への向き（即ち、２値画像８０の中心から離れる向き）である今回の物体領域８１も、その他の領域として識別する。また、領域識別部１６１は、上述した物体領域８１以外の残りの物体領域８１を、高可能性領域として識別する。 Then, the area identifying unit 161 identifies the current object area 81 that is not associated with the previous object area 81 as another area other than the high possibility area. In addition, the region identification unit 161 is the current object region 81 associated with the previous object region 81, and the direction of the moving speed at the time T _N is the outward direction of the binary image 80 (that is, binary). The current object region 81 (in the direction away from the center of the image 80) is also identified as another region. The area identifying unit 161 identifies the remaining object area 81 other than the object area 81 described above as a high possibility area.

以上のように、高可能性領域とその他の領域とが識別される。一般に、人物の姿勢として、頭部は胴体よりもやや前方に位置する傾向がある。そのため、人物が撮像装置３０から離れる方向に移動する場合に（即ち、対象距離画像２０には人物の背中に対応する領域がある場合に）、人物の肩又は背中に対応する領域が誤って検出されやすい。逆に、人物が撮像装置３０に近づく方向に移動する場合に（即ち、対象距離画像２０には人物の背中に対応する領域がない場合に）、人物の肩又は背中に対応する領域が誤って検出されにくい。よって、領域識別部１６１は、上述したように、高可能性領域とその他の領域とを識別することができる。 As described above, the high possibility region and the other regions are identified. In general, a person's posture tends to place the head slightly forward of the torso. Therefore, when the person moves away from the imaging device 30 (that is, when the target distance image 20 contains a region corresponding to the person's back), a region corresponding to the person's shoulder or back is likely to be erroneously detected. Conversely, when the person moves toward the imaging device 30 (that is, when the target distance image 20 contains no region corresponding to the person's back), a region corresponding to the person's shoulder or back is unlikely to be erroneously detected. Therefore, the region identification unit 161 can identify the high possibility region and the other regions as described above.

図２５は、第４の実施形態に係る高可能性領域とその他の領域との識別の例を説明するための説明図である。図２５を参照すると、前回の対象距離画像２０から得られた２値画像８０Ａ並びに物体領域８１Ｆ及び８１Ｇと、今回の対象距離画像から得られた２値画像８０Ｂ並びに物体領域８１Ｈ及び８１Ｉとが、示されている。例えば、前回の物体領域８１Ｆと今回の物体領域８１Ｈとが対応付けられ、前回の物体領域８１Ｇと今回の物体領域８１Ｉとが対応付けられる。また、今回の物体領域８１Ｈの移動速度の向きは、２値画像８０Ｂの内側への向きであるので、今回の物体領域８１Ｈは、高可能性領域として識別される。一方、今回の物体領域８１Ｉの移動速度の向きは、２値画像８０Ｂの外側への向きであるので、今回の物体領域８１Ｉは、その他の領域として識別される。 FIG. 25 is an explanatory diagram for explaining an example of identification of a high possibility region and other regions according to the fourth embodiment. FIG. 25 shows a binary image 80A and object regions 81F and 81G obtained from the previous target distance image 20, and a binary image 80B and object regions 81H and 81I obtained from the current target distance image. For example, the previous object region 81F is associated with the current object region 81H, and the previous object region 81G is associated with the current object region 81I. Since the moving speed of the current object region 81H points toward the inside of the binary image 80B, the current object region 81H is identified as a high possibility region. On the other hand, since the moving speed of the current object region 81I points toward the outside of the binary image 80B, the current object region 81I is identified as another region.
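The tracking-based identification described above (position prediction, association in ascending order of distance, velocity as a position difference, and the outward-direction test) can be sketched as follows. This is only an illustration under stated assumptions: regions are reduced to centroid positions, the association is cut off at a hypothetical `max_dist`, and "outward" is tested with a dot product against the direction from the image centre. None of these names, nor the sample coordinates, come from the specification.

```python
from math import hypot

def classify_by_tracking(prev, curr, centre, max_dist):
    """Return (high, others, velocities) for the current object regions.

    prev: list of {"pos": (x, y), "vel": (vx, vy)} for the previous
          high possibility and determined regions.
    curr: list of (x, y) positions of the current object regions.
    centre: (x, y) centre of the binary image.
    max_dist: assumed cut-off for association (hypothetical parameter).
    """
    # Predict where each previous region should be one interval later.
    predicted = [(p["pos"][0] + p["vel"][0], p["pos"][1] + p["vel"][1]) for p in prev]
    # Greedy one-to-one association, closest pairs first.
    pairs = sorted(
        (hypot(px - cx, py - cy), i, j)
        for i, (px, py) in enumerate(predicted)
        for j, (cx, cy) in enumerate(curr)
    )
    match, used_prev = {}, set()
    for dist, i, j in pairs:
        if dist > max_dist:
            break
        if i not in used_prev and j not in match:
            match[j] = i
            used_prev.add(i)
    high, others, velocities = [], [], []
    for j, (cx, cy) in enumerate(curr):
        if j not in match:
            velocities.append((0.0, 0.0))  # unassociated regions get speed 0
            others.append(j)
            continue
        px, py = prev[match[j]]["pos"]
        v = (cx - px, cy - py)  # velocity = current position - previous position
        velocities.append(v)
        # Assumed test: moving away from the centre means "outward".
        outward = v[0] * (cx - centre[0]) + v[1] * (cy - centre[1]) > 0
        (others if outward else high).append(j)
    return high, others, velocities

# Hypothetical coordinates loosely mirroring FIG. 25 (81F -> 81H, 81G -> 81I).
prev = [{"pos": (20, 50), "vel": (5, 0)}, {"pos": (70, 50), "vel": (5, 0)}]
curr = [(25, 50), (76, 50), (50, 10)]
high, others, velocities = classify_by_tracking(prev, curr, centre=(50, 50), max_dist=10)
```

In this made-up layout, the first current region moves toward the centre and is kept as a high possibility region, the second moves away from the centre, and the third has no counterpart in the previous frame, so the latter two are passed on to the correction and determination steps.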

＜６−２．処理の流れ＞
次に、図２６を参照して、第４の実施形態に係る情報処理の一例を説明する。図２６は、第４の実施形態に係る情報処理の概略的な流れの一例を示すフローチャートである。 <6-2. Process flow>
Next, an example of information processing according to the fourth embodiment will be described with reference to FIG. 26. FIG. 26 is a flowchart illustrating an example of a schematic flow of information processing according to the fourth embodiment.

ここで、ステップＳ６１０〜Ｓ６４０、Ｓ６７０及びＳ６８０は、図２３を参照して説明した第３の実施形態に係るステップＳ５１０〜Ｓ５４０、Ｓ５６０及びＳ５７０と同様である。よって、ここでは、ステップＳ６５０及びＳ６６０のみを説明する。 Here, steps S610 to S640, S670, and S680 are the same as steps S510 to S540, S560, and S570 according to the third embodiment described with reference to FIG. 23. Therefore, only steps S650 and S660 will be described here.

ステップＳ６５０で、領域識別部１６１は、今回の１つ以上の物体領域８１の位置を取得する。 In step S650, the region identification unit 161 acquires the positions of the current one or more object regions 81.

ステップＳ６６０で、領域識別部１６１は、前回の１つ以上の物体領域８１のうちの高可能性領域及び判定済領域の位置、及び今回の１つ以上の物体領域８１の位置に基づいて、当該今回の１つ以上の物体領域８１のうちの高可能性領域とその他の領域とを識別する。 In step S660, the area identifying unit 161 determines the position of the high possibility area and the determined area from the previous one or more object areas 81, and the current one or more object areas 81, Among the one or more object areas 81 this time, the high possibility area and other areas are identified.

なお、ステップＳ６８０の後には、ステップＳ６２０へ戻る。 Note that after step S680, the process returns to step S620.

以上、本発明の第４の実施形態を説明した。第４の実施形態によれば、検出される１つ以上の物体領域８１のうちの高可能性領域とその他の領域とが識別される。そして、当該その他の領域について、人物の対象部分に対応する領域かが判定される。これにより、最終的に人物を検出するまでの処理を軽減することができる。 The fourth embodiment of the present invention has been described above. According to the fourth embodiment, a high possibility region and other regions are identified among the one or more detected object regions 81. Then, for the other regions, it is determined whether each is a region corresponding to the target portion of a person. This reduces the processing required before a person is finally detected.

とりわけ第４の実施形態では、前回の物体領域８１の検出結果に基づいて、今回の物体領域８１のうちの高可能性領域とその他の領域とが識別される。これにより、継続的に対象距離画像２０が生成され、物体領域８１が継続的に検出される場合に、人物を検出するまでの処理を軽減することができる。 In particular, in the fourth embodiment, the high possibility region and the other regions among the current object regions 81 are identified based on the previous detection result of the object regions 81. Thereby, when the target distance image 20 is continuously generated and the object regions 81 are continuously detected, the processing required before a person is detected can be reduced.

また、対象距離画像２０が短い期間で生成され、１つ以上の物体領域８１が継続的に生成されれば、同一の人物の頭部に対応する物体領域８１が継続的に検出されるので、対象距離画像２０内のみの情報ではなく、時間方向の情報も利用して、人物をより正確に検出することが可能になる。 In addition, if the target distance image 20 is generated in a short period and one or more object regions 81 are continuously generated, the object region 81 corresponding to the head of the same person is continuously detected. It is possible to detect a person more accurately by using not only information in the target distance image 20 but also information in the time direction.

＜＜７．第５の実施形態＞＞
続いて、図２７〜図３２を参照して、本発明の第５の実施形態を説明する。本発明の第５の実施形態によれば、物体領域８１の検出のために、前景距離画像６０内の別々の位置で適用される複数のフィルタセットが用いられる。そして、各フィルタセットは、別々の特徴を有する人物の頭部及び胴体のパターンを認識可能な２つ以上のフィルタ７０を含む。これにより、特徴（例えば、体格）が異なる様々な人物が存在する中で、人物をより正確に検出することが可能になる。 << 7. Fifth embodiment >>
Subsequently, a fifth embodiment of the present invention will be described with reference to FIGS. 27 to 32. According to the fifth embodiment of the present invention, a plurality of filter sets applied at different positions in the foreground distance image 60 are used to detect the object regions 81. Each filter set includes two or more filters 70 capable of recognizing the head and torso patterns of persons having different characteristics. This makes it possible to detect a person more accurately even when various persons with different characteristics (for example, physique) are present.

＜７−１．情報処理装置の構成＞
図２７〜図３０を参照して、第５の実施形態に係る情報処理装置の構成の一例を説明する。図２７は、第５の実施形態に係る情報処理装置１００−５の構成の一例を示すブロック図である。図２７を参照すると、情報処理装置１００−５は、通信部１１０、記憶部１２０及び制御部１７０を備える。 <7-1. Configuration of information processing apparatus>
An example of the configuration of the information processing apparatus according to the fifth embodiment will be described with reference to FIGS. 27 to 30. FIG. 27 is a block diagram illustrating an example of a configuration of an information processing device 100-5 according to the fifth embodiment. Referring to FIG. 27, the information processing apparatus 100-5 includes a communication unit 110, a storage unit 120, and a control unit 170.

ここで、通信部１１０、記憶部１２０、並びに、制御部１７０に含まれる距離画像取得部１３１、前景距離画像生成部１３３及び領域補正部１４１については、第２の実施形態と第５の実施形態との間に差異はない。よって、制御部１７０に含まれる領域検出部１７１及び領域判定部１７３のみを説明する。 Here, for the communication unit 110, the storage unit 120, and the distance image acquisition unit 131, the foreground distance image generation unit 133, and the region correction unit 141 included in the control unit 170, there is no difference between the second embodiment and the fifth embodiment. Therefore, only the region detection unit 171 and the region determination unit 173 included in the control unit 170 will be described.

（領域検出部１７１）
−複数のフィルタセット
第５の実施形態では、領域検出部１７１は、上記前景距離画像内の別々の位置で適用される複数のフィルタセットであって、適用される位置に応じた人物の頭部及び胴体のパターンを認識可能な上記複数のフィルタセットを用いて、上記前景距離画像から１つ以上の物体領域を検出する。 (Area detection unit 171)
-Plural Filter Sets In the fifth embodiment, the region detection unit 171 detects one or more object regions from the foreground distance image using a plurality of filter sets that are applied at different positions in the foreground distance image and that can recognize the pattern of a person's head and torso corresponding to the position where each is applied.

また、上記複数のフィルタセットの各々は、別々の特徴を有する人物の頭部及び胴体のパターンを認識可能な２つ以上のフィルタ７０を含む。例えば、各フィルタセットは、異なる体格を有する人物の頭部及び胴体のパターンを認識可能な２つ以上のフィルタ７０を含む。以下、この点について、図２８を参照して具体例を説明する。 Each of the plurality of filter sets includes two or more filters 70 capable of recognizing the head and torso patterns of persons having different characteristics. For example, each filter set includes two or more filters 70 capable of recognizing the head and torso patterns of persons having different physiques. Hereinafter, a specific example of this point will be described with reference to FIG. 28.

図２８は、フィルタセットに含まれるフィルタの例を説明するための説明図である。図２８を参照すると、１つのフィルタセットに含まれるフィルタ７０Ａ、７０Ｂ及び７０Ｃが示されている。この例では、１つのフィルタセットは、１８０ｃｍの身長を有する人物のためのフィルタ７０Ａ、１６０ｃｍの身長を有する人物のためのフィルタ７０Ｂ、及び、１４０ｃｍの身長を有する人物のためのフィルタ７０Ｃを含む。例えば以上のように、別々の体格（例えば、身長）を有する人物の頭部及び胴体のパターンを認識可能な２つ以上のフィルタが、各フィルタセットに含まれる。そして、例えば、画素単位でフィルタセットが生成される。 FIG. 28 is an explanatory diagram for describing an example of filters included in the filter set. Referring to FIG. 28, filters 70A, 70B and 70C included in one filter set are shown. In this example, one filter set includes a filter 70A for a person having a height of 180 cm, a filter 70B for a person having a height of 160 cm, and a filter 70C for a person having a height of 140 cm. For example, as described above, each filter set includes two or more filters capable of recognizing head and trunk patterns of persons having different physiques (for example, height). Then, for example, a filter set is generated for each pixel.

−フィルタの形状
フィルタセットに含まれる各フィルタ７０の形状については、第２の実施形態において説明したとおりである。 —Shape of Filter The shape of each filter 70 included in the filter set is as described in the second embodiment.

−複数のフィルタセットの生成手法
フィルタセットに含まれる各フィルタ７０も、第２の実施形態において説明した手法により生成され得る。 -Generation Method of Multiple Filter Sets Each filter 70 included in the filter set can also be generated by the method described in the second embodiment.

とりわけ第５の実施形態に係る各フィルタセットに含まれる２つ以上のフィルタ７０は、上述したように、別々の特徴を有する人物の頭部及び胴体のパターンを認識可能である。よって、フィルタセットに含まれる各フィルタ７０は、ターゲットとする人物の特徴を考慮して生成される。 In particular, the two or more filters 70 included in each filter set according to the fifth embodiment can recognize human head and trunk patterns having different characteristics as described above. Therefore, each filter 70 included in the filter set is generated in consideration of the characteristics of the target person.

例えば、図１３Ａ及び図１３Ｂを参照して説明したように、各フィルタ７０は、撮像装置３０により人物４０が撮像される場合における撮像装置３０と人物４０との相対関係に基づいて生成される。そして、とりわけ第５の実施形態では、フィルタセットに含まれる２つ以上のフィルタ７０は、それぞれ別々の相対関係に基づいて生成される。フィルタセットに含まれる２つ以上のフィルタは、それぞれ、別々の人物の特徴を有する人物の頭部及び胴体（例えば、肩）のパターンを識別するからである。例えば、図１３Ａ及び図１３Ｂに示されるような、撮像装置３０と人物４０の頭部との高さ方向での距離Ｄ_Ｈ、撮像装置３０と人物４０の肩との高さ方向での距離Ｄ_Ｓ、人物４０の頭幅Ｗ_Ｈ、及び人物４０の肩幅Ｗ_Ｓは、フィルタセットに含まれるどのフィルタ７０を生成するかによって変更される。例えば、図２８に示されるフィルタ７０Ａを生成する際には、人物４０に関連する上記情報は、人物４０の身長が１８０ｃｍであるという想定の下で設定される。また、図２８に示されるフィルタ７０Ｃを生成する際には、人物４０に関連する上記情報は、人物４０の身長が１４０ｃｍであるという想定の下で設定される。 For example, as described with reference to FIGS. 13A and 13B, each filter 70 is generated based on the relative relationship between the imaging device 30 and the person 40 when the imaging device 30 images the person 40. In particular, in the fifth embodiment, the two or more filters 70 included in a filter set are each generated based on a different relative relationship. This is because the two or more filters included in a filter set each identify the head and torso (for example, shoulder) pattern of a person having different characteristics. For example, as shown in FIGS. 13A and 13B, the distance D_H in the height direction between the imaging device 30 and the head of the person 40, the distance D_S in the height direction between the imaging device 30 and the shoulders of the person 40, the head width W_H of the person 40, and the shoulder width W_S of the person 40 are changed depending on which filter 70 in the filter set is being generated. For example, when the filter 70A shown in FIG. 28 is generated, the above information related to the person 40 is set on the assumption that the height of the person 40 is 180 cm. Further, when the filter 70C shown in FIG. 28 is generated, the above information related to the person 40 is set on the assumption that the height of the person 40 is 140 cm.
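How the four quantities might vary with the assumed height can be sketched as below. This is an illustration only: the passage above says that D_H, D_S, W_H and W_S depend on the assumed height but gives no formulas here, so the mounting height and all body-proportion ratios in this sketch are placeholder assumptions, and `filter_geometry` is a hypothetical helper name.

```python
def filter_geometry(camera_height_cm, person_height_cm,
                    head_height_ratio=0.13,     # placeholder: head height / stature
                    head_width_ratio=0.10,      # placeholder: head width / stature
                    shoulder_width_ratio=0.25): # placeholder: shoulder width / stature
    """Derive the per-filter quantities from an assumed person height.

    D_H and D_S are height-direction distances from the imaging device to
    the head top and to the shoulders; W_H and W_S are head and shoulder widths.
    """
    shoulder_height_cm = person_height_cm * (1.0 - head_height_ratio)
    return {
        "D_H": camera_height_cm - person_height_cm,
        "D_S": camera_height_cm - shoulder_height_cm,
        "W_H": person_height_cm * head_width_ratio,
        "W_S": person_height_cm * shoulder_width_ratio,
    }

# One filter set, as in FIG. 28: filters assumed for 180 cm, 160 cm and 140 cm.
# The 300 cm mounting height is a made-up value.
filter_set = {h: filter_geometry(300, h) for h in (180, 160, 140)}
```

Under these assumptions a taller target person yields a smaller D_H and larger widths, which is the kind of per-filter variation the paragraph describes.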

−複数のフィルタセットを用いた物体領域の検出の例
また、例えば、領域検出部１７１は、上記複数のフィルタセットの各々を用いる際に上記２つ以上のフィルタ７０を前景距離画像６０内の同一の位置に適用することにより、複数の領域を検出する。 -Example of detection of object region using a plurality of filter sets For example, when using each of the plurality of filter sets, the region detection unit 171 detects a plurality of regions by applying the two or more filters 70 to the same position in the foreground distance image 60.

具体的には、例えば、図２８に示されるように、１つのフィルタセットが３つのフィルタ７０を含む場合に、領域検出部１７１は、当該３つのフィルタ７０を前景距離画像６０内の同一の位置に適用する。即ち、領域検出部１７１は、１８０ｃｍの身長を有する人物のためのフィルタ７０Ａ、１６０ｃｍの身長を有する人物のためのフィルタ７０Ｂ及び１４０ｃｍの身長を有する人物のためのフィルタ７０Ｃを、同一の位置（例えば、同一の画素）に適用する。以下、図２９を参照して、このようなフィルタの適用により検出される物体領域８１の例を説明する。 Specifically, for example, when one filter set includes three filters 70 as illustrated in FIG. 28, the region detection unit 171 applies the three filters 70 to the same position in the foreground distance image 60. That is, the region detection unit 171 applies the filter 70A for a person having a height of 180 cm, the filter 70B for a person having a height of 160 cm, and the filter 70C for a person having a height of 140 cm to the same position (for example, the same pixel). Hereinafter, an example of the object regions 81 detected by applying such filters will be described with reference to FIG. 29.

図２９は、フィルタセットに含まれる２つ以上のフィルタを適用することにより検出される物体領域８１の例を説明するための説明図である。図２９を参照すると、図２８に示されているフィルタ７０Ａ、７０Ｂ及び７０Ｃが示されている。例えば、このようなフィルタ７０Ａ、７０Ｂ及び７０Ｃを含むフィルタセットが画素単位で生成され、前景距離画像６０の各前景領域６１に含まれる画素に適用される。その結果、例えば、１８０ｃｍの身長を有する人物に対応する前景領域６１Ａから、１８０ｃｍの身長を有する人物のためのフィルタ７０Ａを用いて、物体領域８１Ａが検出される。同様に、１６０ｃｍの身長を有する人物のためのフィルタ７０Ｂを用いて、物体領域８１Ｂが検出され、１４０ｃｍの身長を有する人物のためのフィルタ７０Ｃを用いて、物体領域８１Ｃが検出される。また、１４０ｃｍの身長を有する人物に対応する前景領域６１Ｂから、物体領域８１Ｄ、物体領域８１Ｅ及び物体領域８１Ｆが検出される。例えばこのように、各フィルタセットに含まれるフィルタ７０のうちの、特定の身長に対応するフィルタ（即ち、特定の身長を有する人物の頭部及び胴体のパターンを識別可能なフィルタ７０）が、各画素で適用される。これにより、当該特定の身長に対応するフィルタ７０（即ち、特定の身長を有する人物の頭部及び胴体のパターンを識別可能なフィルタ７０）の集合ごとに、物体領域８１が検出され得る。 FIG. 29 is an explanatory diagram for explaining an example of the object region 81 detected by applying two or more filters included in the filter set. Referring to FIG. 29, the filters 70A, 70B and 70C shown in FIG. 28 are shown. For example, such a filter set including the filters 70 A, 70 B, and 70 C is generated in units of pixels and applied to the pixels included in each foreground region 61 of the foreground distance image 60. As a result, for example, the object region 81A is detected from the foreground region 61A corresponding to a person having a height of 180 cm using the filter 70A for a person having a height of 180 cm. Similarly, the object region 81B is detected using the filter 70B for a person having a height of 160 cm, and the object region 81C is detected using the filter 70C for a person having a height of 140 cm. In addition, the object region 81D, the object region 81E, and the object region 81F are detected from the foreground region 61B corresponding to a person having a height of 140 cm. For example, in this way, among the filters 70 included in each filter set, a filter corresponding to a specific height (that is, a filter 70 that can identify the head and trunk patterns of a person having a specific height) Applied in pixels. 
Accordingly, the object region 81 can be detected for each set of filters 70 corresponding to the specific height (that is, the filter 70 that can identify the pattern of the head and torso of a person having a specific height).

（領域判定部１７３）
領域判定部１７３は、検出される１つ以上の物体領域８１のうちの少なくとも１つの物体領域８１の各々に対応する大きさに基づいて、当該少なくとも１つの物体領域８１の各々が人物の対象部分に対応する領域かを判定する。 (Area determination unit 173)
Based on a size corresponding to each of at least one object region 81 among the one or more detected object regions 81, the region determination unit 173 determines whether each of the at least one object region 81 is a region corresponding to the target portion of a person.

例えば、領域判定部１７３は、補正された上記少なくとも１つの物体領域８１の各々に対応する大きさに基づいて、当該少なくとも１つの物体領域８１の各々が人物の対象部分に対応する領域かを判定する。 For example, based on a size corresponding to each of the corrected at least one object region 81, the region determination unit 173 determines whether each of the at least one object region 81 is a region corresponding to the target portion of a person.

例えば、第５の実施形態では、複数のフィルタセットの各々を用いる際に２つ以上のフィルタ７０を前景距離画像６０内の同一の位置に適用することにより、複数の物体領域８１が検出される。また、例えば、当該複数の物体領域８１の各々が補正される。そして、領域判定部１７３は、上記複数の物体領域８１のうちの少なくとも１つの物体領域８１の各々に対応する大きさに基づいて、当該少なくとも１つの物体領域８１の各々が対象部分に対応する領域かを判定する。 For example, in the fifth embodiment, a plurality of object regions 81 are detected by applying two or more filters 70 to the same position in the foreground distance image 60 when using each of the plurality of filter sets. Further, for example, each of the plurality of object regions 81 is corrected. Then, based on a size corresponding to each of at least one object region 81 among the plurality of object regions 81, the region determination unit 173 determines whether each of the at least one object region 81 is a region corresponding to the target portion.

例えば、領域判定部１７３は、上記複数の物体領域８１の各々に対応する大きさに基づいて、当該複数の物体領域８１を評価する。例えば、図１９を参照して説明したように、各物体領域８１に対応する楕円の長軸Ｌ_ｍ及び短軸Ｌ_ｎと矩形７１の幅Ｗ_Ｆとの比較が行われ、一例として、両者の類似度が評価結果として算出される。そして、領域判定部１７３は、同一の前景領域６１に対応する複数の物体領域８１の中から、最も良好な評価結果（例えば、最も高い類似度）を伴う１つの物体領域８１を選択する。例えば、前景領域６１ごとにこのような選択が行われる。そして、領域判定部１７３は、選択された少なくとも１つの物体領域８１が人物の頭部に対応する領域かを判定する。以下、このような判定手法について図３０を参照して具体例を説明する。 For example, the region determination unit 173 evaluates the plurality of object regions 81 based on the size corresponding to each of the plurality of object regions 81. For example, as described with reference to FIG. 19, the major axis L_m and the minor axis L_n of the ellipse corresponding to each object region 81 are compared with the width W_F of the rectangle 71, and, as an example, the similarity between them is calculated as the evaluation result. Then, the region determination unit 173 selects, from the plurality of object regions 81 corresponding to the same foreground region 61, the one object region 81 with the best evaluation result (for example, the highest similarity). For example, such a selection is performed for each foreground region 61. Then, the region determination unit 173 determines whether the at least one selected object region 81 is a region corresponding to a person's head. Hereinafter, a specific example of this determination method will be described with reference to FIG. 30.

図３０は、フィルタセットに含まれる２つ以上のフィルタを同一の位置に適用することにより検出される複数の物体領域８１についての判定の例を説明するための説明図である。図３０を参照すると、図２９に示される検出された複数の物体領域８１Ａ〜８１Ｆにそれぞれ対応する楕円８３Ａ〜８３Ｆが示されている。また、楕円８３Ａ〜８３Ｆに対応する物体領域８１Ａ〜８１Ｆの各々の検出に用いられたフィルタ７０（即ち、フィルタ７０Ａ、７０Ｂ又は７０Ｃ）の矩形が、楕円８３Ａ〜８３Ｆの各々に重ねて表示されている。領域判定部１７３は、検出された複数の物体領域８１Ａ〜８１Ｆの各々について、対応する楕円８３の長軸Ｌ_ｍ及び短軸Ｌ_ｎを算出し、長軸Ｌ_ｍ及び短軸Ｌ_ｎと小さい方の矩形（即ち、矩形７１）の幅Ｗ_Ｆｈ及びＷ_Ｆｖとの類似度Ｒを以下のように算出する。 FIG. 30 is an explanatory diagram for explaining an example of determination for a plurality of object regions 81 detected by applying two or more filters included in a filter set to the same position. Referring to FIG. 30, ellipses 83A to 83F respectively corresponding to the detected object regions 81A to 81F shown in FIG. 29 are shown. In addition, the rectangle of the filter 70 (that is, the filter 70A, 70B, or 70C) used to detect each of the object regions 81A to 81F corresponding to the ellipses 83A to 83F is superimposed on each of the ellipses 83A to 83F. For each of the detected object regions 81A to 81F, the region determination unit 173 calculates the major axis L_m and the minor axis L_n of the corresponding ellipse 83, and calculates the similarity R between the major axis L_m and minor axis L_n and the widths W_Fh and W_Fv of the smaller rectangle (that is, the rectangle 71) as follows.

そして、類似度に基づいて、前景領域６１ごとに１つの物体領域８１が選択される。一例として、類似度Ｒが最も大きい物体領域８１が選択される。即ち、図２９及び図３０の例では、前景領域６１Ａについて、楕円８３Ａに対応する物体領域８１Ａが選択される。また、前景領域６１Ｂについて、楕円８３Ｆに対応する物体領域８１Ｆが選択される。そして、選択された少なくとも１つの物体領域が人物の頭部に対応する領域かが判定される。この判定については、第２の実施形態において図１９を参照して説明したとおりである。 Then, one object region 81 is selected for each foreground region 61 based on the similarity. As an example, the object region 81 having the highest similarity R is selected. That is, in the example of FIGS. 29 and 30, the object region 81A corresponding to the ellipse 83A is selected for the foreground region 61A. For the foreground area 61B, the object area 81F corresponding to the ellipse 83F is selected. Then, it is determined whether the selected at least one object region corresponds to a person's head. This determination is as described with reference to FIG. 19 in the second embodiment.
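The selection step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `Candidate` type, the function name, and the similarity values are hypothetical, and the similarity R is assumed to have been computed elsewhere.

```python
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    region_id: str      # object region, e.g. "81A"
    foreground_id: str  # foreground region it was detected from, e.g. "61A"
    similarity: float   # similarity R (assumed to be computed elsewhere)


def select_best_per_foreground(candidates: List[Candidate]) -> Dict[str, Candidate]:
    """Keep, for each foreground region, the candidate with the highest R."""
    best: Dict[str, Candidate] = {}
    for cand in candidates:
        current = best.get(cand.foreground_id)
        if current is None or cand.similarity > current.similarity:
            best[cand.foreground_id] = cand
    return best


# Illustrative values only; they are chosen to mirror the outcome in FIGS. 29 and 30.
candidates = [
    Candidate("81A", "61A", 0.92),
    Candidate("81B", "61A", 0.71),
    Candidate("81C", "61A", 0.48),
    Candidate("81D", "61B", 0.45),
    Candidate("81E", "61B", 0.66),
    Candidate("81F", "61B", 0.90),
]
selected = select_best_per_foreground(candidates)
```

With these example values, `selected` maps foreground region 61A to object region 81A and foreground region 61B to object region 81F. Only the selected regions would then go on to the head determination.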

例えば以上のように、物体領域８１の評価、選択及び判定が行われる。 For example, as described above, the evaluation, selection, and determination of the object region 81 are performed.

＜７−２．処理の流れ＞ <7-2. Process flow>
次に、図３１を参照して、第５の実施形態に係る情報処理の一例を説明する。図３１は、第５の実施形態に係る情報処理の概略的な流れの一例を示すフローチャートである。 Next, an example of information processing according to the fifth embodiment will be described with reference to FIG. 31. FIG. 31 is a flowchart illustrating an example of a schematic flow of information processing according to the fifth embodiment.

ここで、ステップＳ７１０〜Ｓ７３０は、図２０を参照して説明した第２の実施形態に係るステップＳ４１０〜Ｓ４３０（図１５を参照して説明した第１の実施形態に係るＳ３１０〜Ｓ３３０）と同様である。よって、ここでは、ステップＳ７４０〜Ｓ７７０のみを説明する。 Here, steps S710 to S730 are the same as steps S410 to S430 according to the second embodiment described with reference to FIG. 20 (S310 to S330 according to the first embodiment described with reference to FIG. 15). Therefore, only steps S740 to S770 will be described here.

ステップＳ７４０で、領域検出部１７１は、前景距離画像６０内の別々の位置で適用される複数のフィルタセットであって、適用される位置に応じた人物の頭部及び胴体のパターンを認識可能な上記複数のフィルタセットを用いて、前景距離画像６０から複数の物体領域８１を検出する。ここで、上記複数のフィルタセットの各々を用いる際に、フィルタセットに含まれる２つ以上のフィルタ７０が、前景距離画像６０内の同一の位置に適用される。 In step S740, the region detection unit 171 detects a plurality of object regions 81 from the foreground distance image 60 using a plurality of filter sets that are applied at different positions in the foreground distance image 60 and that can recognize the pattern of a person's head and torso corresponding to the applied position. Here, when each of the plurality of filter sets is used, two or more filters 70 included in the filter set are applied to the same position in the foreground distance image 60.

ステップＳ７５０で、領域補正部１４１は、対象距離画像２０又は前景距離画像６０に基づいて、検出された複数の物体領域８１の各々の周辺領域であって、所定の条件を満たす当該周辺領域を、当該複数の物体領域８１の各々に追加することにより、当該複数の物体領域８１を補正する。 In step S750, the region correction unit 141 corrects the plurality of detected object regions 81 by adding, to each of them, a peripheral region of that object region 81 that satisfies a predetermined condition, based on the target distance image 20 or the foreground distance image 60.

ステップＳ７６０で、領域判定部１７３は、上記複数の物体領域８１の各々に対応する大きさに基づいて、当該複数の物体領域８１を評価し、高評価を伴う少なくとも１つの物体領域８１を選択する。高評価を伴う当該少なくとも１つの物体領域８１の各々は、例えば、同一の前景領域６１に対応する物体領域８１の中で最も良好な評価結果を伴う物体領域８１である。 In step S760, the region determination unit 173 evaluates the plurality of object regions 81 based on the size corresponding to each of the plurality of object regions 81, and selects at least one object region 81 with a high evaluation. Each of the at least one object region 81 with a high evaluation is, for example, the object region 81 with the best evaluation result among the object regions 81 corresponding to the same foreground region 61.

ステップＳ７７０で、領域判定部１７３は、選択された少なくとも１つの物体領域８１の各々が人物の頭部に対応する領域かを判定する。そして、処理は終了する。 In step S770, the region determination unit 173 determines whether each of the selected at least one object region 81 corresponds to a person's head. Then, the process ends.

なお、対象距離画像２０が継続的に生成され、取得されてもよい。この場合に、ステップＳ７７０の後に、処理はステップＳ７２０に戻ってもよい。 Note that the target distance image 20 may be continuously generated and acquired. In this case, after step S770, the process may return to step S720.

＜７−３．変形例＞ <7-3. Modification>
次に、図３２を参照して、第５の実施形態の変形例を説明する。第５の実施形態の変形例によれば、各フィルタセットに含まれる２つ以上のフィルタが選択的に適用される。これにより、体格が異なる様々な人物が存在する中で、人物をより正確に検出しつつ、物体領域８１の検出に要する処理を軽減することが可能になる。 Next, a modification of the fifth embodiment will be described with reference to FIG. 32. According to the modification of the fifth embodiment, the two or more filters included in each filter set are selectively applied. This makes it possible to reduce the processing required to detect the object region 81 while detecting persons more accurately when various persons with different physiques are present.

（領域検出部１７１） (Region detection unit 171)
上述した第５の実施形態の例では、領域検出部１７１は、上記複数のフィルタセットの各々を用いる際に上記２つ以上のフィルタ７０を前景距離画像６０内の同一の位置に適用することにより、複数の領域を検出する。 In the example of the fifth embodiment described above, the region detection unit 171 detects a plurality of regions by applying the two or more filters 70 to the same position in the foreground distance image 60 when each of the plurality of filter sets is used.

一方、第５の実施形態の変形例では、領域検出部１７１は、上記複数のフィルタセットの各々を用いる際に上記２つ以上のフィルタ７０から少なくとも１つのフィルタ７０を選択し、選択された当該少なくとも１つのフィルタ７０を適用することにより、１つ以上の物体領域８１を検出する。 On the other hand, in the modification of the fifth embodiment, the region detection unit 171 selects at least one filter 70 from the two or more filters 70 when each of the plurality of filter sets is used, and detects one or more object regions 81 by applying the selected at least one filter 70.

また、例えば、領域検出部１７１は、撮像装置３０により生成される距離画像、又は当該距離画像に基づいて生成される別の距離画像から推定される人物の特徴に基づいて、上記２つ以上のフィルタ７０から上記少なくとも１つのフィルタ７０を選択する。 In addition, for example, the region detection unit 171 selects the at least one filter 70 from the two or more filters 70 based on features of a person estimated from a distance image generated by the imaging device 30 or from another distance image generated based on that distance image.

より具体的には、例えば、対象距離画像２０から生成される前景距離画像６０の各前景領域６１に対応する高さの平均値が算出される。即ち、前景領域６１に含まれる各画素に対応する高さが算出され、算出された高さの平均値が算出される。さらに、当該平均値から前景領域６１に対応する人物の身長が推定される。そして、フィルタセットに含まれる２つ以上のフィルタ７０のうちの、推定される身長に対応する１つ以上のフィルタ７０が、選択される。一例として、１つのフィルタ７０が選択される。 More specifically, for example, an average height corresponding to each foreground area 61 of the foreground distance image 60 generated from the target distance image 20 is calculated. That is, the height corresponding to each pixel included in the foreground area 61 is calculated, and the average value of the calculated heights is calculated. Further, the height of the person corresponding to the foreground area 61 is estimated from the average value. Then, one or more filters 70 corresponding to the estimated height are selected from the two or more filters 70 included in the filter set. As an example, one filter 70 is selected.
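The height-based selection above can be illustrated with a short sketch. It rests on several assumptions that are not stated in the source: the sensor faces straight down so that pixel height is the mounting height minus the measured distance, the mounting height `CAMERA_HEIGHT_CM` and the ratio `MEAN_TO_STATURE` that converts the mean region height into a stature estimate are hypothetical values, and the filter whose nominal height is closest to the estimate is the one selected.

```python
from typing import Dict, List

CAMERA_HEIGHT_CM = 300.0  # hypothetical mounting height of a downward-facing sensor
MEAN_TO_STATURE = 1.25    # hypothetical ratio: stature / mean height over the region


def estimate_stature_cm(distances_cm: List[float]) -> float:
    """Estimate stature from the distances of the pixels in one foreground region."""
    heights = [CAMERA_HEIGHT_CM - d for d in distances_cm]  # per-pixel height
    mean_height = sum(heights) / len(heights)
    return mean_height * MEAN_TO_STATURE


def select_filter(filter_set: Dict[str, float], stature_cm: float) -> str:
    """Return the name of the filter whose nominal height is closest to the estimate."""
    return min(filter_set, key=lambda name: abs(filter_set[name] - stature_cm))


# Nominal heights follow the 180 cm / 160 cm / 140 cm example of filters 70A to 70C.
filter_set = {"70A": 180.0, "70B": 160.0, "70C": 140.0}
region_distances = [172.0, 180.0, 188.0, 196.0]  # illustrative pixel distances

stature = estimate_stature_cm(region_distances)
chosen = select_filter(filter_set, stature)
```

For these illustrative distances the mean height is 116 cm, the stature estimate is 145 cm, and `chosen` is `"70C"`. Any other mapping from mean height to stature could be substituted without changing the selection logic.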

なお、フィルタセットに含まれる２つ以上のフィルタは、予め記憶され、選択された場合に取得されてもよく、又は、候補として予め存在するものの、選択された際に実際に生成されてもよい。 Note that the two or more filters included in a filter set may be stored in advance and acquired when selected, or may exist in advance only as candidates and actually be generated when selected.

上述したように、例えば、フィルタセットに含まれる２つ以上のフィルタ７０のうちの、１つのフィルタ７０が選択される。この場合には、フィルタセットが用いられるものの、実際に適用されるのは１つのフィルタ７０である。そのため、この場合には、第２の実施形態に係る領域判定部１４３と同様に、各物体領域８１が人物の対象部分に対応する領域かが判定される。 As described above, for example, one of the two or more filters 70 included in the filter set is selected. In this case, a filter set is used, but only one filter 70 is actually applied. Therefore, in this case, similarly to the region determination unit 143 according to the second embodiment, it is determined whether each object region 81 is a region corresponding to a target portion of a person.

（処理の流れ） (Process flow)
次に、図３２を参照して、第５の実施形態に係る情報処理の一例を説明する。図３２は、第５の実施形態の変形例に係る情報処理の概略的な流れの一例を示すフローチャートである。 Next, an example of information processing according to the fifth embodiment will be described with reference to FIG. 32. FIG. 32 is a flowchart illustrating an example of a schematic flow of information processing according to the modification of the fifth embodiment.

ここで、ステップＳ８１０〜Ｓ８３０、Ｓ８５０及びＳ８６０は、図２０を参照して説明した第２の実施形態に係るステップＳ４１０〜Ｓ４３０、Ｓ４５０及びＳ４６０と同様である。よって、ここでは、ステップＳ８４０のみを説明する。 Here, steps S810 to S830, S850, and S860 are the same as steps S410 to S430, S450, and S460 according to the second embodiment described with reference to FIG. 20. Therefore, only step S840 will be described here.

ステップＳ８４０で、領域検出部１７１は、上記複数のフィルタセットの各々を用いる際に２つ以上のフィルタ７０から１つのフィルタ７０を選択し、選択された当該１つのフィルタ７０を適用することにより、１つ以上の物体領域８１を検出する。 In step S840, the region detection unit 171 selects one filter 70 from two or more filters 70 when using each of the plurality of filter sets, and applies the selected one filter 70. One or more object regions 81 are detected.

なお、対象距離画像２０が継続的に生成され、取得されてもよい。この場合に、ステップＳ８６０の後に、処理はステップＳ８２０に戻ってもよい。 Note that the target distance image 20 may be continuously generated and acquired. In this case, after step S860, the process may return to step S820.

（その他） (Others)
以上のように第５の実施形態の変形例を説明したが、当該変形例はこの例に限られない。 The modification of the fifth embodiment has been described above, but the modification is not limited to this example.

例えば、推定される人物の特徴に基づいてフィルタセットからフィルタ７０が選択される例を説明したが、フィルタセットからのフィルタ７０の選択手法はこれに限られない。例えば、前回取得された対象距離画像２０から生成される前回の前景距離画像６０内の前景領域６１に適用されたフィルタ７０が保持され、今回の前景距離画像６０内の前景領域６１に流用されてもよい。例えば、前回の前景領域６１の近傍に位置する今回の前景領域６１のために、当該前回の前景領域６１で適用されたフィルタ７０が選択されてもよい。 For example, although an example in which the filter 70 is selected from the filter set based on the estimated features of a person has been described, the method of selecting the filter 70 from the filter set is not limited to this. For example, the filter 70 applied to a foreground region 61 in the previous foreground distance image 60, which was generated from the previously acquired target distance image 20, may be retained and reused for a foreground region 61 in the current foreground distance image 60. For example, for a current foreground region 61 located in the vicinity of a previous foreground region 61, the filter 70 applied to that previous foreground region 61 may be selected.
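One possible reading of this reuse scheme is sketched below. The source does not define "vicinity", so the sketch assumes that foreground regions are compared by centroid and that a hypothetical threshold `max_distance` (in pixels) decides whether a previous region is near enough. The function name and data layout are also illustrative.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def reuse_previous_filter(
    current_centroid: Point,
    previous: List[Tuple[Point, str]],
    max_distance: float,
) -> Optional[str]:
    """Return the filter applied to the nearest previous foreground region.

    `previous` holds (centroid, filter name) pairs from the previous frame.
    None is returned when no previous region lies within `max_distance`,
    in which case the caller would fall back to another selection method.
    """
    best_name: Optional[str] = None
    best_dist = max_distance
    for centroid, filter_name in previous:
        dist = math.dist(current_centroid, centroid)
        if dist <= best_dist:
            best_name, best_dist = filter_name, dist
    return best_name


previous_frame = [((50.0, 60.0), "70A"), ((200.0, 120.0), "70C")]
near = reuse_previous_filter((54.0, 63.0), previous_frame, max_distance=20.0)
far = reuse_previous_filter((120.0, 90.0), previous_frame, max_distance=20.0)
```

Here `near` is `"70A"` because the current region lies 5 pixels from the first previous region, while `far` is `None` because neither previous region is within the threshold.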

また、フィルタセットに含まれる２つ以上のフィルタ７０のうちの１つのフィルタ７０が選択される例を説明したが、選択されるフィルタ７０の数はこれに限られない。例えば、フィルタセットの中から２つ以上のフィルタ７０が選択され、適用されてもよい。そして、第５の実施形態の変形例においても、第２の実施形態と同様の物体領域８１の判定ではなく、第５の実施形態と同様の物体領域８１の判定が、行われてもよい。 In addition, although an example in which one filter 70 is selected from the two or more filters 70 included in the filter set has been described, the number of filters 70 to be selected is not limited to this. For example, two or more filters 70 may be selected from the filter set and applied. In that case, in the modification of the fifth embodiment as well, the object region 81 may be determined in the same manner as in the fifth embodiment rather than in the same manner as in the second embodiment.

以上、本発明の第５の実施形態を説明した。第５の実施形態によれば、物体領域８１の検出のために、前景距離画像内の別々の位置で適用される複数のフィルタセットが用いられる。そして、各フィルタセットは、別々の特徴を有する人物の頭部及び胴体のパターンを認識可能な２つ以上のフィルタ７０を含む。これにより、特徴（例えば、体格）が異なる様々な人物が存在する中で、人物をより正確に検出することが可能になる。 The fifth embodiment of the present invention has been described above. According to the fifth embodiment, a plurality of filter sets applied at different positions in the foreground distance image are used to detect the object region 81. Each filter set includes two or more filters 70 capable of recognizing human head and torso patterns having different characteristics. This makes it possible to more accurately detect a person in the presence of various persons with different characteristics (for example, physique).

具体的には、撮像装置３０により撮像される人物は、一律に同じ特徴（例えば、体格）を有するわけではなく、それぞれ別の特徴を有することが想定される。よって、単一の特徴に限定するのではなく、様々な特徴に応じたフィルタ７０が用いられることにより、人物の未検出を減らすことが可能になる。 Specifically, it is assumed that the person imaged by the imaging device 30 does not have the same characteristic (for example, physique) uniformly but has different characteristics. Therefore, it is possible to reduce the number of undetected persons by using the filter 70 according to various features, not limited to a single feature.

また、第５の実施形態において説明したように、例えば、複数のフィルタセットの各々を用いる際に２つ以上のフィルタを前景距離画像６０内の同一の位置に適用することにより、複数の領域が検出される。これにより、撮像装置３０により撮像される人物の特徴が特定できない場合でも、人物の未検出を減らすことが可能になる。 Further, as described in the fifth embodiment, for example, when each of a plurality of filter sets is used, by applying two or more filters to the same position in the foreground distance image 60, a plurality of regions can be obtained. Detected. Thereby, even when the characteristics of the person imaged by the imaging device 30 cannot be specified, it is possible to reduce the undetected person.

また、第５の実施形態の変形例において説明したように、例えば、複数のフィルタセットの各々を用いる際に上記２つ以上のフィルタから少なくとも１つのフィルタが選択され、選択されたフィルタが適用される。これにより、物体領域８１の検出に要する処理を減らしつつ、人物の未検出を減らすことが可能になる。 Further, as described in the modification of the fifth embodiment, for example, when each of the plurality of filter sets is used, at least one filter is selected from the two or more filters, and the selected filter is applied. As a result, it is possible to reduce the number of undetected persons while reducing the processing required to detect the object region 81.

さらに、例えば、撮像装置３０により生成される距離画像、又は当該距離画像に基づいて生成される別の距離画像から推定される人物の特徴に基づいて、上記少なくとも１つのフィルタが選択される。これにより、選択されるフィルタ７０がより人物の特徴に適したものになるので、物体領域８１の検出に要する処理を減らしつつ、人物の未検出をさらに減らすことが可能になる。 Further, for example, the at least one filter is selected based on features of a person estimated from a distance image generated by the imaging device 30 or from another distance image generated based on that distance image. As a result, the selected filter 70 is better suited to the features of the person, so that it is possible to further reduce the number of undetected persons while reducing the processing required to detect the object region 81.

さらに、例えば、複数のフィルタセットの中には、ここまでに示したもの以外で、腰が曲がったなどの傾いた姿勢や、ゴルフバッグや赤ちゃんなど体と一体化した携行品がある、など特殊な形状をした人物の特徴を考慮したフィルタが含まれていてもよい。 Further, for example, in addition to those described so far, the plurality of filter sets may include filters that take into account the features of persons with unusual shapes, such as a person in a tilted posture (for example, with a bent back) or a person carrying something held against the body (for example, a golf bag or a baby).

＜＜８．第６の実施形態＞＞ << 8. Sixth Embodiment >>
続いて、図３３〜図３６を参照して、本発明の第６の実施形態を説明する。本発明の第６の実施形態によれば、前景距離画像６０に基づいて、物体領域８１の未検出がないかが判定される。これにより、物体の未検出が見逃されてしまう可能性を減らすことができる。 Subsequently, a sixth embodiment of the present invention will be described with reference to FIGS. 33 to 36. According to the sixth embodiment of the present invention, whether any object region 81 has gone undetected is determined based on the foreground distance image 60. This can reduce the possibility that a failure to detect an object is overlooked.

＜８−１．情報処理装置の構成＞ <8-1. Configuration of information processing apparatus>
図３３〜図３６を参照して、第６の実施形態に係る情報処理装置の構成の一例を説明する。図３３は、第６の実施形態に係る情報処理装置１００−６の構成の一例を示すブロック図である。図３３を参照すると、情報処理装置１００−６は、通信部１１０、記憶部１２０及び制御部１８０を備える。 An example of the configuration of the information processing apparatus according to the sixth embodiment will be described with reference to FIGS. 33 to 36. FIG. 33 is a block diagram illustrating an example of the configuration of an information processing apparatus 100-6 according to the sixth embodiment. Referring to FIG. 33, the information processing apparatus 100-6 includes a communication unit 110, a storage unit 120, and a control unit 180.

ここで、通信部１１０、記憶部１２０、並びに、制御部１８０に含まれる距離画像取得部１３１、前景距離画像生成部１３３及び領域検出部１３５については、第１の実施形態と第６の実施形態との間に差異はない。よって、制御部１８０に含まれる未検出領域判定部１８１のみを説明する。 Here, the communication unit 110, the storage unit 120, and the distance image acquisition unit 131, the foreground distance image generation unit 133, and the region detection unit 135 included in the control unit 180 do not differ between the first embodiment and the sixth embodiment. Therefore, only the undetected area determination unit 181 included in the control unit 180 will be described.

（未検出領域判定部１８１） (Undetected area determination unit 181)
未検出領域判定部１８１は、検出される１つ以上の物体領域８１及び前景距離画像６０に基づいて、当該１つ以上の物体領域８１以外の検出されるべき領域が検出されていない可能性があるかを判定する。 Based on the one or more detected object regions 81 and the foreground distance image 60, the undetected area determination unit 181 determines whether there is a possibility that a region that should be detected, other than the one or more object regions 81, has not been detected.

−前景距離画像からの前景領域の検出 - Detection of foreground regions from the foreground distance image
例えば、まず、未検出領域判定部１８１は、前景距離画像６０から前景領域６１を検出する。 For example, first, the undetected area determination unit 181 detects the foreground regions 61 from the foreground distance image 60.

例えば、未検出領域判定部１８１は、前景距離画像６０のうちの、画素値（即ち、距離）が０ではない画素を、前景領域６１として検出する。より具体的には、例えば、未検出領域判定部１８１は、前景距離画像６０のうちの、画素値が０である画素をそのまま維持し、画素値が０ではない画素の当該画素値を１に変更することにより、前景２値画像を生成する。そして、未検出領域判定部１８１は、前景２値画像に含まれる画素のうちの、画素値１を有する画素について、ラベリングを行うことにより、個々の前景領域を検出する。以下、この点について図３４を参照して具体例を説明する。 For example, the undetected area determination unit 181 detects pixels of the foreground distance image 60 whose pixel value (that is, distance) is not 0 as the foreground regions 61. More specifically, for example, the undetected area determination unit 181 generates a foreground binary image by leaving pixels of the foreground distance image 60 whose pixel value is 0 unchanged and changing the pixel value of pixels whose pixel value is not 0 to 1. Then, the undetected area determination unit 181 detects the individual foreground regions by labeling the pixels having the pixel value 1 among the pixels included in the foreground binary image. Hereinafter, a specific example of this point will be described with reference to FIG. 34.

図３４は、前景領域６１の検出結果の一例を説明するための説明図である。図３４に示される例は、図６に示される前景距離画像６０から前景領域６１Ａ及び６１Ｂを検出した場合の例である。図３４を参照すると、前景２値画像９０が示されている。そして、図６に示される前景領域６１Ａ及び前景領域６１Ｂは、画素値が１である画素の集合である前景領域９１Ａ及び前景領域９１Ｂとして検出される。 FIG. 34 is an explanatory diagram for explaining an example of the detection result of the foreground area 61. The example shown in FIG. 34 is an example when the foreground regions 61A and 61B are detected from the foreground distance image 60 shown in FIG. Referring to FIG. 34, a foreground binary image 90 is shown. The foreground area 61A and the foreground area 61B shown in FIG. 6 are detected as a foreground area 91A and a foreground area 91B that are a set of pixels having a pixel value of 1.
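The binarization and labeling steps can be sketched in plain Python as follows. This is an illustrative sketch rather than the embodiment's implementation: the function names and the tiny example image are hypothetical, and 4-connectivity is an assumption, since the source does not say which connectivity the labeling uses.

```python
from collections import deque
from typing import List

Image = List[List[int]]


def binarize(foreground_distance: List[List[float]]) -> Image:
    """Keep 0 as 0 and map every non-zero distance to 1."""
    return [[1 if value != 0 else 0 for value in row] for row in foreground_distance]


def label_regions(binary: Image) -> Image:
    """Label 4-connected groups of 1-pixels with 1, 2, ...; background stays 0."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or labels[r][c] != 0:
                continue
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels


# Illustrative foreground distance image: 0 is background, other values are distances.
foreground_distance = [
    [0, 210, 205, 0, 0],
    [0, 200, 0, 0, 0],
    [0, 0, 0, 150, 155],
    [0, 0, 0, 160, 0],
]
binary = binarize(foreground_distance)
labels = label_regions(binary)
```

In this example the two groups of non-zero pixels receive the labels 1 and 2, which play the role of the foreground regions 91A and 91B in the foreground binary image 90.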

−未検出領域の有無の判定 - Determination of the presence or absence of an undetected region
例えば、次に、未検出領域判定部１８１は、検出された１つ以上の物体領域８１及び検出された１つ以上の前景領域９１に基づいて、当該１つ以上の物体領域８１以外の検出されるべき領域が検出されていない可能性があるかを判定する。 For example, next, based on the detected one or more object regions 81 and the detected one or more foreground regions 91, the undetected area determination unit 181 determines whether there is a possibility that a region that should be detected, other than the one or more object regions 81, has not been detected.

例えば、未検出領域判定部１８１は、検出された１つ以上の物体領域８１の各々に外接する矩形（以下、「物体領域矩形」と呼ぶ）を算出する。また、未検出領域判定部１８１は、検出された１つ以上の前景領域９１の各々に外接する矩形（以下、「前景領域矩形」と呼ぶ）も算出する。そして、未検出領域判定部１８１は、１つ以上の物体領域矩形の各々の位置及び大きさと、１つ以上の前景領域矩形の各々の位置及び大きさとの比較により、上記１つ以上の前景領域６１のうちの、いずれの物体領域８１の検出にも関わっていない前景領域６１を特定する。そして、未検出領域判定部１８１は、いずれの物体領域８１の検出にも関わっていない前景領域６１を特定した場合に、特定された当該前景領域６１に関して領域の未検出の可能性があると判定する。以下、この点について、図３５を参照して具体例を説明する。 For example, the undetected area determination unit 181 calculates a rectangle circumscribing each of the detected one or more object regions 81 (hereinafter referred to as an “object region rectangle”). The undetected area determination unit 181 also calculates a rectangle circumscribing each of the detected one or more foreground regions 91 (hereinafter referred to as a “foreground region rectangle”). Then, by comparing the position and size of each of the one or more object region rectangles with the position and size of each of the one or more foreground region rectangles, the undetected area determination unit 181 identifies any foreground region 61, among the one or more foreground regions 61, that is not involved in the detection of any object region 81. When such a foreground region 61 is identified, the undetected area determination unit 181 determines that a region may have gone undetected with respect to the identified foreground region 61. Hereinafter, a specific example of this point will be described with reference to FIG. 35.

図３５は、領域の未検出の可能性があるかの判定の例を説明するための説明図である。図３５を参照すると、前景２値画像９０及び２値画像８０が示されている。また、前景２値画像９０内に、前景領域９１Ａ及び９１Ｂと、当該前景領域９１Ａ及び９１Ｂの前景領域矩形９３Ａ及び９３Ｂが示されている。また、２値画像８０内に、１つの物体領域８１Ａと、物体領域８１Ａの物体領域矩形８７Ａが示されている。この例では、前景２値画像９０と２値画像８０とを重ねたと仮定すると、前景領域矩形９３Ａ内には、物体領域矩形８７Ａが存在するが、前景領域矩形９３Ｂ内には、いずれの物体領域矩形８７も存在しない。よって、前景領域９１Ｂに関して領域の未検出の可能性があると判定される。 FIG. 35 is an explanatory diagram for explaining an example of determining whether a region may have gone undetected. Referring to FIG. 35, a foreground binary image 90 and a binary image 80 are shown. In the foreground binary image 90, the foreground regions 91A and 91B and their foreground region rectangles 93A and 93B are shown. In the binary image 80, one object region 81A and its object region rectangle 87A are shown. In this example, if the foreground binary image 90 and the binary image 80 are overlaid, the object region rectangle 87A lies inside the foreground region rectangle 93A, but no object region rectangle 87 lies inside the foreground region rectangle 93B. Therefore, it is determined that a region may have gone undetected with respect to the foreground region 91B.
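The rectangle comparison in the FIG. 35 example can be sketched as follows. The source only states that positions and sizes of the rectangles are compared. As a simplifying assumption, this sketch treats a foreground region as covered when its circumscribing rectangle fully contains at least one object region rectangle. The function names and the small label images are hypothetical.

```python
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def bounding_rects(labels: List[List[int]]) -> Dict[int, Rect]:
    """Circumscribing rectangle of every non-zero label in a label image."""
    rects: Dict[int, Rect] = {}
    for y, row in enumerate(labels):
        for x, label in enumerate(row):
            if label == 0:
                continue
            if label in rects:
                t, l, b, r = rects[label]
                rects[label] = (min(t, y), min(l, x), max(b, y), max(r, x))
            else:
                rects[label] = (y, x, y, x)
    return rects


def contains(outer: Rect, inner: Rect) -> bool:
    """True if `inner` lies entirely inside `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def undetected_foregrounds(fg: Dict[int, Rect], obj: Dict[int, Rect]) -> List[int]:
    """Foreground labels whose rectangle contains no object region rectangle."""
    return [label for label, rect in sorted(fg.items())
            if not any(contains(rect, o) for o in obj.values())]


# Label 1 and 2 stand in for foreground regions 91A and 91B.
foreground_labels = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 2, 2],
]
# A single detected object region, standing in for object region 81A.
object_labels = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
fg_rects = bounding_rects(foreground_labels)
obj_rects = bounding_rects(object_labels)
missing = undetected_foregrounds(fg_rects, obj_rects)
```

Here `missing` is `[2]`: the second foreground rectangle contains no object region rectangle, so it would be reported as a possible missed detection. An overlap-ratio test could replace `contains` if partial overlap should also count.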

以上のように、１つ以上の物体領域８１以外に検出されるべき領域が検出されていない可能性があるかが判定される。 As described above, it is determined whether there is a possibility that a region other than the one or more object regions 81 is not detected.

−未検出の可能性の通知 - Notification of the possibility of a missed detection
また、例えば、未検出領域判定部１８１は、１つ以上の物体領域８１以外に検出されるべき領域が検出されていない可能性があると判定した場合に、判定結果を通知する。 In addition, for example, when the undetected area determination unit 181 determines that there is a possibility that a region that should be detected, other than the one or more object regions 81, has not been detected, it gives notification of the determination result.

例えば、未検出領域判定部１８１は、通信部１１０（又は図示されていない表示部）を介して、情報処理装置１００−６のユーザ又は管理者に、判定結果を通知する。 For example, the undetected area determination unit 181 notifies the determination result to the user or administrator of the information processing apparatus 100-6 via the communication unit 110 (or a display unit not shown).

また、例えば、通知される上記判定結果は、領域の未検出の可能性があることを示す情報、未検出に関連し得る前景領域６１についての情報、及び対象距離画像２０を含む。 Also, for example, the notified determination result includes information indicating that there is a possibility that the area has not been detected, information about the foreground area 61 that may be related to the non-detection, and the target distance image 20.

＜８−２．処理の流れ＞ <8-2. Process flow>
次に、図３６を参照して、第６の実施形態に係る情報処理の一例を説明する。図３６は、第６の実施形態に係る情報処理の概略的な流れの一例を示すフローチャートである。 Next, an example of information processing according to the sixth embodiment will be described with reference to FIG. 36. FIG. 36 is a flowchart illustrating an example of a schematic flow of information processing according to the sixth embodiment.

ここで、ステップＳ９１０〜Ｓ９４０は、図１５を参照して説明した第１の実施形態に係るステップＳ３１０〜Ｓ３４０と同様である。よって、ここでは、ステップＳ９５０〜Ｓ９７０のみを説明する。 Here, steps S910 to S940 are the same as steps S310 to S340 according to the first embodiment described with reference to FIG. 15. Therefore, only steps S950 to S970 will be described here.

ステップＳ９５０で、未検出領域判定部１８１は、前景距離画像６０から前景領域６１を検出する。 In step S950, the undetected area determination unit 181 detects the foreground area 61 from the foreground distance image 60.

ステップＳ９６０で、未検出領域判定部１８１は、検出された１つ以上の物体領域８１及び検出された１つ以上の前景領域９１に基づいて、当該１つ以上の物体領域８１以外の検出されるべき領域が検出されていない可能性があるかを判定する。当該領域が検出されていない可能性がある場合には、処理はステップＳ９７０へ進む。そうでなければ、処理は終了する。 In step S960, the undetected area determination unit 181 detects other than the one or more object areas 81 based on the detected one or more object areas 81 and the detected one or more foreground areas 91. It is determined whether there is a possibility that a region to be detected is not detected. If there is a possibility that the area has not been detected, the process proceeds to step S970. Otherwise, the process ends.

ステップＳ９７０で、未検出領域判定部１８１は、情報処理装置１００−６のユーザ又は管理者に判定結果を通知する。そして、処理は終了する。 In step S970, the undetected area determination unit 181 notifies the determination result to the user or administrator of the information processing apparatus 100-6. Then, the process ends.

なお、対象距離画像２０が継続的に生成され、取得されてもよい。この場合に、処理は、ステップＳ９６０又はＳ９７０の後に終了する代わりに、ステップＳ９２０に戻ってもよい。 Note that the target distance image 20 may be continuously generated and acquired. In this case, the process may return to step S920 instead of ending after step S960 or S970.

以上、本発明の第６の実施形態を説明した。第６の実施形態によれば、前景距離画像６０に基づいて、物体領域８１の未検出がないかが判定される。これにより、物体の未検出が見逃されてしまう可能性を減らすことができる。 The sixth embodiment of the present invention has been described above. According to the sixth embodiment, based on the foreground distance image 60, it is determined whether or not the object region 81 has not been detected. This can reduce the possibility that an undetected object is missed.

＜＜９．ハードウェア構成＞＞ << 9. Hardware configuration >>
続いて、図３７を参照して、本開示の実施形態に係る情報処理装置１００のハードウェア構成の一例を説明する。図３７は、本開示の実施形態に係る情報処理装置１００のハードウェア構成の一例を示すブロック図である。図３７を参照すると、情報処理装置１００は、ＣＰＵ（Central Processing Unit）１００１、ＲＯＭ（Read Only Memory）１００３、ＲＡＭ（Random Access Memory）１００５、バス１００７、記憶装置１００９及び通信装置１０１１を備える。 Subsequently, an example of the hardware configuration of the information processing apparatus 100 according to the embodiment of the present disclosure will be described with reference to FIG. 37. FIG. 37 is a block diagram illustrating an example of the hardware configuration of the information processing apparatus 100 according to the embodiment of the present disclosure. Referring to FIG. 37, the information processing apparatus 100 includes a CPU (Central Processing Unit) 1001, a ROM (Read Only Memory) 1003, a RAM (Random Access Memory) 1005, a bus 1007, a storage device 1009, and a communication device 1011.

ＣＰＵ１００１は、情報処理装置１００における様々な処理を実行する。ＲＯＭ１００３は、情報処理装置１００における処理をＣＰＵ１００１に実行させるためのプログラム及びデータを記憶する。ＲＡＭ１００５は、ＣＰＵ１００１の処理の実行時に、プログラム及びデータを一時的に記憶する。 The CPU 1001 executes various processes in the information processing apparatus 100. The ROM 1003 stores programs and data for causing the CPU 1001 to execute processing in the information processing apparatus 100. The RAM 1005 temporarily stores programs and data while the CPU 1001 executes processing.

バス１００７は、ＣＰＵ１００１、ＲＯＭ１００３及びＲＡＭ１００５を相互に接続する。バスには、さらに、記憶装置１００９及び通信装置１０１１が接続される。バス１００７は、例えば、複数の種類のバスを含む。一例として、バス１００７は、ＣＰＵ１００１、ＲＯＭ１００３及びＲＡＭ１００５を接続するバスと、当該バスよりも低速の１つ以上の別のバスを含む。 A bus 1007 connects the CPU 1001, the ROM 1003, and the RAM 1005 to each other. A storage device 1009 and a communication device 1011 are further connected to the bus. The bus 1007 includes, for example, a plurality of types of buses. As an example, the bus 1007 includes a bus that connects the CPU 1001, the ROM 1003, and the RAM 1005, and one or more other buses that are slower than the bus.

記憶装置１００９は、情報処理装置１００内で一時的又は恒久的に保存すべきデータを記憶する。記憶装置１００９は、例えば、ハードディスク（Hard Disk）等の磁気記憶装置であってもよく、又は、ＥＥＰＲＯＭ（Electrically Erasable and Programmable Read Only Memory）、フラッシュメモリ（flash memory）、ＭＲＡＭ（Magnetoresistive Random Access Memory）、ＦｅＲＡＭ（Ferroelectric Random Access Memory）及びＰＲＡＭ（Phase change Random Access Memory）等の不揮発性メモリ（nonvolatile memory）であってもよい。 The storage device 1009 stores data to be temporarily or permanently stored in the information processing device 100. The storage device 1009 may be a magnetic storage device such as a hard disk, or may be an EEPROM (Electrically Erasable and Programmable Read Only Memory), a flash memory, a MRAM (Magnetoresistive Random Access Memory), or the like. Further, non-volatile memories such as FeRAM (Ferroelectric Random Access Memory) and PRAM (Phase change Random Access Memory) may be used.

通信装置１０１１は、情報処理装置１００が備える通信手段であり、ネットワークを介して（あるいは、直接的に）外部装置と通信する。通信装置１０１１は、有線通信用の通信装置であってもよく、この場合に、例えば、ＬＡＮ端子、伝送回路及びその他の通信処理用の回路を含んでもよい。また、通信装置１０１１は、無線通信用の通信装置であってもよく、この場合に、例えば、通信アンテナ、ＲＦ回路及びその他の通信処理用の回路を含んでもよい。 The communication device 1011 is a communication unit included in the information processing device 100 and communicates with an external device via a network (or directly). The communication device 1011 may be a communication device for wired communication. In this case, for example, the communication device 1011 may include a LAN terminal, a transmission circuit, and other communication processing circuits. The communication device 1011 may be a communication device for wireless communication. In this case, for example, the communication device 1011 may include a communication antenna, an RF circuit, and other communication processing circuits.

以上、添付図面を参照しながら本発明の好適な実施形態を説明したが、本発明は係る例に限定されないことは言うまでもない。当業者であれば、特許請求の範囲に記載された範疇内において、各種の変更例または修正例に想到し得ることは明らかであり、それらについても当然に本発明の技術的範囲に属するものと了解される。 The preferred embodiments of the present invention have been described above with reference to the accompanying drawings, but it goes without saying that the present invention is not limited to these examples. It is clear that a person skilled in the art can conceive of various changes and modifications within the scope of the claims, and it is understood that these naturally also belong to the technical scope of the present invention.

例えば、物体領域を検出する際に用いられるフィルタの形状が矩形である例を説明したが、本発明は係る例に限定されない。フィルタの形状は矩形以外の形状であってもよい。一例として、フィルタの形状は楕円であってもよい。 For example, although the example in which the shape of the filter used when detecting the object region is rectangular has been described, the present invention is not limited to such an example. The shape of the filter may be other than a rectangle. As an example, the shape of the filter may be an ellipse.

また、第５の実施形態を除き、前景距離画像の各位置に１つのフィルタが適用される例を説明したが、本発明は係る例に限定されない。各位置には２つ以上のフィルタが適用されてもよい。一例として、人物が向いている方向として、２つ以上の方向が設定され、各方向に対応するフィルタが生成されてもよい。そして、各位置には、各方向に対応するフィルタが適用されてもよい。さらに、方向ごとに１つ以上の物体領域が検出された後に、全方向の物体領域が統合され、統合後の１つ以上の物体領域を最終的な物体領域とみなしてもよい。 Moreover, although the example in which one filter is applied to each position of the foreground distance image has been described except for the fifth embodiment, the present invention is not limited to such an example. Two or more filters may be applied to each position. As an example, two or more directions may be set as the direction in which the person is facing, and a filter corresponding to each direction may be generated. A filter corresponding to each direction may be applied to each position. Further, after one or more object regions are detected for each direction, the object regions in all directions may be integrated, and the one or more object regions after integration may be regarded as a final object region.
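
As an illustration of the integration step described above, the following minimal Python sketch merges the object regions detected for each direction. The source does not specify how the regions are integrated; this sketch assumes that each region is an axis-aligned bounding box and that overlapping boxes are merged into their union. The names `Box`, `overlaps`, `union`, and `integrate_regions`, as well as the direction labels, are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical region representation: (x0, y0, x1, y1), with x1 and y1 exclusive.
Box = Tuple[int, int, int, int]


def overlaps(a: Box, b: Box) -> bool:
    """Return True if the two boxes share at least one pixel."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def union(a: Box, b: Box) -> Box:
    """Return the smallest box that contains both boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def integrate_regions(per_direction: Dict[str, List[Box]]) -> List[Box]:
    """Merge the regions detected for every direction (assumed rule: union of overlaps)."""
    merged: List[Box] = []
    for boxes in per_direction.values():
        for box in boxes:
            current = box
            changed = True
            # Keep absorbing already-merged boxes until none overlaps the current box.
            while changed:
                changed = False
                for other in merged:
                    if overlaps(current, other):
                        merged.remove(other)
                        current = union(current, other)
                        changed = True
                        break
            merged.append(current)
    return merged


detected = {
    "facing_north": [(0, 0, 4, 4)],
    "facing_east": [(2, 2, 6, 6), (10, 10, 12, 12)],
}
final_regions = integrate_regions(detected)
```

Here the two overlapping boxes found for different directions collapse into one final object region, while the isolated box is kept unchanged. Other integration rules, such as keeping only the box with the strongest filter response, would fit the description equally well.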

また、フィルタが画素単位で生成される例を説明したが、本発明は係る例に限定されない。画素単位よりも粗い単位でフィルタが生成されてもよい。一例として、前景距離画像にいくつかの区分を設定し、区分ごとにフィルタが生成されてもよい。 Moreover, although the example in which the filter is generated in units of pixels has been described, the present invention is not limited to such an example. The filter may be generated in a coarser unit than the pixel unit. As an example, several sections may be set in the foreground distance image, and a filter may be generated for each section.
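
The coarser-than-pixel variant above can be sketched as a lookup from pixel position to a per-section filter. This is only a sketch under stated assumptions: the sections are taken to be square blocks of a fixed size, each filter is generated for the center pixel of its block, and `make_filter` is a hypothetical stand-in for whatever filter generation the embodiments use. None of these names come from the source.

```python
from typing import Any, Callable, Dict, Tuple


def block_index(x: int, y: int, block_size: int) -> Tuple[int, int]:
    """Map a pixel position to the index of the section (block) that contains it."""
    return (x // block_size, y // block_size)


def build_block_filters(
    width: int,
    height: int,
    block_size: int,
    make_filter: Callable[[int, int], Any],
) -> Dict[Tuple[int, int], Any]:
    """Generate one filter per block, using the block center as its reference position."""
    filters: Dict[Tuple[int, int], Any] = {}
    for by in range(0, height, block_size):
        for bx in range(0, width, block_size):
            # Clamp the center so that partial blocks at the image border stay in range.
            cx = min(bx + block_size // 2, width - 1)
            cy = min(by + block_size // 2, height - 1)
            filters[block_index(bx, by, block_size)] = make_filter(cx, cy)
    return filters


def filter_for_pixel(
    filters: Dict[Tuple[int, int], Any], x: int, y: int, block_size: int
) -> Any:
    """Return the filter shared by every pixel in the block containing (x, y)."""
    return filters[block_index(x, y, block_size)]


# Hypothetical placeholder: a real filter would encode the head and torso pattern.
def make_filter(cx: int, cy: int) -> Tuple[str, int, int]:
    return ("filter", cx, cy)


filters = build_block_filters(width=8, height=8, block_size=4, make_filter=make_filter)
```

With an 8 x 8 image and 4 x 4 blocks, only four filters are generated instead of sixty-four, at the cost of a less exact match between each pixel and its filter.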

また、主として、フィルタが予め生成されることを前提として説明したが、本発明は係る例に限定されない。フィルタは、随時生成されてもよい。一例として、フィルタは、前景距離画像に適用する際に随時生成されてもよい。 In addition, the description has been mainly made on the assumption that the filter is generated in advance, but the present invention is not limited to such an example. The filter may be generated at any time. As an example, the filter may be generated at any time when applied to a foreground distance image.
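
On-demand generation, as described above, could look like the following minimal sketch, in which a filter is created the first time it is needed at a position. Caching the generated filters for reuse is an assumption of this sketch and is not stated in the source; `LazyFilterBank` and `make_filter` are hypothetical names.

```python
from typing import Any, Callable, Dict, Tuple


class LazyFilterBank:
    """Generates a filter only when it is first applied at a position (assumed caching)."""

    def __init__(self, make_filter: Callable[[int, int], Any]) -> None:
        self._make_filter = make_filter
        self._cache: Dict[Tuple[int, int], Any] = {}
        self.generated = 0  # number of filters actually generated so far

    def get(self, x: int, y: int) -> Any:
        key = (x, y)
        if key not in self._cache:
            # First use at this position: generate the filter now.
            self._cache[key] = self._make_filter(x, y)
            self.generated += 1
        return self._cache[key]


# Hypothetical placeholder for the position-dependent filter generation.
def make_filter(x: int, y: int) -> Tuple[str, int, int]:
    return ("filter", x, y)


bank = LazyFilterBank(make_filter)
first = bank.get(3, 5)
second = bank.get(3, 5)  # served from the cache, not regenerated
```

A design like this trades a small delay on the first use of each position for lower memory use when only part of the foreground distance image ever contains foreground.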

また、撮像装置が真下に向けて設置される例を説明したが、本発明は係る例に限定されない。撮像装置は、真下以外の方向に向けて設置されてもよい。例えば、撮像装置は斜め下の方向に向けて設置されてもよい。この場合も、撮像装置の設置状態に関する情報に基づいてフィルタが生成される。 Moreover, although the example in which the imaging device is installed facing directly downward has been described, the present invention is not limited to such an example. The imaging device may be installed facing a direction other than directly downward. For example, the imaging device may be installed facing diagonally downward. In this case as well, the filter is generated based on information about the installation state of the imaging device.

また、情報処理システムが１つの撮像装置のみを含む例を説明したが、当然ながら本発明は係る例に限定されない。情報処理システムは２つ以上の撮像装置を含み、情報処理装置は、それぞれの撮像装置により生成される距離画像から、人物を検出してもよい。 In addition, although the example in which the information processing system includes only one imaging device has been described, it is needless to say that the present invention is not limited to such an example. The information processing system may include two or more imaging devices, and the information processing device may detect a person from a distance image generated by each imaging device.

また、撮像装置と情報処理装置とが別々の装置である例を説明したが、本発明は係る例に限定されない。例えば、撮像装置と情報処理装置とは同一の装置であり、距離画像の生成と当該距離画像からの人物の検出との両方を行なってもよい。 Further, although the example in which the imaging device and the information processing device are separate devices has been described, the present invention is not limited to such an example. For example, the imaging device and the information processing device may be one and the same device, which performs both the generation of a distance image and the detection of a person from that distance image.

また、本明細書の情報処理における処理ステップは、必ずしもフローチャートに記載された順序に沿って時系列に実行されなくてよい。例えば、情報処理における処理ステップは、フローチャートとして記載した順序と異なる順序で実行されても、並列的に実行されてもよい。 Further, the processing steps in the information processing of the present specification do not necessarily have to be executed in time series in the order described in the flowchart. For example, the processing steps in the information processing may be executed in an order different from the order described in the flowchart, or may be executed in parallel.

また、情報処理装置に内蔵されるＣＰＵ、ＲＯＭ及びＲＡＭ等のハードウェアに、上記情報処理装置の各構成と同等の機能を発揮させるためのコンピュータプログラムも作成可能である。また、当該コンピュータプログラムを記憶させた記憶媒体も提供される。 It is also possible to create a computer program for causing hardware such as a CPU, a ROM, and a RAM built in the information processing apparatus to exhibit functions equivalent to those of each configuration of the information processing apparatus. A storage medium storing the computer program is also provided.

１０（従来の）Ｈａａｒ−Ｌｉｋｅフィルタ
２０距離画像
３０撮像装置
４０人物
６０前景距離画像
６１前景領域
７０フィルタ
７１、７３矩形
８０２値画像
８１物体領域
８３楕円
９０前景２値画像
９１前景領域
１００情報処理装置
１３１距離画像取得部
１３３前景距離画像生成部
１３５、１７１領域検出部
１４１、１５３領域補正部
１４３、１５５、１７３領域判定部
１５１、１６１領域識別部
１８１未検出領域判定部
10 (Conventional) Haar-like filter
20 Distance image
30 Imaging device
40 Person
60 Foreground distance image
61 Foreground region
70 Filter
71, 73 Rectangle
80 Binary image
81 Object region
83 Ellipse
90 Foreground binary image
91 Foreground region
100 Information processing device
131 Distance image acquisition unit
133 Foreground distance image generation unit
135, 171 Region detection unit
141, 153 Region correction unit
143, 155, 173 Region determination unit
151, 161 Region identification unit
181 Undetected region determination unit

Claims

1. An information processing apparatus comprising:
an acquisition unit that acquires a first distance image generated by an imaging device and a second distance image generated in advance by the imaging device by imaging a background;
a generation unit that generates, based on the first distance image and the second distance image, a third distance image corresponding to a foreground with respect to the background; and
a detection unit that detects one or more regions from the third distance image by using a plurality of filters applied at different positions in the third distance image, the plurality of filters being capable of recognizing a pattern of a person's head and torso according to the applied position.

2. The information processing apparatus according to claim 1, wherein each of the plurality of filters is a filter generated based on a relative relationship between the imaging device and the person when the person is imaged by the imaging device.

3. The information processing apparatus according to claim 2, wherein the relative relationship includes a direction, as seen from the imaging device, in which the head of the person is present and a direction, as seen from the imaging device, in which the torso of the person is present.

4. The information processing apparatus according to claim 2, wherein each of the plurality of filters is a filter generated further based on characteristics of the imaging device.

5. The information processing apparatus according to any one of claims 1 to 4, wherein
the detection unit detects the one or more regions as regions corresponding to a target portion of a person, and
the information processing apparatus further comprises a region determination unit that determines, based on a size corresponding to each of at least one region of the one or more regions, whether each of the at least one region corresponds to the target portion.

6. The information processing apparatus according to claim 5, further comprising
a correction unit that corrects the one or more regions by adding, to each of at least one region of the one or more regions, a peripheral region of that region that satisfies a predetermined condition, based on the first distance image or the third distance image, wherein
the region determination unit determines whether each of the at least one region corresponds to the target portion based on the size corresponding to each of the at least one region after the correction.

7. The information processing apparatus according to claim 5, wherein the target portion is a person's head.

8. The information processing apparatus according to any one of claims 5 to 7, further comprising
an identification unit that identifies, among the one or more regions, a high possibility region that is highly likely to be determined to be a region corresponding to the target portion, and other regions, wherein
the at least one region of the one or more regions is the other regions among the one or more regions.

9. The information processing apparatus according to claim 8, wherein the high possibility region is a region located within a predetermined range in which a region corresponding to a portion of a person other than the target portion is unlikely to be detected.

10. The information processing apparatus according to claim 8, wherein the high possibility region is the one region when the one or more regions include only one region and, when the one or more regions include two or more regions, is a region that is separated by a predetermined distance or more from any other region included in the one or more regions.

11. The information processing apparatus according to claim 8, wherein
the acquisition unit acquires a fourth distance image generated by the imaging device before the first distance image,
the generation unit generates, based on the fourth distance image and the second distance image, a fifth distance image corresponding to the foreground with respect to the background,
the detection unit detects one or more other regions from the fifth distance image using the plurality of filters,
the identification unit identifies the high possibility region and the other regions among the one or more other regions,
the region determination unit determines whether each of the other regions among the one or more other regions corresponds to the target portion, and
the identification unit identifies the high possibility region and the other regions among the one or more regions based on the high possibility region among the one or more other regions and on a determined region, which is a region among the one or more other regions determined to correspond to the target portion.

12. The information processing apparatus according to claim 11, wherein the identification unit identifies the high possibility region and the other regions among the one or more regions based on the positions of the high possibility region and the determined region among the one or more other regions and on the positions of the one or more regions.

13. The information processing apparatus according to any one of claims 5 to 7, wherein
the detection unit detects the one or more regions by using a plurality of filter sets applied at different positions in the third distance image, the plurality of filter sets being capable of recognizing a pattern of a person's head and torso according to the applied position, and
each of the plurality of filter sets includes two or more filters capable of recognizing person head and torso patterns having different characteristics.

14. The information processing apparatus according to claim 13, wherein
the detection unit, when using each of the plurality of filter sets, detects a plurality of regions by applying the two or more filters to the same position in the third distance image, and
the region determination unit determines whether each of at least one region of the plurality of regions corresponds to the target portion based on a size corresponding to each of the at least one region.

15. The information processing apparatus according to claim 13, wherein the detection unit, when using each of the plurality of filter sets, selects at least one filter from the two or more filters and detects the one or more regions by applying the selected at least one filter.

16. The information processing apparatus according to claim 15, wherein the detection unit selects the at least one filter from the two or more filters based on a person image estimated from a distance image generated by the imaging device or from another distance image generated based on that distance image.

17. The information processing apparatus according to claim 1, further comprising an undetected determination unit that determines, based on the one or more regions and the third distance image, whether there may be a region, other than the one or more regions, that should have been detected but was not detected.

18. The information processing apparatus according to claim 1, wherein each of the plurality of filters is a Haar-like filter.

19. The information processing apparatus according to claim 1, wherein each of the plurality of filters is a filter capable of recognizing a human head and shoulder pattern according to an applied position.

20. A program for causing a computer to function as:
an acquisition unit that acquires a first distance image generated by an imaging device and a second distance image generated in advance by the imaging device by imaging a background;
a generation unit that generates, based on the first distance image and the second distance image, a third distance image corresponding to a foreground with respect to the background; and
a detection unit that detects one or more regions from the third distance image by using a plurality of filters applied at different positions in the third distance image, the plurality of filters being capable of recognizing a pattern of a person's head and torso according to the applied position.

21. An information processing method including:
acquiring a first distance image generated by an imaging device and a second distance image generated in advance by the imaging device by imaging a background;
generating, based on the first distance image and the second distance image, a third distance image corresponding to a foreground with respect to the background; and
detecting one or more regions from the third distance image by using a plurality of filters applied at different positions in the third distance image, the plurality of filters being capable of recognizing a pattern of a person's head and torso according to the applied position.