JP2018088049A

JP2018088049A - Device, method and program for image processing

Info

Publication number: JP2018088049A
Application number: JP2016230046A
Authority: JP
Inventors: Naotada Sagawa; Kotaro Yano
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2016-11-28
Filing date: 2016-11-28
Publication date: 2018-06-07

Abstract

PROBLEM TO BE SOLVED: To make it possible to accurately detect a region such as a specific subject from an input image. SOLUTION: A face detection unit (202) detects a face region from an input image. A human body detection unit (203) detects a plurality of human body regions from the input image. A distance calculation unit (204) calculates the distance between the face region and each of the human body regions. A condition setting unit (205) sets a condition on the distance calculated by the distance calculation unit (204). A deletion unit (206) deletes, from among the human body regions, any human body region whose calculated distance does not satisfy the condition. SELECTED DRAWING: Figure 2

Description

The present invention relates to an image processing apparatus, an image processing method, and a program for detecting a region such as a subject from an image.

Techniques for automatically detecting a specific subject in an image are applied in many fields, including image retrieval, object detection, object recognition, and object tracking. As one example, Non-Patent Document 1 discloses a method for detecting a person region in an image (hereinafter referred to as human body detection). In this method, person regions are detected by matching a large number of detection windows extracted from the input image against dictionary data learned in advance from a very large number of person images. Furthermore, Patent Document 1 achieves higher speed by using an integral image to compute Histogram of Oriented Gradients (hereinafter, HOG) features, which are effective for person detection, and by applying a cascade classifier obtained by AdaBoost learning. Classification with a cascade classifier narrows down the detection candidates efficiently by connecting a plurality of classifiers in series.

Patent Document 1: US Patent Application Publication No. 2007/0237387

Non-Patent Document 1: N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection" (CVPR 2005).

The methods described above have the problem that the position of the subject obtained as the detection result can deviate from the actual position of the subject in the image. For example, in human body detection using a conventional method, the position of the top of the head obtained as the detection result may be offset from the actual position of the top of the head in the image. This is because conventional human body detection must handle a top-of-head appearance that varies widely with hairstyle, hats, and so on. Rather than accurately locating a specific contour such as the head, it only detects that a rough outline of a human head exists somewhere within the candidate region.

Therefore, an object of the present invention is to make it possible to accurately detect a region such as a human body from an image.

The present invention comprises: first detection means for detecting a first region relating to a subject from an input image; second detection means for detecting a plurality of second regions relating to the subject from the input image; distance calculation means for calculating the distance between the first region detected by the first detection means and each of the plurality of second regions detected by the second detection means; condition setting means for setting a condition on the distance calculated by the distance calculation means; and deletion means for deleting, from among the plurality of second regions detected by the second detection means, any second region whose distance calculated by the distance calculation means does not satisfy the condition.

According to the present invention, a region such as a human body can be accurately detected from an image.

FIG. 1 is a schematic hardware configuration diagram of the image processing apparatus of the present embodiment.
FIG. 2 is a schematic functional block diagram of the image processing apparatus of the present embodiment.
FIG. 3 is a flowchart showing the flow of image processing in the present embodiment.
FIG. 4 is a flowchart showing the flow of the human body detection processing.
FIG. 5 is a diagram showing the relationship between an input image, reduced images, and a detection window.
FIG. 6 is a diagram showing the relationship between a face detection result, a human body detection result, and the face-body distance.
FIG. 7 is a flowchart showing the flow of the nonconforming human body deletion processing.
FIG. 8 is a diagram showing the shooting environment when photographing subjects such as persons.
FIG. 9 is a diagram showing an example of a captured input image.
FIG. 10 is a diagram showing a correspondence table relating position in the image to the face-body distance.
FIG. 11 is a diagram showing the face detection result, human body detection results, and face-body distances for one person.
FIG. 12 is a diagram showing example images and human body regions before and after the nonconforming human body deletion processing.
FIG. 13 is a diagram showing the final image processing result.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.

The image processing apparatus of the present embodiment can be applied to, for example, digital cameras, digital video cameras, mobile terminals with a camera function such as smartphones and tablets, industrial cameras, in-vehicle cameras, and medical cameras. In particular, the image processing apparatus of the present embodiment has a function of detecting, from an image, a specific subject region such as a person image (hereinafter simply referred to as a person) or a partial region of the subject.

FIG. 1 is a schematic hardware configuration diagram of the image processing apparatus according to the present embodiment. The image processing apparatus includes a CPU (Central Processing Unit) 101, a storage device 102, an input device 103, and an output device 104. These devices are connected by a bus or the like so that they can communicate with one another.

The CPU 101 controls the operation of the image processing apparatus and executes programs stored in the storage device 102. The storage device 102 is a storage device such as a magnetic storage device or a semiconductor memory, and stores programs loaded under the control of the CPU 101, data that must be retained for a long time, and the like. The storage device 102 also stores image data captured by an imaging device (camera), not shown. In the present embodiment, the functions of the image processing apparatus and the flowchart processing described later are realized by the CPU 101 performing processing according to the procedures of the program stored in the storage device 102.

The input device 103 is a mouse, a keyboard, a touch panel device, buttons, or the like, and receives various instructions from the user. The input device 103 may also include an imaging device (camera) that captures images. The imaging device includes an image sensor such as a known CCD element and captures images. The output device 104 is a liquid crystal panel, an external monitor, or the like, and outputs various types of information.

Note that the hardware configuration of the image processing apparatus is not limited to the configuration described above. For example, it may include an I/O device for communicating with other devices. The I/O device is, for example, an input/output unit for a memory card or a USB cable, or a wired or wireless transmission/reception unit.

FIG. 2 is a functional block diagram showing the functional configuration of the image processing apparatus according to the present embodiment. The functional blocks of FIG. 2 may be built as a software configuration in which, for example, the CPU 101 of FIG. 1 reads and executes a computer program stored in the storage device 102. Of course, the functional blocks of FIG. 2 may be implemented entirely in hardware or entirely in software, or partly in hardware and partly in software. The configuration shown in FIG. 2 is only an example: some of the blocks may be integrated, and any configuration may be adopted as long as it can realize the function of each block.

In FIG. 2, an image input unit 201 acquires the image to be processed in the present embodiment. The image acquired by the image input unit 201 is, for example, an image read from the storage device 102 of FIG. 1 or an image captured by a camera.

A face detection unit 202 detects a person's face region from the input image acquired by the image input unit 201. When a plurality of persons are present in the input image, the face detection unit 202 detects a face region for each person. Details of the face detection processing in the face detection unit 202 will be described later. The face detection unit 202 then outputs information on the face region detected for each person to a distance calculation unit 204.

A human body detection unit 203 detects a plurality of human body regions from the input image supplied by the image input unit 201. When a plurality of persons are present in the input image, the human body detection unit 203 detects a plurality of human body regions for each person. Details of the human body region detection processing in the human body detection unit 203 will be described later. The human body detection unit 203 then outputs information on the plurality of human body regions detected for each person to the distance calculation unit 204.

The distance calculation unit 204 calculates the distance between the face region detected by the face detection unit 202 and each of the plurality of human body regions detected by the human body detection unit 203. When a plurality of persons are present in the input image, the distance calculation unit 204 calculates, for each person, the distance between that person's face region and each of the human body regions. The definition of the distance between a face region and a human body region (hereinafter referred to as the face-body distance) and details of its calculation in the distance calculation unit 204 will be described later. The distance calculation unit 204 then outputs to a deletion unit 206 the information on the face region and the plurality of human body regions for each person, together with the face-body distances between them.

A condition setting unit 205 sets a condition on the face-body distance (hereinafter referred to as the distance condition) and outputs information on the distance condition to the deletion unit 206. Details of the distance condition set by the condition setting unit 205 will be described later.

The deletion unit 206 uses the face-body distances and the distance condition to delete, from among the plurality of human body regions, those that can be regarded as nonconforming, that is, unsuitable as a human body region. When a plurality of persons are present in the input image, the deletion unit 206 performs this deletion for each person. Details of the nonconforming region deletion processing in the deletion unit 206 will be described later. The deletion unit 206 then outputs to an integration processing unit 207 the information on the human body regions that remain for each person after the nonconforming regions have been deleted.

The integration processing unit 207 determines a single human body region from among the plurality of human body regions that remain after the deletion unit 206 has deleted the nonconforming regions, based on the relationship between the positions and sizes of those regions. When a plurality of persons are present in the input image, the integration processing unit 207 determines one human body region per person. Details of the detection result integration processing in the integration processing unit 207 will be described later. In the present embodiment, the one human body region determined for each person by the integration processing unit 207 is output as the final human body region information for that person.

FIG. 3 is a flowchart showing the flow of image processing in the image processing apparatus of the present embodiment represented by the functional blocks of FIG. 2. The processing of the flowchart in FIG. 3 may be realized not only by a hardware configuration but also by a CPU or the like executing a computer program. The same applies to the other flowcharts described later.

In step S301 of FIG. 3, the face detection unit 202 performs face detection processing on the input image from the image input unit 201. A known method can be used for the face detection processing in the present embodiment, for example, a method using template matching against face images or a method using a feature extraction filter trained in advance by machine learning as the detector. Any of these known face detection methods may be used, and the present embodiment is not limited to a specific one. Information on the face region detected by the face detection unit 202 is sent to the distance calculation unit 204. After step S301, the processing of the image processing apparatus proceeds to step S302, which is performed by the human body detection unit 203.

In step S302 of FIG. 3, the human body detection unit 203 performs human body detection processing on the input image from the image input unit 201. The following description uses, as an example of the human body detection processing in the present embodiment, a technique that detects the shape from a person's shoulders to the head, that is, the Ω (omega) shape.

FIG. 4 is a detailed flowchart of the human body detection processing in step S302 of FIG. 3.

In step S401 of FIG. 4, the human body detection unit 203 generates a plurality of reduced images by reducing the input image from the image input unit 201 at each of several predetermined magnifications. This allows the human body detection processing to be applied in turn to images of several sizes, so that persons of various sizes can be detected. After step S401, the human body detection unit 203 advances the processing to step S402.
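The reduction in step S401 can be sketched as follows. This is a minimal illustration that only computes the sizes of the reduced images; the function name `pyramid_sizes`, the geometric `scale_step`, and the `min_size` stopping rule are assumptions made for the sketch, since the text says only that the magnifications are predetermined.

```python
def pyramid_sizes(width, height, scale_step=0.8, min_size=(64, 64)):
    """Return (scale, width, height) for each successively reduced image.

    scale_step and min_size are illustrative assumptions; the text states
    only that the input image is reduced at several predetermined
    magnifications.
    """
    if not 0.0 < scale_step < 1.0:
        raise ValueError("scale_step must be between 0 and 1")
    sizes = []
    scale = scale_step
    while True:
        w, h = int(width * scale), int(height * scale)
        # Stop once the reduced image is smaller than the assumed minimum.
        if w < min_size[0] or h < min_size[1]:
            break
        sizes.append((scale, w, h))
        scale *= scale_step
    return sizes


if __name__ == "__main__":
    for scale, w, h in pyramid_sizes(640, 480, scale_step=0.5):
        print(scale, w, h)
```

Keeping the scale alongside each size makes it straightforward to map detections back to the input image later.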

In step S402, the human body detection unit 203 selects, from the plurality of reduced images generated in step S401, one image to be processed in the subsequent steps (hereinafter referred to as the processing target image). After step S402, the human body detection unit 203 advances the processing to step S403.

In step S403, the human body detection unit 203 sets a partial region of a predetermined size in the processing target image selected in step S402. Hereinafter, this partial region is referred to as a detection window. After step S403, the human body detection unit 203 advances the processing to step S404.

In step S404, the human body detection unit 203 uses a human body recognition model to perform human body discrimination processing, which determines whether the detection window contains a candidate region that can be regarded as a human body region. FIG. 5 shows the relationship between the input image 510 supplied from the image input unit 201 in step S401, the plurality of reduced images 511 obtained by reducing the input image 510, and the detection window 502 set in step S403. As shown in FIG. 5, the human body discrimination processing in step S404 is performed using the detection window 502. Because the entire area of the reduced image 511 is subject to human body detection, the human body detection unit 203 scans the detection window 502 across the reduced image 511 in steps of several pixels, as indicated by the arrow 501 in FIG. 5. Specifically, the human body detection unit 203 moves the detection window 502 from the left edge of the reduced image 511 to the right in steps of several pixels; on reaching the right edge, it shifts the window down by several pixels, returns it to the left edge, and repeats the rightward scan.
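The raster scan of the detection window described above can be sketched as a generator of window positions. The name `window_positions` and the default stride of 4 pixels are assumptions for illustration; the text specifies only "several pixels".

```python
def window_positions(image_w, image_h, win_w, win_h, stride=4):
    """Yield the top-left (x, y) of each detection window in raster order.

    The window moves left to right in steps of `stride` pixels, then drops
    down by `stride` pixels and starts again from the left edge.
    The stride value is an illustrative assumption.
    """
    if stride <= 0:
        raise ValueError("stride must be positive")
    for y in range(0, image_h - win_h + 1, stride):
        for x in range(0, image_w - win_w + 1, stride):
            yield x, y


if __name__ == "__main__":
    # A 10x6 image scanned with a 4x4 window and a stride of 2 pixels.
    print(list(window_positions(10, 6, 4, 4, stride=2)))
```

If the window is larger than the image, the generator yields nothing, so such a reduced image is simply skipped.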

The human body discrimination processing in step S404 is not limited to a particular method; any method may be used as long as it applies a human body recognition model to the image pattern in the detection window 502 and outputs a likelihood (human-body likeness). As one example, the method disclosed in Non-Patent Document 1 mentioned above can be used. Non-Patent Document 1 discloses a method that obtains likelihoods for the recognition target from a plurality of regions within the detection window and compares those likelihoods with preset thresholds to determine whether the detection window contains the recognition target. When the human body detection unit 203 determines that the image in the detection window 502 is an image of a human body, it passes the position coordinates of the detection window 502 in the reduced image 511 to the subsequent processing (step S407). After step S404, the human body detection unit 203 advances the processing to step S405.

In step S405, the human body detection unit 203 determines whether the detection window 502 has scanned the entire reduced image 511. If the human body detection unit 203 determines in step S405 that the scan is not finished (No), it returns the processing to step S403 and performs the processing from step S403 onward. If it determines in step S405 that the entire reduced image 511 has been scanned (Yes), it advances the processing to step S406.

In step S406, the human body detection unit 203 determines whether the processing from step S402 to step S405 has been completed for all of the reduced images 511 generated from the input image 510. If it determines in step S406 that some reduced images 511 remain unprocessed (No), it returns the processing to step S402 and performs the processing from step S402 onward. If it determines in step S406 that all reduced images 511 have been processed (Yes), it advances the processing to step S407.

In step S407, the human body detection unit 203 converts the position coordinates of each human body region detected from the reduced images 511 through steps S402 to S406 into the coordinate system of the original input image 510. The human body detection unit 203 then outputs the coordinate-converted human body regions detected from the reduced images 511 to the distance calculation unit 204 of FIG. 2 as human body region candidates. After step S407, the human body detection unit 203 ends the processing of the flowchart of FIG. 4 (the processing of step S302 in FIG. 3). The processing of the image processing apparatus then proceeds to step S303 of FIG. 3, which is performed by the distance calculation unit 204.
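The coordinate conversion in step S407 amounts to undoing the reduction. The sketch below assumes that each region is an axis-aligned box `(x1, y1, x2, y2)` and that `scale` is the ratio of the reduced image size to the input image size; both the representation and the name `to_input_coords` are assumptions made for illustration.

```python
def to_input_coords(box, scale):
    """Map a box (x1, y1, x2, y2) found in a reduced image to input-image coordinates.

    `scale` is the reduction factor of that image (reduced size / input size),
    so dividing by it restores the original coordinate system.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    x1, y1, x2, y2 = box
    return (x1 / scale, y1 / scale, x2 / scale, y2 / scale)


if __name__ == "__main__":
    # A window found in a half-size image doubles in every coordinate.
    print(to_input_coords((10, 20, 42, 52), 0.5))  # (20.0, 40.0, 84.0, 104.0)
```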

FIG. 12(a) shows an example of candidate human body regions 1210 and 1211 detected from an input image 1201 by the human body detection unit 203. FIG. 12(b) is described later. In the example of FIG. 12(a), two persons 801 and 802 appear as subjects in the input image 1201. The human body detection unit 203 detects the human body regions 1210 and 1211 by applying steps S401 to S407 of FIG. 4 to the input image 1201. In that processing, a plurality of reduced images are generated from the input image 1201, and human body regions are detected from each of them. Even within a single reduced image, a human body region may be detected in each of several neighboring detection windows. As a result, as shown in FIG. 12(a), a plurality of human body regions 1210 are detected for the person 801 and a plurality of human body regions 1211 are detected for the person 802. The human body detection unit 203 therefore outputs to the distance calculation unit 204 of FIG. 2 information indicating multiple human body regions per person.

The description now returns to FIG. 3. In step S303 of FIG. 3, the distance calculation unit 204 calculates the distance between the face and the human body (hereinafter referred to as the face-body distance) using the face region information supplied from the face detection unit 202 and the information on the plurality of human body regions supplied from the human body detection unit 203. In the present embodiment, the distance calculation unit 204 calculates, as the face-body distance, the distance between the center point of the face region (the coordinates of the center point) and the top of the head of each of the human body regions (the coordinates of the top-of-head point).

FIG. 6 shows the relationship between a face region 602 detected by the face detection unit 202, one human body region 601 among the plurality of human body regions detected by the human body detection unit 203, and the face-body distance 605 calculated by the distance calculation unit 204. As shown in FIG. 6, the face-body distance 605 is obtained as the distance between the center point 603 of the face region 602 and the top of the head 604 of the human body region 601.

Specifically, the distance calculation unit 204 calculates the face-body distance as follows. Of the four corner points of the rectangle representing the face region 602, let the upper-left coordinates be $F_1(X_{f1}, Y_{f1})$ and the lower-right coordinates be $F_2(X_{f2}, Y_{f2})$. Of the four corner points of the rectangle representing the human body region 601, let the upper-left coordinates be $H_1(X_{h1}, Y_{h1})$ and the upper-right coordinates be $H_2(X_{h2}, Y_{h2})$. In this case, the coordinates of the center point of the face region are $((X_{f2}-X_{f1})/2,\ (Y_{f2}-Y_{f1})/2)$, and the coordinates of the top of the head of the human body (the top-of-head point) are $((X_{h2}-X_{h1})/2,\ Y_{h1})$. The value of the face-body distance 605 (face-body distance $L_{fh}$) can then be calculated by the following equation (1).

$$L_{fh} = \sqrt{\left(\frac{X_{f2}-X_{f1}}{2} - \frac{X_{h2}-X_{h1}}{2}\right)^{2} + \left(\frac{Y_{f2}-Y_{f1}}{2} - Y_{h1}\right)^{2}} \qquad (1)$$
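The face-body distance can be sketched in code as the Euclidean distance between the face center and the top-of-head point. The function name `face_body_distance` and the `(x1, y1, x2, y2)` box layout (upper-left and lower-right corners) are assumptions made for illustration. The sketch also takes the center and the top-of-head x-coordinate as midpoints of the corner coordinates, `(x1 + x2) / 2`, which is a choice of this sketch: it coincides with the half-difference expressions written in the text only when the corresponding upper-left coordinate is zero.

```python
import math


def face_body_distance(face_box, body_box):
    """Distance between the face-box center and the top-center of the body box.

    Boxes are (x1, y1, x2, y2) with (x1, y1) the upper-left corner and
    (x2, y2) the lower-right corner (an assumed layout). Midpoints are used
    for the center and head-top positions, which is an assumption of this
    sketch.
    """
    fx1, fy1, fx2, fy2 = face_box
    bx1, by1, bx2, _ = body_box
    face_cx = (fx1 + fx2) / 2  # center of the face region
    face_cy = (fy1 + fy2) / 2
    top_x = (bx1 + bx2) / 2    # horizontal middle of the body box's top edge
    top_y = by1                # top edge of the body box
    return math.hypot(face_cx - top_x, face_cy - top_y)


if __name__ == "__main__":
    # Face centered at (50, 40); body box top edge centered at (50, 10).
    print(face_body_distance((40, 30, 60, 50), (30, 10, 70, 90)))  # 30.0
```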

As described above, the distance calculation unit 204 calculates, for each person, the face-body distance between the face region and each of the plurality of human body regions, and outputs the information on the face region and the human body regions together with the calculated face-body distances to the deletion unit 206. After step S303, the processing of the image processing apparatus proceeds to step S304, which is performed by the deletion unit 206.

In step S304, the deletion unit 206 deletes, for each person, any human body region judged to be nonconforming from among the plurality of human body regions detected in step S302. Specifically, the deletion unit 206 determines whether the face-body distance calculated for each human body region in step S303 satisfies a predetermined condition, and deletes every human body region whose face-body distance does not satisfy it.

FIG. 7 is a detailed flowchart of the nonconforming-region deletion process performed by the deletion unit 206. FIG. 8 shows an example of the positional relationship between a camera 803, which is an imaging device, and persons 801 and 802, who are the subjects captured by the camera 803. In the example of FIG. 8, the camera 803 is fixed to, for example, a ceiling and is installed so as to capture the area diagonally downward and in front of it. The subjects are two persons 801 and 802, who are within the angle of view of the camera 803 at different distances from the camera 803. FIG. 9 shows an example of an image obtained when the camera 803 of FIG. 8 captures the two persons 801 and 802. The process of the flowchart in FIG. 7 is described below with reference to the examples of FIGS. 8 and 9.

In step S701 of FIG. 7, the deletion unit 206 determines, for each person, the predetermined condition to apply to the face-body distance, based on the condition for the face-body distance (distance condition) set by the condition setting unit 205. The nonconforming-region deletion process shown in the flowchart of FIG. 7 detects, from among the plurality of human body regions, any human body region whose positional relationship to the face could not occur in practice, and deletes that region as inappropriate. For this purpose, the deletion unit 206 determines, for each person, a lower limit and an upper limit for the face-body distance as the predetermined condition, based on the distance condition set by the condition setting unit 205.

The deletion unit 206 also determines the predetermined condition for the face-body distance according to the position of the human body in the image. In the present embodiment, the captured image is divided into three regions 901, 902, and 903 as shown in FIG. 9, and the predetermined condition applied to the face-body distance is determined according to which of the regions 901, 902, and 903 contains the person.

In the example of FIG. 8, the camera 803 is installed facing diagonally downward and forward, and the persons 801 and 802 are within its angle of view at different distances, so the angle at which the camera 803 looks down differs between the person 801 and the person 802. The positions of the persons 801 and 802 in the captured image also differ according to this look-down angle. Specifically, the person 801, for whom the look-down angle is large, appears in the lower part of the captured image, as shown in FIG. 9, whereas the person 802, for whom the look-down angle is small, appears in the upper part. The face-body distance between the face center and the top of the head likewise depends on the position in the captured image, as in the examples of FIGS. 8 and 9. Specifically, the face-body distance of the person 801, who appears in the lower part of the captured image, is larger than that of the person 802, who appears in the upper part. In the present embodiment, therefore, the deletion unit 206 determines the predetermined condition for the face-body distance according to the position of the detected human body region in the image. For example, when the image is divided into three regions (divided regions 901 to 903) as shown in FIG. 9, the deletion unit 206 determines the predetermined condition to apply to the face-body distance according to which of the divided regions 901 to 903 contains the human body region.

FIG. 10 shows an example of the predetermined conditions applied to the face-body distance according to the position of the human body region in the image, for the examples of FIGS. 8 and 9. FIG. 10 is a correspondence table that associates each of the divided regions 901, 902, and 903 of FIG. 9 with a condition (condition ID) giving the upper and lower limits for the face-body distance Lfh. The conditions in the table are set so that the upper and lower limits of the face-body distance Lfh are larger for a person who is closer to the camera and seen at a larger look-down angle, that is, a person located lower in the image. The deletion unit 206 refers to the correspondence table of FIG. 10 and determines the predetermined condition for the face-body distance Lfh according to which of the divided regions 901 to 903 contains the face region detected in step S301.

As an example, for the person 802 in FIG. 9, the face region lies in the divided region 901, so the deletion unit 206 selects condition ID (1001), 10 ≤ Lfh < 15, as the predetermined condition on the face-body distance Lfh for the human body regions of the person 802. In condition ID (1001), "10" is the lower limit and "15" is the upper limit for the face-body distance Lfh. Similarly, for the person 801 in FIG. 9, the face region lies in the divided region 903, so the deletion unit 206 selects condition ID (1003), 20 ≤ Lfh < 25, as the predetermined condition. In condition ID (1003), "20" is the lower limit and "25" is the upper limit for the face-body distance Lfh.
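The selection of a condition from the position of the face can be sketched as a table lookup. This is a minimal illustration with several assumptions: the divided regions are taken to be three horizontal bands of equal height (901 at the top, 903 at the bottom), the bounds for region 902 are placeholders because the text does not state them, and the names `DISTANCE_CONDITIONS`, `region_for_face`, and `condition_for_face` are hypothetical.

```python
from typing import Dict, Tuple

# (lower, upper) limits per divided region, read as lower <= Lfh < upper.
# Regions 901 and 903 use the values quoted in the text. The limits for
# region 902 are NOT given in the text and are placeholder assumptions.
DISTANCE_CONDITIONS: Dict[int, Tuple[float, float]] = {
    901: (10.0, 15.0),  # condition ID 1001
    902: (15.0, 20.0),  # condition ID 1002 (hypothetical values)
    903: (20.0, 25.0),  # condition ID 1003
}


def region_for_face(face_center_y: float, image_height: float) -> int:
    """Map the row of the face center to a divided region.

    Assumption: the image is split into three equal horizontal bands.
    """
    if not 0 <= face_center_y < image_height:
        raise ValueError("face center lies outside the image")
    band = int(3 * face_center_y / image_height)  # 0 = top, 2 = bottom
    return (901, 902, 903)[band]


def condition_for_face(face_center_y: float, image_height: float) -> Tuple[float, float]:
    """Return the (lower, upper) limits that apply to this face position."""
    return DISTANCE_CONDITIONS[region_for_face(face_center_y, image_height)]
```

Keeping the limits in a plain mapping keyed by region makes it straightforward to swap in a different table for a different camera or installation, as the description requires.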

The information in a correspondence table such as the one in FIG. 10 can be prepared in advance from statistically gathered data. For example, a correspondence table is generated from images captured at various camera angles (the angle at which the camera looks down at a person) and statistics of the face-body distances of the persons appearing in those images, and the table is stored, for example, in the storage device 102 of FIG. 1. The correspondence table in the example of FIG. 10 is one prepared for the camera 803 of FIG. 8, so a different correspondence table is used for a different camera or a different installation. The condition setting unit 205 of FIG. 2 sets a correspondence table such as that of FIG. 10 according to the installed camera, and the deletion unit 206 refers to that table to determine the condition as described above. After step S701, the deletion unit 206 advances the process to step S702.
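The text says only that the correspondence table is prepared from statistics gathered in advance, not which statistics are used. As one possible illustration, the sketch below derives a lower and an upper limit per divided region as the mean plus or minus k population standard deviations of measured face-body distances. The function name `build_condition_table`, the parameter `k`, and the choice of statistic are all assumptions.

```python
import statistics
from typing import Dict, List, Tuple


def build_condition_table(
    samples: Dict[int, List[float]], k: float = 2.0
) -> Dict[int, Tuple[float, float]]:
    """Derive (lower, upper) limits per region from measured distances.

    samples maps a divided-region ID to face-body distances measured in
    images taken with the camera in question. The mean +/- k * std rule
    is an assumption for illustration, not the method of the source.
    """
    table: Dict[int, Tuple[float, float]] = {}
    for region_id, distances in samples.items():
        mean = statistics.fmean(distances)
        spread = statistics.pstdev(distances)
        lower = max(0.0, mean - k * spread)  # a distance cannot be negative
        upper = mean + k * spread
        table[region_id] = (lower, upper)
    return table
```

A table built this way would be regenerated whenever the camera or its installation changes, since the measured distances depend on the look-down angle.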

In step S702, the deletion unit 206 sets, for each person, one of the plurality of human body regions detected in step S302 as the human body region whose face-body distance is to be judged.
Suppose that, of the persons 801 and 802 illustrated in FIG. 9, the human body regions of the person 802 are set as the judgment target. FIG. 11 shows only the person region of the person 802. It shows the face region 1101 detected in step S301, the plurality of human body regions 1102 to 1104 detected in step S302, and the face-body distances 1105 to 1107 calculated for those human body regions.

In the example of FIG. 11, three human body regions 1102 to 1104 have been detected, so in step S702 the deletion unit 206 sets them one at a time, in order, as the target of the face-body distance judgment. Specifically, the deletion unit 206 takes the human body regions 1102 to 1104 in ascending order of, for example, the Y coordinate of the upper-left corner of the rectangle representing each region. In the example of FIG. 11, this order is human body region 1102, 1103, 1104, so the human body region 1102 is set as the first judgment target in step S702. After step S702, the deletion unit 206 advances the process to step S703.

In step S703, the deletion unit 206 determines whether the face-body distance of the human body region 1102, which was set as the judgment target in step S702, satisfies the condition determined in step S701. Because condition ID (1001) of FIG. 10 was determined in step S701 for the person 802, to whom the human body region 1102 belongs, the deletion unit 206 determines in step S703 whether the face-body distance Lfh of the human body region 1102 satisfies condition ID (1001). Specifically, based on the face-body distance Lfh (1105) of the human body region 1102 and condition ID (1001), the deletion unit 206 determines whether formula (2) below is satisfied.
$$10 \le L_{fh}(1105) < 15 \qquad \text{Formula (2)}$$

If the deletion unit 206 determines that formula (2) is satisfied (Yes), it judges that the human body region 1102 is correctly positioned relative to the face and advances the process to step S705. If it determines that formula (2) is not satisfied (No), it judges that the positional relationship between the human body region 1102 and the face is inappropriate and advances the process to step S704.

In step S704, the deletion unit 206 deletes the human body region judged inappropriate in step S703 (in this case, the human body region 1102). After step S704, the deletion unit 206 advances the process to step S705.

In step S705, the deletion unit 206 determines whether steps S702 to S704 have been completed for all of the human body regions detected in step S302. At this point only the human body region 1102 of FIG. 11 has been processed, so the deletion unit 206 returns the process to step S702.

In this case, in step S702, the deletion unit 206 sets the human body region 1103 of FIG. 11 as the next judgment target and performs steps S703 and S704 on it in the same manner as above. When the process returns from step S705 to step S702 again, the deletion unit 206 sets the human body region 1104 of FIG. 11 as the next judgment target, and steps S703 and S704 are performed on the human body region 1104. In the example of FIG. 11, if the human body region 1102 does not satisfy the condition of formula (2) while the human body regions 1103 and 1104 do, the human body region 1102 is deleted and the human body regions 1103 and 1104 are kept.
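The loop over steps S702 to S705 amounts to filtering the candidate human body regions of one person by a half-open interval test. The sketch below illustrates that reading; the function name `delete_nonconforming`, the tuple layout of a box, and the pairing of each box with a precomputed distance are assumptions made for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), y grows downward


def delete_nonconforming(
    candidates: List[Tuple[Box, float]], lower: float, upper: float
) -> List[Box]:
    """Keep the body boxes whose face-body distance satisfies lower <= Lfh < upper.

    candidates holds (body box, face-body distance Lfh) pairs for one person.
    """
    kept: List[Box] = []
    # S702: visit candidates in ascending order of the upper-left Y coordinate.
    for box, lfh in sorted(candidates, key=lambda c: c[0][1]):
        # S703: test the distance against the limits chosen in S701.
        if lower <= lfh < upper:
            kept.append(box)
        # S704: a box that fails the test is simply not kept (deleted).
    # S705: the loop ends once every candidate has been visited.
    return kept
```

With limits (10, 15), a candidate whose distance is 22 would be dropped while candidates with distances 12 and 14.9 would be kept, returned in order of their top edge.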

The deletion unit 206 likewise determines, for each human body region detected in step S302 for the person 801 of FIG. 9, whether the predetermined condition is satisfied, and deletes as inappropriate every human body region that does not satisfy it. In this case, the judgment in step S703 for each human body region detected for the person 801 uses condition ID (1003) of FIG. 10 described above.

As described above, the deletion unit 206 determines, for each of the plurality of human body regions detected in step S302 for the person 801 and for the person 802, whether the predetermined condition determined for that person in step S701 is satisfied, and deletes as inappropriate every human body region that does not satisfy it.

FIG. 12(b) shows an example of an image 1202 obtained after the deletion unit 206 has deleted the nonconforming human body regions from the input image 1201 of FIG. 12(a) described above (that is, the image before the deletion process of the deletion unit 206). In the example of FIG. 12(a), as described above, a plurality of human body regions 1210 and 1211 have been detected for the persons 801 and 802, respectively. In the present embodiment, the deletion unit 206 deletes the human body regions that are inappropriately positioned relative to the center of the face region, so that the human body regions 1220 and 1221 remain for the persons 801 and 802, respectively, as shown in FIG. 12(b).

The description now returns to FIG. 3. After step S304 of FIG. 3, the processing of the image processing apparatus proceeds to step S305, which is performed by the integration processing unit 207. In step S305, the integration processing unit 207 integrates, for each person, the plurality of human body regions remaining after the deletion process of the deletion unit 206, and determines one human body region per person. Specifically, the integration processing unit 207 uses a technique such as Mean-Shift to perform integration so that one human body result is obtained for each person.
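The text names Mean-Shift only as one example technique and gives no parameters, so the following is a rough sketch of how such an integration might look rather than the integration processing unit 207 itself. It runs a flat-kernel mean shift on the centers of the remaining boxes, picks the mode supported by the most boxes, and averages those boxes. The names `shift_to_mode` and `integrate_boxes`, the `bandwidth` parameter, and the averaging step are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Point = Tuple[float, float]


def shift_to_mode(start: Point, points: Sequence[Point], bandwidth: float,
                  max_iter: int = 100, tol: float = 1e-6) -> Point:
    """Flat-kernel mean shift: move to the mean of nearby points until stable."""
    mode = start
    for _ in range(max_iter):
        near = [p for p in points if math.dist(p, mode) <= bandwidth]
        new_mode = (sum(p[0] for p in near) / len(near),
                    sum(p[1] for p in near) / len(near))
        if math.dist(new_mode, mode) < tol:
            return new_mode
        mode = new_mode
    return mode


def integrate_boxes(boxes: List[Box], bandwidth: float) -> Box:
    """Merge the boxes of one person into a single box (assumed averaging rule)."""
    if not boxes:
        raise ValueError("no boxes to integrate")
    centers = [((x1 + x2) / 2.0, (y1 + y2) / 2.0) for x1, y1, x2, y2 in boxes]
    best: List[Box] = []
    for seed in centers:  # start one search from every box center
        mode = shift_to_mode(seed, centers, bandwidth)
        members = [b for b, c in zip(boxes, centers)
                   if math.dist(c, mode) <= bandwidth]
        if len(members) > len(best):
            best = members
    n = len(best)
    return (sum(b[0] for b in best) / n, sum(b[1] for b in best) / n,
            sum(b[2] for b in best) / n, sum(b[3] for b in best) / n)
```

Seeding a search from every box center keeps the sketch simple for the handful of boxes that remain per person; the bandwidth would need to be tuned to the expected spread of detections.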

FIG. 13 shows an example image in which the integration processing unit 207 has obtained one human body region per person from the example images of FIGS. 12(a) and 12(b) described above. In FIG. 13, one human body region 1310 is obtained for the person 801 and one human body region 1311 is obtained for the person 802. That is, whereas the deletion unit 206 outputs a plurality of human body regions for one person, the integration processing unit 207 outputs one human body region for one person. In this way, the image processing apparatus of the present embodiment outputs information on one human body region per person.

As described above, according to the present embodiment, among the plurality of human body regions detected for each person in the input image, those whose positional relationship to the center point of that person's face region is inappropriate are deleted, which prevents a human body region displaced from the person's face region from being output. In addition, the condition on the positional relationship between the human body region and the center point of the face region is determined appropriately for the differences in how a person appears depending on the camera angle, so the condition can be judged accurately and inappropriate human body regions can be deleted with high accuracy.

The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiment is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

The above-described embodiments are merely specific examples of carrying out the present invention, and the technical scope of the present invention should not be construed as limited by them. That is, the present invention can be implemented in various forms without departing from its technical idea or its main features.

101 CPU, 102 storage device, 103 input device, 104 output device, 201 image input unit, 202 face detection unit, 203 human body detection unit, 204 face-body distance calculation unit, 205 face-body distance condition setting unit, 206 nonconforming region deletion unit, 207 detection result integration processing unit

Claims

An image processing apparatus comprising:
first detection means for detecting a first region relating to a subject from an input image;
second detection means for detecting a plurality of second regions relating to the subject from the input image;
distance calculation means for calculating a distance between the first region detected by the first detection means and each of the plurality of second regions detected by the second detection means;
condition setting means for setting a condition for the distance calculated by the distance calculation means; and
deletion means for deleting, from among the plurality of second regions detected by the second detection means, a second region for which the distance calculated by the distance calculation means does not satisfy the condition for the distance.

The image processing apparatus according to claim 1, further comprising integration means for integrating the plurality of second regions remaining after deletion by the deletion means to obtain one second region.

The image processing apparatus according to claim 1, wherein
the first detection means detects a face region of a person as the first region,
the second detection means detects a human body region of the person as the second region, and
the distance calculation means calculates a distance between the detected face region and each of a plurality of human body regions for one person.

The image processing apparatus according to claim 3, wherein
the condition setting means sets, as the condition for the distance, a lower limit value and an upper limit value for the distance between the face region and the human body region in association with the position of the person in the input image, and
the deletion means deletes a second region that does not satisfy the condition corresponding to the position of the person in the input image.

The image processing apparatus, wherein
the condition setting means sets, as the condition for the distance, a plurality of conditions in association with the position of a person in the input image, and
the deletion means deletes a second region that does not satisfy the condition, from among the plurality of conditions, that corresponds to the position of the face region of the person in the input image.

The image processing apparatus according to claim 4, wherein the condition setting means sets a condition in which the lower limit value and the upper limit value of the distance between the face region and the human body region become larger as the angle at which the camera that acquires the input image looks down at the person who is the subject becomes larger.

The image processing apparatus according to any one of the above, wherein the distance calculation means calculates a distance between the center point of the detected face region and the top of the head of each of the plurality of human body regions for the one person.

The image processing apparatus according to any one of claims 3 to 7, wherein the second detection means generates a plurality of reduced images by reducing the input image at predetermined different magnifications, and sequentially detects the human body region from the plurality of reduced images.

The image processing apparatus according to claim 8, wherein the second detection means sets a predetermined detection window for the reduced image, scans the detection window over the reduced image, obtains the likelihood of a human body region for the image within the detection window, and detects the plurality of human body regions based on the likelihood.

An image processing method for an image processing apparatus, comprising:
a first detection step of detecting a first region relating to a subject from an input image;
a second detection step of detecting a plurality of second regions relating to the subject from the input image;
a distance calculation step of calculating a distance between the first region detected in the first detection step and each of the plurality of second regions detected in the second detection step;
a condition setting step of setting a condition for the distance calculated in the distance calculation step; and
a deletion step of deleting, from among the plurality of second regions detected in the second detection step, a second region for which the distance calculated in the distance calculation step does not satisfy the condition for the distance.

A program for causing a computer to function as each unit of the image processing apparatus according to any one of claims 1 to 9.