JP2010141847A

JP2010141847A - Image processor and method of processing image

Info

Publication number: JP2010141847A
Application number: JP2008318935A
Authority: JP
Inventors: Ryosuke Tsuji; 良介辻
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2008-12-15
Filing date: 2008-12-15
Publication date: 2010-06-24
Anticipated expiration: 2028-12-15
Also published as: JP5147670B2

Abstract

PROBLEM TO BE SOLVED: To improve subject detection accuracy by a simple method while suppressing computational complexity, in an image processing apparatus and an image processing method for detecting subjects in images.

SOLUTION: Among the image regions detected as regions of a predetermined subject, the regions belonging to the same subject are identified. A history of the detected reliability is recorded for the regions identified as belonging to the same subject. Among those regions, a region whose reliability history meets a criterion set according to the reliability level is determined to be a region of the predetermined subject.

COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to an image processing apparatus and an image processing method, and more particularly to an image processing apparatus and an image processing method for detecting a subject included in a moving image.

Image processing techniques for detecting a specific subject (a person, an animal, a specific object, etc.) in an image are very useful. For example, techniques that detect a human face as the subject can be used in many fields, such as video conferencing, man-machine interfaces, security, monitoring systems that track human faces, and image compression.

Digital cameras and digital video cameras already detect human faces in captured images and perform exposure control and focus detection control based on the face detection results.

Various image processing techniques have been proposed for detecting a specific subject in an image, and most of them are based on pattern matching. One example is a method that cuts out partial images at a plurality of different positions in the image, determines whether each partial image is an image of a face region, and thereby detects the face regions in the image. Whether a partial image is a face region can be determined by template matching, or by a classifier that has learned facial features through a learning method such as a neural network.

In either approach, a reliability value indicating how likely the partial image is to be an image of the subject region is generally calculated from the image pattern of the partial image, and partial images whose reliability exceeds a predetermined threshold are detected as images of the subject region.

For example, in Patent Document 1, reliability is calculated from partial images of a plurality of different resolution patterns, and the image of the subject region is detected (subject detection) based on the sum of the reliabilities of the partial images.

Patent Document 2 discloses an imaging apparatus that detects a person's face as the subject, detects the position of the face in the image based on the detection result, and performs automatic focus detection and automatic exposure so that the face is in focus and photographed with optimal exposure.

Patent Document 1: JP 2008-033424 A
Patent Document 2: JP 2005-318554 A

Subject detection by pattern matching is based on the degree of likelihood that the subject is present in the image, so it does not detect only the intended subject. For example, when a human face is detected as the subject, any region whose face-likeness reliability from pattern matching satisfies the predetermined threshold is detected as a face, even if it is not actually a face.

If the reliability threshold for treating a region as a subject region is set strictly in order to reduce such false detections (over-detection), some true subject regions fail to satisfy the threshold and the subject detection rate falls (missed detections). Conversely, if the threshold is set loosely, false detections, in which a region other than the intended subject is detected as the subject, occur frequently. In other words, there is a trade-off between missed detections and false detections, and it is not easy to set optimal detection conditions.

Increasing the amount of pattern matching computation to raise detection reliability is conceivable, but it is not practical in devices such as digital video cameras and digital cameras, which have limited computational resources and also require detection in real time.

The present invention has been made in view of these problems of the prior art. Its object is to improve subject detection accuracy by a simple method while suppressing the amount of computation, in an image processing apparatus and an image processing method for detecting a subject in an image.

The above object is achieved by an image processing apparatus comprising: subject detection means for detecting regions of a predetermined subject in images supplied in time series and for detecting the reliability of each detected region; specifying means for specifying, among the regions detected by the subject detection means in different images, the regions of the same subject; recording means for recording, for the regions specified by the specifying means as regions of the same subject, subject data including a history of the reliability detected by the subject detection means; and determination means for determining, among the regions specified by the specifying means as regions of the same subject, a region whose reliability history recorded by the recording means satisfies a determination criterion set according to the reliability level to be a region of the predetermined subject.

The above object is also achieved by an imaging apparatus comprising: a lens for forming an optical image of a subject; imaging means for sequentially capturing the optical image formed by the lens and outputting images supplied in time series; the image processing apparatus according to the present invention; and control means for controlling imaging conditions using information on the region that the determination means has determined to be a region of the predetermined subject.

The above object is also achieved by an imaging apparatus comprising: a lens for forming an optical image of a subject; imaging means for sequentially capturing the optical image formed by the lens and outputting images supplied in time series; the image processing apparatus according to the present invention, having change detection means that calculates the amount of change between temporally adjacent images and deletes all subject data recorded by the recording means when that amount exceeds a predetermined amount of change; and control means for controlling imaging conditions using information on the region that the determination means has determined to be a region of the predetermined subject; wherein the change detection means of the image processing apparatus acquires information on the motion of the imaging apparatus or on a change in the angle of view of the lens and, when at least one of these pieces of information exceeds a predetermined threshold, deletes all subject data recorded by the recording means on the assumption that the image output by the imaging means changes greatly.

The above object is also achieved by an image processing method comprising: a subject detection step of detecting regions of a predetermined subject in images supplied in time series and detecting the reliability of each detected region; a specifying step of specifying, among the regions detected in the subject detection step in different images, the regions of the same subject; a recording step of recording in recording means, for the regions specified in the specifying step as regions of the same subject, subject data including a history of the reliability detected in the subject detection step; and a determination step of determining, among the regions specified in the specifying step as regions of the same subject, a region whose reliability history recorded in the recording means satisfies a determination criterion set according to the reliability level to be a region of the predetermined subject.

With this configuration, the present invention makes it possible to improve subject detection accuracy by a simple method while suppressing the amount of computation, in an image processing apparatus and an image processing method for detecting a subject in an image.

Hereinafter, preferred and exemplary embodiments of the present invention will be described in detail with reference to the drawings.
FIG. 1 is a block diagram illustrating a configuration example of an imaging apparatus as an example of an image processing apparatus according to an embodiment of the present invention.

The lens 101 forms an optical image of the subject on the imaging surface of the image sensor 102, which is, for example, a CCD image sensor or a CMOS image sensor. The image sensor 102 outputs, for each pixel, an electrical signal corresponding to the intensity of the incident light. This electrical signal is the video signal. The video signal output from the image sensor 102 undergoes analog signal processing, such as correlated double sampling (CDS), in the analog signal processing unit 103.

The video signal output from the analog signal processing unit 103 is converted into digital data by the A/D conversion unit 104 and input to the imaging control unit 105 and the image processing unit 106. The image processing unit 106 performs image processing such as gamma correction and white balance processing. In addition to this ordinary image processing, the image processing unit 106 also performs image processing that uses information, supplied from the face detection unit 109, on the face regions detected in the image, as described later.

The video signal output from the image processing unit 106 is sent to the display unit 107. The display unit 107 is, for example, an LCD or an organic EL display, and displays the video signal. By sequentially displaying images captured continuously in time series, the display unit 107 can function as an electronic viewfinder (EVF). The video signal is also recorded on a recording medium 108, for example a removable memory card. The recording destination may instead be the built-in memory of the camera or an external device connected so that communication is possible.

The video signal output from the image processing unit 106 is also supplied to the face detection unit 109. The face detection unit 109 detects the faces of persons in the image and specifies the number of subjects and their face regions. A known face detection method is used. Examples include methods that use knowledge about faces (skin color information and parts such as the eyes, nose, and mouth) and methods that build a classifier for face detection with a learning algorithm, typified by neural networks. Face recognition is generally performed by combining a plurality of methods to improve the recognition rate. A specific example is the face detection method described in JP-A-2002-251380, which uses the wavelet transform and image feature amounts.

The face region information that the face detection unit 109 outputs as its detection result includes, for each detected person, the position, size, inclination, and reliability of the face region. Here, the reliability is a value that represents the likelihood that the face detection result is correct, and it is determined in the course of the face detection processing.

There are various methods for calculating the reliability. In one method, the features of face images stored in advance are compared with the features of the image of the face region detected by the face detection unit 109 to obtain the probability that the detected region is an image of the subject, and the reliability is calculated from this probability. In another method, the difference between the features of the stored face images and the features of the image of the detected face region is calculated, and the reliability is calculated from the magnitude of that difference. Whatever the calculation method, a high reliability indicates a low possibility of false detection, and a low reliability indicates a high possibility of false detection.

The face region information produced by the face detection unit 109 is sent to the subject specifying unit 110. The subject specifying unit 110 specifies the same subject across temporally consecutive face detection results and sends the subject information to the face determination unit 111. The face determination unit 111 determines highly reliable face regions to be faces, based on the time-series face detection results supplied from the face detection unit 109. Information on the detection results determined to be faces (the position, size, etc. of each face region in the captured image) is supplied to the image processing unit 106 and the imaging control unit 105, where it can be used to control imaging conditions, for example in automatic focus detection control and automatic exposure control that use face region information.

The imaging control unit 105 controls a focus control mechanism and an exposure control mechanism (not shown) of the imaging lens based on the video signal output from the A/D conversion unit 104. In controlling these mechanisms, the imaging control unit 105 can use the detection result information supplied from the face determination unit 111. The imaging apparatus of this embodiment can therefore perform imaging processing that takes into account information on the face regions in the captured image. Specifically, exposure control, focus detection control, flash control, and the like can be performed with reference to the face region. The imaging control unit 105 also controls the output timing, output pixels, and the like of the image sensor 102.

(Subject identification process)
The face detection unit 109 supplies face region information (position and size in the captured image, reliability, etc.) to the subject specifying unit 110 as its face detection result. The face detection unit 109 performs face detection without holding or using past face detection results. The subject specifying unit 110 therefore holds the face detection results of the face detection unit 109 in time series and specifies the face regions of the same subject based on these time-series results. This makes subject tracking possible.

The process of specifying the same subject using the position and size of the face regions will be described with reference to FIG. 2.
FIG. 2 shows two temporally consecutive frames from the sequentially captured images: (b) is the current captured image, and (a) is the image one frame before (b). Face detection is assumed here to be performed on every frame, but it may instead be performed every several frames. The face regions 201, 202, and 203 detected in each frame by the face detection unit 109 are marked by an indication of the face region (a face frame).

For a face region detected in the previous frame shown in FIG. 2(a), the position is expressed in image coordinates as $(x_{t-1}(i), y_{t-1}(i))$ and the size as $s_{t-1}(i)$. For a face region detected in the current frame, the position is expressed as $(x_t(j), y_t(j))$ and the size as $s_t(j)$. Here, $i$ and $j$ are integers from 1 to $n$, assigned to each face region detected within the same frame, and $n$ is the total number of detected face regions.

In temporally consecutive face detection results, the subject specifying unit 110 determines face regions whose positions and sizes make the value of Expression 1 zero or greater to be the same subject.

That is, for face regions of the same size in consecutive frames, if the face region in the current frame falls within the range formed by placing regions of the same size as the face region adjacent to and around the face region of the previous frame, the two are determined to be face regions of the same subject.

For face regions whose sizes differ between consecutive frames, the two are determined to be the same person if the coordinate difference between them falls within the adjacent range set with the smaller face as the reference. If, taking a result in the current frame as the reference, a plurality of detection results in the previous frame give a value of Expression 1 of zero or greater, the one with the smaller value of the expression is determined to be the same person. If, taking a result in the current frame as the reference, no detection result in the previous frame gives a value of Expression 1 of zero or greater, the subject of that result is regarded as a new subject. In FIG. 2, therefore, the face region 201 in frame t-1 and the face region 202 in frame t are determined to be the same subject, and the face region 203 in frame t is determined to be a newly appearing subject.
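The matching rule described above can be illustrated with a minimal Python sketch. Expression 1 itself is not reproduced in this text, so the sketch substitutes an assumed proximity test rather than the patent's formula: positions are treated as region centres, sizes as side lengths, and two regions are candidates for the same subject when their centres differ by no more than the smaller size on each axis. When several previous-frame regions qualify, the nearest one is chosen, which is also an assumption. The names `Face`, `offset`, and `match_previous` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Face:
    x: float  # assumed: centre x of the face region, in pixels
    y: float  # assumed: centre y of the face region, in pixels
    s: float  # assumed: side length of the face region, in pixels


def offset(prev: Face, cur: Face) -> float:
    """Largest per-axis distance between the two region centres."""
    return max(abs(cur.x - prev.x), abs(cur.y - prev.y))


def match_previous(cur: Face, previous: Sequence[Face]) -> Optional[int]:
    """Return the index of the previous-frame face judged to be the same
    subject as `cur`, or None when `cur` should be treated as a new subject.

    Assumed rule (a stand-in for Expression 1): a previous face qualifies
    when the centre offset is within the smaller of the two sizes; among
    qualifying faces the nearest one wins.
    """
    best = None  # (offset, index) of the best candidate so far
    for i, prev in enumerate(previous):
        d = offset(prev, cur)
        if d <= min(prev.s, cur.s) and (best is None or d < best[0]):
            best = (d, i)
    return None if best is None else best[1]


previous_frame = [Face(100, 100, 40), Face(120, 100, 40)]
print(match_previous(Face(118, 100, 40), previous_frame))  # nearest qualifying face
print(match_previous(Face(300, 120, 50), previous_frame))  # no counterpart: new subject
```

A region with no qualifying counterpart returns `None`, which corresponds to the newly appearing face region 203 in FIG. 2.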

The subject specifying method described here is only an example, and other methods may be used to specify the same subject. For example, if information on the inclination and orientation of the face can be obtained from the face detection unit 109, it may be used as a condition for subject specification in the subject specifying unit 110. Information on the imaging apparatus, such as the zoom ratio (focal length) of the lens 101, the amount of movement of the video camera, and whether camera shake correction is ON or OFF, can also be acquired and used as conditions for subject specification.

The subject specifying unit 110 holds a subject list 112 such as the one shown in FIG. 6 and updates the subject list 112 each time it receives a detection result from the face detection unit 109. The subject list 112 is described in detail later; for each face region it stores position and size information, the history of reliability levels, and so on.

(Face determination process)
From the face detection results output by the face detection unit 109, the face determination unit 111 determines which information can be used effectively by the image processing unit 106 and the imaging control unit 105, and supplies it to them. That is, the face determination unit 111 extracts the highly reliable detection results from the face detection results output by the face detection unit 109 and supplies them to the image processing unit 106 and the imaging control unit 105.

Specifically, the face determination unit 111 holds in advance, as a determination criterion, a number of consecutive detections or a duration of continuous detection that depends on the reliability level (how high the reliability is). Among the face regions that the subject specifying unit 110 has determined to be the same subject, a face region that satisfies the determination criterion is finally determined to be a face region that is highly reliable as a face.

In other words, among the face regions determined to be the same subject, the face determination unit 111 determines a face region to be highly reliable when it has been detected continuously for the number of times or the length of time that corresponds to the reliability level calculated by the face detection unit 109. Here, the reliability level may be the reliability supplied by the face detection unit 109 as is, or a value obtained by applying predetermined processing such as normalization to it. In practice, the face determination unit 111 refers to the subject list 112 (FIG. 6) held and managed by the subject specifying unit 110 and determines a face region whose reliability level history satisfies the determination criterion to be a face. The face determination unit 111 then reads the position and size of each face region determined to be a face from the subject list 112 and supplies this information to the image processing unit 106 and the imaging control unit 105.

FIG. 3 shows an example of face regions detected by the face detection unit 109 and their corresponding reliability levels.
In this embodiment, there are five reliability levels, from 1 to 5, with 5 being the most reliable. In the example shown in FIG. 3, the face regions that correctly detect a face (401, 403, 404, 406, 408, 410) mostly have high reliability levels, and the face regions that falsely detect something other than a face (402, 405, 407, 409) mostly have low reliability levels. However, some correctly detected face regions have a low reliability level (face regions 403 and 408), and some falsely detected face regions have a high reliability level (face regions 402 and 407).
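As an illustration of how a raw reliability value might be turned into one of the five levels, the following Python sketch quantizes a score into equal-width bins. The text only says that the level may be the detector's reliability itself or a value obtained by processing such as normalization, so the score range, the equal-width binning, and the function name `to_level` are all assumptions made for this example.

```python
def to_level(score: float, lo: float = 0.0, hi: float = 1.0, levels: int = 5) -> int:
    """Map a raw reliability score to an integer level from 1 to `levels`.

    Assumptions: the detector's score lies in [lo, hi] (values outside the
    range are clamped) and the levels are equal-width bins over that range.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clamped = min(max(score, lo), hi)
    fraction = (clamped - lo) / (hi - lo)
    # The top of the range would fall into bin levels + 1, so cap it.
    return min(levels, int(fraction * levels) + 1)


for raw in (0.05, 0.5, 0.95):
    print(raw, "->", to_level(raw))
```

A real detector may need a non-linear mapping; the point is only that the later determination step works on a small set of discrete levels.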

It is therefore difficult to distinguish correct detections from false detections accurately using the reliability level obtained from a single frame alone.
In this embodiment, the reliability level of the face regions of the same subject is therefore tracked over time to determine whether each subject's face region is truly a face region. Specifically, as described above, a determination criterion that depends on the reliability level is set for the face regions determined to belong to the same subject, and a face region that satisfies the criterion is determined to be a true face region. The determination criterion may be, for example, the number of times or the length of time for which a certain reliability level has been detected continuously. In this embodiment, the subject specifying unit 110 tracks and records the reliability level of each face region, and the face determination unit 111 makes the determination based on the criterion.

FIG. 4 schematically shows an example in which the determination criterion is applied to FIG. 3.
In FIG. 4, a face region whose face frame is drawn with a solid line has been determined to be a face by the face determination unit 111, and a face region whose face frame is drawn with a dotted line has not.

To simplify understanding and explanation, FIG. 4 assumes the following determination criterion: a region is determined to be a face once a reliability level of 4 or higher has continued for two frames, and is otherwise not determined to be a face.
At frame t-2 (FIG. 4(a)), no face region yet satisfies the criterion, so none of the face regions 401 to 403 is determined to be a face, and their face frames are drawn with dotted lines.

In the next frame, t-1 (FIG. 4(b)), the face region 404 is determined to be a face because it has held reliability level 5 continuously since the face region 401 of the previous frame, which was determined to be the same subject. The falsely detected face region 405 is not determined to be a face: its counterpart in the previous frame, face region 402, had reliability level 4, but the level dropped to 1 in the current frame. The correctly detected face region 406 is not yet determined to be a face at this point, because its counterpart in the previous frame, face region 403, had reliability level 2.

In frame t (FIG. 4(c)), the reliability level of the face region 408 has dropped to 2, but because the region has already been determined to be a face, it is still treated as a face at this point. The face region 410 is determined to be a face at this point because it has held reliability level 4 continuously since the face region 406 of the previous frame, which was determined to be the same subject. The falsely detected face region 409 has reliability level 3 and is still not determined to be a face. The face region 407, falsely detected in frame t-1, has no face region in frame t that is determined to be the same subject, and so it is not determined to be a face.
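The FIG. 4 walkthrough can be replayed with a short Python sketch. It uses only the simplified criterion stated for FIG. 4 (reliability level 4 or higher for two consecutive frames) and keeps the verdict once it has been reached, as the text describes for face region 408. The class `Track`, its fields, and the helper `replay` are hypothetical names for illustration, and the embodiment's actual criterion may vary with the reliability level.

```python
from dataclasses import dataclass, field
from typing import List

LEVEL_THRESHOLD = 4   # simplified criterion used for FIG. 4
REQUIRED_FRAMES = 2   # consecutive frames needed at or above the threshold


@dataclass
class Track:
    """Hypothetical per-subject record: level history plus a latched verdict."""
    levels: List[int] = field(default_factory=list)
    run: int = 0           # consecutive frames at or above LEVEL_THRESHOLD
    is_face: bool = False  # stays True once the criterion has been met

    def observe(self, level: int) -> bool:
        """Record this frame's reliability level and return the current verdict."""
        self.levels.append(level)
        self.run = self.run + 1 if level >= LEVEL_THRESHOLD else 0
        if self.run >= REQUIRED_FRAMES:
            self.is_face = True
        return self.is_face


def replay(levels: List[int]) -> List[bool]:
    """Verdict after each frame for one tracked subject."""
    track = Track()
    return [track.observe(level) for level in levels]


# Level sequences taken from the walkthrough (frames t-2, t-1, t).
print(replay([5, 5, 2]))  # regions 401 -> 404 -> 408
print(replay([4, 1, 3]))  # regions 402 -> 405 -> 409
print(replay([2, 4, 4]))  # regions 403 -> 406 -> 410
```

The three printed lists are `[False, True, True]`, `[False, False, False]`, and `[False, False, True]`, matching the solid and dotted face frames described for FIG. 4. A region seen in only one frame, such as 407, can never reach a run of two and so is never determined to be a face.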

The face determination process in the imaging apparatus of the present embodiment is described further using the flowchart shown in FIG. 5.
First, the face detection unit 109 detects human faces in the image and obtains, for each detected face area, its position (coordinates) in the image, its size, its reliability, and the like (S601). In the present embodiment, the face detection unit 109 performs face detection on each of the images obtained continuously in time series.

Next, for one of the detected face areas, the subject specifying unit 110 compares the detection result of the current frame with that of the previous frame using the subject specifying method described above, and identifies the face areas of the same subject (S602).

Based on this determination, the subject specifying unit 110 updates or registers entries in the subject list 112 shown in FIG. 6. Specifically, for each face area detected in the current frame that is determined to belong to the same subject as a face area detected in the previous frame, the subject specifying unit 110 updates the data already registered in the subject list 112 (S603). The subject specifying unit 110 also deletes from the subject list 112 the subject data of any subject that was detected in the previous frame but not in the current frame.

Of the subject data, the subject specifying unit 110 overwrites the position and size of the face area with the detection result of the current frame. The continuous detection counts of the reliability levels are updated by a method that depends on the reliability level in the current frame and on the determination criterion.

For example, if the determination criterion is the continuous detection count of one specific reliability level, the count for that level is incremented by 1 only when the reliability level in the current frame equals the reliability level in the previous frame. If the reliability level in the current frame differs from that in the previous frame, the count for the current frame's reliability level is set to 1 and the counts for all other reliability levels are set to 0.

On the other hand, if the determination criterion is the continuous detection count at or above a given reliability level, the counts for the current frame's reliability level and for all lower levels are incremented, and the counts for all levels higher than the current frame's reliability level are set to 0.

For example, if the reliability level in the current frame is 5, the counts for all reliability levels are incremented. If the reliability level in the current frame is 3, the counts for reliability levels 1 to 3 are incremented and the counts for reliability levels 4 and 5 are set to 0. This update method records, for each reliability level, the number of consecutive frames in which that level or a higher level has been detected.
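The two update rules described above can be sketched as follows. This is a minimal illustration that assumes five reliability levels (1 to 5) and stores the counts in a dictionary keyed by level. The names `update_exact` and `update_at_or_above` are hypothetical.

```python
LEVELS = (1, 2, 3, 4, 5)  # assumed reliability levels, as in the examples above


def update_exact(counts, prev_level, level):
    """Criterion: consecutive detections at one specific reliability level."""
    if level == prev_level:
        counts[level] += 1
    else:
        # The level changed: restart the run at the new level, clear the rest.
        for lv in LEVELS:
            counts[lv] = 1 if lv == level else 0
    return counts


def update_at_or_above(counts, level):
    """Criterion: consecutive detections at or above each reliability level."""
    for lv in LEVELS:
        # A detection at `level` also counts as "lv or higher" for every lv <= level.
        counts[lv] = counts[lv] + 1 if lv <= level else 0
    return counts


counts = dict.fromkeys(LEVELS, 0)
update_at_or_above(counts, 5)  # every level now has a run of 1
update_at_or_above(counts, 3)  # levels 1 to 3 reach 2, levels 4 and 5 reset to 0
print(counts)
```

With the second rule, `counts[L]` always equals the length of the current run of frames whose reliability level was L or higher, which is the quantity compared against the reference value in the determination step.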

Meanwhile, for each face area detected in the current frame that belongs to a subject different from those of the face areas detected in the previous frame, the subject specifying unit 110 assigns a new subject ID and registers the area in the subject list 112 as new subject data (S604).

The subject specifying unit 110 updates and registers the data of the subject list 112 in FIG. 6 based on the position, size, and reliability level of each face area supplied from the face detection unit 109 and on the subject specifying result.

In the subject list 112, each subject ID, which identifies a subject (face area), is associated with the position and size of the face area, the continuous detection count for each reliability level, a face determination flag indicating whether the area has been determined to be a face, and an update flag indicating that the entry was registered or updated in the current frame.
The update flag is cleared to 0 every frame and is set to 1 by the update in S603 or the registration in S604.

When the initial frame is processed, the subject list 112 contains no data, so the subject specifying unit 110 registers the information of every detected face area in the subject list 112 as new subject data. From the second frame onward, the subject specifying unit 110 performs the subject determination process using the subject data in the subject list 112 and the face detection result of the current frame.

The face determination unit 111 determines which of the subjects' face areas are likely to be faces, based on the reliability level history of each subject's face area recorded in the subject list 112, specifically the continuous detection count or continuous detection time for each reliability level.

In the present embodiment, once the face determination unit 111 has determined a subject to be a face, the subject continues to be determined to be a face as long as a face area identified as the same subject is detected, regardless of the reliability level in subsequent detection results.

Therefore, the face determination unit 111 first determines whether the face area detected in the current frame has already been determined to be a face (S605). If the face determination flag in the subject list 112 shown in FIG. 6 is 1, the area has already been determined to be a face; if the flag is 0, it has not. The face determination flag is set to 0 when subject data is registered.

Then, for a face area whose face determination flag is 0, the face determination unit 111 refers to the reliability level history and determines whether a predetermined determination criterion is satisfied (S606). Here, a reference value f(L) for the continuous detection count is defined for each reliability level L as the determination criterion, and the unit determines whether this criterion is satisfied.

FIG. 7 shows an example of the reference value f(L), set according to the reliability level, for the continuous detection count required to determine a face. In FIG. 7, the horizontal axis is the reliability level L and the vertical axis is the reference value f(L); the higher the reliability level, the lower the reference value f(L). When continuous detection time is used instead of the continuous detection count, f(L) is set so that the higher the reliability level, the shorter the required time.

The relationship between the reliability level L and the reference value f(L) for the continuous detection count may be linear or non-linear. The reference value f(L) may be a count at each individual reliability level or a count at or above each reliability level.

Among the face areas in the subject list 112 shown in FIG. 6, the face determination unit 111 determines those whose continuous detection count for a reliability level satisfies the reference value shown in FIG. 7 to be faces, and sets their face determination flags in the subject list 112 to 1 (S607). The face determination unit 111 then reads the information of the face areas determined to be faces from the subject list 112 and supplies it to the image processing unit 106 and the imaging control unit 105.

On the other hand, the face determination unit 111 does not determine a face area whose continuous detection count does not satisfy the reference value to be a face, and leaves its face determination flag at 0. The face determination unit 111 does not supply information on face areas not determined to be faces to the image processing unit 106 or the imaging control unit 105.
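Steps S605 to S607 can be sketched as below. The values in `F_REQUIRED` are hypothetical stand-ins for the curve f(L) of FIG. 7, chosen only so that higher reliability levels need fewer consecutive detections. The record layout (`face_flag`, `counts`) and the rule that reaching the reference value at any one level is sufficient are likewise assumptions made for illustration.

```python
# Hypothetical reference values f(L): higher reliability needs fewer frames.
F_REQUIRED = {1: 10, 2: 8, 3: 6, 4: 4, 5: 2}


def judge_face(subject):
    """Return True if the subject is (or has just become) a determined face.

    `subject` is a dict with a boolean "face_flag" and a "counts" dict that
    maps each reliability level to its continuous detection count.
    """
    if subject["face_flag"]:  # S605: already determined, stays a face
        return True
    meets = any(subject["counts"][lv] >= need for lv, need in F_REQUIRED.items())
    if meets:                 # S606: criterion satisfied for some level
        subject["face_flag"] = True  # S607: set the face determination flag
    return subject["face_flag"]


new_subject = {"face_flag": False, "counts": {1: 2, 2: 2, 3: 2, 4: 2, 5: 2}}
print(judge_face(new_subject))  # two frames at level 5 meet f(5) = 2
```

Only subjects for which this check returns True would be passed on to downstream exposure or focus control.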

Because the processing from S602 to S607 described above is performed for each face area detected by the face detection in S601, the face determination unit 111 determines whether all face areas detected in the current frame have been processed (S608).

If unprocessed face areas remain, the face determination unit 111 selects one of them as the processing target and returns to S602. If all face areas detected in the current frame have been processed, the face determination unit 111 proceeds to S609.

If the subject list 112 contains subject data that has not been updated, that data belongs to a subject detected in the previous frame but not in the current frame, so the face determination unit 111 deletes it from the subject list 112 (S609).

As described above, according to the present embodiment, whether a face area detected by the face detection process is a face is determined based not only on the reliability in the frame in which it was detected but also on the history of the reliability, which improves the accuracy of face detection by a simple method.

(Modification 1)
In the present embodiment, the data of a subject that was detected in the previous frame but not in the current frame has been described as being deleted from the subject list 112 in S609.

However, the subject may actually be present and simply have been missed by the face detection unit 109. To prevent such missed detections from lowering the detection accuracy, the apparatus may be configured so that, for a subject once determined to be a face, the subject data is retained until it is confirmed that the subject has gone undetected for a predetermined number of consecutive frames.

The face determination process in the imaging apparatus with this configuration is described further using the flowchart shown in FIG. 8.
S901 to S908 in FIG. 8 are the same as S601 to S608 in FIG. 5 described above. Through the processing from S901 to S908, the subject specifying process and the face determination process are performed on the face areas detected by the face detection unit 109 in the current frame.

FIG. 9 shows an example of the subject list 112' used in this modification.
In addition to the information in the subject list 112 of FIG. 6, the subject list 112' of FIG. 9 holds the number of frames for which undetected subject data is to be retained (the "number of frames to hold") and the number of frames for which the data has been retained so far while undetected (the "number of frames held"). The number of frames held is cleared to 0 when the subject specifying unit 110 determines that the same subject has been detected by the face detection unit 109.

When the face determination process has finished for all face areas detected in the current frame, the face determination unit 111 determines whether there is any unupdated subject data whose face determination flag is 1 (already determined to be a face) (S909).

If unupdated subject data with a face determination flag of 1 exists, the face determination unit 111 refers to the corresponding "number of frames held" and determines whether it is less than or equal to the "number of frames to hold" (S910).

If the "number of frames held" is less than or equal to the "number of frames to hold", the face determination unit 111 does not delete the unupdated subject data but updates it by incrementing the corresponding "number of frames held" (S911).

On the other hand, the face determination unit 111 does not update unupdated subject data whose "number of frames held" exceeds the "number of frames to hold". The face determination unit 111 repeats S909 to S911 until it is confirmed in S909 that every piece of unupdated subject data already determined to be a face has been checked against the "number of frames to hold".

The face determination unit 111 then deletes the unupdated subject data from the subject list 112' (S912). The subject data deleted in S912 therefore consists of subject data already determined to be a face whose face area has gone undetected for the predetermined number of consecutive frames, and subject data not determined to be a face whose face area was not detected in the current frame.
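The retention logic of S909 to S912 can be sketched as follows. The field names (`face_flag`, `updated`, `held_frames`, `hold_limit`) are hypothetical stand-ins for the columns of the subject list in FIG. 9, and treating the increment in S911 as setting the update flag is an assumed reading of "updates".

```python
def prune(subjects):
    """Apply S909 to S912 to a dict of subject_id -> record.

    Call this after every face area detected in the current frame has been
    processed, so that "updated" is True only for subjects seen this frame.
    """
    for rec in subjects.values():
        if rec["face_flag"] and not rec["updated"]:      # S909
            if rec["held_frames"] <= rec["hold_limit"]:  # S910
                rec["held_frames"] += 1                  # S911
                rec["updated"] = True  # assumption: holding counts as an update
    # S912: delete everything that is still not updated.
    return {sid: rec for sid, rec in subjects.items() if rec["updated"]}


subjects = {
    "A": {"face_flag": True, "updated": True, "held_frames": 0, "hold_limit": 2},
    "B": {"face_flag": True, "updated": False, "held_frames": 0, "hold_limit": 2},
    "C": {"face_flag": True, "updated": False, "held_frames": 3, "hold_limit": 2},
    "D": {"face_flag": False, "updated": False, "held_frames": 0, "hold_limit": 2},
}
print(sorted(prune(subjects)))  # A was detected, B is held; C and D are deleted
```

A per-frame caller would also reset `held_frames` to 0 whenever the subject is detected again and clear `updated` at the start of each frame, as described for the subject list.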

For a face area not detected in the current frame, the position, size, and other values in the subject data retained in this modification may be the values from immediately before the face area ceased to be detected, or values estimated from the several frames before it ceased to be detected.

The "number of frames to hold" can be determined according to the reliability level that was being detected continuously immediately before the face area ceased to be detected and according to its continuous detection count. For example, if a high reliability level had been detected continuously many times, the "number of frames to hold" can be set large. If a high reliability level had been detected continuously only a few times, or if a low reliability level had been detected, the "number of frames to hold" can be set small.

The number of frames to hold can also be determined based on other conditions, in addition to or instead of the reliability level and the continuous detection count. Such conditions may be, for example, the position or size of the face area in the subject list. For example, if the face was at the edge of the image, it is likely that the face ceased to be detected because it moved out of the angle of view. Accordingly, the number of frames to hold can be made larger when the face position is near the center of the image and smaller when it is near the edge of the image.

(Modification 2)
In the present embodiment, the face determination unit 111 has been described as using, as the determination criterion for face determination, a continuous detection count or continuous detection time set according to the reliability level. However, the continuous detection count or continuous detection time may also be varied according to the number of detected face areas (subjects), the amount of movement of a face area, and the like.

The face determination process in the imaging apparatus when the continuous detection count or continuous detection time is varied according to the number of detected face areas (subjects) is described further using the flowchart shown in FIG. 10.

First, the face detection unit 109 detects human faces in the image (S1101). The face detection unit 109 then counts the number of detected persons (S1102). The count may be the cumulative number of persons detected in the frames within a certain period rather than the number detected in each frame. For example, if several people pass by at once, the number of detected persons rises temporarily and the determination criterion fluctuates widely. As a result, a person's face that had been detected continuously until then may cease to be detected. The determination criterion may also fluctuate every time the face detection unit 109 misses a detection. Using the cumulative number of detected persons over a certain period suppresses such fluctuation of the determination criterion.

Next, the subject specifying unit 110 compares the detection result of the previous frame against that of the current frame and determines the face areas of the same subject (S1103). If a face area of the same subject as in the previous frame is detected in the current frame, the subject specifying unit 110 updates the corresponding subject data in the subject list (S1104).

On the other hand, if a face area of a subject not detected in the previous frame is detected in the current frame, the subject specifying unit 110 registers the information on that face area in the subject list as new subject data (S1105).

FIG. 11 shows the subject list 112" used in this modification. In addition to the information in the subject list 112 of FIG. 6, the subject list 112" in this modification holds the number of detected persons. The number of detected persons is supplied from the face detection unit 109 to the subject specifying unit 110, which registers it in the subject list 112" when registering subject data.

The face determination unit 111 determines whether the face area detected in the current frame is the face area of a subject already determined to be a face (S1106). For a face area not yet determined to be a face, the face determination unit 111 then determines whether the continuous detection count satisfies a reference value f(L, n) that depends on the reliability level L and the number of detected persons n (S1107).

The face determination unit 111 determines a face area whose continuous detection count satisfies the reference value to be a face, and updates the subject list (S1108).
In this modification, if the number of detected persons counted in S1102 is small, the detection frequency should be raised, so the reference value for the continuous detection count is reduced to relax the face determination criterion. Conversely, if the number of detected persons is large, suppressing false detections is preferable to raising the detection frequency, so the reference value for the continuous detection count is increased to tighten the determination criterion.

FIG. 12 shows an example of setting the reference value for the continuous detection count according to the reliability level and the number of detected persons.
In FIG. 12, the horizontal axis is the reliability level, the depth axis is the level of the number of detected persons, and the vertical axis is the corresponding reference value for the continuous detection count. In the example of FIG. 12, the number of detected persons is classified by a predetermined threshold into two classes, many and few, but a finer classification may be used. The relationship between the reliability level and the number of detected persons on the one hand and the continuous detection count on the other may be linear or non-linear.
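A two-class reference value f(L, n) of the kind shown in FIG. 12 can be sketched as a pair of lookup tables. The threshold and all table values below are hypothetical; they only preserve the two properties stated in the text: fewer frames are required at higher reliability levels, and more frames are required when many persons are detected.

```python
MANY_THRESHOLD = 3  # hypothetical: 3 or more detected persons counts as "many"

# Hypothetical reference values for the continuous detection count.
F_FEW = {1: 8, 2: 6, 3: 4, 4: 3, 5: 2}     # relaxed criterion
F_MANY = {1: 12, 2: 10, 3: 8, 4: 6, 5: 4}  # stricter criterion


def required_count(level, detected_persons):
    """Return f(L, n): frames required at reliability level L with n persons."""
    table = F_MANY if detected_persons >= MANY_THRESHOLD else F_FEW
    return table[level]


print(required_count(5, 1))  # few persons, high reliability: 2
print(required_count(5, 4))  # many persons, same reliability: 4
```

A finer classification would replace the two tables with one table per class or with a function of `detected_persons`.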

The face determination unit 111 determines whether the processing from S1103 to S1108 described above has been executed for each of the face areas detected by the face detection in S1101 (S1109).
If an unprocessed face area remains, the processing from S1103 is repeated with one such area as the target. When all face areas have been processed, the face determination unit 111 deletes, in S1110, the unupdated subject data from the subject list.

Instead of the number of persons detected by the face detection unit 109, the number of face areas determined to be faces by the face determination unit 111 may be reflected in the determination criterion. It goes without saying that continuous detection time may be used instead of the continuous detection count.

Next, the face determination process in the imaging apparatus when the continuous detection count or continuous detection time used as the determination criterion is varied according to the amount of movement of the face area (subject) is described further using the flowchart shown in FIG. 13.

First, the face detection unit 109 detects human faces in the image (S1401). Next, the subject specifying unit 110 compares the detection result of the previous frame against that of the current frame and determines the face areas of the same subject (S1402).

When a face area determined to belong to the same subject as in the previous frame is detected in the current frame, the subject specifying unit 110 calculates the amount of movement of the subject from the difference between the detected positions of the face area (S1403). The amount of movement may be a cumulative value obtained by summing the movement between adjacent frames over the frames within a certain period, rather than the movement between two adjacent frames alone. With a cumulative value, even if an incorrect same-subject determination temporarily increases the amount of movement of a subject, large fluctuation of the determination criterion can be suppressed. The subject specifying unit 110 then updates the corresponding subject data in the subject list, including the amount of movement (S1404).
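The cumulative amount of movement over a fixed period can be sketched with a bounded queue of per-frame displacements. The class name `MovementTracker`, the window length, and the use of Euclidean distance between face-area positions are assumptions; the text specifies only that the movement is derived from the difference between detected positions.

```python
import math
from collections import deque


class MovementTracker:
    """Sum of frame-to-frame displacements over the most recent frames."""

    def __init__(self, window=5):
        # Keep only the last `window` displacements; older ones drop out.
        self.steps = deque(maxlen=window)
        self.last_pos = None

    def update(self, pos):
        """Record the face-area position for this frame and return the sum."""
        if self.last_pos is not None:
            self.steps.append(math.dist(self.last_pos, pos))
        self.last_pos = pos
        return sum(self.steps)


tracker = MovementTracker(window=2)
for pos in [(0, 0), (3, 4), (6, 8), (6, 8)]:
    print(tracker.update(pos))
```

The first call returns 0, which matches a newly registered subject having an amount of movement of 0.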

On the other hand, if a face area of a subject not detected in the previous frame is detected in the current frame, the subject specifying unit 110 registers the information on that face area in the subject list as new subject data (S1405). The amount of movement at the time of new registration is 0.

FIG. 14 shows the subject list 113 used in this modification. In addition to the information in the subject list 112 of FIG. 6, the subject list 113 in this modification holds the amount of movement. The amount of movement is registered (updated) by, for example, the subject specifying unit 110 for the subject data corresponding to a face area determined to belong to the same subject.

The face determination unit 111 determines whether the face area detected in the current frame is the face area of a subject already determined to be a face (S1406). For a face area not yet determined to be a face, the face determination unit 111 then determines whether the continuous detection count satisfies a reference value f(L, w) that depends on the reliability level L and the amount of movement w of each subject (S1407).

The face determination unit 111 determines a face area whose continuous detection count satisfies the reference value to be a face, and updates the subject list (S1408).
In ordinary photography, a subject with a large amount of movement is unlikely to be the main subject, and a subject with a small amount of movement is likely to be the main subject. Therefore, in this modification, the reference value for the continuous detection count is reduced for a face area with a small amount of movement, relaxing the face determination criterion. Conversely, the reference value is increased for a face area with a large amount of movement, tightening the face determination criterion.

FIG. 15 shows an example of setting the reference value for the continuous detection count according to the reliability level and the amount of movement of the subject.
In FIG. 15, the horizontal axis is the reliability level, the depth axis is the magnitude of the amount of movement, and the vertical axis is the corresponding reference value for the continuous detection count. In the example of FIG. 15, the amount of movement is classified by a predetermined threshold into two classes, large and small, but a finer classification may be used. The relationship between the reliability level and the amount of movement on the one hand and the continuous detection count on the other may be linear or non-linear.

The face determination unit 111 determines whether the processing from S1402 to S1408 described above has been executed for each of the face areas detected by the face detection in S1401 (S1409).
If an unprocessed face area remains, the processing from S1402 is repeated for one such face area. Once all face areas have been processed, the face determination unit 111 deletes, in S1410, the subject data in the subject list that has not been updated.
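The per-frame loop over detected face areas, followed by deletion of un-updated subject data, might look like the following minimal sketch. The `Subject` record, the matching rule in `same_subject` (position shift and size ratio limits), and all threshold values are assumptions made for illustration, not the patent's actual data layout or matching method.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    x: float
    y: float
    size: float          # assumed to be positive
    count: int = 1       # number of continuous detections
    updated: bool = True


def same_subject(subject, det, max_shift=30.0, max_scale=1.5):
    """Assumed rule: same subject if position and size changed only slightly."""
    dx, dy = det["x"] - subject.x, det["y"] - subject.y
    ratio = max(det["size"], subject.size) / min(det["size"], subject.size)
    return (dx * dx + dy * dy) ** 0.5 <= max_shift and ratio <= max_scale


def process_frame(subjects, detections):
    """Update the subject list with one frame of detections, then prune stale entries."""
    for s in subjects:
        s.updated = False
    for det in detections:  # repeat for every detected face area
        match = next(
            (s for s in subjects if not s.updated and same_subject(s, det)), None
        )
        if match is None:
            subjects.append(Subject(det["x"], det["y"], det["size"]))
        else:
            match.x, match.y, match.size = det["x"], det["y"], det["size"]
            match.count += 1
            match.updated = True
    # Delete subject data that was not updated in this frame.
    return [s for s in subjects if s.updated]
```

A subject that is not detected in a frame is dropped here immediately; keeping it for a grace period would require an extra counter per entry.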

Note that the method of reflecting the amount of movement of the subject in the determination criterion is not limited to changing the number of continuous detections according to the amount of movement between consecutive frames or the sum of each subject's movement over a certain period. For example, the reference value for the number of continuous detections may be changed when the amount of movement between consecutive frames exceeds a certain threshold.
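The three ways of measuring movement mentioned here (frame-to-frame movement, movement summed over a period, and a threshold on frame-to-frame movement) can be sketched as follows. The `MovementTracker` class, the window length, the threshold, and the size of the adjustment are hypothetical choices for illustration only.

```python
from collections import deque


class MovementTracker:
    """Tracks per-frame movement of one subject over a sliding window (assumed design)."""

    def __init__(self, window=10):
        self.steps = deque(maxlen=window)  # movement between consecutive frames
        self.last = None

    def update(self, x, y):
        if self.last is not None:
            dx, dy = x - self.last[0], y - self.last[1]
            self.steps.append((dx * dx + dy * dy) ** 0.5)
        self.last = (x, y)

    def latest(self):
        """Movement between the two most recent frames."""
        return self.steps[-1] if self.steps else 0.0

    def total(self):
        """Sum of the movement over the window."""
        return sum(self.steps)


def adjusted_reference(base, tracker, step_threshold=15.0, penalty=2):
    """Raise the reference value only when the latest frame-to-frame movement exceeds a threshold."""
    return base + penalty if tracker.latest() > step_threshold else base
```

Either `latest()` or `total()` could instead feed a table lookup; the threshold variant changes the reference value only on abrupt movement.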

(Modification 3)
In the present embodiment, all subject data is retained in the subject list until it is determined to be un-updated subject data. Therefore, if the angle of view changes greatly between consecutive frames because of a change in imaging mode, or if the camera moves by a large amount, the subject specifying unit 110 may identify face areas of different subjects as face areas of the same subject. If the subject is misidentified, the reliability of the final face determination result drops significantly.

To address this, in this modification, when it is detected that the entire image has changed (or will change) significantly, for example through a large change in the angle of view or shooting direction, all subject data in the subject list is deleted and subject data is registered anew from the next frame onward.

FIG. 16 is a flowchart for explaining the face determination processing in this modification.
First, for example, the image processing unit 106, serving as change detection means, calculates the amount of change in the image between consecutive frames and detects whether the entire image has changed significantly (S1701). The amount of change can be calculated from the luminance components, color components, or edge components of consecutive frames. The image processing unit 106 determines that the image has changed significantly when the amount of change exceeds a predetermined amount. When the image processing apparatus according to the embodiment is applied to an imaging apparatus, whether the entire image has changed significantly may also be determined by detecting a change in the shooting mode (for example, zoom magnification) or by acquiring information from, for example, a gyro sensor that the imaging apparatus has for camera shake correction.
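As one possible reading of the change detection in S1701, the sketch below uses the mean absolute luminance difference between two frames as the amount of change and also accepts a zoom change flag and a gyro reading. The metric, every threshold, and the function and parameter names are assumptions; the text only states that luminance, color, or edge components may be used.

```python
def mean_abs_luma_diff(prev, curr):
    """Mean absolute difference between two equally sized luminance frames (lists of rows)."""
    if not prev or len(prev) != len(curr):
        raise ValueError("frames must be non-empty and the same size")
    total = 0
    n = 0
    for row_p, row_c in zip(prev, curr):
        for p, c in zip(row_p, row_c):
            total += abs(p - c)
            n += 1
    return total / n


def has_large_change(prev, curr, luma_threshold=40.0,
                     zoom_changed=False, gyro_rate=0.0, gyro_threshold=30.0):
    """Return True if the whole image is judged to have changed significantly."""
    # Camera-side information (assumed inputs) can decide without comparing images.
    if zoom_changed or abs(gyro_rate) > gyro_threshold:
        return True
    return mean_abs_luma_diff(prev, curr) > luma_threshold
```

A color or edge based measure could replace `mean_abs_luma_diff` without changing the decision logic.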

If the entire image has changed significantly, the image processing unit 106 deletes all subject data in the subject list (S1702).
Then, in S1703, the face detection unit 109 performs face detection processing. In this case, all face detection results are registered in the subject list as new subject data. If, on the other hand, the entire image has not changed significantly, the processing moves directly to S1703. S1703 to S1711 in FIG. 16 are the same as S601 to S609 in FIG. 5, so their description is omitted.
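The order of operations in S1701 to S1703 can be summarized in a short sketch: clear the subject list on a large change, then run face detection and the usual update. The function `run_frame`, the callback parameters, and the stand-in `naive_update` are hypothetical; in particular, `naive_update` matches subjects by an opaque region label and is not the identification method of the embodiment.

```python
def run_frame(subject_list, frame, large_change, detect_faces, update_subjects):
    """One pass of the flow: optional reset, face detection, subject list update."""
    if large_change:
        subject_list.clear()                    # corresponds to S1702
    detections = detect_faces(frame)            # corresponds to S1703
    update_subjects(subject_list, detections)   # remaining steps, as in the base flow
    return subject_list


def naive_update(subject_list, detections):
    """Stand-in update: count repeated regions, register unseen ones as new."""
    by_region = {s["region"]: s for s in subject_list}
    for region in detections:
        if region in by_region:
            by_region[region]["count"] += 1
        else:
            subject_list.append({"region": region, "count": 1})


def fake_detect(frame):
    """Stand-in detector: the 'frame' is already a list of region labels."""
    return list(frame)
```

Because the list is emptied before detection, every face found in the frame after a large change starts again with a detection count of one.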

Note that, instead of the image processing unit 106 detecting the change, the face detection unit 109 may supply the image data to the subject specifying unit 110 without performing face detection, and the subject specifying unit 110 may detect a large change in the entire image.

In this modification, a "large change in the entire image" may be, for example, a camera movement or change in the angle of view that alters half or more of the scene, or a change that a general scene change detection technique would identify as a scene change.

(Other embodiments)
The embodiment and its modifications described above apply the image processing apparatus according to the present invention to an imaging apparatus and focus on face determination processing during imaging. However, it will be understood that the images used for face determination are not limited to images captured in real time and may be recorded images. The embodiment and its modifications are therefore equally applicable to face determination processing during moving image playback. Two or more of the modifications described above may also be combined.

Although face detection and face determination have been described as an example of subject detection and subject determination, it will also be understood that the subject is not limited to a human face and may be any object, living thing, or the like that can be detected from an image by applying a known image recognition technique.

Furthermore, the above-described embodiment (including its modifications; the same applies hereinafter) can also be realized in software by a computer (or CPU, MPU, or the like) of a system or apparatus.
Accordingly, a computer program supplied to a computer in order to realize the above-described embodiment on that computer itself realizes the present invention. That is, a computer program for realizing the functions of the above-described embodiment is itself one aspect of the present invention.

The computer program for realizing the above-described embodiment may take any form as long as it is computer-readable. For example, it may consist of object code, a program executed by an interpreter, or script data supplied to an OS, but is not limited to these.

The computer program for realizing the above-described embodiment is supplied to a computer via a storage medium or wired/wireless communication. Examples of storage media for supplying the program include magnetic storage media such as flexible disks, hard disks, and magnetic tape; optical/magneto-optical storage media such as MO, CD, and DVD; and nonvolatile semiconductor memory.

One method of supplying the computer program via wired/wireless communication is to use a server on a computer network. In this case, a data file (program file) that can constitute the computer program embodying the present invention is stored on the server. The program file may be in executable form or may be source code.

The program file is then supplied by download to a client computer that has accessed the server. In this case, the program file may be divided into a plurality of segment files, and the segment files may be distributed across different servers.
That is, a server apparatus that provides a client computer with a program file for realizing the above-described embodiment is also one aspect of the present invention.

In addition, a storage medium storing an encrypted form of the computer program for realizing the above-described embodiment may be distributed, key information for decryption may be supplied to users who satisfy predetermined conditions, and installation on the user's computer may be permitted. The key information can be supplied, for example, by download from a web page via the Internet.

The computer program for realizing the above-described embodiment may also make use of functions of an OS already running on the computer.
Furthermore, part of the computer program for realizing the above-described embodiment may be implemented as firmware on an expansion board or the like installed in the computer, or may be executed by a CPU provided on the expansion board or the like.

FIG. 1 is a block diagram illustrating a configuration example of an imaging apparatus as an example of an image processing apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram explaining processing for specifying the same subject using the position and size of face areas in the imaging apparatus according to the embodiment of the present invention.
FIG. 3 is a diagram showing an example of face areas detected by the face detection unit and their corresponding reliability levels in the imaging apparatus according to the embodiment of the present invention.
FIG. 4 is a diagram schematically showing an example in which the determination criterion is applied to the image shown in FIG. 3.
FIG. 5 is a flowchart for explaining face determination processing in the imaging apparatus according to the embodiment of the present invention.
FIG. 6 is a diagram showing an example of a subject list in the imaging apparatus according to the embodiment of the present invention.
FIG. 7 is a diagram showing an example of setting the reference value for the number of continuous detections according to the reliability level in the imaging apparatus according to the embodiment of the present invention.
FIG. 8 is a flowchart for explaining face determination processing in an imaging apparatus according to Modification 1 of the embodiment of the present invention.
FIG. 9 is a diagram showing an example of a subject list in the imaging apparatus according to Modification 1 of the embodiment of the present invention.
FIG. 10 is a flowchart for explaining face determination processing in an imaging apparatus according to Modification 2 of the embodiment of the present invention.
FIG. 11 is a diagram showing an example of a subject list in the imaging apparatus according to Modification 2 of the embodiment of the present invention.
FIG. 12 is a diagram showing an example of setting the reference value for the number of continuous detections according to the reliability level and the number of people detected in the imaging apparatus according to Modification 2 of the embodiment of the present invention.
FIG. 13 is a flowchart for explaining another face determination processing in the imaging apparatus according to Modification 2 of the embodiment of the present invention.
FIG. 14 is a diagram showing another example of a subject list in the imaging apparatus according to Modification 2 of the embodiment of the present invention.
FIG. 15 is a diagram showing an example of setting the reference value for the number of continuous detections according to the reliability level and the amount of subject movement in the imaging apparatus according to Modification 2 of the embodiment of the present invention.
FIG. 16 is a flowchart for explaining face determination processing in an imaging apparatus according to Modification 3 of the embodiment of the present invention.

Claims

An image processing apparatus comprising:
subject detection means for detecting areas of a predetermined subject from images supplied in time series and detecting the reliability of each detected area;
specifying means for specifying, among the areas detected by the subject detection means from different images, areas of the same subject;
recording means for recording subject data including a history of the reliability detected by the subject detection means for the areas specified by the specifying means as areas of the same subject; and
determination means for determining, among the areas specified by the specifying means as areas of the same subject, an area whose reliability history recorded by the recording means satisfies a determination criterion set according to the reliability level to be an area of the predetermined subject.

The image processing apparatus according to claim 1, wherein the determination criterion is the number of times, or the length of time, for which the reliability is continuously detected by the subject detection means, set according to the reliability level, and the number of times or the length of time is smaller the higher the reliability level.

The image processing apparatus according to claim 1, wherein, for an area once determined to be an area of the predetermined subject, the determination means continues to determine that the area is an area of the predetermined subject for as long as the area is detected by the subject detection means, even if its reliability decreases.

The image processing apparatus according to claim 1, wherein the specifying means deletes, from among the areas whose reliability history is recorded by the recording means, the subject data corresponding to an area that is not detected by the subject detection means.

The image processing apparatus according to claim 4, wherein the specifying means deletes, from among the areas whose reliability history is recorded by the recording means, the subject data corresponding to an area that the subject detection means has failed to detect continuously for a predetermined number of images or a predetermined time.

The image processing apparatus according to claim 1, wherein the determination criterion is set according to the reliability level and the number of areas detected by the subject detection means, such that an area is more readily determined to be an area of the predetermined subject when the number of areas is small than when it is large.

The image processing apparatus according to claim 1, wherein the determination criterion is set according to the reliability level and the amount of movement of the area specified by the specifying means as an area of the same subject, such that an area with a small amount of movement is more readily determined to be an area of the predetermined subject than an area with a large amount of movement.

The image processing apparatus according to claim 1, further comprising change detection means for calculating an amount of change between images adjacent in the time series and deleting all the subject data recorded by the recording means when the amount of change exceeds a predetermined amount.

An imaging apparatus comprising:
a lens for forming an optical image of a subject;
imaging means for sequentially capturing the optical image of the subject formed by the lens and outputting images supplied in time series;
the image processing apparatus according to any one of claims 1 to 8; and
control means for controlling imaging conditions using information on the area determined by the determination means to be an area of the predetermined subject.

An imaging apparatus comprising:
a lens for forming an optical image of a subject;
imaging means for sequentially capturing the optical image of the subject formed by the lens and outputting images supplied in time series;
the image processing apparatus according to claim 8; and
control means for controlling imaging conditions using information on the area determined by the determination means to be an area of the predetermined subject,
wherein the image processing apparatus has change detection means that acquires information on movement of the imaging apparatus or on a change in the angle of view of the lens and, when at least one of these pieces of information exceeds a predetermined threshold, deems that the image output by the imaging means will change significantly and deletes all the subject data recorded by the recording means.

An image processing method comprising:
a subject detection step of detecting areas of a predetermined subject from images supplied in time series and detecting the reliability of each detected area;
a specifying step of specifying, among the areas detected in the subject detection step from different images, areas of the same subject;
a recording step of recording, in a recording unit, subject data including a history of the reliability detected in the subject detection step for the areas specified in the specifying step as areas of the same subject; and
a determination step of determining, among the areas specified in the specifying step as areas of the same subject, an area whose reliability history recorded in the recording unit satisfies a determination criterion set according to the reliability level to be an area of the predetermined subject.