JP2007081682A

JP2007081682A - Image processor, image processing method, and executable program by information processor

Info

Publication number: JP2007081682A
Application number: JP2005265518A
Authority: JP
Inventors: Junji Sugawara (菅原 淳史)
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2005-09-13
Filing date: 2005-09-13
Publication date: 2007-03-29

Abstract

PROBLEM TO BE SOLVED: When the composition shift between image signals is calculated with a person as the reference, the result is affected by movement of the person's hands and feet. When it is calculated from the background while avoiding the person, accuracy is poor at a shallow depth of field because the calculation targets a blurred image. SOLUTION: A specific region is detected from the image signal by matching against reference information. The image signal is divided into a plurality of regions, and a different weighting is set for the information indicating the positional deviation (a motion vector) of each region according to the detection result of the specific region. COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to an image processing apparatus, an image processing method, and a program executable by an information processing apparatus, each for calculating a motion vector, which is the composition shift between a plurality of images.

Many current cameras automate operations that are important for shooting, such as exposure determination and focus adjustment, so that even a person unskilled in camera operation can easily obtain a proper image. In addition, devices that prevent camera shake have been developed. One method of preventing camera shake is optical camera shake correction. In this method, vibration due to camera shake is detected by a sensor that detects acceleration, angular acceleration, angular velocity, angular displacement, or the like, and a correction lens is displaced according to the detected value to correct the optical axis (see, for example, Patent Document 1).

Separate from optical camera shake correction, there is image-synthesis camera shake correction. In this method, shooting is repeated with exposure times short enough to avoid camera shake, and the captured images are aligned and combined. This corrects the difference in composition between images while compensating for the underexposure (see, for example, Patent Document 2).

In image-synthesis camera shake correction, if the composition shift is corrected with a person as the reference, it is difficult to detect the reference person with high accuracy because of the influence of limb movements and the like. Therefore, a method has been proposed that calculates the motion vector using the background, avoiding the region where the person is located (see, for example, Patent Document 3).
Patent Document 1: JP 05-066452 A
Patent Document 2: JP 2002-064743 A
Patent Document 3: JP 2004-219765 A

Unlike optical camera shake correction, image-synthesis camera shake correction does not require a correction lens or its drive mechanism, and is therefore suited to downsizing. However, image-synthesis camera shake correction can give rise to the following problem.

For example, when the aperture is opened and the depth of field is set shallow, the amount of blur differs greatly between the main subject and the background. If the motion vector is then calculated from the background, the alignment calculation uses image information from a blurred area, and only an inaccurate result is obtained.

In view of the above problem, an object of the present invention is to detect the main subject accurately and thereby properly calculate the composition shift of the main subject between image signals.

To this end, the first invention provides an image processing apparatus comprising: shift detection means for detecting a positional shift between a plurality of image signals; and specific region detection means for detecting a specific region from an image signal according to the result of comparison with reference information, wherein the shift detection means sets and outputs different weightings for the information indicating the positional shift for each region of the image signal, according to the detection result of the specific region by the specific region detection means.

Similarly, the second invention provides an image processing apparatus comprising: shift detection means for detecting a positional shift between a plurality of image signals; and face region detection means for detecting, from an image signal, a region in which a person's face exists according to the result of comparison with information indicating the shape of a person's face, wherein the shift detection means outputs either information indicating the positional shift of the region of the image signal in which the person's face exists or information indicating the positional shift of a region different from the region in which the person's face exists.

Similarly, the third invention provides an image processing method comprising: a shift detection step of detecting a positional shift between a plurality of image signals; and a specific region detection step of detecting a specific region from an image signal according to the result of comparison with reference information, wherein in the shift detection step, different weightings are set and output for the information indicating the positional shift for each region of the image signal, according to the detection result of the specific region in the specific region detection step.

Similarly, the fourth invention provides an image processing method comprising: a shift detection step of detecting a positional shift between a plurality of image signals; and a face region detection step of detecting, from an image signal, a region in which a person's face exists according to the result of comparison with information indicating the shape of a person's face, wherein in the shift detection step, either information indicating the positional shift of the region of the image signal in which the person's face exists or information indicating the positional shift of a region different from the region in which the person's face exists is output.

Similarly, the fifth invention provides a program executable by an information processing apparatus, the program having program code for realizing the third invention or the fourth invention.

According to the present invention, a specific target can be detected with high accuracy by comparison with reference information, so the composition shift between image signals can be properly calculated with this specific target as the center.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.

(First embodiment)
An image processing apparatus according to the first embodiment of the present invention will be described with reference to FIGS. 1 to 4.

Here, a digital camera having a function of detecting a person's face is described as an example of an image processing apparatus having a subject detection function.

The image processing apparatus may have any configuration as long as it can receive a plurality of consecutive sets of image data. Examples include a digital camera, a digital video camera, or a camera-equipped mobile phone that has an image sensor such as a CCD or CMOS sensor and a processor that performs digital signal processing on the output signal of the image sensor. A computer that performs digital signal processing on image data received from outside by wireless or wired communication is another example.

The subject detection function may be any function that detects a predetermined subject by detecting and discriminating a specific shape, a specific luminance distribution, a specific hue distribution, or the like. The subject may be something other than a person's face, for example an animal such as a dog, cat, or bird, or a vehicle such as a car, airplane, or train.

Various methods are known for such a subject detection function.

For example, Japanese Patent Laid-Open No. 05-197793 discloses a technique for detecting a subject from the result of matching shape data of each facial part against an input image. In this technique, shape data for the irises, mouth, nose, and so on are prepared, and the two irises are found first. Then, when the mouth, nose, and other parts are searched for, their search areas are limited based on the positions of the irises. In other words, this algorithm does not detect the facial parts that make up a face, such as the irises (eyes), mouth, and nose, in parallel. It finds the irises (eyes) first and uses that result to detect the mouth and nose in turn.

As another matching process, for image data, the method used in the image information extraction apparatus described in Japanese Patent Laid-Open No. 9-130714 can be used. This apparatus generates a template image whose size corresponds to the subject distance. It then scans the screen with this template and calculates a normalized correlation coefficient or the like at each location, which yields the similarity distribution between local portions of the input image and the model data. Alternatively, an algorithm based on the spatial arrangement of local features described in Japanese Patent No. 3078166, or an algorithm based on a convolutional neural network described in Japanese Patent Laid-Open No. 2002-8032, may be used.
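The template scan described above can be illustrated with a minimal sketch. It assumes grayscale images stored as nested lists and uses the zero-mean normalized correlation coefficient as the similarity measure. The function names `normalized_correlation` and `similarity_map` are hypothetical, and the sketch is not the method of the cited publication.

```python
import math


def normalized_correlation(patch, template):
    """Zero-mean normalized correlation of two equally sized 2D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    # A flat patch or template has no contrast; treat it as "no match".
    return num / den if den else 0.0


def similarity_map(image, template):
    """Slide the template over the image and score every position."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    return [[normalized_correlation(
                 [row[x:x + tw] for row in image[y:y + th]], template)
             for x in range(iw - tw + 1)]
            for y in range(ih - th + 1)]
```

The position with the highest score in the returned map is the best candidate for the template location. A real detector would also vary the template size with the subject distance, as the text notes.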

In this embodiment, edges are extracted using the luminance information of the image data, and a desired subject is detected by pattern matching, in which indices such as the shape of each extracted edge, its position in the image, and its hue are compared with a database stored in advance.

FIG. 1 shows a block diagram of an imaging apparatus 100 according to this embodiment.

In FIG. 1, reference numeral 101 denotes a photographing lens composed of a plurality of lenses including a zoom lens and a focus lens, 102 denotes a diaphragm that adjusts the amount of light flux that has passed through the photographing lens, and 103 denotes a shutter that blocks the light flux that has passed through the photographing lens. Reference numeral 104 denotes an image sensor such as a CMOS sensor or a CCD. The diaphragm 102 and the shutter 103 are disposed between the photographing lens 101 and the image sensor 104.

Reference numeral 105 denotes an image sensor driving unit that controls the charge accumulation operation and reset operation of the image sensor. Reference numeral 106 denotes an AF drive motor that drives the focus lens included in the photographing lens 101 in the optical axis direction to change the focal position. Reference numeral 107 denotes a shutter driving unit that drives the shutter 103, 108 denotes a diaphragm driving unit that drives the diaphragm 102 to adjust the aperture diameter, and 109 denotes a focus control unit that controls the drive amount and drive direction of the AF drive motor 106.

Reference numeral 110 denotes an analog/digital (hereinafter, A/D) conversion unit, and 111 denotes a signal processing unit that performs signal processing on the image signal output from the A/D conversion unit 110 and on image signals stored in a memory 112 described later. The analog signal output from the image sensor 104 is converted into a digital signal by the A/D conversion unit 110 and input to the signal processing unit 111. The signal processing unit 111 forms a luminance signal and color signals from the digital signal input from the A/D conversion unit 110, and forms an image signal for display, an image signal for recording, and an image signal for face detection. Reference numeral 112 denotes a memory that temporarily stores the image signal for face detection and the image signal for display. Reference numeral 113 denotes a control unit. The shutter driving unit 107, the diaphragm driving unit 108, the focus control unit 109, the signal processing unit 111, and a face detection unit 114 described later each operate based on control commands from the control unit 113.

Reference numeral 114 denotes a face detection unit, which analyzes the image signal for face detection formed by the signal processing unit 111 and stored in the memory 112, and detects a region in which a person's face exists. As described later, the face detection unit 114 has two reference levels for determining a face: a first reference level, and a second reference level whose criterion for judging a face is stricter than that of the first reference level. Reference numeral 115 denotes a shift detection unit, which performs a correlation operation between a plurality of image signals and calculates the relative positional shift amount (motion vector) between the image signals. The method by which the shift detection unit 115 calculates the motion vector is described later. Reference numeral 116 denotes a coordinate conversion unit, which performs image conversion of each image signal in accordance with the motion vector calculated by the shift detection unit 115. Reference numeral 117 denotes an image storage unit that stores the image signals whose coordinates have been converted by the coordinate conversion unit 116, and 118 denotes an image synthesis unit that combines the image signals stored in the image storage unit 117.

The shift detection unit 115, the coordinate conversion unit 116, the image storage unit 117, and the image synthesis unit 118 need not operate when blur is unlikely. They may operate only when blur is likely and the camera is therefore set to the image composition mode. A specific example is a case where the shutter speed (exposure time) is longer than a predetermined value and the camera is not fixed to a tripod or the like. The image composition mode may also be set manually by the user.

Reference numeral 119 denotes a display unit composed of an LCD (Liquid Crystal Display) or the like, which receives the image signal for display formed by the signal processing unit 111 and displays the subject image. Various shooting parameters such as the shutter speed, aperture value, sensitivity, and shooting mode can also be displayed superimposed on the subject image. The display unit 119 is also used to play back images that have already been shot and stored in the camera. The signal processing unit 111 receives information on the face region detected by the face detection unit 114 and superimposes a frame indicating the position and size of the face on the image signal for display. By receiving and displaying this image signal, the display unit 119 lets the camera user see the face detection result. Reference numeral 120 denotes a recording unit, which records the image signal for recording formed by the signal processing unit 111. Reference numeral 121 denotes an operation unit, which includes a switch SW1 for instructing the photometric operation and focus adjustment operation for shooting, a switch SW2 for starting shooting, and operation members for setting the shooting mode, the image composition mode, or the panning mode described later.

When the image composition mode is set and the photographer presses the release button halfway so that the switch SW1 turns on, the imaging apparatus performs the focus adjustment operation and photometric operation for the actual shooting. According to the photometric result, the control unit 113 determines the aperture diameter of the diaphragm 102 and the shutter speed (exposure time) of the shutter 103. If the subject brightness is low, the aperture is fully opened and the exposure time becomes long. The longer the exposure time, the more likely camera shake becomes. Therefore, when the exposure time is longer than a predetermined value, the image composition mode is set, in which short exposures are performed a plurality of times in succession.

In the image composition mode, the image signal obtained by a single exposure is underexposed, but the influence of camera shake is reduced. If the plurality of image signals obtained in succession are combined, the luminance of the individual image signals is added, so the luminance of the combined image signal can be brought to approximately the proper value. However, even if the blur in each individual image signal is small, the composition of each image signal changes because of camera shake during continuous shooting. If the image signals are combined as they are, the composition shifts between the image signals accumulate in the result.
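The luminance addition described above can be sketched as follows. This is a minimal illustration that assumes already aligned grayscale frames stored as nested lists of integers and an 8-bit output range. The function name `composite` and the clipping value are assumptions, not details from the source.

```python
def composite(frames, max_value=255):
    """Add aligned, underexposed frames pixel by pixel and clip the result."""
    height, width = len(frames[0]), len(frames[0][0])
    return [[min(max_value, sum(frame[y][x] for frame in frames))
             for x in range(width)]
            for y in range(height)]
```

Four frames that each carry roughly a quarter of the proper exposure add up to approximately the proper luminance. Without prior alignment the same sum would smear the subject, which is why the motion vector is needed.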

Therefore, the plurality of successively exposed image signals are each input to the face detection unit 114 and the shift detection unit 115 via the signal processing unit 111, and the relative positional shift amount between these image signals, that is, the motion vector, is calculated.

The shift detection unit 115 divides each of the image signals sent from the signal processing unit 111 into a plurality of blocks, as shown in FIG. 2. FIG. 2 is a diagram for explaining the motion vectors of an image signal. This embodiment shows an example in which one image signal is divided into eight along its long side and six along its short side, for a total of 48 blocks. The number of divisions is of course arbitrary, and other numbers and other shapes may be used.

For ease of explanation, in FIG. 2 the columns are labeled A, B, C, D, E, F, G, and H from the left, and the rows are numbered 1, 2, 3, 4, 5, and 6 from the top. The block in column x and row y is called xy. That is, the upper left block is A1 and the lower right block is H6.
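The block naming of FIG. 2 can be expressed as a small sketch. It assumes a frame whose width and height are divisible by 8 and 6 respectively. The helper names `block_name` and `block_bounds` are hypothetical.

```python
COLUMNS = "ABCDEFGH"  # columns A..H from the left
ROW_COUNT = 6         # rows 1..6 from the top


def block_name(col_index, row_index):
    """Name of the block at zero-based column and row indices."""
    return f"{COLUMNS[col_index]}{row_index + 1}"


def block_bounds(name, width, height):
    """Pixel rectangle (x0, y0, x1, y1) of a named block, x1 and y1 exclusive."""
    col = COLUMNS.index(name[0])
    row = int(name[1:]) - 1
    block_w = width // len(COLUMNS)
    block_h = height // ROW_COUNT
    return (col * block_w, row * block_h,
            (col + 1) * block_w, (row + 1) * block_h)


ALL_BLOCKS = [block_name(c, r)
              for r in range(ROW_COUNT) for c in range(len(COLUMNS))]
```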

The shift detection unit 115 performs a two-dimensional correlation operation on blocks at the same position in the plurality of image signals, and calculates a motion vector, which is the positional shift amount, for each block. The motion vectors calculated for the 48 blocks are then weighted, a histogram is formed from them, and the mode of the histogram is taken as the motion vector of the entire image signal.
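The two stages above can be sketched as follows, under stated assumptions. The per-block search uses the sum of absolute differences as a simple stand-in for the two-dimensional correlation operation, which the source does not specify in detail. The function names, the search range, and the tie-breaking rule are all hypothetical.

```python
from collections import defaultdict


def block_motion_vector(ref, cur, bounds, search=2):
    """Displacement (dx, dy) of one block that minimizes the sum of
    absolute differences (a stand-in for the correlation operation)."""
    x0, y0, x1, y1 = bounds
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip displacements that would move the block off the frame.
            if x0 + dx < 0 or y0 + dy < 0 or x1 + dx > width or y1 + dy > height:
                continue
            cost = sum(abs(ref[y][x] - cur[y + dy][x + dx])
                       for y in range(y0, y1) for x in range(x0, x1))
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def global_motion_vector(block_vectors, weights):
    """Mode of the weighted histogram of per-block motion vectors.
    Blocks without an explicit weight count once; ties go to the
    vector that was entered into the histogram first."""
    histogram = defaultdict(float)
    for name, vector in block_vectors.items():
        histogram[vector] += weights.get(name, 1.0)
    return max(histogram.items(), key=lambda item: item[1])[0]
```

With equal weights, a background that fills most blocks decides the result. Raising the weight of a few blocks lets their motion vector win even when they are in the minority.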

In this embodiment, the face detection unit 114 detects, from the image signal, the blocks in which a person's face exists, and the positions of the blocks in which a face is detected are reflected in the weighting. Specifically, the motion vectors of blocks in which a face is detected are given a higher frequency of occurrence than those of other regions when the motion vector of the entire image is determined.

That is, the motion vector of a block in which a face is detected is weighted more heavily than those of other regions when the motion vector of the entire image is obtained. Alternatively, the motion vector of the entire image is obtained using only the motion vectors of the blocks in which a face is detected.

FIG. 3 shows the relationship between the position at which a person's face is detected and the positions of the blocks in which the weighting is reflected. In FIG. 3, the ellipse indicates the outline of a person's face, and a total of ten blocks, C4, C5, D4, D5, F3, F4, F5, G3, G4, and G5, lie completely within the face region. In this embodiment, the motion vectors of these ten blocks are weighted more heavily than those of all other blocks, and the motion vector of the entire image is determined from the motion vector histogram. In the example of FIG. 3, only blocks that lie completely within the face region are given a larger weight, but blocks that the face region overlaps by a predetermined ratio or more may be treated in the same way. The distance to the face and the position of the face may also be taken into account in the weighting. For example, blocks overlapping a face region near the center of the image may be weighted more heavily than blocks overlapping a face region near the periphery.
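One way to turn detected face regions into per-block weights is sketched below. It assumes each face is approximated by its bounding rectangle rather than the ellipse of FIG. 3, and the weight value 3.0 and the names `overlap_fraction` and `block_weights` are illustrative assumptions only.

```python
def overlap_fraction(block, face):
    """Fraction of the block rectangle covered by the face rectangle.
    Rectangles are (x0, y0, x1, y1) with x1 and y1 exclusive."""
    bx0, by0, bx1, by1 = block
    fx0, fy0, fx1, fy1 = face
    overlap_w = max(0, min(bx1, fx1) - max(bx0, fx0))
    overlap_h = max(0, min(by1, fy1) - max(by0, fy0))
    return (overlap_w * overlap_h) / ((bx1 - bx0) * (by1 - by0))


def block_weights(blocks, faces, min_fraction=1.0, face_weight=3.0):
    """Give blocks covered by a face at least min_fraction a larger weight.
    min_fraction=1.0 selects only blocks lying completely inside a face."""
    weights = {}
    for name, bounds in blocks.items():
        covered = any(overlap_fraction(bounds, face) >= min_fraction
                      for face in faces)
        weights[name] = face_weight if covered else 1.0
    return weights
```

Lowering `min_fraction` corresponds to the variant in which blocks overlapped by a predetermined ratio or more are also weighted. A position-dependent weight could be added by scaling `face_weight` with the distance of the face from the image center.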

To cancel the shift between image signals, the coordinate conversion unit 116 performs coordinate conversion on each image signal in accordance with the motion vector of the entire image calculated by the shift detection unit 115. The image signals converted by the coordinate conversion unit 116 are stored in the image storage unit 117, combined into a single image by the image synthesis unit 118, and stored in the memory 112. Because the image signals are shifted relative to one another before being combined, the combined image contains a region covered by all of the original image signals and a region covered by only some of them. The image synthesis unit 118 therefore cuts away the region covered by only some of the original image signals and generates a combined image signal from the region covered by all of them. It then applies diffusion interpolation processing to the combined image signal to generate an image signal of the same size as the original frame. This image signal is recorded in the recording unit 120 via the memory 112, and the display unit 119 displays the combined image using an image signal for display generated by thinning out this image signal.
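The region covered by all frames after alignment can be computed as in the following sketch. It assumes pure translation, with each frame's shift given as the integer offset (dx, dy) of its top-left corner in the coordinate system of the first frame. The function name `common_region` is hypothetical.

```python
def common_region(width, height, shifts):
    """Rectangle (x0, y0, x1, y1) covered by every shifted frame,
    or None if the frames share no area. x1 and y1 are exclusive."""
    x0 = max(dx for dx, _ in shifts)
    y0 = max(dy for _, dy in shifts)
    x1 = min(dx for dx, _ in shifts) + width
    y1 = min(dy for _, dy in shifts) + height
    if x0 >= x1 or y0 >= y1:
        return None
    return (x0, y0, x1, y1)
```

The combined image would be cropped to this rectangle and then enlarged back to the original frame size, as the paragraph above describes.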

Next, the motion vector calculation processing of this embodiment will be described in detail with reference to the flowchart shown in FIG. 4.

When the power switch of the camera is operated and the camera starts up, processing starts from step S1. In step S1, the control unit 113 initializes the control parameters of the shutter driving unit 107, the diaphragm driving unit 108, and the focus control unit 109. The output signal of the image sensor 104 is then input to the signal processing unit 111 via the A/D conversion unit 110, and the signal processing unit 111 generates an image signal for display and inputs it to the display unit 119. By repeating this imaging operation periodically, the camera operator can monitor the subject in real time by observing the image displayed on the display unit 119.

As the initial value of the reference level for determining a face, the control unit 113 causes the face detection unit 114 to set the second reference level, whose criterion is stricter than that of the first reference level. Here, the reference level for determining a face is the degree to which an input image signal must match the face template information stored as a reference database in order to be judged a face.

The following are examples of how the reference levels can be made to differ. When the first reference level is used, a region is determined to be a face if its degree of coincidence with the template information satisfies a predetermined condition for the face shape. In contrast, when the second reference level is used, a region is determined to be a face only if its degree of coincidence with the template information satisfies predetermined conditions for the face color and face size as well as the face shape.

As another example, when the second reference level is set, the numerical range accepted for a detection target can be made narrower than when the first reference level is set. For example, when the first reference level is set, one of the conditions for determining a face is that a pair of irises exists whose distance from each other is in the range of 5.5 cm to 7.5 cm. When the second reference level is set, one of the conditions is that a pair of irises exists whose distance from each other is in the narrower range of 6.0 cm to 7.0 cm. Any other method may be used as long as it produces a difference between the first and second reference levels in the criterion for recognizing a face.
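The iris-distance example can be written down as a small sketch. Only the two numeric ranges come from the text. The class name `ReferenceLevel` and the function `iris_condition_met` are hypothetical, and the check covers just one of the conditions a real detector would combine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferenceLevel:
    """Accepted range of the distance between the two irises, in cm."""
    min_iris_cm: float
    max_iris_cm: float


FIRST_LEVEL = ReferenceLevel(5.5, 7.5)   # looser criterion
SECOND_LEVEL = ReferenceLevel(6.0, 7.0)  # stricter criterion


def iris_condition_met(distance_cm, level):
    """True if the iris distance satisfies this level's condition."""
    return level.min_iris_cm <= distance_cm <= level.max_iris_cm
```

A candidate with an iris distance of 5.8 cm satisfies the first level but not the second, so the stricter level rejects some regions that the looser level accepts.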

When the above preparation is complete, the control unit 113 determines the shooting mode in step S2. Specifically, it determines whether the shooting mode of the camera has been set via the operation unit 121 to an image composition mode, in which a plurality of images are shot and then combined, or to another shooting mode. If the shooting mode is set to the image composition mode, the process proceeds to step S3; if it is set to another shooting mode, the process proceeds to step S22.

First, the case where it is determined in step S2 that another shooting mode is set will be described. In step S22, the control unit 113 determines whether the switch SW1 included in the operation unit 121 has been turned on. If it has been turned on, the process proceeds to step S23; if not, the control unit waits until it is turned on.

In step S23, in response to the switch SW1 being turned on, the signal processing unit 111 generates an image signal for face detection, and the face detection unit 114 performs face detection on this image signal. The control unit 113 then determines a focus adjustment area and a photometry area according to the face detection result and causes a focus adjustment operation and a photometry operation to be performed. If face detection fails, the focus adjustment operation and the photometry operation are performed using, for example, a method of focusing on the nearest subject and a method of obtaining the subject luminance with a larger weight given to the vicinity of the center of the screen.

In step S24, the control unit 113 determines whether the switch SW2 included in the operation unit 121 is on. If it is on, the process proceeds to step S25; if not, the process returns to step S22.

In step S25, the aperture value and the shutter speed are set based on the photometric information obtained in step S23, shooting is performed, and the obtained image is stored in the memory 112. The signal processing unit 111 then converts the image signal stored in the memory 112 into an image signal for display or an image signal for recording. On receiving the converted image signal, the display unit 119 displays it and the recording unit 120 records it.

Next, the case where it is determined in step S2 that the image composition mode is set will be described.

In step S3, the control unit 113 causes the face detection unit 114 to set the first reference level as the reference level for determining a face. The reference level is changed in order to keep the probability of successful face detection high even in situations such as panning, where the subject and the camera move relative to each other and face detection is more difficult than when the subject is stationary. For this reason, step S3 sets the first reference level, whose criteria are more lenient than those of the second reference level.

In step S4, the control unit 113 determines whether the switch SW1 included in the operation unit 121 is on. If the switch SW1 is on, the process proceeds to step S5; if not, the control unit waits until it is turned on.

In step S5, in response to the switch SW1 being on, the signal processing unit 111 generates an image signal for face detection, and the face detection unit 114 performs face detection on this image signal. The control unit 113 then determines a focus adjustment area and a photometry area according to the face detection result and causes a focus adjustment operation and a photometry operation to be performed. At this time, the coordinate information of the region in which the detected face exists is stored in the memory 112. If face detection fails, the focus adjustment operation and the photometry operation are performed using, for example, a method of focusing on the nearest subject and a method of obtaining the subject luminance with a larger weight given to the vicinity of the center of the screen.

Note that the image composition mode may instead be set automatically according to the luminance of the subject, in which case the determination of whether the image composition mode is set (step S2) is made after the photometry result of step S5 has been received. In this case, step S2 is moved to between step S5 and step S6, and if the image composition mode is not set, the process proceeds to step S22.

In step S6, the control unit 113 determines whether the operation unit 121 has been used to set a panning mode, in which the camera itself is expected to move or change direction. If the panning mode is set, the panning flag flg stored in a memory within the control unit 113 is set to 1 (step S7); otherwise, the flag flg is set to 0 (step S8).

The value of the flag flg may also be set according to the output of a sensor (not shown) built into the camera that detects acceleration, angular acceleration, angular velocity, angular displacement, or the like, even when the panning mode has not been set via the operation unit. In this case, the flag flg is set to 1 when the sensor detects that the camera is moving in a fixed direction.
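
One way to picture how the flag flg might be derived is sketched below. The text does not say how "moving in a fixed direction" is detected, so the rule used here (every angular-velocity sample exceeds a threshold with the same sign) and the names `panning_flag` and `threshold` are assumptions made purely for illustration.

```python
def panning_flag(panning_mode_selected, angular_velocities=(), threshold=5.0):
    """Return 1 if the camera is treated as panning, otherwise 0.

    panning_mode_selected: True if the user chose the panning mode.
    angular_velocities: hypothetical sensor samples (e.g. degrees per second).
    threshold: hypothetical minimum magnitude that counts as deliberate motion.
    """
    if panning_mode_selected:
        return 1
    samples = list(angular_velocities)
    if samples:
        all_positive = all(v > threshold for v in samples)
        all_negative = all(v < -threshold for v in samples)
        # Sustained motion in one direction is treated as panning.
        if all_positive or all_negative:
            return 1
    return 0
```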

In step S9, the control unit 113 determines whether the switch SW2 included in the operation unit 121 is on. If it is on, the process proceeds to step S10; if not, the process returns to step S4.

In step S10, continuous shooting is performed based on the photometric information obtained in step S5. Let T_S be the exposure time originally required to capture the scene in a single shot, and T_M the exposure time per shot when multiple shots are taken. The number of shots n then only needs to satisfy
T_S = T_M × n
As a specific example, T_M may be set to "1/focal length", which is a common guideline for an exposure time that avoids camera shake, and the number of shots n determined from it; in practice, however, the method is not limited to this.
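
As a worked illustration of the relation T_S = T_M × n, the sketch below derives n from the "1/focal length" guideline. Rounding n up to a whole number and then recomputing the per-shot exposure is an assumption added here so that the relation holds exactly; the function name `shots_needed` is hypothetical.

```python
import math


def shots_needed(t_single, focal_length_mm):
    """Return (n, t_multi) such that t_single == t_multi * n.

    t_single: exposure time T_S in seconds needed for a single shot.
    focal_length_mm: focal length used for the 1/focal-length guideline.
    """
    t_guideline = 1.0 / focal_length_mm          # longest hand-holdable exposure
    n = max(1, math.ceil(t_single / t_guideline))
    t_multi = t_single / n                        # per-shot exposure T_M
    return n, t_multi
```

For example, a scene needing 1/8 s at a focal length of 64 mm gives a guideline exposure of 1/64 s, so eight shots of 1/64 s each would be taken.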

In step S11, the shift detection unit 115 divides each of the continuously shot image signals into a plurality of blocks as shown in FIG. 2. For every block of these image signals, a two-dimensional correlation operation between the image signals is performed on the blocks at the same position. This yields the amount of shift for each block, that is, a motion vector, which is stored in a buffer memory.
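
The per-block shift search can be sketched as below. The text specifies only "a two-dimensional correlation operation"; this sketch substitutes an exhaustive sum-of-absolute-differences (SAD) search, which is one common stand-in, so it should be read as an assumption rather than the method of the embodiment. Images are plain lists of rows, and `block_motion_vector` and `search` are hypothetical names.

```python
def block_motion_vector(ref, cur, top, left, size, search=2):
    """Return the (dx, dy) that best maps a block of `ref` onto `cur`.

    ref, cur: 2-D lists of pixel values with identical dimensions.
    top, left, size: position and side length of the square block in `ref`.
    search: maximum shift, in pixels, tried in each direction.
    """
    height, width = len(ref), len(ref[0])
    best_sad, best_vec = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidate positions that fall outside the image.
            if top + dy < 0 or top + dy + size > height:
                continue
            if left + dx < 0 or left + dx + size > width:
                continue
            sad = sum(
                abs(ref[top + y][left + x] - cur[top + dy + y][left + dx + x])
                for y in range(size)
                for x in range(size)
            )
            if best_sad is None or sad < best_sad:
                best_sad, best_vec = sad, (dx, dy)
    return best_vec
```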

In step S12, the control unit 113 determines whether the image composition mode set via the operation unit 121 is a person priority mode, in which composition is performed with priority given to a person who is the main subject. If the person priority mode is set, the process proceeds to step S13; otherwise, it proceeds to step S18.

In step S13, the control unit 113 causes the face detection unit 114 to set the second reference level as the reference level for determining a face. Face detection at this level takes longer than the face detection used to determine the focus adjustment area and the photometry area, but because it is performed on image signals that have already been captured, it does not cause effects such as release time lag, unlike face detection during the shooting operation. Making the reference level stricter allows more accurate face detection than in step S5.

In step S14, the face detection unit 114 performs face detection on the first of the image signals continuously shot in step S10. Because the luminance of each individual image is low, an image signal for face detection whose luminance has been raised, for example by digital gain processing, is generated, and face detection is performed on that image signal. If no face can be detected in the first image signal, face detection is performed on the second, the third, and subsequent image signals in turn. If face detection fails for all of the image signals, the face detection result of step S5 may be used instead.

In the present embodiment, different face detection template information is held for each individual registered in advance, so that not only face detection but also individual recognition can be performed on an image signal. A region detected as a face is further matched in turn against the template information prepared for each individual. If the template with the highest degree of coincidence also satisfies a predetermined threshold, the face is determined to be that of the individual registered with that template information. Each individual's template information is stored in association with information on that individual's priority, registered in advance. This is because, when the faces of several people appear in an image, it is almost impossible to correct the positional shift so that it is optimal for all of them. Assigning priorities in advance also makes it possible to give priority to correcting the positional shift of the subject the user intended.
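
The individual-recognition step described above (take the best-matching registered template, accept it only if it clears a threshold, then look up the associated priority) might be sketched as follows. The score dictionary, the threshold value, and the name `identify` are all hypothetical; how the match scores are produced is outside this sketch.

```python
def identify(scores, priorities, threshold=0.8):
    """Return (name, priority) of the recognised individual, or (None, None).

    scores: mapping of registered name -> degree of coincidence with that
        person's template (hypothetical values in 0..1).
    priorities: mapping of registered name -> priority rank (1 = highest).
    threshold: hypothetical minimum degree of coincidence for acceptance.
    """
    if not scores:
        return None, None
    best = max(scores, key=scores.get)   # template with the highest coincidence
    if scores[best] < threshold:
        return None, None                # face detected, but nobody registered
    return best, priorities.get(best)
```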

As an example, consider the case shown in FIG. 3, in which the face of person A, registered with a high priority (for example, first), and the face of person B, for whom no priority is registered, are both present. Let the upper-left coordinates of the rectangular region overlapping person A's face be (x_A1, y_A1) and the lower-right coordinates be (x_A2, y_A2). Let the upper-left coordinates of the rectangular region overlapping person B's face be (x_B1, y_B1) and the lower-right coordinates be (x_B2, y_B2). If the reliability of the face detection result for person A is R_A and that for person B is R_B, the following information is stored in the memory 112 for the face detection results of person A and person B respectively. Naturally, the stored information differs according to the number of people detected and who they are.
Person A {(x_A1, y_A1), (x_A2, y_A2), R_A, 1}
Person B {(x_B1, y_B1), (x_B2, y_B2), R_B, −}
The reliability of the face detection result is described next.

Several methods of calculating the reliability of a face detection result are conceivable. For example, one approach holds several parameters, such as the template matching result, color (skin color for a face), and motion detection (a face, that is, a person, moves), and sets the reliability from a combination of these parameters. Other approaches set the reliability according to how well the region matches the template, or according to the resolution and area of the face detection region (a face in a high-resolution image, or a face that appears large, has higher reliability). Japanese Patent Laid-Open No. 2004-206665 discloses a method of detecting skin-color regions in advance and determining that a face region detected within such a skin-color region is highly reliable. Japanese Patent Laid-Open No. 2003-266348 discloses a method in which, for a template of n pixels, the number of matched pixels is n_match and the reliability is n_match / n.

In the present embodiment, a value normalized to the range 0 to 1 is set as the value indicating the reliability of a face detection result obtained by any of the methods described above. The higher the reliability, the closer the value is to 1.
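
The n_match / n style of reliability mentioned above already falls in the range 0 to 1. A minimal sketch is shown below; treating a pixel as "matched" when it differs from the template by at most `tolerance` is an assumption, as are the function and parameter names.

```python
def match_reliability(template, patch, tolerance=0):
    """Return n_match / n for two equal-length flat lists of pixel values.

    tolerance: hypothetical maximum absolute difference that still counts
        as a match for a single pixel.
    """
    if not template or len(template) != len(patch):
        raise ValueError("template and patch must be non-empty and equal in length")
    n = len(template)
    n_match = sum(1 for t, p in zip(template, patch) if abs(t - p) <= tolerance)
    return n_match / n
```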

If face detection fails in step S14 and no region containing a face can be detected, the process proceeds to step S19, described later (step S15).

In step S16, the shift detection unit 115 calculates a weight W for the blocks included in the range of the coordinate information of the face detection result stored in the memory 112. The calculation of W takes into account the reliability R of the face detection result (0 ≤ R ≤ 1) and the priority registered for the face. As a specific example of how W may be calculated, the camera holds a weight w_f for each registered face priority (f being the priority) and a weight w_0 for general faces that are not registered. These are assumed to satisfy
w_1 ≥ w_2 ≥ ... ≥ w_0 ≥ 1
so that the weight increases as the priority of the face rises. If the weight W of each face block is then expressed as
W = {(w_f − 1) × R} + 1
the value of W becomes larger as the registered priority and the reliability of the face detection become higher. In the example of FIG. 3, in which the faces of person A and person B are detected, a weight of {(w_1 − 1) × R_A} + 1 is given to the six blocks F3 to F5 and G3 to G5 in which person A's face was detected, and a weight of {(w_0 − 1) × R_B} + 1 is given to the four blocks C4 to C5 and D4 to D5 in which person B's face was detected. If the camera is instead configured so that the weight W of an unregistered face is set to 0, people the user did not intend to photograph, such as a passer-by who happened to be in the frame, can be ignored when the positional shift is calculated.
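
The weighting formula W = {(w_f − 1) × R} + 1 can be tried out with the sketch below. The concrete weight values in the usage note are invented for illustration; only the formula itself comes from the text, and `block_weight` is a hypothetical name.

```python
def block_weight(priority_weights, w_unregistered, priority, reliability):
    """Return W = (w_f - 1) * R + 1 for a face block.

    priority_weights: mapping of priority rank f -> weight w_f.
    w_unregistered: weight w_0 used for faces with no registered priority.
    priority: the face's priority rank, or None if it is not registered.
    reliability: face detection reliability R in the range 0..1.
    """
    if priority is None:
        w_f = w_unregistered
    else:
        w_f = priority_weights.get(priority, w_unregistered)
    return (w_f - 1.0) * reliability + 1.0
```

With illustrative weights w_1 = 3.0 and w_0 = 1.5, a first-priority face detected with R = 0.5 receives W = 2.0, while an unregistered face detected with R = 1.0 receives W = 1.5. With R = 0 the weight falls back to 1, the same as a block without a face.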

In the present embodiment, the weight W is given using both the registered priority and the reliability of the face detection, but W may be given using only one of the two.

In step S17, a histogram of motion vectors is created for each image signal, based on the motion vector of each block calculated in step S11 and stored in the buffer memory and on the weights calculated in step S16. For a block in which no face was detected, 1 is added for each motion vector calculated for that block, whereas for a block in which a face was detected, W (W > 1) is added; the resulting frequencies of occurrence form the histogram. The process then proceeds to step S20.

Returning to step S12, if the normal composition mode is set, the value of the flag flg set in step S7 or step S8 is checked in step S18. A flag flg of 1 indicates that the camera is set to the panning mode or that the camera was moving in a fixed direction during shooting. In that case, composing the images around the subject is expected to give a preferable result. Therefore, if the flag flg is set to 1, the process proceeds to step S13 and on to the processing that performs image composition using the face detection result.

If, on the other hand, the flag flg is set to 0, the process proceeds to step S19, where a histogram of motion vectors is created for each image signal by adding 1 for each motion vector calculated for each block, without weighting individual blocks. The processing of step S19 is also performed, via step S15, when face detection fails in step S14. The process then proceeds to step S20.

In step S20, the motion vector that occurs most frequently in the histogram created in step S17 or step S19 is extracted and taken as the motion vector of that image signal.
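
Steps S17, S19, and S20 together amount to a weighted vote over the block motion vectors. A minimal sketch follows; the block labels and the dictionary-based representation are assumptions made for illustration, and `dominant_vector` is a hypothetical name.

```python
from collections import defaultdict


def dominant_vector(block_vectors, face_weights=None):
    """Return the motion vector with the largest weighted count.

    block_vectors: mapping of block label -> (dx, dy) motion vector.
    face_weights: mapping of block label -> weight W for face blocks;
        blocks not listed (or all blocks, if None) count as 1.
    """
    weights = face_weights or {}
    histogram = defaultdict(float)
    for block, vector in block_vectors.items():
        histogram[vector] += weights.get(block, 1.0)
    # The most frequent vector becomes the vector of the whole image signal.
    return max(histogram, key=histogram.get)
```

Calling the function without `face_weights` corresponds to the unweighted histogram of step S19; passing the weights of step S16 corresponds to step S17.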

Based on this result, in step S21 the coordinate conversion unit 116 performs coordinate conversion and stores the result in the image storage unit 117. Performing the coordinate conversion based on the motion vector of the whole image obtained in step S20 cancels the principal positional shift between the continuously shot image signals. The image synthesizing unit 118 reads the series of coordinate-converted image signals from the image storage unit 117 and generates an image signal in which only the region where all of the image signals overlap is combined. In the present embodiment, the image synthesizing unit 118 cuts away, without combining, regions where only some of the image signals overlap; however, such regions may also be combined, with the missing luminance compensated by digital gain. The image synthesizing unit 118 restores the combined image signal to the original frame size by diffusion interpolation processing and then stores it in the memory 112. The signal processing unit 111 converts the combined image signal stored in the memory 112 into an image signal for display or an image signal for recording. On receiving the converted image signal, the display unit 119 displays the combined image signal and the recording unit 120 records it.
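
The region where all of the shifted image signals overlap can be found as an intersection of rectangles. The sketch below assumes purely translational shifts in whole pixels and a shared canvas coordinate system; the name `common_region` and the tuple layout are hypothetical.

```python
def common_region(width, height, shifts):
    """Return (left, top, right, bottom) covered by every shifted frame.

    width, height: frame size in pixels.
    shifts: non-empty list of (dx, dy) offsets applied to each frame after
        coordinate conversion; a frame then covers
        [dx, dx + width) x [dy, dy + height) on the shared canvas.
    Returns None if the frames have no area in common.
    """
    left = max(dx for dx, _ in shifts)
    top = max(dy for _, dy in shifts)
    right = min(dx for dx, _ in shifts) + width
    bottom = min(dy for _, dy in shifts) + height
    if left >= right or top >= bottom:
        return None
    return (left, top, right, bottom)
```

The returned rectangle is the part that would be kept and then enlarged back to the original frame size; anything outside it is covered by only some of the frames.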

As described above, in the present embodiment the motion vector is obtained with a larger weight given to regions in which a person's face exists. The motion vector, which is the shift between a plurality of image signals, can therefore be obtained with high accuracy with priority given to the person, while being little affected by movements of the hands and feet.

(Second Embodiment)
In the first embodiment described above, motion vectors were first calculated for all of the blocks into which the captured image was divided, faces were then detected, the blocks in which a face was detected were weighted, and the motion vector of the whole image signal was calculated. In the present embodiment, faces are detected first, and the motion vector of the whole image signal is calculated using only the blocks in which a face was detected.

The block diagram of the camera serving as the imaging apparatus in the present embodiment is the same as that of the first embodiment.

The motion vector calculation processing of the present embodiment will be described in detail with reference to the flowchart shown in FIG. 5, focusing on the processing that differs from the first embodiment. In the flowchart of FIG. 5, processing that is common to the flowchart of the first embodiment is given the same step numbers as in FIG. 4. The flowchart of FIG. 5 performs the same processing as the flowchart of FIG. 4 up to step S10.

After continuous shooting has been performed in step S10 based on the photometric information obtained in step S5, the process proceeds to step S12. Unlike in the first embodiment, the two-dimensional correlation operation is not performed on all of the blocks after the continuous shooting.

In step S13, the control unit 113 causes the face detection unit 114 to set the second reference level as the reference level for determining a face.

Then, in step S14, the face detection unit 114 performs face detection on the first of the image signals continuously shot in step S10. Because the luminance of each individual image is low, an image signal for face detection whose luminance has been raised, for example by digital gain processing, is generated, and face detection is performed on that image signal. If no face can be detected in the first image signal, face detection is performed on the second, the third, and subsequent image signals in turn. If face detection fails for all of the image signals, the face detection result of step S5 may be used instead.

In the present embodiment as well, different face detection template information is held for each individual registered in advance, so that not only face detection but also individual recognition can be performed on an image signal.

Assume that face detection is performed on the image signal of FIG. 3 and that the same face detection result and individual recognition result as in the first embodiment are obtained. The following information is then stored in the memory 112 for the face detection results of person A and person B respectively.
Person A {(x_A1, y_A1), (x_A2, y_A2), R_A, 1}
Person B {(x_B1, y_B1), (x_B2, y_B2), R_B, −}
Next, in step S31, the shift detection unit 115 divides each of the continuously shot image signals into a plurality of blocks. Only for the blocks of these image signals that are included in the range of the coordinate information stored in the memory 112, a two-dimensional correlation operation between the image signals is performed on the blocks at the same position. This yields the amount of shift, that is, the motion vector, of each block in which a face was detected, and the result is stored in the buffer memory.
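
Selecting the blocks that fall within a stored face rectangle might look like the sketch below. It assumes a uniform grid of square blocks, inclusive pixel coordinates for both corners, and (column, row) indices rather than the letter-and-number labels of FIG. 3, whose exact layout is not given in the text. The name `blocks_in_rectangle` is hypothetical.

```python
def blocks_in_rectangle(x1, y1, x2, y2, block_size):
    """Return the (column, row) indices of blocks touched by a rectangle.

    (x1, y1): upper-left pixel of the face rectangle (inclusive).
    (x2, y2): lower-right pixel of the face rectangle (inclusive).
    block_size: side length of each square block in pixels.
    """
    first_col, last_col = x1 // block_size, x2 // block_size
    first_row, last_row = y1 // block_size, y2 // block_size
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]
```

Only the blocks returned here would then be passed to the correlation operation, which is where the second embodiment saves computation compared with processing every block.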

Then, in step S32, a weight W is calculated for the blocks in which a face was detected, taking into account the reliability R of the face detection result (0 ≤ R ≤ 1) and the priority registered for the face. Based on this weight W, a histogram of the motion vectors of the blocks in which a face was detected is created for each image signal. The process then proceeds to step S20, where the motion vector that occurs most frequently in the histogram is extracted and taken as the motion vector of that image signal.

Returning to step S12, if the normal composition mode is set, the process proceeds to step S18, where the value of the flag flg stored in step S7 or step S8 is checked. A flag flg of 1 indicates that the camera is set to the panning mode or that the camera was moving in a fixed direction during shooting. In that case, composing the images around the subject is expected to give a preferable result. Therefore, if the flag flg is set to 1, the process proceeds to step S13 and on to the processing that performs image composition using the face detection result.

If, on the other hand, the flag flg is set to 0, the process proceeds to step S33. In step S33, the shift detection unit 115 divides each of the continuously shot image signals into a plurality of blocks. For every block of these image signals, a two-dimensional correlation operation between the image signals is performed on the blocks at the same position. This yields the amount of shift for each block, that is, a motion vector, which is stored in the buffer memory.

Then, in step S34, a histogram of motion vectors is created for each image signal by adding 1 for each motion vector calculated for each block. The process then proceeds to step S20, where the motion vector that occurs most frequently in the histogram is extracted and taken as the motion vector of that image signal.

In step S21, as in the first embodiment, coordinate conversion is performed based on the motion vector of the whole image obtained in step S20, which cancels the principal shift between the continuously shot image signals, and these image signals are then combined.

In the second embodiment, unlike the first, motion vectors in the person priority mode are calculated only from the regions in which a face was detected, so the time required to calculate the motion vectors is shortened.

In the first and second embodiments, the image signal was divided into a plurality of blocks, and the correlation operation was performed on the blocks included in the range of the coordinate information stored in the memory 112. The present invention is not limited to this method. For example, without dividing the image signal into blocks, the correlation operation may be performed on a region corresponding to the range of the coordinate information stored in the memory 112 itself. Alternatively, a region that follows the shape of the face may be extracted and the correlation operation performed on that region.

(Other Embodiments)
The first and second embodiments were described for configurations that assume still images, but the present invention is not limited to this and can also be applied to moving images. Regarding camera-shake correction of moving images, Japanese Patent Laid-Open No. 11-187303, for example, discloses a technique that takes a specific image as a reference, detects the positional shift between it and the other images that follow it, and reproduces images in which this shift has been corrected. Applying the present invention to this technique makes it possible to increase the accuracy of motion vector calculation and shorten the processing time, so it is also effective for the reproduction of moving images. For camera-shake correction of moving images, however, it is preferable to give priority to regions without the main subject (the background), rather than to the motion vectors calculated from the region containing the main subject (a person's face) as described in the first and second embodiments.

That is, the processing in step S16 of FIG. 4 may be replaced with processing that makes the weight of the region where a face is detected smaller than that of the other regions.
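The modified weighting can be sketched as follows, assuming that the per-region motion vectors are combined into a single vector by a weighted mean. The weighted mean itself, the function name `global_motion`, and the default weight values are hypothetical; the description only states that the face region receives a smaller weight than the other regions.

```python
def global_motion(vectors, is_face, face_weight=0.2, background_weight=1.0):
    """Combine per-region motion vectors into one vector by a weighted mean.

    vectors : list of (dx, dy) tuples, one per region
    is_face : list of bools, True where a face was detected in that region
    Face regions receive `face_weight`, which is assumed to be smaller than
    `background_weight` for the moving-image case.
    """
    total_weight = 0.0
    sum_x = 0.0
    sum_y = 0.0
    for (dx, dy), face in zip(vectors, is_face):
        weight = face_weight if face else background_weight
        total_weight += weight
        sum_x += weight * dx
        sum_y += weight * dy
    if total_weight == 0.0:
        raise ValueError("all weights are zero; no motion vector can be formed")
    return (sum_x / total_weight, sum_y / total_weight)
```

Setting `face_weight` to 0 ignores the face regions entirely, so the result follows the background motion alone.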

Alternatively, the processing in steps S31 and S32 of FIG. 5 may be replaced with processing that calculates motion vectors only for the regions where no face is detected and creates a histogram from them.
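A minimal sketch of this alternative is given below. It builds a histogram of the motion vectors of the regions without a face and, as an assumption not stated in this passage, takes the most frequent vector as the overall motion. The names `background_motion_histogram` and `dominant_motion` are hypothetical.

```python
from collections import Counter


def background_motion_histogram(vectors, is_face):
    """Count how often each motion vector occurs among regions without a face.

    vectors : list of (dx, dy) tuples, one per region
    is_face : list of bools, True where a face was detected in that region
    """
    return Counter(v for v, face in zip(vectors, is_face) if not face)


def dominant_motion(histogram):
    """Return the most frequent motion vector (the histogram peak)."""
    if not histogram:
        raise ValueError("no face-free region is available")
    return histogram.most_common(1)[0][0]
```

For example, if the face regions move by (5, 5) because the person moves, while most background regions move by (1, 0) because of camera shake, the histogram peak is (1, 0).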

In the embodiments described above, the motion vector is calculated inside the camera, but the present invention is not limited to this. The present invention can also be applied to an image processing apparatus, such as a PC, that receives a plurality of images from a camera or from a network by wired or wireless communication, and the motion vector calculation processing can be carried out on that apparatus.

The object of the present invention can also be achieved as follows. First, a storage medium (or recording medium) on which the program code of software that realizes the functions of the embodiments described above is recorded is supplied to a system or an apparatus. The computer (or CPU or MPU) of that system or apparatus then reads out and executes the program code stored in the storage medium. In this case, the program code read from the storage medium itself realizes the functions of the embodiments described above, and the storage medium storing that program code constitutes the present invention.

The functions of the embodiments described above are realized not only when the computer executes the program code it has read out, but also in the following case: an operating system (OS) or the like running on the computer performs part or all of the actual processing based on the instructions of the program code, and that processing realizes the functions of the embodiments. Storage media that can store the program code include, for example, a flexible disk, hard disk, ROM, RAM, magnetic tape, nonvolatile memory card, CD-ROM, CD-R, DVD, optical disk, magneto-optical disk, and MO. A computer network such as a LAN (local area network) or a WAN (wide area network) can also be used to supply the program code.

FIG. 1 is a block diagram of a camera that is an image processing apparatus according to the first and second embodiments of the present invention.
FIG. 2 is a diagram for explaining the motion vector of an image signal.
FIG. 3 is a diagram showing the relationship between the position where a person's face is detected and the positions of the blocks on which the weighting is reflected.
FIG. 4 is a flowchart showing the operation of the camera in the first embodiment of the present invention.
FIG. 5 is a flowchart showing the operation of the camera in the second embodiment of the present invention.

Explanation of symbols

100 Imaging device
101 Shooting lens
102 Diaphragm
103 Shutter
104 Image sensor
105 Image sensor drive means
106 AF drive motor
107 Shutter drive means
108 Diaphragm drive means
109 Focus control circuit
110 A/D conversion means
111 Signal processing circuit
112 Memory
113 Control means
114 Face detection means
115 Deviation detection means
116 Coordinate conversion means
117 Image storage means
118 Image composition means
119 Display means
120 Recording means
121 Operation means

Claims

1. An image processing apparatus comprising:
deviation detection means for detecting a positional deviation between a plurality of image signals; and
specific area detection means for detecting a specific area from an image signal according to a result of comparison with reference information,
wherein the deviation detection means sets, according to the detection result of the specific area by the specific area detection means, different weights on the information indicating the positional deviation for each area of the image signal, and outputs that information.

2. The image processing apparatus according to claim 1, wherein the deviation detection means sets, according to the reliability of the detection result of the specific area by the specific area detection means, different weights on the information indicating the positional deviation for each area of the plurality of image signals, and outputs that information.

3. The image processing apparatus according to claim 2, wherein the deviation detection means sets a larger weight on the information indicating the positional deviation of an area for which the reliability of the detection result of the specific area detection means is higher, and outputs that information.

4. The image processing apparatus according to claim 1, wherein the specific area detection means determines the type of the detected specific area, and different weights are set, according to the determined type, on the information indicating the positional deviation for each area of the plurality of image signals, and that information is output.

5. The image processing apparatus according to claim 1, wherein the specific area is a person's face, and the specific area detection means detects an area where a person's face exists according to a result of comparison with reference information indicating the shape of a person's face.

6. The image processing apparatus according to claim 4, wherein the specific area is a person's face, the specific area detection means identifies an individual from the detected face according to a result of comparison with reference information indicating the shape of the individual's face, and different weights are set on the information indicating the positional deviation according to the identified individual, and that information is output.

7. The image processing apparatus according to claim 1, wherein the deviation detection means calculates a positional deviation of the entire image signal using the information indicating the positional deviation on which different weights have been set for each area of the image signal.

8. An image processing apparatus comprising:
deviation detection means for detecting a positional deviation between a plurality of image signals; and
face area detection means for detecting, from an image signal, an area where a person's face exists according to a result of comparison with information indicating the shape of a person's face,
wherein the deviation detection means outputs either information indicating the positional deviation of the area of the image signal where the person's face exists or information indicating the positional deviation of an area of the image signal different from the area where the person's face exists.

9. The image processing apparatus according to claim 7, wherein a larger weight is set on the information indicating the positional deviation of an area for which the reliability of the detection result of the face area detection means is higher, and that information is output.

10. The image processing apparatus according to claim 8, wherein the face area detection means identifies an individual from the detected face, and different weights are set on the information indicating the positional deviation according to the identified individual, and that information is output.

11. The image processing apparatus according to claim 8, wherein the deviation detection means calculates a positional deviation of the entire image signal using either the information indicating the positional deviation of the area of the image signal where the person's face exists or the information indicating the positional deviation of an area of the image signal different from the area where the person's face exists.

12. The image processing apparatus according to claim 1, wherein the image processing apparatus is a camera provided with an image sensor, and the specific area detection means uses different detection criteria for detecting a specific area for focus adjustment or photometry at the time of shooting and for detecting a specific area for performing signal processing on the image signal obtained by shooting.

13. The image processing apparatus according to claim 1, wherein the image processing apparatus is a camera provided with an image sensor, and the deviation detection means sets different weights on the information indicating the positional deviation according to information on the movement and orientation of the image processing apparatus itself, and outputs that information.

14. An image processing method comprising:
a deviation detection step of detecting a positional deviation between a plurality of image signals; and
a specific area detection step of detecting a specific area from an image signal according to a result of comparison with reference information,
wherein, in the deviation detection step, different weights are set, according to the detection result of the specific area in the specific area detection step, on the information indicating the positional deviation for each area of the image signal, and that information is output.

15. The image processing method according to claim 14, wherein, in the deviation detection step, different weights are set, according to the reliability of the detection result of the specific area in the specific area detection step, on the information indicating the positional deviation for each area of the plurality of image signals, and that information is output.

16. The image processing method according to claim 14, wherein, in the specific area detection step, the type of the detected specific area is determined, and different weights are set, according to the determined type, on the information indicating the positional deviation for each area of the plurality of image signals, and that information is output.

17. The image processing method according to claim 14, wherein the specific area is a person's face, and, in the specific area detection step, an area where a person's face exists is detected according to a result of comparison with reference information indicating the shape of a person's face.

18. The image processing method according to claim 16, wherein the specific area is a person's face, and, in the specific area detection step, an individual is identified from the detected face according to a result of comparison with reference information indicating the shape of the individual's face, and different weights are set on the information indicating the positional deviation according to the identified individual, and that information is output.

19. The image processing method according to claim 14, wherein, in the deviation detection step, a positional deviation of the entire image signal is calculated using the information indicating the positional deviation on which different weights have been set for each area of the image signal.

20. An image processing method comprising:
a deviation detection step of detecting a positional deviation between a plurality of image signals; and
a face area detection step of detecting, from an image signal, an area where a person's face exists according to a result of comparison with information indicating the shape of a person's face,
wherein, in the deviation detection step, either information indicating the positional deviation of the area of the image signal where the person's face exists or information indicating the positional deviation of an area of the image signal different from the area where the person's face exists is output.

21. A program executable by an information processing apparatus, comprising program code for realizing the image processing method according to claim 14.