JP5245604B2

JP5245604B2 - Image processing apparatus and program

Info

Publication number: JP5245604B2
Application number: JP2008186861A
Authority: JP
Inventors: 博清水; 淳村木; 浩靖形川; 博之星野; 英里奈市川
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2008-07-18
Filing date: 2008-07-18
Publication date: 2013-07-24
Anticipated expiration: 2028-07-18
Also published as: JP2010028414A

Description

The present invention relates to an image processing apparatus having a face detection function, and to a program used in the image processing apparatus.

Imaging devices such as digital cameras and digital video cameras, which capture a subject using a CCD or CMOS solid-state image sensor, are conventionally known.

Images of people captured with such an imaging device are often used as ID photos. In that case a plain background behind the person is desirable, so the photographer has had to choose a location where the background is plain. To relax the conditions on where such photos can be taken, an image processing technique is desired that can extract the region showing the person, excluding the background, from a captured image.

Meanwhile, imaging devices with a function for detecting a person's face have appeared. Such a device detects parts such as the eyes, nose, and mouth, and can determine the approximate position and size of the face. However, because it is difficult for it to discriminate the parts of the person other than the face, or the person's contour, this function has not been suitable as a means of extracting the region showing the person.

Under such circumstances, Patent Document 1, for example, discloses a method of separating and extracting the region showing a person by quantizing the pixels in the image data into several levels based on brightness.
Patent Document 1: JP 2005-223523 A

However, the method of Patent Document 1 requires a clear difference in brightness between the pixels showing the person and the pixels showing the background, so its extraction accuracy becomes unstable when the brightness difference is slight or the background is complex. Depending on the imaging location, it has therefore been difficult to extract the region showing the person, excluding the background, from the captured image.

Accordingly, an object of the present invention is to provide an image processing apparatus that can easily extract the region showing a person, excluding the background, from a captured image.

The image processing apparatus according to claim 1 comprises: acquisition means for acquiring a first image and a second image that contain the same subject and have different backgrounds; detection means for detecting the face portion of the subject in the acquired first image and second image; displacement measuring means for measuring the relative displacement between the face portion of the first image and the face portion of the second image; association means for associating, based on the relative displacement, the pixels constituting the first image with the pixels constituting the second image; and determination means for comparing the associated pixels with each other, treating a pixel as a background portion when the difference between the pixel values is equal to or greater than a first predetermined value, and treating it as a non-background portion when the difference is less than the first predetermined value.

The image processing apparatus according to claim 2 further comprises selection means for selecting one of the first image and the second image, and setting means for setting, in the image selected by the selection means, the pixel values of the pixels determined by the determination means to be a background portion to the same value.

The image processing apparatus according to claim 3 further comprises second determination means that determines the center of the detected face portion, treats pixels whose distance from that center is equal to or greater than a third predetermined value as a background portion, and treats pixels whose distance from that center is equal to or less than a fourth predetermined value as a non-background portion. The determination means is executed on the region that remains after excluding the pixels decided by the second determination means.

In the image processing apparatus according to claim 4, the second determination means excludes from its determination, based on the orientation of the face portion, the range assumed to contain the torso of the human body, that is, the range in which the angle subtended about the neck direction as seen from the center of the subject's face portion falls within a predetermined angle.

The image processing apparatus according to claim 5 further comprises warning means for outputting a warning when the feature amount of the face portion of the first image differs greatly from the feature amount of the face portion of the second image.

In the image processing apparatus according to claim 6, when the feature amount of the face portion of the first image differs greatly from that of the second image, the warning means reports, based on the feature amount of the face portion of the second image, at least one of the position, size, and orientation that the face portion of the second image should take in order to bring its feature amount close to that of the face portion of the first image.

In the image processing apparatus according to claim 7, when the feature amount of the face portion of the first image and the feature amount of the face portion of the second image are not close to each other, the acquisition means acquires a new image and uses that new image as the second image.

In the image processing apparatus according to claim 8, the acquisition means acquires, as a first image group, a plurality of images that contain the subject against similar backgrounds and, as a second image group, a plurality of images that contain the subject against a background different from that of the first image group. The apparatus further comprises decision means that selects one image from each acquired image group to generate a plurality of combinations, and decides on the combination with the highest correlation between face portions as the first image and the second image.

The image processing apparatus according to claim 9 further comprises measurement means for measuring the degree of focus on the face portion of the subject, and the detection means detects the degree of focus measured by the measurement means as a feature amount of the face portions of the first image and the second image.

The image processing apparatus according to claim 10 further comprises: estimation means for estimating the non-background portions of the acquired first image and second image based on the feature amount of the detected face portion; and alignment means that aligns the face portions, aligns the non-background portion other than the face portion of the first image with the non-background portion other than the face portion of the second image based on the estimation result of the estimation means, and associates the pixels constituting the first image with the pixels constituting the second image based on these two alignment results.

The image processing apparatus according to claim 11 further comprises storage means that, when the user selects the non-background portion of the first image or the non-background portion of the second image, stores the non-background portion of the selected image.

The program according to claim 12 causes a computer to function as: acquisition means for acquiring a first image and a second image that contain the same subject and have different backgrounds; detection means for detecting the face portion of the subject in the acquired first image and second image; displacement measuring means for measuring the relative displacement between the face portion of the first image and the face portion of the second image; association means for associating, based on the relative displacement, the pixels constituting the first image with the pixels constituting the second image; and determination means for comparing the associated pixels with each other, treating a pixel as a background portion when the difference between the pixel values is equal to or greater than a first predetermined value, and treating it as a non-background portion when the difference is less than the first predetermined value.

According to the present invention, the region showing a person, excluding the background, can be easily extracted from a captured image.

Embodiments of the present invention are described below with reference to the drawings. In the following description, identical components are given identical reference numerals, and their description is omitted or simplified.

[First Embodiment]
FIG. 1 is a block diagram showing the schematic configuration of a digital camera 1 serving as the image processing apparatus according to the present embodiment. The digital camera 1 has a function for detecting a person's face portion in image data, and mainly comprises the following units.

Specifically, the digital camera 1 includes a photographing unit 2, a CPU 11, a key input unit 12, a face detection unit 13, a flash memory 14, a display unit 15, an external memory 16, a resolution conversion unit 17, a JPEG conversion unit 18, and an SDRAM 19.

The photographing unit 2 is the acquisition means of the present invention: it converts the optical image, including the subject, that passes through the lens into a digital signal and stores it in the SDRAM 19 as image data (the first image and the second image). The photographing unit 2 includes a timing generator (TG) 21, a CCD drive unit 22, a CCD 23, and an A/D conversion unit 24.

The CPU 11 controls the above units of the digital camera 1, operates on the data stored in the flash memory 14, the external memory 16, the SDRAM 19, and so on, and performs the processing according to the present embodiment, such as replacing the background portion of image data with a predetermined image.

That is, the CPU 11 serves as the following means of the present invention (the processing is described in detail later): the displacement measuring means, which measures the relative displacement between the face portion of the first image and the face portion of the second image; the association means, which associates the pixels constituting the first image with the pixels constituting the second image based on the relative displacement; the determination means, which compares the associated pixels to distinguish background from non-background portions; the second determination means, which distinguishes background from non-background portions without comparing pixel values; the setting means, which sets the pixel values of the pixels determined to be background to a predetermined value (second predetermined value) and generates a new image; the warning means, which outputs a warning when the feature amount of the face portion of the first image differs greatly from that of the second image; and the storage means, which stores the non-background portion of the image selected by the user.

The CPU 11 also functions as the measurement means of the present invention, which measures the degree of focus on the face portion of the subject, and as the estimation means of the present invention, which estimates the non-background portion based on the feature amount of the face portion.

The key input unit 12 includes a plurality of keys: a power key; a mode switching key for switching among the recording mode and playback mode, which are the basic operating modes of the digital camera 1, and the ID photo mode according to the present embodiment; a MENU key; a shutter key; and so on. The shutter key has a half-shutter function that allows two-stage operation: a half-pressed position with which the user gives advance notice of shooting, and a fully pressed position that instructs the actual shooting operation. The CPU 11 scans the operating state of each key of the key input unit 12 as needed.

The face detection unit 13 is the detection means of the present invention, which detects a person's face portion in the image data stored in the external memory 16 or the SDRAM 19. Specifically, the face detection unit 13 detects at least one of the position, size, and orientation of the face portion as a face parameter (feature amount).

The flash memory 14 is a rewritable nonvolatile memory that stores the various programs executed by the CPU 11 and the various data needed for processing by the CPU 11. The flash memory 14 also stores setting information for the various functions of the digital camera 1 set by the user.

The display unit 15 consists of a liquid crystal display and its drive circuit. In the recording mode, each time one frame of image data is stored in the SDRAM 19, the image data is converted into a video signal and displayed on the display unit 15 as a through image (live view). When shooting is performed in the recording mode, the image data temporarily stored in the SDRAM 19 is compression-encoded in JPEG format by the JPEG conversion unit 18 and then recorded as a still image file in the external memory 16, which consists of, for example, any of various memory cards.

In the playback mode, a still image file recorded in the external memory 16 is read out in response to the user's selection, decoded by the JPEG conversion unit 18, expanded into the SDRAM 19 as image data, and displayed on the display unit 15 as a still image.

In the ID photo mode, the shooting operation stores two pieces of image data (the first image and the second image) in the SDRAM 19, and the CPU 11 generates, for at least one of the two, new composite image data in which the pixel values of the background pixels are set to a predetermined value (second predetermined value). The new composite image data is then compression-encoded in JPEG format by the JPEG conversion unit 18 and recorded in the external memory 16 as a still image file.

The display unit 15 is also the warning means of the present invention: on a command from the CPU 11, it outputs a warning when the feature amount of the face portion of the first image differs greatly from the feature amount of the face portion of the second image.

The resolution conversion unit 17 converts the resolution of the image data stored in the SDRAM 19. Converting the resolution-converted image data into a video signal enlarges or reduces the image displayed on the display unit 15.

The timing generator 21 generates a timing signal, and the CCD drive unit 22 generates a drive signal based on this timing signal. The CCD 23 photoelectrically converts the optical image of the subject based on the drive signal output from the CCD drive unit 22 and outputs the result as an imaging signal. The A/D conversion unit 24 converts the imaging signal output from the CCD 23 into a digital signal and stores it in the SDRAM 19 as image data.

The processing of the digital camera 1 executed under the control of the CPU 11 is described in detail below.

FIG. 2 is a diagram outlining the processing according to the present embodiment. First, the digital camera 1 captures a first image (a) and a second image (b), each including the person's face. The person is photographed in front of a different background in each image, while the size, orientation, and so on of the person within the image are kept unchanged.

From the image data of both images, the CPU 11 aligns the face portion of the first image (a) with the face portion of the second image (b) based on the face parameters of the face portions detected by the face detection unit 13. In this way the CPU 11 calculates the relative displacement between the two pieces of image data and uses it to associate the pixels constituting the first image with the pixels constituting the second image.
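The displacement calculation can be illustrated with a minimal Python sketch. It assumes the simplest possible alignment, in which the displacement is just the offset between the two detected face centers; the `FaceParams` container and its field names are hypothetical and do not appear in the patent, which does not prescribe a specific alignment procedure.

```python
from typing import NamedTuple, Tuple


class FaceParams(NamedTuple):
    """Hypothetical container for the face parameters named in the text."""
    cx: int       # x coordinate of the face center, in pixels
    cy: int       # y coordinate of the face center, in pixels
    size: int     # face size, e.g. the width of the detected face box
    angle: float  # in-plane orientation, in degrees


def relative_displacement(face1: FaceParams, face2: FaceParams) -> Tuple[int, int]:
    """Return (a, b) so that pixel (x, y) of image 1 maps to (x + a, y + b) of image 2.

    Assumption: size and orientation already match, so only translation is needed.
    """
    return face2.cx - face1.cx, face2.cy - face1.cy


f1 = FaceParams(cx=100, cy=80, size=60, angle=0.0)
f2 = FaceParams(cx=112, cy=75, size=60, angle=0.0)
a, b = relative_displacement(f1, f2)
print(a, b)
```

A translation-only model is reasonable here because the shooting procedure already requires the face size and orientation to match between the two images.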

The CPU 11 then compares the pixels associated in this way. When the difference between their pixel values is equal to or greater than a predetermined value (first predetermined value), it determines the pixel to be a background portion; when the difference is less than the predetermined value (first predetermined value), it determines the pixel to be a non-background portion (person portion). In other words, when two pieces of image data with different backgrounds are used, corresponding pixels differ in value in the background portion but are identical or similar in the person portion, so comparing the values of corresponding pixels separates the background region from the non-background region.
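As a rough illustration of this comparison, the sketch below builds a background mask from two grayscale images stored as nested lists. The function name, the grayscale representation, and the treatment of pixels that have no counterpart in the second image are assumptions made for the example, not details taken from the patent.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values (an assumption)


def background_mask(img1: Image, img2: Image, a: int, b: int, r: int) -> List[List[bool]]:
    """Mark pixel (x, y) of img1 as background (True) when it differs from
    pixel (x + a, y + b) of img2 by r or more, and as non-background otherwise."""
    mask = []
    for y, row1 in enumerate(img1):
        mask_row = []
        for x, d1 in enumerate(row1):
            x2, y2 = x + a, y + b
            if 0 <= y2 < len(img2) and 0 <= x2 < len(img2[0]):
                mask_row.append(abs(d1 - img2[y2][x2]) >= r)
            else:
                # Assumption: a pixel with no counterpart is treated as background.
                mask_row.append(True)
        mask.append(mask_row)
    return mask


# The middle column is the "person": it looks alike in both images.
first = [[10, 200, 10], [10, 200, 10]]
second = [[90, 201, 95], [80, 198, 70]]
print(background_mask(first, second, a=0, b=0, r=20))
```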

The CPU 11 then sets, for at least one of the first image (a) and the second image (b), the pixel values of the pixels determined to be background to a predetermined value (second predetermined value), and generates a new image (c). As a result, the background portion of image (c) is filled with a predetermined color.
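The fill step is simple enough to show directly. This sketch assumes a grayscale image as nested lists and a boolean mask in which `True` marks background; both representations and the function name are illustrative choices, not part of the patent.

```python
from typing import List


def fill_background(img: List[List[int]], mask: List[List[bool]], fill_value: int) -> List[List[int]]:
    """Return a new image in which every background pixel (mask True) is
    replaced by fill_value; the input image is left unmodified."""
    return [
        [fill_value if is_background else pixel
         for pixel, is_background in zip(row, mask_row)]
        for row, mask_row in zip(img, mask)
    ]


image = [[10, 200, 10], [12, 198, 14]]
mask = [[True, False, True], [True, False, True]]
print(fill_background(image, mask, fill_value=255))
```

Returning a new image rather than editing in place mirrors the text, which describes generating a new image (c) from one of the captured images.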

FIGS. 3 and 4 are flowcharts showing the flow of the processing according to the present embodiment. When the mode switching key of the key input unit 12 is operated to set the ID photo mode, the CPU 11 reads from the flash memory 14 the program for executing the processing of the flowcharts shown in FIGS. 3 and 4, and executes the various processes according to this program.

First, in the shooting process shown in FIG. 3, the first image P_1 and the second image P_2 are recorded under the control of the CPU 11 in response to user operations.

In step S101, in response to the user pressing the shutter key, the CPU 11 controls the photographing unit 2 to acquire the first image P_1 and records it in the SDRAM 19.

In step S102, the CPU 11 has the face detection unit 13 detect the face portion in the first image P_1 recorded in step S101, and acquires face parameters such as position, size, and orientation.

In step S103, the CPU 11 has the face detection unit 13 detect the face portion, as in step S102, in the through images that are acquired one after another from the photographing unit 2 and serve as candidates for the second image, and acquires their face parameters.

In step S104, the CPU 11 determines whether the face parameters of the through image acquired in step S103 differ greatly from the face parameters of the first image acquired in step S102, that is, whether they deviate by more than a predetermined condition allows. If the determination is YES, the process proceeds to step S105; if NO, it proceeds to step S106.

In step S105, because the position, size, orientation, or another attribute of the face differs greatly between the first image and the second-image candidate, the CPU 11 prompts the user to adjust by outputting a warning on the display unit 15, for example "The face does not match the first image". In addition, to bring the face parameters of the second image close to those of the first image, the CPU 11 reports at least one change to be made to the face parameters of the second image. Specifically, as the change needed to satisfy the predetermined condition of step S104, the CPU 11 outputs a warning such as "The face is too close" or "Please turn to the left".

When the user adjusts according to these warnings, the deviation in the face parameters is reduced. Instead of, or in addition to, display output on the display unit 15, the warning may be output as audio from a speaker (not shown), as light from an LED (Light Emitting Diode) indicating the predetermined warning content, or the like.
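One way such guidance could be derived from the face parameters is sketched below. The tolerance values, the `yaw` sign convention, and every message other than "Face is too close" and "Please turn to the left" are assumptions made for illustration; the patent only states that a predetermined condition is checked and that the needed change is reported.

```python
from typing import List, NamedTuple


class FaceParams(NamedTuple):
    """Hypothetical face parameters: center, size, and horizontal orientation."""
    cx: float
    cy: float
    size: float
    yaw: float  # degrees; the sign convention is an assumption


def adjustment_hints(first: FaceParams, candidate: FaceParams,
                     pos_tol: float = 20.0, size_tol: float = 0.15,
                     yaw_tol: float = 10.0) -> List[str]:
    """Return the warnings to show; an empty list means no large deviation."""
    hints = []
    ratio = candidate.size / first.size
    if ratio > 1 + size_tol:
        hints.append("Face is too close")
    elif ratio < 1 - size_tol:
        hints.append("Face is too far away")
    if abs(candidate.cx - first.cx) > pos_tol or abs(candidate.cy - first.cy) > pos_tol:
        hints.append("Face position does not match the first image")
    if candidate.yaw - first.yaw > yaw_tol:
        hints.append("Please turn to the left")
    elif first.yaw - candidate.yaw > yaw_tol:
        hints.append("Please turn to the right")
    return hints


reference = FaceParams(cx=100, cy=80, size=60, yaw=0)
print(adjustment_hints(reference, FaceParams(cx=100, cy=80, size=90, yaw=0)))
```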

In step S106, the CPU 11 determines whether the face parameters of the through image acquired in step S103 match the face parameters of the first image acquired in step S102 to at least a predetermined degree of similarity. If the determination is YES, the face portions of the first image and the through image can be aligned, so the process proceeds to step S107. If the determination is NO, the face portions are difficult to align and the background cannot be separated accurately, so the process returns to step S103 to acquire a new through image as a second-image candidate.

In step S107, the CPU 11 decides on the through image acquired in step S103 as the second image P_2 and records it in the SDRAM 19.
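The loop formed by steps S103, S106, and S107 can be sketched as a search over candidate frames. The function below is a simplified illustration: the frame and parameter types, the `is_similar` callback, and the omission of the warning branch (steps S104 and S105) are all assumptions for the example.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")
Params = TypeVar("Params")


def select_second_image(first_params: Params,
                        candidates: Iterable[Tuple[Frame, Params]],
                        is_similar: Callable[[Params, Params], bool]) -> Optional[Frame]:
    """Return the first candidate frame whose face parameters are close enough
    to those of the first image, or None if the candidates run out."""
    for frame, params in candidates:          # S103: next through image
        if is_similar(first_params, params):  # S106: similarity check
            return frame                      # S107: adopt as second image
    return None


# Toy data: the only face parameter here is the face size in pixels.
live_view = [("frame1", 80), ("frame2", 62), ("frame3", 60)]
chosen = select_second_image(60, live_view, lambda p, q: abs(p - q) <= 5)
print(chosen)
```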

Next, the image composition process shown in FIG. 4 is executed following the shooting process of FIG. 3 and, under the control of the CPU 11, generates a new image in which the background behind the person is set to a predetermined value. The description below refers to the explanatory diagram of FIG. 5 as appropriate.

In step S201, the CPU 11 has the face detection unit 13 detect the face portion of the person (non-background portion) C in the first image P_1, and acquires the center P_F(m, n) of the face portion based on the position of the detected face portion (see FIG. 5).

In step S202, the CPU 11 has the face detection unit 13 detect the face portion in the second image P_2.

In step S203, the CPU 11 aligns the face parameters obtained by the face detection in steps S201 and S202, and thereby calculates the relative displacement (a, b) of the second image P_2 with respect to the first image P_1.

In step S204, based on the size of the face portion detected in step S201, the CPU 11 sets a circle of radius L_1 that marks the outside of the face portion and a circle of radius L_2 that is contained within the face portion (see FIG. 5).

In step S205, the CPU 11 acquires D_1, the pixel value at coordinates (x, y) in the first image P_1, and D_2, the pixel value at coordinates (x + a, y + b) in the second image P_2. Here, the coordinates (x + a, y + b) are the coordinates (x, y) shifted by the relative displacement (a, b) calculated in step S203; the pixels of the first image P_1 and the second image P_2 are associated with each other through this relative displacement (a, b).

In step S206, the CPU 11 determines whether P_1(x, y) lies inside the circle of radius L_1. Specifically, if the condition [(x − m)^2 + (y − n)^2 < L_1^2] is satisfied, the determination is YES and the process proceeds to step S207. If the condition is not satisfied, the determination is NO and the process proceeds to step S208.

In step S207, since P_1(x, y) lies inside the circle of radius L_1, the CPU 11 determines whether P_1(x, y) lies outside the circle of radius L_2. Specifically, if the condition [(x − m)^2 + (y − n)^2 > L_2^2] is satisfied, the determination is YES and the process proceeds to step S209. If the condition is not satisfied, the determination is NO; the pixel can be judged to be a non-background portion within region B (see FIG. 5), so the process proceeds to step S210.

In step S208, since P₁(x, y) is outside the circle of radius L₁, the CPU 11 determines whether P₁(x, y) lies in the direction of the neck as seen from the center position P_F(m, n) of the face and within the range of a predetermined angle θ (see FIG. 5). The neck direction is estimated from the arrangement of the components of the detected face portion as the direction in which the neck is likely to be. This range is the range assumed to contain the person's torso, and may be designed as appropriate, for example as a range whose viewing angle about the neck direction, as seen from the center of the face portion, is 180 degrees or 120 degrees. If the determination is YES, P₁(x, y) may be a non-background portion such as the neck, shoulders, or torso, so the process proceeds to step S209. If the determination is NO, P₁(x, y) can be judged to lie in region A (see FIG. 5) and to be a background portion, so the process proceeds to step S211.
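The geometric tests of steps S206 to S208 can be summarized in a short sketch. Several points here are assumptions made for illustration: the neck direction is passed in as a 2-D vector `neck_dir`, the angle θ is treated as the full opening angle of a sector centered on that direction, and the function name and the return labels are hypothetical.

```python
import math


def classify_region(x, y, m, n, l1, l2, neck_dir, theta_deg):
    """Classify P1(x, y) by geometry alone.

    Returns "B" (non-background, inside radius l2), "A" (background),
    or "compare" (undecided; the pixel values must be compared).
    """
    dx, dy = x - m, y - n
    r2 = dx * dx + dy * dy
    if r2 < l1 * l1:            # S206: inside the outer circle
        if r2 > l2 * l2:        # S207: outside the inner circle
            return "compare"
        return "B"
    # S208: outside the outer circle, so test the neck-direction sector.
    nx, ny = neck_dir
    norm = math.hypot(dx, dy) * math.hypot(nx, ny)
    if norm == 0:
        return "compare"        # degenerate input; leave undecided
    cos_angle = max(-1.0, min(1.0, (dx * nx + dy * ny) / norm))
    angle = math.degrees(math.acos(cos_angle))
    return "compare" if angle <= theta_deg / 2 else "A"
```

With image coordinates in which y grows downward, an upright face would use `neck_dir = (0, 1)`; pixels below the face then stay undecided, while pixels above or beside it fall into region A.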

In step S209, P₁(x, y) lies in neither region A nor region B, so it cannot be immediately classified as background or non-background. The CPU 11 therefore compares the pixel values D₁ and D₂ and determines whether their difference is less than a threshold R. If the determination is YES, the process proceeds to step S210; if NO, it proceeds to step S211.

In step S210, the CPU 11 determines that P₁(x, y) is a non-background portion and records the pixel values D₁ and D₂ in the SDRAM 19 as the pixel values of composite images 1 and 2, respectively.

In step S211, the CPU 11 determines that P₁(x, y) is a background portion, replaces the pixel values D₁ and D₂ with the background pixel value D₃, and records D₃ in the SDRAM 19 as the pixel value of both composite images 1 and 2.

In step S212, the CPU 11 determines whether every pixel has been classified as background or non-background and the pixel values of composite images 1 and 2 have been recorded. Pixels that could not be associated through the relative displacement amount (a, b) may be judged to be background, with the background pixel value D₃ used as the pixel value in composite image 1 or 2. If the determination is YES, the process proceeds to step S213; if NO, the process returns to step S205 and is repeated for other coordinates.
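Taken together, steps S205 to S212 amount to a loop over the pixels of P₁ that fills two composite images. The sketch below is one possible reading, assuming grayscale list-of-lists images and an absolute difference as the pixel comparison; `build_composites` and the optional `classify` callback (returning "A", "B", or "compare") are hypothetical names introduced here.

```python
def build_composites(p1, p2, a, b, r, d3, classify=None):
    """Return (composite1, composite2) for displacement (a, b).

    r is the difference threshold, d3 the background fill value.
    Every pixel starts as background, so pixels that cannot be associated
    through (a, b) keep the value d3.
    """
    h1, w1 = len(p1), len(p1[0])
    h2, w2 = len(p2), len(p2[0])
    out1 = [[d3] * w1 for _ in range(h1)]
    out2 = [[d3] * w2 for _ in range(h2)]
    for y in range(h1):
        for x in range(w1):
            x2, y2 = x + a, y + b
            if not (0 <= x2 < w2 and 0 <= y2 < h2):
                continue                      # unmatched: stays background
            d1, d2 = p1[y][x], p2[y2][x2]
            region = classify(x, y) if classify else "compare"
            if region == "B" or (region == "compare" and abs(d1 - d2) < r):
                out1[y][x] = d1               # non-background: keep D1
                out2[y2][x2] = d2             # non-background: keep D2
    return out1, out2
```

Initializing both outputs to D₃ avoids a separate pass for unmatched pixels; whether the actual device proceeds this way is not stated in the description.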

In step S213, the CPU 11 displays composite images 1 and 2 recorded in the SDRAM 19 on the display unit 15 and accepts a selection input from the user indicating which composite image's non-background portion is to be used.

In step S214, the CPU 11 has the JPEG conversion unit 18 JPEG-encode composite image 1 or 2, whichever the user selected in step S213.

In step S215, the CPU 11 records the image data JPEG-encoded in step S214 in the external memory 16. As a result, new image data in which the background portion behind the person in the first image P₁ or the second image P₂ has been replaced with a predetermined value is generated and recorded.

As described above, according to this embodiment, alignment is performed using the face parameters of the face portions in the two images, so the background and the non-background can be separated easily regardless of the background pattern. Furthermore, although face detection alone gives only the approximate shape of the face, comparing corresponding pixel values in the two aligned images makes it possible to separate non-background portions other than the face, such as the hair, neck, and shoulders.

In addition, because region A and region B are set from the center position and size of the face, and background and non-background are discriminated in those regions without comparing pixel values, the processing time required to generate the composite images is shortened.

In addition, the direction of the neck (the torso portion) is determined from the orientation of the face, and a predetermined region in that direction is excluded from region A (the region that does not contain the person). This suppresses the erroneous detection of background that would occur if the torso, which is not part of the face, fell within region A.

In addition, a warning is output when the face parameters differ greatly between the two images, which helps prevent situations in which alignment fails and the background cannot be separated from the non-background.

In addition, two composite images are generated and the user can choose between them. When generating an image that will be kept for a long time, such as an ID photo, the user wants the image with the best possible expression; this embodiment lets the user select whichever of the two images the user prefers.

In this embodiment, the warning output and the determination of the second image are based on a comparison of face parameters obtained by face detection, but the invention is not limited to this. For example, the CPU 11 may measure the degree of focus or the focal length for the face portion of the subject, output a warning when these values diverge, and determine the image as the second image when they are close.

In this embodiment, the second image is determined automatically when the face parameters are similar, but the invention is not limited to this. For example, after alignment using the face parameters, an image may be determined as the second image (that is, the shutter may be released) when the difference in pixel values in the background portion (for example, region A) is at or above a predetermined value and the difference in pixel values in the non-background portion (for example, region B) is at or below a predetermined value. This selects two pieces of image data whose background pixel values diverge and whose non-background pixel values are close, so the background and the non-background can be separated accurately.
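The alternative trigger condition described above can be illustrated with a small sketch. The description does not say how the "difference in pixel values" over a region is aggregated; the mean absolute difference used here is an assumption, as are the function names, the coordinate-list representation of regions A and B, and the parameters `bg_min` and `fg_max`.

```python
def mean_abs_diff(p1, p2, a, b, coords):
    """Mean |D1 - D2| over the (x, y) coordinates that can be associated."""
    total, count = 0, 0
    for x, y in coords:
        x2, y2 = x + a, y + b
        if (0 <= y < len(p1) and 0 <= x < len(p1[0])
                and 0 <= y2 < len(p2) and 0 <= x2 < len(p2[0])):
            total += abs(p1[y][x] - p2[y2][x2])
            count += 1
    return total / count if count else None


def accept_as_second_image(p1, p2, a, b, region_a, region_b, bg_min, fg_max):
    """True when the background differs enough and the subject matches."""
    bg = mean_abs_diff(p1, p2, a, b, region_a)   # e.g. region A
    fg = mean_abs_diff(p1, p2, a, b, region_b)   # e.g. region B
    if bg is None or fg is None:
        return False
    return bg >= bg_min and fg <= fg_max
```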

In this embodiment, alignment is performed using the face parameters, but the invention is not limited to this. For example, non-background portions other than the face (the torso portion) may additionally be aligned. Specifically, based on the face parameters of the face portion found by face detection, a region in a predetermined positional relationship to the face is estimated to be a non-background portion, and that region is aligned by pixel-to-pixel comparison such as block matching. By combining several alignment methods in this way, the CPU 11 can perform alignment accurately.
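As an illustration of the block matching mentioned above, the following sketch refines a face-based displacement by minimizing the sum of absolute differences (SAD) over an estimated torso box. SAD, the exhaustive search window, and all names (`block_match`, `box`, `init`, `search`) are assumptions; the description names block matching only as one example of pixel-based alignment.

```python
def block_match(p1, p2, box, init, search=2):
    """Refine displacement (a, b) for a block of p1 by minimizing SAD.

    box    = (x0, y0, w, h), assumed to lie fully inside p1
    init   = (a, b) obtained from the face-based alignment
    search = half-width of the search window around init, in pixels
    Returns the best (a, b), or None if no candidate fits inside p2.
    """
    x0, y0, w, h = box
    best, best_cost = None, None
    for db in range(-search, search + 1):
        for da in range(-search, search + 1):
            a, b = init[0] + da, init[1] + db
            # Skip candidates whose block would leave the second image.
            if not (0 <= x0 + a and x0 + a + w <= len(p2[0])
                    and 0 <= y0 + b and y0 + b + h <= len(p2)):
                continue
            cost = sum(abs(p1[y0 + j][x0 + i] - p2[y0 + b + j][x0 + a + i])
                       for j in range(h) for i in range(w))
            if best_cost is None or cost < best_cost:
                best, best_cost = (a, b), cost
    return best
```

How the face-based and block-based results are then combined into a single displacement is left open here, as the description does not specify it.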

In this embodiment, two composite images are generated as new image data, but the invention is not limited to this. Three or more images may be acquired and three or more composite images generated, which lets the user select a preferred image from a larger set of candidate face images.

In this embodiment, background and non-background are discriminated based on the difference in pixel values, but the invention is not limited to this, and the discrimination result may be corrected. Specifically, when a region of pixels classified as non-background is surrounded by pixels classified as background and is no larger than a predetermined area, the pixels in that region are corrected to background. Likewise, when a region of pixels classified as background is surrounded by pixels classified as non-background and is no larger than a predetermined area, the pixels in that region are corrected to non-background. This can correct pixels whose classification was wrong and may allow the composite image to be generated more accurately.
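One way to picture this correction is a connected-component pass over the classification mask. The sketch below rests on assumptions not stated in the description: the mask is a list of lists of booleans (True for non-background), regions are 4-connected, and a region counts as "surrounded" when it does not touch the image border. The name `correct_small_regions` is hypothetical.

```python
from collections import deque


def correct_small_regions(mask, max_area):
    """Flip enclosed regions of at most max_area pixels to the other class."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in mask]
    for sy in range(h):
        for sx in range(w):
            if seen[sy][sx]:
                continue
            value = mask[sy][sx]
            seen[sy][sx] = True
            queue = deque([(sx, sy)])
            component, touches_border = [], False
            while queue:                      # breadth-first flood fill
                x, y = queue.popleft()
                component.append((x, y))
                if x in (0, w - 1) or y in (0, h - 1):
                    touches_border = True
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < w and 0 <= ny < h
                            and not seen[ny][nx] and mask[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if not touches_border and len(component) <= max_area:
                for x, y in component:
                    out[y][x] = not value
    return out
```

Components are found on the original mask and written to a copy, so both correction directions are applied in a single pass without one affecting the other.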

[Second Embodiment]
Next, a second embodiment of the present invention will be described with reference to the drawings. In this embodiment, the operation of the digital camera 1 differs from that of the first embodiment. In the first embodiment, new composite image data was generated from two pieces of image data recorded in the SDRAM 19. In this embodiment, by contrast, a plurality of candidate images are recorded for each of the first image and the second image.

Here, the CPU 11 functions as determination means that extracts two images from three or more acquired images to generate a plurality of combinations of a first image and a second image, and determines, from among those combinations, the combination with the highest correlation of the face portion as the first image and the second image.

FIG. 6 is a flowchart showing the flow of the photographing process according to this embodiment. This process is executed in place of the photographing process of the first embodiment (FIG. 3) and is followed by the image composition process (FIG. 4).

In step S301, the CPU 11 determines whether the user has pressed the shutter key. The CPU 11 waits until the determination is YES, and then proceeds to step S302 to acquire the first image.

In step S302, the CPU 11 controls the photographing unit 2 to acquire a candidate image for the first image and records it in the SDRAM 19.

In step S303, the CPU 11 has the face detection unit 13 detect the face portion in the candidate image recorded in step S302 and acquires face parameters such as position, size, and orientation.

In step S304, the CPU 11 determines whether a predetermined number of candidate images for the first image have been recorded. If the determination is YES, acquisition of candidate images for the first image ends and the process proceeds to step S305. If the determination is NO, the process returns to step S302 to record the next candidate image.

In step S305, the CPU 11 determines whether the user has pressed the shutter key. The CPU 11 waits until the determination is YES, and then proceeds to step S306 to acquire the second image. To prompt the user to press the shutter key a second time, the CPU 11 preferably produces a visual indication on a display means such as the display unit 15 or an LED, or an audio output through a speaker.

In step S306, the CPU 11 controls the photographing unit 2 to acquire a candidate image for the second image and records it in the SDRAM 19.

In step S307, the CPU 11 has the face detection unit 13 detect the face portion in the candidate image recorded in step S306 and acquires face parameters such as position, size, and orientation.

In step S308, the CPU 11 determines whether a predetermined number of candidate images for the second image have been recorded. If the determination is YES, acquisition of candidate images for the second image ends and the process proceeds to step S309. If the determination is NO, the process returns to step S306 to record the next candidate image.

In step S309, the CPU 11 selects one image from the candidate image group for the first image and one from the candidate image group for the second image. To do so, the CPU 11 evaluates the similarity of the face parameters for every combination and determines the combination with the highest similarity as the first image P₁ and the second image P₂. The criterion is not limited to face parameter similarity; for example, the combination with the highest correlation of the face portion after alignment using the face parameters, or a combination with low correlation in the portion estimated to be background, may be determined as the first image and the second image.
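The exhaustive pair selection of step S309 might look like the sketch below. The description does not define the similarity measure, so the weighted sum of absolute differences over (x, y, size, orientation) tuples is purely an assumption, as are the names `face_distance` and `pick_best_pair`.

```python
from itertools import product


def face_distance(f1, f2, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted L1 distance between two face-parameter tuples.

    A smaller distance is treated as a higher similarity.
    """
    return sum(w * abs(u - v) for w, u, v in zip(weights, f1, f2))


def pick_best_pair(group1, group2):
    """Return indices (i, j) of the most similar pair across the two groups."""
    return min(product(range(len(group1)), range(len(group2))),
               key=lambda ij: face_distance(group1[ij[0]], group2[ij[1]]))
```

For example, with `group1 = [(100, 100, 50, 0), (120, 100, 52, 5)]` and `group2 = [(90, 100, 40, 10), (119, 101, 52, 4)]`, the pair `(1, 1)` has the smallest distance and would be chosen as P₁ and P₂.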

As described above, according to this embodiment, two images are selected automatically from a plurality of candidate images and the background is separated from the non-background. Images from which background and non-background can be separated accurately are thus selected without troubling the user, so composite images can be generated efficiently.

The present invention is not limited to the first or second embodiment; modifications, improvements, and the like within a scope that achieves the object of the present invention are included in the present invention.

For example, in the first and second embodiments the image sensor is described as the CCD 23, but another image sensor such as a CMOS sensor may be used. Also, part of the functions of the displacement measuring means, association means, discrimination means, setting means, second discrimination means, warning means, determination means, measuring means, estimation means, and storage means may be performed by a control unit other than the CPU 11.

The present invention is not limited to digital cameras. It can also be applied to imaging devices with a still-image capture function, such as camera-equipped mobile phone terminals, and to other image processing apparatuses that have a function of generating a new image from a plurality of images. Such image processing apparatuses include PCs (Personal Computers), PDAs (Personal Digital Assistants), and the like, in which the functions described above are realized by operating according to a predetermined program.

FIG. 1 is a block diagram showing a schematic configuration of the digital camera according to the first embodiment.
FIG. 2 is a diagram showing an outline of the processing according to the first embodiment.
FIG. 3 is a flowchart showing the flow of the photographing process according to the first embodiment.
FIG. 4 is a flowchart showing the flow of the image composition process according to the first embodiment.
FIG. 5 is a diagram explaining the method of discriminating between the background portion and the non-background portion according to the first embodiment.
FIG. 6 is a flowchart showing the flow of the photographing process according to the second embodiment.

Explanation of symbols

C Non-background portion
D₁, D₂ Pixel values
D₃ Background pixel value (second predetermined value)
L₁ Predetermined distance (third predetermined value)
L₂ Predetermined distance (fourth predetermined value)
P₁ First image
P₂ Second image
P_F Center of the face portion
R Threshold (first predetermined value)
1 Digital camera (image processing apparatus)
2 Photographing unit (acquisition means)
11 CPU (displacement measuring means, association means, discrimination means, setting means, second discrimination means, warning means, determination means, measuring means, estimation means, storage means)
13 Face detection unit (detection means)
15 Display unit (warning means)

Claims

An image processing apparatus comprising:
acquisition means for acquiring a first image and a second image that include the same subject and have different backgrounds;
detection means for detecting a face portion of the subject from the acquired first image and second image;
displacement measuring means for measuring a relative displacement between the face portion of the first image and the face portion of the second image;
association means for associating pixels constituting the first image with pixels constituting the second image based on the relative displacement; and
discrimination means for comparing the associated pixels with each other, setting a pixel as a background portion if its pixel value difference is equal to or greater than a first predetermined value, and setting the pixel as a non-background portion if its pixel value difference is less than the first predetermined value.

The image processing apparatus according to claim 1, further comprising:
selection means for selecting one of the first image and the second image; and
setting means for setting, in the image selected by the selection means, the pixel values of the pixels determined to be a background portion by the discrimination means to the same value.

The image processing apparatus according to claim 2, further comprising
second discrimination means for determining the center of the detected face portion, setting pixels whose distance from the center of the face portion is equal to or greater than a third predetermined value as a background portion, and setting pixels whose distance from the center of the face portion is equal to or less than a fourth predetermined value as a non-background portion,
wherein the discrimination means is applied to the area excluding the pixels handled by the second discrimination means.

The image processing apparatus according to claim 3,
wherein the second discrimination means excludes from its discrimination targets, based on the orientation of the face portion, a range assumed to contain the torso of a human body, namely the range in which the viewing angle about the neck direction as seen from the center of the face portion of the subject is within a predetermined angle range.

The image processing apparatus according to any one of claims 1 to 4,
further comprising warning means for outputting a warning when a feature amount of the face portion of the first image and a feature amount of the face portion of the second image differ greatly.

The image processing apparatus according to claim 5,
wherein, when the feature amount of the face portion of the first image and the feature amount of the face portion of the second image differ greatly, the warning means indicates, based on the feature amount of the face portion of the second image, at least one of the position, size, and orientation of the face portion of the second image that would bring the feature amount of the face portion of the second image close to the feature amount of the face portion of the first image.

The image processing apparatus according to any one of claims 1 to 6,
wherein the acquisition means acquires a new image when the feature amount of the face portion of the first image and the feature amount of the face portion of the second image are not close to each other, and sets the new image as the second image.

The image processing apparatus according to claim 1,
wherein the acquisition means acquires, as a first image group, a plurality of images that include the subject and similar backgrounds and, as a second image group, a plurality of images that include the subject and a background different from that of the first image group,
the apparatus further comprising determination means for selecting one image from each of the acquired image groups to generate a plurality of combinations, and determining, from among the plurality of combinations, the combination with the highest correlation of the face portion as the first image and the second image.

The image processing apparatus according to any one of claims 1 to 8,
further comprising measuring means for measuring the degree of focus on the face portion of the subject,
wherein the detection means detects the degree of focus measured by the measuring means as a feature amount of the face portions of the first image and the second image.

The image processing apparatus according to any one of claims 1 to 9, further comprising:
estimation means for estimating non-background portions of the acquired first image and second image based on the detected feature amount of the face portion; and
alignment means for aligning the face portions and, based on the estimation result of the estimation means, additionally aligning the non-background portion other than the face portion of the first image with the non-background portion other than the face portion of the second image, and for associating the pixels constituting the first image with the pixels constituting the second image based on the two alignment results.

The image processing apparatus according to claim 1,
further comprising storage means for storing, when a user selects either the non-background portion of the first image or the non-background portion of the second image, the non-background portion of the selected image.

A program for causing a computer to function as:
acquisition means for acquiring a first image and a second image that include the same subject and have different backgrounds;
detection means for detecting a face portion of the subject from the acquired first image and second image;
displacement measuring means for measuring a relative displacement between the face portion of the first image and the face portion of the second image;
association means for associating pixels constituting the first image with pixels constituting the second image based on the relative displacement; and
discrimination means for comparing the associated pixels with each other, setting a pixel as a background portion if its pixel value difference is equal to or greater than a first predetermined value, and setting the pixel as a non-background portion if its pixel value difference is less than the first predetermined value.