JP5610245B2

JP5610245B2 - Image composition apparatus, image composition method, image composition program, and recording medium

Info

Publication number: JP5610245B2
Application number: JP2013175399A
Authority: JP
Inventors: 平井　義人; 義人平井; 健三浦
Original assignee: Morpho Inc
Current assignee: Morpho Inc
Priority date: 2013-08-27
Filing date: 2013-08-27
Publication date: 2014-10-22
Anticipated expiration: 2031-10-14
Also published as: JP2013243780A

Description

The present invention relates to an image composition device, an image composition method, an image composition program, and a recording medium.

Conventionally, image composition devices that perform high dynamic range (HDR) composition are known (see Patent Document 1). Such a device apparently expands the dynamic range of a video signal by combining a plurality of frames captured sequentially under different exposure conditions. This eliminates the blown-out highlights and crushed shadows (portions where the luminance level is extremely high or low) that occur, for example, in backlit scenes. To handle the positional shift over time between frames caused by camera shake, the device also applies a coordinate transformation to each frame before performing HDR composition. Specifically, it uses image motion information to perform HDR composition on the area common to the two frames. This eliminates the positional shift of the frame (image sensor) relative to the subject (screen shake).

Japanese Patent No. 3110797

When the subject is a moving object, however, its position differs among the sequentially captured images. The image composition device described in Patent Document 1 therefore combines the images on the assumption that a color change was caused by the difference in exposure, even when the color actually changed because the subject moved. As a result, an appropriate composite image may not be generated. In this technical field, there is a demand for an image composition device, an image composition method, and an image composition program that can generate an appropriate composite image even when the subject moves, as well as a recording medium storing the image composition program.

That is, an image composition device according to one aspect of the present invention generates a composite image using a first image and a second image having different exposure conditions. The device includes an input unit, a likelihood calculation unit, an exposure estimation unit, and a composition unit. The input unit inputs the first image and the second image. Before an exposure conversion function that matches the exposure conditions of the first image and the second image is estimated, the likelihood calculation unit calculates a moving subject likelihood at each pixel based on the difference between the first image and the second image. The exposure estimation unit estimates the exposure conversion function based on the moving subject likelihood. The composition unit combines the first image and the second image using the exposure conversion function. Further, the exposure estimation unit estimates the exposure conversion function by assigning a smaller weight to a pixel the higher its moving subject likelihood.

In this image composition device, the moving subject likelihood at each pixel is calculated based on the difference between the first image and the second image before the exposures of the two images are matched. The exposure conversion function that matches the exposure conditions of the first image and the second image is then estimated based on the moving subject likelihood. Because the moving subject likelihood is taken into account when the exposures are matched, the exposures can be matched while excluding, for example, regions whose color may have changed because of subject motion. An appropriate composite image can therefore be generated.
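The weighting idea can be illustrated with a minimal sketch. It is not the implementation described in this document: it assumes a power-law exposure conversion function f(K) = a * K^b (a hypothetical model chosen only for illustration), fits it by weighted least squares in the log domain, and assumes a weight of 1 - likelihood with the moving subject likelihood normalized to [0, 1]. The function name `fit_exposure_conversion` is likewise hypothetical.

```python
import math


def fit_exposure_conversion(k_values, u_values, likelihoods):
    """Weighted log-domain least-squares fit of U ~= a * K**b.

    Assumptions (not from the source): the power-law model, the weight
    w = 1 - likelihood, and likelihoods normalized to [0, 1].
    """
    sw = sx = sy = sxx = sxy = 0.0
    for k, u, p in zip(k_values, u_values, likelihoods):
        if k <= 0 or u <= 0:
            continue  # the logarithm is undefined; skip such samples
        w = 1.0 - p  # higher moving subject likelihood -> smaller weight
        x, y = math.log(k), math.log(u)
        sw += w
        sx += w * x
        sy += w * y
        sxx += w * x * x
        sxy += w * x * y
    denom = sw * sxx - sx * sx
    if abs(denom) < 1e-12:
        raise ValueError("not enough independent samples")
    b = (sw * sxy - sx * sy) / denom
    a = math.exp((sy - b * sx) / sw)
    return a, b


# Static pixels follow U = 2 * K; the last sample lies on a moving subject
# (likelihood 1.0), so it receives weight 0 and does not disturb the fit.
k = [20.0, 40.0, 80.0, 100.0]
u = [40.0, 80.0, 160.0, 30.0]
p = [0.0, 0.0, 0.0, 1.0]
a, b = fit_exposure_conversion(k, u, p)
```

With the outlier down-weighted to zero, the fit recovers the relation between the static pixels only, which is the effect the paragraph above describes.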

In one embodiment, the likelihood calculation unit may normalize the pixels neighboring a pixel of the first image and the pixels neighboring the corresponding pixel of the second image, and calculate the moving subject likelihood at each pixel based on the difference between the normalized neighboring pixels. With this configuration, the moving subject likelihood at each pixel can be calculated appropriately.

In one embodiment, the likelihood calculation unit may use a plurality of first processed images, obtained by changing the resolution of the first image in stages, and a plurality of second processed images, obtained by changing the resolution of the second image in stages, to calculate the difference at each pixel for each resolution, and may calculate the moving subject likelihood of each pixel by weighting the differences obtained for the respective resolutions. With this configuration, the moving subject likelihood of each pixel can be calculated accurately.

In one embodiment, the likelihood calculation unit may weight the differences obtained for the respective resolutions based on the reliability of the difference between the first image and the second image and on the image size or resolution of the first processed image or the second processed image. With this configuration, the moving subject likelihood of each pixel can be calculated even more accurately.

In one embodiment, the composition unit may calculate the moving subject likelihood at each pixel based on the difference between the first image and the second image, and combine the first image and the second image using the moving subject likelihood and the exposure conversion function. With this configuration, the images can be combined with subject motion taken into account, so an appropriate composite image can be generated.

In one embodiment, the composition unit may generate a luminance base mask indicating the composition ratio of the pixel values of the first image and the second image, based on the magnitude of the original luminance value of the first image or the second image. The composition unit may also generate a subject blur mask indicating the composition ratio of the pixel values of the first image and the second image, based on the difference between the first image and the second image. Further, the composition unit may combine the luminance base mask and the subject blur mask to generate a composite mask used to combine the pixel values of the first image and the second image.

With this configuration, a subject blur mask, which is distinct from the luminance base mask that combines pixels on the basis of luminance values, can be generated from the difference between the first image and the second image with their exposures matched. Only the regions in which subject blur occurs can therefore be combined by a different process, and a composite image with suppressed subject blur can be generated.

In one embodiment, the composition unit may calculate the moving subject likelihood at each pixel based on the difference between the first image and the second image, and generate the subject blur mask based on the moving subject likelihood. With this configuration, the regions in which subject blur occurs can be identified from the moving subject likelihood and the subject blur mask can be generated accordingly.

In one embodiment, the composition unit may use a plurality of first processed images, obtained by changing the resolution of the first image in stages, and a plurality of second processed images, obtained by changing the resolution of the second image in stages, to calculate the difference at each pixel for each resolution, calculate the moving subject likelihood of each pixel by weighting the differences obtained for the respective resolutions, and generate the subject blur mask based on the moving subject likelihood. With this configuration, the moving subject likelihood of each pixel can be calculated accurately.

In one embodiment, the composition unit may detect regions formed by adjacent pixels whose moving subject likelihood is equal to or less than a predetermined threshold, assign an identification label to each region, and generate a subject blur mask for each region. With this configuration, the images can be combined appropriately even when the image contains moving objects that move differently from one another.
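The region detection and labeling step can be sketched as a connected-component search. This is only an illustrative sketch: the use of 4-connectivity, the breadth-first search, the helper name `label_regions`, and the sample likelihood values are assumptions, and the thresholding simply follows the wording above (likelihood equal to or less than the threshold).

```python
from collections import deque


def label_regions(selected):
    """Label 4-connected regions of True cells; label 0 means 'not selected'."""
    h, w = len(selected), len(selected[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if not selected[y][x] or labels[y][x]:
                continue
            next_label += 1  # a new, so far unlabeled region starts here
            labels[y][x] = next_label
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and selected[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels, next_label


likelihood = [
    [0.1, 0.1, 0.9, 0.9],
    [0.1, 0.9, 0.9, 0.2],
    [0.9, 0.9, 0.2, 0.2],
]
threshold = 0.5  # illustrative value
selected = [[v <= threshold for v in row] for row in likelihood]
labels, count = label_regions(selected)
```

Each label can then be processed independently, for example to build one subject blur mask per labeled region.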

In one embodiment, the composition unit may generate, as the subject blur mask, either a first mask that forces selection of the pixel value with the lower luminance value from the first image and the second image, or a second mask that forces selection of the pixel value with the higher luminance value from the first image and the second image. With this configuration, one of the first image and the second image can be forcibly selected for a region in which the subject may be moving. This prevents the subject from appearing doubled or tripled in the composite image as a result of its motion.

In one embodiment, the composition unit may generate the composite mask by multiplying the luminance base mask by the inverse of the first mask, or by adding the second mask to the luminance base mask. With this configuration, a composite mask that appropriately corrects subject blur can be generated.
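The two ways of combining the masks can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of this document: mask values are assumed to lie in [0, 1], a value of 1 is assumed to mean "take the pixel from the brighter image", inversion is assumed to be 1 - m, and the sum is clipped to 1. The function name `combine_masks` and its `force` parameter are hypothetical.

```python
def combine_masks(luminance_mask, blur_mask, force="low"):
    """Combine a luminance base mask with a subject blur mask.

    force="low":  blur_mask acts as the first mask; the luminance base mask
                  is multiplied by the inverted mask (1 - m).
    force="high": blur_mask acts as the second mask; it is added to the
                  luminance base mask and the result is clipped to 1.0.
    """
    combined = []
    for lum_row, blur_row in zip(luminance_mask, blur_mask):
        if force == "low":
            row = [l * (1.0 - m) for l, m in zip(lum_row, blur_row)]
        else:
            row = [min(1.0, l + m) for l, m in zip(lum_row, blur_row)]
        combined.append(row)
    return combined


luminance_mask = [[0.25, 0.75]]
blur_mask = [[1.0, 0.0]]  # the first pixel lies in a subject blur region
forced_low = combine_masks(luminance_mask, blur_mask, force="low")
forced_high = combine_masks(luminance_mask, blur_mask, force="high")
```

In the blur region the composition ratio is pushed to 0 or 1, so only one of the two images contributes there, while pixels outside the region keep the ratio given by the luminance base mask.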

In one embodiment, the image processing device may further include a motion information acquisition unit that acquires motion information of pixels between the first image and the second image. The likelihood calculation unit may then correct the first image and the second image based on the motion information and calculate the moving subject likelihood of each pixel using the corrected first and second images. With this configuration, even when the image sensor moves relative to the subject, the motion of the image sensor can be corrected before the moving subject likelihood of each pixel is calculated.

In one embodiment, the first image may itself be an image obtained by combining images having different exposure conditions. With this configuration, a final composite image can be generated by sequentially combining a plurality of images having different exposure conditions.

An image processing method according to another aspect of the present invention is a method for generating a composite image using a first image and a second image having different exposure conditions. In this method, the first image and the second image are input. Before an exposure conversion function that matches the exposure conditions of the first image and the second image is estimated, the moving subject likelihood at each pixel is calculated based on the difference between the first image and the second image. The exposure conversion function is then estimated by assigning a smaller weight to a pixel the higher its moving subject likelihood. Finally, the first image and the second image are combined using the exposure conversion function.

An image composition program according to still another aspect of the present invention causes a computer to operate so as to generate a composite image using a first image and a second image having different exposure conditions. The program causes the computer to operate as an input unit, a likelihood calculation unit, an exposure estimation unit, and a composition unit. The input unit inputs the first image and the second image. Before an exposure conversion function that matches the exposure conditions of the first image and the second image is estimated, the likelihood calculation unit calculates the moving subject likelihood at each pixel based on the difference between the first image and the second image. The exposure estimation unit estimates the exposure conversion function by assigning a smaller weight to a pixel the higher its moving subject likelihood. The composition unit combines the first image and the second image using the exposure conversion function.

A recording medium according to still another aspect of the present invention stores an image composition program that causes a computer to operate so as to generate a composite image using a first image and a second image having different exposure conditions. The program causes the computer to operate as an input unit, a likelihood calculation unit, an exposure estimation unit, and a composition unit. The input unit inputs the first image and the second image. Before an exposure conversion function that matches the exposure conditions of the first image and the second image is estimated, the likelihood calculation unit calculates the moving subject likelihood at each pixel based on the difference between the first image and the second image. The exposure estimation unit estimates the exposure conversion function by assigning a smaller weight to a pixel the higher its moving subject likelihood. The composition unit combines the first image and the second image using the exposure conversion function.

The image composition method, the image composition program, and the recording medium according to the other aspects of the present invention provide the same effects as the image composition device described above.

According to the various aspects and embodiments of the present invention, there are provided an image composition device, an image composition method, and an image composition program that can generate an appropriate composite image even when the subject moves, as well as a recording medium storing the image composition program.

A functional block diagram of a mobile terminal equipped with an image composition device according to one embodiment.
A hardware configuration diagram of the mobile terminal on which the image composition device of FIG. 1 is mounted.
A flowchart showing the preprocessing operation of the image composition device shown in FIG. 1.
A schematic diagram explaining motion detection.
A schematic diagram explaining a difference image.
A schematic diagram explaining an example of deriving a difference image using multiple resolutions.
A graph showing an example of an exposure conversion function.
A schematic diagram explaining a luminance conversion function.
A flowchart showing the composition operation of the image composition device shown in FIG. 1.
A schematic diagram explaining the flow of the composition process.
A schematic diagram explaining a composite mask. (A) is a graph showing an example of an exposure conversion function. (B) is a graph showing an example of the weights used when joining exposure conversion functions together.
A schematic diagram explaining a luminance base mask. (A) is an example of an input image. (B) is an example of a luminance base mask.
A schematic diagram explaining the labeling of subject blur regions in a difference image. (A) is an example of a difference image. (B) is an example of a labeled difference image.
A schematic diagram explaining the flow of the subject blur mask generation process.
A schematic diagram explaining the flow of the composite mask generation process.

Embodiments of the present invention are described below with reference to the accompanying drawings. In the drawings, identical or equivalent parts are denoted by the same reference numerals, and redundant description is omitted.

The image composition device according to the present embodiment combines a plurality of images having different exposure conditions to generate a single composite image. It is used, for example, for HDR composition, in which a plurality of images captured sequentially under different exposure conditions are combined to apparently expand the dynamic range of a video signal. The image composition device according to the present embodiment is well suited to resource-constrained mobile terminals such as mobile phones, digital cameras, and PDAs (Personal Digital Assistants), but it is not limited to these and may, for example, be mounted on an ordinary computer system. For ease of understanding, the following description takes as an example an image composition device mounted on a mobile terminal having a camera function.

FIG. 1 is a functional block diagram of a mobile terminal 2 including an image composition device 1 according to the present embodiment. The mobile terminal 2 shown in FIG. 1 is, for example, a mobile terminal carried by a user, and has the hardware configuration shown in FIG. 2. FIG. 2 is a hardware configuration diagram of the mobile terminal 2. As shown in FIG. 2, the mobile terminal 2 is physically configured as an ordinary computer system including a CPU (Central Processing Unit) 100, main storage devices such as a ROM (Read Only Memory) 101 and a RAM (Random Access Memory) 102, an input device 103 such as a camera or a keyboard, an output device 104 such as a display, and an auxiliary storage device 105 such as a hard disk. The functions of the mobile terminal 2 and the image composition device 1 described below are realized by loading predetermined computer software onto hardware such as the CPU 100, the ROM 101, and the RAM 102, thereby operating the input device 103 and the output device 104 under the control of the CPU 100 and reading and writing data in the main storage devices and the auxiliary storage device 105. Although the above describes the hardware configuration of the mobile terminal 2, the image composition device 1 itself may be configured as an ordinary computer system including the CPU 100, main storage devices such as the ROM 101 and the RAM 102, the input device 103, the output device 104, the auxiliary storage device 105, and the like. The mobile terminal 2 may also include a communication module or the like.

As shown in FIG. 1, the mobile terminal 2 includes a camera 20, the image composition device 1, and a display unit 21. The camera 20 has a function of capturing images. A CMOS pixel sensor or the like is used as the camera 20, for example. The camera 20 has a continuous imaging function that captures images repeatedly at predetermined intervals, starting from a timing specified by, for example, a user operation. That is, the camera 20 can acquire not only a single still image but also a plurality of still images (successive frame images). The camera 20 also has a function of changing the exposure condition for each successive frame image, so that the images captured in succession by the camera 20 have different exposure conditions. The camera 20 has a function of outputting, for example, each captured frame image to the image composition device 1 every time an image is captured.

The image composition device 1 includes an image input unit 10, a preprocessing unit 11, a motion correction unit 15, and a composition unit 16.

The image input unit 10 has a function of inputting frame images captured by the camera 20, for example every time an image is captured. The image input unit 10 also has a function of saving the input frame images in a storage device provided in the mobile terminal 2.

The preprocessing unit 11 performs preprocessing prior to HDR composition. The preprocessing unit 11 includes a motion information acquisition unit 12, a likelihood calculation unit 13, and an exposure estimation unit 14.

The motion information acquisition unit 12 has a function of acquiring motion information of pixels between images. For example, when the input frame images are a first image and a second image, it acquires motion information of pixels between the first image and the second image. A motion vector is used as the motion information, for example. When three or more input images are input by the image input unit 10, the motion information acquisition unit 12 may sort the input images in order of exposure and acquire motion information between input images with similar exposure conditions. Detecting motion by comparing images with similar exposure conditions avoids the loss of motion detection accuracy that a large exposure difference between images would cause. The motion information acquisition unit 12 may also select, from among the plurality of input images, a reference image to which the motion information is aligned. As the reference image, for example, the input image with the largest number of valid pixels is adopted. Here, a valid pixel is a pixel that is neither crushed to black nor blown out to white; whether a pixel is crushed or blown out is determined based on its luminance value. When acquiring motion information from two input images, the motion information acquisition unit 12 may extract feature points from the higher-exposure input image and find the corresponding points in the lower-exposure image. This avoids the situation in which motion information cannot be acquired because points extracted as feature points in the low-exposure image are blown out in the high-exposure image. Motion information may also be acquired from a gyro sensor or the like. The motion information acquisition unit 12 has a function of outputting the motion information to the likelihood calculation unit 13.
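The ordering and reference selection described above can be sketched as follows. This is an illustrative sketch only: the dictionary keys `name`, `exposure`, and `luma`, the function names, and the luminance thresholds 10 and 245 used to decide whether a pixel is valid are all assumptions made for the example.

```python
def count_valid_pixels(luma, low=10, high=245):
    """Count pixels that are neither crushed (too dark) nor blown out (too bright).

    The thresholds are illustrative assumptions; the text only states that
    the judgement is based on the luminance value.
    """
    return sum(low <= v <= high for row in luma for v in row)


def plan_motion_estimation(images):
    """Sort images by exposure, pair neighbours, and pick a reference image.

    `images` is a list of dicts with hypothetical keys "name", "exposure"
    and "luma" (a 2D list of luminance values).
    """
    ordered = sorted(images, key=lambda im: im["exposure"])
    # Motion is estimated only between images adjacent in exposure order.
    pairs = [(a["name"], b["name"]) for a, b in zip(ordered, ordered[1:])]
    # The reference is the image with the largest number of valid pixels.
    reference = max(ordered, key=lambda im: count_valid_pixels(im["luma"]))
    return pairs, reference["name"]


images = [
    {"name": "bright", "exposure": 2.0, "luma": [[90, 250], [255, 255]]},
    {"name": "dark", "exposure": -2.0, "luma": [[0, 5], [50, 60]]},
    {"name": "mid", "exposure": 0.0, "luma": [[20, 40], [120, 200]]},
]
pairs, reference = plan_motion_estimation(images)
```

Here the darkest and brightest images are never compared directly; each is matched only against the middle exposure, which also serves as the reference because it has the most valid pixels.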

The likelihood calculation unit 13 has a function of calculating, for each pixel, the likelihood that the subject is moving (the moving subject likelihood). A larger moving subject likelihood means that the subject is more likely to be moving and that the pixel is more likely to fall in a region where the composite image is blurred. The likelihood calculation unit 13 first corrects the frame motion between the input images using the motion information. It then normalizes the pixel values of the corresponding pixels in the two input images. For example, the likelihood calculation unit 13 obtains Local Ternary Patterns (LTP) based on the pixel values of neighboring pixels. The three RGB colors are used as the pixel values, and a 24-pixel neighborhood is used as the neighboring pixels. The likelihood calculation unit 13 then calculates the moving subject likelihood using the difference between the normalized images. For example, the difference between the normalized pixel values, that is, the proportion of mismatched signs in the LTP of the pixel of interest, is calculated as the moving subject likelihood of that pixel.
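The sign-mismatch computation can be sketched as follows. This is a simplified sketch, not the implementation of this document: it works on a single channel rather than on the three RGB colors, the ternary threshold `t` is a hypothetical parameter, neighbors outside the image are simply ignored, and the function names are invented for the example. A radius of 2 gives the 24-pixel neighborhood mentioned above.

```python
def ltp_codes(img, y, x, radius=2, t=5):
    """Ternary codes (-1, 0, +1) for the neighbours of pixel (y, x).

    radius=2 yields the 24-pixel neighbourhood; t is an assumed threshold.
    """
    h, w = len(img), len(img[0])
    center = img[y][x]
    codes = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue
            ny, nx = y + dy, x + dx
            if not (0 <= ny < h and 0 <= nx < w):
                codes.append(None)  # neighbour lies outside the image
                continue
            d = img[ny][nx] - center
            codes.append(1 if d > t else -1 if d < -t else 0)
    return codes


def moving_subject_likelihood(img1, img2, y, x, radius=2, t=5):
    """Proportion of neighbours whose ternary codes differ between the images."""
    c1 = ltp_codes(img1, y, x, radius, t)
    c2 = ltp_codes(img2, y, x, radius, t)
    pairs = [(a, b) for a, b in zip(c1, c2) if a is not None and b is not None]
    if not pairs:
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)


base = [[10 * x for x in range(5)] for _ in range(5)]         # horizontal ramp
brighter = [[2 * v for v in row] for row in base]             # exposure change only
mirrored = [[10 * (4 - x) for x in range(5)] for _ in range(5)]  # content changed
same_scene = moving_subject_likelihood(base, brighter, 2, 2)
moved = moving_subject_likelihood(base, mirrored, 2, 2)
```

Because the codes depend only on whether each neighbor is brighter or darker than the center, a pure exposure change leaves them unchanged, whereas a change in scene content flips them.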

The likelihood calculation unit 13 may also calculate the moving subject likelihood from multi-resolution versions of the two input images. For example, the likelihood calculation unit 13 creates a plurality of images with different resolutions (first processed images and second processed images) by changing the resolution of each input image (the first image and the second image) in stages. It then creates, at each resolution, a difference image between the first processed image and the second processed image. The difference image is the difference between the first processed image and the second processed image, specifically the difference between their pixel values. The likelihood calculation unit 13 calculates the moving subject likelihood of each pixel by weighting the difference images obtained for the respective resolutions. The proportion of mismatched signs in the LTP of each pixel is used as the weight (reliability); for example, the number of pairs in the LTP that differ significantly is used. The weight may be further scaled according to the image size or resolution of the first processed image or the second processed image, that is, the weight may be made larger as the image size or the resolution increases. The likelihood calculation unit 13 has a function of outputting the moving subject likelihood of each pixel to the exposure estimation unit 14.
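The multi-resolution weighting can be sketched as follows. This is a deliberately reduced sketch under stated assumptions: the inputs are assumed to be already normalized single-channel images, resolution is halved by 2x2 averaging, the weight depends only on resolution (1, 1/2, 1/4, ...) and omits the LTP-based reliability term, and the result is a weighted mean of absolute differences. All function names and parameters are hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (assumes even dimensions)."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]


def multires_likelihood(img1, img2, levels=2):
    """Weighted mean of per-resolution absolute differences at full resolution.

    Image dimensions must be divisible by 2 ** (levels - 1).
    """
    h, w = len(img1), len(img1[0])
    total = [[0.0] * w for _ in range(h)]
    weight_sum = 0.0
    a, b = img1, img2
    for level in range(levels):
        scale = 2 ** level
        weight = 1.0 / scale  # higher resolution -> larger weight (assumed rule)
        for y in range(h):
            for x in range(w):
                # Look up the coarse pixel that covers full-resolution (y, x).
                diff = abs(a[y // scale][x // scale] - b[y // scale][x // scale])
                total[y][x] += weight * diff
        weight_sum += weight
        if level + 1 < levels:
            a, b = downsample(a), downsample(b)
    return [[v / weight_sum for v in row] for row in total]


img1 = [[100.0] * 4 for _ in range(4)]
img2 = [[100.0] * 4 for _ in range(4)]
img2[0][0] = 140.0  # a single changed pixel
result = multires_likelihood(img1, img2, levels=2)
```

The changed pixel receives the largest value, its neighbors within the same coarse block receive a smaller value contributed by the lower resolution, and pixels elsewhere remain at zero.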

The exposure estimation unit 14 has a function of estimating an exposure conversion function that matches the exposure conditions between input images. The exposure conversion function converts the exposure of each input image to an exposure equivalent to that of the reference image. When three or more input images are input, the exposure estimation unit 14 may match the exposure conditions between input images whose exposure conditions are close. Comparing images with similar exposure conditions when matching exposures prevents a large exposure difference between images from lowering the accuracy of the estimation.

For example, the exposure estimation unit 14 corrects the motion between the input screens using the motion information. Then, in the two motion-corrected input images, it samples pairs of luminance values from the same locations and plots their relationship. A Halton sequence, for example, is used for the coordinates in the input image. The exposure estimation unit 14 need not adopt luminance values at or above a predetermined value, or at or below a predetermined value, as sampling points. For example, only luminance values in the range of 10 to 245 are adopted as sampling points. The exposure estimation unit 14 estimates the exposure conversion function by, for example, fitting the plotted result. For example, when the original luminance value at sampling point i of the first image is K_i, the exposure conversion function is f(K_i), and the original luminance value at sampling point i of the second image is U_i, the fitting may be performed by the Gauss-Newton method using the following error function e.

e = Σ_i {f(K_i) − U_i}^2 ... (1)

The exposure estimation unit 14 performs the sampling used to derive the exposure conversion function based on the moving subject likelihood of each pixel. For example, the exposure estimation unit 14 selects sampling points based on the moving subject likelihood of each pixel. For example, the exposure estimation unit 14 sets thresholds in stages and samples luminance values starting from pixels with a small moving subject likelihood. The exposure estimation unit 14 may also weight the sampling points based on the moving subject likelihood. For example, the fitting may be performed by minimizing the following error function e.

e = Σ_i w_i {f(K_i) − U_i}^2 ... (2)

In Equation 2, w_i is a weight. The higher the moving subject likelihood of a pixel, the smaller the weight w_i is set. Because the exposure estimation unit 14 calculates the exposure conversion function based on the moving subject likelihood of each pixel in this way, the less reliable a sampling point is, the less its data affects the derivation of the exposure conversion function. The exposure conversion function may be modified so that the converted input image falls within the representable range.
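
The sampling and weighting described above can be sketched as follows. This is a minimal illustration, not the implementation of the exposure estimation unit 14. It assumes images stored as 2-D lists of 8-bit luminance values, Halton bases 2 and 3 for the x and y coordinates, a hypothetical likelihood cutoff `max_likelihood`, and a hypothetical weight of 1 minus the moving subject likelihood. The text only requires that the weight decrease as the likelihood increases.

```python
def halton(index, base):
    """Return the index-th element (1-based) of the Halton sequence in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def sample_pairs(img_a, img_b, likelihood, count, lo=10, hi=245, max_likelihood=0.5):
    """Collect (K_i, U_i, w_i) triples at Halton-distributed coordinates."""
    height, width = len(img_a), len(img_a[0])
    samples = []
    for i in range(1, count + 1):
        x = int(halton(i, 2) * width)
        y = int(halton(i, 3) * height)
        k, u, p = img_a[y][x], img_b[y][x], likelihood[y][x]
        # Skip near-black and near-white values and pixels likely to be moving.
        if lo <= k <= hi and lo <= u <= hi and p <= max_likelihood:
            samples.append((k, u, 1.0 - p))  # assumed weight: 1 - likelihood
    return samples


def weighted_error(f, samples):
    """Weighted error e = sum_i w_i * (f(K_i) - U_i)^2 for a candidate function f."""
    return sum(w * (f(k) - u) ** 2 for k, u, w in samples)
```

Setting every weight to 1 reduces `weighted_error` to the unweighted sum of squared residuals.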

The motion correction unit 15 has a function of correcting the motion between the input screens using the motion information. The synthesizing unit 16 uses a synthesis mask to synthesize input images with each other, or an already synthesized image with an input image. The synthesis mask is an image representation of the synthesis ratio (weight) used when images are synthesized (alpha-blended). When there are three or more input images, the synthesizing unit 16 first synthesizes two input images according to a synthesis mask, then generates a synthesis mask for the composite image and a remaining input image, and synthesizes them. The synthesizing unit 16 generates the synthesis mask by combining a luminance base mask and a subject blur mask. The luminance base mask determines the weights for synthesizing images from luminance values, so that whiteout and blackout regions are not used in the synthesis. The subject blur mask prevents the phenomenon in which a moving subject appears doubled or tripled when the images are synthesized (ghost phenomenon).

The synthesizing unit 16 calculates weights based on the original luminance values of the input image and generates the luminance base mask. The weight is obtained, for example, by the following calculation formula.

[Equation 3]

With this calculation formula, the weights are determined appropriately and discontinuity in luminance is reduced. To reduce spatial discontinuity, blurring may be applied to the synthesis mask.
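
The weight formula itself is not reproduced in this text, so the following is only a hypothetical sketch of one weight function with the stated properties: full weight for well-exposed pixels, zero weight near whiteout, and a continuous ramp in between. The piecewise-linear form and the thresholds `ramp_start` and `ramp_end` are assumptions made for illustration.

```python
def luminance_weight(luma, ramp_start=200, ramp_end=240):
    """Assumed weight: 1 below ramp_start, 0 above ramp_end, linear in between."""
    if luma <= ramp_start:
        return 1.0
    if luma >= ramp_end:
        return 0.0
    return (ramp_end - luma) / (ramp_end - ramp_start)


def luminance_base_mask(image, ramp_start=200, ramp_end=240):
    """Build a mask of weights from the original luminance values of an image."""
    return [[luminance_weight(v, ramp_start, ramp_end) for v in row] for row in image]
```

The linear ramp is what keeps the weight continuous in luminance. A hard cutoff would reintroduce the luminance discontinuity that the passage says is reduced.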

The synthesizing unit 16 calculates weights based on the difference between the input images and generates the subject blur mask. The synthesizing unit 16 calculates the moving subject likelihood from the difference in pixel values between the input images. The pixel value difference and the moving subject likelihood can be obtained by operating in the same way as the likelihood calculating unit 13 described above. The likelihood calculating unit 13 then detects subject blur regions, in which pixels whose moving subject likelihood is equal to or less than a predetermined threshold adjoin one another, assigns an identification label to each subject blur region, and generates a subject blur mask for each subject blur region. The predetermined threshold can be changed as appropriate according to the required specifications. Setting a large threshold makes continuous regions easier to extract. Generating a mask for each subject blur region makes it possible to select, for each subject blur region, pixels from the image carrying more information so that whiteout or blackout regions are avoided. That is, the subject blur mask is either lo_mask (first mask), which forces selection of the pixel values with the lower luminance among the images to be synthesized, or hi_mask (second mask), which forces selection of the pixel values with the higher luminance. The synthesizing unit 16 basically generates the second mask, which selects pixel values from the high-exposure image carrying more information. However, when the subject blur region in the high-exposure image is affected by a whiteout region, the synthesizing unit 16 generates the first mask. Specifically, it generates the first mask when either of the following conditions is satisfied. The first condition is that, of the two images to be synthesized, the whiteout area of the high-exposure image is larger than the blackout area of the low-exposure image. The second condition is that, in the high-exposure image of the two images to be synthesized, the whiteout region occupies 10% or more of the subject blur region. A further condition may be that, in the high-exposure image of the two images to be synthesized, a region adjacent to the subject blur region is a whiteout region.
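
The choice between the first and second mask can be sketched as a small decision function. This is an illustration under stated assumptions: images are 2-D lists of 8-bit luminance values, a region is a set of (y, x) coordinates, and the cutoffs `white` and `black` that define whiteout and blackout pixels are hypothetical. Only the 10% ratio comes from the text.

```python
def choose_blur_mask(high_img, low_img, region, white=250, black=5, ratio=0.10):
    """Return 'lo_mask' or 'hi_mask' for one subject blur region."""
    # Condition 1: whiteout area of the high-exposure image exceeds
    # the blackout area of the low-exposure image.
    white_area = sum(v >= white for row in high_img for v in row)
    black_area = sum(v <= black for row in low_img for v in row)
    if white_area > black_area:
        return "lo_mask"
    # Condition 2: whiteout pixels make up at least `ratio` of the region.
    white_in_region = sum(high_img[y][x] >= white for y, x in region)
    if region and white_in_region / len(region) >= ratio:
        return "lo_mask"
    # Default: prefer the high-exposure image, which carries more information.
    return "hi_mask"
```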

The synthesizing unit 16 generates the synthesis mask by combining the luminance base mask and the subject blur mask. For example, the synthesizing unit 16 multiplies the luminance base mask by the inverted first mask. The synthesizing unit 16 also adds the second mask to the luminance base mask. The synthesizing unit 16 synthesizes all of the input images and outputs the final composite image to the display unit 21. The display unit 21 displays the composite image. A display device, for example, is used as the display unit 21.

Next, the operation of the image composition device 1 will be described. FIG. 3 is a flowchart explaining the preprocessing for HDR synthesis. The control process shown in FIG. 3 starts, for example, when the user selects the HDR synthesis mode and the camera 20 captures a plurality of images in succession.

First, the image input unit 10 inputs image frames (S10). For ease of understanding, the following description assumes that five input images I_0 to I_4 have been input. When the process of S10 ends, the process proceeds to the exposure order sort process (S12).

In the process of S12, the motion information acquisition unit 12 sorts the input images I_0 to I_4 in order of exposure. The motion information acquisition unit 12 sorts them using, for example, the average of the luminance values. Here, it is assumed that the smaller the index of the input images I_0 to I_4, the smaller the luminance value. In this case, the input images I_0 to I_4 are sorted in index order. When the process of S12 ends, the process proceeds to the motion information acquisition process (S14).
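
As a minimal sketch of the exposure order sort, assuming each image is a 2-D list of luminance values and using the average luminance as the sort key (the helper names are hypothetical):

```python
def mean_luminance(image):
    """Average luminance over all pixels of a 2-D image."""
    values = [v for row in image for v in row]
    return sum(values) / len(values)


def sort_by_exposure(images):
    """Return the images ordered from lowest to highest average luminance."""
    return sorted(images, key=mean_luminance)
```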

In the process of S14, the motion information acquisition unit 12 acquires motion information between the input images I_0 to I_4. FIG. 4 is a schematic diagram explaining the motion information acquisition process. As shown in FIG. 4, assume that the input images I_0 to I_4 are arranged from left to right in order of increasing average luminance. First, the motion information acquisition unit 12 sets a reference image from among the input images I_0 to I_4. Here, the input image I_2 is the reference image. Next, motion information is acquired between input images with similar exposure conditions (for example, between input images I_0 and I_1, and between input images I_1 and I_2). Of the two input images, the motion information acquisition unit 12 extracts feature points from the higher-exposure input image and extracts the points corresponding to those feature points from the lower-exposure input image. From this motion information, a conversion matrix that converts input images with similar exposure conditions into a common coordinate system can be obtained. FIG. 4 shows the conversion matrices m10, m21, m32, and m43, each of which matches the lower-exposure image to the higher-exposure image of a pair of input images with similar exposure conditions. Next, the conversion matrices m10, m21, m32, and m43 are used to calculate conversion matrices that transform the coordinates of the input images I_0, I_1, I_3, and I_4 other than the reference image I_2 into coordinates equivalent to those of the reference image I_2. As shown in FIG. 4, the conversion matrix that converts the input image I_0 to the reference image I_2 is m10*m21. The conversion matrix that converts the input image I_1 to the reference image I_2 is m21. The conversion matrix that converts the input image I_3 to the reference image I_2 is (m32)^−1. The conversion matrix that converts the input image I_4 to the reference image I_2 is (m32*m43)^−1. In the following, the converted input images are denoted I_0' to I_4'. When the process of S14 ends, the process proceeds to the moving subject likelihood calculation process (S16).
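
The chaining of pairwise matrices into matrices that map each image to the reference image I_2 can be sketched as below. Several assumptions are made for illustration. The matrices are 3x3 homogeneous transforms applied to row vectors, so that the product m10*m21 applies m10 first. The matrix for I_1 is taken to be m21, as the index naming suggests. The helpers `translation` and `apply` are hypothetical and exist only to exercise the example.

```python
def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def inverse(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[v / det for v in row] for row in adj]


def translation(tx, ty):
    """Hypothetical helper: pure translation in the row-vector convention."""
    return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [tx, ty, 1.0]]


def apply(point, m):
    """Transform (x, y) as the row vector [x, y, 1] times m."""
    row = [point[0], point[1], 1.0]
    out = [sum(row[k] * m[k][j] for k in range(3)) for j in range(3)]
    return (out[0] / out[2], out[1] / out[2])


def to_reference(m10, m21, m32, m43):
    """Matrices mapping I_0, I_1, I_3, and I_4 onto the reference image I_2."""
    return {
        0: matmul(m10, m21),
        1: m21,
        3: inverse(m32),
        4: inverse(matmul(m32, m43)),
    }
```

With column vectors instead, the same chain would be written in the opposite order, for example m21*m10 for I_0.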

In the process of S16, the likelihood calculating unit 13 calculates the moving subject likelihood between the input images I_0' to I_4'. FIG. 5 shows an example of calculating the moving subject likelihood between the input image I_0' and the input image I_1'. FIG. 5 shows the case where the R value is used as the pixel value. As shown in FIG. 5, the likelihood calculating unit 13 acquires the pixel values (R values) of the 8-neighborhood of the pixel of interest (R value = 42) in the input image I_0'. It then normalizes using the pixel value of the pixel of interest and the pixel values of its 8-neighborhood. For example, LTP is used. If the difference between the pixel value of the pixel of interest and that of a neighboring pixel is within ±5, the result is 0; if it is greater than +5, the result is 1; and if it is less than −5, the result is −1. The likelihood calculating unit 13 normalizes the input image I_1' in the same way. In the figure, the normalization is performed at the pixel of the input image I_1' that corresponds to the pixel of interest in the input image I_0'. Comparing the normalized pixel values then shows that a difference has arisen. The difference image X represents this as an image in which the color of each pixel changes from black to white according to the magnitude of the difference (the degree of sign mismatch). This difference image is an image representation of the moving subject likelihood of each pixel. The neighborhood is not limited to the 8-neighborhood and may be the 24-neighborhood. The processing is also not limited to the R value, and the G and B values may be processed in the same way.
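
The LTP normalization and sign comparison can be sketched as follows for a single channel and the 8-neighborhood. The ±5 band comes from the text. The direction of the difference (neighbor minus pixel of interest) and the restriction to interior pixels are assumptions made for this illustration.

```python
NEIGHBORS_8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def ltp(image, y, x, threshold=5):
    """Ternary codes (+1, 0, -1) of the 8 neighbors relative to the pixel at (y, x)."""
    center = image[y][x]
    codes = []
    for dy, dx in NEIGHBORS_8:
        diff = image[y + dy][x + dx] - center  # assumed direction: neighbor minus center
        codes.append(1 if diff > threshold else -1 if diff < -threshold else 0)
    return codes


def moving_likelihood(img_a, img_b, y, x, threshold=5):
    """Fraction of neighbor positions whose ternary codes differ between two images."""
    a = ltp(img_a, y, x, threshold)
    b = ltp(img_b, y, x, threshold)
    mismatches = sum(1 for p, q in zip(a, b) if p != q)
    return mismatches / len(a)
```

Because only the sign of each local difference is compared, the measure tolerates a moderate overall brightness change between differently exposed images, although the fixed band is not fully invariant to it.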

To improve the accuracy of the moving subject likelihood in smooth regions such as the region C_1 of the difference image X, the likelihood calculating unit 13 may obtain the moving subject likelihood using multiple resolutions. FIG. 6 shows an example of obtaining the moving subject likelihood using multiple resolutions. First, the likelihood calculating unit 13 generates a plurality of images in which the resolutions of the input image I_0' and the input image I_1' are changed in stages. It then generates a difference image at each resolution. This difference image is obtained by simply subtracting pixel values. FIG. 6 shows the case where the input image I_0' and the input image I_1' are each converted to six resolution levels. The respective difference images are X_1 to X_6, and the larger the index, the lower the resolution of the difference image. The lower the resolution, the smaller the image size. These difference images are weighted by reliability to calculate the final difference image. As the reliability, for example, the number of pairs with a significant difference in the LTP difference described above, multiplied by the image size (or resolution), is used. For example, in the case of the LTP shown in FIG. 5, the number of pairs with a significant difference is 1. In this way, the number of pairs is multiplied by the image size for each pixel to calculate weight images (images of the weights) corresponding to the difference images X_1 to X_6. The final difference image is then calculated using the difference images X_1 to X_6 and the weight images. The likelihood calculating unit 13 calculates the difference images for the input images I_1' to I_4' by the same method. When the process of S16 ends, the process proceeds to the exposure conversion function estimation process (S18).
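
A rough sketch of the multi-resolution scheme follows. The text does not specify how the resolution is reduced or how the weighted difference images are merged, so this example assumes 2x2 block averaging for the reduction, an absolute pixel difference at each level, and a weighted average normalized by the sum of the weights for a single pixel. All function names are hypothetical.

```python
def downsample(image):
    """Halve the resolution by averaging 2x2 blocks (assumed reduction method)."""
    return [
        [(image[y][x] + image[y][x + 1] + image[y + 1][x] + image[y + 1][x + 1]) / 4
         for x in range(0, len(image[0]) - 1, 2)]
        for y in range(0, len(image) - 1, 2)
    ]


def difference_pyramid(img_a, img_b, levels):
    """Absolute difference images at successively lower resolutions."""
    out = []
    for _ in range(levels):
        out.append([[abs(p - q) for p, q in zip(ra, rb)] for ra, rb in zip(img_a, img_b)])
        img_a, img_b = downsample(img_a), downsample(img_b)
    return out


def combine_levels(diffs, pair_counts, sizes):
    """Merge one pixel's per-level differences, weighting by pair count times image size."""
    weights = [n * s for n, s in zip(pair_counts, sizes)]
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * d for w, d in zip(weights, diffs)) / total
```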

In the process of S18, the exposure estimation unit 14 estimates the exposure conversion function. With x as the luminance value before conversion and y as the luminance value after conversion, the exposure estimation unit 14 can express the exposure conversion function by the following formula.

[Equation 4]

Here, (a, b) are exposure conversion parameters. The exposure conversion function can be obtained by deriving the exposure conversion parameters (a, b). The following describes the case of obtaining the exposure conversion function between the motion-corrected input image I_0' and input image I_1'. The exposure estimation unit 14 samples several pairs of luminance values, one from the lower-exposure input image I_0' and one from the higher-exposure input image I_1', at points (x, y) of the input images, and plots their relationship. The points to be sampled are selected based on the difference image acquired in the process of S16. For example, the sampling is set so that no samples are taken from regions with a high moving subject likelihood, that is, so that samples are taken starting from points with a low moving subject likelihood. Then, for example, a lower weight is assigned as the moving subject likelihood is higher, and the exposure conversion function is estimated using Equation 2. As a result, fitting such as that shown in FIG. 7 is performed. The likelihood calculating unit 13 estimates the exposure conversion functions between the input images I_1' to I_4' by the same method. Data with luminance values close to 0 or close to 255 may be excluded.
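
The formula for the exposure conversion function is not reproduced in this text. Purely as an illustration, the sketch below assumes a power-law form y = a * x**b with luminance normalized to the range (0, 1], and fits (a, b) by weighted Gauss-Newton iterations that minimize the weighted squared error. The function name, the starting values, and the iteration count are hypothetical.

```python
import math


def fit_exposure(samples, a=1.0, b=1.0, iterations=20):
    """Fit y = a * x**b to a list of (x, y, w) samples with 0 < x <= 1."""
    for _ in range(iterations):
        s_aa = s_ab = s_bb = g_a = g_b = 0.0
        for x, y, w in samples:
            xb = x ** b
            r = a * xb - y                       # residual f(x) - y
            ja, jb = xb, a * xb * math.log(x)    # partial derivatives w.r.t. a and b
            s_aa += w * ja * ja
            s_ab += w * ja * jb
            s_bb += w * jb * jb
            g_a += w * ja * r
            g_b += w * jb * r
        det = s_aa * s_bb - s_ab * s_ab
        if abs(det) < 1e-12:
            break
        # Solve the 2x2 normal equations (J^T W J) delta = -J^T W r.
        a -= (s_bb * g_a - s_ab * g_b) / det
        b -= (s_aa * g_b - s_ab * g_a) / det
    return a, b
```

A sample with weight 0 has no effect on the result, which is how pixels with a high moving subject likelihood can be kept from distorting the estimate.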

FIG. 8 is a schematic diagram explaining the exposure conversion function estimation process described above. FIG. 8 shows the exposure conversion parameters (a10, b10), (a21, b21), (a32, b32), and (a43, b43), each of which matches the lower-exposure image to the higher-exposure image of a pair of input images with similar exposure conditions. So that the final composite image falls within the representable range, A_0 of the exposure conversion parameters (A_0, B_0) of the lowest-exposure input image I_0' may be set to 1.0 so that the conversion result does not exceed 1.0. Here, the image obtained by exposure conversion of the input image I_0' is shown as the input image I_0''. In addition, letting (A_2, B_2) be the exposure conversion parameters of the reference image I_2' with respect to the lowest-exposure input image I_0', B_2 may be set to 1.0 at the same time as A_0 is set to 1.0, so that the color tone equals that of the input image when the gain is 1/A_2. The likelihood calculating unit 13 performs the above processing separately for each RGB channel. When the process of S18 ends, the preprocessing shown in FIG. 3 ends.

This ends the control process shown in FIG. 3. By executing the control process shown in FIG. 3, subject blur is detected before the exposure conversion function is estimated, so that sampling from subject blur regions can be avoided, or the influence of data sampled from subject blur regions can be reduced by weighting. The exposure conversion function can therefore be estimated accurately. With conventional HDR techniques, subject blur cannot be corrected accurately unless the exposures have been matched, and conversely the exposures cannot be matched accurately unless the subject blur has been corrected. By simply detecting subject blur (subject motion) before estimating the exposure conversion function, however, this deadlock can be resolved.

Next, the synthesis operation of the image composition device 1 will be described. FIG. 9 is a flowchart explaining HDR synthesis. The control process shown in FIG. 9 starts, for example, when the control process shown in FIG. 3 ends.

As shown in FIG. 9, the motion correction unit 15 actually corrects the motion (S20). In this process, as in the process of S14 in FIG. 3, the motion correction unit 15 uses the conversion matrices to correct the motion of the exposure-converted input images I_0'' to I_4''. A subpixel interpolation algorithm or the like may be made available depending on the required accuracy. When the process of S20 ends, the process proceeds to the luminance base mask generation process and the subject blur region extraction process (S22 and S24).

In the process of S22, the synthesizing unit 16 generates the luminance base mask. FIG. 10 is a schematic diagram explaining the flow of the synthesis process. As shown in FIG. 10, the synthesis is performed by substituting in the input images I_1'' to I_4'' in order, starting from the lowest-exposure input image I_0''. That is, first, a luminance base mask is generated that determines how much of the input image I_1'' is synthesized onto the input image I_0''. The weights of this luminance base mask are calculated from the original luminance values of the input image I_1''. For example, the weight near a whiteout region is set to 0. By setting the weights in this way and synthesizing so that the higher-exposure image is overlaid on the lower-exposure image, the input image carrying more information is always selected for each target pixel.

FIG. 11(A) is a graph showing the relationship of the pixel value to the input luminance. As shown in FIG. 11(A), the functions f_0 to f_3 are graphs indicating which image's pixel value is adopted based on the luminance value. The larger the index of the functions f_0 to f_3, the higher the exposure of the image to which the function is applied. For example, when the lowest-exposure input image I_0'' is input, the function f_0 is applied and all of its pixel values are adopted. Next, when the input image I_1'' is input, the functions f_0 and f_1 are applied. Accordingly, the input image I_1'' is adopted in the luminance range S0 to S5, and the input image I_0'' is adopted in the luminance range of S6 and above. In the luminance range S5 to S6, a composite value blended with the weights shown in (B) is adopted. γ correction is omitted here for convenience. Next, when the input image I_2'' is input, the functions f_0 to f_2 are applied. Accordingly, the input image I_2'' is adopted in the luminance range S0 to S3, the input image I_1'' is adopted in the luminance range S4 to S5, and the input image I_0'' is adopted in the luminance range of S6 and above. In the luminance ranges S3 to S4 and S5 to S6, composite values blended with the weights shown in (B) are adopted. Next, when the input image I_3'' is input, the functions f_0 to f_3 are applied. Accordingly, the input image I_3'' is adopted in the luminance range S0 to S1, the input image I_2'' is adopted in the luminance range S2 to S3, the input image I_1'' is adopted in the luminance range S4 to S5, and the input image I_0'' is adopted in the luminance range of S6 and above. In the luminance ranges S1 to S2, S3 to S4, and S5 to S6, composite values blended with the weights shown in (B) are adopted. In this way, higher-exposure images are adopted preferentially. For whiteout regions, a lower-exposure image is adopted, and the boundaries are blended smoothly.

FIG. 12 shows an example of a luminance base mask that renders the graph of FIG. 11(A) as an image. FIG. 12(A) shows an input image, and FIG. 12(B) shows the luminance base mask of that input image. In FIG. 12(B), pixels where the pixel value of the input image is used 100% are shown in white, and pixels where it is not used at all are shown in black. When the process of S22 ends, the process proceeds to the synthesis mask generation process (S32).

Meanwhile, in the process of S24, the synthesizing unit 16 extracts subject blur regions. For example, the synthesizing unit 16 calculates a difference image in the same way as in the process of S16 in FIG. 3, and extracts regions where the moving subject likelihood is equal to or greater than a predetermined value as subject blur regions. FIG. 13(A) is an example of a difference image containing subject blur regions. When the process of S24 ends, the process proceeds to the labeling process (S26).

In the process of S26, the synthesizing unit 16 labels the subject blur regions. The synthesizing unit 16 sets one label R_n for each continuous subject blur region. FIG. 13(B) shows an example in which continuous regions are labeled. When the process of S26 ends, the process proceeds to the reference image selection process for each region (S28).
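
The extraction and labeling of S24 and S26 can be sketched as a connected-component pass over the moving subject likelihood. This is an illustration only. It assumes a 2-D list of likelihood values, the "equal to or greater than a predetermined value" rule of S24, and 4-connectivity for deciding which pixels are continuous. The text does not state which connectivity is used.

```python
from collections import deque


def label_blur_regions(likelihood, threshold):
    """Label 4-connected regions with likelihood >= threshold; 0 means no region."""
    height, width = len(likelihood), len(likelihood[0])
    labels = [[0] * width for _ in range(height)]
    next_label = 0
    for sy in range(height):
        for sx in range(width):
            if likelihood[sy][sx] < threshold or labels[sy][sx]:
                continue
            # Start a new region and flood-fill it breadth-first.
            next_label += 1
            labels[sy][sx] = next_label
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not labels[ny][nx]
                            and likelihood[ny][nx] >= threshold):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels, next_label
```

Each positive label then plays the role of one R_n, for which a reference image and a subject blur mask can be chosen independently.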

In the process of S28, the synthesizing unit 16 sets a reference image for each subject blur region. The synthesizing unit 16 basically gives priority to the high-exposure image as the reference image. For example, when the input image I_0'' and the input image I_1'' are synthesized, the input image I_1'' is selected as the reference image. However, when the subject blur region in the input image I_1'' is affected by a whiteout region, the input image I_0'' is selected as the reference image. When the process of S28 ends, the process proceeds to the subject blur mask generation process (S30).

In the process of S30, the synthesizing unit 16 generates a subject blur mask for each subject blur region. The synthesizing unit 16 generates the second mask when the high-exposure image is given priority as the reference image, and generates the first mask when the low-exposure image is given priority as the reference image. FIG. 14 is a schematic diagram explaining the series of processes from S24 to S30. As shown in FIG. 14, when the input image I_0'' and the input image I_1'' are synthesized, the difference image X is obtained, and the first mask (lo_mask) or the second mask (hi_mask) is generated for each region of the difference image. That is, for a region in which the subject moves, using the subject blur mask causes the pixel values to be taken from only one image, which avoids the ghost phenomenon described above. When the process of S30 ends, the process proceeds to the synthesis mask generation process (S32).

In the process of S32, the combining unit 16 generates a composite mask based on the luminance base mask and the subject blur mask. FIG. 15 is a schematic diagram illustrating the composite mask generation process. As shown in FIG. 15, the luminance base mask is multiplied by the inverted lo_mask, and hi_mask is added to the luminance base mask. Combining the masks in this way yields the composite mask. When the process of S32 is completed, the process proceeds to the compositing process (S34).
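The mask combination of S32 might look like the following Python sketch. Several points are assumptions rather than statements from the description: mask values lie in [0, 1], a value of 1 means "take the pixel from the higher-exposure image", inversion means `1 - lo_mask`, and the result is clamped to [0, 1]. The function name `composite_mask` is hypothetical.

```python
def composite_mask(base, lo_mask, hi_mask):
    """Combine a luminance base mask with the two subject blur masks.

    All arguments are 2-D lists of the same shape with values in [0, 1].
    """
    out = []
    for b_row, lo_row, hi_row in zip(base, lo_mask, hi_mask):
        row = []
        for b, lo, hi in zip(b_row, lo_row, hi_row):
            v = b * (1.0 - lo) + hi  # multiply by inverted lo_mask, add hi_mask
            row.append(min(1.0, max(0.0, v)))  # clamp (assumed)
        out.append(row)
    return out
```

Under these assumptions, a pixel covered by lo_mask is forced to 0, a pixel covered by hi_mask is forced to 1, and every other pixel keeps its luminance-based weight.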

In the process of S34, the combining unit 16 performs the compositing in accordance with the composite mask created in the process of S32. When the luminance value P_0 of the already-composited image and the luminance value P_1 of the input image to which the exposure conversion function has been applied are combined with a weight a, the luminance value P_2 after compositing is obtained as a combination of P_0 and P_1 weighted by a.

At this time, for the image with the lowest exposure, the entire region is composited as is. When the process of S34 is completed, the process proceeds to the input image confirmation process (S36).
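The weighted combination in S34 can be sketched as follows. The formula itself does not appear in this text, so the sketch assumes the common convex blend P_2 = (1 - a) * P_0 + a * P_1, in which a is the weight of the exposure-converted input image. The function name `blend_luminance` and the restriction of a to [0, 1] are likewise assumptions.

```python
def blend_luminance(p0, p1, a):
    """Blend two luminance values with weight a (assumed convex blend).

    p0: luminance of the already-composited image.
    p1: luminance of the exposure-converted input image.
    a:  weight of p1, taken from the composite mask, in [0, 1].
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("weight a must lie in [0, 1]")
    return (1.0 - a) * p0 + a * p1
```

With a = 0 the composited value is kept unchanged, and with a = 1 the pixel comes entirely from the input image. This matches the role of the subject blur masks, which force the value to come from a single image.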

In the process of S36, the combining unit 16 determines whether all input images have been combined. If not all input images have been combined, the process returns to S22 and S24. Then, for example, as shown in FIG. 10, the composite image O_0 of the input image I_0'' and the input image I_1'' is combined with a new input image I_2''. On the other hand, when all input images have been combined, the control process shown in FIG. 9 ends.
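The loop closed by S36 amounts to folding the input images into a running composite. A minimal Python sketch, assuming the images are supplied in the order in which they are to be combined and that `combine` stands in for steps S22 to S34 (both the function name `compose_all` and the `combine` callable are hypothetical):

```python
def compose_all(images, combine):
    """Fold a sequence of input images into one composite.

    images:  non-empty list of images, in the order they are combined.
    combine: callable(composite, new_image) -> composite, standing in
             for the per-pair processing of S22 to S34.
    """
    if not images:
        raise ValueError("at least one input image is required")
    composite = images[0]  # the first image is used as is
    for image in images[1:]:
        composite = combine(composite, image)  # previous result + next image
    return composite
```

For three inputs this evaluates `combine(combine(I0, I1), I2)`, which mirrors the example in which the composite image O_0 is combined with a further input image.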

Executing the control process shown in FIG. 9 generates an HDR composite image in which subject blur has been corrected.

Next, an image composition program for causing the mobile terminal (computer) 2 to function as the image composition apparatus 1 will be described.

The image composition program includes a main module, an input module, and an arithmetic processing module. The main module controls the image processing as a whole. The input module operates the mobile terminal 2 so as to acquire input images. The arithmetic processing module includes a motion information acquisition module, a likelihood calculation module, an exposure estimation module, a motion correction module, and a combining module. The functions realized by executing the main module, the input module, and the arithmetic processing module are the same as those of the image input unit 10, the motion information acquisition unit 12, the likelihood calculation unit 13, the exposure estimation unit 14, the motion correction unit 15, and the combining unit 16 of the image composition apparatus 1 described above, respectively.

The image composition program is provided, for example, on a recording medium such as a ROM or on a semiconductor memory. The image composition program may also be provided as a data signal via a network.

As described above, according to the image composition apparatus 1, the image composition method, and the image composition program of the present embodiment, the likelihood of subject motion at each pixel is calculated based on the difference between the first image and the second image before the exposures of the first image and the second image are matched. An exposure conversion function that matches the exposure conditions of the first image and the second image is then estimated based on this likelihood of subject motion. Because the likelihood of subject motion is taken into account when the exposures are matched, the exposures can be matched while excluding, for example, regions whose color may have changed because of subject motion. An appropriate composite image can therefore be generated. Furthermore, using the subject blur mask avoids subject blur (a ghost-like appearance) and yields a clear image.
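To make the idea of likelihood-weighted exposure matching concrete, the following Python sketch fits an exposure conversion between two images while down-weighting pixels that probably belong to a moving subject. It rests on assumptions that are not taken from this section: the conversion is modeled as a single linear gain (`high ≈ gain * low`), the likelihood lies in [0, 1], and the pixel weight is `1 - likelihood`. The function name `estimate_gain` is hypothetical.

```python
def estimate_gain(low, high, likelihood):
    """Weighted least-squares fit of high ≈ gain * low.

    low, high:  flat lists of corresponding pixel luminances.
    likelihood: flat list of moving-subject likelihoods in [0, 1].
    """
    num = 0.0
    den = 0.0
    for x, y, p in zip(low, high, likelihood):
        w = 1.0 - p  # higher moving-subject likelihood -> smaller weight
        num += w * x * y
        den += w * x * x
    if den == 0.0:
        raise ValueError("no usable pixels: all weights or inputs are zero")
    return num / den
```

For example, with `low = [10, 20, 30, 40]` and `high = [20, 40, 60, 200]`, the last pixel is an outlier caused by motion. Giving it a likelihood of 1 removes it from the fit and the estimated gain is 2, whereas an unweighted fit would be pulled well above that value.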

The embodiment described above is one example of the image composition apparatus according to the present invention. The image composition apparatus according to the present invention is not limited to the image composition apparatus 1 of the embodiment; the image composition apparatus of the embodiment may be modified, or applied to other uses, without departing from the gist described in each claim.

For example, in each of the embodiments described above, the camera 20 acquires the frame images, but the images may instead be transmitted from another device via a network. In addition, when the composite image is only recorded and not displayed, the display unit 21 need not be provided.

The image composition apparatus 1 according to each of the embodiments described above may also be operated together with a camera shake correction device.

DESCRIPTION OF SYMBOLS 1 ... image composition apparatus, 10 ... image input unit (input unit), 12 ... motion information acquisition unit, 13 ... likelihood calculation unit, 14 ... exposure estimation unit, 15 ... motion correction unit, 16 ... combining unit.

Claims

An image composition apparatus that generates a composite image using a first image and a second image having different exposure conditions, the apparatus comprising:
an input unit that inputs the first image and the second image;
a likelihood calculation unit that calculates a moving subject likelihood at each pixel based on a difference between the first image and the second image, before an exposure conversion function that matches the exposure conditions of the first image and the second image is estimated;
an exposure estimation unit that estimates the exposure conversion function based on the moving subject likelihood; and
a combining unit that combines the first image and the second image using the exposure conversion function,
wherein the exposure estimation unit estimates the exposure conversion function with the weight of a pixel set smaller as the moving subject likelihood of that pixel is higher.

The image composition apparatus according to claim 1, wherein the likelihood calculation unit normalizes the neighboring pixels of each pixel of the first image and the corresponding neighboring pixels of the second image, and calculates the moving subject likelihood based on the difference between the normalized neighboring pixels.

The image composition apparatus according to claim 1 or 2, wherein the likelihood calculation unit
calculates a per-pixel difference for each resolution using a plurality of first processed images obtained by changing the resolution of the first image in stages and a plurality of second processed images obtained by changing the resolution of the second image in stages, and calculates the moving subject likelihood of each pixel by weighting the differences obtained for the respective resolutions.

The image composition apparatus according to claim 3, wherein the likelihood calculation unit weights the difference obtained for each resolution based on the reliability of the difference between the first image and the second image and on the image size or resolution of the first processed image or the second processed image.

The image composition apparatus according to any one of claims 1 to 4, wherein the combining unit
calculates a moving subject likelihood at each pixel based on the difference between the first image and the second image, and combines the first image and the second image using the moving subject likelihood and the exposure conversion function.

The image composition apparatus according to any one of claims 1 to 5, wherein the combining unit
generates a luminance base mask indicating a combining ratio of pixel values of the first image and the second image based on the magnitude of the original luminance value of the first image or the second image,
generates a subject blur mask indicating a combining ratio of pixel values of the first image and the second image based on the difference between the first image and the second image, and
combines the luminance base mask and the subject blur mask to generate a composite mask used to combine pixel values of the first image and the second image.

The image composition apparatus according to claim 6, wherein the combining unit
calculates a moving subject likelihood at each pixel based on the difference between the first image and the second image, and
generates the subject blur mask based on the moving subject likelihood.

The image composition apparatus according to claim 7, wherein the combining unit
calculates a per-pixel difference for each resolution using a plurality of first processed images obtained by changing the resolution of the first image in stages and a plurality of second processed images obtained by changing the resolution of the second image in stages, calculates the moving subject likelihood of each pixel by weighting the differences obtained for the respective resolutions, and generates the subject blur mask based on the moving subject likelihood.

The image composition apparatus according to claim 8, wherein the combining unit
detects regions in which pixels whose moving subject likelihood is equal to or less than a predetermined threshold adjoin one another, assigns an identification label to each region, and generates the subject blur mask for each region.

The image composition apparatus according to any one of claims 6 to 9, wherein the combining unit
generates, as the subject blur mask, a first mask that forces selection of the pixel value with the lower luminance value from among the first image and the second image, or a second mask that forces selection of the pixel value with the higher luminance value from among the first image and the second image.

The image composition apparatus according to claim 10, wherein the combining unit
generates the composite mask by multiplying the luminance base mask by a mask obtained by inverting the first mask, or by adding the second mask to the luminance base mask.

The image composition apparatus according to any one of claims 1 to 11, further comprising a motion information acquisition unit that acquires motion information of pixels between the first image and the second image,
wherein the likelihood calculation unit corrects the first image and the second image based on the motion information, and calculates the moving subject likelihood of each pixel using the corrected first image and second image.

The image composition apparatus according to claim 1, wherein the first image is an image obtained by combining images having different exposure conditions.

An image composition method for generating a composite image using a first image and a second image having different exposure conditions, the method comprising:
inputting the first image and the second image;
calculating a moving subject likelihood at each pixel based on a difference between the first image and the second image, before an exposure conversion function that matches the exposure conditions of the first image and the second image is estimated;
estimating the exposure conversion function with the weight of a pixel set smaller as the moving subject likelihood of that pixel is higher; and
combining the first image and the second image using the exposure conversion function.

An image composition program for operating a computer to generate a composite image using a first image and a second image having different exposure conditions, the program causing the computer to function as:
an input unit that inputs the first image and the second image;
a likelihood calculation unit that calculates a moving subject likelihood at each pixel based on a difference between the first image and the second image, before an exposure conversion function that matches the exposure conditions of the first image and the second image is estimated;
an exposure estimation unit that estimates the exposure conversion function with the weight of a pixel set smaller as the moving subject likelihood of that pixel is higher; and
a combining unit that combines the first image and the second image using the exposure conversion function.

A recording medium on which is recorded an image composition program for operating a computer to generate a composite image using a first image and a second image having different exposure conditions, the program causing the computer to function as:
an input unit that inputs the first image and the second image;
a likelihood calculation unit that calculates a moving subject likelihood at each pixel based on a difference between the first image and the second image, before an exposure conversion function that matches the exposure conditions of the first image and the second image is estimated;
an exposure estimation unit that estimates the exposure conversion function with the weight of a pixel set smaller as the moving subject likelihood of that pixel is higher; and
a combining unit that combines the first image and the second image using the exposure conversion function.