JP2004072685A - Device, method and program for compositing image and recording medium recording the program

Info

Publication number: JP2004072685A
Application number: JP2002233130A
Authority: JP
Inventors: Masashi Hirozawa; 広沢　昌司
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2002-08-09
Filing date: 2002-08-09
Publication date: 2004-03-04
Anticipated expiration: 2022-08-09
Also published as: JP3983624B2

Abstract

PROBLEM TO BE SOLVED: To prevent the background of the composited result from looking unnatural when images of two objects photographed separately are composited so that both objects appear in the same image.

SOLUTION: A first object image and a second object image are acquired. With one of them taken as a reference image, a background correction quantity calculating means 4 and a corrected image creating means 5 create a corrected image whose background is matched to that of the reference image. An overlapping image creating means 9 then creates an image in which the reference image and the corrected image are overlapped. Because the backgrounds of the overlapped images are matched, the separately photographed images of the first object and the second object are composited without a sense of incongruity in the background.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
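The present invention relates to a device, a method, a program, and a program medium that composite a plurality of separately photographed objects into a single image as if they had been present at the same time, and that assist the user so that the objects can be photographed and composited without overlapping one another.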
【０００２】
【従来の技術】
フィルムカメラやデジタルカメラで、例えば二人で並んで写真を撮る際、三脚を使ってセルフタイマーで撮影するか、通りがかりの人などに頼んで撮影してもらうしかない。
【０００３】
しかし、三脚を持ち歩くのは大変であり、また、見ず知らずの他人に頼むのも気が引けるという問題がある。
【０００４】
それに対して、特開２０００−３１６１２５号公報（２０００年１１月１４日公開）では、同一場所で撮影した複数枚の画像から被写体の領域を抽出し、被写体の画像を背景と合成したりしなかったりすることで、背景のみの画像や別の画像の被写体が同時に存在するかのような画像を合成することができる画像合成装置が開示されている。
【０００５】
また、特開２００１−３３３３２７号公報（２００１年１１月３０日公開）では、撮影済みの参照画像中の指定された領域（被写体領域）を撮影中の画像に重ねてモニタ画面またはファインダ内に表示させることができると共に、被写体領域内の被写体を、撮影中の画像に合成した合成画像の画像データを作成することができるデジタルカメラおよび画像処理方法が開示されている。
【０００６】
【発明が解決しようとする課題】
しかし、これら従来技術では、大きく２つの問題が出てくる。
【０００７】
１つ目の問題は、参照画像中の被写体領域を単に切り出して別の画像と重ね合わせるだけでは、被写体領域の指定が不正確な場合に（１）合成結果の被写体が欠けたり、（２）余計なものが合成されたり、（３）指定が正確であっても合成境界が微妙に不自然になったりするという点である。
【０００８】
例えば、（１）の、実際の被写体領域より参照画像中で指定した被写体領域（以下、指定被写体領域と呼ぶ）が欠けている場合は、合成画像上でもその被写体は欠けているので、明らかに不自然となる。
【０００９】
また、（２）の、実際の被写体領域より参照画像中の指定被写体領域が大きすぎる場合は、参照画像上での被写体周囲の背景も含んでしまっていることになる。上でいう「余計なもの」とは、この含んでしまっている背景部分のことである。特開２００１−３３３３２７号公報で説明される合成方法では、参照画像と撮影画像を違う場所で撮影することもありえるので、指定被写体領域に含まれてしまっている背景画像（参照画像上の背景）と、合成画像上でのその周囲の背景（撮影画像上の背景）とは異なることがある。この場合、合成画像上では、指定被写体領域で背景が突然変わるため、不自然な合成画像となる。
【００１０】
仮に、同じ場所、同じ背景でどちらも撮影されたとしても、特開２００１−３３３３２７号公報で説明される合成方法では、参照画像中の指定被写体領域を撮影画像上の任意の位置に配置・合成できるので、指定被写体領域に含まれてしまっている背景画像（参照画像上の背景）と、撮影画像上での合成位置周囲の背景（撮影画像の背景）とが、同じ位置の背景とは限らず、同様に合成結果は不自然となる。
【００１１】
特開２００１−３３３３２７号公報のように、参照画像中の指定被写体領域に対し、ユーザーがタブレットなどを使ってその輪郭を指定する場合、人間が輪郭を判断しながら指定するので指定被写体領域の指定が大きく間違うことは少ないが、１、２画素ないし数画素程度の誤りが出てくる可能性はある。もし、１画素の単位で人手で正確に指定しようとすると、大変な労力が必要となる。
【００１２】
また、（３）の、指定が正確であっても合成境界が微妙に不自然になる場合には、（１）、（２）のような指定被写体領域が画素単位で正確であったとしても、指定被写体領域の合成結果として、その輪郭の画素が撮影画像の背景と馴染まない場合をも含んでいる。
【００１３】
これは、指定被写体領域の輪郭は、画素単位の指定では精度が充分でなく、実際は１画素よりももっと細かい単位でないと表現できないためである。すなわち、輪郭の画素は、本来は被写体部分が（０．Ｘ）画素分、背景部分が（１．０−０．Ｘ）画素分となっており、画素値としては、被写体部分の画素値と背景部分の画素値とが割合に応じて足された値、すなわち平均化された値となっている。
【００１４】
このため、被写体部分と背景部分との割合は、平均化された画素値からは逆算できないので、結局、合成する時は画素単位で扱うしかない。その結果、合成画像の輪郭の画素値には、参照画像の背景の値が含まれてしまい、周囲の撮影画像の背景と馴染まなくなってしまう。
【００１５】
以上の（１）〜（３）の問題は、特開２０００−３１６１２５号公報に開示された合成方法によっても解決できない。同公報には、同一場所または互いに近くの場所で撮影した複数枚の画像を重ねる前に位置合わせを行うことが開示されている。
【００１６】
しかしながら、例えば同じ背景を使って２人が交互にお互いを撮影する場合、カメラの向きの違いによって撮影される背景の位置が移動するだけではなく、カメラの傾きによる画像の回転や、撮影者と被写体との距離のずれによる画像の拡大縮小や、撮影者の背丈の違いによってカメラの仰角が変わることによる画像の歪みが発生する。
【００１７】
このため、重ね合わせようとする画像の位置合わせを単に行うだけでは、上記（１）〜（３）の問題が解消されず、合成結果は不自然になってしまう。
【００１８】
２つ目の問題は、参照画像中の被写体領域と、別の被写体の含まれる撮影画像とを合成することを目的に撮影を行おうとすると、撮影時の被写体の位置に気をつけないと、それぞれの画像中の被写体の領域が合成画像上で互いに重なってしまったり、どちらかの被写体が合成画像からはみ出てしまう場合が出てくるという点である。
【００１９】
この問題に対して、特開２０００−３１６１２５号公報には、撮影済みの画像を使った合成方法が主に説明されているだけであり、被写体同士の重なりや合成画像からのはみだしを防ぐ撮影方法などには触れられていない。
【００２０】
また、特開２００１−３３３３２７号公報の画像処理方法によれば、参照画像中の被写体領域（ユーザーがタブレットなどを使って輪郭を指定する）と撮影中の画像とを重ねて表示することができるので、合成する場合の参照画像中の被写体領域と撮影中の画像中の被写体領域とに関して、被写体同士が重なるかどうかや、被写体領域が合成画像からはみだすかどうかを、撮影時に知ることができる。被写体の重なりやはみだしがある場合は、被写体やカメラを動かすことで撮影中の画像中の被写体の位置を変更することができ、重なりやはみだしが起こらない画像を撮影・記録することができるようになる。
【００２１】
しかし、被写体領域の認識処理や、被写体領域同士が重なっているかどうか、合成画像から被写体領域がはみだしているかどうかの判断処理など、高度な処理を人間自身がしなければならないという不便さがある。また、参照画像中の被写体の領域は手で指定しなければいけないという不便さもある。
【００２２】
本発明の第１の目的は、合成結果が不自然とならないような合成を行う画像合成装置（画像合成方法）を提供することであり、第２の目的は、別々に撮影された複数の被写体を、同時に存在するかのように一枚の画像に合成する際、合成画像上で被写体同士の重なりが起きないように撮影を補助する画像合成装置（画像合成方法）を提供することである。
【００２３】
【課題を解決するための手段】
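To solve the above problems, an image compositing device according to the present invention comprises: background correction quantity calculating means that calculates, or reads out a previously calculated, correction quantity consisting of any one or a combination of a relative translation amount, rotation amount, scaling ratio, and distortion correction amount of the background portion between a first object image, which is an image containing a background and a first object, and a second object image, which is an image containing at least part of that background and a second object; and overlapping image creating means that takes either the first object image or the second object image as a reference image, corrects the other image with the correction quantity obtained from the background correction quantity calculating means so that the background portions other than the objects overlap at least partially, and creates an image in which the reference image and the corrected image are overlapped.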
【００２４】
上記の構成において、「第１の被写体」、「第２の被写体」とは、合成を行おうとしている対象であり、一般には人物であることが多いが物などの場合もある。厳密には、「被写体」は、第１被写体画像と第２被写体画像との間で、背景部分が少なくとも一部重なるようにした時に、画素値が一致しない領域、すなわち変化がある領域は全て「被写体の領域」となる可能性を持つ。
【００２５】
但し、背景部分で、風で木の葉が揺れたなどの小さな変化でも変化がある領域となってしまうので、小さな変化や小さな領域はある程度無視する方が、「被写体の領域」を的確に抽出でき、より自然な重ね画像を得ることができる。
【００２６】
なお、例えば被写体が人物の場合、被写体は必ずしも一人であるとは限らず、複数の人物をまとめて「第１の被写体」や「第２の被写体」とする場合もある。つまり、複数人であっても、合成の処理の単位としてまとめて扱うものは一つの「被写体」となる。なお、人物でなく、物であっても同様である。
【００２７】
また、被写体は、必ずしも一つの領域であるとは限らず、複数の領域からなる場合もある。「第１」、「第２」は、異なるコマ画像として単に区別する為につけたものであり、撮影の順番などを表すものではなく、本質的な違いはない。また、例えば、人物が服や物などを持っていて、「第１、第２の被写体を含まない背景だけの画像」にそれらが現れないのならば、それらも被写体に含まれる。
【００２８】
「第１被写体画像」、「第２被写体画像」は、上記の「第１の被写体」、「第２の被写体」を含む別々の画像であり、一般には、カメラなどでその被写体を撮影した画像である。但し、画像上に被写体のみしか写っておらず、互いに共通する背景部分が全く写っていない場合は、合成に適さないので、少なくとも一部は互いに共通する背景部分が写っている必要がある。また、通常は、第１被写体画像、第２被写体画像は、同じ背景を使って、すなわちカメラをあまり動かさないで撮影する場合が多い。
【００２９】
なお、被写体を撮影するカメラは、画像を静止画として記録するスチルカメラである必要はなく、画像を動画として記録するビデオカメラであってもよい。ビデオカメラで静止画としての重ね画像を生成する場合、撮影した動画を構成する１フレームの画像を被写体画像として取り出し、合成に用いることになる。
【００３０】
「背景の部分」とは、第１被写体画像、第２被写体画像から「第１の被写体」、「第２の被写体」を除いた部分である。
【００３１】
「移動量」は、基準画像と背景の少なくとも一部が重なる位置へ、他の画像を平行移動させる量だが、回転や拡大縮小の中心の対応点の移動量と言ってもよい。
【００３２】
「歪補正量」とは、カメラやレンズの位置や方向が変わったことによる撮影画像の変化のうち、平行移動、回転、拡大縮小では補正できない残りの変化を補正する為の補正量である。例えば、高い建物を撮影した時に、上の方が遠近法の効果により同じ大きさであっても小さく写ってしまう「あおり」などとよばれる効果などを補正する場合などがこれに含まれる。
【００３３】
「重ね画像生成手段」は、重ね画像を生成するが、必ずしも一つの画像データとして生成しなくてもよく、他の手段の画像データと合わせて合成したかのように見えるのでも構わない。例えば、表示手段上にある画像を表示する際、その画像に上書きする形で別の画像を一部表示すれば、見た目には２つの画像データから１つの合成画像データを生成し、その合成画像データを表示しているかのように見えるが、実際は、２つの画像データに基づく画像がそれぞれ存在するだけで、合成画像データは存在していない。
【００３４】
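The background correction quantity calculating means can calculate the correction quantity using a technique that finds local positional correspondences between two images, such as block matching. Applying such a technique to the first object image and the second object image yields the positional correspondence of whatever parts of the background the two images share. An object portion does not exist in the other image, so the correspondences obtained there are wrong. From the mixture of correct background correspondences and wrong object correspondences, only the correct background correspondences are kept, for example by a statistical method. From the remaining correct correspondences, the correction quantity consisting of any one or a combination of the relative translation amount, rotation amount, scaling ratio, and distortion correction amount of the background portion can be calculated.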
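The idea in the paragraph above can be sketched for the translation-only case. This is an illustrative sketch, not the method of the specification: the names `sad`, `match_block`, and `estimate_translation`, the sum-of-absolute-differences cost, and the use of the median as the "statistical method" are all assumptions, and images are plain lists of rows of grayscale values of equal size.

```python
from statistics import median

def sad(a, b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(a[ay + j][ax + i] - b[by + j][bx + i])
               for j in range(size) for i in range(size))

def match_block(ref, other, x, y, size, search):
    """Displacement (dx, dy), within +/-search, at which the block at (x, y)
    in `ref` best matches a block in `other`."""
    h, w = len(other), len(other[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ox, oy = x + dx, y + dy
            if 0 <= ox <= w - size and 0 <= oy <= h - size:
                cost = sad(ref, other, x, y, ox, oy, size)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best[1], best[2]

def estimate_translation(ref, other, size=4, search=3):
    """Median of per-block displacements (assumed statistical filter).
    Blocks on an object or at the image edge give outlier vectors, which
    the median ignores as long as background blocks are the majority."""
    h, w = len(ref), len(ref[0])
    vectors = [match_block(ref, other, x, y, size, search)
               for y in range(0, h - size + 1, size)
               for x in range(0, w - size + 1, size)]
    return (median(v[0] for v in vectors), median(v[1] for v in vectors))
```

The result is the displacement of the background from the reference image to the other image; correcting the other image onto the reference image would shift it by the negated vector. Rotation, scaling, and distortion would need a richer model fitted to the same correspondences.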
【００３５】
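Based on the correction quantity calculated by the background correction quantity calculating means, the overlapping image creating means creates an image in which the other image is corrected so that its background portion coincides with that of the reference image. The overlapping image creating means then creates an image in which the corrected image is overlapped on the reference image.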
【００３６】
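To overlap the images, the image data of positionally corresponding pixels of the two images are mixed at an arbitrary ratio, with weights in the range 0 to 1 apportioned between the two images. For example, if the ratio of the first object image is 1 and that of the second object image is 0, only the image data of the first object image is written to that pixel. If the mixing ratio of the two images is 1:1, image data that combines the two images equally is written to that pixel.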
【００３７】
How the mixing ratio is set is not essential to the present invention; it depends on what kind of overlapping image the user wants to display or output.
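The mixing described above amounts to a weighted sum per pixel. A minimal sketch, assuming single-channel images stored as lists of rows and hypothetical function names `blend_pixel` and `blend_images`:

```python
def blend_pixel(p_ref, p_cor, ratio_ref):
    """Mix two pixel values; ratio_ref in [0, 1] is the share of the
    reference image, and the corrected image gets the remainder."""
    return ratio_ref * p_ref + (1.0 - ratio_ref) * p_cor

def blend_images(ref, cor, ratio_ref):
    """Apply the same mixing ratio at every pixel position."""
    return [[blend_pixel(a, b, ratio_ref) for a, b in zip(row_r, row_c)]
            for row_r, row_c in zip(ref, cor)]
```

A ratio of 1.0 reproduces the reference image unchanged, and 0.5 corresponds to the 1:1 mix mentioned in the text.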
【００３８】
以上の処理によって、第１の被写体と第２の被写体とを背景部分を一致させた状態で一枚の画像上に合成することができる。
【００３９】
二つの画像間の背景のずれや歪みを補正して合成することができるので、これによって、被写体など明らかに異なる領域を除いた以外の部分（すなわち背景部分）は、どのように重ねても合成結果がほぼ一致し、合成結果が不自然とならないという効果が出てくる。例えば被写体領域だけを主に合成しようとした時、被写体領域の抽出や指定が多少不正確であっても、被写体領域の周りの背景部分が合成先の画像の部分とずれや歪みがないので、不正確な領域の内外が連続した風景として合成され、見た目の不自然さを軽減するという効果が出てくる。
【００４０】
被写体領域の抽出が画素単位で正確であったとしても、課題の項で説明した通り、１画素より細かいレベルでの不自然さは従来技術の方法では出てしまうが、本発明では、背景部分のずれや歪みを無くしてから合成しているので、輪郭の画素の周囲の画素は、同じ背景部分の位置の画素なので、合成してもほぼ自然なつながりとなる。このように、１画素より細かいレベルでの不自然さを防ぐ、あるいは軽減するという効果が出てくる。
【００４１】
また、背景のずれや歪みを補正して合成するので、第１、第２被写体画像の撮影時にカメラなどを三脚などで固定する必要がなく、手などで大体の方向を合わせておけばよく、撮影が簡単になるという効果が出てくる。
【００４２】
なお、背景補正量算出手段の動作である、「背景部分の相対的な移動量、回転量、拡大縮小率、歪補正量のいずれかもしくは組み合わせからなる補正量を算出する」を「背景部分の相対的な移動量に、相対的な回転量、拡大縮小率または歪補正量のいずれかもしくは複数を組み合わせた補正量を算出する」としてもよい。これにより、補正の精度が一層向上し、より自然な合成結果を得ることができる。
【００４３】
さらに、背景補正量算出手段の上記２種類の動作をユーザーが入力手段を介して選択的に切り換えられるようにすれば、補正の精度を重視したい場合と、処理速度または処理負担軽減を重視したい場合とを使い分けることができ、画像合成装置の操作性が向上する。
【００４４】
本発明に係る画像合成装置は、上記の課題を解決するために、被写体や風景を撮像する撮像手段を有し、第１被写体画像または第２被写体画像は、前記撮像手段の出力に基づいて生成されることを特徴とする。
【００４５】
上記の構成によれば、重ね画像を生成する画像合成装置が、撮像手段を具備することで、ユーザーが被写体や風景を撮影したその場で、重ね画像を生成することができるため、ユーザーにとっての利便性が向上する。また、重ね画像を生成した結果、もし被写体同士の重なりがあるなどの不都合があれば、その場で撮影し直すことができるという効果が出てくる。
【００４６】
なお、撮像手段から得られる画像は、通常、画像合成装置に内蔵されているか否かを問わない主記憶や外部記憶などに記録し、シャッターボタンなどを利用して記録するタイミングをユーザーが指示する。そして、記録された画像を第１被写体画像、または第２被写体画像として、合成処理に利用することになる。
【００４７】
本発明に係る画像合成装置は、上記の課題を解決するために、第１被写体画像と第２被写体画像のうち、後に撮影した方を基準画像とすることを特徴とする。
【００４８】
上記の構成によれば、例えば、第１被写体画像、第２被写体画像の順に撮影したとすると、第２被写体画像を基準画像にする。そして、第２被写体画像を基準画像として、第１被写体画像を補正する。この際、第２被写体画像（基準画像）と第１被写体画像の間で、背景部分の移動量などの補正量を算出し、その補正量を使って第１被写体画像の補正を行う。第２被写体画像（基準画像）、補正された第１被写体画像を使って、合成画像を合成する。そして合成画像の表示などを行う。
【００４９】
この結果、表示される合成画像は、直前に撮影したばかりの、あるいは合成画像をリアルタイム表示する形態では現在撮影中の第２被写体画像の背景の範囲となるので、撮影者にとっては違和感が無いという効果が出てくる。
【００５０】
もし第１被写体画像を基準画像とすると、合成画像の背景の範囲は、第１被写体画像の背景の範囲となる。第１被写体画像の背景の範囲は、カメラの方向などが変わっていて、先ほど撮影した第２被写体画像の背景の範囲と変わっているかもしれず、撮影者が変わることもある。その場合、後で撮影した背景の範囲と、表示される合成画像の背景の範囲とが一致しないので、撮影者などにとって違和感が出てくる。
【００５１】
さらに、上記の第２被写体画像の撮影から合成画像の表示をリアルタイムに繰り返すとすると、第２被写体画像を撮影画像に更新し続けているにも関わらず、合成画像の背景の範囲は第１被写体の背景の範囲のままなので、この違和感は一層増幅される。
【００５２】
本発明に係る画像合成装置は、上記の課題を解決するために、前記重ね画像生成手段において、基準画像と補正した画像とを、それぞれ所定の透過率で重ねることを特徴とする。
【００５３】
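In the above configuration, the "predetermined transmittance" may be a fixed value, a value that varies by region, or a value that changes gradually near a region boundary.
The overlapping image creating means determines a pixel position in the overlapping image, obtains the pixel value at that position in the reference image and the pixel value at that position in the corrected other image, and uses the two pixel values, weighted by the predetermined transmittance, as the pixel value of the overlapping image. This processing is performed for every pixel position of the overlapping image.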
【００５４】
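If the transmittance is varied with pixel position, the reference image can be weighted more heavily in some places and the corrected image more heavily in others.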
【００５５】
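For example, when only the object region of the corrected image is overlapped on the reference image, the inside of the object region is overlapped opaquely (that is, the object in the corrected image is used as it is), and around the object region the share of the reference image increases with distance from the object region. Then, even if the object region, that is, the extracted outline of the object, is wrong, the surrounding pixels change gradually from the corrected image to the reference image, so the error becomes inconspicuous.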
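One way to picture the position-dependent weighting above is a feathered opacity map. The sketch below is an assumption-laden illustration: the linear falloff, the chessboard distance, and the names `feather_alpha` and `composite` are not taken from the specification.

```python
def feather_alpha(mask, feather):
    """mask[y][x] is True inside the object region. Returns the opacity of
    the corrected image: 1.0 inside the region, falling linearly with the
    chessboard distance d outside it as 1 - d / (feather + 1), and 0.0
    beyond `feather` pixels."""
    h, w = len(mask), len(mask[0])
    dist = [[0 if mask[y][x] else None for x in range(w)] for y in range(h)]
    for d in range(1, feather + 1):
        grown = []  # pixels that touch the previous distance layer
        for y in range(h):
            for x in range(w):
                if dist[y][x] is None and any(
                        0 <= y + j < h and 0 <= x + i < w
                        and dist[y + j][x + i] == d - 1
                        for j in (-1, 0, 1) for i in (-1, 0, 1)):
                    grown.append((x, y))
        for x, y in grown:
            dist[y][x] = d
    return [[0.0 if v is None else 1.0 - v / (feather + 1) for v in row]
            for row in dist]

def composite(ref, cor, alpha):
    """Per-pixel mix: alpha weights the corrected image, 1 - alpha the reference."""
    return [[a * c + (1.0 - a) * r for r, c, a in zip(rr, cr, ar)]
            for rr, cr, ar in zip(ref, cor, alpha)]
```

Because the backgrounds were aligned beforehand, the blended band around the region mixes two views of the same scenery, which is why an imprecise outline becomes hard to see.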
【００５６】
また、例えば被写体領域だけを半分の透過度で重ねる、などの合成表示をすることで、表示されている画像のどの部分が以前に撮影した合成対象部分で、どの部分が今撮影している画像なのかをユーザーや被写体が判別しやすくなるという効果も出てくる。
【００５７】
また、人間は、常識（画像理解）を使うことで、画像中の背景部分と被写体部分（輪郭）を区別する能力を通常、持っている。被写体領域を半分の透過度で重ねて表示しても、その能力は一般に有効である。
【００５８】
従って、被写体領域を半分の透過度で重ねて表示することで、複数の被写体の領域が重なっている場合でも、それぞれの被写体の領域を前記能力で区別することができ、それらが合成画像上で位置的に重なっているかどうかを容易に判断することができる。
【００５９】
第１被写体画像と第２被写体画像を左右に並べて見比べることでも重なりがあるかどうかを判断することは不可能ではないが、その際は、それぞれの画像中の被写体領域を前記能力で区別し、それぞれの画像の背景部分の重なりを考慮して、区別した被写体領域同士が重なるかどうかを頭の中で計算して判断しなければいけない。この一連の作業を頭の中だけで正確に行うことは、合成画像中の被写体領域を区別する先の方法と比べると、難しい。
【００６０】
つまり、背景部分が重なるような位置合わせを機械に行わせることで、人間の高度な画像理解能力を使って、被写体領域同士が重なっているかどうかを判断し易い状況を作り出しているといえる。このように、被写体領域を半分の透過度で重ねて表示することで、被写体同士の重なりなどがある場合も、今撮影している被写体の位置を判別しやすくなるという効果も出てくる。
【００６１】
なお、本請求項に記載した構成を、前記請求項に記載した各構成と、必要に応じて任意に組み合わせてもよい。
【００６２】
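To solve the above problems, the image compositing device according to the present invention is characterized in that the overlapping image creating means creates the regions that differ in the difference image between the reference image and the corrected image as an image whose pixel values differ from the original pixel values.
Here, a "difference image" is an image created by comparing the pixel values at the same position in two images and using the difference as the pixel value. In general, the absolute value of the difference is used.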
【００６３】
「元の画素値と異なる画素値」とは、例えば、透過率を変えて半透明にしたり、画素値の明暗や色相などを逆にして反転表示させたり、赤や白、黒などの目立つ色にしたり、などを実現するような画素値である。また、領域の境界部分と内部とで、前述したように画素値を変えてみたり、境界部分を点線で囲ってみたり、点滅表示（時間的に画素値を変化させる）させてみたり、というような場合も含む。
【００６４】
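With the above configuration, the pixel values at the same pixel position are obtained from the reference image and the corrected other image, and where they differ, the pixel value of the overlapping image at that position is set to a value different from that of the other regions. By performing this processing for every pixel position, the region of the difference can be created as an image whose pixel values differ from the original pixel values.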
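As a rough illustration of the per-pixel processing just described, the sketch below builds an absolute difference image and replaces differing pixels with a conspicuous marker value. The function names and the choice of a fixed marker value are assumptions; the text equally allows translucency, inversion, or blinking.

```python
def difference_image(ref, cor):
    """Absolute difference of pixel values at the same position."""
    return [[abs(a - b) for a, b in zip(rr, cr)] for rr, cr in zip(ref, cor)]

def highlight_differences(ref, cor, marker=255):
    """Copy the reference image, replacing every pixel that differs from
    the corrected image with `marker` (an assumed conspicuous value)."""
    diff = difference_image(ref, cor)
    return [[marker if d != 0 else a for a, d in zip(rr, dr)]
            for rr, dr in zip(ref, diff)]
```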
【００６５】
これによって、二つの画像間で一致しない部分がユーザーに分かりやすくなるという効果が出てくる。例えば、第１や第２の被写体の領域は、基準画像上と補正画像上では、片方は被写体の画像、他方は背景部分の画像となるので、差分画像中の差のある領域として抽出される。抽出された領域を半透明にしたり、反転表示したり、目立つような色の画素値とすることで、被写体の領域がユーザーに分かりやすくなるという効果が出てくる。
【００６６】
なお、本請求項に記載した構成を、前記請求項に記載した各構成と、必要に応じて任意に組み合わせてもよい。
【００６７】
本発明に係る画像合成装置は、上記の課題を解決するために、基準画像と補正した画像の間の差分画像中から、第１の被写体の領域と第２の被写体の領域を抽出する被写体領域抽出手段を有し、前記重ね画像生成手段において、基準画像と補正した画像とを重ねる代わりに、基準画像または補正した画像と前記被写体領域抽出手段から得られる領域内の画像とを重ねることを特徴とする。
【００６８】
ここで、「被写体の領域」とは、被写体が背景と分離される境界で区切られる領域である。例えば、第１被写体画像中で人物が服や物などを持っていて、第２被写体画像中でそれらが現れないのならば、それらも被写体であり、被写体領域に含まれる。なお、被写体の領域は、必ずしも繋がった一塊の領域とは限らず、複数の領域に分かれていることもある。
【００６９】
「前記被写体領域抽出手段から得られる領域内の・・・画像を重ねる」とは、その領域以外は何も画像を生成しないということではなく、それ以外の領域は基準画像などで埋めることを意味する。
【００７０】
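Because the images are corrected so that the background portions coincide, what appears as a difference is mainly the object portions. The object region extracting means can therefore extract the object regions contained in the difference image. Removing noise from the difference image at this point (for example, discarding pixels whose difference value is at or below a threshold) allows the object regions to be extracted more accurately.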
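The noise removal mentioned above could look like the following sketch. The threshold test follows the example in the text; additionally dropping isolated pixels is an extra assumption of this sketch, as is the name `object_mask`.

```python
def object_mask(diff, threshold):
    """True where the difference exceeds `threshold` and at least one
    4-connected neighbour does too (isolated pixels are treated as noise;
    this second rule is an assumption, not taken from the text)."""
    h, w = len(diff), len(diff[0])
    raw = [[diff[y][x] > threshold for x in range(w)] for y in range(h)]

    def has_neighbour(x, y):
        return any(0 <= y + j < h and 0 <= x + i < w and raw[y + j][x + i]
                   for i, j in ((1, 0), (-1, 0), (0, 1), (0, -1)))

    return [[raw[y][x] and has_neighbour(x, y) for x in range(w)]
            for y in range(h)]
```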
【００７１】
重ね画像を生成する際、各画素位置の画素値を決めるが、その画素位置が被写体領域抽出手段から得られる被写体領域内の場合のみ、被写体の画像を重ねるようにする。
【００７２】
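This makes it possible to composite only the object region of the corrected object image onto the reference image. Alternatively, only the object region of the reference image can be composited onto the corrected object image.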
【００７３】
なお、重ね画像生成手段における被写体領域の透過率を変える処理と組み合わせることで、どの領域を合成しようとしているかがユーザーに分かり易くなり、合成の結果として被写体同士に重なりなどが生じる場合には、それもさらに分かり易くなるという効果が出てくる。さらに、それによって、重なりが起きないように撮影を補助することができるという効果が出てくる。
【００７４】
なお、重なりがある場合は、被写体やカメラを動かすなどして、重なりの無い状態で撮影し直すのが良い訳だが、この場合の補助とは、例えば、重なりが起きるかどうかをユーザーに認識し易くすることや、どのくらい被写体やカメラを動かせば重なりが解消できそうかを、ユーザーが判断する材料（ここでは合成画像）を与えること、などになる。
【００７５】
なお、本請求項に記載した構成を、前記請求項に記載した各構成と、必要に応じて任意に組み合わせてもよい。
【００７６】
本発明に係る画像合成装置の前記被写体領域抽出手段は、上記の課題を解決するために、第１被写体画像中あるいは補正された第１被写体画像中から第１の被写体の領域内の画像および第２の被写体の領域内の画像を抽出すると共に、第２被写体画像中あるいは補正された第２被写体画像中から第１の被写体の領域内の画像および第２の被写体の領域内の画像を抽出し、さらに皮膚色を基準として第１の被写体の画像および第２の被写体の画像を選別することを特徴とする。
【００７７】
上記の構成において、被写体領域抽出手段は、差分画像から抽出した被写体領域が、第１の被写体の領域あるいは第２の被写体の領域であることは分かるが、個々の被写体の領域が、第１の被写体の領域なのか第２の被写体の領域なのかは分からない。言い方を変えれば、その領域が示している被写体の画像は、第１被写体画像中に存在するのか、あるいは第２被写体画像中に存在するのか分からない、ということになる。
【００７８】
そこで、被写体が人物であることが分かっているならば、個々の領域中の画素の色を、第１被写体画像（基準画像）と補正された第２被写体画像、または第２被写体画像（基準画像）と補正された第１被写体画像とでそれぞれ調べる。この場合、いずれにしても、基準画像と補正された画像とのそれぞれについて、被写体領域抽出手段が第１の被写体の領域内の画像および第２の被写体の領域内の画像を抽出するから、合計４つの画像部分が抽出されることになる。
【００７９】
この抽出した４つの画像部分の中には、第１の被写体の画像部分、第２の被写体の形をした背景部分、第１の被写体の形をした背景部分、第２の被写体の画像部分とが含まれている。そこで、皮膚色を基準にすることで、皮膚色またはそれに近い色を持つ第１の被写体および第２の被写体の各画像部分を選り分けることができる。
【００８０】
これによって、抽出した画像部分がどちらの被写体であるかを自動的に簡単に判別できる効果が出てくる。
【００８１】
本発明に係る画像合成装置は、上記の課題を解決するために、前記被写体領域抽出手段は、第１被写体画像中あるいは補正された第１被写体画像中から第１の被写体の領域内の画像および第２の被写体の領域内の画像を抽出すると共に、第２被写体画像中あるいは補正された第２被写体画像中から第１の被写体の領域内の画像および第２の被写体の領域内の画像を抽出し、さらにその各領域外の画像の特徴を基準として第１の被写体の画像および第２の被写体の画像を選別することを特徴とする。
【００８２】
上記の構成において、被写体領域抽出手段が４つの画像部分を抽出する点は、前述のとおりである。但し、第１の被写体および第２の被写体の各画像部分を選り分ける基準として、前記のように皮膚色を使うのではなく、各領域外の画像の特徴を使う。
【００８３】
ここで、「特徴」とは、着目した領域の画像の持つ性質、属性などであり、特徴量として数値化して表現できる性質が好ましい。特徴量としては、例えば、各色の画素値や、その色相、彩度、明度のほか、画像の模様や構造を表す統計量として、同時生起行列や差分統計量、ランレングス行列、パワースペクトル、それらの第２次統計量、高次統計量などがある。
【００８４】
個々の領域中、すなわち抽出した画像部分の特徴量を、基準画像と補正された画像とでそれぞれ求める。またその領域の周囲の領域の特徴量も、基準画像と補正された画像とでそれぞれ求める。領域中の特徴量とその周囲の領域の特徴量の差を、第１被写体画像と第２被写体画像で比較し、差が大きい方を被写体領域の画像とする。
【００８５】
これによって、抽出した画像部分がどちらの被写体であるかを自動的に簡単に判別できる効果が出てくる。
【００８６】
本発明に係る画像合成装置は、上記の課題を解決するために、前記被写体領域抽出手段から得られる第１の被写体あるいは第２の被写体の領域の数が、合成する被写体の数として設定された値と一致しない時に、第１の被写体の領域と第２の被写体の領域が重なっていると判断する重なり検出手段を有することを特徴とする。
【００８７】
上記の構成において、「第１の被写体あるいは第２の被写体の領域」とは、差分画像などから抽出される被写体の領域で、第１の被写体の領域かあるいは第２の被写体の領域かの区別がついていなくてもよい領域である。
【００８８】
「合成する被写体」とは、合成処理の過程で求められる被写体のことではなく、実際に存在する被写体のことであり、ユーザーが合成しようとしている被写体のことである。但し、上述した通り、合成の処理の単位としてまとめて扱うものは一つの「被写体」なので、１つの被写体が複数の人物であることもありえる。
【００８９】
また、被写体の数は画像合成装置に固定的に設定しておく形態でもよいが、使い勝手としては、重なり検出手段が重なり検出を行う以前に、撮影者等のユーザーの指示に基づいて画像合成装置に設定される形態とすることが好ましい。
【００９０】
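If the objects do not overlap, the object regions extracted from the difference image by the object region extracting means are separate from one another; if the objects overlap, the region of the first object and the region of the second object are merged into one continuous region. The overlap detecting means therefore compares the number of extracted object regions with the number of objects (the set value), and judges that there is no overlap if they match and that there is an overlap if they do not.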
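The counting rule above can be sketched with a connected-component count over a binary object mask. The 4-connectivity, the breadth-first labelling, and the names `count_regions` and `objects_overlap` are assumptions made for illustration.

```python
from collections import deque

def count_regions(mask):
    """Number of 4-connected regions of truthy cells in a 2D mask."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                count += 1               # new region found; flood-fill it
                seen[y][x] = True
                queue = deque([(x, y)])
                while queue:
                    cx, cy = queue.popleft()
                    for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                                   (cx, cy + 1), (cx, cy - 1)):
                        if (0 <= nx < w and 0 <= ny < h
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
    return count

def objects_overlap(mask, expected_objects):
    """Overlap is assumed when the region count differs from the set value."""
    return count_regions(mask) != expected_objects
```

In practice small noise regions would have to be filtered out first, since any stray region also changes the count.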
【００９１】
その判断結果は、重なりの有無を合成画面やランプなどで撮影者や被写体に通知、警告するのに利用することができる。
【００９２】
これによって、被写体同士が重なり合っている部分があるかどうかをユーザーに判別させやすくすることができるという効果が出てくる。それによって、重なりが起きないように撮影を補助する効果については、前述したものと同様である。
【００９３】
本発明に係る画像合成装置は、上記の課題を解決するために、前記重なり検出手段において重なりが検出される時、重なりが存在することを、ユーザーあるいは被写体あるいは両方に警告する重なり警告手段を有することを特徴とする。
【００９４】
ここで、「警告」には、表示手段などに文字や画像で警告することも含まれるし、ランプなどによる光やスピーカなどによる音声、バイブレータなどによる振動など、ユーザーや被写体が感知できる方法ならば何でも含まれる。
【００９５】
これによって、被写体同士が重なり合っている場合に、重なり警告手段の動作によって警告されるので、ユーザーがそれに気づかずに撮影／記録したり合成処理したりということを防ぐことができ、さらに被写体にも位置調整等が必要であることを即時に知らせることができるという撮影補助の効果が出てくる。
【００９６】
本発明に係る画像合成装置は、上記の課題を解決するために、前記重なり検出手段において、重なりが検出されない時、重なりが存在しないことを、ユーザーあるいは被写体あるいは両方に通知するシャッターチャンス通知手段を有することを特徴とする。
【００９７】
ここで、「通知」には、「警告」同様、ユーザーや被写体が感知できる方法ならば何でも含まれる。
【００９８】
これによって、被写体同士が重なり合っていない時をユーザーが知ることができるので、撮影や撮影画像記録、合成のタイミングをそれに合わせて行えば、被写体同士が重ならずに合成することができるという撮影補助の効果が出てくる。
【００９９】
また、被写体にも、シャッターチャンスであることを通知できるので、ポーズや視線などの備えを即座に行えるという撮影補助の効果も得られる。
【０１００】
本発明に係る画像合成装置は、上記の課題を解決するために、被写体や風景を撮像する撮像手段を有し、前記重なり検出手段で重なりが検出されない時に、前記撮像手段から得られる画像を第１被写体画像、または第２被写体画像として記録する指示を生成する自動シャッター手段を有することを特徴とする。
【０１０１】
上記の構成において、撮影画像を第１被写体画像、第２被写体画像として記録するというのは、例えば、主記憶や外部記憶に記録するなどで実現される。したがって、自動シャッター手段は、第１の被写体の領域と第２の被写体の領域とに重なりが無いという信号を重なり検出手段から入力したときに、主記憶や外部記憶に対する記録制御処理の指示を出力する。
【０１０２】
そして、背景補正量算出手段や重ね画像生成手段は、主記憶や外部記憶に記録されている画像を読み込むことで、第１被写体画像、第２被写体画像を得ることができるようになる。
【０１０３】
なお、自動シャッター手段が自動的に指示を出しても、即座に画像が記録されるとは限らない。例えば、同時にシャッターボタンも押されているとか、自動記録モードになっているなどの状態でないと記録されないようにしてもよい。
【０１０４】
これによって、被写体同士が重なり合っていない時に自動的に撮影が行われるので、ユーザー自身が重なりがあるかどうかを判別してシャッターを押さなくても良いという撮影補助の効果が出てくる。
【０１０５】
本発明に係る画像合成装置は、上記の課題を解決するために、被写体や風景を撮像する撮像手段を有し、前記重なり検出手段で重なりが検出される時に、前記撮像手段から得られる画像を、第１被写体画像、あるいは第２被写体画像として記録することを禁止する指示を生成する自動シャッター手段を有することを特徴とする。
【０１０６】
上記の構成によれば、自動シャッター手段は、重なり検出手段から重なりがあるという信号を得たら、撮像手段から得られる画像を主記憶や外部記憶などに記録することを禁止する指示を出力する。この結果、例えば、シャッターボタンが押されたとしても、撮像手段から得られる画像は記録されない。なお、この禁止処理は、自動禁止モードになっているなどの状態でないと行われないようにしてもよい。
【０１０７】
Because no image is captured while the objects overlap, this assists shooting by preventing the user from mistakenly capturing or recording an image in an overlapping state.
【０１０８】
本発明に係る画像合成方法は、上記の課題を解決するために、背景と第１の被写体とを含む画像である第１被写体画像と、上記背景の少なくとも一部と第２の被写体とを含む画像である第２被写体画像との間での、背景部分の相対的な移動量、回転量、拡大縮小率、歪補正量のいずれかもしくは組み合わせからなる補正量を算出する、あるいは予め算出しておいた補正量を読み出す背景補正量算出ステップと、第１被写体画像、第２被写体画像のどちらかを基準画像とし、他方の画像を被写体以外の背景の部分が少なくとも一部重なるように前記背景補正量算出ステップから得られる補正量で補正し、基準画像と補正した画像を重ねた画像を生成する重ね画像生成ステップと、を有することを特徴とする。
【０１０９】
これによる種々の作用効果は、前述したとおりである。
【０１１０】
本発明に係る画像合成プログラムは、上記の課題を解決するために、上記画像合成装置が備える各手段として、コンピュータを機能させることを特徴とする。
【０１１１】
本発明に係る画像合成プログラムは、上記の課題を解決するために、上記画像合成方法が備える各ステップをコンピュータに実行させることを特徴とする。
【０１１２】
本発明に係る記録媒体は、上記の課題を解決するために、上記画像合成プログラムを記録したことを特徴とする。
【０１１３】
これにより、上記記録媒体、またはネットワークを介して、一般的なコンピュータに画像合成プログラムをインストールすることによって、該コンピュータを用いて上記の画像合成方法を実現する、言い換えれば、該コンピュータを画像合成装置として機能させることができる。
【０１１４】
【発明の実施の形態】
以下、本発明の実施の形態を図面を参照して説明する。
【０１１５】
まず、言葉の定義について説明しておく。
【０１１６】
「第１の被写体」、「第２の被写体」とは、合成を行おうとしている対象であり、一般には人物であることが多いが物などの場合もある。厳密には、「被写体」は、第１被写体画像と第２被写体画像との間で、背景部分が少なくとも一部重なるようにした時に、画素値が一致しない領域、すなわち変化がある領域は全て「被写体の領域」となる可能性を持つ。但し、背景部分で、風で木の葉が揺れたなどの小さな変化でも変化がある領域となってしまうので、小さな変化や小さな領域はある程度無視する方が好ましい。
【０１１７】
なお、例えば被写体が人物の場合、被写体は必ずしも一人であるとは限らず、複数の人物をまとめて「第１の被写体」や「第２の被写体」とする場合もある。つまり、複数人であっても、合成の処理の単位としてまとめて扱うものは一つの「被写体」となる。
【０１１８】
なお、人物でなく、物であっても同様である。また、被写体は、必ずしも一つの領域であるとは限らず、複数の領域からなる場合もある。「第１」、「第２」は、異なるコマ画像として単に区別する為につけたものであり、撮影の順番などを表すものではなく、本質的な違いはない。また、例えば、人物が服や物などを持っていて、「第１の被写体または第２の被写体を含まない背景だけの画像」にそれらが現れないのならば、それらも被写体に含まれる。
【０１１９】
「第１被写体画像」、「第２被写体画像」は、上記の「第１の被写体」、「第２の被写体」を含む別々の画像であり、一般には、カメラなどでその被写体を別々に撮影した画像である。但し、画像上に被写体のみしか写っておらず、互いに共通する背景部分が全く写っていない場合は、その共通する背景部分を元にした位置合わせができないので、合成に適さない。したがって、少なくとも一部は（合成した被写体の周囲を自然にするために、より好ましくは、合成しようとする被写体の周囲において）互いに共通する背景部分が写っている必要がある。また、通常は、第１被写体画像、第２被写体画像は、同じ背景を使って、すなわちカメラをあまり動かさないで撮影する場合が多い。
【０１２０】
「背景部分」とは、第１被写体画像、第２被写体画像から「第１の被写体」、「第２の被写体」をそれぞれ除いた部分である。
【０１２１】
「移動量」は、平行移動させる量だが、回転や拡大縮小の中心の対応点の移動量と言ってもよい。
【０１２２】
「歪補正量」とは、カメラやレンズの位置や方向が変わったことによる撮影画像の変化のうち、平行移動、回転、拡大縮小では補正できない残りの変化を補正する為の補正量である。例えば、高い建物を撮影した時に、上の方が遠近法の効果により同じ大きさであっても小さく写ってしまう「あおり」などとよばれる効果などを補正する場合などがこれに含まれる。
【０１２３】
「重ね画像生成手段」は、重ね画像を生成するが、必ずしも一つの画像として生成しなくてもよく、他の手段との協働で合成したかのように見せる処理を行うのでも構わない。例えば、表示手段上にある画像を表示する際、その画像に上書きする形で別の画像を一部表示すれば、見た目には２つの画像から合成画像を生成し、その合成画像を表示しているかのように見えるが、実際は、２つの画像がそれぞれ存在するだけで、合成画像は存在していない。
【０１２４】
「画素値」とは、画素の値であり、一般に所定のビット数を使って表される。例えば、白黒二値の場合は１ビットで表現され、２５６階調のモノクロの場合、８ビット、赤、緑、青の各色２５６階調のカラーの場合、２４ビットで表現される。カラーの場合、赤、緑、青の光の３原色に分解されて表現されることが多い。
【０１２５】
なお、似た言葉として、「濃度値」、「輝度値」などがある。これは目的によって使い分けているだけであり、「濃度値」は主に画素を印刷する場合、「輝度値」は主にディスプレイ上に表示する場合に使われるが、ここでは目的は限定していないので、「画素値」と表現することにする。
【０１２６】
「透過率」とは、複数の画素の画素値に所定の割合の値を掛けて、その和を新たな画素値とする処理において、掛ける「所定の割合の値」のことである。通常、０以上、１以下の値である。また、１つの新たな画素値で使われる各画素の透過率の和は１とする場合が多い。「透過率」でなく、「不透明度」と言う場合もある。「透明度」は１から「不透明度」を引いた値である。
【０１２７】
「所定の透過率」には、固定された値、領域に応じて変わる値、領域の境界付近で徐々に変わる値なども含まれる。
【０１２８】
「差分画像」とは、二つの画像中の同じ位置の画素値を比較して、その差の値を画素値として作成する画像のことである。一般には、差の値は絶対値をとることが多い。
【０１２９】
「元の画素値と異なる画素値」とは、例えば、透過率を変えて半透明にしたり、画素値の明暗や色相などを逆にして反転表示させたり、赤や白、黒などの目立つ色にしたり、などを実現するような画素値である。また、領域の境界部分と内部とで、上記のように画素値を変えてみたり、境界部分を点線で囲ってみたり、点滅表示（時間的に画素値を変化させる）させてみたり、というような場合も含む。
【０１３０】
「被写体の領域」とは、被写体が背景と分離される境界で区切られる領域である。例えば、第１被写体画像中で人物が服や物などを持っていて、第２被写体画像中でそれらが現れないのならば、それらも被写体であり、被写体の領域に含まれる。なお、被写体の領域は、必ずしも繋がった一塊の領域とは限らず、複数の領域に分かれていることもある。
【０１３１】
「前記被写体領域抽出手段から得られる領域のみを重ねる」とは、その領域以外は何も画像を生成しないということではなく、それ以外の領域は基準画像などで埋めることを意味する。
【０１３２】
「特徴」とは、その領域の画像の持つ性質などであり、特徴量として数値化して表現できる性質が好ましい。特徴量としては、例えば、各色の画素値や、その色相、彩度、明度のほか、画像の模様や構造を表す統計量として、同時生起行列や差分統計量、ランレングス行列、パワースペクトル、それらの第２次統計量、高次統計量などがある。
【０１３３】
「第１の被写体あるいは第２の被写体の領域」とは、差分画像などから抽出される被写体の領域で、第１の被写体の領域かあるいは第２の被写体の領域かの区別がついていなくてもよい領域である。
【０１３４】
「合成しようとしている被写体」とは、合成処理の過程で求められる被写体のことではなく、実際に（カメラの前に）存在する被写体のことであり、第１被写体画像および第２被写体画像のどちらか一方に定めた基準画像に対して、ユーザーが合成しようとしている被写体のことである。但し、上述した通り、合成の処理の単位としてまとめて扱うものは一つの「被写体」なので、１つの被写体が複数の人物／物であることもありえる。
【０１３５】
「警告」には、表示手段などに文字や画像を表示して警告することも含まれるし、ランプなどによる光やスピーカなどによる音声、バイブレータなどによる振動など、ユーザーや被写体が感知できる方法ならば何でも含まれる。
【０１３６】
「通知」は、「警告」同様、ユーザーや被写体が感知できる方法ならば何でも含まれる。
【０１３７】
「フレーム（枠）」とは、画像全体の外形輪郭をさす。被写体が画像の外形輪郭に一部かかっているような場合、フレーム（枠）にかかる、とか、フレーム（枠）から切れる、などと表現することもある。
【０１３８】
図１は、本発明の実施の一形態に係る画像合成方法を実施する画像合成装置を示す構成図である。
【０１３９】
すなわち、画像合成装置の要部を、撮像手段１、第１被写体画像取得手段２、第２被写体画像取得手段３、背景補正量算出手段４、補正画像生成手段５、差分画像生成手段６、被写体領域抽出手段７、重なり検出手段８、重ね画像生成手段９、重ね画像表示手段１０、重なり警告手段１１、シャッターチャンス通知手段１２、自動シャッター手段１３の主要な機能ブロックに展開して示すことができる。
【０１４０】
図２は、図１の各手段１〜１３を具体的に実現する装置の構成例である。
【０１４１】
ＣＰＵ（ｃｅｎｔｒａｌ　ｐｒｏｃｅｓｓｉｎｇ　ｕｎｉｔ）７０は、背景補正量算出手段４、補正画像生成手段５、差分画像生成手段６、被写体領域抽出手段７、重なり検出手段８、重ね画像生成手段９、重ね画像表示手段１０、重なり警告手段１１、シャッターチャンス通知手段１２、自動シャッター手段１３として機能し、これら各手段４〜１３の処理手順が記述されたプログラムを主記憶７４、外部記憶７５、通信デバイス７７を介したネットワーク先などから得る。
【０１４２】
なお、撮像手段１、第１被写体画像取得手段２、第２被写体画像取得手段３、についても、撮像素子や、撮像素子が出力する画像データの各種処理に対する内部制御などの為にＣＰＵなどを使っている場合もある。
【０１４３】
また、ＣＰＵ７０は、ＣＰＵ７０を含めてバス７９を通じ相互に接続されたディスプレイ７１、撮像素子７２、タブレット７３、主記憶７４、外部記憶７５、シャッターボタン７６、通信デバイス７７、ランプ７８、スピーカ８０とデータのやりとりを行ないながら、処理を行なう。
【０１４４】
なお、データのやりとりは、バス７９を介して行う以外にも、通信ケーブルや無線通信装置などデータを送受信できるものを介して行ってもよい。また、各手段１〜１３の実現手段としては、ＣＰＵに限らず、ＤＳＰ（ｄｉｇｉｔａｌ　ｓｉｇｎａｌ　ｐｒｏｃｅｓｓｏｒ）や処理手順が回路として組み込まれているロジック回路などを用いることもできる。
【０１４５】
ディスプレイ７１は、通常はグラフィックカードなどと組み合わされて実現され、グラフィックカード上にＶＲＡＭ（ｖｉｄｅｏ　ｒａｎｄｏｍ　ａｃｃｅｓｓ　ｍｅｍｏｒｙ）を有し、ＶＲＡＭ上のデータを表示信号に変換して、モニターなどのディスプレイ（表示／出力媒体）に送り、ディスプレイは表示信号を画像として表示する。
【０１４６】
撮像素子７２は、風景等を撮影して画像信号を得るデバイスであり、通常、レンズなどの光学系部品と受光素子およびそれに付随する電子回路などからなる。ここでは、撮像素子７２は、出力信号をＡ／Ｄ変換器などを通して、デジタル画像データに変換する所まで含んでいるとし、バス７９を通じて、第１被写体画像取得手段２、第２被写体画像取得手段３などに撮影した画像の画像データを送るとする。撮像素子として一般的なデバイスとしては、例えば、ＣＣＤ（ｃｈａｒｇｅ　ｃｏｕｐｌｅｄ　ｄｅｖｉｃｅ）などがあるが、その他にも風景等を画像データとして得られるデバイスならば何でも良い。
【０１４７】
ユーザの指示を入力する手段として、タブレット７３、シャッターボタン７６などがあり、ユーザの指示はバス７９を介して各手段１〜１３に入力される。この他にも各種操作ボタン、マイクによる音声入力など、様々な入力手段が使用可能である。タブレット７３は、ペンとペン位置を検出する検出機器からなる。シャッターボタン７６は、メカニカルもしくは電子的なスイッチなどからなり、ユーザーがボタンを押すことで、通常は、撮像素子７２で撮影された画像を主記憶７４や外部記憶７５などに記録したりする一連の処理を開始させるスタート信号を生成する。
【０１４８】
主記憶７４は、通常はＤＲＡＭ（ｄｙｎａｍｉｃ　ｒａｎｄｏｍ　ａｃｃｅｓｓ　ｍｅｍｏｒｙ）やフラッシュメモリなどのメモリデバイスで構成される。なお、ＣＰＵ内部に含まれるメモリやレジスタなども一種の主記憶として解釈してもよい。
【０１４９】
外部記憶７５は、ＨＤＤ（ｈａｒｄ　ｄｉｓｋ　ｄｒｉｖｅ）やＰＣ（ｐｅｒｓｏｎａｌ　ｃｏｍｐｕｔｅｒ）　カードなどの装脱着可能な記憶手段である。あるいはＣＰＵ７０とネットワークを介して有線または無線で接続された他のネットワーク機器に取り付けられた主記憶や外部記憶を外部記憶７５として用いることもできる。
【０１５０】
通信デバイス７７は、ネットワークインターフェースカードなどにより実現され、無線や有線などにより接続された他のネットワーク機器とデータをやりとりする。
【０１５１】
スピーカ８０は、バス７９などを介して送られて来る音声データを音声信号として解釈し、音声として出力する。出力される音声は、単波長の単純な音の場合もあるし、音楽や人間の音声など複雑な場合もある。出力する音声が予め決まっている場合、送られて来るデータは音声信号ではなく、単なるオン、オフの動作制御信号だけという場合もある。
【０１５２】
次に、図１の各手段１〜１３を各手段間のデータ授受の観点から説明する。
【０１５３】
なお、各手段間でのデータのやりとりは、特に注釈なく「＊＊手段から得る」、「＊＊手段へ送る（渡す）」という表現をしている時は、主にバス７９を介してデータをやりとりしているとする。その際、直接各手段間でデータのやりとりをする場合もあれば、主記憶７４や外部記憶７５、通信デバイス７７を介したネットワークなどを間に挟んでデータをやりとりする場合もある。
【０１５４】
撮像手段１は主に撮像素子７２からなり、撮像した風景などを画像データとして第１被写体画像取得手段２、第２被写体画像取得手段３に送る。
【０１５５】
第１被写体画像取得手段２は、例えば撮像手段１、主記憶７４、および／または外部記憶７５などで構成され、第１被写体画像を、撮像手段１、主記憶７４、外部記憶７５、および／または通信デバイス７７を介したネットワーク先などから得る。なお、第１被写体画像取得手段２は、内部制御などの為にＣＰＵなどを含む場合もある。
【０１５６】
撮像手段１を使う場合は、第１の被写体が含まれる現在の風景（第１被写体画像）を撮像素子７２で撮影することになり、通常はシャッターボタン７６などを押したタイミングなどで撮影し、撮影された画像は、主記憶７４、外部記憶７５、および／または通信デバイス７７を介したネットワーク先などに記録される。
【０１５７】
一方、第１被写体画像取得手段２が、主記憶７４、外部記憶７５、および／または通信デバイス７７を介したネットワーク先などから第１被写体画像を得る場合は、既に撮影されて予め用意してある画像を読み出すことになる。なお、通信デバイス７７を介したネットワーク先などにカメラがあり、ネットワークを通して撮影する場合もある。
【０１５８】
第１被写体画像は、背景補正量算出手段４、補正画像生成手段５、差分画像生成手段６、被写体領域抽出手段７、および／または重ね画像生成手段９などに送られる。
【０１５９】
第２被写体画像取得手段３は、例えば撮像手段１、主記憶７４、および／または外部記憶７５などで構成され、第２の被写体が含まれる画像（以降、「第２被写体画像」と呼ぶ）を、撮像手段１、主記憶７４、外部記憶７５、および／または通信デバイス７７を介したネットワーク先などから得る。なお、第２被写体画像取得手段３は、内部制御などの為にＣＰＵなどを含む場合もある。画像の中身が違う以外は、画像の取得方法に関しては、第１被写体画像取得手段２と同様である。
【０１６０】
第２被写体画像は、背景補正量算出手段４、補正画像生成手段５、差分画像生成手段６、被写体領域抽出手段７、および／または重ね画像生成手段９などに送られる。
【０１６１】
背景補正量算出手段４としてのＣＰＵ７０は、第１被写体画像および第２被写体画像中の被写体以外の背景の相対的な移動量、回転量、拡大縮小率、歪補正量のいずれかもしくは任意の組み合わせからなる補正量を算出する。第１被写体画像および第２被写体画像の一方（基準画像）と他方の画像との間の補正量が最低限求まればよい。
【０１６２】
背景補正量算出手段４は、算出した補正量を補正画像生成手段５に送る。なお、予め算出しておいた補正量を背景補正量算出手段４が読み出す場合は、主記憶７４、外部記憶７５、および／または通信デバイス７７を介したネットワーク先などから補正量を読み出すことになる。
【０１６３】
補正画像生成手段５としてのＣＰＵ７０は、第１被写体画像、第２被写体画像のどちらかを基準画像とし、他方の画像を被写体以外の背景の部分が重なるように背景補正量算出手段４から得られる補正量で補正した画像（以下、補正画像と呼ぶ）を生成し、差分画像生成手段６および重ね画像生成手段９へ送る。なお、予め生成しておいた補正画像を補正画像生成手段５が読み出す場合は、主記憶７４、外部記憶７５、および／または通信デバイス７７を介したネットワーク先などから読み出すことになる。
【０１６４】
差分画像生成手段６としてのＣＰＵ７０は、補正画像生成手段５で決めた基準画像と補正画像生成手段５から得られる補正画像との間の差分画像を生成して、生成した差分画像を被写体領域抽出手段７および重ね画像生成手段９へ送る。
【０１６５】
被写体領域抽出手段７としてのＣＰＵ７０は、差分画像生成手段６から得られる差分画像から第１、第２の被写体の領域を抽出して、抽出した領域を重なり検出手段８および重ね画像生成手段９へ送る。
【０１６６】
重なり検出手段８としてのＣＰＵ７０は、被写体領域抽出手段７から得られる第１、第２の被写体の領域から第１、第２の被写体同士の重なりを検出して、重なりが存在するかどうかの情報と重なり領域の情報とを、重ね画像生成手段９、重なり警告手段１１、シャッターチャンス通知手段１２および自動シャッター手段１３に送る。
【０１６７】
重ね画像生成手段９としてのＣＰＵ７０は、第１被写体画像取得手段２から得られる第１被写体画像、第２被写体画像取得手段３から得られる第２被写体画像、補正画像生成手段５から得られる補正画像を、全部あるいは一部重ねた画像を生成し、生成した画像を重ね画像表示手段１０に送る。
【０１６８】
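The overlapping image creating means 9 may also create the regions that differ in the difference image obtained from the difference image creating means 6 as an image whose pixel values differ from the original pixel values.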
【０１６９】
また、重ね画像生成手段９は、被写体領域抽出手段７から得られる第１の被写体と第２の被写体の領域のみを基準画像などに重ねる場合もある。
【０１７０】
また、重ね画像生成手段９は、重なり検出手段８から得られる重なりの領域を元の画素値と異なる画素値の画像として生成する場合もある。
【０１７１】
重ね画像表示手段１０としてのＣＰＵ７０は、重ね画像生成手段９から得られる重ね画像をディスプレイ７１などに表示する。
【０１７２】
また、重ね画像表示手段１０は、重なり警告手段１１から得られる警告情報に応じて、警告表示を行う場合や、シャッターチャンス通知手段１２から得られるシャッターチャンス情報に応じて、シャッターチャンスである旨の表示を行う場合や、自動シャッター手段１３から得られるシャッター情報に応じて、自動シャッターが行われた旨の表示を行う場合もある。
【０１７３】
重なり警告手段１１としてのＣＰＵ７０は、重なり検出手段８から得られる重なり情報から、重なりが存在する場合、ユーザーあるいは被写体あるいは両方に重なりがあることを通知する。
【０１７４】
通知には、通知内容を文字などにして重ね画像表示手段１０に送ってディスプレイ７１に表示させたり、ランプ７８を使って光で知らせたり、スピーカ８０を使って音で知らせたりする等の種々の形態を採用できる。通知することができるのならば、それ以外のデバイスなどを使っても良い。
【０１７５】
シャッターチャンス通知手段１２としてのＣＰＵ７０は、重なり検出手段８から得られる重なり情報から、重なりが存在しない場合、ユーザーあるいは被写体あるいは両方に重なりが無いことを通知する。通知方法に関しては、重なり警告手段１１の説明と同様である。
【０１７６】
自動シャッター手段１３としてのＣＰＵ７０は、重なり検出手段８から得られる重なり情報から、重なりが存在しない場合、第２被写体画像取得手段３に対し、撮像手段１から得られる画像を主記憶７４や外部記憶７５などに記録するように自動的に指示を出す。
【０１７７】
ここでは、撮像手段１から得られる画像は、第１被写体画像または第２被写体画像として主記憶７４や外部記憶７５などに最終的に記録、保存され、合成されるような使い方を主に想定している。例えば、第１の被写体を先に撮影した後で、第２の被写体を撮影するとするとき、第１被写体画像を撮像手段１から得た場合には、得る毎に記録、保存するが、第２被写体画像は撮像手段１から得られても、すぐには保存されない。
【０１７８】
すなわち、撮像手段１から得た画像を第２被写体画像とする場合、その得られた第２被写体画像と保存されている第１被写体画像とを使って、重なり検出などの処理を行い、重ね画像表示手段１０などでの各種表示や警告、通知などの処理を行う、という一連の処理を繰り返す。そして、自動シャッター手段１３により記録、保存を指示された時、第２被写体画像が最終的に記録、保存される。
【０１７９】
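Alternatively, the second object image may be recorded and saved only when an instruction from the automatic shutter means 13 is present and the shutter button 143 (the shutter button in the external view of Fig. 3(a); shutter button 76 in the configuration of Fig. 2) is pressed by the user.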
【０１８０】
また、自動シャッター手段１３が、指示を出した結果、撮像画像が記録されたことをユーザーあるいは被写体あるいは両方に通知してもよい。通知方法に関しては、重なり警告手段１１の説明と同様である。
【０１８１】
また、自動シャッター手段１３としてのＣＰＵ７０は、記録の指示を行うだけでなく、重なり検出手段８から得られる重なり情報から、重なりが存在する場合、第２被写体画像取得手段３に撮像手段１から得られる画像を主記憶７４や外部記憶７５などに記録するのを禁止するように自動的に指示を出す。この動作は、上述した自動記録する場合の逆となる。
【０１８２】
この場合、自動シャッター手段１３による保存禁止の指示が存在する場合、シャッターボタン１４３がユーザーにより押されても、第２被写体画像は記録、保存されないことになる。
【０１８３】
図３（ａ）は、本発明に係る画像合成装置の背面からの外観例を示している。本体１４０上に表示部兼タブレット１４１、ランプ１４２、およびシャッターボタン１４３がある。
【０１８４】
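The display-and-tablet unit 141 corresponds to the input/output devices (display 71, tablet 73, and so on) and to the overlapping image display means 10. As in Fig. 3(a), the display-and-tablet unit 141 displays the composite image created by the overlapping image creating means 9 as well as notification and warning information from the overlap warning means 11, the shutter chance notifying means 12, the automatic shutter means 13, and so on. It is also used to display the various setting menus of the image compositing device, whose settings are changed with a finger or pen on the tablet.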
【０１８５】
なお、各種設定などの操作手段として、タブレットだけでなく、ボタン類などがこの他にあってもよい。また、表示部兼タブレット１４１は、本体１４０に対する回転や分離などの方法を用いて、撮影者だけでなく、被写体側でも見られるようになっていてもよい。
【０１８６】
ランプ１４２は、重なり警告手段１１、シャッターチャンス通知手段１２または自動シャッター手段１３などからの通知や警告に使われたりする。
【０１８７】
シャッターボタン１４３は、第１被写体画像取得手段２または第２被写体画像取得手段３が撮像手段１から撮影画像を取り込む／記録するタイミングを指示する為に主に使われる。
【０１８８】
また、この例では示していないが、内蔵スピーカなどを通知／警告手段として使ってもよい。
【０１８９】
図３（ｂ）は、本発明に係る画像合成装置の前面からの外観例を示している。本体１４０前面にレンズ部１４４が存在する。レンズ部１４４は、撮像手段１の一部である。なお、図３（ｂ）の例では示していないが、前面に被写体に情報（前記の通知や警告）を伝えられるように、表示部やランプ、スピーカなどがあってもよい。
【０１９０】
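Fig. 4 is an explanatory diagram of an example data structure for image data. Image data is a two-dimensional array of pixel data, and each "pixel" has a position and a pixel value as attributes. Here the pixel value consists of R, G, and B values corresponding to the three primary colors of light (red, green, blue). Each horizontally adjacent set of R, G, and B in Fig. 4 forms the data of one pixel. For monochrome data that carries only luminance and no color information, one luminance value is held per pixel in place of R, G, and B.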
【０１９１】
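Positions are expressed in X-Y coordinates (x, y). In Fig. 4 the origin is at the top left, the +X direction points right, and the +Y direction points down.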
【０１９２】
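In the following, the pixel at position (x, y) is written "P(x, y)"; the pixel value of pixel P(x, y) may also be written "pixel value P(x, y)" or simply "P(x, y)". When the pixel value is split into R, G, and B, calculations are carried out per color, but unless the processing is specific to color, the same calculation is simply applied to each of the R, G, and B values. The following description therefore uses "pixel value P(x, y)" as the common way of describing calculations.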
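The data structure and notation above might be modelled as follows. This is only a sketch: storing rows as lists of (R, G, B) tuples and the helper names `make_image`, `P`, and `per_channel` are assumptions for illustration.

```python
def make_image(width, height, fill=(0, 0, 0)):
    """2D array of pixels; rows run top to bottom (+Y down),
    columns left to right (+X right), origin at the top left."""
    return [[fill for _ in range(width)] for _ in range(height)]

def P(image, x, y):
    """Pixel value P(x, y); note the row index y comes first in storage."""
    return image[y][x]

def per_channel(op, p, q):
    """Apply the same calculation independently to R, G and B."""
    return tuple(op(a, b) for a, b in zip(p, q))
```

For example, `per_channel(lambda a, b: abs(a - b), P(img1, x, y), P(img2, x, y))` would give the per-color absolute difference at one position.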
【０１９３】
図５は、本発明の実施の一形態に係る画像合成方法の一例を示すフローチャート図である。
【０１９４】
まずステップＳ１（以下、「ステップＳ」を「Ｓ」と略記する。）では、第１被写体画像取得手段２が、第１被写体画像を取得し、連結点Ｐ２０（以下、「連結点Ｐ」を「Ｐ」と略記する）を経てＳ２へ処理が進む。第１被写体画像は、撮像手段１を使って撮影してもよいし、予め主記憶７４、外部記憶７５または通信デバイス７７を介したネットワーク先などに用意してある画像を読み出してもよい。
【０１９５】
Ｓ２では、第２被写体画像取得手段３が、上記第１被写体画像と少なくとも一部共通する背景部分を持つ第２被写体画像を取得し、Ｐ３０を経てＳ３へ処理が進む。ここでの処理は後で図１３を用いて詳しく説明するが、第２被写体画像の取得方法自体は、第１被写体画像と同様である。なお、Ｓ１とＳ２の処理の順番は逆でも良いが、後に撮影する方を基準画像とすると、撮影時の合成画像の表示に違和感が少ない効果が出てくる。
【０１９６】
Ｓ３では、背景補正量算出手段４が、第１被写体画像および第２被写体画像から背景補正量を算出して、Ｐ４０を経てＳ４へ処理が進む。第１被写体画像、第２被写体画像はそれぞれ、第１被写体画像取得手段２（Ｓ１）、第２被写体画像取得手段３（Ｓ２）から得られる。
【０１９７】
なお、以降、第１被写体画像、第２被写体画像を使う際、特にことわりの無い限り、これらの画像の取得元の手段／ステップはＳ３での取得元の手段／ステップと同じなので、以降はこれらの画像の取得元の手段／ステップの説明は省く。
【０１９８】
Ｓ３の処理の詳細は後で図１４を用いて説明する。
【０１９９】
Ｓ４では、補正画像生成手段５が、背景補正量算出手段４から得た背景補正量を使って基準画像以外の第１被写体画像または第２被写体画像を補正し、差分画像生成手段６が、補正画像生成手段５で補正された画像と基準画像との間の差分画像を生成して、Ｐ５０を経てＳ５へ処理が進む。Ｓ４の処理の詳細は後で図１６を用いて説明する。
【０２００】
Ｓ５では、被写体領域抽出手段７が、差分画像生成手段６（Ｓ４）から得られる差分画像から、第１、第２の被写体の領域（以降、第１被写体領域、第２被写体領域と呼ぶ）を抽出し、重なり検出手段８が被写体同士の重なりを検出して、Ｐ６０を経てＳ６へ処理が進む。Ｓ５の処理の詳細は後で図１８を用いて説明する。
【０２０１】
Ｓ６では、重なり警告手段１１、シャッターチャンス通知手段１２、自動シャッター手段１３のうちの一つ以上の手段が、重なり検出手段８（Ｓ５）から得られる重なりに関する情報に応じて様々な処理を行い、Ｐ７０を経てＳ７へ処理が進む。Ｓ６の処理の詳細は後で図２０から図２２を用いて説明する。
【０２０２】
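In S7, the overlapping image creating means 9 creates an "overlapping image" that overlaps the two images, using the first object image, the second object image, the image obtained by correcting whichever of them is not the reference image with the corrected image creating means 5 (S4), the first and second object regions obtained from the object region extracting means 7 (S5), information on the overlap of the first and second objects obtained from the overlap detecting means 8 (S5), and so on; processing then proceeds to S8 via P80. The details of S7 are described later with reference to Fig. 23.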
【０２０３】
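In S8, the overlapping image display means 10 displays the overlapping image obtained from the overlapping image creating means 9 (S7) on the display 71 or the like, and processing ends.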
【０２０４】
これらＳ１からＳ７の処理で、第１被写体画像、第２被写体画像を使って、第１の被写体と第２の被写体を１枚の画像上に合成し、また被写体同士の重なり具合に応じて様々な処理が行えるようになる。
【０２０５】
詳細な処理やその効果については、後で詳しく説明するとして、まず簡単な例で処理の概要を説明する。
【０２０６】
図６（ａ）はＳ１で得る第１被写体画像の例である。背景の手前、左側に第１の被写体たる人物（１）が立っている。分かりやすいように人物（１）の顔部分には「１」と記しておく。なお、今後、特にことわりなく「右側」「左側」といった場合、図上での「右側」「左側」という意味だとする。この方向は、撮影者／カメラから見た方向だと思えばよい。
【０２０７】
図７（ａ）はＳ２で得る第２被写体画像の例である。背景の手前、右側に第２の被写体たる人物（２）が立っている。分かりやすいように人物（２）の顔部分には「２」と記しておく。
【０２０８】
図７（ｃ）は、図６（ａ）の第１被写体画像と図７（ａ）の第２被写体画像との間で背景補正量を求め、第１被写体画像を基準画像として、その背景補正量を用いて第２被写体画像を補正した画像である。
【０２０９】
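The corrected image is the range enclosed by the solid frame; to show how the correction was applied, the ranges of the original second object image of Fig. 7(a) and the first object image of Fig. 6(a) are indicated in Fig. 7(c) by dotted frames. The background of Fig. 7(a) was obtained by photographing scenery slightly to the upper left of the background of Fig. 6(a). To correct the second object image of Fig. 7(a) so that it overlaps the background of the first object image of Fig. 6(a), scenery slightly to the lower right of Fig. 7(a) must therefore be selected. Accordingly, Fig. 7(c) is corrected to show scenery slightly to the lower right of Fig. 7(a). The range of the original Fig. 7(a) is indicated by a dotted line. Since no image exists for the scenery to the lower right of Fig. 7(a), in Fig. 7(c) the part extending to the right of the right-hand dotted line and the part extending below the bottom dotted line are blank. Conversely, the upper-left part of Fig. 7(a) is discarded.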
【０２１０】
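No scaling or rotation is applied here; the correction is a pure translation. In other words, the background correction quantity obtained in S3 is, in this case, the translation indicated by the offset between the solid frame and the dotted frame.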
【０２１１】
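Fig. 8(a) is the difference image created in S4 between the first object image of Fig. 6(a) and the corrected second object image of Fig. 7(c). In the difference image, parts where the difference is 0 (that is, where the backgrounds coincide) are shown as black regions. The parts with a difference are the insides of the object regions and noise; in the object regions, the background image and the object image are superimposed, producing an odd-looking picture. (Regions where, because of the correction, only one of the images has pixels, such as the inverted-L-shaped region between the solid and dotted lines at the lower right of Fig. 7(c), are excluded from the difference calculation and given a difference of 0.)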
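The translation-only correction and the exclusion of blank areas from the difference can be sketched as below. The validity mask and the names `translate` and `masked_difference` are assumptions introduced for this illustration; images are lists of rows of grayscale values.

```python
def translate(image, dx, dy):
    """Shift image content by (dx, dy) pixels. Returns the shifted image
    and a validity mask that is False where no source pixel exists
    (the blank margin left behind by the shift)."""
    h, w = len(image), len(image[0])
    shifted = [[0] * w for _ in range(h)]
    valid = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = x - dx, y - dy          # source position in the original
            if 0 <= sx < w and 0 <= sy < h:
                shifted[y][x] = image[sy][sx]
                valid[y][x] = True
    return shifted, valid

def masked_difference(ref, cor, valid):
    """Absolute difference, forced to 0 where the corrected image is blank."""
    return [[abs(r - c) if v else 0 for r, c, v in zip(rr, cr, vr)]
            for rr, cr, vr in zip(ref, cor, valid)]
```

Selecting scenery to the lower right, as in Fig. 7(c), corresponds to negative `dx` and `dy` here, which leaves the blank margin on the right and bottom edges.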
【０２１２】
Ｓ６の重なりに関する処理は様々な処理方法があるが、この例では重なりは検出されないので、ここでは説明を簡単にする為に特に処理は行わないことにしておく。
【０２１３】
図９（ａ）は、後述する図１９（ｄ）に示す第２被写体領域に相当する部分の画像を、図６（ａ）の第１被写体画像（基準画像）に重ねて（上書きして）生成した画像である。図６（ａ）と図７（ａ）の別々に写っていた被写体が同じ画像上に重なりなく並んでいる。重ね方に関しても、様々な処理方法があるので、後で詳しく説明する。図９（ａ）の画像が重ね画像表示手段１０上に合成画像として表示される。
【０２１４】
これによって、別々に撮影された被写体を同時に撮影したかのような画像を合成できるようになる効果が出てくる。
【０２１５】
以上の説明により、処理の概要を一通り説明したが、Ｓ５で被写体領域同士で重なりがある場合のＳ６の処理例の概要について説明していないので、以降、簡単に触れておく。
【０２１６】
図１０は、図７（ａ）とは別の第２被写体画像の例である。図７（ａ）と比べると、第２の被写体が同一の背景に対して少し左に位置している。なお、第１被写体画像は図６（ａ）と同じものを使うとする。
【０２１７】
図１１（ｃ）は、第１被写体領域と第２被写体領域との合わさった領域を示している。図中の領域２０２が第１被写体領域と第２被写体領域とで構成されている。ここでは、同じ背景に対する第１、第２の被写体の各位置の関係で、第１被写体領域と第２被写体領域とに重なりが生じたため、領域２０２が結合された領域として示されている。
【０２１８】
図１２は、Ｓ６で重なりがある場合にＳ７で生成される重ね画像の一例を示した図である。領域２０２は、第１被写体領域と第２被写体領域とが結合された１つの領域として扱われるので、一括して半透明に表示されている。また、重ね画像に上書きして、第１の被写体と第２の被写体が重なっていることを示すメッセージが表示されている。
【０２１９】
この重ね画像（含むメッセージ）を表示することで、第１の被写体と第２の被写体が重なっていることが、ユーザーや被写体に分かりやすくなるという効果が出てくる。
【０２２０】
以上の説明により、Ｓ５で被写体領域同士で重なりがある場合のＳ６の処理例の概要について説明した。
【０２２１】
なお、これを典型的な利用シーン例で考えると、まず図６（ａ）のような第１の被写体をカメラ（画像合成装置）で撮影し、記録する。次に同じ背景で図７（ａ）のような第２の被写体を撮影する。
【０２２２】
なお、第１の被写体と第２の被写体の撮影は、第１の被写体と第２の被写体が交互に行うことで、第３者がいなくても二人だけでも撮影が可能である。同じ背景で撮影する為にはカメラは動かさない方が良いが、背景にあわせて補正するので、三脚などで固定までしなくても、手で大体同じ位置で同じ方向を向いて撮影すれば良い。なお、被写体の位置関係は図６（ａ）、図７（ａ）のような左右だけでなく、任意の位置関係でよい。
【０２２３】
そして、２つの画像を撮影した後、Ｓ３からＳ７の処理を行い、図９（ａ）や図１２のような表示（や後で説明する警告／通知など）を行う。
【０２２４】
もし、被写体が重なっているなどの表示や通知がある場合、再度、Ｓ１からＳ７の処理を繰り返してもよい。すなわち第１被写体画像、第２被写体画像を撮影し、重ね画像を生成、表示などする。表示される処理結果に満足がいくまで何度でも繰り返せば良い。
【０２２５】
しかし、第２の被写体が位置を移動する場合などは、第１の被写体は必ずしも撮りなおさなくてもよく、第２の被写体だけ撮り直せば済むこともある。その場合は、Ｓ２からＳ７を繰り返せばよい。
【０２２６】
この場合、Ｓ２の第２被写体画像取得からＳ７の表示までを自動的に繰り返せば、すなわち第２被写体画像取得をシャッターボタンを押さずに動画を撮影するように連続的に取得し、処理、表示も含めて繰り返すようにすれば、カメラや第２の被写体の移動などに追従してリアルタイムに処理結果が確認できることになる。従って、第２の被写体の移動位置が適切かどうか（重なっていないかどうか）をリアルタイムに知ることができ、重なりが無い合成結果を得る為の第２の被写体の撮影が容易になるという利点が出てくる。
【０２２７】
なお、この繰り返し処理を開始するには、メニューなどから処理開始を選択するなどして、専用モードに入る必要がある。適切な移動位置になったらシャッターボタンを押すことで、第２被写体画像を決定して（記録し）、この繰り返し処理／専用モードを終了させればよい（終了といっても、最後の合成結果を得るＳ７までは処理を続けてもよい）。
【０２２８】
また、第１被写体画像が良くない場合、例えば、背景の真中に第１の被写体が位置し、第２の被写体をどう配置しても第１の被写体に重なってしまうか、重ならないようにすると第２の被写体が重ね画像からフレームアウトしてしまうような場合、Ｓ１の第１被写体画像の取得からやり直しても良い。
【０２２９】
以降では、上で説明した処理の詳細を説明する。
【０２３０】
図１３は、図５のＳ２の処理、すなわち第２被写体画像を取得する処理の一方法を説明するフローチャート図である。
【０２３１】
Ｐ２０を経たＳ２−１では、第２被写体画像取得手段３が、第２被写体画像を取得し、Ｓ２−２へ処理が進む。ここでの処理は、図５のＳ１の第１被写体画像の取得と取得方法自体は同様である。
【０２３２】
Ｓ２−２では、同手段３が、自動シャッター手段１３から画像を記録するように指示があるかどうかを判断し、指示があればＳ２−３へ処理が進み、指示がなければＰ３０へ抜ける。
【０２３３】
Ｓ２−３では、同手段３が、Ｓ２−１で取得した第２被写体画像を主記憶７４、外部記憶７５などに記録して、Ｐ３０へ処理が抜ける。
【０２３４】
以上のＳ２−１からＳ２−３の処理で、図５のＳ２の処理が行われる。
【０２３５】
なお、自動シャッター手段１３以外であっても、撮影者によって手動でシャッターボタンが押されたり、セルフタイマーでシャッターが切られた場合などにも撮影画像を記録してもよいが、それはＳ１、Ｓ２−１の処理に含まれるとする。
【０２３６】
図１４は、図５のＳ３の処理、すなわち背景補正量を算出する処理の一方法を説明するフローチャート図である。
【０２３７】
背景補正量を算出する方法は色々考えられるが、ここではブロックマッチングを使った簡易的な手法について説明する。
【０２３８】
Ｐ３０を経たＳ３−１では、背景補正量算出手段４が、第１被写体画像をブロック領域に分割する。図６（ｂ）は図６（ａ）の第１被写体画像をブロック領域に分割した状態を説明する説明図である。点線で区切られた矩形が各ブロック領域である。左上のブロックを「Ｂ（１，１）」とし、その右が「Ｂ（１，２）」、下が「Ｂ（２，１）」というように表現することにする。図６（ｂ）ではスペースの都合上、例えばＢ（１，１）のブロックではブロックの左上に「１１」と記している。
【０２３９】
Ｓ３−２では、第１被写体画像のブロックが、第２被写体画像上でマッチングする位置を、同手段４が求めて、Ｓ３−３へ処理が進む。「（ブロック）マッチング」とは、この場合、第１被写体画像の各ブロックと最もブロック内の画像が似ているブロック領域を第２被写体画像上で探す処理である。
【０２４０】
説明の為、ブロックを定義する画像（ここでは第１被写体画像）を「参照画像」と呼び、似ているブロックを探す相手の画像（ここでは第２被写体画像）を「探索画像」と呼び、参照画像上のブロックを「参照ブロック」、探索画像上のブロックを「探索ブロック」と呼ぶことにする。参照画像上の任意の点（ｘ、ｙ）の画素値をＰｒ（ｘ、ｙ）、探索画像上の任意の点（ｘ、ｙ）の画素値をＰｓ（ｘ、ｙ）とする。
【０２４１】
なお、背景補正量は相対的なものなので、上記とは逆に、参照画像と探索画像を、第２被写体画像と第１被写体画像として良い。
【０２４２】
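Suppose now that each reference block is a square with a side of m pixels. The position of the top-left pixel of reference block B(i, j) is then
(m × (i - 1), m × (j - 1))
and the value of the pixel offset by (dx, dy) pixels from the top-left of reference block B(i, j) is
Pr(m × (i - 1) + dx, m × (j - 1) + dy).
【０２４３】
With the top-left position of the search block denoted (xs, ys), the similarity S(xs, ys) between reference block B(i, j) and the search block is given by the following two equations.
【０２４４】
D(xs, ys; dx, dy) = |Ps(xs + dx, ys + dy) - Pr(m × (i - 1) + dx, m × (j - 1) + dy)|
S(xs, ys) = Σ_{dx=0}^{m-1} Σ_{dy=0}^{m-1} D(xs, ys; dx, dy)
D(xs, ys; dx, dy) is the absolute value of the difference between the pixel values located (dx, dy) from the top-left of the reference block and of the search block, respectively. S(xs, ys) is the sum of these absolute differences over all pixels in the block.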
【０２４５】
もし、参照ブロックと探索ブロックが全く同じ画像である（対応する画素値が全て等しい）場合、Ｓ（ｘｓ、ｙｓ）は０となる。似ていない部分が増えると、すなわち画素値の差が大きくなると、Ｓ（ｘｓ、ｙｓ）は大きな値となっていく。従って、Ｓ（ｘｓ、ｙｓ）が小さいほど似たブロックということになる。
【０２４６】
Ｓ（ｘｓ、ｙｓ）は、探索ブロックの左上位置を（ｘｓ、ｙｓ）とした時の類似度なので、（ｘｓ、ｙｓ）を探索画像上で変えれば、それぞれの場所での類似度が得られる。全ての類似度の中で最小となる類似度の位置（ｘｓ、ｙｓ）をマッチングした位置とすればよい。マッチングした位置の探索ブロックを「マッチングブロック」と呼ぶ。
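The matching described above can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: images are assumed to be grayscale lists of rows, the reference block is addressed by its top-left pixel (bx, by) directly, the search is an exhaustive scan of the whole search image, and the names `sad` and `find_match` are hypothetical.

```python
def sad(ref, search, bx, by, xs, ys, m):
    """Similarity S: sum of absolute differences between the m x m reference
    block with top-left pixel (bx, by) and the search block with top-left
    pixel (xs, ys). Smaller means more similar; 0 means identical."""
    total = 0
    for dy in range(m):
        for dx in range(m):
            total += abs(search[ys + dy][xs + dx] - ref[by + dy][bx + dx])
    return total


def find_match(ref, search, bx, by, m):
    """Scan every candidate position in the search image and return
    ((xs, ys), S) for the position with the smallest S."""
    height, width = len(search), len(search[0])
    best_pos, best_score = None, None
    for ys in range(height - m + 1):
        for xs in range(width - m + 1):
            score = sad(ref, search, bx, by, xs, ys, m)
            if best_score is None or score < best_score:
                best_pos, best_score = (xs, ys), score
    return best_pos, best_score


# Toy images: the same small pattern, shifted by (+1, +1) in the search image.
ref = [[0] * 6 for _ in range(6)]
search = [[0] * 6 for _ in range(6)]
ref[1][1], ref[1][2], ref[2][1] = 200, 150, 100
search[2][2], search[2][3], search[3][2] = 200, 150, 100

match_pos, match_score = find_match(ref, search, 1, 1, 2)
```

In practice the scan would normally be restricted to a window around the reference block position, since the two shots are taken from roughly the same place.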
【０２４７】
図１５は、このマッチングの様子を説明した図だが、図１５（ａ）の画像を参照画像、図１５（ｂ）の画像を探索画像とし、画像の中身としてはカギ括弧型の線がそれぞれ少し位置がずれて存在しているとする。参照画像中の参照ブロック１００は、カギ括弧型の線のちょうど角の部分に位置しているとする。探索画像中の探索ブロックとして、探索ブロック１０１、１０２、１０３があったとする。参照ブロック１００と探索ブロック１０１、参照ブロック１００と探索ブロック１０２、参照ブロック１００と探索ブロック１０３でそれぞれ類似度を計算すると、探索ブロック１０１が最も小さな値となるので、探索ブロック１０１を参照ブロック１００に対するマッチングブロックとすればよい。
【０２４８】
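The above described matching for a single reference block B(i, j); a matching block can be found in the same way for every reference block. Here, a matching block is searched for in the second subject image for each of the 42 reference blocks in FIG. 6(b).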
【０２４９】
なお、マッチングブロックの類似度の求め方については、ここでは各画素値の差分の絶対値を使ったが、それ以外にも様々な方法があり、いずれの手法を使っても良い。
【０２５０】
例えば、相関係数を使う方法や周波数成分を使う方法などもあるし、各種高速化手法などもある。また、参照ブロックの位置や大きさなどの設定の仕方も色々考えられるが、ブロックマッチングの細かな改良方法は本発明の主旨ではないのでここでは省略する。
【０２５１】
なお、参照ブロックの大きさについては、あまり小さくしすぎるとブロック内にうまく特徴が捉えきれずマッチング結果の精度が悪くなるが、逆に大きくしすぎると被写体や画像のフレーム枠を含んでしまいマッチング結果の精度が悪くなったり、回転、拡大縮小などの変化に弱くなってしまうので、適当な大きさにすることが望ましい。
【０２５２】
次に、Ｓ３−３で、同手段４が、Ｓ３−２で求めたマッチングブロックの中から背景部分に相当する探索ブロックだけを抜き出して、Ｓ３−４へ処理が進む。
【０２５３】
Ｓ３−２で求めたマッチングブロックは、最も差分が少ない探索ブロックを選んだだけなので、同じ画像であることが保証されてはおらず、たまたま何かの模様などが似ているだけの場合もある。また、そもそも第１の被写体の為、参照ブロック自体が背景部分でなかったり、参照ブロックは背景部分だが、第２の被写体の為、参照ブロックに相当する画像部分が第２被写体画像上に存在しない場合もあるので、その場合はいいかげんな場所にマッチングブロックが設定されていることになる。
【０２５４】
そこで各マッチングブロックから、参照ブロックと同じ画像部分ではないと判断されるものを取り除くことが必要となる。残ったマッチングブロックは参照ブロックと同じ画像部分であると判断されたものなので、結果的に第１や第２の被写体を除いた背景部分だけが残ることになる。
【０２５５】
マッチングブロックの選別手法は色々考えられるが、ここでは最も単純な方法として、類似度Ｓ（ｘｓ、ｙｓ）を所定の閾値で判断することにする。すなわち、各マッチングブロックのＳ（ｘｓ、ｙｓ）が閾値を超えていたら、そのマッチングは不正確であるとして取り除くという手法である。Ｓ（ｘｓ、ｙｓ）は、ブロックの大きさに影響されるので、閾値はブロックの大きさを考慮して決めるのが望ましい。
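A minimal sketch of this selection step is shown below. The dictionary layout, the function name `filter_matches`, and the per-pixel threshold of 10 are assumptions made for illustration; the text only requires that the threshold take the block size into account.

```python
def filter_matches(matches, m, per_pixel_threshold=10):
    """Discard matching blocks whose similarity S exceeds the threshold.

    matches: list of dicts with keys 'block' (reference block index),
             'pos' (top-left of the matching block) and 'score' (S).
    S sums over m*m pixels, so the threshold is scaled by the block area.
    """
    threshold = per_pixel_threshold * m * m
    return [match for match in matches if match["score"] <= threshold]


matches = [
    {"block": (1, 5), "pos": (40, 3), "score": 120},    # good background match
    {"block": (3, 3), "pos": (17, 22), "score": 5000},  # covered by a subject
    {"block": (6, 1), "pos": (2, 45), "score": 640},    # exactly at threshold
]
kept = filter_matches(matches, m=8)
```

If `kept` comes back empty, the two images share no usable background and should be retaken, as discussed for S3-4.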
【０２５６】
図７（ｂ）は、図７（ａ）の第２被写体画像のＳ３−２のマッチング結果から、不正確なマッチングブロックを取り除いた結果である。正しいと判断されたマッチングブロックには、対応する参照ブロックと同じ番号が振ってある。これにより、被写体部分が含まれない、あるいはほとんど含まれない背景部分のマッチングブロックだけが残っているのが分かる。
【０２５７】
しかも、残ったマッチングブロックは、第１被写体画像と第２被写体画像とに共通して写り込んだ同一の背景部分であると判断できる。もし、第１被写体画像と第２被写体画像とが共通する背景部分を全く持っていないとすると、Ｓ３−３の処理の結果、残るマッチングブロックは０となる。
【０２５８】
Ｓ３−４では、同手段４が、Ｓ３−３で得た背景部分のマッチングブロックから、第２被写体画像の背景補正量を求めて、Ｐ４０へ処理が抜ける。
【０２５９】
背景補正量として、例えば回転量θ、拡大縮小量Ｒ、および／または平行移動量（Ｌｘ、Ｌｙ）を求めるのだが、計算方法は色々考えられる。ここでは２つのブロックを使った最も簡単な方法について説明する。
【０２６０】
なお、回転量、拡大縮小量、平行移動量以外の歪補正量は、よほど撮影時にカメラを動かすなどしない限り、使わなくても背景部分がほぼ重なり、差分画像でノイズが充分少ない補正ができる場合が多い。回転量、拡大縮小量、平行移動量以外の歪補正量を得るには、最低でも３点あるいは４点以上ブロックを使うことが必要であり、透視変換を考慮した計算が必要となるが、パノラマ画像の合成などでも使われている公知の手法（例えば、「共立出版：ｂｉｔ１９９４年１１月号別冊『コンピュータ・サイエンス』」のＰ９０など）なので、この処理の詳細についてはここでは省略する。
【０２６１】
まず、できるだけ互いの距離が離れているマッチングブロックを２つ選ぶ。なお、Ｓ３−３で残ったマッチングブロックが１つしか無いときは、以降の拡大縮小率、回転量を求める処理は省いて、対応する参照ブロックの位置との差分を平行移動量として求めればよい。Ｓ３−３で残ったマッチングブロックが１つも無かったら、第１、第２被写体画像などを撮影し直した方が良いと思われるので、その旨の警告を出すなどするとよい。
【０２６２】
選び方は色々考えられるが、例えば、
１）マッチングブロック中の任意の２つを選び、その二つのブロックの中心位置間の距離を計算する、
２）１）の計算を全てのマッチングブロックの組み合わせで行う、
３）２）の中で最も距離が大きい組み合わせを背景補正量の算出に使う２つのブロックとして選ぶ、という方法が考えられる。
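Steps 1) to 3) can be sketched as follows, assuming the block centers are given as (x, y) tuples. The function name `farthest_pair` is hypothetical.

```python
import math
from itertools import combinations


def farthest_pair(centers):
    """Return the two block centers with the largest mutual distance,
    or None when fewer than two matching blocks remain."""
    best_pair, best_distance = None, -1.0
    for a, b in combinations(centers, 2):          # every pair, step 2)
        distance = math.hypot(b[0] - a[0], b[1] - a[1])  # step 1)
        if distance > best_distance:               # step 3)
            best_pair, best_distance = (a, b), distance
    return best_pair


centers = [(10, 10), (50, 10), (10, 90), (60, 95)]
pair = farthest_pair(centers)
```

The exhaustive pairing is quadratic in the number of blocks, which is negligible for a few dozen blocks such as the 42 in FIG. 6(b).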
【０２６３】
ここで、上記３）として挙げたように、互いの距離が最も離れているマッチングブロックを使う利点としては、拡大縮小率や回転量などを求める際の精度が良くなることがあげられる。マッチングブロックの位置は画素単位となるので、精度も画素単位となってしまう。例えば、横に５０画素離れた位置で上に１画素分ずれた時の角度は、横に５画素離れた位置で上に０．１画素分ずれた時の角度と同じになる。しかし、０．１画素のずれはマッチングでは検出できない。従って、できるだけ離れたマッチングブロックを使った方が良い。
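The arithmetic behind this argument can be checked with a few lines. The helper name `angle_error_deg` is hypothetical; it simply converts a perpendicular position error at a given block separation into an angle.

```python
import math


def angle_error_deg(separation_px, position_error_px=1.0):
    """Angle in degrees produced by a perpendicular position error
    observed at the given separation between two blocks."""
    return math.degrees(math.atan2(position_error_px, separation_px))


err_far = angle_error_deg(50)   # 1-pixel error, blocks 50 pixels apart
err_near = angle_error_deg(5)   # 1-pixel error, blocks 5 pixels apart
```

With a one-pixel matching resolution, blocks 50 pixels apart resolve the rotation to roughly 1.1 degrees, while blocks 5 pixels apart resolve it only to roughly 11.3 degrees.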
【０２６４】
２つのブロックを使っているのは、単に計算が簡単だからである。もっと多くのブロックを使って平均的な拡大縮小率や回転量などを求めるようにすると、誤差が減少する利点が出てくる。
【０２６５】
例えば図７（ｂ）の例では、互いの距離が最も離れている２つのマッチングブロックは、ブロック１５、６１の組み合わせとなる。
【０２６６】
次に、選んだ２つのマッチングブロックの中心位置を、探索画像上の座標で表した（ｘ１’、ｙ１’）、（ｘ２’、ｙ２’）、それに対応する参照ブロックの中心位置を参照画像上の座標で表した（ｘ１、ｙ１）、（ｘ２、ｙ２）とする。
【０２６７】
まず、拡大縮小率について求める。
【０２６８】
マッチングブロックの中心間の距離Ｌｍは、
Ｌｍ＝（（ｘ２’―　ｘ１’）×（ｘ２’―　ｘ１’）＋（ｙ２’―　ｙ１’）×（ｙ２’―　ｙ１’））^１／２
参照ブロックの中心間の距離Ｌｒは、
Ｌｒ＝（（ｘ２―　ｘ１）×（ｘ２―　ｘ１）＋（ｙ２―　ｙ１）×（ｙ２―　ｙ１））^１／２
となり、拡大縮小率Ｒは、
Ｒ＝Ｌｒ／Ｌｍ
で求められる。
【０２６９】
次に回転量について求める。
【０２７０】
マッチングブロックの中心を通る直線の傾きθｍは、
θｍ＝ａｒｃｔａｎ（（ｙ２’―　ｙ１’）／（ｘ２’―　ｘ１’））
（但し、ｘ２’＝　ｘ１’の時はθｍ＝π／２）
参照ブロックの中心を通る直線の傾きθｒは、
θｒ＝ａｒｃｔａｎ（（ｙ２―　ｙ１）／（ｘ２―　ｘ１））
（但し、ｘ２＝　ｘ１の時はθｒ＝π／２）
で求められる。なお、ａｒｃｔａｎは、ｔａｎの逆関数とする。
【０２７１】
これより、回転量θは、
θ＝θｒ―θｍ
で求められる。
【０２７２】
最後に平行移動量であるが、これは対応するブロック同士の中心位置が等しくなればよいので、例えば、（ｘ１’、ｙ１’）と（ｘ１、ｙ１）が等しくなるようにすると、平行移動量（Ｌｘ、Ｌｙ）は、
（Ｌｘ、Ｌｙ）＝（ｘ１’―　ｘ１、ｙ１’―　ｙ１）
となる。回転量と拡大縮小量は、どこを中心にしても良いので、ここでは平行移動で一致する点、すなわち対応するブロックの中心を回転中心、拡大縮小中心とすることにする。
【０２７３】
従って、探索画像中の任意の点（ｘ’，ｙ’）を補正された点（ｘ”，ｙ”）に変換する変換式は、
ｘ”＝Ｒ×（ｃｏｓθ×（ｘ’−ｘ１’）−ｓｉｎθ×（ｙ’−ｙ１’））＋ｘ１
ｙ”＝Ｒ×（ｓｉｎθ×（ｘ’−ｘ１’）＋ｃｏｓθ×（ｙ’−ｙ１’））＋ｙ１
となる。回転量、拡大縮小量、平行移動量と述べたが、正確にはここでは、θ、Ｒ，（ｘ１　、ｙ１　）、（ｘ１’、ｙ１’）のパラメータを求めることになる。なお、補正量／変換式の表し方は、これに限定される訳ではなく、その他の表し方でもよい。
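The two-block calculation of R and θ and the resulting conversion can be sketched as below. The function names are hypothetical, and `math.atan2` is used in place of arctan of the slope so that the vertical case (x2 = x1) needs no special handling; this is a substitution, not the formula as written above.

```python
import math


def estimate_correction(p1s, p2s, p1r, p2r):
    """Estimate scale R and rotation theta from two block correspondences.

    p1s, p2s: centers of the matching blocks in the search image.
    p1r, p2r: centers of the corresponding reference blocks.
    """
    lm = math.hypot(p2s[0] - p1s[0], p2s[1] - p1s[1])  # distance Lm
    lr = math.hypot(p2r[0] - p1r[0], p2r[1] - p1r[1])  # distance Lr
    scale = lr / lm                                     # R = Lr / Lm
    theta_m = math.atan2(p2s[1] - p1s[1], p2s[0] - p1s[0])
    theta_r = math.atan2(p2r[1] - p1r[1], p2r[0] - p1r[0])
    return scale, theta_r - theta_m                     # theta = theta_r - theta_m


def search_to_reference(pt, scale, theta, p1s, p1r):
    """Conversion Fsr: rotate and scale about the first block center,
    then place that center on its reference position."""
    ax, ay = pt[0] - p1s[0], pt[1] - p1s[1]
    c, s = math.cos(theta), math.sin(theta)
    return (scale * (c * ax - s * ay) + p1r[0],
            scale * (s * ax + c * ay) + p1r[1])


# Search image is half the size and rotated a quarter turn relative to the reference.
p1s, p2s = (5, 5), (5, 55)
p1r, p2r = (10, 20), (110, 20)
scale, theta = estimate_correction(p1s, p2s, p1r, p2r)
mapped = search_to_reference(p2s, scale, theta, p1s, p1r)
```

Applying the conversion to the second matching block center should land on the second reference block center, which is a convenient sanity check for the estimated parameters.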
【０２７４】
この変換式は、探索画像上の点（ｘ’，ｙ’）を補正画像上の点（ｘ”，ｙ”）に変換するものだが、補正画像上の点（ｘ”，ｙ”）は、参照画像に（背景部分が）重なるようになるのだから、意味的には、探索画像から参照画像への（背景部分が重なるような）変換とみなせる。従って、この変換式を探索画像上の点（Ｘｓ，Ｙｓ）を参照画像上の点（Ｘｒ，Ｙｒ）への変換関数Ｆｓｒ、
（Ｘｒ，Ｙｒ）＝Ｆｓｒ（Ｘｓ，Ｙｓ）
と表現することにする。
【０２７５】
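Conversely, the above equations can be rearranged into equations that map a corrected point (x″, y″) back to a point (x′, y′) in the search image:
x′ = (1/R) × (cosθ × (x″ - x1) + sinθ × (y″ - y1)) + x1′
y′ = (1/R) × (-sinθ × (x″ - x1) + cosθ × (y″ - y1)) + y1′
Written as a conversion function Frs, this is
(Xs, Ys) = Frs(Xr, Yr).
The conversion function Frs is the inverse of the conversion function Fsr.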
【０２７６】
図６（ａ）、図７（ａ）の例では回転や拡大縮小はなく、単なる平行移動だけであるが、詳細は後で図７（ｃ）で説明する。
【０２７７】
以上のＳ３−１からＳ３−４の処理で、図５のＳ３の背景補正量算出の処理が行われる。
【０２７８】
図１６は、図５のＳ４の処理、すなわち第２被写体画像の補正画像を生成し、第１被写体画像との差分画像を生成する処理の一方法を説明するフローチャート図である。
【０２７９】
Ｐ４０を経たＳ４−１では、補正画像生成手段５が、背景補正量算出手段４（Ｓ３）で得られる補正量を使って、第２被写体画像を第１被写体画像に背景部分が重なるように補正した画像を生成し、Ｓ４−２へ処理が進む。なお、ここで生成される補正された第２被写体画像を「補正第２被写体画像」（図７（ｃ）参照）と呼ぶことにする。
【０２８０】
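Either the conversion function Fsr or the inverse conversion function Frs can be used for the correction. In general, to produce a clean converted image, one finds, for each pixel position of the converted image (here, the corrected second subject image), the corresponding pixel position in the original image (here, the second subject image), and derives the pixel value of the converted image from that position. The conversion function used in this case is Frs, because it maps a position in the coordinates of the reference image to a position in the search image.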
【０２８１】
また、一般に求めた元画像の画素位置は整数値とはならないので、そのままでは求めた元画像の画素位置の画素値は求められない。そこで、通常は何らかの補間を行う。例えば最も一般的な手法として、求めた元画像の画素位置の周囲の整数値の画素位置の４画素から一次補間で求める手法がある。一次補間法に関しては、一般的な画像処理の本など（例えば、森北出版：安居院猛、中嶋正之共著「画像情報処理」のＰ５４）に載っているので、ここでは詳しい説明を省略する。
【０２８２】
図７（ｃ）は、図７（ａ）の第２被写体画像と図６（ａ）の第１被写体画像とから、第２被写体画像が第１被写体画像の背景部分に重なるように生成した補正第２被写体画像の例である。この例での補正は平行移動だけである。補正の様子が分かるように、図７（ａ）の第２被写体画像の範囲を点線で示してある。図７（ａ）の第２被写体画像よりフレーム枠全体が少し右下に移動している。
【０２８３】
補正の結果、対応する第２被写体画像が存在しない部分が出てくる。例えば、図７（ｃ）の右端の点線と実線の間の部分は、図７（ａ）の第２被写体画像には存在しない部分なので、抜けている。これは、下の道路を示す水平線が右端までいかずに途切れているのでも分かる。その部分は、Ｓ４−２で説明するマスク画像を使って除外するので適当な画素値のままとしておいても問題はない。
【０２８４】
なお、図１７（ａ）は補正に回転が必要な場合の第２被写体画像の例である。第１被写体画像は、図６（ａ）と同じとする。画面全体が図７（ａ）と比べて少し左回りに回転している。
【０２８５】
図１７（ｂ）は、図１７（ａ）の第２被写体画像と図６（ａ）の第１被写体画像とでブロックマッチングを行った結果である。ブロックは回転などがあっても、回転量やブロックの大きさがそれほど大きくなければ、ブロック内での画像変化は少ないので、回転に追従して正確なマッチングがある程度可能である。
【０２８６】
図１７（ｃ）は、図１７（ｂ）のブロックマッチング結果をもとに背景補正量を算出し、補正した第２被写体画像である。図６（ａ）の第１被写体画像と背景部分が重なるようになり、回転が補正されているのが分かる。補正の様子がわかるように、図１７（ａ）の画像枠を点線で示してある。
【０２８７】
Ｓ４−２では、補正画像生成手段５が、補正第２被写体画像のマスク画像を生成して、Ｓ４−３へ処理が進む。
【０２８８】
マスク画像は、補正画像を生成する際、補正画像上の各画素に対応するオリジナル画像上の画素位置が先に説明した式で求められるが、その画素位置がオリジナル画像の範囲に収まっているかどうかで判断して、収まっていればマスク部分として補正画像上の対応する画素の画素値を例えば０（黒）にし、収まっていなければ例えば２５５（白）にすればよい。マスク部分の画素値は０、２５５に限らず自由に決めてよいが、以降では、０（黒）、２５５（白）で説明する。
【０２８９】
図７（ｄ）は、図７（ｃ）のマスク画像の例である。実線のフレーム枠中の黒く塗りつぶされた範囲がマスク部分である。このマスク部分は、補正された画像中でオリジナルの画像（補正前の画像）が画素を持っている範囲を示している。従って、図７（ｄ）では、対応する第２被写体画像が存在しない右下端部分がマスク部分とはなっておらず、白くなっている。
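S4-1 and S4-2 can be sketched together as follows: each pixel of the corrected image is mapped back into the second subject image, its value is obtained by linear interpolation from the four surrounding pixels, and the mask records whether that position exists. This is a sketch under assumptions: grayscale images as lists of rows, a similarity transform parameterized as in S3-4, and hypothetical function names.

```python
import math


def reference_to_search(pt, scale, theta, p1s, p1r):
    """Inverse conversion: corrected-image position -> original position."""
    ux, uy = pt[0] - p1r[0], pt[1] - p1r[1]
    c, s = math.cos(theta), math.sin(theta)
    return ((c * ux + s * uy) / scale + p1s[0],
            (-s * ux + c * uy) / scale + p1s[1])


def bilinear(img, x, y):
    """Linear interpolation from the 4 neighbours; None if (x, y) is outside."""
    h, w = len(img), len(img[0])
    if x < 0 or y < 0 or x > w - 1 or y > h - 1:
        return None
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp_with_mask(second, width, height, scale, theta, p1s, p1r):
    """Return (corrected image, mask); mask is 0 (black) where an original
    pixel exists and 255 (white) where it does not."""
    corrected = [[0] * width for _ in range(height)]
    mask = [[255] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            xs, ys = reference_to_search((x, y), scale, theta, p1s, p1r)
            value = bilinear(second, xs, ys)
            if value is not None:
                corrected[y][x] = value
                mask[y][x] = 0
    return corrected, mask


# Pure translation: the corrected frame sits one pixel to the right.
second = [[10, 20, 30], [40, 50, 60]]
corrected, mask = warp_with_mask(second, 3, 2, 1.0, 0.0, (1, 0), (0, 0))
```

In this toy case the rightmost column has no source pixel, so it stays white in the mask, just like the lower-right margin in FIG. 7(d).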
【０２９０】
Ｓ４−３では、差分画像生成手段６が、第１被写体画像と、補正画像生成手段５（Ｓ４−１）から得られる補正第２被写体画像とそのマスク画像とを用いて、第１被写体画像と補正第２被写体画像との差分画像を生成してＳ４−４へ処理が進む。
【０２９１】
差分画像を生成するには、ある点（ｘ、ｙ）のマスク画像上の点の画素値が０かどうかを見る。０（黒）ならば補正第２被写体画像上に補正された画素が存在するはずなので、差分画像上の点（ｘ、ｙ）の画素値Ｐｄ（ｘ、ｙ）は、
Ｐｄ（ｘ、ｙ）＝｜Ｐ１（ｘ、ｙ）−Ｐｆ２（ｘ、ｙ）｜
より、第１被写体画像上の画素値Ｐ１（ｘ、ｙ）と補正第２被写体画像上の画素値Ｐｆ２（ｘ、ｙ）の差の絶対値とする。
【０２９２】
ある点（ｘ、ｙ）のマスク画像上の点の画素値が０（黒）でないならば、
Ｐｄ（ｘ、ｙ）＝０
とする。
【０２９３】
これらの処理を、点（ｘ、ｙ）を差分画像の左上から右下まですべての画素について繰り返せばよい。
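The per-pixel rule of S4-3 can be written compactly as below, assuming grayscale images of identical size stored as lists of rows and the 0/255 mask convention used above. The function name `difference_image` is hypothetical.

```python
def difference_image(first, corrected_second, mask):
    """Absolute difference where the mask is 0 (corrected pixel exists),
    and 0 elsewhere, scanned from top-left to bottom-right."""
    height, width = len(first), len(first[0])
    diff = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x] == 0:
                diff[y][x] = abs(first[y][x] - corrected_second[y][x])
    return diff


first = [[100, 100, 100], [100, 30, 100]]
corrected_second = [[100, 180, 0], [101, 100, 0]]
mask = [[0, 0, 255], [0, 0, 255]]
diff = difference_image(first, corrected_second, mask)
```

For RGB images the same rule would be applied per channel.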
【０２９４】
図８（ａ）は、図６（ａ）の第１被写体画像と図７（ｃ）の補正第２被写体画像、図７（ｄ）のマスク画像から生成された差分画像の例である。人物（１）と人物（２）の領域以外の所は背景が一致している、あるいはマスク範囲外として差分が０となる。この結果、主に人物（１）の領域と人物（２）の領域内がそれぞれ、人物（１）の画像と背景の画像、人物（２）の画像と背景の画像が交じり合ったような画像となっている。
【０２９５】
通常、Ｓ３での補正量の算出の誤差や、補正画像生成の補間処理などの誤差、背景部分の画像自体の撮影時間の差による微妙な変化などによって、人物（１）の領域と人物（２）の領域以外にも小さな差分部分は出てくる。通常は数画素程度の大きさで、差もあまり大きくないことが多い。図８（ａ）でも人物（１）の領域と人物（２）の領域の周辺に白い部分がいくつか出てきている。
【０２９６】
一方、図１７（ｂ）の場合のマスク画像は図１７（ｄ）のようになる。なお、拡大縮小や回転の補正量がある場合でも、Ｓ４−１、Ｓ４−２で補正やマスク画像生成を行ってしまえば、後の処理は手順としては変わりないので、以降の説明では、第２被写体画像は図１７（ａ）は使わず、図７（ａ）のものを使う。
【０２９７】
以上のＳ４−１からＳ４−３の処理で、図５のＳ４の差分画像生成の処理が行える。
【０２９８】
図１８は、図５のＳ５の処理、すなわち被写体領域を抽出する処理の一方法を説明するフローチャート図である。
【０２９９】
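In S5-1, reached via P50, the subject region extraction means 7 generates a "labeling image" (the meaning of "labeling image" is explained later) from the difference image obtained from the difference image generation means 6 (S4), and processing proceeds to S5-2.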
【０３００】
まず準備として、差分画像から２値画像を生成する。２値画像の生成方法も色々考えられるが、例えば、差分画像中の各画素値を所定の閾値と比較して、閾値より大きければ黒、以下ならば白、などとしてやればよい。差分画像がＲ，Ｇ，Ｂの画素値からなる場合は、Ｒ，Ｇ，Ｂの画素値を足した値と閾値を比較すればよい。
【０３０１】
図８（ｂ）は、図８（ａ）の差分画像から生成した２値画像の例である。黒い領域が領域１１０から１１７の８つ存在し、大きな人型の領域１１２、１１３以外は小さな領域である。
【０３０２】
次に、生成した２値画像からラベリング画像を生成するが、一般に「ラベリング画像」とは、２値画像中の白画素同士あるいは黒画素同士が連結している塊を見つけ、その塊に番号（「ラベリング値」と以降、呼ぶ）を振っていく処理により生成される画像である。多くの場合、出力されるラベリング画像は多値のモノクロ画像であり、各塊の領域の画素値は全て振られたラベリング値になっている。
【０３０３】
なお、同じラベリング値を持つ画素の領域を「ラベル領域」と以降呼ぶことにする。また、連結している塊を見つけ、その塊にラベリング値を振っていく処理手順の詳細については、一般的な画像処理の本など（例えば、昭晃堂：昭和６２年発行「画像処理ハンドブック」Ｐ３１８）に載っているので、ここでは省略し、処理結果例を示す。
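For readers without the cited handbook at hand, binarization and labeling can be sketched as follows. The 4-neighbour connectivity and the breadth-first flood fill are choices made for this illustration; the text does not fix either, and the function names are hypothetical. In the binary image, 1 stands for "black" (difference above the threshold).

```python
from collections import deque


def binarize(diff, threshold):
    """1 where the difference exceeds the threshold, 0 otherwise."""
    return [[1 if value > threshold else 0 for value in row] for row in diff]


def label_regions(binary):
    """Assign labeling values 1, 2, ... to connected blobs of 1-pixels.
    Background pixels keep labeling value 0."""
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    next_label = 0
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 1 and labels[y][x] == 0:
                next_label += 1
                labels[y][x] = next_label
                queue = deque([(x, y)])
                while queue:
                    cx, cy = queue.popleft()
                    for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                                   (cx, cy + 1), (cx, cy - 1)):
                        if (0 <= nx < width and 0 <= ny < height
                                and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((nx, ny))
    return labels, next_label


diff = [[0, 90, 0, 0], [0, 80, 0, 70], [0, 0, 0, 60]]
binary = binarize(diff, 50)
labels, region_count = label_regions(binary)
```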
【０３０４】
２値画像とラベリング画像とは、２値か多値かの違いなので、ラベリング画像例は図８（ｂ）で説明する。図８（ｂ）の領域１１０から１１７の番号の後に「１１０（１）」などと括弧書きで番号がついているが、これが各領域のラベリング値である。これ以外の領域はラベリング値０が振られているとする。
【０３０５】
なお、ラベリング画像図８（ｂ）は、紙面上で多値画像を図示するのが難しいので２値画像のように示してあるが、実際はラベリング値による多値画像になっているので、表示する必要はないが実際に画像として表示した場合は図８（ｂ）とは異なる見え方をする。
【０３０６】
Ｓ５−２では、被写体領域抽出手段７が、Ｓ５−１で得られるラベリング画像中の「ノイズ」的な領域を除去して、Ｓ５−３へ処理が進む。「ノイズ」とは目的のデータ以外の部分を一般に指し、ここでは人型の領域以外の領域を指す。
【０３０７】
ノイズ除去にも様々な方法があるが、簡単な方法として、例えばある閾値以下の面積のラベル領域は除くという方法がある。これには、まず各ラベル領域の面積を求める。面積を求めるには、全画素を走査し、ある特定のラベリング値を持つ画素がいくつ存在するか数えればよい。全ラベリング値について面積（画素数）を求めたら、それらの内、所定の閾値以下の面積（画素数）のラベル領域は除去する。除去処理は、具体的には、そのラベル領域をラベリング値０にしてしまうか、新たなラベリング画像を作成し、そこにノイズ以外のラベル領域をコピーする、でもよい。
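The area-based noise removal can be sketched as below, using the first of the two removal options (resetting small label regions to labeling value 0). The function name and the example threshold are assumptions.

```python
from collections import Counter


def remove_small_regions(labels, max_noise_area):
    """Reset every label region whose pixel count is at or below
    max_noise_area to labeling value 0. Returns (cleaned labels, areas)."""
    areas = Counter(value for row in labels for value in row if value != 0)
    keep = {label for label, area in areas.items() if area > max_noise_area}
    cleaned = [[value if value in keep else 0 for value in row]
               for row in labels]
    return cleaned, areas


labels = [[1, 1, 0, 2],
          [1, 1, 0, 0],
          [0, 0, 3, 3]]
cleaned, areas = remove_small_regions(labels, max_noise_area=2)
```

A real threshold would be chosen relative to the image size, since a person-shaped region covers thousands of pixels while noise regions cover only a few.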
【０３０８】
図８（ｃ）は、図８（ｂ）のラベリング画像からノイズ除去した結果である。人型の領域１１２、１１３以外はノイズとして除去されてしまっている。
【０３０９】
なお、被写体以外のラベル領域を除去するノイズ除去処理の完全自動化が難しいなら、例えば、どの領域が被写体領域であるかを、タブレットやマウスなどの入力手段を使ってユーザーに指定してもらう方法も考えられる。指定方法も、被写体領域の輪郭まで指定してもらう方法と、輪郭はラベリング画像の各ラベル領域の輪郭を使い、どのラベル領域が被写体領域であるかどうかを指定してもらう方法などが考えられる。
【０３１０】
また、図８（ｂ）ではたまたま一人の領域がうまく一つのラベル領域となっているが、画像によっては、一人の被写体であっても複数のラベル領域に分かれてしまうことがある。例えば、被写体領域中の真中辺りの画素が、背景と似たような色や明るさの画素の場合、差分画像中のその部分の画素値が小さいので、被写体領域の真中辺りが背景と認識されてしまい、被写体領域が上下や左右に分断されて抽出されてしまうことがある。その場合、後の被写体の重なり検出や合成処理などでうまく処理できない場合が出てくる可能性がある。
【０３１１】
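One approach is therefore to dilate the label regions of the labeling image and merge label regions that lie close together into a single label region. Another possibility is to use "snakes", one of the techniques for region extraction, for the merging. Details of the dilation and snake procedures can be found in general image-processing references (for example, Shokodo, "Image Processing Handbook", 1987, p. 320, or Kass M., et al., "Snakes: Active Contour Models", Int. J. Comput. Vision, pp. 321-331 (1988)), and are omitted here.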
【０３１２】
また、距離的に近いラベル領域の統合に使わなくても、第１、第２の被写体領域同士に重なりがあることを見逃す危険性を減らすことに使う為に、抽出した被写体領域を一定量膨張させるという方法もある。
【０３１３】
なお、ここでは、膨張や統合は特に行わない処理例で説明している。
【０３１４】
Ｓ５−３では、重なり検出手段８が、Ｓ５−２で得られるノイズ除去されたラベリング画像から被写体同士の重なりがあるかどうかを検出し、重なりが検出されなければＳ５−４へ進み、重なりが検出されればＳ５−５へ進む。
【０３１５】
重なりの検出方法には様々な方法が考えられるが、ここでは簡単に求められる方法として、撮影／合成したい被写体の数と、ノイズ除去されたラベリング画像中の被写体の領域数とを使う方法について説明する。
【０３１６】
まず、撮影／合成したい被写体の数は予めプログラムや外部記憶、ユーザー入力などによって指定されているとする。例えば、カメラに「２集団撮影モード」（被写体数２）、「３集団撮影モード」（被写体数３）などのモード設定があり、これをユーザーが設定する。
【０３１７】
なお、ここでは「被写体の数」は領域として一塊になっている人物などの数である。例えば、第１の被写体、第２の被写体としてそれぞれ１人ずつならば、被写体の数は２となる。第１の被写体は１人として、もし、第２の被写体が２人の場合、その２人がくっつきあって写る場合は、一塊の領域となっているので、第２の被写体を１とし、被写体の数は合計２となるが、２人が距離を空けて離れている場合は、一塊の領域となっていないので、第２の被写体を２とし、被写体の数は合計３となる。
【０３１８】
被写体の領域数は、ノイズ除去されたラベリング画像中の異なるラベル値の領域数を数えればよい（ラベリング値０の部分は除く）。
【０３１９】
そこで、重なり検出手段８は、得られた撮影／合成したい被写体の数と、ノイズ除去されたラベリング画像中の被写体の領域数とが一致するかどうかを見て、一致するならば被写体同士が重なっていないと判断し、一致しない場合は被写体同士が重なっていると判断する。
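The count-based test is small enough to state directly. The function names are hypothetical, and the labeling image is assumed to have already been noise-filtered.

```python
def count_regions(labels):
    """Number of distinct non-zero labeling values in the labeling image."""
    return len({value for row in labels for value in row if value != 0})


def subjects_overlap(labels, expected_subjects):
    """Overlap is assumed whenever the region count differs from the
    number of subjects the user set (for example 2 in a two-group mode)."""
    return count_regions(labels) != expected_subjects


separate = [[1, 1, 0, 2], [1, 0, 0, 2]]   # two blobs, as in FIG. 8(c)
merged = [[1, 1, 1, 1], [1, 0, 0, 1]]     # one blob, as in FIG. 11(c)
```

Note that the test also reports an overlap when noise removal leaves too many regions, so in that situation the warning really means "the region count is not what was expected".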
【０３２０】
この重なり検出手段８による判断の原理は次の通りである。説明を簡単にする為、ここでは撮影／合成したい被写体の数は２とする。
【０３２１】
もし被写体同士が重なっていないならば、当然、第１の被写体の領域と第２の被写体の領域は分離しているはずである。従って、被写体同士が重なっていない場合、ノイズ除去した後の被写体の領域の数は２となるはずである。
【０３２２】
もし被写体同士が重なっているのならば、第１の被写体の領域と第２の被写体の領域は重なっている部分で統合されるため、分離していないはずである。従って、被写体同士が重なっている場合、ノイズ除去した後の被写体の領域の数は１となるはずである。
【０３２３】
撮影／合成したい被写体の数が３でも同様の考え方で、もし被写体同士が重なっていないならば、それぞれの領域は分離されているので、ノイズ除去した後の被写体の領域の数は３となるはずである。もし被写体同士が重なっているのならば、３つの被写体の領域の少なくともいずれか一組は重なっている部分で統合されるため、分離していないはずである。従って、被写体同士が重なっている場合、ノイズ除去した後の被写体の領域の数は１あるいは２となるはずである。
【０３２４】
図６（ａ）、図７（ａ）ではそれぞれ被写体となる人物が１人なので、撮影／合成したい被写体の数は２で設定されているとする。図８（ｃ）では、領域の数は、人型の領域１１２、１１３の２つなので、得られた撮影／合成したい被写体の数と、ノイズ除去されたラベリング画像中の被写体の領域数とが一致する。従って、この場合、重なり検出手段８は被写体同士が重なっていないと判断する。
【０３２５】
重なりがある例として、第２被写体画像の図１０を使う場合を考える。第１被写体画像は図６（ａ）をそのまま使う。これらから生成された差分画像が図１１（ａ）である。図１１（ａ）では、被写体同士が重なってしまい、重なった腕の部分は、第１の被写体と第２の被写体の画像が交じり合った画像となり、それ以外の被写体の部分は、第１の被写体と背景部分、第２の被写体と背景部分の画像が交じり合った画像となっている。図１１（ａ）のラベリング画像が図１１（ｂ）であり、図１１（ｂ）からノイズ除去を施したものが図１１（ｃ）である。
【０３２６】
図１１（ｃ）では、第１の被写体と第２の被写体の領域は腕の部分で結合されてしまっているので、１塊の領域２０２しか残らない。この場合、ノイズ除去されたラベリング画像中の被写体の領域数は１となるので、撮影／合成したい被写体の数と一致せず、重なりがあると判断されることになる。
【０３２７】
なお、重なり検出の方法として、第１の被写体と第２の被写体の輪郭を正確に求めて、その輪郭同士が重なっているかどうかで判断する方法もある。輪郭が正確に求まるのならば、重なりの検出を行うことも可能であり、さらに重なり領域を使った表示、重なり回避などの様々な処理を行うことも可能である。
【０３２８】
しかし、被写体の領域を画像処理だけで完全に正確に抽出することは一般に難しく、人間の知識や人工知能的な高度な処理が一般に必要とされる。領域を抽出する手法の１つである「スネーク」などもあるが、完璧ではない。なお、第１被写体画像および第２被写体画像に加えて、各被写体画像と少なくとも一部共通する背景部分が写っていて被写体は写っていない背景画像を利用するのであれば、重なりの有無にかかわらず、被写体の領域を抽出することができる。これに対し、第１被写体画像と第２被写体画像からだけで、重なりがあるかもしれない被写体の輪郭を正確に抽出するのは難しい。
【０３２９】
従って、ここでは上述した簡単な方法で重なりの有無だけを検出することにする。
【０３３０】
Ｓ５−４では、被写体領域抽出手段７が、ノイズ除去されたラベリング画像中の被写体の領域について、どちらが第１被写体領域で、どちらが補正第２被写体領域なのかを判断して、Ｐ６０へ抜ける。
【０３３１】
上述の背景画像を用いる方法では、背景画像と第１被写体画像との差分画像、背景画像と第２被写体画像との差分画像を使っているので、被写体領域はそれぞれ抽出できる。抽出された被写体領域は、それぞれ第１被写体領域と第２被写体領域となる。つまり、第１被写体領域と第２被写体領域とは独立して抽出できる。
【０３３２】
しかし、本発明では背景画像を使わないので、第１被写体画像と第２被写体画像との差分画像からは、第１被写体の領域と第２被写体領域は独立して抽出できず、第１被写体領域と第２被写体領域とが混ざった形でしか抽出できない。つまり、図８（ｃ）のようなノイズ除去されたラベリング画像からは、被写体領域１１２、１１３が２つ得られるだけで、２つの領域１１２、１１３のうち、どちらが第１被写体領域でどちらが第２被写体領域なのかは、これだけでは被写体領域抽出手段７が判断できない。
【０３３３】
どちらが第１被写体領域か第２被写体領域か判断できないというのは、見方を変えると、第１、第２の被写体の画像か背景部分の画像かを被写体領域抽出手段７が判断できない、ということでもある。
【０３３４】
例えば、第１被写体画像（図６（ａ））と第２被写体画像（図７（ａ））から、図８（ｃ）の領域１１２、１１３に相当する範囲をそれぞれ抜き出したのが図１９（ａ）〜（ｄ）である。すなわち、図１９（ａ）は、第１被写体画像の領域１１２の範囲、図１９（ｂ）は、第２被写体画像の領域１１２の範囲、図１９（ｃ）は、第１被写体画像の領域１１３の範囲、図１９（ｄ）は、第２被写体画像の領域１１３の範囲である。
【０３３５】
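Since it is assumed that, apart from the background, only the first subject appears in the first subject image and only the second subject appears in the second subject image, in practice exactly one of the following holds: "FIG. 19(a) is the image of the first subject and FIG. 19(d) is the image of the second subject", or "FIG. 19(c) is the image of the first subject and FIG. 19(b) is the image of the second subject".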
【０３３６】
従って、第１被写体領域と第２被写体領域を区別するには、図１９（ａ）、（ｄ）と図１９（ｂ）、（ｃ）のどちらが被写体範囲の画像かを識別すればよい。
【０３３７】
どちらが被写体範囲の画像かを識別するには様々な方法が考えられるが、例えば、被写体や背景の特徴が予めわかっているのならば、それを利用して区別する方法がある。
【０３３８】
例えば、被写体が人物であることが分かっているのならば、被写体範囲の画像には肌色が多く含まれている可能性が高い。従って、肌色が多く含まれる方を被写体範囲の画像とすればよい。
【０３３９】
色の認識の仕方にも様々な方法があるが、例えば、図４のＲ、Ｇ、Ｂの画素値から、色相Ｈ、彩度Ｓ、明度Ｉを求め、主に色相Ｈを使って認識する方法がある。色相Ｈ、彩度Ｓ、明度Ｉの求め方には各種方式があり、一般的な画像処理の本など（例えば、東京大学出版会、１９９１年発行「画像解析ハンドブック」Ｐ４８５〜４９１）に載っているので、ここでは詳細は省略するが、例えば同書籍中の「ＨＳＩ６角錐カラーモデルによる変換」方法では、色相Ｈは０から２πの値域を持つ。
【０３４０】
具体的には、被写体領域抽出手段７が標準となる肌色のＨの範囲を決める。次に、同手段７が図１９（ａ）〜（ｄ）の領域の各画素のＨを求め、標準となる肌色のＨの範囲に入っていれば、肌色としてカウントする。続いて、同手段７が図１９（ａ）、（ｄ）の肌色のカウント数と、図１９（ｂ）、（ｃ）の肌色のカウント数のどちらが多いか比較し、多い方が被写体範囲の画像とすればよい。
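The skin-color vote can be sketched with the standard library as follows. Several details are assumptions and not taken from the text: `colorsys.rgb_to_hsv` is used as a stand-in for the handbook's hexcone conversion and returns hue in the range 0 to 1 rather than 0 to 2π, the skin hue range of 0 to 50 degrees is only a placeholder, and a minimum-saturation guard is added because gray pixels otherwise report hue 0.

```python
import colorsys

# Assumed skin hue range (0 to 50 degrees), expressed in colorsys units.
SKIN_HUE_RANGE = (0.0, 50.0 / 360.0)


def count_skin_pixels(pixels, hue_range=SKIN_HUE_RANGE, min_saturation=0.15):
    """Count (R, G, B) pixels, 0-255 per channel, whose hue lies in hue_range.
    min_saturation is an added guard against gray pixels (assumption)."""
    low, high = hue_range
    count = 0
    for r, g, b in pixels:
        hue, saturation, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if low <= hue <= high and saturation >= min_saturation:
            count += 1
    return count


def pick_subject_pair(img_a, img_b, img_c, img_d):
    """Compare the pair (a, d) against the pair (b, c), as in FIG. 19,
    and return the name of the pair containing more skin-colored pixels."""
    score_ad = count_skin_pixels(img_a) + count_skin_pixels(img_d)
    score_bc = count_skin_pixels(img_b) + count_skin_pixels(img_c)
    return "a_d" if score_ad >= score_bc else "b_c"


skin, sky, grass, gray = (224, 172, 105), (90, 140, 220), (60, 160, 70), (128, 128, 128)
choice = pick_subject_pair([skin, skin], [sky], [grass], [skin])
```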
【０３４１】
特徴量を使って、どちらが被写体範囲の画像かを識別する方法として、肌色を使う以外にも、例えば、周囲の背景部分と似ているかどうかで識別する方法がある。
【０３４２】
この場合、まず、被写体領域抽出手段７が被写体領域中の特徴量（後述）を第１被写体画像、第２被写体画像で求める。次に、同手段７が被写体領域の周囲の領域（例えば周囲２０ドットなど）の特徴量を求める。被写体領域の周囲は背景部分であり、背景部分は重なるように補正しているので、これは片方だけでも良い場合もある。そして、同手段７が、背景部分の特徴量と近い特徴量をもつ方を背景部分の画像、近くない方を被写体領域の画像と判断すればよい。
【０３４３】
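In addition to the R, G, B pixel values and the hue H, saturation S and intensity I described above, texture can also be used as the feature. Various methods for deriving texture features have been devised; one example is the histogram of the intensity I. For the pixels in a region, a histogram P(i) (i = 0, 1, ..., n-1) of the intensity I is taken, normalized so that its total is 1.0, and the subject region extraction means 7 obtains the mean μ, the variance (σ^2), the skewness Ts and the kurtosis Tk by the following equations. Here (X^Y) denotes X raised to the power Y.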
【０３４４】

以上の４つの値を特徴量として使う。
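The four histogram features can be computed as sketched below. The formulas used here are the standard moment definitions (mean as Σ i·P(i), variance as Σ (i-μ)^2·P(i), skewness and kurtosis as the third and fourth central moments divided by σ^3 and σ^4); treating these as the intended definitions is an assumption, and the function name is hypothetical.

```python
import math


def histogram_features(intensities, n=256):
    """Return (mean, variance, skewness, kurtosis) of the normalized
    intensity histogram P(i), i = 0 .. n-1, of a region's pixels."""
    hist = [0.0] * n
    for value in intensities:
        hist[value] += 1.0
    total = float(len(intensities))
    p = [count / total for count in hist]           # sums to 1.0

    mean = sum(i * p[i] for i in range(n))
    variance = sum((i - mean) ** 2 * p[i] for i in range(n))
    if variance == 0.0:                              # flat region: no shape info
        return mean, 0.0, 0.0, 0.0
    sigma = math.sqrt(variance)
    skewness = sum((i - mean) ** 3 * p[i] for i in range(n)) / sigma ** 3
    kurtosis = sum((i - mean) ** 4 * p[i] for i in range(n)) / sigma ** 4
    return mean, variance, skewness, kurtosis


features = histogram_features([0, 0, 2, 2], n=4)
```

Two regions can then be compared by the distance between their feature tuples, with the region closer to the surrounding background treated as background.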
【０３４５】
特徴量としては、その他にも、同時生起行列や差分統計量、ランレングス行列、パワースペクトル、それらの第２次統計量、高次統計量を使う方法などがあるが、一般的な画像処理の本など（例えば、東京大学出版会、１９９１年発行「画像解析ハンドブック」Ｐ５１７〜５３８）に載っているので、ここでは詳細は省略する。
【０３４６】
これにより、図１９の場合、図１９（ａ）、（ｄ）が、被写体領域抽出手段７によって被写体範囲の画像と判断されたとする。すると、領域１１２が第１被写体領域、領域１１３が第２被写体領域となる。
【０３４７】
なお、ここでの処理は、Ｓ５−３で被写体同士の重なりが無い場合に実行される処理なので、図８（ｃ）のように第１の被写体と第２の被写体が完全に分離した状態になっているはずである。図１１（ｃ）のように、第１の被写体と第２の被写体が統合した状態にはなっていないはずである。
【０３４８】
Ｓ５−５では、Ｓ５−３で、撮影／合成したい被写体の数と、ノイズ除去されたラベリング画像中の被写体の領域数とが一致しなかったため、被写体領域抽出手段７が、ノイズ除去されたラベリング画像中の被写体の領域を、第１被写体領域と第２被写体領域が統合された領域（以降、「被写体統合領域」と呼ぶ）と定めて、Ｐ６０へ処理が抜ける。
【０３４９】
この場合、被写体領域抽出手段７によって第１被写体領域と第２被写体領域を独立して抽出することはあきらめ、統合された領域として処理する。なお、上述したように、第１の被写体と第２の被写体の輪郭を正確に求められる場合は、Ｓ５−３やＳ５−５の処理を行わず、Ｓ５−４の処理を行えばよい。
【０３５０】
以上のＳ５−１からＳ５−５の処理で、図５のＳ５の被写体領域抽出処理が行われる。
【０３５１】
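FIG. 20 is a flowchart explaining one method for the processing of S6 in FIG. 5, that is, the processing relating to overlap. Other methods for the overlap processing are described later with reference to FIGS. 21 and 22.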
【０３５２】
Ｐ６０を経たＳ６−１では、重なり警告手段１１において、重なり検出手段８（Ｓ５）から得られる重なりがあるかどうかの情報から、重なりがある場合はＳ６Ａ−２へ処理が進み、無い場合はＰ７０へ抜ける。
【０３５３】
Ｓ６Ａ−２では、重なり警告手段１１が、第１の被写体と第２の被写体に重なりがあることをユーザー（撮影者）あるいは被写体あるいはその両方に警告して、Ｐ７０へ抜ける。
【０３５４】
警告の通知の仕方としては色々考えられる。
【０３５５】
例えば、合成画像を利用して通知する場合、重なりのある被写体領域を目立つように合成画像に重ねて表示すればよい。図１２はこれを説明する例である。
【０３５６】
図１２では、図１１（ｃ）の領域２０２、すなわち第１の被写体と第２の被写体の重なり合った領域が、合成画像上に重ねて半透明で表示されている。領域２０２の部分を赤などの目立つ色のフィルタをかける（領域２０２に色セロハンを当てるイメージ）とさらに良い。あるいは、領域２０２の領域やその枠を点滅させて表示させても良い。これらの合成方法については、後で図２３で説明する。
【０３５７】
図１２では、さらに文字で警告を行っている例である。図１２の上の方に合成画像に重ねて警告ウィンドウを出し、その中で「被写体が重なっています！」というメッセージを表示している。これも目立つような配色にしたり、点滅させたりしてもよい。
【０３５８】
合成画像に対する上書きは、重なり警告手段１１の指示により、重ね画像生成手段９で行っても良いし、重ね画像表示手段１０で行っても良い。警告ウィンドウを点滅などさせる場合は元の合成画像を残しておく必要があるかもしれないので、重ね画像表示手段１０に対して、例えば主記憶７４または外部記憶７５から警告ウィンドウのデータを間歇的に読み出して与える等して行った方がよい場合が多い。
【０３５９】
これらの警告表示を図３（ａ）のモニター１４１上に表示すれば、撮影しながら重なり状態を確認することができて、撮影に便利である。この時、撮影者は被写体（人物（２））に対して、「重なっているからもっと右の方に動いてくれ」などと、次に撮影した画像を第２被写体画像などとして使う場合に、重なり状態を解消するような指示を行うことができるという利点がある。
【０３６０】
なお、次に撮影した画像を第２被写体画像などとして使う場合とは、ユーザーがメニューやシャッターボタンで第２被写体画像の記録（メモリ書き込み）を指示する場合か、先に説明したように、第２被写体画像を動画的に撮影し補正重ね画像をほぼリアルタイムに表示する繰り返し処理の専用モードになっている場合などが考えられる。
【０３６１】
また、図３（ａ）のモニター１４１は撮影者の方を向いているが、被写体の方にモニターを向けることができる装置ならば、重なり状態を被写体も確認することができ、撮影者に指示されなくても、被写体が自発的に重なりを解消するように動くこともできるようになる。モニター１４１とは別のモニターを用意して、それを被写体が見られるようにするのでもよい。
【０３６２】
また、先に専用モードとして説明したように図５のＳ３からＳ７の処理を繰り返すのならば、現在の重なり状態がほぼリアルタイムで分かるので、被写体の移動によって重なりが解消できたかどうかがほぼリアルタイムで分かり、撮影が便利で効率よくできる。図５のＳ３からＳ７の処理は、充分速いＣＰＵやロジック回路などを使えば、それほど時間は必要ない。実使用上は、１秒に１回程度以上の速さの繰り返し処理を実現できれば、ほぼリアルタイムの表示と言って良い。
【０３６３】
なお、Ｓ４で補正画像を生成する際、第１被写体画像を基準画像にすると、合成画像も第１被写体画像がベースとなる。モニター１４１に写る背景の範囲は第１被写体画像の背景の範囲となる。上述したリアルタイムに繰り返し処理を行う場合、カメラを振ると撮影される背景の範囲が変わるが、撮影される画像は第２被写体画像であって、第１被写体画像ではない。従って、モニター１４１に写る背景の範囲は、第１被写体画像の背景の範囲のまま変わらない。このため、撮影している範囲がモニター１４１に写らない／反映されないというのは、ユーザーにとって違和感がある。
【０３６４】
これに対し、第２被写体画像を基準画像にすると、モニター１４１に写る背景の範囲は第２被写体画像の背景の範囲となる。上述したリアルタイムに繰り返し処理を行う場合、カメラを振ると撮影される背景の範囲が変わり、撮影される画像は第２被写体画像（基準画像）なので、モニター１４１に写る背景の範囲は、撮影中の背景の範囲となる。これにより、撮影している範囲がモニター１４１に写る／反映されるので、ユーザーにとって違和感が少ないという効果が出てくる。
【０３６５】
また、重なり合った被写体領域を合成画像と重ねて表示した結果、重なり具合と合成画像のフレーム枠との関係を見て、被写体がどう動いても重なりが生じたり、被写体がフレームアウトしてしまうとユーザーが判断できれば、もう一度、第１被写体画像の撮影からやり直した方が良いという判断を行うこともできるようになる。
【０３６６】
また、警告の通知の仕方として、図３（ａ）のランプ１４２を点燈あるいは点滅させることで知らせることもできる。警告なので、ランプの色は赤やオレンジなどの色にしておくと分かりやすい。ランプの点滅などは一般にモニター１４１に撮影者が注目していなくても気づきやすいという利点がある。
【０３６７】
また、図１２のように被写体の重ね画像を表示せず、重なりがあることだけを、警告メッセージやランプで知らせてもよい。この場合、どのくらい重なっているかはすぐには分からないが、重なりがあるかないかだけ分かれば、後は被写体が移動するなどして警告通知が無くなるかどうかを見ていれば重なりの無い合成画像を得るという目的は達せられる。従って、警告メッセージやランプで重なりがあることを知らせるだけにすることにより、重なり部分を表示させる処理が省けるという利点が出てくる。
【０３６８】
また、図３（ａ）ではランプ１４２を撮影者側のみ見られるような配置にしているが、もちろん、被写体側からも分かるように、図３（ｂ）の本体１４０の前面側につけてもよい。効果については、モニターを被写体が見られる場合と同様である。
【０３６９】
また、図３（ａ）にはないが、モニター１４１とは別にファインダーのような画像を確認できる別の手段がある場合、そちらにモニター１４１と同じ警告通知を表示したり、ファインダー内部にランプを組み込んでおき、通知する方法も考えられる。
【０３７０】
また、図３（ａ）、図３（ｂ）では示していないが、図２のスピーカ８０を使って警告通知を行っても良い。重なりがある場合に警告ブザーを鳴らしたり、「重なっています」などの音声を出力したりなどして、警告通知を行う。この場合にもランプと同様の効果が期待できる。スピーカを使う場合、光と違って指向性があまりないので、一つのスピーカで撮影者も被写体も両方重なり状態を知ることができるという利点がある。
【０３７１】
以上のＳ６−１からＳ６Ａ−２の処理で、図５のＳ６の重なりに関する処理が行える。
【０３７２】
図２１は、図５のＳ６の処理、すなわち重なりに関する処理の別の一方法を説明するフローチャート図である。
【０３７３】
Ｐ６０を経たＳ６−１では、シャッターチャンス通知手段１２が、重なり検出手段８（Ｓ５）から得られる情報に基づいて重なりがあるかどうかを判断し、重なりがある場合はＰ７０へ処理が抜け、無い場合はＳ６Ｂ−２へ処理が進む。
【０３７４】
Ｓ６Ｂ−２では、シャッターチャンス通知手段１２が、第１の被写体と第２の被写体に重なりがないことをユーザー（撮影者）あるいは被写体あるいはその両方に通知して、Ｐ７０へ抜ける。
【０３７５】
この通知は、実際には、重なりが無いことを通知するというより、重なりがないことによる副次的な操作、具体的には第２被写体を記録するシャッターチャンスであることを通知するような使われ方が最も一般的である。その場合、その通知は、主に撮影者に対するものとなる。
【０３７６】
シャッターチャンスの通知方法に関しては、図２０で説明したような方法がほぼそのまま使える。例えば、図１２のメッセージを「シャッターチャンスです！」などと変えるなどすればよい。その他、ランプ、スピーカについても、色や出力する音の内容などは多少変わるが、通知手法としては同様に利用できる。
【０３７７】
シャッターチャンスであることが分かれば、撮影者はシャッターを切ることで被写体同士に重なりのない状態で撮影／記録することができ、また、被写体もシャッターを切られるかもしれない準備（例えば目線の方向や顔の表情など）を行うことができるという利点が出てくる。
【０３７８】
以上のＳ６−１からＳ６Ｂ−２の処理で、図５のＳ６の重なりに関する処理が行える。
【０３７９】
図２２は、図５のＳ６の処理、すなわち重なりに関する処理のさらに別の一方法を説明するフローチャート図である。
【０３８０】
Ｐ６０を経たＳ６−１では、自動シャッター手段１３が、重なり検出手段８（Ｓ５）から得られる情報に基づいて重なりがあるかどうかを判断し、重なりがある場合はＰ７０へ処理が抜け、無い場合はＳ６Ｃ−２へ処理が進む。
【０３８１】
Ｓ６Ｃ−２では、自動シャッター手段１３が、シャッターボタンが押されているかどうかを判断し、押されていればＳ６Ｃ−３へ進み、押されていなければＰ７０へ抜ける。
【０３８２】
Ｓ６Ｃ−３では、自動シャッター手段１３が、第２被写体画像の記録を第２被写体画像取得手段３へ指示して、Ｐ７０へ処理が抜ける。第２被写体画像取得手段３は、指示に従い、撮影画像を主記憶７４、外部記憶７５などに記録する。
【０３８３】
これによって、被写体同士が重なっていない時にシャッターボタンが押されていれば、自動的に撮影画像を記録することができるようになるという効果が出てくる。同時に、誤って重なっている状態で撮影画像を記録してしまうことを防ぐ効果も出てくる。
【０３８４】
実際の使われ方としては、被写体の様子などを見て、今なら撮影画像を記録しても良いと思ったら撮影者がシャッターボタンを押すが、その時点で必ずしも記録される訳ではなく、重なりがある場合は記録されない。すなわち、自動シャッター手段１３が、重なりがあると判断した場合には、撮影者がシャッターボタンを押しても第２被写体画像取得手段３による記録動作が行われないように、第２被写体画像の記録を禁止する。
【０３８５】
なお、記録されない場合は、その旨を表示やランプ、スピーカなどの通知手段で撮影者などに知らせた方が、シャッターを押したが撮影されてないことが分かってよい。
【０３８６】
そして、被写体が動くなどして、重なりがない状態になった時に、再度シャッターボタンが押されれば、今度は記録される。記録されたことが分かるように、表示やランプ、スピーカなどの通知手段で撮影者などに知らせるとよい。
【０３８７】
シャッターボタンを毎度押すのではなく、押しっぱなしにするならば、重なっている状態から重なりがなくなった瞬間に自動的に記録されることになる。但し、重なりがなくなった瞬間だとまだ被写体が静止しておらず撮影画像がぶれてしまったり、被写体が撮影される状態（被写体が他所を向いている時など）になっていない場合があるので、その場合は自動的に記録するまでに少し時間をあけると良い。
【０３８８】
以上のＳ６−１からＳ６Ｃ−３の処理で、図５のＳ６の重なりに関する処理が行える。
【０３８９】
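Note that the processes of FIGS. 20 to 22 are not necessarily mutually exclusive and can be combined arbitrarily. As an example of a combination, the following usage scene becomes possible.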
【０３９０】
『被写体同士が重なっている時は「重なっています」と警告がなされ、この時にシャッターボタンを押しても撮影画像は記録されない。警告に応じて被写体が動き、重なりがなくなったらシャッターチャンスランプが点燈する。シャッターチャンスランプが点燈している間にシャッターボタンを押したら撮影画像が記録される。』
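The combined behaviour in this usage scene can be summarized as one decision per processed frame. This is only a sketch: the function name and the dictionary keys are hypothetical, and the real apparatus drives a monitor, lamp and speaker rather than returning flags.

```python
def handle_frame(overlap_detected, shutter_pressed):
    """Combine the warning (FIG. 20), the shutter-chance notice (FIG. 21)
    and the automatic shutter (FIG. 22) for one processed frame."""
    return {
        "warn_overlap": overlap_detected,              # "Subjects are overlapping!"
        "shutter_chance_lamp": not overlap_detected,   # lamp lit only when clear
        "record": (not overlap_detected) and shutter_pressed,
    }


blocked = handle_frame(overlap_detected=True, shutter_pressed=True)
recorded = handle_frame(overlap_detected=False, shutter_pressed=True)
```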
次に、図２３は、図５のＳ７の処理、すなわち重ね画像を生成する処理の一方法を説明するフローチャート図である。
【０３９１】
Ｐ７０を経たＳ７−１では、重ね画像生成手段９が、生成する重ね画像の最初の画素位置をカレント画素に設定してＳ７−２へ処理が進む。最初の画素位置は、例えば左上などの隅から始まることが多い。
【０３９２】
なお、「画素位置」は、画像上の特定の位置を表し、左上隅を原点、右方向を＋Ｘ軸、下方向を＋Ｙ軸としたＸ−Ｙ座標系で表現されることが多い。画素位置は、画像を表すメモリ上のアドレスに対応し、画素値はそのアドレスのメモリの値である。
【０３９３】
Ｓ７−２では、重ね画像生成手段９が、カレント画素位置は存在するかどうかを判断し、存在するならばＳ７−３へ処理が進み、存在しないならばＰ８０へ抜ける。
【０３９４】
Ｓ７−３では、重ね画像生成手段９が、カレント画素位置が被写体統合領域内かどうかを判断し、被写体統合領域内ならばＳ７−４へ処理が進み、そうでないならばＳ７−５へ処理が進む。
【０３９５】
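Whether the position lies within the merged subject region can be determined by checking that a merged subject region was obtained by the subject region extraction means 7 (S5-5) and that the current pixel position in the merged subject region image is black (0).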
【０３９６】
Ｓ７−４では、重ね画像生成手段９が、設定に応じた合成画素を生成して、重ね画像のカレント画素位置の画素値として書き込む。
【０３９７】
設定とは、つまりどのような合成画像を合成するかということである。例えば、図９（ｂ）のように第１の被写体を半透明で合成するのか、図９（ａ）のように不透明で第１の被写体をそのまま上書きで合成するのか、図１２のように第１の被写体も第２の被写体も半透明で合成するのか、などである。ここでは、被写体統合領域内を扱っているので、実質的には、その領域の合成割合（透過率）に関する設定となる。
【０３９８】
合成割合（透過率）が決まれば、第１被写体画像のカレント画素位置の画素値Ｐ１と補正画像生成手段５（Ｓ４）から得られる補正第２被写体画像のカレント画素位置の画素値Ｐｆ２を得て、所定の透過率Ａ（０．０から１．０の間の値）で合成画素値（Ｐ１×（１−Ａ）＋Ｐｆ２×Ａ）を求めればよい。
【０３９９】
例えば図１２のような被写体統合領域内を半透明とするには、透過率Ａを０．５とすればよい。
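The composite pixel formula P1 × (1 - A) + Pf2 × A can be sketched as follows for both grayscale values and RGB tuples. The function names are hypothetical.

```python
def blend(p1, pf2, a):
    """Composite one channel: p1 from the first subject image, pf2 from the
    corrected second subject image, a = transmittance A in [0.0, 1.0]."""
    return p1 * (1.0 - a) + pf2 * a


def blend_rgb(rgb1, rgb2, a):
    """Apply the same formula to each of the R, G and B channels."""
    return tuple(blend(c1, c2, a) for c1, c2 in zip(rgb1, rgb2))


half = blend(100, 200, 0.5)                       # semi-transparent, as in FIG. 12
rgb_half = blend_rgb((0, 100, 200), (100, 100, 0), 0.5)
```

With A = 0.0 the result is the first subject image pixel unchanged, and with A = 1.0 it is the corrected second subject image pixel unchanged.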
【０４００】
Ｓ７−５では、Ｓ７−３でカレント画素が被写体統合領域に属さないと判断された場合に、重ね画像生成手段９が、カレント画素位置が第１被写体領域内かどうかを判断し、第１被写体領域内ならばＳ７−６へ処理が進み、そうでないならばＳ７−７へ処理が進む。
【０４０１】
第１被写体領域内かどうかは、被写体領域抽出手段７（Ｓ５）から得られる第１被写体領域画像を使い、カレント画素位置が黒（０）かどうかで判断できる。なお、被写体統合領域が存在する場合は、第１被写体領域は存在しないことが分かっているので、第１被写体領域内かどうか判断せずに（Ｓ７−５を省略）、直接、Ｓ７−７へ処理を進めてもよい。
【０４０２】
なお、第１被写体領域であるかどうかで特に処理を変えない場合は、Ｓ７−５，Ｓ７−６は省いて、Ｓ７−３からＳ７−７へ進めばよい。
【０４０３】
Ｓ７−６では、重ね画像生成手段９が、設定に応じた合成画素を生成して、重ね画像のカレント画素位置の画素値として書き込む。ここでの処理は、被写体統合領域（画像）が第１被写体領域（画像）に変わるだけで、Ｓ７−４と同様である。
【０４０４】
図９（ｂ）のように第１の被写体を半透明で合成するのなら、第１の被写体の透過率を０．５とすればよく、図９（ａ）のように不透明で第１の被写体をそのまま上書きで合成するのならば、第１の被写体の透過率を０．０とすればよい。
【０４０５】
Ｓ７−７では、Ｓ７−５でカレント画素が第１被写体領域にも属さないと判断された場合に、重ね画像生成手段９が、カレント画素位置が第２被写体領域内かどうかを判断し、第２被写体領域内ならばＳ７−８へ進み、そうでないならばＳ７−９へ処理が進む。ここでの処理は、第１被写体領域が第２被写体領域に変わるだけで、Ｓ７−５と同様である。
【０４０６】
Ｓ７−８では、重ね画像生成手段９が、設定に応じた合成画素を生成して、重ね画像のカレント画素位置の画素値として書き込む。ここでの処理は、第１被写体領域が第２被写体領域に変わるだけで、Ｓ７−６と同様である。
【０４０７】
Ｓ７−９では、Ｓ７−７でカレント画素が第２被写体領域にも属さないと判断された場合に、重ね画像生成手段９が、第１被写体画像（基準画像）のカレント画素位置の画素値を重ね画像のカレント画素位置の画素値として書き込む。すなわち、この場合のカレント画素位置は、被写体統合領域内でも第１被写体領域内でも第２被写体領域内でもないので、結局、背景部分に相当する。
【０４０８】
Ｓ７−１０では、重ね画像生成手段９が、カレント画素位置を次の画素位置に設定して、Ｓ７−２へ処理が戻る。
【０４０９】
以上のＳ７−１からＳ７−１０の処理で、図５のＳ７の重ね画像生成に関する処理が行われる。
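The loop of S7-1 to S7-10 can be sketched as below. Region images use 0 (black) for pixels inside the region, as in the text; a region that does not exist is passed as None. The default transmittance values are assumptions chosen to reproduce an opaque overwrite of both subjects (first subject taken from the first subject image, second subject taken from the corrected second subject image), and the function name is hypothetical.

```python
def build_overlay(first, corrected_second, merged=None, region1=None,
                  region2=None, a_merged=0.5, a_first=0.0, a_second=1.0):
    """Generate the overlay image pixel by pixel with the first subject
    image as the reference image."""
    height, width = len(first), len(first[0])
    overlay = [[0] * width for _ in range(height)]
    for y in range(height):                     # S7-1, S7-2, S7-10: scan pixels
        for x in range(width):
            p1, pf2 = first[y][x], corrected_second[y][x]
            if merged is not None and merged[y][x] == 0:      # S7-3 -> S7-4
                a = a_merged
            elif region1 is not None and region1[y][x] == 0:  # S7-5 -> S7-6
                a = a_first
            elif region2 is not None and region2[y][x] == 0:  # S7-7 -> S7-8
                a = a_second
            else:                                             # S7-9: background
                overlay[y][x] = p1
                continue
            overlay[y][x] = p1 * (1.0 - a) + pf2 * a
    return overlay


first = [[10, 20, 30, 40]]
corrected_second = [[110, 120, 130, 140]]
region1 = [[0, 255, 255, 255]]
region2 = [[255, 0, 255, 255]]
separate = build_overlay(first, corrected_second, region1=region1, region2=region2)
merged_case = build_overlay(first, corrected_second, merged=[[255, 255, 0, 255]])
```

Setting `a_first=0.5` instead would draw the first subject semi-transparently, which corresponds to the display of FIG. 9(b).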
【０４１０】
なお、上記の処理ではＳ７−４やＳ７−６、Ｓ７−９で第１被写体画像や補正第２被写体画像を処理しているが、生成する重ね画像に、Ｓ７−１の前に最初に、第１被写体画像または補正第２被写体画像を全画素コピーしてしまい、その後、各画素位置の処理で第１被写体領域および／または第２被写体領域だけを処理する方法も考えられる。全画素コピーの方が処理手順は単純になるが、処理時間は若干増えるかもしれない。
【０４１１】
なお、ここでは合成画像の大きさを基準画像の大きさにしているが、これより小さくしたり、大きくしたりすることも可能である。例えば図７（ｃ）で補正画像を生成する際、一部を切り捨ててしまっていたが、補正画像の大きさを大きくして切り捨てないようにすれば、合成画像を大きくする時のために、切り捨てずに残した画像を合成に使い、それによって背景を広げることも可能となる。いわゆるパノラマ画像合成のようなことが可能となる効果が出てくる。
【０４１２】
図９（ｂ）は、第１被写体領域だけを半透明に合成した重ね画像である。図９（ｃ）は、第２被写体領域だけを半透明に合成した重ね画像である。図９（ａ）は、両方とも半透明にはせず、どちらも上書きして生成した重ね画像である。また、図１２は、両方とも半透明にして合成した重ね画像である。
【０４１３】
どの合成方法をとるかは目的によるので、ユーザーがそのときの目的に応じた合成方法を選択できるようにすれば良い。
【０４１４】
例えば、第１被写体画像を既に撮影／記録してあり、第２被写体画像を重なり無く撮影する場合などのためには、第１の被写体の詳細な画像は必要なく、第１の被写体が大体どの辺に存在し、第２の被写体と重なりがあるかどうかが分かればよいので、半透明の合成で構わない。また、第２の被写体は、撮影する瞬間にどういう表情をしているとかの詳細が分からないとうまくシャッターを切れないので、半透明ではなく上書きで合成する方が良い。従って、図９（ｂ）のような合成方法が向いている。
【０４１５】
また、既に説明したように、合成画像の背景の範囲が撮影中の画像（第２被写体）の背景の範囲となる方が違和感が少ないなら、第２被写体画像を基準画像にして、かつ、第２の被写体を撮影中であることが分かり易いように図９（ｂ）のように合成する方が向いている。
【０４１６】
また、合成する被写体の領域が分かった方が撮りやすいというユーザーにとっては、撮影中は両者を半透明で合成した方が良い場合や、第２の被写体だけを半透明にした方が良い場合もあるかもしれない。
【０４１７】
また、第２の被写体の撮影／記録が済んで、第１被写体画像、第２被写体画像を使って、最終的な合成画像を作成したい場合は、半透明な被写体では困るので、どちらも上書きで合成する必要がある。従って、図９（ａ）のような合成方法が向いている。
【０４１８】
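Furthermore, if the subject region obtained from the subject region extraction means 7 (S5) has already been dilated as described earlier, the background surrounding the subject is composited together with the subject. However, because the corrected image generation means 5 (S4) has already corrected the images so that the background portions coincide, the composite does not look unnatural at the composition boundary even if the extracted subject region is somewhat larger than the actual subject outline and includes some background.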
【０４１９】
なお、被写体領域を膨張させて処理するのであれば、合成境界をより自然に見せるように、外部も含めた被写体領域の合成境界付近、あるいは被写体領域内部だけの合成境界付近で、透明度を徐々に変化させて合成させるという方法もある。例えば、被写体領域の外部にいくに従って、背景部分の画像の割合を強くし、被写体領域の内部にいくに従って、被写体領域部分の画像の割合を強くする、といった具合である。
【０４２０】
これにより、もし合成境界付近で補正誤差による多少の背景のずれがあったとしても、不自然さを目立たなくすることができるという効果が出てくる。補正誤差でなく、そもそも被写体領域の抽出が間違っている場合や、撮影時間のずれなどに起因する背景部分の画像の変化（例えば、風で木が動いた、日が陰った、関係無い人が通った、など）があったとしても、同様に、不自然さを目立たなくすることができるという効果が出てくる。
【０４２１】
また、本発明の目的は、前述した実施形態の機能を実現するソフトウェアのプログラムコードを記録した記憶媒体を、システムあるいは装置に供給し、そのシステムあるいは装置のコンピュータ（またはＣＰＵやＭＰＵ）が記憶媒体に格納されたプログラムコードを読み出し実行することによっても、達成されることは言うまでもない。
【０４２２】
この場合、記憶媒体から読み出されたプログラムコード自体が前述した実施形態の機能を実現することになり、そのプログラムコードを記憶した記憶媒体は本発明を構成することになる。
【０４２３】
プログラムコードを供給するための記憶媒体としては、例えば、フロッピディスク，ハードディスク，光ディスク，光磁気ディスク，磁気テープ，不揮発性のメモリカード，等を用いることができる。
【０４２４】
また、上記プログラムコードは、通信ネットワークのような伝送媒体を介して、他のコンピュータシステムから画像合成装置の主記憶７４または外部記憶７５へダウンロードされるものであってもよい。
【０４２５】
また、コンピュータが読み出したプログラムコードを実行することにより、前述した実施形態の機能が実現されるだけでなく、そのプログラムコードの指示に基づき、コンピュータ上で稼働しているＯＳ（オペレーティングシステム）などが実際の処理の一部または全部を行い、その処理によって前述した実施形態の機能が実現される場合も含まれることは言うまでもない。
【０４２６】
さらに、記憶媒体から読み出されたプログラムコードが、コンピュータに挿入された機能拡張ボードやコンピュータに接続された機能拡張ユニットに備わるメモリに書込まれた後、そのプログラムコードの指示に基づき、その機能拡張ボードや機能拡張ユニットに備わるＣＰＵなどが実際の処理の一部または全部を行い、その処理によって前述した実施形態の機能が実現される場合も含まれることは言うまでもない。
【０４２７】
本発明を上記記憶媒体に適用する場合、その記憶媒体には、先に説明したフローチャートに対応するプログラムコードを格納することになる。
【０４２８】
本発明は上述した各実施形態に限らず、請求項に示した範囲で種々の変更が可能である。
【０４２９】
【発明の効果】
本発明に係る画像合成装置は、以上のように、背景と第１の被写体とを含む画像である第１被写体画像と、上記背景の少なくとも一部と第２の被写体とを含む画像である第２被写体画像との間での、背景部分の相対的な移動量、回転量、拡大縮小率、歪補正量のいずれかもしくは組み合わせからなる補正量を算出する、あるいは予め算出しておいた補正量を読み出す背景補正量算出手段と、第１被写体画像、第２被写体画像のどちらかを基準画像とし、他方の画像を被写体以外の背景の部分が少なくとも一部重なるように前記背景補正量算出手段から得られる補正量で補正し、基準画像と補正した画像を重ねた画像を生成する重ね画像生成手段と、を有することを特徴とする。
【０４３０】
これにより、二つの画像間の背景のずれや歪みを補正して合成することができるので、これによって、被写体など明らかに異なる領域を除いた以外の部分（すなわち背景部分）は、どのように重ねても合成結果がほぼ一致し、合成結果が不自然とならないという効果が出てくる。例えば被写体領域だけを主に合成しようとした時、被写体領域の抽出や指定が多少不正確であっても、被写体領域の周りの背景部分が合成先の画像の部分とずれがないので、不正確な領域の内外が連続した風景として合成され、見た目の不自然さを軽減するという効果が出てくる。
【０４３１】
また、これにより、たとえ被写体領域の抽出が画素単位で正確であったとしても、課題の項で説明した通り、１画素より細かいレベルでの不自然さは従来技術の方法では出てしまうが、本発明では、背景部分のずれや歪みを無くしてから合成しているので、輪郭の画素の周囲の画素は、同じ背景部分の位置の画素となり、合成してもほぼ自然なつながりとなる。このように、１画素より細かいレベルでの不自然さを防ぐ、あるいは軽減するという効果が出てくる。
【０４３２】
また、背景のずれや歪みを補正して合成するので、第１、第２被写体画像の撮影時にカメラなどを三脚などで固定する必要がなく、手などで大体の方向を合わせておけばよく、撮影が簡単になるという効果が出てくる。
【０４３３】
本発明に係る画像合成装置は、以上のように、被写体や風景を撮像する撮像手段を有し、第１被写体画像または第２被写体画像は、前記撮像手段の出力に基づいて生成されることを特徴とする。
【０４３４】
これによって、重ね画像を生成する画像合成装置が、撮像手段を具備することで、ユーザーが被写体や風景を撮影したその場で、重ね画像を生成することができるため、ユーザーにとっての利便性が向上する。また、重ね画像を生成した結果、もし被写体同士の重なりがあるなどの不都合があれば、その場で撮影し直すことができるという効果が出てくる。
【０４３５】
本発明に係る画像合成装置は、以上のように、第１被写体画像と第２被写体画像のうち、後に撮影した方を基準画像とすることを特徴とする。
【０４３６】
これにより、表示される合成画像は、直前に撮影したばかりの、あるいは合成画像をリアルタイム表示する形態では現在撮影中の第２被写体画像の背景の範囲となるので、撮影者にとっては違和感が無いという効果が出てくる。
【０４３７】
本発明に係る画像合成装置は、以上のように、前記重ね画像生成手段において、基準画像と補正した画像とを、それぞれ所定の透過率で重ねることを特徴とする。
【０４３８】
上記の構成において、所定の透過率で重ねる形態には、透過率を画素位置によって変える形態がふくまれる。例えば、補正画像中の被写体領域だけを基準画像に重ねる時、被写体領域内は不透明（すなわち補正画像中の被写体の画像そのまま）で重ね、被写体領域周辺は被写体領域から離れるに従い基準画像の割合が強くなるように重ねる。すると、被写体領域、すなわち被写体の輪郭が間違っていたとしても、その周辺の画素は、補正画像から基準画像に徐々に変わっているので、間違いが目立たなくなるという効果が出てくる。
【０４３９】
また、所定の透過率で重ねる形態には、被写体領域だけを半分の透過度で重ねる、などの形態も含まれる。この結果、表示されている画像のどの部分が以前に撮影した合成対象部分で、どの部分が今撮影している画像なのかをユーザーや被写体が判別しやすくなるという効果も出てくる。それにより、被写体同士の重なりなどがある場合も、判別しやすくなるという効果も出てくる。
【０４４０】
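As described above, the image composition apparatus according to the present invention is characterized in that the overlay image generation means generates a region of difference in the difference image between the reference image and the corrected image as an image whose pixel values differ from the original pixel values.
This makes the portions that do not match between the two images easier for the user to recognize. For example, in the regions of the first and second subjects, one of the reference image and the corrected image shows the subject while the other shows the background, so these regions are extracted as regions of difference in the difference image. Rendering the extracted regions semi-transparently, in inverted colors, or with conspicuous pixel values makes the subject regions easier for the user to recognize.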
【０４４１】
本発明に係る画像合成装置は、以上のように、基準画像と補正した画像の間の差分画像中から、第１の被写体の領域と第２の被写体の領域を抽出する被写体領域抽出手段を有し、前記重ね画像生成手段において、基準画像と補正した画像とを重ねる代わりに、基準画像または補正した画像と前記被写体領域抽出手段から得られる領域内の画像とを重ねることを特徴とする。
【０４４２】
これによって、基準画像上に、補正された被写体画像中の被写体領域のみを合成することできるという効果が出てくる。あるいは、補正された被写体画像上に、基準画像中の被写体領域のみを合成することができるということもできる。
【０４４３】
また、重ね画像生成手段における被写体領域の透過率を変える処理と組み合わせることで、どの領域を合成しようとしているかがユーザーに分かり易く、もし被写体同士に重なりなどがあれば、それもさらに分かり易くなるという効果が出てくる。さらに、それによって、重なりが起きないように撮影を補助することができるという効果が出てくる。重なりがある場合は、被写体やカメラを動かすなどして、重なりの無い状態で撮影し直すのが良い訳だが、この場合の補助とは、例えば、重なりが起きるかどうかをユーザーに認識し易くすることや、どのくらい被写体やカメラを動かせば重なりが解消できそうかを、ユーザーが判断する材料（ここでは合成画像）を与えること、などになる。
【０４４４】
本発明に係る画像合成装置は、以上のように、被写体領域抽出手段は、第１被写体画像中あるいは補正された第１被写体画像中から第１の被写体の領域内の画像および第２の被写体の領域内の画像を抽出すると共に、第２被写体画像中あるいは補正された第２被写体画像中から第１の被写体の領域内の画像および第２の被写体の領域内の画像を抽出し、さらに皮膚色を基準として第１の被写体の画像および第２の被写体の画像を選別することを特徴とする。
【０４４５】
これによって、抽出した画像部分がどちらの被写体であるかを自動的に簡単に判別できる効果が出てくる。
【０４４６】
本発明に係る画像合成装置は、以上のように、前記被写体領域抽出手段は、第１被写体画像中あるいは補正された第１被写体画像中から第１の被写体の領域内の画像および第２の被写体の領域内の画像を抽出すると共に、第２被写体画像中あるいは補正された第２被写体画像中から第１の被写体の領域内の画像および第２の被写体の領域内の画像を抽出し、さらにその各領域外の画像の特徴を基準として第１の被写体の画像および第２の被写体の画像を選別することを特徴とする。
【０４４７】
これによって、抽出した画像部分がどちらの被写体であるかを自動的に簡単に判別できる効果が出てくる。
【０４４８】
本発明に係る画像合成装置は、以上のように、前記被写体領域抽出手段から得られる第１の被写体あるいは第２の被写体の領域の数が、合成する被写体の数として設定された値と一致しない時に、第１の被写体の領域と第２の被写体の領域が重なっていると判断する重なり検出手段を有することを特徴とする。
【０４４９】
これによって、重なり検出手段の判断結果は、重なりの有無を合成画面やランプなどで撮影者や被写体に通知、警告するのに利用することができる。その結果、被写体同士が重なり合っている部分があるかどうかをユーザーに判別させやすくすることができるという効果が出てくる。それによって、重なりが起きないように撮影を補助する効果については、前述したものと同様である。
【０４５０】
本発明に係る画像合成装置は、以上のように、前記重なり検出手段において重なりが検出される時、重なりが存在することを、ユーザーあるいは被写体あるいは両方に警告する重なり警告手段を有することを特徴とする。
【０４５１】
これによって、被写体同士が重なり合っている場合に警告されるので、ユーザーがそれに気づかずに撮影／記録したり合成処理したりということを防ぐことができ、さらに被写体にも位置調整等が必要であることを即時に知らせることができるという撮影補助の効果が出てくる。
【０４５２】
本発明に係る画像合成装置は、以上のように、前記重なり検出手段において、重なりが検出されない時、重なりが存在しないことを、ユーザーあるいは被写体あるいは両方に通知するシャッターチャンス通知手段を有することを特徴とする。
【０４５３】
これによって、被写体同士が重なり合っていない時をユーザーが知ることができるので、撮影や撮影画像記録、合成のタイミングをそれに合わせて行えば、被写体同士が重ならずに合成することができるという撮影補助の効果が出てくる。
【０４５４】
また、被写体にも、シャッターチャンスであることを通知できるので、ポーズや視線などの備えを即座に行えるという撮影補助の効果も得られる。
【０４５５】
本発明に係る画像合成装置は、被写体や風景を撮像する撮像手段を有し、前記重なり検出手段で重なりが検出されない時に、前記撮像手段から得られる画像を第１被写体画像、または第２被写体画像として記録する指示を生成する自動シャッター手段を有することを特徴とする。
【０４５６】
これによって、被写体同士が重なり合っていない時に自動的に撮影が行われるので、ユーザー自身が重なりがあるかどうかを判別してシャッターを押さなくても良いという撮影補助の効果が出てくる。
【０４５７】
本発明に係る画像合成装置は、上記の課題を解決するために、被写体や風景を撮像する撮像手段を有し、前記重なり検出手段で重なりが検出される時に、前記撮像手段から得られる画像を、第１被写体画像、あるいは第２被写体画像として記録することを禁止する指示を生成する自動シャッター手段を有することを特徴とする。
【０４５８】
これによって、被写体同士が重なり合ってる時は撮影が行われないので、ユーザーが誤って重なりがある状態で撮影／記録してしまうことを防ぐ撮影補助の効果が出てくる。
【０４５９】
本発明に係る画像合成方法は、以上のように、背景と第１の被写体とを含む画像である第１被写体画像と、上記背景の少なくとも一部と第２の被写体とを含む画像である第２被写体画像との間での、背景部分の相対的な移動量、回転量、拡大縮小率、歪補正量のいずれかもしくは組み合わせからなる補正量を算出する、あるいは予め算出しておいた補正量を読み出す背景補正量算出ステップと、第１被写体画像、第２被写体画像のどちらかを基準画像とし、他方の画像を被写体以外の背景の部分が少なくとも一部重なるように前記背景補正量算出ステップから得られる補正量で補正し、基準画像と補正した画像を重ねた画像を生成する重ね画像生成ステップと、を有することを特徴とする。
【０４６０】
これによる種々の効果は、前述したとおりである。
【０４６１】
本発明に係る画像合成プログラムは、以上のように、上記画像合成装置が備える各手段として、コンピュータを機能させることを特徴とする。
【０４６２】
本発明に係る画像合成プログラムは、以上のように、上記画像合成方法が備える各ステップをコンピュータに実行させることを特徴とする。
【０４６３】
本発明に係る記録媒体は、上記画像合成プログラムを記録したことを特徴とする。
【０４６４】
これにより、上記記録媒体、またはネットワークを介して、一般的なコンピュータに画像合成プログラムをインストールすることによって、該コンピュータを用いて上記の画像合成方法を実現する、言い換えれば、該コンピュータを画像合成装置として機能させることができる。
【図面の簡単な説明】
【図１】本発明の画像合成装置の機能的な構成を示すブロック図である。
【図２】各手段を具体的に実現する装置の構成例を説明するブロック図である。
【図３】（ａ）は、上記画像合成装置の背面の外観例を示す模式的な斜視図、（ｂ）は、上記画像合成装置の前面の外観例を示す模式的な斜視図である。
【図４】画像データのデータ構造例を説明する説明図である。
【図５】画像合成方法全体の流れを示すフローチャート図である。
【図６】（ａ）は、第１被写体画像の例を示す説明図、（ｂ）は、（ａ）の第１被写体画像中の参照マッチングブロックの配置を説明する説明図である。
【図７】（ａ）は、第２被写体画像の例を示す説明図、（ｂ）は、（ａ）の第２被写体画像中の検出されたマッチングブロックの配置を説明する説明図、（ｃ）は、（ａ）の第２被写体画像を補正した補正第２被写体画像を説明する説明図、（ｄ）は、（ｃ）の補正第２被写体画像のマスク画像を説明する説明図である。
【図８】（ａ）は、図６（ａ）の第１被写体画像と図７（ｃ）の補正第２被写体画像の差分画像例を示す説明図、（ｂ）は、（ａ）の差分画像から生成したラベル画像例を示す説明図、（ｃ）は、（ｂ）のラベル画像からノイズ部分を除去したラベル画像例を示す説明図である。
【図９】（ａ）は、図６（ａ）の第１被写体画像に図１９（ｄ）の第２被写体領域部分を重ねて合成した重ね画像例を示す説明図、（ｂ）は、図６（ａ）の第１被写体画像に、図１９（ｂ）の第１被写体領域部分を半透明にして重ね、図１９（ｄ）の第２被写体領域部分を重ねて合成した重ね画像例を示す説明図、（ｃ）は、図６（ａ）の第１被写体画像に、図１９（ｄ）の第２被写体領域部分を半透明にして重ねて合成した重ね画像例を示す説明図である。
【図１０】図６（ａ）の第１被写体と被写体領域同士が重なる第２被写体画像の例を示す説明図である。
【図１１】（ａ）は、図６（ａ）の第１被写体画像と図１０の第２被写体画像の補正画像との差分画像例を示す説明図、（ｂ）は、（ａ）の差分画像から生成したラベル画像例を示す説明図、（ｃ）は、（ｂ）のラベル画像からノイズ部分を除去したラベル画像例を示す説明図である。
【図１２】図１１（ｃ）の被写体領域部分を半分の透過率で重ねて合成し、重なりの警告メッセージを表示させた例を示す説明図である。
【図１３】第２被写体画像を取得する処理の一方法を説明するフローチャート図である。
【図１４】背景補正量を算出する処理の一方法を説明するフローチャート図である。
【図１５】（ａ）は、マッチングを説明する参照画像の例を示す説明図、（ｂ）は、マッチングを説明する探索画像の例を示す説明図である。
【図１６】第２被写体画像の補正画像を生成し、第１被写体画像との差分画像を生成する処理の一方法を説明するフローチャート図である。
【図１７】（ａ）は、回転している第２被写体画像の例を示す説明図、（ｂ）は、（ａ）の第２被写体画像中の検出されたマッチングブロックの配置を説明する説明図、（ｃ）は、（ａ）の第２被写体画像を補正した補正第２被写体画像を説明する説明図、（ｄ）は、（ｃ）の補正第２被写体画像画像のマスク画像を説明する説明図である。
【図１８】被写体領域を抽出する処理の一方法を説明するフローチャート図である。
【図１９】（ａ）は、図６（ａ）の第１被写体画像中の第１被写体領域の画像を示す説明図、（ｂ）は、図７（ａ）の第２被写体画像中の第１被写体領域の画像を示す説明図、（ｃ）は、図６（ａ）の第１被写体画像中の第２被写体領域の画像を示す説明図、（ｄ）は、図６（ａ）の第２被写体画像中の第２被写体領域の画像を示す説明図である。
【図２０】被写体領域の重なりを警告する処理の一方法を説明するフローチャート図である。
【図２１】被写体領域に重なりが無い時に、シャッターチャンスを通知する処理の一方法を説明するフローチャート図である。
【図２２】被写体領域に重なりが無い時に、自動シャッターを行う処理の一方法を説明するフローチャート図である。
【図２３】重なり画像を生成する処理の一方法を説明するフローチャート図である。
【符号の説明】
１　撮像手段
２　第１被写体画像取得手段
３　第２被写体画像取得手段
４　背景補正量算出手段
５　補正画像生成手段
６　差分画像生成手段
７　被写体領域抽出手段
８　重なり検出手段
９　重ね画像生成手段
１０　重ね画像表示手段
１１　重なり警告手段
１２　シャッターチャンス通知手段
１３　自動シャッター手段
７４　主記憶（記録媒体）
７５　外部記憶（記録媒体）
１１２　領域（第１被写体領域）
１１３　領域（第２被写体領域）
１４０　本体（画像合成装置）
１４１　表示部兼タブレット
１４３　シャッターボタン
２０２　領域
[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to an apparatus, a method, a program, and a program medium that combine a plurality of separately photographed subjects into a single image as if they were present at the same time, and that assist photographing and combining so that the subjects do not overlap one another.
[0002]
[Prior art]
For example, when two people want a picture of themselves side by side taken with a film camera or a digital camera, they must either use a tripod and shoot with the self-timer or ask a passer-by to take the picture.
[0003]
However, carrying a tripod is burdensome, and people hesitate to ask a complete stranger.
[0004]
In contrast, Japanese Patent Application Laid-Open No. 2000-316125 (published on November 14, 2000) discloses an image synthesizing apparatus that extracts the subject areas from a plurality of images photographed at the same place and, by combining or not combining the subject images with the background, can synthesize an image containing only the background or an image in which subjects from different images appear to be present at the same time.
[0005]
Japanese Patent Application Laid-Open No. 2001-333327 (published on November 30, 2001) discloses a digital camera and an image processing method that can display a designated area (subject area) of a previously captured reference image superimposed on the image being captured, on a monitor screen or in the viewfinder, and that can create image data of a composite image in which the subject in the subject area is combined with the image being captured.
[0006]
[Problems to be solved by the invention]
However, these conventional techniques have two major problems.
[0007]
The first problem is that, when the subject area in the reference image is simply cut out and superimposed on another image, (1) an inaccurate designation of the subject area may leave part of the subject missing, (2) an inaccurate designation may cause extraneous matter to be combined, or (3) even if the designation is accurate, the boundary of the composite may look slightly unnatural.
[0008]
For example, in case (1), if the subject area specified in the reference image (hereinafter referred to as the designated subject area) cuts into the actual subject area, part of the subject is also missing in the composite image, which looks unnatural.
[0009]
In case (2), when the designated subject area in the reference image is larger than the actual subject area, it also includes the background around the subject in the reference image. The "extraneous matter" mentioned above is this included background. In the synthesis method described in Japanese Patent Application Laid-Open No. 2001-333327, the reference image and the captured image may be taken at different places, so the background included in the designated subject area (the background of the reference image) may differ from the surrounding background in the composite image (the background of the captured image). In this case the background changes abruptly at the designated subject area, and the composite image looks unnatural.
[0010]
Even if both images are photographed at the same place against the same background, the combining method described in Japanese Patent Application Laid-Open No. 2001-333327 places the designated subject area of the reference image at an arbitrary position on the captured image. The background included in the designated subject area (the background of the reference image) is therefore not necessarily the same as the background around the placement position on the captured image, and the result is likewise unnatural.
[0011]
When, as described in Japanese Patent Application Laid-Open No. 2001-333327, the user specifies the outline of the designated subject area in the reference image with a tablet or the like, the designation is rarely grossly wrong because a human judges the outline while tracing it, but errors of one, two, or a few pixels can occur. Specifying the outline manually with single-pixel accuracy requires a great deal of labor.
[0012]
Case (3), in which the composite boundary is slightly unnatural even when the designation is accurate, covers the situation where the designated subject area of (1) and (2) is accurate to the pixel and yet, after composition, the outline pixels fail to blend with the background of the captured image.
[0013]
This is because the outline of the designated subject area is not accurate enough when specified in pixel units; in reality it can only be expressed in units finer than one pixel. That is, an outline pixel originally consists of (0.X) pixel of subject and (1.0 - 0.X) pixel of background, and its pixel value is the sum of the subject and background pixel values weighted by those ratios, that is, an averaged value.
[0014]
The ratio between the subject portion and the background portion cannot be back-calculated from this averaged pixel value. As a result, the outline pixels of the composite image contain the background value of the reference image and do not blend with the surrounding background of the captured image.
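The sub-pixel mixing described above can be illustrated numerically. The following is a minimal sketch only: the function name `mix` and the gray levels are hypothetical values chosen for illustration and do not come from the publication.

```python
def mix(subject, background, coverage):
    """Value of an outline pixel that the subject covers by `coverage` (0..1)."""
    return coverage * subject + (1.0 - coverage) * background

# Hypothetical gray levels: subject 200, reference background 40,
# background of the captured image 120, subject covering half the pixel.
outline_in_reference = mix(200, 40, 0.5)   # value recorded in the reference image
ideal_in_composite = mix(200, 120, 0.5)    # value the outline should have after compositing
leak = ideal_in_composite - outline_in_reference

# A different subject/background pair gives the same averaged value,
# which is why the ratio cannot be recovered from the pixel value alone.
ambiguous = mix(160, 80, 0.5)
```

With these assumed values, copying the outline pixel unchanged carries the darker reference background into the composite, which is the mismatch described in the text.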
[0015]
The above problems (1) to (3) cannot be solved by the synthesizing method disclosed in JP-A-2000-316125. This publication discloses that registration is performed before superposing a plurality of images taken at the same place or at places close to each other.
[0016]
However, when, for example, two persons take turns photographing each other against the same background, not only does the background shift because the camera points in a different direction, but the image also rotates because the camera is tilted, is enlarged or reduced because the distance to the subject differs, and is distorted because the elevation angle of the camera changes with the height of the photographer.
[0017]
Therefore, simply performing the alignment of the images to be superimposed does not solve the above problems (1) to (3), and the synthesis result becomes unnatural.
[0018]
The second problem is that, when shooting in order to combine a subject area of the reference image with a captured image that contains another subject, the subject areas of the two images may overlap on the composite image, or a subject to be combined may protrude from the composite image, unless care is taken with the position of the subjects at the time of shooting.
[0019]
With regard to this problem, Japanese Patent Application Laid-Open No. 2000-316125 mainly describes a combining method that uses already captured images, and does not mention a shooting method that prevents subjects from overlapping or from protruding from the composite image.
[0020]
Further, the image processing method disclosed in Japanese Patent Application Laid-Open No. 2001-333327 can display the subject area of the reference image (whose contour the user specifies with a tablet or the like) superimposed on the image being captured. The user can therefore tell at the time of shooting whether the subject area of the reference image and the subject area of the image being captured would overlap in the composite, and whether a subject area would protrude from the composite image. If the subjects overlap or protrude, the user can move the subject or the camera to change the position of the subject in the image being captured, and can thus shoot and record an image free of overlap and protrusion.
[0021]
However, this is inconvenient because the user must personally carry out advanced processing such as recognizing the subject areas, determining whether the subject areas overlap each other, and determining whether a subject area protrudes from the composite image. It is also inconvenient that the subject area in the reference image must be designated manually.
[0022]
A first object of the present invention is to provide an image synthesizing apparatus (image synthesizing method) that performs synthesis so that the result does not look unnatural. A second object is to provide an image synthesizing apparatus (image synthesizing method) that, when separately photographed subjects are combined as if they were present at the same time, assists shooting so that the subjects do not overlap each other on the composite image.
[0023]
[Means for Solving the Problems]
In order to solve the above-described problems, an image synthesizing apparatus according to the present invention includes: background correction amount calculating means that calculates, or reads out a previously calculated, correction amount consisting of any one or a combination of a relative movement amount, a rotation amount, an enlargement/reduction ratio, and a distortion correction amount of the background portion between a first subject image, which is an image including a background and a first subject, and a second subject image, which is an image including at least a part of the background and a second subject; and superimposed image generating means that takes one of the first subject image and the second subject image as a reference image, corrects the other image with the correction amount obtained from the background correction amount calculating means so that the background portions other than the subjects at least partially overlap, and generates an image in which the reference image and the corrected image are superimposed.
[0024]
In the above configuration, the "first subject" and the "second subject" are the objects to be synthesized; they are generally persons but may also be things. Strictly speaking, any region whose pixel values do not match between the first subject image and the second subject image when the background portions are at least partially overlapped, that is, any region where there is a change, may become a "subject area".
[0025]
However, in the background part, even small changes such as the leaves swaying due to the wind become areas where there is a change, so ignoring the small changes and small areas to some extent can extract the "subject area" more accurately. A more natural superimposed image can be obtained.
[0026]
For example, when the subject is a person, the subject is not necessarily one person, and a plurality of persons may be collectively referred to as a “first subject” or a “second subject”. In other words, even if there are a plurality of persons, what is treated collectively as a unit of the synthesis processing is one "subject". The same applies to objects other than people.
[0027]
Further, a subject is not necessarily limited to one area and may consist of a plurality of areas. "First" and "second" merely distinguish different frame images; they do not represent the order of photographing and imply no essential difference. Further, if, for example, a person is wearing or carrying clothes or objects that do not appear in the "image of only the background not including the first and second subjects", these are also included in the subject.
[0028]
The “first subject image” and the “second subject image” are separate images that include the “first subject” and the “second subject”, respectively, and are generally obtained by photographing the subjects with a camera or the like. However, images that show only the subject and no background common to both are not suitable for combination; at least part of a common background must appear in both. Usually, the first subject image and the second subject image are photographed against the same background, that is, without moving the camera very much.
[0029]
It should be noted that the camera that shoots the subject need not be a still camera that records images as still images, but may be a video camera that records images as moving images. When a superimposed image as a still image is generated by a video camera, an image of one frame constituting a captured moving image is extracted as a subject image and used for composition.
[0030]
The “background portion” is a portion obtained by removing the “first subject” and the “second subject” from the first subject image and the second subject image.
[0031]
The “movement amount” is the amount by which the other image is translated to a position where at least a part of its background overlaps that of the reference image; it may also be described as the movement amount of the corresponding point at the center of rotation or of enlargement/reduction.
[0032]
The “distortion correction amount” is a correction amount for those changes in the captured image, caused by a change in the position or direction of the camera or lens, that cannot be corrected by translation, rotation, or enlargement/reduction. For example, when a tall building is photographed, perspective makes the upper part appear smaller than a lower part of the same actual size; correcting this effect, known as “tilt”, falls under distortion correction.
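How a correction amount made up of a movement amount, a rotation amount, and an enlargement/reduction ratio might act on one pixel coordinate can be sketched as follows. This is an illustrative assumption, not the publication's procedure: the function name `correct_point`, its parameters, and the order of operations (rotate and scale about a center, then translate) are hypothetical, and the distortion correction amount is deliberately left out.

```python
import math

def correct_point(x, y, dx, dy, angle_deg, scale, cx=0.0, cy=0.0):
    """Rotate and scale (x, y) about (cx, cy), then translate by (dx, dy)."""
    t = math.radians(angle_deg)
    rx, ry = x - cx, y - cy                      # coordinates relative to the center
    nx = scale * (rx * math.cos(t) - ry * math.sin(t)) + cx + dx
    ny = scale * (rx * math.sin(t) + ry * math.cos(t)) + cy + dy
    return nx, ny

# Pure translation: no rotation, unit scale.
shifted = correct_point(3, 4, dx=5, dy=-3, angle_deg=0, scale=1)

# 90-degree rotation combined with 2x enlargement about the origin.
rotated = correct_point(1, 0, dx=0, dy=0, angle_deg=90, scale=2)
```

Applying the same mapping to every pixel position of the other image would yield a corrected image whose background lines up with the reference image, within the limits of this simplified model.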
[0033]
The “superimposed image generating means” generates a superimposed image, but it does not necessarily have to generate a single set of image data; it is enough that the result appears composited together with the image data of other means. For example, when an image is displayed on the display means and another image is partially written over it, the display looks as if one set of composite image data had been generated from the two sets of image data and displayed, whereas in fact only images based on the two sets of image data exist and no composite image data exists.
[0034]
For the calculation of the correction amount by the background correction amount calculating means, a method that finds partial correspondences between two images, such as block matching, can be adopted. When the correspondence between the first subject image and the second subject image is obtained with such a method, the positional correspondence of any matching background portion can be calculated. Since a subject portion does not exist in the other image, the correspondence obtained for that portion is wrong. From the mixture of correct correspondences for the background and wrong correspondences for the subjects, only the correct background correspondences are retained by using a statistical method or the like. From the remaining correct correspondences, a correction amount consisting of any one or a combination of the relative movement amount, the rotation amount, the enlargement/reduction ratio, and the distortion correction amount of the background portion can be calculated.
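The step of keeping only the correct background correspondences can be sketched as follows. The text says only "a statistical method or the like", so the median-based rejection used here is an assumed stand-in, not the publication's method. The function name `estimate_translation`, the `tolerance` parameter, and the sample displacements are hypothetical, and the sketch estimates a translation only.

```python
from statistics import median

def estimate_translation(displacements, tolerance=2.0):
    """Estimate a background shift from per-block (dx, dy) matches.

    Blocks lying on a subject match wrongly, so the median is used as a
    robust center and blocks farther than `tolerance` pixels are dropped.
    """
    mx = median(dx for dx, _ in displacements)
    my = median(dy for _, dy in displacements)
    inliers = [(dx, dy) for dx, dy in displacements
               if abs(dx - mx) <= tolerance and abs(dy - my) <= tolerance]
    if not inliers:
        raise ValueError("no consistent background correspondences found")
    n = len(inliers)
    shift = (sum(dx for dx, _ in inliers) / n,
             sum(dy for _, dy in inliers) / n)
    return shift, inliers

# Four background blocks agree; the last block lies on a subject and is wrong.
matches = [(5, -3), (5, -3), (6, -3), (5, -2), (40, 12)]
shift, kept = estimate_translation(matches)
```

The outlying block is discarded and the shift is averaged over the remaining four blocks. Rotation, scaling, and distortion would need a richer model fitted to the same inliers.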
[0035]
The superimposed image generation means creates an image in which the other image is corrected so that the background part matches the reference image based on the correction amount calculated by the background correction amount calculation means. Then, the superimposed image generating means generates an image in which the corrected image is superimposed on the reference image.
[0036]
As a method of superimposing the images, the image data of the pixels at corresponding positions in the two images may be mixed at an arbitrary ratio, with the two weights each in the range of 0 to 1 and summing to 1. For example, if the ratio of the first subject image is 1 and the ratio of the second subject image is 0, only the image data of the first subject image is written to the pixel. If the mixing ratio of the two images is 1:1, image data obtained by combining the image data of the two images in equal parts is written to the pixel.
[0037]
Note that how to set the mixing ratio is not essential for the present invention, but depends on what kind of superimposed image the user wants to display or output.
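The proportional mixing described above can be sketched for a single pixel as follows. The function name `blend` and the RGB sample values are hypothetical; only the weighting rule (weights in the range 0 to 1 that sum to 1) is taken from the text.

```python
def blend(pixel_a, pixel_b, ratio_a):
    """Mix two pixels channel by channel; pixel_b gets weight 1 - ratio_a."""
    if not 0.0 <= ratio_a <= 1.0:
        raise ValueError("ratio_a must lie between 0 and 1")
    return tuple(ratio_a * a + (1.0 - ratio_a) * b
                 for a, b in zip(pixel_a, pixel_b))

first = (100, 50, 0)      # pixel of the first subject image (assumed values)
second = (200, 150, 100)  # pixel at the same position in the second subject image

only_first = blend(first, second, 1.0)   # ratio 1:0, first image only
even_mix = blend(first, second, 0.5)     # ratio 1:1, equal parts
```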
[0038]
Through the above-described processing, the first subject and the second subject can be combined on one image with the background portions thereof being matched.
[0039]
Since the images can be combined after the displacement and distortion of the background between them have been corrected, the portions other than clearly differing regions such as the subjects (that is, the background portions) come out almost the same however the images are superimposed, and the composite does not look unnatural. For example, when mainly only the subject area is combined, even if the extraction or designation of the subject area is somewhat inaccurate, the background around the subject area is neither displaced nor distorted relative to the image it is combined with. The inside and outside of the inaccurate area are therefore combined as a continuous scene, which makes the result look less unnatural.
[0040]
Even if the extraction of the subject area is accurate to the pixel, unnaturalness at a level finer than one pixel appears with the prior art methods, as described in the section on the problems to be solved. In the present invention, however, composition is performed after the displacement and distortion of the background portion have been eliminated, so the pixels around the contour pixels are at the same background position in both images. This prevents or reduces unnaturalness at a level finer than one pixel.
[0041]
In addition, since the displacement and distortion of the background are corrected before combining, there is no need to fix the camera with a tripod or the like when capturing the first and second subject images, which makes shooting easier.
[0042]
Note that the operation of the background correction amount calculating means, stated above as "calculating a correction amount consisting of any one or a combination of the relative movement amount, the rotation amount, the enlargement/reduction ratio, and the distortion correction amount of the background portion", may instead be "calculating the correction amount by combining any one or more of the rotation amount, the enlargement/reduction ratio, and the distortion correction amount with the relative movement amount of the background portion". This further improves the accuracy of the correction and yields a more natural synthesis result.
[0043]
Further, if the user can selectively switch between the above two types of operation of the background correction amount calculating means via the input means, the user can choose appropriately between emphasizing correction accuracy and emphasizing processing speed or a lighter processing load, which improves the operability of the image synthesizing apparatus.
[0044]
An image synthesizing apparatus according to the present invention is characterized by including imaging means that captures an image of a subject or a landscape, the first subject image or the second subject image being generated based on the output of the imaging means.
[0045]
According to the above configuration, since the image synthesizing apparatus that generates the superimposed image includes the imaging means, the superimposed image can be generated on the spot where the user has photographed the subject or the landscape, which improves convenience for the user. Further, if the generated superimposed image reveals a problem such as overlapping subjects, the photograph can be retaken on the spot.
[0046]
Note that the image obtained from the imaging means is usually recorded in a main memory or an external memory, whether built into the image synthesizing apparatus or not, and the user indicates the timing of recording with a shutter button or the like. The recorded image is then used as the first subject image or the second subject image in the combining process.
[0047]
In order to solve the above-described problems, an image synthesizing apparatus according to the present invention is characterized in that, of the first subject image and the second subject image, the one photographed later is used as a reference image.
[0048]
According to the above configuration, for example, if the first subject image and the second subject image are photographed in this order, the second subject image is used as the reference image. Then, the first subject image is corrected using the second subject image as a reference image. At this time, a correction amount such as a movement amount of a background portion is calculated between the second subject image (reference image) and the first subject image, and the first subject image is corrected using the correction amount. A composite image is synthesized using the second subject image (reference image) and the corrected first subject image. Then, display of a composite image and the like are performed.
[0049]
As a result, the background range of the displayed composite image is that of the second subject image, which has just been captured or, when the composite image is displayed in real time, is currently being captured, so the photographer does not feel any incongruity.
[0050]
If the first subject image were used as the reference image, the background range of the composite image would be that of the first subject image. Because the direction of the camera or the photographer may have changed between shots, the background range of the first subject image, photographed earlier, may differ from that of the second subject image, photographed later. In that case the background range being captured does not match the background range of the displayed composite image, which feels incongruous to the photographer and others.
[0051]
Furthermore, if the composite image is displayed repeatedly in real time from the capture of the second subject image onward, the background range of the composite image stays that of the first subject image even though the second subject image is continuously updated with newly captured images, and this incongruity is amplified further.
[0052]
In order to solve the above-mentioned problems, the image synthesizing apparatus according to the present invention is characterized in that the superimposed image generating means superimposes the reference image and the corrected image at a predetermined transmittance.
[0053]
In the above configuration, the “predetermined transmittance” may be a fixed value, a value that changes according to the region, a value that gradually changes near the boundary of the region, or the like.
The superimposed image generating means determines a pixel position of the superimposed image, obtains the pixel value at that position on the reference image and the pixel value at the corresponding corrected position on the other image, and takes the value obtained by weighting the two pixel values with the predetermined transmittance as the pixel value of the superimposed image. This processing is performed at all pixel positions of the superimposed image.
[0054]
If the transmittance is changed depending on the pixel position, the ratio of the reference image or the ratio of the corrected image can be increased depending on the location.
[0055]
Using this, for example, when only the subject area of the corrected image is overlaid on the reference image, the inside of the subject area is made opaque (that is, the subject in the corrected image is shown as it is), and around the subject area the images are overlaid so that the proportion of the reference image increases with distance from the subject area. Then, even if the contour of the subject area, that is, the contour of the extracted subject, is wrong, the surrounding pixels change gradually from the corrected image to the reference image, so the error is less conspicuous.
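A transmittance that depends on position can be sketched for one row of pixels as follows. The linear fall-off, the function names `feather_alpha` and `composite_row`, the `feather_width` parameter, and the sample values are all assumptions for illustration; the text requires only that the share of the reference image grow with distance from the subject area.

```python
def feather_alpha(distance, feather_width):
    """Opacity of the corrected image: 1 inside the subject area
    (distance <= 0), falling linearly to 0 at `feather_width` pixels."""
    if distance <= 0:
        return 1.0
    if distance >= feather_width:
        return 0.0
    return 1.0 - distance / feather_width

def composite_row(reference, corrected, distances, feather_width=4):
    """Blend one row; `distances` holds each pixel's distance from the subject area."""
    row = []
    for ref, cor, dist in zip(reference, corrected, distances):
        alpha = feather_alpha(dist, feather_width)
        row.append(alpha * cor + (1.0 - alpha) * ref)
    return row

# Reference is dark (0), corrected subject is bright (100); the three pixels
# lie inside the area, 2 pixels outside it, and 4 pixels outside it.
row = composite_row([0, 0, 0], [100, 100, 100], [0, 2, 4])
```

In this sketch the distance of each pixel from the subject area is given as input. In practice it would have to be derived from the extracted subject area, for example with a distance transform.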
[0056]
In addition, a composite display in which, for example, only the subject area is overlaid at half transparency makes it easier for the user and the subject to tell which part of the displayed image is the previously captured part to be composited and which part is the image currently being captured.
[0057]
In addition, humans usually have the ability to distinguish the background portion from the subject portion (its contour) in an image by using common sense (image understanding). This ability generally remains effective even when the subject area is displayed at half transparency.
[0058]
Therefore, when the subject areas are superimposed and displayed at half transparency, the user can rely on this ability to distinguish the area of each subject even where the areas of a plurality of subjects overlap, and can easily determine whether they overlap in position on the composite image.
[0059]
It is not impossible to determine whether there is an overlap by comparing the first subject image and the second subject image side by side. In that case, however, the user must distinguish the subject area in each image by the above ability and then work out mentally, taking into account how the background portions of the two images overlap, whether the distinguished subject areas overlap each other. Performing this series of operations accurately in the head alone is more difficult than distinguishing the subject areas in the composite image as described above.
[0060]
In other words, it can be said that, by causing the machine to perform the alignment such that the background portions overlap, it is possible to create a situation in which it is easy to determine whether or not the subject areas overlap each other by using the advanced image understanding ability of a human. In this way, by superimposing and displaying the subject areas with half the transmittance, even when the subjects overlap each other, it is possible to easily determine the position of the subject being photographed.
[0061]
Note that the configuration described in the present claim may be arbitrarily combined with the components described in the claim as needed.
[0062]
In order to solve the above-described problems, the image synthesizing apparatus according to the present invention is characterized in that the superimposed image generating means generates an area having a difference in the difference image between the reference image and the corrected image as an image whose pixel values differ from the original pixel values.
Here, the “difference image” is an image in which pixel values at the same position in two images are compared, and the difference value is created as a pixel value. Generally, the difference value often takes an absolute value.
[0063]
A "pixel value different from the original pixel value" is, for example, a pixel value that makes the area translucent by changing the transmittance, displays it inverted by reversing the brightness or hue of the pixel value, or displays it in a conspicuous color such as red, white, or black. It also includes cases where the pixel value is changed between the boundary portion and the inside of the region as described above, where the boundary portion is surrounded by a dotted line, and where the area is displayed blinking (the pixel value is changed over time).
[0064]
According to the above configuration, the pixel values at the same pixel position are obtained from the reference image and the corrected other image, and if they differ, the pixel value of the superimposed image at that position is set to a value different from that used in the other regions. By performing this processing at all pixel positions, the area of the difference portion can be generated as an image whose pixel values differ from the original pixel values.
[0065]
This has the effect of making it easier for the user to identify the parts that do not match between the two images. For example, the regions of the first and second subjects are extracted as regions having a difference in the difference image, because in each such region one of the reference image and the corrected image shows the subject and the other shows the background portion. By making the extracted areas translucent, inverted, or conspicuously colored, the effect is obtained that the user can easily recognize the subject areas.
[0066]
Note that the configuration described in the present claim may be arbitrarily combined with the configurations described in the preceding claims as needed.
[0067]
In order to solve the above-mentioned problem, an image combining apparatus according to the present invention includes subject area extracting means for extracting a first subject area and a second subject area from a difference image between a reference image and a corrected image, and is characterized in that the superimposed image generating means, instead of superimposing the reference image and the corrected image, superimposes the reference image or the corrected image and the image within the areas obtained from the subject area extracting means.
[0068]
Here, the “subject region” is a region delimited by the boundary that separates the subject from the background. For example, if a person in the first subject image is wearing clothes or holding objects that do not appear in the second subject image, these are also part of the subject and are included in the subject region. Note that the subject region is not necessarily a single connected region and may be divided into a plurality of regions.
[0069]
"Overlapping images in the area obtained from the subject area extracting means" does not mean that no image is generated outside that area, but that the remaining area is filled with the reference image or the like.
[0070]
Since the background portion is corrected so as to match, what appears as the difference is mainly the subject portion. Accordingly, the subject region included in the difference image can be extracted by the subject region extracting means. At this time, if processing such as removing noise or the like from the difference image (for example, excluding a pixel whose difference pixel value is equal to or less than a threshold value) is performed, the subject region can be more accurately extracted.
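The threshold-based noise removal mentioned above can be sketched as follows. This is a minimal illustration that assumes the difference image is already available as a nested list of absolute differences; the name `subject_mask` and the threshold value 10 are hypothetical.

```python
def subject_mask(diff, threshold):
    """Mark pixels whose difference value exceeds `threshold` as subject (True).

    Pixels at or below the threshold are treated as noise or background.
    """
    return [[d > threshold for d in row] for row in diff]


# Assumed sample difference image: small values are noise, large values are subject.
diff = [[0, 3, 120],
        [2, 95, 110]]
mask = subject_mask(diff, threshold=10)
print(mask)  # [[False, False, True], [False, True, True]]
```

A practical implementation would probably also discard very small regions, but that step is outside this sketch.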
[0071]
When the superimposed image is generated, the pixel value of each pixel position is determined, but the image of the subject is superimposed only when the pixel position is within the subject area obtained from the subject area extracting means.
[0072]
This produces an effect that only the subject area in the corrected subject image can be combined on the reference image. Alternatively, there is an effect that only the subject area in the reference image can be synthesized on the corrected subject image.
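Combining only the subject area onto the reference image can be sketched as follows. The sketch assumes that a Boolean subject mask has already been obtained and that the images are nested lists of the same size; `composite_subject` is a hypothetical name.

```python
def composite_subject(reference, corrected, mask):
    """Take the corrected image inside the subject mask, the reference elsewhere."""
    return [[c if m else r for r, c, m in zip(row_r, row_c, row_m)]
            for row_r, row_c, row_m in zip(reference, corrected, mask)]


# Assumed sample data: the mask marks where the corrected image shows the subject.
reference = [[1, 1, 1],
             [1, 1, 1]]
corrected = [[9, 9, 9],
             [9, 9, 9]]
mask = [[False, True, False],
        [False, True, True]]
print(composite_subject(reference, corrected, mask))  # [[1, 9, 1], [1, 9, 9]]
```

Swapping the first two arguments gives the alternative in which the subject area of the reference image is combined onto the corrected image.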
[0073]
In addition, combining this with the process of changing the transmittance of the subject region in the superimposed image generating means makes it easy for the user to see which region is being combined, and also makes it easier to see when the combination causes the subjects to overlap. This in turn has the effect of assisting the photographing so that no overlap occurs.
[0074]
If there is an overlap, it is better to move the subject or the camera and shoot again in a state where there is no overlap. Assistance in this case means, for example, making the user aware of whether or not an overlap will occur, or providing material (here, the composite image) with which the user can judge how far the subject or the camera must be moved to eliminate the overlap.
[0075]
Note that the configuration described in the present claim may be arbitrarily combined with the configurations described in the preceding claims as needed.
[0076]
In order to solve the above-mentioned problem, the image synthesizing apparatus according to the present invention is characterized in that the subject region extracting means extracts an image in the region of the first subject and an image in the region of the second subject from the first subject image or the corrected first subject image, extracts an image in the region of the first subject and an image in the region of the second subject from the second subject image or the corrected second subject image, and selects the image of the first subject and the image of the second subject based on skin color.
[0077]
In the above configuration, the subject area extracting unit can recognize that a subject area extracted from the difference image is either the first subject area or the second subject area, but it cannot tell which of the two each individual area is. In other words, it is not known whether the image of the subject indicated by the area exists in the first subject image or in the second subject image.
[0078]
Therefore, if it is known that the subject is a person, the colors of the pixels in each area are examined in the first subject image (reference image) and the corrected second subject image, or in the second subject image (reference image) and the corrected first subject image. In either case, the subject region extracting means extracts an image in the region of the first subject and an image in the region of the second subject from each of the reference image and the corrected image, so that four image portions are extracted.
[0079]
The four extracted image portions comprise the image portion of the first subject, a background portion in the shape of the second subject, a background portion in the shape of the first subject, and the image portion of the second subject. Therefore, by using skin color as a reference, the image portions of the first subject and the second subject, which contain skin color or a color close to it, can be selected.
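The selection by skin color can be sketched as follows. The text does not specify how skin color is judged, so the RGB thresholds in `is_skin` are purely an illustrative assumption, as are all function names; for one extracted region, the sketch compares the portion cut from the reference image with the portion cut from the corrected image and picks the one containing more skin-colored pixels.

```python
def is_skin(rgb):
    """Rough RGB skin test; the thresholds are an illustrative assumption."""
    r, g, b = rgb
    return r > 95 and g > 40 and b > 20 and r > g and r > b and (r - g) > 15


def skin_ratio(pixels):
    """Fraction of pixels in an image portion that pass the skin test."""
    return sum(is_skin(p) for p in pixels) / len(pixels) if pixels else 0.0


def pick_subject_source(portion_in_reference, portion_in_corrected):
    """Return which image holds the subject for one extracted region."""
    if skin_ratio(portion_in_reference) >= skin_ratio(portion_in_corrected):
        return "reference"
    return "corrected"


# Assumed sample pixels: a person (face plus dark clothing) and a gray wall.
person = [(210, 160, 130), (200, 150, 120), (40, 40, 60)]
wall = [(120, 125, 130), (118, 122, 128), (121, 126, 131)]
print(pick_subject_source(person, wall))  # reference
print(pick_subject_source(wall, person))  # corrected
```

Applying this to both extracted regions classifies all four image portions into two subject portions and two background portions.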
[0080]
As a result, there is an effect that the subject of the extracted image portion can be automatically and easily determined.
[0081]
In order to solve the above problem, the image combining apparatus according to the present invention is characterized in that the subject region extracting means extracts an image in the region of the first subject and an image in the region of the second subject from the first subject image or the corrected first subject image, extracts an image in the region of the first subject and an image in the region of the second subject from the second subject image or the corrected second subject image, and selects the image of the first subject and the image of the second subject based on the features of the image outside each region.
[0082]
In the above configuration, the point that the subject region extracting unit extracts four image portions is as described above. However, as a criterion for selecting each image portion of the first subject and the second subject, instead of using the skin color as described above, the feature of the image outside each region is used.
[0083]
Here, the “feature” is a property, an attribute, or the like of an image of a region of interest, and is preferably a property that can be represented by a numerical value as a feature amount. Examples of the feature amount include the pixel values of each color and their hue, saturation, and brightness, as well as statistics representing the pattern and structure of the image, such as the co-occurrence matrix, difference statistics, run-length matrix, power spectrum, and higher-order statistics.
[0084]
The feature amount of each of the regions, that is, the extracted image portion is obtained by using the reference image and the corrected image. In addition, the feature amount of the area around the area is also obtained using the reference image and the corrected image. The difference between the feature value in the region and the feature value in the surrounding region is compared between the first subject image and the second subject image, and the one with the larger difference is defined as the subject region image.
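The comparison described above can be sketched as follows, using mean brightness as the feature amount. The choice of feature, the function names, and the sample values are assumptions for illustration only; any of the feature amounts listed earlier could be substituted.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of brightness values."""
    return sum(values) / len(values)


def contrast(inside, surround):
    """Difference between the feature inside a region and in its surroundings."""
    return abs(mean(inside) - mean(surround))


def pick_by_surroundings(ref_inside, ref_surround, cor_inside, cor_surround):
    """The image whose region differs more from its surroundings holds the subject."""
    if contrast(ref_inside, ref_surround) >= contrast(cor_inside, cor_surround):
        return "reference"
    return "corrected"


# Assumed sample data: a dark subject on a bright wall in the reference image,
# and only the bright wall at the same place in the corrected image.
ref_inside, ref_surround = [40, 45, 50], [200, 205, 210]
cor_inside, cor_surround = [198, 204, 207], [200, 205, 210]
print(pick_by_surroundings(ref_inside, ref_surround, cor_inside, cor_surround))
```

The idea is that a background-only region blends into its surroundings, whereas a region containing the subject stands out from them.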
[0085]
As a result, there is an effect that the subject of the extracted image portion can be automatically and easily determined.
[0086]
In order to solve the above-described problem, the image synthesizing apparatus according to the present invention is provided with overlap detection means that judges that the region of the first subject and the region of the second subject overlap when the number of first or second subject areas obtained from the subject area extracting means does not match the value set as the number of subjects to be combined.
[0087]
In the above configuration, the “region of the first subject or the second subject” is a region of a subject extracted from a difference image or the like, and need not be distinguished as the region of the first subject or the region of the second subject.
[0088]
The “subject to be combined” is not a subject obtained in the course of the combining process, but is a subject that actually exists, and is a subject that the user intends to combine. However, as described above, what is collectively handled as a unit of the synthesis processing is one “subject”, so one subject may be a plurality of persons.
[0089]
Further, the number of subjects may be fixedly set in the image synthesizing apparatus. For convenience of use, however, it is preferable that the number be set in the image synthesizing apparatus based on an instruction from a user such as the photographer before the overlap detection means performs the overlap detection.
[0090]
The subject areas extracted by the subject area extracting means from the difference image are separate from each other if the subjects do not overlap; if the subjects overlap, the areas of the first subject and the second subject merge into one continuous area. Therefore, the overlap detection means compares the number of the extracted subject areas with the number of the subjects (set value), and determines that there is no overlap between the subjects if they match, and that there is overlap if they do not match.
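The count-based overlap test can be sketched as follows. The sketch assumes a subject mask given as a nested list of 0/1 values from which noise has already been removed, and it assumes 4-connectivity for what counts as a "continuous area"; the function names are hypothetical.

```python
def count_regions(mask):
    """Count 4-connected regions of non-zero pixels with an iterative flood fill."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    regions = 0
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            regions += 1                      # found a new, unvisited region
            seen[y][x] = True
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
    return regions


def subjects_overlap(mask, expected_subjects):
    """Judge overlap when the region count differs from the set number of subjects."""
    return count_regions(mask) != expected_subjects


separate = [[1, 0, 0, 1],
            [1, 0, 0, 1]]
merged = [[1, 1, 1, 1],
          [1, 0, 0, 1]]
print(subjects_overlap(separate, expected_subjects=2))  # False
print(subjects_overlap(merged, expected_subjects=2))    # True
```

Because the judgment relies only on the count, leftover noise regions would also cause a mismatch, which is one reason small regions are best removed beforehand.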
[0091]
The determination result can be used to notify or warn the photographer or the subject of the presence / absence of the overlap on a composite screen, a lamp, or the like.
[0092]
As a result, it is possible to make it easier for the user to determine whether or not there is a portion where the subjects overlap each other. As a result, the effect of assisting the photographing so that no overlap occurs is the same as that described above.
[0093]
In order to solve the above problem, the image synthesizing apparatus according to the present invention is characterized by having overlap warning means that, when the overlap detection means detects an overlap, warns the user or the subject or both that the overlap exists.
[0094]
Here, the "warning" includes a warning given by characters or images on a display means or the like, and any other method that the user or the subject can perceive, such as light from a lamp, sound from a speaker, or vibration from a vibrator.
[0095]
Thus, when the subjects overlap each other, a warning is issued by the operation of the overlap warning means, so that the user can be prevented from photographing, recording, or performing the synthesis processing without noticing the overlap. Furthermore, the subject can be immediately notified that position adjustment or the like is necessary, which provides an effect of photographing assistance.
[0096]
In order to solve the above-described problem, the image synthesizing apparatus according to the present invention is characterized by having shutter chance notifying means that, when the overlap detecting means detects no overlap, notifies the user or the subject or both that there is no overlap.
[0097]
Here, the “notification” includes any method that can be detected by the user or the subject, similarly to the “warning”.
[0098]
This has the effect that the user can know when the subjects do not overlap each other.
[0099]
In addition, since the subject can also be notified of the photo opportunity, an effect of photographing assistance is obtained in that the subject can immediately prepare a pose, a line of sight, and the like.
[0100]
In order to solve the above-mentioned problem, an image synthesizing apparatus according to the present invention has imaging means for imaging a subject or a landscape, and is provided with automatic shutter means that, when the overlap detection means detects no overlap, generates an instruction to record an image obtained from the imaging means as the first subject image or the second subject image.
[0101]
In the above configuration, recording the captured image as the first subject image or the second subject image is realized, for example, by recording the captured image in a main storage or an external storage. Therefore, the automatic shutter means outputs a recording control instruction to the main storage or the external storage when a signal indicating that there is no overlap between the first subject area and the second subject area is input from the overlap detection means.
[0102]
Then, the background correction amount calculation unit and the superimposed image generation unit can obtain the first subject image and the second subject image by reading the image recorded in the main storage or the external storage.
[0103]
Note that even if the automatic shutter means automatically issues an instruction, an image is not necessarily recorded immediately. For example, recording may not be performed unless the shutter button is pressed at the same time or the automatic recording mode is set.
[0104]
As a result, the shooting is automatically performed when the subjects do not overlap each other, so that an effect of shooting assistance that the user does not need to determine whether there is overlap and does not need to press the shutter is obtained.
[0105]
In order to solve the above-mentioned problem, an image synthesizing apparatus according to the present invention has imaging means for imaging a subject or a landscape, and is provided with automatic shutter means that, when the overlap detection means detects an overlap, generates an instruction to prohibit recording an image obtained from the imaging means as the first subject image or the second subject image.
[0106]
According to the above configuration, when the automatic shutter unit receives a signal indicating that there is an overlap from the overlap detection unit, the automatic shutter unit outputs an instruction to prohibit recording an image obtained from the imaging unit in the main storage or the external storage. As a result, for example, even if the shutter button is pressed, an image obtained from the imaging unit is not recorded. Note that this prohibition processing may be performed only when the automatic prohibition mode is set.
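The recording and prohibition behaviors of the automatic shutter means can be sketched together as a single decision function. Combining both behaviors in one function, as well as the function name and the flag names, is an assumption made only for illustration; the text presents them as separate configurations with optional modes.

```python
def should_record(overlap_detected, shutter_pressed,
                  auto_record_mode=False, auto_prohibit_mode=True):
    """Decide whether the current frame is recorded.

    overlap_detected   -- signal from the overlap detection means
    shutter_pressed    -- whether the user is pressing the shutter button
    auto_record_mode   -- record automatically when no overlap is detected
    auto_prohibit_mode -- refuse to record while an overlap is detected
    """
    if overlap_detected:
        if auto_prohibit_mode:
            return False            # recording is prohibited even if the button is pressed
        return shutter_pressed      # prohibition disabled: the button decides
    # No overlap: record automatically, or when the user presses the button.
    return auto_record_mode or shutter_pressed


print(should_record(overlap_detected=False, shutter_pressed=False, auto_record_mode=True))  # True
print(should_record(overlap_detected=True, shutter_pressed=True))                           # False
```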
[0107]
As a result, no shooting is performed when the subjects overlap each other, so that a shooting assistance effect that prevents the user from shooting / recording in a state in which there is an overlap is provided.
[0108]
In order to solve the above-described problems, an image combining method according to the present invention includes: a background correction amount calculating step of calculating a correction amount consisting of any one or a combination of a relative movement amount, a rotation amount, an enlargement/reduction ratio, and a distortion correction amount of the background portion between a first subject image, which is an image including a background and a first subject, and a second subject image, which is an image including at least a part of the background and a second subject, or of reading out a correction amount calculated and recorded in advance; and a superimposed image generation step of taking one of the first subject image and the second subject image as a reference image, correcting the other image with the correction amount obtained from the background correction amount calculating step so that at least a part of the background other than the subject overlaps, and generating an image in which the reference image and the corrected image are superimposed.
[0109]
The various functions and effects resulting from this are as described above.
[0110]
In order to solve the above-described problems, an image composition program according to the present invention causes a computer to function as each unit included in the image composition device.
[0111]
An image synthesizing program according to the present invention causes a computer to execute each step of the image synthesizing method in order to solve the above-mentioned problem.
[0112]
A recording medium according to the present invention is characterized by recording the above-mentioned image synthesizing program in order to solve the above-mentioned problems.
[0113]
Thereby, the image synthesizing method can be realized using a computer by installing the image synthesizing program in a general computer via the recording medium or a network. In other words, the computer can be made to function as the image synthesizing apparatus.
[0114]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0115]
First, the definition of words is explained.
[0116]
The “first subject” and the “second subject” are the objects to be synthesized, and are generally persons, but may be things. Strictly speaking, when the first subject image and the second subject image are superimposed so that their background portions at least partially overlap, every region where the pixel values do not match, that is, every region where there is a change, can become a “subject” or “subject area”. However, in the background portion, even a small change such as a leaf swaying in the wind produces a region of change. Therefore, it is preferable to ignore small changes and small regions to some extent.
[0117]
For example, when the subject is a person, the subject is not necessarily one person, and a plurality of persons may be collectively referred to as a “first subject” or a “second subject”. In other words, even if there are a plurality of persons, what is treated collectively as a unit of the synthesis processing is one "subject".
[0118]
The same applies to objects other than people. Further, the subject is not always limited to one area, and may include a plurality of areas. The terms “first” and “second” are used merely to distinguish different frame images; they do not represent the order of photographing and imply no essential difference. Further, for example, if a person is wearing clothes or holding things that do not appear in the “image of only the background that does not include the first subject or the second subject”, those are also included in the subject.
[0119]
The “first subject image” and the “second subject image” are separate images containing the above “first subject” and “second subject”, respectively; in general, they are images in which the subjects are photographed separately by a camera or the like. However, if only the subject appears in an image and no background portion common to both images is captured at all, alignment based on the common background portion cannot be performed, so such images are not suitable for composition. Therefore, a common background portion must be captured at least partially, and more preferably around the subject to be combined (so that the surroundings of the combined subject look natural). Usually, the first subject image and the second subject image are photographed against the same background, that is, without moving the camera very much.
[0120]
The “background portion” is a portion obtained by removing the “first subject” and the “second subject” from the first subject image and the second subject image, respectively.
[0121]
The “movement amount” is the amount of translation, but may be the movement amount of the corresponding point at the center of rotation or enlargement / reduction.
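How a movement amount relates to the center of rotation or enlargement/reduction can be sketched by applying a correction to one pixel coordinate. This is a minimal illustration assuming a correction made up only of translation, rotation, and uniform scaling (distortion correction is omitted); the function and parameter names are hypothetical.

```python
import math


def apply_correction(point, shift=(0.0, 0.0), angle_deg=0.0, scale=1.0,
                     center=(0.0, 0.0)):
    """Rotate and scale `point` about `center`, then translate it by `shift`."""
    x, y = point[0] - center[0], point[1] - center[1]
    a = math.radians(angle_deg)
    rx = scale * (x * math.cos(a) - y * math.sin(a))
    ry = scale * (x * math.sin(a) + y * math.cos(a))
    return (rx + center[0] + shift[0], ry + center[1] + shift[1])


# A point rotated by 90 degrees about the origin, doubled in scale, then shifted.
moved = apply_correction((10, 0), shift=(5, -2), angle_deg=90, scale=2)
print(tuple(round(v, 6) for v in moved))

# The center itself moves exactly by `shift`, whatever the rotation and scale.
print(apply_correction((3, 4), shift=(1, 1), angle_deg=30, scale=2, center=(3, 4)))
```

In this formulation the `shift` parameter is precisely the movement amount of the point at the center of rotation and scaling.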
[0122]
The “distortion correction amount” is a correction amount for correcting a remaining change that cannot be corrected by parallel movement, rotation, or enlargement / reduction among changes in a captured image due to a change in the position or direction of a camera or a lens. For example, this includes a case where when photographing a tall building, an effect called “tilt” or the like, in which the upper part is smaller even if it is the same size due to the perspective effect, is corrected.
[0123]
The “superimposed image generation means” generates the superimposed image, but it need not generate the superimposed image as a single image; it may instead perform processing, in cooperation with other means, that makes the images appear to have been combined. For example, when one image is displayed on the display means and another image is partially displayed so as to overwrite it, it looks as if a composite image had been generated from the two images and displayed, although in fact only the two images exist and no composite image exists.
[0124]
“Pixel value” is the value of a pixel, and is generally represented using a predetermined number of bits. For example, monochrome binary is represented by 1 bit, monochrome with 256 gradations by 8 bits, and color with 256 gradations for each of red, green, and blue by 24 bits. In the case of color, the value is often expressed by decomposing it into the three primary colors of light: red, green, and blue.
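The 24-bit color representation can be sketched as follows. The channel order (red in the most significant byte) and the function names are assumptions for illustration; the text only states that three 256-gradation channels occupy 24 bits.

```python
def pack_rgb(r, g, b):
    """Pack three 8-bit channel values (0 to 255) into one 24-bit pixel value."""
    return (r << 16) | (g << 8) | b


def unpack_rgb(value):
    """Split a 24-bit pixel value back into its red, green, and blue channels."""
    return ((value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF)


orange = pack_rgb(255, 128, 0)
print(hex(orange))         # 0xff8000
print(unpack_rgb(orange))  # (255, 128, 0)
```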
[0125]
Note that similar terms include “density value” and “luminance value”. These are simply used selectively depending on the purpose: “density value” is mainly used when printing pixels, and “luminance value” mainly when displaying on a display. Since the purpose is not limited here, the term “pixel value” is used.
[0126]
The “transmittance” is a “predetermined ratio value” to be multiplied in a process of multiplying the pixel values of a plurality of pixels by a predetermined ratio value and obtaining the sum as a new pixel value. Usually, the value is 0 or more and 1 or less. In addition, the sum of the transmittance of each pixel used in one new pixel value is often set to one. It may be called "opacity" instead of "transmittance". “Transparency” is a value obtained by subtracting “opacity” from 1.
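The weighted sum that defines the transmittance can be sketched as follows. This is a minimal illustration on single grayscale pixel values; the function names and the default weight of 0.5 are assumptions.

```python
def blend(pixels, ratios):
    """New pixel value: each pixel value multiplied by its ratio, then summed."""
    return sum(p * w for p, w in zip(pixels, ratios))


def blend_two(reference, corrected, weight=0.5):
    """Blend two pixel values so that the two ratios sum to one."""
    return blend((reference, corrected), (1.0 - weight, weight))


print(blend_two(200, 100))               # 150.0
print(blend_two(200, 100, weight=0.25))  # 175.0
```

With a weight of 0.5 both images show through equally, which corresponds to the half-transmittance display mentioned for overlapping subject areas.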
[0127]
The “predetermined transmittance” includes a fixed value, a value that changes according to the region, a value that gradually changes near the boundary of the region, and the like.
[0128]
The “difference image” is an image in which pixel values at the same position in two images are compared and the difference value is created as a pixel value. Generally, the difference value often takes an absolute value.
[0129]
"Pixel value different from the original pixel value" refers to a pixel value that realizes, for example, translucency by changing the transmittance, inverted display by reversing the brightness or hue of the pixel value, or display in a conspicuous color such as red, white, or black. It also includes cases in which the pixel value differs between the boundary portion and the inside of the region, the boundary portion is surrounded by a dotted line, or the region blinks (the pixel value is changed over time).
[0130]
The “subject region” is a region delimited by the boundary that separates the subject from the background. For example, if a person in the first subject image is wearing clothes or holding objects that do not appear in the second subject image, these are also part of the subject and are included in the subject region. Note that the subject region is not necessarily a single connected region and may be divided into a plurality of regions.
[0131]
"Overlapping only the region obtained from the subject region extracting means" does not mean that no image is generated outside that region, but that the remaining region is filled with the reference image or the like.
[0132]
The “feature” refers to a property of an image of the area, and is preferably a property that can be expressed by a numerical value as a feature amount. Examples of the feature amount include the pixel values of each color and their hue, saturation, and brightness, as well as statistics representing the pattern and structure of the image, such as the co-occurrence matrix, difference statistics, run-length matrix, power spectrum, and higher-order statistics.
[0133]
The “region of the first subject or the second subject” is a region of a subject extracted from a difference image or the like, and need not be distinguished as the region of the first subject or the region of the second subject.
[0134]
The “subject to be combined” is not a subject obtained in the course of the combining process, but a subject that actually exists (in front of the camera). It refers to a subject that the user intends to combine with the reference image, which is chosen as either the first subject image or the second subject image. However, as described above, what is collectively handled as a unit of the synthesis processing is one “subject”, so one subject may be a plurality of persons or objects.
[0135]
"Warning" includes a warning given by displaying characters or images on display means or the like, and any other method that the user or the subject can perceive, such as light from a lamp, sound from a speaker, or vibration from a vibrator.
[0136]
The “notification” includes any method that can be detected by the user or the subject, like the “warning”.
[0137]
“Frame” refers to the outline of the entire image. When the subject partially overlaps the outer contour of the image, this may be expressed as the subject overlapping the frame or being cut off by the frame.
[0138]
FIG. 1 is a configuration diagram showing an image synthesizing apparatus that performs an image synthesizing method according to an embodiment of the present invention.
[0139]
That is, the main parts of the image synthesizing apparatus can be shown expanded into the following main functional blocks: imaging means 1, first subject image obtaining means 2, second subject image obtaining means 3, background correction amount calculating means 4, corrected image generating means 5, difference image generating means 6, subject area extracting means 7, overlap detecting means 8, superimposed image generating means 9, superimposed image displaying means 10, overlap warning means 11, shutter chance notifying means 12, and automatic shutter means 13.
[0140]
FIG. 2 is a configuration example of an apparatus that specifically realizes the means 1 to 13 of FIG. 1.
[0141]
The CPU (central processing unit) 70 functions as the background correction amount calculating means 4, the corrected image generating means 5, the difference image generating means 6, the subject area extracting means 7, the overlap detecting means 8, the superimposed image generating means 9, the superimposed image displaying means 10, the overlap warning means 11, the shutter chance notifying means 12, and the automatic shutter means 13, and obtains a program describing the processing procedure of each of these means 4 to 13 from the main storage 74, the external storage 75, or a network destination via the communication device 77.
[0142]
Note that the imaging unit 1, the first subject image obtaining unit 2, and the second subject image obtaining unit 3 may also use a CPU or the like for internal control of the image sensor and for various processing of the image data output by the image sensor.
[0143]
The CPU 70 performs processing while exchanging data with the display 71, the image sensor 72, the tablet 73, the main memory 74, the external memory 75, the shutter button 76, the communication device 77, the lamp 78, and the speaker 80.
[0144]
Note that the exchange of data may be performed not only via the bus 79 but also via a communication cable, a wireless communication device, or any other means capable of transmitting and receiving data. The means for realizing each of the means 1 to 13 is not limited to the CPU, and may be a DSP (digital signal processor) or a logic circuit in which the processing procedure is incorporated as a circuit.
[0145]
The display 71 is usually realized in combination with a graphic card or the like. The graphic card has a VRAM (video random access memory), converts the data in the VRAM into a display signal, and sends it to a display (display/output medium) such as a monitor, which displays the display signal as an image.
[0146]
The imaging device 72 is a device that captures an image of a landscape or the like and obtains an image signal, and usually includes optical components such as a lens, a light receiving device, and the electronic circuits associated with them. Here, it is assumed that the imaging device 72 also includes a portion that converts the output signal into digital image data through an A/D converter or the like, and that the image data of the captured image is transmitted to the first subject image obtaining means 2, the second subject image obtaining means 3, or the like. A typical image pickup device is, for example, a charge coupled device (CCD), but any other device that can obtain a landscape or the like as image data may be used.
[0147]
Means for inputting a user's instruction include the tablet 73 and the shutter button 76; the user's instruction is input to each of the means 1 to 13 via the bus 79. Various other input means, such as operation buttons and voice input through a microphone, can also be used. The tablet 73 includes a pen and a detection device that detects the pen position. The shutter button 76 consists of a mechanical or electronic switch or the like; when the user presses it, it normally generates a start signal that starts a series of processes such as recording the image captured by the image sensor 72 in the main memory 74, the external memory 75, or the like.
[0148]
The main memory 74 is usually constituted by a memory device such as a DRAM (dynamic random access memory) or a flash memory. Note that a memory or a register included in the CPU may be interpreted as a kind of main memory.
[0149]
The external storage 75 is a removable storage means such as a hard disk drive (HDD) or a personal computer (PC) card. Alternatively, a main storage or an external storage attached to another network device that is connected to the CPU 70 via a network by wire or wirelessly can be used as the external storage 75.
[0150]
The communication device 77 is realized by a network interface card or the like, and exchanges data with another network device connected wirelessly or by wire.
[0151]
The speaker 80 interprets audio data transmitted via the bus 79 or the like as an audio signal and outputs the audio data as audio. The output sound may be a simple sound of a single wavelength, or may be a complex sound such as music or human voice. If the audio to be output is determined in advance, the transmitted data may not be an audio signal but may be merely an ON / OFF operation control signal.
[0152]
Next, each of the units 1 to 13 in FIG. 1 will be described from the viewpoint of data transfer between the units.
[0153]
Regarding data exchange between the means, when the expressions “obtain from ** means” and “send (pass) to ** means” are used without any particular annotation, the data is assumed to be exchanged mainly via the bus 79. In this case, data may be exchanged directly between the means, or via the main storage 74, the external storage 75, a network through the communication device 77, or the like.
[0154]
The imaging unit 1 mainly includes an imaging element 72, and sends the captured scenery and the like to the first subject image acquiring unit 2 and the second subject image acquiring unit 3 as image data.
[0155]
The first subject image obtaining unit 2 is configured by, for example, the imaging unit 1, the main storage 74, and / or the external storage 75, and obtains an image including the first subject (hereinafter referred to as a "first subject image") from the imaging unit 1, the main storage 74, the external storage 75, and / or a network destination via the communication device 77. The first subject image obtaining means 2 may include a CPU or the like for internal control.
[0156]
When the imaging means 1 is used, the current scenery including the first subject (the first subject image) is captured by the image sensor 72, normally at the timing when the shutter button 76 or the like is pressed. The captured image is recorded in the main storage 74, the external storage 75, and / or a network destination via the communication device 77, or the like.
[0157]
On the other hand, when the first subject image obtaining means 2 obtains the first subject image from the main storage 74, the external storage 75, and / or a network destination via the communication device 77, it reads an image that has already been captured and prepared in advance. Note that a camera may be located at a network destination or the like via the communication device 77, and shooting may be performed through the network.
[0158]
The first subject image is sent to the background correction amount calculating unit 4, the corrected image generating unit 5, the difference image generating unit 6, the subject region extracting unit 7, and / or the superimposed image generating unit 9.
[0159]
The second subject image acquisition unit 3 is configured by, for example, the imaging unit 1, the main storage 74, and / or the external storage 75, and obtains an image including the second subject (hereinafter referred to as a "second subject image") from the imaging unit 1, the main memory 74, the external memory 75, and / or a network destination via the communication device 77. The second subject image obtaining means 3 may include a CPU or the like for internal control. Except for the difference in the content of the image, the method of acquiring the image is the same as that of the first subject image acquiring means 2.
[0160]
The second subject image is sent to the background correction amount calculating unit 4, the corrected image generating unit 5, the difference image generating unit 6, the subject region extracting unit 7, and / or the superimposed image generating unit 9.
[0161]
The CPU 70 serving as the background correction amount calculation means 4 calculates any one or any combination of the relative movement amount, the rotation amount, the enlargement / reduction ratio, and the distortion correction amount of the background other than the subjects between the first subject image and the second subject image. It is sufficient to obtain at least the correction amount between one of the first and second subject images (the reference image) and the other image.
[0162]
The background correction amount calculation unit 4 sends the calculated correction amount to the correction image generation unit 5. When the background correction amount calculation unit 4 uses a correction amount calculated in advance, it reads the correction amount from the main memory 74, the external storage 75, and / or a network destination via the communication device 77, or the like.
[0163]
The CPU 70 as the corrected image generating means 5 takes either the first subject image or the second subject image as a reference image, generates an image in which the other image is corrected by the correction amount obtained from the background correction amount calculating means 4 so that the background portions other than the subjects overlap (hereinafter referred to as a corrected image), and sends it to the difference image generation unit 6 and the superimposed image generation unit 9. When the corrected image generation unit 5 uses a corrected image generated in advance, it reads the corrected image from the main storage 74, the external storage 75, and / or a network destination via the communication device 77.
[0164]
The CPU 70 as the difference image generation means 6 generates a difference image between the reference image determined by the correction image generation means 5 and the correction image obtained from the correction image generation means 5, and sends the generated difference image to the subject area extracting means 7 and the superimposed image generating means 9.
[0165]
The CPU 70 as the subject area extracting means 7 extracts the first and second subject areas from the difference image obtained from the difference image generating means 6, and sends the extracted areas to the overlap detecting means 8 and the superimposed image generating means 9.
[0166]
The CPU 70 serving as the overlap detecting means 8 detects an overlap between the first and second subjects from the first and second subject areas obtained from the subject area extracting means 7, and sends information on whether or not an overlap exists and information on the overlapping area to the superimposed image generating means 9, the overlapping warning means 11, the photo opportunity notifying means 12, and the automatic shutter means 13.
[0167]
The CPU 70 serving as the superimposed image generating means 9 generates an image in which the first subject image obtained from the first subject image obtaining means 2, the second subject image obtained from the second subject image obtaining means 3, and the corrected image obtained from the corrected image generating means 5 are superimposed, and sends the generated image to the superimposed image display means 10.
[0168]
In some cases, the superimposed image generating means 9 generates an area having a difference in the difference image obtained from the difference image generating means 6 as an image having a pixel value different from the original pixel value.
[0169]
In some cases, the superimposed image generating means 9 superimposes only the first subject and the second subject obtained from the subject area extracting means 7 on a reference image or the like.
[0170]
In some cases, the superimposed image generating means 9 generates the overlap area obtained from the overlap detecting means 8 as an image having a pixel value different from the original pixel value.
[0171]
The CPU 70 as the superimposed image display means 10 displays the superimposed image obtained from the superimposed image generation means 9 on a display 71 or the like.
[0172]
In addition, the superimposed image display means 10 may display a warning according to the warning information obtained from the overlap warning means 11, may indicate a photo opportunity according to the photo opportunity information obtained from the photo opportunity notification means 12, or may indicate that the automatic shutter has been released according to the shutter information obtained from the automatic shutter means 13.
[0173]
Based on the overlap information obtained from the overlap detection unit 8, the CPU 70 serving as the overlap warning unit 11 notifies the user, the subject, or both that an overlap exists.
[0174]
Various forms of notification can be adopted, such as sending the content of the notification as characters or the like to the superimposed image display means 10 for display on the display 71, notifying by light using the lamp 78, or notifying by sound using the speaker 80. Any other device may be used as long as it can provide the notification.
[0175]
Based on the overlap information obtained from the overlap detection means 8, the CPU 70 serving as the photo opportunity notification means 12 notifies the user, the subject, or both that no overlap exists. The notification method is the same as that described for the overlap warning unit 11.
[0176]
When the overlap information obtained from the overlap detection unit 8 indicates that there is no overlap, the CPU 70 serving as the automatic shutter unit 13 automatically instructs the second subject image acquisition unit 3 to record the image obtained from the imaging unit 1 in the main storage 74, the external storage 75, or the like.
[0177]
Here, it is mainly assumed that the image obtained from the imaging means 1 is finally recorded and stored in the main memory 74, the external memory 75, or the like as the first subject image or the second subject image and is then used for synthesis. For example, when the first subject is photographed first and the second subject afterward, the first subject image is recorded and stored each time it is obtained from the imaging means 1, whereas the second subject image is not stored immediately even when it is obtained from the imaging means 1.
[0178]
That is, when the image obtained from the imaging unit 1 is used as the second subject image, a series of processes is repeated in which overlap detection and other processing are performed using the obtained second subject image and the stored first subject image, and various kinds of display, warnings, notifications, and the like are presented on the superimposed image display means 10. Then, when recording and saving are instructed by the automatic shutter means 13, the second subject image is finally recorded and saved.
[0179]
Note that the second subject image may be recorded and stored when the instruction from the automatic shutter unit 13 is present and the shutter button 143 is pressed by the user.
[0180]
The automatic shutter unit 13 may notify the user, the subject, or both that the captured image has been recorded as a result of issuing the instruction. The notification method is the same as that described for the overlap warning unit 11.
[0181]
Further, the CPU 70 as the automatic shutter unit 13 may not only issue a recording instruction but also, when the overlap information obtained from the overlap detection unit 8 indicates that there is an overlap, automatically instruct the second subject image obtaining unit 3 to prohibit recording of the image obtained from the imaging unit 1 in the main memory 74, the external memory 75, or the like. This operation is the reverse of the above-described automatic recording.
[0182]
In this case, if there is an instruction to prohibit saving by the automatic shutter means 13, even if the user presses the shutter button 143, the second subject image will not be recorded and saved.
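The recording behavior described for the automatic shutter means 13 can be summarized as a small decision function. The following Python sketch is only an illustration under stated assumptions: the function name `may_record` and the flags `auto_record` and `prohibit_on_overlap` are hypothetical and do not appear in the source; they stand for the automatic recording and recording prohibition variants described above.

```python
# Hypothetical sketch of the recording decision; all names are illustrative only.

def may_record(overlap_exists, shutter_pressed,
               auto_record=True, prohibit_on_overlap=True):
    """Decide whether the second subject image is recorded for the current frame."""
    if overlap_exists:
        # With prohibition enabled, even a manual shutter press is ignored.
        return shutter_pressed and not prohibit_on_overlap
    # No overlap: record automatically, or only when the user presses the shutter.
    return auto_record or shutter_pressed


print(may_record(overlap_exists=False, shutter_pressed=False))  # automatic recording
print(may_record(overlap_exists=True, shutter_pressed=True))    # recording prohibited
```

Setting `auto_record=False` corresponds to the variant in which the image is recorded only when the instruction from the automatic shutter means is present and the user also presses the shutter button.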
[0183]
FIG. 3A shows an example of the appearance of the image synthesizing apparatus according to the present invention as viewed from the back. A display unit / tablet 141, a lamp 142, and a shutter button 143 are provided on the main body 140.
[0184]
The display unit / tablet 141 corresponds to the input / output device (the display 71 and the tablet 73) and the superimposed image display unit 10. As shown in FIG. 3A, the display unit / tablet 141 displays the composite image generated by the superimposed image generation unit 9 and notification / warning information from the overlap warning unit 11, the shutter chance notification unit 12, the automatic shutter unit 13, and the like. It is also used for displaying various setting menus of the image synthesizing apparatus and for changing settings with a finger or a pen using the tablet.
[0185]
In addition, as an operation means for various settings, not only a tablet but also buttons and the like may be used. The display unit / tablet 141 may be viewed not only by the photographer but also by the subject using a method such as rotation or separation with respect to the main body 140.
[0186]
The lamp 142 is used for notification or warning from the overlap warning means 11, the photo opportunity notification means 12, the automatic shutter means 13, or the like.
[0187]
The shutter button 143 is mainly used for instructing a timing at which the first subject image acquiring unit 2 or the second subject image acquiring unit 3 captures / records a captured image from the imaging unit 1.
[0188]
Although not shown in this example, a built-in speaker or the like may be used as notification / warning means.
[0189]
FIG. 3B shows an example of the appearance of the image synthesizing apparatus according to the present invention from the front. A lens section 144 is provided on the front surface of the main body 140. The lens unit 144 is a part of the imaging unit 1. Although not shown in the example of FIG. 3B, a display unit, a lamp, a speaker, and the like may be provided on the front so that information (the above-described notification or warning) can be transmitted to the subject.
[0190]
FIG. 4 is an explanatory diagram illustrating an example of a data structure of image data. The image data is a two-dimensional array of pixel data, and “pixel” has a position and a pixel value as attributes. Here, it is assumed that the pixel values have R, G, and B values corresponding to the three primary colors of light (red, green, and blue). A set of R, G, and B arranged side by side in FIG. 4 becomes data of one pixel. However, when only monochrome luminance information having no color information is provided, it is assumed that a luminance value is provided as data of one pixel instead of R, G, and B.
[0191]
The position is represented by XY coordinates (x, y). In FIG. 4, the origin is the upper left, the right is the + X direction, and the downward is the + Y direction.
[0192]
Hereinafter, for the sake of explanation, the pixel at the position (x, y) is represented as “P (x, y)”, and the pixel value of the pixel P (x, y) may be written as “pixel value P (x, y)” or simply “P (x, y)”. When the pixel value is divided into R, G, and B, the calculation is performed for each color. Unless a process treats color specially, the same calculation may be applied to each of the R, G, and B values. Therefore, the description below uses “pixel value P (x, y)” as the common notation for these calculations.
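The data structure of FIG. 4 can be illustrated with a minimal Python sketch. The helper names `make_image`, `pixel`, and `per_channel` are hypothetical and not from the source, and the image is assumed to be stored row by row so that P (x, y) corresponds to `image[y][x]`.

```python
# Hypothetical helpers illustrating the FIG. 4 layout; not part of the source.
# The image is a list of rows, so P(x, y) is image[y][x]:
# x grows to the right and y grows downward from the upper-left origin.

def make_image(width, height, fill=(0, 0, 0)):
    """Create a width x height image whose pixels are (R, G, B) tuples."""
    return [[fill for _ in range(width)] for _ in range(height)]

def pixel(image, x, y):
    """Return the pixel value P(x, y)."""
    return image[y][x]

def per_channel(op, p, q):
    """Apply the same calculation to each of the R, G, and B values."""
    return tuple(op(a, b) for a, b in zip(p, q))

img = make_image(4, 3)
img[2][1] = (255, 128, 0)  # set P(1, 2), i.e. x = 1, y = 2
diff = per_channel(lambda a, b: abs(a - b), pixel(img, 1, 2), pixel(img, 0, 0))
```

For a monochrome image, each pixel would hold a single luminance value instead of a tuple, and the per-channel step would be unnecessary.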
[0193]
FIG. 5 is a flowchart illustrating an example of the image synthesizing method according to the embodiment of the present invention.
[0194]
First, in step S1 (hereinafter, "step S" is abbreviated as "S"), the first subject image obtaining means 2 obtains the first subject image, and the process proceeds to S2 via connection point P20 (hereinafter, "connection point P" is abbreviated as "P"). The first subject image may be photographed using the imaging unit 1, or an image prepared in advance in the main storage 74, the external storage 75, a network destination via the communication device 77, or the like may be read.
[0195]
In S2, the second subject image acquiring means 3 acquires a second subject image having a background portion at least partially common to the first subject image, and the process proceeds to S3 via P30. Although the processing here will be described in detail later with reference to FIG. 13, the method of acquiring the second subject image itself is the same as that of the first subject image. Note that the order of the processing of S1 and S2 may be reversed, but if the later image is used as the reference image, an effect of displaying a composite image at the time of imaging with less discomfort will be obtained.
[0196]
In S3, the background correction amount calculation means 4 calculates a background correction amount from the first subject image and the second subject image, and the process proceeds to S4 via P40. The first subject image and the second subject image are obtained from the first subject image obtaining means 2 (S1) and the second subject image obtaining means 3 (S2), respectively.
[0197]
In the following, whenever the first subject image and the second subject image are used, they are obtained from the same means / steps as in S3 unless otherwise specified, and the description of the means / steps from which they are obtained is omitted.
[0198]
Details of the processing in S3 will be described later with reference to FIG. 14.
[0199]
In S4, the corrected image generation unit 5 corrects whichever of the first subject image and the second subject image is not the reference image, using the background correction amount obtained from the background correction amount calculation unit 4, and the difference image generation unit 6 generates a difference image between the image corrected by the corrected image generation unit 5 and the reference image. The process then proceeds to S5 via P50. Details of the processing in S4 will be described later with reference to FIG.
[0200]
In S5, the subject area extracting means 7 extracts first and second subject areas (hereinafter, referred to as a first subject area and a second subject area) from the difference image obtained from the difference image generating means 6 (S4). Then, the overlap detection unit 8 detects the overlap between the subjects, and the process proceeds to S6 via P60. Details of the processing in S5 will be described later with reference to FIG.
[0201]
In S6, one or more of the overlap warning unit 11, the shutter chance notification unit 12, and the automatic shutter unit 13 perform various processes according to the information on the overlap obtained from the overlap detection unit 8 (S5), The process proceeds to S7 via P70. Details of the processing in S6 will be described later with reference to FIGS.
[0202]
In S7, the superimposed image generating means 9 generates a superimposed image from the first subject image, the second subject image, the image other than the reference image corrected by the corrected image generating means 5 (S4), the first and second subject areas obtained from the subject area extracting means 7 (S5), the information on the overlap of the first and second subjects obtained from the overlap detection means 8 (S6), and the like, and the process proceeds to S8 via P80. Details of the processing in S7 will be described later with reference to FIG.
[0203]
In S8, the superimposed image display means 10 displays the superimposed image obtained from the superimposed image generation means 9 (S7) on the display 71 or the like, and ends the processing.
[0204]
Through the processes of S1 to S7, the first subject and the second subject can be combined on one image using the first subject image and the second subject image, and various kinds of processing can be performed according to the degree of overlap between the subjects.
[0205]
The detailed processing and its effects will be described in detail later, and the outline of the processing will be described first with a simple example.
[0206]
FIG. 6A is an example of the first subject image obtained in S1. A person (1) as the first subject is standing on the left side in front of the background. "1" is written on the face of the person (1) for easy understanding. In the following, unless otherwise noted, the terms "right side" and "left side" mean the right side and the left side in the figure, that is, the directions as seen from the photographer / camera.
[0207]
FIG. 7A is an example of the second subject image obtained in S2. A person (2) as a second subject stands on the right side of the background. "2" is written on the face of the person (2) for easy understanding.
[0208]
FIG. 7C is an image obtained by calculating the background correction amount between the first subject image of FIG. 6A and the second subject image of FIG. 7A and then correcting the second subject image by that amount, with the first subject image as the reference image.
[0209]
The corrected image is the range surrounded by the solid-line frame. So that it can be compared with the range of the first subject image, the range of the original second subject image in FIG. 7A is indicated by a dotted frame in FIG. 7C. The background in FIG. 7A was photographed slightly to the upper left of the background in FIG. 6A. Therefore, to correct the second subject image in FIG. 7A so that it overlaps the background of the first subject image in FIG. 6A, a landscape slightly to the lower right of FIG. 7A must be selected, and FIG. 7C is accordingly corrected to show a landscape slightly to the lower right of FIG. 7A. Since no image exists for the landscape to the lower right of FIG. 7A, the portion protruding to the right of the rightmost dotted line and the portion protruding below the lowermost dotted line are blank in FIG. 7C. Conversely, the upper left part of FIG. 7A is cut off.
[0210]
Here, the correction involves no enlargement / reduction or rotation; the result is obtained by a simple translation alone. That is, the background correction amount obtained in S3 is the translation amount indicated here by the shift between the solid frame and the dotted frame.
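A translation-only correction of this kind can be sketched in Python as follows. This is a minimal illustration, not the source's implementation: the function name `translate`, the use of `None` for blank pixels, and the sign convention (a positive shift selects scenery further to the lower right of the original image) are all assumptions made for this example.

```python
# Hypothetical sketch of a translation-only background correction.
# corrected(x, y) = image(x + tx, y + ty); positions with no source pixel
# stay blank (None), like the blank margins of FIG. 7C.

def translate(image, tx, ty, blank=None):
    height, width = len(image), len(image[0])
    out = [[blank] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sx, sy = x + tx, y + ty
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = image[sy][sx]
    return out

second = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
corrected = translate(second, 1, 1)
# corrected == [[5, 6, None], [8, 9, None], [None, None, None]]
```

In the result, the right column and bottom row are blank and the upper-left pixels of the original image are cut off, which mirrors the relationship between the solid and dotted frames described above.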
[0211]
FIG. 8A is the difference image generated in S4 between the first subject image of FIG. 6A and the corrected second subject image of FIG. 7C. In the difference image, portions where the difference amount is 0 (that is, portions where the backgrounds match) are shown in black. The portions having a difference are the subject areas and noise; within the subject areas, the image looks odd because the background of one image overlaps the subject of the other. (A region in which only one of the images has pixels as a result of the correction, for example the inverted L-shaped region between the solid line and the dotted line on the lower right side of FIG. 7C, is excluded from the difference, and its difference amount is set to 0.)
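The difference image can be sketched in Python as follows. This is a simplified illustration under assumptions: the images are monochrome (one luminance value per pixel), blank pixels in the corrected image are represented by `None`, and the name `difference_image` is hypothetical.

```python
# Hypothetical sketch of the difference image for monochrome images.
# Pixels that are blank (None) in the corrected image are excluded from the
# difference and given a difference amount of 0.

def difference_image(reference, corrected):
    height, width = len(reference), len(reference[0])
    diff = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if corrected[y][x] is not None:
                diff[y][x] = abs(reference[y][x] - corrected[y][x])
    return diff

reference = [[10, 10, 10],
             [10, 90, 10],   # 90: a subject pixel present only in the reference
             [10, 10, 10]]
corrected = [[10, 10, None],
             [10, 10, None],
             [None, None, None]]
diff = difference_image(reference, corrected)
# diff == [[0, 0, 0], [0, 80, 0], [0, 0, 0]]
```

Only the pixel where the two images actually differ has a nonzero value; the matching background and the blank margin both come out as 0.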
[0212]
Although there are various processing methods for the processing related to the overlap in S6, no overlap is detected in this example, and therefore, no particular processing is performed here to simplify the description.
[0213]
FIG. 9A is an image generated by overlaying (overwriting) the portion of the image corresponding to the second subject area shown in FIG. 19D, described later, on the first subject image (reference image) in FIG. 6A. The subjects shown separately in FIG. 6A and FIG. 7A are arranged on the same image without overlapping. Since there are various ways of superimposing, the details will be described later. The image of FIG. 9A is displayed on the superimposed image display means 10 as a composite image.
[0214]
As a result, an effect is obtained in which images can be synthesized as if the subjects separately photographed were photographed simultaneously.
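The overwriting step in this example can be sketched in Python as follows. This is a minimal illustration under assumptions: the second subject area is given as a boolean mask of the same size as the images, and the name `overlay` is hypothetical. The source notes that several superimposing methods exist; this sketch shows only simple overwriting.

```python
# Hypothetical sketch: overwrite the reference image with the corrected image
# wherever the second subject area mask is True.

def overlay(reference, corrected, subject_mask):
    return [
        [c if inside else r for r, c, inside in zip(ref_row, cor_row, mask_row)]
        for ref_row, cor_row, mask_row in zip(reference, corrected, subject_mask)
    ]

reference = [[1, 1, 1],
             [1, 1, 1]]
corrected = [[2, 2, 9],
             [2, 2, 9]]
second_subject_area = [[False, False, True],
                       [False, False, True]]
composite = overlay(reference, corrected, second_subject_area)
# composite == [[1, 1, 9], [1, 1, 9]]
```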
[0215]
Although the outline of the processing has been described above, the processing of S6 in the case where the subject areas overlap each other in S5 has not yet been covered, so it is briefly described below.
[0216]
FIG. 10 is an example of a second subject image different from FIG. 7A. Compared with FIG. 7A, the second subject is located slightly to the left against the same background. It is assumed that the first subject image is the same as that shown in FIG. 6A.
[0217]
FIG. 11C shows an area where the first subject area and the second subject area are combined. An area 202 in the figure is composed of the first subject area and the second subject area. Here, because of the positions of the first subject and the second subject relative to the same background, the first subject area and the second subject area overlap each other, so the area 202 appears as a single combined area.
[0218]
FIG. 12 is a diagram illustrating an example of the superimposed image generated in S7 when there is an overlap in S6. Since the area 202 is treated as one area in which the first subject area and the second subject area are combined, they are displayed translucently in a lump. Further, a message indicating that the first subject and the second subject overlap each other is displayed over the overlaid image.
[0219]
Displaying the superimposed image (including the message) has an effect that the user and the subject can easily understand that the first subject and the second subject overlap each other.
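The relationship between overlapping subject areas, the combined area, and the warning message can be sketched in Python as follows. This is only an illustration under assumptions: the two subject areas are given as separate boolean masks and compared pixel by pixel, which is not necessarily how the source detects the overlap, and all function names and the message text are hypothetical.

```python
# Hypothetical sketch: subject areas are boolean masks (True = subject pixel).

def overlap_mask(area1, area2):
    """Pixels that belong to both subject areas."""
    return [[a and b for a, b in zip(r1, r2)] for r1, r2 in zip(area1, area2)]

def combined_mask(area1, area2):
    """Pixels that belong to either subject area (compare area 202 in FIG. 11C)."""
    return [[a or b for a, b in zip(r1, r2)] for r1, r2 in zip(area1, area2)]

def overlap_message(area1, area2):
    """Return a warning text if the areas share any pixel, otherwise None."""
    if any(any(row) for row in overlap_mask(area1, area2)):
        return "The first subject and the second subject overlap."
    return None

first_area = [[True, True, False, False]]
second_area = [[False, True, True, False]]
message = overlap_message(first_area, second_area)
merged = combined_mask(first_area, second_area)
# merged == [[True, True, True, False]]
```

A display routine could draw the pixels of `merged` translucently and show `message` over the superimposed image, in the spirit of FIG. 12.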
[0220]
As described above, the outline of the processing example of S6 when the subject areas overlap in S5 has been described.
[0221]
Considering this in a typical usage scene example, first, a first subject as shown in FIG. 6A is photographed by a camera (image synthesizing device) and recorded. Next, a second subject as shown in FIG. 7A is photographed with the same background.
[0222]
Note that if the first subject and the second subject take turns photographing each other, the shooting can be done by just the two of them, without a third person. To shoot against the same background it is better not to move the camera, but because the images are corrected to match the background, the camera does not need to be fixed with a tripod or the like. The positional relationship between the subjects is not limited to left and right as shown in FIGS. 6A and 7A and may be any positional relationship.
[0223]
Then, after capturing the two images, the processing from S3 to S7 is performed, and a display as shown in FIG. 9A or FIG. 12 (or a warning / notification described later) is performed.
[0224]
If there is a display or notification that the subjects are overlapping, the processing from S1 to S7 may be repeated again. That is, the first subject image and the second subject image are photographed, and a superimposed image is generated and displayed. It may be repeated as many times as desired until the displayed processing result is satisfactory.
[0225]
However, when the second subject moves, for example, the first subject does not necessarily need to be retaken, and only the second subject may need to be retaken. In that case, S2 to S7 may be repeated.
[0226]
In this case, the process from the acquisition of the second subject image in S2 to the display in S7 may be repeated automatically. That is, second subject images are acquired continuously, as when capturing a moving image, without the shutter button being pressed, and the processing and display are repeated, so that the processing result can be confirmed in real time as the camera or the second subject moves. It is therefore possible to know in real time whether the position to which the second subject has moved is appropriate (that is, whether the subjects do not overlap), which makes it easier to photograph the second subject so as to obtain a composite result without overlap.
[0227]
To start this repetitive processing, a dedicated mode is entered, for example by selecting the start of processing from a menu. When the second subject reaches an appropriate position, pressing the shutter button determines (records) the second subject image, and the repetitive processing / dedicated mode may then be terminated (even after termination, the processing may be continued up to S7 to obtain the final synthesis result).
[0228]
When the first subject image itself is unsuitable, for example when the first subject is located in the middle of the background so that, however the second subject is arranged, the second subject either overlaps the first subject or falls outside the frame of the superimposed image, the process may be repeated from the acquisition of the first subject image in S1.
[0229]
Hereinafter, details of the above-described processing will be described.
[0230]
FIG. 13 is a flowchart illustrating one method of the process of S2 in FIG. 5, that is, a process of acquiring the second subject image.
[0231]
In S2-1 after P20, the second subject image acquiring unit 3 acquires the second subject image, and the process proceeds to S2-2. As far as the acquisition method itself is concerned, the process here is the same as the acquisition of the first subject image in S1 of FIG. 5.
[0232]
In S2-2, the means 3 determines whether or not there is an instruction to record an image from the automatic shutter means 13, and if there is an instruction, the process proceeds to S2-3, and if there is no instruction, the process exits to P30.
[0233]
In S2-3, the means 3 records the second subject image acquired in S2-1 in the main memory 74, the external memory 75, etc., and the process exits to P30.
[0234]
The processing of S2 in FIG. 5 is performed by the processing of S2-1 to S2-3 described above.
[0235]
Note that, in addition to recording instructed by the automatic shutter means 13, the photographed image may also be recorded when the photographer presses the shutter button manually or when the shutter is released by a self-timer; these cases are assumed to be included in the processing of S2-1.
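The flow of S2-1 to S2-3 can be sketched in Python as follows. This is a minimal illustration, not the source's implementation: `capture` is a hypothetical callable standing in for the imaging means, `record_instructed` stands for the instruction from the automatic shutter means 13, and a plain list stands in for the main memory or external memory.

```python
# Hypothetical sketch of the S2-1 to S2-3 flow of FIG. 13.

def acquire_second_subject_image(capture, record_instructed, storage):
    image = capture()            # S2-1: acquire the second subject image
    if record_instructed:        # S2-2: is there an instruction to record?
        storage.append(image)    # S2-3: record the image
    return image                 # exit to P30 in either case

storage = []
preview = acquire_second_subject_image(lambda: "frame-1", False, storage)
saved = acquire_second_subject_image(lambda: "frame-2", True, storage)
# storage == ["frame-2"]
```

The image is returned in both cases, so the later steps can process and display it whether or not it was recorded.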
[0236]
FIG. 14 is a flowchart illustrating one method of the process of S3 in FIG. 5, that is, a process of calculating the background correction amount.
[0237]
There are various methods for calculating the background correction amount. Here, a simple method using block matching will be described.
[0238]
In S3-1 after P30, the background correction amount calculation means 4 divides the first subject image into block areas. FIG. 6B is an explanatory diagram illustrating a state where the first subject image in FIG. 6A is divided into block regions. Each block area is a rectangle bounded by dotted lines. The upper left block is represented as "B (1,1)", the block to its right as "B (1,2)", and the block below it as "B (2,1)". In FIG. 6B, because of space limitations, only the digits are written at the upper left of each block; for example, "11" is written in block B (1, 1).
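The division into block areas can be sketched in Python as follows. This is only an illustration under assumptions: `block_origins` is a hypothetical helper that lists the upper-left pixel position of each square block, partial blocks at the right and bottom edges are simply dropped, and the 70 x 60 pixel image size and 10-pixel block size are invented values, not taken from FIG. 6B.

```python
# Hypothetical sketch of dividing an image into square blocks of side m.
# Each block is identified by the (x, y) position of its upper-left pixel.

def block_origins(width, height, m):
    return [
        (x0, y0)
        for y0 in range(0, height - m + 1, m)
        for x0 in range(0, width - m + 1, m)
    ]

origins = block_origins(70, 60, 10)   # assumed image and block sizes
# len(origins) == 42 (7 columns x 6 rows); origins[0] == (0, 0)
```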
[0239]
In S3-2, the means 4 finds a position where the block of the first subject image matches on the second subject image, and the process proceeds to S3-3. In this case, “(block) matching” is a process of searching the second subject image for a block area in which the image in the block is most similar to each block of the first subject image.
[0240]
For the purpose of explanation, an image defining a block (here, a first subject image) is called a “reference image”, and an image of a partner searching for a similar block (here, a second subject image) is called a “search image”. A block on the reference image is called a “reference block”, and a block on the search image is called a “search block”. The pixel value at an arbitrary point (x, y) on the reference image is Pr (x, y), and the pixel value at an arbitrary point (x, y) on the search image is Ps (x, y).
[0241]
Since the background correction amount is relative, the reference image and the search image may be used as the second subject image and the first subject image, contrary to the above.
[0242]
Now, assume that the reference block is a square whose side is m pixels long. Then the position of the upper-left pixel of the reference block B (i, j) is
$$(m \times (i-1),\ m \times (j-1))$$
and the pixel value at the position (dx, dy) pixels away from the upper left of the reference block B (i, j) is
$$Pr(m \times (i-1) + dx,\ m \times (j-1) + dy).$$
[0243]
Assuming that the upper left position of the search block is (xs, ys), the similarity S (xs, ys) between the reference block B (i, j) and the search block is obtained by the following two equations.
[0244]
$$D(xs, ys; dx, dy) = \left| Ps(xs + dx,\ ys + dy) - Pr(m \times (i-1) + dx,\ m \times (j-1) + dy) \right|$$
$$S(xs, ys) = \sum_{dx=0}^{m-1} \sum_{dy=0}^{m-1} D(xs, ys; dx, dy)$$
D (xs, ys; dx, dy) is the absolute value of the difference between the pixel values at the positions (dx, dy) pixels away from the upper left of the reference block and of the search block. S (xs, ys) is the sum of these absolute differences over all pixels in the block.
[0245]
If the reference block and the search block are exactly the same image (all corresponding pixel values are equal), S (xs, ys) becomes 0. As the number of dissimilar parts increases, that is, as the difference between pixel values increases, S (xs, ys) increases. Therefore, the smaller the S (xs, ys), the more similar the block.
[0246]
Since S (xs, ys) is the similarity when the upper left position of the search block is (xs, ys), the similarity at each location can be obtained by moving (xs, ys) over the search image. The position (xs, ys) that gives the smallest S among all of them, that is, the most similar position, may be taken as the matching position. The search block at the matching position is called a “matching block”.
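The search described above can be sketched as follows. This is only a minimal illustration, assuming grayscale images stored as lists of rows (img[y][x]) and an exhaustive scan of every position; the function names sad and find_matching_block are hypothetical and do not appear in this specification, and the reference block is given directly by the pixel position (bx, by) of its upper left corner.

```python
def sad(ref, search, bx, by, xs, ys, m):
    """S(xs, ys): sum of absolute differences between the m x m reference
    block with upper-left pixel (bx, by) and the search block with
    upper-left pixel (xs, ys). Images are lists of rows, img[y][x]."""
    total = 0
    for dy in range(m):
        for dx in range(m):
            total += abs(search[ys + dy][xs + dx] - ref[by + dy][bx + dx])
    return total


def find_matching_block(ref, search, bx, by, m):
    """Scan every valid (xs, ys) on the search image and return
    (S, xs, ys) for the position with the smallest S."""
    height, width = len(search), len(search[0])
    best = None
    for ys in range(height - m + 1):
        for xs in range(width - m + 1):
            s = sad(ref, search, bx, by, xs, ys, m)
            if best is None or s < best[0]:
                best = (s, xs, ys)
    return best
```

When several positions give the same smallest S, this sketch keeps the first one found; a practical implementation would normally restrict the scan to a window around the reference block position to reduce the amount of calculation.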
[0247]
FIG. 15 is a diagram for explaining the state of the matching. The image of FIG. 15A is a reference image, the image of FIG. 15B is a search image, and it is assumed that the positions of the image contents are slightly shifted between them. It is assumed that the reference block 100 in the reference image is located exactly at the corner of the bracket-shaped line, and that search blocks 101, 102, and 103 exist in the search image. When the similarity is calculated between the reference block 100 and each of the search blocks 101, 102, and 103, the search block 101 gives the smallest value, so the search block 101 is taken as the matching block.
[0248]
The matching of one reference block B (i, j) has been described above, and a matching block can be obtained in the same way for each reference block. For each of the 42 reference blocks in FIG. 6B, a matching block is searched for in the second subject image.
[0249]
Note that the similarity of the matching block is determined using the absolute value of the difference between the pixel values, but there are various other methods, and any method may be used.
[0250]
For example, there are a method using a correlation coefficient, a method using a frequency component, and various speed-up methods. There are various ways of setting the position and size of the reference block, but a detailed improvement method of the block matching is not the gist of the present invention, and will not be described here.
[0251]
If the size of the reference block is too small, the features cannot be captured well within the block and the accuracy of the matching result deteriorates. Conversely, if it is too large, the accuracy of the result also deteriorates and the matching becomes vulnerable to changes such as rotation and enlargement / reduction. It is therefore desirable to set the size appropriately.
[0252]
Next, in S3-3, the means 4 extracts only the search block corresponding to the background portion from the matching blocks obtained in S3-2, and the process proceeds to S3-4.
[0253]
The matching block obtained in S3-2 is merely the search block with the smallest difference, so it is not guaranteed to show the same image portion as the reference block; a pattern may simply happen to be similar. In addition, the reference block itself may not be a background portion because it contains the first subject, or the reference block may be a background portion whose corresponding image portion is hidden by the second subject and therefore does not exist on the second subject image. In such cases, the matching block ends up at an unrelated location.
[0254]
Therefore, it is necessary to remove from the matching blocks those that are determined not to be the same image portion as their reference blocks. Since the remaining matching blocks are determined to be the same image portion as their reference blocks, only the background portion excluding the first and second subjects remains as a result.
[0255]
There are various methods for selecting a matching block. Here, as the simplest method, the similarity S (xs, ys) is determined using a predetermined threshold. That is, if S (xs, ys) of each matching block exceeds the threshold, the matching is determined to be inaccurate and removed. Since S (xs, ys) is affected by the size of the block, it is desirable to determine the threshold value in consideration of the size of the block.
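The threshold test can be sketched as follows. This is only an illustration under stated assumptions: the matching results are assumed to be held in a dictionary mapping a block identifier to (S, xs, ys), and the threshold is expressed per pixel and multiplied by the block area m × m, which is one possible way of taking the block size into account. The name select_reliable_matches and the parameter per_pixel_threshold are hypothetical.

```python
def select_reliable_matches(matches, m, per_pixel_threshold):
    """Keep only the matching blocks whose S does not exceed the threshold.

    matches: dict mapping a block id to (S, xs, ys).
    The limit grows with the block area m * m because S is a sum over
    all pixels of the block.
    """
    limit = per_pixel_threshold * m * m
    return {block: result for block, result in matches.items()
            if result[0] <= limit}
```

If the returned dictionary is empty, no common background portion was found between the two images.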
[0256]
FIG. 7B illustrates a result obtained by removing an incorrect matching block from the matching result of S3-2 of the second subject image in FIG. 7A. Matching blocks determined to be correct have the same numbers as the corresponding reference blocks. As a result, it can be seen that only the matching block of the background portion that does not include or almost does not include the subject portion remains.
[0257]
Moreover, the remaining matching blocks can be determined to be the same background portion that is commonly reflected in the first subject image and the second subject image. If the first subject image and the second subject image do not have any common background portion, the remaining matching blocks become 0 as a result of the processing in S3-3.
[0258]
In S3-4, the means 4 obtains the background correction amount of the second subject image from the background matching block obtained in S3-3, and the process exits to P40.
[0259]
As the background correction amount, for example, the rotation amount θ, the enlargement / reduction amount R, and / or the translation amount (Lx, Ly) are obtained, and various calculation methods are conceivable. Here, the simplest method using two blocks will be described.
[0260]
Even if distortion correction other than rotation, enlargement / reduction, and translation is not used, in many cases the background portions almost overlap and the correction leaves sufficiently small noise in the difference image, unless the camera is moved during shooting. To obtain a distortion correction amount other than the rotation amount, the enlargement / reduction amount, and the translation amount, at least three or four blocks must be used, and a calculation that takes perspective transformation into account is required. Since this is a known method used in image synthesis and the like (for example, p. 90 of Kyoritsu Shuppan: bit, November separate volume “Computer Science”), details of this processing are omitted here.
[0261]
First, two matching blocks that are as far apart from each other as possible are selected. If only one matching block remains after S3-3, the subsequent calculation of the enlargement / reduction ratio and the rotation amount may be omitted, and the difference from the position of the corresponding reference block may be used as the translation amount. If no matching block remains after S3-3, it is considered better to retake the first and second subject images, and a warning to that effect may be issued.
[0262]
There are various ways to choose the two blocks. For example, the following method can be considered.
1) Select any two of the matching blocks and calculate the distance between the center positions of the two blocks.
2) Perform the calculation of 1) for all combinations of matching blocks.
3) Select the combination with the largest distance in 2) as the two blocks used for calculating the background correction amount.
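Steps 1) to 3) can be sketched as follows, assuming the center positions of the remaining matching blocks are held in a dictionary keyed by a block identifier. The function name farthest_pair is hypothetical.

```python
from itertools import combinations
from math import hypot


def farthest_pair(centers):
    """Return the pair of block ids whose centers are farthest apart.

    centers: dict mapping a block id to its center (x, y).
    Returns None when fewer than two blocks remain.
    """
    if len(centers) < 2:
        return None

    def distance(pair):
        (xa, ya), (xb, yb) = centers[pair[0]], centers[pair[1]]
        return hypot(xa - xb, ya - yb)

    # Evaluate the distance for every combination and keep the largest.
    return max(combinations(sorted(centers), 2), key=distance)
```

The number of combinations grows with the square of the number of blocks, which is not a concern for a few dozen blocks.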
[0263]
Here, as described in the above 3), the advantage of using the matching blocks that are the farthest apart from each other is that the accuracy in obtaining the enlargement / reduction ratio, the rotation amount, and the like is improved. Since the position of the matching block is in pixel units, the accuracy is also in pixel units. For example, the angle when shifted upward by one pixel at a position horizontally separated by 50 pixels is the same as the angle when shifted upward by 0.1 pixel at a position horizontally separated by 5 pixels. However, a deviation of 0.1 pixel cannot be detected by matching. Therefore, it is better to use matching blocks that are as far apart as possible.
[0264]
The reason for using two blocks is simply that the calculation is easy. If an average enlargement / reduction ratio, rotation amount, or the like is obtained using more blocks, the advantage of reducing errors appears.
[0265]
For example, in FIG. 7B, the two matching blocks that are farthest from each other are the combination of blocks 15 and 61.
[0266]
Next, the center positions of the two selected matching blocks are represented by the coordinates (x1′, y1′) and (x2′, y2′) on the search image, and the center positions of the corresponding reference blocks are represented by the coordinates (x1, y1) and (x2, y2) on the reference image.
[0267]
First, the scaling ratio is obtained.
[0268]
The distance Lm between the centers of the matching blocks is
Lm = ((x2′ − x1′) × (x2′ − x1′) + (y2′ − y1′) × (y2′ − y1′))^(1/2)
and the distance Lr between the centers of the reference blocks is
Lr = ((x2 − x1) × (x2 − x1) + (y2 − y1) × (y2 − y1))^(1/2)
so the scaling ratio R is obtained as
R = Lr / Lm
[0269]
Next, the rotation amount is obtained.
[0270]
The slope θm of the straight line passing through the centers of the matching blocks is
θm = arctan ((y2′ − y1′) / (x2′ − x1′))
(where θm = π / 2 when x2′ = x1′),
and the slope θr of the straight line passing through the centers of the reference blocks is
θr = arctan ((y2 − y1) / (x2 − x1))
(where θr = π / 2 when x2 = x1).
Note that arctan is the inverse function of tan.
[0271]
Thus, the rotation amount θ is obtained as
θ = θr − θm
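The calculation of the scaling ratio R and the rotation amount θ from two block pairs can be sketched as follows. Note one deliberate substitution: the sketch uses the two-argument function atan2 instead of arctan of a quotient, which avoids the special case of equal x coordinates and keeps the direction of the line; this is an assumption of the sketch, not the formula given above. The function name scale_and_rotation is hypothetical.

```python
from math import atan2, hypot


def scale_and_rotation(ref1, ref2, match1, match2):
    """Return (R, theta) from two reference block centers and the
    centers of their matching blocks, each given as (x, y)."""
    (x1, y1), (x2, y2) = ref1, ref2
    (x1m, y1m), (x2m, y2m) = match1, match2
    lm = hypot(x2m - x1m, y2m - y1m)   # Lm: distance between matching block centers
    lr = hypot(x2 - x1, y2 - y1)       # Lr: distance between reference block centers
    r = lr / lm                        # R = Lr / Lm
    theta_m = atan2(y2m - y1m, x2m - x1m)
    theta_r = atan2(y2 - y1, x2 - x1)
    return r, theta_r - theta_m        # theta = theta_r - theta_m
```

The two matching block centers must differ, otherwise Lm is 0 and the ratio is undefined.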
[0272]
Lastly, the translation amount is determined so that the center positions of the corresponding blocks coincide. For example, when (x1′, y1′) is made to coincide with (x1, y1), the translation amount (Lx, Ly) is
(Lx, Ly) = (x1′ − x1, y1′ − y1)
Since the rotation and the enlargement / reduction may be performed about any center, here the point used for the translation, that is, the center of the corresponding block, is used as both the rotation center and the enlargement / reduction center.
[0273]
Therefore, the conversion formula that converts an arbitrary point (x′, y′) in the search image into a corrected point (x″, y″) is
x″ = R × (cos θ × (x′ − x1′) − sin θ × (y′ − y1′)) + x1
y″ = R × (sin θ × (x′ − x1′) + cos θ × (y′ − y1′)) + y1
Although the rotation amount, the enlargement / reduction amount, and the translation amount have been described, strictly speaking what is obtained here are the parameters θ, R, (x1, y1), and (x1′, y1′). Note that the expression of the correction amount and the conversion formula is not limited to this, and other expressions may be used.
[0274]
This conversion formula converts a point (x′, y′) on the search image into a point (x″, y″) on the corrected image. Since the background portion of the corrected image overlaps that of the reference image, the formula can also be regarded as a conversion from the search image to the reference image (with the background portions overlapping). Therefore, this conversion from a point (Xs, Ys) on the search image to a point (Xr, Yr) on the reference image is written as a conversion function Fsr and expressed as
(Xr, Yr) = Fsr (Xs, Ys)
[0275]
Note that the above equations can also be rearranged into a conversion from the corrected point (x″, y″) to a point (x′, y′) in the search image:
x′ = (1 / R) × (cos θ × (x″ − x1) + sin θ × (y″ − y1)) + x1′
y′ = (1 / R) × (−sin θ × (x″ − x1) + cos θ × (y″ − y1)) + y1′
If this is also expressed by a conversion function Frs, it becomes
(Xs, Ys) = Frs (Xr, Yr)
The conversion function Frs is also called the inverse conversion function of the conversion function Fsr.
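The pair of conversion functions can be sketched as follows, with Frs written as the algebraic inverse of Fsr (inverse rotation, division by R, and the centers exchanged). The factory name make_transforms and the argument names are hypothetical.

```python
from math import cos, sin


def make_transforms(r, theta, ref_center, match_center):
    """Build (fsr, frs) from the scaling ratio R, the rotation amount
    theta, the reference block center (x1, y1) and the matching block
    center (x1', y1')."""
    x1, y1 = ref_center
    x1m, y1m = match_center
    c, s = cos(theta), sin(theta)

    def fsr(xs, ys):
        # Search image point -> reference (corrected) image point.
        dx, dy = xs - x1m, ys - y1m
        return (r * (c * dx - s * dy) + x1, r * (s * dx + c * dy) + y1)

    def frs(xr, yr):
        # Reference (corrected) image point -> search image point.
        dx, dy = xr - x1, yr - y1
        return ((c * dx + s * dy) / r + x1m, (-s * dx + c * dy) / r + y1m)

    return fsr, frs
```

Applying frs to the result of fsr returns the original point up to floating-point rounding, which is a convenient check that the two functions are consistent.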
[0276]
In the examples of FIGS. 6A and 7A, there is no rotation or enlargement / reduction, only translation; details will be described later with reference to FIG. 7C.
[0277]
The processing of calculating the background correction amount of S3 in FIG. 5 is performed by the above processing of S3-1 to S3-4.
[0278]
FIG. 16 is a flowchart illustrating a method of the process of S4 in FIG. 5, that is, a method of generating a corrected image of the second subject image and generating a difference image with the first subject image.
[0279]
In S4-1 after P40, the corrected image generation unit 5 corrects the second subject image using the correction amount obtained by the background correction amount calculation unit 4 (S3) so that the background portion overlaps the first subject image. Then, the process proceeds to S4-2. Note that the corrected second subject image generated here is referred to as a “corrected second subject image” (see FIG. 7C).
[0280]
For the correction, the conversion function Fsr or the inverse conversion function Frs may be used. Generally, to generate a clean converted image, the pixel position on the original image (here, the second subject image) corresponding to each pixel position on the converted image (here, the corrected second subject image) is obtained, and the pixel value of the converted image is determined from that position. The conversion function used in this case is Frs, which maps a position on the corrected image back to a position on the search image.
[0281]
Further, since the pixel position obtained on the original image is generally not an integer, the pixel value at that position cannot be read directly. Therefore, some kind of interpolation is usually performed. For example, the most common method determines the pixel value by primary (linear) interpolation from the four pixels at integer positions surrounding the obtained position on the original image. The primary interpolation method is described in general image processing books (for example, Morikita Publishing: Takeshi Yasui, Masayuki Nakajima, “Image Information Processing”, p. 54), and therefore a detailed description is omitted here.
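Primary interpolation from the four surrounding pixels can be sketched as follows. The sketch assumes a grayscale image stored as a list of rows (img[y][x]) and a position that lies inside the image; the function name bilinear is hypothetical.

```python
from math import floor


def bilinear(img, x, y):
    """Interpolate img at a non-integer position (x, y) from the four
    pixels at the surrounding integer positions.
    Assumes 0 <= x <= width - 1 and 0 <= y <= height - 1."""
    x0, y0 = int(floor(x)), int(floor(y))
    # Clamp the right/lower neighbours so the last row and column work.
    x1 = min(x0 + 1, len(img[0]) - 1)
    y1 = min(y0 + 1, len(img) - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy
```

In the correction step, (x, y) would be the position on the second subject image obtained for each pixel of the corrected image.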
[0282]
FIG. 7C is an example of the corrected second subject image generated from the second subject image of FIG. 7A and the first subject image of FIG. 6A so that its background portion overlaps that of the first subject image. The correction in this example is translation only. The range of the second subject image in FIG. 7A is indicated by a dotted line so that the state of the correction can be understood. The entire frame is shifted slightly to the lower right compared with the second subject image in FIG. 7A.
[0283]
As a result of the correction, portions appear for which no corresponding part of the second subject image exists. For example, the portion between the dotted line and the solid line at the right end of FIG. 7C is left blank because it does not exist in the second subject image of FIG. 7A. This can be seen from the fact that the horizontal line indicating the road at the bottom is interrupted before reaching the right end. Since that portion is excluded by the mask image described in S4-2, there is no problem even if arbitrary pixel values are left there.
[0284]
FIG. 17A is an example of the second subject image in a case where rotation is necessary for correction. The first subject image is the same as that in FIG. The entire screen is rotated slightly counterclockwise as compared to FIG.
[0285]
FIG. 17B shows the result of performing block matching between the second subject image in FIG. 17A and the first subject image in FIG. Even if the block is rotated, if the amount of rotation and the size of the block are not so large, there is little image change in the block, so that accurate matching can be performed to some extent following the rotation.
[0286]
FIG. 17C is the second subject image corrected using the background correction amount calculated from the block matching result in FIG. 17B. It can be seen that its background portion overlaps that of the first subject image in FIG. 6A and that the rotation has been corrected. The image frame in FIG. 17A is shown by a dotted line so that the state of the correction can be understood.
[0287]
In S4-2, the corrected image generating means 5 generates a mask image of the corrected second subject image, and the process proceeds to S4-3.
[0288]
The mask image is generated as follows. When the corrected image is generated, the above-described equations are used to obtain the pixel position on the original image corresponding to each pixel on the corrected image, and it is checked whether that position falls within the range of the original image. If it does, the corresponding pixel of the mask image is set to, for example, 0 (black) as a mask portion; if not, it is set to, for example, 255 (white). The pixel values are not limited to 0 and 255 and may be freely determined. Hereinafter, the description assumes 0 (black) and 255 (white).
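The generation of the mask image can be sketched as follows. The sketch assumes that a function frs is available that maps a pixel position on the corrected image to a position on the original image; the name make_mask and its parameters are hypothetical.

```python
def make_mask(width, height, frs, src_width, src_height):
    """Return the mask image as a list of rows.

    A pixel is 0 (black, mask portion) when the position obtained by
    frs lies inside the original image, and 255 (white) otherwise.
    """
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            xs, ys = frs(x, y)
            inside = 0 <= xs <= src_width - 1 and 0 <= ys <= src_height - 1
            row.append(0 if inside else 255)
        mask.append(row)
    return mask
```

For a pure translation the white portion is a strip along one or two edges of the frame.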
[0289]
FIG. 7D is an example of the mask image of FIG. 7C. The black area in the solid frame is the mask portion. This mask portion indicates a range in which the original image (the image before correction) has pixels in the corrected image. Therefore, in FIG. 7D, the lower right end portion where the corresponding second subject image does not exist is not a mask portion and is white.
[0290]
In S4-3, the difference image generation unit 6 uses the first subject image, the corrected second subject image obtained from the corrected image generation unit 5 (S4-1), and its mask image to generate a difference image between the first subject image and the corrected second subject image, and the process proceeds to S4-4.
[0291]
To generate the difference image, it is checked whether the pixel value of a point (x, y) on the mask image is 0. If it is 0 (black), a corrected pixel exists at that point on the corrected second subject image, and the pixel value Pd (x, y) of the point (x, y) on the difference image is
Pd (x, y) = | P1 (x, y) − Pf2 (x, y) |
that is, the absolute value of the difference between the pixel value P1 (x, y) on the first subject image and the pixel value Pf2 (x, y) on the corrected second subject image.
[0292]
If the pixel value of the point (x, y) on the mask image is not 0 (black), then
Pd (x, y) = 0
[0293]
These processes may be repeated for every point (x, y), over all pixels from the upper left to the lower right of the difference image.
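The per-pixel rule can be sketched as follows, assuming grayscale images of the same size stored as lists of rows and a mask that uses 0 for the mask portion. The function name difference_image is hypothetical.

```python
def difference_image(first, corrected_second, mask):
    """Pd(x, y) = |P1(x, y) - Pf2(x, y)| where the mask is 0 (black),
    and Pd(x, y) = 0 everywhere else."""
    return [
        [abs(p1 - pf2) if m == 0 else 0
         for p1, pf2, m in zip(row1, row2, row_mask)]
        for row1, row2, row_mask in zip(first, corrected_second, mask)
    ]
```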
[0294]
FIG. 8A is an example of a difference image generated from the first subject image of FIG. 6A, the corrected second subject image of FIG. 7C, and the mask image of FIG. 7D. Outside the area of the person (1) and the area of the person (2), the difference is 0 either because the backgrounds coincide or because the pixel lies outside the mask range. As a result, the difference image mainly consists of the area of the person (1), which contains the image of the person (1) mixed with the image of the background, and the area of the person (2), which contains the image of the person (2) mixed with the image of the background.
[0295]
Normally, small difference portions also appear outside the regions of the person (1) and the person (2) because of errors in the calculation of the correction amount in S3, errors in the interpolation used to generate the corrected image, and slight changes in the background itself caused by the difference in shooting time. These portions are usually only a few pixels in size, and the difference values are often not very large. In FIG. 8A, some white portions appear around the area of the person (1) and the area of the person (2).
[0296]
On the other hand, the mask image in the case of FIG. 17B is as shown in FIG. Even if the correction includes enlargement / reduction or rotation, once the correction and the generation of the mask image have been performed in S4-1 and S4-2, the subsequent processing follows the same procedure. In the following description, the second subject image of FIG. 7A is used rather than that of FIG. 17A.
[0297]
With the above-described processing of S4-1 to S4-3, the processing of generating a difference image in S4 of FIG. 5 can be performed.
[0298]
FIG. 18 is a flowchart illustrating a method of the process of S5 in FIG. 5, that is, a process of extracting a subject region.
[0299]
In S5-1 after P50, the subject area extraction unit 7 generates a “labeling image” (the meaning of the “labeling image” will be described later) from the difference image obtained from the difference image generation unit 6 (S4), and the process proceeds to S5-2.
[0300]
First, as a preparation, a binary image is generated from the difference image. Various methods of generating a binary image are conceivable; for example, each pixel value in the difference image may be compared with a predetermined threshold value, and the pixel may be set to black if its value is larger than the threshold value and to white otherwise. When the difference image is composed of R, G, and B pixel values, the sum of the R, G, and B pixel values may be compared with the threshold value.
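The binarization can be sketched as follows. In this sketch the value 1 stands for a black pixel and 0 for a white pixel, which is only a convention chosen here; pixels may be either single values or (R, G, B) tuples. The function name binarize is hypothetical.

```python
def binarize(diff, threshold):
    """Return a binary image: 1 (black) where the difference is larger
    than the threshold, 0 (white) otherwise. For (R, G, B) pixels the
    sum of the three components is compared with the threshold."""
    def magnitude(pixel):
        return sum(pixel) if isinstance(pixel, (tuple, list)) else pixel

    return [[1 if magnitude(p) > threshold else 0 for p in row]
            for row in diff]
```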
[0301]
FIG. 8B is an example of a binary image generated from the difference image of FIG. There are eight black areas 110 to 117, and the areas other than the large human-shaped areas 112 and 113 are small areas.
[0302]
Next, a labeling image is generated from the generated binary image. In general, a “labeling image” is an image generated by finding each block of connected white pixels or connected black pixels in a binary image and assigning a number (hereinafter referred to as a “labeling value”) to that block. In many cases, the output labeling image is a multi-valued monochrome image, and every pixel in each block area takes the labeling value assigned to that block.
[0303]
Note that a region of pixels having the same labeling value is hereinafter referred to as a “label region”. The detailed procedure for finding connected blocks and assigning labeling values to them is described in general image processing books (for example, Shokodo: “Image Processing Handbook”, published in 1987, p. 318), so the description is omitted here and only an example of the processing result is shown.
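For reference only, one common way of assigning labeling values is a breadth-first search over connected pixels, sketched below. This is not necessarily the procedure of the cited handbook; the sketch assumes that 1 marks a black pixel, that pixels are connected when they touch horizontally or vertically (4-connectivity), and that labeling values start at 1 in scan order. The function name label_regions is hypothetical.

```python
from collections import deque


def label_regions(binary):
    """Assign labeling values 1, 2, 3, ... to each 4-connected block of
    1-pixels in a binary image; all other pixels get 0."""
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    next_label = 0
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or labels[y][x] != 0:
                continue
            # A new, not yet labeled block starts at (x, y).
            next_label += 1
            labels[y][x] = next_label
            queue = deque([(x, y)])
            while queue:
                cx, cy = queue.popleft()
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and binary[ny][nx] == 1 and labels[ny][nx] == 0):
                        labels[ny][nx] = next_label
                        queue.append((nx, ny))
    return labels
```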
[0304]
Since a binary image and a labeling image differ only in whether they are binary or multi-valued, an example of a labeling image will be described with reference to FIG. The numbers in parentheses, as in “110 (1)”, added after the numbers of the areas 110 to 117 in FIG. 8B are the labeling values of the respective areas. A labeling value of 0 is assumed to be assigned to all other areas.
[0305]
The labeling image of FIG. 8B is drawn as a binary image because a multi-valued image is difficult to show on paper, but it is actually a multi-valued image whose pixel values are the labeling values. Although the labeling image need not be displayed, if it were actually displayed as an image it would look different from FIG. 8B.
[0306]
In S5-2, the subject region extracting means 7 removes the "noise" region in the labeling image obtained in S5-1, and the process proceeds to S5-3. “Noise” generally refers to a portion other than target data, and here refers to a region other than a human-shaped region.
[0307]
There are various methods for removing noise, but as a simple method, for example, there is a method of removing a label region having an area smaller than a certain threshold. For this, first, the area of each label region is obtained. To determine the area, it is sufficient to scan all the pixels and count how many pixels have a particular labeling value. When the area (the number of pixels) is obtained for all the labeling values, the label area having the area (the number of pixels) equal to or less than a predetermined threshold is removed from them. In the removal processing, specifically, the label area may be set to a labeling value of 0, or a new labeling image may be created and a label area other than noise may be copied there.
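The area-based noise removal can be sketched as follows, using the variant that creates a new labeling image. The function name remove_small_regions and the parameter area_threshold are hypothetical.

```python
from collections import Counter


def remove_small_regions(labels, area_threshold):
    """Return a new labeling image in which every label region whose
    area (number of pixels) is equal to or less than area_threshold
    has been set to the labeling value 0."""
    # Count how many pixels carry each non-zero labeling value.
    areas = Counter(v for row in labels for v in row if v != 0)
    return [[v if v != 0 and areas[v] > area_threshold else 0 for v in row]
            for row in labels]
```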
[0308]
FIG. 8C shows the result of noise removal from the labeling image of FIG. 8B. Areas other than the human-shaped areas 112 and 113 have been removed as noise.
[0309]
If it is difficult to completely automate the noise removal processing that removes label areas other than the subjects, a method of asking the user to specify which areas are subject areas using an input means such as a tablet or a mouse is also conceivable. As the specification method, the user may trace the outline of the subject area, or the outlines of the label areas of the labeling image may be used so that the user specifies which label area is a subject area.
[0310]
In FIG. 8B, each subject happens to form a single label area, but depending on the image, even one subject may be divided into a plurality of label areas. For example, if the pixels in the middle of the subject area have a color and brightness similar to those of the background, the pixel values of that part in the difference image are small, so the middle of the subject area is recognized as background. As a result, the subject area may be extracted in pieces divided vertically or horizontally. In that case, subsequent processing such as overlap detection between subjects or the synthesis processing may not work well.
[0311]
Therefore, there is also a method of expanding (dilating) the label areas of the labeling image and integrating label areas that are close to each other into the same label area. A method of using “snakes”, one of the techniques for extracting a region, for the integration is also conceivable. The details of the dilation and snake processing procedures are described in general image processing literature (for example, Shokodo: “Image Processing Handbook”, published in 1987, p. 320, or Kass A., et al., “Snakes: Active Contour Models”, Int. J. Comput. Vision, pp. 321-331 (1988)), and a description thereof is omitted here.
[0312]
In addition, even when it is not used to integrate nearby label areas, there is also a method of expanding the extracted subject area by a certain amount in order to reduce the risk of overlooking an overlap between the first and second subject areas.
[0313]
Note that, here, an example of processing in which expansion and integration are not particularly performed is described.
[0314]
In S5-3, the overlap detection means 8 detects whether or not there is an overlap between subjects from the noise-removed labeling image obtained in S5-2. If no overlap is detected, the process proceeds to S5-4; if an overlap is detected, the process proceeds to S5-5.
[0315]
There are various methods for detecting the overlap. Here, as a simple method, a method using the number of subjects to be photographed / combined and the number of subject regions in the noise-removed labeling image will be described.
[0316]
First, it is assumed that the number of subjects to be photographed / combined is specified in advance by a program, external storage, user input, or the like. For example, the camera has mode settings such as “two-group shooting mode” (number of subjects 2) and “three-group shooting mode” (number of subjects 3), which are set by the user.
[0317]
In this case, the “number of subjects” is the number of persons or the like that each form a single lump region. For example, if each of the first subject and the second subject is one person, the number of subjects is two. If the first subject is one person and the second subject is two people, and the two people are close together so that they form a single lump region, the second subject counts as one and the number of subjects is 2 in total. When the two people stand apart so that they do not form a single lump region, the second subject counts as two and the number of subjects is 3 in total.
[0318]
The number of regions of the subject can be obtained by counting the number of regions having different label values in the labeling image from which noise has been removed (excluding the portion having a labeling value of 0).
[0319]
Then, the overlap detecting means 8 checks whether the specified number of subjects to be photographed / combined matches the number of subject regions in the noise-removed labeling image. If they match, it is determined that the subjects do not overlap; if they do not match, it is determined that the subjects overlap each other.
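This comparison can be sketched as follows, assuming a noise-removed labeling image in which 0 means "no region". The function name subjects_overlap and the parameter expected_subjects are hypothetical.

```python
def subjects_overlap(labels, expected_subjects):
    """Return True when the number of distinct non-zero labeling values
    differs from the number of subjects to be photographed / combined,
    which is treated as an overlap between subjects."""
    region_count = len({v for row in labels for v in row if v != 0})
    return region_count != expected_subjects
```

With two expected subjects, a labeling image containing two separate regions yields False, while one in which the regions have merged into a single region yields True.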
[0320]
The principle of the judgment by the overlap detecting means 8 is as follows. For simplicity of description, the number of subjects to be photographed / combined is assumed to be two here.
[0321]
If the objects do not overlap each other, the area of the first object and the area of the second object should be separated. Therefore, when the subjects do not overlap each other, the number of regions of the subject after noise removal should be two.
[0322]
If the subjects overlap each other, the region of the first subject and the region of the second subject are connected at the overlapping portion and merge into a single region. Therefore, when the subjects overlap each other, the number of regions of the subject after noise removal should be one.
[0323]
Even if the number of subjects to be photographed / combined is three, the same concept applies. If the subjects do not overlap each other, the respective regions are separated, so the number of subject regions after noise removal should be three. If the subjects overlap each other, at least one pair of the three subject regions is merged at the overlapping portion and is no longer separated. Therefore, when the subjects overlap each other, the number of subject regions after noise removal should be one or two.
[0324]
In FIG. 6A and FIG. 7A, each image contains only one person as a subject, so it is assumed that the number of subjects to be photographed / combined is set to two. In FIG. 8C, there are two regions, the human-shaped regions 112 and 113, so the specified number of subjects to be photographed / combined matches the number of subject regions in the noise-removed labeling image. Therefore, in this case, the overlap detection means 8 determines that the subjects do not overlap.
[0325]
As an example of an overlap, consider the case where FIG. 10 is used as the second subject image. FIG. 6A is used as it is for the first subject image. FIG. 11A shows the difference image generated from them. In FIG. 11A, the subjects overlap each other; the overlapping arm portion is an image in which the first subject and the second subject are mixed, and the other subject portions are images in which the first subject and the background, or the second subject and the background, are mixed. FIG. 11B shows the labeling image of FIG. 11A, and FIG. 11C shows the result of noise removal from FIG. 11B.
[0326]
In FIG. 11 (c), the region of the first subject and the region of the second subject have been joined together at the arm, so that only one block of the region 202 remains. In this case, since the number of regions of the subject in the labeling image from which noise has been removed is 1, the number does not match the number of subjects to be photographed / combined, and it is determined that there is an overlap.
[0327]
As an overlap detection method, there is a method of accurately obtaining the contours of the first subject and the second subject, and determining whether the contours overlap each other. If the contour is accurately determined, it is possible to detect the overlap, and it is also possible to perform various processes such as display using the overlap area and avoidance of the overlap.
[0328]
However, it is generally difficult to extract the region of a subject completely and accurately by image processing alone; this generally requires advanced processing using human knowledge or artificial intelligence. There are region extraction methods such as “snakes”, but they are not perfect. If, in addition to the first subject image and the second subject image, a background image is used that shows at least a part common to each subject image and does not show the subject, the region of each subject can be extracted regardless of whether there is an overlap. On the other hand, it is difficult to accurately extract the contour of a subject that may overlap from the first subject image and the second subject image alone.
[0329]
Therefore, here, only the presence or absence of the overlap is detected by the simple method described above.
[0330]
In S5-4, the subject area extracting means 7 determines which of the subject areas in the noise-removed labeling image is the first subject area and which is the corrected second subject area, and the process exits to P60.
[0331]
In the above-described method using the background image, since the difference image between the background image and the first subject image and the difference image between the background image and the second subject image are used, the subject regions can be respectively extracted. The extracted subject areas are a first subject area and a second subject area, respectively. That is, the first subject area and the second subject area can be extracted independently.
[0332]
However, since the background image is not used in the present invention, the first subject region and the second subject region cannot be extracted independently from the difference image between the first subject image and the second subject image; they can be extracted only in a mixed form. In other words, only the two subject areas 112 and 113 are obtained from the noise-removed labeling image shown in FIG. 8C, and the subject area extracting means 7 cannot yet determine which of the two areas 112 and 113 is the first subject area and which is the second subject area.
[0333]
Seen from a different viewpoint, being unable to determine which area is the first subject area and which is the second subject area also means that the subject area extracting means 7 cannot determine whether a given image shows the first or second subject or shows the background part.
[0334]
For example, FIGS. 19A to 19D show the ranges corresponding to the regions 112 and 113 in FIG. 8C extracted from the first subject image (FIG. 6A) and the second subject image (FIG. 7A), respectively. FIG. 19A shows the range of the area 112 of the first subject image, FIG. 19B shows the range of the area 112 of the second subject image, FIG. 19C shows the range of the area 113 of the first subject image, and FIG. 19D shows the range of the area 113 of the second subject image.
[0335]
It is premised that, apart from the background portion, only the first subject is shown in the first subject image and only the second subject is shown in the second subject image. Therefore, either "FIG. 19A is the image of the first subject and FIG. 19D is the image of the second subject" or "FIG. 19C is the image of the first subject and FIG. 19B is the image of the second subject" must be correct.
[0336]
Therefore, in order to distinguish the first subject area from the second subject area, it is sufficient to identify which of FIGS. 19A and 19D and FIGS. 19B and 19C is the image of the subject range.
[0337]
Various methods can be considered to identify which is the image of the subject range. For example, if the features of the subject and the background are known in advance, there is a method of using the feature to distinguish them.
[0338]
For example, if it is known that the subject is a person, there is a high possibility that the image of the subject range contains many skin colors. Accordingly, the image containing the larger amount of skin color may be used as the image of the subject range.
[0339]
There are various methods for recognizing colors. For example, there is a method of obtaining the hue H, the saturation S, and the lightness I from the R, G, and B pixel values. There are various methods for obtaining H, S, and I, which are described in general image processing books (for example, "Image Analysis Handbook", University of Tokyo Press, 1991, pages 485 to 491), so the details are omitted here. As an example, in the "conversion using the HSI hexagonal pyramid color model" method described in that book, the hue H takes values in the range 0 to 2π.
[0340]
More specifically, the subject area extracting means 7 first determines the range of H for standard skin color. Next, the means 7 obtains H for each pixel in the regions shown in FIGS. 19A to 19D and counts the pixel as skin color if its H falls within the standard skin color range. The means 7 then compares the total skin color count of FIGS. 19A and 19D with that of FIGS. 19B and 19C, and the pair with the larger count may be taken as the images of the subject range.
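The skin color count can be sketched as follows. This is only an illustration under stated assumptions: it uses the hexcone HSV conversion from Python's standard `colorsys` module as a stand-in for the conversion in the cited handbook, scales the hue to the range 0 to 2π, and uses a made-up "standard" skin hue range (`SKIN_HUE_RANGE`). It also skips nearly achromatic pixels via `MIN_SATURATION`, a safeguard that the text does not mention. All names are hypothetical.

```python
import colorsys
import math

# Assumed "standard" skin hue range in radians on a 0..2*pi scale (not from the text).
SKIN_HUE_RANGE = (0.0, math.radians(50))
# Assumed threshold: grey pixels have no meaningful hue, so they are not counted.
MIN_SATURATION = 0.1

def is_skin(r, g, b, hue_range=SKIN_HUE_RANGE):
    """Return True if an 8-bit RGB pixel falls inside the assumed skin hue range."""
    h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    if s < MIN_SATURATION:
        return False
    lo, hi = hue_range
    return lo <= h * 2.0 * math.pi <= hi

def count_skin_pixels(pixels):
    """Count skin-coloured pixels in an iterable of (R, G, B) tuples."""
    return sum(1 for (r, g, b) in pixels if is_skin(r, g, b))

def pick_subject_pair(fig_a, fig_b, fig_c, fig_d):
    """Compare the pair (A, D) with the pair (B, C); ties go to "AD" arbitrarily."""
    ad = count_skin_pixels(fig_a) + count_skin_pixels(fig_d)
    bc = count_skin_pixels(fig_b) + count_skin_pixels(fig_c)
    return "AD" if ad >= bc else "BC"

skin, leaf, sky = (224, 172, 105), (30, 160, 40), (120, 170, 230)
choice = pick_subject_pair([skin] * 5 + [leaf] * 2, [leaf] * 7, [sky] * 4, [skin] * 4)
print(choice)
```

The pixel lists here stand for the contents of the ranges in FIGS. 19A to 19D; in practice they would be read from the two subject images using the region masks.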
[0341]
As a method of identifying which is the image of the subject range using the feature amount, there is a method of identifying whether the image is similar to the surrounding background portion, for example, in addition to using the skin color.
[0342]
In this case, the subject area extracting means 7 first obtains a feature amount (described later) within the subject area from each of the first subject image and the second subject image. Next, the means 7 obtains the feature amount of the area surrounding the subject area (for example, the surrounding 20 dots). Since the surroundings of the subject area are background, and the background portions have been corrected so as to coincide, it is sufficient to use only one of the images for this. The means 7 may then judge the image whose feature amount is close to that of the background portion to be the image of the background portion, and the image whose feature amount is not close to be the image of the subject range.
[0343]
As the feature amount, a texture or the like can be used in addition to the R, G, and B pixel values, the hue H, the saturation S, and the lightness I described above. Various methods have been devised for obtaining a texture as a feature amount; one example uses the histogram of the lightness I. For the pixels in a certain area, the subject area extracting means 7 takes the histogram P(i) (i = 0, 1, ..., n−1) of the lightness I, normalized so that its total sum is 1.0, and obtains the mean μ, the variance (σ^2), the skewness Ts, and the kurtosis Tk by the following equation. Note that (X^Y) means X raised to the Y-th power.
[0344]

The above four values are used as feature amounts.
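As an illustration, the four histogram statistics can be computed as in the sketch below. It assumes the usual moment definitions (mean Σ i·P(i), variance Σ (i−μ)^2·P(i), skewness and kurtosis as the third and fourth central moments divided by σ^3 and σ^4); the exact equation intended here may differ, for example in how the kurtosis is normalized. The function names are hypothetical.

```python
import math

def lightness_histogram(values, n=256):
    """Histogram P(i), i = 0..n-1, of integer lightness values, normalized to sum to 1.0."""
    hist = [0.0] * n
    for v in values:
        hist[v] += 1.0
    total = float(len(values))
    return [c / total for c in hist]

def histogram_features(p):
    """Return (mean, variance, skewness, kurtosis) of a normalized histogram.

    Assumed definitions: standard central moments, with skewness and kurtosis
    normalized by sigma^3 and sigma^4.
    """
    mean = sum(i * pi for i, pi in enumerate(p))
    var = sum((i - mean) ** 2 * pi for i, pi in enumerate(p))
    sigma = math.sqrt(var)
    if sigma == 0.0:
        # Perfectly flat area: higher moments are undefined; 0.0 is a convention of this sketch.
        return mean, var, 0.0, 0.0
    skew = sum((i - mean) ** 3 * pi for i, pi in enumerate(p)) / sigma ** 3
    kurt = sum((i - mean) ** 4 * pi for i, pi in enumerate(p)) / sigma ** 4
    return mean, var, skew, kurt

# A tiny area with lightness levels 0..4: three dark pixels and one bright pixel.
features = histogram_features(lightness_histogram([0, 0, 0, 4], n=5))
print(features)
```

The resulting tuple can serve as the feature vector that is compared between the subject range and the surrounding background.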
[0345]
Other examples of feature amounts include the co-occurrence matrix, difference statistics, the run-length matrix, the power spectrum, second-order statistics of these, and higher-order statistics. These are described in books (for example, "Image Analysis Handbook", University of Tokyo Press, 1991, pages 517 to 538), so the details are omitted here.
[0346]
Thus, in the case of FIG. 19, it is assumed that FIGS. 19A and 19D are determined to be images of the subject range by the subject region extracting unit 7. Then, the area 112 becomes the first subject area, and the area 113 becomes the second subject area.
[0347]
Note that this processing is performed when it is found in S5-3 that there is no overlap between the subjects, so the first and second subject areas should be completely separated as shown in FIG. 8C, and should not be merged as shown in FIG. 11C.
[0348]
In S5-5, since the number of subjects to be photographed/combined did not match the number of subject regions in the noise-removed labeling image in S5-3, the subject region in the noise-removed labeling image is treated as a region in which the first subject region and the second subject region are integrated (hereinafter referred to as the "subject integrated region"), and the process exits to P60.
[0349]
In this case, the extraction of the first subject area and the second subject area independently by the subject area extracting means 7 is abandoned and is processed as an integrated area. As described above, when the contours of the first subject and the second subject can be accurately obtained, the processing of S5-4 may be performed without performing the processing of S5-3 and S5-5.
[0350]
In the above-described processing of S5-1 to S5-5, the subject area extraction processing of S5 in FIG. 5 is performed.
[0351]
FIG. 20 is a flowchart illustrating one method of the process of S6 in FIG. 5, that is, a process related to overlap. Another processing method regarding the overlap will be described later with reference to FIGS.
[0352]
In S6-1 after P60, the overlap warning means 11 determines whether there is an overlap from the information obtained from the overlap detection means 8 (S5). If there is an overlap, the process proceeds to S6A-2; otherwise, the process exits to P70.
[0353]
In S6A-2, the overlap warning unit 11 warns the user (photographer) and / or the subject that there is an overlap between the first subject and the second subject, and goes to P70.
[0354]
There are various ways to notify the warning.
[0355]
For example, when the notification is performed using the composite image, the overlapping subject areas may be displayed so as to be conspicuously superimposed on the composite image. FIG. 12 is an example for explaining this.
[0356]
In FIG. 12, the area 202 in FIG. 11C, that is, the area where the first subject and the second subject overlap each other, is displayed translucently on the composite image. It is even better to apply a filter of a conspicuous color such as red to the area 202 (as if colored cellophane were laid over the area 202). Alternatively, the area 202 or its outline may be displayed blinking. These combining methods will be described later with reference to FIG.
[0357]
FIG. 12 also shows an example in which a warning is issued in text. In the upper part of FIG. 12, a warning window is superimposed on the composite image, and the message "Subject is overlapping!" is displayed in it. This window may also be given a conspicuous color scheme or made to blink.
[0358]
Overwriting of the composite image may be performed by the superimposed image generation means 9 or by the superimposed image display means 10 according to an instruction of the overlap warning means 11. When the warning window blinks, it may be necessary to keep the original composite image. It is often better to read and give.
[0359]
If these warnings are displayed on the monitor 141 shown in FIG. 3A, the overlapping state can be checked while shooting, which is convenient. There is also the advantage that the photographer can give the subject (person (2)) an instruction that eliminates the overlap, such as "You are overlapping, so move to the right."
[0360]
Note that the next captured image may come to be used as the second subject image when the user instructs recording (writing to memory) of the second subject image using a menu or the shutter button, or, as described above, when a dedicated mode is used in which the second subject image is captured repeatedly like a moving image and the corrected superimposed image is displayed almost in real time.
[0361]
Although the monitor 141 in FIG. 3A faces the photographer, if the monitor can be turned toward the subject, the subject can also check the overlapping state and can move voluntarily to eliminate the overlap even without instructions from the photographer. A monitor separate from the monitor 141 may also be provided so that the subject can see it.
[0362]
If the processing from S3 to S7 in FIG. 5 is repeated in the dedicated mode as described above, the current overlapping state can be known almost in real time, which makes shooting convenient and efficient. The processing from S3 to S7 in FIG. 5 does not take much time if a sufficiently fast CPU or logic circuit is used. In actual use, if the processing can be repeated about once per second or faster, the display can be regarded as almost real time.
[0363]
When the corrected image is generated in S4 with the first subject image as the reference image, the composite image is also based on the first subject image, so the range of the background shown on the monitor 141 is that of the first subject image. When the above repetitive processing is performed in real time and the camera moves, the range of the background being shot changes, but the image being shot is the second subject image, not the first subject image. The range of the background shown on the monitor 141 therefore remains that of the first subject image. The user may find it uncomfortable that the range currently being shot is not reflected on the monitor 141.
[0364]
On the other hand, when the second subject image is used as the reference image, the range of the background shown on the monitor 141 is that of the second subject image. When the above repetitive processing is performed in real time and the camera moves, the range of the background being shot changes; since the image being shot is the second subject image (the reference image), the range of the background shown on the monitor 141 changes with it. As a result, the range being shot is reflected on the monitor 141, which is less uncomfortable for the user.
[0365]
In addition, when the overlapped subject area is displayed superimposed on the composite image, the user can see the degree of overlap in relation to the frame of the composite image. If the user can judge from this that, however the subject moves, either an overlap will occur or the subject will go out of the frame, the user can decide that it is better to start over from capturing the first subject image.
[0366]
A warning can also be given by turning on or blinking the lamp 142 shown in FIG. 3A. Since it is a warning, a red or orange lamp is easy to understand. A blinking lamp generally has the advantage that it is easily noticed even when the photographer is not paying attention to the monitor 141.
[0367]
Alternatively, the superimposed display of the subject area shown in FIG. 12 may be omitted, and only the presence of an overlap may be reported by a warning message or a lamp. In this case, the degree of overlap is not immediately apparent, but as long as the presence of an overlap is known, the user can watch whether the warning disappears as the subject moves, so the purpose of obtaining a composite image without overlap is still achieved. Reporting the overlap only by a warning message or a lamp therefore has the advantage that the processing for displaying the overlapping portion can be omitted.
[0368]
Further, in FIG. 3A the lamp 142 is arranged so that only the photographer can see it, but needless to say, the lamp 142 may be attached to the front side of the main body 140 in FIG. The effect is the same as when the subject can see the monitor.
[0369]
Although not shown in FIG. 3A, if there is another means for checking the image besides the monitor 141, such as a finder, the same warning as on the monitor 141 may be displayed there, or a lamp may be built into the finder to give the notification.
[0370]
Although not shown in FIGS. 3A and 3B, a warning notification may be made using the speaker 80 of FIG. If there is an overlap, a warning buzzer is sounded, or a sound such as "overlap" is output to give a warning notification. In this case, the same effect as the lamp can be expected. When a speaker is used, there is little directivity unlike light, so that there is an advantage that both the photographer and the subject can know the overlapping state with one speaker.
[0371]
By the above processing from S6-1 to S6A-2, the processing related to the overlap of S6 in FIG. 5 can be performed.
[0372]
FIG. 21 is a flowchart illustrating another method of the process of S6 in FIG. 5, that is, a process related to overlap.
[0373]
In S6-1 after P60, the photo opportunity notifying unit 12 determines whether there is an overlap based on the information obtained from the overlap detection unit 8 (S5). If there is no overlap, the process proceeds to S6B-2; if there is an overlap, the process exits to P70.
[0374]
In S6B-2, the photo opportunity notifying unit 12 notifies the user (photographer) and / or the subject that there is no overlap between the first subject and the second subject, and the process goes to P70.
[0375]
In practice, this notification is most commonly used not so much to report that there is no overlap as to report a secondary consequence of the absence of overlap, specifically that now is a photo opportunity for recording the second subject. In that case, the notification is mainly addressed to the photographer.
[0376]
As for the method of notifying of a photo opportunity, the method described with reference to FIG. 20 can be used almost as it is. For example, the message in FIG. 12 may be changed to "shutter chance!" In addition, although the colors and the contents of sounds to be output are slightly changed for lamps and speakers, they can be similarly used as a notification method.
[0377]
If the photographer knows that it is a photo opportunity, the photographer can release the shutter and shoot/record the subjects without overlap, and the subject can also prepare for the shutter to be released (for example, by adjusting the direction of gaze and facial expression).
[0378]
With the above processing from S6-1 to S6B-2, the processing related to the overlap of S6 in FIG. 5 can be performed.
[0379]
FIG. 22 is a flowchart for explaining yet another method of the process of S6 in FIG. 5, that is, the process regarding the overlap.
[0380]
In S6-1 after P60, the automatic shutter unit 13 determines whether there is an overlap based on the information obtained from the overlap detection unit 8 (S5). If there is no overlap, the process proceeds to S6C-2; if there is an overlap, the process exits to P70.
[0381]
In S6C-2, the automatic shutter unit 13 determines whether or not the shutter button has been pressed. If the shutter button has been pressed, the process proceeds to S6C-3, and if not, the process exits to P70.
[0382]
In S6C-3, the automatic shutter unit 13 instructs the second subject image acquisition unit 3 to record the second subject image, and the process exits to P70. The second subject image acquiring unit 3 records the captured image in the main memory 74, the external memory 75, or the like according to the instruction.
[0383]
As a result, when the shutter button is pressed while the subjects do not overlap, the captured image is recorded automatically. At the same time, this prevents an image in which the subjects overlap from being recorded by mistake.
[0384]
In practice, the photographer watches the state of the subject and presses the shutter button when it seems acceptable to record the captured image, but the image is not necessarily recorded at that moment: if there is an overlap, it is not recorded. That is, when the automatic shutter unit 13 determines that there is an overlap, it prohibits recording of the second subject image, so that the second subject image acquiring unit 3 does not perform the recording operation even if the photographer presses the shutter button.
[0385]
When the image is not recorded, it is better to notify the photographer or others, using a display, a lamp, a speaker, or the like, that the shutter button was pressed but no image was recorded.
[0386]
When the shutter button is pressed again when there is no overlap due to movement of the subject or the like, recording is performed this time. The photographer or the like may be notified by a notification means such as a display, a lamp, or a speaker so that the user can recognize that the recording has been made.
[0387]
If the shutter button is held down rather than pressed each time, the image is recorded automatically at the moment the overlap disappears. However, at that moment the subject may not yet be still, so the captured image may be blurred, or the subject may not be ready to be photographed (for example, facing elsewhere). In that case, it is better to wait a short time before recording automatically.
[0388]
With the above processing of S6-1 to S6C-3, the processing related to the overlap of S6 in FIG. 5 can be performed.
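The decision flow of S6-1 to S6C-3, including the short wait suggested for the case where the shutter button is held down, can be sketched as follows. The class name `AutoShutter`, the status strings, and the `settle_frames` parameter (how many overlap-free frames to wait before recording) are assumptions of this example, not names from the embodiment.

```python
class AutoShutter:
    """Sketch of the automatic shutter decision, called once per processed frame."""

    def __init__(self, record, settle_frames=0):
        self.record = record                # callback that stores the second subject image
        self.settle_frames = settle_frames  # assumed waiting period, in frames
        self._clear_frames = 0              # consecutive frames without overlap

    def step(self, has_overlap, shutter_pressed):
        # S6-1: with an overlap, recording is prohibited even if the button is pressed.
        if has_overlap:
            self._clear_frames = 0
            return "blocked" if shutter_pressed else "overlap"
        self._clear_frames += 1
        # S6C-2: no overlap, but the shutter button is not pressed.
        if not shutter_pressed:
            return "ready"
        # Optional wait so that the subject can settle after the overlap disappears.
        if self._clear_frames <= self.settle_frames:
            return "settling"
        # S6C-3: instruct recording of the second subject image.
        self.record()
        return "recorded"

recorded = []
shutter = AutoShutter(lambda: recorded.append("image"), settle_frames=1)
# The button is held down while the subject moves out of the overlap.
statuses = [shutter.step(True, True), shutter.step(False, True), shutter.step(False, True)]
print(statuses, recorded)
```

The returned status could drive the display, lamp, or speaker notifications described above. A real device would also stop after one recording while the button stays held, which this sketch does not model.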
[0389]
Note that the processes in FIGS. 20 to 22 are not mutually exclusive and may be combined in any way. As an example of a combination, the following use scene is possible.
[0390]
When the subjects overlap, a warning that "the subjects overlap" is issued. Even if the shutter button is pressed at this time, the captured image is not recorded. The subject moves in response to the warning, and when the overlap disappears, the photo opportunity lamp turns on. When the shutter button is pressed while the photo opportunity lamp is on, the captured image is recorded.
Next, FIG. 23 is a flowchart illustrating one method of the process of S7 of FIG. 5, that is, a process of generating a superimposed image.
[0391]
In S7-1 after P70, the superimposed image generation means 9 sets the first pixel position of the superimposed image to be generated as the current pixel, and the process proceeds to S7-2. The first pixel position often starts from the upper left corner, for example.
[0392]
The “pixel position” indicates a specific position on an image, and is often expressed in an XY coordinate system with the origin at the upper left corner, the + X axis in the right direction, and the + Y axis in the downward direction. The pixel position corresponds to an address on the memory representing the image, and the pixel value is the value of the memory at that address.
[0393]
In S7-2, the superimposed image generation means 9 determines whether or not the current pixel position exists. If the current pixel position exists, the process proceeds to S7-3; otherwise, the process exits to P80.
[0394]
In S7-3, the superimposed image generation means 9 determines whether the current pixel position is within the subject integrated area. If it is, the process proceeds to S7-4; otherwise, the process proceeds to S7-5.
[0395]
Whether the current pixel position is within the subject integrated area can be determined by checking whether the overlap detection means 8 has obtained a subject integrated area (S5-5) and whether the pixel at the current position in the subject integrated area image is black (0).
[0396]
In S7-4, the superimposed image generating means 9 generates a synthesized pixel according to the setting and writes it as a pixel value at the current pixel position of the superimposed image.
[0397]
The setting specifies what kind of superimposed image is to be generated: for example, whether the first subject is composited translucently as shown in FIG. 9B, whether the opaque first subject is composited by overwriting, as shown in FIG., or whether both the first subject and the second subject are composited translucently. Since the inside of the subject integrated area is being handled here, the setting in effect concerns the composition ratio (transmittance) of that area.
[0398]
Once the composition ratio (transmittance) is determined, the composite pixel value (P1 × (1−A) + Pf2 × A) may be obtained from the pixel value P1 at the current pixel position of the first subject image, the pixel value Pf2 at the current pixel position of the corrected second subject image obtained from the corrected image generating means 5 (S4), and the predetermined transmittance A (a value between 0.0 and 1.0).
[0399]
For example, in order to make the subject integrated area as shown in FIG. 12 translucent, the transmittance A may be set to 0.5.
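The composite pixel value P1 × (1−A) + Pf2 × A can be written directly in code. The sketch below assumes that pixels are (R, G, B) tuples of 8-bit integers and that the result is rounded to the nearest integer; the function name `blend_pixel` is hypothetical.

```python
def blend_pixel(p1, pf2, a):
    """Return P1 * (1 - A) + Pf2 * A for each channel of two (R, G, B) pixels.

    p1  : pixel of the first subject image
    pf2 : pixel of the corrected second subject image
    a   : transmittance A, between 0.0 and 1.0
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("transmittance must be between 0.0 and 1.0")
    return tuple(round(c1 * (1.0 - a) + c2 * a) for c1, c2 in zip(p1, pf2))

# A = 0.5 gives the translucent mix used for the subject integrated area.
half = blend_pixel((200, 100, 0), (0, 100, 200), 0.5)
print(half)  # (100, 100, 100)
```

With A = 0.0 the pixel of the first subject image is kept unchanged, and with A = 1.0 the pixel of the corrected second subject image replaces it.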
[0400]
In S7-5, reached when it is determined in S7-3 that the current pixel position is not within the subject integrated area, the superimposed image generation means 9 determines whether the current pixel position is within the first subject area. If it is, the process proceeds to S7-6; otherwise, the process proceeds to S7-7.
[0401]
Whether the current pixel position is within the first subject region can be determined from whether the pixel at the current position in the first subject region image obtained from the subject region extracting means 7 (S5) is black (0). If the subject integrated region exists, it is known that the first subject region does not exist, so S7-5 may be skipped and the process may proceed directly to S7-7.
[0402]
If the process is not particularly changed depending on whether the region is the first subject region, S7-5 and S7-6 may be omitted, and the process may proceed from S7-3 to S7-7.
[0403]
In S7-6, the superimposed image generation means 9 generates a synthesized pixel according to the setting, and writes it as a pixel value at the current pixel position of the superimposed image. The processing here is the same as S7-4, except that the subject integrated area (image) is changed to the first subject area (image).
[0404]
If the first subject is composited translucently as shown in FIG. 9B, the transmittance of the first subject may be set to 0.5; if the first subject is composited as is by overwriting, as shown in FIG., the transmittance of the first subject may be set to 0.0.
[0405]
In S7-7, reached when it is determined in S7-5 that the current pixel position is not within the first subject area, the superimposed image generating means 9 determines whether the current pixel position is within the second subject area. If it is, the process proceeds to S7-8; otherwise, the process proceeds to S7-9. The processing here is the same as S7-5, except that the first subject area is replaced by the second subject area.
[0406]
In S7-8, the superimposed image generating means 9 generates a synthesized pixel according to the setting and writes it as a pixel value at the current pixel position of the superimposed image. The processing here is the same as S7-6, except that the first subject area is changed to the second subject area.
[0407]
In S7-9, reached when it is determined in S7-7 that the current pixel position is not within the second subject area, the superimposed image generating means 9 writes the pixel value at the current pixel position of the first subject image (reference image) as the pixel value at the current pixel position of the superimposed image. That is, the current pixel position in this case is not within the subject integrated area, the first subject area, or the second subject area, and therefore belongs to the background portion.
[0408]
In S7-10, the superimposed image generating means 9 sets the current pixel position to the next pixel position, and the process returns to S7-2.
[0409]
In the processing of S7-1 to S7-10 described above, the processing relating to the superimposed image generation of S7 in FIG. 5 is performed.
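The pixel loop of S7-1 to S7-10 can be sketched as below. The sketch makes several assumptions: images are 2D lists of (R, G, B) tuples, region masks are 2D lists in which a truthy value means "inside the region" (the polarity of the black (0) test in the text is not modelled), the first subject image is the reference image, and a single weight A of the corrected second subject image is used in every branch. The function and parameter names are hypothetical.

```python
def blend(p1, pf2, a):
    """P1 * (1 - A) + Pf2 * A per channel, rounded to integers."""
    return tuple(round(c1 * (1.0 - a) + c2 * a) for c1, c2 in zip(p1, pf2))

def generate_superimposed(img1, img2c, integrated=None, region1=None, region2=None,
                          a_integrated=0.5, a_first=0.5, a_second=1.0):
    """Build the superimposed image pixel by pixel, starting from the upper left."""
    h, w = len(img1), len(img1[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):          # S7-1, S7-2, S7-10: visit every pixel position
        for x in range(w):
            p1, pf2 = img1[y][x], img2c[y][x]
            if integrated is not None and integrated[y][x]:   # S7-3 -> S7-4
                out[y][x] = blend(p1, pf2, a_integrated)
            elif region1 is not None and region1[y][x]:       # S7-5 -> S7-6
                out[y][x] = blend(p1, pf2, a_first)
            elif region2 is not None and region2[y][x]:       # S7-7 -> S7-8
                out[y][x] = blend(p1, pf2, a_second)
            else:                                             # S7-9: background
                out[y][x] = p1
    return out

grey = (100, 100, 100)
img1 = [[grey, (200, 0, 0), grey, grey]]          # first subject at x = 1
img2c = [[grey, grey, (0, 0, 200), (90, 90, 90)]]  # second subject at x = 2
result = generate_superimposed(img1, img2c,
                               region1=[[0, 1, 0, 0]], region2=[[0, 0, 1, 0]])
print(result[0])
```

With the default weights, the first subject is translucent and the second subject is overwritten, which corresponds to the kind of composite described for FIG. 9B; setting `a_first=0.0` overwrites the first subject as well.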
[0410]
In the above processing, the first subject image and the corrected second subject image are processed in S7-4, S7-6, and S7-9. It is also conceivable to copy all pixels of the first subject image or the corrected second subject image, and then process only the first subject region and / or the second subject region in the processing of each pixel position. The whole pixel copy simplifies the processing procedure, but may slightly increase the processing time.
[0411]
Here, the size of the composite image is set to the size of the reference image, but it can also be made smaller or larger. For example, when the corrected image is generated in FIG. 7C, part of the corrected image is cut off. If the corrected image is instead made larger so that nothing is cut off, the retained part can be used in compositing to enlarge the composite image and thereby extend the background. This makes so-called panoramic image synthesis possible.
[0412]
FIG. 9B is a superimposed image in which only the first subject region is synthesized translucently. FIG. 9C shows a superimposed image in which only the second subject area is synthesized to be translucent. FIG. 9A is a superimposed image generated by overwriting both without making them both translucent. FIG. 12 shows a superimposed image obtained by combining both images with translucency.
[0413]
Which combination method is used depends on the purpose, and it is sufficient that the user can select a combination method according to the purpose at that time.
[0414]
For example, when the first subject image has already been shot and recorded and the second subject image is to be shot without overlap, a detailed image of the first subject is not necessary; it is only necessary to know whether it overlaps with the second subject, so translucent compositing is sufficient for the first subject. On the other hand, the shutter cannot be released at the right moment without seeing details such as the expression of the second subject at the time of shooting, so it is better to composite the second subject by overwriting rather than translucently. The combining method shown in FIG. 9B is therefore suitable.
[0415]
Further, as described above, if it is less uncomfortable for the background range of the composite image to be the background range of the image being shot (the second subject image), it is better to use the second subject image as the reference image and to composite as shown in FIG. 9B, which makes it easy to see which subject is being shot.
[0416]
Some users may find it easier to shoot when they can see the areas of the subjects being combined; for such users, it may be better to composite both subjects translucently during shooting, or to make only the second subject translucent during shooting.
[0417]
Further, when the second subject has also been shot and recorded and a final composite image is to be created from the first subject image and the second subject image, neither subject may be translucent; both must be composited by overwriting. The combining method shown in FIG. 9A is therefore suitable.
[0418]
If the subject area obtained from the subject area extracting means 7 (S5) has been expanded as described above, the surrounding background portion is composited together with the subject. Since the corrected image generating means 5 (S4) has corrected the background portions so that they coincide, the image does not become unnatural at the composite boundary even if the extracted subject region is somewhat larger than the actual contour of the subject and includes part of the background.
[0419]
When the subject region has been expanded, the composite boundary can be made to look even more natural by changing the transparency gradually near the boundary, either in a band that includes the outside of the subject region or in a band only inside it. For example, the proportion of the background image is increased toward the outside of the subject region, and the proportion of the subject image is increased toward the inside.
[0420]
As a result, even if the background is slightly misaligned near the composite boundary because of a correction error, the unnaturalness can be made inconspicuous. The same effect is obtained when the cause is not a correction error but an error in the extraction of the subject area itself, or a change in the background caused by the difference in shooting time (for example, a tree moving in the wind or something passing through the scene).
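One way to vary the transparency gradually near the composite boundary is to let the weight of the subject image fall off with the distance from the subject region. The sketch below is only one possible realization: it fades over a band outside the region, measures distance in 4-connected steps with a breadth-first search, and uses a linear ramp controlled by a hypothetical `feather` parameter. None of these choices is specified in the text.

```python
from collections import deque

def subject_weights(mask, feather=2):
    """Per-pixel weight of the subject image: 1.0 inside the region,
    falling linearly to 0.0 over `feather` pixels outside it."""
    h, w = len(mask), len(mask[0])
    dist = [[None] * w for _ in range(h)]
    queue = deque()
    # Start the search from every pixel inside the subject region (distance 0).
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                dist[y][x] = 0
                queue.append((y, x))
    # Breadth-first search gives each outside pixel its step distance to the region.
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))

    def weight(d):
        if d is None:            # no subject region anywhere in the mask
            return 0.0
        return max(0.0, 1.0 - d / (feather + 1))

    return [[weight(d) for d in row] for row in dist]

weights = subject_weights([[0, 0, 0, 1, 1]], feather=2)
print(weights[0])
```

Each output pixel would then be mixed as subject × weight + background × (1 − weight), so the proportion of the background grows toward the outside of the region.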
[0421]
Needless to say, the object of the present invention can also be achieved by supplying a system or an apparatus with a storage medium storing the program code of software that realizes the functions of the above-described embodiments, and having a computer (or CPU or MPU) of the system or apparatus read and execute the program code stored in the storage medium.
[0422]
In this case, the program code itself read from the storage medium realizes the function of the above-described embodiment, and the storage medium storing the program code constitutes the present invention.
[0423]
As a storage medium for supplying the program code, for example, a floppy disk, a hard disk, an optical disk, a magneto-optical disk, a magnetic tape, a nonvolatile memory card, and the like can be used.
[0424]
Further, the program code may be downloaded from another computer system to the main storage 74 or the external storage 75 of the image synthesizing apparatus via a transmission medium such as a communication network.
[0425]
Needless to say, the functions of the above-described embodiments are realized not only when the computer executes the read program code, but also when an OS (Operating System) running on the computer performs part or all of the actual processing based on the instructions of the program code, and the functions are realized by that processing.
[0426]
Needless to say, the functions of the above-described embodiments are also realized when the program code read from the storage medium is written into a memory provided in a function expansion board inserted into the computer or in a function expansion unit connected to the computer, and a CPU included in the function expansion board or function expansion unit then performs part or all of the actual processing based on the instructions of the program code.
[0427]
When the present invention is applied to the storage medium, the storage medium stores program codes corresponding to the flowcharts described above.
[0428]
The present invention is not limited to the above embodiments, and various modifications can be made within the scope shown in the claims.
[0429]
[Effects of the invention]
As described above, the image synthesizing apparatus according to the present invention includes: background correction amount calculating means for calculating a correction amount consisting of any one or a combination of a relative movement amount, a rotation amount, an enlargement/reduction ratio, and a distortion correction amount of the background portion between a first subject image, which is an image including a background and a first subject, and a second subject image, which is an image including at least a part of the background and a second subject, or for reading out a correction amount calculated in advance; and superimposed image generating means for taking either the first subject image or the second subject image as a reference image, correcting the other image by the correction amount obtained from the background correction amount calculating means so that the background portions other than the subjects at least partially overlap, and generating an image in which the reference image and the corrected image are superimposed.
[0430]
As a result, displacement and distortion of the background between the two images can be corrected before the images are combined, so the backgrounds outside the subjects almost coincide and the combined result does not become unnatural. For example, when mainly only the subject area is to be combined, even if the extraction or specification of the subject area is slightly inaccurate, the background around the subject area is not displaced relative to the image it is combined into. The inside and outside of the subject area are therefore combined as a continuous scene, which reduces the unnatural appearance.
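To make the notion of a correction amount concrete, the sketch below applies a movement amount, a rotation amount, and an enlargement/reduction ratio to a single pixel coordinate. It is an illustrative assumption, not the embodiment's procedure: the function name, the parameter names, and the order of operations (rotate and scale about a center, then translate) are hypothetical, and distortion correction is omitted.

```python
import math


def correct_point(x, y, dx, dy, angle_deg, scale, cx=0.0, cy=0.0):
    """Map a coordinate of the image being corrected onto the reference image.

    Assumed model: rotate by angle_deg and scale about (cx, cy),
    then translate by (dx, dy).
    """
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    rx, ry = x - cx, y - cy
    new_x = cx + scale * (c * rx - s * ry) + dx
    new_y = cy + scale * (s * rx + c * ry) + dy
    return new_x, new_y


# Pure scaling by 2 followed by a shift of (5, -1).
shifted = correct_point(2, 3, dx=5, dy=-1, angle_deg=0, scale=2)
# Pure rotation by 90 degrees about the origin.
rotated = correct_point(1, 0, dx=0, dy=0, angle_deg=90, scale=1)
```

Applying such a mapping to every pixel of the non-reference image is one way a corrected image whose background lines up with the reference image could be produced.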
[0431]
In addition, even if the subject area is extracted accurately in pixel units, the related-art method produces unnaturalness at a level finer than one pixel, as described in the section on the problem to be solved. In the present invention, the images are combined after displacement and distortion of the background portion have been eliminated, so the pixels around a contour pixel correspond to the same background position and connect almost naturally when combined. Unnaturalness at a level finer than one pixel is thus prevented or reduced.
[0432]
In addition, because background displacement and distortion are corrected before the images are combined, the camera need not be fixed with a tripod or the like when the first and second subject images are captured, which makes shooting easier.
[0433]
As described above, the image synthesizing apparatus according to the present invention is characterized in that it includes imaging means for capturing a subject or a landscape, and the first subject image or the second subject image is generated based on the output of the imaging means.
[0434]
With this, the image synthesizing apparatus that generates the superimposed image itself includes the imaging means, so the superimposed image can be generated on the spot where the user has photographed the subject or landscape, which improves convenience for the user. Further, if the generated superimposed image reveals a problem such as overlapping subjects, the photograph can be retaken on the spot.
[0435]
As described above, the image synthesizing device according to the present invention is characterized in that, of the first subject image and the second subject image, the one taken later is used as the reference image.
[0436]
In this manner, the displayed composite image covers the background range of the second subject image, which has just been captured or, when the composite image is displayed in real time, is currently being captured, so the photographer does not feel a sense of incongruity.
[0437]
As described above, the image synthesizing apparatus according to the present invention is characterized in that the reference image and the corrected image are superimposed at a predetermined transmittance in the superimposed image generating means.
[0438]
In the above configuration, superimposing at a predetermined transmittance includes varying the transmittance according to pixel position. For example, when only the subject area of the corrected image is superimposed on the reference image, the inside of the subject area is made opaque (that is, the subject in the corrected image is used as it is), and the images are layered so that the proportion of the reference image increases with distance from the subject area. Then, even if the subject area, that is, the extracted contour of the subject, is wrong, the surrounding pixels change gradually from the corrected image to the reference image, which makes the error less noticeable.
[0439]
Superimposing at a predetermined transmittance also includes superimposing only the subject area at half transmittance. This lets the user or the subject easily tell which part of the displayed image is the previously captured part to be combined and which part is the image being captured now. As a result, it is also easy to tell when the subjects overlap each other.
[0440]
As described above, the image synthesizing apparatus according to the present invention is characterized in that the superimposed image generating means generates an area having a difference in the difference image between the reference image and the corrected image as an image having pixel values different from the original pixel values.
This makes it easier for the user to identify the parts that do not match between the two images. For example, the areas of the first and second subjects are extracted as areas having a difference in the difference image, because in each such area one of the reference image and the corrected image shows the subject and the other shows the background. By making the extracted area translucent, inverting it, or giving it pixel values of a conspicuous color, the subject area becomes easy for the user to recognize.
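The idea of marking the areas that differ between the reference image and the corrected image can be sketched as follows for small grayscale images stored as nested lists. The function names, the fixed threshold, and the two highlight modes are assumptions made for illustration only.

```python
def difference_mask(reference, corrected, threshold):
    """True where the two grayscale images differ by more than `threshold`."""
    return [
        [abs(r - c) > threshold for r, c in zip(ref_row, cor_row)]
        for ref_row, cor_row in zip(reference, corrected)
    ]


def highlight(image, mask, mode="invert", max_value=255):
    """Replace masked pixels with an inverted or conspicuous value."""
    out = []
    for row, mask_row in zip(image, mask):
        new_row = []
        for value, differs in zip(row, mask_row):
            if not differs:
                new_row.append(value)              # matching area: keep as is
            elif mode == "invert":
                new_row.append(max_value - value)  # inverted pixel value
            else:
                new_row.append(max_value)          # conspicuous constant value
        out.append(new_row)
    return out


reference = [[10, 10, 10], [10, 200, 10]]
corrected = [[10, 12, 10], [10, 10, 10]]
mask = difference_mask(reference, corrected, threshold=5)
marked = highlight(reference, mask)
```

The small threshold keeps the minor difference at the top center (10 versus 12) from being marked, while the pixel where only one image shows the subject is inverted.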
[0441]
As described above, the image synthesizing apparatus according to the present invention has the subject area extracting means for extracting the first subject area and the second subject area from the difference image between the reference image and the corrected image. The superimposed image generating means superimposes the reference image or the corrected image and the image in the area obtained from the subject area extracting means, instead of superimposing the reference image and the corrected image.
[0442]
This produces an effect that only the subject area in the corrected subject image can be combined on the reference image. Alternatively, it can be said that only the subject area in the reference image can be synthesized on the corrected subject image.
[0443]
Also, when this is combined with the process of changing the transmittance of the subject area in the superimposed image generating means, the user can easily see which area is to be combined, and any overlap between the subjects becomes even easier to recognize. This in turn assists shooting so that overlap does not occur. If there is overlap, it is better to move the subject or the camera and shoot again without overlap; assistance here means, for example, making it easier for the user to recognize whether overlap occurs, and giving the user material (in this case, the composite image) for judging how far the subject or camera should be moved to eliminate it.
[0444]
As described above, in the image synthesizing apparatus according to the present invention, the subject region extracting means extracts an image in the region of the first subject and an image in the region of the second subject from the first subject image or the corrected first subject image, extracts an image in the region of the first subject and an image in the region of the second subject from the second subject image or the corrected second subject image, and selects the image of the first subject and the image of the second subject based on skin color.
[0445]
As a result, there is an effect that the subject of the extracted image portion can be automatically and easily determined.
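As one way to picture this selection, assuming (as in the corresponding claim) that the criterion is skin color: for a given region, the patch cut from one image shows a person and the patch cut from the other shows background, so the patch with the larger share of skin-colored pixels can be taken as the subject. The sketch below is hypothetical throughout; in particular the RGB rule in `is_skin` and its thresholds are illustrative assumptions, not values from the embodiment.

```python
def is_skin(rgb):
    """Very rough skin-color test on an (R, G, B) tuple (assumed thresholds)."""
    r, g, b = rgb
    return r > 95 and g > 40 and b > 20 and r > g > b and (r - b) > 15


def skin_ratio(patch):
    """Fraction of pixels in a patch (list of RGB tuples) judged skin-colored."""
    if not patch:
        return 0.0
    return sum(1 for p in patch if is_skin(p)) / len(patch)


def pick_subject_patch(patch_from_first, patch_from_second):
    """Return which image's patch more likely shows the subject."""
    if skin_ratio(patch_from_first) >= skin_ratio(patch_from_second):
        return "first"
    return "second"


person = [(200, 150, 120)] * 3 + [(30, 80, 30)]   # mostly skin tones
foliage = [(30, 80, 30)] * 4                      # background only
choice = pick_subject_patch(person, foliage)
```

A practical implementation would likely work in a different color space and tolerate lighting changes; the point here is only the comparison between the two candidate patches of the same region.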
[0446]
As described above, in the image synthesizing apparatus according to the present invention, the subject region extracting means extracts an image in the region of the first subject and an image in the region of the second subject from the first subject image or the corrected first subject image, extracts an image in the region of the first subject and an image in the region of the second subject from the second subject image or the corrected second subject image, and selects the image of the first subject and the image of the second subject based on the features of the image outside each region.
[0447]
As a result, there is an effect that the subject of the extracted image portion can be automatically and easily determined.
[0448]
As described above, the image synthesizing apparatus according to the present invention includes overlap detection means that judges that the region of the first subject and the region of the second subject overlap when the number of regions of the first subject or the second subject obtained from the subject region extracting means does not match the value set as the number of subjects to be combined.
[0449]
As a result, the determination result of the overlap detection means can be used to notify or warn the photographer or the subject of the presence or absence of overlap, on the composite screen, with a lamp, or the like. The user can thus easily determine whether the subjects overlap each other. The resulting effect of assisting shooting so that no overlap occurs is the same as described above.
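The region-count test used for overlap detection can be sketched as follows: connected regions are counted in a binary subject mask, and an overlap is reported when the count differs from the expected number of subjects. The function names, the 4-connectivity, and the flood-fill labeling are illustrative assumptions, and noise removal is taken to have been done beforehand.

```python
def count_regions(mask):
    """Count 4-connected regions of truthy cells in a 2D list."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if mask[y][x] and not seen[y][x]:
                count += 1
                seen[y][x] = True
                stack = [(y, x)]
                while stack:  # iterative flood fill of one region
                    cy, cx = stack.pop()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < height and 0 <= nx < width and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return count


def subjects_overlap(mask, expected_subjects):
    """Judge overlap when the region count differs from the expected count."""
    return count_regions(mask) != expected_subjects


separate = [[1, 0, 1],
            [1, 0, 1]]
merged = [[1, 1, 1],
          [1, 0, 1]]
```

Here `separate` contains two regions, so no overlap is reported for two expected subjects, while in `merged` the two subject regions have fused into one and an overlap is reported.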
[0450]
As described above, the image synthesizing apparatus according to the present invention is characterized by having overlap warning means that warns the user, the subject, or both that an overlap exists when the overlap detection means detects an overlap.
[0451]
With this, a warning is issued when the subjects overlap each other, which prevents the user from shooting, recording, or combining without noticing the overlap, and also assists shooting by immediately informing the subject as well that a position adjustment is needed.
[0452]
As described above, the image synthesizing apparatus according to the present invention is characterized by having shutter chance notifying means that notifies the user, the subject, or both that no overlap exists when the overlap detection means detects no overlap.
[0453]
This has the effect of letting the user know when the subjects do not overlap each other.
[0454]
In addition, because the subject can be notified of the photo opportunity, the subject can immediately prepare a pose, line of sight, and the like, which also assists shooting.
[0455]
The image synthesizing apparatus according to the present invention is characterized in that it has imaging means for capturing a subject or a landscape, and automatic shutter means that, when the overlap detection means detects no overlap, generates an instruction to record the image obtained from the imaging means as the first subject image or the second subject image.
[0456]
As a result, shooting is performed automatically when the subjects do not overlap each other, so the user need neither judge whether there is overlap nor press the shutter, which assists shooting.
[0457]
The image synthesizing apparatus according to the present invention is characterized in that it has imaging means for capturing a subject or a landscape, and automatic shutter means that, when the overlap detection means detects an overlap, generates an instruction to prohibit recording of the image obtained from the imaging means as the first subject image or the second subject image.
[0458]
As a result, no shooting is performed while the subjects overlap each other, which assists shooting by preventing the user from shooting or recording with an overlap present.
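The warning, photo opportunity notification, and automatic shutter behaviors described above all depend on the single result of the overlap detection means. The following sketch summarizes that dependency; the function name and the dictionary keys are hypothetical and chosen only for illustration.

```python
def shooting_assist(overlap_detected):
    """Map the overlap detection result to the assistance actions.

    Assumed summary: warn on overlap; otherwise notify the photo
    opportunity and allow (or automatically trigger) recording.
    """
    return {
        "warn_overlap": overlap_detected,
        "notify_photo_opportunity": not overlap_detected,
        "record_allowed": not overlap_detected,
    }


with_overlap = shooting_assist(True)
without_overlap = shooting_assist(False)
```

An actual apparatus need not provide all three means at once; the sketch merely shows that each is driven by the same yes/no judgment.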
[0459]
As described above, the image synthesizing method according to the present invention includes: a background correction amount calculating step of calculating a correction amount consisting of any one or a combination of a relative movement amount, a rotation amount, an enlargement/reduction ratio, and a distortion correction amount of the background portion between a first subject image, which is an image including a background and a first subject, and a second subject image, which is an image including at least a part of the background and a second subject, or of reading out a correction amount calculated in advance; and a superimposed image generating step of taking either the first subject image or the second subject image as a reference image, correcting the other image by the correction amount obtained in the background correction amount calculating step so that the background portions other than the subjects at least partially overlap, and generating an image in which the reference image and the corrected image are superimposed.
[0460]
The various effects of this are as described above.
[0461]
As described above, the image composition program according to the present invention causes a computer to function as each unit included in the image composition device.
[0462]
As described above, an image composition program according to the present invention causes a computer to execute each step of the image composition method.
[0463]
A recording medium according to the present invention is characterized by recording the above-mentioned image synthesizing program.
[0464]
Thereby, by installing the image synthesizing program on a general-purpose computer via the recording medium or a network, the image synthesizing method can be realized using that computer; in other words, the computer can be made to function as the image synthesizing apparatus.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a functional configuration of an image synthesizing apparatus according to the present invention.
FIG. 2 is a block diagram illustrating a configuration example of an apparatus that specifically realizes each unit.
FIG. 3A is a schematic perspective view showing an example of the outer appearance of the back of the image synthesizing apparatus, and FIG. 3B is a schematic perspective view showing an example of the outer appearance of the front of the image synthesizing apparatus.
FIG. 4 is an explanatory diagram illustrating a data structure example of image data.
FIG. 5 is a flowchart showing the flow of the entire image synthesizing method.
FIG. 6A is an explanatory diagram illustrating an example of a first subject image, and FIG. 6B is an explanatory diagram illustrating an arrangement of reference matching blocks in the first subject image of FIG. 6A.
FIG. 7A is an explanatory diagram illustrating an example of a second subject image, FIG. 7B is an explanatory diagram illustrating the arrangement of detected matching blocks in the second subject image of FIG. 7A, FIG. 7C is an explanatory diagram illustrating a corrected second subject image obtained by correcting the second subject image of FIG. 7A, and FIG. 7D is an explanatory diagram illustrating a mask image of the corrected second subject image of FIG. 7C.
FIG. 8A is an explanatory diagram illustrating an example of a difference image between the first subject image of FIG. 6A and the corrected second subject image of FIG. 7C, FIG. 8B is an explanatory diagram illustrating an example of a label image generated from the difference image of FIG. 8A, and FIG. 8C is an explanatory diagram illustrating an example of a label image obtained by removing a noise portion from the label image of FIG. 8B.
FIG. 9A is an explanatory diagram showing an example of a superimposed image obtained by superimposing the second subject area portion of FIG. 19D on the first subject image of FIG. 6A, FIG. 9B is an explanatory diagram showing an example of a superimposed image that uses the first subject area of FIG. 19B, and FIG. 9C is an explanatory diagram showing an example of a superimposed image that uses the first subject image of FIG. 6A.
FIG. 10 is an explanatory diagram showing an example of a second subject image whose subject area overlaps that of the first subject in FIG. 6A.
FIG. 11A is an explanatory diagram illustrating an example of a difference image between the first subject image of FIG. 6A and the corrected image of the second subject image of FIG. 10, FIG. 11B is an explanatory diagram illustrating an example of a label image generated from the difference image of FIG. 11A, and FIG. 11C is an explanatory diagram illustrating an example of a label image obtained by removing a noise portion from the label image of FIG. 11B.
FIG. 12 is an explanatory diagram showing an example in which the subject area portion of FIG. 11C is overlapped and synthesized with a half transmittance, and an overlap warning message is displayed.
FIG. 13 is a flowchart illustrating a method of acquiring a second subject image.
FIG. 14 is a flowchart illustrating a method of calculating a background correction amount.
FIG. 15A is an explanatory diagram showing an example of a reference image for explaining matching, and FIG. 15B is an explanatory diagram showing an example of a search image for explaining matching.
FIG. 16 is a flowchart illustrating a method of generating a corrected image of the second subject image and generating a difference image from the first subject image.
FIG. 17A is an explanatory diagram illustrating an example of a rotated second subject image, FIG. 17B is an explanatory diagram illustrating an arrangement of detected matching blocks in the second subject image of FIG. 17A, FIG. 17C is an explanatory diagram illustrating a corrected second subject image obtained by correcting the second subject image of FIG. 17A, and FIG. 17D is an explanatory diagram illustrating a mask image of the corrected second subject image of FIG. 17C.
FIG. 18 is a flowchart illustrating a method of extracting a subject region.
FIG. 19A is an explanatory diagram showing an image of the first subject area in the first subject image of FIG. 6A, FIG. 19B is an explanatory diagram showing an image of the first subject area in the second subject image, FIG. 19C is an explanatory diagram showing an image of the second subject area in the first subject image of FIG. 6A, and FIG. 19D is an explanatory diagram showing an image of the second subject area in the second subject image.
FIG. 20 is a flowchart illustrating a method for warning about overlapping of subject areas.
FIG. 21 is a flowchart illustrating a method of notifying a photo opportunity when there is no overlap in the subject areas.
FIG. 22 is a flowchart illustrating a method of performing an automatic shutter when there is no overlap between subject areas.
FIG. 23 is a flowchart illustrating one method of processing for generating an overlapping image.
[Explanation of symbols]
1 Imaging means
2 First subject image acquisition means
3 Second subject image acquisition means
4 Background correction amount calculation means
5 Corrected image generation means
6 Difference image generation means
7 Subject area extraction means
8 Overlap detection means
9 Superimposed image generation means
10 Superimposed image display means
11 Overlap warning means
12 Shutter chance notification means
13 Automatic shutter means
74 Main storage (recording medium)
75 External storage (recording medium)
112 Area (first subject area)
113 Area (second subject area)
140 Main body (image synthesizing apparatus)
141 Display and tablet
143 Shutter button
202 Area

Claims

An image synthesizing apparatus comprising:
background correction amount calculating means for calculating a correction amount consisting of any one or a combination of a relative movement amount, a rotation amount, an enlargement/reduction ratio, and a distortion correction amount of the background portion between a first subject image, which is an image including a background and a first subject, and a second subject image, which is an image including at least a part of the background and a second subject, or for reading out a correction amount calculated in advance; and
superimposed image generating means for taking either the first subject image or the second subject image as a reference image, correcting the other image by the correction amount obtained from the background correction amount calculating means so that the background portions other than the subjects at least partially overlap, and generating an image in which the reference image and the corrected image are superimposed.

The image synthesizing apparatus according to claim 1, further comprising imaging means for capturing a subject or a landscape,
wherein the first subject image or the second subject image is generated based on an output of the imaging means.

The image synthesizing apparatus according to claim 2, wherein, of the first subject image and the second subject image, the one captured later is used as the reference image.

The image synthesizing apparatus according to claim 1, wherein the superimposed image generating means superimposes the reference image and the corrected image at a predetermined transmittance.

The image synthesizing apparatus according to claim 1, wherein the superimposed image generating means generates an area having a difference in a difference image between the reference image and the corrected image as an image having a pixel value different from the original pixel value.

The image synthesizing apparatus according to claim 1, further comprising subject area extracting means for extracting a first subject area and a second subject area from a difference image between the reference image and the corrected image,
wherein the superimposed image generating means superimposes the reference image or the corrected image and an image in an area obtained from the subject area extracting means, instead of superimposing the reference image and the corrected image.

The image synthesizing apparatus according to claim 6, wherein the subject region extracting means extracts an image in the region of the first subject and an image in the region of the second subject from the first subject image or the corrected first subject image, extracts an image in the region of the first subject and an image in the region of the second subject from the second subject image or the corrected second subject image, and selects the image of the first subject and the image of the second subject based on skin color.

The image synthesizing apparatus according to claim 6, wherein the subject region extracting means extracts an image in the region of the first subject and an image in the region of the second subject from the first subject image or the corrected first subject image, extracts an image in the region of the first subject and an image in the region of the second subject from the second subject image or the corrected second subject image, and selects the image of the first subject and the image of the second subject based on the features of the image outside each region.

The image synthesizing apparatus according to claim 6, further comprising overlap detecting means for judging that the region of the first subject and the region of the second subject overlap when the number of regions of the first subject or the second subject obtained from the subject region extracting means does not match the value set as the number of subjects to be combined.

The image synthesizing apparatus according to claim 9, further comprising overlap warning means for warning a user, a subject, or both that an overlap exists when the overlap detecting means detects an overlap.

The image synthesizing apparatus according to claim 9, further comprising: a shutter chance notifying unit that notifies a user, a subject, or both, of the absence of the overlap when the overlap detection unit does not detect the overlap.

The image synthesizing apparatus according to claim 9, further comprising imaging means for capturing a subject or a landscape,
and automatic shutter means for generating an instruction to record an image obtained from the imaging means as the first subject image or the second subject image when the overlap detecting means detects no overlap.

The image synthesizing apparatus according to claim 1, further comprising imaging means for capturing a subject or a landscape,
and automatic shutter means for prohibiting an image obtained from the imaging means from being recorded as the first subject image or the second subject image when the overlap detecting means detects an overlap.

An image synthesizing method comprising:
a background correction amount calculating step of calculating a correction amount consisting of any one or a combination of a relative movement amount, a rotation amount, an enlargement/reduction ratio, and a distortion correction amount of the background portion between a first subject image, which is an image including a background and a first subject, and a second subject image, which is an image including at least a part of the background and a second subject, or of reading out a correction amount calculated in advance; and
a superimposed image generating step of taking either the first subject image or the second subject image as a reference image, correcting the other image by the correction amount obtained in the background correction amount calculating step so that the background portions other than the subjects at least partially overlap, and generating an image in which the reference image and the corrected image are superimposed.

An image synthesizing program for causing a computer to function as each unit included in the image synthesizing apparatus according to claim 1.

An image composition program for causing a computer to execute each step of the image composition method according to claim 14.

A recording medium on which the image synthesizing program according to claim 15 is recorded.