JP4064186B2

JP4064186B2 - Method and apparatus for determining the top and bottom of an image

Info

Publication number: JP4064186B2
Application number: JP2002250285A
Authority: JP
Inventors: 貞登赤堀; 雅彦山田
Original assignee: Fujifilm Corp
Current assignee: Fujifilm Corp
Priority date: 2002-08-29
Filing date: 2002-08-29
Publication date: 2008-03-19
Anticipated expiration: 2022-08-29
Also published as: JP2004086806A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像の上下を決定する方法および装置に関し、特に、デジタルデータで表された写真画像の表示や媒体への記録に際して、その上下を自動的に決定する方法および装置に関する。
【０００２】
【従来の技術】
デジタルカメラで撮影を行う際には、撮影者がカメラの向きをその都度自由に決定する。したがって、撮影された一連の画像をデジタルカメラのモニターやパソコン画面に表示すると、それぞれ不揃いな向きで表示されるため、表示された各画像を目視で確認しながら１枚ずつ手動操作で向きを整える必要がある。同様の問題は、ネガフィルムに撮影された写真画像をフィルムスキャナーで読み取り、デジタルデータで表された読取画像を表示する場合等にも生じる。
【０００３】
このような画像の向きを整える作業の負担は、撮影者が個人的に画像の表示や記録を行う場合のみならず、商用サービスとして、デジタルカメラで撮影された画像がそのまま記録された媒体や、撮影済のネガフィルムを顧客から預かり、撮影された一連の画像を向きを揃えてＣＤ−Ｒに記録して顧客に提供したり、ホームページ上に表示したりする場合にも問題となる。特に、そのような商用サービスにおいて、大量の画像を処理する場合には、かかる作業がサービスの効率を大きく低下させる要因となっている。したがって、デジタルデータで表された画像の上下を決定して向きを整える作業の自動化は、上記のような商用サービスを提供する事業者にとって、強く望まれることである。
【０００４】
かかる作業の自動化のため既に開発されている方法として、撮影された画像中にある空領域を特定し、その空領域の色の勾配に基づいて画像の上下を決定して、画像の向きを整える方法がある（例えば、特許文献１参照）。すなわち、この方法は、実世界における空の色が、天頂側においてより青く、水平線に近くなるにつれて白くなるという特徴に着目して、空でない単に青い領域を誤って空領域として検出することを防止して空領域を高い信頼性で特定し、その色の勾配から画像の向きを決定するものである。
【０００５】
しかしながら、実際の写真画像には、必ずしも空が写っているとは限らず、その場合には上記の方法により画像の上下を決定することができない。また、たとえ空が写っていても、空のほぼ全面が雲に覆われた曇天状態では、上記の方法は適用できない。したがって、実際に撮影された画像のうち上記の方法により上下が正しく決定できないものは相当数に上ると考えられ、それらの画像については、結局、サービス提供者等が１枚ずつ向きを目視で確認し整えなくてはならない。
【０００６】
【特許文献１】
特開２００１−２０２５２５号公報
【０００７】
【発明が解決しようとする課題】
上述のように、撮影された画像に含まれ得る撮影対象要素のうち「空」のみというように１つの撮影対象要素のみに注目してその色勾配等に基づいて画像の上下を決定する方法は、適用可能な画像の割合が低い。そのため、画像を１枚ずつ目視で確認して手動操作で向きを整える負担を減らして、上記した商用サービス等における作業効率を向上させたいという要求に、十分応えるものではなかった。
【０００８】
本発明は、かかる事情に鑑み、デジタルデータで表された写真画像のほとんどに適用可能な画像の上下決定方法および装置を提供することにより、最終的に目視確認および手動操作による向きの修正が必要とされる画像の割合を格段に減らし、上記した商用サービス等における作業効率を大幅に向上させることを目的とするものである。
【０００９】
【課題を解決するための手段】
すなわち、本発明に係る画像の上下決定方法は、複数の撮影対象要素の実世界における位置的上下関係に関する参照データを作成する工程と、当該複数の撮影対象要素のうち少なくとも２つに対応する少なくとも２つの画像領域を含む画像について、上記の参照データを参照して、上記の少なくとも２つの画像領域の配置が実世界における位置的上下関係を最も良く反映した配置となるように、当該画像の上下を決定する工程を含むことを特徴とする方法である。
【００１０】
当該方法においては、上記の複数の撮影対象要素に各々１つの上下配置レベルを割り当てたデータを上記の参照データとして用い、画像の上下を決定する上記の工程を、該参照データを参照して、上記の少なくとも２つの画像領域の各々に、当該画像領域に対応する撮影対象要素の上下配置レベルを割り当てる工程と、割り当てられた該上下配置レベルの、画像全体の第１の方向に亘る変化傾向を表す第１のパラメータを導出する工程と、割り当てられた該上下配置レベルの、画像全体の上記の第１の方向と直交する第２の方向に亘る変化傾向を表す第２のパラメータを導出する工程と、上記の第１のパラメータと第２のパラメータに基づいて、当該画像の上下を決定する工程を含む工程としてもよい。
【００１１】
あるいは、本発明に係る画像の上下決定方法においては、上記の複数の撮影対象要素の２つからなる組合せの少なくとも一部について、当該組合せをなす２つの撮影対象要素の実世界における位置的上下関係を確率的に表したデータを上記の参照データとして用い、画像の上下を決定する上記の工程を、上記の少なくとも２つの画像領域の当該画像中における位置を代表する指標を導出する工程と、当該画像の９０°ずつ異なる４方向のうちの１方向を上方向であると仮定して、参照データ中の、上記の少なくとも２つの画像領域に対応する撮影対象要素の実世界における位置的上下関係を確率的に表したデータ部分と、上記の少なくとも２つの画像領域の画像中における位置を代表する上記の指標とに基づいて、当該仮定が正しい可能性を表すパラメータを導出する工程と、パラメータを導出する上記の工程を、当該画像の４方向の残りの３方向についても繰り返す工程と、当該画像の４方向について導出された４つの上記のパラメータのうち最も高い可能性を表すパラメータに対応する方向を、当該画像の上方向として特定する工程を含む工程としてもよい。
【００１２】
また、本発明に係る画像の上下決定装置は、複数の撮影対象要素の実世界における位置的上下関係に関する参照データが記憶された記憶手段と、入力された画像データが表す、当該複数の撮影対象要素のうち少なくとも２つに対応する少なくとも２つの画像領域を含む画像について、上記の参照データを参照して、上記の少なくとも２つの画像領域の配置が実世界における位置的上下関係を最も良く反映した配置となるように、当該画像の上下を決定する決定手段を含むことを特徴とする装置である。
【００１３】
当該装置においては、上記の複数の撮影対象要素に各々１つの上下配置レベルを割り当てたデータを上記の参照データとして用い、上記の決定手段を、該参照データを参照して、上記の少なくとも２つの画像領域の各々に、当該画像領域に対応する撮影対象要素の上下配置レベルを割り当てる手段と、割り当てられた該上下配置レベルの、画像全体の第１の方向に亘る変化傾向を表す第１のパラメータを導出する手段と、割り当てられた該上下配置レベルの、画像全体の上記の第１の方向と直交する第２の方向に亘る変化傾向を表す第２のパラメータを導出する手段と、上記の第１のパラメータと第２のパラメータに基づいて、当該画像の上下を決定する手段を含む手段としてもよい。
【００１４】
あるいは、本発明に係る画像の上下決定装置においては、上記の複数の撮影対象要素の２つからなる組合せの少なくとも一部について、当該組合せをなす２つの撮影対象要素の実世界における位置的上下関係を確率的に表したデータを上記の参照データとして用い、上記の決定手段を、上記の少なくとも２つの画像領域の当該画像中における位置を代表する指標を導出する手段と、当該画像の９０°ずつ異なる４方向の各々の方向について、その方向を上方向であると仮定して、参照データ中の、上記の少なくとも２つの画像領域に対応する撮影対象要素の実世界における位置的上下関係を確率的に表したデータ部分と、上記の少なくとも２つの画像領域の画像中における位置を代表する上記の指標とに基づいて、当該仮定が正しい可能性を表すパラメータを導出する手段と、当該画像の４方向について導出された４つの上記のパラメータのうち最も高い可能性を表すパラメータに対応する方向を、当該画像の上方向として特定する手段を含む手段としてもよい。
【００１５】
なお、本発明において「撮影対象要素」とは、実世界に存在し、写真撮影の対象となり得る個々の要素を言い、たとえば、空、雲、山、木、建物、水、地面等が挙げられる。また、本発明において「画像領域」とは、本発明に係る方法および装置が取り扱う画像中の分割された個々の有意な領域であって、それぞれ上記の「撮影対象要素」のいずれかに対応するものとして特定され得る領域をいう。
【００１６】
【発明の効果】
本発明に係る画像の上下決定方法および装置は、空、木、地面等の複数の撮影対象要素の実世界における相対的な位置的上下関係に関する参照データに基づいて画像の上下を決定するので、これらの撮影対象要素のうち少なくとも２つに対応する画像領域を含む画像であれば、上下を決定することが可能である。そのため、本発明に係る方法および装置により上下を決定できる画像の割合は、従来の方法および装置による場合に比べて格段に高く、最終的に目視確認および手動操作による向きの修正が必要とされる画像の割合を大幅に減らすことができる。これは、顧客の求めに応じて、デジタルデータで表された大量の画像を整理してＣＤ−Ｒに記録したり、ホームページに表示する商用サービスの効率を大きく向上させるものである。
【００１７】
また、本発明に係る画像の上下決定方法および装置では、複雑なプログラムの変更等を行わなくても、参照データを修正・追加することにより、画像の上下を決定する基準を容易に変更することができるという利点もある。
【００１８】
【発明の実施の形態】
以下、図面により、本発明の例示的な実施形態を詳細に説明する。
【００１９】
図１は、本発明による画像の上下決定工程を含む一連の画像処理工程の流れを示したフローチャートである。この一連の画像処理工程は、不揃いな向きで記録された原画像を、正しい向きに揃えて表示または記録することを可能とするものである。まず、ステップ１０において、原画像を表すデジタル形式の画像データが読み込まれる。次に、ステップ１２において、上記の原画像が複数の画像領域に分割され、ステップ１４において、各画像領域と実世界に存在する撮影対象要素との対応づけが行われる。続いて、ステップ１６において、上記の撮影対象要素の実世界における位置的上下関係に関する参照データを参照して、画像の上下が決定される。最後に、ステップ１８において、ステップ１６で決定された画像の上下に従って、原画像が正しい向きに回転されて表示または記録される。
【００２０】
本発明は、図１に示した一連の画像処理工程のうち、ステップ１６を実現するための方法および装置に関するものであるが、本発明の実施形態の説明に先立って、ステップ１２における画像領域への分割手法の例、およびステップ１４における各画像領域と撮影対象要素の対応づけの手法の例を、以下に簡単に説明する。
【００２１】
ステップ１２における画像領域への分割手法の例については、図２を用いて説明する。
【００２２】
まず、図２の（ａ）は処理対象である原画像であり、この原画像を構成する各画素は、各々の色および明度を有している。そこで、隣接する画素の色および明度を比較して、それらの類似の度合いが所定の基準を超える場合に、それらの画素を統合することとする。この比較および統合は、上記の所定の基準によりそれ以上の統合が起こらなくなるまで順次繰り返され、類似の色および明度を有する画素からなる区域が拡大していく。この画素の統合が完了した後の状態が、たとえば図２（ｂ）の状態であるとする。
【００２３】
ここに、図２（ｂ）に示した画像を構成する各区域のうち、周囲長が所定の長さより短い区域を「微小区域」と呼び、周囲長が該所定の長さ以上である区域を「非微小区域」と呼ぶこととする。図２（ｂ）においては、区域２０、２２、２４、２６および２８は非微小区域、区域３０および３２は微小区域である。
【００２４】
次に、図２（ｂ）の画像を構成する各区域を隣接する区域とさらに比較して、統合可能なものをさらに統合するのであるが、この区域の統合の基準は、微小区域と非微小区域で異なる。微小区域については、１の非微小区域に完全に包含されている微小区域（たとえば非微小区域２４に完全に包含されている微小区域３０）は、その１の非微小区域に統合されるものとする。また、２以上の非微小区域と境界を接する微小区域（たとえば非微小区域２０および２４と境界を接する微小区域３２）は、接する境界の長さが長い方の非微小区域（上記の場合、非微小領域２４）に統合されるものとする。この基準によれば、微小区域統合後の状態は、図２（ｃ）のようになる。
【００２５】
非微小区域については、当該非微小区域をなす画素の平均の色および明度を、隣接する各非微小区域をなす画素の平均の色および明度と比較し、類似の度合いが所定の基準を超える隣接非微小区域がある場合は統合が行われる。たとえば、図２（ｃ）における非微小区域２２の平均の色および明度について、非微小区域２６の平均の色および明度との類似の度合いは上記の所定の基準を超えるが、他の各隣接非微小区域２０、２４および２８の平均の色および明度との類似の度合いは上記の所定の基準以下である場合は、当該非微小区域２２は、非微小区域２６と統合され、非微小区域２０、２４および２８とは統合されない。かかる所定の基準による非微小区域の統合の最終的な結果は、たとえば図２（ｄ）のようになる。この最終的な状態の画像を構成する各領域が、本発明における「画像領域」である。
【００２６】
以上、図１のステップ１２における画像領域への分割手法の例を図２を用いて説明したが、このステップ１２における画像領域への分割が、他のいかなる周知の手法によるものでもよいことは言うまでもない。
【００２７】
次に、図１のステップ１４における各画像領域と撮影対象要素の対応づけの手法の例を、図３および図４を用いて説明する。この例は、予め作成された自己組織化マップを用いて各画像領域と撮影対象要素の対応づけを行うものである。
【００２８】
まず、図３のステップ４０において、既に図２を用いて説明された手法等により画像領域に分割された画像が、図４の（ａ）に示すように複数のブロックに分割される。ここでは、たとえば３２×３２画素からなる正方ブロックが用いられている。
【００２９】
次に、ステップ４２において、撮影対象要素との対応づけを行う１の画像領域に含まれるブロックを特定する。ここでは、まず図４（ａ）の画像領域２８について撮影対象要素との対応づけを行うものとすると、図４（ｂ）に網かけで示した複数のブロックが、この画像領域２８に含まれるブロックとして特定される。
【００３０】
続いて、ステップ４４において、ステップ４２で特定されたブロックのうちの１つについて、特徴ベクトルが抽出される。この「特徴ベクトル」とは、そのブロックについての複数の特徴量を成分とするベクトルで、それらの成分には、色の特徴を示す成分や明度の特徴を示す成分のほか、奥行情報等の画像の特徴を示す成分も含まれ得る。
【００３１】
次に、ステップ４６において、ステップ４４で抽出された特徴ベクトルが自己組織化マップ上に写像され、そのブロックに対応する撮影対象要素が特定される。「自己組織化マップ」とは、複数の参照特徴ベクトルに対応する点が２次元的に配列されたマップであり、互いに特徴が類似する画像の参照特徴ベクトルに対応する点は互いに近い位置に配置されている。この自己組織化マップは、予め、「空」や「地面」の画像であることが分かっている多数の画像の特徴をコンピュータに学習させることにより作成されており、したがって、自己組織化マップ上の各点に対応する可能性が最も高い撮影対象要素が予め分かっている。ステップ４４で抽出された特徴ベクトルは、この自己組織化マップ上において、最も類似する参照特徴ベクトルに対応する点に写像される。ここで、抽出された特徴ベクトルと各参照特徴ベクトルの類似度の評価は、両ベクトル間のユークリッド距離等を指標として行われる。この自己組織化マップ上への写像により、現在のブロックに対応する可能性が最も高い撮影対象要素を特定することができる。
【００３２】
続いて、図３の処理は、ステップ４８を経てステップ４４に戻り、図４（ｂ）に網かけで示したブロックのうち次のものについて、同様にして対応撮影対象要素が特定される。
【００３３】
全てのブロックに対応する撮影対象要素の特定が終了すると、図３の処理はステップ５０へと進む。このとき、画像領域２８中の各ブロックは、図４（ｃ）に示すように、ほとんどが「空」に対応するものとして特定されているが、一部、そのブロックの画像の特徴が「空」よりもたとえば「水」の特徴に近く、「水」に対応するものとして特定されたものが混ざってしまっている。しかし、ステップ５０においては、いわば多数決の方法により、画像領域２８中の各ブロックに対応する撮影対象要素として特定されたもののうち最多のものが、当該画像領域２８に対応する撮影対象要素として特定されるので、図４（ｄ）に示すように、画像領域２８は「空」に対応するものとされる。
【００３４】
その後、図３の処理は、ステップ５２を経てステップ４２に戻り、当該画像に含まれる全ての画像領域について対応する撮影対象要素が特定されるまで、ステップ４２から５２が繰り返される。
【００３５】
以上、図１のステップ１４における各画像領域と撮影対象要素の対応づけの手法の例を図３および図４を用いて説明したが、このステップ１４における対応づけが、他のいかなる周知の手法によるものでもよいことは言うまでもない。
【００３６】
さて、上記に説明した手法等により、上下を決定する対象である画像が画像領域に分割され、各画像領域に対応する撮影対象領域が特定されている前提で、以下、本発明による画像の上下決定方法（すなわち、図１のステップ１６）の２つの実施形態を説明する。
【００３７】
まず、本発明の第１の実施形態を、図５−９を用いて説明する。この実施形態は、複数の撮影対象要素に各々１つの上下配置レベルを割り当てたデータを参照データとして用いて、画像の上下を決定するものである。ここでは、既に説明した手法により、複数の画像領域に分割され、各画像領域と実世界に存在する撮影対象要素との対応づけがされた、図５に示す画像の上下を決定することを考える。画像はＭ×Ｎ個の画素からなり、この段階では、各画像領域の各画素に「空」、「木」、「建物」または「地面」という意味が割り当てられている。画像のｘ座標およびｙ座標は図に示すとおりである。なお、図５の画像は、図２および図４の画像よりも単純化されたものとなっているが、これは単に以降の説明の便宜のためである。
【００３８】
本実施形態による画像の上下決定は、図６のフローチャートに示した手順で行われる。
【００３９】
まず、ステップ６０において、図５の画像の各画像領域に対し、対応撮影対象要素に対応する上下配置レベルが割り当てられる。より厳密に言えば、各画像領域に属する各画素ごとに、対応する上下配置レベルが割り当てられることになる。この割当ては、図７に示すような表形式の参照データを参照して行われる。この参照データは、実世界における撮影対象要素の相対的な位置的上下関係に基づいて予め作成され、記憶手段に記憶されているもので、実世界において相対的に上にある可能性が高い撮影対象要素ほど、大きな値の上下配置レベルが付与されている。この参照データによると、この例では図８のように上下配置レベルが割り当てられる。
【００４０】
次に、ステップ６２および６２’において、ｘ方向とｙ方向のそれぞれに関し、図９に示すように画像が２等分され、各２等分領域中の画素に割り当てられた上下配置レベルの平均値が算出される。図９の例では、ｘ方向に関する各２等分領域における上下配置レベルの平均値は、０．９と１．８であり、ｙ方向に関する各２等分領域における上下配置レベルの平均値は、１．２と１．３であるとする。
【００４１】
続いて、ステップ６４と６４’において、ｘ方向およびｙ方向に亘る上下配置レベルの変化傾向を表すパラメータとして、ｘ方向とｙ方向のそれぞれに関し、上記の各２等分領域における上下配置レベルの平均値の、差分値が求められる。ここで、ｘ方向に関しては、図９で言えば、右側の２等分領域における平均値から左側の２等分領域における平均値が差し引かれ、ｙ方向に関しては、下側の２等分領域における平均値から上側の２等分領域における平均値が差し引かれる。すなわち、図９の例では、ｘ方向に関する差分値は１．８−０．９＝＋０．９であり、ｙ方向に関する差分値は１．３−１．２＝＋０．１である。
【００４２】
次に、ステップ６６において、ｘ方向に関する差分値とｙ方向に関する差分値が比較され、絶対値の大きい方に対応する軸が、画像の上下軸として特定される。図９の例では、上述のようにｘ方向に関する差分値が＋０．９、ｙ方向に関する差分値が＋０．１であり、ｘ方向に関する差分値の絶対値の方が大きいので、ｘ軸が上下軸として特定される。
【００４３】
続いて、ステップ６８において、特定された上下軸に対する上記の差分値の符号から、その符号が正であれば＋ｘまたは＋ｙ方向が、負であれば−ｘまたは−ｙ方向が、画像の上方向として特定される。ここでは、ステップ６６においてｘ軸が上下軸として特定されており、ｘ方向に関する差分値が＋０．９という正の値であるので、＋ｘ方向が上方向として特定される。
【００４４】
以上の各ステップにより、図５に示した画像の上方向は＋ｘ方向であると決定される。この結果から、図５の画像は左に９０°回転されて、表示または記録されることになる。
【００４５】
なお、上記の実施形態においては、ｘ方向とｙ方向のそれぞれに関して画像を２等分し、各２等分領域における上下配置レベルの平均値の差分値をパラメータとして用いて画像の上下を決定しているが、画像全体のｘ方向およびｙ方向に亘る上下配置レベルの変化傾向を適当に表すパラメータであれば、他のいかなるパラメータを用いてもよい。たとえば、上記の実施形態の変更例の１つとして、ｘ方向およびｙ方向に関し、各画素に割り当てられた上下配置レベルの行ごとおよび列ごとの傾きの平均値を求め、該平均値を、ｘ方向およびｙ方向に亘る上下配置レベルの変化傾向を表すパラメータとして用いる例が考えられる。
【００４６】
次に、本発明の第２の実施形態を、図１０−１６を用いて説明する。この実施形態は、複数の撮影対象要素の２つからなる組合せの少なくとも一部について、その組合せをなす２つの撮影対象要素の実世界における位置的上下関係を確率的に表したデータを参照データとして用いて、画像の上下を決定するものである。ここでも、上記の第１の実施形態の説明と同じく図５に示した画像を用いて、具体的な手順の説明を行うこととする。
【００４７】
本実施形態による画像の上下決定は、図１０のフローチャートに示した手順で行われる。
【００４８】
まず、ステップ７０において、既に画像領域に分割され、各画像領域と撮影対象要素とが関連づけられた図５の画像について、各画像領域の重心座標が、図１１（ａ）のように特定される。この重心座標は、以降のステップにおいて、画像全体の中における各画像領域の位置を代表する指標として使用されるものである。
【００４９】
次に、ステップ７２において、「注目領域」の探索が行われる。「注目領域」とは、画像に含まれる画像領域のうち、その画像の上下決定のために特に注目する撮影対象要素として指定されているものに対応する画像領域を指す。すなわち、本実施形態における画像の上下は、この注目領域と他の画像領域との相対的な位置的上下関係に基づいて決定されることとなる。たとえば、図１１（ａ）の画像は、空、木、建物および地面の４つの撮影対象領域に対応する画像領域を含むものであるが、ここで空に対応する画像領域のみが注目領域として指定されているとすると、画像の上下は、空と木、空と建物および空と地面の相対的な位置的上下関係に基づいて決定されることとなる。また、空、山、木、建物および地面の５つの撮影対象要素に対応する画像領域が注目領域として指定されている場合には、図１１（ａ）の画像にはこのうち空、木、建物および地面の４つに対応する画像領域が含まれているので、空と木、空と建物、空と地面、木と建物、木と地面および建物と地面の相対的な位置的上下関係に基づいて、画像の上下が決定されることとなる。いずれの撮影対象要素に対応する画像領域を注目領域として用いるか、および、注目領域が複数の場合はいずれの注目領域から他の画像領域との相対的な位置的上下関係を調べるか（すなわち探索順序）は、予め指定されているものとする。以下においては、空、山、木、建物および地面の５つの撮影対象要素に対応する画像領域が注目領域として指定されており、その探索順序が空→山→木→建物→地面とされている場合を例にとって説明する。したがって、ステップ７２においては、まず空に対応する画像領域が、注目領域として探索される。図１１（ａ）の画像には空に対応する画像領域が含まれるので、これを注目領域として次のステップ７４へと進む。
【００５０】
ステップ７４では、空に対応する注目領域と、その他の全ての画像領域（すなわち、木、建物および地面に対応する画像領域）との相対的な位置的上下関係を評価する。
【００５１】
本実施形態においては、この評価は、図１２に示した表形式の参照データを参照して行われる。この参照データは、想定されている撮影対象要素の２つからなる組合せの全部または一部について、実世界における位置的上下関係を確率的に表したものである。たとえば、図１２の第１行目のデータ部分は、空と木の組合わせについて、実世界の風景において空が木より上にある確率、空と木が横に並ぶ確率、および空が木より下にある確率を表している。この参照データは、予め作成され、記憶手段に記憶されている。なお、図１２には本実施形態の説明に必要なデータ部分のみを示したが、実際には、この参照データは、他の多くの撮影対象要素間の関係を規定している。
【００５２】
一方が注目領域である２つの画像領域の関係の評価は、まず図１２の参照データにおいて、当該２つの画像領域に対応する２つの撮影対象要素の関係を規定したデータ部分を読み出すことから始まる。本実施形態では、まず、空に対応する注目領域と、木に対応する画像領域との関係を評価するため、図１２の第１行目のデータ部分が読み出される。
【００５３】
次に、読み出したデータ部分の情報と、ステップ７０において特定した重心座標に基づいて、ｘ方向およびｙ方向の重み係数（ｗ_ｘ，ｗ_ｙ）を算出する。ここで、重み係数（ｗ_ｘ，ｗ_ｙ）は、２つの撮影対象領域ＡおよびＢに対応する画像領域の重心座標をそれぞれ（ｘ_Ａ，ｙ_Ａ）および（ｘ_Ｂ，ｙ_Ｂ）として、読み出したデータ部分による、撮影対象要素Ａが下にありＢが上にある確率ｐ_{Ａ下、Ｂ上}、および撮影対象要素Ａが上にありＢが下にある確率ｐ_{Ａ上、Ｂ下}に基づいて、
【数１】

を算出し、
【数２】

により求められる。空と木に対応する画像領域の関係の評価の場合、図１２の参照データ第１行目によれば、木が下にある確率が７５％であって、空が下にある確率５％よりも高いので、上記のΔｘおよびΔｙは、
【数３】

となる。現段階で評価しようとしている画像の向きは図１１（ａ）の向きであり、この向きではｘ_空＞ｘ_木、ｙ_空＝ｙ_木となっているので、Δｘ＞０、Δｙ＝０となり、重み係数はｗ_ｘ＝１、ｗ_ｙ＝０となる。
【００５４】
続いて、現段階の画像方向における当該２つの画像領域の相対的な位置的上下関係の評価値ｓ_ＡＢを、
【数４】

により算出する。ここで、ｐ_{ＡＢ横並び}は参照データによる撮影対象要素ＡとＢが横に並ぶ確率、ｍａｘ（ｐ_{Ａ下、Ｂ上}，ｐ_{Ａ上、Ｂ下}）はｐ_{Ａ下、Ｂ上}とｐ_{Ａ上、Ｂ下}のうち大きい方、ｍｉｎ（ｐ_{Ａ下、Ｂ上}，ｐ_{Ａ上、Ｂ下}）はｐ_{Ａ下、Ｂ上}とｐ_{Ａ上、Ｂ下}のうち小さい方を指す。図１１（ａ）の向きにおける空と木に対応する画像領域の相対的な位置的上下関係については、上述のようにΔｙ＝０であるので、評価値は、
【数５】

となる。
【００５５】
同様に、空と建物、空と地面の相対的な位置的上下関係についても、評価値ｓ_空建物およびｓ_空地面を算出する。これで、空に対応する画像領域を注目領域としたステップ７４の工程が終了する。
【００５６】
次に、ステップ７６に移り、次の注目領域を探索する。ここでは、上述したように本実施例における探索順序は空→山→木→建物→地面であるので、２番目の「山」に対応する画像領域が評価対象の画像中において探索される。しかし、「山」に対応する画像領域はないので、次に「木」に対応する画像領域が探索され、注目領域として特定される。
【００５７】
すると、次の注目領域が特定されたのでステップ７８で「Ｎｏ」となり、ステップ７４に戻って、上記と同様の手法により、「木」に対応する注目領域とその他の画像領域との相対的な位置的上下関係が評価される。ここで、空と木に対応する画像領域の相対的な位置的上下関係については既に評価が行われているので、評価対象から除かれ、木と建物および木と地面について、評価値ｓ_木建物およびｓ_木地面が算出される。
【００５８】
以上のステップ７４、７６および７８からなる繰返しの工程が全ての注目領域について終了すると、ステップ８０へと進む。この段階で、図１３に示したように６組の撮影対象要素に対応する画像領域の図１１（ａ）の画像方向における相対的な位置的上下関係について、評価値ｓ_ＡＢが算出されていることになる。ステップ８０では、それら６つの評価値の合計値Ｓが算出される。この合計値は、現在評価を行っている画像方向が上下が正しく配置された画像方向である可能性を表すパラメータとしての意味を持ち、本実施形態では、この合計値Ｓが大きいほど当該画像方向が正しい可能性が高い。図１３に示すとおり、図１１（ａ）の画像方向が正しい可能性を示す合計値Ｓは、ここでは１５９となる。
【００５９】
次に、ステップ８２を経てステップ８４へ進み、画像の方向が仮想的に９０°回転され、図１１（ｂ）に示す方向とされる。その上でステップ７０から８０の一連の工程が繰り返され、図１１（ｂ）に示す画像方向についても、当該画像方向が正しい可能性を表す合計値Ｓが算出される。同様の工程が、図１１（ｃ）に示す画像方向および図１１（ｄ）に示す画像方向についても、順次行われる。これらの各画像方向についての評価の結果を、図１４−１６に示す。
【００６０】
以上、４方向全ての画像方向の評価が終了すると、ステップ８６へと進み、合計値Ｓの値が最も大きくなった画像方向を、正しい画像方向として特定する。本例では、図１１（ａ）−（ｄ）の各画像方向に対する合計値Ｓがそれぞれ１５９、２、１５９および３６８となっているので、図１１（ｄ）の画像方向が、正しい画像方向として特定される。
【００６１】
以上の各ステップにより、図５の画像の上下が正しく決定され、当該画像は、図１１（ｄ）に示した向きで表示または記録されることになる。
【００６２】
なお、上記の実施形態では、画像全体中における各画像領域の位置を代表する指標として、各画像領域の重心座標を用いたが、これに限らず他の指標を用いてもよい。たとえば、各画像領域をなす画素のうち最も高い上下位置にある画素の座標等を用いてもよい。
【００６３】
また、上記の実施形態のステップ８６では、単純に評価値の合計値Ｓが最大となった方向を正しい画像方向として特定しているが、ある閾値を設定して、４方向に対する合計値Ｓのうち最大のものがその閾値を越えた場合に限り、その最大合計値に対応する方向を正しい画像方向として特定してもよい。あるいは、最大合計値の、該最大合計値と２番目に大きい合計値の和に対する比が、設定された閾値（たとえば０．６）を超える場合に限り、その最大合計値に対応する方向を正しい画像方向として特定してもよい。これらの場合、その設定閾値では画像の上下を判別できなかった画像については、サービス事業者等が目視で画像方向を確認し修正することになる。
【００６４】
また、上記の実施形態においては、各画像方向における全ての評価値の合計値に基づいていずれの画像方向が正しいかを決定しているが、各画像方向における評価値ｓ_ＡＢの最大値等に基づいて決定してもよい。
【００６５】
また、上記の実施形態においては、予め注目領域とその探索順序が規定されていたが、評価対象である画像中に存在する画像領域を全て順次探索することとし、それらの画像領域の全ての組合せについて相対的な位置的上下関係を評価するようにしてもよい。
【００６６】
以上説明した各実施形態によれば、画像の上下は、空、木、地面等の複数の撮影対象要素の実世界における相対的な位置的上下関係に関する参照データに基づいて決定されるので、これらの撮影対象要素のうち少なくとも２つに対応する画像領域を含む画像であれば、上下を決定することができる。第２の実施形態において閾値を設ける場合等には、一部の画像は目視確認および手動操作による向きの修正を必要とすることになるが、それでも本発明では単数ではなく複数の撮影対象要素の相対的な位置的上下関係に基づいて画像の上下を決定するため、かかる目視確認および手動操作が必要とされる画像の割合は、従来に比べて大幅に減少する。
【００６７】
また、上記の各実施形態によれば、複雑なプログラムの変更等を行わなくても、参照データを修正したり新たな撮影対象要素に関するデータを追加することにより、画像の上下を決定する基準を容易に変更することができる。
【００６８】
以上、本発明の２つの実施形態について詳細に述べたが、これら２つの実施形態は例示的なものに過ぎず、本発明の技術的範囲は、本明細書中の特許請求の範囲のみによって定められるべきものである。
【図面の簡単な説明】
【図１】本発明による画像の上下決定工程を含む一連の画像処理工程を示したフローチャート
【図２】画像領域への分割手法の例を示した工程図
【図３】画像領域と撮影対象要素の対応づけの手法の例を示したフローチャート
【図４】図３に示した手法の各工程を示す工程図
【図５】本発明による上下の決定の対象となる画像を示した図
【図６】本発明の第１の実施形態による画像の上下決定方法を示すフローチャート
【図７】本発明の第１の実施形態において使用される参照データの例を示した表
【図８】本発明の第１の実施形態において、画像に上下配置レベルが割り当てられた状態を示す図
【図９】本発明の第１の実施形態における、ｘ方向およびｙ方向に関する上下配置レベルの変化傾向を表す差分値の導出方法を示す図
【図１０】本発明の第２の実施形態による画像の上下決定方法を示すフローチャート
【図１１】本発明の第２の実施形態において評価される画像の４つの方向を示した図
【図１２】本発明の第２の実施形態において使用される参照データの例を示した表
【図１３】本発明の第２の実施形態による、図１１の（ａ）の方向についての評価値を示した表
【図１４】本発明の第２の実施形態による、図１１の（ｂ）の方向についての評価値を示した表
【図１５】本発明の第２の実施形態による、図１１の（ｃ）の方向についての評価値を示した表
【図１６】本発明の第２の実施形態による、図１１の（ｄ）の方向についての評価値を示した表
[0001]
[Technical field of the invention]
The present invention relates to a method and apparatus for determining the top and bottom of an image, and more particularly, to a method and apparatus for automatically determining the top and bottom when displaying a photographic image represented by digital data or recording it on a medium.
[0002]
[Prior art]
When photographing with a digital camera, the photographer freely chooses the orientation of the camera for each shot. Consequently, when a series of captured images is displayed on the camera's monitor or on a personal computer screen, the images appear in inconsistent orientations, and each one must be checked visually and reoriented by hand. A similar problem arises when photographic images taken on negative film are read with a film scanner and the resulting digital images are displayed.
[0003]
The burden of reorienting images is a problem not only when a photographer displays or records images personally, but also in commercial services in which a provider receives from a customer either a medium holding images as recorded by a digital camera or an exposed negative film, and then records the series of images on a CD-R in a uniform orientation for the customer or displays them on a web page. In particular, when such a service processes a large volume of images, this work greatly reduces the efficiency of the service. Automating the work of determining the top and bottom of an image represented by digital data and correcting its orientation is therefore strongly desired by operators of such commercial services.
[0004]
One method already developed to automate this work identifies the sky region in a captured image, determines the top and bottom of the image from the color gradient of that region, and reorients the image accordingly (see, for example, Patent Document 1). This method exploits the fact that the real-world sky is bluer toward the zenith and becomes whiter toward the horizon. Using this feature, it avoids mistaking a merely blue region that is not sky for a sky region, identifies the sky region with high reliability, and determines the orientation of the image from the color gradient.
[0005]
However, actual photographic images do not always contain the sky, and in that case the top and bottom of the image cannot be determined by the above method. Even when the sky does appear, the method cannot be applied under overcast conditions in which almost the entire sky is covered with clouds. A considerable number of actually captured images are therefore likely to be ones whose top and bottom the above method cannot determine correctly, and for those images the service provider or other operator must still check and correct the orientation visually, one image at a time.
[0006]
[Patent Document 1]
JP 2001-202525 A
[0007]
[Problems to be solved by the invention]
As described above, a method that determines the top and bottom of an image from the color gradient or a similar property of a single imaging target element, such as only the “sky”, among the imaging target elements that a captured image may contain, is applicable to only a small proportion of images. Such a method therefore does not adequately meet the demand to improve work efficiency in the commercial services described above by reducing the burden of visually checking images one by one and reorienting them by hand.
[0008]
In view of these circumstances, an object of the present invention is to provide a method and apparatus for determining the top and bottom of an image that are applicable to most photographic images represented by digital data, thereby drastically reducing the proportion of images that ultimately require visual confirmation and manual correction of orientation, and greatly improving work efficiency in the commercial services described above.
[0009]
[Means for Solving the Problems]
That is, the method for determining the top and bottom of an image according to the present invention is characterized by including a step of creating reference data on the vertical positional relationship, in the real world, of a plurality of imaging target elements, and a step of determining, for an image that includes at least two image regions corresponding to at least two of the plurality of imaging target elements, the top and bottom of the image with reference to the reference data, such that the arrangement of the at least two image regions best reflects the vertical positional relationship in the real world.
[0010]
In this method, data that assigns one vertical arrangement level to each of the plurality of imaging target elements may be used as the reference data, and the step of determining the top and bottom of the image may include: a step of assigning to each of the at least two image regions, with reference to the reference data, the vertical arrangement level of the imaging target element corresponding to that image region; a step of deriving a first parameter representing the trend of change of the assigned vertical arrangement levels across the entire image in a first direction; a step of deriving a second parameter representing the trend of change of the assigned vertical arrangement levels across the entire image in a second direction orthogonal to the first direction; and a step of determining the top and bottom of the image based on the first parameter and the second parameter.
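The level-based variant can be illustrated with a minimal Python sketch. It uses the halving-and-difference parameters that the first embodiment describes (paragraphs 0040 to 0043 of the description). The level values in `LEVELS`, the function names, and the tie-breaking rule are assumptions made for illustration, not values taken from FIG. 7. The y axis is assumed to grow downward, matching the description's "lower half minus upper half" subtraction.

```python
# Hypothetical reference data: a larger level means "more likely to be higher up".
LEVELS = {"sky": 3, "tree": 2, "building": 1, "ground": 0}


def mean(values):
    return sum(values) / len(values)


def decide_up_direction(labels):
    """labels[y][x] is the element name at column x, row y (y grows downward).

    Returns "+x", "-x", "+y" or "-y" as the estimated upward direction.
    The image must be at least 2 x 2 pixels.
    """
    levels = [[LEVELS[name] for name in row] for row in labels]
    height, width = len(levels), len(levels[0])

    # First parameter: mean level of the right half minus that of the left half.
    left = [v for row in levels for v in row[: width // 2]]
    right = [v for row in levels for v in row[width - width // 2:]]
    dx = mean(right) - mean(left)

    # Second parameter: mean level of the lower half minus that of the upper half.
    upper = [v for row in levels[: height // 2] for v in row]
    lower = [v for row in levels[height - height // 2:] for v in row]
    dy = mean(lower) - mean(upper)

    # The axis with the larger absolute difference is taken as the vertical axis,
    # and the sign of that difference selects the direction along it.
    # Ties are resolved in favour of x and of the positive direction (an assumption).
    if abs(dx) >= abs(dy):
        return "+x" if dx >= 0 else "-x"
    return "+y" if dy >= 0 else "-y"
```

For an image with ground on the left and sky on the right, the sketch returns `"+x"`, meaning the image would be rotated so that its right edge ends up at the top.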
[0011]
Alternatively, in the method according to the present invention, data that probabilistically expresses, for at least some of the pairwise combinations of the plurality of imaging target elements, the real-world vertical positional relationship of the two imaging target elements forming each combination may be used as the reference data, and the step of determining the top and bottom of the image may include: a step of deriving an index representing the positions of the at least two image regions within the image; a step of assuming that one of the four directions of the image, which differ from one another by 90°, is the upward direction, and deriving a parameter representing the likelihood that this assumption is correct, based on the portion of the reference data that probabilistically expresses the real-world vertical positional relationship of the imaging target elements corresponding to the at least two image regions and on the index representing the positions of the at least two image regions within the image; a step of repeating the parameter-deriving step for the remaining three of the four directions of the image; and a step of identifying, as the upward direction of the image, the direction corresponding to whichever of the four parameters derived for the four directions represents the highest likelihood.
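The four-direction procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: the scoring rule in `score_direction` is a simplified stand-in, not the evaluation formula of the description, and every probability in `PAIR_PROBS` is hypothetical except the 75% and 5% quoted for the sky and tree pair (the 20% "side by side" value is inferred as the remainder). Region positions are represented by centroid coordinates with y growing downward.

```python
# Hypothetical reference data for a pair (A, B):
# (P(A above B), P(A beside B), P(A below B)).
PAIR_PROBS = {
    ("sky", "tree"): (0.75, 0.20, 0.05),
    ("sky", "ground"): (0.90, 0.08, 0.02),
    ("tree", "ground"): (0.80, 0.15, 0.05),
}

# Unit vector pointing "up" under each of the four assumptions.
UP_VECTORS = {"+x": (1, 0), "-x": (-1, 0), "+y": (0, 1), "-y": (0, -1)}


def score_direction(centroids, up):
    """Sum, over all known pairs, the probability of the arrangement observed
    when `up` is assumed to be the upward direction (simplified scoring)."""
    ux, uy = UP_VECTORS[up]
    total = 0.0
    for (a, b), (p_above, p_side, p_below) in PAIR_PROBS.items():
        if a not in centroids or b not in centroids:
            continue
        dx = centroids[a][0] - centroids[b][0]
        dy = centroids[a][1] - centroids[b][1]
        along = dx * ux + dy * uy        # how far A lies above B under this assumption
        across = abs(dx * uy - dy * ux)  # sideways separation of A and B
        if abs(along) <= across:
            total += p_side
        elif along > 0:
            total += p_above
        else:
            total += p_below
    return total


def most_likely_up(centroids):
    """Evaluate all four assumptions and return the best-scoring direction."""
    return max(UP_VECTORS, key=lambda up: score_direction(centroids, up))
```

With the sky centroid to the right of the tree and ground centroids, for example `most_likely_up({"sky": (8, 5), "tree": (4, 5), "ground": (1, 5)})`, the `"+x"` assumption collects the "above" probabilities of all three pairs and is selected.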
[0012]
The apparatus for determining the top and bottom of an image according to the present invention is characterized by including: storage means storing reference data on the real-world vertical positional relationship of a plurality of imaging target elements; and determining means that, for an image represented by input image data and including at least two image regions corresponding to at least two of the plurality of imaging target elements, determines the top and bottom of the image with reference to the reference data such that the arrangement of the at least two image regions best reflects the vertical positional relationship in the real world.
[0013]
In this apparatus, data that assigns one vertical arrangement level to each of the plurality of imaging target elements may be used as the reference data, and the determining means may include: means for assigning to each of the at least two image regions, with reference to the reference data, the vertical arrangement level of the imaging target element corresponding to that image region; means for deriving a first parameter representing the trend of change of the assigned vertical arrangement levels across the entire image in a first direction; means for deriving a second parameter representing the trend of change of the assigned vertical arrangement levels across the entire image in a second direction orthogonal to the first direction; and means for determining the top and bottom of the image based on the first parameter and the second parameter.
[0014]
Alternatively, in the apparatus according to the present invention, data that probabilistically expresses, for at least some of the pairwise combinations of the plurality of imaging target elements, the real-world vertical positional relationship of the two imaging target elements forming each combination may be used as the reference data, and the determining means may include: means for deriving an index representing the positions of the at least two image regions within the image; means for assuming, for each of the four directions of the image that differ from one another by 90°, that the direction is the upward direction, and deriving a parameter representing the likelihood that this assumption is correct, based on the portion of the reference data that probabilistically expresses the real-world vertical positional relationship of the imaging target elements corresponding to the at least two image regions and on the index representing the positions of the at least two image regions within the image; and means for identifying, as the upward direction of the image, the direction corresponding to whichever of the four parameters derived for the four directions represents the highest likelihood.
[0015]
In the present invention, an “imaging target element” is an individual element that exists in the real world and can be the subject of photography, for example the sky, clouds, mountains, trees, buildings, water, and the ground. An “image region” is an individual meaningful region into which an image handled by the method and apparatus according to the present invention is divided, and which can be identified as corresponding to one of the above imaging target elements.
[0016]
[Effects of the invention]
The method and apparatus according to the present invention determine the top and bottom of an image based on reference data on the relative vertical positional relationship, in the real world, of a plurality of imaging target elements such as the sky, trees, and the ground. The top and bottom can therefore be determined for any image that includes image regions corresponding to at least two of these imaging target elements. As a result, the proportion of images whose top and bottom can be determined is far higher than with conventional methods and apparatus, and the proportion of images that ultimately require visual confirmation and manual correction of orientation can be greatly reduced. This greatly improves the efficiency of commercial services that, at a customer's request, organize large numbers of images represented by digital data and record them on a CD-R or display them on a web page.
[0017]
The method and apparatus according to the present invention have the further advantage that the criteria for determining the top and bottom of an image can easily be changed by modifying or adding to the reference data, without complicated changes to a program.
[0018]
[Embodiments of the invention]
Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the drawings.
[0019]
FIG. 1 is a flowchart showing a series of image processing steps that includes the step of determining the top and bottom of an image according to the present invention. This series of steps makes it possible to display or record original images, which were recorded in inconsistent orientations, aligned in the correct orientation. First, in step 10, digital image data representing an original image is read. Next, in step 12, the original image is divided into a plurality of image regions, and in step 14 each image region is associated with an imaging target element existing in the real world. Then, in step 16, the top and bottom of the image are determined with reference to reference data on the real-world vertical positional relationship of the imaging target elements. Finally, in step 18, the original image is rotated to the correct orientation according to the top and bottom determined in step 16, and is displayed or recorded.
[0020]
The present invention relates to a method and apparatus for realizing step 16 of the series of image processing steps shown in FIG. 1. Before the embodiments of the present invention are described, however, an example of a method for dividing an image into image regions in step 12 and an example of a method for associating each image region with an imaging target element in step 14 are briefly described below.
[0021]
An example of the method of dividing an image into image regions in step 12 is described with reference to FIG. 2.
[0022]
First, FIG. 2(a) shows an original image to be processed, and each pixel of this original image has its own color and lightness. The colors and lightness of adjacent pixels are compared, and when their degree of similarity exceeds a predetermined criterion, the pixels are merged. This comparison and merging is repeated until no further merging occurs under the predetermined criterion, so that areas consisting of pixels of similar color and lightness grow. Suppose that the state after this pixel merging is complete is, for example, the state shown in FIG. 2(b).
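The repeated comparison and merging (integration) of adjacent pixels can be sketched as a simple region-growing pass. The sketch below is a minimal illustration, not the patent's implementation: it assumes a single lightness value per pixel instead of color and lightness, uses 4-connectivity, and treats "similarity exceeds the criterion" as "absolute difference is at most `max_diff`". The function name and parameter are hypothetical.

```python
from collections import deque


def grow_areas(image, max_diff):
    """Label 4-connected areas in a 2D list of lightness values.

    Two adjacent pixels are merged into the same area when their values
    differ by at most max_diff. Returns a 2D list of integer area labels.
    """
    height, width = len(image), len(image[0])
    labels = [[-1] * width for _ in range(height)]
    next_label = 0
    for sy in range(height):
        for sx in range(width):
            if labels[sy][sx] != -1:
                continue  # pixel already belongs to an area
            labels[sy][sx] = next_label
            queue = deque([(sx, sy)])
            # Grow the area until no adjacent pixel satisfies the criterion.
            while queue:
                x, y = queue.popleft()
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and labels[ny][nx] == -1
                            and abs(image[ny][nx] - image[y][x]) <= max_diff):
                        labels[ny][nx] = next_label
                        queue.append((nx, ny))
            next_label += 1
    return labels
```

For instance, `grow_areas([[10, 11, 50], [10, 12, 52]], 3)` yields two areas: the four dark pixels on the left and the two bright pixels on the right.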
[0023]
Here, among the areas making up the image shown in FIG. 2(b), an area whose perimeter is shorter than a predetermined length is called a “micro area”, and an area whose perimeter is equal to or greater than that length is called a “non-micro area”. In FIG. 2(b), areas 20, 22, 24, 26, and 28 are non-micro areas, and areas 30 and 32 are micro areas.
[0024]
Next, each area of the image in FIG. 2(b) is compared further with its adjacent areas, and areas that can be merged are merged further; the criterion for this merging differs between micro areas and non-micro areas. A micro area that is completely enclosed by a single non-micro area (for example, micro area 30, which is completely enclosed by non-micro area 24) is merged into that non-micro area. A micro area that borders two or more non-micro areas (for example, micro area 32, which borders non-micro areas 20 and 24) is merged into the non-micro area with which it shares the longer boundary (in this case, non-micro area 24). Under this criterion, the state after the micro areas have been merged is as shown in FIG. 2(c).
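The micro-area rule can be sketched as follows. This is a minimal illustration under assumptions: areas are given as a grid of integer labels, boundary length is approximated by counting shared pixel edges, and the set of micro-area labels is supplied by the caller. The function name is hypothetical, and a tie between two neighbors is broken arbitrarily, which the description does not specify. An enclosed micro area has only one neighbor, so the same "longest shared boundary" rule covers both cases.

```python
def merge_micro_areas(labels, micro_ids):
    """Merge each micro area into the adjacent non-micro area with which it
    shares the longest boundary (counted in pixel edges)."""
    height, width = len(labels), len(labels[0])
    shared = {m: {} for m in micro_ids}  # micro id -> {non-micro id: edge count}
    for y in range(height):
        for x in range(width):
            a = labels[y][x]
            # Look right and down so that every pixel edge is counted once.
            for nx, ny in ((x + 1, y), (x, y + 1)):
                if nx < width and ny < height:
                    b = labels[ny][nx]
                    for micro, other in ((a, b), (b, a)):
                        if micro in shared and other not in shared:
                            shared[micro][other] = shared[micro].get(other, 0) + 1
    # Pick the neighbor with the longest shared boundary for each micro area.
    target = {m: max(edges, key=edges.get) for m, edges in shared.items() if edges}
    return [[target.get(v, v) for v in row] for row in labels]
```

Using the area numbers of FIG. 2 as labels, a micro area 32 that touches area 20 along one edge and area 24 along two edges is relabeled 24, as is a micro area 30 surrounded by area 24.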
[0025]
For a non-micro area, the average color and lightness of its pixels are compared with the average color and lightness of the pixels of each adjacent non-micro area, and the areas are merged if there is an adjacent non-micro area whose degree of similarity exceeds a predetermined criterion. For example, if the similarity between the average color and lightness of non-micro area 22 in FIG. 2(c) and those of non-micro area 26 exceeds the predetermined criterion, while the similarity to those of each of the other adjacent non-micro areas 20, 24, and 28 is at or below the criterion, then non-micro area 22 is merged with non-micro area 26 and is not merged with non-micro areas 20, 24, or 28. The final result of merging the non-micro areas under this criterion is, for example, as shown in FIG. 2(d). Each region making up the image in this final state is an “image region” in the present invention.
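The comparison of average color and lightness between adjacent non-micro areas can be sketched as below. The description does not say how similarity is measured at this step, so the Euclidean distance between mean feature tuples and the threshold `max_distance` are assumptions, as are the function name and the sample values.

```python
import math


def similar_neighbors(area, neighbors, mean_feature, max_distance):
    """Return the adjacent areas whose mean color/lightness is close enough
    to that of `area` for the two to be merged."""
    return [n for n in neighbors
            if math.dist(mean_feature[area], mean_feature[n]) <= max_distance]


# Hypothetical mean (R, G, B) values for the areas of FIG. 2(c).
mean_feature = {
    20: (10, 200, 30),
    22: (100, 100, 100),
    24: (200, 50, 50),
    26: (104, 98, 101),
    28: (90, 140, 230),
}
partners = similar_neighbors(22, [20, 24, 26, 28], mean_feature, 10)
```

With these sample values only area 26 lies within the threshold of area 22, mirroring the example in the text.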
[0026]
An example of the method for dividing the image into image areas in step 12 of FIG. 1 has been described above with reference to FIG. 2. Needless to say, the division into image areas in step 12 may be performed by any other known method.
[0027]
Next, an example of a method for associating each image region with the imaging target element in step 14 in FIG. 1 will be described with reference to FIGS. 3 and 4. In this example, each image region is associated with an imaging target element using a self-organizing map created in advance.
[0028]
First, in step 40 of FIG. 3, the image that has been divided into image regions by the method already described with reference to FIG. 2 is divided into a plurality of blocks, as shown in FIG. 4. Here, for example, square blocks of 32 × 32 pixels are used.
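As a small illustration of step 40, the block grid can be enumerated as below. The function name `block_origins` is hypothetical, and the handling of partial blocks at the right and bottom edges is an assumption, since the text does not address image sizes that are not multiples of 32.

```python
def block_origins(width, height, size=32):
    """List the top-left pixel coordinates (x, y) of each size x size block.

    Blocks are listed row by row. Edge blocks may be smaller than `size`
    when the image dimensions are not multiples of it (an assumption).
    """
    return [(x, y)
            for y in range(0, height, size)
            for x in range(0, width, size)]


origins = block_origins(96, 64)  # a 96 x 64 image gives a 3 x 2 grid of blocks
```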
[0029]
Next, in step 42, the blocks included in one image region that is to be associated with an imaging target element are specified. Here, assuming that image region 28 in FIG. 4(a) is the first region to be associated with an imaging target element, the blocks shown shaded in FIG. 4(b) are identified as the blocks included in image region 28.
[0030]
Subsequently, in step 44, a feature vector is extracted for one of the blocks identified in step 42. This "feature vector" is a vector whose components are a plurality of feature quantities of the block. In addition to components indicating color features and lightness features, it may include components indicating other image characteristics, such as depth information.
[0031]
Next, in step 46, the feature vector extracted in step 44 is mapped onto the self-organizing map, and the imaging target element corresponding to the block is specified. The "self-organizing map" is a map in which points corresponding to a plurality of reference feature vectors are arranged two-dimensionally, with points corresponding to reference feature vectors of images having similar features placed close to each other. The map is created in advance by having a computer learn the features of a large number of images known to show elements such as "sky" and "ground", and the imaging target element most likely to correspond to each point is known in advance. The feature vector extracted in step 44 is mapped to the point on the map whose reference feature vector is most similar to it. Here, the similarity between the extracted feature vector and each reference feature vector is evaluated using the Euclidean distance between the vectors as an index. Mapping onto the self-organizing map thus identifies the imaging target element most likely to correspond to the current block.
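The lookup in step 46 amounts to finding the reference feature vector with the smallest Euclidean distance to the block's feature vector. The sketch below shows only that lookup, not the training of the self-organizing map; the function name `nearest_element`, the three-component vectors, and the reference values are hypothetical.

```python
import math


def nearest_element(feature, reference_vectors):
    """Return the element label of the most similar reference feature vector.

    reference_vectors is a list of (vector, element_label) pairs, one per
    point of an already trained map. Similarity is the Euclidean distance,
    as in the description; smaller distance means more similar.
    """
    _, label = min(reference_vectors,
                   key=lambda pair: math.dist(feature, pair[0]))
    return label


# Invented reference vectors standing in for points on a trained map.
reference_vectors = [
    ((0.20, 0.40, 0.90), "sky"),
    ((0.30, 0.50, 0.80), "water"),
    ((0.40, 0.30, 0.20), "ground"),
]
element = nearest_element((0.22, 0.41, 0.88), reference_vectors)
```

A real map would hold many points per element, so a block whose features sit between two elements can land on either, which is why the later majority step is needed.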
[0032]
Subsequently, the processing of FIG. 3 returns to step 44 through step 48, and the corresponding photographing target element is similarly specified for the following blocks among the shaded blocks in FIG. 4B.
[0033]
When the imaging target elements corresponding to all the blocks have been identified, the process in FIG. 3 proceeds to step 50. At this point, as shown in FIG. 4(c), most of the blocks in image region 28 have been identified as corresponding to "sky". However, some blocks have image features that are closer to those of, for example, "water" than to those of "sky", so blocks identified as corresponding to "water" are mixed in. In step 50, the imaging target element identified for the largest number of blocks in image region 28 is taken, by majority rule, as the imaging target element corresponding to image region 28. Therefore, as shown in FIG. 4(d), image region 28 is associated with "sky".
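The majority rule of step 50 can be sketched in a few lines. The function name `region_element` and the block counts are hypothetical; the text does not say how a tie between two elements is resolved, so this sketch simply takes whichever `Counter` returns first.

```python
from collections import Counter


def region_element(block_labels):
    """Return the element identified for the largest number of blocks."""
    if not block_labels:
        raise ValueError("the region contains no blocks")
    # most_common(1) yields [(label, count)] for the most frequent label.
    return Counter(block_labels).most_common(1)[0][0]


# Invented counts: most blocks of image region 28 map to "sky", a few to "water".
labels_in_region_28 = ["sky"] * 11 + ["water"] * 3
element_for_region_28 = region_element(labels_in_region_28)
```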
[0034]
Thereafter, the processing of FIG. 3 returns to step 42 through step 52, and steps 42 to 52 are repeated until the corresponding imaging target elements are specified for all image regions included in the image.
[0035]
An example of the method for associating each image region with an imaging target element in step 14 of FIG. 1 has been described above with reference to FIGS. 3 and 4. Needless to say, any other known method may be used for the association in step 14.
[0036]
Now, on the premise that the image whose top and bottom are to be determined has been divided into image regions by the method described above or the like, and that the imaging target element corresponding to each image region has been specified, two embodiments of the image top and bottom determination method according to the present invention (that is, step 16 of FIG. 1) will be described below.
[0037]
First, the first embodiment of the present invention will be described with reference to FIGS. 5 to 9. In this embodiment, the top and bottom of an image are determined using, as reference data, data in which one vertical arrangement level is assigned to each of a plurality of imaging target elements. Here, consider determining the top and bottom of the image shown in FIG. 5, which has been divided into a plurality of image regions by the method described above and in which each image region has been associated with an imaging target element existing in the real world. The image is composed of M × N pixels. At this stage, the meaning "sky", "tree", "building" or "ground" has been assigned to each pixel in each image region. The x and y coordinates of the image are as shown in the figure. Note that the image in FIG. 5 is simplified compared with the images in FIGS. 2 and 4, but this is merely for convenience in the following description.
[0038]
Determination of the top and bottom of the image according to the present embodiment is performed according to the procedure shown in the flowchart of FIG. 6.
[0039]
First, in step 60, a vertical arrangement level corresponding to the associated imaging target element is assigned to each image region of the image of FIG. 5. Strictly speaking, the level is assigned to each pixel belonging to each image region. This assignment is performed with reference to tabular reference data such as that shown in FIG. 7. The reference data is created in advance based on the relative vertical relationships of the imaging target elements in the real world and is stored in the storage means; an imaging target element that is likely to be located relatively higher in the real world is given a higher vertical arrangement level. According to this reference data, the vertical arrangement levels in this example are assigned as shown in FIG. 8.
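Step 60 is essentially a table lookup applied to every pixel. In the sketch below, the level values in `LEVELS` are hypothetical placeholders (the actual table of FIG. 7 is not reproduced in this text), and `assign_levels` is an illustrative name.

```python
# Hypothetical vertical arrangement levels; NOT the values of FIG. 7.
# Elements likely to appear higher in the real world get higher levels.
LEVELS = {"sky": 3, "tree": 2, "building": 2, "ground": 0}


def assign_levels(labels, levels=LEVELS):
    """Replace each pixel's element label with its vertical arrangement level.

    labels is a list of rows, each row a list of element labels.
    """
    return [[levels[label] for label in row] for row in labels]


# A tiny 2 x 3 label image used only for illustration.
labels = [
    ["ground", "tree", "sky"],
    ["ground", "building", "sky"],
]
level_map = assign_levels(labels)
```

Because the levels live in a plain table, changing the criteria or adding a new imaging target element only requires editing `LEVELS`.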
[0040]
Next, in steps 62 and 62′, the image is divided into two equal parts in each of the x direction and the y direction, as shown in FIG. 9, and the average of the vertical arrangement levels assigned to the pixels in each half is calculated. In the example of FIG. 9, the averages in the two halves in the x direction are 0.9 and 1.8, and the averages in the two halves in the y direction are 1.2 and 1.3.
[0041]
Subsequently, in steps 64 and 64′, the difference between the averages of the two halves is obtained for each of the x direction and the y direction, as a parameter representing the change tendency of the vertical arrangement level in that direction. Here, for the x direction in FIG. 9, the average in the left half is subtracted from the average in the right half, and for the y direction, the average in the upper half is subtracted from the average in the lower half. That is, in the example of FIG. 9, the difference in the x direction is 1.8 − 0.9 = +0.9, and the difference in the y direction is 1.3 − 1.2 = +0.1.
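Steps 62 to 64′ can be sketched as below. The sketch assumes the level map is a list of rows with x increasing to the right and y increasing downward, which is consistent with subtracting left from right and upper from lower; the function name `half_differences` and the treatment of odd sizes (the middle row or column is ignored) are assumptions.

```python
from statistics import fmean


def half_differences(levels):
    """Return (dx, dy): right-minus-left and lower-minus-upper half averages.

    levels is a list of rows of vertical arrangement levels. For an odd
    number of columns or rows, the middle one is left out of both halves
    (an assumption; the text only covers an evenly bisected image).
    """
    n_rows, n_cols = len(levels), len(levels[0])
    half_c, half_r = n_cols // 2, n_rows // 2
    left = [v for row in levels for v in row[:half_c]]
    right = [v for row in levels for v in row[n_cols - half_c:]]
    upper = [v for row in levels[:half_r] for v in row]
    lower = [v for row in levels[n_rows - half_r:] for v in row]
    return fmean(right) - fmean(left), fmean(lower) - fmean(upper)


# Levels rise from left to right and do not change from top to bottom.
dx, dy = half_differences([[0, 0, 2, 2],
                           [0, 0, 2, 2]])
```

For this small map the x difference is 2.0 and the y difference is 0.0, so the change tendency lies entirely along the x direction.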
[0042]
Next, in step 66, the difference in the x direction is compared with the difference in the y direction, and the axis whose difference has the larger absolute value is specified as the vertical axis of the image. In the example of FIG. 9, as described above, the difference in the x direction is +0.9 and the difference in the y direction is +0.1. Since the absolute value of the difference in the x direction is larger, the x axis is identified as the vertical axis.
[0043]
Subsequently, in step 68, the upward direction of the image is identified from the sign of the difference for the specified vertical axis: if the sign is positive, the +x or +y direction is identified as the upward direction, and if it is negative, the −x or −y direction is identified as the upward direction. Here, the x axis was specified as the vertical axis in step 66 and the difference in the x direction is a positive value, +0.9, so the +x direction is identified as the upward direction.
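Steps 66 and 68 reduce to comparing two absolute values and reading one sign. A minimal sketch follows; the function name `upward_direction` is hypothetical, and the behavior for equal absolute values or a zero difference is an assumption because the text does not cover those cases.

```python
def upward_direction(dx, dy):
    """Return '+x', '-x', '+y' or '-y' as the upward direction of the image.

    dx and dy are the differences of the half averages in the x and y
    directions. Ties go to the x axis and a zero difference is treated
    as negative; both choices are assumptions, not from the source.
    """
    if abs(dx) >= abs(dy):          # step 66: pick the vertical axis
        return "+x" if dx > 0 else "-x"   # step 68: sign gives the direction
    return "+y" if dy > 0 else "-y"


# The differences from the FIG. 9 example: +0.9 in x and +0.1 in y.
direction = upward_direction(0.9, 0.1)
```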
[0044]
Through the above steps, the upward direction of the image shown in FIG. 5 is determined to be the + x direction. From this result, the image in FIG. 5 is rotated 90 ° to the left and displayed or recorded.
[0045]
In the above-described embodiment, the image is divided into two equal parts in each of the x direction and the y direction, and the top and bottom of the image are determined using, as parameters, the differences between the averages of the vertical arrangement levels in the two halves. However, any other parameter may be used as long as it appropriately represents the change tendency of the vertical arrangement level in the x and y directions over the entire image. For example, as one modification of the above embodiment, the slope of the vertical arrangement levels assigned to the pixels may be obtained for each row and each column, and the averages of these slopes in the x direction and the y direction may be used as the parameters representing the change tendency of the vertical arrangement level in the x direction and the y direction.
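One way to read the slope-based modification is sketched below. The text does not say how the slope of a row or column is computed, so a least-squares line fit against the pixel index is assumed here; `slope` and `mean_slopes` are hypothetical names, and every row and column must contain at least two pixels.

```python
def slope(values):
    """Least-squares slope of values against their index 0, 1, 2, ..."""
    n = len(values)
    mean_i = (n - 1) / 2
    mean_v = sum(values) / n
    num = sum((i - mean_i) * (v - mean_v) for i, v in enumerate(values))
    den = sum((i - mean_i) ** 2 for i in range(n))
    return num / den


def mean_slopes(levels):
    """Return (average slope along x over rows, average slope along y over columns)."""
    rows = levels
    cols = [list(col) for col in zip(*levels)]
    sx = sum(slope(r) for r in rows) / len(rows)
    sy = sum(slope(c) for c in cols) / len(cols)
    return sx, sy


# Levels increase by one per pixel along x and are constant along y.
sx, sy = mean_slopes([[0, 1, 2],
                      [0, 1, 2]])
```

The resulting pair can be compared in the same way as the half-average differences: the axis with the larger absolute value is taken as vertical and its sign gives the upward direction.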
[0046]
Next, a second embodiment of the present invention will be described with reference to FIGS. 10 to 16. This embodiment determines the top and bottom of an image using, as reference data, data that probabilistically represents, for at least some of the combinations of two of the plurality of imaging target elements, the positional vertical relationship in the real world of the two imaging target elements forming each combination. Here, as in the description of the first embodiment, the specific procedure will be described using the image shown in FIG. 5.
[0047]
Determination of the top and bottom of the image according to the present embodiment is performed according to the procedure shown in the flowchart of FIG. 10.
[0048]
First, in step 70, for the image of FIG. 5, which has already been divided into image regions with each region associated with an imaging target element, the barycentric (center-of-gravity) coordinates of each image region are specified as shown in FIG. 11(a). These barycentric coordinates are used in the subsequent steps as an index representing the position of each image region within the entire image.
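The barycentric coordinates of step 70 are the mean pixel coordinates of each region. The sketch below assumes, for simplicity, that each element label appears in exactly one image region, so the label can double as the region identifier; `centroids` is a hypothetical name.

```python
from collections import defaultdict


def centroids(labels):
    """Return {label: (mean_x, mean_y)} over all pixels carrying that label.

    labels is a list of rows; x is the column index and y is the row index.
    Assumes one image region per label (a simplification for this sketch).
    """
    sums = defaultdict(lambda: [0, 0, 0])  # label -> [sum_x, sum_y, count]
    for y, row in enumerate(labels):
        for x, label in enumerate(row):
            acc = sums[label]
            acc[0] += x
            acc[1] += y
            acc[2] += 1
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}


# A tiny label image: ground on the left, sky on the right.
centers = centroids([["ground", "sky"],
                     ["ground", "sky"]])
```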
[0049]
Next, in step 72, a search for an "attention area" is performed. An "attention area" is an image region, among those included in the image, that corresponds to an imaging target element designated as one to be given particular attention in determining the top and bottom of the image. That is, in the present embodiment the top and bottom of the image are determined based on the relative vertical relationships between each attention area and the other image regions. For example, the image in FIG. 11(a) includes image regions corresponding to four imaging target elements: sky, tree, building, and ground. If only the image region corresponding to the sky is designated as an attention area, the top and bottom of the image are determined based on the relative vertical relationships between sky and tree, sky and building, and sky and ground. If the image regions corresponding to the five imaging target elements sky, mountain, tree, building, and ground are designated as attention areas, then, because the image of FIG. 11(a) includes image regions corresponding to four of them (sky, tree, building, and ground), the top and bottom of the image are determined based on the relative vertical relationships between sky and tree, sky and building, sky and ground, tree and building, tree and ground, and building and ground. Which imaging target elements' image regions serve as attention areas and, when there are several attention areas, the order in which their relationships with the other image regions are examined (that is, the search order) are specified in advance. In the following, the case where the image regions corresponding to the five imaging target elements sky, mountain, tree, building, and ground are designated as attention areas and the search order is sky → mountain → tree → building → ground will be described as an example.
Therefore, in step 72, an image region corresponding to the sky is searched for first as an attention area. Since the image of FIG. 11(a) includes an image region corresponding to the sky, the process proceeds to the next step 74 with this region as the attention area.
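The search over attention areas can be sketched as the enumeration of element pairs below. The names `SEARCH_ORDER` and `pairs_to_evaluate` are hypothetical; the sketch walks the attention elements in the stated search order, skips elements that do not appear in the image, and does not repeat a pair that has already been listed.

```python
SEARCH_ORDER = ["sky", "mountain", "tree", "building", "ground"]


def pairs_to_evaluate(present, search_order=SEARCH_ORDER):
    """List (attention element, other element) pairs in evaluation order.

    present is the list of elements that have an image region in the image.
    """
    pairs, seen = [], set()
    for attention in search_order:
        if attention not in present:
            continue                      # e.g. no "mountain" region exists
        for other in present:
            key = frozenset((attention, other))
            if other == attention or key in seen:
                continue                  # skip self-pairs and repeats
            seen.add(key)
            pairs.append((attention, other))
    return pairs


# The image of FIG. 11(a) contains these four elements.
pairs = pairs_to_evaluate(["sky", "tree", "building", "ground"])
# Designating only the sky as an attention element gives three pairs.
sky_only = pairs_to_evaluate(["sky", "tree", "building", "ground"], ["sky"])
```

With all five elements designated, the four elements present yield six pairs, matching the six relationships listed in the description.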
[0050]
In step 74, the relative positional relationship between the attention area corresponding to the sky and all other image areas (that is, image areas corresponding to trees, buildings, and the ground) is evaluated.
[0051]
In the present embodiment, this evaluation is performed with reference to the tabular reference data shown in FIG. 12. This reference data probabilistically represents the positional vertical relationship in the real world for all or some of the possible combinations of two imaging target elements. For example, the data portion in the first row of FIG. 12 gives, for real-world scenes, the probability that the sky is above the tree, the probability that the sky and the tree are side by side, and the probability that the sky is below the tree. This reference data is created in advance and stored in the storage means. Note that FIG. 12 shows only the data portion necessary for the description of the present embodiment; in practice, the reference data also defines the relationships among many other imaging target elements.
[0052]
Evaluation of the relationship between two image regions, one of which is an attention area, starts by reading out, from the reference data of FIG. 12, the data portion that defines the relationship between the two imaging target elements corresponding to the two image regions. In this embodiment, the data portion in the first row of FIG. 12 is read first, in order to evaluate the relationship between the attention area corresponding to the sky and the image region corresponding to the tree.
[0053]
Next, based on the information in the data portion that was read and the barycentric coordinates specified in step 70, weighting factors $(w_x, w_y)$ are calculated. Let $(x_A, y_A)$ and $(x_B, y_B)$ be the barycentric coordinates of the image regions corresponding to two imaging target elements A and B. Based on the probability $p_{\text{A below, B above}}$ that element A is below and B is above, and the probability $p_{\text{A above, B below}}$ that element A is above and B is below, both taken from the data portion that was read, the following is calculated:
[Expression 1]

and the weighting factors $(w_x, w_y)$ are then obtained by
[Expression 2]

In evaluating the relationship between the image regions corresponding to the sky and the tree, the first row of the reference data in FIG. 12 gives a probability of 75% that the tree is below and a probability of 5% that the sky is below, so $\Delta x$ and $\Delta y$ above become
[Expression 3]

The image orientation being evaluated at this stage is that of FIG. 11(a), in which $x_{\text{sky}} > x_{\text{tree}}$ and $y_{\text{sky}} = y_{\text{tree}}$. Therefore $\Delta x > 0$ and $\Delta y = 0$, and the weighting factors are $w_x = 1$, $w_y = 0$.
[0054]
Subsequently, the evaluation value $s_{AB}$ of the relative positional relationship between the two image regions in the current image orientation is calculated by
[Expression 4]

where $p_{\text{A, B side by side}}$ is the probability, given in the reference data, that imaging target elements A and B are side by side, $\max(p_{\text{A below, B above}}, p_{\text{A above, B below}})$ is the larger of $p_{\text{A below, B above}}$ and $p_{\text{A above, B below}}$, and $\min(p_{\text{A below, B above}}, p_{\text{A above, B below}})$ is the smaller of the two. For the image regions corresponding to the sky and the tree in the orientation of FIG. 11(a), $\Delta y = 0$ as described above, so the evaluation value is
[Expression 5]
[0055]
Similarly, the evaluation values $s_{\text{sky, building}}$ and $s_{\text{sky, ground}}$ are calculated for the relative positional relationships between the sky and the building and between the sky and the ground. This completes the processing of step 74 with the image region corresponding to the sky as the attention area.
[0056]
Next, the process moves to step 76, and the next attention area is searched. Here, as described above, the search order in this embodiment is sky → mountain → tree → building → ground, so the image region corresponding to the second “mountain” is searched in the image to be evaluated. However, since there is no image area corresponding to “mountain”, the image area corresponding to “tree” is searched next and specified as the attention area.
[0057]
Since the next attention area has been identified, the determination in step 78 is "No" and the process returns to step 74, where the relative positional relationships between the attention area corresponding to "tree" and the other image regions are evaluated by the same method as described above. Here, the relationship between the image regions corresponding to the sky and the tree has already been evaluated and is therefore excluded, so the evaluation values $s_{\text{tree, building}}$ and $s_{\text{tree, ground}}$ are calculated for tree and building and for tree and ground.
[0058]
When the loop consisting of steps 74, 76 and 78 has been completed for all the attention areas, the process proceeds to step 80. At this stage, as shown in FIG. 13, the evaluation values $s_{AB}$ of the relative positional relationships in the image orientation of FIG. 11(a) have been calculated for the image regions corresponding to six pairs of imaging target elements. In step 80, the total value S of these six evaluation values is calculated. This total serves as a parameter indicating the likelihood that the image orientation currently being evaluated is one in which the top and bottom are correctly arranged. In this embodiment, the larger the total value S, the more likely the orientation is to be correct. As shown in FIG. 13, the total value S indicating the likelihood that the image orientation of FIG. 11(a) is correct is 159.
[0059]
Next, the process proceeds through step 82 to step 84, where the orientation of the image is virtually rotated by 90° to the orientation shown in FIG. 11(b). Steps 70 to 80 are then repeated, and the total value S representing the likelihood that the orientation is correct is calculated for the orientation shown in FIG. 11(b). The same steps are carried out in turn for the orientation shown in FIG. 11(c) and the orientation shown in FIG. 11(d). The evaluation results for these orientations are shown in FIGS. 14 to 16.
[0060]
When all four image orientations have been evaluated, the process proceeds to step 86, and the orientation with the largest total value S is specified as the correct image orientation. In this example, the total values S for the orientations of FIGS. 11(a) to 11(d) are 159, 2, 159, and 368, respectively. Therefore, the orientation of FIG. 11(d), which has the largest total, is identified as the correct orientation.
[0061]
Through the above steps, the top and bottom of the image in FIG. 5 are correctly determined, and the image is displayed or recorded in the orientation shown in FIG. 11(d).
[0062]
In the above embodiment, the barycentric coordinates of each image area are used as an index representing the position of each image area in the entire image. However, the present invention is not limited to this, and other indices may be used. For example, the coordinates of the pixel at the highest vertical position among the pixels forming each image area may be used.
[0063]
In step 86 of the above-described embodiment, the orientation with the largest total value S is simply specified as the correct image orientation. Alternatively, a threshold may be set, and the orientation corresponding to the largest total value may be specified as the correct orientation only when that total exceeds the threshold. As a further alternative, the orientation corresponding to the largest total value may be specified as the correct orientation only when the ratio of the largest total value to the sum of the largest and second-largest total values exceeds a set threshold (for example, 0.6). In these cases, for an image whose top and bottom cannot be determined under the set threshold, the service provider or the like visually checks and corrects the image orientation.
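The ratio-based variant of step 86 can be sketched as follows, using the four totals given for FIGS. 11(a) to 11(d). The function name `decide_orientation`, the use of `None` to signal that manual confirmation is needed, and the handling of a non-positive denominator are assumptions.

```python
def decide_orientation(totals, ratio_threshold=0.6):
    """Return the orientation with the largest total S, or None if undecided.

    totals maps an orientation name to its total value S. The best
    orientation is accepted only when best / (best + second best)
    exceeds ratio_threshold; otherwise None is returned so that the
    image can be checked by eye.
    """
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    (best_name, best), (_, second) = ranked[0], ranked[1]
    if best + second <= 0:
        return None                       # ratio undefined (assumption)
    return best_name if best / (best + second) > ratio_threshold else None


# Totals from the description for the four orientations of FIG. 11.
totals = {"FIG. 11(a)": 159, "FIG. 11(b)": 2, "FIG. 11(c)": 159, "FIG. 11(d)": 368}
chosen = decide_orientation(totals)                     # ratio is 368 / 527
strict = decide_orientation(totals, ratio_threshold=0.7)
```

With these totals the ratio is about 0.698, so the 0.6 threshold accepts the best orientation while a stricter 0.7 threshold would send the image to manual confirmation.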
[0064]
In the above embodiment, the correct image orientation is determined based on the total of all the evaluation values in each orientation. However, it may instead be determined based on the maximum of the evaluation values $s_{AB}$ in each orientation.
[0065]
Further, in the above embodiment, the attention areas and their search order are defined in advance. However, all the image regions present in the image being evaluated may instead be searched in turn, and the relative positional relationships may be evaluated for all combinations of those image regions.
[0066]
According to each of the embodiments described above, the top and bottom of an image are determined based on reference data concerning the relative vertical relationships in the real world of a plurality of imaging target elements such as the sky, trees, and the ground. The top and bottom can therefore be determined for any image that includes image regions corresponding to at least two of those imaging target elements. When a threshold is provided in the second embodiment, some images still require visual confirmation and manual correction of the orientation. However, because the present invention determines the top and bottom of the image based on the relative vertical relationships among a plurality of imaging target elements, the proportion of images that require such visual confirmation and manual operation is greatly reduced compared with the conventional method.
[0067]
In addition, according to each of the above embodiments, the criteria for determining the top and bottom of an image can easily be changed by correcting the reference data or by adding data for a new imaging target element, without changing a complicated program.
[0068]
The two embodiments of the present invention have been described in detail above. However, these embodiments are merely illustrative, and the technical scope of the present invention should be defined only by the claims in this specification.
[Brief description of the drawings]
FIG. 1 is a flowchart showing a series of image processing steps including an image up / down determination step according to the present invention.
FIG. 2 is a process diagram illustrating an example of a method for dividing an image area.
FIG. 3 is a flowchart showing an example of a method for associating an image area with an imaging target element.
FIG. 4 is a process diagram showing each step of the method shown in FIG. 3.
FIG. 5 is a diagram showing an image to be determined up and down according to the present invention.
FIG. 6 is a flowchart showing an image up / down determination method according to the first embodiment of the present invention;
FIG. 7 is a table showing an example of reference data used in the first embodiment of the present invention.
FIG. 8 is a diagram showing a state in which an upper and lower arrangement level is assigned to an image in the first embodiment of the present invention.
FIG. 9 is a diagram showing a method for deriving a difference value representing a change tendency of the vertical arrangement level with respect to the x direction and the y direction in the first embodiment of the present invention.
FIG. 10 is a flowchart showing an image up / down determination method according to the second embodiment of the present invention;
FIG. 11 is a diagram showing four directions of an image evaluated in the second embodiment of the present invention.
FIG. 12 is a table showing an example of reference data used in the second embodiment of the present invention.
FIG. 13 is a table showing evaluation values for the orientation of FIG. 11(a) in the second embodiment of the present invention.
FIG. 14 is a table showing evaluation values for the orientation of FIG. 11(b) in the second embodiment of the present invention.
FIG. 15 is a table showing evaluation values for the orientation of FIG. 11(c) in the second embodiment of the present invention.
FIG. 16 is a table showing evaluation values for the orientation of FIG. 11(d) in the second embodiment of the present invention.

Claims

Creating reference data that assigns one vertical arrangement level to each of the plurality of imaging target elements, with respect to the positional vertical relationship in the real world of the plurality of imaging target elements;
With respect to an image including at least two image areas corresponding to at least two of the plurality of imaging target elements, the imaging corresponding to the image area is referred to in each of the at least two image areas with reference to the reference data. Assigning the upper and lower arrangement levels of the target element;
Deriving a first parameter representing a change tendency of the assigned top and bottom arrangement level across a first direction of the entire image;
Deriving a second parameter representing a change tendency of the assigned top and bottom arrangement level over a second direction orthogonal to the first direction of the entire image;
Based on the first parameter and the second parameter, the upper and lower sides of the image are determined so that the arrangement of the at least two image areas best reflects the positional vertical relationship in the real world. An image up / down determination method comprising a step.

Regarding at least a part of a combination of two of the plurality of imaging target elements with respect to a positional vertical relationship in the real world of the plurality of imaging target elements, the positional relationship in the real world of the two imaging target elements forming the combination Creating reference data that stochastically represents the hierarchical relationship;
Deriving an index representative of the position of the at least two image areas in the image for an image including at least two image areas corresponding to at least two of the plurality of imaging target elements;
Assuming that one of the four directions that differ by 90 ° of the image is the upward direction, the position in the real world of the imaging target element corresponding to the at least two image regions in the reference data Deriving a parameter representing the likelihood that the assumption is correct based on a data portion that stochastically represents a hierarchical relationship and the index representative of the position of the at least two image regions in the image;
Repeating the step of deriving the parameters for the remaining three directions of the four directions of the image;
Specifying, as the upward direction of the image, the direction corresponding to the parameter that represents the highest likelihood among the four parameters derived for the four directions of the image, so that the arrangement of the at least two image regions best reflects the positional vertical relationship in the real world; a method for determining the top and bottom of an image, characterized by comprising the above steps.

Storage means for storing reference data in which one vertical arrangement level is assigned to each of the plurality of imaging target elements, with respect to the positional vertical relationship in the real world of the plurality of imaging target elements;
With respect to an image including at least two image areas corresponding to at least two of the plurality of imaging target elements, the imaging corresponding to the image area is referred to in each of the at least two image areas with reference to the reference data. Means for assigning the upper and lower arrangement levels of the target element;
Means for deriving a first parameter representing a change tendency of the assigned top and bottom arrangement level across a first direction of the entire image;
Means for deriving a second parameter representing a change tendency of the assigned top and bottom arrangement level over a second direction orthogonal to the first direction of the entire image;
Based on the first parameter and the second parameter, the upper and lower sides of the image are determined so that the arrangement of the at least two image areas best reflects the positional vertical relationship in the real world. An apparatus for determining the up and down of an image, characterized in that it includes means.

Storage means storing reference data that, with respect to the positional vertical relationship in the real world of a plurality of imaging target elements, stochastically represents, for at least some of the combinations of two of the plurality of imaging target elements, the positional vertical relationship in the real world of the two imaging target elements forming each combination;
Means for deriving an index representative of the position of the at least two image areas in the image for an image including at least two image areas corresponding to at least two of the plurality of imaging target elements;
Means for deriving, for each of four directions of the image that differ by 90°, on the assumption that the direction is the upward direction, a parameter representing the likelihood that the assumption is correct, based on the data portion of the reference data that stochastically represents the positional vertical relationship in the real world of the imaging target elements corresponding to the at least two image regions, and on the index representative of the position of the at least two image regions in the image;
Means for specifying, as the upward direction of the image, the direction corresponding to the parameter that represents the highest likelihood among the four parameters derived for the four directions of the image, so that the arrangement of the at least two image regions best reflects the positional vertical relationship in the real world; an apparatus for determining the top and bottom of an image, characterized by comprising the above means.