JP5962491B2

JP5962491B2 - Angle of view adjustment apparatus, method, and program

Info

Publication number: JP5962491B2
Application number: JP2012277361A
Authority: JP
Inventors: 雅行広浜; 和久松永; 道大二瓶; 浩一中込
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2012-12-19
Filing date: 2012-12-19
Publication date: 2016-08-03
Anticipated expiration: 2032-12-19
Also published as: JP2014120136A

Description

The present invention relates to an angle-of-view adjustment apparatus, method, and program for adjusting the angle of view with respect to a main subject in an image.

People sometimes want to know the name of a flower they have seen in the hills and fields or by the roadside. A technique has therefore been proposed (for example, the technique described in Patent Document 1) in which an image of the flower as the object is extracted, using a clustering method, from a digital image of a flower obtained by photographing or the like, and information obtained from the extracted flower image is used as a feature quantity. One or more feature quantities are computed, and these are analyzed by statistical methods against the feature quantities of various plants registered in a database in advance to determine the species of the wild plant.

A conventional technique is also known in which an image containing a main subject is divided into the main subject and the background using the Graph Cuts method (for example, the techniques described in Non-Patent Document 1 and Patent Document 2). When region division is performed, the boundary may be unclear in places depending on the relationship between the main subject and the background, so the division must be performed optimally. This prior art therefore treats region division as an energy minimization problem and proposes a method for that minimization. A graph is constructed to suit the region division, and the energy function is minimized by finding the minimum cut of that graph. The minimum cut is computed with a maximum-flow algorithm, which makes the region division calculation efficient.
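As a rough illustration of the minimum-cut idea (not the formulation of Non-Patent Document 1 itself), the sketch below runs a plain breadth-first augmenting-path maximum-flow search on a toy graph with two pixels. The function name `min_cut`, the node names, and all capacities are hypothetical values chosen for illustration only.

```python
from collections import deque

def min_cut(capacity, source, sink):
    """Return (max_flow, nodes on the source side of a minimum cut).

    capacity: dict {u: {v: capacity of directed edge u -> v}}.
    """
    # Build the residual graph, adding zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, edges in capacity.items():
        for v, c in edges.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)

    def reachable():
        # Breadth-first search over edges with remaining capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable()
        if sink not in parent:
            # No augmenting path left: the reachable set defines the cut.
            return flow, set(parent)
        # Walk back from the sink to recover the augmenting path.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck

# Toy graph: "s" and "t" are the two terminals, "p" and "q" are pixels.
toy = {
    "s": {"p": 5, "q": 1},   # links from the source terminal
    "p": {"t": 1, "q": 2},   # link to the sink and a neighbor link
    "q": {"t": 5, "p": 2},
}
flow, source_side = min_cut(toy, "s", "t")
```

In this toy graph pixel `p` is tied strongly to the source and `q` to the sink, so the cheapest cut separates them; which terminal stands for the main subject is a modeling choice that the sketch leaves open.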

In one known way of dividing the main subject and the background with the Graph Cuts method, the region label assigned to each pixel in the image, indicating main subject or background, is updated while region division is performed based on those region labels and the pixel value of each pixel. In this case, for example, an energy function containing the following cost terms is defined. The first is a cost term whose value decreases as the value of a histogram, computed for example per color pixel value from the image showing the main subject, increases. The second is a cost term whose value decreases as the value of a histogram, computed for example per color pixel value from the image showing the background, increases. Minimizing this energy function divides the image into the main subject and the background (the method described, for example, in Non-Patent Document 1).
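A minimal sketch of such a histogram-based cost term is given below. It assumes a negative-log form, which is a common choice but is not stated in this passage; the function names, the `eps` smoothing constant, and the color names used as bins are all hypothetical.

```python
import math
from collections import Counter

def normalized_histogram(values):
    """Relative frequency of each (quantized) pixel value."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}

def cost_from_histogram(hist, value, eps=1e-6):
    """Cost that shrinks as the histogram value for `value` grows.

    The negative-log form and `eps` are assumptions of this sketch.
    """
    return -math.log(hist.get(value, 0.0) + eps)

# Pixels currently labeled as main subject, quantized to color names.
subject_pixels = ["yellow"] * 8 + ["green"] * 2
subject_hist = normalized_histogram(subject_pixels)

cost_yellow = cost_from_histogram(subject_hist, "yellow")
cost_green = cost_from_histogram(subject_hist, "green")
cost_blue = cost_from_histogram(subject_hist, "blue")  # never seen in the subject
```

A color that is frequent in the main subject receives a low cost for being labeled as main subject, while a color that never appears there receives a high one. The same construction applied to background pixels gives the second cost term.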

In some cases, however, the Graph Cuts method alone has difficulty separating the main subject from the background. For this reason, implementations on smartphones and the like provide a function that lets the user specify a rectangular frame, using an input device such as a touch panel, around the approximate region of a captured image in which the object to be recognized (for example, a flower) is located.
Alternatively, the user adjusts the angle of view of the camera so that the object to be recognized appears large within a predetermined rectangular frame.

Patent Document 1: Japanese Patent Laid-Open No. 2002-203242
Patent Document 2: Japanese Patent Laid-Open No. 2011-35636

Non-Patent Document 1: Y. Boykov and G. Funka-Lea: “Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images”, Proceedings of “International Conference on Computer Vision”, Vancouver, Canada, vol. I, p. 105-112, July 2001.

However, as described above, having the user specify a rectangular frame, or adjust the angle of view of the camera so that the object to be recognized appears large within a predetermined rectangular frame, is cumbersome and impairs operability.
On the other hand, if the rectangular frame or the angle of view is specified carelessly, the accuracy of identifying a flower image or the like is reduced.

An object of the present invention is to make it possible to adjust the angle of view with respect to a main subject with a simple operation.

One example of an aspect is an angle-of-view adjustment device that adjusts the angle of view with respect to a main subject so that the main subject has an appropriate angle of view in an image, the device comprising: position designation means for designating the center position of a target angle of view; region dividing means for executing, while performing processing that divides the main subject and the background other than the main subject into regions based on each pixel value in the image, a first region division process that assigns the pixels surrounding the center position to the region of the main subject, and a second region division process that repeats, until a predetermined end condition is reached, processing that raises the degree to which each pixel around the main subject so divided is assigned to the region of the main subject; and angle-of-view adjustment means for adjusting the angle of view with respect to the main subject by zooming in, with reference to a circumscribed rectangular frame surrounding the main subject divided by the second region division process, so that the circumscribed rectangular frame fits within the image. The device further comprises initial zoom means that takes as a zoom region a rectangle extended from the center position in each of the up, down, left, and right directions by the length of the shortest of the four line segments drawn from the center position toward the top, bottom, left, and right edges of the image, zooms in so that the zoom region fits in the center of the screen with its aspect ratio preserved, and takes the result as the initial zoom region after zooming in. The adjustment of the angle of view is performed on the initial zoom region, and the predetermined end condition is that any boundary of the region of the main subject divided by the second region division process has reached an edge of the initial zoom region.

According to the present invention, it is possible to improve the accuracy of region division.

FIG. 1 is a functional block diagram of an angle-of-view adjustment device according to an embodiment of the present invention.
FIG. 2 is an explanatory diagram of the operation of the angle-of-view adjustment device of FIG. 1.
FIG. 3 is a block diagram showing a hardware configuration example of the angle-of-view adjustment device according to an embodiment of the present invention.
FIG. 4 is a flowchart showing the overall operation of the angle-of-view adjustment processing according to this embodiment.
FIG. 5 is an explanatory diagram of the initial zoom process.
FIG. 6 is an explanatory diagram of the end determination.
FIG. 7 is an explanatory diagram of the final zoom process.
FIG. 8 is an explanatory diagram of a weighted directed graph.
FIG. 9 is an explanatory diagram of the histogram θ.
FIG. 10 is a characteristic diagram of h_uv(X_u, X_v).
FIG. 11 is a diagram schematically showing the relationship between a graph having t-links and n-links, the region label vector X, and a graph cut.
FIG. 12 is a flowchart showing the processing of data update 1.
FIG. 13 is a flowchart showing the processing of data update 2.

Hereinafter, embodiments for carrying out the present invention will be described in detail with reference to the drawings.

FIG. 1 is a functional block diagram of an angle-of-view adjustment device according to an embodiment of the present invention.

This embodiment is an angle-of-view adjustment device that adjusts the angle of view with respect to a main subject so that the main subject has an appropriate angle of view in an image.

The position designation unit 101 designates the center position of the target angle of view. The position designation unit 101 is, for example, a means by which the user designates the center position of the target angle of view by touching it with a finger or the like on the touch-enabled display of a smartphone or digital camera that shows the image.

The first region dividing unit 102 executes processing that divides the main subject and the background other than the main subject into regions based on each pixel value in the image. In doing so, the first region dividing unit 102 assigns the pixels surrounding the center position to the region of the main subject. Specifically, the first region dividing unit 102, for example, updates the region label assigned to each pixel in the image, indicating main subject or background, and divides the image into the main subject and the background by minimizing an energy function, for example by the Graph Cuts method, based on the region labels and the pixel value of each pixel. This energy function includes a first cost term whose value decreases as the value of a first histogram, computed per pixel value from the image showing the main subject, increases, and a second cost term whose value decreases as the value of a second histogram, computed per pixel value from the image showing the background, increases. For the pixels surrounding the center position, the value of the first cost term is zero.

The second region dividing unit 103 executes processing that divides the main subject and the background other than the main subject into regions based on each pixel value in the image. In doing so, the second region dividing unit 103 repeats, until a predetermined end condition is reached, processing that raises the degree to which each pixel around the main subject divided by the first region dividing unit 102 is assigned to the region of the main subject. Specifically, the second region dividing unit 103, for example, updates the region label assigned to each pixel in the image, indicating main subject or background, and repeats processing that further divides the image into the main subject and the background by minimizing an energy function, for example by the Graph Cuts method, based on the region labels and the pixel value of each pixel. This energy function includes a first cost term whose value decreases as the value of the first histogram increases and a second cost term whose value decreases as the value of the second histogram increases. When the value of the first histogram at the bin given by a pixel's value is larger than a predetermined histogram threshold and the saturation of that pixel value is larger than a predetermined saturation threshold, the first cost term is made smaller.

The angle-of-view adjustment unit 104 adjusts the angle of view with respect to the main subject with reference to a circumscribed rectangular frame surrounding the main subject divided by the second region dividing unit 103. The angle-of-view adjustment unit 104, for example, zooms in so that the circumscribed rectangular frame, together with a fixed margin on the top, bottom, left, and right, fits within the screen.

The configuration described above may further include the following initial zoom means. The initial zoom means takes as a zoom region a rectangle extended from the center position of the target angle of view in each of the up, down, left, and right directions by the length of the shortest of the four line segments drawn from the center position toward the top, bottom, left, and right edges of the image. The initial zoom means then performs an initial zoom-in adjustment of the angle of view so that the zoom region fits in the center of the screen with its aspect ratio preserved, and takes the result as the initial zoom region after zooming in.
The adjustment of the angle of view by the configuration of FIG. 1 is then performed on this initial zoom region.
In this case, the predetermined end condition mentioned above is that any boundary of the divided main subject region has reached an edge of the initial zoom region.

FIG. 2 is an explanatory diagram of the operation of the angle-of-view adjustment device of FIG. 1.

First, as shown in FIG. 2(a), the user touches once the center position 201 of the target angle of view, that is, the center of the main subject, on an image in which, for example, a flower is shown on the display screen.

As a result, the first region dividing unit 102 first extracts only the central part 202 of the flower shown in FIG. 2(b) as the main subject region.

Next, the second region dividing unit 103 executes region division of the main subject and the background in such a way that the pixels in a surrounding region 203, as shown in FIG. 2(c), are taken into the main subject region 202 of FIG. 2(b) as new main subject region. The second region dividing unit 103 repeats this processing, widening the main subject region little by little as indicated by 204 in FIG. 2(d), until the predetermined end condition is reached (for example, until the region reaches an edge of the initial zoom region).

As a result of the above processing, the user only has to touch the center position of the target angle of view once at the start; the angle of view is then adjusted automatically so that the main subject fills the screen, and a rectangular frame in which the main subject has been zoomed in or otherwise adjusted is set automatically.
This frees the user from the trouble of setting a rectangular frame or adjusting the angle of view, and improves operability.
In addition, when running an application that recognizes and searches for the main subject, the accuracy of identifying the main subject can be improved.

FIG. 3 is a block diagram showing a hardware configuration example of the angle-of-view adjustment device 301 according to an embodiment of the present invention.

The angle-of-view adjustment device 301 is realized, for example, on a computer system that is a portable information terminal such as a smartphone, or a digital camera.

The angle-of-view adjustment device 301 includes a CPU (Central Processing Unit) 302, a ROM (Read Only Memory) 303, and a RAM (Random Access Memory) 304. It also includes an external storage device 305 such as a solid-state storage device, a communication interface 306, and an input device 307 and a display device 308 such as a touch panel display device. It further includes a portable recording medium drive device 309 into which a portable recording medium 310 such as a micro SD memory card or a USB (Universal Serial Bus) memory card can be set. The imaging device 312 is a digital camera mechanism capable of capturing still images and video, and includes a lens, an autofocus drive control device, an exposure control device, an imaging sensor, and the like. The devices 302 to 309 and 312 are connected to one another by a bus 311.

The ROM 303 stores, in addition to programs that control the general operation of the smartphone or digital camera as a whole, a control program for the image region division processing shown in the flowcharts of FIGS. 4, 12, and 13 described later. The CPU 302 reads this control program from the ROM 303 and executes it using the RAM 304 as a work memory. This realizes the functions of the angle-of-view adjustment device having the configuration of FIG. 1. As a result, for example, on an image of a flower or the like captured by the user with the imaging device 312, the angle of view is adjusted automatically so that the flower fills the screen, whatever angle of view the flower was captured at. Image region division processing, which separates a main subject such as a flower from the rest of the image as background, is then executed on the image whose framing has been adjusted automatically in this way. The image data of the main subject region obtained in this way, such as a flower, is sent from the communication interface 306 via the Internet (not shown) to an image search server computer connected to the Internet, so that the user can, for example, search for the species of the flower. On this computer, a flower database is searched based on the flower image data of the main subject region that was sent.
The field-guide information on the flower found by the search is received, together with image data of that flower, by the communication interface 306 via the Internet and is displayed on the display device 308.

FIG. 4 is a flowchart showing the overall operation of the angle-of-view adjustment processing according to this embodiment. The processing of this flowchart, together with the detailed processing shown in the flowcharts of FIGS. 12 and 13, is realized as processing in which the CPU 302 of FIG. 3 executes the control program stored in the ROM 303 while using the RAM 304 as a work memory.

First, the CPU 302 displays an image that has been captured and stored in, for example, the RAM 304 on the display device 308 of FIG. 3, which is for example a display with a touch input function, and prompts the user to touch the center of the zoom target region (the center position of the target angle of view). When the user enters the center position by touch through the input device 307, which is the display with a touch input function, the CPU 302 stores the coordinate information of the pixel at that center position in, for example, the RAM 304 (step S401 in FIG. 4).

Next, the CPU 302 executes the initial zoom process (step S402 in FIG. 4). FIG. 5 is an explanatory diagram of the initial zoom process. First, the user designates the center position 501 of the target angle of view. Next, the CPU 302 takes as the zoom region a rectangle extended from the center position 501 in each of the up, down, left, and right directions, with the aspect ratio preserved, by the length of the shortest line segment 502 (the bottom one in the figure) among the four line segments 502 drawn from the center position 501 toward the top, bottom, left, and right edges of the image. The CPU 302 then performs an initial zoom-in adjustment of the angle of view so that the zoom region fits in the center of the screen, and takes the result as the initial zoom region 503 after zooming in.
The angle-of-view adjustment processing described below is executed on this initial zoom region.
The processing of step S402 executed by the CPU 302 realizes the function of the initial zoom means described above in connection with FIG. 1.
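The geometry of the initial zoom process can be sketched as follows. This is one literal reading of the description, in which the rectangle extends by the same shortest length in all four directions. The function names, the pixel-index convention (edges at 0 and size minus 1), and the screen size are assumptions of this sketch.

```python
def initial_zoom_region(width, height, cx, cy):
    """Return (left, top, right, bottom) of the zoom rectangle.

    (cx, cy) is the touched center position in pixel coordinates.
    """
    # Lengths of the four segments from the center to the image edges.
    to_left = cx
    to_right = width - 1 - cx
    to_top = cy
    to_bottom = height - 1 - cy
    shortest = min(to_left, to_right, to_top, to_bottom)
    # Extend by the shortest length in every direction.
    return (cx - shortest, cy - shortest, cx + shortest, cy + shortest)

def zoom_factor(region, screen_w, screen_h):
    """Largest magnification that keeps the region's aspect ratio on screen."""
    left, top, right, bottom = region
    region_w = right - left + 1
    region_h = bottom - top + 1
    return min(screen_w / region_w, screen_h / region_h)

# A 640x480 image touched at (400, 300): the bottom edge is closest.
region = initial_zoom_region(640, 480, 400, 300)
factor = zoom_factor(region, 640, 480)
```

Because the rectangle is bounded by the nearest image edge, it never extends outside the image, and the touched position stays at its center.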

Next, the CPU 302 executes the processing of data term update 1 on the initial zoom region calculated in step S402 (step S403 in FIG. 4).
The processing of data term update 1 executed by the CPU 302 realizes part of the function of the first region dividing unit 102 of FIG. 1.
That is, while updating the region label assigned to each pixel in the image of the initial zoom region, indicating main subject or background, the CPU 302 calculates, based on the region labels and the pixel value of each pixel, a first cost term whose value decreases as the value of the first histogram, computed per pixel value from the image showing the main subject, increases (see Equation 6 described later and step S1203 of FIG. 12), and a second cost term whose value decreases as the value of the second histogram, computed per pixel value from the image showing the background, increases (see Equation 7 described later and step S1204 of FIG. 12).
At this time, the CPU 302 also performs control so that the value of the first cost term for the pixel touched by the user and the pixels around it is zero (see step S1206 of FIG. 12 described later).
Here, "around" means, for example, the pixels within a radius of 3 pixels of the touched pixel position.
The touched pixel position itself is included.
Details of the processing of step S403 will be described later with reference to the flowchart of FIG. 12.
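The seed handling in data term update 1 can be sketched as below: the first cost term is forced to zero for every pixel within a given radius of the touched position. The function name and the list-of-rows layout are hypothetical; the radius of 3 pixels follows the example in the text, and the Euclidean distance test is an assumption.

```python
def zero_cost_around_touch(first_cost, touch, radius=3):
    """Return a copy of `first_cost` with zeros around the touched pixel.

    first_cost: list of rows, first_cost[y][x] is the first cost term.
    touch: (x, y) position touched by the user.
    """
    tx, ty = touch
    result = []
    for y, row in enumerate(first_cost):
        new_row = []
        for x, cost in enumerate(row):
            # Euclidean distance test; includes the touched pixel itself.
            inside = (x - tx) ** 2 + (y - ty) ** 2 <= radius ** 2
            new_row.append(0.0 if inside else cost)
        result.append(new_row)
    return result

# 9x9 grid with a uniform cost of 2.0, touched at the center.
grid = [[2.0] * 9 for _ in range(9)]
seeded = zero_cost_around_touch(grid, (4, 4))
```

With a zero first cost term, labeling these pixels as main subject costs nothing in the data term, which steers the subsequent minimization toward keeping the touched area inside the main subject region.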

Next, the CPU 302 executes the Graph Cuts algorithm while computing an energy function (see Equation 3 described later) that includes the first and second cost terms calculated in the data term update 1 processing of step S403 (step S404 in FIG. 4).
Details of this processing will be described later.
In this case, the processing of step S404 executed by the CPU 302 realizes part of the function of the first region dividing unit 102 of FIG. 1.

After step S404, the CPU 302 performs the end determination (step S405 in FIG. 4). FIG. 6 is an explanatory diagram of the end determination. As shown in FIG. 6, the end condition is met either when any boundary of the divided main subject region 601 has reached an edge 602 of the initial zoom region, or when the main subject region no longer changes beyond a predetermined margin of error.
This end determination is a safeguard for the case where, in the data term update 2 of step S406 described next, background pixels around the main subject region resemble the foreground color and would be judged to be part of the main subject region.
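The two end conditions can be sketched as a single check on the current main subject mask. The mask layout (one row of booleans per pixel row of the initial zoom region), the use of pixel count as the measure of change, and the tolerance parameter `tol` are assumptions of this sketch.

```python
def end_reached(mask, prev_area, tol=0):
    """Return (finished, area) for the current main subject mask.

    mask: list of rows of booleans covering the initial zoom region,
          True where the pixel is labeled as main subject.
    prev_area: subject pixel count from the previous iteration, or None.
    """
    # Condition 1: the subject region touches an edge of the zoom region.
    touches_edge = (
        any(mask[0])
        or any(mask[-1])
        or any(row[0] for row in mask)
        or any(row[-1] for row in mask)
    )
    # Condition 2: the region has stopped changing (within a tolerance).
    area = sum(1 for row in mask for v in row if v)
    unchanged = prev_area is not None and abs(area - prev_area) <= tol
    return touches_edge or unchanged, area

# 5x5 region with a single subject pixel in the middle.
mask = [[False] * 5 for _ in range(5)]
mask[2][2] = True
first_check = end_reached(mask, None)
second_check = end_reached(mask, first_check[1])
```

The edge test bounds the growth of the region even when nearby background colors resemble the subject, which is the safeguard the text describes.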

If the CPU 302 determines, as a result of the end determination of step S405, that the processing has not yet finished, it executes the processing of data term update 2 on the initial zoom region calculated in step S402 (step S406 in FIG. 4).
The processing of data term update 2 executed by the CPU 302 realizes part of the function of the second region dividing unit 103 of FIG. 1.
That is, while updating the region label assigned to each pixel in the image of the initial zoom region, indicating main subject or background, the CPU 302 calculates, based on the region labels and the pixel value of each pixel, a first cost term whose value decreases as the value of the first histogram, computed per pixel value from the image showing the main subject, increases (see Equation 6 described later and step S1203 of FIG. 13), and a second cost term whose value decreases as the value of the second histogram, computed per pixel value from the image showing the background, increases (see Equation 7 described later and step S1204 of FIG. 13).
At this time, when the value of the first histogram at the bin given by the pixel value is larger than a predetermined histogram threshold and the saturation of that pixel value is larger than a predetermined saturation threshold (see step S1301 of FIG. 13 described later), the CPU 302 sets the first cost term to a smaller value (see steps S1302 to S1304 of FIG. 13 described later).
Details of the processing of step S406 will be described later with reference to the flowchart of FIG. 13.
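The threshold test in data term update 2 can be sketched as follows. How much the first cost term is reduced is not stated in this passage, so the multiplicative `scale` is purely an assumption, as are the function name, the threshold values, and the use of HSV saturation from Python's standard `colorsys` module.

```python
import colorsys

def reduce_first_cost(cost, rgb, first_hist_value,
                      hist_thresh=0.05, sat_thresh=0.3, scale=0.5):
    """Lower the first cost term for vivid, subject-like pixel values.

    cost: current first cost term of the pixel.
    rgb: (r, g, b) pixel value with components in 0..255.
    first_hist_value: value of the first histogram at this pixel's bin.
    """
    r, g, b = (component / 255.0 for component in rgb)
    _, saturation, _ = colorsys.rgb_to_hsv(r, g, b)
    if first_hist_value > hist_thresh and saturation > sat_thresh:
        return cost * scale   # assumed form of "a smaller value"
    return cost

vivid = reduce_first_cost(2.0, (255, 0, 0), 0.2)      # saturated red
gray = reduce_first_cost(2.0, (128, 128, 128), 0.2)   # no saturation
rare = reduce_first_cost(2.0, (255, 0, 0), 0.01)      # rare in the subject
```

Only pixels that are both common in the main subject histogram and strongly colored receive the reduction, so dull background pixels with similar hue are less likely to be absorbed into the main subject region.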

Next, the CPU 302 executes the Graph Cuts algorithm while computing an energy function (see Equation 3 described later) that includes the first and second cost terms calculated in the data term update 2 processing of step S406 (step S404 in FIG. 4).
Details of this processing will be described later.
In this case, the processing of step S404 executed by the CPU 302 realizes part of the function of the second region dividing unit 103 of FIG. 1.

After that, the end determination is executed again (step S405 in FIG. 4), and the data term update 2 processing described above (step S406 in FIG. 4) is executed repeatedly, so that the main subject region is extended step by step.
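The control flow of steps S403 to S406 can be sketched as a small driver loop. The four callables are placeholders for the processing described in the text, and the iteration cap `max_iters` is a safety assumption of this sketch that does not appear in the source.

```python
def run_region_division(update_1, update_2, graph_cut, finished, max_iters=100):
    """Drive the update / cut / check cycle of FIG. 4.

    update_1(): returns the cost terms of data term update 1 (S403).
    update_2(labels): returns the cost terms of data term update 2 (S406).
    graph_cut(costs): returns region labels (S404).
    finished(labels): end determination (S405).
    """
    costs = update_1()                 # S403
    labels = graph_cut(costs)          # S404
    iterations = 0
    while not finished(labels) and iterations < max_iters:   # S405
        costs = update_2(labels)       # S406
        labels = graph_cut(costs)      # S404 again
        iterations += 1
    return labels

# Toy stand-ins: the "labels" are just a region radius that grows by one.
result = run_region_division(
    update_1=lambda: 1,
    update_2=lambda radius: radius + 1,
    graph_cut=lambda costs: costs,
    finished=lambda radius: radius >= 4,
)
```

The loop makes explicit that the Graph Cuts step S404 is shared between the first pass, which follows data term update 1, and every later pass, which follows data term update 2.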

When the end is determined in step S405, the final zoom process is executed (step S407 in FIG. 4). FIG. 7 is an explanatory diagram of the final zoom process. As shown in FIG. 7, the angle of view is adjusted by zooming in so that the circumscribed rectangle of the main subject region 701 fits within the screen with a fixed margin on the top, bottom, left, and right.
The processing of step S407 executed by the CPU 302 realizes the function of the angle-of-view adjustment unit 104 in FIG. 1.
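The geometry of the final zoom step can be illustrated with a minimal Python sketch. It is not part of the disclosed embodiment: the function name `zoom_rect`, the representation of the main subject region as a 2-D list of booleans, and the margin expressed as a fixed number of pixels are all assumptions made for illustration.

```python
def zoom_rect(mask, margin):
    """Circumscribed rectangle of the True cells in `mask`, grown by `margin`
    pixels on every side and clipped to the image bounds.
    Returns (top, left, bottom, right) with inclusive coordinates."""
    height, width = len(mask), len(mask[0])
    rows = [r for r in range(height) if any(mask[r])]
    cols = [c for c in range(width) if any(mask[r][c] for r in range(height))]
    if not rows:
        raise ValueError("mask contains no main subject pixels")
    top = max(rows[0] - margin, 0)
    left = max(cols[0] - margin, 0)
    bottom = min(rows[-1] + margin, height - 1)
    right = min(cols[-1] + margin, width - 1)
    return top, left, bottom, right


# 10 x 12 image whose main subject occupies rows 3..5 and columns 4..8.
mask = [[3 <= r <= 5 and 4 <= c <= 8 for c in range(12)] for r in range(10)]
print(zoom_rect(mask, margin=2))
```

The returned rectangle would then be scaled to the screen; how the scaling is performed is outside the scope of this sketch.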

In this way, as described above with reference to FIG. 2, in step S401 the user touches the center 201 of the zoom target region once, as shown in FIG. 2(a). Then, in step S403, only the central part 202 of the flower shown in FIG. 2(b) is extracted as the main subject region. Further, by repeating step S406, region division of the main subject and the background is executed such that the pixels in the surrounding region 203 shown in FIG. 2(c) are taken into the main subject region 202 of FIG. 2(b) as new main subject region. As indicated by 204 in FIG. 2(d), this processing is repeated, widening the main subject region little by little, until the end determination condition of step S405 is reached.
Thus, the user only has to touch the center position of the target angle of view once at the start; after the final zoom process of step S407, the angle of view is adjusted automatically so that the main subject, such as a flower, fills the screen. The user is freed from the trouble of setting a rectangular frame and adjusting the angle of view, and operability can be improved.

The Graph Cuts algorithm, which is the basis of the processing in steps S403, S404, and S406 of FIG. 4, is described below.
Let
$$X = (X_1, \ldots, X_v, \ldots, X_V) \tag{1}$$
be a region label vector whose element X_v is the region label for pixel v (1 ≤ v ≤ V) in the image V. This region label vector is, for example, a binary vector in which X_v = 0 if pixel v is in the main subject region and X_v = 1 if it is in the background region. That is,
$$X_v = \begin{cases} 0 & \text{(main subject region)} \\ 1 & \text{(background region)} \end{cases} \tag{2}$$

The region division processing executed in this embodiment finds the region label vector X of Equation 1 that minimizes the energy function E(X) defined for the image V by the following equation.
$$E(X) = \sum_{v \in V} g_v(X_v) + \sum_{(u,v) \in E} h_{uv}(X_u, X_v) \tag{3}$$
As a result of the energy minimization, the main subject region is obtained as the set of pixels v whose region label value is X_v = 0 in the region label vector X. The set of pixels v whose region label value is X_v = 1 is the background region.
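As an illustration only, the energy of a given labeling can be evaluated as the sum of a per-pixel data term g_v(X_v) and a pairwise term h_uv(X_u, X_v) that is charged only where neighboring labels differ. The following Python sketch assumes a four-pixel, one-dimensional image; the cost numbers and the function name `energy` are invented for the example and do not come from the embodiment.

```python
def energy(labels, data_cost, pair_cost):
    """E(X) = sum over pixels of g_v(X_v) + sum over neighbour pairs of h_uv.

    labels[v]       region label of pixel v (0 = main subject, 1 = background)
    data_cost[v]    tuple (g_v(0), g_v(1))
    pair_cost[u, v] value of h_uv when X_u != X_v (it is 0 when X_u == X_v)
    """
    data = sum(data_cost[v][x] for v, x in enumerate(labels))
    smooth = sum(h for (u, v), h in pair_cost.items() if labels[u] != labels[v])
    return data + smooth


# Hypothetical costs: pixels 0 and 1 look like subject, pixels 2 and 3 like background.
data_cost = [(0.2, 2.0), (0.3, 1.5), (1.8, 0.4), (2.2, 0.1)]
pair_cost = {(0, 1): 1.0, (1, 2): 0.2, (2, 3): 1.0}

print(energy([0, 0, 1, 1], data_cost, pair_cost))  # labeling with a boundary between pixels 1 and 2
print(energy([0, 0, 0, 0], data_cost, pair_cost))  # everything labelled as main subject
```

In this toy setting the labeling with a boundary between pixels 1 and 2 has the lower energy, because the data terms agree with it and the only pairwise cost paid is the small one on the edge (1, 2).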

To minimize the energy of Equation 3, a weighted directed graph (hereinafter simply "graph") given by the following equation and shown in FIG. 8 is defined.
$$G = (V, E) \tag{4}$$
Here, V is the set of nodes and E is the set of edges. When this graph is applied to image region division, each pixel of the image corresponds to a node in V. In addition, as nodes other than pixels, two special terminals, given by the following equation and shown in FIG. 8, are added.
$$\text{source } s, \quad \text{sink } t \tag{5}$$
The source s is associated with the main subject region and the sink t with the background region. An edge in E expresses a relationship between nodes. An edge that expresses the relationship between a pixel and its surrounding pixels is called an n-link, and an edge that expresses the relationship between a pixel and the source s (corresponding to the main subject region) or the sink t (corresponding to the background region) is called a t-link.

Each t-link connecting the source s to the node of a pixel is regarded as expressing how likely that pixel is to belong to the main subject region. The cost value expressing this main-subject likelihood is associated with the first term of Equation 3 and defined as
$$g_v(0) = -\log \theta(I(v), 0) \tag{6}$$
Here, θ(c, 0) is function data representing a histogram (number of occurrences) for each color pixel value c calculated from the main subject region of images, and is obtained in advance, for example as shown in FIG. 9(a). θ(c, 0) is assumed to be normalized so that its sum over all color pixel values c is 1. I(v) is the color (RGB) pixel value at pixel v of the input image. In practice it is the value obtained by converting the color (RGB) pixel value into a luminance value, but for simplicity it is referred to below as the "color (RGB) pixel value" or "color pixel value". In Equation 6, the larger the value of θ(I(v), 0), the smaller the cost value. In other words, the more often a color pixel value occurs in the main subject regions obtained in advance, the smaller the cost value given by Equation 6, which means that pixel v is more likely to be a pixel of the main subject region and which pushes down the value of the energy function E(X) of Equation 3.

Next, each t-link connecting the sink t to the node of a pixel is regarded as expressing how likely that pixel is to belong to the background region. The cost value expressing this background likelihood is associated with the first term of Equation 3 and defined as
$$g_v(1) = -\log \theta(I(v), 1) \tag{7}$$
Here, θ(c, 1) is function data representing a histogram (frequency of occurrence) for each color pixel value c calculated from the background region of images, and is obtained in advance, for example as shown in FIG. 9(b). θ(c, 1) is assumed to be normalized so that its sum over all color pixel values c is 1. As in Equation 6, I(v) is the color (RGB) pixel value at pixel v of the input image. In Equation 7, the larger the value of θ(I(v), 1), the smaller the cost value. In other words, the more often a color pixel value occurs in the background regions obtained in advance, the smaller the cost value given by Equation 7, which means that pixel v is more likely to be a pixel of the background region and which pushes down the value of the energy function E(X) of Equation 3.
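The relationship between a normalized histogram and the data cost of Equations 6 and 7 can be sketched in Python as follows. This is an illustration under stated assumptions, not the embodiment's implementation: pixel values are taken to be single luminance integers, the helper names are hypothetical, and the floor `eps` (used to avoid taking the logarithm of zero for values that never occurred in the training data) is an addition of this sketch that the text does not mention.

```python
import math
from collections import Counter


def normalized_histogram(values):
    """theta(c): fraction of training pixels having value c (sums to 1)."""
    counts = Counter(values)
    total = len(values)
    return {c: n / total for c, n in counts.items()}


def data_cost(theta, c, eps=1e-6):
    """-log(theta(c)); `eps` is an assumed floor for values absent from theta."""
    return -math.log(max(theta.get(c, 0.0), eps))


# Hypothetical training pixels taken from main subject regions.
theta_subject = normalized_histogram([200, 200, 200, 90])

print(data_cost(theta_subject, 200))  # frequent value: low cost
print(data_cost(theta_subject, 90))   # rarer value: higher cost
print(data_cost(theta_subject, 10))   # unseen value: cost capped by eps
```

The same two helpers would be applied to background training pixels to obtain the cost of Equation 7.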

Next, the cost value of the n-link expressing the relationship between the node of each pixel and its surrounding pixels is associated with the second term of Equation 3 and defined as h_uv(X_u, X_v) (Equation 8). Here, dist(u, v) is the Euclidean distance between pixel v and its surrounding pixel u, and κ is a predetermined coefficient. I(u) and I(v) are the color (RGB) pixel values at pixels u and v of the input image. When the region label values X_u and X_v of pixel v and its surrounding pixel u are chosen to be the same (X_u = X_v), the cost value of Equation 8 is 0 and does not affect the calculation of the energy E(X). When they are chosen to differ (X_u ≠ X_v), the cost value of Equation 8 has, for example, the characteristic shown in FIG. 10. That is, when X_u and X_v differ and the difference I(u) − I(v) between the color pixel values of the two pixels is small, the cost value given by Equation 8 is large, which pushes up the value of the energy function E(X) of Equation 3. In other words, when neighboring pixels differ little in color pixel value, their region label values are not chosen to differ; the labels are kept the same as far as possible so that the main subject region or background region does not change there. Conversely, when X_u and X_v differ and the difference I(u) − I(v) is large, the cost value given by Equation 8 is small, which pushes down the value of the energy function E(X) of Equation 3.
In other words, a large difference in color pixel value between neighboring pixels suggests a boundary between the main subject region and the background region, and pixel v and its surrounding pixel u are steered toward different region label values.
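The text characterizes Equation 8 only through its behavior: zero for equal labels, large for a small color difference, small for a large one, with a dependence on dist(u, v) and the coefficient κ. The Python sketch below uses one functional form that has these properties. The exponential fall-off, the scale parameter `sigma`, and the function name are assumptions of this sketch and should not be read as the patent's Equation 8.

```python
import math


def smoothness_cost(x_u, x_v, i_u, i_v, dist, kappa=1.0, sigma=30.0):
    """Assumed boundary cost between neighbouring pixels u and v.

    Returns 0 when the labels agree. When they differ, the cost decays with
    the squared pixel-value difference and with the distance between pixels.
    """
    if x_u == x_v:
        return 0.0
    return kappa * math.exp(-((i_u - i_v) / sigma) ** 2) / dist


print(smoothness_cost(0, 0, 100, 200, dist=1.0))  # same label: no cost
print(smoothness_cost(0, 1, 100, 102, dist=1.0))  # similar colours: high cost
print(smoothness_cost(0, 1, 100, 200, dist=1.0))  # very different colours: low cost
```

With a form of this kind, cutting between two similarly colored neighbors is expensive, so the minimization prefers to place label changes where the color changes sharply.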

Using the definitions above, for each pixel v of the input image, the cost value of the t-link connecting the source s and pixel v (main-subject likelihood) is calculated by Equation 6, and the cost value of the t-link connecting the sink t and pixel v (background likelihood) is calculated by Equation 7. In addition, for each pixel v, the cost values (boundary likelihood) of the eight n-links connecting pixel v to its surrounding pixels, for example the eight pixels in the eight directions, are calculated by Equation 8.
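The eight-direction neighborhood mentioned above can be enumerated as in the following Python sketch, offered as an illustration only. It lists each unordered pair of neighboring pixels once, together with the Euclidean distance dist(u, v), which is 1 for horizontal and vertical neighbors and √2 for diagonal ones. The generator name `n_links` and the (row, column) pixel addressing are assumptions of the sketch.

```python
import math


def n_links(height, width):
    """Yield ((r, c), (r2, c2), dist) once for every pair of 8-connected pixels."""
    # Four "forward" offsets are enough to visit each unordered pair exactly once.
    offsets = [(0, 1), (1, -1), (1, 0), (1, 1)]
    for r in range(height):
        for c in range(width):
            for dr, dc in offsets:
                r2, c2 = r + dr, c + dc
                if 0 <= r2 < height and 0 <= c2 < width:
                    yield (r, c), (r2, c2), math.hypot(dr, dc)


links = list(n_links(3, 3))
print(len(links))                                        # number of n-links in a 3 x 3 image
print(sum(1 for u, v, _ in links if (1, 1) in (u, v)))   # n-links touching the centre pixel
```

An interior pixel has eight n-links, while pixels on the image border have fewer.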

In theory, for every combination of 0 and 1 over all region label values of the region label vector X of Equation 1, the energy function E(X) of Equation 3 is calculated, selecting the results of Equations 6, 7, and 8 according to each region label value. By selecting, among all combinations, the region label vector X for which the value of E(X) is smallest, the main subject region can be obtained as the set of pixels v whose region label value is X_v = 0.

In practice, however, the number of combinations of 0 and 1 over all region label values of the region label vector X is 2 raised to the power of the number of pixels, so the minimization of the energy function E(X) cannot be computed by exhaustive search in a realistic time.

Therefore, the Graph Cuts method executed in step S404 of FIG. 4 makes it possible to compute the minimization of the energy function E(X) in a realistic time by executing the following algorithm.
FIG. 11 schematically shows the relationship among the graph having the t-links defined by Equations 6 and 7 and the n-links defined by Equation 8, the region label vector X, and the graph cut. In FIG. 11, the pixels v are drawn in one dimension for ease of understanding.

In calculating the first term of the energy function E(X) of Equation 3, for a pixel in the main subject region, whose region label value in X should be 0, the cost value of Equation 6, which becomes smaller the more the pixel resembles a main subject pixel, is the smaller of Equations 6 and 7. Therefore, for a given pixel, when the t-link on the source s side is selected and the t-link on the sink t side is cut (case 1102 in FIG. 11), and the first term of E(X) in Equation 3 is calculated using Equation 6, 0 is selected as the region label value of that pixel if the calculated result becomes smaller, and that graph cut state is adopted. If the result does not become smaller, that graph cut state is not adopted, and other links are searched and other graph cuts are tried.

Conversely, for a pixel in the background region, whose region label value in X should be 1, the cost value of Equation 7, which becomes smaller the more the pixel resembles a background pixel, is the smaller of Equations 6 and 7. Therefore, for a given pixel, when the t-link on the sink t side is selected and the t-link on the source s side is cut (case 1103 in FIG. 11), and the first term of E(X) in Equation 3 is calculated using Equation 7, 1 is selected as the region label value of that pixel if the calculated result becomes smaller, and that graph cut state is adopted. If the result does not become smaller, that graph cut state is not adopted, and other links are searched and other graph cuts are tried.

Meanwhile, between pixels inside the main subject region or inside the background region, where the graph cut processing for the first term of the energy function E(X) of Equation 3 yields region label values that should continue as 0 or as 1, the cost value of Equation 8 is 0. The result of Equation 8 therefore does not affect the cost value of the second term of E(X). The n-link between such pixels is kept uncut, so that Equation 8 outputs a cost value of 0.

However, when the graph cut processing for the first term of E(X) causes the region label value to change between 0 and 1 across neighboring pixels, and the difference between their color pixel values is small, the cost value of Equation 8 is large. As a result, the value of the energy function E(X) of Equation 3 is pushed up. This corresponds to a case where, within a single region, the label decision based on the first term happens to flip. In such a case the value of E(X) becomes large, so that such a flip of the region label value is not selected. The n-link between those pixels is kept uncut so that the result of Equation 8 maintains this outcome.

In contrast, when the graph cut processing for the first term of E(X) causes the region label value to change between 0 and 1 across neighboring pixels, and the difference between their color pixel values is large, the cost value of Equation 8 is small. As a result, the value of the energy function E(X) of Equation 3 is pushed down. This case means that those pixels are likely to lie on the boundary between the main subject region and the background region. The region label values are therefore steered to differ between these pixels, forming the boundary between the main subject region and the background region. In this case, to stabilize the boundary, the n-link between those pixels is cut and the cost value of the second term of Equation 3 is set to 0 (case 1104 in FIG. 11).

The determination and control processing described above is repeated starting from the node of the source s and following the nodes of the pixels in turn, whereby a graph cut such as that indicated by 1101 in FIG. 11 is carried out, and the minimization of the energy function E(X) in step S404 of FIG. 4 is computed in a realistic time. As a concrete method for this processing, the method described in Non-Patent Document 1 can be adopted, for example.

Then, for each pixel, if the t-link on the source s side remains, the region label value 0, that is, the label indicating a pixel of the main subject region, is assigned to it. Conversely, if the t-link on the sink t side remains, the region label value 1, that is, the label indicating a pixel of the background region, is assigned. Finally, the main subject region is obtained as the set of pixels whose region label value is 0.
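How a minimum cut yields the labeling can be illustrated on a four-pixel, one-dimensional image with a textbook max-flow routine (breadth-first augmenting paths, i.e. Edmonds-Karp). This Python sketch is an illustration only and is not claimed to be the method of Non-Patent Document 1. It follows the usual min-cut construction, in which the edge from the source to pixel v carries g_v(1) and the edge from v to the sink carries g_v(0), so that the t-link that ends up cut is the one whose cost is paid; that assignment, the function name, and the cost numbers are assumptions of the sketch.

```python
from collections import deque


def min_cut_labels(data_cost, pair_cost):
    """Label pixels 0 (source side) or 1 (sink side) by a minimum s-t cut.

    data_cost[v] = (g_v(0), g_v(1)); pair_cost[u, v] = h_uv for differing labels.
    """
    n = len(data_cost)
    s, t = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v].setdefault(u, 0.0)  # reverse residual edge

    for v, (g0, g1) in enumerate(data_cost):
        add_edge(s, v, g1)  # cut when v is labelled 1 (background)
        add_edge(v, t, g0)  # cut when v is labelled 0 (main subject)
    for (u, v), h in pair_cost.items():
        add_edge(u, v, h)
        add_edge(v, u, h)

    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break  # no path left: `parent` holds the source side of the cut
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow

    return [0 if v in parent else 1 for v in range(n)]


data_cost = [(0.2, 2.0), (0.3, 1.5), (1.8, 0.4), (2.2, 0.1)]
pair_cost = {(0, 1): 1.0, (1, 2): 0.2, (2, 3): 1.0}
print(min_cut_labels(data_cost, pair_cost))
```

Pixels that remain connected to the source in the residual graph receive label 0, matching the rule above that a pixel whose source-side t-link remains belongs to the main subject region. For this toy input the result coincides with the labeling found by trying all 16 combinations.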

FIG. 12 is a flowchart showing details of the data term update 1 process of step S403 in FIG. 4, based on the operating principle described above.

First, color pixel values I(v) are read one at a time from the image of the initial zoom region obtained in step S402 (step S1201 in FIG. 12).

Next, it is determined whether the pixel read in step S1201 is one of the pixels surrounding the center of the zoom target region that the user designated by touch in step S401 of FIG. 4 (step S1202 in FIG. 12).

If the determination in step S1202 is NO, the cost value indicating main-subject likelihood, the cost value indicating background likelihood, and the cost value indicating boundary likelihood are calculated based on Equations 6, 7, and 8 described above, respectively (steps S1203, S1204, and S1205 in FIG. 12). The initial value of θ(c, 0) is calculated from the main subject regions of a number of images (several hundred or so) prepared for training. Similarly, the initial value of θ(c, 1) is calculated from the background regions of a number of images (several hundred or so) prepared for training.

On the other hand, if the determination in step S1202 is YES, the touch-designated pixel and the pixels around it can be regarded as certainly belonging to the main subject region. Therefore, to ensure that the surrounding region including the touch-designated pixel is determined to be main subject region, the cost value g_v(X_v) indicating main-subject likelihood is set to zero as in the following equation (step S1206 in FIG. 12).
$$g_v(X_v) = 0 \quad (X_v = 0)$$

In addition, to ensure that the surrounding region including the touch-designated pixel is determined to be main subject region, the cost value g_v(X_v) indicating background likelihood is set to the value K as in the following equation.
$$g_v(X_v) = K \quad (X_v = 1)$$
Here, as in the following equation, K is set in advance to a value larger than the sum of the smoothing terms of any pixel (the above is step S1207 in FIG. 12).
$$K > \max_{v \in V} \sum_{u:\,(u,v) \in E} h_{uv}(X_u, X_v)$$

Furthermore, since the whole surrounding region including the touch-designated pixel is main subject region, the value of h_uv(X_u, X_v) is set to 0 there (step S1208 in FIG. 12).
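The treatment of the touched pixels in steps S1206 to S1208 can be sketched in Python as follows. The sketch is illustrative and makes several assumptions: pixels are indexed by integers, `pair_cost[(u, v)]` holds the smoothing cost for differing labels, and K is chosen as one plus the largest per-pixel sum of smoothing costs, which is merely one convenient value satisfying the condition that K exceed that sum. The function name is hypothetical.

```python
def apply_seed_constraints(data_cost, pair_cost, seeds):
    """Force the seed pixels toward the main subject label.

    Returns new (data_cost, pair_cost, K) without modifying the inputs.
    """
    seeds = set(seeds)
    # Sum of smoothing costs incident to each pixel.
    incident = [0.0] * len(data_cost)
    for (u, v), h in pair_cost.items():
        incident[u] += h
        incident[v] += h
    k = 1.0 + max(incident, default=0.0)  # assumed choice: any K above the max works

    # S1206 / S1207: g_v(0) = 0 and g_v(1) = K for seed pixels.
    new_data = [(0.0, k) if v in seeds else g for v, g in enumerate(data_cost)]
    # S1208: no boundary cost between two seed pixels.
    new_pair = {
        (u, v): (0.0 if u in seeds and v in seeds else h)
        for (u, v), h in pair_cost.items()
    }
    return new_data, new_pair, k


data_cost = [(0.9, 0.7), (1.1, 0.6), (1.8, 0.4), (2.2, 0.1)]
pair_cost = {(0, 1): 1.0, (1, 2): 0.2, (2, 3): 1.0}
new_data, new_pair, k = apply_seed_constraints(data_cost, pair_cost, seeds=[0, 1])
print(k, new_data, new_pair)
```

Because K exceeds everything a seed pixel could save on its n-links by switching label, labeling a seed pixel as background can never lower the total energy.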

After the above processing, it is determined whether any pixels remain to be processed in the image (step S1209 in FIG. 12).

If there are pixels still to be processed and the determination in step S1209 is YES, the process returns to step S1201 and the above processing is repeated.

When no pixels remain to be processed and the determination in step S1209 becomes NO, the data term update 1 process of step S403 in FIG. 4 ends.
Thereafter, in step S404 of FIG. 4, the Graph Cuts algorithm is executed while the energy function E(X) of Equation 3 is calculated using the cost values obtained for all pixels in the image of the initial zoom region, and the main subject and the background are divided into regions.

FIG. 13 is a flowchart showing details of the data term update 2 process of step S406 in FIG. 4, based on the operating principle of the Graph Cuts algorithm described above. In this flowchart, steps that perform the same processing as in FIG. 12 are given the same step numbers.

Next, whether the color pixel value I(v) read in step S1201 is likely to be a pixel of the main subject is determined as follows. Let Sat(I(v)) be the saturation of the color pixel value I(v), and let θ(I(v), 0) be the histogram value indicating the main-subject likelihood of I(v). It is determined whether the saturation Sat(I(v)) is at or above a predetermined saturation threshold Sth and the histogram value θ(I(v), 0) is at or above a predetermined histogram threshold θth (step S1301 in FIG. 13).
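The test of step S1301 can be sketched in Python as below. The text does not say how the saturation Sat is computed, so this illustration assumes the HSV saturation provided by the standard `colorsys` module; the threshold values and the function name are likewise placeholders chosen for the example.

```python
import colorsys


def is_subject_like(rgb, theta0, sat_threshold=0.5, hist_threshold=0.05):
    """True when both the saturation and the subject histogram value reach
    their thresholds (assumed HSV saturation, 8-bit RGB input)."""
    r, g, b = (channel / 255.0 for channel in rgb)
    _, saturation, _ = colorsys.rgb_to_hsv(r, g, b)
    return saturation >= sat_threshold and theta0 >= hist_threshold


print(is_subject_like((230, 40, 60), theta0=0.20))    # vivid and frequent in the subject
print(is_subject_like((128, 128, 128), theta0=0.20))  # grey: saturation is zero
print(is_subject_like((230, 40, 60), theta0=0.01))    # vivid but rare in the subject
```

Requiring both conditions keeps dull colors, and colors that rarely occurred in the main subject histogram, from receiving the reduced cost described next.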

If the determination in step S1301 is NO, then, as in the case of FIG. 12, the cost value indicating main-subject likelihood, the cost value indicating background likelihood, and the cost value indicating boundary likelihood are calculated based on Equations 6, 7, and 8 described above, respectively (steps S1203, S1204, and S1205 in FIG. 12).

On the other hand, if the determination in step S1301 is YES, the color pixel value I(v) read in step S1201 belongs to a pixel around the main subject region obtained by the processing so far, and that pixel is itself likely to be main subject region. Therefore, to make such a pixel more readily determined to be main subject region, the cost value g_v(X_v) indicating main-subject likelihood is modified as follows.
First, it is determined whether the cost value g_v(0) indicating main-subject likelihood is larger than the cost value g_v(1) indicating background likelihood (step S1302 in FIG. 13).
If g_v(0) > g_v(1) and the determination in step S1302 is YES, the cost value g_v(0) is changed to half the value of g_v(1) (step S1303 in FIG. 13). That is, from Equation 7,
g_v(X_v) = g_v(0) = g_v(1) / 2 = −log θ(I(v), 1) / 2
is set.
If g_v(0) > g_v(1) does not hold and the determination in step S1302 is NO, the cost value g_v(0) is changed to half of its own original value (step S1304 in FIG. 13). That is, from Equation 6,
g_v(X_v) = g_v(0) = g_v(0) / 2 = −log θ(I(v), 0) / 2
is set.
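The adjustment of steps S1302 to S1304 reduces to a few lines. The following Python sketch is illustrative; the function name and the boolean argument `subject_like` (standing for the outcome of step S1301) are assumptions made for the example.

```python
def adjust_subject_cost(g0, g1, subject_like):
    """Return the main-subject cost g_v(0) after the update-2 adjustment.

    g0, g1        costs from Equations 6 and 7
    subject_like  result of the step S1301 test
    """
    if not subject_like:
        return g0        # S1301 is NO: keep the cost of Equation 6
    if g0 > g1:          # S1302
        return g1 / 2.0  # S1303: half of the background cost
    return g0 / 2.0      # S1304: half of its own value


print(adjust_subject_cost(2.0, 1.0, subject_like=True))
print(adjust_subject_cost(0.8, 1.0, subject_like=True))
print(adjust_subject_cost(2.0, 1.0, subject_like=False))
```

In both adjusted branches the result is below g_v(1) whenever g_v(1) is positive, so a pixel that passes the step S1301 test is always cheaper to label as main subject than as background in the data term.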

The cost value g_v(X_v) indicating background likelihood is calculated in step S1204 described above.

The value of h_uv(X_u, X_v) is calculated in step S1205 described above.

After the above processing, it is determined whether any pixels of a similar color remain in the vicinity of the pixel just processed (step S1305 in FIG. 13). A similar color is, for example, a color that has the same RGB value when the 256-level RGB values are reduced to 10 levels per color.
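The similar-color test can be sketched in Python as follows. The text only states that 256 levels are reduced to 10 levels per color; the use of equal-width bins, and the function names, are assumptions of this illustration.

```python
def quantize(rgb, levels=10, depth=256):
    """Map each 8-bit channel to one of `levels` equal-width bins (assumed binning)."""
    return tuple(channel * levels // depth for channel in rgb)


def similar_color(a, b, levels=10):
    """Two colours are 'similar' when they fall into the same reduced RGB value."""
    return quantize(a, levels) == quantize(b, levels)


print(quantize((200, 30, 40)))
print(similar_color((200, 30, 40), (195, 35, 45)))
print(similar_color((200, 30, 40), (120, 30, 40)))
```

With equal-width bins, two colors close to a bin edge can fall into different bins even when they differ by only a few levels, which is a known limitation of this kind of quantization.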

If there are pixels still to be processed and the determination in step S1305 is YES, the process returns to step S1201 and the above processing is repeated.

When no pixels remain to be processed and the determination in step S1305 becomes NO, the data term update 2 process of step S406 in FIG. 4 ends.
Thereafter, in step S404 of FIG. 4, the Graph Cuts algorithm is executed while the energy function E(X) of Equation 3 is calculated using the cost values obtained for all pixels in the image of the initial zoom region, and the main subject and the background are divided into regions.

As described above, this embodiment makes it possible to improve the accuracy of region division.

In this embodiment, a flower has been described as an example of the main subject, but the main subject is not limited to flowers, and various objects can be used.

In this embodiment, a system intended for searching for a main subject has been described, but the invention is applicable to various systems that require angle-of-view adjustment.

Regarding the above embodiment, the following additional notes are further disclosed.
(Appendix 1)
An angle-of-view adjustment apparatus that adjusts the angle of view for a main subject so that the main subject appears at an appropriate angle of view in an image, the apparatus comprising:
position specifying means for specifying the center position of a target angle of view;
first region dividing means for executing processing that divides the image into the main subject and a background other than the main subject based on each pixel value in the image, and that assigns the pixels surrounding the center position to the region of the main subject;
second region dividing means for executing processing that divides the image into the main subject and the background other than the main subject based on each pixel value in the image, and for repeatedly executing, until a predetermined end condition is reached, processing that increases the degree to which each pixel around the main subject divided by the first region dividing means is assigned to the region of the main subject; and
angle-of-view adjusting means for adjusting the angle of view for the main subject with reference to a circumscribed rectangular frame surrounding the main subject divided by the second region dividing means.
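The circumscribed rectangular frame that the angle-of-view adjusting means uses as its reference can be illustrated with a minimal Python sketch. It assumes the segmentation result is available as a binary mask (rows of 0/1 values, with 1 marking the main subject); the function name `circumscribed_rect` and the mask representation are illustrative assumptions and are not taken from the embodiment.

```python
def circumscribed_rect(mask):
    """Return (left, top, right, bottom), inclusive, of the pixels labeled 1.

    `mask` is a list of rows; 1 marks a main-subject pixel, 0 the background.
    Returns None when no pixel is labeled as main subject.
    """
    # Column indices of every subject pixel, and row indices of rows that contain one.
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
print(circumscribed_rect(mask))  # (1, 1, 2, 2)
```

The rectangle is returned in pixel coordinates, so a caller could derive a zoom factor from its width and height relative to the displayed frame.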
(Appendix 2)
The angle-of-view adjustment apparatus according to Appendix 1, wherein the position specifying means is means for having a user specify the center position.
(Appendix 3)
The angle-of-view adjustment apparatus according to Appendix 2, wherein the position specifying means has the user specify the center position of the target angle of view by touch on a display with a touch input function that displays the image.
(Appendix 4)
The angle-of-view adjustment apparatus according to any one of Appendices 1 to 3, further comprising initial zoom means that sets, as a zoom region, a rectangle extending from the center position in each of the up, down, left, and right directions by the length of the shortest of the four line segments drawn from the center position to the top, bottom, left, and right edges of the image, zooms in so that the zoom region fits in the center of the screen with its aspect ratio preserved, and takes the zoomed-in result as an initial zoom region, wherein
the adjustment of the angle of view is performed on the initial zoom region, and
the predetermined end condition is that any boundary of the main subject region divided by the second region dividing means has reached an edge of the initial zoom region.
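The initial zoom region and the end condition described in this note can be sketched in Python as follows. This is only an illustration under stated assumptions: the image is taken to span x in [0, width] and y in [0, height], rectangles are (left, top, right, bottom) tuples, and the names `initial_zoom_region`, `fit_scale`, and `reached_edge` are hypothetical helpers that do not appear in the embodiment.

```python
def initial_zoom_region(width, height, cx, cy):
    """Return (left, top, right, bottom) of the zoom region around (cx, cy)."""
    # Shortest of the four segments from the center to the image edges.
    d = min(cx, width - cx, cy, height - cy)
    if d <= 0:
        raise ValueError("center must lie strictly inside the image")
    # Extend by that length in all four directions.
    return (cx - d, cy - d, cx + d, cy + d)


def fit_scale(region, screen_w, screen_h):
    """Largest uniform zoom factor at which `region` still fits the screen."""
    left, top, right, bottom = region
    return min(screen_w / (right - left), screen_h / (bottom - top))


def reached_edge(subject_rect, zoom_region):
    """End condition: the subject's bounding box touches an edge of the zoom region."""
    sl, st, sr, sb = subject_rect
    zl, zt, zr, zb = zoom_region
    return sl <= zl or st <= zt or sr >= zr or sb >= zb


region = initial_zoom_region(640, 480, 200, 300)
print(region)  # (20, 120, 380, 480)
print(reached_edge((100, 200, 300, 400), region))  # False
```

Using a single uniform scale factor is one way to keep the aspect ratio of the zoom region unchanged while zooming in.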
(Appendix 5)
The angle-of-view adjustment apparatus according to any one of Appendices 1 to 4, wherein the first region dividing means, while updating a region label that is assigned to each pixel in the image and indicates the main subject or the background, divides the image into the main subject and the background by minimizing an energy function based on the region labels and the pixel value of each pixel, the energy function including a first cost term whose value decreases as the value of a first histogram, computed per pixel value from an image showing the main subject, increases, and a second cost term whose value decreases as the value of a second histogram, computed per pixel value from an image showing the background, increases, and the first cost term being zero for the pixels surrounding the center position.
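The two histogram-based cost terms can be sketched in Python as below. The note only requires that each cost shrink as the corresponding histogram value grows; the negative logarithm used here is a common choice and is an assumption, as are the names `normalized_histogram` and `data_costs`, the `eps` guard, and the reading of the first cost term as the cost of labeling a pixel as main subject.

```python
import math


def normalized_histogram(bin_indices, num_bins):
    """Relative frequency of each bin among the given pixel bin indices."""
    counts = [0] * num_bins
    for b in bin_indices:
        counts[b] += 1
    total = max(1, len(bin_indices))
    return [c / total for c in counts]


def data_costs(bin_index, hist_subject, hist_background, near_center, eps=1e-6):
    """Return (first_cost, second_cost) for one pixel.

    first_cost is tied to the main-subject histogram and second_cost to the
    background histogram (assumed interpretation).
    """
    # Negative log: a larger histogram value gives a smaller cost.
    first = -math.log(min(1.0, hist_subject[bin_index] + eps))
    second = -math.log(min(1.0, hist_background[bin_index] + eps))
    if near_center:
        # Pixels surrounding the designated center: first cost term is zero.
        first = 0.0
    return first, second


hist_subject = normalized_histogram([0, 0, 0, 1], 2)     # [0.75, 0.25]
hist_background = normalized_histogram([1, 1, 1, 0], 2)  # [0.25, 0.75]
first, second = data_costs(0, hist_subject, hist_background, near_center=False)
print(first < second)  # True: bin 0 is more typical of the subject
```

Setting the first cost term to zero near the center makes the subject label free for those pixels, which steers the minimization toward keeping them in the main subject region.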
(Appendix 6)
The angle-of-view adjustment apparatus according to Appendix 5, wherein the second region dividing means, while updating the region label that is assigned to each pixel in the image and indicates the main subject or the background, divides the image into the main subject and the background by minimizing an energy function based on the region labels and the pixel value of each pixel, the energy function including a first cost term whose value decreases as the value of the first histogram increases and a second cost term whose value decreases as the value of the second histogram increases, and the first cost term being made smaller when the value of the first histogram at the bin given by the pixel value of a pixel is larger than a predetermined histogram threshold and the saturation of that pixel value is larger than a predetermined saturation threshold.
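The conditional reduction of the first cost term can be sketched in Python as follows. The note does not state the threshold values, how saturation is measured, or by how much the cost is reduced, so the HSV saturation from the standard `colorsys` module, the default thresholds, the multiplicative `factor`, and the function name `boosted_subject_cost` are all assumptions made purely for illustration.

```python
import colorsys


def boosted_subject_cost(first_cost, hist_subject_value, rgb,
                         hist_threshold=0.05, sat_threshold=0.3, factor=0.5):
    """Return the first cost term, reduced when both conditions hold.

    hist_subject_value: first-histogram value at the bin of this pixel.
    rgb: (r, g, b) with 8-bit channels.
    """
    r, g, b = (c / 255.0 for c in rgb)
    # HSV saturation in [0, 1]; gray pixels have saturation 0.
    _, saturation, _ = colorsys.rgb_to_hsv(r, g, b)
    if hist_subject_value > hist_threshold and saturation > sat_threshold:
        # Typical subject color and vivid pixel: make the subject label cheaper.
        return first_cost * factor
    return first_cost


print(boosted_subject_cost(2.0, 0.2, (255, 0, 0)))      # 1.0 (vivid red, frequent bin)
print(boosted_subject_cost(2.0, 0.2, (128, 128, 128)))  # 2.0 (gray, no saturation)
```

Requiring both conditions means that dull pixels, or pixels whose color is rare in the subject histogram, keep their original cost.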
(Appendix 7)
The angle-of-view adjustment apparatus according to Appendix 6, wherein the first region dividing means and the second region dividing means execute the minimization of the energy function by the Graph Cuts method.
(Appendix 8)
An angle-of-view adjustment method for adjusting the angle of view for a main subject so that the main subject appears at an appropriate angle of view in an image, the method comprising executing:
a position specifying step of having a user specify the center position of a target angle of view;
a first region dividing step of executing processing that divides the image into the main subject and a background other than the main subject based on each pixel value in the image, and that assigns the pixels surrounding the center position to the region of the main subject;
a second region dividing step of executing processing that divides the image into the main subject and the background other than the main subject based on each pixel value in the image, and of repeatedly executing, until a predetermined end condition is reached, processing that increases the degree to which each pixel around the main subject divided in the first region dividing step is assigned to the region of the main subject; and
an angle-of-view adjusting step of adjusting the angle of view for the main subject with reference to a circumscribed rectangular frame surrounding the main subject divided in the second region dividing step.
(Appendix 9)
A program for causing a computer, which adjusts the angle of view for a main subject so that the main subject appears at an appropriate angle of view in an image, to execute:
a position specifying step of having a user specify the center position of a target angle of view;
a first region dividing step of executing processing that divides the image into the main subject and a background other than the main subject based on each pixel value in the image, and that assigns the pixels surrounding the center position to the region of the main subject;
a second region dividing step of executing processing that divides the image into the main subject and the background other than the main subject based on each pixel value in the image, and of repeatedly executing, until a predetermined end condition is reached, processing that increases the degree to which each pixel around the main subject divided in the first region dividing step is assigned to the region of the main subject; and
an angle-of-view adjusting step of adjusting the angle of view for the main subject with reference to a circumscribed rectangular frame surrounding the main subject divided in the second region dividing step.

DESCRIPTION OF SYMBOLS
101 Position specifying means
102 First region dividing means
103 Second region dividing means
104 Angle-of-view adjusting means
301 Angle-of-view adjustment apparatus
302 CPU
303 ROM
304 RAM
305 External storage device
306 Communication interface
307 Input device
308 Display device
309 Portable recording medium drive device
310 Portable recording medium
311 Bus
312 Imaging device

Claims

An angle-of-view adjustment apparatus that adjusts the angle of view for a main subject so that the main subject appears at an appropriate angle of view in an image, the apparatus comprising:
position specifying means for specifying the center position of a target angle of view;
region dividing means for executing a first region dividing process, which divides the image into the main subject and a background other than the main subject based on each pixel value in the image and assigns the pixels surrounding the center position to the main subject region, and a second region dividing process, which repeatedly executes, until a predetermined end condition is reached, processing that increases the degree to which each pixel around the divided main subject is assigned to the main subject region;
angle-of-view adjusting means for adjusting the angle of view for the main subject by zooming in, with reference to a circumscribed rectangular frame surrounding the main subject divided by the second region dividing process, so that the circumscribed rectangular frame fits within the image; and
initial zoom means that sets, as a zoom region, a rectangle extending from the center position in each of the up, down, left, and right directions by the length of the shortest of the four line segments drawn from the center position to the top, bottom, left, and right edges of the image, zooms in so that the zoom region fits in the center of the screen with its aspect ratio preserved, and takes the zoomed-in result as an initial zoom region, wherein
the adjustment of the angle of view is performed on the initial zoom region, and
the predetermined end condition is that any part of the main subject region divided by the second region dividing process has reached an edge of the initial zoom region.

An angle-of-view adjustment apparatus that adjusts the angle of view for a main subject so that the main subject appears at an appropriate angle of view in an image, the apparatus comprising:
position specifying means for specifying the center position of a target angle of view;
region dividing means for executing a first region dividing process, which divides the image into the main subject and a background other than the main subject based on each pixel value in the image and assigns the pixels surrounding the center position to the main subject region, and a second region dividing process, which repeatedly executes, until a predetermined end condition is reached, processing that increases the degree to which each pixel around the divided main subject is assigned to the main subject region; and
angle-of-view adjusting means for adjusting the angle of view for the main subject by zooming in, with reference to a circumscribed rectangular frame surrounding the main subject divided by the second region dividing process, so that the circumscribed rectangular frame fits within the image, wherein
as the first region dividing process, the region dividing means, while updating a region label that is assigned to each pixel in the image and indicates the main subject or the background, divides the image into the main subject and the background by minimizing an energy function based on the region labels and the pixel value of each pixel, the energy function including a first cost term whose value decreases as the value of a first histogram, computed per pixel value from an image showing the main subject, increases, and a second cost term whose value decreases as the value of a second histogram, computed per pixel value from an image showing the background, increases, and the first cost term being zero for the pixels surrounding the center position.

The angle-of-view adjustment apparatus according to claim 2, wherein, as the second region dividing process, the region dividing means, while updating the region label that is assigned to each pixel in the image and indicates the main subject or the background, divides the image into the main subject and the background by minimizing an energy function based on the region labels and the pixel value of each pixel, the energy function including a first cost term whose value decreases as the value of the first histogram increases and a second cost term whose value decreases as the value of the second histogram increases, and the first cost term being made smaller when the value of the first histogram at the bin given by the pixel value of a pixel is larger than a predetermined histogram threshold and the saturation of that pixel value is larger than a predetermined saturation threshold.

The angle-of-view adjustment apparatus according to claim 3, wherein the region dividing means executes the minimization of the energy function by the Graph Cuts method.

The angle-of-view adjustment apparatus according to claim 1, wherein the position specifying means is means for having a user specify the center position.

The angle-of-view adjustment apparatus according to claim 5, wherein the position specifying means has the user specify the center position of the target angle of view by touch on a display with a touch input function that displays the image.

An angle-of-view adjustment method for adjusting the angle of view for a main subject so that the main subject appears at an appropriate angle of view in an image, the method comprising:
a position specifying step of specifying the center position of a target angle of view;
a region dividing step of executing a first region dividing process, which divides the image into the main subject and a background other than the main subject based on each pixel value in the image and assigns the pixels surrounding the center position to the main subject region, and a second region dividing process, which repeatedly executes, until a predetermined end condition is reached, processing that increases the degree to which each pixel around the divided main subject is assigned to the main subject region;
an angle-of-view adjusting step of adjusting the angle of view for the main subject by zooming in, with reference to a circumscribed rectangular frame surrounding the main subject divided by the second region dividing process, so that the circumscribed rectangular frame fits within the image; and
an initial zoom step of setting, as a zoom region, a rectangle extending from the center position in each of the up, down, left, and right directions by the length of the shortest of the four line segments drawn from the center position to the top, bottom, left, and right edges of the image, zooming in so that the zoom region fits in the center of the screen with its aspect ratio preserved, and taking the zoomed-in result as an initial zoom region, wherein
the adjustment of the angle of view is performed on the initial zoom region, and
the predetermined end condition is that any part of the main subject region divided by the second region dividing process has reached an edge of the initial zoom region.

An angle-of-view adjustment method for adjusting the angle of view for a main subject so that the main subject appears at an appropriate angle of view in an image, the method comprising:
a position specifying step of specifying the center position of a target angle of view;
a region dividing step of executing a first region dividing process, which divides the image into the main subject and a background other than the main subject based on each pixel value in the image and assigns the pixels surrounding the center position to the main subject region, and a second region dividing process, which repeatedly executes, until a predetermined end condition is reached, processing that increases the degree to which each pixel around the divided main subject is assigned to the main subject region; and
an angle-of-view adjusting step of adjusting the angle of view for the main subject by zooming in, with reference to a circumscribed rectangular frame surrounding the main subject divided by the second region dividing process, so that the circumscribed rectangular frame fits within the image, wherein
as the first region dividing process, while a region label that is assigned to each pixel in the image and indicates the main subject or the background is updated, the image is divided into the main subject and the background by minimizing an energy function based on the region labels and the pixel value of each pixel, the energy function including a first cost term whose value decreases as the value of a first histogram, computed per pixel value from an image showing the main subject, increases, and a second cost term whose value decreases as the value of a second histogram, computed per pixel value from an image showing the background, increases, and the first cost term being zero for the pixels surrounding the center position.

A program for causing a computer, which adjusts the angle of view for a main subject so that the main subject appears at an appropriate angle of view in an image, to execute each step of claim 7 or 8.