JP5006839B2

JP5006839B2 - Trimming rule learning apparatus and method, and program

Info

Publication number: JP5006839B2
Application number: JP2008137361A
Authority: JP
Inventors: 浩民金; 正志乗松; 康晴岩城
Original assignee: Fujifilm Corp
Current assignee: Fujifilm Corp
Priority date: 2008-05-27
Filing date: 2008-05-27
Publication date: 2012-08-22
Anticipated expiration: 2028-05-27
Also published as: JP2009290249A

Description

The present invention relates to a trimming rule learning device and method for learning the trimming rules used when images are trimmed automatically, and to a program that causes a computer to execute the trimming rule learning method.

When taking a photograph, the photographer holds the camera so as to obtain a desirable composition. However, because choosing an appropriate composition at the time of shooting takes skill, the captured image often does not have the desired composition. For example, an unrelated subject may appear in the image, or a subject that should be included may end up at the edge of the image. For this reason, a partial region of a captured image is commonly trimmed so that it has the desired composition.

Trimming can be performed manually, with the user cutting out the desired region while viewing the image, but the work becomes very tedious when there are many images.

For this reason, various methods for trimming images automatically have been proposed (see Patent Documents 1 and 2). The method described in Patent Document 1 calculates the degree of attention of a region of interest in an image and an object index using a predefined model, determines a trimming method based on the degree of attention, the object index value, and predefined trimming rules, and trims the image accordingly.

The method described in Patent Document 2 learns an individual user's sensibility and uses the learning result to trim images in accordance with that sensibility.

Patent Document 1: Japanese Unexamined Patent Publication No. 2004-228995
Patent Document 2: Japanese Unexamined Patent Publication No. 2006-134153

However, Patent Document 1 does not disclose how the trimming rules are learned. In addition, because the method described in Patent Document 2 trims images by learning the user's sensibility, the composition of the trimming region is not necessarily the composition the user wants.

The present invention has been made in view of the above circumstances, and its object is to enable trimming that yields the composition desired by the user.

The trimming rule learning device according to the present invention comprises:
display means for displaying images classified in advance by the objects they contain;
input means for accepting a user's designation of a trimming region for a displayed image; and
learning means for learning, for each user and for each object, a trimming rule for images based on the composition of the trimming regions of a plurality of images.

In the trimming rule learning device according to the present invention, the learning means may extract the object included in the trimming region and, based on the composition of the trimming region, acquire as composition information the position of the object in the original image before trimming, the position of the object in the trimming region, the area ratio of the object to the original image, and the area ratio of the object to the trimming region, and may learn that composition information as the trimming rule.

In this case, the learning means may extract a region of interest within the trimming region as the object.

The trimming rule learning method according to the present invention comprises:
displaying images classified in advance by the objects they contain;
accepting a user's designation of a trimming region for a displayed image; and
learning, for each user and for each object, a trimming rule for images based on the composition of the trimming regions of a plurality of images.

The trimming rule learning method according to the present invention may also be provided as a program that causes a computer to execute it.

According to the present invention, images classified in advance by the objects they contain are displayed, and the user's designation of a trimming region for a displayed image is accepted. A trimming rule for images is then learned for each user and each object, based on the composition of the trimming regions of a plurality of images. Consequently, by trimming images automatically using the learned trimming rules, an object included in an image can be trimmed so that it has the composition the user wants.

Furthermore, extracting the region of interest within the trimmed region as the object makes the object easy to extract.

Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a schematic block diagram showing the configuration of a trimming rule learning device according to an embodiment of the present invention. As shown in FIG. 1, the trimming rule learning device 1 according to this embodiment comprises an image recording unit 2 that stores a plurality of images to be used for learning, a display unit 3 such as a liquid crystal monitor that displays images, an input unit 4 consisting of a keyboard, a mouse, and the like for entering various instructions, a learning unit 5 for learning trimming rules, and a control unit 6 that controls each unit. The units are connected by a bus 7.

The image recording unit 2 stores a plurality of images to be used for learning, and each image is classified in advance by the objects it contains. For example, the images are classified by contained object into categories such as sky, flower, sea, mountain, group photo, frontal face portrait, oblique face portrait, signboard, sign, building, and landmark. The classification may be performed manually while viewing the images, or automatically by object recognition. An image that contains multiple objects is classified into multiple groups.

The learning unit 5 comprises an object extraction unit 8 and a learning result database DB1 in which the trimming rules obtained by learning are registered.

As described later, the object extraction unit 8 extracts, as the object, a region of interest within the trimming region designated by the user. Specifically, the object extraction unit 8 extracts the portion that draws attention when the image is viewed as the region of interest, that is, the object. For example, a portion whose color differs from the surrounding colors, a portion that is much brighter than its surroundings, or a straight line appearing on a flat background draws attention when the image is viewed. The object extraction unit 8 therefore determines, based on the color and brightness of the image and the direction of the linear components appearing in it, the degree to which the features of each portion of the image differ from the features of the surrounding portions, and extracts portions where this degree is large as the region of interest, that is, the object.

In a region of interest that draws visual attention in this way, the elements that make up the image, such as color, brightness, and the linear components appearing in the image, have features that differ from the surroundings. Accordingly, the color of the image (Color), the brightness of the image (Intensity), and the direction of the linear components appearing in the image (Orientation) can be used to determine the degree to which the features of each portion of the image differ from the features of the surrounding portions, and portions where this degree is large can be extracted as regions of interest that draw visual attention. Specifically, such regions can be extracted automatically using the method proposed by Itti, Koch, and colleagues (see, for example, Laurent Itti, Christof Koch and Ernst Niebur, "A Model of Saliency-Based Visual Attention for Rapid Scene Analysis", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, No. 11, November 1998).

The flow of processing for extracting a region of interest using this method is described below with reference to FIG. 2.

First, linear filtering is applied to the image to generate an image representing brightness and images of a plurality of separate color components (Step 1).

Specifically, a brightness image I is generated from the image, and a Gaussian pyramid of the brightness image I is then generated. The image at each level of this Gaussian pyramid is denoted I(σ), where σ represents the pixel scale and σ ∈ [0..8].

Next, the image is separated into four color component images R (red), G (green), B (blue), and Y (yellow). Four Gaussian pyramids are generated from the images R, G, B, and Y, and the images at each level are denoted R(σ), G(σ), B(σ), and Y(σ).
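As an illustration only, Step 1 might be sketched in Python as follows. The channel formulas are taken from the cited Itti, Koch and Niebur paper rather than from this description, and the function names, the 5-tap binomial blur kernel, and the edge-replication padding are assumptions made for the sketch.

```python
import numpy as np

def split_channels(rgb):
    """Split a float RGB image (H x W x 3) into I, R, G, B, Y images.

    Assumption: broadly tuned color channels as in the cited paper,
    with negative values clipped to zero.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    intensity = (r + g + b) / 3.0
    red = np.clip(r - (g + b) / 2.0, 0.0, None)
    green = np.clip(g - (r + b) / 2.0, 0.0, None)
    blue = np.clip(b - (r + g) / 2.0, 0.0, None)
    yellow = np.clip((r + g) / 2.0 - np.abs(r - g) / 2.0 - b, 0.0, None)
    return intensity, red, green, blue, yellow

def blur(img):
    """Separable 5-tap binomial blur with edge replication (assumed kernel)."""
    k = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    padded = np.pad(img, 2, mode="edge")
    # Horizontal pass over the padded rows, then vertical pass.
    h = sum(k[i] * padded[:, i:i + img.shape[1]] for i in range(5))
    return sum(k[i] * h[i:i + img.shape[0], :] for i in range(5))

def gaussian_pyramid(img, levels=9):
    """Levels sigma = 0..8: blur, then keep every second row and column."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2, ::2])
    return pyramid
```

With nine levels, each side of the input should be at least 256 pixels so that the coarsest level still holds one pixel.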

Feature maps are then generated from the images I(σ), R(σ), G(σ), B(σ), and Y(σ) by calculating the difference between the features of each portion of the image and the features of its surroundings (Step 2).

A portion of the image is perceived as differing in brightness from its surroundings where a dark portion lies in bright surroundings, or where a bright portion lies in dark surroundings. The degree to which the brightness of the central portion differs from the surrounding brightness is therefore determined using an image I(c) represented by fine pixels and an image I(s) represented by coarse pixels. Because the value of one pixel of the coarse image I(s) summarizes several pixels of the fine image I(c), taking the difference between a pixel value of I(c) (the brightness of the center) and the value of the pixel of I(s) at the corresponding position (the brightness of the surroundings), an operation called center-surround, shows how much each portion of the image differs from its surroundings. For example, with the scale of the fine image I(c) set to c ∈ {2, 3, 4} and the scale of the coarse image I(s) set to s = c + δ (δ ∈ {3, 4}), the brightness feature map M_I(c, s) is obtained. This brightness feature map M_I(c, s) is given by equation (1).
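Equation (1) is not reproduced in this text. In the cited Itti, Koch and Niebur paper, the corresponding center-surround expression takes the following form, where ⊖ denotes interpolating the coarse image to the fine scale and subtracting point by point; the symbol M_I is this description's name for the map.

```latex
M_I(c, s) = \left| I(c) \ominus I(s) \right| \tag{1}
```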

Feature maps are generated in the same way for the color components from R(σ), G(σ), B(σ), and Y(σ). Places where the color of a portion of the image is perceived as differing from the surrounding color can be found from combinations of colors located opposite each other on the hue circle (opponent colors). For example, the feature map M_RG(c, s) is obtained from the red/green and green/red combinations, and the feature map M_BY(c, s) is obtained from the blue/yellow and yellow/blue combinations. These color feature maps are given by equations (2) and (3).
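Equations (2) and (3) are likewise not reproduced in this text. The cited paper writes the corresponding color-opponent maps in the following form, with ⊖ again denoting the across-scale difference.

```latex
M_{RG}(c, s) = \left| \bigl(R(c) - G(c)\bigr) \ominus \bigl(G(s) - R(s)\bigr) \right| \tag{2}

M_{BY}(c, s) = \left| \bigl(B(c) - Y(c)\bigr) \ominus \bigl(Y(s) - B(s)\bigr) \right| \tag{3}
```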

Furthermore, portions where the direction of a linear component differs perceptibly from the linear components around it can be found by using, for example, a Gabor filter that detects the direction of linear components in the brightness image I. A Gabor filter is applied to the image at each level of I(σ) to detect linear components in each of the directions θ ∈ {0°, 45°, 90°, 135°}, and the feature map M_O(c, s, θ) is obtained. This orientation feature map is given by equation (4).
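Equation (4) is also missing from this text. In the cited paper the corresponding expression is as follows, where O(σ, θ) denotes the Gabor-filtered image at scale σ and orientation θ; that symbol comes from the paper and is an assumption here.

```latex
M_O(c, s, \theta) = \left| O(c, \theta) \ominus O(s, \theta) \right| \tag{4}
```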

When c ∈ {2, 3, 4} and s = c + δ (δ ∈ {3, 4}), 6 brightness feature maps, 12 color feature maps, and 24 orientation feature maps are obtained. The region of interest that draws visual attention is extracted by considering these maps together.
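The scale pairs and the resulting map counts can be checked with a short sketch. The function names are hypothetical, and the nearest-neighbour upsampling used for the across-scale difference is an assumption (the cited paper interpolates the coarse level to the fine scale).

```python
import numpy as np

CENTER_SCALES = (2, 3, 4)         # c
DELTAS = (3, 4)                   # delta, so that s = c + delta
ORIENTATIONS = (0, 45, 90, 135)   # theta in degrees

def scale_pairs():
    """All (c, s) combinations used for the center-surround differences."""
    return [(c, c + d) for c in CENTER_SCALES for d in DELTAS]

def center_surround(pyramid, c, s):
    """Absolute difference between level c and level s upsampled to level c.

    Assumption: each coarse pixel is simply repeated (nearest neighbour).
    """
    center = pyramid[c]
    factor = 2 ** (s - c)
    surround = np.kron(pyramid[s], np.ones((factor, factor)))
    surround = surround[:center.shape[0], :center.shape[1]]
    return np.abs(center - surround)

pairs = scale_pairs()
map_counts = {
    "intensity": len(pairs),                       # M_I(c, s)
    "color": 2 * len(pairs),                       # M_RG and M_BY
    "orientation": len(ORIENTATIONS) * len(pairs)  # M_O(c, s, theta)
}
```

Six scale pairs give 6, 12, and 24 maps respectively, 42 in total.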

Because of differences in dynamic range and in the information extracted, some of these 42 feature maps M_I, M_RG, M_BY, and M_O show large differences between each portion and its surroundings, while others show only small differences. If the values of the 42 feature maps were used as they are, the result would be dominated by the maps with large differences, and the information in the maps with small differences might not be reflected. It is therefore preferable to normalize the 42 feature maps M_I, M_RG, M_BY, and M_O before combining them to extract the region of interest.

Specifically, for example, the six brightness feature maps M_I(c, s) are normalized and combined to obtain the element-specific attention map for brightness M^C_I, the twelve color feature maps M_RG(c, s) and M_BY(c, s) are normalized and combined to obtain the element-specific attention map for color M^C_C, and the twenty-four orientation feature maps M_O(c, s, θ) are normalized and combined to obtain the element-specific attention map for orientation M^C_O (Step 3). The element-specific attention maps M^C_I, M^C_C, and M^C_O are then linearly combined to obtain an attention map M^S representing the distribution of the degree of attention over the image (Step 4). A region where the degree of attention exceeds a predetermined threshold Th1 is extracted as the region of interest, that is, the object (Step 5).
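Steps 3 to 5 might look like the following sketch. The min-max normalization is a simplifying assumption (the cited paper uses a normalization operator that favors maps with a few strong peaks), and the function names, the nearest-neighbour resizing, and the equal default weights are likewise assumptions.

```python
import numpy as np

def normalize(m):
    """Rescale a map to [0, 1]; a flat map becomes all zeros (assumed scheme)."""
    span = m.max() - m.min()
    if span <= 0:
        return np.zeros_like(m, dtype=float)
    return (m - m.min()) / span

def resize_to(m, shape):
    """Nearest-neighbour resize so maps from different scales can be added."""
    rows = np.arange(shape[0]) * m.shape[0] // shape[0]
    cols = np.arange(shape[1]) * m.shape[1] // shape[1]
    return m[np.ix_(rows, cols)]

def conspicuity(feature_maps, shape):
    """Step 3: normalize each feature map, bring it to one size, and sum."""
    total = sum(resize_to(normalize(m), shape) for m in feature_maps)
    return normalize(total)

def saliency(m_i, m_c, m_o, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Step 4: linear combination of M^C_I, M^C_C and M^C_O."""
    w_i, w_c, w_o = weights
    return w_i * m_i + w_c * m_c + w_o * m_o

def attention_mask(saliency_map, th1):
    """Step 5: pixels whose degree of attention exceeds the threshold Th1."""
    return saliency_map > th1
```

Changing the `weights` argument corresponds to changing the weights of the linear combination, which changes the extracted region.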

The extracted region of interest can also be changed by changing how strongly it is influenced by the degree to which the color, the brightness, and the inclination of the linear components of the image differ from the surroundings, that is, by changing the weight applied to each of these degrees. For example, the extracted region of interest can be changed by changing the weights used when linearly combining the element-specific attention maps M^C_I, M^C_C, and M^C_O. Alternatively, when the element-specific attention maps M^C_I, M^C_C, and M^C_O are obtained, the weights applied to the individual feature maps M_I(c, s), M_RG(c, s), M_BY(c, s), and M_O(c, s, θ) may be changed so as to change the influence of each map.

The method of extracting the region of interest, that is, the object, is not limited to the method described above, and any known method may be used.

The control unit 6 comprises a CPU, a RAM serving as a work area, and a ROM storing, among other things, a program for operating the trimming rule learning device 1.

The processing performed during learning is described below as part of the operation of this embodiment.

FIG. 3 is a flowchart showing the processing performed during learning in this embodiment. Learning is performed for each user, and it is assumed that the user ID of the user performing the learning has been entered into the trimming rule learning device 1 from the input unit 4 in advance. When the user gives an instruction to start learning from the input unit 4, the control unit 6 starts the processing and displays an image to be learned on the display unit 3 (step ST1). The images to be learned may be displayed in order of classified object or in random order. Next, the control unit 6 starts monitoring whether the user has designated a trimming region (step ST2). When the result of step ST2 is affirmative, the object extraction unit 8 of the learning unit 5 extracts the region of interest in the trimming region designated by the user (step ST3).

FIG. 4 is a diagram for explaining trimming, and FIG. 5 is a diagram for explaining object extraction. As shown in FIG. 4, suppose that the image before trimming (hereinafter, the original image) S0 contains a flower and that the trimming region designated by the user is T0. In this case, the subjects included in the trimming region T0 are the flower and part of a mountain in the background, and the flower is the region of interest. The region of the flower is therefore extracted as the object. Compared with the region A0 enclosed by the contour that includes the petals (shown by the broken line in FIG. 5), the region of interest is a region A1 corresponding only to the central part of the flower. The object extraction unit 8 extracts the region of interest A1, corresponding to the central part of the flower, as the object O1.

Next, based on the composition of the trimming region T0, the learning unit 5 acquires the position of the object in the original image S0, the position of the object in the trimming region T0, the area ratio of the object to the original image S0, and the area ratio of the object to the trimming region T0 as composition information representing the composition of the trimming region T0 (step ST4).

Here, the position of the object O1 in the original image S0 means the distances L1 to L4 from the object O1 to the top, right, bottom, and left sides of the original image S0, as shown in FIG. 6. The position of the object O1 in the trimming region T0 means the distances LT1 to LT4 from the object O1 to the top, right, bottom, and left sides of the trimming region T0, as shown in FIG. 7. The area ratio H1 of the object O1 to the original image S0 is the area of the object O1 relative to the area of the original image S0, expressed as a percentage, for example 10%. The area ratio H2 of the object to the trimming region T0 is the area of the object O1 relative to the area of the trimming region T0, expressed as a percentage, for example 20%.
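The composition information of step ST4 could be computed as in the following sketch. It assumes, purely for illustration, that the object O1, the trimming region T0, and the original image S0 are all represented by axis-aligned rectangles in pixel coordinates; the `Box` type and the function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in image coordinates (assumed representation)."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def area(self):
        return (self.right - self.left) * (self.bottom - self.top)

def edge_distances(obj, frame):
    """Distances from the object to the top, right, bottom and left sides."""
    return (
        obj.top - frame.top,
        frame.right - obj.right,
        frame.bottom - obj.bottom,
        obj.left - frame.left,
    )

def composition_info(original, trim, obj):
    """L1..L4, LT1..LT4 and the area ratios H1, H2 (in percent)."""
    return {
        "L": edge_distances(obj, original),
        "LT": edge_distances(obj, trim),
        "H1": 100.0 * obj.area / original.area,
        "H2": 100.0 * obj.area / trim.area,
    }

# Example values chosen so that H1 = 10% and H2 = 20%.
info = composition_info(
    original=Box(0, 0, 1000, 800),
    trim=Box(200, 100, 1000, 600),
    obj=Box(400, 200, 800, 400),
)
```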

The learning unit 5 then learns the trimming rule by updating the trimming rule, which is the learning result, with the acquired composition information (step ST5). Updating the trimming rule means calculating, for the user currently being learned, the average of the composition information contained in the trimming rule registered so far in the learning result database DB1 for the particular object and the newly acquired composition information. For example, as shown in FIG. 8, if the trimming rule registered in the learning result database DB1 for a certain object, obtained from only one item of composition information, is Rold and the newly acquired composition information is K0, the updated trimming rule Rnew is obtained by calculating the average of each value of the composition information in Rold and the corresponding value of the new composition information K0. Instead of calculating the average, the trimming rule R may be updated by storing all the composition information acquired for the learning images.
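A minimal sketch of the averaging update of step ST5 follows. The description only gives the case in which Rold was built from a single item of composition information; weighting the old rule by the number of samples it already summarizes is an assumption, as are the dictionary layout and the function name.

```python
def update_rule(rule, info):
    """Fold one new composition record into a per-field running mean.

    rule: None, or {"count": n, "mean": {field: value}} (assumed layout).
    info: {field: value} with the same fields, e.g. L1..L4, LT1..LT4, H1, H2.
    """
    if rule is None:
        # First sample for this user and object: the rule is the sample itself.
        return {"count": 1, "mean": dict(info)}
    n = rule["count"]
    mean = {k: (rule["mean"][k] * n + info[k]) / (n + 1) for k in info}
    return {"count": n + 1, "mean": mean}

r_old = update_rule(None, {"H1": 10.0, "H2": 20.0})
r_new = update_rule(r_old, {"H1": 20.0, "H2": 40.0})
```

When Rold holds a single sample, the result is the plain average of Rold and K0, as in the example of FIG. 8.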

Next, the control unit 6 determines whether learning has been performed for all the images stored in the image recording unit 2 (step ST6). If the result of step ST6 is negative, the learning target is changed to the next image (step ST7) and the processing returns to step ST1. If the result of step ST6 is affirmative, learning of the trimming rules for that user ends.

FIG. 9 is a diagram showing the learning result database of trimming rules. As shown in FIG. 9, the user IDs (001, 002, ...) of a plurality of users are registered in the learning result database DB1, and objects such as sky, flower, sea, mountain, group photo, frontal face portrait, and oblique face portrait are registered under each user ID. The trimming rule obtained by learning is registered for each object.
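The two-level organization of the learning result database DB1 (user ID, then object) could be modelled as in this sketch. The class and method names are hypothetical, and an in-memory dictionary stands in for whatever storage an actual device would use.

```python
from collections import defaultdict

class LearningResultDB:
    """Trimming rules keyed by user ID and then by object label (a sketch)."""

    def __init__(self):
        self._rules = defaultdict(dict)

    def register(self, user_id, object_label, rule):
        """Store or overwrite the rule learned for this user and object."""
        self._rules[user_id][object_label] = rule

    def lookup(self, user_id, object_label):
        """Return the rule, or None if nothing has been learned yet."""
        return self._rules.get(user_id, {}).get(object_label)

db = LearningResultDB()
db.register("001", "flower", {"H1": 10.0, "H2": 20.0})
```

The same lookup by user ID and recognized object is what a trimming device would need when it retrieves a rule for an image to be trimmed.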

FIG. 10 is a schematic block diagram showing the configuration of a trimming device that trims images using the trimming rules learned by the trimming rule learning device 1 according to this embodiment. As shown in FIG. 10, the trimming device 20 comprises a recording control unit 22 that controls reading of images from a medium 21 on which the images to be trimmed are recorded and recording of images onto the medium 21, the learning result database DB1 in which the trimming rules learned by the trimming rule learning device 1 described above are registered, a trimming unit 23 that performs trimming, a display unit 24 such as a liquid crystal monitor that performs various displays, an input unit 25 for various inputs, and a control unit 26 that controls each unit. The units are connected by a bus 27.

The control unit 26 comprises a CPU, a RAM serving as a work area, and a ROM storing, among other things, a program for operating the trimming rule learning device 1.

The processing performed in the trimming device 20 is described below. FIG. 11 is a flowchart showing the trimming processing performed by the trimming device 20. When the user gives an instruction to start trimming from the input unit 25, the control unit 26 starts the processing and accepts the user ID and the designation of the image to be trimmed from the input unit 25 (step ST11). The recording control unit 22 then reads the designated image S1 to be processed from the medium 21 (step ST12), and the trimming unit 23 recognizes the object included in the image S1 (step ST13). Instead of performing object recognition, the user may enter the type of object included in the image to be trimmed from the input unit 25.

The trimming unit 23 then refers to the learning result database DB1 based on the user ID and the object recognition result, and acquires the trimming rule corresponding to the current user and the object included in the image to be trimmed (step ST14). If the learning result database DB1 stores all the composition information acquired for the learning images, one item of composition information is selected at random and used as the trimming rule. The trimming unit 23 trims the image S1 based on the acquired trimming rule (step ST15), displays the trimming result on the display unit 24 (step ST16), and ends the processing.

As described above, in this embodiment, images classified in advance by the objects they contain are displayed, the user's designation of a trimming region for a displayed image is accepted, and a trimming rule for images is learned for each user and each object based on the composition of the trimming regions of a plurality of images. Consequently, by trimming images automatically using the learning result database DB1 obtained by learning, an object included in an image can be trimmed so that it has the composition the user wants.

The device 1 according to the embodiment of the present invention has been described above. A program that causes a computer to function as means corresponding to the learning unit 5 described above and to perform processing such as that shown in FIG. 3 is also an embodiment of the present invention, as is a computer-readable recording medium on which such a program is recorded.

FIG. 1: Schematic block diagram showing the configuration of a trimming rule learning device according to an embodiment of the present invention
FIG. 2: Diagram for explaining a method of extracting a region of interest, that is, an object
FIG. 3: Flowchart showing the processing performed during learning in this embodiment
FIG. 4: Diagram for explaining trimming
FIG. 5: Diagram for explaining object extraction
FIG. 6: Diagram for explaining the position of an object in the original image
FIG. 7: Diagram for explaining the position of an object in the trimming area
FIG. 8: Diagram for explaining the learning of trimming rules
FIG. 9: Diagram showing the learning result database DB1
FIG. 10: Schematic block diagram showing the configuration of a trimming device that trims images using the trimming rules learned by the trimming rule learning device according to this embodiment
FIG. 11: Flowchart showing the trimming processing

Explanation of symbols

1 Trimming rule learning device
2 Image recording unit
3 Display unit
4 Input unit
5 Learning unit
6 Control unit
7 Bus
8 Object extraction unit
DB1 Learning result database

Claims

A trimming rule learning device comprising:
display means for displaying images classified in advance according to the objects they contain;
input means for accepting a user's designation of a trimming area for a displayed image; and
learning means which, when learning a trimming rule for the images for each object on a per-user basis based on the composition of the trimming areas for a plurality of images, extracts the object included in the trimming area, acquires as composition information, based on the composition of the trimming area, the position of the object in the original image before trimming, the position of the object in the trimming area, the area ratio of the object to the original image, and the area ratio of the object to the trimming area, and learns the composition information as the trimming rule.

The trimming rule learning device according to claim 1, wherein the learning means extracts a region of interest in the trimming area as the object.

A trimming rule learning method comprising:
displaying images classified in advance according to the objects they contain;
accepting a user's designation of a trimming area for a displayed image;
when learning a trimming rule for the images for each object on a per-user basis based on the composition of the trimming areas for a plurality of images, extracting the object included in the trimming area;
acquiring as composition information, based on the composition of the trimming area, the position of the object in the original image before trimming, the position of the object in the trimming area, the area ratio of the object to the original image, and the area ratio of the object to the trimming area; and
learning the composition information as the trimming rule.

A program for causing a computer to execute a trimming rule learning method comprising:
a procedure for displaying images classified in advance according to the objects they contain;
a procedure for accepting a user's designation of a trimming area for a displayed image;
a procedure for extracting the object included in the trimming area when learning a trimming rule for the images for each object on a per-user basis based on the composition of the trimming areas for a plurality of images;
a procedure for acquiring as composition information, based on the composition of the trimming area, the position of the object in the original image before trimming, the position of the object in the trimming area, the area ratio of the object to the original image, and the area ratio of the object to the trimming area; and
a procedure for learning the composition information as the trimming rule.