JP5158974B2

JP5158974B2 - Attention area extraction method, program, and image evaluation apparatus

Info

Publication number: JP5158974B2
Application number: JP2009109950A
Authority: JP
Inventors: 亮太河内
Original assignee: Nikon Systems Inc
Current assignee: Nikon Systems Inc
Priority date: 2009-04-28
Filing date: 2009-04-28
Publication date: 2013-03-06
Anticipated expiration: 2029-04-28
Also published as: JP2010257423A

Description

The present invention relates to an attention area extraction method, a program, and an image evaluation apparatus.

In recent years, techniques for extracting a subject or a principal region from an image have been studied, mainly as preprocessing for semantic understanding of digital images and for image tagging in image retrieval, and as a way of determining the zoom center for automatic zooming in a slide show (see, for example, Non-Patent Document 1 and Non-Patent Document 2). Apart from these techniques, methods that extract a specific image region based on a user instruction, for example by having the user designate a predetermined region on the screen, have also been proposed (see, for example, Patent Document 1). All of these fall under the concept of "attention area extraction from an image".

Patent Document 1: JP 2005-78290 A

Non-Patent Document 1: Kimura et al., "Extraction of Interest Region from Image Based on Attention", IEICE ICD2005-221, 2005.
Non-Patent Document 2: M. Zhang et al., "Auto Cropping for Digital Photographs", IEEE ICME 2005, 2005.

However, all of the conventional techniques suffer from slow processing speed. For example, although the technique described in Non-Patent Document 1 is comparatively fast, it takes about 200 ms in an ordinary personal computer (PC) environment. Given how ordinary users take photographs today, each user holds a large amount of image data. Running this kind of extraction on a certain number of those images in an embedded environment such as a digital still camera (DSC) would take 10 seconds or more in total, which is not yet a practical speed for a real application. The technique described in Non-Patent Document 2 takes several seconds to process a single image even in a PC environment, which is likewise undesirable for a real application in an embedded environment such as a DSC.

Furthermore, the technique of Non-Patent Document 1 performs extraction on the assumption that the region the user is interested in lies near the center of the image, but this assumption fails for many images that are actually captured. When extraction is applied to such images, a region far removed from the user's perception is extracted as the attention area. The invention described in Patent Document 1, on the other hand, has the user designate a region (or point) of interest on the image before extracting the region, so it is not an effective function on a small device such as a DSC that can offer only a simple user interface compared with a PC. In addition, because it performs region-based clustering, it cannot achieve a practical processing speed in an embedded environment such as a DSC.

The present invention has been made in view of these problems, and its object is to provide an attention area extraction method, a program, and an image evaluation apparatus that are faster and more accurate and that can extract an attention area at a practical level even in an embedded environment such as a DSC.

To solve the above problems, an attention area extraction method according to a first aspect of the present invention extracts an attention area of an extraction target image, which is a digital image expressed by a luminance and two chromaticities for each pixel. The method includes: a step of extracting, for each pixel, an edge component of the extraction target image from each of the three planes corresponding to the luminance and the two chromaticities, and calculating an edge amount for each pixel by weighting and adding the edge components of the three planes; a step of calculating an attention level for each pixel by multiplying the edge amount of each pixel of the extraction target image by the corresponding element of an attention area weighting map, which has elements corresponding to the number of pixels of the extraction target image and has a weight value set in advance for each element; and a step of extracting, as the attention area, a region of the extraction target image that contains all pixels whose attention level is greater than a predetermined threshold.
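The final step of the first aspect, taking the region that contains every pixel whose attention level exceeds the threshold, can be sketched as follows. This is a minimal illustration only: the function name `attention_region` is hypothetical, the attention map is a plain nested list, and the enclosing region is assumed to be an axis-aligned rectangle reported as (top, left, bottom, right), with the first index running along the height direction.

```python
def attention_region(attention, threshold):
    """Return (top, left, bottom, right) of the rectangle enclosing all
    pixels whose attention level is strictly greater than threshold,
    or None when no pixel qualifies (an assumption of this sketch)."""
    hits = [
        (x, y)
        for x, row in enumerate(attention)   # x: height direction
        for y, value in enumerate(row)       # y: width direction
        if value > threshold
    ]
    if not hits:
        return None
    xs, ys = zip(*hits)
    return (min(xs), min(ys), max(xs), max(ys))


example = [
    [0, 0, 0, 0],
    [0, 5, 0, 0],
    [0, 0, 7, 0],
    [0, 0, 0, 1],
]
region = attention_region(example, 1)
```

The comparison is strict here because the step is stated for attention levels greater than the threshold.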

Preferably, this attention area extraction method further includes a step of generating at least one resized image by resizing the extraction target image to a different dimension, and the step of calculating the edge amount calculates the edge amount for the extraction target image and for each resized image. The step of calculating the attention level then preferably includes: a step of restoring the edge amount calculated from each resized image to the same dimension as the number of pixels of the extraction target image; and a step of multiplying the edge amount calculated from the extraction target image and each restored edge amount by the corresponding elements of the attention area weighting map, multiplying each of the resulting products by a preset weight, and adding them pixel by pixel to calculate the attention level for each pixel.

Preferably, this attention area extraction method further includes: a step of cutting out, from the extraction target image, the region extracted in the step of extracting the attention area; a step of resizing the cut-out image to at least one different size; a step of calculating the edge amount for the cut-out image and for each image obtained by resizing it; a step of restoring the edge amount calculated from each resized image to the dimension of the cut-out image; a step of calculating the attention level of the cut-out image by multiplying the edge amount calculated from the cut-out image and each restored edge amount by preset weights and adding them pixel by pixel; a step of binarizing the attention level of the cut-out image, setting it to 1 where the attention level is equal to or greater than a predetermined threshold and to 0 where it is smaller; and a step of grouping adjacent pixels set to 1 in the binarized attention level, determining each group that contains at least a predetermined number of pixels to be an attention area partial region, and extracting a region containing all of the attention area partial regions as a new attention area.
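The binarization and grouping steps can be pictured with the sketch below. It is illustrative only: the text does not say whether adjacency means 4- or 8-connectivity, so 4-connectivity is assumed, and the name `refine_region` and its parameters are hypothetical.

```python
from collections import deque


def refine_region(attention, threshold, min_pixels):
    """Binarize the attention map (1 where value >= threshold), group
    adjacent 1-pixels, keep groups with at least min_pixels pixels, and
    return (top, left, bottom, right) enclosing all kept groups."""
    h, w = len(attention), len(attention[0])
    binary = [[1 if attention[x][y] >= threshold else 0 for y in range(w)]
              for x in range(h)]
    seen = [[False] * w for _ in range(h)]
    kept = []
    for x in range(h):
        for y in range(w):
            if not binary[x][y] or seen[x][y]:
                continue
            # Breadth-first search collects one group of adjacent 1-pixels.
            group, queue = [], deque([(x, y)])
            seen[x][y] = True
            while queue:
                cx, cy = queue.popleft()
                group.append((cx, cy))
                # 4-connectivity is an assumption of this sketch.
                for nx, ny in ((cx - 1, cy), (cx + 1, cy),
                               (cx, cy - 1), (cx, cy + 1)):
                    if (0 <= nx < h and 0 <= ny < w
                            and binary[nx][ny] and not seen[nx][ny]):
                        seen[nx][ny] = True
                        queue.append((nx, ny))
            if len(group) >= min_pixels:
                kept.extend(group)
    if not kept:
        return None
    xs, ys = zip(*kept)
    return (min(xs), min(ys), max(xs), max(ys))


sample = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 9],
]
```

With `min_pixels=3` the isolated pixel in the corner of `sample` is discarded as a small group, so the new attention area shrinks to the 2×2 block.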

An attention area extraction method according to a second aspect of the present invention extracts an attention area of an extraction target image, which is a digital image expressed by a luminance and two chromaticities for each pixel. The method includes: a step of obtaining, for each of the three planes corresponding to the luminance and the two chromaticities of the extraction target image, the standard deviation and the mean of the pixel values, calculating a coefficient of variation by dividing the standard deviation by the mean, and selecting the plane with the largest coefficient of variation as the processing target plane; a step of generating at least one resized image by resizing the processing target plane of the extraction target image to a different dimension; a step of extracting an edge component for each pixel in the processing target plane of the extraction target image and of each resized image, and calculating an edge amount for each pixel; a step of taking the image with the smallest dimension among the extraction target image and the resized images as a reference image, and restoring the edge amounts calculated from the remaining images to the same dimension as the number of pixels of the reference image; a step of multiplying each pixel of the edge amount of the reference image and of each restored edge amount by the corresponding element of an attention area weighting map, which has elements corresponding to the number of pixels of the reference image and a weight value set in advance for each element, multiplying each of the resulting products by a preset weight, and adding them pixel by pixel to calculate the attention level for each pixel; and a step of extracting, as the attention area, a region of the extraction target image that contains all pixels whose attention level is greater than a predetermined threshold.
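The plane selection in the second aspect reduces to computing one coefficient of variation (standard deviation divided by mean) per plane and taking the largest. The sketch below is a hedged illustration: the population standard deviation is assumed because the text does not say which estimator is meant, a plane with zero mean is given a coefficient of 0 to avoid division by zero, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(plane):
    """Standard deviation of the pixel values divided by their mean."""
    values = [v for row in plane for v in row]
    m = mean(values)
    if m == 0:
        return 0.0  # assumption: treat an all-zero-mean plane as flat
    return pstdev(values) / m


def select_plane(planes):
    """Return the name of the plane with the largest coefficient of variation."""
    return max(planes, key=lambda name: coefficient_of_variation(planes[name]))


planes = {
    "Y": [[100, 100], [100, 100]],   # flat plane
    "Cb": [[10, 30], [10, 30]],      # mean 20, standard deviation 10
    "Cr": [[50, 52], [50, 52]],      # mean 51, standard deviation 1
}
chosen = select_plane(planes)
```

Only the chosen plane is then resized and filtered, which is where the second aspect saves work compared with processing all three planes.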

A program according to the present invention causes a computer to execute the attention area extraction method described above.

An image evaluation apparatus according to the present invention includes: a storage unit that stores a digital image expressed by a luminance and two chromaticities for each pixel; and an attention area extraction unit that reads the digital image from the storage unit as an extraction target image and extracts its attention area by any of the attention area extraction methods described above.

When the attention area extraction method, the program, and the image evaluation apparatus according to the present invention are configured as described above, attention areas can be extracted faster and more accurately, and extraction at a practical level is possible even in an embedded environment such as a digital still camera.

A block diagram showing the configuration of the image evaluation apparatus.
An explanatory diagram showing the data structure of the database.
A flowchart of the image evaluation process executed by the image evaluation apparatus.
A flowchart showing the first attention area extraction method.
An explanatory diagram showing the filters used for Laplacian processing, in which (a) shows an 8-neighbor Laplacian filter and (b) shows a 4-neighbor Laplacian filter.
An explanatory diagram illustrating an example of the attention area weighting map.
An explanatory diagram illustrating an example of an image displayed on the image display unit by the image evaluation process.
A flowchart showing the second attention area extraction method.
A flowchart showing the third attention area extraction method.

Preferred embodiments of the present invention are described below with reference to the drawings. First, the configuration of the image evaluation apparatus 100 in which the attention area extraction method according to this embodiment is executed is described with reference to FIG. 1. As shown in FIG. 1, the image evaluation apparatus 100 includes: a control unit 102 that controls the entire apparatus; a database 103 (storage unit) that stores image data consisting of an image and various information about the image; an image display unit 101 for displaying images; an image data storage unit 104 that saves image data in the database 103 under the control of the control unit 102; an attention area extraction unit 105 that extracts an attention area from an image under the control of the control unit 102; an attention area display unit 106 that displays the extracted attention area on the image display unit 101 under the control of the control unit 102; an image storage switch (S/W) 107 that instructs the control unit 102 to save image data; an attention area extraction start switch (S/W) 108 that instructs the control unit 102 to start attention area extraction; and an attention area display switch (S/W) 109 that gives an instruction to display the attention area on the image display unit 101.

As shown in FIG. 2, the database 103 has an image information storage area 103a, which stores the storage location of the image data, the coordinates of the attention area, and so on, and an image data storage area 103h, in which the image itself and its attribute information are associated and stored as image data. The image information storage area 103a has the data structure shown in FIG. 2(a) and includes at least: a file name storage area 103b that stores, for example, a file name as identification information (ID) identifying the image data; an address storage area 103c that stores the start address of the storage location of the image data; an image size (height) storage area 103d that stores the size of the image in the height (vertical) direction as a number of pixels; an image size (width) storage area 103e that stores the size of the image in the width (horizontal) direction as a number of pixels; an upper-left coordinate storage area 103f that stores the upper-left coordinates (top, left) of the extracted attention area; and a lower-right coordinate storage area 103g that stores the lower-right coordinates (bottom, right) of the attention area. For these coordinates, the upper-left corner of the image is the origin (0, 0), the height direction is the x axis, and the width direction is the y axis. Accordingly, top and bottom give the position relative to the origin in the height direction in pixels, and left and right give the position relative to the origin in the width direction in pixels.

The image data storage area 103h, on the other hand, has the data structure shown in FIG. 2(b). It includes an image storage area 103i, which stores the image itself, and an attribute information storage area 103j, which stores attribute information such as a header area and shooting time information, and the information for one piece of image data is managed as one record. The storage location (address) of the image data held in the image data storage area 103h is stored in the address storage area 103c of the image information storage area 103a. The image data to be read is identified from its file name in the file name storage area 103b, the start address of its storage area is taken from the address storage area 103c, and the image data itself is then obtained from the image data storage area 103h. The image storage area 103i and the attribute information storage area 103j may also be configured as a single file. Exif (Exchangeable image file format), standardized by the Japan Electronic Industry Development Association (JEIDA), is a known example of such a data structure.

Next, the image evaluation process that uses the attention area extraction method in the image evaluation apparatus 100 configured as above is described with reference to the flowchart of FIG. 3. The image evaluation process consists of an image data storage process performed by the image data storage unit 104 (steps S110 to S120), an attention area extraction process performed by the attention area extraction unit 105 (step S210), and an attention area display process performed by the attention area display unit 106 (step S310). These processes are executed when the user presses the image storage switch 107, the attention area extraction start switch 108, or the attention area display switch 109.

Specifically, when any of the switches 107 to 109 is pressed, that switch sends a process start signal to the control unit 102. On receiving the signal, the control unit 102 determines which of the switches 107 to 109 it came from and instructs the corresponding unit to execute its process (steps S100, S200, S300). When the process in that unit has finished, the control unit 102 determines whether all processing has ended (step S400) and, if so, ends the image evaluation process. This end determination may be made, for example, when an end operation such as turning off the power or pressing an end switch (not shown) is performed and the control unit 102 receives the end signal sent by that operation. Until such an end operation is performed, the apparatus waits for an instruction; when the next instruction arrives through a switch press, the control unit 102 follows it and carries out the image data storage process, the attention area extraction process, or the attention area display process. The processing of each unit is described in detail below.

First, to save new image data in the image evaluation apparatus 100, a medium holding the image data (for example, an SD card) is attached to an external medium reader (for example, an SD card reader; not shown), and the image storage switch 107 is pressed. When the image storage switch 107 is pressed, the control unit 102 instructs the image data storage unit 104 to execute its process (step S100). The image data storage unit 104 reads the image data from the medium via an interface (for example, an SD card interface), assigns the image data a file name serving as its identification ID (for example, file1.jpg), and stores the image data together with this file name in the image data storage area 103h of the database 103 (step S110). The image data storage unit 104 then stores information about the saved image data in the image information storage area 103a of the database 103 and updates the database 103 (step S120). Here, the file name of the image data, the start address of its storage location, and the sizes of the image in the height and width directions are set in the respective storage areas (103b to 103e). As initial values, the coordinates of the origin (0, 0) are set in the upper-left coordinate storage area 103f of the attention area, and the lower-right coordinates of the image (image height, image width) are set in the lower-right coordinate storage area 103g.
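One way to picture the record written to the image information storage area 103a in step S120 is the small data structure below. It is only a sketch: the class name, field names, and the `new_record` helper are hypothetical, and the mapping of fields to storage areas is noted in comments.

```python
from dataclasses import dataclass


@dataclass
class ImageInfoRecord:
    file_name: str   # 103b: identification ID, e.g. "file1.jpg"
    address: int     # 103c: start address of the stored image data
    height: int      # 103d: image size in the height direction (pixels)
    width: int       # 103e: image size in the width direction (pixels)
    top: int         # 103f: upper-left coordinates (top, left)
    left: int
    bottom: int      # 103g: lower-right coordinates (bottom, right)
    right: int


def new_record(file_name, address, height, width):
    """Create a record whose attention area initially covers the whole image:
    upper-left at the origin (0, 0), lower-right at (height, width)."""
    return ImageInfoRecord(file_name, address, height, width,
                           top=0, left=0, bottom=height, right=width)


record = new_record("file1.jpg", 0x1000, 80, 120)
```

The attention area extraction process later overwrites `top`, `left`, `bottom`, and `right` with the extracted coordinates.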

When the attention area extraction start switch 108 is pressed, the control unit 102 instructs the attention area extraction unit 105 to execute its process (step S200). On receiving this command, the attention area extraction unit 105 executes the attention area extraction process described later on all image data stored in the database 103 and stores the result (information about the extracted attention area) in the image information storage area 103a (the upper-left coordinate storage area 103f and the lower-right coordinate storage area 103g) of the corresponding image data (step S210).

Finally, when the attention area display switch 109 is pressed, the control unit 102 instructs the attention area display unit 106 to execute its process (step S300). On receiving this command, the attention area display unit 106 reads the image data stored in the image data storage area 103h of the database 103 and reads the upper-left and lower-right coordinates of the attention area of each image from the image information storage area 103a. The data corresponding to the attention area specified by these coordinates is extracted from the image data that was read and composed into a predetermined display image, and the images are then displayed side by side on the image display unit 101 (step S310). FIG. 7 shows an example of how image data is displayed on the image display unit 101. In this example, each whole image is displayed on the image display unit 101 with its attention area enclosed in a rectangle, but the display method is not limited to this: only the attention area may be extracted and displayed, or the whole image and the attention area may be displayed side by side.

Three methods are described below as details of the attention area extraction process executed by the attention area extraction unit 105. As noted above, the attention area extraction unit 105 extracts attention areas from all image data stored in the database 103, but the following description covers the extraction process for a single piece of image data. The image stored as image data is assumed to be a digital image expressed by a luminance (Y) and two chromaticities (Cb, Cr), that is, a digital image in the YCbCr color system.

(First attention area extraction method)
As shown in FIG. 4, the attention area extraction unit 105 first performs an attention level map creation process (step S211). In this process, the image stored in the database 103 is first resized (reduced). That is, for one extraction target image read from the database 103 and designated as the processing target (hereinafter called "input image img1"), images resized to 1/2 and to 1/4 in both the width and height directions are generated. Hereinafter, the image resized to 1/2 is called "input image img2" and the image resized to 1/4 is called "input image img3".
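The reduction step can be sketched for a single plane as follows. The text does not name a resize method, so block averaging is assumed here, and plain nested lists are used so the sketch has no dependencies; `downscale` is a hypothetical helper.

```python
def downscale(plane, factor):
    """Shrink a plane by an integer factor in both directions by averaging
    each factor-by-factor block (block averaging is an assumption)."""
    h, w = len(plane) // factor, len(plane[0]) // factor
    return [
        [
            sum(plane[x * factor + i][y * factor + j]
                for i in range(factor) for j in range(factor)) / factor ** 2
            for y in range(w)
        ]
        for x in range(h)
    ]


img1 = [
    [1, 1, 3, 3],
    [1, 1, 3, 3],
    [5, 5, 7, 7],
    [5, 5, 7, 7],
]
img2 = downscale(img1, 2)   # 1/2 in width and height
img3 = downscale(img1, 4)   # 1/4 in width and height
```

For a YCbCr image the same helper would be applied to each of the three planes.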

Laplacian processing is then applied to each of the Y, Cb, and Cr planes of the input images img1 to img3. Here, the Y, Cb, and Cr planes of the input images img1 to img3 are defined as in formulas (a1) to (a3) below, where (x, y) denotes the pixel coordinates and the x and y axes are defined as described above.

Next, using formulas (b1) to (b9) below, the Laplacian images (ΔY1 to ΔY3, ΔCb1 to ΔCb3, ΔCr1 to ΔCr3) are obtained by convolving each of the Y, Cb, and Cr planes of the input images img1 to img3 defined above with the 8-neighbor Laplacian filter Δ(x, y) shown in FIG. 5(a).

In formulas (b1) to (b9), the operator "*" denotes convolution. When the filter is applied to pixels at the top, bottom, left, or right edges of the image, the image is first extended by copying the rows at the top and bottom edges and the columns at the left and right edges, and the operation is then performed.
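The filtering with edge replication can be sketched as below. The coefficients of the filter in FIG. 5(a) are not given in the text, so the common 8-neighbor kernel (1 for every neighbor, -8 at the center) is assumed; `laplacian` and `KERNEL_8` are hypothetical names. Clamping the indices has the same effect as extending the image by copying its border rows and columns.

```python
# Assumed 8-neighbor Laplacian kernel; the sign convention is a guess.
KERNEL_8 = [
    [1, 1, 1],
    [1, -8, 1],
    [1, 1, 1],
]


def laplacian(plane, kernel=KERNEL_8):
    """Convolve a plane with a 3x3 kernel, replicating the border pixels.
    The kernel is symmetric, so convolution and correlation coincide."""
    h, w = len(plane), len(plane[0])

    def at(x, y):
        # Clamp coordinates so out-of-range reads return the nearest edge pixel.
        return plane[min(max(x, 0), h - 1)][min(max(y, 0), w - 1)]

    return [
        [
            sum(kernel[i][j] * at(x + i - 1, y + j - 1)
                for i in range(3) for j in range(3))
            for y in range(w)
        ]
        for x in range(h)
    ]


flat = laplacian([[3, 3, 3, 3]] * 3)            # constant plane
spot = laplacian([[0, 0, 0], [0, 1, 0], [0, 0, 0]])  # single bright pixel
```

A constant plane gives zero everywhere, which is why the Laplacian output serves as a measure of contrast with neighboring pixels.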

Then, for the Laplacian images of the planes obtained as above, formulas (c1) to (c3) below are used to multiply the edge component of each plane (the contrast with adjacent pixels) by a predetermined coefficient and to add the values of the corresponding pixels of the three planes. This yields the Laplacian images Δimg1, Δimg2, and Δimg3 corresponding to the input images img1 to img3.

In formulas (c1) to (c3), the operator "×" means that the pixel values of each image are multiplied by a common coefficient (wY, wCb, wCr). These coefficients are scalar weighting coefficients for ΔY, ΔCb, and ΔCr respectively, and the same values are used when calculating Δimg1, Δimg2, and Δimg3. The three terms multiplied by the coefficient wM are obtained as follows: the mean of each plane (written mean(*)) is subtracted from the pixel values of that plane, the absolute value of the result (written abs(*)) is multiplied by a per-plane coefficient such as wM_Y, and the sum of the three is multiplied by the coefficient wM. These terms are added in the expectation that they will remove the influence of the background image when the attention level map is calculated in the next stage.
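Read literally, the description above combines, for each pixel, a weighted sum of the three Laplacian planes with a weighted sum of the absolute deviations of the three input planes from their means. The sketch below follows that reading; the exact form of formulas (c1) to (c3) is an assumption, in particular that the Laplacian values are used without taking absolute values, and all names are hypothetical.

```python
NAMES = ("Y", "Cb", "Cr")


def combine_planes(lap, planes, w, w_m, w_m_plane):
    """Per pixel: sum_n w[n] * lap[n] + w_m * sum_n w_m_plane[n] * |plane[n] - mean(plane[n])|.

    lap and planes map "Y"/"Cb"/"Cr" to nested lists of equal size;
    w and w_m_plane map the same names to scalar coefficients.
    """
    h, wd = len(planes["Y"]), len(planes["Y"][0])
    means = {n: sum(map(sum, planes[n])) / (h * wd) for n in NAMES}
    out = []
    for x in range(h):
        row = []
        for y in range(wd):
            edge = sum(w[n] * lap[n][x][y] for n in NAMES)
            deviation = sum(w_m_plane[n] * abs(planes[n][x][y] - means[n])
                            for n in NAMES)
            row.append(edge + w_m * deviation)
        out.append(row)
    return out


planes = {"Y": [[0, 2]], "Cb": [[1, 1]], "Cr": [[4, 0]]}
lap = {"Y": [[1, -1]], "Cb": [[0, 0]], "Cr": [[2, 2]]}
delta_img = combine_planes(
    lap, planes,
    w={"Y": 1, "Cb": 1, "Cr": 0.5},
    w_m=2,
    w_m_plane={"Y": 1, "Cb": 1, "Cr": 1},
)
```

The same coefficients would be reused for img1, img2, and img3, since the weights are stated to be common across the three scales.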

Further, using formulas (d1) to (d3) below, the Laplacian images Δimg1 to Δimg3 obtained above are multiplied by the attention area weighting map wMap, which is defined in advance and set in the image evaluation apparatus 100, to obtain an attention level map Sal1 to Sal3 for each Laplacian image. Then, using formula (d4) below, the attention level maps Sal1 to Sal3 are weighted by wd1 to wd3 and added pixel by pixel, which gives the attention level map Sal of the input image, made up of the attention level of each pixel.

In formulas (d1) to (d3) above, resize(*, *) is a function that resizes the matrix (pixels) given as its first argument by the magnification given as its second argument, restoring it to the original image size. In formulas (d2) and (d3), the Laplacian images Δimg2 and Δimg3 have 1/2 and 1/4 the dimensions of the input image img1, respectively, so they are resized to the dimensions of the original image. wMap is the attention area weighting map, a matrix with the same dimensions as Δimg1. For example, when Δimg1 (that is, the input image img1) is 80 × 120 pixels, wMap is also 80 × 120 pixels. wMap is a Gaussian mixture map expressed by formulas (d5) and (d6) below. The operator "×" applied to matrices in the defining formulas (d1) to (d3) of Sal1 to Sal3 denotes element-wise multiplication of the matrices on either side of it, so the two matrices must have the same dimensions. In formula (d4), wd1 to wd3 are scalar weighting factors for Sal1 to Sal3.
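The resize-and-combine step of formulas (d1) to (d4) can be sketched as below. The interpolation method used by resize(*, *) is not stated, so nearest-neighbor enlargement is assumed here; `upsample` and `attention_map` are hypothetical names, and the image dimensions are assumed to be multiples of 4.

```python
def upsample(img, factor):
    """Nearest-neighbor enlargement by an integer factor (stand-in for resize)."""
    return [[v for v in row for _ in range(factor)]
            for row in img for _ in range(factor)]


def attention_map(d_img1, d_img2, d_img3, w_map, wd):
    """Sal = wd1*Sal1 + wd2*Sal2 + wd3*Sal3, where Sal_k = resize(d_img_k) x wMap.

    d_img2 and d_img3 are assumed to be 1/2 and 1/4 the size of d_img1.
    """
    restored = [d_img1, upsample(d_img2, 2), upsample(d_img3, 4)]
    height, width = len(d_img1), len(d_img1[0])
    sal = [[0.0] * width for _ in range(height)]
    for k, layer in enumerate(restored):
        for i in range(height):
            for j in range(width):
                # Element-wise product with wMap, then the scalar weight wd_k.
                sal[i][j] += wd[k] * (layer[i][j] * w_map[i][j])
    return sal
```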

In formulas (d5) and (d6) above, μ_n and gw_n denote, respectively, the mean vector of each Gaussian forming wMap (indexed by n) and the weighting coefficient of the covariance matrix Σ. FIG. 6 shows an example of the attention area weighting map wMap, in which five Gaussian distributions are combined. The attention area weighting map wMap is generated, for example, by collecting and analyzing sample images.
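Formulas (d5) and (d6) are not reproduced in this text, so the following only illustrates the general idea of a Gaussian mixture weighting map. It assumes unnormalized Gaussians, a diagonal covariance matrix Σ = diag(sigma_x², sigma_y²) shared by all components, and that gw_n scales Σ for the n-th component; all of these, and the function name, are assumptions.

```python
import math


def gaussian_mixture_map(height, width, means, gws, sigma_x, sigma_y):
    """Build a weighting map as a sum of unnormalized 2-D Gaussians.

    means: list of (mu_x, mu_y) centers, one per Gaussian
    gws:   list of scale factors gw_n applied to the shared covariance
    """
    w_map = [[0.0] * width for _ in range(height)]
    for (mu_x, mu_y), gw in zip(means, gws):
        for y in range(height):
            for x in range(width):
                # Squared Mahalanobis distance under covariance gw * Sigma.
                q = ((x - mu_x) ** 2 / sigma_x ** 2
                     + (y - mu_y) ** 2 / sigma_y ** 2) / gw
                w_map[y][x] += math.exp(-0.5 * q)
    return w_map
```

A map like the five-Gaussian example of FIG. 6 would correspond to passing five centers and five scale factors.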

Once the attention level map has been created as described above, attention point (POI) extraction processing (step S212) is performed. Specifically, using formula (e1) below, points (pixels) whose attention level is equal to or greater than a predetermined threshold th are extracted from the attention level map and defined as attention points (POI).

When extraction of the attention points (POI) is complete, attention area (ROI) determination processing (step S213) is performed. The upper left coordinates (top, left) and lower right coordinates (bottom, right) of the area enclosing all the POIs found by the attention point extraction processing, that is, the smallest rectangular area containing every attention point in the image, are obtained, and this area is taken as the final attention area (ROI). Specifically, the procedure defined by formulas (f1) to (f5) below is executed in order. In these formulas, x and y are the x and y coordinates of an attention point, and the subscript n is the ID of the extracted attention point.
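The thresholding and bounding-rectangle steps can be sketched as follows. The exact form of formulas (e1) and (f1) to (f5) is not reproduced in this text; the sketch simply follows the prose, and `extract_roi` is a hypothetical name.

```python
def extract_roi(sal, th):
    """Return (top, left, bottom, right) of the smallest rectangle that
    contains every pixel whose attention level is >= th, or None if no
    pixel reaches the threshold."""
    pois = [(x, y)
            for y, row in enumerate(sal)
            for x, value in enumerate(row)
            if value >= th]
    if not pois:
        return None
    xs = [x for x, _ in pois]
    ys = [y for _, y in pois]
    return (min(ys), min(xs), max(ys), max(xs))
```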

The values of the upper left coordinates (top, left) and lower right coordinates (bottom, right) of the attention area (ROI) obtained by formulas (f2) to (f5) above are stored in the upper left coordinate storage area 103f and the lower right coordinate storage area 103g of the image information storage area 103a, and the database 103 is updated (step S214).

As described above, two resized (reduced) images are generated from one extracted image to give three images of different dimensions. Edge extraction is applied to the three planes making up each image, and the results are weighted and combined. Each of the three images of different dimensions is then weighted by the attention area weighting map wMap, and the three images are weighted and combined to generate the attention level map Sal. This improves the accuracy of attention area extraction.

(Second attention area extraction method)
Next, the second attention area extraction method will be described. In this method, the attention area is first extracted roughly by the first attention area extraction method described above (hereinafter, "rough extraction processing"), and attention area extraction is then performed again on the basis of the extracted attention area (hereinafter, "fine extraction processing"), which makes the extraction more accurate.

The second attention area extraction method is described in detail with reference to the flowchart of FIG. 8. As shown in FIG. 8, the attention area extraction unit 105 first performs rough extraction processing (steps S221 to S223). This begins with attention level map creation processing (step S221), whose details are the same as in the first attention area extraction method: image data is read from the database 103, and from the extracted image designated as the processing target (input image img1), input images img2 and img3 resized to 1/2 and 1/4 in both the width and height directions are created. Laplacian processing is then applied to the YCbCr planes of the input images img1 to img3, using formulas (a1) to (a3) given earlier.

Further, as in the first attention area extraction method, the 8-neighbor Laplacian filter Δ(x, y) shown in FIG. 5(a) is convolved with each YCbCr plane of the input images img1 to img3 defined above, using formulas (b1) to (b9) given earlier, to obtain the Laplacian images (ΔY1 to ΔY3, ΔCb1 to ΔCb3, ΔCr1 to ΔCr3).

Then, using formulas (c1) to (c3) given earlier, each pixel of the Laplacian image of each plane is multiplied by a predetermined coefficient, plane by plane, and the values are added to obtain the Laplacian images Δimg1, Δimg2, and Δimg3 corresponding to the input images img1 to img3. The first attention level map Sal of the input image is then obtained using formulas (d1) to (d6) given earlier.

Once the first attention level map Sal has been created as described above, attention point (POI) extraction processing (step S222) is performed. As in the first attention area extraction method, points (pixels) whose attention level is equal to or greater than the threshold th are extracted using formula (e1) given earlier and defined as first attention points (POI). In the subsequent attention area (ROI) determination processing (step S223), the first attention area is obtained from the extracted first attention points (POI) using formulas (f1) to (f5) given earlier. The first attention area obtained by this rough extraction processing is hereinafter called "ROI_rough".

Next, the fine extraction processing (steps S224 to S226) will be described. Fine extraction processing is performed on the basis of the first attention area ROI_rough extracted by the rough extraction processing. It begins with regeneration of the attention level map (step S224). The area corresponding to the first attention area ROI_rough is cut out of the input image img1 and used as the input image img1_fine for fine extraction processing. If the width or height of the area defined by ROI_rough is not a multiple of 4, copies of the bottom row and the rightmost column are appended to img1_fine. For the lower right corner, the value of the lower right corner of the copy source can be used. From this input image img1_fine, input images img2_fine and img3_fine resized to 1/2 and 1/4 in both the width and height directions are generated, as in the rough extraction processing. From these, the second attention level map Sal of the input image is obtained using formulas (d1) to (d4) given earlier. This series of steps is basically the same as the rough extraction processing, but the parameters in each formula take values different from those used in the rough extraction processing. In addition, the attention area weighting map wMap(x, y) in formulas (d1) to (d3) for the attention level maps Sal1 to Sal3 is set to 1 for all x and y (that is, no weighting by the attention area weighting map wMap is applied in the fine extraction processing).
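The cropping and padding of img1_fine described above can be sketched as follows. This is an illustration under assumptions: the ROI coordinates are taken to be inclusive, padding is repeated until both sides are multiples of 4, and `crop_and_pad` is a hypothetical name.

```python
def crop_and_pad(img, top, left, bottom, right, multiple=4):
    """Cut out the ROI (inclusive coordinates) and pad it so that its width
    and height are multiples of `multiple`, by copying the rightmost column
    and the bottom row."""
    crop = [list(row[left:right + 1]) for row in img[top:bottom + 1]]
    # Pad columns first, so that the appended rows below also carry the
    # copied lower right corner value.
    while len(crop[0]) % multiple:
        for row in crop:
            row.append(row[-1])
    while len(crop) % multiple:
        crop.append(list(crop[-1]))
    return crop
```

Padding to a multiple of 4 keeps the 1/2 and 1/4 resized images at integer dimensions.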

When regeneration of the second attention level map is finished, the second attention level map is binarized (step S225). That is, the second attention level map Sal obtained by the regeneration processing (hereinafter, "Sal_fine") is binarized against a predetermined threshold th_fine using formulas (g1) and (g2) below: each pixel of the attention level map is set to "1" if its value is equal to or greater than th_fine and to "0" if it is smaller. The result is a binarized attention level map (hereinafter, "Sal_bin").

In the binarization processing of the attention level map, the 4-neighbor Laplacian filter Δ(x, y) shown in FIG. 5(b) is next convolved with Sal_bin generated by formula (g1) or (g2) above to extract the edge component (roughly the contour) of the binarized attention level map Sal_bin. Pixels for which the convolution result is a significant nonzero value are then labeled by steps (1) to (3) below.
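Binarization followed by contour extraction can be sketched as below. The kernel is an assumed, commonly used 4-neighbor Laplacian (the actual coefficients are in FIG. 5(b)), border replication is assumed to be the same as for the 8-neighbor filter, and the function names are hypothetical.

```python
# Assumed 4-neighbor Laplacian kernel; the real coefficients are in FIG. 5(b).
KERNEL_4 = [[0, 1, 0],
            [1, -4, 1],
            [0, 1, 0]]


def binarize(sal, th):
    """Set each pixel to 1 if its attention level is >= th, otherwise 0."""
    return [[1 if v >= th else 0 for v in row] for row in sal]


def contour(binary, kernel=KERNEL_4):
    """Apply the 3x3 kernel with replicated borders; nonzero outputs mark
    pixels on or next to the boundary of the binarized regions."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    out[y][x] += kernel[dy + 1][dx + 1] * binary[yy][xx]
    return out
```

Interior pixels of a filled region give 0, so only the outline of each region survives into the labeling step.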

(1) Let λ denote the label value, with initial value λ = 0, and raster-scan Sal_bin in the forward direction from the upper left.
(2) When the scan finds a pixel that has no label and whose pixel value is not 0, label that pixel. The label value λ to assign depends on the label values already given to the previously scanned pixels among its 8 neighbors, as follows.
(i) If all the previously scanned 8-neighbor pixels have pixel value 0, increment the label value λ by 1 (λ = λ + 1) and assign the label value λ to the pixel.
(ii) If the previously scanned neighbors carry only one kind of label value, λ is not updated and the label value λ is assigned to the pixel.
(iii) If the previously scanned neighbors carry two kinds of label value (λ, λ′, with λ < λ′), assign the label value λ to the pixel and change every previously scanned pixel labeled λ′ to the label value λ.
(3) Repeat step (2) for all pixels of Sal_bin.
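Steps (1) to (3) can be sketched as a single raster scan with immediate relabeling on merge. This is one reading of the procedure: in case (ii) the pixel is taken to receive the label already carried by its neighbors, the previously scanned 8-neighbors are taken to be the left, upper left, upper, and upper right pixels, and `label_pixels` is a hypothetical name.

```python
def label_pixels(img):
    """Label connected nonzero pixels (8-connectivity) in one raster scan.

    Returns a label image in which 0 means "no label".
    """
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] == 0:
                continue
            # Labels of the already scanned neighbors.
            found = set()
            for dy, dx in ((0, -1), (-1, -1), (-1, 0), (-1, 1)):
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w and labels[yy][xx]:
                    found.add(labels[yy][xx])
            if not found:
                # Case (i): start a new label.
                next_label += 1
                labels[y][x] = next_label
                continue
            # Cases (ii) and (iii): keep the smallest label.
            keep = min(found)
            labels[y][x] = keep
            for old in found - {keep}:
                # Case (iii): rewrite every pixel carrying the larger label.
                for row in labels:
                    for i, v in enumerate(row):
                        if v == old:
                            row[i] = keep
    return labels
```

Relabeling on every merge is simple but quadratic in the worst case; an equivalence table resolved in a second pass is a common alternative, though it is not what the text describes.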

Next, still within the binarization processing of the attention level map, the labeling result is evaluated. After all nonzero pixels of Sal_bin have been labeled as above, the number of pixels n_λ carrying each label value is counted. n_λ is compared with a predetermined threshold th_λ, and the labels for which n_λ is greater than th_λ are extracted. The pixels (sets of pixels) carrying those labels are defined as attention area partial areas POI_fine.
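The evaluation of the labeling result can be sketched as a count per label followed by a filter. The function name and the representation of POI_fine as a list of (x, y) coordinates are assumptions made for illustration.

```python
from collections import Counter


def select_large_groups(labels, th_lambda):
    """Return the (x, y) coordinates of all pixels whose label is carried by
    more than th_lambda pixels. Label 0 means "no label" and is ignored."""
    counts = Counter(v for row in labels for v in row if v)
    kept = {label for label, n in counts.items() if n > th_lambda}
    return [(x, y)
            for y, row in enumerate(labels)
            for x, v in enumerate(row)
            if v in kept]
```

Small groups are discarded as noise, so the final bounding rectangle is driven only by sufficiently large contours.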

When the binarization processing of the attention level map is finished, attention area (ROI) determination processing (step S226) is performed. In this processing, the upper left coordinates (top, left) and lower right coordinates (bottom, right) of the smallest rectangular area enclosing all the attention area partial areas POI_fine obtained above are found, and this area becomes the final attention area (ROI) of the second attention area extraction method. Specifically, the procedure defined by formulas (f1) to (f5) given earlier is executed in order. The resulting values of the upper left coordinates (top, left) and lower right coordinates (bottom, right) are stored in the upper left coordinate storage area 103f and the lower right coordinate storage area 103g of the image information storage area 103a, and the database 103 is updated (step S227). In this way, by performing attention area extraction in two stages, rough extraction processing and fine extraction processing, the second attention area extraction method can extract the attention area with higher accuracy.

(Third attention area extraction method)
Next, the third attention area extraction method will be described. The first attention area extraction method evaluates all three YCbCr planes when creating the attention level map. The third attention area extraction method instead processes only one of the YCbCr planes, creates the attention level map from it, and then extracts the attention area. This speeds up the attention area extraction processing.

The attention area extraction processing in the attention area extraction unit 105 is described below with reference to the flowchart of FIG. 9. As shown in FIG. 9, the attention area extraction unit 105 first calculates the coefficients of variation and selects the processing target plane (step S231). In this processing, image data is read from the database 103, and the coefficient of variation of each plane is calculated for the extracted image designated as the processing target (input image img1). With the coefficient of variation of the Y plane defined as V_Y, that of the Cb plane as V_Cb, and that of the Cr plane as V_Cr, these coefficients are obtained using formulas (h1) to (h3) below.

In formulas (h1) to (h3) above, sd(*) denotes the operation that obtains the standard deviation of the edge components of the pixels in the plane, and mean(*) denotes the operation that obtains their mean.

Once the coefficients of variation have been obtained, the processing target plane SP is selected using formula (i1) below.

In formula (i1) above, max(*) is a function that returns the largest of its arguments, and arg is an operator that determines which plane, V_Y, V_Cb, or V_Cr, produced the value returned by max(*). In other words, the plane with the largest coefficient of variation is selected as the processing target plane. For example, if the coefficient of variation V_Y of the Y plane is the largest of V_Y, V_Cb, and V_Cr, the processing target plane is the Y plane. This processing target plane is hereinafter called "SP"; for example, the processing target plane selected for the input image img1 is called "SP1".
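Plane selection by coefficient of variation (standard deviation divided by mean) can be sketched as follows. Whether sd(*) is the population or the sample standard deviation is not stated, so the population form is assumed; the function names and the dictionary interface are hypothetical, and a plane whose mean is 0 would need separate handling.

```python
import statistics


def coefficient_of_variation(plane):
    """Standard deviation divided by mean over all values of a 2-D plane."""
    values = [v for row in plane for v in row]
    return statistics.pstdev(values) / statistics.mean(values)


def select_plane(planes):
    """Return the name of the plane with the largest coefficient of variation.

    planes: dict mapping a plane name ("Y", "Cb", "Cr") to its 2-D values.
    """
    return max(planes, key=lambda name: coefficient_of_variation(planes[name]))
```

For example, `select_plane({"Y": y, "Cb": cb, "Cr": cr})` plays the role of the arg max in formula (i1).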

When selection of the processing target plane SP is finished, attention level map creation processing is performed on this plane (step S232). First the input image is resized: from the input image img1 designated as the processing target, input images img2 and img3 resized to 1/2 and 1/4 in both the width and height directions are generated. The processing target planes of the input images img2 and img3 are called "SP2" and "SP3", respectively.

Next, using formulas (b1′) to (b3′) below, the 8-neighbor Laplacian filter Δ(x, y) shown in FIG. 5(a) is convolved with the processing target planes SP1 to SP3 of the input images img1 to img3 to obtain the Laplacian images (ΔSP1 to ΔSP3).

In formulas (b1′) to (b3′) above, the operator "*" denotes convolution. When the filter is applied to pixels on the top, bottom, left, or right edge of the image, the image is first extended by copying the outermost rows and columns, and the operation is then carried out on the extended image.

Then, using formulas (d1′) to (d3′) below, the Laplacian images Δimg1 to Δimg3 obtained above are multiplied by the preset attention area weighting map wMap to obtain an attention level map Sal1 to Sal3 for each Laplacian image. After that, using formula (d4) below, Sal1 to Sal3 are weighted by wd1 to wd3 and added pixel by pixel to obtain the attention level map Sal of the whole input image.

In formulas (d1′) to (d3′) above, resize(*, *) is a function that resizes the matrix (pixels) given as its first argument by the magnification given as its second argument. Whereas the first attention area extraction method resizes to the dimensions of the input image img1, which is the extracted image, the third attention area extraction method takes the input image with the smallest dimensions, img3, as the reference image and resizes the remaining input images img1 and img2 to the dimensions of this reference image. The attention area weighting map wMap is therefore a matrix with the same dimensions as the reference image (Δimg3). For example, when Δimg3 (that is, the input image img3) is 20 × 30 pixels, wMap is also 20 × 30 pixels. The meaning of the operators in each formula is as explained for formulas (d1) to (d4) given earlier. wMap is the Gaussian mixture map expressed by formulas (d5) and (d6) given earlier (see FIG. 6).

Once the attention level map has been created as described above, attention point (POI) extraction processing (step S233) is performed. Using formula (e1′) below, points whose attention level is equal to or greater than a predetermined threshold th_SP are extracted from the generated attention level map Sal and defined as attention points (POI).

The threshold th_SP takes a different value depending on whether the selected processing target plane is Y, Cb, or Cr. It may be a fixed value determined in advance for each plane, or it may be obtained dynamically using formulas (j1) and (j2) below. In formula (j1), V_SP is the coefficient of variation corresponding to the selected plane SP, and param_SP, expressed by formula (j2), is a predetermined parameter that takes a fixed value according to SP.

When the attention point (POI) extraction processing is complete, the subsequent attention area (ROI) determination processing (step S234) obtains the upper left coordinates (top, left) and lower right coordinates (bottom, right) of the attention area ROI from the extracted attention points (POI), using formulas (f1) to (f5) given earlier. As noted above, the maps have been resized to the dimensions of the reference image (the image with the smallest dimensions, which in the case above is the input image img3). If the dimensions of the reference image differ from those of the extracted image (input image img1), the coordinates of the attention area ROI must therefore be converted (restored) to the dimensions of the extracted image.
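The text does not give the conversion formula, so the following is only one plausible sketch: with img3 at 1/4 the size of img1, each reference pixel is assumed to cover a 4 × 4 block of the extracted image, and the converted rectangle is taken to cover all of those blocks. The function name and the inclusive coordinate convention are assumptions.

```python
def roi_to_original(top, left, bottom, right, scale=4):
    """Map inclusive ROI coordinates from the reference image back to the
    extracted image, assuming each reference pixel covers a scale x scale
    block of original pixels."""
    return (top * scale,
            left * scale,
            (bottom + 1) * scale - 1,
            (right + 1) * scale - 1)
```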

The values of the upper left coordinates (top, left) and lower right coordinates (bottom, right) obtained in this way are stored in the upper left coordinate storage area 103f and the lower right coordinate storage area 103g of the image information storage area 103a, and the database 103 is updated (step S235). By processing only one of the YCbCr planes, the third attention area extraction method can extract the attention area at higher speed.

The image evaluation apparatus 100 has a central processing unit (CPU), memory, and the like, and the image evaluation processing described above, together with the attention area extraction method executed within it, can be implemented as a program executed by the CPU. The program is stored on a storage medium or storage device such as a flexible disk, CD-ROM, magneto-optical disk, semiconductor memory, or hard disk. It may also be distributed as a digital signal over a network or the like. Intermediate processing results are held temporarily in a storage device such as main memory. Alternatively, the processing described above can be configured as a logic circuit in an ASIC, DSP, or the like, without providing a dedicated CPU or memory. The image evaluation processing and the attention area extraction method executed within it can be applied to determining the automatic zoom center in a slide show, to automatic trimming in image editing software, and to determining the autofocus (AF) area or automatic exposure (AE) area of a camera, but their application is not limited to these cases.

DESCRIPTION OF SYMBOLS
100 Image evaluation apparatus
103 Database
105 Attention area extraction unit

Claims

An attention area extraction method for extracting an attention area from an extracted image, which is a digital image in which each pixel is expressed by a luminance and two chromaticities, the method comprising:
a step of extracting, for each pixel, an edge component from each of three planes corresponding to the luminance and the two chromaticities of the extracted image, and calculating an edge amount for each pixel by weighting and adding the edge components of the three planes;
a step of calculating an attention level for each pixel by multiplying each element of an attention area weighting map, which has as many elements as the extracted image has pixels and in which a weight value is set in advance for each element, by the edge amount of the corresponding pixel of the extracted image; and
a step of extracting, from the extracted image, an area containing all pixels whose attention level is greater than a predetermined threshold as the attention area.

The attention area extraction method according to claim 1, further comprising a step of generating at least one resized image by resizing the extracted image to a different dimension, wherein
the step of calculating the edge amount calculates the edge amount for each of the extracted image and the resized image, and
the step of calculating the attention level further comprises:
a step of restoring the edge amount calculated from the resized image to the same dimension as the number of pixels of the extracted image; and
a step of multiplying each of the edge amount calculated from the extracted image and the edge amount calculated from the resized image and then restored by each element of the attention area weighting map, further multiplying each edge amount thus weighted by the attention area weighting map by a predetermined weight, and adding the results pixel by pixel to calculate the attention level for each pixel.

The attention area extraction method according to claim 2, further comprising:
Cutting out, from the extracted image, the region extracted as the attention area in the extracting step, to obtain a cut-out image;
Resizing the cut-out image to at least one different dimension;
Calculating the edge amount for each of the cut-out image and the image obtained by resizing the cut-out image;
Restoring the edge amount calculated from the resized cut-out image to the dimensions of the cut-out image;
Multiplying each of the edge amount calculated from the cut-out image and the edge amount calculated from the resized cut-out image and then restored by a preset weight, and adding them for each pixel, to calculate the degree of attention of the cut-out image;
Binarizing the degree of attention of the cut-out image, setting it to 1 where the degree of attention is greater than or equal to a predetermined threshold and to 0 where it is smaller than the predetermined threshold; and
Grouping adjacent pixels set to 1 in the binarized degree of attention, determining each group containing at least a predetermined number of pixels to be an attention area partial region, and extracting a region including all of the attention area partial regions as a new attention area.
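The binarizing and grouping steps at the end of this claim can be sketched in Python as below. The claim does not say which pixels count as adjacent or what shape the new region takes, so 4-connectivity, the bounding-rectangle result, and the function names are assumptions made for illustration.

```python
# Illustrative sketch only. 4-connectivity and the bounding-rectangle result
# are assumptions; the claim only speaks of "adjacent pixels" and "a region".
from collections import deque


def binarize(att, threshold):
    """1 where the degree of attention is >= threshold, otherwise 0."""
    return [[1 if v >= threshold else 0 for v in row] for row in att]


def group_pixels(mask):
    """Collect 4-connected groups of pixels set to 1 (breadth-first search)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    groups = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            group = []
            while queue:
                cy, cx = queue.popleft()
                group.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            groups.append(group)
    return groups


def new_attention_region(att, threshold, min_pixels):
    """Bounding box (top, left, bottom, right) covering every group that has
    at least min_pixels pixels, or None if no group is large enough."""
    groups = group_pixels(binarize(att, threshold))
    pixels = [p for g in groups if len(g) >= min_pixels for p in g]
    if not pixels:
        return None
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return (min(ys), min(xs), max(ys), max(xs))
```

Discarding groups below the minimum pixel count keeps isolated high-attention pixels, which are often noise, from enlarging the new attention area.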

An attention area extraction method for extracting an attention area of an extracted image that is a digital image in which each pixel is expressed by a luminance and two chromaticities, the method comprising:
Obtaining, for each of the three planes corresponding to the luminance and the two chromaticities of the extracted image, the standard deviation and the average value of the pixel values, calculating a coefficient of variation by dividing the standard deviation by the average value, and selecting the plane having the largest coefficient of variation as a processing target plane;
Generating at least one resized image by resizing the processing target plane of the extracted image to a different dimension;
Extracting an edge component for each pixel and calculating an edge amount for each pixel in the processing target plane of each of the extracted image and the resized image;
Setting, of the extracted image and the resized image, the image having the smallest dimensions as a reference image, and restoring the edge amounts calculated from the remaining images to the same dimensions as the number of pixels of the reference image;
Multiplying each pixel of the edge amount of the reference image and of the restored edge amounts by the corresponding element of an attention area weighting map, which has as many elements as the reference image has pixels and in which a weight value is set in advance for each element, multiplying each of the resulting edge amounts by a preset weight, and adding them for each pixel to calculate a degree of attention for each pixel; and
Extracting, from the extracted image, a region including all pixels whose degree of attention is greater than a predetermined threshold as the attention area.
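The plane selection step that distinguishes this claim can be sketched as follows. The use of the population standard deviation, the treatment of a zero mean, and the function names are assumptions. The sketch covers only the choice of the processing target plane, not the later edge and weighting steps.

```python
# Illustrative sketch only. The population standard deviation and the
# handling of a zero mean are assumptions not stated in the claim.
import math


def coefficient_of_variation(plane):
    """Standard deviation of the pixel values divided by their average."""
    values = [v for row in plane for v in row]
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0  # assumption: treat a zero-mean plane as having no variation
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance) / mean


def select_plane(planes):
    """Return the index of the plane with the largest coefficient of
    variation, together with the coefficients of all planes."""
    cvs = [coefficient_of_variation(p) for p in planes]
    best = max(range(len(planes)), key=lambda i: cvs[i])
    return best, cvs
```

Processing only the selected plane means the edge extraction runs once per scale instead of three times, which fits the processing speed concern raised in the description.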

A program for causing a computer to execute the attention area extraction method according to any one of claims 1 to 4.

An image evaluation apparatus comprising: a storage unit that stores a digital image in which each pixel is expressed by a luminance and two chromaticities; and
An attention area extraction unit that reads the digital image from the storage unit as an extracted image and extracts the attention area of the extracted image by the attention area extraction method according to claim 1.