JP2013073309A

JP2013073309A - Method for extracting region of interest from image, and electronic apparatus, system and program for achieving the same method

Info

Publication number: JP2013073309A
Application number: JP2011210296A
Authority: JP
Inventors: Noriko Tamura; 紀子田村
Original assignee: NEC Casio Mobile Communications Ltd
Current assignee: NEC Casio Mobile Communications Ltd
Priority date: 2011-09-27
Filing date: 2011-09-27
Publication date: 2013-04-22

Abstract

PROBLEM TO BE SOLVED: To provide an electronic apparatus and the like appropriately extracting a subject that a user desires to acquire without deviating from a region of interest.
SOLUTION: The electronic apparatus according to the present invention includes: acquiring means for acquiring a monitor image; analysis means for analyzing a subject included in the monitor image; learning means for generating and retaining subject information on the subject as dictionary data; processing object image acquiring means for acquiring a processing object image; and processing object image analysis means for analyzing the subject included in the processing object image. The electronic apparatus further includes: region-of-interest extraction means for extracting a region of interest from the processing object image on the basis of a comparative determination obtained by comparing the subject information obtained by the processing object image analysis means with the dictionary data; and application operation control means for controlling the operation of an application on the basis of the region of interest extracted by the region-of-interest extraction means.

Description

The present invention relates to a method for extracting a region of interest from an image. More specifically, it relates to a method that learns images (photographs) the user has viewed on the screen of an electronic device such as a mobile terminal and, based on what was learned, extracts from another image a region of interest that is close to the user's interests.
The present invention also relates to an electronic device, a system, and a program that implement the method.

In general, there are techniques that automatically extract a region of interest, such as the center of the image or a person, from a camera's live view image or a captured image on a mobile terminal, and then use the extracted region of interest to produce a highlight display, apply various effects during a slide show, generate an album, or perform autofocus (AF) or automatic exposure correction (AE).
As one technique for extracting a region of interest, Patent Document 1 discloses enlarging a captured image around the center position of the image or around a predetermined position where a person or the like is located.

Patent Document 1: JP 2000-75416 A

However, trimming an image around its center or around a person, as described in Patent Document 1, and then running an application such as a highlight display or a slide show raises the following problems.
The first problem is that, when the center of the image is taken as the region of interest, a person or part of a background structure may fall outside the region of interest.
Specifically, when the subject that the photographer or the user of the electronic device intends to acquire or display is a person together with a structure in the background, enlarging and trimming the image with reference to its center cuts off part of the person or part of the structure, producing an undesirable image.
The second problem is that, when a person is taken as the region of interest, the structure in the background of the person falls outside the region of interest.
Specifically, as in the first case, when the intended subject is a person together with a structure in the background, extracting and trimming a fixed region with reference to the person cuts off the background structure.

Accordingly, an object of the present invention is to provide a method that appropriately extracts the subject a user intends to acquire or display without that subject falling outside the region of interest, as well as an electronic device, an image region-of-interest extraction support system, and a program that implement the method.

A method for extracting a region of interest from an image according to the present invention comprises:
acquiring a monitor image;
analyzing a subject included in the monitor image;
generating and holding, as dictionary data, subject information on the subject included in the monitor image;
acquiring a processing target image;
analyzing a subject included in the processing target image;
determining, by comparing subject information on the subject included in the processing target image with the dictionary data, whether the processing target image includes one or more subjects that are the same as or similar to a subject included in the monitor image;
extracting, when it is determined that one or more subjects that are the same or similar are included, a region of interest from the processing target image so as to include the one or more subjects; and
controlling the operation of an application on the basis of the region of interest.
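The steps of the method can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described in the source: the `Subject` structure, the `learn` and `extract_roi` helpers, and the use of equal labels as the "same or similar" test are all hypothetical, and image analysis itself is replaced by hand-written subject lists.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass(frozen=True)
class Subject:
    """Subject information produced by analysis (hypothetical structure)."""
    label: str  # e.g. "person", "building"
    box: Box    # where the subject sits in the image


def learn(dictionary: Dict[str, Subject], monitor_subjects: List[Subject]) -> None:
    """Hold subject information from a monitor image as dictionary data."""
    for s in monitor_subjects:
        dictionary[s.label] = s


def extract_roi(dictionary: Dict[str, Subject],
                target_subjects: List[Subject]) -> Optional[Box]:
    """Return the smallest box containing every target subject that matches
    the dictionary, or None when nothing matches."""
    # Assumption: equal labels stand in for the "same or similar" comparison.
    matched = [s.box for s in target_subjects if s.label in dictionary]
    if not matched:
        return None
    return (min(b[0] for b in matched), min(b[1] for b in matched),
            max(b[2] for b in matched), max(b[3] for b in matched))


# Learn from a monitor image that showed a person in front of a building.
dictionary: Dict[str, Subject] = {}
learn(dictionary, [Subject("person", (10, 10, 50, 90)),
                   Subject("building", (40, 0, 120, 100))])

# Subjects found in the processing target image; "car" was never learned.
target = [Subject("person", (200, 300, 260, 420)),
          Subject("building", (240, 100, 500, 400)),
          Subject("car", (0, 380, 80, 430))]
roi = extract_roi(dictionary, target)
print(roi)  # (200, 100, 500, 420)
```

The resulting box covers both the person and the building, so neither is cut off, while the unlearned car does not enlarge the region. An application would then use `roi` for trimming, display, or AF/AE control.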

An electronic device according to the present invention comprises:
monitor image acquisition means for acquiring a monitor image;
monitor image analysis means for analyzing a subject included in the monitor image acquired by the monitor image acquisition means;
monitor image learning means for generating and holding, as dictionary data, subject information on the subject obtained by the monitor image analysis means;
processing target image acquisition means for acquiring a processing target image;
processing target image analysis means for analyzing a subject included in the processing target image acquired by the processing target image acquisition means;
comparison determination means for determining, by comparing the subject information on the subject obtained by the processing target image analysis means with the dictionary data held in the monitor image learning means, whether the processing target image includes one or more subjects that are the same as or similar to a subject held in the monitor image learning means;
region-of-interest extraction means for extracting, when the comparison determination means determines that one or more subjects that are the same or similar are included, a region of interest from the processing target image so as to include the one or more subjects; and
application operation control means for controlling the operation of an application on the basis of the region of interest extracted by the region-of-interest extraction means.

An image region-of-interest extraction system according to the present invention comprises:
a server comprising:
monitor image acquisition means for acquiring a monitor image;
monitor image analysis means for analyzing a subject included in the monitor image acquired by the monitor image acquisition means; and
monitor image learning means for generating and holding, as dictionary data, subject information on the subject obtained by the monitor image analysis means; and
an electronic device comprising:
dictionary data acquisition means for acquiring the dictionary data from the server;
processing target image acquisition means for acquiring a processing target image;
processing target image analysis means for analyzing a subject included in the processing target image acquired by the processing target image acquisition means;
comparison determination means for determining, by comparing the subject information on the subject obtained by the processing target image analysis means with the dictionary data acquired by the dictionary data acquisition means, whether the processing target image includes one or more subjects that are the same as or similar to a subject of the dictionary data acquired by the dictionary data acquisition means;
region-of-interest extraction means for extracting, when the comparison determination means determines that one or more subjects that are the same or similar are included, a region of interest from the processing target image so as to include the one or more subjects; and
application operation control means for controlling the operation of an application on the basis of the region of interest extracted by the region-of-interest extraction means.

Another image region-of-interest extraction system according to the present invention comprises:
a server comprising:
monitor image acquisition means for acquiring a monitor image;
monitor image analysis means for analyzing a subject included in the monitor image acquired by the monitor image acquisition means; and
monitor image learning means for generating and holding, as dictionary data, subject information on the subject obtained by the monitor image analysis means; and
an electronic device comprising:
processing target image acquisition means for acquiring a processing target image; and
processing target image transmission means for transmitting the processing target image acquired by the processing target image acquisition means to the server,
wherein the server further comprises:
processing target image reception means for receiving the processing target image transmitted from the processing target image transmission means;
processing target image analysis means for analyzing a subject included in the processing target image received by the processing target image reception means;
comparison determination means for determining, by comparing the subject information on the subject obtained by the processing target image analysis means with the dictionary data held in the monitor image learning means, whether the processing target image includes one or more subjects that are the same as or similar to a subject of the dictionary data acquired by the dictionary data acquisition means;
region-of-interest extraction means for extracting, when the comparison determination means determines that one or more subjects that are the same or similar are included, a region of interest from the processing target image so as to include the one or more subjects; and
region-of-interest extracted image transmission means for transmitting the image from which the region of interest has been extracted by the region-of-interest extraction means to the electronic device, and
wherein the electronic device further comprises:
region-of-interest extracted image reception means for receiving the region-of-interest extracted image transmitted from the region-of-interest extracted image transmission means; and
application operation control means for controlling the operation of an application on the basis of the region-of-interest extracted image received by the region-of-interest extracted image reception means.
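The difference between the two system configurations is where the comparison and extraction run. The following in-process Python sketch contrasts them. It is an illustration only: the `Server` class, its method names, and the label-based matching are assumptions, network transfer is replaced by direct method calls, and the second configuration is simplified to return a region box where the source describes the server sending back an image from which the region of interest has been extracted.

```python
from typing import List, Optional, Set, Tuple

Box = Tuple[int, int, int, int]    # (left, top, right, bottom)
Detected = List[Tuple[str, Box]]   # (label, box) pairs from image analysis


def union_of_matches(known: Set[str], detected: Detected) -> Optional[Box]:
    """Smallest box covering every detected subject whose label is known."""
    boxes = [box for label, box in detected if label in known]
    if not boxes:
        return None
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


class Server:
    """Hypothetical server that learns dictionary data from monitor images."""

    def __init__(self) -> None:
        self.dictionary: Set[str] = set()

    def learn(self, monitor_subjects: Detected) -> None:
        self.dictionary.update(label for label, _ in monitor_subjects)

    def get_dictionary(self) -> Set[str]:
        # First configuration: the device fetches the dictionary data.
        return set(self.dictionary)

    def extract(self, detected: Detected) -> Optional[Box]:
        # Second configuration: the server compares and extracts.
        return union_of_matches(self.dictionary, detected)


server = Server()
server.learn([("person", (0, 0, 10, 10)), ("tower", (5, 0, 20, 30))])
target: Detected = [("person", (50, 60, 90, 140)),
                    ("tower", (80, 10, 160, 150))]

# First configuration: the device downloads the dictionary and matches locally.
roi_local = union_of_matches(server.get_dictionary(), target)
# Second configuration: the device uploads the target and the server answers.
roi_remote = server.extract(target)
print(roi_local, roi_remote)  # (50, 10, 160, 150) (50, 10, 160, 150)
```

Both paths produce the same region; the choice mainly trades the device's processing load against the amount of image data sent to the server.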

An image region-of-interest extraction program according to the present invention causes a computer to execute:
a function of acquiring a monitor image;
a function of analyzing a subject included in the monitor image;
a function of generating and holding, as dictionary data, subject information on the subject included in the monitor image;
a function of acquiring a processing target image;
a function of analyzing a subject included in the processing target image;
a function of determining, by comparing subject information on the subject included in the processing target image with the dictionary data, whether the processing target image includes one or more subjects that are the same as or similar to a subject included in the monitor image;
a function of extracting, when it is determined that one or more subjects that are the same or similar are included, a region of interest from the processing target image so as to include the one or more subjects in the processing target image that were determined to be the same or similar; and
a function of controlling the operation of an application on the basis of the region of interest.

According to the present invention, the subject that the user intends to acquire can be appropriately extracted without falling outside the region of interest.

FIG. 1 is a system configuration diagram including the mobile terminal 1 according to the first embodiment.
FIG. 2 is a block diagram of the mobile terminal 1 according to the first embodiment.
FIG. 3 is a flowchart showing the operation of the mobile terminal 1 according to the first embodiment (dictionary data generation processing).
FIG. 4 is a flowchart showing the operation of the mobile terminal 1 according to the first embodiment (region-of-interest extraction processing).
FIG. 5 is an explanatory diagram of the region-of-interest extraction processing of the mobile terminal 1 according to the first embodiment.
FIG. 6 is an explanatory diagram of the region-of-interest extraction processing of the mobile terminal 1 according to the first embodiment.
FIG. 7 is an explanatory diagram of the region-of-interest extraction processing of the mobile terminal 1 according to the first embodiment.
FIG. 8 is a configuration diagram of the image region-of-interest extraction support system according to the second embodiment.
FIG. 9 is a block diagram of the mobile terminal 100 according to the second embodiment.
FIG. 10 is an explanatory diagram of a flowchart showing the operation of the mobile terminal 100 according to the second embodiment (region-of-interest extraction processing).
FIG. 11 is a configuration diagram of Appendix 1.

(First embodiment)
An embodiment of the electronic device of the present invention is described below with reference to the drawings. In this embodiment, the present invention is applied to a mobile terminal 1.
The mobile terminal 1 is typically a mobile phone, a digital camera, a smartphone, or the like, but the invention is also applicable to game machines, personal computers (including tablet PCs and notebook PCs), digital photo frames, and so on. In short, the present invention is widely applicable to electronic devices capable of performing given processing on the basis of dictionary data.
(1) System configuration

FIG. 1 shows the configuration of a communication system according to the embodiment of the present invention.
In FIG. 1, the mobile terminal 1 has a mobile radio communication function and a mail transmission/reception function. To make a call using the mobile radio communication function, the mobile terminal 1 places a call to the originating-side exchange SW via the base station BS with which its location is registered. In response, the originating-side exchange SW calls the terminating-side exchange SW on the basis of the called number (subscriber number) and location registration information obtained by querying a subscriber registration server (not shown) provided in the radio communication network RN. The terminating-side exchange SW calls the terminating-side mobile terminal (not shown) via the terminating-side base station BS; when the called party answers, a link between the originating-side exchange SW and the terminating-side exchange SW is established and the call can proceed.

Mail sent from the mobile terminal 1 using the mail transmission/reception function is transferred to the mail server MS, which is connected to the Internet IN, via the base station BS, the exchange SW, and a gateway server (not shown) provided in the radio communication network RN. The mail server MS then sends the mail, over the reverse of the route described above, to the mobile terminal 1 that has the destination mail address.

In addition, the mobile terminal 1 can access many other servers, including a homepage server HS and an electronic book server EBS, and can obtain from them files containing image data such as landscape images, portrait images, and animal images.
(2) Configuration of the mobile terminal

FIG. 2 is a block diagram of the mobile terminal 1 according to the embodiment of the present invention.
As shown in FIG. 2, the mobile terminal 1 includes a control unit 2, a wireless communication unit (transmission/reception unit) 3, an antenna 4, an audio signal processing unit 5, a microphone 6, a speaker 7, a display unit 8, an operation unit 9, a ROM (Read Only Memory) 10, a RAM (Random Access Memory) 11, an imaging unit 12, and a recording medium 13.

The control unit 2 includes a CPU (Central Processing Unit) and controls the operation of each part of the mobile terminal 1 according to a program stored in the ROM 10. The control unit 2 includes a monitor image analysis unit 2A, a monitor image learning unit 2B, a region-of-interest extraction unit 2C, and an application operation control unit 2D.
The monitor image analysis unit 2A acquires image data for monitoring from web pages, mail, electronic books, and the like that the user has viewed, recognizes the image (photograph) portions, and analyzes the features of those images. Besides images obtained from these sources, the monitor images to be analyzed may be images obtained via the imaging unit 12 or the recording medium 13.
Here, a "monitor image" or "image for monitoring" means a reference image that is held as dictionary data, described later, and is referred to when a region of interest is extracted.
The monitor image analysis unit 2A analyzes the acquired image for the following kinds of subject information and outputs the results.

(a) Features of the subject (person, pet, small article, building, mountain, sea, etc.) and its region
For example, the monitor image acquired by the user is analyzed and, if the subjects include a person, a pet, a small article, a building, a mountain, the sea, or the like, each is analyzed as far as possible on the basis of parameters such as contour, color tone, size, age, and type, and its feature points are obtained and converted into data.
As for the region, parameters relating to the region, such as the proportion of the whole image that the subject occupies and its position, are analyzed, and the feature points are obtained and converted into data.
(b) Composition
For example, information on the composition is obtained by analyzing positional information such as where a subject such as a building or a mountain is located within the whole image. Alternatively, straight lines can be detected in the image by the Hough transform and used as composition lines to analyze the composition.
(c) Aspect ratio
The aspect ratio here means the ratio of the long side to the short side of a two-dimensional shape. That is, it refers to the aspect ratio of a subject included in the image, not the aspect ratio of the image itself (the screen itself).
The monitor image analysis unit 2A passes the above analysis results to the monitor image learning unit 2B.
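As a rough illustration of the region parameters in (a) and the aspect ratio in (c), the following Python sketch derives them from a subject's bounding box. The `SubjectInfo` structure and the `describe_subject` function are hypothetical names introduced for this example, and the bounding box is assumed to be given; the source does not specify how subjects are detected or how the data is laid out. Feature analysis and Hough-based composition lines are not covered here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SubjectInfo:
    area_ratio: float            # share of the whole image covered by the subject box
    center: Tuple[float, float]  # box center, normalized to 0..1 in x and y
    aspect_ratio: float          # long side / short side of the subject box


def describe_subject(box: Tuple[int, int, int, int],
                     image_size: Tuple[int, int]) -> SubjectInfo:
    """Compute region and aspect-ratio parameters for one subject.

    box is (left, top, right, bottom) in pixels; image_size is (width, height).
    """
    left, top, right, bottom = box
    img_w, img_h = image_size
    w, h = right - left, bottom - top
    if w <= 0 or h <= 0 or img_w <= 0 or img_h <= 0:
        raise ValueError("box and image must have positive width and height")
    return SubjectInfo(
        area_ratio=(w * h) / (img_w * img_h),
        center=((left + right) / 2 / img_w, (top + bottom) / 2 / img_h),
        # Aspect ratio of the subject itself, not of the image or screen.
        aspect_ratio=max(w, h) / min(w, h),
    )


# A 200 x 400 pixel subject in an 800 x 600 image.
info = describe_subject((100, 50, 300, 450), (800, 600))
print(info.aspect_ratio)  # 2.0
```

Because the aspect ratio is taken as long side over short side, it is always at least 1 and does not depend on whether the subject is upright or lying on its side.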

On the basis of the analysis results from the monitor image analysis unit 2A, the monitor image learning unit 2B generates the image features as dictionary data to be referred to by the region-of-interest extraction unit 2C, and holds the dictionary data in the dictionary data storage unit 11A.

The region-of-interest extraction unit 2C analyzes an image captured by the imaging unit 12 of the mobile terminal 1 and shown in live view on the display unit 8, or a captured image recorded in the RAM 11 or on the recording medium 13, and extracts a region of interest. Specifically, the region-of-interest extraction unit 2C compares the subject information on the subjects included in the live view image or the like with the dictionary data held by the monitor image learning unit 2B, on the basis of at least one of the following parameters: subject features such as edges, region, composition, and aspect ratio. From this comparison it determines whether the live view image or the like includes a subject that is the same as or similar to a subject held by the monitor image learning unit 2B. If so, it extracts and outputs a region of interest that contains the matching subject in a form as close as possible to the way the corresponding subject appears in the dictionary data.
When multiple subjects that are the same as or similar to subjects in the dictionary data are detected in the live view image or the like, the region-of-interest extraction unit 2C extracts and outputs a region of interest that includes all of them.
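One possible reading of extracting the subject in a form close to the dictionary data is to place the region so that the subject occupies the same relative size and position it had in the monitor image. The Python sketch below shows that reading only; the function name `roi_like_dictionary` and its parameters are assumptions made for this example, and the source does not prescribe this calculation.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def roi_like_dictionary(subject: Box, image_size: Tuple[float, float],
                        ref_rel_size: Tuple[float, float],
                        ref_center: Tuple[float, float]) -> Box:
    """Place a region around `subject` so that the subject takes up the same
    relative width, height, and position it had in the monitor image.

    ref_rel_size: subject width and height as fractions of the monitor image.
    ref_center:   subject center in the monitor image, normalized to 0..1.
    """
    left, top, right, bottom = subject
    img_w, img_h = image_size
    roi_w = (right - left) / ref_rel_size[0]
    roi_h = (bottom - top) / ref_rel_size[1]
    cx, cy = (left + right) / 2, (top + bottom) / 2
    roi_left = cx - ref_center[0] * roi_w
    roi_top = cy - ref_center[1] * roi_h
    # Clip to the image; near the border the match becomes approximate.
    return (max(0.0, roi_left), max(0.0, roi_top),
            min(img_w, roi_left + roi_w), min(img_h, roi_top + roi_h))


# In the monitor image the subject was centered and filled a quarter of the
# width and half of the height.
roi = roi_like_dictionary(subject=(400, 300, 500, 500), image_size=(1000, 800),
                          ref_rel_size=(0.25, 0.5), ref_center=(0.5, 0.5))
print(roi)  # (250.0, 200.0, 650.0, 600.0)
```

When several matching subjects are found, the regions computed this way could be merged into one enclosing box so that all of them remain inside the region of interest.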

The application operation control unit 2D controls the operation of applications, according to the functions of the mobile terminal 1, on the basis of the output of the region-of-interest extraction unit 2C. For example, it controls the display on the display unit 8, applying various effects for a highlight display or a slide show. Alternatively, it generates an album and records it in the RAM 11 or on the recording medium 13. Alternatively, during camera shooting, it performs imaging control such as autofocus (AF) and automatic exposure correction (AE) while controlling the display so that the extracted region is shown on the display unit 8.

When the mobile terminal 1 is used for a voice call, the wireless communication unit 3 demodulates the received signal (radio signal) received via the antenna 4 and converts it into received-speech data.
The audio signal processing unit 5 D/A-converts the received-speech data and supplies the resulting analog signal to the speaker 7, which outputs the received speech.
The microphone 6 converts the input speech into an analog transmit signal and supplies it to the audio signal processing unit 5, which A/D-converts it to obtain transmit-speech data.
The wireless communication unit 3 modulates the transmit-speech data into a transmission signal, which is transmitted via the antenna 4.

When the mobile terminal 1 receives an image, the wireless communication unit 3 also decodes the encoded image data contained in the packets and supplies the image data to the control unit 2. The decoded image data is displayed on the display unit 8.

The display unit 8 includes a liquid crystal display and a liquid crystal driver, and displays information such as the caller's telephone number on an incoming call, the signal strength, and the remaining battery level, as well as the contents of e-mails, websites, electronic books, and the like. The display unit 8 also displays images captured by the imaging unit 12 and images recorded on the recording medium 13.

The operation unit 9 includes keys for operating the mobile terminal 1: specifically, a power key, input keys for entering numbers and characters, application keys for starting and terminating applications, and so on. The operation unit 9 also has a key for selecting between a monitor image reference mode and a normal mode, both described later.

The ROM 10 stores a program for executing the flowcharts shown in FIGS. 3 and 4, described later, and application programs for executing various applications.

The RAM 11 includes various storage units (not shown) that store various kinds of data: for example, an address book information storage unit that stores the address book information (names, telephone numbers, mail addresses, etc.) for the address book function, a mail information storage unit that stores the contents and attached images of sent and received mail, and a Web information storage unit that stores the URL information and images of websites.
The RAM 11 also includes a dictionary data storage unit 11A that stores, as dictionary data, image data extracted from these storage units.

The imaging unit 12 includes an imaging mechanism comprising a lens, a TTL (Through the Lens) exposure meter, a CMOS (Complementary Metal Oxide Semiconductor) sensor, an A/D conversion circuit, a frame memory, and so on (none of which are shown).
Images captured by the imaging unit 12 are shown in live view on the display unit 8. When the user operates the terminal to take a picture, the imaging unit 12 captures the subject while performing functions such as autofocus (AF) and automatic exposure correction (AE) on the basis of the photometric results of the TTL exposure meter.
An image captured in response to the user's imaging instruction is recorded in the RAM 11 or on the recording medium 13.

The recording medium 13 is a detachable recording medium such as a memory card provided with flash memory. It can record images captured by the imaging unit 12, and it also allows images captured without using the mobile terminal 1 to be imported into the mobile terminal 1. Such imported images can be used as images stored in the dictionary data storage unit 11A.
(3) Mobile terminal operation

Next, the operation of the mobile terminal 1 according to the first embodiment described above will be described. Each of the following processes is executed by the control unit 2 (or by the monitor image analysis unit 2A, the monitor image learning unit 2B, the attention area extraction unit 2C, or the application operation control unit 2D).
(3.1) Dictionary data generation processing

FIG. 3 is a flowchart for explaining the operation of the mobile terminal 1 related to the dictionary data generation processing.
The control unit 2 first determines whether a dictionary data generation instruction has been issued through the user's operation of the operation unit 9. Generation of dictionary data refers to the process of generating image data as dictionary data, which serves as the comparison reference in the attention area extraction processing described later.
When the control unit 2 determines that a dictionary data generation instruction has been issued by a user operation, the dictionary data generation processing of FIG. 3 starts.
Besides generating the dictionary in response to a user instruction as described above, images may also be analyzed in the background to generate dictionary data.

In step S1, the monitor image analysis unit 2A acquires images to be used as monitor images from web pages, mail, electronic books, and the like that the user has viewed. The image source is not limited to these and may be an image captured through the imaging unit 12 or an image recorded on the recording medium 13.
In step S2, the monitor image analysis unit 2A analyzes the image acquired in step S1 to grasp information about the subject. For example, it grasps, as subject information, features such as the edges of the subject (person, pet, small article, building, mountain, sea, etc.) and the region of the subject, the composition of the image (that is, the position information of the subject), and the aspect ratio of the subject.
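The subject information gathered in step S2 can be pictured as one small record per subject. The following Python sketch is illustrative only: the class name SubjectInfo, its fields, and the pixel box convention are assumptions introduced here, not part of the embodiment, and edge features are left out.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SubjectInfo:
    """Hypothetical record of what step S2 learns about one subject."""
    label: str                        # e.g. "person" or "building" (assumed labels)
    box: Tuple[int, int, int, int]    # subject region: (left, top, right, bottom) in pixels
    image_size: Tuple[int, int]       # (width, height) of the analyzed image

    @property
    def aspect_ratio(self) -> float:
        """Width divided by height of the subject region."""
        left, top, right, bottom = self.box
        return (right - left) / (bottom - top)

    @property
    def position(self) -> Tuple[float, float]:
        """Center of the subject normalized to 0..1, standing in for composition."""
        left, top, right, bottom = self.box
        width, height = self.image_size
        return ((left + right) / 2 / width, (top + bottom) / 2 / height)


info = SubjectInfo("building", (100, 50, 300, 450), (800, 600))
print(info.aspect_ratio, info.position)
```

Normalizing the position by the image size is a design choice of this sketch; it keeps the composition comparable between images of different resolutions.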

In step S3, the monitor image learning unit 2B generates, based on the analysis result of the monitor image analysis unit 2A, dictionary data to be referenced by the attention area extraction unit 2C, which is described later with reference to FIG. 4.
In step S4, the monitor image learning unit 2B stores the dictionary data generated in step S3 in the dictionary data storage unit 11A.
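Steps S3 and S4 amount to turning each analysis result into an entry and keeping it for later lookup. A minimal Python sketch follows, assuming a simple in-memory store keyed by subject label. The class DictionaryStore and its methods are hypothetical names, and the real format of the dictionary data is not specified in the description.

```python
from collections import defaultdict


class DictionaryStore:
    """Hypothetical stand-in for the dictionary data storage unit 11A."""

    def __init__(self):
        self._entries = defaultdict(list)

    def learn(self, label, features):
        """S3/S4: record one analysis result as dictionary data."""
        self._entries[label].append(dict(features))

    def labels(self):
        """Subject labels learned so far, in sorted order."""
        return sorted(self._entries)

    def entries(self, label):
        """All stored feature sets for a label (empty list if unknown)."""
        return list(self._entries.get(label, []))


store = DictionaryStore()
store.learn("person", {"aspect_ratio": 0.40})
store.learn("building", {"aspect_ratio": 0.50})
store.learn("person", {"aspect_ratio": 0.38})
print(store.labels())
```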
(3.2) Attention area extraction processing

FIG. 4 is a flowchart for explaining the operation of the mobile terminal 1 related to the attention area extraction processing.
The control unit 2 first determines whether the monitor image reference mode has been selected through the user's operation of the operation unit 9. When the monitor image reference mode is selected, the control unit 2 controls the operation of the application while referring to the dictionary data held in the dictionary data storage unit 11A, as described below.
That is, when the control unit 2 determines that the monitor image reference mode has been selected, processing enters the routine of FIG. 4.

In step S21, the attention area extraction unit 2C acquires, as the processing target image, an image displayed in live view on the display unit 8 or an image recorded in the RAM 11 or on the recording medium 13.
In step S22, the attention area extraction unit 2C analyzes the processing target image acquired in step S21. The analysis procedure is the same as in step S2 described above.

In step S23, the attention area extraction unit 2C compares the dictionary data stored in the dictionary data storage unit 11A with the analysis data obtained in step S22. The comparison is performed on the parameters that both sets of image data have in common.
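One way to read "the parameters that both sets of image data have in common" is to score only the keys present on both sides. The Python sketch below does this with a relative-difference score. The scoring formula, the function name similarity, and the assumption that all parameters are non-negative numbers are illustrative choices, not taken from the description.

```python
def similarity(dictionary_entry: dict, analysis: dict) -> float:
    """Average closeness over the parameters both inputs share (1.0 = identical).

    Assumes every parameter value is a non-negative number.
    """
    shared = dictionary_entry.keys() & analysis.keys()
    if not shared:
        return 0.0  # nothing in common, so nothing can be compared
    total = 0.0
    for key in shared:
        a, b = dictionary_entry[key], analysis[key]
        largest = max(abs(a), abs(b))
        total += 1.0 if largest == 0 else 1.0 - abs(a - b) / largest
    return total / len(shared)


learned = {"aspect_ratio": 0.5, "center_x": 0.25}
same = similarity(learned, {"aspect_ratio": 0.5, "edge_density": 0.8})
half = similarity({"aspect_ratio": 0.5}, {"aspect_ratio": 1.0})
none = similarity({"aspect_ratio": 0.5}, {"edge_density": 0.8})
print(same, half, none)
```

A caller would then treat a subject as "the same or similar" when the score exceeds some threshold; the description does not give a threshold, so any value used would be a tuning assumption.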

In step S24, the attention area extraction unit 2C extracts the attention area from the processing target image acquired in step S21, based on the comparison result of step S23.
This point will be described with reference to FIGS. 5 and 6.
In FIG. 5, the image held as dictionary data through steps S1 to S4 is shown as the "learning image", and the image acquired in step S21 is shown as the "processing target image".
As shown in FIG. 5, the processing target image is analyzed in step S22, and the learning image is compared with the analysis result of step S22 in step S23.
When the comparison in step S23 finds that the analysis result of step S22 contains data indicating the presence of a subject that is the same as or similar to a subject included in the learning image, the control unit 2 extracts the attention area from the processing target image as the "extracted image" shown in FIG. 5, so that it includes the same or similar subject in a form as close as possible to the dictionary data.
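The requirement that the extracted image include every matched subject can be sketched as taking the smallest rectangle that encloses all matched subject regions. This is only one possible reading: the function name attention_region, the margin parameter, and the box convention (left, top, right, bottom) are assumptions, and the sketch does not attempt to reproduce the composition of the dictionary data.

```python
def attention_region(boxes, image_size, margin=0):
    """Smallest rectangle enclosing all matched subject boxes, clamped to the image.

    boxes: list of (left, top, right, bottom) tuples for the matched subjects.
    Returns None when no subject matched.
    """
    if not boxes:
        return None
    width, height = image_size
    left = max(0, min(b[0] for b in boxes) - margin)
    top = max(0, min(b[1] for b in boxes) - margin)
    right = min(width, max(b[2] for b in boxes) + margin)
    bottom = min(height, max(b[3] for b in boxes) + margin)
    return (left, top, right, bottom)


person = (350, 300, 450, 560)
building = (100, 50, 300, 450)
region = attention_region([person, building], (800, 600), margin=20)
print(region)  # (80, 30, 470, 580)
```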

If a fixed region is extracted with the central portion of the processing target image as the reference, both the building and the person may be cut off, as in the "center extraction image" of FIG. 6.
Likewise, if a fixed region is extracted with the face portion of the processing target image as the reference by using a face recognition function, the building may be cut off, as in the "face recognition extracted image" of FIG. 6.
According to the present embodiment, the extraction region is determined in step S24, with the center of the processing target image as the reference, so as to include a subject that is the same as or similar to the subject of the learning image. The attention area can therefore be extracted without these problems.
Note that, instead of using the center of the processing target image as the reference, the image may be extracted so that the learning image is included while the face of a person detected by the face recognition function is positioned at, for example, a golden section point or one of the rule-of-thirds points.
The extraction may also output the region (area) in which the attention area exists, as indicated by the thick broken line in FIG. 7.
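The face-placement variant mentioned above can be sketched as computing a crop window whose chosen intersection point falls on the detected face. The Python example below covers only that placement step and does not check that the learned subject is also included. The function name, its parameters, and the assumption that the crop is no larger than the image are all illustrative.

```python
def crop_with_face_at(face_center, crop_size, image_size, point=(1 / 3, 1 / 3)):
    """Crop window placing the face at a fractional point of the crop.

    point=(1/3, 1/3) is the upper-left rule-of-thirds point; (0.382, 0.382)
    would approximate a golden section point. If the window would leave the
    image it is shifted back inside, so the face may end up off the point.
    Assumes crop_size fits within image_size.
    """
    face_x, face_y = face_center
    crop_w, crop_h = crop_size
    width, height = image_size
    left = round(face_x - crop_w * point[0])
    top = round(face_y - crop_h * point[1])
    left = min(max(left, 0), width - crop_w)
    top = min(max(top, 0), height - crop_h)
    return (left, top, left + crop_w, top + crop_h)


window = crop_with_face_at((400, 200), (300, 300), (800, 600))
clamped = crop_with_face_at((50, 50), (300, 300), (800, 600))
print(window, clamped)
```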

Returning to FIG. 4, in step S25, the application operation control unit 2D controls the operation of an application according to the functions of the mobile terminal 1, based on the extracted image. Examples are highlight display, slide show effects, display control for album generation, and autofocus (AF) and auto exposure correction (AE) control during camera shooting.
With this control, even when an application such as highlight display or a slide show is executed on an image trimmed around the image center or around a person, the person or part of a background structure does not fall outside the highlight display or slide show. In other words, the subject that the photographer or the user of the mobile terminal 1 intended to capture or display is extracted appropriately without falling outside the attention area, and applications such as highlight display and slide shows can operate optimally.
(Second Embodiment)

The second embodiment relates to an image attention area extraction support system in which the dictionary data generation function is performed by a server.
This system is described below with reference to FIGS. 8 to 10.
As shown in FIG. 8, the image attention area extraction support system according to the present embodiment includes a dictionary function providing server DS that provides a dictionary function service, and a mobile terminal 100 that receives this service and extracts an attention area from an image. These are connected via the Internet IN or the like. The configuration that mediates between the mobile terminal 100 and the dictionary function providing server DS is the same as that of FIG. 1.
The dictionary function providing server DS includes a CPU, a ROM, a RAM, and the like (not shown), and generates dictionary data by the same procedure as in FIG. 3 of the first embodiment. The generated dictionary data is distributed in response to a distribution request from the mobile terminal 100.
As shown in FIG. 9, in which parts identical to those of the mobile terminal 1 of the first embodiment carry the same reference numerals, the mobile terminal 100 has the same configuration as the mobile terminal 1 except that the monitor image analysis unit 2A and the monitor image learning unit 2B are unnecessary.

The mobile terminal 100 executes processing according to the flowchart of FIG. 10.
As is clear from a comparison of FIG. 10 with FIG. 4, steps S31, S32, S35, and S36 of FIG. 10 correspond to steps S21, S22, S24, and S25 of FIG. 4, respectively, so their description is omitted and only the differences are described below.
In the processing procedure of FIG. 10, the control unit 2 acquires dictionary data from the dictionary function providing server DS in step S33 and stores the acquired data in the dictionary data storage unit 11A.
Then, in step S34, the attention area extraction unit 2C compares the dictionary data stored in the dictionary data storage unit 11A with the analysis data obtained in step S32.
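The ordering of steps S32 to S36 on the terminal side can be sketched as a small pipeline in which each step is passed in as a function. Everything here is hypothetical scaffolding: run_extraction, the storage dictionary standing in for the dictionary data storage unit 11A, and the stub steps exist only to show the order of operations, with step S31 (image acquisition) assumed to have produced the image argument.

```python
def run_extraction(image, analyze, fetch_dictionary, compare, extract,
                   control_app, storage):
    """Terminal-side flow of FIG. 10 for an already acquired image (S31)."""
    analysis = analyze(image)                           # S32: analyze the image
    storage["dictionary"] = fetch_dictionary()          # S33: get and keep server data
    matches = compare(storage["dictionary"], analysis)  # S34: compare
    region = extract(image, matches)                    # S35: extract attention area
    return control_app(region)                          # S36: control the application


calls = []


def stub(name, result):
    """Build a stand-in step that records its name and returns a fixed result."""
    def step(*args):
        calls.append(name)
        return result
    return step


storage = {}
outcome = run_extraction(
    image="processing target image",
    analyze=stub("S32", {"aspect_ratio": 0.5}),
    fetch_dictionary=stub("S33", [{"aspect_ratio": 0.5}]),
    compare=stub("S34", ["building"]),
    extract=stub("S35", (80, 30, 470, 580)),
    control_app=stub("S36", "slideshow updated"),
    storage=storage,
)
print(calls, outcome)
```

Passing the steps in as functions keeps the sketch independent of how the server is reached; a real fetch_dictionary would involve a network request whose protocol the description does not specify.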

As a modification of the second embodiment, more of the processing may be carried out on the server side, as in a so-called thin client.
For example, the mobile terminal may execute only the minimum processing: acquiring the processing target image, transmitting the acquired image to the server, receiving the processed image (the attention area extraction image), and controlling the application operation. The server, in addition to the dictionary data generation processing, then analyzes the image received from the mobile terminal and compares it with the learning data, extracts the attention area, and transmits the image from which the attention area has been extracted to the mobile terminal.

The program code that implements the procedures of the flowcharts shown above may be provided on a recording medium on which the program code is recorded, for example a USB memory, a CD-ROM, or a magneto-optical disk.
(Appendix 1 to Appendix 9)

One aspect of the present invention is additionally described below.
Part or all of the above embodiments can also be described as in the following appendices, but are not limited to them.
(Appendix 1)
FIG. 11 is a configuration diagram of Appendix 1.
As shown in this figure, the invention according to Appendix 1 is an electronic device comprising:
monitor image acquisition means 41 for acquiring a monitor image;
monitor image analysis means 42 for analyzing a subject included in the monitor image acquired by the monitor image acquisition means;
monitor image learning means 43 for generating and holding, as dictionary data, subject information about the subject obtained by the monitor image analysis means;
processing target image acquisition means 44 for acquiring a processing target image;
processing target image analysis means 45 for analyzing a subject included in the processing target image acquired by the processing target image acquisition means;
comparison determination means 46 for determining, by comparing the subject information about the subject obtained by the processing target image analysis means with the dictionary data held in the monitor image learning means, whether the processing target image includes one or more subjects that are the same as or similar to a subject held in the monitor image learning means;
attention area extraction means 47 for extracting, when the comparison determination means determines that one or more same or similar subjects are included, an attention area from the processing target image so as to include the one or more subjects; and
application operation control means 48 for controlling the operation of an application based on the attention area extracted by the attention area extraction means.
(Appendix 2)
The invention according to Appendix 2 is the electronic device according to Appendix 1, wherein
the monitor image analysis means and the processing target image analysis means analyze the subject with respect to at least one parameter among the features, region, composition, and aspect ratio of the subject, and
the comparison determination means determines whether the images are the same or similar based on at least one of these parameters.
(Appendix 3)
The invention according to Appendix 3 is the electronic device according to Appendix 1 or 2, wherein
the monitor image learning means generates and holds subject information as dictionary data by learning through statistical processing of similar images based on the analysis result of the monitor image analysis means.
(Appendix 4)
The invention according to Appendix 4 is a method for extracting an attention area from an image, comprising:
acquiring a monitor image;
analyzing a subject included in the monitor image;
generating and holding, as dictionary data, subject information about the subject included in the monitor image;
acquiring a processing target image;
analyzing a subject included in the processing target image;
determining, by comparing subject information about the subject included in the processing target image with the dictionary data, whether the processing target image includes one or more subjects that are the same as or similar to a subject included in the monitor image;
extracting, when it is determined that one or more same or similar subjects are included, an attention area from the processing target image so as to include the one or more subjects; and
controlling the operation of an application based on the attention area.
(Appendix 5)
The invention according to Appendix 5 is the method according to Appendix 4, wherein
in the analysis of the monitor image and the analysis of the processing target image, the subject is analyzed with respect to at least one parameter among the features, region, composition, and aspect ratio of the subject, and
in the comparison determination, whether the subjects are the same or similar is determined based on at least one of these parameters.
(Appendix 6)
The invention according to Appendix 6 is the method according to Appendix 4 or 5, wherein
when the dictionary data is held, subject information is held as dictionary data by learning through statistical processing of similar images based on the analysis result of the monitor image.
(Appendix 7)
The invention according to Appendix 7 is a system for supporting image attention area extraction, comprising:
a server including
monitor image acquisition means for acquiring a monitor image,
monitor image analysis means for analyzing a subject included in the monitor image acquired by the monitor image acquisition means, and
monitor image learning means for generating and holding, as dictionary data, subject information about the subject obtained by the monitor image analysis means; and
an electronic device including
dictionary data acquisition means for acquiring the dictionary data from the server,
processing target image acquisition means for acquiring a processing target image,
processing target image analysis means for analyzing a subject included in the processing target image acquired by the processing target image acquisition means,
comparison determination means for determining, by comparing the subject information about the subject obtained by the processing target image analysis means with the dictionary data acquired by the dictionary data acquisition means, whether the processing target image includes one or more subjects that are the same as or similar to a subject of the dictionary data acquired by the dictionary data acquisition means,
attention area extraction means for extracting, when the comparison determination means determines that one or more same or similar subjects are included, an attention area from the processing target image so as to include the one or more subjects, and
application operation control means for controlling the operation of an application based on the attention area extracted by the attention area extraction means.
(Appendix 8)
The invention according to Appendix 8 is an image attention area extraction system comprising:
a server including
monitor image acquisition means for acquiring a monitor image,
monitor image analysis means for analyzing a subject included in the monitor image acquired by the monitor image acquisition means, and
monitor image learning means for generating and holding, as dictionary data, subject information about the subject obtained by the monitor image analysis means; and
an electronic device including
processing target image acquisition means for acquiring a processing target image, and
processing target image transmission means for transmitting the processing target image acquired by the processing target image acquisition means to the server,
wherein the server further includes
processing target image receiving means for receiving the processing target image transmitted from the processing target image transmission means,
processing target image analysis means for analyzing a subject included in the processing target image received by the processing target image receiving means,
comparison determination means for determining, by comparing the subject information about the subject obtained by the processing target image analysis means with the dictionary data held in the monitor image learning means, whether the processing target image includes one or more subjects that are the same as or similar to a subject of the dictionary data acquired by the dictionary data acquisition means,
attention area extraction means for extracting, when the comparison determination means determines that one or more same or similar subjects are included, an attention area from the processing target image so as to include the one or more subjects, and
attention area extraction image transmission means for transmitting the image from which the attention area has been extracted by the attention area extraction means to the electronic device, and
wherein the electronic device further includes
attention area extraction image receiving means for receiving the attention area extraction image transmitted from the attention area extraction image transmission means, and
application operation control means for controlling the operation of an application based on the attention area extraction image received by the attention area extraction image receiving means.
(Appendix 9)
The invention according to Appendix 9 is a program for extracting an attention area from an image, the program causing a computer to execute:
a function of acquiring a monitor image;
a function of analyzing a subject included in the monitor image;
a function of generating and holding, as dictionary data, subject information about the subject included in the monitor image;
a function of acquiring a processing target image;
a function of analyzing a subject included in the processing target image;
a function of determining, by comparing subject information about the subject included in the processing target image with the dictionary data, whether the processing target image includes one or more subjects that are the same as or similar to a subject included in the monitor image;
a function of extracting, when it is determined that one or more same or similar subjects are included, an attention area from the processing target image so as to include the one or more subjects determined to be the same or similar; and
a function of controlling the operation of an application based on the attention area.

Claims

1. An electronic device comprising:
monitor image acquisition means for acquiring a monitor image;
monitor image analysis means for analyzing a subject included in the monitor image acquired by the monitor image acquisition means;
monitor image learning means for generating and holding, as dictionary data, subject information about the subject obtained by the monitor image analysis means;
processing target image acquisition means for acquiring a processing target image;
processing target image analysis means for analyzing a subject included in the processing target image acquired by the processing target image acquisition means;
comparison determination means for determining, by comparing the subject information about the subject obtained by the processing target image analysis means with the dictionary data held in the monitor image learning means, whether the processing target image includes one or more subjects that are the same as or similar to a subject held in the monitor image learning means;
attention area extraction means for extracting, when the comparison determination means determines that one or more same or similar subjects are included, an attention area from the processing target image so as to include the one or more subjects; and
application operation control means for controlling the operation of an application based on the attention area extracted by the attention area extraction means.

2. The electronic device according to claim 1, wherein the monitor image analysis means and the processing target image analysis means analyze the subject with respect to at least one parameter among the features, region, composition, and aspect ratio of the subject, and
the comparison determination means determines whether the subjects are the same or similar based on at least one of these parameters.

3. The electronic device according to claim 1, wherein the monitor image learning means holds subject information as dictionary data by learning through statistical processing of similar images based on the analysis result of the monitor image analysis means.

4. An image attention area extraction method comprising:
acquiring a monitor image;
analyzing a subject included in the monitor image;
generating and holding, as dictionary data, subject information about the subject included in the monitor image;
acquiring a processing target image;
analyzing a subject included in the processing target image;
determining, by comparing subject information about the subject included in the processing target image with the dictionary data, whether the processing target image includes one or more subjects that are the same as or similar to a subject included in the monitor image;
extracting, when it is determined that one or more same or similar subjects are included, an attention area from the processing target image so as to include the one or more subjects; and
controlling the operation of an application based on the attention area.

5. The image attention area extraction method according to claim 4, wherein in the analysis of the monitor image and the analysis of the processing target image, the subject is analyzed with respect to at least one parameter among the features, region, composition, and aspect ratio of the subject, and
in the comparison determination, whether the subjects are the same or similar is determined based on at least one of these parameters.

6. The image attention area extraction method according to claim 4 or 5, wherein, when the dictionary data is held, subject information is generated and held as dictionary data by learning through statistical processing of similar images based on the analysis result of the monitor image.

7. An image attention area extraction system comprising:
a server including
monitor image acquisition means for acquiring a monitor image,
monitor image analysis means for analyzing a subject included in the monitor image acquired by the monitor image acquisition means, and
monitor image learning means for generating and holding, as dictionary data, subject information about the subject obtained by the monitor image analysis means; and
an electronic device including
dictionary data acquisition means for acquiring the dictionary data from the server,
processing target image acquisition means for acquiring a processing target image,
processing target image analysis means for analyzing a subject included in the processing target image acquired by the processing target image acquisition means,
comparison determination means for determining, by comparing the subject information about the subject obtained by the processing target image analysis means with the dictionary data acquired by the dictionary data acquisition means, whether the processing target image includes one or more subjects that are the same as or similar to a subject of the dictionary data acquired by the dictionary data acquisition means,
attention area extraction means for extracting, when the comparison determination means determines that one or more same or similar subjects are included, an attention area from the processing target image so as to include the one or more subjects, and
application operation control means for controlling the operation of an application based on the attention area extracted by the attention area extraction means.

An image attention area extraction system comprising:
A server comprising:
Monitor image acquisition means for acquiring a monitor image;
Monitor image analysis means for analyzing a subject included in the monitor image acquired by the monitor image acquisition means; and
Monitor image learning means for generating and holding subject information relating to the subject obtained by the monitor image analysis means as dictionary data; and
An electronic device comprising:
Processing target image acquisition means for acquiring a processing target image; and
Processing target image transmission means for transmitting the processing target image acquired by the processing target image acquisition means to the server,
Wherein the server further comprises:
Processing target image receiving means for receiving the processing target image transmitted from the processing target image transmission means;
Processing target image analysis means for analyzing a subject included in the processing target image received by the processing target image receiving means;
Comparison determination means for determining, by comparing the subject information relating to the subject obtained by the processing target image analysis means with the dictionary data held in the monitor image learning means, whether or not the processing target image includes one or more subjects that are the same as or similar to the subject of the dictionary data acquired by the dictionary data acquisition means;
Attention area extraction means for extracting an attention area from the processing target image so as to include the one or more subjects when the comparison determination means determines that the same or similar subjects are included; and
Attention area extraction image transmission means for transmitting, to the electronic device, an image from which the attention area has been extracted by the attention area extraction means, and
Wherein the electronic device further comprises:
Attention area extraction image receiving means for receiving the attention area extraction image transmitted from the attention area extraction image transmission means; and
Application operation control means for performing application operation control based on the attention area extraction image received by the attention area extraction image receiving means.
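
The division of work between the server and the electronic device in this system can be sketched as an in-process simulation. The sketch makes several simplifying assumptions: image analysis is stubbed out by representing an image as a mapping from subject label to bounding box, the dictionary data is a set of labels, a direct method call stands in for transmission and reception, and the server returns the attention area rather than an extracted image. All class and method names are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

Box = Tuple[int, int, int, int]   # (left, top, right, bottom), hypothetical
AnalyzedImage = Dict[str, Box]    # stand-in for an analyzed image: label -> box


class Server:
    def __init__(self) -> None:
        self.dictionary = set()   # stand-in for the held dictionary data

    def learn_monitor_image(self, subjects: Iterable[str]) -> None:
        """Monitor image learning: remember the subjects the user has viewed."""
        self.dictionary.update(subjects)

    def handle_processing_target(self, image: AnalyzedImage) -> Optional[Box]:
        """Compare against the dictionary and extract the attention area."""
        boxes = [box for label, box in image.items() if label in self.dictionary]
        if not boxes:
            return None
        return (min(b[0] for b in boxes), min(b[1] for b in boxes),
                max(b[2] for b in boxes), max(b[3] for b in boxes))


class Device:
    def __init__(self, server: Server) -> None:
        self.server = server

    def process(self, image: AnalyzedImage) -> dict:
        # The direct call stands in for transmitting the image and receiving the result.
        area = self.server.handle_processing_target(image)
        return self.control_application(area)

    @staticmethod
    def control_application(area: Optional[Box]) -> dict:
        """Application operation control: crop to the area, or show the full image."""
        if area is None:
            return {"action": "show_full"}
        return {"action": "crop", "area": area}
```

Keeping analysis and comparison on the server, as in this claim, means the electronic device only acquires images and acts on the returned result.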

An image attention area extraction program for causing a computer to execute:
A function of acquiring a monitor image;
A function of analyzing a subject included in the monitor image;
A function of generating and storing subject information regarding the subject included in the monitor image as dictionary data;
A function of acquiring a processing target image;
A function of analyzing a subject included in the processing target image;
A function of determining, by comparing subject information regarding the subject included in the processing target image with the dictionary data, whether or not the processing target image includes one or more subjects that are the same as or similar to the subject included in the monitor image;
A function of extracting, when it is determined that one or more subjects that are the same or similar are included, an attention area from the processing target image so as to include the one or more subjects determined to be the same or similar; and
A function of controlling the operation of an application based on the attention area.