JP2013140559A - Image processing device, imaging device and program

Info

Publication number: JP2013140559A
Application number: JP2012206296A
Authority: JP
Inventors: Hiroko Kobayashi; 寛子小林; Takeshi Matsuo; 武史松尾
Original assignee: Nikon Corp
Current assignee: Nikon Corp
Priority date: 2011-12-05
Filing date: 2012-09-19
Publication date: 2013-07-18

Abstract

PROBLEM TO BE SOLVED: To provide a technique capable of imparting more flexible character information to a captured image. SOLUTION: An image processing device 1 comprises: an image input section 10 that inputs a captured image; a storage section 90 that stores, as text templates in which a word is inserted into a predetermined blank column part to complete a text, a person image template to be used in preparation of a text for a person image in which a person is a photographic subject and a landscape image template to be used in preparation of a text for a landscape image in which a landscape is a photographic subject; a determination section 20 that determines whether the captured image is a person image or a landscape image; and a text preparation section 30 that reads either the person image template or the landscape image template from the storage section 90 based on the determination result by the determination section 20 for the captured image and inserts words into the blank column part in accordance with an amount of features or image-capturing conditions of the captured image to prepare a text for the captured image.

Description

The present invention relates to an image processing device, an imaging device, and a program.

Conventionally, a technique has been disclosed in which the birthdays of specific persons, the dates and times of events, and the like are registered in advance, so that character information such as the name of a person whose birthday corresponds to the imaging date and time, or the name of an event corresponding to the imaging date and time, is added to a captured image (see, for example, Patent Document 1).

Patent Document 1: JP-A-2-303282

However, the conventional technique has a problem in that only character information registered in advance by the user can be added to a captured image. In view of this problem, an object of the present invention is to provide a technique that can add more flexible character information to a captured image.

To solve the above problem, an image processing device according to one aspect of the present invention comprises: an image input unit that inputs a captured image; a storage unit that stores, as sentence templates in which a sentence is completed by inserting a word into a predetermined blank, a person image template used to create a sentence for a person image whose subject is a person and a landscape image template used to create a sentence for a landscape image whose subject is a landscape; a determination unit that determines whether the captured image is a person image or a landscape image; and a sentence creation unit that, according to the determination result of the determination unit for the captured image, reads either the person image template or the landscape image template from the storage unit and creates a sentence for the captured image by inserting, into the blank of the read sentence template, a word corresponding to a feature amount or an imaging condition of the captured image.

To solve the above problem, an image processing device according to another aspect of the present invention comprises: an image input unit to which a captured image is input; a decision unit that decides text corresponding to at least one of a feature amount of the captured image and an imaging condition of the captured image; a determination unit that determines whether the captured image is an image of a first type or an image of a second type different from the first type; a storage unit that stores a first syntax, which is a sentence syntax used for the first type, and a second syntax, which is a sentence syntax used for the second type; and a sentence creation unit that creates a sentence of the first syntax using the text decided by the decision unit when the determination unit determines that the captured image is an image of the first type, and creates a sentence of the second syntax using the text decided by the decision unit when the determination unit determines that the captured image is an image of the second type.

An imaging device according to another aspect of the present invention comprises: an imaging unit that images a subject and generates a captured image; a storage unit that stores, as sentence templates in which a sentence is completed by inserting a word into a predetermined blank, a person image template used to create a sentence for a person image whose subject is a person and a landscape image template used to create a sentence for a landscape image whose subject is a landscape; a determination unit that determines whether the captured image is a person image or a landscape image; and a sentence creation unit that, according to the determination result of the determination unit for the captured image, reads either the person image template or the landscape image template from the storage unit and creates a sentence for the captured image by inserting, into the blank of the read sentence template, a word corresponding to a feature amount or an imaging condition of the captured image.

A program according to another aspect of the present invention causes a computer of an image processing device, the image processing device including a storage unit that stores, as sentence templates in which a sentence is completed by inserting a word into a predetermined blank, a person image template used to create a sentence for a person image whose subject is a person and a landscape image template used to create a sentence for a landscape image whose subject is a landscape, to execute: an image input step of inputting a captured image; a determination step of determining whether the captured image is a person image or a landscape image; and a sentence creation step of, according to the determination result of the determination step for the captured image, reading either the person image template or the landscape image template from the storage unit and creating a sentence for the captured image by inserting, into the blank of the read sentence template, a word corresponding to a feature amount or an imaging condition of the captured image.

An image processing device according to one aspect of the present invention comprises: a decision unit that decides, from a captured image, characters having a predetermined meaning; a determination unit that determines whether the captured image is a person image or an image different from a person image; a storage unit that stores a first syntax, which is a sentence syntax used for person images, and a second syntax, which is a sentence syntax used for images different from person images; and an output unit that outputs a sentence of the first syntax using the characters having the predetermined meaning when the determination unit determines that the captured image is a person image, and outputs a sentence of the second syntax using the characters having the predetermined meaning when the determination unit determines that the captured image is an image different from a person image.

According to the present invention, more flexible character information can be added to a captured image.

An example of a functional block diagram of the image processing device 1 according to the first embodiment of the present invention.
An example of sentence templates stored in the storage unit 90.
An example of words stored in the storage unit 90.
An explanatory diagram for describing extraction of the color arrangement pattern of a captured image.
A flowchart showing an example of the operation of the image processing device 1.
A flowchart showing an example of the operation of the image processing device 1.
An example of a captured image to which a sentence has been added by the sentence addition unit 40.
An example of a functional block diagram of the imaging device 100 according to the second embodiment of the present invention.
A diagram schematically showing an example of a process for extracting a feature amount of a captured image.
A diagram schematically showing another example of a process for extracting a feature amount of a captured image.
A flowchart schematically showing a method for determining a smile level.
A diagram showing an example of an output image from the image processing device.
A diagram showing another example of an output image from the image processing device.
A schematic block diagram showing the internal configuration of the image processing unit of the imaging device.
A flowchart showing the flow of determining a representative color.
A conceptual diagram showing an example of processing in the image processing unit.
A conceptual diagram showing an example of processing in the image processing unit.
A conceptual diagram showing the result of clustering performed on the main region shown in FIG. 16.
An example of an image to which a sentence has been added by the sentence addition unit.
Another example of an image to which a sentence has been added by the sentence addition unit.
A diagram showing an example of a correspondence table between colors and words.
A diagram showing an example of a correspondence table for distant view images (second scene images).
A diagram showing an example of a correspondence table for other images (third scene images).

(First embodiment)
Hereinafter, a first embodiment of the present invention will be described with reference to the drawings. FIG. 1 is an example of a functional block diagram of the image processing device 1 according to the first embodiment of the present invention. FIG. 2 is an example of sentence templates stored in the storage unit 90. FIG. 3 is an example of words stored in the storage unit 90. FIG. 4 is an explanatory diagram for describing extraction of the color arrangement pattern of a captured image.

As shown in FIG. 1, the image processing device 1 includes an image input unit 10, a determination unit 20, a sentence creation unit 30, a sentence addition unit 40, and a storage unit 90. The image input unit 10 inputs a captured image, for example, via a network or a storage medium, and outputs the captured image to the determination unit 20.

The storage unit 90 stores sentence templates in which a sentence is completed by inserting a word into a predetermined blank. Specifically, the storage unit 90 stores, as sentence templates, a person image template used to create a sentence for an image whose subject is a person (hereinafter, a person image) and a landscape image template used to create a sentence for an image whose subject is a landscape (hereinafter, a landscape image). A landscape is also referred to as the second type. An example of a person image is a portrait (also referred to as the first type).

For example, the storage unit 90 stores two kinds of person image templates, as shown in FIGS. 2(a) and 2(b). The person image templates shown in FIGS. 2(a) and 2(b) each have a blank into which a word corresponding to the number of subjects is inserted (written as blank {number of persons}) and a blank into which a word corresponding to the color arrangement pattern of the captured image is inserted (written as blank {adjective}).

Likewise, the storage unit 90 stores two kinds of landscape image templates, as shown in FIGS. 2(c) and 2(d). The landscape image template shown in FIG. 2(c) has a blank into which a word corresponding to an imaging condition (date and time) of the captured image is inserted (written as blank {date and time}) and a blank into which a word corresponding to the color arrangement pattern of the captured image is inserted. The landscape image template shown in FIG. 2(d) has a blank into which a word corresponding to an imaging condition (place) of the captured image is inserted (written as blank {place}) and a blank into which a word corresponding to the color arrangement pattern of the captured image is inserted.

The person image templates described above are sentence templates that evoke an impression centered on the person imaged as the subject; that is, they are sentences written from the viewpoint of the imaged person, with blanks set in them. For example, the wording "spent" in the person image template of FIG. 2(a) and the wording "pose" in the person image template of FIG. 2(b) express the viewpoint of the imaged person. The landscape image templates described above are sentence templates that evoke an impression of the captured image as a whole; that is, they are sentences written from the viewpoint of the photographer who imaged the subject, with blanks set in them. For example, the wording "a shot" in the landscape image template of FIG. 2(c) and the wording "scenery" in the landscape image template of FIG. 2(d) express the viewpoint of the photographer.

In addition to the sentence templates (person image templates and landscape image templates), the storage unit 90 stores the words to be inserted into each blank of the sentence templates. For example, as shown in FIG. 3(a), the storage unit 90 stores words related to the number of persons, in association with the number of subjects in the captured image, as words to be inserted into the blank {number of persons}.

For example, when a person image template is used and the number of subjects is "1", the word "alone" is inserted into the blank {number of persons} of the person image template. The sentence creation unit 30 reads the sentence template to be used from the storage unit 90 and inserts the words into the blanks (described later).
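The blank-filling step can be sketched as follows. This is a minimal illustration only: the template wording, the ASCII brace syntax, and every entry of the word table other than 1 -> "alone" are assumptions, since the contents of FIGS. 2 and 3 are not reproduced in this text.

```python
import re

# Hypothetical person image template; the real wording of FIG. 2 is not shown here.
PERSON_TEMPLATE = "A {adjective} moment spent {number of persons}"

# Word table for the blank {number of persons}. Only the entry for 1 follows
# the text; the other entries are placeholders for illustration.
NUMBER_WORDS = {1: "alone", 2: "as a pair"}
DEFAULT_NUMBER_WORD = "with everyone"


def number_word(count: int) -> str:
    """Look up the word associated with the number of subjects."""
    return NUMBER_WORDS.get(count, DEFAULT_NUMBER_WORD)


def fill_template(template: str, words: dict) -> str:
    """Replace each {blank} with words[blank]; unknown blanks are left as-is."""
    return re.sub(
        r"\{([^{}]+)\}",
        lambda m: words.get(m.group(1), m.group(0)),
        template,
    )


sentence = fill_template(
    PERSON_TEMPLATE,
    {"number of persons": number_word(1), "adjective": "cool"},
)
```

Leaving unknown blanks untouched makes a missing word visible in the output instead of silently dropping it.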

As shown in FIG. 3(b), the storage unit 90 also stores adjectives for person images and adjectives for landscape images, in association with color arrangement patterns of the captured image, as words to be inserted into the blank {adjective} of a person image template or the blank {adjective} of a landscape image template.

For example, when a person image template is used and the color arrangement pattern of the entire area of the captured image is the one shown in FIG. 4(a), namely first color "color 1", second color "color 2", and third color "color 3", the word "cool" is inserted into the blank {adjective} of the person image template. When a landscape image template is used and the color arrangement pattern of the entire area of the captured image is the one shown in FIG. 4(b), namely first color "color 2", second color "color 1", and third color "color 4", the word "lively" is inserted into the blank {adjective} of the landscape image template.

Colors 1 to 5 above are obtained by classifying the individual colors actually present in the captured image into five colors (five representative colors) according to a criterion such as warm versus cool colors. In other words, colors 1 to 5 are the result of classifying the pixel value of each pixel of the captured image into five colors according to such a criterion.
Of the colors constituting the color arrangement pattern, the first color is the one of colors 1 to 5 that appears most in the captured image, the second color is the one that appears second most, and the third color is the one that appears third most. In other words, when the pixel values are classified into colors 1 to 5, the color with the largest number of classified pixels is the first color, the color with the second largest number is the second color, and the color with the third largest number is the third color.
The sentence creation unit 30 extracts the color arrangement pattern from the captured image.
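The extraction of a color arrangement pattern can be sketched as below. The source only says that pixels are classified into five representative colors by a criterion such as warm versus cool; the hue and saturation boundaries in `classify_pixel` are therefore purely illustrative assumptions, as are the function names. The two entries of the word table follow the "cool" and "lively" examples given in the text.

```python
import colorsys
from collections import Counter


def classify_pixel(rgb):
    """Map an (r, g, b) pixel with 0-255 channels to a representative color 1..5.

    The boundaries below are assumptions made for this sketch.
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, s, _v = colorsys.rgb_to_hsv(r, g, b)
    if s < 0.15:
        return 5  # near-neutral: greys, white, black
    hue = h * 360.0
    if hue < 60 or hue >= 300:
        return 1  # reds and magentas (warm)
    if hue < 120:
        return 2  # yellows and yellow-greens (warm)
    if hue < 210:
        return 3  # greens and cyans (cool)
    return 4      # blues and violets (cool)


def color_pattern(pixels):
    """Return the most frequent representative colors as (first, second, third).

    The tuple is shorter if fewer than three colors occur in the input.
    """
    counts = Counter(classify_pixel(p) for p in pixels)
    # Descending pixel count; ties are broken by the smaller color index.
    ranked = sorted(counts, key=lambda c: (-counts[c], c))
    return tuple(ranked[:3])


# Word table keyed by (image type, pattern), following the two examples in the text.
ADJECTIVES = {
    ("person", (1, 2, 3)): "cool",
    ("landscape", (2, 1, 4)): "lively",
}


def adjective_for(image_type, pattern, default=None):
    """Look up the adjective for an image type and color arrangement pattern."""
    return ADJECTIVES.get((image_type, pattern), default)
```

Because the pattern is ordered, (1, 2, 3) and (2, 1, 3) are different keys, which matches the text's distinction between first, second, and third colors.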

Instead of the color arrangement pattern of the entire area of the captured image, the color arrangement pattern of a partial area of the captured image may be used. That is, the sentence creation unit 30 may insert into the blank an adjective corresponding to the color arrangement pattern of a partial area of the captured image. Specifically, the sentence creation unit 30 may determine a predetermined area of the captured image according to whether the captured image is a person image or a landscape image, and insert into the blank an adjective corresponding to the color arrangement pattern of that predetermined area.
For example, when the captured image is a person image, as shown in FIG. 4(c), the sentence creation unit 30 may determine the central area of the person image as the predetermined area, extract the color arrangement pattern of the central area, and insert into the blank an adjective corresponding to the extracted pattern. When the captured image is a landscape image, as shown in FIG. 4(d), the sentence creation unit 30 may determine the upper area of the landscape image as the predetermined area, extract the color arrangement pattern of the upper area, and insert into the blank an adjective corresponding to the extracted pattern.
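As a rough sketch of the area selection described above, the function below returns a crop box for the region whose colors would be analysed. The source does not say how large the central or upper area is, so the fractions used here (the middle half in each direction, and the top half) are assumptions, as is the function name.

```python
def region_of_interest(width, height, image_type):
    """Return (left, top, right, bottom) of the area used for color extraction.

    The region sizes are assumed for illustration; the source only names
    "the central area" and "the upper area".
    """
    if image_type == "person":
        # Central area: the middle half of the image in each direction.
        return (width // 4, height // 4, width - width // 4, height - height // 4)
    if image_type == "landscape":
        # Upper area: the top half of the image.
        return (0, 0, width, height // 2)
    raise ValueError(f"unknown image type: {image_type!r}")
```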

Although not illustrated, the storage unit 90 also stores, in association with imaging dates and times, words related to date and time (for example, the time of day, "Good morning", "Dusk", "Midsummer!!", ...) as words to be inserted into the blank {date and time}. Likewise, the storage unit 90 stores, in association with imaging places, words related to place (for example, "North country", "Ancient capital", "Mt. Fuji", "Kaminarimon", ...) as words to be inserted into the blank {place}.

The determination unit 20 acquires the captured image from the image input unit 10 and determines whether the acquired captured image is a person image or a landscape image. The person image / landscape image determination by the determination unit 20 is described in detail below. Note that the first threshold (also referred to as F_low) is smaller than the second threshold (also referred to as F_high).

The determination unit 20 attempts to recognize face regions in the captured image.
(When the number of face regions = 0)
If the determination unit 20 recognizes no face region in the captured image, it determines that the captured image is a landscape image.

(When the number of face regions = 1)
If the determination unit 20 recognizes one face region in the captured image, it calculates the ratio R of the size of the face region to the size of the captured image according to formula (1) below.
R = Sf / Sp ... (1)
In formula (1), Sp is the size of the captured image; specifically, the length of the captured image in its longitudinal direction is used. Sf is the size of the face region; specifically, the longitudinal length of the rectangle circumscribing the face region (or the length of the major axis of an ellipse enclosing the face region) is used.

Having calculated the ratio R, the determination unit 20 compares R with the first threshold F_low. If R is less than F_low, the determination unit 20 determines that the captured image is a landscape image. If R is equal to or greater than F_low, the determination unit 20 compares R with the second threshold F_high.

If R is equal to or greater than the second threshold F_high, the determination unit 20 determines that the captured image is a person image. If R is less than F_high, the determination unit 20 determines that the captured image is a landscape image.
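The single-face case can be sketched as follows. The threshold values and function names are assumptions chosen for illustration; the source gives no numeric values for the thresholds.

```python
F_LOW = 0.1   # assumed value of the first threshold
F_HIGH = 0.3  # assumed value of the second threshold (must exceed F_LOW)


def face_ratio(face_w, face_h, img_w, img_h):
    """Formula (1): R = Sf / Sp, using the longer side of each rectangle."""
    sf = max(face_w, face_h)  # longitudinal length of the face bounding box
    sp = max(img_w, img_h)    # longitudinal length of the captured image
    return sf / sp


def classify_single_face(r, f_low=F_LOW, f_high=F_HIGH):
    """Classify an image containing exactly one recognized face region."""
    if r < f_low:
        return "landscape"
    if r >= f_high:
        return "person"
    # F_low <= R < F_high: the text assigns this case to landscape as well.
    return "landscape"
```

As the procedure is described, the outcome for a single face depends only on whether R reaches F_high; F_low changes the result only in the multi-face case and when counting subjects.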

(When the number of face regions ≥ 2)
If the determination unit 20 recognizes a plurality of face regions in the captured image, it calculates the ratio R(i) of the size of each face region to the size of the captured image according to formula (2) below.
R(i) = Sf(i) / Sp ... (2)
Sp in formula (2) is the same as in formula (1). Sf(i) is the size of the i-th face region; specifically, the longitudinal length of the rectangle circumscribing the i-th face region (or the length of the major axis of an ellipse enclosing the face region) is used.

Having calculated R(i), the determination unit 20 calculates the maximum value of R(i), denoted R_max. That is, the determination unit 20 calculates the ratio R_max of the size of the largest face region to the size of the captured image.

Having calculated R_max, the determination unit 20 compares R_max with the first threshold F_low. If R_max is less than F_low, the determination unit 20 determines that the captured image is a landscape image. If R_max is equal to or greater than F_low, the determination unit 20 compares R_max with the second threshold F_high.

If R_max is equal to or greater than the second threshold F_high, the determination unit 20 determines that the captured image is a person image. If R_max is less than F_high, the determination unit 20 calculates the standard deviation σ of R(i). Formula (3) below is the formula for calculating the standard deviation σ.
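Formula (3) itself is not reproduced in this text. Assuming it is the ordinary population standard deviation of the ratios of the n recognized face regions, it would take the following form; whether the source divides by n or by n - 1 cannot be determined from the text, and the symbols n and \bar{R} are introduced here only for illustration.

```latex
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(R(i)-\bar{R}\bigr)^{2}},
\qquad
\bar{R} = \frac{1}{n}\sum_{i=1}^{n} R(i)
```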

Having calculated the standard deviation σ, the determination unit 20 compares σ with a third threshold (also referred to as F_stdev). If σ is less than F_stdev, the determination unit 20 determines that the captured image is a person image. If σ is equal to or greater than F_stdev, the determination unit 20 determines that the captured image is a landscape image.

As described above, when the determination unit 20 recognizes a plurality of face regions in the captured image, it determines that the captured image is a person image if the ratio R_max of the size of the largest face region to the size of the captured image is equal to or greater than the second threshold F_high. Even if R_max is less than F_high, provided that R_max is equal to or greater than the first threshold F_low, the determination unit 20 determines that the captured image is a person image if the standard deviation σ of the ratios R(i) of the face regions is less than the third threshold F_stdev.
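The multi-face decision summarized above can be sketched as one function. The function name and threshold arguments are assumptions; `statistics.pstdev` (population standard deviation) is used as a stand-in, since the text does not show whether the source's formula divides by n or n - 1.

```python
from statistics import pstdev


def classify_multi_face(ratios, f_low, f_high, f_stdev):
    """Classify an image with two or more recognized face regions.

    ratios holds R(i) = Sf(i) / Sp for each recognized face region.
    """
    r_max = max(ratios)
    if r_max < f_low:
        return "landscape"   # even the largest face is too small
    if r_max >= f_high:
        return "person"      # at least one face dominates the frame
    # F_low <= R_max < F_high: similar face sizes suggest a group portrait.
    sigma = pstdev(ratios)
    return "person" if sigma < f_stdev else "landscape"
```

The intuition behind the last branch is that medium-sized faces of similar size suggest a posed group, whereas widely varying sizes suggest people who merely happen to be in a scene.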

Instead of the determination using the standard deviation σ of the ratios R(i) of the face regions and the third threshold F_stdev, the determination unit 20 may make the determination using the variance λ of the ratios R(i) and a threshold for the variance λ. Also, instead of the standard deviation (or variance) of the ratios R(i), the determination unit 20 may use the standard deviation (or variance) of the face region sizes Sf(i), in which case a threshold for the face region sizes Sf(i) is used.

また、判定部２０は、撮像画像を人物画像と判定した場合には、第１の閾値Ｆlow以上の割合Ｒ（ｉ）である顔領域の数に基づいて被写体の人数を判定（計数）する。つまり、判定部２０は、第１の閾値Ｆlow以上の割合Ｒ（ｉ）である顔領域の１つひとつを被写体一人ひとりと判定し、第１の閾値Ｆlow以上の顔領域の数を被写体の人数とする。 When the determination unit 20 determines that the captured image is a person image, it determines (counts) the number of subjects based on the number of face areas whose ratio R(i) is equal to or greater than the first threshold Flow. That is, the determination unit 20 treats each face area whose ratio R(i) is equal to or greater than the first threshold Flow as one subject, and takes the number of such face areas as the number of subjects.
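The counting rule above can be sketched in a few lines of Python. This is only an illustration: the function name `count_subjects` and the numeric values are assumptions, not part of the disclosure; only the rule "count face areas with R(i) ≥ Flow" comes from the text.

```python
def count_subjects(ratios, f_low):
    """Count face areas whose ratio R(i) is at least the first threshold Flow.

    ratios: face-area-to-image-area ratios R(i), one per detected face.
    f_low:  first threshold Flow (the value used below is hypothetical).
    """
    return sum(1 for r in ratios if r >= f_low)


# Three detected faces, one of them too small to count as a subject.
print(count_subjects([0.30, 0.06, 0.01], f_low=0.05))
```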

判定部２０は、判定結果を文章作成部３０に出力する。具体的には、判定部２０は、撮像画像を人物画像と判定した場合には、人物画像である旨の判定結果を示す画像判定結果情報、及び、被写体人数の判定結果を示す人数判定結果情報を文章作成部３０に出力する。一方、判定部２０は、撮像画像を風景画像と判定した場合には、風景画像である旨の判定結果を示す画像判定結果情報を文章作成部３０に出力する。 The determination unit 20 outputs the determination result to the sentence creation unit 30. Specifically, when the determination unit 20 determines that the captured image is a person image, it outputs to the sentence creation unit 30 image determination result information indicating that the image is a person image, together with number determination result information indicating the determined number of subjects. On the other hand, when the determination unit 20 determines that the captured image is a landscape image, it outputs to the sentence creation unit 30 image determination result information indicating that the image is a landscape image.
また、判定部２０は画像入力部１０から取得した当該撮像画像を文章作成部３０に出力する。 In addition, the determination unit 20 outputs the captured image acquired from the image input unit 10 to the sentence creation unit 30.

文章作成部３０は、判定部２０から判定結果及び撮像画像を取得する。文章作成部３０は、取得した判定結果に応じて、人物画像用テンプレート又は風景画像用テンプレートの何れかの文章テンプレートを記憶部９０から読み出す。具体的には、文章作成部３０は、人物画像である旨の判定結果を示す画像判定結果情報を取得した場合には、記憶部９０に記憶されている２種類の人物画像用テンプレートの中からランダムに選択された一方の人物画像用テンプレートを読み出す。また、文章作成部３０は、風景画像である旨の判定結果を示す画像判定結果情報を取得した場合には、記憶部９０に記憶されている２種類の風景画像用テンプレートの中からランダムに選択された一方の人物画像用テンプレートを読み出す。 The sentence creation unit 30 acquires the determination result and the captured image from the determination unit 20. According to the acquired determination result, the sentence creation unit 30 reads either a person image template or a landscape image template from the storage unit 90. Specifically, when the sentence creation unit 30 acquires image determination result information indicating that the image is a person image, it reads one person image template selected at random from the two types of person image templates stored in the storage unit 90. When the sentence creation unit 30 acquires image determination result information indicating that the image is a landscape image, it reads one landscape image template selected at random from the two types of landscape image templates stored in the storage unit 90 (the original text says "person image template" here, which appears to be a clerical error; compare step S106).

文章作成部３０は、読み出した文章テンプレート（人物画像用テンプレート又は風景画像用テンプレート）の空欄部に撮像画像の特徴量又は撮像条件に応じた単語を挿入して当該撮像画像に対する文章を作成する。特徴量に応じた単語とは、撮像画像の配色パターンに応じた形容詞、又は、被写体の人数に応じた単語（人数に関連する単語）である。また、撮像画像の撮像条件に応じた単語とは、撮像日時に応じた単語（日時に関連する単語）、又は、撮像場所に応じた単語（場所に関連する単語）である。 The sentence creation unit 30 creates a sentence for the captured image by inserting a word corresponding to the feature amount or the imaging condition of the captured image into the blank part of the read sentence template (person image template or landscape image template). The word corresponding to the feature amount is an adjective corresponding to the color arrangement pattern of the captured image, or a word corresponding to the number of subjects (word related to the number of subjects). The word corresponding to the imaging condition of the captured image is a word corresponding to the imaging date and time (word related to the date and time) or a word corresponding to the imaging location (word related to the location).

一例として、文章作成部３０は、図２（ａ）に示す人物画像用テンプレートを読み出した場合には、人数判定結果情報から当該撮像画像の被写体の人数を取得し、当該人数に対応付けて記憶されている単語（人数に関連する単語）を記憶部９０から読み出して空欄部｛人数｝に挿入し、当該撮像画像の配色パターンを抽出し、抽出した配色パターンに対応付けて記憶されている単語（人物画像用の形容詞）を記憶部９０から読み出して空欄部｛形容詞｝に挿入し、当該撮像画像に対する文章を作成する。具体的には、被写体の人数が「１」、配色パターンが第１色「色１」、第２色「色２」、第３色「色３」であるならば、文章作成部３０は、文章『ひとりですごしたクールな思い出』を作成する。 As an example, when the sentence creation unit 30 has read the person image template shown in FIG. 2(a), it acquires the number of subjects of the captured image from the number determination result information, reads the word stored in association with that number (a word related to the number of people) from the storage unit 90, and inserts it into the blank part {number of people}. It then extracts the color arrangement pattern of the captured image, reads the word stored in association with the extracted color arrangement pattern (an adjective for person images) from the storage unit 90, and inserts it into the blank part {adjective}, thereby creating a sentence for the captured image. Specifically, if the number of subjects is "1" and the color arrangement pattern is first color "color 1", second color "color 2", and third color "color 3", the sentence creation unit 30 creates the sentence "Cool memories spent alone".

他の例として、文章作成部３０は、図２（ｂ）に示す人物画像用テンプレートを読み出した場合には、図２（ａ）の場合と同様、記憶部９０から人数に関連する単語を読み出して空欄部｛人数｝に挿入し、記憶部９０から人物画像用の形容詞を読み出して空欄部｛形容詞｝に挿入し、当該撮像画像に対する文章を作成する。具体的には、被写体の人数が「１０」、配色パターンが第１色「色５」、第２色「色４」、第３色「色２」であるならば、文章作成部３０は、文章『熱い感じで？大勢でポーズ！！』を作成する。 As another example, when the sentence creation unit 30 has read the person image template shown in FIG. 2(b), it reads a word related to the number of people from the storage unit 90 and inserts it into the blank part {number of people}, and reads an adjective for person images from the storage unit 90 and inserts it into the blank part {adjective}, as in the case of FIG. 2(a), thereby creating a sentence for the captured image. Specifically, if the number of subjects is "10" and the color arrangement pattern is first color "color 5", second color "color 4", and third color "color 2", the sentence creation unit 30 creates the sentence "With a hot feeling? Posing with a crowd!!".

他の例として、文章作成部３０は、図２（ｃ）に示す風景画像用テンプレートを読み出した場合には、当該撮像画像の付加情報（例えばＥｘｉｆ）から撮像日時を取得し、取得した撮像日時に対応付けて記憶されている単語（日時に関連する単語）を記憶部９０から読み出して空欄部｛日時｝に挿入し、当該撮像画像の配色パターンを抽出し、抽出した配色パターンに対応付けて記憶されている単語（風景画像用の形容詞）を記憶部９０から読み出して空欄部｛形容詞｝に挿入し、当該撮像画像に対する文章を作成する。 As another example, when the sentence creation unit 30 has read the landscape image template shown in FIG. 2(c), it acquires the imaging date and time from the additional information (for example, Exif) of the captured image, reads the word stored in association with the acquired imaging date and time (a word related to the date and time) from the storage unit 90, and inserts it into the blank part {date and time}. It then extracts the color arrangement pattern of the captured image, reads the word stored in association with the extracted color arrangement pattern (an adjective for landscape images) from the storage unit 90, and inserts it into the blank part {adjective}, thereby creating a sentence for the captured image.
具体的には、記憶部９０に８月に対応付けて単語「真夏！！」が記憶されている場合に、撮像日時が２０１１年８月１０日、配色パターンが第１色「色５」、第２色「色４」、第３色「色２」であるならば、文章作成部３０は、文章『真夏！！。暑い感じの一枚』を作成する。 Specifically, when the word "Midsummer!!" is stored in the storage unit 90 in association with August, and the imaging date is August 10, 2011 and the color arrangement pattern is first color "color 5", second color "color 4", and third color "color 2", the sentence creation unit 30 creates the sentence "Midsummer!!. A shot with a hot feel".
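As an illustration of the date lookup in this example, the following Python sketch maps an Exif-style timestamp to a stored word. It assumes the timestamp is given as the string form used by Exif date fields ("YYYY:MM:DD HH:MM:SS"); the table `MONTH_WORDS` and the function `date_word` are hypothetical names, and only the August entry is taken from the text.

```python
from datetime import datetime

# Hypothetical stand-in for the words that storage unit 90 associates with dates.
# Only the August entry appears in the description.
MONTH_WORDS = {8: "Midsummer!!"}


def date_word(exif_datetime):
    """Return the word associated with the capture month, or None if there is none."""
    taken = datetime.strptime(exif_datetime, "%Y:%m:%d %H:%M:%S")
    return MONTH_WORDS.get(taken.month)


print(date_word("2011:08:10 12:00:00"))
```

Keying the table on the month alone is a simplification; the description only says that words are stored "in association with" the imaging date and time.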

他の例として、文章作成部３０は、図２（ｄ）に示す風景画像用テンプレートを読み出した場合には、当該撮像画像の付加情報から撮像場所を取得し、取得した撮像場所に対応付けて記憶されている単語（場所に関連する単語）を記憶部９０から読み出して空欄部｛場所｝に挿入し、当該撮像画像の配色パターンを抽出し、抽出した配色パターンに対応付けて記憶されている単語（風景画像用の形容詞）を記憶部９０から読み出して空欄部｛形容詞｝に挿入し、当該撮像画像に対する文章を作成する。 As another example, when the sentence creation unit 30 has read the landscape image template shown in FIG. 2(d), it acquires the imaging location from the additional information of the captured image, reads the word stored in association with the acquired imaging location (a word related to the location) from the storage unit 90, and inserts it into the blank part {location}. It then extracts the color arrangement pattern of the captured image, reads the word stored in association with the extracted color arrangement pattern (an adjective for landscape images) from the storage unit 90, and inserts it into the blank part {adjective}, thereby creating a sentence for the captured image.
具体的には、記憶部９０に京都駅に対応付けて単語「古都」が記憶されている場合に、撮像場所が京都駅前、配色パターンが第１色「色１」、第２色「色２」、第３色「色５」であるならば、文章作成部３０は、文章『古都。あのときの柔らかい景色！』を作成する。 Specifically, when the word "Old capital" is stored in the storage unit 90 in association with Kyoto Station, and the imaging location is in front of Kyoto Station and the color arrangement pattern is first color "color 1", second color "color 2", and third color "color 5", the sentence creation unit 30 creates the sentence "Old capital. The soft scenery of that time!".

文章を作成した文章作成部３０は、作成した文章、及び、撮像画像を文章付加部４０に出力する。文章付加部４０は、文章作成部３０から文章及び撮像画像を取得する。文章付加部４０は、当該撮像画像に当該文章を付加（合成）する。 The sentence creation unit 30 that created the sentence outputs the created sentence and the captured image to the sentence addition unit 40. The text adding unit 40 acquires a text and a captured image from the text creating unit 30. The sentence adding unit 40 adds (synthesizes) the sentence to the captured image.

続いて、画像処理装置１の動作を説明する。図５及び図６は、画像処理装置１の動作の一例を示すフローチャートである。 Next, the operation of the image processing apparatus 1 will be described. FIGS. 5 and 6 are flowcharts showing an example of the operation of the image processing apparatus 1.

図５において、画像入力部１０は、撮像画像を入力する（ステップＳ１０）。画像入力部１０は、撮像画像を判定部２０に出力する。判定部２０は、撮像画像内に顔領域が１つ以上あるか否かを判定する（ステップＳ１２）。判定部２０は、撮像画像内に顔領域が１つ以上あると判定した場合（ステップＳ１２：Ｙｅｓ）、撮像画像の大きさに対する顔領域の大きさの割合を顔領域毎に算出し（ステップＳ１４）、当該割合の最大値を算出する（ステップＳ１６）。 In FIG. 5, the image input unit 10 inputs a captured image (step S10). The image input unit 10 outputs the captured image to the determination unit 20. The determination unit 20 determines whether or not there is one or more face areas in the captured image (step S12). When determining that there is one or more face areas in the captured image (step S12: Yes), the determination unit 20 calculates a ratio of the size of the face area to the size of the captured image for each face area (step S14). ), And calculates the maximum value of the ratio (step S16).

ステップＳ１６に続いて、判定部２０は、ステップＳ１６にて算出した最大値が第１の閾値以上であるか否かを判定する（ステップＳ２０）。判定部２０は、ステップＳ１６にて算出した最大値が第１の閾値以上であると判定した場合（ステップＳ２０：Ｙｅｓ）、当該最大値が第２の閾値以上であるか否かを判定する（ステップＳ２２）。判定部２０は、当該最大値が第２の閾値以上であると判定した場合（ステップＳ２２：Ｙｅｓ）、撮像画像は人物画像であると判定する（ステップＳ３０）。ステップＳ３０に続いて、判定部２０は、第１の閾値以上の割合である顔領域の数を被写体の人数として計数する（ステップ３２）。ステップＳ３２に続いて、判定部２０は、判定結果（人物画像である旨の判定結果を示す画像判定結果情報、及び、被写体人数の判定結果を示す人数判定結果情報）、及び、撮像画像を文章作成部３０に出力する。 Subsequent to step S16, the determination unit 20 determines whether or not the maximum value calculated in step S16 is equal to or greater than the first threshold (step S20). If the determination unit 20 determines that the maximum value calculated in step S16 is equal to or greater than the first threshold (step S20: Yes), it determines whether or not the maximum value is equal to or greater than the second threshold (step S22). If the determination unit 20 determines that the maximum value is equal to or greater than the second threshold (step S22: Yes), it determines that the captured image is a person image (step S30). Subsequent to step S30, the determination unit 20 counts the number of face areas whose ratio is equal to or greater than the first threshold as the number of subjects (step S32). Subsequent to step S32, the determination unit 20 outputs the determination result (image determination result information indicating that the image is a person image, and number determination result information indicating the determined number of subjects) and the captured image to the sentence creation unit 30.

一方、ステップＳ２２において、最大値が第２の閾値未満であると判定した場合（ステップＳ２２：Ｎｏ）、判定部２０は、撮像画像内に顔領域が２つ以上あるか否かを判定する（ステップＳ４０）。判定部２０は、撮像画像内に顔領域が２つ以上あると判定した場合（ステップＳ４０：Ｙｅｓ）、ステップＳ１４にて算出した割合の標準偏差を算出し（ステップＳ４２）、当該標準偏差が第３の閾値未満であるか否かを判定する（ステップＳ４４）。判定部２０は、当該標準偏差が第３の閾値未満であると判定した場合（ステップＳ４４：Ｙｅｓ）、処理をステップＳ３０に進める。 On the other hand, when it is determined in step S22 that the maximum value is less than the second threshold (step S22: No), the determination unit 20 determines whether or not there are two or more face areas in the captured image (step S40). When the determination unit 20 determines that there are two or more face areas in the captured image (step S40: Yes), it calculates the standard deviation of the ratios calculated in step S14 (step S42) and determines whether or not the standard deviation is less than the third threshold (step S44). If the determination unit 20 determines that the standard deviation is less than the third threshold (step S44: Yes), the process proceeds to step S30.

一方、ステップＳ１２において、撮像画像内に顔領域が１つもないと判定した場合（ステップＳ１２：Ｎｏ）、又は、ステップＳ２０において、最大値が第１の閾値未満であると判定した場合（ステップＳ２０：Ｎｏ）、又は、ステップＳ４０において、撮像画像内に顔領域が１つしかないと判定した場合（ステップＳ４０：Ｎｏ）、判定部２０は、撮像画像は風景画像であると判定する（ステップＳ５０）。ステップＳ５０に続いて、判定部２０は、判定結果（風景画像である旨の判定結果を示す画像判定結果情報）を文章作成部３０に出力する。 On the other hand, when it is determined in step S12 that there is no face area in the captured image (step S12: No), when it is determined in step S20 that the maximum value is less than the first threshold (step S20: No), or when it is determined in step S40 that there is only one face area in the captured image (step S40: No), the determination unit 20 determines that the captured image is a landscape image (step S50). Subsequent to step S50, the determination unit 20 outputs the determination result (image determination result information indicating that the image is a landscape image) to the sentence creation unit 30.
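The branch structure of FIG. 5 (steps S12 to S50) can be summarized in a short Python sketch. This is an illustrative reading of the flowchart rather than the patented implementation: the function name `classify`, the use of the population standard deviation (`statistics.pstdev`), and all numeric threshold values are assumptions. The branch "S44: No leads to landscape" follows the description of FIG. 7(d).

```python
from statistics import pstdev


def classify(ratios, f_low, f_high, f_stdev):
    """Return "person" or "landscape" from the per-face ratios R(i).

    ratios:  face-area-to-image-area ratios, one per detected face (step S14).
    f_low, f_high, f_stdev: first, second and third thresholds (values hypothetical).
    """
    if not ratios:                      # S12: No face area at all
        return "landscape"              # S50
    r_max = max(ratios)                 # S16
    if r_max < f_low:                   # S20: No
        return "landscape"              # S50
    if r_max >= f_high:                 # S22: Yes
        return "person"                 # S30
    if len(ratios) < 2:                 # S40: No (only one face area)
        return "landscape"              # S50
    sigma = pstdev(ratios)              # S42 (population std is an assumption)
    return "person" if sigma < f_stdev else "landscape"  # S44


print(classify([0.30], 0.05, 0.20, 0.02))              # one large face
print(classify([0.10, 0.10, 0.10], 0.05, 0.20, 0.02))  # medium faces, uniform size
print(classify([0.10, 0.01], 0.05, 0.20, 0.02))        # medium faces, uneven size
```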

なお、上述のステップＳ４０は、顔領域が１つである撮像画像が、人物画像であると常に判定されるのを防止するための処理である。また、上述のステップＳ４０では、撮像画像内に、撮像画像の大きさに対する顔領域の大きさの割合が最大の顔領域の他に、大きさが揃った非常に小さい顔領域が非常に多数存在していれば、標準偏差は小さくなるため、人物画像であると判定される可能性がある。従って、上述のような判定をなるべく減らすために、判定部２０は、所定の大きさの顔領域が２以上あるか否かを判定してもよい。例えば、判定部２０は、上述の割合が第１の閾値以上である顔領域が２つ以上あるか否かを判定してもよい。 The above-described step S40 is a process for preventing a captured image with one face area from being always determined to be a person image. In step S40 described above, there are a very large number of very small face areas with the same size in addition to the face area having the largest ratio of the size of the face area to the size of the captured image. If so, the standard deviation is small, so that it may be determined that the image is a person image. Therefore, in order to reduce the above-described determination as much as possible, the determination unit 20 may determine whether there are two or more face regions having a predetermined size. For example, the determination unit 20 may determine whether there are two or more face regions in which the above-described ratio is equal to or greater than a first threshold value.
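The concern raised above can be reproduced numerically. In the following Python sketch, one medium-sized face is accompanied by many tiny detections of equal size; the standard deviation of the ratios stays small, so a plain "two or more faces" check would let the image through to the standard deviation test. The stricter check counts only faces at or above the first threshold. The helper name `has_two_sizable_faces` and all numbers are hypothetical, and the population standard deviation is an assumption.

```python
from statistics import pstdev


def has_two_sizable_faces(ratios, f_low):
    """Stricter variant of step S40: require two or more faces with R(i) >= Flow."""
    return sum(1 for r in ratios if r >= f_low) >= 2


# One medium face plus fifty tiny, equally sized detections.
ratios = [0.10] + [0.001] * 50

print(len(ratios) >= 2)                     # plain S40 check passes
print(pstdev(ratios) < 0.02)                # spread is small, so S44 would say "person"
print(has_two_sizable_faces(ratios, 0.05))  # stricter check rejects the image
```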

ステップＳ３２又はステップＳ５０に続いて、文章作成部３０は、判定部２０から取得した判定結果に応じて、人物画像用テンプレート又は風景画像用テンプレートの何れかの文章テンプレートを記憶部９０から読み出して、読み出した文章テンプレートの空欄部に撮像画像の特徴量又は撮像条件に応じた単語を挿入して当該撮像画像に対する文章を作成する（ステップＳ１００）。 Subsequent to step S32 or step S50, the sentence creation unit 30 reads out either a person image template or a landscape image template from the storage unit 90 according to the determination result acquired from the determination unit 20, A word corresponding to the feature amount or the imaging condition of the captured image is inserted into the blank portion of the read sentence template to create a sentence for the captured image (step S100).

図６は、ステップＳ１００の詳細である。図６において、文章作成部３０は、撮像画像が人物画像であるか否かを判断する（ステップＳ１０２）。具体的には、文章作成部３０は、判定部２０から判定結果として、人物画像である旨の判定結果を示す画像判定結果情報を取得していた場合には、撮像画像が人物画像であると判断し、風景画像である旨の判定結果を示す画像判定結果情報を取得していた場合には、撮像画像が人物画像でないと判断する。 FIG. 6 shows details of step S100. In FIG. 6, the text creation unit 30 determines whether or not the captured image is a person image (step S102). Specifically, if the sentence creation unit 30 has acquired image determination result information indicating a determination result indicating that the image is a person image as a determination result from the determination unit 20, the captured image is a person image. If image determination result information indicating a determination result indicating that the image is a landscape image has been acquired, it is determined that the captured image is not a person image.

文章作成部３０は、撮像画像が人物画像であると判断した場合（ステップＳ１０２：Ｙｅｓ）、記憶部９０から人物画像用テンプレートを読み出す（ステップＳ１０４）。具体的には、文章作成部３０は、記憶部９０に記憶されている２種類の人物画像用テンプレートの中からランダムに選択された一方の人物画像用テンプレートを読み出す。 When the sentence creation unit 30 determines that the captured image is a person image (step S102: Yes), the sentence creation unit 30 reads a person image template from the storage unit 90 (step S104). Specifically, the sentence creation unit 30 reads one person image template randomly selected from the two types of person image templates stored in the storage unit 90.

ステップＳ１０４に続いて、文章作成部３０は、被写体の人数に応じた単語を人物画像用テンプレートの空欄部｛人数｝に挿入する（ステップＳ１１０）。具体的には、文章作成部３０は、人数判定結果情報から被写体の人数を取得し、当該人数に対応付けて記憶されている単語（人数に関連する単語）を記憶部９０から読み出して人物画像用テンプレートの空欄部｛人数｝に挿入する。 Subsequent to step S104, the sentence creation unit 30 inserts a word corresponding to the number of subjects in the blank portion {number of people} of the person image template (step S110). Specifically, the sentence creation unit 30 acquires the number of subjects from the number determination result information, reads words stored in association with the number of persons (words related to the number of persons) from the storage unit 90, and reads the person image. Insert it into the blank field {number of people} of the template.

ステップＳ１１０に続いて、文章作成部３０は、撮像画像（人物画像）の配色パターンに応じた単語を人物画像用テンプレートの空欄部｛形容詞｝に挿入する（ステップＳ１２０）。具体的には、文章作成部３０は、撮像画像（人物画像）の中央部の領域の配色パターンを抽出し、当該配色パターンに対応付けて記憶されている単語（人物画像用の形容詞）を記憶部９０から読み出して人物画像用テンプレートの空欄部｛形容詞｝に挿入する。 Subsequent to step S110, the sentence creation unit 30 inserts a word corresponding to the color arrangement pattern of the captured image (person image) into the blank part {adjective} of the person image template (step S120). Specifically, the sentence creation unit 30 extracts the color arrangement pattern of the central area of the captured image (person image), reads the word stored in association with that color arrangement pattern (an adjective for person images) from the storage unit 90, and inserts it into the blank part {adjective} of the person image template.

一方、ステップＳ１０２において、文章作成部３０は、撮像画像が風景画像であると判断した場合（ステップＳ１０２：Ｎｏ）、記憶部９０から風景画像用テンプレートを読み出す（ステップＳ１０６）。具体的には、文章作成部３０は、記憶部９０に記憶されている２種類の風景画像用テンプレートの中からランダムに選択された一方の風景画像用テンプレートを読み出す。 On the other hand, when the text creation unit 30 determines in step S102 that the captured image is a landscape image (step S102: No), the text creation unit 30 reads a landscape image template from the storage unit 90 (step S106). Specifically, the text creation unit 30 reads one landscape image template selected at random from two types of landscape image templates stored in the storage unit 90.

ステップＳ１０６に続いて、文章作成部３０は、撮像画像（風景画像）の配色パターンに応じた単語を風景画像用テンプレートの空欄部｛形容詞｝に挿入する（ステップＳ１３０）。具体的には、文章作成部３０は、撮像画像（風景画像）の上部の領域の配色パターンを抽出し、当該配色パターンに対応付けて記憶されている単語（風景画像用の形容詞）を記憶部９０から読み出して風景画像用テンプレートの空欄部｛形容詞｝に挿入する。 Subsequent to step S106, the sentence creation unit 30 inserts a word corresponding to the color arrangement pattern of the captured image (landscape image) into the blank part {adjective} of the landscape image template (step S130). Specifically, the sentence creation unit 30 extracts the color arrangement pattern of the upper area of the captured image (landscape image), reads the word stored in association with that color arrangement pattern (an adjective for landscape images) from the storage unit 90, and inserts it into the blank part {adjective} of the landscape image template.

ステップＳ１２０又はステップＳ１３０に続いて、文章作成部３０は、読み出した文章テンプレートに空欄部｛日時｝が存在するか否かを判断する（ステップＳ１３２）。本実施例の場合、図２に示したように、図２（ｃ）の風景画像用テンプレートには空欄部｛日時｝が存在するが、図２（ａ）（ｂ）の人物画像用テンプレート及び図２（ｄ）の風景画像用テンプレートには空欄部｛日時｝が存在しない。従って、文章作成部３０は、ステップＳ１０６にて図２（ｃ）の風景画像用テンプレートを読み出していた場合には、空欄部｛日時｝が存在すると判断し、ステップＳ１０４にて図２（ａ）若しくは図２（ｂ）の人物画像用テンプレートを読み出していた場合、又は、ステップＳ１０６にて図２（ｄ）の風景画像用テンプレートを読み出していた場合には、空欄部｛日時｝が存在しないと判断する。 Subsequent to step S120 or step S130, the sentence creation unit 30 determines whether or not a blank part {date and time} exists in the read sentence template (step S132). In the present embodiment, as shown in FIG. 2, the landscape image template of FIG. 2(c) has a blank part {date and time}, whereas the person image templates of FIGS. 2(a) and 2(b) and the landscape image template of FIG. 2(d) do not. Accordingly, the sentence creation unit 30 determines that a blank part {date and time} exists if it read the landscape image template of FIG. 2(c) in step S106, and determines that no blank part {date and time} exists if it read the person image template of FIG. 2(a) or FIG. 2(b) in step S104 or the landscape image template of FIG. 2(d) in step S106.

文章作成部３０は、読み出した文章テンプレートに空欄部｛日時｝が存在すると判断した場合（ステップＳ１３２：Ｙｅｓ）、撮像画像の撮像条件（日時）に応じた単語を文章テンプレートの空欄部｛日時｝に挿入する（ステップＳ１４０）。具体的には、文章作成部３０は、撮像画像（風景画像）の付加情報から撮像日時を取得し、当該撮像日時に対応付けて記憶されている単語（日時に関連する単語）を記憶部９０から読み出して風景画像用テンプレートの空欄部｛日時｝に挿入する。一方、文章作成部３０は、読み出した文章テンプレートに空欄部｛日時｝が存在しないと判断した場合（ステップＳ１３２：Ｎｏ）、ステップＳ１４０を飛ばして処理をステップＳ１４２に進める。 When the sentence creation unit 30 determines that a blank part {date and time} exists in the read sentence template (step S132: Yes), it inserts a word corresponding to the imaging condition (date and time) of the captured image into the blank part {date and time} of the sentence template (step S140). Specifically, the sentence creation unit 30 acquires the imaging date and time from the additional information of the captured image (landscape image), reads the word stored in association with that imaging date and time (a word related to the date and time) from the storage unit 90, and inserts it into the blank part {date and time} of the landscape image template. On the other hand, when the sentence creation unit 30 determines that no blank part {date and time} exists in the read sentence template (step S132: No), it skips step S140 and proceeds to step S142.

ステップＳ１３２（Ｎｏ）又はステップＳ１４０に続いて、文章作成部３０は、読み出した文章テンプレートに空欄部｛場所｝が存在するか否かを判断する（ステップＳ１４２）。本実施例の場合、図２に示したように、図２（ｄ）の風景画像用テンプレートには空欄部｛場所｝が存在するが、図２（ａ）（ｂ）の人物画像用テンプレート及び図２（ｃ）の風景画像用テンプレートには空欄部｛場所｝が存在しない。従って、文章作成部３０は、ステップＳ１０６にて図２（ｄ）の風景画像用テンプレートを読み出していた場合には、空欄部｛場所｝が存在すると判断し、ステップＳ１０４にて図２（ａ）若しくは図２（ｂ）の人物画像用テンプレートを読み出していた場合、又は、ステップＳ１０６にて図２（ｃ）の風景画像用テンプレートを読み出していた場合には、空欄部｛場所｝が存在しないと判断する。 Subsequent to step S132 (No) or step S140, the sentence creation unit 30 determines whether or not a blank part {location} exists in the read sentence template (step S142). In the present embodiment, as shown in FIG. 2, the landscape image template of FIG. 2(d) has a blank part {location}, whereas the person image templates of FIGS. 2(a) and 2(b) and the landscape image template of FIG. 2(c) do not. Accordingly, the sentence creation unit 30 determines that a blank part {location} exists if it read the landscape image template of FIG. 2(d) in step S106, and determines that no blank part {location} exists if it read the person image template of FIG. 2(a) or FIG. 2(b) in step S104 or the landscape image template of FIG. 2(c) in step S106.

文章作成部３０は、読み出した文章テンプレートに空欄部｛場所｝が存在すると判断した場合（ステップＳ１４２：Ｙｅｓ）、撮像画像の撮像条件（場所）に応じた単語を文章テンプレートの空欄部｛場所｝に挿入する（ステップＳ１５０）。具体的には、文章作成部３０は、撮像画像（風景画像）の付加情報から撮像場所を取得し、当該撮像場所に対応付けて記憶されている単語（場所に関連する単語）を記憶部９０から読み出して風景画像用テンプレートの空欄部｛場所｝に挿入する。そして、図６に示すフローチャートは終了し、図５に示すフローチャートに戻る。一方、文章作成部３０は、読み出した文章テンプレートに空欄部｛場所｝が存在しないと判断した場合（ステップＳ１４２：Ｎｏ）、ステップＳ１５０は飛ばして、図５に示すフローチャートに戻る。 When the sentence creation unit 30 determines that a blank part {location} exists in the read sentence template (step S142: Yes), it inserts a word corresponding to the imaging condition (location) of the captured image into the blank part {location} of the sentence template (step S150). Specifically, the sentence creation unit 30 acquires the imaging location from the additional information of the captured image (landscape image), reads the word stored in association with that imaging location (a word related to the location) from the storage unit 90, and inserts it into the blank part {location} of the landscape image template. The flowchart shown in FIG. 6 then ends, and the process returns to the flowchart shown in FIG. 5. On the other hand, when the sentence creation unit 30 determines that no blank part {location} exists in the read sentence template (step S142: No), it skips step S150 and returns to the flowchart shown in FIG. 5.
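The blank-filling sequence of FIG. 6 (steps S102 to S150) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the English template strings, the blank names `{people}`, `{adjective}`, `{date}`, `{place}`, and the function `create_text` are all hypothetical stand-ins for the templates of FIG. 2 and the word tables of the storage unit 90, and the words are passed in directly rather than looked up. Random template selection (steps S104 and S106) is left out so that the result is deterministic.

```python
def create_text(template, is_person, adjective,
                people_word=None, date_word=None, place_word=None):
    """Fill the blanks of a sentence template in the order of FIG. 6."""
    text = template
    if is_person:                                       # S102: Yes
        text = text.replace("{people}", people_word)    # S110
    text = text.replace("{adjective}", adjective)       # S120 (person) / S130 (landscape)
    if "{date}" in text:                                # S132
        text = text.replace("{date}", date_word)        # S140
    if "{place}" in text:                               # S142
        text = text.replace("{place}", place_word)      # S150
    return text


# Hypothetical English renderings of two templates.
landscape_c = "{date}. A shot with a {adjective} feel"
person_a = "{adjective} memories spent {people}"

print(create_text(landscape_c, False, "hot", date_word="Midsummer!!"))
print(create_text(person_a, True, "Cool", people_word="alone"))
```

Checking for `{date}` and `{place}` regardless of image type mirrors the flowchart, which runs steps S132 and S142 after both the person and the landscape branch.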

図５に戻って、文章を作成した文章作成部３０は、作成した文章、及び、撮像画像を文章付加部４０に出力する。文章付加部４０は、文章作成部３０から文章及び撮像画像を取得する。文章付加部４０は、文章作成部３０から取得した撮像画像に、文章作成部３０から取得した文章を付加（合成）する。そして、図５に示すフローチャートは終了する。 Returning to FIG. 5, the sentence creation unit 30 that created the sentence outputs the created sentence and the captured image to the sentence addition unit 40. The text adding unit 40 acquires a text and a captured image from the text creating unit 30. The text adding unit 40 adds (synthesizes) the text acquired from the text creating unit 30 to the captured image acquired from the text creating unit 30. Then, the flowchart shown in FIG. 5 ends.

図７は、文章付加部４０によって文章を付加された撮像画像の一例である。図７（ａ）の撮像画像は、１人の顔が大きく写っているので人物画像であると判定されている。即ち、当該撮像画像は、撮像画像の大きさに対する顔領域の大きさの割合の最大値（当該１つの顔領域の割合）が第２の閾値以上であると判定されている（ステップＳ２２（Ｙｅｓ））。図７（ｂ）の撮像画像は、２人の顔が大きく写っているので人物画像であると判定されている。即ち、当該撮像画像は、撮像画像の大きさに対する顔領域の大きさの割合の最大値が第２の閾値以上であると判定されている（ステップＳ２２（Ｙｅｓ））。 FIG. 7 shows examples of captured images to which a sentence has been added by the sentence adding unit 40. The captured image in FIG. 7(a) is determined to be a person image because one person's face appears large in it. That is, for this captured image, the maximum value of the ratio of the size of the face area to the size of the captured image (the ratio of that single face area) is determined to be equal to or greater than the second threshold (step S22 (Yes)). The captured image in FIG. 7(b) is determined to be a person image because two people's faces appear large in it. That is, for this captured image, the maximum value of the ratio of the size of the face area to the size of the captured image is determined to be equal to or greater than the second threshold (step S22 (Yes)).

図７（ｃ）の撮像画像は、ある程度の大きさの顔が写っていて、かつ、大きさも揃っているので、人物画像であると判定されている。即ち、当該撮像画像は、撮像画像の大きさに対する顔領域の大きさの割合の最大値が、第１の閾値以上かつ第２の閾値未満であるが（ステップＳ２２（Ｎｏ））、標準偏差が第３の閾値未満であると判定されている（ステップＳ４４（Ｙｅｓ））。 The captured image in FIG. 7(c) is determined to be a person image because it contains faces of a certain size and the face sizes are uniform. That is, for this captured image, the maximum value of the ratio of the size of the face area to the size of the captured image is equal to or greater than the first threshold and less than the second threshold (step S22 (No)), but the standard deviation is determined to be less than the third threshold (step S44 (Yes)).

図７（ｄ）の撮像画像は、ある程度の大きさの顔が写っているが、大きさが揃っていないので、風景画像であると判定されている。即ち、当該撮像画像は、撮像画像の大きさに対する顔領域の大きさの割合の最大値が、第１の閾値以上かつ第２の閾値未満であるが（ステップＳ２２（Ｎｏ））、標準偏差が第３の閾値以上であると判定されている（ステップＳ４４（Ｎｏ））。図７（ｅ）の撮像画像は、顔が何も写っていないので、風景画像であると判定されている（ステップＳ１２（Ｎｏ））。 The captured image in FIG. 7(d) contains faces of a certain size, but the face sizes are not uniform, so it is determined to be a landscape image. That is, for this captured image, the maximum value of the ratio of the size of the face area to the size of the captured image is equal to or greater than the first threshold and less than the second threshold (step S22 (No)), and the standard deviation is determined to be equal to or greater than the third threshold (step S44 (No)). The captured image in FIG. 7(e) is determined to be a landscape image because no face appears in it (step S12 (No)).

以上、画像処理装置１によれば、撮像画像に対し、より柔軟な文字情報を付与することができる。即ち、画像処理装置１は、撮像画像を人物画像と風景画像とに分類し、人物画像に対しては、予め記憶している人物画像用テンプレートを使用して人物画像用の文章を作成し、風景画像に対しては、予め記憶している風景画像用テンプレートを使用して風景画像用の文章を作成するため、撮像内容に応じて、より柔軟な文字情報を付与することができる。 As described above, according to the image processing apparatus 1, more flexible character information can be given to the captured image. In other words, the image processing apparatus 1 classifies captured images into human images and landscape images, and for human images, creates a human image text using a human image template stored in advance, For landscape images, landscape image text is created using a prestored landscape image template, so that more flexible text information can be given according to the captured content.

なお、上記実施例では、画像入力部１０は、撮像画像の入力時に当該撮像画像を判定部２０に出力する例を説明したが、判定部２０が撮像画像を取得する態様はこれに限定されない。例えば、画像入力部１０は撮像画像の入力時に当該撮像画像を記憶部９０に記憶し、判定部２０は必要時に記憶部９０から所望の撮像画像を読み出して取得してもよい。 In the above-described embodiment, the image input unit 10 has described an example in which the captured image is output to the determination unit 20 when the captured image is input. However, the manner in which the determination unit 20 acquires the captured image is not limited thereto. For example, the image input unit 10 may store the captured image in the storage unit 90 when the captured image is input, and the determination unit 20 may read and acquire a desired captured image from the storage unit 90 when necessary.

なお、上記実施例では、配色パターンを構成する第１色の色数は、色１〜色５の５色を用いる例を説明したが、説明の便宜上であって、６色以上であってもよい。第２色、第３色についても同様である。また、上記実施例では、第１色〜第３色の３色から構成される配色パターンを用いる例を説明したが、配色パターンを構成する色数はこれに限定されない。例えば、２色又は４色以上から構成される配色パターンを用いてもよい。 In the above embodiment, an example was described in which the first color of a color arrangement pattern is chosen from five colors, color 1 to color 5, but this is only for convenience of explanation, and six or more colors may be used. The same applies to the second color and the third color. In addition, although the above embodiment describes a color arrangement pattern composed of three colors, the first to third colors, the number of colors constituting a color arrangement pattern is not limited to this. For example, a color arrangement pattern composed of two colors, or of four or more colors, may be used.

なお、上記実施例では、文章作成部３０は、撮像画像が人物画像である場合に、記憶部９０に記憶されている２種類の中からランダムに選択された一方の人物画像用テンプレートを読み出す例を説明したが、２種類の人物画像用テンプレートの中から読み出す一方を選択する態様はこれに限定されない。例えば、文章作成部３０は、操作部（非図示）を介してユーザが指定した一方の人物画像テンプレートを選択してもよい。同様に、文章作成部３０は、指定受付部を介してユーザが指定した一方の風景画像テンプレートを選択してもよい。 In the above embodiment, when the captured image is a person image, the sentence creation unit 30 reads one of the person image templates randomly selected from the two types stored in the storage unit 90. However, the mode of selecting one of the two types of person image templates to be read is not limited to this. For example, the text creation unit 30 may select one person image template designated by the user via the operation unit (not shown). Similarly, the text creation unit 30 may select one landscape image template designated by the user via the designation receiving unit.

また、上記実施例では、選択したテンプレートの空欄部に挿入するべき単語を記憶部９０から常に得られる例を説明したが、選択したテンプレートの空欄部に挿入するべき単語が記憶部９０から得られないときは、他のテンプレートを選択し直してもよい。例えば、ある撮像画像の文章を作成用に、空欄部｛場所｝を有する図２（ｄ）の風景画像用テンプレートを選択したが、当該撮像画像の付加情報から撮像場所を取得できなかったときは、空欄部｛場所｝を有しない図２（ｃ）に風景画像用テンプレートを選択しなおしてもよい。 In the above embodiment, an example was described in which the word to be inserted into a blank part of the selected template can always be obtained from the storage unit 90. When the word to be inserted into a blank part of the selected template cannot be obtained from the storage unit 90, however, another template may be selected instead. For example, if the landscape image template of FIG. 2(d), which has a blank part {location}, was selected to create a sentence for a captured image, but the imaging location could not be acquired from the additional information of that image, the landscape image template of FIG. 2(c), which has no blank part {location}, may be selected instead.
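One possible way to realize this reselection is to pick the first template whose blanks can all be filled from the words actually available. The Python sketch below is only an illustration of that policy: the function `choose_template`, the English template strings, and the blank names are hypothetical, and the description does not prescribe any particular selection order.

```python
import re


def choose_template(templates, available_words):
    """Return the first template whose blanks can all be filled, or None.

    templates:       candidate template strings with blanks such as {place}.
    available_words: dict mapping blank names to the words that could be obtained.
    """
    for template in templates:
        needed = set(re.findall(r"\{(\w+)\}", template))
        if needed <= available_words.keys():
            return template
    return None


landscape_d = "{place}. The {adjective} scenery of that time!"
landscape_c = "{date}. A shot with a {adjective} feel"

# The imaging location is missing from the additional information,
# so the template with {place} is skipped.
words = {"adjective": "hot", "date": "Midsummer!!"}
print(choose_template([landscape_d, landscape_c], words))
```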

In the above embodiment, the image processing apparatus 1 stores in the storage unit 90 a person image template having the blank {number of people} and the blank {adjective}, but the number and types of blanks in a person image template are not limited to these. For example, in addition to {number of people} and {adjective}, a person image template may have either or both of the blanks {date} and {location}. Further, when the image processing apparatus 1 includes various sensors, a person image template may have a blank {illuminance}, into which a word corresponding to an imaging condition (illuminance) of the captured image is inserted, a blank {temperature}, into which a word corresponding to an imaging condition (temperature) of the captured image is inserted, and so on.

A person image template does not necessarily have to include the blank {number of people}. One case in which it does not is when no sentence containing a word corresponding to the number of subjects is created for a person image. In that case, the image processing apparatus 1 naturally does not need to store a person image template having the blank {number of people} in the storage unit 90.
Another case in which a person image template has no {number of people} blank is when a plurality of person image templates, one for each number of subjects, are stored in the storage unit 90. In that case, instead of inserting a word corresponding to the number of subjects into the blank {number of people}, the image processing apparatus 1 reads the person image template corresponding to the number of subjects from the storage unit 90 and thereby creates a sentence that includes a word corresponding to the number of subjects.
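The second case, in which the template itself is chosen by the number of subjects, might look like the following sketch. The template texts, the dictionary `TEMPLATES_BY_COUNT`, and the default template for larger groups are assumptions made only for illustration.

```python
# Hypothetical templates keyed by the number of subjects; none has a
# {number of people} blank because the count is implied by the wording.
TEMPLATES_BY_COUNT = {
    1: "A {adjective} solo shot",
    2: "A {adjective} pair",
}
DEFAULT_TEMPLATE = "A {adjective} group photo"  # assumed fallback for 3 or more

def sentence_for_person_image(num_subjects, adjective):
    """Pick the template for this number of subjects, then fill {adjective}."""
    template = TEMPLATES_BY_COUNT.get(num_subjects, DEFAULT_TEMPLATE)
    return template.format(adjective=adjective)

print(sentence_for_person_image(2, "cheerful"))
```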

In the above embodiment, the image processing apparatus 1 stores in the storage unit 90 a landscape image template having the blanks {date} and {adjective} and a landscape image template having the blanks {location} and {adjective}, but the number and types of blanks in a landscape image template are not limited to these. For example, when the image processing apparatus 1 includes various sensors, a landscape image template may have the blanks {illuminance}, {temperature}, and so on described above.

In the above embodiment, the image processing apparatus 1 stores two types of person image templates in the storage unit 90, but it may instead store one type, or three or more types. Similarly, the image processing apparatus 1 may store one type, or three or more types, of landscape image templates in the storage unit 90.

In the above embodiment, when the image processing apparatus 1 creates a sentence for a captured image, it adds the sentence to the captured image. Alternatively, it may store the sentence in the storage unit 90 in association with the captured image.

The storage unit 90 may also store a first syntax, which is the sentence syntax used for images of a first type (for example, portraits), and a second syntax, which is the sentence syntax used for images of a second type (for example, landscapes).

When the first syntax and the second syntax are stored in the storage unit 90, the sentence creation unit 30 may create a sentence in the first syntax using predetermined text when the determination unit 20 determines that the captured image is an image of the first type (that is, a person image), and may create a sentence in the second syntax using predetermined text when the determination unit 20 determines that the captured image is an image of the second type (that is, a landscape image).

The image processing apparatus 1 may also include a text determination unit (not shown), distinct from the determination unit 20, that determines text corresponding to at least one of a feature amount and an imaging condition of the captured image. For example, when the image input unit 10 inputs (acquires) a captured image, the text determination unit determines, as the predetermined text used for creating the sentence, text corresponding to the feature amount and/or the imaging condition of that captured image. More specifically, a plurality of texts may, for example, be stored in advance in the storage unit 90 in association with feature amounts and imaging conditions, and the text determination unit selects from among them the text corresponding to the feature amount and/or the imaging condition.

That is, when the determination unit 20 determines that the captured image is an image of the first type, the sentence creation unit 30 creates a sentence in the first syntax using the text determined as described above by the unit (not shown) that determines text; when the determination unit 20 determines that the captured image is an image of the second type, it creates a sentence in the second syntax using the text determined in the same way.
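As a rough sketch of this variant, the syntax can be looked up by image type and the text by feature amount and imaging condition. Everything named here (the `SYNTAX` and `TEXT_TABLE` contents, the keys, the default text, and the function names) is an assumption for illustration; the description does not specify the stored texts or their keys.

```python
# Hypothetical syntaxes: the first for portraits, the second for landscapes.
SYNTAX = {
    "portrait": "What a {text} portrait",
    "landscape": "A {text} view",
}

# Hypothetical texts stored in advance, keyed by (feature amount, imaging condition).
TEXT_TABLE = {
    ("warm_colors", "daytime"): "bright",
    ("cool_colors", "night"): "quiet",
}

def determine_text(feature, condition, default="memorable"):
    """Select the stored text for this feature amount and imaging condition."""
    return TEXT_TABLE.get((feature, condition), default)

def compose_sentence(image_type, feature, condition):
    """Create a sentence in the syntax that matches the determined image type."""
    text = determine_text(feature, condition)
    return SYNTAX[image_type].format(text=text)

print(compose_sentence("portrait", "warm_colors", "daytime"))
```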

(Second Embodiment)
Next, a second embodiment of the present invention will be described with reference to the drawings. FIG. 8 is an example of a functional block diagram of an imaging apparatus 100 according to the second embodiment of the present invention.

As illustrated in FIG. 8, the imaging apparatus 100 includes an imaging unit 110, a buffer memory unit 130, an image processing unit 140, a display unit 150, a storage unit 160, a communication unit 170, an operation unit 180, and a CPU (central processing unit) 190.
The image processing unit 140 of the imaging apparatus 100 corresponds to the determination unit 20, the sentence creation unit 30, and the sentence addition unit 40 of the image processing apparatus 1 according to the first embodiment. The storage unit 160 of the imaging apparatus 100 corresponds to the storage unit 90 of the image processing apparatus 1 according to the first embodiment.

The imaging unit 110 includes an optical system 111, an image sensor 119, and an A/D conversion unit 120. The optical system 111 includes one or more lenses.

The image sensor 119, for example, converts an optical image formed on its light receiving surface into an electrical signal and outputs the signal to the A/D conversion unit 120.

When an imaging instruction is received via the operation unit 180, the image sensor 119 also causes the image data obtained at that time to be stored in the storage medium 200, via the A/D conversion unit 120 and the image processing unit 140, as captured image data of a still image.

In addition, while no imaging instruction has been received via the operation unit 180, the image sensor 119, for example, continuously outputs the image data it obtains to the display unit 150, via the A/D conversion unit 120 and the image processing unit 140, as through-image data (captured images).

なお、光学系１１１は、撮像装置１００に取り付けられて一体とされていてもよいし、撮像装置１００に着脱可能に取り付けられてもよい。 Note that the optical system 111 may be attached to and integrated with the imaging apparatus 100, or may be detachably attached to the imaging apparatus 100.

The A/D conversion unit 120 performs analog-to-digital conversion on the electrical signal produced by the image sensor 119 and outputs the resulting digital signal as captured image data (a captured image).

That is, the imaging unit 110 is controlled by the CPU 190 based on the set imaging conditions (for example, aperture value and exposure value), forms an optical image on the image sensor 119 via the optical system 111, and generates a captured image based on that optical image as converted into a digital signal by the A/D conversion unit 120.

操作部１８０は、例えば、電源スイッチ、シャッターボタン、十字キー、確定ボタン、および、その他の操作キーを含み、ユーザによって操作されることでユーザの操作入力を受け付け、ＣＰＵ１９０に出力する。 The operation unit 180 includes, for example, a power switch, a shutter button, a cross key, a confirmation button, and other operation keys. The operation unit 180 is operated by the user, receives a user operation input, and outputs it to the CPU 190.

The image processing unit 140 performs image processing on the image data stored in the buffer memory unit 130 based on image processing conditions stored in the storage unit 160. For example, the image processing unit 140 executes the processes of the determination unit 20, the sentence creation unit 30, and the sentence addition unit 40 of the image processing apparatus 1 according to the first embodiment. Here, the image data stored in the buffer memory unit 130 is the image data input to the image processing unit 140, for example, the captured image data or through-image data described above, or captured image data read from the storage medium 200.

表示部１５０は、例えば液晶ディスプレイであって、画像データ、操作画面などを表示する。例えば、表示部１５０は、画像処理部１４０によって文章が付加された撮像画像を表示する。 The display unit 150 is a liquid crystal display, for example, and displays image data, an operation screen, and the like. For example, the display unit 150 displays a captured image to which text is added by the image processing unit 140.

記憶部１６０は、種々の情報を記憶する。具体的には、記憶部１６０は、少なくとも、第１の実施形態による画像処理装置１の記憶部９０が記憶する情報を記憶する。 The storage unit 160 stores various information. Specifically, the storage unit 160 stores at least information stored in the storage unit 90 of the image processing apparatus 1 according to the first embodiment.

バッファメモリ部１３０は、撮像部１１０によって撮像された画像データを、一時的に記憶する。通信部１７０は、カードメモリ等の取り外しが可能な記憶媒体２００と接続され、この記憶媒体２００への撮影画像データの書込み、読み出し、または消去を行う。 The buffer memory unit 130 temporarily stores image data captured by the imaging unit 110. The communication unit 170 is connected to a removable storage medium 200 such as a card memory, and writes, reads, or deletes photographed image data in the storage medium 200.

The storage medium 200 is a storage unit that is detachably connected to the imaging apparatus 100 and stores, for example, photographed image data generated by the imaging unit 110. The CPU 190 controls each component of the imaging apparatus 100. The bus 300 is connected to the imaging unit 110, the CPU 190, the operation unit 180, the image processing unit 140, the display unit 150, the storage unit 160, the buffer memory unit 130, and the communication unit 170, and transfers image data, control signals, and the like output from each unit.

A program for executing each process of the image processing apparatus 1 according to the first embodiment may be recorded on a computer-readable recording medium, and the various processes of the image processing apparatus 1 described above may be performed by causing a computer system to read and execute the program recorded on that medium. The "computer system" here may include an OS and hardware such as peripheral devices. When a WWW system is used, the "computer system" also includes a homepage providing environment (or display environment). The "computer-readable recording medium" refers to a writable nonvolatile memory such as a flexible disk, a magneto-optical disk, a ROM, or a flash memory; a portable medium such as a CD-ROM; or a storage device such as a hard disk built into a computer system.

Furthermore, the "computer-readable recording medium" also includes a medium that holds the program for a certain period of time, such as a volatile memory (for example, DRAM (Dynamic Random Access Memory)) inside a computer system that serves as a server or a client when the program is transmitted via a network such as the Internet or a communication line such as a telephone line. The program may also be transmitted from a computer system that stores it in a storage device or the like to another computer system via a transmission medium or by a transmission wave in the transmission medium. Here, the "transmission medium" that transmits the program refers to a medium having a function of transmitting information, such as a network (communication network) like the Internet or a communication line like a telephone line. The program may implement only part of the functions described above. It may also be a so-called difference file (difference program), which implements those functions in combination with a program already recorded in the computer system.

以上、この発明の実施形態について図面を参照して詳述してきたが、具体的な構成はこの実施形態に限られるものではなく、この発明の要旨を逸脱しない範囲の設計等も含まれる。 The embodiment of the present invention has been described in detail with reference to the drawings. However, the specific configuration is not limited to this embodiment, and includes designs and the like that do not depart from the gist of the present invention.

FIG. 9 is a diagram schematically showing an example of a process for extracting the feature amounts of a captured image that are used to determine the sentence to be placed on the image. In the example of FIG. 9, the determination unit of the image processing apparatus classifies the scene of the captured image as a person image or a landscape image. The image processing apparatus then extracts feature amounts of the captured image according to the scene. For a person image, the feature amounts can be the number of faces (number of subjects) and the average color (color arrangement pattern); for a landscape image, the feature amount can be the average color (color arrangement pattern). Based on these feature amounts, the word (an adjective or the like) to be inserted into the person image template or the landscape image template is determined.

In the example of FIG. 9, the color arrangement pattern is a combination of a plurality of representative colors that make up the captured image. The color arrangement pattern can therefore represent the average color of the captured image. In one example, a "first color", a "second color", and a "third color" are defined as the color arrangement pattern, and the word (adjective) to be inserted into the sentence template for person images or landscape images can be determined based on the combination of these three colors, that is, on three average colors.

図９の例において、撮像画像のシーンは２種類（人物画像及び風景画像）に分類される。他の例において、撮像画像のシーンは、３種類以上（３、４、５、６、７、８、９、又は１０種類以上）に分類することができる。 In the example of FIG. 9, the scene of the captured image is classified into two types (person image and landscape image). In another example, the scene of the captured image can be classified into three or more types (3, 4, 5, 6, 7, 8, 9, or 10 types or more).

図１０は、画像上に配置される文章を決定するために用いられる撮像画像の特徴量を抽出するプロセスの別の一例を模式的に示す図である。図１０の例において、撮像画像のシーンを３種類以上に分類することができる。 FIG. 10 is a diagram schematically illustrating another example of the process of extracting the feature amount of the captured image used for determining the text arranged on the image. In the example of FIG. 10, the scene of the captured image can be classified into three or more types.

In the example of FIG. 10, the determination unit of the image processing apparatus determines whether the captured image is a person image (first mode image), a distant view image (second mode image), or another image (third mode image). First, as in the example of FIG. 9, the determination unit determines whether the captured image is a person image or an image other than a person image.

Next, when the captured image is not a person image, the determination unit determines whether it is a distant view image (second mode image) or another image (third mode image). This determination can be made using, for example, part of the image identification information attached to the captured image.

Specifically, the focal length, which is part of the image identification information, can be used to determine whether the captured image is a distant view image. The determination unit determines that the captured image is a distant view image when the focal length is greater than or equal to a preset reference distance, and that it is another image when the focal length is less than the reference distance. In this way, the captured image is classified into one of three scene types: person image (first mode image), distant view image (second mode image), or other image (third mode image). Examples of distant view images (second mode images) include landscape images of the sea, mountains, and the like; examples of other images (third mode images) include images of flowers, pets, and the like.
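The three-way classification above reduces to two tests, which can be sketched as follows. This is an illustrative sketch only: the function name, the string labels, and the way the person/non-person decision is passed in as a boolean are assumptions, and the unit and value of the reference distance are not given in the description.

```python
def classify_scene(is_person_image, focal_length, reference_distance):
    """Classify a captured image into one of the three modes of FIG. 10."""
    if is_person_image:
        return "person"        # first mode image
    if focal_length >= reference_distance:
        return "distant_view"  # second mode image (sea, mountains, ...)
    return "other"             # third mode image (flowers, pets, ...)

# Example values are arbitrary; only their ordering matters.
print(classify_scene(False, 12.0, 10.0))
```

Note that a focal length exactly equal to the reference distance is classified as a distant view, matching the "greater than or equal to" wording.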

図１０の例においても、撮像画像のシーンが分類された後、画像処理装置は、そのシーンに応じて、撮像画像の特徴量を抽出する。 Also in the example of FIG. 10, after the scene of the captured image is classified, the image processing apparatus extracts the feature amount of the captured image in accordance with the scene.

In the example of FIG. 10, when the captured image is a person image (first scene image), the number of faces (number of subjects) and/or a smile level can be used as the feature amount of the captured image for determining the sentence to be placed on the image. That is, when the captured image is a person image, the word to be inserted into the person image template can be determined based on the smile level determination result, in addition to or instead of the determination result for the number of faces (number of subjects). An example of a smile level determination method is described below with reference to FIG. 11.

In the example of FIG. 11, the determination unit of the image processing apparatus detects a face region in the person image by a method such as face recognition (step S5001). In one example, the smile degree of the person image is calculated by quantifying how far the corners of the mouth are raised. Various known face recognition techniques can be used to calculate the smile degree.

Next, the determination unit compares the smile degree with a preset first smile threshold α (step S5002). When the smile degree is determined to be greater than or equal to α, the determination unit determines that the smile level of the person image is "smile: large".

On the other hand, when the smile degree is determined to be less than α, the determination unit compares the smile degree with a preset second smile threshold β (step S5003). When the smile degree is determined to be greater than or equal to β, the determination unit determines that the smile level of the person image is "smile: medium". When the smile degree is determined to be less than β, the determination unit determines that the smile level of the person image is "smile: small".
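Steps S5002 and S5003 amount to a two-threshold comparison, sketched below. The function name and the numeric scale of the smile degree are assumptions; the sketch also assumes α is larger than β, which the description implies but does not state.

```python
def smile_level(smile_degree, alpha, beta):
    """Map a numeric smile degree to a smile level (assumes alpha > beta)."""
    if smile_degree >= alpha:   # step S5002
        return "large"
    if smile_degree >= beta:    # step S5003
        return "medium"
    return "small"

# Threshold values here are arbitrary examples.
print(smile_level(0.5, alpha=0.7, beta=0.4))
```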

The word to be inserted into the person image template is determined based on the smile level determined for the person image. Examples of words corresponding to the smile level "smile: large" include 「喜びいっぱいの」 ("full of joy") and 「とてもいい」 ("very nice"). Examples for "smile: medium" include 「嬉しそうな」 ("happy-looking") and 「いい穏やかな」 ("nice and calm"). Examples for "smile: small" include 「真剣そうな」 ("serious-looking") and 「クールな」 ("cool").

The above describes the case where the word inserted into the person image template is in the attributive form (rentaikei), but this is not a limitation; the word may, for example, be in the terminal form (shushikei). In that case, examples of words corresponding to "smile: large" include 「笑顔が素敵」 ("lovely smile") and 「すごくいい笑顔だね」 ("what a great smile"). Examples for "smile: medium" include 「にこやかだね」 ("all smiles") and 「いい表情」 ("nice expression"). Examples for "smile: small" include 「真剣そうです」 ("looks serious") and 「真面目そうです」 ("looks earnest").
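The example words above can be organized as a lookup keyed by smile level and grammatical form, as in the following sketch. The words themselves are quoted from the description; the table layout, the function names, and the `index` parameter used to pick one candidate are assumptions, since the description does not say how a single word is chosen among the candidates.

```python
# Example words from the description, keyed by (smile level, grammatical form).
WORDS = {
    ("large", "attributive"): ["喜びいっぱいの", "とてもいい"],
    ("medium", "attributive"): ["嬉しそうな", "いい穏やかな"],
    ("small", "attributive"): ["真剣そうな", "クールな"],
    ("large", "terminal"): ["笑顔が素敵", "すごくいい笑顔だね"],
    ("medium", "terminal"): ["にこやかだね", "いい表情"],
    ("small", "terminal"): ["真剣そうです", "真面目そうです"],
}

def words_for(level, form="attributive"):
    """Return the candidate words for a smile level and grammatical form."""
    return WORDS[(level, form)]

def pick_word(level, form="attributive", index=0):
    """Pick one candidate; the selection rule (by index) is hypothetical."""
    candidates = words_for(level, form)
    return candidates[index % len(candidates)]

print(pick_word("medium", "terminal", index=1))
```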

FIG. 12A is an example of an output image showing the result of operation of the image processing apparatus; this output image carries a sentence determined according to the example of FIG. 9. In the example of FIG. 12A, the captured image is determined to be a person image, and the number of subjects and the color arrangement pattern (average color) are extracted as feature amounts. The word to be inserted into the person image template is determined to be 「重厚な」 ("dignified") according to the color arrangement pattern, yielding the output shown in FIG. 12A. In other words, in the example of FIG. 12A, the word 「重厚な」 (an adjective in the attributive form) is determined based on the average color of the captured image.

FIG. 12B is another example of an output image showing the result of operation of the image processing apparatus; this output image carries a sentence determined according to the example of FIG. 10. In the example of FIG. 12B, the captured image is determined to be a person image, and the number of subjects and the smile level are extracted as feature amounts. The word to be inserted into the person image template is determined to be 「いい表情」 ("nice expression") according to the smile level, yielding the output shown in FIG. 12B. In other words, in the example of FIG. 12B, the word 「いい表情」 (terminal form) is determined based on the smile level of the person in the captured image. As the output of FIG. 12B shows, choosing the word from the smile level of a person image makes it possible to attach character information that is relatively close to the impression the image gives.

図１０に戻り、撮像画像が風景画像（第２シーン画像）又はその他の画像（第３シーン画像）の場合、画像上に配置される文章を決定するために用いられる撮像画像の特徴量として、平均色に代えて、代表色を用いることができる。代表色としては、配色パターンにおける「第１色」、すなわち撮像画像において最も頻度の多い色を用いることができる。あるいは、代表色は、以下に説明するように、クラスタリングを用いて決定することができる。 Returning to FIG. 10, when the captured image is a landscape image (second scene image) or another image (third scene image), as a feature amount of the captured image used to determine the text arranged on the image, A representative color can be used instead of the average color. As the representative color, the “first color” in the color arrangement pattern, that is, the most frequently used color in the captured image can be used. Alternatively, the representative color can be determined using clustering as described below.
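Taking the most frequent color as the representative color can be sketched as below. This covers only the "first color" option, not the clustering alternative, and it assumes pixels are given as hashable color tuples; the function name is hypothetical.

```python
from collections import Counter

def representative_color(pixels):
    """Return the color that occurs most often among the given pixels."""
    counts = Counter(pixels)
    if not counts:
        raise ValueError("image has no pixels")
    color, _count = counts.most_common(1)[0]
    return color

# A toy "image" of four pixels given as (R, G, B) tuples.
pixels = [(0, 0, 255), (0, 0, 255), (255, 255, 255), (0, 0, 255)]
print(representative_color(pixels))
```

In practice, exact color matches are rare in photographs, so colors would usually be quantized before counting; the description leaves the color resolution open.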

図１３は、撮像装置に含まれる画像処理部の内部構成を表す概略ブロック図である。図１３の例において、画像処理装置の画像処理部５０４０は、画像データ入力部５０４２と、解析部５０４４と、文章作成部５０５２と、文章付加部５０５４とを有する。画像処理部５０４０は、撮像部等で生成された画像データについて、各種の解析処理を行うことにより、画像データの内容に関する各種の情報を取得し、画像データの内容と整合性の高いテキストを作成し、画像データにテキストを付加することができる。 FIG. 13 is a schematic block diagram illustrating an internal configuration of an image processing unit included in the imaging apparatus. In the example of FIG. 13, the image processing unit 5040 of the image processing apparatus includes an image data input unit 5042, an analysis unit 5044, a text creation unit 5052, and a text addition unit 5054. The image processing unit 5040 performs various types of analysis processing on the image data generated by the imaging unit or the like, thereby acquiring various types of information regarding the content of the image data, and creating text that is highly consistent with the content of the image data. Then, text can be added to the image data.

The analysis unit 5044 includes a color information extraction unit 5046, a region extraction unit 5048, and a clustering unit 5050, and performs analysis processing on the image data. The color information extraction unit 5046 extracts, from the image data, first information relating to the color information of each pixel in the image data. Typically, the first information is a tally of the HSV values of all pixels in the image data. However, the first information need only indicate, for predetermined colors whose similarity relations are defined (for example, colors associated with a predetermined color space), how often each such color appears in the image (frequency in pixels, area ratio, or the like); the color resolution and the type of color space are not limited.

例えば、第１情報は、ＨＳＶ空間ベクトル（ＨＳＶ値）やＲＧＢ値で表されるそれぞれの色について、それぞれの色の画素が、画像データに幾つずつ含まれるか、を表す情報であっても良い。ただし、第１情報における色解像度は、演算処理の負担等を考慮して適宜変更すれば良く、また、色空間の種類もＨＳＶやＲＧＢに限られず、ＣＭＹ、ＣＭＹＫ等であっても良い。 For example, the first information may be information indicating how many pixels of each color are included in the image data for each color represented by an HSV space vector (HSV value) or RGB value. . However, the color resolution in the first information may be changed as appropriate in consideration of the burden of calculation processing, and the type of color space is not limited to HSV or RGB, and may be CMY, CMYK, or the like.

FIG. 14 is a flowchart showing the flow of representative color determination performed by the analysis unit 5044. In step S5101 of FIG. 14, the image processing apparatus starts calculating the representative color of specific image data 5060 (a captured image; see FIG. 15).

In step S5102, the image data input unit 5042 of the image processing apparatus outputs the image data to the analysis unit 5044. Next, the color information extraction unit 5046 of the analysis unit 5044 calculates first information 5062 about the color information of each pixel included in the image data (see FIG. 15).

FIG. 15 is a conceptual diagram showing the calculation of the first information 5062 performed by the color information extraction unit 5046 in step S5102. The color information extraction unit 5046 tallies the color information included in the image data 5060 for each color (for example, for each of 256 gradation levels) to obtain the first information 5062. The histogram in the lower part of FIG. 15 illustrates the first information 5062 calculated by the color information extraction unit 5046. The horizontal axis of the histogram represents color, and the vertical axis represents how many pixels of each color are included in the image data 5060.
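As an illustration of the tally described for step S5102, the following minimal Python sketch counts pixels per color level. The function name `first_information`, the `levels` parameter, and the assumption that each pixel is a single 8-bit color value are illustrative and are not taken from the specification.

```python
from collections import Counter

def first_information(pixels, levels=256):
    """Tally how many pixels fall on each color level.

    Assumptions (not from the specification): each pixel is one integer
    color value in 0..255, and `levels` divides 256 evenly.
    """
    step = 256 // levels              # width of one color bin
    counts = Counter(p // step for p in pixels)
    # Dense histogram: index = color level, value = number of pixels.
    return [counts.get(level, 0) for level in range(levels)]

hist = first_information([0, 0, 10, 255, 255, 255])
coarse = first_information([0, 0, 10, 255, 255, 255], levels=4)
```

Lowering `levels` corresponds to reducing the color resolution to lighten the computational load, as the description allows.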

In step S5103 of FIG. 14, the region extraction unit 5048 of the analysis unit 5044 extracts the main region in the image data 5060. For example, the region extraction unit 5048 extracts the in-focus region from the image data 5060 shown in FIG. 15 and identifies the central portion of the image data 5060 as the main region (see the main region 5064 in FIG. 16).

In step S5104 of FIG. 14, the region extraction unit 5048 of the analysis unit 5044 determines the target region for the clustering performed in step S5105. For example, as shown in the upper part of FIG. 16, when the region extraction unit 5048 has recognized part of the image data 5060 as the main region 5064 in step S5103 and extracted it, the clustering target is the first information 5062 corresponding to the main region 5064 (main first information 5066). The histogram in the lower part of FIG. 16 illustrates the main first information 5066.

On the other hand, when the region extraction unit 5048 has not extracted a main region 5064 from the image data 5060 in step S5103, it determines the first information 5062 corresponding to the entire area of the image data 5060, as shown in FIG. 15, to be the clustering target. Apart from the difference in the clustering target region, the subsequent processing is the same whether or not the main region 5064 was extracted, so the following description takes the case where the main region was extracted as an example.
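The choice of clustering target in step S5104 can be sketched as follows. This is a minimal illustration, assuming the image is a list of rows of color values and the main region is an axis-aligned box; the function name `clustering_target` and the box convention are hypothetical.

```python
def clustering_target(image, main_region=None):
    """Return the pixel values that will be clustered.

    `image` is a list of rows of color values. `main_region` is an optional
    (top, left, bottom, right) box with exclusive bottom/right bounds
    (an assumed convention). Without a main region the whole image is used.
    """
    if main_region is None:
        return [p for row in image for p in row]
    top, left, bottom, right = main_region
    return [p for row in image[top:bottom] for p in row[left:right]]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
whole = clustering_target(image)
center = clustering_target(image, main_region=(1, 1, 3, 3))
```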

In step S5105 of FIG. 14, the clustering unit 5050 of the analysis unit 5044 performs clustering on the main first information 5066, which is the first information 5062 of the region determined in step S5104. FIG. 17 is a conceptual diagram showing the result of the clustering performed by the clustering unit 5050 on the main first information 5066 of the main region 5064 shown in FIG. 16.

The clustering unit 5050 classifies, for example, the 256-level main first information 5066 (see FIG. 16) into a plurality of clusters by the k-means method. The clustering is not limited to the k-means method; in other examples, other methods such as the shortest-distance method can be used.

The upper part of FIG. 17 shows which cluster each pixel was assigned to, and the histogram in the lower part of FIG. 17 shows the number of pixels belonging to each cluster. Through the clustering by the clustering unit 5050, the 256-level main first information 5066 (FIG. 16) is classified into fewer than 256 clusters (three in the example of FIG. 17). The clustering result can include information about the size of each cluster and information about the color of each cluster (the position of the cluster in the color space).
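A k-means pass over a color histogram might look like the sketch below. It is a simplified one-dimensional, pixel-count-weighted version of Lloyd's algorithm with a deterministic start; the function name, the initialization, and the sample histogram are assumptions made for illustration, and a real implementation would more likely cluster in a three-dimensional color space.

```python
def kmeans_histogram(hist, k=3, iterations=20):
    """Cluster the occupied color levels of a histogram (weighted 1-D k-means).

    Returns (centers, sizes): the center level of each cluster and the number
    of pixels assigned to it in the last assignment pass.
    Assumes the histogram contains at least one non-zero bin.
    """
    levels = [i for i, n in enumerate(hist) if n > 0]
    lo, hi = levels[0], levels[-1]
    # Deterministic start: spread the k centers evenly over the occupied range.
    centers = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    sizes = [0] * k
    for _ in range(iterations):
        sums = [0.0] * k
        sizes = [0] * k
        for level in levels:
            # Assign the level to its nearest center, weighted by pixel count.
            j = min(range(k), key=lambda c: abs(level - centers[c]))
            sums[j] += level * hist[level]
            sizes[j] += hist[level]
        # Move each non-empty cluster to the weighted mean of its members.
        new_centers = [sums[j] / sizes[j] if sizes[j] else centers[j]
                       for j in range(k)]
        if new_centers == centers:    # converged
            break
        centers = new_centers
    return centers, sizes

hist = [0] * 256
hist[10], hist[12], hist[100], hist[200], hist[204] = 5, 5, 30, 8, 2
centers, sizes = kmeans_histogram(hist, k=3)
```

The returned `sizes` and `centers` correspond to the two kinds of information the clustering result can carry: the size of each cluster and its position in the color space.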

In step S5106, the clustering unit 5050 of the analysis unit 5044 determines the representative color of the image data 5060 based on the clustering result. In one example, when the clustering unit 5050 obtains a clustering result such as that shown in FIG. 17, it takes the color belonging to the largest cluster 5074, which contains the most pixels among the calculated clusters, as the representative color of the image data 5060.
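Selecting the representative color from a clustering result can be sketched in a few lines. The sketch assumes the result is given as two parallel lists, cluster colors and cluster sizes; these names and the sample values are illustrative only.

```python
def representative_color(cluster_colors, cluster_sizes):
    """Return the color of the cluster that contains the most pixels."""
    largest = max(range(len(cluster_sizes)), key=lambda j: cluster_sizes[j])
    return cluster_colors[largest]

# Three clusters, as in the FIG. 17 example; the values are made up.
color = representative_color([11.0, 100.0, 200.8], [10, 30, 10])
```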

When the calculation of the representative color is complete, the sentence creation unit 5052 creates text using the information about the representative color and attaches it to the image data 5060.

The sentence creation unit 5052 reads, for example, a sentence template for landscape images and applies a word corresponding to the generation date and time of the image data 5060 (for example, "2012/03/10") to the {date and time} slot of the sentence template. In this case, the analysis unit 5044 can retrieve information about the generation date and time of the image data 5060 from a storage medium or the like and output it to the sentence creation unit 5052.

The sentence creation unit 5052 also applies a word corresponding to the representative color of the image data 5060 to the {adjective} slot of the sentence template. The sentence creation unit 5052 reads the correspondence information from the storage unit 5028 and applies it to the sentence template. In one example, the storage unit 5028 stores a table in which colors and words are associated for each scene. The sentence creation unit 5052 can create a sentence (for example, "I found something very beautiful") using the word read from that table.
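Filling the slots of a sentence template can be illustrated as below. The template wording and the slot names `date` and `adjective` are assumptions modeled on the {date and time} and {adjective} slots in the description; the actual templates are not reproduced in this passage.

```python
def fill_template(template, words):
    """Replace each {slot} in the template with the corresponding word."""
    text = template
    for slot, word in words.items():
        text = text.replace("{" + slot + "}", word)
    return text

# Hypothetical landscape template with two blank slots.
template = "{date}: I found something {adjective}."
sentence = fill_template(template, {"date": "2012/03/10",
                                    "adjective": "very beautiful"})
```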

FIG. 18 shows image data 5080 to which text has been added by the series of processes described above.

FIG. 19 shows an example of image data to which text has been added by the same series of processes when the scene is a distant view image. In this case, the scene is classified as a distant view image and the representative color is determined to be blue. For example, in the table that associates colors with words for each scene, the word "refreshing" or the like is associated with the representative color "blue".

FIG. 20 shows an example of a table holding correspondence information between colors and words. In the table of FIG. 20, colors and words are associated for each scene: person images (first scene images), distant view images (second scene images), and other images (third scene images). In one example, when the representative color of the image data is "blue" and the scene is an other image (third scene image), the sentence creation unit 5052 selects the word corresponding to the representative color (for example, "elegant") from the correspondence information in the table and applies it to the {adjective} slot of the sentence template.
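A per-scene table of this kind can be modeled as a dictionary keyed by scene and color. The sketch below contains only the two entries mentioned in the text (blue for distant view images and blue for other images); the key names and the `lookup_word` helper are hypothetical, and the remaining entries of FIG. 20 are deliberately left out.

```python
# (scene, representative color) -> word. Only entries quoted in the text.
WORD_TABLE = {
    ("distant_view", "blue"): "refreshing",
    ("other", "blue"): "elegant",
}

def lookup_word(scene, color, default=None):
    """Return the word associated with a scene and representative color."""
    return WORD_TABLE.get((scene, color), default)

word_far = lookup_word("distant_view", "blue")
word_other = lookup_word("other", "blue")
```

Keying on the scene as well as the color is what lets the same representative color yield different words for different kinds of image.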

The correspondence table between colors and words can be set based on a color chart such as that of the PCCS color system, the CCIC color system, or the NCS color system.

FIG. 21 shows an example of a correspondence table for distant view images (second scene images) that uses the color chart of the CCIC color system. FIG. 22 shows an example of a correspondence table for other images (third scene images) that uses the color chart of the CCIC color system.

In FIG. 21, the horizontal axis corresponds to the hue of the representative color and the vertical axis corresponds to its tone. Using the table of FIG. 21 to determine the word means that the word is chosen from both the hue and the tone of the representative color, which makes it possible to add text that is relatively close to the impression a person would have. A specific example of setting text for a distant view image (second scene image) using the table of FIG. 21 is described below. For other images (third scene images), text can be set in the same way using the table of FIG. 22.

In FIG. 21, when the representative color is determined to be in region A5001, the name of the representative color (red, orange, yellow, blue, and so on) is applied directly as the word in the text. For example, when the hue of the representative color is "red (R)" and the tone is "vivid tone (V)", an adjective expressing that color, such as "bright red", is selected.

When the representative color is determined to be a color in region A5002, A5003, A5004, or A5005, an adjective associated with that color is applied as the word in the text. For example, when the representative color is determined to be the color of region A5003 (green), adjectives associated with green such as "comfortable" or "refreshing" are applied.

When the representative color is determined to be a color in regions A5001 to A5005 and its tone is vivid tone (V), strong tone (S), bright tone (B), or pale tone (LT), an adverb of degree (for example, "very" or "quite") is placed before the adjective.

When the representative color is determined to be in region A5006, that is, "white tone (white)", a word associated with white such as "pure" or "clear" is selected. When the representative color is determined to be in region A5007, that is, a gray color (light gray tone: ltGY, medium gray tone: mGY, or dark gray tone: dkGY), a safe adjective such as "beautiful" or "lovely" is selected. An image whose representative color is white or gray, that is, achromatic, often contains a wide variety of colors across the whole image. Using a word with little connection to color therefore prevents text with an off-target meaning from being added and allows text relatively close to the impression given by the image to be added.

When the representative color does not belong to any of regions A5001 to A5007, that is, when the representative color is a low tone (dark grayish tone) or black (black tone), characters with a predetermined meaning (a word or a sentence) can be selected as the text. Such characters include, for example, "Where is this?" and "Oh!". These words and sentences can be stored in the storage unit of the image processing apparatus as a "tweet dictionary".

When the representative color is determined to be a low tone or black, it can be difficult to determine the hue of the image as a whole. Even in such cases, using characters with little connection to color, as described above, prevents text with an off-target meaning from being added and allows text close to the impression given by the image to be added.
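The region-based rules for FIG. 21 can be summarized in a short sketch. It assumes the region label (A5001 to A5007, or none) and the tone code have already been derived from the representative color, since the mapping from hue and tone to regions is defined only by the figure. The function name, the English renderings of the words, and the choice of always taking the first candidate are illustrative assumptions.

```python
# Words quoted in the description; entries the text does not give are omitted.
ASSOCIATED_WORDS = {"A5003": ["comfortable", "refreshing"]}   # green
WHITE_WORDS = ["pure", "clear"]                                # region A5006
GRAY_WORDS = ["beautiful", "lovely"]                           # region A5007
TWEET_DICTIONARY = ["Where is this?", "Oh!"]                   # low tone or black
EMPHASIZED_TONES = {"V", "S", "B", "LT"}

def choose_text(region, tone, color_name=None):
    """Pick a word or phrase from the region and tone of the representative color.

    Deterministic sketch: always takes the first candidate. `region` is the
    FIG. 21 region label, or None when the color lies outside A5001 to A5007.
    """
    if region == "A5001":
        word = color_name                      # the color's own name
    elif region in ("A5002", "A5003", "A5004", "A5005"):
        word = ASSOCIATED_WORDS[region][0]     # KeyError if no entry is configured
    elif region == "A5006":
        return WHITE_WORDS[0]
    elif region == "A5007":
        return GRAY_WORDS[0]
    else:
        return TWEET_DICTIONARY[0]
    # Regions A5001 to A5005: add an adverb of degree for the listed tones.
    return "very " + word if tone in EMPHASIZED_TONES else word

green_vivid = choose_text("A5003", "V")
red_vivid = choose_text("A5001", "V", color_name="bright red")
dark = choose_text(None, "dkG")
```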

The example above describes the case where the sentence and the word are uniquely determined by the scene and the representative color, but this is not limiting; exception processing can occasionally be performed when selecting the sentence and the word. For example, once in every several times (for example, once in every 10 times), the text may be taken from the "tweet dictionary" described above. Because the displayed text then does not always follow a fixed pattern, the user is less likely to tire of it.
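One way to realize the once-in-N exception is a simple counter, as in the sketch below. The counter-based scheme, the function name `pick_text`, and the sample texts are assumptions; the description only states that the exception occurs once in every several times.

```python
def pick_text(call_index, regular_text, tweets, every=10):
    """Return `regular_text`, except on every `every`-th call (0-based index)."""
    if call_index % every == every - 1:
        # Exception processing: cycle through the tweet dictionary.
        return tweets[(call_index // every) % len(tweets)]
    return regular_text

tweets = ["Where is this?", "Oh!"]
shown = [pick_text(i, "A refreshing view", tweets) for i in range(20)]
```

A random draw with probability 1/N would serve the same purpose; the counter is used here only so that the behavior is reproducible.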

The example above describes the case where the sentence addition unit places the text generated by the sentence creation unit at the top or bottom of the image, but this is not limiting; for example, the text can also be placed outside the image (outside the frame).

The example above also describes the case where the position of the text is fixed within the image, but this is not limiting; for example, the text can be displayed so that it scrolls across the display unit of the image processing apparatus. This makes the input image less affected by the text or improves the visibility of the text.

The example above describes the case where text is always attached to the image, but this is not limiting; for example, text may be omitted for person images and attached for distant view images or other images.

The example above also describes the case where the sentence addition unit determines the display method (font, color, display position, and so on) of the text generated by the sentence creation unit by a predetermined method, but this is not limiting; the display method of the text can be determined in many ways. Some examples follow.

In one example, the user can modify the display method of the text (font, color, display position) via the operation unit of the image processing apparatus. The user can also change or delete the content (words) of the text. In addition, the user can set the entire text not to be displayed, that is, choose whether to show or hide the text.

In one example, the size of the text can be changed according to the scene of the input image. For example, the text can be made smaller when the scene of the input image is a person image and larger when the scene is a distant view image or an other image.

In one example, the text can be emphasized when it is combined with the image data. For example, when the input image is a person image, a speech balloon can be attached to the person and the text placed inside the balloon.

In one example, the display color of the text can be set with reference to the representative color of the input image. Specifically, a color that has the same hue as the representative color of the input image but a different tone can be used as the display color of the text. This allows text that harmonizes well with the input image to be added without the text standing out excessively.

In particular, when the representative color of the input image is white, exception processing may be performed in determining the display color of the text. In this exception processing, for example, the text color can be set to white and the area around the text set to black.
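The display-color rule, including the white exception, can be sketched with the standard `colorsys` module. Treating "tone" as the saturation and value components of HSV is an approximation chosen for this sketch, and the thresholds, the halved saturation, and the `surround` key are all illustrative assumptions rather than values from the specification.

```python
import colorsys

def text_style(rep_rgb, white_value=0.95, white_saturation=0.05):
    """Choose a text fill color from the representative color (8-bit RGB).

    Normal case: keep the hue and change the tone (here approximated by HSV
    saturation and value). White representative color: white text with a
    black surround.
    """
    r, g, b = (c / 255 for c in rep_rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    if s <= white_saturation and v >= white_value:
        return {"fill": (255, 255, 255), "surround": (0, 0, 0)}
    new_s = s * 0.5                      # softer than the representative color
    new_v = 0.4 if v > 0.6 else 0.9      # darker on bright colors, lighter on dark
    fill = tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, new_s, new_v))
    return {"fill": fill, "surround": None}

on_white = text_style((255, 255, 255))
on_blue = text_style((0, 0, 255))
```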

DESCRIPTION OF SYMBOLS: 1: image processing apparatus; 10: image input unit; 20: determination unit; 30: sentence creation unit; 40: sentence addition unit; 90: storage unit; 100: imaging apparatus; 110: imaging unit; 111: optical system; 119: imaging element; 120: AD conversion unit; 130: buffer memory unit; 140: image processing unit; 150: display unit; 160: storage unit; 170: communication unit; 180: operation unit; 190: CPU; 200: storage medium; 300: bus

Claims

An image processing apparatus comprising:
an image input unit that inputs a captured image;
a storage unit that stores, as sentence templates in which a sentence is completed by inserting a word into a predetermined blank portion, a person image template used to create a sentence for a person image whose subject is a person and a landscape image template used to create a sentence for a landscape image whose subject is a landscape;
a determination unit that determines whether the captured image is the person image or the landscape image; and
a sentence creation unit that, according to the determination result of the determination unit for the captured image, reads the sentence template that is either the person image template or the landscape image template from the storage unit, and creates a sentence for the captured image by inserting, into the blank portion of the read sentence template, a word corresponding to a feature amount or an imaging condition of the captured image.

The image processing apparatus according to claim 1, wherein the storage unit
stores, as the person image template, the sentence template in which the blank portion is set in a sentence written from the viewpoint of the person imaged as the subject, and
stores, as the landscape image template, the sentence template in which the blank portion is set in a sentence written from the viewpoint of the photographer who imaged the subject.

The image processing apparatus according to claim 1 or 2, wherein
the determination unit further determines, for the person image, the number of subjects as the feature amount, and
the sentence creation unit creates a sentence for the person image by inserting a word corresponding to the number of subjects into the blank portion.

The image processing apparatus according to claim 3, wherein,
in a case where a plurality of face areas are recognized in the captured image,
when the ratio of the size of the largest face area to the size of the captured image is equal to or greater than a first threshold and less than a second threshold that is equal to or greater than the first threshold, and the standard deviation or variance of the ratios of the plurality of face areas, or the standard deviation or variance of the sizes of the plurality of face areas, is less than a third threshold,
or when the ratio of the size of the largest face area is equal to or greater than the second threshold,
the determination unit determines that the captured image is the person image and determines the number of subjects based on the number of face areas whose ratio is equal to or greater than the first threshold.

The image processing apparatus according to any one of claims 1 to 4, wherein
the sentence creation unit creates a sentence by inserting, into the blank portion, an adjective corresponding to a color arrangement pattern of the captured image as the word corresponding to the feature amount of the captured image.

The image processing apparatus according to claim 5, wherein
the sentence creation unit creates a sentence by inserting, into the blank portion, an adjective corresponding to the color arrangement pattern of a predetermined area of the captured image, the predetermined area being determined according to whether the captured image is the person image or the landscape image.

An image processing apparatus comprising:
an image input unit that inputs a captured image;
a decision unit that decides text corresponding to at least one of a feature amount of the captured image and an imaging condition of the captured image;
a determination unit that determines whether the captured image is an image of a first type or an image of a second type different from the first type;
a storage unit that stores a first syntax, which is the syntax of a sentence used for the first type, and a second syntax, which is the syntax of a sentence used for the second type; and
a sentence creation unit that creates a sentence of the first syntax using the text decided by the decision unit when the determination unit determines that the captured image is an image of the first type, and creates a sentence of the second syntax using the text decided by the decision unit when the determination unit determines that the captured image is an image of the second type.

The image processing apparatus according to claim 7, wherein the first type is a portrait and the second type is a landscape.

An imaging apparatus comprising:
an imaging unit that images a subject and generates a captured image;
a storage unit that stores, as sentence templates in which a sentence is completed by inserting a word into a predetermined blank portion, a person image template used to create a sentence for a person image whose subject is a person and a landscape image template used to create a sentence for a landscape image whose subject is a landscape;
a determination unit that determines whether the captured image is the person image or the landscape image; and
a sentence creation unit that, according to the determination result of the determination unit for the captured image, reads the sentence template that is either the person image template or the landscape image template from the storage unit, and creates a sentence for the captured image by inserting, into the blank portion of the read sentence template, a word corresponding to a feature amount or an imaging condition of the captured image.

A program that causes a computer of an image processing apparatus, the image processing apparatus including a storage unit that stores, as sentence templates in which a sentence is completed by inserting a word into a predetermined blank portion, a person image template used to create a sentence for a person image whose subject is a person and a landscape image template used to create a sentence for a landscape image whose subject is a landscape, to execute:
an image input step of inputting a captured image;
a determination step of determining whether the captured image is the person image or the landscape image; and
a sentence creation step of reading, according to the determination result of the determination step for the captured image, the sentence template that is either the person image template or the landscape image template from the storage unit, and creating a sentence for the captured image by inserting, into the blank portion of the read sentence template, a word corresponding to a feature amount or an imaging condition of the captured image.

A decision unit that decides, from a captured image, characters having a predetermined meaning;
A determination unit that determines whether the captured image is a person image or an image different from the person image;
A storage unit that stores a first syntax, which is the syntax of a sentence used for the person image, and a second syntax, which is the syntax of a sentence used for an image different from the person image;
An output unit that outputs a sentence of the first syntax using the characters having the predetermined meaning when the determination unit determines that the captured image is the person image, and outputs a sentence of the second syntax using the characters having the predetermined meaning when the determination unit determines that the captured image is an image different from the person image.