JP5293102B2

JP5293102B2 - Image reproducing apparatus and program

Info

Publication number: JP5293102B2
Application number: JP2008292138A
Authority: JP
Inventors: 啓一安藤
Original assignee: Nikon Corp
Current assignee: Nikon Corp
Priority date: 2008-11-14
Filing date: 2008-11-14
Publication date: 2013-09-18
Anticipated expiration: 2028-11-14
Also published as: JP2010119012A

Description

本発明は、画像データを再生可能な画像再生装置および画像再生装置で実行可能なプログラムに関するものである。 The present invention relates to an image reproducing apparatus capable of reproducing image data and a program executable by the image reproducing apparatus.

デジタルカメラやコンピュータなど、記録媒体に記録されている画像を表示部にてスライドショー再生、もしくは動画（ムービー）再生するとともに、スピーカ等から音声の再生も行う、各種の画像再生装置がある（例えば特許文献１）。特許文献１によれば、画像は、その撮影日付に基づき、その当時流行していた楽曲をＢＧＭ（Back Ground Music）とともにスライドショー再生される。 There are various image reproducing apparatuses, such as digital cameras and computers, that reproduce images recorded on a recording medium on a display unit as a slide show or as a movie, and that also reproduce audio from a speaker or the like (see, for example, Patent Document 1). According to Patent Document 1, images are reproduced as a slide show with music that was popular at the time of shooting, chosen based on the shooting date, as BGM (background music).

例えば特許文献２には、音声データ付き画像の音声データを音声認識し、画像の検索に利用する技術が開示されている。 For example, Patent Document 2 discloses a technique that performs speech recognition on the audio data of an image with audio data and uses the result for image retrieval.
特開２００６−１６４２２９号公報 特開２００５−３４６２５９号公報 JP 2006-164229 A, JP 2005-346259 A

ところで、多数の画像データをスライドショー再生するなど、順番に再生する場合、記録日付順、ファイル名順、ランダム再生など、様々な再生順序が考えられる。しかし、日付順やファイル名順に再生すると、再生の順序は常に一定であり、興趣性が低い。一方、ランダムに再生する場合は、再生順序について予測が付かず、再生順序についてのユーザの希望が反映されない。 By the way, when reproducing a large number of image data in order such as a slide show, various reproduction orders such as a recording date order, a file name order, and a random reproduction can be considered. However, when playing back in date order or file name order, the order of playback is always constant and is less interesting. On the other hand, when playing back at random, the playback order is not predicted, and the user's desire about the playback order is not reflected.

一方、ユーザの指定する基準に従った順序で再生すれば、興趣が高まると考えられる。とりわけ、画像データに、被写体である人物の声を、ボイスメモなどの音声データとして付加している場合、人物の声を解析して、被写体を、性別その他の特性に分類し、再生順序に反映させれば、興趣性が高まると考えられえる。 On the other hand, if it is played back in the order according to the criteria specified by the user, it is thought that the interest will increase. In particular, when the voice of a person who is the subject is added to the image data as voice data such as a voice memo, the voice of the person is analyzed, the subject is classified into gender and other characteristics, and reflected in the playback order. If this is the case, it can be considered that interest is enhanced.

しかし特許文献１に記載の技術は、画像データに、撮影日付に基づいて音声データを付与するものであり、音声データがもとより関連付けられている画像データを処理する技術ではない。 However, the technique described in Patent Document 1 is to add sound data to image data based on the shooting date, and is not a technique for processing image data associated with sound data.

また、特許文献２に記載の技術では、画像データに関連付けられた音声データを音声認識して被写体を識別するため、きわめて詳細に画像データを分類でき、再生順序にも反映させられる可能性がある。しかし、音声認識の負荷が高く、画像再生装置の処理能力を大幅に消費してしまう。 The technique described in Patent Document 2 identifies the subject by performing speech recognition on the audio data associated with the image data, so it can classify image data in great detail, and that classification could potentially be reflected in the reproduction order. However, speech recognition imposes a heavy load and consumes much of the processing capacity of the image reproducing apparatus.

本発明は、このような課題に鑑み、多数の画像データを、簡便な処理によって、被写体に因んだ音声データの特性に応じて分類したり、特性が顕著な順番に並べたりして、再生順序に反映させ、興趣性のある再生順序で画像データを再生可能な画像再生装置およびプログラムを提供することを目的としている。 In view of these problems, an object of the present invention is to provide an image reproducing apparatus and a program that, by simple processing, classify a large number of image data according to the characteristics of audio data originating from the subject, or arrange them in order of how pronounced a characteristic is, reflect the result in the reproduction order, and thereby reproduce the image data in an entertaining order.

上記課題を解決するために、本発明にかかる画像再生装置の代表的な構成は、１つ以上の画像データ各々に関連付けられた音声データの周波数を解析する周波数解析部と、解析された周波数に応じて、話者の性別を含む音声データの特性情報を判定する音声判定部と、特性情報を１つ以上の画像データ各々に関連付ける関連付け部と、１つ以上の画像データのなかから、各々に関連付けられた特性情報に基づいて、再生対象となる再生対象画像データを選択する画像選択部と、選択された再生対象画像データを解析された周波数の昇順または降順に表示部に再生する再生制御部と、を備えることを特徴とする。
In order to solve the above problems, a typical configuration of an image reproducing apparatus according to the present invention comprises: a frequency analysis unit that analyzes the frequency of audio data associated with each of one or more image data; a voice determination unit that determines, according to the analyzed frequency, characteristic information of the audio data including the gender of the speaker; an association unit that associates the characteristic information with each of the one or more image data; an image selection unit that selects, from the one or more image data, reproduction target image data to be reproduced based on the characteristic information associated with each; and a reproduction control unit that reproduces the selected reproduction target image data on a display unit in ascending or descending order of the analyzed frequency.

上記の構成によれば、性別判定された画像データに、さらに周波数（声色）という特性が与えられ、声色が高い順（周波数の降順）または低い順（周波数の昇順）に画像データを再生可能である。これにより、声の高い、明るい人物が被写体として写っている画像データを優先的に再生するなどの、興趣のあるスライドショー再生が可能となる。
According to the above configuration, the gender-determined image data are further given the characteristic of frequency (voice tone) and can be reproduced in order from the highest voice (descending order of frequency) or from the lowest voice (ascending order of frequency). This enables entertaining slide show reproduction, such as preferentially reproducing image data in which a high-voiced, cheerful person appears as the subject.

上記の再生制御部は、選択された再生対象画像データを、順次切り換えて再生してよい。かかる構成によれば、いわゆるスライドショー再生が実行される。 The reproduction control unit may sequentially switch and reproduce the selected reproduction target image data. According to such a configuration, so-called slide show reproduction is executed.

上記の構成によれば、性別判定された画像データに、さらに周波数（声色）という特性が与えられ、声色が高い順（周波数の降順）または低い順（周波数の昇順）に画像データを再生可能である。これにより、声の高い、明るい人物が被写体として写っている画像データを優先的に再生するなどの、興趣のあるスライドショー再生が可能となる。 According to the above configuration, the gender-determined image data are further given the characteristic of frequency (voice tone) and can be reproduced in order from the highest voice (descending order of frequency) or from the lowest voice (ascending order of frequency). This enables entertaining slide show reproduction, such as preferentially reproducing image data in which a high-voiced, cheerful person appears as the subject.

上記の関連付け部は、音声データの特性情報を１つ以上の画像データ各々に付加してよい。これにより、音声データの特性情報は、画像データと一体的に保存されることによって、画像データに関連付けられる。 The associating unit may add audio data characteristic information to each of one or more pieces of image data. Thereby, the characteristic information of the sound data is associated with the image data by being stored integrally with the image data.

上記の関連付け部は、音声データの特性情報と、１つ以上の画像データ各々との関連付け状態を示す関連付けデータを作成し、関連付けデータをファイルとして保存してもよい。 The associating unit may create associating data indicating an associating state between the audio data characteristic information and each of the one or more pieces of image data, and store the associating data as a file.

上記の構成によれば、音声データの特性情報は、関連付けデータを参照することによって、いずれの画像データに関連付けられているかが分かる。 According to the configuration described above, it is possible to determine which image data the audio data characteristic information is associated with by referring to the association data.

上記課題を解決するために、本発明にかかるプログラムの代表的な構成は、１つ以上の画像データ各々に関連付けられた音声データの周波数を解析する周波数解析部と、解析された周波数に応じて、話者の性別を含む音声データの特性情報を判定する音声判定部と、特性情報を１つ以上の画像データ各々に関連付ける関連付け部と、１つ以上の画像データのなかから、各々に関連付けられた特性情報に基づいて、再生対象となる再生対象画像データを選択する画像選択部と、選択された再生対象画像データを解析された周波数の昇順または降順に表示部に再生する再生制御部として画像再生装置を機能させることを特徴とする。 In order to solve the above problems, a typical configuration of a program according to the present invention causes an image reproducing apparatus to function as: a frequency analysis unit that analyzes the frequency of audio data associated with each of one or more image data; a voice determination unit that determines, according to the analyzed frequency, characteristic information of the audio data including the gender of the speaker; an association unit that associates the characteristic information with each of the one or more image data; an image selection unit that selects, from the one or more image data, reproduction target image data to be reproduced based on the characteristic information associated with each; and a reproduction control unit that reproduces the selected reproduction target image data on a display unit in ascending or descending order of the analyzed frequency.

上述した画像再生装置における技術的思想に対応する構成要素やその説明は、当該プログラムにも適用可能である。 The components corresponding to the technical idea of the image reproduction apparatus described above and the description thereof can be applied to the program.

本発明によれば、多数の画像データを、簡便な処理によって、被写体に因んだ音声データの特性に応じて分類したり、特性が顕著な順番に並べたりして、再生順序に反映させ、興趣性のある再生順序で画像データを再生可能である。 According to the present invention, a large number of image data can be classified according to the characteristics of audio data caused by the subject by simple processing, or arranged in order of remarkable characteristics, and reflected in the reproduction order. Image data can be reproduced in an interesting reproduction order.

以下に添付図面を参照しながら、本発明の好適な実施形態について詳細に説明する。かかる実施形態に示す寸法、材料、その他具体的な数値などは、発明の理解を容易とするための例示に過ぎず、特に断る場合を除き、本発明を限定するものではない。なお、本明細書及び図面において、実質的に同一の機能、構成を有する要素については、同一の符号を付することにより重複説明を省略し、また本発明に直接関係のない要素は図示を省略する。さらに、信号や電流はそれらが通る線路の符号によって表記するものとする。 Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. The dimensions, materials, and other specific numerical values shown in the embodiments are merely examples to facilitate understanding of the invention and do not limit the present invention unless otherwise specified. In this specification and the drawings, elements having substantially the same function and configuration are denoted by the same reference numerals to avoid redundant description, and elements not directly related to the present invention are not illustrated. Signals and currents are denoted by the reference numerals of the lines through which they pass.

（デジタルカメラ） (Digital camera)
図１は、本発明による画像再生装置の実施形態であるデジタルカメラのブロック図であり、図２は、図１のデジタルカメラで再生される画像データと、画像データに関連付けられる音声データおよび特性情報を模式的に表す図である。 FIG. 1 is a block diagram of a digital camera that is an embodiment of the image reproducing apparatus according to the present invention. FIG. 2 schematically shows the image data reproduced by the digital camera of FIG. 1 together with the audio data and characteristic information associated with the image data.

（周波数解析部） (Frequency analysis unit)
デジタルカメラ１００は、１つ以上の画像データ１０２Ａ〜１０２Ｄ各々に関連付けられた音声データ１０４Ａ〜１０４Ｄの周波数を解析する周波数解析部１０６を備える。 The digital camera 100 includes a frequency analysis unit 106 that analyzes the frequency of the audio data 104A to 104D associated with each of the one or more image data 102A to 102D.
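The specification does not say how the frequency analysis unit 106 obtains the frequency. As one possible illustration only, the following minimal sketch estimates a fundamental frequency by autocorrelation, which is a swapped-in technique chosen here for its simplicity. The function names, the 60 to 400 Hz search range, and the synthetic test tone are all assumptions and do not come from the patent.

```python
import math

def estimate_fundamental_hz(samples, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate a fundamental frequency by picking the autocorrelation peak.

    Hypothetical helper: the search range fmin..fmax is an assumption.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(samples) - 1)
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]  # remove any DC offset
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(centered[i] * centered[i + lag]
                    for i in range(len(centered) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        return None  # no periodicity found in the search range
    return sample_rate / best_lag

def synth_tone(freq_hz, sample_rate=8000, seconds=0.128):
    """Generate a pure sine tone, used here only as stand-in voice data."""
    n = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
```

A real voice memo is far less clean than a sine tone, so a practical implementation would likely need windowing and voiced-segment detection on top of this.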

図２（ａ）では、画像データ１０２Ａ〜１０２Ｄのうち、代表として、相互に関連付けられ、記録媒体３７０に記録されている、画像データ１０２Ａ、音声データ１０４Ａおよび特性情報１０８Ａを例示する。図２（ｂ）は、画像データ１０２Ａ〜１０２Ｄ、音声データ１０４Ａ〜１０４Ｄおよび特性情報１０８Ａ〜１０８Ｄの関連付けを示す参照テーブルであり、図１の記録媒体３７０に記録される。これは後述の関連付け部１１２によって生成されるファイルであり、テキストファイルとしてもよい。 FIG. 2(a) shows, as a representative of the image data 102A to 102D, the image data 102A, audio data 104A, and characteristic information 108A, which are associated with one another and recorded on the recording medium 370. FIG. 2(b) is a reference table showing the association among the image data 102A to 102D, the audio data 104A to 104D, and the characteristic information 108A to 108D, and is recorded on the recording medium 370 of FIG. 1. It is a file generated by the association unit 112 described later and may be a text file.

画像データ１０２Ａ〜１０２Ｄは、本実施形態では静止画像であるが、動画像としてもよい。画像データ１０２Ａ〜１０２Ｄは本実施形態では、図２（ｂ）のように説明の便宜上４つしか示していないが、その数には制限はない。音声データ１０４Ａ〜１０４Ｄは、例えば、デジタルカメラ１００が静止画像に音声を関連付けて記録するボイスメモ機能によって取得された、ボイスメモとしてよい。なお、ボイスメモ機能における静止画像と音声との関連付けは、静止画像の撮影時に音声を録音して記録することにより行ってもよいし、記録媒体３７０に記録された静止画像を再生している際に音声を録音して記録することにより行ってもよい。 The image data 102A to 102D are still images in the present embodiment, but may be moving images. Only four image data 102A to 102D are shown in this embodiment, as in FIG. 2(b), for convenience of explanation, but their number is not limited. The audio data 104A to 104D may be, for example, voice memos acquired by a voice memo function with which the digital camera 100 records audio in association with a still image. In the voice memo function, the still image and the audio may be associated by recording the audio when the still image is shot, or by recording the audio while a still image recorded on the recording medium 370 is being reproduced.

画像データ１０２Ａ〜１０２Ｄのファイルフォーマットは、本実施形態ではＪＰＥＧ（Joint Photographic Experts Group）であるが、その他、ＴＩＦＦ（Tagged Image File Format）、ＧＩＦ（Graphics Interchange Format）、ＰＮＧ（Portable Network Graphics）、ビットマップ（Bitmap）など、あらゆるフォーマットの画像ファイルとしてよい。また、ＪＰＥＧ形式の圧縮データに撮影時の様々な情報を付加したＥｘｉｆ（Exchangeable Image File Format）としてもよい。 The file format of the image data 102A to 102D is JPEG (Joint Photographic Experts Group) in this embodiment, but may be an image file of any other format, such as TIFF (Tagged Image File Format), GIF (Graphics Interchange Format), PNG (Portable Network Graphics), or bitmap (Bitmap). It may also be Exif (Exchangeable Image File Format), in which various information from the time of shooting is added to JPEG compressed data.

音声データ１０４Ａ〜１０４Ｄのファイルフォーマットは、本実施形態ではＷＡＶ（WAVeform Audio Format。音声と画像をまとめて保存するＲＩＦＦ（Resource Interchange File Format）の一種）であるが、ＡＶＩ（Audio Video Interleave。ＲＩＦＦの一種）、ＷＭＡ（Windows（登録商標） Media Audio）、ＭＰ３（MPEG Audio Layer-3）など、あらゆるフォーマットの音声ファイルとしてよい。 The file format of the audio data 104A to 104D is WAV (WAVeform Audio Format, a kind of RIFF (Resource Interchange File Format), which stores audio and images together) in this embodiment, but may be an audio file of any format, such as AVI (Audio Video Interleave, also a kind of RIFF), WMA (Windows (registered trademark) Media Audio), or MP3 (MPEG Audio Layer-3).

（音声判定部） (Voice determination unit)
デジタルカメラ１００は、解析された周波数に応じた話者の性別（男性または女性）を含む、音声データ１０４Ａ〜１０４Ｄの特性情報１０８Ａ〜１０８Ｄを判定する音声判定部１１０を備える。 The digital camera 100 includes a voice determination unit 110 that determines the characteristic information 108A to 108D of the audio data 104A to 104D, including the gender (male or female) of the speaker according to the analyzed frequency.

音声判定部１１０は、声帯振動の基本周波数として、日本人の平均値を参考にし、例えば、周波数解析部１０６で解析された周波数が、１００〜１５０Ｈｚであれば男声、２２０〜２７０Ｈｚであれば女声、それらの中間であれば子供、と判定してよい。かかる判定基準は自由に変更してよい。 The voice determination unit 110 may refer to average values for Japanese speakers as the fundamental frequency of vocal cord vibration and determine, for example, that the voice is male if the frequency analyzed by the frequency analysis unit 106 is 100 to 150 Hz, female if it is 220 to 270 Hz, and a child's if it lies between those ranges. These criteria may be changed freely.
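The threshold rule described for the voice determination unit 110 can be sketched as follows. This is a minimal illustration assuming the example ranges given in the text; the function name and the "unknown" label for frequencies outside all three ranges are assumptions, since the specification does not say how such values are handled.

```python
def classify_voice(f0_hz):
    """Map a fundamental frequency in Hz to a speaker label.

    Ranges follow the example in the text: 100-150 Hz male,
    220-270 Hz female, anything in between a child.
    """
    if 100.0 <= f0_hz <= 150.0:
        return "male"
    if 220.0 <= f0_hz <= 270.0:
        return "female"
    if 150.0 < f0_hz < 220.0:
        return "child"
    return "unknown"  # assumption: the patent does not define this case
```

Because the text notes that the criteria may be changed freely, a practical version would probably take the range boundaries as parameters rather than hard-coding them.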

このように、周波数を用いれば、簡便な処理にて性別の判定が可能であるため、音声認識ほどの負荷をかけることなく、性別という、画像データ再生順序の基準となり得る特性を、音声データ１０４Ａ〜１０４Ｄの特性情報１０８Ａ〜１０８Ｄとして獲得できる。 Because gender can thus be determined from frequency by simple processing, gender, a characteristic that can serve as a criterion for the image data reproduction order, can be obtained as the characteristic information 108A to 108D of the audio data 104A to 104D without imposing as heavy a load as speech recognition.

（音量検出部） (Volume detection unit)
上記のデジタルカメラ１００は、さらに、１つ以上の画像データ１０２Ａ〜１０２Ｄ各々に関連付けられた音声データ１０４Ａ〜１０４Ｄの音量を検出して特性情報１０８Ａ〜１０８Ｄに付加する音量検出部１１１を備える。 The digital camera 100 further includes a volume detection unit 111 that detects the volume of the audio data 104A to 104D associated with each of the one or more image data 102A to 102D and adds it to the characteristic information 108A to 108D.
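The specification does not define how the volume detection unit 111 measures volume. One common choice, shown here purely as an assumption, is the RMS level of the samples expressed in decibels relative to full scale. The function name and the 16-bit full-scale value are hypothetical.

```python
import math

def rms_dbfs(samples, full_scale=32768.0):
    """Return the RMS level of PCM samples in dB relative to full scale.

    Assumes signed 16-bit samples by default; returns -inf for silence.
    """
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0.0:
        return float("-inf")
    return 20.0 * math.log10(rms / full_scale)
```

A single number per voice memo like this is enough to sort or group images by loudness, which is all the later sections require.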

（関連付け部） (Association unit)
デジタルカメラ１００は、特性情報１０８Ａ〜１０８Ｄを１つ以上の画像データ１０２Ａ〜１０２Ｄ各々に関連付ける関連付け部１１２を備える。 The digital camera 100 includes an association unit 112 that associates the characteristic information 108A to 108D with each of the one or more image data 102A to 102D.

関連付け部１１２は、音声データ１０４Ａ〜１０４Ｄの特性情報１０８Ａ〜１０８Ｄを１つ以上の画像データ１０２Ａ〜１０２Ｄ各々に付加してよい。この場合、例えば図２（ａ）の特性情報１０８Ａは、画像データ１０２Ａの一部、例えばヘッダ情報としてよい。これにより、音声データ１０４Ａ〜１０４Ｄの特性情報１０８Ａ〜１０８Ｄは、画像データ１０２Ａ〜１０２Ｄと一体的に保存されることによって、画像データに関連付けられる。 The association unit 112 may add the characteristic information 108A to 108D of the audio data 104A to 104D to each of the one or more image data 102A to 102D. In this case, for example, the characteristic information 108A in FIG. 2(a) may be part of the image data 102A, such as header information. The characteristic information 108A to 108D of the audio data 104A to 104D is thereby associated with the image data by being stored integrally with the image data 102A to 102D.

関連付け部１１２は、音声データ１０４Ａ〜１０４Ｄの特性情報１０８Ａ〜１０８Ｄと、１つ以上の画像データ１０２Ａ〜１０２Ｄ各々との関連付け状態を示す、関連付けデータを作成し、関連付けデータをファイルとして保存してもよい。 Alternatively, the association unit 112 may create association data indicating how the characteristic information 108A to 108D of the audio data 104A to 104D is associated with each of the one or more image data 102A to 102D, and store the association data as a file.

図２（ｂ）は、上記の関連付けデータの一例を示す参照テーブル１１４である。参照テーブル１１４が作成される場合、特性情報１０８Ａ〜１０８Ｄは、画像データ１０２Ａ〜１０２Ｄとは別の独立したファイルとして保存される。 FIG. 2(b) shows a reference table 114 as an example of such association data. When the reference table 114 is created, the characteristic information 108A to 108D is stored as an independent file separate from the image data 102A to 102D.

上記の構成によれば、音声データ１０４Ａ〜１０４Ｄの特性情報１０８Ａ〜１０８Ｄは、参照テーブル１１４を参照することによって、画像データ１０２Ａ〜１０２Ｄのいずれに関連付けられているかが分かる。 According to the above configuration, referring to the reference table 114 shows which of the image data 102A to 102D each item of the characteristic information 108A to 108D of the audio data 104A to 104D is associated with.
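The text says only that the reference table may be a text file and does not give its layout. The following sketch assumes a CSV layout with hypothetical column names to show how such a table could be written and then read back to look up the characteristic information for a given image.

```python
import csv
import io

# Hypothetical column names; the patent does not specify the table layout.
FIELDS = ["image", "audio", "gender", "f0_hz", "volume_db"]

def write_reference_table(rows):
    """Serialize association rows (a list of dicts) to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

def load_reference_table(text):
    """Parse CSV text into a dict keyed by image name."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        table[row["image"]] = {
            "audio": row["audio"],
            "gender": row["gender"],
            "f0_hz": float(row["f0_hz"]),
            "volume_db": float(row["volume_db"]),
        }
    return table
```

Keeping the table in a separate file, as in this sketch, leaves the image files untouched, whereas the header-based alternative described earlier keeps everything in one file per image.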

本実施形態では、関連付け部１１２は、特性情報１０８Ａ〜１０８Ｄを画像データ１０２Ａ〜１０２Ｄに関連付けているが、音声データ１０４Ａ〜１０４Ｄに関連付けてもよい。 In the present embodiment, the associating unit 112 associates the characteristic information 108A to 108D with the image data 102A to 102D, but may associate with the audio data 104A to 104D.

（画像選択部） (Image selection unit)
デジタルカメラ１００は、１つ以上の画像データ１０２Ａ〜１０２Ｄのなかから、各々に関連付けられた特性情報１０８Ａ〜１０８Ｄに基づいて、再生対象となる再生対象画像データを選択（抽出）する画像選択部１１６を備える。 The digital camera 100 includes an image selection unit 116 that selects (extracts), from the one or more image data 102A to 102D, reproduction target image data to be reproduced based on the characteristic information 108A to 108D associated with each.

画像選択部１１６による画像選択の方法は、所定のインターフェースによって、ユーザが自由に設定可能である。例えば、画像データ１０２Ａ〜１０２Ｄのうち、男性の画像（画像データ１０２Ａ、１０２Ｂ）のみを選択して再生することも、女性の画像（画像データ１０２Ｃ、１０２Ｄ）のみを選択して再生することも、男女両方の画像（画像データ１０２Ａ〜１０２Ｄ）を選択して再生することも可能である。 The method of image selection by the image selection unit 116 can be freely set by the user through a predetermined interface. For example, of the image data 102A to 102D, only a male image (image data 102A, 102B) can be selected and reproduced, or only a female image (image data 102C, 102D) can be selected and reproduced. It is also possible to select and reproduce both male and female images (image data 102A to 102D).
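The selection described above amounts to filtering images by their associated characteristic information. A minimal sketch, assuming the characteristic information is held in a dictionary keyed by image name (a representation chosen here for illustration, not taken from the patent):

```python
def select_images(table, wanted_genders):
    """Return the names of images whose speaker gender is in wanted_genders."""
    wanted = set(wanted_genders)
    return [name for name, info in table.items() if info["gender"] in wanted]

# Sample data mirroring the example in the text:
# 102A and 102B are male, 102C and 102D are female.
sample_table = {
    "102A": {"gender": "male"},
    "102B": {"gender": "male"},
    "102C": {"gender": "female"},
    "102D": {"gender": "female"},
}
```

With this data, passing `["male"]` yields 102A and 102B, `["female"]` yields 102C and 102D, and passing both labels yields all four images.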

（再生制御部） (Reproduction control unit)
デジタルカメラ１００は、選択された再生対象画像データを表示部（ＬＣＤ（Liquid Crystal Display）３８０）に再生する再生制御部１１８を備える。再生制御部１１８は、また、再生対象画像データに関連付けられた音声データも、同時に、スピーカ３２０から再生する。 The digital camera 100 includes a reproduction control unit 118 that reproduces the selected reproduction target image data on a display unit (LCD (Liquid Crystal Display) 380). The reproduction control unit 118 also simultaneously reproduces, from the speaker 320, the audio data associated with the reproduction target image data.

（スライドショー再生） (Slide show reproduction)
再生制御部１１８は、選択された再生対象画像データを、順次切り換えて再生する。すなわち、いわゆるスライドショー再生が実行される。各々の再生対象画像の再生時間は、それに関連付けられた音声データの再生時間としてよい。かかる方法によれば、静止画像である再生対象画像の再生時間を設定する必要がない。ただし、再生時間は、音声データの再生時間に関係なく、一定の時間に設定してもよいし、任意の規則によって変化させてもよい。例えば被写体が女性の画像のほうが、男性の画像より再生時間を長くするなどとしてよい。 The reproduction control unit 118 reproduces the selected reproduction target image data by switching among them sequentially; that is, so-called slide show reproduction is executed. The reproduction time of each reproduction target image may be the reproduction time of the audio data associated with it. With this method, there is no need to set a reproduction time for a reproduction target image that is a still image. However, the reproduction time may instead be set to a fixed time regardless of the reproduction time of the audio data, or may be varied according to any rule. For example, images whose subject is a woman may be given a longer reproduction time than images of men.
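The two timing options above, showing each image for the length of its voice memo or for a fixed time, can be sketched as a simple schedule builder. The function name and the tuple layout are assumptions made for this illustration.

```python
def build_schedule(items, fixed_seconds=None):
    """Build (image, start_time, duration) entries for a slide show.

    items is a list of (image_name, audio_seconds) pairs. When
    fixed_seconds is None, each image is shown for as long as its
    associated audio lasts; otherwise every image gets fixed_seconds.
    """
    schedule = []
    start = 0.0
    for name, audio_seconds in items:
        duration = audio_seconds if fixed_seconds is None else fixed_seconds
        schedule.append((name, start, duration))
        start += duration
    return schedule
```

A rule such as giving some images a longer time could be added by replacing `fixed_seconds` with a function of the image's characteristic information.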

（性別に応じた再生順序） (Reproduction order according to gender)
上記の構成によれば、多数の画像データを、簡便な処理によって、被写体に因んだ音声データの特性（性別）に応じて分類して、興趣性のある再生順序で画像データを再生可能である。 According to the above configuration, a large number of image data can be classified by simple processing according to a characteristic (gender) of the audio data originating from the subject, and the image data can be reproduced in an entertaining order.

例えば、結婚式において、多数の人物を撮影しつつ、その人物からのお祝いのメッセージをボイスメモ機能によって記録した場合を考察する。画像データ１０２Ａ〜１０２Ｄとともに音声データ１０４Ａ〜１０４Ｄが得られ、さらに、特性情報１０８Ａ〜１０８Ｄが得られたと仮定する。すると、画像データの再生の順序を、例えば被写体が女性と判定された画像データ１０２Ｃ、１０２Ｄを先に再生し、男性と判定された画像データ１０２Ａ、１０２Ｂを後で再生することが可能である。 For example, consider a case where a congratulatory message from a person is recorded by a voice memo function while photographing a large number of persons at a wedding. It is assumed that audio data 104A to 104D are obtained together with the image data 102A to 102D, and further that characteristic information 108A to 108D is obtained. Then, for example, the image data 102C and 102D in which the subject is determined to be female can be reproduced first, and the image data 102A and 102B determined to be male can be reproduced later.

（音量に応じた再生順序） (Reproduction order according to volume)
再生制御部１１８は、音量検出部１１１によって検出された音量の昇順または降順に、選択された再生対象画像データを再生してよい。 The reproduction control unit 118 may reproduce the selected reproduction target image data in ascending or descending order of the volume detected by the volume detection unit 111.

上記の構成によれば、性別判定された画像データ１０２Ａ〜１０２Ｄに、さらに音量という特性が与えられ、音量の昇順あるいは降順に画像データ１０２Ａ〜１０２Ｄを再生可能である。これにより、声の大きい、明るい人物が被写体として写っている画像データを優先的に再生するなどの、興趣のあるスライドショー再生が可能となる。 According to the above configuration, the gender-determined image data 102A to 102D are further given the characteristic of volume and can be reproduced in ascending or descending order of volume. This enables entertaining slide show reproduction, such as preferentially reproducing image data in which a loud-voiced, cheerful person appears as the subject.

例えば上記の、性別に応じた再生順序に、音量に応じた再生順序を組み合わせてもよく、再生制御部１１８を、女性→男性の順番、音量は昇順、という再生順序に設定すれば、画像データ１０２Ｃ、１０２Ｄ、１０２Ａ、１０２Ｂの順序で再生される。 For example, the reproduction order according to gender described above may be combined with a reproduction order according to volume. If the reproduction control unit 118 is set to reproduce women first and then men, in ascending order of volume, the image data are reproduced in the order 102C, 102D, 102A, 102B.

この他、音量の所定の閾値によって画像データを分類し、「声が大きい」「普通」「声が小さい」などのグループに分別し、画像選択部１１６に、いずれかのグループに属する画像データを選択させて再生してもよい。 Alternatively, the image data may be classified by predetermined volume thresholds into groups such as "loud voice", "normal", and "quiet voice", and the image selection unit 116 may be made to select the image data belonging to one of the groups for reproduction.

（声色に応じた再生） (Reproduction according to voice tone)
再生制御部１１８は、解析された周波数の昇順または降順に、選択された再生対象画像データを再生してよい。 The reproduction control unit 118 may reproduce the selected reproduction target image data in ascending or descending order of the analyzed frequency.

上記の構成によれば、性別判定された画像データ１０２Ａ〜１０２Ｄに、さらに周波数（声色）という特性が与えられ、声色が高い順（周波数の降順）または低い順（周波数の昇順）に画像データ１０２Ａ〜１０２Ｄを再生可能である。これにより、声の高い、明るい人物が被写体として写っている画像データを優先的に再生するなどの、興趣のあるスライドショー再生が可能となる。 According to the above configuration, the gender-determined image data 102A to 102D are further given the characteristic of frequency (voice tone) and can be reproduced in order from the highest voice (descending order of frequency) or from the lowest voice (ascending order of frequency). This enables entertaining slide show reproduction, such as preferentially reproducing image data in which a high-voiced, cheerful person appears as the subject.

例えば上記の、性別に応じた再生順序に、声音に応じた再生順序を組み合わせてもよく、再生制御部１１８を、女性→男性の順番、周波数は降順（高周波数の明るい声音を有する人物を優先的に再生）、という再生順序に設定すれば、画像データ１０２Ｄ、１０２Ｃ、１０２Ａ、１０２Ｂの順序で再生される。 For example, the reproduction order according to gender described above may be combined with a reproduction order according to voice tone. If the reproduction control unit 118 is set to reproduce women first and then men, in descending order of frequency (so that people with bright, high-frequency voices are reproduced preferentially), the image data are reproduced in the order 102D, 102C, 102A, 102B.
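Both combined orderings in this description, gender plus volume and gender plus frequency, can be expressed as a two-level sort. In the sketch below the function name, the record layout, and the specific frequency and volume values are made up for illustration; the values were chosen only so that the results match the orders stated in the text.

```python
def playback_order(records, gender_order, key, descending=False):
    """Sort by gender group first, then by a numeric characteristic.

    gender_order lists the genders in the order they should be played;
    key names the numeric field ("f0_hz" or "volume_db" here).
    """
    rank = {gender: i for i, gender in enumerate(gender_order)}
    sign = -1.0 if descending else 1.0
    ordered = sorted(records, key=lambda r: (rank[r["gender"]], sign * r[key]))
    return [r["image"] for r in ordered]

# Hypothetical characteristic information for the four example images.
sample_records = [
    {"image": "102A", "gender": "male", "f0_hz": 140.0, "volume_db": -30.0},
    {"image": "102B", "gender": "male", "f0_hz": 110.0, "volume_db": -20.0},
    {"image": "102C", "gender": "female", "f0_hz": 230.0, "volume_db": -28.0},
    {"image": "102D", "gender": "female", "f0_hz": 260.0, "volume_db": -18.0},
]
```

With these values, women first with ascending volume gives 102C, 102D, 102A, 102B, and women first with descending frequency gives 102D, 102C, 102A, 102B.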

この他、周波数の所定の閾値によって画像データを分類し、「声音が明るい」「普通」「声音が暗い」などのグループに分別し、画像選択部１１６に、いずれかのグループに属する画像データを選択させて再生してもよい。 Alternatively, the image data may be classified by predetermined frequency thresholds into groups such as "bright voice", "normal", and "dark voice", and the image selection unit 116 may be made to select the image data belonging to one of the groups for reproduction.
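Grouping by thresholds works the same way for volume and for frequency, so a single helper can cover both. The function name, the default labels, and the threshold values used in the usage note are assumptions; the patent gives no concrete thresholds.

```python
def group_label(value, low, high, labels=("low", "normal", "high")):
    """Assign a value to one of three groups using two thresholds.

    Values below low get labels[0], values above high get labels[2],
    and everything in between (inclusive) gets labels[1].
    """
    if value < low:
        return labels[0]
    if value > high:
        return labels[2]
    return labels[1]
```

For instance, `group_label(260.0, 150.0, 240.0, labels=("dark voice", "normal", "bright voice"))` places a 260 Hz voice in the "bright voice" group under those hypothetical thresholds, and the image selection step can then pick the images in whichever group the user chooses.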

（その他の特性による分類および順序付け） (Classification and ordering by other characteristics)
その他、音声データ１０４Ａ〜１０４Ｄの他の特性によって画像データを分類したり、順序付けしたりしてもよいし、画像データ自体の特性によって画像データを分類したり、順序付けしたりしてもよい。例えば画像認識を行い、笑顔と判定された画像データを、表情が明るいグループに分類可能である。 In addition, the image data may be classified or ordered according to other characteristics of the audio data 104A to 104D, or according to characteristics of the image data themselves. For example, image recognition may be performed and image data determined to show a smile may be classified into a group with cheerful expressions.

（デジタルカメラ構成） (Digital camera configuration)
図３は図１のデジタルカメラの外観の一例を示す図であり、図３（ａ）は正面から見た図、図３（ｂ）は背面から見た図である。図３および図１を用いて、デジタルカメラのその他の構成について、以下、説明する。撮像部２２０はＣＣＤ（Charge Coupled Device; 電荷結合素子）としてよく、被写界を撮像して電子的な画像信号を生成する撮像手段であり、例えば１６００×１２００個の画素を有する。撮像部２２０は、撮像レンズ２１０によって結像された被写体の光像を、画素毎にＲ（赤）、Ｇ（緑）、Ｂ（青）の色成分の画像信号（各画素で受光された画素信号の信号列からなる信号）に光電変換して出力する。 FIG. 3 shows an example of the external appearance of the digital camera of FIG. 1; FIG. 3(a) is a front view and FIG. 3(b) is a rear view. Other components of the digital camera are described below with reference to FIGS. 3 and 1. The imaging unit 220, which may be a CCD (Charge Coupled Device), is imaging means that captures the object scene and generates an electronic image signal, and has, for example, 1600 × 1200 pixels. The imaging unit 220 photoelectrically converts the optical image of the subject formed by the imaging lens 210 into image signals of R (red), G (green), and B (blue) color components for each pixel (signals consisting of the sequence of pixel signals received at the respective pixels) and outputs them.

撮像部２２０から得られる画像信号（アナログ信号）は、アナログ信号処理回路２３０に与えられる。アナログ信号処理回路２３０は、画像信号に対して所定のアナログ信号処理を行う回路である。アナログ信号処理回路２３０は、少なくとも相関二重サンプリング回路（Correlated Double Sampling：ＣＤＳ。図示省略）およびオートゲインコントロール（Auto Gain Controlled：ＡＧＣ。図示省略）回路を含む。相関二重サンプリング回路によって画像信号のノイズ低減処理が行われ、オートゲインコントロール回路でゲイン調整することによって、画像信号のレベル調整が行われる。 An image signal (analog signal) obtained from the imaging unit 220 is given to the analog signal processing circuit 230. The analog signal processing circuit 230 is a circuit that performs predetermined analog signal processing on the image signal. The analog signal processing circuit 230 includes at least a correlated double sampling circuit (Correlated Double Sampling: CDS, not shown) and an auto gain control (Auto Gain Controlled: AGC, not shown) circuit. Noise reduction processing of the image signal is performed by the correlated double sampling circuit, and the level of the image signal is adjusted by adjusting the gain by the auto gain control circuit.

Ａ／Ｄ変換器２４０は、画像信号の各画素信号を、例えば１２ビットのデジタル信号に変換する。変換後のデジタル信号は、中央処理装置（ＣＰＵ: Central Processing Unit）２５０に与えられ、画像データとして一時的にＲＡＭ（Random Access Memory）２６０に格納される。ＲＡＭ２６０に保存された画像データは、画像処理部２８０によって色補正処理等を施された後、圧縮伸張部２９０による圧縮処理等が施される。 The A/D converter 240 converts each pixel signal of the image signal into, for example, a 12-bit digital signal. The converted digital signal is supplied to a central processing unit (CPU) 250 and temporarily stored in a RAM (Random Access Memory) 260 as image data. The image data stored in the RAM 260 is subjected to color correction and other processing by the image processing unit 280 and then to compression and other processing by the compression/decompression unit 290.

また、指向性マイク３００から得られる環境音などの音声信号は、音声処理部３１０に入力される。音声処理部３１０に入力された音声信号は、音声処理部３１０内に設けられたＡ／Ｄ変換器（図示省略）により、デジタル信号に変換され、一時的にＲＡＭ２６０に格納される。デジタル化された音声信号は、再び音声処理部３１０に送り、スピーカ３２０から再生可能である。 An audio signal such as an environmental sound obtained from the directional microphone 300 is input to the audio processing unit 310. The audio signal input to the audio processing unit 310 is converted into a digital signal by an A / D converter (not shown) provided in the audio processing unit 310 and temporarily stored in the RAM 260. The digitized audio signal is sent again to the audio processing unit 310 and can be reproduced from the speaker 320.

操作部３３０は、電源ボタン３３０Ａ、各種の操作ボタン３３０Ｂ、シャッタレリーズボタン３３０Ｃ等を含み、ユーザがデジタルカメラ１００の設定を変更操作する際や撮像操作を行う際等に用いられる。 The operation unit 330 includes a power button 330A, various operation buttons 330B, a shutter release button 330C, and the like, and is used when the user changes the settings of the digital camera 100, performs an imaging operation, and so on.

ＣＰＵ２５０は、ＲＡＭ２６０およびＲＯＭ（Read Only Memory）３５０に記録された所定のプログラムを実行することにより、上記各部を統括的に制御する。なお、ＲＡＭ２６０は、高速アクセス可能な半導体メモリであり、ＲＯＭ３５０は電気的に書き換えが不可能な不揮発の半導体メモリ（例えばフラッシュＲＯＭ）として構成される。また、ＲＡＭ２６０内における一部の領域は、一時記憶用のバッファエリアとして機能し、画像データおよび音声データを一時的に記憶する。 The CPU 250 comprehensively controls the above-described units by executing predetermined programs recorded in a RAM 260 and a ROM (Read Only Memory) 350. The RAM 260 is a high-speed accessible semiconductor memory, and the ROM 350 is configured as a non-volatile semiconductor memory (for example, a flash ROM) that cannot be electrically rewritten. A part of the area in the RAM 260 functions as a temporary storage buffer area, and temporarily stores image data and audio data.

ＣＰＵ２５０の各処理部２８０、２９０、３１０は、マイクロコンピュータが所定のプログラムを実行することにより実現される、機能部位である。 Each processing unit 280, 290, 310 of the CPU 250 is a functional part realized by the microcomputer executing a predetermined program.

画像処理部２８０は、ＷＢ（ホワイトバランス）処理、γ補正処理等の各種のデジタル画像処理を施す処理部である。ＷＢ処理は、算出されたホワイトバランス補正値に基づき、Ｒ、Ｇ、Ｂの各色成分のレベル変換を行い、カラーバランスを調整する処理であり、γ補正処理は、画素データの階調を補正する処理である。圧縮伸張部２９０は、画像処理部２８０によって色補正処理等が行われた画像データを、さらに圧縮する。圧縮方式としては、例えばＪＰＥＧ（Joint Photographic Experts Group）方式などが採用される。音声処理部３１０は、音声データに対する各種のデジタル処理を施す処理部である。 The image processing unit 280 is a processing unit that performs various digital image processing such as WB (white balance) processing and γ correction processing. The WB process is a process of adjusting the color balance by performing level conversion of each of R, G, and B color components based on the calculated white balance correction value, and the γ correction process corrects the gradation of the pixel data. It is processing. The compression / decompression unit 290 further compresses the image data that has been subjected to color correction processing and the like by the image processing unit 280. As the compression method, for example, a JPEG (Joint Photographic Experts Group) method or the like is adopted. The audio processing unit 310 is a processing unit that performs various digital processes on audio data.
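The white balance and gamma steps described above can be illustrated on a single pixel. This is a minimal sketch assuming color values normalized to the range 0 to 1, per-channel multiplicative gains, and a simple power-law gamma curve; the function names and the default gamma of 2.2 are assumptions, and the camera's actual processing is not specified in the text.

```python
def white_balance(rgb, gains):
    """Scale each of the R, G, B levels by its correction gain, clipping at 1.0."""
    return tuple(min(1.0, level * gain) for level, gain in zip(rgb, gains))

def gamma_correct(rgb, gamma=2.2):
    """Correct gradation with a power-law curve: out = in ** (1 / gamma)."""
    return tuple(level ** (1.0 / gamma) for level in rgb)
```

Applying `white_balance` first and `gamma_correct` second follows the order in which the two steps are listed in the description.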

このような構成を有するＣＰＵ２５０によって、撮像モード、再生モードの処理が行われる。例えば、撮像モードにおいては、まず撮像部２２０をビューファインダ（ＬＣＤ３８０）画像出力用の動作モードに設定した上で所定周期（例えば３０コマ／秒）の撮像を行い、撮像に応じた画像データを逐次出力する（予備撮像）。撮像部２２０から出力された画像データは、アナログ信号処理回路２３０、Ａ／Ｄ変換器２４０、画像処理部２８０を介してＬＣＤ３８０にビューファインダ画像として表示される。この状態で、シャッタレリーズボタン３３０Ｃがユーザによって半押し（Ｓ１状態）されると、ＣＰＵ２５０は、撮像部２２０から入力される予備撮像の画像データに基づき、ＡＥ（Auto Exposure）評価値およびＡＦ（Auto Focus）評価値を求める。ＣＰＵ２５０はＡＦ評価値に基づいて合焦位置を、公知の例えば山登り方式によって求め、ＡＥ／ＡＦ部３６０を用いて、合焦位置に撮像レンズ２１０を移動させる。また、ＣＰＵ２５０はＡＥ評価値に基づいて、本撮像時のシャッタスピード（撮像部２２０における電荷蓄積時間）、撮像レンズ２１０の絞り値およびアナログ信号処理回路２３０におけるオートゲインコントロールのゲイン値を決定する。 The CPU 250 configured in this way performs the processing of the imaging mode and the reproduction mode. In the imaging mode, for example, the imaging unit 220 is first set to an operation mode for outputting viewfinder (LCD 380) images, imaging is performed at a predetermined rate (for example, 30 frames per second), and the resulting image data are output sequentially (preliminary imaging). The image data output from the imaging unit 220 is displayed as a viewfinder image on the LCD 380 via the analog signal processing circuit 230, the A/D converter 240, and the image processing unit 280. When the user half-presses the shutter release button 330C in this state (S1 state), the CPU 250 obtains an AE (Auto Exposure) evaluation value and an AF (Auto Focus) evaluation value based on the preliminary image data input from the imaging unit 220. The CPU 250 obtains the in-focus position from the AF evaluation value by a known method such as hill climbing, and uses the AE/AF unit 360 to move the imaging lens 210 to the in-focus position. Based on the AE evaluation value, the CPU 250 also determines the shutter speed for the main imaging (the charge accumulation time of the imaging unit 220), the aperture value of the imaging lens 210, and the gain value of the auto gain control in the analog signal processing circuit 230.
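The hill-climbing method is mentioned above only by name. As a rough sketch of the general idea, the lens is stepped in one direction while the AF evaluation value keeps rising, and the search stops once the value starts to fall. The function name, the step-based lens model, and the evaluation function passed in are all assumptions for illustration.

```python
def hill_climb_focus(af_value, start, stop, step):
    """Step the lens position from start toward stop and return the peak.

    af_value is a callable giving the AF evaluation value (for example
    image contrast) at a lens position. The search stops at the first
    position whose value is lower than the best seen so far.
    """
    best_pos = start
    best_val = af_value(start)
    pos = start + step
    while pos <= stop:
        val = af_value(pos)
        if val < best_val:
            break  # the evaluation value has passed its peak
        best_pos, best_val = pos, val
        pos += step
    return best_pos
```

This simple form assumes a single peak in the evaluation value; a real implementation would have to deal with noise and local maxima.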

When the shutter release button 330C is fully pressed (S2 state) in the imaging mode, the CPU 250 sets the imaging unit 220 to the operation mode for main imaging and captures the subject, and the compression/decompression unit 290 generates a compressed image from the captured image data acquired by the imaging unit 220. The compressed high-resolution image data is recorded on the recording medium 370, which may be, for example, an SD card (registered trademark).

The digital camera 100 can also load, from the recording medium 370, a program that runs on the digital camera 100. For example, a control program recorded on the recording medium 370 can be loaded into the RAM 260 or the ROM 350 of the CPU 250, which also makes it possible to update the control program.

The LCD 380 is a display device that shows the object field, displays various menus to the user, and provides information for operation. The flash 390 is a light-emitting device that, when light emission is enabled, emits auxiliary light to illuminate the subject.

(Program)
In a typical configuration, the program according to this embodiment causes the digital camera 100 to function as: a frequency analysis unit 106 that analyzes the frequency of the audio data 104A to 104D associated with each of one or more pieces of image data 102A to 102D; a voice determination unit 110 that determines, according to the analyzed frequency, characteristic information 108A to 108D of the audio data 104A to 104D, including the gender of the speaker; an association unit 112 that associates the characteristic information 108A to 108D with each of the one or more pieces of image data 102A to 102D; an image selection unit 116 that selects, from the one or more pieces of image data 102A to 102D, the reproduction target image data to be played back, based on the characteristic information 108A to 108D associated with each; and a playback control unit 118 that plays back the selected reproduction target image data on the LCD 380.

(Slide show execution process)
FIG. 4 is a flowchart showing the slide show execution process performed by the elements of the CPU 250 of FIG. 1, and FIG. 5 shows in detail the subroutine for the gender/voice tone determination step S500 of FIG. 4.

First, for the captured image data 102A to 102D associated with the audio data 104A to 104D, the frequency analysis unit 106 and the voice determination unit 110 determine the gender and voice tone in the gender/voice tone determination step S500.

As shown in FIG. 5(a), the gender/voice tone determination step S500 is repeated once for each piece of image data. The frequency analysis unit 106 measures the frequency of the audio data 104A to 104D associated with the image data 102A to 102D, and the voice determination unit 110 compares it with a gender threshold (step S600) and determines the speaker to be male or female (step S602 or S702). A voice frequency between 150 Hz and 220 Hz could also be judged to be a child's voice, but in this embodiment the voice determination unit 110 uses, for example, the midpoint of that range, 185 Hz, as the gender threshold and judges each voice to be either male or female only.

Because the processing is the same whether the speaker is judged male or female, both cases are described together below. FIG. 5(b) shows the relative magnitudes of the thresholds used in FIG. 5(a).

For image data judged male or female, the voice determination unit 110 then compares the frequency with the male or female high-frequency threshold (steps S604, S704). If the frequency is above that threshold, the image data is classified as "bright" (steps S606, S706); if it is below, the frequency is further compared with the male or female low-frequency threshold (steps S608, S708). If the frequency is above the low-frequency threshold, the image data is classified as "normal" (steps S610, S710); if it is below, it is classified as "dark" (steps S612, S712).
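The two-stage comparison of steps S600 to S712 can be sketched as follows. Only the 185 Hz gender threshold comes from the text. The four high- and low-frequency threshold values are hypothetical placeholders, and the sketch further assumes that frequencies at or below the gender threshold are judged male and that a frequency exactly equal to a threshold is treated as not exceeding it.

```python
GENDER_THRESHOLD_HZ = 185.0  # from the text (midpoint of 150-220 Hz)

# Hypothetical (low, high) thresholds in Hz; the text gives no numbers.
TONE_THRESHOLDS_HZ = {
    "male": (120.0, 160.0),
    "female": (210.0, 260.0),
}


def classify_voice(frequency_hz):
    """Return (gender, tone) for a measured voice frequency."""
    # S600: compare with the gender threshold (assumed: lower means male).
    gender = "female" if frequency_hz > GENDER_THRESHOLD_HZ else "male"
    low, high = TONE_THRESHOLDS_HZ[gender]
    if frequency_hz > high:      # S604 / S704
        tone = "bright"          # S606 / S706
    elif frequency_hz > low:     # S608 / S708
        tone = "normal"          # S610 / S710
    else:
        tone = "dark"            # S612 / S712
    return gender, tone
```

With these placeholder thresholds, `classify_voice(230.0)` yields a female "normal" classification; different threshold values would shift the boundaries without changing the structure of the comparison.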

The gender and voice tone classifications determined in step S500 as described above are all recorded on the recording medium 370 as the characteristic information 108A to 108D and are associated with the image data 102A to 102D by the association unit 112.

Returning to FIG. 4, the volume detection unit 111 detects the volume of the audio data 104A to 104D (step S502), and the detected volumes are also recorded as part of the characteristic information 108A to 108D.
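The text does not say how the volume detection unit 111 measures volume. One common choice, shown here purely as an assumption, is the root-mean-square (RMS) amplitude of the audio samples; the function name and the sample representation are hypothetical.

```python
import math


def rms_volume(samples):
    """Return the RMS amplitude of a sequence of audio samples.
    RMS is an assumed volume measure, not one named in the text."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


volume = rms_volume([3, -3, 3, -3])
```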

Next, the user sets the playback method (step S504). That is, the CPU 250 has the user specify, through a predetermined interface, whether to select images of men, images of women, or both, and which of the "bright", "normal", and "dark" groups of image data to display. Settings such as playback in ascending or descending order of volume can also be made here.

The image selection unit 116 selects the reproduction target image data to be played back according to the user's settings (step S506).
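Selection according to the settings made in step S504 can be sketched as a filter followed by an optional sort. The record layout, field names, and setting values below are assumptions made for illustration; the text does not specify how the characteristic information is stored.

```python
from dataclasses import dataclass


@dataclass
class Characteristic:
    """Hypothetical record for the characteristic information of one image."""
    image_id: str
    gender: str    # "male" or "female"
    tone: str      # "bright", "normal", or "dark"
    volume: float


def select_images(records, genders, tones, volume_order=None):
    """Keep images whose gender and tone match the user's settings, then
    optionally order them by volume ("ascending" or "descending")."""
    chosen = [r for r in records if r.gender in genders and r.tone in tones]
    if volume_order in ("ascending", "descending"):
        chosen.sort(key=lambda r: r.volume,
                    reverse=(volume_order == "descending"))
    return [r.image_id for r in chosen]
```

For example, passing `{"female"}` and `{"bright"}` with `"ascending"` returns only the images of women classified as "bright", quietest first.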

The playback control unit 118 chooses the BGM to be played during playback of the reproduction target image data according to whether the selected data is classified as "bright", "normal", or "dark" (step S508).

Finally, the playback control unit 118 plays back the reproduction target image data as a slide show, following the playback method set in step S504 and accompanied by the BGM chosen in step S508 (step S510). The display time of each reproduction target image may be set to the playback time of its associated audio data. With this approach, there is no need to set a display time separately for each still image.
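Using each image's audio length as its display time amounts to building a simple timeline. The sketch below assumes the audio durations are already known in seconds; the function name and the tuple layout are hypothetical.

```python
def build_schedule(items):
    """Given (image_id, audio_seconds) pairs in playback order, return
    (image_id, start, end) tuples in which each image is shown for
    exactly as long as its associated audio plays."""
    schedule = []
    t = 0.0
    for image_id, duration in items:
        schedule.append((image_id, t, t + duration))
        t += duration
    return schedule


timeline = build_schedule([("102A", 2.0), ("102B", 3.5)])
```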

Note that the steps of the slide show execution described in this specification need not be processed in time series in the order shown in the flowcharts, and may include processing performed in parallel or by subroutines.

A preferred embodiment of the present invention has been described above with reference to the accompanying drawings, but the present invention is of course not limited to this example. It is clear that a person skilled in the art could arrive at various changes and modifications within the scope of the claims, and these are naturally understood to fall within the technical scope of the present invention.

The present invention is applicable to an image reproducing apparatus capable of reproducing image data and to a program executable by the image reproducing apparatus.

FIG. 1 is a block diagram of a digital camera that is an embodiment of the image reproducing apparatus according to the present invention.
FIG. 2 is a diagram schematically showing the image data reproduced by the digital camera of FIG. 1, together with the audio data and characteristic information associated with the image data.
FIG. 3 is a diagram showing an example of the external appearance of the digital camera of FIG. 1.
FIG. 4 is a flowchart showing the slide show execution process performed by the elements of the CPU of FIG. 1.
FIG. 5 shows in detail the subroutine for the gender/voice tone determination step of FIG. 4.

Explanation of symbols

100 ... Digital camera
102A to 102D ... Image data
104A to 104D ... Audio data
106 ... Frequency analysis unit
108A to 108D ... Characteristic information
110 ... Voice determination unit
111 ... Volume detection unit
112 ... Association unit
114 ... Reference table
116 ... Image selection unit
118 ... Playback control unit
320 ... Speaker
380 ... LCD

Claims

An image reproducing apparatus comprising:
a frequency analysis unit that analyzes the frequency of audio data associated with each of one or more pieces of image data;
a voice determination unit that determines, according to the analyzed frequency, characteristic information of the audio data including the gender of the speaker;
an association unit that associates the characteristic information with each of the one or more pieces of image data;
an image selection unit that selects reproduction target image data to be reproduced, based on the characteristic information associated with each of the one or more pieces of image data; and
a reproduction control unit that reproduces the selected reproduction target image data on a display unit in ascending or descending order of the analyzed frequency.

The image reproducing apparatus according to claim 1,
wherein the reproduction control unit reproduces the selected reproduction target image data by switching between them sequentially.

The image reproducing apparatus according to claim 1 or 2,
wherein the association unit adds the characteristic information of the audio data to each of the one or more pieces of image data.

The image reproducing apparatus according to any one of claims 1 to 3,
wherein the association unit creates association data indicating the association between the characteristic information of the audio data and each of the one or more pieces of image data, and stores the association data as a file.

A program that causes an image reproducing apparatus to function as:
a frequency analysis unit that analyzes the frequency of audio data associated with each of one or more pieces of image data;
a voice determination unit that determines, according to the analyzed frequency, characteristic information of the audio data including the gender of the speaker;
an association unit that associates the characteristic information with each of the one or more pieces of image data;
an image selection unit that selects reproduction target image data to be reproduced, based on the characteristic information associated with each of the one or more pieces of image data; and
a reproduction control unit that reproduces the selected reproduction target image data on a display unit in ascending or descending order of the analyzed frequency.