JP5683863B2

JP5683863B2 - Image reproduction apparatus and sound information output method of image reproduction apparatus

Info

Publication number: JP5683863B2
Application number: JP2010177877A
Authority: JP
Inventors: 翼笠井
Original assignee: Olympus Imaging Corp
Current assignee: Olympus Imaging Corp
Priority date: 2010-08-06
Filing date: 2010-08-06
Publication date: 2015-03-11
Anticipated expiration: 2030-08-06
Also published as: JP2012039394A

Description

The present invention relates to an image reproduction apparatus that adds sound information to a recorded image, and to a sound information output method for the image reproduction apparatus.

In recent years, portable devices with a photographing function (photographing devices), such as digital cameras, have become widespread. Many photographing devices of this type compress captured images and store them on a recording medium such as a semiconductor memory. As compression technology has advanced and the capacity of recording media has grown, an enormous number of images tends to accumulate on the recording medium. Finding a desired image among so many images in a short time is extremely difficult.

Patent Document 1 therefore discloses a technique for extracting search information from an image. With the invention of Patent Document 1, a captured image having a desired content and composition can be searched for.

Patent Document 1: JP 2007-96379 A

However, the invention of Patent Document 1 has the drawback that setting up a search is extremely cumbersome. Moreover, its search method is logical rather than intuitive and cannot be called user-friendly. It also fails to make effective use of the five human senses.

An object of the present invention is to provide an image reproduction apparatus, and a sound information output method for the image reproduction apparatus, that allow images to be searched intuitively and quickly.
In addition, because images are converted into sound, senses other than vision can also be stimulated, which can encourage use of the device.

An image reproduction apparatus according to one aspect of the present invention comprises: a subject feature detection unit that detects a subject in an image by obtaining features of the subject; a sound information assignment unit that assigns sound information to the detected subject; an output control unit that outputs the sound information assigned by the sound information assignment unit in association with an image including the subject; a sound output unit that outputs sound based on the sound information output by the output control unit, in correspondence with the display of the image including the subject; a posture determination unit that determines whether the image reproduction apparatus is tilted; and a control unit that sequentially reads out and reproduces the sound information when the posture determination unit determines that the image reproduction apparatus is tilted, and stops the reproduction when the tilt of the image reproduction apparatus is returned to its original state.
A sound information output method of an image reproduction apparatus according to one aspect of the present invention comprises: a step in which a subject feature detection unit detects a subject in an image by obtaining features of the subject; a step in which a sound information assignment unit assigns sound information to the detected subject; a step in which an output control unit outputs the sound information assigned by the sound information assignment unit in association with an image including the subject; a step in which a sound output unit outputs sound based on the sound information output by the output control unit, in correspondence with the display of the image including the subject; a step in which a posture determination unit determines whether the image reproduction apparatus is tilted; and a step in which a control unit performs control so that the sound information is sequentially read out and reproduced when the posture determination unit determines that the image reproduction apparatus is tilted, and the reproduction is stopped when the tilt of the image reproduction apparatus is returned to its original state.
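The tilt-controlled reproduction described above can be illustrated with a minimal Python sketch. The class name `TiltPlayback`, the `update` method, and the 15-degree threshold are assumptions made for illustration only; the patent does not specify how tilt is measured or what threshold is used.

```python
class TiltPlayback:
    """Reads sound items in order while the device is tilted (hypothetical sketch)."""

    def __init__(self, sounds, tilt_threshold_deg=15.0):
        self.sounds = list(sounds)            # sound information, in reproduction order
        self.threshold = tilt_threshold_deg   # assumed threshold, not from the patent
        self.position = 0                     # index of the next sound to reproduce
        self.playing = False

    def update(self, tilt_deg):
        """Feed one tilt reading; return the sound to reproduce now, or None."""
        # Reproduction runs only while the tilt exceeds the threshold and
        # stops as soon as the device is returned to its original posture.
        self.playing = abs(tilt_deg) >= self.threshold
        if not self.playing or self.position >= len(self.sounds):
            return None
        sound = self.sounds[self.position]
        self.position += 1
        return sound
```

In this sketch, calling `update` with a small tilt returns `None`, and reproduction resumes from the next unread sound once the device is tilted again.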

According to the present invention, images can be searched intuitively and quickly.

FIG. 1 is a block diagram showing a photographing device incorporating an image recording/reproducing apparatus according to a first embodiment of the present invention.
FIG. 2 is an explanatory diagram showing an example of the association between persons and sound information made by the control unit 1.
FIGS. 3 to 6 are explanatory diagrams, corresponding to FIG. 2, for explaining sound output by the control unit 1.
FIG. 7 is a flowchart showing the main flow of the photographing device 10.
FIG. 8 is a flowchart showing the specific flow of the face-detected sound information determination process of step S15 in FIG. 7.
FIG. 9 is an explanatory diagram showing a sound information registration screen.
FIG. 10 is a flowchart showing the specific flow of the face-undetected sound information determination process of step S16 in FIG. 7.
FIG. 11 is a flowchart for explaining operation in the reproduction mode.
FIG. 12 is an explanatory diagram showing a display example of a musical score.
FIG. 13 is an explanatory diagram showing a display example of an image.
FIG. 14 is a block diagram showing a sound information adding apparatus according to a second embodiment of the present invention.
FIG. 15 is a block diagram showing a sound information adding apparatus according to a third embodiment of the present invention.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

(First embodiment)
FIG. 1 is a block diagram showing a photographing device incorporating an image recording/reproducing apparatus according to a first embodiment of the present invention. This embodiment is applied to a photographing device such as a camera.

The photographing device 10 has an imaging unit 2 composed of an image sensor such as a CCD or CMOS sensor. The imaging unit 2 converts the optical image of the subject formed on the imaging surface of the image sensor into an electrical signal and outputs the resulting image signal to the control unit 1.

The control unit 1 performs predetermined signal processing on the image signal from the imaging unit 2, for example color signal generation, matrix conversion, and various other digital processing. The control unit 1 includes a recording/reproduction control unit 1d, which can encode the image signals, audio signals, and the like processed by the control unit 1 and supply the compressed image information, audio information, and the like to the recording unit 3 for recording.

A card interface, for example, can be used as the recording/reproduction control unit 1d, which can record image information, audio information, and the like in a recording unit 3 such as a memory card. The recording/reproduction control unit 1d can also read out image information and audio information recorded on the recording medium and supply them to the control unit 1. The control unit 1 can decode the image information and audio information from the recording/reproduction control unit 1d to obtain an image signal and an audio signal.

The photographing device 10 is also provided with a clock unit 4, an operation unit 5, and a touch panel 6. The clock unit 4 generates time information used by the control unit 1. The operation unit 5 comprises a release button (not shown) provided on the photographing device 10 and various switches (not shown), such as those for setting the shooting mode. The operation unit 5 and the touch panel 6 generate operation signals based on user operations and output them to the control unit 1. The control unit 1 controls each unit based on the operation signals.

The photographing device 10 is also provided with a posture determination unit 7. The posture determination unit 7 can be composed of an acceleration sensor or the like and detects the posture of the main body of the photographing device 10. For example, the posture determination unit 7 detects tilt or shake of the main body of the photographing device 10 and supplies the detection result to the control unit 1.

The control unit 1 is also provided with a display control unit 1b. The display control unit 1b is supplied from the control unit 1 with captured images from the imaging unit 2 and reproduced images from the recording/reproduction control unit 1d, and can display these images on the display unit 8. The display control unit 1b can also cause the display unit 8 to show menus and the like for operating the photographing device 10.

The control unit 1 is also provided with an audio control unit 1c. The audio control unit 1c is supplied from the control unit 1 with the audio information reproduced by the recording/reproduction control unit 1d and can output sound through the speaker 9.

In the present embodiment, the control unit 1 is provided with a face detection unit 1a serving as an object feature amount detection unit. The face detection unit 1a receives the image signal from the imaging unit 2 and detects, for each frame, whether a shading pattern characteristic of a human face is present in the image. The face detection unit 1a supplies the feature amount of each detected face region to the recording unit 3 via the recording/reproduction control unit 1d for recording. The facial feature amount differs from person to person, and the face detection unit 1a records a feature amount in the recording unit 3 only when it detects a feature amount corresponding to a new person.

The recording unit 3 has a face feature information recording area 3a, a sound information recording area 3b, and an image/audio information recording area 3c. The face feature information recording area 3a stores, for each person, the facial feature amount detected by the face detection unit 1a. The sound information recording area 3b of the recording unit 3 stores sound information.

The sound information is, for example, information for outputting an arbitrary tone on a predetermined musical scale, that is, a musical tone. The sound information may include not only frequency (pitch) information but also information such as volume and timbre.

In the present embodiment, the control unit 1 can assign predetermined sound information to a person (face) identified by the facial feature amount.

FIG. 2 is an explanatory diagram showing an example of the assignment of sound information to faces (persons) by the control unit 1. To simplify the explanation, FIG. 2 represents the sound information as a musical score.

The example of FIG. 2 associates the sounds G, B, and D (English note names) with the three faces A to C, respectively. Here, with the sonority of later reproduction in mind, sounds that form a consonant chord are selected, but sounds that form a dissonant chord may be selected instead. A consonant chord is a chord built only from tones that stand in consonant intervals to one another, that is, intervals of a perfect unison, perfect octave, perfect fifth, perfect fourth, major third, major sixth, minor third, or minor sixth. People usually perceive a consonant chord as familiar or pleasant.
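The consonance rule stated above can be checked mechanically. The following Python sketch assumes twelve-tone equal temperament and natural note names only; the names `NOTE_TO_SEMITONE`, `CONSONANT_SEMITONES`, and `is_consonant_chord` are illustrative and do not appear in the patent.

```python
from itertools import combinations

# Semitone offsets of the natural note names within one octave (C = 0).
NOTE_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Interval sizes in semitones (within one octave) for: unison/octave,
# minor third, major third, perfect fourth, perfect fifth, minor sixth,
# major sixth.
CONSONANT_SEMITONES = {0, 3, 4, 5, 7, 8, 9}


def is_consonant_chord(notes):
    """Return True if every pair of notes forms a consonant interval."""
    for a, b in combinations(notes, 2):
        interval = abs(NOTE_TO_SEMITONE[a] - NOTE_TO_SEMITONE[b]) % 12
        if interval not in CONSONANT_SEMITONES:
            return False
    return True


# Assignment from the FIG. 2 example: faces A, B, C map to sounds G, B, D.
FACE_TO_NOTE = {"A": "G", "B": "B", "C": "D"}
```

Under these assumptions, `is_consonant_chord(FACE_TO_NOTE.values())` evaluates to `True`, since G, B, and D are pairwise a major third, a perfect fourth (or fifth), and a major sixth (or minor third) apart.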

The control unit 1 can also assign sound information to an image in which no face is detected (hereinafter referred to as a background image). The control unit 1 can also assign a chord as sound information. The example of FIG. 2 shows that three chords E, C, and F (chord names) can be assigned to background images. In the example of FIG. 2, the control unit 1 selectively assigns one of the three chords according to, for example, the brightness of the image.

The control unit 1 associates the assigned sound information with each image. The recording/reproduction control unit 1d records the sound information in the image/audio information recording area 3c of the recording unit 3 in association with the image information of each image. The image/audio information recording area 3c is the area in which images and sound information are recorded in association with each other.

The control unit 1 may determine the assignment between faces and sound information in advance. For example, sounds may be assigned in order of frequency to faces in the order in which they are detected. Sounds may also be assigned in order of frequency according to the size of the detected faces. Alternatively, the control unit 1 may detect the facial expression from the facial feature amount and assign higher-frequency sound information the more the face is smiling. The control unit 1 may also let the user select the sound information to assign each time a new person's face is detected. Furthermore, the control unit 1 may vary the volume, timbre, and the like according to the detected facial feature amount. For example, a louder sound may be assigned the larger the face is in the image, or sounds corresponding to different instruments may be assigned according to face size. The control unit 1 may also set frequency, volume, timbre, and so on all to different values for each detected face.
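As one illustration of the size-based assignment mentioned above, the following Python sketch ranks faces by size and hands out notes in frequency order. The function name, the dictionary-based interface, and the choice that the largest face receives the lowest note are all assumptions; the patent does not fix the direction of the mapping.

```python
def assign_notes_by_face_size(face_sizes, notes_low_to_high):
    """Map face ids to notes, giving the largest face the lowest note.

    face_sizes: dict of face id -> size in pixels (hypothetical input format).
    notes_low_to_high: note names ordered from low to high frequency.
    """
    # Rank face ids from largest to smallest.
    ranked = sorted(face_sizes, key=face_sizes.get, reverse=True)
    # Wrap around if there are more faces than available notes.
    return {
        face: notes_low_to_high[i % len(notes_low_to_high)]
        for i, face in enumerate(ranked)
    }
```

For example, with sizes `{"face1": 120, "face2": 80, "face3": 200}` and notes `["G", "B", "D"]`, the largest face (`face3`) is given G and the smallest (`face2`) is given D.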

Similarly, the control unit 1 may determine the assignment of sound information to background images in advance. For example, sounds may be assigned in order of frequency according to the brightness of the background image. The control unit 1 may also let the user select the sound information to assign for each background image.

The control unit 1 does not necessarily have to assign sound information to every face or background image. For example, it may be configured so that a preset number of pieces of sound information can be assigned to faces and background images. Alternatively, the control unit 1 may let the user select faces and background images to which no sound information is assigned.

Furthermore, by providing a feature amount detection unit in addition to the face detection unit 1a, the control unit 1 may be configured to detect specific objects other than faces and to assign sound information to each of those objects.

When reproducing images, the control unit 1 can control the audio control unit 1c to read out the sound information associated with each image and output it as sound.

FIGS. 3 to 6 are explanatory diagrams, corresponding to FIG. 2, for explaining sound output by the control unit 1. To simplify the explanation, FIGS. 3 to 6 represent the output sound as a musical score.

The control unit 1 can control the recording/reproduction control unit 1d and the display control unit 1b to read out the image information recorded in the recording unit 3 and display it on the display unit 8 as a normal display, a thumbnail display, or the like. During this display, the control unit 1 can read out the sound information recorded in association with the image and control the audio control unit 1c to output sound from the speaker 9.

FIG. 3(a) shows an image that includes the face C of FIG. 2. When displaying this image, the control unit 1 outputs the sound D shown in FIG. 3(b). FIG. 4(a) shows an image that includes the three faces A to C of FIG. 2. When displaying this image, the control unit 1 outputs the sounds G, B, and D shown in FIG. 4(b) simultaneously. The control unit 1 may instead output the sounds G, B, and D in sequence rather than simultaneously.
The sound information recording area 3b also stores priority sound information, which is sound information to be assigned preferentially. When assigning multiple pieces of sound information, the control unit 1 may assign the priority sound information first. For example, the priority sound information is set so that the sounds based on the individual pieces of sound information form consonant intervals with one another. With priority sound information, a consonant chord is output when an image contains multiple faces, so the user hears a familiar or pleasant chord.

FIG. 5 is an explanatory diagram showing an example of assigning sound information to a background image. FIGS. 5(a) and 5(c) are histograms showing the number of pixels at each luminance level in an image, with luminance on the horizontal axis and pixel count on the vertical axis. FIG. 5(a) shows an image with many pixels of relatively low luminance, that is, a relatively dark image, and FIG. 5(c) shows an image with many pixels of relatively high luminance, that is, a relatively bright image. For a relatively dark image, the control unit 1 assigns as sound information a chord made up of relatively low-frequency tones; for a relatively bright image, it assigns a chord made up of relatively high-frequency tones. FIG. 5(b) shows the chord (G) assigned to a relatively dark image such as that of FIG. 5(a), and FIG. 5(d) shows the chord (F) assigned to a relatively bright image such as that of FIG. 5(c).
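The brightness-based selection above can be sketched in Python as follows. The use of mean luminance as the brightness measure, the equal-width brightness bands, and the function names are assumptions for illustration; the patent only states that darker images receive lower chords and brighter images higher ones.

```python
def mean_luminance(histogram):
    """Mean luminance level of an image, given its histogram.

    histogram[i] is the number of pixels at luminance level i.
    """
    total = sum(histogram)
    if total == 0:
        raise ValueError("histogram contains no pixels")
    return sum(level * count for level, count in enumerate(histogram)) / total


def choose_chord(histogram, chords_low_to_high):
    """Pick a chord for a background image: darker images get lower chords."""
    # Normalize brightness to the range 0.0 (darkest) .. 1.0 (brightest).
    brightness = mean_luminance(histogram) / (len(histogram) - 1)
    # Split the brightness range into equal bands, one per chord (assumed rule).
    index = min(int(brightness * len(chords_low_to_high)),
                len(chords_low_to_high) - 1)
    return chords_low_to_high[index]
```

With a chord list ordered as `["G", "C", "F"]` (an ordering assumed here from the chords named for FIG. 5 and FIG. 6), a histogram concentrated at low luminance levels selects G and one concentrated at high levels selects F.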

Although an example has been given in which the control unit 1 assigns a triad of low-frequency tones to a dark image and a triad of high-frequency tones to a bright image, various other methods of assigning sound information are conceivable. For example, the control unit 1 may assign a minor chord to a dark image and a major chord to a bright image, and may assign a tension chord or the like to a particularly bright image. Also, while FIG. 5 shows chords being assigned according to image brightness, sound information may instead be assigned according to the fineness of the image pattern. For example, the finer the pattern, the higher the frequency of the tones in the assigned chord may be. The control unit 1 can also assign a single tone to a background image.

FIG. 6 is an explanatory diagram showing an example of sound output for continuously reproduced images. FIG. 6(a) shows images that are reproduced and displayed in succession. Each square frame represents one image, and a plain frame represents a background image in which no face is detected. FIG. 6(b) shows the sound output in correspondence with the reproduction and display of each image in FIG. 6(a). That is, FIG. 6(b) shows that the chords G, C, C, G, and F are output in sequence for the five images from the left in FIG. 6(a), the single tone G is output in correspondence with the display of the image including the face A, and the chords G and F are output in sequence for the two images on the right in FIG. 6(a).

The user listens to the sounds as they are output in sequence. For example, when the single tone G is output, the user can recognize that an image including the face A is being reproduced. Along the time axis, the human ear can distinguish sounds at extremely short intervals, so the user can, for example, pick out an image including the face A from an enormous number of images in a very short time.

Next, an example of specific operations during image recording and reproduction will be described with reference to FIGS. 7 to 13. FIG. 7 is a flowchart showing the main flow of the photographing device 10.

When the photographing device 10 is powered on, the control unit 1 determines in step S1 of FIG. 7 whether the mode for recording captured images in association with sound information (hereinafter referred to as the sound image mode) has been selected. If the sound image mode has not been selected, the control unit 1 determines in step S2 whether another mode has been selected. For example, if the reproduction mode has been selected, the control unit 1 shifts to the reproduction mode in step S3 and reproduces captured images. If no other mode has been selected, the control unit 1 determines in step S4 whether a shutdown operation has been performed. If a shutdown operation has been performed, the control unit 1 ends the process; otherwise, it returns to step S1 and repeats the determination of whether the sound image mode has been selected.

When the sound image mode is selected, the control unit 1 starts shooting in step S10. That is, the control unit 1 causes the display unit 8 to display the captured image (live view image) based on the imaging signal from the imaging unit 2. Next, the control unit 1 determines whether the release button has been pressed (step S11).

When the release button is pressed, the control unit 1 records the captured image in step S12. The control unit 1 applies predetermined signal processing to the captured image from the imaging unit 2 and then encodes it. The recording/reproduction control unit 1d supplies the compressed image information to the recording unit 3 for recording.

Next, the face detection unit 1a performs image recognition on the recorded image and detects any faces contained in it (step S13). If a face is detected, the control unit 1 proceeds from step S14 to step S15 and executes the face-detected sound information determination process. If no face is detected in the image, the control unit 1 proceeds to step S16 and executes the face-undetected sound information determination process.

The face-detected sound information determination process assigns sound information to the faces detected in the image. The face-undetected sound information determination process assigns sound information to the background image. In the next step S17, the control unit 1 records the assigned sound information in association with the image. In step S18, the control unit 1 determines whether shooting has ended. If shooting has ended, the process proceeds to step S4; if not, the process returns to step S11 and waits for the release button to be pressed.
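The branch between steps S15 and S16 and the recording in step S17 can be summarized in a short Python sketch. The function and parameter names are hypothetical, the detection and decision routines are passed in as callables because the patent does not define their interfaces, and the sketch stores the image and its sound together in one record rather than in two separate steps.

```python
def record_image_with_sound(image, detect_faces, decide_face_sound,
                            decide_background_sound, storage):
    """One pass through steps S12 to S17 for a single captured image."""
    faces = detect_faces(image)                  # S13: face detection
    if faces:                                    # S14: was any face found?
        sound = decide_face_sound(faces)         # S15: face-detected process
    else:
        sound = decide_background_sound(image)   # S16: face-undetected process
    # S12 and S17 combined: keep the image and its sound information together.
    storage.append({"image": image, "sound": sound})
    return sound
```

A caller would invoke this once per release-button press and loop until shooting ends, corresponding to the return from step S18 to step S11.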

FIG. 8 is a flowchart showing the specific flow of the face-detected sound information determination process of step S15 in FIG. 7.

When a face is detected in the image, the control unit 1 determines in step S21 whether sound information has already been registered for the detected face. If it has, the control unit 1 sets the registered sound information for the detected face in step S22.

If it has not been registered, the control unit 1 displays, in step S23, a screen for registering the sound information to be assigned to the face.

FIG. 9 is an explanatory diagram showing such a sound information registration screen. In FIG. 9, one face 22 detected from the image is displayed on the image 21. The sound information that can be registered for this face 22 is displayed as musical scores 24 and 26. On the image 21, a registration prompt 23 reading "Register?" and a non-registration prompt 25 reading "Do not register" are displayed. Below the registration prompt 23, the sound information available for registration is shown as notes on the score 24. The score 24 in FIG. 9 is assumed to show only notes based on sound information that has never been registered. Below the non-registration prompt 25, the sound information used when nothing is registered is shown as a note on the score 26.

When the user selects one of the notes on the score 24, the control unit 1 assigns the sound information corresponding to the selected note to the face 22 and registers it. In this case, the control unit 1 returns the process from step S24 to step S21 and then, in step S22, sets the registered sound information for the detected face.

On the other hand, when the user selects the note on the score 26, this instructs that no sound information be registered for the detected face. In this case, in step S25, the control unit 1 assigns a prescribed sound (the sound E in the example of FIG. 9) to the detected face. The prescribed sound information in this case may be silence.

In step S26, the control unit 1 combines the one or more pieces of sound information set for the faces in one image into a chord. As described above, even when a plurality of faces are detected in one image and sound information is assigned to each face, the control unit 1 may instead output the assigned sounds in a time-division manner without combining them into a chord.

In the example of FIG. 9, the sound information that can be registered consists of the three sounds G, B, and D, and the prescribed sound for an unregistered face is the sound E. However, the apparatus may be configured so that the user can select a sound of any pitch. In this case, already registered sound information may be made unselectable, or already registered sound information may be made assignable to a plurality of faces.
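The flow of steps S21 to S26 can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the names `note_for_face`, `chord_for_image`, `registry`, and `choices` are hypothetical, faces are identified by plain strings, and the user's choice on the registration screen is modeled as an optional note name.

```python
from typing import Dict, List, Optional

# Prescribed sound for a face the user declines to register
# (sound E in the FIG. 9 example).
DEFAULT_NOTE = "E"


def note_for_face(face_id: str,
                  registry: Dict[str, str],
                  chosen: Optional[str]) -> str:
    """Steps S21 to S25 for one detected face.

    `chosen` is the note picked on the registration screen,
    or None if the user chose not to register (hypothetical model).
    """
    if face_id in registry:           # S21: sound already registered?
        return registry[face_id]      # S22: use the registered sound
    if chosen is not None:            # S23/S24: user registers a note
        registry[face_id] = chosen
        return registry[face_id]      # back through S21 to S22
    return DEFAULT_NOTE               # S25: prescribed sound


def chord_for_image(face_ids: List[str],
                    registry: Dict[str, str],
                    choices: Dict[str, str]) -> List[str]:
    """Step S26: collect the notes of all faces in one image as a chord."""
    return [note_for_face(f, registry, choices.get(f)) for f in face_ids]
```

With a registry that already maps one face to G, an image containing that face, a newly registered face (B), and an unregistered face yields the notes G, B, and E, and only the newly registered face is added to the registry.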

図１０は図７中のステップＳ１６の顔未検出時音情報決定処理の具体的なフローを示すフローチャートである。 FIG. 10 is a flowchart showing a specific flow of the sound information determination process when no face is detected in step S16 in FIG.

画像中に顔が検出されない場合には、制御部１はステップＳ３１において、画像の平均輝度を算出する。次に、制御部１は平均輝度が夜景の明るさに相当するか否かを判定する（ステップＳ３２）。平均輝度が夜景の明るさに相当する場合には、制御部１は、ステップＳ３６において背景画像に低音の和音（例えば図２の和音Ｇ）を設定する。 If no face is detected in the image, the control unit 1 calculates the average luminance of the image in step S31. Next, the control unit 1 determines whether or not the average luminance corresponds to the brightness of the night scene (step S32). When the average luminance corresponds to the brightness of the night scene, the control unit 1 sets a low-pitched chord (for example, the chord G in FIG. 2) in the background image in step S36.

Next, the control unit 1 determines whether the average luminance corresponds to indoor brightness (step S33). If it does, the control unit 1 sets a mid-pitched chord (for example, the chord C in FIG. 2) for the background image in step S35.

If the control unit 1 determines in step S33 that the average luminance does not correspond to indoor brightness, that is, that it corresponds to daytime outdoor brightness, it sets a high-pitched chord (for example, the chord F in FIG. 2) for the background image in step S34. In this way, a chord whose pitch corresponds to the brightness of the image is set for the background image.
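The brightness-based selection of steps S31 to S36 amounts to a two-threshold classification. The following Python sketch illustrates it under stated assumptions: the description gives no numeric thresholds, so `NIGHT_MAX` and `INDOOR_MAX` are hypothetical values on a 0 to 255 luminance scale, and the image is modeled as a flat sequence of luminance samples.

```python
from typing import Sequence

# Hypothetical thresholds on a 0-255 luminance scale; the source text
# does not specify what counts as night-scene or indoor brightness.
NIGHT_MAX = 50
INDOOR_MAX = 130


def average_luminance(pixels: Sequence[int]) -> float:
    """Step S31: mean luminance of the image samples."""
    if not pixels:
        raise ValueError("image has no pixels")
    return sum(pixels) / len(pixels)


def background_chord(pixels: Sequence[int]) -> str:
    """Steps S32 to S36: choose a chord by average brightness."""
    mean = average_luminance(pixels)
    if mean <= NIGHT_MAX:      # S32: night-scene brightness
        return "G"             # S36: low-pitched chord
    if mean <= INDOOR_MAX:     # S33: indoor brightness
        return "C"             # S35: mid-pitched chord
    return "F"                 # S34: high-pitched chord (daytime outdoors)
```

The chord names G, C, and F follow the examples given in the text; any other low, middle, and high set would fit the same structure.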

In the sound information recording process of the next step S17, the control unit 1 controls the recording/playback control unit 1d to record each image, together with the sound information assigned to it, in the image/audio information recording area 3c of the recording unit 3. In the above example, the control unit 1 assigns a single note to a face and a chord to the background image, but it may instead assign a chord to a face and a single note to the background image. Also, in the above example, the control unit 1 assigns a relatively high-frequency sound to a face and a relatively low-frequency sound to the background image, but it may assign a high-pitched sound to a face and a low-pitched sound to the background image. Alternatively, the control unit 1 may assign silence to the background image.

次に、図１１を参照して再生モード時の動作について説明する。 Next, the operation in the playback mode will be described with reference to FIG.

ステップＳ４１において再生モードが指示されたことを検出すると、制御部１は再生画像を表示する（ステップＳ４２）。即ち、制御部１は、記録再生制御部１ｄを制御して、記録部３に記録されている画像を読み出す。制御部１は、読み出した画像に復号処理等を施した後、表示制御部１ｂを制御して、表示部８に表示させる。 When it is detected in step S41 that the reproduction mode has been instructed, the control unit 1 displays a reproduction image (step S42). That is, the control unit 1 controls the recording / playback control unit 1d to read out the image recorded in the recording unit 3. The control unit 1 performs a decoding process or the like on the read image, and then controls the display control unit 1b to display on the display unit 8.

Next, in step S43, the control unit 1 determines whether the photographing device 10 is tilted. The control unit 1 can obtain the tilt angle of the photographing device 10 from the output of the posture determination unit 7. If the photographing device 10 is not tilted by the predetermined angle, the control unit 1 performs forward and back operations on the reproduced image in step S48.

Now suppose the user tilts the photographing device 10 by the predetermined angle or more. The control unit 1 then moves the process from step S43 to step S44 and, for the sound information corresponding to the images in a predetermined playback order, sequentially reads and plays back the sound information starting from that of the currently displayed image. That is, the control unit 1 controls the audio control unit 1c so that sound based on the sequentially read sound information is output from the speaker 9. In doing so, the control unit 1 outputs the sounds at a playback speed corresponding to the tilt angle of the photographing device 10: the larger the tilt angle, the faster the sounds are output, and the smaller the tilt angle, the slower they are output.
For example, if the control unit 1 outputs 10 sounds per second, corresponding to 10 images, the sounds for 100 images can be output in 10 seconds. By listening to the sound output from the speaker 9, the user can instantly judge by ear whether each image contains a face (person), whose face it is if so, and whether the image is bright or dark.
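The relationship between tilt angle and playback speed can be illustrated with a short Python sketch. The text only states that a larger tilt gives faster output; the linear mapping, the threshold and maximum angles, and the rate limits below are all assumptions made for illustration, and the function names are hypothetical.

```python
# Hypothetical constants; the source gives no concrete angles or rates
# apart from the example of 10 sounds per second.
THRESHOLD_DEG = 15.0   # tilt at or above which playback starts
MAX_DEG = 60.0         # tilt at which the rate stops increasing
MIN_RATE = 2.0         # sounds per second at the threshold
MAX_RATE = 50.0        # sounds per second at MAX_DEG and beyond


def sounds_per_second(tilt_deg: float) -> float:
    """Map a tilt angle to a playback rate (assumed linear mapping)."""
    magnitude = abs(tilt_deg)
    if magnitude < THRESHOLD_DEG:
        return 0.0                       # not tilted enough: no playback
    clamped = min(magnitude, MAX_DEG)
    fraction = (clamped - THRESHOLD_DEG) / (MAX_DEG - THRESHOLD_DEG)
    return MIN_RATE + fraction * (MAX_RATE - MIN_RATE)


def scan_time_seconds(image_count: int, rate: float) -> float:
    """Time needed to play one sound per image at the given rate."""
    return image_count / rate
```

With the figures from the text, `scan_time_seconds(100, 10.0)` gives 10.0 seconds for 100 images at 10 sounds per second.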

また、この場合には、制御部１は、表示制御部１ｂを制御して、順次出力される音に対応する楽譜を表示してもよい。図１２はこの場合の表示例を示す説明図である。図１２に示すように、画像３１上には、楽譜３２が表示されている。楽譜３２は現在順次出力されている音に対応しており、音の再生に合わせて楽譜３２も変化するようになっている。図１２のマーカ３３は現在出力中の音に対応する音符の位置を示している。また、制御部１は、現在出力中の音については他の音と異なる色で表示するようにしてもよい。 In this case, the control unit 1 may control the display control unit 1b to display a score corresponding to the sequentially output sound. FIG. 12 is an explanatory diagram showing a display example in this case. As shown in FIG. 12, a score 32 is displayed on the image 31. The score 32 corresponds to the sound that is currently output sequentially, and the score 32 also changes as the sound is played back. A marker 33 in FIG. 12 indicates the position of a note corresponding to the sound currently being output. Further, the control unit 1 may display the currently output sound in a color different from other sounds.

また、制御部１は、撮影機器１０の傾斜方向が逆になった場合には、音の再生順を逆順にするようにしてもよい。また、制御部１は図１２の楽譜に代えて画像３１上に再生中の音に対応する画像を表示させるようにしてもよい。 Further, the control unit 1 may reverse the sound reproduction order when the tilt direction of the photographing apparatus 10 is reversed. Further, the control unit 1 may display an image corresponding to the sound being reproduced on the image 31 instead of the score of FIG.

制御部１はステップＳ４５において撮影機器１０の傾斜角度が所定の角度以下になったか否かを判定する。音の再生処理は撮影機器１０の傾斜角度を所定の角度以下に戻すまで続けられる。撮影機器１０の傾きを元に戻す操作によって、制御部１は音の再生を停止させる（ステップＳ４６）。 In step S45, the control unit 1 determines whether or not the inclination angle of the imaging device 10 has become equal to or smaller than a predetermined angle. The sound reproduction process is continued until the inclination angle of the photographing apparatus 10 is returned to a predetermined angle or less. By the operation of returning the tilt of the photographing apparatus 10, the control unit 1 stops the sound reproduction (step S46).

音の再生を停止すると、制御部１は、最後に出力した音に対応する画像の情報読出して、表示部８に表示させる（ステップＳ４７）。図１３はこの場合の表示例を示す説明図である。図１３に示すように、画像３１上には、最後に出力した音に対応する画像３４が表示されている。 When the reproduction of the sound is stopped, the control unit 1 reads out the information of the image corresponding to the last output sound and displays it on the display unit 8 (step S47). FIG. 13 is an explanatory diagram showing a display example in this case. As shown in FIG. 13, an image 34 corresponding to the sound output last is displayed on the image 31.
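The tilt-driven playback of steps S44 to S47 can be sketched as a loop that plays one sound per tilt reading, stops once the tilt returns to the threshold or below, and reports which image should then be displayed. This is a simplified model under stated assumptions: the function name `scan`, the threshold value, and the idea of sampling the tilt once per sound are hypothetical, and reversing the order for a negative tilt follows the optional behavior mentioned in the text.

```python
from typing import List, Sequence, Tuple

THRESHOLD_DEG = 15.0  # hypothetical "predetermined angle"


def scan(sounds: Sequence[str],
         start: int,
         tilt_readings: Sequence[float]) -> Tuple[List[str], int]:
    """Play sounds from index `start` while the device stays tilted.

    Returns the sounds played and the index of the last sound output,
    which identifies the image to display after playback stops (S47).
    """
    played: List[str] = []
    index = start
    last = start
    for tilt in tilt_readings:
        if abs(tilt) <= THRESHOLD_DEG:      # S45: tilt restored
            break                           # S46: stop playback
        if not 0 <= index < len(sounds):    # ran past either end
            break
        played.append(sounds[index])        # S44: output this sound
        last = index
        index += 1 if tilt > 0 else -1      # opposite tilt reverses order
    return played, last
```

For example, starting at the second of five sounds with three tilted readings followed by a level one, the loop plays three sounds and returns the index of the third one played, so the corresponding image would be shown.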

ステップＳ４９において再生の終了が指示されると、制御部１は図７のメイン処理に処理を戻す。 When the end of reproduction is instructed in step S49, the control unit 1 returns the process to the main process of FIG.

Although FIG. 11 describes an example in which tilting the photographing device 10 is the operation that starts sound playback, any suitable method, such as operating a dedicated sound playback button, may be used instead.

このように本実施の形態においては、画像中の顔や背景画像に音情報を対応させて記録する。再生時に音を読み出して順次再生することにより、ユーザは再生される音によって画像中の顔（人物）や背景画像を認識することができる。即ち、本実施の形態においては、聴覚を補助に使って、迅速に画像を探すことが出来るようになる。人間は、ひとつの音を１／５０秒程度で聞き分けられるので、画面に目を凝らさなくても、１００枚の画像の内容を２秒で判定することも可能である。これにより、記録部に記録されている膨大な量の画像から、希望する画像を極めて短時間に検索することが可能である。 As described above, in this embodiment, sound information is recorded in association with a face or background image in an image. By reading out the sound during reproduction and sequentially reproducing the sound, the user can recognize the face (person) and the background image in the image by the reproduced sound. In other words, in the present embodiment, it becomes possible to quickly search for an image using hearing as an aid. Since humans can recognize a single sound in about 1/50 seconds, it is possible to determine the contents of 100 images in 2 seconds without focusing on the screen. Thereby, it is possible to search for a desired image in a very short time from an enormous amount of images recorded in the recording unit.

なお、上記実施の形態においては、顔が検出された画像については、その背景の画像に音情報を割当てていないが、顔及びその背景の画像の両方に音情報を割当ててもよい。 In the embodiment described above, sound information is not assigned to the background image of the image in which the face is detected, but sound information may be assigned to both the face and the background image.

(Second Embodiment)
FIG. 14 is a block diagram showing a sound information adding apparatus according to the second embodiment of the present invention.

In the first embodiment, the invention is applied to a photographing device, and sound information is added to each image at the time of capture. In contrast, the present embodiment shows an example applied to a computer that adds sound information to already recorded images and records the result.

画像記録再生部５１には複数の画像が記録されている。コンピュータ５２は画像記録再生部５１によって再生された画像を読み出す。コンピュータ５２には特徴量検出部５３及び音情報割当て部５４が設けられている。特徴量検出部５３は、画像中の顔や背景画像等の特徴量を求める。音情報割当て部５４は、図１の制御部１及び記録部３と同様の構成であり、画像中の顔や背景画像に対して音情報を割当てるようになっている。なお、音情報割当て部５４は、検出した顔や背景画像について所定の規則で順次音情報を割当ててもよく、ユーザの設定に従って音情報を割当てもよい。コンピュータ５２は音情報割当て部５４において割当てた音情報を対応する画像と共に記録部５５において記録する。 A plurality of images are recorded in the image recording / playback unit 51. The computer 52 reads the image reproduced by the image recording / reproducing unit 51. The computer 52 is provided with a feature amount detection unit 53 and a sound information allocation unit 54. The feature amount detection unit 53 obtains feature amounts such as a face and a background image in the image. The sound information assigning unit 54 has the same configuration as the control unit 1 and the recording unit 3 in FIG. 1 and assigns sound information to the face and background image in the image. Note that the sound information assigning unit 54 may sequentially assign sound information according to a predetermined rule for the detected face and background image, or may assign sound information in accordance with user settings. The computer 52 records the sound information allocated by the sound information allocation unit 54 in the recording unit 55 together with the corresponding image.

モニタ５６はコンピュータ５２が記録部５５から読み出した画像を表示すると共に、コンピュータ５２が読み出した音情報を順次音響出力することができる。 The monitor 56 can display an image read by the computer 52 from the recording unit 55 and can sequentially output sound information read by the computer 52 as a sound.

他の構成及び作用・効果は第１の実施の形態と同様である。 Other configurations, operations and effects are the same as those in the first embodiment.

(Third Embodiment)
FIG. 15 is a block diagram showing a sound information adding apparatus according to the third embodiment of the present invention.

本実施の形態は既に記録されている画像に対して音情報を付加して出力するテレビジョン表示装置に適用した例を示している。 This embodiment shows an example in which the present invention is applied to a television display device that outputs sound information added to an already recorded image.

画像記録再生部５１には複数の画像が記録されている。テレビジョン表示装置６１は画像記録再生部５１によって再生された画像を読み出す。テレビジョン表示装置６１には特徴量検出部６２及び音情報割当て部６３が設けられている。特徴量検出部６２は、画像中の顔や背景画像等の特徴量を求める。音情報割当て部６３は、図１の制御部１及び記録部３と同様の構成であり、画像中の顔や背景画像に対して音情報を割当てるようになっている。本実施の形態においては、テレビジョン表示装置６１は、割当てた音情報を順次スピーカ制御部６４に与えて音響出力させることができる。 A plurality of images are recorded in the image recording / playback unit 51. The television display device 61 reads the image reproduced by the image recording / reproducing unit 51. The television display device 61 is provided with a feature amount detection unit 62 and a sound information allocation unit 63. The feature amount detection unit 62 obtains a feature amount such as a face or a background image in the image. The sound information assigning unit 63 has the same configuration as that of the control unit 1 and the recording unit 3 in FIG. 1 and assigns sound information to the face and background image in the image. In the present embodiment, the television display device 61 can sequentially give the assigned sound information to the speaker control unit 64 for sound output.

Thus, in the present embodiment, sounds based on the sound information corresponding to the images recorded in the image recording/playback unit 51 can be played back sequentially, so that the user can find out in a short time what kinds of images are recorded in an image recording/playback unit 51 holding an enormous number of images.

他の作用効果は第１及び第２の実施の形態と同様である。 Other functions and effects are the same as those of the first and second embodiments.

DESCRIPTION OF SYMBOLS: 1: control unit, 2: imaging unit, 3: recording unit, 5: operation unit, 7: posture determination unit, 8: display unit, 9: speaker.

Claims

1. An image reproducing apparatus comprising:
a subject feature detection unit that detects a subject by obtaining a feature of the subject in an image;
a sound information assigning unit that assigns sound information to the detected subject;
an output control unit that outputs the sound information assigned by the sound information assigning unit in correspondence with an image including the subject;
a sound output unit that outputs sound based on the sound information output by the output control unit in correspondence with display of an image including the subject;
a posture determination unit that determines whether or not the image reproducing apparatus is tilted; and
a control unit that performs control so that, when the posture determination unit determines that the image reproducing apparatus is tilted, the sound information is sequentially read and played back, and, when the tilt of the image reproducing apparatus is restored, the playback is stopped.

2. The image reproducing apparatus according to claim 1, wherein the sound information is reproduced at a higher speed as the inclination of the image reproducing apparatus is larger.

3. The image reproducing apparatus according to claim 2, wherein the control unit performs control so that the reproduction order of the sound information is reversed when the image reproducing apparatus is tilted in the direction opposite to said tilt.

4. A sound information output method for an image reproducing apparatus, comprising:
a step in which a subject feature detection unit detects a subject by obtaining a feature of the subject in an image;
a step in which a sound information assigning unit assigns sound information to the detected subject;
a step in which an output control unit outputs the sound information assigned by the sound information assigning unit in correspondence with an image including the subject;
a step of outputting sound based on the sound information output by the output control unit in correspondence with display of an image including the subject;
a step of determining whether or not the image reproducing apparatus is tilted; and
a step in which a control unit performs control so that, when the posture determination unit determines that the image reproducing apparatus is tilted, the sound information is sequentially read and played back, and, when the tilt of the image reproducing apparatus is restored, the playback is stopped.