JP2010245695A

JP2010245695A - Imaging apparatus

Info

Publication number: JP2010245695A
Application number: JP2009090286A
Authority: JP
Inventors: Yoshiko Ono; 佳子小野; Shintaro Iijima; 慎太郎飯島
Original assignee: Nikon Corp
Current assignee: Nikon Corp
Priority date: 2009-04-02
Filing date: 2009-04-02
Publication date: 2010-10-28
Anticipated expiration: 2029-04-02
Also published as: JP5299034B2

Abstract

PROBLEM TO BE SOLVED: To provide an imaging apparatus that can record sound suited to a captured image.

SOLUTION: The imaging apparatus includes: an image acquisition unit (71) that acquires an image; a sound acquisition unit (74) that acquires sound in association with the image; a recognition unit (58) that recognizes the state in which the image was acquired; a processing unit (76) that processes the sound acquired by the sound acquisition unit; a control unit (64) that controls the processing unit according to the recognition result of the recognition unit; and a storage unit (79) that stores the image and the sound processed by the processing unit in a storage medium in association with each other.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to an imaging apparatus.

Conventionally, some video cameras and still cameras capable of recording moving images with sound record the sound in the manner best suited to whichever shooting mode was selected at the time of shooting (see, for example, Patent Document 1).

Patent Document 1: JP 2000-354190 A

However, with a conventional imaging apparatus, the photographer must set the shooting scene before shooting by selecting one of the shooting modes provided on the camera, such as portrait or landscape. As a result, the wrong setting may be selected, and the setting work itself is troublesome.

The present invention has been made in view of these problems, and its object is to provide an imaging apparatus that eliminates setting errors and troublesome setting work and can record sound suited to the captured image.

The present invention solves the above problems by the following means. For ease of understanding, the description uses reference numerals corresponding to the embodiments of the present invention, but the invention is not limited to them. The invention of claim 1 is an imaging apparatus comprising: an image acquisition unit (71) that acquires an image; a sound acquisition unit (74) that acquires sound in association with the image; a recognition unit (58) that recognizes the state in which the image was acquired; a processing unit (76) that processes the sound acquired by the sound acquisition unit; a control unit (64) that controls the processing unit according to the recognition result of the recognition unit; and a storage unit (79) that stores the image and the sound processed by the processing unit in a storage medium in association with each other.

The invention of claim 2 is the imaging apparatus according to claim 1, wherein the recognition unit (58) recognizes the state in which the image was acquired by performing scene analysis using the image.

The invention of claim 3 is the imaging apparatus according to claim 1, further comprising a scene operation unit (25) that the photographer can operate according to the shooting scene, wherein the recognition unit (58) recognizes the state in which the image was acquired by giving more weight to the scene determined by the scene analysis than to the scene set with the scene operation unit (25).

The invention of claim 4 is the imaging apparatus according to claim 1, further comprising an operation unit (31) that the photographer can operate to change the characteristics used to correct image blur, wherein the recognition unit (58) recognizes the state in which the image was acquired according to the operation state of the operation unit (31).

The invention of claim 5 is the imaging apparatus according to claim 3 or claim 4, further comprising a detection unit (61, 61a) that detects shake of the apparatus, wherein the control unit (64) controls the processing unit (76) so as to process the sound according to the shake of the apparatus detected by the detection unit.

The invention of claim 6 is the imaging apparatus according to any one of claims 1 to 5, wherein the processing unit (76) is a filter (76) that changes the gain according to the frequency of the sound acquired by the sound acquisition unit (74), and the control unit (64) controls the characteristics of the filter (76) according to the recognition result of the recognition unit (58).

The invention of claim 7 is an imaging apparatus comprising: an image acquisition unit (71) that acquires an image; a sound acquisition unit (74) that acquires sound in association with the image; a recognition unit (58) that recognizes the state in which the image was acquired; and a storage unit (79) that stores the image, the sound, and the recognition result of the recognition unit in a storage medium in association with one another.

The invention of claim 8 is the imaging apparatus according to claim 7, wherein the recognition unit (58) recognizes the state in which the image was acquired by performing scene analysis using the image.

According to the present invention, it is possible to provide an imaging apparatus that can record sound suited to the captured image.

FIG. 1 is a perspective view schematically showing the external appearance of the imaging apparatus according to the first embodiment.
FIG. 2 is a diagram showing the configuration of the imaging apparatus according to the first embodiment.
FIG. 3 is a block diagram showing the functions of the imaging apparatus according to the first embodiment.
FIG. 4 is a flowchart showing an example of scene analysis by the scene recognition unit.
FIG. 5 is a diagram showing an example of the frequency characteristics of the sound collected by an average microphone.
FIG. 6 is a diagram showing an example of the filter characteristics.
FIG. 7 is a block diagram showing the functions of the imaging apparatus according to the second embodiment.
FIG. 8 shows graphs of the input state over time of each signal that is input to the control unit and stored in the storage unit in the imaging apparatus according to the second embodiment: (a) the image signal, (b) the audio signal of the right microphone, (c) the audio signal of the left microphone, (d) the angular velocity signal, and (e) the active-mode ON signal.

Embodiments of the present invention will be described below with reference to the drawings.

(First embodiment)
FIG. 1 is a perspective view schematically showing the external appearance of the imaging apparatus according to the first embodiment. FIG. 2 is a diagram showing the configuration of the imaging apparatus according to the first embodiment.

The imaging apparatus according to the first embodiment is a digital single-lens reflex camera 1 (hereinafter abbreviated as camera 1) capable of shooting both still images and moving images. The camera 1 consists of a body part 4, a finder device unit 7 provided on the upper part of the body part 4, and a lens unit 10 provided on the front of the body part 4. The lens unit 10 houses a photographing lens 11 made up of a plurality of lens groups; the photographing lens 11 constitutes the photographing optical system. The lens unit 10 is also provided with a subject distance detection unit 13 and a focal length detection unit 16.

On the outside of the body part 4 are a still image/moving image changeover switch 19, a release button 22, a scene operation unit 25, a camera shake correction switch 28 for activating the camera shake correction mechanism, and a correction mode changeover switch 31 for changing the camera shake correction characteristics. A microphone 34 for recording sound during moving image shooting is provided in the lower part of the front surface of the body part 4, that is, the surface facing the subject. The microphone 34 consists of two microphones: a right microphone 34R on the right side and a left microphone 34L on the left side as seen facing the subject (not shown). On the back of the body part 4 is a liquid crystal display unit 37, which displays the image being shot during moving image shooting and the image captured by the photographing optical system during shooting standby. The liquid crystal display unit 37 also displays the settings of the camera 1 and various operation menus. Speakers 40, which output sound when a recorded moving image is played back, are provided on both side surfaces of the body part 4: a right speaker 40R on the right side surface and a left speaker 40L on the left side surface as seen facing the subject.

An image sensor 43 for capturing the subject is provided inside the body part 4. The image sensor 43 is a CCD or CMOS sensor. A quick return mirror 46, which reflects light from the subject that has passed through the photographing lens 11 toward the finder device unit 7, is provided in the photographing optical path between the photographing lens 11 and the image sensor 43.

When a still image is shot, light from the subject before release passes through the photographing lens 11, enters the body part 4, is reflected by the quick return mirror 46, and forms an image on a finder screen 49 located at a position conjugate with the image sensor 43. The subject image formed on the finder screen 49 is guided through a pentaprism 52 in the finder device unit 7 to an eyepiece 55, where the photographer observes it. After release, the quick return mirror 46 flips up, and light from the subject that has passed through the photographing lens 11 forms an image on the image sensor 43.

FIG. 3 is a block diagram showing the functions of the camera 1 according to the first embodiment.

The functions of the camera 1 are described below, taking as an example the case of shooting a moving image of a subject (not shown) with the camera 1.

When the still image/moving image changeover switch 19 is operated to put the camera 1 in moving image shooting mode, the quick return mirror 46 flips up and is held retracted from the photographing optical path, and the camera enters the standby state for moving image shooting. The camera 1 lets the photographer select a shooting mode suited to the shooting conditions. The shooting mode is selected by operating the scene operation unit 25. The scene operation unit 25 is a shooting scene selection dial offering shooting modes such as a portrait shooting mode, a child shooting mode for use at events such as school sports days, a night scene shooting mode, and a macro shooting mode. When the photographer selects a shooting scene, the settings of the photographing lens 11, the aperture (not shown), and so on are adjusted automatically so that an image suited to the selected shooting mode can be shot. The setting state of the scene operation unit 25 is output to the scene recognition unit 58.

The camera 1 has a camera shake correction mechanism for reducing blur in the image being shot. To activate the mechanism, the camera shake correction switch 28 (hereinafter referred to as the VR switch 28) is turned on. This activates the shake detection unit 61 provided in the scene recognition unit 58 or the shake detection unit 61a provided in the lens unit 10. The shake detection unit 61 detects shake of the camera 1 by image processing, using the image signal input to the scene recognition unit 58 described later. The shake detection unit 61a is a gyro sensor (hereinafter also referred to as the gyro sensor 61a). The shake detection units 61 and 61a detect shake of the camera 1, that is, the angular velocity when the body part 4 tilts. The photographing lens 11 incorporates a correction lens (not shown), and the optical axis is corrected by moving the correction lens in the direction that cancels the tilt of the body part 4, according to the angular velocity detected by the shake detection units 61 and 61a. This suppresses movement of the light reaching the light-receiving surface of the image sensor 43 and reduces camera shake. In the illustrated embodiment, either both of the shake detection units 61 and 61a or just one of them may be used.
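The correction principle described above (move the correction lens so as to cancel the tilt implied by the detected angular velocity) can be sketched as follows. This is a minimal illustration under stated assumptions, not the control law of the camera 1: the function name, the sampling interval, and the simple model in which the image shift equals the focal length times the tangent of the accumulated tilt angle are all hypothetical.

```python
import math

def correction_lens_shifts(angular_velocities_dps, dt_s, focal_length_mm):
    """Return, for each gyro sample, the image shift (mm) to apply so that
    the image displacement caused by the accumulated tilt is cancelled.

    Hypothetical model: the tilt angle is the integral of the angular
    velocity, and a tilt of theta displaces the image by f * tan(theta).
    """
    angle_rad = 0.0
    shifts = []
    for omega_dps in angular_velocities_dps:
        # Integrate angular velocity (degrees per second) into a tilt angle.
        angle_rad += math.radians(omega_dps) * dt_s
        image_shift_mm = focal_length_mm * math.tan(angle_rad)
        # Drive the correction in the opposite direction to cancel the shift.
        shifts.append(-image_shift_mm)
    return shifts
```

For example, `correction_lens_shifts([10.0, 10.0], 0.01, 50.0)` returns two negative values of increasing magnitude, since a steady tilt in one direction requires a growing correction in the other.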

The camera 1 according to the present embodiment can also change the correction characteristics of the camera shake correction mechanism. The correction characteristics are changed by operating the correction mode changeover switch 31 (hereinafter referred to as the VR mode changeover switch 31). The VR mode changeover switch 31 switches, for example, between an active mode and a normal mode (hereinafter referred to as the VR active mode and the VR normal mode, respectively). The VR active mode is used when the photographer shoots while riding in a vehicle such as an automobile or a train; it detects and corrects shake of the camera 1 caused by the vibration characteristic of such vehicles. The VR normal mode is used to correct camera shake when the photographer is not riding in a vehicle. Selecting the VR mode makes it possible to reduce image blur with correction characteristics suited to the shooting conditions. The VR mode changeover switch 31 may instead be a switch that selects between tripod use and no tripod. As for the setting state of the VR mode changeover switch 31, a VR active on signal is output while the switch is set to the VR active mode and is not output while the switch is set to the VR normal mode. The VR active on signal is output to both the scene recognition unit 58 and the control unit 64.

In the standby state for moving image shooting, light from the subject that has passed through the photographing optical system forms an image on the image sensor 43, and the liquid crystal display unit 37 on the back of the body part 4 displays the image of the subject captured by the photographing optical system. When the photographer presses the release button 22 halfway, the subject distance detection unit 13 and the focal length detection unit 16 operate to calculate the distance to the subject and the focal length of the subject, and the photographing lens 11 is driven automatically so that the subject is in focus. The photographer can thus shoot with the subject in focus. The distance to the subject and the focal length of the subject calculated by the subject distance detection unit 13 and the focal length detection unit 16 are output to the scene recognition unit 58.

When the photographer presses the release button 22 all the way and the full-press switch turns on, moving image shooting starts, and the liquid crystal display unit 37 displays the image being recorded. Light from the subject imaged on the image sensor 43 is photoelectrically converted by the image sensor 43 into an electrical signal. The electrical signal undergoes analog processing in an analog processing circuit 67 and is then converted into a digital signal by an A/D conversion circuit 68 to generate an image signal. An image processing circuit 69 applies image processing such as white balance adjustment, saturation adjustment, contour adjustment, and gradation adjustment to the image signal. The photographing optical system, the image sensor 43, the analog processing circuit 67, the A/D conversion circuit 68, and the image processing circuit 69 together constitute an image acquisition unit 71. The processed image signal is output to both the scene recognition unit 58 and the control unit 64.

The image processing circuit 69 includes a face recognition unit 70 that recognizes the faces of people within the angle of view. The face recognition unit 70 calculates the area, number, position, and so on of the faces within the angle of view and outputs the results to the scene recognition unit 58 as face recognition information.

During moving image shooting, the signals and information described above are output continuously and input to the scene recognition unit 58: the image signal from the image processing circuit 69, the face recognition information from the face recognition unit 70, the setting state of the scene operation unit 25, the setting state of the VR mode changeover switch 31 (that is, the VR active on signal), the subject distance information, and the subject focal length information (hereinafter referred to collectively as the image signal and related information).

The scene recognition unit 58 calculates the brightness of the subject, the distance to the subject, and so on from the input image signal and related information, and performs scene analysis. From the scene analysis result, the scene recognition unit 58 recognizes the state in which the image was shot. For example, it may recognize from the information supplied by the face recognition unit 70 that a person is the main subject, or, if there is a large amount of blue in the upper part of the frame, that the image is being shot outdoors on a clear day. If the input information includes the VR active on signal, the scene recognition unit 58 recognizes that the image is being shot from a vehicle such as an automobile. The shooting state recognized by the scene recognition unit 58 does not always match the shooting mode that the photographer selected with the scene operation unit 25. In that case, the scene recognition unit 58 gives more weight to the scene analysis result when recognizing the state in which the image was acquired. The scene analysis result of the scene recognition unit 58, that is, the scene recognition result, is output to the control unit 64.
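One way to picture this weighting is the sketch below. The text does not specify how the weighting is computed, so everything here is an assumption made for illustration: the per-scene scores in the range 0 to 1, the weight of 0.7 given to scene analysis, and the function and key names are all hypothetical.

```python
def recognize_state(analysis_scores, dial_mode, vr_active_on=False,
                    analysis_weight=0.7):
    """Combine scene analysis scores with the mode selected on the dial.

    analysis_scores: hypothetical dict mapping scene name -> score in [0, 1]
    dial_mode: scene selected with the scene operation unit
    analysis_weight: hypothetical weight; above 0.5 so that the scene
        analysis counts for more than the dial setting
    """
    dial_weight = 1.0 - analysis_weight
    combined = {scene: analysis_weight * score
                for scene, score in analysis_scores.items()}
    # The dial setting contributes a fixed, smaller vote for its own scene.
    combined[dial_mode] = combined.get(dial_mode, 0.0) + dial_weight
    scene = max(combined, key=combined.get)
    # The VR active on signal is treated as "shot from a vehicle".
    return {"scene": scene, "on_vehicle": vr_active_on}
```

With these assumed numbers, a confident analysis result overrides the dial (`{"child": 0.9, "portrait": 0.2}` with the dial on `"portrait"` yields `"child"`), while the dial setting breaks the tie when the analysis is ambiguous.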

FIG. 4 shows an example of scene analysis by the scene recognition unit 58: a flowchart for the case in which the scene analysis results in the scene being recognized as one in which a child is being shot. In this example, the signals and information input to the scene recognition unit 58 are the image signal and related information described above, namely the image signal from the image processing circuit 69, the face recognition information from the face recognition unit 70, the setting state of the scene operation unit 25, the state of the VR mode changeover switch 31 (assumed in this example to be set to the VR normal mode), the subject distance information, and the subject focal length information.

The scene recognition unit 58 performs scene analysis using the face recognition information, focal length information, and subject distance information from among the input image signal and related information. It acquires the face recognition information, the focal length information, and the subject distance information (steps 1 to 3) and, taking the subject distance and focal length into account, calculates the absolute size of the subject's face from the face recognition information (step 4). Let the face area S be the index of the absolute face size calculated by the scene recognition unit 58, and let the adult face area Sadult be the parameter against which S is judged. The scene recognition unit 58 stores Sadult in advance. With Sadult as the threshold, if the calculated absolute face size S is smaller than Sadult (step 5), the scene recognition unit 58 recognizes that the subject is a child (step 6). In addition, because the VR active on signal is not being output from the VR mode changeover switch 31, the scene recognition unit 58 recognizes that the image was shot while the photographer was not riding in a vehicle. Even if the scene operation unit 25 is set to a mode other than the child shooting mode, the scene recognition unit 58 recognizes from the image signal and related information of the actually captured image that the subject is a child. In other words, the scene recognition unit 58 gives priority to the information obtained from the actually captured image when recognizing the scene. The recognition result is then output to the control unit 64 (step 7).
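Steps 4 to 6 can be sketched as follows. The text does not give the formula for the absolute face size, so this sketch assumes a simple thin-lens approximation in which the magnification is roughly the focal length divided by the subject distance (reasonable when the subject is much farther away than the focal length). The function names, the units, and the threshold value of 300 cm² for Sadult are hypothetical.

```python
S_ADULT_CM2 = 300.0  # hypothetical stored threshold for an adult face area

def absolute_face_area_cm2(face_area_on_sensor_mm2, subject_distance_mm,
                           focal_length_mm):
    """Step 4: estimate the real-world face area S from its area on the sensor.

    Assumes magnification ~= focal_length / subject_distance, so areas
    scale by the square of the magnification.
    """
    magnification = focal_length_mm / subject_distance_mm
    area_mm2 = face_area_on_sensor_mm2 / (magnification ** 2)
    return area_mm2 / 100.0  # 100 mm^2 per cm^2

def is_child(face_area_on_sensor_mm2, subject_distance_mm, focal_length_mm,
             s_adult_cm2=S_ADULT_CM2):
    """Steps 5 and 6: the subject is taken to be a child if S < Sadult."""
    s = absolute_face_area_cm2(face_area_on_sensor_mm2, subject_distance_mm,
                               focal_length_mm)
    return s < s_adult_cm2
```

As a worked case under these assumptions: a face covering 4 mm² on the sensor, shot at 3 m with a 50 mm focal length, has a magnification of 1/60, giving S = 4 × 3600 mm² = 144 cm², which is below the assumed threshold, so the subject is classified as a child.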

In the scene analysis example above, if there are several faces within the angle of view and each of them is smaller than Sadult, the scene recognition unit 58 likewise recognizes that the subject is a child. If there are several faces, some larger than Sadult and some smaller, the scene recognition unit 58 determines that the subjects include a child. And if, for example, a face smaller than Sadult is moving around freely in all directions, the scene recognition unit 58 determines that a scene of a child playing is being shot.
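These multi-face cases can be written as a small decision function. This is only a sketch: the label strings, the `"adult"` and `"no_face"` outcomes for cases the text does not cover, and the boolean flag standing in for the (unspecified) motion detection are all hypothetical.

```python
def classify_faces(face_areas, s_adult, small_face_moving=False):
    """Classify the scene from the absolute areas of all detected faces.

    face_areas: absolute face areas S, one per detected face
    s_adult: stored adult face area threshold Sadult
    small_face_moving: hypothetical flag, True when a face smaller than
        Sadult is moving around actively within the frame
    """
    if not face_areas:
        return "no_face"  # case not covered in the text
    is_small = [area < s_adult for area in face_areas]
    if not any(is_small):
        return "adult"  # case not covered in the text
    if small_face_moving:
        return "child_playing"
    if all(is_small):
        return "child"
    return "includes_child"
```

Checking the motion flag before the all-small test means that a playing child is reported whether or not adults are also in the frame, which is one possible reading of the text.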

Next, the function for processing the sound recorded together with the image during moving image shooting is described.

In moving image shooting, sound is recorded at the same time as the image. The audible range of the human ear is generally said to be about 20 Hz to 20 kHz (although there are various views). For this reason, many microphones are designed to collect sound across most of the audible range. FIG. 5 shows an example of the frequency characteristics of the sound collected by an average microphone. The microphone 34 of the first embodiment (hereinafter, "microphone 34" refers to both the right microphone 34R and the left microphone 34L) also has the frequency characteristics shown in FIG. 5 and covers most of the audible range. The audible range contains many kinds of sound. For example, as FIG. 5 shows, the sound of an automobile lies in a relatively low range, while the chirping of insects such as cicadas and bell crickets lies in a high range. As FIG. 5 also shows, the human voice spans a wide range that includes the mid range. The microphone 34 collects all of these sounds. When a scene of children playing is shot, for example, the microphone 34 picks up not only the children's voices but also the voices of nearby adults and the sound of passing automobiles.

These various kinds of sound, collected by the right microphone 34R and the left microphone 34L along with the captured image, are each processed by a signal processing circuit 73. The signal processing circuit 73 consists of an A/D conversion circuit (not shown) and a digital amplifier (not shown). The microphone 34 and the signal processing circuit 73 together constitute a sound acquisition unit 74. The sound processed by the signal processing circuit 73 is output as a digital audio signal. Separate audio signals are output for the sound collected by the right microphone 34R and for the sound collected by the left microphone 34L (hereinafter, "audio signal" refers to both the audio signal generated from the right microphone 34R and the audio signal generated from the left microphone 34L). The signal processing circuit 73 may instead output an analog audio signal, in which case it consists of an A/D conversion circuit (not shown), a digital amplifier (not shown), and a D/A conversion circuit (not shown).

第１実施形態に係るカメラ１では、信号処理回路７３で処理された音声信号は加工部７６で加工される。加工部７６は音声信号の周波数に応じてゲインを変化させるフィルタである(以下、フィルタ７６という。)。音声信号はこのようなフィルタ７６によって、当該画像が撮影された状態に相応しい音声に加工される。例えば、子供を主要被写体として撮影した場合であれば、側にいる大人の声や、近くを通りかかった自動車の音などを減衰するように処理特性が変更されたフィルタ７６によって音声信号を処理する。そうすると、子供の声が重視された音声とともに画像を記録することができる。一般に成人男性の声の基本周波数は１００Ｈｚ前後であり、子供の声の基本周波数は２００Ｈｚ以上である。そこで、図６に示すような周波数特性のフィルタを用いて音声信号を処理すると、成人男性の声は減衰され、子供の声の帯域だけ取り出すことができる。その結果、子供の声を重視した音声信号となる。このように、第１実施形態のカメラ１は入力された音声信号を所望の音声の帯域が重視されるように、言い換えれば、不要な音声が記録されないようにフィルタ７６を用いて調節している。なお、第１実施形態においては、音声信号はデジタルなのでフィルタ７６もデジタルフィルタである。音声信号をアナログで出力する場合はアナログフィルタを用いれば良い。 In the camera 1 according to the first embodiment, the audio signal processed by the signal processing circuit 73 is processed by the processing unit 76. The processing unit 76 is a filter that changes the gain according to the frequency of the audio signal (hereinafter referred to as the filter 76). The sound signal is processed by such a filter 76 into sound suitable for the state in which the image is captured. For example, when a child is photographed as a main subject, the audio signal is processed by the filter 76 whose processing characteristics are changed so as to attenuate the voice of an adult at the side or the sound of a car passing by nearby. Then, it is possible to record an image together with a voice that emphasizes a child's voice. In general, the basic frequency of an adult male voice is around 100 Hz, and the basic frequency of a child's voice is 200 Hz or more. Therefore, when the audio signal is processed using a filter having a frequency characteristic as shown in FIG. 6, the voice of the adult male is attenuated and only the band of the child's voice can be extracted. As a result, the audio signal is focused on the child's voice. As described above, the camera 1 of the first embodiment adjusts the input audio signal using the filter 76 so that a desired audio band is emphasized, in other words, unnecessary audio is not recorded. . 
In the first embodiment, since the audio signal is digital, the filter 76 is also a digital filter. When outputting an audio signal in analog, an analog filter may be used.
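The frequency-dependent gain described above can be illustrated with a small digital high-pass filter. The following is only a sketch under stated assumptions, not the filter 76 of the embodiment: the characteristic of FIG. 6 is not given numerically, so the 150 Hz cutoff, the 16 kHz sample rate, the second-order (biquad) structure, and all function names are hypothetical choices made for illustration.

```python
import math

def highpass_coefficients(cutoff_hz, sample_rate_hz, q=1 / math.sqrt(2)):
    """Second-order high-pass biquad coefficients (normalized so a[0] == 1)."""
    w0 = 2 * math.pi * cutoff_hz / sample_rate_hz
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    b = [(1 + cos_w0) / 2 / a0, -(1 + cos_w0) / a0, (1 + cos_w0) / 2 / a0]
    a = [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a

def apply_filter(samples, b, a):
    """Run the biquad over a list of samples (direct form I)."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def tone(freq_hz, sample_rate_hz, seconds):
    n = int(sample_rate_hz * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]

def steady_state_gain(freq_hz, cutoff_hz=150.0, sample_rate_hz=16000):
    """Ratio of output RMS to input RMS for a pure tone, ignoring the start-up transient."""
    b, a = highpass_coefficients(cutoff_hz, sample_rate_hz)
    x = tone(freq_hz, sample_rate_hz, 1.0)
    y = apply_filter(x, b, a)
    half = len(x) // 2
    return rms(y[half:]) / rms(x[half:])
```

With these assumed values, a 100 Hz tone (near the adult male fundamental cited in the text) is attenuated noticeably, while a 400 Hz tone passes almost unchanged. A real voice also contains harmonics above its fundamental, so a filter of this kind reduces an adult voice rather than removing it.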

信号処理回路７３で生成された音声信号に対してフィルタ７６の特性をどのように設定するかは制御部６４が制御している。制御部６４には、シーン認識部５８が画像信号等の情報を用いてシーン解析を行なった解析結果、すなわち画像が撮影された状態についてのシーン認識結果が入力される。制御部６４は入力されたシーン認識結果に基づいて、画像が撮影された状態に最も相応しい音声が記録されるようにフィルタ７６の処理特性を変化させる。つまりフィルタ７６のゲインを変化させる。例えば、シーン認識部５８が、子供が遊んでいるシーンを撮影した画像であると認識した場合、制御部６４は当該認識結果に基づいて、図６に示すような処理特性で処理するのが当該画像に最も相応しいものと判断する。そしてフィルタ７６のゲインを図６に示す処理特性となるように変化させる。そして音声信号は図６に示す処理特性のフィルタ７６で処理される。その結果、成人男性の声の帯域は音声信号からカットされ、子供の声の帯域が重視された音声信号が生成される。また、例えばシーン認識部５８は入力された画像信号等の情報にＶＲアクティブオン信号が含まれている場合、当該画像は自動車等の乗り物に乗って撮影している画像であると認識する。制御部６４はシーン認識部５８のこの認識結果に基づいてフィルタ７６の処理特性を変化させる。すなわち、フィルタ７６の処理特性を自動車等の車内に響くエンジン音やタイヤの走行音、あるいは風切り音等を低減させるように変化させる。このような特性のフィルタ７６で音声信号を加工すれば、自動車のエンジン音等は低減されて、車内での会話等が強調された画像を記録することができる。 The control unit 64 controls how the characteristics of the filter 76 are set for the audio signal generated by the signal processing circuit 73. The control unit 64 receives an analysis result obtained by the scene recognition unit 58 performing scene analysis using information such as an image signal, that is, a scene recognition result regarding a state where an image is captured. Based on the input scene recognition result, the control unit 64 changes the processing characteristics of the filter 76 so that the sound most suitable for the state in which the image is captured is recorded. That is, the gain of the filter 76 is changed. For example, when the scene recognizing unit 58 recognizes that the image is a scene in which a child is playing, the control unit 64 performs processing with processing characteristics as shown in FIG. 6 based on the recognition result. Judge as the most suitable for the image. Then, the gain of the filter 76 is changed so as to have the processing characteristics shown in FIG. The audio signal is processed by a filter 76 having processing characteristics shown in FIG. As a result, the voice band of the adult male is cut from the voice signal, and a voice signal in which the voice band of the child is emphasized is generated. 
Further, for example, when the information such as the input image signal includes a VR active on signal, the scene recognition unit 58 recognizes that the image is an image taken on a vehicle such as an automobile. The control unit 64 changes the processing characteristics of the filter 76 based on the recognition result of the scene recognition unit 58. That is, the processing characteristics of the filter 76 are changed so as to reduce engine noise, tire running noise, wind noise, etc. that resonate inside the vehicle. If the audio signal is processed by the filter 76 having such characteristics, the engine sound of the automobile is reduced, and an image in which conversation in the vehicle is emphasized can be recorded.

そして、撮影された画像の画像信号は、フィルタ７６で処理された音声信号と関連付けられて記憶部７９に記憶される。なお、記憶部７９はメモリカード等の外付けの記憶媒体であっても良い。 The image signal of the captured image is stored in the storage unit 79 in association with the audio signal processed by the filter 76. The storage unit 79 may be an external storage medium such as a memory card.

第１実施形態においては、制御部６４は、シーン認識部５８のシーン認識結果が人物を主要被写体とするものであるか否かにより、フィルタ７６の処理特性を変更する。例えば、人物を主要被写体とする画像であるというシーン認識結果が入力されれば、自動車の音や虫の鳴き声を低減するような処理特性に変更する。また、制御部６４は、シーン認識結果が主要被写体に子供が含まれているか否かにより、フィルタ７６の処理特性を変更する。主要被写体に子供が含まれていれば、図６に示すような処理特性に変更する。 In the first embodiment, the control unit 64 changes the processing characteristics of the filter 76 depending on whether or not the scene recognition result of the scene recognition unit 58 indicates that a person is the main subject. For example, if a scene recognition result indicating that a person is the main subject of the image is input, the processing characteristics are changed so as to reduce the sounds of cars and insects. The control unit 64 also changes the processing characteristics of the filter 76 depending on whether or not the scene recognition result indicates that the main subject includes a child. If the main subject includes a child, the processing characteristics are changed to those shown in FIG. 6.
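The selection rules described for the control unit 64 can be sketched as a small decision function. This is a hypothetical illustration only: the `SceneResult` fields, the profile names, and in particular the priority order among the rules are assumptions, since the description does not state which rule wins when several apply.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SceneResult:
    """Hypothetical summary of a scene recognition result."""
    person_is_main_subject: bool = False
    child_in_main_subject: bool = False
    vr_active_on: bool = False  # VR active on signal present (shooting from a vehicle)

def select_filter_profile(scene: SceneResult) -> str:
    """Pick a filter profile name from the scene result.

    The ordering of the checks is an assumption made for this sketch.
    """
    if scene.child_in_main_subject:
        return "emphasize_child_voice"        # characteristic like FIG. 6
    if scene.vr_active_on:
        return "reduce_vehicle_noise"         # engine, tire, and wind noise
    if scene.person_is_main_subject:
        return "reduce_car_and_insect_sounds"
    return "flat"                             # no scene-specific processing
```

For example, `select_filter_profile(SceneResult(person_is_main_subject=True, child_in_main_subject=True))` selects the child-voice profile under the assumed ordering.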

このように、第１実施形態に係るカメラ１は、撮影された画像を解析して当該画像が撮影された状態を認識し、撮影された画像に最も相応しい音声が記録されるようにフィルタ７６のゲインを調節する。フィルタ７６のゲインを調節する制御部６４は、シーン認識部５８のシーン認識結果により、フィルタ７６をハイパスフィルタ、ローパスフィルタ、あるいはバンドパスフィルタなどの特性を有するように変化させて音声信号を処理する。その結果、撮影された画像に最も相応しい状態の音声が記録される。 As described above, the camera 1 according to the first embodiment analyzes the captured image to recognize the state in which the image was captured, and adjusts the gain of the filter 76 so that the sound most suitable for the captured image is recorded. The control unit 64, which adjusts the gain of the filter 76, processes the audio signal by changing the filter 76, based on the scene recognition result of the scene recognition unit 58, so that it has the characteristics of a high-pass filter, a low-pass filter, a band-pass filter, or the like. As a result, sound in the state most suitable for the captured image is recorded.

ここで、例えば、撮影者がシーン操作部２５の設定をポートレート撮影モードに設定した状態で子供が遊んでいるシーンを撮影していたとしても、シーン認識部５８は、顔認識部７０からの顔認識情報より被写体は子供であると判断し、かつ認識した顔が動き回っていれば、シーン解析の結果、当該画像が撮影された状態は子供が遊んでいるシーンを撮影した画像であると認識する。そして、制御部６４はシーン認識部５８の認識結果に基づいて、子供の声の帯域を重視する処理ができるようにフィルタ７６の処理特性を変化させる。つまり、シーン認識部５８は実際に撮影された画像を重視してシーンを認識する。その結果、撮影者がシーン操作部２５の設定間違いあるいは設定忘れをしたとしても、撮影された画像に最も相応しい音声信号処理を施すことができる。 Here, for example, suppose the photographer shoots a scene of a child playing while the scene operation unit 25 is set to the portrait shooting mode. Even so, if the scene recognition unit 58 determines from the face recognition information supplied by the face recognition unit 70 that the subject is a child, and the recognized face is moving around, it recognizes as a result of the scene analysis that the image is one of a child playing. Based on this recognition result, the control unit 64 changes the processing characteristics of the filter 76 so that the child's voice band is emphasized. In other words, the scene recognition unit 58 recognizes the scene by giving priority to the image actually captured. As a result, even if the photographer sets the scene operation unit 25 incorrectly or forgets to set it, the audio signal processing most suitable for the captured image can be applied.

なお、画像の撮影中にシーン認識部５８がシーン解析を実行し、解析結果がシーン操作部２５の設定状態と異なっている場合、撮影者にシーン操作部２５の設定が適していないことを通知するようにしても良い。通知はシーン操作部２５の設定の変更を促す表示を液晶表示部３７に表示させても良いし、警報ランプが点灯あるいは点滅するようにしても良い。また、強制的に撮影している状態に適した撮影モードに切替わるように制御しても良い。 Note that when the scene recognition unit 58 performs scene analysis during image capture and the analysis result differs from the setting of the scene operation unit 25, the photographer may be notified that the setting of the scene operation unit 25 is not suitable. The notification may be a message on the liquid crystal display unit 37 prompting the photographer to change the setting of the scene operation unit 25, or a warning lamp may be lit or made to blink. Alternatively, the camera may be controlled so that it forcibly switches to the shooting mode suited to the scene being shot.

また、制御部６４は、音声信号の周波数に応じてフィルタ７６のゲインを変化させる際、左右のマイク３４Ｒ、３４Ｌで集音した音声について一様にフィルタ７６のゲインを変えるように制御しても良いし、右側のマイク３４Ｒが集音した音声と左側のマイク３４Ｌが集音した音声との音量を異なるように制御しても良い。例えば、主要被写体の子供が画角内を右から左に移動し、さらに左から右に移動した場合に、子供が写っている方のマイクの音量が大きくなるようにフィルタ７６のゲインを変化させる。このように制御すれば、画角内を動き回る子供の動きが強調された音声とともに画像が記録される。このように記録された画像および音声を再生すると臨場感が増す。 In addition, when changing the gain of the filter 76 according to the frequency of the audio signal, the control unit 64 may change the gain of the filter 76 uniformly for the sounds collected by the left and right microphones 34R and 34L, or may control the gain so that the volume of the sound collected by the right microphone 34R differs from that of the sound collected by the left microphone 34L. For example, when the child who is the main subject moves from right to left within the angle of view and then from left to right, the gain of the filter 76 is changed so that the volume of the microphone on the side where the child appears becomes larger. With this control, the image is recorded together with sound that emphasizes the movement of the child moving around within the angle of view. Reproducing the image and sound recorded in this way increases the sense of presence.
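One way to picture the left/right volume control is a gain pair derived from the subject's horizontal position in the frame. The sketch below is an assumption-laden illustration: the normalized position `x` (0.0 at the left edge, 1.0 at the right edge), the `emphasis` parameter, the linear mapping, and the premise that the left side of the frame corresponds to the left microphone 34L are all hypothetical and not taken from the description.

```python
def stereo_gains(x, emphasis=0.5):
    """Return (left_gain, right_gain) for a subject at horizontal position x.

    x is clamped to [0.0, 1.0]; 0.0 is the left edge of the frame and 1.0 the
    right edge. emphasis (0.0 to 1.0) sets how strongly the channel on the
    subject's side is boosted and the other channel reduced.
    """
    x = max(0.0, min(1.0, x))
    left_gain = 1.0 + emphasis * (1.0 - 2.0 * x)
    right_gain = 1.0 + emphasis * (2.0 * x - 1.0)
    return left_gain, right_gain
```

With the default `emphasis`, a subject at the left edge gives gains of 1.5 (left) and 0.5 (right), and a subject at the center leaves both channels at 1.0. A position of this kind could in principle come from the face position reported by the face recognition unit 70, although the description does not specify the mapping.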

また、第１実施形態に係るカメラ１は、ＶＲスイッチ２８がオンの時は、ＶＲモード切替えスイッチ３１の設定状態に拘らず、ブレ検出部６１、６１ａが検出するボディ部４のブレ、すなわち角速度が制御部６４に入力される。制御部６４は、入力されるボディ部４のブレの周期に基づいて音声信号を処理するようにフィルタ７６の特性を制御している。 In the camera 1 according to the first embodiment, when the VR switch 28 is on, the shake of the body unit 4 detected by the shake detection units 61 and 61a, that is, its angular velocity, is input to the control unit 64 regardless of the setting of the VR mode changeover switch 31. The control unit 64 controls the characteristics of the filter 76 so that the audio signal is processed based on the period of the input shake of the body unit 4.

例えば、ブレ検出部６１、６１ａからＡという角速度が検出された場合、制御部６４はフィルタ７６のゲインがｋＡ（ｋは任意の定数）倍になるようにフィルタ７６の特性を変化させる。また、ブレ検出部６１、６１ａからＢという角速度が検出された場合、制御部６４はフィルタ７６のゲインがｋＢ（ｋは任意の定数）倍になるようにフィルタ７６の特性を変化させる。このようにブレ検出部６１、６１ａが検出する角速度に応じてフィルタ７６の特性を変化させることにより、撮影者が被写体の動きに合わせてパンニングすると、パンニング速度に合わせて音を変化させて記録することができる。この機能を生かせば、例えばモータスポーツにおいてカメラ１を自動車に積んで撮影した場合には、自動車がカーブを曲がる時にブレ検出部６１、６１ａで検出される角速度が変化するから、自動車が向きを変える（カーブを曲がる）ごとに角速度の変化に応じてフィルタ７６のゲインがｋＡ倍またはｋＢ倍になるようにすることで音を変化させて記録することができる。このように記録された画像および音声を再生すると、自動車の動きに応じて音が変化し、臨場感を出すことができる。また、別の例として、撮影者が右から左へパンニングをした場合に、右側マイク３４Ｒが集音した音声の音量が徐々に小さくなり、左側マイク３４Ｌが集音した音声の音量が徐々に大きくなって記録されるようにフィルタ７６の特性を制御し、音声を記録しても良い。このように記録された画像および音声を再生すると、画面が右から左へパンニングするに従い、右のスピーカ４０Ｒの音量は徐々に小さくなり、左のスピーカ４０Ｌの音量は徐々に大きくなる。再生時に画面のパンニング方向に合わせて音が変化すれば臨場感を増すことができる。 For example, when an angular velocity of A is detected from the blur detection units 61 and 61a, the control unit 64 changes the characteristics of the filter 76 so that the gain of the filter 76 is multiplied by kA (k is an arbitrary constant). When the angular velocity B is detected from the blur detection units 61 and 61a, the control unit 64 changes the characteristics of the filter 76 so that the gain of the filter 76 is multiplied by kB (k is an arbitrary constant). In this way, by changing the characteristics of the filter 76 according to the angular velocity detected by the blur detection units 61 and 61a, when the photographer pans in accordance with the movement of the subject, the sound is changed and recorded in accordance with the panning speed. be able to. If this function is used, for example, in motor sports, when the camera 1 is mounted on a car and shot, the angular speed detected by the shake detection units 61 and 61a changes when the car turns a curve, so the car changes its direction. By changing the gain of the filter 76 to kA or kB times according to the change in angular velocity every time (turning a curve), the sound can be changed and recorded. 
When the image and sound recorded in this way are reproduced, the sound changes according to the movement of the automobile, and a sense of reality can be obtained. As another example, when the photographer pans from right to left, the volume of the sound collected by the right microphone 34R gradually decreases, and the volume of the sound collected by the left microphone 34L gradually increases. Therefore, the sound may be recorded by controlling the characteristics of the filter 76 so as to be recorded. When the image and sound recorded in this way are reproduced, the volume of the right speaker 40R gradually decreases and the volume of the left speaker 40L gradually increases as the screen pans from right to left. If the sound changes in accordance with the panning direction of the screen during playback, the sense of presence can be increased.
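The two behaviors described above, a gain proportional to the detected angular velocity and a left/right balance that follows a pan, can be sketched as follows. This is a minimal illustration under assumptions: using the magnitude of the angular velocity, the sign convention (positive means panning from right to left), the integration of angular velocity into a pan angle, and the gain range of 0.5 to 1.0 are all hypothetical choices, not details given in the description.

```python
def gain_factor(angular_velocity, k):
    """Gain multiplier k * |angular velocity| (k is an arbitrary constant)."""
    return k * abs(angular_velocity)

def pan_balance(angular_velocities, dt, full_pan_angle):
    """Return a list of (left_gain, right_gain), one per angular velocity sample.

    Angular velocity samples (spaced dt apart) are integrated into a pan angle.
    As the pan progresses from 0 to full_pan_angle, the left gain rises from
    0.5 to 1.0 and the right gain falls from 1.0 to 0.5.
    """
    angle = 0.0
    gains = []
    for w in angular_velocities:
        angle += w * dt
        progress = max(0.0, min(1.0, angle / full_pan_angle))
        gains.append((0.5 + 0.5 * progress, 1.0 - 0.5 * progress))
    return gains
```

For instance, `pan_balance([10.0] * 4, 0.5, 20.0)` yields left gains that rise step by step while the right gains fall, which corresponds to the right-to-left panning example in the text.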

このように、第１実施形態のカメラ１によれば、撮影された画像のシーン解析に基づいて当該画像が撮影された状態に最も相応しい音声を記録することができ、さらにカメラ１の動きと左右のマイク３４Ｒ、３４Ｌがそれぞれ集音した音声の音声信号とをリンクさせることで臨場感のある音声を記録することができる。 As described above, the camera 1 of the first embodiment can record the sound most suitable for the state in which an image was captured, based on scene analysis of the captured image. Furthermore, by linking the movement of the camera 1 with the audio signals of the sounds collected by the left and right microphones 34R and 34L, it can record sound with a sense of presence.

（第２実施形態） (Second Embodiment)
次に第２実施形態に係る撮影装置について説明する。第２実施形態に係る撮影装置もデジタル一眼レフカメラ１００（以下、カメラ１００と略記する。）であり、静止画および動画の撮影が可能である。なお、第１実施形態と同様の構成については、同じ符号を用いて説明する（図１および図２参照）。 Next, a photographing apparatus according to the second embodiment will be described. The photographing apparatus according to the second embodiment is also a digital single-lens reflex camera 100 (hereinafter abbreviated as camera 100) and can capture still images and moving images. Components similar to those of the first embodiment are described using the same reference numerals (see FIGS. 1 and 2).

図７は、第２実施形態に係るカメラ１００の機能構成を示すブロック図である。 FIG. 7 is a block diagram illustrating a functional configuration of the camera 100 according to the second embodiment.

第２実施形態のカメラ１００の構成は第１実施形態のカメラ１と略同様であるが、着脱式の交換レンズ部８２を備え、交換レンズ部８２に画像の像ブレを低減するための手ブレ補正機構と手ブレ補正機構を作動させるためのＶＲスイッチ２８およびＶＲモード切替えスイッチ３１が設けられている。ＶＲモードは第１実施形態と同様に、ＶＲアクティブモードとＶＲノーマルモードである。交換レンズ部８２には被写体距離検出部１３および焦点距離検出部１６が設けられている。交換レンズ部８２にはレンズ側制御部８５が備えられ、被写体距離検出部１３からの被写体距離情報、焦点距離検出部１６からの焦点距離情報、ＶＲスイッチ２８のオン／オフ情報およびＶＲモード切替えスイッチ３１の設定状態の情報とが入力されている。 The configuration of the camera 100 according to the second embodiment is substantially the same as that of the camera 1 according to the first embodiment, except that it includes a detachable interchangeable lens unit 82. The interchangeable lens unit 82 is provided with a camera shake correction mechanism for reducing image blur, and with a VR switch 28 and a VR mode changeover switch 31 for operating that mechanism. As in the first embodiment, the VR modes are a VR active mode and a VR normal mode. The interchangeable lens unit 82 is also provided with a subject distance detection unit 13 and a focal length detection unit 16. The interchangeable lens unit 82 further includes a lens-side control unit 85, which receives the subject distance information from the subject distance detection unit 13, the focal length information from the focal length detection unit 16, the on/off information of the VR switch 28, and the setting state information of the VR mode changeover switch 31.

ボディ部４の外部には、静止画／動画切替えスイッチ１９、レリーズボタン２２、シーン操作部２５が設けられている。また、ボディ部４にはボディ側通信部８８が設けられており、レンズ側通信部８５と常時通信している。被写体距離検出部１３からの被写体距離情報、焦点距離検出部１６からの焦点距離情報、ＶＲスイッチ２８のオン／オフ情報およびＶＲモード切替えスイッチ３１の設定状態の情報は、この通信によってボディ部４側に送られている。他の構成、すなわち撮像素子、クイックリターンミラー４６、マイク３４、ボディ部４の背面の液晶表示部３７等は第１実施形態と同様である。 A still image / moving image changeover switch 19, a release button 22, and a scene operation unit 25 are provided on the exterior of the body unit 4. The body unit 4 is also provided with a body-side communication unit 88, which communicates at all times with the lens-side communication unit 85. The subject distance information from the subject distance detection unit 13, the focal length information from the focal length detection unit 16, the on/off information of the VR switch 28, and the setting state information of the VR mode changeover switch 31 are sent to the body unit 4 side through this communication. The other components, namely the image sensor, the quick return mirror 46, the microphone 34, the liquid crystal display unit 37 on the back of the body unit 4, and so on, are the same as in the first embodiment.

以下、第２実施形態に係るカメラ１００の機能を被写体の動画を撮影する場合を例に説明する。 Hereinafter, the function of the camera 100 according to the second embodiment will be described taking an example of shooting a moving image of a subject.

撮影者がレリーズボタン２２の半押し状態、すなわち動画撮影の待機状態から、レリーズボタン２２を全押しし、シーン操作部２５を操作して動画の撮影を開始すると、第１実施形態と同様に、撮像素子４３に結像した被写体からの光は撮像素子４３で光電変換され電気信号に変換される。電気信号はアナログ処理回路６７でアナログ処理が施された後、Ａ／Ｄ変換回路６８によってデジタル信号に変換され、画像信号が生成される。画像信号は画像処理回路６９でホワイトバランス調整等の画像処理が施され、シーン認識部５８と制御部６４とにそれぞれ出力される。 When the photographer fully presses the release button 22 from the half-pressed state of the release button 22, that is, in the standby state for moving image shooting, and operates the scene operation unit 25 to start moving image shooting, as in the first embodiment, Light from the subject imaged on the image sensor 43 is photoelectrically converted by the image sensor 43 and converted into an electrical signal. The electrical signal is subjected to analog processing by an analog processing circuit 67 and then converted to a digital signal by an A / D conversion circuit 68 to generate an image signal. The image signal is subjected to image processing such as white balance adjustment by the image processing circuit 69 and is output to the scene recognition unit 58 and the control unit 64, respectively.

ＶＲスイッチ２８がオンの時は、ブレ検出部６１、６１ａが作動し、ブレ検出部６１、６１ａが検出するボディ部４のブレ、すなわちボディ部４の角速度が制御部６４に入力される。ただし、以下の説明では、ブレ検出部６１ａのみを用いる場合について詳細に説明する。 When the VR switch 28 is on, the blur detection units 61 and 61a are operated, and the blur of the body part 4 detected by the blur detection units 61 and 61a, that is, the angular velocity of the body unit 4 is input to the control unit 64. However, in the following description, the case where only the blur detection unit 61a is used will be described in detail.

ＶＲモード切替えスイッチ３１の設定状態の情報はシーン認識部５８と制御部６４とに入力される。ＶＲモード切替えスイッチ３１の設定状態の情報は、ＶＲモード切替えスイッチ３１がＶＲアクティブモードに設定されている時に出力されるＶＲアクティブオン信号であり、ＶＲノーマルモードに設定されている時には出力されない。 Information on the setting state of the VR mode changeover switch 31 is input to the scene recognition unit 58 and the control unit 64. The information on the setting state of the VR mode changeover switch 31 is a VR active on signal that is output when the VR mode changeover switch 31 is set to the VR active mode, and is not output when it is set to the VR normal mode.

また、画像処理回路６９に備えられた顔認識部７０が第１実施形態と同様に画角内の顔の面積、個数、位置等を演算した演算結果は、顔認識情報としてシーン認識部５８に出力される。 As in the first embodiment, the face recognition unit 70 provided in the image processing circuit 69 calculates the area, number, position, and so on of the faces within the angle of view, and the result is output to the scene recognition unit 58 as face recognition information.

シーン認識部５８には、第１実施形態と同様に画像信号等の情報、すなわち画像処理回路６９からの画像信号、顔認識部７０からの顔認識情報、シーン操作部２５の設定状態、ＶＲモード切替えスイッチ３１の設定状態の情報すなわちＶＲアクティブオン信号、被写体までの距離情報および被写体の焦点距離情報が入力される。そしてシーン認識部５８は入力された画像信号等の情報に基づいてシーン解析を実行し、シーン解析結果により当該画像が撮影された状態を認識する。シーン認識部５８のシーン認識結果は制御部６４に出力される。 As in the first embodiment, the scene recognition unit 58 receives information such as the image signal: namely, the image signal from the image processing circuit 69, the face recognition information from the face recognition unit 70, the setting state of the scene operation unit 25, the setting state information of the VR mode changeover switch 31 (that is, the VR active on signal), the distance information to the subject, and the focal length information for the subject. The scene recognition unit 58 then performs scene analysis based on this input information and recognizes, from the scene analysis result, the state in which the image was captured. The scene recognition result of the scene recognition unit 58 is output to the control unit 64.

第２実施形態に係るカメラ１００おいては、右側マイク３４Ｒおよび左側マイク３４Ｌで集音された音声は、それぞれ信号処理回路７３で音声信号に変換されて制御部６４に入力される。制御部６４に入力される音声信号は、それぞれのマイク３４Ｒ、３４Ｌが集音した様々な音声の全帯域を含む音声信号である（以下、単に「マイク３４」というときは、右側マイク３４Ｒおよび左側マイク３４Ｌのことをいい、単に「音声信号」というときは、右側マイク３４Ｒの音声から生成された音声信号および左側マイク３４Ｌの音声から生成された音声信号のことをいう。）。 In the camera 100 according to the second embodiment, sounds collected by the right microphone 34R and the left microphone 34L are converted into audio signals by the signal processing circuit 73 and input to the control unit 64, respectively. The audio signal input to the control unit 64 is an audio signal including all bands of various sounds collected by the respective microphones 34R and 34L (hereinafter, simply referred to as “the microphone 34”, the right microphone 34R and the left microphone 34R). This refers to the microphone 34L, and the term “audio signal” refers to an audio signal generated from the sound of the right microphone 34R and an audio signal generated from the sound of the left microphone 34L.)

制御部６４には、上述したように、画像処理回路６９からの画像信号、シーン認識部５８のシーン認識結果、音声信号、ＶＲアクティブオン信号、およびブレ検出部６１ａが検出するボディ部４の角速度が入力される。画像信号は、撮影者がレリーズボタン２２を全押しして全押しスイッチがオンになると画像処理回路６９から出力されて制御部６４に入力され、画像の撮影が開始される。そしてレリーズボタン２２が再度全押しされて全押しスイッチがオフになると、画像処理回路６９から画像信号は出力されなくなり、したがって画像信号は制御部６４に入力されなくなり、画像の撮影が終了する。 As described above, the control unit 64 includes the image signal from the image processing circuit 69, the scene recognition result of the scene recognition unit 58, the audio signal, the VR active on signal, and the angular velocity of the body unit 4 detected by the blur detection unit 61a. Is entered. The image signal is output from the image processing circuit 69 and input to the control unit 64 when the photographer fully presses the release button 22 and the full-press switch is turned on, and image capturing is started. When the release button 22 is fully pressed again and the full-press switch is turned off, no image signal is output from the image processing circuit 69. Therefore, the image signal is not input to the control unit 64, and image shooting is terminated.

制御部６４は、撮影開始から撮影終了まで、すなわち動画撮影モードでレリーズボタン２２が全押しされた時から再度レリーズボタン２２が全押しされて全押しスイッチがオフとなるまでの経過時間を計測している。つまり画像信号と、画像信号が入力されている間の経過時間とをリンクさせている。制御部６４はさらに、入力されたシーン認識結果、音声信号、ＶＲアクティブオン信号、およびボディ部４の角速度についても、それぞれ撮影開始から撮影終了までの経過時間と関連付けを行なっている。つまり、シーン認識結果、音声信号、ＶＲアクティブオン信号、およびボディ部４の角速度は、それぞれ画像信号入力の経過時間と関連付けられる。その結果、画像信号、シーン認識結果、音声信号、ＶＲアクティブオン信号、およびボディ部４の角速度は、互いに画像信号入力の経過時間によって関連付けられることとなる。このように、入力される信号および情報を画像信号入力の経過時間に関連付けることにより、制御部６４は、例えば、画像の撮影開始から撮影終了までの間でどの時点からどの時点までＶＲアクティブモードに設定されていたのか、あるいはＶＲアクティブモードに設定されなかったのか、ということを把握することができる。 The control unit 64 measures the elapsed time from the start of shooting to the end of shooting, that is, from when the release button 22 is fully pressed in the moving image shooting mode to when the release button 22 is fully pressed again and the full press switch is turned off. ing. That is, the image signal and the elapsed time during which the image signal is input are linked. The control unit 64 further associates the input scene recognition result, the audio signal, the VR active on signal, and the angular velocity of the body unit 4 with the elapsed time from the start of shooting to the end of shooting. That is, the scene recognition result, the audio signal, the VR active on signal, and the angular velocity of the body unit 4 are each associated with the elapsed time of image signal input. As a result, the image signal, the scene recognition result, the audio signal, the VR active on signal, and the angular velocity of the body unit 4 are associated with each other according to the elapsed time of the image signal input. In this way, by associating the input signal and information with the elapsed time of image signal input, the control unit 64, for example, from which time point to which point in time from the start of image capture to the end of image capture enters the VR active mode. It can be ascertained whether it has been set or has not been set to the VR active mode.
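Associating each input with the elapsed shooting time makes it possible to recover, after the fact, the periods during which the VR active mode was set. The sketch below illustrates that idea only; the data layout (a time-ordered list of `(elapsed_seconds, vr_active_on)` pairs) and the function name are hypothetical and are not taken from the description.

```python
def active_intervals(samples):
    """Return a list of (start, end) elapsed times during which VR active was on.

    samples is a list of (elapsed_seconds, vr_active_on) pairs sorted by time.
    An interval that is still active at the last sample is closed at that
    sample's time.
    """
    intervals = []
    start = None
    last_time = None
    for elapsed, active in samples:
        if active and start is None:
            start = elapsed                     # VR active on signal begins
        elif not active and start is not None:
            intervals.append((start, elapsed))  # signal ends
            start = None
        last_time = elapsed
    if start is not None:
        intervals.append((start, last_time))
    return intervals
```

The same elapsed-time key could be used to look up the audio samples, the angular velocity, or the scene recognition result that belong to a given interval.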

そして、互いに関連付けられたこれらの信号および情報は記憶部７９に記憶される。このとき、画像信号入力の経過時間と関連付けられた音声信号は記憶部７９の音声ファイル９１に記憶される。音声ファイル９１に記憶される音声信号は、マイク３４が集音した様々な音声の全帯域を含む音声信号である。また、画像信号入力の経過時間と関連付けられたＶＲアクティブオン信号は記憶部の補正モード記憶ファイル９４（以下、ＶＲモード記憶ファイル９４という。）に記憶される。画像信号入力の経過時間と関連付けられた画像信号、シーン認識結果およびボディ部４の角速度は、記憶部の他のファイル９７に記憶される。他のファイル９７は１つでも複数でも良い。また、音声信号およびＶＲアクティブオン信号を同一のファイルに記憶しても良い。また、記憶部７９はメモリーカード等の外付けの記憶媒体であっても良い。 These signals and information associated with each other are stored in the storage unit 79. At this time, the audio signal associated with the elapsed time of image signal input is stored in the audio file 91 of the storage unit 79. The audio signal stored in the audio file 91 is an audio signal including all bands of various sounds collected by the microphone 34. The VR active on signal associated with the elapsed time of image signal input is stored in a correction mode storage file 94 (hereinafter referred to as a VR mode storage file 94) of the storage unit. The image signal associated with the elapsed time of image signal input, the scene recognition result, and the angular velocity of the body unit 4 are stored in another file 97 of the storage unit. One or more other files 97 may be provided. Further, the audio signal and the VR active on signal may be stored in the same file. The storage unit 79 may be an external storage medium such as a memory card.

図８（ａ）−（ｅ）は、制御部６４に入力され、記憶部７９に記憶される各信号について、各信号の入力状態と時間との関係をそれぞれグラフに示した図である。ただしシーン認識結果については示していない。 FIGS. 8(a) to 8(e) are graphs showing, for each signal that is input to the control unit 64 and stored in the storage unit 79, the relationship between the input state of the signal and time. The scene recognition result is not shown.

図８の各図に示す信号は、（ａ）のＳ１が画像処理回路６９からの画像信号、（ｂ）のＳ２が右側マイク３４Ｒの音声信号、（ｃ）のＳ３が左側マイク３４Ｌの音声信号、（ｄ）のＳ４がジャイロセンサ６１ａからの角速度の信号、（ｅ）のＳ５がＶＲアクティブオン信号を示している。ＶＲアクティブオン信号Ｓ５は、ＶＲアクティブモードに設定されている時にオン信号が発信される。 Among the signals shown in FIG. 8, S1 in (a) is the image signal from the image processing circuit 69, S2 in (b) is the audio signal of the right microphone 34R, S3 in (c) is the audio signal of the left microphone 34L, S4 in (d) is the angular velocity signal from the gyro sensor 61a, and S5 in (e) is the VR active on signal. The VR active on signal S5 is on while the VR active mode is set.

図８の各図に示すように、画像信号Ｓ１は、時刻ｔ１でオンになっている。つまり、時刻ｔ１でレリーズボタン２２が全押しされ、画像の撮影が開始されている。画像の撮影開始後は、右側マイク３４Ｒの音声信号Ｓ２、左側マイク３４Ｌの音声信号Ｓ３および角速度信号Ｓ４は、時間の経過（時刻ｔ２、ｔ３、ｔ４）とともに信号の強弱が変化している。撮影開始当初は、ＶＲモード切替えスイッチ３１はＶＲノーマルモードに設定されており、ジャイロセンサ６１ａは手ブレによるボディ部の角速度を検知している。ＶＲアクティブオン信号は出力されていない。ここで例えば、撮影者が撮影を継続した状態で自動車に乗り込み、時刻ｔ５の時点でＶＲモード切替えスイッチ３１をＶＲノーマルモードからＶＲアクティブモードに切替えたとする。すると、ＶＲアクティブオン信号Ｓ５が制御部６４に入力される。走行する自動車の車内で撮影をすると、走行する自動車特有の振動によるカメラ１００のブレをジャイロセンサ６１ａが検知し、ブレの角速度の信号Ｓ４が制御部６４に入力される。また、自動車のエンジン音や風切り音等が右側マイク３４Ｒおよび左側マイク３４Ｌで集音されている。その後、撮影者は撮影を継続した状態で自動車から降りて、時刻ｔ６の時点でＶＲモード切替えスイッチ３１をＶＲアクティブモードからＶＲノーマルモードに切替えたとする。すると、ＶＲアクティブオン信号Ｓ５はオフになる。自動車特有の振動によるカメラ１００のブレは検知されず、自動車のエンジン音等も集音されていない。そして撮影者は時刻ｔ７で再度レリーズボタン２２を全押しし、撮影を終了する。 As shown in each drawing of FIG. 8, the image signal S1 is turned on at time t1. That is, the release button 22 is fully pressed at time t1, and image capturing is started. After the start of image capturing, the signal strength of the audio signal S2 from the right microphone 34R, the audio signal S3 from the left microphone 34L, and the angular velocity signal S4 change with time (time t2, t3, t4). At the beginning of shooting, the VR mode changeover switch 31 is set to the VR normal mode, and the gyro sensor 61a detects the angular velocity of the body part due to camera shake. The VR active on signal is not output. Here, for example, it is assumed that the photographer gets into the car in a state where shooting is continued, and the VR mode switch 31 is switched from the VR normal mode to the VR active mode at time t5. Then, the VR active-on signal S5 is input to the control unit 64. When shooting is performed in the vehicle of the traveling vehicle, the gyro sensor 61a detects a shake of the camera 100 due to vibrations peculiar to the traveling vehicle, and a shake angular velocity signal S4 is input to the control unit 64. In addition, engine sound, wind noise, and the like of the automobile are collected by the right microphone 34R and the left microphone 34L. 
After that, it is assumed that the photographer gets out of the automobile in a state where photographing is continued and the VR mode changeover switch 31 is changed from the VR active mode to the VR normal mode at time t6. Then, the VR active on signal S5 is turned off. The shake of the camera 100 due to the vibration unique to the automobile is not detected, and the engine sound or the like of the automobile is not collected. Then, the photographer fully presses the release button 22 again at time t7, and the photographing is finished.

以上のような状態の撮影を行なった場合、時刻ｔ１からｔ７までに至る画像は、Ｓ１からＳ５までの信号と、さらにシーン認識部５８からのシーン認識結果とが全て画像信号Ｓ１の入力の経過時間に関連付けられて、１つの画像データとして記憶部７９に記憶される。 When shooting is performed as described above, the image from time t1 to time t7 is stored in the storage unit 79 as a single piece of image data in which the signals S1 to S5 and the scene recognition result from the scene recognition unit 58 are all associated with the elapsed time of input of the image signal S1.

次に、記憶部７９に記憶された画像データを再生する場合を説明する。 Next, a case where the image data stored in the storage unit 79 is reproduced will be described.

記憶部７９に記憶された画像データの再生画像は、ボディ部４背面の液晶表示部３７に表示させて見ることができ、再生音声はボディ部４側面のスピーカ４０で聞くことができる。また、ボディ部４に設けられた図示しない外部出力端子にパーソナルコンピュータを接続して、パーソナルコンピュータのモニタで見ることもできる。 The reproduced image of the image data stored in the storage unit 79 can be viewed on the liquid crystal display unit 37 on the back of the body unit 4, and the reproduced audio can be heard through the speaker 40 on the side of the body unit 4. Alternatively, a personal computer can be connected to an external output terminal (not shown) provided in the body portion 4 and viewed on a monitor of the personal computer.

操作者は、図示しない操作ボタン等を操作して再生したい画像データを選択し、再生の指示をする。このとき操作者は、図示しない操作ボタン等を操作して音声の再生について加工して再生するか、あるいは加工しないで再生するかを選択できる。 The operator operates an operation button (not shown) or the like to select image data to be reproduced and gives an instruction for reproduction. At this time, the operator can select whether to process and reproduce the sound reproduction by operating an operation button or the like (not shown), or to reproduce without processing.

まず、操作者が音声を加工しないで再生する方を選択した場合について説明する。 First, a case will be described in which the operator selects the method of reproducing without processing the voice.

操作者が音声を加工しないで再生する方を選択すると、制御部６４は記憶部７９の音声ファイル９１、ＶＲモード記憶ファイル９４および他のファイル９７からそれぞれ音声信号、ＶＲアクティブオン信号、画像信号、シーン認識結果、およびボディ部４の角速度等の信号および情報を呼び出す。記憶部７９から呼び出されたこれらの信号および情報は、制御部６４に備えられた再生信号処理部９９に入力される。再生信号処理部９９は入力されたこれらの信号および情報から画像および音声の再生信号を生成する。 When the operator chooses to reproduce the sound without processing, the control unit 64 reads out the signals and information, namely the audio signal, the VR active on signal, the image signal, the scene recognition result, the angular velocity of the body unit 4, and so on, from the audio file 91, the VR mode storage file 94, and the other file 97 in the storage unit 79, respectively. These signals and information read from the storage unit 79 are input to a reproduction signal processing unit 99 provided in the control unit 64. The reproduction signal processing unit 99 generates image and audio reproduction signals from them.

For example, when the image data from time t1 to t7 in FIG. 8 described above is reproduced, the reproduction signal processing unit 99 generates the image reproduction signal from the image signal S1, and generates the audio reproduction signal from the audio signals, namely the audio signal S2 of the right microphone 34R and the audio signal S3 of the left microphone 34L. At this time, the control unit 64 does not take the VR active on signal, the scene recognition result, or the angular velocity of the body unit 4 into account. The reproduction signal processing unit 99 therefore generates the audio reproduction signal without processing the audio signal, that is, without applying the processing by the filter 76 used in the first embodiment. When the image and audio reproduction signals generated in this way are reproduced, the portion shot while riding in an automobile is reproduced together with the image, with the engine sound, tire noise, wind noise, and other sounds heard inside the vehicle left unattenuated. Reproducing the audio without processing in this way emphasizes that the images were shot from a moving automobile and preserves the atmosphere at the time of shooting. The same applies not only to automobiles but also to footage shot through the window of a train, which can likewise be reproduced without losing the atmosphere at the time of shooting.
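The unprocessed reproduction path described above can be sketched as follows. This is only an illustrative sketch: the `StoredClip` container, its field names, and the function `reproduce_unprocessed` are hypothetical and do not appear in the specification. The sketch assumes the stored metadata is loaded but deliberately not consulted.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class StoredClip:
    """Hypothetical container for what the storage unit 79 holds for one clip."""
    image_s1: List[str]              # image signal S1, one entry per frame
    audio_s2_right: List[float]      # audio signal S2 (right microphone 34R)
    audio_s3_left: List[float]       # audio signal S3 (left microphone 34L)
    vr_active_on: List[bool] = field(default_factory=list)       # VR mode storage file 94
    scene_results: List[str] = field(default_factory=list)       # other files 97
    angular_velocity: List[float] = field(default_factory=list)  # other files 97


def reproduce_unprocessed(clip: StoredClip) -> Tuple[List[str], List[Tuple[float, float]]]:
    """Build reproduction signals while ignoring all stored metadata."""
    image_out = list(clip.image_s1)
    # Pair (left, right) samples exactly as recorded: no filter, no gain change.
    audio_out = list(zip(clip.audio_s3_left, clip.audio_s2_right))
    return image_out, audio_out


clip = StoredClip(
    image_s1=["frame0", "frame1"],
    audio_s2_right=[0.5, -0.25],
    audio_s3_left=[0.1, 0.2],
    vr_active_on=[True, True],
    scene_results=["vehicle", "vehicle"],
    angular_velocity=[0.0, 0.3],
)
image_out, audio_out = reproduce_unprocessed(clip)
```

Because the metadata fields are never read, the vehicle noise recorded in S2 and S3 reaches the output unchanged, which is the behavior the paragraph attributes to this mode.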

Next, the case in which the operator selects reproduction with audio processing will be described.

When the operator selects reproduction with audio processing and instructs reproduction, the control unit 64 retrieves signals and information such as the audio signal, the VR active on signal, the image signal, the scene recognition result, and the angular velocity of the body unit 4 from the audio file 91, the VR mode storage file 94, and the other files 97 in the storage unit 79. These signals and information retrieved from the storage unit 79 are input to the reproduction signal processing unit 99 provided in the control unit 64. The reproduction signal processing unit 99 generates image and audio reproduction signals from these inputs. It generates the image reproduction signal from the image signal S1. It also generates an audio reproduction signal from the input audio signals and outputs it to the processing unit 76. As in the first embodiment, the processing unit 76 is a filter that changes the gain according to the frequency of the audio signal. Here, the control unit 64 refers to the other signals included in the image data, specifically the scene recognition result from the scene recognition unit 58, the angular velocity signal S4 from the gyro sensor 61a, and the VR active mode on signal S5. Based on the scene recognition result, the control unit 64 changes the characteristics of the filter 76 as in the first embodiment. The audio reproduction signal input to the filter 76 is processed so that it is reproduced as the audio best suited to the captured image, and is then output.
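As a rough illustration of a filter whose gain depends on frequency and whose characteristics are switched by the scene recognition result, the following sketch splits the audio into a low band and a high band and applies a separate gain to each. The specification does not disclose the filter design; the two-band structure, the smoothing coefficient `alpha`, the scene labels, and the gain values in `SCENE_PRESETS` are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical presets: (low-band gain, high-band gain) per scene label.
SCENE_PRESETS: Dict[str, Tuple[float, float]] = {
    "child": (0.5, 1.5),      # assumed: favor the higher band
    "vehicle": (0.25, 1.0),   # assumed: cut low-frequency rumble
    "default": (1.0, 1.0),    # pass the signal through
}


def two_band_filter(samples: List[float], low_gain: float, high_gain: float,
                    alpha: float = 0.1) -> List[float]:
    """Apply separate gains to the low and high bands of a signal."""
    out: List[float] = []
    low = 0.0
    for x in samples:
        low += alpha * (x - low)   # one-pole low-pass tracks the low band
        high = x - low             # the remainder is treated as the high band
        out.append(low_gain * low + high_gain * high)
    return out


def process_for_scene(samples: List[float], scene: str) -> List[float]:
    """Pick the filter characteristic from the scene label and apply it."""
    low_gain, high_gain = SCENE_PRESETS.get(scene, SCENE_PRESETS["default"])
    return two_band_filter(samples, low_gain, high_gain)
```

With the `"default"` preset both gains are 1, so the two bands sum back to the input. With the `"vehicle"` preset a slowly varying input is attenuated toward a quarter of its level, while rapid changes pass with less attenuation.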

For example, suppose the image data contains a scene recognition result indicating that a child is the subject. The control unit 64 then changes the characteristics of the filter 76 so that the child's voice is emphasized for the portion of the elapsed time of the image signal S1 that corresponds to the scene in which the child is the subject. As a result, scenes in the reproduced image in which a child is the subject are reproduced with the child's voice emphasized. Likewise, if the VR active on signal S5 is stored in the image data, the control unit 64 changes the characteristics of the filter 76 so as to reduce automobile noise for the portion of the elapsed time of the image signal S1 that corresponds to the portion in which the VR active on signal S5 is stored. As a result, for the portion of the reproduced image in which the VR active on signal S5 is present, engine sound and wind noise are reduced, and the image is reproduced with conversation inside the vehicle and the like emphasized. For the other portions of the elapsed time of the image signal S1, the control unit 64 likewise changes the characteristics of the filter 76 as appropriate based on the scene recognition result.
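The time-dependent switching of the filter characteristic can be sketched as a lookup over the stored metadata. In this sketch the interval representation, the characteristic names, and the rule that a VR active on interval takes priority over a scene label are all assumptions; the specification does not state how overlapping conditions are resolved.

```python
from typing import List, Tuple

SceneSegment = Tuple[int, int, str]   # (start, end, scene label), end exclusive
Interval = Tuple[int, int]            # (start, end), end exclusive


def characteristic_at(t: int, scenes: List[SceneSegment],
                      vr_active: List[Interval]) -> str:
    """Return the (hypothetical) filter characteristic name for time t."""
    # Assumed priority: a stored VR active on signal wins over the scene label.
    for start, end in vr_active:
        if start <= t < end:
            return "reduce_vehicle_noise"
    for start, end, label in scenes:
        if start <= t < end and label == "child":
            return "emphasize_child_voice"
    return "default"


def characteristic_timeline(duration: int, scenes: List[SceneSegment],
                            vr_active: List[Interval]) -> List[Tuple[int, int, str]]:
    """Merge consecutive time steps that share a characteristic into segments."""
    timeline: List[Tuple[int, int, str]] = []
    for t in range(duration):
        name = characteristic_at(t, scenes, vr_active)
        if timeline and timeline[-1][2] == name:
            start, _, _ = timeline[-1]
            timeline[-1] = (start, t + 1, name)   # extend the current segment
        else:
            timeline.append((t, t + 1, name))     # start a new segment
    return timeline


scenes = [(0, 3, "child"), (3, 10, "landscape")]
vr_active = [(5, 8)]
timeline = characteristic_timeline(10, scenes, vr_active)
```

Each resulting segment names the characteristic that would be applied to the filter for that stretch of the clip.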

The volume of the audio reproduction signals generated from the audio signals S2 and S3 of the right and left microphones 34R and 34L may also be changed based on the angular velocity signal S4 from the gyro sensor 61a included in the image data. The left and right sound then changes in step with the panning of the picture, so the footage can be reproduced with a sense of presence.
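One possible way to vary the left and right volume with the angular velocity is sketched below. The specification does not give a mapping, so everything here is assumed: that positive angular velocity means panning to the right, that the channel on the side being panned toward is raised while the other is lowered, and the parameters `max_rate` and `depth`.

```python
from typing import List, Tuple


def pan_gains(angular_velocity: float, max_rate: float = 1.0,
              depth: float = 0.5) -> Tuple[float, float]:
    """Return (left_gain, right_gain) for one angular velocity sample.

    Assumed convention: positive angular velocity is a pan to the right.
    """
    # Normalize and clamp the pan amount to the range [-1, 1].
    pan = max(-1.0, min(1.0, angular_velocity / max_rate))
    left = 1.0 - depth * pan
    right = 1.0 + depth * pan
    return left, right


def apply_pan(left_samples: List[float], right_samples: List[float],
              angular_velocities: List[float]) -> List[Tuple[float, float]]:
    """Scale each (left, right) sample pair by the gains for its angular velocity."""
    out: List[Tuple[float, float]] = []
    for l, r, w in zip(left_samples, right_samples, angular_velocities):
        gain_l, gain_r = pan_gains(w)
        out.append((gain_l * l, gain_r * r))
    return out
```

When the camera is not rotating both gains are 1, so the stereo balance is left as recorded.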

As described above, the camera 100 according to the second embodiment stores the full frequency band of the audio collected when the image is shot. When the acquired image is reproduced, the operator can choose between reproduction without processing and reproduction with audio processed based on the scene recognition result of the scene recognition unit 58, the VR active on signal, the angular velocity signal, and the like. With this configuration, the audio reproduction state can be changed at playback time to suit the preference of the viewer.

The first and second embodiments have been described above using a digital single-lens reflex camera as an example. However, these embodiments are not limited to digital single-lens reflex cameras and can be applied to any imaging apparatus having a function of recording audio, for example a still camera, a video camera, or a mobile phone with a built-in camera.

The configuration of the present invention is not limited to the first and second embodiments described above and can be modified as appropriate.

DESCRIPTION OF SYMBOLS

1, 100 Camera
4 Body unit
7 Finder device unit
10 Lens unit
11 Photographing lens
13 Subject distance detection unit
16 Focal length detection unit
19 Still image / moving image changeover switch
22 Release button
25 Scene operation unit
28 Camera shake correction switch
31 Correction mode changeover switch
34 Microphone
37 Liquid crystal display unit
40 Speaker
43 Image sensor
46 Quick return mirror
49 Finder screen
52 Pentaprism
55 Eyepiece
58 Scene recognition unit
61, 61a Shake detection unit
64 Control unit
67 Analog processing circuit
70 Face recognition unit
71 Image acquisition unit
73 Signal processing circuit
74 Audio acquisition unit
76 Processing unit (filter)
79 Storage unit
82 Interchangeable lens unit
85 Lens side control unit
88 Body side communication unit
91 Audio file
94 Correction mode storage file (VR mode storage file)
97 Other files
99 Reproduction signal processing unit

Claims

1. An imaging apparatus comprising:
an image acquisition unit that acquires an image;
an audio acquisition unit that acquires audio in association with the image;
a recognition unit that recognizes a state in which the image is acquired;
a processing unit that processes the audio acquired by the audio acquisition unit;
a control unit that controls the processing unit according to a recognition result of the recognition unit; and
a storage unit that stores the image and the audio processed by the processing unit in a storage medium in association with each other.

2. The imaging apparatus according to claim 1,
wherein the recognition unit recognizes the state in which the image is acquired by performing scene analysis using the image.

3. The imaging apparatus according to claim 1,
further comprising a scene operation unit operable by a photographer according to a shooting scene,
wherein the recognition unit recognizes the state in which the image is acquired by weighting the scene analyzed by the scene analysis more heavily than the scene set with the scene operation unit.

4. The imaging apparatus according to claim 1,
further comprising an operation unit operable by a photographer to change characteristics for correcting image blur of the image,
wherein the recognition unit recognizes the state in which the image is acquired according to an operation state of the operation unit.

5. The imaging apparatus according to claim 3 or 4,
further comprising a detection unit that detects shake of the apparatus,
wherein the control unit controls the processing unit so that the audio is processed according to the shake of the apparatus detected by the detection unit.

6. The imaging apparatus according to any one of claims 1 to 5,
wherein the processing unit is a filter that changes a gain according to the frequency of the audio acquired by the audio acquisition unit, and
the control unit controls the characteristics of the filter according to the recognition result of the recognition unit.

7. An imaging apparatus comprising:
an image acquisition unit that acquires an image;
an audio acquisition unit that acquires audio in association with the image;
a recognition unit that recognizes a state in which the image is acquired; and
a storage unit that stores the image, the audio, and a recognition result of the recognition unit in a storage medium in association with each other.

8. The imaging apparatus according to claim 7,
wherein the recognition unit recognizes the state in which the image is acquired by performing scene analysis using the image.