JP2010147587A - Imaging device and imaging method

Info

Publication number: JP2010147587A
Application number: JP2008319969A
Authority: JP
Inventors: Tomoyuki Shiozaki; 智行塩崎
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2008-12-16
Filing date: 2008-12-16
Publication date: 2010-07-01
Anticipated expiration: 2028-12-16
Also published as: JP5235644B2

Abstract

PROBLEM TO BE SOLVED: To express, in a captured image, the feeling obtained from hearing at the time of shooting.

SOLUTION: The imaging device is provided with a microphone for acquiring sound information, a sound analyzing circuit for analyzing a feature amount of a sound signal, and a system control means for controlling an image processing parameter and a photographing parameter on the basis of the feature amount of the sound signal acquired by the microphone. The imaging device can thereby express the feeling obtained from hearing at the time of shooting and can generate a captured image with a sense of presence.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to an imaging apparatus and an imaging method, and in particular to a technique well suited to capturing images that reflect the atmosphere and sense of presence at the time of shooting.

Conventionally, Patent Document 1 discloses a technique intended to reflect the photographer's emotions at the time of shooting in the captured image. The apparatus has an emotion detection sensor that detects states such as the photographer's body temperature, perspiration, and grip pressure, and an environment detection sensor that detects the position, outside air temperature, and outside humidity at the time of shooting. It changes the image processing parameters of the acquired image according to the value detected by each sensor.

Patent Document 2 discloses a technique intended to let the viewer sense the "atmosphere at the time of shooting" from the captured image. A camera capable of audio input analyzes the volume and intonation of the input audio, converts the audio into a character image in the format best suited to the analysis result, and obtains a captured image in which the character image is combined with the subject image.

Patent Document 1: Japanese Patent Laid-Open No. 2005-250977
Patent Document 2: Japanese Patent Laid-Open No. 2003-348411

In general, an imaging apparatus converts an optical image formed through an imaging optical system into an electrical signal to acquire image information, and records that information on a recording medium or displays it on a display device. It is, so to speak, a device that lets the photographer's visual information be recorded and preserved.

In reality, however, the photographer remembers the "picture" at the time of shooting by taking into account, often unconsciously, not only visual information but also information from other senses such as hearing and touch. For this reason, when a captured image is viewed again some time after shooting, it can feel different from the "picture" preserved in the photographer's memory.

The technique disclosed in Patent Document 1 can detect tactile sensations at the time of shooting, in particular the sensations of temperature and humidity, with its outside air temperature and humidity sensors and reflect them in the image, which yields an image with a greater sense of presence. It does not, however, address auditory information.

The technique disclosed in Patent Document 2, on the other hand, conveys the atmosphere at the time of shooting by combining auditory information with the captured image in the form of text. Expressing that information as text, however, departs somewhat from the original purpose of shooting, which is to record an image with a sense of presence.

In view of the above problems, an object of the present invention is to make it possible to express, in a captured image, the feeling obtained from hearing at the time of shooting.

An imaging apparatus according to the present invention comprises: imaging means for shooting a subject to generate still image data or moving image data; image processing means for applying predetermined image processing to the still image data or moving image data generated by the imaging means; audio input means for acquiring an audio signal when the imaging means shoots the subject; audio analysis processing means for analyzing a feature amount of the audio signal acquired by the audio input means; and system control means for controlling, based on the feature amount of the audio signal analyzed by the audio analysis processing means, an image processing parameter that the image processing means uses to apply the predetermined image processing to the still image data or moving image data.

An imaging method according to the present invention comprises: an imaging step of shooting a subject to generate still image data or moving image data; an image processing step of applying predetermined image processing to the still image data or moving image data generated in the imaging step; an audio input step of acquiring an audio signal when the subject is shot in the imaging step; an audio analysis processing step of analyzing a feature amount of the audio signal acquired in the audio input step; and a system control step of controlling, based on the feature amount of the audio signal analyzed in the audio analysis processing step, an image processing parameter used in the image processing step to apply the predetermined image processing to the still image data or moving image data.

A computer program according to the present invention causes a computer to execute an imaging method comprising: an imaging step of shooting a subject to generate still image data or moving image data; an image processing step of applying predetermined image processing to the still image data or moving image data generated in the imaging step; an audio input step of acquiring an audio signal when the subject is shot in the imaging step; an audio analysis processing step of analyzing a feature amount of the audio signal acquired in the audio input step; and a system control step of controlling, based on the feature amount of the audio signal analyzed in the audio analysis processing step, an image processing parameter used in the image processing step to apply the predetermined image processing to the still image data or moving image data.

According to the present invention, the feeling obtained from hearing at the time of shooting can be expressed in the captured image, so a captured image with a greater sense of presence can be obtained.

(First Embodiment)
FIG. 1 is a block diagram showing a configuration example of an imaging apparatus according to a first embodiment of the present invention.
In FIG. 1, reference numeral 100 denotes the imaging apparatus. Reference numeral 121 denotes an image sensor that converts into an electrical signal the optical image of a subject (not shown) formed by the lens unit 200 through an imaging optical system comprising the photographing lens 210, the aperture 211, the lens mounts 102 and 202, and the shutter 144.

Reference numeral 122 denotes an A/D converter that converts the analog signal output from the image sensor 121 into a digital signal to generate digital still image data or moving image data. The digital signal produced by the A/D converter 122 is stored in the memory 127 under the control of the memory control unit 124 and the system control unit 120.

Reference numeral 123 denotes an image processing unit, which performs predetermined pixel interpolation and color conversion processing on the digital signal data or on data read out via the memory control unit 124. The image processing unit 123 also includes a compression/decompression circuit that compresses and decompresses image data by adaptive discrete cosine transform (ADCT) or the like.

The image processing unit 123 can also read an image stored in the memory 127, compress or decompress it, and write the processed data back to the memory 127. In the moving image shooting mode, a series of moving image data captured in the memory 127 can be compressed during recording by the MPEG method or the like.

Furthermore, the image processing unit 123 can perform predetermined arithmetic processing using the captured image data and apply image processing such as contrast, brightness, saturation, white balance, and sharpness adjustment. It also performs TTL (through-the-lens) AWB (auto white balance) processing.

Reference numeral 124 denotes a memory control unit, which controls data transfer between the memory 127 and the A/D converter 122, the image processing unit 123, the liquid crystal panel display unit 125, and the external removable memory unit 131. Data digitized by the A/D converter 122 is written to the memory 127 via the image processing unit 123 and the memory control unit 124. Alternatively, the digital data output from the A/D converter 122 is written to the memory 127 directly.

Reference numeral 110 denotes a liquid crystal display device, which consists of a liquid crystal panel display unit 125 and a backlight illumination unit 126. The liquid crystal panel display unit 125 can display a menu screen written to the display data storage area of the memory 127 or an image file stored in the external removable memory unit 131.

The backlight illumination unit 126 uses an LED, an organic EL element, a fluorescent tube, or the like as the light source element that illuminates the liquid crystal panel display unit 125 from behind, and can be turned on or off at any time by an instruction from the system control unit 120. It also has a dimming function: the system control unit 120 adjusts the display luminance of the display device 110 by limiting the current supplied to the light source element, using either a voltage drive method or a PWM drive method.
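
As a rough illustration of the PWM-based dimming described above, the following sketch maps a requested luminance fraction to a duty cycle and the resulting average current. The function names, the 20 mA peak current, and the linear luminance model are assumptions made for illustration; the source states only that the current is limited by a voltage drive or PWM drive method.

```python
def pwm_duty_for_luminance(target_fraction: float) -> float:
    """Return a PWM duty cycle (0.0 to 1.0) for a requested luminance fraction.

    Assumption: luminance is proportional to average current, so the duty
    cycle equals the requested fraction, clamped to the valid range.
    """
    return max(0.0, min(1.0, target_fraction))


def average_current_ma(duty: float, peak_current_ma: float = 20.0) -> float:
    """Average light-source current for a duty cycle (hypothetical 20 mA peak)."""
    return duty * peak_current_ma
```

Under these assumptions, halving the duty cycle halves the average current through the light source element, which is the lever the dimming function relies on.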

Reference numeral 127 denotes a memory for storing captured still images and moving images, images for playback display, and audio files. It has enough capacity to store a predetermined number of still images and moving images. The memory 127 can also be used as a work area for the system control unit 120.

Reference numeral 128 denotes an electrically erasable and recordable nonvolatile memory, for example a flash memory or an EEPROM. The nonvolatile memory 128 stores saved shooting states and the program that controls the imaging apparatus 100.

Reference numeral 130 denotes an eyepiece detection unit, which consists of an infrared emitter and a light receiving circuit. At regular intervals the infrared emitter emits infrared light, the light receiving circuit receives the infrared light reflected by an object, and the amount of received light is used to detect whether an object is present at a specified position.

Reference numeral 131 denotes an external removable memory unit for recording image files to, and reading them from, a recording medium such as a CompactFlash (registered trademark) card or an SD card. Reference numeral 138 denotes a power supply unit, which consists of a battery, a battery detection circuit, a DC/DC converter, a switch circuit that selects which blocks are energized, and the like. It detects whether a battery is installed, the battery type, and the remaining battery level. Based on the detection results and instructions from the system control unit 120, it controls the DC/DC converter and supplies the required voltage to each block for the required period.

The system control unit 120 performs power management for power saving by stopping the power supply to blocks that are not needed in the current operating state of the imaging apparatus 100. For example, during image playback or menu screen display the camera control blocks do not need to operate, so the power supply to the camera control unit 140, the photometry unit 142, the distance measuring unit 143, the image sensor 121, the lens unit 200, and the strobe unit 300 is cut off.

Reference numeral 141 denotes a shutter control unit that controls the shutter 144 based on photometric information from the photometry unit 142, in coordination with the lens control unit 203, which controls the aperture 211. Reference numeral 142 denotes a photometry unit for AE (automatic exposure) processing. In this embodiment it is also used to measure the brightness around the subject.

Specifically, light entering the photographing lens 210 is directed to the photometry unit 142 via the aperture 211, the lens mounts 202 and 102, and a photometric lens (not shown). This allows the exposure state of the subject formed as an optical image to be measured. In cooperation with the strobe unit 300, the photometry unit 142 also provides an EF (flash light control) processing function. The strobe unit 300 additionally has an AF auxiliary light projection function and a flash light control function.

Reference numeral 143 denotes a distance measuring unit for AF (autofocus) processing. Light entering the photographing lens 210 is directed to the distance measuring unit 143 via the aperture 211, the lens mounts 202 and 102, and a distance measuring mirror (not shown). This allows the focus state of the image formed as an optical image to be measured. Because the apparatus has a dedicated distance measuring unit 143 and a dedicated photometry unit 142 as described above, the AF (autofocus), AE (automatic exposure), and EF (flash light control) processing are each performed using these two units.

Reference numeral 140 denotes a camera control unit, which controls the sequence of camera operations by communicating with the shutter control unit 141, the photometry unit 142, and the distance measuring unit 143. The camera control unit 140 can also control the lens unit 200 and the strobe unit 300.

Reference numerals 132, 133, 134, 135, 136, and 137 make up an operation unit for inputting various operating instructions to the system control unit 120. The operation unit consists of one or a combination of switches, dials, a touch panel, pointing by gaze detection, a voice recognition device, and the like.

The operating members that make up the operation unit are described in detail below.
The playback switch 132 is used to enter the playback display mode, in which predetermined image data is displayed on the liquid crystal panel display unit 125. The playback switch 132 must be operated in order to play back an image file stored in the external removable memory unit 131. If the switch is operated while the apparatus is already in playback display mode, the apparatus switches from playback display mode to shooting mode.

The menu switch 133 displays a list of items on the liquid crystal panel display unit 125. The items include shooting-related settings, recording medium formatting, clock settings, development parameter settings, and user function settings (custom function settings).

The mode dial 134 is used to switch among the various shooting modes. In this embodiment, the available modes include automatic shooting mode, program shooting mode, shutter speed priority mode, aperture priority mode, manual shooting mode, portrait mode, landscape mode, sports mode, night scene mode, and moving image mode.

The release switch 135 turns ON in two stages: when the release button is half pressed (SW1) and when it is fully pressed (SW2). The half-pressed state instructs the start of operations such as AF (autofocus), AE (automatic exposure), AWB (auto white balance), and EF (flash light control) processing.

The fully pressed state instructs the start of a series of processes. First, imaging processing converts the analog signal read from the image sensor 121 into digital image data with the A/D converter 122 and writes it to the memory 127 under the control of the memory control unit 124. Next, development processing performs calculations in the image processing unit 123 and the memory control unit 124 so that the data can be viewed as an image. Finally, recording processing reads the image data from the memory 127, compresses it in the image processing unit 123, and writes it to a recording medium (not shown) mounted in the external removable memory unit 131.

The operation unit 136, which consists of various button switches, is used for settings such as shooting mode, continuous shooting mode, set, macro, page feed, flash settings, menu navigation, white balance selection, image quality selection, exposure compensation, and date/time. It also includes a moving image shooting switch for starting and stopping moving image shooting, up/down/left/right direction switches, a zoom magnification switch for playback images, and an image display ON/OFF switch for the liquid crystal panel display unit 125. In addition, there is a quick review ON/OFF switch for automatically playing back captured image data immediately after shooting, and an image erase switch for deleting a playback image.

There is also a compression mode switch for selecting the compression ratio for JPEG and MPEG compression, or a CCDRAW mode in which the image sensor signal is digitized and recorded as is. Further, there is an AF mode setting switch for choosing between a one-shot AF mode, which holds the in-focus state while the release switch is half pressed, and a servo AF mode, which keeps autofocusing continuously.

Reference numeral 137 denotes a power switch, which switches the imaging apparatus 100 between power-on and power-off. It can also switch the power of accessories connected to the imaging apparatus 100, such as the lens unit 200, the strobe unit 300, and the recording medium. The timer function 139 provides clock, calendar, timer counter, and alarm functions, and is used for system management such as the time until entering sleep mode and alarm notification.

Reference numerals 102 and 202 denote lens mounts, which are the interface for connecting the imaging apparatus 100 to the lens unit 200. Reference numerals 101 and 201 denote connectors that electrically connect the imaging apparatus 100 to the lens unit 200 and are controlled by the camera control unit 140. Reference numerals 111 and 301 denote accessory shoes, which are the interface for connecting the imaging apparatus 100 to the strobe unit 300.

The lens control unit 203 controls the entire lens unit 200 and includes a memory that stores constants, variables, programs, and the like for operation. It also includes a nonvolatile memory (not shown) that holds identification information such as a number unique to the lens unit 200, management information, functional information such as the maximum aperture value, minimum aperture value, and focal length, and current and past setting values. The lens control unit 203 also controls the aperture 211 and the focusing and zooming of the photographing lens 210.

The strobe unit 300 has a strobe emission control unit 302. When the strobe unit is connected to the accessory shoe 111 via the interface 301 and is operating, the strobe emission control unit 302 controls the emission amount and emission timing of a light emitting section (not shown) such as a xenon tube, based on information from the photometry unit 142.

Reference numeral 150 denotes a microphone provided as an audio input unit for acquiring an audio signal. It converts sound into an electrical signal and inputs the audio signal to the imaging apparatus 100. Reference numeral 151 denotes an A/D converter that converts the analog signal output from the microphone 150 into a digital signal. The digital signal produced by the A/D converter 151 is encoded by the AudioCodec unit 152. The signal encoded by the AudioCodec unit 152 is stored in the memory 127 under the control of the memory control unit 124 and the system control unit 120.

During audio playback, an audio file stored in the memory 127 is decoded by the AudioCodec unit 152. The decoded signal is converted from a digital signal to an analog signal by the D/A converter 153. Reference numeral 154 denotes a speaker for audio output, which outputs the analog audio signal produced by the D/A converter 153 as sound.

Reference numeral 155 denotes an audio analysis processing unit for obtaining audio feature amounts from the audio signal. The audio analysis processing unit 155 consists of a spectrum analyzer circuit and yields the volume and frequency distribution of the input audio signal. The feature amounts of the input audio signal obtained by the audio analysis processing unit 155 are processed by the system control unit 120, and the result is reflected in the parameters of the image processing applied by the image processing unit 123.

Reference numeral 156 denotes a voice recognition mode switch. Turning the voice recognition mode switch 156 ON starts the operating mode of this embodiment (voice recognition mode), in which audio feature amounts are extracted and used to change the image processing parameters and shooting parameters.

FIG. 2 is a flowchart illustrating an example of the processing procedure executed in the imaging apparatus 100 of the first embodiment.
In the imaging apparatus of FIG. 1, the following flow is executed when the voice recognition mode has been selected with the voice recognition mode switch 156.
In step S101, it is determined whether the release switch 135 is half pressed (SW1). If SW1 is ON, the process proceeds to step S102.

In step S102, an audio signal is acquired from the microphone 150.
In step S103, the audio analysis processing unit 155 analyzes the volume level and frequency of the input audio signal.

In step S104, the system control unit 120 performs predetermined arithmetic processing on the volume levels and frequency distribution of the acquired audio signal to calculate the average volume level within the audio acquisition period, the volume change rate per unit time, the average frequency, and the change rate of the average frequency per unit time. From these results it extracts audio feature amounts of the input audio such as volume, tempo, pitch, and intonation. Details are described later.

In step S105, it is determined whether the release switch 135 is fully pressed (SW2). If SW2 is ON, audio acquisition stops and the release operation starts. If SW2 is not ON, the process returns to step S102. In other words, the audio acquisition period is the interval from when SW1 of the release switch 135 turns ON until SW2 turns ON.

In step S106, imaging processing is performed in which the signal read from the image sensor 121 is converted into image data and written to the memory 127.
In step S107, image processing parameter values such as contrast, brightness, saturation, and sharpness are set according to the audio feature amounts obtained in step S104.

ステップＳ１０８では、所定の画素補間処理や色変換処理に加え、ステップＳ１０７で設定したパラメータに基づく画像処理を施し、画像データの圧縮処理を行う。
ステップＳ１０９では、ステップＳ１０８において得られた画像データを外部着脱メモリ部１３１に記録する。また、表示装置１１０に画像を表示する。 In step S108, in addition to predetermined pixel interpolation processing and color conversion processing, image processing based on the parameters set in step S107 is performed, and image data compression processing is performed.
In step S109, the image data obtained in step S108 is recorded in the external removable memory unit 131. In addition, an image is displayed on the display device 110.

Next, the method for extracting the audio feature amounts is described.
As described above, the audio signal is analyzed by the sound analysis processing unit 155. The sound analysis processing unit 155 includes a spectrum analyzer function and can obtain the frequency distribution of the acquired audio signal. As a result, the volume level in each frequency band can be obtained, as shown in FIG. 3.

From the frequency distribution obtained in this way, Equations 1, 2, 3, and 4 give the average volume A within the audio acquisition time, the rate of change of volume per unit time B, the average frequency C, and the rate of change of the average frequency per unit time D. Here, T is the total number of audio acquisition samples, K is the number of frequency samples, f_{t,k} is the k-th sampled frequency at the t-th audio acquisition sample, and a_{t,k} is the volume level at frequency f_{t,k}.

Here, the rate of change of volume per unit time B corresponds to the tempo of the input audio signal, the average frequency C to its pitch, and the rate of change of the average frequency per unit time D to its intonation. In this way, the volume, tempo, pitch, and intonation that characterize the input sound can be obtained. For simplicity, the four equations above may be replaced by moving-average forms.
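Equations 1 to 4 are not reproduced in this text, so the following Python sketch shows only one plausible way to compute quantities of the kind described; it should not be read as the patent's own formulas. It assumes that the per-sample volume is the mean level over the K frequency samples, that the per-sample average frequency is the level-weighted mean of f_{t,k}, and that the rates of change are mean absolute differences between consecutive samples. The function name `audio_features` and the parameter `dt` (time between samples) are hypothetical.

```python
def audio_features(f, a, dt=1.0):
    """Return (A, B, C, D) from sampled spectra.

    f[t][k]: k-th sampled frequency at acquisition sample t
    a[t][k]: volume level at frequency f[t][k]
    dt:      assumed time between acquisition samples
    """
    T = len(a)
    # Per-sample volume: mean level over the frequency samples (assumption).
    vol = [sum(row) / len(row) for row in a]
    # Per-sample average frequency: level-weighted mean (assumption).
    freq = []
    for f_row, a_row in zip(f, a):
        total = sum(a_row)
        weighted = sum(fk * ak for fk, ak in zip(f_row, a_row))
        freq.append(weighted / total if total > 0 else 0.0)

    def mean_abs_rate(series):
        # Mean absolute change between consecutive samples, per unit time.
        if len(series) < 2:
            return 0.0
        steps = [abs(series[t] - series[t - 1]) for t in range(1, len(series))]
        return sum(steps) / (len(steps) * dt)

    A = sum(vol) / T            # average volume (volume)
    B = mean_abs_rate(vol)      # volume change rate (tempo)
    C = sum(freq) / T           # average frequency (pitch)
    D = mean_abs_rate(freq)     # frequency change rate (intonation)
    return A, B, C, D


f = [[100.0, 200.0]] * 3
a = [[1.0, 1.0], [3.0, 1.0], [1.0, 1.0]]
A, B, C, D = audio_features(f, a)
```

Other definitions (for example signed rather than absolute differences) would fit the description equally well; the choice here is only for illustration.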

Next, the method for setting the image processing parameters is described. The image processing parameters to be changed are contrast, sharpness, brightness, saturation, and white balance. Relative to the default image processing, the strength of each effect is adjusted according to the strength of the volume, tempo, pitch, and intonation of the input sound.

The setting method is illustrated below using contrast as an example. Contrast processing has an image processing table in which the strength of the effect varies in proportion to the contrast setting value Cr. The larger Cr is, the stronger the contrast; the smaller Cr is, the weaker the contrast.

For contrast adjustment, each audio feature has its own parameter: volume Vcr, tempo Tcr, pitch Pcr, and intonation Icr. Each value is set according to the strength of the acquired feature. Using these parameters, the contrast setting value Cr is given by Equation 5.
Cr = Cro × Vcr × Tcr × Pcr × Icr (Equation 5)
Here, Cro is the default contrast setting value.

Each image processing parameter and the audio features are assumed to be related as shown in FIG. 4. Taking the contrast setting value as an example, contrast becomes stronger as volume and intonation increase, and weaker as pitch and tempo increase. That is, in Equation 5, if each audio feature is larger than its default, Vcr and Icr take values greater than 1, and Pcr and Tcr take values less than 1.
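Equation 5 can be illustrated with a short Python sketch. The text does not say how each factor is derived from its feature, so the power-law mapping in `factor` and the `gain` value are assumptions chosen only so that the factors move in the stated directions; all function and key names are hypothetical.

```python
def factor(value, default, direction, gain=0.5):
    """Map a feature to a multiplicative factor (assumed power law).

    direction=+1: factor > 1 when value > default
    direction=-1: factor < 1 when value > default
    """
    return (value / default) ** (direction * gain)


def contrast_setting(cro, features, defaults):
    """Equation 5: Cr = Cro * Vcr * Tcr * Pcr * Icr."""
    vcr = factor(features["volume"], defaults["volume"], +1)
    tcr = factor(features["tempo"], defaults["tempo"], -1)
    pcr = factor(features["pitch"], defaults["pitch"], -1)
    icr = factor(features["intonation"], defaults["intonation"], +1)
    return cro * vcr * tcr * pcr * icr


defaults = {"volume": 1.0, "tempo": 1.0, "pitch": 1.0, "intonation": 1.0}
loud = dict(defaults, volume=4.0)   # louder than default: contrast rises
fast = dict(defaults, tempo=4.0)    # faster than default: contrast falls
cr_default = contrast_setting(1.0, defaults, defaults)
cr_loud = contrast_setting(1.0, loud, defaults)
cr_fast = contrast_setting(1.0, fast, defaults)
```

With every feature at its default, each factor is 1 and Cr equals Cro, which matches the role of Cro as the default setting.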

The setting method has been shown using contrast as an example. Sharpness, brightness, saturation, and white balance likewise have image processing tables, and each effect changes with the audio feature values. However, as shown in FIG. 4, the direction in which each effect changes with respect to the audio features differs.

For example, in an environment with high volume and strong intonation, the brightness and saturation of the image are increased to produce a striking image. In addition, contrast and sharpness are lowered and the white balance is shifted toward yellow, giving the picture a soft, bright look.

Conversely, in an environment with low volume and weak intonation, high pitch, and fast tempo, brightness and saturation are lowered and the white balance is shifted toward blue. Contrast and sharpness are then increased to produce a crisp image.

As described above, the configuration of the first embodiment makes it possible to acquire the ambient sound immediately before shooting and reflect it in the image processing. In the first embodiment, the audio acquisition period is the period from when the release switch 135 is half-pressed (SW1) until it is fully pressed (SW2), but a button other than the release switch 135 may be used.

The audio acquisition period may also be any other period. For example, image processing may be applied based on audio features acquired in advance before shooting. Alternatively, audio features may be extracted after shooting and image processing applied to an image that has already been captured.

(Second Embodiment)
Next, a second embodiment of the present invention is described.
The first embodiment described an example in which the present invention is carried out during still image shooting. The second embodiment describes an example in which the present invention is applied during moving image shooting. The configuration of the imaging apparatus of this embodiment is the same as in the first embodiment, so drawings describing the configuration are omitted.

FIG. 5 is a flowchart illustrating the processing procedure executed in the second embodiment.
The following flow is executed when the voice recognition mode is selected with the voice recognition mode switch 156 provided on the imaging apparatus 100 of FIG. 1.

First, in step S201, it is determined whether the moving image switch (not shown) of the operation unit 136 has been pressed and moving image shooting has started. If moving image shooting has started, the process proceeds to step S202.
In step S202, an imaging process is performed in which the signal read from the image sensor 121 is converted into image data and written to the memory 127. An audio signal is also acquired from the microphone 150. Next, in step S203, it is selected whether to process the image data acquired in step S202 or the audio signal. When the audio signal is to be processed, the process proceeds to step S204. When the image data is to be processed, the process proceeds to step S206.

In step S204, the sound analysis processing unit 155 analyzes the volume level and frequency of the input audio signal.
Next, in step S205, the system control unit 120 applies predetermined calculations to the volume level and frequency distribution of the audio signal obtained in step S204. This computes the average volume level within the audio acquisition period, the rate of change of volume per unit time, the average frequency, and the rate of change of the average frequency per unit time. From these results, audio feature amounts of the input sound such as volume, tempo, pitch, and intonation are extracted. The details of the extraction are as described in the first embodiment.

However, in moving image shooting the acquired audio features must be reflected in the image processing dynamically. It is therefore desirable to compute the rate of change of volume per unit time, the average frequency, and the rate of change of the average frequency per unit time using moving-average forms.
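The moving-average idea can be sketched in Python as follows. This is a minimal illustration, assuming a simple (unweighted) moving average over a fixed number of recent samples; the text does not specify the window length or the averaging form, and the class name `MovingAverage` is hypothetical.

```python
from collections import deque


class MovingAverage:
    """Simple moving average over the most recent `window` samples."""

    def __init__(self, window):
        # A bounded deque drops the oldest sample automatically.
        self.samples = deque(maxlen=window)

    def update(self, value):
        """Add one sample and return the current average."""
        self.samples.append(value)
        return sum(self.samples) / len(self.samples)


# Example: smooth a per-frame average-frequency series with a 3-sample window.
ma = MovingAverage(window=3)
smoothed = [ma.update(v) for v in [3.0, 6.0, 9.0, 12.0]]
```

Because each update uses only the latest samples, the feature values can follow changes in the sound while the moving image is being recorded, rather than being fixed once for the whole acquisition period.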

Next, in step S206, image processing parameter values such as contrast, brightness, saturation, and sharpness are set according to the audio feature amounts obtained in step S205.
Next, in step S207, in addition to predetermined pixel interpolation and color conversion processing, image processing based on the parameters set in step S206 is applied.

Next, in step S208, it is determined whether the moving image switch of the operation unit 136 has been pressed and moving image shooting has stopped. If moving image shooting has stopped, the process proceeds to step S209. If it has not stopped, the process returns to step S202 and the operations described above are repeated.

In step S209, the acquired moving image data is compressed by the image processing unit 123 using the MPEG method. The audio data is compressed by the AudioCodec unit 152.
Next, in step S210, the moving image data and audio data compressed in step S209 are recorded in the external removable memory unit 131 as a moving image file (moving image data + audio data).

In the second embodiment described above, the method for extracting the audio feature amounts and the method for changing the image processing parameters are the same as in the first embodiment, so their description is omitted. According to the second embodiment, the present invention, which changes image processing based on the audio features present during shooting, can also be applied to moving images.

(Third Embodiment)
The first and second embodiments described examples in which image processing parameters are changed based on the audio features. In contrast, the third embodiment describes an example in which shooting parameters are changed based on the audio features. The configuration of the imaging apparatus is the same as in the first embodiment described with reference to FIG. 1, so drawings describing the configuration are omitted.

FIG. 6 is a flowchart illustrating the processing procedure of the imaging method performed in the third embodiment. This process is executed when the voice recognition mode is selected with the voice recognition mode switch 156 in the imaging apparatus of FIG. 1.

First, in step S301, the eyepiece detection unit 130 determines whether the photographer's eye is detected at the eyepiece. If it is, the process proceeds to step S302.
In step S302, an audio signal is acquired from the microphone 150, and the process proceeds to step S303. In step S303, the sound analysis processing unit 155 analyzes the volume level and frequency of the audio signal input from the microphone 150.

Next, in step S304, the system control unit 120 applies predetermined calculations to the volume level and frequency distribution of the acquired audio signal to compute the average volume level within the audio acquisition period, the rate of change of volume per unit time, the average frequency, and the rate of change of the average frequency per unit time. From these results, audio feature amounts of the input sound such as volume, tempo, pitch, and intonation are extracted. Details are given in the first embodiment.

Next, in step S305, it is determined whether the release switch 135 is half-pressed (SW1). If SW1 is ON, audio acquisition is stopped and the apparatus shifts to the shooting mode. In other words, the audio acquisition period runs from when the eyepiece detection unit 130 detects the eye until SW1 turns ON.
In step S306, automatic exposure (AE) is performed based on the photometric values, obtained from the photometric unit 142, that indicate the brightness of the subject and its surroundings, and the aperture and shutter speed are set.

In step S307, the aperture and shutter speed settings obtained in step S306 are changed based on the audio feature amounts obtained in step S304.
In step S308, it is determined whether the release switch 135 is fully pressed (SW2).

If SW2 is ON as a result of the determination in step S308, the process proceeds to step S309. In step S309, the shooting operation is performed using the aperture and shutter speed settings set in step S307. In the third embodiment, the method for extracting the audio feature amounts is the same as in the first embodiment, so a detailed description is omitted.

Next, the method for setting the shooting parameters is described. The shooting parameters to be changed are the aperture and shutter speed settings.
Relative to the default aperture and shutter speed obtained in step S306 of FIG. 6, the settings are adjusted according to the strength of the volume, tempo, pitch, and intonation of the input sound.

For exposure (Ev), aperture (Av), and shutter speed (Tv), there are audio feature parameters volume V*, tempo T*, pitch P*, and intonation I*, where * = ev denotes the exposure parameter, * = av the aperture parameter, and * = tv the shutter speed parameter. In this embodiment, each value is set according to the strength of the features acquired in step S304.

That is, the exposure correction parameter Kev is obtained from Equation 6. Kev determines by how many stops the exposure should be brightened. If the default exposure value is Evo, the corrected exposure Ev is Evo brightened by Kev stops.

Similarly, for the aperture Av and the shutter speed Tv, the aperture correction parameter Kav is obtained from Equation 7 and the shutter speed correction parameter Ktv from Equation 8. The corrected aperture Av is the default setting Avo lowered by Kav stops, and the corrected shutter speed Tv is the default setting Tvo raised by Ktv stops.

The audio features and the shooting parameters are assumed to be related as shown in FIG. 7. For example, in an environment with high volume and strong intonation, correcting the aperture toward open, lowering the shutter speed, and raising the exposure yields an image with a soft, bright look. Conversely, in an environment with low volume and weak intonation, high pitch, and fast tempo, the exposure is lowered, the aperture is stopped down, and the shutter speed is raised to produce a crisp image.
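Equations 6 to 8 are not reproduced in this text, so the following Python sketch takes the correction parameters Kav and Ktv as given inputs and only illustrates how a correction expressed in stops could be applied to concrete settings. It assumes that Av and Tv follow the usual APEX convention (Av = 2·log2(N) for f-number N, Tv = −log2(t) for exposure time t in seconds); the text does not state this, and the function name `apply_stops` is hypothetical.

```python
import math


def apply_stops(f_number, exposure_time, kav, ktv):
    """Lower Av by kav stops and raise Tv by ktv stops (APEX assumed).

    Returns the corrected (f_number, exposure_time).
    """
    av = 2 * math.log2(f_number) - kav      # Av lowered: aperture opens
    tv = -math.log2(exposure_time) + ktv    # Tv raised: exposure time shortens
    return 2 ** (av / 2), 2 ** (-tv)


# Example: defaults of f/4 and 1/60 s, opened by 2 stops and sped up by 1 stop.
n, t = apply_stops(4.0, 1 / 60, kav=2, ktv=1)
```

Under this convention, each stop of aperture changes the f-number by a factor of √2 and each stop of shutter speed halves or doubles the exposure time. Negative values of `kav` or `ktv` move the settings in the opposite direction.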

According to the third embodiment described above, the present invention can also be applied to imaging apparatuses with limited image processing functions. A synergistic effect can also be expected from combining it with the first embodiment.

The third embodiment assumes still image shooting, but it may also be applied to moving image shooting. For a moving image, in addition to controlling exposure and aperture, the frame rate at the time of shooting may be changed instead of adjusting the shutter speed. For example, if the tempo among the audio features is fast, thinning out the frames of the moving image produces a fast-forward effect on playback and conveys the sense of presence more strongly.

In the third embodiment, the audio acquisition period is the period from when the eyepiece detection unit 130 detects the eye until the release switch 135 is half-pressed (SW1), but another button may be used. The audio acquisition period may also be any other period. For example, the shooting parameters may be set based on audio features acquired in advance before shooting.

(Fourth Embodiment)
Next, a fourth embodiment of the present invention is described.
In the first to third embodiments, the image processing parameters and shooting parameters are changed using volume, tempo, pitch, and intonation as the features of the audio signal. In contrast, the fourth embodiment determines whether the distribution of the acquired frequency components lies within the audible range and changes the image processing parameters and shooting parameters to be set according to the result.

In general, urban areas contain almost no sound above the upper limit of the human audible range (20 kHz). FIG. 8 schematically shows the frequency distribution of ambient sound in an urban area; the upper limit of the average frequency is only about 10 kHz.

In contrast, FIG. 9 schematically shows the frequency distribution of ambient sound in a natural environment such as a tropical rainforest, which abounds in high-frequency components, inaudible as sound, reaching 150 kHz or higher. Therefore, in an imaging apparatus configured as in the first to third embodiments, the average volume level of the frequency components above the audible range (20 kHz) is calculated from the ambient sound acquired at the time of shooting. It is then determined whether the shooting environment is a natural environment or an urban environment, and image processing and shooting parameters for the natural or urban environment are set.

FIG. 10 is a flowchart illustrating the processing procedure of the fourth embodiment. The rest of the configuration is the same as in the first to third embodiments, so the block diagram and description of the imaging apparatus are omitted.
First, in step S401, an audio signal is acquired with the microphone 150.
In step S402, the sound analysis processing unit 155 analyzes the volume level and frequency of the audio signal output from the microphone 150.

Next, in step S403, the system control unit 120 applies predetermined calculations to the volume level and frequency distribution of the acquired audio signal and calculates the average volume level of the frequency components above the audible range (20 kHz).

Next, in step S404, it is determined whether the average volume level of the frequency components above the audible range (20 kHz) is at or above a predetermined threshold. If it is at or above the threshold, the process proceeds to step S406; if it is below the threshold, the process proceeds to step S405.

In step S405, the shooting environment is judged to be urban, and parameters for urban areas are set. For example, brightness and saturation are lowered and the white balance is shifted toward blue. Contrast and sharpness are then increased to produce a crisp image that conveys the atmosphere of being surrounded by man-made objects.

In step S406, on the other hand, the shooting environment is judged to be natural, and parameters for natural environments are set. For example, the brightness and saturation of the image are increased to produce a vivid image. Contrast and sharpness are also lowered to give the image a soft, bright look.
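Steps S403 to S406 can be sketched in Python as follows. This is an illustration under stated assumptions, not the apparatus's actual processing: the spectrum is assumed to be given as parallel lists of frequencies and levels, the presets record only the direction of each adjustment (+1 raise, -1 lower) taken from the examples above, and all names (`select_preset`, `URBAN`, `NATURAL`) and the threshold value are hypothetical.

```python
AUDIBLE_LIMIT_HZ = 20_000  # upper limit of the audible range used in the text

# Direction of each adjustment relative to the default (+1 raise, -1 lower),
# following the examples given for steps S405 and S406.
URBAN = {"brightness": -1, "saturation": -1, "contrast": +1,
         "sharpness": +1, "white_balance": "blue"}
NATURAL = {"brightness": +1, "saturation": +1, "contrast": -1,
           "sharpness": -1}


def select_preset(freqs_hz, levels, threshold):
    """Classify the environment from the mean level above the audible range."""
    # S403: average level of the components above 20 kHz.
    ultra = [lv for f, lv in zip(freqs_hz, levels) if f > AUDIBLE_LIMIT_HZ]
    mean_level = sum(ultra) / len(ultra) if ultra else 0.0
    # S404: compare against the threshold; S405 / S406: choose the preset.
    if mean_level >= threshold:
        return "natural", NATURAL
    return "urban", URBAN


forest = select_preset([5_000, 10_000, 30_000, 60_000], [0.8, 0.5, 0.4, 0.6], 0.3)
city = select_preset([5_000, 10_000, 30_000], [0.8, 0.5, 0.05], 0.3)
```

When the spectrum contains no components above 20 kHz at all, this sketch treats the mean level as zero, so any positive threshold yields the urban preset.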

The configuration of the fourth embodiment has been described above. In the fourth embodiment only the image processing parameters are changed, but the shooting parameters may be changed as well. Other audio features such as volume, pitch, tempo, and intonation may also be used in combination.

In the embodiment described above, the ambient sound is classified into two categories, within and outside the audible range, and the image processing parameters are changed accordingly, but a finer classification may be used. For example, the volume may be evaluated at 10 kHz, 20 kHz, 50 kHz, and 100 kHz, the shooting environment inferred from the results, and the image processing parameters and shooting parameters changed accordingly.

Four embodiments have been described above.
In the description of the embodiments, the imaging apparatus 100 is assumed to be a digital camera with interchangeable lenses, but it may instead be configured as a compact digital camera with an integrated lens, a camera-equipped mobile phone, or a digital video camera.

Although the four embodiments have been described separately, they may be combined with one another. In addition, still image shooting and moving image shooting have been described separately, but the present invention may be applied to a still image captured during moving image shooting.

In the embodiments described above, examples of setting the image processing parameters and shooting parameters with respect to the audio features are shown in FIGS. 4 and 7, but the relationship between the audio features and each parameter may differ from that of the embodiments.

In the embodiments, the parameters changed based on the audio features are contrast, brightness, saturation, sharpness, and white balance, but other image processing parameters may be changed.

In the above description, only the image data processed based on the audio features is saved to the recording medium, but the apparatus may be configured so that the CCD RAW data before image processing, or image data processed with the default image processing, can be saved at the same time.

The imaging apparatus 100 has been described as having the external removable memory unit 131 attached, but one or more recording media may be used in any combination. In the embodiments described above, the camera control unit 140 and the system control unit 120 are independent circuits, but the system control unit 120 may also serve as the camera control unit 140.

In the description of the embodiments, the imaging apparatus 100 is assumed to hold programs and display data in a nonvolatile memory, but a hard disk, DVD-ROM, CD-ROM, or the like may be used instead.

(Other Embodiments of the Present Invention)
Each means constituting the imaging apparatus in the embodiments of the present invention described above can be realized by running a program stored in the RAM, ROM, or the like of a computer. This program and a computer-readable recording medium on which the program is recorded are included in the present invention.

The present invention can also be embodied as, for example, a system, an apparatus, a method, a program, or a storage medium. Specifically, it may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device.

The present invention also covers the case where a software program that executes each step of the imaging method described above (in the embodiments, the programs corresponding to the flowcharts shown in FIGS. 2, 5, 6, and 10) is supplied directly or remotely to a system or apparatus, and a computer of that system or apparatus reads and executes the supplied program code.

Accordingly, the program code itself, installed on a computer in order to realize the functional processing of the present invention on that computer, also implements the present invention. In other words, the present invention includes the computer program itself for realizing its functional processing.

In that case, as long as it has the functions of the program, it may take the form of object code, a program executed by an interpreter, script data supplied to an OS, or the like.

Various recording media can be used to supply the program. Examples include a floppy (registered trademark) disk, hard disk, optical disk, magneto-optical disk, MO, CD-ROM, CD-R, CD-RW, magnetic tape, nonvolatile memory card, ROM, and DVD (DVD-ROM, DVD-R).

As another supply method, a browser on a client computer can be used to connect to a web page on the Internet. The program can then be supplied by downloading, from that page to a recording medium such as a hard disk, either the computer program of the present invention itself or a compressed file that includes an automatic installation function.

The program can also be supplied by dividing the program code constituting the program of the present invention into a plurality of files and downloading each file from a different web page. That is, a WWW server that allows a plurality of users to download the program files for realizing the functional processing of the present invention on a computer is also included in the present invention.
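The split-and-download supply method can be sketched as follows. This is a minimal Python illustration, not part of the disclosed embodiments: the function names `split_program` and `reassemble` are hypothetical, the parts are assumed to have already been downloaded in order, and the SHA-256 integrity check is an added assumption that the source does not mention.

```python
import hashlib


def split_program(program: bytes, part_size: int) -> list[bytes]:
    """Server side: divide the program into fixed-size parts (last may be shorter)."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [program[i:i + part_size] for i in range(0, len(program), part_size)]


def reassemble(parts: list[bytes], expected_sha256: str) -> bytes:
    """Client side: join the downloaded parts in order and verify the digest."""
    program = b"".join(parts)
    if hashlib.sha256(program).hexdigest() != expected_sha256:
        raise ValueError("reassembled program does not match expected digest")
    return program
```

Each element returned by `split_program` would be hosted on a different page; the client collects them and calls `reassemble` before installing the result.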

Alternatively, the program of the present invention may be encrypted, stored in a storage medium such as a CD-ROM, and distributed to users, and users who satisfy predetermined conditions may be allowed to download key information for decryption from a web page via the Internet. Using that key information, the encrypted program can then be executed and installed on a computer.

The functions of the above-described embodiments are realized when the computer executes the read program. They can also be realized when an OS or the like running on the computer performs part or all of the actual processing.

Furthermore, the program read from the recording medium may be written into a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer. A CPU or the like provided on that board or unit then performs part or all of the actual processing based on the instructions of the program, and the functions of the above-described embodiments are also realized by that processing.

FIG. 1 is a block diagram illustrating a configuration example of an imaging apparatus according to the first embodiment of the present invention.
FIG. 2 is a flowchart illustrating an example of a processing procedure executed in the imaging apparatus according to the first embodiment of the present invention.
FIG. 3 is a diagram showing an example of the frequency distribution of the audio signal acquired in the first to fourth embodiments of the present invention.
FIG. 4 is a conceptual diagram showing the relationship between audio feature amounts and image processing parameters in the first and second embodiments of the present invention.
FIG. 5 is a flowchart illustrating a processing procedure executed in the imaging apparatus according to the second embodiment of the present invention.
FIG. 6 is a flowchart illustrating an example of a processing procedure executed in the imaging apparatus according to the third embodiment of the present invention.
FIG. 7 is a diagram showing the relationship between audio feature amounts and shooting parameters in the third embodiment of the present invention.
FIG. 8 is a diagram showing an example of the frequency distribution of environmental sound in an urban area in the fourth embodiment of the present invention.
FIG. 9 is a diagram showing an example of the frequency distribution of environmental sound in a natural environment in the fourth embodiment of the present invention.
FIG. 10 is a flowchart illustrating an example of a processing procedure executed in the imaging apparatus according to the fourth embodiment of the present invention.

Explanation of symbols

100 Imaging device
101 Lens connector
102 Lens mount (camera side)
110 Display device
111 Accessory shoe
120 System control unit
121 Image sensor
122 A/D conversion unit
123 Image processing unit
124 Memory control unit
125 Liquid crystal panel display unit
126 Backlight illumination
127 Memory
128 Non-volatile memory
130 Eyepiece detection unit
131 External removable memory unit
132 Playback switch
133 Menu switch
134 Mode dial
135 Release switch
136 Operation unit
137 Power switch
138 Power supply unit
139 Timer
140 Camera control unit
141 Shutter control unit
142 Photometry unit
143 Distance measurement unit
144 Shutter
150 Microphone
151 A/D conversion unit
152 Audio codec unit
153 D/A conversion unit
154 Speaker
155 Audio analysis processing unit
156 Voice recognition mode switch
200 Lens unit
201 Lens connector
202 Lens mount (lens side)
203 Lens control unit
210 Photographing lens
211 Aperture
300 Strobe unit
301 Interface
302 Strobe light emission control unit

Claims

1. An imaging apparatus comprising:
imaging means for photographing a subject and generating still image data or moving image data;
image processing means for performing predetermined image processing on the still image data or moving image data generated by the imaging means;
audio input means for acquiring an audio signal when the imaging means photographs the subject;
audio analysis processing means for analyzing a feature amount of the audio signal acquired by the audio input means; and
system control means for controlling, based on the feature amount of the audio signal analyzed by the audio analysis processing means, image processing parameters used by the image processing means when performing the predetermined image processing on the still image data or moving image data.

2. The imaging apparatus according to claim 1, wherein the imaging means includes: an image sensor that converts an optical image formed through an imaging optical system including a lens, a diaphragm, and a shutter into an electrical signal; photometric means that measures the brightness of the area around the subject; lens control means that controls the diaphragm in conjunction with the photometric value obtained by the photometric means; and shutter control means that controls the speed of the shutter.

3. The imaging apparatus according to claim 2, wherein the system control means controls shooting parameters, including the diaphragm and the speed of the shutter, based on the photometric value indicating the brightness of the area around the subject measured by the photometric means and the feature amount of the audio signal acquired when the subject is photographed.

4. The imaging apparatus according to claim 1, wherein the feature amount of the audio signal includes at least one of volume, tempo, pitch, and inflection, and the audio analysis processing means obtains the feature amount of the audio signal by performing a predetermined calculation based on the volume and frequency distribution of the audio signal.

5. The imaging apparatus according to claim 1, wherein the image processing parameters controlled by the system control means include at least one of contrast, sharpness, brightness, saturation, and white balance.

6. The imaging apparatus according to claim 1, wherein, when the imaging means performs moving image shooting, the system control means controls the frame rate of the moving image based on the feature amount of the audio signal acquired at the time of shooting.

7. The imaging apparatus according to claim 1, wherein the audio analysis processing means determines whether the distribution of frequency components acquired when the subject is photographed is within the audible range, and
the system control means controls, according to the determination result of the audio analysis processing means, the image processing parameters set when the image processing means performs image processing and the shooting parameters used when the imaging means captures an image.

8. An imaging method comprising:
an imaging step of photographing a subject to generate still image data or moving image data;
an image processing step of performing predetermined image processing on the still image data or moving image data generated in the imaging step;
an audio input step of acquiring an audio signal when the subject is photographed in the imaging step;
an audio analysis processing step of analyzing a feature amount of the audio signal acquired in the audio input step; and
a system control step of controlling, based on the feature amount of the audio signal analyzed in the audio analysis processing step, image processing parameters used when performing the predetermined image processing on the still image data or moving image data in the image processing step.

9. A computer program that causes a computer to execute an imaging method comprising:
an imaging step of photographing a subject to generate still image data or moving image data;
an image processing step of performing predetermined image processing on the still image data or moving image data generated in the imaging step;
an audio input step of acquiring an audio signal when the subject is photographed in the imaging step;
an audio analysis processing step of analyzing a feature amount of the audio signal acquired in the audio input step; and
a system control step of controlling, based on the feature amount of the audio signal analyzed in the audio analysis processing step, image processing parameters used when performing the predetermined image processing on the still image data or moving image data in the image processing step.