JP2014041300A - Signal processing device, imaging device, and program

Info

Publication number: JP2014041300A
Application number: JP2012184454A
Authority: JP
Inventors: Mitsuhiro Okazaki; 光宏岡崎
Original assignee: Nikon Corp
Current assignee: Nikon Corp
Priority date: 2012-08-23
Filing date: 2012-08-23
Publication date: 2014-03-06

Abstract

PROBLEM TO BE SOLVED: To improve the accuracy of noise estimation in noise reduction processing. SOLUTION: A signal processing device comprises: a conversion part that converts an input sound signal into a frequency spectrum in the frequency domain; a processing part that performs signal processing on the frequency spectrum obtained by the conversion part; an inverse conversion part that converts the frequency spectrum processed by the processing part into a sound time signal in the time domain; and a coupling adjustment part that multiplies the sound time signal output from the inverse conversion part by a window function whose values at both ends of a unit section along the time axis are smaller than its value at the center.

Description

The present invention relates to a signal processing device, an imaging device, and a program that perform signal processing on a sound signal.

When signal processing such as noise reduction is applied to a microphone sound signal picked up by a microphone, the microphone sound signal is divided into frames, and the sound time signal corresponding to each frame is converted into a frequency spectrum. Noise reduction processing is then performed on this frequency spectrum, and the processed frequency spectrum is returned to the time domain.
For example, in one known method the microphone sound signal is divided into frames and multiplied by a window function, and the windowed sound time signal is Fourier transformed to convert it from the time domain into a frequency spectrum in the frequency domain. A frequency spectrum estimated to be noise is subtracted from this frequency spectrum, and the result is inverse Fourier transformed to return it to a sound time signal in the time domain (see, for example, Patent Document 1).
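The frame-by-frame flow described above (window, Fourier transform, subtract an estimated noise spectrum, inverse transform) can be sketched as follows. This is only an illustration under assumptions that are not in the source: a Hann window, a naive DFT, subtraction on the magnitude with the phase left unchanged, and clamping at zero. The function names are hypothetical.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); adequate for a short demo."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each time sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def hann(n):
    """Hann window (assumed here; the text only says 'a window function')."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def subtract_noise(frame, noise_magnitude):
    """Window one frame, subtract a per-bin noise magnitude, return to the time domain."""
    window = hann(len(frame))
    spectrum = dft([s * w for s, w in zip(frame, window)])
    cleaned = []
    for value, noise in zip(spectrum, noise_magnitude):
        # Assumption: subtract on the magnitude, keep the phase, clamp at zero.
        magnitude = max(abs(value) - noise, 0.0)
        cleaned.append(cmath.rect(magnitude, cmath.phase(value)))
    return idft(cleaned)


frame = [math.sin(2 * math.pi * 3 * i / 32) for i in range(32)]
unchanged = subtract_noise(frame, [0.0] * 32)   # zero noise estimate: windowed frame comes back
silenced = subtract_noise(frame, [1e6] * 32)    # huge noise estimate: every bin is clamped to zero
```

With a zero noise estimate the function simply returns the windowed frame, which makes it easy to check that the transform pair is consistent before a real noise estimate is plugged in.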

Patent Document 1: JP 2011-095567 A

When signal processing is performed after the sound time signal has been converted into a frequency spectrum and the processed frequency spectrum is then returned to a sound time signal, noise may occur because the amplitude values of the sound time signal are offset at the joints between frames. That is, when the frequency spectrum is returned to the time domain, the joints between frames can become discontinuous, so that the sound skips and noise may occur at the joints.

The present invention has been made in view of the above, and its object is to provide a signal processing device, an imaging device, and a program that reduce the noise that may occur at the joints when a sound time signal is converted into the frequency domain, signal processing is performed, and the processed frequency spectrum is returned to a sound time signal.

The present invention has been made to solve the above problem. The signal processing device includes: a conversion unit that converts an input sound signal into a frequency spectrum in the frequency domain; a processing unit that performs signal processing on the frequency spectrum obtained by the conversion unit; an inverse conversion unit that converts the frequency spectrum processed by the processing unit into a sound time signal in the time domain; and a coupling adjustment unit that multiplies the sound time signal obtained by the inverse conversion unit by a window function whose values at both ends of a unit section are smaller than its value at the center.
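As a rough illustration of the coupling adjustment step, the sketch below multiplies each inverse-transformed frame by a window whose end values are smaller than its center value and then joins the frames. The Hann shape, the 50% hop, and joining by summation are assumptions made for this example; the window actually used by the device is defined elsewhere in the specification, and all names here are hypothetical.

```python
import math


def tapered_window(n):
    """Hann-shaped stand-in: values at both ends are smaller than the centre value."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def adjust_frame(time_frame):
    """Multiply an inverse-transformed frame by the tapered window."""
    window = tapered_window(len(time_frame))
    return [s * w for s, w in zip(time_frame, window)]


def overlap_add(frames, hop):
    """Join equal-length frames by summing them at a fixed hop (assumed joining scheme)."""
    length = hop * (len(frames) - 1) + len(frames[0])
    out = [0.0] * length
    for index, frame in enumerate(frames):
        for i, sample in enumerate(frame):
            out[index * hop + i] += sample
    return out


# Two processed frames whose levels disagree, as can happen after per-frame processing.
frame_a = [1.0] * 8
frame_b = [0.4] * 8
joined = overlap_add([adjust_frame(frame_a), adjust_frame(frame_b)], hop=4)
```

Because each frame is pulled toward zero at its edges, a frame contributes almost nothing at the sample where it begins or ends, so a level mismatch between neighbouring frames is spread over the overlap instead of appearing as a step.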

According to the present invention, when a time-domain sound time signal is converted into the frequency domain, signal processing is performed, and the processed frequency spectrum is returned to a time-domain sound time signal, the noise that may occur at the joints can be reduced.

FIG. 1 is a block diagram showing an example of the configuration of an imaging device according to a first embodiment of the present invention.
FIG. 2 is a reference diagram for explaining an example of the relationship between the operation timing signal of an operating part and the microphone sound signal according to the first embodiment.
FIG. 3 is a reference diagram for explaining the microphone sound signal shown in FIG. 2.
FIG. 4 is a block diagram showing an example of the functional configuration of a reduction processing unit according to the first embodiment.
FIG. 5 is a diagram for explaining an example of an impact sound processing frequency spectrum and an impact sound flooring spectrum.
FIG. 6 is a diagram for explaining an example of the frequency components of a frequency spectrum.
FIG. 7 is a diagram for explaining an example of impact sound noise reduction processing according to the first embodiment.
FIG. 8 is a diagram showing an example of a Hamming window function W_1.
FIG. 9 is a diagram showing an example of a Hanning window function W_2.
FIG. 10 is a diagram showing an example of a coupling adjustment window function W_3.
FIG. 11 is a diagram showing an example of a microphone sound signal input to the reduction processing unit.
FIG. 12 is a diagram showing an example of a sound time signal cut out by a sound signal cutout unit.
FIG. 13 is a diagram showing an example of sound time information obtained when a Hamming window processing unit multiplies the sound time signal by the Hamming window function W_1.
FIG. 14 is a diagram showing an example of sound time information after noise reduction processing by a noise reduction processing unit.
FIG. 15 is an enlarged view of the portion of the sound time signal shown in FIG. 14 that corresponds to the end of the window.
FIG. 16 is a diagram showing sound information joined by a signal superimposing unit.
FIG. 17 is an enlarged view of the joint shown in FIG. 16.
FIG. 18 is a diagram for explaining an example of a case not according to the present embodiment.
FIG. 19 is a diagram for explaining an example of a case not according to the present embodiment.
FIG. 20 is a diagram for explaining an example of a case not according to the present embodiment.
FIG. 21 is a diagram for explaining an example of a case not according to the present embodiment.
FIG. 22 is a diagram for explaining an example of a case not according to the present embodiment.
FIG. 23 is a flowchart for explaining an example of a noise reduction processing method according to the first embodiment.
FIG. 24 is a diagram showing an example of the window function in the numerator of a coupling adjustment window function W_5.
FIG. 25 is a diagram showing an example of the coupling adjustment window function W_5.
FIG. 26 is a diagram for explaining an example of a window function usable in the present embodiment.
FIG. 27 is a diagram for explaining an example of a window function usable in the present embodiment.
FIG. 28 is a diagram for explaining an example of a window function usable in the present embodiment.
FIG. 29 is a diagram showing a configuration example according to a second embodiment of the present invention.

[First Embodiment]
An embodiment of the present invention will be described in detail with reference to the drawings. FIG. 1 is a block diagram showing the configuration of the imaging device according to the present embodiment. The following describes an example in which the signal processing device according to the present invention is mounted on an imaging device, but the present invention is not limited to this.
As shown in FIG. 1, the imaging device 100 captures an image formed by an optical system and stores the obtained image data in the storage medium 200. It also performs noise reduction processing on the microphone sound signal picked up by the microphone and stores the resulting sound information in the storage medium 200.
The imaging device 100 includes a reduction processing unit 250. The reduction processing unit 250 acquires the estimated noise contained in the microphone sound signal and, based on this estimated noise, performs noise reduction processing that reduces the noise in the microphone sound.

The reduction processing unit 250 according to the present embodiment performs noise reduction processing to reduce the noise generated when an operating part operates (hereinafter, "operation sound"). For example, when the optical system of the imaging device 100 is driven in processing such as AF (Auto Focus) or VR (Vibration Reduction), the movement of the motor and the optical system generates operation sound. In addition, a loud operation sound is generated momentarily when the motor starts driving, stops driving, or switches its direction of rotation. A loud sound generated momentarily in this way, when the operating state of an operating part changes, is called an impact sound. A sound that is quieter than the impact sound and is generated while the optical system or the motor is moving is called a drive sound. That is, the drive sound is operation sound (noise) other than the impact sound. The noise that the reduction processing unit 250 according to the present embodiment aims to reduce is operation sound, which includes both drive sound and impact sound. Accordingly, the reduction processing unit 250 performs impact sound noise reduction processing, which reduces the noise due to impact sound in the microphone sound signal, and drive sound noise reduction processing, which reduces the noise due to drive sound in the microphone sound signal.

An example of the configuration of the imaging device 100 and the reduction processing unit 250 is described in detail below. In the present embodiment, the reduction processing unit 250 is built into the imaging device 100, but the present invention is not limited to this. For example, the reduction processing unit 250 may be a device external to the imaging device 100.

The imaging device 100 includes an imaging unit 110, a lens CPU 120, a buffer memory unit 130, an image processing unit 140, a display unit 150, a storage unit 160, a communication unit 170, an operation unit 180, a body CPU 190, a timekeeping unit 220, a microphone 230, an A/D conversion unit 240, a reduction processing unit 250, and a battery 260.

The imaging unit 110 includes an optical system 111, an image sensor 119, and an A/D (Analog/Digital) conversion unit 121, and is controlled by the lens CPU 120 according to an operation pattern predetermined for the set imaging conditions (for example, aperture value and exposure value). The imaging unit 110 forms the optical image produced by the optical system 111 on the image sensor 119 and generates image data based on the optical image converted into a digital signal by the A/D conversion unit 121.

The optical system 111 includes a focus adjustment lens (hereinafter, "AF lens") 112, a camera shake correction lens (hereinafter, "VR lens") 113, a zoom lens 114, a zoom encoder 115, a lens driving unit 116, an AF encoder 117, and a camera shake correction unit 118.
During the focus adjustment, camera shake correction, and zoom processing performed by the lens CPU 120, each component of the optical system 111 is driven according to an operation pattern predetermined for the corresponding function. That is, the optical system 111 is an operating part of the imaging device 100.

The optical system 111 guides the optical image that enters through the zoom lens 114 and passes through the zoom lens 114, the VR lens 113, and the AF lens 112, in that order, to the light receiving surface of the image sensor 119.
The lens driving unit 116 receives from the lens CPU 120 a drive control signal (command) for controlling the positions of the AF lens 112 and the zoom lens 114, and controls the positions of the AF lens 112 and the zoom lens 114 according to the received command.
That is, when this command is input from the lens CPU 120 to the lens driving unit 116 and the lens driving unit 116 is driven, the AF lens 112 and the zoom lens 114 move (operate). In the present embodiment, the timing at which the lens CPU 120 outputs the command is called the operation start timing, at which the operation of the AF lens 112 and the zoom lens 114 starts.

The zoom encoder 115 detects the zoom position, which represents the position of the zoom lens 114, and outputs it to the lens CPU 120. The zoom encoder 115 detects movement of the zoom lens 114; for example, it outputs a pulse signal to the lens CPU 120 while the zoom lens 114 is moving within the optical system 111, and stops outputting the pulse signal while the zoom lens 114 is stopped.

The AF encoder 117 detects the focus position, which represents the position of the AF lens 112, and outputs it to the lens CPU 120 and the body CPU 190. The AF encoder 117 detects movement of the AF lens 112; for example, it outputs a pulse signal to the lens CPU 120 while the AF lens 112 is moving within the optical system 111, and stops outputting the pulse signal while the AF lens 112 is stopped.

The zoom encoder 115 may detect the driving direction of the zoom lens 114 in order to detect the zoom position. Likewise, the AF encoder 117 may detect the driving direction of the AF lens 112 in order to detect the focus position.
For example, the zoom lens 114 and the AF lens 112 move in the optical axis direction when a driving mechanism (for example, a motor or a cam) driven by the lens driving unit 116 rotates clockwise (CW) or counterclockwise (CCW). The zoom encoder 115 and the AF encoder 117 may each detect that the zoom lens 114 or the AF lens 112 is moving by detecting the rotation direction of the driving mechanism (here, clockwise or counterclockwise).

The camera shake correction unit 118 includes, for example, a vibration gyro mechanism; it detects optical axis shake of the image formed by the optical system 111 and moves the VR lens 113 in the direction that cancels this shake. For example, the camera shake correction unit 118 outputs a high-level signal to the lens CPU 120 while it is moving the VR lens 113, and a low-level signal while the VR lens 113 is stopped.

The image sensor 119 has, for example, a photoelectric conversion surface; it converts the optical image formed on its light receiving surface into an electric signal and outputs that signal to the A/D conversion unit 121.
The image data obtained by the image sensor 119 when a shooting instruction is received via the operation unit 180 is stored in the storage medium 200, via the A/D conversion unit 121, as still image or moving image data. While no shooting instruction is being received via the operation unit 180, the image sensor 119 outputs the continuously obtained image data, via the A/D conversion unit 121, to the body CPU 190 and the display unit 150 as through-image data (preview image data).

The A/D conversion unit 121 digitizes the electric signal produced by the image sensor 119 and outputs the resulting digital image data to the buffer memory unit 130.

The buffer memory unit 130 temporarily stores the image data captured by the imaging unit 110. It also temporarily stores the microphone sound signal corresponding to the sound picked up by the microphone 230.

The image processing unit 140 refers to the information indicating image processing conditions stored in the storage unit 160 and performs image processing on the image data temporarily stored in the buffer memory unit 130. The processed image data is stored in the storage medium 200 via the communication unit 170. The image processing unit 140 may also perform image processing on image data stored in the storage medium 200.

The display unit 150 is, for example, a liquid crystal display, and displays the image data obtained by the imaging unit 110, operation screens, and the like.

The storage unit 160 stores, among other things, information indicating the determination conditions that the lens CPU 120 refers to during scene determination and information indicating the imaging conditions associated with each scene identified by scene determination.

The communication unit 170 is connected to a removable storage medium 200 such as a memory card, and writes information (image data, sound data, and the like) to the storage medium 200, reads it from the storage medium 200, or erases it.

The operation unit 180 includes, for example, a power switch, a shutter button, a multi-selector (cross key), and other operation keys. It accepts operation input when operated by the user and outputs operation information indicating the content of that input to the lens CPU 120 and the body CPU 190. The operation unit 180 may generate a physical operation sound when pressed by the user. In the present embodiment, the timing at which operation information indicating the content of the user's operation input is input from the operation unit 180 to the lens CPU 120 or the body CPU 190 is called the operation start timing, at which the operation of the operation unit 180 starts.

The storage medium 200 is a storage unit that is detachably connected to the imaging device 100 and stores, for example, the image data generated (captured) by the imaging unit 110 and the sound information processed by the reduction processing unit 250.

The bus 210 is connected to the imaging unit 110, the lens CPU 120, the buffer memory unit 130, the image processing unit 140, the display unit 150, the storage unit 160, the communication unit 170, the operation unit 180, the body CPU 190, the timekeeping unit 220, the A/D conversion unit 240, and the reduction processing unit 250, and transfers the data and other signals output from each of these components.

The timekeeping unit 220 keeps track of the date and time and outputs date and time information indicating the measured date and time.

The microphone 230 picks up ambient sound and outputs the corresponding microphone sound signal to the A/D conversion unit 240. The microphone sound signal picked up by the microphone 230 mainly contains the target sound, which is the sound intended to be recorded, and the operation sound (noise) produced by operating parts.

The microphone sound signal acquired by the microphone 230 is now described with reference to FIGS. 2 and 3, taking as an example the microphone sound signal obtained while the AF lens 112 is operating.
FIG. 2(A) shows an example of the relationship between the output of the AF encoder 117 and time. FIG. 2(B) shows an example of the relationship between the microphone sound signal and time. The time axes of FIGS. 2(A) and 2(B) represent the same times. For convenience of explanation, FIG. 2(B) shows only the operation sound component of the microphone sound signal and omits the target sound component. The operation pattern of the AF lens 112 shown in FIGS. 2(A) and 2(B) is, for example, the operation pattern used when performing AF processing that focuses at a distance P.

In FIG. 2(A), the vertical axis indicates the rotation direction (CW, CCW) of the driving mechanism that drives the AF lens 112, based on the output of the AF encoder 117.
In the operation pattern for AF processing that focuses at the distance P, as shown in FIG. 2(A), the driving mechanism that drives the AF lens 112 rotates clockwise (CW) from time t10 to time t20 and then stops.
That is, time t10 represents the operation start timing of the AF lens 112, and time t20 represents its operation stop timing. In the present embodiment, the operation start timing t10 is the timing (time) at which the lens CPU 120 outputs to the lens driving unit 116 the command for controlling the position of the AF lens 112. The operation stop timing t20 is the timing at which the AF encoder 117 stops outputting the pulse signal.
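The two timing definitions above can be sketched as follows. This is a minimal illustration, not the device's implementation: it assumes the command time and the encoder pulse times are available as timestamps on a common clock, and it takes the last pulse observed after the command as the moment the pulse output stopped. The function name is hypothetical.

```python
def operation_timings(command_time, pulse_times):
    """Return (start, stop) for one lens movement.

    start: time at which the lens CPU issued the drive command (t10 in the text).
    stop:  time of the last encoder pulse at or after the command (t20 in the text);
           falls back to the command time if no pulse was seen (an assumption).
    """
    later = [t for t in pulse_times if t >= command_time]
    stop = max(later) if later else command_time
    return command_time, stop


# Command issued at 1.0 s; pulses observed until 1.5 s.
t10, t20 = operation_timings(1.0, [1.1, 1.2, 1.3, 1.5])
```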

Therefore, as shown in FIG. 2(B), during the period from time t10 to time t20 the operation sound of the AF lens 112 is superimposed on the target sound in the microphone sound signal, or is highly likely to be superimposed on it. The following description takes as an example the case where noise, that is, operation sound from the AF lens 112, is generated during the period from time t10 to time t20.
In addition, as shown in FIG. 2(B), impact sounds are highly likely to be generated at times t10 and t20. The following description takes as an example the case where the AF lens 112 generates impact sounds at times t10 and t20.

When an impact sound occurs, the length of time (period) during which it is likely to be present is predetermined for each operation pattern. For the operation pattern for AF processing that focuses at the distance P, the time lengths L1 and L2 during which impact sounds occur are determined as shown in FIG. 3.
FIG. 3 shows an example of the microphone sound signal picked up by the microphone 230 when the AF lens 112 is driven with the operation pattern for AF processing that focuses at the distance P. In the graph of FIG. 3, the vertical axis represents the amplitude of the microphone sound signal picked up by the microphone 230, and the horizontal axis represents time. For convenience of explanation, FIG. 3 shows only the operation sound component of the microphone sound signal and omits the target sound component. Times t10 and t20 in FIG. 3 are the same as times t10 and t20 in FIGS. 2(A) and 2(B).

In the operation pattern for AF processing that focuses at the distance P, the period of length L1 starting at the operation start timing and the period of length L2 starting at the operation stop timing are each predetermined as periods in which an impact sound occurs. Thus, in the present embodiment, the period of length L1 from time t10 (t10 to t11) and the period of length L2 from time t20 (t20 to t21) are the periods in which impact sounds occur. The period of length L1 from the operation start timing is called the operation start timing period, and the period of length L2 from the operation stop timing is called the operation stop timing period.
A period in which the operating part is highly likely not to be operating is defined as a non-operation period Ta. A period in which the operation of the operating part is highly likely to generate an impact sound is defined as an impact sound generation period Tb. A period in which the operation of the operating part is highly likely to generate a drive sound is defined as a drive sound generation period Tc. In the present embodiment, the period from time t0 to time t10 and the period from time t21 onward are non-operation periods Ta. The periods from time t10 to t11 and from time t20 to t21 are impact sound generation periods Tb. The period from time t11 to t20 is the drive sound generation period Tc.
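The division of the time axis into Ta, Tb, and Tc can be expressed as a small lookup. The sketch below is only illustrative: the function name is hypothetical, and treating each period as half-open (start included, end excluded) is an assumption, since the text does not say to which period a boundary instant belongs.

```python
def classify_period(t, t10, t20, l1, l2):
    """Classify time t as 'Ta' (non-operation), 'Tb' (impact sound) or 'Tc' (drive sound).

    t10, t20: operation start and stop timings.
    l1, l2:   predetermined impact sound lengths after start and after stop.
    """
    t11 = t10 + l1  # end of the operation start timing period
    t21 = t20 + l2  # end of the operation stop timing period
    if t10 <= t < t11 or t20 <= t < t21:
        return "Tb"
    if t11 <= t < t20:
        return "Tc"
    return "Ta"


# Example values (made up for illustration): start at 1.0 s, stop at 2.0 s, 0.1 s impacts.
labels = [classify_period(t, 1.0, 2.0, 0.1, 0.1) for t in (0.5, 1.05, 1.5, 2.05, 2.5)]
```

A frame-based noise reducer could use such a label to decide, per frame, whether to apply impact sound processing, drive sound processing, or none.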

図１に戻って、撮像装置１００の各構成の説明を続ける。
レンズＣＰＵ１２０は、設定された撮像条件（例えば絞り値、露出値等）に応じた動作パターンに従って撮像部１１０を制御する。このレンズＣＰＵ１２０は、ズームエンコーダ１１５から出力されるズームポジションおよびＡＦエンコーダ１１７から出力されるフォーカスポジションに基づき、レンズ駆動部１１６を駆動するコマンドを生成して、レンズ駆動部１１６に出力する。その生成アルゴリズムは、必要に応じて既存のアルゴリズムを適宜用いてもよい。 Returning to FIG. 1, the description of each configuration of the imaging apparatus 100 is continued.
The lens CPU 120 controls the imaging unit 110 in accordance with an operation pattern corresponding to the set imaging conditions (for example, aperture value, exposure value, etc.). The lens CPU 120 generates a command for driving the lens driving unit 116 based on the zoom position output from the zoom encoder 115 and the focus position output from the AF encoder 117, and outputs the command to the lens driving unit 116. An existing algorithm may be used as appropriate for generating the command.

ボディＣＰＵ１９０は、撮像装置１００を統括的に制御する。このボディＣＰＵ１９０は、動作タイミング検出部１９１を備える。
動作タイミング検出部１９１は、撮像装置１００が備えている動作部の動作状態が変化するタイミングを検出する。この動作状態が変化するタイミングとしては、例えば、動作部が動作を開始する動作開始タイミングと、動作部の動作が停止する動作停止タイミングとがある。
ここでいう動作部とは、例えば、上述した光学系１１１、あるいは、操作部１８０のことであり、撮像装置１００が備えている構成のうち、動作することにより、または、動作されることにより、動作音を生じる（または、動作音を生じる可能性がある）構成である。
言い換えると、動作部とは、撮像装置１００が備えている構成のうち、動作部が動作することにより生じた動作音、または、動作部が動作されることにより生じた動作音が、マイク２３０により収音される（または、収音される可能性のある）構成である。 The body CPU 190 comprehensively controls the imaging device 100. The body CPU 190 includes an operation timing detection unit 191.
The operation timing detection unit 191 detects timing at which the operation state of the operation unit included in the imaging apparatus 100 changes. The timing at which the operation state changes includes, for example, an operation start timing at which the operation unit starts operation and an operation stop timing at which the operation of the operation unit stops.
The operation unit referred to here is, for example, the optical system 111 or the operation unit 180 described above. It is a component of the imaging apparatus 100 that generates (or may generate) an operation sound when it operates or is operated.
In other words, an operation unit is a component of the imaging apparatus 100 whose operation sound, generated when it operates or is operated, is picked up (or may be picked up) by the microphone 230.

例えば、この動作タイミング検出部１９１は、動作部を動作させるコマンドに基づいて、動作部の動作状態が変化するタイミングを検出してもよい。このコマンドとは、動作部を動作させる駆動部に対して、動作部を動作させるようにする駆動制御信号、または、この駆動部を駆動させる駆動制御信号である。 For example, the operation timing detection unit 191 may detect the timing at which the operation state of the operation unit changes based on a command for operating the operation unit. This command is a drive control signal that instructs the drive unit, which operates the operation unit, to operate the operation unit, or a drive control signal that drives the drive unit itself.

例えば、動作タイミング検出部１９１は、ズームレンズ１１４、ＶＲレンズ１１３、または、ＡＦレンズ１１２を駆動させるため、レンズ駆動部１１６または手ブレ補正部１１８に入力されるコマンドに基づいて、ズームレンズ１１４、ＶＲレンズ１１３、または、ＡＦレンズ１１２の動作が開始された動作開始タイミングを検出する。この場合、動作タイミング検出部１９１は、レンズＣＰＵ１２０がコマンドを生成する場合に、レンズＣＰＵ１２０内部で実行される処理やコマンドに基づいて、動作開始タイミングを検出してもよい。
また、動作タイミング検出部１９１は、操作部１８０から入力されるズームレンズ１１４、または、ＡＦレンズ１１２を駆動させることを示す操作信号に基づいて、動作開始タイミングを検出してもよい。 For example, the operation timing detection unit 191 detects the operation start timing at which the zoom lens 114, the VR lens 113, or the AF lens 112 started operating, based on a command input to the lens driving unit 116 or the camera shake correction unit 118 in order to drive that lens. In this case, when the lens CPU 120 generates the command, the operation timing detection unit 191 may detect the operation start timing based on the processing executed inside the lens CPU 120 or on the command.
Further, the operation timing detection unit 191 may detect the operation start timing based on an operation signal, input from the operation unit 180, indicating that the zoom lens 114 or the AF lens 112 is to be driven.

また、動作タイミング検出部１９１は、動作部が動作したことを示す信号に基づいて、動作部の動作状態が変化するタイミングを検出してもよい。
例えば、動作タイミング検出部１９１は、ズームエンコーダ１１５またはＡＦエンコーダ１１７の出力に基づいて、ズームレンズ１１４またはＡＦレンズ１１２が駆動されたことを検出することにより、ズームレンズ１１４またはＡＦレンズ１１２の動作開始タイミングを検出してもよい。また、動作タイミング検出部１９１は、ズームエンコーダ１１５またはＡＦエンコーダ１１７の出力に基づいて、ズームレンズ１１４またはＡＦレンズ１１２が停止されたことを検出することにより、ズームレンズ１１４またはＡＦレンズ１１２の動作停止タイミングを検出してもよい。
また、動作タイミング検出部１９１は、手ブレ補正部１１８からの出力に基づいて、ＶＲレンズ１１３が駆動されたことを検出することにより、ＶＲレンズ１１３の動作開始タイミングを検出してもよい。この動作タイミング検出部１９１は、手ブレ補正部１１８からの出力に基づいて、ＶＲレンズ１１３が停止されたことを検出することにより、ＶＲレンズ１１３の動作停止タイミングを検出してもよい。
さらに、動作タイミング検出部１９１は、操作部１８０からの入力に基づいて、操作部１８０が操作されたことを検出することにより、動作部が動作するタイミングを検出してもよい。 Further, the operation timing detection unit 191 may detect the timing at which the operation state of the operation unit changes based on a signal indicating that the operation unit has operated.
For example, the operation timing detection unit 191 may detect the operation start timing of the zoom lens 114 or the AF lens 112 by detecting, based on the output of the zoom encoder 115 or the AF encoder 117, that the zoom lens 114 or the AF lens 112 has been driven. Likewise, the operation timing detection unit 191 may detect the operation stop timing of the zoom lens 114 or the AF lens 112 by detecting, based on the output of the zoom encoder 115 or the AF encoder 117, that the zoom lens 114 or the AF lens 112 has stopped.
Further, the operation timing detection unit 191 may detect the operation start timing of the VR lens 113 by detecting that the VR lens 113 is driven based on the output from the camera shake correction unit 118. The operation timing detection unit 191 may detect the operation stop timing of the VR lens 113 by detecting that the VR lens 113 is stopped based on the output from the camera shake correction unit 118.
Furthermore, the operation timing detection unit 191 may detect the timing at which the operation unit operates by detecting that the operation unit 180 has been operated based on an input from the operation unit 180.

動作タイミング検出部１９１は、撮像装置１００が備えている動作部の動作開始タイミングを検出し、検出した動作開始タイミングを示す動作タイミング信号を、低減処理部２５０に出力する。また、動作タイミング検出部１９１は、撮像装置１００が備えている動作部の動作停止タイミングを検出し、この検出した動作停止タイミングを示す動作タイミング信号を、低減処理部２５０に出力する。
本実施形態において、動作タイミング検出部１９１は、レンズＣＰＵ１２０から入力されるコマンドに基づき、ＡＦレンズ１１２を動かすコマンドがレンズＣＰＵ１２０からレンズ駆動部１１６に出力されるタイミングを、ＡＦレンズ１１２の動作開始タイミングと判定する。また、動作タイミング検出部１９１は、衝撃音の発生時間長Ｌ１を参照し、例えば、図３を用いた例で示す衝撃音が発生している時刻ｔ１０〜ｔ１１を示す情報を、動作開始タイミング期間を示す信号（動作タイミング信号）として出力する。 The operation timing detection unit 191 detects the operation start timing of the operation unit included in the imaging apparatus 100 and outputs an operation timing signal indicating the detected operation start timing to the reduction processing unit 250. Further, the operation timing detection unit 191 detects the operation stop timing of the operation unit provided in the imaging apparatus 100 and outputs an operation timing signal indicating the detected operation stop timing to the reduction processing unit 250.
In the present embodiment, based on the command input from the lens CPU 120, the operation timing detection unit 191 determines the timing at which a command for moving the AF lens 112 is output from the lens CPU 120 to the lens driving unit 116 to be the operation start timing of the AF lens 112. The operation timing detection unit 191 also refers to the impact sound generation time length L1 and outputs, as a signal indicating the operation start timing period (an operation timing signal), information indicating, for example, the times t10 to t11 at which the impact sound is generated in the example of FIG. 3.

また、動作タイミング検出部１９１は、ＡＦエンコーダ１１７から入力されるパルス信号に基づき、このパルス信号の出力が停止した時を、ＡＦレンズ１１２の動作が停止した動作停止タイミングと判定する。また、動作タイミング検出部１９１は、衝撃音の発生時間長Ｌ２を参照して、例えば、図３を用いた例で示す衝撃音が発生している時刻ｔ２０〜ｔ２１を示す情報を、動作停止タイミング期間を示す信号（動作タイミング信号）として出力する。 Based on the pulse signal input from the AF encoder 117, the operation timing detection unit 191 determines the time at which the output of this pulse signal stopped to be the operation stop timing at which the AF lens 112 stopped operating. The operation timing detection unit 191 also refers to the impact sound generation time length L2 and outputs, as a signal indicating the operation stop timing period (an operation timing signal), information indicating, for example, the times t20 to t21 at which the impact sound is generated in the example of FIG. 3.
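As a rough illustration of the stop detection described above, the following sketch treats the encoder output as a list of pulse timestamps and reports the last pulse as the stop timing once no further pulse has arrived for a while. The `quiet_gap` threshold and both function names are hypothetical; the description only says that the stop timing is the time at which the pulse output stopped, not how that condition is tested.

```python
def stop_timing_from_pulses(pulse_times, now, quiet_gap):
    """Return the time of the last encoder pulse if pulses have stopped.

    pulse_times: ascending timestamps of AF encoder pulses.
    quiet_gap:   hypothetical silence length after which the pulse
                 output is considered to have stopped (an assumption).
    Returns None while pulses are still considered to be arriving.
    """
    if not pulse_times:
        return None
    last = pulse_times[-1]
    return last if now - last >= quiet_gap else None


def timing_period(timing, length):
    """Build the (start, end) period reported as an operation timing signal."""
    return (timing, timing + length)


t20 = stop_timing_from_pulses([100, 110, 120], now=150, quiet_gap=20)
if t20 is not None:
    # Operation stop timing period t20 to t21, using the length L2.
    print(timing_period(t20, length=30))
```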

Ａ/Ｄ変換部２４０は、マイク２３０から入力されたアナログ信号であるマイク音信号をデジタル信号であるマイク音信号に変換する。このＡ/Ｄ変換部２４０は、デジタル信号であるマイク音信号を、低減処理部２５０に出力する。また、Ａ/Ｄ変換部２４０は、デジタル信号であるマイク音信号を、バッファメモリ部１３０あるいは記憶媒体２００に記憶させる構成であってもよい。この場合、Ａ/Ｄ変換部２４０は、計時部２２０によって計時された日時情報に基づき、マイク音信号が取得された時刻を示す情報を、マイク音信号に関連付けて、バッファメモリ部１３０あるいは記憶媒体２００に記憶させる。 The A/D conversion unit 240 converts the analog microphone sound signal input from the microphone 230 into a digital microphone sound signal, and outputs the digital microphone sound signal to the reduction processing unit 250. The A/D conversion unit 240 may also be configured to store the digital microphone sound signal in the buffer memory unit 130 or the storage medium 200. In this case, based on the date and time information measured by the time measuring unit 220, the A/D conversion unit 240 stores information indicating the time at which the microphone sound signal was acquired in the buffer memory unit 130 or the storage medium 200 in association with the microphone sound signal.

低減処理部２５０は、Ａ／Ｄ変換部２４０によりデジタル信号に変換されたマイク音信号に対して、例えばＡＦレンズ１１２、ＶＲレンズ１１３、ズームレンズ１１４等の動作部による動作音であるノイズを低減するなどのノイズ低減処理を実行し、このノイズ低減処理した音情報を記憶媒体２００に記憶させる。 The reduction processing unit 250 performs noise reduction processing on the microphone sound signal converted into a digital signal by the A/D conversion unit 240, for example processing that reduces noise consisting of the operation sound of operation units such as the AF lens 112, the VR lens 113, and the zoom lens 114, and stores the resulting noise-reduced sound information in the storage medium 200.

次に、図４を参照して、低減処理部２５０について詳細に説明する。図４は、本実施形態に係る低減処理部２５０の機能構成の一例を示すブロック図である。
低減処理部２５０は、音信号切り出し部２５１と、ハミング窓処理部２５２と、フーリエ変換部２５３と、ノイズ低減処理部２５４と、逆フーリエ変換部２５５と、連結調整部２５６と、信号重ね合わせ部２５７とを含む。 Next, the reduction processing unit 250 will be described in detail with reference to FIG. FIG. 4 is a block diagram illustrating an example of a functional configuration of the reduction processing unit 250 according to the present embodiment.
The reduction processing unit 250 includes a sound signal cutout unit 251, a Hamming window processing unit 252, a Fourier transform unit 253, a noise reduction processing unit 254, an inverse Fourier transform unit 255, a connection adjustment unit 256, and a signal superposition unit 257.

音信号切り出し部２５１は、Ａ/Ｄ変換部２４０から出力されたマイク音信号を、予め決められた時間長のフレームで区切って、フレーム単位の音時間信号を切り出す。ここで、音信号切り出し部２５１によって切り出されたフレーム単位の音時間信号には、説明便宜のため、奇数番号（Ｓ１０１，Ｓ１０３，Ｓ１０５・・・）を付す。また、音信号切り出し部２５１は、この奇数番号を割り当てたフレーム単位の音時間信号Ｓ１０１，Ｓ１０３，Ｓ１０５・・・と半分ずつオーバーラップするように、マイク音信号からフレーム単位の音時間信号を切り出す。ここで、音時間信号Ｓ１０１，Ｓ１０３，Ｓ１０５・・・と半分ずつオーバーラップするように音信号切り出し部２５１によって切り出されたフレーム単位の音時間信号には、説明便宜のため、偶数番号（Ｓ１０２，Ｓ１０４，Ｓ１０６・・・）を付す。
また、音時間信号Ｓ１０１，Ｓ１０２，Ｓ１０３，Ｓ１０４，Ｓ１０５，Ｓ１０６・・・に対応するフレームを、それぞれ、フレームＦ１０１，Ｆ１０２，Ｆ１０３，Ｆ１０４，Ｆ１０５，Ｆ１０６・・・と記す。 The sound signal cutout unit 251 divides the microphone sound signal output from the A/D conversion unit 240 into frames of a predetermined time length and cuts out a sound time signal for each frame. For convenience of explanation, the frame-unit sound time signals cut out in this way are given odd numbers (S101, S103, S105, ...). The sound signal cutout unit 251 also cuts out frame-unit sound time signals from the microphone sound signal so that each overlaps the odd-numbered sound time signals S101, S103, S105, ... by one half. For convenience of explanation, the frame-unit sound time signals cut out so as to half-overlap the sound time signals S101, S103, S105, ... are given even numbers (S102, S104, S106, ...).
The frames corresponding to the sound time signals S101, S102, S103, S104, S105, S106, ... are denoted as frames F101, F102, F103, F104, F105, F106, ..., respectively.

なお、音時間信号Ｓ１０１，Ｓ１０３，Ｓ１０５・・・と、音時間信号Ｓ１０２，Ｓ１０４，Ｓ１０６・・・とは、全て同じ長さのフレームである。また、図３に示したように、フレームＦ１０１，Ｆ１０２，Ｆ１０３，Ｆ１０４，Ｆ１０５，Ｆ１０６・・・は、２分の１ずつ近くのフレームと重複している。
本実施形態において、音信号切り出し部２５１は、サンプリング周波数４８ｋＨｚでフレーム長が１０２４点となるように、音時間信号Ｓ１０１，Ｓ１０２，Ｓ１０３，Ｓ１０４，Ｓ１０５，Ｓ１０６・・・を切り出す。よって、音時間信号Ｓ１０２は、音時間信号Ｓ１０１の後半の５１２点、および、音時間信号Ｓ１０３の前半の５１２点とそれぞれ共通する情報を含む。 The sound time signals S101, S103, S105, ... and the sound time signals S102, S104, S106, ... are all frames of the same length. Further, as shown in FIG. 3, the frames F101, F102, F103, F104, F105, F106, ... each overlap their neighboring frames by one half.
In this embodiment, the sound signal cutout unit 251 cuts out the sound time signals S101, S102, S103, S104, S105, S106... So that the sampling frequency is 48 kHz and the frame length is 1024 points. Therefore, the sound time signal S102 includes information common to the second half 512 points of the sound time signal S101 and the first half 512 points of the sound time signal S103.
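The half-overlapping cutout can be sketched as follows. With a 1024-point frame at 48 kHz, one frame spans about 21.3 ms and successive frames start 512 points (about 10.7 ms) apart. The function name and the choice to drop a trailing partial frame are assumptions made for illustration.

```python
FRAME_LEN = 1024  # points per frame, from the embodiment


def cut_frames(samples, frame_len=FRAME_LEN):
    """Cut half-overlapping frames (S101, S102, S103, ...) from samples.

    Successive frames start frame_len // 2 points apart, so each frame
    shares its first half with the previous frame and its second half
    with the next one. A trailing partial frame is dropped (assumption).
    """
    hop = frame_len // 2
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


samples = list(range(2048))           # stand-in for the microphone sound signal
s101, s102, s103 = cut_frames(samples)
print(s102[:512] == s101[512:])       # first half of S102 is the second half of S101
print(s102[512:] == s103[:512])       # second half of S102 is the first half of S103
```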

ハミング窓処理部２５２は、音信号切り出し部２５１によって切り出された音時間信号Ｓ１０１，Ｓ１０２，Ｓ１０３，Ｓ１０４，Ｓ１０５，Ｓ１０６・・・のそれぞれにハミング窓関数Ｗ_１を乗算する。なお、ハミング窓処理部２５２によってハミング窓関数Ｗ_１で重み付けされた音時間信号を、以下、Ｓ２０１，Ｓ２０２，Ｓ２０３，Ｓ２０４，Ｓ２０５，Ｓ２０６・・・と記す。
また、ハミング窓関数Ｗ_１は、以下の式（１）で示される。なお、式（１）では、窓の範囲を０〜Ｔとし、変数を時刻ｔ（０≦ｔ≦Ｔ）で示す。 The Hamming window processing unit 252 multiplies each of the sound time signals S101, S102, S103, S104, S105, S106, ... cut out by the sound signal cutout unit 251 by the Hamming window function W_1. The sound time signals weighted with the Hamming window function W_1 by the Hamming window processing unit 252 are hereinafter referred to as S201, S202, S203, S204, S205, S206, ....
The Hamming window function W_1 is expressed by the following equation (1). In equation (1), the window range is 0 to T, and the variable is the time t (0 ≦ t ≦ T).
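Equation (1) itself is not reproduced in this text. As a hedged illustration, the sketch below assumes the conventional Hamming window, W_1(t) = 0.54 − 0.46 cos(2πt/T) for 0 ≦ t ≦ T, sampled at evenly spaced points. The function names and the discrete sampling are assumptions, and the coefficients in the original equation may differ.

```python
import math


def hamming(n_points):
    """Sample the conventional Hamming window at n_points over [0, T].

    Uses w[n] = 0.54 - 0.46 * cos(2*pi*n / (n_points - 1)); this standard
    form is assumed because equation (1) is not available here.
    """
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def apply_window(frame, window):
    """Weight a frame (e.g. S101) sample by sample to obtain e.g. S201."""
    return [x * w for x, w in zip(frame, window)]


window = hamming(1024)
s201 = apply_window([1.0] * 1024, window)
# The weights are smallest at both ends of the frame and largest at the center.
print(min(s201) == s201[0], max(s201) > 0.99)
```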

フーリエ変換部２５３は、ハミング窓処理部２５２によってハミング窓関数Ｗ_１で重み付けされた音時間信号Ｓ２０１，Ｓ２０２，Ｓ２０３，Ｓ２０４，Ｓ２０５，Ｓ２０６・・・を周波数領域で表わされるスペクトルに変換して、この周波数領域で表わされるスペクトル（周波数スペクトル）をノイズ低減処理部２５４に出力する。
このフーリエ変換部２５３は、例えば、音時間信号にフーリエ変換、あるいは高速フーリエ変換（ＦＦＴ：Fast Fourier Transform）を行うことで、音時間信号を周波数領域に変換する。本実施形態において、フーリエ変換部２５３は、例えば、音時間信号Ｓ２０１，Ｓ２０２，Ｓ２０３，Ｓ２０４，Ｓ２０５，Ｓ２０６・・・にフーリエ変換を行うことで、窓関数の各規定期間に対応する周波数スペクトルを算出する。なお、フーリエ変換部２５３が音時間信号を周波数スペクトルに変換する手段は、フーリエ変換に限られない。
なお、フーリエ変換部２５３によって周波数領域に変換された周波数スペクトルを、以下、Ｓ３０１，Ｓ３０２，Ｓ３０３，Ｓ３０４，Ｓ３０５，Ｓ３０６・・・と記す。つまり、フーリエ変換部２５３は、入力する音時間信号Ｓ２０１，Ｓ２０２，Ｓ２０３，Ｓ２０４，Ｓ２０５，Ｓ２０６・・・を、周波数スペクトルＳ３０１，Ｓ３０２，Ｓ３０３，Ｓ３０４，Ｓ３０５，Ｓ３０６・・・に変換して、ノイズ低減処理部２５４に出力する。 The Fourier transform unit 253 converts the sound time signals S201, S202, S203, S204, S205, S206, ..., weighted with the Hamming window function W_1 by the Hamming window processing unit 252, into spectra represented in the frequency domain, and outputs these spectra (frequency spectra) to the noise reduction processing unit 254.
The Fourier transform unit 253 converts the sound time signal into the frequency domain by, for example, applying a Fourier transform or a fast Fourier transform (FFT) to the sound time signal. In the present embodiment, the Fourier transform unit 253 applies, for example, a Fourier transform to the sound time signals S201, S202, S203, S204, S205, S206, ... and thereby calculates the frequency spectrum corresponding to each specified period of the window function. The means by which the Fourier transform unit 253 converts the sound time signal into a frequency spectrum is not limited to the Fourier transform.
The frequency spectra converted into the frequency domain by the Fourier transform unit 253 are hereinafter referred to as S301, S302, S303, S304, S305, S306, .... That is, the Fourier transform unit 253 converts the input sound time signals S201, S202, S203, S204, S205, S206, ... into the frequency spectra S301, S302, S303, S304, S305, S306, ... and outputs them to the noise reduction processing unit 254.
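The conversion of one windowed frame into a frequency spectrum can be sketched with a direct discrete Fourier transform. This naive O(N²) form is used only to keep the example self-contained; an FFT yields the same values much faster. For a 1024-point frame at 48 kHz, neighboring frequency bins are 48000 / 1024 = 46.875 Hz apart. The function name is an assumption.

```python
import cmath
import math


def dft(frame):
    """Direct discrete Fourier transform of one frame (e.g. S201 -> S301)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


# A short test frame: one cosine cycle across 8 points.
frame = [math.cos(2 * math.pi * t / 8) for t in range(8)]
spectrum = dft(frame)
# The energy appears in bin 1 and its mirror image, bin 7.
print([round(abs(c), 6) for c in spectrum])
```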

図３に示した通り、距離Ｐでピントを合わせるＡＦ処理を行う場合の動作パターンでは、動作開始タイミング期間と動作停止タイミング期間において衝撃音が発生する。
上述の通り、時刻ｔ０〜ｔ１０の期間、時刻ｔ２１〜の期間が、非動作期間Ｔａ（動作部が動作しない可能性の高い期間）である。時刻ｔ１０〜ｔ１１の期間（動作開始タイミング期間）、時刻ｔ２０〜ｔ２１の期間（動作停止タイミング期間）が、衝撃音発生期間Ｔｂ（動作部の動作により衝撃音が発生する可能性の高い期間）である。時刻ｔ１１〜ｔ２０の期間が、駆動音発生期間Ｔｃ（動作部の動作により駆動音が発生する可能性の高い期間）である。
つまり、フレームＦ１０１、Ｆ１１３、Ｆ１１４に対応する周波数スペクトルＳ３０１、Ｓ３１３、Ｓ３１４は、非動作期間Ｔａに取得されたマイク音信号の周波数スペクトルである。フレームＦ１０５〜Ｆ１０８に対応する周波数スペクトルＳ３０５〜Ｓ３０８は、動作部が動作する可能性の高い期間のうち、駆動音発生期間Ｔｃに取得されたマイク音信号の周波数スペクトルである。
また、フレームＦ１０２〜Ｆ１０４に対応する周波数スペクトルＳ３０２〜Ｓ３０４と、フレームＦ１０９〜Ｆ１１２に対応する周波数スペクトルＳ３０９〜Ｓ３１２は、動作部が動作する可能性の高い期間のうち、衝撃音発生期間Ｔｂに取得されたマイク音信号の周波数スペクトルである。 As shown in FIG. 3, in the operation pattern when performing AF processing for focusing at a distance P, an impact sound is generated in the operation start timing period and the operation stop timing period.
As described above, the period from time t0 to t10 and the period from time t21 onward are the non-operation period Ta (a period during which the operation unit is likely not to be operating). The period from time t10 to t11 (the operation start timing period) and the period from time t20 to t21 (the operation stop timing period) are the impact sound generation period Tb (a period during which an impact sound is likely to be generated by the operation of the operation unit). The period from time t11 to t20 is the drive sound generation period Tc (a period during which a drive sound is likely to be generated by the operation of the operation unit).
That is, the frequency spectra S301, S313, and S314 corresponding to the frames F101, F113, and F114 are frequency spectra of the microphone sound signal acquired during the non-operation period Ta. The frequency spectra S305 to S308 corresponding to the frames F105 to F108 are frequency spectra of the microphone sound signal acquired during the drive sound generation period Tc, which is part of the period in which the operation unit is likely to be operating.
The frequency spectra S302 to S304 corresponding to the frames F102 to F104 and the frequency spectra S309 to S312 corresponding to the frames F109 to F112 are frequency spectra of the microphone sound signal acquired during the impact sound generation period Tb, which is also part of the period in which the operation unit is likely to be operating.
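The assignment of frames to periods can be illustrated by labeling each half-overlapping frame according to the period it touches. The rule used here is an assumption: a frame is labeled Tb if it overlaps either impact sound period, Tc if it otherwise overlaps the drive sound period, and Ta if neither. The sample positions chosen for t10, t11, t20, and t21 are hypothetical values picked so that the result matches the frame assignment described for FIG. 3.

```python
def overlaps(start, end, a, b):
    """True if the half-open ranges [start, end) and [a, b) intersect."""
    return start < b and a < end


def label_frames(n_frames, frame_len, t10, t11, t20, t21):
    """Label half-overlapping frames as 'Ta', 'Tb' or 'Tc' (times in samples)."""
    hop = frame_len // 2
    labels = []
    for k in range(n_frames):
        start, end = k * hop, k * hop + frame_len
        if overlaps(start, end, t10, t11) or overlaps(start, end, t20, t21):
            labels.append("Tb")   # frame touches an impact sound period
        elif overlaps(start, end, t11, t20):
            labels.append("Tc")   # frame lies in the drive sound period
        else:
            labels.append("Ta")   # non-operation period
    return labels


labels = label_frames(14, 1024, t10=1024, t11=2048, t20=4608, t21=6144)
for k, label in enumerate(labels):
    print("F%d" % (101 + k), label)
```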

ノイズ低減処理部２５４は、衝撃音ノイズ低減処理部２５４１と、駆動音ノイズ低減処理部２５４２とを含む。このノイズ低減処理部２５４は、フーリエ変換部２５３によって周波数領域に変換された周波数スペクトルＳ３０１，Ｓ３０２，Ｓ３０３，Ｓ３０４，Ｓ３０５，Ｓ３０６・・・に対して信号処理を施す処理部である。本実施形態において、ノイズ低減処理部２５４は、動作部の動作により生じる駆動音および衝撃音を低減させるノイズ低減処理を実行する処理部である。 The noise reduction processing unit 254 includes an impact sound noise reduction processing unit 2541 and a driving sound noise reduction processing unit 2542. The noise reduction processing unit 254 is a processing unit that performs signal processing on the frequency spectrums S301, S302, S303, S304, S305, S306,... Converted into the frequency domain by the Fourier transform unit 253. In the present embodiment, the noise reduction processing unit 254 is a processing unit that executes a noise reduction process that reduces drive sound and impact sound generated by the operation of the operation unit.

ノイズ低減処理部２５４は、動作タイミング検出部１９１から入力する動作タイミング信号に基づき、動作部が動作する可能性の高い期間に取得される音時間信号の周波数スペクトルを取得する。このノイズ低減処理部２５４は、例えば、動作タイミング検出部１９１から入力する動作タイミング信号に基づき、フーリエ変換部２５３から出力される衝撃音期間Ｔｂおよび駆動音期間Ｔｃに対応する全ての周波数スペクトルＳ３０２〜Ｓ３１２を取得する。
本実施形態において、このノイズ低減処理部２５４は、動作タイミング信号に基づき、例えば、動作音が発生している可能性が高い期間から、衝撃音と駆動音の両方が発生している可能性の高い期間（衝撃音期間Ｔｂ）と、駆動音のみが発生している可能性の高い期間（駆動音期間Ｔｃ）とを、それぞれ区別して、周波数スペクトルを取得することが好ましい。詳細については後述するが、ノイズ低減処理部２５４は、衝撃音ノイズ低減処理と、駆動音ノイズ低減処理の両方を行うからである。
具体的に説明すると、ノイズ低減処理部２５４は、動作タイミング検出部１９１から入力する動作タイミング信号に基づき、例えば、フーリエ変換部２５３から出力される周波数スペクトルＳ３０１，Ｓ３０２，Ｓ３０３，Ｓ３０４，Ｓ３０５，Ｓ３０６・・・から、駆動音のみが発生している可能性の高い期間（駆動音期間Ｔｃ）に対応する周波数スペクトルＳ３０５〜Ｓ３０８と、衝撃音と駆動音の両方が発生している可能性の高い期間（衝撃音期間Ｔｂ）に対応する周波数スペクトルＳ３０２〜Ｓ３０４、Ｓ３０９〜Ｓ３１２を取得する。 Based on the operation timing signal input from the operation timing detection unit 191, the noise reduction processing unit 254 acquires the frequency spectra of the sound time signals acquired during the period in which the operation unit is likely to be operating. For example, based on the operation timing signal input from the operation timing detection unit 191, the noise reduction processing unit 254 acquires all of the frequency spectra S302 to S312 output from the Fourier transform unit 253 that correspond to the impact sound period Tb and the drive sound period Tc.
In the present embodiment, it is preferable that the noise reduction processing unit 254 acquires the frequency spectra while distinguishing, within the period in which an operation sound is likely to be occurring and based on the operation timing signal, between the period in which both an impact sound and a drive sound are likely to be occurring (impact sound period Tb) and the period in which only a drive sound is likely to be occurring (drive sound period Tc). This is because, as described in detail later, the noise reduction processing unit 254 performs both the impact sound noise reduction processing and the drive sound noise reduction processing.
Specifically, based on the operation timing signal input from the operation timing detection unit 191, the noise reduction processing unit 254 acquires, for example, from the frequency spectra S301, S302, S303, S304, S305, S306, ... output from the Fourier transform unit 253, the frequency spectra S305 to S308 corresponding to the period in which only a drive sound is likely to be occurring (drive sound period Tc) and the frequency spectra S302 to S304 and S309 to S312 corresponding to the period in which both an impact sound and a drive sound are likely to be occurring (impact sound period Tb).

つまり、ノイズ低減処理部２５４は、動作タイミング信号に基づき、衝撃音が発生している可能性の高い期間を動作開始タイミング期間と判定する。そして、ノイズ低減処理部２５４は、この期間に取得されるマイク音信号の周波数スペクトルＳ３０２〜Ｓ３０４を、衝撃音と駆動音の両方を含むマイク音信号の周波数スペクトルとして取得する。
また、ノイズ低減処理部２５４は、動作タイミング信号に基づき、衝撃音が発生している可能性の高い期間を動作停止タイミング期間と判定する。そして、ノイズ低減処理部２５４は、この期間に取得されるマイク音信号の周波数スペクトルＳ３０９〜Ｓ３１２を、衝撃音と駆動音の両方を含むマイク音信号の周波数スペクトルとして取得する。
さらに、ノイズ低減処理部２５４は、動作タイミング信号に基づき、駆動音が発生している可能性の高い期間を、動作開始タイミング期間の終了点から動作停止タイミング期間の開始点までの期間と判定する。そして、ノイズ低減処理部２５４は、この期間に取得されるマイク音信号の周波数スペクトルＳ３０５〜Ｓ３０８を、駆動音を含むマイク音信号の周波数スペクトルとして取得する。 That is, based on the operation timing signal, the noise reduction processing unit 254 determines that the operation start timing period is a period in which an impact sound is likely to be occurring. The noise reduction processing unit 254 then acquires the frequency spectra S302 to S304 of the microphone sound signal acquired in this period as frequency spectra of a microphone sound signal containing both an impact sound and a drive sound.
Likewise, based on the operation timing signal, the noise reduction processing unit 254 determines that the operation stop timing period is a period in which an impact sound is likely to be occurring. The noise reduction processing unit 254 then acquires the frequency spectra S309 to S312 of the microphone sound signal acquired in this period as frequency spectra of a microphone sound signal containing both an impact sound and a drive sound.
Further, based on the operation timing signal, the noise reduction processing unit 254 determines that the period from the end point of the operation start timing period to the start point of the operation stop timing period is a period in which a drive sound is likely to be occurring. The noise reduction processing unit 254 then acquires the frequency spectra S305 to S308 of the microphone sound signal acquired in this period as frequency spectra of a microphone sound signal containing a drive sound.

このノイズ低減処理部２５４は、取得した周波数スペクトルＳ３０２〜Ｓ３１２に対して、動作パターンに応じて予め決められているノイズを低減するノイズ低減処理を行う。
例えば、ノイズ低減処理部２５４の衝撃音ノイズ低減処理部２５４１は、衝撃音と駆動音の両方を含むマイク音信号の周波数スペクトルＳ３０２〜Ｓ３０４、Ｓ３０９〜Ｓ３１２に対して、衝撃音に対応する周波数スペクトルを低減する衝撃音低減処理を実行する。
また、ノイズ低減処理部２５４の駆動音ノイズ低減処理部２５４２は、動作音を含むマイク音信号の周波数スペクトルＳ３０２〜Ｓ３１２に対して、駆動音に対応する周波数スペクトルを低減する駆動音低減処理を実行する。この駆動音ノイズ低減処理部２５４２は、衝撃音低減処理を実行した周波数スペクトルＳ３０２〜Ｓ３０４とＳ３０９〜Ｓ３１２、および駆動音のみを含むマイク音信号の周波数スペクトルＳ３０５〜Ｓ３０８の両方に対して、駆動音低減処理を実行することが好ましい。本実施形態において、駆動音ノイズ低減処理部２５４２は、衝撃音低減処理を実行した周波数スペクトルを含む動作時の全ての周波数スペクトルＳ３０２〜Ｓ３１２に対して駆動音低減処理を実行する例について説明する。 The noise reduction processing unit 254 performs noise reduction processing for reducing noise determined in advance according to the operation pattern for the acquired frequency spectra S302 to S312.
For example, the impact sound noise reduction processing unit 2541 of the noise reduction processing unit 254 executes, on the frequency spectra S302 to S304 and S309 to S312 of the microphone sound signal containing both an impact sound and a drive sound, impact sound reduction processing that reduces the frequency spectrum corresponding to the impact sound.
In addition, the driving sound noise reduction processing unit 2542 of the noise reduction processing unit 254 executes, on the frequency spectra S302 to S312 of the microphone sound signal containing an operation sound, drive sound reduction processing that reduces the frequency spectrum corresponding to the drive sound. It is preferable that the driving sound noise reduction processing unit 2542 executes the drive sound reduction processing on both the frequency spectra S302 to S304 and S309 to S312 on which the impact sound reduction processing has been executed and the frequency spectra S305 to S308 of the microphone sound signal containing only a drive sound. The present embodiment describes an example in which the driving sound noise reduction processing unit 2542 executes the drive sound reduction processing on all frequency spectra S302 to S312 acquired during operation, including those on which the impact sound reduction processing has been executed.
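The order of the two reduction steps can be sketched as a small dispatch loop: spectra from the impact sound period Tb first receive impact sound reduction, and every spectrum acquired during operation (Tb or Tc) then receives drive sound reduction. The function name and the two callbacks are placeholders; the actual reduction algorithms are not modeled here.

```python
def reduce_noise(spectra, labels, reduce_impact, reduce_drive):
    """Apply impact then drive sound reduction according to period labels.

    spectra: per-frame spectra (any representation).
    labels:  'Ta', 'Tb' or 'Tc' for each frame.
    reduce_impact / reduce_drive: placeholder callables standing in for
    the processing of units 2541 and 2542.
    """
    out = []
    for spectrum, label in zip(spectra, labels):
        if label == "Tb":
            spectrum = reduce_impact(spectrum)   # impact sound reduction first
        if label in ("Tb", "Tc"):
            spectrum = reduce_drive(spectrum)    # then drive sound reduction
        out.append(spectrum)                     # Ta spectra pass through unchanged
    return out


# Strings stand in for spectra so the processing order is visible.
result = reduce_noise(
    ["S301", "S302", "S305"],
    ["Ta", "Tb", "Tc"],
    reduce_impact=lambda s: s + "+impact",
    reduce_drive=lambda s: s + "+drive",
)
print(result)
```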

衝撃音ノイズ低減処理部２５４１は、動作タイミング検出部１９１から入力するタイミング信号に基づき、例えば、フーリエ変換部２５３から出力される周波数スペクトルＳ３０１，Ｓ３０２，Ｓ３０３，Ｓ３０４，Ｓ３０５，Ｓ３０６・・・から、衝撃音が発生している可能性の高い期間に対応する周波数スペクトル（以下、衝撃音処理周波数スペクトルＳＳという）を取得する。例えば、衝撃音ノイズ低減処理部２５４１は、動作開始タイミングｔ１０に対応する衝撃音発生期間ｔ１０〜ｔ１１を示す動作タイミング信号に基づき、衝撃音が発生している可能性のある期間に対応する周波数スペクトルＳ３０２〜Ｓ３０４を、衝撃音処理周波数スペクトルＳＳとして取得する。衝撃音ノイズ低減処理部２５４１は、動作停止タイミングｔ２０に対応する衝撃音発生期間ｔ２０〜ｔ２１を示す動作タイミング信号に基づき、衝撃音が発生している可能性のある期間に対応する周波数スペクトルＳ３０９〜Ｓ３１２を、衝撃音処理周波数スペクトルＳＳとして取得する。 Based on the timing signal input from the operation timing detection unit 191, the impact sound noise reduction processing unit 2541 acquires, for example, from the frequency spectra S301, S302, S303, S304, S305, S306, ... output from the Fourier transform unit 253, the frequency spectra corresponding to the period in which an impact sound is likely to be occurring (hereinafter referred to as impact sound processing frequency spectra SS). For example, based on the operation timing signal indicating the impact sound generation period t10 to t11 corresponding to the operation start timing t10, the impact sound noise reduction processing unit 2541 acquires the frequency spectra S302 to S304, which correspond to the period in which an impact sound may be occurring, as impact sound processing frequency spectra SS. Based on the operation timing signal indicating the impact sound generation period t20 to t21 corresponding to the operation stop timing t20, the impact sound noise reduction processing unit 2541 acquires the frequency spectra S309 to S312, which correspond to the period in which an impact sound may be occurring, as impact sound processing frequency spectra SS.

また、衝撃音ノイズ低減処理部２５４１は、動作タイミング検出部１９１から入力するタイミング信号に基づき、フーリエ変換部２５３から出力される周波数スペクトルＳ３０１，Ｓ３０２，Ｓ３０３，Ｓ３０４，Ｓ３０５，Ｓ３０６・・・から、衝撃音が発生していない可能性の高い期間に対応する周波数スペクトル（以下、衝撃音フロアリングスペクトルＦＳという）を取得する。この衝撃音ノイズ低減処理部２５４１は、衝撃音を含んでいる可能性の高い衝撃音処理周波数スペクトルＳＳごとに、この衝撃音を含む可能性の低い衝撃音フロアリングスペクトルＦＳを取得する。本実施形態において、衝撃音ノイズ低減処理部２５４１は、衝撃音処理周波数スペクトルＳＳと時間軸方向において最も近い衝撃音処理周波数スペクトルＳＳ以外の周波数スペクトルを衝撃音フロアリングスペクトルＦＳとして取得する。つまり、衝撃音ノイズ低減処理部２５４１は、衝撃音処理周波数スペクトルＳＳと時間軸方向に隣接あるいは重複する衝撃音処理周波数スペクトルＳＳ以外の周波数スペクトルを衝撃音フロアリングスペクトルＦＳとして取得する。
なお、本実施形態において、衝撃音フロアリングスペクトルＦＳは、衝撃音が発生していない可能性の高い期間に対応する周波数スペクトルである。しかし、本発明はこれに限られず、衝撃音フロアリングスペクトルＦＳは、衝撃音以外の動作音（つまり、駆動音）が発生している可能性の高い期間に対応する周波数スペクトルであってもよい。なお、衝撃音フロアリングスペクトルＦＳは、動作部の動作によって発生するノイズ音が発生しない可能性の高い期間に対応する周波数スペクトルであることが好ましい。 Further, based on the timing signal input from the operation timing detection unit 191, the impact sound noise reduction processing unit 2541 acquires, from the frequency spectra S301, S302, S303, S304, S305, S306, ... output from the Fourier transform unit 253, a frequency spectrum corresponding to a period in which an impact sound is likely not to be occurring (hereinafter referred to as an impact sound flooring spectrum FS). For each impact sound processing frequency spectrum SS, which is likely to contain an impact sound, the impact sound noise reduction processing unit 2541 acquires an impact sound flooring spectrum FS that is unlikely to contain the impact sound. In the present embodiment, the impact sound noise reduction processing unit 2541 acquires, as the impact sound flooring spectrum FS, the frequency spectrum that is not itself an impact sound processing frequency spectrum SS and that is closest in the time axis direction to the impact sound processing frequency spectrum SS in question. That is, the impact sound noise reduction processing unit 2541 acquires, as the impact sound flooring spectrum FS, a frequency spectrum other than an impact sound processing frequency spectrum SS that is adjacent to or overlaps the impact sound processing frequency spectrum SS in the time axis direction.
In the present embodiment, the impact sound flooring spectrum FS is a frequency spectrum corresponding to a period in which an impact sound is likely not to be occurring. However, the present invention is not limited to this, and the impact sound flooring spectrum FS may be a frequency spectrum corresponding to a period in which an operation sound other than the impact sound (that is, a drive sound) is likely to be occurring. It is preferable, however, that the impact sound flooring spectrum FS is a frequency spectrum corresponding to a period in which noise generated by the operation of the operation unit is likely not to be occurring.

ここで、図５を参照して、衝撃音ノイズ低減処理部２５４１が取得する衝撃音処理周波数スペクトルＳＳと衝撃音フロアリングスペクトルＦＳとの関係の一例について説明する。図５は、衝撃音ノイズ低減処理部２５４１が取得する衝撃音処理周波数スペクトルＳＳと衝撃音フロアリングスペクトルＦＳの一例を説明するための図である。
例えば、衝撃音ノイズ低減処理部２５４１は、衝撃音発生期間ｔ１０〜ｔ１１を示す動作タイミング信号に基づき、衝撃音が発生している可能性のある期間に対応する周波数スペクトルＳ３０２〜Ｓ３０４を、衝撃音処理周波数スペクトルＳＳとして取得する。
そして、衝撃音ノイズ低減処理部２５４１は、衝撃音発生期間ｔ１０〜ｔ１１を示す動作タイミング信号に基づき、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３０２、Ｓ３０３に最も近い非動作期間Ｔａに対応する周波数スペクトルＳ３０１を、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３０２、Ｓ３０３に対応する衝撃音フロアリングスペクトルＦＳと判定する。
また、衝撃音ノイズ低減処理部２５４１は、衝撃音発生期間ｔ１０〜ｔ１１を示す動作タイミング信号に基づき、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３０４に最も近い駆動音発生期間Ｔｃに対応する周波数スペクトルＳ３０５を、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３０４に対応する衝撃音フロアリングスペクトルＦＳと判定する。 Here, an example of the relationship between the impact sound processing frequency spectrum SS and the impact sound flooring spectrum FS acquired by the impact sound noise reduction processing unit 2541 will be described with reference to FIG. FIG. 5 is a diagram for explaining an example of the impact sound processing frequency spectrum SS and the impact sound flooring spectrum FS acquired by the impact sound noise reduction processing unit 2541.
For example, based on the operation timing signal indicating the impact sound generation period t10 to t11, the impact sound noise reduction processing unit 2541 acquires the frequency spectra S302 to S304, which correspond to the period in which an impact sound may be occurring, as impact sound processing frequency spectra SS.
Then, based on the operation timing signal indicating the impact sound generation period t10 to t11, the impact sound noise reduction processing unit 2541 determines that the frequency spectrum S301, which corresponds to the non-operation period Ta and is closest to the impact sound processing frequency spectra S302 and S303, is the impact sound flooring spectrum FS for S302 and S303.
Further, based on the operation timing signal indicating the impact sound generation period t10 to t11, the impact sound noise reduction processing unit 2541 determines that the frequency spectrum S305, which corresponds to the drive sound generation period Tc and is closest to the impact sound processing frequency spectrum S304, is the impact sound flooring spectrum FS for S304.

また、衝撃音ノイズ低減処理部２５４１は、衝撃音発生期間ｔ２０〜ｔ２１を示す動作タイミング信号に基づき、衝撃音が発生している可能性のある期間に対応する周波数スペクトルＳ３０９〜Ｓ３１２を、衝撃音処理周波数スペクトルＳＳとして取得する。
そして、衝撃音ノイズ低減処理部２５４１は、衝撃音発生期間ｔ２０〜ｔ２１を示す動作タイミング信号に基づき、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３０９、Ｓ３１０に最も近い駆動音発生期間Ｔｃに対応する周波数スペクトルＳ３０８を、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３０９、Ｓ３１０に対応する衝撃音フロアリングスペクトルＦＳと判定する。
また、衝撃音ノイズ低減処理部２５４１は、衝撃音発生期間ｔ２０〜ｔ２１を示す動作タイミング信号に基づき、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３１１、Ｓ３１２に最も近い非動作期間Ｔａに対応する周波数スペクトルＳ３１３を、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３１１、Ｓ３１２に対応する衝撃音フロアリングスペクトルＦＳと判定する。 Further, based on the operation timing signal indicating the impact sound generation period t20 to t21, the impact sound noise reduction processing unit 2541 acquires the frequency spectra S309 to S312, which correspond to the period during which an impact sound may be occurring, as the impact sound processing frequency spectrum SS.
Then, based on the same operation timing signal, the impact sound noise reduction processing unit 2541 determines the frequency spectrum S308, which corresponds to the drive sound generation period Tc closest to the frequency spectra S309 and S310 (both impact sound processing frequency spectra SS), to be the impact sound flooring spectrum FS for the frequency spectra S309 and S310.
Further, based on the same operation timing signal, the impact sound noise reduction processing unit 2541 determines the frequency spectrum S313, which corresponds to the non-operation period Ta closest to the frequency spectra S311 and S312 (both impact sound processing frequency spectra SS), to be the impact sound flooring spectrum FS for the frequency spectra S311 and S312.
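The selection described above can be read as a nearest-neighbour rule: each frame inside an impact sound generation period takes, as its flooring spectrum, the closest frame outside any such period. The following is a minimal sketch of that reading, not the patented implementation. The function name `pick_flooring_frames` and the tie-break toward the earlier frame are assumptions; the tie-break was chosen because it reproduces the S303 to S301 pairing given in the text.

```python
def pick_flooring_frames(all_frames, impact_frames):
    """Map each impact-sound frame to the nearest frame outside the impact periods.

    Ties go to the earlier frame (an assumption of this sketch): min() returns
    the first of several equally distant candidates, and candidates are sorted.
    """
    impact = set(impact_frames)
    candidates = [f for f in all_frames if f not in impact]
    return {f: min(candidates, key=lambda c: abs(c - f)) for f in sorted(impact)}


# Frame numbers mirror the spectrum labels S301 ... S313 used in the text.
all_frames = list(range(301, 314))
impact_frames = [302, 303, 304, 309, 310, 311, 312]
flooring = pick_flooring_frames(all_frames, impact_frames)
print(flooring)
```

With these inputs the mapping pairs S302 and S303 with S301, S304 with S305, S309 and S310 with S308, and S311 and S312 with S313, which matches the example in the description.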

さらに、衝撃音ノイズ低減処理部２５４１は、衝撃音処理周波数スペクトルＳＳの少なくとも一部を、衝撃音フロアリングスペクトルＦＳの対応する部分に置き換える。
例えば、衝撃音ノイズ低減処理部２５４１は、衝撃音処理周波数スペクトルＳＳのうち予め決められた閾値周波数以上の周波数スペクトルと、衝撃音フロアリングスペクトルＦＳのうち予め決められた閾値周波数以上の周波数スペクトルとを、周波数成分ごとに比較する。そして、衝撃音フロアリングスペクトルＦＳの方が衝撃音処理周波数スペクトルＳＳに比べて小さいと判定した場合に、衝撃音ノイズ低減処理部２５４１は、衝撃音処理周波数スペクトルＳＳにおける当該周波数成分を衝撃音フロアリングスペクトルＦＳの周波数成分に置き換える。 Furthermore, the impact sound noise reduction processing unit 2541 replaces at least a part of the impact sound processing frequency spectrum SS with a corresponding part of the impact sound flooring spectrum FS.
For example, the impact sound noise reduction processing unit 2541 compares, for each frequency component, the part of the impact sound processing frequency spectrum SS at or above a predetermined threshold frequency with the part of the impact sound flooring spectrum FS at or above the predetermined threshold frequency. When it determines that the impact sound flooring spectrum FS is smaller than the impact sound processing frequency spectrum SS at a frequency component, the impact sound noise reduction processing unit 2541 replaces that frequency component of the impact sound processing frequency spectrum SS with the corresponding frequency component of the impact sound flooring spectrum FS.

図６を参照して詳細に説明する。図６は、一部の周波数スペクトルの周波数成分の一例について説明するための図である。なお、本実施の形態では、説明便宜のため、図３に示すマイク音信号のうち、フレームＦ１０１，Ｆ１０３，Ｆ１０５，Ｆ１０７，Ｆ１１１，Ｆ１１３に対応する周波数スペクトルＳ３０１，Ｓ３０３，Ｓ３０５，Ｓ３０７，Ｓ３１１，Ｓ３１３について説明する。
図６に示す通り、周波数スペクトルＳ３０１，Ｓ３０３，Ｓ３０５，Ｓ３０７，Ｓ３１１，Ｓ３１３は、それぞれ、周波数成分ｆ１〜ｆ９の周波数成分を含む。
例えば、衝撃音ノイズ低減処理部２５４１は、各周波数スペクトルの閾値周波数以上の周波数成分として、周波数成分ｆ３〜ｆ９について、衝撃音処理周波数スペクトルＳＳと衝撃音フロアリングスペクトルＦＳとを比較することが予め決められている。よって、衝撃音ノイズ低減処理部２５４１は、周波数成分ｆ１，ｆ２については、衝撃音処理周波数スペクトルＳＳと衝撃音フロアリングスペクトルＦＳとを比較しない。 This will be described in detail with reference to FIG. 6. FIG. 6 is a diagram for explaining an example of the frequency components of some of the frequency spectra. In the present embodiment, for convenience of explanation, the frequency spectra S301, S303, S305, S307, S311, and S313, which correspond to the frames F101, F103, F105, F107, F111, and F113 of the microphone sound signal shown in FIG. 3, will be described.
As shown in FIG. 6, the frequency spectrums S301, S303, S305, S307, S311, and S313 each include frequency components f1 to f9.
For example, it is determined in advance that the impact sound noise reduction processing unit 2541 compares the impact sound processing frequency spectrum SS with the impact sound flooring spectrum FS for the frequency components f3 to f9, which are the frequency components at or above the threshold frequency of each frequency spectrum. Therefore, the impact sound noise reduction processing unit 2541 does not compare the impact sound processing frequency spectrum SS with the impact sound flooring spectrum FS for the frequency components f1 and f2.

次いで、図７を参照して、周波数スペクトルＳ３０１とＳ３０３について、衝撃音ノイズ低減処理部２５４１による衝撃音ノイズ低減処理の一例について説明する。
図７は、周波数スペクトルＳ３０１とＳ３０３の周波数成分ごとに、振幅の比較について説明するための図である。
例えば、衝撃音ノイズ低減処理部２５４１は、周波数スペクトルＳ３０１の周波数成分ｆ３の振幅と、周波数スペクトルＳ３０３の周波数成分ｆ３の振幅とを比較する。この場合、周波数スペクトルＳ３０１の周波数成分ｆ３の振幅の方が、周波数スペクトルＳ３０３の周波数成分ｆ３の振幅に比べて小さい。よって、衝撃音ノイズ低減処理部２５４１は、周波数スペクトルＳ３０３の周波数成分ｆ３を、周波数スペクトルＳ３０１の周波数成分ｆ３に置き換える。 Next, with reference to FIG. 7, an example of impact sound noise reduction processing by the impact sound noise reduction processing unit 2541 will be described for the frequency spectra S301 and S303.
FIG. 7 is a diagram for explaining comparison of amplitude for each frequency component of the frequency spectra S301 and S303.
For example, the impact sound noise reduction processing unit 2541 compares the amplitude of the frequency component f3 of the frequency spectrum S301 with the amplitude of the frequency component f3 of the frequency spectrum S303. In this case, the amplitude of the frequency component f3 of the frequency spectrum S301 is smaller than the amplitude of the frequency component f3 of the frequency spectrum S303. Therefore, the impact sound noise reduction processing unit 2541 replaces the frequency component f3 of the frequency spectrum S303 with the frequency component f3 of the frequency spectrum S301.

また、衝撃音ノイズ低減処理部２５４１は、周波数スペクトルＳ３０１の周波数成分ｆ４の振幅と、周波数スペクトルＳ３０３の周波数成分ｆ４の振幅とを比較する。この場合、周波数スペクトルＳ３０１の周波数成分ｆ４の振幅の方が、周波数スペクトルＳ３０３の周波数成分ｆ４の振幅に比べて大きい。よって、衝撃音ノイズ低減処理部２５４１は、周波数スペクトルＳ３０３の周波数成分ｆ３を、周波数スペクトルＳ３０１の周波数成分ｆ３に置き換えない。
このようにして、衝撃音ノイズ低減処理部２５４１は、周波数スペクトルＳ３０１の周波数成分の振幅の方が、周波数スペクトルＳ３０３の周波数成分の振幅に比べて小さい場合のみ、周波数スペクトルＳ３０３の周波数成分を周波数スペクトルＳ３０１の周波数成分に置き換える。
図７に示す場合、衝撃音ノイズ低減処理部２５４１は、周波数スペクトルＳ３０３の周波数成分ｆ３、ｆ６〜ｆ９を周波数スペクトルＳ３０１の周波数成分ｆ３、ｆ６〜ｆ９に置き換える。 Further, the impact sound noise reduction processing unit 2541 compares the amplitude of the frequency component f4 of the frequency spectrum S301 with the amplitude of the frequency component f4 of the frequency spectrum S303. In this case, the amplitude of the frequency component f4 of the frequency spectrum S301 is larger than the amplitude of the frequency component f4 of the frequency spectrum S303. Therefore, the impact sound noise reduction processing unit 2541 does not replace the frequency component f4 of the frequency spectrum S303 with the frequency component f4 of the frequency spectrum S301.
In this way, the impact sound noise reduction processing unit 2541 replaces a frequency component of the frequency spectrum S303 with the corresponding frequency component of the frequency spectrum S301 only when the amplitude of that frequency component in the frequency spectrum S301 is smaller than its amplitude in the frequency spectrum S303.
In the case illustrated in FIG. 7, the impact sound noise reduction processing unit 2541 replaces the frequency components f3 and f6 to f9 of the frequency spectrum S303 with the frequency components f3 and f6 to f9 of the frequency spectrum S301.
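The per-component comparison above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of the embodiment. The function name `apply_flooring` is hypothetical, the amplitudes are invented values (not read from FIG. 7), and the sketch assumes a replaced component is copied whole from the flooring spectrum.

```python
def apply_flooring(impact_spec, floor_spec, threshold_bin):
    """Replace components of the impact frame by the flooring frame's components
    wherever the flooring amplitude is smaller, at or above the threshold bin."""
    out = list(impact_spec)
    for k in range(threshold_bin, len(out)):
        if abs(floor_spec[k]) < abs(out[k]):
            out[k] = floor_spec[k]
    return out


# Hypothetical amplitudes for f1 ... f9 (index 0 is f1); not taken from FIG. 7.
s301 = [5.0, 4.0, 1.0, 3.0, 2.5, 0.5, 0.4, 0.3, 0.2]  # flooring spectrum FS
s303 = [6.0, 5.0, 2.0, 2.0, 2.0, 1.5, 1.2, 1.0, 0.8]  # impact spectrum SS
s303_floored = apply_flooring(s303, s301, threshold_bin=2)  # compare f3 ... f9 only
print(s303_floored)
```

With these made-up values, f1 and f2 are left alone because they lie below the threshold, f4 and f5 are kept because the flooring amplitude is larger, and f3 and f6 to f9 are replaced, which is the same pattern as the FIG. 7 example.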

駆動音ノイズ低減処理部２５４２は、動作タイミング検出部１９１から入力するタイミング信号に基づき、例えば、フーリエ変換部２５３から出力される周波数スペクトルＳ３０１，Ｓ３０２，Ｓ３０３，Ｓ３０４，Ｓ３０５，Ｓ３０６・・・から、駆動音が発生している可能性の高い期間に対応する周波数スペクトル（以下、駆動音処理周波数スペクトルＫＳという）を取得する。例えば、駆動音ノイズ低減処理部２５４２は、動作開始タイミングｔ１０に対応する衝撃音発生期間ｔ１０〜ｔ１１を示す動作タイミングと、動作停止タイミングｔ２０に対応する衝撃音発生期間ｔ２０〜ｔ２１を示す動作タイミング信号に基づき、駆動音が発生している可能性のある期間に対応する周波数スペクトルＳ３０２〜Ｓ３１２を、駆動音処理周波数スペクトルＫＳとして取得する。 The drive sound noise reduction processing unit 2542 is based on the timing signal input from the operation timing detection unit 191, for example, from the frequency spectrum S301, S302, S303, S304, S305, S306... Output from the Fourier transform unit 253. A frequency spectrum (hereinafter referred to as a driving sound processing frequency spectrum KS) corresponding to a period during which driving sound is likely to be generated is acquired. For example, the drive sound noise reduction processing unit 2542 includes an operation timing signal indicating an impact sound generation period t10 to t11 corresponding to the operation start timing t10 and an operation timing signal indicating an impact sound generation period t20 to t21 corresponding to the operation stop timing t20. Based on the above, the frequency spectrums S302 to S312 corresponding to the period in which the driving sound may be generated are acquired as the driving sound processing frequency spectrum KS.

この駆動音ノイズ低減処理部２５４２は、取得した駆動音処理周波数スペクトルＫＳに対して、駆動パターンに応じて予め決められているノイズを低減する駆動音ノイズ低減処理を行う。例えば、駆動音ノイズ低減処理部２５４２は、駆動パターンに応じて予め決められているノイズを表わす周波数スペクトルの周波数成分を、駆動音処理周波数スペクトルＫＳの周波数成分から減算する周波数スペクトル減算法を用いる。なお、駆動パターンに応じて予め決められているノイズの周波数スペクトルは、設定値として駆動音ノイズ低減処理部２５４２に予め設定されている。しかし本発明はこれに限られず、駆動音ノイズ低減処理部２５４２が、過去のマイク音信号に基づき、駆動音が発生しているフレームの周波数スペクトルから駆動音が発生していないフレームの周波数スペクトルを減算することにより、推定される駆動音のノイズの周波数スペクトル（以下、推定ノイズスペクトルという）を、駆動パターンごとに算出しておくものであってもよい。 The drive sound noise reduction processing unit 2542 performs drive sound noise reduction processing for reducing noise determined in advance according to the drive pattern, on the acquired drive sound processing frequency spectrum KS. For example, the drive sound noise reduction processing unit 2542 uses a frequency spectrum subtraction method in which a frequency component of a frequency spectrum representing noise determined in advance according to a drive pattern is subtracted from the frequency component of the drive sound processing frequency spectrum KS. Note that the frequency spectrum of noise determined in advance according to the drive pattern is preset in the drive sound noise reduction processing unit 2542 as a set value. However, the present invention is not limited to this, and the drive sound noise reduction processing unit 2542 calculates a frequency spectrum of a frame where no drive sound is generated from a frequency spectrum of a frame where the drive sound is generated based on the past microphone sound signal. By subtracting, the frequency spectrum of the noise of the estimated driving sound (hereinafter referred to as the estimated noise spectrum) may be calculated for each driving pattern.
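A minimal sketch of the frequency spectrum subtraction method mentioned above is given below. The function name `spectral_subtract`, the choice to subtract magnitudes while keeping each component's phase, and the clamp at zero are all assumptions of this sketch; the description only states that the noise components are subtracted from the components of the drive sound processing frequency spectrum KS.

```python
import cmath


def spectral_subtract(frame_spec, noise_mag):
    """Subtract a noise magnitude from each complex component, keeping its phase."""
    out = []
    for value, noise in zip(frame_spec, noise_mag):
        # Clamp at zero so a magnitude never becomes negative (assumption).
        magnitude = max(abs(value) - noise, 0.0)
        out.append(cmath.rect(magnitude, cmath.phase(value)))
    return out


# Hypothetical three-component spectrum KS and preset noise magnitudes.
ks = [3 + 4j, 1 + 0j, 0 - 2j]
noise = [1.0, 2.0, 0.5]
cleaned = spectral_subtract(ks, noise)
print([abs(v) for v in cleaned])
```

The second component shows why some lower bound is needed: the preset noise magnitude exceeds the observed magnitude, so the result is held at zero instead of going negative.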

なお、ノイズ低減処理部２５４によって信号処理された後の周波数スペクトルを、以下、Ｓ４０１，Ｓ４０２，Ｓ４０３，Ｓ４０４，Ｓ４０５，Ｓ４０６・・・と記す。つまり、ノイズ低減処理部２５４は、入力する周波数スペクトルＳ３０１，Ｓ３０２，Ｓ３０３，Ｓ３０４，Ｓ３０５，Ｓ３０６・・・を信号処理した処理結果である周波数スペクトルＳ４０１，Ｓ４０２，Ｓ４０３，Ｓ４０４，Ｓ４０５，Ｓ４０６・・・を、逆フーリエ変換部２５５に出力する。 Hereinafter, the frequency spectrum after the signal processing by the noise reduction processing unit 254 is referred to as S401, S402, S403, S404, S405, S406. That is, the noise reduction processing unit 254 performs the frequency spectrum S401, S402, S403, S404, S405, S406,..., Which is a processing result obtained by performing signal processing on the input frequency spectrum S301, S302, S303, S304, S305, S306. Is output to the inverse Fourier transform unit 255.

逆フーリエ変換部２５５は、ノイズ低減処理部２５４によって信号処理された周波数スペクトルＳ４０１，Ｓ４０２，Ｓ４０３，Ｓ４０４，Ｓ４０５，Ｓ４０６・・・に対して、例えば逆フーリエ変換、あるいは逆高速フーリエ変換（ＩＦＦＴ：Inverse Fast Fourier Transform）を行うことで、時間領域に変換する。
なお、逆フーリエ変換部２５５によって時間領域に変換された音時間信号を、以下、Ｓ５０１，Ｓ５０２，Ｓ５０３，Ｓ５０４，Ｓ５０５，Ｓ５０６・・・と記す。つまり、逆フーリエ変換部２５５は、入力する周波数スペクトルＳ４０１，Ｓ４０２，Ｓ４０３，Ｓ４０４，Ｓ４０５，Ｓ４０６・・・を、音時間信号Ｓ５０１，Ｓ５０２，Ｓ５０３，Ｓ５０４，Ｓ５０５，Ｓ５０６・・・に変換して、連結調整部２５６に出力する。 The inverse Fourier transform unit 255 converts the frequency spectra S401, S402, S403, S404, S405, S406, ... processed by the noise reduction processing unit 254 into the time domain by performing, for example, an inverse Fourier transform or an inverse fast Fourier transform (IFFT).
Note that the sound time signals converted into the time domain by the inverse Fourier transform unit 255 are hereinafter referred to as S501, S502, S503, S504, S505, S506, .... That is, the inverse Fourier transform unit 255 converts the input frequency spectra S401, S402, S403, S404, S405, S406, ... into the sound time signals S501, S502, S503, S504, S505, S506, ... and outputs them to the connection adjustment unit 256.

連結調整部２５６は、逆フーリエ変換部２５５から入力された音時間信号Ｓ５０１，Ｓ５０２，Ｓ５０３，Ｓ５０４，Ｓ５０５，Ｓ５０６・・・のそれぞれに連結調整窓関数Ｗ_３を乗算する。なお、連結調整部２５６によって連結調整窓関数Ｗ_３で重み付けされた音時間信号を、以下、Ｓ６０１，Ｓ６０２，Ｓ６０３，Ｓ６０４，Ｓ６０５，Ｓ６０６・・・と記す。
本実施形態において、連結調整部２５６は、以下の式（２）に示す連結調整窓関数Ｗ_３＝ハニング窓関数Ｗ_２／ハミング窓関数Ｗ_１を、音時間信号Ｓ５０１，Ｓ５０２，Ｓ５０３，Ｓ５０４，Ｓ５０５，Ｓ５０６・・・のそれぞれに乗算する。なお、式（２）では、窓の範囲を０〜Ｔとし、変数を時刻ｔ（０≦ｔ≦Ｔ）で示す。 The connection adjustment unit 256 multiplies each of the sound time signals S501, S502, S503, S504, S505, S506, ... input from the inverse Fourier transform unit 255 by the connection adjustment window function W_3. The sound time signals weighted with the connection adjustment window function W_3 by the connection adjustment unit 256 are hereinafter referred to as S601, S602, S603, S604, S605, S606, ....
In the present embodiment, the connection adjustment unit 256 multiplies each of the sound time signals S501, S502, S503, S504, S505, S506, ... by the connection adjustment window function W_3 = Hanning window function W_2 / Hamming window function W_1 shown in the following equation (2). In equation (2), the window range is 0 to T, and the variable is the time t (0 ≦ t ≦ T).

なお、ハニング窓関数Ｗ_２は、以下の式（３）に示す。式（３）では、窓の範囲を０〜Ｔとし、変数を時刻ｔ（０≦ｔ≦Ｔ）で示す。 The Hanning window function W_2 is given by the following equation (3). In equation (3), the window range is 0 to T, and the variable is the time t (0 ≦ t ≦ T).
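The equation images referred to as equations (2) and (3) are not reproduced in this text. For orientation only, the commonly used textbook forms of the two windows on 0 ≦ t ≦ T, and the quotient that the description defines as W_3, are written out below. The coefficients 0.54, 0.46, and 0.5 are the standard values and are an assumption here; the published equations may be written differently.

```latex
W_1(t) = 0.54 - 0.46\cos\!\left(\frac{2\pi t}{T}\right), \qquad
W_2(t) = 0.5 - 0.5\cos\!\left(\frac{2\pi t}{T}\right), \qquad
W_3(t) = \frac{W_2(t)}{W_1(t)}, \qquad 0 \le t \le T
```

Under these forms W_1 never falls below 0.08, so the quotient is well defined over the whole window, and W_3 is 0 at t = 0 and t = T because W_2 is 0 there.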

信号重ね合わせ部２５７は、連結調整部２５６から入力する音時間信号Ｓ６０１，Ｓ６０２，Ｓ６０３，Ｓ６０４，Ｓ６０５，Ｓ６０６・・・に基づき、もとのマイク音信号の配置にあわせてつなぎ合わせる。本実施形態において、マイク音信号は、音信号切り出し部２５１によって、付番が奇数番号と偶数番号の音時間信号が半分ずつオーバーラップするようにフレーム単位に切り出されている。従って、信号重ね合わせ部２５７は、付番が奇数番号と偶数番号の音時間信号が半分ずつオーバーラップするように音時間信号Ｓ６０１，Ｓ６０２，Ｓ６０３，Ｓ６０４，Ｓ６０５，Ｓ６０６・・・を重ね合わせて連結させる。 The signal superimposing unit 257 joins the sound time signals S601, S602, S603, S604, S605, S606, ... input from the connection adjustment unit 256 in accordance with their arrangement in the original microphone sound signal. In the present embodiment, the sound signal cutout unit 251 cuts the microphone sound signal into frames such that the odd-numbered and even-numbered sound time signals overlap each other by half. Accordingly, the signal superimposing unit 257 superimposes and joins the sound time signals S601, S602, S603, S604, S605, S606, ... such that the odd-numbered and even-numbered sound time signals overlap each other by half.
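The half-overlapped joining performed by the signal superimposing unit 257 corresponds to what is usually called overlap-add. The sketch below shows only the bookkeeping; the function name `overlap_add` and the constant-valued frames are illustrative assumptions.

```python
def overlap_add(frames):
    """Join equal-length frames so that each frame overlaps its neighbour by half."""
    frame_len = len(frames[0])
    hop = frame_len // 2
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for index, frame in enumerate(frames):
        start = index * hop
        for n, sample in enumerate(frame):
            out[start + n] += sample  # overlapping halves are summed
    return out


# Three hypothetical 4-sample frames standing in for S601, S602, S603.
joined = overlap_add([[1.0] * 4, [2.0] * 4, [3.0] * 4])
print(joined)
```

In this toy case the first frame (S601) ends exactly where the third (S603) begins, while the second (S602) straddles that joint, which mirrors the odd and even frame arrangement in the description.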

この信号重ね合わせ部２５７は、音時間信号Ｓ６０１，Ｓ６０２，Ｓ６０３，Ｓ６０４，Ｓ６０５，Ｓ６０６・・・をつなぎ合わせた音情報を、記憶媒体２００に記憶させる。なお、信号重ね合わせ部２５７は、つなぎ合わされた音情報と、撮像素子１１９により撮像された画像データとを、対応する日時情報を有する同士で対応付けて、記憶媒体２００に記憶させてもよく、音情報を含む動画として記憶してもよい。 The signal superimposing unit 257 stores, in the storage medium 200, the sound information obtained by joining the sound time signals S601, S602, S603, S604, S605, S606, .... Note that the signal superimposing unit 257 may store the joined sound information and the image data captured by the image sensor 119 in the storage medium 200 in association with each other where they have corresponding date and time information, or may store them as a moving image containing the sound information.

次に、図８〜１７を参照して、本実施形態に係る低減処理部２５０によって処理される音時間信号の一例について説明する。
図８は、ハミング窓関数Ｗ_１の一例を示す図である。また、図９は、ハニング窓関数Ｗ_２の一例を示す図である。さらに、図１０は、連結調整窓関数Ｗ_３の一例を示す図である。
図８に示す通り、ハミング窓関数Ｗ_１は、窓の両端の値が中央の値に比べて小さく、かつ、窓の両端が０（ゼロ）よりも大きい窓関数である。
一方、ハニング窓関数Ｗ_２は、図９に示す通り、窓の両端の値が中央の値に比べて小さく、かつ、窓の両端が０（ゼロ）となる窓関数である。
また、連結調整窓関数Ｗ_３は、図１０に示す通り、窓の両端の値が中央の値に比べて小さく、かつ、窓の両端が０（ゼロ）となる窓関数である。なお、連結調整窓関数Ｗ_３の中央付近は、ハミング窓関数Ｗ_１やハニング窓関数Ｗ_２に比べて、１に近い値である。 Next, an example of a sound time signal processed by the reduction processing unit 250 according to the present embodiment will be described with reference to FIGS. 8 to 17.
FIG. 8 is a diagram showing an example of the Hamming window function W_1. FIG. 9 is a diagram showing an example of the Hanning window function W_2. Further, FIG. 10 is a diagram showing an example of the connection adjustment window function W_3.
As shown in FIG. 8, the Hamming window function W_1 is a window function whose values at both ends of the window are smaller than the value at the center and larger than 0 (zero).
On the other hand, as shown in FIG. 9, the Hanning window function W_2 is a window function whose values at both ends of the window are smaller than the value at the center and equal to 0 (zero).
As shown in FIG. 10, the connection adjustment window function W_3 is also a window function whose values at both ends of the window are smaller than the value at the center and equal to 0 (zero). Near the center of the window, the connection adjustment window function W_3 is closer to 1 than either the Hamming window function W_1 or the Hanning window function W_2.
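The three properties described for FIGS. 8 to 10 can be checked numerically. The sketch below assumes the standard coefficients for the Hamming (0.54, 0.46) and Hanning (0.5, 0.5) windows, which are not reproduced in this text; the function names are illustrative.

```python
import math


def hamming(t, T):
    """Hamming window W_1 on 0 <= t <= T (standard coefficients assumed)."""
    return 0.54 - 0.46 * math.cos(2 * math.pi * t / T)


def hanning(t, T):
    """Hanning window W_2 on 0 <= t <= T (standard coefficients assumed)."""
    return 0.5 - 0.5 * math.cos(2 * math.pi * t / T)


def connection_window(t, T):
    """Connection adjustment window W_3 = W_2 / W_1."""
    return hanning(t, T) / hamming(t, T)


T = 1.0
for t in (0.0, 0.25, 0.5):
    print(t, hamming(t, T), hanning(t, T), connection_window(t, T))
```

At the window edge the Hamming value stays positive while the other two are zero, and at a quarter of the window W_3 lies above both W_1 and W_2, consistent with the statement that W_3 is closer to 1 around the center.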

図１１は、低減処理部２５０に入力するマイク音信号の一例を示す図である。図１１には、フレームＦ１０１〜Ｆ１０３に対応する部分のみを示す。
図１２は、音信号切り出し部２５１によって切出されたフレームＦ１０３に対応する音時間信号Ｓ１０３の一例を示す。図１２に示す通り、フレームＦ１０３に対応する音時間信号には、衝撃音が含まれている。
図１３は、ハミング窓処理部２５２によって音時間信号Ｓ１０３にハミング窓関数Ｗ_１を乗算した音時間信号Ｓ２０３の一例を示す。図１３に示す通り、音時間信号Ｓ２０３の両端は、図１２に示した音時間信号Ｓ１０３の両端に比べて小さくなっているが、その両端の値は０（ゼロ）ではない。
図１４は、ノイズ低減処理部２５４によってノイズ低減処理がなされた後の音時間信号Ｓ６０３の一例を示す。つまり、図１４は、ノイズ低減処理がなされた周波数スペクトルＳ４０３を時間領域に変換した音時間信号Ｓ５０３に連結調整窓関数Ｗ_３を乗算した音時間信号Ｓ６０３の一例を示す。また、図１５は、図１４に示す音時間信号Ｓ６０３の窓の端部に対応する部分を拡大して示す拡大図である。
図１４、１５に示す通り、音時間信号Ｓ６０３は、衝撃音（あるいは駆動音も含む）が低減されており、音時間信号Ｓ６０３の窓の両端は、０（ゼロ）である。 FIG. 11 is a diagram illustrating an example of a microphone sound signal input to the reduction processing unit 250. FIG. 11 shows only portions corresponding to the frames F101 to F103.
FIG. 12 shows an example of the sound time signal S103 corresponding to the frame F103 cut out by the sound signal cutout unit 251. As shown in FIG. 12, the sound time signal corresponding to the frame F103 includes an impact sound.
FIG. 13 shows an example of the sound time signal S203 obtained when the Hamming window processing unit 252 multiplies the sound time signal S103 by the Hamming window function W_1. As shown in FIG. 13, both ends of the sound time signal S203 are smaller than both ends of the sound time signal S103 shown in FIG. 12, but the values at both ends are not 0 (zero).
FIG. 14 shows an example of the sound time signal S603 after the noise reduction processing by the noise reduction processing unit 254. That is, FIG. 14 shows an example of the sound time signal S603 obtained by converting the noise-reduced frequency spectrum S403 into the time-domain sound time signal S503 and multiplying it by the connection adjustment window function W_3. FIG. 15 is an enlarged view showing the portion corresponding to the end of the window of the sound time signal S603 shown in FIG. 14.
As shown in FIGS. 14 and 15, the sound time signal S603 has a reduced impact sound (or drive sound), and both ends of the window of the sound time signal S603 are 0 (zero).

図１６は、信号重ね合わせ部２５７によって、音時間信号Ｓ６０１，Ｓ６０２，Ｓ６０３をつなぎ合わせた音情報を示す図である。また、図１７は、音時間信号Ｓ６０１とＳ６０３のつなぎ目を拡大した拡大図である。図１６、１７に示す通り、音時間信号Ｓ６０１とＳ６０３のつなぎ目は連続した状態となっている。これは、図１４、１５に示したとおり、ノイズ低減処理部２５４によってノイズ低減処理がなされた周波数スペクトルは、時間領域に変換された後、連結調整窓関数Ｗ_３が乗算されることによって、その音時間信号の窓の両端の値が０（ゼロ）になるからである。
このように、ノイズ低減処理された音時間信号に対して両端が０（ゼロ）の窓関数を乗算することにより、この音時間信号の窓の両端の値を０（ゼロ）にすることができる。よって、音時間信号同士のつなぎ目の値が０（ゼロ）で一致するため、つなぎ目の値が異なることにより音がとび、発生するおそれのあるノイズを低減することができる。 FIG. 16 is a diagram illustrating sound information obtained by connecting the sound time signals S601, S602, and S603 by the signal superimposing unit 257. FIG. 17 is an enlarged view in which the joint between the sound time signals S601 and S603 is enlarged. As shown in FIGS. 16 and 17, the joint between the sound time signals S601 and S603 is in a continuous state. As shown in FIGS. 14 and 15, the frequency spectrum subjected to the noise reduction processing by the noise reduction processing unit 254 is converted into the time domain and then multiplied by the connection adjustment window function W _3. This is because the values at both ends of the sound time signal window become 0 (zero).
In this way, multiplying the noise-reduced sound time signal by a window function whose values at both ends are 0 (zero) sets the values at both ends of the window of the sound time signal to 0 (zero). Since the values at the joints between sound time signals then coincide at 0 (zero), the noise that could otherwise arise from a jump in the sound caused by mismatched joint values can be reduced.
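A short worked identity may help show why dividing the Hanning window by the Hamming window is a natural choice here. It assumes the standard Hanning form W_2(t) = 0.5 - 0.5 cos(2πt/T), which is not reproduced in this text, and it ignores whatever the noise reduction itself changes in between. The combined weight of the analysis window W_1 and the connection adjustment window W_3 is then the Hanning window, and two Hanning windows offset by half a frame sum to one:

```latex
W_1(t)\,W_3(t) = W_2(t), \qquad
W_2(t) + W_2\!\left(t + \tfrac{T}{2}\right)
  = \left[0.5 - 0.5\cos\!\left(\tfrac{2\pi t}{T}\right)\right]
  + \left[0.5 + 0.5\cos\!\left(\tfrac{2\pi t}{T}\right)\right] = 1,
  \qquad 0 \le t \le \tfrac{T}{2}
```

Under these assumptions, a frame that passes through unmodified is reconstructed at its original level once the half-overlapped frames are summed, while each individual frame still starts and ends at 0.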

また、本実施形態に係る低減処理部２５０は、フーリエ変換部２５３によって周波数スペクトルに変換する前の音時間信号Ｓ１０１，Ｓ１０２，Ｓ１０３，Ｓ１０４，Ｓ１０５，Ｓ１０６・・・に対してハミング窓関数Ｗ_１を乗算している。このハミング窓関数Ｗ_１は、ハニング窓関数Ｗ_２に比べてサイドローブが小さいという特徴がある。具体的に説明すると、ハミング窓関数Ｗ_１のサイドローブは−４３ｄＢであるのに対して、ハニング窓関数Ｗ_２のサイドローブは−３２ｄＢである。よって、ハミング窓関数Ｗ_１で重み付けした後に周波数領域で信号処理することにより、ノイズ低減処理部２５４は、処理対象音に含まれる接近する周波数成分を分離しやすくなる。従って、ノイズ低減処理部２５４の処理効果を向上できるという利点がある。 Further, the reduction processing unit 250 according to the present embodiment multiplies the sound time signals S101, S102, S103, S104, S105, S106, ... by the Hamming window function W_1 before the Fourier transform unit 253 converts them into frequency spectra. The Hamming window function W_1 is characterized by smaller side lobes than the Hanning window function W_2. Specifically, the side lobe level of the Hamming window function W_1 is -43 dB, whereas that of the Hanning window function W_2 is -32 dB. Therefore, by weighting with the Hamming window function W_1 before the signal processing in the frequency domain, the noise reduction processing unit 254 can more easily separate closely spaced frequency components contained in the sound to be processed. This has the advantage of improving the processing effect of the noise reduction processing unit 254.
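The side lobe figures quoted above can be compared against a direct numerical estimate. The sketch below is only a rough check under stated assumptions: a 64-point symmetric window with the standard coefficients, a frequency grid of 2048 points, and hypothetical helper names `window` and `peak_sidelobe_db`. It evaluates the window's spectrum, walks down the main lobe to its first minimum, and reports the highest level beyond that point relative to the main-lobe peak.

```python
import cmath
import math


def window(n_points, a0, a1):
    """Symmetric raised-cosine window a0 - a1*cos(2*pi*n/(N-1))."""
    return [a0 - a1 * math.cos(2 * math.pi * n / (n_points - 1)) for n in range(n_points)]


def peak_sidelobe_db(win, grid=2048):
    """Highest side lobe relative to the main-lobe peak, in dB."""
    mags = []
    for k in range(grid // 2):
        omega = 2 * math.pi * k / grid
        mags.append(abs(sum(w * cmath.exp(-1j * omega * n) for n, w in enumerate(win))))
    k = 0
    while mags[k + 1] < mags[k]:  # walk down the main lobe to its first minimum
        k += 1
    return 20 * math.log10(max(mags[k:]) / mags[0])


hamming_db = peak_sidelobe_db(window(64, 0.54, 0.46))
hanning_db = peak_sidelobe_db(window(64, 0.5, 0.5))
print(round(hamming_db, 1), round(hanning_db, 1))
```

The exact numbers depend on the window length and grid, so they should be read as approximate; the point of the comparison is that the Hamming estimate comes out lower than the Hanning estimate.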

次に、図１８〜２２を参照して、本実施形態によらない例について説明する。
図１８は、音時間信号Ｓ１０３にハニング窓関数Ｗ_２が乗算された音時間信号Ｓ１２０３の一例を示す。図１８に示す通り、音時間信号Ｓ１２０３の窓の両端の値は、音時間信号Ｓ１０３の窓の両端に比べて小さくなり、０（ゼロ）になる。
図１９は、ノイズ低減処理がなされた音時間信号の一例を示す。つまり、図１９は、ノイズ低減処理がなされた周波数スペクトルＳ１４０３を時間領域に変換した音時間信号Ｓ１５０３の一例を示す。なお、この周波数スペクトルＳ１４０３は、音時間信号Ｓ１２０３を周波数領域に変換した周波数スペクトルＳ１３０３にノイズ低減処理をした周波数スペクトルである。
また、図２０は、図１９に示す音時間信号Ｓ１５０３の窓の端部に対応する部分を拡大して示す拡大図である。
図１９、２０に示す通り、音時間信号Ｓ１５０３は、衝撃音（あるいは駆動音も含む）が低減されているものの、音時間信号Ｓ１５０３の窓の両端の値は、０（ゼロ）ではない。図１９、２０に示す例では、音時間信号Ｓ１５０３の左端の値は、０．０１となっている。 Next, an example not according to the present embodiment will be described with reference to FIGS. 18 to 22.
FIG. 18 shows an example of the sound time signal S1203 obtained by multiplying the sound time signal S103 by the Hanning window function W_2. As shown in FIG. 18, the values at both ends of the window of the sound time signal S1203 are smaller than those at both ends of the window of the sound time signal S103 and are 0 (zero).
FIG. 19 shows an example of a sound time signal subjected to noise reduction processing. That is, FIG. 19 shows an example of a sound time signal S1503 obtained by converting the frequency spectrum S1403 subjected to noise reduction processing into the time domain. The frequency spectrum S1403 is a frequency spectrum obtained by performing noise reduction processing on the frequency spectrum S1303 obtained by converting the sound time signal S1203 into the frequency domain.
FIG. 20 is an enlarged view showing a portion corresponding to the end of the window of the sound time signal S1503 shown in FIG.
As shown in FIGS. 19 and 20, the sound time signal S1503 has reduced impact sound (or drive sound), but the values at both ends of the window of the sound time signal S1503 are not 0 (zero). In the examples shown in FIGS. 19 and 20, the value at the left end of the sound time signal S1503 is 0.01.

ここで、音時間信号Ｓ１５０３の両端の値が０（ゼロ）でなくなる理由を説明する。
低減処理前の音時間信号Ｓ１０３に含まれる主な周波数スペクトルは、フレームＦ１０３内で８周期のｃｏｓ成分である。また、音時間信号Ｓ１０３には、それ以外に、衝撃音の周波数スペクトルが含まれている。
つまり、音時間信号Ｓ１０３にハニング窓関数Ｗ_２を乗算した音時間信号Ｓ１２０３の周波数スペクトルは、上記８周期のｃｏｓ成分および衝撃音を含む周波数スペクトルにハニング窓関数Ｗ_２の影響を加えたものである。この状態では、音時間信号Ｓ１２０３の両端の値は０（ゼロ）になっている。これは、音時間信号Ｓ１２０３には、主な周波数スペクトルであるｃｏｓ成分の両端の値を０（ゼロ）にする複数の周波数成分が存在し、ｃｏｓ成分と打ち消しあっているためである。
しかし、ノイズ低減処理によって、衝撃音が除去されると、衝撃音由来の周波数スペクトルが除去され、ｃｏｓ成分と打ち消しあっていたバランスが崩れ、両端の値が０（ゼロ）ではなくなる。
このため、ノイズ低減処理後の音時間信号Ｓ１５０３の窓の両端の値は、０（ゼロ）ではなくなっている。 Here, the reason why the values at both ends of the sound time signal S1503 are not 0 (zero) will be described.
The main frequency spectrum included in the sound time signal S103 before the reduction process is a cosine component of 8 periods in the frame F103. In addition, the sound time signal S103 includes the frequency spectrum of the impact sound.
That is, the frequency spectrum of the sound time signal S1203, obtained by multiplying the sound time signal S103 by the Hanning window function W_2, is the frequency spectrum containing the 8-period cos component and the impact sound, with the effect of the Hanning window function W_2 added. In this state, the values at both ends of the sound time signal S1203 are 0 (zero). This is because the sound time signal S1203 contains a plurality of frequency components that bring the values at both ends of the cos component, which is the main frequency spectrum, to 0 (zero) by canceling against the cos component.
However, when the noise reduction processing removes the impact sound, the frequency spectrum derived from the impact sound is removed, the balance that had been canceling the cos component is lost, and the values at both ends are no longer 0 (zero).
For this reason, the values at both ends of the window of the sound time signal S1503 after the noise reduction processing are not 0 (zero).
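The effect described above can be reproduced with a small experiment. The sketch below is illustrative only: the frame length of 32 samples, the single-sample click standing in for the impact sound, and the crude "zero every bin above bin 10" step standing in for the noise reduction are all assumptions. It compares a Hanning-only path (the example not according to the embodiment) with a Hamming analysis window followed by the connection adjustment window W_3.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n)) for k in range(n)]


def idft(spec):
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n).real
            for m in range(n)]


def process(frame, analysis_window, keep_bins=10):
    """Window the frame, zero all bins above keep_bins (stand-in for noise reduction), invert."""
    n = len(frame)
    spec = dft([x * w for x, w in zip(frame, analysis_window)])
    reduced = [v if min(k, n - k) <= keep_bins else 0j for k, v in enumerate(spec)]
    return idft(reduced)


N = 32
hanning = [0.5 - 0.5 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]
hamming = [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]
w3 = [a / b for a, b in zip(hanning, hamming)]

# Eight cosine periods per frame plus one click standing in for the impact sound.
frame = [math.cos(2 * math.pi * 8 * n / N) for n in range(N)]
frame[5] += 3.0

hanning_only = process(frame, hanning)  # ends drift away from zero
embodiment = [y * w for y, w in zip(process(frame, hamming), w3)]  # ends forced to zero
print(hanning_only[0], embodiment[0], embodiment[-1])
```

Although the Hanning-windowed input starts at exactly zero, the modified spectrum no longer sums to zero at the frame edge, so the first output sample of the Hanning-only path is nonzero. Multiplying by W_3 after the inverse transform restores zero at both ends regardless of what the spectral modification did.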

図２１は、ノイズ低減処理後の周波数スペクトルを時間領域に変換した音時間信号をつなぎ合わせた音情報を示す図である。具体的に説明すると、図２１は、図１９，２０に示す音時間信号Ｓ１５０３をこの音時間信号Ｓ１５０３と隣接する音時間信号Ｓ１５０１とつなぎ合わせ、音時間信号Ｓ１５０２が重畳している音情報を示す。
また、図２２は、音時間信号Ｓ１５０１とＳ１５０３のつなぎ目を拡大した拡大図である。図２１、２２に示す通り、音時間信号Ｓ１５０１とＳ１５０３のつなぎ目は連続していない。これは、図１９、２０に示したとおり、本発明によらない場合、ノイズ低減処理後の周波数スペクトルＳ１４０１，Ｓ１４０２，Ｓ１４０３・・・に対して、連結調整窓関数Ｗ_３が乗算されることなく、周波数スペクトルＳ１４０１，Ｓ１４０２，Ｓ１４０３・・・が時間領域の音時間信号Ｓ１５０１，Ｓ１５０２，Ｓ１５０３・・・に変換されるためである。
このように、ノイズ低減処理後の音時間信号の窓の両端の値が０（ゼロ）でない場合、音時間信号同士のつなぎ目が一致しない。よって、つなぎ目の値が異なることにより音がとび、ノイズが発生するおそれがある。本願発明は、この問題を解決するものである。 FIG. 21 is a diagram showing sound information obtained by connecting sound time signals obtained by converting the frequency spectrum after noise reduction processing into the time domain. More specifically, FIG. 21 shows sound information in which the sound time signal S1503 shown in FIGS. 19 and 20 is connected to the sound time signal S1501 adjacent to the sound time signal S1503 and the sound time signal S1502 is superimposed. .
FIG. 22 is an enlarged view in which the joint between the sound time signals S1501 and S1503 is enlarged. As shown in FIGS. 21 and 22, the joint between the sound time signals S1501 and S1503 is not continuous. This is because, as shown in FIGS. 19 and 20, when the present invention is not applied, the noise-reduced frequency spectra S1401, S1402, S1403, ... are converted into the time-domain sound time signals S1501, S1502, S1503, ... without being multiplied by the connection adjustment window function W_3.
Thus, when the values at both ends of the window of the sound time signal after the noise reduction processing are not 0 (zero), the joints of the sound time signals do not match. Therefore, the sound skips due to the different values of the joints, and noise may occur. The present invention solves this problem.

次に、図２３を参照して、本実施形態に係るノイズ低減処理方法の一例について説明する。図２３は、本実施形態に係るノイズ低減処理方法の一例を示すフローチャートである。
例えば、操作部１８０の電源スイッチがＯＮされると、撮像装置１００に電源が投入され、電池２６０から各構成部に対して電力が供給される。本実施形態では、撮像装置１００に対して、撮像時の画像データと音声データを対応付けて記憶媒体２００に記憶させることが予め設定されている。 Next, an example of the noise reduction processing method according to the present embodiment will be described with reference to FIG. FIG. 23 is a flowchart illustrating an example of the noise reduction processing method according to the present embodiment.
For example, when the power switch of the operation unit 180 is turned on, the imaging apparatus 100 is powered on, and power is supplied from the battery 260 to each component. In the present embodiment, it is set in advance for the image capturing apparatus 100 to store image data and sound data at the time of image capturing in association with each other in the storage medium 200.

（ステップＳＴ１）
マイク２３０は、例えば、動画撮影ボタンがＯＮされると、収音されたマイク音信号をＡ/Ｄ変換部２４０に出力する。Ａ/Ｄ変換部２４０は、アナログ信号であるマイク音信号をデジタル変換したマイク音信号を低減処理部２５０に出力する。
低減処理部２５０は、Ａ/Ｄ変換部２４０からマイク音信号を入力する。 (Step ST1)
For example, when the moving image shooting button is turned on, the microphone 230 outputs the collected microphone sound signal to the A / D conversion unit 240. The A / D converter 240 outputs the microphone sound signal obtained by digitally converting the microphone sound signal, which is an analog signal, to the reduction processing unit 250.
The reduction processing unit 250 inputs the microphone sound signal from the A / D conversion unit 240.

ここで、ユーザによって、例えば、操作部１８０のレリーズボタンが押下されたとする。この場合、レンズＣＰＵ１２０は、ＡＦ処理において、例えば距離Ｐでピントを合わせるＡＦ処理を実行するためのコマンドを、レンズ駆動部１１６と動作タイミング検出部１９１に出力する。
このレンズ駆動部１１６は、入力するコマンドに基づき、距離Ｐでピントを合わせる駆動パターンに従って、ＡＦレンズ１１２を移動させる。例えば、レンズ駆動部１１６は、ＡＦレンズ１１２の駆動機構を時計回りに所定量回転させて、ＡＦレンズ１１２を光軸に沿って移動させる。なお、この駆動機構を回転させる回転量やスピードは、距離Ｐでピントを合わせる駆動パターンとして、予め決められている。
ＡＦレンズ１１２が動くと、ＡＦエンコーダ１１７は、パルス信号をボディＣＰＵ１９０に出力する。このボディＣＰＵ１９０は、ＡＦエンコーダ１１７からパルス信号が入力されたことを示す情報を動作タイミング検出部１９１に出力する。動いていたＡＦレンズ１１２が停止すると、ＡＦエンコーダ１１７は、ボディＣＰＵ１９０へのパルス信号の出力を停止させる。このボディＣＰＵ１９０は、ＡＦエンコーダ１１７からのパルス信号の出力が停止されたことを示す情報を動作タイミング検出部１９１に出力する。 Here, it is assumed that the release button of the operation unit 180 is pressed by the user, for example. In this case, the lens CPU 120 outputs, to the lens driving unit 116 and the operation timing detection unit 191, a command for executing an AF process for focusing at a distance P in the AF process, for example.
The lens driving unit 116 moves the AF lens 112 according to a driving pattern for focusing at a distance P based on an input command. For example, the lens driving unit 116 rotates the driving mechanism of the AF lens 112 by a predetermined amount clockwise to move the AF lens 112 along the optical axis. Note that the rotation amount and speed for rotating the drive mechanism are determined in advance as a drive pattern for focusing at the distance P.
When the AF lens 112 moves, the AF encoder 117 outputs a pulse signal to the body CPU 190. The body CPU 190 outputs information indicating that the pulse signal is input from the AF encoder 117 to the operation timing detection unit 191. When the AF lens 112 that has moved stops, the AF encoder 117 stops outputting the pulse signal to the body CPU 190. The body CPU 190 outputs information indicating that the output of the pulse signal from the AF encoder 117 is stopped to the operation timing detection unit 191.

動作タイミング検出部１９１は、入力するコマンドやＡＦエンコーダ１１７の出力に基づき、距離Ｐでピントを合わせる駆動パターンに従って、動作タイミング信号を生成し、低減処理部２５０に出力する。
例えば、距離Ｐでピントを合わせるＡＦ処理を実行するためのコマンドをレンズＣＰＵ１２０から入力した場合、動作タイミング検出部１９１は、ＡＦレンズ１１２の動作開始タイミングｔ１０に対応する衝撃音発生期間（動作開始タイミング期間）ｔ１０〜ｔ１１を示す動作開始タイミング信号を生成し、低減処理部２５０に出力する。
そして、ＡＦエンコーダ１１７から入力するパルス信号が停止された場合、動作タイミング検出部１９１は、ＡＦレンズ１１２の動作停止タイミングｔ２０に対応する衝撃音発生期間（動作開始タイミング期間）ｔ２０〜ｔ２１を示す動作停止タイミング信号を生成し、低減処理部２５０に出力する。 The operation timing detection unit 191 generates an operation timing signal according to a driving pattern for focusing at a distance P based on an input command and an output of the AF encoder 117, and outputs the operation timing signal to the reduction processing unit 250.
For example, when a command for executing AF processing for focusing at the distance P is input from the lens CPU 120, the operation timing detection unit 191 generates an operation start timing signal indicating the impact sound generation period (operation start timing period) t10 to t11 corresponding to the operation start timing t10 of the AF lens 112, and outputs it to the reduction processing unit 250.
When the pulse signal input from the AF encoder 117 stops, the operation timing detection unit 191 generates an operation stop timing signal indicating the impact sound generation period (operation start timing period) t20 to t21 corresponding to the operation stop timing t20 of the AF lens 112, and outputs it to the reduction processing unit 250.
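How an impact sound generation period such as t10 to t11 maps onto frames is not spelled out at this point. One plausible reading, sketched below as an assumption, is that every frame whose time span overlaps the period is treated as possibly containing the impact sound. The function name `frames_overlapping` and all sample numbers are hypothetical.

```python
def frames_overlapping(period_start, period_end, frame_len, hop, num_frames):
    """Indices of frames [i*hop, i*hop + frame_len) that overlap [period_start, period_end)."""
    return [i for i in range(num_frames)
            if i * hop < period_end and i * hop + frame_len > period_start]


# Hypothetical numbers: 1024-sample frames cut every 512 samples (half overlap),
# with an impact sound generation period from sample 1100 to sample 1600.
impact_frames = frames_overlapping(1100, 1600, frame_len=1024, hop=512, num_frames=8)
print(impact_frames)
```

With half-overlapped frames, even a short period touches several consecutive frames, which is in line with the description treating three frames (S302 to S304) as impact sound frames for a single start timing.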

（ステップＳＴ２）
低減処理部２５０は、動作タイミング検出部１９１から動作タイミング信号が入力されたか否かを判定する。
（ステップＳＴ３）
動作タイミング信号が入力された場合、低減処理部２５０は、入力するマイク音信号を周波数領域に変換した周波数スペクトルに基づき、ノイズ低減処理を実行する。
具体的に説明すると、低減処理部２５０の音信号切り出し部２５１は、予め決められた時間長のフレームでマイク音信号を区切って、フレーム単位の音時間信号Ｓ１０１，Ｓ１０２，Ｓ１０３，Ｓ１０４，Ｓ１０５，Ｓ１０６・・・をハミング窓処理部２５２に出力する。
ハミング窓処理部２５２は、入力する音時間信号Ｓ１０１，Ｓ１０２，Ｓ１０３，Ｓ１０４，Ｓ１０５，Ｓ１０６・・・のそれぞれにハミング窓関数Ｗ_１を乗算する。そして、ハミング窓処理部２５２は、ハミング窓関数Ｗ_１で重み付けされた音時間信号Ｓ２０１，Ｓ２０２，Ｓ２０３，Ｓ２０４，Ｓ２０５，Ｓ２０６・・・を、フーリエ変換部２５３に出力する。
そして、フーリエ変換部２５３は、入力する音時間信号Ｓ２０１，Ｓ２０２，Ｓ２０３，Ｓ２０４，Ｓ２０５，Ｓ２０６・・・を周波数領域で表わされる周波数スペクトルＳ３０１，Ｓ３０２，Ｓ３０３，Ｓ３０４，Ｓ３０５，Ｓ３０６・・・に変換して、ノイズ低減処理部２５４に出力する。 (Step ST2)
The reduction processing unit 250 determines whether or not an operation timing signal is input from the operation timing detection unit 191.
(Step ST3)
When the operation timing signal is input, the reduction processing unit 250 performs noise reduction processing based on the frequency spectrum obtained by converting the input microphone sound signal into the frequency domain.
More specifically, the sound signal cutout unit 251 of the reduction processing unit 250 divides the microphone sound signal into frames of a predetermined time length and outputs the frame-by-frame sound time signals S101, S102, S103, S104, S105, S106, ... to the Hamming window processing unit 252.
The Hamming window processing unit 252 multiplies each of the input sound time signals S101, S102, S103, S104, S105, S106, ... by the Hamming window function W₁. The Hamming window processing unit 252 then outputs the sound time signals S201, S202, S203, S204, S205, S206, ... weighted by the Hamming window function W₁ to the Fourier transform unit 253.
The Fourier transform unit 253 then converts the input sound time signals S201, S202, S203, S204, S205, S206, ... into frequency spectra S301, S302, S303, S304, S305, S306, ... expressed in the frequency domain, and outputs them to the noise reduction processing unit 254.
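
The framing, windowing, and transform chain of step ST3 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names, the frame length of 8 samples, the hop of 4 samples, the periodic form of the Hamming window, and the plain DFT are all assumptions made for the example.

```python
import cmath
import math


def hamming(n_len):
    """Hamming window W1 (periodic form, assumed here) sampled at n_len points."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / n_len) for n in range(n_len)]


def split_frames(signal, n_len, hop):
    """Cut the signal into frames of n_len samples (role of cutout unit 251)."""
    return [signal[s:s + n_len] for s in range(0, len(signal) - n_len + 1, hop)]


def dft(frame):
    """Plain discrete Fourier transform of one frame (role of unit 253)."""
    n_len = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * n / n_len) for n, x in enumerate(frame))
        for k in range(n_len)
    ]


def analyse(signal, n_len=8, hop=4):
    """Return one spectrum per frame: S101.. -> S201.. -> S301.. in the text."""
    w1 = hamming(n_len)
    spectra = []
    for frame in split_frames(signal, n_len, hop):
        weighted = [x * w for x, w in zip(frame, w1)]  # weighting by W1
        spectra.append(dft(weighted))
    return spectra
```

Calling `analyse([1.0] * 16)` yields three spectra, one per half-overlapping frame; a production implementation would normally use an FFT routine instead of the quadratic-time `dft` shown here.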

（ステップＳＴ４）
次いで、衝撃音ノイズ低減処理部２５４１は、フーリエ変換部２５３から入力する周波数スペクトルＳ３０１，Ｓ３０２，Ｓ３０３，Ｓ３０４，Ｓ３０５，Ｓ３０６・・・に対して衝撃音ノイズ低減処理を実行する。
例えば、衝撃音ノイズ低減処理部２５４１は、動作開始タイミングｔ１０に対応する衝撃音発生期間ｔ１０〜ｔ１１を示す動作タイミング信号に基づき、衝撃音が発生している可能性のある期間に対応する周波数スペクトルＳ３０２〜Ｓ３０４を、衝撃音処理周波数スペクトルＳＳとして取得する。
そして、衝撃音ノイズ低減処理部２５４１は、動作開始タイミングｔ１０に対応する衝撃音発生期間ｔ１０〜ｔ１１を示す動作タイミング信号に基づき、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３０２、Ｓ３０３に対応する衝撃音フロアリングスペクトルＦＳとして、周波数スペクトルＳ３０２、３０３の直前の周波数スペクトルＳ３０１を取得する。また、衝撃音ノイズ低減処理部２５４１は、動作開始タイミングｔ１０に対応する衝撃音発生期間ｔ１０〜ｔ１１を示す動作タイミング信号に基づき、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３０４に対応する衝撃音フロアリングスペクトルＦＳとして、周波数スペクトルＳ３０４の直後の周波数スペクトルＳ３０５を取得する。 (Step ST4)
Next, the impact sound noise reduction processing unit 2541 performs impact sound noise reduction processing on the frequency spectra S301, S302, S303, S304, S305, S306, ... input from the Fourier transform unit 253.
For example, based on the operation timing signal indicating the impact sound generation period t10 to t11 corresponding to the operation start timing t10, the impact sound noise reduction processing unit 2541 acquires the frequency spectra S302 to S304, which correspond to the period in which an impact sound may be occurring, as the impact sound processing frequency spectrum SS.
Then, based on the operation timing signal indicating the impact sound generation period t10 to t11 corresponding to the operation start timing t10, the impact sound noise reduction processing unit 2541 acquires the frequency spectrum S301 immediately preceding the frequency spectra S302 and S303 as the impact sound flooring spectrum FS corresponding to the frequency spectra S302 and S303, which are impact sound processing frequency spectra SS. Likewise, based on the same operation timing signal, the impact sound noise reduction processing unit 2541 acquires the frequency spectrum S305 immediately following the frequency spectrum S304 as the impact sound flooring spectrum FS corresponding to the frequency spectrum S304, which is an impact sound processing frequency spectrum SS.

また、衝撃音ノイズ低減処理部２５４１は、動作停止タイミングｔ２０に対応する衝撃音発生期間ｔ２０〜ｔ２１を示す動作タイミング信号に基づき、衝撃音が発生している可能性のある期間に対応する周波数スペクトルＳ３０９〜Ｓ３１２を、衝撃音処理周波数スペクトルＳＳとして取得する。
そして、衝撃音ノイズ低減処理部２５４１は、動作停止タイミングｔ２０に対応する衝撃音発生期間ｔ２０〜ｔ２１を示す動作タイミング信号に基づき、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３０９、Ｓ３１０に対応する衝撃音フロアリングスペクトルＦＳとして、周波数スペクトルＳ３０９、Ｓ３１０の直前の周波数スペクトルＳ３０８を取得する。また、衝撃音ノイズ低減処理部２５４１は、動作開始タイミングｔ２０に対応する衝撃音発生期間ｔ２０〜ｔ２１を示す動作タイミング信号に基づき、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３１１、Ｓ３１２に対応する衝撃音フロアリングスペクトルＦＳとして、周波数スペクトルＳ３１１、Ｓ３１２の直後の周波数スペクトルＳ３１３を取得する。 Also, based on the operation timing signal indicating the impact sound generation period t20 to t21 corresponding to the operation stop timing t20, the impact sound noise reduction processing unit 2541 acquires the frequency spectra S309 to S312, which correspond to the period in which an impact sound may be occurring, as the impact sound processing frequency spectrum SS.
Then, based on the operation timing signal indicating the impact sound generation period t20 to t21 corresponding to the operation stop timing t20, the impact sound noise reduction processing unit 2541 acquires the frequency spectrum S308 immediately preceding the frequency spectra S309 and S310 as the impact sound flooring spectrum FS corresponding to the frequency spectra S309 and S310, which are impact sound processing frequency spectra SS. Further, based on the operation timing signal indicating the impact sound generation period t20 to t21 corresponding to the operation start timing t20, the impact sound noise reduction processing unit 2541 acquires the frequency spectrum S313 immediately following the frequency spectra S311 and S312 as the impact sound flooring spectrum FS corresponding to the frequency spectra S311 and S312, which are impact sound processing frequency spectra SS.
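
The pairing of impact frames with flooring frames in step ST4 can be sketched as follows. The text gives only two examples (S302 and S303 use S301 while S304 uses S305; S309 and S310 use S308 while S311 and S312 use S313). The split rule below, where the first half of the impact frames, rounded up, uses the preceding frame and the rest use the following frame, is an assumption that merely happens to reproduce both examples. The function name is hypothetical.

```python
import math


def assign_flooring_frames(first, last):
    """Map each impact frame index in [first, last] to a flooring frame index.

    Assumed rule: the first ceil(count / 2) impact frames use the frame just
    before the impact period, the remaining ones use the frame just after it.
    """
    count = last - first + 1
    use_previous = math.ceil(count / 2)
    mapping = {}
    for i in range(count):
        frame = first + i
        mapping[frame] = first - 1 if i < use_previous else last + 1
    return mapping
```

For the frame numbers used in the description, `assign_flooring_frames(302, 304)` pairs 302 and 303 with 301 and pairs 304 with 305.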

（ステップＳＴ５）
次いで、衝撃音ノイズ低減処理部２５４１は、各周波数スペクトルの閾値周波数以上の周波数成分として、周波数成分ｆ３〜ｆ９について、衝撃音処理周波数スペクトルＳＳと衝撃音フロアリングスペクトルＦＳを比較する。
例えば、衝撃音ノイズ低減処理部２５４１は、周波数スペクトルＳ３０１の周波数成分ｆ３の振幅と、周波数スペクトルＳ３０２の周波数成分ｆ３の振幅とを比較する。この場合、周波数スペクトルＳ３０１の周波数成分ｆ３の振幅の方が、周波数スペクトルＳ３０２の周波数成分ｆ３の振幅に比べて小さい。よって、衝撃音ノイズ低減処理部２５４１は、周波数スペクトルＳ３０３の周波数成分ｆ３を、周波数スペクトルＳ３０１の周波数成分ｆ３に置き換える。
また、衝撃音ノイズ低減処理部２５４１は、周波数スペクトルＳ３０１の周波数成分ｆ４の振幅と、周波数スペクトルＳ３０３の周波数成分ｆ４の振幅とを比較する。この場合、周波数スペクトルＳ３０１の周波数成分ｆ４の振幅の方が、周波数スペクトルＳ３０３の周波数成分ｆ４の振幅に比べて大きい。よって、衝撃音ノイズ低減処理部２５４１は、周波数スペクトルＳ３０３の周波数成分ｆ４を、周波数スペクトルＳ３０１の周波数成分ｆ４に置き換えない。
このようにして、衝撃音ノイズ低減処理部２５４１は、周波数スペクトルＳ３０１の周波数成分の振幅の方が、周波数スペクトルＳ３０３の周波数成分の振幅に比べて小さい場合のみ、周波数スペクトルＳ３０３の周波数成分を周波数スペクトルＳ３０１の周波数成分に置き換える。
そして、衝撃音ノイズ低減処理部２５４１は、周波数スペクトルＳ３０３の周波数成分ｆ３、ｆ６〜ｆ９を、周波数成分を周波数スペクトルＳ３０１の周波数成分ｆ３、ｆ６〜ｆ９と置き換えて、衝撃音ノイズ低減処理後の周波数スペクトルＳ´３０３を駆動音ノイズ低減処理部２５４２に出力する。 (Step ST5)
Next, the impact sound noise reduction processing unit 2541 compares the impact sound processing frequency spectrum SS and the impact sound flooring spectrum FS for the frequency components f3 to f9 as frequency components equal to or higher than the threshold frequency of each frequency spectrum.
For example, the impact noise reduction processing unit 2541 compares the amplitude of the frequency component f3 of the frequency spectrum S301 with the amplitude of the frequency component f3 of the frequency spectrum S302. In this case, the amplitude of the frequency component f3 of the frequency spectrum S301 is smaller than the amplitude of the frequency component f3 of the frequency spectrum S302. Therefore, the impact sound noise reduction processing unit 2541 replaces the frequency component f3 of the frequency spectrum S303 with the frequency component f3 of the frequency spectrum S301.
Further, the impact sound noise reduction processing unit 2541 compares the amplitude of the frequency component f4 of the frequency spectrum S301 with the amplitude of the frequency component f4 of the frequency spectrum S303. In this case, the amplitude of the frequency component f4 of the frequency spectrum S301 is larger than the amplitude of the frequency component f4 of the frequency spectrum S303. Therefore, the impact sound noise reduction processing unit 2541 does not replace the frequency component f4 of the frequency spectrum S303 with the frequency component f4 of the frequency spectrum S301.
In this way, the impact sound noise reduction processing unit 2541 replaces a frequency component of the frequency spectrum S303 with the corresponding frequency component of the frequency spectrum S301 only when the amplitude of that component in the frequency spectrum S301 is smaller than its amplitude in the frequency spectrum S303.
The impact sound noise reduction processing unit 2541 then replaces the frequency components f3 and f6 to f9 of the frequency spectrum S303 with the frequency components f3 and f6 to f9 of the frequency spectrum S301, and outputs the resulting frequency spectrum S′303 after the impact sound noise reduction processing to the drive sound noise reduction processing unit 2542.
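
The component-wise replacement of step ST5 can be sketched as follows. This is a minimal illustration under stated assumptions: spectra are plain lists of complex bins, `first_bin` is a hypothetical index standing in for the threshold frequency (f3 in the text), and the amplitudes in the usage example are invented.

```python
def floor_impact_spectrum(ss, fs, first_bin):
    """Replace bins of the impact spectrum SS with bins of the flooring spectrum FS.

    Only bins at or above first_bin are examined, and a bin is replaced only
    when the flooring amplitude is smaller than the impact amplitude.
    """
    out = list(ss)
    for k in range(first_bin, len(ss)):
        if abs(fs[k]) < abs(ss[k]):
            out[k] = fs[k]
    return out
```

With `ss = [5, 5, 5, 4, 1, 3]`, `fs = [1, 1, 1, 2, 2, 1]` and `first_bin = 3`, the bins below the threshold stay untouched, bin 3 and bin 5 are replaced, and bin 4 is kept because the flooring amplitude there is larger.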

衝撃音ノイズ低減処理部２５４１は、同様にして、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ２と衝撃音フロアリングスペクトルＦＳである周波数スペクトルＳ１との比較と、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３０４と衝撃音フロアリングスペクトルＦＳである周波数スペクトルＳ３０５との比較を行う。そして、第２周波数スペクトルＳ３０１、Ｓ３０５の周波数成分の振幅の方が、それぞれ、周波数スペクトルＳ３０２、Ｓ３０４の周波数成分の振幅に比べて小さい場合のみ、周波数スペクトルＳ３０２、Ｓ３０４の周波数成分をそれぞれ周波数スペクトルＳ３０１、Ｓ３０５の周波数成分に置き換えて、衝撃音ノイズ低減処理後の周波数スペクトルＳ´３０２、Ｓ´３０４を駆動音ノイズ低減処理部２５４２に出力する。 Similarly, the impact sound noise reduction processing unit 2541 compares the frequency spectrum S2, which is an impact sound processing frequency spectrum SS, with the frequency spectrum S1, which is the impact sound flooring spectrum FS, and compares the frequency spectrum S304, which is an impact sound processing frequency spectrum SS, with the frequency spectrum S305, which is the impact sound flooring spectrum FS. Only when the amplitude of a frequency component of the second frequency spectra S301 and S305 is smaller than the amplitude of the corresponding frequency component of the frequency spectra S302 and S304, respectively, does it replace the frequency components of the frequency spectra S302 and S304 with those of the frequency spectra S301 and S305, respectively. It then outputs the frequency spectra S′302 and S′304 after the impact sound noise reduction processing to the drive sound noise reduction processing unit 2542.

また、衝撃音ノイズ低減処理部２５４１は、同様にして、衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３０９、Ｓ３１０と衝撃音フロアリングスペクトルＦＳである周波数スペクトルＳ３０８との比較、および衝撃音処理周波数スペクトルＳＳである周波数スペクトルＳ３１１、Ｓ３１２と衝撃音フロアリングスペクトルＦＳである周波数スペクトルＳ３１３との比較を行う。そして、衝撃音フロアリングスペクトルＦＳである周波数スペクトルＳ３０８の周波数成分の振幅の方が、それぞれ、周波数スペクトルＳ３０９、Ｓ３１０の周波数成分の振幅に比べて小さい場合のみ、周波数スペクトルＳ３０９、Ｓ３１０の周波数成分をそれぞれ周波数スペクトルＳ３０８の周波数成分に置き換えて、衝撃音ノイズ低減処理後の周波数スペクトルＳ´３０９、Ｓ´３１０を駆動音ノイズ低減処理部２５４２に出力する。同様にして、衝撃音フロアリングスペクトルＦＳである周波数スペクトルＳ３１３の周波数成分の振幅の方が、それぞれ、周波数スペクトルＳ３１１、Ｓ３１２の周波数成分の振幅に比べて小さい場合のみ、周波数スペクトルＳ３１１、Ｓ３１２の周波数成分をそれぞれ周波数スペクトルＳ３１３の周波数成分に置き換えて、衝撃音ノイズ低減処理後の周波数スペクトルＳ´３１１、Ｓ´３１２を駆動音ノイズ低減処理部２５４２に出力する。 Similarly, the impact sound noise reduction processing unit 2541 compares the frequency spectra S309 and S310, which are impact sound processing frequency spectra SS, with the frequency spectrum S308, which is the impact sound flooring spectrum FS, and compares the frequency spectra S311 and S312, which are impact sound processing frequency spectra SS, with the frequency spectrum S313, which is the impact sound flooring spectrum FS. Only when the amplitude of a frequency component of the frequency spectrum S308, the impact sound flooring spectrum FS, is smaller than the amplitude of the corresponding frequency component of the frequency spectra S309 and S310, respectively, does it replace the frequency components of the frequency spectra S309 and S310 with those of the frequency spectrum S308, and it outputs the frequency spectra S′309 and S′310 after the impact sound noise reduction processing to the drive sound noise reduction processing unit 2542. Similarly, only when the amplitude of a frequency component of the frequency spectrum S313, the impact sound flooring spectrum FS, is smaller than the amplitude of the corresponding frequency component of the frequency spectra S311 and S312, respectively, does it replace the frequency components of the frequency spectra S311 and S312 with those of the frequency spectrum S313, and it outputs the frequency spectra S′311 and S′312 after the impact sound noise reduction processing to the drive sound noise reduction processing unit 2542.

（ステップＳＴ６）
次いで、駆動音ノイズ低減処理部２５４２は、フーリエ変換部２５３から入力するマイク音信号の周波数スペクトルと、衝撃音ノイズ低減処理部２５４１から入力する衝撃音ノイズ低減処理後の周波数スペクトルに基づき、駆動音ノイズ低減処理を実行する。例えば、駆動音ノイズ低減処理部２５４２は、動作開始タイミングｔ１０に対応する衝撃音発生期間ｔ１０〜ｔ１１を示す動作タイミングと、動作停止タイミングｔ２０に対応する衝撃音発生期間ｔ２０〜ｔ２１を示す動作タイミング信号に基づき、駆動音が発生している可能性のある期間に対応する周波数スペクトルＳ３０２〜Ｓ３１２を、駆動音処理周波数スペクトルＫＳとして取得する。
この駆動音ノイズ低減処理部２５４２は、取得した駆動音処理周波数スペクトルＫＳである周波数スペクトルＳ３０２〜Ｓ３１２のうち、衝撃音ノイズ低減処理後の周波数スペクトルに対応する周波数スペクトルをＳ３０２〜Ｓ３０４、Ｓ３０９〜Ｓ３１２を、衝撃音ノイズ低減処理後の周波数スペクトルＳ´３０２、Ｓ´３０３、Ｓ´３０４、Ｓ´３０９、Ｓ´３１０、Ｓ´３１１、Ｓ´３１２に置き換える。
そして、駆動音ノイズ低減処理部２５４２は、衝撃音ノイズ低減処理後の周波数スペクトルＳ´３０２、Ｓ´３０３、Ｓ´３０４、Ｓ´３０９、Ｓ´３１０、Ｓ´３１１、Ｓ´３１２と、周波数スペクトルＳ３０５〜Ｓ３０７に対して駆動音ノイズ低減処理を実行する。つまり、駆動音ノイズ低減処理部２５４２は、駆動パターンに応じて予め決められているノイズを表わす周波数スペクトルの周波数成分を、衝撃音ノイズ低減処理後の駆動音処理周波数スペクトルＫＳである周波数スペクトルＳ´３０２〜Ｓ´３０４、Ｓ３０５〜３０７、Ｓ´３０９〜Ｓ´３１２の周波数成分からそれぞれ減算する。駆動音ノイズ低減処理部２５４２は、この駆動音ノイズ低減処理後の周波数スペクトルＳ４０１，Ｓ４０２，Ｓ４０３，Ｓ４０４，Ｓ４０５，Ｓ４０６・・・を逆フーリエ変換部２５５に出力する。 (Step ST6)
Next, the drive sound noise reduction processing unit 2542 performs drive sound noise reduction processing based on the frequency spectrum of the microphone sound signal input from the Fourier transform unit 253 and the frequency spectrum after the impact sound noise reduction processing input from the impact sound noise reduction processing unit 2541. For example, based on the operation timing signal indicating the impact sound generation period t10 to t11 corresponding to the operation start timing t10 and the operation timing signal indicating the impact sound generation period t20 to t21 corresponding to the operation stop timing t20, the drive sound noise reduction processing unit 2542 acquires the frequency spectra S302 to S312, which correspond to the period in which a drive sound may be occurring, as the drive sound processing frequency spectrum KS.
Among the acquired frequency spectra S302 to S312, which form the drive sound processing frequency spectrum KS, the drive sound noise reduction processing unit 2542 replaces the frequency spectra S302 to S304 and S309 to S312, for which spectra after the impact sound noise reduction processing exist, with the frequency spectra S′302, S′303, S′304, S′309, S′310, S′311, and S′312 after the impact sound noise reduction processing.
The drive sound noise reduction processing unit 2542 then performs drive sound noise reduction processing on the frequency spectra S′302, S′303, S′304, S′309, S′310, S′311, and S′312 after the impact sound noise reduction processing and on the frequency spectra S305 to S307. That is, the drive sound noise reduction processing unit 2542 subtracts the frequency components of a frequency spectrum representing noise, which is predetermined according to the drive pattern, from the frequency components of the frequency spectra S′302 to S′304, S305 to S307, and S′309 to S′312, which form the drive sound processing frequency spectrum KS after the impact sound noise reduction processing. The drive sound noise reduction processing unit 2542 outputs the frequency spectra S401, S402, S403, S404, S405, S406, ... after the drive sound noise reduction processing to the inverse Fourier transform unit 255.
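
The subtraction in step ST6 can be sketched as follows. The text says only that the frequency components of a predetermined noise spectrum are subtracted. The choices made here, namely subtracting amplitudes, clamping the result at zero, and keeping the original phase, are assumptions borrowed from ordinary spectral subtraction. The function name and the sample values are hypothetical.

```python
def subtract_drive_noise(spectrum, noise_amplitude):
    """Subtract a predetermined noise amplitude from each bin of one spectrum.

    Assumptions: amplitude-domain subtraction, result clamped at zero,
    phase of the input bin preserved.
    """
    out = []
    for x, n in zip(spectrum, noise_amplitude):
        amplitude = abs(x)
        reduced = max(amplitude - n, 0.0)
        # Scale the complex bin so that its amplitude becomes `reduced`.
        out.append(x * (reduced / amplitude) if amplitude > 0 else 0j)
    return out
```

For example, a bin of `3 + 4j` (amplitude 5) with a noise amplitude of 2.5 becomes `1.5 + 2j`, while a bin whose amplitude is below the noise estimate is set to zero.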

（ステップＳＴ７）
逆フーリエ変換部２５５は、ノイズ低減処理部２５４によって信号処理された駆動音ノイズ低減処理後の周波数スペクトルＳ４０１，Ｓ４０２，Ｓ４０３，Ｓ４０４，Ｓ４０５，Ｓ４０６・・・に対して、例えば逆フーリエ変換を行うことで、時間領域に変換する。この逆フーリエ変換部２５５は、時間領域に変換された音時間信号Ｓ５０１，Ｓ５０２，Ｓ５０３，Ｓ５０４，Ｓ５０５，Ｓ５０６・・・を、連結調整部２５６に出力する。 (Step ST7)
The inverse Fourier transform unit 255 converts the frequency spectra S401, S402, S403, S404, S405, S406, ... after the drive sound noise reduction processing by the noise reduction processing unit 254 into the time domain, for example by performing an inverse Fourier transform. The inverse Fourier transform unit 255 outputs the sound time signals S501, S502, S503, S504, S505, S506, ... converted to the time domain to the connection adjustment unit 256.

（ステップＳＴ８）
連結調整部２５６は、逆フーリエ変換部２５５から入力された音時間信号Ｓ５０１，Ｓ５０２，Ｓ５０３，Ｓ５０４，Ｓ５０５，Ｓ５０６・・・のそれぞれに連結調整窓関数Ｗ_３を乗算する。そして、連結調整部２５６は、連結調整窓関数Ｗ_３で重み付けされた音時間信号Ｓ６０１，Ｓ６０２，Ｓ６０３，Ｓ６０４，Ｓ６０５，Ｓ６０６・・・を信号重ね合わせ部２５７に出力する。 (Step ST8)
The connection adjustment unit 256 multiplies each of the sound time signals S501, S502, S503, S504, S505, S506, ... input from the inverse Fourier transform unit 255 by the connection adjustment window function W₃. The connection adjustment unit 256 then outputs the sound time signals S601, S602, S603, S604, S605, S606, ... weighted by the connection adjustment window function W₃ to the signal superposition unit 257.

信号重ね合わせ部２５７は、連結調整部２５６から入力する音時間信号Ｓ６０１，Ｓ６０２，Ｓ６０３，Ｓ６０４，Ｓ６０５，Ｓ６０６・・・に基づき、もとのマイク音信号の配置にあわせてつなぎ合わせる。そして、信号重ね合わせ部２５７は、音時間信号Ｓ６０１，Ｓ６０２，Ｓ６０３，Ｓ６０４，Ｓ６０５，Ｓ６０６・・・をつなぎ合わせた音情報を、記憶媒体２００に記憶させる。 The signal superposition unit 257 joins the sound time signals S601, S602, S603, S604, S605, S606, ... input from the connection adjustment unit 256 in accordance with their positions in the original microphone sound signal. The signal superposition unit 257 then stores the sound information obtained by joining the sound time signals S601, S602, S603, S604, S605, S606, ... in the storage medium 200.
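
Steps ST3, ST7, and ST8 together can be sketched as follows, taking the connection adjustment window as W₃ = Hanning window W₂ / Hamming window W₁, the ratio given in the window-function discussion of this description. The 50 % frame overlap, the periodic window forms, the function names, and the omission of any spectral processing between analysis and synthesis are assumptions made so that the effect of W₃ is visible in isolation.

```python
import math


def hamming(n_len):
    """Hamming window W1 (periodic form, assumed)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / n_len) for n in range(n_len)]


def hanning(n_len):
    """Hanning window W2 (periodic form, assumed)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_len) for n in range(n_len)]


def connection_window(n_len):
    """Connection adjustment window W3 = W2 / W1; W1 never reaches zero."""
    return [w2 / w1 for w1, w2 in zip(hamming(n_len), hanning(n_len))]


def process(signal, n_len=8):
    """Window with W1, skip spectral processing, weight with W3, overlap-add."""
    hop = n_len // 2  # assumed 50 % overlap
    w1, w3 = hamming(n_len), connection_window(n_len)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - n_len + 1, hop):
        for n in range(n_len):
            analysed = signal[start + n] * w1[n]  # S201.. (and S501.. here)
            out[start + n] += analysed * w3[n]    # S601.., summed by unit 257
    return out
```

Because W₁ × W₃ equals the Hanning window, whose half-overlapped copies sum to one, every sample covered by two frames is reconstructed unchanged, and each frame tapers to zero at its ends so that the joints are continuous.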

以上説明したように、本実施形態に係る撮像装置１００は、動作タイミング検出部１９１によって動作部の動作状態が変化するタイミングを検出するとともに、このタイミング信号に基づき、衝撃音が重畳している可能性のあるマイク音信号の周波数スペクトルの一部を、衝撃音が重畳していない可能性のあるマイク音信号の周波数スペクトルの一部と置き換える衝撃音ノイズ低減処理を実行する。これにより、周波数スペクトルの帯域が広い衝撃音であっても、目的音の不連続性が目立たず、かつ、衝撃音を低減した音情報を取得することができる。 As described above, the imaging apparatus 100 according to the present embodiment detects, with the operation timing detection unit 191, the timing at which the operation state of the operation unit changes and, based on this timing signal, performs impact sound noise reduction processing that replaces part of the frequency spectrum of a microphone sound signal on which an impact sound may be superimposed with part of the frequency spectrum of a microphone sound signal on which an impact sound may not be superimposed. As a result, even for an impact sound with a wide frequency band, sound information can be acquired in which the discontinuity of the target sound is inconspicuous and the impact sound is reduced.

［窓関数の他の例］
なお、本発明に係る低減処理部２５０は、上述の実施形態に限られず、例えば、以下に説明するような窓関数を用いることができる。
例えば、ハミング窓処理部２５２で窓関数にハミング窓関数Ｗ_１を利用した場合、連結調整部２５６は、連結調整窓関数Ｗ_３＝ハニング窓関数Ｗ_２／ハミング窓関数Ｗ_１を、音時間信号Ｓ５０１，Ｓ５０２，Ｓ５０３，Ｓ５０４，Ｓ５０５，Ｓ５０６・・・のそれぞれに乗算する例に説明したが、これ以外の窓関数を乗算するものであってもよい。この連結調整部２５６は、例えば、図２４に示すような窓関数Ｗ_４を分子とする連結調整窓関数Ｗ_５＝窓関数Ｗ_４／ハミング窓関数Ｗ_１を、音時間信号Ｓ５０１，Ｓ５０２，Ｓ５０３，Ｓ５０４，Ｓ５０５，Ｓ５０６・・・のそれぞれに乗算するものであってもよい。
なお、窓関数Ｗ_４は、以下の式（４）に示す。また、連結調整窓関数Ｗ_５は、以下の式（５）に示す。式（４）と式（５）では、窓の範囲を０〜Ｔとし、変数を時刻ｔ（０≦ｔ≦Ｔ）で示す。 [Other examples of window functions]
Note that the reduction processing unit 250 according to the present invention is not limited to the embodiment described above and can, for example, use window functions such as those described below.
For example, the description above covered the case where the Hamming window processing unit 252 uses the Hamming window function W₁ as the window function and the connection adjustment unit 256 multiplies each of the sound time signals S501, S502, S503, S504, S505, S506, ... by the connection adjustment window function W₃ = Hanning window function W₂ / Hamming window function W₁; however, a different window function may be multiplied instead. For example, the connection adjustment unit 256 may multiply each of the sound time signals S501, S502, S503, S504, S505, S506, ... by a connection adjustment window function W₅ = window function W₄ / Hamming window function W₁, whose numerator is the window function W₄ shown in FIG. 24.
The window function W₄ is given by equation (4) below, and the connection adjustment window function W₅ is given by equation (5) below. In equations (4) and (5), the window spans 0 to T and the variable is time t (0 ≦ t ≦ T).

本実施形態のように、ハミング窓処理部２５２が、入力する音時間信号Ｓ１０１，Ｓ１０２，Ｓ１０３，Ｓ１０４，Ｓ１０５，Ｓ１０６・・・にハミング窓関数Ｗ_１を乗算する場合、連結調整部２５６の連結調整窓関数としては、窓の両端の値がハミング窓関数Ｗ_１の両端の値より小さい窓関数Ｗ_４（ハミング窓関数Ｗ_１＞窓関数Ｗ_４）を利用することができる。これにより、連結調整部２５６が音時間信号Ｓ５０１，Ｓ５０２，Ｓ５０３，Ｓ５０４，Ｓ５０５，Ｓ５０６・・・にそれぞれ乗算する連結調整窓関数Ｗ_５の両端の値を１より小さくすることができる。また、連結調整部２５６は、中央よりも両端の値が小さい連結調整窓関数Ｗ_５を、音時間信号Ｓ５０１，Ｓ５０２，Ｓ５０３，Ｓ５０４，Ｓ５０５，Ｓ５０６・・・のそれぞれに乗算することができる。よって、連結調整部２５６が連結調整窓関数Ｗ_５を乗算することにより、音時間信号Ｓ６０１，Ｓ６０２，Ｓ６０３，Ｓ６０４，Ｓ６０５，Ｓ６０６・・・間のつなぎ目が不連続となることを低減することができる。なお、連結調整窓関数Ｗ_５＝窓関数Ｗ_４／ハミング窓関数Ｗ_１は、図２５に示す通り、窓の両端の値は、１よりも小さい０．５になっている。 As in the present embodiment, when the Hamming window processing unit 252 multiplies the input sound time signals S101, S102, S103, S104, S105, S106, ... by the Hamming window function W₁, the connection adjustment window function of the connection adjustment unit 256 can use a window function W₄ whose values at both ends of the window are smaller than the values at both ends of the Hamming window function W₁ (Hamming window function W₁ > window function W₄). This makes the values at both ends of the connection adjustment window function W₅, by which the connection adjustment unit 256 multiplies each of the sound time signals S501, S502, S503, S504, S505, S506, ..., smaller than 1. The connection adjustment unit 256 can also multiply each of the sound time signals S501, S502, S503, S504, S505, S506, ... by a connection adjustment window function W₅ whose values at both ends are smaller than its value at the center. Multiplying by the connection adjustment window function W₅ therefore reduces discontinuities at the joints between the sound time signals S601, S602, S603, S604, S605, S606, .... As shown in FIG. 25, the values at both ends of the connection adjustment window function W₅ = window function W₄ / Hamming window function W₁ are 0.5, which is smaller than 1.

このように、連結調整窓関数は、窓の両端の値が０（ゼロ）であることが好ましいが、少なくとも、窓の両端の値が中央の値に比べて小さくなっている窓関数であればよい。これにより、つなぎ目が不連続となることを低減し、つなぎ目に生じるおそれのあるノイズを低減させることができる。 As described above, the connection adjustment window function preferably has a value of 0 (zero) at both ends of the window, but it suffices that it be a window function whose values at both ends of the window are at least smaller than its value at the center. This reduces discontinuities at the joints and reduces the noise that may arise there.

また、ハミング窓処理部２５２は、ハミング窓関数Ｗ_１に限られず、窓の両端の値が中央の値に比べて小さい窓関数を、入力する音時間信号Ｓ１０１，Ｓ１０２，Ｓ１０３，Ｓ１０４，Ｓ１０５，Ｓ１０６・・・に乗算することができる。例えば、ハミング窓処理部２５２は、ブラックマン・ハリス窓関数、ブラック・ナトール窓関数、フラットトップ窓関数、テューキー（Tukey）ウィンドウ関数、ランチョス窓関数、三角窓関数、ガウス窓関数等を入力する音時間信号に乗算することができる。
この場合、連結調整部２５６が音時間信号Ｓ５０１，Ｓ５０２，Ｓ５０３，Ｓ５０４，Ｓ５０５，Ｓ５０６・・・にそれぞれ乗算する連結調整窓関数の分母の窓関数は、入力する音時間信号にハミング窓処理部２５２が乗算する窓関数と一致していることが好ましい。また、連結調整部２５６が音時間信号Ｓ５０１，Ｓ５０２，Ｓ５０３，Ｓ５０４，Ｓ５０５，Ｓ５０６・・・にそれぞれ乗算する連結調整窓関数の分子の窓関数は、その窓の両端の値が分母の窓関数の両端の値より小さい窓関数であることが好ましい。これにより、連結調整窓関数の両端の値を１より小さくすることができる。また、連結調整部２５６は、中央よりも両端の値が小さい連結調整窓関数を、音時間信号Ｓ５０１，Ｓ５０２，Ｓ５０３，Ｓ５０４，Ｓ５０５，Ｓ５０６・・・のそれぞれに乗算することができる。これにより、つなぎ目が不連続となることを低減し、つなぎ目に生じるおそれのあるノイズを低減させることができる。 The Hamming window processing unit 252 is not limited to the Hamming window function W₁ and can multiply the input sound time signals S101, S102, S103, S104, S105, S106, ... by any window function whose values at both ends of the window are smaller than its value at the center. For example, the Hamming window processing unit 252 can multiply the input sound time signals by a Blackman-Harris window function, a Blackman-Nuttall window function, a flat-top window function, a Tukey window function, a Lanczos window function, a triangular window function, a Gaussian window function, or the like.
In this case, the window function in the denominator of the connection adjustment window function, by which the connection adjustment unit 256 multiplies each of the sound time signals S501, S502, S503, S504, S505, S506, ..., preferably matches the window function by which the Hamming window processing unit 252 multiplies the input sound time signals. The window function in the numerator of the connection adjustment window function is preferably one whose values at both ends of the window are smaller than the values at both ends of the denominator window function. This makes the values at both ends of the connection adjustment window function smaller than 1. The connection adjustment unit 256 can also multiply each of the sound time signals S501, S502, S503, S504, S505, S506, ... by a connection adjustment window function whose values at both ends are smaller than its value at the center. This reduces discontinuities at the joints and reduces the noise that may arise there.

［オーバーラップする範囲が窓の１／４となる窓関数の一例］
また、ハミング窓処理部２５２は、図２７に示すような窓関数Ｗ_７を、入力する音時間信号Ｓ１０１，Ｓ１０２，Ｓ１０３，Ｓ１０４，Ｓ１０５，Ｓ１０６・・・に乗算するものであってもよい。この窓関数Ｗ_７は、以下の式（７−１）〜（７−３）で示される。なお、式（７−１）〜（７−３）では、窓の範囲を０〜Ｔとし、変数を時刻ｔ（０≦ｔ≦Ｔ）で示す。 [An example of a window function where the overlapping range is 1/4 of the window]
The Hamming window processing unit 252 may also multiply the input sound time signals S101, S102, S103, S104, S105, S106, ... by a window function W₇ such as that shown in FIG. 27. The window function W₇ is given by equations (7-1) to (7-3) below. In equations (7-1) to (7-3), the window spans 0 to T and the variable is time t (0 ≦ t ≦ T).

式（７−１）〜（７−３）に示した通り、窓関数Ｗ_７は、窓の両端の範囲（０≦ｔ≦Ｔ／４，３Ｔ／４≦ｔ≦Ｔ）がハミング窓Ｗ_１であり、窓の中央の範囲（Ｔ／４≦ｔ≦３Ｔ／４）が平坦となる関数である。 As shown in equations (7-1) to (7-3), the window function W₇ follows the Hamming window W₁ in the ranges at both ends of the window (0 ≦ t ≦ T/4, 3T/4 ≦ t ≦ T) and is flat in the central range of the window (T/4 ≦ t ≦ 3T/4).

この場合、連結調整部２５６は、連結調整窓関数Ｗ_８＝窓関数Ｗ_６／窓関数Ｗ_７を、音時間信号Ｓ５０１，Ｓ５０２，Ｓ５０３，Ｓ５０４，Ｓ５０５，Ｓ５０６・・・のそれぞれに乗算する。この窓関数Ｗ_６の一例を、図２６に示す。連結調整窓関数Ｗ_８の一例を、図２８に示す。
また、この窓関数Ｗ_６は、以下の式（６−１）〜（６−３）で示される。連結調整窓関数Ｗ_８は、以下の式（８）で示される。なお、式（６−１）〜（６−３）、式（８）では、窓の範囲を０〜Ｔとし、変数を時刻ｔ（０≦ｔ≦Ｔ）で示す。 In this case, the connection adjustment unit 256 multiplies each of the sound time signals S501, S502, S503, S504, S505, S506, ... by the connection adjustment window function W₈ = window function W₆ / window function W₇. An example of the window function W₆ is shown in FIG. 26, and an example of the connection adjustment window function W₈ is shown in FIG. 28.
The window function W₆ is given by equations (6-1) to (6-3) below, and the connection adjustment window function W₈ is given by equation (8) below. In equations (6-1) to (6-3) and equation (8), the window spans 0 to T and the variable is time t (0 ≦ t ≦ T).

式（６−１）〜（６−３）に示した通り、窓関数Ｗ_６は、窓の両端の範囲（０≦ｔ≦Ｔ／４，３Ｔ／４≦ｔ≦Ｔ）がハニング窓関数Ｗ_２であり、窓の中央の範囲（Ｔ／４≦ｔ≦３Ｔ／４）が平坦となる関数である。
本実施形態において、窓関数Ｗ_６、窓関数Ｗ_７を用いることにより、音時間信号がオーバーラップする範囲を窓の１／４とすることができる。このように、窓の中央の範囲が平坦（＝１）である窓関数を用いることにより、オーバーラップさせる領域を少なくし、フーリエ変換の演算フレームを削減することができる。 As shown in equations (6-1) to (6-3), the window function W₆ follows the Hanning window function W₂ in the ranges at both ends of the window (0 ≦ t ≦ T/4, 3T/4 ≦ t ≦ T) and is flat in the central range of the window (T/4 ≦ t ≦ 3T/4).
In the present embodiment, using the window functions W₆ and W₇ allows the range over which the sound time signals overlap to be 1/4 of the window. Using window functions whose central range is flat (= 1) in this way reduces the overlapping region and reduces the number of frames on which the Fourier transform is computed.
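
The flat-centered windows can be sketched as follows. Equations (6-1) to (8) are not reproduced in this text, so the edge shape used here is an assumption: each edge is a half cosine compressed into one quarter of the window, rising from the end value to 1 and falling back symmetrically, with Hanning coefficients for W₆ and Hamming coefficients for W₇. The function name, the coefficient arguments `a0` and `a1`, and the 16-sample window length are hypothetical.

```python
import math


def flat_center_window(n_len, a0, a1):
    """Window with cosine edges over the outer quarters and a flat center of 1.

    a0, a1 = 0.5, 0.5 gives Hanning-shaped edges (W6 in the text);
    a0, a1 = 0.54, 0.46 gives Hamming-shaped edges (W7 in the text).
    """
    edge = n_len // 4
    window = []
    for n in range(n_len):
        if n < edge:                      # rising edge, 0 <= t < T/4
            window.append(a0 - a1 * math.cos(math.pi * n / edge))
        elif n < n_len - edge:            # flat center, T/4 <= t < 3T/4
            window.append(1.0)
        else:                             # falling edge, 3T/4 <= t < T
            m = n - (n_len - edge)
            window.append(a0 + a1 * math.cos(math.pi * m / edge))
    return window


w6 = flat_center_window(16, 0.5, 0.5)
w7 = flat_center_window(16, 0.54, 0.46)
w8 = [num / den for num, den in zip(w6, w7)]  # connection adjustment window W8
```

With a hop of three quarters of the window, the falling edge of one W₆-weighted frame and the rising edge of the next sum to one, so only a quarter of each window overlaps its neighbor. Under these assumptions W₈ is 0 at the start of the window and 1 across the flat center.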

［第２実施形態］
次に、図２９を参照して、本発明に係る第２実施形態について説明する。図２９は、第１実施形態に係る低減処理部２５０を備える信号処理装置５００の一例を説明するための図である。
信号処理装置５００は、低減処理部２５０を備える。この信号処理装置５００としては、例えば、パーソナルコンピュータやスマートフォン、タブレット型の端末等が利用可能である。 [Second Embodiment]
Next, a second embodiment according to the present invention will be described with reference to FIG. 29. FIG. 29 is a diagram for explaining an example of the signal processing device 500 including the reduction processing unit 250 according to the first embodiment.
The signal processing device 500 includes a reduction processing unit 250. As the signal processing device 500, for example, a personal computer, a smartphone, a tablet terminal, or the like can be used.

この場合、撮像装置１００は、マイク２３０が集音したマイク音信号と、動作タイミング検出部１９１から出力される動作タイミング信号とを、それぞれ関連付けて記憶部１６０や記憶媒体２００に記憶しておく。なお、撮像装置１００は、計時部２２０によって計時された日時情報に基づき、マイク音信号が収音された時刻に従って、マイク音信号が収音された期間に生じた動作タイミングを示す動作タイミング信号とマイク音信号とを、それぞれ関連付けることができる。
具体的に説明すると、Ａ/Ｄ変換部２４０は、マイク２３０によって収音されたマイク音信号と、このマイク音信号を録音した装置が備えている動作部が動作するタイミングを示す情報（例えば、動作タイミング検出部１９１から出力される動作タイミング信号）とを、それぞれ関連付けて、記憶部１６０や記憶媒体２００に記憶しておく。この場合、それぞれ関連付けて記憶されるマイク音信号とタイミングを示す情報（動作タイミング信号）とは、同一のファイルに書き込まれるものであってもよく、別々のファイルに書き込まれファイル同士がマイク音信号の収音された時刻とタイミングを示す情報の時刻に従って関連付けられるものであってもよい。 In this case, the imaging apparatus 100 stores the microphone sound signal collected by the microphone 230 and the operation timing signal output from the operation timing detection unit 191 in the storage unit 160 or the storage medium 200 in association with each other. Note that the imaging apparatus 100 includes an operation timing signal indicating an operation timing generated during a period in which the microphone sound signal is picked up according to the time at which the microphone sound signal is picked up based on the date and time information timed by the time measuring unit 220. The microphone sound signal can be associated with each other.
More specifically, the A/D conversion unit 240 stores, in association with each other in the storage unit 160 or the storage medium 200, the microphone sound signal picked up by the microphone 230 and information indicating the timing at which an operation unit of the device that recorded the microphone sound signal operates (for example, the operation timing signal output from the operation timing detection unit 191). In this case, the microphone sound signal and the timing information (operation timing signal) stored in association with each other may be written to the same file, or may be written to separate files that are associated with each other according to the time at which the microphone sound signal was picked up and the time of the timing information.

そして、撮像装置１００と信号処理装置５００が、通信部１７０と通信部５７０を介して接続された場合、記憶部１６０や記憶媒体２００に記憶されている、それぞれ関連付けて記憶されるマイク音信号とタイミングを示す情報（動作タイミング信号）が信号処理装置５００に搭載された低減処理部２５０に出力される。
これにより、低減処理部２５０は、撮像装置１００の外部において、ノイズ低減処理を実行することができる。 When the imaging device 100 and the signal processing device 500 are connected via the communication unit 170 and the communication unit 570, the microphone sound signals stored in the storage unit 160 and the storage medium 200 are stored in association with each other. Information indicating the timing (operation timing signal) is output to the reduction processing unit 250 mounted on the signal processing device 500.
Thereby, the reduction processing unit 250 can perform noise reduction processing outside the imaging apparatus 100.

このように、第１実施形態では、低減処理部２５０が、マイク２３０により収音されたマイク音信号に対して信号処理する例について説明したが、本実施形態に係る低減処理部２５０は、このようなリアルタイムに収音されたマイク音信号に対してのみ適用されるものではない。
撮像装置１００の外部においてもノイズ低減処理を実行することにより、撮像装置の撮像処理等の処理負荷を軽減することができる。また、ユーザの所望する任意の時間において、ノイズ低減処理を実行することができるため、撮像装置１００がノイズ低減処理を実行することによる撮像装置１００の消費電力を抑えることができる。よって、外出先において撮像装置１００の消費電力の消耗を軽減することができる。 As described above, in the first embodiment, the example in which the reduction processing unit 250 performs signal processing on the microphone sound signal collected by the microphone 230 has been described. However, the reduction processing unit 250 according to the present embodiment is The present invention is not applied only to such microphone sound signals collected in real time.
By executing the noise reduction processing also outside the imaging apparatus 100, it is possible to reduce processing load such as imaging processing of the imaging apparatus. In addition, since the noise reduction process can be executed at an arbitrary time desired by the user, the power consumption of the imaging apparatus 100 due to the imaging apparatus 100 executing the noise reduction process can be suppressed. Therefore, it is possible to reduce power consumption of the imaging apparatus 100 when away from home.

なお、撮像装置１００や低減処理部２５０等による手順を実現するためのプログラムをコンピュータ読み取り可能な記録媒体に記録して、この記録媒体に記録されたプログラムをコンピュータシステムに読み込ませ、実行することにより、実行処理を行ってもよい。なお、ここでいう「コンピュータシステム」とは、ＯＳ（ＯｐｅｒａｔｉｎｇＳｙｓｔｅｍ）や周辺機器等のハードウェアを含むものであってもよい。 A program for realizing the procedure by the imaging device 100, the reduction processing unit 250, etc. is recorded on a computer-readable recording medium, and the program recorded on the recording medium is read into a computer system and executed. Execution processing may be performed. Here, the “computer system” may include hardware such as an OS (Operating System) and peripheral devices.

If a WWW system is used, the “computer system” also includes a homepage providing environment (or display environment). The “computer-readable recording medium” refers to a flexible disk, a magneto-optical disk, a ROM, a writable nonvolatile memory such as a flash memory, a portable medium such as a CD-ROM, or a storage device such as a hard disk built into a computer system.

Furthermore, the “computer-readable recording medium” also includes a medium that holds a program for a certain period of time, such as a volatile memory (for example, a DRAM (Dynamic Random Access Memory)) inside a computer system that serves as a server or a client when the program is transmitted via a network such as the Internet or a communication line such as a telephone line.
The program may be transmitted from a computer system that stores the program in a storage device or the like to another computer system via a transmission medium, or by a transmission wave in the transmission medium. Here, the “transmission medium” that transmits the program refers to a medium having a function of transmitting information, such as a network (communication network) like the Internet or a communication line such as a telephone line.
The program may realize only a part of the functions described above.
Furthermore, the program may be a so-called difference file (difference program), which realizes the functions described above in combination with a program already recorded in the computer system.

DESCRIPTION OF SYMBOLS
100: imaging device, 110: imaging unit, 120: lens CPU, 130: buffer memory, 140: image processing unit, 150: display unit, 160: storage unit, 170: communication unit, 180: operation unit, 190: body CPU, 220: timekeeping unit, 230: microphone, 240: A/D conversion unit, 250: reduction processing unit, 251: sound signal cutout unit, 252: Hamming window processing unit (window processing unit), 253: Fourier transform unit, 254: noise reduction processing unit, 255: inverse Fourier transform unit, 256: connection adjustment unit, 257: sound signal superposition unit, 260: battery, 191: operation timing detection unit, 2541: impact sound noise reduction processing unit, 2542: drive sound noise reduction processing unit

Claims

1. A signal processing device comprising:
a conversion unit that converts an input sound signal into a frequency spectrum in a frequency domain;
a processing unit that performs signal processing on the frequency spectrum converted by the conversion unit;
an inverse conversion unit that converts the frequency spectrum signal-processed by the processing unit into a sound time signal in a time domain; and
a connection adjustment unit that multiplies the sound time signal converted by the inverse conversion unit by a window function whose values at both ends of a unit interval are smaller than its value at the center.

2. The signal processing device according to claim 1, wherein the connection adjustment unit multiplies the sound time signal converted by the inverse conversion unit by a window function whose values at both ends of the unit interval are zero.

3. The signal processing device according to claim 1, wherein the connection adjustment unit multiplies the sound time signal converted by the inverse conversion unit by a Hanning window function.

4. The signal processing device according to claim 1, further comprising a window processing unit that multiplies the sound signal input to the conversion unit by a window function whose values at both ends of a unit interval are smaller than its value at the center.

5. The signal processing device according to claim 4, wherein
the window processing unit multiplies the sound time signal input to the conversion unit by a Hamming window function, and
the connection adjustment unit multiplies the sound time signal converted by the inverse conversion unit by the quotient of a Hanning window function divided by a Hamming window function (Hanning window function / Hamming window function).
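
The window pairing in the claim above can be checked numerically. The following is a minimal sketch, not the applicant's implementation: it assumes periodic window definitions, a frame length `N` of 8 samples, and a half-frame overlap between consecutive frames, none of which are stated in the claim. The function names are hypothetical. Under those assumptions, the Hamming window applied before the transform and the Hanning / Hamming quotient applied after the inverse transform combine into a plain Hanning window, whose overlapping halves sum to 1.

```python
import math

def hann(n_len):
    """Periodic Hanning window (assumed definition): zero at n = 0, one at the centre."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_len) for n in range(n_len)]

def hamming(n_len):
    """Periodic Hamming window (assumed definition): minimum value 0.08, never zero."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / n_len) for n in range(n_len)]

def connection_window(n_len):
    """Hanning / Hamming quotient; safe to compute because Hamming is never zero."""
    return [h / m for h, m in zip(hann(n_len), hamming(n_len))]

N = 8                                   # illustrative frame length (assumption)
analysis = hamming(N)                   # applied before the transform
synthesis = connection_window(N)        # applied after the inverse transform
combined = [a * s for a, s in zip(analysis, synthesis)]  # net weight per sample

# With a hop of N/2 (assumption), each sample is weighted by two overlapping frames.
half = N // 2
overlap_sum = [combined[n] + combined[n + half] for n in range(half)]
```

If the spectrum is left unmodified, the net weighting is therefore constant across frame boundaries, which is one way to read the role of the connection adjustment unit.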

6. The signal processing device according to claim 4, wherein
the window processing unit multiplies the sound time signal input to the conversion unit by a window function W_7 represented by the following equation:

and the connection adjustment unit multiplies the sound time signal converted by the inverse conversion unit by a window function W_8 represented by the following equation:

7. A signal processing device comprising:
a sound signal cutout unit that cuts out a sound time signal in units of frames by dividing an input sound signal into frames of a predetermined time length;
a window processing unit that multiplies the sound time signal cut out by the sound signal cutout unit by a first window function whose values at both ends of a unit interval are smaller than its value at the center;
a conversion unit that converts the sound time signal multiplied by the first window function by the window processing unit into a frequency spectrum in a frequency domain;
a processing unit that performs signal processing on the frequency spectrum converted by the conversion unit;
an inverse conversion unit that converts the frequency spectrum signal-processed by the processing unit into a sound time signal in a time domain; and
a connection adjustment unit that multiplies the sound time signal converted by the inverse conversion unit by a window function whose values at both ends of the unit interval are smaller than its value at the center.
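
The chain of units recited in the claim above can be sketched end to end as follows. This is an illustration under stated assumptions rather than the claimed device: the frame length, the half-frame hop, the choice of a Hamming first window with a Hanning / Hamming second window, the final overlap-add step (standing in for the sound signal superposition unit 257), and the naive DFT are all assumptions, and every function name is hypothetical. The `process` callback is a placeholder for the processing unit; an identity function is used so the reconstruction can be verified.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (stands in for the conversion unit)."""
    n_len = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_len) for n in range(n_len))
            for k in range(n_len)]

def idft(spec):
    """Naive inverse transform (stands in for the inverse conversion unit)."""
    n_len = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * n / n_len)
                 for k in range(n_len)) / n_len).real
            for n in range(n_len)]

def raised_cosine(n_len, a0):
    """Periodic raised-cosine window: a0 = 0.5 is Hanning, a0 = 0.54 is Hamming."""
    return [a0 - (1 - a0) * math.cos(2 * math.pi * n / n_len) for n in range(n_len)]

def process_signal(signal, frame_len, process):
    hop = frame_len // 2                               # half-frame overlap (assumption)
    first = raised_cosine(frame_len, 0.54)             # first window (assumed Hamming)
    hann = raised_cosine(frame_len, 0.5)
    second = [h / f for h, f in zip(hann, first)]      # connection adjustment window
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]        # sound signal cutout
        windowed = [s * w for s, w in zip(frame, first)]
        spectrum = process(dft(windowed))              # processing in the frequency domain
        restored = idft(spectrum)
        adjusted = [r * w for r, w in zip(restored, second)]
        for n, value in enumerate(adjusted):           # overlap-add of adjacent frames
            out[start + n] += value
    return out

def identity(spectrum):
    """Placeholder processing unit that leaves the spectrum unchanged."""
    return spectrum

signal = [math.sin(0.3 * n) for n in range(32)]
restored_signal = process_signal(signal, 8, identity)
```

With the identity placeholder, every sample covered by two frames (indices 4 to 27 here) is reproduced to within floating-point error; the first and last half-frames are covered by only one frame and remain attenuated.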

8. An imaging device comprising the signal processing device according to any one of claims 1 to 7.

9. A program for causing a computer to function as:
conversion means for converting an input sound signal into a frequency spectrum in a frequency domain;
processing means for performing signal processing on the frequency spectrum converted by the conversion means;
inverse conversion means for converting the frequency spectrum signal-processed by the processing means into a sound time signal in a time domain; and
connection adjustment means for multiplying the sound time signal converted by the inverse conversion means by a window function whose values at both ends of a unit interval are smaller than its value at the center.

10. The program according to claim 9, further causing the computer to function as:
sound signal cutout means for cutting out a sound time signal in units of frames by dividing the input sound signal into frames of a predetermined time length; and
window processing means for multiplying the sound time signal cut out by the sound signal cutout means by a window function whose values at both ends of a unit interval are smaller than its value at the center.