JP5845954B2

JP5845954B2 - Noise reduction device, voice input device, wireless communication device, noise reduction method, and noise reduction program

Info

Publication number: JP5845954B2
Application number: JP2012031712A
Authority: JP
Inventors: 孝朗山邊; 永井　俊明; 俊明永井
Original assignee: JVCKenwood Corp
Current assignee: JVCKenwood Corp
Priority date: 2012-02-16
Filing date: 2012-02-16
Publication date: 2016-01-20
Anticipated expiration: 2032-02-16
Also published as: JP2013167805A

Description

The present invention relates to a noise reduction device, a voice input device, a wireless communication device, a noise reduction method, and a noise reduction program.

Noise reduction processing is widely used, for example, in the field of mobile communication. In mobile communication, noise mixed into speech that is encoded at a low bit rate degrades speech intelligibility. To avoid this, noise is detected and noise reduction processing is performed before the speech is encoded.

In noise reduction processing, for example, a noise signal collected by a microphone that mainly picks up noise (for example, unwanted sound other than the desired voice) is subtracted from a voice signal collected by a microphone that mainly picks up voice (for example, a desired voice such as the voice uttered by a caller). This removes the noise component contained in the voice signal.

Patent Document 1 discloses a technique that prevents degradation of voice quality and of noise removal performance so that voice can always be heard in the best condition. Patent Document 2 discloses a technique for an adaptive noise reduction type voice input device that prevents reduction of the desired voice and reduces only the unwanted sound targeted for reduction. Patent Document 3 discloses a technique for achieving higher voice quality with noise suppressed.

Patent Document 1: JP-A-7-168586
Patent Document 2: JP-A-6-67692
Patent Document 3: JP 2011-99967 A

When noise reduction processing is performed using a voice microphone that mainly picks up voice and a reference sound microphone that mainly picks up noise, the amount by which the voice is reduced (cancelled) increases depending on the direction from which the noise arrives. That is, depending on how the noise reduction device is used, voice may also be mixed into the reference sound microphone that picks up noise. When voice is mixed into the reference sound microphone in this way, not only the noise component mixed into the sound collected by the voice microphone but also the voice itself is cancelled, which lowers speech intelligibility.

In view of the above problem, an object of the present invention is to provide a noise reduction device, a voice input device, a wireless communication device, a noise reduction method, and a noise reduction program that can appropriately reduce noise components.

A noise reduction device according to the present invention includes: a voice noise section detection unit that detects a voice section and a noise section based on at least one of a first sound collection signal corresponding to sound collected by a first microphone and a second sound collection signal corresponding to sound collected by a second microphone; a power information acquisition unit that acquires a voice power difference, which is the difference between the magnitude of the first sound collection signal and the magnitude of the second sound collection signal in the voice section, and a noise power difference, which is the difference between the magnitude of the first sound collection signal and the magnitude of the second sound collection signal in the noise section; a noise reduction processing determination unit that determines the state of the voice power difference and the noise power difference; and a noise reduction processing unit that performs noise reduction processing using the first sound collection signal and the second sound collection signal according to the determination result of the noise reduction processing determination unit.
The noise reduction processing determination unit may determine the state of the difference between the voice power difference and the noise power difference.
The noise reduction processing determination unit may determine whether the state of the voice power difference and the noise power difference is a first state, in which noise reduction processing is to be performed. When the noise reduction processing determination unit determines that the state is the first state, the noise reduction processing unit may perform noise reduction processing using the first sound collection signal and the second sound collection signal.
The noise reduction processing determination unit may determine whether the state of the voice power difference and the noise power difference is a second state, in which noise reduction processing is to be performed more weakly than in the first state. When the noise reduction processing determination unit determines that the state is the second state, the noise reduction processing unit may perform noise reduction processing more weakly than in the first state.
The noise reduction processing determination unit may determine whether the state of the voice power difference and the noise power difference is a second state, in which noise reduction processing is not to be performed. When the noise reduction processing determination unit determines that the state is the second state, the noise reduction processing unit may refrain from performing noise reduction processing.
The noise reduction processing determination unit may determine that the state is the second state when the absolute value of the difference between the voice power difference and the noise power difference is within a predetermined threshold.
The power information acquisition unit may update the voice power difference in the voice section and update the noise power difference in the noise section.
When the noise reduction processing determination unit determines that the state is the first state, the noise reduction processing unit may reduce the noise component contained in the first sound collection signal using the second sound collection signal and output the signal after the noise reduction processing as a voice signal. When the noise reduction processing determination unit determines that the state is the second state, the noise reduction processing unit may output the first sound collection signal as a voice signal.
When the magnitude of the second sound collection signal in the voice section is larger than the magnitude of the first sound collection signal and the noise reduction processing determination unit determines that the state is the first state, the noise reduction processing unit may reduce the noise component contained in the second sound collection signal using the first sound collection signal and output the signal after the noise reduction processing as a voice signal. When the magnitude of the second sound collection signal in the voice section is larger than the magnitude of the first sound collection signal and the noise reduction processing determination unit determines that the state is the second state, the noise reduction processing unit may output the second sound collection signal as a voice signal.
The voice input device according to the present invention includes the noise reduction device.
In the voice input device, the first microphone may be provided on a first surface of the voice input device, and the second microphone may be provided on a second surface that faces the first surface across a predetermined distance.
A wireless communication device according to the present invention includes the noise reduction device.
In the wireless communication device, the first microphone may be provided on a first surface of the wireless communication device, and the second microphone may be provided on a second surface that faces the first surface across a predetermined distance.
A noise reduction method according to the present invention detects a voice section and a noise section based on at least one of a first sound collection signal corresponding to sound collected by a first microphone and a second sound collection signal corresponding to sound collected by a second microphone; acquires a voice power difference, which is the difference between the magnitude of the first sound collection signal and the magnitude of the second sound collection signal in the voice section, and a noise power difference, which is the difference between the magnitude of the first sound collection signal and the magnitude of the second sound collection signal in the noise section; determines the state of the voice power difference and the noise power difference; and performs noise reduction processing using the first sound collection signal and the second sound collection signal according to the result of that determination.
A noise reduction program according to the present invention causes a computer to detect a voice section and a noise section based on at least one of a first sound collection signal corresponding to sound collected by a first microphone and a second sound collection signal corresponding to sound collected by a second microphone; to acquire a voice power difference, which is the difference between the magnitude of the first sound collection signal and the magnitude of the second sound collection signal in the voice section, and a noise power difference, which is the difference between the magnitude of the first sound collection signal and the magnitude of the second sound collection signal in the noise section; to determine the state of the voice power difference and the noise power difference; and to perform noise reduction processing using the first sound collection signal and the second sound collection signal according to the result of that determination.
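The determination described above can be illustrated with a minimal sketch. It assumes the power differences are expressed in decibels and that the second state means "output the first sound collection signal unchanged"; the function names, the `reduce_noise` callback, and the 3.0 dB threshold are hypothetical and are not taken from the specification.

```python
def decide_state(voice_power_diff_db, noise_power_diff_db, threshold_db=3.0):
    """Classify the relation between the two power differences.

    Returns "second" when the absolute difference between the voice power
    difference and the noise power difference is within the threshold
    (hypothetical value), otherwise "first".
    """
    if abs(voice_power_diff_db - noise_power_diff_db) <= threshold_db:
        return "second"
    return "first"


def select_output(state, first_signal, second_signal, reduce_noise):
    """Apply noise reduction only in the first state.

    reduce_noise is a caller-supplied function (assumption) that removes
    from first_signal the noise estimated from second_signal.
    """
    if state == "first":
        return reduce_noise(first_signal, second_signal)
    # Second state: pass the first sound collection signal through as-is.
    return first_signal
```

When the voice and the noise produce nearly the same level difference between the two microphones, the reference signal cannot separate them, which is why this sketch bypasses the subtraction in that case.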

According to the present invention, it is possible to provide a noise reduction device, a voice input device, a wireless communication device, a noise reduction method, and a noise reduction program that can appropriately reduce noise components.

A block diagram showing the noise reduction device according to Embodiment 1.
A block diagram showing an example of the voice noise section detection unit included in the noise reduction device according to Embodiment 1.
A block diagram showing another example of the voice noise section detection unit included in the noise reduction device according to Embodiment 1.
A block diagram showing an example of the phase information acquisition unit included in the noise reduction device according to Embodiment 1.
A diagram showing an example of the positions of the voice and noise sources relative to the voice microphone and the reference sound microphone of the noise reduction device according to Embodiment 1.
A diagram showing an example of the positions of the voice and noise sources relative to the voice microphone and the reference sound microphone of the noise reduction device according to Embodiment 1.
A diagram showing an example of the positions of the voice and noise sources relative to the voice microphone and the reference sound microphone of the noise reduction device according to Embodiment 1.
A block diagram showing an example of the noise reduction processing unit included in the noise reduction device according to Embodiment 1.
A flowchart for explaining the operation of the noise reduction device according to Embodiment 1.
A diagram showing an example of a voice input device using the noise reduction device according to Embodiment 1.
A diagram showing an example of a wireless communication device using the noise reduction device according to Embodiment 1.
A diagram for explaining the effect of the invention according to Embodiment 1.
A diagram for explaining the effect of the invention according to Embodiment 1.
A block diagram showing the noise reduction device according to Embodiment 2.
A block diagram showing an example of the power information acquisition unit included in the noise reduction device according to Embodiment 2.
A flowchart for explaining the operation of the noise reduction device according to Embodiment 2.
A block diagram showing another example of the noise reduction device according to Embodiment 2.
A block diagram showing the noise reduction device according to Embodiment 3.
A block diagram showing an example of the noise reduction processing unit included in the noise reduction device according to Embodiment 3.
A flowchart for explaining the operation of the noise reduction device according to Embodiment 3.
A diagram showing an example of the positions of the voice and noise sources relative to the voice microphone and the reference sound microphone of the noise reduction device according to Embodiment 3.

<Embodiment 1>
Embodiments of the present invention will be described below with reference to the drawings.
FIG. 1 is a block diagram showing the noise reduction device according to Embodiment 1. As shown in FIG. 1, the noise reduction device 1 according to the present embodiment includes a voice microphone 11, a reference sound microphone 12, AD converters 13 and 14, a voice noise section detection unit 15, a phase information acquisition unit 16, a noise reduction processing determination unit 17, and a noise reduction processing unit 18.

The voice microphone 11 and the reference sound microphone 12 can each pick up sound containing voice components and noise components. The voice microphone 11 picks up sound mainly containing voice components, converts it into an analog signal, and outputs the analog signal to the AD converter 13. The reference sound microphone 12 picks up sound mainly containing noise components, converts it into an analog signal, and outputs the analog signal to the AD converter 14. For example, the noise component contained in the sound picked up by the reference sound microphone 12 is used to reduce the noise component contained in the sound picked up by the voice microphone 11.

The noise reduction device according to the present embodiment is described with two microphones (the voice microphone 11 and the reference sound microphone 12), but three or more microphones may be provided, for example by adding further reference sound microphones.

The AD converter 13 samples the analog signal output from the voice microphone 11 at a predetermined sampling rate, converts it into a digital signal, and generates a sound collection signal 21. The sound collection signal 21 generated by the AD converter 13 is output to the voice noise section detection unit 15, the phase information acquisition unit 16, and the noise reduction processing unit 18.

The AD converter 14 samples the analog signal output from the reference sound microphone 12 at a predetermined sampling rate, converts it into a digital signal, and generates a sound collection signal 22. The sound collection signal 22 generated by the AD converter 14 is output to the phase information acquisition unit 16 and the noise reduction processing unit 18.

In the present embodiment, the frequency band of the voice input to the voice microphone 11 and the reference sound microphone 12 is approximately 100 Hz to 4000 Hz. Therefore, by setting the sampling frequency of the AD converters 13 and 14 to about 8 kHz to 12 kHz, an analog signal containing voice components can be handled as a digital signal.
In this specification, a sound collection signal mainly containing voice components is also referred to as a voice signal, and a sound collection signal mainly containing noise components is also referred to as a reference signal.

The voice noise section detection unit 15 detects voice sections and noise sections based on the sound collection signal 21 output from the AD converter 13. It then outputs voice noise section information 23 and 24, which indicate the voice section and the noise section, to the phase information acquisition unit 16 and the noise reduction processing unit 18, respectively.

Any technique can be used for the voice noise section detection processing in the voice noise section detection unit 15. When the noise reduction device is used in an environment with a high noise level, it is preferable to determine voice sections and noise sections with high accuracy. For example, voice noise section detection technique A or voice noise section detection technique B, both described later, can detect voice sections and noise sections with high accuracy. Voice, as the term is used here, can include sounds other than the human voice, but these examples mainly detect the human voice. As an example, voice noise section detection technique A is also described in Japanese Patent Application No. 2011-254578, an application claiming priority based on Japanese Patent Application No. 2010-260798. Voice noise section detection technique B is also described, as an example, in Japanese Patent Application No. 2011-020459.

First, voice noise section detection technique A will be described. Technique A determines voice sections by focusing on the frequency spectrum of vowel components, which are the main part of voice. It sets an appropriate noise level for each band, obtains the signal-to-noise level ratio of the peaks of the vowel frequency components, and determines the voice section by observing whether a predetermined number of peaks reach a predetermined level ratio.

FIG. 2 is a block diagram showing an example of a voice noise section detection unit 15' that uses voice noise section detection technique A. The voice noise section detection unit 15' shown in FIG. 2 includes a framing unit 31, a spectrum generation unit 32, a band division unit 33, a frequency averaging unit 34, a holding unit 35, a time averaging unit 36, a peak detection unit 37, and a voice determination unit 38.

The framing unit 31 sequentially cuts the sound collection signal 21 into frames of a predetermined time width (a predetermined number of samples) and generates a frame-by-frame input signal (hereinafter referred to as a framed input signal).
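The framing step can be sketched as follows. This is only an illustration: the specification does not say whether frames overlap, so the sketch assumes consecutive non-overlapping frames and drops any trailing partial frame. The function name `frame_signal` is hypothetical.

```python
def frame_signal(samples, frame_len):
    """Cut a sample sequence into consecutive frames of frame_len samples.

    Assumption: frames do not overlap and an incomplete final frame is
    discarded; the specification leaves both choices open.
    """
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]
```

For example, at an 8 kHz sampling rate a frame length of 160 samples corresponds to a 20 ms frame.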

The spectrum generation unit 32 performs frequency analysis of the framed input signal output from the framing unit 31, converts the time-domain framed input signal into a frequency-domain framed input signal, and generates a spectrum pattern that collects the spectra. The spectrum pattern is a collection of per-frequency spectra, in which each frequency is associated with the energy at that frequency, over a predetermined frequency band. The frequency conversion method used here is not limited to any particular means, but because the frequency resolution needed to recognize the spectrum of voice is required, an orthogonal transform with relatively high resolution, such as the FFT (Fast Fourier Transform) or the DCT (Discrete Cosine Transform), is preferably used. In the present embodiment, the spectrum generation unit 32 generates a spectrum pattern covering at least 200 Hz to 700 Hz.
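As an illustration of the spectrum pattern, the sketch below computes the energy of each frequency bin between 200 Hz and 700 Hz with a direct discrete Fourier transform. A direct DFT is used here only to keep the example dependency-free; it stands in for the FFT the text recommends. The function name and the (frequency, energy) pair representation are assumptions.

```python
import cmath
import math


def spectrum_pattern(frame, sample_rate, f_lo=200.0, f_hi=700.0):
    """Return [(frequency_hz, energy), ...] for DFT bins within [f_lo, f_hi].

    Energy is the squared magnitude of the DFT coefficient. This is a
    plain O(n^2) DFT, used in place of an FFT purely for illustration.
    """
    n = len(frame)
    pattern = []
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if f_lo <= freq <= f_hi:
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)
            )
            pattern.append((freq, abs(coeff) ** 2))
    return pattern
```

With a 160-sample frame at 8 kHz the bin spacing is 50 Hz, so the pattern contains the bins from 200 Hz to 700 Hz in 50 Hz steps.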

The spectra that characterize voice (hereinafter referred to as formants), which the voice determination unit 38 described later detects when determining a voice section, usually range from a first formant corresponding to the fundamental tone up to an n-th formant (n is a natural number) corresponding to its overtones. Of these, the first and second formants often lie in the frequency band below 200 Hz. However, this band contains low-frequency noise components of relatively high energy, so the formants are easily buried. Formants at 700 Hz or above are also easily buried in noise components because the formants themselves have low energy. Therefore, using the spectrum pattern from 200 Hz to 700 Hz, where formants are less likely to be buried in noise components, narrows the determination target and allows the voice section to be determined efficiently.

To detect spectra characteristic of voice in appropriate frequency band units, the band division unit 33 divides the spectra of the spectrum pattern into a plurality of divided frequency bands, each being a frequency band of a predetermined bandwidth. In the present embodiment, the predetermined bandwidth is about 100 Hz to 150 Hz.

The frequency averaging unit 34 obtains the average energy of each divided frequency band. In the present embodiment, the frequency averaging unit 34 averages, for each divided frequency band, the energy of all spectra in that band. To reduce the computational load, the maximum or average amplitude (absolute value) of the spectra may be used instead of the spectral energy.
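Band division and frequency averaging can be sketched together as below. The sketch assumes the spectrum pattern is a list of (frequency, energy) pairs and that bands are indexed from a 200 Hz lower edge with a 100 Hz width; both defaults are illustrative values within the ranges the text gives, and the function name is hypothetical.

```python
def band_average_energy(pattern, f_lo=200.0, band_width=100.0):
    """Average the spectral energy inside each divided frequency band.

    pattern: iterable of (frequency_hz, energy) pairs.
    Returns {band_index: mean_energy}, where band 0 starts at f_lo.
    """
    sums = {}
    counts = {}
    for freq, energy in pattern:
        band = int((freq - f_lo) // band_width)
        sums[band] = sums.get(band, 0.0) + energy
        counts[band] = counts.get(band, 0) + 1
    return {band: sums[band] / counts[band] for band in sums}
```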

The holding unit 35 is composed of a storage medium such as a RAM (Random Access Memory), an EEPROM (Electrically Erasable and Programmable Read Only Memory), or a flash memory, and holds the average energy of each band for a predetermined number of past frames (N in the present embodiment).

The time averaging unit 36 derives, for each divided frequency band, the band-specific energy, which is the average over a plurality of frames in the time direction of the average energy derived by the frequency averaging unit 34. In other words, the band-specific energy is the time-direction average, over a plurality of frames, of the average energy of each divided frequency band. Alternatively, the time averaging unit 36 may obtain a substitute for the band-specific energy by applying an averaging-like process, using a weighting coefficient and a time constant, to the average energy of each divided frequency band in the immediately preceding frame.
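The holding and time-averaging steps can be sketched as a small tracker that keeps the last N frames of band averages and returns their mean, together with an exponential smoothing function as one possible form of the "averaging-like process". The class name, the treatment of missing bands as zero energy, and the smoothing coefficient are assumptions for illustration.

```python
from collections import deque


class BandEnergyTracker:
    """Keep the last n_frames band averages and report their time average."""

    def __init__(self, n_frames):
        # deque(maxlen=...) discards the oldest frame automatically.
        self.history = deque(maxlen=n_frames)

    def update(self, band_energy):
        """Store one frame's {band: average_energy} and return the
        band-specific energy averaged over the stored frames."""
        self.history.append(dict(band_energy))
        bands = set().union(*self.history)
        count = len(self.history)
        return {
            band: sum(frame.get(band, 0.0) for frame in self.history) / count
            for band in bands
        }


def smooth(previous, current, alpha=0.1):
    """Exponential smoothing as a low-memory substitute for the average.

    alpha is a hypothetical weighting coefficient in (0, 1].
    """
    return (1.0 - alpha) * previous + alpha * current
```

The exponential form needs only the previous value per band instead of N stored frames, which is the kind of trade-off the substitute value in the text allows.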

The peak detection unit 37 derives the energy ratio (SNR: signal-to-noise ratio) between each spectrum of the spectrum pattern and the band-specific energy of the divided frequency band containing that spectrum. The peak detection unit 37 then compares the SNR of each spectrum with a predetermined threshold A and determines whether the SNR exceeds the threshold A. When there is a spectrum whose SNR exceeds the threshold A, that spectrum is regarded as a formant, and information indicating that a formant has been detected is output to the voice determination unit 38.

On receiving the information that a formant has been detected from the peak detection unit 37, the voice determination unit 38 determines, based on the determination result of the peak detection unit 37, whether the framed input signal of the corresponding frame is voice. When the voice determination unit 38 determines that the framed input signal is voice, it outputs voice noise section information 23 and 24 indicating a voice section to the phase information acquisition unit 16 and the noise reduction processing unit 18, respectively. When it determines that the framed input signal is not voice, it outputs voice noise section information 23 and 24 indicating a noise section to the phase information acquisition unit 16 and the noise reduction processing unit 18, respectively.

図２に示す音声ノイズ区間検出部１５'は、分割周波数帯域毎に、その分割周波数帯域の帯域別エネルギーを設定している。そのため、音声判定部３８は、他の分割周波数帯域のノイズ成分の影響を受けずに、それぞれの分割周波数帯域毎にフォルマントの有無を精度よく判定することができる。 The voice noise section detection unit 15′ illustrated in FIG. 2 sets a band-specific energy for each divided frequency band. Therefore, the speech determination unit 38 can accurately determine the presence or absence of a formant in each divided frequency band without being affected by noise components in the other divided frequency bands.

上述したように、フォルマントには、第１フォルマントから、その倍音部分である第ｎフォルマントまで複数ある。したがって、任意の分割周波数帯域の帯域別エネルギー（ノイズレベル）が上昇し、フォルマントの一部がノイズに埋没しても、他の複数のフォルマントを検出できる場合がある。特に、周囲ノイズは低域に集中するため、基音に相当する第１フォルマントや２倍音に相当する第２フォルマントが低域のノイズに埋没していても、３倍音以上のフォルマントを検出できる可能性がある。よって、音声判定部３８は、ＳＮＲが閾値Ａを超えるスペクトルが所定数以上である場合、フレーム化入力信号が音声であると判定することで、よりノイズに強い音声区間の判定を行うことができる。 As described above, there are multiple formants, from the first formant up to the n-th formant, which is a harmonic of it. Therefore, even if the band-specific energy (noise level) of some divided frequency band rises and some of the formants are buried in noise, several other formants may still be detectable. In particular, since ambient noise is concentrated in the low range, formants at the third harmonic or higher may be detectable even when the first formant, corresponding to the fundamental tone, and the second formant, corresponding to the second harmonic, are buried in low-frequency noise. Therefore, by determining that the framed input signal is speech when the number of spectra whose SNR exceeds the threshold A is equal to or greater than a predetermined number, the speech determination unit 38 can make a speech section determination that is more robust to noise.

以上で説明したように、音声ノイズ区間検出技術Ａを用いた音声ノイズ区間検出部１５'は、入力信号を予め定められた時間幅を有するフレーム単位で切り出し、フレーム化入力信号を生成するフレーム化部３１と、フレーム化入力信号を、時間領域から周波数領域に変換して、周波数毎のスペクトルを集めたスペクトルパターンを生成するスペクトル生成部３２と、スペクトルパターンの各スペクトルと、予め定められた帯域幅で分割された周波数帯域である複数の分割周波数帯域のうちスペクトルが含まれる分割周波数帯域における帯域別エネルギーとのエネルギー比が、予め定められた閾値Ａを超えるか否かを判定するピーク検出部３７と、ピーク検出部の判定結果に基づいて、フレーム化入力信号が音声であるか否か判定する音声判定部３８と、スペクトルパターンの各分割周波数帯域におけるスペクトルの周波数方向の平均エネルギーを導出する周波数平均部３４と、分割周波数帯域毎に、平均エネルギーの時間方向の平均である前記帯域別エネルギーを導出する時間平均部３６と、を備える。 As described above, the voice noise section detection unit 15′ using the voice noise section detection technique A includes: a framing unit 31 that cuts out the input signal in units of frames having a predetermined time width and generates a framed input signal; a spectrum generation unit 32 that converts the framed input signal from the time domain to the frequency domain and generates a spectrum pattern in which spectra for each frequency are collected; a peak detection unit 37 that determines whether the energy ratio between each spectrum of the spectrum pattern and the band-specific energy of the divided frequency band containing that spectrum, among a plurality of divided frequency bands obtained by dividing the frequency range at a predetermined bandwidth, exceeds a predetermined threshold A; a speech determination unit 38 that determines whether the framed input signal is speech based on the determination result of the peak detection unit; a frequency averaging unit 34 that derives the average energy in the frequency direction of the spectra in each divided frequency band of the spectrum pattern; and a time averaging unit 36 that derives, for each divided frequency band, the band-specific energy, which is the average of the average energy in the time direction.

例えば、音声判定部３８は、エネルギー比が閾値Ａを超えるスペクトルが予め定められた数以上であると、フレーム化入力信号が音声であると判定することができる。 For example, the speech determination unit 38 can determine that the framed input signal is speech when the spectrum in which the energy ratio exceeds the threshold A is greater than or equal to a predetermined number.
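The decision rule of technique A described above can be sketched as follows. The function and parameter names (`is_speech_frame`, `band_of_bin`, `min_peaks`) are hypothetical, and the values of threshold A and the predetermined count are left to the caller because the source does not give them.

```python
def is_speech_frame(spectrum_energy, band_of_bin, band_energy,
                    threshold_a, min_peaks):
    """Count spectra whose energy ratio (SNR) to the band-specific energy of
    their own divided frequency band exceeds threshold A, and judge the frame
    to be speech when that count reaches min_peaks."""
    peaks = 0
    for energy, band in zip(spectrum_energy, band_of_bin):
        noise = band_energy[band]
        # Skip bands with no energy estimate to avoid division by zero.
        if noise > 0 and energy / noise > threshold_a:
            peaks += 1
    return peaks >= min_peaks
```

Because each spectrum is compared only against the energy of its own band, a noisy low band does not raise the bar for peaks in higher bands.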

次に、音声ノイズ区間検出技術Ｂについて説明する。音声ノイズ区間検出技術Ｂでは、子音の特徴であるスペクトルパターンが右上がりになる傾向があるという性質に着目して、音声区間を判定している。音声ノイズ区間検出技術Ｂでは、子音のスペクトルパターンを中高域の周波数帯において測定し、更に部分的にノイズ成分によって埋没してしまった子音の周波数分布の特徴を、ノイズの影響があまり無かった帯域に特化して抽出することで、音声区間を高精度で判定することを可能にしている。 Next, the voice noise section detection technique B will be described. Technique B determines speech sections by focusing on a characteristic of consonants: their spectrum pattern tends to rise toward the right, that is, toward higher frequencies. Technique B measures the consonant spectrum pattern in the mid-to-high frequency band and, when the frequency distribution of the consonant has been partially buried by noise components, extracts its characteristics specifically from the bands that were little affected by noise. This makes it possible to determine speech sections with high accuracy.

図３は、音声ノイズ区間検出技術Ｂを用いた音声ノイズ区間検出部１５''の一例を示すブロック図である。音声ノイズ区間検出部１５''は、フレーム化部４１、スペクトル生成部４２、帯域分割部４３、平均導出部４４、ノイズレベル導出部４５、判定選択部４６、および子音判定部４７を備える。 FIG. 3 is a block diagram illustrating an example of the voice noise section detection unit 15 ″ using the voice noise section detection technique B. The audio noise section detection unit 15 ″ includes a framing unit 41, a spectrum generation unit 42, a band division unit 43, an average derivation unit 44, a noise level derivation unit 45, a determination selection unit 46, and a consonant determination unit 47.

フレーム化部４１は、収音信号２１を予め定められた時間幅を有するフレーム単位で順次切り出し、フレーム単位の入力信号であるフレーム化入力信号を生成する。 The framing unit 41 sequentially extracts the sound pickup signal 21 in units of frames having a predetermined time width, and generates a framing input signal that is an input signal in units of frames.

スペクトル生成部４２は、フレーム化部４１から出力されたフレーム化入力信号の周波数分析を行い、時間領域のフレーム化入力信号を周波数領域のフレーム化入力信号に変換して、スペクトルを集めたスペクトルパターンを生成する。スペクトルパターンは、所定の周波数帯域に渡って、周波数とその周波数におけるエネルギーとが対応付けられた、周波数毎のスペクトルを集めたものである。ここで用いられる周波数変換法は、特定の手段に限定しないが、音声のスペクトルを認識するために必要な周波数分解能が必要であるため、比較的分解能が高いＦＦＴやＤＣＴ等の直交変換法を用いるとよい。 The spectrum generation unit 42 performs frequency analysis of the framed input signal output from the framing unit 41, converts the time-domain framed input signal into a frequency-domain framed input signal, and generates a spectrum pattern in which the spectra are collected. The spectrum pattern is a collection of spectra for each frequency, in which each frequency is associated with the energy at that frequency over a predetermined frequency band. The frequency conversion method used here is not limited to a specific means, but because a frequency resolution sufficient to recognize the speech spectrum is required, an orthogonal transform with relatively high resolution, such as the FFT or DCT, is preferable.

帯域分割部４３は、スペクトル生成部４２が生成したスペクトルパターンの各スペクトルを、予め定められた帯域幅毎に分割し、複数の分割周波数帯域を生成する。本実施の形態において、帯域分割部４３は、例えば、８００Ｈｚ〜３．５ｋＨｚの周波数範囲について、例えば、１００Ｈｚ〜３００Ｈｚ程度の帯域幅毎に分割する。 The band dividing unit 43 divides each spectrum of the spectrum pattern generated by the spectrum generating unit 42 for each predetermined bandwidth, and generates a plurality of divided frequency bands. In the present embodiment, the band dividing unit 43 divides the frequency range of, for example, 800 Hz to 3.5 kHz for each bandwidth of about 100 Hz to 300 Hz, for example.

平均導出部４４は、スペクトルパターンにおける、連接する、帯域分割部４３が分割した分割周波数帯域（バンド）毎の平均エネルギーである帯域別平均エネルギーを導出する。 The average deriving unit 44 derives average energy for each band, which is an average energy for each divided frequency band (band) divided by the band dividing unit 43 in the spectrum pattern.

子音判定部４７は、平均導出部４４が導出した帯域別平均エネルギー同士を比較し、より高周波数帯域の帯域別平均エネルギー程、高いエネルギーとなっていると、そのフレーム化入力信号に子音が含まれると判定する。 The consonant determination unit 47 compares the band-specific average energies derived by the average deriving unit 44 with one another and, if the band-specific average energy becomes higher as the frequency band becomes higher, determines that the framed input signal contains a consonant.

一般的に、子音はスペクトルパターンが右上がりになる傾向がある。そこで、音声ノイズ区間検出技術Ｂを用いた音声ノイズ区間検出部１５''は、スペクトルパターンにおける帯域別平均エネルギーを導出し、その帯域別エネルギー同士を比較することで子音に特徴的な、スペクトルパターンにおける右上がりの傾向を検出する。そのため、音声ノイズ区間検出部１５''は、入力信号に子音が含まれる子音区間を精度よく検出することができる。 In general, the spectrum pattern of a consonant tends to rise toward the right. Therefore, the voice noise section detection unit 15″ using the voice noise section detection technique B derives the band-specific average energies of the spectrum pattern and compares them with one another to detect the rightward rise in the spectrum pattern that is characteristic of consonants. As a result, the voice noise section detection unit 15″ can accurately detect consonant sections, in which the input signal contains a consonant.

子音判定部４７は、隣接する帯域間の帯域別平均エネルギーが、高い周波数の帯域の方が隣接する低い周波数の帯域より大きい組み合わせを計数し、計数した計数値が、予め定められた閾値Ａ以上であると、子音が含まれると判定する第１判定手段を備える。また、子音判定部４７は、隣接する帯域間の帯域別平均エネルギーが、高い周波数の帯域の方が隣接する低い周波数の帯域より大きい組み合わせを計測し、更にこの組み合わせが帯域を跨いで連続する場合に重み付けをして計数し、計数した計数値が、予め定められた閾値Ｂ以上であると、子音が含まれると判定する第２判定手段を備える。子音判定部４７は、第１判定手段と第２判定手段をそれぞれノイズレベルに応じて使い分ける。 The consonant determination unit 47 includes a first determination means that counts the pairs of adjacent bands in which the band-specific average energy of the higher-frequency band is greater than that of the adjacent lower-frequency band, and determines that a consonant is included if the count is equal to or greater than a predetermined threshold A. The consonant determination unit 47 also includes a second determination means that counts such pairs with additional weighting when they continue across consecutive bands, and determines that a consonant is included if the weighted count is equal to or greater than a predetermined threshold B. The consonant determination unit 47 uses the first determination means or the second determination means depending on the noise level.

ここで、第１判定手段と第２判定手段とを適宜選択すべく、ノイズレベル導出部４５は、フレーム化入力信号のノイズレベルを導出する。例えば、ノイズレベルは、フレーム化入力信号のすべての周波数帯域の帯域別平均エネルギーの平均値とすることができる。また、ノイズレベル導出部４５は、フレーム化入力信号毎にノイズレベルを導出してもよいし、所定時間分のフレーム化入力信号のノイズレベルの平均値を用いてもよい。判定選択部４６は、導出されたノイズレベルが所定の閾値未満の場合、第１判定手段を選択し、所定の閾値以上の場合、第２判定手段を選択する。 Here, the noise level deriving unit 45 derives the noise level of the framed input signal so as to select the first determination unit and the second determination unit as appropriate. For example, the noise level can be an average value of average energy for each frequency band of the framed input signal. Further, the noise level deriving unit 45 may derive a noise level for each framed input signal, or may use an average value of noise levels of the framed input signal for a predetermined time. The determination selection unit 46 selects the first determination unit when the derived noise level is less than the predetermined threshold, and selects the second determination unit when the derived noise level is equal to or higher than the predetermined threshold.
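The two determination means and the noise-level-based selection can be sketched as below. All names are hypothetical, and the weighting scheme for consecutive rising pairs (`run_weight`) is an assumption, since the source states only that consecutive pairs are weighted.

```python
def count_rising_pairs(band_avg):
    """First determination: count adjacent band pairs whose higher-frequency
    band has the larger band-specific average energy."""
    return sum(1 for lo, hi in zip(band_avg, band_avg[1:]) if hi > lo)


def weighted_rising_count(band_avg, run_weight=2):
    """Second determination: same count, but a rising pair that directly
    follows another rising pair contributes run_weight instead of 1."""
    total = 0
    in_run = False
    for lo, hi in zip(band_avg, band_avg[1:]):
        if hi > lo:
            total += run_weight if in_run else 1
            in_run = True
        else:
            in_run = False
    return total


def has_consonant(band_avg, noise_level, noise_threshold,
                  threshold_a, threshold_b):
    """Select the determination by noise level, as the determination
    selection unit 46 does, then apply the matching threshold."""
    if noise_level < noise_threshold:
        return count_rising_pairs(band_avg) >= threshold_a
    return weighted_rising_count(band_avg) >= threshold_b
```

With `band_avg = [1, 2, 3, 2, 4]` the plain count is 3, while the weighted count is 4 because the second rising pair extends a run.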

以上で説明したように、音声ノイズ区間検出技術Ｂを用いた音声ノイズ区間検出部１５''は、入力信号を予め定められたフレーム単位で切り出し、フレーム化入力信号を生成するフレーム化部４１と、フレーム化入力信号を、時間領域から周波数領域に変換して、周波数毎のスペクトルを集めたスペクトルパターンを生成するスペクトル生成部４２と、スペクトルパターンにおける、連接する予め定められた帯域幅毎の平均エネルギーである帯域別平均エネルギーを導出する平均導出部４４と、導出された帯域別平均エネルギー同士を比較し、より高周波数帯域の帯域別平均エネルギー程、高いエネルギーとなっていると、フレーム化入力信号に子音が含まれると判定する子音判定部４７と、を備える。 As described above, the speech noise section detection unit 15 '' using the speech noise section detection technique B includes the framing unit 41 that cuts out an input signal in units of predetermined frames and generates a framed input signal. A spectrum generation unit 42 that converts a framed input signal from a time domain to a frequency domain to generate a spectrum pattern in which spectra for each frequency are collected, and an average for each predetermined predetermined bandwidth in the spectrum pattern. The average derivation unit 44 for deriving the average energy for each band, which is energy, compares the derived average energy for each band, and if the average energy for each band in the higher frequency band is higher, the framed input A consonant determination unit 47 that determines that a consonant is included in the signal.

例えば、子音判定部４７は、スペクトルパターンの隣接する帯域間の帯域別平均エネルギーが、高い周波数の帯域の方が隣接する低い周波数の帯域より大きい組み合わせを計数し、計数した計数値が、予め定められた閾値以上であると、子音が含まれると判定することができる。 For example, the consonant determination unit 47 counts combinations in which the average energy for each band between adjacent bands of the spectrum pattern is larger in the higher frequency band than in the adjacent lower frequency band, and the counted value is determined in advance. It is possible to determine that a consonant is included if it is equal to or greater than the threshold value.

なお、本実施の形態にかかるノイズ低減装置に上記の音声ノイズ区間検出技術Ａ、Ｂを適用する場合、製品毎にパラメータを設定することができる。すなわち、より確実な音声区間の判定が要求される製品に音声ノイズ区間検出技術Ａ、Ｂを適用する場合、音声区間判定のパラメータとしてより厳しい閾値を設定することができる。 When the voice noise section detection techniques A and B described above are applied to the noise reduction device according to the present embodiment, the parameters can be set for each product. That is, when the techniques A and B are applied to a product that requires more reliable speech section determination, a stricter threshold can be set as the speech section determination parameter.

図１に示すノイズ低減装置１の位相情報取得部１６は、音声ノイズ区間情報２３が音声区間を示す場合、音声区間における収音信号２１と収音信号２２との位相差である音声位相差を取得する。また、位相情報取得部１６は、音声ノイズ区間情報２３がノイズ区間を示す場合、ノイズ区間における収音信号２１と収音信号２２との位相差であるノイズ位相差を取得する。取得された音声位相差およびノイズ位相差は、位相情報２５としてノイズ低減処理判定部１７に供給される。 When the speech noise section information 23 indicates a speech section, the phase information acquisition unit 16 of the noise reduction device 1 illustrated in FIG. 1 acquires the audio phase difference, which is the phase difference between the collected sound signal 21 and the collected sound signal 22 in the speech section. When the speech noise section information 23 indicates a noise section, the phase information acquisition unit 16 acquires the noise phase difference, which is the phase difference between the collected sound signal 21 and the collected sound signal 22 in the noise section. The acquired audio phase difference and noise phase difference are supplied to the noise reduction processing determination unit 17 as phase information 25.

例えば、トランシーバーのような携帯機器（無線通信装置）や、無線通信装置に用いるスピーカーマイクロフォン（音声入力装置）のような小型機器に、本実施の形態にかかるノイズ低減装置を適用する場合（図１０、図１１参照）、音声を拾い易い表側に音声用マイクロフォン１１を設け、音声を拾い難い裏側に参照音用マイクロフォン１２を設ける。これにより、音声用マイクロフォン１１では音声成分を主に収音し、参照音用マイクロフォン１２ではノイズ成分を主に収音することができる。 For example, when the noise reduction device according to the present embodiment is applied to a portable device (wireless communication device) such as a transceiver or a small device such as a speaker microphone (voice input device) used in the wireless communication device (FIG. 10). 11), the sound microphone 11 is provided on the front side where it is easy to pick up the sound, and the reference sound microphone 12 is provided on the back side where it is difficult to pick up the sound. Thereby, the sound microphone 11 can mainly collect sound components, and the reference sound microphone 12 can mainly collect noise components.

上記の無線通信装置や音声入力装置は、一般的に人間の握りこぶしよりも少し小さい程度の大きさである。よって、音源と音声用マイクロフォン１１との距離と、音源と参照音用マイクロフォン１２との距離の差は、機器毎やマイクロフォンの配置により異なるものの、５〜１０ｃｍ程度であると考えられる。ここで、音声の空間伝達速度を３４０００ｃｍ／ｓとすると、サンプリング周波数が８ｋＨｚの場合、１サンプル間において音声が伝達する距離は３４０００÷８０００＝４．２５であるので、４．２５ｃｍとなる。仮に、音声用マイクロフォン１１と参照音用マイクロフォン１２との距離が５ｃｍであれば、サンプリング周波数が８ｋＨｚでは音声の方向を推定するには不十分である。 The wireless communication device and voice input device described above are generally slightly smaller than a human fist. The difference between the distance from the sound source to the sound microphone 11 and the distance from the sound source to the reference sound microphone 12 therefore varies with the device and the microphone arrangement, but is considered to be about 5 to 10 cm. Taking the propagation speed of sound in air as 34000 cm/s, at a sampling frequency of 8 kHz the sound travels 34000 ÷ 8000 = 4.25 cm during one sample period. If the distance between the sound microphone 11 and the reference sound microphone 12 is 5 cm, a sampling frequency of 8 kHz is insufficient for estimating the direction of the sound.

この場合、サンプリング周波数を８ｋＨｚの３倍である２４ｋＨｚとすると、３４０００÷２４０００≒１．４２ｃｍとなり、５ｃｍの間に３〜４点の位相差ポイントを測定することができる。よって、収音信号２１と収音信号２２の位相差に基づいて音声の到来方向を検出する場合は、位相情報取得部１６に入力される収音信号２１と収音信号２２のサンプリング周波数を２４ｋＨｚ以上にするとよい。 In this case, if the sampling frequency is set to 24 kHz, which is three times 8 kHz, the distance per sample becomes 34000 ÷ 24000 ≈ 1.42 cm, and 3 to 4 phase difference points can be measured within 5 cm. Therefore, when the arrival direction of the sound is detected from the phase difference between the collected sound signal 21 and the collected sound signal 22, the sampling frequency of the collected sound signals 21 and 22 input to the phase information acquisition unit 16 should be 24 kHz or higher.
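The arithmetic above can be reproduced with a few lines. The helper names are hypothetical; the speed of sound of 34000 cm/s is the value used in the text.

```python
def distance_per_sample_cm(sampling_hz, sound_speed_cm_s=34000.0):
    """Distance sound travels during one sample period, in centimetres."""
    return sound_speed_cm_s / sampling_hz


def lag_points(mic_distance_cm, sampling_hz, sound_speed_cm_s=34000.0):
    """Number of whole sample periods that fit into the microphone spacing."""
    step = distance_per_sample_cm(sampling_hz, sound_speed_cm_s)
    return int(mic_distance_cm // step)
```

For a 5 cm spacing this gives one whole sample step at 8 kHz and three at 24 kHz, which is why the higher rate is needed to resolve direction.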

図１に示すノイズ低減装置１において、例えばＡＤコンバータ１３、１４から出力された収音信号２１、２２のサンプリング周波数が８〜１２ｋＨｚである場合は、ＡＤコンバータ１３、１４と位相情報取得部１６との間に、サンプリング周波数変換器を設け、位相情報取得部１６に供給される収音信号２１、２２のサンプリング周波数を２４ｋＨｚ以上に変換してもよい。 In the noise reduction apparatus 1 shown in FIG. 1, for example, when the sampling frequency of the collected sound signals 21 and 22 output from the AD converters 13 and 14 is 8 to 12 kHz, the AD converters 13 and 14 and the phase information acquisition unit 16 Between them, a sampling frequency converter may be provided to convert the sampling frequency of the collected sound signals 21 and 22 supplied to the phase information acquisition unit 16 to 24 kHz or higher.

一方、例えばＡＤコンバータ１３、１４から出力された収音信号２１、２２のサンプリング周波数が２４ｋＨｚ以上である場合は、ＡＤコンバータ１３と音声ノイズ区間検出部１５との間、およびＡＤコンバータ１３、１４とノイズ低減処理部１８との間に、サンプリング周波数変換器を設け、音声ノイズ区間検出部１５およびノイズ低減処理部１８に供給される収音信号２１、２２のサンプリング周波数を８〜１２ｋＨｚに変換してもよい。 On the other hand, for example, when the sampling frequency of the collected sound signals 21 and 22 output from the AD converters 13 and 14 is 24 kHz or more, between the AD converter 13 and the audio noise section detector 15, and between the AD converters 13 and 14, A sampling frequency converter is provided between the noise reduction processing unit 18 and the sampling frequency of the collected sound signals 21 and 22 supplied to the audio noise interval detection unit 15 and the noise reduction processing unit 18 is converted to 8 to 12 kHz. Also good.

収音信号２１と収音信号２２の位相差は、音声用マイクロフォン１１の位置に対する音声またはノイズの到来方向を示すものである。例えば、話者（音声の音源）が音声用マイクロフォン１１と参照音用マイクロフォン１２を直線で結んだ延長線上の音声用マイクロフォン１１側から話す場合、位相差が正の方向に最も大きくなる。換言すると、音声が音声用マイクロフォン１１と参照音用マイクロフォン１２とに到達する際のマイクロフォン間の時間差が正の方向に最も大きくなる（つまり、音声用マイクロフォン１１に最も早く音声が到達する）。 The phase difference between the collected sound signal 21 and the collected sound signal 22 indicates the arrival direction of sound or noise with respect to the position of the sound microphone 11. For example, when a speaker (sound source) speaks from the voice microphone 11 side on an extension line connecting the voice microphone 11 and the reference sound microphone 12 with a straight line, the phase difference becomes the largest in the positive direction. In other words, the time difference between the microphones when the sound reaches the sound microphone 11 and the reference sound microphone 12 becomes the largest in the positive direction (that is, the sound reaches the sound microphone 11 earliest).

一方、話者（音声の音源）が音声用マイクロフォン１１と参照音用マイクロフォン１２を直線で結んだ延長線上の参照音用マイクロフォン１２側から話す場合、位相差が負の方向に最も大きくなる。換言すると、音声が音声用マイクロフォン１１と参照音用マイクロフォン１２とに到達する際のマイクロフォン間の時間差が負の方向に最も大きくなる（つまり、音声用マイクロフォン１１に最も遅く音声が到達する）。 On the other hand, when the speaker (sound source) speaks from the reference sound microphone 12 side on the extension line connecting the sound microphone 11 and the reference sound microphone 12 with a straight line, the phase difference is greatest in the negative direction. In other words, the time difference between the microphones when the sound reaches the sound microphone 11 and the reference sound microphone 12 is greatest in the negative direction (that is, the sound reaches the sound microphone 11 the latest).

また、話者（音声の音源）が音声用マイクロフォン１１と参照音用マイクロフォン１２とを結ぶ線分の垂直二等分線上の位置（つまり、音声用マイクロフォン１１と参照音用マイクロフォン１２の中間の位置）から話す場合は、それぞれのマイクロフォンに音声が同時に到達するので、位相差（時間差）はゼロとなる。 When the speaker (sound source) speaks from a position on the perpendicular bisector of the line segment connecting the sound microphone 11 and the reference sound microphone 12 (that is, a position midway between the sound microphone 11 and the reference sound microphone 12), the sound reaches both microphones at the same time, so the phase difference (time difference) is zero.

このように、音声用マイクロフォン１１からの収音信号２１と参照音用マイクロフォン１２からの収音信号２２とを用いて最も相関が高くなる位置を検出することで、収音信号２１および収音信号２２のうちのいずれか一方を基準として位相差を取得することができる。なお、以下では、音声用マイクロフォン１１からの収音信号２１を基準とする場合を例として説明する。 In this way, by detecting the position having the highest correlation using the collected sound signal 21 from the sound microphone 11 and the collected sound signal 22 from the reference sound microphone 12, the collected sound signal 21 and the collected sound signal are detected. The phase difference can be acquired with any one of 22 as a reference. Hereinafter, a case where the sound collection signal 21 from the sound microphone 11 is used as a reference will be described as an example.

図４は、本実施の形態にかかるノイズ低減装置１が備える位相情報取得部の一例を示すブロック図である。図４に示す位相情報取得部１６は、基準信号バッファ５１、基準信号抽出部５２、比較信号バッファ５３、比較信号抽出部５４、相互相関値算出部５５、位相差取得部５６、音声位相差格納部５７、ノイズ位相差格納部５８、およびセレクタ５９を備える。 FIG. 4 is a block diagram illustrating an example of a phase information acquisition unit included in the noise reduction device 1 according to the present embodiment. The phase information acquisition unit 16 illustrated in FIG. 4 includes a reference signal buffer 51, a reference signal extraction unit 52, a comparison signal buffer 53, a comparison signal extraction unit 54, a cross correlation value calculation unit 55, a phase difference acquisition unit 56, and an audio phase difference storage. Unit 57, noise phase difference storage unit 58, and selector 59.

基準信号バッファ５１は、ＡＤコンバータ１３から出力された収音信号２１を一時的に蓄積する。比較信号バッファ５３は、ＡＤコンバータ１４から出力された収音信号２２を一時的に蓄積する。 The reference signal buffer 51 temporarily stores the collected sound signal 21 output from the AD converter 13. The comparison signal buffer 53 temporarily accumulates the sound collection signal 22 output from the AD converter 14.

音源が一つで同時刻に発せられる音声やノイズは、各マイクロフォン１１、１２への伝達経路が異なるため各マイクロフォン１１、１２で検出される位相や振幅値は異なる。しかし、音声やノイズの音源が一つである場合は、各マイクロフォン１１、１２で検出される音声成分の位相や振幅値は類似しており相関性は非常に高いといえる。特に、本実施の形態では、音声区間において音声をノイズ区間においてノイズをそれぞれ収音しているので、各マイクロフォン１１、１２で検出される音声成分の相関性やノイズ成分の相関性は非常に高いといえる。よって、この相関性を測定することで位相差を求めることができ、音源の方向を推定することができる。２つのマイクロフォン１１、１２の間における位相差は、例えば相互相関関数や最小二乗法を用いて算出することができる。 Sound or noise emitted at one time from a single source travels to the microphones 11 and 12 along different paths, so the phases and amplitudes detected at the microphones 11 and 12 differ. However, when the sound or noise comes from a single source, the phases and amplitudes of the components detected by the microphones 11 and 12 are similar, and their correlation is very high. In particular, in the present embodiment, speech is collected in speech sections and noise in noise sections, so the correlation between the speech components and the correlation between the noise components detected by the microphones 11 and 12 are both very high. Measuring this correlation therefore yields the phase difference, from which the direction of the sound source can be estimated. The phase difference between the two microphones 11 and 12 can be calculated using, for example, a cross-correlation function or the least squares method.

一般的に、２つの信号波形ｘ１（ｔ）とｘ２（ｔ）の相互相関関数は次の式で表すことができる。
In general, the cross-correlation function between two signal waveforms x1 (t) and x2 (t) can be expressed by the following equation.
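The equation itself is not reproduced in this text. As a hedged reconstruction, one common discrete form that is consistent with the later description (lag τ ranging from −L to +L, window length N) is shown below; the symbol φ₁₂ and the absence of a normalization factor are assumptions, not taken from the source.

```latex
\phi_{12}(\tau) = \sum_{t=0}^{N-1} x_1(t)\, x_2(t+\tau), \qquad -L \le \tau \le +L
```

Under this convention, a positive τ at the maximum means that x2 lags x1, that is, the sound reached the reference-signal microphone first.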

基準信号抽出部５２は、収音信号（基準信号）２１に含まれる信号波形ｘ１（ｔ）を抽出して固定する。比較信号抽出部５４は、収音信号（比較信号）２２に含まれる信号波形ｘ２（ｔ）を抽出し、当該信号波形ｘ２（ｔ）を移動する。相互相関値算出部５５は、信号波形ｘ１（ｔ）と信号波形ｘ２（ｔ）とに対して畳み込み演算（積和演算）を実施することで、収音信号２１と収音信号２２の相関が高いポイントを判断する。このとき、収音信号２２のサンプリング周波数とマイクロフォン１１、１２の空間的な距離から算出される最大位相差分に応じて、信号波形ｘ２（ｔ）を前後にシフトしながら畳み込み演算値を計算する。畳み込み演算値が最大となるポイントは符号が一致する場所であり最も相関が高いと判断することができる。 The reference signal extraction unit 52 extracts the signal waveform x1(t) included in the collected sound signal (reference signal) 21 and holds it fixed. The comparison signal extraction unit 54 extracts the signal waveform x2(t) included in the collected sound signal (comparison signal) 22 and shifts it. The cross-correlation value calculation unit 55 performs a convolution operation (product-sum operation) on the signal waveforms x1(t) and x2(t) to find the point at which the correlation between the collected sound signal 21 and the collected sound signal 22 is high. The product-sum value is calculated while the signal waveform x2(t) is shifted back and forth over a range set by the maximum phase difference, which is calculated from the sampling frequency of the collected sound signal 22 and the spatial distance between the microphones 11 and 12. The point at which the product-sum value is largest is where the signs of the two waveforms coincide, and it can be judged to have the highest correlation.

具体的に説明すると、例えば、相関性を比較する時間幅（サンプル数）を２００［ｓａｍｐｌｅ］とした場合、収音信号（基準信号）２１を固定した上で、比較対象とする収音信号（比較信号）２２を同時刻のサンプル先頭から−Ｌ［ｓａｍｐｌｅ］のポイントから＋Ｌ［ｓａｍｐｌｅ］のポイントまで移動することで相互相関値を計算することができる。ここで、Ｌは収音信号２１をデジタル変換する際のサンプリング周波数とマイクロフォン１１、１２間の距離とからその最大値を指定することができる。τ番目の相互相関値（τ）は、上記式１を用いて求めることができる。このとき、τの範囲は−Ｌから＋Ｌまでであり、Ｎ＝２００である。 More specifically, when the time width (number of samples) over which the correlation is compared is 200 [sample], for example, the cross-correlation values can be calculated by holding the collected sound signal (reference signal) 21 fixed and shifting the collected sound signal (comparison signal) 22 from the point −L [sample] to the point +L [sample] relative to the head of the samples at the same time. Here, the maximum value of L can be specified from the sampling frequency used when the collected sound signal 21 is digitized and the distance between the microphones 11 and 12. The τ-th cross-correlation value (τ) can be obtained using Equation 1 above. In this case, τ ranges from −L to +L, and N = 200.

全ての相互相関値（τ）を求めて最も相互相関値が高いτ［ｓａｍｐｌｅ］を抽出する。分解能は、ＡＤコンバータ１３、１４のサンプリング周波数に応じて変化する。例えば、"１［ｓａｍｐｌｅ］あたりの時間［ｓｅｃ］＝１／サンプリング周波数"であるので、サンプリング周波数が９６［ｋＨｚ］の場合は、１［ｓａｍｐｌｅ］あたりの時間は、約１０．４２［ｍｓｅｃ］となる。この１［ｓａｍｐｌｅ］に相当する時間にτ［ｓａｍｐｌｅ］を乗算したものがマイク間の到達時間差となり、位相のずれ（位相差）を導くことが可能となる。 All cross-correlation values (τ) are obtained, and the τ [sample] with the highest cross-correlation value is extracted. The resolution depends on the sampling frequency of the AD converters 13 and 14. Since "time per 1 [sample] [sec] = 1 / sampling frequency", a sampling frequency of 96 [kHz] gives a time per 1 [sample] of about 10.42 [µsec] (1 ÷ 96000 s). Multiplying this time per 1 [sample] by τ [sample] gives the arrival time difference between the microphones, from which the phase shift (phase difference) can be derived.
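The lag search described above can be sketched as follows. The function names and the `start` offset are hypothetical, and the caller is assumed to choose `start`, `n`, and `max_lag` so that every index stays inside both buffers.

```python
def best_lag(x1, x2, start, n, max_lag):
    """Return the lag tau in [-max_lag, +max_lag] that maximizes the
    product-sum of x1[start:start+n] against x2 shifted by tau.

    x1 is the fixed reference signal; x2 is the shifted comparison signal.
    """
    best_tau, best_val = 0, float("-inf")
    for tau in range(-max_lag, max_lag + 1):
        val = sum(x1[start + t] * x2[start + t + tau] for t in range(n))
        if val > best_val:
            best_tau, best_val = tau, val
    return best_tau


def lag_to_seconds(tau, sampling_hz):
    """Convert a lag in samples to an arrival time difference in seconds."""
    return tau / sampling_hz
```

At a sampling frequency of 96 kHz, one sample corresponds to 1/96000 s, so a lag of two samples corresponds to roughly 20.8 microseconds.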

また、最小二乗法を用いる場合は、次の式を用いることができる。
When the least square method is used, the following equation can be used.
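The equation is not reproduced in this text. As a hedged reconstruction based on the description that follows (the sum of squared differences between the fixed reference waveform and the shifted comparison waveform), one plausible form is shown below; the symbol E and the index ranges are assumptions.

```latex
E(\tau) = \sum_{t=0}^{N-1} \bigl( x_1(t) - x_2(t+\tau) \bigr)^2, \qquad -L \le \tau \le +L
```

The estimated lag is then the τ that minimizes E(τ).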

最小二乗法を用いる場合、基準信号抽出部５２は、収音信号（基準信号）２１に含まれる信号波形を抽出して固定する。比較信号抽出部５４は、収音信号（比較信号）２２に含まれる信号波形を抽出し、当該信号波形を移動する。相互相関値算出部５５は、収音信号２１に含まれる信号波形と収音信号２２に含まれる信号波形との差分値の二乗和を計算する。この二乗和が最小となるポイントは、収音信号２１に含まれる信号波形と収音信号２２に含まれる信号波形とが互いに相似形となる（重なり合う）場所であり、最も相関が高いと判断することができる。最小二乗法を用いる場合は基準信号と比較信号の大きさを揃えることが望ましく、一方を基準として予め正規化しておくのが好ましい。 When the least square method is used, the reference signal extraction unit 52 extracts and fixes a signal waveform included in the collected sound signal (reference signal) 21. The comparison signal extraction unit 54 extracts a signal waveform included in the collected sound signal (comparison signal) 22 and moves the signal waveform. The cross-correlation value calculation unit 55 calculates the sum of squares of the difference values between the signal waveform included in the collected sound signal 21 and the signal waveform included in the collected sound signal 22. The point at which the sum of squares is minimum is a place where the signal waveform included in the collected sound signal 21 and the signal waveform included in the collected sound signal 22 are similar (overlapping) to each other, and is determined to have the highest correlation. be able to. When the least square method is used, it is desirable to make the sizes of the reference signal and the comparison signal uniform, and it is preferable to normalize in advance based on one of them.
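A minimal sketch of the least-squares variant, including the amplitude normalization recommended above, might look like this. The names are hypothetical, and peak normalization is just one possible way to match the signal levels.

```python
def normalize(x):
    """Scale a signal so that its largest absolute sample is 1.0."""
    peak = max(abs(v) for v in x)
    return [v / peak for v in x] if peak > 0 else list(x)


def best_lag_least_squares(x1, x2, start, n, max_lag):
    """Return the lag tau in [-max_lag, +max_lag] that minimizes the sum of
    squared differences between x1[start:start+n] and x2 shifted by tau."""
    best_tau, best_err = 0, float("inf")
    for tau in range(-max_lag, max_lag + 1):
        err = sum((x1[start + t] - x2[start + t + tau]) ** 2
                  for t in range(n))
        if err < best_err:
            best_tau, best_err = tau, err
    return best_tau
```

Normalizing both signals first keeps a level difference between the two microphones from dominating the squared error.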

相互相関値算出部５５は、上記の演算により得られた、基準信号と比較信号の相関関係に関する情報を位相差取得部５６に出力する。すなわち、相互相関値算出部５５で相関が高いと判断された２つの信号波形（つまり、収音信号２１に含まれる信号波形と収音信号２２に含まれる信号波形）は、音源を同一とする音声やノイズの信号波形である可能性が高い。よって、位相差取得部５６は、相関が高いと判断された２つの信号波形の位相差を求めることで、音声用マイクロフォン１１で収音された音と参照音用マイクロフォン１２で収音された音の位相差を求めることができる。 The cross-correlation value calculation unit 55 outputs information related to the correlation between the reference signal and the comparison signal obtained by the above calculation to the phase difference acquisition unit 56. That is, the two signal waveforms determined to have high correlation by the cross-correlation value calculation unit 55 (that is, the signal waveform included in the sound collection signal 21 and the signal waveform included in the sound collection signal 22) have the same sound source. It is highly possible that the signal waveform is voice or noise. Therefore, the phase difference acquisition unit 56 obtains the phase difference between the two signal waveforms determined to have a high correlation, thereby obtaining the sound collected by the sound microphone 11 and the sound collected by the reference sound microphone 12. Can be obtained.

位相情報取得部１６は、音声ノイズ区間検出部１５が音声区間を検出している場合、収音信号２１と収音信号２２との位相差（音声位相差）を更新する。また、位相情報取得部１６は、音声ノイズ区間検出部１５がノイズ区間を検出している場合、収音信号２１と収音信号２２との位相差（ノイズ位相差）を更新する。 The phase information acquisition unit 16 updates the phase difference (audio phase difference) between the collected sound signal 21 and the collected sound signal 22 while the voice noise section detection unit 15 is detecting a speech section. Likewise, the phase information acquisition unit 16 updates the phase difference (noise phase difference) between the collected sound signal 21 and the collected sound signal 22 while the voice noise section detection unit 15 is detecting a noise section.

例えば、音声ノイズ区間検出部１５から供給される音声ノイズ区間情報２３が音声区間を示している場合、位相差取得部５６で取得される位相差は音声の位相差（音声位相差）である確率が高いといえる。このとき、セレクタ５９には音声ノイズ区間情報２３として音声区間を示す信号が供給されるので、セレクタ５９は位相差取得部５６から出力された位相差（音声位相差）を音声位相差格納部５７に出力する。音声位相差格納部５７は、既に格納されている音声位相差を、セレクタ５９から供給された最新の音声位相差に更新する。更新された音声位相差は、次に音声ノイズ区間情報２３が音声区間を示すタイミング（つまり、音声位相差の次の更新のタイミング）まで保持される。 For example, when the speech noise section information 23 supplied from the voice noise section detection unit 15 indicates a speech section, the phase difference acquired by the phase difference acquisition unit 56 is highly likely to be the phase difference of speech (audio phase difference). At this time, a signal indicating a speech section is supplied to the selector 59 as the speech noise section information 23, so the selector 59 outputs the phase difference (audio phase difference) output from the phase difference acquisition unit 56 to the audio phase difference storage unit 57. The audio phase difference storage unit 57 replaces the audio phase difference it already stores with the latest audio phase difference supplied from the selector 59. The updated audio phase difference is held until the next time the speech noise section information 23 indicates a speech section (that is, until the next update of the audio phase difference).

また、音声ノイズ区間検出部１５から供給される音声ノイズ区間情報２３がノイズ区間を示している場合、位相差取得部５６で取得される位相差はノイズの位相差（ノイズ位相差）である確率が高いといえる。このとき、セレクタ５９には音声ノイズ区間情報２３としてノイズ区間を示す信号が供給されるので、セレクタ５９は位相差取得部５６から出力された位相差（ノイズ位相差）をノイズ位相差格納部５８に出力する。ノイズ位相差格納部５８は、既に格納されているノイズ位相差を、セレクタ５９から供給された最新のノイズ位相差に更新する。更新されたノイズ位相差は、次に音声ノイズ区間情報２３がノイズ区間を示すタイミング（つまり、ノイズ位相差の次の更新のタイミング）まで保持される。 Similarly, when the speech noise section information 23 supplied from the voice noise section detection unit 15 indicates a noise section, the phase difference acquired by the phase difference acquisition unit 56 is highly likely to be the phase difference of noise (noise phase difference). At this time, a signal indicating a noise section is supplied to the selector 59 as the speech noise section information 23, so the selector 59 outputs the phase difference (noise phase difference) output from the phase difference acquisition unit 56 to the noise phase difference storage unit 58. The noise phase difference storage unit 58 replaces the noise phase difference it already stores with the latest noise phase difference supplied from the selector 59. The updated noise phase difference is held until the next time the speech noise section information 23 indicates a noise section (that is, until the next update of the noise phase difference).

The voice phase difference stored in the voice phase difference storage unit 57 and the noise phase difference stored in the noise phase difference storage unit 58 are supplied to the noise reduction processing determination unit 17 as phase information 25. The noise reduction processing determination unit 17 recognizes the voice phase difference and the noise phase difference separately.
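The routing performed by the selector 59 and the two storage units can be sketched as follows. This is a minimal Python illustration rather than the circuit of the embodiment; the class name `PhaseInfoStore`, its method names, and the use of `None` for a value that has not yet been measured are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PhaseInfoStore:
    """Hypothetical stand-in for selector 59 and storage units 57 and 58."""

    voice_phase_diff: Optional[float] = None  # contents of storage unit 57
    noise_phase_diff: Optional[float] = None  # contents of storage unit 58

    def update(self, phase_diff: float, is_voice_interval: bool) -> None:
        # Selector 59: route the newest phase difference to exactly one store.
        # The other store keeps its previous value until its own next update.
        if is_voice_interval:
            self.voice_phase_diff = phase_diff
        else:
            self.noise_phase_diff = phase_diff

    def phase_info(self) -> Tuple[Optional[float], Optional[float]]:
        # Phase information 25: both values, kept separate for unit 17.
        return self.voice_phase_diff, self.noise_phase_diff
```

Each call to `update` overwrites only the store that matches the detected interval type, which mirrors the hold-until-next-update behavior described above.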

FIGS. 5 to 7 show examples of the positions of the voice and noise sources relative to the voice microphone 11 and the reference sound microphone 12 of the noise reduction device according to the present embodiment. In FIGS. 5 to 7, the voice microphone 11 is provided on the front side of the wireless communication device 600, and the reference sound microphone 12 is provided on the back side. Normally, the speaker speaks toward the voice microphone 11 provided on the front side of the wireless communication device 600.

As shown in FIG. 5, when the voice source (speaker) is on the voice microphone 11 side, the phase of the voice collected by the voice microphone 11 is earlier than the phase of the voice collected by the reference sound microphone 12. In this case, therefore, the phase difference between the collected sound signal 21 and the collected sound signal 22 (voice phase difference) is positive.

On the other hand, when the noise source is on the reference sound microphone 12 side, the phase of the noise collected by the voice microphone 11 is later than the phase of the noise collected by the reference sound microphone 12. In this case, therefore, the phase difference between the collected sound signal 21 and the collected sound signal 22 (noise phase difference) is negative.

As shown in FIG. 6, when the voice source (speaker) and the noise source are both on the voice microphone 11 side, the phase of the voice collected by the voice microphone 11 is earlier than the phase of the voice collected by the reference sound microphone 12. The phase of the noise collected by the voice microphone 11 is likewise earlier than the phase of the noise collected by the reference sound microphone 12. In this case, therefore, the phase difference between the collected sound signals 21 and 22 in the voice interval (voice phase difference) and the phase difference between the collected sound signals 21 and 22 in the noise interval (noise phase difference) are both positive.

As shown in FIG. 7, when the voice source (speaker) and the noise source are both on the reference sound microphone 12 side, the phase of the voice collected by the voice microphone 11 is later than the phase of the voice collected by the reference sound microphone 12. The phase of the noise collected by the voice microphone 11 is likewise later than the phase of the noise collected by the reference sound microphone 12. In this case, therefore, the phase difference between the collected sound signals 21 and 22 in the voice interval (voice phase difference) and the phase difference between the collected sound signals 21 and 22 in the noise interval (noise phase difference) are both negative.
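The sign convention used in FIGS. 5 to 7 can be illustrated with a small Python sketch. This passage does not state how the phase difference is measured, so the sketch assumes one common approach: taking the lag at which the cross-correlation of the two collected sound signals peaks, expressed in samples. The function name `phase_difference` and the parameter `max_lag` are hypothetical.

```python
from typing import Sequence


def phase_difference(sig21: Sequence[float], sig22: Sequence[float], max_lag: int) -> int:
    """Return the lag, in samples, by which sig22 trails sig21 (assumed method).

    Positive: the voice microphone (signal 21) receives the sound first.
    Negative: the reference sound microphone (signal 22) receives it first.
    """

    def correlation(lag: int) -> float:
        # Sum of sig21[n] * sig22[n + lag] over all index pairs that exist.
        total = 0.0
        for n, x in enumerate(sig21):
            m = n + lag
            if 0 <= m < len(sig22):
                total += x * sig22[m]
        return total

    # The lag with the largest correlation is taken as the phase difference.
    return max(range(-max_lag, max_lag + 1), key=correlation)
```

With a source on the voice microphone side the result is positive, and with a source on the reference sound microphone side it is negative, matching the cases above. For silent input every lag ties, so a practical implementation would need an additional check that is omitted here.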

The noise reduction processing determination unit 17 shown in FIG. 1 determines the state of the voice phase difference and the noise phase difference acquired by the phase information acquisition unit 16. For example, it determines whether the state is a first state, in which the noise reduction processing is performed, or a second state, in which the noise reduction processing is not performed or is performed more weakly than in the first state.

For example, the noise reduction processing determination unit 17 can determine that the state is the second state when the absolute value of the difference between the voice phase difference and the noise phase difference acquired by the phase information acquisition unit 16 is within a predetermined threshold (first threshold). In the following, the second state, in which the noise reduction processing is not performed or is performed more weakly than in the first state, may be referred to simply as "the case where the noise reduction processing is not performed".

Note that the voice phase difference and the noise phase difference may also be used to further determine whether the noise reduction processing is not to be performed at all or is to be performed more weakly than in the first state. A further threshold may be provided so that the noise reduction processing is not performed when there is a higher possibility that the voice component itself would be reduced. Alternatively, the strength of the noise reduction processing may be varied by continuously adapting it to the difference between the voice phase difference and the noise phase difference. In this case, the determination operation performed by the noise reduction processing determination unit 17 is the calculation of the absolute value of the difference between the voice phase difference and the noise phase difference, and the noise reduction processing unit performs noise reduction processing with a strength corresponding to that absolute value. For example, the smaller the absolute value of the difference, the weaker the noise reduction processing may be made. The same applies when a power difference is used as in the second embodiment, with the phase difference replaced by the power difference.
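The continuously adapted variant can be sketched in Python as a mapping from the absolute difference to a strength between 0 and 1. The text only requires that a smaller difference give weaker processing; the linear ramp, the function name `reduction_strength`, and the parameter `full_strength_at` are assumptions made for illustration.

```python
def reduction_strength(voice_pd: float, noise_pd: float, full_strength_at: float) -> float:
    """Map |voice_pd - noise_pd| to a strength in [0.0, 1.0] (assumed linear ramp)."""
    if full_strength_at <= 0:
        raise ValueError("full_strength_at must be positive")
    gap = abs(voice_pd - noise_pd)
    # Smaller gap -> weaker noise reduction; saturate at full strength.
    return min(1.0, gap / full_strength_at)
```

The returned value could, for example, scale the pseudo noise signal before it is subtracted, so that a strength of 0 leaves the collected sound signal unchanged.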

Here, the predetermined threshold can be set arbitrarily. For example, the smaller the predetermined threshold, the looser the criterion for performing the noise reduction processing (in other words, the narrower the range in which it is determined that the noise reduction processing is not to be performed). The difference between the voice phase difference and the noise phase difference corresponds, for example, to the difference between the angle at which the voice enters the voice microphone 11 (the entry angle of the voice with respect to the main surface of the voice microphone 11) and the angle at which the noise enters the voice microphone 11 (the entry angle of the noise with respect to the main surface of the voice microphone 11). Therefore, the smaller the predetermined threshold, the narrower the range of differences between the voice and noise entry angles for which it is determined that the noise reduction processing is not to be performed.

Conversely, the larger the predetermined threshold, the stricter the criterion for performing the noise reduction processing (in other words, the wider the range in which it is determined that the noise reduction processing is not to be performed). That is, the larger the predetermined threshold, the wider the range of differences between the voice and noise entry angles for which it is determined that the noise reduction processing is not to be performed.

As the difference between the voice and noise entry angles approaches 0, the sounds (voice and noise) collected by the voice microphone 11 and by the reference sound microphone 12 become similar. Consequently, when the noise reduction processing unit 18 performs the noise reduction processing, the voice component is reduced along with the noise component contained in the collected sound signal 21. To solve this problem, the noise reduction device according to the present embodiment determines whether to perform the noise reduction processing based on the difference between the voice phase difference and the noise phase difference acquired by the phase information acquisition unit 16 (which corresponds to the difference between the voice and noise entry angles). That is, when the absolute value of the difference between the voice phase difference and the noise phase difference is within the predetermined threshold, it can be determined that the noise reduction processing is not to be performed.

For example, when the noise reduction processing determination unit 17 determines that the noise reduction processing is to be performed (first state), it sets the determination flag 26 to invalid (low level). When it determines that the noise reduction processing is not to be performed or is to be performed more weakly than in the first state (second state), it sets the determination flag 26 to valid (high level).
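The threshold comparison and the polarity of the determination flag 26 can be summarized in a short Python sketch. The function name `determination_flag` is hypothetical, and the numeric phase differences in the usage example are arbitrary illustrative values, not figures from the embodiment.

```python
def determination_flag(voice_pd: float, noise_pd: float, threshold: float) -> bool:
    """Return True (high level, second state) or False (low level, first state)."""
    # "Within the threshold" is treated as inclusive.
    return abs(voice_pd - noise_pd) <= threshold


# FIG. 5 style: voice from the front, noise from the back -> perform reduction.
print(determination_flag(3.0, -3.0, threshold=2.0))
# FIG. 6 style: both from the front -> do not perform (or weaken) reduction.
print(determination_flag(3.0, 2.0, threshold=2.0))
```

Lowering `threshold` shrinks the set of cases in which the flag goes high, which is the "looser criterion for performing the noise reduction processing" described above.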

The noise reduction processing unit 18 performs the noise reduction processing using the collected sound signal 21 and the collected sound signal 22 in accordance with the determination result of the noise reduction processing determination unit 17. That is, when the noise reduction processing determination unit 17 determines that the noise reduction processing is to be performed (first state; the determination flag 26 is at the low level), the noise reduction processing unit 18 reduces the noise component contained in the collected sound signal 21 using the collected sound signal 22 and outputs the signal after the noise reduction processing as the output signal 27. When the noise reduction processing determination unit 17 determines that the noise reduction processing is not to be performed or is to be performed more weakly than in the first state (second state; the determination flag 26 is at the high level), the noise reduction processing unit 18 may output the collected sound signal 21 as the voice signal without modification, or may perform the noise reduction processing so that its effect is weaker than usual (that is, the pseudo noise signal 83 shown in FIG. 8 may be set smaller).

To reduce the noise component contained in the collected sound signal (voice signal) 21, the noise reduction processing unit 18 uses the reference sound microphone 12 to collect a reference sound containing the noise component and, based on this reference sound, generates a pseudo version of the noise component that may be contained in the collected sound signal 21. The noise reduction processing unit 18 can then perform the noise reduction processing by subtracting this pseudo noise component from the collected sound signal 21.

For example, the output signal 27 (a digital signal) output from the noise reduction processing unit 18 is converted into an analog signal by a DA converter (not shown), and the resulting analog signal is output by an output unit (not shown) from a speaker or an audio output terminal.

FIG. 8 is a block diagram showing an example of the noise reduction processing unit 18. The noise reduction processing unit 18 includes delay elements 71_1 to 71_n, multipliers 72_1 to 72_n+1, adders 73_1 to 73_n, an adaptive coefficient adjustment unit 74, a subtractor 75, and an output signal selection unit 76. The delay elements 71_1 to 71_n, the multipliers 72_1 to 72_n+1, and the adders 73_1 to 73_n constitute an FIR filter. The pseudo noise signal 83 is generated by processing the collected sound signal 22 with the delay elements 71_1 to 71_n, the multipliers 72_1 to 72_n+1, and the adders 73_1 to 73_n.

The adaptive coefficient adjustment unit 74 adjusts the coefficients of the multipliers 72_1 to 72_n+1 in accordance with the voice/noise interval information 24. That is, when the voice/noise interval information 24 indicates a noise interval, the adaptive coefficient adjustment unit 74 adjusts the coefficients so that the adaptation error becomes small. When the voice/noise interval information 24 indicates a voice interval, it either maintains the coefficients or adjusts them only slightly.

The subtractor 75 subtracts the pseudo noise signal 83 from the collected sound signal 21 to generate the signal 84 after the noise reduction processing, and outputs it to the output signal selection unit 76. The subtractor 75 also generates a feedback signal 85 by subtracting the pseudo noise signal 83 from the collected sound signal 21, and outputs it to the adaptive coefficient adjustment unit 74.
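The FIR filter, subtractor, and coefficient adjustment of FIG. 8 can be sketched in Python as follows. The text says only that the coefficients are adjusted so that the adaptation error becomes small; the normalized LMS (NLMS) update used here is a common choice substituted for illustration, not necessarily the method of the embodiment. The class name `AdaptiveNoiseCanceller`, the `step_size` parameter, and the decision to hold the coefficients completely during voice intervals are likewise assumptions.

```python
from typing import List


class AdaptiveNoiseCanceller:
    """FIR filter with n delay elements and n + 1 coefficients, after FIG. 8."""

    def __init__(self, n_delays: int, step_size: float = 0.5) -> None:
        self.coeffs: List[float] = [0.0] * (n_delays + 1)  # multipliers 72_1..72_n+1
        self.taps: List[float] = [0.0] * (n_delays + 1)    # current + n delayed samples
        self.step_size = step_size

    def process(self, sample21: float, sample22: float, is_noise_interval: bool) -> float:
        # Delay elements 71_1..71_n: shift in the newest reference sample.
        self.taps = [sample22] + self.taps[:-1]
        # Multipliers and adders: pseudo noise signal 83.
        pseudo_noise = sum(c * x for c, x in zip(self.coeffs, self.taps))
        # Subtractor 75: signal 84 (output) and feedback signal 85 share this value.
        error = sample21 - pseudo_noise
        if is_noise_interval:
            # Assumed NLMS update (unit 74); coefficients are held otherwise.
            power = sum(x * x for x in self.taps) + 1e-12
            gain = self.step_size * error / power
            self.coeffs = [c + gain * x for c, x in zip(self.coeffs, self.taps)]
        return error
```

Calling `process` once per sample with the noise-interval indication lets the filter learn the path from the reference sound microphone to the voice microphone while only noise is present, and freezes it while the speaker is talking so that the voice is not learned as noise.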

The output signal selection unit 76 selects, in accordance with the determination flag 26 output from the noise reduction processing determination unit 17, whether to output the collected sound signal 21 unchanged as the output signal 27 or to output the signal 84 after the noise reduction processing as the output signal 27. That is, when the determination flag 26 output from the noise reduction processing determination unit 17 is valid (high level), the output signal selection unit 76 outputs the collected sound signal 21 unchanged as the output signal 27. When the determination flag 26 is invalid (low level), it outputs the signal 84 after the noise reduction processing as the output signal 27.

Next, the operation of the noise reduction device 1 according to the present embodiment will be described. FIG. 9 is a flowchart for explaining the operation of the noise reduction device 1 according to the present embodiment.

First, the voice/noise interval detection unit 15 detects whether the sound collected by the voice microphone 11 (collected sound signal 21) belongs to a voice interval or a noise interval (step S1). Making the conditions for detecting voice intervals and noise intervals strict at this point allows both kinds of interval to be detected reliably.

When the voice/noise interval information 23 detected by the voice/noise interval detection unit 15 indicates a noise interval (step S2: No), the phase information acquisition unit 16 acquires the noise phase difference using the collected sound signal 21 and the collected sound signal 22 in the noise interval (step S3). The phase information acquisition unit 16 then updates the noise phase difference it already holds with the noise phase difference acquired in step S3 (step S4).

On the other hand, when the voice/noise interval information 23 detected by the voice/noise interval detection unit 15 indicates a voice interval (step S2: Yes), the phase information acquisition unit 16 acquires the voice phase difference using the collected sound signal 21 and the collected sound signal 22 in the voice interval (step S5). The phase information acquisition unit 16 then updates the voice phase difference it already holds with the voice phase difference acquired in step S5 (step S6).

Next, the noise reduction processing determination unit 17 determines whether to perform the noise reduction processing based on the voice phase difference and the noise phase difference acquired by the phase information acquisition unit 16. When the absolute value of the difference between the voice phase difference and the noise phase difference is larger than the predetermined threshold (step S7: No), the noise reduction processing determination unit 17 determines that the noise reduction processing is to be performed. Because the determination flag 26 output from the noise reduction processing determination unit 17 is then invalid (low level), the noise reduction processing unit 18 reduces the noise component contained in the collected sound signal 21 using the collected sound signal 22 and outputs the signal after the noise reduction processing as the output signal 27 (step S8).

On the other hand, when the absolute value of the difference between the voice phase difference and the noise phase difference is within the predetermined threshold (step S7: Yes), the noise reduction processing determination unit 17 determines that the noise reduction processing is not to be performed. Because the determination flag 26 output from the noise reduction processing determination unit 17 is then valid (high level), the noise reduction processing unit 18 outputs the collected sound signal 21 (voice signal) unchanged (step S9).
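The control flow of steps S2 to S9 can be condensed into a per-frame Python sketch. The interval detection of step S1, the phase difference measurement, and the noise-reduced signal are supplied by the caller, so the class `NoiseReductionFlow` and its argument names are hypothetical. The flowchart does not say what happens before both phase differences have been measured at least once; passing the collected sound signal through unchanged in that case is an assumption.

```python
from typing import Optional


class NoiseReductionFlow:
    """Per-frame control flow of steps S2 to S9 (detection and filtering are external)."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.voice_pd: Optional[float] = None
        self.noise_pd: Optional[float] = None

    def step(self, is_voice: bool, frame_pd: float, signal21: float, signal84: float) -> float:
        # S2 to S6: update the stored phase difference for the detected interval type.
        if is_voice:
            self.voice_pd = frame_pd
        else:
            self.noise_pd = frame_pd
        # Assumption: bypass until both phase differences are available.
        if self.voice_pd is None or self.noise_pd is None:
            return signal21
        # S7: compare the gap between the two phase differences with the threshold.
        if abs(self.voice_pd - self.noise_pd) > self.threshold:
            return signal84  # S8: output the noise-reduced signal
        return signal21      # S9: output collected sound signal 21 unchanged
```

Because the stored values persist between calls, the decision in step S7 always compares the most recent voice phase difference with the most recent noise phase difference, even though only one of them is refreshed per frame.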

Next, a voice input device using the noise reduction device according to the present embodiment will be described. FIG. 10 shows an example of a voice input device 500 using the noise reduction device according to the present embodiment. FIG. 10(a) is a front view of the voice input device 500, and FIG. 10(b) is a rear view of the voice input device 500. As shown in FIG. 10, the voice input device 500 is configured to be connectable to a wireless communication device 510 via a connector 503. A general-purpose radio can be used as the wireless communication device 510, which is configured to communicate with other wireless communication devices at a predetermined frequency. The speaker's voice is input to the wireless communication device 510 via the voice input device 500.

The voice input device 500 has a main body 501, a cord 502, and a connector 503. The main body 501 has a size and shape suitable for being held in the speaker's hand, and houses a microphone, a speaker, electronic circuitry, and the noise reduction device. As shown in FIG. 10(a), a speaker 506 and a voice microphone 505 are provided on the front of the main body 501. As shown in FIG. 10(b), a reference sound microphone 508 and a belt clip 507 are provided on the back of the main body 501. An LED 509 is provided on the top surface of the main body 501. A PTT (Push To Talk) 504 is provided on a side surface of the main body 501. The LED 509 notifies the speaker of the state of detection of the speaker's voice by the voice input device 500. The PTT 504 is a switch for putting the wireless communication device 510 into a voice transmission state, and detects that its protruding portion has been pushed into the housing.

The noise reduction device 1 according to the present embodiment is built into the voice input device 500. The voice microphone 11 of the noise reduction device 1 corresponds to the voice microphone 505 of the voice input device 500, and the reference sound microphone 12 of the noise reduction device 1 corresponds to the reference sound microphone 508 of the voice input device 500. The output signal 27 output from the noise reduction device 1 is supplied to the wireless communication device 510 via the cord 502 of the voice input device 500. That is, the voice input device 500 supplies the wireless communication device 510 with the output signal 27 that has undergone noise reduction processing in the noise reduction device 1. The voice transmitted from the wireless communication device 510 to other wireless communication devices is therefore voice that has undergone noise reduction processing. Note that the noise reduction device 1 may instead be built into the wireless communication device 510.

Next, a wireless communication device (transceiver) 600 using the noise reduction device according to the present embodiment will be described. FIG. 11 shows an example of the wireless communication device 600 using the noise reduction device according to the present embodiment. FIG. 11(a) is a front view of the wireless communication device 600, and FIG. 11(b) is a rear view of the wireless communication device 600. As shown in FIG. 11, the wireless communication device 600 includes input buttons 601, a display unit 602, a speaker 603, a voice microphone 604, a PTT (Push To Talk) 605, a switch 606, an antenna 607, a reference sound microphone 608, and a lid 609.

The noise reduction device 1 according to the present embodiment is built into the wireless communication device 600. The voice microphone 11 of the noise reduction device 1 corresponds to the voice microphone 604 of the wireless communication device 600, and the reference sound microphone 12 of the noise reduction device 1 corresponds to the reference sound microphone 608 of the wireless communication device 600. The output signal 27 output from the noise reduction device 1 undergoes high-frequency processing in the internal circuitry of the wireless communication device 600 and is transmitted wirelessly from the antenna 607 to other wireless communication devices. Because the output signal 27 output from the noise reduction device 1 has undergone noise reduction processing, the voice transmitted to other wireless communication devices is voice that has undergone noise reduction processing. The processing of the noise reduction device 1 shown in FIG. 9 may be started when the user presses the PTT 605 and sound transmission begins, and may be ended when the user releases the PTT 605 and sound transmission ends.

As described in the problem to be solved by the invention, when noise reduction processing is performed using the voice microphone 11, which mainly collects voice, and the reference sound microphone 12, which mainly collects noise, the amount by which the voice is reduced (cancelled) increases depending on the direction from which the noise arrives. Depending on how the noise reduction device is used, voice may also enter the reference sound microphone 12, which is intended to collect noise. When voice enters the reference sound microphone 12 in this way, not only the noise component mixed into the sound collected by the voice microphone but also the voice itself is cancelled, and the intelligibility of the voice is degraded.

For example, as shown in FIG. 7, when the voice source (speaker) and the noise source are both on the reference sound microphone 12 side, voice is also collected by the reference sound microphone 12. The noise reduction device takes the reference sound collected by the reference sound microphone 12, generates from it a pseudo version of the noise component that may be contained in the collected sound signal 21, and performs the noise reduction processing by subtracting this pseudo noise component from the collected sound signal 21. Consequently, when voice enters the reference sound microphone 12, the voice itself is cancelled together with the noise component when the noise component mixed into the sound collected by the voice microphone 11 is reduced.

Also, as shown in FIG. 6 for example, when the voice source (speaker) and the noise source are both on the voice microphone 11 side, the processing acts to cancel sound arriving from the direction of the noise source. The voice component, which arrives from the same direction, is therefore reduced in the sound collected by the voice microphone 11 even if only a small voice component enters the reference sound microphone 12, and the clarity of the voice is impaired. Besides the case where the voice source and the noise source lie in the same direction (see FIGS. 6 and 7), the voice component is also reduced along with the noise component when the voice and the noise arrive from directions that are mirror-symmetric about the axis formed by the straight line connecting the voice microphone 11 and the reference sound microphone 12 (that is, when the incident angles of the voice and the noise at each microphone are similar). In such environments, the voice itself is cancelled together with the noise component during the noise reduction processing, and the noise reduction processing cannot be performed appropriately.

In Patent Document 1, voice and noise are collected with directional microphones in order to distinguish the voice signal from the noise signal. The directional microphones are arranged so that they face in opposite directions. However, when the speaker speaks from a direction 90 degrees to the side of the directional microphone that collects voice (that is, from a position midway between the two directional microphones), the directional microphone cannot collect the voice properly. Moreover, when the speaker speaks from the side of the two directional microphones, the voice component is input equally to the voice microphone and the noise microphone. In this case the voice is cancelled together with the noise, so the quality of the output voice deteriorates.

In Patent Document 2, two types of directional microphones are used in combination in order to distinguish the voice signal from the noise signal. However, the input gain of a directional microphone is fixed toward a particular direction. Therefore, when the speaker talks from a position away from the front of the voice microphone, the voice may fall outside the directivity range and the voice signal may not be collected. In addition, when voice enters the noise microphone, the voice component is reduced together with the noise component.

Thus, depending on the arrival directions of the voice and the noise, a noise reduction device may be unable to perform noise reduction processing appropriately.

To solve this problem, the noise reduction device according to the present embodiment determines whether to perform noise reduction processing based on a voice phase difference, which is the phase difference between the sound collection signal 21 and the sound collection signal 22 in a voice section, and a noise phase difference, which is the phase difference between the sound collection signal 21 and the sound collection signal 22 in a noise section. Specifically, when the absolute value of the difference between the voice phase difference and the noise phase difference is within a predetermined threshold, the device can determine that noise reduction processing is not to be performed.
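The determination described above can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the function name, the argument names, and the use of plain numeric phase differences are assumptions made for the example, and "within the threshold" is read as an inclusive comparison.

```python
def skip_noise_reduction_by_phase(voice_phase_diff, noise_phase_diff, threshold):
    """Return True when noise reduction should NOT be performed.

    voice_phase_diff: phase difference between signals 21 and 22 in a voice section
    noise_phase_diff: phase difference between signals 21 and 22 in a noise section
    threshold:        the predetermined threshold (same unit as the phase differences)
    """
    # Similar phase differences suggest voice and noise arrive from similar
    # directions, so cancelling the noise would also cancel the voice.
    return abs(voice_phase_diff - noise_phase_diff) <= threshold
```

A smaller threshold narrows the range of situations in which noise reduction is skipped.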

Thus, the noise reduction device according to the present embodiment can identify, from the voice phase difference and the noise phase difference, cases in which noise reduction processing is inappropriate, and can refrain from performing noise reduction processing in those cases. Cases in which noise reduction processing is inappropriate include, for example, the case where voice leaks into the reference sound microphone 12 and the case where voice and noise arrive from directions that are mirror-symmetric about the axis connecting the voice microphone 11 and the reference sound microphone 12 (that is, when the incidence angles of the voice and the noise at each microphone are similar).

FIGS. 12 and 13 are diagrams for explaining the effect of the invention according to the present embodiment. In FIGS. 12 and 13, the horizontal axis is sample time [sec] and the vertical axis is sound pressure level [dB]. FIG. 12 shows a state in which voice has also leaked into the reference sound microphone 12, so that not only the noise component mixed into the voice collected by the voice microphone 11 but also the voice itself is cancelled. FIG. 13 shows the case where the noise reduction device according to the present embodiment is used. That is, FIG. 13 shows a state in which the device has determined that noise reduction processing is inappropriate and has not performed it. The positions of section A and section B in FIG. 12 correspond to the positions of section A and section B in FIG. 13.

As shown in FIG. 12, when the noise reduction device according to the present embodiment is not used, the voice component is reduced in section A and section B. In contrast, when the noise reduction device according to the present embodiment is used, the voice component is not reduced in section A and section B, as shown in FIG. 13. Therefore, by using the noise reduction device according to the present embodiment, noise reduction processing can be performed appropriately according to the arrival direction of the voice and the arrival direction of the noise.

The invention according to the present embodiment described above can provide a noise reduction device, a voice input device, a wireless communication device, and a noise reduction method that can appropriately reduce noise components.

<Embodiment 2>
Next, Embodiment 2 of the present invention is described. FIG. 14 is a block diagram showing the noise reduction device according to Embodiment 2. The noise reduction device 2 according to the present embodiment differs from the noise reduction device 1 described in Embodiment 1 in that it includes a power information acquisition unit 60. It is otherwise the same as the noise reduction device 1 described in Embodiment 1, so the same components are denoted by the same reference numerals and redundant description is omitted.

When the voice/noise section information 23 indicates a voice section, the power information acquisition unit 60 acquires a voice power difference, which is the difference between the magnitude of the sound collection signal 21 and the magnitude of the sound collection signal 22 in the voice section. When the voice/noise section information 23 indicates a noise section, the power information acquisition unit 60 acquires a noise power difference, which is the difference between the magnitude of the sound collection signal 21 and the magnitude of the sound collection signal 22 in the noise section. The acquired voice power difference and noise power difference are supplied to the noise reduction processing determination unit 17 as power information 28.

FIG. 15 is a block diagram showing an example of the power information acquisition unit 60 included in the noise reduction device 2 according to the present embodiment. The power information acquisition unit 60 shown in FIG. 15 includes a sound collection signal buffer 61, a sound collection signal power calculation unit 62, a sound collection signal buffer 63, a sound collection signal power calculation unit 64, a power difference calculation unit 65, a voice power difference storage unit 67, a noise power difference storage unit 68, and a selector 69. The power information acquisition unit 60 shown in FIG. 15 can obtain power information (in the case of FIG. 15, a power difference) for the sound collection signal 21 and the sound collection signal 22 over a fixed unit time.

The sound collection signal buffer 61 temporarily stores the supplied sound collection signal 21 so as to accumulate one unit time of the sound collection signal 21. The sound collection signal buffer 63 temporarily stores the supplied sound collection signal 22 so as to accumulate one unit time of the sound collection signal 22.

The sound collection signal power calculation unit 62 calculates a power value per unit time from the one unit time of sound collection signal accumulated in the sound collection signal buffer 61. Likewise, the sound collection signal power calculation unit 64 calculates a power value per unit time from the one unit time of sound collection signal accumulated in the sound collection signal buffer 63.

Here, the power value per unit time is the magnitude of the sound collection signals 21 and 22 in the unit time. For example, the maximum amplitude of the sound collection signals 21 and 22 in the unit time, or the integral of the amplitude of the sound collection signals 21 and 22 over the unit time, can be used. In the present embodiment, any value that indicates the magnitude of the sound collection signals 21 and 22 may be used as the power value, including values other than the maximum and the integral mentioned above.

The power difference calculation unit 65 calculates the power difference between the power value of the sound collection signal 21 obtained by the sound collection signal power calculation unit 62 and the power value of the sound collection signal 22 obtained by the sound collection signal power calculation unit 64.
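The power value and power difference calculations described above might look like the following sketch. The function names are hypothetical, and treating the "integral of the amplitude" as a sum of absolute sample values over one unit-time frame is an assumption; the text allows any measure that indicates signal magnitude.

```python
def power_max(frame):
    """Power value as the maximum absolute amplitude in one unit-time frame."""
    return max(abs(x) for x in frame)


def power_integral(frame):
    """Power value as the sum of absolute amplitudes (assumed discrete 'integral')."""
    return sum(abs(x) for x in frame)


def power_difference(frame21, frame22, power=power_integral):
    """Power of sound collection signal 21 minus power of signal 22 for one frame."""
    return power(frame21) - power(frame22)
```

A positive result means the source is louder at the voice microphone 11 and a negative result means it is louder at the reference sound microphone 12.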

While the voice/noise section detection unit 15 is detecting a voice section, the power information acquisition unit 60 updates the power difference between the sound collection signal 21 and the sound collection signal 22, that is, the difference between the magnitude of the sound collection signal 21 and the magnitude of the sound collection signal 22 (the voice power difference). While the voice/noise section detection unit 15 is detecting a noise section, the power information acquisition unit 60 updates the power difference between the sound collection signal 21 and the sound collection signal 22, that is, the difference between the magnitude of the sound collection signal 21 and the magnitude of the sound collection signal 22 (the noise power difference).

For example, when the voice/noise section information 23 supplied from the voice/noise section detection unit 15 indicates a voice section, the power difference calculated by the power difference calculation unit 65 is very likely a power difference of voice (voice power difference). At this time, a signal indicating a voice section is supplied to the selector 69 as the voice/noise section information 23, so the selector 69 outputs the power difference (voice power difference) output from the power difference calculation unit 65 to the voice power difference storage unit 67. The voice power difference storage unit 67 replaces the stored voice power difference with the latest voice power difference supplied from the selector 69. The updated voice power difference is held until the voice/noise section information 23 next indicates a voice section (that is, until the next update of the voice power difference).

Similarly, when the voice/noise section information 23 supplied from the voice/noise section detection unit 15 indicates a noise section, the power difference calculated by the power difference calculation unit 65 is very likely a power difference of noise (noise power difference). At this time, a signal indicating a noise section is supplied to the selector 69 as the voice/noise section information 23, so the selector 69 outputs the power difference (noise power difference) output from the power difference calculation unit 65 to the noise power difference storage unit 68. The noise power difference storage unit 68 replaces the stored noise power difference with the latest noise power difference supplied from the selector 69. The updated noise power difference is held until the voice/noise section information 23 next indicates a noise section (that is, until the next update of the noise power difference).

The voice power difference stored in the voice power difference storage unit 67 and the noise power difference stored in the noise power difference storage unit 68 are supplied to the noise reduction processing determination unit 17 as power information 28. The noise reduction processing determination unit 17 recognizes the voice power difference and the noise power difference as separate values.
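The routing performed by the selector 69 and the two storage units can be sketched as a small state holder. The class and attribute names are hypothetical, and starting with empty (`None`) stored values is an assumption, since the text does not state the initial contents of the storage units.

```python
class PowerInfoStore:
    """Holds the latest voice and noise power differences (units 67 and 68)."""

    def __init__(self):
        self.voice_power_diff = None  # assumed initial state: nothing stored yet
        self.noise_power_diff = None

    def update(self, is_voice_section, power_diff):
        """Route the new power difference as the selector 69 would."""
        if is_voice_section:
            self.voice_power_diff = power_diff   # overwrite; held until next voice section
        else:
            self.noise_power_diff = power_diff   # overwrite; held until next noise section

    def power_info(self):
        """Return both values separately, corresponding to power information 28."""
        return self.voice_power_diff, self.noise_power_diff
```

Each stored value persists across frames of the other kind, so the determination unit always sees the most recent voice value alongside the most recent noise value.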

For example, as shown in FIG. 5, when the voice source (speaker) is on the voice microphone 11 side, the voice collected by the voice microphone 11 is louder than the voice collected by the reference sound microphone 12. In this case, the power difference between the sound collection signal 21 and the sound collection signal 22 (voice power difference) is positive.

On the other hand, when the noise source is on the reference sound microphone 12 side, the noise collected by the voice microphone 11 is quieter than the noise collected by the reference sound microphone 12. In this case, the power difference between the sound collection signal 21 and the sound collection signal 22 (noise power difference) is negative.

As shown in FIG. 6, when the voice source (speaker) and the noise source are both on the voice microphone 11 side, the voice collected by the voice microphone 11 is louder than the voice collected by the reference sound microphone 12, and the noise collected by the voice microphone 11 is also louder than the noise collected by the reference sound microphone 12. In this case, the power difference between the sound collection signal 21 and the sound collection signal 22 in the voice section (voice power difference) and the power difference between the sound collection signal 21 and the sound collection signal 22 in the noise section (noise power difference) are both positive.

As shown in FIG. 7, when the voice source (speaker) and the noise source are both on the reference sound microphone 12 side, the voice collected by the voice microphone 11 is quieter than the voice collected by the reference sound microphone 12, and the noise collected by the voice microphone 11 is also quieter than the noise collected by the reference sound microphone 12. In this case, the power difference between the sound collection signal 21 and the sound collection signal 22 in the voice section (voice power difference) and the power difference between the sound collection signal 21 and the sound collection signal 22 in the noise section (noise power difference) are both negative.

Based on the voice power difference and the noise power difference acquired by the power information acquisition unit 60, the noise reduction processing determination unit 17 distinguishes between a first state, in which noise reduction processing is performed, and a second state, in which noise reduction processing is not performed or is performed more weakly than in the first state. For example, the noise reduction processing determination unit 17 can determine the second state when the absolute value of the difference between the voice power difference and the noise power difference acquired by the power information acquisition unit 60 is within a predetermined threshold (second threshold). For the voice power difference and the noise power difference, a relative ratio between the microphones (for example, power of the sound collection signal 21 / power of the sound collection signal 22) can be obtained for each and the two compared, which makes it possible to determine whether the approach angles of the voice and the noise to the microphones are similar.

The predetermined threshold can be set arbitrarily. For example, the smaller the threshold, the looser the criterion for performing noise reduction processing (in other words, the narrower the range in which it is determined that noise reduction processing is not performed). The difference between the voice power difference and the noise power difference corresponds, for example, to the difference between the approach angle of the voice to the voice microphone 11 (the approach angle of the voice relative to the main surface of the voice microphone 11) and the approach angle of the noise to the voice microphone 11 (the approach angle of the noise relative to the main surface of the voice microphone 11). Therefore, the smaller the threshold, the narrower the range of approach angle differences between voice and noise for which it is determined that noise reduction processing is not performed.

As the difference between the approach angles of the voice and the noise approaches 0, the sounds (voice and noise) collected by the voice microphone 11 and the reference sound microphone 12 become similar. Consequently, when the noise reduction processing unit 18 performs noise reduction processing, the voice component is reduced at the same time as the noise component contained in the sound collection signal 21. To solve this problem, the noise reduction device according to the present embodiment determines whether to perform noise reduction processing based on the difference between the voice power difference and the noise power difference acquired by the power information acquisition unit 60 (which corresponds to the difference between the approach angles of the voice and the noise). Specifically, when the absolute value of the difference between the voice power difference and the noise power difference is within a predetermined threshold, the device can determine that noise reduction processing is not to be performed.

Next, the operation of the noise reduction device 2 according to the present embodiment is described. FIG. 16 is a flowchart for explaining the operation of the noise reduction device 2 according to the present embodiment.

First, the voice/noise section detection unit 15 detects whether the sound collected by the voice microphone 11 (sound collection signal 21) is a voice section or a noise section (step S11). Applying strict conditions for detecting voice sections and noise sections allows both to be detected reliably.

When the voice/noise section information 23 from the voice/noise section detection unit 15 indicates a noise section (step S12: No), the power information acquisition unit 60 acquires the noise power difference using the sound collection signal 21 and the sound collection signal 22 in the noise section (step S13). The power information acquisition unit 60 then uses the noise power difference acquired in step S13 to update the noise power difference already held (step S14).

On the other hand, when the voice/noise section information 23 from the voice/noise section detection unit 15 indicates a voice section (step S12: Yes), the power information acquisition unit 60 acquires the voice power difference using the sound collection signal 21 and the sound collection signal 22 in the voice section (step S15). The power information acquisition unit 60 then uses the voice power difference acquired in step S15 to update the voice power difference already held (step S16).

Next, the noise reduction processing determination unit 17 determines whether to perform noise reduction processing based on the voice power difference and the noise power difference acquired by the power information acquisition unit 60. When the absolute value of the difference between the voice power difference and the noise power difference is larger than the predetermined threshold (step S17: No), the noise reduction processing determination unit 17 determines that noise reduction processing is to be performed. At this time, the determination flag 26 output from the noise reduction processing determination unit 17 is invalid (low level), so the noise reduction processing unit 18 reduces the noise component contained in the sound collection signal 21 using the sound collection signal 22 and outputs the signal after noise reduction processing as the output signal 27 (step S18).

On the other hand, when the absolute value of the difference between the voice power difference and the noise power difference is within the predetermined threshold (step S17: Yes), the noise reduction processing determination unit 17 determines that noise reduction processing is not to be performed. At this time, the determination flag 26 output from the noise reduction processing determination unit 17 is valid (high level), so the noise reduction processing unit 18 outputs the sound collection signal 21 (voice signal) as it is (step S19).
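Steps S12 to S19 can be sketched as a per-frame function. This is a rough sketch under several assumptions that are not in the source: the section decision of step S11 is passed in as `is_voice`, the power value is a sum of absolute amplitudes, `reduce_noise` is a caller-supplied stand-in for the noise reduction processing unit 18, and noise reduction is performed until both power differences have been stored at least once.

```python
def process_frame(frame21, frame22, is_voice, state, threshold, reduce_noise):
    """Process one unit-time frame; `state` holds the stored power differences."""
    # Power difference between sound collection signals 21 and 22 (assumed measure).
    diff = sum(abs(x) for x in frame21) - sum(abs(x) for x in frame22)

    if is_voice:                      # S12: Yes
        state["voice"] = diff         # S15, S16: acquire and update voice power difference
    else:                             # S12: No
        state["noise"] = diff         # S13, S14: acquire and update noise power difference

    voice, noise = state["voice"], state["noise"]
    # S17: is |voice power difference - noise power difference| within the threshold?
    flag_valid = (voice is not None and noise is not None
                  and abs(voice - noise) <= threshold)

    if flag_valid:
        return list(frame21)          # S19: output signal 21 unchanged
    return reduce_noise(frame21, frame22)  # S18: perform noise reduction


def subtract(frame21, frame22):
    """Toy noise reduction for demonstration only: sample-wise subtraction."""
    return [a - b for a, b in zip(frame21, frame22)]
```

Called with `state = {"voice": None, "noise": None}` and `subtract` as the callback, the function passes the frame through whenever the stored voice and noise power differences are close, and otherwise applies the supplied reduction.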

The noise reduction device according to the present embodiment determines whether to perform noise reduction processing based on the voice power difference, which is the difference between the magnitude of the sound collection signal 21 and the magnitude of the sound collection signal 22 in a voice section, and the noise power difference, which is the difference between the magnitude of the sound collection signal 21 and the magnitude of the sound collection signal 22 in a noise section. Specifically, when the absolute value of the difference between the voice power difference and the noise power difference is within a predetermined threshold, the device can determine that noise reduction processing is not to be performed.

Thus, the noise reduction device according to the present embodiment can identify, from the voice power difference and the noise power difference, cases in which noise reduction processing is inappropriate, and can refrain from performing noise reduction processing in those cases.

The noise reduction device according to the present embodiment may also determine whether to perform noise reduction processing using the phase information acquired by the phase information acquisition unit 16 (see Embodiment 1) together with the power information acquired by the power information acquisition unit 60. In this case, the device can be configured, like the noise reduction device 2' shown in FIG. 17, to include a phase power information acquisition unit 70 that contains both a phase information acquisition unit and a power information acquisition unit.

For example, the phase power information acquisition unit 70 acquires the voice phase difference and the noise phase difference in its phase information acquisition unit, acquires the voice power difference and the noise power difference in its power information acquisition unit, and outputs this information to the noise reduction processing determination unit 17 as phase power information 29.

The noise reduction processing determination unit 17 can determine whether to perform noise reduction processing based on the difference between the voice phase difference and the noise phase difference and the difference between the voice power difference and the noise power difference acquired by the phase power information acquisition unit 70. For example, it can determine that noise reduction processing is not to be performed when the absolute value of the difference between the voice phase difference and the noise phase difference is within a predetermined first threshold and the absolute value of the difference between the voice power difference and the noise power difference is within a predetermined second threshold. Adjusting the first threshold and the second threshold makes it possible to weight the determination based on the phase differences against the determination based on the power differences.
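A minimal sketch of the combined determination, assuming both criteria are evaluated as inclusive threshold tests and joined with a logical AND as in the example above. The function and parameter names are hypothetical.

```python
def skip_noise_reduction(voice_phase_diff, noise_phase_diff,
                         voice_power_diff, noise_power_diff,
                         first_threshold, second_threshold):
    """Return True when noise reduction should NOT be performed."""
    phase_close = abs(voice_phase_diff - noise_phase_diff) <= first_threshold
    power_close = abs(voice_power_diff - noise_power_diff) <= second_threshold
    # Both criteria must indicate that voice and noise arrive from similar directions.
    return phase_close and power_close
```

Under this reading, loosening one threshold makes that criterion easier to satisfy, so the outcome depends more heavily on the other criterion; this is one way the two thresholds can act as weights.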

For example, with a portable device such as a transceiver (wireless communication device) or a small device such as a speaker microphone attached to a wireless communication device (voice input device), the microphone opening may be covered by the hand depending on how the device is held, or blocked by clothing. Therefore, when determining whether to perform noise reduction processing, using the phase difference method and the power difference method in combination allows cases in which noise reduction processing is inappropriate to be identified more accurately.

<Embodiment 3>
Next, Embodiment 3 of the present invention is described. FIG. 18 is a block diagram showing the noise reduction device 3 according to the present embodiment. The noise reduction device 3 according to the present embodiment differs from the noise reduction device 1 described in Embodiment 1 (see FIG. 1) in the configuration and operation of the voice/noise section detection unit 95, the noise reduction processing determination unit 97, and the noise reduction processing unit 98. It is otherwise the same as the noise reduction device 1 described in Embodiment 1, so the same components are denoted by the same reference numerals and redundant description is omitted.

As shown in FIG. 18, the noise reduction device 3 according to the present embodiment includes the voice microphone 11, the reference sound microphone 12, AD converters 13 and 14, the voice/noise section detection unit 95, the phase information acquisition unit 16, the noise reduction processing determination unit 97, and the noise reduction processing unit 98.

The voice noise section detection unit 95 detects voice sections and noise sections based on the sound collection signal 21 output from the AD converter 13 or the sound collection signal 22 output from the AD converter 14. The voice noise section detection unit 95 then outputs voice noise section information 23 and 24, which indicates the voice sections and noise sections, to the phase information acquisition unit 16 and the noise reduction processing unit 98, respectively.

For example, the voice noise section detection unit 95 may include a circuit that determines whether the sound collection signal 21 contains voice and a circuit that determines whether the sound collection signal 22 contains voice. In this case, the voice noise section detection unit 95 can detect voice sections using whichever sound collection signal contains more voice. The voice sections and noise sections can be detected in the voice noise section detection unit 95 using the same techniques as those described in Embodiment 1.

In the noise reduction device 1 described in Embodiment 1, on the premise that the voice is most likely to be picked up by the voice microphone 11, the voice noise section detection unit 15 determines voice sections based only on the sound collection signal 21 of the voice microphone 11. Depending on how the noise reduction device is used, however, the reference sound microphone 12 may pick up more of the voice than the voice microphone 11. In the present embodiment, therefore, the voice noise section detection unit 95 is configured so that it can detect voice sections using whichever of the sound collection signals 21 and 22 contains more voice.

Based on the audio phase difference and the noise phase difference acquired by the phase information acquisition unit 16, the noise reduction processing determination unit 97 distinguishes between a first state, in which noise reduction processing is performed, and a second state, in which noise reduction processing is either not performed or performed more weakly than in the first state. For example, the noise reduction processing determination unit 97 can determine the second state when the absolute value of the difference between the audio phase difference and the noise phase difference acquired by the phase information acquisition unit 16 is within a predetermined threshold (first threshold).

As shown in FIG. 21, depending on how the wireless communication device 600 including the noise reduction device is used, the sound source of the voice may be located on the side where the reference sound microphone 12 is arranged. In this case, the sound collection signal 22 from the reference sound microphone 12 contains much of the voice component, so noise reduction processing can be performed more reliably by reducing the noise component contained in the sound collection signal 22 using the sound collection signal 21.

Therefore, when the sound source of the voice is on the reference sound microphone 12 side, the noise reduction processing determination unit 97 outputs a selection signal 99 (for example, a high-level signal) for switching between the sound collection signal 21 and the sound collection signal 22 used for noise reduction processing in the noise reduction processing unit 98. Outputting the selection signal 99 to the noise reduction processing unit 98 in this way makes it possible to reduce the noise component contained in the sound collection signal 22 using the sound collection signal 21.

For example, when the phase of the voice picked up by the voice microphone 11 lags the phase of the voice picked up by the reference sound microphone 12, it can be determined that the sound source of the voice is on the reference sound microphone 12 side. In other words, when the phase of the sound collection signal 22 in the voice section acquired by the phase information acquisition unit 16 leads the phase of the sound collection signal 21 (that is, when the phase difference between the sound collection signal 21 and the sound collection signal 22 (the audio phase difference) is negative), the noise reduction processing determination unit 97 can determine that the sound source of the voice is on the reference sound microphone 12 side.

Also, for example, as shown in FIG. 21, when the noise source is on the voice microphone 11 side, the phase of the noise picked up by the voice microphone 11 leads the phase of the noise picked up by the reference sound microphone 12. In this case, therefore, the phase difference between the sound collection signal 21 and the sound collection signal 22 (the noise phase difference) is positive.
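As a rough illustration of the sign convention described above, the sketch below estimates a signed lag between two frames by brute-force cross-correlation and treats a negative value as "voice source on the reference sound microphone 12 side". The cross-correlation method, the names `signed_lag` and `voice_source_on_reference_side`, and the `max_lag` parameter are assumptions made for illustration; this passage does not specify how the phase information acquisition unit 16 computes the phase difference.

```python
def signed_lag(frame_21, frame_22, max_lag):
    """Return the lag d (in samples) that maximises sum(frame_21[i] * frame_22[i + d]).

    d > 0: frame_22 looks like a delayed copy of frame_21 (signal 21 leads).
    d < 0: frame_21 looks like a delayed copy of frame_22 (signal 22 leads).
    Cross-correlation is an assumed estimator, not taken from the text.
    """
    n = min(len(frame_21), len(frame_22))
    best_lag, best_score = 0, float("-inf")
    for d in range(-max_lag, max_lag + 1):
        score = sum(frame_21[i] * frame_22[i + d]
                    for i in range(n) if 0 <= i + d < n)
        if score > best_score:
            best_lag, best_score = d, score
    return best_lag


def voice_source_on_reference_side(audio_phase_diff):
    """Negative audio phase difference: signal 22 leads, so the voice source
    is taken to be on the reference sound microphone 12 side."""
    return audio_phase_diff < 0
```

In a voice section, a negative result from `signed_lag` would correspond to the case in which the selection signal 99 is output.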

The other configurations and operations of the noise reduction processing determination unit 97 are the same as those of the noise reduction processing determination unit 17 described in Embodiment 1, so redundant description is omitted.

In the noise reduction device according to the present embodiment, as in the noise reduction device described in Embodiment 2, whether to perform noise reduction processing may also be determined using the audio power difference and the noise power difference acquired by a power information acquisition unit. For example, when the magnitude of the voice picked up by the voice microphone 11 is smaller than the magnitude of the voice picked up by the reference sound microphone 12, it can be determined that the sound source of the voice is on the reference sound microphone 12 side. In other words, when the magnitude of the sound collection signal 22 in the voice section acquired by the power information acquisition unit is larger than the magnitude of the sound collection signal 21 (that is, when the power difference between the sound collection signal 21 and the sound collection signal 22 (the audio power difference) is negative), the noise reduction processing determination unit 97 can determine that the sound source of the voice is on the reference sound microphone 12 side.

Also, for example, when the noise source is on the voice microphone 11 side, the magnitude of the noise picked up by the voice microphone 11 is larger than the magnitude of the noise picked up by the reference sound microphone 12. In this case, therefore, the power difference between the sound collection signal 21 and the sound collection signal 22 (the noise power difference) is positive.
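The power-based variant can be sketched in the same way. The example below computes a signed power difference between one frame of each sound collection signal and applies the sign rule described above. Expressing the difference in decibels, the mean-square power measure, the `eps` guard, and all function names are assumptions for illustration; the text only states that the difference in magnitude is used.

```python
import math


def mean_power(frame):
    """Mean-square power of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def power_difference_db(frame_21, frame_22, eps=1e-12):
    """Signed power difference, positive when signal 21 is the larger one.

    The dB scale and the eps term (which avoids log of zero) are assumptions.
    """
    return 10.0 * math.log10((mean_power(frame_21) + eps) /
                             (mean_power(frame_22) + eps))


def voice_source_on_reference_side(audio_power_diff):
    """Negative audio power difference: signal 22 is larger in the voice
    section, so the voice source is taken to be on the microphone 12 side."""
    return audio_power_diff < 0
```

Computed over a noise section instead of a voice section, the same `power_difference_db` value would play the role of the noise power difference.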

The noise reduction processing unit 98 performs noise reduction processing using the sound collection signal 21 and the sound collection signal 22 in accordance with the determination result of the noise reduction processing determination unit 97. For example, when the phase of the sound collection signal 21 in the voice section leads the phase of the sound collection signal 22 (that is, the sound source of the voice is on the voice microphone 11 side) and the noise reduction processing determination unit 97 determines that noise reduction processing is to be performed (the determination flag 26 is at low level), the noise reduction processing unit 98 reduces the noise component contained in the sound collection signal 21 using the sound collection signal 22 and outputs the signal after noise reduction processing as the output signal 27. When the phase of the sound collection signal 21 in the voice section leads the phase of the sound collection signal 22 (that is, the sound source of the voice is on the voice microphone 11 side) and the noise reduction processing determination unit 97 determines that noise reduction processing is not to be performed (the determination flag 26 is at high level), the noise reduction processing unit 98 outputs the sound collection signal 21 unchanged as the output signal 27.

On the other hand, when the phase of the sound collection signal 22 in the voice section leads the phase of the sound collection signal 21 (that is, the sound source of the voice is on the reference sound microphone 12 side) and the noise reduction processing determination unit 97 determines that noise reduction processing is to be performed (the determination flag 26 is at low level), the noise reduction processing unit 98 reduces the noise component contained in the sound collection signal 22 using the sound collection signal 21 and outputs the signal after noise reduction processing as the output signal 27. When the phase of the sound collection signal 22 in the voice section leads the phase of the sound collection signal 21 (that is, the sound source of the voice is on the reference sound microphone 12 side) and the noise reduction processing determination unit 97 determines that noise reduction processing is not to be performed (the determination flag 26 is at high level), the noise reduction processing unit 98 outputs the sound collection signal 22 unchanged as the output signal 27.

When power information is used, for example, if the magnitude of the sound collection signal 21 in the voice section is larger than the magnitude of the sound collection signal 22 (that is, the sound source of the voice is on the voice microphone 11 side) and the noise reduction processing determination unit 97 determines that noise reduction processing is to be performed (the determination flag 26 is at low level), the noise reduction processing unit 98 reduces the noise component contained in the sound collection signal 21 using the sound collection signal 22 and outputs the signal after noise reduction processing as the output signal 27. If the magnitude of the sound collection signal 21 in the voice section is larger than the magnitude of the sound collection signal 22 (that is, the sound source of the voice is on the voice microphone 11 side) and the noise reduction processing determination unit 97 determines that noise reduction processing is not to be performed (the determination flag 26 is at high level), the noise reduction processing unit 98 outputs the sound collection signal 21 unchanged as the output signal 27.

On the other hand, if the magnitude of the sound collection signal 22 in the voice section is larger than the magnitude of the sound collection signal 21 (that is, the sound source of the voice is on the reference sound microphone 12 side) and the noise reduction processing determination unit 97 determines that noise reduction processing is to be performed (the determination flag 26 is at low level), the noise reduction processing unit 98 reduces the noise component contained in the sound collection signal 22 using the sound collection signal 21 and outputs the signal after noise reduction processing as the output signal 27. If the magnitude of the sound collection signal 22 in the voice section is larger than the magnitude of the sound collection signal 21 (that is, the sound source of the voice is on the reference sound microphone 12 side) and the noise reduction processing determination unit 97 determines that noise reduction processing is not to be performed (the determination flag 26 is at high level), the noise reduction processing unit 98 outputs the sound collection signal 22 unchanged as the output signal 27.

FIG. 19 is a block diagram showing an example of the noise reduction processing unit 98. The noise reduction processing unit 98 includes delay elements 71_1 to 71_n, multipliers 72_1 to 72_n+1, adders 73_1 to 73_n, an adaptive coefficient adjustment unit 74, a subtractor 75, an output signal selection unit 76, and a selector 77.

In accordance with the selection signal 99 output from the noise reduction processing determination unit 97, the selector 77 switches between outputting the sound collection signal 21 and the sound collection signal 22 as an audio signal 81 (a signal mainly containing the voice component) and a reference signal 82 (a signal used to generate a pseudo noise component), respectively, and outputting the sound collection signal 21 and the sound collection signal 22 as the reference signal 82 and the audio signal 81, respectively. For example, when the sound source of the voice is on the voice microphone 11 side (that is, when the selection signal 99 is at low level), the selector 77 outputs the sound collection signal 21 and the sound collection signal 22 as the audio signal 81 and the reference signal 82, respectively. When the sound source of the voice is on the reference sound microphone 12 side (that is, when the selection signal 99 is at high level), the selector 77 outputs the sound collection signal 21 and the sound collection signal 22 as the reference signal 82 and the audio signal 81, respectively.
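The routing performed by the selector 77 can be pictured with the minimal sketch below. The function name `selector_77` and the `LOW`/`HIGH` constants are hypothetical names chosen to mirror the reference numerals; the signals are represented simply as Python lists.

```python
LOW, HIGH = 0, 1  # assumed encoding of the selection signal 99 levels


def selector_77(signal_21, signal_22, selection_signal_99):
    """Return (audio_signal_81, reference_signal_82).

    LOW:  voice source on the voice microphone 11 side, so signal 21 is the
          audio signal and signal 22 is the reference signal.
    HIGH: voice source on the reference sound microphone 12 side, so the
          two roles are swapped.
    """
    if selection_signal_99 == HIGH:
        return signal_22, signal_21
    return signal_21, signal_22
```

Because only the roles are swapped, the downstream filter and subtractor can stay identical in both cases.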

The delay elements 71_1 to 71_n, the multipliers 72_1 to 72_n+1, and the adders 73_1 to 73_n constitute an FIR filter. A pseudo noise signal 83 is generated by processing the reference signal 82 with the delay elements 71_1 to 71_n, the multipliers 72_1 to 72_n+1, and the adders 73_1 to 73_n.

The adaptive coefficient adjustment unit 74 adjusts the coefficients of the multipliers 72_1 to 72_n+1 in accordance with the voice noise section information 24. Specifically, when the voice noise section information 24 indicates a noise section, the adaptive coefficient adjustment unit 74 adjusts the coefficients so as to reduce the adaptation error. When the voice noise section information 24 indicates a voice section, the coefficients of the noise reduction processing unit 98 are either maintained or only finely adjusted.

The subtractor 75 subtracts the pseudo noise signal 83 from the audio signal 81 to generate a signal 84 after noise reduction processing, and outputs it to the output signal selection unit 76. The subtractor 75 also subtracts the pseudo noise signal 83 from the audio signal 81 to generate a feedback signal 85, and outputs it to the adaptive coefficient adjustment unit 74.

In accordance with the determination flag 26 output from the noise reduction processing determination unit 97, the output signal selection unit 76 selects whether to output the audio signal 81 unchanged as the output signal 27 or to output the signal 84 after noise reduction processing as the output signal 27. That is, when the determination flag 26 output from the noise reduction processing determination unit 97 is valid (high level), the output signal selection unit 76 outputs the audio signal 81 unchanged as the output signal 27. When the determination flag 26 output from the noise reduction processing determination unit 97 is invalid (low level), the output signal selection unit 76 outputs the signal 84 after noise reduction processing as the output signal 27.
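The signal path of FIG. 19 after the selector (FIR filter, subtractor 75, adaptive coefficient adjustment unit 74, and output signal selection unit 76) might be sketched as follows. The text does not name the adaptation rule, so a normalized LMS update is used here purely as an assumption, as are the function name `noise_reduction_98` and the parameters `mu` and `eps`. The sketch freezes the coefficients in voice sections, which is one of the two options the text allows.

```python
def noise_reduction_98(audio_81, reference_82, noise_section_24, flag_26_high,
                       n_delays=4, mu=0.5, eps=1e-9):
    """Sample-by-sample sketch of the FIG. 19 signal path.

    audio_81, reference_82: lists of samples (outputs of the selector 77).
    noise_section_24: list of booleans, True where a noise section is indicated.
    flag_26_high: True means noise reduction is not performed.
    """
    taps = [0.0] * (n_delays + 1)   # coefficients of multipliers 72_1..72_n+1
    line = [0.0] * (n_delays + 1)   # current sample plus n delayed samples
    output_27 = []
    for x, r, in_noise in zip(audio_81, reference_82, noise_section_24):
        line = [r] + line[:-1]                       # delay elements 71_1..71_n
        pseudo_noise_83 = sum(w * s for w, s in zip(taps, line))
        error = x - pseudo_noise_83                  # subtractor 75 (signals 84, 85)
        if in_noise:
            # Assumed NLMS step: adapt only while a noise section is indicated.
            norm = sum(s * s for s in line) + eps
            taps = [w + mu * error * s / norm for w, s in zip(taps, line)]
        # Output signal selection unit 76.
        output_27.append(x if flag_26_high else error)
    return output_27
```

When the audio signal contains only a scaled copy of the reference during a noise section, the residual in this sketch shrinks toward zero, which is the behavior expected of the pseudo noise subtraction.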

Next, the operation of the noise reduction device 3 according to the present embodiment will be described. FIG. 20 is a flowchart for explaining the operation of the noise reduction device 3 according to the present embodiment. Steps S21 to S26 shown in FIG. 20 are the same as steps S1 to S6 shown in FIG. 9 (see Embodiment 1), respectively, so redundant description is omitted.

In step S27, the noise reduction processing determination unit 97 determines whether the sound source of the voice is on the reference sound microphone 12 side. When the sound source of the voice is on the voice microphone 11 side (step S27: No), the noise reduction processing unit 98 uses the sound collection signal 21 of the voice microphone 11 as the audio signal 81 and the sound collection signal 22 of the reference sound microphone 12 as the reference signal 82 (step S28). For example, when the phase difference between the sound collection signal 21 and the sound collection signal 22 (the audio phase difference) is positive, the noise reduction processing determination unit 97 can determine that the sound source of the voice is on the voice microphone 11 side.

On the other hand, when the sound source of the voice is on the reference sound microphone 12 side (step S27: Yes), the noise reduction processing unit 98 uses the sound collection signal 22 of the reference sound microphone 12 as the audio signal 81 and the sound collection signal 21 of the voice microphone 11 as the reference signal 82 (step S29). For example, when the phase difference between the sound collection signal 21 and the sound collection signal 22 (the audio phase difference) is negative, the noise reduction processing determination unit 97 can determine that the sound source of the voice is on the reference sound microphone 12 side.

Next, the noise reduction processing determination unit 97 determines whether to perform noise reduction processing based on the audio phase difference and the noise phase difference acquired by the phase information acquisition unit 16. That is, when the absolute value of the difference between the audio phase difference and the noise phase difference is larger than a predetermined first threshold (step S30: No), the noise reduction processing determination unit 97 determines that noise reduction processing is to be performed. When power information is used, the noise reduction processing determination unit 97 can determine that noise reduction processing is to be performed when the absolute value of the difference between the audio power difference and the noise power difference acquired by the power information acquisition unit is larger than a predetermined second threshold.

In this case, the determination flag 26 output from the noise reduction processing determination unit 97 is invalid (low level), so the noise reduction processing unit 98 reduces the noise component contained in the audio signal 81 (see FIG. 19) using the reference signal 82 and outputs the signal after noise reduction processing as the output signal 27 (step S31).

On the other hand, when the absolute value of the difference between the audio phase difference and the noise phase difference is within the predetermined first threshold (step S30: Yes), the noise reduction processing determination unit 97 determines that noise reduction processing is not to be performed. When power information is used, the noise reduction processing determination unit 97 can determine that noise reduction processing is not to be performed when the absolute value of the difference between the audio power difference and the noise power difference acquired by the power information acquisition unit is within the predetermined second threshold.

In this case, the determination flag 26 output from the noise reduction processing determination unit 97 is valid (high level), so the noise reduction processing unit 98 outputs the audio signal 81 unchanged (step S32).
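The decisions made in steps S27 and S30 can be summarized in a short sketch that maps the two phase differences to the levels of the selection signal 99 and the determination flag 26. The function name `decide`, the `LOW`/`HIGH` encoding, and the handling of an audio phase difference of exactly zero (treated here as the voice microphone 11 side) are assumptions; the text only describes the positive and negative cases.

```python
LOW, HIGH = 0, 1  # assumed encoding of the signal levels


def decide(audio_phase_diff, noise_phase_diff, first_threshold):
    """Return (selection_signal_99, determination_flag_26).

    Step S27: a negative audio phase difference means the voice source is on
    the reference sound microphone 12 side, so the selection signal is HIGH.
    Step S30: if the two phase differences are within the first threshold of
    each other, noise reduction is not performed and the flag is HIGH.
    """
    selection_signal_99 = HIGH if audio_phase_diff < 0 else LOW
    within = abs(audio_phase_diff - noise_phase_diff) <= first_threshold
    determination_flag_26 = HIGH if within else LOW
    return selection_signal_99, determination_flag_26
```

The power-based variant would have the same shape, with the audio and noise power differences and the second threshold in place of the phase quantities.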

Thus, in the noise reduction device 3 according to the present embodiment, when the sound source of the voice is on the reference sound microphone 12 side, the sound collection signal 21 and the sound collection signal 22 used for noise reduction processing in the noise reduction processing unit 98 are switched. Switching the sound collection signal 21 and the sound collection signal 22 in this way makes it possible to reduce the noise component contained in the sound collection signal 22 using the sound collection signal 21, so that noise reduction processing can be performed more reliably.

The present invention has been described above with reference to the embodiments, but it is not limited to the configurations of those embodiments and naturally includes the various variations, modifications, and combinations that a person skilled in the art could make within the scope of the invention as claimed. For example, the voice microphone 11 and the reference sound microphone 12 may be provided at substantially the same position in the upper (or lower) part of the device and arranged so that their directivities differ. For example, the microphones are preferably arranged so that the directivities of the voice microphone 11 and the reference sound microphone 12 differ by 180°.

1, 2, 3 Noise reduction device
11 Voice microphone
12 Reference sound microphone
13, 14 AD converter
15, 95 Voice noise section detection unit
16 Phase information acquisition unit
17, 97 Noise reduction processing determination unit
18, 98 Noise reduction processing unit
21, 22 Sound collection signal
23, 24 Voice noise section information
25 Voice noise direction information
26 Determination flag
27 Output signal
28 Power information
29 Phase power information
60 Power information acquisition unit
70 Phase power information acquisition unit

Claims

A voice noise section detection unit that detects a voice section and a noise section based on at least one of a first sound collection signal corresponding to sound picked up by a first microphone and a second sound collection signal corresponding to sound picked up by a second microphone;
A power information acquisition unit that acquires an audio power difference, which is the difference between the magnitude of the first sound collection signal and the magnitude of the second sound collection signal in the voice section, and a noise power difference, which is the difference between the magnitude of the first sound collection signal and the magnitude of the second sound collection signal in the noise section;
A noise reduction processing determination unit that determines the state of the audio power difference and the noise power difference according to whether a value based on the difference between the audio power difference and the noise power difference is within a predetermined threshold;
A noise reduction processing unit that performs noise reduction processing using the first sound collection signal and the second sound collection signal according to a determination result of the noise reduction process determination unit;
A noise reduction device comprising:

The noise reduction device according to claim 1, wherein the noise reduction processing determination unit determines whether or not the state of the audio power difference and the noise power difference is a first state in which noise reduction processing is performed, and
when the noise reduction processing determination unit determines the first state, the noise reduction processing unit performs noise reduction processing using the first sound collection signal and the second sound collection signal.

The noise reduction device according to claim 1, wherein the noise reduction processing determination unit determines whether or not the state of the audio power difference and the noise power difference is a second state in which noise reduction processing is performed more weakly than in a first state in which noise reduction processing is performed, and
when the noise reduction processing determination unit determines the second state, the noise reduction processing unit performs noise reduction processing more weakly than in the first state.

The noise reduction device according to claim 1, wherein the noise reduction processing determination unit determines whether or not the state of the audio power difference and the noise power difference is a second state in which noise reduction processing is not performed, and
when the noise reduction processing determination unit determines the second state, the noise reduction processing unit does not perform noise reduction processing.

The noise reduction device according to claim 3 or 4, wherein the noise reduction processing determination unit determines the second state when the absolute value of the difference between the audio power difference and the noise power difference is within a predetermined threshold.

The noise reduction device according to any one of claims 1 to 5 , wherein the power information acquisition unit updates the audio power difference in the audio interval and updates the noise power difference in the noise interval.

The noise reduction device according to any one of claims 3 to 5, wherein the noise reduction processing determination unit determines whether or not the state of the audio power difference and the noise power difference is a first state in which noise reduction processing is performed,
when the noise reduction processing determination unit determines the first state, the noise reduction processing unit performs noise reduction processing using the first sound collection signal and the second sound collection signal, and
the noise reduction processing unit,
when the noise reduction processing determination unit determines the first state, reduces the noise component contained in the first sound collection signal using the second sound collection signal and outputs the signal after noise reduction processing as an audio signal, and
when the noise reduction processing determination unit determines the second state, outputs the first sound collection signal as an audio signal.

The noise reduction processing determination unit determines whether or not the state of the audio power difference and the noise power difference is a first state in which noise reduction processing is performed,
when the noise reduction processing determination unit determines that the state is the first state, the noise reduction processing unit performs the noise reduction processing using the first sound collection signal and the second sound collection signal, and
the noise reduction processing unit:
when the magnitude of the second sound collection signal in the voice section is much larger than the magnitude of the first sound collection signal and the noise reduction processing determination unit determines that the state is the first state, reduces the noise component included in the second sound collection signal using the first sound collection signal and outputs the signal after the noise reduction processing as an audio signal; and
when the magnitude of the second sound collection signal in the voice section is much larger than the magnitude of the first sound collection signal and the noise reduction processing determination unit determines that the state is the second state, outputs the second sound collection signal as an audio signal.
The noise reduction device according to any one of claims 3 to 5.
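
The output selection described in the two preceding claims can be illustrated with a short sketch. It rests on several assumptions: `audio_diff` is the first signal's level minus the second signal's level in the voice section, in dB; the hypothetical `swap_margin` parameter stands in for "much larger", which the claims do not quantify; and `subtract_reference` is a plain sample-wise subtraction used only as a placeholder for the actual noise reduction processing.

```python
def subtract_reference(main, reference, gain=1.0):
    """Placeholder reducer (assumption): plain sample-wise subtraction."""
    return [m - gain * r for m, r in zip(main, reference)]


def select_output(frame1, frame2, audio_diff, first_state,
                  reduce_noise=subtract_reference, swap_margin=0.0):
    """Choose the main and reference signals, then reduce noise or bypass.

    audio_diff: level of signal 1 minus level of signal 2 in the voice
    section; a value below -swap_margin means microphone 2 carries the voice.
    first_state: True when noise reduction is to be performed.
    """
    if audio_diff < -swap_margin:
        main, reference = frame2, frame1  # roles swapped
    else:
        main, reference = frame1, frame2  # default roles
    if first_state:
        return reduce_noise(main, reference)
    return list(main)  # second state: pass the main signal through unchanged
```

In the second state the function returns the main signal untouched, so a wrong decision costs nothing worse than an unprocessed signal.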

A voice input device comprising the noise reduction device according to any one of claims 1 to 8.

The voice input device according to claim 9, wherein the first microphone is provided on a first surface of the voice input device, and
the second microphone is provided on a second surface that faces the first surface across a predetermined distance.

A wireless communication device comprising the noise reduction device according to any one of claims 1 to 8.

The wireless communication device according to claim 11, wherein the first microphone is provided on a first surface of the wireless communication device, and
the second microphone is provided on a second surface that faces the first surface across a predetermined distance.

A noise reduction method comprising:
detecting a voice section and a noise section based on at least one of a first sound collection signal corresponding to sound collected by a first microphone and a second sound collection signal corresponding to sound collected by a second microphone;
acquiring an audio power difference, which is the difference between the magnitude of the first sound collection signal and the magnitude of the second sound collection signal in the voice section, and a noise power difference, which is the difference between the magnitude of the first sound collection signal and the magnitude of the second sound collection signal in the noise section;
determining the state of the audio power difference and the noise power difference according to whether or not a value based on the difference between the audio power difference and the noise power difference is within a predetermined threshold; and
performing noise reduction processing using the first sound collection signal and the second sound collection signal according to the result of determining the state of the audio power difference and the noise power difference.
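
The four steps of the method can be sketched end to end as below. This is an illustrative sketch under stated assumptions, not the claimed implementation: section detection is a simple energy threshold on the first signal (`vad_db`), levels are in dB, the "value based on the difference" is taken to be the absolute difference, a value within `threshold_db` is treated as the state in which no processing is performed, and sample-wise subtraction stands in for the actual noise reduction processing. All function and parameter names are hypothetical.

```python
import math


def level_db(frame):
    """Mean power of one frame in dB (a small floor avoids log of zero)."""
    power = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(power + 1e-12)


def noise_reduction_method(frames1, frames2, vad_db=-30.0, threshold_db=3.0):
    """Run the four steps over two lists of equal-length frames.

    Returns (output_frames, performed), where performed is True, False, or
    None when one of the two section types was never observed.
    """
    # Step 1: detect voice and noise sections (assumed energy-based detector).
    is_voice = [level_db(f) > vad_db for f in frames1]

    # Step 2: acquire the audio and noise power differences as the mean
    # level difference between the two signals within each section type.
    diffs = [level_db(a) - level_db(b) for a, b in zip(frames1, frames2)]
    voice = [d for d, v in zip(diffs, is_voice) if v]
    noise = [d for d, v in zip(diffs, is_voice) if not v]
    if not voice or not noise:
        return [list(f) for f in frames1], None
    audio_diff = sum(voice) / len(voice)
    noise_diff = sum(noise) / len(noise)

    # Step 3: determine the state by comparing the two differences.
    performed = abs(audio_diff - noise_diff) > threshold_db

    # Step 4: perform or skip noise reduction (subtraction is a stand-in).
    if performed:
        out = [[x - y for x, y in zip(a, b)] for a, b in zip(frames1, frames2)]
    else:
        out = [list(f) for f in frames1]
    return out, performed
```

When the two microphones see the same level relationship for voice and for noise, the second signal carries no usable noise reference, which is why the sketch bypasses processing in that case.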

A noise reduction program causing a computer to execute:
detecting a voice section and a noise section based on at least one of a first sound collection signal corresponding to sound collected by a first microphone and a second sound collection signal corresponding to sound collected by a second microphone;
acquiring an audio power difference, which is the difference between the magnitude of the first sound collection signal and the magnitude of the second sound collection signal in the voice section, and a noise power difference, which is the difference between the magnitude of the first sound collection signal and the magnitude of the second sound collection signal in the noise section;
determining the state of the audio power difference and the noise power difference according to whether or not a value based on the difference between the audio power difference and the noise power difference is within a predetermined threshold; and
performing noise reduction processing using the first sound collection signal and the second sound collection signal according to the result of determining the state of the audio power difference and the noise power difference.