JP5725088B2

JP5725088B2 - Sound collection device and sound emission and collection system

Info

Publication number: JP5725088B2
Application number: JP2013121705A
Authority: JP
Inventors: 田中 良; 栗山 直人
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2013-06-10
Filing date: 2013-06-10
Publication date: 2015-05-27
Anticipated expiration: 2028-11-05
Also published as: JP2013225886A

Description

The present invention relates to a sound collection device and a sound emission and collection system that collect sound from a plurality of directions.

Various sound collection devices for collecting a speaker's voice have been proposed (see, for example, Patent Document 1).

For example, the sound collection device of Patent Document 1 includes a microphone array made up of a plurality of microphones. It applies delay processing and the like to the audio signal collected by each microphone and generates a plurality of sound collection beam signals, each having a sound collection directivity whose axis points in a different direction. The device then selects the sound collection beam signal with the highest signal level, removes the echo from the selected signal, and transmits it to the communication partner. In this way the main speaker's voice is sent to the communication partner.

Patent Document 1: Japanese Patent Laid-Open No. 2002-238091

However, a sound collection beam signal contains both the speaker's voice and a return sound originating from the sound emitted by the device itself. Consequently, when the return sound is louder than the speaker's voice, the sound collection device cannot accurately select the main speaker's voice.

In such a case, the sound collection device could accurately select the main speaker's voice if it chose the sound collection beam signal to transmit based on the signal level of each beam signal after echo removal. However, removing the echo from every sound collection beam signal (for example, six directions) imposes a heavy processing load, and removing echoes for all directions was not practical.

Accordingly, an object of the present invention is to provide a sound collection device and a sound emission and collection system that can accurately select the main speaker's voice without imposing a heavy processing load.

The sound collection device of the present invention generates a sound collection signal for each of a plurality of different directions and estimates the direction of a sound source (for example, the main speaker). The device includes first echo cancellation means and a plurality of second echo cancellation means. The first echo cancellation means removes echoes from the sound collection signal arriving from the estimated direction. Each second echo cancellation means has a simpler configuration than the first echo cancellation means and removes echoes from the sound collection signal for its direction. The device estimates the direction of the sound source based on the signal levels of the sound collection signals after processing by the plurality of second echo cancellation means. Here, echo cancellation means with a simple configuration refers to, for example, means that removes echoes from a downsampled sound collection signal, or means that removes echoes using an adaptive filter with a small number of taps.

Because the device estimates the sound source direction from sound collection signals whose echoes have been removed by this simplified processing, it can estimate the direction accurately without a heavy processing load.

In addition, the first echo cancellation means and each of the second echo cancellation means include an adaptive filter and filter coefficient estimation means that estimates the filter coefficients of that adaptive filter. The filter coefficient estimation means of the first echo cancellation means updates its filter coefficients using, as initial values, the filter coefficients of the second echo cancellation means that removed the echo from the sound collection signal of the estimated sound source direction.

As a result, the device can remove the echo from the sound collection signal of the sound source direction starting from the filter coefficients that were used for direction estimation, which shortens the time needed to estimate the filter coefficients. The device can therefore remove echoes right from the initial state.

Furthermore, the first echo cancellation means stores the filter coefficients of the adaptive filter for each direction. The filter coefficient estimation means of the first echo cancellation means performs the initial process described above, in which the filter coefficients of the second echo cancellation means serve as initial values, only when no filter coefficients are stored.

Thus, the first echo cancellation means uses the filter coefficients from direction estimation as initial values only when its storage unit holds no filter coefficients; otherwise it uses the previously used filter coefficients as initial values. Echoes can therefore be removed immediately even when the environment (the speaker) changes.

The present invention is not limited to a sound collection device and may be a sound emission and collection system made up of a sound emission and collection device that includes communication means and sound emitting means. The communication means transmits the sound collection signal from which the first echo cancellation means has removed the echo to another sound emission and collection device. The second echo cancellation means may generate a pseudo return signal based on a downsampled sound emission signal.

The sound collection device and the sound emission and collection system of the present invention can accurately estimate the direction of a sound source (for example, the main speaker) without a heavy processing load.

FIG. 1 is a block diagram showing the function and configuration of the sound emission and collection device.
FIG. 2 is an explanatory diagram illustrating the sound collection direction of each sound collection beam signal.
FIG. 3 is a block diagram showing the function and configuration of the echo cancellation unit.
FIG. 4 is a diagram showing an example of the filter coefficients of the adaptive filter for each sound collection direction.
FIG. 5 is a block diagram showing the function and configuration of a sound emission and collection device according to another embodiment.

A sound emission and collection device 1 according to an embodiment of the present invention will be described with reference to FIGS. 1 to 4. The sound emission and collection device 1 is connected to other sound emission and collection devices via a network or the like. It receives an audio signal from another sound emission and collection device as a sound emission signal and emits the sound from the speaker SP. It also collects sound with the microphones MIC1 to MIC3 and generates sound collection beam signals for a plurality of directions. It then transmits the sound collection beam signal from the direction of the main speaker to the other sound emission and collection device.

First, the function and configuration of the sound emission and collection device 1 will be described with reference to FIGS. 1 and 2. FIG. 1 is a block diagram showing the function and configuration of the sound emission and collection device. FIG. 2 is an explanatory diagram illustrating the sound collection direction of each sound collection beam signal. The sound emission and collection device 1 includes a speaker SP, microphones MIC1 to MIC3, a communication control unit 11, a sound collection control unit 12, a downsampling unit (hereinafter, DS unit) 13, downsampling units (hereinafter, DS units) 14A to 14C, echo cancellation units 15A to 15C (corresponding to the second echo cancellation means of the present invention), a direction estimation unit 16, a control unit 17, a sound collection signal selection unit 18, and an echo cancellation unit 19 (corresponding to the first echo cancellation means of the present invention).

The communication control unit 11 is connected to other sound emission and collection devices via a network and controls communication with them. Specifically, it receives the sound emission signal FE from another sound emission and collection device and outputs it, via the echo cancellation unit 19, to the DS unit 13 (described later) and the speaker SP. The speaker SP emits sound based on the sound emission signal FE. The communication control unit 11 also transmits the sound collection beam signal NE1' input from the echo cancellation unit 19 (described later) to the other sound emission and collection device.

The microphones MIC1 to MIC3 each collect ambient sound, generate a sound collection signal, and output it to the sound collection control unit 12. The number of microphones is not limited to three.

The sound collection control unit 12 applies delay processing and the like to the sound collection signals from the microphones MIC1 to MIC3 and, as shown in FIG. 2, generates a plurality of sound collection beam signals NE1 to NE3 whose sound collection directivities are centered on different directions. Hereinafter, the sound collection directions of the sound collection beam signals NE1 to NE3 are denoted D1 to D3, respectively.
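As a rough illustration of the delay processing mentioned above, the following Python sketch forms one sound collection beam by delaying each microphone signal by a whole number of samples and averaging the results. The patent does not specify the beamforming method beyond "delay processing and the like", so the function name `delay_and_sum`, the integer delays, and the test pulse are all assumptions made for illustration.

```python
def delay_and_sum(mic_signals, delays):
    """Form one beam from several microphone signals (illustrative only).

    mic_signals: list of equal-length sample lists, one per microphone.
    delays: integer sample delay per microphone, chosen (hypothetically)
            so that sound from the target direction lines up in time.
    """
    n = len(mic_signals[0])
    beam = [0.0] * n
    for signal, delay in zip(mic_signals, delays):
        for i in range(n):
            j = i - delay  # index of the sample that lands at i after the delay
            if 0 <= j < n:
                beam[i] += signal[j]
    # Average so the beam keeps the scale of a single microphone.
    return [v / len(mic_signals) for v in beam]


# A pulse reaches MIC1, MIC2 and MIC3 at samples 2, 1 and 0 respectively.
mics = [
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0, 0.0],
]
aligned = delay_and_sum(mics, delays=[0, 1, 2])     # steered at the source
misaligned = delay_and_sum(mics, delays=[0, 0, 0])  # steered elsewhere
```

With delays that match the arrival times, the pulse adds coherently and keeps its full amplitude; with mismatched delays it is smeared and attenuated, which is what gives each beam its directivity.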

The sound collection control unit 12 then outputs the sound collection beam signals NE1 to NE3 to the DS units 14A to 14C, respectively, and also to the sound collection signal selection unit 18. The number of sound collection beam signals it generates is not limited to three. The sound collection control unit 12 is also not essential. When it is omitted, the microphones MIC1 to MIC3 each collect sound from a different direction, generate a sound collection signal, and output it to the DS units 14A to 14C and the sound collection signal selection unit 18.

The DS units 14A to 14C each include a low-pass filter. They downsample the input sound collection beam signals NE1 to NE3 and output the downsampled sound collection beam signals DNE1 to DNE3 to the echo cancellation units 15A to 15C, respectively. For example, the DS units 14A to 14C downsample the sound collection beam signals NE1 to NE3, sampled at 20 kHz, into signals with a sampling frequency of 10 kHz.

The DS unit 13 includes a low-pass filter. It downsamples the input sound emission signal FE and outputs the downsampled sound emission signal DFE to the echo cancellation units 15A to 15C. For example, the DS unit 13 downsamples the sound emission signal FE, sampled at 20 kHz, into a signal with a sampling frequency of 10 kHz.
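The downsampling performed by the DS units can be sketched as a low-pass filter followed by dropping every second sample. In the Python sketch below, a 3-tap smoothing filter stands in for the low-pass filter; the patent does not describe the filter design, so the coefficients, the mirrored edge handling, and the name `downsample_by_2` are assumptions.

```python
def downsample_by_2(samples):
    """Halve the sampling rate, e.g. 20 kHz -> 10 kHz (illustrative only).

    A 3-tap [1/4, 1/2, 1/4] smoothing filter stands in for the low-pass
    filter; a practical design would use a sharper anti-aliasing filter.
    Edges are handled by mirroring, so the input needs at least 2 samples.
    """
    n = len(samples)
    filtered = []
    for i in range(n):
        prev = samples[i - 1] if i > 0 else samples[1]
        nxt = samples[i + 1] if i < n - 1 else samples[n - 2]
        filtered.append(0.25 * prev + 0.5 * samples[i] + 0.25 * nxt)
    return filtered[::2]  # keep every second filtered sample


# An alternating +1/-1 input sits at the Nyquist frequency of the input rate
# and should be removed; a constant input should pass through unchanged.
nyquist_in = [1.0, -1.0] * 8
constant_in = [1.0] * 16
nyquist_out = downsample_by_2(nyquist_in)
constant_out = downsample_by_2(constant_in)
```

The same routine would apply to both the sound collection beam signals (DS units 14A to 14C) and the sound emission signal (DS unit 13), so that the lightweight echo cancellers see both inputs at the same reduced rate.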

Based on the downsampled sound emission signal DFE, the echo cancellation units 15A to 15C generate a pseudo return sound signal, which simulates the feedback component travelling from the speaker SP to the microphones MIC1 to MIC3. The echo cancellation units 15A to 15C remove the echo by subtracting the pseudo return sound signal from the downsampled sound collection beam signals DNE1 to DNE3, respectively. They then output the echo-cancelled sound collection beam signals DNE1' to DNE3' to the direction estimation unit 16.

Because the echo cancellation units 15A to 15C only need to remove echoes from the downsampled sound collection beam signals DNE1 to DNE3, they have a simpler configuration than the echo cancellation unit 19 and can remove echoes with a small processing load. The echo cancellation units 15A to 15C may also use fewer taps than the echo cancellation unit 19. Their detailed functions and configuration are described later.
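To make the load reduction concrete, the following Python sketch counts the multiply-accumulate operations per second needed just to run an FIR filter that covers a fixed echo tail. The 100 ms tail length is a hypothetical figure, not one from the patent, and the count ignores the cost of updating the coefficients.

```python
def fir_macs_per_second(sample_rate_hz, tail_seconds):
    """Multiply-accumulates per second for an FIR filter spanning the tail."""
    taps = round(sample_rate_hz * tail_seconds)  # taps needed to cover the tail
    return taps * sample_rate_hz                 # taps evaluated once per sample


TAIL_SECONDS = 0.1  # assumed echo tail length (hypothetical)

full_rate = fir_macs_per_second(20_000, TAIL_SECONDS)  # one canceller like unit 19
half_rate = fir_macs_per_second(10_000, TAIL_SECONDS)  # one canceller like units 15A-15C

all_full = 3 * full_rate              # full-rate cancellation on all three beams
proposed = 3 * half_rate + full_rate  # three light cancellers plus one full one
```

Under these assumptions, halving the sampling rate halves both the number of taps and the number of samples per second, so each lightweight canceller costs about a quarter of a full-rate one, and the proposed arrangement needs 70 million rather than 120 million multiply-accumulates per second for three beams.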

The direction estimation unit 16 selects the sound collection beam signal with the highest signal level from among the echo-cancelled sound collection beam signals DNE1' to DNE3'. The following description assumes that the direction estimation unit 16 has selected the sound collection beam signal DNE1' from the sound collection direction D1. The direction estimation unit 16 then obtains the sound collection direction D1 of the selected sound collection beam signal DNE1' and outputs it to the control unit 17.
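The selection step can be sketched as picking the direction whose beam has the largest level. The patent does not define how the signal level is measured, so the Python sketch below assumes mean power over a short frame; the sample values are made up to show why the comparison is done after echo cancellation.

```python
def signal_level(samples):
    """Mean power of a frame, used here as the signal level (an assumption)."""
    return sum(s * s for s in samples) / len(samples)


def estimate_direction(beams):
    """Return the direction label whose beam has the highest signal level."""
    return max(beams, key=lambda d: signal_level(beams[d]))


# Made-up frames: the talker is in D1, while the loudspeaker echo leaks
# most strongly into the D3 beam.
echo = [0.8, -0.8, 0.8, -0.8]
speech = [0.3, 0.3, -0.3, -0.3]
raw_beams = {
    "D1": [s + 0.2 * e for s, e in zip(speech, echo)],
    "D2": [0.1 * s + 0.5 * e for s, e in zip(speech, echo)],
    "D3": list(echo),
}
echo_cancelled_beams = {
    "D1": list(speech),
    "D2": [0.1 * s for s in speech],
    "D3": [0.0] * 4,
}
raw_pick = estimate_direction(raw_beams)                    # misled by the echo
cancelled_pick = estimate_direction(echo_cancelled_beams)   # follows the talker
```

In this example the raw beams point to D3 because the echo dominates, whereas the echo-cancelled beams point to D1, the direction of the talker.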

The control unit 17 controls the sound collection signal selection unit 18 and the echo cancellation unit 19 based on the sound collection direction D1 input from the direction estimation unit 16.

From the sound collection beam signals NE1 to NE3 input from the sound collection control unit 12, the sound collection signal selection unit 18 selects the sound collection beam signal NE1 corresponding to the sound collection direction D1 input from the control unit 17, and outputs it to the echo cancellation unit 19.

Based on the sound emission signal FE, the echo cancellation unit 19 generates a pseudo return sound signal, which simulates the feedback component travelling from the speaker SP to the microphones MIC1 to MIC3, and removes the echo by subtracting it from the sound collection beam signal NE1 input from the sound collection signal selection unit 18. The echo cancellation unit 19 then outputs the echo-cancelled sound collection beam signal NE1' to the communication control unit 11. The detailed function and configuration of the echo cancellation unit 19 are described later.

As described above, the echo cancellation units 15A to 15C remove echoes from the downsampled sound collection beam signals DNE1 to DNE3, which reduces the processing load. In addition, the sound emission and collection device 1 uses the echo-cancelled sound collection beam signals DNE1' to DNE3' to obtain the direction of the main speaker and selects the sound collection beam signal from that direction, so it can accurately select the main speaker's voice. The sound emission and collection device 1 can therefore accurately select the main speaker's voice without a heavy processing load.

Next, the functions and configuration of the echo cancellation units 15A to 15C and the echo cancellation unit 19 are described in detail with reference to FIGS. 3 and 4. FIG. 3 is a block diagram showing the function and configuration of the echo cancellation unit. FIG. 4 is a diagram showing an example of the filter coefficients of the adaptive filter for each sound collection direction. The echo cancellation units 15A to 15C and the echo cancellation unit 19 have the same function and configuration, so the echo cancellation unit 19 is described below as an example.

As shown in FIG. 3, the echo cancellation unit 19 includes a storage unit 21, a filter coefficient estimation unit 22, an adaptive filter 23, and an addition unit 24.

The storage unit 21 temporarily stores a coefficient list 211 such as the one shown in FIG. 4. The coefficient list 211 holds the filter coefficients for each sound collection direction and is referenced by the filter coefficient estimation unit 22. The filter coefficients stored in the coefficient list 211 are reset when the sound emission and collection device 1 is powered off and on.

The filter coefficient estimation unit 22 estimates the transfer function of the acoustic transmission system (the acoustic propagation path from the speaker SP to the microphones MIC1 to MIC3) and sets the filter coefficients of the FIR filter according to the estimated transfer function. In doing so, it obtains the filter coefficients corresponding to the sound collection direction D1 input from the control unit 17 from the coefficient list 211 and calculates the filter coefficients using them as initial values. It also updates the filter coefficients with an adaptive algorithm, based on the sound collection beam signal NE1' output from the addition unit 24 and the sound emission signal FE. It then outputs the calculated filter coefficients to the adaptive filter 23.

The adaptive filter 23 includes a digital filter such as an FIR filter and generates the pseudo return sound signal using the filter coefficients input from the filter coefficient estimation unit 22. The adaptive filter 23 outputs the generated pseudo return sound signal to the addition unit 24.

The addition unit 24 subtracts the pseudo return sound signal input from the adaptive filter 23 from the sound collection beam signal NE1 and outputs the resulting sound collection beam signal NE1'.
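The interplay of the filter coefficient estimation unit 22, the adaptive filter 23, and the addition unit 24 can be sketched as a sample-by-sample adaptive FIR filter. The patent only says "an adaptive algorithm", so the use of normalized LMS (NLMS) below is an assumption, as are the class name `EchoCanceller`, the step size, and the simulated two-tap echo path.

```python
import random


class EchoCanceller:
    """FIR adaptive filter updated with NLMS (algorithm choice is assumed)."""

    def __init__(self, taps, step=0.5, eps=1e-8):
        self.w = [0.0] * taps  # filter coefficients
        self.x = [0.0] * taps  # recent sound emission samples, newest first
        self.step = step       # NLMS step size (hypothetical value)
        self.eps = eps         # guards against division by zero

    def process(self, fe, ne):
        """Take one emission sample fe and one beam sample ne; return ne'."""
        self.x = [fe] + self.x[:-1]
        # Adaptive filter: pseudo return sound from the emission history.
        pseudo = sum(w * x for w, x in zip(self.w, self.x))
        # Addition unit: subtract the pseudo return sound from the beam.
        err = ne - pseudo
        # Coefficient estimation: NLMS update driven by the residual.
        norm = sum(x * x for x in self.x) + self.eps
        gain = self.step * err / norm
        self.w = [w + gain * x for w, x in zip(self.w, self.x)]
        return err


# Simulated echo path: the beam contains only the echo of the emitted signal.
rng = random.Random(0)
echo_path = [0.6, 0.3]
canceller = EchoCanceller(taps=4)
previous_fe = 0.0
residual = []
for _ in range(500):
    fe = rng.uniform(-1.0, 1.0)
    ne = echo_path[0] * fe + echo_path[1] * previous_fe
    residual.append(canceller.process(fe, ne))
    previous_fe = fe
```

In this echo-only simulation the coefficients converge toward the simulated echo path and the residual shrinks toward zero. A practical canceller would also need to handle near-end speech, for example by pausing adaptation during double talk, which this sketch does not attempt.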

The storage unit 21 is not essential. However, because the echo cancellation unit 19 must change the initial values of the filter coefficients whenever the sound collection direction switches, providing the storage unit 21 is preferable.

The filter coefficient estimation units 22 of the echo cancellation units 15A to 15C update their filter coefficients with an adaptive algorithm, based on the sound collection beam signals DNE1' to DNE3' output from their respective addition units 24 and on the downsampled sound emission signal DFE.

As described above, the echo cancellation unit 19 stores the filter coefficients for each sound collection direction in the storage unit 21. Although the environment (the speaker) changes when the sound collection direction is switched, the unit retrieves previously adapted filter coefficients from the storage unit 21 and uses them to estimate the adaptive filter. This shortens the adaptive filter estimation time, so echoes can be removed immediately even when the environment (the speaker) changes.

The echo cancellation unit 19 may also perform echo cancellation using filter coefficients obtained from the echo cancellation units 15A to 15C. In this case, the filter coefficients of the echo cancellation units 15A to 15C are upsampled before use. FIG. 5 is a block diagram showing the function and configuration of a sound emission and collection device according to another embodiment. As shown in FIG. 5, the control unit 17 refers to the coefficient list 211 in the storage unit 21 of the echo cancellation unit 19 when the sound emission and collection device 1 is powered on or when the sound collection direction is switched. Only when the coefficient list 211 contains no filter coefficients for the sound collection direction input from the direction estimation unit 16 does the control unit 17 obtain the filter coefficients from whichever of the echo cancellation units 15A to 15C removed the echo from the sound collection beam signal of that direction, and output them to the filter coefficient estimation unit 22 of the echo cancellation unit 19. The filter coefficient estimation unit 22 then calculates the filter coefficients using the coefficients input from the control unit 17 as initial values. Because the echo cancellation unit 19 starts its calculation from the filter coefficients that were used to remove the echo from the downsampled sound collection beam signals DNE1 to DNE3, the time needed to estimate the filter coefficients is shortened. The echo cancellation unit 19 can therefore remove echoes right from the initial state.
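The choice of initial coefficients described above can be sketched as a lookup with a fallback. The patent does not say how the low-rate coefficients are upsampled, so the linear interpolation in `upsample_coeffs` is an assumption; the values are halved on the reasoning that each full-rate tap spans half the time of a low-rate tap. All names and numbers below are hypothetical.

```python
def upsample_coeffs(low_rate_coeffs):
    """Stretch coefficients to twice the rate by linear interpolation.

    This is one possible method, not one specified by the patent. Values are
    halved because each full-rate tap covers half the time of a low-rate tap.
    """
    out = []
    for k, c in enumerate(low_rate_coeffs):
        nxt = low_rate_coeffs[k + 1] if k + 1 < len(low_rate_coeffs) else 0.0
        out.append(0.5 * c)                # tap aligned with the low-rate tap
        out.append(0.5 * (c + nxt) / 2.0)  # tap halfway to the next one
    return out


def initial_coeffs(direction, coefficient_list, light_canceller_coeffs):
    """Choose starting coefficients for the full-rate echo canceller."""
    if direction in coefficient_list:
        # Previously adapted coefficients exist for this direction: reuse them.
        return list(coefficient_list[direction])
    # Otherwise fall back to the lightweight canceller for that direction.
    return upsample_coeffs(light_canceller_coeffs[direction])


# Hypothetical state: D2 has been used before, D1 has not.
coefficient_list = {"D2": [0.3, 0.1, 0.0, 0.0]}
light_coeffs = {"D1": [0.5, 0.25], "D2": [0.4, 0.2]}

init_d1 = initial_coeffs("D1", coefficient_list, light_coeffs)  # from the fallback
init_d2 = initial_coeffs("D2", coefficient_list, light_coeffs)  # from the list
```

Either way the full-rate canceller begins adapting from a non-trivial estimate of the echo path instead of from all zeros, which is the source of the shortened estimation time.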

Description of symbols: 1, sound emission and collection device; 11, communication control unit; 12, sound collection control unit; 13, 14, DS units; 15, 19, echo cancellation units; 16, direction estimation unit; 17, control unit; 18, sound collection signal selection unit; 21, storage unit; 211, coefficient list; 22, filter coefficient estimation unit; 23, adaptive filter; 24, addition unit; MIC1 to MIC3, microphones; SP, speaker.

Claims

1. A sound collection device including echo cancellation means for removing echoes from a sound collection signal based on a sound emission signal, the sound collection device comprising:
a plurality of sound collecting means for collecting sound from a plurality of directions and generating a sound collection signal for each direction;
direction estimation means for estimating the direction of a sound source based on the signal level of the sound collection signal for each direction;
first echo cancellation means for removing echoes from the sound collection signal from the direction of the sound source estimated by the direction estimation means; and
a plurality of second echo cancellation means, each of which removes echoes from the sound collection signal collected by the sound collecting means and has a simpler configuration than the first echo cancellation means,
wherein the direction estimation means estimates the direction of the sound source based on the signal level of the sound collection signals after processing by the plurality of second echo cancellation means.

2. The sound collection device according to claim 1, wherein each of the first echo cancellation means and the plurality of second echo cancellation means includes an adaptive filter and filter coefficient estimation means for estimating the filter coefficients of the adaptive filter, and
each second echo cancellation means has fewer filter coefficients than the first echo cancellation means.

3. The sound collection device according to claim 1, further comprising downsampling means for downsampling the sound collection signal collected by the sound collecting means,
wherein the second echo cancellation means removes echoes from the downsampled sound collection signal.

4. The sound collection device according to any one of claims 1 to 3, wherein the filter coefficient estimation means of the first echo cancellation means performs an initial process that uses, as an initial value, the filter coefficients of the adaptive filter of the second echo cancellation means that removed the echo from the sound collection signal of the direction estimated by the direction estimation means.

5. The sound collection device according to claim 4, wherein the first echo cancellation means further comprises storage means for storing the filter coefficients of the adaptive filter for each direction, and
the filter coefficient estimation means of the first echo cancellation means performs the initial process only when no filter coefficients are stored in the storage means.

6. A sound emission and collection system comprising a sound emission and collection device that is constituted by the sound collection device according to any one of claims 1 to 5 and further includes communication means and sound emitting means,
wherein the communication means transmits the sound collection signal from which echoes have been removed by the first echo cancellation means to another sound emission and collection device.

7. The sound emission and collection system according to claim 6, wherein the second echo cancellation means generates a pseudo return signal based on a downsampled sound emission signal.