JP7060905B1

JP7060905B1 - Sound collection system, sound collection method and program

Info

Publication number: JP7060905B1
Application number: JP2022502563A
Authority: JP
Inventors: 圭司松永
Original assignee: Audio Technica KK
Current assignee: Audio Technica KK
Priority date: 2020-11-11
Filing date: 2021-10-12
Publication date: 2022-04-27
Anticipated expiration: 2041-10-12
Also published as: EP4207196A1; US20230247361A1; CN116490924A; EP4207196A4; JPWO2022102322A1

Abstract

A sound collection system S includes: a microphone array 1 including a plurality of microphones 2; a first beamformer 152 that outputs a first signal in which, among a plurality of sound signals based on sound arriving at the plurality of microphones 2, a sound signal based on sound arriving from a direction within a first range is emphasized over sound signals based on sound arriving from other directions; a second beamformer 153 that outputs a second signal in which, among the plurality of sound signals, a sound signal based on sound arriving from a direction within a second range is emphasized over sound signals based on sound arriving from other directions; a sound source direction detection unit 151 that detects the direction of the sound source that emitted the sound arriving at the plurality of microphones 2; and a directivity control unit 155 that causes the second beamformer 153 to output the second signal when it determines, while the first beamformer 152 is outputting the first signal, that the change in angle per unit time of the sound source direction detected by the sound source direction detection unit 151 is equal to or greater than a threshold.

Description

The present invention relates to a sound collection system, a sound collection method, and a program.

A beamforming processing device is known that performs beamforming using the phase differences between audio signals observed by a plurality of microphones, thereby collecting sound with directivity toward the sound source (see, for example, Patent Document 1).

Patent Document 1: Japanese Unexamined Patent Publication No. 2013-201525

Conventional beamforming processing devices assume that there is only one sound source. Consequently, when another speaker starts talking while a conventional device is collecting sound with directivity toward one speaker, the device cannot pick up the other speaker's voice.

The present invention has been made in view of these points, and its object is to enable the voices of a plurality of speakers to be collected.

A sound collection system according to a first aspect of the present invention includes: a microphone array including a plurality of microphones; a first beamformer that outputs a first signal in which, among a plurality of sound signals based on sound arriving at the plurality of microphones, a sound signal based on sound arriving from a direction within a first range is emphasized over sound signals based on sound arriving from other directions; a second beamformer that outputs a second signal in which, among the plurality of sound signals, a sound signal based on sound arriving from a direction within a second range is emphasized over sound signals based on sound arriving from other directions; a sound source direction detection unit that detects the direction of the sound source that emitted the sound arriving at the plurality of microphones; and a directivity control unit that causes the second beamformer to output the second signal when it determines, while the first beamformer is outputting the first signal, that the change in angle per unit time of the sound source direction detected by the sound source direction detection unit is equal to or greater than a threshold.

When the directivity control unit determines, while the first beamformer is outputting the first signal, that the change in angle per unit time of the sound source direction is less than the threshold, it may cause the first beamformer to continue outputting the first signal with the first range changed.

The directivity control unit may reduce the output level of the first signal when it determines, while the first beamformer is outputting the first signal, that the change in angle is equal to or greater than the threshold.

The directivity control unit may reduce the output level of the first signal at an attenuation rate based on the time elapsed since it determined that the change in angle is equal to or greater than the threshold.

The directivity control unit may increase the output level of the second signal while it reduces the output level of the first signal.

The directivity control unit may increase the output level of the second signal at a rate of change greater than the rate of change at which it reduces the output level of the first signal.
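The level handling described in the preceding paragraphs can be illustrated with a minimal sketch. It assumes linear ramps driven by the time elapsed since the threshold was exceeded; the source only states that the attenuation rate is based on elapsed time and that the second signal rises faster than the first falls. The function name `crossfade_gains` and the default ramp durations are hypothetical.

```python
def crossfade_gains(elapsed_s, fade_out_s=0.5, fade_in_s=0.1):
    """Return (first_gain, second_gain) as linear factors in [0, 1].

    elapsed_s  : time since the change in angle was judged >= threshold
    fade_out_s : assumed time for the first signal to reach zero
    fade_in_s  : assumed time for the second signal to reach full level
    Choosing fade_in_s < fade_out_s makes the second signal rise faster
    than the first signal falls.
    """
    first_gain = max(0.0, 1.0 - elapsed_s / fade_out_s)
    second_gain = min(1.0, elapsed_s / fade_in_s)
    return first_gain, second_gain
```

With these assumed defaults, the new speaker reaches full level after 0.1 s while the previous speaker is still audible at about 80 % of the original level, so the beginning of the new utterance is not clipped.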

The directivity control unit may cause the second beamformer to output the second signal when it determines that the sound source direction is not included in the first range.

Before causing the second beamformer to output the second signal, the directivity control unit may determine the second range so that it includes the sound source direction.

The directivity control unit may cause the first beamformer to output the first signal when it determines, while the second beamformer is outputting the second signal, that the change in angle per unit time of the sound source direction detected by the sound source direction detection unit is equal to or greater than the threshold.
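The switching rule described so far can be sketched as a small state update. This is an illustration under stated assumptions, not the claimed implementation: directions are single angles in degrees, the change rate is a plain absolute difference divided by the detection interval, and `BeamState` and `control_step` are hypothetical names. Level fading and the range-membership check are left out.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BeamState:
    active: str           # "first" or "second": beamformer most recently assigned
    first_center: float   # centre of the first range, degrees
    second_center: float  # centre of the second range, degrees


def control_step(state, prev_dir, new_dir, dt, threshold):
    """Return the next BeamState for one newly detected direction.

    prev_dir, new_dir : detected directions in degrees
    dt                : time between the two detections, seconds
    threshold         : change-rate threshold, degrees per second
    """
    rate = abs(new_dir - prev_dir) / dt
    if rate >= threshold:
        # Abrupt jump: treat it as a different source and hand over to the
        # other beamformer, re-centring its range on the new direction first.
        if state.active == "first":
            return replace(state, active="second", second_center=new_dir)
        return replace(state, active="first", first_center=new_dir)
    # Slow drift: same source moving, so keep the beamformer and shift its range.
    if state.active == "first":
        return replace(state, first_center=new_dir)
    return replace(state, second_center=new_dir)
```

A jump from 12° to 60° within 0.1 s gives 480°/s, which exceeds an assumed threshold of 100°/s and hands over to the second beamformer, whereas a drift from 10° to 12° in the same interval only shifts the first range.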

The sound collection system may further include a storage unit that stores the sound source direction detected by the sound source direction detection unit in association with beamformer coefficients, and the directivity control unit may cause the first beamformer or the second beamformer to output the first signal or the second signal using the beamformer coefficients stored in the storage unit in association with the sound source direction detected by the sound source direction detection unit.

The storage unit may store a sound source direction previously detected by the sound source direction detection unit in association with the beamformer coefficients that the directivity control unit previously calculated based on that direction, and when the directivity control unit determines that a sound source direction newly detected by the sound source direction detection unit is the same as the previously detected direction stored in the storage unit, it may use the beamformer coefficients stored in association with the previously detected direction.
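The reuse of previously calculated coefficients can be sketched as a small lookup table. The source does not say how two directions are judged to be "the same"; the angular tolerance below is an assumption, and `CoefficientStore`, `get`, and the `compute` callback are hypothetical names.

```python
class CoefficientStore:
    """Cache of beamformer coefficients keyed by detected direction."""

    def __init__(self, tolerance_deg=1.0):
        self._entries = []          # list of (direction_deg, coefficients)
        self._tolerance = tolerance_deg
        self.computations = 0       # how many times coefficients were computed

    def get(self, direction_deg, compute):
        """Return stored coefficients for a matching direction, else compute.

        compute: callable taking a direction in degrees and returning the
        coefficients (for example per-microphone delays and gains).
        """
        for stored_dir, coefficients in self._entries:
            if abs(stored_dir - direction_deg) <= self._tolerance:
                return coefficients  # same direction as before: reuse
        coefficients = compute(direction_deg)
        self.computations += 1
        self._entries.append((direction_deg, coefficients))
        return coefficients
```

When a speaker who spoke earlier talks again from the same seat, the lookup returns immediately and the coefficient calculation is skipped.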

A sound collection method according to a second aspect of the present invention includes the steps of: outputting a first signal in which, among a plurality of sound signals based on sound arriving at a plurality of microphones, a sound signal based on sound arriving from a direction within a first range is emphasized over sound signals based on sound arriving from other directions; detecting the direction of the sound source that emitted the sound arriving at the plurality of microphones; and, when it is determined while the first signal is being output that the change in angle per unit time of the sound source direction is equal to or greater than a threshold, outputting a second signal in which, among the plurality of sound signals, a sound signal based on sound arriving from a direction within a second range is emphasized over sound signals based on sound arriving from other directions.

A program according to a third aspect of the present invention causes a computer to function as: a first beamformer that outputs a first signal in which, among a plurality of sound signals based on sound arriving at a plurality of microphones, a sound signal based on sound arriving from a direction within a first range is emphasized over sound signals based on sound arriving from other directions; a second beamformer that outputs a second signal in which, among the plurality of sound signals, a sound signal based on sound arriving from a direction within a second range is emphasized over sound signals based on sound arriving from other directions; a sound source direction detection unit that detects the direction of the sound source that emitted the sound arriving at the plurality of microphones; and a directivity control unit that causes the second beamformer to output the second signal when it determines, while the first beamformer is outputting the first signal, that the change in angle per unit time of the sound source direction detected by the sound source direction detection unit is equal to or greater than a threshold.

The present invention has the effect of enabling the voices of a plurality of speakers to be collected.

FIG. 1 is a diagram for explaining an overview of the sound collection system S according to the present embodiment.
FIG. 2 is a diagram showing, in time series, how the sound collection system S collects a plurality of voices uttered by a plurality of speakers.
FIG. 3 is a diagram for explaining the configuration of the sound collection system S.
FIG. 4 is a diagram for explaining the configuration of the first beamformer 152.
FIG. 5 is a flowchart showing the process by which the beamforming processing unit 15 determines whether a new sound source has been detected.
FIG. 6 is a flowchart showing the process by which the beamforming processing unit 15 controls the beamformers based on the detection of a new sound source.

<Overview of the sound collection system S according to this embodiment>
FIG. 1 is a diagram for explaining an overview of the sound collection system S according to the present embodiment. FIG. 1 shows the inside of a space R viewed from the side. The space R is, for example, a room in a building, but is not limited to this and may be a corridor, a lounge, a stairwell, or the like in a building. As shown in FIG. 1, the sound collection system S is installed on the upper surface of the space R, and a speaker A1, a speaker A2, and a speaker A3 are present in the space R. The voices B1, B2, and B3 in FIG. 1 are the voices uttered by the speakers A1, A2, and A3, respectively. The sound collection system S may instead be installed on a side surface or the bottom surface of the space R.

The sound collection system S includes a microphone array including a plurality of microphones, and a signal processing device. The signal processing device has a plurality of beamformers that process the sound reaching the microphone array. The sound collection system S performs beamforming with each of the beamformers using beamformer coefficients that correspond to a detected sound source direction, and thereby emulates a plurality of directional microphones. The beamformer coefficients are described later.

FIG. 2 is a diagram showing, in time series, how the sound collection system S collects a plurality of voices uttered by a plurality of speakers. The horizontal axis of FIG. 2 indicates time. "Speaker A1", "Speaker A2", and "Speaker A3" on the vertical axis of FIG. 2 indicate the periods during which the speakers A1, A2, and A3 utter the voices B1, B2, and B3, respectively. "First beamformer" and "Second beamformer" on the vertical axis of FIG. 2 indicate the periods during which the first beamformer and the second beamformer of the sound collection system S execute beamforming, together with the voice in the sound source direction identified by the beamforming. "Output sound" indicates the sound that the sound collection system S collects and outputs to an external device. The external device is, for example, a router connected to a communication network or a computer having a storage medium.

As shown in FIG. 2, the speaker A1 utters the voice B1 from time T1 to time T3, the speaker A2 utters the voice B2 from time T2 to time T5, and the speaker A3 utters the voice B3 from time T4 to time T6. At time T1, the sound collection system S detects the voice B1, starts beamforming with the first beamformer, and identifies the sound source direction of the voice B1. At time T2, the sound collection system S detects the voice B2, which arrives from a direction different from that of the voice B1, and starts beamforming with the second beamformer to identify the sound source direction of the voice B2. At time T3, the sound collection system S stops the beamforming of the first beamformer.

At time T4, the sound collection system S detects the sound source direction of the voice B3 and starts beamforming with the first beamformer. At time T5, the sound collection system S stops the beamforming of the second beamformer. As a result, the sound collection system S collects the voice B1 from time T1 to time T2, and the voices B1 and B2 from time T2 to time T3. It collects the voice B2 from time T3 to time T4, and the voices B2 and B3 from time T4 to time T5. From time T5 to time T6, it collects the voice B3.
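The FIG. 2 timeline can be reproduced with a short simulation. It is a simplification for illustration only: a voice is given to the first free beamformer when it starts and the beamformer is released when the voice stops, whereas the system described here triggers on the detected direction. The names `allocate_beamformers`, `voices_collected`, and `fig2` are hypothetical, and T1 to T6 are written as 1 to 6.

```python
def allocate_beamformers(speech, beamformers=("first", "second")):
    """Map each voice to the beamformer that would pick it up.

    speech: {voice: (start, end)} with comparable time stamps.
    A voice that starts while no beamformer is free is left unassigned.
    """
    events = []
    for voice, (start, end) in speech.items():
        events.append((start, 1, voice))  # 1 = voice starts
        events.append((end, 0, voice))    # 0 = voice ends (sorts first on ties)
    events.sort()

    free = list(beamformers)
    active = {}      # voice -> beamformer currently serving it
    allocation = {}  # voice -> beamformer, kept for the whole run
    for _, is_start, voice in events:
        if is_start:
            if free:
                active[voice] = allocation[voice] = free.pop(0)
        elif voice in active:
            free.append(active.pop(voice))
            free.sort(key=beamformers.index)
    return allocation


def voices_collected(speech, t_from, t_to):
    """Voices that are being uttered throughout the interval [t_from, t_to]."""
    return sorted(v for v, (s, e) in speech.items() if s <= t_from and t_to <= e)


# Utterance periods from FIG. 2.
fig2 = {"B1": (1, 3), "B2": (2, 5), "B3": (4, 6)}
```

For `fig2`, the allocation assigns B1 and B3 to the first beamformer and B2 to the second, and `voices_collected(fig2, 2, 3)` lists B1 and B2, matching the description of the interval from T2 to T3.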

Because the sound collection system S has a plurality of beamformers in this way, it emulates the situation in which a plurality of narrow-directivity microphones are each pointed at a sound source, and collects sound accordingly. Furthermore, by switching among the beamformers, the sound collection system S can collect the voices of a plurality of speakers without interruption even when there are more speakers than beamformers and the person speaking changes.

In FIG. 2, the sound collection system S stops beamforming when the speaker stops talking, but it may continue beamforming after the speaker's voice has stopped. For example, the sound collection system S may stop the beamforming of the first beamformer, which started at time T1, not at time T3 but after a fixed time has elapsed from time T3. Alternatively, the sound collection system S may continue the beamforming of the first beamformer at time T3 without stopping it. In this case, when the sound collection system S detects the sound source direction of the voice B3 at time T4, it switches the beamforming direction of the first beamformer to the sound source direction of the voice B3.

<Configuration of the sound collection system S>
FIG. 3 is a diagram for explaining the configuration of the sound collection system S. The sound collection system S includes a microphone array 1 and a signal processing device 10. The microphone array 1 includes a plurality of microphones 2 (microphones 2a, 2b, 2c, and 2d). The microphones 2 output electric signals based on the arriving sound. The signal processing device 10 processes the electric signals output by the microphones 2 to increase the directivity toward the sound source, and thereby outputs the sound emitted by the sound source in emphasized form.

The signal processing device 10 includes an input unit 11, a first attenuation unit 12, a second attenuation unit 13, an output unit 14, and a beamforming processing unit 15. The input unit 11 includes, for example, a preamplifier and an A/D (analog/digital) converter. The input unit 11 generates a plurality of sound signals by converting the analog electric signals input from the respective microphones 2 into digital signals. For example, the input unit 11 amplifies the analog electric signal input from each of the microphones 2 to generate a plurality of amplified signals, and converts the amplified signals into digital signals to generate the plurality of sound signals. The input unit 11 outputs the generated sound signals to the beamforming processing unit 15.

The first attenuation unit 12 and the second attenuation unit 13 decrease or increase the level of the signals input from the beamforming processing unit 15, based on attenuator gains acquired from the beamforming processing unit 15. An attenuator gain corresponds to an attenuation rate, that is, the amount by which the signal level is decreased or increased relative to the level before the first attenuation unit 12 or the second attenuation unit 13 changes it. The first attenuation unit 12 and the second attenuation unit 13 output the level-adjusted signals to the output unit 14.

The output unit 14 outputs the signals input from the first attenuation unit 12 and the second attenuation unit 13. The output unit 14 generates an output sound signal by adding the signal output by the first attenuation unit 12 and the signal output by the second attenuation unit 13, and outputs the generated output sound signal. The output unit 14 includes, for example, a D/A (digital/analog) converter, converts the digital output sound signal into an analog signal, and outputs the converted analog signal.
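The path through the two attenuation units and the adder of the output unit can be sketched as follows. Expressing the attenuator gain in decibels is an assumption made for this example (the source only says the gain corresponds to an attenuation rate), and `db_to_linear` and `mix_outputs` are hypothetical names.

```python
def db_to_linear(gain_db):
    """Convert an amplitude gain in decibels to a linear factor."""
    return 10.0 ** (gain_db / 20.0)


def mix_outputs(first_signal, second_signal, first_gain_db, second_gain_db):
    """Scale the two beamformer outputs and add them sample by sample.

    first_signal, second_signal : equal-length lists of samples
    first_gain_db, second_gain_db : assumed attenuator gains in dB
    (0 dB leaves the level unchanged, negative values attenuate)
    """
    g1 = db_to_linear(first_gain_db)
    g2 = db_to_linear(second_gain_db)
    return [g1 * a + g2 * b for a, b in zip(first_signal, second_signal)]
```

For instance, `mix_outputs(first, second, -20.0, 0.0)` keeps the second signal at full level while the first is reduced to about one tenth of its amplitude.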

The beamforming processing unit 15 includes a sound source direction detection unit 151, a first beamformer 152, a second beamformer 153, a storage unit 154, and a directivity control unit 155. The beamforming processing unit 15 is implemented by, for example, a digital signal processor.

The sound source direction detection unit 151 detects the direction of the sound source that emitted the sound arriving at the microphones 2. When the microphone array 1 is installed on the upper surface of a space, for example, the sound source direction is expressed as the angle between a straight line extending vertically from the center of the microphone array 1 and a straight line connecting the position of the microphones 2 and the position of the sound source. The sound source direction detection unit 151 detects the sound source direction by, for example, applying the delay-and-sum array method based on the differences in the times at which the sound arrived at the respective microphones 2. The sound source direction detection unit 151 notifies the directivity control unit 155 of the detected sound source direction.
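One way to picture direction detection with a delay-and-sum array is a scan over candidate steering delays that keeps the candidate with the highest output power. The sketch below assumes a uniform linear array, a far-field source, integer sample shifts between neighbouring microphones, and a speed of sound of 343 m/s; none of these details come from the source, and `steered_power` and `estimate_direction` are hypothetical names.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def steered_power(channels, shift_per_mic):
    """Power of the sum after advancing microphone m by m * shift_per_mic samples."""
    n = len(channels[0])
    power = 0.0
    for i in range(n):
        total = 0.0
        for m, channel in enumerate(channels):
            j = i + m * shift_per_mic
            if 0 <= j < n:          # samples outside the buffer count as zero
                total += channel[j]
        power += total * total
    return power


def estimate_direction(channels, spacing_m, fs_hz, max_shift):
    """Return the angle (degrees from broadside) whose steering gives most power.

    A positive angle means the sound reaches higher-numbered microphones later.
    """
    candidates = range(-max_shift, max_shift + 1)
    best = max(candidates, key=lambda k: steered_power(channels, k))
    sin_theta = best * SPEED_OF_SOUND / (spacing_m * fs_hz)
    sin_theta = max(-1.0, min(1.0, sin_theta))  # guard against rounding
    return math.degrees(math.asin(sin_theta))
```

With a spacing of 0.0343 m and a 20 kHz sampling rate, one sample of shift per microphone corresponds to about 30° under these assumptions. A practical detector would use finer, fractional delays and a two-dimensional array.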

The first beamformer 152 outputs a first signal in which, among a plurality of sound signals based on the sound collected by the microphones 2, a sound signal based on sound arriving from a direction within the first range is emphasized over sound signals based on sound arriving from other directions. The first range is a range centered on the direction of a first sound source notified by the sound source direction detection unit 151. The size of the first range is determined by, for example, the number of microphones 2 and the beamformer coefficients set in the first beamformer 152.

The first beamformer 152 generates the first signal by combining the plurality of sound signals input from the input unit 11. Using the beamformer coefficients input from the directivity control unit 155, the first beamformer 152 generates a plurality of sound signals such that the level of a sound signal based on sound arriving from a direction within the first range is higher than the level of sound signals based on sound arriving from other directions. The first beamformer 152 generates the first signal by combining the generated sound signals, and outputs the first signal to the first attenuation unit 12.

FIG. 4 is a diagram for explaining the configuration of the first beamformer 152. The first beamformer 152 includes a plurality of variable delay units 161 (variable delay units 161a, 161b, 161c, and 161d), a plurality of gain adjustment units 162 (gain adjustment units 162a, 162b, 162c, and 162d), and an addition unit 163.

The variable delay units 161 delay the plurality of sound signals acquired from the input unit 11 based on delay amounts input from the directivity control unit 155. The beamformer coefficients correspond to delay amounts, each of which is a time corresponding to the difference in the distance from the sound source to each of the microphones 2 (hereinafter, "propagation distance"), and the variable delay units 161 delay the sound signals based on, for example, the delay amounts given by the beamformer coefficients. Because the variable delay units 161 delay the sound signals by the times corresponding to the differences in propagation distance, the differences in the times at which the sound arrived at the microphones 2 are compensated, and the sound signals from the direction in which the directivity of the first beamformer 152 is strongest become in phase.
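The relation between propagation distance and delay amount can be written out directly. The sketch below assumes two-dimensional coordinates in metres, a speed of sound of 343 m/s, and that every channel is delayed to line up with the microphone farthest from the source; `alignment_delays` is a hypothetical helper, not a function named in the source.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def alignment_delays(source_xy, mic_positions, c=SPEED_OF_SOUND):
    """Delay in seconds to apply to each microphone signal.

    The sound reaches nearer microphones earlier, so each channel is delayed
    by the extra travel time to the farthest microphone. After these delays
    the channels carry the sound from source_xy in phase.
    """
    distances = [
        math.hypot(mx - source_xy[0], my - source_xy[1])
        for mx, my in mic_positions
    ]
    farthest = max(distances)
    return [(farthest - d) / c for d in distances]
```

For a source at the origin and microphones 3.43 m and 6.86 m away, the nearer channel is delayed by about 10 ms and the farther one by zero.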

The gain adjustment units 162 adjust the gain of the signals delayed by the variable delay units 161. The beamformer coefficients also correspond to gains, and each gain adjustment unit 162 amplifies or attenuates the signal delayed by the corresponding variable delay unit 161 based on, for example, the gain given by the beamformer coefficients. The gain of each of the gain adjustment units 162 is determined according to the beamformer coefficients.

The addition unit 163 adds the plurality of signals generated by the gain adjustment units 162. The signal output by a gain adjustment unit 162 corresponding to a direction within the first range is larger than the signals output by the other gain adjustment units 162. Therefore, by adding the plurality of signals, the addition unit 163 generates the first signal, in which a sound signal based on sound arriving from a direction within the first range is emphasized over sound signals based on sound arriving from other directions.
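The delay, gain, and addition stages of FIG. 4 can be combined into a minimal delay-and-sum sketch. It assumes integer sample delays and plain Python lists for the channels; the function name `delay_and_sum` is hypothetical, and a real implementation would typically use fractional delays.

```python
def delay_and_sum(channels, delays_samples, gains):
    """Delay each channel, scale it, and add the results.

    channels       : list of equal-length sample lists, one per microphone
    delays_samples : non-negative integer delay for each channel
    gains          : linear gain for each channel
    """
    n = len(channels[0])
    output = [0.0] * n
    for channel, delay, gain in zip(channels, delays_samples, gains):
        for i in range(delay, n):
            output[i] += gain * channel[i - delay]  # variable delay, then gain
    return output                                   # adder accumulates the sum


# A click that reaches microphone 0 one sample before microphone 1.
on_axis = [[0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]]
steered = delay_and_sum(on_axis, [1, 0], [0.5, 0.5])
```

In `steered` the two copies of the click coincide and add up to a peak of 1.0. If the same delays are applied to a click arriving in the opposite order, the copies land on different samples and the peak stays at 0.5, which is the emphasis of the steered direction described above.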

Returning to FIG. 3, the second beamformer 153 outputs a second signal in which, among the plurality of sound signals input from the input unit 11, a sound signal based on sound arriving from a direction within the second range is emphasized over sound signals based on sound arriving from other directions. The second range is a range centered on the direction of a second sound source notified by the sound source direction detection unit 151. The size of the second range is determined by, for example, the number of microphones 2 and the beamformer coefficients set in the second beamformer 153.

The second beamformer 153 generates the second signal by combining the plurality of sound signals input from the input unit 11. Using the beamformer coefficients input from the directivity control unit 155, the second beamformer 153 generates a plurality of sound signals such that the level of a sound signal based on sound arriving from a direction within the second range is higher than the level of sound signals based on sound arriving from other directions. The second beamformer 153 generates the second signal by combining the generated sound signals, and outputs the second signal to the second attenuation unit 13. The configuration of the second beamformer 153 is equivalent to that of the first beamformer 152 shown in FIG. 4.

The storage unit 154 includes storage media such as a RAM (Random Access Memory) and an SSD (Solid State Drive). The storage unit 154 stores the attenuation coefficient used to calculate the attenuator gains applied by the first attenuation unit 12 and the second attenuation unit 13. The storage unit 154 also stores beamformer coefficients in association with sound source directions.

The storage unit 154 may store the direction of the sound source detected by the sound source direction detection unit 151 in association with beamformer coefficients. For example, the storage unit 154 stores the direction of a sound source detected in the past by the sound source direction detection unit 151 in association with the beamformer coefficients that the directivity control unit 155 calculated in the past based on that direction.

The storage unit 154 also stores a program that causes a processor to function as the sound source direction detection unit 151, the first beamformer 152, the second beamformer 153, and the directivity control unit 155.

The directivity control unit 155 determines the beamformer coefficients of the first beamformer 152 and the second beamformer 153 based on the direction of the sound source notified by the sound source direction detection unit 151, and controls the first beamformer 152 and the second beamformer 153. For example, the directivity control unit 155 causes the first beamformer 152 or the second beamformer 153 to output the first signal or the second signal using the beamformer coefficients stored in the storage unit 154 in association with the direction of the sound source detected by the sound source direction detection unit 151. The directivity control unit 155 also controls the attenuation rates of the first attenuation unit 12 and the second attenuation unit 13.

When the directivity control unit 155 determines, based on the direction of the sound source notified by the sound source direction detection unit 151, that the sound source emitting sound has changed, it changes the beamformer coefficients set in the first beamformer 152 and the second beamformer 153 and the attenuation rates of the first attenuation unit 12 and the second attenuation unit 13. To detect that the sound source has changed or moved, the directivity control unit 155 stores angle information indicating the direction of the sound source notified by the sound source direction detection unit 151 in the storage unit 154. The directivity control unit 155 calculates the change angle, which is the difference between the angle detected by the sound source direction detection unit 151 at the current time and the angle indicated by the angle information stored in the storage unit 154 one unit time earlier (hereinafter, the "immediately preceding angle").

When the change angle per unit time, where the unit time is the interval between the current time and the immediately preceding time, is equal to or greater than a threshold, the directivity control unit 155 determines that the sound source emitting sound has changed. When the change angle is less than the threshold, the directivity control unit 155 determines that the sound source emitting sound has moved. The unit time is, for example, 0.1 seconds. The threshold is a value set based on the minimum difference in direction between the plurality of sound sources, and is, for example, 10 degrees.
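The threshold comparison above can be sketched as follows. This is only an illustration: the function names are hypothetical, the two angles are assumed to have been sampled one unit time (for example 0.1 seconds) apart, and the wrap-around handling at 360 degrees is an assumption that the specification does not discuss.

```python
THRESHOLD_DEG = 10.0  # example threshold value given in the text


def change_angle(current_deg, previous_deg):
    """Smallest absolute difference between two directions, in degrees.

    Wrap-around at 360 degrees is an assumption of this sketch.
    """
    diff = abs(current_deg - previous_deg) % 360.0
    return min(diff, 360.0 - diff)


def classify(current_deg, previous_deg, threshold_deg=THRESHOLD_DEG):
    """Return 'changed' for a different source, 'moved' for the same source."""
    if change_angle(current_deg, previous_deg) >= threshold_deg:
        return "changed"
    return "moved"
```

Under these assumptions a jump from 45 to 60 degrees within one unit time is treated as a different source, while a drift from 45 to 50 degrees is treated as the same source moving.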

When the directivity control unit 155 determines that it has detected a new sound source, it uses a beamformer that is not currently in use among the plurality of beamformers to perform signal processing for a range that includes the new sound source. Specifically, when the directivity control unit 155 determines, while the first beamformer 152 is outputting the first signal, that the change angle per unit time of the direction of the sound source detected by the sound source direction detection unit 151 is equal to or greater than the threshold, it causes the second beamformer 153 to output the second signal. In other words, when the directivity control unit 155 determines that the direction detected by the sound source direction detection unit 151 is the direction of a new sound source not included in the first range, it causes the second beamformer 153 to output the second signal.

Before causing the second beamformer 153 to output the second signal, the directivity control unit 155 determines the second range so that it includes the direction of the newly detected sound source. The directivity control unit 155 calculates the beamformer coefficients corresponding to the determined second range and sets them in the plurality of gain adjustment units 162, thereby causing the second beamformer 153 to output the second signal. Because the directivity control unit 155 operates in this way, when a new sound source starts emitting sound, the signal processing device 10 can collect sound with directivity toward the new sound source as well.

On the other hand, when the directivity control unit 155 determines, while the first beamformer 152 is outputting the first signal, that the change angle per unit time of the direction of the sound source is less than the threshold, it causes the first beamformer 152 to continue outputting the first signal with the first range changed. That is, the directivity control unit 155 determines that the sound source detected at the current time is the same as the one detected at the immediately preceding time, and continues to use the beamformer that is collecting sound with directivity toward the range including the detected sound source.

Thus, even when the directivity control unit 155 determines that the detected sound source is at a position different from that at the immediately preceding time, it does not switch the operating beamformer if it determines that the change angle per unit time of the direction of the sound source is less than the threshold. That is, even if the position of the sound source has changed, the directivity control unit 155 determines that it has detected the same sound source as at the immediately preceding time when the change angle per unit time is less than the threshold. The directivity control unit 155 then changes the directivity direction by changing the beamformer coefficients set in the operating beamformer based on the changed angle. Because the directivity control unit 155 operates in this way, the signal processing device 10 can, for example, collect the voice of a speaker who talks while moving without switching beamformers, which suppresses fluctuation in the level of the collected sound.

When the directivity control unit 155 detects a further new sound source (a sound source in a third direction) while the second beamformer 153 is outputting the second signal, it uses the first beamformer 152 to collect the sound emitted by the newly detected sound source. That is, when the directivity control unit 155 determines, while the second beamformer 153 is outputting the second signal, that the change angle per unit time of the direction of the sound source detected by the sound source direction detection unit 151 is equal to or greater than the threshold, it causes the first beamformer 152 to output the first signal.

When the direction of the newly detected sound source is the same as the direction of a sound source detected in the past, the directivity control unit 155 may use the beamformer coefficients associated with the direction of the previously detected sound source. Specifically, when the directivity control unit 155 determines that the direction of the sound source newly detected by the sound source direction detection unit 151 (the third direction) is the same as the first direction detected in the past, it causes the first beamformer 152 to output the first signal using the beamformer coefficients stored in the storage unit 154 in association with the first direction. By using the beamformer coefficients stored in the storage unit 154, the directivity control unit 155 can shorten the time required for the beamformer to start operating.
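One way to picture this reuse of stored coefficients is a lookup table keyed by direction, sketched below. The class name `CoefficientStore`, the quantization of directions to a fixed resolution as the test for "the same direction", and the `compute` callback are all assumptions made for illustration. The specification does not say how equality of directions is judged or how coefficients are calculated.

```python
class CoefficientStore:
    """Maps a quantized source direction to previously computed coefficients."""

    def __init__(self, resolution_deg=1.0):
        # Directions within the same resolution step count as "the same"
        # (an assumption of this sketch).
        self.resolution_deg = resolution_deg
        self._table = {}
        self.computations = 0  # counts how often coefficients were recomputed

    def _key(self, direction_deg):
        return round(direction_deg / self.resolution_deg)

    def get(self, direction_deg, compute):
        """Return stored coefficients, computing and storing them on a miss."""
        key = self._key(direction_deg)
        if key not in self._table:
            self._table[key] = compute(direction_deg)
            self.computations += 1
        return self._table[key]


def placeholder_coefficients(direction_deg):
    """Stand-in for the real coefficient calculation (hypothetical)."""
    return [direction_deg, direction_deg * 2.0]
```

A second request for a direction that falls in the same resolution step returns the stored coefficients without calling `compute` again, which is where the shorter start-up time would come from.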

In this way, the directivity control unit 155 uses the first beamformer 152 and the second beamformer 153 alternately each time it detects a new sound source. As a result, the signal processing device 10 can collect the sound emitted by a plurality of sound sources even if there is a period in which sound is emitted from the plurality of sound sources simultaneously when the sound source switches.
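The alternating use of the two beamformers can be sketched as a small state holder. The class below is a simplified illustration under stated assumptions: the name `DirectivityController` and its attributes are hypothetical, one direction is assumed to arrive per unit time, angle wrap-around is ignored, and coefficient calculation is omitted.

```python
class DirectivityController:
    """Tracks which of two beamformers follows the most recent sound source."""

    def __init__(self, threshold_deg=10.0):
        self.threshold_deg = threshold_deg
        self.active = 0                  # 0 = first beamformer, 1 = second
        self.directions = [None, None]   # steering direction per beamformer
        self.previous_deg = None

    def update(self, detected_deg):
        """Process one detected direction and return the active beamformer."""
        is_new = (
            self.previous_deg is not None
            and abs(detected_deg - self.previous_deg) >= self.threshold_deg
        )
        if is_new:
            # New source: hand over to the beamformer that is not in use.
            self.active = 1 - self.active
        # Same source: the active beamformer is re-steered instead.
        self.directions[self.active] = detected_deg
        self.previous_deg = detected_deg
        return self.active
```

Feeding the directions 30, 32, 90, 91, 30 into `update` keeps the first beamformer for the first two values, hands over to the second at 90, and returns to the first at the final 30.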

Next, the operation by which the directivity control unit 155 controls the first attenuation unit 12 and the second attenuation unit 13 is described. The directivity control unit 155 calculates the attenuator gains of the first attenuation unit 12 and the second attenuation unit 13 based on the time elapsed since a new sound source was detected. The directivity control unit 155 sets the calculated attenuator gains in the first attenuation unit 12 and the second attenuation unit 13, thereby adjusting the levels of the signals that they output.

When the directivity control unit 155 detects a new sound source, it increases the output level of the attenuation unit downstream of the beamformer whose range includes the new sound source, and decreases the output level of the attenuation unit downstream of the beamformer whose range does not include the new sound source. The following example covers the case in which, over time, the first range corresponding to the first signal output by the first beamformer ceases to include a sound source while the second range corresponding to the second signal output by the second beamformer comes to include the new sound source. In this case, the attenuation unit that is downstream of the first beamformer and decreases the signal level is the first attenuation unit 12, and the attenuation unit that is downstream of the second beamformer and increases the signal level is the second attenuation unit 13.

The directivity control unit 155 reduces the output level of the first signal when it determines, while the first beamformer 152 is outputting the first signal, that the change angle is equal to or greater than the threshold. When reducing the output level of the first signal, the directivity control unit 155 does so at an attenuation rate based on the time elapsed since it determined that the change angle was equal to or greater than the threshold. The directivity control unit 155 operates the first attenuation unit 12 at the attenuation rate corresponding to the attenuator gain determined from the attenuation coefficient and the elapsed time.

The attenuator gain is determined, for example, by multiplying the attenuation coefficient C by the elapsed time T. The attenuation coefficient C is, for example, a fixed negative value. By setting an attenuator gain calculated from the elapsed time in the first attenuation unit 12 in this way, the directivity control unit 155 can attenuate the first signal gradually, which prevents the sound from the sound source from disappearing abruptly.
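A minimal sketch of the product C × T follows. The specification gives only the product itself. Treating the gain as a value in decibels, clamping it at a floor, and converting it to a linear amplitude factor are assumptions of this sketch, as are the function names and the numeric values.

```python
def attenuator_gain_db(coefficient_db_per_s, elapsed_s, floor_db=-60.0):
    """Gain = C * T, interpreted in dB and clamped at a floor (assumptions)."""
    return max(coefficient_db_per_s * elapsed_s, floor_db)


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


# Hypothetical coefficient: the first signal falls by 20 dB per second.
C = -20.0
gain_after_one_second = attenuator_gain_db(C, 1.0)
```

Because the gain falls in proportion to the elapsed time, the first signal fades over several update steps instead of being cut off at the moment the new source is detected.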

The directivity control unit 155 also increases the output level of the second signal output by the second beamformer 153. For example, the directivity control unit 155 increases the output level of the second signal at a rate of change greater than the rate at which it decreases the output level of the first signal. The rate of change is defined as the amount of change in output level per unit time. Because the output level of the second signal thus rises in a short time, the signal processing device 10 can output the voice of a person who has just begun speaking at a sufficient level from the start. The directivity control unit 155 may increase the output level of the second signal while it is decreasing the output level of the first signal. Because the directivity control unit 155 operates in this way, the signal processing device 10 can prevent a silent period from occurring between the first signal and the second signal when it switches between them.
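The asymmetric fade described above might look like the following sketch, in which the second signal rises faster than the first signal falls and both changes overlap in time. The function name, the use of decibels, the linear ramps, and every numeric value are hypothetical choices for illustration only.

```python
def crossfade_gains(elapsed_s, fall_db_per_s=-20.0, rise_db_per_s=60.0,
                    floor_db=-60.0):
    """Return (first_gain_db, second_gain_db) at a time after detection.

    The rise rate is larger in magnitude than the fall rate, so the new
    source reaches full level before the old one has faded out.
    """
    first = max(fall_db_per_s * elapsed_s, floor_db)
    second = min(floor_db + rise_db_per_s * elapsed_s, 0.0)
    return first, second
```

With these example rates the second signal reaches full level after one second, at which point the first signal has dropped by only 20 dB, so there is no interval in which both outputs are silent.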

<Flow of new sound source detection processing>
FIG. 5 is a flowchart showing the flow of processing in which the beamforming processing unit 15 determines whether a new sound source has been detected. The sound source direction detection unit 151 acquires the plurality of sound signals after amplification by the input unit 11 (S11). The sound source direction detection unit 151 detects the sound source direction based on the acquired sound signals (S12).

The directivity control unit 155 calculates the difference between the sound source direction detected by the sound source direction detection unit 151 at the current time and the sound source direction at the immediately preceding time (S13). If the calculated difference is equal to or greater than the threshold (YES in S14), the directivity control unit 155 determines that a new sound source has been detected (S15). If the calculated difference is less than the threshold (NO in S14), the directivity control unit 155 determines that the same sound source as at the immediately preceding time has been detected (S16).

If no operation to end the new sound source detection processing has been performed (NO in S17), the beamforming processing unit 15 repeats the processing from S11 to S17. If an operation to end the new sound source detection processing has been performed (YES in S17), the beamforming processing unit 15 ends the processing.
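The loop of FIG. 5 can be sketched as a pass over a sequence of detected directions. In this illustration the list of directions stands in for steps S11 and S12, and reaching the end of the list stands in for the end operation of S17. The function name and the "new"/"same" labels are hypothetical, and angle wrap-around is ignored.

```python
def detect_new_sources(directions_deg, threshold_deg=10.0):
    """Label each direction after the first as 'new' or 'same'."""
    results = []
    previous = None
    for current in directions_deg:            # S11-S12: one direction per unit time
        if previous is not None:
            diff = abs(current - previous)    # S13: difference from previous time
            if diff >= threshold_deg:         # S14
                results.append("new")         # S15: new sound source
            else:
                results.append("same")        # S16: same sound source
        previous = current
    return results                            # end of input plays the role of S17
```

For the directions 30, 31, 80, 82 this yields "same", "new", "same".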

<Flow of beamformer control processing>
FIG. 6 is a flowchart showing the flow of processing in which the beamforming processing unit 15 controls a beamformer based on the detection of a new sound source. FIG. 6 shows the flow of processing when the directivity control unit 155 controls one of the plurality of beamformers included in the signal processing device 10. The flowchart of FIG. 6 starts from the point at which the first beamformer 152 is outputting the first signal with directivity toward the first sound source.

The first beamformer 152 is operating with the beamformer coefficients for the first sound source (S21). If the directivity control unit 155 has not detected a second sound source (NO in S22), it repeats the processing for detecting a second sound source. If the directivity control unit 155 has detected a second sound source (YES in S22), it starts measuring the elapsed time (S23). The directivity control unit 155 calculates the attenuator gain for the first sound source based on the measured elapsed time and lowers the attenuator gain for the first sound source (S24).

If the directivity control unit 155 detects a sound source other than the second sound source (for example, a third sound source) while the first beamformer 152 is not operating (YES in S25), it applies the beamformer coefficients calculated for the third sound source to the first beamformer 152 (S26). The directivity control unit 155 may obtain the beamformer coefficients for the third sound source by referring to the storage unit 154. The first beamformer 152 starts operating based on the beamformer coefficients for the third sound source applied by the directivity control unit 155 (S27). The directivity control unit 155 increases the attenuator gain for the third sound source (S28).

If the directivity control unit 155 has not detected a third sound source while the first beamformer 152 is not operating (NO in S25), it repeats the processing for detecting a third sound source. If no operation to end the beamformer control processing has been performed (NO in S29), the beamforming processing unit 15 repeats the processing from S21 to S28. If an operation to end the beamformer control processing has been performed (YES in S29), the beamforming processing unit 15 ends the processing.

<Effects of the sound collection system S>
As described above, the sound collection system S includes the first beamformer 152, which outputs a first signal that emphasizes, among the sound signals based on sound arriving at the plurality of microphones 2, the sound signal based on sound arriving from a direction within the first range, and the second beamformer 153, which outputs a second signal that emphasizes, among the plurality of sound signals, the sound signal based on sound arriving from a direction within the second range. The directivity control unit 155 switches the beamformer that performs the beamforming processing based on the direction of the sound source.

Even when the speaker who is talking switches among a plurality of speakers, the sound collection system S can collect the voices of the plurality of speakers without any of them being cut off.

Although FIG. 1 illustrates a case with three speakers, the sound collection system S can also be used in an environment with four or more speakers. In addition, although the above description uses two beamformers included in the sound collection system S, the sound collection system S may include three or more beamformers and thereby collect sound with directivity toward each of three or more sound source directions.

The present invention has been described above using embodiments, but the technical scope of the present invention is not limited to the scope described in those embodiments, and various modifications and changes are possible within the scope of its gist. For example, all or part of the device can be functionally or physically distributed or integrated in arbitrary units. New embodiments produced by any combination of a plurality of embodiments are also included in the embodiments of the present invention. The effects of a new embodiment produced by such a combination include the effects of the original embodiments.

1 Microphone array
2 Microphone
10 Signal processing device
11 Input unit
12 First attenuation unit
13 Second attenuation unit
14 Output unit
15 Beamforming processing unit
151 Sound source direction detection unit
152 First beamformer
153 Second beamformer
154 Storage unit
155 Directivity control unit
161 Variable delay unit
162 Gain adjustment unit
163 Addition unit

Claims

A sound collection system comprising:
a microphone array including a plurality of microphones;
a first beamformer that outputs a first signal in which, among a plurality of sound signals based on sound arriving at the plurality of microphones, a sound signal based on sound arriving from a direction within a first range is emphasized over sound signals based on sound arriving from other directions;
a second beamformer that outputs a second signal in which, among the plurality of sound signals, a sound signal based on sound arriving from a direction within a second range is emphasized over sound signals based on sound arriving from other directions;
a sound source direction detection unit that detects the direction of a sound source that emitted the sound arriving at the plurality of microphones; and
a directivity control unit that causes the second beamformer to output the second signal when it determines, while the first beamformer is outputting the first signal, that the change angle per unit time of the direction of the sound source detected by the sound source direction detection unit is equal to or greater than a threshold.

The sound collection system according to claim 1, wherein the directivity control unit causes the first beamformer to continue outputting the first signal with the first range changed when it determines, while the first beamformer is outputting the first signal, that the change angle per unit time of the direction of the sound source is less than the threshold.

The sound collection system according to claim 1 or 2, wherein the directivity control unit reduces the output level of the first signal when it determines, while the first beamformer is outputting the first signal, that the change angle is equal to or greater than the threshold.

The sound collection system according to claim 3, wherein the directivity control unit reduces the output level of the first signal at an attenuation rate based on the time elapsed since it determined that the change angle was equal to or greater than the threshold.

The sound collection system according to claim 3 or 4, wherein the directivity control unit increases the output level of the second signal while decreasing the output level of the first signal.

The sound collection system according to any one of claims 3 to 5, wherein the directivity control unit increases the output level of the second signal at a rate of change greater than the rate of change at which it decreases the output level of the first signal.

The sound collection system according to any one of claims 1 to 6, wherein the directivity control unit causes the second beamformer to output the second signal when it determines that the direction of the sound source is not included in the first range.

The sound collection system according to any one of claims 1 to 7, wherein the directivity control unit determines the second range so that it includes the direction of the sound source before causing the second beamformer to output the second signal.

The sound collection system according to any one of claims 1 to 8, wherein the directivity control unit causes the first beamformer to output the first signal when it determines, while the second beamformer is outputting the second signal, that the change angle per unit time of the direction of the sound source detected by the sound source direction detection unit is equal to or greater than the threshold.

The sound collection system according to any one of claims 1 to 9, further comprising a storage unit that stores the direction of the sound source detected by the sound source direction detection unit in association with beamformer coefficients,
wherein the directivity control unit causes the first beamformer or the second beamformer to output the first signal or the second signal using the beamformer coefficients stored in the storage unit in association with the direction of the sound source detected by the sound source direction detection unit.

The sound collection system according to claim 10, wherein the storage unit stores, in association with each other, the direction of a sound source detected in the past by the sound source direction detection unit and the beamformer coefficients calculated in the past by the directivity control unit based on that direction, and
the directivity control unit, when it determines that the direction of a sound source newly detected by the sound source direction detection unit is the same as the direction of the previously detected sound source stored in the storage unit, uses the beamformer coefficients stored in association with the direction of the previously detected sound source.

A sound collection method comprising:
a step of outputting a first signal in which, among a plurality of sound signals based on sound arriving at a plurality of microphones, a sound signal based on sound arriving from a direction within a first range is emphasized over sound signals based on sound arriving from other directions;
a step of detecting the direction of a sound source that emitted the sound arriving at the plurality of microphones; and
a step of outputting, when it is determined while the first signal is being output that the change angle per unit time of the direction of the sound source is equal to or greater than a threshold, a second signal in which, among the plurality of sound signals, a sound signal based on sound arriving from a direction within a second range is emphasized over sound signals based on sound arriving from other directions.

A program for causing a computer to function as:
a first beamformer that outputs a first signal in which, among a plurality of sound signals based on sound arriving at a plurality of microphones, a sound signal based on sound arriving from a direction within a first range is emphasized over sound signals based on sound arriving from other directions;
a second beamformer that outputs a second signal in which, among the plurality of sound signals, a sound signal based on sound arriving from a direction within a second range is emphasized over sound signals based on sound arriving from other directions;
a sound source direction detection unit that detects the direction of a sound source that emitted the sound arriving at the plurality of microphones; and
a directivity control unit that causes the second beamformer to output the second signal when it determines, while the first beamformer is outputting the first signal, that the change angle per unit time of the direction of the sound source detected by the sound source direction detection unit is equal to or greater than a threshold.