JP7020283B2

JP7020283B2 - Sound source direction determination device, sound source direction determination method, and sound source direction determination program

Info

Publication number: JP7020283B2
Application number: JP2018091212A
Authority: JP
Inventors: 千里塩田; 信之鷲尾; 政直鈴木; 俊輔武内; 義照土永
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2017-09-14
Filing date: 2018-05-10
Publication date: 2022-02-16
Anticipated expiration: 2038-05-10
Also published as: JP2019087986A

Description

The present invention relates to a sound source direction determination device, a sound source direction determination method, and a sound source direction determination program.

There is a sound source direction determination device that determines the direction of a sound source by arranging a first directional microphone to detect sound propagating along a first direction and a second directional microphone to detect sound propagating along a second direction that intersects the first direction. When the sound pressure detected by the first directional microphone is greater than the sound pressure detected by the second directional microphone, this device determines that the sound propagated along the first direction. Conversely, when the sound pressure detected by the second directional microphone is greater than the sound pressure detected by the first directional microphone, the device determines that the sound propagated along the second direction.

Japanese Unexamined Patent Publication No. 2018-40982
Japanese Patent No. 5387459

Watanabe et al., "Basic study on sound source position estimation using directional microphones", [online], [retrieved September 11, 2017], Internet (URL: http://www.cit.nihon-u.ac.jp/kouendata/No.41/2_denki/2-008.pdf)
Kohei Yamamoto, "Diffraction Calculation Method", Noise Control, Japan, 1997, Vol. 21, No. 3, pp. 143-147

However, directional microphones are larger and more expensive than omnidirectional microphones, so a sound source direction determination device that uses them is larger and more expensive than one that uses omnidirectional microphones.

In one aspect, an object of the present invention is to improve the accuracy of sound source direction determination using omnidirectional microphones.

In one embodiment, a microphone installation portion has a first sound path and a second sound path provided inside it. The first sound path has, at one end, a first opening that opens in a first flat surface, and sound propagates through it from the first opening. The second sound path has, at one end, a second opening that opens in a second flat surface intersecting the first flat surface, and sound propagates through it from the second opening. A first microphone is installed at the other end of the first sound path, and a second microphone is installed at the other end of the second sound path. A determination unit determines the direction in which a sound source exists based on at least one of a difference in sound pressure and a difference in phase. The difference in sound pressure is the difference between a first sound pressure, which is the sound pressure of a first frequency component of the sound acquired by the first microphone, and a second sound pressure, which is the sound pressure of the first frequency component of the sound acquired by the second microphone. The difference in phase is the difference between a first phase, which is the phase of a second frequency component of the sound acquired by the first microphone, and a second phase, which is the phase of the second frequency component of the sound acquired by the second microphone.

In one aspect, the accuracy of sound source direction determination using omnidirectional microphones can be improved.

A block diagram showing an example of the information processing terminal according to the first to fifth embodiments.
A conceptual diagram showing an example of the appearance of the sound source direction determination device according to the first, second, fourth, and fifth embodiments.
A conceptual diagram showing an example of the appearance of the sound source direction determination device according to the first, second, fourth, and fifth embodiments.
A cross-sectional view taken along cutting line 3-3 of FIG. 2A according to the first, fourth, and fifth embodiments.
A conceptual diagram for explaining sound diffraction in the first, third, fourth, and fifth embodiments.
A conceptual diagram for explaining sound diffraction in the first, third, fourth, and fifth embodiments.
A table illustrating the difference between the sound pressure at the first microphone and the sound pressure at the second microphone for flat surfaces of different areas.
A conceptual diagram for explaining sound diffraction in the first, second, fourth, and fifth embodiments.
A conceptual diagram for explaining sound diffraction in the first, second, fourth, and fifth embodiments.
A graph for explaining the decrease in sound pressure due to diffraction along the frequency axis.
A block diagram illustrating an outline of the sound source direction determination processing of the first to third embodiments.
A table illustrating the sound source direction determination criteria of the first to fourth embodiments.
A block diagram showing an example of the hardware of the information processing terminal according to the first to fifth embodiments.
A flowchart showing an example of the flow of the sound source direction determination processing according to the first to third embodiments.
A cross-sectional view taken along cutting line 3-3 of FIG. 2A according to the second embodiment.
A conceptual diagram showing an example of the appearance of the sound source direction determination device according to the third and fifth embodiments.
A conceptual diagram showing an example of the appearance of the sound source direction determination device according to the third and fifth embodiments.
A conceptual diagram showing an example of the appearance of the sound source direction determination device according to the third and fifth embodiments.
A cross-sectional view taken along cutting line 14-14 of FIG. 13A according to the third and fifth embodiments.
A block diagram illustrating an outline of the sound source direction determination processing of the fourth embodiment.
A conceptual diagram for explaining the phase difference when sound reaches the microphones.
A conceptual diagram for explaining the phase difference when sound reaches the microphones.
A flowchart showing an example of the flow of the sound source direction determination processing according to the fourth embodiment.
A flowchart showing an example of the flow of the sound source direction determination processing according to the fourth embodiment.
A flowchart showing an example of the flow of the sound source direction determination processing according to the fourth embodiment.
A flowchart showing an example of the flow of the sound source direction determination processing according to the fourth embodiment.
A flowchart showing an example of the flow of the sound source direction determination processing according to the fourth embodiment.
A flowchart showing an example of the flow of the sound source direction determination processing according to the fourth embodiment.
A flowchart showing an example of the flow of the sound source direction determination processing according to the fourth embodiment.
A conceptual diagram showing an example of a sound source direction determination device using directional microphones according to the related art.
An exemplary table comparing the size of a directional microphone with the size of an omnidirectional microphone.
A conceptual diagram showing an example of a sound source direction determination device using omnidirectional microphones according to the related art.
A conceptual diagram showing an example of a sound source direction determination device using omnidirectional microphones according to the related art.
A table showing an example of a comparison between the sound pressure difference in the related art and the sound pressure difference in the present embodiment.
A conceptual diagram for explaining sound diffraction when a gap exists behind the sound source direction determination device.
A conceptual diagram for explaining sound diffraction when no gap exists behind the sound source direction determination device.
A conceptual diagram illustrating the high-frequency sound pressure difference when a gap does and does not exist behind the sound source direction determination device.
A conceptual diagram illustrating the normalized phase difference when a gap does and does not exist behind the sound source direction determination device.
A block diagram illustrating an outline of the sound source direction determination processing of the fifth embodiment.
A conceptual diagram for explaining the adjustment of the threshold used to determine the sound source direction.
A conceptual diagram illustrating the relationship between the tilt of the sound source direction determination device and the threshold used to determine the sound source direction.
A conceptual diagram illustrating the relationship between the tilt of the sound source direction determination device and the threshold used to determine the sound source direction.
A conceptual diagram illustrating the relationship between the tilt of the sound source direction determination device and the phase difference of sound reaching its top surface and front surface.
A conceptual diagram illustrating the relationship between the tilt of the sound source direction determination device and the phase difference of sound reaching its top surface and front surface.
A conceptual diagram illustrating the relationship between the phase difference of the user's voice and the phase difference of the dialogue partner's voice when the tilt of the sound source direction determination device differs.
A conceptual diagram illustrating the relationship between the phase difference of the user's voice and the phase difference of the dialogue partner's voice when the tilt of the sound source direction determination device is the same.
A conceptual diagram illustrating the relationship between the phase difference of the user's voice and the phase difference of the dialogue partner's voice when the tilt of the sound source direction determination device is the same.
A flowchart showing an example of the flow of the sound source direction determination processing according to the fifth embodiment.
A flowchart showing an example of the flow of the sound source direction determination processing according to the fifth embodiment.

[First Embodiment]
An example of the first embodiment is described in detail below with reference to the drawings.

FIG. 1 illustrates the main functions of the information processing terminal 1. The information processing terminal 1 includes a sound source direction determination device 10 and a speech translation device 14.

The sound source direction determination device 10 includes a first microphone 11 (hereinafter, "microphone" is also abbreviated as "mic"), a second microphone 12, and a determination unit 13. The speech translation device 14 includes a first translation unit 14A, a second translation unit 14B, and a speaker 14C.

Each of the first microphone 11 and the second microphone 12 is an omnidirectional microphone that acquires sound from all directions. The determination unit 13 determines the direction in which the source of the sound acquired by the first microphone 11 and the second microphone 12 exists. Based on the sound source direction determined by the determination unit 13, the speech translation device 14 takes the audio signal corresponding to the sound that was acquired by the first microphone 11 or the second microphone 12 and that propagated from that direction, and translates the language it represents into a predetermined language.

Specifically, when the determination unit 13 determines that the sound source exists in a first direction, for example above, the first translation unit 14A translates the language represented by the audio signal corresponding to the acquired sound into a first language (for example, English). When the determination unit 13 determines that the sound source exists in a second direction, for example in front, the second translation unit 14B translates the language represented by the audio signal corresponding to the acquired sound into a second language (for example, Japanese). The speaker 14C outputs, as speech, the language translated by the first translation unit 14A or the second translation unit 14B.
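The routing described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names Direction and select_target_language and the language codes "en" and "ja" are hypothetical and chosen only to mirror the examples given in the text.

```python
from enum import Enum


class Direction(Enum):
    """Directions the determination unit 13 distinguishes (hypothetical names)."""
    UP = "up"        # first direction: toward the user's mouth
    FRONT = "front"  # second direction: toward the dialogue partner


def select_target_language(direction, first_language="en", second_language="ja"):
    """Return the language the acquired speech should be translated into."""
    if direction is Direction.UP:
        # Corresponds to the first translation unit 14A.
        return first_language
    if direction is Direction.FRONT:
        # Corresponds to the second translation unit 14B.
        return second_language
    raise ValueError(f"unsupported direction: {direction!r}")
```

Under these assumptions, speech judged to come from above is sent for translation into the first language, and speech judged to come from the front is sent for translation into the second language.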

FIGS. 2A and 2B illustrate the appearance of the sound source direction determination device 10. The sound source direction determination device 10 is intended to be used, for example, placed in the chest pocket of the user's shirt, fastened with a clip or pin to the part of the clothing near the user's chest, or hung from the user's neck with a strap. FIG. 2A illustrates the top surface of the housing 18 of the sound source direction determination device 10. The housing 18 is an example of the microphone installation portion. The top surface of the housing 18, which is an example of the first flat surface, is the surface that faces upward when the sound source direction determination device 10 is placed in the chest pocket, that is, the surface closest to the user's mouth.

The top surface of the housing 18 has an opening 11O, which is an example of the first opening provided at one end of the first sound path. The first microphone 11 is installed at the other end of the first sound path. In the figures, the arrow FR indicates the front of the sound source direction determination device 10. The speaker 14C is also arranged on the top surface of the housing 18. That is, in the example of FIGS. 2A and 2B, the speech translation device 14 is contained in the housing 18 of the sound source direction determination device 10. The front-rear length of the top surface of the housing 18 is, for example, 1 [cm].

FIG. 2B illustrates the front surface of the housing 18 of the sound source direction determination device 10. The front surface, which is an example of the second flat surface, is, for example, the surface that faces the dialogue partner with whom the user converses when the sound source direction determination device 10 is placed in the chest pocket.

The front surface of the housing 18 has an opening 12O provided at one end of the second sound path. The second microphone 12 is installed at the other end of the second sound path. In the figures, the arrow UP indicates the upward direction of the sound source direction determination device 10. The front surface of the housing 18 is, for example, about the size of a typical business card.

The sound source direction determination device 10 determines that a sound whose source is determined to be above is speech uttered by the user, and transmits the audio signal corresponding to that sound to the first translation unit 14A of the speech translation device 14 so that it is translated into the first language and output as speech from the speaker 14C. The sound source direction determination device 10 also determines that a sound whose source is determined to be in front is speech uttered by the dialogue partner, and transmits the audio signal corresponding to that sound to the second translation unit 14B of the speech translation device 14 so that it is translated into the second language and output as speech from the speaker 14C.

FIG. 3 is a cross-sectional view taken along cutting line 3-3 of FIG. 2A. One end of the second sound path 12R has the opening 12O, which opens in the front surface of the housing 18, and the second microphone 12 is installed at the other end of the second sound path 12R.

One end of the first sound path 11R has the opening 11O, which opens in the top surface of the housing 18, and the first microphone 11 is installed at the other end of the first sound path 11R. The first sound path 11R has a bent portion 11K partway along it. The bent portion 11K is an example of a second diffraction portion.

FIG. 4A illustrates the case where the sound source is in front of the sound source direction determination device 10. When the area of the front surface of the housing 18 is larger than a predetermined value, which is an example of a first predetermined value, the second microphone 12 acquires, in addition to the sound that arrives directly through the opening 12O, sound that is reflected by the front surface of the housing 18 and diffracted at the opening 12O, which is an example of a third diffraction portion.

FIG. 4B illustrates the case where the sound source is above the sound source direction determination device 10. The sound does not reach the second microphone 12 directly; the second microphone 12 acquires sound diffracted at the opening 12O. Therefore, the sound pressure acquired by the second microphone 12 is higher when the sound source is in front than when the sound source is above.

FIG. 5 illustrates the sound pressure acquired by the second microphone 12 when the sound source is in front of the sound source direction determination device 10 and when it is above the device. When the area of the front surface of the sound source direction determination device 10 is 2 [square cm], which is an example of a size equal to or less than the predetermined value, the sound pressure of sound from a source in front of the device is -26 [dBov], and the sound pressure of sound from a source above the device is -29 [dBov]. The difference between the sound pressure from the source in front and the sound pressure from the source above is therefore 3 [dB].

On the other hand, when the area of the front surface of the sound source direction determination device 10 is 63 [square cm], which is an example of a size larger than the predetermined value, the sound pressure of sound from a source in front of the device is -24 [dBov], and the sound pressure of sound from a source above the device is -30 [dBov]. The difference between the sound pressure from the source in front and the sound pressure from the source above is therefore 6 [dB].

That is, the sound pressure difference due to the direction of the sound source is larger when the area of the front surface of the sound source direction determination device 10 is 63 [square cm] than when it is 2 [square cm], which makes the direction of the sound source easier to determine. This is because, when the area of the front surface is larger than the predetermined value, sound from a source in front of the sound source direction determination device 10 is reflected sufficiently.
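The arithmetic behind the two cases of FIG. 5 can be checked with a short sketch. The function names are hypothetical; the conversion from a level difference in dB to a linear sound pressure ratio uses the standard relation ratio = 10^(dB/20), which the text itself does not state.

```python
def level_difference_db(front_dbov, above_dbov):
    """Level of the front source minus level of the source above, in dB."""
    return front_dbov - above_dbov


def pressure_ratio(difference_db):
    """Linear sound pressure ratio corresponding to a level difference in dB."""
    return 10 ** (difference_db / 20)


# Values quoted from FIG. 5 for the second microphone 12.
small_front = level_difference_db(-26, -29)  # front surface of 2 square cm
large_front = level_difference_db(-24, -30)  # front surface of 63 square cm
```

With these values the smaller front surface gives a 3 dB gap and the larger one a 6 dB gap, which corresponds to a sound pressure ratio of roughly 1.4 versus roughly 2.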

The predetermined value may be, for example, 1000 times the cross-sectional area of the sound path. That is, when the diameter of the microphone hole of the second microphone 12 is, for example, 0.5 [mm] and the second sound path 12R has a circular cross section with a diameter of 1 [mm], which is twice the diameter of the microphone hole, the area of the front surface may be larger than about 785 [square mm]. The second sound path 12R may, for example, have the same diameter from one end to the other, or its diameter may decrease gradually from the one end toward the other end. The second sound path 12R may also have, for example, a rectangular cross section.
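The figure of about 785 [square mm] follows from the area of a circle with a diameter of 1 [mm] multiplied by 1000. A minimal sketch of that calculation, with a hypothetical function name and the factor of 1000 taken from the example in the text:

```python
import math


def min_flat_area_mm2(path_diameter_mm, factor=1000):
    """Example threshold area: factor times the circular cross section of the sound path."""
    cross_section_mm2 = math.pi * (path_diameter_mm / 2) ** 2
    return factor * cross_section_mm2


threshold_mm2 = min_flat_area_mm2(1.0)  # sound path with a diameter of 1 mm

# The two front-surface areas of FIG. 5, converted from square cm to square mm.
small_area_mm2 = 2 * 100
large_area_mm2 = 63 * 100
```

Under this example threshold, the 2 [square cm] surface falls below the predetermined value and the 63 [square cm] surface falls above it.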

The length of the second sound path 12R from one end to the other may be, for example, 3 [mm], but may be longer or shorter than 3 [mm]. The second sound path 12R may be orthogonal to the front surface of the housing 18, or may intersect the front surface of the housing 18 at an angle other than 90 [degrees].

FIGS. 6A and 6B are used to explain the sound pressure acquired by the first microphone 11 when the sound source is above the sound source direction determination device 10 and when it is in front of the device. FIG. 6A illustrates the case where the sound source is above the sound source direction determination device 10.

Because the top surface of the housing 18 is short in the front-rear direction and its area is equal to or less than the predetermined value, the acquisition of reflected and diffracted sound illustrated in FIG. 4A cannot be expected even when the sound source is above the sound source direction determination device 10. For this reason, the first sound path 11R is provided with the bent portion 11K. Because the first sound path 11R has the bent portion 11K, sound from above does not reach the first microphone 11 directly; it is diffracted at the bent portion 11K of the first sound path 11R and then acquired by the first microphone 11.

FIG. 6B illustrates the case where the sound source is in front of the sound source direction determination device 10. The sound is diffracted at the opening 11O, which is an example of a first diffraction portion, is diffracted again at the bent portion 11K, and is then acquired by the first microphone 11.

FIG. 7 illustrates the difference between the sound pressure acquired by the first microphone 11 when the sound source is above the sound source direction determination device 10 and the sound pressure acquired by the first microphone 11 when the sound source is in front of the device. The solid line represents the sound pressure [dB] acquired by the first microphone 11 when the sound source is above the sound source direction determination device 10, and the broken line represents the sound pressure [dB] acquired by the first microphone 11 when the sound source is in front of the device.

That is, the vertical distance between the solid line and the broken line represents the difference between the sound pressure acquired by the first microphone 11 when the sound source is above the sound source direction determination device 10 and the sound pressure acquired when the sound source is in front of the device. The horizontal axis of the graph in FIG. 7 is frequency [Hz]; the sound pressure difference tends to be smaller at lower frequencies and larger at higher frequencies. In other words, the sound pressure difference between the case where the sound source is above the device, in which the sound is diffracted once, and the case where the sound source is in front of the device, in which the sound is diffracted twice, becomes more pronounced as the frequency increases.
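A frequency-dependent level gap of the kind plotted in FIG. 7 could be inspected with a sketch like the following, which compares two recordings within a chosen frequency band. This is only an illustration under stated assumptions: the function names, the naive DFT (used to avoid external libraries), the band edges, and the synthetic test tones are all hypothetical and do not come from the patent.

```python
import cmath
import math


def band_level_db(samples, sample_rate, f_lo, f_hi):
    """Level in dB (arbitrary reference) of the DFT bins between f_lo and f_hi."""
    n = len(samples)
    power = 0.0
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        if f_lo <= freq <= f_hi:
            # Naive DFT coefficient for bin k; adequate for short frames.
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(samples))
            power += abs(coeff) ** 2
    return 10 * math.log10(power + 1e-12)  # small offset avoids log10(0)


def band_difference_db(rec_a, rec_b, sample_rate, f_lo, f_hi):
    """Level of rec_a minus level of rec_b inside the band [f_lo, f_hi]."""
    return (band_level_db(rec_a, sample_rate, f_lo, f_hi)
            - band_level_db(rec_b, sample_rate, f_lo, f_hi))


def tone(freq, amplitude, sample_rate=16000, n=64):
    """Synthetic sine frame used only to exercise the functions above."""
    return [amplitude * math.sin(2 * math.pi * freq * i / sample_rate)
            for i in range(n)]
```

For example, comparing tone(3000, 1.0) with tone(3000, 0.5) over the band from 2000 to 4000 Hz gives a difference of 20·log10(2) dB, because the first tone has twice the amplitude of the second.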

The sound attenuation R [dB] due to diffraction is expressed by, for example, Eq. (1).

N is the Fresnel number and is expressed by Eq. (2).
N = δ / (λ / 2)
  = δ · f / 165 … (2)
δ is the path difference [m] between the diffraction path and the direct path, λ is the wavelength of the sound [m], and f is the frequency of the sound [Hz]; the speed of sound (= λ × f) is taken to be 330 [m/sec], so that λ / 2 = 165 / f. As the graph of FIG. 7 also shows, the attenuation R due to diffraction tends to increase as the frequency f increases. Therefore, in the present embodiment, the sound pressure difference of the high-frequency component of the sound is used when determining the direction of the sound source.
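The relationship in Eq. (2) can be checked numerically. The following Python lines are a minimal sketch, not part of the original description: the function name `fresnel_number` and the path difference of 0.01 [m] are hypothetical values chosen for illustration, while the speed of sound of 330 [m/sec] is the value stated above.

```python
SPEED_OF_SOUND_M_PER_S = 330.0  # value assumed in the description


def fresnel_number(path_difference_m, frequency_hz,
                   speed_of_sound=SPEED_OF_SOUND_M_PER_S):
    """Eq. (2): N = delta / (lambda / 2), with lambda = c / f."""
    wavelength_m = speed_of_sound / frequency_hz
    return path_difference_m / (wavelength_m / 2.0)


if __name__ == "__main__":
    delta = 0.01  # hypothetical path difference [m], for illustration only
    for f in (500.0, 3000.0, 8000.0):
        # N grows linearly with f, which is why high bands are attenuated more.
        print(f, fresnel_number(delta, f))
```

With the speed of sound fixed at 330 [m/sec], the function returns the same value as δ · f / 165, so doubling the frequency doubles N.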

When the diameter of the microphone hole of the first microphone 11 is 0.5 [mm], the first sound path 11R may have a circular cross section with a diameter of 1 [mm], which is twice the diameter of the microphone hole. The first sound path 11R may, for example, have the same diameter from one end to the other end, or its diameter may gradually decrease from the one end toward the other end.

第１音道１１Ｒは、一端部から屈曲部１１Ｋに向かって徐々に直径が小さくなり、屈曲部１１Ｋから他端部まで同じ直径を有していてもよい。また、第１音道１１Ｒは、例えば、矩形の断面を有していてもよい。 The diameter of the first sound path 11R gradually decreases from one end to the bent portion 11K, and may have the same diameter from the bent portion 11K to the other end. Further, the first sound path 11R may have, for example, a rectangular cross section.

The length from the one end of the first sound path 11R to the bent portion 11K and the length from the bent portion 11K to the other end may each be, for example, 3 [mm], but may be longer or shorter than 3 [mm]. The section of the first sound path 11R from the one end to the bent portion 11K may be orthogonal to the upper surface of the housing 18, or may intersect the upper surface of the housing 18 at an angle other than 90 [degrees]. Likewise, the section of the first sound path 11R from the bent portion 11K to the other end may be orthogonal to the section from the one end to the bent portion 11K, or may intersect it at an angle other than 90 [degrees].

また、第１マイク１１の周囲は第１音道１１Ｒの他端部と側壁とがつながる部分を除いて側壁で包囲され、他端部と側壁との間に空隙は存在しない。また、第２マイク１２の周囲は第２音道１２Ｒの他端部と側壁とがつながる部分を除いて側壁で包囲され、他端部と側壁との間に空隙は存在しない。なお、筐体１８の上面と前面とは直交している。しかしながら、本実施形態は筐体１８の上面と前面とが直交されている例に限定されず、筐体１８の上面と前面とは、９０［度］以外の角度で交差していてもよい。 Further, the periphery of the first microphone 11 is surrounded by the side wall except for the portion where the other end of the first sound path 11R and the side wall are connected, and there is no gap between the other end and the side wall. Further, the periphery of the second microphone 12 is surrounded by the side wall except for the portion where the other end of the second sound path 12R and the side wall are connected, and there is no gap between the other end and the side wall. The upper surface and the front surface of the housing 18 are orthogonal to each other. However, this embodiment is not limited to the example in which the upper surface and the front surface of the housing 18 are orthogonal to each other, and the upper surface and the front surface of the housing 18 may intersect at an angle other than 90 [degrees].

An outline of the sound source direction determination process performed by the determination unit 13 of the first embodiment is illustrated with reference to FIG. 8. The time-frequency conversion unit 13A applies a time-frequency conversion to the sound signal corresponding to the sound acquired by the first microphone 11 installed as illustrated in FIG. 3. Similarly, the time-frequency conversion unit 13B applies a time-frequency conversion to the sound signal corresponding to the sound acquired by the second microphone 12 installed as illustrated in FIG. 3. For the time-frequency conversion, for example, an FFT (Fast Fourier Transformation) is used.

上記したように、第１マイク１１で取得された音の音圧と、第２マイク１２で取得された音の音圧との音圧差は、高域成分で顕著に現れる。したがって、高域音圧差算出部１３Ｃは、所定の周波数より高い周波数における周波数帯域毎の音圧差の平均値を、高域音圧差として算出する。音源方向判定部１３Ｄは、高域音圧差算出部１３Ｃで算出された高域音圧差に基づいて、音源の位置を判定する。 As described above, the sound pressure difference between the sound pressure of the sound acquired by the first microphone 11 and the sound pressure of the sound acquired by the second microphone 12 appears remarkably in the high frequency component. Therefore, the high frequency sound pressure difference calculation unit 13C calculates the average value of the sound pressure difference for each frequency band at a frequency higher than a predetermined frequency as the high frequency sound pressure difference. The sound source direction determination unit 13D determines the position of the sound source based on the high-frequency sound pressure difference calculated by the high-frequency sound pressure difference calculation unit 13C.

Specifically, the high-frequency sound pressure difference calculation unit 13C calculates the spectral power pow1[bin] of the sound signal corresponding to the sound acquired by the first microphone 11 by Eq. (3), and the spectral power pow2[bin] of the sound signal corresponding to the sound acquired by the second microphone 12 by Eq. (4).
pow1[bin] = re1[bin]² + im1[bin]² … (3)
pow2[bin] = re2[bin]² + im2[bin]² … (4)
Here, bin = 0, …, F-1, where F is the number of frequency bands and may be, for example, 256. re1[bin] is the real part of the frequency spectrum of the frequency band bin obtained when the sound signal of the sound acquired by the first microphone 11 is time-frequency converted, and im1[bin] is the imaginary part of that frequency spectrum.

re2[bin]は、第２マイク１２で取得した音の音信号を時間周波数変換した際に取得される、周波数帯域binの周波数スペクトルの実部である。また、im2[bin]は、第２マイク１２で取得した音の音信号を時間周波数変換した際に取得される、周波数帯域binの周波数スペクトルの虚部である。 re2 [bin] is the real part of the frequency spectrum of the frequency band bin acquired when the sound signal of the sound acquired by the second microphone 12 is time-frequency converted. Further, im2 [bin] is an imaginary part of the frequency spectrum of the frequency band bin acquired when the sound signal of the sound acquired by the second microphone 12 is time-frequency converted.
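Eqs. (3) and (4) can be sketched in Python as follows. This is an illustrative sketch only: the helper names `naive_dft` and `spectral_power` are hypothetical, and a naive discrete Fourier transform from the standard library stands in for the FFT named in the description so that the example stays self-contained. Keeping the first half of the spectrum as the F bands (for example, 256 bands from a 512-sample frame) is an assumption, not something the description states.

```python
import cmath
import math


def naive_dft(frame):
    """O(n^2) DFT used here in place of an FFT; gives the same spectrum."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(frame))
        for k in range(n)
    ]


def spectral_power(spectrum, num_bands):
    """Eqs. (3)/(4): pow[bin] = re[bin]^2 + im[bin]^2 for bin = 0..F-1."""
    return [c.real ** 2 + c.imag ** 2 for c in spectrum[:num_bands]]


if __name__ == "__main__":
    # Hypothetical 8-sample frame containing one cycle of a cosine.
    frame = [math.cos(2 * math.pi * t / 8) for t in range(8)]
    print(spectral_power(naive_dft(frame), num_bands=4))
```

The same `spectral_power` function would be applied once to the spectrum of the first sound signal and once to that of the second.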

Next, the high-frequency sound pressure difference d_pow is calculated by Eq. (5).

The high-frequency sound pressure difference d_pow is an example of the difference in sound pressure, and is the average, over the frequency bands i of the high-frequency range, of the value obtained by subtracting the logarithm of the spectral power pow2[i] from the logarithm of the spectral power pow1[i]. s is the index of the lowest frequency band of the high-frequency range and may be, for example, 96. When the sampling frequency of the sound signal is 16 [kHz] and s = 96, the high-frequency range is 3000 [Hz] to 8 [kHz].
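The averaging described for d_pow can be sketched as follows. Eq. (5) itself is not reproduced in this text, so the sketch rests on labeled assumptions: the logarithm is taken as 10·log10 (suggested by the thresholds being given in dB), the average runs over bands s to F-1, and the bands are assumed to span 0 to half the sampling frequency evenly. The function names and the small power floor are hypothetical.

```python
import math


def band_lower_edge_hz(band_index, sampling_hz, num_bands):
    """Lower edge of a band, assuming num_bands evenly span 0..sampling_hz/2."""
    return band_index * sampling_hz / (2 * num_bands)


def high_band_pressure_difference(pow1, pow2, s, floor=1e-12):
    """Mean of log(pow1[i]) - log(pow2[i]) for i = s..F-1, in dB (assumed)."""
    if len(pow1) != len(pow2) or not 0 <= s < len(pow1):
        raise ValueError("pow1 and pow2 must have equal length F, with 0 <= s < F")
    diffs = [
        # The floor avoids log10(0) for silent bands; it is an added safeguard.
        10.0 * math.log10(max(pow1[i], floor))
        - 10.0 * math.log10(max(pow2[i], floor))
        for i in range(s, len(pow1))
    ]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    print(band_lower_edge_hz(96, 16000, 256))  # lower edge of band s = 96
    pow1 = [1.0] * 256
    pow2 = [1.0] * 96 + [0.5] * 160  # hypothetical powers, for illustration
    print(high_band_pressure_difference(pow1, pow2, s=96))
```

Under the even-spacing assumption, band 96 of 256 at a 16 [kHz] sampling frequency starts at 3000 [Hz], which agrees with the range stated above.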

FIG. 9 illustrates the determination criteria and determination results of the sound source direction determination unit 13D. The high-frequency sound pressure difference d_pow is compared with a first threshold, which is a positive value; when d_pow is larger than the first threshold, the sound source is determined to be at a position facing the upper surface of the housing 18, that is, above. The high-frequency sound pressure difference d_pow is also compared with a second threshold, which is a negative value; when d_pow is smaller than the second threshold, the sound source is determined to be at a position facing the front surface of the housing 18, that is, in front.

また、図９に例示されるように、高域音圧差d_powが第２閾値以上であり、第１閾値以下である場合には、音源方向の判定は不可であると判定する。第１閾値は、例えば、１．５［ｄＢ］、第２閾値は、例えば、－１．５［ｄＢ］であってよい。 Further, as illustrated in FIG. 9, when the high frequency sound pressure difference d_pow is equal to or higher than the second threshold value and equal to or lower than the first threshold value, it is determined that the sound source direction cannot be determined. The first threshold value may be, for example, 1.5 [dB], and the second threshold value may be, for example, −1.5 [dB].
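The determination criteria of FIG. 9 amount to a three-way comparison, which can be sketched as follows. The function name `judge_direction` and the returned labels are hypothetical; the default thresholds are the example values of 1.5 [dB] and -1.5 [dB] given above.

```python
def judge_direction(d_pow, first_threshold=1.5, second_threshold=-1.5):
    """Three-way decision on d_pow as described for FIG. 9."""
    if d_pow > first_threshold:
        return "above"         # source faces the upper surface
    if d_pow < second_threshold:
        return "front"         # source faces the front surface
    return "undetermined"      # second threshold <= d_pow <= first threshold


if __name__ == "__main__":
    for value in (2.0, -2.0, 1.5, 0.0):
        print(value, judge_direction(value))
```

Values exactly equal to either threshold fall in the undetermined range, matching the "equal to or higher than the second threshold and equal to or lower than the first threshold" condition.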

Note that the determination results are as illustrated in FIG. 9 because, in Eq. (5), the high-frequency sound pressure difference d_pow is obtained with the spectral power of the second microphone 12, whose opening 12O is in the front surface of the housing 18, as the reference. When the high-frequency sound pressure difference d_pow is instead obtained with the spectral power of the first microphone 11, whose opening 11O is in the upper surface of the housing 18, as the reference, as illustrated in Eq. (6), the determination results differ.

In that case, the high-frequency sound pressure difference d_pow is compared with the first threshold, which is a positive value; when d_pow is larger than the first threshold, the sound source is determined to be at a position facing the front surface of the housing 18, that is, in front. The high-frequency sound pressure difference d_pow is also compared with the second threshold, which is a negative value; when d_pow is smaller than the second threshold, the sound source is determined to be at a position facing the upper surface of the housing 18, that is, above.
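When the first microphone 11 is the reference, the comparison stays the same and only the resulting directions are swapped. The following sketch illustrates this; the function name and labels are hypothetical, the thresholds are the example values of 1.5 [dB] and -1.5 [dB], and Eq. (6) itself is not reproduced in this text.

```python
def judge_direction_mic1_reference(d_pow, first_threshold=1.5,
                                   second_threshold=-1.5):
    """Decision when d_pow is taken with the first microphone as reference."""
    if d_pow > first_threshold:
        return "front"         # second microphone (front opening) is louder
    if d_pow < second_threshold:
        return "above"         # first microphone (upper opening) is louder
    return "undetermined"


if __name__ == "__main__":
    for value in (2.0, -2.0, 0.0):
        print(value, judge_direction_mic1_reference(value))
```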

Note that Eqs. (5) and (6) for obtaining the high-frequency sound pressure difference are examples, and the present embodiment is not limited to them. Further, although an example has been described that uses the high-frequency sound pressure difference, which is the difference between the sound pressure of the high-frequency component of the sound acquired by the first microphone 11 and the sound pressure of the high-frequency component of the sound acquired by the second microphone 12, the present embodiment is not limited to this example.

The difference between the sound pressure of a predetermined frequency component of the sound acquired by the first microphone 11 and the sound pressure of the predetermined frequency component of the sound acquired by the second microphone 12 may be used instead of the high-frequency sound pressure difference. The predetermined frequency component is an example of the first frequency component and may be the high-frequency component, but it may be any frequency component in which a sound pressure difference appears prominently between the first microphone 11 and the second microphone 12 depending on the direction of the sound source. The determination criteria and determination results of FIG. 9 are also examples, and the present embodiment is not limited to them.

図１０に、情報処理端末１のハードウェア構成を例示する。情報処理端末１は、ハードウェアであるプロセッサの一例であるＣＰＵ（Central Processing Unit）５１、一次記憶部５２、二次記憶部５３、及び、外部インターフェイス５４を含む。情報処理端末１は、また、第１マイク１１、第２マイク１２、及びスピーカ１４Ｃを含む。 FIG. 10 illustrates the hardware configuration of the information processing terminal 1. The information processing terminal 1 includes a CPU (Central Processing Unit) 51, which is an example of a processor that is hardware, a primary storage unit 52, a secondary storage unit 53, and an external interface 54. The information processing terminal 1 also includes a first microphone 11, a second microphone 12, and a speaker 14C.

ＣＰＵ５１、一次記憶部５２、二次記憶部５３、外部インターフェイス５４、第１マイク１１、第２マイク１２、及びスピーカ１４Ｃは、バス５９を介して相互に接続されている。 The CPU 51, the primary storage unit 52, the secondary storage unit 53, the external interface 54, the first microphone 11, the second microphone 12, and the speaker 14C are connected to each other via the bus 59.

一次記憶部５２は、例えば、ＲＡＭ（Random Access Memory）などの揮発性のメモリである。 The primary storage unit 52 is, for example, a volatile memory such as a RAM (Random Access Memory).

The secondary storage unit 53 includes a program storage area 53A and a data storage area 53B. The program storage area 53A stores, as an example, programs such as a sound source direction determination program for causing the CPU 51 to execute the sound source direction determination process, and a speech translation program for causing the CPU 51 to execute the speech translation process based on the determination result of the sound source direction determination process. The data storage area 53B stores the sound signals corresponding to the sounds acquired from the first microphone 11 and the second microphone 12, intermediate data temporarily generated in the sound source direction determination process and the speech translation process, and the like.

The CPU 51 reads the sound source direction determination program from the program storage area 53A and loads it into the primary storage unit 52. By executing the sound source direction determination program, the CPU 51 operates as the determination unit 13 of FIG. 1. The CPU 51 also reads the speech translation program from the program storage area 53A and loads it into the primary storage unit 52. By executing the speech translation program, the CPU 51 operates as the first translation unit 14A and the second translation unit 14B of FIG. 1. Programs such as the sound source direction determination program and the speech translation program may be stored in a non-transitory recording medium such as a DVD (Digital Versatile Disc), read via a recording medium reading device, and loaded into the primary storage unit 52.

外部インターフェイス５４には、外部装置が接続され、外部インターフェイス５４は、外部装置とＣＰＵ５１との間の各種情報の送受信を司る。例えば、スピーカ１４Ｃは、音源方向判定装置１０に含まれず、外部インターフェイス５４を介して接続される外部装置であってもよい。 An external device is connected to the external interface 54, and the external interface 54 controls transmission / reception of various information between the external device and the CPU 51. For example, the speaker 14C may be an external device that is not included in the sound source direction determination device 10 and is connected via the external interface 54.

Next, an outline of the operation of the sound source direction determination device 10 will be described. The outline of the operation is illustrated in FIG. 11. For example, when the user turns on the power of the sound source direction determination device 10, the CPU 51 reads one frame of sound signals in step 101. Specifically, the CPU 51 reads one frame of the sound signal corresponding to the sound acquired from the first microphone 11 (hereinafter referred to as the first sound signal) and one frame of the sound signal corresponding to the sound acquired from the second microphone 12 (hereinafter referred to as the second sound signal). When the sampling frequency is 16 [kHz], one frame may be, for example, 32 [msec].

ＣＰＵ５１は、ステップ１０２で、ステップ１０１で読み込んだ音信号の各々に時間周波数変換を施す。ＣＰＵ５１は、ステップ１０３で、（３）式及び（４）式を使用して、時間周波数変換を施した音信号の各々のスペクトルパワーを算出し、（５）式を使用して、高域音圧差d_powを算出する。 In step 102, the CPU 51 performs time-frequency conversion on each of the sound signals read in step 101. In step 103, the CPU 51 calculates the spectral power of each of the time-frequency-converted sound signals using the equations (3) and (4), and uses the equation (5) to calculate the high frequency sound. Calculate the pressure difference d_pow.

In step 104, the CPU 51 compares the high-frequency sound pressure difference d_pow calculated in step 103 with the first threshold. When d_pow is larger than the first threshold, the CPU 51 determines that the sound source is above the sound source direction determination device 10 and proceeds to step 105. In step 105, the CPU 51 routes the sound signal to the process of translating from the second language into the first language, and proceeds to step 108. The routed sound signal is translated from the second language into the first language by an existing speech translation technique and is output, for example, as speech from the speaker 14C.

ステップ１０４で、高域音圧差d_powが第１閾値以下であると判定された場合、ＣＰＵ５１は、ステップ１０６で、高域音圧差d_powと第２閾値とを比較し、高域音圧差d_powが第２閾値より小さい場合、音源が音源方向判定装置１０の前方に存在すると判定する。ステップ１０６の判定が肯定された場合、即ち、音源が音源方向判定装置１０の前方に存在すると判定された場合、ＣＰＵ５１は、ステップ１０７に進む。ＣＰＵ５１は、ステップ１０７で、音信号を第１言語から第２言語へ翻訳する処理に振り分け、ステップ１０８に進む。振り分けられた音信号は、既存の音声翻訳処理技術によって、第１言語から第２言語へ翻訳され、例えば、スピーカ１４Ｃから音声として出力される。 When it is determined in step 104 that the high frequency sound pressure difference d_pow is equal to or less than the first threshold value, the CPU 51 compares the high frequency sound pressure difference d_pow with the second threshold value in step 106, and the high frequency sound pressure difference d_pow is the first. If it is smaller than the two threshold values, it is determined that the sound source exists in front of the sound source direction determination device 10. If the determination in step 106 is affirmed, that is, if it is determined that the sound source is in front of the sound source direction determination device 10, the CPU 51 proceeds to step 107. In step 107, the CPU 51 allocates the sound signal to the process of translating the sound signal from the first language to the second language, and proceeds to step 108. The distributed sound signal is translated from the first language to the second language by the existing voice translation processing technology, and is output as voice from the speaker 14C, for example.

When the determination in step 106 is negative, the CPU 51 proceeds to step 108. That is, when the high-frequency sound pressure difference d_pow is equal to or less than the first threshold and equal to or greater than the second threshold, the sound source position is judged to be undeterminable, and neither the translation from the first language into the second language nor the translation from the second language into the first language is performed.

In step 108, the CPU 51 determines whether the sound source direction determination function of the sound source direction determination device 10 has been turned off, for example, by a user operation. When the determination in step 108 is negative, that is, when the sound source direction determination function is on, the CPU 51 returns to step 101, reads the sound signals of the next frame, and continues the sound source direction determination process. When the determination in step 108 is affirmative, that is, when the sound source direction determination function is off, the CPU 51 ends the sound source direction determination process.
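The flow of steps 101 to 108 can be summarized by the following Python sketch. It is not the implementation of the embodiment: the function `run_direction_loop`, its parameters, and the route labels are hypothetical, the time-frequency conversion and d_pow calculation of steps 102 and 103 are passed in as a callable, and the thresholds default to the example values of 1.5 [dB] and -1.5 [dB].

```python
def run_direction_loop(frames, compute_d_pow, is_function_off,
                       first_threshold=1.5, second_threshold=-1.5):
    """Process frame pairs until frames run out or the function is turned off.

    Returns one routing decision per processed frame.
    """
    routes = []
    for first_signal, second_signal in frames:               # step 101
        d_pow = compute_d_pow(first_signal, second_signal)   # steps 102-103
        if d_pow > first_threshold:                          # step 104: above
            routes.append("second_to_first")                 # step 105
        elif d_pow < second_threshold:                       # step 106: front
            routes.append("first_to_second")                 # step 107
        else:
            routes.append(None)                              # no translation
        if is_function_off():                                # step 108
            break
    return routes


if __name__ == "__main__":
    # Hypothetical stand-in: each "signal" is already a level in dB.
    frames = [(3.0, 0.0), (0.0, 3.0), (1.0, 0.0)]
    print(run_direction_loop(frames, lambda a, b: a - b, lambda: False))
```

Passing the d_pow calculation as a callable keeps the sketch independent of how the spectra are obtained; a frame for which the direction cannot be determined is recorded as `None` and is not routed to either translation.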

In the present embodiment, a first sound path and a second sound path are provided inside the microphone installation portion. The first sound path has, at one end, a first opening that opens in a first flat surface, and sound propagates through it from the first opening. The second sound path has, at one end, a second opening that opens in a second flat surface intersecting the first flat surface, and sound propagates through it from the second opening. The first microphone is installed at the other end of the first sound path, and the second microphone is installed at the other end of the second sound path. The determination unit determines the direction in which the sound source exists based on a difference in sound pressure. The difference in sound pressure is the difference between a first sound pressure, which is the sound pressure of a first frequency component of the sound acquired by the first microphone, and a second sound pressure, which is the sound pressure of the first frequency component of the sound acquired by the second microphone.

本実施形態では、上記構成により、無指向性マイクロフォンを使用した音源方向判定の精度を向上させることを可能とする。 In the present embodiment, the above configuration makes it possible to improve the accuracy of sound source direction determination using an omnidirectional microphone.

また、本実施形態では、第１平坦面と第２平坦面とは直交し、第１平坦面の面積は所定値以下であり、第２平坦面の面積は所定値より大きい。第１音道は、第１開口部に音を回折する第１回折部を有し、かつ、途中に、音を回折する屈曲部である第２回折部を有し、第２音道は、第２開口部に音を回折する第３回折部を有する。 Further, in the present embodiment, the first flat surface and the second flat surface are orthogonal to each other, the area of the first flat surface is equal to or less than a predetermined value, and the area of the second flat surface is larger than the predetermined value. The first sound path has a first diffracting portion that diffracts sound in the first opening, and has a second diffracting portion that is a bending portion that diffracts sound in the middle, and the second sound path has a second diffracting portion. The second opening has a third diffracting part that diffracts sound.

In the present embodiment, the above configuration makes it possible to improve the accuracy of sound source direction determination using omnidirectional microphones even when the area of a flat surface having the opening of a sound path is equal to or less than the predetermined value above which the surface can reflect sound sufficiently.

なお、本実施形態では、筐体の上面の面積が所定値以下であり、筐体の前面の面積が所定値より大きい場合について例示したが、上面の面積が所定値より大きく、前面の面積が所定値以下であってもよい。この場合、上面に開口部を有する第１音道が屈曲部である回折部を有さず、前面に開口部を有する第２音道が屈曲部である回折部を有する。 In this embodiment, the case where the area of the upper surface of the housing is equal to or less than the predetermined value and the area of the front surface of the housing is larger than the predetermined value is illustrated, but the area of the upper surface is larger than the predetermined value and the area of the front surface is large. It may be less than or equal to a predetermined value. In this case, the first sound path having an opening on the upper surface does not have a diffraction portion which is a bending portion, and the second sound path having an opening on the front surface has a diffraction portion which is a bending portion.

Although the case where the speech translation device 14 is contained in the housing 18 of the sound source direction determination device 10 has been illustrated, the present embodiment is not limited to this. For example, the speech translation device 14 may be located outside the housing 18 of the sound source direction determination device 10 and may be connected to the sound source direction determination device 10 via a wired or wireless connection.

[Second Embodiment]
Next, an example of the second embodiment will be described. Descriptions of configurations and operations that are the same as in the first embodiment are omitted.

図１２に、図２Ａの切断線３－３に沿った断面図を例示する。第２実施形態では、第１実施形態と同様に、音源方向判定装置１０Ａの筐体１８Ａの上面の面積は所定値以下であり、音源方向判定装置１０Ａの筐体１８Ａの前面の面積は所定値より大きい。 FIG. 12 illustrates a cross-sectional view taken along the cutting line 3-3 of FIG. 2A. In the second embodiment, as in the first embodiment, the area of the upper surface of the housing 18A of the sound source direction determination device 10A is a predetermined value or less, and the area of the front surface of the housing 18A of the sound source direction determination device 10A is a predetermined value. Greater.

In the second embodiment, the first sound path 11AR has, at the opening 11AO, a diffraction portion that diffracts sound and is an example of the first diffraction portion, and has, partway along, a bent portion 11AK that diffracts sound and is an example of the second diffraction portion. The second sound path 12AR has, at the second opening 12AO, a diffraction portion that diffracts sound and is an example of the third diffraction portion, and has, partway along, a bent portion 12AK that diffracts sound and is an example of the fourth diffraction portion.

音源方向判定装置１０Ａの筐体１８Ａの前面は、第１実施形態と同様に所定値より大きい面積を有するが、第１実施形態と異なり、第２音道１２ＡＲは、途中に、回折部である屈曲部１２ＡＫを有している。 The front surface of the housing 18A of the sound source direction determination device 10A has an area larger than a predetermined value as in the first embodiment, but unlike the first embodiment, the second sound path 12AR is a diffractive portion in the middle. It has a bent portion 12AK.

本実施形態では、上記構成により、回折による所定の周波数成分（例えば、高域成分）の減音を利用して、無指向性マイクロフォンを使用した音源方向判定の精度を向上させることを可能とする。 In the present embodiment, the above configuration makes it possible to improve the accuracy of sound source direction determination using an omnidirectional microphone by utilizing the sound reduction of a predetermined frequency component (for example, a high frequency component) by diffraction. ..

[Third Embodiment]
Next, an example of the third embodiment will be described. Descriptions of configurations and operations that are the same as in the first and second embodiments are omitted.

FIGS. 13A to 13C illustrate the appearance of the sound source direction determination device 10C of the third embodiment. FIG. 13A shows the right side surface of the housing 18C, which is an example of the first flat surface; FIG. 13B shows the front surface of the housing 18C, which is an example of the second flat surface; and FIG. 13C shows the sound source direction determination device 10C viewed facing the edge that connects the front surface and the right side surface of the housing 18C. The arrow R in the figures indicates the right-hand side when the sound source direction determination device 10C is viewed from the front.

図１４に、図１３Ａの切断線１４－１４に沿った断面図を例示する。第３実施形態では、第１音道１１ＣＲは、筐体１８Ｃの右側面に開口した第１開口部１１ＣＯを一端部に備え、第２音道１２ＣＲは、筐体１８Ｃの前面に開口した第２開口部１２ＣＯを一端部に備えている。第１マイク１１Ｃが第１音道１１ＣＲの他端部に設置され、第２マイク１２Ｃが第２音道１２ＣＲの他端部に設置されている。 FIG. 14 illustrates a cross-sectional view taken along the cutting line 14-14 of FIG. 13A. In the third embodiment, the first sound path 11CR is provided with a first opening 11CO opened on the right side surface of the housing 18C at one end, and the second sound path 12CR is a second opening opened on the front surface of the housing 18C. An opening 12CO is provided at one end. The first microphone 11C is installed at the other end of the first sound path 11CR, and the second microphone 12C is installed at the other end of the second sound path 12CR.

Unlike the first and second embodiments, neither the first sound path 11CR nor the second sound path 12CR has, partway along, a bent portion serving as a diffraction portion. This is because, in the third embodiment, both the front surface and the right side surface of the housing 18C have an area larger than the predetermined value and can therefore reflect sound sufficiently. In the third embodiment, the first sound path 11CR has, at the first opening 11CO, a diffraction portion that diffracts sound and is an example of the first diffraction portion, and the second sound path 12CR has, at the second opening 12CO, a diffraction portion that diffracts sound and is an example of the second diffraction portion.

本実施形態では、上記構成により、筐体の平坦面で反射した音を利用して、無指向性マイクロフォンを使用した音源方向判定の精度を向上させることを可能とする。 In the present embodiment, the above configuration makes it possible to improve the accuracy of sound source direction determination using an omnidirectional microphone by utilizing the sound reflected on the flat surface of the housing.

なお、第１～第３実施形態において、音源方向判定装置は、第１平坦面及び第２平坦面の少なくとも一方と交差する第３平坦面をさらに有していてもよい。また、第３平坦面に開口した第３開口部を一端部に備え、第３開口部から音が伝搬する第３音道が筐体の内部に設けられ、無指向性の第３マイクが第３音道の他端部に設置されていてもよい。 In the first to third embodiments, the sound source direction determination device may further have a third flat surface that intersects at least one of the first flat surface and the second flat surface. Further, a third opening opened on the third flat surface is provided at one end, a third sound path through which sound propagates from the third opening is provided inside the housing, and an omnidirectional third microphone is provided. It may be installed at the other end of the three sound paths.

第３音道は、第３平坦面の面積が所定値以下である場合、途中に、屈曲部である回折部を有し、第３平坦面の面積が所定値より大きい場合、途中に、屈曲部である回折部を有していてもよいし、有していなくてもよい。この場合、第３平坦面と交差する平坦面に開口部を有する音道の他端部に設置されたマイクで取得された音の所定の周波数成分の音圧と、第３マイクで取得された音の所定の周波数成分の音圧との相違に基づいて、音源が存在する方向を判定する。 When the area of the third flat surface is equal to or less than a predetermined value, the third sound path has, partway along it, a diffraction portion formed by a bent portion; when the area of the third flat surface is larger than the predetermined value, the third sound path may or may not have such a diffraction portion. In this case, the direction in which the sound source exists is determined based on the difference between the sound pressure of a predetermined frequency component of the sound acquired by the microphone installed at the other end of a sound path whose opening is in a flat surface intersecting the third flat surface, and the sound pressure of the predetermined frequency component of the sound acquired by the third microphone.

なお、本実施形態では、音源方向が判定された音信号は、音源方向によって、音声翻訳装置１４で、第１言語から第２言語または第２言語から第１言語に翻訳される例について説明したが、本実施形態はこれに限定されない。音声翻訳装置１４は、例えば、第１翻訳部１４Ａまたは第２翻訳部１４Ｂの何れか一方だけを含んでいてもよい。 In this embodiment, an example is described in which the sound signal whose sound source direction is determined is translated from the first language to the second language or from the second language to the first language by the voice translation device 14 depending on the sound source direction. However, the present embodiment is not limited to this. The speech translation device 14 may include, for example, only one of the first translation unit 14A and the second translation unit 14B.

また、情報処理端末１は、音声翻訳装置１４に代えて、会議支援装置を含んでいてもよい。会議支援装置は、例えば、判定された音源方向及び音信号に基づいて、カメラ、マイク、及び、ディスプレイなどの切り替えを行う。また、情報処理端末１は、音声翻訳装置１４に代えて、ドライブ支援装置を含んでいてもよい。ドライブ支援装置は、判定された音源方向が運転手席側であれば、例えば、音信号に基づいて運転支援を行い、判定された音源方向が助手席側であれば、例えば、音信号に基づいて音楽または動画の再生などの娯楽を提供する。 Further, the information processing terminal 1 may include a conference support device instead of the speech translation device 14. The conference support device switches, for example, a camera, a microphone, a display, and the like based on the determined sound source direction and sound signal. Further, the information processing terminal 1 may include a drive support device instead of the speech translation device 14. If the determined sound source direction is the driver's seat side, the drive support device provides, for example, driving support based on the sound signal; if the determined sound source direction is the passenger seat side, it provides, for example, entertainment such as music or video playback based on the sound signal.

音源方向判定装置を含む情報処理端末は、音源方向判定のための専用端末であってもよいが、既存の端末に、音源方向判定装置がハードウェア及びソフトウェアによって組み込まれていてもよい。既存の端末は、例えば、スマートフォン、タブレット、ウェアラブルデバイス、または、ナビゲーションシステムなどである。また、当該既存の端末に、音源方向判定装置のハードウェアまたはソフトウェアの少なくとも一部分が組み込まれ、音源方向判定装置は、外部装置として当該既存の端末と接続されていてもよい。 The information processing terminal including the sound source direction determination device may be a dedicated terminal for determining the sound source direction, but the sound source direction determination device may be incorporated in an existing terminal by hardware and software. Existing terminals are, for example, smartphones, tablets, wearable devices, or navigation systems. Further, at least a part of the hardware or software of the sound source direction determination device may be incorporated in the existing terminal, and the sound source direction determination device may be connected to the existing terminal as an external device.

なお、図１１におけるフローチャートの処理の順序は一例であり、本実施形態は、当該処理の順序に限定されない。 The order of processing in the flowchart in FIG. 11 is an example, and the present embodiment is not limited to the order of processing.

［第４実施形態］ [Fourth Embodiment]
次に、第４実施形態の一例を説明する。第１～第３実施形態と同様の構成及び作用については、説明を省略する。 Next, an example of the fourth embodiment will be described. The description of the same configuration and operation as those of the first to third embodiments will be omitted.

第４実施形態では、音源方向判定装置１０Ｄは、図１の音源方向判定装置１０の判定部１３に代えて、判定部１３’を含む。図１５を使用して、第４実施形態の判定部１３’で行われる音源方向判定処理の概要を例示する。 In the fourth embodiment, the sound source direction determination device 10D includes a determination unit 13' instead of the determination unit 13 of the sound source direction determination device 10 of FIG. 1. An outline of the sound source direction determination process performed by the determination unit 13' of the fourth embodiment is illustrated using FIG. 15.

図１５の判定部１３’は、位相差算出部１３Ｃ’をさらに含む点で、図８の判定部１３と異なる。即ち、第４実施形態では、高域音圧差に加えて、正規化位相差を使用する点で、第４実施形態は、第１実施形態と異なる。 The determination unit 13' of FIG. 15 differs from the determination unit 13 of FIG. 8 in that it further includes a phase difference calculation unit 13C'. That is, the fourth embodiment differs from the first embodiment in that the normalized phase difference is used in addition to the high frequency sound pressure difference.

図１６Ａに例示するように、上方からの音ＵＳ１が第１マイク１１Ｄに到達するまでの距離は、上方からの音ＵＳ２が第２マイク１２Ｄに到達するまでの距離よりも短い。参考のために記載した基準線ＲＬ１から第１マイク１１Ｄに、音ＵＳ１が到達するまでの矢印ＵＳＤ１と、基準線ＲＬ１から第２マイク１２Ｄに、音ＵＳ２が到達するまでの矢印ＵＳＤ２と、を比較すると明らかである。 As illustrated in FIG. 16A, the distance until the sound US1 from above reaches the first microphone 11D is shorter than the distance until the sound US2 from above reaches the second microphone 12D. For reference, compare the arrow USD1 from the reference line RL1 to the first microphone 11D until the sound US1 reaches, and the arrow USD2 from the reference line RL1 to the second microphone 12D until the sound US2 reaches. Then it is clear.

即ち、上方からの音が第１マイク１１Ｄに到達するまでの時間と、上方からの音が第２マイク１２Ｄに到達するまでの時間と、は異なる。したがって、上方からの音が第１マイク１１Ｄに到達する際の位相と、上方からの音が第２マイク１２Ｄに到達する際の位相と、は異なる。 That is, the time until the sound from above reaches the first microphone 11D differs from the time until the sound from above reaches the second microphone 12D. Therefore, the phase of the sound from above when it reaches the first microphone 11D differs from its phase when it reaches the second microphone 12D.

また、図１６Ｂに例示するように、前方からの音ＦＳ１が第１マイク１１Ｄに到達するまでの距離は、前方からの音ＦＳ２が第２マイク１２Ｄに到達するまでの距離よりも長い。参考のために記載した基準線ＲＬ２から第１マイク１１Ｄに、音ＦＳ１が到達するまでの矢印ＦＳＤ１から明らかである。 Further, as illustrated in FIG. 16B, the distance until the sound FS1 from the front reaches the first microphone 11D is longer than the distance until the sound FS2 from the front reaches the second microphone 12D. It is clear from the arrow FSD1 until the sound FS1 reaches the first microphone 11D from the reference line RL2 described for reference.

即ち、前方からの音が第１マイク１１Ｄに到達するまでの時間と、前方からの音が第２マイク１２Ｄに到達するまでの時間と、は異なる。したがって、前方からの音が第１マイク１１Ｄに到達する際の位相と、前方からの音が第２マイク１２Ｄに到達する際の位相と、は異なる。第４実施形態では、当該位相差を使用して音源方向を判定する。 That is, the time until the sound from the front reaches the first microphone 11D differs from the time until the sound from the front reaches the second microphone 12D. Therefore, the phase of the sound from the front when it reaches the first microphone 11D differs from its phase when it reaches the second microphone 12D. In the fourth embodiment, this phase difference is used to determine the sound source direction.

図１５の位相差算出部１３Ｃ’は、第１マイク１１Ｄで取得された音の位相である第１位相と、第２マイク１２Ｄで取得された音の位相である第２位相との相違を算出する。詳細には、位相差算出部１３Ｃ’は、位相の相違の一例である正規化位相差a_phaseを（７）式で算出する。 The phase difference calculation unit 13C' in FIG. 15 calculates the difference between the first phase, which is the phase of the sound acquired by the first microphone 11D, and the second phase, which is the phase of the sound acquired by the second microphone 12D. Specifically, the phase difference calculation unit 13C' calculates the normalized phase difference a_phase, which is an example of the phase difference, by equation (7).

正規化位相差a_phaseは、ｊ番目の周波数帯域の位相差phase[j]を正規化係数C_n[j]で正規化した値の平均値である。ｊ＝ｓｓ，…，ｅｅであり、ｓｓは正規化位相差算出の下限周波数帯域数であり、ｅｅは正規化位相差算出の上限周波数帯域数であり、ｓｓ及びｅｅは、上記したｂｉｎに含まれる数値（ｂｉｎ＝０，…，ｓｓ，…，ｅｅ，…，Ｆ－１）である。 The normalized phase difference a_phase is the average value of the values obtained by normalizing the phase difference phase [j] of the jth frequency band with the normalization coefficient C_n [j]. j = ss, ..., ee, ss is the lower limit frequency band number for normalization phase difference calculation, ee is the upper limit frequency band number for normalization phase difference calculation, and ss and ee are included in the above bin. It is a numerical value (bin = 0, ..., ss, ..., ee, ..., F-1).

位相差phase[j]は、（８）式で算出される。 The phase difference phase[j] is calculated by equation (8).
phase[j]=atan(phase_im[j]/phase_re[j]) …(8)
phase_re[j]=re1[j]*re2[j]+im1[j]*im2[j]であり、phase_im[j]=im1[j]*re2[j]-re1[j]*im2[j]であり、ａｔａｎはアークタンジェントを表す。 Here, phase_re[j]=re1[j]*re2[j]+im1[j]*im2[j], phase_im[j]=im1[j]*re2[j]-re1[j]*im2[j], and atan denotes the arctangent.
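As an illustration only, equation (8) can be sketched in Python as follows. The function name and the handling of phase_re[j] = 0, where equation (8) is undefined, are assumptions made for this sketch and are not taken from the embodiment. The arguments are assumed to be the real and imaginary parts of the j-th frequency band of the sound acquired by the first and second microphones.

```python
import math


def phase_difference(re1, im1, re2, im2):
    """Phase difference phase[j] of equation (8) for one frequency band."""
    # Real and imaginary parts of X1 * conj(X2), where X1 = re1 + i*im1
    # and X2 = re2 + i*im2 are the two spectra in this band.
    phase_re = re1 * re2 + im1 * im2
    phase_im = im1 * re2 - re1 * im2
    if phase_re == 0.0:
        # Assumption: equation (8) divides by phase_re, so this case is
        # handled separately by returning 0 or +/- pi/2.
        if phase_im == 0.0:
            return 0.0
        return math.copysign(math.pi / 2.0, phase_im)
    return math.atan(phase_im / phase_re)
```

Because atan is used rather than a two-argument arctangent, the result stays within the range from -pi/2 to pi/2, exactly as equation (8) is written.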

また、正規化係数C_n[j]は、（９）式で算出される。 The normalization coefficient C_n[j] is calculated by equation (9).
C_n[j]=λ[j]/λ_c …(9)
λ［ｊ］＝Ｃ／ｆ_ｊであり、λ［ｊ］は周波数帯域数ｊに対応する波長であり、Ｃは音速であり、ｆ_ｊは周波数帯域数jに対応する周波数であり、λ_ｃは基準周波数の音の波長である。基準周波数は、例えば、サンプリング周波数が１６[ｋＨｚ]である場合、上限周波数である８[ｋＨｚ]であってよい。 Here, λ[j]=C/f_j, where λ[j] is the wavelength corresponding to frequency band number j, C is the speed of sound, f_j is the frequency corresponding to frequency band number j, and λ_c is the wavelength of sound at the reference frequency. The reference frequency may be, for example, the upper limit frequency of 8 [kHz] when the sampling frequency is 16 [kHz].
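Equation (9) may be illustrated by the following minimal Python sketch. The function name, the parameter names, and the speed of sound of 340 m/s are assumptions for illustration; only the 8 kHz reference frequency comes from the example above.

```python
def normalization_coefficient(f_j, f_ref=8000.0, sound_speed=340.0):
    """Normalization coefficient C_n[j] = lambda[j] / lambda_c of equation (9)."""
    wavelength_j = sound_speed / f_j      # lambda[j] = C / f_j
    wavelength_ref = sound_speed / f_ref  # lambda_c, wavelength at the reference frequency
    return wavelength_j / wavelength_ref
```

Since the speed of sound cancels, the coefficient reduces to f_ref / f_j; for example, a band at 100 Hz receives a coefficient of about 80 with the 8 kHz reference.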

正規化位相差算出の上限周波数帯域数ｅｅに対応する周波数は、例えば、Ｃ／２Ｌであってよい。Ｌは、第１マイク１１と第２マイク１２との間の距離である。正規化位相差算出の下限周波数帯域数ｓｓに対応する周波数は、例えば、１００Ｈzであってよい。 The frequency corresponding to the upper limit frequency band number ee for normalization phase difference calculation may be, for example, C / 2L. L is the distance between the first microphone 11 and the second microphone 12. The frequency corresponding to the lower limit frequency band number ss for normalized phase difference calculation may be, for example, 100 Hz.
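The band limits described above can be sketched as follows. The mapping from frequency to band number (f_j = j * sampling frequency / frame length), the frame length of 512, the speed of sound of 340 m/s, and all names are assumptions of this sketch; the embodiment only gives C/2L and 100 Hz as example frequencies.

```python
import math


def band_limits(mic_distance_m, sample_rate_hz=16000.0, frame_len=512,
                sound_speed=340.0, lower_hz=100.0):
    """Return (ss, ee), the lower and upper band numbers for the phase average."""
    bin_width = sample_rate_hz / frame_len            # assumed spacing between bands
    num_bins = frame_len // 2                         # assumed F, bands 0 .. F-1
    upper_hz = sound_speed / (2.0 * mic_distance_m)   # C / 2L
    ss = math.ceil(lower_hz / bin_width)
    ee = min(math.floor(upper_hz / bin_width), num_bins - 1)
    return ss, ee
```

At C/2L the microphone spacing equals half a wavelength, so keeping the upper limit at or below that frequency is one way to keep the path difference between the microphones within half a cycle.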

なお、正規化位相差算出の上限周波数帯域数ｅｅ及び下限周波数帯域数ｓｓは雑音の影響が大きくならず、位相変化の適切な検出が可能な程度に設定してもよい。音は、周波数が高くなるとパワーが小さくなるため、周波数が高くなると信号対雑音比が低下し、雑音の影響が大きくなる。また、雑音の影響が大きくならないよう、低い周波数に設定すると、低い周波数の音は波長が長いため、高い周波数の音より位相変化が遅く、短時間での位相変化の適切な検出が困難となる。 The upper limit frequency band number ee and the lower limit frequency band number ss for normalized phase difference calculation may be set so that the influence of noise does not become large and phase changes can be detected appropriately. Because the power of sound decreases as the frequency increases, the signal-to-noise ratio falls at higher frequencies and the influence of noise grows. Conversely, if the band is set to low frequencies to limit the influence of noise, low frequency sound has a long wavelength, so its phase changes more slowly than that of high frequency sound, and appropriate detection of the phase change in a short time becomes difficult.

上記（７）式で算出される正規化位相差a_phaseは、音源が上方に存在する場合、即ち、第１マイク１１Ｄが第２マイク１２Ｄよりも音源に近い場合正の値となる。一方、音源が前方に存在する場合、即ち、第１マイク１１Ｄが第２マイク１２Ｄよりも音源から遠い場合負の値となる。なお、正規化位相差の符号は、第１マイク１１Ｄ及び第２マイク１２Ｄの何れを基準とするかにより異なる。また、正規化位相差を求める手法は、上記（７）式に限定されない。 The normalized phase difference a_phase calculated by the above equation (7) becomes a positive value when the sound source is located above, that is, when the first microphone 11D is closer to the sound source than the second microphone 12D. On the other hand, when the sound source is present in front, that is, when the first microphone 11D is farther from the sound source than the second microphone 12D, a negative value is obtained. The sign of the normalized phase difference differs depending on whether the first microphone 11D or the second microphone 12D is used as a reference. Further, the method for obtaining the normalized phase difference is not limited to the above equation (7).
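The averaging that yields the normalized phase difference a_phase may be sketched as below. Equation (7) itself is not reproduced in this text, so the sketch assumes that "normalizing" means multiplying phase[j] by C_n[j] before averaging over j = ss, ..., ee; this assumption and all names are illustrative only.

```python
def normalized_phase_difference(phase, c_n, ss, ee):
    """Average of phase[j] scaled by C_n[j] over bands ss..ee (inclusive).

    Assumption: normalization is a multiplication by C_n[j], which converts
    each band's phase difference to its equivalent at the reference frequency.
    """
    total = sum(phase[j] * c_n[j] for j in range(ss, ee + 1))
    return total / (ee - ss + 1)
```

Under this assumption, a positive result corresponds to the first microphone being closer to the sound source, matching the sign convention described above.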

次に、音源方向判定装置１０Ｄの作用の概略について説明する。音源方向判定装置１０Ｄの作用の概略を図１７Ａに例示する。図１１と図１７Ａとの差異は、図１１のステップ１０３、１０４及び１０６が、図１７Ａでは、ステップ１０３、１０３Ｂ、１０４、１０４Ｂ、及び１０６と置き替えられている点である。 Next, the outline of the operation of the sound source direction determination device 10D will be described. The outline of the operation of the sound source direction determination device 10D is illustrated in FIG. 17A. The difference between FIGS. 11 and 17A is that steps 103, 104 and 106 in FIG. 11 are replaced with steps 103, 103B, 104, 104B and 106 in FIG. 17A.

即ち、図１７Ａでは、ＣＰＵ５１は、ステップ１０３で、上記したように高域音圧差を算出し、ステップ１０３Ｂで、（７）式を使用して、正規化位相差a_phaseを算出する。ＣＰＵ５１は、ステップ１０４で、高域音圧差が正の第１閾値より大きいか否か判定し、ステップ１０４の判定が肯定された場合、ステップ１０４Ｂで、正規化位相差が正の第３の閾値より大きいか否か判定する。ステップ１０４Ｂの判定が肯定された場合、音源が上方に存在すると判定し、ステップ１０５に進む。 That is, in FIG. 17A, the CPU 51 calculates the high frequency sound pressure difference as described above in step 103, and calculates the normalized phase difference a_phase using the equation (7) in step 103B. The CPU 51 determines in step 104 whether or not the high frequency sound pressure difference is larger than the positive first threshold value, and if the determination in step 104 is affirmed, in step 104B, the normalized phase difference is a positive third threshold value. Determine if it is greater than. If the determination in step 104B is affirmed, it is determined that the sound source exists above, and the process proceeds to step 105.

ステップ１０４の判定が否定された場合、即ち、高域音圧差が正の第１閾値以下である場合、音源が上方に存在しないと判定し、ＣＰＵ５１は、ステップ１０６で、高域音圧差が負の第２閾値より小さいか否か判定する。ステップ１０６の判定が肯定された場合、または、ステップ１０４Ｂの判定が否定された場合、即ち、正規化位相差が正の第３閾値以下である場合、音源が前方に存在すると判定し、ＣＰＵ５１は、ステップ１０７に進む。 When the determination in step 104 is denied, that is, when the high frequency sound pressure difference is equal to or less than the positive first threshold value, it is determined that the sound source does not exist above, and the CPU 51 determines in step 106 that the high frequency sound pressure difference is negative. It is determined whether or not it is smaller than the second threshold value of. If the determination in step 106 is affirmed, or if the determination in step 104B is denied, that is, if the normalized phase difference is equal to or less than the positive third threshold value, it is determined that the sound source exists in front, and the CPU 51 determines. , Step 107.

ステップ１０６の判定が否定された場合、即ち、高域音圧差が負の第２閾値以上である場合、音源方向の判定は不可であると判定して、ＣＰＵ５１は、ステップ１０８に進む。正の第３閾値は、例えば、３．０［ｒａｄ］であってよい。 If the determination in step 106 is denied, that is, if the high frequency sound pressure difference is equal to or greater than the negative second threshold value, it is determined that the determination of the sound source direction is impossible, and the CPU 51 proceeds to step 108. The positive third threshold may be, for example, 3.0 [rad].
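The branching of FIG. 17A described above can be summarized by the following sketch. The function name, the return labels, and the default values of the first and second thresholds are hypothetical; only the 3.0 rad example for the third threshold appears in the description.

```python
def judge_17a(hp_diff, a_phase, th1=1.0, th2=-1.0, th3=3.0):
    """Return 'above', 'front', or 'unknown' following the FIG. 17A description."""
    if hp_diff > th1:          # step 104: high frequency sound pressure difference
        if a_phase > th3:      # step 104B: normalized phase difference
            return "above"     # step 105
        return "front"         # step 107
    if hp_diff < th2:          # step 106
        return "front"         # step 107
    return "unknown"           # step 108: direction cannot be determined
```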

なお、本実施形態は、図１７Ａのステップ１０４、１０４Ｂ、及び１０６で、音源方向を判定することに限定されない。図１７Ｂ～図１７Ｆに例示するように、高域音圧差の判定と正規化位相差の判定とを組み合わせることで、音源方向を判定してもよいし、図１７Ｇに例示するように、正規化位相差の判定で、音源方向を判定してもよい。 Note that this embodiment is not limited to determining the sound source direction in steps 104, 104B, and 106 of FIG. 17A. As illustrated in FIGS. 17B to 17F, the sound source direction may be determined by combining the determination of the high frequency sound pressure difference and the determination of the normalized phase difference, or normalized as illustrated in FIG. 17G. The sound source direction may be determined by determining the phase difference.

図１１と図１７Ｂとの差異は、図１１のステップ１０３、１０４及び１０６が、図１７Ｂでは、ステップ１０３、１０３Ｂ、１０４、１０４Ｂ、１０６、及び１０６Ｂと置き替えられている点である。 The difference between FIGS. 11 and 17B is that steps 103, 104 and 106 in FIG. 11 are replaced with steps 103, 103B, 104, 104B, 106 and 106B in FIG. 17B.

即ち、図１７Ｂでは、ＣＰＵ５１は、ステップ１０４で、高域音圧差が正の第１閾値より大きいか否か判定し、ステップ１０４の判定が肯定された場合、ステップ１０４Ｂで、正規化位相差が正の第３の閾値より大きいか否か判定する。ステップ１０４Ｂの判定が肯定された場合、音源が上方に存在すると判定し、ステップ１０５に進む。 That is, in FIG. 17B, the CPU 51 determines in step 104 whether or not the high frequency sound pressure difference is larger than the positive first threshold value, and if the determination in step 104 is affirmed, the normalized phase difference is determined in step 104B. It is determined whether or not it is larger than the positive third threshold value. If the determination in step 104B is affirmed, it is determined that the sound source exists above, and the process proceeds to step 105.

ステップ１０４の判定が否定された場合、即ち、高域音圧差が正の第１閾値以下である場合、音源が上方に存在しないと判定し、ＣＰＵ５１は、ステップ１０６で、高域音圧差が負の第２閾値より小さいか否か判定する。ステップ１０６の判定が肯定された場合、または、ステップ１０４Ｂの判定が否定された場合、ＣＰＵ５１は、ステップ１０６Ｂで、正規化位相差が負の第４閾値より小さいか否か判定する。ステップ１０６Ｂの判定が肯定された場合、音源が前方に存在すると判定し、ステップ１０７に進む。 When the determination in step 104 is denied, that is, when the high frequency sound pressure difference is equal to or less than the positive first threshold value, it is determined that the sound source does not exist above, and the CPU 51 determines in step 106 that the high frequency sound pressure difference is negative. It is determined whether or not it is smaller than the second threshold value of. If the determination in step 106 is affirmed, or if the determination in step 104B is denied, the CPU 51 determines in step 106B whether the normalized phase difference is smaller than the negative fourth threshold value. If the determination in step 106B is affirmed, it is determined that the sound source exists in front, and the process proceeds to step 107.

ステップ１０６またはステップ１０６Ｂの判定が否定された場合、即ち、高域音圧差が負の第２閾値以上である場合、または、正規化位相差が負の第４閾値以上である場合、音源方向の判定は不可であると判定して、ステップ１０８に進む。 If the determination in step 106 or step 106B is denied, that is, if the high frequency sound pressure difference is equal to or greater than the negative second threshold value, or if the normalized phase difference is equal to or greater than the negative fourth threshold value, the sound source direction It is determined that the determination is not possible, and the process proceeds to step 108.
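A sketch of the FIG. 17B flow follows, under the same caveats: the names and the numeric defaults for the first, second, and fourth thresholds are hypothetical. Compared with FIG. 17A, a "front" result additionally requires the step 106B check on the normalized phase difference.

```python
def judge_17b(hp_diff, a_phase, th1=1.0, th2=-1.0, th3=3.0, th4=-3.0):
    """Return 'above', 'front', or 'unknown' following the FIG. 17B description."""
    if hp_diff > th1:                  # step 104
        if a_phase > th3:              # step 104B
            return "above"             # step 105
        check_106b = True              # step 104B negative leads to step 106B
    else:
        check_106b = hp_diff < th2     # step 106 affirmative leads to step 106B
    if check_106b and a_phase < th4:   # step 106B
        return "front"                 # step 107
    return "unknown"                   # step 108
```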

図１１と図１７Ｃとの差異は、図１１のステップ１０３、１０４及び１０６が、図１７Ｃでは、ステップ１０３、１０３Ｂ、１０４、１０６、及び１０６Ｂと置き替えられている点である。 The difference between FIGS. 11 and 17C is that steps 103, 104 and 106 in FIG. 11 are replaced with steps 103, 103B, 104, 106 and 106B in FIG. 17C.

即ち、図１７Ｃでは、ＣＰＵ５１は、ステップ１０４で、高域音圧差が正の第１閾値より大きいか否か判定し、ステップ１０４の判定が肯定された場合、音源が上方に存在すると判定し、ステップ１０５に進む。 That is, in FIG. 17C, the CPU 51 determines in step 104 whether or not the high frequency sound pressure difference is larger than the positive first threshold value, and if the determination in step 104 is affirmed, determines that the sound source exists above. Proceed to step 105.

ステップ１０４の判定が否定された場合、即ち、高域音圧差が正の第１閾値以下である場合、音源が上方に存在しないと判定し、ＣＰＵ５１は、ステップ１０６で、高域音圧差が負の第２閾値より小さいか否か判定する。ステップ１０６の判定が肯定された場合、ＣＰＵ５１は、ステップ１０６Ｂで、正規化位相差が負の第４閾値より小さいか否か判定する。ステップ１０６Ｂの判定が肯定された場合、音源が前方に存在すると判定し、ステップ１０７に進む。 When the determination in step 104 is denied, that is, when the high frequency sound pressure difference is equal to or less than the positive first threshold value, it is determined that the sound source does not exist above, and the CPU 51 determines in step 106 that the high frequency sound pressure difference is negative. It is determined whether or not it is smaller than the second threshold value of. If the determination in step 106 is affirmed, the CPU 51 determines in step 106B whether the normalized phase difference is smaller than the negative fourth threshold value. If the determination in step 106B is affirmed, it is determined that the sound source exists in front, and the process proceeds to step 107.
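The FIG. 17C flow may be sketched as follows. The description above does not state the outcome when step 106 or step 106B is negative; this sketch assumes that the direction is then treated as undeterminable, as in the other flowcharts. All names and threshold values are hypothetical.

```python
def judge_17c(hp_diff, a_phase, th1=1.0, th2=-1.0, th4=-3.0):
    """Return 'above', 'front', or 'unknown' following the FIG. 17C description."""
    if hp_diff > th1:                      # step 104
        return "above"                     # step 105
    if hp_diff < th2 and a_phase < th4:    # steps 106 and 106B
        return "front"                     # step 107
    # Assumption: negative results at step 106 or 106B mean "cannot determine".
    return "unknown"
```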

図１１と図１７Ｄとの差異は、図１１のステップ１０３、１０４及び１０６が、図１７Ｄでは、ステップ１０３、１０３Ｂ、１０４Ｂ、１０４、及び１０６Ｂと置き替えられている点である。 The difference between FIGS. 11 and 17D is that steps 103, 104 and 106 in FIG. 11 are replaced with steps 103, 103B, 104B, 104 and 106B in FIG. 17D.

即ち、図１７Ｄでは、ＣＰＵ５１は、ステップ１０４Ｂで、正規化位相差が正の第３閾値より大きいか否か判定する。ステップ１０４Ｂの判定が肯定された場合、即ち、正規化位相差が正の第３閾値より大きい場合、ＣＰＵ５１は、ステップ１０４で、高域音圧差が正の第１閾値より大きいか否か判定する。ステップ１０４の判定が肯定された場合、音源が上方に存在すると判定し、ＣＰＵ５１はステップ１０５に進む。 That is, in FIG. 17D, the CPU 51 determines in step 104B whether or not the normalized phase difference is larger than the positive third threshold value. If the determination in step 104B is affirmed, that is, if the normalized phase difference is greater than the positive third threshold, the CPU 51 determines in step 104 whether the high frequency sound pressure difference is greater than the positive first threshold. If the determination in step 104 is affirmed, it is determined that the sound source exists above, and the CPU 51 proceeds to step 105.

ステップ１０４Ｂの判定が否定された場合、即ち、正規化位相差が正の第３閾値以下である場合、音源が上方に存在しないと判定し、ＣＰＵ５１は、ステップ１０６Ｂで、正規化位相差が負の第４閾値より小さいか否か判定する。ステップ１０６Ｂの判定が肯定された場合、または、ステップ１０４の判定が否定された場合、即ち、正規化位相差が負の第４閾値より小さい場合、または、高域音圧差が正の第１閾値以下である場合、音源が前方に存在すると判定し、ステップ１０７に進む。 When the determination in step 104B is denied, that is, when the normalized phase difference is equal to or less than the positive third threshold value, it is determined that the sound source does not exist above, and the CPU 51 determines in step 106B whether or not the normalized phase difference is smaller than the negative fourth threshold value. When the determination in step 106B is affirmed, or when the determination in step 104 is denied, that is, when the normalized phase difference is smaller than the negative fourth threshold value, or when the high frequency sound pressure difference is equal to or less than the positive first threshold value, it is determined that the sound source exists in front, and the process proceeds to step 107.

ステップ１０６Ｂの判定が否定された場合、即ち、正規化位相差が負の第４閾値以上である場合、音源方向の判定は不可であると判定して、ステップ１０８に進む。 If the determination in step 106B is denied, that is, if the normalized phase difference is equal to or greater than the negative fourth threshold value, it is determined that the determination of the sound source direction is impossible, and the process proceeds to step 108.
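In FIG. 17D the normalized phase difference is checked first. A minimal sketch, in which the names and the values of the first and fourth thresholds are hypothetical:

```python
def judge_17d(hp_diff, a_phase, th1=1.0, th3=3.0, th4=-3.0):
    """Return 'above', 'front', or 'unknown' following the FIG. 17D description."""
    if a_phase > th3:          # step 104B
        if hp_diff > th1:      # step 104
            return "above"     # step 105
        return "front"         # step 104 negative leads to step 107
    if a_phase < th4:          # step 106B
        return "front"         # step 107
    return "unknown"           # step 108
```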

図１１と図１７Ｅとの差異は、図１１のステップ１０３、１０４及び１０６が、図１７Ｅでは、ステップ１０３、１０３Ｂ、１０４Ｂ、１０４、１０６Ｂ、及び１０６と置き替えられている点である。 The difference between FIGS. 11 and 17E is that steps 103, 104 and 106 in FIG. 11 are replaced with steps 103, 103B, 104B, 104, 106B and 106 in FIG. 17E.

即ち、図１７Ｅでは、ＣＰＵ５１は、ステップ１０４Ｂで、正規化位相差が正の第３閾値より大きいか否か判定する。ステップ１０４Ｂの判定が肯定された場合、即ち、正規化位相差が正の第３閾値より大きい場合、ＣＰＵ５１は、ステップ１０４で、高域音圧差が正の第１閾値より大きいか否か判定する。ステップ１０４の判定が肯定された場合、音源が上方に存在すると判定し、ＣＰＵ５１はステップ１０５に進む。 That is, in FIG. 17E, the CPU 51 determines in step 104B whether or not the normalized phase difference is larger than the positive third threshold value. If the determination in step 104B is affirmed, that is, if the normalized phase difference is greater than the positive third threshold, the CPU 51 determines in step 104 whether the high frequency sound pressure difference is greater than the positive first threshold. If the determination in step 104 is affirmed, it is determined that the sound source exists above, and the CPU 51 proceeds to step 105.

ステップ１０４Ｂの判定が否定された場合、即ち、正規化位相差が正の第３閾値以下である場合、音源が上方に存在しないと判定し、ＣＰＵ５１は、ステップ１０６Ｂで、正規化位相差が負の第４閾値より小さいか否か判定する。ステップ１０６Ｂの判定が肯定された場合、または、ステップ１０４の判定が否定された場合、即ち、正規化位相差が負の第４閾値より小さい場合、または、高域音圧差が正の第１閾値以下である場合、ＣＰＵ５１は、ステップ１０６に進む。ＣＰＵ５１は、ステップ１０６で、高域音圧差が負の第２閾値より小さいか否か判定する。ステップ１０６の判定が肯定された場合、即ち、高域音圧差が負の第２閾値より小さい場合、音源が前方に存在すると判定し、ステップ１０７に進む。 When the determination in step 104B is denied, that is, when the normalized phase difference is equal to or less than the positive third threshold value, it is determined that the sound source does not exist above, and the CPU 51 determines in step 106B that the normalized phase difference is negative. It is determined whether or not it is smaller than the fourth threshold value of. When the determination in step 106B is affirmed, or when the determination in step 104 is denied, that is, when the normalized phase difference is smaller than the negative fourth threshold value, or when the high frequency sound pressure difference is positive first threshold value. If the following is true, the CPU 51 proceeds to step 106. In step 106, the CPU 51 determines whether or not the high frequency sound pressure difference is smaller than the negative second threshold value. If the determination in step 106 is affirmative, that is, if the high frequency sound pressure difference is smaller than the negative second threshold value, it is determined that the sound source exists in front, and the process proceeds to step 107.

ステップ１０６Ｂの判定が否定された場合、または、ステップ１０６の判定が否定された場合、即ち、正規化位相差が負の第４閾値以上である場合、または、高域音圧差が負の第２閾値以上である場合、音源方向の判定は不可であると判定する。音源方向の判定は不可であると判定すると、ＣＰＵ５１はステップ１０８に進む。 When the determination in step 106B is denied, or when the determination in step 106 is denied, that is, when the normalized phase difference is equal to or greater than the negative fourth threshold value, or when the high frequency sound pressure difference is negative second. If it is equal to or higher than the threshold value, it is determined that the sound source direction cannot be determined. If it is determined that the sound source direction cannot be determined, the CPU 51 proceeds to step 108.
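The FIG. 17E flow routes both the step 104 negative branch and the step 106B affirmative branch into step 106. The following sketch uses hypothetical names and threshold values (apart from the 3.0 rad example for the third threshold):

```python
def judge_17e(hp_diff, a_phase, th1=1.0, th2=-1.0, th3=3.0, th4=-3.0):
    """Return 'above', 'front', or 'unknown' following the FIG. 17E description."""
    if a_phase > th3:              # step 104B
        if hp_diff > th1:          # step 104
            return "above"         # step 105
        # step 104 negative: fall through to step 106
    elif not (a_phase < th4):      # step 106B negative
        return "unknown"           # step 108
    if hp_diff < th2:              # step 106
        return "front"             # step 107
    return "unknown"               # step 108
```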

図１１と図１７Ｆとの差異は、図１１のステップ１０３、１０４及び１０６が、図１７Ｆでは、ステップ１０３、１０３Ｂ、１０４Ｂ、１０６Ｂ、及び１０６と置き替えられている点である。 The difference between FIGS. 11 and 17F is that steps 103, 104 and 106 in FIG. 11 are replaced with steps 103, 103B, 104B, 106B and 106 in FIG. 17F.

即ち、図１７Ｆでは、ＣＰＵ５１は、ステップ１０４Ｂで、正規化位相差が正の第３閾値より大きいか否か判定する。ステップ１０４Ｂの判定が肯定された場合、即ち、正規化位相差が正の第３閾値より大きい場合、音源が上方に存在すると判定し、ステップ１０５に進む。 That is, in FIG. 17F, the CPU 51 determines in step 104B whether or not the normalized phase difference is larger than the positive third threshold value. If the determination in step 104B is affirmative, that is, if the normalized phase difference is larger than the positive third threshold value, it is determined that the sound source exists above, and the process proceeds to step 105.

ステップ１０４Ｂの判定が否定された場合、即ち、正規化位相差が正の第３閾値以下である場合、音源が上方に存在しないと判定し、ＣＰＵ５１は、ステップ１０６Ｂで、正規化位相差が負の第４閾値より小さいか否か判定する。ステップ１０６Ｂの判定が肯定された場合、即ち、正規化位相差が負の第４閾値より小さい場合、ＣＰＵ５１は、ステップ１０６で、高域音圧差が負の第２閾値より小さいか否か判定する。ステップ１０６の判定が肯定された場合、即ち、高域音圧差が負の第２閾値より小さい場合、音源が前方に存在すると判定し、ステップ１０７に進む。 When the determination in step 104B is denied, that is, when the normalized phase difference is equal to or less than the positive third threshold value, it is determined that the sound source does not exist above, and the CPU 51 determines in step 106B whether or not the normalized phase difference is smaller than the negative fourth threshold value. If the determination in step 106B is affirmed, that is, if the normalized phase difference is smaller than the negative fourth threshold value, the CPU 51 determines in step 106 whether the high frequency sound pressure difference is smaller than the negative second threshold value. If the determination in step 106 is affirmed, that is, if the high frequency sound pressure difference is smaller than the negative second threshold value, it is determined that the sound source exists in front, and the process proceeds to step 107.

ステップ１０６Ｂの判定が否定された場合、または、ステップ１０６の判定が否定された場合、即ち、正規化位相差が負の第４閾値以上である場合、または、高域音圧差が負の第２閾値以上である場合、音源方向の判定は不可であると判定する。音源方向の判定は不可であると判定した場合、ＣＰＵ５１は、ステップ１０８に進む。 When the determination in step 106B is denied, or when the determination in step 106 is denied, that is, when the normalized phase difference is equal to or greater than the negative fourth threshold value, or when the high frequency sound pressure difference is negative second. If it is equal to or higher than the threshold value, it is determined that the sound source direction cannot be determined. If it is determined that the sound source direction cannot be determined, the CPU 51 proceeds to step 108.
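For FIG. 17F, the phase check alone decides "above", while "front" requires both the phase check and the sound pressure check. A sketch with hypothetical names and threshold values:

```python
def judge_17f(hp_diff, a_phase, th2=-1.0, th3=3.0, th4=-3.0):
    """Return 'above', 'front', or 'unknown' following the FIG. 17F description."""
    if a_phase > th3:                      # step 104B
        return "above"                     # step 105
    if a_phase < th4 and hp_diff < th2:    # steps 106B and 106
        return "front"                     # step 107
    return "unknown"                       # step 108
```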

図１１と図１７Ｇとの差異は、図１１のステップ１０３、１０４及び１０６が、図１７Ｇでは、ステップ１０３Ｂ、１０４Ｂ、及び１０６Ｂと置き替えられている点である。 The difference between FIGS. 11 and 17G is that steps 103, 104 and 106 in FIG. 11 are replaced with steps 103B, 104B and 106B in FIG. 17G.

即ち、図１７Ｇでは、ＣＰＵ５１は、ステップ１０３Ｂで、正規化位相差を算出する。ＣＰＵ５１は、ステップ１０４Ｂで、正規化位相差が正の第３閾値より大きいか否か判定する。ステップ１０４Ｂの判定が肯定された場合、即ち、正規化位相差が正の第３閾値より大きい場合、音源が上方に存在すると判定し、ステップ１０５に進む。 That is, in FIG. 17G, the CPU 51 calculates the normalized phase difference in step 103B. In step 104B, the CPU 51 determines whether or not the normalized phase difference is larger than the positive third threshold value. If the determination in step 104B is affirmed, that is, if the normalized phase difference is larger than the positive third threshold value, it is determined that the sound source exists above, and the process proceeds to step 105.

ステップ１０４Ｂの判定が否定された場合、即ち、正規化位相差が正の第３閾値以下である場合、音源が上方に存在しないと判定し、ＣＰＵ５１は、ステップ１０６Ｂで、正規化位相差が負の第４閾値より小さいか否か判定する。ステップ１０６Ｂの判定が肯定された場合、即ち、正規化位相差が負の第４閾値より小さい場合、音源が前方に存在すると判定し、ステップ１０７に進む。 When the determination in step 104B is denied, that is, when the normalized phase difference is equal to or less than the positive third threshold value, it is determined that the sound source does not exist above, and the CPU 51 determines in step 106B that the normalized phase difference is negative. It is determined whether or not it is smaller than the fourth threshold value of. If the determination in step 106B is affirmative, that is, if the normalized phase difference is smaller than the negative fourth threshold value, it is determined that the sound source exists in front, and the process proceeds to step 107.

ステップ１０６Ｂの判定が否定された場合、即ち、正規化位相差が負の第４閾値以上である場合、音源方向の判定は不可であると判定して、ステップ１０８に進む。なお、図１７Ａ～図１７Ｇにおけるフローチャートの処理の順序は一例であり、本実施形態は、当該処理の順序に限定されない。 If the determination in step 106B is denied, that is, if the normalized phase difference is equal to or greater than the negative fourth threshold value, it is determined that the determination of the sound source direction is impossible, and the process proceeds to step 108. The order of processing of the flowcharts in FIGS. 17A to 17G is an example, and the present embodiment is not limited to the order of the processing.
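The FIG. 17G flow uses only the normalized phase difference. As a minimal sketch, with a hypothetical function name and a hypothetical value for the fourth threshold:

```python
def judge_17g(a_phase, th3=3.0, th4=-3.0):
    """Return 'above', 'front', or 'unknown' following the FIG. 17G description."""
    if a_phase > th3:      # step 104B
        return "above"     # step 105
    if a_phase < th4:      # step 106B
        return "front"     # step 107
    return "unknown"       # step 108
```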

なお、第４実施形態では、第１音道１１ＤＲが屈曲部１１ＤＫを有することで、第１マイク１１Ｄと第２マイク１２Ｄとの間の距離を、音道が屈曲部を有していない場合よりも長くすることができる。これにより、所定の周波数の音の波長に対する音の移動距離の差を長くすることができ、位相差の変動の検出が容易になる。 In the fourth embodiment, because the first sound path 11DR has the bent portion 11DK, the distance between the first microphone 11D and the second microphone 12D can be made longer than when the sound path has no bent portion. This lengthens the difference in sound travel distance relative to the wavelength of sound at a predetermined frequency, which makes fluctuations in the phase difference easier to detect.

なお、第１音道１１ＤＲが屈曲部１１ＤＫを有する例を図１６Ａ及び図１６Ｂに示したが、本実施形態はこれに限定されない。本実施形態は、第２実施形態のように、２つの音道の各々が何れも屈曲部を有する場合、第３実施形態のように、２つの音道の各々が何れも屈曲部を含まない場合でも適用可能である。 Although examples of the first sound path 11DR having the bent portion 11DK are shown in FIGS. 16A and 16B, the present embodiment is not limited to this. In this embodiment, when each of the two sound paths has a bending portion as in the second embodiment, neither of the two sound paths includes a bending portion as in the third embodiment. It is applicable even in the case.

本実施形態の音源方向判定装置は、マイク設置部と、第１マイクロフォンと、第２マイクロフォンと、を含む。マイク設置部は、第１平坦面に開口した第１開口部を一端部に備え、第１開口部から音が伝搬する第１音道、及び、第１平坦面と交差する第２平坦面に開口した第２開口部を一端部に備え、第２開口部から音が伝搬する第２音道が内部に設けられている。第１マイクロフォンは第１音道の他端部に設置された無指向性のマイクロフォンであり、第２マイクロフォンは第２音道の他端部に設置された無指向性のマイクロフォンである。 The sound source direction determination device of the present embodiment includes a microphone installation unit, a first microphone, and a second microphone. The microphone installation portion is provided with a first opening opened on the first flat surface at one end, and is provided on a first sound path through which sound propagates from the first opening and on a second flat surface intersecting the first flat surface. A second opening is provided at one end, and a second sound path through which sound propagates from the second opening is provided inside. The first microphone is an omnidirectional microphone installed at the other end of the first sound path, and the second microphone is an omnidirectional microphone installed at the other end of the second sound path.

本実施形態の音源方向判定装置の判定部は、第１音圧と第２音圧との音圧の相違、及び、第１位相と第２位相との位相の相違の少なくとも一方に基づいて、音源が存在する方向を判定する。第１音圧は、第１マイクロフォンで取得された音の第１周波数成分の音圧であり、第２音圧は、第２マイクロフォンで取得された音の第１周波数成分の音圧である。第１位相は、第１マイクロフォンで取得された音の第２周波数成分の位相であり、第２位相は、第２マイクロフォンで取得された音の第２周波数成分の位相である。 The determination unit of the sound source direction determination device of the present embodiment is based on at least one of the difference in sound pressure between the first sound pressure and the second sound pressure and the difference in phase between the first phase and the second phase. Determine the direction in which the sound source exists. The first sound pressure is the sound pressure of the first frequency component of the sound acquired by the first microphone, and the second sound pressure is the sound pressure of the first frequency component of the sound acquired by the second microphone. The first phase is the phase of the second frequency component of the sound acquired by the first microphone, and the second phase is the phase of the second frequency component of the sound acquired by the second microphone.

本実施形態では、これにより、音圧の相違だけで音源方向の判定が困難な場合であっても、音源方向の判定を適切に判定することが可能となる。 In the present embodiment, this makes it possible to determine the sound source direction appropriately even when the direction is difficult to determine from the sound pressure difference alone.

（第４実施形態の説明） (Explanation of Fourth Embodiment)
図２２Ａに、音源方向判定装置１０Ｄの背面に空隙が存在する場合、即ち、例えば、音源方向判定装置１０Ｄを装着したユーザの衣服などの物体ＢＯと音源方向判定装置１０Ｄの背面との間に空隙が存在する場合を例示する。音源が前方に存在する場合、第１マイク１１Ｄが取得する音の音圧は第２マイク１２Ｄが取得する音の音圧より小さい。第１マイク１１Ｄの音圧は回折により減衰しており、また、第１開口１１ＤＯで回折しない音は、空隙の入り口で回折し空隙を通るため、第１マイク１１Ｄには到達しないからである。 FIG. 22A illustrates a case where a gap exists behind the sound source direction determination device 10D, that is, for example, a case where a gap exists between an object BO, such as the clothing of the user wearing the sound source direction determination device 10D, and the back surface of the sound source direction determination device 10D. When the sound source is in front, the sound pressure of the sound acquired by the first microphone 11D is lower than the sound pressure of the sound acquired by the second microphone 12D. This is because the sound reaching the first microphone 11D is attenuated by diffraction, and the sound that is not diffracted at the first opening 11DO is diffracted at the entrance of the gap and passes through the gap, so that it does not reach the first microphone 11D.

図２２Ｂに、音源方向判定装置１０Ｄの背面に空隙が存在しない場合、即ち、例えば、音源方向判定装置１０Ｄを装着したユーザの衣服などの物体ＢＯと音源方向判定装置１０Ｄの背面との間に空隙が存在しない場合を例示する。音源が前方に存在する場合、第１マイク１１Ｄが取得する音の音圧は第２マイク１２Ｄが取得する音の音圧より大きい。音源が前方に存在する場合、第１マイク１１Ｄが取得する音の音圧は第２マイク１２Ｄが取得する音の音圧より小さい場合であっても、音源方向を判定するのが困難な程度に、第１マイク１１Ｄが取得する音の音圧と第２マイク１２Ｄが取得する音の音圧とが近い。図２２Ａでは空隙を通る音が、図２２Ｂでは、第１開口１１ＤＯで回折し、第１マイク１１Ｄに到達するためである。 FIG. 22B illustrates a case where no gap exists behind the sound source direction determination device 10D, that is, for example, a case where there is no gap between an object BO, such as the clothing of the user wearing the sound source direction determination device 10D, and the back surface of the sound source direction determination device 10D. When the sound source is in front, the sound pressure of the sound acquired by the first microphone 11D is higher than the sound pressure of the sound acquired by the second microphone 12D. Even when the sound pressure acquired by the first microphone 11D is lower than that acquired by the second microphone 12D, the two sound pressures are so close that it is difficult to determine the sound source direction. This is because the sound that passes through the gap in FIG. 22A is instead diffracted at the first opening 11DO in FIG. 22B and reaches the first microphone 11D.

図２３Ａに、第１マイク１１Ｄと第２マイク１２Ｄとの高域音圧差を例示する。左から１番目のブロックＵＧＮは、音源が上方に存在し、空隙が存在しない場合の第１音圧差を示す。左から２番目のブロックＵＧは、音源が上方に存在し、空隙が存在する場合の第２音圧差を示す。空隙を通る音が存在するため、第２音圧差は第１音圧差よりも小さい。 FIG. 23A illustrates the high frequency sound pressure difference between the first microphone 11D and the second microphone 12D. The first block UGN from the left shows the first sound pressure difference when the sound source exists above and the void does not exist. The second block UG from the left shows the second sound pressure difference when the sound source is above and the void is present. The second sound pressure difference is smaller than the first sound pressure difference because there is sound passing through the void.

左から４番目のブロックＦＧは、音源が前方に存在し、空隙が存在する場合の第４音圧差を示す。第２マイク１２Ｄが取得する音の音圧は第１マイク１１Ｄが取得する音の音圧よりも大きくなるため、第４音圧差は負の値となる。 The fourth block FG from the left shows the fourth sound pressure difference when the sound source is in front and the gap is present. Since the sound pressure of the sound acquired by the second microphone 12D is higher than the sound pressure of the sound acquired by the first microphone 11D, the fourth sound pressure difference becomes a negative value.

一方、左から３番目のブロックＦＧＮは、音源が前方に存在し、空隙が存在しない場合の第３音圧差を示す。空隙が存在しないため、空隙が存在する場合には空隙を通る音も第１マイク１１Ｄに到達するため、第１マイク１１Ｄが取得する音の音圧が第２マイク１２Ｄが取得する音の音圧よりも大きくなり、正の値となる。第１マイク１１Ｄが取得する音の音圧が第２マイク１２Ｄが取得する音の音圧よりも小さい場合であっても、第１マイクが取得する音の音圧と第２マイクが取得する音の音圧とが近く、音源方向を判定するのが困難な程度に、音圧差は小さくなる。第１音圧差は、例えば、４．８［ｄＢ］であり、第２音圧差は、例えば、１．８［ｄＢ］であり、第３音圧差は、例えば、１．２［ｄＢ］であり、第４音圧差は、例えば、－０．９［ｄＢ］である。 On the other hand, the third block FGN from the left shows the third sound pressure difference, obtained when the sound source is in front and no gap exists. Because there is no gap, the sound that would pass through the gap if one existed also reaches the first microphone 11D, so the sound pressure of the sound acquired by the first microphone 11D becomes higher than that of the sound acquired by the second microphone 12D, and the difference takes a positive value. Even when the sound pressure acquired by the first microphone 11D is lower than that acquired by the second microphone 12D, the two sound pressures are close, and the sound pressure difference becomes so small that it is difficult to determine the sound source direction. The first sound pressure difference is, for example, 4.8 [dB], the second sound pressure difference is, for example, 1.8 [dB], the third sound pressure difference is, for example, 1.2 [dB], and the fourth sound pressure difference is, for example, -0.9 [dB].

したがって、音源方向判定装置１０Ｄの背面に空隙が存在しないと、高域音圧差で音源方向を判定することが困難な場合がある。即ち、音源方向を判定する適切な閾値の設定が困難な場合がある。例えば、音源が上方に存在するか否か判定する正の第１閾値の値を大きく設定すると、ブロックＵＧで表される音源が上方に存在する場合の高域音圧差を音源が前方に存在する高域音圧差であると判断する虞が生じる。一方、正の第１閾値の値を小さく設定すると、ブロックＦＧＮで表される音源が前方に存在する場合の高域音圧差を音源が上方に存在する高域音圧差であると判定する虞が生じる。 Therefore, if there is no gap behind the sound source direction determination device 10D, it may be difficult to determine the sound source direction from the high-frequency sound pressure difference. That is, it may be difficult to set an appropriate threshold for determining the sound source direction. For example, if the positive first threshold used to determine whether the sound source is above is set to a large value, the high-frequency sound pressure difference represented by block UG, obtained when the sound source is above, may be judged to be a difference for a sound source in front. Conversely, if the positive first threshold is set to a small value, the high-frequency sound pressure difference represented by block FGN, obtained when the sound source is in front, may be judged to be a difference for a sound source above.

図２３Ｂに、第１マイク１１Ｄが取得する音の位相と第２マイク１２Ｄが取得する音の位相との正規化位相差を例示する。左から１番目のブロックＵＧは、音源が上方に存在し、空隙が存在しない場合の第１位相差を示す。左から２番目のブロックＵＧＮは、音源が上方に存在し、空隙が存在する場合の第２位相差を示す。 FIG. 23B illustrates a normalized phase difference between the phase of the sound acquired by the first microphone 11D and the phase of the sound acquired by the second microphone 12D. The first block UG from the left shows the first phase difference when the sound source exists above and the void does not exist. The second block UGN from the left shows the second phase difference when the sound source is above and the void is present.

左から３番目のブロックＦＧは、音源が前方に存在し、空隙が存在する場合の第３位相差を示す。左から４番目のブロックＦＧＮは、音源が前方に存在し、空隙が存在しない場合の位相差を示す。即ち、音源方向判定装置１０の背面の空隙の有無に拘わらず、音源が上方に存在する場合、位相差は正の値を示す。また、音源が前方に存在する場合、位相差は負の値を示す。第１位相差は、例えば、６．１［ｒａｄ］であり、第２位相差は、例えば、６．０［ｒａｄ］であり、第３位相差は、例えば、－２．５［ｒａｄ］であり、第４位相差は、例えば、－１．４［ｒａｄ］である。したがって、音源方向判定装置１０の背面に空隙が存在するか否かに拘わらず、音源方向を判定する適切な閾値の設定が比較的容易となる。 The third block FG from the left shows the third phase difference, obtained when the sound source is in front and a gap exists. The fourth block FGN from the left shows the fourth phase difference, obtained when the sound source is in front and no gap exists. That is, regardless of whether a gap exists behind the sound source direction determination device 10, the phase difference is positive when the sound source is above and negative when the sound source is in front. The first phase difference is, for example, 6.1 [rad], the second phase difference is, for example, 6.0 [rad], the third phase difference is, for example, -2.5 [rad], and the fourth phase difference is, for example, -1.4 [rad]. Therefore, regardless of whether a gap exists behind the sound source direction determination device 10, it is relatively easy to set an appropriate threshold for determining the sound source direction.
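The contrast between FIG. 23A and FIG. 23B can be made concrete with the example values quoted in the text. The following is a minimal sketch, not part of the disclosed device: the function names `separation_margin` and `relative_margin` are hypothetical, and the only data used are the example sound pressure and phase differences given above, grouped by sound source direction.

```python
def separation_margin(up_values, front_values):
    """Width of the interval in which a single threshold separates the two
    groups: smallest 'up' value minus largest 'front' value."""
    return min(up_values) - max(front_values)


def relative_margin(up_values, front_values):
    """Separation margin as a fraction of the total spread of all values."""
    all_values = up_values + front_values
    spread = max(all_values) - min(all_values)
    return separation_margin(up_values, front_values) / spread


# Example values quoted in the text (with and without a gap behind the device).
pressure_up_db = [4.8, 1.8]      # first and second sound pressure differences
pressure_front_db = [1.2, -0.9]  # third and fourth sound pressure differences
phase_up_rad = [6.1, 6.0]        # first and second phase differences
phase_front_rad = [-2.5, -1.4]   # third and fourth phase differences

pressure_margin = separation_margin(pressure_up_db, pressure_front_db)
phase_margin = separation_margin(phase_up_rad, phase_front_rad)
pressure_rel = relative_margin(pressure_up_db, pressure_front_db)
phase_rel = relative_margin(phase_up_rad, phase_front_rad)
```

With these example values, a sound pressure threshold has to fall within an interval of about 0.6 dB (between 1.2 and 1.8 dB), whereas a phase threshold can fall anywhere within an interval of about 7.4 rad, which is consistent with the statement that the phase threshold is comparatively easy to set.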

音源が音源方向判定装置１０Ｄの上方に存在する場合、第２マイク１２Ｄに到達するより前に第１マイク１１Ｄに音が到達する。また、音源が音源方向判定装置１０Ｄの前方に存在する場合、第１マイク１１Ｄに到達するより前に第２マイク１２Ｄに音が到達する。したがって、音源方向の判定に位相差を使用することができる。また、位相差は絶対音圧の影響をあまり受けないため、音源方向判定装置１０Ｄの背面の空隙の有無によって絶対音圧が変動しても、適切な位相差を取得することが可能である。 When the sound source is above the sound source direction determination device 10D, the sound reaches the first microphone 11D before it reaches the second microphone 12D. When the sound source is in front of the sound source direction determination device 10D, the sound reaches the second microphone 12D before it reaches the first microphone 11D. Therefore, the phase difference can be used to determine the sound source direction. In addition, because the phase difference is not strongly affected by the absolute sound pressure, an appropriate phase difference can be obtained even if the absolute sound pressure fluctuates depending on whether a gap exists behind the sound source direction determination device 10D.

［第５実施形態］ [Fifth Embodiment]
次に、第５実施形態の一例を説明する。第１～第４実施形態と同様の構成及び作用については、説明を省略する。第５実施形態では、音源方向判定の閾値を、ユーザ及び対話相手の発話した音に対応する音信号に基づいて調整する。 Next, an example of the fifth embodiment will be described. Descriptions of configurations and operations that are the same as in the first to fourth embodiments are omitted. In the fifth embodiment, the threshold for determining the sound source direction is adjusted based on the sound signals corresponding to the sounds uttered by the user and the dialogue partner.

図２４は、図１の音源方向判定装置１０の判定部１３に代えて、判定部１３”で行われる第５実施形態の音源方向判定処理の概要を例示する。時間周波数変換部８５Ａ１は、第１マイク１１で取得された音に対応する音信号を時間周波数変換し、時間周波数変換部８５Ａ２は、第２マイク１２で取得された音に対応する音信号を時間周波数変換する。 FIG. 24 illustrates an outline of the sound source direction determination process of the fifth embodiment, which is performed by a determination unit 13” in place of the determination unit 13 of the sound source direction determination device 10 of FIG. 1. The time-frequency conversion unit 85A1 performs time-frequency conversion on the sound signal corresponding to the sound acquired by the first microphone 11, and the time-frequency conversion unit 85A2 performs time-frequency conversion on the sound signal corresponding to the sound acquired by the second microphone 12.

発話区間検出部８５Ｂ１は、第１マイク１１で取得された音に対応する音信号の発話区間を検出し、発話区間検出部８５Ｂ２は、第２マイク１２で取得された音に対応する音信号の発話区間を検出する。発話区間の検出には、既存の手法を適用することができる。 The utterance section detection unit 85B1 detects the utterance section of the sound signal corresponding to the sound acquired by the first microphone 11, and the utterance section detection unit 85B2 detects the utterance section of the sound signal corresponding to the sound acquired by the second microphone 12. An existing method can be applied to detect the utterance section.

位相算出部８５Ｃ１は、検出された発話区間の音信号を使用して、第１マイク１１で取得された音に対応する音信号の位相を算出する。位相算出部８５Ｃ２は、検出された発話区間の音信号を使用して、第２マイク１２で取得された音に対応する音信号の位相を算出する。平均位相差算出部８５Ｄは、算出された位相を使用して位相差を算出し、発話区間の位相差の平均値である位相差平均値を算出する。 The phase calculation unit 85C1 calculates the phase of the sound signal corresponding to the sound acquired by the first microphone 11 by using the detected sound signal in the utterance section. The phase calculation unit 85C2 calculates the phase of the sound signal corresponding to the sound acquired by the second microphone 12 by using the detected sound signal in the utterance section. The average phase difference calculation unit 85D calculates the phase difference using the calculated phase, and calculates the phase difference average value which is the average value of the phase difference in the utterance section.
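The role of the phase calculation units 85C1 and 85C2 and the average phase difference calculation unit 85D can be sketched as follows. This is an illustration under stated assumptions rather than the disclosed implementation: each frame is assumed to supply one complex spectral value per microphone for a single frequency bin, the per-frame difference is wrapped to the range from -pi to pi, and the normalization used for the normalized phase difference elsewhere in the description is not reproduced. The function names are hypothetical.

```python
import cmath


def frame_phase_difference(bin_mic1: complex, bin_mic2: complex) -> float:
    """Phase of microphone 1 minus phase of microphone 2 for one frequency
    bin, wrapped to the range [-pi, pi]."""
    return cmath.phase(bin_mic1 * bin_mic2.conjugate())


def average_phase_difference(frames):
    """Mean per-frame phase difference over one utterance section.

    `frames` is a list of (bin_mic1, bin_mic2) pairs of complex values,
    one pair per frame of the detected utterance section.
    """
    if not frames:
        raise ValueError("utterance section contains no frames")
    diffs = [frame_phase_difference(a, b) for a, b in frames]
    return sum(diffs) / len(diffs)


# Two synthetic frames whose phase differences are 0.3 rad and 0.5 rad.
example_frames = [
    (cmath.rect(1.0, 0.5), cmath.rect(1.0, 0.2)),
    (cmath.rect(2.0, 0.9), cmath.rect(1.0, 0.4)),
]
example_average = average_phase_difference(example_frames)
```

Multiplying one spectral value by the conjugate of the other gives the phase difference directly and avoids a separate wrap-around step after subtracting two angles.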

過去発話位相差記憶部８５Ｅは、算出した位相差平均値を、将来の過去発話位相差として使用するために記憶する。位相差比較部８５Ｆは、位相差平均値と、以前に記憶した過去発話位相差と、を比較する。 The past utterance phase difference storage unit 85E stores the calculated phase difference average value for use as a future past utterance phase difference. The phase difference comparison unit 85F compares the phase difference average value with the previously stored past utterance phase difference.

位相差平均値と、過去発話位相差と、に第３所定値の一例である所定値を超える差がある場合、閾値調整部８５Ｇは音源方向を判定する閾値を調整する。差は、位相差平均値から過去発話位相差を減算した値の絶対値である。 When there is a difference between the phase difference average value and the past utterance phase difference exceeding a predetermined value which is an example of the third predetermined value, the threshold value adjusting unit 85G adjusts the threshold value for determining the sound source direction. The difference is the absolute value of the value obtained by subtracting the past utterance phase difference from the phase difference average value.

例えば、音源方向判定装置１０の筐体１８の前面を、垂直方向に対して複数の異なる角度で傾斜させ、ユーザの音声の位相差平均値と、対話相手の音声の位相差平均値と、の差を各々の角度で取得する。取得した複数個の差の絶対値の内、最小値を第３所定値として使用することができる。第３所定値は、例えば、４．１［ｒａｄ］であってよい。第３所定値を超える過去発話位相差が存在しない場合、閾値を調整しない。 For example, the front surface of the housing 18 of the sound source direction determination device 10 is tilted at a plurality of different angles with respect to the vertical direction, and the difference between the phase difference average value of the user's voice and the phase difference average value of the dialogue partner's voice is acquired at each angle. The minimum of the absolute values of the acquired differences can be used as the third predetermined value. The third predetermined value may be, for example, 4.1 [rad]. If there is no past utterance phase difference that differs from the phase difference average value by more than the third predetermined value, the threshold is not adjusted.

所定値を超える差がある過去発話位相差が複数存在する場合、直近の過去発話位相差を使用して、閾値調整部８５Ｇは、音源方向を判定する閾値を調整する。詳細には、例えば、現在の発話区間の位相差平均値と、過去発話位相差と、の平均値（即ち、中間の値）を音源方向判定の閾値に設定する。音源方向判定部８５Ｈは、調整した閾値を使用して音源方向を判定し、判定結果を出力する。 When there are a plurality of past utterance phase differences having a difference exceeding a predetermined value, the threshold value adjusting unit 85G adjusts the threshold value for determining the sound source direction by using the latest past utterance phase difference. Specifically, for example, the average value (that is, an intermediate value) of the phase difference average value of the current utterance section and the past utterance phase difference is set as the threshold value for determining the sound source direction. The sound source direction determination unit 85H determines the sound source direction using the adjusted threshold value, and outputs the determination result.
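The comparison and adjustment performed by the phase difference comparison unit 85F and the threshold adjusting unit 85G can be sketched as below. The function name and argument layout are hypothetical, and the past values are assumed to be stored oldest first; the 4.1 rad default is the example value of the third predetermined value given in the text.

```python
THIRD_PREDETERMINED_VALUE = 4.1  # [rad], example value quoted in the text


def adjust_threshold(current_average, past_averages, threshold,
                     min_gap=THIRD_PREDETERMINED_VALUE):
    """Return the threshold to use for the current utterance section.

    `past_averages` holds past utterance phase differences, oldest first.
    The most recent past value that differs from `current_average` by more
    than `min_gap` is used, and the new threshold is the midpoint of the
    two. If no past value qualifies, the threshold is returned unchanged.
    """
    for past in reversed(past_averages):
        if abs(current_average - past) > min_gap:
            return (current_average + past) / 2.0
    return threshold


# The most recent past value (5.8) is too close to the current average, so
# the older value (-2.5) is used and the threshold becomes the midpoint.
adjusted = adjust_threshold(6.0, [-2.5, 5.8], threshold=0.0)
unchanged = adjust_threshold(6.0, [5.0], threshold=0.3)
```

Placing the threshold at the midpoint keeps it equally far from the two averages, which are assumed to come from the user and the dialogue partner respectively because they differ by more than the predetermined value.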

図２５を使用して、音源方向を判定する閾値の調整について説明する。図２５の縦軸は、位相差［ｒａｄ］を表し、横軸は時間、即ち、フレーム番号を表す。破線８６Ｐは、フレーム毎の、第１マイク１１で取得した音に対応する音信号と、第２マイク１２で取得した音に対応する音信号と、の位相差を表す。 FIG. 25 will be used to describe the adjustment of the threshold value for determining the sound source direction. The vertical axis of FIG. 25 represents the phase difference [rad], and the horizontal axis represents time, that is, the frame number. The broken line 86P represents the phase difference between the sound signal corresponding to the sound acquired by the first microphone 11 and the sound signal corresponding to the sound acquired by the second microphone 12 for each frame.

上記したように、以前の発話区間である発話区間８６Ｈ１の位相差平均値が、例えば、二次記憶部５３のデータ格納領域５３Ｂに、過去発話位相差として記憶されている。現在の発話区間である発話区間８６Ｈ２の位相差平均値と発話区間８６Ｈ１に対応する過去発話位相差とには所定値を超える差８６Ｄがある。 As described above, the phase difference average value of the utterance section 86H1 which is the previous utterance section is stored as the past utterance phase difference in the data storage area 53B of the secondary storage unit 53, for example. There is a difference 86D exceeding a predetermined value between the phase difference average value of the utterance section 86H2, which is the current utterance section, and the past utterance phase difference corresponding to the utterance section 86H1.

閾値調整部８５Ｇは、例えば、発話区間８６Ｈ１に対応する過去発話位相差と、発話区間８６Ｈ２の位相差平均値との平均値を閾値８６Ｔとして設定する。設定された閾値は、発話区間８６Ｈ２の音信号の音源方向を判定するために使用される。 The threshold adjusting unit 85G sets, for example, the average of the past utterance phase difference corresponding to the utterance section 86H1 and the phase difference average value of the utterance section 86H2 as the threshold 86T. The threshold thus set is used to determine the sound source direction of the sound signal in the utterance section 86H2.

音源方向判定装置１０は、図２６Ａに例示するように、筐体１８の前面が垂直方向に略平行となるようにユーザに装着されることが想定されている。図２６Ａでは、所定の位相差閾値８１Ｔを境界として、領域８１Ｕの音声の音源方向は上方、即ち、ユーザの発話であると判定され、領域８１Ｆの音声の音源方向は前方、即ち、対話相手の発話であると判定される。 As illustrated in FIG. 26A, the sound source direction determination device 10 is assumed to be worn by the user so that the front surface of the housing 18 is substantially parallel to the vertical direction. In FIG. 26A, with the predetermined phase difference threshold 81T as the boundary, the sound source direction of voice in the region 81U is determined to be upward, that is, the voice is determined to be the user's utterance, and the sound source direction of voice in the region 81F is determined to be forward, that is, the voice is determined to be the dialogue partner's utterance.

しかしながら、音源方向判定装置１０の装着者であるユーザの体型または、装着方法などにより、音源方向判定装置１０が、図２６Ｂに例示するように傾斜する場合がある。例えば、ユーザが女性である場合、胸の傾きの影響により、図２６Ｂに例示するように、音源方向判定装置１０の筐体１８の前面が斜め上方に向くように、傾斜する。この場合、位相差閾値８２Ｔで例示するように、判定の境界も共に傾斜、即ち、回転する。 However, the sound source direction determination device 10 may be tilted as illustrated in FIG. 26B depending on the body shape of the user who is the wearer of the sound source direction determination device 10, the wearing method, or the like. For example, when the user is a woman, the front surface of the housing 18 of the sound source direction determination device 10 is tilted diagonally upward as illustrated in FIG. 26B due to the influence of the tilt of the chest. In this case, as illustrated by the phase difference threshold value 82T, the boundary of the determination is also inclined, that is, rotated.

図２６Ｂでは、位相差閾値８２Ｔを境界として、領域８２Ｕの音声の音源方向は上方、即ち、装着者であるユーザの発話であると判定され、領域８２Ｆの音声の音源方向は前方、即ち、対話相手の発話であると判定される。したがって、矢印８２Ｖで例示されるユーザの発話が対話相手の発話であると判断される虞がある。 In FIG. 26B, with the phase difference threshold value 82T as a boundary, the sound source direction of the voice in the region 82U is determined to be upward, that is, the utterance of the user who is the wearer, and the sound source direction of the voice in the region 82F is forward, that is, dialogue. It is determined that the speech is from the other party. Therefore, there is a possibility that the utterance of the user exemplified by the arrow 82V is determined to be the utterance of the dialogue partner.

図２７Ａに、音源方向判定装置１０の筐体１８の前面が垂直方向に略平行である場合を例示し、図２７Ｂに、筐体１８の前面が斜め上方に向くように傾斜している場合を例示する。図２７Ａに例示する位相差８３Ｄと、図２７Ｂに例示する位相差８４Ｄと、は略等しい。位相差８３Ｄは、対話相手の音声の上面への到達を示す矢印８３Ｆ１と前面への到達を示す矢印８３Ｆ２との位相差を表す。位相差８４Ｄは、ユーザの音声の上面への到達を示す矢印８４Ｕ１と前面への到達を示す矢印８４Ｕ２との位相差を表す。 FIG. 27A illustrates a case where the front surface of the housing 18 of the sound source direction determination device 10 is substantially parallel in the vertical direction, and FIG. 27B shows a case where the front surface of the housing 18 is inclined so as to face diagonally upward. Illustrate. The phase difference 83D exemplified in FIG. 27A and the phase difference 84D exemplified in FIG. 27B are substantially equal to each other. The phase difference 83D represents the phase difference between the arrow 83F1 indicating the arrival at the upper surface of the voice of the dialogue partner and the arrow 83F2 indicating the arrival at the front surface. The phase difference 84D represents the phase difference between the arrow 84U1 indicating the arrival at the upper surface of the user's voice and the arrow 84U2 indicating the arrival at the front surface.

図２８に、図２７Ａの位相差８３Ｄに対応する位相差９１Ａ及び図２７Ｂの位相差８４Ｄに対応する位相差９１Ｂを例示する。位相差閾値９１Ｔでは、位相差９１Ａと位相差９１Ｂとを区別することは困難であるし、閾値を調整したとしても、位相差９１Ａと位相差９１Ｂとを区別することは困難である。 FIG. 28 illustrates the phase difference 91A corresponding to the phase difference 83D of FIG. 27A and the phase difference 91B corresponding to the phase difference 84D of FIG. 27B. With the phase difference threshold value 91T, it is difficult to distinguish between the phase difference 91A and the phase difference 91B, and even if the threshold value is adjusted, it is difficult to distinguish between the phase difference 91A and the phase difference 91B.

一方、装着者であるユーザの音声と対話相手の音声との位相差には、音源方向判定装置１０が傾斜したとしても、同じ傾斜であれば、所定値を超える相違が存在する。したがって、ユーザの発話と対話相手の発話とに基づいて、位相差閾値を調整することで、音源方向判定装置１０が傾斜していたとしても、音源方向を適切に判定することができる。 On the other hand, even when the sound source direction determination device 10 is tilted, the phase difference for the voice of the user wearing the device and the phase difference for the voice of the dialogue partner differ by more than a predetermined value, as long as both are measured at the same tilt. Therefore, by adjusting the phase difference threshold based on the user's utterance and the dialogue partner's utterance, the sound source direction can be determined appropriately even when the sound source direction determination device 10 is tilted.

図２９Ａに、音源方向判定装置１０の筐体１８の前面が垂直方向に略平行である場合のユーザの音声の位相差９２Ａと、対話相手の音声の位相差９２Ｂと、を例示する。位相差閾値９２Ｔを、位相差９２Ａと位相差９２Ｂとの平均値に調整することで、位相差９２Ａと位相差９２Ｂと、を区別することができる。即ち、音源方向を適切に判定することができる。 FIG. 29A illustrates a phase difference 92A of the user's voice and a phase difference 92B of the voice of the dialogue partner when the front surface of the housing 18 of the sound source direction determination device 10 is substantially parallel in the vertical direction. By adjusting the phase difference threshold value 92T to the average value of the phase difference 92A and the phase difference 92B, the phase difference 92A and the phase difference 92B can be distinguished from each other. That is, the sound source direction can be appropriately determined.

図２９Ｂに、音源方向判定装置１０の筐体１８の前面が斜め上方を向くように傾斜する場合のユーザの音声の位相差９３Ａと、対話相手の音声の位相差９３Ｂと、を例示する。位相差閾値９３Ｔを、位相差９３Ａと位相差９３Ｂとの平均値に調整することで、位相差９３Ａと位相差９３Ｂと、を区別することができる。即ち、音源方向を適切に判定することができる。 FIG. 29B illustrates a phase difference 93A of the voice of the user and a phase difference 93B of the voice of the dialogue partner when the front surface of the housing 18 of the sound source direction determination device 10 is tilted so as to face diagonally upward. By adjusting the phase difference threshold value 93T to the average value of the phase difference 93A and the phase difference 93B, the phase difference 93A and the phase difference 93B can be distinguished. That is, the sound source direction can be appropriately determined.

図３０Ａは、音源方向判定処理の流れの一例を示す。ＣＰＵ５１は、ステップ２０１で、変数ＮＰに０を設定する。変数ＮＰは、発話区間の正規化位相差を合計するための変数である。 FIG. 30A shows an example of the flow of the sound source direction determination process. In step 201, the CPU 51 sets the variable NP to 0. The variable NP is used to accumulate the sum of the normalized phase differences over the utterance section.

ＣＰＵ５１は、ステップ２０２で、第１マイク１１及び第２マイク１２で取得された音に対応する音信号を１フレーム分読み込み、ステップ２０３で、時間周波数変換する。ＣＰＵ５１は、ステップ２０４で、発話区間が開始されたか否か判定する。 In step 202, the CPU 51 reads one frame of the sound signals corresponding to the sounds acquired by the first microphone 11 and the second microphone 12, and in step 203, performs time-frequency conversion on them. In step 204, the CPU 51 determines whether an utterance section has started.

ステップ２０４の判定が否定された場合、ＣＰＵ５１は、ステップ２０２に戻る。ステップ２０４の判定が肯定された場合、ＣＰＵ５１は、ステップ２０５で、正規化位相差を算出し、ステップ２０６で、変数ＮＰに正規化位相差を加算する。 If the determination in step 204 is denied, the CPU 51 returns to step 202. If the determination in step 204 is affirmed, the CPU 51 calculates the normalized phase difference in step 205, and adds the normalized phase difference to the variable NP in step 206.

ＣＰＵ５１は、ステップ２０７で、第１マイク１１及び第２マイク１２で取得された音に対応する音信号を１フレーム分読み込み、ステップ２０８で、時間周波数変換する。ＣＰＵ５１は、ステップ２０９で、発話区間が終了されたか否か判定する。 In step 207, the CPU 51 reads one frame of the sound signals corresponding to the sounds acquired by the first microphone 11 and the second microphone 12, and in step 208, performs time-frequency conversion on them. In step 209, the CPU 51 determines whether the utterance section has ended.

ステップ２０９の判定が否定された場合、ＣＰＵ５１は、ステップ２０５に戻る。ステップ２０９の判定が肯定された場合、ＣＰＵ５１は、ステップ２１０で、変数ＮＰの値をステップ２０７で読み込まれた音信号のフレーム数で割ることで、位相差平均値の一例である平均正規化位相差を算出する。ＣＰＵ５１は、ステップ２１１で、将来使用するために、算出した平均正規化位相差を過去発話位相差として、例えば、二次記憶部５３のデータ格納領域５３Ｂに、記憶する。 If the determination in step 209 is negative, the CPU 51 returns to step 205. If the determination in step 209 is affirmative, the CPU 51, in step 210, divides the value of the variable NP by the number of frames of the sound signal read in step 207 to calculate the average normalized phase difference, which is an example of the phase difference average value. In step 211, the CPU 51 stores the calculated average normalized phase difference as a past utterance phase difference, for future use, in, for example, the data storage area 53B of the secondary storage unit 53.

ＣＰＵ５１は、ステップ２１２で、以前の処理で記憶されている過去発話位相差と平均正規化位相差とを比較する。ステップ２１２の判定が肯定された場合、過去発話位相差と平均正規化位相差とに所定値を超える差がある場合、ＣＰＵ５１は、ステップ２１３で、閾値を調整し、ステップ２１４に進む。詳細には、ＣＰＵ５１は、ステップ２１３で、過去発話位相差と平均正規化位相差との平均値を、第６閾値の一例である閾値として設定することで閾値を調整する。 In step 212, the CPU 51 compares the past utterance phase difference stored in earlier processing with the average normalized phase difference. If the determination in step 212 is affirmative, that is, if the past utterance phase difference and the average normalized phase difference differ by more than the predetermined value, the CPU 51 adjusts the threshold in step 213 and proceeds to step 214. Specifically, in step 213, the CPU 51 adjusts the threshold by setting the average of the past utterance phase difference and the average normalized phase difference as the threshold, which is an example of the sixth threshold.

ステップ２１２の判定が否定された場合、ＣＰＵ５１は、閾値を調整せず、ステップ２１４に進む。ＣＰＵ５１は、ステップ２１４で、ステップ２０７で読み込まれた音信号の音源方向が上方であるか否か判定する。詳細には、平均正規化位相差が閾値を超えるか否か判定する。 If the determination in step 212 is denied, the CPU 51 does not adjust the threshold value and proceeds to step 214. In step 214, the CPU 51 determines whether or not the sound source direction of the sound signal read in step 207 is upward. Specifically, it is determined whether or not the average normalized phase difference exceeds the threshold value.

ステップ２１４の判定が肯定された場合、ＣＰＵ５１は、ステップ２１５で、ステップ２０７で読み込まれた音信号を第１言語に翻訳するように設定する。ステップ２１４の判定が否定された場合、ＣＰＵ５１は、ステップ２１６で、ステップ２０７で読み込まれた音信号の音源方向が前方であるか否か判定する。詳細には、平均正規化位相差が閾値以下であるか否か判定する。 If the determination in step 214 is affirmative, the CPU 51, in step 215, makes a setting so that the sound signal read in step 207 is translated into the first language. If the determination in step 214 is negative, the CPU 51 determines in step 216 whether the sound source direction of the sound signal read in step 207 is forward. Specifically, it determines whether the average normalized phase difference is equal to or less than the threshold.

ステップ２１６の判定が肯定された場合、ＣＰＵ５１は、ステップ２１７で、ステップ２０７で読み込まれた音信号を第２言語に翻訳するように設定する。ＣＰＵ５１は、ステップ２１８で、ユーザが、例えば、所定のボタンを押下するなど、音源方向判定処理を終了するように指示する操作が行われたか否かを判定する。 If the determination in step 216 is affirmative, the CPU 51, in step 217, makes a setting so that the sound signal read in step 207 is translated into the second language. In step 218, the CPU 51 determines whether the user has performed an operation instructing the end of the sound source direction determination process, such as pressing a predetermined button.

ステップ２１８の判定が否定された場合、ＣＰＵ５１は、ステップ２０１に戻り、ステップ２１８の判定が肯定された場合、ＣＰＵ５１は、音源方向判定処理を終了する。 If the determination in step 218 is denied, the CPU 51 returns to step 201, and if the determination in step 218 is affirmed, the CPU 51 ends the sound source direction determination process.
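The frame loop of steps 201 to 210 can be sketched as a generator that accumulates the variable NP while an utterance section is in progress and emits one average per section. This is a simplified illustration, not the flowchart itself: the input is assumed to be a sequence of `(is_speech, normalized_phase_difference)` pairs that have already passed through time-frequency conversion and utterance section detection, and the divisor is the number of frames accumulated in the section. The function name is hypothetical.

```python
def utterance_averages(frames):
    """Yield the average normalized phase difference of each utterance section.

    `frames` is an iterable of (is_speech, normalized_phase_difference)
    pairs, one per frame.
    """
    np_sum = 0.0  # variable NP, reset for every utterance section (step 201)
    count = 0     # number of frames accumulated in the current section
    for is_speech, diff in frames:
        if is_speech:
            np_sum += diff            # step 206: add to NP
            count += 1
        elif count:                   # step 209: the section has just ended
            yield np_sum / count      # step 210: average over the section
            np_sum, count = 0.0, 0    # back to step 201
    if count:                         # input ended in the middle of a section
        yield np_sum / count


example_stream = [
    (False, 0.0),
    (True, 6.0), (True, 6.2),    # first utterance section
    (False, 0.0),
    (True, -2.0), (True, -3.0),  # second utterance section
]
section_averages = list(utterance_averages(example_stream))
```

Each emitted average would then be stored as a past utterance phase difference (step 211), compared against earlier stored values (step 212), and tested against the threshold (steps 214 and 216).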

図３０Ｂは、音源方向判定処理の流れの一例を示す。図３０Ｂの音源方向判定処理は、ユーザの音声と対話相手の音声との音圧差に基づいて、閾値を調整する。 FIG. 30B shows an example of the flow of the sound source direction determination process. The sound source direction determination process of FIG. 30B adjusts the threshold value based on the sound pressure difference between the voice of the user and the voice of the dialogue partner.

ＣＰＵ５１は、ステップ２３１で、高域音圧差の合計を算出するための変数ＨＶに０を設定する。ステップ２３２～ステップ２３４は、図３０Ａのステップ２０２～２０４と同様である。 In step 231 the CPU 51 sets 0 to the variable HV for calculating the total of the high frequency sound pressure differences. Steps 232 to 234 are the same as steps 202 to 204 in FIG. 30A.

ＣＰＵ５１は、ステップ２３５で、高域音圧差を算出し、ステップ２３６で算出した高域音圧差を変数ＨＶの値に加算する。ステップ２３７～２３９は、図３０Ａのステップ２０７～２０９と同様である。 The CPU 51 calculates the high-frequency sound pressure difference in step 235, and in step 236 adds the calculated high-frequency sound pressure difference to the value of the variable HV. Steps 237 to 239 are the same as steps 207 to 209 of FIG. 30A.

ＣＰＵ５１は、ステップ２４０で、変数ＨＶの値をステップ２３７で読み込まれた音信号のフレーム数で割ることで、音圧差平均値の一例である平均高域音圧差を算出する。ＣＰＵ５１は、ステップ２４１で、将来使用するために、算出した平均高域音圧差を過去発話音圧差として、例えば、二次記憶部５３のデータ格納領域５３Ｂに、記憶する。 In step 240, the CPU 51 calculates the average high-frequency sound pressure difference, which is an example of the sound pressure difference average value, by dividing the value of the variable HV by the number of frames of the sound signal read in step 237. The CPU 51 stores the calculated average high-frequency sound pressure difference as the past utterance sound pressure difference in the data storage area 53B of the secondary storage unit 53 for future use in step 241.

ＣＰＵ５１は、以前の処理で記憶されている過去発話音圧差と平均高域音圧差とを比較する。ステップ２４２の判定が肯定された場合、ＣＰＵ５１は、ステップ２４３で、過去発話音圧差と平均高域音圧差との平均値を第５閾値の一例である閾値として設定することで閾値を調整し、ステップ２４４に進む。ステップ２４２の判定は、過去発話音圧差と平均高域音圧差とに第２所定値の一例である所定値を超える差がある場合、肯定される。 The CPU 51 compares the past utterance sound pressure difference stored in the previous process with the average high frequency sound pressure difference. If the determination in step 242 is affirmed, the CPU 51 adjusts the threshold value by setting the average value of the past utterance sound pressure difference and the average high frequency sound pressure difference as a threshold value as an example of the fifth threshold value in step 243. Proceed to step 244. The determination in step 242 is affirmed when there is a difference between the past utterance sound pressure difference and the average high frequency sound pressure difference exceeding a predetermined value which is an example of the second predetermined value.

例えば、音源方向判定装置１０の筐体１８の前面を、垂直方向に対して複数の異なる角度で傾斜させ、ユーザの音声の音圧差平均値と、対話相手の音声の音圧差平均値と、の差を各々の角度で取得する。取得した複数個の差の絶対値の内、最小値を第２所定値として使用することができる。第２所定値は、例えば、３．０［ｄＢ］であってよい。第２所定値を超える過去発話音圧差が存在しない場合、閾値を調整しない。 For example, the front surface of the housing 18 of the sound source direction determination device 10 is tilted at a plurality of different angles with respect to the vertical direction, and the difference between the sound pressure difference average value of the user's voice and the sound pressure difference average value of the dialogue partner's voice is acquired at each angle. The minimum of the absolute values of the acquired differences can be used as the second predetermined value. The second predetermined value may be, for example, 3.0 [dB]. If there is no past utterance sound pressure difference that differs from the average high-frequency sound pressure difference by more than the second predetermined value, the threshold is not adjusted.
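The calibration of the predetermined value described above amounts to taking the smallest user-to-partner gap observed over the tested tilt angles. The sketch below assumes the measurements are available as a mapping from tilt angle to a pair of averages; the function name, the angles, and the numbers in the example are hypothetical and serve only as an illustration.

```python
def calibrate_predetermined_value(measurements):
    """Return the smallest absolute gap between the user's average and the
    dialogue partner's average over all tested tilt angles.

    `measurements` maps a tilt angle in degrees to a pair
    (user_average, partner_average) measured at that angle.
    """
    if not measurements:
        raise ValueError("at least one tilt angle must be measured")
    return min(abs(user - partner) for user, partner in measurements.values())


# Hypothetical averages of the high-frequency sound pressure difference in dB.
example_measurements = {
    0: (4.8, 1.2),
    15: (4.0, 1.0),
    30: (3.5, 0.5),
}
second_predetermined_value = calibrate_predetermined_value(example_measurements)
```

Using the minimum over the tested angles means that, at each of those angles, the gap between the user's average and the partner's average is at least as large as the predetermined value. The same function would apply to phase difference averages when calibrating the third predetermined value.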

ステップ２４２の判定が否定された場合、ＣＰＵ５１は、閾値を調整せず、ステップ２４４に進む。ＣＰＵ５１は、ステップ２４４で、ステップ２３７で読み込まれた音信号の音源方向が上方であるか否か判定する。詳細には、平均高域音圧差が閾値を超えるか否か判定する。 If the determination in step 242 is denied, the CPU 51 does not adjust the threshold value and proceeds to step 244. In step 244, the CPU 51 determines whether or not the sound source direction of the sound signal read in step 237 is upward. Specifically, it is determined whether or not the average high-frequency sound pressure difference exceeds the threshold value.

ステップ２４４の判定が肯定された場合、ＣＰＵ５１は、ステップ２４５で、ステップ２３７で読み込まれた音信号を第１言語に翻訳するように設定する。ステップ２４４の判定が否定された場合、ＣＰＵ５１は、ステップ２４６で、ステップ２３７で読み込まれた音信号の音源方向が前方であるか否か判定する。詳細には、平均高域音圧差が閾値以下であるか否か判定する。ステップ２４８は、図３０Ａのステップ２１８と同様である。 If the determination in step 244 is affirmative, the CPU 51, in step 245, makes a setting so that the sound signal read in step 237 is translated into the first language. If the determination in step 244 is negative, the CPU 51 determines in step 246 whether the sound source direction of the sound signal read in step 237 is forward. Specifically, it determines whether the average high-frequency sound pressure difference is equal to or less than the threshold. Step 248 is the same as step 218 of FIG. 30A.

図３０Ａは、第４実施形態の図１７Ｇの音源方向判定処理に第５実施形態を適用した例であり、図３０Ｂは、第３実施形態の図１１の音源方向判定処理に第５実施形態を適用した例である。しかしながら、第５実施形態は、第４実施形態の図１７Ａ～１７Ｆの音源方向判定処理に適用されてもよい。即ち、音圧差を判定する閾値と位相差を判定する閾値との双方を調整するようにしてもよい。 FIG. 30A is an example in which the fifth embodiment is applied to the sound source direction determination process of FIG. 17G of the fourth embodiment, and FIG. 30B shows the fifth embodiment to the sound source direction determination process of FIG. 11 of the third embodiment. This is an applied example. However, the fifth embodiment may be applied to the sound source direction determination process of FIGS. 17A to 17F of the fourth embodiment. That is, both the threshold value for determining the sound pressure difference and the threshold value for determining the phase difference may be adjusted.

なお、位相差平均値との差が所定値を超える過去発話位相差が複数存在する場合、直近の過去発話位相差を使用してもよいし、所定時間内の過去発話位相差のうち差が最大となる過去発話位相差を使用してもよい。また、所定時間内の過去発話位相差の平均値を使用してもよい。 If there are a plurality of past utterance phase differences whose difference from the phase difference average value exceeds a predetermined value, the latest past utterance phase difference may be used, or the difference among the past utterance phase differences within the predetermined time may be used. The maximum past utterance phase difference may be used. Further, the average value of the past utterance phase differences within a predetermined time may be used.

音圧差平均値との差が所定値を超える過去発話音圧差が複数存在する場合、直近の過去発話音圧差を使用してもよいし、所定時間内の過去発話音圧差のうち差が最大となる過去発話音圧差を使用してもよい。また、所定時間内の過去発話音圧差の平均値を使用してもよい。 When there are multiple past utterance sound pressure differences whose difference from the average sound pressure difference exceeds a predetermined value, the latest past utterance sound pressure difference may be used, or the difference among the past utterance sound pressure differences within the predetermined time is the largest. The past utterance sound pressure difference may be used. Further, the average value of the past utterance sound pressure difference within a predetermined time may be used.
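The three selection options listed above (most recent value, largest gap within a time window, average within a time window) can be sketched as follows. The function name, the `(timestamp, value)` history format, and the strategy labels are hypothetical, and the choice to average only the qualifying values inside the window is an assumption; the text does not say which values enter the average.

```python
from statistics import mean


def pick_past_value(current_mean, history, min_gap, now_s, window_s, strategy="latest"):
    """Choose which past utterance value to use for threshold adjustment.

    history: list of (timestamp_s, past_mean) tuples (hypothetical format).
    Only entries whose gap from current_mean exceeds min_gap qualify.
    Returns None when nothing qualifies, meaning "do not adjust".
    """
    candidates = [(t, v) for t, v in history if abs(current_mean - v) > min_gap]
    if not candidates:
        return None
    if strategy == "latest":
        # Most recent qualifying past utterance.
        return max(candidates, key=lambda tv: tv[0])[1]
    # The remaining strategies look only at the predetermined time window.
    recent = [(t, v) for t, v in candidates if now_s - t <= window_s]
    if not recent:
        return None
    if strategy == "max_gap":
        return max(recent, key=lambda tv: abs(current_mean - tv[1]))[1]
    if strategy == "average":
        return mean(v for _, v in recent)
    raise ValueError(f"unknown strategy: {strategy}")


# Illustrative history (seconds, dB); not data from the patent.
history = [(0.0, -4.0), (10.0, -2.0), (20.0, 5.5), (25.0, -1.0)]
args = dict(current_mean=6.0, history=history, min_gap=3.0, now_s=30.0, window_s=25.0)
latest = pick_past_value(strategy="latest", **args)
largest = pick_past_value(strategy="max_gap", **args)
averaged = pick_past_value(strategy="average", **args)
```

The same helper applies unchanged to phase difference average values, since the selection logic in the text is identical for both quantities.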

なお、発話区間の複数フレームの位相差平均値または音圧差平均値を算出する例について説明したが、発話区間の一部分の複数フレームの位相差平均値及び音圧差平均値を算出するようにしてもよい。また、発話区間が長時間に及ぶ場合、発話区間を複数に分け、複数に分けた部分区間毎に位相差平均値の算出または音圧差平均値の算出を行うようにしてもよい。 Although an example of calculating the phase difference average value or the sound pressure difference average value over the plurality of frames of an utterance section has been described, the phase difference average value and the sound pressure difference average value may instead be calculated over the plurality of frames of only a part of the utterance section. Further, when an utterance section extends over a long time, it may be divided into a plurality of subsections, and the phase difference average value or the sound pressure difference average value may be calculated for each subsection.

ユーザと対話相手の対話中に、自然に、音源方向を判定する閾値を調整する例について説明したが、対話の冒頭で、ユーザと対話相手とが交互に所定時間長を超えるフレーズを発話し、当該発話の音声を使用して、閾値を調整するようにしてもよい。フレーズは、例えば、既定の挨拶（例えば、「こんにちは」など）であってよい。 An example of adjusting the threshold value for determining the sound source direction naturally during a dialogue between the user and the dialogue partner has been described, but at the beginning of the dialogue, the user and the dialogue partner alternately utter a phrase exceeding a predetermined time length. The voice of the utterance may be used to adjust the threshold. The phrase may be, for example, a default greeting (eg, "hello").

なお、上記の例では、図３０Ａのステップ２１６は、省略可能であるが、例えば、ステップ２１４で音源方向を判定する閾値とステップ２１６で音源方向を判定する閾値とが異なる値となるようにしてもよい。詳細には、例えば、ステップ２１６で使用する閾値をステップ２１４で使用する閾値よりも所定量低減してもよい。 In the above example, step 216 of FIG. 30A can be omitted; however, for example, the threshold value used to determine the sound source direction in step 214 and the threshold value used to determine the sound source direction in step 216 may be set to different values. Specifically, for example, the threshold value used in step 216 may be made lower than the threshold value used in step 214 by a predetermined amount.

これにより、音源方向の判定が困難な、即ち、何れの音源方向からの音声であるとも判定し得る音声を誤判定する虞を低減することができる。図３０Ｂのステップ２４６についても同様である。また、ステップ２１４またはステップ２４４で使用する閾値を所定量増大してもよい。 This makes it possible to reduce the risk of misjudging a voice whose sound source direction is difficult to determine, that is, a voice that could be judged as coming from either sound source direction. The same applies to step 246 of FIG. 30B. Alternatively, the threshold value used in step 214 or step 244 may be increased by a predetermined amount.
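The two-threshold variant described above leaves a band of values in which no direction is reported. A minimal sketch, assuming the "above" test uses a strict greater-than comparison and the "front" test uses the threshold lowered by a margin; the function name, the string labels, and the `None` return for the undecided band are illustrative choices, not taken from the patent.

```python
def classify_direction(mean_diff, threshold, margin=0.0):
    """Classify a mean sound pressure (or phase) difference.

    mean_diff > threshold             -> "above" (steps 214 / 244)
    mean_diff <= threshold - margin   -> "front" (steps 216 / 246)
    otherwise                         -> None (too ambiguous to decide)

    With margin == 0 the two tests are complementary, which is why the
    second test can be omitted in that case.
    """
    if margin < 0:
        raise ValueError("margin must be non-negative")
    if mean_diff > threshold:
        return "above"
    if mean_diff <= threshold - margin:
        return "front"
    return None
```

For example, with a threshold of 2.0 and a margin of 0.5, a value of 1.8 falls in the undecided band, while 1.5 is classified as coming from the front.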

なお、音信号の信号対雑音比を算出し、信号対雑音比が第４所定値の一例である所定値より小さい場合、音源方向を判定する閾値を、第５所定値の一例である所定値分下げるようにしてもよい。信号対雑音比が小さい程、音源方向による位相差及び音圧差の差異が小さくなる傾向があるためである。 The signal-to-noise ratio of the sound signal may be calculated, and when the signal-to-noise ratio is smaller than a predetermined value, which is an example of the fourth predetermined value, the threshold value for determining the sound source direction may be lowered by a predetermined value, which is an example of the fifth predetermined value. This is because the smaller the signal-to-noise ratio, the smaller the variation in phase difference and sound pressure difference with sound source direction tends to be.

第４所定値は、例えば、定常雑音比であってよいし、第５所定値は、音圧差平均値を区別する閾値の場合、例えば、０．５［ｄＢ］であってよいし、位相差平均値を区別する閾値の場合、例えば、０．５［ｒａｄ］であってよい。定常雑音比は、既存の方法で算出することができる。 The fourth predetermined value may be, for example, a stationary noise ratio, and the fifth predetermined value may be, for example, 0.5 [dB] in the case of a threshold value for distinguishing the average sound pressure difference value, and the phase difference. In the case of the threshold value for distinguishing the average value, it may be, for example, 0.5 [rad]. The steady-state noise ratio can be calculated by existing methods.
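The signal-to-noise adjustment can be illustrated with a short sketch. The function and parameter names are hypothetical; the 0.5 step mirrors the example values given above (0.5 dB for a sound pressure threshold, 0.5 rad for a phase threshold), and how the signal-to-noise ratio itself is computed is left outside the sketch.

```python
def lower_threshold_for_low_snr(threshold, snr, snr_floor, step=0.5):
    """Lower the direction threshold by `step` when the signal-to-noise
    ratio falls below `snr_floor` (the fourth predetermined value);
    `step` plays the role of the fifth predetermined value."""
    return threshold - step if snr < snr_floor else threshold


# Illustrative values only.
noisy = lower_threshold_for_low_snr(2.0, snr=5.0, snr_floor=10.0)
clean = lower_threshold_for_low_snr(2.0, snr=12.0, snr_floor=10.0)
```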

なお、図２Ａ及び図２Ｂに例示する音源方向判定装置１０に適用する例について説明したが、本実施形態は、図１３Ａ～図１３Ｃに例示する音源方向判定装置１０Ｃに適用されてもよい。本実施形態によれば、ユーザが、筐体１８Ｃの右側面及び前面に対向する位置からずれた位置に存在して発話する場合であっても、音源方向、即ち、発話者を適切に判定することができる。 Although an example applied to the sound source direction determination device 10 illustrated in FIGS. 2A and 2B has been described, the present embodiment may also be applied to the sound source direction determination device 10C illustrated in FIGS. 13A to 13C. According to the present embodiment, the sound source direction, that is, the speaker, can be appropriately determined even when the user speaks from a position shifted away from the positions facing the right side surface and the front surface of the housing 18C.

なお、図３０Ａ及び３０Ｂにおけるフローチャートの処理の順序は一例であり、本実施形態は、当該処理の順序に限定されない。 The order of processing of the flowcharts in FIGS. 30A and 30B is an example, and the present embodiment is not limited to the order of the processing.

本実施形態では、ユーザの音声と対話相手の音声とに基づいて、音源方向を判定する閾値を調整することで、音源判定装置が傾斜した場合であっても、音源方向を適切に判定することができる。 In the present embodiment, the threshold value for determining the sound source direction is adjusted based on the voice of the user and the voice of the dialogue partner, so that the sound source direction can be appropriately determined even when the sound source determination device is tilted.

（関連技術） (Related technology)
次に、関連技術について説明する。関連技術では、図１８に例示するように、指向性マイク１１Ｘの指向１１ＸＯＲ及び指向性マイク１２Ｘの指向１２ＸＯＲを交差させるように、２つの指向性マイクを配置する。例えば、指向１１ＸＯＲを上方に向け、指向１２ＸＯＲを前方に向ける。 Next, the related technology will be described. In the related technology, as illustrated in FIG. 18, two directional microphones are arranged so that the directivity 11XOR of the directional microphone 11X and the directivity 12XOR of the directional microphone 12X intersect. For example, the directivity 11XOR is directed upward and the directivity 12XOR is directed forward.

この構成により、指向性マイク１１Ｘ及び指向性マイク１２Ｘが取得した音の音圧差を使用して、音源の方向を判定することが可能である。即ち、指向性マイク１１Ｘで取得した音の音圧が指向性マイク１２Ｘで取得した音の音圧より大きい場合、音源は上方に存在し、指向性マイク１２Ｘで取得した音の音圧が指向性マイク１１Ｘで取得した音の音圧より大きい場合、音源は前方に存在する。 With this configuration, it is possible to determine the direction of the sound source by using the sound pressure difference of the sound acquired by the directional microphone 11X and the directional microphone 12X. That is, when the sound pressure of the sound acquired by the directional microphone 11X is larger than the sound pressure of the sound acquired by the directional microphone 12X, the sound source is located above and the sound pressure of the sound acquired by the directional microphone 12X is directional. If it is higher than the sound pressure of the sound acquired by the microphone 11X, the sound source is in front.

しかしながら、指向性マイクは、図１９に例示するように、無指向性マイクよりも大きいため、指向性マイクを使用した場合、音源方向判定装置を小型化することが困難である。図１９の例では、指向性マイクの体積は２２６［立方ｍｍ］であり、無指向性マイクの体積は１１［立方ｍｍ］である。即ち、指向性マイクの体積は、無指向性マイクの体積の約２０倍である。また、指向性マイクは無指向性マイクよりも高価であるため、指向性マイクを使用した場合音源方向判定装置の価格を低減することも困難となる。 However, since a directional microphone is larger than an omnidirectional microphone, as illustrated in FIG. 19, it is difficult to miniaturize the sound source direction determination device when directional microphones are used. In the example of FIG. 19, the volume of the directional microphone is 226 [cubic mm], and the volume of the omnidirectional microphone is 11 [cubic mm]. That is, the volume of the directional microphone is about 20 times the volume of the omnidirectional microphone. Further, since a directional microphone is more expensive than an omnidirectional microphone, using directional microphones also makes it difficult to reduce the price of the sound source direction determination device.

しかしながら、図１８に例示した音源方向判定装置の指向性マイクを単に無指向性マイクで置き替えることで、音源方向を精度よく判定することが可能な音源方向判定装置を実現することは困難である。図２０Ａに例示するように、無指向性マイク１１Ｙが音を取得することができる範囲１１ＹＯＲと、無指向性マイク１２Ｙが音を取得することができる範囲１２ＹＯＲと、はほぼ重複する。したがって、無指向性マイク１１Ｙ及び１２Ｙが取得した音の音圧差に、音源方向を精度よく判定することができる程度の有意な差が生じないためである。 However, it is difficult to realize a sound source direction determination device capable of accurately determining the sound source direction simply by replacing the directional microphones of the sound source direction determination device illustrated in FIG. 18 with omnidirectional microphones. This is because, as illustrated in FIG. 20A, the range 11YOR in which the omnidirectional microphone 11Y can acquire sound and the range 12YOR in which the omnidirectional microphone 12Y can acquire sound substantially overlap, so the sound pressure difference between the sounds acquired by the omnidirectional microphones 11Y and 12Y is not large enough to determine the sound source direction accurately.

図２０Ｂに、筐体１８Ｙの上面に第１マイク１１Ｙを設置し、前面に第２マイク１２Ｙを設置した、第１実施形態と同様に、前後方向の幅が１［ｃｍ］程度であり、前面が名刺程度の大きさである、関連技術の音源方向判定装置１０Ｙを例示する。第１マイク１１Ｙ及び第２マイク１２Ｙは、無指向性マイクである。関連技術の音源方向判定装置１０Ｙの音圧差と第１実施形態の音源方向判定装置１０の音圧差とを図２１に例示する。音源が音源方向判定装置の上方にある場合、第１マイクで取得する音の音圧と第２マイクで取得する音の音圧との音圧差は、関連技術では、２．９［ｄＢ］であり、第１実施形態では、７．２［ｄＢ］である。 FIG. 20B illustrates a sound source direction determination device 10Y of the related technology, in which the first microphone 11Y is installed on the upper surface of the housing 18Y and the second microphone 12Y is installed on the front surface. As in the first embodiment, its width in the front-rear direction is about 1 [cm] and its front surface is about the size of a business card. The first microphone 11Y and the second microphone 12Y are omnidirectional microphones. FIG. 21 illustrates the sound pressure difference of the sound source direction determination device 10Y of the related technology and the sound pressure difference of the sound source direction determination device 10 of the first embodiment. When the sound source is above the sound source direction determination device, the sound pressure difference between the sound acquired by the first microphone and the sound acquired by the second microphone is 2.9 [dB] in the related technology and 7.2 [dB] in the first embodiment.

音源が音源方向判定装置の前方にある場合、第１マイクで取得する音の音圧と第２マイクで取得する音の音圧との音圧差は、関連技術では、－２．９［ｄＢ］であり、第１実施形態では、－４．２［ｄＢ］である。即ち、音源が音源方向判定装置の上方にある場合、第１実施形態で算出される音圧差は、関連技術より４．３［ｄＢ］大きく、音源が音源方向判定装置の前方にある場合、第１実施形態で算出される音圧差は、関連技術より１．３［ｄＢ］小さい。 When the sound source is in front of the sound source direction determination device, the sound pressure difference between the sound pressure of the sound acquired by the first microphone and the sound pressure of the sound acquired by the second microphone is -2.9 [dB] in the related technology. In the first embodiment, it is -4.2 [dB]. That is, when the sound source is above the sound source direction determination device, the sound pressure difference calculated in the first embodiment is 4.3 [dB] larger than that of the related technology, and when the sound source is in front of the sound source direction determination device, the first The sound pressure difference calculated in one embodiment is 1.3 [dB] smaller than that of the related technology.

したがって、本実施形態では図１１のステップ１０４及びステップ１０６の判定で、誤った判定結果を得る可能性が低減するため、本実施形態によれば、無指向性マイクロフォンを使用した音源方向判定の精度を向上させることを可能とする。 Therefore, in the present embodiment, the possibility of obtaining an erroneous result in the determinations of step 104 and step 106 of FIG. 11 is reduced, so the present embodiment makes it possible to improve the accuracy of sound source direction determination using omnidirectional microphones.
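The figures quoted above for FIG. 21 can be checked with simple arithmetic. The first two lines reproduce the differences stated in the text; the last two lines, the separation between the "above" and "front" values for each device, are derived here from the same four numbers and are not stated in the source.

```latex
\begin{align*}
\text{above, first embodiment vs. related technology: } & 7.2 - 2.9 = 4.3\ \mathrm{dB} \\
\text{front, first embodiment vs. related technology: } & (-4.2) - (-2.9) = -1.3\ \mathrm{dB} \\
\text{above-to-front separation, related technology: } & 2.9 - (-2.9) = 5.8\ \mathrm{dB} \\
\text{above-to-front separation, first embodiment: } & 7.2 - (-4.2) = 11.4\ \mathrm{dB}
\end{align*}
```

On these numbers the two directions are separated by roughly twice as many decibels in the first embodiment, which is consistent with the reduced chance of an erroneous determination described above.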

以上の各実施形態に関し、更に以下の付記を開示する。 The following additional notes will be further disclosed with respect to each of the above embodiments.

（付記１）
第１平坦面に開口した第１開口部を一端部に備え、前記第１開口部から音が伝搬する第１音道、及び、前記第１平坦面と交差する第２平坦面に開口した第２開口部を一端部に備え、前記第２開口部から音が伝搬する第２音道が内部に設けられたマイク設置部と、
前記第１音道の他端部に設置された無指向性の第１マイクロフォンと、
前記第２音道の他端部に設置された無指向性の第２マイクロフォンと、
前記第１マイクロフォンで取得された音の第１周波数成分の音圧である第１音圧と、前記第２マイクロフォンで取得された音の前記第１周波数成分の音圧である第２音圧との音圧の相違、及び、前記第１マイクロフォンで取得された音の第２周波数成分の位相である第１位相と、前記第２マイクロフォンで取得された音の前記第２周波数成分の位相である第２位相との位相の相違の少なくとも一方に基づいて、音源が存在する方向を判定する、判定部と、
を含む、
音源方向判定装置。
（付記２）
前記第１周波数成分は高域成分である、
付記１の音源方向判定装置。
（付記３）
前記第１平坦面と前記第２平坦面とは直交し、
前記第１平坦面の面積は第１所定値以下であり、前記第２平坦面の面積は前記第１所定値より大きく、
前記第１音道は、前記第１開口部に音を回折する第１回折部を有し、かつ、途中に、音を回折する屈曲部である第２回折部を有し、
前記第２音道は、前記第２開口部に音を回折する第３回折部を有する、
付記１または付記２の音源方向判定装置。
（付記４）
前記第１平坦面と前記第２平坦面とは直交し、
前記第１平坦面の面積は第１所定値以下であり、前記第２平坦面の面積は前記第１所定値より大きく、
前記第１音道は、前記第１開口部に音を回折する第１回折部を有し、かつ、途中に、音を回折する屈曲部である第２回折部を有し、
前記第２音道は、前記第２開口部に音を回折する第３回折部を有し、かつ、途中に、音を回折する屈曲部である第４回折部を有する、
付記１または付記２の音源方向判定装置。
（付記５）
前記第１平坦面と前記第２平坦面とは直交し、
前記第１平坦面及び前記第２平坦面の面積は第１所定値より大きく、
前記第１音道は、前記第１開口部に音を回折する第１回折部を有し、
前記第２音道は、前記第２開口部に音を回折する第２回折部を有する、
付記１または付記２の音源方向判定装置。
（付記６）
前記音圧の相違は、前記第１音圧のパワーの対数から前記第２音圧のパワーの対数を減算した音圧差の平均値であり、
前記位相の相違は、対象周波数帯域の位相差の平均値であり、
前記音圧差の平均値が正の第１閾値よりも大きい場合、及び、前記位相差の平均値が正の第３閾値よりも大きい場合の内少なくとも一方の場合、前記音源が前記第１平坦面に対向する位置に存在すると判定する、
付記１～付記５の何れかの音源方向判定装置。
（付記７）
前記音圧差の平均値が負の第２閾値よりも小さい場合、及び、前記位相差の平均値が負の第４閾値よりも小さい場合の内少なくとも一方の場合、前記音源が前記第２平坦面に対向する位置に存在すると判定する、
付記６の音源方向判定装置。
（付記８）
前記対象周波数帯域の位相差の平均値a_phaseは、以下の（１０）式で表される、付記６または付記７の音源方向判定装置。

但し、
phase[j]=atan(phase_im[j]/phase_re[j])、
phase_re[j]=re1[j]*re2[j]+im1[j]*im2[j]、
phase_im[j]=im1[j]*re2[j]-re1[j]*im2[j]、
C_n[j]=λ[j]/λ_cであり、
ｊは周波数帯域数であり、
re1[j]は、j番目の周波数帯域の前記第１音圧のスペクトルの実部であり、
re2[j]は、j番目の周波数帯域の前記第２音圧のスペクトルの実部であり、
im1[j]は、j番目の周波数帯域の前記第１音圧のスペクトルの虚部であり、
im2[j]は、j番目の周波数帯域の前記第２音圧のスペクトルの虚部であり、
λ[j]は、j番目の周波数帯域の音の波長であり、
λ_cは、基準周波数の音の波長であり、
ｅｅは、前記対象周波数帯域の上限であり、
ｓｓは、前記対象周波数帯域の下限である。
（付記９）
前記音圧の相違は、前記第１音圧のパワーの対数から前記第２音圧のパワーの対数を減算したフレーム毎の音圧差の複数フレームの平均値である音圧差平均値であり、
前記位相の相違は、フレーム毎の対象周波数帯域の位相差の複数フレームの平均値である位相差平均値であり、
前記音圧差平均値が第５閾値よりも大きい場合、及び、前記位相差平均値が第６閾値よりも大きい場合の内少なくとも一方の場合、前記音源が前記第１平坦面に対向する位置に存在すると判定し、
前記第５閾値は、前記音源が前記第１平坦面に対向する位置に存在する場合の前記音圧差平均値と、前記音源が前記第２平坦面に対向する位置に存在する場合の前記音圧差平均値と、の平均値であり、
前記第６閾値は、前記音源が前記第１平坦面に対向する位置に存在する場合の前記位相差平均値と、前記音源が前記第２平坦面に対向する位置に存在する場合の前記位相差平均値と、の平均値である、
付記１～付記５の何れかの音源方向判定装置。
（付記１０）
前記音圧差平均値が前記第５閾値以下の場合、及び、前記位相差平均値が前記第６閾値以下の場合の内少なくとも一方の場合、前記音源が前記第２平坦面に対向する位置に存在すると判定する、
付記９の音源方向判定装置。
（付記１１）
前記音源が前記第１平坦面に対向する位置に存在する場合の前記音圧差平均値と、前記音源が前記第２平坦面に対向する位置に存在する場合の前記音圧差平均値と、の平均値は、前記音圧差の第１発話区間の平均値である第１平均値と、前記音圧差の第２発話区間の平均値である第２平均値と、の平均値であり、前記第１平均値と前記第２平均値との相違は、第２所定値を超え、
前記音源が前記第１平坦面に対向する位置に存在する場合の前記位相差平均値と、前記音源が前記第２平坦面に対向する位置に存在する場合の前記位相差平均値と、の平均値は、前記位相差の第３発話区間の平均値である第３平均値と、前記位相差の第４発話区間の平均値である第４平均値と、の平均値であり、前記第３平均値と前記第４平均値との相違は、第３所定値を超える、
付記９または付記１０の音源方向判定装置。
（付記１２）
前記音に対応する信号の信号対雑音比が第４所定値より小さい場合、前記第５閾値及び前記第６閾値を第５所定値分低減する、
付記９～付記１１の何れかの音源方向判定装置。
（付記１３）
前記音源が前記第１平坦面と対向する位置に存在すると判定された場合、前記音に対応する信号を第１言語に翻訳し、前記音源が前記第２平坦面に対向する位置に存在すると判定された場合、前記音に対応する信号を第２言語に翻訳する、
付記１～付記１２の何れかの音源方向判定装置。
（付記１４）
第１平坦面に開口した第１開口部を一端部に備え、前記第１開口部から音が伝搬する第１音道、及び、前記第１平坦面と交差する第２平坦面に開口した第２開口部を一端部に備え、前記第２開口部から音が伝搬する第２音道が内部に設けられたマイク設置部と、
前記第１音道の他端部に設置された無指向性の第１マイクロフォンと、
前記第２音道の他端部に設置された無指向性の第２マイクロフォンと、
コンピュータと、
を含む音源方向判定装置の前記コンピュータが、
前記第１マイクロフォンで取得された音の第１周波数成分の音圧である第１音圧と、前記第２マイクロフォンで取得された音の前記第１周波数成分の音圧である第２音圧との音圧の相違、及び、前記第１マイクロフォンで取得された音の第２周波数成分の位相である第１位相と、前記第２マイクロフォンで取得された音の前記第２周波数成分の位相である第２位相との位相の相違の少なくとも一方に基づいて、音源が存在する方向を判定する、
音源方向判定方法。
（付記１５）
前記音圧の相違は、前記第１音圧のパワーの対数から前記第２音圧のパワーの対数を減算した音圧差の平均値であり、
前記位相の相違は、対象周波数帯域の位相差の平均値であり、
前記音圧差の平均値が正の第１閾値よりも大きい場合、及び、前記位相差の平均値が正の第３閾値よりも大きい場合の内少なくとも一方の場合、前記音源が前記第１平坦面に対向する位置に存在すると判定する、
付記１４の音源方向判定方法。
（付記１６）
前記音圧の相違は、前記第１音圧のパワーの対数から前記第２音圧のパワーの対数を減算したフレーム毎の音圧差の複数フレームの平均値である音圧差平均値であり、
前記位相の相違は、フレーム毎の対象周波数帯域の位相差の複数フレームの平均値である位相差平均値であり、
前記音圧差平均値が第５閾値よりも大きい場合、及び、前記位相差平均値が第６閾値よりも大きい場合の内少なくとも一方の場合、前記音源が前記第１平坦面に対向する位置に存在すると判定し、
前記第５閾値は、前記音源が前記第１平坦面に対向する位置に存在する場合の前記音圧差平均値と、前記音源が前記第２平坦面に対向する位置に存在する場合の前記音圧差平均値と、の平均値であり、
前記第６閾値は、前記音源が前記第１平坦面に対向する位置に存在する場合の前記位相差平均値と、前記音源が前記第２平坦面に対向する位置に存在する場合の前記位相差平均値と、の平均値である、
付記１４の音源方向判定方法。
（付記１７）
第１平坦面に開口した第１開口部を一端部に備え、前記第１開口部から音が伝搬する第１音道、及び、前記第１平坦面と交差する第２平坦面に開口した第２開口部を一端部に備え、前記第２開口部から音が伝搬する第２音道が内部に設けられたマイク設置部と、
前記第１音道の他端部に設置された無指向性の第１マイクロフォンと、
前記第２音道の他端部に設置された無指向性の第２マイクロフォンと、
コンピュータと、
を含む音源方向判定装置のコンピュータに、
前記第１マイクロフォンで取得された音の第１周波数成分の音圧である第１音圧と、前記第２マイクロフォンで取得された音の前記第１周波数成分の音圧である第２音圧との音圧の相違、及び、前記第１マイクロフォンで取得された音の第２周波数成分の位相である第１位相と、前記第２マイクロフォンで取得された音の前記第２周波数成分の位相である第２位相との位相の相違の少なくとも一方に基づいて、音源が存在する方向を判定する、
音源方向判定処理を実行させるためのプログラム。
（付記１８）
前記音圧の相違は、前記第１音圧のパワーの対数から前記第２音圧のパワーの対数を減算した音圧差の平均値であり、
前記位相の相違は、対象周波数帯域の位相差の平均値であり、
前記音圧差の平均値が正の第１閾値よりも大きい場合、及び、前記位相差の平均値が正の第３閾値よりも大きい場合の内少なくとも一方の場合、前記音源が前記第１平坦面に対向する位置に存在すると判定する、
付記１７のプログラム。
（付記１９）
前記音圧の相違は、前記第１音圧のパワーの対数から前記第２音圧のパワーの対数を減算したフレーム毎の音圧差の複数フレームの平均値である音圧差平均値であり、
前記位相の相違は、フレーム毎の対象周波数帯域の位相差の複数フレームの平均値である位相差平均値であり、
前記音圧差平均値が第５閾値よりも大きい場合、及び、前記位相差平均値が第６閾値よりも大きい場合の内少なくとも一方の場合、前記音源が前記第１平坦面に対向する位置に存在すると判定し、
前記第５閾値は、前記音源が前記第１平坦面に対向する位置に存在する場合の前記音圧差平均値と、前記音源が前記第２平坦面に対向する位置に存在する場合の前記音圧差平均値と、の平均値であり、
前記第６閾値は、前記音源が前記第１平坦面に対向する位置に存在する場合の前記位相差平均値と、前記音源が前記第２平坦面に対向する位置に存在する場合の前記位相差平均値と、の平均値である、
付記１７のプログラム。
(Appendix 1)
A microphone installation unit internally provided with a first sound path, which has at one end a first opening opened in a first flat surface and through which sound propagates from the first opening, and a second sound path, which has at one end a second opening opened in a second flat surface intersecting the first flat surface and through which sound propagates from the second opening,
An omnidirectional first microphone installed at the other end of the first sound path, and
An omnidirectional second microphone installed at the other end of the second sound path, and
A determination unit that determines the direction in which a sound source exists, based on at least one of the difference in sound pressure between a first sound pressure, which is the sound pressure of a first frequency component of the sound acquired by the first microphone, and a second sound pressure, which is the sound pressure of the first frequency component of the sound acquired by the second microphone, and the difference in phase between a first phase, which is the phase of a second frequency component of the sound acquired by the first microphone, and a second phase, which is the phase of the second frequency component of the sound acquired by the second microphone,
including,
Sound source direction determination device.
(Appendix 2)
The first frequency component is a high frequency component.
The sound source direction determination device of Appendix 1.
(Appendix 3)
The first flat surface and the second flat surface are orthogonal to each other.
The area of the first flat surface is equal to or less than the first predetermined value, and the area of the second flat surface is larger than the first predetermined value.
The first sound path has a first diffracting portion that diffracts sound in the first opening, and has a second diffracting portion that is a bending portion that diffracts sound in the middle.
The second sound path has a third diffracting portion that diffracts sound in the second opening.
The sound source direction determination device of Appendix 1 or Appendix 2.
(Appendix 4)
The first flat surface and the second flat surface are orthogonal to each other.
The area of the first flat surface is equal to or less than the first predetermined value, and the area of the second flat surface is larger than the first predetermined value.
The first sound path has a first diffracting portion that diffracts sound in the first opening, and has a second diffracting portion that is a bending portion that diffracts sound in the middle.
The second sound path has a third diffracting portion that diffracts sound in the second opening, and has a fourth diffracting portion that is a bending portion that diffracts sound in the middle.
The sound source direction determination device of Appendix 1 or Appendix 2.
(Appendix 5)
The first flat surface and the second flat surface are orthogonal to each other.
The areas of the first flat surface and the second flat surface are larger than the first predetermined value.
The first sound path has a first diffracting portion that diffracts sound in the first opening.
The second sound path has a second diffracting portion that diffracts sound in the second opening.
The sound source direction determination device of Appendix 1 or Appendix 2.
(Appendix 6)
The difference in sound pressure is the average value of the sound pressure difference obtained by subtracting the logarithm of the power of the second sound pressure from the logarithm of the power of the first sound pressure.
The phase difference is the average value of the phase difference in the target frequency band.
When at least one of the following holds, it is determined that the sound source exists at a position facing the first flat surface: the average value of the sound pressure difference is larger than a positive first threshold value, or the average value of the phase difference is larger than a positive third threshold value.
The sound source direction determination device according to any one of Supplementary note 1 to Supplementary note 5.
(Appendix 7)
When at least one of the following holds, it is determined that the sound source exists at a position facing the second flat surface: the average value of the sound pressure difference is smaller than a negative second threshold value, or the average value of the phase difference is smaller than a negative fourth threshold value.
The sound source direction determination device of Appendix 6.
(Appendix 8)
The sound source direction determination device of Appendix 6 or Appendix 7, wherein the average value a_phase of the phase difference of the target frequency band is represented by the following equation (10).

where
phase[j]=atan(phase_im[j]/phase_re[j]),
phase_re[j]=re1[j]*re2[j]+im1[j]*im2[j],
phase_im[j]=im1[j]*re2[j]-re1[j]*im2[j],
C_n[j]=λ[j]/λ_c,
j is the frequency band number,
re1 [j] is the real part of the spectrum of the first sound pressure in the jth frequency band.
re2 [j] is the real part of the spectrum of the second sound pressure in the jth frequency band.
im1 [j] is an imaginary part of the spectrum of the first sound pressure in the jth frequency band.
im2 [j] is an imaginary part of the spectrum of the second sound pressure in the jth frequency band.
λ [j] is the wavelength of the sound in the jth frequency band.
λ_c is the wavelength of the sound of the reference frequency,
ee is the upper limit of the target frequency band, and
ss is the lower limit of the target frequency band.
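The definitions listed for Appendix 8 can be turned into a short sketch. Equation (10) itself is not reproduced in this text, so the averaging step below (a plain mean of phase[j] * C_n[j] over bands ss to ee inclusive) is an assumption, as are the function names. Two further choices are swapped in deliberately: `math.atan2` replaces the plain `atan` of the definition to avoid division by zero (the two agree whenever phase_re[j] is positive), and C_n[j] is computed as f_c / f[j], which equals λ[j]/λ_c because wavelength is inversely proportional to frequency.

```python
import cmath
import math


def band_phase_differences(spec1, spec2):
    """Per-band phase difference between two complex spectra, using the
    phase_re / phase_im definitions (real and imaginary parts of
    spec1[j] times the conjugate of spec2[j])."""
    phases = []
    for s1, s2 in zip(spec1, spec2):
        phase_re = s1.real * s2.real + s1.imag * s2.imag
        phase_im = s1.imag * s2.real - s1.real * s2.imag
        phases.append(math.atan2(phase_im, phase_re))
    return phases


def mean_normalized_phase(phases, freqs_hz, ref_freq_hz, ss, ee):
    """Assumed form of the average: mean of phase[j] * C_n[j] for
    ss <= j <= ee, with C_n[j] = lambda[j] / lambda_c = ref_freq / freq[j]."""
    if not 0 <= ss <= ee < len(phases):
        raise ValueError("band limits out of range")
    scaled = [phases[j] * (ref_freq_hz / freqs_hz[j]) for j in range(ss, ee + 1)]
    return sum(scaled) / len(scaled)


# Two illustrative bands (1 kHz and 2 kHz) with phase gaps of 0.3 and 0.6 rad.
spec1 = [cmath.rect(1.0, 0.5), cmath.rect(2.0, 1.0)]
spec2 = [cmath.rect(1.0, 0.2), cmath.rect(0.5, 0.4)]
phases = band_phase_differences(spec1, spec2)
a_phase = mean_normalized_phase(phases, [1000.0, 2000.0], 1000.0, ss=0, ee=1)
```

The wavelength weighting rescales each band's phase difference to what it would be at the reference frequency, so in this example a phase gap that doubles with frequency (as a pure time delay would produce) yields the same normalized value in both bands.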
(Appendix 9)
The difference in sound pressure is a sound pressure difference average value which is an average value of a plurality of frames of the sound pressure difference for each frame obtained by subtracting the logarithm of the power of the second sound pressure from the logarithm of the power of the first sound pressure.
The phase difference is a phase difference average value which is an average value of a plurality of frames of the phase difference of the target frequency band for each frame.
When at least one of the following holds, it is determined that the sound source exists at a position facing the first flat surface: the sound pressure difference average value is larger than a fifth threshold value, or the phase difference average value is larger than a sixth threshold value.
The fifth threshold value is the average of the sound pressure difference average value obtained when the sound source exists at a position facing the first flat surface and the sound pressure difference average value obtained when the sound source exists at a position facing the second flat surface.
The sixth threshold value is the average of the phase difference average value obtained when the sound source exists at a position facing the first flat surface and the phase difference average value obtained when the sound source exists at a position facing the second flat surface.
The sound source direction determination device according to any one of Supplementary note 1 to Supplementary note 5.
(Appendix 10)
When at least one of the following holds, it is determined that the sound source exists at a position facing the second flat surface: the sound pressure difference average value is equal to or less than the fifth threshold value, or the phase difference average value is equal to or less than the sixth threshold value.
The sound source direction determination device of Appendix 9.
(Appendix 11)
The average of the sound pressure difference average value when the sound source is present at a position facing the first flat surface and the sound pressure difference average value when the sound source is present at a position facing the second flat surface. The value is an average value of a first mean value which is an average value of the first speech section of the sound pressure difference and a second mean value which is an average value of the second speech section of the sound pressure difference, and is the first mean value. The difference between the mean value and the second mean value exceeds the second predetermined value,
The average of the phase difference average value when the sound source is present at a position facing the first flat surface and the phase difference average value when the sound source is present at a position facing the second flat surface. The value is an average value of a third mean value which is an average value of the third speech section of the phase difference and a fourth mean value which is an average value of the fourth speech section of the phase difference, and is the third mean value. The difference between the mean value and the fourth mean value exceeds the third predetermined value.
The sound source direction determination device of Appendix 9 or Appendix 10.
(Appendix 12)
When the signal-to-noise ratio of the signal corresponding to the sound is smaller than the fourth predetermined value, the fifth threshold value and the sixth threshold value are reduced by the fifth predetermined value.
A sound source direction determination device according to any one of Supplementary note 9 to Supplementary note 11.
(Appendix 13)
When it is determined that the sound source exists at a position facing the first flat surface, the signal corresponding to the sound is translated into the first language, and it is determined that the sound source exists at a position facing the second flat surface. If so, the signal corresponding to the sound is translated into a second language.
The sound source direction determination device according to any one of Supplementary note 1 to Supplementary note 12.
(Appendix 14)
A microphone installation unit internally provided with a first sound path, which has at one end a first opening opened in a first flat surface and through which sound propagates from the first opening, and a second sound path, which has at one end a second opening opened in a second flat surface intersecting the first flat surface and through which sound propagates from the second opening,
An omnidirectional first microphone installed at the other end of the first sound path, and
An omnidirectional second microphone installed at the other end of the second sound path, and
A computer,
The computer of the sound source direction determination device including the above elements
Determines the direction in which a sound source exists, based on at least one of the difference in sound pressure between a first sound pressure, which is the sound pressure of a first frequency component of the sound acquired by the first microphone, and a second sound pressure, which is the sound pressure of the first frequency component of the sound acquired by the second microphone, and the difference in phase between a first phase, which is the phase of a second frequency component of the sound acquired by the first microphone, and a second phase, which is the phase of the second frequency component of the sound acquired by the second microphone,
Sound source direction determination method.
(Appendix 15)
The difference in sound pressure is the average value of the sound pressure difference obtained by subtracting the logarithm of the power of the second sound pressure from the logarithm of the power of the first sound pressure.
The phase difference is the average value of the phase difference in the target frequency band.
When at least one of the following holds, it is determined that the sound source exists at a position facing the first flat surface: the average value of the sound pressure difference is larger than a positive first threshold value, or the average value of the phase difference is larger than a positive third threshold value.
The sound source direction determination method of Appendix 14.
(Appendix 16)
The difference in sound pressure is a sound pressure difference average value which is an average value of a plurality of frames of the sound pressure difference for each frame obtained by subtracting the logarithm of the power of the second sound pressure from the logarithm of the power of the first sound pressure.
The phase difference is a phase difference average value which is an average value of a plurality of frames of the phase difference of the target frequency band for each frame.
When at least one of the following holds, it is determined that the sound source exists at a position facing the first flat surface: the sound pressure difference average value is larger than a fifth threshold value, or the phase difference average value is larger than a sixth threshold value.
The fifth threshold value is the average of the sound pressure difference average value obtained when the sound source exists at a position facing the first flat surface and the sound pressure difference average value obtained when the sound source exists at a position facing the second flat surface.
The sixth threshold value is the average of the phase difference average value obtained when the sound source exists at a position facing the first flat surface and the phase difference average value obtained when the sound source exists at a position facing the second flat surface.
The sound source direction determination method of Appendix 14.
(Appendix 17)
A microphone installation unit internally provided with a first sound path, which has at one end a first opening opened in a first flat surface and through which sound propagates from the first opening, and a second sound path, which has at one end a second opening opened in a second flat surface intersecting the first flat surface and through which sound propagates from the second opening,
An omnidirectional first microphone installed at the other end of the first sound path, and
An omnidirectional second microphone installed at the other end of the second sound path, and
With a computer
To the computer of the sound source direction determination device including
The first sound pressure, which is the sound pressure of the first frequency component of the sound acquired by the first microphone, and the second sound pressure, which is the sound pressure of the first frequency component of the sound acquired by the second microphone. The difference in sound pressure between the two, and the first phase, which is the phase of the second frequency component of the sound acquired by the first microphone, and the phase of the second frequency component of the sound acquired by the second microphone. Determining the direction in which the sound source is located, based on at least one of the phase differences from the second phase.
A program for executing sound source direction determination processing.
(Appendix 18)
The difference in sound pressure is the average value of the sound pressure difference obtained by subtracting the logarithm of the power of the second sound pressure from the logarithm of the power of the first sound pressure.
The phase difference is the average value of the phase difference in the target frequency band.
When at least one of the following holds, it is determined that the sound source exists at a position facing the first flat surface: the average value of the sound pressure difference is larger than a positive first threshold value, or the average value of the phase difference is larger than a positive third threshold value.
The program according to Appendix 17.
(Appendix 19)
The difference in sound pressure is a sound pressure difference average value which is an average value of a plurality of frames of the sound pressure difference for each frame obtained by subtracting the logarithm of the power of the second sound pressure from the logarithm of the power of the first sound pressure.
The phase difference is a phase difference average value which is an average value of a plurality of frames of the phase difference of the target frequency band for each frame.
When at least one of the following holds, it is determined that the sound source exists at a position facing the first flat surface: the sound pressure difference average value is larger than a fifth threshold value, or the phase difference average value is larger than a sixth threshold value,
The fifth threshold value is the average of the sound pressure difference average value when the sound source is present at a position facing the first flat surface and the sound pressure difference average value when the sound source is present at a position facing the second flat surface,
The sixth threshold value is the average of the phase difference average value when the sound source is present at a position facing the first flat surface and the phase difference average value when the sound source is present at a position facing the second flat surface.
The program according to Appendix 17.

10 Sound source direction determination device
11 First microphone
11R First sound path
11O First opening
11K Bending portion
12 Second microphone
12R Second sound path
12O Second opening
13 Determination unit
14 Speech translation device
51 CPU
52 Primary storage unit
53 Secondary storage unit

Claims

A sound source direction determination device including:
A microphone installation unit that internally has a first sound path, which has at one end a first opening opened in a first flat surface and through which sound propagates from the first opening, and a second sound path, which has at one end a second opening opened in a second flat surface intersecting the first flat surface and through which sound propagates from the second opening,
An omnidirectional first microphone installed at the other end of the first sound path,
An omnidirectional second microphone installed at the other end of the second sound path, and
A determination unit that determines the direction in which the sound source exists based on at least one of: the sound pressure difference between a first sound pressure, which is the sound pressure of a first frequency component of the sound acquired by the first microphone, and a second sound pressure, which is the sound pressure of the first frequency component of the sound acquired by the second microphone; and the phase difference between a first phase, which is the phase of a second frequency component of the sound acquired by the first microphone, and a second phase, which is the phase of the second frequency component of the sound acquired by the second microphone.

The first frequency component is a high frequency component.
The sound source direction determination device according to claim 1.
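
As an illustration of the sound pressure cue, the following minimal sketch computes a per-frame sound pressure difference restricted to a high-frequency band, using the log-power subtraction that the later claims describe. The function names, the band limits, the base-10 logarithm, the small `eps` guard, and the averaging over the bins of the band are all assumptions made for this example; the claims do not fix them.

```python
import cmath
import math


def dft_bin(frame, k):
    """Return the k-th DFT coefficient of a real-valued frame (naive direct sum)."""
    n = len(frame)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))


def band_log_power_diff(frame1, frame2, sample_rate, f_lo, f_hi, eps=1e-12):
    """Average over the band [f_lo, f_hi] of log10(power1) - log10(power2).

    frame1 / frame2: samples from the first / second microphone (same length).
    eps avoids log(0) for silent bins; it is an assumption of this sketch.
    """
    if len(frame1) != len(frame2):
        raise ValueError("frames must have the same length")
    n = len(frame1)
    k_lo = math.ceil(f_lo * n / sample_rate)
    k_hi = min(math.floor(f_hi * n / sample_rate), n // 2)
    if k_lo > k_hi:
        raise ValueError("no DFT bin falls inside the requested band")
    diffs = []
    for k in range(k_lo, k_hi + 1):
        p1 = abs(dft_bin(frame1, k)) ** 2  # power at the first microphone
        p2 = abs(dft_bin(frame2, k)) ** 2  # power at the second microphone
        diffs.append(math.log10(p1 + eps) - math.log10(p2 + eps))
    return sum(diffs) / len(diffs)
```

A positive result means the first microphone received more high-band power than the second; a production implementation would normally use an FFT rather than the direct sum shown here.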

The first flat surface and the second flat surface are orthogonal to each other.
The area of the first flat surface is equal to or less than the first predetermined value, and the area of the second flat surface is larger than the first predetermined value.
The first sound path has a first diffracting portion that diffracts sound in the first opening, and has a second diffracting portion that is a bending portion that diffracts sound in the middle.
The second sound path has a third diffracting portion that diffracts sound in the second opening.
The sound source direction determination device according to claim 1 or 2.

The first flat surface and the second flat surface are orthogonal to each other.
The area of the first flat surface is equal to or less than the first predetermined value, and the area of the second flat surface is larger than the first predetermined value.
The first sound path has a first diffracting portion that diffracts sound in the first opening, and has a second diffracting portion that is a bending portion that diffracts sound in the middle.
The second sound path has a third diffracting portion that diffracts sound in the second opening, and has a fourth diffracting portion that is a bending portion that diffracts sound in the middle.
The sound source direction determination device according to claim 1 or 2.

The first flat surface and the second flat surface are orthogonal to each other.
The areas of the first flat surface and the second flat surface are larger than the first predetermined value.
The first sound path has a first diffracting portion that diffracts sound in the first opening.
The second sound path has a second diffracting portion that diffracts sound in the second opening.
The sound source direction determination device according to claim 1 or 2.

The difference in sound pressure is the average value of the sound pressure difference obtained by subtracting the logarithm of the power of the second sound pressure from the logarithm of the power of the first sound pressure.
The phase difference is the average value of the phase difference in the target frequency band.
When at least one of the following holds, it is determined that the sound source exists at a position facing the first flat surface: the average value of the sound pressure difference is larger than a positive first threshold value, or the average value of the phase difference is larger than a positive third threshold value.
The sound source direction determination device according to any one of claims 1 to 5.

When at least one of the following holds, it is determined that the sound source exists at a position facing the second flat surface: the average value of the sound pressure difference is smaller than a negative second threshold value, or the average value of the phase difference is smaller than a negative fourth threshold value.
The sound source direction determination device according to claim 6.
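
The two-sided threshold test of claims 6 and 7 can be sketched as follows. This is only an illustration: the function name, the string labels, the "undetermined" outcome when neither test fires, and the choice to check the first-surface condition before the second-surface condition are assumptions of the example, not something the claims specify.

```python
def judge_direction(a_pressure, a_phase, th1, th2, th3, th4):
    """Classify the source side from the two averaged cues.

    a_pressure: average sound pressure difference (log power 1 - log power 2).
    a_phase:    average phase difference in the target frequency band.
    th1, th3:   positive first and third thresholds (first flat surface).
    th2, th4:   negative second and fourth thresholds (second flat surface).
    """
    if th1 <= 0 or th3 <= 0:
        raise ValueError("first and third thresholds must be positive")
    if th2 >= 0 or th4 >= 0:
        raise ValueError("second and fourth thresholds must be negative")
    # At least one cue exceeds its positive threshold: first flat surface.
    if a_pressure > th1 or a_phase > th3:
        return "first"
    # At least one cue falls below its negative threshold: second flat surface.
    if a_pressure < th2 or a_phase < th4:
        return "second"
    # Assumption of this sketch: values between the thresholds are left undecided.
    return "undetermined"
```

For example, `judge_direction(1.0, 0.0, 0.5, -0.5, 0.3, -0.3)` returns `"first"` because the sound pressure cue alone exceeds its threshold.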

The sound source direction determination device according to claim 6 or 7, wherein the average value a_phase of the phase difference in the target frequency band is represented by the following equation (1).

where
phase[j] = atan(phase_im[j] / phase_re[j]),
phase_re[j] = re1[j] * re2[j] + im1[j] * im2[j],
phase_im[j] = im1[j] * re2[j] - re1[j] * im2[j],
C_n[j] = λ[j] / λ_c,
j is the index of a frequency band,
re1[j] is the real part of the spectrum of the first sound pressure in the jth frequency band,
re2[j] is the real part of the spectrum of the second sound pressure in the jth frequency band,
im1[j] is the imaginary part of the spectrum of the first sound pressure in the jth frequency band,
im2[j] is the imaginary part of the spectrum of the second sound pressure in the jth frequency band,
λ[j] is the wavelength of the sound in the jth frequency band,
λ_c is the wavelength of the sound at the reference frequency,
ee is the upper limit of the target frequency band, and
ss is the lower limit of the target frequency band.
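
The quantities defined above can be combined as in the following sketch. Because equation (1) itself is not reproduced in this text, the way the terms are combined here is an assumption: each phase[j] is multiplied by C_n[j] and the products are averaged over the bands ss to ee inclusive. The function name, the use of band centre frequencies in place of wavelengths (λ[j] / λ_c equals f_c / f[j] when the speed of sound is constant), and the guard for phase_re[j] = 0 are likewise choices made for the example.

```python
import math


def phase_difference_average(spec1, spec2, band_freqs, ss, ee, ref_freq):
    """Average of wavelength-normalised phase differences over bands ss..ee.

    spec1, spec2: complex spectra of the first and second microphone signals.
    band_freqs:   centre frequency of each band in Hz.
    ref_freq:     reference frequency in Hz (corresponds to wavelength λ_c).
    """
    total = 0.0
    for j in range(ss, ee + 1):
        re1, im1 = spec1[j].real, spec1[j].imag
        re2, im2 = spec2[j].real, spec2[j].imag
        phase_re = re1 * re2 + im1 * im2
        phase_im = im1 * re2 - re1 * im2
        if phase_re == 0.0:
            # atan(im / re) is undefined here; use the limiting value.
            phase = math.copysign(math.pi / 2, phase_im) if phase_im != 0.0 else 0.0
        else:
            phase = math.atan(phase_im / phase_re)
        c_n = ref_freq / band_freqs[j]  # equals λ[j] / λ_c
        total += phase * c_n
    # Assumption: plain mean over the (ee - ss + 1) bands in the target range.
    return total / (ee - ss + 1)
```

The normalisation makes a pure time delay between the microphones contribute the same value in every band, namely the phase that delay would produce at the reference frequency.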

The difference in sound pressure is a sound pressure difference average value which is an average value of a plurality of frames of the sound pressure difference for each frame obtained by subtracting the logarithm of the power of the second sound pressure from the logarithm of the power of the first sound pressure.
The phase difference is a phase difference average value which is an average value of a plurality of frames of the phase difference of the target frequency band for each frame.
When at least one of the following holds, it is determined that the sound source exists at a position facing the first flat surface: the sound pressure difference average value is larger than a fifth threshold value, or the phase difference average value is larger than a sixth threshold value,
The fifth threshold value is the average of the sound pressure difference average value when the sound source is present at a position facing the first flat surface and the sound pressure difference average value when the sound source is present at a position facing the second flat surface,
The sixth threshold value is the average of the phase difference average value when the sound source is present at a position facing the first flat surface and the phase difference average value when the sound source is present at a position facing the second flat surface.
The sound source direction determination device according to any one of claims 1 to 5.

When at least one of the following holds, it is determined that the sound source exists at a position facing the second flat surface: the sound pressure difference average value is equal to or less than the fifth threshold value, or the phase difference average value is equal to or less than the sixth threshold value.
The sound source direction determination device according to claim 9.
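
A minimal sketch of the multi-frame decision in claims 9 and 10, applied to a single cue: the per-frame values are averaged over several frames and the average is compared with one threshold. The same function could be used with per-frame sound pressure differences and the fifth threshold, or with per-frame phase differences and the sixth threshold. The function name and the return labels are hypothetical.

```python
def judge_by_average(frame_values, threshold):
    """Average a cue over several frames and compare it with a single threshold.

    frame_values: per-frame sound pressure differences or phase differences.
    threshold:    the fifth (pressure) or sixth (phase) threshold.
    Returns "first" if the average is larger than the threshold, else "second".
    """
    if not frame_values:
        raise ValueError("at least one frame is required")
    average = sum(frame_values) / len(frame_values)
    return "first" if average > threshold else "second"
```

Unlike a pair of positive and negative thresholds, a single threshold always yields one of the two surfaces, with the case of equality assigned to the second flat surface.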

The average of the sound pressure difference average value when the sound source is present at a position facing the first flat surface and the sound pressure difference average value when the sound source is present at a position facing the second flat surface is the average of a first mean value, which is the average of the sound pressure difference over a first speech section, and a second mean value, which is the average of the sound pressure difference over a second speech section, the difference between the first mean value and the second mean value exceeding a second predetermined value,
The average of the phase difference average value when the sound source is present at a position facing the first flat surface and the phase difference average value when the sound source is present at a position facing the second flat surface is the average of a third mean value, which is the average of the phase difference over a third speech section, and a fourth mean value, which is the average of the phase difference over a fourth speech section, the difference between the third mean value and the fourth mean value exceeding a third predetermined value.
The sound source direction determination device according to claim 9 or 10.
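
The threshold derivation in claim 11 can be sketched as follows: the means of a cue over two speech sections are computed, and their midpoint is adopted as the threshold only if the two means are sufficiently far apart. The function name, the use of the absolute difference, and returning `None` when the separation is insufficient are assumptions of this example.

```python
def calibrate_threshold(section_a, section_b, min_separation):
    """Derive a decision threshold from two speech sections of one cue.

    section_a, section_b: per-frame cue values (sound pressure differences or
                          phase differences) from two speech sections.
    min_separation:       the second (pressure) or third (phase) predetermined value.
    Returns the midpoint of the two section means, or None if the means are
    too close to be treated as coming from different surfaces.
    """
    if not section_a or not section_b:
        raise ValueError("both speech sections must contain at least one frame")
    mean_a = sum(section_a) / len(section_a)
    mean_b = sum(section_b) / len(section_b)
    # Assumption: the separation check uses the absolute difference of the means.
    if abs(mean_a - mean_b) <= min_separation:
        return None
    return (mean_a + mean_b) / 2.0
```

The separation check guards against setting a threshold from two sections that were in fact spoken from the same side.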

When it is determined that the sound source exists at a position facing the first flat surface, the signal corresponding to the sound is translated into a first language, and when it is determined that the sound source exists at a position facing the second flat surface, the signal corresponding to the sound is translated into a second language.
The sound source direction determination device according to any one of claims 1 to 11.
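
The routing described in claim 12 amounts to choosing a translation target from the judged direction. The sketch below shows only that selection step; the function name, the direction labels, and the language codes in the usage note are hypothetical, and the translation itself is outside the scope of the example.

```python
def select_target_language(direction, first_language, second_language):
    """Pick the translation target language from the judged source direction.

    direction: "first" if the source faces the first flat surface,
               "second" if it faces the second flat surface.
    Returns None for any other value, leaving the utterance untranslated.
    """
    if direction == "first":
        return first_language
    if direction == "second":
        return second_language
    return None
```

For instance, `select_target_language("first", "en", "ja")` returns `"en"`, so speech judged to come from the side facing the first flat surface would be passed to an English-target translator.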

A sound source direction determination method executed by a computer of a sound source direction determination device, the sound source direction determination device including:
A microphone installation unit that internally has a first sound path, which has at one end a first opening opened in a first flat surface and through which sound propagates from the first opening, and a second sound path, which has at one end a second opening opened in a second flat surface intersecting the first flat surface and through which sound propagates from the second opening,
An omnidirectional first microphone installed at the other end of the first sound path,
An omnidirectional second microphone installed at the other end of the second sound path, and
The computer,
The method including determining, by the computer, the direction in which the sound source exists based on at least one of: the sound pressure difference between a first sound pressure, which is the sound pressure of a first frequency component of the sound acquired by the first microphone, and a second sound pressure, which is the sound pressure of the first frequency component of the sound acquired by the second microphone; and the phase difference between a first phase, which is the phase of a second frequency component of the sound acquired by the first microphone, and a second phase, which is the phase of the second frequency component of the sound acquired by the second microphone.

A program for causing a computer of a sound source direction determination device to execute sound source direction determination processing, the sound source direction determination device including:
A microphone installation unit that internally has a first sound path, which has at one end a first opening opened in a first flat surface and through which sound propagates from the first opening, and a second sound path, which has at one end a second opening opened in a second flat surface intersecting the first flat surface and through which sound propagates from the second opening,
An omnidirectional first microphone installed at the other end of the first sound path,
An omnidirectional second microphone installed at the other end of the second sound path, and
The computer,
The processing including determining the direction in which the sound source exists based on at least one of: the sound pressure difference between a first sound pressure, which is the sound pressure of a first frequency component of the sound acquired by the first microphone, and a second sound pressure, which is the sound pressure of the first frequency component of the sound acquired by the second microphone; and the phase difference between a first phase, which is the phase of a second frequency component of the sound acquired by the first microphone, and a second phase, which is the phase of the second frequency component of the sound acquired by the second microphone.