JP5226989B2

JP5226989B2 - POSITION ESTIMATION DEVICE, ITS METHOD, ITS PROGRAM, AND RECORDING MEDIUM

Info

Publication number: JP5226989B2
Application number: JP2007217924A
Authority: JP
Inventors: 和則小林; 賢一古家; 章俊片岡
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2007-08-24
Filing date: 2007-08-24
Publication date: 2013-07-03
Anticipated expiration: 2027-08-24
Also published as: JP2009052928A

Description

The present invention relates to a position estimation device, a method thereof, a program thereof, and a recording medium thereof, for use in, for example, noise-suppressed sound collection using a plurality of sound collection means and television camera control that automatically tracks a speaker's position.

FIG. 1 shows a functional configuration example of a conventional position estimation device 10. The position estimation device 10 comprises an inter-sound-collection delay time difference measurement unit 13 and an estimation unit 161 and, where necessary, also includes a sound collection means distance input unit 61. The conventional position estimation device 10 estimates the positions of N (N is an integer of 1 or more) signal sources 17_n (n = 1, ..., N) and the positions of M (M is an integer of 1 or more) sound collection means 11_m (m = 1, ..., M) that pick up the signal source signals from the signal sources. Here, a signal source is, for example, a sound source. A sound source is, for example, a person speaking in a video conference; a signal source signal is, for example, a sound source signal; and a sound collection means is, for example, a microphone that picks up sound. In the following description, the sound collection means 11_m is referred to as the microphone 11_m. When a plurality of microphones are fixed to a frame, the distances between those fixed microphones are known. The distance between two microphones is called a distance between sound collection means. When Q (Q is an integer of 1 or more) such distances are known, the sound collection means distance input unit 61 is provided, and the distances are input to it in advance. The following description covers the case where Q distances between sound collection means are known and the sound collection means distance input unit 61 is provided.

First, the inter-sound-collection delay time difference measurement unit 13 measures the measured inter-sound-collection delay time difference, which is the delay time difference between the M-channel collected signals picked up by the M microphones 11_m. For the sound source signal emitted from the sound source 17_n, the delay time difference measured by the unit 13 between the i-th (i = 1, ..., M-1) microphone 11_i and the j-th (j = 1, ..., M) microphone 11_j is denoted τ_ijn. The unit 13 obtains τ_ijn for all N sound sources. How the unit 13 obtains τ_ijn is described in detail in [Best Mode for Carrying Out the Invention].

The estimation unit 161 estimates the positions of the M microphones and the N sound sources. Let the estimated microphone positions to be obtained be (x^_m, y^_m, z^_m) (m = 1, ..., M) and the estimated sound source positions to be obtained be (X^_n, Y^_n, Z^_n) (n = 1, ..., N). Because the positions of all microphones and all sound sources are unknown, a coordinate reference is established. Here, coordinates are defined with the first microphone 11_1 at the origin (0, 0, 0) and the plane through the second microphone 11_2 and the third microphone 11_3 as the xy plane. With this choice, x^_1 = 0, y^_1 = 0, z^_1 = 0, y^_2 = 0, z^_2 = 0, and z^_3 = 0, and these can be treated as constants. The estimated inter-sound-collection delay time difference τ^_ijn(P), computed from the estimated microphone positions and the estimated sound source positions, is expressed as follows.
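The image of Equation (1) is not reproduced in this text. A form consistent with the definitions above would be the following, assuming the delay is taken as the arrival time at microphone 11_i minus the arrival time at microphone 11_j (this sign convention is an assumption, not confirmed by the source):

```latex
\hat{\tau}_{ijn}(P) = \frac{1}{c}\Bigl(
  \sqrt{(\hat{x}_i-\hat{X}_n)^2+(\hat{y}_i-\hat{Y}_n)^2+(\hat{z}_i-\hat{Z}_n)^2}
  -\sqrt{(\hat{x}_j-\hat{X}_n)^2+(\hat{y}_j-\hat{Y}_n)^2+(\hat{z}_j-\hat{Z}_n)^2}
\Bigr) \tag{1}
```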

Here, c is the speed of sound, and P is the vector of estimated microphone and sound source positions, having 3M + 3N - 6 elements:
P = (x^_2, ..., x^_M, y^_3, ..., y^_M, z^_4, ..., z^_M,
X^_1, ..., X^_N, Y^_1, ..., Y^_N, Z^_1, ..., Z^_N).
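The element count 3M + 3N - 6 follows from fixing six microphone coordinates as constants. A minimal sketch of that layout is given below; the helper name `unpack_positions` is hypothetical, and the sketch assumes M is at least 3 so that all six fixed coordinates exist.

```python
import numpy as np

def unpack_positions(P, M, N):
    """Split P (length 3M + 3N - 6) into microphone and source coordinates.

    Hypothetical helper; assumes M >= 3. The fixed coordinates
    x^_1, y^_1, z^_1, y^_2, z^_2, z^_3 stay at 0.
    """
    P = np.asarray(P, dtype=float)
    assert P.size == 3 * M + 3 * N - 6
    n_x, n_y, n_z = M - 1, M - 2, M - 3
    mics = np.zeros((M, 3))
    mics[1:, 0] = P[:n_x]                        # x^_2 .. x^_M
    mics[2:, 1] = P[n_x:n_x + n_y]               # y^_3 .. y^_M
    mics[3:, 2] = P[n_x + n_y:n_x + n_y + n_z]   # z^_4 .. z^_M
    # Remaining 3N entries: X^_1..X^_N, then Y^_1..Y^_N, then Z^_1..Z^_N
    srcs = P[n_x + n_y + n_z:].reshape(3, N).T
    return mics, srcs
```

For example, with M = 4 and N = 2 the vector has 12 elements, and the first microphone row of the returned array is always the origin.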

Note that in equations rendered as images (for example, Equation (1)), "^" is placed directly above a symbol (for example, x_i), whereas in equations written as text "^" is placed to the upper right of the symbol; the two notations are equivalent.

The measured inter-sound-collection delay time difference τ_ijn obtained by the unit 13 and the estimated inter-sound-collection delay time difference τ^_ijn(P) of Equation (1) are each multiplied by the speed of sound c to convert them into distances, denoted the measured distance between sound collection means d_ijn and the estimated distance between sound collection means d^_ijn(P), respectively. The sum e'(P) of the squared errors between d_ijn and d^_ijn(P) is given by Equation (2). Note that d^_ijn(P) may also be computed directly rather than by multiplying τ^_ijn(P) by c.
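The image of Equation (2) is not reproduced in this text. Under the definitions above, a consistent form would be the following, where the summation range over microphone pairs (j running from i + 1 to M) is an assumption:

```latex
\begin{align}
d_{ijn} &= c\,\tau_{ijn}, \qquad \hat{d}_{ijn}(P) = c\,\hat{\tau}_{ijn}(P) \notag\\
e'(P) &= \sum_{n=1}^{N}\sum_{i=1}^{M-1}\sum_{j=i+1}^{M}
         \bigl(d_{ijn}-\hat{d}_{ijn}(P)\bigr)^2 \tag{2}
\end{align}
```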

Also, as described above, the distances between sound collection means are input to the sound collection means distance input unit 61. Here, the measured distance between the microphone 11_F(q), whose microphone number is F(q), and the microphone 11_G(q), whose microphone number is G(q), is denoted D_{F(q)G(q)}. D_{F(q)G(q)} may be measured, for example, by hand. The distance between sound collection means computed from the estimated position vector P is denoted D^_{F(q)G(q)}(P). The relationship between D_{F(q)G(q)} and D^_{F(q)G(q)}(P) is expressed by Equation (3) below.

To estimate the microphone positions and the sound source positions, Equation (2) is minimized under the constraint of Equation (3). Rewriting Equations (2) and (3) as a single minimization problem gives Equation (4).
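The images of Equations (3) and (4) are not reproduced in this text. One reading consistent with the surrounding description is an equality constraint on the known distances, folded into the objective as a quadratic penalty weighted by λ_q. The penalty form shown here is an assumption:

```latex
\begin{align}
D_{F(q)G(q)} &= \hat{D}_{F(q)G(q)}(P), \qquad q = 1,\dots,Q \tag{3}\\
\hat{D}_{F(q)G(q)}(P) &= \sqrt{(\hat{x}_{F(q)}-\hat{x}_{G(q)})^2
   +(\hat{y}_{F(q)}-\hat{y}_{G(q)})^2+(\hat{z}_{F(q)}-\hat{z}_{G(q)})^2} \notag\\
e(P) &= e'(P) + \sum_{q=1}^{Q}\lambda_q
   \bigl(D_{F(q)G(q)}-\hat{D}_{F(q)G(q)}(P)\bigr)^2 \tag{4}
\end{align}
```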

Here, λ_q is a weighting coefficient set in advance. The larger λ_q is, the more strictly the resulting P satisfies the constraint of Equation (3). By finding the P that minimizes e(P), the estimated sound source positions and estimated microphone positions that minimize the error between the measured delay time differences τ_ijn and the estimated delay time differences τ^_ijn(P) can be obtained. However, Equation (4) is a system of nonlinear equations and is difficult to solve analytically, so it is solved here numerically by successive correction. To find the estimated microphone positions (x^_a, y^_a, z^_a) and the estimated sound source positions (X^_a, Y^_a, Z^_a) that minimize Equation (4), the gradient at the current point is computed, the estimated microphone and sound source positions are corrected in the direction that reduces the error e(P), and the point where the gradient becomes 0 is found. The correction formula is therefore as shown in Equation (5).

By the method described above, the microphone positions (x^_a, y^_a, z^_a) and the sound source positions (X^_a, Y^_a, Z^_a) can be estimated.
JP 2007-81455 A

In the conventional position estimation device 10, when the microphones 11_m are clustered in one place, the result is highly sensitive to the estimation errors arising in the inter-sound-collection delay time difference measurement unit 13, and the positions of the sound collection means and of the signal sources (sound sources) are badly misestimated. This is because clustered microphones mean that d_ijn and d^_ijn(P) are small, so the gradient in Equation (5) becomes 0 at an incorrect point.

An object of the present invention is to provide a position estimation device, a method thereof, a program thereof, and a recording medium thereof that correctly estimate the positions of the sound collection means and the signal sources even when the microphones are clustered in one place.

The present invention is a position estimation device comprising a signal generation unit, an inter-sound-collection delay time difference measurement unit, a collection-emission delay time difference measurement unit, and an estimation unit. The signal generation unit generates K-channel signals (K is an integer of 1 or more). K emission means each take one of the K-channel signals output by the signal generation unit as an input signal and emit it as an emission signal, and M (M is an integer of 2 or more) sound collection means pick up these emission signals, together with the signal source signals emitted from N (N is an integer of 1 or more) signal sources, as M-channel collected signals. The inter-sound-collection delay time difference measurement unit uses the M-channel collected signals to measure the measured inter-sound-collection delay time differences, which are the delay time differences between the M-channel collected signals. The collection-emission delay time difference measurement unit uses the K-channel input signals fed to the K emission means and the M-channel collected signals to measure the measured collection-emission delay time differences, which are the delay time differences between each of the K-channel emission signals and each of the M-channel collected signals. The estimation unit uses the measured inter-sound-collection delay time differences and the measured collection-emission delay time differences to estimate the positions of the sound collection means, the emission means, and the signal sources.

With the above configuration, K emission means are additionally provided, and the measured collection-emission delay time differences, which are the delay time differences between each of the K-channel emission signals and each of the M-channel collected signals, are measured using the K-channel input signals fed to the K emission means and the M-channel collected signals. The measured collection-emission delay time differences are then used in addition to the inter-sound-collection delay time differences. This increases the amount of information available for estimating the positions of the signal sources and the sound collection means. Consequently, the positions of the sound collection means and the signal sources can be estimated with high accuracy in any environment, and the positions of the loudspeaker means can be estimated as well.

The best mode for carrying out the invention is described below. Components having the same function and steps performing the same processing are given the same reference numbers, and duplicate description is omitted.

FIG. 2 shows a functional configuration example of the position estimation device 50-1, and FIG. 3 shows its main processing flow. In the following description, the signal source is a sound source, the signal source signal is a sound source signal, the sound collection means is a microphone, and the loudspeaker means is a speaker. There are K speakers (K is an integer of 1 or more), denoted 12_k (k = 1, ..., K); M microphones (M is an integer of 1 or more), denoted 11_m (m = 1, ..., M); and N sound sources (N is an integer of 1 or more), denoted 17_n (n = 1, ..., N). The position estimation device 50-1 comprises a signal generation unit 14, an inter-sound-collection delay time difference measurement unit 13, a collection-emission delay time difference measurement unit 15, and an estimation unit 16.

The signal generation unit 14 generates signals for measuring the distance between the speaker 12_k and the microphone 11_m. The number of channels of these signals equals the number of speakers, K. The signals from the signal generation unit 14 are generated channel by channel in turn, so that only one channel is active at any given time. The signal type is, for example, white noise, a TSP (time-stretched pulse) signal, or an M-sequence signal. Each signal generated by the signal generation unit 14 is input to the speaker 12_k as an input signal E_k and is emitted from the speaker 12_k as an emission signal C_k.
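The channel-by-channel generation described above can be sketched as follows. This is only an illustration using white-noise bursts; the function name `sequential_noise` and the parameters `burst_len` and `seed` are hypothetical and do not come from the source.

```python
import numpy as np

def sequential_noise(K, burst_len, seed=0):
    """Return a (K, K * burst_len) array of input signals E_k.

    Channel k carries a white-noise burst only in its own time slot,
    so at most one channel is non-zero at any sample.
    """
    rng = np.random.default_rng(seed)
    E = np.zeros((K, K * burst_len))
    for k in range(K):
        start = k * burst_len
        E[k, start:start + burst_len] = rng.standard_normal(burst_len)
    return E
```

Because the bursts do not overlap in time, each collected signal segment can be attributed to exactly one speaker when the delays are measured.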

The M microphones 11_m pick up the emission signals C_k from the speakers 12_k and the sound source signals A_n from the sound sources 17_n as M-channel collected signals B_m. The inter-sound-collection delay time difference measurement unit 13 uses the M-channel collected signals B_m to measure the measured inter-sound-collection delay time difference τ_ijn, which is the delay time difference between the M-channel collected signals (step S2). Here, τ_ijn is the delay time difference, for the sound source signal A_n from the n-th sound source 17_n, between the i-th (i = 1, ..., M-1) microphone 11_i and the j-th (j = 1, ..., M) microphone 11_j. An example of how τ_ijn is measured follows.

FIG. 4 shows a functional configuration example of the inter-sound-collection delay time difference measurement unit 13. The unit 13 comprises M FFT means 21_m, M whitening means 22_m, a sound collection means pair selection means 23, a conjugation means 40, a multiplication means 24, an IFFT means 25, and a maximum peak detection means 26. Each of the M FFT means 21_m is connected to the corresponding microphone 11_m. The collected signal B_m from the microphone 11_m is input to the corresponding FFT means 21_m, which converts the time-domain collected signal B_m into a frequency-domain collected signal B'_m. The conversion to the frequency domain may use, for example, the well-known Fourier transform. The whitening means 22_m whitens (flattens) the collected signal B'_m to generate a whitened collected signal WB'_m. Next, the sound collection means pair selection means 23 operates its switches to select two of the whitened collected signals WB'_m. The switches are operated so that the following processing is carried out for every pair of microphones. Let the two selected whitened collected signals be WB'_i and WB'_j. Of these, WB'_j is conjugated by the conjugation means 40. The multiplication means 24 then multiplies the conjugated whitened collected signal WB'^*_j by the unconjugated whitened collected signal WB'_i, frequency by frequency, to obtain the cross spectrum. The output of the multiplication means 24 is converted to the time domain by the IFFT means 25, which yields the whitened cross-correlation. Finally, the maximum peak detection means 26 detects the maximum peak of the whitened cross-correlation from the IFFT means 25, and the time lag at that peak is output as the measured inter-sound-collection delay time difference τ_ijn.
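The chain of FIG. 4 (FFT, whitening, conjugate multiplication, IFFT, peak detection) can be sketched for one microphone pair as below. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `whitened_delay`, the zero-padding length, and the small constant `eps` that guards the division are all assumptions, and whitening is taken to mean normalizing each spectrum to unit magnitude.

```python
import numpy as np

def whitened_delay(b_i, b_j, fs):
    """Delay of b_i relative to b_j in seconds, taken from the peak of
    the whitened cross-correlation (positive if b_i arrives later)."""
    n = len(b_i) + len(b_j)          # zero-pad to avoid circular wrap-around
    B_i = np.fft.rfft(b_i, n)        # FFT means
    B_j = np.fft.rfft(b_j, n)
    eps = 1e-12                      # assumed guard against division by zero
    W_i = B_i / (np.abs(B_i) + eps)  # whitening: flatten the magnitude
    W_j = B_j / (np.abs(B_j) + eps)
    cross = W_i * np.conj(W_j)       # conjugate W_j, multiply per frequency
    corr = np.fft.irfft(cross, n)    # IFFT: whitened cross-correlation
    max_shift = n // 2
    # Reorder so that index 0 corresponds to lag -max_shift
    corr = np.concatenate((corr[-max_shift:], corr[:max_shift + 1]))
    lag = int(np.argmax(corr)) - max_shift   # maximum peak detection
    return lag / fs
```

Calling the function on every microphone pair (i, j) while a single source n is active would give the full set of τ_ijn values.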

The collection-emission delay time difference measurement unit 15 uses the K-channel input signals E_k fed to the K speakers 12_k and the M-channel collected signals B_m to measure the measured collection-emission delay time difference δ_km, which is the delay time difference between each of the K-channel emission signals C_k and each of the M-channel collected signals B_m (step S4). In other words, δ_km is the time that sound takes to travel from the k-th speaker 12_k to the m-th microphone 11_m. An example of how δ_km is measured follows.

FIG. 5 shows a functional configuration example of the collection-emission delay time difference measurement unit 15. The unit 15 comprises M FFT means 31_m (m = 1, ..., M), M whitening means 32_m (m = 1, ..., M), a sound collection means selection means 33, K FFT means 34_k (k = 1, ..., K), K whitening means 35_k (k = 1, ..., K), an emission means selection means 36, a conjugation means 40, a multiplication means 37, an IFFT means 38, and a maximum peak detection means 39.

The collected signal B_m from the microphone 11_m is input to the corresponding FFT means 31_m. The FFT means 31_m converts the time-domain collected signal B_m into a frequency-domain collected signal B'_m by, for example, a Fourier transform. The whitening means 32_m whitens (flattens) the frequency-domain collected signal B'_m to generate a whitened collected signal WB'_m.

Meanwhile, the input signal E_k fed to the speaker 12_k is input to the corresponding FFT means 34_k. The FFT means 34_k converts the time-domain input signal E_k into a frequency-domain input signal E'_k by, for example, a Fourier transform. The whitening means 35_k whitens (flattens) the frequency-domain input signal E'_k to generate a whitened input signal WD'_k. Next, the sound collection means selection means 33 selects one of the outputs of the whitening means 32_m, and the emission means selection means 36 selects one of the outputs of the whitening means 35_k. The switches of the sound collection means selection means 33 and the emission means selection means 36 are operated so that the following processing is carried out for every combination of the M microphones 11_m and the K speakers 12_k.
The whitened collected signal WB'_m selected by the sound collection means selection means 33 is conjugated by the conjugation means 40 to generate WB'^*_m. The multiplication means 37 multiplies the conjugated whitened collected signal WB'^*_m by the unconjugated whitened input signal WD'_k, frequency by frequency, to obtain the cross spectrum. The IFFT means 38 returns the output of the multiplication means 37 from the frequency domain to the time domain by, for example, an inverse Fourier transform, to obtain the whitened cross-correlation. The maximum peak detection means 39 detects the maximum peak of the whitened cross-correlation from the IFFT means 38 and outputs the time lag at that peak as the measured collection-emission delay time difference δ_km.

Returning to FIG. 2, the estimation unit 16 uses the measured inter-sound-collection delay time differences τ_ijn from the unit 13 and the measured collection-emission delay time differences δ_km from the unit 15 to estimate the positions of the microphones 11_m, the sound sources 17_n, and the speakers 12_k, which is the purpose of the position estimation device 50-1 (step S6). Let the estimated position of the microphone 11_m be (x^_m, y^_m, z^_m), the estimated position of the sound source 17_n be (X^_n, Y^_n, Z^_n), and the estimated position of the speaker 12_k be (X'^_k, Y'^_k, Z'^_k). Because the positions of all microphones, all sound sources, and all speakers are unknown, a coordinate reference is established. Here, coordinates are defined with the first microphone 11_1 at the origin (0, 0, 0) and the plane through the second microphone 11_2 and the third microphone 11_3 as the xy plane. With this choice, x^_1 = 0, y^_1 = 0, z^_1 = 0, y^_2 = 0, z^_2 = 0, and z^_3 = 0, and these can be treated as constants. The estimated inter-sound-collection delay time difference τ^_ijn(P), computed from the estimated microphone positions and the estimated sound source positions, is expressed as follows.

Here, c is the speed of sound, and P is the vector of estimated microphone, sound source, and speaker positions, having 3M + 3N + 3K - 6 elements:
P = (x^_2, ..., x^_M, y^_3, ..., y^_M, z^_4, ..., z^_M,
X^_1, ..., X^_N, Y^_1, ..., Y^_N, Z^_1, ..., Z^_N,
X'^_1, ..., X'^_K, Y'^_1, ..., Y'^_K, Z'^_1, ..., Z'^_K).

Next, the estimated collection-emission delay time difference δ^_km(P), computed from the estimated speaker positions and the estimated microphone positions, is given by Equation (21) below.
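The image of Equation (21) is not reproduced in this text. Since δ_km is described as the travel time of sound from the k-th speaker to the m-th microphone, a form consistent with that description would be:

```latex
\hat{\delta}_{km}(P) = \frac{1}{c}
  \sqrt{(\hat{X}'_k-\hat{x}_m)^2+(\hat{Y}'_k-\hat{y}_m)^2+(\hat{Z}'_k-\hat{z}_m)^2}
  \tag{21}
```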

The measured inter-sound-collection delay time difference τ_ijn and the estimated inter-sound-collection delay time difference τ^_ijn(P), each multiplied by the speed of sound c, are denoted the measured distance between sound collection means d_ijn and the estimated distance between sound collection means d^_ijn(P), respectively. The sum e'(P) of the squared errors between d_ijn and d^_ijn(P) is then given by Equation (22) below.

Next, the measured collection-emission delay time difference δ_km and the estimated collection-emission delay time difference δ^_km(P), each multiplied by the speed of sound c to convert it into a distance, are denoted the measured collection-emission distance L_km and the estimated collection-emission distance L^_km(P), respectively. The sum e''(P) of the squared errors between L_km and L^_km(P) is given by Equation (23) below. Note that d^_ijn(P) may also be computed directly rather than by multiplying τ^_ijn(P) by c.
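The images of Equations (22) and (23) are not reproduced in this text. Forms consistent with the definitions above would be the following; the summation range over microphone pairs in Equation (22) is an assumption:

```latex
\begin{align}
e'(P) &= \sum_{n=1}^{N}\sum_{i=1}^{M-1}\sum_{j=i+1}^{M}
         \bigl(d_{ijn}-\hat{d}_{ijn}(P)\bigr)^2 \tag{22}\\
L_{km} &= c\,\delta_{km}, \qquad \hat{L}_{km}(P) = c\,\hat{\delta}_{km}(P) \notag\\
e''(P) &= \sum_{k=1}^{K}\sum_{m=1}^{M}
         \bigl(L_{km}-\hat{L}_{km}(P)\bigr)^2 \tag{23}
\end{align}
```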

To estimate the positions of the microphones, the sound sources, and the speakers, Equations (22) and (23) are combined with a weight and replaced by a single minimization problem. The result is Equation (24) below.

Here, β is a weighting coefficient set in advance. The larger β is, the greater the weight given to minimizing the error of Equation (23). When the measurement error of L_km is smaller than that of d_ijn, setting β to a value greater than 1 places more weight on the more accurate values and enables more accurate position estimation.
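The image of Equation (24) is not reproduced in this text. From the description of β, it is presumably of the form e_1(P) = e'(P) + β e''(P). A minimal sketch of evaluating such a combined error is given below; the function name `combined_error`, the array layouts, and the sign convention of d^_ijn (distance to microphone i minus distance to microphone j) are assumptions.

```python
import numpy as np

def combined_error(mics, srcs, spks, d_meas, L_meas, beta=1.0):
    """Assumed form e_1 = e' + beta * e'' for candidate positions.

    mics: (M, 3), srcs: (N, 3), spks: (K, 3) candidate positions
    d_meas[i, j, n]: c * tau_ijn (only entries with i < j are used)
    L_meas[k, m]:    c * delta_km
    """
    # Source-to-microphone ranges, shape (M, N)
    r = np.linalg.norm(mics[:, None, :] - srcs[None, :, :], axis=2)
    # Estimated range differences d^_ijn, shape (M, M, N)
    d_est = r[:, None, :] - r[None, :, :]
    i, j = np.triu_indices(len(mics), k=1)          # all pairs with i < j
    e_prime = np.sum((d_meas[i, j, :] - d_est[i, j, :]) ** 2)
    # Estimated speaker-to-microphone distances L^_km, shape (K, M)
    L_est = np.linalg.norm(spks[:, None, :] - mics[None, :, :], axis=2)
    e_dprime = np.sum((L_meas - L_est) ** 2)
    return e_prime + beta * e_dprime
```

With error-free measurements the function returns 0 at the true positions, and increasing `beta` scales only the speaker-to-microphone term.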

By finding the solution that minimizes e_1(P) in Equation (24), the positions of the microphones 11_m, the sound sources 17_n, and the speakers 12_k that minimize the error between d_ijn and d^_ijn(P) and the error between L_km and L^_km(P) can be obtained. However, Equation (24) is a system of nonlinear equations and is difficult to solve analytically, so it is solved here numerically by successive correction. To find the estimated sound collection means positions (x^_a, y^_a, z^_a) (a = 1, ..., M), the estimated sound source positions (X^_b, Y^_b, Z^_b) (b = 1, ..., N), and the estimated speaker positions (X'^_c, Y'^_c, Z'^_c) (c = 1, ..., K; this index is distinct from the speed of sound c) that minimize Equation (24), the gradient at the current point is computed, the estimated positions of the sound collection means, the sound sources, and the speakers are corrected in the direction that reduces the error e_1(P), and the point where the gradient becomes 0 is found. The correction formula is therefore as shown in Equation (25). In successive correction, however, the gradient never becomes exactly 0, so correction of the estimated positions stops once the gradient falls to or below a sufficiently small threshold set in advance. The threshold is discussed later.
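The image of Equation (25) is not reproduced in this text. The description (move against the gradient until it is nearly 0) matches a steepest-descent update, which in vector form could be written as follows. The step size α and the threshold symbol ε are assumed names that do not appear in the source:

```latex
\begin{align}
P^{(u+1)} &= P^{(u)} - \alpha\,\nabla_P\, e_1(P)\Big|_{P=P^{(u)}} \tag{25}\\
\text{stop when}\quad &\bigl\lVert \nabla_P\, e_1\bigl(P^{(u)}\bigr) \bigr\rVert \le \varepsilon \notag
\end{align}
```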

式（２５）を用いた位置推定処理の詳細を説明する。図６に推定部１６の機能構成例を示し、図７に推定部１６の主な処理の流れを示す。図６の例では、推定部１６は、更新手段４１、初期値設定手段４２、信号源位置記憶手段４３、収音手段位置記憶手段４４、放出手段位置記憶手段４５、判定手段４６により構成されている。以下の説明では、ｕ回修正後の推定されるマイクロホンの位置をＡ（ｕ）＝（ｘ＾_ａ（ｕ）、ｙ＾_ａ（ｕ）、ｚ＾_ａ（ｕ））とし、推定される音源の位置をＢ（ｕ）＝（Ｘ＾_ｂ（ｕ）、Ｙ＾_ｂ（ｕ）、Ｚ＾_ｂ（ｕ））とし、推定されるスピーカの位置をＣ（ｕ）＝（Ｘ’＾_ｃ（ｕ）、Ｙ’＾_ｃ（ｕ）、Ｚ’＾_ｃ（ｕ））とする。

Details of the position estimation processing using equation (25) will now be described. FIG. 6 shows an example of the functional configuration of the estimation unit 16, and FIG. 7 shows its main processing flow. In the example of FIG. 6, the estimation unit 16 comprises update means 41, initial value setting means 42, signal source position storage means 43, sound collection means position storage means 44, emission means position storage means 45, and determination means 46. In the following description, the estimated microphone positions after u corrections are written A(u) = (x^_a(u), y^_a(u), z^_a(u)), the estimated sound source positions B(u) = (X^_b(u), Y^_b(u), Z^_b(u)), and the estimated speaker positions C(u) = (X'^_c(u), Y'^_c(u), Z'^_c(u)).

まず、更新手段４１は、測定収音放出間遅延時間差δ_ｋｍと、測定収音間遅延時間差τ_ｉｊｎとを読み込む（ステップＳ１００）。そして、初期値設定手段４２は、推定収音手段位置の初期値、推定音源位置の初期値、推定放出手段位置の初期値、つまり、０回修正後のそれぞれの値Ａ（０）＝（ｘ＾_ａ（０）、ｙ＾_ａ（０）、ｚ＾_ａ（０））、Ｂ（０）＝（Ｘ＾_ｂ（０）、Ｙ＾_ｂ（０）、Ｚ＾_ｂ（０））、Ｃ（０）＝（Ｘ’＾_ｃ（０）、Ｙ’＾_ｃ（０）、Ｚ’＾_ｃ（０））を設定する。これらの初期値は任意の値でよい。次に、Ａ（０）は収音手段位置記憶手段４４に、Ｂ（０）は信号源位置記憶手段に、Ｃ（０）は放出手段位置記憶手段４５に、記憶される。そして、式（２５）により、Ａ（０）、Ｂ（０）、Ｃ（０）、δ_ｋｍ、τ_ｉｊｎ、とを用いて、ベクトルＰ、つまり、Ｎ個の信号源の位置、Ｍ個の収音手段の位置、Ｋ個の放出手段の位置を式（２５）により更新する（ステップＳ１０４）。更新後の推定収音手段位置、推定音源位置、推定音源位置は更新の都度、収音手段位置記憶手段４４に、信号源位置記憶手段４３に、放出手段位置記憶手段４５に、記憶される。 First, the update means 41 reads the measured sound collection/emission delay time differences δ_km and the measured inter-sound-collection delay time differences τ_ijn (step S100). The initial value setting means 42 then sets the initial values of the estimated sound collection means positions, the estimated sound source positions, and the estimated emission means positions, that is, the values after zero corrections: A(0) = (x^_a(0), y^_a(0), z^_a(0)), B(0) = (X^_b(0), Y^_b(0), Z^_b(0)), and C(0) = (X'^_c(0), Y'^_c(0), Z'^_c(0)). These initial values may be arbitrary. Next, A(0) is stored in the sound collection means position storage means 44, B(0) in the signal source position storage means 43, and C(0) in the emission means position storage means 45. Then, using A(0), B(0), C(0), δ_km, and τ_ijn, the vector P, that is, the positions of the N signal sources, the M sound collection means, and the K emission means, is updated by equation (25) (step S104). Each time they are updated, the estimated sound collection means positions, estimated sound source positions, and estimated emission means positions are stored in the sound collection means position storage means 44, the signal source position storage means 43, and the emission means position storage means 45, respectively.

また、逐次修正では、勾配が完全に０にはならない。従って、判定手段４６が、これら更新が終了したと判定すると（ステップＳ１０６）、例えば、更新停止信号を生成し、更新処理を停止させる。以下に、判定手段４６の判定処理の一例を説明する。 In addition, the gradient is not completely zero in the successive correction. Accordingly, when the determination unit 46 determines that these updates have been completed (step S106), for example, an update stop signal is generated to stop the update process. Below, an example of the determination process of the determination means 46 is demonstrated.

判定手段４６が、更新手段４１の更新処理に用いる式（２５）中の更新量ｇｒａｄｅ_１（Ｐ）が十分に小さいか否かを判定し、十分小さければ、更新が終了したと判定する。例えば、更新量ｇｒａｄｅ_１（Ｐ）の総和Σ_Ｐ│ｇｒａｄｅ_１（Ｐ）│と、事前に設定された閾値Ｔｃとを比較し、Ｔｃ＞Σ_Ｐ│ｇｒａｄｅ_１（Ｐ）│になれば、判定手段４６が更新が終了したと判定する。ここで、Σ_Ｐ│ｇｒａｄｅ_１（Ｐ）│は、ベクトルＰで規定される全ての位置についての「ｇｒａｄｅ_１（Ｐ）」の合計である。その他、Ｔｃ≧Σ_Ｐ│ｇｒａｄｅ_１（Ｐ）│になった場合や、修正回数ｕが予め定められた閾値以上になった場合に、判定手段４６が更新が終了したと判定するようにしても良い。そして、修正後の音源の位置、マイクロホンの位置、スピーカの位置が推定された音源の位置、推定されたマイクロホンの位置、推定されたスピーカの位置として、推定部１６から出力される（ステップＳ１０８）。 The determination means 46 determines whether the update amount grad e_1(P) in equation (25), which the update means 41 uses for the update process, is sufficiently small, and if so determines that the update is complete. For example, the sum Σ_P |grad e_1(P)| of the update amounts is compared with a preset threshold Tc, and when Tc > Σ_P |grad e_1(P)|, the determination means 46 determines that the update is complete. Here, Σ_P |grad e_1(P)| is the total of |grad e_1(P)| over all positions defined by the vector P. Alternatively, the determination means 46 may determine that the update is complete when Tc ≥ Σ_P |grad e_1(P)|, or when the number of corrections u reaches or exceeds a predetermined threshold. The corrected sound source, microphone, and speaker positions are then output from the estimation unit 16 as the estimated sound source positions, estimated microphone positions, and estimated speaker positions (step S108).
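The flow of steps S100 to S108 can be sketched in Python as below. This is a minimal sketch under several assumptions: the positions A(u), B(u), C(u) are flattened into one list `p`, the cost is supplied as a function `cost(p)`, the gradient is approximated by central differences instead of the analytic expressions of the specification, and the step size `alpha` is a hypothetical parameter.

```python
def estimate_positions(cost, p0, alpha=0.1, tc=1e-6, max_corrections=10000, h=1e-6):
    """Successively correct the flattened position vector P until the
    summed gradient magnitude drops below the threshold Tc."""
    p = list(p0)  # initial values A(0), B(0), C(0), flattened into one list
    for u in range(max_corrections):  # fallback stop on the correction count u
        # Central-difference approximation of grad e(P) (an assumption;
        # the specification uses analytic gradient expressions).
        grad = []
        for idx in range(len(p)):
            hi, lo = list(p), list(p)
            hi[idx] += h
            lo[idx] -= h
            grad.append((cost(hi) - cost(lo)) / (2.0 * h))
        # Determination step: Tc > sum over P of |grad e(P)| means "finished".
        if tc > sum(abs(g) for g in grad):
            break
        # Update step: move against the gradient so that e(P) decreases.
        p = [pi - alpha * gi for pi, gi in zip(p, grad)]
    return p
```

For example, `estimate_positions(lambda p: (p[0] - 1) ** 2 + (p[1] + 2) ** 2, [0.0, 0.0])` converges to a point close to (1, -2). A fixed step size is used here for simplicity; a poorly chosen `alpha` can make the iteration diverge.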

従来の位置推定装置１０は、測定収音間遅延時間差τ_ｉｊｎを用いて、位置を推定していた。マイクロホンが１箇所に集中している等の場合、式（２）に示すｅ’（Ｐ）が小さくなる。また、式（３）に示す値Ｄ_{Ｆ（ｑ）Ｇ（ｑ）}は、固定値である。従って、式（４）に示すｅ（Ｐ）の値は小さくなり、結果として、式（５）による逐次修正において、誤った点で勾配が０になってしまい、正しい位置を推定できなかった。 The conventional position estimation apparatus 10 estimated positions using the measured inter-sound-collection delay time differences τ_ijn. When, for example, the microphones are concentrated in one place, e'(P) shown in equation (2) becomes small. In addition, the value D_{F(q)G(q)} shown in equation (3) is a fixed value. Consequently, the value of e(P) shown in equation (4) becomes small, and as a result the gradient becomes 0 at an incorrect point during the successive correction of equation (5), so the correct positions could not be estimated.
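The effect of clustered microphones can be illustrated numerically. The sketch below is a simplified illustration, not taken from the specification: it holds the microphone positions fixed, places a candidate sound source at twice the true range in the same direction, and evaluates an e'-style squared error of the distance differences. All positions and the helper name `tdoa_error` are hypothetical.

```python
import math

def tdoa_error(mics, true_source, candidate_source):
    """Squared error, over all microphone pairs, between the distance
    differences produced by the true source and by a candidate source."""
    total = 0.0
    for i in range(len(mics)):
        for j in range(i + 1, len(mics)):
            d = math.dist(mics[i], true_source) - math.dist(mics[j], true_source)
            d_hat = (math.dist(mics[i], candidate_source)
                     - math.dist(mics[j], candidate_source))
            total += (d - d_hat) ** 2
    return total

true_source = (2.0, 1.0, 0.0)
wrong_source = (4.0, 2.0, 0.0)  # same direction, twice the range

clustered = [(0.0, 0.0, 0.0), (0.01, 0.0, 0.0), (0.0, 0.01, 0.0)]  # 1 cm apart
spread = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]       # 1 m apart

e_clustered = tdoa_error(clustered, true_source, wrong_source)
e_spread = tdoa_error(spread, true_source, wrong_source)
print(e_clustered, e_spread)
```

With the clustered layout the error for the wrong source is many orders of magnitude smaller than with the spread layout, so a gradient-based search has very little to distinguish the wrong position from the true one.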

しかし、位置推定装置５０−１は、測定収音間遅延時間差τ_ｉｊｎのみではなく、測定収音放出間遅延時間差δ_ｋｍをも用いて、音源の位置、マイクロホンの位置、スピーカの位置を推定する。従って、マイクロホンが１箇所に集中している等の場合、式（２２）に示すｅ’（Ｐ）は小さくなるが、式（２３）に示すｅ’’（Ｐ）は小さくはならない。よって、式（２４）に示すｅ_１（Ｐ）は小さくならず、式（２５）による逐次修正において、正しい点で勾配が０に近づき、正しい位置を推定できるようになった。 In contrast, the position estimation apparatus 50-1 estimates the positions of the sound sources, microphones, and speakers using not only the measured inter-sound-collection delay time differences τ_ijn but also the measured sound collection/emission delay time differences δ_km. Therefore, when, for example, the microphones are concentrated in one place, e'(P) shown in equation (22) becomes small, but e''(P) shown in equation (23) does not. As a result, e_1(P) shown in equation (24) does not become small, the gradient approaches 0 at the correct point during the successive correction of equation (25), and the correct positions can be estimated.

また、信号発生部１４は、上述の通り、同時刻には１チャネルしか存在しない入力信号Ｅ_ｋを発生する。もし、信号発生部１４がＧ（Ｇ≧２）チャネルの入力信号Ｅ_ｋを発生すると、Ｇチャネルの放出信号が放出され、Ｇ−１チャネルの放出信号が雑音信号として収音信号Ｂ_ｍに混ざってしまい、収音間遅延時間差測定部１３、収音放出間遅延時間差測定部１５の処理が正確に行われなくなる。また、Ｇ−１チャネルの入力信号Ｅ_ｋが収音放出間遅延時間差測定部１５に入力されると、収音放出間遅延時間差測定部１５の処理が正確に行われなくなる。従って、信号発生部１４が同時刻には１チャネルしか存在しない入力信号Ｅ_ｋを発生することで、収音間遅延時間差測定部１３、収音放出間遅延時間差測定部１５は処理を正確に行うことができる。 As described above, the signal generator 14 generates the input signals E_k such that only one channel is present at any given time. If the signal generator 14 were to generate input signals E_k on G (G ≥ 2) channels at once, emission signals would be emitted on G channels, and the emission signals of the other G-1 channels would be mixed into the collected sound signals B_m as noise, so the processing of the inter-sound-collection delay time difference measurement unit 13 and the sound collection/emission delay time difference measurement unit 15 could not be performed accurately. Likewise, if the input signals E_k of the other G-1 channels were input to the sound collection/emission delay time difference measurement unit 15, its processing could not be performed accurately. Therefore, because the signal generator 14 generates input signals E_k with only one channel present at any given time, the measurement units 13 and 15 can perform their processing accurately.
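One simple way to satisfy the one-channel-at-a-time condition is to drive the speakers in turn. The sketch below is an assumption for illustration: the specification does not state how the signal generator 14 schedules the channels, and the function name, burst length, and use of random noise as the test signal are all hypothetical.

```python
import random

def one_channel_at_a_time(num_speakers, burst_len, seed=0):
    """Return K input signals E_k (lists of samples) in which speaker k is
    driven only during its own burst and every other channel is silent."""
    rng = random.Random(seed)
    total_len = num_speakers * burst_len
    signals = [[0.0] * total_len for _ in range(num_speakers)]
    for k in range(num_speakers):
        start = k * burst_len  # burst k occupies samples [start, start + burst_len)
        for t in range(start, start + burst_len):
            signals[k][t] = rng.uniform(-1.0, 1.0)  # noise burst as a test signal
    return signals
```

Because the bursts do not overlap, each collected signal B_m contains the emission of at most one speaker at any sample, which is the condition the paragraph above requires.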

なお、いくつかのスピーカやマイクロホンがフレーム等で固定されている場合は、（１）２つのマイクロホンの間の距離である収音手段間距離、（２）２つのスピーカの間の距離である放出手段間距離、（３）マイクロホンとスピーカの間の距離である収音手段放出手段間距離、が既知である。また、音源がテープレコーダーの場合など、固定されており、当該音源がフレームで固定されている場合は、（４）２つの音源の間の距離である信号源間距離、（５）信号源とマイクロホンとの間の距離である信号源収音手段間距離、（６）信号源とスピーカとの間の距離である信号源放出手段間距離、が既知である。 When some of the speakers and microphones are fixed by a frame or the like, the following are known: (1) the distance between sound collection means, which is the distance between two microphones; (2) the distance between emission means, which is the distance between two speakers; and (3) the distance between sound collection means and emission means, which is the distance between a microphone and a speaker. When a sound source is stationary, for example a tape recorder, and is fixed by a frame, the following are also known: (4) the distance between signal sources, which is the distance between two sound sources; (5) the distance between signal source and sound collection means, which is the distance between a signal source and a microphone; and (6) the distance between signal source and emission means, which is the distance between a signal source and a speaker.

これら６つの距離の少なくとも１つを固有パラメータと定義する。以下で説明する実施例２〜実施例７の推定部１６は、測定収音間遅延時間差τ_ｉｊｎと、測定収音放出間遅延時間差δ_ｋｍの他に、固有パラメータを用いて推定をする。 At least one of these six distances is defined as an intrinsic parameter. The estimation unit 16 of the second to seventh embodiments described below performs estimation using the intrinsic parameter in addition to the measured inter-sound-collection delay time differences τ_ijn and the measured sound collection/emission delay time differences δ_km.

実施例２では、２つのマイクロホンの間の距離である収音手段間距離が既知である場合を説明する。図８に実施例２の位置推定装置５０−２の機能構成例を示す。位置推定装置５０−２は位置推定装置５０−１と比較して、収音手段間距離入力部６１を備える点で異なる。収音手段間距離を用いることで、位置推定装置５０−１と比較して、情報量が増え、更に、高精度な位置を推定できる。Ｑ個（Ｑは１以上の整数）の収音手段間距離が既知である場合は、収音手段間距離入力部６１が備えられ、収音手段間距離は事前に収音手段間距離入力部６１に入力される。マイクロホン番号Ｆ（ｑ）（ｑ＝１，．．．，Ｑ）であるマイクロホン１１_Ｆ（ｑ）と、マイクロホン番号Ｇ（ｑ）であるマイクロホン１１_Ｇ（ｑ）との測定された距離を収音手段間距離Ｄ_{Ｆ（ｑ）Ｇ（ｑ）}とする。Ｄ_{Ｆ（ｑ）Ｇ（ｑ）}の測定の仕方として、人間が測定などをすれば良い。推定される位置ベクトルＰから計算される推定収音手段間距離をＤ＾_{Ｆ（ｑ）Ｇ（ｑ）}（Ｐ）とする。Ｄ_{Ｆ（ｑ）Ｇ（ｑ）}とＤ＾_{Ｆ（ｑ）Ｇ（ｑ）}（Ｐ）との関係は以下の式（３８）で表される。 In the second embodiment, the case where the distance between sound collection means, that is, the distance between two microphones, is known will be described. FIG. 8 shows an example of the functional configuration of the position estimation device 50-2 of the second embodiment. The position estimation device 50-2 differs from the position estimation device 50-1 in that it includes a sound collection means distance input unit 61. Using the distances between sound collection means provides more information than is available to the position estimation device 50-1, so the positions can be estimated with higher accuracy. When Q (Q is an integer of 1 or more) distances between sound collection means are known, the sound collection means distance input unit 61 is provided, and the distances are input to it in advance. The measured distance between the microphone 11_F(q) with microphone number F(q) (q = 1, ..., Q) and the microphone 11_G(q) with microphone number G(q) is denoted the distance between sound collection means D_{F(q)G(q)}. D_{F(q)G(q)} may be measured, for example, by a person. The estimated distance between sound collection means calculated from the estimated position vector P is denoted D^_{F(q)G(q)}(P). The relationship between D_{F(q)G(q)} and D^_{F(q)G(q)}(P) is expressed by the following equation (38).

ここで、マイクロホンの位置と音源の位置とスピーカの位置を推定するには、式（３８）の制約条件下で式（２４）中のｅ_１を最小化すればよい。そこで、式（３８）を二乗誤差の形式に変形して、式（２４）に追加し、１つの最小化問題に置き換えれば、式（３９）になる。

Here, in order to estimate the position of the microphone, the position of the sound source, and the position of the speaker, e ₁ in Expression (24) may be minimized under the constraint condition of Expression (38). Therefore, if equation (38) is transformed into the square error format and added to equation (24) and replaced with one minimization problem, equation (39) is obtained.

ただし、λ_ｑは、重み係数であり、事前に設定される。λ_ｑが大きいほど、式（３８）が厳密に満たされる解が求まる。

Here, λ_q is a weighting factor that is set in advance. The larger λ_q is, the more strictly the solution satisfies equation (38).
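The squared-error term added for the known microphone distances can be sketched as follows. The sketch assumes that equation (38) requires D_{F(q)G(q)} to equal D^_{F(q)G(q)}(P) and that each constraint enters the cost as λ_q times the squared difference. The function name and data layout are illustrative, not part of the specification.

```python
import math

def mic_distance_penalty(mics, known_distances, weights):
    """Assumed penalty: sum over q of lambda_q * (D_q - D_hat_q(P))**2.

    mics: list of estimated microphone positions (x, y, z)
    known_distances: list of (F, G, D) with microphone indices F, G and the
                     measured distance D between them
    weights: list of lambda_q, one per known distance
    """
    total = 0.0
    for (f, g, d), lam in zip(known_distances, weights):
        d_hat = math.dist(mics[f], mics[g])  # distance implied by the estimate P
        total += lam * (d - d_hat) ** 2
    return total
```

Adding this value to e_1(P) gives a cost in the spirit of e_2(P): a large λ_q makes any violation of the known distance expensive, which pushes the minimizer toward estimates that reproduce that distance.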

式（３９）に示したｅ_２（Ｐ）を最小化する解を求めることで、ｄ_ｉｊｎとｄ＾_ｉｊｎ（Ｐ）との誤差、Ｌ_ｋｍ、とＬ＾_ｋｍ（Ｐ）との誤差が最小となるマイクロホン１１_ｍの位置、音源１７_ｎの位置、スピーカ１２_ｋの位置を求めることができる。ただし、式（３９）は非線形連立方程式であり、解析的に解くことは困難である。ここでは、逐次修正を用いた数値解析で求める。式（３９）を最小化する推定されるマイクロホンの位置（ｘ＾_ａ、ｙ＾_ａ、ｚ＾_ａ）（ａ＝１，．．．，Ｍ）と推定される音源の位置（Ｘ＾_ｂ、Ｙ＾_ｂ、Ｚ＾_ｂ）（ｂ＝１，．．．，Ｎ）と推定されるスピーカの位置（Ｘ’＾_ｃ、Ｙ’＾_ｃ、Ｚ’＾_ｃ）（ｃ＝０，．．．，Ｋ）を求めるには、ある点における勾配を求め、誤差ｅ（Ｐ）が小さくなる方向に、推定されるマイクロホンの位置、推定される音源の位置、推定されるスピーカの位置を修正していき、勾配が０になる点を求めればよい。従って、修正式は式（４０）のようになる。 By finding a solution that minimizes e_2(P) shown in equation (39), the positions of the microphones 11_m, the sound sources 17_n, and the speakers 12_k that minimize the error between d_ijn and d^_ijn(P) and the error between L_km and L^_km(P) can be obtained. However, equation (39) is a system of nonlinear simultaneous equations and is difficult to solve analytically, so it is solved here numerically by successive correction. To obtain the estimated microphone positions (x^_a, y^_a, z^_a) (a = 1, ..., M), the estimated sound source positions (X^_b, Y^_b, Z^_b) (b = 1, ..., N), and the estimated speaker positions (X'^_c, Y'^_c, Z'^_c) (c = 0, ..., K) that minimize equation (39), the gradient at the current point is computed, and the estimated microphone, sound source, and speaker positions are corrected in the direction in which the error e(P) decreases until a point where the gradient becomes 0 is found. The correction formula is therefore as shown in equation (40).

推定部１６は、式（４１）〜式（５３）を用いて、音源の位置、マイクロホンの位置、放出手段の位置を推定する。

The estimation unit 16 estimates the position of the sound source, the position of the microphone, and the position of the emission unit using Expressions (41) to (53).

このように、位置推定装置５０−２は位置推定装置５０−１と比較して、収音手段間距離Ｄ_{Ｆ（ｑ）Ｇ（ｑ）}と推定収音手段間距離をＤ＾_{Ｆ（ｑ）Ｇ（ｑ）}を用いることで、使用する情報量を更に増やすことができ、結果として、より高精度な位置を推定できる。 Thus, compared with the position estimation device 50-1, the position estimation device 50-2 also uses the distance between sound collection means D_{F(q)G(q)} and the estimated distance between sound collection means D^_{F(q)G(q)}(P), which further increases the amount of information used and, as a result, allows the positions to be estimated with higher accuracy.

実施例３では、２つのスピーカの間の距離である放出手段間距離が既知である場合を説明する。図９に実施例３の位置推定装置５０−３の機能構成例を示す。位置推定装置５０−３は位置推定装置５０−２と比較して、放出手段間距離入力部７１を備える点で異なる。放出手段間距離を用いることで、位置推定装置５０−２と比較して、情報量が増え、更に高精度に位置を推定できる。Ｒ個（Ｒは１以上の整数）の放出手段間距離が既知である場合は、放出手段間距離入力部７１が備えられ、放出手段間距離は事前に放出手段間距離入力部７１に入力される。スピーカ番号Ｖ（ｒ）（ｒ＝１，．．．，Ｒ）であるスピーカ１２_Ｖ（ｒ）と、スピーカ番号Ｗ（ｒ）であるスピーカ１２_Ｗ（ｒ）との測定された距離を放出手段間距離Ψ_{Ｖ（ｒ）Ｗ（ｒ）}とする。Ψ_{Ｖ（ｒ）Ｗ（ｒ）}の測定の仕方として、人間が測定などをすれば良い。推定される位置ベクトルＰから計算される推定放出手段間距離をΨ＾_{Ｖ（ｒ）Ｗ（ｒ）}（Ｐ）とする。Ψ_{Ｖ（ｒ）Ｗ（ｒ）}とΨ＾_{Ｖ（ｒ）Ｗ（ｒ）}（Ｐ）との関係は以下の式（５４）で表される。 In the third embodiment, the case where the distance between emission means, that is, the distance between two speakers, is known will be described. FIG. 9 shows an example of the functional configuration of the position estimation device 50-3 of the third embodiment. The position estimation device 50-3 differs from the position estimation device 50-2 in that it includes an emission means distance input unit 71. Using the distances between emission means provides more information than is available to the position estimation device 50-2, so the positions can be estimated with higher accuracy. When R (R is an integer of 1 or more) distances between emission means are known, the emission means distance input unit 71 is provided, and the distances are input to it in advance. The measured distance between the speaker 12_V(r) with speaker number V(r) (r = 1, ..., R) and the speaker 12_W(r) with speaker number W(r) is denoted the distance between emission means Ψ_{V(r)W(r)}. Ψ_{V(r)W(r)} may be measured, for example, by a person. The estimated distance between emission means calculated from the estimated position vector P is denoted Ψ^_{V(r)W(r)}(P). The relationship between Ψ_{V(r)W(r)} and Ψ^_{V(r)W(r)}(P) is expressed by the following equation (54).

ここで、マイクロホンの位置と音源の位置とスピーカの位置を推定するには、式（５４）の制約条件と式（３８）の制約条件を満たすように、式（２４）のｅ_１を最小化すればよい。そこで、式（３８）と式（５４）を二乗誤差の形式に変形して、式（２４）に追加し、１つの最小化問題に置き換えれば、式（５５）になる。

Here, in order to estimate the position of the microphone, the position of the sound source, and the position of the speaker, e ₁ of Expression (24) is minimized so as to satisfy the restriction condition of Expression (54) and the restriction condition of Expression (38). do it. Therefore, if equation (38) and equation (54) are transformed into the square error format and added to equation (24) and replaced with one minimization problem, equation (55) is obtained.

ただし、λ_ｒは、重み係数であり、事前に設定される。λ_ｒが大きいほど、式（５４）が厳密に満たされる解が求まる。

Here, λ_r is a weighting factor that is set in advance. The larger λ_r is, the more strictly the solution satisfies equation (54).

式（５５）に示したｅ_３（Ｐ）を最小化する解を求めることで、ｄ_ｉｊｎとｄ＾_ｉｊｎ（Ｐ）との誤差、Ｌ_ｋｍ、とＬ＾_ｋｍ（Ｐ）との誤差が最小となるマイクロホン１１_ｍの位置、音源１７_ｎの位置、スピーカ１２_ｋの位置を求めることができる。ただし、式（５５）は非線形連立方程式であり、解析的に解くことは困難である。ここでは、逐次修正を用いた数値解析で求める。式（５５）を最小化する推定されるマイクロホンの位置（ｘ＾_ａ、ｙ＾_ａ、ｚ＾_ａ）（ａ＝１，．．．，Ｍ）と推定される音源位置（Ｘ＾_ｂ、Ｙ＾_ｂ、Ｚ＾_ｂ）（ｂ＝１，．．．，Ｎ）と推定されるスピーカの位置（Ｘ’＾_ｃ、Ｙ’＾_ｃ、Ｚ’＾_ｃ）（ｃ＝０，．．．，Ｋ）を求めるには、ある点における勾配を求め、誤差ｅ（Ｐ）が小さくなる方向に、推定されるマイクロホンの位置、推定される音源の位置、推定されるスピーカの位置を修正していき、勾配が０になる点を求めればよい。従って、修正式は式（５６）のようになる。 By finding a solution that minimizes e ₃ (P) shown in Equation (55), the error between d _ijn and d ^ _ijn (P) and the error between L _km and L ^ _km (P) are minimized. The position of the microphone 11 _m, the position of the sound source 17 _n , and the position of the speaker 12 _k can be obtained. However, Expression (55) is a nonlinear simultaneous equation and is difficult to solve analytically. Here, it is obtained by numerical analysis using sequential correction. Estimated microphone positions that minimize Equation (55) (x ^ _a , y ^ _a , z ^ _a ) (a = 1, ..., M) and estimated sound source positions (X ^ _b , Y ^ _B , Z ^ _b ) (b = 1,..., N) estimated speaker position (X ′ ^ _c , Y ′ ^ _c , Z ′ ^ _c ) (c = 0,. K) is obtained by calculating the gradient at a certain point and correcting the estimated microphone position, estimated sound source position, and estimated speaker position in a direction in which the error e (P) decreases. What is necessary is just to obtain the point where the gradient becomes zero. Therefore, the correction formula is as shown in Formula (56).

推定部１６は、式（５６）〜式（６９）を用いて、音源の位置、マイクロホンの位置、スピーカの位置を推定する。

The estimation unit 16 estimates the position of the sound source, the position of the microphone, and the position of the speaker using Expressions (56) to (69).

このように、位置推定装置５０−３は位置推定装置５０−２と比較して、放出手段間距離Ψ_{Ｆ（ｑ）Ｇ（ｑ）}と推定放出手段間距離をΨ＾_{Ｆ（ｑ）Ｇ（ｑ）}（Ｐ）を用いることで、情報量を更に増やすことができ、結果として、より高精度に位置を推定できる。また、放出手段間距離入力部５１を実施例１で説明した位置推定装置５０−１（図２参照）に追加しても、実施できる。 In this way, the position estimation device 50-3 compares the distance between discharge means Ψ _{F (q) G (q)} and the distance between estimated discharge means Ψ ^ _{F (q) G ( q)} By using (P), the amount of information can be further increased, and as a result, the position can be estimated with higher accuracy. Further, even if the inter-emission means distance input unit 51 is added to the position estimation device 50-1 (see FIG. 2) described in the first embodiment, the present invention can be implemented.

実施例４では、マイクロホンとスピーカの間の距離である収音手段放出手段間距離が既知である場合を説明する。図１０に実施例４の位置推定装置５０−４の機能構成例を示す。位置推定装置５０−４は位置推定装置５０−３と比較して、収音手段放出手段間距離入力部８１を備える点で異なる。収音手段放出手段間距離を用いることで、位置推定装置５０−３と比較して、情報量が増え、更に、高精度な位置を推定できる。Ｓ個（Ｓは１以上の整数）の収音手段放出手段間距離が既知である場合は、収音手段放出手段間距離入力部８１が備えられ、収音手段放出手段間距離は事前に収音手段放出手段間距離入力部８１に入力される。スピーカ番号Ｔ（ｓ）（ｓ＝１，．．．，Ｓ）であるスピーカ１２_Ｔ（ｓ）と、マイクロホン番号Ｕ（ｓ）であるスピーカ１１_Ｕ（ｓ）との測定された距離を収音手段放出手段間距離Φ_{Ｔ（ｓ）Ｕ（ｓ）}とする。Φ_{Ｔ（ｓ）Ｕ（ｓ）}の測定の仕方として、人間が測定などをすれば良い。推定される位置ベクトルＰから計算される推定収音手段放出手段間距離をΦ＾_{Ｔ（ｓ）Ｕ（ｓ）}（Ｐ）とする。Φ_{Ｔ（ｓ）Ｕ（ｓ）}とΦ＾_{Ｔ（ｓ）Ｕ（ｓ）}（Ｐ）との関係は以下の式（７０）で表される。 In the fourth embodiment, a case will be described in which the distance between the sound collecting means emitting means, which is the distance between the microphone and the speaker, is known. FIG. 10 shows a functional configuration example of the position estimation apparatus 50-4 of the fourth embodiment. The position estimation device 50-4 is different from the position estimation device 50-3 in that it includes a distance input unit 81 between sound collection means and emission means. By using the distance between the sound collecting means emitting means, the amount of information is increased as compared with the position estimating device 50-3, and a highly accurate position can be estimated. When S (S is an integer greater than or equal to 1) distances between sound collecting means emitting means are known, a distance input section 81 between sound collecting means emitting means is provided, and the distance between sound collecting means emitting means is collected in advance. This is input to the sound means emitting means distance input section 81. The measured distance between the speaker 12 _{T (s)} with the speaker number T (s) (s = 1,..., S) and the speaker 11 _{U (s)} with the microphone number U (s) is collected. The distance between means releasing means is Φ _{T (s) U (s)} . As a method of measuring Φ _{T (s) U (s)} , a human may perform measurement. 
The estimated distance between sound collection means and emission means calculated from the estimated position vector P is denoted Φ^_{T(s)U(s)}(P). The relationship between Φ_{T(s)U(s)} and Φ^_{T(s)U(s)}(P) is expressed by the following equation (70).

ここで、マイクロホンの位置と音源の位置とスピーカの位置を推定するには、式（３８）、式（５４）、式（７０）の制約条件を満たすように、式（２４）中のｅ_１を最小化すればよい。そこで、式（３８）、式（５４）、式（７０）を二乗誤差の形式に変形して、式（２４）に追加し、１つの最小化問題に置き換えれば、式（７１）になる。

Here, to estimate the positions of the microphones, the sound sources, and the speakers, e_1 in equation (24) is minimized so that the constraints of equations (38), (54), and (70) are satisfied. If equations (38), (54), and (70) are rewritten in squared-error form, added to equation (24), and combined into a single minimization problem, equation (71) is obtained.

ただし、ξ_ｓは、重み係数であり、事前に設定される。ξ_ｓが大きいほど、式（７０）が厳密に満たされる解が求まる。

Here, ξ _s is a weighting factor and is set in advance. As ξ _s increases, a solution that satisfies Equation (70) more precisely is obtained.
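Assuming each constraint contributes a weighted squared error, as the description of equations (39), (55), and (71) suggests, the combined cost could be written as follows. This is a reconstruction for illustration and the exact form in the specification may differ. The weight of the speaker-distance term is written λ_r here, following the text for equation (55); the same weight appears as γ_r in the eighth embodiment.

```latex
e_4(P) = e_1(P)
  + \sum_{q=1}^{Q} \lambda_q \left( D_{F(q)G(q)} - \hat{D}_{F(q)G(q)}(P) \right)^2
  + \sum_{r=1}^{R} \lambda_r \left( \Psi_{V(r)W(r)} - \hat{\Psi}_{V(r)W(r)}(P) \right)^2
  + \sum_{s=1}^{S} \xi_s \left( \Phi_{T(s)U(s)} - \hat{\Phi}_{T(s)U(s)}(P) \right)^2
```

Dropping the last sum gives the corresponding form for e_3(P), and dropping the last two gives e_2(P).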

式（７１）に示したｅ_４（Ｐ）を最小化する解を求めることで、ｄ_ｉｊｎとｄ＾_ｉｊｎ（Ｐ）との誤差、Ｌ_ｋｍ、とＬ＾_ｋｍ（Ｐ）との誤差が最小となるマイクロホン１１_ｍの位置、音源１７_ｎの位置、スピーカ１２_ｋの位置を求めることができる。ただし、式（７１）は非線形連立方程式であり、解析的に解くことは困難である。ここでは、逐次修正を用いた数値解析で求める。式（７１）を最小化する推定されるマイクロホンの位置（ｘ＾_ａ、ｙ＾_ａ、ｚ＾_ａ）（ａ＝１，．．．，Ｍ）と推定音源位置（Ｘ＾_ｂ、Ｙ＾_ｂ、ｚ＾_ｂ）（ｂ＝１，．．．，Ｎ）と推定音源位置（Ｘ’＾_ｃ、Ｙ’＾_ｃ、ｚ’＾_ｃ）（ｃ＝０，．．．，Ｋ）を求めるには、ある点における勾配を求め、誤差ｅ（Ｐ）が小さくなる方向に、推定されるマイクロホンの位置、推定される音源の位置、推定されるスピーカの位置を修正していき、勾配が０になる点を求めればよい。従って、修正式は式（７２）のようになる。 By finding a solution that minimizes e ₄ (P) shown in equation (71), the error between d _ijn and d ^ _ijn (P), and the error between L _km and L ^ _km (P) are minimized. The position of the microphone 11 _m, the position of the sound source 17 _n , and the position of the speaker 12 _k can be obtained. However, equation (71) is a nonlinear simultaneous equation and is difficult to solve analytically. Here, it is obtained by numerical analysis using sequential correction. Estimated microphone position (x ^ _a , y ^ _a , z ^ _a ) (a = 1, ..., M) and estimated sound source position (X ^ _b , Y ^ _b ) that minimizes Equation (71) , Z ^ _b ) (b = 1,..., N) and the estimated sound source position (X ′ ^ _c , Y ′ ^ _c , z ′ ^ _c ) (c = 0,..., K). Finds the gradient at a certain point and corrects the estimated microphone position, estimated sound source position, and estimated speaker position in a direction in which the error e (P) becomes smaller, and the gradient becomes zero. What is necessary is just to ask for. Therefore, the correction formula is as shown in formula (72).

推定部１６は、式（７３）〜式（８５）を用いて、音源の位置、マイクロホンの位置、スピーカの位置を推定する。

The estimation unit 16 estimates the position of the sound source, the position of the microphone, and the position of the speaker using Expressions (73) to (85).

このように、位置推定装置５０−４は位置推定装置５０−３と比較して、収音手段放出手段間距離Φ_{Ｔ（ｓ）Ｕ（ｓ）}と推定収音手段放出手段間距離Φ＾_{Ｔ（ｓ）Ｕ（ｓ）}（Ｐ）を用いることで、情報量を更に増やすことができ、結果として、より高精度な位置を推定できる。また、収音手段放出手段間距離入力部８１は、位置推定装置５０−１（図２参照）や位置推定装置５０−２（図８参照）に設けてもよい。 Thus, compared with the position estimation device 50-3, the position estimation device 50-4 also uses the distance between sound collection means and emission means Φ_{T(s)U(s)} and the estimated distance between sound collection means and emission means Φ^_{T(s)U(s)}(P), which further increases the amount of information and, as a result, allows the positions to be estimated with higher accuracy. The distance input unit 81 between sound collection means and emission means may also be provided in the position estimation device 50-1 (see FIG. 2) or the position estimation device 50-2 (see FIG. 8).

実施例５では、２つの音源の間の距離である信号源間距離が既知である場合を説明する。図１１に、実施例５の位置推定装置５０−５を示す。位置推定装置５０−５は位置推定装置５０−４と比較して、信号源間距離入力部９１を備える点で異なる。信号源間距離は事前に測定され、信号源間距離入力部９１に入力される。そのほかの処理は実施例２〜４で説明したものと同様である。 In the fifth embodiment, a case where the distance between signal sources, which is the distance between two sound sources, is known will be described. FIG. 11 shows a position estimation apparatus 50-5 of the fifth embodiment. The position estimation device 50-5 is different from the position estimation device 50-4 in that a signal source distance input unit 91 is provided. The distance between the signal sources is measured in advance and input to the signal source distance input unit 91. Other processes are the same as those described in the second to fourth embodiments.

このように、信号源間距離を用いることで、位置推定装置５０−４と比較して、情報量を更に増やすことができ、結果として、より高精度に位置を推定できる。
また、信号源間距離入力部９１は、位置推定装置５０−１、５０−２、５０−３以下で説明する５０−６、５０−７に備えても良い。 As described above, by using the distance between the signal sources, the amount of information can be further increased as compared with the position estimation device 50-4, and as a result, the position can be estimated with higher accuracy.
Further, the signal source distance input unit 91 may also be provided in the position estimation devices 50-1, 50-2, and 50-3, and in the devices 50-6 and 50-7 described below.

実施例６では、音源とマイクロホンの間の距離である信号源収音手段間距離が既知である場合を説明する。図１２に、実施例６の位置推定装置５０−６を示す。位置推定装置５０−６は位置推定装置５０−５と比較して、信号源収音手段間距離入力部１０１を備える点で異なる。信号源収音手段間距離は事前に測定され、信号源収音手段間距離入力部１０１に入力される。そのほかの処理は実施例２〜４で説明したものと同様である。 In the sixth embodiment, the case where the distance between signal source and sound collection means, that is, the distance between a sound source and a microphone, is known will be described. FIG. 12 shows the position estimation device 50-6 of the sixth embodiment. The position estimation device 50-6 differs from the position estimation device 50-5 in that it includes a distance input unit 101 between signal source and sound collection means. The distance between signal source and sound collection means is measured in advance and input to the distance input unit 101. The other processes are the same as those described in the second to fourth embodiments.

このように、信号源収音手段間距離を用いることで、位置推定装置５０−５と比較して、情報量を更に増やすことができ、結果として、より高精度な位置を推定できる。信号源収音手段間距離入力部１０１は、位置推定装置５０−１、５０−２、５０−３、５０−４、に備えても良い。 In this way, by using the distance between the signal source sound collection means, the amount of information can be further increased as compared with the position estimation device 50-5, and as a result, a more accurate position can be estimated. The signal source sound pickup unit distance input unit 101 may be provided in the position estimation devices 50-1, 50-2, 50-3, and 50-4.

実施例７では、音源とスピーカの間の距離である信号源放出手段間距離が既知である場合を説明する。図１３に、実施例７の位置推定装置５０−７を示す。位置推定装置５０−７は位置推定装置５０−６と比較して、信号源放出手段間距離入力部１１１を備える点で異なる。信号源放出手段間距離は事前に測定され、信号源放出手段間距離入力部１１１に入力される。そのほかの処理は実施例２〜４で説明したものと同様である。 In the seventh embodiment, a case where the distance between the signal source emitting means, which is the distance between the sound source and the speaker, is known will be described. FIG. 13 shows a position estimation apparatus 50-7 of the seventh embodiment. The position estimation device 50-7 differs from the position estimation device 50-6 in that it includes a signal source emission means distance input unit 111. The distance between the signal source emitting means is measured in advance and input to the signal source emitting means distance input unit 111. Other processes are the same as those described in the second to fourth embodiments.

このように、信号源放出手段間距離を用いることで、位置推定装置５０−６と比較して、情報量を更に増やすことができ、結果として、より高精度な位置を推定できる。信号源放出手段間距離入力部１１１は、位置推定装置５０−１、５０−２、５０−３、５０−４、５０―５に備えても良い。 In this way, by using the distance between the signal source emitting means, it is possible to further increase the amount of information as compared with the position estimation device 50-6, and as a result, it is possible to estimate the position with higher accuracy. The signal source emission means distance input unit 111 may be provided in the position estimation devices 50-1, 50-2, 50-3, 50-4, and 50-5.

図１４に、実施例８の位置推定装置５０−８の推定部１６’の機能構成例を示す。推定部１６’は、位置推定装置５０−１〜５０−７が有する推定部１６（図６参照）と比較して、設定手段１２１を有する点で異なる。また、位置推定装置５０−８は、実施例２〜４で説明した式（３９）や式（５５）や式（７１）に示すように、更新手段４１が、収音手段間距離、放出手段間距離、収音手段放出手段間距離、のうち少なくとも１つに重み係数（λ_ｑ、γ_ｒ、ξ_ｓ）を乗算したものを用いて、更新量を計算する場合に適用できる。 FIG. 14 illustrates a functional configuration example of the estimation unit 16 ′ of the position estimation device 50-8 according to the eighth embodiment. The estimation unit 16 ′ is different from the estimation unit 16 (see FIG. 6) included in the position estimation devices 50-1 to 50-7 in that it includes a setting unit 121. In addition, as shown in the equations (39), (55), and (71) described in the second to fourth embodiments, the position estimating device 50-8 is configured so that the updating unit 41 includes the distance between the sound collecting units and the emitting unit. The present invention can be applied to the case where the update amount is calculated by using at least one of the distance between the sound pickup means and the distance between the sound pickup means discharge means multiplied by a weighting coefficient (λ _q , γ _r , ξ _s ).

設定手段１２１は、収音手段間距離、放出手段間距離、収音手段放出手段間距離のうち重み係数が乗算されたものが重視されるように、更新量の値が小さくなるに従って、乗算で用いられる重み係数を大きくする。重み係数（λ_ｑ、γ_ｒ、ξ_ｓ）は設定手段１２１により、逐次設定される。以下の処理の詳細について説明する。 The setting means 121 performs multiplication as the value of the update amount decreases so that the weight between the sound collection means distance, the emission means distance, and the sound collection means emission means distance multiplied by the weighting factor is emphasized. Increase the weighting factor used. The weighting factors (λ _q , γ _r , ξ _s ) are sequentially set by the setting unit 121. Details of the following processing will be described.

測定収音手段間距離ｄ_ｉｊｎ、測定収音放出間距離Ｌ_ｋｍは誤差を含むが、測定された収音手段間距離Ｄ_{Ｆ（ｑ）Ｇ（ｑ）}、測定された放出手段間距離Ψ_{Ｖ（ｒ）Ｗ（ｒ）}、測定された収音手段放出手段間距離Φ_{Ｔ（ｓ）Ｕ（ｓ）}、はフレーム等で固定された距離なので、ほとんど誤差を含まない。従って、式（３９）や式（５５）や式（７１）に含まれる重み係数λ_ｑ、γ_ｒ、ξ_ｓを大きな値に設定して、式（３８）、式（５４）、式（７０）の条件を満たすような推定位置Ｐを求めることが望ましい。ただし、逐次修正の初期段階からλ_ｑ、γ_ｒ、ξ_ｓを大きな値に設定すると、ｄ_ｉｊｎとｄ＾_ｉｊｎ（Ｐ）との誤差、Ｌ_ｋｍ、とＬ＾_ｋｍ（Ｐ）との誤差が最小化することが十分行われないうちに、更新が終了したとみなされてしまう可能性がある。そこで、更新の終了に近づくにつれて、λ_ｑ、γ_ｒ、ξ_ｓを、大きな値に設定するようにすればよい。式（２５）、式（４０）、式（５６）、式（７２）記載のｅ_１〜ｅ_４をまとめてｅ_ＡＬＬとすると、更新量ｇｒａｄｅ_ＡＬＬ（Ｐ）の総和Σ_Ｐ│ｇｒａｄｅ_ＡＬＬ（Ｐ）│と、複数の閾値Ｔ（ｉ）（ｉ＝１，．．．，Ｉとし、Ｉは２以上の整数）とを比較して、Ｔ（ｉ）＞Σ_Ｐ│ｇｒａｄｅ_ＡＬＬ（Ｐ）│になった場合に、もしくは、Ｔ（ｉ）≧Σ_Ｐ│ｇｒａｄｅ_ＡＬＬ（Ｐ）│になった場合に、λ_ｑ（ｉ）、γ_ｒ（ｉ）、ξ_ｓ（ｉ）を設定する。ここで、λ_ｑ（ｉ）、γ_ｒ（ｉ）、ξ_ｓ（ｉ）は、Ｔ（ｉ）が小さくなるほど、大きくなるように事前に設定する。 The measured inter-sound-collection distance d_ijn and the measured sound collection/emission distance L_km contain errors, whereas the measured distance between sound collection means D_{F(q)G(q)}, the measured distance between emission means Ψ_{V(r)W(r)}, and the measured distance between sound collection means and emission means Φ_{T(s)U(s)} are distances fixed by a frame or the like and therefore contain almost no error. It is therefore desirable to set the weighting factors λ_q, γ_r, and ξ_s in equations (39), (55), and (71) to large values and obtain an estimated position P that satisfies the conditions of equations (38), (54), and (70). However, if λ_q, γ_r, and ξ_s are set to large values from the initial stage of the successive correction, the update may be judged complete before the error between d_ijn and d^_ijn(P) and the error between L_km and L^_km(P) have been sufficiently minimized. For this reason, λ_q, γ_r, and ξ_s should be set to larger values as the update nears completion.
When e ₁ to e ₄ described in Expression (25), Expression (40), Expression (56), and Expression (72) are collectively represented as e _ALL , the sum Σ _P | grad _ALL (P) of the update amount grade _ALL (P) ) | And a plurality of threshold values T (i) (i = 1,..., I, where I is an integer equal to or greater than 2), T (i)> Σ _P | grade _ALL (P) | Λ _q (i), γ _r (i), ξ _s (i) are set when T (i) ≧ Σ _P | grade _ALL (P) |. Here, λ _q (i), γ _r (i), and ξ _s (i) are set in advance so as to increase as T (i) decreases.

By sequentially revising the weighting coefficients λ_q, γ_r, and ξ_s in this way, the position estimation device 50-8 can estimate positions while emphasizing the accurate information, namely the distance between sound collection means D_{F(q)G(q)}, the distance between emission means Ψ_{V(r)W(r)}, and the distance between sound collection means and emission means Φ_{T(s)U(s)}. The estimation accuracy improves as a result.
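The threshold-based weight schedule of the setting means 121 can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the implementation of the patent: the function name `select_weight`, the `initial_weight` used before any threshold has been crossed, and the numeric values in the usage note are all hypothetical. The thresholds are assumed to be listed in decreasing order and the weights in increasing order, matching the statement that the weight grows as T(i) shrinks.

```python
def select_weight(update_sum, thresholds, weights, initial_weight):
    """Pick the weighting coefficient for the current update amount.

    update_sum     : sum over P of |grad e_ALL(P)| for the current iteration
    thresholds     : T(1) > T(2) > ... > T(I), in decreasing order (assumed)
    weights        : weight for each threshold, in increasing order (assumed)
    initial_weight : weight used while no threshold has been crossed (assumed)
    """
    if len(thresholds) != len(weights):
        raise ValueError("thresholds and weights must have the same length")
    chosen = initial_weight
    for t, w in zip(thresholds, weights):
        # Condition T(i) > sum |grad e_ALL(P)|; the text also allows ">=".
        if t > update_sum:
            chosen = w
    return chosen
```

For example, with thresholds `[1.0, 0.1, 0.01]` and weights `[1.0, 10.0, 100.0]`, an update amount of 0.05 has fallen below the first two thresholds, so the second weight is selected. The same function would be called once per iteration for each of λ_q, γ_r, and ξ_s.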

FIG. 15 shows an example of the functional configuration of the position estimation device 50-9 of Embodiment 9. The position estimation device 50-9 is effective, for example, when the sound source 17_n moves a large distance. The position estimation device 50-9 shown in FIG. 15 differs from the position estimation device 50-1 in that it has an inter-sound-collection delay time difference storage unit 131 and a first new signal source detection unit 132. In the following description, an inter-sound-collection delay time difference measured in the past by the inter-sound-collection delay time difference measurement unit 13 is denoted τ_{ijn,OLD}, and the one currently measured is denoted τ_{ijn,NEW}. The inter-sound-collection delay time difference storage unit 131 stores the measured inter-sound-collection delay time difference of a sound source at a new position (a sound source that has moved). The first new signal source detection unit 132 decides whether a sound source is at a new position. To do so, it obtains the distance between the current measured inter-sound-collection delay time difference τ_{ijn,NEW} measured by the inter-sound-collection delay time difference measurement unit and the past measured inter-sound-collection delay time difference τ_{ijn,OLD} stored in the inter-sound-collection delay time difference storage unit 131. If this distance is equal to or greater than, or exceeds, a predetermined threshold, the sound source is recognized as being at a new position. Here, "distance" means the value of a distance function that takes τ_{ijn,NEW} and τ_{ijn,OLD} as arguments, or a similarity between τ_{ijn,NEW} and τ_{ijn,OLD}. With X = τ_{ijn,NEW} and Y = τ_{ijn,OLD}, the distance function is, for example, X − Y, X/Y, or the mean square of (X − Y). A large function value means that the sound source for the past measured inter-sound-collection delay time difference τ_{ijn,OLD} and the sound source for the current measured inter-sound-collection delay time difference τ_{ijn,NEW} are far apart. The same holds when the similarity is small. Accordingly, when the sound source for τ_{ijn,OLD} and the sound source for τ_{ijn,NEW} are far apart, the first new signal source detection unit 132 recognizes the sound source for τ_{ijn,NEW} as a sound source at a new position. The following description uses the mean square of (X − Y) as the distance function.

FIG. 16 shows an example of the functional configuration of the first new signal source detection unit 132. The first new signal source detection unit 132 consists of square error calculation means 141 and threshold comparison means 142. For example, the square error calculation means 141 subtracts the past inter-sound-collection delay time difference τ_{ijn,OLD} stored in the inter-sound-collection delay time difference storage unit 131 from the current inter-sound-collection delay time difference τ_{ijn,NEW} supplied by the inter-sound-collection delay time difference measurement unit 13, and takes the mean square of the result. The square error calculation means 141 computes this mean square error e_f, for example by equation (86).

When the comparison by the threshold comparison means 142 gives e_f < T_f, where T_f is the predetermined threshold, the sound source for τ_{ijn,NEW} is recognized as not being a sound source at a new position, that is, as not having moved (much). The condition e_f ≤ T_f may be used instead of e_f < T_f to recognize that the sound source for τ_{ijn,NEW} is not at a new position.

Conversely, when the comparison by the threshold comparison means 142 gives e_f ≥ T_f (or e_f > T_f), the sound source for τ_{ijn,NEW} is recognized as a sound source at a new position, that is, as having moved. On recognizing that the sound source has moved, the threshold comparison means 142 generates and outputs a storage command signal so that the current inter-sound-collection delay time difference τ_{ijn,NEW} is stored in the inter-sound-collection delay time difference storage unit 131. The estimation unit 16 then performs estimation using all the inter-sound-collection delay time differences stored in the inter-sound-collection delay time difference storage unit 131 together with the measured sound-collection-to-emission delay time differences.

With the configuration of the position estimation device 50-9, using the current inter-sound-collection delay time difference τ_{ijn,NEW} allows the position to be estimated with high accuracy even when the sound source has moved. Using the past inter-sound-collection delay time difference τ_{ijn,OLD} also allows the position of the sound source before it moved to be estimated. The inter-sound-collection delay time difference storage unit 131 and the first new signal source detection unit 132 may also be added to the position estimation devices 50-2 to 50-8.
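The detection step of the first new signal source detection unit 132 can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the implementation of the patent. The function names, the representation of τ_{ijn,NEW} and τ_{ijn,OLD} as flat lists of equal length, and the normalization of the mean square by the number of elements are hypothetical choices. The comparison uses e_f ≥ T_f, one of the two variants the text allows.

```python
def mean_square_difference(tau_new, tau_old):
    """Mean square of (X - Y) over all delay time differences (assumed form of e_f)."""
    if len(tau_new) != len(tau_old) or not tau_new:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(tau_new, tau_old)) / len(tau_new)


def detect_new_source(tau_new, tau_old, threshold, storage):
    """Return True and store tau_new if the source is judged to be at a new position."""
    e_f = mean_square_difference(tau_new, tau_old)
    moved = e_f >= threshold           # e_f >= T_f: source has moved
    if moved:
        storage.append(list(tau_new))  # plays the role of the storage command signal
    return moved
```

Here `storage` stands in for the inter-sound-collection delay time difference storage unit 131; the estimation step would then use every entry in it.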

Next, the position estimation device 50-10 of Embodiment 10 is described with reference to FIG. 15. The position estimation device 50-10 differs from the position estimation device 50-9 in that the first new signal source detection unit 132 is replaced by a first new signal source detection unit 132′. FIG. 17 shows an example of the functional configuration of the first new signal source detection unit 132′. The first new signal source detection unit 132′ differs from the first new signal source detection unit 132 in that it has average calculation means 143. When e_f in equation (86) satisfies e_f < T_f (or e_f ≤ T_f), the threshold comparison means 142 recognizes, as described above, that the sound source has not moved (much). In this case, the average calculation means 143 calculates the average inter-sound-collection delay time difference τ_{ijn,AVG} of the current inter-sound-collection delay time difference τ_{ijn,NEW} and the past inter-sound-collection delay time difference τ_{ijn,OLD}. The past inter-sound-collection delay time difference τ_{ijn,OLD} is then updated to the average inter-sound-collection delay time difference τ_{ijn,AVG}.

When the sound source is recognized as not having moved, updating the inter-sound-collection delay time difference τ_{ijn,OLD} to the average inter-sound-collection delay time difference τ_{ijn,AVG} in this way increases the accuracy of the stored inter-sound-collection delay time difference, so the position estimation device 50-10 can estimate positions with higher accuracy. The first new signal source detection unit 132′ and the inter-sound-collection delay time difference storage unit 131 may also be added to the position estimation devices 50-1 to 50-9.
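The averaging update of the first new signal source detection unit 132′ can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the implementation of the patent. The function name `refresh_stored_delay`, the flat-list representation of the delay time differences, the equal-weight average, and the normalization of the mean square are hypothetical choices.

```python
def refresh_stored_delay(tau_new, tau_old, threshold):
    """Return (stored_value, moved).

    If the source has not moved (e_f < T_f), the stored value becomes the
    element-wise average of the new and old delay time differences.
    Otherwise the old value is returned unchanged and moved is True.
    """
    if len(tau_new) != len(tau_old) or not tau_new:
        raise ValueError("inputs must be non-empty and of equal length")
    e_f = sum((x - y) ** 2 for x, y in zip(tau_new, tau_old)) / len(tau_new)
    if e_f < threshold:
        tau_avg = [(x + y) / 2.0 for x, y in zip(tau_new, tau_old)]
        return tau_avg, False
    return list(tau_old), True
```

Repeated averaging of this kind gives progressively less weight to older measurements; a running mean over all measurements would be a different design, which the text does not specify.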

Next, the position estimation device 50-11 of Embodiment 11 is described with reference to FIG. 15. The position estimation device 50-11 differs from the position estimation device 50-9 in that the inter-sound-collection delay time difference storage unit 131 is replaced by an inter-sound-collection delay time difference storage unit 131′. An upper limit is imposed on the number of inter-sound-collection delay time differences τ_ijn that the storage unit 131′ stores. If the upper limit is G, the storage unit 131′ stores up to G inter-sound-collection delay time differences τ_ijn. When the (G+1)-th inter-sound-collection delay time difference τ_ijn is to be stored, the oldest stored inter-sound-collection delay time difference τ_ijn is discarded. Because the storage unit 131′ never holds G+1 or more inter-sound-collection delay time differences τ_ijn, the inter-sound-collection delay time difference storage unit can be implemented with a small amount of memory.
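The bounded storage of the inter-sound-collection delay time difference storage unit 131′ can be sketched as follows. This is a minimal illustration in Python, not the implementation of the patent: the function name `make_delay_store` is hypothetical, and using `collections.deque` with `maxlen` is one convenient way to obtain "discard the oldest entry when the (G+1)-th arrives".

```python
from collections import deque


def make_delay_store(capacity):
    """Create a store that keeps at most `capacity` (G) delay time difference sets."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    # A deque with maxlen drops the oldest item automatically on append.
    return deque(maxlen=capacity)
```

With `store = make_delay_store(2)`, appending three measurements in turn leaves only the two most recent ones in `store`.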

FIG. 18 shows an example of the functional configuration of the position estimation device 50-12 of Modification 4. The position estimation device 50-12 differs from the position estimation device 50-1 in that it has a sound-collection-to-emission delay time difference storage unit 133 and a second new signal source detection unit 134. As described in Embodiment 9, the first new signal source detection unit 132 in the position estimation device 50-9 uses the current inter-sound-collection delay time difference τ_{ijn,NEW} and the past inter-sound-collection delay time difference τ_{ijn,OLD}. The second new signal source detection unit 134 in the position estimation device 50-12 detects movement of the sound source using the current sound-collection-to-emission delay time difference δ_{km,NEW} and the past sound-collection-to-emission delay time difference δ_{km,OLD}. The past sound-collection-to-emission delay time difference δ_{km,OLD} is stored in the sound-collection-to-emission delay time difference storage unit 133. The processing of the sound-collection-to-emission delay time difference storage unit 133 and the second new signal source detection unit 134 is the same as that of the inter-sound-collection delay time difference storage unit 131 and the first new signal source detection unit 132, respectively, and its description is omitted here.

The second new signal source detection unit 134 may be replaced by a second new signal source detection unit 134′. When the second new signal source detection unit 134′ recognizes that the sound source has not moved much, it may calculate the average inter-sound-collection delay time difference τ_{ijn,AVG} of the current inter-sound-collection delay time difference τ_{ijn,NEW} and the past inter-sound-collection delay time difference τ_{ijn,OLD}, and update τ_{ijn,OLD} to τ_{ijn,AVG} (as in the configuration of the first new signal source detection unit 132′). The sound-collection-to-emission delay time difference storage unit 133 may likewise be replaced by a sound-collection-to-emission delay time difference storage unit 133′, which has an upper limit on the number of measured sound-collection-to-emission delay time differences it stores (as with the inter-sound-collection delay time difference storage unit 131′).
The sound-collection-to-emission delay time difference storage unit 133 and the second new signal source detection unit 134 may also be added to the position estimation devices 50-2 to 50-12.

FIG. 19 shows an example of the functional configuration of the position estimation device 50-13 of Embodiment 13. The position estimation device 50-13 differs from the position estimation device 50-1 in that it has a voiced section detection unit 151.

The voiced section detection unit 151 detects voiced sections in the sound collection signals B_m. One example of a detection method is as follows. Let the sum of all the sound collection signals B_m (m = 1, ..., M) be the summed signal x(t), where t denotes time, and let X(t) be the short-time average of this summed signal. The level of the noise signal N(t) in X(t) can be estimated by applying dip-hold processing to X(t) and is calculated, for example, by equation (87):
N(t) = X(t)                            if N(t−1) ≥ X(t)
N(t) = u·N(t−1) + (1−u)·X(t)           if N(t−1) < X(t)
(87)
Here, u is the smoothing constant applied when the estimated level of the noise signal N(t) rises, and it takes a value in the range 0 < u < 1. When u is close to 1, the estimated noise level rises only slowly, which gives the dip-hold effect.

Next, the voiced section detection unit 151 distinguishes voiced sections from noise sections by comparing X(t) with a threshold TN(t), which is obtained by multiplying the level of the noise signal N(t) by a preset constant. The voiced section detection unit 151 detects a voiced section if TN(t) ≤ X(t) and a noise section if TN(t) > X(t). Here, a noise section is a section that, when a noise source other than the sound source is present, contains the noise signal from that noise source.
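The dip-hold noise estimate of equation (87) and the threshold comparison can be sketched together as follows. This is a minimal illustration in Python under stated assumptions, not the implementation of the patent. The function names, the initialization of N(t) with the first level sample, and the default values of `u` and `factor` (the preset constant) are hypothetical. The sketch treats a frame as voiced when its level X(t) reaches the threshold TN(t) = factor · N(t).

```python
def dip_hold_noise_level(levels, u):
    """Estimate the noise level N(t) from short-time average levels X(t), eq. (87)."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must satisfy 0 < u < 1")
    noise = []
    prev = levels[0] if levels else 0.0   # assumed initial value N(-1) = X(0)
    for x in levels:
        if prev >= x:
            prev = x                          # follow dips immediately
        else:
            prev = u * prev + (1.0 - u) * x   # rise slowly
        noise.append(prev)
    return noise


def detect_voiced(levels, u=0.99, factor=2.0):
    """Return one boolean per frame: True where X(t) >= factor * N(t)."""
    noise = dip_hold_noise_level(levels, u)
    return [x >= factor * n for x, n in zip(levels, noise)]
```

Because the estimate drops at once when the level falls but climbs slowly when it rises, a short burst of speech lifts X(t) well above TN(t) before the noise estimate can catch up.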

The inter-sound-collection delay time difference measurement unit 13 measures the measured inter-sound-collection delay time difference τ_ijn for the portions of the sound collection signals detected as voiced sections. This configuration has two effects: the measured inter-sound-collection delay time difference τ_ijn can be measured more accurately, and the positions of noise sources are no longer estimated.

As indicated by the broken line in FIG. 19, the sound-collection-to-emission delay time difference measurement unit 15 can also perform its processing using the input signals E_k and the detected voiced sections of the sound collection signals B_m. This allows the sound-collection-to-emission delay time difference measurement unit 15 to measure the measured sound-collection-to-emission delay time difference δ_km more accurately. The voiced section detection unit 151 can also be added to the position estimation devices 50-2 to 50-12.

FIG. 20 shows an example of the functional configuration of the position estimation device 50-14 of Embodiment 14. The position estimation device 50-14 differs from the position estimation device 50-1 in that it has no signal generation unit 14. When the position estimation device 50-14 is used in a TV conference or the like, audio is received from a remote site. This remote audio is used as the input signal E_k of the loudspeaker 12_k. With this configuration, the signal generation unit 14 can be omitted. The configuration that omits the signal generation unit 14 can also be applied to the position estimation devices 50-2 to 50-13.

[Experimental results]
FIG. 21 shows the arrangement of the microphones, loudspeakers, and sound sources in the experiment. In Group 1 and Group 2 (each enclosed by a broken line), the microphones and loudspeakers are fixed to a frame, so the distances between microphones (distance between sound collection means), between loudspeakers (distance between emission means), and between microphones and loudspeakers (distance between sound collection means and emission means) are known. The arrangement assumes a TV conference: Group 1 represents a TV conference unit equipped with a TV camera and the like, and Group 2 represents tabletop microphones for sound collection. FIG. 22 lists, for each experimental condition, the number of microphones in Group 1, the number of microphones in Group 2, the number of loudspeakers in Group 1, and the number of talkers. Below, each condition is written as (Group 1 microphones, Group 2 microphones, Group 1 loudspeakers, talkers). Prior arts 1 to 4 use the position estimation device 10: Prior art 1 is (4, 0, 0, 1), Prior art 2 is (8, 0, 0, 1), Prior art 3 is (4, 5, 0, 1), and Prior art 4 is (4, 4, 0, 2). Present invention 1 refers to the case where the position estimation device 50-2 is used; Present invention 1 is (4, 4, 1, 1), and Present invention 2 is (4, 4, 2, 1). FIG. 23 is a line graph of the experimental results. The horizontal axis is the standard deviation of the error contained in the measured inter-sound-collection delay time differences and in the measured sound-collection-to-emission delay times; the errors were generated so as to follow a Gaussian distribution. The vertical axis is the estimation error of the position estimate, and a smaller error means a more accurate estimate. The results confirm that the position estimation device of the present invention has a markedly smaller estimation error than any of the prior arts and estimates positions accurately.

FIG. 1 is a diagram showing an example of the functional configuration of a conventional position estimation device.
FIG. 2 is a diagram showing an example of the functional configuration of the position estimation device of Embodiment 1.
FIG. 3 is a diagram showing the main processing flow of the position estimation device.
FIG. 4 is a diagram showing an example of the functional configuration of the inter-sound-collection delay time difference measurement unit.
FIG. 5 is a diagram showing an example of the functional configuration of the sound-collection-to-emission delay time difference measurement unit.
FIG. 6 is a diagram showing an example of the functional configuration of the estimation unit.
FIG. 7 is a diagram showing the main processing flow of the estimation unit.
FIG. 8 is a diagram showing an example of the functional configuration of the position estimation device of Embodiment 2.
FIG. 9 is a diagram showing an example of the functional configuration of the position estimation device of Embodiment 3.
FIG. 10 is a diagram showing an example of the functional configuration of the position estimation device of Embodiment 4.
FIG. 11 is a diagram showing an example of the functional configuration of the position estimation device of Embodiment 5.
FIG. 12 is a diagram showing an example of the functional configuration of the position estimation device of Embodiment 6.
FIG. 13 is a diagram showing an example of the functional configuration of the position estimation device of Embodiment 7.
FIG. 14 is a diagram showing an example of the functional configuration of the estimation unit of the position estimation device of Embodiment 8.
FIG. 15 is a diagram showing an example of the functional configuration of the position estimation devices of Embodiments 9 to 11.
FIG. 16 is a diagram showing an example of the functional configuration of the first new signal source detection unit.
FIG. 17 is a diagram showing an example of the functional configuration of the first new signal source detection unit of Embodiment 11.
FIG. 18 is a diagram showing an example of the functional configuration of the position estimation device of Embodiment 12.
FIG. 19 is a diagram showing an example of the functional configuration of the position estimation device of Embodiment 13.
FIG. 20 is a diagram showing an example of the functional configuration of the position estimation device of Embodiment 14.
FIG. 21 is a diagram showing the arrangement used in the simulation.
FIG. 22 is a diagram showing the simulation conditions.
FIG. 23 is a diagram showing the simulation results.

Claims

A position estimation device comprising:
a signal generation unit that generates signals of K channels (K is an integer of 1 or more);
an inter-sound-collection delay time difference measurement unit that measures a measured inter-sound-collection delay time difference, which is the delay time difference between M-channel sound collection signals, from the M-channel sound collection signals obtained when M sound collection means (M is an integer of 2 or more) collect emission signals emitted from K emission means, each of which receives one of the K-channel signals output by the signal generation unit as its input signal, and signal source signals from N signal sources other than the emission means (N is an integer of 1 or more);
a sound-collection-to-emission delay time difference measurement unit that measures, using the K-channel input signals input to the K emission means and the M-channel sound collection signals, a measured sound-collection-to-emission delay time difference, which is the delay time difference between each of the K-channel emission signals and each of the M-channel sound collection signals; and
an estimation unit, wherein, with the position of each sound collection means denoted (x^_m, y^_m, z^_m) (m = 1, ..., M), the position of each signal source denoted (X^_n, Y^_n, Z^_n) (n = 1, ..., N), the position of each emission means denoted (X′^_k, Y′^_k, Z′^_k) (k = 1, ..., K), the measured inter-sound-collection distance obtained by multiplying the measured inter-sound-collection delay time difference by the speed of sound denoted d_ijn, and the estimated inter-sound-collection distance d^_ijn(P) given by

the sum of squared errors e′(P) between the measured inter-sound-collection distance d_ijn and the estimated inter-sound-collection distance d^_ijn(P) given by

the measured sound-collection-to-emission distance obtained by multiplying the measured sound-collection-to-emission delay time difference by the speed of sound denoted L_km, the estimated sound-collection-to-emission distance L^_km(P) given by

and the sum of squared errors e″(P) between the measured sound-collection-to-emission distance L_km and the estimated sound-collection-to-emission distance L^_km(P) given by

the estimation unit takes the sum of e′(P) and e″(P) as e_1(P) and estimates, by numerical analysis using sequential correction, the position of the sound collection means (x^_m, y^_m, z^_m), the position of the signal source (X^_n, Y^_n, Z^_n), and the position of the emission means (X′^_k, Y′^_k, Z′^_k) at which e_1(P) is minimized.

The position estimation device according to claim 1,
wherein the estimation unit uses, as e_1(P) in place of said e_1(P), the sum of e′(P) and e″(P) multiplied by a weighting coefficient β, and estimates, by numerical analysis using sequential correction, the position of the sound collection means (x^_m, y^_m, z^_m), the position of the signal source (X^_n, Y^_n, Z^_n), and the position of the emission means (X′^_k, Y′^_k, Z′^_k) at which e_1(P) is minimized.

The position estimation device according to claim 1 or 2,
further comprising
a signal generation unit that generates the input signals such that, at any given time, only one channel contains a signal.

The position estimation device according to any one of claims 1 to 3, further comprising
an input unit to which at least one intrinsic parameter is input, the intrinsic parameters being:
the distance between sound collection means, which is the distance between two of the sound collection means;
the distance between emission means, which is the distance between two of the emission means;
the distance between signal sources, which is the distance between two of the signal sources;
the distance between sound collection means and emission means, which is the distance between a sound collection means and an emission means;
the distance between signal source and sound collection means, which is the distance between a signal source and a sound collection means; and
the distance between signal source and emission means, which is the distance between a signal source and an emission means,
wherein, for each input intrinsic parameter, the estimation unit obtains the sum of squared errors between the intrinsic parameter and the estimate of that intrinsic parameter computed from the position of the sound collection means (x^_m, y^_m, z^_m), the position of the signal source (X^_n, Y^_n, Z^_n), and the position of the emission means (X′^_k, Y′^_k, Z′^_k), takes the sum of the obtained sums of squared errors, e′(P), and e″(P) as e_1(P) in place of said e_1(P), and estimates, by numerical analysis using sequential correction, the position of the sound collection means (x^_m, y^_m, z^_m), the position of the signal source (X^_n, Y^_n, Z^_n), and the position of the emission means (X′^_k, Y′^_k, Z′^_k) at which e_1(P) is minimized.

The position estimation device according to claim 4,
wherein the estimation unit comprises:
initial value setting means that sets initial values of the positions of the N signal sources, initial values of the positions of the M sound collection means, and initial values of the positions of the K emission means;
updating means that updates the positions of the N signal sources, the positions of the M sound collection means, and the positions of the K emission means using the measured inter-sound-collection delay time differences and the measured sound-collection-to-emission delay time differences;
signal source position storage means that stores the initial values of the positions of the N signal sources and the updated positions of the N signal sources;
sound collection means position storage means that stores the initial values of the positions of the M sound collection means and the updated positions of the M sound collection means;
emission means position storage means that stores the initial values of the positions of the K emission means and the updated positions of the K emission means; and
determination means that stops the updating when it determines that the updating is complete.

The position estimation device according to claim 5,
wherein the updating means performs the update using at least one of the distance between sound collecting means, the distance between emission means, and the distance between sound collecting means and emission means, each multiplied by a weighting coefficient, and
the estimation unit further includes setting means for increasing the weighting coefficient used in the multiplication as the update amount decreases, so that greater importance is placed on the distance between sound collecting means, the distance between emission means, and the distance between sound collecting means and emission means multiplied by the weighting coefficient. A position estimation device characterized by the above.
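One possible reading of the setting means is a schedule that maps the current update amount to a weighting coefficient, with smaller updates giving larger weights. The sketch below is a hypothetical schedule only: the function name `weight_for_update`, the bounds `w_min` and `w_max`, and the constant `scale` are assumptions and do not appear in the claim.

```python
def weight_for_update(update_amount, w_min=0.1, w_max=10.0, scale=0.01):
    """Return a weighting coefficient that grows as the update amount shrinks.

    Hypothetical schedule: close to w_min for very large updates,
    equal to w_max when the update amount is zero.
    """
    magnitude = abs(update_amount)
    return w_min + (w_max - w_min) / (1.0 + magnitude / scale)
```

With such a schedule the known distances have little influence early in the iteration, when the positions are still changing a lot, and dominate once the estimate has nearly settled.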

The position estimation device according to any one of claims 1 to 6,
further comprising:
a sound-collection delay time difference storage unit for storing the measured delay time difference between sound collections of a sound source at a new position; and
a first new signal source detection unit that, if the distance between the current measured delay time difference between sound collections measured by the sound-collection delay time difference measurement unit and the past measured delay time differences between sound collections stored in the sound-collection delay time difference storage unit is greater than or equal to, or exceeds, a predetermined threshold, recognizes the sound source for the current measured delay time difference between sound collections as a sound source at a new position,
wherein the estimation unit performs the estimation using all the measured delay time differences between sound collections stored in the sound-collection delay time difference storage unit and the measured delay time differences between sound collection and emission. A position estimation device characterized by the above.
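The detection rule in this claim can be sketched as a distance test against the stored measurements. The sketch below assumes that each measurement is a vector of delay time differences (one entry per pair of sound collecting means), that the distance is Euclidean, and that a source counts as new only when it is far from every stored vector; none of these choices is stated in the claim, and the name `detect_new_source` is hypothetical.

```python
import math

def detect_new_source(current, stored, threshold):
    """Store `current` and return True if it is far from every stored vector.

    current   : list of measured delay time differences (one per microphone pair)
    stored    : list of previously stored vectors; modified in place
    threshold : minimum Euclidean distance for a source to count as new
    """
    is_new = all(math.dist(current, past) >= threshold for past in stored)
    if is_new:
        stored.append(list(current))
    return is_new
```

Because `all` over an empty list is true, the first measurement is always stored, which gives the estimation unit at least one source position to work with.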

The position estimation device according to any one of claims 1 to 7,
further comprising:
a sound-collection-emission delay time difference storage unit for storing the measured delay time difference between sound collection and emission; and
a second new signal source detection unit that, if the distance between the current measured delay time difference between sound collection and emission measured by the sound-collection-emission delay time difference measurement unit and the past measured delay time differences between sound collection and emission stored in the sound-collection-emission delay time difference storage unit is greater than or equal to, or exceeds, a predetermined threshold, recognizes the sound source for the current measured delay time difference between sound collection and emission as a sound source at a new position,
wherein the estimation unit performs the estimation using all the measured delay time differences between sound collection and emission stored in the sound-collection-emission delay time difference storage unit and the measured delay time differences between sound collections. A position estimation device characterized by the above.

The position estimation device according to claim 7 or 8,
wherein at least one of the sound-collection delay time difference storage unit and the sound-collection-emission delay time difference storage unit has an upper limit on the number of measured delay time differences between sound collections, or on the number of measured delay time differences between sound collection and emission, that it stores. A position estimation device characterized by the above.
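A storage unit with an upper limit on the number of stored measurements can be sketched with a bounded queue. The claim does not say which entry is discarded when the limit is reached; the sketch below assumes the oldest one is dropped, and the name `make_delay_store` is hypothetical.

```python
from collections import deque

def make_delay_store(max_entries):
    """Create a store that keeps at most `max_entries` measurements."""
    return deque(maxlen=max_entries)

store = make_delay_store(3)
for measurement in ([0.1], [0.2], [0.3], [0.4], [0.5]):
    store.append(measurement)   # the oldest entry is dropped once the store is full
```

Bounding the store keeps the cost of each estimation pass from growing without limit as new source positions are detected.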

The position estimation device according to any one of claims 1 to 9,
further comprising
a voiced section detector for detecting a voiced section in the collected sound signal,
wherein at least one of the sound-collection delay time difference measurement unit and the sound-collection-emission delay time difference measurement unit measures the delay time difference between sound collections, or the delay time difference between sound collection and emission, for the collected sound signal detected as a voiced section. A position estimation device characterized by the above.
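The claim does not specify how the voiced section is detected. As a minimal sketch under that caveat, the following Python function flags frames whose mean power exceeds a threshold; the short-time power criterion, the name `detect_voiced_frames`, and the parameters `frame_len` and `power_threshold` are all assumptions.

```python
def detect_voiced_frames(signal, frame_len, power_threshold):
    """Return one boolean per full frame: True where mean power exceeds the threshold."""
    flags = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        flags.append(power > power_threshold)
    return flags
```

Delay time differences would then be measured only for frames flagged True, so that silent frames do not contribute unreliable measurements.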

A sound-collection delay time difference measurement step of measuring the measured delay time difference between sound collections, which is the delay time difference between the collected sound signals, using M-channel collected sound signals obtained when M (M is an integer of 2 or more) sound collecting means collect the emission signals emitted from K emission means, to each of which one of the K-channel (K is an integer of 1 or more) signals output by a signal generating unit is input, and the signal source signals emitted from N (N is an integer of 1 or more) signal sources other than the emission means;
a sound-collection-emission delay time difference measurement step of measuring, using the K-channel input signals input to the K emission means and the M-channel collected sound signals, the measured delay time difference between sound collection and emission, which is the delay time difference between each of the K-channel emission signals and each of the M-channel collected sound signals; and
where the positions of the sound collecting means are (x^_m, y^_m, z^_m) (m = 1, ..., M), the positions of the signal sources are (X^_n, Y^_n, Z^_n) (n = 1, ..., N), and the positions of the emission means are (X'^_k, Y'^_k, Z'^_k) (k = 1, ..., K), the measured distance between sound collecting means obtained by multiplying the measured delay time difference between sound collections by the speed of sound is defined as d_ijn, and the estimated distance between sound collecting means d^_ijn(P) is

an estimation step of taking the sum of e'(P) and e''(P) as e_1(P) and estimating, by numerical analysis using sequential correction, the positions of the sound collecting means (x^_m, y^_m, z^_m), the positions of the signal sources (X^_n, Y^_n, Z^_n), and the positions of the emission means (X'^_k, Y'^_k, Z'^_k) at which e_1(P) is minimized. A position estimation method comprising the above steps.
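The defining equations of e'(P) and e''(P) are not reproduced in this passage, so the following Python sketch rests on an assumed form: e'(P) is taken as the squared error between each measured value d_ijn and the difference of the distances from signal source n to sound collecting means i and j, and e''(P) as the squared error between each measured emitter-to-microphone distance and the corresponding distance implied by P. The function name `cost_e1`, the dictionary layout, and the sign convention are illustrative assumptions.

```python
import math

def cost_e1(mics, sources, emitters, d_meas, D_meas):
    """Illustrative e_1(P) = e'(P) + e''(P) under the assumed forms described above.

    mics, sources, emitters : lists of (x, y, z) tuples
    d_meas[(i, j, n)] : measured distance difference between microphones i and j for source n
    D_meas[(k, m)]    : measured distance between emitter k and microphone m
    """
    e_prime = 0.0
    for (i, j, n), d in d_meas.items():
        d_hat = math.dist(sources[n], mics[i]) - math.dist(sources[n], mics[j])
        e_prime += (d - d_hat) ** 2
    e_double_prime = 0.0
    for (k, m), D in D_meas.items():
        e_double_prime += (D - math.dist(emitters[k], mics[m])) ** 2
    return e_prime + e_double_prime
```

A function of this shape returns zero when the candidate positions reproduce every measurement and grows as they deviate, which is the property the sequential correction relies on.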

The position estimation method according to claim 11,
wherein the estimation step takes, in place of the aforementioned e_1(P), the sum of e'(P) and e''(P) multiplied by a weighting coefficient β as e_1(P), and estimates, by numerical analysis using sequential correction, the positions of the sound collecting means (x^_m, y^_m, z^_m), the positions of the signal sources (X^_n, Y^_n, Z^_n), and the positions of the emission means (X'^_k, Y'^_k, Z'^_k) at which e_1(P) is minimized. A position estimation method characterized by the above.

The position estimation method according to claim 11 or 12,
further comprising
a signal generating step of generating the input signals such that a signal is present on only one channel at any given time. A position estimation method characterized by the above.
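One way to satisfy the condition that only one channel carries a signal at any given time is to assign each channel its own time slot. The sketch below does this with noise bursts; the use of Gaussian noise, the name `generate_exclusive_signals`, and the parameters `burst_len` and `seed` are assumptions for illustration.

```python
import random

def generate_exclusive_signals(num_channels, burst_len, seed=0):
    """Return num_channels sample lists in which only one channel is active at a time.

    Channel k carries a noise burst during time slot k and is silent elsewhere.
    """
    rng = random.Random(seed)
    total = num_channels * burst_len
    signals = [[0.0] * total for _ in range(num_channels)]
    for k in range(num_channels):
        for t in range(k * burst_len, (k + 1) * burst_len):
            signals[k][t] = rng.gauss(0.0, 1.0)
    return signals
```

Because the bursts do not overlap, each collected sound signal can be attributed to a single emission means when the delay between emission and collection is measured.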

The position estimation method according to any one of claims 11 to 13,
wherein each of the following is an intrinsic parameter:
the distance between sound collecting means, which is the distance between two sound collecting means;
the distance between emission means, which is the distance between two emission means;
the distance between signal sources, which is the distance between two signal sources;
the distance between sound collecting means and emission means, which is the distance between a sound collecting means and an emission means;
the distance between signal source and sound collecting means, which is the distance between a signal source and a sound collecting means; and
the distance between signal source and emission means, which is the distance between a signal source and an emission means;
the method further comprises an input step in which at least one intrinsic parameter is input; and
the estimation step obtains, for each input intrinsic parameter, the sum of squared errors between that intrinsic parameter and its estimated value computed from the positions of the sound collecting means (x^_m, y^_m, z^_m), the positions of the signal sources (X^_n, Y^_n, Z^_n), and the positions of the emission means (X'^_k, Y'^_k, Z'^_k), and, taking the sum of the obtained squared-error sum, e'(P), and e''(P) as e_1(P) in place of the aforementioned e_1(P), estimates, by numerical analysis using sequential correction, the positions of the sound collecting means (x^_m, y^_m, z^_m), the positions of the signal sources (X^_n, Y^_n, Z^_n), and the positions of the emission means (X'^_k, Y'^_k, Z'^_k) at which e_1(P) is minimized. A position estimation method characterized by the above.
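The squared-error term for the intrinsic parameters can be sketched as follows. The sketch assumes that every intrinsic parameter is a known distance between two labelled points and that its estimated value is the Euclidean distance between the current position estimates; the name `intrinsic_error` and the labels such as "mic0" are hypothetical.

```python
import math

def intrinsic_error(positions, known_distances):
    """Sum of squared errors between known distances and distances implied by positions.

    positions       : dict mapping a hypothetical label such as "mic0" to (x, y, z)
    known_distances : dict mapping (label_a, label_b) to the known distance
    """
    total = 0.0
    for (a, b), known in known_distances.items():
        total += (known - math.dist(positions[a], positions[b])) ** 2
    return total
```

In the wording of the claim, the value returned here would be added to e'(P) and e''(P) to form the e_1(P) that is minimized.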

The position estimation method according to claim 14,
comprising an input step in which an intrinsic parameter is input,
wherein the estimation step uses the measured delay time differences between sound collections, the measured delay time differences between sound collection and emission, and the intrinsic parameter, and
the intrinsic parameter is at least one of: the distance between sound collecting means, which is the distance between two sound collecting means; the distance between emission means, which is the distance between two emission means; the distance between signal sources, which is the distance between two signal sources; the distance between sound collecting means and emission means, which is the distance between a sound collecting means and an emission means; the distance between signal source and sound collecting means, which is the distance between a signal source and a sound collecting means; and the distance between signal source and emission means, which is the distance between a signal source and an emission means. A position estimation method characterized by the above.

The position estimation method according to claim 15,
wherein the updating step performs the update using at least one of the distance between sound collecting means, the distance between emission means, and the distance between sound collecting means and emission means, each multiplied by a weighting coefficient, and
the estimation step
further includes
a setting step of increasing the weighting coefficient used in the multiplication as the update amount decreases, so that greater importance is placed on the distance between sound collecting means, the distance between emission means, and the distance between sound collecting means and emission means multiplied by the weighting coefficient. A position estimation method characterized by the above.

The position estimation method according to any one of claims 1 to 16,
further comprising:
a sound-collection delay time difference storage step of storing the measured delay time difference between sound collections of a sound source at a new position; and
a first new signal source detection step of, if the distance between the current measured delay time difference between sound collections measured in the sound-collection delay time difference measurement step and the past measured delay time differences between sound collections stored in the sound-collection delay time difference storage step is greater than or equal to, or exceeds, a predetermined threshold, recognizing the sound source for the current measured delay time difference between sound collections as a sound source at a new position,
wherein the estimation step performs the estimation using all the measured delay time differences between sound collections stored in the sound-collection delay time difference storage step and the measured delay time differences between sound collection and emission. A position estimation method characterized by the above.

The position estimation method according to any one of claims 1 to 17,
further comprising:
a sound-collection-emission delay time difference storage step of storing the measured delay time difference between sound collection and emission; and
a second new signal source detection step of, if the distance between the current measured delay time difference between sound collection and emission measured in the sound-collection-emission delay time difference measurement step and the past measured delay time differences between sound collection and emission stored in the sound-collection-emission delay time difference storage step is greater than or equal to, or exceeds, a predetermined threshold, recognizing the sound source for the current measured delay time difference between sound collection and emission as a sound source at a new position,
wherein the estimation step performs the estimation using all the measured delay time differences between sound collection and emission stored in the sound-collection-emission delay time difference storage step and the measured delay time differences between sound collections. A position estimation method characterized by the above.

The position estimation method according to claim 17 or 18,
wherein at least one of the sound-collection delay time difference storage step and the sound-collection-emission delay time difference storage step has an upper limit on the number of measured delay time differences between sound collections, or on the number of measured delay time differences between sound collection and emission, that it stores. A position estimation method characterized by the above.

The position estimation method according to any one of claims 1 to 19,
further comprising
a voiced section detection step of detecting a voiced section in the collected sound signal,
wherein at least one of the sound-collection delay time difference measurement step and the sound-collection-emission delay time difference measurement step measures the delay time difference between sound collections, or the delay time difference between sound collection and emission, for the collected sound signal detected as a voiced section. A position estimation method characterized by the above.

A position estimation program for causing a computer to execute each step of the position estimation method according to any one of claims 1 to 20.

A computer-readable recording medium on which the position estimation program according to claim 21 is recorded.