JP2007235875A - Transmission path estimating method, echo canceling method, sound source separating method, apparatus therefor, program, and recording medium - Google Patents

Transmission path estimating method, echo canceling method, sound source separating method, apparatus therefor, program, and recording medium

Info

Publication number
JP2007235875A
Authority
JP
Japan
Prior art keywords
matrix
signal space
signal
space matrix
residual signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2006058189A
Other languages
Japanese (ja)
Other versions
JP4422692B2 (en
Inventor
Akira Emura
暁 江村
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp filed Critical Nippon Telegraph and Telephone Corp
Priority to JP2006058189A priority Critical patent/JP4422692B2/en
Publication of JP2007235875A publication Critical patent/JP2007235875A/en
Application granted granted Critical
Publication of JP4422692B2 publication Critical patent/JP4422692B2/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Circuit For Audible Band Transducer (AREA)

Abstract

PROBLEM TO BE SOLVED: To estimate a transmission path or an input signal without using prior information about the transmission path.
SOLUTION: A current signal space matrix and a past-future signal space matrix are obtained from a multi-channel signal, a projection residual signal space matrix is obtained by projecting the current signal space matrix onto an orthogonal basis of the past-future signal space matrix, and an optimal response length is estimated directly from the projection residual signal space matrix. The projection residual signal space is then corrected on the basis of the estimated optimal response length to give a corrected projection residual signal space, from which the impulse response of the transmission path is estimated.
COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to a transmission path estimation method, apparatus, program, and recording medium for estimating a transmission path or an input signal from the multi-channel output signals of a single-input multiple-output or multiple-input multiple-output linear transmission path without using prior information about the transmission path, and to a dereverberation method and a sound source separation method that use it, together with their apparatuses, programs, and recording media.

The problem of estimating a transmission path or a source signal from observed signals that have passed through a linear transmission path, without using any knowledge of the transmission path or the source signal, is called the blind estimation problem. It is being studied in the fields of speech and acoustic signal processing and wireless communication.
In speech and acoustic signal processing, reverberation caused by reflections from walls and other surfaces is mixed into speech recorded indoors, which lowers the clarity of the speech and makes it harder to recognize. Solving the blind estimation problem restores the clarity and makes the speech easier to recognize. In wireless communication, reflections from buildings and other structures are also received, producing ghosts that make transmission errors more likely. Solving the blind estimation problem reduces transmission errors and raises the transmission rate.

These reverberation and ghost phenomena can be modeled as shown in FIG. 10, with s(k) as the source signal and y_1(k) to y_P(k) as the P-channel observed signals.

Figure 2007235875
Here h_1(k), ..., h_P(k) are the impulse responses of the transmission paths and N is the length of the impulse responses.
In the blind estimation problem for wireless communication, the frequency characteristics of the observed signals derive almost entirely from the transmission path. In the blind estimation problem for speech and acoustics, by contrast, the frequency characteristics of the observed signals derive from both the source signal and the transmission path, which makes estimation harder still.
For this blind estimation problem, the Least Squares Smoothing method (hereinafter the LSS method) and its extension, the J-LSS method, have been proposed as ways of estimating the transmission path from a colored source signal (Non-Patent Document 1).
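The image that carried the model equation is not reproduced in this text. A convolutional form consistent with the definitions given here (source s(k), observations y_1(k) to y_P(k), impulse responses h_p(k) of length N) would be the following; the index range 0 to N-1 is an assumption, not taken from the source.

```latex
y_p(k) = \sum_{n=0}^{N-1} h_p(n)\, s(k-n), \qquad p = 1, \dots, P
```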

The processing of the LSS method is described with reference to FIGS. 9 to 12. The LSS method presupposes that the actual impulse response length N is known, and sets it as the assumed length j of the transmission path impulse response in the current signal space matrix generation means 111, the past signal space matrix generation means 112, and the future signal space matrix generation means 113 shown in FIG. 9. The multi-channel signal stored in the multi-channel signal storage means 101 shown in FIG. 9 is explained with reference to FIG. 11. The multi-channel signal storage means 101 stores a multi-channel signal of a predetermined time length at a given time. The space spanned by vectors made of the signals located near the center of the stored multi-channel signal along the time axis is called the current signal space. The space spanned by signal vectors older than the current signal is called the past signal space, and the space spanned by signal vectors newer than the current signal space is called the future signal space.

Step SP1 (see FIG. 12): The past signal space, the future signal space, and the current signal space are obtained from the P-channel observed signals stored in the multi-channel signal storage means 101.
The description below is based on the column vector made up of the values y_1(k) to y_P(k) of the P-channel picked-up signals at time k:

Figure 2007235875
The past signal space matrix generation means 112 generates, from the multi-channel signal stored in the multi-channel signal storage means 101, a past signal space matrix Z_P (the "past data matrix" of Non-Patent Document 1) as the multi-channel past signal space. The future signal space matrix generation means 113 likewise generates a future signal space matrix Z_F (the "future data matrix" of Non-Patent Document 1) as the multi-channel future signal space.
Figure 2007235875
The recommended value of w is j, and T is set so that T > 2w. The matrices Z_P and Z_F each have (P × w) rows and (T + 1) columns.

The current signal space matrix generation means 111 generates a current signal space matrix Y (the "current data matrix" of Non-Patent Document 1), with (P × w) rows and (T + 1) columns, as the multi-channel current signal space.

Figure 2007235875
Step SP2: The Z matrix generation means 115 generates the past-future signal space matrix Z (the "future-past data matrix" of Non-Patent Document 1) from the multi-channel past signal space matrix Z_P and the multi-channel future signal space matrix Z_F.
Figure 2007235875
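Because the defining equation images are missing, the exact layout of Y, Z_P, Z_F, and Z cannot be read from this text. The following is a minimal sketch of one plausible construction for steps SP1 and SP2, assuming block rows of time-shifted P-channel samples. The function names, the newest-first block order, the choice of j + 1 blocks for Y (so that E later has P(j+1) rows, as step SP7 implies), and the stacking of Z_P above Z_F are all assumptions of this sketch.

```python
import numpy as np

def stack_blocks(y, newest, depth, T):
    """Stack `depth` delayed copies of the P-channel signal y (shape P x K).

    Block d (d = 0..depth-1) holds y[:, newest - d + c] in column c = 0..T,
    so the result has P*depth rows and T+1 columns.
    """
    return np.vstack([y[:, newest - d: newest - d + T + 1] for d in range(depth)])

def build_data_matrices(y, j, T, w=None):
    """Return (Y, Z_P, Z_F, Z) for assumed response length j (hypothetical layout)."""
    w = j if w is None else w                 # the text recommends w = j
    t0 = w                                    # first "current" time with w past samples
    if y.shape[1] < t0 + j + w + T + 1:
        raise ValueError("not enough samples for the requested j, w and T")
    Y = stack_blocks(y, t0 + j, j + 1, T)     # current block: times t .. t+j
    Z_P = stack_blocks(y, t0 - 1, w, T)       # past block: times t-w .. t-1
    Z_F = stack_blocks(y, t0 + j + w, w, T)   # future block: times t+j+1 .. t+j+w
    Z = np.vstack([Z_P, Z_F])                 # past-future matrix (stacking order assumed)
    return Y, Z_P, Z_F, Z
```

With P = 2, j = w = 3, and T = 10, this gives Z_P and Z_F of shape (6, 11), matching the (P × w) by (T + 1) size stated in the text.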

Step SP3: The multi-channel current signal space is projected onto the multi-channel past signal space and the multi-channel future signal space to obtain the projection residual signal space. To do this, the past-future signal space orthogonal basis calculation means 121 obtains an orthogonal basis of the matrix Z corresponding to the past-future signal space. This orthogonal basis V is obtained from the singular value decomposition of Z:
$Z = U \Sigma V^{T}$

The projection residual signal matrix extraction means 131 obtains the projection of the current signal space onto the past-future signal space as $Y V V^{T}$. The projection residual signal space is then obtained as the projection residual signal space matrix E (the "projection error matrix" of Non-Patent Document 1) by
$E = Y - Y V V^{T}$
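Step SP3 can be sketched in NumPy as follows. The sketch keeps only the right singular vectors of Z whose singular values are numerically nonzero, since a full square V would give V V^T = I and an all-zero residual. That truncation and the tolerance `rel_tol` are assumptions made explicit here, and the function name is hypothetical.

```python
import numpy as np

def projection_residual(Y, Z, rel_tol=1e-8):
    """Return E = Y - Y V V^T, with V an orthonormal basis of the row space of Z."""
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    rank = int(np.sum(s > rel_tol * s[0]))   # numerical rank of Z (tolerance is an assumption)
    V = Vt[:rank].T                          # columns span the row space of Z
    return Y - (Y @ V) @ V.T                 # remove the part of Y explained by Z
```

By construction every row of E is orthogonal to every row of Z, which is the property the later steps rely on.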

Step SP4: The transmission path estimation means 161 performs a singular value decomposition of the projection residual signal space matrix E and extracts the left singular vector corresponding to the largest singular value. Let the impulse responses of the P-channel sound pickup paths be

Figure 2007235875
and let ĥ_1(0) denote the estimate of h_1(0). The components of the left singular vector then correspond to the estimates of the P-channel sound pickup path impulse responses as follows:
Figure 2007235875

Therefore, taking every P-th component of the left singular vector and arranging them as vectors gives P vectors corresponding to the estimates of the P-channel sound pickup path impulse responses:

Figure 2007235875
From these P vectors the transmission path estimation means 161 can estimate the transmission path. Using this estimate, the inverse filter calculation means 201 calculates the inverse filter coefficients and sets the filter characteristics of the inverse filter means 202. This completes the processing of the LSS method.
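The extraction in step SP4 can be sketched as below. The ordering of the tap blocks inside the singular vector comes from an equation image that is not reproduced here, so the sketch assumes the newest block comes first (last tap at the top) and reverses it. The estimate is determined only up to a common scale factor, and the function name is hypothetical.

```python
import numpy as np

def channel_from_residual(E, P):
    """Estimate P impulse responses (up to a common scale) from a near rank-1 E."""
    U, _, _ = np.linalg.svd(E, full_matrices=False)
    u = U[:, 0]                    # left singular vector of the largest singular value
    blocks = u.reshape(-1, P)      # one row per tap, P channel entries each
    return blocks[::-1].T          # assumed order: last tap first, so reverse; shape P x taps
```

Row p of the result collects every P-th component of the singular vector, which is the "take every P-th component" operation described in the text.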

The LSS method presupposes that the actual impulse response length N is known, but in practice it is unknown, so processing to determine the response length is needed. The J-LSS method described in Non-Patent Document 2 includes this processing.
The J-LSS method is described with reference to FIGS. 13 and 14. The configuration that characterizes the J-LSS method shown in FIG. 13 is that a response length estimation means 141 is provided between the projection residual signal matrix extraction means 131 and the transmission path estimation means 161', and that the procedure of the transmission path estimation means 161 is slightly modified accordingly to give 161'. As shown in FIG. 14, steps SP1 to SP3 are common to the LSS method and the J-LSS method, so the description of the J-LSS processing below starts from step SP6, with reference to FIG. 14.

Step SP6: The response length estimation means 141 performs a singular value decomposition of the projection residual signal space matrix E:
$E = U_E \Sigma_E V_E^{T}$
A response length is then assumed. Call this the assumed length k (1 ≤ k ≤ j). Setting it to j is one possible choice.

Step SP7: The left singular vectors corresponding to the P(j+1) - j + k - 1 smallest singular values of the matrix E are taken from the matrix U_E. A matrix D having these left singular vectors as its rows is generated, D is divided into j equal parts,
$D = [\, D_1 \ \cdots \ D_j \,]$
and from these the block Hankel matrix Γ_k(D)

Figure 2007235875
is generated.

Step SP8: To check the validity of the assumed length k, it is examined whether the linear equation for the block Hankel matrix,
$\Gamma_k(D) f = 0$
has a solution other than the zero vector. If it does, the assumed length k is judged to match the impulse response length N and processing goes to step SP10. If it does not, processing goes to step SP9.

Step SP9: The assumed length k is changed and processing returns to step SP7. For example, if k was initially set to j, it can be changed as k ← k - 1.
Step SP10: The transmission path estimation means 161' obtains the impulse response as a nonzero solution f of
$\Gamma_k(D) f = 0$
The correspondence between f and the impulse response is the same as that between the singular vector and the impulse response in step SP4.
L. Tong and Q. Zhao, "Joint Order Detection and Blind Channel Estimation by Least Squares Smoothing," IEEE Transactions on Signal Processing, Vol. 47, No. 9, 1999.

In the J-LSS method described above (the prior art), the optimal impulse response length is found by repeatedly verifying the validity of the assumed impulse response length k while changing the assumption a little at a time, so estimating the transmission path or the input signal requires an enormous amount of computation.
An object of the present invention is to estimate the transmission path or the input signal with a small amount of computation, without the processing of changing the assumed impulse response length and verifying it repeatedly.

In the present invention, the optimal response length is estimated directly from the projection residual signal space obtained first. The basic operation is a procedure that corrects the projection residual signal space on the basis of this optimal response length to obtain a corrected projection residual signal space, and estimates the impulse response of the transmission path from the corrected projection residual signal space.
The specific procedures of the transmission path estimation method according to the present invention, and of the dereverberation method and the sound source separation method that operate using it, are described below.
The transmission path estimation method according to the present invention estimates a transmission path from observed signals obtained from a source signal through a plurality of linear transmission paths, and is characterized by including: multi-channel signal storage processing that stores the multi-channel observed signals; current signal space matrix generation processing that obtains a current signal space matrix from the stored multi-channel observed signals; past-future signal space matrix generation processing that generates a past-future signal space matrix from the stored multi-channel observed signals; past-future signal space orthogonal basis calculation processing that obtains an orthogonal basis of the past-future signal space matrix; projection residual signal matrix extraction processing that obtains, from the current signal space matrix and the orthogonal basis of the past-future signal space, the projection residual signal space matrix that results when the current signal space matrix is projected onto the past-future signal space; projection residual signal matrix correction processing that obtains a redundant projection residual signal space matrix E_b from the rank r of the projection residual signal space matrix E and takes the component of E orthogonal to E_b as the corrected projection residual signal space matrix; and transmission path estimation processing that performs a singular value decomposition of the corrected projection residual signal matrix and extracts the left singular vector corresponding to the largest singular value to obtain the impulse responses of the multi-channel transmission paths.

In addition to the processing applied in the transmission path estimation method above, the dereverberation method according to the present invention is characterized by including inverse filter calculation processing that obtains an inverse filter from the estimated transmission path impulse responses, and inverse filter processing that applies the inverse filter to the multi-channel observed signals and outputs a source signal estimate.
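The text does not say how the inverse filter is computed from the estimated impulse responses. One common choice, assumed here purely for illustration, is a least-squares multichannel inverse in which filters g_p are chosen so that the sum over channels of h_p convolved with g_p approximates a unit impulse. The function names, the filter length `M`, and the `delay` parameter are hypothetical.

```python
import numpy as np

def conv_matrix(h, M):
    """Return the (len(h)+M-1) x M matrix C with C @ g == np.convolve(h, g)."""
    N = len(h)
    C = np.zeros((N + M - 1, M))
    for m in range(M):
        C[m:m + N, m] = h          # column m is h delayed by m samples
    return C

def inverse_filters(h, M, delay=0):
    """Least-squares filters g (shape P x M) with sum_p h[p] * g[p] ~ delayed impulse."""
    P, N = h.shape
    H = np.hstack([conv_matrix(h[p], M) for p in range(P)])   # (N+M-1) x (P*M)
    d = np.zeros(N + M - 1)
    d[delay] = 1.0                                            # target: unit impulse
    g, *_ = np.linalg.lstsq(H, d, rcond=None)
    return g.reshape(P, M)
```

A source estimate would then be formed by filtering each observed channel with its g_p and summing the results. Exact inversion in this sketch needs at least two channels without common zeros and a filter length M of at least N - 1.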

Further, in addition to the processing used in the transmission path estimation method above, the sound source separation method according to the present invention is characterized by including dereverberation filter calculation processing that obtains a dereverberation filter from the estimated transmission path impulse responses, dereverberation filtering processing that applies that dereverberation filter to the multi-channel observed signals, and sound source separation processing that performs sound source separation on the signals output by the dereverberation processing.

The present invention estimates the optimal response length directly from the projection residual signal space obtained by projecting the current signal space matrix onto the past-future signal space matrix. It then corrects the projection residual signal space on the basis of this optimal response length to obtain a corrected projection residual signal space, and estimates the impulse response of the transmission path from it. Because this removes the need for the iterative processing of the conventional J-LSS method, the impulse response of the transmission path can be estimated with less computation than in the prior art. The present invention is therefore particularly effective when the impulse response length to be estimated may be large.

The transmission path estimation method, dereverberation method, sound source separation method, and the corresponding apparatuses according to the present invention can all be realized in hardware. The simplest realization, and the best mode, is to install the transmission path estimation program, dereverberation program, or sound source separation program according to the present invention on a computer and have the computer function as these apparatuses.

When a computer is made to function as a transmission path estimation apparatus, the transmission path estimation program is installed on the computer. The program builds, within the computer, the multi-channel signal storage means, current signal space matrix generation means, past signal space matrix generation means, past-future signal space orthogonal basis calculation means, projection residual signal matrix extraction means, projection residual signal matrix correction means, and transmission path estimation means, so that the computer functions as a transmission path estimation apparatus.

When a computer is made to function as a dereverberation apparatus, the dereverberation program is installed on the computer. The program builds, within the computer, the configuration of the transmission path estimation apparatus above with an inverse filter calculation means and an inverse filter means added, so that the computer functions as a dereverberation apparatus.
When a computer is made to function as a sound source separation apparatus, the sound source separation program is installed on the computer. The program builds, within the computer, the configuration of the transmission path estimation apparatus above with a dereverberation filter calculation means, a dereverberation filtering means, and a sound source separation means added, so that the computer functions as a sound source separation apparatus.

A first embodiment of the present invention is described with reference to FIGS. 1 to 3. Blocks that perform the same processing as in the prior art shown in FIG. 9 are given the same reference numerals.
First, the assumed transmission path impulse response length j is set in advance, in the current signal space matrix generation means 111, the past signal space matrix generation means 112, and the future signal space matrix generation means 113 of FIG. 1, to a value expected to be at least the impulse response length N. Embodiment 1 shown in FIG. 1 provides the past signal space matrix generation means 112 and the future signal space matrix generation means 113 and generates the Z matrix from the past and future signal space matrices they produce. Because the Z matrix can also be generated without going through these two means, the past signal space matrix generation means 112, the future signal space matrix generation means 113, and the Z matrix generation means 115 are together referred to here as the Z matrix generation unit 102.

Step 1: The current signal space matrix generation means 111 obtains the current signal space matrix Y from the multi-channel signal stored in the multi-channel signal storage means 101. In the same way, the past signal space matrix generation means 112 obtains the past signal space matrix Z_P and the future signal space matrix generation means 113 obtains the future signal space matrix Z_F. The details are the same as in step SP1 of the background art and are omitted here.

Step 2: The Z matrix generation unit 102 generates the matrix Z from the past signal space matrix Z_P and the future signal space matrix Z_F by

Figure 2007235875
The details are the same as in step SP2 of the background art and are omitted here.

Step 3: The past-future signal space orthogonal basis calculation means 121 obtains the orthogonal basis V of the matrix Z, and the projection residual signal space matrix E is obtained by
$E = Y - Y V V^{T}$
The details are the same as in step SP3 of the background art and are omitted here.

Step 4: The projection residual signal matrix correction means 151 obtains the rank r of the projection residual signal space matrix E and takes j - r + 1 as the optimal response length. The bottom P × (r - 1) rows of E are extracted as the redundant projection residual signal space matrix E_b, which has P × (r - 1) rows and (T + 1) columns.

Here the projection residual signal space matrix E is regarded as being partitioned into a matrix E_a and the redundant projection residual signal space matrix E_b as shown below, and only E_b is extracted from E.

Figure 2007235875
Next, the orthogonal basis V_b of the extracted matrix E_b is obtained, for example via a singular value decomposition as described later, and the corrected projection residual signal matrix E_2 is obtained by
$E_2 = E_a - E_a V_b V_b^{T}$
Step 4 is processing specific to the present application, found in neither the LSS method nor the J-LSS method of the background art.
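Step 4 can be sketched as follows, assuming E_a is simply the part of E above the bottom P × (r - 1) rows and that the rank is decided by a relative singular value threshold. In noisy data the threshold would need more care. The function name and `rel_tol` are hypothetical.

```python
import numpy as np

def correct_residual(E, P, rel_tol=1e-8):
    """Return (E2, r): the corrected residual and the numerical rank r of E."""
    s = np.linalg.svd(E, compute_uv=False)
    r = int(np.sum(s > rel_tol * s[0]))        # rank of E (threshold is an assumption)
    split = E.shape[0] - P * (r - 1)           # the bottom P*(r-1) rows form Eb
    Ea, Eb = E[:split], E[split:]
    if Eb.shape[0] == 0:                       # r == 1: nothing redundant to remove
        return Ea, r
    _, sb, Vbt = np.linalg.svd(Eb, full_matrices=False)
    Vb = Vbt[: int(np.sum(sb > rel_tol * sb[0]))].T   # orthonormal basis of Eb's row space
    return Ea - (Ea @ Vb) @ Vb.T, r            # E2 = Ea - Ea Vb Vb^T
```

When E has the structure described in the text, the returned E2 is numerically rank 1, so its dominant left singular vector can be read off as in step SP4.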

Step 5: The transmission path estimation means 161 performs a singular value decomposition of the corrected projection residual matrix E_2 and obtains the impulse response estimates of the multi-channel sound pickup paths from the left singular vector corresponding to the largest singular value. Apart from the input being the corrected projection residual matrix E_2 instead of the projection residual matrix E, the details are the same as in step SP4 of the LSS method in the background art and are omitted here.

A supplementary explanation is given with reference to FIG. 2. Y in FIG. 2A represents the matrices generated by the current signal space matrix generation means 111, the past signal space matrix generation means 112, and the future signal space matrix generation means 113. Suppose this matrix Y contains a redundant part X. As shown in FIG. 2B, the projection residual signal matrix E_2 obtained in step 4 is the projection residual signal matrix after the amount of the redundant part X has been determined, the redundant part removed, and the matrix corrected to the proper size. Estimating the transmission path with this corrected projection residual signal matrix E_2 means the transmission path is estimated with the proper impulse response length.

Therefore, according to the present invention, the corrected projection residual signal matrix E_2 is obtained by executing the past-future signal space orthogonal basis extraction processing, the projection residual signal matrix extraction processing, and the projection residual signal matrix correction processing just once each, which keeps the amount of computation small.
The prior-art J-LSS method has to change the assumed response length a little at a time and repeat the processing that verifies the assumption. The present invention needs no such iteration (the loop of steps SP7, SP8, and SP9 shown in FIG. 14).
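To illustrate the single-pass flow of steps 1 to 5, here is a minimal noise-free sketch on synthetic data. It is not the patent's implementation. The matrix layout, the rank threshold, the white Gaussian source, and the convention that j and j - r + 1 count the highest tap index (so the estimate has j - r + 2 taps) are all assumptions of this sketch.

```python
import numpy as np

def stack_blocks(y, newest, depth, T):
    """Block d holds y[:, newest - d + c] for columns c = 0..T (newest block first)."""
    return np.vstack([y[:, newest - d: newest - d + T + 1] for d in range(depth)])

def row_basis(M, rel_tol=1e-8):
    """Orthonormal basis (as columns) of the row space of M, via a truncated SVD."""
    _, s, Vt = np.linalg.svd(M, full_matrices=False)
    return Vt[: int(np.sum(s > rel_tol * s[0]))].T

def estimate_channels(y, j, T, rel_tol=1e-8):
    """One-pass estimate; returns (h_est of shape P x (j-r+2), rank r)."""
    P, w, t0 = y.shape[0], j, j
    Y = stack_blocks(y, t0 + j, j + 1, T)                      # step 1: current matrix
    Z = np.vstack([stack_blocks(y, t0 - 1, w, T),              # step 2: past block
                   stack_blocks(y, t0 + j + w, w, T)])         #         future block
    V = row_basis(Z, rel_tol)
    E = Y - (Y @ V) @ V.T                                      # step 3: projection residual
    s = np.linalg.svd(E, compute_uv=False)
    r = int(np.sum(s > rel_tol * s[0]))                        # step 4: rank of E
    split = E.shape[0] - P * (r - 1)
    Ea, Eb = E[:split], E[split:]
    if Eb.shape[0]:
        Vb = row_basis(Eb, rel_tol)
        Ea = Ea - (Ea @ Vb) @ Vb.T                             # corrected residual E2
    u = np.linalg.svd(Ea, full_matrices=False)[0][:, 0]        # step 5: dominant vector
    return u.reshape(-1, P)[::-1].T, r

# Synthetic check: 2 channels with 4 taps each, over-assumed length j = 5.
rng = np.random.default_rng(0)
P, L, j, T = 2, 3, 5, 200
K = 3 * j + T + 1                                              # samples the layout needs
h_true = rng.standard_normal((P, L + 1))
src = rng.standard_normal(K)
y = np.array([np.convolve(src, h_true[p])[:K] for p in range(P)])
h_est, r = estimate_channels(y, j, T)
```

The estimate agrees with `h_true` only up to a common scale factor, so a normalized correlation is the natural way to compare them. No loop over candidate lengths appears anywhere in the sketch.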

FIG. 3 shows the results of a simulation performed to demonstrate the effectiveness of the method of Embodiment 1. In this simulation, a speech signal sampled at 8 kHz is used as the source signal, and the transmission path and the inverse filter are estimated from the observation signals observed through a 1-input 2-output transmission path. The impulse response length of the 1-input 2-output transmission path is set to N = 500, and the initial assumed impulse response length is set to J = 550. The transmission path uses a room impulse response with a reverberation time of 200 ms, measured at a sampling frequency of 8 kHz and truncated to 500 taps.
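A rough sketch of how such a 1-input 2-output observation could be set up is shown below. The measured room responses and the speech signal are not available here, so exponentially decaying noise and white noise are used as hypothetical stand-ins; only the sampling rate, the tap count N = 500, and the assumed length J = 550 come from the text.

```python
import numpy as np

FS = 8000          # sampling frequency in Hz (from the text)
N_TAPS = 500       # true impulse response length N (from the text)
J_ASSUMED = 550    # initial assumed impulse response length J (from the text)

rng = np.random.default_rng(0)

# Hypothetical stand-in for the two measured room impulse responses:
# white noise shaped by an exponential decay.
decay = np.exp(-np.arange(N_TAPS) / 100.0)
h = rng.standard_normal((2, N_TAPS)) * decay

# Hypothetical stand-in for one second of speech.
source = rng.standard_normal(FS)

# Each observed channel is the source convolved with that channel's response.
observed = np.stack([np.convolve(source, h[c]) for c in range(2)])

# Duration covered by the truncated response: 500 taps at 8 kHz.
truncation_ms = 1000.0 * N_TAPS / FS
```

At 8 kHz, 500 taps cover 62.5 ms, so the assumed length J = 550 overestimates the true length by 50 taps.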

A1 and A2 in FIG. 3 are the impulse responses (true values) from the signal source to the output of the transmission path. C1 and C2 in FIG. 3 are the impulse responses estimated by the method of Embodiment 1. Comparing A1 with C1 and A2 with C2 shows that the waveforms are almost identical, so the impulse responses are estimated well by the method of Embodiment 1.
Modification of Embodiment 1: After the optimal response length has been obtained from the projection residual signal matrix E via its rank, the impulse response of the transmission path can also be obtained by the following procedure, which recomputes the matrices Y, Z, and E.
(1) Reset the assumed length to j−r+1 (r+1 being the redundant portion) and apply step 1 of Embodiment 1 to obtain the matrix Y.
(2) Reset the assumed length to j−r+1 and apply step 2 of Embodiment 1 to obtain the matrix Z.
(3) Apply step 3 of Embodiment 1 to recompute the projection residual matrix.
(4) Apply step 5 of Embodiment 1 with the recomputed projection residual matrix as input.
The method of Embodiment 1 only needs to obtain the orthogonal basis of the matrix E2, which has P×(r−1) rows and (T+1) columns.

However, the recomputation procedure above needs to obtain the orthogonal basis of the matrix Z, which has (2P×w) rows and (T+1) columns with w = j−r+1. Compared with the method of Embodiment 1, the recomputation procedure therefore requires additional computation corresponding to the size difference between the matrix Z and the matrix E2.
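The size difference can be made concrete with a small calculation. The helper below is a hypothetical illustration that only evaluates the two row counts quoted in the text; the value r = 51 in the example is an assumed rank chosen for illustration, not a figure from the simulation.

```python
def basis_row_counts(P, j, r):
    """Row counts of the matrices whose orthogonal basis must be computed.

    P: number of sound pickup channels
    j: assumed impulse response length
    r: rank of the projection residual signal matrix E
    """
    w = j - r + 1                    # reset window length in the recomputation
    rows_recompute = 2 * P * w       # matrix Z: (2P x w) rows
    rows_direct = P * (r - 1)        # P x (r - 1) rows, as stated for Embodiment 1
    return rows_recompute, rows_direct

# Example with P = 2 and j = 550 from the simulation, and an assumed r = 51.
rows_recompute, rows_direct = basis_row_counts(P=2, j=550, r=51)
```

With these assumed numbers, the recomputation works on a matrix with 2000 rows while the direct correction works on one with 100 rows, both with T+1 columns.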

A second embodiment of the present invention is described with reference to FIG. 4. In the second embodiment, the source signal with reverberation removed is estimated using the transmission path impulse responses estimated by the first embodiment. Blocks that perform the same processing as in the prior art of FIG. 9 are given the same reference numerals.
Step 1: The impulse response estimate of the multichannel sound pickup path is obtained using the method of the first embodiment.
Step 2: The inverse filter calculation means 201 obtains an inverse filter from the estimated transmission path impulse responses using the method described in Japanese Patent Laid-Open No. 62-190935.
Step 3: The inverse filter means 202 applies this inverse filter to the multichannel observation signals to estimate the original source signal.
To demonstrate the effectiveness of the method of Embodiment 2, a simulation was run with the same settings as the simulation of the first embodiment. FIG. 5 shows the impulse response from the signal source to the output of the inverse filter when the inverse filter obtained by the method of Embodiment 2 is used. This impulse response consists almost entirely of the direct wave, which shows that the reverberation is removed well.
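Steps 2 and 3 can be illustrated with a generic least-squares multichannel inverse filter. This sketch is not taken from the cited publication and may differ from its method: it simply solves for filters g_p such that the sum over channels of h_p convolved with g_p approximates a unit impulse. All function names and the random test responses are hypothetical.

```python
import numpy as np

def convolution_matrix(h, L):
    """Return C of shape (len(h)+L-1, L) with C @ g == np.convolve(h, g)."""
    N = len(h)
    C = np.zeros((N + L - 1, L))
    for col in range(L):
        C[col:col + N, col] = h
    return C

def multichannel_inverse_filter(H, L):
    """H: (P, N) impulse responses. Returns G: (P, L) inverse filters."""
    P, N = H.shape
    # Stack the per-channel convolution matrices side by side.
    A = np.hstack([convolution_matrix(H[p], L) for p in range(P)])
    d = np.zeros(N + L - 1)
    d[0] = 1.0                       # target: unit impulse (direct wave only)
    g, *_ = np.linalg.lstsq(A, d, rcond=None)
    return g.reshape(P, L)

# Hypothetical demo: two random 8-tap responses, 7-tap inverse filters
# (N - 1 taps makes the system square for two channels).
rng = np.random.default_rng(1)
H_demo = rng.standard_normal((2, 8))
G_demo = multichannel_inverse_filter(H_demo, L=7)

# Overall response from source to inverse filter output.
equalized = sum(np.convolve(H_demo[p], G_demo[p]) for p in range(2))
```

When the channels share no common zeros, the overall response `equalized` is close to a unit impulse, which corresponds to the direct-wave-only response described for FIG. 5.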

Next, a third embodiment of the present invention is described with reference to FIG. 6.
This embodiment aims at dereverberation and separation of signals observed through a multi-input multi-output transmission path such as that in FIG. 7. Let N be the impulse response length of the multi-input multi-output transmission path, and set its assumed length to a value j that is expected to be at least N. The number of sound sources Q is assumed to be known, and the number of sound pickup channels is P, where Q < P.

Step 1: The current signal space, the past signal space, and the future signal space are obtained from the multichannel observation signals stored in the multichannel signal storage means 101.
The description below is based on the column vector

[Equation image: Figure 2007235875]

consisting of the values y1(k) to yP(k) of the P-channel sound pickup signals at time k.
Similarly, from the multichannel signals stored in the multichannel signal storage means 101, the past signal space matrix generation means 112 generates the matrix ZF as the multichannel past signal space.
[Equation image: Figure 2007235875]
Here, using the number of sound sources Q, w is set so that w ≥ Q×j, and T is set so that T > 2w. The matrices ZP and ZF each have (P×w) rows and (T+1) columns.
The current signal space matrix generation means 111 generates the matrix Y as the multichannel current signal space from the multichannel observation signals stored in the multichannel signal storage means 101. The matrix Y has (P×j) rows and (T+1) columns.
[Equation image: Figure 2007235875]
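The exact layout of the matrices ZP, Y, and ZF is given in equation images that are not reproduced here. The sketch below therefore assumes a conventional block-Hankel layout, in which column t stacks consecutive multichannel samples and the past, current, and future blocks follow one another in time. The function name, the layout, and the time offsets are assumptions; only the matrix sizes are taken from the text.

```python
import numpy as np

def block_hankel(y, start, depth, T):
    """Build a (P*depth, T+1) block-Hankel matrix from y of shape (P, samples).

    Column t stacks y[:, start+t], y[:, start+t+1], ..., y[:, start+t+depth-1].
    """
    P = y.shape[0]
    M = np.empty((P * depth, T + 1))
    for t in range(T + 1):
        block = y[:, start + t:start + t + depth]   # shape (P, depth)
        M[:, t] = block.T.reshape(-1)               # time-major stacking
    return M

# Hypothetical small example: P = 2 channels, Q = 1 source, j = 3, w = 3, T = 7.
P, j, w, T = 2, 3, 3, 7
y = np.arange(32.0).reshape(P, 16)       # 16 samples are enough for these sizes

ZP = block_hankel(y, 0, w, T)            # assumed "past" block
Y = block_hankel(y, w, j, T)             # assumed "current" block
ZF = block_hankel(y, w + j, w, T)        # assumed "future" block
```

The shapes match the sizes stated above: ZP and ZF have P×w rows and Y has P×j rows, each with T+1 columns.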

Step 2: The Z matrix generation means 115 generates the matrix Z from the past signal space matrix ZP and the future signal space matrix ZF by

[Equation image: Figure 2007235875]

Step 3: The past-future signal space orthogonal basis calculation means 121 obtains the orthogonal basis V of the matrix Z, and the projection residual signal space matrix E is obtained by
E = Y − Y V V^T
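Steps 2 and 3 can be sketched as follows. The sketch assumes that Z is formed by stacking ZP on top of ZF (consistent with Z having 2P×w rows) and that V is an orthonormal basis of the row space of Z obtained by singular value decomposition. The function name and the rank tolerance are hypothetical.

```python
import numpy as np

def projection_residual(Y, ZP, ZF, tol=1e-10):
    """Return E = Y - Y V V^T and the basis V of the row space of Z."""
    Z = np.vstack([ZP, ZF])                      # assumed stacking order
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    rank = int(np.sum(s > tol * s[0]))           # numerical rank of Z
    V = Vt[:rank].T                              # (T+1, rank), orthonormal columns
    E = Y - Y @ V @ V.T                          # remove the part of Y explained by Z
    return E, V

# Hypothetical demo with random matrices of compatible sizes.
rng = np.random.default_rng(0)
ZP_demo = rng.standard_normal((2, 10))
ZF_demo = rng.standard_normal((2, 10))
Y_demo = rng.standard_normal((3, 10))
E_demo, V_demo = projection_residual(Y_demo, ZP_demo, ZF_demo)
```

By construction, every row of E is orthogonal to every row of Z, which is the sense in which E is the residual of projecting the current signal space onto the past-future signal space.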

Step 4: The projection residual signal matrix correction means 151 obtains the rank r of the projection residual signal matrix E. Using this rank r, the bottom P×(r−Q) rows of the projection residual signal space matrix E are extracted as the redundant projection residual signal space matrix Eb, which has P×(r−Q) rows and (T+1) columns. The matrix E is thus divided into the matrix Ea and the redundant projection residual signal space matrix Eb as follows.

E = [Ea; Eb] (Ea stacked above Eb)
The orthogonal basis Vb of the redundant projection residual signal space matrix Eb is obtained via singular value decomposition or the like, as described above, and the corrected projection residual signal space matrix is obtained by
E2 = Ea − Ea Vb Vb^T
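The correction in step 4 can be sketched as below. The relative tolerance used to decide the numerical rank is an assumption, as are the function name and the synthetic test matrix; with noisy observations the rank decision would need a more careful threshold.

```python
import numpy as np

def correct_projection_residual(E, P, Q, tol=1e-8):
    """Return the corrected matrix E2 = Ea - Ea Vb Vb^T and the rank r of E."""
    s = np.linalg.svd(E, compute_uv=False)
    r = int(np.sum(s > tol * s[0]))              # numerical rank of E
    n_red = P * (r - Q)                          # number of redundant rows
    if n_red <= 0:
        return E.copy(), r                       # nothing to remove
    Ea, Eb = E[:-n_red], E[-n_red:]              # Eb is the bottom block
    _, sb, Vbt = np.linalg.svd(Eb, full_matrices=False)
    rb = int(np.sum(sb > tol * sb[0]))
    Vb = Vbt[:rb].T                              # basis of the row space of Eb
    E2 = Ea - Ea @ Vb @ Vb.T                     # keep the part orthogonal to Eb
    return E2, r

# Hypothetical demo: E has rank 3, while its bottom block spans only 2 directions.
rng = np.random.default_rng(2)
B = rng.standard_normal((3, 30))
Ea_demo = rng.standard_normal((4, 3)) @ B
Eb_demo = rng.standard_normal((4, 2)) @ B[:2]
E_full = np.vstack([Ea_demo, Eb_demo])
E2_demo, r_demo = correct_projection_residual(E_full, P=2, Q=1)
```

In the demo, removing the two directions spanned by Eb leaves a corrected matrix of rank one, whose rows are orthogonal to the rows of Eb.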

Step 5: The transmission path estimation means 161 performs a singular value decomposition of the corrected projection residual signal matrix E2 and extracts the left singular vectors from the one corresponding to the largest singular value through the one corresponding to the Q-th largest singular value. When the q-th singular vector is written as

[Equation image: Figure 2007235875]
then
[Equation image: Figure 2007235875]
is the impulse response estimate of the P-channel sound pickup path for the q-th sound source.
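The extraction of the Q leading left singular vectors can be sketched as follows. How the entries of each vector map to the per-channel impulse responses is defined in the equation images that are not reproduced, so this hypothetical sketch stops at the vectors themselves.

```python
import numpy as np

def leading_left_singular_vectors(E2, Q):
    """Return the Q left singular vectors of E2 with the largest singular values."""
    U, s, _ = np.linalg.svd(E2, full_matrices=False)
    return U[:, :Q], s[:Q]           # singular values are sorted in descending order

# Hypothetical demo: a rank-2 matrix whose column space is spanned by A.
rng = np.random.default_rng(3)
A = rng.standard_normal((8, 2))
E2_rank2 = A @ rng.standard_normal((2, 40))
U_Q, s_Q = leading_left_singular_vectors(E2_rank2, Q=2)

# A should lie entirely in the subspace spanned by the two vectors.
residual = A - U_Q @ (U_Q.T @ A)
```

The two vectors span the same subspace as the columns of A, but each individual vector is in general a mixture of the underlying directions.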

Step 6: The dereverberation filter calculation means 203 obtains a dereverberation filter from the transmission path impulse responses estimated in step 5, using the method described in the above-mentioned Japanese Patent Laid-Open No. 62-190935.

Step 7: The dereverberation filtering means 204 applies this dereverberation filter to the multichannel observation signals to estimate the multichannel signals with the reverberation components removed.

Step 8: The sound source separation means 301 applies sound source separation processing to the multichannel signals with the reverberation components removed, and extracts the source signals.

For the sound source separation processing, a blind separation algorithm based on independent component analysis (ICA), described in Reference 1, can be used. (Reference 1: J.F. Cardoso, "Blind Signal Separation: Statistical Principles," Proceedings of the IEEE, Vol. 86, No. 10, pp. 2009-2025, 1998.)
The transmission path estimation method included in the above dereverberation processing is based on the LSS method (Non-Patent Document 1). However, the LSS method handles only the case where the sound source is a single colored signal and the transmission path is a single-input multi-output system.
In the LSS method, the assumed impulse response length j is recommended as w when generating the past signal space matrix ZP and the future signal space matrix ZF from the multichannel signals. However, applying the LSS method as is to observation signals of a multi-input multi-output system does not give the desired result.

Therefore, in the present invention, w is set so that w ≥ Q×j, where Q is the number of sound sources, in order to extend the LSS method to multi-input multi-output systems.
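The two constraints stated for Embodiment 3, w ≥ Q×j and T > 2w, can be turned into a small helper that returns the smallest integer values satisfying both. The function name is hypothetical, and choosing the minimum values is an assumption made only for illustration; the text does not say which values within the allowed range are preferable.

```python
def choose_window_sizes(Q, j):
    """Smallest integers w and T with w >= Q*j and T > 2*w."""
    w = Q * j            # lower bound on w for Q sources
    T = 2 * w + 1        # smallest integer strictly greater than 2w
    return w, T

# Example: Q = 2 sources with an assumed response length j = 550.
w_min, T_min = choose_window_sizes(Q=2, j=550)
```

For Q = 1 this reduces to w = j, which is the value recommended by the LSS method.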
For the problem of separating convolutively mixed signals addressed in Embodiment 3, a technique called frequency-domain blind source separation (frequency-domain BSS), described in Japanese Patent Laid-Open No. 2003-333682, has conventionally been widely used.

However, it is known that although frequency-domain BSS can remove the direct sound component of the unwanted sound almost completely, it cannot remove the reverberation component (indirect sound component) of that sound well. This component remains as residual noise, the crosstalk component is around −10 dB, and the separation performance degrades significantly.
With the aim of improving separation performance, Japanese Patent Laid-Open No. 2003-99093 proposes a method that performs noise suppression processing (crosstalk component suppression processing) as reverberation handling after the frequency-domain blind source separation processing. FIG. 8 shows the configuration described in that patent document. In the noise suppression processing of the noise suppression means 302, the reverberation component is modeled as a signal obtained by adjusting the delay and gain of the estimated direct sound component. Based on this model, the crosstalk component contained in the signal after the sound source separation processing of the sound source separation means 301 is estimated and subtracted. However, because the reverberation component is modeled simply as reflected waves of only a few orders, the improvement in separation performance is limited to 3 to 4 dB on average.

In contrast, the method of Embodiment 3 makes it possible to extract the desired sound without reverberation components while suppressing the crosstalk component to around −30 dB.
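To put the quoted levels in perspective, the short calculation below converts them to linear power ratios. It assumes that the decibel figures refer to power (10 log10), which the text does not state explicitly; the function name is hypothetical.

```python
def db_to_power_ratio(db):
    """Convert a level in dB to a linear power ratio, assuming 10*log10 scaling."""
    return 10.0 ** (db / 10.0)

# Crosstalk levels quoted in the text.
crosstalk_bss = db_to_power_ratio(-10.0)           # frequency-domain BSS alone
crosstalk_embodiment3 = db_to_power_ratio(-30.0)   # method of Embodiment 3

# Difference between the two quoted levels, in dB.
improvement_db = -10.0 - (-30.0)
```

Under this assumption, the crosstalk power falls from about one tenth to about one thousandth of the reference, a difference of 20 dB, compared with the 3 to 4 dB improvement quoted for the noise suppression approach.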
The transmission path estimation apparatus described in Embodiment 1, the dereverberation apparatus described in Embodiment 2, and the sound source separation apparatus described in Embodiment 3 can each be realized by having a computer function under a transmission path estimation program, a dereverberation program, or a sound source separation program that operates the computer according to the procedure described in the respective embodiment.
Each program is written in a computer-readable programming language and recorded on a computer-readable recording medium such as a magnetic disk, a CD-ROM, or a semiconductor memory. The program is installed on the computer from such a recording medium or through a communication line, and is interpreted and executed by the CPU of the computer.

The transmission path estimation apparatus, dereverberation apparatus, and sound source separation apparatus according to the present invention are each applicable to fields such as hands-free voice communication systems.

BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram for explaining Embodiment 1 of the present invention.
FIG. 2 is a diagram for explaining the operation of Embodiment 1.
FIG. 3 is a waveform diagram for explaining the effect of Embodiment 1.
FIG. 4 is a block diagram for explaining Embodiment 2 of the present invention.
FIG. 5 is a waveform diagram for explaining the effect of Embodiment 2.
FIG. 6 is a block diagram for explaining Embodiment 3 of the present invention.
FIG. 7 is a layout diagram for explaining the usage situation of Embodiment 3.
FIG. 8 is a block diagram for explaining the connection between the sound source separation means and the noise suppression means described in the known literature.
FIG. 9 is a block diagram for explaining the prior art.
FIG. 10 is a layout diagram for explaining the usage situation of the prior art.
FIG. 11 is a diagram for explaining the inside of the multichannel signal storage means.
FIG. 12 is a flowchart for explaining the background art.
FIG. 13 is a block diagram for explaining another example of the background art.
FIG. 14 is a flowchart for explaining the other background art shown in FIG. 13.

Explanation of symbols

101 Multichannel signal storage means
102 Z matrix generation unit
111 Current signal space matrix generation means
112 Past signal space matrix generation means
113 Future signal space matrix generation means
115 Z matrix generation means
121 Past-future signal space orthogonal basis calculation means
131 Projection residual signal matrix extraction means
141 Response length estimation means
151 Projection residual signal matrix correction means
161, 161' Transmission path estimation means
201 Inverse filter calculation means
202 Inverse filter means

Claims (8)

A transmission path estimation method for estimating transmission paths from observation signals observed from a source signal through a plurality of linear transmission paths, the method comprising:
a multichannel signal storage process of storing multichannel observation signals;
a current signal space matrix generation process of obtaining a current signal space matrix from the stored multichannel observation signals;
a past-future signal space matrix generation process of generating a past-future signal space matrix from the stored multichannel observation signals;
a past-future signal space orthogonal basis calculation process of obtaining an orthogonal basis of the past-future signal space matrix;
a projection residual signal matrix extraction process of obtaining, from the current signal space matrix and the orthogonal basis of the past-future signal space, a projection residual signal space matrix resulting when the current signal space matrix is projected onto the past-future signal space;
a projection residual signal matrix correction process of obtaining a redundant projection residual signal space matrix Eb from the rank r of the projection residual signal matrix E, and extracting the component of the projection residual signal space matrix E that is orthogonal to the redundant projection residual signal space matrix Eb as a projection residual signal matrix; and
a transmission path estimation process of performing singular value decomposition of the corrected projection residual signal matrix, extracting the left singular vector corresponding to the largest singular value, and obtaining the impulse responses of the multichannel transmission paths.
A dereverberation method for estimating transmission paths and a source signal from observation signals observed from the source signal through a plurality of linear transmission paths, the method comprising:
a multichannel signal storage process of storing multichannel observation signals;
a current signal space matrix generation process of obtaining a current signal space matrix from the stored multichannel observation signals;
a past-future signal space matrix generation process of generating a past-future signal space matrix from the stored multichannel observation signals;
a past-future signal space orthogonal basis calculation process of obtaining an orthogonal basis of the past-future signal space matrix;
a projection residual signal matrix extraction process of obtaining, from the current signal space matrix and the orthogonal basis of the past-future signal space, a projection residual signal space matrix resulting when the current signal space matrix is projected onto the past-future signal space;
a projection residual signal matrix correction process of obtaining a redundant projection residual signal space matrix Eb from the rank r of the projection residual signal matrix E, and extracting the component of the projection residual signal space matrix E that is orthogonal to the redundant projection residual signal space matrix Eb as a corrected projection residual signal matrix;
a transmission path estimation process of performing singular value decomposition of the corrected projection residual signal matrix, extracting the left singular vector corresponding to the largest singular value, and obtaining the impulse responses of the multichannel transmission paths;
an inverse filter calculation process of obtaining an inverse filter from the estimated transmission path impulse responses; and
an inverse filter process of applying the inverse filter to the multichannel observation signals and outputting a source signal estimation result.
A sound source separation method for estimating source signals from observation signals in which a plurality of source signals are mixed through a plurality of linear transmission paths, the method comprising:
a multichannel signal storage process of storing multichannel observation signals;
a current signal space matrix generation process of obtaining a current signal space matrix from the stored multichannel observation signals;
a past-future signal space matrix generation process of generating a past-future signal space matrix from the stored multichannel observation signals;
a past-future signal space orthogonal basis calculation process of obtaining an orthogonal basis of the past-future signal space matrix;
a projection residual signal matrix extraction process of obtaining, from the current signal space matrix and the orthogonal basis of the past-future signal space, a projection residual signal space matrix resulting when the current signal space matrix is projected onto the past-future signal space;
a projection residual signal matrix correction process of obtaining a redundant projection residual signal space matrix Eb from the rank r of the projection residual signal matrix E, and extracting the component of the projection residual signal space matrix E that is orthogonal to the redundant projection residual signal space matrix Eb as a corrected projection residual signal matrix;
a transmission path estimation process of performing singular value decomposition of the corrected projection residual signal matrix, extracting the left singular vector corresponding to the largest singular value, and obtaining the impulse responses of the multichannel transmission paths;
a dereverberation filter calculation process of obtaining a dereverberation filter from the estimated transmission path impulse responses;
a dereverberation filtering process of applying, to the multichannel observation signals, the dereverberation filter obtained in the dereverberation filter calculation process; and
a sound source separation process of performing sound source separation processing on the signals output by the dereverberation processing.
A transmission path estimation apparatus for estimating transmission paths from observation signals observed from a source signal through a plurality of linear transmission paths, the apparatus comprising:
multichannel signal storage means for storing multichannel observation signals;
current signal space matrix generation means for obtaining a current signal space matrix from the stored multichannel observation signals;
past-future signal space matrix generation means for generating a past-future signal space matrix from the stored multichannel observation signals;
past-future signal space orthogonal basis calculation means for obtaining an orthogonal basis of the past-future signal space matrix;
projection residual signal matrix extraction means for obtaining, from the current signal space matrix and the orthogonal basis of the past-future signal space, a projection residual signal space matrix resulting when the current signal space matrix is projected onto the past-future signal space;
projection residual signal matrix correction means for obtaining a redundant projection residual signal space matrix Eb from the rank r of the projection residual signal matrix E, and extracting the component of the projection residual signal space matrix E that is orthogonal to the redundant projection residual signal space matrix Eb as a corrected projection residual signal matrix; and
transmission path estimation means for performing singular value decomposition of the corrected projection residual signal matrix, extracting the left singular vector corresponding to the largest singular value, and obtaining the impulse responses of the multichannel transmission paths.
A dereverberation apparatus for estimating transmission paths and a source signal from observation signals observed from the source signal through a plurality of linear transmission paths, the apparatus comprising:
multichannel signal storage means for storing multichannel observation signals;
current signal space matrix generation means for obtaining a current signal space matrix from the stored multichannel observation signals;
past-future signal space matrix generation means for generating a past-future signal space matrix from the stored multichannel observation signals;
past-future signal space orthogonal basis calculation means for obtaining an orthogonal basis of the past-future signal space matrix;
projection residual signal matrix extraction means for obtaining, from the current signal space matrix and the orthogonal basis of the past-future signal space, a projection residual signal space matrix resulting when the current signal space matrix is projected onto the past-future signal space;
projection residual signal matrix correction means for obtaining a redundant projection residual signal space matrix Eb from the rank r of the projection residual signal matrix E, and extracting the component of the projection residual signal space matrix E that is orthogonal to the redundant projection residual signal space matrix Eb as a corrected projection residual signal matrix;
transmission path estimation means for performing singular value decomposition of the corrected projection residual signal matrix, extracting the left singular vector corresponding to the largest singular value, and obtaining the impulse responses of the multichannel transmission paths;
inverse filter calculation means for obtaining an inverse filter from the estimated transmission path impulse responses; and
inverse filter means for applying the inverse filter to the multichannel observation signals and outputting a source signal estimation result.
複数のソース信号が複数の線形伝達経路を経て混合された観測信号から、ソース信号を推定する音源分離装置において、
多チャネル観測信号を蓄積する多チャネル信号蓄積手段と、
蓄積された多チャネル観測信号から現信号空間行列を求める現信号空間行列生成手段と、
蓄積された多チャネル観測信号から過去未来信号空間行列を生成する過去未来信号空間行列生成手段と、
前記過去未来信号空間行列の直交基底を求める過去未来信号空間直交基底算出手段と、
前記現信号空間行列と前記過去未来信号空間の直交基底から、前記現信号空間行列を前記過去未来信号空間に射影したときの射影残差信号空間行列を求める射影残差信号抽出手段と、
射影残差信号行列Eのうちランクrから冗長射影残差信号空間行列Ebを求め、射影残差信号空間行列Eのうち前記冗長射影残差信号空間行列Ebに直交する成分を取り出して補正射影残差信号行列とする射影残差信号行列補正手段と、
前記補正射影残差信号行列を特異値分解し、最大特異値に対応する左特異ベクトルを取り出して多チャンネル伝達経路のインパルス応答を求める伝達経路推定手段と、
推定された伝達経路インパルス応答から残響除去フィルタを求める残響除去フィルタ算出手段と、
前記多チャネル観測信号に対して前記残響除去フィルタ算出手段で求めた残響除去フィルタを適用する残響除去フィルタリング手段と、
残響除去処理後段の信号を入力として音源分離処理を行う音源分離手段と、
を備えたことを特徴とする音源分離装置。
In a sound source separation device that estimates a source signal from an observation signal in which a plurality of source signals are mixed through a plurality of linear transmission paths,
Multi-channel signal storage means for storing multi-channel observation signals;
Current signal space matrix generating means for obtaining a current signal space matrix from the accumulated multi-channel observation signals;
A past future signal space matrix generating means for generating a past future signal space matrix from the accumulated multi-channel observation signals;
Past future signal space orthogonal basis calculating means for obtaining an orthogonal basis of the past future signal space matrix;
Projected residual signal extraction means for obtaining, from the current signal space matrix and the orthogonal basis of the past future signal space, a projected residual signal space matrix that results when the current signal space matrix is projected onto the past future signal space;
Projected residual signal matrix correction means for obtaining a redundant projected residual signal space matrix Eb from the rank r of the projected residual signal matrix E, and extracting the component of the projected residual signal space matrix E that is orthogonal to the redundant projected residual signal space matrix Eb as a corrected projected residual signal matrix;
Transmission path estimation means for performing singular value decomposition of the corrected projected residual signal matrix, extracting the left singular vector corresponding to the maximum singular value, and obtaining the impulse responses of the multi-channel transmission paths;
Dereverberation filter calculating means for obtaining a dereverberation filter from the estimated transmission path impulse responses;
Dereverberation filtering means for applying the dereverberation filter obtained by the dereverberation filter calculating means to the multi-channel observation signal;
Sound source separation means for performing sound source separation processing using the signal after the dereverberation processing as input;
the sound source separation device being characterized by comprising the above means.
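As an illustration only, the matrix operations recited in the claim above (forming the current and past future signal space matrices, taking an orthogonal basis, computing the projected residual, and extracting the left singular vector for the maximum singular value) can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the patented implementation: the block-row layout of the matrices, the helper names `block_rows`, `orthonormal_row_basis` and `estimate_stacked_response`, the block lengths, and the two-channel, three-tap toy paths are all hypothetical. The correction step using the redundant matrix Eb is omitted, because the current block length is set equal to the path length so that, in this noise-free toy case, the residual is already rank one. The dereverberation filter and the sound source separation stage are not shown.

```python
import numpy as np


def block_rows(x, start, rows, cols):
    """Stack `rows` time-shifted copies of x (channels x samples) into one matrix."""
    return np.vstack([x[:, start + i:start + i + cols] for i in range(rows)])


def orthonormal_row_basis(a, rel_tol=1e-10):
    """Orthonormal basis (stored as rows) of the row space of `a`, via SVD."""
    _, s, vt = np.linalg.svd(a, full_matrices=False)
    rank = int(np.sum(s > rel_tol * s[0]))  # drop numerically zero directions
    return vt[:rank]


def estimate_stacked_response(x, past, current, future):
    """Return the dominant left singular vector of the projected residual.

    Assumed layout (hypothetical): each matrix has one block row per time
    shift, and each column corresponds to one time instant.
    """
    cols = x.shape[1] - (past + current + future) + 1
    # Current signal space matrix.
    x_cur = block_rows(x, past, current, cols)
    # Past future signal space matrix: past block rows stacked on future ones.
    x_pf = np.vstack([
        block_rows(x, 0, past, cols),
        block_rows(x, past + current, future, cols),
    ])
    # Orthogonal basis of the past future signal space.
    q = orthonormal_row_basis(x_pf)
    # Projected residual: remove the part of x_cur lying in that space.
    resid = x_cur - (x_cur @ q.T) @ q
    # Left singular vector for the maximum singular value.
    u, sing, _ = np.linalg.svd(resid, full_matrices=False)
    return u[:, 0], sing, resid, q


# Toy data: one source, two channels, 3-tap paths (all values hypothetical).
rng = np.random.default_rng(0)
h = np.array([[1.0, 0.5, -0.3],
              [0.8, -0.4, 0.2]])
s = rng.standard_normal(400)
x = np.vstack([np.convolve(s, h_m)[:s.size] for h_m in h])

u, sing, resid, q = estimate_stacked_response(x, past=4, current=3, future=4)

# True response stacked tap by tap: [h(0); h(1); h(2)], two channels per tap.
h_stack = h.T.reshape(-1)
match = abs(u @ h_stack) / np.linalg.norm(h_stack)
print(f"|cosine| between estimate and true stacked response: {match:.6f}")
```

The singular vector has unit norm and an arbitrary sign, so it can agree with the true stacked impulse response only up to sign and scale, which is why the comparison uses the absolute cosine.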
コンピュータが解読可能なプログラム言語によって記述され、コンピュータに請求項4乃至6記載の装置として機能させるプログラム。
A program written in a computer-readable programming language that causes a computer to function as the device according to any one of claims 4 to 6.
コンピュータが読み取り可能な記録媒体によって構成され、この記録媒体に請求項7記載のプログラムを記録した記録媒体。
A computer-readable recording medium on which the program according to claim 7 is recorded.
Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2006058189A JP4422692B2 (en) 2006-03-03 2006-03-03 Transmission path estimation method, dereverberation method, sound source separation method, apparatus, program, and recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2006058189A JP4422692B2 (en) 2006-03-03 2006-03-03 Transmission path estimation method, dereverberation method, sound source separation method, apparatus, program, and recording medium

Publications (2)

Publication Number Publication Date
JP2007235875A JP2007235875A (en) 2007-09-13
JP4422692B2 JP4422692B2 (en) 2010-02-24

Family

ID=38555953

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2006058189A Expired - Fee Related JP4422692B2 (en) 2006-03-03 2006-03-03 Transmission path estimation method, dereverberation method, sound source separation method, apparatus, program, and recording medium

Country Status (1)

Country Link
JP (1) JP4422692B2 (en)

Cited By (43)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010049083A (en) * 2008-08-22 2010-03-04 Nippon Telegr & Teleph Corp <Ntt> Sound signal enhancement device and method therefore, program and recording medium
JP2010245697A (en) * 2009-04-02 2010-10-28 Nippon Telegr & Teleph Corp <Ntt> Device and method for suppressing reverberation of adaptive microphone array, and program
JP2013504283A (en) * 2009-09-07 2013-02-04 クゥアルコム・インコーポレイテッド System, method, apparatus and computer readable medium for dereverberation of multi-channel signals
US11750969B2 (en) 2016-02-22 2023-09-05 Sonos, Inc. Default playback device designation
US11983463B2 (en) 2016-02-22 2024-05-14 Sonos, Inc. Metadata exchange involving a networked playback system and a networked microphone system
US11832068B2 (en) 2016-02-22 2023-11-28 Sonos, Inc. Music service selection
US11863593B2 (en) 2016-02-22 2024-01-02 Sonos, Inc. Networked microphone device control
US11947870B2 (en) 2016-02-22 2024-04-02 Sonos, Inc. Audio response playback
US12047752B2 (en) 2016-02-22 2024-07-23 Sonos, Inc. Content mixing
CN105895114B (en) * 2016-03-22 2019-09-27 南京大学 A kind of room acoustic propagation path separation method based on impulse response
CN105895114A (en) * 2016-03-22 2016-08-24 南京大学 Pulse-response-based room sound propagation path separation method
US12080314B2 (en) 2016-06-09 2024-09-03 Sonos, Inc. Dynamic player selection for audio signal processing
US11979960B2 (en) 2016-07-15 2024-05-07 Sonos, Inc. Contextualization of voice inputs
US11934742B2 (en) 2016-08-05 2024-03-19 Sonos, Inc. Playback device supporting concurrent voice assistants
US11727933B2 (en) 2016-10-19 2023-08-15 Sonos, Inc. Arbitration-based voice recognition
US11900937B2 (en) 2017-08-07 2024-02-13 Sonos, Inc. Wake-word detection suppression
US11816393B2 (en) 2017-09-08 2023-11-14 Sonos, Inc. Dynamic computation of system response volume
US11817076B2 (en) 2017-09-28 2023-11-14 Sonos, Inc. Multi-channel acoustic echo cancellation
US12047753B1 (en) 2017-09-28 2024-07-23 Sonos, Inc. Three-dimensional beam forming with a microphone array
CN111418011B (en) * 2017-09-28 2023-05-12 搜诺思公司 Multi-channel acoustic echo cancellation
CN111418011A (en) * 2017-09-28 2020-07-14 搜诺思公司 Multi-channel acoustic echo cancellation
US11893308B2 (en) 2017-09-29 2024-02-06 Sonos, Inc. Media playback system with concurrent voice assistance
US11797263B2 (en) 2018-05-10 2023-10-24 Sonos, Inc. Systems and methods for voice-assisted media content selection
US11792590B2 (en) 2018-05-25 2023-10-17 Sonos, Inc. Determining and adapting to changes in microphone performance of playback devices
US11973893B2 (en) 2018-08-28 2024-04-30 Sonos, Inc. Do not disturb feature for audio notifications
US11778259B2 (en) 2018-09-14 2023-10-03 Sonos, Inc. Networked devices, systems and methods for associating playback devices based on sound codes
US11790937B2 (en) 2018-09-21 2023-10-17 Sonos, Inc. Voice detection optimization using sound metadata
US11790911B2 (en) 2018-09-28 2023-10-17 Sonos, Inc. Systems and methods for selective wake word detection using neural network models
US12062383B2 (en) 2018-09-29 2024-08-13 Sonos, Inc. Linear filtering for noise-suppressed speech detection via multiple network microphone devices
US11899519B2 (en) 2018-10-23 2024-02-13 Sonos, Inc. Multiple stage network microphone device with reduced power consumption and processing load
US11881223B2 (en) 2018-12-07 2024-01-23 Sonos, Inc. Systems and methods of operating media playback systems having multiple voice assistant services
US11817083B2 (en) 2018-12-13 2023-11-14 Sonos, Inc. Networked microphone devices, systems, and methods of localized arbitration
US12063486B2 (en) 2018-12-20 2024-08-13 Sonos, Inc. Optimization of network microphone devices using noise classification
US11798553B2 (en) 2019-05-03 2023-10-24 Sonos, Inc. Voice assistant persistence across multiple network microphone devices
US11854547B2 (en) 2019-06-12 2023-12-26 Sonos, Inc. Network microphone device with command keyword eventing
US12093608B2 (en) 2019-07-31 2024-09-17 Sonos, Inc. Noise classification for event detection
US11862161B2 (en) 2019-10-22 2024-01-02 Sonos, Inc. VAS toggle based on device orientation
US11869503B2 (en) 2019-12-20 2024-01-09 Sonos, Inc. Offline voice control
US11887598B2 (en) 2020-01-07 2024-01-30 Sonos, Inc. Voice verification for media playback
US11961519B2 (en) 2020-02-07 2024-04-16 Sonos, Inc. Localized wakeword verification
US11881222B2 (en) 2020-05-20 2024-01-23 Sonos, Inc Command keywords with input detection windowing
US11984123B2 (en) 2020-11-12 2024-05-14 Sonos, Inc. Network device interaction by range
CN115050373A (en) * 2022-04-29 2022-09-13 思必驰科技股份有限公司 Dual path embedded learning method, electronic device, and storage medium

Also Published As

Publication number Publication date
JP4422692B2 (en) 2010-02-24

Similar Documents

Publication Publication Date Title
JP4422692B2 (en) Transmission path estimation method, dereverberation method, sound source separation method, apparatus, program, and recording medium
JP4394832B2 (en) Separation of unknown mixed sources using multiple decorrelation methods
JP5227393B2 (en) Reverberation apparatus, dereverberation method, dereverberation program, and recording medium
CN108429995B (en) Sound processing device, sound processing method, and storage medium
US10718742B2 (en) Hypothesis-based estimation of source signals from mixtures
KR102410850B1 (en) Method and apparatus for extracting reverberant environment embedding using dereverberation autoencoder
US20180301160A1 (en) Signal processing apparatus and method
US10657958B2 (en) Online target-speech extraction method for robust automatic speech recognition
JP6815956B2 (en) Filter coefficient calculator, its method, and program
JP4630203B2 (en) Signal separation device, signal separation method, signal separation program and recording medium, signal arrival direction estimation device, signal arrival direction estimation method, signal arrival direction estimation program and recording medium
JP6343771B2 (en) Head related transfer function modeling apparatus, method and program thereof
JP4473709B2 (en) SIGNAL ESTIMATION METHOD, SIGNAL ESTIMATION DEVICE, SIGNAL ESTIMATION PROGRAM, AND ITS RECORDING MEDIUM
JP6808597B2 (en) Signal separation device, signal separation method and program
EP3557576B1 (en) Target sound emphasis device, noise estimation parameter learning device, method for emphasizing target sound, method for learning noise estimation parameter, and program
JP3920795B2 (en) Echo canceling apparatus, method, and echo canceling program
JP6644356B2 (en) Sound source separation system, method and program
JP2013113866A (en) Reverberation removal method, reverberation removal device and program
WO2024038522A1 (en) Signal processing device, signal processing method, and program
JP7270869B2 (en) Information processing device, output method, and output program
JP2007178590A (en) Object signal extracting device and method therefor, and program
JP2018191255A (en) Sound collecting device, method thereof, and program
JP2012048134A (en) Reverberation removal method, reverberation removal device and program
JP2005091560A (en) Method and apparatus for signal separation
JP2002261659A (en) Multi-channel echo cancellation method, its apparatus, its program, and its storage medium
JP4714892B2 (en) High reverberation blind signal separation apparatus and method

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20080128

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20091117

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20091124

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20091204

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20121211

Year of fee payment: 3

R150 Certificate of patent or registration of utility model

Free format text: JAPANESE INTERMEDIATE CODE: R150

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20131211

Year of fee payment: 4

S531 Written request for registration of change of domicile

Free format text: JAPANESE INTERMEDIATE CODE: R313531

R350 Written notification of registration of transfer

Free format text: JAPANESE INTERMEDIATE CODE: R350

LAPS Cancellation because of no payment of annual fees