-
This invention relates to digital signal processing and in particular to a method of processing a signal such as a speech signal, for example for use in the cancellation of non-stationary noise resulting from external events that can be measured or otherwise signalled.
-
Akinori Ito, Takashi Kanayama, Motoyuki Suzuki and Shozo Makino illustrate the usefulness of such methods and systems in their article entitled "Internal Noise Suppression for Speech Recognition by Small Robots", published on pages 2685-2688 of INTERSPEECH 2005. To suppress unstable noise, they must predict the spectrum of the noise frame by frame. To achieve this, they constructed a neural network that predicts the spectrum of the internal noise from the status of the joints. At least 10 000 samples are required for the learning stage, which points to a slow convergence that can be problematic, particularly as databases become huge.
-
There are many other situations in which speech denoising is useful, such as the examples described in international patent application WO2006/032760. There, one or more noise reduction filters are calculated from an estimated power spectral density (PSD) of the noise. The estimation of the PSD is not per se an object of WO2006/032760.
-
UK patent application GB2406487 discloses a modified affine projection algorithm for non-stationary signals. The affine projection algorithm (APA) converges fast and seems to be well adapted for filtering an echo which is correlated with the speech signal. The problem with the APA is that it is not applicable to filtering noise which is uncorrelated with the speech and which is a function of changing physical values other than voice, particularly when said function is non-linear.
-
To avoid the problems of the prior art, an object of the invention is a method or an apparatus for generating an estimator of noise which is an unknown function of physical quantities, without requiring said function to be linear, and with the benefit of fast convergence.
-
In particular, the method according to the invention comprises steps of capturing a sound and associating with the sound, by a common index, a first vector value of physical quantities inducing the sound at the same time as the sound is captured. The capturing step is repeated, the value of the index being incremented each time said step is repeated. When said index value is or has been incremented a number of times at least equal to a first integer (L) greater than one, each time said index value is incremented, the method further comprises the steps of:
- generating (107) a current sequence of coefficients for a linear combination of functions which satisfy Mercer conditions wherein a first argument of one of the functions is the first vector value of the one of the capturing steps having the index value corresponding to a rank of the coefficient in the sequence and
- setting values of the coefficients so that a number of the last captured sounds equal to said first integer are each substantially equal to an occurrence of the linear combination wherein a second argument of the functions is the first vector value of the capturing step having the index value associated with the sound.
The linear combination resulting from the current generated sequence is performed to produce the estimator when a next captured sound is not pure noise.
-
Preferred modes of implementation of the invention are now described with reference to the drawings wherein:
- figure 1 is a schematic representation of a device according to the invention;
- figure 2 presents steps of a first method implementation according to the invention;
- figure 3 presents steps of a second method implementation according to the invention.
-
Figure 1 is a schematic representation of a device according to the invention for restitution of a signal s which is emitted in a noisy environment. The signal s is for instance a sound of voice type or other that needs to be cleaned from noise for exploitation purpose.
-
The device comprises a microphone 21 for capturing sound and a de-noising module 20 for providing an estimate ^s of the signal s. To that end, a first contact of a switch 22 between the microphone 21 and the de-noising module 20 is arranged to connect the microphone 21 with the de-noising module 20 when the signal s is present, so as to supply the de-noising module 20 with a received signal r which comprises in that case the signal s and a noise y. The detection of the presence of the signal s is not an object of the invention. It can be realized by a voice activity detection (VAD) system, a camera detecting a person or any other system, for example simply a button.
-
When the signal s is not present, said first contact of the switch 22 is arranged to normally connect the microphone 21 to an estimator generator 23 so as to supply the estimator generator 23 with the noise y so long as the signal s is not present. The estimator generator is arranged according to the invention for providing an estimator {α_i, x_i, K} that can be used by the de-noising module 20 for subtracting an estimation of noise ^y from the signal r so as to provide the estimate ^s of the signal s. For that purpose, a second contact of the switch 22 is arranged for connecting the estimator generator 23 to the de-noising module 20 at the same time as the first contact of the switch 22 is connecting the microphone 21 to the de-noising module 20. In a normal state, the second contact of the switch 22 is arranged to loop the estimator generator 23 on itself so as to adapt said estimator in real time according to the received noise y.
-
The estimator is provided for giving an estimation of noise that is a function of data which are collected in a vector x through an input 24 of the device. The value of each component of the vector is given for instance by a sensor 25, 26, 27, 28 connected to the input 24. Four sensors are represented here, but it will be easily understood that the invention can be implemented with any number of sensors, more or fewer than four, including a single sensor, in which case the vector x is simply a scalar x. The data can be of any type that suits an estimation of noise resulting from an event measurable by such data and liable to create or to contribute to the noise received by the microphone 21. As a non-limiting illustration only, the data can be an angle of a moving arm of a robot, a speed or acceleration, the power consumed by a motor, or a sound captured by another microphone.
-
A third contact of the switch 22 is arranged for connecting the input 24 to the de-noising module 20 when the first contact is connecting the microphone 21 to the de-noising module 20. In that way, a real time estimation of noise ^y can be calculated with the help of the estimator so that the de-noising module can elaborate the estimate ^s in a way similar to, but not necessarily the same as, the one taught in WO2006/032760.
-
When not connecting the input 24 to the de-noising module 20, the third contact of the switch 22 is arranged to connect the input 24 to a shift register 29. An output of each cell is arranged to be connected to the estimator generator 23 when said cell receives from a preceding cell or from the input 24 a vector value x_i, wherein an index i is comprised between 1 and n, 1 for the oldest value and n for the last one which is received through the input 24. The manner of shifting the values in the register is not essential for the method according to the invention. It can be by means of a clock of the device, in a manner usually known in the art for sampling, or every time a new value is detected. A useful feature of the invention is that a noise is sampled at the same time as a new vector value x_n shifts the preceding ones in the register.
-
The estimator generator 23 is arranged for starting a process of constructing the estimator when receiving from the shift register 29 a predetermined number L of values x_i with their index i less than or equal to n and greater than n-L. The process is executed by running the steps, explained below, of a method implementing the invention.
-
Referring now to figure 2, the number L is predetermined by setting its value in an initialization step 100. The determination of said value per se is not in the scope of the invention. It can result from theoretical considerations or, more practically, from a user testing the device with successively different values of L until an acceptable result is achieved on the estimation of the signal s by the de-noising module 20.
-
The estimator {α_i, x_i, K} is provided by the estimator generator 23 for calculating an estimate ^y of noise in the form:
$\hat{y} = f(x) = \sum_{i=1}^{n} \alpha_i K(x_i, x)$
-
The noise estimation function f relates to the vector x by a linear combination of expressions of a kernel function K when applied each to the current vector x and to a past value x i of it.
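As an illustration of this linear combination, the following minimal Python sketch evaluates f(x) as a weighted sum of kernel values between the stored vectors x_i and the current vector x. The function names, the Gaussian kernel parametrization and the width `sigma` are assumptions made for the example and are not taken from the description.

```python
import math

def gaussian_kernel(u, v, sigma=1.0):
    """Gaussian kernel between two vectors (sigma is an assumed width)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def estimate_noise(alphas, xs, x, kernel=gaussian_kernel):
    """Return f(x) = sum_i alpha_i * K(x_i, x)."""
    return sum(a * kernel(xi, x) for a, xi in zip(alphas, xs))

# Two stored sensor vectors with their coefficients (illustrative values).
xs = [(0.0, 0.0), (1.0, 0.0)]
alphas = [2.0, -1.0]
y_hat = estimate_noise(alphas, xs, (0.0, 0.0))
```

Here the first stored vector coincides with the query, so its kernel value is 1, while the second contributes exp(-1/2) weighted by -1.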
-
Said kernel function K to be used is any function that satisfies the Mercer conditions. Mathematically, the Mercer conditions are satisfied when, for any number n of complex values a_i or a_j and of vectors x_i or x_j with real values, the sum $\sum_{i=1}^{n}\sum_{j=1}^{n} a_i \bar{a}_j K(x_i, x_j)$ gives a non-negative real value. We can easily check that, for example, the Gaussian function satisfies the Mercer conditions. Therefore this Gaussian function can be used for implementing the invention. Other kernels satisfying the Mercer conditions are known and can also be used, according to the solution best suited to the context in which the device is exploited. Here is a non-limiting list for illustrative purposes only:
- a polynomial kernel in the form of K(x_i, x_j) = (1 + x_i · x_j)^q
- an exponential kernel in the form of
- a sigmoidal kernel in the form of K(x_i, x_j) = tanh(ξ0 x_i · x_j + β0).
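The non-negativity condition can be probed numerically. The sketch below is only a spot check, not a proof: it restricts the coefficients a_i to real values (a simplification of the complex case), uses an assumed Gaussian width `sigma` and an assumed polynomial order `q`, and evaluates the double sum for random coefficient sets.

```python
import math
import random

def gaussian(u, v, sigma=1.0):
    """Gaussian kernel with an assumed width sigma."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / (2.0 * sigma ** 2))

def polynomial(u, v, q=2):
    """Polynomial kernel (1 + u.v)^q with an assumed integer order q."""
    return (1.0 + sum(a * b for a, b in zip(u, v))) ** q

def quadratic_form(kernel, xs, a):
    """Compute sum_i sum_j a_i a_j K(x_i, x_j) for real coefficients a."""
    n = len(xs)
    return sum(a[i] * a[j] * kernel(xs[i], xs[j]) for i in range(n) for j in range(n))

random.seed(0)
xs = [tuple(random.uniform(-1, 1) for _ in range(2)) for _ in range(6)]

# Smallest value of the quadratic form over 100 random coefficient sets.
min_form = float("inf")
for _ in range(100):
    a = [random.uniform(-1, 1) for _ in range(len(xs))]
    min_form = min(min_form,
                   quadratic_form(gaussian, xs, a),
                   quadratic_form(polynomial, xs, a))
```

For both kernels the smallest value found stays non-negative up to floating-point rounding, which is what the Mercer conditions require.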
-
The coefficients α_i are updated in real time by a loop of steps 101 to 108, wherein step 101 is triggered again for each newly received value of data x, considered as a supplementary last received data x_n, at the same time as a received noise y, considered as the last received noise y_n. Each time step 101 is executed, a supplementary coefficient α_n is created with a value initialized to zero. The loop is executed for each value of n, said value being initialized in step 100 and incremented in step 103 or 108 to be ready for a following execution of the loop. Step 102 tests whether the number n is greater than or equal to L, so as to furnish coefficients α_i for a total number n of received data at least equal to L. So long as the number n is less than L, step 103 branches back to step 101.
-
Considering an index i comprised between n-L+1 and n, the L last coefficients α_i of the estimator are calculated in step 107 by using the following formula:
-
Wherein Dy_j(n) is a distance separating a noise y_j from an estimation of that noise which is made with the estimation function f using coefficients α_m(n-1) with the values they currently had when executing step 101. The noise y_j is the one which was or is measured at the time when n was or is equal to j in step 101.
-
The set of coefficients α_i(n) on the left side of the setting symbol ":=" stands for the coefficients generated by the current execution of the loop with rank n, whereas the coefficients α_i(n-1) on the right side are those initialized to zero and/or generated by a preceding execution of the loop with rank n-1.
-
In step 107, the formula uses the coefficient on line n-i, column n-j of the inverse of a kernel matrix generated in step 104. Any method known in the art can be used for obtaining the inverse of the kernel matrix. In step 104, executed before step 107 in case of a positive response to the test of step 102, the kernel matrix for the loop of rank n is generated by the formula:
-
Wherein ζ_h,k is a matrix regularization parameter. In other words, it is a value that is equal to zero when the index h is different from the index k and is equal to a constant when the two indexes h and k are equal. The regularization parameter ensures that the kernel matrix has an inverse.
-
The parameters ρ and µ_n are for improving the efficiency of the method and will be explained later. Without said parameters or, which is the same, with ρ and µ_n constants respectively equal to 0 and to 1, the formulae used in step 107 are similar to the following ones:
-
Mathematical considerations show that setting the coefficients α_i according to formula (4) implies that for every j in the range of n-L+1 to n:
-
It is interesting to note from formula (5) that, when a following loop is executed for a new value of noise y_n+1, equation (6) has the effect that for every j in the range of n-L+1 to n, the distance Dy_j(n+1) is equal to zero. The only distance which is different from zero is Dy_n+1(n+1), which is given by:
-
Because the kernel function K satisfies the Mercer conditions, it can be shown that the greater L is, the faster the distance Dy_n(n) decreases, in other words the faster the method converges.
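The loop described above can be sketched in Python for the simplified case where ρ is 0 and the step size is 1. This is a sketch under stated assumptions, not the formulae of the description: it assumes the update adds to each of the L most recent coefficients the solution of the linear system formed by the regularized kernel matrix and the distances Dy_j, it solves that system directly instead of forming the inverse matrix, it uses 0-based indexes in ascending order, and the names `solve`, `predict`, `update` and the Gaussian width `sigma` are hypothetical.

```python
import math

def gaussian_kernel(u, v, sigma=1.0):
    """Gaussian kernel with an assumed width sigma."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def predict(alphas, xs, x):
    """Estimate the noise as sum_i alpha_i * K(x_i, x)."""
    return sum(a * gaussian_kernel(xi, x) for a, xi in zip(alphas, xs))

def update(alphas, xs, ys, L, zeta=0.0):
    """Correct the L most recent coefficients in place (assumed rho=0, step size 1)."""
    idx = list(range(len(xs) - L, len(xs)))
    # Kernel matrix over the L most recent vectors, zeta added on the diagonal.
    K = [[gaussian_kernel(xs[h], xs[k]) + (zeta if h == k else 0.0) for k in idx]
         for h in idx]
    # Distances between measured noise and the estimate from current coefficients.
    dy = [ys[j] - predict(alphas, xs, xs[j]) for j in idx]
    for i, d in zip(idx, solve(K, dy)):
        alphas[i] += d

L = 3
xs, ys, alphas = [], [], []
for n in range(8):
    x = (0.7 * n,)               # illustrative sensor value
    xs.append(x)
    ys.append(math.sin(x[0]))    # illustrative noise sample
    alphas.append(0.0)           # new coefficient initialized to zero
    if len(xs) >= L:             # wait until L samples are available
        update(alphas, xs, ys, L)

# Residuals on the L most recent samples after the last update.
residuals = [abs(ys[j] - predict(alphas, xs, xs[j])) for j in range(len(xs) - L, len(xs))]
```

With zeta set to zero, the residuals on the L most recent samples vanish up to rounding after each update, which matches the property stated above that the corresponding distances become zero.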
-
Advantageously, the method comprises a step 105 wherein the coefficients α_i are divided by (1+ρ), with a regularization parameter or forgetting factor ρ having a value greater than zero. Thus, each time step 105 is executed after step 102, the coefficients α_i decrease, but in such a manner as to preserve the ratio between the coefficients of any pair. Because they are divided many times, rather old coefficients, that is those with the smaller indexes, become immaterial after a sufficient number of executions of the loop. These coefficients can then be swapped out of the memory of the device executing the method, saving both storage and computing time. For instance in figure 2, the coefficients α_i with i less than n-L+1 are specifically divided by (1+ρ) in step 105. The coefficients α_i with i greater than n-L are divided by (1+ρ) in step 107 according to formulae (1) and (2). In the following, for implementations without step 105, every time the regularization parameter ρ is present in a formula, it will be understood that ρ is null, which is the same as not being present.
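The effect of the forgetting factor on the older coefficients can be sketched as follows. The function name `forget` and the pruning `threshold` are hypothetical: the description only states that sufficiently old coefficients become immaterial and can be removed from memory, without giving a criterion.

```python
def forget(alphas, xs, rho, L, threshold=1e-6):
    """Divide coefficients older than the last L by (1 + rho), then drop
    the oldest ones whose magnitude fell below an assumed threshold."""
    cut = len(alphas) - L
    scaled = [a / (1.0 + rho) for a in alphas[:cut]] + alphas[cut:]
    start = 0
    while start < cut and abs(scaled[start]) < threshold:
        start += 1
    # The stored vectors are pruned together with their coefficients.
    return scaled[start:], xs[start:]

alphas = [1e-7, 0.5, 0.2, 0.3]
xs = [(0.0,), (1.0,), (2.0,), (3.0,)]
new_alphas, new_xs = forget(alphas, xs, rho=0.25, L=2)
```

In this example the oldest coefficient is discarded with its vector, the second one is scaled from 0.5 to 0.4, and the two most recent ones are left for the update of step 107.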
-
Advantageously also, the formula (1) used in step 107 comprises a step size µ(n) which in this case is initialized in step 100 to a value µ(0). The value of the step size µ(n) can be a constant equal to µ(0) for every execution of step 107 or can be varied according to n. In the latter case the method further comprises a step 106 wherein the step size µ(n) is set to a value µ̃ but limited by a minimum µ_min and a maximum µ_max. The values µ_min and µ_max are set in initialization step 100 respectively to a value greater than or equal to zero and to a value preferably less than or equal to one. A possible formula for achieving that is:
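The limiting of µ̃ between µ_min and µ_max described above can be sketched as a simple clamp. The function name and the default bounds of 0 and 1 are assumptions chosen to match the preferred ranges stated in the text.

```python
def clamp_step_size(mu_tilde, mu_min=0.0, mu_max=1.0):
    """Return mu_tilde limited to the interval [mu_min, mu_max]."""
    return min(max(mu_tilde, mu_min), mu_max)

too_large = clamp_step_size(1.7)
too_small = clamp_step_size(-0.2)
in_range = clamp_step_size(0.4)
```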
-
Before setting µ(n) in step 106, the value µ̃ is updated by the formula:
-
In the formula, a prediction error is given by:
-
Wherein a function f_n of the vector x_n-j is given by the formula:
-
We see here that, for example when j=0, the prediction error is the difference between the last received value of noise y_n and a value which would have been predicted or estimated from the n-1 preceding measures with the help of the function f_n. We also see that not only is the last value of noise compared with an estimation of noise resulting from the currently available function, but every one of the L'-1 preceding received values of noise y_n-j is compared with the value of noise which would have been estimated by the last available function f_n.
-
An adaptive size cost parameter γ, an adaptive step size cost order L', an adaptive step size cost insensitivity ε' and an adaptive step size recursive parameter η are respectively a positive real number greater than one, an integer less than, equal to or greater than L, a positive real number near zero and a positive real number less than two, which can be set in step 100 when step 106 exists. When the value of the size cost parameter γ is equal to one, we see that the updating of the value µ̃ is independent of the prediction error. When furthermore the values of the adaptive step size recursive parameter η and of the regularization factor ρ are respectively equal to one and zero, a simple expression of the value µ̃ is given by:
-
In the expression of µ̃, components β_m of a weight gradient are updated according to the formula:
-
For every index i comprised in the range of n-L+1 to n, the value of a gradient Δ_i is given by the formula:
-
A second mode of implementation of the method according to the invention is now described in reference to figure 3 wherein steps 100 to 103 are similar to the ones of the first mode of implementation previously described in reference to figure 2.
-
Considering an index i comprised between n-L+1 and n, the L last coefficients α_i of the estimator are calculated in step 207 by using the following formula:
-
Wherein the formula involves two sets of Lagrange multipliers such that the distance Dy_j(n) is less than ε for every j in the range of n-L+1 to n, in other words:
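The ε-insensitivity condition on the L most recent samples can be sketched as a simple check on the absolute differences between measured and estimated noise. The function name `within_tube` and the sample values are illustrative assumptions. The sketch takes the distance to be the absolute difference between the two values.

```python
def within_tube(ys, preds, eps):
    """True when |y_j - f(x_j)| < eps for every sample j."""
    return all(abs(y - p) < eps for y, p in zip(ys, preds))

ys = [0.50, -0.20, 0.10]      # measured noise values (illustrative)
preds = [0.48, -0.23, 0.12]   # values estimated by the current function
ok = within_tube(ys, preds, eps=0.05)
too_tight = within_tube(ys, preds, eps=0.01)
```

With these values every difference is between 0.02 and 0.03, so the condition holds for ε equal to 0.05 and fails for ε equal to 0.01.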
-
More precisely, the Lagrange multipliers are calculated according to the following sequence.
-
A step 204 preceding step 206 is similar to previously described step 104 in that a kernel matrix is calculated so as to have LxL coefficients, each coefficient of rank h, k being equal to a kernel function of vectors x_n-h and x_n-k. Furthermore a quadratic matrix Q(n) is constructed so as to have 2Lx2L coefficients given by:
-
A linear vector p(n) having L components p_k and L components p_k+L is given by the formula:
-
Wherein δ_k,n-j = 1 when k = n-j and 0 otherwise.
-
The matrix Q(n) and the linear vector p(n) are then input to a quadratic programming library that is arranged to output the values of the Lagrange multipliers in the form of a vector A having 2L positive components such that:
-
Any quadratic programming library arranged for calculating such an argument of a maximum value is suitable, for example the GQP library available on http://www.gnu.org/software/gsl.
-
In formula (1'), the step size µ(n) is not necessary. It can be a constant which, when equal to 1, is the same as not being present. A variable step size improves the method. In that case the method further comprises a step 205 wherein the step size µ(n) is set to a value µ̃ but limited by a minimum µ_min and a maximum µ_max. The values µ_min and µ_max are set in initialization step 100 respectively to a value greater than or equal to zero and to a value preferably less than or equal to one. A possible formula for achieving that is:
-
Before setting µ(n) in step 205, the value µ̃ is updated by the formula:
-
In the formula, a prediction error is given by:
-
Wherein a function f_n of the vector x_n-j is given by the formula:
-
We see here that, for example when j=0, the prediction error is the difference between the last received value of noise y_n and a value which would have been predicted or estimated from the n-1 preceding measures with the help of the function f_n. We also see that not only is the last value of noise compared with an estimation of noise resulting from the currently available function, but every one of the L'-1 preceding received values of noise y_n-j is compared with the value of noise which would have been estimated by the last available function f_n.
-
An adaptive size cost parameter γ, an adaptive step size cost order L', an adaptive step size cost insensitivity ε' and an adaptive step size recursive parameter η are respectively a positive real number greater than one, an integer less than, equal to or greater than L, a positive real number near zero and a positive real number less than two, which can be set in step 100 when step 106 exists. When the value of the size cost parameter γ is equal to one, we see that the updating of the value µ̃ is independent of the prediction error value except for its sign. When furthermore the value of the adaptive step size recursive parameter η is equal to one, a simple expression of the value µ̃ is given by: