EP1936608A1 - On-line learning method and system for speech denoising - Google Patents

Publication number
EP1936608A1
EP1936608A1 EP06301278A EP06301278A EP1936608A1 EP 1936608 A1 EP1936608 A1 EP 1936608A1 EP 06301278 A EP06301278 A EP 06301278A EP 06301278 A EP06301278 A EP 06301278A EP 1936608 A1 EP1936608 A1 EP 1936608A1
Authority
EP
European Patent Office
Prior art keywords
sound
value
linear combination
coefficients
index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP06301278A
Other languages
German (de)
French (fr)
Inventor
Vladimir FRANCE TELECOM JAPAN Braquet
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orange SA
Original Assignee
France Telecom SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by France Telecom SA
Priority to EP06301278A
Priority to PCT/EP2007/064515 (WO2008074893A1)
Publication of EP1936608A1
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering

Definitions

  • The Lagrange multipliers are calculated according to the following sequence.
  • Any quadratic programming library able to calculate such an argument of a maximum value is suitable, for example the GQP library available at http://www.gnu.org/software/gsl.
  • The step size µ(n) is not necessary. It can be a constant which, when equal to 1, is the same as not being present.
  • A variable step size improves the method.
  • The method further comprises a step 205 wherein the step size µ(n) is set to a value µ̃ but limited by a minimum µmin and a maximum µmax.
  • The values µmin and µmax are set in initialization step 100 respectively to a value greater than or equal to zero and to a value preferably less than or equal to one.
  • When the value of the size cost parameter γ is equal to one, the updating of the value µ̃ is independent of the prediction error value except for its sign.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Complex Calculations (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

The system produces an estimator of noise which is an unknown function of physical quantities, without requiring said function to be linear. A microphone (21) is arranged for capturing a sound, and means (22, 24, 29) are arranged for associating a first vector value of the physical quantities with said sound by a common index at the same time as the sound is captured. A generator (23) and a shift register (29) are arranged for storing the captured sounds and said associated first vector values, incrementing a value of the index each time a sound is captured. Once said index value has been incremented a number of times at least equal to a first integer (L) greater than one, each time said index value is incremented the generator (23) generates a current sequence of coefficients for a linear combination of functions which satisfy Mercer conditions, wherein a first argument of each function is the first vector value whose index value corresponds to the rank of the coefficient in the sequence, and sets the values of the coefficients so that each of the last captured sounds, in a number equal to said first integer, is substantially equal to an occurrence of the linear combination wherein a second argument of the functions is the first vector value having the index value associated with that sound. The estimator is produced by the generator (23) performing the linear combination resulting from the current generated sequence when a next captured sound is not pure noise.

Description

  • This invention relates to digital signal processing and in particular to a method of processing a signal such as a speech signal, for example for use in cancellation of non-stationary noise resulting from external events that can be measured or informed.
  • Akinori Ito, Takashi Kanayama, Motoyuki Suzuki and Shozo Makino show an example of the usefulness of such methods and systems in their article entitled "Internal Noise Suppression for Speech Recognition by Small Robots", published on pages 2685-2688 of INTERSPEECH 2005. To suppress unstable noise, they must predict the spectrum of the noise frame by frame. To achieve this, they constructed a neural network that predicts the spectrum of the internal noise from the status of the joints. At least 10 000 samples are required for the learning stage, which points to a slow convergence that can be problematic, particularly as databases become huge.
  • There are many other situations in which speech denoising is useful, such as the examples described in the international patent application WO2006/032760. There, one or more noise reduction filters are calculated from an estimated power spectral density (PSD) of the noise. The estimation of the PSD is not per se an object of WO2006/032760.
  • UK patent application GB2406487 discloses a modified affine projection algorithm for non-stationary signals. The affine projection algorithm (APA) presents fast convergence features and seems to be well adapted for filtering an echo which is correlated with the speech signal. The problem of the APA is that it is not applicable for filtering noise which is uncorrelated with the speech and which is a function of changing physical values other than voice, particularly when said function is non-linear.
  • To avoid the problems of the prior art, an object of the invention is a method or an apparatus for generating an estimator of noise which is an unknown function of physical quantities, without requiring said function to be linear, and with the benefit of fast convergence.
  • In particular, the method according to the invention comprises steps of capturing a sound and associating with the sound, by a common index, a first vector value of physical quantities inducing the sound at the same time as the sound is captured. The capturing step is repeated, incrementing the value of the index at each repetition. Once said index value has been incremented a number of times at least equal to a first integer (L) greater than one, each time said index value is incremented the method further comprises the steps of:
    • generating (107) a current sequence of coefficients for a linear combination of functions which satisfy Mercer conditions wherein a first argument of one of the functions is the first vector value of the one of the capturing steps having the index value corresponding to a rank of the coefficient in the sequence and
    • setting values of the coefficients so that each of the last captured sounds, in a number equal to said first integer, is substantially equal to an occurrence of the linear combination wherein a second argument of the functions is the first vector value of the capturing step having the index value associated with that sound.
    The linear combination resulting from the current generated sequence is performed to produce the estimator when a next captured sound is not pure noise.
  • Preferred modes of implementation of the invention are now described with reference to the drawings wherein:
    • figure 1 is a schematic representation of a device according to the invention;
    • figure 2 presents steps of a first method implementation according to the invention;
    • figure 3 presents steps of a second method implementation according to the invention.
  • Figure 1 is a schematic representation of a device according to the invention for restitution of a signal s which is emitted in a noisy environment. The signal s is for instance a sound of voice type or other that needs to be cleaned from noise for exploitation purpose.
  • The device comprises a microphone 21 for capturing sound and a de-noising module 20 for providing an estimate ŝ of the signal s. To that end, a first contact of a switch 22 between the microphone 21 and the de-noising module 20 is arranged to connect the microphone 21 with the de-noising module 20 when the signal s is present, so as to supply the de-noising module 20 with a received signal r which in that case comprises the signal s and a noise y. The detection of the presence of the signal s is not an object of the invention; it can be realized by a voice activity detection (VAD) system, a camera detecting a person, or any other system, for example simply a button.
  • When the signal s is not present, said first contact of the switch 22 normally connects the microphone 21 to an estimator generator 23 so as to supply it with the noise y. The estimator generator is arranged according to the invention for providing an estimator $\{\alpha_i, x_i, K\}$ that the de-noising module 20 can use to subtract an estimation of noise ŷ from the received signal r so as to provide the estimate ŝ of the signal s. For that purpose, a second contact of the switch 22 is arranged for connecting the estimator generator 23 to the de-noising module 20 at the same time as the first contact of the switch 22 connects the microphone 21 to the de-noising module 20. In its normal state, the second contact of the switch 22 loops the estimator generator 23 back on itself so as to adapt said estimator in real time according to the received noise y.
  • The estimator is provided for giving an estimation of noise that is a function of data which are collected in a vector x through an input 24 of the device. The value of each component of the vector is given for instance by a sensor 25, 26, 27, 28 connected to the input 24. Four sensors are represented here, but it will be easily understood that the invention can be implemented with any number of sensors, more or fewer than four, including a single sensor, in which case the vector x is simply a scalar x. The data can be of any type suited to estimating noise resulting from an event that is measurable by such data and liable to create or contribute to the noise received by the microphone 21. For non-limitative illustration only, the data can be an angle of a moving arm of a robot, a speed or acceleration, the power consumed by a motor, or a sound captured by another microphone.
  • A third contact of the switch 22 is arranged for connecting the input 24 to the de-noising module when the first contact is connecting the microphone 21 to the de-noising module. In that way, a real-time estimation of noise ŷ can be calculated with the help of the estimator, so that the de-noising module can elaborate the estimate ŝ in a way similar to, but not necessarily the same as, the one taught in WO2006/032760.
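  • The routing performed by the three contacts of the switch 22 can be pictured with the minimal sketch below. It is only an illustration under stated assumptions: `route_sample`, `stub_estimator` and the `learn` callback are hypothetical names, and the de-noising is reduced to a plain sample-wise subtraction of the estimated noise from the received signal, which is not necessarily how the de-noising module 20 elaborates its estimate.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def route_sample(
    received: float,
    x: Vector,
    speech_present: bool,
    estimate_noise: Callable[[Vector], float],
    learn: Callable[[Vector, float], None],
) -> Optional[float]:
    """Mimic the switch: de-noise when speech is present, otherwise learn from pure noise."""
    if speech_present:
        # Contacts route microphone, estimator and sensor data to the de-noising side.
        return received - estimate_noise(x)
    # Normal state: the received sound is pure noise and feeds the estimator generator.
    learn(x, received)
    return None


training: List[Tuple[Vector, float]] = []


def stub_estimator(x: Vector) -> float:
    """Hypothetical stand-in for the learned estimator."""
    return 0.5 * x[0]


def remember(x: Vector, y: float) -> None:
    training.append((x, y))


quiet = route_sample(0.2, (0.4,), False, stub_estimator, remember)
s_hat = route_sample(1.2, (0.4,), True, stub_estimator, remember)
```

  • In this sketch the first call only records a noise sample for learning, and the second call returns a de-noised value computed with the stand-in estimator.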
  • When not connecting the input 24 to the de-noising module 20, the third contact of the switch 22 is arranged to connect the input 24 to a shift register 29. An output of each cell is arranged to be connected to the estimator generator 23 when said cell receives, from a preceding cell or from the input 24, a vector value $x_i$, wherein the index i lies between 1 and n, 1 for the oldest value and n for the last one received through the input 24. The manner of shifting the values in the register is not essential to the method according to the invention; it can be driven by a clock of the device, in a manner usually known in the art for sampling, or occur every time a new value is detected. A useful feature of the invention is that the noise is sampled at the same time as a new vector value $x_n$ shifts the preceding ones in the register.
  • The estimator generator 23 is arranged for starting a process of constructing the estimator when it has received from the shift register 29 a predetermined number L of values $x_i$ whose index i is less than or equal to n and greater than n-L. The process is executed by running the steps, now explained, of a method implementing the invention.
  • Referring now to figure 2, the number L is predetermined by setting its value in an initialization step 100. The determination of said value per se is not within the scope of the invention; it can result from theoretical considerations or, more practically, from a user testing the device with successive values of L until an acceptable result is achieved on the estimation of the signal s by the de-noising module 20.
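  • As a rough illustration of the indexing described above, the sketch below keeps the window of the last L indexed pairs in a bounded queue and records when the generator would first have L values available. The window length, the sample values and the names `register` and `started_at` are assumptions made for the example; a real device would also have to retain the older values that the estimator still uses.

```python
from collections import deque
from typing import Deque, List, Optional, Sequence, Tuple

Vector = Sequence[float]

L = 3  # hypothetical window length; the text sets it in initialization step 100

# Each entry pairs an index n with the sensor vector x_n and the noise y_n sampled together.
register: Deque[Tuple[int, Vector, float]] = deque(maxlen=L)
started_at: Optional[int] = None  # first index at which L values are available

samples: List[Tuple[Vector, float]] = [
    ((0.1,), 0.5), ((0.2,), 0.4), ((0.3,), 0.6), ((0.4,), 0.3), ((0.5,), 0.2),
]
for n, (x_n, y_n) in enumerate(samples, start=1):
    register.append((n, x_n, y_n))  # the newest value shifts the oldest one out
    if started_at is None and len(register) == L:
        started_at = n

indices = [i for i, _, _ in register]  # indexes greater than n-L and at most n
```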
  • The estimator $\{\alpha_i, x_i, K\}$ is provided by the estimator generator 23 for calculating an estimate $\hat{y}$ of noise in the form:
    $$\hat{y} := f(x) = \sum_{i=1}^{n} \alpha_i K(x_i, x)$$
  • The noise estimation function f relates to the vector x by a linear combination of expressions of a kernel function K when applied each to the current vector x and to a past value x i of it.
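  • The linear combination above can be evaluated as in the following minimal sketch. The function names, the coefficient values and the choice of a Gaussian kernel with σ = 1 are assumptions made for illustration only; any kernel accepted by the method could be passed instead.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]


def gaussian_kernel(xi: Vector, xj: Vector, sigma: float = 1.0) -> float:
    """Illustrative kernel choice: exp(-||xi - xj||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def estimate_noise(
    alphas: Sequence[float],
    xs: Sequence[Vector],
    x: Vector,
    kernel: Kernel = gaussian_kernel,
) -> float:
    """Evaluate f(x) = sum_i alpha_i * K(x_i, x) over the stored past vectors x_i."""
    return sum(a * kernel(xi, x) for a, xi in zip(alphas, xs))


alphas = [0.5, -0.25]            # hypothetical coefficients
xs = [(0.0, 0.0), (1.0, 0.0)]    # hypothetical past sensor vectors
y_hat = estimate_noise(alphas, xs, (0.0, 0.0))
```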
  • Said kernel function K to be used is any function that satisfies the Mercer condition. Mathematically, the Mercer conditions are satisfied when, for any number n of complex values $a_i$ or $a_j$ and of vectors $x_i$ or $x_j$ with real values,
    $$\sum_{i=0}^{n-1} \sum_{j=0}^{n-1} a_i a_j^{*} K(x_i, x_j)$$
    gives a non-negative real value. It can easily be checked that, for example, the Gaussian function
    $$K(x_i, x_j) = e^{-\frac{\|x_i - x_j\|^2}{2\sigma^2}}$$
    satisfies the Mercer conditions. Therefore this Gaussian function can be used for implementing the invention. Other kernels satisfying the Mercer conditions are known and can also be used, according to what best suits the context in which the device is used. Here is a non-limitative list for illustrative purposes only:
    • a polynomial kernel in the form of $K(x_i, x_j) = (1 + x_i \cdot x_j)^q$
    • an exponential kernel in the form of $K(x_i, x_j) = e^{-\frac{\|x_i - x_j\|}{2\beta_0}}$
    • a sigmoidal kernel in the form of $K(x_i, x_j) = \tanh(\xi_0\, x_i \cdot x_j + \beta_0)$.
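  • A few of the kernels listed above can be written down as in the sketch below, together with a numerical spot check of the double sum for the Gaussian kernel with real coefficients. Parameter defaults, sample points and the helper name `mercer_sum` are assumptions for illustration; a spot check on a handful of points is of course not a proof of the Mercer conditions.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def gaussian_kernel(xi: Vector, xj: Vector, sigma: float = 1.0) -> float:
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def polynomial_kernel(xi: Vector, xj: Vector, q: int = 2) -> float:
    return (1.0 + dot(xi, xj)) ** q


def sigmoidal_kernel(xi: Vector, xj: Vector, xi0: float = 1.0, beta0: float = 0.0) -> float:
    return math.tanh(xi0 * dot(xi, xj) + beta0)


def mercer_sum(kernel: Kernel, xs: Sequence[Vector], a: Sequence[float]) -> float:
    """Double sum of a_i * a_j * K(x_i, x_j); for real a_j the conjugate is a_j itself."""
    n = len(xs)
    return sum(a[i] * a[j] * kernel(xs[i], xs[j]) for i in range(n) for j in range(n))


points = [(0.0,), (1.0,), (2.5,)]   # hypothetical sample vectors
gaussian_check = mercer_sum(gaussian_kernel, points, [1.0, -2.0, 1.0])
polynomial_value = polynomial_kernel((1.0, 2.0), (0.5, 0.5))
```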
  • The coefficients $\alpha_i$ are updated in real time by a loop of steps 101 to 108, wherein step 101 is triggered again for each newly received value of data x, considered as a supplementary last received data $x_n$, at the same time as a received noise y, considered as the last received noise $y_n$. Each time step 101 is executed, a supplementary coefficient $\alpha_n$ is created with a value initialized to zero. The loop is executed for each value of n, said value being initialized in step 100 and incremented in step 103 or 108 to be ready for the following execution of the loop. Step 102 tests whether the number n is greater than or equal to L, so as to furnish coefficients $\alpha_i$ for a total number n of received data at least equal to L. So long as the number n is less than L, step 103 branches back to step 101.
  • Considering an index i between n-L+1 and n, the L last coefficients $\alpha_i$ of the estimator are calculated in step 107 by using the following formula:
    $$\{\alpha_i(n)\}_{i=n-L+1}^{n} := \left\{ \frac{\alpha_i(n-1)}{1+\rho} + \mu(n) \sum_{j=n-L+1}^{n} \chi^{-1}_{n-i,\,n-j}(n)\, Dy_j(n) \right\}_{i=n-L+1}^{n} \tag{1}$$
  • Wherein $Dy_j(n)$ is a distance separating a noise $y_j$ from an estimation of that noise made with the estimation function f using the coefficients $\alpha_m(n-1)$ with the values they had when executing step 101. The noise $y_j$ is the one which was or is measured at the time when n was or is equal to j in step 101.
    $$Dy_j(n) := y_j - \sum_{m=1}^{n-1} \frac{\alpha_m(n-1)}{1+\rho} K(x_m, x_j) \tag{2}$$
  • The set of coefficients $\alpha_i(n)$ on the left side of the setting symbol ":=" denotes the coefficients generated by the current execution of the loop with rank n, whereas the coefficients $\alpha_i(n-1)$ on the right side are those initialized to zero and/or generated by a preceding execution of the loop with rank n-1.
  • In step 107, $\chi^{-1}_{n-i,\,n-j}(n)$ is the coefficient on line n-i, column n-j of the inverse of a kernel matrix $[\chi_{h,k}(n)]_{h=k=0}^{h=k=L-1}$ generated in step 104. Any method known in the art can be used for obtaining the inverse of the kernel matrix. In step 104, executed before step 107 in case of a positive response to the test of step 102, the kernel matrix for the loop of rank n is generated by the formula:
    $$[\chi_{h,k}(n)]_{h=k=0}^{h=k=L-1} := \left[ K(x_{n-h}, x_{n-k}) + \zeta_{h,k} \right]_{h=k=0}^{h=k=L-1} \tag{3}$$
  • Wherein $\zeta_{h,k}$ is a matrix regularization parameter. In other words, it is a value that is equal to zero when the index h is different from the index k and is equal to a constant when the two indexes h and k are equal. The regularization parameter ensures that the matrix $[\chi_{h,k}(n)]_{h=k=0}^{h=k=L-1}$ has an inverse.
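  • The role of the regularization constant can be illustrated with the sketch below, which builds the kernel matrix of step 104 and inverts it by Gauss-Jordan elimination. The Gaussian kernel, the value 0.1 for the constant, the singularity threshold and the function names are assumptions chosen for the example; the text allows any known inversion method.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]
Matrix = List[List[float]]


def gaussian_kernel(xi: Vector, xj: Vector, sigma: float = 1.0) -> float:
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_matrix(recent_xs: Sequence[Vector], zeta: float) -> Matrix:
    """chi[h][k] = K(x_{n-h}, x_{n-k}) + zeta on the diagonal; recent_xs[h] stands for x_{n-h}."""
    size = len(recent_xs)
    return [
        [gaussian_kernel(recent_xs[h], recent_xs[k]) + (zeta if h == k else 0.0)
         for k in range(size)]
        for h in range(size)
    ]


def invert(matrix: Matrix) -> Matrix:
    """Gauss-Jordan inversion with partial pivoting; raises ValueError if singular."""
    n = len(matrix)
    aug = [row[:] + [1.0 if r == c else 0.0 for c in range(n)]
           for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Two identical sensor readings make the unregularized matrix singular.
recent = [(0.3,), (0.3,)]
try:
    invert(kernel_matrix(recent, zeta=0.0))
    singular_without_zeta = False
except ValueError:
    singular_without_zeta = True

inverse = invert(kernel_matrix(recent, zeta=0.1))
```

  • With two identical readings the unregularized matrix cannot be inverted, whereas adding the constant on the diagonal restores an inverse.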
  • The parameters ρ and µ(n) improve the efficiency of the method and will be explained later. Without said parameters or, which is the same, with ρ and µ(n) constant and respectively equal to 0 and to 1, the formulae used in step 107 reduce to the following ones:
    $$\{\alpha_i(n)\}_{i=n-L+1}^{n} := \left\{ \alpha_i(n-1) + \sum_{j=n-L+1}^{n} \chi^{-1}_{n-i,\,n-j}(n)\, Dy_j(n) \right\}_{i=n-L+1}^{n} \tag{4}$$
    $$Dy_j(n) := y_j - \sum_{m=1}^{n-1} \alpha_m(n-1) K(x_m, x_j) \tag{5}$$
  • Mathematical considerations show that setting the coefficients $\alpha_i$ according to formula (4) implies that for every j in the range of n-L+1 to n:
    $$y_j = \sum_{i=1}^{n} \alpha_i(n) K(x_i, x_j) \tag{6}$$
  • It is interesting to note from formula (5) that, when a following loop is executed for a new value of noise $y_{n+1}$, equation (6) has the effect that for every j in the range of n-L+2 to n, that is for the samples which remain among the L last ones, the distance $Dy_j(n+1)$ is equal to zero. The only distance which is different from zero is $Dy_{n+1}(n+1)$, which is given by:
    $$Dy_{n+1}(n+1) := y_{n+1} - \sum_{m=1}^{n} \alpha_m(n) K(x_m, x_{n+1})$$
  • Because the kernel function K satisfies the Mercer condition, it can be shown that the greater L is, the faster the distance $Dy_n(n)$ decreases; in other words, the faster the method converges.
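  • The behaviour stated by formulas (4) to (6) can be checked numerically with the sketch below, which runs the simplified update (ρ = 0, µ = 1) on a toy noise that is a non-linear function of a single sensor value. Several choices are assumptions made for the example: the Gaussian kernel, the sine-shaped toy noise, L = 3, a zero regularization constant, and solving the linear system directly instead of forming the inverse matrix, which gives the same coefficients.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def gaussian_kernel(xi: Vector, xj: Vector, sigma: float = 1.0) -> float:
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z


def predict(alphas: List[float], xs: List[Vector], x: Vector) -> float:
    return sum(a * gaussian_kernel(xm, x) for a, xm in zip(alphas, xs))


def update(alphas: List[float], xs: List[Vector], ys: List[float],
           L: int, zeta: float = 0.0) -> None:
    """One execution of formulas (4)-(5) on the L most recent samples, in place."""
    n = len(xs)
    window = list(range(n - L, n))  # zero-based positions of the last L samples
    # Formula (5); the newest coefficient is still zero, so including it changes nothing.
    dy = [ys[j] - predict(alphas, xs, xs[j]) for j in window]
    gram = [[gaussian_kernel(xs[h], xs[k]) + (zeta if h == k else 0.0) for k in window]
            for h in window]
    for i, delta in zip(window, solve(gram, dy)):
        alphas[i] += delta  # formula (4)


L = 3
xs: List[Vector] = []
ys: List[float] = []
alphas: List[float] = []
max_residual = 0.0
for step in range(8):
    x = (0.7 * step,)
    xs.append(x)
    ys.append(math.sin(x[0]))   # toy noise, non-linear in the sensor value
    alphas.append(0.0)          # step 101: new coefficient initialized to zero
    if len(xs) < L:
        continue                # steps 102/103: wait until L samples are available
    update(alphas, xs, ys, L)
    for j in range(len(xs) - L, len(xs)):
        max_residual = max(max_residual, abs(ys[j] - predict(alphas, xs, xs[j])))
```

  • After every update the estimator reproduces the L most recent noise values up to rounding error, which is the property expressed by formula (6).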
  • Advantageously, the method comprises a step 105 wherein the coefficients $\alpha_i$ are divided by (1+ρ), with a regularization parameter or forgetting factor ρ having a value greater than zero. Thus, each time step 105 is executed after step 102, the coefficients $\alpha_i$ decrease, but in such a manner as to preserve the ratio between the coefficients of any pair. Having been divided many times, the older coefficients, that is those with the smaller indexes, become immaterial after a sufficient number of executions of the loop. These coefficients can then be swapped out of the memory of the device executing the method, saving both storage and computing time. For instance in figure 2, the coefficients $\alpha_i$ with i less than n-L+1 are specifically divided by (1+ρ) in step 105. The coefficients $\alpha_i$ with i greater than n-L are divided by (1+ρ) in step 107 according to formulae (1) and (2). In the following, for implementations without step 105, every time the regularization parameter ρ is present in a formula it will be understood that ρ is null, which is the same as not being present.
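  • The effect of the forgetting factor can be sketched as follows. The helper names, the sample coefficients and the threshold below which a coefficient is treated as immaterial are assumptions for illustration; the text does not specify such a threshold.

```python
from typing import List


def apply_forgetting(alphas: List[float], rho: float) -> List[float]:
    """Divide every coefficient by (1 + rho); ratios between coefficients are unchanged."""
    return [a / (1.0 + rho) for a in alphas]


def iterations_until_negligible(rho: float, tolerance: float) -> int:
    """Count the divisions by (1 + rho) needed before the overall scale drops below tolerance."""
    if rho <= 0.0:
        raise ValueError("rho must be greater than zero for coefficients to decay")
    scale, count = 1.0, 0
    while scale >= tolerance:
        scale /= 1.0 + rho
        count += 1
    return count


alphas = [4.0, 2.0, 1.0]                      # hypothetical coefficients
decayed = apply_forgetting(alphas, rho=0.25)
steps = iterations_until_negligible(rho=0.1, tolerance=1e-3)  # hypothetical threshold
```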
  • Advantageously also, the formula (1) used in step 107 comprises a step size µ(n), which in this case is initialized in step 100 to a value µ(0). The value of the step size µ(n) can be a constant equal to µ(0) for every execution of step 107, or can be varied according to n. In the latter case the method further comprises a step 106 wherein the step size µ(n) is set to a value µ̃ but limited by a minimum µmin and a maximum µmax. The values µmin and µmax are set in initialization step 100 respectively to a value greater than or equal to zero and to a value preferably less than or equal to one. A possible formula for achieving that is:
    $$\mu(n) := \min\left(\max\left(\tilde{\mu}, \mu_{\min}\right), \mu_{\max}\right)$$
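  • The limiting of step 106 amounts to a simple clamp, sketched below. The function name and the default bounds of 0 and 1 are assumptions consistent with the ranges given above.

```python
def clamp_step_size(mu_tilde: float, mu_min: float = 0.0, mu_max: float = 1.0) -> float:
    """Limit the candidate step size mu_tilde to the interval [mu_min, mu_max]."""
    return min(max(mu_tilde, mu_min), mu_max)


too_large = clamp_step_size(1.7)
too_small = clamp_step_size(-0.2)
in_range = clamp_step_size(0.4)
```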
  • Before setting µ(n) in step 106, the value µ̃ is updated by the formula:
    $$\tilde{\mu} := \mu(n-1) + \frac{\eta}{1+\rho} \sum_{j=0}^{L'-1} \operatorname{sign}\left(e_n^j\right) \left|e_n^j\right|_{\varepsilon'}^{\gamma-1} \sum_{m=1}^{n-1} \beta_m K(x_m, x_{n-j})$$
  • In the formula, a prediction error is given by:
    $$e_n^j := y_{n-j} - f_n(x_{n-j})$$
  • Wherein a function $f_n$ of the vector $x_{n-j}$ is given by the formula:
    $$f_n(x_{n-j}) := \sum_{i=1}^{n-1} \alpha_i K(x_i, x_{n-j})$$
  • We see here that, for example when j=0, the prediction error $e_n^0$ is the difference between the last received value of noise yn and a value which would have been predicted or estimated from the n-1 preceding measures with the help of the function f n. We see also that not only is the last value of noise compared with an estimation of noise resulting from the currently available function, but every one of the L'-1 preceding received values of noise yn-j is also compared with the value of noise which would have been estimated by the last available function fn.
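  • The prediction errors described above can be computed as in the following Python sketch. The Gaussian kernel is used only as one well-known example of a function satisfying Mercer conditions; its choice, the width `sigma`, the function names and the sample values are assumptions and are not taken from the description.

```python
import math


def gaussian_kernel(u, v, sigma=1.0):
    """Gaussian kernel K(u, v), assumed here as an example Mercer kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def estimate(alpha, xs, x, kernel=gaussian_kernel):
    """f_n(x): sum of alpha_i * K(x_i, x) over the stored vectors x_i."""
    return sum(a * kernel(xi, x) for a, xi in zip(alpha, xs))


def prediction_errors(alpha, xs, ys, l_prime, kernel=gaussian_kernel):
    """Return [e_n^0, ..., e_n^(l_prime-1)] with e_n^j = y_(n-j) - f_n(x_(n-j)).

    xs and ys hold the n samples received so far (0-based lists);
    alpha holds the n-1 coefficients of the current estimator.
    """
    n = len(ys)
    return [
        ys[n - 1 - j] - estimate(alpha, xs[: n - 1], xs[n - 1 - j], kernel)
        for j in range(l_prime)
    ]


xs = [(0.0,), (1.0,), (2.0,)]   # illustrative vectors of physical quantities
ys = [0.5, 1.0, 0.2]            # illustrative captured noise values
alpha = [0.5, 1.0]              # n-1 coefficients
errors = prediction_errors(alpha, xs, ys, l_prime=2)
```

  • Here `errors[0]` compares the last received value with the estimate, and `errors[1]` compares the preceding value with the estimate given by the same function.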
  • An adaptive size cost parameter γ, an adaptive step size cost order L', an adaptive step size cost insensitivity ε' and an adaptive step size recursive parameter η are respectively a positive real number greater than one, an integer less than, equal to or greater than L, a positive real number near zero and a positive real number less than two, which can be set in step 100 when step 106 exists. When the value of the size cost parameter γ is equal to one, we see that the updating of the value µ̃ is independent of the prediction error. When furthermore the values of the adaptive step size recursive parameter η and of the regularization factor ρ are respectively equal to one and zero, a simple expression of the value µ̃ is given by:
    $$\tilde{\mu} := \mu(n-1) + \sum_{j=0}^{L'-1} \sum_{m=1}^{n-1} \beta_m\, K(x_m, x_{n-j})$$
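  • The general update of µ̃ performed before step 106 can be sketched in Python as follows. The sketch assumes that the ε'-insensitive absolute value is max(|e| - ε', 0), which is the usual definition but is not spelled out in this passage. The function names and the numeric inputs are also assumptions.

```python
def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def eps_insensitive_abs(error, eps_prime):
    """Assumed definition: |e| reduced by eps_prime and floored at zero."""
    return max(abs(error) - eps_prime, 0.0)


def update_mu_tilde(mu_prev, errors, kernel_sums, eta, rho, gamma, eps_prime):
    """Return mu_tilde from mu(n-1).

    errors[j]      is the prediction error e_n^j, j = 0..L'-1
    kernel_sums[j] is sum_m beta_m * K(x_m, x_(n-j)), computed beforehand
    """
    total = sum(
        sign(e) * eps_insensitive_abs(e, eps_prime) ** (gamma - 1.0) * s
        for e, s in zip(errors, kernel_sums)
    )
    return mu_prev + eta / (1.0 + rho) * total


mu_tilde = update_mu_tilde(
    mu_prev=0.5, errors=[0.3, -0.2], kernel_sums=[0.1, 0.4],
    eta=1.0, rho=0.0, gamma=1.0, eps_prime=0.01,
)
```

  • With γ equal to one the magnitude of each error drops out of the sum, so only the signs of the errors and the kernel sums contribute in this example.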
  • In the expression of µ̃, components βm of a weight gradient are updated according to the formula:
    $$(\beta_i)_{i=1}^{n} := \frac{1}{1+\rho}\,(\beta_i)_{i=1}^{n-1} + (\Delta_i)_{i=n-L+1}^{n}$$
  • For every index i in the range n-L+1 to n, the value of a gradient Δ i is given by the formula:
    $$\Delta_i := \frac{\displaystyle\sum_{k=0}^{L-1} \chi(n)^{-1}_{n-i,\,k} \Bigl( y_{n-k} - \sum_{m=1}^{n-1} \bigl(\alpha_m + \mu(n-1)\,\beta_m\bigr)\, K(x_m, x_{n-k}) \Bigr)}{1+\rho}$$
  • A second mode of implementation of the method according to the invention is now described in reference to figure 3 wherein steps 100 to 103 are similar to the ones of the first mode of implementation previously described in reference to figure 2.
  • Considering an index i between n-L+1 and n, the L last coefficients αi of the estimator are calculated in step 207 by using the following formula:
    $$\bigl(\alpha_i(n)\bigr)_{i=n-L+1}^{n} := \left( \frac{\alpha_i(n-1)}{1+\rho} + \mu(n)\, \frac{\lambda_{n-i}^{+}(n) - \lambda_{n-i}^{-}(n)}{1+\rho} \right)_{i=n-L+1}^{n}$$
  • Wherein $\lambda_{n-i}^{+}(n)$ and $\lambda_{n-i}^{-}(n)$ are Lagrange multipliers such that the distance Dyj (n) is less than ε for every j in the range n-L+1 to n, in other words:
    $$-\varepsilon \le y_j - \sum_{m=1}^{n-1} \alpha_m(n-1)\, K(x_m, x_j) \le \varepsilon$$
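  • The coefficient update of step 207 can be sketched in Python as follows, once the Lagrange multipliers are known. The function names, the alignment of the three lists on the index i, the use of 0.0 as the previous value of the newest coefficient and the numeric values are all assumptions made for illustration.

```python
def update_last_coefficients(alpha_prev, lam_plus, lam_minus, mu, rho):
    """Return the L most recent coefficients alpha_i(n), i = n-L+1..n.

    The three lists are aligned on the index i (an assumed layout). The entry
    of alpha_prev for i = n is taken as 0.0 because the newest coefficient has
    no previous value (also an assumption).
    """
    if not (len(alpha_prev) == len(lam_plus) == len(lam_minus)):
        raise ValueError("all inputs must have length L")
    return [
        (a + mu * (lp - lm)) / (1.0 + rho)
        for a, lp, lm in zip(alpha_prev, lam_plus, lam_minus)
    ]


def inside_tube(residual, eps):
    """True when -eps <= residual <= eps, the constraint stated above."""
    return -eps <= residual <= eps


new_alpha = update_last_coefficients(
    alpha_prev=[0.2, 0.0], lam_plus=[0.5, 0.0], lam_minus=[0.0, 0.3],
    mu=0.5, rho=0.25,
)
```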
  • More precisely, the Lagrange multipliers are calculated according to the following sequence.
  • A step 204 preceding step 206 is similar to previously described step 104 in that a kernel matrix
    $$\bigl[ K(x_{n-h}, x_{n-k}) \bigr]_{h=0,\dots,L-1}^{k=0,\dots,L-1}$$
    is calculated with L×L coefficients, each coefficient of rank h, k being equal to a kernel function of vectors x n-h and x n-k. Furthermore a quadratic matrix Q(n) is constructed with 2L×2L coefficients given, for h and k from 0 to L-1, by:
    $$Q_{h,k} = Q_{h+L,\,k+L} = \frac{1}{1+\rho}\, K(x_{n-h}, x_{n-k}) + \zeta_{h,k}$$
    $$Q_{h+L,\,k} = Q_{h,\,k+L} = -\frac{1}{1+\rho}\, K(x_{n-h}, x_{n-k})$$
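  • The assembly of the quadratic matrix Q(n) from the L×L kernel matrix can be sketched in Python as follows. The function name `build_q` is an assumption, and so is the handling of the term ζ: it is passed as an optional L×L matrix and treated as zero when omitted, since this passage does not define it.

```python
def build_q(kernel_matrix, rho, zeta=None):
    """Build the 2L x 2L matrix Q from an L x L kernel matrix.

    kernel_matrix[h][k] is K(x_(n-h), x_(n-k)).
    zeta is an optional L x L matrix added to the two diagonal blocks.
    """
    size = len(kernel_matrix)
    q = [[0.0] * (2 * size) for _ in range(2 * size)]
    for h in range(size):
        for k in range(size):
            scaled = kernel_matrix[h][k] / (1.0 + rho)
            extra = zeta[h][k] if zeta is not None else 0.0
            # Diagonal blocks: Q[h][k] and Q[h+L][k+L]
            q[h][k] = scaled + extra
            q[h + size][k + size] = scaled + extra
            # Off-diagonal blocks carry the opposite sign
            q[h + size][k] = -scaled
            q[h][k + size] = -scaled
    return q


kernel_matrix = [[1.0, 0.5], [0.5, 1.0]]   # illustrative values, L = 2
q = build_q(kernel_matrix, rho=0.0)
```

  • With this block structure the quadratic form built on Q depends only on the differences between the first L and the last L components of the vector it is applied to.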
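  • A linear vector p(n) having L components pk and L components pk+L, is given by the formula:
    $$\begin{pmatrix} p_k \\ p_{k+L} \end{pmatrix} := \begin{pmatrix} \delta_{k,n-j} \Bigl( y_j - \sum_{m=1}^{n-1} \alpha_m(n-1)\, K(x_m, x_j) - \varepsilon \Bigr) \\ \delta_{k,n-j} \Bigl( -\bigl( y_j - \sum_{m=1}^{n-1} \alpha_m(n-1)\, K(x_m, x_j) \bigr) - \varepsilon \Bigr) \end{pmatrix}$$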
  • Wherein δk,n-j=1 when k=n-j and 0 otherwise.
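  • The linear vector p(n) can be assembled as in the following Python sketch, in which the Kronecker symbol simply selects, for each k, the residual of the sample of index j = n-k. The function name `build_p`, the precomputed list `residuals` and the numeric values are assumptions made for illustration.

```python
def build_p(residuals, eps):
    """Return the 2L components of p.

    residuals[k] is y_(n-k) minus the estimate at x_(n-k), for k = 0..L-1.
    The first L components are residual - eps, the last L are -residual - eps.
    """
    upper = [r - eps for r in residuals]
    lower = [-r - eps for r in residuals]
    return upper + lower


p = build_p(residuals=[0.3, -0.1], eps=0.05)
```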
  • The matrix Q(n) and the linear vector p(n) are then input to a quadratic programming library that is arranged to output values of $\lambda_{n-i}^{+}$ and $\lambda_{n-i}^{-}$ in the form of a vector Λ having 2L positive components such that:
    $$\Lambda^{T}(n) = \bigl( \lambda_{n-i}^{+},\ \lambda_{n-i}^{-} \bigr)_{i=n-L+1}^{n} := \operatorname*{Arg\,max}_{\Lambda} \Bigl( -\tfrac{1}{2}\, \Lambda^{T} Q(n)\, \Lambda + \Lambda^{T} p(n) \Bigr)$$
  • Any quadratic programming library arranged for calculating such an argument of a maximum value is suitable, for example the GQP library available on http://www.gnu.org/software/gsl.
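  • For readers without such a library at hand, the maximization can be illustrated with a small Python sketch. It substitutes plain projected gradient ascent for a quadratic programming library, so it is a stand-in technique and not the solver named above. The function name, the iteration count, the step size rule (inverse of the trace of Q) and the one-sample example are assumptions.

```python
def solve_qp_projected_gradient(q, p, iterations=2000):
    """Approximate the argument maximizing -0.5 * L^T Q L + L^T p with L >= 0.

    Projected gradient ascent: follow the gradient p - Q L, then clip negative
    components to zero. The step 1 / trace(Q) is a conservative assumed choice
    for a positive semi-definite Q.
    """
    size = len(p)
    trace = sum(q[i][i] for i in range(size))
    step = 1.0 / trace if trace > 0 else 1.0
    lam = [0.0] * size
    for _ in range(iterations):
        grad = [
            p[i] - sum(q[i][j] * lam[j] for j in range(size))
            for i in range(size)
        ]
        lam = [max(0.0, lam[i] + step * grad[i]) for i in range(size)]
    return lam


# One-sample example (L = 1) with kernel value 1, rho = 0, residual 0.3, eps 0.05
q = [[1.0, -1.0], [-1.0, 1.0]]
p = [0.3 - 0.05, -0.3 - 0.05]
lam = solve_qp_projected_gradient(q, p)
lam_plus, lam_minus = lam[0], lam[1]
```

  • In this example only the multiplier associated with the positive limit of the interval is active, which is expected because the residual lies above +ε.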
  • In formula (1'), the step size µ(n) is not necessary. It can be a constant which, when equal to 1, is the same as not being present. A variable step size improves the method. In that case the method further comprises a step 205 wherein the step size µ(n) is set to a value µ̃ but limited by a minimum µmin and a maximum µmax. The values µmin and µmax are set in initialization step 100 respectively to a value greater than or equal to zero and to a value preferably less than or equal to one. A possible formula for achieving that is:
    $$\mu(n) := \min\bigl(\max(\tilde{\mu},\ \mu_{\min}),\ \mu_{\max}\bigr)$$
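  • Before setting µ(n) in step 205, the value µ̃ is updated by the formula:
    $$\tilde{\mu} := \mu(n-1) + \frac{\eta}{1+\rho} \sum_{j=0}^{L'-1} \operatorname{sign}\bigl(e_n^j\bigr)\, \bigl|e_n^j\bigr|_{\varepsilon'}^{\gamma-1} \sum_{i=n-L}^{n-1} \bigl(\lambda_{n-i}^{+} - \lambda_{n-i}^{-}\bigr)\, K(x_i, x_{n-j})$$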
  • In the formula, a prediction error $e_n^j$ is given by:
    $$e_n^j := y_{n-j} - f_n(x_{n-j})$$
  • Wherein a function f n of the vector x n-j is given by the formula:
    $$f_n(x_{n-j}) := \sum_{i=1}^{n-1} \alpha_i(n-1)\, K(x_i, x_{n-j})$$
  • We see here that, for example when j=0, the prediction error $e_n^0$ is the difference between the last received value of noise yn and a value which would have been predicted or estimated from the n-1 preceding measures with the help of the function f n. We see also that not only is the last value of noise compared with an estimation of noise resulting from the currently available function, but every one of the L'-1 preceding received values of noise yn-j is also compared with the value of noise which would have been estimated by the last available function f n.
  • An adaptive size cost parameter γ, an adaptive step size cost order L', an adaptive step size cost insensitivity ε' and an adaptive step size recursive parameter η are respectively a positive real number greater than one, an integer less than, equal to or greater than L, a positive real number near zero and a positive real number less than two, which can be set in step 100 when step 205 exists. When the value of the size cost parameter γ is equal to one, we see that the updating of the value µ̃ is independent of the value of the prediction error except for its sign. When furthermore the value of the adaptive step size recursive parameter η is equal to one, a simple expression of the value µ̃ is given by:
    $$\tilde{\mu} := \mu(n-1) + \frac{1}{1+\rho} \sum_{j=0}^{L'-1} \operatorname{sign}\bigl(e_n^j\bigr) \sum_{i=n-L}^{n-1} \bigl(\lambda_{n-i}^{+} - \lambda_{n-i}^{-}\bigr)\, K(x_i, x_{n-j})$$

Claims (11)

  1. Method for producing an estimator of noise which is an unknown function of physical quantities without necessity for said function to be linear comprising steps of:
    - capturing (101) a sound and associating to said sound by a common index a first vector value of said physical quantities at the same time as the sound is captured;
    - repeating (102) said step of capturing, incrementing a value of the index each time said step is repeated, and when said index value is or has been incremented a number of times at least equal to a first integer (L) greater than one, each time said index value is incremented:
    - generating (107) a current sequence of coefficients for a linear combination of functions which satisfy Mercer conditions wherein a first argument of one of the functions is the first vector value of the one of the capturing steps having the index value corresponding to a rank of the coefficient in the sequence and
    - setting values of the coefficients so as to a quantity of the last captured sounds equal to said first integer be substantially equal each to an occurrence of the linear combination wherein a second argument of the functions is the first vector value of the capturing step having the index value associated to the sound;
    - performing the linear combination resulting from the current generated sequence to produce the estimator when a next captured sound is not pure noise.
  2. Method according to Claim 1 wherein when a previous sequence was generated in relation with a preceding capturing step, the values of the coefficients are set for the current sequence being at a minimum distance of the previous sequence according to a predetermined metric associated with sequences.
  3. Method according to Claim 1 or 2 wherein one of the last captured sounds is considered substantially equal to said occurrence of the linear combination when a difference between the said one sound and the occurrence is in a sufficiently small interval comprising zero and wherein the values of the (L) more recently generated coefficients comprise a difference $\lambda_i^{+} - \lambda_i^{-}$ between two Lagrange multipliers, a first one and a second one corresponding respectively to a positive limit and to a negative limit of said small interval.
  4. Method according to Claim 3 wherein said difference between two Lagrange multipliers is multiplied by a step size which is updated according to said Lagrange multipliers.
  5. Method according to Claim 1 or 2 wherein one of the last captured sounds is considered substantially equal to said occurrence of the linear combination when a difference between the said one sound and the occurrence is equal to zero and wherein the values of the (L) more recently generated coefficients comprise a difference between a last captured sound and the linear combination associated with a preceding value of said common index.
  6. Method according to Claim 5 wherein said difference between the last captured sound and the linear combination is multiplied by a step size which is updated according to another difference which is between the last captured sound and a previous occurrence of the linear combination.
  7. Method according to anyone of the preceding Claims wherein said coefficients are multiplied by a forgetting factor having a value less than one each time said common index is incremented.
  8. System for producing an estimator of noise which is an unknown function of physical quantities without necessity for said function to be linear comprising:
    - a microphone (21) arranged for capturing a sound and means (22, 24, 29) arranged for associating to said sound by a common index a first vector value of said physical quantities at the same time as the sound is captured;
    - a generator (23) and a shift register (29) arranged for storing captured sounds and associated said first vector value by incrementing a value of the index each time a sound is captured, and when said index value is or has been incremented a number of times at least equal to a first integer (L) greater than one, each time said index value is incremented:
    the generator (23) is arranged for generating a current sequence of coefficients for a linear combination of functions which satisfy Mercer conditions wherein a first argument of one of the functions is the one of first vector values having the index value corresponding to a rank of the coefficient in the sequence and
    for setting values of the coefficients so as to a quantity of the last captured sounds equal to said first integer be substantially equal each to an occurrence of the linear combination wherein a second argument of the functions is another one of first vector values having the index value associated to the sound;
    - so as to produce the estimator when a next captured sound is not pure noise by performing the linear combination resulting from the current generated sequence.
  9. System according to Claim 8 wherein the generator (23) is arranged for setting the values of the coefficients of the current sequence being at a minimum distance of a previous sequence according to a predetermined metric associated with sequences.
  10. System according to Claim 8 or 9 wherein one of the last captured sounds is considered substantially equal to said occurrence of the linear combination when a difference between the said one sound and the occurrence is in a sufficiently small interval comprising zero and wherein the values of the (L) more recently generated coefficients comprise a difference $\lambda_i^{+} - \lambda_i^{-}$ between two Lagrange multipliers, a first one and a second one corresponding respectively to a positive limit and to a negative limit of said small interval.
  11. Method according to Claim 8 or 9 wherein one of the last captured sounds is considered substantially equal to said occurrence of the linear combination when a difference between the said one sound and the occurrence is equal to zero and wherein the values of the (L) more recently generated coefficients comprise a difference between a last captured sound and the linear combination associated with a preceding value of said common index.

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP06301278A EP1936608A1 (en) 2006-12-21 2006-12-21 On-line learning method and system for speech denoising
PCT/EP2007/064515 WO2008074893A1 (en) 2006-12-21 2007-12-21 On-line learning method and system for speech denoising

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP06301278A EP1936608A1 (en) 2006-12-21 2006-12-21 On-line learning method and system for speech denoising

Publications (1)

Publication Number Publication Date
EP1936608A1 true EP1936608A1 (en) 2008-06-25

Family

ID=37891750

Family Applications (1)

Application Number Title Priority Date Filing Date
EP06301278A Withdrawn EP1936608A1 (en) 2006-12-21 2006-12-21 On-line learning method and system for speech denoising

Country Status (2)

Country Link
EP (1) EP1936608A1 (en)
WO (1) WO2008074893A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000014731A1 (en) * 1998-09-09 2000-03-16 Ericsson Inc. Apparatus and method for transmitting an improved voice signal over a communications device located in a vehicle with adaptive vibration noise cancellation
US20050187763A1 (en) * 2004-02-23 2005-08-25 General Motors Corporation Dynamic tuning of hands-free algorithm for noise and driving conditions

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A. ITO ET AL.: "Internal Noise Suppression for Speech Recognition by Small Robots", PROC. EUROSPEECH / INTERSPEECH, 4 September 2005 (2005-09-04), Lisbon, Portugal, pages 2685 - 2688, XP002430339 *
SCHÖLKOPF: "Statistical Learning and Kernel Methods", TECHNICAL REPORT MICROSOFT RESEARCH MSR TR, no. MSR-TR-2000-23, 29 February 2000 (2000-02-29), pages 1 - 8, XP002413484 *

Also Published As

Publication number Publication date
WO2008074893A1 (en) 2008-06-26

Similar Documents

Publication Publication Date Title
EP3776534B1 (en) Systems, methods, and computer-readable media for improved real-time audio processing
EP3142106B1 (en) Apparatus and method for generating acoustic model, and apparatus and method for speech recognition
CN110767223B (en) Voice keyword real-time detection method of single sound track robustness
Wu et al. Application of the unscented Kalman filter for real‐time nonlinear structural system identification
Anava et al. Online learning for time series prediction
EP3584573B1 (en) Abnormal sound detection training device and method and program therefor
Popescu Blind separation of vibration signals and source change detection–Application to machine monitoring
EP2877993B1 (en) Method and device for reconstructing a target signal from a noisy input signal
US20080293372A1 (en) Optimum Nonlinear Correntropy Filted
CN112180899B (en) State estimation method of system under intermittent anomaly measurement detection
KR101620866B1 (en) Dictionary learning based target source separation using induction algorithm
Zheng et al. Recursive adaptive algorithms for fast and rapidly time-varying systems
Mboup Parameter estimation via differential algebra and operational culculus
Bose et al. Framework for automated earthquake event detection based on denoising by adaptive filter
Kang et al. A novel recursive modal parameter estimator for operational time-varying structural dynamic systems based on least squares support vector machine and time series model
CN107564512A (en) Voice activity detection method and device
Li et al. Nonlinear model identification from multiple data sets using an orthogonal forward search algorithm
EP1936608A1 (en) On-line learning method and system for speech denoising
Ajay et al. Comparative study of deep learning techniques used for speech enhancement
WO2021062705A1 (en) Single-sound channel robustness speech keyword real-time detection method
Zhang et al. Denoising and trend terms elimination algorithm of accelerometer signals
Liu et al. State estimation for discrete-time Markov jump linear systems with multiplicative noises and delayed mode measurements
Chu et al. A new regularized TVAR-based algorithm for recursive detection of nonstationarity and its application to speech signals
Likhonina Hand Detection Algorithm: Pre-processing Stage.
Basiri et al. Fast and robust bootstrap method for testing hypotheses in the ICA model

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA HR MK RS

AKX Designation fees paid
REG Reference to a national code

Ref country code: DE

Ref legal event code: 8566

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20081230