WO2008067834A1 - Dropout concealment for a multi-channel arrangement - Google Patents

Publication number
WO2008067834A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
channel
time
channels
filter
Application number
PCT/EP2006/011759
Other languages
French (fr)
Inventor
Martin Opitz
Cornelia Falch
Robert Höldrich
Original Assignee
Akg Acoustics Gmbh
Application filed by Akg Acoustics Gmbh
Priority to CN2006800565725A (patent CN101548555B)
Priority to JP2009539608A (patent JP4976503B2)
Priority to EP06818999A (patent EP2092790B1)
Priority to PCT/EP2006/011759 (patent WO2008067834A1)
Priority to DE602006015376T (patent DE602006015376D1)
Priority to AT06818999T (patent ATE473605T1)
Publication of WO2008067834A1
Priority to US12/479,046 (patent US8260608B2)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S1/00 Two-channel systems
    • H04S1/007 Two-channel systems in which the audio signals are in digital form
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/005 Correction of errors induced by the transmission channel, if related to the coding algorithm

Definitions

  • the invention relates to a method for the concealment of dropouts in one or more channels of a multi-channel arrangement comprising at least two channels, wherein a replacement signal is generated in the event of a dropout in one channel with the aid of at least one error-free channel.
  • the wireless transmission of audio signals has constituted an important area of research since the introduction of the wireless microphone on the market at the beginning of the 1990s. At present, these products are used as standard equipment in the area of stage performances, concerts and live shows.
  • the use of digital transmission links offers the advantage of transmitting metadata in addition to the audio data.
  • This metadata can contain, for example, information about the overall concept of a stage installation.
  • the notion of combining the individual channels and exploiting their interoperability in future systems can be realised by means of digital technologies.
  • the fast development of the underlying hardware in terms of computing power and storage capacity supports the progress of software implementations.
  • the method of the wireless transmission of signals is not resistant to influences that can crop up along the transmission link.
  • disturbances directly lead to the loss of data, and hence, to a total signal dropout.
  • the degradation of the signal quality, acoustically perceptible as cracks or clicks, is unacceptable at any rate and must be compensated for using appropriate technologies that are incorporated at the receiver side. Since the concealment unit represents an active element in the signal path, the impact of its inherent processing delay must be taken into consideration.
  • the simplest methods for the receiver-based concealment of dropouts are represented by the so-called intra-channel concealment techniques, in which each channel of a multi-channel arrangement is treated separately.
  • Standard concealment methods apply substitution and prediction algorithms.
  • the latter generally comprise two stages, the analysis unit and the re-synthesis model of the linear prediction error filter.
  • the first stage serves for estimating the filter coefficients and is executed continuously during error-free signal transmission. If a dropout occurs, the lost signal samples are reconstructed by the filtering process. This corresponds to an extrapolation and is suited to the concealment of dropouts of a few milliseconds in general broadband audio signals.
  • if the real-time constraint is not as stringent (for example, the buffering of data is permissible), the extrapolation is transformed into an interpolation and longer dropouts can therefore be handled.
  • US 2006/0171373 A1 discloses a single-channel method for the concealment of data losses that makes use of a linear prediction estimate from the intact signal component immediately preceding the dropout.
  • the prediction coefficients obtained by means of a spectral analysis filter are used to estimate a residual signal.
  • a maximum repeatable range is determined for the residual signal over several stages.
  • the spectral analysis of the transmitted signal merely serves for an improved detection of the periodicity, which leads to the classic signal repetition. This period is repeated and the all-pole filter of the linear prediction is applied to it.
  • the residual signal emerges from preceding intact signal components that are filtered inversely with the currently calculated filter coefficients, yielding the estimated replacement signal. All computation required for signal reconstruction is performed in the time domain, which is characteristic of the suggested method and results in substantial processing delay. Hence, it is unsuitable for real-time applications.
  • DE 19735675 C2 also discloses a single-channel concealment method.
  • the algorithm incorporates a perceptually adapted subband decomposition based on psychoacoustic aspects.
  • the notion of signal reconstruction is to maintain the spectral energy in each subband. If a dropout occurs, an estimation of the signal is obtained by a properly filtered noise signal. Large dropouts yield an unchanged "sound surface".
  • the filter coefficients solely imply the energy information, thus, the preceding time samples are not incorporated.
  • EP 1 145 227 B1 discloses a single-channel concealment method for the transmission of coded audio signals in the context of the MPEG coding standard.
  • the transmitted data comprise spectral coefficients rather than time samples.
  • a perceptually adapted subband splitting is applied to the signal section preceding the dropout by combining several MDCT (modified discrete cosine transform) coefficients into one subband. Since a dropout affects certain subbands, these are transformed back into the time domain, and a narrow-band signal is predicted there. The estimated narrow-band signal is in turn MDCT-transformed and inserted into the MDCT stream transmitted in MPEG coding.
  • an STFT (short-time Fourier transform) representation is computed directly from the MDCT representation. Interpolation results are obtained in the STFT domain; signal components succeeding the dropout are therefore required, i.e. the method induces additional latency.
  • the interpolation itself is carried out per DFT-bin (discrete Fourier transformation) by use of the GAPES (gapped-data amplitude and phase estimation) algorithm. After the interpolation, the STFT data are transformed back into MDCT data.
  • the transmission format is composed in such a manner that an actual audio channel is only transmitted in one single, so-called "source channel,” whereas LSF (line spectral frequencies) vectors are transmitted in the remaining channels.
  • the LSF vectors represent a (complex valued) spectral interpretation of a time signal and correspond exactly to the linear prediction coefficients. Thus, they contain all of the information on the phase relationships of the spectral envelope.
  • the estimation of the LSF vectors is done by means of a Gaussian mixture model (GMM).
  • the method incorporates subband decomposition, per band and channel prediction, and retransformation into linear prediction coefficients with appropriate filtering of the reference residuum.
  • the replacement signal i.e. of the LSF vectors
  • the entire signal information including the phase information is always transmitted.
  • the different LSF vectors of the individual channels contain information about the characteristics of different microphones that are spaced apart from each other, and which simultaneously pick up a sound event, for example a concert.
  • the efficiency of such technologies is either mostly restricted to its area of application (for example, pre-mixed multi-channel recordings) or is characterised predominantly by the convergence behavior of the adaptive filters, thus is highly variable due to the non-stationary input signals in connection with the dropouts of the target signal.
  • the aim of the present invention consists in providing a concealment method that uses the intact channels of a multi-channel system to replace the lost signal in such a way that the difference between the original signal and its replacement is rendered inaudible.
  • the usability in delay-critical real-time systems constitutes an important criterion, for which reason ultra-low latency techniques are in demand for the processing of signals.
  • this objective is achieved with a method mentioned at the outset, in that during the error-free signal transmission of the channels a mapping takes place of the transmitted signals into the frequency domain, the absolute value of the frequency spectrum being determined, that spectral filter coefficients are calculated that relate the magnitude spectrum of a channel to the magnitude spectrum of at least one other channel, and that in the event of the dropout of one channel the replacement signal is generated by computation of filter coefficients prior to the dropout and application of them to a substitution signal which consists of at least one error-free channel.
  • the concealment filter is calculated using the magnitude spectra, thus, without regard to the phase information, providing a more stable filter, and an improved quality of the replacement signal, respectively.
  • a significant advantage compared to single-channel methods currently in use also lies in the utilisation of the interoperability between the individual signals.
  • As an extension of the basic method, a modified treatment of the phase information is proposed. In so doing, the constancy of the phase transition at the beginning and at the end of the dropout is improved by taking into account the average time delay between target and replacement signal. A time delay between the respective channels, independent of their source direction, emerges according to the spatial arrangement of the multi-channel recording system.
  • Fig. 1 shows a schematic representation of the transmission chain according to the invention
  • Fig. 2 shows a detailed block diagram of the dropout concealment of the invention for a two-channel system
  • Fig. 3 shows a block diagram of a multi-channel arrangement of, for example eight channels
  • Fig. 4 shows a flowchart of the entire invention, consisting of the estimation of the spectral filter, the determination of the time delay between the channels, as well as the weighted superposition of all channels in order to generate the substitution signal, and
  • Fig. 5 shows the layout of the device according to the invention for dropout concealment that is to be integrated into each channel of the multi-channel arrangement.
  • the preferred area of application of the present invention is within the overall system of a multi-channel (optionally wireless) transmission of digital audio data.
  • the entire structure of a transmission chain is depicted in Fig. 1 and typically comprises the following stages for one channel:
  • Signal source 1 e.g. a sensor for recording signals (microphone), analog-digital converter 2 (ADC), optional signal compression and coding on the transmitter side, transmitter 3, transmission channel, receiver 4, concealment module 5.
  • At the output of the concealment module 5, the audio signal is available in digital form; further signal processing units can be connected directly, for example a pre-amp, equalizer, etc.
  • the proposed concealment method is independent of the transmitter/receiver unit as well as the source coding and acts solely on the receiver side (receiver-based technique). It can therefore be integrated flexibly as an independent module into any transmission path. In some transmission systems (e.g. digital audio streaming), different concealment strategies are implemented simultaneously. While the application shown in Fig. 1 does not provide for any further concealment units, a combination with alternative technologies is possible.
  • multi-channel arrangements range from stereo recordings to different variations of surround recordings (e.g. OCT Surround, Decca Tree, Hamasaki Square, etc.) potentially supported by different forms of spot microphones.
  • the signals of the individual channels are comprised of similar components whose particular composition is often quite non-stationary.
  • a dropout in one main microphone channel can be concealed according to the present invention introducing little or no latency.
  • Multi-channel audio transmission in studios proceeds at different physical layers (e.g. optical fiber waveguides, AES-EBU, CAT5)
  • dropouts can occur for various reasons, for example due to loss of synchronization, which must be prevented or concealed especially in critical applications such as, for example, in the transmission operations of a radio station.
  • the concealment method according to the invention can be used as a safety unit with a low processing latency.
  • although audio transmission over the internet is less delay-sensitive than the abovementioned areas, transmission errors occur more frequently, resulting in an increased degradation of the perceptual audio quality.
  • the inventive concealment method offers an improvement of the quality of service.
  • the method according to the invention can also be used in the framework of a spatially distributed, immersive musical performance, i.e. in the implementation of a collaborative concert of musicians that are separated spatially from each other. In this case, the ultra-low latency processing strategy of the proposed algorithm benefits the system's overall delay.
  • the invention is not restricted to the following embodiment. It is merely intended to explain the inventive principle and to illustrate one possible implementation.
  • the dropout concealment method is described for one channel afflicted with dropouts. If transmission errors occur in more than one channel of the multi-channel arrangement, the system can easily be expanded.
  • the channel afflicted with dropouts is defined as target channel or signal.
  • the replica (estimation) of this signal that is to be generated during dropout periods is referred to as replacement signal.
  • At least one substitution channel is required for the computation of the replacement signal.
  • the proposed algorithm is composed of two parts. Computations of the first part are carried out permanently, whereas the second part is only activated in the case of a dropout in the target channel.
  • the coefficients of a linear-phase FIR (finite impulse response) filter of length L_Filter are permanently being estimated in the frequency domain.
  • the required information is provided by the optionally non-linearly distorted and optionally time-averaged short-term magnitude spectra of the target and substitution channel. This new type of filter computation disregards any phase information and thus differs fundamentally from the correlation-dependent adaptive filters.
  • Selection of the substitution channel or substitution channels
  • Fig. 2 shows a block diagram of the multi-channel dropout concealment method for a target signal x_z and a substitution signal x_s.
  • the individual steps of the method are each indicated by a box containing a reference symbol and denoted in the subsequent table:
  • the correct selection of a substitution channel depends on the similarity between the substitution and target signal. This similarity can be determined by estimating the cross-correlation or coherence. (See explanations on coherence and on generalized cross-power spectral density (GXPSD) at the end of the specification.) According to the invention, the GXPSD is proposed as a potential selection strategy.
  • Y 23 j ⁇ k ⁇ is used as particular example in embodiments 1. to 9. (A total of K channels are
  • the channel x_0(n) being designated as the target channel x_z(n).): 1.
  • the J ⁇ channel is defined as a substitution signal by the
  • a fixed allocation can be established between the channels in advance if the user (e.g. a sound engineer) knows the characteristics of the individual channels (according to the selected recording method) and hence their joint signal information.
  • x_s(n) denotes the substitution channel composed of the channels
  • ⁇ ( i) represents the frequency-averaged coherence function
  • time delay between the selected channel pairs is considered by Δτ_j (cf. section "Estimation of the time delay between target and substitution channel").
  • the validity of the potential signals is verified incorporating the status bit ⁇ o(y ' ) .
  • a simplification of 4. is proposed that considers a pre-selected set of channels J rather than all available channels / .
  • the weighted sum is built using ⁇ (7) ej ⁇
  • the pre-selection is intended to yield channels whose frequency-averaged coherence function exceed a prescribed threshold ⁇ :
  • the selection can be carried out separately for different frequency bands, i.e. in each band the "optimal" substitution channel is determined on the basis of the coherence function, the respective band pass signals are filtered using the method according to the invention, optionally in a time-delayed manner (c.f. "Estimation of the time delay between target and substitution channel"), superposed and used as a replacement signal.
  • the same criteria apply as in 1., 4., 5., instead of the frequency-averaged function ⁇ ( A .
  • substitution channels can also be selected.
  • the processing is carried out separately for each channel, i.e. several replacement signals are generated. These are weighted according to their coherence function, combined and inserted into the dropout.
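The selection strategies above can be illustrated with a minimal sketch. It assumes a block-averaged magnitude-squared coherence as the similarity measure; the function names, the block and hop sizes and the threshold value are hypothetical and not taken from the specification. Channels whose status flag is invalid or whose frequency-averaged coherence stays below the threshold are discarded, and the remaining channels receive weights proportional to their coherence.

```python
import numpy as np

def mean_coherence(x, y, block=256, hop=128):
    """Frequency-averaged magnitude-squared coherence from Hann-windowed blocks."""
    win = np.hanning(block)
    starts = range(0, len(x) - block + 1, hop)
    X = np.array([np.fft.rfft(win * x[s:s + block]) for s in starts])
    Y = np.array([np.fft.rfft(win * y[s:s + block]) for s in starts])
    pxy = np.mean(X * np.conj(Y), axis=0)      # cross-power spectral density
    pxx = np.mean(np.abs(X) ** 2, axis=0)      # auto-power spectral densities
    pyy = np.mean(np.abs(Y) ** 2, axis=0)
    coh = np.abs(pxy) ** 2 / (pxx * pyy + 1e-12)
    return float(np.mean(coh))

def select_substitutes(target, candidates, valid, threshold=0.5):
    """Return {channel index: weight} for valid channels above the threshold."""
    scores = {j: mean_coherence(target, c)
              for j, (c, ok) in enumerate(zip(candidates, valid)) if ok}
    chosen = {j: s for j, s in scores.items() if s > threshold}
    total = sum(chosen.values())
    return {j: s / total for j, s in chosen.items()} if chosen else {}

rng = np.random.default_rng(0)
target = rng.standard_normal(8192)
similar = target + 0.1 * rng.standard_normal(8192)   # strongly related channel
unrelated = rng.standard_normal(8192)                 # independent channel
# The third candidate is similar but flagged invalid by its status bit.
weights = select_substitutes(target, [similar, unrelated, similar],
                             valid=[True, True, False])
```

In this toy setup only the first candidate survives the selection, so it carries the full weight. A band-wise variant would apply the same test to each frequency band instead of to the frequency average.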
  • the computation during error-free transmission is performed in the frequency domain, thus in a first step an appropriate short-term transformation is necessary, resulting in a block-oriented algorithm that requires a buffering of target and substitution signal.
  • the block size should be aligned to the coding format.
  • the estimation of the envelopes of the magnitude spectra of target and substitution signal are used to determine the magnitude response of the concealment filter.
  • the exact narrow-band magnitude spectra of the two signals are not relevant, rather broad-band approximations are sufficient, optionally time-averaged and/or non-linearly distorted by a logarithmic or power function.
  • the estimation of the spectral envelopes can be implemented in various ways.
  • the computationally most efficient option is the short-term DFT with short block length, i.e. the spectral resolution is low.
  • a signal block is multiplied by a window function (e.g. Hanning), subjected to the DFT, the magnitude of the short-term DFT is optionally distorted non-linearly and subsequently time-averaged.
  • the maxima are detected in the magnitude spectrum of the short-term DFT and the envelope between neighboring maxima is calculated by means of linear or non-linear interpolation, optionally followed by a non-linear distortion of the so obtained envelopes of the magnitude spectra and, subsequent to this, time-averaging.
  • an exponential smoothing of the optionally non-linearly distorted magnitude spectra can be used, as represented in equations (1) with time constant a for the exponential smoothing.
  • the time-averaging can be formed by a moving average filter.
  • the non-linear distortion can, for example, be carried out by means of a power function with arbitrary exponents which, in addition, can be selected differently for the target and substitution channel, as depicted in equations (1) by the exponents ⁇ and ⁇ . (Alternatively, a logarithmic function can also be used.)
  • the non-linear distortion offers the advantage of weighting time periods with high or low signal energy differently along the time-varying progression of each frequency component.
  • exponents ⁇ and ⁇ greater than 1 denote an expansion, i.e. peaks along the signal progression dominate the result of the time-averaging, whereas exponents less than 1 signify a compression, i.e. enhance periods with low signal energy.
  • the optimal selection of the exponent values depends on the sound material to be expected.
  • equations (1) constitute a special case for the calculation of the spectral envelopes of target and substitution channel with exponential smoothing and arbitrary distortion exponents.
  • the invention comprises the method with any time-averaging methods and any non-linear distortions of the envelopes of the magnitude spectra and hence, any values for the exponents ⁇ and ⁇ .
  • the use of a logarithmic or exponential function is included, too.
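The envelope estimation by windowing, short-term DFT, power-law distortion and exponential smoothing can be sketched as follows. Equations (1) are not reproduced in this text, so the exact weighting convention is an assumption: here the previous envelope is weighted by `alpha` and the new distorted magnitude by `1 - alpha`. The function name and parameter names are hypothetical.

```python
import numpy as np

def update_envelope(prev_env, block, exponent=1.0, alpha=0.9):
    """One exponential-smoothing step on the distorted short-term magnitude spectrum."""
    win = np.hanning(len(block))
    mag = np.abs(np.fft.rfft(win * block)) ** exponent   # non-linear distortion
    if prev_env is None:                                  # first block: no history yet
        return mag
    return alpha * prev_env + (1.0 - alpha) * mag

rng = np.random.default_rng(1)
signal = rng.standard_normal(64 * 20)
env = None
for m in range(20):                                       # m plays the role of the block index
    env = update_envelope(env, signal[m * 64:(m + 1) * 64], exponent=2.0)
```

With a short block of 64 samples the envelope has only 33 bins, which matches the remark that a low spectral resolution is sufficient. An exponent above 1 lets energetic blocks dominate the average, while an exponent below 1 emphasises quiet passages.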
  • the block index m is omitted, though all magnitude values such as S_s, S_z or H are considered to be time-variant and therefore a function of block index m.
  • concealment filters are calculated by minimizing the mean square error between the target signal and its estimation.
  • the present invention examines the error of the estimated magnitude spectra:
  • E(k) corresponds to the difference between the envelope of the magnitude spectra of the optionally non-linearly distorted optionally smoothed target signal and its estimation.
  • the optimization problem is observed separately for each frequency component k .
  • the simplest realisation of the spectral filter H(k) would be determined by the two envelopes, with
  • a constraint of H(k) is suggested through the introduction of a regularization parameter.
  • the underlying intention is to prevent the filter amplification from rising disproportionally if the signal power of S_s is too weak and hence, background noise becomes audible or the system becomes perceptibly unstable. If, for example, the spectral peaks of one time-block of S_z and S_s are not located in exactly the same frequency band,
  • the background noise power P g (k) can be estimated incorporating the time-averaged
  • the regularisation parameter ⁇ k) is proportional to the rms value of the
  • ⁇ k c- ⁇ P g (&) r , and c typically between 1 and 5.
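The effect of the regularization can be illustrated with a small numerical sketch. The exact regularized expression of the specification is not recoverable from this text, so the form below is an assumption: a Tikhonov-style gain that minimises (S_z - H S_s)^2 + beta H^2 per frequency bin. The function name and the value of `beta` are hypothetical.

```python
import numpy as np

def concealment_gain(env_target, env_subst, beta):
    """Tikhonov-regularised gain: argmin_H (S_z - H*S_s)^2 + beta*H^2 per bin."""
    return env_target * env_subst / (env_subst ** 2 + beta)

s_z = np.array([1.0, 2.0, 0.5, 1.0])       # target envelope per bin
s_s = np.array([2.0, 1.0, 0.5, 1e-6])      # last bin: substitution nearly silent
plain = s_z / s_s                           # unregularised ratio explodes in the last bin
gain = concealment_gain(s_z, s_s, beta=1e-2)
```

Where the substitution envelope is strong, the regularised gain stays close to the plain ratio of the envelopes. Where it is weak, the gain is pulled towards zero instead of amplifying background noise.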
  • H is proposed specifically for quasi-stationary input signals.
  • the envelopes of the magnitude spectra are first estimated without time-averaging and optionally non-linear distortion. Both modifications are considered during the determination of the filter coefficients, according to:
  • a status bit can be transmitted at a reserved position within the respective audio stream (e.g. between audio data frames), and continuously registered at the receiver side. It would also be conceivable to perform an energy analysis of the individual frames and to identify a dropout if it falls below a certain threshold. A dropout could also be detected through synchronization between transmitter and receiver.
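The energy-based detection variant mentioned above can be sketched in a few lines. The frame layout, the function name and the threshold are hypothetical; a status bit supplied by the receiver would replace this test entirely.

```python
import numpy as np

def detect_dropouts(frames, threshold):
    """Flag frames whose mean energy falls below the threshold."""
    energy = np.mean(np.asarray(frames, dtype=float) ** 2, axis=1)
    return energy < threshold

frames = np.ones((5, 32))
frames[2] = 0.0                      # simulated lost frame
flags = detect_dropouts(frames, threshold=1e-6)
```

A pure energy test cannot distinguish a dropout from genuine digital silence, which is one reason a transmitted status bit may be preferable where it is available.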
  • the replacement signal must be generated using the lastly estimated filter coefficients and the substitution channel(s), and is directly fed to the output of the concealment unit.
  • the estimation of the filter coefficients is deactivated.
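The split between permanent estimation and dropout handling can be sketched as a small state object. This is an illustration under assumptions, not the implementation of the specification: the class name, the smoothing constant `alpha`, the regularization constant `beta` and the regularized gain formula are all hypothetical choices.

```python
import numpy as np

class ConcealmentState:
    """Keeps smoothed envelopes while the target is intact; frozen during dropouts."""

    def __init__(self, bins, alpha=0.9, beta=1e-3):
        self.env_z = np.zeros(bins)      # target envelope
        self.env_s = np.zeros(bins)      # substitution envelope
        self.alpha, self.beta = alpha, beta
        self.gain = np.ones(bins)

    def step(self, target_block, subst_block, target_ok):
        """Update the gain only when the target block arrived intact."""
        if target_ok:
            win = np.hanning(len(target_block))
            mag_z = np.abs(np.fft.rfft(win * target_block))
            mag_s = np.abs(np.fft.rfft(win * subst_block))
            self.env_z = self.alpha * self.env_z + (1 - self.alpha) * mag_z
            self.env_s = self.alpha * self.env_s + (1 - self.alpha) * mag_s
            self.gain = self.env_z * self.env_s / (self.env_s ** 2 + self.beta)
        return self.gain                 # during a dropout: last estimate, unchanged

rng = np.random.default_rng(3)
state = ConcealmentState(bins=33)
for _ in range(10):                      # error-free transmission
    s = rng.standard_normal(64)
    state.step(0.5 * s, s, target_ok=True)
frozen = state.gain.copy()
during = state.step(np.zeros(64), rng.standard_normal(64), target_ok=False)
```

In this toy run the target is simply half the substitution signal, so the gain settles just below 0.5 and remains untouched when the dropout block arrives.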
  • the transition between target and replacement signal can be implemented by a switch, provided that any switching artefacts remain inaudible.
  • a cross-fade between the signals is proposed as being advantageous, but this requires a buffering of the target signal, hence inducing additional latency. In particularly delay-critical real-time systems that do not allow for any additional buffering, a cross-fade is not readily possible.
  • an extrapolation of the target signal is proposed, for example by means of linear prediction.
  • the cross-fade is carried out between the extrapolated target signal and the replacement signal by using the method according to the invention.
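A minimal sketch of this idea follows, assuming a least-squares linear predictor and a linear fade; the specification prescribes neither, and all names are hypothetical. The target is extrapolated from its last intact samples, and the extrapolation is cross-faded into a stand-in for the replacement signal.

```python
import numpy as np

def lpc_extrapolate(history, order, count):
    """Extrapolate `count` samples with a least-squares linear predictor."""
    h = np.asarray(history, dtype=float)
    # Each row holds the `order` samples preceding the sample to be predicted.
    rows = np.array([h[i:i + order] for i in range(len(h) - order)])
    coeffs, *_ = np.linalg.lstsq(rows, h[order:], rcond=None)
    out = list(h[-order:])
    for _ in range(count):
        out.append(float(np.dot(coeffs, out[-order:])))
    return np.array(out[order:])

def crossfade(outgoing, incoming):
    """Linear cross-fade between two equally long segments."""
    fade_in = np.linspace(0.0, 1.0, len(outgoing))
    return (1.0 - fade_in) * outgoing + fade_in * incoming

n = np.arange(120)
tone = np.sin(0.1 * n)
extrapolated = lpc_extrapolate(tone[:100], order=2, count=20)
replacement = 0.8 * tone[100:120]        # stand-in for the filtered substitution signal
mixed = crossfade(extrapolated, replacement)
```

Because the extrapolation uses only past samples, the fade does not need any look-ahead buffering of the target signal.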
  • the replacement signal is finally generated through filtering of the substitution signal with the filter coefficients retransformed into the time domain.
  • the inverse transformation of the filter coefficients T^-1{H} should be carried out with the same method as the first transformation.
  • the filter impulse response is optionally time-limited by a windowing function w(n) (e.g. rectangular, Hanning).
  • the impulse response h(n) or h_w(n), respectively, must only be calculated once at the beginning of the dropout, since the continuous estimation of the filter coefficients is deactivated during the dropout.
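The retransformation of a real, non-negative gain into a causal linear-phase impulse response can be sketched as below. The sketch assumes that the gain is given on the bins of a real-input DFT and that an inverse real DFT is the matching inverse transform; the function name and the choice of a periodic Hann window are hypothetical.

```python
import numpy as np

def linear_phase_fir(gain, use_hann=False):
    """Turn a real gain per rfft bin into a causal linear-phase impulse response."""
    zero_phase = np.fft.irfft(gain)          # real and circularly symmetric about index 0
    n = len(zero_phase)
    h = np.roll(zero_phase, n // 2)          # centre the response: delay of n/2 samples
    if use_hann:
        h = h * np.hanning(n + 1)[:n]        # periodic Hann, symmetric about index n/2
    return h

gain = np.array([1.0, 1.0, 0.5, 0.25, 0.0])  # 5 bins -> 8-tap impulse response
h = linear_phase_fir(gain)
h_w = linear_phase_fir(gain, use_hann=True)
```

The circular shift leaves the magnitude response untouched and introduces a constant delay of half the filter length, which is the filter-induced part of the delay that the time alignment has to account for.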
  • an appropriate vector of the substitution signal x_s is necessary,
  • the filtering can be performed in the frequency domain.
  • the coefficients optionally windowed in the time domain are transformed back into the frequency domain, so that the replacement signal of a block is computed by:
  • Successive blocks are combined using methods such as overlap and add or overlap and save.
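Block-wise filtering in the frequency domain with overlap-add can be sketched as follows. The block size and function name are hypothetical; the sketch only shows the generic overlap-add technique and checks it against direct convolution.

```python
import numpy as np

def overlap_add_filter(x, h, block=64):
    """Filter x with FIR h block by block in the frequency domain (overlap-add)."""
    nfft = 1
    while nfft < block + len(h) - 1:         # room for the linear convolution of one block
        nfft *= 2
    H = np.fft.rfft(h, nfft)
    y = np.zeros(len(x) + len(h) - 1)
    for start in range(0, len(x), block):
        seg = x[start:start + block]
        out = np.fft.irfft(np.fft.rfft(seg, nfft) * H, nfft)
        stop = min(start + nfft, len(y))
        y[start:stop] += out[:stop - start]  # tails of neighbouring blocks add up
    return y

rng = np.random.default_rng(2)
x = rng.standard_normal(300)                 # stand-in for the substitution signal
h = rng.standard_normal(16)                  # stand-in for the concealment filter
y = overlap_add_filter(x, h)
```

The result matches a direct time-domain convolution up to rounding, so the choice between the two is a matter of computational cost rather than of output quality.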
  • the replacement signal is continued beyond the end of the dropout to enable a cross-fade into the re-existing target signal.
  • the time-alignment of target and replacement signal can be improved, too. Therefore, a time delay is estimated, parallel to the spectral filter coefficients, that takes two components into account. On the one hand, the delay of the replacement signal resulting from the filtering process must
  • substitution channel originates due to the spatial arrangement of the respective microphones.
  • This can be estimated, for example, by means of the generalized cross-correlation (GCC) that requires the computation of complex short-term spectra.
  • the short-term DFT employed for the estimation of the concealment filter can be exploited, too, obviating additional computational complexity.
  • the GCC is calculated using inverse Fourier transform of the estimated generalized cross-power spectral density (GXPSD), which is defined by:
  • X_z(k) and X_s(k) are the DFTs of a block of the target or substitution
  • G(k) represents a pre-filter, the aim of which is explained in the following.
  • the time delay Δτ is determined by indexing the maximum of the cross-correlation.
  • the detection of the maximum can be improved by approximating its shape to a delta function.
  • the pre-filter G(k) directly affects the shape of the GCC and thus enhances the estimation of Δτ.
  • a proper realisation denotes the phase transform filter (PHAT):
  • Φ_zs cross-power spectral density of target and substitution signal.
  • Φ_zz auto-power spectral density of the target signal
  • Φ_ss auto-power spectral density of the substitution signal.
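The delay estimation by generalized cross-correlation with the phase transform can be sketched as follows. The sketch uses a single zero-padded block instead of time-averaged spectra, and the function name, the `max_lag` parameter and the small constant guarding the division are hypothetical.

```python
import numpy as np

def gcc_phat_delay(target, subst, max_lag):
    """Estimate the lag (in samples) of `target` relative to `subst` via GCC-PHAT."""
    nfft = 1
    while nfft < len(target) + len(subst):   # zero-pad to avoid circular wrap-around
        nfft *= 2
    Xz = np.fft.rfft(target, nfft)
    Xs = np.fft.rfft(subst, nfft)
    cross = Xz * np.conj(Xs)                 # cross-power spectral density estimate
    cross /= np.abs(cross) + 1e-12           # PHAT pre-filter: keep the phase only
    gcc = np.fft.irfft(cross, nfft)
    # Reorder so that index 0 corresponds to lag -max_lag.
    lags = np.concatenate((gcc[-max_lag:], gcc[:max_lag + 1]))
    return int(np.argmax(lags)) - max_lag

rng = np.random.default_rng(4)
s = rng.standard_normal(1024)
delayed = np.concatenate((np.zeros(7), s[:-7]))   # target lags the substitution by 7 samples
delay = gcc_phat_delay(delayed, s, max_lag=20)
```

Normalising the cross spectrum to unit magnitude makes the correlation peak resemble a delta function, which is the stated purpose of the pre-filter.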
  • the transformation of the signals into the frequency domain is usually implemented by means of short-term DFT.
  • the block length must, on the one hand, be selected large enough in order to facilitate peaks in the GCC that are detectable for the expected time delays but, on the other hand, excessive block lengths lead to increased need for storage capacity.
  • time-averaging of the GXPSD or of the complex coherence function is proposed (e.g. by exponential smoothing).
  • m refers to the block index.
  • the smoothing constants are designated with ⁇ and v . These must be adapted to the jump distance of the short-term DFT and the stationarity of T 2 in order to obtain the best possible estimation of the coherence function or the generalized cross-power spectral density, respectively.
  • the entire time delay element between target and replacement signal can be formulated by
  • The individual processing steps are summarized in a block diagram in Fig. 2 for one target and one substitution signal.
  • the transition between target and replacement signal or vice-versa is depicted as a simple switch in the graphic; as has already been mentioned, a cross-fade of the signals is recommended.
  • The inventive notion of a multi-channel setup with more than two channels is depicted in Fig. 3. Depending on which channel is affected by dropouts and hence becomes the target channel, the substitution signal is generated with the remaining intact channels.
  • the discrete blocks of Fig. 3 correspond to the following processing steps:
  • a replacement signal is generated for channel 1, which is afflicted by dropouts. To achieve this, either one, several, or all of the channels 2 to 8 can be used.
  • the second row corresponds to the reconstruction of channel 2 , etc.
  • Fig. 4 shows a schematic of the basic algorithm in combination with the expansion stage (i.e. time delay estimation) to illustrate the mutual dependencies of the individual processing steps.
  • parallel signals (DFT blocks) or (spectral) mappings derived thereof are merged into one (solid) line, the number of which is indicated by K or K-1, respectively.
  • the dotted connections denote the transfer or input of parameters.
  • the first selection of the substitution channels is done in the block labeled "selector" according to the GXPSD. On the one hand, this affects the computation of the envelopes of the magnitude spectra of the substitution signal and, on the other hand, it is needed for the weighted superposition of the same.
  • the second selection criterion is offered by the time delay Δτ.
  • the status bits of the channels are not depicted explicitly, but their verification is considered in relevant signal-processing blocks. Additionally, the particular determination of the target signal can be omitted from this illustration.
  • the method for dropout concealment works as an independent module and is intended for installation into a digital signal processing chain, wherein the software-specified algorithm is implemented on a commercially available digital signal processor (DSP), preferably a special DSP for audio applications.
  • an appropriate device such as exemplarily depicted in Fig. 5, is necessary that preferably may be integrated directly into the apparatus for receiving and decoding the transmitted digital audio data.
  • the apparatus for dropout concealment is equipped with a primary audio input that adopts the digital signal frames from the receiver unit and temporarily stores them in a storage unit 25.
  • the apparatus is equipped with at least one secondary audio input, optionally several secondary audio inputs, at which the digital data of the substitution channel(s) are available and likewise stored temporarily in one, optionally several, storage unit(s) 25.
  • the device features an interface for the transmission of control data such as the status bit of the signal frames (dropout y/n) or an information bit for the selection of the substitution channel(s), the latter requiring (a) a bidirectional data line and (b) a temporary storage unit 25.
  • in order to forward the original or concealed data frames of the primary channel, the apparatus is equipped with an audio output.
  • a separate storage unit for the data blocks to be output is not necessary, since they can be stored as needed in the storage unit of the input signal.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Selective Calling Equipment (AREA)
  • Stereophonic System (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)

Abstract

The invention relates to a method for the concealment of dropouts in one or more channels (Z) of a multi-channel arrangement comprising at least two channels (Z, S), wherein in the event of a dropout in one channel (Z) a replacement signal is generated with the aid of at least one error-free channel (S), characterised in that, during the error-free signal transmission of the channels (Z, S), a mapping of the transmitted signals (xz, xs) into the frequency domain is performed, the magnitude spectra (|SZ|,|SS|) being determined, spectral filter coefficients (H) are calculated that relate the magnitude spectrum (|SZ|) of a channel (Z) to the magnitude spectrum (|SS|) of at least one other channel (S), and that in the event of the dropout of a channel (Z) the replacement signal is generated by application of filter coefficients (H), computed prior to the dropout, to a substitution signal which consists of at least one error-free channel (S).

Description

Dropout concealment for a multi-channel arrangement
The invention relates to a method for the concealment of dropouts in one or more channels of a multi-channel arrangement comprising at least two channels, wherein a replacement signal is generated in the event of a dropout in one channel with the aid of at least one error-free channel.
The wireless transmission of audio signals has constituted an important area of research since the introduction of the wireless microphone on the market at the beginning of the 1990s. At present, these products are used as standard equipment in the area of stage performances, concerts and live shows. In comparison to analog systems, the use of digital transmission links offers the advantage of transmitting metadata in addition to the audio data. This metadata can contain, for example, information about the overall concept of a stage installation. Furthermore, the notion of combining the individual channels and exploiting their interoperability in future systems can be realised by means of digital technologies. In addition, the fast development of the underlying hardware in terms of computing power and storage capacity supports the progress of software implementations.
In general, the method of the wireless transmission of signals is not resistant to influences that can crop up along the transmission link. In case of digital radio links, disturbances directly lead to the loss of data, and hence, to a total signal dropout. The degradation of the signal quality, acoustically perceptible as cracks or clicks, is unacceptable at any rate and must be compensated for using appropriate technologies that are incorporated at the receiver side. Since the concealment unit represents an active element in the signal path, the impact of its inherent processing delay must be taken into consideration.
A general classification of error concealment technologies for audio and video transmissions in real time is offered by Wah B. W., Su X., and Lin D.: "A Survey of Error Concealment Schemes for Real-Time Audio and Video Transmission over the Internet"; Proc. IEEE Int. Symposium on Multimedia Software Engineering, Dec. 2000. Here, the dependence of the source coding constitutes a fundamental distinguishing characteristic with which a distinction is made between transmitter-controlled and receiver-based technologies. The method according to the invention belongs to the category "receiver-based method", i.e. it works completely decoupled from the transmitter or source coding and is therefore not affected by the additional latency inherent to transmitter-controlled technologies.
The simplest methods for the receiver-based concealment of dropouts are represented by the so-called intra-channel concealment techniques, in which each channel of a multi-channel arrangement is treated separately. Standard concealment methods apply substitution and prediction algorithms. The latter generally comprise two stages, the analysis unit and the re-synthesis model of the linear prediction error filter. The first stage serves for estimating the filter coefficients and is executed continuously during error-free signal transmission. If a dropout occurs, the lost signal samples are reconstructed by the filtering process. This corresponds to an extrapolation and is suited to the concealment of dropouts of a few milliseconds in general broadband audio signals. In some cases, in which the real-time constraint is not as stringent (for example, the buffering of data is permissible), the extrapolation is transformed into an interpolation and longer dropouts can therefore be handled.
The expansion of one-channel systems to multi-channel systems (the so-called inter-channel concealment techniques) leads to the implementation of adaptive filters. Compared to linear prediction algorithms, the estimation of the filter coefficients is not related exclusively to the signal of the respective channel, but rather information from other parallel channels is also used for this purpose. The exploitation of the channel cross correlations is deemed to improve the performance of the concealment method. However, the efficiency of the technique is characterised primarily by the convergence behavior of adaptive filters, which mainly depends on the stationarity of the input signals. Since, in general, broadband audio is highly non-stationary, the behaviour of the adaptive filter will be quite poor. One possible implementation of this method is described in US 2005/0182996 A1 (and respective EP 1649452 A1), the entire disclosure of which is incorporated into this specification by virtue of reference.
A common feature of the abovementioned filter techniques is the processing in the time domain; some algorithms also offer an equivalent description in the frequency domain. Yet the aim of the transformation is to increase computing efficiency, whereas the characteristics of the time domain method are retained. In the following, several concealment methods are described in brief, beginning with single-channel systems:
US 2006/0171373 A1 discloses a single-channel method for the concealment of data losses that makes use of a linear prediction estimate from the intact signal component immediately preceding the dropout. The prediction coefficients obtained by means of a spectral analysis filter are used to estimate a residual signal. A maximum repeatable range is determined for the residual signal over several stages. The spectral analysis of the transmitted signal merely serves for an improved detection of the periodicity, which leads to the classic signal repetition. This period is repeated and the all-pole filter of the linear prediction is applied to it. The residual signal emerges from preceding intact signal components that are filtered inversely with the currently calculated filter coefficients, yielding the estimated replacement signal. All computation required for signal reconstruction is performed in the time domain, which is characteristic of the suggested method and results in substantial processing delay. Hence, it is unsuitable for real-time applications.
DE 19735675 C2 also discloses a single-channel concealment method. The algorithm incorporates a perceptually adapted subband decomposition based on psychoacoustic aspects. The notion of signal reconstruction is to maintain the spectral energy in each subband. If a dropout occurs, an estimation of the signal is obtained by a properly filtered noise signal. Large dropouts yield an unchanged "sound surface". The filter coefficients solely imply the energy information; thus, the preceding time samples are not incorporated.
EP 1 145 227 B1 discloses a single-channel concealment method for the transmission of coded audio signals in the context of the MPEG coding standard. Thus, the transmitted data comprise spectral coefficients rather than time samples. A perceptually adapted subband splitting is applied to the signal section preceding the dropout by combining several MDCT (modified discrete cosine transform) coefficients into one subband. Since a dropout affects certain subbands, these are transformed back into the time domain, and a narrow-band signal is predicted there. The estimated narrow-band signal is in turn MDCT-transformed and inserted into the MDCT stream transmitted in MPEG coding.
The article "Packet Loss Concealment for Audio Streaming Based on the GAPES Algorithm" by Ofir et al., AES 118th Convention, May 28-31, 2005, Barcelona, Spain, describes a single-channel method in the context of the MPEG coding standard and thus is also MDCT-based.
Since the properties of the MDCT prevent an adequate interpolation between successive MDCT blocks, an STFT (short-time Fourier transform) representation is computed directly from the MDCT representation. Interpolation results are obtained in the STFT domain; therefore signal components succeeding the dropout are required, i.e. the method induces additional latency. The interpolation itself is carried out per DFT bin (discrete Fourier transform) by use of the GAPES (gapped-data amplitude and phase estimation) algorithm. After the interpolation, the STFT data are transformed back into MDCT data.
The single-channel systems described above essentially depend on past signal components, hence, the estimation of the replacement signal is based on the assumption of long-term stationary input signals. Although those methods that incorporate a spectral analysis apply the filter in the frequency domain, both the comparison with preceding samples and the prediction of the future samples occur exclusively in the time domain.
The article "Packet Loss Concealment for Multichannel Audio Using the Multiband Source/Filter Model" by Karadimou et al., 40th Annual Asilomar Conf. on Signals, Systems and Computers, Oct. 29 - Nov. 01, 2006, discloses a concealment method that relies on several channels. The transmission format is composed in such a manner that an actual audio channel is only transmitted in one single, so-called "source channel," whereas LSF (line spectral frequencies) vectors are transmitted in the remaining channels. The LSF vectors represent a (complex valued) spectral interpretation of a time signal and correspond exactly to the linear prediction coefficients. Thus, they contain all of the information on the phase relationships of the spectral envelope. In this method, dropout concealment is constrained to an error-prone "source channel". Dropouts can therefore only be handled in the LSF channels. The estimation of the LSF vectors is done by means of a Gaussian mixture model (GMM). In addition, the method incorporates subband decomposition, per band and channel prediction, and retransformation into linear prediction coefficients with appropriate filtering of the reference residuum. During computation of the replacement signal, i.e. of the LSF vectors, the entire signal information including the phase information is always transmitted. The different LSF vectors of the individual channels contain information about the characteristics of different microphones that are spaced apart from each other, and which simultaneously pick up a sound event, for example a concert. Hence, correlations between the individual LSF vectors are to be expected, and a so-called cross-channel estimation can be employed, i.e. if a dropout occurs in one LSF vector, parallel LSF vectors can be exploited.
For the substitution, a reference channel is established in advance and its LP residuum serves for the signal synthesis of all other channels (not only in the event of dropout but rather during normal operation, too). The fundamental assumption made is that there is a correlation between target and reference channel. However, this assumption is never verified and is definitely not true for many scenarios. All processing steps (subband filtering, LP analysis, LSF computation, synthesis filter) of the concealment procedure are implemented in the signal path, resulting in a considerable processing delay that has to be accepted, so that low latency cannot be achieved. Due to the subband technique, the computational complexity is high (the prediction is performed per subband and channel, and the all-pole filter is implemented in each subband during resynthesis, too).
Another publication that deals with multi-channel concealment is "Loss Concealment for Multi-Channel Streaming Audio" by Sinha et al., NOSSDAV03, June 1-3, 2003, Monterey, California, USA. The particular application of "distributed immersive musical performance" describes the implementation of a collaborative concert of spatially separated musicians by data transfer over the internet. A possibility for signal substitution is suggested therein that is based on the spatial proximity of the loudspeaker positions to each other in the multi-channel setup. In this method, a special type of interleaved packetized transmission is essential for the concealment.
The prior art for multi-channel systems is currently limited to different implementations of adaptive filters in the time domain or to transmitter-side channel interleaving with simple substitution rules as are typical in the upmix/downmix matrixing strategy suggested by Gerzon (M. Gerzon: "Hierarchical System of Surround Sound Transmission for HDTV," AES preprint 3339, 92nd Convention, March 24-27, 1992, Vienna; and M. Gerzon: "Problems of Upward and Downward Compatibility in Multichannel Stereo Systems," AES preprint 3404, 93rd Convention, Oct. 1-4, 1992, San Francisco). The efficiency of such technologies is either mostly restricted to their area of application (for example, pre-mixed multi-channel recordings) or is characterised predominantly by the convergence behavior of the adaptive filters, and thus is highly variable due to the non-stationary input signals in connection with the dropouts of the target signal.
The aim of the present invention consists in providing a concealment method that uses the intact channels of a multi-channel system to replace the lost signal in such a way that the difference between the original signal and its replacement is rendered inaudible. In addition to the reliability of the transmission, the usability in delay-critical real-time systems constitutes an important criterion, for which reason ultra-low latency techniques are in demand for the processing of signals.
According to the invention, this objective is achieved with a method mentioned at the outset, in that during the error-free signal transmission of the channels a mapping of the transmitted signals into the frequency domain takes place, the absolute value of the frequency spectrum being determined, that spectral filter coefficients are calculated that relate the magnitude spectrum of a channel to the magnitude spectrum of at least one other channel, and that in the event of the dropout of one channel the replacement signal is generated by applying filter coefficients, computed prior to the dropout, to a substitution signal which consists of at least one error-free channel.
The concealment filter is calculated using the magnitude spectra, thus, without regard to the phase information, providing a more stable filter, and an improved quality of the replacement signal, respectively. A significant advantage compared to single-channel methods currently in use also lies in the utilisation of the interoperability between the individual signals.
As an extension of the basic method, a modified treatment of the phase information is proposed. In so doing, the constancy of the phase transition at the beginning and at the end of the dropout is improved by taking into account the average time delay between target and replacement signal. A time delay between the respective channels, independent of their source direction, emerges according to the spatial arrangement of the multi-channel recording system.
In the following, the invention is described in more detail on the basis of the drawings.
Fig. 1 shows a schematic representation of the transmission chain according to the invention, and
Fig. 2 shows a detailed block diagram of the dropout concealment of the invention for a two-channel system, and
Fig. 3 shows a block diagram of a multi-channel arrangement of, for example, eight channels, and
Fig. 4 shows a flowchart of the entire invention, consisting of the estimation of the spectral filter, the determination of the time delay between the channels, as well as the weighted superposition of all channels in order to generate the substitution signal, and
Fig. 5 shows the layout of the device according to the invention for dropout concealment that is to be integrated into each channel of the multi-channel arrangement.
The preferred area of application of the present invention is within the overall system of a multi-channel (optionally wireless) transmission of digital audio data. The entire structure of a transmission chain is depicted in Fig. 1 and typically comprises the following stages for one channel: Signal source 1, e.g. a sensor for recording signals (microphone), analog-digital converter 2 (ADC), optional signal compression and coding on the transmitter side, transmitter 3, transmission channel, receiver 4, concealment module 5. At the output of the concealment module 5, the audio signal is available in digital form — further signal processing units can be connected directly, for example a pre-amp, equalizer, etc.
The proposed concealment method is independent of the transmitter/receiver unit as well as the source coding and acts solely on the receiver side (receiver-based technique). It can therefore be integrated flexibly as an independent module into any transmission path. In some transmission systems (e.g. digital audio streaming), different concealment strategies are implemented simultaneously. While the application shown in Fig. 1 does not provide for any further concealment units, a combination with alternative technologies is possible.
The following application scenarios are provided exemplarily:
a) In concert events and stage installations, multi-channel arrangements range from stereo recordings to different variations of surround recordings (e.g. OCT Surround, Decca Tree, Hamasaki Square, etc.) potentially supported by different forms of spot microphones. Especially with main microphone setups, the signals of the individual channels are comprised of similar components whose particular composition is often quite non-stationary. For example, a dropout in one main microphone channel can be concealed according to the present invention introducing little or no latency.
b) Multi-channel audio transmission in studios proceeds at different physical layers (e.g. optical fiber waveguides, AES-EBU, CAT5), and dropouts can occur for various reasons, for example due to loss of synchronization, which must be prevented or concealed especially in critical applications such as, for example, in the transmission operations of a radio station. Here, too, the concealment method according to the invention can be used as a safety unit with a low processing latency.
c) While audio transmission in the internet is less delay-sensitive than the abovementioned areas, transmission errors occur more frequently, resulting in an increased degradation of the perceptual audio quality. The inventive concealment method offers an improvement of the quality of service.
d) The method according to the invention can also be used in the framework of a spatially distributed, immersive musical performance, i.e. in the implementation of a collaborative concert of musicians that are separated spatially from each other. In this case, the ultra-low latency processing strategy of the proposed algorithm benefits the system's overall delay.
The invention is not restricted to the following embodiment. It is merely intended to explain the inventive principle and to illustrate one possible implementation. In the following, the dropout concealment method is described for one channel afflicted with dropouts. If transmission errors occur in more than one channel of the multi-channel arrangement, the system can easily be expanded.
The following terminology is used in the description: The channel afflicted with dropouts is defined as target channel or signal. The replica (estimation) of this signal that is to be generated during dropout periods is referred to as replacement signal. At least one substitution channel is required for the computation of the replacement signal.
The proposed algorithm is composed of two parts. Computations of the first part are carried out permanently, whereas the second part is only activated in the case of a dropout in the target channel. During error-free transmission, the coefficients of a linear-phase FIR (finite impulse response) filter of length $L_{Filter}$ are permanently being estimated in the frequency domain. The required information is provided by the optionally non-linearly distorted and optionally time-averaged short-term magnitude spectra of the target and substitution channel. This new type of filter computation disregards any phase information and thus differs fundamentally from the correlation-dependent adaptive filters.
Selection of the substitution channel or substitution channels
Fig. 2 shows a block diagram of the multi-channel dropout concealment method for a target signal $x_Z$ and a substitution signal $x_S$. The individual steps of the method are each indicated by a box containing a reference symbol and denoted in the subsequent table:
6 Transformation into a spectral representation
7 Determination of the envelope of the magnitude spectra
8 Non-linear distortion (optional)
9 Time-averaging (optional)
10 Calculation of the filter coefficients
11 Time-averaging of the filter coefficients (optional)
12 Transformation into the time domain with windowing
13 Transformation into the frequency domain (optional)
14 Filtering of the substitution signal in the time or frequency domain, respectively
15 Estimation of the complex coherence function or GXPSD
16 Time-averaging (optional)
17 Estimation of the GCC and maximum detection in the time domain
18 Determination of the time delay Δτ
19 Implementation of the time delay Δτ (optional)
In this example, the transition between target and replacement signal is indicated by a switch 20. A detailed explanation of the individual steps of the method is given in the following description.
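The two-part structure described above (permanent estimation during error-free transmission, replacement only during a dropout, selected by switch 20) can be sketched as follows. This is a minimal illustration, not the method of the invention: the function names are hypothetical, and the spectral filter is replaced by a single broadband gain per frame purely to keep the control flow visible.

```python
def estimate_gain(target_frame, substitute_frame):
    """Toy stand-in for the filter estimation: one broadband gain per frame (assumption)."""
    reference = sum(abs(sample) for sample in substitute_frame)
    if reference == 0.0:
        return 1.0
    return sum(abs(sample) for sample in target_frame) / reference


def apply_gain(gain, substitute_frame):
    """Toy stand-in for the filtering step: scale the substitution frame."""
    return [gain * sample for sample in substitute_frame]


def conceal(target_frames, substitute_frames, dropout_flags):
    """Return the output frames of the concealment module (role of switch 20)."""
    output = []
    held_gain = None  # last estimate obtained before a dropout
    for target, substitute, dropped in zip(target_frames, substitute_frames, dropout_flags):
        if not dropped:
            # Part 1: runs permanently while the target channel is error-free.
            held_gain = estimate_gain(target, substitute)
            output.append(list(target))
        else:
            # Part 2: only active during a dropout; the held estimate is reused.
            # Assumption: unity gain if no estimate exists yet.
            gain = 1.0 if held_gain is None else held_gain
            output.append(apply_gain(gain, substitute))
    return output
```

With target frames `[[2.0, 4.0], [0.0, 0.0], [6.0, 2.0]]`, substitution frames `[[1.0, 2.0], [3.0, 1.0], [3.0, 1.0]]` and status flags `[False, True, False]`, the second output frame is the substitution frame scaled by the gain estimated from the first frame.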
The correct selection of a substitution channel depends on the similarity between the substitution and target signal. This correlation can be determined by estimating the cross-correlation or coherence. (See explanations on coherence and on generalized cross-power spectral density (GXPSD) at the end of the specification.) According to the invention, the GXPSD is proposed as a potential selection strategy. The complex coherence function $\Gamma_{ZS,j}(k)$ is used as a particular example in embodiments 1. to 9. (A total of $K$ channels are observed, the channel $x_0(n)$ being designated as the target channel $x_Z(n)$.):
1. For the target channel $x_Z(n)$, the $J$-th channel is defined as the substitution signal, $x_S(n) = x_J(n)$, by the optionally time-averaged coherence function $\Gamma_{ZS,j}(k)$ between the channels $x_j(n)$, with $1 \le j \le K-1$, and the target channel; it is the channel whose frequency-averaged value of the complex coherence function, $\chi(j) = \frac{1}{N}\sum_{k} \Gamma_{ZS,j}(k)$ (with $N$ the number of frequency bins averaged over), has a maximum value according to: $J = \arg\max_{j} \chi(j)$.
2. Alternatively, a fixed allocation can be established between the channels in advance if the user (e.g. a sound engineer) knows the characteristics of the individual channels (according to the selected recording method) and hence their joint signal information.
3. Likewise, several channels can be summed to one substitution channel, optionally in a weighted manner. This weighted combination can be set up by the user a priori.
4. In an alternative realisation, the superposition of several channels to one substitution channel is carried out on the basis of broadband coherence ratios to the target channel by:
$x_S(n) = \frac{\sum_{j} \chi(j)\, x_j(n - \Delta\tau_j)}{\sum_{j} \chi(j)}$, for all $do(j) = \text{false}$.
Herein, $x_S(n)$ denotes the substitution channel composed of the channels $x_j(n - \Delta\tau_j)$, and $\chi(j)$ represents the frequency-averaged coherence function between the target channel $x_Z(n)$ and the corresponding channel $x_j(n - \Delta\tau_j)$. The time delay between the selected channel pairs is considered by $\Delta\tau_j$ (cf. section "Estimation of the time delay between target and substitution channel"). The validity of the potential signals is verified incorporating the status bit $do(j)$.
5. A simplification of 4. is proposed that considers a pre-selected set of channels $J$ rather than all available channels $j$. The weighted sum is built using $\chi(j)$, $j \in J$. The pre-selection is intended to yield channels whose frequency-averaged coherence function exceeds a prescribed threshold $\Theta$:
$J = \{\, j \mid (1 \le j \le K-1) \wedge (\chi(j) > \Theta) \,\}$.
6. Furthermore, a maximum number of $M$ channels (with preferably $M = 2 \ldots 5$) can be established as a criterion, according to:
$J = \{\, j_i \mid (1 \le j_i \ldots$ [remainder of the equation: image imgf000012_0001]
7. A joint implementation of both constraints 5. and 6. is also possible:
[equation: image imgf000012_0002]
8. Alternatively, the selection can be carried out separately for different frequency bands, i.e. in each band the "optimal" substitution channel is determined on the basis of the coherence function, the respective band pass signals are filtered using the method according to the invention, optionally in a time-delayed manner (cf. "Estimation of the time delay between target and substitution channel"), superposed and used as a replacement signal. In so doing, the same criteria apply as in 1., 4., 5., [text and formula: image imgf000012_0003] instead of the frequency-averaged function $\chi(j)$.
9. Several substitution channels can also be selected. In this case, the processing is carried out separately for each channel, i.e. several replacement signals are generated. These are weighted according to their coherence function, combined and inserted into the dropout.
Generally, the functions used in 1. to 9. are time-varying, thus a mathematically exact notation must consider the time dependency by a (block) index m . To simplify the formulations, m has been omitted.
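The selection rules in 1. and 4. to 7. can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names are hypothetical, the frequency average is taken over the magnitude of the complex coherence so that the result is a real number, the weighted sum is normalized by the sum of the weights, and the time delays Δτ are left out.

```python
def frequency_averaged_coherence(coherence_bins):
    """Average the magnitude of the complex coherence over all frequency bins (assumption)."""
    return sum(abs(value) for value in coherence_bins) / len(coherence_bins)


def select_channels(chi, dropout, threshold=0.0, max_channels=None):
    """Pick valid channels whose averaged coherence exceeds the threshold.

    chi maps channel index j to its frequency-averaged coherence, dropout maps j
    to the status bit of that channel (True = dropout). The result is sorted by
    decreasing coherence and optionally cut to the max_channels best channels.
    """
    candidates = [j for j in chi if not dropout[j] and chi[j] > threshold]
    candidates.sort(key=lambda j: chi[j], reverse=True)
    if max_channels is not None:
        candidates = candidates[:max_channels]
    return candidates


def superpose(channels, chi, selected):
    """Coherence-weighted sum of the selected channels (time delays omitted)."""
    if not selected:
        raise ValueError("no valid substitution channel available")
    total = sum(chi[j] for j in selected)
    length = len(channels[selected[0]])
    return [sum(chi[j] * channels[j][n] for j in selected) / total for n in range(length)]
```

Calling `select_channels(chi, dropout, max_channels=1)` reduces to the single-channel rule in 1., since only the channel with the largest averaged coherence remains; a non-zero `threshold` corresponds to the pre-selection in 5., and both arguments together to the joint constraint in 7.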
Calculation during error-free transmission
The computation during error-free transmission is performed in the frequency domain; thus, in a first step an appropriate short-term transformation is necessary, resulting in a block-oriented algorithm that requires a buffering of target and substitution signal. Preferably, the block size should be aligned to the coding format. The estimation of the envelopes of the magnitude spectra of target and substitution signal is used to determine the magnitude response of the concealment filter. The exact narrow-band magnitude spectra of the two signals are not relevant; rather, broad-band approximations are sufficient, optionally time-averaged and/or non-linearly distorted by a logarithmic or power function. The estimation of the spectral envelopes can be implemented in various ways. The computationally most efficient possibility is the short-term DFT with short block length, i.e. the spectral resolution is low. A signal block is multiplied by a window function (e.g. Hanning), subjected to the DFT, the magnitude of the short-term DFT is optionally distorted non-linearly and subsequently time-averaged.
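The short-term DFT variant described above can be sketched with the standard library alone. The function names and the choice of a periodic Hann window are assumptions made for illustration; a real implementation would use an FFT instead of the direct sum, and the block length would follow the coding format.

```python
import cmath
import math


def hann_window(length):
    """Periodic Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length) for n in range(length)]


def magnitude_spectrum(block):
    """Window one signal block and return the DFT magnitudes for bins 0 .. N/2."""
    size = len(block)
    windowed = [w * x for w, x in zip(hann_window(size), block)]
    spectrum = []
    for k in range(size // 2 + 1):
        # Direct DFT sum; O(N^2), acceptable only for the short blocks assumed here.
        value = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / size) for t in range(size))
        spectrum.append(abs(value))
    return spectrum
```

For a short block the result already acts as a coarse envelope, because each bin covers a wide frequency range; a cosine centred on bin 2 of a 16-sample block, for example, is spread over bins 1 to 3 by the window.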
Further implementations:
o Wavelet transformation (as is described in Daubechies I.; "Ten Lectures on Wavelets"; Society for Industrial and Applied Mathematics; Capital City Press, ISBN 0-89871-274-2, 1992. The entire disclosure of this printed publication is incorporated into this specification by virtue of reference) with optional subsequent time-averaging of the optionally non-linear distortion of the absolute values of the wavelet transformation.
o Gammatone filter bank (as described in Irino T., Patterson R.D.; "A compressive gammachirp auditory filter for both physiological and psychophysical data"; J. Acoust. Soc. Am., Vol. 109, pp. 2008-2022, 2001. The entire disclosure of this printed publication is incorporated into this specification by virtue of reference) with subsequent formation of the signal envelopes of the individual subbands, optionally followed by a non-linear distortion.
o Linear prediction (as described in Haykin S.; "Adaptive Filter Theory"; Prentice Hall Inc.; Englewood Cliffs; ISBN 0-13-048434-2, 2002. The entire disclosure of this printed publication is incorporated into this specification by virtue of reference) with subsequent sampling of the magnitude of the spectral envelopes of the signal block, represented by the synthesis filter, optionally followed by a non-linear distortion and, subsequent to this, time-averaging.
o Estimation of the real cepstrum (as described in Deller J.R., Hansen J.H.L., Proakis J.G.; "Discrete-Time Processing of Speech Signals"; IEEE Press; ISBN 0-7803-5386-2, 2000. The entire disclosure of this printed publication is incorporated into this specification by virtue of reference) followed by a retransformation of the cepstrum domain into the frequency domain and taking the antilogarithm, optionally followed by a non-linear distortion of the so obtained envelopes of the magnitude spectra and, subsequent to this, time-averaging.
o Short-term DFT with maximum detection and interpolation: Here, the maxima are detected in the magnitude spectrum of the short-term DFT and the envelope between neighboring maxima is calculated by means of linear or non-linear interpolation, optionally followed by a non-linear distortion of the so obtained envelopes of the magnitude spectra and, subsequent to this, time-averaging.
For the optionally used time-averaging of the envelopes, an exponential smoothing of the optionally non-linearly distorted magnitude spectra can be used, as represented in equations (1) with time constant α for the exponential smoothing. Alternatively, the time-averaging can be formed by a moving average filter. The non-linear distortion can, for example, be carried out by means of a power function with arbitrary exponents which, in addition, can be selected differently for the target and substitution channel, as depicted in equations (1) by the exponents γ and δ. (Alternatively, a logarithmic function can also be used.) The non-linear distortion offers the advantage of weighting time periods with high or low signal energy differently along the time-varying progression of each frequency component. The different weighting affects the results of time-averaging within the respective frequency component. Accordingly, exponents γ and δ greater than 1 denote an expansion, i.e. peaks along the signal progression dominate the result of the time-averaging, whereas exponents less than 1 signify a compression, i.e. enhance periods with low signal energy. The optimal selection of the exponent values depends on the sound material to be expected.
$$\left|\bar{S}_z(m)\right| = \sqrt[\gamma]{\alpha\,\left|S_z(m)\right|^{\gamma} + (1-\alpha)\,\left|\bar{S}_z(m-1)\right|^{\gamma}} \tag{1a}$$

$$\left|\bar{S}_s(m)\right| = \sqrt[\delta]{\alpha\,\left|S_s(m)\right|^{\delta} + (1-\alpha)\,\left|\bar{S}_s(m-1)\right|^{\delta}} \tag{1b}$$

where
$|S_z|$, $|S_s|$ : envelopes of the magnitude spectra of target and substitution channel,
$|\bar{S}_z|$, $|\bar{S}_s|$ : time-averaged versions of $|S_z|$ and $|S_s|$,
α : time constant of the exponential smoothing, 0 < α < 1,
γ, δ : exponents of the non-linear distortion of $|S_z|$ and $|S_s|$, with a preferable value range of 0.5 < γ, δ ≤ 2,
m : block index.

As an example, equations (1) constitute a special case for the calculation of the spectral envelopes of target and substitution channel with exponential smoothing and arbitrary distortion exponents. In the following, the exponents are set to γ = δ = 1 to simplify the formulations (i.e. a non-linear distortion is not explicitly indicated). However, the invention comprises the method with any time-averaging method and any non-linear distortion of the envelopes of the magnitude spectra, and hence any values for the exponents γ and δ. Furthermore, the use of the logarithm and the exponential function is included as well. To simplify notation, the block index m is omitted, though all magnitude values such as $|S_z|$, $|S_s|$ or H are considered to be time-variant and therefore a function of the block index m.
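The effect of the distortion exponent on the time-averaging can be sketched for a single smoothing step. The sketch assumes the power-function form of equations (1), with α weighting the current block; the function name `smooth_envelope` and the default values are illustrative assumptions.

```python
from typing import List


def smooth_envelope(prev_avg: List[float], current: List[float],
                    alpha: float = 0.2, gamma: float = 1.0) -> List[float]:
    """One block of exponential smoothing with power-law distortion per bin.

    Assumed form: avg(m) = (alpha*|S(m)|^gamma + (1-alpha)*avg(m-1)^gamma)^(1/gamma).
    """
    return [(alpha * c ** gamma + (1.0 - alpha) * p ** gamma) ** (1.0 / gamma)
            for p, c in zip(prev_avg, current)]


previous = [1.0]   # time-averaged envelope of the previous block (one bin)
current = [3.0]    # envelope of the current block, a short energy peak
linear = smooth_envelope(previous, current, alpha=0.5, gamma=1.0)[0]
expanded = smooth_envelope(previous, current, alpha=0.5, gamma=2.0)[0]
compressed = smooth_envelope(previous, current, alpha=0.5, gamma=0.5)[0]
```

With the same input, an exponent above 1 pulls the average towards the peak, while an exponent below 1 keeps it closer to the low-energy value, which matches the expansion and compression behaviour described above.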
Calculation of the concealment filter
In standard adaptive systems, concealment filters are calculated by minimizing the mean square error between the target signal and its estimate. The difference signal is given by $e(n) = x_z(n) - \hat{x}_z(n)$. In contrast, the present invention examines the error of the estimated magnitude spectra:

$$E(k) = \left|\bar{S}_z(k)\right| - \left|\hat{\bar{S}}_z(k)\right| = \left|\bar{S}_z(k)\right| - H(k)\,\left|\bar{S}_s(k)\right| \tag{2}$$

E(k) corresponds to the difference between the envelope of the magnitude spectrum of the optionally non-linearly distorted, optionally smoothed target signal and its estimate. The optimization problem is considered separately for each frequency component k. The simplest realisation of the spectral filter H(k) would be determined by the two envelopes, with
$$H(k) = \frac{\left|\bar{S}_z(k)\right|}{\left|\bar{S}_s(k)\right|} \tag{3}$$
Alternatively, a constraint on H(k) is suggested through the introduction of a regularization parameter. The underlying intention is to prevent the filter amplification from rising disproportionately if the signal power of $|\bar{S}_s|$ is too weak, in which case background noise becomes audible or the system becomes perceptibly unstable. If, for example, the spectral peaks of one time block of $|\bar{S}_z|$ and $|\bar{S}_s|$ are not located in exactly the same frequency band, H(k) will rise excessively in those bands in which $|\bar{S}_z|$ has a maximum and $|\bar{S}_s|$ has a minimum. To avoid this problem, a constraint for H(k) is established through the frequency-dependent regularisation parameter β(k), yielding
$$H(k) = \frac{\left|\bar{S}_z(k)\right|\,\left|\bar{S}_s(k)\right|}{\left|\bar{S}_s(k)\right|^{2} + \beta(k)} \tag{4}$$
Through positive real-valued β(k), the filter amplification will not increase immoderately, even with a small value for $|\bar{S}_s|$, which prevents undesired signal peaks. The optimal values for β(k) depend on the signal statistics to be expected; according to the invention, a computation based on an estimate of the background noise power per frequency band is proposed. The background noise power $P_g(k)$ can be estimated using time-averaged minimum statistics. The regularisation parameter β(k) is proportional to the rms value of the background noise power $P_g(k)$, with a proportionality factor c typically between 1 and 5.
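The benefit of the regularisation can be seen in a small numeric sketch. It assumes a regularised quotient of the form H(k) = |S_z(k)|·|S_s(k)| / (|S_s(k)|² + β(k)), which reduces to the plain quotient of the two envelopes for β(k) = 0; the function name `concealment_filter` and the sample values are illustrative assumptions.

```python
from typing import List


def concealment_filter(env_z: List[float], env_s: List[float],
                       beta: List[float]) -> List[float]:
    """Per-bin filter gain with regularisation (assumed form, see lead-in).

    Requires env_s[k]**2 + beta[k] > 0 for every bin k.
    """
    return [z * s / (s * s + b) for z, s, b in zip(env_z, env_s, beta)]


# Bin 0: both channels carry signal. Bin 1: the substitution channel is nearly silent.
env_z = [4.0, 1.0]
env_s = [2.0, 1e-6]

plain = concealment_filter(env_z, env_s, beta=[0.0, 0.0])
regularised = concealment_filter(env_z, env_s, beta=[0.01, 0.01])
```

Without regularisation, the gain in the second bin grows to about one million, whereas a small positive β(k) keeps it far below unity and leaves the well-conditioned first bin almost unchanged.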
An alternative implementation of H is proposed specifically for quasi-stationary input signals. The envelopes of the magnitude spectra are first estimated without time-averaging and optionally non-linear distortion. Both modifications are considered during the determination of the filter coefficients, according to:
Figure imgf000016_0002
In equation (5), both the block index m and the frequency index k are indicated, since the computation simultaneously depends on both indices in this case. The parameters α and γ determine the behavior of the time-averaging or the non-linear distortion.

Calculations in the event of dropouts in a target signal
The possibilities for detecting a dropout are numerous and known very well in the prior art. For example, a status bit can be transmitted at a reserved position within the respective audio stream (e.g. between audio data frames), and continuously registered at the receiver side. It would also be conceivable to perform an energy analysis of the individual frames and to identify a dropout if it falls below a certain threshold. A dropout could also be detected through synchronization between transmitter and receiver.
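Of the detection options listed, the energy analysis is the simplest to sketch. The following assumes that a frame is a list of float samples and that a dropout shows up as a frame with near-zero mean energy; the function name `is_dropout` and the threshold value are hypothetical and would need tuning for real material.

```python
from typing import List


def is_dropout(frame: List[float], threshold: float = 1e-6) -> bool:
    """Flag a frame as a dropout if its mean energy falls below the threshold."""
    if not frame:
        raise ValueError("frame must contain at least one sample")
    energy = sum(x * x for x in frame) / len(frame)
    return energy < threshold


lost_frame = [0.0] * 64
audio_frame = [0.5, -0.5] * 32
```

A pure energy criterion cannot distinguish a dropout from genuine silence in the programme material, which is one reason why a transmitted status bit may be preferable where it is available.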
If a dropout is detected in the target signal (e.g. as represented in Fig. 2 by a status bit "dropout y/n"; the dotted line denotes the status bit that is actually transmitted contiguously with the audio signal), the replacement signal must be generated using the most recently estimated filter coefficients and the substitution channel(s), and is fed directly to the output of the concealment unit. During a dropout, the estimation of the filter coefficients is deactivated. In principle, the transition between target and replacement signal can be implemented by a switch, provided that any switching artefacts remain inaudible. According to the invention, a cross-fade between the signals is proposed as being advantageous, but this requires a buffering of the target signal and hence induces additional latency. In particularly delay-critical real-time systems that do not allow for any additional buffering, a cross-fade is not readily possible. In this case, an extrapolation of the target signal is proposed, for example by means of linear prediction. The cross-fade is then carried out between the extrapolated target signal and the replacement signal by using the method according to the invention.
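A cross-fade between the (buffered or extrapolated) target signal and the replacement signal can be sketched as follows. The specification does not fix a fade shape, so the linear ramp and the function name `cross_fade` are assumptions made for illustration.

```python
from typing import List


def cross_fade(fade_out: List[float], fade_in: List[float]) -> List[float]:
    """Linear cross-fade from one signal segment to another of equal length."""
    n = len(fade_out)
    if n != len(fade_in):
        raise ValueError("both segments must have the same length")
    out = []
    for i in range(n):
        g = (i + 1) / (n + 1)  # ramp strictly between 0 and 1 (assumed shape)
        out.append((1.0 - g) * fade_out[i] + g * fade_in[i])
    return out


target_tail = [1.0, 1.0, 1.0]       # last samples of the target signal
replacement_head = [0.0, 0.0, 0.0]  # first samples of the replacement signal
blended = cross_fade(target_tail, replacement_head)
```

The same function can be used at the end of the dropout by swapping the arguments, so that the replacement signal fades out while the restored target signal fades in.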
The replacement signal is finally generated through filtering of the substitution signal with the filter coefficients retransformed into the time domain. The inverse transformation of the filter coefficients, $T^{-1}\{H\}$, should be carried out with the same method as the first transformation.
Prior to the filtering, the filter impulse response is optionally time-limited by a windowing function w(n) (e.g. rectangular, Hanning).
$$h(n) = T^{-1}\{H(k)\} \quad\text{or}\quad h_w(n) = w(n)\,T^{-1}\{H(k)\} \tag{6}$$
The impulse response h(n) or $h_w(n)$, respectively, needs to be calculated only once at the beginning of the dropout, since the continuous estimation of the filter coefficients is deactivated during the dropout. For the sample-wise determination of the replacement signal $\hat{x}_z$, an appropriate vector of the substitution signal $\mathbf{x}_s$ is necessary,

$$\hat{x}_z(n) = \mathbf{h}^{T}\,\mathbf{x}_s(n) \quad\text{or}\quad \hat{x}_z(n) = \mathbf{h}_w^{T}\,\mathbf{x}_s(n) \tag{7}$$
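The steps from the spectral filter coefficients to the time-domain replacement signal can be sketched as follows. The sketch assumes that the transformation T is a DFT, that H(k) is real and symmetric (a pure magnitude response), and that the zero-phase impulse response is rotated by half the block length to make it causal before a Hann window is applied; the rotation and the function names are illustrative assumptions.

```python
import cmath
import math
from typing import List


def windowed_impulse_response(H: List[float]) -> List[float]:
    """Inverse DFT of a real, symmetric H(k), made causal and Hann-windowed."""
    N = len(H)
    # Inverse DFT; for real, symmetric H(k) the result is real and zero-phase.
    h = [sum(H[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)).real / N
         for n in range(N)]
    # Assumption: rotate by N // 2 so that the response is centred and causal.
    shift = N // 2
    h = h[-shift:] + h[:-shift] if shift else h
    # Periodic Hann window, which peaks at n = N / 2 for even N.
    w = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / N) for n in range(N)]
    return [wn * hn for wn, hn in zip(w, h)]


def fir_filter(h: List[float], x: List[float]) -> List[float]:
    """y(n) = sum_i h(i) * x(n - i), taking x(n) = 0 for n < 0."""
    return [sum(h[i] * x[n - i] for i in range(min(len(h), n + 1)))
            for n in range(len(x))]


h_w = windowed_impulse_response([1.0] * 8)            # all-pass filter, 8 bins
substitution = [float(v) for v in range(1, 11)]       # toy substitution signal
replacement = fir_filter(h_w, substitution)
```

For the all-pass example, the windowed impulse response is a single unit pulse at index 4, so the replacement signal equals the substitution signal delayed by four samples. A delay of this kind, caused by the filtering itself, is what a subsequent time alignment has to account for.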
In some applications, the filtering can be performed in the frequency domain. Thus, the coefficients optionally windowed in the time domain are transformed back into the frequency domain, so that the replacement signal of a block is computed by:
$$\hat{x}_z = T^{-1}\{H_w(k)\,X_s(k)\} \tag{8}$$
Successive blocks are combined using methods such as overlap and add or overlap and save. The replacement signal is continued beyond the end of the dropout to enable a cross-fade into the re-existing target signal.
Estimation of the time delay between target and substitution signal
In a particularly preferred embodiment of the present concealment method, the time alignment of target and replacement signal can be improved as well. To this end, a time delay is estimated, in parallel with the spectral filter coefficients, that takes two components into account. On the one hand, the delay $\tau_1$ of the replacement signal resulting from the filtering process must be compensated for. On the other hand, a time delay $\tau_2$ between target and substitution channel arises from the spatial arrangement of the respective microphones. This can be estimated, for example, by means of the generalized cross-correlation (GCC), which requires the computation of complex short-term spectra. In a preferred implementation, the short-term DFT employed for the estimation of the concealment filter can be reused, avoiding additional computational complexity. (For more information about the characteristics of the GCC, see especially Carter, G. C.: "Coherence and Time Delay Estimation"; Proc. IEEE, Vol. 75, No. 2, Feb. 1987; and Omologo M., Svaizer P.: "Use of the Crosspower-Spectrum Phase in Acoustic Event Location"; IEEE Trans. on Speech and Audio Processing, Vol. 5, No. 3, May 1997. The entire disclosures are incorporated into this specification by virtue of reference.) The GCC is calculated using the inverse Fourier transform of the estimated generalized cross-power spectral density (GXPSD), which is defined by:
$$\Phi_{G,zs}(k) = G(k)\,X_z(k)\,X_s^{*}(k) \tag{9}$$

(Again, in equations (9) to (12), the block index m is omitted.)
In equation (9), $X_z(k)$ and $X_s(k)$ are the DFTs of a block of the target and substitution channel, respectively; * denotes complex conjugation. G(k) represents a pre-filter, the aim of which is explained in the following.
The time delay $\tau_2$ is determined by locating the index of the maximum of the cross-correlation. The detection of the maximum can be improved by approximating its shape to a delta function. The pre-filter G(k) directly affects the shape of the GCC and thus enhances the estimation of $\tau_2$. A suitable realisation is the phase transform (PHAT) filter:

$$G_{PHAT}(k) = \frac{1}{\left|X_z(k)\,X_s^{*}(k)\right|} = \frac{1}{\left|\Phi_{zs}(k)\right|} \tag{10}$$

This results in the GXPSD with PHAT filter:

$$\Phi_{G,zs}(k) = \frac{X_z(k)\,X_s^{*}(k)}{\left|X_z(k)\,X_s^{*}(k)\right|} = \frac{\Phi_{zs}(k)}{\left|\Phi_{zs}(k)\right|} \tag{11}$$

where $\Phi_{zs}$ : cross-power spectral density of target and substitution signal.
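The delay estimation with the PHAT pre-filter can be sketched on a single block. The sketch uses a naive DFT from the standard library for self-containment, works with circular shifts, and adds a small constant `eps` to avoid division by zero; these choices and the function names are illustrative assumptions, and a real implementation would use an FFT and the time-averaged spectra.

```python
import cmath
import math
from typing import List


def dft(x: List[float]) -> List[complex]:
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]


def idft(X: List[complex]) -> List[complex]:
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]


def gcc_phat_delay(x_z: List[float], x_s: List[float], eps: float = 1e-12) -> int:
    """Lag d in samples such that x_z(n) is approximately x_s(n - d) (circular)."""
    N = len(x_z)
    X_z, X_s = dft(x_z), dft(x_s)
    cross = [a * b.conjugate() for a, b in zip(X_z, X_s)]  # cross-power spectrum
    phat = [c / (abs(c) + eps) for c in cross]             # keep the phase only
    gcc = [v.real for v in idft(phat)]                     # generalized cross-correlation
    peak = max(range(N), key=lambda n: gcc[n])
    return peak if peak <= N // 2 else peak - N            # map to a signed lag


# Toy signals: the target is the substitution signal delayed by 3 samples.
x_s = [math.sin(0.7 * n * n) for n in range(32)]
x_z = [x_s[(n - 3) % 32] for n in range(32)]
delay = gcc_phat_delay(x_z, x_s)
```

Because the PHAT weighting discards the magnitude and keeps only the phase of the cross-power spectrum, the correlation collapses to a sharp peak at the true lag, which is the delta-like shape referred to above.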
Another possibility is offered by the complex coherence function, whose pre-filter can be calculated from the power density spectra, yielding:

$$\Gamma_{zs}(k) = \frac{\Phi_{zs}(k)}{\sqrt{\Phi_{zz}(k)\,\Phi_{ss}(k)}} \tag{12}$$

where $\Phi_{zz}$ : auto-power spectral density of the target signal, $\Phi_{ss}$ : auto-power spectral density of the substitution signal.
The transformation of the signals into the frequency domain is usually implemented by means of a short-term DFT. On the one hand, the block length must be selected large enough to produce peaks in the GCC that are detectable for the expected time delays; on the other hand, excessive block lengths lead to an increased need for storage capacity. To adequately track variations of the time delay $\tau_2$, time-averaging of the GXPSD or of the complex coherence function is proposed (e.g. by exponential smoothing), as given in equations (13) and (14):
Figure imgf000020_0001
In equations (13) and (14), m refers to the block index. The smoothing constants are designated μ and ν. These must be adapted to the hop size of the short-term DFT and to the stationarity of $\tau_2$ in order to obtain the best possible estimation of the coherence function or the generalized cross-power spectral density, respectively.
After the retransformation into the time domain and the detection of the maximum of the GCC, the entire time delay element between target and replacement signal can be formulated by
$$\Delta\tau = \tau_2 - \tau_1 \tag{15}$$
The individual processing steps are summarized in a block diagram in Fig. 2 for one target and one substitution signal. The transition between target and replacement signal or vice versa is depicted as a simple switch in the graphic; as has already been mentioned, a cross-fade of the signals is recommended.
The inventive notion of a multi-channel setup with more than two channels is depicted in Fig. 3. Depending on which channel is affected by dropouts and hence becomes the target channel, the substitution signal is generated with the remaining intact channels. The discrete blocks of Fig. 3 correspond to the following processing steps:
21 Selection of the substitution channel(s)
22 Calculation of the filter coefficients
23 Application of a time delay
24 Generation of a replacement signal
In the uppermost row of Fig. 3, a replacement signal is generated for channel 1, which is afflicted by dropouts. To achieve this, either one, several, or all of the channels 2 to 7 can be used. The second row corresponds to the reconstruction of channel 2, etc.
Fig. 4 shows a schematic of the basic algorithm in combination with the expansion stage (i.e. time delay estimation) to illustrate the mutual dependencies of the individual processing steps. To simplify the block diagram, parallel signals (DFT blocks) or (spectral) mappings derived therefrom are merged into one (solid) line, the number of which is indicated by K or K - 1, respectively. The dotted connections denote the transfer or input of parameters. The first selection of the substitution channels is done in the block labeled "selector" according to the GXPSD. On the one hand, this affects the computation of the envelopes of the magnitude spectra of the substitution signal and, on the other hand, it is needed for the weighted superposition of the same. The second selection criterion is offered by the time delay $\tau_2$. The status bits of the channels are not depicted explicitly, but their verification is considered in the relevant signal-processing blocks. Additionally, the particular determination of the target signal can be omitted from this illustration.
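The selection of substitution channels can be sketched under the assumption that a scalar score per candidate channel (for example the frequency-averaged coherence with the target channel) has already been computed. The function name `select_channels`, the dictionary layout and the example scores are hypothetical; the threshold Θ and the maximum number M correspond to the criteria named in the claims.

```python
from typing import Dict, List


def select_channels(scores: Dict[int, float], theta: float,
                    max_channels: int) -> List[int]:
    """Pick up to max_channels channel indices whose score exceeds theta.

    Channels are returned in order of decreasing score.
    """
    candidates = [j for j, value in scores.items() if value > theta]
    ranked = sorted(candidates, key=lambda j: scores[j], reverse=True)
    return ranked[:max_channels]


# Hypothetical scores of channels 2 to 5 with respect to target channel 1.
scores = {2: 0.9, 3: 0.4, 4: 0.7, 5: 0.8}
chosen = select_channels(scores, theta=0.5, max_channels=2)
```

Channel 3 is rejected by the threshold and channel 4 by the limit on the number of channels, so the substitution signal in this example would be formed from channels 2 and 5.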
Hardware implementation
According to the invention, the method for dropout concealment works as an independent module and is intended for installation into a digital signal processing chain, wherein the software-specified algorithm is implemented on a commercially available digital signal processor (DSP), preferably a special DSP for audio applications. Accordingly, for each channel of a multi-channel arrangement, an appropriate device, such as exemplarily depicted in Fig. 5, is necessary that preferably may be integrated directly into the apparatus for receiving and decoding the transmitted digital audio data.
The apparatus for dropout concealment is equipped with a primary audio input that adopts the digital signal frames from the receiver unit and temporarily stores them in a storage unit 25. The apparatus is equipped with at least one secondary audio input, optionally several secondary audio inputs, at which the digital data of the substitution channel(s) are available and likewise stored temporarily in one, optionally several, storage unit(s) 25. In addition, the device features an interface for the transmission of control data such as the status bit of the signal frames (dropout y/n) or an information bit for the selection of the substitution channel(s), the latter requiring (a) a bidirectional data line and (b) a temporary storage unit 25.
In order to forward the original or concealed data frames of the primary channel, the apparatus is equipped with an audio output. A separate storage unit for the data blocks to be output is not necessary, since they can be stored as needed in the storage unit of the input signal.

Claims

Patent claims:
1. Method for the concealment of dropouts in one or more channels (Z) of a multichannel arrangement comprising at least two channels (Z, S), wherein in the event of a dropout in one channel (Z) a replacement signal is generated with the aid of at least one error-free channel (S), characterised in that, during the error-free signal transmission of the channels (Z, S), a mapping of the transmitted signals ($x_z$, $x_s$) into the frequency domain is performed, the magnitude spectra ($|S_z|$, $|S_s|$) being determined, spectral filter coefficients (H) are calculated that relate the magnitude spectrum ($|S_z|$) of a channel (Z) to the magnitude spectrum ($|S_s|$) of at least one other channel (S), and that in the event of the dropout of a channel (Z) the replacement signal is generated by application of filter coefficients (H), computed prior to the dropout, to a substitution signal which consists of at least one error-free channel (S).
2. Method according to Claim 1, characterised in that the magnitude spectra ($|S_z|$, $|S_s|$) are distorted non-linearly prior to the calculation of the filter coefficients (H).
3. Method according to one of Claims 1 or 2, characterised in that the magnitude spectra ($|S_z|$, $|S_s|$) are time-averaged prior to the calculation of the filter coefficients (H).
4. Method according to one of Claims 1 to 3, characterised in that the filter coefficients (H) are calculated by minimizing the difference between the optionally non-linearly distorted and/or time-averaged magnitude spectrum ($|S_z|$) of a channel (Z), and an optionally non-linearly distorted and/or time-averaged magnitude spectrum ($|S_s|$) of at least one other channel (S) filtered with the filter coefficients (H).
5. Method according to one of Claims 1 to 4, characterised in that the calculation of the filter coefficients (H) is done through the quotient of the magnitude spectra ($|S_z|$, $|S_s|$), according to:
$$H(k) = \frac{\left|S_z(k)\right|}{\left|S_s(k)\right|}$$
6. Method according to one of Claims 1 to 5, characterised in that a regularisation of the filter coefficients (H) is carried out with the aid of a frequency-dependent parameter (β(k)).
7. Method according to Claim 6, characterised in that the regularisation is accomplished according to the formulation
$$H(k) = \frac{\left|S_z(k)\right|\,\left|S_s(k)\right|}{\left|S_s(k)\right|^{2} + \beta(k)}$$
8. Method according to Claim 7, characterised in that the estimation of β(k) is achieved via the rms value of the background noise level $P_g(k)$, where β(k) = c ·
Figure imgf000024_0003
the factor c facilitating an improved adaptation with preferred values of c = 1...5.
9. Method according to one of Claims 1 to 9, characterised in that the calculation of the envelopes of the magnitude spectra are obtained by means of the short-term DFT of short block length.
10. Method according to one of Claims 1 to 9, characterised in that the envelopes of the magnitude spectra can be calculated incorporating the magnitude spectra of a wavelet transformation, or the rms (per channel) of a gammatone filter bank, or linear prediction with subsequent sampling of the magnitude of the spectral envelopes of a signal frame (represented by the synthesis filter), or a real cepstral analysis with subsequent retransformation of the cepstral domain into the frequency domain and taking the antilogarithm, or a short-term DFT with maximum detection and interpolation of the magnitude spectra, respectively.
11. Method according to Claim 3, characterised in that the time-averaging of a magnitude spectrum ($|S_z|$, $|S_s|$) incorporates exponential smoothing with a smoothing constant (α).
12. Method according to Claim 3, characterised in that the time-averaging of a magnitude spectrum ($|S_z|$, $|S_s|$) is implemented by means of a moving average filter.
13. Method according to Claims 2 and 3, characterised in that the non-linear distortion and the time-averaging of a magnitude spectrum ($|S_z|$, $|S_s|$) obeys either formulation
Figure imgf000025_0003
where α refers to the smoothing constant in the range of 0 < α < 1, m refers to the block index and γ, δ refer to the distortion exponents for the magnitude spectra ($|S_z|$, $|S_s|$).
14. Method according to Claim 2, characterised in that the non-linear distortion is achieved through the logarithmic and exponential function, where
$$\left|\bar{S}_z(m)\right| = e^{\alpha \ln\{|S_z|\} + (1-\alpha)\ln\{|\bar{S}_z(m-1)|\}} \quad\text{and}\quad \left|\bar{S}_s(m)\right| = e^{\alpha \ln\{|S_s|\} + (1-\alpha)\ln\{|\bar{S}_s(m-1)|\}}$$
15. Method according to one of Claims 1 to 4, characterised in that the calculation of the filter coefficients ( H ) is carried out by way of the time-averaging of the coefficients instead of the time-averaging of the spectral envelopes, according to the formulation
Figure imgf000025_0005
16. Method according to one of Claims 1 to 15, characterised in that the filter coefficients ( H ) are transformed into time domain, and the filter impulse response is bounded in time domain applying a windowing function.
17. Method according to one of Claims 1 to 16, characterised in that the replacement signal is generated through the filtering of the error-free substitution channel in the time domain.
18. Method according to one of Claims 1 to 16, characterised in that the bounded filter impulse response is brought back into frequency domain, and the filtering of the substitution signal is performed in frequency domain.
19. Method according to one of Claims 1 to 18, characterised in that the transition between the target signal and the replacement signal takes place using a cross-fade.
20. Method according to Claim 19, characterised in that an extrapolation by means of a linear prediction filter is used for the implementation of the cross-fade without buffering and hence without additional signal delay.
21. Method according to one of Claims 1 to 20, characterised in that a time delay ($\tau_2$) between the signals ($x_z$, $x_s$) transmitted on the channels (Z, S) is determined from the magnitude spectra ($S_z$, $S_s$; $X_z$, $X_s$) of two channels, and is applied as a time delay to the replacement signal.
22. Method according to Claim 21, characterised in that the time delay ($\tau_2$) is determined from the maximum of the generalized cross-correlation of the signals ($x_z$, $x_s$).
23. Method according to Claims 21 and 22, characterised in that the time delay ($\tau_2$) is reduced by the time delay ($\tau_1$) that is caused by the filtering of the substitution signal ($x_s$) with the time domain filter coefficients ($h_w$), yielding a new time delay $\Delta\tau = \tau_2 - \tau_1$ that is applied to the replacement signal.
24. Method according to Claims 22 and 23, characterised in that the generalized cross-correlation is determined from the generalized cross-power spectral density $\Phi_{G,zs}(k) = G(k)\,X_z(k)\,X_s^{*}(k)$ through inverse transformation of the latter into the time domain; (G(k)) refers to a pre-filter and ($X_z$, $X_s$) refers to the complex spectra of the signals ($x_z$, $x_s$).
25. Method according to Claim 24, characterised in that the pre-filter (G(k)) is the phase transform filter
$$G_{PHAT}(k) = \frac{1}{\left|X_z(k)\,X_s^{*}(k)\right|}$$
26. Method according to Claims 22 and 23, characterised in that the generalized cross-correlation is determined by inverse transformation of the coherence function
$$\Gamma_{zs}(k) = \frac{\Phi_{zs}(k)}{\sqrt{\Phi_{zz}(k)\,\Phi_{ss}(k)}}$$
into the time domain, where $\Phi_{zs}(k) = X_z(k)\,X_s^{*}(k)$, and $\Phi_{zz}(k)$ and $\Phi_{ss}(k)$ refer to the auto-power spectral densities of the two signals (Z, S).
27. Method according to one of Claims 21 to 26, characterised in that the frequency spectra ($X_z$, $X_s$) of the signals ($x_z$, $x_s$) are determined by means of short-term DFT.
28. Method according to one of Claims 21 to 27, characterised in that, prior to the transformation into the time domain, the generalized cross-power spectral density or the coherence function is preferably time-averaged through exponential smoothing.
29. Method according to one of Claims 1 to 28, characterised in that a signal $x_J(n)$ is selected as a substitution signal, whose frequency-averaged version of the coherence function, $\chi(j)$, with
Figure imgf000027_0003
$J = \arg\max_j \chi(j)$.
30. Method according to Claims 1 to 28, characterised in that the substitution signal is composed of several weighted signals.
31. Method according to Claim 30, characterised in that a superposition of several channels to form one substitution channel is implemented, according to the formulation
Figure imgf000028_0001
where J represents the set of the indices of the potential channels. The superposition also takes all time delays ($\Delta\tau_j$) into account.
32. Method according to Claim 31, characterised in that the size of J can be delimited by the user.
33. Method according to Claims 31 and 32, characterised in that the size of J is restricted to channels whose frequency-averaged values of the coherence function (with the target channel) $\chi(j)$ exceed a threshold value Θ, according to:
Figure imgf000028_0002
34. Method according to Claims 31 and 32, characterised in that the size of J is restricted to a maximum number of M channels, according to:
Figure imgf000028_0003
35. Method according to Claims 31 to 34, characterised in that the criteria threshold value Θ and maximum number M are jointly taken into account, according to:
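$$J = \left\{\, j_i \;\middle|\; (1 \le j_i \le K-1) \wedge (1 \le i \le M) \wedge (\chi(j_i) > \Theta) \wedge \left[\chi(j_i) > \chi(l),\ \forall\, l \in \{1,\dots,K-1\} \setminus \{j_1,\dots,j_M\}\right] \right\}$$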
36. Method according to one of Claims 1 to 28, characterised in that different substitution signals are used for different frequency bands of the replacement signal.
37. Method according to Claim 36, characterised in that, for each frequency band k, an appropriate band-pass-filtered version of that signal $x_{j,k}(n)$ is selected as a substitution signal whose value of the (time-averaged) coherence function $\Gamma_{zs}(k)$ with the signal to be replaced has a maximum value in the respective frequency band k prior to the dropout, according to: $x_{S,k}(n) = x_{J,k}(n)$, where $J = \arg\max_j \Gamma_{zs_j}(k)$.
PCT/EP2006/011759 2006-12-07 2006-12-07 Dropout concealment for a multi-channel arrangement WO2008067834A1 (en)

Priority Applications (7)

Application Number Priority Date Filing Date Title
CN2006800565725A CN101548555B (en) 2006-12-07 2006-12-07 Method for hiding information lost in multi-channel arrangement one or more channels
JP2009539608A JP4976503B2 (en) 2006-12-07 2006-12-07 Dropout compensation for multi-channel arrays
EP06818999A EP2092790B1 (en) 2006-12-07 2006-12-07 Dropout concealment for a multi-channel arrangement
PCT/EP2006/011759 WO2008067834A1 (en) 2006-12-07 2006-12-07 Dropout concealment for a multi-channel arrangement
DE602006015376T DE602006015376D1 (en) 2006-12-07 2006-12-07 DEVICE FOR HIDING OUT SIGNAL FAILURE FOR A MULTI-CHANNEL ARRANGEMENT
AT06818999T ATE473605T1 (en) 2006-12-07 2006-12-07 DEVICE FOR BLENDING SIGNAL FAILURES FOR A MULTI-CHANNEL ARRANGEMENT
US12/479,046 US8260608B2 (en) 2006-12-07 2009-06-05 Dropout concealment for a multi-channel arrangement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2006/011759 WO2008067834A1 (en) 2006-12-07 2006-12-07 Dropout concealment for a multi-channel arrangement

Publications (1)

Publication Number Publication Date
WO2008067834A1 true WO2008067834A1 (en) 2008-06-12

Family

ID=37909549

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2006/011759 WO2008067834A1 (en) 2006-12-07 2006-12-07 Dropout concealment for a multi-channel arrangement

Country Status (7)

Country Link
US (1) US8260608B2 (en)
EP (1) EP2092790B1 (en)
JP (1) JP4976503B2 (en)
CN (1) CN101548555B (en)
AT (1) ATE473605T1 (en)
DE (1) DE602006015376D1 (en)
WO (1) WO2008067834A1 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011087118A (en) * 2009-10-15 2011-04-28 Sony Corp Sound processing apparatus, sound processing method, and sound processing program
US9201580B2 (en) 2012-11-13 2015-12-01 Adobe Systems Incorporated Sound alignment user interface
US10638221B2 (en) 2012-11-13 2020-04-28 Adobe Inc. Time interval sound alignment
US9355649B2 (en) * 2012-11-13 2016-05-31 Adobe Systems Incorporated Sound alignment using timing information
US10249321B2 (en) 2012-11-20 2019-04-02 Adobe Inc. Sound rate modification
US9451304B2 (en) 2012-11-29 2016-09-20 Adobe Systems Incorporated Sound feature priority alignment
US10455219B2 (en) 2012-11-30 2019-10-22 Adobe Inc. Stereo correspondence and depth sensors
US9135710B2 (en) 2012-11-30 2015-09-15 Adobe Systems Incorporated Depth map stereo correspondence techniques
US10249052B2 (en) 2012-12-19 2019-04-02 Adobe Systems Incorporated Stereo correspondence model fitting
US9208547B2 (en) 2012-12-19 2015-12-08 Adobe Systems Incorporated Stereo correspondence smoothness tool
US9214026B2 (en) 2012-12-20 2015-12-15 Adobe Systems Incorporated Belief propagation and affinity measures
CN104282309A (en) 2013-07-05 2015-01-14 杜比实验室特许公司 Packet loss shielding device and method and audio processing system
US10157620B2 (en) 2014-03-04 2018-12-18 Interactive Intelligence Group, Inc. System and method to correct for packet loss in automatic speech recognition systems utilizing linear interpolation
EP3309981B1 (en) * 2016-10-17 2021-06-02 Nxp B.V. Audio processing circuit, audio unit, integrated circuit and method for blending
CN111383643B (en) 2018-12-28 2023-07-04 南京中感微电子有限公司 Audio packet loss hiding method and device and Bluetooth receiver

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050182996A1 (en) * 2003-12-19 2005-08-18 Telefonaktiebolaget Lm Ericsson (Publ) Channel signal concealment in multi-channel audio systems

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE4111131C2 (en) * 1991-04-06 2001-08-23 Inst Rundfunktechnik Gmbh Method of transmitting digitized audio signals
US5793801A (en) * 1996-07-09 1998-08-11 Telefonaktiebolaget Lm Ericsson Frequency domain signal reconstruction compensating for phase adjustments to a sampling signal
US6904110B2 (en) * 1997-07-31 2005-06-07 Francois Trans Channel equalization system and method
DE19921122C1 (en) * 1999-05-07 2001-01-25 Fraunhofer Ges Forschung Method and device for concealing an error in a coded audio signal and method and device for decoding a coded audio signal
JP4218134B2 (en) * 1999-06-17 2009-02-04 ソニー株式会社 Decoding apparatus and method, and program providing medium
JP2001296894A (en) * 2000-04-12 2001-10-26 Matsushita Electric Ind Co Ltd Voice processor and voice processing method
US7155388B2 (en) * 2004-06-30 2006-12-26 Motorola, Inc. Method and apparatus for characterizing inhalation noise and calculating parameters based on the characterization
US7254535B2 (en) * 2004-06-30 2007-08-07 Motorola, Inc. Method and apparatus for equalizing a speech signal generated within a pressurized air delivery system
US7139701B2 (en) * 2004-06-30 2006-11-21 Motorola, Inc. Method for detecting and attenuating inhalation noise in a communication system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG Y ET AL: "MODIFIED DISCRETE COSINE TRANSFORM-ITS IMPLICATIONS FOR AUDIO CODING AND ERROR CONCEALMENT", JOURNAL OF THE AUDIO ENGINEERING SOCIETY, AUDIO ENGINEERING SOCIETY, NEW YORK, NY, US, vol. 51, no. 1/2, January 2003 (2003-01-01), pages 52 - 61, XP001178776, ISSN: 1549-4950 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2207273A1 (en) 2009-01-09 2010-07-14 AKG Acoustics GmbH Method for receiving digital audio data
WO2010078605A1 (en) 2009-01-09 2010-07-15 Akg Acoustics Gmbh Method for receiving digital audio data
WO2015036632A1 (en) 2013-09-13 2015-03-19 European Sleep Care Institute Sl Baby mattress

Also Published As

Publication number Publication date
CN101548555A (en) 2009-09-30
DE602006015376D1 (en) 2010-08-19
JP4976503B2 (en) 2012-07-18
US20090306972A1 (en) 2009-12-10
EP2092790A1 (en) 2009-08-26
JP2010512078A (en) 2010-04-15
CN101548555B (en) 2012-10-03
EP2092790B1 (en) 2010-07-07
US8260608B2 (en) 2012-09-04
ATE473605T1 (en) 2010-07-15

Similar Documents

Publication Publication Date Title
EP2092790B1 (en) Dropout concealment for a multi-channel arrangement
AU2005299070B2 (en) Diffuse sound envelope shaping for binaural cue coding schemes and the like
AU2005299068B2 (en) Individual channel temporal envelope shaping for binaural cue coding schemes and the like
EP1829026B1 (en) Compact side information for parametric coding of spatial audio
JP5453222B2 (en) Efficient filtering using complex modulation filter banks
KR101989062B1 (en) Apparatus and method for enhancing an audio signal, sound enhancing system
JP2010541350A (en) Apparatus and method for extracting ambient signal in apparatus and method for obtaining weighting coefficient for extracting ambient signal, and computer program
KR20070088329A (en) Apparatus and method for generating multi-channel synthesizer control signal and apparatus and method for multi-channel synthesizing
US9311925B2 (en) Method, apparatus and computer program for processing multi-channel signals

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200680056572.5

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 06818999

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2006818999

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2009539608

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE