WO2005059901A1 - Constrained filter encoding of polyphonic signals - Google Patents
Constrained filter encoding of polyphonic signals
- Publication number
- WO2005059901A1 (application PCT/SE2004/001907)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- channel
- signal
- constraint
- filter
- adaptive filter
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S5/00—Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
- H04S5/02—Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation of the pseudo four-channel type, e.g. in which rear channel signals are derived from two-channel stereo signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
Definitions
- the present invention relates in general to encoding of audio signals, and in particular to encoding of multi-channel audio signals.
- The simplest way of stereophonic or multi-channel coding of audio signals is to encode the signals of the different channels separately as individual and independent signals.
- Another basic way used in stereo FM radio transmission and which ensures compatibility with legacy mono radio receivers is to transmit a sum and a difference signal of the two involved channels.
- M/S stereo coding is similar to the described procedure in stereo FM radio, in a sense that it encodes and transmits the sum and difference signals of the channel sub-bands and thereby exploits redundancy between the channel sub-bands.
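The sum/difference idea can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the factor 1/2 is an assumed normalisation that the text does not specify.

```python
def ms_encode(left, right):
    """Form mid (scaled sum) and side (scaled difference) signals.

    The 1/2 scaling is an assumption made here so that decoding is a
    plain sum and difference.
    """
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover the left and right signals from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


left = [1.0, 0.5, -0.25]
right = [0.5, 0.5, 0.75]
mid, side = ms_encode(left, right)
decoded_left, decoded_right = ms_decode(mid, side)
```

When the two channels are similar, the side signal is small, which is where the redundancy saving comes from.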
- the structure and operation of an encoder based on M/S stereo coding is described, e.g. in US patent 5,285,498 by J.D. Johnston.
- Intensity stereo, on the other hand, is able to make use of stereo irrelevancy. It transmits the joint intensity of the channels (of the different sub-bands) along with some location information indicating how the intensity is distributed among the channels. Intensity stereo provides only spectral magnitude information of the channels. Phase information is not conveyed. For this reason, and since the temporal inter-channel information (more specifically the inter-channel time difference) is of major psycho-acoustical relevancy particularly at lower frequencies, intensity stereo can only be used at high frequencies above e.g. 2 kHz.
- An intensity stereo coding method is described, e.g. in the European patent 0497413 by R. Veldhuis et al.
- A recently developed stereo coding method is described, e.g. in a conference paper with the title "Binaural cue coding applied to stereo and multi-channel audio compression", 112th AES convention, May 2002, Munich, Germany by C. Faller et al.
- This method is a parametric multi-channel audio coding method.
- the basic principle is that at the encoding side, the input signals from N channels c1, c2, ..., cN are combined to one mono signal m.
- the mono signal is audio encoded using any conventional monophonic audio codec.
- parameters are derived from the channel signals, which describe the multi-channel image.
- the parameters are encoded and transmitted to the decoder, along with the audio bit stream.
- the decoder first decodes the mono signal m' and then regenerates the channel signals c1', c2', ..., cN', based on the parametric description of the multi-channel image.
- the principle of the Binaural Cue Coding (BCC) method is that it transmits the encoded mono signal and so-called BCC parameters.
- the BCC parameters comprise coded inter-channel level differences and inter-channel time differences for sub-bands of the original multi-channel input signal.
- the decoder regenerates the different channel signals by applying sub-band-wise level and phase adjustments of the mono signal based on the BCC parameters.
- the advantage over e.g. M/S or intensity stereo is that stereo information comprising temporal inter-channel information is transmitted at much lower bit rates.
- A problem with the state-of-the-art multi-channel coding techniques described above is that they require high bit rates in order to provide good quality. Intensity stereo, if applied at bit rates as low as e.g. only a few kbps, suffers from the fact that it does not provide any temporal inter-channel information. As this information is perceptually important for low frequencies below e.g. 2 kHz, it is unable to provide a stereo impression at such low frequencies.
- BCC is able to reproduce the multi-channel image even at low frequencies at low bit rates of e.g. 3 kbps since it also transmits temporal inter-channel information.
- this technique requires computationally demanding time-frequency transforms on each of the channels, both at the encoder and the decoder.
- BCC optimises the mapping in a pure mathematical manner. Characteristic artefacts immanent in the coding method will, however, not disappear.
- side information consists of predictor filters and optionally a residual signal.
- the predictor filters, estimated by a least-mean-square algorithm, allow the prediction of the multi-channel audio signals when applied to the mono signal.
- This technique synthesises the right and left channel signals by filtering sound source signals with so-called head-related filters.
- this technique requires the different sound source signals to be separated and can thus not generally be applied for stereo or multi-channel coding.
- Although the predictor filters are known to be optimal in the least-mean-square sense, they do not always fully restore the perceptual characteristics of the original multi-channel signals.
- stereo image instability may occur, where the sound jumps randomly between left and right.
- spectral nulls may cause instabilities and lead to a filter whose frequency response at these frequencies is aberrant. This may cause the filter to perform unnecessary amplification in certain regions and lead to very annoying audible artefacts, especially if the signals are low-pass or high-pass filtered.
- An object of the present invention is to provide a method and device for multi-channel encoding that improves the perceptual quality of the audio signal.
- a further object of the present invention is to provide such a method and device, which requires low bit rate representation.
- the signals of the different channels are combined into one main signal.
- a set of adaptive filters, preferably one for each channel, is derived.
- When a filter is applied to the main signal, it reconstructs the signal of the respective channel under a perceptual constraint.
- the perceptual constraint is a gain and/or shape constraint.
- the gain constraint allows the preservation of the relative energy between the channels while the shape constraint allows stereo image stability, e.g. by avoiding unnecessary filtering of spectral nulls.
- the transmitted parameters are the main signal, in encoded form, and the parameters of the adaptive filters, preferably also encoded.
- the receiver reconstructs the signal of the different channels by applying the adaptive filters and possibly some additional post-processing.
- An advantage with the present invention is that perceptual artefacts are reduced when decoding audio signals.
- the required transmission bit rate is at the same time also kept at a very low level.
- FIG. 1 is a block scheme of a system for transmitting multi-channel signals
- FIG. 2a is a block diagram of an embodiment of an encoder in a transmitter according to the present invention
- FIG. 2b is a block diagram of an embodiment of a decoder in a receiver according to the present invention
- FIG. 3a is a block diagram of another embodiment of an encoder in a transmitter according to the present invention
- FIG. 3b is a block diagram of another embodiment of a decoder in a receiver according to the present invention
- FIG. 4 is a block diagram of an embodiment of a filter adaptation unit according to the present invention
- FIG. 5 are diagrams illustrating the effects of insufficient reproduction of side signals in a prior-art system
- FIG. 6 is a diagram illustrating effects of spectral nulls in prior-art systems
- FIG. 7 is a block diagram illustrating combining possibilities in channel filter sections according to the present invention
- FIG. 8 is a block diagram of an embodiment of an encoder employing partial combined encoding of a stereo signal
- FIG. 9 is a block diagram illustrating the use of division in frequency sub-bands
- FIG. 10 is a composite diagram illustrating overlapping analysis for encoding and decoding
- FIG. 11 is a flow diagram of the basic steps of an embodiment of an encoding method according to the present invention.
- FIG. 1 illustrates a typical system 1, in which the present invention advantageously can be utilised.
- a transmitter 10 comprises an antenna 12 including associated hardware and software to be able to transmit radio signals 5 to a receiver 20.
- the transmitter 10 comprises among other parts a multi-channel encoder 14, which transforms signals of a number of input channels 16 into output signals suitable for radio transmission. Examples of suitable multi-channel encoders 14 are described in detail further below.
- the signals of the input channels 16 can be provided from e.g. an audio signal storage 18, such as a data file of digital representation of audio recordings, magnetic tape or vinyl disc recordings of audio etc.
- the signals of the input channels 16 can also be provided "live", e.g. from a set of microphones 19.
- the audio signals are digitised, if not already in digital form, before entering the multi-channel encoder 14.
- an antenna 22 with associated hardware and software handles the actual reception of radio signals 5 representing polyphonic audio signals.
- typical functionalities such as e.g. error correction, are performed.
- a decoder 24 decodes the received radio signals 5 and transforms the audio data carried thereby into signals of a number of output channels 26.
- the output signals can be provided to e.g. loudspeakers 29 for immediate presentation, or can be stored in an audio signal storage 28 of any kind.
- the system 1 can for instance be a phone conference system, a system for supplying audio services or other audio applications.
- the communication has to be of a duplex type, while e.g. distribution of music from a service provider to a subscriber can be essentially of a one-way type.
- the transmission of signals from the transmitter 10 to the receiver 20 can also be performed by any other means, e.g. by different kinds of electromagnetic waves, cables or fibres as well as combinations thereof.
- Fig. 2a illustrates one embodiment of a multi-channel encoder 14 according to the present invention. A number of channel signals c1, c2, ..., cN are received at separate inputs 16:1-16:N.
- the channel signals are connected to a linear combination unit 34.
- all channel signals are summed together to form a mono signal x.
- any predetermined linear combination of one or more of the channel signals may be used as an alternative, including pure channel signals.
- a pure sum will simplify most mathematical operations.
- the mono signal x is provided as an input signal 42 to a channel filter section 130.
- the mono signal x is provided to, and encoded in, a mono signal encoder 38 to provide encoding parameters p_x representing the mono signal x.
- the mono signal encoder operates according to any suitable mono signal encoding technique. Many such techniques are available in known technology. The actual details of the encoding technique are not of importance for enabling the present invention and are therefore not further discussed.
- the channel signals are also connected to the channel filter section 130.
- each channel signal is connected to a respective filter adaptation unit 30:1-30:N.
- the filter adaptation units perform a reconstruction of a respective channel signal when applied to the mono signal x.
- Coefficients of the filter adaptation units 30:1-30:N are, according to the present invention, optimised under a perceptual constraint. However, the optimised coefficients of the filter adaptation units 30:1-30:N may also be obtained at least partly in a joint optimisation of two or more of the channel signals.
- the output of the channel filter section 130 comprises N sets of filter parameters p1-pN. These filter parameters p1-pN are typically encoded separately or jointly to be suitable for transmission. The filter parameters p1-pN and the mono signal x are sufficient to enable reconstruction of all channel signals. The encoded filter parameters p1-pN and the encoding parameters p_x representing the mono signal x are in the present embodiment multiplexed in a multiplexor 40 into one output signal 52, ready for transmission.
- Fig. 2b illustrates one embodiment of a multi-channel decoder 24 according to the present invention.
- the decoder 24 in Fig. 2b is suitable for decoding multi-channel signals encoded by the encoder of Fig. 2a.
- An input signal 54 is received and provided to a demultiplexor 56, which divides the input signal 54 into encoding parameters p_x representing the mono signal x and a number of sets of encoded filter parameters p1-pN.
- the encoding parameters p_x representing the mono signal x are provided to a mono signal decoder 64, in which they are used to generate a decoded mono signal x" according to any suitable decoding technique associated with the encoding technique used in Fig. 2a. Many such techniques are available in known technology. The actual details of the encoding technique are not of importance for enabling the present invention and are therefore not further discussed.
- the decoded mono signal x" is provided to a channel filter section 160.
- the encoded filter parameters are also provided to the channel filter section 160, where they are decoded and used to define channel filters 60:1-60:N.
- the so defined respective channel filters 60:1-60:N are applied to the decoded mono signal x" whereby respective channel signals c"1-c"N are reconstructed and provided at outputs 26:1-26:N.
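The decoder-side filtering step can be sketched as follows. This is a minimal illustration assuming each channel filter is a causal FIR filter with zero initial state; the text only speaks of linear filters with coefficients, so the FIR form and the function names are assumptions made here.

```python
def fir_filter(h, x):
    """Causal FIR filtering: y(n) = sum_k h[k] * x(n - k), zero initial state."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def reconstruct_channels(decoded_mono, channel_filters):
    """Apply one filter per channel to the decoded mono signal."""
    return [fir_filter(h, decoded_mono) for h in channel_filters]


decoded_mono = [1.0, 2.0, 3.0]
channel_filters = [[0.5], [0.5, 0.25]]  # hypothetical decoded coefficients
channels = reconstruct_channels(decoded_mono, channel_filters)
```

Only the mono signal and one short coefficient set per channel are needed at the receiver, which is why the side information can be kept small.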
- a mono signal is used as a main signal for regenerating the channel signals at the encoding or decoding.
- any predetermined linear combination of signals selected among the channel signals may be used as such a main signal.
- the optimum choice of predetermined linear combination depends on the actual application and implementation.
- a single channel signal can also constitute a possible such predetermined linear combination.
- Another embodiment of a multi-channel encoder 14 according to the present invention is illustrated in Fig. 3a. Similar parts are denoted by similar reference numbers and only the differences are discussed below.
- the linear combination unit 34 provides as earlier a predetermined linear combination of the channel signals to the mono signal encoder 38.
- the signal associated with the mono signal x is instead a decoded version x" of the encoding parameters p_x representing the mono signal x.
- Such an arrangement, referred to as a closed loop approach, will allow for certain compensations of mono signal encoding inaccuracies, as described further below.
- the linear combination unit 34 of the present embodiment also combines the channel signals in N-1 predetermined linear combinations c*1-c*N-1, which serve as actual input signals to the channel filter section 130.
- the N-1 predetermined linear combinations c*1-c*N-1 should be mutually linearly independent.
- the linear combinations c*1-c*N-1 do not necessarily comprise any contribution from all channel signals.
- the term "linear combination" should in this context be understood as also comprising the special cases where a factor of a component can be set to zero. In fact, in the simplest set-up, the linear combinations c*1-c*N-1 can be identical to the channel signals c1-cN-1.
- the modified channel signals are also in this embodiment connected to the channel filter section 130, in which N-1 sets of filter coefficients are deduced, now corresponding to the modified channel signals.
- the coefficients of the filter adaptation units 30:1-30:N are, according to the present invention, optimised under a perceptual constraint.
- the output of the channel filter section 130 comprises N-1 sets of filter parameters p*1-p*N-1. These filter parameters p*1-p*N-1 are typically encoded separately or jointly to be suitable for transmission.
- the encoded filter parameters p*1-p*N-1 and the encoding parameters p_x representing the mono signal x are in the present embodiment transmitted separately.
- Fig. 3b illustrates another embodiment of a multi-channel decoder 24 according to the present invention.
- the decoder 24 in Fig. 3b is suitable for decoding multi-channel signals encoded by the encoder of Fig. 3a.
- Encoding parameters p_x representing the mono signal x and a set of encoded filter parameters p*1-p*N-1 are received.
- the encoding parameters p_x representing the mono signal x are used to generate a decoded mono signal x" in a mono signal decoder 64 in analogy with the previous embodiment.
- the filter parameters p*1-p*N-1 are likewise provided to the channel filter section 160 for obtaining N-1 decoded modified channel signals c*1-c*N-1.
- a linear combination unit 74 is then used to provide reconstructed channel signals c"1-c"N from the modified channel signals c*1-c*N-1 and the decoded mono signal x".
- R is the symmetric covariance matrix of the mono signal x(n).
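The equations surrounding this definition are not preserved in the text. As a hedged sketch, the standard unconstrained least-squares solution solves the normal equations R h = r, where r is the cross-correlation between the channel signal and the delayed mono signal. The code below assumes an FIR filter and zero samples before the frame start; the function names are hypothetical and the patent's own formulation may differ.

```python
def solve_linear(a, b):
    """Solve a * h = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    h = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * h[j] for j in range(i + 1, n))
        h[i] = (m[i][n] - s) / m[i][i]
    return h


def delayed(x, n, k):
    """x(n - k), taken as zero before the start of the frame (an assumption)."""
    return x[n - k] if n - k >= 0 else 0.0


def least_squares_filter(x, c, order):
    """Filter h minimising sum_n (c(n) - sum_k h[k] x(n - k))^2."""
    frame = range(len(x))
    # R[i][j] = sum_n x(n - i) x(n - j): covariance matrix of the mono signal
    R = [[sum(delayed(x, n, i) * delayed(x, n, j) for n in frame)
          for j in range(order)] for i in range(order)]
    # r[i] = sum_n c(n) x(n - i): cross-correlation with the channel signal
    r = [sum(c[n] * delayed(x, n, i) for n in frame) for i in range(order)]
    return solve_linear(R, r)


mono = [1.0, 2.0, 0.0, -1.0, 3.0, 1.0]
channel = [0.5 * delayed(mono, n, 0) + 0.25 * delayed(mono, n, 1)
           for n in range(len(mono))]
h = least_squares_filter(mono, channel, order=2)  # close to [0.5, 0.25]
```

If the mono signal has spectral nulls, R becomes ill-conditioned and the solution can be numerically unreliable.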
- the perceptual characteristics may not completely be determined by a pure mathematical minimisation.
- the predicted channels may have no frequency content above or below a certain frequency. This occurs if, for instance, the channel is high-pass filtered, or results from a band-splitting procedure. Spectral nulls may cause instabilities and lead to filter responses that produce unnecessary amplification and low frequency audible artefacts. According to the present invention, a shape constraint is therefore advantageously utilised during optimisation procedures.
- Fig. 4 illustrates the basic ideas of the constrained minimisation procedure at the encoder side according to the present invention in an embodiment having two channels (the stereo case) and a linear filter 31.
- a filter 31 responsible for reconstruction of channel c1, having filter coefficients h_c1, is derived according to a constrained error minimisation procedure in an optimising unit 32.
- the factors γ_c1 and γ_c2 determine how the channel signals are combined.
- One possibility is to set γ_c1 to a factor 2γ and γ_c2 to 2(1 − γ).
- the mono signal will be a weighted sum of the channels.
- the weighted combination of the individual channel signals to form the mono signal can in general even be the combination of filtered versions of the respective channel signals. Such an approach will be called pre-filtering. This can be useful if the approach is implemented in the excitation domain or in general a weighted signal domain.
- the channels can be pre-filtered by an LPC (Linear Predictive Coding) residual filter of the mono signal.
- the mono and left and right channel will be assumed to be in general some pre-filtered versions of the real mono, left and right channels.
- the step of post-filtering with the mono LPC synthesis filter would be needed in order to get back to the signal domains.
- the filter parameters p1 comprise the filter coefficients h_c1 and any necessary additional data defining the filter.
- the difference signal of two channel signals is reproduced by a filter.
- the right and left signals are illustrated by the curves 301 and 302, respectively.
- the representation is not ideal, giving a slightly larger difference than the target difference over the entire frame. This will lead to a reproduced right signal 303 at the decoder side that is slightly lower than the original right signal, and a reproduced left signal 304 that is slightly higher than the original left signal.
- the perception of such an artefact is that the volume of the right channel is decreased and the volume of the left channel is increased. If such artefacts moreover vary in time, the sound will swing back and forth between the right and left channel. A gain constraint may improve such a situation.
- One possible approach is to impose a hard constraint, i.e. an exact energy match between the original channel and the estimated channel, or to impose a loose gain constraint such that the output channel has a prescribed energy, which is not necessarily equal to the original channel signal energy.
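One simple way to meet a hard gain constraint is to rescale an already computed filter so that its output energy equals the target energy. The sketch below assumes an FIR filter and hypothetical function names; it illustrates the idea of energy matching and is not necessarily the constrained minimisation used in the patent.

```python
import math


def fir_filter(h, x):
    """Causal FIR filtering with zero initial state."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def energy(signal):
    """Sum of squared samples over the frame."""
    return sum(v * v for v in signal)


def apply_gain_constraint(h, mono, target_energy):
    """Scale h so that the filtered mono signal has the target energy."""
    estimated_energy = energy(fir_filter(h, mono))
    if estimated_energy == 0.0:
        return list(h), 1.0  # nothing to scale
    gain = math.sqrt(target_energy / estimated_energy)
    return [gain * coeff for coeff in h], gain


mono = [1.0, -2.0, 3.0]
original_channel = [0.6, -1.0, 1.9]
h_constrained, gain = apply_gain_constraint([0.5], mono, energy(original_channel))
```

Passing the original channel energy gives the hard constraint; passing any other prescribed energy gives the loose variant.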
- a channel signal may look like curve 305 of Fig. 6. No intensity is present below frequency f1 or above frequency f2. However, a pure mathematical optimisation gives rise to a curve 306, which presents some limited power also below and above the frequencies f1 and f2, respectively. Such artefacts are perceived.
- a set of linear constraints has to be imposed on the filter. The number of constraints should in general be less than the number of coefficients of the filter.
- the shape constraint can be formulated by a matrix and a vector such that
- This constraint is especially useful when it is known a priori that the channel has no frequency content in a certain frequency range.
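The constraint equation itself is not preserved in the text. As a hedged sketch, assume the shape constraint is written as $A h = b$, with $A$ an $m \times L$ matrix ($m < L$, where $L$ is the number of filter coefficients) and $b$ an $m$-vector. This notation is an assumption made here, not necessarily the patent's. The squared-error minimisation under such a linear constraint has a standard closed form:

```latex
\begin{aligned}
&\min_{h}\ \sum_{n}\bigl(c(n) - h^{T}\mathbf{x}(n)\bigr)^{2}
  \quad\text{subject to}\quad A h = b, \\
&R = \sum_{n}\mathbf{x}(n)\,\mathbf{x}(n)^{T},\qquad
  r = \sum_{n} c(n)\,\mathbf{x}(n),\qquad
  h_{u} = R^{-1} r, \\
&h_{sc} = h_{u} - R^{-1}A^{T}\bigl(A R^{-1}A^{T}\bigr)^{-1}\bigl(A h_{u} - b\bigr).
\end{aligned}
```

Here $h_u$ is the unconstrained least-squares filter and $h_{sc}$ the shape-constrained one. Multiplying the last line by $A$ gives $A h_{sc} = b$, so the constraint holds exactly. As an illustration, rows of $A$ holding $\cos(\omega k)$ and $\sin(\omega k)$ for $k = 0, \dots, L-1$, with $b = 0$, would force the filter response to zero at frequency $\omega$.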
- the gain and shape constraints can also be combined.
- the shape constraint is preferably first applied and the gain constraint is then added as a factor, according to
- This equation is useful for bit rate reduction when encoding the channel filters, since it shows that the channel filters are related by quantities that are available at the decoder side.
- Fig. 7 illustrates that one channel c1 out of two channels c1, c2 is reproduced by applying the mono signal x to an unconstrained filter 131. The result of the unconstrained filter is modified depending on shape constraints in a shape constraint section 132. From the shape constrained filter for the c1 channel, also the shape constrained filter of channel c2 can be calculated and provided to separate gain constraint sections 133 for each channel.
- A more detailed block scheme of another embodiment using a side signal for applying the shape constraint is illustrated in Fig. 8.
- Two channel signals c1 and c2 are combined in addition means 55, 57 of a linear combination unit 34 to a mono signal x and a side signal s.
- a channel filter section 130 comprises an unconstrained parametric filter 131, which applied to the mono signal x reproduces an estimate ŝ of the side signal s.
- the filter coefficients are adapted to give the minimum difference between s and ŝ.
- the unconstrained filter obtained in this manner is provided to a shape constraint section 132, basically according to the discussions further above.
- a shape-constrained filter h_s^sc for the side signal is created.
- a shape-constrained filter for each channel signal is calculated, based on the shape-constrained filter h_s^sc for the side signal.
- These filters, or rather the coefficients thereof, are provided to a respective gain constraint section 133:1, 133:2.
- a gain factor for each channel signal is calculated, and the two filters are provided to a parameter encoding section 66, where the parameters of the two filters are jointly encoded.
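How channel filters can follow from a side-signal filter may be sketched as below. The sketch assumes the mono signal is x = (c1 + c2)/2 and the side signal is s = (c1 − c2)/2, so that c1 = x + s and c2 = x − s. The scaling in Fig. 8 is not stated in the text, so this choice and the function name are assumptions.

```python
def channel_filters_from_side(h_side):
    """Derive both channel filters from one side-signal filter.

    With c1 = x + s and c2 = x - s (assumed scaling), and s estimated by
    filtering x with h_side, the channel filters are delta + h_side and
    delta - h_side, where delta is the unit impulse.
    """
    delta = [1.0] + [0.0] * (len(h_side) - 1)
    h_c1 = [d + s for d, s in zip(delta, h_side)]
    h_c2 = [d - s for d, s in zip(delta, h_side)]
    return h_c1, h_c2


h_c1, h_c2 = channel_filters_from_side([0.25, -0.125])
```

Under this assumption the two channel filters always sum to twice the unit impulse, so one of them determines the other.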
- After calculation of the constrained channel filters h_c1 and h_c2, they are quantized and encoded in a representation which is suitable for transmission to the receiver. Typically, the coefficients of the filters are quantized using scalar or vector quantizers and the quantizer indexes are transmitted. The quantizers may also implement prediction, which is very beneficial for bit rate reduction especially in this scenario.
- Making use of the complementarities of the filters may further reduce the bit rate, since only one of the filters h_c1 or h_c2, or a linear combination of them, is quantized and transmitted, while the gains g_c1 and g_c2 are jointly vector quantized and transmitted separately. Such a transmission can be carried out at bit rates as low as, e.g. 1 kbps.
- the receiver first decodes the transmitted mono signal and channel filters. Then, it regenerates the different channel signals by filtering the mono signal through the respective channel filter. Preferably, in the stereo case, the completeness property is used, and the coefficients are recombined to produce the filters h_c1 and h_c2. Certain post-processing steps that further improve the quality of the reconstructed multi-channel signal may follow the re-generation of the different channel signals.
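The scalar quantization step can be illustrated with a uniform quantizer. The step size, the uniform grid, and the function names are assumptions for illustration only; the text does not specify the quantizer design.

```python
def quantize(coeffs, step):
    """Map each coefficient to the index of the nearest quantization level."""
    return [round(c / step) for c in coeffs]


def dequantize(indexes, step):
    """Rebuild approximate coefficients from the transmitted indexes."""
    return [i * step for i in indexes]


step = 0.125  # hypothetical step size
indexes = quantize([0.5, -0.26, 0.13], step)  # integers sent to the receiver
decoded = dequantize(indexes, step)  # coefficients used by the decoder
```

Only the integer indexes need to be transmitted; a coarser step lowers the bit rate at the cost of a larger coefficient error.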
- the equivalent side signal filter is (as used in Fig. 8):
- the gain constraint on the filters assumes previously computed channel energies, i.e. E_c1, E_c2. It is important to control the gains of the filters, e.g. g_c1, g_c2, and to avoid unnecessary amplification by limiting the gains.
- the channels are anti-correlated on the whole frequency range or in certain frequency bands. This leads to a certain cancellation when the mono channel is formed.
- a certain amount e.g. 0 dB.
- One way to perform this gain limitation is to compute a certain gain factor:
- g F quantifies how severe this cancellation is.
- the filters are derived based on the original mono signal. This is e.g. the case in Fig. 2a, where the signal 42 is the original mono signal x.
- the decoder will use a quantized mono signal as input for the channel filtering.
- the filter calculations are based on the coded and thus already quantized mono signal. This is e.g. the case in Fig. 3a, where the signal 44 is a decoded mono signal x" .
- This approach has the advantage that the channel filter design does not only aim to match the respective channel signals in a best possible way. It also aims to mitigate coding errors, which are the result of the mono signal encoding.
- the principles described hitherto are applicable on the complete spectrum, i.e. full-band signals. However, they are equally well or even more beneficially applicable on sub-bands of the signals.
- Fig. 9 illustrates the principles of sub-band processing.
- a number of channels c1-cN are each divided in K sub-bands SB1, SB2, ..., SBK.
- the channel signals in each sub-band are provided to a respective multi-channel encoder unit 80:1-80:K, where the channel signals are encoded.
- One or several of the multi-channel encoder units 80:1-80:K can be multi-channel encoder units according to the present invention.
- a bit-stream combiner 82 combines the encoded signals into a common encoded signal 53, that is transmitted.
- the multi-channel encoding for the different sub-bands can be carried out individually optimised with respect to e.g. assigned bit rate, processing frame sizes and sampling rate.
- sub-band processing does not carry out multi-channel encoding for very low frequencies, e.g. below 200 Hz. That means that for this very low frequency band, a mere mono signal is transmitted. This principle makes use of the fact that the human stereo perception is less sensitive for very low frequencies. It is known from prior art and called sub-woofing.
- the band splitting is done using a time-frequency transform such as, e.g. a short term Fourier transform (STFT), which allows decomposing the signal into single frequency components.
- the filtering reduces to a mere multiplication of the individual spectral coefficients of the mono signal with a complex factor.
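In the transform domain the channel filter therefore becomes one complex factor per spectral coefficient. A minimal sketch follows, with a hypothetical function name and made-up values; it operates on the bins of a single frame.

```python
def filter_in_transform_domain(mono_bins, factors):
    """Multiply each mono spectral coefficient by its complex factor."""
    return [f * x for f, x in zip(factors, mono_bins)]


mono_bins = [1 + 1j, 2 + 0j]  # spectral coefficients of one frame
factors = [0.5 + 0j, 1j]      # per-bin gain and phase rotation
channel_bins = filter_in_transform_domain(mono_bins, factors)
```

The magnitude of each factor sets the level in that bin and its angle sets the phase shift.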
- The parametric multi-channel coding method according to the invention will typically involve fixed frame-wise processing of signal samples.
- Parameters describing the multi-channel image are derived and transmitted at a rate corresponding to a coding frame length of, e.g., 20 ms.
- The parameters may, however, be obtained from signal frames which are much larger than the coding frame length.
- A suitable choice is to set the length of such analysis frames to values larger than the coding frame length. This implies that the parameter calculation is performed with overlapping analysis frames.
- Smooth filter parameter evolution can be enforced. It is possible, for example, to apply low-pass or median filtering to the filter parameters.
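The framing and smoothing described above can be illustrated as follows. This is a sketch under assumptions: centring each analysis frame on its coding frame, truncating at the signal edges, and the window width of three are illustrative choices, and the function names are hypothetical.

```python
from statistics import median


def analysis_frames(signal, coding_len, analysis_len):
    """Return one analysis frame per coding frame.

    Each analysis frame is centred on its coding frame and is longer than
    it, so consecutive analysis frames overlap. Frames are truncated at
    the signal edges (an illustrative choice).
    """
    pad = (analysis_len - coding_len) // 2
    frames = []
    for start in range(0, len(signal) - coding_len + 1, coding_len):
        lo = max(0, start - pad)
        hi = min(len(signal), start + coding_len + pad)
        frames.append(signal[lo:hi])
    return frames


def median_smooth(track, width=3):
    """Median-filter one filter parameter across successive frames."""
    half = width // 2
    return [median(track[max(0, i - half): i + half + 1])
            for i in range(len(track))]


frames = analysis_frames(list(range(8)), coding_len=2, analysis_len=4)
smoothed = median_smooth([1.0, 1.0, 9.0, 1.0, 1.0])
```

A median filter removes an isolated outlier in a parameter track without smearing it into neighbouring frames, which a low-pass filter would do.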
- Noise shaping of the coding noise: the purpose of this operation is to move coding noise to frequencies where the signal has high spectral density, thus rendering the noise less audible.
- Noise shaping is usually done adaptively, i.e. in response to the audio signal. This implies that, in general, the noise shaping performed on the mono signal will be different from what is required for the various channel signals.
- The subsequent channel filtering according to the invention may lead to an audible increase in coding noise in the reconstructed multi-channel signal compared with the audible coding noise in the mono signal.
- Signal-adaptive post-filtering may be applied to the reconstructed channel signals in a post-processing step of the receiver.
- Any state-of-the-art post-filtering technique can be used here; such techniques essentially emphasise spectral peaks or deepen spectral valleys and thereby reduce the audible noise.
- One example of such a technique is so-called high-resolution post-filtering, which is described in European Patent 0 965 123 B1 by E. Ekudden et al.
- Other simple methods are the so-called pitch and formant post-filters, which are known from speech coding.
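As an illustration of the formant post-filter family mentioned above, the sketch below applies the classic short-term post-filter form H(z) = A(z/γ_num) / A(z/γ_den) known from speech coding. It is not the method of the cited patent. The linear-prediction coefficients are assumed to be available at the receiver, and the default γ values and function names are illustrative assumptions.

```python
def bandwidth_expand(lpc, gamma):
    """Coefficients of A(z/gamma): the k-th coefficient is scaled by gamma**k."""
    return [a * gamma ** k for k, a in enumerate(lpc)]


def formant_postfilter(x, lpc, gamma_num=0.5, gamma_den=0.8):
    """Filter x through H(z) = A(z/gamma_num) / A(z/gamma_den).

    lpc = [1, a1, ..., ap] holds the coefficients of
    A(z) = 1 + a1*z^-1 + ... + ap*z^-p (assumed known at the receiver).
    With gamma_num < gamma_den the filter raises spectral peaks relative
    to the valleys between them.
    """
    num = bandwidth_expand(lpc, gamma_num)
    den = bandwidth_expand(lpc, gamma_den)
    y = []
    for n in range(len(x)):
        # Feed-forward part, A(z/gamma_num).
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        # Feedback part, 1 / A(z/gamma_den); den[0] equals 1.
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc)
    return y


# Impulse response for a first-order example predictor.
response = formant_postfilter([1.0, 0.0, 0.0], [1.0, -0.9])
```

Practical post-filters usually add gain control and tilt compensation so that the overall level and spectral balance are preserved; both are omitted here.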
- In Fig. 11, the main steps of an embodiment of an encoding method according to the present invention are illustrated as a flow diagram.
- The procedure starts in step 200.
- A main signal, preferably a mono signal, deduced from the multi-channel signals is encoded.
- Filter coefficients are optimised to give as good a representation as possible of a channel signal when applied to the main signal. The optimisation takes place under perceptual constraints.
- The optimal coefficients are then encoded in step 224.
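The flow above can be sketched for the simplest possible case, a one-tap filter (a single gain) per channel. Everything specific here is an assumption for illustration: the main signal is taken as the plain average of the channels, the perceptual constraint is stood in for by an energy-matching constraint, and the coefficient is coded with a uniform quantiser of hypothetical step size. The coding of the main signal itself is omitted.

```python
import math


def encode_frame(channels, step=0.05):
    """Toy encoder: main signal plus one constrained gain index per channel."""
    n = len(channels[0])
    # Main signal: here simply the average of the channels (assumed down-mix).
    main = [sum(ch[i] for ch in channels) / len(channels) for i in range(n)]
    e_main = sum(v * v for v in main)

    indices = []
    for ch in channels:
        cross = sum(m * c for m, c in zip(main, ch))
        e_ch = sum(c * c for c in ch)
        if e_main == 0.0:
            gain = 0.0
        else:
            # Constrained solution: keep the sign of the least-squares gain
            # (cross / e_main) but choose the magnitude so that the filtered
            # main signal has the same energy as the channel.
            gain = math.copysign(math.sqrt(e_ch / e_main), cross)
        # Encode the coefficient with a uniform quantiser (illustrative).
        indices.append(round(gain / step))
    return main, indices


main, indices = encode_frame([[1.0, 2.0, 3.0], [0.5, 1.0, 1.5]])
```

A matching decoder would rebuild each channel as `index * step` times the decoded main signal. A multi-tap filter would replace the scalar gain with the solution of a constrained least-squares problem.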
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Filters That Use Time-Delay Elements (AREA)
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP04809080.7A EP1639580B1 (en) | 2003-12-19 | 2004-12-15 | Coding of multi-channel signals |
DK04809080.7T DK1639580T3 (en) | 2003-12-19 | 2004-12-15 | Multichannel coding |
PL04809080T PL1639580T3 (en) | 2003-12-19 | 2004-12-15 | Coding of multi-channel signals |
JP2006518597A JP4323520B2 (en) | 2003-12-19 | 2004-12-15 | Constrained filter coding of polyphonic signals |
ES04809080.7T ES2439693T3 (en) | 2003-12-19 | 2004-12-15 | Multi-channel signal encoding |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
SE0303499A SE0303499D0 (en) | 2003-12-19 | 2003-12-19 | Multi-channel coding using gain-shape constrained filters |
SE0303499-8 | 2003-12-19 | ||
SE0400415-6 | 2004-02-20 | ||
SE0400415A SE527713C2 (en) | 2003-12-19 | 2004-02-20 | Coding of polyphonic signals with conditional filters |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2005059901A1 (en) | 2005-06-30 |
Family
ID=31996352
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/SE2004/001907 WO2005059901A1 (en) | 2003-12-19 | 2004-12-15 | Constrained filter encoding of polyphonic signals |
Country Status (8)
Country | Link |
---|---|
EP (2) | EP1639580B1 (en) |
JP (1) | JP4323520B2 (en) |
DK (1) | DK1639580T3 (en) |
ES (1) | ES2439693T3 (en) |
PL (1) | PL1639580T3 (en) |
PT (1) | PT1639580E (en) |
SE (1) | SE527713C2 (en) |
WO (1) | WO2005059901A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0559383A1 (en) * | 1992-03-02 | 1993-09-08 | AT&T Corp. | A method and apparatus for coding audio signals based on perceptual model |
WO2003009206A1 (en) * | 2001-07-19 | 2003-01-30 | Sungwoo Kim | The system and operational method of mobile telecommunication device for electronic cash |
WO2003009208A1 (en) * | 2001-07-20 | 2003-01-30 | Medical Research Group | Method and apparatus for communicating between an ambulatory medical device and a control device via telemetry using randomized data |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5434948A (en) | 1989-06-15 | 1995-07-18 | British Telecommunications Public Limited Company | Polyphonic coding |
NL9100173A (en) | 1991-02-01 | 1992-09-01 | Philips Nv | SUBBAND CODING DEVICE, AND A TRANSMITTER EQUIPPED WITH THE CODING DEVICE. |
SE9700772D0 (en) | 1997-03-03 | 1997-03-03 | Ericsson Telefon Ab L M | A high resolution post processing method for a speech decoder |
ES2280736T3 (en) | 2002-04-22 | 2007-09-16 | Koninklijke Philips Electronics N.V. | SYNTHETIZATION OF SIGNAL. |
2004
- 2004-02-20 SE SE0400415A patent/SE527713C2/en unknown
- 2004-12-15 WO PCT/SE2004/001907 patent/WO2005059901A1/en not_active Application Discontinuation
- 2004-12-15 PL PL04809080T patent/PL1639580T3/en unknown
- 2004-12-15 EP EP04809080.7A patent/EP1639580B1/en active Active
- 2004-12-15 DK DK04809080.7T patent/DK1639580T3/en active
- 2004-12-15 JP JP2006518597A patent/JP4323520B2/en not_active Expired - Fee Related
- 2004-12-15 PT PT48090807T patent/PT1639580E/en unknown
- 2004-12-15 ES ES04809080.7T patent/ES2439693T3/en active Active
- 2004-12-15 EP EP12154099A patent/EP2456236A1/en not_active Ceased
Non-Patent Citations (2)
Title |
---|
DATABASE INSPEC [online] Database accession no. 7268205 * |
FALLER C. ET AL: "Efficient representation of spatial audio using perceptual parametrization", IEEE WORKSHOP APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS, 21 October 2001 (2001-10-21) - 24 October 2001 (2001-10-24), XP002245584 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2008016098A1 (en) * | 2006-08-04 | 2008-02-07 | Panasonic Corporation | Stereo audio encoding device, stereo audio decoding device, and method thereof |
WO2010042024A1 (en) * | 2008-10-10 | 2010-04-15 | Telefonaktiebolaget Lm Ericsson (Publ) | Energy conservative multi-channel audio coding |
US9330671B2 (en) | 2008-10-10 | 2016-05-03 | Telefonaktiebolaget L M Ericsson (Publ) | Energy conservative multi-channel audio coding |
Also Published As
Publication number | Publication date |
---|---|
JP4323520B2 (en) | 2009-09-02 |
EP1639580A1 (en) | 2006-03-29 |
DK1639580T3 (en) | 2014-01-13 |
PL1639580T3 (en) | 2014-04-30 |
SE0400415L (en) | 2005-06-20 |
JP2007527543A (en) | 2007-09-27 |
ES2439693T3 (en) | 2014-01-24 |
PT1639580E (en) | 2013-11-19 |
SE0400415D0 (en) | 2004-02-20 |
EP1639580B1 (en) | 2013-10-23 |
EP2456236A1 (en) | 2012-05-23 |
SE527713C2 (en) | 2006-05-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7193603B2 (en) | Decoder system, decoding method and computer program | |
US7573912B2 (en) | Near-transparent or transparent multi-channel encoder/decoder scheme | |
CA2527971C (en) | Fidelity-optimised variable frame length encoding | |
EP2981956B1 (en) | Audio processing system | |
KR101183857B1 (en) | Method and apparatus to encode and decode multi-channel audio signals | |
US8249883B2 (en) | Channel extension coding for multi-channel source | |
US7809579B2 (en) | Fidelity-optimized variable frame length encoding | |
CN110223701B (en) | Decoder and method for generating an audio output signal from a downmix signal | |
WO2006091150A1 (en) | Improved filter smoothing in multi-channel audio encoding and/or decoding | |
US7725324B2 (en) | Constrained filter encoding of polyphonic signals | |
CN118366463A (en) | Method and apparatus for processing down-mix of multi-channel digital audio signal, encoding method and medium | |
EP1639580B1 (en) | Coding of multi-channel signals | |
AU2007237227B2 (en) | Fidelity-optimised pre-echo suppressing encoding |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1 Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | WWE | Wipo information: entry into national phase | Ref document number: 2006518597 Country of ref document: JP Ref document number: 2004809080 Country of ref document: EP |
| | WWE | Wipo information: entry into national phase | Ref document number: 5269/DELNP/2005 Country of ref document: IN |
| | WWP | Wipo information: published in national office | Ref document number: 2004809080 Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWW | Wipo information: withdrawn in national office | Ref document number: DE |