TR201900414T4

TR201900414T4 - Decoding a stereo sound signal using complex prediction.

Info

Publication number: TR201900414T4
Application number: TR2019/00414T
Authority: TR
Inventors: Purnhagen Heiko; Carlsson Pontus; Villemoes Lars; Robilliard Julien; Neusinger Matthias; Helmrich Christian; Hilpert Johannes; Rettelbach Nikolaus; Disch Sascha; Edler Bernd
Original assignee: Dolby Int Ab; Fraunhofer Ges Forschung
Priority date: 2010-04-09
Filing date: 2011-03-23
Publication date: 2019-02-21
Also published as: ES2708581T3; ES2704261T3; TR201820422T4; TR201901427T4; ES2707473T3; TR201900906T4; TR201900830T4; ES2704891T3

Abstract

An audio encoder and an audio decoder are described that are based on combining two audio channels (201, 202) to obtain a first combination signal (204) as a mid signal and a residual signal (205) that can be derived using a predicted side signal derived from the mid signal. The first combination signal and the prediction residual signal are encoded (206) and written (212) into a data stream (213) together with the prediction information (206), which is derived by an optimizer (207) based on an optimization target (208). A decoder uses the prediction residual signal, the first combination signal, and the prediction information to derive a decoded first channel signal and a decoded second channel signal. In an encoder example or in a decoder example, a real-to-imaginary transform can be applied to estimate the imaginary part of the spectrum of the first combination signal. To calculate the prediction signal used in deriving the prediction residual signal, the real-valued first combination signal is multiplied by the real part of the complex prediction information, and the estimated imaginary part of the first combination signal is multiplied by the imaginary part of the complex prediction information.

Description

DECODING A STEREO AUDIO SIGNAL USING COMPLEX PREDICTION

The present invention relates to audio processing and, in particular, to multi-channel audio processing of a multi-channel signal having two or more channel signals.

In the field of multi-channel or stereo processing, it is known to apply so-called mid/side (M/S) stereo coding. In this concept, the left or first audio channel signal and the right or second audio channel signal are combined to obtain a mid or mono signal M. In addition, a difference between the left or first channel signal and the right or second channel signal is formed to obtain the side signal S. Because the side signal becomes quite small when the left and right signals are similar to each other, mid/side coding yields a significant coding gain. In general, the coding gain of a quantizer/entropy-coder stage increases as the range of values to be quantized and entropy-coded becomes smaller. Hence, for a PCM (pulse code modulation), Huffman-based, or arithmetic entropy coder, the coding gain increases as the side signal becomes smaller.

There are, however, situations in which mid/side coding does not yield a coding gain. This can happen when the signals in the two channels are phase-shifted relative to each other, for example by 90°. The mid signal and the side signal can then lie in a quite similar range, so that coding them with the entropy coder yields no coding gain and can even increase the bit rate. Therefore, frequency-selective mid/side coding can be applied in order to deactivate mid/side coding in bands where, for example, the side signal does not become smaller to a certain degree than the original left signal.
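The effect described above can be checked numerically. The following is a minimal sketch, assuming the common convention M = (L + R)/2 and S = (L − R)/2 (the text itself only speaks of a combination and a difference); the test signals and all names are illustrative and not taken from the patent.

```python
import math

def mid_side(left, right):
    """Return (mid, side) with the assumed convention M=(L+R)/2, S=(L-R)/2."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def energy(samples):
    """Sum of squared sample values."""
    return sum(v * v for v in samples)

n = 1024
phase = [2 * math.pi * 8 * i / n for i in range(n)]  # 8 full periods
left = [math.sin(p) for p in phase]
nearly_equal = [0.9 * math.sin(p) for p in phase]    # right channel similar to left
shifted_90 = [math.cos(p) for p in phase]            # right channel shifted by 90 degrees

m1, s1 = mid_side(left, nearly_equal)
m2, s2 = mid_side(left, shifted_90)

# Side-to-mid energy ratio: small means the side signal is cheap to code.
ratio_similar = energy(s1) / energy(m1)
ratio_shifted = energy(s2) / energy(m2)
print(ratio_similar, ratio_shifted)
```

For the similar channels the side signal carries a small fraction of the mid energy, whereas for the 90° case mid and side have essentially the same energy, which is the situation in which plain mid/side coding gives no gain.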

Although the side signal becomes zero when the left and right signals are identical, which gives the maximum coding gain because the side signal is eliminated, the situation changes again when the mid signal and the side signal are identical in waveform shape and differ only in their overall amplitudes. In this case, assuming additionally that the side signal has no phase shift relative to the mid signal, the side signal increases significantly while the value range of the mid signal does not decrease much. When this occurs in a certain frequency band, mid/side coding would again be deactivated because of the lack of coding gain. Mid/side coding can be applied frequency-selectively or, alternatively, in the time domain.

There are alternative multi-channel coding techniques that do not rely on a waveform approach such as mid/side coding but on parametric processing based on certain stereophonic cues. These techniques are known as "surround coding". Here, certain cues are calculated for a plurality of frequency bands. These cues include inter-channel level differences, inter-channel coherence measures, inter-channel time differences, and/or inter-channel phase differences. These approaches start from the assumption that a multi-channel impression perceived by the listener does not necessarily depend on the detailed waveforms of the channels but on accurate, frequency-selectively provided cues or inter-channel information. This means that a renderer must take care to render multi-channel signals that reflect the cues accurately, whereas the waveforms are not of decisive importance.
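As an illustration of two of the cues named above, the sketch below computes a level difference in dB and a normalized correlation for one band. The definitions used here (energy ratio in dB, zero-lag normalized cross-correlation) are common textbook choices and are an assumption; the text does not specify how the cues are computed, and the function name `band_cues` is hypothetical.

```python
import math

def band_cues(left, right):
    """Return (level difference in dB, normalized zero-lag correlation).

    Assumes both inputs have non-zero energy; no guard for silent bands.
    """
    e_left = sum(v * v for v in left)
    e_right = sum(v * v for v in right)
    cross = sum(l * r for l, r in zip(left, right))
    level_db = 10.0 * math.log10(e_left / e_right)
    coherence = cross / math.sqrt(e_left * e_right)
    return level_db, coherence

n = 256
phase = [2 * math.pi * 4 * i / n for i in range(n)]  # 4 full periods
left = [math.sin(p) for p in phase]
quieter_copy = [0.5 * v for v in left]               # same waveform, half amplitude
quadrature = [math.cos(p) for p in phase]            # same level, 90 degrees apart

ild_copy, icc_copy = band_cues(left, quieter_copy)
ild_quad, icc_quad = band_cues(left, quadrature)
print(ild_copy, icc_copy, ild_quad, icc_quad)
```

The half-amplitude copy gives a level difference of about 6 dB with full correlation, while the quadrature pair has equal level and near-zero correlation at lag zero.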

This approach can be complex, particularly when the decoder has to apply decorrelation processing to artificially create stereo signals that are decorrelated from each other, although all these channels are derived from one and the same downmix channel. Depending on their implementation, decorrelators for this purpose are complex and can introduce artifacts, especially in transient signal portions. Furthermore, in contrast to waveform coding, the parametric coding approach is a lossy coding approach that inevitably causes a loss of information, introduced not only by the usual quantization but also by considering stereophonic cues rather than the specific waveforms. This approach allows very low bit rates but may involve quality compromises.

Recently, developments have been made in unified speech and audio coding (USAC), as illustrated in Figure 7. A core decoder 700 decodes the encoded stereo signal at input 701, which may be mid/side encoded. The core decoder outputs a mid signal on line 702 and a side or residual signal on line 703. Both signals are transformed into a QMF domain by QMF (quadrature mirror filter) filter banks 704 and 705. An MPEG Surround decoder 706 is then applied to generate a left channel signal 707 and a right channel signal 708. These low-band signals are subsequently passed to a spectral band replication (SBR) decoder 709, which produces wide-band left and right signals on lines 710 and 711; these are then transformed into the time domain by QMF synthesis filter banks 712 and 713 to obtain the wide-band left and right signals L, R.

Figure 7 shows the case in which the MPEG Surround decoder 706 performs mid/side decoding. Alternatively, the MPEG Surround decoder block 706 can perform stereophonic-cue-based parametric decoding to generate stereo signals from a single mono core decoder signal.

The MPEG Surround decoder 706 can also generate a plurality of low-band output signals to be input into the SBR decoder block 709 using parametric information such as inter-channel level differences, inter-channel coherence measures, or other such inter-channel information parameters.

When the MPEG Surround decoder block 706 performs the mid/side decoding shown in Figure 7b, a real-valued gain factor g can be applied; DMX/RES and L/R are the downmix/residual and left/right signals, respectively, represented in the complex hybrid QMF domain.

Using a combination of block 706 and block 709 causes only a small increase in computational complexity compared with a stereo decoder used as a basis, because the complex QMF representation of the signal is already available as part of the SBR decoder. In a non-SBR configuration, however, QMF-based stereo coding as described in the context of USAC would significantly increase computational complexity because of the necessary QMF banks, which in this example would require 64-band analysis banks and 64-band synthesis banks. These filter banks would have to be added solely for the purpose of stereo coding.

In the MPEG USAC system under development, however, there are also coding modes at high bit rates in which SBR is typically not used.

The following two MPEG USAC documents give examples of multi-channel audio coding/decoding schemes in which a difference signal is predicted from a downmix/sum/mono signal by means of a complex-valued prediction coefficient:

HEIKO PURNHAGEN ET AL.: "Technical description of proposed Unified Stereo Coding in USAC", 90th MPEG MEETING; 26- -23);

MAX NEUENDORF (EDITOR): "WD5 of USAC", 90th MPEG MEETING; 12-08), pages 1-146.

A further document describes methods and devices for stereo encoding and decoding using complex prediction. In one embodiment, a decoding method for obtaining an output stereo signal from an input stereo signal encoded by complex prediction coding and comprising first frequency-domain representations of two input channels includes the following upmixing steps: (i) computing a second frequency-domain representation of a first input channel; and (ii) computing an output signal based on the first and second frequency-domain representations of the first input channel, the first frequency-domain representation of the second input channel, and a complex prediction coefficient. The upmixing can be suspended in response to control data.

Another document describes a parametric stereo upmix apparatus that generates a left signal and a right signal from a mono downmix signal. Said parametric stereo upmix apparatus is characterized in that it comprises means for predicting a difference signal, comprising a difference between the left signal and the right signal, based on the mono downmix signal scaled with a prediction coefficient. Said coefficient is derived from spatial parameters. Said stereo upmix apparatus further comprises arithmetic means for deriving the left signal and the right signal based on a sum and a difference of the mono downmix signal and said difference signal.

It is an object of the present invention to provide an improved audio processing concept that yields a high coding gain on the one hand and good audio quality and/or reduced computational complexity on the other.

This object is achieved by an MPEG unified speech and audio decoder according to claim 1, an MPEG unified speech and audio decoding method according to claim 12, or a computer program according to claim 13.

The present invention is based on the finding that the coding gain of a high-quality waveform coding approach can be increased significantly by predicting a second combination signal from a first combination signal, where both combination signals are derived from the original channel signals using a combination rule such as the mid/side combination rule. Because the inventive prediction is a waveform-based coding approach rather than a parameter-based stereo or multi-channel coding approach, it has been found that this prediction information can be calculated by a predictor in the audio encoder so that an optimization target is fulfilled; it incurs only a small overhead but significantly reduces the bit rate required for the side signal without any loss in audio quality.

To reduce computational complexity, it is preferred to perform frequency-domain coding, in which the prediction information is derived from frequency-domain input data in a band-selective manner. The transform algorithm for converting the time-domain representation into a spectral representation is preferably a critically sampled process such as the modified discrete cosine transform (MDCT) or the modified discrete sine transform (MDST). These differ from a complex transform in that only real values or only imaginary values are calculated, whereas a complex transform calculates both the real and the imaginary values of a spectrum, which results in 2-times oversampling.
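One way to make the idea of an optimization target concrete is a least-squares fit per band. The sketch below is an assumption, not the patent's stated procedure: it takes minimum residual energy as the target and solves the 2x2 normal equations for the real and imaginary prediction weights. The function name `optimal_prediction` and the input spectra are hypothetical.

```python
def optimal_prediction(mid_re, mid_im, side):
    """Least-squares (alpha_re, alpha_im) for one band.

    Minimizes sum((side - alpha_re*mid_re - alpha_im*mid_im)**2), where
    mid_re is the real-valued mid spectrum and mid_im an (estimated)
    imaginary counterpart. Minimum residual energy is an assumed target.
    """
    a = sum(x * x for x in mid_re)
    b = sum(x * y for x, y in zip(mid_re, mid_im))
    c = sum(y * y for y in mid_im)
    p = sum(x * s for x, s in zip(mid_re, side))
    q = sum(y * s for y, s in zip(mid_im, side))
    det = a * c - b * b
    if abs(det) < 1e-12:
        # Degenerate band: fall back to a real-only coefficient.
        return (p / a if a > 0.0 else 0.0), 0.0
    return (p * c - q * b) / det, (q * a - p * b) / det

# Illustrative band in which the side signal is exactly predictable.
mid_re = [1.0, 2.0, -1.0, 0.5]
mid_im = [0.5, -1.0, 2.0, 1.0]
side = [0.3 * x - 0.7 * y for x, y in zip(mid_re, mid_im)]

alpha_re, alpha_im = optimal_prediction(mid_re, mid_im, side)
print(alpha_re, alpha_im)
```

In this constructed case the fit recovers the weights used to build the side signal, so the residual that would remain to be coded is zero; for real signals the residual is merely reduced.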

Preferably, a transform based on aliasing introduction and cancellation is used. The MDCT in particular is such a transform and allows cross-fading between subsequent blocks without any overhead, owing to the well-known time-domain aliasing cancellation (TDAC) property that is obtained by overlap-add processing on the decoder side.
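The TDAC property can be demonstrated with a direct (slow, O(N²)) textbook MDCT. This is a minimal sketch under stated assumptions: a sine window, 50 % overlap, and a 2/N scaling on the inverse transform; none of these details is specified in the text, and the function names are illustrative.

```python
import math

def mdct(frame):
    """Forward MDCT: 2N real samples in, N real coefficients out."""
    n_half = len(frame) // 2
    n0 = 0.5 + n_half / 2
    return [sum(frame[n] * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
                for n in range(2 * n_half))
            for k in range(n_half)]

def imdct(coeffs):
    """Inverse MDCT: N coefficients in, 2N time-aliased samples out."""
    n_half = len(coeffs)
    n0 = 0.5 + n_half / 2
    return [2.0 / n_half *
            sum(coeffs[k] * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
                for k in range(n_half))
            for n in range(2 * n_half)]

def sine_window(length):
    """Sine window; satisfies w[n]^2 + w[n + length/2]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def analysis_synthesis(signal, n_half):
    """Windowed MDCT, inverse MDCT and overlap-add with hop size n_half.

    len(signal) is assumed to be a multiple of n_half.
    """
    window = sine_window(2 * n_half)
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_half + 1, n_half):
        frame = [padded[start + n] * window[n] for n in range(2 * n_half)]
        coeffs = mdct(frame)      # N coefficients per N new samples
        recon = imdct(coeffs)     # contains time-domain aliasing
        for n in range(2 * n_half):
            out[start + n] += recon[n] * window[n]  # overlap-add cancels it
    return out[n_half:-n_half]

N = 8
signal = [math.sin(0.3 * i) + 0.05 * i for i in range(4 * N)]
restored = analysis_synthesis(signal, N)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
print(max_error)
```

Each frame of 2N samples produces only N coefficients (critical sampling), yet the overlap-added output matches the input to within floating-point rounding.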

Preferably, the prediction information calculated in the encoder, transmitted to the decoder, and used in the decoder comprises an imaginary part, which can advantageously reflect phase differences between the two audio channels of arbitrarily selected amounts between 0° and 360°. Computational complexity is reduced significantly when only a real-valued transform is applied or, in general, a transform that provides either a real spectrum only or an imaginary spectrum only. To make use of this imaginary prediction information, which indicates a phase shift between a certain band of the left signal and the corresponding band of the right signal, a real-to-imaginary converter or, depending on the implementation of the transform, an imaginary-to-real converter is provided in the decoder. It calculates, from the first combination signal, a prediction signal that is phase-rotated with respect to the original combination signal. This phase-rotated prediction signal can then be combined with the prediction residual signal transmitted in the bit stream to regenerate a side signal, which can finally be combined with the mid signal to obtain the decoded left channel and the decoded right channel in a certain band.
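The decoder-side steps in this paragraph can be sketched for one band as follows. The sign convention (prediction added to the residual), the inverse mid/side rule L = M + S, R = M − S, and the availability of an estimated imaginary spectrum `mid_im_est` from some real-to-imaginary converter are assumptions made for illustration; the function name is hypothetical.

```python
def decode_band(mid_re, mid_im_est, residual, alpha_re, alpha_im):
    """Rebuild left/right spectral values for one band.

    mid_re      decoded real-valued first combination (mid) signal
    mid_im_est  estimated imaginary part of the mid spectrum (assumed given)
    residual    decoded prediction residual signal
    alpha_re, alpha_im  real and imaginary parts of the prediction information
    """
    left, right = [], []
    for m, m_im, d in zip(mid_re, mid_im_est, residual):
        prediction = alpha_re * m + alpha_im * m_im  # phase-rotated prediction
        side = d + prediction                        # regenerated side signal
        left.append(m + side)                        # assumed inverse M/S rule
        right.append(m - side)
    return left, right

left, right = decode_band(
    mid_re=[1.0, 2.0],
    mid_im_est=[0.5, -1.0],
    residual=[0.1, 0.2],
    alpha_re=0.5,
    alpha_im=0.25,
)
print(left, right)
```

With `alpha_im` set to zero the same function reduces to real-valued prediction, which cannot account for a phase shift between the channels.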

To improve audio quality, the same real-to-imaginary or imaginary-to-real converter that is applied on the decoder side is also applied on the encoder side when the prediction residual signal is calculated in the encoder.
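A small numeric sketch shows why using the same converter on both sides matters. It ignores quantization and uses made-up spectra; `mid_im_exact` stands for a hypothetical exact imaginary spectrum that only the encoder could compute, and `mid_im_est` for the estimate that the decoder can reproduce. All names and values are illustrative assumptions.

```python
def predict(mid_re, mid_im, alpha_re, alpha_im):
    """Prediction of the side signal from the mid spectrum."""
    return [alpha_re * m + alpha_im * mi for m, mi in zip(mid_re, mid_im)]

def subtract(a, b):
    return [x - y for x, y in zip(a, b)]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

mid_re = [1.0, -0.5, 0.25, 2.0]
mid_im_exact = [0.4, 0.8, -1.0, 0.1]  # hypothetical exact imaginary part
mid_im_est = [0.5, 0.7, -0.9, 0.0]    # hypothetical decoder-reproducible estimate
side = [0.3, 0.1, -0.2, 0.6]
alpha_re, alpha_im = 0.5, 0.25

decoder_prediction = predict(mid_re, mid_im_est, alpha_re, alpha_im)

# Matched: the encoder forms the residual with the same estimate as the decoder.
residual_matched = subtract(side, predict(mid_re, mid_im_est, alpha_re, alpha_im))
side_matched = add(residual_matched, decoder_prediction)

# Mismatched: the encoder uses the exact imaginary part instead.
residual_mismatched = subtract(side, predict(mid_re, mid_im_exact, alpha_re, alpha_im))
side_mismatched = add(residual_mismatched, decoder_prediction)

err_matched = max(abs(a - b) for a, b in zip(side, side_matched))
err_mismatched = max(abs(a - b) for a, b in zip(side, side_mismatched))
print(err_matched, err_mismatched)
```

In the matched case the side signal is recovered up to rounding, while in the mismatched case the error equals `alpha_im` times the estimation error of the imaginary part.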

The present invention is advantageous in that it provides improved audio quality compared with systems having the same bit rate, or a reduced bit rate compared with systems having the same audio quality.

Advantages are also obtained with respect to the computational efficiency of unified stereo coding, which is useful in the MPEG USAC system at high bit rates, where SBR is typically not used. Instead of processing the signal in the complex hybrid QMF domain, these approaches implement residual-based predictive stereo coding in the native MDCT domain of the underlying stereo transform coder. Examples comprise an apparatus or a method for generating a stereo signal by complex prediction in the MDCT domain, where the complex prediction is performed in the MDCT domain using a real-to-complex transform. This stereo signal can be an encoded stereo signal on the encoder side or, when the apparatus or method for generating the stereo signal is applied on the decoder side, a decoded/transmitted stereo signal.

Preferred embodiments of the present invention are described below with reference to the accompanying figures, in which:

Figure 1 is a diagram of an example of an audio decoder;
Figure 2 is a block diagram of an example of an audio encoder;
Figure 3a shows an implementation of the encoder calculator of Figure 2;
Figure 3b shows an alternative implementation of the encoder calculator of Figure 2;
Figure 3c shows a mid/side combination rule to be applied on the encoder side;
Figure 4a shows an implementation of the decoder calculator of Figure 1;
Figure 4b shows an alternative implementation of the decoder calculator in the form of a matrix calculator;
Figure 4c shows a mid/side inverse combination rule corresponding to the combination rule shown in Figure 3c;
Figure 5a shows an example of an audio encoder operating in the frequency domain, which is preferably a real-valued frequency domain;
Figure 5b shows an implementation of an audio decoder operating in the frequency domain;
Figure 6a shows an alternative implementation of an audio encoder operating in the MDCT domain and using a real-to-imaginary transform;
Figure 6b shows an audio decoder operating in the MDCT domain and using a real-to-imaginary transform according to a preferred embodiment of the invention;
Figure 7a shows an audio postprocessor using a stereo decoder and a subsequently connected SBR decoder;
Figure 7b shows a mid/side upmix matrix;
Figure 8a shows a detailed view of the MDCT block in Figure 6a;
Figure 8b shows a detailed view of the inverse MDCT (MDCT⁻¹) block in Figure 6b;
Figure 9a shows an implementation of an optimizer operating at reduced resolution with respect to the MDCT output;
Figure 9b shows a representation of an MDCT spectrum and the corresponding lower-resolution bands in which the prediction information is calculated;
Figure 10a shows an implementation of the real-to-imaginary converter in Figure 6a or 6b; and
Figure 10b shows a possible implementation of the imaginary spectrum calculator in Figure 10a.

Figure 1 shows an audio decoder for decoding an encoded multi-channel audio signal obtained on an input line 100. The encoded multi-channel audio signal comprises an encoded first combination signal, an encoded prediction residual signal, and prediction information; the first combination signal is generated using a combination rule for combining a first channel signal and a second channel signal representing the multi-channel audio signal. The encoded multi-channel signal may be a data stream, such as a bitstream, that carries the three components in multiplexed form. Additional side information may be included in the encoded multi-channel signal on line 100. The signal is input to an input interface 102.

The input interface 102 may be implemented as a data stream demultiplexer that outputs the encoded first combination signal on line 104, the encoded residual signal on line 106, and the prediction information on line 108. The prediction information is preferably a factor having a real part that is not equal to zero and/or an imaginary part that is different from zero. The encoded combination signal and the encoded residual signal are input to a signal decoder 110, which decodes the first combination signal to obtain a decoded first combination signal on line 112. In addition, the signal decoder 110 is configured to decode the encoded residual signal to obtain a decoded residual signal on line 114. Depending on the encoding performed on the audio encoder side, the signal decoder may include an entropy decoder, such as a Huffman decoder, an arithmetic decoder, or any other entropy decoder, followed by a dequantization stage that performs a dequantization operation matching the quantizer operation in the associated audio encoder. The signals on lines 112 and 114 are input to a decoder calculator 115, which outputs the first channel signal on line 117 and the second channel signal on line 118, where these two signals are a stereo signal or two channels of a multi-channel audio signal. If, for example, the multi-channel audio signal has five channels, the two signals are two channels of that multi-channel signal. To fully decode such a five-channel signal, two decoders as shown in Figure 1 can be applied, where the first decoder processes the left channel and the right channel and the second decoder processes the left surround channel and the right surround channel, and a third, mono decoder can be used for mono coding of the center channel.

However, other groupings and combinations of waveform coders and parametric coders can also be applied. An alternative way to generalize the prediction scheme to more than two channels is to process three (or more) signals at the same time, i.e., to predict a third combination signal from a first and a second signal using two prediction coefficients, very similar to the "two-to-three" approach in MPEG Surround.

The decoder calculator 116 is configured to calculate a decoded multi-channel signal having a decoded first channel signal 117 and a decoded second channel signal 118 using the decoded residual signal 114, the prediction information 108, and the decoded first combination signal 112. In particular, the decoder calculator 116 operates such that the decoded first channel signal and the decoded second channel signal are at least an approximation of a first channel signal and a second channel signal of the multi-channel signal input to a corresponding encoder, which are combined by the combination rule when the first combination signal and the prediction residual signal are generated. Specifically, the prediction information on line 108 comprises a real-valued part different from zero and/or an imaginary part different from zero.

The decoder calculator 116 can be implemented in different ways.

A first implementation is shown in Figure 4a. This implementation comprises a predictor 1160, a combination signal calculator 1161, and a combiner 1162. The predictor receives the decoded first combination signal 112 and the prediction information 108 and outputs a prediction signal 1163. Specifically, the predictor 1160 is configured to apply the prediction information 108 to the decoded first combination signal 112 or to a signal derived from the decoded first combination signal. The derivation rule for deriving the signal to which the prediction information 108 is applied may be a real-to-imaginary transform or, equivalently, an imaginary-to-real transform, or a weighting operation, or, depending on the implementation, a phase shift operation or a combined weighting/phase shift operation. The prediction signal 1163 is input to the combination signal calculator 1161 together with the decoded residual signal in order to calculate the decoded second combination signal 1165. Both signals 112 and 1165 are input to the combiner 1162, which combines the decoded first combination signal and the second combination signal to obtain the decoded multi-channel audio signal having the decoded first channel signal and the decoded second channel signal on lines 1166 and 1167, respectively.
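The signal flow of Figure 4a can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation of the specification: the prediction information is taken to be a single real factor `alpha`, the combination signal calculator 1161 is assumed to add the prediction signal to the residual, and the combiner 1162 is assumed to apply the inverse mid/side rule L = M + S, R = M - S. All function names are hypothetical.

```python
def predictor(mid, alpha):
    """Sketch of block 1160: scale the decoded first combination signal by alpha."""
    return [alpha * m for m in mid]


def combination_signal_calculator(residual, prediction):
    """Sketch of block 1161: add the prediction signal to the decoded residual."""
    return [d + p for d, p in zip(residual, prediction)]


def combiner(mid, side):
    """Sketch of block 1162: assumed inverse mid/side rule L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def decode_stereo(mid, residual, alpha):
    """Chain the three blocks: predictor -> calculator -> combiner."""
    prediction = predictor(mid, alpha)
    side = combination_signal_calculator(residual, prediction)
    return combiner(mid, side)
```

Each function mirrors one block of the figure, so a different derivation rule (for example a real-to-imaginary transform ahead of the multiplication) would only change `predictor`.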

Alternatively, the decoder calculator is implemented as a matrix calculator 1168 that receives as inputs the decoded first combination signal, or signal M, the decoded residual signal, or signal D, and the prediction information 108. The matrix calculator 1168 applies a transform matrix, indicated at 1169, to the signals M, D to obtain the output signals L, R, where L is the decoded first channel signal and R is the decoded second channel signal. The notation in Figure 4b resembles a stereo notation with a left channel L and a right channel R. This notation is used for ease of understanding, but those skilled in the art will appreciate that the signals L, R can be any combination of two channel signals within a multi-channel signal having more than two channel signals. The matrix operation merges the operations of the circuit in Figure 4a into a kind of "single-shot" matrix calculation, where the inputs to the circuit in Figure 4a and its outputs are identical to the inputs to the matrix calculator 1168 and its outputs.
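As an illustration of how the separate steps collapse into one matrix, the following is a worked form under stated assumptions: a real-valued prediction factor (written α here), a side signal reconstructed as S = D + αM, and the inverse mid/side rule L = M + S, R = M - S. The matrix 1169 of Figure 4b is not reproduced in this text, so this derivation is only a sketch of what such a matrix could look like.

```latex
\begin{aligned}
L &= M + S = M + (D + \alpha M) = (1+\alpha)\,M + D,\\
R &= M - S = M - (D + \alpha M) = (1-\alpha)\,M - D,\\[4pt]
\begin{pmatrix} L \\ R \end{pmatrix}
&=
\begin{pmatrix} 1+\alpha & 1 \\ 1-\alpha & -1 \end{pmatrix}
\begin{pmatrix} M \\ D \end{pmatrix}.
\end{aligned}
```

With a complex-valued factor, a signal derived from M (for example its estimated imaginary part) would enter as an additional input, so the 2x2 form above covers the real-valued case only.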

Figure 4c shows an example of an inverse combination rule applied by the combiner 1162 in Figure 4a. In particular, this combination rule is similar to the decoder-side combination rule in well-known mid/side coding, where L = M + S and R = M - S. Note that the signal S used by the inverse combination rule in Figure 4c is the signal calculated by the combination signal calculator, i.e., the combination of the prediction signal on line 1163 and the decoded residual signal on line 114. Note also that, in this specification, signals on lines are sometimes named by the reference numerals of the lines and are sometimes indicated by the reference numerals themselves, which have been attributed to the lines. The notation is therefore such that a line carrying a certain signal denotes the signal itself. A line can be a physical line in a hardwired implementation. In a computerized implementation, however, there is no physical line; instead, the signal represented by the line is passed from one calculation module to another.

Figure 2 shows an audio encoder for encoding a multi-channel audio signal 200 having two or more channel signals, where the first channel signal is shown at 201 and the second channel signal at 202. Both signals are input to an encoder calculator 203, which calculates a first combination signal 204 and a prediction residual signal 205 using the first channel signal 201, the second channel signal 202, and the prediction information 206. The calculation is such that the prediction residual signal 205, when combined with a prediction signal derived from the first combination signal 204 and the prediction information 206, results in a second combination signal, where the first combination signal and the second combination signal are derivable from the first channel signal 201 and the second channel signal 202 using a combination rule. The prediction information is generated by an optimizer 207, which calculates the prediction information 206 so that the prediction residual signal fulfills an optimization target 208.

The first combination signal 204 and the residual signal 205 are input to a signal encoder 209, which encodes the first combination signal 204 to obtain an encoded first combination signal 210 and encodes the residual signal 205 to obtain an encoded residual signal 211. Both encoded signals 210, 211 are input to an output interface 212, which combines the encoded first combination signal 210 with the encoded prediction residual signal 211 and the prediction information 206 to obtain an encoded multi-channel signal 213 that is similar to the encoded multi-channel signal 100 input to the input interface 102 of the audio decoder shown in Figure 1.

Depending on the implementation, the optimizer 207 receives either the first channel signal 201 and the second channel signal 202 or, as indicated by lines 214 and 215, the first combination signal 214 and the second combination signal 215 derived from the combiner 2031 of Figure 3a, which is described later.

Figure 2 shows a preferred optimization target in which the coding gain is maximized, i.e., the bit rate is reduced as much as possible. For this optimization target, the residual signal D is minimized with respect to α. In other words, the prediction information α is chosen so that $\|S - \alpha M\|^2$ is minimized. This results in the solution for α shown in Figure 2. The signals S and M are given block-wise and are preferably spectral-domain signals; the notation $\|\cdot\|$ denotes the 2-norm of its argument, and $\langle\cdot,\cdot\rangle$ denotes the usual inner product.
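For a real-valued prediction factor (written α here), minimizing the squared 2-norm of S - αM over one block is an ordinary least-squares problem whose textbook solution is α = <S, M> / <M, M>. The sketch below computes that value in Python. It is an illustration under these assumptions only: the closed-form expression of Figure 2 is not reproduced in this text, and the function names are hypothetical.

```python
def estimate_alpha(side, mid):
    """Real-valued least-squares factor minimizing sum((s - alpha * m) ** 2)."""
    energy = sum(m * m for m in mid)
    if energy == 0.0:
        return 0.0  # silent mid block: nothing to predict from
    return sum(s * m for s, m in zip(side, mid)) / energy


def residual_energy(side, mid, alpha):
    """Squared 2-norm of the prediction residual D = S - alpha * M."""
    return sum((s - alpha * m) ** 2 for s, m in zip(side, mid))
```

Because the cost is a parabola in α, any other value of the factor gives a residual energy at least as large as the one obtained with `estimate_alpha`.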

When the first channel signal 201 and the second channel signal 202 are input to the optimizer 207, the optimizer may apply the combination rule (an example combination rule is shown in Figure 3c). However, when the first combination signal 214 and the second combination signal 215 are input to the optimizer 207, the optimizer 207 does not need to apply the combination rule itself.

Other optimization targets may relate to perceptual quality. The optimization target may be to obtain the maximum perceptual quality. In that case, the optimizer would require additional information from a perceptual model. Other implementations of the optimization target may relate to obtaining a minimum or a fixed bit rate. The optimizer 207 would then be implemented to perform a quantization/entropy-coding operation in order to determine the bit rate required for certain values of α, so that α can be set to fulfill requirements such as a minimum bit rate or, alternatively, a fixed bit rate.
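A bit-rate-driven choice of the prediction factor could be sketched as a search over candidate values: quantize the residual for each candidate, estimate its cost, and keep the cheapest. Everything specific in the following Python sketch is an assumption made for illustration: the uniform quantizer, the candidate grid, and in particular `bit_cost_proxy`, which is a hypothetical stand-in for a real entropy coder and not taken from the specification.

```python
import math


def quantize(values, step):
    """Uniform scalar quantizer (assumed): map each value to an integer index."""
    return [round(v / step) for v in values]


def bit_cost_proxy(indices):
    """Hypothetical stand-in for an entropy coder: larger indices cost more bits."""
    return sum(1.0 + 2.0 * math.log2(1 + abs(i)) for i in indices)


def search_alpha(side, mid, candidates, step):
    """Return the candidate factor whose quantized residual is cheapest."""
    best_alpha, best_cost = None, float("inf")
    for alpha in candidates:
        residual = [s - alpha * m for s, m in zip(side, mid)]
        cost = bit_cost_proxy(quantize(residual, step))
        if cost < best_cost:
            best_alpha, best_cost = alpha, cost
    return best_alpha, best_cost
```

A fixed-bit-rate variant would compare each cost against a budget instead of keeping the minimum.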

Other implementations of the optimization target may relate to a minimum usage of encoder or decoder resources. If such an optimization target is implemented, information on the resources required for a certain optimization would be available in the optimizer 207. In addition, a combination of these or other optimization targets can be applied to control the optimizer 207 that calculates the prediction information 206.

The encoder calculator in Figure 2 can be implemented in different ways. A first example implementation, in which an explicit combination rule is performed in the combiner 2031, is shown in Figure 3a. An alternative example implementation, in which a matrix calculator 2039 is used, is shown in Figure 3b. The combiner 2031 in Figure 3a may be implemented to perform the combination rule shown in Figure 3c, which is an example of the well-known mid/side coding rule in which a weighting factor of 0.5 is applied to all branches. However, depending on the implementation, other weighting factors or no weighting factors at all can be used. Furthermore, other combination rules, such as other linear or non-linear combination rules, can be applied as long as a corresponding inverse combination rule exists that can be applied in the decoder combiner 1162 shown in Figure 4a, which applies a combination rule that is inverse to the one applied by the encoder. Because of the prediction, any invertible combination rule can be used: the influence on the waveform is "balanced" by the prediction, i.e., any error is contained in the transmitted residual signal, since the prediction operation performed by the optimizer 207 in combination with the encoder calculator 203 is a waveform-preserving process.
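The mid/side rule with a weighting factor of 0.5 and its inverse can be checked with a short round trip. The Python sketch below assumes M = 0.5 (L + R) and S = 0.5 (L - R) on the encoder side and L = M + S, R = M - S on the decoder side; the function names are hypothetical.

```python
def mid_side_encode(left, right):
    """Encoder-side rule with weight 0.5 on all branches (assumed form)."""
    mid = [0.5 * (l + r) for l, r in zip(left, right)]
    side = [0.5 * (l - r) for l, r in zip(left, right)]
    return mid, side


def mid_side_decode(mid, side):
    """Inverse rule on the decoder side: L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```

Choosing other weights only requires that the decoder-side function remains the exact inverse of the encoder-side one.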

The combiner 2031 outputs the first combination signal 204 and a second combination signal 2032. The first combination signal is input to a predictor 2033, and the second combination signal 2032 is input to the residual calculator 2034. The predictor 2033 calculates a prediction signal 2035, which is combined with the second combination signal 2032 to finally obtain the residual signal 205. In particular, the combiner 2031 is configured to combine the two channel signals 201 and 202 of the multi-channel audio signal in two different ways to obtain the first combination signal 204 and the second combination signal 2032; the two different ways are shown in the illustrative example in Figure 3c. The predictor 2033 is configured to apply the prediction information to the first combination signal 204, or to a signal derived from the first combination signal, to obtain the prediction signal 2035.

The signal derived from the combination signal can be derived by any linear or non-linear operation; a real-to-imaginary or imaginary-to-real transform is preferred, which can be implemented using a linear filter such as an FIR (finite impulse response) filter that performs weighted additions of certain values. The residual calculator 2034 may perform a subtraction in which the prediction signal is subtracted from the second combination signal. However, other operations in the residual calculator are also possible. Correspondingly, the combination signal calculator 1161 in Figure 4a may perform an addition in which the decoded residual signal 114 and the prediction signal 1163 are added to obtain the second combination signal 1165.

Figure 5a shows a preferred implementation of an audio encoder. Compared with the audio encoder shown in Figure 3a, the first channel signal 201 is a spectral representation of a time-domain first channel signal 55a. Correspondingly, the second channel signal 202 is a spectral representation of a time-domain channel signal 55b. The conversion from the time domain to the spectral representation is performed by a time/frequency converter 50 for the first channel signal and by a further time/frequency converter 51 for the second channel signal.

The spectral converters 50, 51 are preferably, but not necessarily, implemented as real-valued converters.

The transform algorithm can be a discrete cosine transform, an FFT (fast Fourier transform) of which only the real part is used, an MDCT, or any other transform that provides real-valued spectral values. Alternatively, both transforms can be implemented as an imaginary transform, such as a DST (discrete sine transform), an MDST, or an FFT of which only the imaginary part is used and the real part is discarded. Any other transform that provides only imaginary values can be used as well. One purpose of using a purely real-valued or a purely imaginary transform is computational complexity, since only a single value has to be processed for each spectral value, such as the magnitude or the real part or, alternatively, the phase or the imaginary part.

By contrast, with a fully complex transform such as an FFT, two values, i.e., the real part and the imaginary part, would have to be processed for each spectral line, which increases the computational complexity by a factor of at least 2. Another reason for using a real-valued transform here is that such a transform is usually critically sampled and therefore provides a suitable (and commonly used) domain for signal quantization and entropy coding (the standard "perceptual audio coding" paradigm implemented in "MP3", AAC, or similar audio coding systems).
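Critical sampling can be made concrete with the textbook MDCT definition: a frame of 2N time samples yields N real coefficients, and with 50% frame overlap each hop of N new samples therefore produces exactly N values to quantize. The Python sketch below is a direct O(N²) evaluation of that formula for illustration only; windowing, overlap-add, and fast algorithms are omitted, and the function name is hypothetical.

```python
import math


def mdct(frame):
    """Direct MDCT of a frame of 2N samples, returning N real coefficients."""
    two_n = len(frame)
    n_half = two_n // 2            # N: number of output coefficients
    n0 = 0.5 + n_half / 2          # phase offset of the standard MDCT kernel
    return [
        sum(
            frame[n] * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
            for n in range(two_n)
        )
        for k in range(n_half)
    ]
```

A complex transform of the same frame would carry a real and an imaginary value per spectral line, which is the doubling of processed values discussed above.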

Figure 5a also shows the residual calculator 2034 as an adder that receives the side signal at its "plus" input and the prediction signal output by the predictor 2033 at its "minus" input. In addition, Figure 5a shows the case in which the predictor control information is forwarded from the optimizer to the multiplexer 212, which outputs a multiplexed bitstream representing the encoded multi-channel audio signal. In particular, the prediction is performed such that the side signal is predicted from the mid signal, as illustrated by the equations to the right of Figure 5a.

Preferably, the predictor control information 206 is a factor, as illustrated to the right of Figure 3b. In an example in which the prediction control information comprises a real portion, such as the real part of a complex-valued α or the magnitude of a complex-valued α, and this portion corresponds to a factor different from zero, a significant coding gain can be obtained when the mid signal and the side signal are similar in waveform structure but have different amplitudes.

By contrast, when the prediction control information comprises only a second portion, which can be the imaginary part of a complex-valued factor or the phase information of the complex-valued factor, and the imaginary part or the phase information is different from zero, the present invention achieves a significant coding gain for signals that are phase-shifted relative to each other by a value other than 0° or 180° and that, apart from the phase shift, have similar waveform characteristics and similar amplitude relations.

Preferably, the prediction control information is complex-valued. A significant coding gain can then be obtained for signals that differ in amplitude and are phase-shifted. When the time/frequency transforms provide complex spectra, the operation 2034 is a complex operation in which the real part of the predictor control information is applied to the real part of the complex spectrum M and the imaginary part of the complex prediction information is applied to the imaginary part of the complex spectrum. In the adder 2034, the result of this prediction operation is then a predicted real spectrum and a predicted imaginary spectrum; the predicted real spectrum is subtracted (band-wise) from the real spectrum of the side signal S, and the predicted imaginary spectrum is subtracted from the imaginary part of the spectrum of S, to obtain a complex residual spectrum D.
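The benefit of a complex-valued prediction factor can be illustrated with a short Python sketch. It is not the implementation disclosed here: the names `prediction_residual`, `mid`, `side`, and `true_factor` are hypothetical, the prediction factor is written α in the comments, and the prediction is modelled as an ordinary complex multiplication per spectral line, which is an assumption about how the real and imaginary parts interact.

```python
import cmath
import math


def prediction_residual(mid, side, alpha):
    """Return D[k] = S[k] - alpha * M[k] for each complex spectral line k."""
    return [s - alpha * m for m, s in zip(mid, side)]


def energy(spectrum):
    """Sum of squared magnitudes of all spectral lines."""
    return sum(abs(x) ** 2 for x in spectrum)


# Hypothetical complex mid spectrum (three spectral lines).
mid = [1 + 2j, -0.5 + 0.25j, 3 - 1j]

# Side spectrum: same waveform, half the amplitude, shifted by 60 degrees.
true_factor = 0.5 * cmath.exp(1j * math.pi / 3)
side = [true_factor * m for m in mid]

# Using only the real part of the factor leaves the phase shift uncompensated.
residual_real = prediction_residual(mid, side, true_factor.real)
# The complex factor compensates both the amplitude and the phase difference.
residual_complex = prediction_residual(mid, side, true_factor)

print(energy(side), energy(residual_real), energy(residual_complex))
```

In this toy case the residual obtained with the complex factor carries less energy than the one obtained with the real part alone, which in turn carries less energy than the side signal itself.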

The time-domain signals L and R are real-valued, but the frequency-domain signals can be real-valued or complex-valued. When the frequency-domain signals are real-valued, the transform is a real-valued transform; when they are complex-valued, the transform is a complex-valued transform. This means that the input to the time-to-frequency transforms and the output of the frequency-to-time transforms are real-valued, while the frequency-domain signals could, for example, be complex-valued QMF-domain signals.

Figure 5b shows an audio decoder corresponding to the audio encoder shown in Figure 5a. Elements similar to those of the audio decoder in Figure 1 have similar reference numerals.

The bitstream output by the bitstream multiplexer 212 in Figure 5a is input into a bitstream demultiplexer 102 in Figure 5b. The bitstream demultiplexer 102 demultiplexes the bitstream into the downmix signal M and the residual signal D. The downmix signal M is input into a dequantizer 110a, and the residual signal D is input into a further dequantizer 110b. In addition, the bitstream demultiplexer 102 demultiplexes the predictor control information 108 from the bitstream and inputs it into the predictor 1160. The predictor 1160 outputs a predicted side signal α · M, and the combiner 1161 combines the residual signal output by the dequantizer 110b with the predicted side signal to finally obtain the reconstructed side signal S.

The reconstructed side signal is then input into the combiner 1162, which performs, for example, a sum/difference processing, as illustrated in Figure 4c with respect to mid/side coding. In particular, block 1162 performs an (inverse) mid/side decoding to obtain a frequency-domain representation of the left channel and a frequency-domain representation of the right channel. The frequency-domain representations are then converted into time-domain representations by the corresponding frequency/time converters 52 and 53.
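The decoder chain described above (predictor 1160, combiner 1161, combiner 1162) can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the function names are hypothetical, the prediction factor is a single value per call, dequantization is omitted, and the encoder-side scaling M = (L + R) / 2, S = (L - R) / 2 is assumed so that the sum/difference L = M + S, R = M - S inverts it.

```python
def encode_mid_side(left, right, alpha):
    """Hypothetical encoder counterpart, used only to produce test input."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    # Prediction residual: D = S - alpha * M.
    residual = [s - alpha * m for m, s in zip(mid, side)]
    return mid, residual


def decode_left_right(mid, residual, alpha):
    """Rebuild S = D + alpha * M, then L = M + S and R = M - S."""
    side = [d + alpha * m for m, d in zip(mid, residual)]
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Hypothetical spectral lines of a left and a right channel.
left_in = [1.0, 0.5, -2.0, 4.0]
right_in = [0.5, 0.25, -1.0, 2.0]

mid, residual = encode_mid_side(left_in, right_in, alpha=0.25)
left_out, right_out = decode_left_right(mid, residual, alpha=0.25)
print(left_out, right_out)
```

Without quantization, the decoder recovers the input channels whenever it uses the same prediction factor as the encoder, whatever value that factor has.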

Depending on the system implementation, and in accordance with an embodiment of the invention, the frequency/time converters 52 and 53 are real-valued frequency/time converters when the frequency-domain representation is real-valued, or complex-valued frequency/time converters when the frequency-domain representation is complex-valued.

However, to increase efficiency, it is preferable to perform a real-valued transform, as in the further embodiment shown in Figure 6a for the encoder and Figure 6b for the decoder. The real-valued transforms 50 and 51 are implemented by an MDCT. In addition, the prediction information is calculated as a complex value having a real part and an imaginary part. Since both spectra M and S are real-valued, and therefore no imaginary part of the spectrum exists, a real-to-imaginary converter 2070 is provided, which calculates an estimated imaginary spectrum 600 from the real-valued spectrum of the signal M. This estimated imaginary spectrum 600 is input, together with the real spectrum M, into the optimizer stage 2071 in order to calculate the prediction information 206, which now has a real-valued factor indicated at 2073 and an imaginary factor indicated at 2074. In this embodiment, the real-valued spectrum of the first combination signal M is multiplied by the real part α_R (2073) to obtain the prediction signal, which is then subtracted from the real-valued side spectrum.

In addition, the imaginary spectrum 600 is multiplied by the imaginary part α_I, indicated at 2074, to obtain the further prediction signal, which is then subtracted from the real-valued side spectrum, as indicated at 2034b. The prediction residual signal D is then quantized in the quantizer 209b, while the real-valued spectrum of M is quantized/encoded in block 209a. Furthermore, it is preferable to quantize and encode the prediction information α in the quantizer/entropy encoder 2072 to obtain the encoded complex α value, which is forwarded, for example, to the bitstream multiplexer 212 of Figure 5a and is finally written into the bitstream as the prediction information. Regarding the position of the quantization/coding (Q/C) module for α, note that the multipliers 2073 and 2074 preferably use the same (quantized) α that will be used in the decoder.

Hence, the Q/C module could be moved directly to the output of 2071, or the quantization of α could be considered to be already taken into account in the optimization process in 2071.
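The reason for letting the multipliers use the quantized prediction factor can be sketched as follows. The sketch assumes a uniform quantizer with a hypothetical step size `STEP`; the function names and the step value are illustrative and are not taken from this document. `mid_im_est` stands for the estimated imaginary spectrum 600.

```python
STEP = 0.1  # hypothetical quantizer step size for both parts of alpha


def quantize(value, step=STEP):
    """Map a real value to an integer quantizer index."""
    return round(value / step)


def dequantize(index, step=STEP):
    """Map a quantizer index back to a real value."""
    return index * step


def encode_residual(mid_re, mid_im_est, side_re, alpha_re, alpha_im):
    """Compute D = S - a_R * M - a_I * M_im using the *quantized* alpha."""
    indices = (quantize(alpha_re), quantize(alpha_im))
    a_re, a_im = dequantize(indices[0]), dequantize(indices[1])
    residual = [s - a_re * m - a_im * mi
                for m, mi, s in zip(mid_re, mid_im_est, side_re)]
    return residual, indices


def decode_side(mid_re, mid_im_est, residual, indices):
    """Rebuild S = D + a_R * M + a_I * M_im from the transmitted indices."""
    a_re, a_im = dequantize(indices[0]), dequantize(indices[1])
    return [d + a_re * m + a_im * mi
            for m, mi, d in zip(mid_re, mid_im_est, residual)]


mid_re = [1.0, -2.0, 0.5, 3.0]       # real-valued spectrum of M
mid_im_est = [0.2, 0.4, -1.0, 0.1]   # estimated imaginary spectrum of M
side_re = [0.4, -1.1, 0.5, 1.3]      # real-valued spectrum of S

residual, indices = encode_residual(mid_re, mid_im_est, side_re, 0.47, -0.23)
side_out = decode_side(mid_re, mid_im_est, residual, indices)
print(indices, side_out)
```

Because encoder and decoder derive the same dequantized factor from the transmitted indices, the rounding of α changes only the size of the residual, not the accuracy of the reconstructed side spectrum.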

Although a complex spectrum could be calculated on the encoder side, since all information is available there, it is preferable to perform the real-to-complex transform in block 2070 in the encoder, so that conditions matching those of the preferred decoder shown in Figure 6b are produced. The decoder receives a real-valued encoded spectrum of the first combination signal and a real-valued spectral representation of the encoded residual signal. In addition, encoded complex prediction information is obtained at 108, and entropy decoding and dequantization are performed to obtain the real part α_R, indicated at 1160b, and the imaginary part α_I, indicated at 1160c.

The mid signals output by the weighting elements 1160b and 1160c are added to the decoded and dequantized prediction residual signal. In particular, the spectral values input into the weighter 1160c, where the imaginary part of the complex prediction factor is used as the weighting factor, are derived from the real-valued spectrum M by the real-to-imaginary converter 1160a, which is preferably implemented in the same way as block 2070 on the encoder side in Figure 6a. On the decoder side, unlike on the encoder side, no complex-valued representation of the mid signal or the side signal is available. The reason is that, for bit-rate and complexity reasons, only encoded real-valued spectra are transmitted from the encoder to the decoder.

The real-to-imaginary converter 1160a, or its counterpart in Figure 6a, can be implemented as published in US Patent No. 6,980,933 or in the corresponding patents. Alternatively, any other implementation known in the art can be applied; a preferred implementation is described with reference to Figures 10a and 10b.

More specifically, as shown in Figure 10a, the real-to-imaginary converter 1160a comprises a spectral frame selector 1000 connected to an imaginary spectrum calculator 1001.

The spectral frame selector 1000 receives an indication of the current frame i at input 1002 and, depending on the implementation, control information at a control input 1003. When, for example, the indication on line 1002 states that an imaginary spectrum is to be calculated for the current frame i, and the control information 1003 states that only the current frame is to be used for this calculation, the spectral frame selector 1000 selects only the current frame i and forwards this information to the imaginary spectrum calculator. The imaginary spectrum calculator then uses only the spectral lines of the current frame i to perform a weighted combination of lines located in the current frame (block 1008) that are close in frequency to, or around, the current spectral line k for which an imaginary line is to be calculated, as indicated at 1004 in Figure 10b. When, however, the spectral frame selector 1000 receives control information stating that the preceding frame i-1 and the following frame i+1 are also to be used for calculating the imaginary spectrum, the imaginary spectrum calculator additionally receives the values from frames i-1 and i+1 and performs a weighted combination of the lines in the corresponding frames. The results of the weighting operations are combined by a weighted combination in block 1007 to finally obtain an imaginary line k for frame f_i, which is then multiplied by the imaginary part of the prediction information in element 1160c to obtain the prediction signal for this line; for the decoder, this prediction signal is then added to the corresponding line of the mid signal in the adder 1161. In the encoder, the same operation is performed, but a subtraction takes place in element 2034b.
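The frame selection and the weighted combination of neighbouring lines can be sketched as a small FIR-style estimator. This is an illustration under stated assumptions: the function name, the `taps_by_offset` layout, and all weights are hypothetical placeholders rather than the coefficients of any real codec, and the per-frame results are simply summed, which corresponds to combining them with unit weights in block 1007.

```python
def estimate_imaginary_spectrum(frames, index, taps_by_offset):
    """Estimate an imaginary spectrum for frames[index].

    taps_by_offset maps a frame offset (0 = current, -1 = previous,
    +1 = next) to an odd-length list of weights centred on line k.
    Offsets pointing outside `frames` are skipped; lines outside the
    spectrum are treated as zero.
    """
    n_lines = len(frames[index])
    estimate = [0.0] * n_lines
    for offset, taps in taps_by_offset.items():
        frame_index = index + offset
        if not 0 <= frame_index < len(frames):
            continue  # no such frame, e.g. no previous frame at the start
        frame = frames[frame_index]
        centre = len(taps) // 2
        for k in range(n_lines):
            for j, weight in enumerate(taps):
                line = k + j - centre  # neighbouring line k-1, k, k+1, ...
                if 0 <= line < n_lines:
                    estimate[k] += weight * frame[line]
    return estimate


# Two hypothetical real-valued spectra (frames i and i+1) with four lines each.
frames = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]

# Control information "current frame only": lines k-1, k, k+1 of frame i.
current_only = estimate_imaginary_spectrum(frames, 0, {0: [0.5, 0.0, -0.5]})

# Control information "also use adjacent frames": add line k of frame i+1.
with_next = estimate_imaginary_spectrum(
    frames, 0, {0: [0.5, 0.0, -0.5], 1: [0.0, 0.25, 0.0], -1: [0.0, 0.25, 0.0]}
)
print(current_only, with_next)
```

Restricting `taps_by_offset` to offsets 0 and -1 corresponds to the low-delay variant that avoids looking at future frames.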

Note that the control information 1003 can additionally indicate the use of more frames than the two surrounding frames or, for example, the use of only the current frame and one or more preceding frames, without any "future" frames, in order to reduce the systematic delay.

Furthermore, note that the stage-wise weighted combination shown in Figure 10b, in which the lines from one frame are combined in a first operation and the results of these frame-wise combinations are subsequently combined with one another, can also be performed in the other order.

The other order means that, in a first step, the lines for the current frequency k from a number of adjacent frames indicated by the control information 1003 are combined by a weighted combination. This weighted combination is performed for the lines k, k-1, k-2, k+1, k+2, and so on, depending on the number of adjacent lines used for estimating the imaginary line. The results of these "time-wise" combinations are then subjected to a weighted combination in the "frequency direction" to finally obtain the imaginary line k for frame f_i. The weights are preferably set to values between -1 and 1, and the combination can be implemented as a straightforward FIR or IIR (Infinite Impulse Response) filter combination that performs a linear combination of spectral lines or spectral signals from different frequencies and different frames.

As indicated in Figures 6a and 6b, the preferred transform algorithm is the MDCT, which is applied in the forward direction in elements 50 and 51 of Figure 6a and in the backward direction in elements 52 and 53, subsequent to the combination operation in the combiner 1162, which operates in the spectral domain.

Figure 8a shows a more detailed implementation of block 50 or 51. In particular, a sequence of time-domain audio samples is input into an analysis windower 500, which performs a windowing operation using an analysis window, frame by frame, but with a stride or overlap of 50%. The result of the analysis windower, i.e., a sequence of frames of windowed samples, is input into an MDCT transform block 501, which outputs a sequence of real-valued MDCT frames that are affected by aliasing. For example, the analysis windower applies analysis windows with a length of 2048 samples, and the MDCT transform block 501 then outputs MDCT spectra with 1024 real spectral lines or MDCT values. Preferably, the analysis windower can be controlled by a window length or transform length control 502, so that, for example, the window length/transform length is reduced for transient portions of the signal in order to obtain better coding results.
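The windowing and transform step can be sketched with a direct (non-fast) MDCT in pure Python. This is an illustration, not the disclosed implementation: the sine window, the normalization, and all function names are assumptions, and a tiny frame size (8 samples giving 4 lines instead of 2048 giving 1024) keeps the example fast. The matching inverse transform with overlap-add is included only to check that the windowed, 50%-overlapped transform is invertible.

```python
import math


def sine_window(n_lines):
    """Sine window of length 2 * n_lines (an assumed window shape)."""
    length = 2 * n_lines
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block, window):
    """Window 2N time samples and return N real-valued MDCT lines."""
    n_lines = len(block) // 2
    n0 = 0.5 + n_lines / 2
    return [
        sum(window[n] * block[n]
            * math.cos(math.pi / n_lines * (n + n0) * (k + 0.5))
            for n in range(2 * n_lines))
        for k in range(n_lines)
    ]


def imdct(lines, window):
    """Return 2N windowed, aliasing-affected time samples from N lines."""
    n_lines = len(lines)
    n0 = 0.5 + n_lines / 2
    return [
        window[n] * (2.0 / n_lines)
        * sum(lines[k] * math.cos(math.pi / n_lines * (n + n0) * (k + 0.5))
              for k in range(n_lines))
        for n in range(2 * n_lines)
    ]


def analysis_synthesis(signal, n_lines):
    """Run MDCT and inverse MDCT with 50% overlap and overlap-add.

    len(signal) is assumed to be a multiple of n_lines.
    """
    window = sine_window(n_lines)
    padded = [0.0] * n_lines + list(signal) + [0.0] * n_lines
    output = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_lines + 1, n_lines):
        lines = mdct(padded[start:start + 2 * n_lines], window)
        for n, value in enumerate(imdct(lines, window)):
            output[start + n] += value  # overlap-add cancels the aliasing
    return output[n_lines:n_lines + len(signal)]


signal = [float(i % 5) - 2.0 for i in range(16)]
reconstructed = analysis_synthesis(signal, n_lines=4)
print(max(abs(a - b) for a, b in zip(signal, reconstructed)))
```

Each frame of 2N windowed samples yields only N spectral lines, so the frame sequence is critically sampled; the aliasing this introduces in a single frame cancels when neighbouring frames are overlapped and added.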

Figure 8 shows the inverse MDCT operation performed in blocks 52 and 53. For example, block 52 comprises a block 520 for performing a frame-by-frame inverse MDCT transform. When, for example, a frame of MDCT values has 1024 values, the output of this inverse MDCT transform has 2048 aliasing-affected time samples. Such a frame is supplied to a synthesis windower 521, which applies a synthesis window to this frame of 2048 samples. The windowed frame is then forwarded to an overlap/add processor 522, which performs a sample-by-sample addition between two subsequent frames, resulting in 1024 new aliasing-free samples of the output signal. Again, as indicated at 523, it is preferable to apply a window/transform length control, using information that is transmitted, for example, in the side information of the encoded multi-channel signal.

The prediction values α could be calculated for each individual spectral line of an MDCT spectrum. However, it has been found that this is not necessary and that a significant amount of side information can be saved by calculating the prediction information band-wise. Stated differently, a spectral converter 50, which is, for example, an MDCT processor as discussed in the context of Figure 8a, provides a high-frequency-resolution spectrum having certain spectral lines, as illustrated in Figure 9b. This high-frequency-resolution spectrum is used by a spectral line selector 90, which provides a low-frequency-resolution spectrum comprising certain bands B1, B2, B3, ..., BN. This low-frequency-resolution spectrum is forwarded to the optimizer 207 for calculating the prediction information, so that prediction information is calculated not for each spectral line but only for each band.

To this end, the optimizer 207 receives the spectral lines per band and carries out the optimization under the assumption that the same α value is used for all spectral lines in the band.
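A band-wise optimization of this kind can be sketched as a least-squares fit per band. The least-squares criterion is an assumption chosen for illustration; this passage does not say which optimization target the optimizer 207 uses. The names `band_alphas` and `band_edges`, and the fallback for a degenerate band, are likewise hypothetical.

```python
def band_alphas(mid, mid_im, side, band_edges):
    """Return one (alpha_re, alpha_im) pair per band.

    For each band, minimise sum((S - a_re * M - a_im * M_im) ** 2) over
    the lines of that band by solving the 2x2 normal equations.
    band_edges lists the first line of each band plus the end index.
    """
    alphas = []
    for start, stop in zip(band_edges[:-1], band_edges[1:]):
        m, mi, s = mid[start:stop], mid_im[start:stop], side[start:stop]
        mm = sum(a * a for a in m)
        ii = sum(b * b for b in mi)
        cross = sum(a * b for a, b in zip(m, mi))
        sm = sum(c * a for c, a in zip(s, m))
        si = sum(c * b for c, b in zip(s, mi))
        det = mm * ii - cross * cross
        if abs(det) < 1e-12:
            # Degenerate band: fall back to a real-valued factor only.
            alphas.append((sm / mm if mm > 0.0 else 0.0, 0.0))
        else:
            alphas.append(((sm * ii - si * cross) / det,
                           (si * mm - sm * cross) / det))
    return alphas


# Hypothetical real spectrum M, estimated imaginary spectrum, and two bands.
mid = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 1.0, 4.0]
mid_im = [0.5, -1.0, 2.0, 1.0, -1.0, 0.5, 2.0, -3.0]
band_edges = [0, 4, 8]

# Side spectrum built with a different factor in each band.
side = ([0.5 * m - 0.25 * mi for m, mi in zip(mid[:4], mid_im[:4])]
        + [-1.0 * m + 0.75 * mi for m, mi in zip(mid[4:], mid_im[4:])])

print(band_alphas(mid, mid_im, side, band_edges))
```

Only one pair of values per band has to be transmitted as side information, instead of one pair per spectral line.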

Preferably, the bands are shaped psychoacoustically, so that the bandwidth of the bands increases from lower to higher frequencies, as illustrated in Figure 9b.

Alternatively, although not as preferred as the increasing-bandwidth implementation, equally sized frequency bands could be used, where each frequency band has at least two, or typically many more, such as at least 30, frequency lines.

Typically, for a spectrum with 1024 spectral lines, fewer than 30 complex α values, and preferably more than 5 α values, are calculated. For spectra with fewer than 1024 spectral lines (e.g., 128 lines), preferably fewer frequency bands (e.g., 6) are used for α. An MDCT spectrum is not required for calculating the α values.

Alternatively, a filter bank having a frequency resolution similar to the resolution required for calculating the α values can be used. When bands that widen with increasing frequency are to be implemented, this filter bank should have varying bandwidths.

Ancak, düsükten yüksege frekanslardaki sabit bir bant genisliginin yeterli oldugu durumda, esit genislikli alt- bantlari olan geleneksel bir süzgeç öbegi kullanilabilir. However, where a constant bandwidth from low to high frequencies is sufficient, a conventional filter bank with subbands of equal width can be used.

Depending on the implementation, the sign of the α value specified in Figure 3b or 4b can be inverted. For consistency, however, this inversion must be applied on both the encoder side and the decoder side. Compared to Figure 6a, Figure 5a shows a generalized view of the encoder, in which item 2033 is a predictor controlled by the predictor control information (206), which is determined in item 207 and embedded in the bitstream as side information. In Figure 5a, a generalized time/frequency transform is used in place of the MDCT used in blocks 50, 51 of Figure 6. As stated earlier, Figure 6a shows the encoder process corresponding to the decoder process of Figure 6b, where L denotes the left channel signal, R the right channel signal, M the mid signal or downmix signal, S the side signal, and D the residual signal. Alternatively, L is also called the first channel signal (201), R the second channel signal (202), M the first combination signal (204), and S the second combination signal (2032).

Preferably, module 2070 in the encoder and module 1160a in the decoder match each other exactly, in order to guarantee correct waveform coding. This preferably also applies when these modules use some form of approximation, such as truncated filters, or when they use only one or two MDCT frames instead of all three (i.e. the current MDCT frame on line 60, the previous MDCT frame on line 61 and the next MDCT frame on line 62).

In addition, module 2070 in the encoder of Figure 6a preferably uses the unquantized MDCT spectrum M as input, although the real-to-imaginary (R2I) module 1160a in the decoder has only the quantized MDCT spectrum available as input. Alternatively, an implementation in which the encoder uses the quantized MDCT coefficients as input to module 2070 is also possible. From a perceptual point of view, however, using the unquantized MDCT spectrum as input to module 2070 is the preferred approach.

Some aspects are explained in more detail below.

Standard parametric stereo coding relies on the ability of the oversampled complex (hybrid) QMF domain to allow time- and frequency-varying, perceptually motivated signal processing without introducing aliasing artifacts. In the case of downmix/residual coding (as used at the high bit rates considered here), however, the resulting combined stereo coder acts as a waveform coder. This allows operation in a critically sampled domain such as the MDCT domain, because the waveform coding paradigm ensures that the aliasing cancellation property of the MDCT-IMDCT (Inverse Modified Discrete Cosine Transform) processing chain is preserved sufficiently well.

However, to exploit the coding efficiency that a complex-valued prediction coefficient α can achieve for stereo signals with inter-channel time or phase differences, a complex-valued frequency-domain representation of the downmix signal (DMX) is required as input to the complex-valued upmix matrix. This can be obtained by using an MDST transform in addition to the MDCT transform for the DMX signal. The MDST spectrum can be computed (exactly or approximately) from the MDCT spectrum.

Furthermore, the parameterization of the upmix matrix can be simplified by transmitting the complex prediction coefficient α instead of the MPS (MPEG Surround) parameters. Thus only two parameters (the real and imaginary parts of α) are transmitted instead of three (ICC (Inter-Channel Coherence), CLD (Channel Level Difference) and IPD (Inter-channel Phase Difference)).

This is possible because of the redundancy in the MPS parameterization in the case of downmix/residual coding. The MPS parameterization includes information on the relative amount of decorrelation to be added in the decoder (i.e. the energy ratio between the RES (Residual) and DMX (Downmix) signals), and this information is redundant when the actual DMX and RES signals are transmitted.

For the same reason, the gain factor (g) shown in the upmix matrix above is no longer needed in the case of downmix/residual coding. The upmix matrix for downmix/residual coding with complex prediction therefore becomes:

\[
\begin{bmatrix} L \\ R \end{bmatrix} =
\begin{bmatrix} 1-\alpha & 1 \\ 1+\alpha & -1 \end{bmatrix}
\begin{bmatrix} \mathrm{DMX} \\ \mathrm{RES} \end{bmatrix}
\]

Compared to Equation 1169 in Figure 4b, the sign of α is inverted in this equation, and DMX = M and RES = D. This is therefore an alternative implementation/notation with respect to Figure 4b.
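A minimal sketch of this upmix is given below, assuming the matrix rows read L = (1 - α)·DMX + RES and R = (1 + α)·DMX - RES, and assuming that the MDCT-domain outputs are the real part of the complex expression with DMX = dmx_re + j·dmx_im, where dmx_im is an MDST estimate. The function names and the unquantized encoder counterpart are hypothetical and only serve to show that the two sides invert each other.

```python
def upmix(dmx_re, dmx_im, res, alpha):
    """Apply L = (1 - alpha)*DMX + RES, R = (1 + alpha)*DMX - RES per spectral line.

    Assumption: the real part of the complex expression is taken, with
    DMX = dmx_re + 1j*dmx_im (dmx_im being an MDST estimate of the downmix).
    """
    left, right = [], []
    for m_re, m_im, d in zip(dmx_re, dmx_im, res):
        pred = (alpha * complex(m_re, m_im)).real  # Re(alpha * DMX)
        left.append(m_re - pred + d)
        right.append(m_re + pred - d)
    return left, right


def downmix_residual(left, right, dmx_im, alpha):
    """Matching encoder side under the same assumptions (no quantization)."""
    dmx_re = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    res = [s + (alpha * complex(m, mi)).real
           for s, m, mi in zip(side, dmx_re, dmx_im)]
    return dmx_re, res
```

With α = 0 the sketch reduces to plain M/S coding, which is consistent with the description of the scheme as an extension of M/S coding.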

There are two options for calculating the prediction residual signal in the encoder. The first is to use the quantized MDCT spectral values of the downmix. Since the encoder and the decoder then use the same values to generate the prediction, this option results in the same quantization error distribution as in M/S coding. The second option is to use the unquantized MDCT spectral values. In that case the encoder and the decoder do not use the same data to generate the prediction, which allows the coding error to be spatially redistributed according to the instantaneous masking properties of the signal, at the cost of a somewhat reduced coding gain.
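The difference between the two options can be made concrete with a small sketch for a single spectral line. It assumes a real-valued α, a plain uniform quantizer and the residual convention d = s - α·m; the function names and the step size are hypothetical.

```python
def quantize(x, step):
    """Uniform quantizer returning the reconstructed value."""
    return round(x / step) * step


def side_error(m, s, alpha, step, predict_from_quantized):
    """Return the decoder-side error on S for one spectral line (real alpha)."""
    m_q = quantize(m, step)
    basis = m_q if predict_from_quantized else m
    d = s - alpha * basis        # encoder residual (assumed sign convention)
    d_q = quantize(d, step)
    s_hat = d_q + alpha * m_q    # the decoder only ever sees quantized values
    return s_hat - s
```

When the prediction is formed from the quantized downmix, the error on S equals the quantization error of the residual alone. When it is formed from the unquantized downmix, an additional term α·(Q(m) - m) appears, which is the mechanism by which the error is redistributed between the channels.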

The MDST spectrum is preferably computed directly in the frequency domain by means of two-dimensional FIR filtering of three adjacent MDCT frames. This frequency-domain computation can be regarded as a "real-to-imaginary" (R2I) transform. The complexity of the frequency-domain computation of the MDST can be reduced in several ways, in which case only an approximation of the MDST spectrum is computed:

- limiting the number of FIR filter taps;
- estimating the MDST from the current MDCT frame only;
- estimating the MDST from the current and the previous MDCT frame.

As long as the same approximation is used in the encoder and the decoder, the waveform coding properties are not affected. Such approximations of the MDST spectrum can, however, reduce the coding gain achieved by complex prediction.
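The simplest of these approximations, estimating the MDST from the current MDCT frame only with a short FIR filter along the frequency axis, might be sketched as follows. The tap values passed in are hypothetical, and treating bins outside the spectrum as zero is a simplification at DC and fs/2, not the edge handling of the description.

```python
def estimate_mdst(mdct, taps):
    """Estimate an MDST spectrum by FIR filtering one MDCT frame along frequency.

    out[k] = sum_j taps[j] * mdct[k + j - half], with `taps` of odd length
    centred on bin k. Bins outside the spectrum are treated as zero here.
    """
    half = len(taps) // 2
    n = len(mdct)
    out = []
    for k in range(n):
        acc = 0.0
        for j, h in enumerate(taps):
            idx = k + j - half
            if 0 <= idx < n:
                acc += h * mdct[idx]
        out.append(acc)
    return out
```

For example, `estimate_mdst(frame, [-0.5, 0.0, 0.5])` uses only the two neighbouring bins of each line. Encoder and decoder would have to call the same routine with the same taps for the waveform coding properties to be preserved.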

If the underlying MDCT coder supports window shape switching, the coefficients of the two-dimensional FIR filter used to compute the MDST spectrum have to be adapted to the actual window shapes. The filter coefficients applied to the MDCT spectrum of the current frame depend on the complete window, i.e. a set of coefficients is required for every window type and every window transition. The filter coefficients applied to the MDCT spectrum of the previous/next frame depend only on the window half that overlaps the current frame, i.e. only one set of coefficients is required for each window type (no additional coefficients for transitions).

If the underlying MDCT coder uses transform length switching, including the previous and/or next MDCT frame in the approximation becomes more complicated at transitions between different transform lengths. Because the current and the previous/next frame then contain different numbers of MDCT coefficients, the two-dimensional filtering is more involved. To avoid the increased computational and structural complexity, the previous/next frame can be excluded from the filtering at transform length transitions, at the cost of reduced approximation accuracy for the affected frames.

Furthermore, special care must be taken for the lowest and highest parts of the MDST spectrum (close to DC and fs/2), where fewer surrounding MDCT coefficients are available than are required for the FIR filtering.

Here, the filtering process has to be adapted in order to compute the MDST spectrum correctly. This can be done either by symmetrically extending the MDCT spectrum for the missing coefficients (in accordance with the periodicity of the spectra of time-discrete signals) or by adapting the filter coefficients accordingly. These special cases can also be handled in a simplified way, at the cost of reduced accuracy near the boundaries of the MDST spectrum.

Computing the exact MDST spectrum from the transmitted MDCT spectra in the decoder increases the decoder delay by one frame (assumed here to be 1024 samples), because the MDCT spectrum of the next frame is needed.

This additional delay can be avoided by using an approximation of the MDST spectrum that does not require the MDCT spectrum of the next frame as input.

The advantages of MDCT-based combined stereo coding over QMF-based combined stereo coding are listed below:

- Only a small increase in computational complexity (when SBR is not used).
- Scales up to perfect reconstruction when the MDCT spectra are not quantized. This is not the case for QMF-based combined stereo coding.
- Natural extension of M/S coding and intensity stereo coding.
- Cleaner architecture that simplifies encoder tuning, since stereo signal processing and quantization/coding can be tightly coupled. Note that in QMF-based combined stereo coding the MPEG Surround frames and the MDCT frames are not aligned, and the scale factor bands do not match the parameter bands.
- Efficient coding of the stereo parameters, since only two parameters (complex α) have to be transmitted instead of three as in MPEG Surround (ICC, CLD, IPD).
- No additional decoder delay if the MDST spectrum is computed as an approximation (without using the next frame).

The important features of an implementation can be summarized as follows:

a) MDST spectra are computed from the current, previous and next MDCT spectra by means of two-dimensional FIR filtering. Different complexity/quality trade-offs for the MDST computation (approximation) are possible by reducing the number of FIR filter taps and/or the number of MDCT frames used. If an adjacent frame is not available because of frame loss during transmission or because of transform length switching, that frame is excluded from the MDST estimate. In the case of transform length switching, the exclusion is signaled in the bitstream.

b) Instead of ICC, CLD and IPD, only two parameters (the real and imaginary parts of the complex prediction coefficient α) are transmitted. The real and imaginary parts of α are handled independently and are limited in range. If a given parameter (real or imaginary part of α) is not used in a given frame, this is signaled in the bitstream and the irrelevant parameter is not transmitted. The parameters are coded time-differentially or frequency-differentially, and finally Huffman coding is applied using the scale factor codebook. The prediction coefficients are updated every second scale factor band, which results in a frequency resolution similar to that of MPEG Surround. This quantization and coding scheme results in an average bit rate of about 2 kb/s for the stereo side information in a typical configuration with a target bit rate of 96 kb/s.
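The parameter path of item b) can be sketched as uniform quantization of the real and imaginary parts followed by frequency-differential coding of the resulting indices. The step size passed in and the function names are hypothetical, and the final Huffman stage is omitted.

```python
def quantize_alpha(alphas, step):
    """Map complex coefficients to integer index pairs (real, imag)."""
    return [(round(a.real / step), round(a.imag / step)) for a in alphas]


def freq_diff_encode(indices):
    """Frequency-differential coding: first value as is, then band-to-band deltas."""
    deltas, prev = [], 0
    for q in indices:
        deltas.append(q - prev)
        prev = q
    return deltas


def freq_diff_decode(deltas):
    """Invert freq_diff_encode by accumulating the deltas."""
    out, acc = [], 0
    for d in deltas:
        acc += d
        out.append(acc)
    return out
```

The real and imaginary index lists would each be passed through `freq_diff_encode` separately, which mirrors the independent handling of the two parts. Neighbouring bands tend to have similar coefficients, so the deltas cluster around zero and suit an entropy coder.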

Preferred additional or alternative implementation details include the following:

c) For each of the two parameters of α, non-differential (PCM) or differential (DPCM, Differential Pulse Code Modulation) coding can be selected on a per-frame or per-stream basis, signaled by a corresponding bit in the bitstream. For DPCM coding, either time-differential or frequency-differential coding is possible. This choice can likewise be signaled by a one-bit flag.

d) Instead of reusing a predefined codebook such as the AAC (Advanced Audio Coding) scale factor codebook, a dedicated fixed or signal-adaptive codebook can be used to code the α parameter values, or fixed-length (e.g. 4-bit) unsigned or two's-complement codewords can be used.

e) The range of the α parameter values as well as the parameter quantization step size can be chosen arbitrarily and optimized for the signal characteristics at hand.

f) The number and the spectral and/or temporal width of the active α parameter bands can be chosen arbitrarily and optimized for the given signal characteristics. In particular, the band configuration can be set on a per-frame or per-stream basis.

g) In addition or as an alternative to the mechanisms described in a) above, it can be signaled explicitly, by means of a bit per frame in the bitstream, that only the MDCT spectrum of the current frame is used to compute the MDST spectrum approximation, i.e. that the adjacent MDCT frames are not taken into account.
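Items c) and d) can be illustrated with a sketch that packs quantized α indices either directly (PCM) or as band-to-band differences (DPCM), using fixed-length two's-complement codewords and a one-bit mode flag. The function names, the flag polarity and the default width of 4 bits are assumptions for illustration only.

```python
def to_twos_complement(value, bits=4):
    """Encode a signed integer as an unsigned two's-complement codeword."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{value} does not fit in {bits} bits")
    return value & ((1 << bits) - 1)


def from_twos_complement(code, bits=4):
    """Decode an unsigned two's-complement codeword back to a signed integer."""
    sign_bit = 1 << (bits - 1)
    return (code ^ sign_bit) - sign_bit


def pack_band_indices(indices, differential, bits=4):
    """Return (flag_bit, codewords); flag 1 means DPCM across bands, 0 means PCM.

    `indices` is assumed to be a non-empty list of quantized parameter indices.
    """
    values = list(indices)
    if differential:
        values = [indices[0]] + [b - a for a, b in zip(indices, indices[1:])]
    return int(differential), [to_twos_complement(v, bits) for v in values]
```

An encoder could try both modes for a frame and keep the one that fits the available codeword width or costs fewer bits after entropy coding.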

The embodiments relate to an inventive system for combined stereo coding in the MDCT domain. It makes it possible to exploit the advantages of combined stereo coding in the MPEG USAC system even at higher bit rates (where SBR is not used), without the significant increase in computational complexity that a QMF-based approach would entail.

The following two lists summarize the preferred configuration aspects described above, which can be used as alternatives to each other or in addition to other aspects:

1a) general concept: complex prediction of the side MDCT from the mid MDCT and MDST;
1b) calculate/approximate the MDST from the MDCT in the frequency domain using 1 or more frames (using 3 frames introduces delay);
1c) truncation of the filter to reduce computational complexity (down to 1 frame and 2 taps, i.e. [-1 0 1]);
1d) proper handling of DC and fs/2;
1e) proper handling of window shape switching;
1f) do not use the previous/next frame if it has a different transform size;
1g) prediction based on unquantized or quantized MDCT coefficients in the encoder;

2a) quantize and code the real and imaginary parts of the complex prediction coefficient directly (i.e. no MPEG Surround parameterization);
2b) use a uniform quantizer for this (e.g. step ...);
2c) use an appropriate frequency resolution for the prediction coefficients (e.g. 1 coefficient per 2 scale factor bands);
2d) cheap signaling if all prediction coefficients are real;
2e) explicit bit per frame to force 1-frame R2I operation.
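The frame-selection rules in this list (exclude a neighbouring frame that is missing or has a different transform size, and optionally avoid the next frame to save delay) can be sketched as a small helper. The function name, the use of `None` for an unavailable frame and the `allow_next` switch are hypothetical.

```python
def frames_for_mdst(current_len, prev_len=None, next_len=None, allow_next=True):
    """Return which MDCT frames feed the MDST estimate.

    A length of None stands for a frame that is unavailable (e.g. lost).
    Neighbours with a different transform length are excluded.
    """
    used = ["current"]
    if prev_len is not None and prev_len == current_len:
        used.insert(0, "previous")
    if allow_next and next_len is not None and next_len == current_len:
        used.append("next")
    return used
```

Setting `allow_next=False` corresponds to an approximation that adds no decoder delay, and a result of `["current"]` corresponds to 1-frame R2I operation.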

In one example, the encoder further comprises a spectral converter (50, 51) for converting a time-domain representation of the two channel signals into a spectral representation of the two channel signals having subband signals for the two channels, wherein the combiner (2031), the predictor (2033) and the residual signal calculator (2034) are configured to process each subband signal separately, so that the first combination signal and the residual signal are obtained for a plurality of subbands, and wherein the output interface (212) is configured to combine the encoded first combination signal and the encoded residual signal for the plurality of subbands.

Although some aspects have been described in the context of an apparatus, these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or to a feature of a method step. Similarly, aspects described in the context of a method step also represent a description of a corresponding block, item or feature of a corresponding apparatus.

In one embodiment of the present invention, proper handling of window shape switching is implemented. Referring to Figure 10a, window shape information (109) can be input into the imaginary spectrum calculator (1001). Specifically, the imaginary spectrum calculator, which performs the real-to-imaginary conversion of a real-valued spectrum such as the MDCT spectrum (such as item 2070 in Figure 6a or item 1160a in Figure 6b), can be implemented as an FIR or IIR filter.

The FIR or IIR coefficients in this real-to-imaginary module (1001) depend on the window shape of the left half and of the right half of the current frame. This window shape can differ between a sine window and a KBD (Kaiser-Bessel Derived) window and, depending on the given window sequence configuration, can be a long window, a start window, a stop window, a stop-start window or a short window. The real-to-imaginary module can comprise a two-dimensional FIR filter, where the first dimension is the time dimension, in which two consecutive MDCT frames are input into the FIR filter, and the second dimension is the frequency dimension, in which the frequency coefficients of a frame are input.
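A window-shape-dependent two-dimensional filter of this kind might be sketched as two FIR filters along frequency, one per MDCT frame, whose outputs are summed, with the coefficient sets selected from lookup tables. All coefficient values below are placeholders and are not the values of the patent's tables; the table keys and function names are likewise hypothetical.

```python
# Placeholder coefficient sets (hypothetical values, NOT those of the patent tables).
# Current frame: keyed by (left-half shape, right-half shape).
CURRENT_TAPS = {("sine", "sine"): [-0.5, 0.0, 0.5], ("kbd", "kbd"): [-0.6, 0.0, 0.6]}
# Previous frame: keyed by the shape of the window half overlapping the current frame.
PREVIOUS_TAPS = {"sine": [0.25, 0.0, 0.25], "kbd": [0.2, 0.0, 0.2]}


def fir_along_frequency(spectrum, taps):
    """Filter one frame along frequency; bins outside the spectrum count as zero."""
    half = len(taps) // 2
    n = len(spectrum)
    return [
        sum(h * spectrum[k + j - half]
            for j, h in enumerate(taps) if 0 <= k + j - half < n)
        for k in range(n)
    ]


def r2i_two_frames(cur, prev, left_shape, right_shape, overlap_shape):
    """Sum the contributions of the current and previous MDCT frames."""
    cur_part = fir_along_frequency(cur, CURRENT_TAPS[(left_shape, right_shape)])
    prev_part = fir_along_frequency(prev, PREVIOUS_TAPS[overlap_shape])
    return [a + b for a, b in zip(cur_part, prev_part)]
```

Keying the previous-frame table by a single shape reflects the statement that those coefficients depend only on the window half overlapping the current frame.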

The following table gives different MDST filter coefficients for a current window sequence, for different window shapes and different implementations of the left half and the right half of the window.

Table A - MDST Filter Parameters for the Current Window

Rows (current window sequence): ONLY_LONG_SEQUENCE, EIGHT_SHORT_SEQUENCE; LONG_START_SEQUENCE; LONG_STOP_SEQUENCE; STOP_START_SEQUENCE
Columns: window shape (sine or KBD) of the left half and of the right half
[Coefficient rows are only partially legible in the source: ... .581427, .581427, .091497] ... .047969, .608574, .608574, -.047969, -.150512] ... 0.608574, 0.608574, 0.150512] ... 0.635722, 0.635722, ...]

In addition, when the previous window is used for calculating the MDST spectrum from the MDCT spectrum, the window shape information (109) provides the window shape information for the previous window. The corresponding MDST filter coefficients for the previous window are given in the following table.

Table B - MDST Filter Parameters for the Previous Window

Rows (current window sequence): LONG_STOP_SEQUENCE, STOP_START_SEQUENCE (other rows are not legible in the source)
Columns: window shape (sine or KBD) of a half of the current window
[Coefficient values are not legible in the source]

Hence, depending on the window shape information (109), the imaginary spectrum calculator (1001) in Fig. 10a is adapted by applying different sets of filter coefficients.

The window shape information used on the decoder side is calculated on the encoder side and transmitted as side information together with the encoder output signal. On the decoder side, the window shape information (109) is extracted from the bitstream by the bitstream demultiplexer (for example, 102 in Fig. 5b) and provided to the imaginary spectrum calculator (1001) as illustrated in Fig. 10a.

If the window shape information (109) signals that the previous frame had a different transform size, it is preferred that the previous frame is not used for calculating the imaginary spectrum from the real-valued spectrum. The same is true when interpreting the window shape information (109) shows that the next frame has a different transform size: the next frame is then not used for calculating the imaginary spectrum from the real-valued spectrum. In such a case, i.e., when the previous frame or the next frame has a different transform size than the current frame, only the current frame, that is, the spectral values of the current window, is used for estimating the imaginary spectrum.
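The neighbor-frame rule above can be sketched as a small selection step. This is only an illustration: the function name `select_neighbor_frames` is hypothetical, the spectrum length is used as a stand-in for the transform size signaled by the window shape information, and each neighbor is judged independently, which is one possible reading of the paragraph.

```python
from typing import Optional, Sequence, Tuple

Spectrum = Sequence[float]


def select_neighbor_frames(
    cur: Spectrum,
    prev: Optional[Spectrum],
    nxt: Optional[Spectrum],
) -> Tuple[Optional[Spectrum], Optional[Spectrum]]:
    """Return (prev, next), keeping a neighbor only if its size matches.

    A neighbor whose transform size differs from the current frame is
    dropped (None), so the estimator falls back to fewer input frames.
    """
    n = len(cur)
    usable_prev = prev if prev is not None and len(prev) == n else None
    usable_next = nxt if nxt is not None and len(nxt) == n else None
    return usable_prev, usable_next
```

When both neighbors are dropped, the imaginary spectrum is estimated from the current frame alone.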

The prediction in the encoder is based on unquantized or quantized frequency coefficients such as MDCT coefficients. When the prediction illustrated by element 2033 in Fig. 3a is based, for example, on unquantized data, the residual calculator (2034) preferably also operates on unquantized data, and the residual calculator output signal, i.e., the residual signal (205), is quantized before being entropy-encoded and transmitted to a decoder. In an alternative example, it is preferred that the prediction is based on quantized MDCT coefficients. In that case, the quantization can take place before the combiner (2031) in Fig. 3a, so that a first quantized channel and a second quantized channel are the basis for calculating the residual signal. Alternatively, the quantization can take place after the combiner (2031), so that the first combination signal and the second combination signal are calculated in unquantized form and are quantized before the residual signal is calculated. As a further alternative, the predictor (2033) may operate in the unquantized domain, and the prediction signal (2035) is quantized before being input into the residual calculator. In this case, it is useful that the second combination signal (2032), which is also input into the residual calculator (2034), is likewise quantized before the residual calculator, which may be implemented within the predictor (2033) in Fig. 3a and operates on the same quantized data that are available on the decoder side, calculates the residual signal (1070) in Fig. 6a.

In this way, it can be guaranteed that the MDST spectrum estimated in the encoder for calculating the residual signal is exactly the same as the MDST spectrum on the decoder side used for the inverse prediction, i.e., for calculating the side signal from the residual signal. To this end, the first combination signal, such as signal M on line 204 in Fig. 6a, is quantized before being input into block 2070. Then the MDST spectrum, calculated using the quantized MDCT spectrum of the current frame and, depending on the control information, the quantized MDCT spectrum of the previous or next frame, is input into the multiplier (2074); the output of the multiplier (2074) in Fig. 6a will again be an unquantized spectrum. This unquantized spectrum is subtracted from the spectrum input into the adder (2034b), and the result is finally quantized in the quantizer (209b).

In one example, the real part and the imaginary part of the complex prediction coefficient per prediction band are quantized and coded directly, i.e., without, for example, MPEG Surround parameterization.

The quantization can be performed using a uniform quantizer with a step size of, for example, 0.1. This means that no logarithmic quantization step sizes or the like are applied; instead, linear step sizes are applied. In one implementation, the value range for the real part and the imaginary part of the complex prediction coefficient is from -3 to 3, which means that 60 or, depending on implementation details, 61 quantization steps are used for the real part and the imaginary part of the complex prediction coefficient.
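The uniform quantization just described can be illustrated with a short sketch. It assumes that values are rounded to the nearest step and clamped to the stated range; the function names are hypothetical. With integer indices from -30 to 30 there are 61 reconstruction levels separated by 60 intervals, which is one way to read the "60 or 61" remark.

```python
STEP = 0.1               # linear step size from the text
Q_MIN, Q_MAX = -30, 30   # integer indices covering -3.0 .. 3.0


def quantize_alpha(value: float) -> int:
    """Map a real or imaginary coefficient part to an integer index.

    Rounding to nearest and clamping are assumptions of this sketch.
    """
    index = round(value / STEP)
    return max(Q_MIN, min(Q_MAX, index))


def dequantize_alpha(index: int) -> float:
    """Inverse mapping: index times the step size."""
    return index * STEP


NUM_LEVELS = Q_MAX - Q_MIN + 1  # 61 levels, 60 intervals between them
```

Applying the same `quantize_alpha` followed by `dequantize_alpha` on both encoder and decoder sides keeps the coefficient used for prediction identical at both ends.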

Preferably, the real part applied in the multiplier (2073) in Fig. 6a and the imaginary part applied in the multiplier (2074) in Fig. 6a are also quantized before being applied, so that the encoder side uses the same prediction value as is available on the decoder side. This guarantees that the prediction residual signal covers, apart from the introduced quantization error, any errors that might occur when an unquantized prediction coefficient is applied on the encoder side while a quantized prediction coefficient is applied on the decoder side. Preferably, the quantization is applied such that, as far as possible, the same situation and the same signals are available on the encoder side and on the decoder side. Hence, it is preferred to quantize the input into the real-to-imaginary calculator (2070) using the same quantization as is applied in the quantizer (209a). Additionally, it is preferred to quantize the real part and the imaginary part of the prediction coefficient for performing the multiplications in element 2073 and element 2074. This quantization is the same as is applied in the quantizer (2072). Additionally, the side signal output by block 2031 in Fig. 6a can also be quantized before the adders (2034a and 2034b).

However, it is not problematic for the quantization to be performed by the quantizer (209b) after the addition, in which case these adders operate on an unquantized side signal.

In a further example of the description, a cheap signaling is applied in case all prediction coefficients are real. It can happen that, for a certain frame, i.e., for the same time portion of the audio signal, all prediction coefficients are calculated to be real. Such a situation may occur when the full mid signal and the full side signal are not, or only slightly, phase-shifted relative to each other. In order to save bits, this is indicated by a single real indicator. The imaginary part of the prediction coefficient then does not need to be signaled in the bitstream with a code word representing a zero value. On the decoder side, the bitstream decoder interface, such as a bitstream demultiplexer, will interpret this indicator and will not search for code words for an imaginary part, but will instead assume that all bits in the corresponding section of the bitstream belong to real-valued prediction coefficients. Furthermore, when the predictor (2033) receives an indication that all imaginary parts of the prediction coefficients in the frame are zero, it does not need to calculate an MDST spectrum, or in general an imaginary spectrum, from the real-valued MDCT spectrum. Hence, element 1160a in the decoder of Fig. 6b is deactivated, and the inverse prediction takes place using only the real-valued prediction coefficient applied in the multiplier (1160b) in Fig. 6b. The same is true for the encoder side, where element 2070 is deactivated and the prediction takes place using only the multiplier (2073). This side information is preferably used as an additional bit per frame, and the decoder reads this bit frame by frame in order to decide whether the real-to-imaginary converter (1160a) is active for a frame. Providing this information therefore reduces the size of the bitstream, because the case where all imaginary parts of the prediction coefficient are zero for a frame is signaled more efficiently, and it additionally lowers the decoder complexity for such a frame, which directly reduces the battery consumption of such a processor implemented, for example, in a battery-powered mobile device.

Complex stereo prediction in accordance with preferred embodiments of the present invention is a tool for efficient coding of channel pairs with level and/or phase differences between the channels. Using a complex-valued parameter α, the left and right channels are reconstructed via the following matrix. Here, dmx_Im denotes the MDST corresponding to the MDCT of the downmix channel dmx_Re.

$$
\begin{bmatrix} l \\ r \end{bmatrix}
=
\begin{bmatrix}
1-\alpha_{Re} & -\alpha_{Im} & 1 \\
1+\alpha_{Re} & \alpha_{Im} & -1
\end{bmatrix}
\begin{bmatrix} dmx_{Re} \\ dmx_{Im} \\ res \end{bmatrix}
$$

The above equation is another representation, split with respect to the real part and the imaginary part of α, and represents the equation for a combined prediction/combination operation, in which the predicted signal S is not necessarily calculated.
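The reconstruction expressed by the matrix above can be sketched per frequency bin as follows. This is a minimal illustration, not the normative decoder: the function name and argument layout are hypothetical, a single coefficient is applied to all bins passed in (in practice one coefficient applies per prediction band), and the signs follow the matrix, i.e., the side signal is the residual plus the downmix parts weighted by the negated coefficient parts.

```python
from typing import List, Optional, Sequence, Tuple


def complex_prediction_upmix(
    dmx_re: Sequence[float],
    res: Sequence[float],
    alpha_re: float,
    alpha_im: float = 0.0,
    dmx_im: Optional[Sequence[float]] = None,
) -> Tuple[List[float], List[float]]:
    """Rebuild left/right spectra from downmix, residual and alpha.

    dmx_re: MDCT spectrum of the downmix; dmx_im: its MDST estimate.
    If alpha_im is zero (real-only frame), dmx_im is never touched, so
    the MDST estimation can be skipped entirely.
    """
    if alpha_im != 0.0 and dmx_im is None:
        raise ValueError("dmx_im is required when alpha_im is non-zero")
    left: List[float] = []
    right: List[float] = []
    for k in range(len(dmx_re)):
        imag = dmx_im[k] if (alpha_im != 0.0 and dmx_im is not None) else 0.0
        # Inverse prediction: side = res - alpha_re*dmx_re - alpha_im*dmx_im
        side = res[k] - alpha_re * dmx_re[k] - alpha_im * imag
        # Mid/side combination: l = mid + side, r = mid - side
        left.append(dmx_re[k] + side)
        right.append(dmx_re[k] - side)
    return left, right
```

Expanding `dmx_re[k] + side` gives `(1 - alpha_re) * dmx_re - alpha_im * dmx_im + res`, which matches the first matrix row; the second row follows from `dmx_re[k] - side`.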

The following data elements are preferably used for this tool:

cplx_pred_all
  0: some bands use L/R coding, as signaled by cplx_pred_used[]
  1: all bands use complex stereo prediction
cplx_pred_used[g][sfb]
  one-bit flag per window group g and scale factor band sfb (after mapping from prediction bands) indicating that
  0: complex prediction is not being used, L/R coding is used
  1: complex prediction is being used
complex_coef
  0: α_Im = 0 for all prediction bands
  1: α_Im is transmitted for all prediction bands
use_prev_frame
  0: use only the current frame for MDST estimation
  1: use the current and the previous frame for MDST estimation
delta_code_time
  0: frequency differential coding of prediction coefficients
  1: time differential coding of prediction coefficients
hcod_alpha_q_re
  Huffman code of α_Re
hcod_alpha_q_im
  Huffman code of α_Im

These data elements are calculated in an encoder and placed into the side information of a stereo or multi-channel audio signal. On the decoder side, the elements are extracted from the side information by a side information extractor and are used to control the decoder calculator so that it performs the corresponding operation.

Complex stereo prediction requires the downmix MDCT spectrum of the current channel pair and, in the case of complex_coef == 1, an estimate of the downmix MDST spectrum of the current channel pair, i.e., the imaginary counterpart of the MDCT spectrum. The downmix MDST estimate is computed from the MDCT downmix of the current frame and, in the case of use_prev_frame == 1, also from the MDCT downmix of the previous frame. The MDCT downmix of the previous frame for window group g and group window b is obtained from the reconstructed left and right spectra of that frame.

The computation of the downmix MDST estimate uses the even-valued MDCT transform length, which depends on window_sequence, as well as filter_coefs and filter_coefs_prev, which are arrays containing the filter kernels and are derived according to the previous tables.

For all prediction coefficients, the difference to a preceding value (in time or in frequency) is coded using a Huffman codebook. Prediction coefficients are not transmitted for prediction bands for which cplx_pred_used = 0.
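The differential coding of the quantized coefficients can be sketched as follows, operating on already Huffman-decoded integer differences. This is a hedged illustration: the function name is hypothetical, the mapping of delta_code_time (0 for frequency, 1 for time) follows the data element list, and starting the frequency-direction accumulation from zero is an assumption of this sketch.

```python
from typing import List, Optional, Sequence


def decode_alpha_deltas(
    deltas: Sequence[int],
    delta_code_time: int,
    prev_frame: Optional[Sequence[int]] = None,
) -> List[int]:
    """Turn decoded differences into quantized coefficient indices.

    delta_code_time == 1: each band is coded relative to the same band
    of the previous frame. delta_code_time == 0: each band is coded
    relative to the preceding band of the current frame.
    """
    if delta_code_time:
        if prev_frame is None or len(prev_frame) != len(deltas):
            raise ValueError("time-differential decoding needs a matching previous frame")
        return [p + d for p, d in zip(prev_frame, deltas)]
    values: List[int] = []
    last = 0  # assumed predictor for the first band
    for d in deltas:
        last += d
        values.append(last)
    return values
```

The resulting indices would then be scaled by the step size of 0.1 to obtain the coefficient values.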

The inverse quantized prediction coefficients alpha_re and alpha_im are given by:

alpha_re = alpha_q_re*0.1
alpha_im = alpha_q_im*0.1

Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disk, a DVD, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed.

Some embodiments according to the invention comprise a non-transitory or tangible data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system such that one of the methods described herein is performed.

Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may, for example, be stored on a machine-readable carrier.

Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine-readable carrier.

In other words, an embodiment of the inventive method is a computer program having a program code for performing one of the methods described herein when the computer program runs on a computer.

A further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may, for example, be configured to be transferred via a data communication connection, for example via the Internet.

A further embodiment comprises a processing means, for example a computer or a programmable logic device, configured or adapted to perform one of the methods described herein.

A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.

In some embodiments, a programmable logic device (for example, a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are preferably performed by any hardware apparatus.

The above-described embodiments are merely illustrative of the principles of the present invention. It is understood that modifications and variations of the arrangements and the details described herein will be apparent to those skilled in the art.

Therefore, the scope of the present invention is determined by the appended patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.

Claims

CLAIMS 1. An MPEG combined speech and audio decoder of an encrypted stereo audio signal (100), the encrypted stereo audio signal comprising an encrypted downmix signal generated based on a combination rule for combining a first channel audio signal and a second channel audio signal of a stereo audio signal, having: a dequantizer (110; 110a, 110b) for dequantizing the encrypted downmix signal (104) to obtain a decoded downmix signal (112, M) and dequantizing the encrypted residual signal (106) to obtain a decoded residual signal (114, D); and a decoding calculator (116) for calculating a decoded stereo signal including a decoded first channel signal (117, L) and a decoded second channel signal (118, R) using the decoded residual signal (114, D), the prediction information (108) and the decoded downmix signal (112, M), such that the decoded first channel signal (117) and the decoded second channel signal (118) are at least approximations of the stereo signal, the first channel signal, and the second channel signal, wherein the prediction information (108) comprises a real part (dR) and an imaginary part (dI) of a complex prediction coefficient, wherein the decoding calculator (116) includes an estimator (1160) configured to estimate (1160a) an imaginary part (601) of the decoded downmix signal (112) of a current frame, the decoded downmix signal for the current frame being a fraction of the current frame. is the MDCT spectrum, where the estimator 1160 includes a truly imaginary transformer to calculate an MDST spectrum of the current frame as the imaginary part, where the truly imaginary transformer is applied in a frequency domain and converts an MDCT spectrum of the decoded downmix signal of a preceding frame into the decoded downmix signal of the current frame. 
The estimator (1160) is configured to apply two-dimensional finite impulse response filtering to the MDCT spectrum and the MDCT spectrum of the decoded downmix signal of the next frame, or to calculate an approximation of an MDST spectrum using the MDCT spectrum of the current frame and the MDCT spectrum of the previous frame, or to calculate an approximation of an MDST spectrum using only the MDCT spectrum of the current frame, wherein the estimator (1160) is configured to multiply the imaginary part of the decoded downmix signal (601) by the negative imaginary part of the estimation coefficient (dI) to obtain a second part of the estimation signal, and the real part of the decoded downmix signal by the negative real part of the estimation coefficient (dR) to obtain a first part of the estimation signal; wherein the solver calculator (116) further comprises a combination signal calculator (1161a, 1161b) configured to add a first part of the prediction signal and the decoded residual signal and a second part of the prediction signal to obtain a side signal (1165); wherein the solver calculator (116) further comprises a combiner (1162) for combining the side signal (1165) and the decoded downmix signal, a middle signal, to obtain a first channel signal (117, L), which is a left channel signal, and a decoded second channel signal (118, R), which is a right channel signal; wherein the solver further comprises an inverse modified discrete cosine transform element (52) for generating a time-domain first channel signal from the left channel signal and an inverse modified discrete cosine transform element (53) for generating a time-domain second channel signal from the right channel signal. 2. 
The MPEG unified speech and audio decoder according to claim 1, wherein the estimator is configured to specifically exclude a neighboring frame from the calculation of the MDST spectrum of the current frame, particularly when the neighboring frame is not available due to a frame loss during communication or due to a transform-length change, wherein the exclusion for the transform-length change case is signaled in a bitstream.

3. The MPEG unified speech and audio decoder according to claim 1, wherein, when either the real part (dR) or the imaginary part (dI) is not used in a given frame, this case is signaled in a bitstream and the unused part is not transmitted.

4. The MPEG unified speech and audio decoder according to claim 1, wherein differential coding or non-differential coding is used, on a per-frame basis in each stream, for each of the real part (dR) and the imaginary part (dI), the choice being signaled in a bitstream, and wherein, for the differential coding, either time-differential or frequency-differential coding is used and is signaled by a bit flag in the bitstream.

5. The MPEG unified speech and audio decoder according to claim 1, wherein the prediction information is encoded using a predefined codebook, or a separate invariant codebook, or a signal-adaptive codebook, or fixed-length unsigned code words, or two's-complement code words.

6. The MPEG unified speech and audio decoder according to claim 1, wherein the real part (dR) and the imaginary part (dI) are updated every two scale factor bands.

7. The MPEG unified speech and audio decoder according to claim 1, wherein a bitstream includes one bit per frame indicating that only the MDCT spectrum of the current frame is used to calculate the MDST spectrum of the current frame and that the MDCT spectra of neighboring frames are not taken into account.

8. The MPEG unified speech and audio decoder according to claim 1, wherein the filter applied by the real-to-imaginary transformer is a 1-frame, 2-tap filter.
9. The MPEG unified speech and audio decoder according to claim 1, wherein the real-to-imaginary transformer applies two-dimensional FIR filtering to the MDCT spectra of the decoded downmix signal.

10. The MPEG unified speech and audio decoder according to claim 9, wherein, if an underlying MDCT encoder uses window-shape switching, the filter coefficients of the FIR filter are adapted to the actual window shapes used by the MDCT encoder.

11. The MPEG unified speech and audio decoder according to claim 9 or 10, wherein, if the underlying MDCT encoder uses transform-length switching resulting in different numbers of MDCT coefficients in the current frame and in the previous or next frame, the previous or next frame is omitted from the computation of the MDST spectrum for the current frame.

12. A method of MPEG unified speech and audio decoding of an encoded stereo audio signal (100), the encoded stereo audio signal comprising an encoded downmix signal generated based on a combination rule for combining a first channel audio signal and a second channel audio signal of a stereo audio signal, comprising:
dequantizing the encoded downmix signal (104) to obtain a decoded downmix signal (112), and dequantizing the encoded residual signal (106) to obtain a decoded residual signal (114); and
calculating (116) a decoded stereo signal having a decoded first channel signal (117) and a decoded second channel signal (118) using the decoded residual signal (114), the prediction information (108) and the decoded downmix signal (112), such that the decoded first channel signal (117) and the decoded second channel signal (118) are at least approximations of the first channel signal and the second channel signal of the stereo signal,
wherein the prediction information (108) comprises a real part (dR) and an imaginary part (dI) of a complex prediction coefficient,
wherein an imaginary part of the decoded downmix signal (112) is estimated (1160a) by an estimator (1160) using the decoded downmix signal (112), the decoded downmix signal for the current frame being an MDCT spectrum of the current frame, wherein the estimator (1160) uses a real-to-imaginary transform to calculate an MDST spectrum of the current frame as the imaginary part, wherein the real-to-imaginary transform is applied in a frequency domain and comprises performing two-dimensional finite impulse response filtering of an MDCT spectrum of the decoded downmix signal of a preceding frame, the MDCT spectrum of the decoded downmix signal of the current frame and the MDCT spectrum of the decoded downmix signal of the next frame, or calculating an approximation of an MDST spectrum using the MDCT spectrum of the current frame and the MDCT spectrum of the previous frame, or calculating an approximation of an MDST spectrum using only the MDCT spectrum of the current frame,
wherein the imaginary part (601) of the decoded downmix signal is multiplied by the negative imaginary part (dI) of the prediction information (108) to obtain a second part of the prediction signal, and wherein the real part of the decoded downmix signal is multiplied by the negative real part (dR) of the prediction information (108) to obtain a first part of the prediction signal;
wherein the first part of the prediction signal, the decoded residual signal and the second part of the prediction signal are added to obtain the side signal (1165);
wherein the side signal (1165) and the decoded downmix signal are combined to obtain the decoded first channel signal (117, L), which is a left channel signal, and the decoded second channel signal (118, R), which is a right channel signal,
wherein the method further comprises applying an inverse modified discrete cosine transform to the left channel signal to produce a time-domain first channel signal and applying another inverse modified discrete cosine transform to the right channel signal to produce a time-domain second channel signal.

13.
A computer program adapted to carry out the method of claim 12 when executed on a computer or a processor.
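
For illustration only, the frequency-domain steps recited in the decoder and method claims might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names are hypothetical, the default filter taps are placeholders rather than coefficients taken from this document, only the variant that uses the MDCT spectrum of the current frame alone is shown, the combination rule is assumed to be L = M + S and R = M - S, and the dequantizer and the inverse MDCT stages are omitted.

```python
from typing import List, Sequence, Tuple


def estimate_mdst_current_only(
    mdct: Sequence[float],
    taps: Sequence[float] = (0.5, 0.0, -0.5),  # placeholder taps (assumption)
) -> List[float]:
    """Approximate the MDST spectrum of the current frame from its MDCT
    spectrum alone by FIR filtering along the frequency axis.
    Bins outside the spectrum are treated as zero."""
    half = len(taps) // 2
    n = len(mdct)
    out = []
    for k in range(n):
        acc = 0.0
        for j, t in enumerate(taps):
            idx = k + j - half
            if 0 <= idx < n:
                acc += t * mdct[idx]
        out.append(acc)
    return out


def decode_stereo(
    mid: Sequence[float],       # decoded downmix signal M (MDCT spectrum)
    residual: Sequence[float],  # decoded residual signal D
    d_r: float,                 # real part of the prediction coefficient
    d_i: float,                 # imaginary part of the prediction coefficient
    mid_imag: Sequence[float],  # estimated imaginary part (MDST) of M
) -> Tuple[List[float], List[float]]:
    """Rebuild left/right spectra following the sign convention of the claims."""
    left, right = [], []
    for m, d, m_im in zip(mid, residual, mid_imag):
        first_part = -d_r * m      # real part of M times negative dR
        second_part = -d_i * m_im  # imaginary part of M times negative dI
        side = first_part + d + second_part
        # Assumed combination rule: L = M + S, R = M - S.
        left.append(m + side)
        right.append(m - side)
    return left, right
```

A caller would first obtain `mid_imag = estimate_mdst_current_only(mid)` and then pass it to `decode_stereo`; each returned spectrum would afterwards go through its own inverse MDCT to produce the time-domain channel signals.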