TWI309140B - Device and method for generating a multi-channel signal or a parameter data set

Publication number
TWI309140B
Authority
TW
Taiwan
Prior art keywords
data
configuration
parameter
channel
prompt
Application number
TW94145269A
Other languages
Chinese (zh)
Inventor
Ralph Sperschneider
Juergen Herre
Johannes Hilpert
Christian Ertel
Stefan Geyersberger
Original Assignee
Fraunhofer Ges Forschung

Description

1309140 九、發明說明: 【發明所屬之技術領域】 本發明關於參數多聲道處理技術,且尤其是關於彈性 資料語法產生用及/或讀取用以及使參數資料和下行混合 及/或傳輸聲道之資料相關聯之編碼器/解碼器。 【先前技術】 除雙立體聲道外,所建議之多聲道環繞表現包含一個 中央聲道C和兩個環繞聲道,即,左環繞聲道Ls和右環繞 聲道Rs,且另外,如果適用的話,含亦稱爲LFE聲道(LFE = 低頻增強)之超低音喇叭聲道。這參照聲音格式亦稱爲3/2 (正LFE)立體聲且最近亦稱爲5.1多聲道,其意爲有三個前 聲道和兩個環繞聲道。通常,需要五個或六個傳輸聲道。 在再生環境中,在各別五個不同位置至少需五個喇叭以得 到離五個正擺設喇叭既定距離之最理想的所謂甜美音點。 然而,針對其擺設位置,可以相當自由之方式使用超低音 喇叭。 有數種技術用於降低要傳輸多聲道音訊信號所需之資 料量。這種技術亦稱爲立體聲合倂技術。爲此起見,參考 第5圖。第5圖表示立體聲合倂裝置60。這裝置可爲執行 ,例如,強度立體聲技術(IS技術)或雙聲道提示編碼(BCC 技術)之裝置。這種裝置通常接收至少兩個聲道(CH1,CH2, …CHn)作爲輸入信號並輸出至少一個單載波聲道(下行混 合)及參數資料,即一或更多參數集。界定參數資料使得可 在解碼器中槪算各原始聲道(CH1,CH2, ...,CHn)。 1309140 正常說來,載波聲道會包含次頻帶取樣、頻譜係數或 時域取樣等,其提供相當精緻表現之基本信號,而參數資 料及/或參數集不包含任何這種取樣或頻譜係數。取而代之 的是,參數資料包含控制參數,用於控制如藉倍增、時間 位移、頻率位移、…加以權重之既定重建演算法。參數資 料因此包含僅相當粗略表示之信號或關聯聲道。以數字表 示,載波聲道(其爲壓縮,亦即利用,例如AAC加以編碼) 所需之資料量範圍爲60至7 0kbit/s,而參數旁側資訊所需 ® 資料量大小爲一個聲道約1 .5kbit/s。如以下之說明,參數 資料之實例爲已知之縮放比例因子。強度立體聲資訊或雙 Λ 聲道提示參數。 " 強度立體聲編碼技術是說明在1 994年2月,阿姆斯特 丹(AMSTERDAM),由赫利(J.HERRE)、布蘭登堡(K.H. BRANDENBURG)、利德勒(D. L E D E R E R)所著標題爲"強度 立體聲編碼"之AES待出版品3799號中。通常,強度立體 聲之槪念是根據被應用到兩個立體聲音訊聲道之資料的主 ^ 軸轉換。如果將大半資料點圍繞第一主軸擺設,可藉由旋 轉藉由編碼前之一既定角度的兩信號來達成編碼增益。然 而,這不永遠適用於真正之立體聲再生技術。左右聲道之 重建信號包含相同傳輸技術的相異權重或縮放比例版本。 雖然如此,重建信號之振幅相異,但針對其相位資訊而言 ,他們相同。然而,兩原始音訊聲道之能量時間包絡的維 持是藉由向來以選頻方式加以操作之選用性縮放比例操作 。這等於高頻下之人聲感覺,其中,主要空間提示爲能量 -6- 1309140 包絡所決定。 此外,在實行執行中,所傳輸信號,即載波聲道是由 左右聲道之加總信號而非旋轉兩成分而形成。而且,這種 處理,即,用於實施縮放比例操作而產生強度立體聲參數 ,是以選頻方式加以實施,亦即,與各比例因子頻帶,即 各編碼器頻率隔區彼此無關。最佳是,結合兩個聲道形成 一結合或”載波”聲道。除結合聲道外,強度立體聲資訊係 依第一聲道能量、第二聲道能量及結合或加總聲道能量而 _決定。 BCC技術是說明在2002年5月,慕尼黑(MUNCHEN) m ,由福勒(C. FALLER)、邦迦第(F. 
BAUMGARTE)所著標題 ' 爲”應用到立體聲及多聲道音訊壓縮之雙聲道提示編碼’•之 AES會議文件5 5 74號中。在BCC編碼中,使用具重疊視 窗之DFT爲基準的轉換將許多音訊輸入聲道轉換成頻譜表 示。將所形成之頻譜分成非重疊隔區。各隔區之頻寬與等 效直角頻寬(ERB)成正比。針對各隔區,即各頻帶及各幀κ ^ ,即時間取樣區加以計算所謂的相互聲道位準差(ICLD)以 及所謂的推互聲道時間差(ICTD)。使ICLD和ICDT參數量 化且加以編碼得到B C C位兀流。各聲道針對一參照聲道有 相互聲道位準差和相互聲道時間差。尤其是,依要處理之 特別信號分區而定,根據預定之公式加以計算參數。 在解碼器端,解碼器接收一單音信號和BCC位元流, 即每幀有相互聲道時間差之第一參數集及相互聲道位準差 之第二參數集。將單音信號轉換成頻域並予輸入至亦接收 -7- 1309140 已解碼ICLD和ICTD値之合成區。在合成區或重建區中, 使用BCC參數(ICLD和ICTD)實施單音信號之權重操作以 重建多聲道信號,在頻率/時間轉換後,該多聲道信號表現 原始多聲道音訊信號之重建。 在BCC之情況中,立體聲合倂模組60作用在傳輸聲 道旁側資訊以達參數資料資料之量化及ICLD和ICTD參數 之編碼’其中可使用其中一原始聲道爲參照聲道供聲道旁 側資訊編碼用。正常而言,載波聲道係由相關之原始聲道 ^的加總所形成。 當然’以上技術僅提供單音表示之只能使載波聲道解 碼’但不能產生用於產生多於一個輸入聲道之一或更多槪 ' 算之參數資料的解碼器。 在美國專利申請案之US 2003/0219130 A1、2003/ 0026441 A1 及 2003/0035553 A1 中進一步說明稱爲 BCC 技術。此外’進一步參見福勒和邦迦第之"雙聲道提示編碼 ,第二部:設計及應用",1 9 9 3年1 1月,第1 1冊,第6 β 號之IEEE :音訊與語音處理之執行。而且,亦參見2〇〇2 年5月’福勒和邦迦第所著之”應用於立體聲及多聲道音訊 壓縮之雙聲道提示編碼”,預印本,第112屆之音訊工程學 會(AES)會議,及由赫利,福勒,俄特爾(C. ERTEL),希派 特(J. HILPERT),候哲(A. HOELZER),史賓格(C. SPENGER) 所著之"MP3環繞:多聲道音訊之有效及相容編碼,2004 年柏林(BERLIN)之第116屆AES'會議,預印本6049號。 以下當中’針對第6圖至第8圖表示多聲道音訊信號編碼 1309140 用之通用BCC編碼設計。第6圖則表示多聲道音訊信號編 碼/傳輸用之通用BCC編碼設計。在BCC編碼器112之輸 入1 1 〇處輸入多聲道音訊輸入信號並使其在所謂的下行混 合區1 14中進行”下行混合",即轉換成單一總聲道。在本 實施例中,輸入110處之信號爲具有一前左聲道和一前右 聲道,一左環繞聲道和一右環繞聲道,及一中央聲道之5 聲道環繞信號。向來,下行混合區藉由將這些五個聲道簡 易添加成一單音信號而產生加總信號。在本技術中得知其 B 它下行混合設計,利用多聲道輸入信號,產生具有單一聲 道或具有在任何情況下下行混合信號小於原始輸入聲道之 數量的多數下行混合聲道。在本實施例中,當從五個輸入 聲道產生四個載波聲道時則已經達成下行混合之操作。將 單一輸出聲道及/或多輸出聲道輸出在加總信號線115上。 將由BCC分析區1 1 6所得到之旁側資訊輸出在旁側資 訊線1 17上。在BCC分析區中可計算相互聲道位準差(ICLD) 、相互聲道時間差(ICTD)或相互聲道相互關聯値(ICC値) B 。因此,對於BCC合成區122中之重建有三個相異參數集 ,亦即相互聲道位準差(ICLD)、相互聲道時間差(ICTD)及 相互聲道相互關聯値(ICC)。 一向以量化和編碼格式將具有參數集之加總信號及旁 側資訊傳輸至BCC解碼器120。BCC解碼器將被傳輸(及在 已編碼傳輸的情況中被解碼)之加總信號分成許多次頻帶 並實施縮放比例、延遲及進一步之處理,以產生欲重建之 數聲道之次頻帶。實施這種處理使得在輸出121處所重建 -9- 1309140 之多聲道信號之ICLD、ICTD及ICC參數(提示)類似於在 進入BCC解碼器112中輸入110處之原始多聲道信號的各 別提示。爲此起見,BCC解碼器120包含BCC合成區122 和旁側資訊處理區1 2 3。 以下將針對第7圖說明BCC合成區122之內部結構。 將線1 1 5上之加總信號輸入在向來以過濾庫F B 1 2 5爲實例 之時間/頻率轉換區內。在區125輸出處,有N個次頻帶信 號或在特殊情況中,如果音訊過濾庫1 2 5實施從N個時域 ® 取樣產生N個頻譜係數之轉換時則有一頻譜係數區。 B C C合成區1 2 
2進一步包含延遲級1 2 6,位準修飾級 1 2 7,相互關聯處理級1 2 8以及表現反相過濾庫之級IF B 1 2 9 。如第6圖中所示,在級120之輸出處可將在5聲道環繞 系統情況中具有例如5聲道之重建多聲道音訊信號輸出在 喇叭1 2 4機組上。 第7圖進一步說明利用元件1 2 5將輸入信號轉換成頻 域或過濾庫域。如節點1 3 0所示,元件1 2 5所輸出之信號 被加乘,藉以得到數個版本相同之信號。原始信號之版本 數等於在欲重建之輸出信號中的輸出聲道數量。如果各版

IX. Description of the Invention

[Technical Field]

The present invention relates to parametric multi-channel processing techniques, and in particular to encoders/decoders for generating and/or reading a flexible data syntax and for associating parameter data with the data of the downmix and/or transmission channels.

[Prior Art]

In addition to the two stereo channels, the recommended multi-channel surround representation includes a center channel C and two surround channels, namely the left surround channel Ls and the right surround channel Rs, and additionally, if applicable, a subwoofer channel also referred to as the LFE channel (LFE = low frequency enhancement). This reference sound format is also referred to as 3/2 (plus LFE) stereo and, more recently, as 5.1 multi-channel, meaning that there are three front channels and two surround channels. In general, five or six transmission channels are required. In a reproduction environment, at least five loudspeakers are required at five different positions to obtain an optimum so-called sweet spot at a given distance from the five correctly placed loudspeakers. The subwoofer, however, can be positioned relatively freely.

There are several techniques for reducing the amount of data required to transmit a multi-channel audio signal. Such techniques are also called joint stereo techniques. For this purpose, reference is made to Fig. 5, which shows a joint stereo device 60. This device may be a device implementing, for example, the intensity stereo technique (IS technique) or binaural cue coding (BCC technique). Such devices typically receive at least two channels (CH1, CH2, ... 
CHn) as input signals and output at least one single carrier channel (downmix) together with parametric data, i.e. one or more parameter sets. The parametric data are defined such that an approximation of each original channel (CH1, CH2, ..., CHn) can be calculated in the decoder.

Normally, the carrier channel will include subband samples, spectral coefficients or time-domain samples, etc., which provide a comparatively fine representation of the underlying signal, whereas the parametric data and/or parameter sets do not include any such samples or spectral coefficients. Instead, the parametric data include control parameters for controlling a given reconstruction algorithm, such as weighting by multiplication, time shifting, frequency shifting, and so on. The parametric data therefore provide only a comparatively coarse representation of the signal or of the associated channel. In numbers, the amount of data required by a carrier channel (which is compressed, i.e. encoded using, for example, AAC) is in the range of 60 to 70 kbit/s, while the amount of data required by the parametric side information is on the order of 1.5 kbit/s per channel. Examples of parametric data are the known scale factors, intensity stereo information or binaural cue parameters, as described below.

The intensity stereo coding technique is described in AES Preprint 3799, "Intensity Stereo Coding", J. Herre, K. H. Brandenburg, D. Lederer, February 1994, Amsterdam. In general, the concept of intensity stereo is based on a principal axis transform applied to the data of the two stereo audio channels. If most of the data points are concentrated around the first principal axis, a coding gain can be achieved by rotating both signals by a given angle prior to coding. However, this does not always hold for real stereo reproduction techniques. The reconstructed signals for the left and right channels consist of differently weighted or scaled versions of the same transmitted signal. 
Nevertheless, the reconstructed signals differ in amplitude but are identical with respect to their phase information. The energy-time envelopes of both original audio channels are, however, preserved by means of the selective scaling operation, which typically operates in a frequency-selective manner. This corresponds to human sound perception at high frequencies, where the dominant spatial cues are determined by the energy envelopes.

In addition, in practical implementations the transmitted signal, i.e. the carrier channel, is formed from the sum signal of the left and right channels instead of rotating both components. Furthermore, this processing, i.e. generating the intensity stereo parameters for performing the scaling operation, is carried out in a frequency-selective manner, i.e. independently for each scale factor band, i.e. for each encoder frequency partition. Preferably, both channels are combined to form a combined or "carrier" channel. In addition to the combined channel, the intensity stereo information is determined, which depends on the energy of the first channel, the energy of the second channel, and the energy of the combined or sum channel.

The BCC technique is described in AES Convention Paper 5574, "Binaural Cue Coding Applied to Stereo and Multi-Channel Audio Compression", C. Faller, F. Baumgarte, May 2002, Munich. In BCC coding, a number of audio input channels are converted into a spectral representation using a DFT-based transform with overlapping windows. The resulting spectrum is divided into non-overlapping partitions. Each partition has a bandwidth proportional to the equivalent rectangular bandwidth (ERB). For each partition, i.e. for each band, and for each frame k, i.e. for each block of time samples, the so-called inter-channel level differences (ICLD) and the so-called inter-channel time differences (ICTD) are calculated. The ICLD and ICTD parameters are quantized and encoded to obtain a BCC bitstream. 
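The per-partition level difference described above can be illustrated with a minimal sketch. The text does not state the formula, so the sketch assumes the common definition of a level difference as a power ratio in dB relative to a reference channel. The function names, the `edges` partition boundaries and the small `eps` guard are illustrative assumptions, not part of the cited paper.

```python
import math


def band_energy(spectrum, lo, hi):
    """Sum of squared magnitudes of the spectral coefficients in [lo, hi)."""
    return sum(abs(c) ** 2 for c in spectrum[lo:hi])


def icld_per_partition(ref_spectrum, ch_spectrum, edges, eps=1e-12):
    """Level difference in dB of one channel relative to the reference
    channel, one value per non-overlapping partition.

    `edges` lists the partition boundaries as coefficient indices
    (hypothetical layout; a real codec would derive them from the ERB scale).
    `eps` only guards against log10(0) for silent partitions.
    """
    values = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        e_ref = band_energy(ref_spectrum, lo, hi)
        e_ch = band_energy(ch_spectrum, lo, hi)
        values.append(10.0 * math.log10((e_ch + eps) / (e_ref + eps)))
    return values


# One frame, four coefficients, two partitions of two coefficients each.
reference = [1 + 0j, 1 + 0j, 2 + 0j, 2 + 0j]
other = [2 + 0j, 2 + 0j, 1 + 0j, 1 + 0j]
print(icld_per_partition(reference, other, edges=[0, 2, 4]))
```

In this toy frame the second channel carries four times the reference energy in the first partition and a quarter of it in the second, so the two values come out at roughly +6 dB and -6 dB.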
Each channel has inter-channel level differences and inter-channel time differences with respect to a reference channel. In particular, the parameters are calculated according to predetermined formulas that depend on the particular partition of the signal to be processed.

On the decoder side, the decoder receives a mono signal and the BCC bitstream, i.e. per frame a first parameter set for the inter-channel time differences and a second parameter set for the inter-channel level differences. The mono signal is transformed into the frequency domain and fed into a synthesis block, which also receives the decoded ICLD and ICTD values. In the synthesis or reconstruction block, the BCC parameters (ICLD and ICTD) are used to perform a weighting operation on the mono signal in order to reconstruct the multi-channel signal, which, after a frequency/time conversion, represents a reconstruction of the original multi-channel audio signal.

In the case of BCC, the joint stereo module 60 operates to output the channel side information as quantized and encoded ICLD and ICTD parameters, where one of the original channels may be used as the reference channel for coding the channel side information. Normally, the carrier channel is formed from the sum of the participating original channels.

Naturally, the above techniques provide only a mono representation for a decoder that can only decode the carrier channel but is not able to process the parametric data for generating one or more approximations of more than one input channel.

The technique referred to as BCC is further described in US patent applications US 2003/0219130 A1, 2003/0026441 A1 and 2003/0035553 A1. See also C. Faller and F. Baumgarte, "Binaural Cue Coding. Part II: Schemes and Applications", IEEE Transactions on Audio and Speech Processing, Vol. 11, No. 6, November 1993. 
Moreover, see also C. Faller and F. Baumgarte, "Binaural Cue Coding Applied to Stereo and Multi-Channel Audio Compression", Preprint, 112th Audio Engineering Society (AES) Convention, May 2002, and J. Herre, C. Faller, C. Ertel, J. Hilpert, A. Hoelzer, C. Spenger, "MP3 Surround: Efficient and Compatible Coding of Multi-Channel Audio", 116th AES Convention, Berlin, 2004, Preprint 6049.

In the following, a general BCC scheme for multi-channel audio coding is presented with reference to Figs. 6 to 8. Fig. 6 shows such a general BCC scheme for coding/transmission of multi-channel audio signals. The multi-channel audio input signal is fed in at an input 110 of a BCC encoder 112 and is "downmixed" in a so-called downmix block 114, i.e. converted into a single sum channel. In the present example, the signal at input 110 is a 5-channel surround signal having a front left channel, a front right channel, a left surround channel, a right surround channel, and a center channel. Typically, the downmix block generates the sum signal by simply adding these five channels into a mono signal. Other downmix schemes are known in the art, in which a multi-channel input signal is used to obtain a downmix signal with a single channel or with a number of downmix channels that is in any case smaller than the number of original input channels. In the present example, a downmix operation would already be achieved if four carrier channels were generated from the five input channels. The single output channel and/or the number of output channels is output on a sum signal line 115.

Side information obtained by a BCC analysis block 116 is output on a side information line 117. In the BCC analysis block, inter-channel level differences (ICLD), inter-channel time differences (ICTD) or inter-channel correlation values (ICC values) may be calculated. 
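The simplest downmix mentioned above, adding the five input channels into one sum channel, can be sketched as follows. This is only an illustration under the assumption that each channel is a list of time-domain samples of equal length; the function name is hypothetical, and practical encoders may apply weights or scaling that the text does not describe.

```python
def downmix_to_mono(channels):
    """Add equally long channels sample by sample into a single sum channel."""
    lengths = {len(ch) for ch in channels}
    if len(lengths) != 1:
        raise ValueError("all channels must have the same number of samples")
    return [sum(samples) for samples in zip(*channels)]


# Five input channels (front left, front right, left surround,
# right surround, center), two samples each.
five_channels = [[1, 2], [3, 4], [0, 0], [1, 1], [5, 5]]
print(downmix_to_mono(five_channels))
```

The BCC analysis block would then work on the same five input channels to produce the side information that accompanies this sum channel.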
Thus, there are three different parameter sets for the reconstruction in the BCC synthesis block 122, namely the inter-channel level differences (ICLD), the inter-channel time differences (ICTD) and the inter-channel correlation values (ICC).

The sum signal and the side information with the parameter sets are typically transmitted to a BCC decoder 120 in quantized and encoded form. The BCC decoder splits the transmitted (and, in the case of an encoded transmission, decoded) sum signal into a number of subbands and performs scaling, delaying and further processing in order to generate the subbands of the multiple channels to be reconstructed. This processing is performed such that the ICLD, ICTD and ICC parameters (cues) of the multi-channel signal reconstructed at output 121 are similar to the respective cues of the original multi-channel signal at input 110 of the BCC encoder 112. For this purpose, the BCC decoder 120 includes a BCC synthesis block 122 and a side information processing block 123.

The internal structure of the BCC synthesis block 122 is described below with reference to Fig. 7. The sum signal on line 115 is fed into a time/frequency conversion block, typically implemented as a filter bank FB 125. At the output of block 125 there are N subband signals or, in a special case, a block of spectral coefficients if the audio filter bank 125 performs a transform generating N spectral coefficients from N time-domain samples.

The BCC synthesis block 122 further includes a delay stage 126, a level modification stage 127, a correlation processing stage 128 and a stage IFB 129 representing an inverse filter bank. As shown in Fig. 6, at the output of stage 120 the reconstructed multi-channel audio signal, having for example five channels in the case of a 5-channel surround system, can be output to a set of loudspeakers 124. 
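The delay stage and the level modification stage of the synthesis block can be pictured with the following minimal sketch. It assumes that one subband signal is available as a list of samples, that the per-channel delays are whole numbers of samples, and that the delays and multiplication factors have already been derived from the side information. The function name and the list-based interface are illustrative assumptions; the correlation processing stage and the filter banks are left out.

```python
def synthesize_subband(subband, delays, gains):
    """Return one delayed and scaled copy of `subband` per output channel.

    `delays` holds a non-negative whole-sample delay per output channel,
    `gains` the matching multiplication factor (both assumed inputs).
    """
    if len(delays) != len(gains):
        raise ValueError("need exactly one delay and one gain per output channel")
    n = len(subband)
    outputs = []
    for delay, gain in zip(delays, gains):
        # Prepend `delay` zeros and keep the original length.
        delayed = ([0.0] * delay + list(subband))[:n]
        outputs.append([gain * x for x in delayed])
    return outputs


# Two output channels from one subband: the first unchanged,
# the second delayed by one sample and attenuated.
print(synthesize_subband([1.0, 2.0, 3.0], delays=[0, 1], gains=[1.0, 0.5]))
```

Each output channel is thus a differently delayed and weighted version of the same transmitted signal, which is the core idea of the synthesis.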
Fig. 7 further shows that the input signal is converted into the frequency domain or filter bank domain by means of element 125. The signal output by element 125 is multiplied, as indicated by node 130, to obtain several versions of the same signal. The number of versions of the original signal is equal to the number of output channels in the output signal to be reconstructed. If each version

之原始信號在節點1 3 0受到既定延遲d i、d2.....di、dN ,結果爲在區塊1 2 6輸出處之情況包含相同信號版本,但 延遲不同。以第6圖中之旁側資訊處理區123計算延遲參 數且該延遲參數如他們爲BCC分析區116所決定的,且從 相互聲道時間差所獲得。 问樣適用在倍增參數a!、a2、…、ai、a>j,其亦根據 -10- 1309140 BCC分析區116所決定之相互聲道位準差,由旁側資訊處 理區123加以計算。 I C C參數由B C C分析區1 1 6加以計算並用於控制區塊 1 2 8之功能使得在區塊1 2 8輸出處得到所延遲和位準操縱 信號之間的既定相互關聯値。要注意的是級1 26、1 27和 128之順序可與第7圖中所表示者相異。 進一步注意到在針對音訊信號之區塊處理,亦針對區 塊實施BCC分析。而且,亦針對頻率,即選頻方式實施 ^ BCC分析。意即,對各頻譜帶而言,各區塊有一ICLD參 數、一IC TD參數及一 ICC參數。對於橫跨所有頻帶之至 « 少一聲道之至少一區塊的IC LD參數因此表現ICTD參數集 。同樣應用至IC LD參數集,其對於至少一輸出聲道之重 建,表現至少所有頻帶ICLD參數。同樣依序適用於ICC 參數集,根據輸入聲道或加總聲道,對於至少一個輸出聲 道之重建其再次包含各種頻道之至少一區塊的數個個別 ICC參數。 ® 以下當中,參考第8圖,該圖表示可看到決定BCC參 數之情況。正常而言,可在任何聲道配對之間界定I C L D、 ICTD和ICC參數。向來’在一參照聲道和各其它輸入聲道 之間決定IC LD和ICTD參數’使得除參照聲道外之各輸入 聲道有相異之參數集。這在第8A圖中亦有說明。 然而’可以不同方式界定ICC參數。亦如第8B圖之 示意圖所示,通常在任何聲道配對間之編碼器中可產生 IC C參數。在這情況中,解碼器會實施I c c合成,使得當 1309140 它位於任何聲道配對間之原始信號中時大致得到相同結果 。然而,已有建議在任何時間,即各幀,只計算兩最強聲 道間之ICC參數。這種設計表示在第8C圖中,在表示當 中一次計算並傳輸聲道1和2之間ICC參數’且當中另一 次計算聲道1和5之間ICC參數之一實例。然後,解碼器 合成解碼器中兩個最強聲道間之相互聲道相互關聯並執行 進一步典型之自發性規則(heuristic rule)’用於合成剩下聲 道配對之相互聲道的一致性。 # 針對例如根據所傳輸之ICLD參數,計算倍增參數ai .....aN,參考引用之AES會議文件5574。ICLD參數表 現在原始多聲道信號中之能量分布。不失一般性,第8A ' 圖表示有表現所有其它聲道和前左聲道之間能量差之四個 ICLD參數。在旁側資訊處理區123中,倍增參數ai..... aN是得自ICLD參數,使得所有重建輸出聲道之總能量與 現有已傳輸之加總信號之能量相同或至少該能量成正比。 要決定這些參數之一種方式爲兩階段程序,在其中之第一 ^ 階段中,將左前聲道之倍增因子設成1而將第8C圖中其它 聲道之倍增因子設成已傳輸之ICLD値。然後,在第二階 段中,計算所有五個聲道之能量並和已傳輸加總信號之能 量比較。然後,縮小所有聲道之比例,亦即,使用對於所 有聲道皆均等之比例因子,其中,選定該比例因子,使得 縮放比例後所有重建輸出聲道之總能量等於已傳輸之加總 信號及/或已傳輸之複數加總信號之總能量。 針對從BCC編碼器傳輸至BCC解碼器,作爲進一步參 -12- 1309140 數集之相互聲道一致性測量方法之ICC ’要注意的是一致 性之操縱可藉修飾倍增因子加以實施’如藉由介於20 1〇g 1〇-6和20 log 1〇6間之隨機數値來乘上所有次頻帶之權重 因子。一般選定假隨機序列,使得所有關鍵頻帶之變化大 致相等且各關鍵頻帶內之平均値爲零。對各個不同幀或區 塊之頻譜係數使用相同序列。因此’藉修飾假隨機序列之 變化,控制音訊情境寬度。較大之變化產生較大之收聽寬 度。可在具有關鍵頻帶之寬度的個別頻帶中實施變化之修 t 飾。這允許在各物體具有相異收聽寬度之收聽情境中同時 存在數個物體。假隨機序列之適當振幅分佈如以專利公開 案20 02/02 1 9 1 3 0 A1中之表示,依對數比例呈均勻分佈。 爲了以相容方式,例如以亦適合正常立體聲解碼器之 位元流格式傳輸五個聲道,可使用1992年10月,泰爾(G. THEILE)和史鐸(G. STOLL)在舊金山(SAN FRAN CISCO) AES預印本之”音樂攝錄環繞(MUSICAM SURROUND):與 ISO/IEC 1 1 172-3相容之萬用多聲道編碼系統”中所說明之 b 所謂的矩陣式技術。 而且’進一步參見1994年12月葛利爾(B. 
GRILL)、 赫利、布蘭登堡、艾伯頓恩(E. EBERLEIN)、庫勒(J. KOLLER) 、米勒(J. MILLER)在阿姆斯特丹AES末出版品3865號之 發表"增進之MPEG 2音訊多聲道編碼,,中所說明之多聲道 編碼技術’其中,使用相容性矩陣從原始輸入聲道得到下 行混合聲道。 總而言之’例如,亦如2004年舒晳(E· SCHUUER)、 -13- 1309140 布里巴特(J· BREEBAART)、朋海根(H. PURNHAGEN)、英 格德迦在柏林第119屆AES會議,末出版品6073號,標 題爲"低複雜參數立體聲編碼"之專業出版品中之說明,你 可說是BCC技術允許多聲道音訊材料之有效且亦往後相容 編碼。就此而論,亦應提到MPEG-4標準且尤其是擴及參數 音訊技術,其中,該標準部分亦爲標號ISO/IEC 14496-3: 2 00 1 /FDAM2(參數音訊)所知。在這方面,應提到的是,尤 其是標題爲”ps data()語法”之MPEG-4標準的表8、9中 ^ 之語法。在這實例中,我們應提到語法要素"enable_icc "和 "enable_ipdopd”,其中’使用這些語法要素,開啓及關閉 傳輸ICC參數及對應於相互聲道時間差之相位。應進一步 提到的是語法要素"icc_data( )""ipd_data(),,和,,〇pd —data()" ο 總而言之,要注意的是,通常使用這種參數多聲道技 術是利用一或數個已傳輸之載波聲道,其中從Ν個原始聲 道形成Μ個已傳輸聲道,藉以再次重建Ν個輸出聲道或Κ ^ 個輸出聲道,其中Κ等於或小於原始聲道數ν。 如可從第6圖中看到的,BCC分析爲典型分離前置處 理,一方面產生參數資料且另一方面從具有Ν個原始聲道 之多聲道信號產生一或更多傳輸聲道(下列混合聲道)。雖 然這未顯示在第6圖中,一般例如利用典型的ΜΡ3或aAC 立體聲/單聲編碼器則壓縮這些下行混合聲道,故在輸出端 ,有表示壓縮形式之傳輸聲道資料的位元流且進一步有表 示參數資料之另一位元流。因此從第6圖下行混合聲道之 -14- 1309140 實際音訊編碼及/或加總信號115分別產生BCC分析。 解碼器端爲類似。依所用之編碼演算法而定’具有多 聲道能力之解碼器將首先使包含壓縮下行混合信號之位元 流解碼且在輸出端再次提供一或更多傳輸聲道’亦即一般 作爲 PCM 資料(PCM = PULSE CODE MODULATION,脈衝碼 調變)之時序。然後,在輸出端時發生BCC合成以作爲個別 分離且隔離的後處理,該後處理係以該參數資料流來自我 充分地發出信號且具備欲產生之資料,數個輸出聲道最佳 ® 是等於來自音訊解碼之下行混合信號的原始輸入聲道數。 因此,B C C分析之優點爲,例如它具有供B C C分析用 之個別過濾庫且具有供B C C合成用之個別過濾庫’所以爲 " 了不作有關一方面爲音訊壓縮及另一方面爲多聲道重建之 任何妥協,其與音訊編碼器/解碼器之過濾庫是分開的。一 般而言,從兩應用領域最適宜要配備之多聲道參數處理因 此分別完成音訊壓縮。 然而,這種槪念之缺點爲必須對多聲道重建和音訊解 ^ 碼傳輸完整發出信號。如典型的情況,當音訊解碼器和多 聲道重建手段實施相同或類似步驟且因此需相同及/或相 互相依之配置設定時,這尤其不利。由於完全不同之槪念 ,因此傳輸兩次表示性資料,造成人爲的資料量"擴充”’ 這最終是由於在音訊編碼/解碼及多聲道分析/合成之間已 選擇不同之槪念之事實。 另一方面,完全"鏈結"多聲道重建至音訊解碼會大量 限制其彈性,因爲在那情況中,分開兩處理步驟以便能用 -15- 1309140 最適宜方式實施各處理步驟的實際重要目標必須被放棄。 因此,尤其是在亦稱"彙接式(TANDEM)"編碼之數連續編碼 /解碼級情況中會造成明顯的品質流失。如果B C C資料至已 編碼音訊資料有完全鏈結,當記錄時,必須以各個解碼來 實施多聲道重建,再次實施多聲道合成。因這是它所損耗 之每一參數技術之本質,反覆分析之合成分析將累積損耗 ,使得配有各個編碼器/解碼器級時進一步減低音訊信號之 可感知的品質。 Φ 在這情況中,如處在彙接式鏈之各音訊編解碼器一樣 地工作,即具有相同取樣率、區塊長度、推進長度、開窗 、轉換、…,即通常具有相同配置,此外,且如果亦維持 ' 各別區塊邊界時,才有可能無同時分析/合成處理參數資料 之音訊資料的解碼/編碼。然而’這種槪念會相當限制整個 槪念之彈性。尤其是有關’例如,藉額外之參數資料’意 圖使參數多聲道技術追加已存在立體聲資料’這種限制是 令人痛苦的。因已存在之立體聲資料可源自所有使用相異 • 區塊長度或甚至不在頻域中而在時域等中操作之許多不同 編碼器,這種限制將採取從一開始稍後追加之歸謬法 
(absurdum)的槪念。 【發明內容】 本發明之目的在提供多聲道音訊信號或重建參數資料 集產生用之彈性及有效率槪念° 這目的之達成是藉由如申請專利範圍第1項之多聲道 信號產生用之裝置、如申請專利範圍第14項之多聲道信號 -16- 1309140 產生用之方法、如申請專利範圍第15項之參數資料集產生 用之裝置、如申請專利範圍第18項之參數資料輸出產生用 之方法、如申請專利範圍第19項之參數資料輸出產生用之 裝置、如申請專利範圍第20項之參數資料輸出產生用之方 法、或如申請專利範圍第2 1項之電腦程式。 本發明是根據發現藉由使可包含傳輸聲道資料和和參 數資料之資料流包含已被插入在編碼器端且在解碼器端加 以評估之參數配置提示可達成一方面之效率性及另一方面 ® 之彈性。這提示表示是否從輸入資料(S卩,自編碼器傳輸至 解碼器之資料)建置多聲道重建手段,或是否藉由隨編碼演 算法之提示建置多聲道重建手段,其中利用該編碼演算法 — 已將被編碼之傳輸聲道資料加以解碼。多聲道重建手段具 有和音訊解碼器之配置設定相同之配置設定,用於使已編 碼之傳輸聲道資料解碼或至少依這設定而定。 如果解碼器檢測到第一種情況,即參數配置提示具有 第一種意義時,解碼器將在已接收輸入資料中查詢進一步 ^ 之建置資訊’適當地建置多聲道重建手段,使用資訊然後 使多聲道重建手段之配置設定生效。這種配置設定可例如 爲區塊長度、推進、取樣頻率、過濾庫控制資料,所謂的 微粒資訊(一幀中有多少BCC區),聲道配置(例如,不管何 時有”MP3"時即產生5.1輸出),有關在比例化之情況(例如 ,ICLD)中參數資料是強制要且非爲(ICTD)的資訊。 然而’如果解碼器決定參數配置提示有異於第一種意 義的第二種意義時,多聲道重建手段將依有關音訊編碼演 -17- 1309140 算法之資訊而定,在多聲道重建手段中選定配置設定,其 中,傳輸聲道資料(即,下行混合聲道)之編碼/解碼是依據 該演算法。 相對於一方面爲參數資料之個別槪念且另一方面爲壓 縮下行混合資料,產生多聲道音訊信號用之本發明裝置在 實際完全個別且自我充份之音訊資料及/或在自我充份操 作之上游音訊解碼器中,對於多聲道.重建手段之配置可 說明犯了 ”行竊(theft)”以便建置其本身。 B 當考慮到不同音訊編碼演算法時,本發明較佳實施例 中之發明槪念特別有效力。在這情況中,會必須傳輸大量 之明顯標示資訊,用於達成同步操作,亦即當中多聲道重 建手段與音訊解碼器同步操作之操作,即,對於各相異編 碼演算法所相對應之推進長度等,使得實際上獨立之多聲 道重建演算法與音訊解碼演算法同步執行。 如本發明,單一位元即足夠之參數配置提示對解碼器 發出信號,爲了其配置之用途,這要看其往下游流向那一 I 音訊編碼器。在這之後’解碼器將接收有關音訊編碼器目 .前往上游流向許多不同音訊編碼器之資訊。當它接收這資 訊時,它最佳是以這音訊編碼演算法識別來進入存放在多 聲道解碼器中之配置表到擷取爲各可能之音訊編碼演算法 所預先界定之配置資訊處’使多聲道重建手段之至少一配 置設定生效與。其中在資料流中’明顯標示配置,其中在 多聲道重建手段和音訊解碼器之間因此未作考量,當其中 亦無以多聲道重建手段之音訊解碼器資料之發明”行竊”的 -18- 1309140 情況比較,這達成明顯節省資料率。 另一方面’因爲由於資料流中之單一位元對其爲足夠 之參數配置提示,本發明槪念仍提供配置資訊之明顯標示 所固有之高彈性’故如有必要,有可能在資料流中實際傳 輸所有配置資訊,或一以混合形式,在資料流中傳輸至少 部分之參數配置資訊並從設計資訊集接取另一部分之必要 資訊。 在本發明一較佳實施例中,比較於已經存在或先前標 > 示之配置設定,不管其是否應全然改變配置設定、或其是 . 
否應如往常繼續、或作對連續佇行某一設定之反應、是否 讀進參數配置提示加以決定針對音訊解碼器是否應有多聲 道重建手段之對位、或在傳輸資料中是否包含有關配置之 至少部分明顯之資訊’從編碼器傳輸至解碼器之資料更包 含發出信號至解碼器之連續提示。 以下針對隨圖將更詳細說明本發明之較佳實施例。 【實施方式】 > 第1圖表示參數資料集產生用發明裝置之方塊電路圖 ’其中’在第1圖中所示裝置之輸出10可輸出參數資料集 °參數資料集包含參數資料,該參數資料與第〗圖中未說 明’但稍後將予討論之傳輸聲道資料一起表示N個原始聲 道’其中傳輸聲道資料將一向包含Μ個傳輸聲道,其中傳 輸聲道數Μ小於原始聲道數Ν且等於或大於1。 如第1圖中所示,將被收容在編碼器端之裝置包含多 聲道重建手段丨i,其被設計爲實施,例如,B C C分析或強 -19- 1309140 度立體聲分析之類者。在這情況中,多聲道重建手段11在 輸入12將接收N個原始聲道。然.而另外,亦可將多聲道重 建手段11設計成轉碼手段,其使用被饋入未加工處理參數 輸入13的即有未加工處理參數資料在手段11之輸出產生 參數資料。如參數資料爲如任何BCC分析手段提供給他們 之簡單BCC資料,多聲道重建手段11之處理就存在於從 輸入13複製到手段11之輸出內之複製功能。然而,亦可 將多聲道重建手段1 1設計成改變未加工處理參數資料流 Φ 之語法’藉以增添,例如,發信資料或寫入參數集,其中 在至少與彼此部分無關地從既有未加工處理之參數資料的 ' 情況下,該參數集可被解碼或略過。 第1圖中所示之裝置進一步包含發信手段14,用於決 定並使參數配置提示PKH與手段11輸出處之參數資料聯 結。尤其是’將發信手段設計成決定參數配置提示,使得 當使用內含在參數資料集中之配置資訊爲多聲道重建時, 發信手段具有第一種意義。另外,發信手段14將決定參數 φ 配置提示使得當使用配置資料爲多聲道重建時,發信手段 具有第二種意義,其中,配置資料是根據要用於及/或已用 於使傳輸聲道資料編碼之配置手段。 最後’第1圖之發明裝置包含配置資料寫入手段15, h k手段lx 5十成使配置資訊與參數資料和參數配置提示聯 結’最後在輸出1 0得到參數資料集。參數資料集1 〇因此 包含來自多聲道重建手段Η之參數資料、來自發信手段 14之參數配置^£不ΡΚΗ、且如果適用的話,來自配置畜料 寫入手段15之配置資料。在參數資料集中,資料集之這起 -20- 1 1309140 元件是根據槪定語法來配置且一向是以分時多工,如第 圖中通常稱爲結合手段16之元件所圖示。 在本發明較佳實施例中,將發信手段1 4經由控制 17耦接至配置資料寫入手段15,僅當參數配置提示具有 一種意義時,即在多聲道重建中當以任何方式無法存取 碼器中所存在之配置資訊時,而當明顯發出信號時,即 進一步配置資訊存在於參數資料集當中時,即觸動配置 料寫入手段1 5。在另一情況中,參數配置提示具有第二 φ 意義,未啓動配置資料寫入手段15在輸出10處導入參 資料集當中之資料,因爲如稍後之討論,這種資料將不 解碼器所讀取及/或將不爲解碼器所需。在混合式解決方 ' 之情況,不在資料流中對每一事物發出信號,只對一部 配置發出信號,而剩下者取自例如解碼器中之配置表。 發信手段14包含控制輸入18,透過發信手段14通 是否參數配置提示具有第一種或第二種意義。如針對第 圖和第4 B圖將討論的,在所謂的"同步”操作中,最佳是 ® 取參數配置提示,使其具有第二種意義,在解碼器端以 種模式得到有關編碼演算法之資訊並依在其上面者而定 在解碼器端,在多聲道重建手段中完成配置設定。然而 在非同步操作中,控制輸入1 8將會驅動發信手段使其決 參數配置提示之第一種意義時,由解碼器解譯該提示使 資料本身中有配置資訊,且將不使用傳輸聲道資料所根 之音訊編碼演算法。 要注意的是參數資料集及/或參數資料輸出未必一 線 第 解 當 資 種 數 爲 案 分 知 4A 選 這 定 得 據 定 -21- 1309140 要相對彼此爲固定形式。因此,未必一定要以串流或封包 一起傳輸配置提示,配置資料及參數資料,但亦可將其彼 此分開提供給解碼器。 以下討論將針對桌4A圖提出所謂的"同步"操作。爲說 明起見,第4 A圖說明以幀序列4 0說明參數資料,其中幀 序列40之前有標首碼41,該標首碼中有由發信手段14之 產生之參數配置提示,且如果適用的話,更有由配置資料 寫入手段15所產生之配置資訊。在手段1丨輸出處之參數 ® 資料是收容在幀1、2、3、4中’這是爲什麼在第4A圖中 他們亦稱爲負載量資料之原因。 當在第1圖中發信手段14之輸出處有提及且進一步在 第4A圖中之標首碼41亦有提及之連續提示FSH具有既定 
意義造成解碼器維持時即繼續與先前所通信之相同配置設 定’而當連續提示FSH具有另一種意義時,即根據參數配 置提示’根據資料流中之配置資訊或根據提示擷取到解碼 器端之音訊編碼演算法的配置資料,決定多聲道重建手段 β中之配置設定是否生效。 第4Α圖進一步表現與時間有關之已編碼傳輸資料區 序列42 ’其亦具有四個幀,幀1、幀2、幀3、幀4。以第 4Α圖中之垂直箭頭說明具有已編碼傳輸聲道之時間有關 的參數資料。因此,當使用重疊視窗時已編碼傳輸聲道資 料區將總是有關於輸入資料區及/或至少與前區比較將儲 備在一資料區中新處理多少資料之推進,且在同步操作中 ’該推進將與資料區長度及/或取得參數資料處之推進同步 -22- 1309140 。這確保不會遺失一方面爲重建參數且另一方面爲傳輸聲 道資料間之連結。 這將利用簡短實例加以說明。假設爲一 5聲道輸入信 號’這5聲道輸入信號將具有含時間取樣分別從時間X到 時間y之五個相異音訊聲道。在第6圖之下行混合級1 1 4 中,則產生將與多聲道輸入資料同步之至少一傳輸聲道。 從時間X到時間y之一部分傳輸聲道資料因此將對應於從 時間X到時間y之一部分各別多聲道輸入資料。而且,第 ^ 6圖之B C C分析手段1 1 6產生,例如,參數資料,再次恰 爲從時間X到時間y的傳輸聲道資料之時間選取,藉以在 解碼器端再次產生各別的從時間X到時間y之輸出聲道資 ' 料,從時間X到時間y之傳輸聲道資料及從時間X到時間 y之參數資料。 當使參數資料賴以產生寫入之入框(framing)等於音訊 解碼器用於壓縮一或更多個傳輸聲道所賴以操作之入框時 即自動達成同步操作。如果因此參數資料和已編碼傳輸聲 ^ 道資料之幀(第4A圖中之40和42)總是與相同時間部分有 關時,多聲道重建手段可總是輕易地處理對應於音訊幀之 資料並同時處理參數幀。 在同步操作中,傳輸下行混合資料用之音訊解碼器的 幀長度因此等於參數多聲道設計所用之幀長度。類似地’ 當然亦有可能性爲在幀長度與參數資料和已編碼傳輸聲道 資料之間有整數關係。在這情況中,甚至可將參數多聲道 編碼之旁側資訊多工處理成音訊下行混合信號之已編碼位 -23- 1309140 元流,藉以產生單一位元流。在已存在於立體聲資料之" 修正"情況中,仍會有兩相異之資料流。然而,在兩幀序列 之間會有1 : 1及/或m : 1或m : η之關係。入框光柵將不 會彼此相對位移。因此,在音訊資料幀和相對應之參數旁 側資訊資料幀之間有明確關聯。這種模式有利於各種應用 〇 根據本發明,在這種情況中,參數配置提示會具有第 | —種意義。這意爲標首碼41中沒有或僅有部分配置資訊, 因爲多聲道重建手段本身設置有關基本音訊編碼器資訊, - 且依在其上面而定,選取其配置設定,亦即,例如推進或 . 
區塊長度等之時間取樣數。 對照之下,第4Β圖表示非同步操作。當傳輸聲道資料 4 2 1不具,例如’幀結構時,但只產生爲p c Μ取樣流時則 存在非同步操作。另外,當音訊編碼器具不規則幀結構或 就具有幀長度及/或幀光柵相異於參數資料40之幀光柵之 ^ 幀結構時亦會產生這種非同步情況。在此,因此將參數多 聲道編碼設計和音訊編碼/解碼手段考慮爲不依賴彼此而 定之隔離和不同處理級。在當中有數個編碼/解碼續接級, 所謂的彙接編碼方案之情況中,這特別有利。如將參數資 料固定親合至壓縮音訊資料時,在各編碼/解碼時必須同時 完成多聲道合成和隨後之多聲道分析。因這些操作旦損来毛 性,該損耗會逐漸累積’這將造成多聲道質變之觀感。 在這種彙接鏈中’參數配置提示對第二種意義之設定 24· 1309140 及寫入資料流之配置資訊與基本音訊編碼器無關,允許解 碼器中多聲道重建手段之配置設定。可因此以任何方式對 下行混合資料加以解碼/編碼,不需同時總是必須實施多聲 道合成或多聲道分析。如參數資料語法將配置資訊導入資 料流且最佳是參數資料流內,可說是,以已解碼之傳輸聲 道資料之時間取樣儲備參數資料之絕對關聯性,即如在同 步操作中爲自我充足且相對於編碼器幀處理規則不會產生 之聯結。 在非同步操作中,因爲不會永遠實施多聲道分析/合成 ,因此防止多聲道聲音特性之質變。參數多聲道編碼/解碼 之幀大小因此未必一定與音訊編碼器之幀大小有關聯。 可將第1圖中之裝置實施爲編碼器及所謂的"前向轉換 編碼器"。在第一種情況中,多聲道重建手段計算參數資料 本身。在第二種情況中,它以既定形式接收參數資料並提 供參數配置提示及相關配置資料給發明參數資料輸出。前 向轉換編碼器因此產生來自任何資料輸出之發明參數資料 輸出。 以所謂的"反向轉換編碼器"完成反向測量,該反向轉 換編碼器從發明參數資料輸出產生當中已不再內含參數配 置提示,然而亦完全內含配置資料之某些輸出,使得在配 置之多聲道重建中未必使用音訊編碼演算法。 如本發明,使用輸入資料將反向轉換編碼器設計成參 數資料輸出產生用之裝置,其中,該參數資料輸出和包含 Μ個傳輸聲道之傳輸聲道資料表現N個原始聲道,其中μ -25- 1309140 小於N且等於或大於1,其中輸入資料包含參數配置提示 (4 1 ),其依傳輸聲道已用以從其—已編碼版本加以解碼之 編碼演算法(2 3)而定’具有多聲道重建手段之配置資訊是 內含在輸入資料中之第一種意義’或具有多聲道重建手段 欲使用配置資訊之第二種意義。當參數配置提示具有第二 種意義時,它包含用以寫入配置資料之寫入手段,其中’ 將寫入手段設計成首先讀取輸入資料加以擷取(3〇)參數配 置提示,並擷取有關傳輸聲道資料已用以從其一已編碼版 # 本加以解碼之編碼演算法(23)的資訊並將它輸出作爲配置 資料。 以下當中,說明針對第2圖,如本發明較佳實施例之 ' 多聲道音訊信號產生用裝置的方塊電路圖。爲了產生多聲 道音訊信號,使用之輸入資料包含表現Μ個傳輸聲道之傳 輸聲道資料及進一步包含參數資料21,取得Κ個輸出聲道 。Μ個傳輸聲道和參數資料一起表現Ν個原始聲道,其中 Μ小於Ν且等於或大於1,且其中Κ大於Μ。而且,如已 ^ 討論的,本發明包含參數配置提示ΡΚΗ,而傳輸聲道資料 20爲已如編碼演算法所編碼之傳輸聲道資料22的解碼版 本。在第2圖中所示實施例中,藉由音訊解碼器23加以實 現解碼演算法,該音訊解碼器具有,例如,如MP3槪念或 如MPEG-2(AAC)或如任何其它編碼槪念之編碼演算法在 運作。 要使用在第2圖中所示之解碼器端的裝置包含將其設 計成從傳輸聲道資料20和參數資料21,在輸出25處產生 -26- 1309140 K個輸出聲道之多聲道重建手段24。 而且’第2圖中所示之發明裝置包含配置手段 該手段設計成經由發信線2 7來送出配置設定加以 聲道重建手段24。配置手段26接收輸入資料且最 數資料21,加以讀取且相對應地處理參數配置提示 提示FSH及可能是目前之配置資料。而且,配置手 編碼演算法發信輸入2 8,藉以獲得有關音訊編碼演 資訊’已解碼之傳輸聲道資料是根據該音訊編碼演 ^ 即利用音訊編碼器2 3執行編碼演算法。該資訊可從 式取得’例如,從已解碼傳輸聲道資料之觀察結果 以他們已被編碼/解碼之編碼演算法,從他們可看得 訊的話。另外,音訊解碼器2 3本身可傳達其特性給 段26。還另外’配置手段26可亦分析已編碼之傳 資料2 2 ’從編碼演算法之編碼所據此已發生之已編 聲道資料加以決定提示。這種"編碼演算法特徵”將 含在編碼器之各輸出資料流中。 ® 以下當中,將根據對於第3Α圖之方塊圖說明配 之較佳實施例。如方塊3 0中之說明,將配置手段 成從輸入資料讀取參數配置提示ΡΚΗ並加以解譯。 
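The configuration decision walked through in this part of the description (the flow of Fig. 3A and the syntax of Fig. 4C) can be sketched in a few lines. The names `codecToBccConfigAlignment`, `bccConfigID`, `MP3_V1`, `CoderX` and the block length of 576 samples for MP3 are taken from the description. Everything else is an assumption made for illustration: the Python names, the table layout, the `CODERX_V1` identifier and its block length of 1024. The continuation cue is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BccConfig:
    config_id: str
    block_length: int  # time samples per parameter frame


# Predefined configurations held in the multi-channel decoder.
# Only the MP3 entry (576 samples) comes from the description;
# the CoderX entry is a made-up placeholder.
CONFIG_BY_ID = {
    "MP3_V1": BccConfig("MP3_V1", 576),
    "CODERX_V1": BccConfig("CODERX_V1", 1024),
}

# Lookup table from the audio coder identification to a configuration ID.
CODEC_TO_CONFIG_ID = {"MP3": "MP3_V1", "CoderX": "CODERX_V1"}


def select_config(codec_to_bcc_config_alignment, codec_name=None, explicit_config_id=None):
    """Choose the configuration of the multi-channel reconstruction.

    A flag value of 1 (second meaning) derives the configuration from the
    audio coder that decoded the transmission channels; any other value
    (first meaning) uses the bccConfigID carried in the data stream.
    """
    if codec_to_bcc_config_alignment == 1:
        config_id = CODEC_TO_CONFIG_ID[codec_name]
    else:
        config_id = explicit_config_id
    return CONFIG_BY_ID[config_id]


# Implicit signaling: one flag bit plus knowledge of the audio coder.
print(select_config(1, codec_name="MP3"))
# Explicit signaling: the stream itself names the configuration.
print(select_config(0, explicit_config_id="CODERX_V1"))
```

The point of the design is visible in the first call: in the implicit case a single flag is enough, because the remaining settings are looked up on the decoder side instead of being written into the data stream.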
31中之說明’如參數配置提示具有第一種意義時, 段將繼續在參數資料流中讀取,抽取參數資料流中 資訊(或至少部分之配置資訊)。然而,如果步驟3 0 數配置提示ΡΚΗ具有第二種意義時,在步驟32中 手段將得到有關已解碼傳輸聲道資料所根據之編碼 26,將 建置多 佳是參 ,連續 段包含 算法之 算法, 不同方 ,如果 到該資 配置手 輸聲道 碼傳輸 向來內 置手段 26設計 如方塊 配置手 之配置 決定參 ,配置 演算法 -27- 1309140 的資訊。 如果有數種基本上可能的編碼演算法供設計多聲道信 號產生用之發明裝置,步驟32隨後接著當中多聲道重建手 段根據存在於解碼器端之資訊決定(3 3)配置設定之步驟33 。這例如可以查詢表(LU T)之形式加以完成。在步驟3 2末 ,如果得到音訊編碼器識別提示,則使用該音訊編碼器識 別提示在步驟3 3中輸入查詢表,其中使用音訊編碼器識別 提示作爲索引。索引(index)中所關聯者,發現的有與這種 ® 音訊編碼器相關聯之各種配置設定,如區塊長度、取樣率 、推進等。 在步驟34中然後將配置設定套用在多聲道重建手段 。然而在步驟30中,如果選定參數配置提示之第一種意義 ,如第3圖中方塊3 1和方塊3 4間之連結箭頭所表示,根 據內含在參數資料流中之配置資訊使相同之配置設定生效 〇 發明設計具有彈性,支援明示及暗示之配置資訊發信 ^ 方法。這是參數配置提示PKH所作用的,在最佳情況中最 佳是以旗標將其插入,且只需單一位元表示本質上配置資 訊之發信。參數多聲道解碼器可隨後評估這旗標。如果以 這旗標發出明確可用之配置資訊的可用信號則使用這配置 資訊。另一方面,如果旗標表示暗示性之發信,則解碼器 將使用有關已使用音訊或語音編碼方法之資訊並根據已發 信之編碼方法套用配置資訊。爲此起見,參數多聲道解碼 器及/或多聲道重建手段最佳是具有包含既定數量音訊或 -28- 1309140 語音編碼器之標準配置資訊的查詢表。然而亦有,例如, 可包含亦可修改解答等之査詢表以外的其它可能性。通常 ,依實際現有之編碼器識別資訊而定,解碼器可以其本身 中現有之預定資訊提供配置資訊。 這種槪念在以最小之額外努力可達成完整參數設計配 置當中尤其有利,其中,在極端情況中,單一位元將足夠 ,這與所有配置資訊,就位元而論,必須以相當費力才能 明確地被寫入本身之資料流的情況形成對比。 B 如本發明,可來回發出信號。甚至如果傳輸聲道資料 之表示變更,例如,當使傳輸聲道資料解碼且稍後再編碼 ,亦即,當有彙接編碼情況時,這允許簡單之多聲道資料 處理。 發明槪念因此允許在一方面爲同步操作而在如有必要 時在另一方面切換到非同步操作的情況下節約發信位元, 亦即,有效率地實施位元之節約,且另一方面在有關’’補充” 既有立體聲資料至多聲道表現具特別重要性之彈性處理。 B 以下當中,對於第4 C圖,有以語法假編碼產生多聲道 音訊信號之發明裝置的典範實施例。首先,讀入變數 "useSameBccConfig"之値。此處,該變數作爲連續提示用 。所以,當這變數,即連續提示之値等於,例如,爲1時 ,只連續解譯參數配置提示。然而,如連續提示不等於1 ,即其具有其他意義時,則使用先前已傳輸之配置。如在 多聲道重建手段尙未有配置時,則必須等待直到它得到完 全第一之配置資訊及/或配置設定。 -29- 1309140 以下將檢視參數配置提不。變數"codecToBccConfigAlignment" 作爲參數配置提示PKH用。如果這變數等於1,即,如果 它具有第二種意義時,解碼器將不使用任何進一步之配置 資訊’但如可從第4C圖中以"case"當起始線所看到的,將 根據如MP3、CoderX或CoderY之編碼器識別決定配置資 訊。利用實例’要注意的是,第4C圖中所示之語法只支援 MP3、CoderX或coderY。然而,可添加任何其它的編碼名 稱/識別。 ^ 當,例如,已決定MP3爲編碼器資訊時,將變數 . 
bccConfigID設爲,例如,MP3—VI,這是語法版本爲VI 之根據MP3編碼器之配置。隨後,根據這BCC配置識別以 既定參數集建置解碼器。因此,例如,啓動576個取樣之 區塊長度作爲配置設定。因此,發出具有這種區塊長度之 入框信號。另選/額外之配置設定可爲取樣率等。然而,如 果參數配置提示(codecToBccConfigAlignment)具有第一種 意義時,即例如,値爲〇時,解碼器將從資料流明確接收 胃 配置資訊,亦即它將從資料流,即從輸入資料接收獨特之 bccConfigID。以下步驟則與剛說明的相同。然而,在這情 況下,爲多聲道重建手段之配置起見’不使用已編碼傳輸 聲道資料解碼用之解碼器之識別。 因此,在用於建置多聲道重建手段之MP3音訊解碼器 的情況下’爲了將傳輸聲道資料解碼起見’可使用 bccConfigID。另一方面,不管根本之音訊編碼器是否爲 MP 3編碼器,在資料流中亦可有任何其它之配置資訊 • 30- 1309140 bccConfigID並可加以評估。同樣適用於其它預先界定之配 置設定,如CoderX或CoderY,並適用於當中將配置資訊 (bccConfigID)設定於個體之進一步的自由配置。在較佳實 施例中,在資料流中有進一步之配置資訊,其依序發出信 號給解碼器,該解碼器應使用目前在解碼器中之已預先界 定之配置資訊及明確傳輸之配置資訊混合。 不像上述實施例,亦可將本發明應用到諸如參數編碼 視頻信號等無音訊信號之其它多聲道信號。 B 依情況而定,可以硬體或軟體實施產生及/或解碼之發 明方法。可在數位儲存媒體上完成實施,尤其是軟性磁碟 或具有可以電氣方式讀出之控制信號的CD,其可與程式化 之電腦系統聯合以達方法之執行。通常,當電腦程式產品 在電腦上執行時,本發明因此亦在於具有程式碼,用以實 施儲存在機器可讀取之載波上之方法的電腦程式產品。另 言之,當電腦程式在電腦上執行時,可因此以具有用於實 施該方法之程式碼的電腦程式使本發明實現。 B 【圖式簡單說明】 第1圖爲可使用在編碼器端之參數資料集產生用發明 裝置之方塊電路圖; 第2圖爲使用在解碼器端之多聲道音訊信號產生用裝 置之方塊電路圖; 第3圖爲本發明較佳實施例中之第2圖配置裝置操作 之原理流程圖; 第4A圖爲用於音訊解碼器和多聲道重建手段間之同 -3 1- 1309140 步操作的資料流之示意圖; 第4 B圖爲用於音訊解碼器和多聲道重建手段間之非 同步操作的資料流之示意圖; 第4 C圖爲以語法形式之多聲道音訊信號產生用裝置 之較佳實例; 第5圖爲多聲道編碼器之一般表示圖; 第6圖爲BCC編碼器/BCC解碼器路徑之示意方塊圖 第7圖爲第6圖之BCC合成區塊之方塊電路圖;以及 第8A圖至第8c圖爲用於計算參數集ICLD、ICTD和 之典型 ί槪要 的 表 示 圖 〇 要元件符號 說 明 ] 10 參 數 資 料 集 , 輸 出 11 多 聲 道 重 建 手 段 12 輸 入 13 未 加 工 處 理 之 參 數輸入 14 發 信 手 段 15 配 置 資 料 寫 入 手 段 16 結 合 手 段 17 控 制 線 18 控 制 輸 入 23 音 訊 解 碼 器 24 多 聲 道 重 建 手 段 25 輸 出 -32- 1309140The original signal is subjected to a predetermined delay d i, d2 at node 1 30. . . . . Di, dN, the result is that the same signal version is included at the output of block 1 26, but the delay is different. The delay parameters are calculated by the side information processing area 123 in Fig. 6 and the delay parameters are determined as they are for the BCC analysis area 116, and are obtained from the mutual channel time difference. 
The same applies to the multiplication parameters a1, a2, ..., ai, ..., aN, which are likewise calculated by the side information processing block 123 on the basis of the inter-channel level differences determined by the BCC analysis block 116.

The ICC parameters are calculated by the BCC analysis block 116 and are used to control the functionality of block 128 such that given correlation values between the delayed and level-manipulated signals are obtained at the output of block 128. Note that the order of stages 126, 127 and 128 may differ from that shown in Fig. 7.

Note further that, in a block-wise processing of the audio signal, the BCC analysis is also performed block-wise. Moreover, the BCC analysis is also performed frequency-wise, i.e. in a frequency-selective manner. This means that, for each spectral band, there is one ICLD parameter, one ICTD parameter and one ICC parameter per block. The ICTD parameters for at least one block of at least one channel across all bands thus represent the ICTD parameter set. The same applies to the ICLD parameter set, which represents all ICLD parameters of at least all bands for the reconstruction of at least one output channel. The same applies in turn to the ICC parameter set, which again includes several individual ICC parameters for at least one block of the various bands for the reconstruction of at least one output channel on the basis of the input channel or sum channel.

In the following, reference is made to Fig. 8, which shows a situation from which the determination of BCC parameters can be seen. Normally, ICLD, ICTD and ICC parameters can be defined between any pair of channels. Typically, ICLD and ICTD parameters are determined between a reference channel and each other input channel, so that there is a distinct parameter set for each input channel other than the reference channel. This is also illustrated in Fig. 8A.

The ICC parameters, however, can be defined in different ways. 
In the most general case, as also shown schematically in Fig. 8B, ICC parameters can be generated in the encoder between all possible channel pairs. In this case, the decoder performs ICC synthesis so that the result is approximately the same as the one present between any channel pair in the original signal. However, it has been proposed to calculate, at any time, i.e. for each frame, only the ICC parameter between the two strongest channels. This scheme is shown in Fig. 8C, which gives an example in which an ICC parameter between channels 1 and 2 is calculated and transmitted at one time and an ICC parameter between channels 1 and 5 at another time. The decoder then synthesizes the inter-channel correlation between the two strongest channels and applies further, typically heuristic, rules to synthesize the inter-channel coherence of the remaining channel pairs. For the calculation of, for example, the multiplication parameters a1, ..., aN from the transmitted ICLD parameters, reference is made to AES convention paper 5574. The ICLD parameters represent the energy distribution in the original multi-channel signal. Without loss of generality, Fig. 8A shows four ICLD parameters that represent the energy difference between each of the other channels and the front left channel. In the side information processing block 123, the multiplication parameters a1, ..., aN are derived from the ICLD parameters such that the total energy of all reconstructed output channels equals, or is at least proportional to, the energy of the transmitted sum signal. One way to determine these parameters is a two-stage procedure. In the first stage, the multiplication factor of the front left channel is set to 1 and the multiplication factors of the other channels in Fig. 8C are set to the transmitted ICLD values. In the second stage, the energy of all five channels is calculated and compared with the energy of the transmitted sum signal.
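The two-stage procedure can be sketched as follows; the sketch also includes the common scaling step that completes the second stage. It is only an illustration under stated assumptions: the ICLD values are assumed to be given in dB relative to the front left channel, the total is normalized to a sum signal of unit energy, and the function name `gains_from_icld` is hypothetical rather than taken from the source.

```python
import math

def gains_from_icld(icld_db):
    """Derive multiplication factors a_1..a_N from ICLD values.

    icld_db: level differences (assumed to be in dB) of every other
             channel relative to the front left reference channel.
    """
    # Stage 1: the reference channel gets factor 1, the other channels
    # get factors that follow the transmitted ICLD values.
    raw = [1.0] + [10.0 ** (d / 20.0) for d in icld_db]
    # Stage 2: compute the energy of all channels relative to the sum
    # signal, then apply one common scale factor so the totals match.
    total_energy = sum(g * g for g in raw)
    scale = 1.0 / math.sqrt(total_energy)
    return [g * scale for g in raw]

# Five output channels: front left plus four channels described by ICLDs.
gains = gains_from_icld([0.0, -3.0, -6.0, -6.0])
```

After scaling, the squared factors sum to one, so the reconstructed channels together carry the energy of the transmitted sum signal while keeping the level ratios given by the ICLD values.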
Then all channels are scaled down using a scale factor that is the same for all channels, the scale factor being chosen so that the total energy of all reconstructed output channels after scaling equals the total energy of the transmitted sum signal or sum signals. Regarding the ICC, which is transmitted from the BCC encoder to the BCC decoder as a further parameter measuring inter-channel coherence, note that coherence can be manipulated by modifying the multiplication factors, namely by multiplying the weighting factors of all subbands by random numbers with values between 20 log10(-6) and 20 log10(6). The pseudo-random sequence is typically chosen so that the variance is approximately equal for all critical bands and the mean is zero within each critical band. The same sequence is applied to the spectral coefficients of every frame or block. The width of the auditory scene is thus controlled by modifying the variance of the pseudo-random sequence. A larger variance produces a larger perceived width. The variance can be modified in individual bands that are one critical band wide. This allows several objects to exist simultaneously in an auditory scene, each with a different width. A suitable amplitude distribution for the pseudo-random sequence is a uniform distribution on a logarithmic scale, as described in patent publication 2002/0219130 A1. To transmit five channels in a compatible manner, for example in a bit stream format that is also suitable for a normal stereo decoder, the so-called matrixing technique can be used, as described in G. Theile and G. Stoll, "MUSICAM Surround: A universal multi-channel coding system compatible with ISO/IEC 11172-3", AES preprint, October 1992, San Francisco. See also the multi-channel coding technique published in February 1994 by Grill (B.
GRILL), J. Herre, K. Brandenburg, E. Eberlein, J. Koller and J. Mueller, "Improved MPEG-2 Audio Multi-Channel Encoding", AES preprint 3865, Amsterdam, in which a compatibility matrix is used to obtain the downmix channels from the original input channels. In summary, the BCC technique allows efficient and also backward-compatible coding of multi-channel audio material, as also described, for example, in the publication by E. Schuijers, J. Breebaart, H. Purnhagen and J. Engdegard, "Low complexity parametric stereo coding", 116th AES Convention, Berlin, 2004, preprint 6073. In this connection, reference should also be made to the MPEG-4 standard and in particular to its extension for parametric audio techniques, known as ISO/IEC 14496-3:2001/FDAM2 (Parametric Audio). In particular, the syntax in Table 8.9 of the MPEG-4 standard, entitled "syntax of ps_data()", should be mentioned. There, the syntax elements "enable_icc" and "enable_ipdopd" are used to switch on and off the transmission of ICC parameters and of a phase corresponding to the inter-channel time difference. The syntax elements "icc_data()", "ipd_data()" and "opd_data()" should also be mentioned. In summary, such parametric multi-channel techniques generally use one or several transmitted carrier channels, with M transmitted channels formed from N original channels, from which the N output channels, or K output channels with K equal to or smaller than the number N of original channels, are reconstructed.
As can be seen from Fig. 6, BCC analysis is a typical separate pre-processing step that generates parameter data on the one hand and, on the other, one or several transmission channels (downmix channels) from a multi-channel signal with N original channels. Although this is not shown in Fig. 6, these downmix channels are typically then compressed, for example with a typical MP3 or AAC stereo/mono coder, so that the output consists of one bit stream representing the transmission channel data in compressed form and a further bit stream representing the parameter data. BCC analysis thus takes place separately from the actual audio coding of the downmix channels and/or of the sum signal 115 of Fig. 6. The decoder side is analogous. A multi-channel-capable decoder first decodes the bit stream containing the compressed downmix signal, depending on the coding algorithm used, and again provides one or several transmission channels at its output, typically as a time sequence of PCM data (PCM = pulse code modulation). BCC synthesis then takes place as a separate, isolated post-processing step that is signaled entirely by, and supplied with data from, the parameter data stream, in order to generate from the audio-decoded downmix signal several output channels whose number preferably equals the number of original input channels. An advantage of BCC analysis is therefore that it uses a separate filter bank for BCC analysis and a separate filter bank for BCC synthesis, both separate from the filter bank of the audio coder/decoder, so that no compromise has to be made with respect to audio compression on the one hand or multi-channel reconstruction on the other. In general, multi-channel parameter processing can thus be performed separately from audio compression, each in the way best suited to its field of application.
The disadvantage of this concept is that complete signaling must be transmitted both for multi-channel reconstruction and for audio decoding. This is particularly disadvantageous when, as is typical, the audio decoder and the multi-channel reconstruction perform the same or similar steps and therefore need identical and/or mutually dependent configuration settings. Because the two concepts are completely separate, the configuration data is transmitted twice, which leads to an artificial "inflation" of the data volume that is ultimately due to the decision to keep audio encoding/decoding and multi-channel analysis/synthesis separate. On the other hand, completely "chaining" multi-channel reconstruction to audio decoding would greatly restrict flexibility, because the important goal of separating the two processing steps, so that each can be performed in the way best suited to it, would have to be abandoned. Significant quality losses would result in particular with several successive encoding/decoding stages, also called "tandem" coding. If the BCC data is rigidly linked to the encoded audio data, a multi-channel reconstruction must be performed with every decoding so that a multi-channel analysis can be performed again on re-encoding. Since every parametric technique is lossy by nature, the repeated analysis and synthesis accumulates losses, so that the perceived quality of the audio signal drops further with every encoder/decoder stage. In this situation, decoding/encoding the audio data without simultaneous analysis/synthesis of the parameter data would be possible only if the audio codecs of the tandem chain were identical, i.e. had the same sampling rate, block length, advance length, windowing, transform and so on, in short the same configuration, and if, in addition, the respective block boundaries were maintained.
However, such an approach would considerably restrict the flexibility of the overall concept. This restriction is especially painful because parametric multi-channel techniques are intended, for example, to supplement existing stereo data with additional parameter data. Since existing stereo data may originate from many different coders, which use different block lengths or operate in the time domain or in the frequency domain, such a restriction would reduce the idea of later supplementation to absurdity from the outset. SUMMARY OF THE INVENTION The object of the present invention is to provide a flexible and efficient concept for generating a multi-channel audio signal or a parameter data set. This object is achieved by a device for generating a multi-channel signal as claimed in claim 1, a method for generating a multi-channel signal as claimed in claim 14, a device for generating a parameter data set as claimed in claim 15, a method for generating a parameter data output as claimed in claim 18, a device for generating a parameter data output as claimed in claim 19, a method for generating a parameter data output as claimed in claim 20, or a computer program as claimed in claim 21. The present invention is based on the finding that efficiency on the one hand and flexibility on the other can be achieved when the data stream, which includes the transmission channel data and the parameter data, contains a parameter configuration prompt that is inserted at the encoder side and evaluated at the decoder side. This prompt indicates whether the multi-channel reconstruction means is to be configured from the input data (i.e. the data transmitted from the encoder to the decoder) or by reference to the coding algorithm with which the encoded transmission channel data was encoded and is decoded.
In the latter case, the multi-channel reconstruction means has a configuration setting that is identical to, or at least depends on, a configuration setting of the audio decoder used to decode the encoded transmission channel data. If the decoder detects the first case, i.e. the parameter configuration prompt has the first meaning, it looks for further configuration information in the received input data and uses this information to apply configuration settings to the multi-channel reconstruction means. Such configuration settings may be, for example, block length, advance, sampling frequency, filter bank control data, so-called granularity information (how many BCC blocks there are in a frame), the channel configuration (e.g. a 5.1 output is to be generated whenever "mp3" is present), and information on which parameter data is mandatory in a scalable case (e.g. ICLD) and which is not (ICTD). However, if the decoder determines that the parameter configuration prompt has a second meaning different from the first, the multi-channel reconstruction means selects its configuration settings depending on information about the audio coding algorithm on which the encoding/decoding of the transmission channel data (i.e. of the downmix channels) is based. In contrast to the concept of keeping parameter data and compressed downmix data strictly separate, the inventive device for generating a multi-channel audio signal thus commits, for the purpose of configuring itself, a kind of "theft" from the actually completely separate and self-sufficient audio data and/or from the self-sufficiently operating upstream audio decoder.
In a preferred embodiment, the inventive concept is particularly effective when different audio coding algorithms are considered. In that case, a large amount of explicit signaling information would otherwise have to be transmitted to achieve synchronous operation, i.e. operation in which the multi-channel reconstruction means runs synchronously with the audio decoder: the advance length and similar settings corresponding to each coding algorithm would have to be signaled so that the actually independent multi-channel reconstruction algorithm runs synchronously with the audio decoding algorithm. According to the invention, a single bit suffices as parameter configuration prompt to signal to the decoder that, for its configuration, it should look at which audio coder it is downstream of. The decoder then obtains information identifying which of many different audio coders it is downstream of. With this information, it preferably uses the audio coding algorithm identification to enter a configuration table stored in the multi-channel decoder, retrieves the configuration information predefined for each possible audio coding algorithm, and applies at least one configuration setting to the multi-channel reconstruction means. This achieves a significant saving in data rate compared with the case in which the configuration is signaled explicitly in the data stream, i.e. in which no reference is made between the multi-channel reconstruction means and the audio decoder and there is no inventive "theft" of audio decoder data by the multi-channel reconstruction means. On the other hand, because a single bit in the data stream suffices for the parameter configuration prompt, the inventive concept still offers the high flexibility inherent in explicit signaling of configuration information where that is needed.
That is, all configuration information can actually be transmitted in the data stream if necessary, or, in a mixed form, at least part of the configuration information can be transmitted in the data stream while the remaining information is taken from a set of predefined information. In a preferred embodiment of the invention, the data transmitted from the encoder to the decoder further contains a continuation prompt that signals to the decoder whether it should change configuration settings at all relative to those already present or previously signaled, or continue as before, or whether, in response to a particular setting of the continuation prompt, it should read the parameter configuration prompt to determine whether the multi-channel reconstruction means is to be aligned with the audio decoder or whether the transmitted data contains at least some explicit configuration information. Preferred embodiments of the present invention will now be described in more detail with reference to the drawings. [Embodiment] Fig. 1 shows a block circuit diagram of an inventive device for generating a parameter data set, which is output at output 10 of the device shown in Fig. 1. The parameter data set contains parameter data which, together with transmission channel data that is not shown in Fig. 1 but is discussed later, represents N original channels. The transmission channel data typically comprises M transmission channels, where M is smaller than N and equal to or greater than 1. The device shown in Fig. 1, which is located at the encoder side, includes a multi-channel reconstruction means 11 designed to perform, for example, a BCC analysis or an intensity stereo analysis. In this case, the multi-channel reconstruction means 11 receives the N original channels at input 12.
Alternatively, the multi-channel reconstruction means 11 may be designed as a transcoder that generates the parameter data at its output from raw parameter data fed to a raw parameter input 13. If the raw parameter data is already plain BCC data as supplied by any BCC analysis means, the processing of means 11 reduces to a copy function from input 13 to the output of means 11. However, means 11 may also be designed to change the syntax of the raw parameter data stream, for example to add signaling data or to write, from the raw parameter data, parameter sets that are at least partly independent of one another, so that a parameter set can be decoded or skipped. The device shown in Fig. 1 further includes a signaling means 14 for determining the parameter configuration prompt PKH and associating it with the parameter data at the output of means 11. In particular, the signaling means is designed to determine the parameter configuration prompt such that it has the first meaning when configuration information contained in the parameter data set is to be used for multi-channel reconstruction. Conversely, the signaling means 14 determines the parameter configuration prompt such that it has the second meaning when the configuration data to be used for multi-channel reconstruction is based on information about the coding algorithm that is to be used, or has been used, to encode the transmission channel data. Finally, the inventive device of Fig. 1 includes a configuration data writing means 15, which associates configuration information with the parameter data and the parameter configuration prompt, so that the parameter data set is finally obtained at output 10.
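The interplay of signaling means 14 and configuration data writing means 15 can be sketched as follows. This is a minimal illustration, not the patented syntax: the class `ParameterDataSet`, the function `build_parameter_data_set` and the dictionary keys are hypothetical, and the sketch assumes that the writer contributes data only when the prompt has the first meaning. The value 0 for the first meaning follows the codecToBccConfigAlignment example given in the description.

```python
from dataclasses import dataclass, field
from typing import Optional

EXPLICIT = 0  # first meaning: configuration is carried in the parameter data set
IMPLICIT = 1  # second meaning: configuration follows the audio coding algorithm

@dataclass
class ParameterDataSet:
    pkh: int                    # parameter configuration prompt
    config: Optional[dict]      # explicit configuration data, if any
    frames: list = field(default_factory=list)  # parameter payload frames

def build_parameter_data_set(frames, explicit, config=None):
    """Combine prompt, optional configuration data and parameter frames."""
    if explicit:
        if config is None:
            raise ValueError("explicit signaling needs configuration data")
        return ParameterDataSet(EXPLICIT, dict(config), list(frames))
    # Second meaning: the writer stays inactive, so no configuration
    # data enters the parameter data set even if some was supplied.
    return ParameterDataSet(IMPLICIT, None, list(frames))

implicit_set = build_parameter_data_set([b"frame1", b"frame2"], explicit=False)
explicit_set = build_parameter_data_set([b"frame1"], explicit=True,
                                        config={"block_length": 576})
```

In the implicit case the data set carries only the prompt and the payload, which is where the data-rate saving of the concept comes from.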
The parameter data set at output 10 therefore contains the parameter data from the multi-channel reconstruction means 11, the parameter configuration prompt from the signaling means 14 and, if applicable, configuration data from the configuration data writing means 15. Within the parameter data set, these components are arranged according to a given syntax and are typically time-multiplexed, as illustrated by the component labeled combining means 16 in Fig. 1. In the preferred embodiment of the present invention, the signaling means 14 is coupled to the configuration data writing means 15 via a control line 17, so that the writing means 15 is activated only when the parameter configuration prompt has the first meaning, i.e. when multi-channel reconstruction does not rely on configuration information that is available at the decoder anyway, but the configuration is signaled explicitly, so that further configuration information is present in the parameter data set. In the other case, in which the parameter configuration prompt has the second meaning, the configuration data writing means 15 is not activated and introduces no data into the parameter data set at output 10, since such data would not be read and/or would not be required by the decoder, as discussed later. In a hybrid solution, not every configuration setting is signaled in the data stream; only some settings are signaled, and the remainder is taken from, for example, a configuration table in the decoder. The signaling means 14 includes a control input 18, via which it is told whether the parameter configuration prompt is to have the first or the second meaning. As will be discussed with reference to Figs. 4A and 4B, in so-called "synchronous" operation the parameter configuration prompt is preferably given the second meaning.
In that case, the decoder itself determines information about the coding algorithm used and applies the configuration settings in the multi-channel reconstruction means accordingly. In asynchronous operation, however, control input 18 drives the signaling means to give the parameter configuration prompt the first meaning, which the decoder interprets to mean that configuration information is present in the data itself and that the audio coding algorithm underlying the transmission channel data is not to be consulted. Note that the components of the parameter data set and/or parameter data output need not be rigidly linked to one another as they are in the stream of Fig. 4A. The parameter configuration prompt, the configuration data and the parameter data therefore need not be transmitted together; they can also be supplied to the decoder separately. The so-called "synchronous" operation is discussed below with reference to Fig. 4A. For illustration, Fig. 4A shows the parameter data as a sequence of frames 40, preceded by a header 41 that contains the parameter configuration prompt generated by the signaling means 14 and, if applicable, configuration information generated by the configuration data writing means 15. The parameter data at the output of means 11 is contained in frames 1, 2, 3 and 4, which is why these are also called payload data in Fig. 4A. The continuation prompt FSH, which is shown at the output of the signaling means 14 in Fig. 1 and also appears in header 41 in Fig. 4A, has, in one state, the meaning that the decoder keeps the previously communicated configuration settings.
When the continuation prompt FSH has the other meaning, the configuration settings of the multi-channel reconstruction means are determined anew, depending on the parameter configuration prompt, either from configuration information in the data stream or from configuration data retrieved at the decoder side by reference to the audio coding algorithm. Fig. 4A further shows the time-associated sequence 42 of blocks of encoded transmission channel data, which likewise has four frames: frame 1, frame 2, frame 3 and frame 4. The vertical arrows in Fig. 4A indicate the temporal association between the parameter data and the encoded transmission channel data. A block of encoded transmission channel data always relates to a block of input data or, when overlapping windows are used, at least to the amount of data by which processing advances relative to the previous block. In synchronous operation, this advance is aligned with the block length and/or advance of the parameter data. This ensures that the association between the reconstruction parameters on the one hand and the transmission channel data on the other is not lost. This is explained with a short example. Assume a 5-channel input signal. This 5-channel input signal consists of five separate audio channels, each with time samples from time x to time y. In the downmix stage 114 of Fig. 6, at least one transmission channel is generated synchronously with the multi-channel input data. A portion of the transmission channel data from time x to time y thus corresponds to the portion of each multi-channel input channel from time x to time y. Moreover, the BCC analysis means 116 of Fig.
6 generates parameter data, again for exactly the portion of the transmission channel data from time x to time y, so that at the decoder side the output channel data from time x to time y can be regenerated from the transmission channel data from time x to time y and the parameter data from time x to time y. Synchronous operation is achieved automatically when the framing of the parameter data is chosen to be equal to the framing with which the audio coder compresses the one or more transmission channels. If the frames of the parameter data and of the encoded transmission channel data (40 and 42 in Fig. 4A) always relate to the same time portion, the multi-channel reconstruction means can simply process the data corresponding to an audio frame and the parameter frame at the same time. In synchronous operation, the frame length of the audio coder used to transmit the downmix data is therefore equal to the frame length used for the parametric multi-channel scheme. Likewise, an integer ratio between the frame lengths of the parameter data and of the encoded transmission channel data is of course also possible. In this case, the side information of the parametric multi-channel coding can even be multiplexed into the encoded bit stream of the audio downmix signal, producing a single bit stream. When existing stereo data is "upgraded" later, there will still be two separate data streams. However, there will be a 1:1, m:1 or m:n relationship between the two frame sequences, and the frame rasters will not shift relative to each other. There is thus an unambiguous association between the audio data frames and the corresponding side information frames. This mode is advantageous for various applications.
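The fixed relationship between the two frame rasters can be illustrated with a few lines of arithmetic. The sketch below is an assumption-laden illustration: frame lengths are counted in time samples, both rasters are assumed to start at sample 0, and the helper names `frame_relation` and `audio_frames_for` are hypothetical.

```python
from fractions import Fraction

def frame_relation(audio_frame_len, param_frame_len):
    """Return (m, n) so that m audio frames span exactly n parameter frames."""
    ratio = Fraction(param_frame_len, audio_frame_len)
    # ratio = m / n in lowest terms, hence m * audio_len == n * param_len.
    return ratio.numerator, ratio.denominator

def audio_frames_for(param_index, audio_frame_len, param_frame_len):
    """List the audio frame indices covered by one parameter frame."""
    start = param_index * param_frame_len
    end = start + param_frame_len          # exclusive sample index
    first = start // audio_frame_len
    last = (end - 1) // audio_frame_len
    return list(range(first, last + 1))

# Audio frames of 576 samples, parameter frames of 1152 samples.
relation = frame_relation(576, 1152)       # two audio frames per parameter frame
covered = audio_frames_for(1, 576, 1152)   # audio frames under parameter frame 1
```

Because the rasters do not drift, the same mapping holds for every frame index, which is what makes the association between audio frames and side information frames unambiguous.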
According to the invention, the parameter configuration prompt has the second meaning in this case. This means that header 41 contains no configuration information, or only part of it, because the multi-channel reconstruction means itself obtains information about the underlying audio coder and selects its configuration settings depending on it, for example the advance or the number of time samples per block (block length). In contrast, Fig. 4B shows asynchronous operation. Asynchronous operation is present when the transmission channel data 42' has, for example, no frame structure at all and arrives merely as a stream of PCM samples. It also occurs when the audio coder has an irregular frame structure, or a frame structure whose frame length and/or frame raster differs from that of the parameter data 40. Here, the parametric multi-channel coding scheme and the audio coding/decoding are regarded as isolated processing stages that are independent of each other. This is particularly advantageous when several encoding/decoding stages follow one another, the so-called tandem coding case. If the parameter data were rigidly coupled to the compressed audio data, a multi-channel synthesis and a subsequent multi-channel analysis would have to be performed with every decoding/encoding step. Because these operations are lossy, the losses would accumulate and the multi-channel quality would degrade steadily. In such a tandem chain, the parameter configuration prompt is given the first meaning, and the configuration information written to the data stream, which is independent of the underlying audio coder, allows the multi-channel reconstruction means in the decoder to be configured.
The downmix data can therefore be decoded/encoded in any manner without multi-channel synthesis or multi-channel analysis having to be performed at the same time. If the parameter data syntax places the configuration information in the data stream, preferably in the parameter data stream, an absolute association between the parameter data and the time samples of the decoded transmission channel data is, so to speak, stored, i.e. an association that is self-sufficient and not tied to the framing rules of the audio coder as it is in synchronous operation. In asynchronous operation, because multi-channel analysis/synthesis need not be performed every time, degradation of the multi-channel sound quality is avoided. The frame size of the parametric multi-channel encoding/decoding therefore need not be related to the frame size of the audio coder. The device in Fig. 1 can be implemented as an encoder or as a so-called "forward transcoder". In the first case, the multi-channel reconstruction means 11 calculates the parameter data itself. In the second case, it receives the parameter data in a given form and provides the inventive parameter data output with the parameter configuration prompt and the associated configuration data. The forward transcoder thus produces an inventive parameter data output from any data output. The reverse operation is performed by a so-called "reverse transcoder", which generates from the inventive parameter data output an output that no longer contains the parameter configuration prompt but does contain the configuration data, so that no reference to the audio coding algorithm is needed to configure the multi-channel reconstruction.
According to the invention, the reverse transcoder is designed as a device for generating a parameter data output from input data, where the parameter data output, together with transmission channel data comprising M transmission channels, represents N original channels, M being less than N and equal to or greater than 1. The input data contains a parameter configuration prompt (41) that has either a first meaning, namely that configuration information for a multi-channel reconstruction means is contained in the input data, or a second meaning, namely that the multi-channel reconstruction means is to use configuration information that depends on the coding algorithm (23) with which the transmission channel data is decoded from its encoded version. The device includes a writing means for writing the configuration data, which is designed to first read the input data in order to interpret (30) the parameter configuration prompt and, when the parameter configuration prompt has the second meaning, to retrieve information about the coding algorithm (23) with which the transmission channel data was decoded from its encoded version and to output it as configuration data. In the following, a block circuit diagram of a device for generating a multi-channel audio signal according to a preferred embodiment of the present invention is described with reference to Fig. 2. To generate the multi-channel audio signal, input data is used that includes transmission channel data 20 representing M transmission channels and parameter data 21 for obtaining K output channels. The M transmission channels and the parameter data together represent N original channels, where M is less than N and equal to or greater than 1, and K is greater than M.
Moreover, as already discussed, the input data includes a parameter configuration prompt PKH, and the transmission channel data 20 is a decoded version of encoded transmission channel data 22 that was encoded according to a coding algorithm. In the embodiment shown in Fig. 2, decoding is performed by an audio decoder 23 that operates according to, for example, the MP3 concept, MPEG-2 AAC or any other coding concept. The device shown in Fig. 2, which is used at the decoder side, includes a multi-channel reconstruction means 24 designed to generate the K output channels at output 25 from the transmission channel data 20 and the parameter data 21. The inventive device shown in Fig. 2 further includes a configuration means 26, which is designed to signal configuration settings to the multi-channel reconstruction means 24 via a signaling line 27. The configuration means 26 receives the input data, preferably the parameter data 21, reads it, and processes the parameter configuration prompt, the continuation prompt FSH and any configuration data present accordingly. Moreover, the configuration means has a coding algorithm signaling input 28 for obtaining information about the audio coding algorithm according to which the transmission channel data being processed was decoded, i.e. the algorithm executed by the audio decoder 23. This information can be obtained in different ways. For example, it can be obtained by observing the decoded transmission channel data, if that data reveals which coding algorithm it was encoded/decoded with. Alternatively, the audio decoder 23 itself can communicate its identity to the configuration means 26. As a further alternative, the configuration means 26 can parse the encoded transmission channel data 22 in order to find in it a coding algorithm signature indicating which coding algorithm was used for encoding. Such a "coding algorithm signature" is typically contained in every output data stream of an encoder.
In the following, the operation of the configuration means 26 in a preferred embodiment is described with reference to FIG. 3. As illustrated in block 30, the configuration means is designed to read the parameter configuration hint from the input data and interpret it. If the parameter configuration hint has the first meaning, the configuration means continues reading in the parameter data stream, as shown in block 31, and extracts the configuration information (or at least part of the configuration information) from the parameter data stream. If, however, step 30 finds that the parameter configuration hint has the second meaning, the configuration means obtains, in step 32, information about the coding algorithm on which the decoded transmission channel data is based. If the inventive device for multi-channel signal generation is designed for several possible coding algorithms, step 32 is followed by step 33, in which the configuration means determines the configuration settings for the multi-channel reconstruction means on the basis of information available at the decoder. This can be done, for example, by means of a lookup table (LUT). If step 32 yields an audio coder identification, this identification is used in step 33 as an index into the lookup table. The entry associated with the index holds the configuration settings that belong to this audio coder, such as block length, sampling rate, advance, etc. In step 34, the configuration settings are then applied to the multi-channel reconstruction means.
If, however, step 30 finds that the parameter configuration hint has the first meaning, the same configuration settings are made on the basis of the configuration information contained in the parameter data stream, as indicated by the arrow connecting block 31 and block 34 in FIG. 3. The inventive concept is flexible in that it supports both explicit and implicit signaling of configuration information. This is done by means of the parameter configuration hint PKH, which is preferably inserted as a flag and, in the best case, needs only a single bit to indicate how the configuration information is to be obtained. The parametric multi-channel decoder can then evaluate this flag. If the flag signals that configuration information is provided explicitly, this configuration information is used. If, on the other hand, the flag indicates implicit signaling, the decoder uses information about the audio or speech coding method employed and applies configuration information according to the coding method that has been signaled. For this purpose, the parametric multi-channel decoder and/or the multi-channel reconstruction means preferably has a lookup table with standard configuration information for a predetermined number of audio or speech coders. Possibilities other than a lookup table, for example modified solutions, may also be used. In general, the decoder can supply the configuration information from information predetermined within itself, depending on the coder identification actually present. This concept is particularly advantageous in that a complete configuration of the parametric scheme is achieved with minimal additional effort; in the extreme case a single bit suffices, in contrast to the situation in which all the configuration information is written explicitly into the data stream at a considerable cost in bits.
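The decision flow of steps 30 to 34 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the names `ReconstructionConfig`, `CODEC_LUT` and `select_config` are hypothetical, the flag values 0 (explicit) and 1 (implicit) follow the example values used for the syntax of FIG. 4C, and apart from the MP3 block length of 576 samples mentioned in the description, the table entries (sampling rates, advance values, the AAC row) are placeholder values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

EXPLICIT = 0  # first meaning: configuration is carried in the parameter data
IMPLICIT = 1  # second meaning: configuration follows from the audio coder


@dataclass(frozen=True)
class ReconstructionConfig:
    """Configuration settings for the multi-channel reconstruction means."""
    block_length: int  # samples per processed block
    sample_rate: int   # sampling rate in Hz
    advance: int       # samples advanced per processed block


# Hypothetical lookup table (step 33), indexed by coder identification.
# Only the MP3 block length of 576 comes from the description; the other
# values are illustrative assumptions.
CODEC_LUT = {
    "MP3": ReconstructionConfig(block_length=576, sample_rate=44100, advance=576),
    "AAC": ReconstructionConfig(block_length=1024, sample_rate=44100, advance=1024),
}


def select_config(pkh: int,
                  explicit_config: Optional[ReconstructionConfig] = None,
                  codec_id: Optional[str] = None) -> ReconstructionConfig:
    """Return the settings to apply in step 34, depending on the hint PKH."""
    if pkh == EXPLICIT:
        # Step 31: use the configuration extracted from the parameter data.
        if explicit_config is None:
            raise ValueError("explicit signaling requires configuration data")
        return explicit_config
    if pkh == IMPLICIT:
        # Steps 32 and 33: use the coder identification as a table index.
        if codec_id not in CODEC_LUT:
            raise KeyError(f"no predefined configuration for coder {codec_id!r}")
        return CODEC_LUT[codec_id]
    raise ValueError(f"unknown parameter configuration hint: {pkh}")
```

With this sketch, `select_config(IMPLICIT, codec_id="MP3").block_length` evaluates to 576, while an explicitly supplied `ReconstructionConfig` is returned unchanged when the hint has the first meaning.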
According to the invention, it is possible to switch back and forth between the two signaling modes. This allows simple handling of the multi-channel data even when the representation of the transmission channel data changes, for example when the transmission channel data is decoded and later re-encoded, that is, when there is a tandem coding situation. The inventive concept thus saves signaling bits in synchronous operation on the one hand and, on the other hand, allows switching to asynchronous operation when necessary. It thereby provides an efficient, bit-saving implementation together with flexible handling, which is of particular importance when extending stereo data to a multi-channel representation.

In the following, a typical implementation of the inventive device for generating a multi-channel audio signal is presented in the form of syntax pseudocode with reference to FIG. 4C. First, the variable "useSameBccConfig" is read. This variable serves as the continuation hint. Only when this variable, i.e. the continuation hint, is equal to, for example, 1 does processing continue with the interpretation of the parameter configuration hint. If, however, the continuation hint is not equal to 1, i.e. has the other meaning, the previously signaled configuration is used. If the multi-channel reconstruction means has not yet been configured, it must wait until it receives the first complete configuration information and/or configuration settings.

Next, the parameter configuration hint is checked. The variable "codecToBccConfigAlignment" serves as the parameter configuration hint PKH. If this variable is equal to 1, i.e. if it has the second meaning, the decoder does not use any further configuration information from the data stream. Instead, as the "case" branches in FIG. 4C show, the configuration information is determined from the coder identification, such as MP3, CoderX or CoderY. Note that the example syntax shown in FIG. 4C supports only MP3, CoderX and CoderY.
However, any other coder names/identifications can be added. When, for example, MP3 is determined as the coder information, the variable bccConfigID is set to, for example, MP3_V1, which is the configuration for an MP3 coder with syntax version V1. The decoder is then configured with the predefined parameter set belonging to this BCC configuration. For example, a block length of 576 samples is activated as a configuration setting, so that a framing with this block length is signaled. Alternative or additional configuration settings may be the sampling rate, etc. If, however, the parameter configuration hint (codecToBccConfigAlignment) has the first meaning, for example when its value is 0, the decoder receives the configuration information explicitly from the data stream, that is, it reads a specific bccConfigID from the data stream, i.e. from the input data. The subsequent steps are the same as just explained. In this case, however, the identification of the decoder used to decode the encoded transmission channel data is not used to configure the multi-channel reconstruction means. Thus, an explicitly transmitted bccConfigID may select the MP3 configuration of the multi-channel reconstruction means, which is suited to the case in which the transmission channel data is decoded by an MP3 audio decoder. On the other hand, any other configuration information may be present in the data stream and be evaluated via bccConfigID, regardless of whether the underlying audio coder is an MP3 coder. The same applies to the other predefined configuration settings, such as CoderX or CoderY, and to a further, free configuration in which the configuration information (bccConfigID) is set to an individual value.
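The syntax described for FIG. 4C can be sketched as a small bitstream parser. The following Python code is a hedged illustration only, since the figure itself is not reproduced here. The helper names `BitReader` and `read_bcc_config` are hypothetical, the 8-bit width of an explicitly transmitted bccConfigID is an assumption, and the identifiers `CoderX_V1` and `CoderY_V1` are invented by analogy with `MP3_V1`. The value mapping (continuation hint equal to 1 means "go on and interpret the parameter configuration hint") follows the example values given in the text.

```python
from typing import Optional


class BitReader:
    """Reads bits from a byte string, most significant bit first."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # current bit position

    def read(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


# Hypothetical mapping from coder identification to a predefined
# configuration ID (the "case" branches); only MP3_V1 is named in the text.
CODEC_TO_CONFIG_ID = {
    "MP3": "MP3_V1",
    "CoderX": "CoderX_V1",
    "CoderY": "CoderY_V1",
}


def read_bcc_config(reader: BitReader, codec_id: str,
                    previous: Optional[str]) -> str:
    """Return the configuration ID selected by the header flags."""
    use_same_bcc_config = reader.read(1)  # continuation hint (single bit)
    if use_same_bcc_config != 1:
        # Other meaning: keep the previously signaled configuration.
        if previous is None:
            raise RuntimeError("not configured yet; wait for a full configuration")
        return previous
    codec_to_bcc_config_alignment = reader.read(1)  # hint PKH (single bit)
    if codec_to_bcc_config_alignment == 1:
        # Second meaning: derive the configuration from the audio coder.
        return CODEC_TO_CONFIG_ID[codec_id]
    # First meaning: read bccConfigID explicitly (8-bit width is assumed).
    return f"EXPLICIT_{reader.read(8)}"
```

For example, the single byte `0xC0` (bits 1, 1, ...) selects the coder-aligned configuration, so `read_bcc_config(BitReader(b"\xC0"), "MP3", None)` returns `"MP3_V1"`. The two flags cost one bit each, whereas the explicit branch spends additional bits on the configuration ID, which is the saving the description refers to.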
In a preferred embodiment, the data stream contains further configuration information which in turn signals to the decoder that it should use both the predefined configuration information present in the decoder and explicitly transmitted configuration information.

Beyond the embodiments above, the present invention can also be applied to multi-channel signals that are not audio signals, such as parametrically coded video signals.

Depending on the circumstances, the inventive methods for generating and/or decoding may be implemented in hardware or in software. The implementation may be on a digital storage medium, in particular a floppy disk or a CD, with electronically readable control signals that can cooperate with a programmable computer system such that the method is performed. In general, the invention thus also consists in a computer program product having program code, stored on a machine-readable carrier, for performing the method when the computer program product is executed on a computer. In other words, the invention can be realized as a computer program having program code for performing the method when the computer program is executed on a computer.

[Brief description of the drawings]
FIG. 1 is a block circuit diagram of the inventive device for generating a parameter data set, usable at the encoder end;
FIG. 2 is a block circuit diagram of the inventive device for generating a multi-channel audio signal, usable at the decoder end;
FIG. 3 is a schematic flow chart showing the operation of the device of FIG. 2 in a preferred embodiment of the present invention;
FIG. 4A is a schematic diagram of a data stream for synchronous operation between the audio decoder and the multi-channel reconstruction means;
FIG. 4B is a schematic diagram of a data stream for asynchronous operation between the audio decoder and the multi-channel reconstruction means;
FIG. 4C is a representation of the multi-channel audio signal generating device in syntax form;
FIG. 5 is a schematic diagram of a multi-channel encoder;
FIG. 6 is a schematic block diagram of a BCC encoder/BCC decoder path;
FIG. 7 is a block circuit diagram of the BCC synthesis block of FIG. 6; and
FIGS. 8A to 8C are diagrams of typical arrangements for calculating parameter sets such as ICLD and ICTD.

[Description of reference numerals]
10 Parameter data set, output
11 Multi-channel reconstruction means
12 Input
13 Unprocessed parameter input
14 Signaling means
15 Configuration data writing means
16 Combining means
17 Control line
18 Control input
23 Audio decoder
24 Multi-channel reconstruction means
25 Output

26 Configuration means
27 Signaling line
28 Signaling input
40 Frame sequence
41 Header
60 Joint stereo device
110 Input
112 BCC encoder
114 Downmix stage
115 Sum signal line
116 BCC analysis block
117 Side information line
120 BCC decoder
121 Output
122 BCC synthesis block
123 Side information processing block
124 Loudspeaker
125 Filter bank
126 Delay stage
127 Level modification stage
128 Correlation processing stage
129 IFB stage
130 Node

Claims (1)

Patent Application No. 94145269, "Device and method for generating a multi-channel signal or a parameter data set" (amended August 2008)

Scope of the patent application:

1. A device for generating a multi-channel signal, which uses input data to obtain K output channels, the input data including transmission channel data representing M transmission channels and parameter data, wherein the M transmission channels and the parameter data together represent N original channels, wherein M is less than N and greater than or equal to 1, and wherein K is greater than M, wherein the input data includes a parameter configuration hint (41), the device comprising:
a multi-channel reconstruction means (24) designed to generate the K output channels from the transmission channel data and the parameter data; and
a configuration means (26) for configuring the multi-channel reconstruction means, wherein the configuration means is designed to:
read the input data in order to interpret the parameter configuration hint (30),
when the parameter configuration hint has a first meaning, extract configuration information (31) contained in the input data and bring about a configuration setting of the multi-channel reconstruction means (34), and
when the parameter configuration hint has a second meaning different from the first meaning, configure the multi-channel reconstruction means (34) using information about the coding algorithm (23) with which the transmission channel data has been decoded from an encoded version thereof, such that the configuration setting of the multi-channel reconstruction means is identical to a configuration setting of the coding algorithm (23) or depends on the configuration setting of the coding algorithm (23).

2. The device of claim 1, wherein the transmission channel data includes a transmission channel data stream having a transmission channel data syntax,
wherein the parameter data includes a parameter data stream having a parameter data syntax, wherein the transmission channel data syntax differs from the parameter data syntax,
wherein the parameter configuration hint is inserted into the parameter data according to the parameter data syntax, and
wherein the configuration means (26) is designed to read the parameter data according to the parameter data syntax and to extract the parameter configuration hint (30).

3. The device of claim 1, wherein the multi-channel reconstruction means (24) is designed to perform block-wise processing, wherein the transmission channel data is a sequence of samples, and wherein the configuration setting includes a block length or an advance, i.e. the number of samples newly processed by the multi-channel reconstruction means (24) each time a block is processed.

4. The device of claim 3, wherein the transmission channel data comprises time samples of at least one transmission channel, and the multi-channel reconstruction means (24) includes a filter bank for converting a block of time samples of the transmission channel data into a frequency-domain representation.

5. The device of claim 1, wherein the parameter data includes a sequence of blocks of parameter values, wherein a block of parameter values relates to a time portion of at least one transmission channel, and wherein the multi-channel reconstruction means (24) is configured by the configuration setting such that a block of parameter values and the associated time portion of the at least one transmission channel are used to generate the K output channels.

6. The device of claim 1, wherein the coding algorithm (23) is one of a plurality of different coding algorithms, and
wherein the configuration means (26) includes a lookup table means comprising indices and, associated with each coding algorithm index, a set of configuration information that contains the configuration settings for the respective coding algorithm,
wherein the configuration means (26) is designed to determine an index into the lookup table from the information about the coding algorithm and thereby to determine the configuration information for the multi-channel reconstruction means (33).

7. The device of claim 1, wherein the input data, in the case of a parameter configuration hint having the first meaning, contains the configuration information for the multi-channel reconstruction means (24) and, in the case of a parameter configuration hint having the second meaning, contains only part of the configuration information for the multi-channel reconstruction means or none at all.

8. The device of claim 1, wherein the configuration means (26) is designed, when the parameter configuration hint has the second meaning, to extract only part of the required configuration information from the input data and to take the remaining part of the configuration information from preset configuration information known to the multi-channel reconstruction means.

9. The device of claim 1, wherein the configuration means (26) is designed, when the parameter configuration hint has the second meaning, to obtain the information about the coding algorithm via a connection line by which the configuration means can be connected to a decoder that generates the transmission channel data from the encoded transmission channel data, or to obtain the information about the coding algorithm by reading the transmission channel data or the encoded transmission channel data.

10. The device of claim 1, wherein the input data further includes a continuation hint (41), and
wherein the configuration means (26) is designed to read and interpret the continuation hint (29), to bring about a fixed set of configuration settings or previously signaled configuration settings of the multi-channel reconstruction means in the case of a continuation hint having a first meaning, and to configure the multi-channel reconstruction means on the basis of the parameter configuration hint (30) only in the case of a continuation hint having a second meaning different from the first meaning.

11. The device of claim 10, wherein the continuation hint is associated with the parameter data according to a parameter data syntax and is a flag in the parameter data stream.

12. The device of claim 1, wherein the parameter configuration hint is associated with the parameter data according to a parameter data syntax and is a flag in the parameter data stream.

13. The device of claim 11, wherein the continuation hint or the parameter configuration hint each consists of a single bit.

14. A method for generating a multi-channel signal, which uses input data to obtain K output channels, the input data including transmission channel data representing M transmission channels and parameter data, wherein the M transmission channels and the parameter data together represent N original channels, wherein M is less than N and greater than or equal to 1, and wherein K is greater than M, wherein the input data includes a parameter configuration hint (41), the method comprising:
reconstructing (24) the K output channels from the transmission channel data and the parameter data according to a reconstruction algorithm; and
configuring the reconstruction algorithm (26) by the following sub-steps:
reading the input data in order to interpret the parameter configuration hint (30);
when the parameter configuration hint has a first meaning, extracting configuration information (31) contained in the input data and bringing about a configuration setting of the reconstruction algorithm (34); and
when the parameter configuration hint has a second meaning different from the first meaning, bringing about the configuration setting of the reconstruction algorithm (34) using information about the coding algorithm (23) with which the transmission channel data has been decoded from an encoded version thereof, such that the configuration setting is identical to a configuration setting of the coding algorithm (23) or depends on the configuration setting of the coding algorithm (23).

15. A device for generating a parameter data output which, together with transmission channel data comprising M transmission channels, represents N original channels, wherein M is less than N and greater than or equal to 1, the device comprising:
a multi-channel reconstruction means (11) for providing the parameter data;
a signaling means (14) for determining a parameter configuration hint, wherein the parameter configuration hint has a first meaning when configuration information contained in the parameter data output is to be used for a multi-channel reconstruction, and wherein the parameter configuration hint has a second meaning when configuration data is to be used for the multi-channel reconstruction that is based on the coding algorithm used to encode or decode the M transmission channels; and
a configuration data writing means (15) for outputting the configuration information in order to obtain the parameter data output.

16. The device of claim 15, wherein the configuration data writing means (15) is designed to insert a continuation hint into a parameter data set,
wherein the continuation hint, when it has a first meaning, causes a fixed set of configuration settings or previously signaled configuration settings to be used in the multi-channel reconstruction and, when it has a second meaning different from the first meaning, causes the parameter configuration hint to be used to bring about the configuration of the multi-channel reconstruction.

17. The device of claim 15, wherein the configuration data writing means is designed, when the parameter configuration hint has the second meaning (17), to associate none or only part of the required configuration information with the parameter data set.

18. A method for generating a parameter data output which, together with transmission channel data comprising M transmission channels, represents N original channels, wherein M is less than N and greater than or equal to 1, the method comprising:
providing the parameter data (11);
determining a parameter configuration hint (14), wherein the parameter configuration hint has a first meaning when configuration information contained in the parameter data output is to be used for a multi-channel reconstruction algorithm, and wherein the parameter configuration hint has a second meaning when configuration data is to be used for the multi-channel reconstruction that is based on the coding algorithm used to encode or decode the M transmission channels; and
outputting the configuration information (15) in order to obtain the parameter data output.

19. A device for generating a parameter data output using input data, wherein the parameter data output, together with transmission channel data comprising M transmission channels, represents N original channels, wherein M is less than N and greater than or equal to 1, wherein the input data includes a parameter configuration hint (41) which has a first meaning, namely that configuration information for a multi-channel reconstruction means is contained in the input data, or a second meaning, namely that the multi-channel reconstruction means is to use configuration information that depends on the coding algorithm (23) with which the transmission channel data has been decoded from an encoded version thereof, the device comprising:
a writing means for writing configuration data, wherein the writing means is designed to:
read the input data in order to interpret the parameter configuration hint (30), and
when the parameter configuration hint has the second meaning, retrieve information about the coding algorithm (23) with which the transmission channel data has been decoded from its encoded version and output it as the configuration data.

20. A method for generating a parameter data output using input data, wherein the parameter data output, together with transmission channel data comprising M transmission channels, represents N original channels, wherein M is less than N and greater than or equal to 1, wherein the input data includes a parameter configuration hint (41) which has a first meaning, namely that configuration information for a multi-channel reconstruction means is contained in the input data, or a second meaning, namely that the multi-channel reconstruction means is to use configuration information that depends on the coding algorithm (23) with which the transmission channel data has been decoded from an encoded version thereof, the method comprising:
reading the input data in order to interpret the parameter configuration hint (30), and
when the parameter configuration hint has the second meaning, retrieving information about the coding algorithm (23) with which the transmission channel data has been decoded from its encoded version and outputting the retrieved information as the configuration data.

21. A computer program product having program code for performing the method of claim 14, 18 or 20 when the computer program product is executed on a computer.
TW94145269A 2005-12-20 2005-12-20 Device and method for generating a multi-channel signal or a parameter data set TWI309140B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW94145269A TWI309140B (en) 2005-12-20 2005-12-20 Device and method for generating a multi-channel signal or a parameter data set

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW94145269A TWI309140B (en) 2005-12-20 2005-12-20 Device and method for generating a multi-channel signal or a parameter data set

Publications (1)

Publication Number Publication Date
TWI309140B true TWI309140B (en) 2009-04-21

Family

ID=45071977

Family Applications (1)

Application Number Title Priority Date Filing Date
TW94145269A TWI309140B (en) 2005-12-20 2005-12-20 Device and method for generating a multi-channel signal or a parameter data set

Country Status (1)

Country Link
TW (1) TWI309140B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI468031B (en) * 2011-05-13 2015-01-01 Fraunhofer Ges Forschung Apparatus and method and computer program for generating a stereo output signal for providing additional output channels

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI468031B (en) * 2011-05-13 2015-01-01 Fraunhofer Ges Forschung Apparatus and method and computer program for generating a stereo output signal for providing additional output channels
US9913036B2 (en) 2011-05-13 2018-03-06 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method and computer program for generating a stereo output signal for providing additional output channels

Similar Documents

Publication Publication Date Title
JP4601669B2 (en) Apparatus and method for generating a multi-channel signal or parameter data set
JP5645951B2 (en) An apparatus for providing an upmix signal based on a downmix signal representation, an apparatus for providing a bitstream representing a multichannel audio signal, a method, a computer program, and a multi-channel audio signal using linear combination parameters Bitstream
CA2645912C (en) Methods and apparatuses for encoding and decoding object-based audio signals
US9578435B2 (en) Apparatus and method for enhanced spatial audio object coding
JP4418493B2 (en) Frequency-based coding of channels in parametric multichannel coding systems.
RU2576476C2 (en) Audio signal decoder, audio signal encoder, method of generating upmix signal representation, method of generating downmix signal representation, computer programme and bitstream using common inter-object correlation parameter value
JP4519919B2 (en) Multi-channel hierarchical audio coding using compact side information
JP4987736B2 (en) Apparatus and method for generating an encoded stereo signal of an audio fragment or audio data stream
JP5134623B2 (en) Concept for synthesizing multiple parametrically encoded sound sources
RU2449388C2 (en) Methods and apparatus for encoding and decoding object-based audio signals
JP5520300B2 (en) Apparatus, method and apparatus for providing a set of spatial cues based on a microphone signal and a computer program and a two-channel audio signal and a set of spatial cues
US8266195B2 (en) Filter adaptive frequency resolution
TW201514972A (en) Audio encoder, audio decoder, methods and computer program using jointly encoded residual signals
US8885854B2 (en) Method, medium, and system decoding compressed multi-channel signals into 2-channel binaural signals
US8626515B2 (en) Apparatus for processing media signal and method thereof
TWI309140B (en) Device and method for generating a multi-channel signal or a parameter data set
JP2023541250A (en) Processing parametrically encoded audio
Breebaart et al. 19th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007