TW315561B - A multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels


Info

Publication number
TW315561B
Authority
TW
Taiwan
Prior art keywords
audio
frame
band
channel
bit
Application number
TW85114822A
Other languages
Chinese (zh)
Inventor
Stephen Malcolm Smyth
Michael Henry Smyth
William Paul Smith
Original Assignee
DTS Technology LLC
Priority claimed from US 08/642,254 (US5956674A)
Application filed by DTS Technology LLC


Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A subband audio coder 12 employs perfect/non-perfect reconstruction filters 34, predictive/non-predictive subband encoding 72, transient analysis 106, and psychoacoustic/minimum mean-square-error (MMSE) bit allocation 30 over time, frequency and the multiple audio channels to encode/decode a data stream and generate high-fidelity reconstructed audio. The audio coder windows 64 the multi-channel audio signal such that the frame size, i.e. the number of bytes, is constrained to lie in a desired range, and formats the encoded data so that the individual subframes can be played back as they are received, thereby reducing latency. Furthermore, the audio coder processes the baseband portion (0-24 kHz) of the audio bandwidth for sampling frequencies of 48 kHz and higher with the same encoding/decoding algorithm, so that the audio coder architecture is future compatible.

Description

V. Description of the Invention

Printed by the Employees' Consumer Cooperative of the Central Bureau of Standards, Ministry of Economic Affairs. This paper size conforms to Chinese National Standard (CNS) A4 (210 x 297 mm).

Background of the Invention

Field of the Invention

This invention relates to high-quality encoding and decoding of multi-channel audio signals, and more specifically to a subband encoder that employs perfect/non-perfect reconstruction filters, predictive/non-predictive subband encoding, transient analysis, and psychoacoustic/minimum mean-square-error (MMSE) bit allocation over time, frequency and the multiple audio channels to generate a data stream with a constrained decoding computational load.

Description of the Related Art

Known high-quality audio and music coders can be divided into two broad classes of schemes. The first class comprises medium-to-high frequency resolution subband/transform coders, which adaptively quantize the subband or coefficient samples within the analysis window according to a psychoacoustic masking calculation. The second class comprises low-resolution subband coders, which make up for their poor frequency resolution by processing the subband samples using ADPCM.

The first class of coders exploits the large short-term spectral variances of general music signals by allowing the bit allocation to adapt according to the spectral energy of the signal. The high resolution of these coders allows the frequency-transformed signal to be applied directly to the psychoacoustic model, which is based on a critical band theory of hearing. Dolby's AC-3 audio coder (Todd et al., "AC-3: Flexible Perceptual Coding for Audio Transmission and Storage", Convention of the Audio Engineering Society, February 1994) typically computes 1024-point FFTs on the respective PCM signals and applies a psychoacoustic model to the 1024 frequency coefficients in each channel to determine the bit rate for each coefficient. The Dolby system uses a transient analysis that reduces the window size to 256 samples to isolate the transients. The AC-3 coder uses a proprietary backward adaptation algorithm to decode the bit allocation. This reduces the amount of bit allocation information that is sent alongside the encoded audio data. As a result, the bandwidth available for audio is increased over forward-adaptive schemes, which leads to an improvement in sound quality.

In the second class of coders, the quantization of the differential subband signals is either fixed or adapts to minimize the quantization noise power across all or some of the subbands, without any explicit reference to psychoacoustic masking theory. It is commonly accepted that a direct psychoacoustic distortion threshold cannot be applied to predictive/differential subband signals because of the difficulty in estimating the predictor performance ahead of the bit allocation process. The problem is further compounded by the interaction of quantization noise with the prediction process.

These coders work because perceptually critical audio signals are generally periodic over long periods of time. This periodicity is exploited by predictive differential quantization. Splitting the signal into a small number of subbands reduces the audible effects of noise modulation and allows the long-term spectral variances in audio signals to be exploited. If the number of subbands is increased, the prediction gain within each subband is reduced, and at some point the prediction gain will tend to zero.
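The term "prediction gain" used above can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: it uses a single fixed first-order predictor fitted from the lag-1 autocorrelation, which is far simpler than the adaptive predictors of an ADPCM coder, and the function names and test tones are hypothetical. It shows that a strongly correlated signal yields a large gain, while a signal with no sample-to-sample correlation yields a gain near 0 dB, the regime in which prediction stops paying off.

```python
import math


def first_order_prediction_gain_db(samples):
    """Prediction gain in dB of a fixed first-order linear predictor.

    Assumption for illustration: the predictor coefficient is the
    normalized lag-1 autocorrelation of the block itself.
    """
    energy = sum(v * v for v in samples)
    lag1 = sum(samples[n] * samples[n - 1] for n in range(1, len(samples)))
    coeff = lag1 / energy
    # Energy of the prediction residual e[n] = x[n] - coeff * x[n-1]
    residual = sum(
        (samples[n] - coeff * samples[n - 1]) ** 2
        for n in range(1, len(samples))
    )
    return 10.0 * math.log10(energy / residual)


def tone(normalized_freq, length=4800):
    """Hypothetical test signal: a sine at a fraction of the sampling rate."""
    return [math.sin(2.0 * math.pi * normalized_freq * n) for n in range(length)]


# A slowly varying tone is highly predictable from its previous sample.
low_freq_gain = first_order_prediction_gain_db(tone(0.01))
# A tone at a quarter of the sampling rate has no lag-1 correlation.
quarter_rate_gain = first_order_prediction_gain_db(tone(0.25))

print(f"low-frequency tone gain: {low_freq_gain:.1f} dB")
print(f"quarter-rate tone gain:  {quarter_rate_gain:.1f} dB")
```

When the gain is close to 0 dB, the residual carries as much energy as the signal itself, so differential coding offers no benefit over coding the samples directly.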

Digital Theater Systems, L.P. (DTS) makes use of an audio coder in which each PCM audio channel is filtered into four subbands and each subband is encoded using a backward ADPCM encoder that adapts the predictor coefficients to the subband data. The bit allocation is fixed and is the same for each channel, with the lower-frequency subbands being assigned more bits than the higher-frequency subbands. The bit allocation provides a fixed compression ratio, for example 4:1. The DTS coder is described by Mike Smyth and Stephen Smyth, "APT-X100: A Low-Delay, Low Bit-Rate, Sub-Band ADPCM Audio Coder for Broadcasting", Proceedings of the 10th International AES Conference, 1991, pp. 41-56.

Both classes of audio coders have other common limitations. First, known audio coders encode/decode with a fixed frame size, i.e. the number of samples or period of time represented by a frame is fixed. As a result, as the encoded transmission rate increases relative to the sampling rate, the amount of data (bytes) in the frame also increases. Thus, the decoder buffer size must be designed to accommodate the worst-case scenario to avoid data overflow. This increases the amount of RAM, which is a primary cost component of the decoder. Second, the known audio coders are not easily expandable to sampling frequencies greater than 48 kHz. To do so would make the existing decoders incompatible with the format required for the new encoders. This lack of future compatibility is a serious limitation. Furthermore, the known formats used to encode the PCM data require that the entire frame be read in by the decoder before playback can be initiated. This requires that the buffer size be limited to blocks of data of approximately [...] ms so that the delay or latency does not annoy the listener.

In addition, although these coders have encoding capability up to 24 kHz, the higher subbands are often dropped. This reduces the high-frequency fidelity or ambiance of the reconstructed signal. Known coders typically employ one of two types of error detection schemes. The most common is Reed-Solomon coding, in which the encoder adds error detection bits to the side information in the data stream. This facilitates the detection and correction of any errors in the side information. However, errors in the audio data go undetected. Another approach is to check the frame and audio headers for invalid code states. For example, a particular 3-bit parameter may have only 3 valid states. If one of the other 5 states is identified, then an error must have occurred. This provides detection capability only and does not detect errors in the audio data.
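The invalid-code-state check described above can be sketched in a few lines. This is a minimal illustration, not the format of any real coder: the three codes chosen as valid, and the names `VALID_CODES` and `header_field_is_valid`, are hypothetical. It shows why such a check can flag a corrupted header field but says nothing about errors in the audio payload.

```python
# Hypothetical 3-bit header field: only three of the eight possible
# codes are assigned a meaning (the specific values are an assumption).
VALID_CODES = frozenset({0b000, 0b001, 0b010})


def header_field_is_valid(code):
    """Return True if the 3-bit code is one of the assigned states."""
    if not 0 <= code <= 0b111:
        raise ValueError("code must fit in 3 bits")
    return code in VALID_CODES


# Every unassigned code signals that an error must have occurred.
invalid_codes = [code for code in range(8) if not header_field_is_valid(code)]
print(f"{len(invalid_codes)} of 8 codes are detectable as errors")
```

A bit error that turns one valid code into another valid code passes this check unnoticed, which is one reason the scheme offers detection only, and only for the header fields it covers.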


Summary of the Invention

In view of the above problems, the present invention provides a multi-channel audio coder with the flexibility to accommodate a wide range of compression levels, with better than CD quality at high bit rates and improved perceptual quality at low bit rates, with reduced playback latency, simplified error detection, improved pre-echo distortion, and future expandability to higher sampling rates.

This is accomplished with a subband coder that windows each audio channel into a sequence of audio frames, filters the frames into baseband and high frequency ranges, and decomposes each baseband signal into a plurality of subbands. The subband coder normally selects a non-perfect reconstruction filter to decompose the baseband signal when the bit rate is low, but selects a perfect reconstruction filter when the bit rate is sufficiently high. A high-frequency coding stage encodes the high-frequency signal independently of the baseband signal. A baseband coding stage includes a VQ coder and an ADPCM coder that encode the higher and lower frequency subbands, respectively. Each subband frame includes at least one subframe, each of which is further subdivided into a plurality of sub-subframes. Each subframe is analyzed to estimate the prediction gain of the ADPCM coder, where the prediction capability is disabled when the prediction gain is low, and to detect transients in order to adjust the pre- and post-transient scale factors (SFs).

A global bit management (GBM) system allocates bits to each subframe by taking advantage of the differences between the multiple audio channels, the multiple subbands, and the subframes within the current frame. The GBM system initially allocates bits to each subframe by calculating its signal-to-mask ratio (SMR), modified by the prediction gain, to satisfy a psychoacoustic model. The GBM system then allocates any remaining bits according to an MMSE approach, either by immediately switching to an MMSE allocation, by lowering the overall noise floor, or by gradually morphing into an MMSE allocation.

A multiplexer generates output frames that include a sync word, a frame header, an audio header and at least one subframe, and that are multiplexed into a data stream at a transmission rate. The frame header includes the window size and the size of the current output frame. The audio header indicates a packing arrangement and a coding format for the audio frame. Each audio subframe includes side information for decoding the audio subframe without reference to any other subframe, high-frequency VQ codes, a plurality of baseband audio sub-subframes, in which the audio data for each channel's lower-frequency subbands is packed and multiplexed with the other channels, a high-frequency audio block, in which the audio data in the high frequency range for each channel is packed and multiplexed with the other channels so that the multi-channel audio signal is decodable at a plurality of decoding sampling rates, and an unpack sync for verifying the end of the subframe.

The window size is selected as a function of the ratio of the transmission rate to the encoder sampling rate so that the size of the output frame is constrained to lie in a desired range. When the amount of compression is relatively low, the window size is reduced so that the frame size does not exceed an upper maximum. As a result, a decoder can use an input buffer with a fixed and relatively small amount of RAM. When the amount of compression is relatively high, the window size is increased. As a result, the GBM system can distribute bits over a larger time window, thereby improving coder performance.

These and other features and advantages of the invention will be apparent to those skilled in the art from the following detailed description of preferred embodiments, taken together with the accompanying drawings and tables, in which:

Brief Description of the Drawings

Figure 1 is a block diagram of a 5-channel audio coder in accordance with the present invention;

經濟部中央標準局黃工消費合作社印I A7 —_B_L— ‘__ 五、發明説明(6) 圖2爲多通道編碼器之一方塊圖; 圖3爲基波帶編碼器與解碼器之一方塊圖; 圖4 a及41)分捌爲一高抽樣率編碼器與解碼器之方塊圖 , 圖5爲單一通道編碼器之一方塊圖; 圖6爲每框位元組對可變傳輸率之框大小之圖; 圖7爲對NPK與PR重構濾波器之振幅響應之圖; 圖8爲對一重構濾波器之次波帶別名設定之圖; 圖9爲對NPR與PR濾波器之失眞曲線之圖; 圖10爲單一次波帶編碼器之一示意圖; 圖11a及lib分別爲對一次框之暫態偵測而且標度因 數計算; 圖12例示對量化之TM0DES之熵編碼法程序; 圖1 3例示標度因數量化程序; 圖1 4例示一信號遮蔽:之迴旋,其具有信號頻率響應以 產生SMR ,· '圖15爲人類聽覺響應之圖; '圖16爲次波帶之SMR之圖; 圖17爲精神聽覺與mmse位元分配之誤差信號之圖; 圖18a及lSb分別爲次波帶能量位準圖與倒轉圖,例 示mms e”充水〃位元配置程序; 圖1 9爲於資料流內單一框之方塊圖; 圖20爲解碼器之示意圖; 圖2 1爲編碼器硬體實.施例之一方塊圖;而 本紙張尺度適用中國國家梯準(CNS ) A4规格(210父197^釐_) ' ™ ^~" ' -9 — ---------、裝— (請先閱讀背面之注意事項再填寫本頁) -9 經濟部中央標準局員工消費合作社印策 ^15561 A7 __ B7 五、發明説明(7) 圖22爲解碼器硬體實施例之一方塊圖〇 簡要圖表說明 表1爲最大:t框大小對抽樣率與傳輸率之作表; 表2爲最大4容許框大小(位元組)對抽樣率與傳輸率 之作表,·而 表3例τκ A B I T索引値、量化位準之數目與所形成次波 帶SNR間之關係。 本發明詳述 多通道聲頻編碼系統 .如圖1所示,本發明結合已知編碼體系之特色加上於 單一多通道聲頻編碼器1 Q之額外特色兩者。該編碼演算法 係設計以錄音室品質位準執行,即、、優於CD "品質,並提 供改變壓縮位準、抽樣率、字長度、通道數目及感覺品質 之廣泛範圍應用。 編碼器12將PCM聲頻·資料Η之多通道(通常以48 kHz 選樣而字長度在16與24位兀間)以一已知傳輸速率(宜於 32-40 9 6 kbps之範圍內)編碼成一資料流16 〇不像已知聲 頻編碼器,本案架構可擴充至較高之抽樣率(48- 1 92 kHz ),而不致與現有爲基波帶抽樣率或任何中間之抽樣率所 設計之解碼器不相容。此外,P C Μ資料14以一回—框予視 窗化及編碼,其各框最好被分割成I4個次框。聲頻視窗之 大小(即I」CM選樣之數目)係根據抽樣率與傳輸率之相對 値,使解碼器18每框所讀出一輸出框之大小(即位元組之 數目)受約束,合宜爲介.於5 . 
3與8千位元組之間。 本紙張尺度適用中國國家標準(CNS ) A4規格(210X297公釐) -10- (請先閲讀背面之注意事項再填寫本頁) 、τ 經濟部中央標準局員工消費合作社印製 3l556i at r~—___ _ B7 五、發明説明(8) 結果,於解碼器所需緩街.進入之資料流之RAM量保持 相當低,此減少解碼器之成本。於低率時,較大之視窗大 小可用以框取PCM資料,此改良編碼性能。於較高位元傳 輸率時,須使用較小之視窗大小以滿足資料限制ο此必然 減少編碼性能,但此於較高率時不顯著。又,PCM資料被 定框之方式容許解碼器1 8起動重放爲於整體輸出框被讀入 緩衝器內之前。此減少聲頻編碼器之延遲或等待時間〇 編碼器1 2使用一高解析度濾波器組,其最好根據位元 傳輸率,於非完全性(NPR)與完全性(PR)重構濾波器之間 切換,將各聲頻通道1 4分解成許多次波帶信號〇預測性與 向量量化(VQ )編碼器爲分別用以將較低與較高頻率次波帶 編碼。起始YQ次波帶可爲固定,或可作爲現行信號性質之 一函數而予動態測定。聯合頻率編碼法可於低位元傳輸率 被使用,以同時於較高頻率次波帶對多通道編碼。 該預測性編碼器最好根據次波帶預測增益,於APCM與 ADPCM模式之間切換。一暫態分析器將各次波帶次框分隔 成前回聲與後信號(次次框),並對前與後回聲之次次框 計算各自之標度因數,從而減少前回聲失眞。該編碼器於 所有PCM通道與次波帶對現行框依據其各自需要(精神聽 覺或mse ),適應性分配可用之位元傳輸速率,以使編碼 效率最佳化。藉結合預測性編碼與精神聽覺模型建立,低 位元傳輸率編碼效率被提高,從而降低達成主觀上透通性 之位元傳輸率。一可程式控制器1 9如電腦或鍵台’與編碼 器1 2接介以中繼聲頻模式資訊,包括譬如所要之位元傳輸 本紙張尺度適用中國國家標準(CNS ) A4規格(2i〇X297公釐) ^ ~ -11- (請先閲讀背面之注意事項再填寫本頁) 裝_ 訂 A7 A7 經濟部中央標準局員工消費合作社印袋 B7 五、發明説明(9) 率、通道數目、PR或NPR重構、抽樣率及傳輸率等參數。 編碼信號與旁帶資訊.被壓縮並多工傳輸到資料流16 ,使解碼計算負載受約束成位於所要範圍內。資料流1 6係 於一傳輸介質2:0譬如CD、數位影碟(DVD)、或直接廣播人 造衛星上編碼或播送〇解碼器1 8將個別之次波帶信號解碼 ,並進行反轉濾波作業,以產生多通道聲頻信號22,其主 觀上相當於原來之多通道聲頻信號Η ύ —聲頻系統2 4譬如 家庭戲院系統或多媒體電腦,爲使用者重放該聲頻信號。 多通道編碼器 如圖2所示,編碼器12包括多數(宜爲5個)個別之 通道編碼器26 (左前、中央、右前、左後及右後),產生 各別組之編碼次波帶信號28 (適宜爲每通道32個次波帶信 號)。編碼器12採用一廣域位元管理(GBM)系統30,由一 在各通道中之共同位元池、介於一通道內之各次波帶間、 .及在一已知次波帶內之一個別框內將各位元動態分配。編 碼器12亦可使用聯合頻率編碼技術,以利用較高頻率次波 帶內之通道間相關性。此外,編碼器12可將向量量化用於 特別不能察覺之較高頻率次波帶上,而以一極低·之位元傳 輸率提供基本之高頻眞度或環境〇此方法中,編碼器利用 多通道之全異信號需求,譬如次波帶之rms値與精神聽覺 遮蔽位準,以及毎一通道內頻率上與一旣定框內時間上之 信號能量之不均勻分佈。 位元分配總論 GBM系統30首先判定何者通道之次波帶將作聯合頻率 本紙張尺度適用中國國家標準(CNS ) A4規格(210X2?7公釐) : _ -12- (請先閲讀背面之注意事項再填寫本頁) m- -1-— -- -- - . I ......... 
mm 1 I - · - - ϊ I - - - - _ « νι^ϋ —HI— vm m n In i 1^1 n^i 經濟部中央標準局員工消費合作社印製 Α7 Β7 五、發明説明(ΐϋ) 編碼’並將數據平均,然後確定何者次波帶將用向童量化 予以編碼,並由可用之位元傳輸率減去該等位元〇何者次 波帶行向量量化之判定可先作成,因爲所有高於一臨限頻 率之次波帶均予向量量化,或可根據各框內個別次波帶之 精神聽覺遮蔽效果作成。其後,GBM系統SO使用精神聽覺 遮蔽在其餘次波帶上分配位元(AB I T ),以使已解碼聲頻信 號之主觀品質最佳化。若可獲得額外之位元,該編碼器可 轉換成一純粹之mmse體系,即'^充水",並根據各次波帶 相對rms値重分配所有位元,將誤差信號之rms値減至最 小。此適用於極高位元傳輸率。較佳手段爲保留精神聽覺 位元分配,而依據m m s e體系僅分配額外之位元。此維持由 精神聽·覺遮蔽所產生雜訊信號之形狀,但向下均匀移動雜 訊基値。 另法,該較佳手段可予修正以依據r m s與精神聽覺位 準間之差異分配額外之位元。結果,精神聽覺分配在位元 傳輸率增加時變形成mm s e分配,從而在二技術之間提供— 平順之轉換0以上技術特別適用於固定式位元傳輸率系統 〇另法,編碼器12可設定一失眞位準(主觀性或mse性) 而容許總位元傳輸率改變以維持該失眞位準0 —多工器32 依據一指定資料格式將次波帶信號及邊際資訊多工化成資 料流1 6 〇該資料格式之細節論述於下圖2 0中。 某波帶編碼 對於8 - 4 8 k Η z範圍內之抽樣率,通道編碼器2 6,如圖 3所示,採用一均匀之512分接頭3 2波帶之分析濾波器組 本紙張尺度適用中國國_家揉準(CNS ) Α4规格(2ΙΟΧ297令筆) -13 — Γ * • ---------------^ 裝------^、訂----:---Λ ; . (請先閱讀背面之注意Ϋ項再填寫本頁) 經濟部中央梯準局員工消費合作社印装 A7 ____B7_^__ 五、發明説明(11 ) 34,以48 kHz之抽樣率操作將各通道之聲頻譜(0-24 kHz )分割成32個次波帶,每個次波帶具有750 Hz之帶寬。編 碼階段3 6將各次波帶信號編碼,並予多工化3 8成壓縮資料 流1 6 ◦解碼器+18接收該壓縮資料流,使用一解壓縮器4 〇將 各次波帶之編碼資料分離出,將各次波帶信號解碼4 2,並 使用一 5 1 2分接頭3 2波帶之均匀內挿濾波組4 4對各通道重 構PCM數位聲頻信號(Fsamp = 48 kHz)〇 在本架構中,所有編碼策略,例如、96或192kHz之 抽樣率,均在最低(基波帶)聲頻頻率,譬如〇-2 4 kHz上 使用32波帶編碼/解碼程序。因此,現今根據48 kHz選樣 率所設計與建造之解碼器將與設計以利用較高頻率組件之 未來編碼器相容。現行解碼器讀取基波帶信號(〇-24 kHz ) 而略去較高頻率時之編碼資料〇 萵抽樣率編碼 對於48-96kHz範圍之抽樣率,通道編碼器26最好將聲 頻頻譜分割成二者,並對下半部使用一均勻之3 2波帶分析 濾波器組,而對上半部使用一 8波帶分析濾波器組〇如圖 “及4b所示,聲頻頻譜(0-48 kHz)起先使用一 2 56分接 頭2波帶之十取一式前濾波器組4 6加以分割,得到每波帶 24 kHz之聲頻帶寬〇底部波帶(0-24 kHz)以圖3所述方式 被分割並編碼成32個均匀之波帶。然而,上部波帶(24-48 kHz )被分割並編碼成8個均匀之波帶◦若8波帶之十取一 /內揷濾波器組48之延遲不等於32波帶濾波器組者,則於 24-48kHz信號路徑某處須使用一延遲補償階段5〇,以確保 本紙張尺度適用中國國家標準(CNS ) A4规格(210X297公釐) —14 — (請先閲讀背面之注意事項再填寫本頁) 裝 —訂 經濟部中央標準局眞工消費合作社印製 A7 ____ _B7_ 五、發明説明(I2) .二者時間波形於2波帶復合濾波器組之前於解碼器處排齊 。於96 kHz選樣編碼系統中,24-48kHz聲頻波帶被延遲達 384個選樣,然後使用128分接頭內挿濾波器組分割成8 個均匀波帶。各個3kHZ次波帶與從0-24 kHz波帶之編碼資 料一起被編碼52及壓縮54,以形成壓縮資料流16 〇 到達解碼器18時,壓縮資料流16被解壓縮56,而32波 帶解碼器(ϋ-24 kHz區域)與8波帶解碼器(24-48kHz)二 者所用之碼被分離出,並分別饋至其各自之解碼階段42及 58〇該8及S2解碼次波帶爲分別使用128分接頭與512分 接頭之均匀內揷濾波器組6 0與4 4予以重構〇該解碼次波帶 後續使用2 5 6分接頭2波帶之均勻內揷濾波器組6 2予以復 
合,以產生具有96 kHz抽樣率之單一 PCM數位聲頻信號〇 在希望解碼器以一半之壓縮資料流抽樣率操作之情形,此 可藉拋棄上部波帶編碼資料(2“4 8kHz )而僅將0-24 kHz聲 頻區域內之3 2次波帶解碼而方便實行。 通道編碼器 於所有所述編碼策略中,32波帶編碼/解碼程序爲對 聲頻帶寬介於G-24 kHz間之基波帶部份實施。如圖5所示 ,一框攫取器64將PCM聲頻通道η視窗化,以將其分成連 績之資料框66片段。該PCM聲頻視窗界定連績輸入選樣之 數目,編碼程序爲此在資料流內產生一輸出框。視窗大小 係基於壓縮之量(亦即傳輸率對抽樣率之比値)而設定, 以使各框內之編碼資料量受到限制。各連續之資料框6 6係 藉一 32波帶512分接頭之FIK十取一式濾波器組34分割成 私紙張尺度適用中國國家標準(CNS ) A4規格(210X297公酱) -15- (請先閲讀背面之注意^項再填寫本頁) 裝------I訂丨·-----C、----- Α7 3ί556ΐ ____ Β7 五、發明説明(1 3 ) ™ 3 2個均匀之頻率波帶68 0由各次波帶輸出之選樣予緩衝並 應用至3 2波帶編碼階段3 6 〇 (請先閲讀背面之注意事項再填寫本頁) —分析階段7ϋ (詳述於圖10-1 9 )爲各緩衝次波帶選 樣產生最佳預測器係數、差分量化器位元分配及最佳量化 器標度因數〇分析階段70亦可判定何者次波帶將與向量量 化(VQ )’及何者將予聯合頻率編碼,倘若此等判定非屬固 定。此等資料,或邊際資訊,向前饋至選定之ADPCM階段 72、VQ階段73或聯合頻率編碼(JFC)階段74,並至資料多 工器32 (壓縮器)。然後,次波帶選樣經由ADPCM或VQ程 序編碼,而量化碼輸入到多工器。J F C階段7 4並不實際將 次波帶選樣編碼,卻係產生指示何者通道之次波帶被聯合 及其等被置於資料流內何處之碼q各次波帶之量化碼與邊 際資訊被壓縮入資料流1 6內,並傳輸至解碼器。 經濟部中央揉準局員工消费合作社印製 到達解碼器1S畤,資料流被多工化4〇 (或解壓縮)回 到個別之次波帶內◦各次波帶之標度因數及位元分配連同 各預測器係數先被設立於各反轉量化器7 5內。然後,對各 指定次波帶之差分碼直接使用ADPCM程序7 6或反轉向量量 化程序7 7,或使用反轉J F C程序7 8予以重構。最後,使用 3 2波帶內挿濾波器組4 4·將各次波帶混合回復到單一 p C Μ聲 頻信號22。 PCM信號定框 如圖6所示,當傳輸率對一旣定抽樣率作改變時,圖 5所示之框攫取器6 4改變視窗7 9之大小,以使每輸出框8 0 之位元組數目被約束位而於例如5 · 3k個位元組與8k個位元 本紙張尺度適用中國國家標準(CNS ) A4规格(210X297^t )~ 一 A7 B7 經濟部中央樣準局—工消費合作社印裂 五、發明説明(Η) 組之間。表1及2分別爲譲設計者就一旣定抽樣率與傳輸 率選擇最佳視窗大小及解碼器緩衝器大小(框大小)之設 計表。低傳輸率時,框大小可較大。此容許編碼器利用聲 頻信號對時間之;非平坦變異數分佈,並改善聲頻編碼器之 性能〇高傳輸率時,框大小減少,以使位元組之總數不使 解碼器緩衝器溢位。結果,設計者可對解碼器提供8k位元 組之·ΚΑΜ ,以滿足所有傳輸率。此減少解碼器之成本。一 般而言,聲頻視窗之大小求出如下: 聲頻視窗=(框大小)* Fsamp * (8/Trate) 其中框大小爲解碼器緩衝器之大小,F s a m p爲抽樣率,而 Tr a t e爲傳輸率。聲頻視窗之大小與聲頻通道之數目無關 〇然而,當通道之數目增加時,壓縮量亦須增加,以維持 所要之傳輸率。 ^_1_ F s a nip (kHz)Printed by the Huanggong Consumer Cooperatives, Central Bureau of Standards, Ministry of Economic Affairs. I A7 —_B_L— '__ 5. 
Description of the invention (6) Figure 2 is a block diagram of a multi-channel encoder; Figure 3 is a block diagram of a baseband encoder and decoder Figure; Figure 4 a and 41) is divided into a block diagram of a high sampling rate encoder and decoder, Figure 5 is a block diagram of a single channel encoder; Figure 6 is a variable transmission rate per frame bit group Figure of the frame size; Figure 7 is a diagram of the amplitude response to the NPK and PR reconstruction filters; Figure 8 is a diagram of the subband alias setting for a reconstruction filter; Figure 9 is a diagram of the NPR and PR filters Figure 10 is a schematic diagram of a single primary band encoder; Figure 11a and lib are the transient detection of the primary frame and the calculation of the scale factor; Figure 12 illustrates the entropy encoding method of quantized TM0DES Procedure; Fig. 13 illustrates the scale factor quantization procedure; Fig. 14 illustrates a signal masking: the convolution, which has a signal frequency response to generate SMR, "Figure 15 is a diagram of the human auditory response;" Figure 16 is the subwave Diagram of SMR with band; Figure 17 is a diagram of the error signal of mental hearing and mmse bit allocation; Figure 18a lSb is the sub-band energy level diagram and inversion diagram respectively, exemplifying the mms e "water-filling" bit configuration procedure; Figure 19 is a block diagram of a single frame in the data stream; Figure 20 is a schematic diagram of the decoder; Figure 2 1 is a block diagram of the encoder hardware. One of the examples; the paper scale is applicable to the Chinese National Standard (CNS) A4 specification (210 father 197 ^ _) '™ ^ ~ "' -9 — --- ------ 、 装 — (Please read the precautions on the back before filling in this page) -9 Printed by the Employees ’Consumer Cooperative of the Central Bureau of Standards of the Ministry of Economic Affairs ^ 15561 A7 __ B7 5. 
Description of the invention (7) Figure 22 shows the decoding Block diagram of one of the hardware embodiments of the device. Brief description of the table. Table 1 is the maximum: t box size vs. sampling rate and transmission rate; Table 2 is the maximum 4 allowed frame size (bytes) vs. sampling rate and transmission rate Table 3 shows the relationship between the τκ ABIT index value, the number of quantization levels and the SNR of the formed sub-band. The present invention details the multi-channel audio coding system. As shown in FIG. 1, the present invention has combined The characteristics of the known coding system plus both the additional characteristics of a single multi-channel audio encoder 1 Q. The coding algorithm It is designed to be performed at the studio quality level, that is, better than CD " quality, and provides a wide range of applications that change the compression level, sampling rate, word length, number of channels, and perceived quality. The encoder 12 uses PCM audio Multiple channels of data H (usually sampled at 48 kHz and word length between 16 and 24 bits) are encoded into a data stream 16 at a known transmission rate (preferably within the range of 32-40 9 6 kbps). Known audio encoders, the architecture of this case can be expanded to a higher sampling rate (48-92 kHz) without being incompatible with existing decoders designed for the fundamental band sampling rate or any intermediate sampling rate. In addition, the PCM data 14 is windowed and coded in a one-frame, and each frame is preferably divided into 14 subframes. The size of the audio window (that is, the number of I ”CM samples) is based on the relative value of the sampling rate and the transmission rate, so that the size of an output frame read by each frame of the decoder 18 (that is, the number of bytes) is restricted, which is appropriate For the introduction. Between 5.3 and 8 kilobytes. 
This paper scale is applicable to the Chinese National Standard (CNS) A4 specification (210X297mm) -10- (please read the precautions on the back before filling this page), τ printed by 3l556i at r ~ — ___ _ B7 V. Description of the invention (8) As a result, the amount of RAM needed to enter the data stream in the decoder is kept relatively low, which reduces the cost of the decoder. At low rates, the larger window size can be used to frame PCM data, which improves coding performance. At higher bit rates, a smaller window size must be used to meet the data limit. This will inevitably reduce the encoding performance, but this will not be significant at higher rates. Also, the way the PCM data is framed allows the decoder 18 to start playback before the entire output frame is read into the buffer. This reduces the delay or latency of the audio encoder. The encoder 12 uses a high-resolution filter bank, which preferably reconstructs the filter in terms of incompleteness (NPR) and completeness (PR) according to the bit transmission rate Switching between them, each audio channel 14 is decomposed into many subband signals. The predictive and vector quantization (VQ) encoder is used to encode the lower and higher frequency subbands, respectively. The initial YQ sub-band can be fixed or can be dynamically determined as a function of the nature of the current signal. The joint frequency coding method can be used at a low bit rate to simultaneously encode multiple channels in the higher frequency subband. The predictive encoder preferably switches between APCM and ADPCM modes based on the subband predictive gain. A transient analyzer separates each subband subframe into pre-echo and post-signal (sub-sub-frame), and calculates the respective scale factor for the sub-frames of pre- and post-echo to reduce the pre-echo miss. 
The encoder adapts the available bit transmission rate to the current frame of all PCM channels and sub-bands according to their respective needs (mental hearing or mse), so as to optimize the coding efficiency. By combining predictive coding with a mental hearing model, the coding efficiency of the low bit transmission rate is improved, thereby reducing the bit transmission rate that achieves subjective transparency. A programmable controller 19, such as a computer or keyboard, is connected to the encoder 12 to relay audio mode information, including, for example, the desired bit transmission. The paper standard is applicable to the Chinese National Standard (CNS) A4 specification (2i〇X297 Mm) ^ ~ -11- (please read the precautions on the back before filling in this page) Packing_ Order A7 A7 Printed Bag B7 of the Employees ’Consumer Cooperative of the Central Standards Bureau of the Ministry of Economy V. Invention Description (9) Rate, Number of Channels, PR Or parameters such as NPR reconstruction, sampling rate and transmission rate. Encoded signal and sideband information. It is compressed and multiplexed to the data stream 16, so that the decoding calculation load is constrained to be within the desired range. Data stream 16 is encoded or broadcast on a transmission medium 2: 0 such as CD, digital video disc (DVD), or direct broadcast satellite. Decoder 18 decodes individual sub-band signals and performs inverse filtering operations To generate a multi-channel audio signal 22, which is subjectively equivalent to the original multi-channel audio signal H ύ-audio system 24, such as a home theater system or a multimedia computer, to reproduce the audio signal for the user. The multi-channel encoder is shown in Figure 2. 
The encoder 12 includes a plurality of (preferably 5) individual channel encoders 26 (front left, center, front right, rear left, and rear right), which generate encoding subbands for each group Signal 28 (suitably 32 subband signals per channel). The encoder 12 uses a wide area bit management (GBM) system 30, which consists of a common bit pool in each channel, between subbands within a channel, and within a known subband Each element is dynamically allocated in an individual box. The encoder 12 may also use joint frequency coding techniques to take advantage of inter-channel correlation in the higher frequency sub-band. In addition, the encoder 12 can use vector quantization for higher frequency subbands that are particularly imperceptible, and provide a basic high frequency or environment at an extremely low bit transmission rate. In this method, the encoder Utilize the disparate signal requirements of multiple channels, such as the rms value of the sub-band and the level of mental hearing masking, and the uneven distribution of the signal energy in the frequency of each channel and the time in a fixed frame. Bit allocation overview GBM system 30 first determines which channel's secondary wave band will be used as the joint frequency. This paper scale is applicable to the Chinese National Standard (CNS) A4 specification (210X2? 7mm): _ -12- (please read the back side first (Notes and fill in this page again) m- -1-—---. I ......... mm 1 I-·--ϊ I----_ «νι ^ ϋ —HI— vm mn In i 1 ^ 1 n ^ i Printed Α7 Β7 by the employee consumer cooperative of the Central Bureau of Standards of the Ministry of Economy V. Description of the invention (lϋ) Encoding and average the data, and then determine which sub-band will be encoded with the quantification of the child, And the available bit transmission rate minus these bits. 
The decision as to which subbands are vector quantized can be made a priori, in that all subbands above a threshold frequency are vector quantized, or can be based on the psychoacoustic masking effects of the individual subbands in each frame. Thereafter, the GBM system 30 allocates bits (ABIT) over the remaining subbands using psychoacoustic masking to optimize the subjective quality of the decoded audio signal. If additional bits are available, the encoder can switch to a pure mmse scheme, i.e. "waterfilling", and reallocate all of the bits according to the relative rms values of the subbands so as to minimize the rms value of the error signal. This applies at very high bit rates. The preferred approach is to retain the psychoacoustic bit allocation and to allocate only the additional bits according to the mmse scheme. This maintains the shape of the noise signal created by the psychoacoustic masking, but shifts the noise floor uniformly downward. Alternatively, the preferred approach can be modified so that the additional bits are allocated according to the difference between the rms and psychoacoustic levels. As a result, the psychoacoustic allocation turns into an mmse allocation as the bit rate increases, providing a smooth transition between the two techniques. The above techniques are particularly suited to fixed bit rate systems. Alternatively, the encoder 12 can set a distortion level (subjective or mse) and allow the overall bit rate to vary so as to maintain that distortion level.
A multiplexer 32 multiplexes the subband signals and side information into the data stream 16 in accordance with a specified data format. The details of this data format are discussed with Figure 20 below.
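The two-stage allocation outlined above (psychoacoustic allocation first, leftover bits on an mmse basis) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from this description: the function name `allocate_bits`, the rule of thumb of about 6.02 dB of noise reduction per bit, the per-subband cap `max_bits`, and the simple greedy rule used in the second stage.

```python
DB_PER_BIT = 6.02  # assumption: each added bit lowers quantization noise by about 6 dB


def allocate_bits(rms_db, mask_db, total_bits, max_bits=16):
    """Two-stage allocation over subbands; returns (psychoacoustic, final) bit lists."""
    n = len(rms_db)
    bits = [0] * n
    remaining = total_bits

    def noise_db(i):
        # Estimated quantization noise level in subband i for its current bits.
        return rms_db[i] - DB_PER_BIT * bits[i]

    # Stage 1 (psychoacoustic): push noise below the masking threshold,
    # always serving the subband with the worst noise-to-mask ratio first.
    while remaining > 0:
        audible = [i for i in range(n)
                   if bits[i] < max_bits and noise_db(i) > mask_db[i]]
        if not audible:
            break
        worst = max(audible, key=lambda i: noise_db(i) - mask_db[i])
        bits[worst] += 1
        remaining -= 1
    psycho = list(bits)

    # Stage 2 (mmse-style, hypothetical greedy rule): spend leftover bits
    # on whichever subband currently has the highest noise level.
    while remaining > 0:
        open_bands = [i for i in range(n) if bits[i] < max_bits]
        if not open_bands:
            break
        loudest = max(open_bands, key=noise_db)
        bits[loudest] += 1
        remaining -= 1
    return psycho, bits


psycho, final = allocate_bits([60.0, 40.0, 20.0], [30.0, 30.0, 30.0], total_bits=8)
print(psycho, final)  # [5, 2, 0] [6, 2, 0]
```

In this toy case the third subband is already masked and receives no bits, and the single bit left over after the psychoacoustic stage goes to the subband with the highest remaining noise.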
Baseband Coding
For sampling rates in the range 8-48 kHz, the channel encoder 26, as shown in Figure 3, employs a uniform 512-tap, 32-band analysis filter bank 34 operating at a sampling rate of 48 kHz to split the audio spectrum (0-24 kHz) of each channel into 32 subbands, each with a bandwidth of 750 Hz. A coding stage 36 codes each subband signal and multiplexes 38 them into the compressed data stream 16. The decoder 18 receives the compressed data stream, separates the coded data for each subband using an unpacker 40, decodes each subband signal 42, and reconstructs the PCM digital audio signal for each channel (Fsamp = 48 kHz) using a 512-tap, 32-band uniform interpolation filter bank 44.
In this architecture, all coding strategies, for example at 96 or 192 kHz sampling rates, use the 32-band encoding/decoding process on the lowest (baseband) audio frequencies, for example 0-24 kHz. Decoders designed and built today for the 48 kHz sampling rate will therefore be compatible with future encoders designed to exploit higher-frequency components. An existing decoder reads the baseband signal (0-24 kHz) and ignores the coded data for the higher frequencies.
High Sampling Rate Coding
For sampling rates in the range 48-96 kHz, the channel encoder 26 preferably splits the audio spectrum in two, using a uniform 32-band analysis filter bank for the lower half and an 8-band analysis filter bank for the upper half.
As shown in Figures 4a and 4b, the audio spectrum (0-48 kHz) is first split using a 256-tap, 2-band decimation pre-filter bank 46, giving an audio bandwidth of 24 kHz per band. The bottom band (0-24 kHz) is split and coded in 32 uniform bands as described for Figure 3. The top band (24-48 kHz), however, is split and coded in 8 uniform bands. If the delay of the 8-band decimation/interpolation filter bank 48 is not equal to that of the 32-band filter banks, a delay compensation stage 50 must be used somewhere in the 24-48 kHz signal path to ensure that the two time waveforms are aligned ahead of the 2-band recombination filter bank at the decoder. In the 96 kHz sampling coding system, the 24-48 kHz audio band is delayed by 384 samples and then split into the 8 uniform bands using a 128-tap interpolation filter bank. Each of the 3 kHz subbands is encoded 52 and packed 54 together with the coded data from the 0-24 kHz band to form the compressed data stream 16.
On arrival at the decoder 18, the compressed data stream 16 is unpacked 56, and the codes for both the 32-band decoder (0-24 kHz region) and the 8-band decoder (24-48 kHz) are separated out and fed to their respective decoding stages 42 and 58. The 8 and 32 decoded subbands are reconstructed using the 128-tap and 512-tap uniform interpolation filter banks 60 and 44, respectively. The decoded subbands are subsequently recombined using a 256-tap, 2-band uniform interpolation filter bank 62 to produce a single PCM digital audio signal with a sampling rate of 96 kHz.
Decoding the compressed data stream at half its sampling rate is easily accomplished by discarding the upper-band coded data (24-48 kHz) and decoding only the 32 subbands in the 0-24 kHz audio region.
Channel Encoder
In all of the coding strategies described above, the 32-band encoding/decoding process is carried out for the baseband portion of the audio bandwidth between 0 and 24 kHz. As shown in Figure 5, a frame grabber 64 windows the PCM audio channel 14 to segment it into successive data frames 66. The PCM audio window defines the number of contiguous input samples for which the encoding process generates an output frame in the data stream. The window size is set based on the amount of compression, i.e. the ratio of the transmission rate to the sampling rate, so that the amount of data encoded in each frame is constrained. Each successive data frame 66 is split into 32 uniform frequency bands 68 by a 32-band, 512-tap FIR decimation filter bank 34. The samples output from each subband are buffered and applied to the 32-band coding stage 36.
An analysis stage 70 (detailed in Figures 10-19) generates the optimal predictor coefficients, the differential quantizer bit allocations and the optimal quantizer scale factors for the buffered subband samples. The analysis stage 70 can also decide which subbands will be vector quantized (VQ) and which will be joint frequency coded, if these decisions are not fixed.
This data, or side information, is fed forward to the selected ADPCM stage 72, VQ stage 73 or joint frequency coding (JFC) stage 74, and to the data multiplexer 32 (packer). The subband samples are then encoded by the ADPCM or VQ process, and the quantization codes are input to the multiplexer. The JFC stage 74 does not actually encode subband samples; it generates codes that indicate which channels' subbands are joined and where they are placed in the data stream. The quantization codes and side information from each subband are packed into the data stream 16 and transmitted to the decoder.
On arrival at the decoder 18, the data stream is demultiplexed 40 (or unpacked) back into the individual subbands. The scale factors and bit allocations of each subband, together with the predictor coefficients, are first installed in the inverse quantizers 75. The differential codes are then reconstructed either directly, using the ADPCM process 76 or the inverse VQ process 77, or using the inverse JFC process 78 for the designated subbands. Finally, the subbands are merged back into a single PCM audio signal 22 using the 32-band interpolation filter bank 44.
PCM Signal Framing
As shown in Figure 6, the frame grabber 64 shown in Figure 5 varies the size of the window 79 as the transmission rate changes for a given sampling rate, so that the number of bytes per output frame 80 is constrained to lie between, for example, 5.3k bytes and 8k bytes.
Tables 1 and 2 are design tables that allow the designer to select the optimum window size and decoder buffer size (frame size) for a given sampling rate and transmission rate. At low transmission rates the frame size can be relatively large. This allows the encoder to exploit the non-flat variance distribution of the audio signal over time and improves the performance of the audio coder. At high transmission rates the frame size is reduced so that the total number of bytes does not overflow the decoder buffer. As a result, the designer can provide the decoder with 8k bytes of RAM to satisfy all transmission rates, which reduces the cost of the decoder. In general, the size of the audio window is given by:

Audio window = (frame size) * Fsamp * (8 / Trate)

where the frame size is the size of the decoder buffer, Fsamp is the sampling rate, and Trate is the transmission rate. The size of the audio window is independent of the number of audio channels. However, as the number of channels increases, the amount of compression must also increase to maintain the desired transmission rate.
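The window-size relation can be checked numerically with a short Python sketch. The units are assumptions made for this illustration (frame size in bytes, Fsamp in Hz, Trate in bits per second with 1 kbps taken as 1000 bits per second), and the function names `audio_window` and `frame_bytes` are hypothetical.

```python
def audio_window(frame_size_bytes, fsamp_hz, trate_bps):
    """Audio window in samples: (frame size) * Fsamp * (8 / Trate)."""
    return frame_size_bytes * fsamp_hz * 8 / trate_bps


def frame_bytes(window_samples, fsamp_hz, trate_bps):
    """Inverse relation: bytes produced per frame for a given window size."""
    return window_samples * trate_bps / (8 * fsamp_hz)


# An 8k-byte frame at 32 kHz and 1024 kbps corresponds to a 2048-sample window.
print(audio_window(8192, 32_000, 1_024_000))

# The same 2048-sample window at 48 kHz needs roughly 5.3k bytes per frame.
print(round(frame_bytes(2048, 48_000, 1_024_000)))
```

Under these assumptions, one window size spans the 5.3k to 8k byte range as the sampling rate moves between 48 kHz and 32 kHz, which agrees with the byte range quoted for the output frame.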

Table 1 (audio window size, samples)

                         Fsamp (kHz)
Trate          8-12     16-24    32-48    64-96    128-192
≤512 kbps      1024     2048     4096     *        *
≤1024 kbps     *        1024     2048     *        *
≤2048 kbps     *        *        1024     2048     *
≤4096 kbps     *        *        *        1024     2048

Table 2 (frame size, bytes)

                         Fsamp (kHz)
Trate          8-12     16-24    32-48    64-96    128-192
≤512 kbps      8-5.3k   8-5.3k   8-5.3k   *        *
≤1024 kbps     *        8-5.3k   8-5.3k   *        *
≤2048 kbps     *        *        8-5.3k   8-5.3k   *
≤4096 kbps     *        *        *        8-5.3k   8-5.3k

Subband Filtering
The 32-band, 512-tap uniform decimation filter bank 34 is selected from two polyphase filter banks to split the data frames 66 into the 32 uniform subbands 68 shown in Figure 5. The two filter banks have different reconstruction properties, trading off subband coding gain against reconstruction precision. One class of filters is called perfect reconstruction (PR) filters. When the PR decimation (encoding) filter and its interpolation (decoding) filter are placed back to back, the reconstructed signal is "perfect", where perfect is defined as being within 0.5 lsb at 24 bits of resolution. The other class of filters is called non-perfect reconstruction (NPR) filters, because the reconstructed signal has a non-zero noise floor associated with the non-perfect aliasing cancellation properties of the filtering process.
The transfer functions 82 and 84 of the NPR and PR filters, respectively, for a single subband are shown in Figure 7. Because the NPR filters are not constrained to provide perfect reconstruction, they exhibit a much larger near stop band rejection (NSBR) ratio, i.e. the ratio of the passband to the first side lobe, than the PR filters (110 dB versus 85 dB). As shown in Figure 8, the side lobes of the filter cause a signal 86 that naturally lies in the third subband to alias into the neighboring subbands. The subband gain measures the rejection of the signal in the neighboring subbands and hence indicates the filter's ability to decorrelate the audio signal. Because the NPR filters have a much larger NSBR ratio than the PR filters, they also have a much larger subband gain. As a result, the NPR filters provide better coding efficiency.
As shown in Figure 9, for both the PR and NPR filters the total distortion in the compressed data stream decreases as the overall bit rate increases. At low transmission rates, however, the difference in subband gain performance between the two filter types is greater than the noise floor associated with the NPR filter. The distortion curve 90 associated with the NPR filter therefore lies below the distortion curve 92 associated with the PR filter.
Hence, at low transmission rates the audio coder selects the NPR filter bank. At some point 94, the quantization error of the encoder falls below the noise floor of the NPR filter, so that adding further bits to the ADPCM coder provides no additional benefit. At this point, the audio coder switches over to the PR filter bank.
ADPCM Encoding
The ADPCM encoder 72 generates a predicted sample p(n) from a linear combination of H previous reconstructed samples. This predicted sample is then subtracted from the input x(n) to give a difference sample d(n). The difference samples are scaled by dividing them by the RMS (or PEAK) scale factor, so that the amplitudes of the difference samples match the quantizer characteristic Q. The scaled difference sample ud(n) is applied to a quantizer characteristic with L levels of step size SZ, as determined by the number of bits ABIT allocated for the current sample. The quantizer produces a level code QL(n) for each scaled difference sample ud(n). These level codes are ultimately transmitted to the decoder ADPCM stage. To update the predictor history, the quantizer level codes QL(n) are locally decoded using an inverse quantizer 1/Q with characteristics identical to those of Q, producing a quantized scaled difference sample ud~(n). The sample ud~(n) is rescaled by multiplying it by the RMS (or PEAK) scale factor to produce d~(n). A quantized version x~(n) of the original input sample x(n) is reconstructed by adding the initial predicted sample p(n) to the quantized difference sample d~(n). This sample is then used to update the predictor history.
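The ADPCM loop described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the disclosed implementation: the quantizer is taken to be a uniform mid-tread quantizer with symmetric clipping, the predictor coefficients and scale factor are fixed for the whole block, and the names `adpcm_encode` and `adpcm_decode` are hypothetical.

```python
def adpcm_encode(x, coeffs, rms, step, levels):
    """Return (level codes QL(n), locally reconstructed samples)."""
    history = [0.0] * len(coeffs)      # previous reconstructed samples, newest first
    half = (levels - 1) // 2           # assumed symmetric code range [-half, +half]
    codes, recon = [], []
    for sample in x:
        p = sum(a * h for a, h in zip(coeffs, history))   # predicted sample p(n)
        d = sample - p                                    # difference sample d(n)
        ud = d / rms                                      # scaled difference ud(n)
        ql = max(-half, min(half, round(ud / step)))      # quantizer Q -> QL(n)
        dq = ql * step * rms                              # inverse quantizer and rescale
        xr = p + dq                                       # reconstructed sample
        history = [xr] + history[:-1]                     # update predictor history
        codes.append(ql)
        recon.append(xr)
    return codes, recon


def adpcm_decode(codes, coeffs, rms, step):
    """Decoder-side loop: same predictor, driven only by the level codes."""
    history = [0.0] * len(coeffs)
    out = []
    for ql in codes:
        p = sum(a * h for a, h in zip(coeffs, history))
        xr = p + ql * step * rms
        history = [xr] + history[:-1]
        out.append(xr)
    return out


signal = [0.0, 0.5, 0.9, 1.0, 0.7, 0.2]
codes, local = adpcm_encode(signal, coeffs=[0.9], rms=1.0, step=0.05, levels=255)
decoded = adpcm_decode(codes, coeffs=[0.9], rms=1.0, step=0.05)
print(decoded == local)  # True: encoder and decoder predictors stay in step
```

Because the encoder predicts from its own reconstructed samples rather than from the input, the decoder can reproduce the same prediction from the level codes alone, which is why the local decoding step is part of the encoder loop.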
Vector Quantization
The predictor coefficients and the high-frequency subband samples are encoded using vector quantization (VQ). The predictor VQ has a vector dimension of 4 samples and a bit rate of 3 bits per sample. The final codebook therefore consists of 4096 code vectors of dimension 4. The search for matching vectors is structured as a two-level tree, with each node in the tree having 64 branches. The top level stores 64 node code vectors, which are needed only at the encoder to assist the search process. The bottom level contains the 4096 final code vectors, which are required at both the encoder and the decoder. Each search requires 128 MSE computations of dimension 4. The codebook and the node vectors at the top level are trained using the LBG method, with over 5 million prediction coefficient training vectors. The training vectors are accumulated for all subbands that exhibit a positive prediction gain while coding a wide range of audio material. For the test vectors in a training set, an average SNR of approximately [illegible] is obtained.
The high-frequency VQ has a vector dimension of 32 samples (the length of a subframe) and a bit rate of 0.3125 bits per sample. The final codebook therefore consists of 1024 code vectors of dimension 32. The search for matching vectors is structured as a two-level tree, with each node in the tree having 32 branches. The top level stores 32 node code vectors, which are needed only at the encoder.
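The two-level tree search used by both vector quantizers can be sketched in Python as follows. The data layout (a list of node vectors plus, for each node, a list of its final code vectors) and the names `tree_search` and `codebook_stats` are assumptions for illustration; the tiny two-node codebook stands in for the 64-branch and 32-branch trees described above.

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


def tree_search(vector, node_vectors, leaf_vectors):
    """Pick the nearest top-level node, then the nearest final vector under it.

    leaf_vectors[k] holds the final code vectors that belong to node k.
    Returns (codebook index, chosen final code vector).
    """
    node = min(range(len(node_vectors)), key=lambda k: mse(vector, node_vectors[k]))
    leaves = leaf_vectors[node]
    leaf = min(range(len(leaves)), key=lambda j: mse(vector, leaves[j]))
    return node * len(leaves) + leaf, leaves[leaf]


def codebook_stats(dimension, bits_per_sample, branches):
    """Index bits, codebook size and MSE computations per two-level search."""
    index_bits = round(dimension * bits_per_sample)
    return index_bits, 2 ** index_bits, 2 * branches


nodes = [[0.0, 0.0], [10.0, 10.0]]
leaves = [[[-1.0, -1.0], [1.0, 1.0]], [[9.0, 9.0], [11.0, 11.0]]]
print(tree_search([10.4, 10.9], nodes, leaves))  # (3, [11.0, 11.0])
print(codebook_stats(4, 3, 64))                  # (12, 4096, 128)
print(codebook_stats(32, 0.3125, 32))            # (10, 1024, 64)
```

The last two lines reproduce the figures quoted for the predictor VQ and the high-frequency VQ. A tree search of this kind examines far fewer vectors than a full search, at the cost that it is not guaranteed to return the globally nearest code vector.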
For the high-frequency VQ, the bottom level contains the 1024 final code vectors, which are required at both the encoder and the decoder. Each search requires 64 MSE computations of dimension 32. The codebook and the node vectors at the top level are trained using the LBG method, with over 7 million high-frequency subband training vectors. The samples that make up the vectors are accumulated from the outputs of subbands 16 through 32 at a 48 kHz sampling rate for a wide range of audio material. At a sampling rate of 48 kHz, the training samples represent audio frequencies in the range 12 to 24 kHz. For the test vectors in the training set, an average SNR of about 3 dB is expected. Although 3 dB is a small SNR, it is sufficient to provide high-frequency fidelity or ambiance at these high frequencies. Perceptually, it is far better than the known techniques that simply drop the high-frequency subbands.
Joint Frequency Coding
In very low bit rate applications, overall reconstruction fidelity can be improved by coding only the summation of the high-frequency subband signals from two or more audio channels instead of coding them independently. Joint frequency coding is possible because the high-frequency subbands often have similar energy distributions, and because the human auditory system is sensitive primarily to the "intensity" of the high-frequency components rather than to their fine structure. The reconstructed average signal thus provides good overall fidelity, since at any bit rate more bits are available to code the perceptually important low frequencies.
Joint frequency coding indexes (JOINX) are transmitted directly to the decoder to indicate which channels and subbands have been joined and where the encoded signal is positioned in the data stream. The decoder reconstructs the signal in the designated channel and then copies it to each of the other channels. Each channel is then scaled according to its particular RMS scale factor. Because joint frequency coding averages the time signals on the basis of the similarity of their energy distributions, reconstruction fidelity is reduced. Its application is therefore usually limited to low bit rate applications, and mainly to the 10-20 kHz signals. In medium to high bit rate applications, joint frequency coding is typically disabled.
Subband Encoder
The encoding process for a single subband that is encoded using the ADPCM/APCM process, and specifically the interaction of the analysis stage 70 and ADPCM encoder 72 shown in Figure 5 with the global bit management system 30 shown in Figure 2, is illustrated in detail in Figure 10. Figures 11-19 detail the component processes.
The filter bank 34 splits the PCM audio signal 14 into 32 subband signals x(n), which are written into respective subband sample buffers 96. Assuming an audio window size of 4096 samples, each subband sample buffer 96 stores a frame of 128 samples, which is divided into four 32-sample subframes. A window size of 1024 samples would produce a single 32-sample subframe. The samples x(n) are directed to the analysis stage 70 to determine the prediction coefficients, the predictor mode (PMODE), the transient mode (TMODE) and the scale factors (SF) for each subframe. The samples x(n) are also provided to the GBM system 30, which determines the bit allocation (ABIT) for each subframe per subband per audio channel. Thereafter, the samples x(n) are passed to the ADPCM encoder 72 one subframe at a time.
Estimation of Optimal Prediction Coefficients
The H prediction coefficients, suitably 4th order, are generated separately for each subframe using the standard autocorrelation method 98 optimized over a block of subband samples x(n), i.e. the Wiener-Hopf or Yule-Walker equations.
Quantization of Optimal Prediction Coefficients
Each set of four predictor coefficients is preferably quantized using the 4-element tree-search 12-bit vector codebook (3 bits per coefficient) described above. The 12-bit vector codebook contains 4096 coefficient vectors that are optimized for a desired probability distribution using a standard clustering algorithm. A vector quantization (VQ) search 100 selects the coefficient vector that exhibits the lowest weighted mean squared error between itself and the optimal coefficients. The optimal coefficients for each subframe are then replaced with these "quantized" vectors. An inverse VQ LUT 101 is used to provide the quantized predictor coefficients to the ADPCM encoder 72.
Estimation of the Prediction Difference Signal d(n)
A significant difficulty with ADPCM is that the difference sample sequence d(n) cannot easily be predicted ahead of the actual recursive process 72. A fundamental requirement of forward adaptive subband ADPCM is that the difference signal energy be known before ADPCM coding, so that an appropriate bit allocation can be calculated for a quantizer that will produce a known quantization error, or noise level, in the reconstructed samples. Knowledge of the difference signal energy is also required so that an optimal difference scale factor can be determined prior to encoding.
Unfortunately, the difference signal energy depends not only on the characteristics of the input signal but also on the performance of the predictor. Apart from known limitations such as the predictor order and the optimality of the predictor coefficients, predictor performance is also affected by the level of quantization error, or noise, induced in the reconstructed samples. Since the quantization noise is dictated by the final bit allocation ABIT and by the difference scale factor RMS (or PEAK) values themselves, the difference signal energy estimate must be arrived at iteratively.
Step 1: Assume Zero Quantization Error
The first difference signal estimate is made by passing the buffered subband samples x(n) through an ADPCM process that does not quantize the difference signal. This is achieved by disabling the quantization and the RMS scaling in the ADPCM encoding loop.
By estimating the difference signal d(n) in this way, the effects of the scale factor and the bit allocation values are removed from the calculation. However, the process does take into account the effect of quantization error on the predictor coefficients by using the vector-quantized prediction coefficients. An inverse VQ LUT 104 is used to provide the quantized prediction coefficients. To further improve the accuracy of the estimated predictor, the history samples of the actual ADPCM predictor that were accumulated at the end of the previous block are copied into the predictor before the calculation. This ensures that the predictor starts from where the actual ADPCM predictor left off at the end of the previous input buffer.
The main difference between this estimate ed(n) and the actual process d(n) is that the effect of quantization noise on the reconstructed samples x(n), and on the resulting reduction in prediction accuracy, is ignored. For quantizers with a large number of levels, the noise level will generally be small (assuming proper scaling), and the actual difference signal energy will therefore closely match that calculated in the estimate. However, when the number of quantizer levels is small, as is the case for typical low bit rate audio coders, the actual predicted signal, and hence the difference signal energy, may differ significantly from the estimate. This produces coding noise floors that differ from those predicted earlier in the adaptive bit allocation process.
Despite this, the variation in prediction performance may not be significant for the application or bit rate.
Thus, the estimate can be used directly to calculate the bit allocations and scale factors without iterating. An additional refinement is to compensate for the performance loss by deliberately overestimating the difference signal energy when a quantizer with a small number of levels is likely to be allocated to that subband. The overestimation can also be graded according to the number of quantizer levels for improved accuracy.
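
The Step 1 estimate can be illustrated with a minimal sketch. It is not the encoder's actual implementation: the function name, the argument layout and the use of a plain list for the predictor history are assumptions made for illustration only. It shows the one point the text relies on, namely that with quantization and scaling disabled the "reconstructed" sample fed back to the predictor is simply the input sample.

```python
def estimate_difference_signal(x, coeffs, history):
    """Open-loop estimate ed(n): an ADPCM loop with quantization and scaling disabled.

    x       -- buffered subband samples for one predictor update period
    coeffs  -- quantized predictor coefficients, most recent tap first (hypothetical layout)
    history -- last len(coeffs) samples left by the real ADPCM predictor, most recent first
    """
    hist = list(history)  # work on a copy so the real predictor state is untouched
    ed = []
    for sample in x:
        prediction = sum(c * h for c, h in zip(coeffs, hist))
        ed.append(sample - prediction)
        # No quantization: the value fed back to the predictor is the input itself.
        hist = [sample] + hist[:-1]
    return ed
```

For example, a two-tap predictor with coefficients [2.0, -1.0] extrapolates a straight line, so a ramp input that continues the copied history yields an all-zero difference estimate.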

Step 2: Recalculation Using the Estimated Bit Allocations and Scale Factors

Once the bit allocations (ABIT) and scale factors (SF) have been generated from the first estimated difference signal, their optimality can be tested by running a further ADPCM estimation process that uses the estimated ABIT and RMS (or PEAK) values in the ADPCM loop 72. As with the first estimate, the estimate predictor history is copied from the actual ADPCM predictor before the calculation starts, to ensure that both predictors begin from the same point. Once all of the buffered input samples have passed through this second estimation loop, the resulting noise floor in each subband is compared with the noise floor assumed in the adaptive bit allocation process.
Any significant discrepancies can be compensated for by modifying the bit allocation and/or scale factors.

Step 2 can be repeated to refine the noise floor distributed across the subbands, each time using the most recent difference signal estimate to calculate the next set of bit allocations and scale factors. In general, the scale factors are recalculated if they would change by more than approximately 2-3 dB. Otherwise, the bit allocation would risk violating the signal-to-mask ratios generated by the psychoacoustic masking process or, alternatively, by the mmse process. Typically, a single iteration is sufficient.

Calculation of Subband Prediction Modes (PMODE)

To improve coding efficiency, a controller 106 can arbitrarily switch the prediction process off, by setting a PMODE flag, when the prediction gain in the current subframe falls below a threshold. The PMODE flag is set to one when the prediction gain (the ratio of the input signal energy to the estimated difference signal energy), measured during the estimation stage for a block of input samples, exceeds some positive threshold. Conversely, if the prediction gain is measured to be less than the positive threshold, the ADPCM predictor coefficients for that subband are set to zero at both the encoder and the decoder, and the respective PMODE is set to zero. The prediction gain threshold is set equal to the distortion rate of the transmitted predictor coefficient vector overhead. This is an attempt to ensure that, when PMODE = 1, the coding gain of the ADPCM process is always greater than or equal to that of a forward adaptive PCM (APCM) coding process.
Otherwise, by setting PMODE to zero and resetting the predictor coefficients, the ADPCM process simply reverts to APCM.

The PMODEs can be set high in any or all subbands if variations in ADPCM coding gain are not important to the application. Conversely, the PMODEs can be set low if, for example, certain subbands will not be coded at all, the bit rate of the application is high enough that prediction gains are not needed to maintain the subjective quality of the audio, the transient content of the signal is high, or the splicing characteristic of ADPCM-encoded audio is simply not desirable, as might be the case for audio editing applications.

A separate prediction mode (PMODE) is transmitted for each subband at a rate equal to the update rate of the linear predictors in the encoder and decoder ADPCM processes. The purpose of the PMODE parameter is to tell the decoder whether the particular subband will have a prediction coefficient vector address associated with its block of coded audio data. When PMODE = 1 in any subband, a predictor coefficient vector address is always included in the data stream. When PMODE = 0 in any subband, a predictor coefficient vector address is never included in the data stream, and the predictor coefficients are set to zero at both the encoder and decoder ADPCM stages.

The calculation of the PMODEs begins by analyzing the buffered subband input signal energies with respect to the corresponding buffered estimated difference signal energies obtained in the first-stage estimation (that is, assuming no quantization error). The input samples x(n) and the estimated difference samples ed(n) are buffered separately for each subband. The buffer size equals the number of samples contained in each predictor update period, for example the size of a subframe.
The prediction gain is then calculated as:

$$P_{gain}\ (\mathrm{dB}) = 20.0 \cdot \log_{10}\left(RMS_{x(n)} / RMS_{ed(n)}\right)$$

where $RMS_{x(n)}$ is the root mean square value of the buffered input samples x(n), and $RMS_{ed(n)}$ is the root mean square value of the buffered estimated difference samples ed(n).

For positive prediction gains, the difference signal is, on average, smaller than the input signal, so for the same bit rate the ADPCM process can achieve a lower reconstruction noise floor than APCM. For negative prediction gains, the ADPCM coder makes the difference signal, on average, larger than the input signal, which results in a higher noise floor than APCM for the same bit rate. The prediction gain threshold that switches PMODE on will normally be positive, with a value that takes into account the extra channel capacity consumed by transmitting the predictor coefficient vector address.

Calculation of Subband Transient Modes (TMODE)

The controller 106 calculates a transient mode (TMODE) for each subframe in each subband. The TMODEs indicate the number of scale factors, and the samples for which they are valid, in the estimated difference signal ed(n) buffer when PMODE = 1 or in the input subband signal x(n) buffer when PMODE = 0. The TMODEs are updated at the same rate as the prediction coefficient vector addresses and are transmitted to the decoder. The purpose of the transient modes is to reduce audible coding "pre-echo" artifacts in the presence of signal transients.

A transient is defined as a rapid transition between a low-amplitude signal and a high-amplitude signal. Because the scale factors are averaged over a block of subband difference samples, if a rapid change in signal amplitude occurs within a block (that is, a transient occurs), the calculated scale factor tends to be much larger than would be optimal for the low-amplitude samples preceding the transient. The quantization error in the samples preceding a transient can therefore be very high. This noise is perceived as pre-echo distortion.

In practice, the transient mode is used to modify the averaging block length of the subband scale factor, to limit the influence of a transient on the scaling of the difference samples immediately preceding it. The motivation is the pre-masking phenomenon inherent in the human auditory system, which suggests that noise preceding a transient can be masked by it, provided the duration of the noise is kept short.
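
Returning to the PMODE decision described earlier in this section, the prediction gain test can be sketched as follows. This is a hedged illustration rather than the encoder's code: the function names are hypothetical, and the threshold is left as a parameter because the text only says that it is normally positive and accounts for the cost of transmitting the predictor coefficient vector address.

```python
import math

def rms(samples):
    """Root mean square value of a buffer of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def prediction_gain_db(x, ed):
    """Pgain (dB) = 20 * log10(RMS of x(n) / RMS of ed(n))."""
    rms_x, rms_ed = rms(x), rms(ed)
    if rms_ed == 0.0:
        return math.inf    # perfect prediction (guard added for this sketch)
    if rms_x == 0.0:
        return -math.inf   # silent input (guard added for this sketch)
    return 20.0 * math.log10(rms_x / rms_ed)

def choose_pmode(x, ed, threshold_db):
    """Return 1 (prediction on) when the gain exceeds the threshold, else 0."""
    return 1 if prediction_gain_db(x, ed) > threshold_db else 0
```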
Depending on the value of PMODE, the contents (that is, the subframe) of either the subband sample buffer x(n) or the estimated difference buffer ed(n) are copied into a transient analysis buffer. There the buffer contents are divided uniformly into 2, 3 or 4 sub-subframes, depending on the sample size of the analysis buffer. For example, if the analysis buffer contains 32 subband samples (21.3 ms at 1500 Hz), the buffer is partitioned into 4 sub-subframes of 8 samples each, giving a time resolution of 5.3 ms for a subband sampling rate of 1500 Hz. Alternatively, if the analysis window is configured for 16 subband samples, the buffer need only be divided into two sub-subframes to give the same time resolution.

The signal in each sub-subframe is analyzed, and the transient status of each one other than the first is determined. If any sub-subframe is declared transient, two separate scale factors are generated for the analysis buffer, that is, the current subframe. The first scale factor is calculated from the samples in the sub-subframes preceding the transient sub-subframe. The second scale factor is calculated from the samples in the transient sub-subframe together with all following sub-subframes.

The transient status of the first sub-subframe is not calculated, because the quantization noise is automatically limited by the start of the analysis window itself. If more than one sub-subframe is declared transient, only the one that occurs first is considered. If no transient sub-subframe is detected at all, only a single scale factor is calculated, using all of the samples in the analysis buffer.
In this way, scale factor values that include transient samples are not used to scale earlier samples more than one sub-subframe period back in time. The pre-transient quantization noise is therefore limited to one sub-subframe period.

Transient Declaration

A sub-subframe is declared transient if the ratio of its energy to that of the preceding sub-subframe exceeds a transient threshold (TT) and the energy in the preceding sub-subframe is below a pre-transient threshold (PTT). The values of TT and PTT depend on the bit rate and the degree of pre-echo suppression required. They are normally varied until the perceived pre-echo distortion matches the level of other coding artifacts, if any exist. Increasing TT and/or decreasing PTT reduces the likelihood of sub-subframes being declared transient, and thus reduces the bit rate associated with transmitting the scale factors. Conversely, decreasing TT and/or increasing PTT increases the likelihood of sub-subframes being declared transient, and thus increases the bit rate associated with transmitting the scale factors.

Because TT and PTT are set individually for each subband, the sensitivity of transient detection at the encoder can be set arbitrarily for any subband. For example, if pre-echo in high-frequency subbands is found to be less perceptible than in lower-frequency subbands, the thresholds can be set to reduce the likelihood of transients being declared in the higher subbands.
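
The transient declaration rule above can be sketched in a few lines. The sketch makes several assumptions that the text does not fix: energy is taken as the sum of squared samples in a sub-subframe, TT is treated as a linear energy ratio, PTT as a linear energy value, and the function name is hypothetical. The returned value is the index of the first transient sub-subframe (0 when none is found or only the untested first one could qualify), which matches the TMODE numbering given in this document for the four sub-buffer configuration.

```python
def transient_mode(samples, nsb, tt, ptt):
    """Return the index of the first transient sub-subframe, or 0 if there is none.

    samples -- analysis buffer: one subframe of x(n) or ed(n)
    nsb     -- number of uniform sub-subframes (2, 3 or 4)
    tt      -- transient threshold on the energy ratio (assumed linear)
    ptt     -- pre-transient threshold on the preceding energy (assumed linear)
    """
    size = len(samples) // nsb
    energy = [sum(s * s for s in samples[i * size:(i + 1) * size])
              for i in range(nsb)]
    # The first sub-subframe is never tested; the window start limits its noise.
    for i in range(1, nsb):
        prev, cur = energy[i - 1], energy[i]
        # "cur > tt * prev" is the ratio test written without a division.
        if prev < ptt and cur > tt * prev:
            return i  # only the first transient is considered
    return 0
```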
Because the TMODEs are embedded in the compressed data stream, the decoder never needs to know which transient detection algorithm is used at the encoder in order to decode the TMODE information correctly.

Four Sub-buffer Configuration

As shown in FIG. 11a, if the first sub-subframe 108 in the subband analysis buffer 109 is transient, or if no transient sub-subframe is detected, then TMODE = 0. If the second sub-subframe is transient but the first is not, then TMODE = 1. If the third sub-subframe is transient but neither the first nor the second is, then TMODE = 2. If only the fourth sub-subframe is transient, then TMODE = 3.

Calculation of Scale Factors

As shown in FIG. 11b, when TMODE = 0 the scale factor 110 is calculated over all sub-subframes. When TMODE = 1, the first scale factor is calculated over the first sub-subframe and the second scale factor over all following sub-subframes.
When TMODE = 2, the first scale factor is calculated over the first and second sub-subframes and the second scale factor over all following sub-subframes. When TMODE = 3, the first scale factor is calculated over the first, second and third sub-subframes and the second scale factor over the fourth sub-subframe.

ADPCM Encoding and Decoding Using TMODEs

When TMODE = 0, a single scale factor is used to scale the subband difference samples for the duration of the entire analysis buffer (that is, one subframe) and is transmitted to the decoder to facilitate inverse scaling. When TMODE > 0, two scale factors are used to scale the subband difference samples, and both are transmitted to the decoder. For any TMODE, each scale factor is used to scale the difference samples from which it was generated.

Calculation of Subband Scale Factors (RMS or PEAK)

Depending on the value of PMODE for the subband, either the estimated difference samples ed(n) or the input subband samples x(n) are used to calculate the appropriate scale factor(s). The TMODEs are used in this calculation both to determine the number of scale factors and to identify the corresponding sub-subframes in the buffer.

RMS Scale Factor Calculation

For the jth subband, the rms scale factors are calculated as follows.

When TMODE = 0, the single rms value is:

$$RMS_j = \left(\sum_{n=1}^{L} ed_j(n)^2 / L\right)^{0.5}$$

where L is the number of samples in the subframe.

When TMODE > 0, the two rms values are:

$$RMS1_j = \left(\sum_{n=1}^{k} ed_j(n)^2 / L\right)^{0.5}$$

$$RMS2_j = \left(\sum_{n=k+1}^{L} ed_j(n)^2 / L\right)^{0.5}$$
where k = (TMODE * L / NSB) and NSB is the number of uniform sub-subframes.

If PMODE = 0, the $ed_j(n)$ samples are replaced with the input samples $x_j(n)$.

PEAK Scale Factor Calculation

For the jth subband, the peak scale factors are calculated as follows.

When TMODE = 0, the single peak value is:

$$PEAK_j = \max\left(\left|ed_j(n)\right|\right) \quad \text{for } n = 1, \ldots, L$$

When TMODE > 0, the two peak values are:

$$PEAK1_j = \max\left(\left|ed_j(n)\right|\right) \quad \text{for } n = 1, \ldots, (TMODE \cdot L / NSB)$$

$$PEAK2_j = \max\left(\left|ed_j(n)\right|\right) \quad \text{for } n = (1 + TMODE \cdot L / NSB), \ldots, L$$

If PMODE = 0, the $ed_j(n)$ samples are replaced with the input samples $x_j(n)$.

Quantization of PMODE, TMODE and Scale Factors

Quantization of PMODEs

The prediction mode flags have only two values, on or off, and are transmitted to the decoder directly as 1-bit codes.

Quantization of TMODEs

The transient mode flags have at most 4 values (0, 1, 2 and 3) and are transmitted to the decoder either directly, as 2-bit unsigned integer code words, or optionally via a 4-level entropy table in an attempt to reduce the average word length of the TMODEs to below 2 bits. The optional entropy coding is typically used in low bit rate applications to conserve bits.

The entropy coding process 112, shown in detail in FIG. 12, is as follows. The transient mode codes TMODE(j) for the j subbands are mapped to a number (p) of 4-level mid-riser variable-length code books, each optimized for a different input statistical characteristic. The TMODE values are mapped to the 4-level tables 114, and the total bit usage associated with each table (NBp) is calculated 116. The table that provides the lowest bit usage over the mapping process is selected 118 using the THUFF index. The mapped codes VTMODE(j) are extracted from this table, packed, and transmitted to the decoder along with the THUFF index word. The decoder, which holds the same set of 4-level inverse tables, uses the THUFF index to direct the incoming variable-length codes VTMODE(j) to the proper table for decoding back to the TMODE indexes.

Quantization of Subband Scale Factors

To transmit the scale factors to the decoder, they must be quantized to a known code format. In this system they are quantized 120 using a uniform 64-level logarithmic characteristic, a uniform 128-level logarithmic characteristic, or a variable-rate-encoded uniform 64-level logarithmic characteristic. The 64-level quantizer has a step size of 2.25 dB in both cases, and the 128-level quantizer a step size of 1.25 dB. The 64-level quantization is used for low to medium bit rates, the additional variable rate coding is used for low bit rate applications, and the 128-level quantization is generally used for high bit rates.

The quantization process 120 is illustrated in FIG. 13. The scale factors (RMS or PEAK) are read out of a buffer 121, converted to the log domain 122, and then applied to either a 64-level or a 128-level uniform quantizer 124, 126, as determined by the encoder mode control 128. The log-quantized scale factors are then written into a buffer 130. The ranges of the 128-level and 64-level quantizers are sufficient to cover scale factors with dynamic ranges of approximately 160 dB and 144 dB, respectively. The 128-level upper limit is set to cover the dynamic range of 24-bit input PCM digital audio signals. The 64-level upper limit is set to cover the dynamic range of 20-bit input PCM digital audio signals.
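
The TMODE-dependent RMS and PEAK scale factor formulas given earlier in this section can be sketched as below. The function names are hypothetical, the buffer length is assumed to be a multiple of NSB, and the rms sums are divided by the full subframe length L because that is how the formulas are printed in the text.

```python
import math

def rms_scale_factors(samples, tmode, nsb):
    """Return one rms value (TMODE = 0) or two (TMODE > 0) for one subframe.

    samples -- ed(n) when PMODE = 1, or x(n) when PMODE = 0
    tmode   -- transient mode for this subframe
    nsb     -- number of uniform sub-subframes (NSB)
    """
    L = len(samples)
    if tmode == 0:
        return [math.sqrt(sum(s * s for s in samples) / L)]
    k = tmode * L // nsb  # number of samples before the transient sub-subframe
    # Both sums are divided by L, following the formulas as printed in the text.
    rms1 = math.sqrt(sum(s * s for s in samples[:k]) / L)
    rms2 = math.sqrt(sum(s * s for s in samples[k:]) / L)
    return [rms1, rms2]

def peak_scale_factors(samples, tmode, nsb):
    """Return one peak value (TMODE = 0) or two (TMODE > 0) for one subframe."""
    L = len(samples)
    if tmode == 0:
        return [max(abs(s) for s in samples)]
    k = tmode * L // nsb
    return [max(abs(s) for s in samples[:k]),
            max(abs(s) for s in samples[k:])]
```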
The log scale factors are mapped to the quantizer, and each scale factor is replaced with the nearest quantizer level code $RMS_{QL}$ (or $PEAK_{QL}$). For the 64-level quantizer these codes are 6 bits long and range from 0 to 63. For the 128-level quantizer the codes are 7 bits long and range from 0 to 127.

Inverse quantization 131 is achieved simply by mapping the level codes back to the respective inverse quantization characteristic to give the $RMS_q$ (or $PEAK_q$) values. The quantized scale factors are used at both the encoder and the decoder for scaling the ADPCM (or APCM, if PMODE = 0) difference samples, which ensures that the scaling and inverse scaling processes are identical.

If the bit rate of the 64-level quantizer codes needs to be reduced, additional entropy or variable-length coding is performed. The 64-level codes are first-order differentially encoded 132 across the j subbands, starting at the second subband (j = 2) and continuing to the highest active subband.
The process can also be used to code PEAK scale factors. The signed differential codes $DRMS_{QL}(j)$ (or $DPEAK_{QL}(j)$) have a maximum range of +/-63 and are stored in a buffer 134. To reduce their bit rate relative to the original 6-bit codes, the differential codes are mapped to a number (p) of 127-level mid-riser variable-length code books. Each code book is optimized for a different input statistical characteristic.

The process for entropy coding the signed differential codes is the same as the entropy coding process for the transient modes illustrated in FIG. 12, except that p 127-level variable-length code tables are used. The table that provides the lowest bit usage over the mapping process is selected using the SHUFF index. The mapped codes $VDRMS_{QL}(j)$ are extracted from this table, packed, and transmitted to the decoder along with the SHUFF index word. The decoder, which holds the same set of (p) 127-level inverse tables, uses the SHUFF index to direct the incoming variable-length codes to the proper table for decoding back to the differential quantizer code levels. The differential code levels are returned to absolute values using the following routine:

$$RMS_{QL}(1) = DRMS_{QL}(1)$$

$$RMS_{QL}(j) = DRMS_{QL}(j) + RMS_{QL}(j-1) \quad \text{for } j = 2, \ldots, K$$

and the PEAK differential code levels are returned to absolute values using the following routine:

$$PEAK_{QL}(1) = DPEAK_{QL}(1)$$

$$PEAK_{QL}(j) = DPEAK_{QL}(j) + PEAK_{QL}(j-1) \quad \text{for } j = 2, \ldots, K$$

where in both cases K is the number of active subbands.

Global Bit Allocation

The global bit management (GBM) system 30 shown in FIG. 10 manages the bit allocation (ABIT) and determines the number of active subbands (SUBS), the joint frequency strategy (JOINX) and the VQ strategy for the multi-channel audio encoder, so as to provide subjectively transparent encoding at a reduced bit rate. This increases the number of audio channels and/or the playback time that can be encoded and stored on a fixed medium, while maintaining or improving audio fidelity. In general, the GBM system 30 first allocates bits to each subband according to a psychoacoustic analysis modified by the prediction gain of the encoder. The remaining bits are then allocated according to an mmse scheme to lower the overall noise floor. To optimize coding efficiency, the GBM system allocates bits simultaneously over all of the audio channels, all of the subbands, and the entire frame. In addition, a joint frequency coding strategy can be employed. In this way, the system exploits the non-uniform distribution of signal energy between the audio channels, across frequency, and over time.
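
Returning to the scale factor coding described before the Global Bit Allocation overview, the first-order differential coding across subbands and the choice of the variable-length table with the lowest bit usage can be sketched as follows. The function names are hypothetical, and the code length tables used to exercise the selection are invented for illustration; the actual code books are not given in this text.

```python
def differential_encode(level_codes):
    """First-order differences across subbands; the first code stays absolute."""
    diffs = [level_codes[0]]
    for j in range(1, len(level_codes)):
        diffs.append(level_codes[j] - level_codes[j - 1])
    return diffs

def differential_decode(diffs):
    """Inverse routine: code(1) = diff(1), code(j) = diff(j) + code(j-1)."""
    codes = [diffs[0]]
    for j in range(1, len(diffs)):
        codes.append(diffs[j] + codes[j - 1])
    return codes

def select_table(symbols, code_length_tables):
    """Return (table index, total bits) for the table with the lowest bit usage.

    code_length_tables -- one dict per table mapping a symbol to its code
                          length in bits (hypothetical representation)
    """
    totals = [sum(table[s] for s in symbols) for table in code_length_tables]
    best = min(range(len(totals)), key=totals.__getitem__)
    return best, totals[best]
```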
Psychoacoustic Analysis

Psychoacoustic measurements are used to determine perceptually irrelevant information in the audio signal. Perceptually irrelevant information is defined as those parts of the audio signal that cannot be heard by human listeners; it can be measured in the time domain, the frequency domain, or on some other basis. The general principles of psychoacoustic coding are described in J. D. Johnston, "Transform Coding of Audio Signals Using Perceptual Noise Criteria," IEEE Journal on Selected Areas in Communications, vol. JSAC-6, no. 2, pp. 314-323, February 1988.

Two main factors influence the psychoacoustic measurement. One is the frequency-dependent absolute threshold of hearing for humans. The other is the masking effect that one sound has on the ability of humans to hear a second sound played simultaneously with, or even after, the first sound.
In other words, the first sound prevents us from hearing the second sound and is said to mask it.

In a subband coder, the final outcome of the psychoacoustic calculation is a set of numbers that specify the inaudible noise level for each subband at that instant. This calculation is well known and is incorporated in the 1992 MPEG 1 compression standard ISO/IEC DIS 11172, "Information technology - Coding of moving pictures and associated audio for digital storage media up to about 1.5 Mbits/s." These numbers vary dynamically with the audio signal. The coder attempts to adjust the quantization noise floor in each subband by way of the bit allocation process so that the quantization noise in these subbands is below the audible level.

An accurate psychoacoustic calculation normally requires high frequency resolution in the time-to-frequency transform, which implies a large analysis window. The standard analysis window size is 1024 samples, which corresponds to one subframe of compressed audio data. The frequency resolution of a length-1024 FFT approximately matches the temporal resolution of the human ear.

The output of the psychoacoustic model is a signal-to-mask ratio (SMR) for each of the 32 subbands. The SMR indicates the amount of quantization noise that a particular subband can tolerate, and hence also indicates the number of bits required to quantize the samples in that subband. Specifically, a large SMR (>>1) indicates that more bits are required, and a small SMR (>0) indicates that fewer bits are required. If SMR < 0, the audio signal lies below the noise mask threshold and no bits are required for quantization.

As shown in Figure 14, the SMRs for each successive frame are generally produced by 1) computing an FFT, preferably of length 1024, on the PCM audio samples to produce a sequence of frequency coefficients 142, 2) convolving the frequency coefficients with frequency-dependent tone and noise psychoacoustic masks 144 for each subband, 3) averaging the resulting coefficients over each subband to produce the SMR levels, and 4) optionally normalizing the SMRs in accordance with the human auditory response 146 shown in Figure 15.

The sensitivity of the human ear is greatest at frequencies near 4 kHz and falls off as the frequency increases or decreases. Thus, to be perceived at the same level, a 20 kHz signal must be much stronger than a 4 kHz signal. In general, therefore, the SMRs at frequencies near 4 kHz are more important than those at the outlying frequencies. However, the exact shape of the curve depends on the average power of the signal delivered to the listener. As the volume increases, the auditory response 146 is compressed, so a system optimized for one particular volume will be suboptimal at other volumes. As a result, either a nominal power level is selected for normalizing the SMR levels, or normalization is disabled. The resulting SMRs 148 for the 32 subbands are shown in Figure 16.

Bit Allocation Routine

The GBM system 30 first selects the appropriate encoding strategy, that is, which subbands will be encoded with the VQ and ADPCM algorithms and whether JFC will be enabled.

Thereafter, the GBM system selects either a psychoacoustic or an MMSE bit allocation approach. For example, at high bit rates the system may disable the psychoacoustic modeling and use a true mmse allocation scheme. This reduces the computational complexity without any perceptual change in the reconstructed audio signal. Conversely, at low rates the system can activate the joint frequency coding scheme described above to improve the reconstruction fidelity at the lower frequencies. The GBM system can switch between the normal psychoacoustic allocation and the mmse allocation on a frame-by-frame basis according to the transient content of the signal. When the transient content is high, the stationarity assumption used to compute the SMRs no longer holds, and the mmse scheme therefore provides better performance.
For a psychoacoustic allocation, the GBM system first allocates the available bits to satisfy the psychoacoustic effects and then allocates the remaining bits to lower the overall noise floor. The first step is to determine the SMR of each subband for the current frame as described above. The next step is to adjust the SMRs for the prediction gain (Pgain) in the respective subbands to produce mask-to-noise ratios (MNRs). The principle is that the ADPCM encoder will provide a portion of the required SMR. As a result, inaudible psychoacoustic noise levels can be achieved with fewer bits.

The MNR for the jth subband, assuming PMODE = 1, is given by:

MNR(j) = SMR(j) - Pgain(j) * PEF(ABIT)

where PEF(ABIT) is the prediction efficiency factor of the quantizer. To calculate MNR(j), the designer must have an estimate of the bit allocation (ABIT), which can be produced either by allocating bits based solely on SMR(j) or by assuming that PEF(ABIT) = 1. At medium to high bit rates, the effective prediction gain is approximately equal to the calculated prediction gain. At low bit rates, however, the effective prediction gain is reduced. With a 5-level quantizer, for example, the effective prediction gain achieved is approximately 0.7 times the estimated prediction gain, while a 65-level quantizer allows the effective prediction gain to be approximately equal to the estimated prediction gain, PEF = 1.0. In the limit, when the bit rate is zero, predictive encoding is essentially disabled and the effective prediction gain is zero.

In the next step, the GBM system generates a bit allocation scheme that satisfies the MNR of each subband. This is done using the approximation that 1 bit equals 6 dB of signal distortion. To ensure that the encoding distortion is below the psychoacoustically audible threshold, the assigned bit rate is the MNR divided by 6 dB, rounded up to the next integer:

ABIT(j) = ceil( MNR(j) / 6 dB )

By allocating bits in this manner, the noise level 156 in the reconstructed signal tends to follow the signal itself 157, as shown in Figure 17. Thus, at frequencies where the signal is very strong the noise level will be relatively high but will remain inaudible. At frequencies where the signal is relatively weak, the noise floor will be very small and inaudible. The average error associated with this type of psychoacoustic modeling will always be greater than the mmse noise level 158, but the audible performance may be better, particularly at low bit rates.

If the sum of the allocated bits for each subband over all audio channels is greater or less than the target bit rate, the GBM routine iteratively reduces or increases the bit allocation for individual subbands. Alternatively, the target bit rate can be calculated for each audio channel. This is suboptimal but simpler, especially in a hardware implementation. For example, the available bits can be distributed uniformly among the audio channels, or can be distributed in proportion to the average SMR or RMS of each channel.
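As a rough illustration of the MNR adjustment and the 6 dB rule described above, the following Python sketch subtracts the effective prediction gain from each SMR and then assigns one bit per 6 dB, rounding up. The function names, the sample SMR and Pgain values, and the choice to treat PEF as a single caller-supplied constant (1.0 by default, one of the two starting estimates the text allows) are assumptions made for the example and are not taken from the coder itself.

```python
import math

DB_PER_BIT = 6.0  # the text's approximation: 1 bit is worth about 6 dB


def mask_to_noise_ratios(smr_db, pgain_db, pef=1.0):
    """MNR(j) = SMR(j) - Pgain(j) * PEF, with every quantity in dB.

    pef is the prediction efficiency factor; 1.0 is the simplifying
    first estimate, smaller values model coarse quantizers.
    """
    return [s - g * pef for s, g in zip(smr_db, pgain_db)]


def initial_bits(mnr_db):
    """Round MNR / 6 dB up to the next integer.

    Subbands whose MNR is zero or negative are already masked and
    receive no bits.
    """
    return [max(0, math.ceil(m / DB_PER_BIT)) for m in mnr_db]


# Hypothetical per-subband values, in dB, for four subbands.
smr = [30.0, 18.0, 7.0, -4.0]
pgain = [12.0, 0.0, 3.0, 0.0]

mnr = mask_to_noise_ratios(smr, pgain)
bits = initial_bits(mnr)
print(mnr)
print(bits)
```

In this toy case the first subband needs no more bits than the second, even though its SMR is 12 dB higher, because the predictor is assumed to supply that 12 dB.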

If the sum of the local bit allocations, including the VQ code bits and side information, exceeds the target bit rate, the global bit management routine progressively reduces the local subband bit allocations. A number of specific techniques are available for reducing the average bit rate. First, the bit rates that were rounded up by the greatest integer function can be rounded down. Next, one bit can be taken away from the subbands having the smallest MNRs. Furthermore, the higher-frequency subbands can be turned off, or joint frequency coding can be enabled. All bit rate reduction strategies follow the general principle of reducing the coding resolution gradually and gracefully, with the perceptually least offensive strategy introduced first and the most offensive strategy used last.

If the target bit rate is greater than the sum of the local bit allocations, including the VQ code bits and side information, the global bit management routine progressively and iteratively increases the local subband bit allocations to reduce the overall noise floor of the reconstructed signal. This may cause subbands that were previously allocated zero bits to be coded. The bit overhead of "switching on" subbands in this way may need to reflect the cost of transmitting any predictor coefficients if PMODE is enabled.

The GBM routine can select one of three different schemes for allocating the remaining bits. One option is to use an mmse approach that reallocates all of the bits so that the resulting noise floor is approximately flat. This is equivalent to disabling the psychoacoustic modeling initially. To achieve an mmse noise floor, the plot 160 of the subband RMS values shown in Figure 18a is turned upside down as shown in Figure 18b and "waterfilled" until all of the bits are exhausted.
This well-known technique is called waterfilling because the distortion level falls uniformly as the number of allocated bits increases. In the example shown, the first bit is assigned to subband 1, the second and third bits are assigned to subbands 1 and 2, the fourth through seventh bits are assigned to subbands 1, 2, 4 and 7, and so on. Alternatively, one bit can be assigned to each subband to guarantee that every subband will be encoded, and the remaining bits are then waterfilled.

A second, and preferred, option is to allocate the remaining bits according to the mmse approach and the RMS plot described above. The effect of this method is to lower the noise floor 157 shown in Figure 17 uniformly while maintaining the shape associated with the psychoacoustic masking. This provides a good compromise between psychoacoustic and mse distortion.

The third approach is to allocate the remaining bits using the mmse approach as applied to a plot of the difference between the RMS and MNR values of the subbands. The effect of this approach is to morph the shape of the noise floor smoothly from the optimal psychoacoustic shape 157 to the optimal (flat) mmse shape 158 as the bit rate increases. In any of these schemes, if the coding error in any subband drops below 0.5 LSB relative to the source PCM, no more bits are allocated to that subband. Optionally, fixed maximum values for the subband bit allocations can be used to limit the maximum number of bits allocated to particular subbands.
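The waterfilling idea above can be sketched as a simple greedy loop. This is only one common way to realize it, not necessarily the routine the coder uses: each bit is assumed to lower a subband's level by 6 dB, and each new bit goes to the subband whose remaining level is currently highest. The function name, the tie-breaking rule, the optional `max_bits` cap and the sample RMS values are all assumptions for the example; the 0.5 LSB stopping rule is left out for brevity.

```python
DB_PER_BIT = 6.0  # assumed: one bit lowers the noise level by about 6 dB


def waterfill(rms_db, total_bits, max_bits=None):
    """Greedy waterfilling over subband RMS levels given in dB.

    Each bit goes to the subband whose remaining level (RMS minus
    6 dB per bit already assigned) is highest. max_bits, if given,
    caps the number of bits any single subband may receive.
    """
    bits = [0] * len(rms_db)
    for _ in range(total_bits):
        candidates = [j for j in range(len(rms_db))
                      if max_bits is None or bits[j] < max_bits]
        if not candidates:
            break  # every subband has reached its cap
        # Highest remaining level wins; ties go to the lowest subband index.
        j = max(candidates,
                key=lambda k: (rms_db[k] - DB_PER_BIT * bits[k], -k))
        bits[j] += 1
    return bits


rms = [40.0, 34.0, 20.0, 28.0]  # hypothetical subband RMS levels in dB
print(waterfill(rms, 7))
print(waterfill(rms, 7, max_bits=2))
```

With the first call, the weakest subband receives no bits at all, which is the behaviour the "one bit per subband first" variant in the text is meant to avoid.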
In the encoding system discussed above, we have assumed that the average bit rate per sample is fixed and have generated the bit allocation to maximize the fidelity of the reconstructed audio signal. Alternatively, the distortion level, mse or perceptual, can be fixed and the bit rate allowed to vary to satisfy that distortion level. In the mmse approach, the RMS plot is simply waterfilled until the distortion level is satisfied. The required bit rate will change according to the RMS levels of the subbands. In the psychoacoustic approach, the bits are allocated to satisfy the individual MNRs. As a result, the bit rate will change according to the individual SMRs and prediction gains. This type of allocation is not presently useful because contemporary decoders operate at a fixed rate. However, alternative delivery systems such as ATM or random access storage media may make variable-rate coding practical in the near future.

Quantization of Bit Allocation Indexes (ABIT)

The bit allocation indexes (ABIT) are generated for each subband and each audio channel by the adaptive bit allocation routine in the global bit management process. The purpose of the indexes at the encoder is to indicate the number of levels 162, shown in Figure 10, that are necessary to quantize the difference signal so as to obtain a subjectively optimal reconstruction noise floor in the decoded audio. At the decoder, they indicate the number of levels necessary for inverse quantization. Indexes are generated for every analysis buffer, and their values can range from 0 to 27.
The relationship between the index value, the number of quantizer levels and the approximate resulting differential subband SNQR is shown in Table 3. Because the difference signal is normalized, the step size 164 is set equal to one.

Table 3

ABIT index   Number of Q levels   Code length (bits)   SNQR (dB)
0            0                    0                    0
1            3                    variable             8
2            5                    variable             12
3            7 (or 8)             variable (or 3)      16
4            9                    variable             19
5            13                   variable             21
6            17 (or 16)           variable (or 4)      24
7            25                   variable             27
8            33 (or 32)           variable (or 5)      30
9            65 (or 64)           variable (or 6)      36
10           129 (or 128)         variable (or 7)      42
11           256                  8                    48
12           512                  9                    54
13           1024                 10                   60
14           2048                 11                   66
15           4096                 12                   72
16           8192                 13                   78
17           16384                14                   84
18           32768                15                   90
19           65536                16                   96
20           131072               17                   102
21           262144               18                   108
22           524288               19                   114
23           1048576              20                   120
24           2097152              21                   126
25           4194304              22                   132
26           8388608              23                   138
27           16777216             24                   144

The bit allocation indexes (ABIT) are transmitted to the decoder either directly, using 4-bit or 5-bit unsigned integer code words, or using a 12-level entropy table. Typically, entropy coding is employed in low bit rate applications to conserve bits. The method of encoding ABIT is set by the mode control at the encoder and is transmitted to the decoder. Using the process shown in Figure 12 with the 12-level ABIT tables, the entropy coder maps 166 the ABIT indexes to a particular codebook, identified by a BHUFF index, and to a specific code VABIT within that codebook.

Global Bit Rate Control

Because both the side information and the differential subband samples can optionally be encoded using entropy variable-length codebooks, some mechanism must be employed to adjust the resulting bit rate of the encoder when the compressed bit stream is to be transmitted at a fixed rate. Because it is not normally desirable to modify the side information once it has been calculated, bit rate adjustments are best achieved by iteratively altering the differential subband sample quantization process within the ADPCM encoder until the rate constraint is met.

In the system described, the global rate control (GRC) system 178 of Figure 10 adjusts the bit rate that results from mapping the quantizer level codes to the entropy table by altering the statistical distribution of the level code values. The entropy tables are all assumed to exhibit a similar trend of longer code lengths for higher level code values. In that case, the average bit rate falls as the probability of low-valued level codes rises, and vice versa. In the ADPCM (or APCM) quantization process, the size of the scale factor determines the distribution, or usage, of the level code values. For example, as the scale factor increases, the differential samples tend to be quantized by the lower levels, and hence the code values become progressively smaller. This in turn results in shorter entropy code words and a lower bit rate.

The disadvantage of this method is that increasing the scale factor raises the reconstruction noise in the subband samples by the same degree. In practice, however, the adjustment of the scale factors is normally no more than 1 dB to 3 dB. If a greater adjustment is required, it is better to return to the bit allocation and reduce the overall bit allocation than to risk audible quantization noise in the subbands that would use the inflated scale factor.
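The effect described above, where a larger scale factor pushes the level codes toward zero and so shortens the entropy code words, can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the helper names, the mid-tread quantizer with unit step size, the sample values, and in particular `assumed_code_length`, which stands in for a real entropy table only in the sense that larger magnitudes get longer codes.

```python
def level_codes(diff_samples, scale, levels):
    """Normalize by the scale factor, quantize with unit step size and
    clip to a mid-tread quantizer with an odd number of levels."""
    half = (levels - 1) // 2
    return [max(-half, min(half, round(d / scale))) for d in diff_samples]


def assumed_code_length(code):
    """Hypothetical entropy table: 1 bit for level 0, and 2 more bits
    for each step away from zero."""
    return 1 + 2 * abs(code)


def bits_used(diff_samples, scale, levels):
    """Total bits spent on one buffer of differential samples."""
    return sum(assumed_code_length(c)
               for c in level_codes(diff_samples, scale, levels))


# Hypothetical differential subband samples, already roughly normalized.
samples = [0.2, -1.4, 2.6, -0.9, 3.1, 0.4, -2.2, 1.0]

for gain_db in (0, 1, 2, 3):
    scale = 10 ** (gain_db / 20)  # raise the scale factor by gain_db
    print(gain_db, bits_used(samples, scale, levels=7))
```

The bit count can only stay the same or fall as the scale factor grows in this model, at the cost of coarser quantization of the same samples.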
To adjust the entropy-encoded ADPCM bit allocation, the predictor history samples for each subband are stored in a temporary buffer in case the ADPCM coding cycle is repeated. Next, the subband sample buffers are all encoded by the full ADPCM process using the prediction coefficients A_H derived from the subband LPC analysis, together with the scale factors RMS (or PEAK), the quantizer bit allocations ABIT, the transient modes TMODE, and the prediction modes PMODE derived from the estimated difference signal. The resulting quantizer level codes are buffered and mapped to the entropy variable-length codebook that exhibits the lowest bit usage, again using the bit allocation index to determine the codebook size.

The GRC system then analyzes, for each bit allocation index, the number of bits used by all the subbands that share that index. For example, when ABIT = 1, the bit allocation calculation in the global bit management could have assumed an average rate of 1.4 bits per subband sample (that is, the average rate of the entropy codebook assuming an optimal level code amplitude distribution). If the total bit usage of all the subbands for which ABIT = 1 is greater than 1.4 * (total number of subband samples), the scale factors can be increased in all of these subbands to effect a bit rate reduction. The decision to adjust the subband scale factors is preferably left until all the ABIT index rates have been assessed. As a result, indexes with bit rates lower than those assumed in the bit allocation process can compensate for indexes with bit rates above that level. This assessment can also be extended to cover all audio channels where appropriate.

The recommended procedure for reducing the overall bit rate is to start with the lowest ABIT index whose bit rate exceeds the threshold and to increase the scale factors in each of the subbands that have this bit allocation. The actual bit usage is then reduced by the number of bits by which these subbands were originally over the nominal rate for that allocation. If the modified bit usage still exceeds the maximum allowed, the subband scale factors for the next highest ABIT index whose bit usage exceeds the nominal rate are increased. This process continues until the modified bit usage is below the maximum.
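The selection procedure just described can be sketched as follows. This is a minimal illustration under stated assumptions, not the GRC system itself: the function name, the dictionary layout (ABIT index mapped to total bits) and the sample figures are hypothetical, and the sketch assumes that raising the scale factors brings a group's usage back to exactly its nominal figure, which the real re-encoding pass would have to confirm.

```python
def indexes_to_adjust(usage, nominal, max_total):
    """Pick the ABIT indexes whose subbands should get larger scale factors.

    usage   -- ABIT index -> bits actually used by subbands with that index
    nominal -- ABIT index -> bits the allocation assumed for those subbands
    Returns the chosen indexes (lowest first) and the projected total.
    """
    total = sum(usage.values())
    chosen = []
    for abit in sorted(usage):
        if total <= max_total:
            break  # the projected usage no longer exceeds the maximum
        excess = usage[abit] - nominal[abit]
        if excess > 0:
            chosen.append(abit)
            total -= excess  # assume the group returns to its nominal rate
    return chosen, total


# Hypothetical totals, in bits, for three ABIT groups.
usage = {1: 150, 2: 260, 3: 300}
nominal = {1: 140, 2: 270, 3: 280}

print(indexes_to_adjust(usage, nominal, max_total=695))
```

Group 2 is left alone here because it is already under its nominal rate, which is how under-spending indexes compensate for over-spending ones.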

Once this has been achieved, the old history data is loaded into the predictors, and the ADPCM encoding process 72 is repeated for those subbands whose scale factors have been modified. Following this, the level codes are again mapped to the optimal entropy codebooks and the bit usage is recalculated. If any of the bit usages still exceeds its nominal rate, the scale factors are increased further and the cycle is repeated.

The scale factors can be modified in two ways. The first is to transmit to the decoder an adjustment factor for each ABIT index. For example, a 2-bit word could signal an adjustment range of, say, 0, 1, 2 and 3 dB. Because the same adjustment factor is used for all subbands that use a given ABIT index, and only indexes 1-10 can use entropy encoding, the maximum number of adjustment factors that need to be transmitted for all subbands is 10. Alternatively, the scale factor can be changed in each subband by selecting a higher quantizer level. However, because the scale factor quantizers have step sizes of 1.25 dB and 2.5 dB respectively, the scale factor adjustment is limited to these steps. Moreover, when this technique is used, the differential encoding of the scale factors and the resulting bit usage may need to be recalculated if entropy encoding is enabled.
Generally speaking, the same procedure can also be used to increase the bit rate, that is, when the bit rate is lower than the desired bit rate. In this case the scale factors are decreased to force the differential samples to make greater use of the outer quantizer levels, and hence to use the longer code words in the entropy table.

If the bit usage for the bit allocation indexes cannot be reduced within a reasonable number of iterations, or, when scale factor adjustment factors are transmitted, the number of adjustment steps has reached its limit, two remedies are possible. First, the scale factors of subbands that are already within the nominal rate can be increased, thereby lowering the overall bit rate. Alternatively, the entire ADPCM encoding process can be abandoned and the adaptive bit allocation across the subbands recalculated, this time using fewer bits.

Data stream format

The multiplexer 32 shown in Figure 10 packs the data for each channel and then multiplexes the packed data for each channel into an output frame to form the data stream 16. The method of packing and multiplexing the data, that is, the frame format 186 shown in Figure 19, is designed so that the audio coder can be used over a wide range of applications and can be extended to higher sampling frequencies, so that the amount of data in each frame is constrained, so that playback can be started independently on each subframe to reduce latency, and so that decoding errors are reduced.
As shown, a single frame 186 (4096 PCM samples per channel) defines the bit stream boundaries within which sufficient information resides to properly decode a block of audio. It consists of 4 subframes 188 (1024 PCM samples per channel), each of which is in turn made up of 4 sub-subframes 190 (256 PCM samples per channel). The frame synchronization word 192 is placed at the beginning of each audio frame. The frame header information 194 primarily gives information about the construction of the frame 186, the configuration of the encoder that generated the stream, and various optional operational features such as embedded dynamic range control and time code. The optional header information 196 tells the decoder whether downmixing is required, whether dynamic range compensation was done, and whether auxiliary data bytes are included in the data stream. The audio coding headers 198 indicate the packing arrangement and coding formats used at the encoder to assemble the coding "side information", that is, bit allocations, scale factors, PMODES, TMODES, codebooks and so on. The remainder of the frame is made up of SUBFS consecutive audio subframes 188.

Each subframe begins with the audio coding side information 200, which relays to the decoder information about the key encoding systems used to compress the audio. These include transient detection, predictive coding, adaptive bit allocation, high frequency vector quantization, intensity coding and adaptive scaling.
Much of this data is unpacked from the data stream using the audio coding header information described above. The high frequency VQ code array 202 consists of one 10-bit index per high frequency subband, as indicated by the VQSUB indexes. The low frequency effects array 204 is optional and represents very low frequency data that can be used to drive, for example, a subwoofer. The audio array 206 is decoded using Huffman/fixed inverse quantizers and is divided into a number of sub-subframes (SSC), each decoding up to 256 PCM samples per audio channel. The oversampled audio array 208 is present only when the sampling frequency is greater than 48 kHz. To remain compatible, decoders that cannot operate at sampling rates above 48 kHz should skip this audio data array. DSYNC 210 is used to verify the end-of-subframe position within the audio frame. If the position does not verify, the audio decoded in that subframe is declared unreliable. As a result, either that frame is muted or the previous frame is repeated.

Subband decoder

Figure 20 is a block diagram of the subband sample decoder 18. The decoder is quite simple compared with the encoder and does not involve calculations that are of fundamental importance to the quality of the reconstructed audio, such as bit allocation. After synchronization, the unpacker unpacks the compressed audio data stream 16, detects and if necessary corrects transmission-induced errors, and demultiplexes the data into the individual audio channels. The subband differential signals are requantized into PCM signals, and each audio channel is inverse filtered to convert the signal back into the time domain.
Receive audio frame and unpack headers

The coded data stream is packed (or framed) at the encoder and, apart from the actual audio codes themselves, includes in each frame additional data for decoder synchronization, error detection and correction, audio coding status flags and coding side information. The unpacker detects the SYNC word and extracts the frame size FSIZE. The coded bit stream consists of consecutive audio frames, each beginning with a 32-bit (0x7ffe8001) synchronization word (SYNC). The physical size of the audio frame (FSIZE) is extracted from the bytes following the SYNC word. This allows the programmer to set an "end of frame" timer to reduce software overhead. Next, NBLKS is extracted, which allows the decoder to compute the audio window size (32(NBLKS+1)). This tells the decoder which side information to extract and how many reconstructed samples to generate.

Once the frame header bytes (sync, ftype, surp, nblks, fsize, amode, sfreq, rate, mixt, dynf, dynct, time, auxcnt, lff, hflag) have been received, the validity of the first 12 bytes can be checked using the Reed Solomon check bytes HCRC. These will correct 1 erroneous byte out of the 14 bytes, or flag 2 erroneous bytes. After error checking is complete, the header information is used to update the decoder flags.

The headers (filts, vernum, chist, pcmr, unspec) that follow HCRC, up to the optional information, can be extracted and used to update the decoder flags. Because this information does not change from frame to frame, a majority vote scheme can be used to compensate for bit errors.
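The header handling above can be illustrated with a small sketch. It is not the patent's implementation: it assumes a byte-aligned search for the SYNC word, it does not parse FSIZE or NBLKS because the text does not give their bit layout, and the function names are hypothetical. It only shows the SYNC search, the audio window size formula and a majority vote over values seen in recent frames.

```python
from collections import Counter

# 32-bit SYNC value quoted in the text (0x7ffe8001), as big-endian bytes.
SYNC_WORD = bytes.fromhex("7ffe8001")


def find_sync(stream: bytes, start: int = 0) -> int:
    """Return the byte offset of the next SYNC word, or -1 if none is found.

    Assumption: frames are byte aligned in the buffer being searched.
    """
    return stream.find(SYNC_WORD, start)


def audio_window_size(nblks: int) -> int:
    """Audio window size as given in the text: 32 * (NBLKS + 1)."""
    return 32 * (nblks + 1)


def majority_vote(values):
    """Return the most frequent value among those seen in recent frames.

    Illustrates the majority vote idea for header fields that do not
    change from frame to frame; ties resolve to the value seen first.
    """
    return Counter(values).most_common(1)[0][0]
```

With NBLKS = 127, for example, `audio_window_size` gives 4096, which matches the frame size of 4096 PCM samples per channel quoted for frame 186.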
The optional header data (times, mcoeff, dcoeff, auxd, ocrc) is extracted according to the dynf, time and auxcnt headers. The optional data can be verified using the optional Reed Solomon check bytes OCRC.

The audio coding frame headers (subfs, subs, chs, vqsub, joinx, thuff, shuff, bhuff, sel5, sel7, sel9, sel13, sel17, sel25, sel33, sel65, sel129, ahcrc) are transmitted once in every frame. They can be verified using the audio Reed Solomon check bytes AHCRC. Most headers are repeated for each audio channel, as defined by CHS.

Unpack subframe coding side information

The audio coding frame is divided into a number of subframes (SUBFS). All of the necessary side information (pmode, pvq, tmode, scales, abits, hfreq) is included so that each subframe of audio can be properly decoded without reference to any other subframe. Each successive subframe is decoded by first unpacking its side information.

A 1-bit prediction mode (PMODE) flag is transmitted for every active subband and across all audio channels. The PMODE flags are valid for the current subframe. PMODE=0 implies that the predictor coefficients are not included in the audio frame for that subband. In this case the predictor coefficients in this band are reset to zero for the duration of the subframe. PMODE=1 implies that the side information contains predictor coefficients for this subband. In this case the predictor coefficients are extracted and installed in the subband's predictor for the duration of the subframe.

For every PMODE=1 in the pmode array, a corresponding prediction coefficient VQ address index is located in the array PVQ.
These indexes are fixed unsigned 12-bit integer words, and the 4 prediction coefficients are extracted from a look-up table by mapping the 12-bit integer to the vector table 266.

The bit allocation indexes (ABIT) indicate the number of levels in the inverse quantizer that converts the subband audio codes back to absolute values. The unpacking format differs for the ABITs in each audio channel, depending on the BHUFF index and a specific VABIT code 256.

The transient mode side information (TMODE) 238 is used to indicate the position of transients in each subband with respect to the subframe. Each subframe is divided into 1 to 4 sub-subframes. In terms of subband samples, each sub-subframe consists of 8 samples. The maximum subframe size is 32 subband samples. If a transient occurs in the first sub-subframe, then tmode=0. A transient in the second sub-subframe is indicated when tmode=1, and so on. To control transient distortion, such as pre-echo, two scale factors are transmitted for subframe subbands where TMODE is greater than 0. The THUFF indexes extracted from the audio headers determine the method required to decode the TMODEs. When THUFF=3, the TMODEs are unpacked as unsigned 2-bit integers.

Scale factor indexes are transmitted to allow proper scaling of the subband audio codes within each subframe. If TMODE is equal to zero, one scale factor is transmitted. If TMODE is greater than zero for any subband, two scale factors are transmitted together.
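As a small illustration of the TMODE rules above, the sketch below maps a TMODE value to the number of scale factors to unpack and to the subband-sample offset of the sub-subframe that contains the transient. The function name and the returned tuple are hypothetical, and the 0 to 3 range assumes the 2-bit unsigned representation mentioned for THUFF=3.

```python
SAMPLES_PER_SUBSUBFRAME = 8  # subband samples per sub-subframe, per the text


def transient_info(tmode: int):
    """Return (number of scale factors, transient offset in subband samples).

    tmode = 0 means the transient is in the first sub-subframe, tmode = 1
    the second, and so on. One scale factor is transmitted when tmode is 0,
    two when it is greater than 0.
    """
    if not 0 <= tmode <= 3:
        raise ValueError("TMODE is assumed to be a 2-bit unsigned value")
    num_scale_factors = 1 if tmode == 0 else 2
    transient_offset = tmode * SAMPLES_PER_SUBSUBFRAME
    return num_scale_factors, transient_offset
```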
The SHUFF indexes 240 extracted from the audio headers determine the method required to decode the SCALES for each separate audio channel. The VDRMS_QL indexes determine the value of the RMS scale factor.

In certain modes the SCALES indexes are unpacked using a choice of five 129-level signed Huffman inverse quantizers. The resulting inverse quantized indexes are, however, differentially encoded, and are converted to absolute values as follows:

ABS_SCALE(n+1) = SCALES(n) - SCALES(n+1)

where n is the nth differential scale factor in the audio channel, counting from the first subband.

In low bit rate audio coding modes, the audio coder uses vector quantization to encode the high frequency subband audio samples directly and efficiently. No differential coding is used in these subbands, and all arrays relating to the normal ADPCM process must be held in reset. The first subband encoded using VQ is indicated by VQSUB, and all subbands up to SUBS are also encoded in this way.

The high frequency indexes (HFREQ) are unpacked 248 as fixed 10-bit unsigned integers. The 32 samples required for each subband subframe are extracted from the Q4 fractional binary LUT by applying the appropriate index. This is repeated for each channel in which the high frequency VQ mode is active.

The decimation factor for the effects channel is always X128. The number of 8-bit effect samples present in the LFE array is given by SSC*2 when PSC=0, or by (SSC+1)*2 when PSC is non-zero. An additional 7-bit scale factor (unsigned integer) is also included at the end of the LFE array, and this is converted to rms using a 7-bit LUT.

Unpack sub-subframe audio code array

The extraction process for the subband audio codes is driven by the ABIT indexes and, when ABIT<11, by the SEL indexes as well. The audio codes are formatted using either variable length Huffman codes or fixed linear codes.
Generally speaking, ABIT indexes of 10 or less imply Huffman variable length codes, which are selected by the codes VQL(n) 258, while ABIT indexes above 10 always signify fixed codes. All quantizers have a mid-tread uniform characteristic. For the fixed code (Y2) quantizers, the most negative level is dropped. The audio codes are packed into sub-subframes, each representing a maximum of 8 subband samples, and these sub-subframes are repeated up to four times in the current subframe.

If the sampling rate flag (SFREQ) indicates a rate higher than 48 kHz, the oversampled audio data array will be present in the audio frame. The first two bytes in this array indicate the byte size of the oversampled audio. In addition, the sampling rate of the decoder hardware should be set to operate at SFREQ/2 or SFREQ/4, depending on the high frequency sampling rate.

Unpack synchronization check

A data unpacking synchronization check word DSYNC=0xffff is detected at the end of every subframe so that the unpacking integrity can be verified. The use of variable length code words in the side information and audio codes, as is the case at low audio bit rates, can lead to unpacking misalignment if the headers, side information or audio arrays have been corrupted by bit errors. If the unpacking pointer does not point to the start of DSYNC, it can be assumed that the preceding subframe audio is unreliable.

Once all of the side information and audio data have been unpacked, the decoder reconstructs the multi-channel audio signal one subframe at a time.
Figure 20 illustrates the baseband decoder portion for a single subband in a single channel.

Reconstruct RMS scale factors

The decoder reconstructs the RMS scale factors (SCALES) for the ADPCM, VQ and JFC algorithms. Specifically, the VTMODE and THUFF indexes are inverse mapped to identify the transient mode (TMODE) for the current subframe. Thereafter, the SHUFF index, the VDRMS_QL codes and TMODE are inverse mapped to reconstruct the differential RMS code. The differential RMS code is inverse differentially coded 242 to select the RMS code, which is then inverse quantized 244 to produce the RMS scale factor.

Inverse quantize high frequency vectors

The decoder inverse quantizes the high frequency vectors to reconstruct the subband audio signals. Specifically, the extracted high frequency samples (HFREQ), which are signed 8-bit fractional (Q4) binary numbers as identified by the start VQ subband (VQSUBS), are mapped to an inverse VQ LUT 248. The selected table values are inverse quantized 250 and scaled 252 by the RMS scale factor.

Inverse quantize audio codes

Before entering the ADPCM loop, the audio codes are inverse quantized and scaled to produce reconstructed subband difference samples. The inverse quantization is achieved by first inverse mapping the VABIT and BHUFF indexes to specify the ABIT index, which determines the step size and the number of quantization levels, and then inverse mapping the SEL index and the VQL(n) audio codes, which produces the quantizer level codes QL(n). Thereafter, the code words QL(n) are mapped to the inverse quantizer look-up table 260 specified by the ABIT and SEL indexes.
Although the codes are ordered by ABIT, each separate audio channel has its own SEL specifier. The look-up process yields a signed quantizer level number, which can be converted to unit rms by multiplying it by the quantizer step size. The unit rms values are then converted to full difference samples by multiplying them by the designated RMS scale factor (SCALES) 262.

1. QL[n] = 1/Q[code[n]], where 1/Q is the inverse quantizer look-up table.
2. Y[n] = QL[n] * StepSize[abits]
3. Rd[n] = Y[n] * scale_factor, where Rd = reconstructed difference sample.
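The three steps above can be sketched as follows. This is only an illustration under stated assumptions: the look-up table contents, the step size and the scale factor are made-up values chosen for readability, not entries from the coder's actual tables, and the function name is hypothetical.

```python
def inverse_quantize(codes, inv_q_table, step_size, scale_factor):
    """Turn unpacked audio codes into reconstructed difference samples.

    codes        -- unpacked audio codes for one subband
    inv_q_table  -- mapping from code to signed quantizer level (the 1/Q
                    look-up table selected by ABIT and SEL)
    step_size    -- quantizer step size for the current ABIT index
    scale_factor -- RMS scale factor for the subband
    """
    reconstructed = []
    for code in codes:
        level = inv_q_table[code]       # step 1: QL[n] = 1/Q[code[n]]
        unit_rms = level * step_size    # step 2: Y[n] = QL[n] * StepSize
        reconstructed.append(unit_rms * scale_factor)  # step 3: Rd[n]
    return reconstructed


# Hypothetical 3-level mid-tread table, used purely for illustration.
EXAMPLE_TABLE = {0: 0, 1: 1, 2: -1}
EXAMPLE_OUTPUT = inverse_quantize([1, 0, 2], EXAMPLE_TABLE, 0.5, 4.0)
```

Keeping the table look-up, the step-size multiply and the scale-factor multiply as separate steps mirrors the text: the first two produce unit-rms values, and only the last depends on the transmitted scale factor.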

Inverse ADPCM

The ADPCM decoding process is executed for each subband difference sample as follows:

1. Load the prediction coefficients from the inverse VQ LUT 268.
2. Generate the prediction sample by convolving the current predictor coefficients with the previous 4 reconstructed subband samples held in the predictor history array 268.
   P[n] = sum(Coeff[i] * R[n-i]) for i = 1, 4, where n = current sample period.
3. Add the prediction sample to the reconstructed difference sample to produce a reconstructed subband sample 270.
   R[n] = Rd[n] + P[n]
4. Update the predictor history, that is, copy the current reconstructed subband sample to the top of the history list.
   R[n-i] = R[n-i+1] for i = 4, 1

When PMODE=0, the predictor coefficients are zero, the prediction sample is zero, and the reconstructed subband sample equals the differential subband sample.
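The decoding loop described in steps 1 to 4 can be sketched as a small class. This is a minimal illustration, not the patent's code: the class and method names are hypothetical, and the coefficients in the usage note are arbitrary values rather than entries from the prediction coefficient vector table.

```python
class SubbandPredictor:
    """Fourth-order inverse ADPCM predictor for one subband."""

    ORDER = 4

    def __init__(self):
        self.coeff = [0.0] * self.ORDER
        # history[0] holds R[n-1], history[3] holds R[n-4]
        self.history = [0.0] * self.ORDER

    def load_coefficients(self, coeff):
        """Step 1: install the four prediction coefficients (PMODE = 1)."""
        if len(coeff) != self.ORDER:
            raise ValueError("expected 4 prediction coefficients")
        self.coeff = list(coeff)

    def reset_coefficients(self):
        """PMODE = 0: coefficients are zero for the subframe."""
        self.coeff = [0.0] * self.ORDER

    def decode(self, rd):
        """Reconstruct one subband sample from a difference sample Rd[n]."""
        # Step 2: P[n] = sum(Coeff[i] * R[n-i]) for i = 1..4
        prediction = sum(c * r for c, r in zip(self.coeff, self.history))
        # Step 3: R[n] = Rd[n] + P[n]
        sample = rd + prediction
        # Step 4: push the new sample onto the top of the history list
        self.history = [sample] + self.history[:-1]
        return sample
```

For example, with coefficients `[0.5, 0.0, 0.0, 0.0]` and three difference samples of 1.0, `decode` returns 1.0, 1.5 and 1.75 in turn. After `reset_coefficients()`, each output equals its input while the history still advances, which is the behaviour the text requires for PMODE=0.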
Although the prediction calculation is redundant when PMODE=0, it is essential that the predictor history be kept updated in case PMODE becomes active in a future subframe. In addition, if HFLAG is active in the current audio frame, the predictor history should be cleared before the first sub-subframe in the frame is decoded. From that point on, the history should be updated as usual.

In the case of high frequency VQ subbands, or when subbands are deselected (that is, above the SUBS limit), the predictor history should remain cleared until the subband predictor becomes active.

Selection control of ADPCM, VQ and JFC decoding

A first "switch" controls the selection of either the ADPCM or the VQ output. The VQSUBS index identifies the start subband for VQ encoding. Therefore, if the current subband is lower than VQSUBS, the switch selects the ADPCM output. Otherwise it selects the VQ output. A second "switch" 278 controls the selection of either the direct channel output or the JFC coding output. The JOINX index identifies which channels are joined and in which channel the reconstructed signal is generated. The reconstructed JFC signal forms the intensity source for the JFC inputs in the other channels. Therefore, if the current subband is part of a JFC and is not in the designated channel, the switch selects the JFC output. Normally, the switch selects the channel output.

Down matrixing

The audio coding mode of the data stream is indicated by AMODE.
The decoded audio channels can then be redirected to match the physical output channel arrangement on the decoder hardware 280.

Dynamic range control data

Dynamic range coefficients DCOEFF may optionally be embedded in the audio frame at the encoding stage 282. The purpose of this feature is to allow convenient compression of the audio dynamic range at the decoder output. Dynamic range compression is particularly important in listening environments where high ambient noise levels make it impossible to discriminate low level signals without risking damage to the loudspeakers during loud passages. This problem is compounded by the growing use of 20-bit PCM audio recordings, which exhibit dynamic ranges as high as 110 dB.

Depending on the window size of the frame (NBLKS), either one, two or four coefficients are transmitted per audio channel for any coding mode (DYNF). If a single coefficient is transmitted, it is used for the entire frame. With two coefficients, the first is used for the first half of the frame and the second for the second half. Four coefficients are distributed over the four frame quadrants. Higher time resolution is possible by interpolating locally between the transmitted values.

Each coefficient is an 8-bit signed fractional Q2 binary number and represents a logarithmic gain value as shown in table (53), giving a range of +/-31.75 dB in steps of 0.25 dB. The coefficients are ordered by channel number. Dynamic range compression is effected by multiplying the decoded audio samples by the linear coefficient.
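The coefficient handling described above can be sketched as follows. Several points are assumptions made for illustration: the logarithmic gain is taken to be in dB on an amplitude scale (linear gain = 10^(dB/20)) instead of being read from table (53), no interpolation between coefficients is performed, and the function names are hypothetical.

```python
def dcoeff_to_linear(raw: int) -> float:
    """Convert an 8-bit signed Q2 coefficient to a linear gain.

    Q2 means two fractional bits, so one step is 0.25 dB and the usable
    range is +/-31.75 dB. The dB-to-linear conversion is an assumption.
    """
    if not -127 <= raw <= 127:
        raise ValueError("coefficient outside the +/-31.75 dB range")
    gain_db = raw * 0.25
    return 10.0 ** (gain_db / 20.0)


def coefficient_for_sample(coeffs, sample_index, frame_length):
    """Pick the coefficient covering a sample position in the frame.

    One coefficient covers the whole frame, two cover the halves and
    four cover the quadrants.
    """
    if len(coeffs) not in (1, 2, 4):
        raise ValueError("expected one, two or four coefficients")
    return coeffs[sample_index * len(coeffs) // frame_length]


def apply_dynamic_range(samples, coeffs):
    """Multiply each decoded sample by the linear gain for its position."""
    n = len(samples)
    return [
        s * dcoeff_to_linear(coefficient_for_sample(coeffs, i, n))
        for i, s in enumerate(samples)
    ]
```

Ignoring the coefficients, or passing a single coefficient of 0, leaves the samples unchanged, which corresponds to switching the compression off at the decoder.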
The degree of compression can be altered by appropriately adjusting the coefficient values at the decoder, or switched off completely by ignoring the coefficients.

32-band interpolation filter bank

The 32-band interpolation filter bank 44 converts the 32 subbands of each audio channel into a single PCM time domain signal. Non-perfect reconstruction coefficients (512-tap FIR filters) are used when FILTS=0. Perfect reconstruction coefficients are used when FILTS=1. Normally, the cosine modulation coefficients are pre-calculated and stored in ROM. The interpolation procedure can be expanded to reconstruct larger data blocks in order to reduce loop overhead. However, in the case of termination frames, the minimum resolution that may be called for is 32 PCM samples. The interpolation algorithm is as follows: create the cosine modulation coefficients, read 32 new subband samples into array XIN, multiply by the cosine modulation coefficients and create the temporary arrays SUM and DIFF, store the history, multiply by the filter coefficients, create 32 PCM output samples, update the working arrays, and output the 32 new PCM samples.

Depending on the bit rate and the coding scheme in operation, the bit stream can specify either non-perfect or perfect reconstruction interpolation filter bank coefficients (FILTS). Because the encoder decimation filter banks are computed with 40-bit floating point precision, the ability of the decoder to achieve the maximum theoretical reconstruction precision depends on the source PCM word length, on the precision of the DSP core used to compute the convolutions, and on the way the operations are scaled.

Low frequency effects PCM interpolation

The audio data associated with the low frequency effects channel is independent of the main audio channels. This channel is encoded using an 8-bit APCM process operating on a X128 decimated (120 Hz bandwidth) 20-bit PCM input. The decimated effects audio is time aligned with the current subframe audio in the main audio channels. Therefore, because the delay across the 32-band interpolation filter bank is 256 samples (512 taps), care must be taken to ensure that the interpolated low frequency effects channel is also aligned with the rest of the audio channels before output. No compensation is required if the effects interpolation FIR is also 512 taps.

The LFT algorithm uses a 512-tap 128X interpolation FIR as follows: map the 7-bit scale factor to rms, multiply by the step size of the 7-bit quantizer, generate sub-sample values from the normalized values, and interpolate by 128 using a low pass filter such as that given for each sub-sample.

Hardware implementation

Figures 21 and 22 describe the basic functional structure of the hardware implementation of a six-channel version of the encoder and decoder for operation at sampling rates of 32, 44.1 and 48 kHz. Referring to Figure 21, eight Analog Devices ADSP21020 40-bit floating point digital signal processor (DSP) chips 296 are used to implement a six-channel digital audio encoder 298. Six DSPs are used to encode the individual channels, while the seventh and eighth are used to implement the "global bit allocation and management" and "data stream formatter and error encoding" functions respectively. Each ADSP21020 is clocked at 33 MHz and uses external 48-bit X 32k program RAM (PRAM) 300 and 40-bit X 32k data RAM (SRAM) 302 to run the algorithms.
In the case of the encoders, an 8-bit × 512k EPROM 304 is also used to store fixed constants such as the variable-length code books. The data stream formatting DSP uses a Reed-Solomon CRC chip 306 to facilitate error detection and protection at the decoder. Communication between the encoder DSPs and the global bit allocation and management DSP is implemented using dual-port static RAM 308.

The encode processing flow is as follows. A 2-channel digital audio PCM data stream is extracted from the output of each of the three AES/EBU digital audio receivers 310. The first channel of each pair is directed to the CH1, 3 and 5 encoder DSPs respectively, and the second channel of each pair to CH2, 4 and 6 respectively. The PCM samples are read into the DSPs by converting the serial PCM words to parallel (s/p). Each encoder accumulates a frame of PCM samples and encodes the frame data as described above. Information about the estimated difference signal (ed(n)) and the subband samples (x(n)) of each channel is sent to the global bit allocation and management DSP through the dual-port RAM. The bit allocation strategy for each encoder is then read back in the same way. Once the encoding process is complete, the coded data and side information for the six channels are passed to the data stream formatter DSP through the global bit allocation and management DSP. At this stage CRC check bytes are selectively generated and added to the encoded data to provide error protection at the decoder. Finally, the entire data packet 16 is assembled and output.
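The exchange between the channel encoders and the global bit allocation and management DSP can be illustrated with a small sketch. It assumes that the prediction gain is estimated as the ratio of the rms of the subband samples x(n) to the rms of the estimated difference signal ed(n), that the mask-to-noise ratio is the signal-to-mask ratio reduced by a fraction of that gain (as the claims describe), and that each allocated bit buys roughly 6.02 dB. The function names, the fraction value and the 6.02 dB rule are illustrative assumptions, not values taken from this document.

```python
import math
from typing import Sequence


def rms(values: Sequence[float]) -> float:
    """Root-mean-square value of a block of samples."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def prediction_gain_db(x: Sequence[float], ed: Sequence[float]) -> float:
    """Estimated prediction gain in dB from x(n) and ed(n).

    Assumes ed(n) is not all zero.
    """
    return 20.0 * math.log10(rms(x) / rms(ed))


def mnr_db(smr_db: float, pgain_db: float, fraction: float) -> float:
    """Reduce the SMR by a fraction of the prediction gain (assumed form)."""
    return smr_db - fraction * pgain_db


def bits_for_mnr(mnr: float, db_per_bit: float = 6.02) -> int:
    """Smallest bit count whose assumed noise reduction covers the MNR."""
    return max(0, math.ceil(mnr / db_per_bit))


# Hypothetical subframe: the difference signal is 4x smaller than the input.
x = [4.0, -4.0, 4.0, -4.0]
ed = [1.0, -1.0, 1.0, -1.0]
pgain = prediction_gain_db(x, ed)
bits = bits_for_mnr(mnr_db(smr_db=30.0, pgain_db=pgain, fraction=0.5))
```

In a complete allocator these per-subband bit counts would then be summed and adjusted until the total approaches the target bit rate; that adjustment step is not shown here.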
Figure 22 illustrates a six-channel hardware decoder implementation. A single Analog Devices ADSP21020 40-bit floating-point digital signal processor (DSP) chip 324 is used to implement the six-channel digital audio decoder. The ADSP21020 is clocked at 33 MHz and uses external 48-bit × 32k program RAM (PRAM) 326 and 40-bit × 32k data RAM (SRAM) 328 to run the decoding algorithm. An additional 8-bit × 512k EPROM 330 is used to store fixed constants such as the variable-length entropy and prediction coefficient vector code books.

The decode processing flow is as follows. The compressed data stream 16 is input to the DSP through a serial-to-parallel converter (s/p) 332. The data is unpacked and decoded as described previously. The subband samples of each channel are reconstructed into a single PCM data stream 22 and output to three AES/EBU digital audio transmitter chips 334 through three parallel-to-serial converters (p/s) 335.

While several illustrative embodiments of the invention have been shown and described, numerous variations and alternative embodiments will occur to those skilled in the art. For example, as processor speeds increase and the cost of memory falls, the sampling frequencies, transmission rates and buffer sizes will most likely increase. Such variations and alternative embodiments are contemplated, and can be made without departing from the spirit and scope of the invention as defined in the appended claims.
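The low-frequency effects reconstruction described earlier (scale the quantizer codes, then interpolate by 128 with a 512-tap low-pass FIR, keeping the result aligned with the main channels) can be sketched as follows. This is only an illustration: the Hann-windowed sinc design stands in for the filter table, which is not reproduced in this text, and the function names and the mapping from scale factor to rms are assumptions.

```python
import math
from typing import List, Sequence

FACTOR = 128  # interpolation ratio stated in the description
TAPS = 512    # FIR length stated in the description


def design_lowpass(taps: int = TAPS, factor: int = FACTOR) -> List[float]:
    """Hann-windowed sinc low-pass (an assumed design, not the patent's table)."""
    centre = (taps - 1) / 2.0
    coeffs = []
    for n in range(taps):
        t = (n - centre) / factor
        sinc = 1.0 if t == 0.0 else math.sin(math.pi * t) / (math.pi * t)
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (taps - 1))
        coeffs.append(sinc * window)
    return coeffs


def dequantize(codes: Sequence[int], rms: float, step_size: float) -> List[float]:
    """Scale quantizer codes by rms * step size to get sub-sample values."""
    scale = rms * step_size
    return [code * scale for code in codes]


def interpolate(sub_samples: Sequence[float], coeffs: Sequence[float],
                factor: int = FACTOR) -> List[float]:
    """Zero-stuff by `factor` and filter; returns len(sub_samples) * factor samples."""
    out = [0.0] * (len(sub_samples) * factor)
    for k, x in enumerate(sub_samples):
        start = k * factor
        for j, c in enumerate(coeffs):
            if start + j >= len(out):
                break
            out[start + j] += x * c
    return out


def alignment_delay(main_taps: int = 512, lfe_taps: int = TAPS) -> int:
    """Extra delay, in output samples, to add to the LFE path so that it
    lines up with a main filter bank of `main_taps` taps."""
    return (main_taps - lfe_taps) // 2


h = design_lowpass()
pcm = interpolate(dequantize([1, 0, 0, 0], rms=2.0, step_size=0.5), h)
```

With both filters at 512 taps, `alignment_delay()` is zero, which matches the remark that no compensation is needed in that case.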


Claims (1)

1. A multi-channel audio coder, comprising:
a frame grabber (64) that applies an audio window to each channel of a multi-channel audio signal sampled at a sampling rate to produce respective sequences of audio frames;
a plurality of filters (34) that split the channels' audio frames into respective pluralities of frequency subbands over a baseband frequency range, said frequency subbands each comprising a sequence of subband frames that have at least one subframe of audio data per subband frame;
a plurality of subband encoders (26) that code the audio data in the respective frequency subbands a subframe at a time into encoded subband signals;
a multiplexer (32) that packs and multiplexes the encoded subband signals into an output frame for each successive data frame, thereby forming a data stream at a transmission rate; and
a controller (19) that sets the size of the audio window based on the sampling rate and transmission rate so that the size of said output frames is constrained to lie in a desired range.

2. The multi-channel audio coder of claim 1, wherein the controller sets the audio window size as the largest multiple of two that is less than
(Frame Size × Fsamp × 8 / Trate),
where Frame Size is the maximum size of the output frame, Fsamp is the sampling rate, and Trate is the transmission rate.

3. The multi-channel audio coder of claim 1, wherein the multi-channel audio signal is encoded at a target bit rate and the subband encoders comprise predictive coders, further comprising:
a global bit manager (GBM) (30) that computes a psychoacoustic signal-to-mask ratio (SMR) and an estimated prediction gain (Pgain) for each subframe, computes mask-to-noise ratios (MNRs) by reducing the SMRs by respective fractions of their associated prediction gains, allocates bits to satisfy each MNR, computes the allocated bit rate over all subbands, and adjusts the individual allocations so that the actual bit rate approximates the target bit rate.

4. The multi-channel audio coder of claim 1 or 3, wherein the subband encoder splits each subframe into a plurality of sub-subframes, each subband encoder comprising a predictive coder (72) that generates and quantizes an error signal for each subframe, further comprising:
an analyzer (98, 100, 102, 104, 106) that generates an estimated error signal prior to coding for each subframe, detects transients in each sub-subframe of the estimated error signal, generates a transient code that indicates whether there is a transient in any sub-subframe other than the first and in which sub-subframe the transient occurs, and, when a transient is detected, generates a pre-transient scale factor for those sub-subframes before the transient and a post-transient scale factor for those sub-subframes including and after the transient, and otherwise generates a uniform scale factor for the subframe,
said predictive coder using said pre-transient, post-transient and uniform scale factors to scale the error signal prior to coding so as to reduce coding error in the sub-subframes corresponding to the pre-transient scale factors.

5. A multi-channel audio coder, comprising:
a frame grabber (64) that applies an audio window to each channel of a multi-channel audio signal sampled at a sampling rate to produce respective sequences of audio frames, said audio frames having an audio bandwidth that extends from DC to approximately half the sampling rate;
a prefilter (46) that splits each of said audio frames into a baseband frame that represents a baseband portion of the audio bandwidth and a high sampling rate frame that represents the remaining portion of the audio bandwidth;
a high sampling rate encoder (48, 50, 52) that encodes the audio channels' high sampling rate frames into respective encoded high sampling rate signals;
a plurality of filters (34) that split the channels' baseband frames into respective pluralities of frequency subbands, said frequency subbands each comprising a sequence of subband frames that have at least one subframe of audio data per subband frame;
a plurality of subband encoders (26) that code the audio data in the respective frequency subbands a subframe at a time to produce encoded subband signals; and
a multiplexer (32) that packs and multiplexes the encoded subband signals and high sampling rate signals into an output frame for each successive data frame, thereby forming a data stream at a transmission rate, so that the baseband and high sampling rate portions of the multi-channel audio signal are independently decodable.

6. The multi-channel audio coder of claim 5, further comprising:
a controller (19) that sets the size of the audio window based on the sampling rate and transmission rate so that the size of said output frames is constrained to lie in a desired range.

7. A multi-channel audio coder, comprising:
a frame grabber (64) that applies an audio window to each channel of a multi-channel audio signal sampled at a sampling rate to produce respective sequences of audio frames;
a plurality of filters (34) that split the channels' audio frames into respective pluralities of frequency subbands over a baseband frequency range, said frequency subbands each comprising a sequence of subband frames that have at least one subframe of audio data per subband frame;
a global bit manager (GBM) (30) that computes a psychoacoustic signal-to-mask ratio (SMR) and an estimated prediction gain (Pgain) for each subframe, computes mask-to-noise ratios (MNRs) by reducing the SMRs by respective fractions of their associated prediction gains, allocates bits to satisfy each MNR, computes the allocated bit rate over all subbands, and adjusts the individual allocations so that the actual bit rate approximates the target bit rate;
a plurality of subband encoders (26) that code the audio data in the respective frequency subbands a subframe at a time in accordance with the bit allocation to produce encoded subband signals; and
a multiplexer (32) that packs and multiplexes the encoded subband signals and bit allocation into an output frame for each successive data frame, thereby forming a data stream at a transmission rate.

8. The multi-channel audio coder of claim 7, wherein the GBM (30) allocates the remaining bits according to a minimum mean-square-error (mmse) scheme when the allocated bit rate is less than the target bit rate.

9. The multi-channel audio coder of claim 7, wherein the GBM (30) calculates a root-mean-square (RMS) value for each subframe and, when the allocated bit rate is less than the target bit rate, reallocates all of the available bits according to the mmse scheme as applied to the RMS values until the allocated bit rate approximates the target bit rate.

10. The multi-channel audio coder of claim 7, wherein the GBM (30) calculates a root-mean-square (RMS) value for each subframe and allocates all of the remaining bits according to the mmse scheme as applied to the RMS values until the allocated bit rate approximates the target bit rate.

11. The multi-channel audio coder of claim 7, wherein the GBM (30) calculates a root-mean-square (RMS) value for each subframe and allocates all of the remaining bits according to the mmse scheme as applied to the differences between the subframes' RMS and MNR values until the allocated bit rate approximates the target bit rate.

12. The multi-channel audio coder of claim 7, wherein the GBM (30) sets the SMR to a uniform value so that the bits are allocated according to a minimum mean-square-error (mmse) scheme.

13. A multi-channel fixed-distortion variable-rate audio coder, comprising:
a frame grabber (64) that applies an audio window to each channel of a multi-channel audio signal sampled at a sampling rate to produce respective sequences of audio frames, said multi-channel audio signal having N-bit resolution;
a plurality of perfect reconstruction filters (34) that split the channels' audio frames into respective pluralities of frequency subbands over a baseband frequency range, said frequency subbands each comprising a sequence of subband frames that have at least one subframe of audio data per subband frame;
a global bit manager (GBM) (30) that computes a root-mean-square (RMS) value for each subframe and allocates bits to the subframes based on the RMS values so that an encoded distortion level is less than one half of the least significant bit of the audio signal's N-bit resolution;
a plurality of predictive subband encoders (26) that code the audio data in the respective frequency subbands a subframe at a time in accordance with the bit allocation to produce encoded subband signals; and
a multiplexer (32) that packs and multiplexes the encoded subband signals and bit allocation into an output frame for each successive data frame, thereby forming a data stream at a transmission rate, said data stream being decodable into a decoded multi-channel audio signal that equals said multi-channel audio signal to the N-bit resolution.

14. The multi-channel audio coder of claim 13, wherein said baseband frequency range has a maximum frequency, further comprising:
a prefilter (46) that splits each of said audio frames into a baseband signal and a high sampling rate signal at frequencies in the baseband frequency range and above the maximum frequency, respectively, said GBM allocating bits to the high sampling rate signal to satisfy the selected fixed distortion; and
a high sampling rate encoder (48, 50, 52) that encodes the audio channels' high sampling rate signals into respective encoded high sampling rate signals;
said multiplexer packing the channels' encoded high sampling rate signals into the respective output frames so that the baseband and high sampling rate portions of the multi-channel audio signal are independently decodable.

15. The multi-channel audio coder of claim 13, further comprising:
a controller (19) that sets the size of the audio window based on the sampling rate and transmission rate so that the size of said output frames is constrained to lie in a desired range.

16. A multi-channel fixed-distortion variable-rate audio coder, comprising:
a programmable controller (19) for selecting one of a fixed perceptual distortion and a fixed minimum mean-square-error (mmse) distortion;
a frame grabber (64) that applies an audio window to each channel of a multi-channel audio signal sampled at a sampling rate to produce respective sequences of audio frames;
a plurality of filters (34) that split the channels' audio frames into respective pluralities of frequency subbands over a baseband frequency range, said frequency subbands each comprising a sequence of subband frames that have at least one subframe of audio data per subband frame;
a global bit manager (GBM) (30) that responds to the distortion selection by selecting between an associated mmse scheme, which computes a root-mean-square (RMS) value for each subframe and allocates bits to the subframes based on the RMS values until the fixed mmse distortion is satisfied, and a psychoacoustic scheme, which computes a psychoacoustic signal-to-mask ratio (SMR) and an estimated prediction gain (Pgain) for each subframe, computes mask-to-noise ratios (MNRs) by reducing the SMRs by respective fractions of their associated prediction gains, and allocates bits to satisfy each MNR;
a plurality of subband encoders (26) that code the audio data in the respective frequency subbands a subframe at a time in accordance with the bit allocation to produce encoded subband signals; and
a multiplexer (32) that packs and multiplexes the encoded subband signals and bit allocation into an output frame for each successive data frame, thereby forming a data stream at a transmission rate.

17. A multi-channel audio decoder for reconstructing multiple audio channels up to a decoder sampling rate from a data stream, in which each audio channel was sampled at an encoder sampling rate that is at least as high as the decoder sampling rate, subdivided into a plurality of frequency subbands, and compressed and multiplexed into the data stream at a transmission rate, comprising:
an input buffer (324) for reading in and storing the data stream a frame at a time, each of said frames including a sync word, a frame header, an audio header, and at least one subframe, which includes audio side information, a plurality of sub-subframes having baseband audio codes over a baseband frequency range, a block of high sampling rate audio codes over a high sampling rate frequency range, and an unpack sync;
a demultiplexer (40) that a) detects the sync word, b) unpacks the frame header to extract a window size that indicates the number of audio samples in the frame and a frame size that indicates the number of bytes in the frame, said window size being set as a function of the ratio of the transmission rate to the encoder sampling rate so that the frame size is constrained to be less than the size of the input buffer, c) unpacks the audio header to extract the number of subframes in the frame and the number of encoded audio channels, and d) sequentially unpacks each subframe to extract the audio side information, demultiplexes the baseband audio codes in each sub-subframe into the multiple audio channels and unpacks each audio channel into its subband audio codes, demultiplexes the high sampling rate audio codes into the multiple audio channels up to the decoder sampling rate and skips the remaining high sampling rate audio codes up to the encoder sampling rate, and detects the unpack sync to verify the end of the subframe;
a baseband decoder (42, 44) that uses the side information to decode the subband audio codes into reconstructed subband signals a subframe at a time without reference to any other subframes;
a baseband reconstruction filter (44) that combines each channel's reconstructed subband signals into a reconstructed baseband signal a subframe at a time;
a high sampling rate decoder (58, 60) that uses the side information to decode the high sampling rate audio codes into a reconstructed high sampling rate signal for each audio channel a subframe at a time; and
a channel reconstruction filter (62) that combines the reconstructed baseband and high sampling rate signals into a reconstructed multi-channel audio signal a subframe at a time.

18. The multi-channel audio decoder of claim 17, wherein the baseband reconstruction filter (44) comprises a non-perfect reconstruction (NPR) filter bank and a perfect reconstruction (PR) filter bank, and said frame header includes a filter code that selects one of said NPR and PR filter banks.

19. The multi-channel audio decoder of claim 17, wherein the baseband decoder comprises a plurality of inverse adaptive differential pulse code modulation (ADPCM) coders (268, 270) for decoding the respective subband audio codes, said side information including prediction coefficients for the respective ADPCM coders and a prediction mode (PMODE) for controlling the application of the prediction coefficients to the respective ADPCM coders so as to selectively enable and disable their prediction capabilities.

20. The multi-channel audio decoder of claim 17, wherein said side information comprises:
a bit allocation table for each channel's subbands, in which each subband's bit rate is fixed over the subframe;
at least one scale factor for each subband in each channel; and
a transient mode (TMODE) for each subband in each channel that identifies the number of scale factors and their associated sub-subframes, said baseband decoder scaling the subbands' audio codes by their respective scale factors in accordance with their TMODEs to facilitate decoding.
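The window-size rule of claim 2 can be worked through numerically. The sketch below assumes the bound is the frame size in bytes, times 8 bits per byte, times Fsamp, divided by Trate, and that "less than" is strict; the function names and the example figures (an 8192-byte frame limit, 48 kHz, 1536 kbit/s) are hypothetical.

```python
def audio_window_size(frame_size_bytes: int, fsamp_hz: int, trate_bps: int) -> int:
    """Largest multiple of two strictly less than FrameSize * Fsamp * 8 / Trate."""
    numerator = frame_size_bytes * fsamp_hz * 8
    # Largest integer strictly below the bound, using exact integer arithmetic.
    below = (numerator - 1) // trate_bps
    # Round down to an even number of samples.
    return below - (below % 2)


def output_frame_bytes(window: int, fsamp_hz: int, trate_bps: int) -> float:
    """Bytes produced per frame when `window` samples are sent at `trate_bps`."""
    return window * trate_bps / (8 * fsamp_hz)


window = audio_window_size(frame_size_bytes=8192, fsamp_hz=48000, trate_bps=1536000)
frame_bytes = output_frame_bytes(window, fsamp_hz=48000, trate_bps=1536000)
```

For these example figures the bound is exactly 2048 samples, so the window is 2046 samples and the resulting output frame stays just under the 8192-byte limit.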
TW85114822A 1996-05-02 1996-11-30 A multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels TW315561B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US08/642,254 US5956674A (en) 1995-12-01 1996-05-02 Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels

Publications (1)

Publication Number Publication Date
TW315561B (en) 1997-09-11

Family

ID=51566734

Family Applications (1)

Application Number Title Priority Date Filing Date
TW85114822A TW315561B (en) 1996-05-02 1996-11-30 A multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels

Country Status (1)

Country Link
TW (1) TW315561B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8041578B2 (en) 2006-10-18 2011-10-18 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
US8126721B2 (en) 2006-10-18 2012-02-28 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
US8417532B2 (en) 2006-10-18 2013-04-09 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal

Similar Documents

Publication Publication Date Title
US11315579B2 (en) Metadata driven dynamic range control
KR100277819B1 (en) Multichannel Predictive Subband Coder Using Psychoacoustic Adaptive Bit Assignment
US6675148B2 (en) Lossless audio coder
US7333929B1 (en) Modular scalable compressed audio data stream
KR100571824B1 (en) Method for encoding/decoding of embedding the ancillary data in MPEG-4 BSAC audio bitstream and apparatus using thereof
US7848931B2 (en) Audio encoder
JP2012198555A (en) Extraction method and device of important frequency components of audio signal, and encoding and/or decoding method and device of low bit rate audio signal utilizing extraction method
KR20100089772A (en) Method of coding/decoding audio signal and apparatus for enabling the method
JP2004199064A (en) Audio encoding method, decoding method, encoding device and decoding device capable of adjusting bit rate
TW315561B (en) A multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels
JP2003332914A (en) Encoding method for digital signal, decoding method therefor, apparatus for the methods and program thereof
Bii MPEG-1 Layer III Standard: A Simplified Theoretical Review
Chen et al. Fast time-frequency transform algorithms and their applications to real-time software implementation of AC-3 audio codec
Noll et al. Digital audio: from lossless to transparent coding
Cavagnolo et al. Introduction to Digital Audio Compression
Smyth An Overview of the Coherent Acoustics Coding System
Ning Analysis and coding of high quality audio signals
Bosi et al. DTS Surround Sound for Multiple Applications
Noll et al. Lossless and perceptual coding of digital audio
Jayant Digital audio communications
Ferreira The perceptual audio coding concept: from speech to high-quality audio coding

Legal Events

Date Code Title Description
MK4A Expiration of patent term of an invention patent