TW584835B

TW584835B - Method and architecture of digital coding for transmitting and packing audio signals

Info

Publication number: TW584835B
Application number: TW91136087A
Authority: TW
Inventors: Chi-Min Liu; Wen-Chieh Lee
Original assignee: Univ Nat Chiao Tung
Priority date: 2002-12-13
Filing date: 2002-12-13
Publication date: 2004-04-21
Also published as: TW200410201A

Abstract

A digital coding method converts an input audio signal into a sequence of frequency samples that represents the spectral composition of the signal, and then quantizes the frequency samples into quantized values according to a bit allocation procedure. Using a masking threshold, the bit allocation procedure predicts the quantization parameters with a parameter predictor. The quantized values are coded into coded data comprising multiple bits. If the number of bits in the coded data exceeds a preset number of available bits, an iterative rate-control loop adjusts the quantization parameters and the quantization step size. Before the frequency samples are quantized, the method can also cut off the high-frequency portion of the input audio signal at a cutoff frequency that is determined by the iterative rate-control loop.

Description

Technical Field

The present invention relates to a digital coding method and architecture for transmitting and packing signals, and in particular to bit allocation in audio signal coding.

Prior Art

Perceptual audio coding, such as MPEG Layers 1 to 3, Advanced Audio Coding (AAC), and time/frequency (T/F) coding, is widely used in consumer electronics, telecommunications, and broadcasting. In these perceptual audio coders, bit allocation is a major task: it is the key module that incurs high complexity and determines the coding quality.

Fig. 1 is a block diagram of the encoding process in a perceptual audio coder. A time/frequency mapper (T/F mapper) 101 converts the audio signal S(n) from the time domain into frequency segments S(m, f) on a window-by-window basis. Various coders 103 are also used in the encoding process to achieve high compression ratios. The output X(m, f) is the coded sequence in the frequency domain, indexed by the window segment index m and the frequency index f. A quantizer 105 quantizes X(m, f) into a finite number of levels, denoted X'(m, f), with the aim of minimizing the impairments caused by quantization noise. The quantization levels are controlled by quantization parameters.

For compression, the frequency lines are grouped into sets called quantization bands. The number of frequency lines in a quantization band is determined by the critical bands and by the bits available for transmitting the quantization parameters. A variable-length coder (VLC) 107 codes the quantized sequence X'(m, f) with variable-length codes chosen according to the statistical probability of occurrence of the transformed signal. A packing unit 109 assembles the final coded results into a sequence defined by a specified audio protocol. A psychoacoustic model 111 analyzes the signal and, from the analysis, provides a signal-to-masking ratio (SMR) for each quantization band.

A bit allocator 113 determines the quantization parameters by referring to the masking thresholds provided by the psychoacoustic model 111 and to the available bit budget 115. A non-uniform quantizer quantizes the spectral lines under the control of the bit allocator, which weighs the final audio quality against the required bits to decide how to quantize.

Control of the audio quality and of the number of bits is therefore the basic requirement of a bit allocator.

US Patent 5,579,430 discloses a digital coding process based on optimum coding in the frequency domain (OCF). It provides music coding of a quality comparable to compact disc (CD) at a data rate of about 2 bits/ATW, and of good FM radio broadcast quality at a data rate of 1.5 bits/ATW. US Patent 5,924,060 discloses a digital coding process for transmitting and/or storing acoustical signals that reduces the data rate by a factor of between 4 and 6 without substantially degrading the quality of the music signal.

In MPEG Layers 1 and 2, a uniform quantizer is used to control quality and bit requirements. The bit allocator therefore simply distributes the total number of available bits among the subband signals for quantization, so that the audibility of the quantization noise is minimized. In coders such as MPEG Layer 3, MPEG-2 AAC, and MPEG-4 T/F coding, control is harder for two reasons. First, they all use non-uniform quantizers, whose quantization noise varies with the input values. Second, they use variable-length codes that assign different bit lengths to different values, so the number of bits consumed has to be obtained from the quantization result rather than from the quantization parameters.

These drawbacks make the quantization parameters difficult to estimate. A two-nested-loop iterative method, known as OCF, was proposed to solve this problem. As shown in Fig. 2, it estimates the quantization parameters through two iterative loops, a rate-control loop and a quality-control loop. The rate-control loop iteratively adjusts the parameter values to fit the limited number of bits, which is obtained by Huffman coding the quantized spectral lines. The quality-control loop iteratively adjusts the parameter values to satisfy the perceptual criterion on the quantization noise, which has to be evaluated through inverse quantization.

For a frame with F spectral lines, the complexity of the method can be described as O(F·r·η + F·q·γ), where q and r are the numbers of iterations of the quality-control loop and the rate-control loop, and η and γ are the computational complexities of processing a spectral line in the rate-control loop and the quality-control loop, respectively. The rate-control loop complexity η comes from quantization and VLC coding of the spectral lines, while the quality-control loop complexity γ comes from dequantization and noise measurement. Both η and γ are high. Moreover, the iteration counts q and r depend on the initial values of the quantization parameters and on the adjustment method. This complexity is even greater than the combined complexity of the hybrid transform and the psychoacoustic model in Fig. 1.
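The two nested loops described above can be sketched roughly as follows. This is a toy illustration, not the OCF reference algorithm: the power-law quantizer, the bit-cost proxy `count_bits` (a stand-in for Huffman coding), the step growth of 2^(1/4), the per-band amplification of 2^(1/2), and the `max_outer` guard are all assumptions made for the example.

```python
import math

def quantize(x, step):
    """Toy power-law quantizer, rounding to nearest (assumed stand-in)."""
    return [int((abs(v) / step) ** 0.75 + 0.5) for v in x]

def count_bits(q):
    """Hypothetical bit-cost proxy standing in for Huffman coding."""
    return sum(1 + 2 * math.ceil(math.log2(v + 1)) for v in q)

def band_noise(x, q, step):
    """Mean squared error after dequantizing one band."""
    return sum((abs(v) - (k ** (4 / 3)) * step) ** 2 for v, k in zip(x, q)) / len(x)

def nested_loops(bands, masks, budget, max_outer=32):
    """bands: lists of spectral lines; masks: allowed noise energy per band."""
    amp = [1.0] * len(bands)              # per-band amplification
    rate_iters = 0
    for outer in range(1, max_outer + 1):
        # Inner rate-control loop: enlarge the common step until the bits fit.
        step = 1.0
        while True:
            rate_iters += 1
            q = [quantize([v * a for v in b], step) for b, a in zip(bands, amp)]
            if sum(count_bits(k) for k in q) <= budget:
                break
            step *= 2 ** 0.25
        # Outer quality-control loop: dequantize, measure noise, and
        # amplify every band whose noise exceeds its masking threshold.
        noisy = [i for i, (b, k, a) in enumerate(zip(bands, q, amp))
                 if band_noise([v * a for v in b], k, step) / (a * a) > masks[i]]
        if not noisy:
            break
        for i in noisy:
            amp[i] *= 2 ** 0.5
    return q, step, amp, outer, rate_iters
```

Every pass of the outer loop re-runs the whole inner loop, which is where the cost grows; the `max_outer` cap is needed because nothing guarantees that the two loops agree on a solution.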

The way bits are assigned to the quantization bands in the quality-control loop determines the quality of the audio coding. There are two ways to assign bits. The first assigns bits, in each iteration, only to the band with the worst noise-to-masking ratio. The number of iterations is then large, which means rather high complexity. The second assigns bits in each iteration to several bands at once, until all available bits are used up. Its complexity is much lower than that of the first method; what is in doubt is whether its quality is satisfactory. The first method can shape the noise so that it follows the masking threshold, a criterion that is widely accepted. The second method has been used in the ISO sample code.

The two nested loops can also fail to converge. Because the two loops follow two different rules, one controlling quality and the other controlling bit consumption, an endless loop can result; this is known as the "deadlock" problem. The usual ways to handle the deadlock are to set a maximum on the number of iterations, or to use some heuristic parameter adjustment that looks after both the quality and the number of loops. With these methods, however, the quality still cannot be guaranteed.

Summary of the Invention

The present invention overcomes the above drawbacks of conventional digital coding processes. Its main object is to provide a digital coding method, of high quality and lower computational complexity, for transmitting and packing audio signals.

According to the invention, the input audio signal is first converted into a sequence of frequency samples that represents the spectral composition of the signal. The frequency sample sequence is quantized according to a bit allocation procedure and a parameter predictor. The parameter predictor predicts the quantization parameters directly from the masking thresholds. The quantized values are coded with a variable-length code, or packed directly into a specific audio format. If the total length of the coded data exceeds the number of available bits, the parameters are adjusted and the quantization step size is increased. This process is repeated until the number of available bits is greater than the number of bits the coding requires. Finally, the final coded sequence is packed into a sequence defined by the specific audio format.

The method is described in detail, including its derivation, using the non-uniform quantizer of MPEG Layer 3, and the complexity and audio quality of this perceptual coding method are examined. The derivation is based on the segmental noise-to-masking ratio and yields a closed-form equation for the relation between bits/step size and quantization noise. The method is not limited to MPEG Layer 3; it can also be applied to most perceptual coders, such as MPEG AAC (Advanced Audio Coding). The new bit allocation criterion provided by the invention can likewise be applied to coders with uniform quantizers, such as MPEG Layers 1 and 2.

Another object of the invention is to provide an architecture that implements this digital coding process. The architecture comprises a converter, a quantizer, a parameter predictor, a variable-length coder, a packing unit, an adjuster, and a comparator, and can be realized with signal processors to carry out the method of the invention.

According to the invention, for low-bit-rate coding the quantization parameters are evaluated directly from the quality condition. Through the rate-control loop, the quantization bandwidth over the non-uniform frequency lines and the required number of bits are taken into account and can be adjusted. For variable-bit-rate coding, the rate-control loop can be removed entirely.

The above and other objects and advantages of the invention are detailed below with reference to the drawings, the detailed description of the embodiments, and the claims.

Embodiments

Fig. 3a illustrates the procedure of the audio coding method of the invention. Referring to Fig. 3a, the input audio signal is first converted into a sequence of frequency samples that represents its spectral composition.

Next, the frequency sample sequence is quantized according to a bit allocation procedure to obtain symbols of lower precision. A parameter predictor predicts the quantization parameters directly from the masking thresholds, which are the limits of the noise that the human auditory system can hear. The quantization parameters are the parameters of a compression system that determine the resolution of the signal levels. The quantized symbols are then coded by a variable-length coder. The next step checks whether a predetermined number of available bits is sufficient for the coded data. If the total length of the coded data exceeds the number of available bits, the parameters are adjusted and the quantization step size is increased. This is repeated until the number of available bits is greater than the number of bits the coding requires. Finally, the final coded sequence is packed into a sequence defined by a specific audio format.

For low-bit-rate audio coding, the high-frequency part can be cut off before the quantization parameters are predicted. Fig. 3b illustrates the steps of this low-bit-rate procedure. As shown in Fig. 3b, when the number of bits required by the coding exceeds the number of available bits, the cut-off frequency is adjusted and passed on, so that the high-frequency part is cut off before the quantization parameters are predicted. If necessary, the quantization step size is adjusted as well.

For variable-bit-rate audio coding, the available bits can be adjusted according to the required quality. In this case the iterative rate-control loop can be removed entirely. Fig. 3c illustrates this variable-bit-rate procedure, in which the iterative rate-control loop steps are removed from Fig. 3a.
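The single rate-control loop of Figs. 3a and 3b might be sketched as below. The text does not say how the adjuster trades off cut-off frequency against step size, so the alternation used here is purely an assumption, as are the toy quantizer, the bit-cost proxy `count_bits`, the growth factor, and the `max_iter` guard. `predicted_steps` stands for whatever per-band step sizes the parameter predictor delivers.

```python
import math

def quantize(x, step):
    """Toy power-law quantizer, rounding to nearest (assumed stand-in)."""
    return [int((abs(v) / step) ** 0.75 + 0.5) for v in x]

def count_bits(q):
    """Hypothetical bit-cost proxy standing in for variable-length coding."""
    return sum(1 + 2 * math.ceil(math.log2(v + 1)) for v in q)

def encode_frame(bands, predicted_steps, budget,
                 growth=2 ** 0.25, min_bands=1, max_iter=1000):
    """Quantize with predicted steps; loop only while the bits do not fit."""
    n_bands = len(bands)      # bands above the cut-off are dropped
    scale = 1.0               # common factor applied to all predicted steps
    for iterations in range(1, max_iter + 1):
        q = [quantize(b, s * scale)
             for b, s in zip(bands[:n_bands], predicted_steps)]
        if sum(count_bits(k) for k in q) <= budget:
            return q, scale, n_bands, iterations
        if n_bands > min_bands and iterations % 2 == 0:
            n_bands -= 1      # lower the cut-off frequency (assumed policy)
        else:
            scale *= growth   # enlarge the quantization step size
    raise ValueError("bit budget too small for this frame")
```

When the predicted steps already fit the budget, the function returns after one pass, which corresponds to the variable-bit-rate case of Fig. 3c where no rate-control iteration is needed.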

The coding procedures of Figs. 3a to 3c can be implemented with signal processors. The detailed architecture of such an implementation is disclosed below.

Fig. 4a shows the architecture that implements Fig. 3a. A converter 401 receives the input audio signal and converts it into a sequence of frequency samples that represents the spectral composition of the signal. A quantizer 402 quantizes the frequency sample sequence into a finite number of levels according to a bit allocation procedure. A parameter predictor 405 estimates the quantization parameters directly from a masking threshold, and an optimal coder then codes the quantized levels. When the predetermined number of available bits is not sufficient for the coded data, an adjuster 407 adjusts the quantization parameters. A comparator 408 compares the predetermined number of available bits with the length of the coded data to check whether the available bits are sufficient. A packing unit 409 packs the final coded sequence into a sequence defined by a specific audio format.

Figs. 4b and 4c show the architectures that implement Figs. 3b and 3c, respectively. Referring to Fig. 4b, an adjuster 413 adjusts the cut-off frequency and, in low-bit-rate audio coding, passes it to a high-frequency cut-off unit 411. The adjuster 413 can also adjust the quantization step size in the quantizer 402. The high-frequency cut-off unit 411 is inserted between the converter 401 and the quantizer 402; it receives the adjusted cut-off frequency and passes it to the parameter predictor 405.

In variable-bit-rate coding, the components related to the iterative rate-control loop steps are all removed, as shown in Fig. 4c.

In the invention, a deterministic formula based on a constant masking-to-noise ratio is derived to compute the quantization parameters in the parameter predictor of the bit allocation procedure. This deterministic formula provides a closed form of a noise predictor for the non-uniform quantizer. The derivation is described in detail with MPEG Layer 3 as the embodiment. A similar procedure can be used for an MPEG AAC quantizer.

With a single-step prediction, the bit allocation of the invention meets the bit rate and noise shaping requirements of every subband. The optimal global factor and the scaling factor of each subband are predicted directly from a masking threshold. The global factor controls the total number of bits consumed, while the scaling factor controls the quantization noise of the associated band relative to the other bands. The following sections first state the requirements of this bit allocation, then detail the derivation of the noise predictor and the bounds on the scaling factor under the zero-band and negative noise-to-masking ratio (NMR) conditions.

Bit Allocation Conditions

First, consider minimizing the segmental NMR, expressed as

$$\arg\min_{R(i)} \sum_i \frac{\sigma_N^2(i)}{\sigma_M^2(i)} \tag{1}$$

where $\sigma_N^2(i)$ and $\sigma_M^2(i)$ are the noise energy and the masking energy associated with critical band $i$, and $R(i)$ is the bit rate chosen to minimize the NMR. In a PCM coder with $R(i)$ bits per sample, the quantization error variance is

$$\sigma_N^2(i) = \varepsilon^2\, 2^{-2R(i)}\, \sigma_x^2(i) \tag{2}$$

Therefore the minimization

$$\arg\min_{R(i)} \sum_i \frac{\varepsilon^2\, 2^{-2R(i)}\, \sigma_x^2(i)}{\sigma_M^2(i)} \tag{3}$$

must be subject to the total bit rate, that is,

$$\sum_i B(i)\,R(i) = R \tag{4}$$

where $B(i)$ is the bandwidth of critical band $i$ and $R$ is the total number of bits.

According to the method of Lagrange multipliers, with multiplier $\lambda$, the solution must satisfy

$$\frac{\partial}{\partial R(j)}\left[\sum_i \frac{\sigma_N^2(i)}{\sigma_M^2(i)} + \lambda \sum_i B(i)\,R(i)\right] = 0$$

so that

$$\lambda = (2\ln 2)\,\frac{\varepsilon^2\, 2^{-2R(j)}\, \sigma_x^2(j)}{\sigma_M^2(j)\,B(j)} = (2\ln 2)\,\frac{\sigma_N^2(j)}{\sigma_M^2(j)\,B(j)} \quad \text{for all } j \tag{5}$$

Hence $R(j)$ must be chosen so that the noise-to-masking ratio is proportional to the bandwidth, that is,

$$\sigma_N^2(j) \propto \sigma_M^2(j)\,B(j) \quad \text{for all } j \tag{6}$$

To obtain the optimal segmental NMR, the noise level must therefore be proportional to the masking threshold multiplied by a bandwidth.
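The Lagrange condition can be checked numerically. The sketch below assumes the rate constraint has the bandwidth-weighted form sum of B(i)·R(i) equal to a total, and that the PCM noise model of (2) holds with a constant `eps2`; the function name `allocate` and the sample numbers are made up for illustration. Rates are not clamped, so they can come out negative for very weak bands.

```python
import math

def allocate(sig, mask, bw, total_bits, eps2=1.0):
    """Closed-form rates R(j) that make noise / (mask * bandwidth) constant.

    sig, mask, bw: per-band signal variance, masking energy, bandwidth.
    """
    a = [eps2 * s / (m * b) for s, m, b in zip(sig, mask, bw)]
    half_bw = 0.5 * sum(bw)
    # Solve sum_j B(j) * R(j) = total_bits for the common constant c.
    log2_c = (0.5 * sum(b * math.log2(v) for b, v in zip(bw, a)) - total_bits) / half_bw
    rates = [0.5 * (math.log2(v) - log2_c) for v in a]
    # Noise energy from the PCM model: eps2 * 2^(-2R) * signal variance.
    noise = [eps2 * 2 ** (-2 * r) * s for r, s in zip(rates, sig)]
    return rates, noise

sig, mask, bw = [100.0, 40.0, 5.0], [1.0, 0.5, 0.2], [4, 8, 16]
rates, noise = allocate(sig, mask, bw, total_bits=60)
ratios = [n / (m * b) for n, m, b in zip(noise, mask, bw)]
```

All entries of `ratios` coincide up to rounding, which is the statement that the noise energy is proportional to the masking energy times the bandwidth.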

Second, the masking thresholds and the critical bandwidths within a quantization band are considered in order to select the noise level of that quantization band. In other words, a noise level for each quantization band is sought to replace $\sigma_N^2(i)$ so that the segmental NMR is minimized:

[Equation (7) is not legible in the source.]

where $q$ is the index of the quantization band. The problem is to find the noise level that best approximates the energy minimizing the segmental NMR, that is,

[Equation (8) is not legible in the source.]

If the masking energy of the critical bands within the quantization band is uniform, the resulting choice is

[Equation (9) is not legible in the source.]

Third, to avoid allocating bits to bands whose masking level is higher than the noise level, the criterion that minimizes the segmental NMR must be modified so that bands with a negative NMR are clamped to a ratio of 1. That is, the noise of these bands must have a lower bound. On the other hand, noise higher than the signal causes the band to be quantized entirely to zero; such bands are called zero bands. Zero bands are quite noticeable, so the noise must also be limited to be no greater than the signal energy.

In summary, when bits are assigned under the zero-band and negative-NMR conditions, the noise must equal the masking threshold multiplied by the bandwidth.

The derivation of the noise predictor is detailed here with the MPEG Layer 3 quantizer. From the MPEG Layer 3 standard, the simplified formula of the Layer 3 non-uniform quantizer is

$$is_i = \operatorname{int}\!\left[\left(\frac{xr_i}{\Delta_q}\right)^{3/4}\right] \tag{10}$$

The quantization step size in (10) is

$$\Delta_q = 2^{\,gain - scale_q} \tag{11}$$

From the MPEG standard, the formula of the non-uniform quantizer can also be expressed as

$$is_i = \operatorname{int}\!\left[\left(xr_i \cdot 2^{\,scale_q - gain_{gr}}\right)^{3/4} - 0.0946\right] \tag{12}$$

where, for each quantization band $q$, the scale factor is

$$scale_q = \tfrac{1}{2}\,(1 + scalefac\_scale)\,(scalefac_q + preflag \cdot pretab_q)$$

Here $scalefac\_scale$ is 0 or 1, $scalefac_q$ lies between 0 and 15, and $preflag$ and $pretab_q$ are the pre-amplification flag and its table entry. For each granule of the MPEG Layer 3 structure, the global gain is

$$gain_{gr} = \tfrac{1}{4}\,(global\_gain_{gr} - 210)$$

Ignoring the term 0.0946, (12) can be written as

$$is_i = \operatorname{int}\!\left[\left(xr_i \cdot 2^{\,scale_q - gain_{gr}}\right)^{3/4}\right] = \operatorname{int}\!\left[\left(\frac{xr_i}{\Delta_q}\right)^{3/4}\right] \tag{13}$$

where the step size is $\Delta_q = 2^{\,gain_{gr} - scale_q}$.
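A minimal sketch of the simplified quantizer (13) and its inverse follows. It assumes that int(·) means rounding to the nearest integer, that the rounding offset 0.0946 is ignored as in (13), and that the global gain maps as (global_gain − 210)/4, which is how the Layer 3 dequantization formula scales it. The function names are illustrative, not taken from any codec library.

```python
import math

def scale_q(scalefac, scalefac_scale=0, preflag=0, pretab=0):
    """Scale factor of one quantization band."""
    return 0.5 * (1 + scalefac_scale) * (scalefac + preflag * pretab)

def gain_gr(global_gain):
    """Global gain of a granule (assumed Layer 3 mapping)."""
    return (global_gain - 210) / 4

def quantize_line(xr, gain, scale):
    """is = int((|xr| / step)^(3/4)) with step = 2^(gain - scale)."""
    step = 2.0 ** (gain - scale)
    return math.floor((abs(xr) / step) ** 0.75 + 0.5)

def dequantize_line(is_val, gain, scale):
    """Reconstructed magnitude: is^(4/3) * step."""
    step = 2.0 ** (gain - scale)
    return (is_val ** (4 / 3)) * step

# Raising the scale factor by 1 halves the step size for that band.
fine = quantize_line(100.0, gain_gr(210), scale_q(2))
coarse = quantize_line(100.0, gain_gr(210), scale_q(0))
```

Because the step size depends only on the difference gain − scale, a larger scale factor gives a band finer resolution relative to the others, while the global gain shifts all bands together.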

Next, the input signal $xr_i$ and the reconstructed signal $\hat{xr}_i$ are expressed by the following two formulas:

$$xr_i = \left((is_i + \varepsilon_i)\,\Delta_q^{3/4}\right)^{4/3} \quad\text{and}\quad \hat{xr}_i = \left(is_i\,\Delta_q^{3/4}\right)^{4/3}$$

The quantization error $e_i$ of the non-uniform quantizer equals the difference between the input signal $xr_i$ and the reconstructed signal $\hat{xr}_i$:

$$e_i = \left((is_i + \varepsilon_i)\,\Delta_q^{3/4}\right)^{4/3} - \left(is_i\,\Delta_q^{3/4}\right)^{4/3} = \left(1 + is_i^{-1}\varepsilon_i\right)^{4/3} is_i^{4/3}\,\Delta_q - is_i^{4/3}\,\Delta_q \tag{14}$$

Let $f(\varepsilon_i) = \left(1 + is_i^{-1}\varepsilon_i\right)^{4/3}$. Approximating it by the first-order Taylor expansion $f(0) + f'(0)\,\varepsilon_i$ gives

$$e_i \approx \tfrac{4}{3}\, is_i^{1/3}\,\Delta_q\,\varepsilon_i$$

Assuming that the quantized signal $is_i$ and the rounding error $\varepsilon_i$ are independent of each other, the expected value of the squared quantization error of the non-uniform quantizer is

$$E[e_i^2] \approx \tfrac{16}{9}\,\Delta_q^2\, E\!\left[is_i^{2/3}\,\varepsilon_i^2\right] \approx \tfrac{16}{9}\,\Delta_q^2\, E\!\left[is_i^{2/3}\right] E\!\left[\varepsilon_i^2\right] \tag{15}$$
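The first-order approximation can be checked numerically. The sketch below compares the exact error of a 4/3-power reconstruction with its linearization, and compares the mean squared error over a uniform rounding error with the value (16/9)·Δ²·is^(2/3)/12. It assumes the rounding error is uniform on [−0.5, 0.5] and requires `is_val` of at least 1; all function names are illustrative.

```python
def exact_error(is_val, eps, step):
    """Exact difference between input and reconstruction."""
    return ((is_val + eps) ** (4 / 3) - is_val ** (4 / 3)) * step

def linear_error(is_val, eps, step):
    """First-order Taylor approximation of the same difference."""
    return (4 / 3) * is_val ** (1 / 3) * step * eps

def mean_square_error(is_val, step, n=1000):
    """Average of exact_error**2 for eps on a midpoint grid over [-0.5, 0.5]."""
    total = 0.0
    for k in range(n):
        eps = -0.5 + (k + 0.5) / n
        total += exact_error(is_val, eps, step) ** 2
    return total / n

def predicted_mean_square(is_val, step):
    """(16/9) * step^2 * is^(2/3) * E[eps^2], taking E[eps^2] = 1/12."""
    return (16 / 9) * step ** 2 * is_val ** (2 / 3) / 12

e = exact_error(50, 0.3, 2.0)
l = linear_error(50, 0.3, 2.0)
m = mean_square_error(50, 2.0)
p = predicted_mean_square(50, 2.0)
```

The relative gap between the exact and linearized error is about eps/(6·is), so the approximation is tight for large quantized values and loosest for values near 1.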

If the spectrum within a quantization band is uniform, the noise of a line can be taken as the average noise energy of the quantization band; that is,

$$E(e_i^2) = E(e_q^2) \tag{16}$$

Because $E[\varepsilon^2] = \tfrac{1}{12}$, (15) becomes

$$E[e_i^2] \approx \tfrac{4}{27}\,\Delta_q^{3/2}\, E\!\left[|XR_i|^{1/2}\right] \tag{17}$$

Substituting (7) into (16) gives

$$E[e_q^2] = K\,\sigma_M^2(q)\,B(q) \tag{18}$$

Finally, by defining $T_q = \sigma_M^2(q)\,B(q)$, the difference between the global gain and the scale factor is approximately

$$gain_{gr} - scale_q \approx \tfrac{2}{3}\log_2\!\left(\tfrac{27}{4}\,K\,T_q \,/\, E\!\left[|XR_q|^{0.5}\right]\right)$$

or

$$gain_{gr} - scale_q \approx \tfrac{2}{3}\left(\log_2\tfrac{27}{4} + \log_2 K + \log_2 T_q - \log_2 E\!\left[|XR_q|^{0.5}\right]\right) \tag{19}$$

Because the scale factor values lie between 0 and 16, and the smallest scale factor among the quantization bands must be zero, the global gain is

$$gain_{gr} = \max_q\left\{gain_{gr} - scale_q\right\}$$

from which the scale factors of all subbands are obtained. The global gain thus changes with the bit-rate-related constant $K$, while the scale factor of each subband varies with the masking threshold and the input signal.
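The single-step prediction can be sketched as follows, under the assumptions that the per-band target noise is K·T_q, that the band noise follows (4/27)·Δ^(3/2)·E[|XR|^0.5] with Δ = 2^(gain − scale), and that every band contains at least one non-zero line. Rounding of the results to the integer grid that real Layer 3 scale factors use is not handled, and the function names are illustrative.

```python
import math

def gain_minus_scale(K, T_q, mean_sqrt_xr):
    """(2/3) * log2((27/4) * K * T_q / E[|XR_q|^0.5])."""
    return (2 / 3) * math.log2(6.75 * K * T_q / mean_sqrt_xr)

def predict_parameters(bands, thresholds, K=1.0):
    """Return the global gain and one scale factor per quantization band."""
    diffs = []
    for lines, T_q in zip(bands, thresholds):
        mean_sqrt = sum(math.sqrt(abs(v)) for v in lines) / len(lines)
        diffs.append(gain_minus_scale(K, T_q, mean_sqrt))
    gain = max(diffs)                    # smallest scale factor becomes zero
    scales = [gain - d for d in diffs]
    return gain, scales

bands = [[16.0, 16.0], [4.0, 4.0]]
gain, scales = predict_parameters(bands, [1.0, 1.0], K=1.0)
# Noise implied for band 0 by the predicted step size.
step0 = 2.0 ** (gain - scales[0])
noise0 = (4 / 27) * step0 ** 1.5 * 4.0   # E[|XR|^0.5] = 4 for band 0
```

Changing `K` shifts every difference by the same amount, so it moves only the global gain and leaves the scale factors unchanged, which matches the roles the text assigns to the two parameters.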

如前所述，位元分配應受限於非負值NMR和零波段的條件下。對於非負值NMR的情形，雜音級別被設定為遮罩門檻值；亦即7；和/c = 1。如此產生相對於全域比率調整之的上限。 gaingr - Uscaleq = -( l〇g2 — +1〇§2 σ1ί{ς)"" | ^ (20) 意即， scalea <Όscale = gaingr -\(l〇g2 — + l°g2 σΜ(ς) "| J) q 4 (21 )As mentioned earlier, bit allocation should be limited to non-negative NMR and zero band conditions. For the case of non-negative NMR, the noise level is set to the mask threshold; that is, 7; and / c = 1. This results in an upper limit relative to the global ratio adjustment. gaingr-Uscaleq =-(l〇g2 — + 1〇§2 σ1ί {ς) " " | ^ (20) That is, scalea < Όscale = gaingr-\ (l〇g2 — + l ° g2 σΜ ( ς) " | J) q 4 (21)

The upper limit is adjusted according to the available bits. The lower limit follows from the zero-band condition. A zero band occurs when the noise exceeds the signal energy, that is,

$$[\ldots] \quad (22)$$

Therefore, the lower limit of the scale factor is

$$\text{scale}_q \ge D\text{scale}_q = \text{gain}_{gr} - [\ldots]\,\log_2 E\left[|XR_q|^{0.5}\right] \quad (23)$$

Figure 5 shows the average number of quality-control and rate-control iterations for the present invention and for the MPEG bit allocation procedure on different test materials; R denotes the average number of rate-control iterations. As Figure 5 shows, the allocation method of the present invention removes the iterations required for quality control and reduces the number of rate-control iterations by a factor greater than 3.

Figure 6 shows the objective scores of the present invention compared with the ISO bit allocation method. The scores are obtained with the perceptual evaluation of audio quality (PEAQ) system, which is recommended by ITU-R Working Group 10/4. ISO denotes the original reference code, and ISO1 denotes the version improved by adopting LAME's termination conditions. The experiments use stereo mode and psychoacoustic model 2. Because the bit reservoir and […] are unrelated to the bit allocation method, both mechanisms were turned off in the experiments. The objective difference grade (ODG) is the output variable of the objective measurement method. The ODG should ideally lie between 0 and -4, where 0 corresponds to an imperceptible impairment and -4 to an impairment judged very annoying. As Figure 6 shows, the method of the present invention achieves better quality than the other methods shown in the figure.

The PEAQ configuration used here is the basic version, which uses an FFT-based ear model. It uses the following model output variables: BandwidthRefB, BandwidthTestB, Total NMRB, WinModDiff1B, ADBB, EHSB, AvgModDiff1B, AvgModDiff2B, RmsNoiseLoudB, MFPDB and RelDistFramesB. An artificial neural network whose hidden layer contains three nodes maps these 11 model output variables to a single quality index.

Figure 7 lists the test signals used in the objective and subjective tests. By setting the same iteration termination conditions, such as the number of iterations, the bands whose noise does not increase with the scale factor, and the fit to the scale factor table [http://www.mp3dev.org/mp3], the ISO algorithm can be improved with the methods described by LAME (an mp3 encoder regarded as having the best quality). The two nested loops used for comparison are based on the iterative algorithm used by LAME.

The above describes only preferred embodiments of the present invention and does not limit the scope of its implementation. All equivalent changes and modifications made according to the claims of the present invention remain within the scope covered by this patent.

Brief description of the drawings

Figure 1 is a block diagram of an encoding process in conventional audio coding.
Figure 2 shows a bit allocation procedure in […].
Figure 3a illustrates the steps of the audio encoding procedure of the present invention.
Figure 3b illustrates the steps of the […] bit-rate audio encoding procedure of the present invention.
Figure 3c illustrates the steps of the variable bit-rate audio encoding procedure of the present invention.
Figure 4a illustrates an implementation architecture for Figure 3a.
Figures 4b and 4c illustrate implementation architectures for Figures 3b and 3c, respectively.
Figure 5 shows the average number of quality-control and bit-rate-control iterations of the present invention and of the MPEG bit allocation procedure for different test materials.
Figure 6 shows the objective scores of the present invention compared with the ISO bit allocation method.
Figure 7 lists the test signals used in the objective and subjective tests.


Description of reference numerals

101 Time/frequency converter
103 Other encoders
105 Quantizer
107 Variable-length encoder
109 Packing unit
111 Psychoacoustic model
113 Bit allocator
115 Available bits
401 Converter
402 Quantizer
403 Variable-length encoder
405 Parameter estimator
407 Adjuster
408 Comparator
409 Packing unit
411 High-frequency cut-off
413 Adjuster


Claims

1. A digital coding method for transmitting and packing audio, comprising the following steps: (a) converting input audio into a frequency sample sequence that represents a spectral composition of the audio; (b) quantizing the frequency sample sequence into quantized values according to a bit allocation procedure, wherein the bit allocation procedure uses a parameter estimator to estimate quantization parameters by means of a masking threshold; (c) encoding the quantized values with a symbol encoder to form encoded data comprising a plurality of bits; and (d) packing the encoded data into a data sequence according to a specific audio specification.

2. The digital coding method for transmitting and packing audio as described in item 1, wherein step (b) is performed by a uniform quantizer or a non-uniform quantizer.

3. The digital coding method for transmitting and packing audio as described in item 1, wherein the symbol encoder comprises a variable-length encoder.

4. The digital coding method for transmitting and packing audio as described in item 1, wherein, for a quantization band, the parameter estimator in the bit allocation procedure uses a formula to calculate and adjust at least a corresponding global factor and/or a band scale factor, the formula being based on a constant mask-to-noise ratio.

5. The digital coding method for transmitting and packing audio as described in item 4, wherein the bit allocation procedure of step (b) further includes the following steps: adjusting the global factor according to a predetermined number of available bits of the encoded data; and generating an upper limit and a lower limit of the band scale factor corresponding to the global factor of the quantization band.

6. The digital coding method for transmitting and packing audio as described in item 5, wherein the upper limit is constrained by a non-negative noise-to-mask ratio.

7. The digital coding method for transmitting and packing audio as described in item 5, wherein the lower limit is constrained by the zero band.

8. The digital coding method for transmitting and packing audio as described in item 4, wherein the band scale factor differs for each sub-band according to the masking threshold and the input audio.

9. The digital coding method for transmitting and packing audio as described in item 4, wherein the global factor varies with a bit-rate-dependent constant.

10. The digital coding method for transmitting and packing audio as described in item 1, wherein, before step (d), the method further includes an iterative rate control loop comprising the following steps: (c1) if the number of bits contained in the encoded data does not exceed a predetermined number of available bits of the encoded data, continuing with step (d), otherwise continuing with step (c2); and (c2) adjusting the quantization parameter and a quantization step size to be used in step (b), and returning to step (b).

11. The digital coding method for transmitting and packing audio as described in item 10, wherein step (b) is performed by a uniform quantizer or a non-uniform quantizer.

12. The digital coding method for transmitting and packing audio as described in item 10, wherein, if the number of bits contained in the encoded data exceeds the predetermined number of available bits of the encoded data, at least one corresponding global factor and one band scale factor are adjusted, and the quantization step size is increased in step (c2).

13. The digital coding method for transmitting and packing audio as described in item 1, wherein the symbol encoder comprises a variable-length encoder.

14. The digital coding method for transmitting and packing audio as described in item 10, wherein step (b) further includes the following step: before quantizing the frequency sample sequence, cutting off the high frequencies for low bit-rate audio coding.

15. The digital coding method for transmitting and packing audio as described in item 14, wherein the iterative rate control loop further includes a step of adjusting a cut-off frequency.

16. The digital coding method for transmitting and packing audio as described in item 10, wherein, for a quantization band, the parameter estimator in the bit allocation procedure uses a formula to calculate and adjust at least a corresponding global factor and/or a band scale factor, the formula being based on a constant mask-to-noise ratio.

17. The digital coding method for transmitting and packing audio as described in item 16, wherein the bit allocation procedure of step (b) further includes the following steps: adjusting the global factor according to a predetermined number of available bits of the encoded data; and generating an upper limit and a lower limit of the band scale factor corresponding to the global factor of the quantization band.

18. The digital coding method for transmitting and packing audio as described in item 17, wherein the upper limit is constrained by a non-negative noise-to-mask ratio.

19. The digital coding method for transmitting and packing audio as described in item 17, wherein the lower limit is constrained by the zero band.

20. The digital coding method for transmitting and packing audio as described in item 16, wherein the band scale factor differs for each sub-band according to the masking threshold and the input audio.

21. The digital coding method for transmitting and packing audio as described in item 16, wherein the global factor varies with a bit-rate-dependent constant.

22. A digital coding structure for transmitting and packing audio, the structure comprising: a converter that converts input audio into a frequency sample sequence representing a spectral composition of the audio; a parameter estimator that estimates quantization parameters according to a masking threshold; a quantizer that quantizes the frequency sample sequence into quantized values based on the quantization parameters; a variable-length encoder that encodes the quantized values into encoded data comprising a plurality of bits; and a packing unit that packs the encoded data into a data sequence according to a specific audio specification.

23. The digital coding structure for transmitting and packing audio as described in item 22, the structure further comprising: a comparator that compares the number of bits contained in the encoded data with a predetermined number of available bits of the encoded data; and an adjuster that adjusts the quantization parameters when the number of bits contained in the encoded data exceeds the predetermined number of available bits of the encoded data.

24. The digital coding structure for transmitting and packing audio as described in item 23, the structure further comprising a high-frequency cut-off unit connected to the converter and the quantizer, the unit also having an input port for receiving a cut-off frequency from the adjuster.
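The iterative rate control loop of items 10 and 12 can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: the uniform quantizer, the step-size law `2 ** (gain / 4)`, the toy bit count in `count_bits` (a stand-in for a real variable-length encoder), the increment of one per iteration, and all function and parameter names.

```python
from typing import List, Tuple


def quantize(spectrum: List[float], gain: int) -> List[int]:
    """Uniform quantizer; the step-size law 2**(gain/4) is an assumed example."""
    step = 2.0 ** (gain / 4.0)
    return [int(round(x / step)) for x in spectrum]


def count_bits(values: List[int]) -> int:
    """Toy bit cost standing in for a variable-length encoder (hypothetical)."""
    # One bit for a zero, otherwise sign bit + magnitude bits + one flag bit.
    return sum(1 if v == 0 else 2 + abs(v).bit_length() for v in values)


def rate_control(
    spectrum: List[float],
    initial_gain: int,
    available_bits: int,
    max_gain: int = 255,
) -> Tuple[List[int], int, int]:
    """Repeat quantize/encode until the encoded size fits the bit budget."""
    gain = initial_gain
    iterations = 0
    while True:
        values = quantize(spectrum, gain)   # step (b): quantize
        bits = count_bits(values)           # step (c): encode and count bits
        # Step (c1): stop when the data fits; max_gain is a safety stop.
        if bits <= available_bits or gain >= max_gain:
            return values, gain, iterations
        gain += 1                           # step (c2): enlarge the step size
        iterations += 1


spectrum = [100.0, -50.0, 25.0, 3.0, 0.5]
values, gain, iterations = rate_control(spectrum, initial_gain=0, available_bits=20)
print(values, gain, iterations)
```

A better initial estimate of the quantization parameters means the loop starts closer to a gain that fits the budget, so fewer passes through step (c2) are needed.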
