IX. Description of the Invention:

[Technical Field]
The present invention relates to an audio decoder and a method thereof.

[Prior Art]
Audio and video transmission technology is widely used in information technology fields such as video conferencing, digital television and Internet telephony. Because audio and video data are large in volume, the industry commonly uses codec technology to transmit them: the data are encoded at the transmitting end according to a given encoding rule and decoded at the receiving end according to the corresponding decoding rule. The transmitting end normally encodes against an encoding clock, and the receiving end, when recovering the audio and video data, must set a decoding clock consistent with the encoding clock so that the data play back continuously and in order. For example, the widely adopted Moving Picture Experts Group (MPEG) series of standards exploits spatial and temporal redundancy to compress video effectively, while its audio compression mainly exploits the subjective noise-perception characteristics of the human ear. For example, the human ear effectively perceives frequencies in the range of 20 Hz to 20 kHz, so audio compression can emphasize signals within that range and ignore signals outside it; the non-flatness of the sound spectrum can also be exploited for compression.

However, the sampling-frequency formats of the transmitting end and the receiving end of an audio decoding device may differ.
For example, the sampling frequency used for encoding at the transmitting end may not match the sample output rate at the receiving end, in which case the sampling frequency must be converted. In addition, channel congestion and similar causes may produce a mismatch between the local clock at the receiving end and the encoding clock at the transmitting end. For example, in MPEG-2, the program stream [...]

1337812

According to one embodiment of the present invention, an audio decoder decodes at least one audio data stream output by an audio encoder. The audio decoder comprises a parsing unit, a decoding unit, a resampling unit and a control unit. The parsing unit receives the external audio data stream and depacketizes it to obtain audio data. The decoding unit decodes the audio data and applies an inverse discrete cosine transform (IDCT) and windowing to the decoded audio data to obtain a plurality of pulse-code-modulation (PCM) samples. The resampling unit resamples the PCM samples according to a sampling frequency ratio. The control unit controls the operation of the audio decoder.

Accordingly, another aspect of the present invention provides an audio decoding method that gives the output audio signal higher frequency-control accuracy and lowers the cost of the audio decoder.

According to another embodiment of the present invention, an audio decoding method comprises: receiving an external audio data stream and depacketizing it; decoding the depacketized audio data stream and performing an inverse discrete cosine transform (IDCT) and windowing to obtain a plurality of PCM samples; and resampling the PCM samples according to a predetermined sampling frequency ratio before outputting them.
According to still another embodiment of the present invention, an audio decoding method comprises: parsing an audio packet to obtain audio data; performing a decoding procedure on the audio data to obtain at least one PCM sample; resampling the PCM sample; and filtering the adjusted PCM sample.

[...] the sampling algorithm filters the sample values to reconstruct the output waveform of the sample points and thereby adjusts the sample output rate. In FIG. 2, the ordered arrows indicate the sampling instants; for clarity, only a few sample points are shown. Waveform A exists at the original sampling frequency. When the actual audio sample output rate is greater than the playback rate of the audio decoder, waveform reconstruction yields waveform B, which has the same shape as waveform A but a higher sampling frequency. When the actual audio sample output rate is lower than the playback rate of the audio decoder, waveform reconstruction yields waveform C, which has the same shape as waveform A but a lower sampling frequency. The reconstruction algorithm used here may be, for example, an interpolation algorithm, a short-time Fourier transform algorithm or a frequency-domain prediction algorithm, or a reasonable combination of any of these, such as a time-domain interpolation algorithm combined with a Fourier transform algorithm. For example, in one embodiment of the present invention, a time-domain interpolation algorithm performs the sampling-frequency conversion by successive approximation through interpolation, thereby adjusting the sample output rate.
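
The time-domain interpolation mentioned above can be illustrated with a minimal sketch. The specification does not give the interpolation formula, so plain linear interpolation is used here as the simplest stand-in; the function name `resample_linear` and its parameters are hypothetical. No anti-aliasing filter is included, so a practical down-conversion would need low-pass filtering first.

```python
import math


def resample_linear(samples, ratio):
    """Resample PCM values by linear interpolation (illustrative only).

    ratio = new_sampling_frequency / original_sampling_frequency.
    ratio > 1 produces more samples (like waveform B),
    ratio < 1 produces fewer samples (like waveform C).
    Assumption: linear interpolation stands in for the unspecified
    interpolation scheme of the specification.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    n_in = len(samples)
    if n_in == 0:
        return []
    # Number of output samples covering the same time span as the input.
    n_out = max(1, int(round(n_in * ratio)))
    out = []
    for k in range(n_out):
        pos = k / ratio              # output instant k on the input time axis
        i = int(math.floor(pos))     # index of the input sample just before pos
        frac = pos - i               # fractional distance to the next input sample
        if i >= n_in - 1:
            # Past the last input sample: hold the final value.
            out.append(float(samples[-1]))
        else:
            out.append((1.0 - frac) * samples[i] + frac * samples[i + 1])
    return out
```

Calling `resample_linear(pcm, 32000 / 48000)` would, under these assumptions, shorten a 48 kHz block to the sample count expected at 32 kHz while keeping the waveform shape.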
Referring to FIG. 3, a block diagram of the resampling unit according to an embodiment of the present invention is shown. The resampling unit 105 comprises a detecting device 131, a frequency ratio control device 132 and a frequency adjusting device 133.

The detecting device 131 can detect a conversion of the sampling frequency and/or an error in the sampling frequency. It determines from information in the received data stream whether the sampling frequency needs to be adjusted; for example, it can make this determination from the header information of the basic packets of the audio data stream. The frequency ratio control device 132 computes a frequency adjustment reference value from the output of the detecting device 131, or records that output directly as the frequency adjustment reference value. When it is detected that the sampling frequency must be converted for a format conversion and/or that a sampling-frequency error exists [...]

[...] implemented in cascaded form. Formally, one resampling application can first be filtered to complete its frequency change, for example converting the encoder's 48.005 kHz to 48 kHz to correct the error, and then the other resampling application can be filtered to complete its frequency change, for example converting 48 kHz to 32 kHz. The two resampling operations can be linked by convolution.

FIG. 4 is a flowchart of an audio decoding method according to an embodiment of the present invention. The audio decoding method comprises:

Step 401: receive an external audio data stream and depacketize it.
Step 403: decode the depacketized audio data stream and perform an inverse discrete cosine transform (IDCT) and windowing to obtain PCM samples.

Step 405: resample the PCM samples according to a predetermined sampling frequency ratio and output them.

FIG. 5 is a detailed flowchart of step 405 of the audio decoding method. Step 405 comprises:
Step 501: detect a conversion of the sampling frequency and/or an error in the sampling frequency, and generate a frequency adjustment ratio reference value.

Step 503: output the ratio of the new sampling frequency to the original sampling frequency according to the frequency adjustment ratio reference value.

Step 505: according to the ratio of the new sampling frequency to the original sampling frequency, reconstruct the waveform by filtering, convert and/or adjust the sampling frequency, and then output the sample points.

The resampling step uses filtering, according to a given algorithm, to reconstruct the output waveform of the sample points and thereby adjust the sample output rate. Such algorithms include interpolation algorithms, short-time Fourier transform algorithms and frequency-domain prediction [...]
[...] one or more of these algorithms are used for filtering to reconstruct the waveform. When the PCM samples are resampled, the resampling ratio of the sampling frequency is variable. In one embodiment of the present invention, the precision range of the sampling frequency can be set as needed through software programming.

The resampling ratio of the sampling frequency can be set according to information provided by the audio data stream. In some embodiments conforming to the MPEG standard, for correcting a sampling-frequency error, the resampling ratio is determined from the value of the PTS field contained in the header information of the basic packets of the audio data stream. For converting between different sampling frequencies, the resampling ratio is determined from the value of the sampling_frequency field contained in the header information of the elementary stream of the audio data stream, optionally together with the value of the ID field contained in that header information.

Although the present invention has been disclosed above by way of an embodiment, the embodiment is not intended to limit the invention. Anyone of ordinary skill in the art to which the invention pertains may make various changes and refinements without departing from the spirit and scope of the invention; the scope of protection of the invention is therefore defined by the appended claims.
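
The MPEG-based selection of the resampling ratio described in the detailed description can be sketched roughly as follows. The specification names the PTS, sampling_frequency and ID fields but not the arithmetic, so every function here (`nominal_ratio`, `drift_ratio`, `combined_ratio`) is a hypothetical illustration. The sketch assumes that PTS values count a 90 kHz clock, that the header codes map to the standard MPEG audio rates, and that PTS wraparound can be ignored.

```python
from fractions import Fraction

PTS_CLOCK_HZ = 90_000  # PTS values count ticks of a 90 kHz clock

# Sampling rates keyed by (ID bit, 2-bit sampling_frequency code) of the
# MPEG audio frame header. ID=1: MPEG-1, ID=0: MPEG-2 lower sampling rates.
SAMPLING_FREQUENCY_HZ = {
    (1, 0b00): 44_100, (1, 0b01): 48_000, (1, 0b10): 32_000,
    (0, 0b00): 22_050, (0, 0b01): 24_000, (0, 0b10): 16_000,
}


def nominal_ratio(id_bit, sf_code, output_rate_hz):
    """Ratio for conversion between different nominal sampling frequencies."""
    source_rate_hz = SAMPLING_FREQUENCY_HZ[(id_bit, sf_code)]
    return Fraction(output_rate_hz, source_rate_hz)


def drift_ratio(pts_first, pts_last, samples_between, nominal_rate_hz):
    """Ratio that corrects a sampling-frequency error (hypothetical method).

    Assumption: the number of samples delivered between two PTS values,
    divided by the elapsed PTS time, estimates the actual encoder rate.
    PTS wraparound is not handled.
    """
    elapsed_s = Fraction(pts_last - pts_first, PTS_CLOCK_HZ)
    if elapsed_s <= 0:
        raise ValueError("pts_last must be later than pts_first")
    actual_rate_hz = Fraction(samples_between) / elapsed_s
    return Fraction(nominal_rate_hz) / actual_rate_hz


def combined_ratio(drift, nominal):
    """Cascading the two resampling stages multiplies their ratios."""
    return drift * nominal
```

With the figures used in the description, a stream that actually runs at 48.005 kHz gives a drift ratio of 48000/48005, the 48 kHz to 32 kHz conversion gives 2/3, and the cascade is equivalent to a single ratio of 32000/48005. Exact fractions are used so that the ratio precision is limited only by the chosen representation, not by floating-point rounding.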
[Brief Description of the Drawings]
To make the above and other objects, features, advantages and embodiments of the present invention easier to understand, the accompanying drawings are described as follows:

FIG. 1 is a schematic structural diagram of an audio decoder according to an embodiment of the present invention.