TWI293833B - Method and apparatus for MPEG-4 FGS performance enhancement


Info

Publication number
TWI293833B
Authority
TW
Taiwan
Prior art keywords
fine
prediction
bits
quality prediction
layer
Application number
TW93106316A
Other languages
Chinese (zh)
Other versions
TW200531454A (en)
Inventor
Chia Wen Lin
Su Ren Chen
Original Assignee
Ind Tech Res Inst
Application filed by Ind Tech Res Inst
Priority to TW93106316A
Publication of TW200531454A
Application granted
Publication of TWI293833B


Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Description

[Technical Field of the Invention]

The present invention relates to fine granularity scalable (FGS) codecs, and in particular to the architecture, prediction modes, and bit allocation of such a codec.

[Prior Art]

Multimedia applications are now ubiquitous, from playing a CD to browsing web pages over the Internet. A common problem for multimedia applications delivered over the Internet is that uncompressed video is far too large to store and transmit conveniently. International bodies such as the ITU-T and the ISO MPEG committee have therefore defined several coding standards for data compression, and with these standards the storage and transmission of video have become much simpler.

Because Internet technology has advanced substantially in recent years, people now read web pages, play games, and download files online. Streaming video is one important network application: it lets users access pre-encoded video clips from a video server over the network. Its greatest advantage is that video can be received from any location over an Internet connection, including over asymmetric access networks such as ADSL and cable modems. For a streaming-video provider, however, receivers have widely different bandwidths, so the video bitstream must be delivered at a variety of bit rates.

There are several conventional ways to adjust the bit rate of a bitstream. One is to encode each video program into several bitstreams at different rates. This solves the problem of heterogeneous receiver bandwidths, but in a multicast environment, where hundreds or thousands of receivers access the same program simultaneously, the bit rate required at the provider becomes the sum of the rates of all these bitstreams. Another approach is to encode the bitstream at the highest possible connection rate and then transcode it to other rates: the transcoder first decodes the bitstream and then re-encodes it at a bit rate suitable for each receiver. A streaming-video provider can use such a transcoder to serve each receiver a bitstream at the rate it can accept.

MPEG-4 Draft Amendment 4 proposed and standardized a new concept called FGS. FGS-compressed video consists of a base layer and an enhancement layer. The base layer is produced by an MPEG-4 encoder at the lowest bit rate among all possible connections. The enhancement layer is produced by bit-plane coding of the original and reconstructed discrete cosine transform (DCT) coefficients: subtracting the reconstructed coefficients from the original DCT coefficients yields the residues introduced by the quantization process. The FGS codec then encodes these residues bit-plane by bit-plane, outputting the planes from the most significant bit (MSB) to the least significant bit (LSB). The enhancement layer can therefore be truncated at any number of bits. If a receiver still has bandwidth left after receiving the base-layer bitstream, it can continue to receive the enhancement-layer bitstream; the more enhancement bit-planes it receives, the better the reconstructed video quality. Because FGS covers a wide range of rates, from the base-layer bit rate up to the receiver's bandwidth limit, it is well suited to multicast streaming video. As shown in FIG. 1, all receivers (receivers 1, 2, and 3) receive the FGS base layer at the lowest visual quality. Receiver 1 has insufficient bandwidth and cannot receive the FGS enhancement layer, while receivers 2 and 3 receive as many FGS enhancement bit-planes as their bandwidth allows.

Because FGS provides a wide range of bit rates to accommodate bandwidth variation at the receivers, it is more flexible for video streaming than other coding techniques and is used increasingly in streaming applications. Despite this flexibility, an FGS encoder is less efficient at a given bit rate than a non-scalable encoder, for two main reasons. First, the motion-compensated predictive coding of the FGS base layer uses only the coarse-quality prediction; it does not use the residues reconstructed from the enhancement layer (the detail of the picture). Second, the FGS enhancement-layer encoder has no motion-compensated prediction loop: every FGS enhancement-layer frame is intra-layer coded. Because the FGS base layer is coded at the lowest bit rate with the lowest acceptable visual quality, its temporal prediction usually yields a smaller coding gain than that of a non-scalable encoder.

FIG. 2 illustrates the encoding process that produces the FGS base-layer and enhancement-layer bitstreams. The base layer is encoded with a non-scalable MPEG-4 encoder at the base-layer bit rate. The enhancement layer takes the original and the reconstructed, de-quantized coefficients as inputs and produces the enhancement-layer bitstream by bit-plane coding. The procedure is as follows. First, the de-quantized DCT coefficients are subtracted from the original DCT coefficients to obtain the quantization errors. After all DCT quantization errors of a frame have been generated, the enhancement-layer encoder finds their maximum absolute value to determine the maximum number of bit-planes for that frame. The encoder then outputs the enhancement-layer data plane by plane, from the MSB plane down to the LSB plane. The bits of each plane are first converted into symbols and then variable-length coded to produce the output bitstream. The following example, using the absolute quantization errors of one DCT block, illustrates the process:

5, 0, 4, 1, 2, 0, ..., 0, 0

The maximum absolute quantization error of this block is 5, and representing 5 (binary 101) requires 3 bits, so writing the absolute quantization errors in binary yields the following 3 bit-planes:

(MSB)   1, 0, 1, 0, 0, 0, ..., 0, 0
(MSB-1) 0, 0, 0, 0, 1, 0, ..., 0, 0
(LSB)   1, 0, 0, 1, 0, 0, ..., 0, 0

FIG. 3 illustrates the FGS decoding process for enhancement-layer frame reconstruction. The FGS base layer is decoded exactly like a non-scalable MPEG-4 bitstream. Following the structure of the FGS bitstream, the decoder receives and variable-length decodes the bit-planes of the DCT residues from the MSB plane down to the LSB plane. Because the decoder may not receive all blocks of a given bit-plane, it fills the missing blocks of those planes with zeros and then performs the inverse discrete cosine transform (IDCT) to convert the received DCT coefficients into pixel values. These pixel values are then added to the frame already decoded from the base layer to obtain the final, improved video picture.
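To make the bit-plane mechanism above concrete, the following sketch (illustrative only; the function names, block layout, and plain-Python representation are assumptions, not code from the patent or from the MPEG-4 reference software) decomposes the absolute quantization errors of one DCT block into bit-planes from MSB to LSB, and then rebuilds an approximation of the residues from whatever prefix of the planes survives truncation, which corresponds to the FIG. 3 decoder filling missing planes with zeros.

# Bit-plane decomposition of one block of absolute DCT quantization errors,
# followed by reconstruction from a truncated prefix of the planes.
# Illustrative sketch only; names and layout are assumptions.

def bit_planes(abs_residues):
    """Split absolute quantization errors into bit-planes, MSB plane first."""
    num_planes = max(1, max(abs_residues).bit_length())    # 5 (binary 101) needs 3 planes
    return [[(v >> p) & 1 for v in abs_residues]
            for p in range(num_planes - 1, -1, -1)]

def rebuild_residues(received_planes, num_planes, block_size):
    """Reassemble residues from the planes that arrived; missing planes count as zero."""
    residues = [0] * block_size
    for p in range(num_planes):
        plane = received_planes[p] if p < len(received_planes) else [0] * block_size
        weight = 1 << (num_planes - 1 - p)                 # MSB plane carries the largest weight
        for i, bit in enumerate(plane):
            residues[i] += bit * weight
    return residues

block = [5, 0, 4, 1, 2, 0, 0, 0]                           # example block from the text
planes = bit_planes(block)                                 # [[1,0,1,0,0,0,0,0],
                                                           #  [0,0,0,0,1,0,0,0],
                                                           #  [1,0,0,1,0,0,0,0]]
print(rebuild_residues(planes[:1], 3, 8))                  # only the MSB plane received:
                                                           # [4, 0, 4, 0, 0, 0, 0, 0]

Truncating the enhancement layer therefore amounts to keeping only a prefix of this plane-by-plane stream; the decoded residues converge to the true values as more planes arrive.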

Although FGS supports a wide range of bit rates and thus simplifies adaptation to bandwidth changes, it has some drawbacks. Referring to FIG. 2, the signal fed to the enhancement-layer encoder is the quantization error of the prediction residual of the input video with respect to its base-layer reconstruction, and the base layer is coded at the lowest bit rate with the lowest acceptable visual quality. Such a prediction usually cannot approximate the input video closely, so the quantization error is large and the coding efficiency is low. At the same transmission bit rate, single-layer coding outperforms FGS because it predicts from full-quality video; the loss can reach 1.5 to 2.5 dB, as reported in the prior art.

To overcome this problem, several schemes for improving the visual quality of FGS coding have been proposed; they are summarized briefly below.

One method, adaptive motion-compensated FGS (AMC-FGS), provides two simplified scalable codecs, single-loop MC-FGS and dual-loop MC-FGS, which offer different trade-offs between coding efficiency and error resilience. Dual-loop MC-FGS adds an extra motion-compensated prediction (MCP) loop in the enhancement-layer encoder for B-frames only. Because no other frame is predicted from a B-frame, losing B-frame data causes no error propagation, and any drift error in a B-frame does not spread to subsequent frames. Single-loop MC-FGS uses fine-quality prediction for both P-frames and B-frames and therefore achieves higher coding efficiency than dual-loop MC-FGS; however, if the enhancement-layer data used to predict the P-frame base layer is lost because of insufficient bandwidth or channel errors and cannot be received by the decoder, its error robustness drops significantly. AMC-FGS uses an adaptive prediction-mode decision algorithm that switches between the two techniques to obtain a better trade-off between coding efficiency and error resilience.

A newer FGS architecture called progressive FGS (PFGS) allows the enhancement layer to be predicted not only from the base layer but also from previously coded enhancement-layer data. However, when the bandwidth drops and the referenced bit-planes cannot be guaranteed to reach the decoder, the same kind of drift error again degrades the output quality.

Another method is robust FGS (RFGS). It adds a motion-compensated prediction loop in the enhancement layer and introduces leaky prediction to trade off coding efficiency against error resilience. The extra prediction loop improves coding efficiency by referencing a high-quality frame memory, while leaky prediction suppresses the accompanying drift error. The enhancement-layer prediction loop builds its reconstruction with a leak factor a (0 <= a <= 1) chosen according to the expected drift error; a second parameter, the number of referenced bit-planes, is also used for partial prediction. By adjusting these two parameters, RFGS covers a range of coding behaviours: with the leak factor set to 0, RFGS is almost identical to the original FGS, and when the leak factor of every reference frame is set to 1, its prediction mode is the same as that of MC-FGS.

[Summary of the Invention]

The present invention improves the performance of the FGS codec. Its primary object is to provide a new FGS codec architecture with three prediction modes that can be selected adaptively. Another object is to provide a method for adaptively selecting a prediction mode for each macroblock of the input signal. A further object is to provide a method for truncating the enhancement-layer bit-planes of the FGS codec.

According to the invention, both the encoder and the decoder of the FGS codec have a base layer and an enhancement layer. The base layer contains a coarse-quality prediction loop and a base-layer mode selector; the enhancement layer contains a fine-quality prediction loop and an enhancement-layer mode selector. The base-layer mode selector can be controlled to choose either the coarse-quality or the fine-quality prediction as the base-layer reference, and the enhancement-layer mode selector can likewise be controlled to choose either the coarse-quality or the fine-quality prediction as the enhancement-layer reference.

The FGS codec of the invention therefore provides three prediction modes. In the all-fine prediction (AFP) mode, both the base-layer and the enhancement-layer mode selectors choose the fine-quality prediction. In the all-coarse prediction (ACP) mode, both selectors choose the coarse-quality prediction. In the mixed prediction (MP) mode, the base-layer selector chooses the coarse-quality prediction while the enhancement-layer selector chooses the fine-quality prediction.

The encoder of the invention selects the prediction mode adaptively for each macroblock of the input video using a two-pass encoding procedure. The first pass collects the coding parameters of every macroblock, including the prediction errors of the coarse-quality and fine-quality predictions and the best-case and worst-case estimated mismatch errors, i.e. the errors that arise under fine-quality prediction when the decoder cannot receive the enhancement-layer data used for prediction. A coding gain is derived from the coarse- and fine-quality prediction errors, and an estimated mismatch error is derived from the best-case and worst-case mismatch errors. From these, a coding benefit is computed for each macroblock, defined as the coding gain divided by the estimated mismatch error, and the mean and standard deviation of the coding benefits of all macroblocks in the frame are computed.

The macroblocks are then divided into three groups according to their coding benefits, and all macroblocks in a group are coded with the same prediction mode. If a macroblock's coding benefit is smaller than the mean minus a predetermined multiple of the standard deviation, the macroblock is coded in the all-coarse prediction mode; if it is larger than the mean plus the predetermined multiple of the standard deviation, it is coded in the all-fine prediction mode; otherwise it is coded in the mixed prediction mode.

The invention further provides a new rate-adaptation algorithm that truncates the enhancement-layer bit-planes for three ranges of available bandwidth: low, medium, and high bit rates. At low bit rates, the enhancement bit-planes of the I/P-frames are truncated as needed and bits are allocated only to I/P-frames, while the enhancement data of all B-frames is discarded. At medium bit rates, bits are allocated to the B-frames only after the allocation to the I/P-frames guarantees that the I/P-frame bit-planes used for fine-quality prediction can be delivered completely. At high bit rates, the number of allocated bits is governed by the sizes of the bit-planes and varies with the target bit rate; when no additional bits need to be allocated to the I/P-frames, the remaining bits are distributed evenly so that neighbouring frames do not differ too much in quality.

The above and other objects and advantages of the invention are described in detail below with reference to the drawings, the embodiments, and the claims.

[Embodiments]

FIGS. 4 and 5 are block diagrams of the new 3-mode FGS codec of the invention. As shown in FIG. 4, the encoder comprises an enhancement layer and a base layer. The enhancement layer has a DCT unit 401, a bit-plane shift unit 402, a maximum value finder 403, a bit-plane variable-length encoder 404, and a fine-quality prediction loop. The fine-quality prediction loop includes a bit-plane divider 405, an IDCT unit 406, a fine frame memory 407, and a motion compensation unit 408 provided with a switch SW1 that selects the prediction mode in the enhancement layer. The base layer has a DCT unit 411, a quantization unit 412, a variable-length encoder 413, and a coarse-quality prediction loop. The coarse-quality prediction loop includes an inverse quantization unit 414, an IDCT unit 415, a coarse frame memory 416, a motion estimation unit 417, and a motion compensation unit 418 provided with a switch SW2 that selects the prediction mode in the base layer.

The decoder of the invention, shown in FIG. 5, likewise comprises an enhancement layer and a base layer. The enhancement layer has a bit-plane variable-length decoder 501, a first IDCT unit 502, and a fine-quality prediction loop. The fine-quality prediction loop includes a bit-plane divider 503, a second IDCT unit 504, a fine frame memory 505, and a motion compensation unit 506 provided with a switch SW3 that selects the prediction mode in the enhancement layer. The base layer has a variable-length decoder 510, an inverse quantization unit 511, a third IDCT unit 512, and a coarse-quality prediction loop. The coarse-quality prediction loop includes a coarse frame memory 513 and a motion compensation unit 514 provided with a switch SW4 that selects the prediction mode in the base layer.

The principles and operation of FGS codecs are well known and described in the prior art; the new FGS codec architecture of the invention adds the switches SW1, SW2, SW3, and SW4 so that the prediction mode can be selected adaptively to improve coding efficiency and performance. The prediction modes and their operation are described below.

As shown in FIG. 4, the encoder contains two switches, SW1 and SW2, which select the prediction modes of the two motion-compensated prediction loops in the enhancement-layer and base-layer encoders respectively. SW1 selects, for the motion compensation loop of the enhancement-layer encoder, the prediction taken from either the fine frame memory or the coarse frame memory, and SW2 selects the prediction mode of the base layer (SW = 1: fine-quality prediction; SW = 0: coarse-quality prediction). As summarized in Table 1, the invention provides three macroblock-level coding modes in the encoder: all-fine prediction (AFP: SW1 = 1 and SW2 = 1), all-coarse prediction (ACP: SW1 = 0 and SW2 = 0), and mixed prediction (MP: SW1 = 1 and SW2 = 0).

According to the invention, the prediction mode of the encoder is chosen for each macroblock of the input video according to its characteristics, through the mode switches SW1 and SW2. As shown in FIG. 4, SW1 and SW2 are controlled by a mismatch estimation and mode decision unit 419, which computes best-case and worst-case estimates of the mismatch error in order to make the mode decision. Therefore, in addition to the best-case coarse-quality prediction output by the motion compensation unit 418, a worst-case base-layer decoder 420 also outputs the worst-case coarse-quality reconstructed frame values. The method for dynamically selecting the prediction mode is described in detail later.

For each macroblock, one or two variable-length-coded (VLC) bits are sent to the decoder to signal the prediction mode used. The coding modes differ in coding efficiency and error resilience. If the AFP mode is selected, both the base layer and the enhancement layer are predicted from the fine frame memory, which yields the highest coding efficiency; however, this mode also carries a high risk of drift error, because the receiver may be unable to receive all the enhancement bit-planes used for fine-quality prediction, owing to insufficient bandwidth or packet loss. Overall, this mode operates much like single-loop motion-compensated FGS (MC-FGS). Conversely, the ACP mode uses coarse-quality prediction for both the base layer and the enhancement layer; provided that the base-layer bitstream is received completely, this mode is guaranteed to be free of drift error, but its coding efficiency is the lowest of the three. The MP mode is a compromise between coding efficiency and error resilience: it uses fine-quality prediction for the enhancement layer and coarse-quality prediction for the base layer. When part of the enhancement bit-planes used for prediction is lost, drift error still occurs in the enhancement layer, but as long as the decoder receives the complete base-layer data there is no drift error in the base layer.

Besides this new 3-mode codec, a special case is a simplified FGS codec that contains only the MP and ACP coding modes. This codec gives up some of the coding gain offered by the AFP mode but, in return, reduces the drift error. Without the AFP mode, the codec reduces to the encoder and decoder architectures shown in FIG. 6 and FIG. 7 respectively. The two-mode codec is referred to as the "low-drift" codec and the 3-mode codec as the "high-gain" codec. In the two-mode codec the overhead of signalling the coding mode is reduced to one bit per macroblock. Table 1 summarizes the prediction modes used by the codec of the invention.

Table 1. The three prediction modes used by the FGS coding method of the invention

All-coarse prediction (ACP): SW1 = 0 and SW2 = 0. Mode codeword: 1 (low-drift codec), 10 (high-gain codec).
  Both the base layer and the enhancement layer use coarse-quality prediction, as in the original FGS. Strong error resilience, but the lowest coding efficiency.

All-fine prediction (AFP): SW1 = 1 and SW2 = 1. Mode codeword: not applicable (low-drift codec), 11 (high-gain codec).
  Both the base layer and the enhancement layer use fine-quality prediction, as in single-loop MC-FGS. The highest coding efficiency, but susceptible to drift error.

Mixed prediction (MP): SW1 = 1 and SW2 = 0. Mode codeword: 0 (low-drift codec), 0 (high-gain codec).
  The enhancement layer uses fine-quality prediction and the base layer uses coarse-quality prediction, as in PFGS. Drift error in the base layer is bounded, and at high bit rates the coding gain exceeds that of the original FGS.
According to the invention, to avoid performing motion re-estimation and to avoid sending an extra motion vector for every macroblock, the motion vectors obtained by the base-layer encoder are reused for motion compensation in the enhancement-layer encoder. These base-layer motion vectors, however, may not be optimal for coding the enhancement-layer bitstream.

As discussed above, coding with coarse-quality prediction (the ACP mode) is less efficient than coding with fine-quality prediction (the AFP and MP modes), whereas fine-quality prediction may introduce drift error if some of the enhancement bit-planes used for prediction are not received by the decoder. The invention therefore provides an effective way to estimate the best choice of prediction mode when the user's bit rate is unknown at encoding time.

As shown in FIG. 8, the invention uses a two-pass encoding procedure. In the first pass, the coding parameters of all macroblocks are collected: the prediction errors of the coarse-quality and fine-quality predictions, and the estimated mismatch error that arises under fine-quality prediction when the enhancement-layer data used for prediction cannot be received by the decoder. Among these parameters, the difference between the two prediction errors reflects the difference in coding gain between the two predictions, while the mismatch error propagates to subsequent frames. The coding gain of fine-quality prediction is significantly higher than that of coarse-quality prediction, and the difference can be estimated from the difference of the prediction errors:

G_i = ||X_i - P^C_{X_i}|| - ||X_i - P^F_{X_i}||   (1)

where X_i denotes the i-th incoming macroblock, and P^C_{X_i} and P^F_{X_i} denote the coarse-quality and fine-quality predictions of X_i respectively. The two norms in equation (1) are the energies (magnitudes) of the prediction errors of the coarse-quality and fine-quality prediction modes. A large G_i means that fine-quality prediction of this macroblock is considerably more accurate than coarse-quality prediction.

This coding gain, however, comes with the risk of drift error, because fine-quality prediction uses part of the enhancement-layer data, which may not be received completely by the decoder owing to insufficient bandwidth or packet loss. To estimate the amount of drift error effectively, two estimates are used:

D^B_i = ||P^F_{X_i} - P~^B_{X_i}||   (2)
D^W_i = ||P^F_{X_i} - P~^W_{X_i}||   (3)

where D^B_i and D^W_i are the best-case and worst-case estimates of the mismatch error under zero-motion-vector error concealment. P~^B_{X_i} is the reference obtained by assuming that all frames preceding the current one were reconstructed with the complete enhancement-layer data used for prediction (the best case), whereas P~^W_{X_i} is the reference obtained by assuming that those frames were reconstructed from the base-layer data only, with all enhancement-layer data lost (the worst case). In the worst case the mismatch keeps propagating, and the resulting drift error is the most severe.

In streaming applications with pre-stored video, the size and condition of the client's network bandwidth are generally not known in advance, so the mismatch error cannot be estimated exactly. However, since the actual mismatch error is known to lie between these two estimates, i.e. D^B_i <= D_i <= D^W_i, the invention uses a weighted average of the two estimates to predict the actual mismatch error:

PD_i = k_D * D^B_i + (1 - k_D) * D^W_i,  k_D in [0, 1]   (4)

where k_D is chosen according to the bandwidth available at the decoder.

To decide the coding mode of each macroblock so as to achieve good coding performance while retaining sufficient error resilience, a new metric, CODE (coding gain over drifting error), is used:

CODE_i = G_i / PD_i   (5)

where G_i and PD_i are obtained from equations (1) and (4) respectively. The CODE value of equation (5) effectively expresses the relationship between the coding benefit and the drift error of each macroblock: a larger CODE value indicates that predicting this macroblock from the fine-quality reference frame brings a high coding benefit while the expected drift error is relatively small.
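The per-macroblock statistics of the first pass can be sketched as follows. This is an illustration under assumed interfaces (flattened pixel lists for the macroblock and its candidate references, with the sum of absolute differences standing in for the error-energy norm), not code from the patent.

def sad(a, b):
    """Sum of absolute differences, used here as the error-energy measure ||.||."""
    return sum(abs(x - y) for x, y in zip(a, b))

def macroblock_stats(x, pred_coarse, pred_fine, pred_best, pred_worst, k_d=0.5):
    """Return (G_i, PD_i, CODE_i) of equations (1), (4) and (5) for one macroblock.

    pred_best and pred_worst are the concealed fine-quality references assumed for the
    best case (all bit-planes used for prediction received) and the worst case (base
    layer only), both obtained with zero-motion error concealment.
    """
    g = sad(x, pred_coarse) - sad(x, pred_fine)      # coding gain, equation (1)
    d_best = sad(pred_fine, pred_best)               # equation (2)
    d_worst = sad(pred_fine, pred_worst)             # equation (3)
    pd = k_d * d_best + (1.0 - k_d) * d_worst        # predicted mismatch, equation (4)
    code = g / pd if pd > 0 else float("inf")        # CODE metric, equation (5)
    return g, pd, code

In this sketch, k_d would be pushed towards 1 when the decoder is expected to have ample bandwidth and towards 0 otherwise, mirroring the choice of k_D described above.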

After the features of all macroblocks of a frame have been extracted, the mean and the standard deviation of the CODE values are computed as follows:

μ_CODE = (1 / N_MB) * sum_{i=1..N_MB} CODE_i   (6)

σ_CODE = sqrt( (1 / N_MB) * sum_{i=1..N_MB} (CODE_i - μ_CODE)^2 )   (7)

where N_MB is the number of macroblocks in the frame.

The macroblocks are then divided into three groups, each coded with a different prediction mode (ACP, AFP, or MP). A macroblock is classified using the mean and standard deviation of the CODE values as follows:

MODE_i = ACP, if CODE_i < μ_CODE - k*σ_CODE
         AFP, if CODE_i > μ_CODE + k*σ_CODE   (8)
         MP,  otherwise

FIG. 9 plots, for a number of macroblocks, the relationship between the mismatch error and the coding gain as an example distribution. The x-axis and y-axis are the mismatch error of equation (4) and the coding gain of equation (1) respectively. The higher the y-value of a macroblock, the more enhancement bit-planes are used in the fine-quality reconstructed frame memory and the more accurate the fine-quality prediction of that macroblock, hence the larger the coding gain; but when these extra bits are used, the increase in coding gain is accompanied by an increase in drift error. Each point in FIG. 9 represents the (G, D) pair of one macroblock of one category; the upper and lower solid lines represent all (G, D) pairs whose CODE value equals μ_CODE + σ_CODE and μ_CODE - σ_CODE respectively (here k = 1), and the dashed line represents all (G, D) pairs whose CODE value equals μ_CODE. Macroblocks whose (G, D) lies above the upper solid line are coded in the AFP mode, because a higher coding gain can be expected while the drift error introduced if the decoder fails to receive some of the enhancement-layer packets used for prediction is not severe. Conversely, macroblocks whose (G, D) lies below the lower solid line are coded in the ACP mode, because they are more susceptible to drift error. The remaining macroblocks are coded in the MP mode to obtain a better compromise between coding gain and drift error.
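The frame-level grouping of equation (8) needs only the mean and standard deviation of the CODE values. The sketch below continues the illustration above; the threshold multiple k and the list-based interface are assumptions, not the patent's code.

import statistics

def classify_macroblocks(code_values, k=1.0):
    """Assign ACP / MP / AFP to each macroblock according to equation (8)."""
    mu = statistics.fmean(code_values)       # equation (6)
    sigma = statistics.pstdev(code_values)   # equation (7), population standard deviation
    modes = []
    for c in code_values:
        if c < mu - k * sigma:
            modes.append("ACP")              # drift-sensitive macroblocks: all-coarse prediction
        elif c > mu + k * sigma:
            modes.append("AFP")              # high gain, small expected drift: all-fine prediction
        else:
            modes.append("MP")               # the remainder: mixed prediction
    return modes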
Because P-frames serve as references for subsequent B/P-frame coding, the mode-decision method of the invention is applied to P-frames. B-frames are never used as references for other frames, so drift error in a B-frame does not propagate; in the invention, B-frames are therefore always coded with all-fine prediction to achieve the highest coding efficiency.

When serving a video stream, the streaming server truncates each enhancement-layer frame to a size that matches the bandwidth of the client terminal. When fine-quality prediction is used to encode the base layer and the enhancement layer, the bit-allocation scheme used to truncate the FGS enhancement-layer frames has a large effect on performance. For example, if more bits can reasonably be allocated to the I/P-frames rather than to the B-frames, the decoder may receive more bit-planes of the I/P-frames, which leads to lower drift error and higher video quality. In addition, B-frames can reference better-quality pictures during prediction at the encoder and during reconstruction at the decoder if more of the enhancement bit-planes of those reference pictures used for prediction are received.

The invention further proposes a new rate-adaptive enhancement-layer bit-plane truncation algorithm that truncates the enhancement bit-planes at the video server for three ranges of available bandwidth: low, medium, and high bit rates. At low bit rates, the available bandwidth during delivery is not sufficient to send all the enhancement bit-planes of the I/P-frames used for fine-quality prediction in the two layers, so drift error is unavoidable when part of the enhancement-layer data used for prediction is discarded during truncation. If, on the other hand, the available bandwidth is large enough to send all the enhancement bit-planes used for fine-quality prediction but is still smaller than that amount plus the bits of the N_BP MSB bit-planes of all B-frames of a group of pictures (GOP), all the excess bits are allocated to the B-frames to balance the picture quality between the I/P-frames and the B-frames. If the bandwidth situation is better still, excess bits are also allocated to the I/P-frames, while the bits related to prediction are preserved to avoid drift error. This rate-adaptive truncation of the enhancement bit-planes can be performed at a server or at a router. The truncation methods for the different cases are detailed below, and Table 2 lists the parameters used by the server-side bit-plane truncation algorithm of the invention.

Table 2. Parameters used for server-side rate adaptation

N_GOP: number of frames in a GOP
N_I&P: number of I- and P-frames in a GOP
N_B: number of B-frames in a GOP (N_B = N_GOP - N_I&P)

Parameters from encoder pre-encoding
N_BP: number of bit-planes used for fine-quality prediction during encoding
PB_EL: total number of enhancement-layer bits used for fine-quality prediction in a GOP
PB_I&P,EL: number of enhancement-layer bits of all I- and P-frames used for fine-quality prediction in a GOP
PB_B,XL: number of bits of the N_BP MSB bit-planes of all B-frames in a GOP
PB^n_I&P,EL: number of enhancement-layer bits of the n-th I/P-frame used for fine-quality prediction in a GOP
PB^m_B,XL: number of bits of the N_BP MSB bit-planes of the m-th B-frame in a GOP

Parameters for server-side bit-plane truncation
TB_EL: number of bits allocated to the enhancement layer in a GOP
TB^n_I&P,EL: number of bits allocated to the enhancement layer of the n-th I/P-frame in a GOP
TB^m_B,EL: number of bits allocated to the enhancement layer of the m-th B-frame in a GOP

27 (9) 1293833 在此情況中,B-畫面分派的位元數都設定為零,亦即 〇,m=l,2,···,外。方程式(9)是用在當可用的 位元數小於時,而且位元僅分派給μ畫面與ρ·書 面,而所有Β-畫面的加強層資料都在裁切中丟棄。此方法 在低位元速率時可以達到更好的效能。 情況 2 ·中可用頻寬(medium available bandwidth) 如果可用頻寬夠大於送出用於精細畫質預測的晝面 與P-畫面的所有加強層位元,但頻寬仍小於P5bel時,何-服器會先將位元分派給Ι/P-畫面”以保障用於精細晝質預 測的Ι/P-畫面位元層可以完全送給接收端,然後伺服器才 將還多餘的位元分派給B-畫面。 情況 3 :高可用頻宽(high available bandwidth) 如果可用的頻寬大於送出用於精細晝質預測的加強層 位元層數所需的頻寬’則用於分派的位元數是由位元層的 大小來控制,並且隨著特定的位元速率而變。然而,當位 〃元速率快速提升時,如果不再有更多位元分派給晝面, 則兩個相鄰畫面之間會有很大的差異。所以,位元需要平 衡地分派給各個畫面以避免巨大的品質差異。 本發明的加強層位元分派演算法可以用一虛擬程式碼 (pseudo program)摘述如下:27 (9) 1293833 In this case, the number of bits allocated by the B-picture is set to zero, that is, 〇, m=l, 2, ···, and outside. Equation (9) is used when the number of available bits is smaller, and the bits are only assigned to the μ picture and the ρ·book face, and all the enhancement layer data of the frame are discarded in the crop. This method achieves better performance at low bit rates. Case 2 · Medium available bandwidth If the available bandwidth is larger than all the enhancement layer bits sent out for the fine picture prediction and the P-picture, but the bandwidth is still less than P5bel, The device will first assign the bit to the Ι/P-picture to ensure that the Ι/P-picture bit layer for fine 昼 quality prediction can be completely sent to the receiving end, and then the server will assign the extra bits to B-picture. Case 3: high available bandwidth If the available bandwidth is greater than the bandwidth required to send the number of enhancement layer bits for fine quality prediction, then the number of bits used for dispatching It is controlled by the size of the bit layer and varies with the specific bit rate. However, when the bit cell rate is rapidly increased, if there are no more bits assigned to the face, then two adjacent There is a big difference between the pictures. Therefore, the bits need to be balancedly assigned to each picture to avoid huge quality differences. The enhanced layer bit allocation algorithm of the present invention can be summarized by a pseudo program. as follows:

Begin
  if (TB_EL <= PB_I&P,EL)            /* perform low-rate bit truncation */
      TB^n_I&P,EL proportional to PB^n_I&P,EL, with sum_n TB^n_I&P,EL = TB_EL,  n = 1, 2, ..., N_I&P;
      TB^m_B,EL = 0,                                                            m = 1, 2, ..., N_B;
  else if (TB_EL <= PB_EL)           /* perform medium-rate bit truncation */
      TB^n_I&P,EL = PB^n_I&P,EL,                                                n = 1, 2, ..., N_I&P;
      TB^m_B,EL proportional to PB^m_B,XL, with sum_m TB^m_B,EL = TB_EL - PB_I&P,EL,  m = 1, 2, ..., N_B;
  else                               /* perform high-rate bit truncation:        */
                                     /* allocate according to the bit-plane sizes */
                                     /* and spread the excess bits evenly across  */
                                     /* the frames of the GOP (Case 3)            */
  endif
End
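A runnable counterpart of the pseudo program is sketched below. It is an illustration only: the proportional splitting in Case 1 and Case 2 and the even spreading in Case 3 follow the description above, but they are assumptions rather than the patent's exact formulas (equation (9) in particular is paraphrased, not reproduced).

def allocate_enhancement_bits(tb_el, pb_ip, pb_b_msb):
    """Split a GOP enhancement-layer budget tb_el over the I/P- and B-frames.

    tb_el    : bits available for the enhancement layer of the GOP (TB_EL)
    pb_ip    : per I/P-frame bits used for fine-quality prediction (PB^n_I&P,EL)
    pb_b_msb : per B-frame bits of the N_BP MSB planes (PB^m_B,XL)
    Returns two lists of per-frame budgets (TB^n_I&P,EL and TB^m_B,EL).
    """
    pb_ip_total = sum(pb_ip)
    pb_el = pb_ip_total + sum(pb_b_msb)          # bits used for fine-quality prediction in the GOP

    if tb_el <= pb_ip_total:                     # Case 1: low rate, I/P-frames only, pro rata
        ip = [tb_el * b / pb_ip_total for b in pb_ip]
        bb = [0.0] * len(pb_b_msb)
    elif tb_el <= pb_el:                         # Case 2: medium rate, protect I/P prediction planes
        ip = [float(b) for b in pb_ip]
        excess = tb_el - pb_ip_total
        b_total = sum(pb_b_msb) or 1
        bb = [excess * b / b_total for b in pb_b_msb]
    else:                                        # Case 3: high rate, spread the remainder evenly
        share = (tb_el - pb_el) / (len(pb_ip) + len(pb_b_msb))
        ip = [b + share for b in pb_ip]
        bb = [b + share for b in pb_b_msb]
    return ip, bb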

endif ,End 本發明的編碼解碼器的有效性可以由模擬的實驗結果 .呈現出來。此實驗中使用兩組分別為C〇astgUard與Mobile 的測試畫面。這些畫面是以(30,2) GOP架構來編碼,基 礎層是用TM5速率控制方法以及3〇他的晝面率在384 kbits/sec的速率下編碼。晝面大小為aF 352心8。精細畫 質預測(亦即在AFT與MP模式)使用兩個加強層位元層。 第10圖和第Π圖說明本發明的方法與其他三種方法 29 1293833 分別在這兩組測試畫面系列下的效能。此這三種方法是: 基線(baseline)FGS、全精細畫質預測、與單層MPEG]編 解碼器。模擬的結果顯示本發明的方法在廣泛的位元速率 範圍内的確勝過其他三種方法。全細模式與基線FGS方法 分別代表在最高位元速率與最低位元速率下兩個重要而不 同的品質界限。本發明的目的就是在一廣泛的位元速率範 圍,在這兩個方法間找到一個良好的折衷方法。而這個目 的是藉由在基礎層的運動補償預測,導入一預定數目的位 元層來達成,然而在一小段低位元速率的範圍厂 (384-512kbps),可以觀察到由於漂移誤差所致的輕微品 質降低。本發明的方法要比全精細畫質預測具有更大的耐 錯能力。 全精細畫質預測是用在所有的B-畫面,這樣可以顯著 ,改善編碼效率而又不會引起誤差的傳播。運動向量是用高 品質預測求得。層間選擇(inter-layer selection)方法是在 P-畫面上實Μ,以改善基礎層的編碼效率,而有相同運動 資訊(motioninformation)的這兩層,其運動補償的參考書面 可以是不同的。基礎層與加強層是不需要兩組運動向量, 因為會需要更多的運算和額外的位元速率來估計和傳送多 出的一組運動向量。因此基礎層估計出的運動向量是重複 用在加強層的運動補償。當位元速率低時,全精細晝質預 測會遭受大約ldB的損失。利用本發明,低位元速率下由 於漂移誤差所致的品質降低,可以顯著的減輕,而在高位 30 1293833 疋逮率下,達到的編碼增益要比原始的代^高約丨〜丨5dB。 第12圖和第π圖說明本發明的方法與其他三種方法 分別在這兩組測試畫面系列、384kbps的基礎層位元速率、 與二種加強層位元速率·· Okbps、256kbps、與768kbps下, 連續畫面的PSNR效能。本發明的方法,在可用頻寬低時, 可以比全精細畫質預測更有效率地的降低漂移誤差,而在 可用頻寬高時,可以保持接近全精細畫質預測的編碼效 率。本發明的方法能達到比原始的FGS明顯更高的PSNI^ 品質改善。第14圖說明使用本發明與原始的FGS解碼出 的兩幅圖像,以提供客觀的效能比較。 惟’以上所述者,僅為本創作之較佳實施例而已,當 不能以此限定本創作實施之範圍。即大凡依本創作申請專 ,·利範園所作之均等變化與修飾,皆應仍屬本創作專利涵蓋 之範圍内。 31 1293833 【圖式簡單說明】 第1圖說明FGS位元流傳送到不同頻寬的接收端。 第2圖說明產生FGS基礎層與加強層位元流的編碼過程。 第3圖說明FGS基礎層與加強層畫面重建的解碼過程。 第4圖說明根據本發明之層内預測的新的fgs編解碼器的 編碼Is架構。 第5圖說明根據本發明之層内預測的新的fgs編解碼器的 解碼器架構。 第6圖說明根據本發明之層内預測的新的fgs編解碼器的-編碼器架構,其中,基礎層僅具有粗略畫質預測。 > 第7圖說明根據本發明之層内預測的新的fgs編解碼器的 解碼器架構,其中,基礎層僅具有粗略畫質預測。 第8圖說明本發明的兩階段編碼程序。 第9圖說明對多個巨集區塊,其錯誤匹配誤差和編碼增益 之間的關係及繪出的一個範例分佈圖。 第10圖說明本發明的方法與其他三種傳統方法使用 Mobile測試數列下的效能比較。 '第11圖說明本發明的方法與其他三種傳統方法使用 Coastguard測試數列下的效能比較。 第12圖說明本發明的方法與其他三種傳統方法使用 Coastguard測試數列下,以384kbps基礎層位元速率和三 種加強層位元速率,(a) 0kbps (b) 256kbps (c) 768kbps,的 效能比較。 第13圖說明本發明的方法與其他三種傳統方法在M〇bne 32 1293833 測試數列下,以512kbps基礎層位元速率和三種加強層位 元速率,⑻ Okbps (b) 256kbps (c)768kbps,的效能比較。 第14圖說明在512kbps的基礎層與512kbps的加強層,以 ⑻原始的FGS編碼器(27.5dB)和(b)本發明的Hybrid MB-MSFGS方法(32.4dB )的第4幅解碼出的圖像。 圖號說明 401DCT單元 403最大值尋找器 405位元層分割器 407精細畫質畫面記憶體 411DCT單元 413可變長度編碼器 415IDCT 單元 4.17運動估計單元 402位元層位移 404位元層可變長度編碼器 406IDCT 單元 408運動補償單元 412量化單元 414解量化單元 416粗略晝質畫面記憶體 418運動補償單元Endif, End The validity of the codec of the present invention can be presented by the experimental results of the simulation. In this experiment, two sets of test screens for C〇astgUard and Mobile were used. These pictures are encoded in a (30, 2) GOP architecture, and the base layer is encoded at a rate of 384 kbits/sec using the TM5 rate control method and its face rate. The size of the face is aF 352 heart 8. Fine picture quality prediction (ie, in AFT and MP modes) uses two enhancement layer bit layers. Fig. 10 and Fig. 10 illustrate the performance of the method of the present invention and the other three methods 29 1293833 in the two sets of test pictures, respectively. The three methods are: baseline FGS, full-precision quality prediction, and single-layer MPEG] codec. The results of the simulation show that the method of the present invention does outperform the other three methods over a wide range of bit rates. The full-fine mode and the baseline FGS method represent two important different quality boundaries at the highest bit rate and the lowest bit rate, respectively. The object of the present invention is to find a good compromise between the two methods over a wide range of bit rates. 
And this purpose is achieved by introducing a predetermined number of bit layers in the motion compensation prediction of the base layer, but at a low range of low bit rate (384-512 kbps), it can be observed due to drift error. Minor quality is reduced. The method of the present invention is more error tolerant than full fine quality prediction. Full-precision image quality prediction is used in all B-pictures, which can significantly improve coding efficiency without causing error propagation. Motion vectors are derived from high quality predictions. The inter-layer selection method is implemented on the P-picture to improve the coding efficiency of the base layer, and the motion compensation reference writing can be different for the two layers with the same motion information. The base layer and the enhancement layer do not require two sets of motion vectors because more operations and additional bit rates are needed to estimate and transmit an extra set of motion vectors. Therefore, the motion vector estimated by the base layer is repeatedly used for motion compensation in the enhancement layer. When the bit rate is low, the full fine enamel prediction suffers from a loss of approximately 1 dB. With the present invention, the quality degradation due to the drift error at a low bit rate can be significantly reduced, and at a high bit rate of 30 1293833, the achieved coding gain is about 丨~丨5 dB higher than the original generation. Figure 12 and Figure π illustrate the method of the present invention and the other three methods in the two sets of test picture series, the base layer bit rate of 384 kbps, and the two enhancement layer bit rates·· Okbps, 256 kbps, and 768 kbps. , PSNR performance of continuous pictures. The method of the present invention can reduce the drift error more efficiently than the full-fine image quality prediction when the available bandwidth is low, and can maintain the coding efficiency close to the full-fine image quality prediction when the available bandwidth is high. The method of the present invention achieves significantly higher PSNI^ quality improvements than the original FGS. Figure 14 illustrates two images decoded using the present invention and the original FGS to provide an objective performance comparison. However, the above description is only for the preferred embodiment of the present invention, and the scope of the present invention cannot be limited thereto. That is to say, the average change and modification made by Li Fanyuan should be within the scope of this creation patent. 31 1293833 [Simple description of the diagram] Figure 1 shows that the FGS bit stream is transmitted to the receiving end of different bandwidths. Figure 2 illustrates the encoding process for generating the FGS base layer and the enhancement layer bit stream. Figure 3 illustrates the decoding process of the FGS base layer and enhancement layer picture reconstruction. Figure 4 illustrates the coded Is architecture of the new fgs codec for intra-layer prediction in accordance with the present invention. Figure 5 illustrates the decoder architecture of the new fgs codec for intra-layer prediction in accordance with the present invention. Figure 6 illustrates a new fgs codec-encoder architecture for intra-layer prediction in accordance with the present invention, wherein the base layer has only coarse quality predictions. 
Figure 7 illustrates the decoder architecture of the new FGS codec with intra-layer prediction according to the present invention, in which the base layer has only coarse-quality prediction. Figure 8 illustrates the two-stage encoding procedure of the present invention. Figure 9 illustrates the relationship between the mismatch error and the coding gain of a number of macroblocks, plotted as an example distribution. Figure 10 compares the performance of the proposed method with three conventional methods on the Mobile test sequence. Figure 11 compares the performance of the proposed method with three conventional methods on the Coastguard test sequence. Figure 12 compares the performance of the proposed method with three conventional methods on the Coastguard test sequence at a base-layer bit rate of 384 kbps and three enhancement-layer bit rates: (a) 0 kbps, (b) 256 kbps, and (c) 768 kbps. Figure 13 compares the performance of the proposed method with three conventional methods on the Mobile test sequence at a base-layer bit rate of 512 kbps and three enhancement-layer bit rates: (a) 0 kbps, (b) 256 kbps, and (c) 768 kbps. Figure 14 shows the fourth decoded picture, with a 512 kbps base layer and a 512 kbps enhancement layer, produced by (a) the original FGS encoder (27.5 dB) and (b) the Hybrid MB-MSFGS method of the present invention (32.4 dB). Description of reference numerals: 401 DCT unit; 402 bit-plane shifter; 403 maximum finder; 404 bit-plane variable-length encoder; 405 bit-plane splitter; 406 IDCT unit; 407 fine-quality frame memory; 408 motion-compensation unit; 411 DCT unit; 412 quantization unit; 413 variable-length encoder; 414 de-quantization unit; 415 IDCT unit; 416 coarse-quality frame memory; 417 motion-estimation unit; 418 motion-compensation unit

419錯誤匹配估計與模式決定單元 420基礎層解碼器 SW1、SW2切換開關 501位元層可變長度解碼器 502第一IDCT單元 503位元層分割器 504第二IDCT單元 505精細畫質畫面記憶體 506運動補償單元 510可變長度解碼器 511解量化單元 512第三IDCT單元 513粗略畫質畫面記憶體 514運動補償單元 SW3、SW4切換開關
419 mismatch estimation and mode decision unit; 420 base-layer decoder; SW1, SW2 switches; 501 bit-plane variable-length decoder; 502 first IDCT unit; 503 bit-plane splitter; 504 second IDCT unit; 505 fine-quality frame memory; 506 motion-compensation unit; 510 variable-length decoder; 511 de-quantization unit; 512 third IDCT unit; 513 coarse-quality frame memory; 514 motion-compensation unit; SW3, SW4 switches
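To make the roles of the coarse-quality and fine-quality frame memories and the mode switches listed above more concrete, the following is a minimal sketch, not code from the patent: all function and variable names are invented, the mapping to the SW3/SW4 switches is an assumption, and frames are treated as NumPy arrays with integer-pel motion vectors. It shows how a decoder might pick the motion-compensation reference block for one macroblock in the three prediction modes described in the specification.

    import numpy as np

    FULL_COARSE, FULL_FINE, HYBRID = "full_coarse", "full_fine", "hybrid"

    def motion_compensate(frame, mv, top, left, size=16):
        """Fetch the size x size reference block pointed to by an integer-pel motion vector."""
        dy, dx = mv
        return frame[top + dy : top + dy + size, left + dx : left + dx + size]

    def select_reference(mode, layer, coarse_block, fine_block):
        """Pick the reference block for `layer` ('base' or 'enh') under the given mode."""
        if mode == FULL_COARSE:
            return coarse_block          # both layers predict from the coarse-quality memory
        if mode == FULL_FINE:
            return fine_block            # both layers predict from the fine-quality memory
        # Hybrid mode: the base layer keeps the coarse reference (no drift),
        # while the enhancement layer uses the fine reference for higher coding gain.
        return coarse_block if layer == "base" else fine_block

    # Toy usage with random frame memories and a zero motion vector.
    coarse_mem = np.random.randint(0, 256, (288, 352)).astype(np.int16)
    fine_mem = coarse_mem + np.random.randint(-3, 4, coarse_mem.shape)
    mb_top, mb_left, mv = 32, 64, (0, 0)
    coarse_blk = motion_compensate(coarse_mem, mv, mb_top, mb_left)
    fine_blk = motion_compensate(fine_mem, mv, mb_top, mb_left)
    ref = select_reference(HYBRID, "enh", coarse_blk, fine_blk)
    residual = np.zeros((16, 16), dtype=np.int16)
    reconstructed = ref + residual       # decoded macroblock = reference + decoded residual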

Claims (1)

I、申請¥^範圍·· 1. 一種具細緻可伸縮性之編碼器,包含有: 一基礎層編碼區塊,包括一粗略畫質預測迴路與一基礎層 模式選擇器,該粗略畫質預測迴路備有一粗略晝質預測輸 出; J 一加強層編碼區塊,包括一精細畫質預測迴路與一加強層 模式選擇器,該精細畫質預測迴路備有一精細晝質預測輸 出;以及 一錯誤匹配估計與模式決定單元,用來適切地控制該加強 層和該基礎層模式選擇器,並且估計該粗略畫質預測輸出 與該精細晝質預測輸出之間的錯誤匹配誤差; 其中,當該基礎層模式選擇器與該加強層模式選擇器均切 換去選擇該精細晝質預測輸出時,該編碼器係運作於一全 精細晝質預測模式,當該基礎層模式選擇器切換去選擇該 粗略晝質預測輸出,且當該加強層模式麵器切換去選擇 該精細畫質酬輸出時,該編碼器係運作於—混合預測模 式,而當該基礎層模式選擇器與該加強層模式選擇器均切 換去選擇該粗略畫質酬輸出時,該編碼器係運作於一全 粗略畫質預測模式。 2·如申請專利範圍第i項所述之具細緻可伸縮性之編碼器, 該編碼器更包含一最壞情況基礎層解碼器,用來提供-最 壞情況的粗略晝質酬輸出給該錯誤匹配估計與模式決定 XJX3 率兀。 3· —種具細緻可伸縮性之解碼器,包含有: 35 1293833 9W日修正i: -基礎層解碼區塊,包括-粗略晝質酬迴路,該粗略晝 質麵迴路具有-第-切換開關以選擇基礎層裡的預測模 式,該粗略畫質預測迴路並備有一粗略晝質預測輸出; 一加強層解碼區塊,包括一精細晝質預測迴路,該精細晝 質預測迴路具有-第二切換開關以選擇加強層裡的預測模 式,該精細晝質酬迴路並備有一精細畫質預測輸出; 其中’當該第-切換開關與該第二切換開關均切換去選擇 該精細畫質預測輸出時,該解碼器係運作於一全精細晝質 預測模式,而當該第一切換開關切換去選擇該粗略晝質預 測輪出,且該第二切換開關切換選擇該精、細晝質預測輸出 時,該解碼器係運作於一混合預測模式,而當該第一切換 開關與該第二切換開關均切換去選擇該粗略晝質麵輸出 曰守,忒解碼裔係運作於一全粗略晝質預測模式。 4·種具有至少兩種編碼模式的編碼方法,該編碼方法包含·· (a) —粗略畫質預測的基礎層與一精細畫質預測的加強 層,並從輸入訊號的複數個巨集區塊的每一巨集區塊去 收集編碼參數,該編碼參數至少包括一精細畫質預測誤 差值、一粗略畫質預測誤差值、以及在精細畫質預測下 的最佳情況與最壞情況的錯誤匹配誤差; (b) 分析該編碼參數以決定每一巨集區塊的編碼模式;以及 (c) 根據該步驟(b)裡決定的編碼模式,對每一巨集區塊編 碼。 5.如申睛專利範圍第4項所述之具有至少兩種編碼模式的編 碼方法,其中,該複數個巨集區塊在該步驟(b)裡被分類為 36 1293833 準· 1½ i %修修,正本: 至少兩編碼群組,在同一群組的每一巨集區塊被指定一相 同的編碼模式。 6·如申請專利制第4項所述之具有至少兩種編碼模式的編碼 方法,其中,該編碼方法具有一全粗略畫質預測模式、一 全精細畫質預測模式、與一混合預測模式,而該複數個巨 集區塊在该步驟(b)裡被分類為一全粗略畫質預測群組、一 全精細畫質預測群組、與一混合預測群組,該全粗略晝質 預測群組裡的每一巨集區塊均指定採用該全粗略畫質預測 杈式,該全精細晝質預測群組裡的每一巨集區塊均指定採 用該全精細畫質預測模式,該混合預測群組裡的每一巨集 區塊均指定採用該混合預測模式。 7·如申請專利顧第4項所述之具有至少兩種編碼模式的編 碼方法,其中,根據一編碼增益與一預估的錯誤匹配誤差, 邊複數個輯區塊被分類為至少兩編碼群組,該編碼增益 係從每一巨集區塊的該精細畫質預測誤差值和該粗略畫質 預測誤差值所導出,該預估的錯誤匹配誤差係從每一巨集 區塊的忒隶佳情況與該最壞情況的錯誤匹配誤差所導出。 8·如申凊專利範圍帛7項所述之具有至少兩種編碼模式的編 碼方法,其中,ό亥編碼方法具有一全粗略晝質預測模式、 一全精細晝負預測模式、與一混合預測模式,而該複數個 巨集區塊在a亥步驟(b)裡被分類為一全粗略畫質預測群組、 一全精細畫質預測群組、與一混合預測群組,該全粗略畫 質預測群組裡的每-巨集區塊均指定採麟全粗略晝質預 測模式,該全精細畫質預測群組裡的每一巨集區塊均指定 37 1293833 厂一] 牛月 日修 iL不 採用該全精細晝質預測模式,該混合預測群组裡的每一巨 集區塊均指定採用該混合預測模式。 9.如申請專利細第7項所述之具有至少兩種編碼模式的編碼 方法,其中,-巨集區塊的編碼增益除以該巨集區塊綱 的錯誤匹配誤差被定義為該巨集區塊的編碼效率,然後該 巨集區塊根據其編碼效率被指定採用該全粗略畫質預測模 式、該全精細畫質預測模式、與該混合預測模式三者之其 中一種0 ω·如申請專利嶋9項所述之具有至少兩種編碼模式的編 碼方法,其中,從該複數個巨集區塊的編碼效率,計算出 -編碼效率平均值與_編碼效率標準差,然後藉由比:該 給定之巨集區塊的編碼效率,和由該編碼效率平均值斑該 編碼效率標準差來決定的值,該巨集區塊被指定採用該全 粗略畫質預測模式、該全精細晝質預測模式、與該混合預 測模式二者之其中一種。 Π.如申請細_1G項所述之具有至少兩種編碼模式的編 碼方法’其中,該巨集區塊的編碼效率若小於該編碼平均 值與該編碼標準差的一預定倍數兩者的差,則該巨集區塊 «定採職錄略錄動_式,敍無塊的編碼效 率若大於斜触_編碼縣差的—财倍數兩者 的牙巨木區塊被指定採用該全精細晝質預測模式, 否則的話,該巨集區塊被指定採用該混合預測模式。 12.-種截切-組圖像的—加強層裡的位元面以分派位元的方 法,該位福被送至—客戶端頻道,該方法包含下列步驟: 38 1293833I. Application ¥^Scope·· 1. 
A fine-granularity scalable encoder, comprising: a base-layer coding block, including a coarse-quality prediction loop and a base-layer mode selector, the coarse-quality prediction loop providing a coarse-quality prediction output; an enhancement-layer coding block, including a fine-quality prediction loop and an enhancement-layer mode selector, the fine-quality prediction loop providing a fine-quality prediction output; and a mismatch estimation and mode decision unit for controlling the enhancement-layer and base-layer mode selectors as appropriate and for estimating the mismatch error between the coarse-quality prediction output and the fine-quality prediction output; wherein, when both the base-layer mode selector and the enhancement-layer mode selector switch to select the fine-quality prediction output, the encoder operates in a full fine-quality prediction mode; when the base-layer mode selector switches to select the coarse-quality prediction output and the enhancement-layer mode selector switches to select the fine-quality prediction output, the encoder operates in a hybrid prediction mode; and when both the base-layer mode selector and the enhancement-layer mode selector switch to select the coarse-quality prediction output, the encoder operates in a full coarse-quality prediction mode. 2. The fine-granularity scalable encoder as recited in claim 1, further comprising a worst-case base-layer decoder for providing a worst-case coarse-quality prediction output to the mismatch estimation and mode decision unit. 3. A fine-granularity scalable decoder, comprising: a base-layer decoding block, including a coarse-quality prediction loop having a first switch for selecting the prediction mode in the base layer, the coarse-quality prediction loop providing a coarse-quality prediction output; and an enhancement-layer decoding block, including a fine-quality prediction loop having a second switch for selecting the prediction mode in the enhancement layer, the fine-quality prediction loop providing a fine-quality prediction output; wherein, when both the first switch and the second switch switch to select the fine-quality prediction output, the decoder operates in a full fine-quality prediction mode; when the first switch switches to select the coarse-quality prediction output and the second switch switches to select the fine-quality prediction output, the decoder operates in a hybrid prediction mode; and when both the first switch and the second switch switch to select the coarse-quality prediction output, the decoder operates in a full coarse-quality prediction mode. 4. A coding method with at least two coding modes, the method comprising: (a) with a base layer using coarse-quality prediction and an enhancement layer using fine-quality prediction, collecting coding parameters from each macroblock of a plurality of macroblocks of an input signal, the coding parameters including at least a fine-quality prediction error value, a coarse-quality prediction error value, and the best-case and worst-case
mismatch errors under fine-quality prediction; (b) analyzing the coding parameters to determine the coding mode of each macroblock; and (c) coding each macroblock according to the coding mode determined in step (b). 5. The coding method with at least two coding modes as recited in claim 4, wherein the plurality of macroblocks are classified in step (b) into at least two coding groups, and every macroblock in the same group is assigned the same coding mode. 6. The coding method with at least two coding modes as recited in claim 4, wherein the coding method has a full coarse-quality prediction mode, a full fine-quality prediction mode, and a hybrid prediction mode, and the plurality of macroblocks are classified in step (b) into a full coarse-quality prediction group, a full fine-quality prediction group, and a hybrid prediction group; every macroblock in the full coarse-quality prediction group is assigned the full coarse-quality prediction mode, every macroblock in the full fine-quality prediction group is assigned the full fine-quality prediction mode, and every macroblock in the hybrid prediction group is assigned the hybrid prediction mode. 7. The coding method with at least two coding modes as recited in claim 4, wherein the plurality of macroblocks are classified into at least two coding groups according to a coding gain and an estimated mismatch error, the coding gain being derived from the fine-quality prediction error value and the coarse-quality prediction error value of each macroblock, and the estimated mismatch error being derived from the best-case and worst-case mismatch errors of each macroblock. 8. The coding method with at least two coding modes as recited in claim 7, wherein the coding method has a full coarse-quality prediction mode, a full fine-quality prediction mode, and a hybrid prediction mode, and the plurality of macroblocks are classified in step (b) into a full coarse-quality prediction group, a full fine-quality prediction group, and a hybrid prediction group; every macroblock in the full coarse-quality prediction group is assigned the full coarse-quality prediction mode, every macroblock in the full fine-quality prediction group is assigned the full fine-quality prediction mode, and every macroblock in the hybrid prediction group is assigned the hybrid prediction mode. 9. The coding method with at least two coding modes as recited in claim 7, wherein the coding gain of a macroblock divided by the estimated mismatch error of that macroblock is defined as the coding efficiency of the macroblock, and the macroblock is then assigned, according to its coding efficiency, one of the full coarse-quality prediction mode, the full fine-quality prediction mode, and the hybrid prediction mode.
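Claims 7 to 9 above define a per-macroblock coding efficiency (coding gain divided by the estimated mismatch error), and the claims that follow refine it into a mean/standard-deviation threshold test. The snippet below is a minimal illustrative sketch of that classification, not code from the patent: the way the gain and mismatch statistics are obtained, the multiple k, and all names are assumptions.

    import numpy as np

    def classify_macroblocks(gain, mismatch, k=1.0, eps=1e-9):
        """Assign each macroblock to full-coarse, hybrid, or full-fine prediction."""
        gain = np.asarray(gain, dtype=float)
        mismatch = np.asarray(mismatch, dtype=float)
        efficiency = gain / (mismatch + eps)        # coding gain divided by mismatch error
        mean, std = efficiency.mean(), efficiency.std()
        modes = np.full(efficiency.shape, "hybrid", dtype=object)
        modes[efficiency < mean - k * std] = "full_coarse"   # low efficiency: keep the coarse reference
        modes[efficiency > mean + k * std] = "full_fine"     # high efficiency: use the fine reference
        return modes

    # Example: five macroblocks with made-up statistics
    print(classify_macroblocks(gain=[2.0, 8.0, 3.0, 1.0, 9.0],
                               mismatch=[1.0, 0.5, 1.2, 2.0, 0.4]))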
10. The coding method with at least two coding modes as recited in claim 9, wherein a mean and a standard deviation of the coding efficiencies of the plurality of macroblocks are computed, and a given macroblock is assigned one of the full coarse-quality prediction mode, the full fine-quality prediction mode, and the hybrid prediction mode by comparing its coding efficiency with values determined from the mean and the standard deviation of the coding efficiencies. 11. The coding method with at least two coding modes as recited in claim 10, wherein, if the coding efficiency of the macroblock is smaller than the difference between the mean and a predetermined multiple of the standard deviation, the macroblock is assigned the full coarse-quality prediction mode; if the coding efficiency of the macroblock is larger than the sum of the mean and the predetermined multiple of the standard deviation, the macroblock is assigned the full fine-quality prediction mode; otherwise, the macroblock is assigned the hybrid prediction mode. 12. A method of truncating the bit-planes in an enhancement layer of a group of pictures to allocate bits, the bit stream being delivered to a client channel, the method comprising the following steps: (a) if the total number of bits available for allocation in the enhancement layer is smaller than or equal to the total number of enhancement-layer bits used by all I/P-pictures of the group of pictures for fine-quality prediction, performing low-rate bit truncation; (b) if the total number of bits available for allocation in the enhancement layer is smaller than or equal to the total number of enhancement-layer bits used by the group of pictures for fine-quality prediction, but larger than the total number of enhancement-layer bits used by all I/P-pictures of the group of pictures for fine-quality prediction, performing mid-rate bit truncation; and (c) if the total number of bits available for allocation in the enhancement layer is larger than the total number of enhancement-layer bits used by the group of pictures for fine-quality prediction, performing high-rate bit truncation. 13.
The method of truncating bit-planes in an enhancement layer of a group of pictures as recited in claim 12, wherein the low-rate bit truncation allocates to each I/P-picture of the enhancement layer a number of bits proportional to the ratio of the number of bits used by that I/P-picture for prediction to the total number of bits used by all I/P-pictures of the group of pictures for fine-quality prediction, and the low-rate bit truncation allocates no bits to any B-picture of the enhancement layer. 14. The method of truncating bit-planes in an enhancement layer of a group of pictures as recited in claim 13, wherein the mid-rate bit truncation allocates to each I/P-picture of the enhancement layer the number of bits used by that I/P-picture for fine-quality prediction, and allocates to each B-picture of the enhancement layer a number of bits proportional to the ratio of the number of most-significant enhancement-layer bits used by that B-picture for fine-quality prediction to the total number of most-significant enhancement-layer bits used by all B-pictures of the group of pictures for fine-quality prediction. 15. The method of truncating bit-planes in an enhancement layer of a group of pictures as recited in claim 14, wherein the high-rate bit truncation allocates to each I/P-picture of the enhancement layer a first number of bits plus a second number of bits, together with the total number of most-significant enhancement-layer bits used by all B-pictures of the group of pictures for fine-quality prediction, the first number of bits being the number of bits used by that I/P-picture for fine-quality prediction and the second number of bits being proportional to the ratio of the number of bits used by that I/P-picture for fine-quality prediction to the total number of bits used by all I/P-pictures of the group of pictures for fine-quality prediction; and the high-rate bit truncation allocates to each B-picture of the enhancement layer, out of the total number of most-significant enhancement-layer bits used by all B-pictures of the group of pictures for fine-quality prediction, a number of bits proportional to the ratio of the number of most-significant enhancement-layer bits used by that B-picture for fine-quality prediction to the total number of most-significant enhancement-layer bits used by all B-pictures of the group of pictures for fine-quality prediction. 16. The method of truncating bit-planes in an enhancement layer of a group of pictures as recited in claim 15, wherein the mid-rate bit truncation allocates to each I/P-picture of the enhancement layer the number of bits used by that I/P-picture for fine-quality prediction, and allocates to each B-picture of the enhancement layer a number of bits proportional to the ratio of the number of most-significant enhancement-layer bits used by that picture for fine-quality prediction to the total number of most-significant enhancement-layer bits used by all pictures of the group of pictures for fine-quality prediction. 17.
The method of truncating bit-planes in an enhancement layer of a group of pictures as recited in claim 12, wherein the high-rate bit truncation allocates to each I/P-picture of the enhancement layer a first number of bits plus a second number of bits, together with the total number of most-significant enhancement-layer bits used by all B-pictures of the group of pictures for fine-quality prediction, the first number of bits being the number of bits used by that I/P-picture for fine-quality prediction and the second number of bits being proportional to the ratio of the number of bits used by that I/P-picture for fine-quality prediction to the total number of bits used by all I/P-pictures of the group of pictures for fine-quality prediction; and the high-rate bit truncation allocates to each B-picture of the enhancement layer, out of the total number of most-significant enhancement-layer bits used by all B-pictures of the group of pictures for fine-quality prediction, a number of bits proportional to the ratio of the number of most-significant enhancement-layer bits used by that B-picture for fine-quality prediction to the total number of most-significant enhancement-layer bits used by all B-pictures of the group of pictures for fine-quality prediction.
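Claims 12 through 17 describe a three-regime allocation of enhancement-layer bits across a group of pictures. The following is a rough, hedged sketch of the low-, mid-, and high-rate cases, not the patent's exact rule: it assumes the per-picture bit counts are known in advance, simplifies the mid/high boundary to the sum of the I/P fine-prediction bits and the B-picture most-significant bit-plane bits, and invents how any surplus beyond those amounts is distributed.

    def allocate_enhancement_bits(budget, ip_bits, b_msb_bits):
        """Split an enhancement-layer bit budget across I/P-pictures and B-pictures."""
        ip_total = sum(ip_bits)            # fine-prediction bits needed by all I/P-pictures
        b_msb_total = sum(b_msb_bits)      # most-significant bit-plane bits needed by all B-pictures
        if budget <= ip_total:
            # Low-rate truncation: share the budget among I/P-pictures in proportion
            # to their fine-prediction bits; B-pictures receive nothing.
            ip_alloc = [budget * b / ip_total for b in ip_bits]
            b_alloc = [0] * len(b_msb_bits)
        elif budget <= ip_total + b_msb_total:
            # Mid-rate truncation: I/P-pictures get their full fine-prediction bits;
            # the remainder is shared among B-pictures in proportion to their
            # most-significant bit-plane bits.
            ip_alloc = list(ip_bits)
            rest = budget - ip_total
            b_alloc = [rest * b / b_msb_total for b in b_msb_bits]
        else:
            # High-rate truncation: both picture types are fully covered, and the
            # surplus is shared among I/P-pictures in proportion to their
            # fine-prediction bits (an assumed simplification).
            surplus = budget - ip_total - b_msb_total
            ip_alloc = [b + surplus * b / ip_total for b in ip_bits]
            b_alloc = list(b_msb_bits)
        return ip_alloc, b_alloc

    # Example with an invented six-picture GOP (four I/P-pictures, two B-pictures)
    print(allocate_enhancement_bits(9000, ip_bits=[3000, 2000, 2000, 1000], b_msb_bits=[800, 700]))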
TW93106316A 2004-03-10 2004-03-10 Method and apparatus for mpeg-4 fgs performance enhancement TWI293833B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW93106316A TWI293833B (en) 2004-03-10 2004-03-10 Method and apparatus for mpeg-4 fgs performance enhancement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW93106316A TWI293833B (en) 2004-03-10 2004-03-10 Method and apparatus for mpeg-4 fgs performance enhancement

Publications (2)

Publication Number Publication Date
TW200531454A TW200531454A (en) 2005-09-16
TWI293833B true TWI293833B (en) 2008-02-21

Family

ID=45067980

Family Applications (1)

Application Number Title Priority Date Filing Date
TW93106316A TWI293833B (en) 2004-03-10 2004-03-10 Method and apparatus for mpeg-4 fgs performance enhancement

Country Status (1)

Country Link
TW (1) TWI293833B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI475843B (en) * 2009-03-23 2015-03-01 奧勒圖股份有限公司 System and method for multi-stream video compression

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9210480B2 (en) * 2007-12-20 2015-12-08 Broadcom Corporation Video processing system with layered video coding and methods for use therewith
US8259801B2 (en) * 2008-10-12 2012-09-04 Mediatek Inc. Methods for coding digital media data with prediction information and prediction error information being respectively carried by different bit stream sections

Also Published As

Publication number Publication date
TW200531454A (en) 2005-09-16

Similar Documents

Publication Publication Date Title
US7227894B2 (en) Method and apparatus for MPEG-4 FGS performance enhancement
JP5384694B2 (en) Rate control for multi-layer video design
Chen et al. Recent advances in rate control for video coding
DK2382784T3 (en) Video coding with multi-bit-rate using variable-bit-rate and dynamic resolution for adaptive video streaming
JP5611601B2 (en) Efficient video block mode change in second pass video coding
US7848433B1 (en) System and method for processing data with drift control
US8218619B2 (en) Transcoding apparatus and method between two codecs each including a deblocking filter
US8442122B2 (en) Complexity scalable video transcoder and encoder
TW200841744A (en) Coding mode selection using information of other coding modes
WO2001047283A1 (en) Video compression for multicast environments using spatial scalability and simulcast coding
JP2004509574A (en) Preferred Transmission / Streaming Order for Fine Granular Scalability
EP2092747A1 (en) Method and apparatus for encoding and/or decoding bit depth scalable video data using adaptive enhancement layer prediction
KR20060122671A (en) Method for scalably encoding and decoding video signal
JP2009544176A (en) System and method for transcoding between a scalable video codec and a non-scalable video codec
US8170094B2 (en) Method and system for scalable bitstream extraction
Wu et al. Macroblock‐based progressive fine granularity scalable video coding
Van et al. HEVC backward compatible scalability: A low encoding complexity distributed video coding based approach
CN101146229B (en) A FGS priority scheduling method for SVC video
TWI293833B (en) Method and apparatus for mpeg-4 fgs performance enhancement
Wang et al. Fine-granularity spatially scalable video coding
Peng et al. Inter-layer correlation-based adaptive bit allocation for enhancement layer in scalable high efficiency video coding
US20030118099A1 (en) Fine-grain scalable video encoder with conditional replacement
van der Schaar et al. Motion-compensation fine-granular-scalability (MC-FGS) for wireless multimedia
Chen et al. MPEG-4 FGS coding performance improvement using adaptive inter-layer prediction
MX2008012360A (en) Method of assigning priority for controlling bit rate of bitstream, method of controlling bit rate of bitstream, video decoding method, and apparatus using the same.

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees