TW201119396A - Multi-standard video decoding system and method - Google Patents

Multi-standard video decoding system and method

Info

Publication number
TW201119396A
Authority
TW
Taiwan
Prior art keywords
hardware accelerator
video decoding
hardware
standard video
processors
Prior art date
Application number
TW98139845A
Other languages
Chinese (zh)
Inventor
Yi-Shin Li
Yi-Shin Tung
Tse-Tsung Shih
Sheng-Wei Lin
Original Assignee
Hon Hai Prec Ind Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hon Hai Prec Ind Co Ltd filed Critical Hon Hai Prec Ind Co Ltd
Priority to TW98139845A priority Critical patent/TW201119396A/en
Publication of TW201119396A publication Critical patent/TW201119396A/en


Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A multi-standard video decoding system comprises a memory, a multi-master bridge interface, a peer-to-peer bus, a plurality of processors and a plurality of hardware accelerators. The memory is employed to store the bit stream and temporary data produced during the decoding flow. The multi-master bridge interface is connected to the memory. At least one of the plurality of processors receives the bit stream from the memory via the multi-master bridge interface. Each of the plurality of hardware accelerators receives instructions from a processor of the plurality of processors, executes the related video decoding flow, and accesses the memory via the multi-master bridge interface. The peer-to-peer bus is employed to connect the plurality of processors and the plurality of hardware accelerators.

Description

[Technical Field]

[0001] The present invention relates to a video decoding system, and more particularly to a multi-standard video decoding system.

[Prior Art]

[0002] So that the large volume of raw data in audio-visual and video applications can be transmitted quickly over limited network bandwidth, it has become necessary to compress the raw data by encoding it before transmission and to decode it back into the original data at the receiving end. As multimedia digital applications become more widespread, consumers' demands on image resolution keep growing, and the computational load of video encoding and decoding grows with them.

[0003] For video decoding, the requirement of real-time decoding of high-definition video is generally met with hardware, whose design copes with the enormous amount of computation that high-definition video produces. A hardware design, however, usually targets a single video decoding standard; supporting other standards requires removing or adding circuits, so the design lacks flexibility and extensibility. A software decoder, on the other hand, can support multiple video decoding standards, but the achievable performance of a general-purpose processor is limited by heavy data-bus traffic, making it difficult to meet the requirement of real-time high-definition decoding.

[Summary of the Invention]

[0004] In view of this, a multi-standard video decoding system that integrates software and hardware is needed.
[0005] The present invention provides a multi-standard video decoding system comprising a memory, a multi-master bridge interface, a plurality of processors, and a plurality of hardware accelerators. The memory stores the bit stream and the temporary data produced during decoding. The multi-master bridge interface is connected to the memory. At least one of the processors receives the bit stream from the memory through the multi-master bridge interface. Each hardware accelerator receives instructions from one of the processors to execute the related part of the video decoding flow, and accesses the memory through the multi-master bridge interface.

[0006] The present invention also provides a multi-standard video decoding system comprising a memory, a multi-master bridge interface, point-to-point buses, a plurality of processors, and a plurality of hardware accelerators. At least one of the processors receives the bit stream from the memory through the multi-master bridge interface. Each hardware accelerator receives instructions from one of the processors to execute the related part of the video decoding flow, accesses the memory through the multi-master bridge interface, and is connected to each of the processors by a point-to-point bus.

[0007] The present invention further provides a multi-standard video decoding method executed in a system. A first processor of the system receives a bit stream; a first hardware accelerator of the system performs variable-length decoding; a second hardware accelerator of the system performs the inverse discrete cosine transform; a third hardware accelerator of the system performs motion compensation; and finally a fourth hardware accelerator of the system performs de-blocking filtering.

[Embodiments]

[0008] The present invention uses a plurality of processors to share the computational load of the video decoding process and implements the decoding stages common to different video standards as hardware modules. Referring to FIG. 1, hardware module 100 contains hardware accelerators for entropy decoding 110, inverse transform 111, motion compensation 112, and de-blocking filtering 113. Entropy decoding 110 mainly performs the inverse of entropy encoding, extracting information such as variable-length codes (VLC) and motion vectors for the subsequent reconstruction of the image. Inverse transform 111 restores frequency-domain image data to spatial-domain image data. Motion compensation 112 predicts and compensates the image data being decoded using previously reconstructed local image data. Finally, de-blocking filtering 113 performs post-processing on the reconstructed image.
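For orientation, the per-macroblock flow through these four accelerators can be sketched as below. The sketch is illustrative only: the C type and function names are invented for this note, and each call stands for a command issued to the corresponding hardware block rather than for software that the specification defines.

    #include <stdint.h>

    typedef struct {
        int16_t coeff[16 * 16];     /* quantized coefficients from entropy decoding        */
        int16_t residual[16 * 16];  /* spatial-domain residual after the inverse transform */
        uint8_t pixels[16 * 16];    /* reconstructed samples                               */
    } Macroblock;

    /* Placeholders: in hardware module 100 each of these is an accelerator, not a routine. */
    static void entropy_decode(const uint8_t *bs, Macroblock *mb)    { (void)bs; (void)mb; } /* 110 */
    static void inverse_transform(Macroblock *mb)                    { (void)mb; }           /* 111 */
    static void motion_compensate(Macroblock *mb, const uint8_t *rf) { (void)mb; (void)rf; } /* 112 */
    static void deblock_filter(Macroblock *mb)                       { (void)mb; }           /* 113 */

    /* One macroblock passes through the four stages in this order. */
    void decode_one_macroblock(const uint8_t *bitstream, const uint8_t *ref, Macroblock *mb)
    {
        entropy_decode(bitstream, mb);  /* VLC, motion vectors, coefficients       */
        inverse_transform(mb);          /* frequency domain back to spatial domain */
        motion_compensate(mb, ref);     /* predict from reconstructed data         */
        deblock_filter(mb);             /* post-process the reconstructed image    */
    }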

[0009] FIG. 2 is a block diagram of video decoding system 200. Video decoding system 200 contains processors 210-212; the hardware accelerators of module 100 in FIG. 1, namely entropy decoding 220, inverse transform 221, motion compensation 222, and de-blocking filter 223; a memory controller 231; a memory 232; a video output unit 240; a bridge interface 251; and point-to-point buses 2521-2528. Processors 210-212 can be any type of general-purpose processor, such as a digital signal processor (DSP) or a reduced instruction set computing (RISC) processor. Video decoding system 200 uses point-to-point buses 2521-2528 to interconnect processors 210-212 and the hardware accelerators, so that command streams and data streams can be passed between processors, and between a processor and its associated hardware accelerators, without bus arbitration. Bridge interface 251 is a multi-master bridge interface; through it, processors 210-212 and the hardware accelerators can all access memory 232 directly via memory controller 231. Memory 232 can be random access memory (RAM), for example static RAM (SRAM) or dynamic RAM (DRAM), and stores the coefficient data, image data, and other data produced during decoding. Within memory 232, the hardware accelerators may each have their own buffer for data access during operation, and may also share a common set of buffers for data exchange. The image data decoded by video decoding system 200 is finally formatted and output by video output unit 240.
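A plausible reading of this arrangement is that each accelerator exposes a small control block on its point-to-point bus while bulk data moves through the multi-master bridge. The register layout, bit assignments, and names below are assumptions made for illustration; the specification does not define a programming model.

    #include <stdint.h>

    /* Hypothetical control block of one accelerator, reached over its
     * point-to-point bus, so no bus arbitration is involved. */
    typedef volatile struct {
        uint32_t command;   /* operation requested by the owning processor           */
        uint32_t src_addr;  /* input location in memory 232, fetched via bridge 251  */
        uint32_t dst_addr;  /* output location in memory 232, written via bridge 251 */
        uint32_t status;    /* bit 0: busy, bit 1: done (assumed encoding)           */
    } AccelRegs;

    #define ACCEL_CMD_START   1u
    #define ACCEL_STATUS_DONE (1u << 1)

    /* Issue one job: the processor only passes addresses and a command; the
     * accelerator itself moves the data through the multi-master bridge. */
    void accel_run(AccelRegs *regs, uint32_t src, uint32_t dst)
    {
        regs->src_addr = src;
        regs->dst_addr = dst;
        regs->command  = ACCEL_CMD_START;
        while ((regs->status & ACCEL_STATUS_DONE) == 0)
            ;   /* a real flow could instead let another processor poll the status */
    }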

[0010] Video decoding system 200 can be implemented in devices or transmission systems for many video application fields, including wireless multimedia devices, standard-definition and high-definition television broadcasting, Internet video applications, digital video discs carrying high-definition video, digital set-top boxes, and handheld electronic devices. It should be understood that the decoding stages each of processors 210-212 is responsible for, and the operation of the hardware accelerators, differ with the video coding standard. To balance system performance against hardware cost, the decoding work assigned to the processors and to the hardware accelerators can also be repartitioned as appropriate to reach a performance balance between software and hardware. For example, a program running on one of the processors may perform the inverse quantization step while inverse transform 221 restores the dequantized image data from the frequency domain to the spatial domain; alternatively, inverse transform 221 may perform both inverse quantization and the inverse transform.
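The repartitioning example above could be expressed as a run-time switch such as the following sketch, in which it_accel_start is a hypothetical command to the inverse-transform accelerator and the rescale shown is deliberately simplified (real inverse quantization depends on the coding standard).

    #include <stdint.h>

    typedef enum { IQ_IN_SOFTWARE, IQ_IN_ACCELERATOR } IqPartition;

    /* Hypothetical command to the inverse-transform accelerator; the second
     * argument asks the hardware to dequantize before transforming. */
    void it_accel_start(const int16_t coeff[64], int dequantize_in_hw, int qp);

    void submit_block(int16_t coeff[64], int qp, IqPartition where)
    {
        if (where == IQ_IN_SOFTWARE) {
            for (int i = 0; i < 64; i++)
                coeff[i] = (int16_t)(coeff[i] * qp);   /* deliberately simplified rescale */
            it_accel_start(coeff, 0, qp);
        } else {
            it_accel_start(coeff, 1, qp);              /* accelerator does both IQ and IDCT */
        }
    }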
[0011] FIG. 3 illustrates a method 300 of general video decoding using the video decoding system of FIG. 2, in which processors 301-303 are implementations of processors 210-212; entropy decoding 304, inverse transform 305, motion compensation 306, and de-blocking filter 307 are implementations of the hardware accelerators entropy decoding 220, inverse transform 221, motion compensation 222, and de-blocking filter 223; and memory 330 is an implementation of memory 232.

[0012] Processor 301 receives the encoded bit stream from memory 330 and, through syntax parsing program 308, determines from the syntax structure which video coding standard the bit stream uses. Having identified the standard, it passes the data to entropy-decoding hardware accelerator 304 and commands it to perform entropy decoding, while controlling the transfer rate of the command stream and data stream by controlling clock cycles. The input to entropy-decoding accelerator 304 may be either a bit stream or a macroblock; a macroblock input means that the block header data in the bit stream has already been decoded by the syntax parsing program on processor 301. In another embodiment, processor 301 may, based on the identified coding standard, decide whether to dynamically start internal programs; for example, processor 301 may dynamically start a motion vector reconstruction program to reduce the load on the other processors in the subsequent decoding.
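The standard-identification step can be pictured as a dispatch like the one below. How the syntax parsing program actually recognizes the standard is not spelled out in the specification, so the enum values, the policy function, and the helper functions are placeholders.

    typedef enum { STD_MPEG2, STD_H264, STD_VC1, STD_UNKNOWN } VideoStandard;

    /* Placeholders: classify the stream from its syntax and configure the system. */
    VideoStandard parse_syntax(const unsigned char *bitstream, unsigned long length);
    void configure_entropy_accelerator(VideoStandard std);
    int  heavy_motion_expected(VideoStandard std);   /* placeholder policy            */
    void start_local_mv_reconstruction(void);        /* optional program on this CPU  */

    void start_decoding(const unsigned char *bs, unsigned long len)
    {
        VideoStandard std = parse_syntax(bs, len);
        if (std == STD_UNKNOWN)
            return;                            /* unsupported stream                  */
        configure_entropy_accelerator(std);    /* hand the stream to accelerator 304  */
        if (heavy_motion_expected(std))
            start_local_mv_reconstruction();   /* offload work from the next processor */
    }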
[0013] The output of entropy-decoding hardware accelerator 304 includes motion vectors, block quantization parameters, quantized discrete cosine transform (DCT) coefficient matrices, and so on, which accelerator 304 passes to processor 302 for further decoding. On receiving the motion vectors, block quantization parameters, and quantized DCT coefficient matrices, processor 302 feeds the block quantization parameters and quantized DCT coefficient matrices to inverse quantization program 309 and the motion vectors to motion vector reconstruction program 310. Inverse quantization program 309 uses the received block quantization parameter to dequantize the quantized DCT coefficient matrix, outputs the dequantized DCT coefficient matrix to inverse-transform hardware accelerator 305, and commands it to perform the inverse transform. Motion vector reconstruction program 310 handles the motion vector prediction and reconstruction part of decoding, outputs the predicted macroblock to motion-compensation hardware accelerator 306, and commands it to perform the motion-compensation part of decoding. Inverse-transform accelerator 305 can be implemented with a butterfly algorithm that supports the inverse discrete cosine transforms of multiple video coding standards; for example, it can support the 8x8 inverse DCT of MPEG-2, the 4x4 integer inverse transform of H.264, and the 8x8, 8x4, 4x8, and 4x4 integer inverse transforms of WMV9/VC-1. The inverse DCT yields a coefficient matrix of residual values; after completing the inverse transform, accelerator 305 stores the residual data in a buffer 331 in memory 330 that is shared by inverse-transform accelerator 305, motion-compensation accelerator 306, and de-blocking filter accelerator 307, all of which can access buffer 331.
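As background on the butterfly remark, the core of the H.264 4x4 integer inverse transform is widely published and is reproduced below in plain C; dequantization and the per-coefficient scaling are omitted, and this is not claimed to be the accelerator's actual implementation.

    #include <stdint.h>

    /* H.264 4x4 inverse integer transform core: row pass, column pass, rounding. */
    void h264_itrans4x4(const int16_t in[16], int16_t out[16])
    {
        int tmp[16];
        for (int i = 0; i < 4; i++) {               /* row (horizontal) pass  */
            const int16_t *s = in + 4 * i;
            int e0 = s[0] + s[2];
            int e1 = s[0] - s[2];
            int e2 = (s[1] >> 1) - s[3];
            int e3 = s[1] + (s[3] >> 1);
            tmp[4 * i + 0] = e0 + e3;
            tmp[4 * i + 1] = e1 + e2;
            tmp[4 * i + 2] = e1 - e2;
            tmp[4 * i + 3] = e0 - e3;
        }
        for (int i = 0; i < 4; i++) {               /* column (vertical) pass */
            int e0 = tmp[i] + tmp[i + 8];
            int e1 = tmp[i] - tmp[i + 8];
            int e2 = (tmp[i + 4] >> 1) - tmp[i + 12];
            int e3 = tmp[i + 4] + (tmp[i + 12] >> 1);
            out[i]      = (int16_t)((e0 + e3 + 32) >> 6);
            out[i + 4]  = (int16_t)((e1 + e2 + 32) >> 6);
            out[i + 8]  = (int16_t)((e1 - e2 + 32) >> 6);
            out[i + 12] = (int16_t)((e0 - e3 + 32) >> 6);
        }
    }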
[0014] After receiving the command and data from processor 302, motion-compensation accelerator 306 reads the residual pixel data output by the inverse-transform accelerator from shared buffer 331 and adds it to the predicted macroblock to obtain a reconstructed macroblock, which it stores in buffer 331. The reconstructed macroblock must still pass through de-blocking filter accelerator 307 to remove block effects before its decoding is complete. The operation of de-blocking accelerator 307 is controlled by filter control program 311 on processor 303. The filter control program learns whether motion-compensation accelerator 306 has finished its decoding step by monitoring its status register; once it has, the program commands de-blocking accelerator 307 to read the reconstructed macroblock from buffer 331, remove the block effects, and store the macroblock into the picture currently being reconstructed in memory 330, completing the decoding of the macroblock.

[0015] Embodiments of decoding H.264, VC-1, and similar encoded video bit streams with the present invention are described below.
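The control relationship described in paragraph [0014] — software on processor 303 watching the motion-compensation accelerator and then starting the de-blocking filter — might look like the loop below. The register names, bit meanings, and descriptor fields are assumptions.

    #include <stdint.h>

    #define MC_STATUS_DONE (1u << 0)   /* assumed meaning of a status bit of accelerator 306 */
    #define DBK_CMD_START  (1u << 0)   /* assumed command bit of de-blocking accelerator 307 */

    typedef volatile struct {
        uint32_t status;
        uint32_t command;
        uint32_t mb_addr;   /* address of a macroblock in shared buffer 331 */
    } AccelIf;

    /* Filter control program 311 on processor 303: wait for motion compensation
     * to finish a macroblock, then hand that macroblock to the de-blocking filter. */
    void filter_control(AccelIf *mc306, AccelIf *dbk307, uint32_t mb_in_buffer_331)
    {
        while ((mc306->status & MC_STATUS_DONE) == 0)
            ;                                   /* poll the status register of 306     */
        dbk307->mb_addr = mb_in_buffer_331;     /* reconstructed macroblock to filter  */
        dbk307->command = DBK_CMD_START;        /* de-block, then write to the picture */
    }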

[0016] The H.264 standard, also known as MPEG-4 Part 10, is a video coding standard developed jointly by the ITU Telecommunication Standardization Sector (ITU-T) and the Motion Picture Experts Group (MPEG) under the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC); its official name is Advanced Video Coding (AVC). It is block-based and, compared with earlier video coding standards, provides a variety of macroblock types for motion estimation and motion compensation, down to 4x4 macroblocks, allowing more precise segmentation of the moving regions in an image sequence.

[0017] When performing motion compensation, H.264 allows the reference values most similar to the macroblock currently being decoded to be obtained from multiple reconstructed pictures, which is called inter-frame prediction; up to 31 reconstructed pictures before and 31 after the current one may be referenced. H.264 also provides a way to obtain decoded pixels by extrapolation without referring to other reconstructed pictures, called intra-frame prediction. For the entropy-coding part, H.264 uses a single code table for non-transform coefficients, while quantized transform coefficients use context-adaptive coding techniques that build suitable code tables from the occurrence probabilities of code words to raise the compression ratio. There are two context-adaptive coding techniques: context-adaptive variable-length coding (CAVLC) and context-based adaptive binary arithmetic coding (CABAC).

[0018] FIG. 4 illustrates a method 400 of decoding an H.264 bit stream with the video decoding system 200 of FIG. 2, in which processors 401-403 are implementations of processors 210-212; entropy decoding 404, inverse transform 405, motion compensation 406, and de-blocking filter 407 are implementations of entropy decoding 220, inverse transform 221, motion compensation 222, and de-blocking filter 223; and memory 430 is an implementation of memory 232.

[0019] Processor 401 receives H.264 bit stream 420 from memory 430, learns through syntax parser program 408 that the video coding standard is H.264, and instructs entropy-decoding hardware accelerator 404 to decode the bit stream. Accelerator 404 decodes the received bit stream according to the syntax structure into the main data needed by the following decoding stages, including motion vectors, quantized coefficients, and intra-prediction mode indicators, and passes them to processor 402.

[0020] On receiving the quantized coefficients, processor 402 obtains dequantized coefficients through its inverse quantization program, passes them to inverse-transform accelerator 405, and commands it to perform the 4x4 integer inverse transform, which restores the dequantized coefficients to spatial-domain pixel data, also called residual pixel data, and stores it in a shared buffer 431 in memory 430. On receiving the motion vectors, processor 402 uses motion vector reconstruction program 410 to read, according to the offsets recorded in the motion vectors, reference macroblock pixel data of one or more reconstructed pictures from previous decoding passes out of memory 430 and to generate an inter-predicted macroblock. Once the predicted macroblock data has been generated, it is sent to motion-compensation accelerator 406, which is commanded to read the residual pixel data that inverse-transform accelerator 405 stored in shared buffer 431 of memory 430, add the residual to the received predicted macroblock pixels to obtain a reconstructed macroblock, and then store the reconstructed macroblock in shared buffer 431 of memory 430.
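Per sample, the reconstruction step performed by motion-compensation accelerator 406 amounts to a saturating add of prediction and residual. The stand-alone routine below illustrates only that arithmetic for an 8-bit 16x16 block; it ignores chroma and weighted prediction and is not the accelerator's implementation.

    #include <stdint.h>

    static uint8_t clip_u8(int v)
    {
        if (v < 0)   return 0;
        if (v > 255) return 255;
        return (uint8_t)v;
    }

    /* Reconstruct one 16x16 block of 8-bit samples: prediction (from motion
     * vector reconstruction or intra prediction) plus the residual read from
     * the shared buffer. */
    void reconstruct_block(const uint8_t pred[256], const int16_t residual[256], uint8_t out[256])
    {
        for (int i = 0; i < 256; i++)
            out[i] = clip_u8(pred[i] + residual[i]);
    }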
[0021] When processor 402 receives an intra-prediction mode indicator, it notifies intra-prediction program 412 on processor 403 to regenerate the intra-predicted macroblock. Once the intra-predicted macroblock has been generated, it is sent to motion-compensation accelerator 406, which is commanded to read the residual pixel data stored by inverse-transform accelerator 405 in the shared buffer of memory 430, add the residual to the predicted macroblock pixel data to obtain a reconstructed macroblock, and then store the reconstructed macroblock in a shared buffer 431 of memory 430.

[0022] Filter control program 411 on processor 403 monitors the status register of motion-compensation accelerator 406. Once accelerator 406 has finished reconstructing a macroblock and stored it in shared buffer 431 of memory 430, program 411 commands de-blocking filter accelerator 407 to read the reconstructed macroblock from shared buffer 431, remove the block effects, and write the de-blocked macroblock back into the picture currently being reconstructed in memory 430, completing the decoding of the macroblock.
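Paragraphs [0020] and [0021] together imply a simple choice of prediction source before the motion-compensation command is issued. The sketch below reflects that reading; the descriptor fields and helper names are assumptions, and the specification does not state which processor issues the command in the intra case.

    #include <stdint.h>
    #include <stdbool.h>

    typedef struct {
        bool    intra;       /* set when the entropy decoder reports an intra-prediction mode */
        uint8_t pred[256];   /* predicted macroblock samples                                  */
    } PredictedMb;

    /* Hypothetical helpers standing in for programs 410 and 412 and for the
     * command interface of motion-compensation accelerator 406. */
    void mv_reconstruct(PredictedMb *mb);              /* inter path (program 410)            */
    void intra_predict(PredictedMb *mb);               /* intra path (program 412)            */
    void mc_accel_reconstruct(const PredictedMb *mb);  /* add residual from shared buffer 431 */

    void predict_and_reconstruct(PredictedMb *mb)
    {
        if (mb->intra)
            intra_predict(mb);       /* regenerate the intra-predicted macroblock */
        else
            mv_reconstruct(mb);      /* build the inter-predicted macroblock      */
        mc_accel_reconstruct(mb);    /* accelerator 406 adds the residual and stores the result */
    }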

[0023] The predecessor of VC-1 is WMV version 9. WMV, whose full name is Windows Media Video, is the general name of a series of video codec formats developed by Microsoft. In 2003 Microsoft submitted WMV version 9 to the Society of Motion Picture and Television Engineers (SMPTE) with the aim of making WMV9 a video coding standard; in April 2006 WMV9 formally became a video coding standard and was named VC-1 by SMPTE. The encoding flow of VC-1 is similar to that of H.264: both achieve video compression by encoding away the parts of the spatial and temporal domains that human vision cannot perceive.

[0024] VC-1 also uses the block as the basic coding unit. Whereas H.264 provides seven macroblock sizes for motion estimation and motion compensation, VC-1 provides four partition sizes: 16x16, 16x8, 8x16, and 8x8. VC-1 likewise provides the two dynamic prediction modes of inter-frame and intra-frame prediction. For inter-frame prediction, VC-1 allows at most one reference reconstructed picture before and one after the current picture. For intra-frame prediction, unlike H.264, which predicts within the picture from spatial-domain pixel data, VC-1 uses the quantized transform coefficients of neighboring blocks as prediction data, a technique also known as AC/DC prediction. Whereas earlier video coding standards used 8x8 macroblocks or 4x4 blocks as the transform unit, VC-1 allows an adaptive block size transform with four different sizes. In addition, VC-1 has a different de-blocking solution called the overlap transform. Conventional de-blocking filtering can effectively reduce block effects, but because it is a step performed after reconstruction it may cause image detail to be lost. VC-1's overlap transform is instead applied as pre-processing in the spatial domain at encoding time, completed with matching post-processing at decoding time, and performed only for blocks of type I. As for the entropy-coding part, VC-1 uses variable-length coding for both non-transform coefficients and quantized transform coefficients.

[0025] FIG. 5 illustrates a method 500 of decoding a VC-1 bit stream with the video decoding system 200 of FIG. 2, in which processors 501-503 are implementations of processors 210-212; entropy decoding 504, inverse transform 505, motion compensation 506, and de-blocking filter 507 are implementations of entropy decoding 220, inverse transform 221, motion compensation 222, and de-blocking filter 223; and memory 530 is an implementation of memory 232.

[0026] Processor 501 receives bit stream 520 from memory 530, learns through syntax parser program 508 that the video coding standard is VC-1, and instructs entropy-decoding hardware accelerator 504 to decode the bit stream. Accelerator 504 decodes the bit stream according to the syntax structure into the main data needed by the following stages, including motion vectors, quantized coefficients, and AC/DC prediction indicators. It returns the motion vectors to processor 501 and passes the rest to processor 502.
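Tying the adaptive block sizes of [0024] back to the shared inverse-transform accelerator of [0013], one way to support MPEG-2, H.264, and VC-1 block sizes on one unit is a per-job mode field, as sketched below with an invented register layout.

    #include <stdint.h>

    typedef enum {
        IT_MPEG2_8x8,
        IT_H264_4x4,
        IT_VC1_8x8, IT_VC1_8x4, IT_VC1_4x8, IT_VC1_4x4
    } ItMode;

    /* Invented register layout of the shared inverse-transform accelerator. */
    typedef volatile struct {
        uint32_t mode;   /* selects standard and block size, i.e. the butterfly configuration */
        uint32_t src;    /* dequantized coefficients in memory                                */
        uint32_t dst;    /* residual output, e.g. the shared buffer                           */
        uint32_t start;
    } ItRegs;

    void it_submit(ItRegs *it, ItMode mode, uint32_t coeff_addr, uint32_t residual_addr)
    {
        it->mode  = (uint32_t)mode;
        it->src   = coeff_addr;
        it->dst   = residual_addr;
        it->start = 1u;
    }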
[0027] On receiving the quantized coefficients, processor 502 obtains dequantized coefficients through its inverse quantization program, passes them to inverse-transform accelerator 505, and commands it to perform the integer inverse transform, which restores the dequantized coefficients from the frequency domain to spatial-domain pixel data, also called residual pixel data, and stores it in a shared buffer 531 in memory 530. On receiving the motion vectors, processor 501 uses motion vector reconstruction program 509 to read, according to the offsets recorded in the motion vectors, the reference macroblocks of pictures already reconstructed in previous decoding passes from memory 530 and to generate an inter-predicted macroblock. Once the predicted macroblock pixel data has been generated, it is sent to motion-compensation accelerator 506, which is commanded to read the residual pixel data that inverse-transform accelerator 505 stored in shared buffer 531 of memory 530, add the residual to the received predicted macroblock pixels to obtain a reconstructed macroblock, and then store the reconstructed macroblock in shared buffer 531 of memory 530.

[0028] When processor 502 receives an AC/DC prediction indicator, it regenerates the intra-predicted macroblock through inverse AC/DC prediction program 511. Once the intra-predicted macroblock has been generated, processor 502 sends it to motion-compensation accelerator 506 and commands it to read the residual pixel data stored by inverse-transform accelerator 505 in shared buffer 531 of memory 530, add the residual to the received predicted macroblock pixel data to obtain a reconstructed macroblock, and then store the reconstructed macroblock in shared buffer 531 of memory 530.
[0029] Processor 503 contains two programs: overlap transform program 512 and filter control program 513. Overlap transform program 512 monitors the status register of inverse-transform accelerator 505; once accelerator 505 has completed the integer inverse transform and stored the residual pixel data in shared buffer 531 of memory 530, program 512 reads the residual pixel data, performs the overlap transform, and writes the result back to shared buffer 531 in memory 530. Filter control program 513 monitors the status register of motion-compensation accelerator 506; once accelerator 506 has finished reconstructing a macroblock and stored it in shared buffer 531 of memory 530, program 513 commands de-blocking filter accelerator 507 to read the reconstructed macroblock from shared buffer 531 and remove the block effects, and, when filtering is complete, to write the reconstructed macroblock back into the picture currently being reconstructed in memory 530, completing the decoding of the macroblock.
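One way to picture processor 503 running both programs is a single service loop that reacts to whichever accelerator has finished, as in the sketch below; the status bits, the intra-block flag, and the helper routines are assumptions rather than specification text.

    #include <stdint.h>
    #include <stdbool.h>

    #define IT_STATUS_DONE (1u << 0)   /* assumed: inverse transform 505 finished a block */
    #define MC_STATUS_DONE (1u << 0)   /* assumed: motion compensation 506 finished an MB */

    typedef volatile struct { uint32_t status; } StatusReg;

    /* Placeholders for the two software programs and the de-blocking command. */
    void overlap_transform(int16_t *residual_in_531, bool intra_block);  /* program 512            */
    void start_deblocker(uint32_t mb_addr_in_531);                       /* drives accelerator 507 */

    /* Processor 503: overlap-transform program 512 and filter-control program 513
     * served from one loop. */
    void processor_503_loop(StatusReg *it505, StatusReg *mc506,
                            int16_t *residual, bool intra_block, uint32_t mb_addr)
    {
        for (;;) {
            if (it505->status & IT_STATUS_DONE)
                overlap_transform(residual, intra_block);  /* program 512; skips non-I blocks   */
            if (mc506->status & MC_STATUS_DONE)
                start_deblocker(mb_addr);                  /* de-block, then write the picture  */
        }
    }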

[0030] In summary, the present invention meets the requirements for an invention patent, and a patent application is filed accordingly. The embodiments described above are only preferred embodiments of the present invention; equivalent modifications or variations made by those familiar with the art in accordance with the spirit of the invention shall all be included within the scope of the following claims.

[Brief Description of the Drawings]

[0031] FIG. 1 is a schematic diagram of an embodiment of hardware module 100.

[0032] FIG. 2 is a structural block diagram of an embodiment of video decoding system 200.

[0033] FIG. 3 is a schematic diagram of a method of general video decoding using video decoding system 200 of FIG. 2.

[0034] FIG. 4 is a schematic diagram of a method of decoding an H.264 bit stream using video decoding system 200 of FIG. 2.

[0035] FIG. 5 is a schematic diagram of a method of decoding a VC-1 bit stream using video decoding system 200 of FIG. 2.

[Description of Main Reference Numerals]

[0036] Entropy decoding 110, 220, 304, 404, 504
[0037] Inverse transform 111, 221, 305, 405, 505
[0038] Motion compensation 112, 222, 306, 406, 506
[0039] De-blocking filter 113, 223, 307, 407, 507
[0040] Processors 210, 211, 212, 301, 302,
[0041] 303, 401, 402, 403, 501,
[0042] 502, 503
[0043] Memory controller 231
[0044] Memory 232, 330, 430, 530
[0045] Video output unit 240
[0046] Bridge interface 251
[0047] Point-to-point buses 2521, 2522, 2523, 2524,
[0048] 2525, 2526, 2527, 2528
[0049] Syntax parsing 308, 408, 508
[0050] Motion vector reconstruction 310, 410, 509
[0051] Filter control 311, 411, 513
[0052] Encoded bit stream 320
[0053] Buffers 331, 431, 531
[0054] Intra-picture prediction 412
[0055] H.264 bit stream 420
[0056] Inverse AC/DC prediction 511
[0057] Overlap transform 512
[0058] VC-1 bit stream 520

Claims (1)

VII. Scope of the Patent Application:

1. A multi-standard video decoding system, comprising:
a memory for storing a bit stream and temporary data produced during decoding;
a multi-master bridge interface connected to the memory;
a plurality of processors, wherein at least one of the processors receives the bit stream from the memory through the multi-master bridge interface; and
a plurality of hardware accelerators, each of which receives an instruction from one of the processors to execute a video decoding flow and accesses the memory through the multi-master bridge interface.

2. The multi-standard video decoding system of claim 1, wherein the processors are interconnected by point-to-point buses.

3. The multi-standard video decoding system of claim 1, wherein the processors and their corresponding hardware accelerators are connected by point-to-point buses.

4. The multi-standard video decoding system of claim 1, wherein the hardware accelerators comprise a first hardware accelerator that receives a decoding instruction from one of the processors and performs variable-length decoding.

5. The multi-standard video decoding system of claim 1, wherein the hardware accelerators comprise a second hardware accelerator that receives a decoding instruction from one of the processors and performs the inverse discrete cosine transform.

6. The multi-standard video decoding system of claim 1, wherein the hardware accelerators comprise a third hardware accelerator that receives a decoding instruction from a further one of the processors and performs motion compensation.

7. The multi-standard video decoding system of claim 1, wherein the hardware accelerators comprise a fourth hardware accelerator that receives a decoding instruction from yet another one of the processors and performs de-blocking filtering.

8. The multi-standard video decoding system of claim 1, wherein the hardware accelerators share a buffer with the memory for data exchange.

9. A multi-standard video decoding system, comprising:
a memory;
a multi-master bridge interface;
point-to-point buses;
a plurality of processors, wherein at least one processor receives the bit stream from the memory through the multi-master bridge interface; and
a plurality of hardware accelerators, each of which receives an instruction from one of the processors to execute a video decoding flow, accesses the memory through the multi-master bridge interface, and is connected to the processors by the point-to-point buses.

10. The multi-standard video decoding system of claim 9, wherein the processors are interconnected by point-to-point buses.

11. The multi-standard video decoding system of claim 9, wherein the hardware accelerators comprise a first hardware accelerator that performs variable-length decoding upon receiving a decoding instruction from one of the processors.

12. The multi-standard video decoding system of claim 9, wherein the hardware accelerators comprise a second hardware accelerator that performs the inverse discrete cosine transform upon receiving a decoding instruction from another of the processors.

13. The multi-standard video decoding system of claim 9, wherein the hardware accelerators comprise a third hardware accelerator that performs motion compensation upon receiving a decoding instruction from a further one of the processors.

14. The multi-standard video decoding system of claim 9, wherein the hardware accelerators comprise a fourth hardware accelerator that performs de-blocking filtering upon receiving a decoding instruction from yet another one of the processors.

15. The multi-standard video decoding system of claim 9, wherein the hardware accelerators share a buffer with the memory for data exchange.

16. A multi-standard video decoding method, executed in a multi-standard video decoding system, comprising:
receiving a bit stream by a first processor of the multi-standard video decoding system;
performing variable-length decoding by a first hardware accelerator of the multi-standard video decoding system;
performing the inverse discrete cosine transform by a second hardware accelerator of the multi-standard video decoding system;
performing motion compensation by a third hardware accelerator of the multi-standard video decoding system; and
performing de-blocking filtering by a fourth hardware accelerator of the multi-standard video decoding system.

17. The multi-standard video decoding method of claim 16, wherein the second hardware accelerator, the third hardware accelerator, and the fourth hardware accelerator share a buffer to exchange data.
TW98139845A 2009-11-24 2009-11-24 Multi-standard video decoding system and method TW201119396A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW98139845A 2009-11-24 2009-11-24 Multi-standard video decoding system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW98139845A 2009-11-24 2009-11-24 Multi-standard video decoding system and method

Publications (1)

Publication Number Publication Date
TW201119396A true TW201119396A (en) 2011-06-01

Family

ID=44936078

Family Applications (1)

Application Number Title Priority Date Filing Date
TW98139845A TW201119396A (en) 2009-11-24 2009-11-24 Multi-standard video decoding system and method

Country Status (1)

Country Link
TW (1) TW201119396A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10165288B2 (en) 2012-12-21 2018-12-25 Dolby Laboratories Licensing Corporation High precision up-sampling in scalable coding of high bit-depth video

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10165288B2 (en) 2012-12-21 2018-12-25 Dolby Laboratories Licensing Corporation High precision up-sampling in scalable coding of high bit-depth video
US10516889B2 (en) 2012-12-21 2019-12-24 Dolby Laboratories Licensing Corporation High precision up-sampling in scalable coding of high bit-depth video
US10958922B2 (en) 2012-12-21 2021-03-23 Dolby Laboratories Licensing Corporation High precision up-sampling in scalable coding of high bit-depth video
US11284095B2 (en) 2012-12-21 2022-03-22 Dolby Laboratories Licensing Corporation High precision up-sampling in scalable coding of high bit-depth video
US11570455B2 (en) 2012-12-21 2023-01-31 Dolby Laboratories Licensing Corporation High precision up-sampling in scalable coding of high bit-depth video
US11792416B2 (en) 2012-12-21 2023-10-17 Dolby Laboratories Licensing Corporation High precision up-sampling in scalable coding of high bit-depth video

Similar Documents

Publication Publication Date Title
TWI378729B (en) Method and apparatus to construct bi-directional predicted frames for temporal scalability
US9961346B2 (en) Video encoder and method of operating the same
US9131240B2 (en) Video decoding method and apparatus which uses double buffering
CA2665243C (en) Signalling of maximum dynamic range of inverse discrete cosine transform
Kang et al. MPEG4 AVC/H. 264 decoder with scalable bus architecture and dual memory controller
KR100772379B1 (en) External memory device, method for storing image date thereof, apparatus for processing image using the same
CN103959792B (en) Method for video coding, video encoding/decoding method and realize the device of the method
TW201813387A (en) Apparatus and method for low latency video encoding
KR101392349B1 (en) Method and apparatus for video decoding
US20110110435A1 (en) Multi-standard video decoding system
JP2002112268A (en) Compressed image data decoding apparatus
Chen et al. Architecture design of high performance embedded compression for high definition video coding
TW201119396A (en) Multi-standard video decoding system and method
JP2950367B2 (en) Data output order conversion method and circuit in inverse discrete cosine converter
JP2003230148A (en) Image data coding unit
Kun et al. A hardware-software co-design for h. 264/avg decoder
CN109495745B (en) Lossless compression decoding method based on inverse quantization/inverse transformation
Hase et al. Development of low-power and real-time VC-1/H. 264/MPEG-4 video processing hardware
KR20090076020A (en) Multi codec decoder and decoding method
KR100636911B1 (en) Method and apparatus of video decoding based on interleaved chroma frame buffer
KR102171119B1 (en) Enhanced data processing apparatus using multiple-block based pipeline and operation method thereof
JP4888224B2 (en) Image processing apparatus and method, and program
US10334262B2 (en) Moving-picture decoding processing apparatus, moving-picture coding processing apparatus, and operating method of the same
JP4214554B2 (en) Video decoding device
KR100821922B1 (en) Local memory controller for mpeg decoder