201117619
VI. Description of the Invention:

[Technical Field]
The present invention relates to a design space exploration method for a reconfigurable motion compensation architecture, and in particular to a method of developing a reconfigurable motion compensation architecture using the concept of architecture/algorithm co-design.

[Prior Art]
With the development of multimedia technology, a variety of video compression standards have been successfully developed, such as MPEG-1, MPEG-2, and MPEG-4 established by ISO/IEC and H.263 and H.264 established by ITU-T, improving the video experience in daily life.

In recent years, in response to the applications of different video compression standards, developing decoders that support multiple video formats has become an inevitable trend. Among the above standards, motion compensation is the most important, core part. For a video decoder, motion compensation uses the motion vector obtained by motion estimation to locate, in a reference picture, the block that best matches the current picture, and obtains the motion-compensated prediction value through interpolation.

Because the motion compensation processes of these standards have both similarities and differences, implementing each of them separately in one decoder inevitably spends duplicated hardware resources on what the processes have in common. A design method is therefore needed that, for the required specification, develops an efficient motion compensation hardware architecture to support the motion compensation processing of different video compression standards.
[Summary of the Invention]
The present invention provides a design space exploration method for a reconfigurable motion compensation architecture. Using the concept of architecture/algorithm co-design, the method develops an efficient reconfigurable motion compensation architecture for a required predetermined application specification.

The present invention proposes a design space exploration method for a reconfigurable motion compensation architecture. First, a predetermined application specification is set. Next, a commonality is extracted among the motion compensation algorithms of multiple video compression standards, and the computational load of each algorithm is analyzed accordingly to determine the arithmetic units contained in a processing element. Based on the predetermined application specification, the data flow of the reconfigurable motion compensation architecture executing each motion compensation algorithm is analyzed under peak computational load and worst-case data configuration, using different data granularities and different numbers of processing elements, and the hardware parameters corresponding to each algorithm are obtained from this analysis. Based on a predetermined design goal and these hardware parameters, the predetermined data granularity and the predetermined number of processing elements of the reconfigurable motion compensation architecture are selected.
In an embodiment of the present invention, the above design space exploration method further includes analyzing, according to the block partition types supported by each motion compensation algorithm, the reusable data within the reference block corresponding to each block partition type when that type is processed at different data granularities, and obtaining from this the hardware parameters corresponding to each algorithm. When the reconfigurable motion compensation architecture processes at the predetermined data granularity, the reusable data of the reference block is retained in the internal memory of the architecture.

In an embodiment of the present invention, the above design space exploration method further includes analyzing, according to the pixel interpolation types supported by each motion compensation algorithm, the reference block required by each pixel interpolation type, and obtaining from this the hardware parameters corresponding to each algorithm. When the reconfigurable motion compensation architecture processes at the predetermined data granularity, the reference block corresponding to each pixel interpolation type is fetched into the internal memory of the architecture.

Based on the above, the present invention analyzes the computational load of each motion compensation algorithm on the basis of the extracted commonality and thereby derives the processing element of the reconfigurable motion compensation architecture. This processing element saves hardware resources when handling the operations that the motion compensation algorithms have in common.
In addition, the present invention analyzes the data flow of the reconfigurable motion compensation architecture executing each motion compensation algorithm with different data granularities and different numbers of processing elements, and obtains the hardware parameters corresponding to each algorithm. These hardware parameters are weighed against a predetermined design goal to develop a reconfigurable motion compensation architecture that meets the predetermined application specification with better efficiency.

[Embodiment]
Because video compression standards define different profiles and levels for different application requirements, this embodiment first sets a predetermined application specification and then, under that specification, uses the design space exploration method to develop a reconfigurable motion compensation architecture with better efficiency.

Table 1 lists the predetermined application specification supported by the reconfigurable motion compensation architecture of an embodiment of the present invention. Referring to Table 1, the architecture developed in this embodiment is intended to support the Main profile at High level of MPEG-2, the Advanced Coding Efficiency profile at level L4 of MPEG-4, and the Main profile at level L4 of H.264; to support the processing of P pictures and B pictures; and to process in real time 30 high-resolution (1920x1088) pictures per second in the 4:2:0 YCrCb color format.

Table 1. Predetermined application specification supported by the reconfigurable motion compensation architecture
  Image resolution            1920x1088
  Color format (Y:Cr:Cb)      4:2:0
  Frame rate (frames/s)       30
  Picture types               P pictures and B pictures
  MPEG-2 profile and level    Main profile at High level
  MPEG-4 profile and level    Advanced Coding Efficiency profile at level L4
  H.264 profile and level     Main profile at level L4

To support the above video compression standards, this embodiment analyzes the similarities and differences among their motion compensation algorithms and extracts at least one commonality among them, so that the reconfigurable motion compensation architecture can operate more efficiently and save hardware resources.

For example, the H.264 motion compensation algorithm supports luminance interpolation at 1/4-pixel precision and chrominance interpolation at 1/8-pixel precision, while the MPEG-4 motion compensation algorithm supports interpolation at 1/4-pixel precision for luminance prediction and 1/2-pixel precision for chrominance prediction. At 1/4-pixel precision, the H.264 and MPEG-4 algorithms interpolate with 6-tap and 8-tap finite impulse response (FIR) filters respectively, yet the H.264 tap coefficient sequence [1, -5, 20, 20, -5, 1] and the MPEG-4 tap coefficient sequence [-8, 24, -48, 160, 160, -48, 24, -8] share a common factor sequence. Any coefficient of either sequence can be obtained from this common factor sequence with simple additions and shifts.
Dividing the MPEG-4 tap coefficient sequence [-8, 24, -48, 160, 160, -48, 24, -8] by 8 gives the sequence [-1, 3, -6, 20, 20, -6, 3, -1]. This sequence and the H.264 tap coefficient sequence [1, -5, 20, 20, -5, 1] can both be derived from the common factor sequence [1, 2, 4, 16] through the additions and shifts shown in Table 2. Tap coefficient 3 is coefficient 1 shifted left by 1 bit (giving 2) plus 1; the magnitudes of tap coefficients -5 and -6 are coefficient 1 shifted left by 2 bits (giving 4) plus 1 and plus 2 respectively; and tap coefficient 20 is coefficient 1 shifted left by 4 bits (giving 16) plus 4. In addition, MPEG-2 and MPEG-4 support interpolation at 1/2-pixel precision for luminance and chrominance prediction, for which a simple bilinear filter can generate the motion-compensated prediction value.

Table 2. Computation of the tap coefficients
  Tap coefficient    Computation
  20                 16+4
  -6                 4+2
  -5                 4+1
  3                  2+1

Next, based on the above commonality, this embodiment analyzes the computational load of the motion compensation algorithm of each video compression standard and thereby determines the arithmetic units contained in a processing element (PE) of the reconfigurable motion compensation architecture. Fig. 1 is a schematic diagram of the fractional-pixel interpolation supported by H.264. Referring to Fig. 1, in the H.264 motion compensation algorithm a 1/2-pixel position is interpolated from its 6 neighboring integer pixels; for example, 1/2-pixel position b = round((E-5F+20G+20H-5I+J)/32), where round() denotes rounding to the nearest integer.
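The shift-and-add derivation of Table 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the dictionary layout are hypothetical, and applying the sign of the -5 and -6 taps as a separate negation is a choice made for this sketch, not something the text specifies.

```python
# Common factor sequence from the text: each entry is 1 shifted left by 0, 1, 2 or 4 bits.
COMMON_FACTORS = (1, 2, 4, 16)

# Table 2: each tap magnitude written as the sum of two common factors.
# (Handling the sign of -5 and -6 by a final negation is an assumption of this sketch.)
TAP_DECOMPOSITION = {20: (16, 4), 6: (4, 2), 5: (4, 1), 3: (2, 1)}

SIX_TAP = (1, -5, 20, 20, -5, 1)               # 6-tap sequence quoted in the text
EIGHT_TAP = (-1, 3, -6, 20, 20, -6, 3, -1)     # 8-tap sequence after dividing by 8


def shift_of(factor: int) -> int:
    """Return s such that factor == 1 << s (factor must be a power of two)."""
    return factor.bit_length() - 1


def multiply_by_tap(x: int, tap: int) -> int:
    """Multiply x by a tap coefficient using shifts, one addition and a sign flip."""
    magnitude = abs(tap)
    if magnitude == 1:
        product = x
    else:
        hi, lo = TAP_DECOMPOSITION[magnitude]
        product = (x << shift_of(hi)) + (x << shift_of(lo))
    return -product if tap < 0 else product


if __name__ == "__main__":
    for tap in sorted(set(SIX_TAP + EIGHT_TAP)):
        print(tap, multiply_by_tap(7, tap))
```

Because every tap of both sequences reduces to the same four shifted copies of the input, a single set of shifters and adders can in principle serve both filters, which is the commonality the text relies on.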
Based on the extracted commonality, in the H.264 motion compensation algorithm the 1/2-pixel position b = [(G+H)(16+4) + (F+I)((-4)+(-1)) + (E+J) + 16] >> 5, so position b requires 8 additions and 1 shift. This is only a brief illustration of how the computation required at each pixel position in the H.264 algorithm can be derived from the common factor sequence; the count excludes the additions and shifts needed to form the tap coefficients from the common factor sequence. The operations required at each pixel position in the MPEG-4 and MPEG-2 motion compensation algorithms can be derived in the same way from the common factor sequence.

Referring to Fig. 1, among the block partition types supported by the H.264 motion compensation algorithm, such as 4x4, 8x4, 8x8, and 16x8, the worst case occurs when the algorithm processes a pixel position such as k or q for the 4x4 block partition type, because the interpolated values computed under the 4x4 partition type have low reusability and interpolating these positions requires a large amount of computation. Similarly, the worst case for the MPEG-4 motion compensation algorithm occurs when it processes pixel position a, b, c, or d for the 8x8 block partition type, and the worst case for the MPEG-2 motion compensation algorithm occurs when it processes pixel position a for the 16x16 block partition type.

Therefore, based on the above commonality, this embodiment can estimate, for each video compression standard, the computation that the motion compensation algorithm requires in the worst case for luminance and chrominance interpolation in P pictures, and likewise for luminance and chrominance interpolation in B pictures.
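As a check on the half-pixel formula and its operation count, the following Python sketch compares the direct form of b with a grouped form that pairs the symmetric taps (G with H, F with I, E with J). The function names are hypothetical, and the clipping to the valid sample range that a real decoder applies is left out because the text does not discuss it.

```python
def half_pel_direct(E, F, G, H, I, J):
    """b = round((E - 5F + 20G + 20H - 5I + J) / 32), implemented as (sum + 16) >> 5."""
    return (E - 5 * F + 20 * G + 20 * H - 5 * I + J + 16) >> 5


def half_pel_shift_add(E, F, G, H, I, J):
    """Same value, grouped so that symmetric taps share one coefficient.

    The comments number the 8 additions counted in the text; the shifts that
    build 20 and 5 from the common factors are not part of that tally.
    """
    gh = G + H                          # addition 1
    fi = F + I                          # addition 2
    ej = E + J                          # addition 3
    term20 = (gh << 4) + (gh << 2)      # addition 4: (G+H) * (16+4)
    term5 = (fi << 2) + fi              # addition 5: (F+I) * (4+1)
    acc = term20 - term5                # addition 6 (a subtraction, counted as an addition)
    acc = acc + ej                      # addition 7
    acc = acc + 16                      # addition 8: rounding offset
    return acc >> 5                     # the single final shift


if __name__ == "__main__":
    pixels = (10, 20, 30, 40, 50, 60)
    print(half_pel_direct(*pixels), half_pel_shift_add(*pixels))
```

For a flat region where all six neighbors equal 100, both functions return 100, since the taps sum to 32.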
Taking additions as an example, the MPEG-4 motion compensation algorithm requires more additions than the other two algorithms to interpolate the first pixel position in the worst case. The processing element is therefore assumed to contain enough adders to complete the worst-case processing of the MPEG-4 motion compensation algorithm. In this way, this embodiment determines the arithmetic units contained in a processing element.

To develop the reconfigurable motion compensation architecture, this embodiment follows the abstraction layers of a top-down design methodology, such as the application specification, algorithm, and architecture layers. Starting from the top layer, it examines the data flow of each motion compensation algorithm and explores the design conditions one by one, trading them off to obtain hardware that meets the predetermined design goal.

For the reconfigurable motion compensation architecture, the top-level data flow consists of fetching the required data and computing on it within a unit clock cycle. Based on the predetermined application specification, this embodiment analyzes the data flow of the architecture when it runs each motion compensation algorithm under peak computational load and worst-case data configuration with different data granularities and different numbers of processing elements, and obtains from this the hardware parameters corresponding to each algorithm, such as bandwidth, bus width in bits, required memory capacity, and operating frequency.
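To give a feel for how an operating frequency relates to the application specification, the sketch below computes the macroblock and sample throughput implied by Table 1 and the resulting cycle budget per macroblock at a given clock. It is a back-of-the-envelope illustration, not the estimation procedure of the embodiment: the 16x16 macroblock size and all function names are assumptions introduced here.

```python
WIDTH, HEIGHT, FPS = 1920, 1088, 30   # Table 1: resolution and frame rate
MB_SIZE = 16                          # assumed 16x16 luma macroblock


def macroblocks_per_second(width=WIDTH, height=HEIGHT, fps=FPS):
    """Number of macroblocks the decoder must process each second."""
    return (width // MB_SIZE) * (height // MB_SIZE) * fps


def samples_per_second(width=WIDTH, height=HEIGHT, fps=FPS):
    """Luma plus chroma samples per second in 4:2:0 (chroma adds half the luma count)."""
    luma = width * height
    return (luma + luma // 2) * fps


def cycles_per_macroblock(freq_hz):
    """Clock cycles available per macroblock at the given operating frequency."""
    return freq_hz / macroblocks_per_second()


if __name__ == "__main__":
    print(macroblocks_per_second())          # 244800
    print(samples_per_second())              # 94003200
    print(cycles_per_macroblock(100e6))      # budget at a hypothetical 100 MHz clock
```

A lower cycle budget per macroblock means each processing element has to finish more interpolation work per cycle, which is the pressure that adding processing elements is meant to relieve.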
For example, because the smallest block partition type supported by the H.264 motion compensation algorithm is 4x4, the data flow of each motion compensation algorithm can be analyzed for different numbers of processing elements under the condition that every block partition type is processed at a 4x4 data granularity. Tables 3 to 5 list the operating frequencies estimated in this embodiment for the reconfigurable motion compensation architecture with different numbers of processing elements under MPEG-2, MPEG-4, and H.264 respectively, where I, B, and P denote I pictures, B pictures, and P pictures.

Table 3. Operating frequency estimated for MPEG-2 with different numbers of processing elements
                   Operating frequency by group of pictures (GOP) structure
  Number of PEs    IP         IBP        IBBP       IBBBP
  1                94 MHz     141 MHz    156 MHz    162 MHz
  2                47 MHz     70 MHz     78 MHz     81 MHz
  3                31 MHz     37 MHz     45 MHz     49 MHz
  4                23 MHz     35 MHz     39 MHz     40 MHz
  5                18 MHz     28 MHz     31 MHz     32 MHz

Table 4. Operating frequency estimated for MPEG-4 with different numbers of processing elements
                   Operating frequency by group of pictures (GOP) structure
  Number of PEs    IP         IBP        IBBP       IBBBP
  1                211 MHz    308 MHz    341 MHz    354 MHz
  2                105 MHz    154 MHz    170 MHz    177 MHz
  3                70 MHz     102 MHz    113 MHz    118 MHz
  4                52 MHz     77 MHz     85 MHz     88 MHz
  5                42 MHz     61 MHz     68 MHz     70 MHz

Table 5. Operating frequency estimated for H.264 with different numbers of processing elements
                   Operating frequency by group of pictures (GOP) structure
  Number of PEs    IP         IBP        IBBP       IBBBP
  1                261 MHz    370 MHz    406 MHz    421 MHz
  2                130 MHz    185 MHz    203 MHz    210 MHz
  3                87 MHz     123 MHz    135 MHz    140 MHz
  4                65 MHz     92 MHz     101 MHz    105 MHz
  5                52 MHz     74 MHz     81 MHz     84 MHz

As Tables 3 to 5 show, with 1 or 2 processing elements the reconfigurable motion compensation architecture needs a relatively high operating frequency to support MPEG-4 and H.264, and achieving that frequency may cost excessive hardware resources. With 3 processing elements, the data flow of the architecture when supporting the above standards has low regularity and high complexity. Using 5 processing elements lowers the operating frequency but faces the same problem as using 3. Therefore, in this embodiment, the predetermined data granularity (for example, 4x4) and the predetermined number (for example, 4) of processing elements of the reconfigurable motion compensation architecture are selected based on the predetermined design goal and the hardware parameters corresponding to each motion compensation algorithm. The predetermined design goal is, for example, to trade off the hardware parameters of each algorithm so that the most appropriate operating frequency, bandwidth, and required memory capacity are obtained at the predetermined data granularity and with the predetermined number of processing elements.

It is worth noting that Tables 3 to 5 of the above embodiment result from exploring, under different design conditions and based on the extracted commonality and the predetermined application specification, the data flow of the reconfigurable motion compensation architecture supporting different motion compensation algorithms, from which the hardware parameters corresponding to each algorithm are obtained.
However, the extractable commonality varies with the video compression standards to be supported, and the data granularity and number of processing elements that meet the design goal vary with the application specification. The above embodiment provides a design space exploration method for a reconfigurable motion compensation architecture so that hardware meeting the predetermined application specification can be developed efficiently.

The above can be summarized as the following method flow. Fig. 2 is a flowchart of the design space exploration method for a reconfigurable motion compensation architecture according to an embodiment of the present invention. Referring to Fig. 2, first a predetermined application specification is set (step S201). A commonality is extracted from the motion compensation algorithms of multiple video compression standards (step S202), and the computational load of each algorithm is analyzed accordingly to determine the arithmetic units contained in a processing element (step S203). Based on the predetermined application specification, the data flow of the reconfigurable motion compensation architecture executing each motion compensation algorithm is analyzed under peak computational load and worst-case data configuration with different data granularities and different numbers of processing elements, and the hardware parameters corresponding to each algorithm are obtained from it (step S204). Based on the predetermined design goal and the hardware parameters, the predetermined data granularity and the predetermined number of processing elements of the architecture are selected (step S205).

Furthermore, other design strategies can be adopted in the design space exploration method to improve the operating efficiency of the reconfigurable motion compensation architecture.
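Returning to steps S204 and S205, the selection of the number of processing elements can be sketched as a small search over the estimated frequencies. The figures below are the IBBBP columns of Tables 3 to 5; everything else is an assumption made for illustration: the frequency ceiling `max_mhz` is a hypothetical design goal, the set of irregular configurations simply encodes the text's remark about 3 and 5 processing elements at 4x4 granularity, and picking the smallest feasible count is one possible trade-off rule, not the one the embodiment prescribes.

```python
# Worst-case (IBBBP) operating frequencies in MHz, copied from Tables 3 to 5.
WORST_CASE_MHZ = {
    "MPEG-2": {1: 162, 2: 81, 3: 49, 4: 40, 5: 32},
    "MPEG-4": {1: 354, 2: 177, 3: 118, 4: 88, 5: 70},
    "H.264": {1: 421, 2: 210, 3: 140, 4: 105, 5: 84},
}

# PE counts the text describes as giving an irregular, complex data flow.
IRREGULAR_DATAFLOW = {3, 5}


def required_frequency(num_pe):
    """Frequency needed to support all three standards with num_pe processing elements."""
    return max(table[num_pe] for table in WORST_CASE_MHZ.values())


def select_pe_count(max_mhz):
    """Smallest regular PE count whose required frequency fits under max_mhz, or None."""
    candidates = [
        n for n in range(1, 6)
        if n not in IRREGULAR_DATAFLOW and required_frequency(n) <= max_mhz
    ]
    return min(candidates) if candidates else None


if __name__ == "__main__":
    print(select_pe_count(150))   # 4, for a hypothetical 150 MHz ceiling
```

With a hypothetical ceiling of 150 MHz the sketch returns 4, matching the example choice in the text; a looser ceiling would admit 2, which shows how strongly the outcome depends on the design goal.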
Another embodiment of the present invention analyzes, according to the block partition types supported by each motion compensation algorithm, the reusable data within the reference block corresponding to each block partition type when that type is processed at different data granularities. Fig. 3 is a schematic diagram of the reference block required by the H.264 motion compensation algorithm for the 16x16 block partition type according to an embodiment of the present invention. Referring to Fig. 3, when the H.264 motion compensation algorithm processes the 16x16 block partition type 310, data the size of the 21x21 reference block 320 is needed to compute the pixel interpolation within the 16x16 block partition type 310.

If the reconfigurable motion compensation architecture completes the pixel interpolation of the 16 4x4 blocks contained in the 16x16 block partition type 310 one by one at a 4x4 data granularity, the 21x21 reference block 320 contains reusable data, shown as the shaded region 330. The reusable data within the reference block of each block partition type differs with the data granularity. The design strategy adopted in this embodiment is to retain the reusable data in internal memory for processing the next block, which reduces the bandwidth load of accessing external memory. Through this analysis, this embodiment obtains the hardware parameters corresponding to each motion compensation algorithm (for example, internal memory capacity and bandwidth) and, based on the predetermined design goal and these hardware parameters, selects the predetermined data granularity best suited to the reconfigurable motion compensation architecture.

In addition, another embodiment of the present invention further analyzes the reference block required by each pixel interpolation type supported by each motion compensation algorithm.
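The reference-block sizes discussed for Fig. 3 can be reproduced with a short calculation. The sketch below assumes the worst case in which a 6-tap filter is applied both horizontally and vertically, so a block of W x H samples needs (W+5) x (H+5) reference samples; the function names are hypothetical, and the comparison of fetch counts is only a rough indication of what retaining reusable data in internal memory saves.

```python
def reference_block_size(block_w, block_h, taps=6):
    """Reference samples needed when a taps-long filter runs in both directions."""
    return block_w + taps - 1, block_h + taps - 1


def fetched_samples(block_w, block_h, granularity=4, taps=6):
    """Return (without_reuse, with_reuse) external-memory sample counts.

    without_reuse: every granularity x granularity sub-block fetches its own
                   reference window from external memory.
    with_reuse:    the whole reference block is fetched once and overlapping
                   samples are kept in internal memory.
    """
    ref_w, ref_h = reference_block_size(block_w, block_h, taps)
    sub_w, sub_h = reference_block_size(granularity, granularity, taps)
    sub_blocks = (block_w // granularity) * (block_h // granularity)
    return sub_blocks * sub_w * sub_h, ref_w * ref_h


if __name__ == "__main__":
    print(reference_block_size(16, 16))   # (21, 21)
    print(fetched_samples(16, 16))        # (1296, 441)
```

For the 16x16 partition at 4x4 granularity, fetching each 9x9 window separately would read 1296 samples, against 441 when the 21x21 reference block is fetched once and shared.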
For example, in MPEG-2, luminance and chrominance interpolation at an integer pixel position requires data the size of an MxN reference block, and luminance and chrominance interpolation at a horizontal 1/2-pixel position requires data the size of an (M+1)xN reference block, where MxN is the size of the block partition type. Similarly, MPEG-4 and H.264 need to fetch reference blocks of different sizes for different pixel interpolation types. The design strategy adopted in this embodiment is that, during processing, the reconfigurable motion compensation architecture fetches into internal memory the reference block corresponding to the pixel interpolation type, which reduces the bandwidth load of accessing external memory. Through this analysis, this embodiment obtains the hardware parameters corresponding to each motion compensation algorithm (for example, bandwidth) and, based on the predetermined design goal and these hardware parameters, selects the predetermined data granularity best suited to the reconfigurable motion compensation architecture.

With the above design space exploration method and design strategies, the most appropriate reconfigurable motion compensation architecture for the predetermined application specification can be developed. Fig. 4 is a schematic diagram of a reconfigurable motion compensation architecture according to an embodiment of the present invention. Referring to Fig. 4, based on the data reuse design strategy and/or the design strategy of fetching the required reference block according to the pixel interpolation type, this embodiment provides a data communication module 410 in the reconfigurable motion compensation architecture 400. The data communication module 410 fetches from external memory the reference block corresponding to each block partition type into the internal memory 420, and can also fetch the required reference block into the internal memory 420 according to the pixel interpolation type.
Next, when the reconfigurable motion compensation architecture 400 performs pixel interpolation, the data communication module 410 obtains the required data from the internal memory 420 according to the predetermined data granularity and stores the reusable data back to the internal memory 420 for the next block. Because pixel interpolation may reference pixels in the horizontal direction or in the vertical direction, this embodiment provides a data supply module 430 in the reconfigurable motion compensation architecture 400 to arrange the data accessed by the data communication module 410. The design space exploration method described above determines the number of processing elements with which the reconfigurable motion compensation architecture 400 supports multiple video compression standards; accordingly, this embodiment provides an interpolation module 440 composed of the predetermined number of processing elements to execute the computation of each motion compensation algorithm. To coordinate these modules, this embodiment provides a parameter control module 450 in the reconfigurable motion compensation architecture 400, which receives the parameters needed to execute each motion compensation algorithm (for example, motion vectors) and controls the operation of each module accordingly. The buffer module 460 temporarily stores data.

In summary, the above embodiments analyze the computational load of each motion compensation algorithm based on the commonality among the motion compensation algorithms of different video compression standards, and thereby derive the processing element of the reconfigurable motion compensation architecture. Because of this commonality, the processing element saves hardware resources when handling the operations that the motion compensation algorithms have in common.
In addition, the above embodiments analyze the data flow of the reconfigurable motion compensation architecture executing each motion compensation algorithm with different data granularities and different numbers of processing elements, and obtain from the data flow the hardware parameters corresponding to each algorithm. Based on the predetermined design goal and these hardware parameters, the data granularity and the number of processing elements adopted by the reconfigurable motion compensation architecture can be selected, thereby developing a reconfigurable motion compensation architecture with better efficiency.

[Brief Description of the Drawings]
Fig. 1 is a schematic diagram of the fractional-pixel interpolation supported by the H.264 video compression standard.
Fig. 2 is a flowchart of a design space exploration method for a reconfigurable motion compensation architecture according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of the reference block required by the H.264 motion compensation algorithm for the 16x16 block partition type according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of a reconfigurable motion compensation architecture according to an embodiment of the present invention.

[Description of Main Reference Symbols]
E to N, P, Q: integer pixel positions
a to k, m, n, p to s, cc, dd, ee, ff: fractional pixel positions
310: 16x16 block partition type
320: reference block
330: reusable data
400: reconfigurable motion compensation architecture
410: data communication module
420: internal memory
430: data supply module
440: interpolation module
450: parameter control module
460: buffer module
S201 to S205: steps of the design space exploration method for a reconfigurable motion compensation architecture according to an embodiment of the present invention