TW395135B

TW395135B - A high throughput and regular architecture of 2-D 8x8 DCT/IDCT using direct form

Info

Publication number: TW395135B
Application number: TW87103022A
Authority: TW
Inventors: Liang-Ji Chen; Yung-Bin Li; Jung-Wei Gu; Yuan-Jen Liou
Original assignee: Nat Science Council
Priority date: 1998-03-03
Filing date: 1998-03-03
Publication date: 2000-06-21

Abstract

The invention describes a novel 8x8 2-D DCT/IDCT architecture based on the direct 2-D approach and the rotation technique. The computational complexity is reduced by exploiting a special property of complex numbers. Two architectures are proposed: the parallel architecture suits low-speed, low-power designs; the folded architecture suits higher-speed designs and applications across various standards. Unlike other approaches, the proposed architecture is regular and hence suitable for VLSI implementation. The folded architecture achieves twice the throughput of the row-column method with only a small increase in hardware.

Description

The present invention is a fast algorithm for the 8x8 two-dimensional discrete cosine transform and inverse discrete cosine transform (2-D DCT/IDCT), derived using the direct 2-D method and the rotation property of complex numbers.

Background of the invention

Image and video compression has become an important topic in communication technology. Many consumer products, such as videophones, video conferencing systems and high-definition digital television (HDTV), rely on data compression to reduce the amount of transmitted data so that it fits the channel bandwidth while delivering higher quality.

Many current image and video compression standards, such as JPEG, H.261 and MPEG, use the discrete cosine transform (DCT) as one stage of compression. Fast and efficient DCT hardware will therefore help the future development of image and video compression, particularly as data volumes keep growing with applications such as HDTV, which only a DCT with high computing capability can serve. From the standpoint of hardware fabrication and very-large-scale integration (VLSI) technology, a regular and modular hardware architecture not only reduces the required hardware area but also shortens the time spent on design and debugging. In summary, a DCT algorithm with high throughput and high regularity meets these requirements.

Prior art: the row-column method

The NxN 2-D DCT can be written as

$$Y_{k_1,k_2} = \frac{2}{N}\,C(k_1)\,C(k_2)\sum_{n_1=0}^{N-1}\sum_{n_2=0}^{N-1} x_{n_1,n_2}\cos\frac{2\pi(2n_1+1)k_1}{4N}\cos\frac{2\pi(2n_2+1)k_2}{4N},\qquad n_1,n_2,k_1,k_2 = 0,1,\dots,N-1 \tag{1}$$

where $C(0) = 1/\sqrt{2}$ and $C(n) = 1$ for $n \neq 0$; $x_{n_1,n_2}$ is the input of the DCT and $Y_{k_1,k_2}$ is its output. Although many algorithms have been proposed, the most common approach is still to turn the 2-D computation into 1-D computations, which is known as the row-column method. With the row-column method an NxN DCT requires only 2N one-dimensional DCTs, which lowers the computational complexity. The architecture of the row-column method is shown in Figure 1. Most earlier DCT hardware adopted the row-column method because of its regularity, but its complexity is far greater than that of the direct method.

Summary of the invention

The primary objective of the invention is to reduce the complexity of the prior art. To this end a DCT algorithm based on the direct method is developed and used to implement DCT hardware. A further objective is to provide two architectures, a parallel architecture and a folded architecture. The parallel architecture suits low-speed, low-power designs, while the folded architecture suits high-speed designs and is applicable to various video standards. Because the complexity is reduced, the data throughput can be doubled for the same hardware area. The algorithm also analyses and remedies the irregularity inherent in the direct method, so that it becomes regular and modular and is well suited to VLSI hardware implementation.

Description of the table

Table 1: Relationship between k1, k2 and a, b

Description of the figures

Figure 1: Architecture of the row-column method
Figure 2: Mapping from $x_{n_1,n_2}$ to $y_{n_1,t}$ when N = 8
Figure 3: Architecture of Step 1, (a) n1 odd, (b) n1 even
Figure 4: Architecture of Step 2

Figure 4 covers two cases: (a) k2 = 2, 1, 5 (or 6, 7, 3); (b) k2 = 0, 4
Figure 5: Parallel architecture of the 8x8 2-D DCT
Figure 6: Folded architecture of the 8x8 2-D IDCT

Detailed description of the invention

Fast algorithm for the direct 2-D DCT

For convenience, and without loss of correctness, the scale-factor term of the NxN 2-D DCT formula (1) is ignored, so that (1) becomes

$$Y_{k_1,k_2} = \sum_{n_1=0}^{N-1}\sum_{n_2=0}^{N-1} x_{n_1,n_2}\cos\frac{2\pi(2n_1+1)k_1}{4N}\cos\frac{2\pi(2n_2+1)k_2}{4N} \tag{2}$$

Next, assume $N = 2^m$, $m = 0, 1, 2, \dots$ The input $x_{n_1,n_2}$ can be converted into $y_{n_1,n_2}$ by a suitable reordering:

$$y_{n_1,n_2} = \begin{cases}
x_{2n_1,\;2n_2}, & n_1 = 0,\dots,N/2-1;\ n_2 = 0,\dots,N/2-1\\
x_{2N-2n_1-1,\;2n_2}, & n_1 = N/2,\dots,N-1;\ n_2 = 0,\dots,N/2-1\\
x_{2n_1,\;2N-2n_2-1}, & n_1 = 0,\dots,N/2-1;\ n_2 = N/2,\dots,N-1\\
x_{2N-2n_1-1,\;2N-2n_2-1}, & n_1 = N/2,\dots,N-1;\ n_2 = N/2,\dots,N-1
\end{cases}$$

Equation (2) can therefore be written as

$$Y_{k_1,k_2} = \sum_{n_1=0}^{N-1}\sum_{n_2=0}^{N-1} y_{n_1,n_2}\cos\frac{2\pi(4n_1+1)k_1}{4N}\cos\frac{2\pi(4n_2+1)k_2}{4N} \tag{3}$$

Equation (3) can be computed quickly by means of (4) and (5):

$$U_{k_1,k_2} = \sum_{n_1=0}^{N-1}\sum_{n_2=0}^{N-1} y_{n_1,n_2}\,W_{4N}^{(4n_1+1)k_1+(4n_2+1)k_2},\qquad W_{4N} = e^{-j2\pi/4N} \tag{4}$$

$$Y_{k_1,k_2} = \tfrac{1}{2}\left[\operatorname{Re}\{U_{k_1,k_2}\} - \operatorname{Im}\{U_{N-k_1,k_2}\}\right],\qquad Y_{k_1,N-k_2} = -\tfrac{1}{2}\left[\operatorname{Im}\{U_{k_1,k_2}\} + \operatorname{Re}\{U_{N-k_1,k_2}\}\right] \tag{5}$$

Because of the symmetry in (5), it is not necessary to evaluate (4) for every pair of k1 and k2. In practice only all values of k1 and a subset of the values of k2 are needed, as long as every value of k2 can be expressed as either $k_2$ or $N-k_2$.
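As a numerical check of the reordering and of Eqs. (3) to (5), the following Python sketch evaluates the unnormalised 2-D DCT of Eq. (2) term by term and again through the complex sums U. It assumes $W_{4N} = e^{-j2\pi/4N}$ and uses the combination $\tfrac{1}{2}[\operatorname{Re}\,U_{k_1,k_2} - \operatorname{Im}\,U_{N-k_1,k_2}]$, which is what the product-to-sum identity gives under that assumption. The function names and the test block are illustrative and do not come from the patent.

```python
import cmath
import math

N = 8  # block size considered in the patent


def dct2_direct(x):
    """Unnormalised 2-D DCT, evaluated term by term as in Eq. (2)."""
    return [[sum(x[n1][n2]
                 * math.cos(2 * math.pi * (2 * n1 + 1) * k1 / (4 * N))
                 * math.cos(2 * math.pi * (2 * n2 + 1) * k2 / (4 * N))
                 for n1 in range(N) for n2 in range(N))
             for k2 in range(N)]
            for k1 in range(N)]


def source_index(n):
    """Index of x that the reordering places at position n of y."""
    return 2 * n if n < N // 2 else 2 * N - 2 * n - 1


def reorder(x):
    """Build y from x: even indices first, odd indices in reverse order."""
    return [[x[source_index(n1)][source_index(n2)] for n2 in range(N)]
            for n1 in range(N)]


def u_sum(y, k1, k2):
    """Complex sum U_{k1,k2} of Eq. (4), assuming W_4N = exp(-j*2*pi/(4N))."""
    return sum(y[n1][n2]
               * cmath.exp(-2j * math.pi
                           * ((4 * n1 + 1) * k1 + (4 * n2 + 1) * k2) / (4 * N))
               for n1 in range(N) for n2 in range(N))


def dct2_via_u(x):
    """Recover Y from U with the real/imaginary combination of Eq. (5)."""
    y = reorder(x)
    return [[0.5 * (u_sum(y, k1, k2).real - u_sum(y, N - k1, k2).imag)
             for k2 in range(N)]
            for k1 in range(N)]


# Arbitrary deterministic test block (hypothetical data).
x = [[(3 * n1 + 5 * n2 + n1 * n2) % 11 for n2 in range(N)] for n1 in range(N)]
ref = dct2_direct(x)
got = dct2_via_u(x)
max_err = max(abs(ref[k1][k2] - got[k1][k2])
              for k1 in range(N) for k2 in range(N))
print(max_err < 1e-9)
```

The sketch makes no attempt at speed; it only confirms that the complex formulation reproduces the cosine double sum.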

To simplify (4) further, $y_{n_1,n_2}$ is replaced by $y_{n_1,t}$ according to (6):

$$4n_2+1 = (4t+1)(4n_1+1) \bmod 4N,\qquad 0 \le n_1, n_2 \le N-1 \tag{6}$$

In (6), for a fixed value of n1 the mapping between n2 and t is one-to-one. For different values of n1, however, the mapping is different. Figure 2 shows the mapping from $x_{n_1,n_2}$ to $y_{n_1,t}$ when N = 8. Replacing $y_{n_1,n_2}$ in (4) by $y_{n_1,t}$, (4) can be rewritten as (7a) and (7b):

$$U_{k_1,k_2} = \sum_{n_1=0}^{N-1}\left[\sum_{t=0}^{N-1} y_{n_1,t}\,W_{4N}^{(4n_1+1)[k_1+(4t+1)k_2]}\right] \tag{7a}$$

$$U_{k_1,k_2} = \sum_{t=0}^{N-1}\left[\sum_{n_1=0}^{N-1} y_{n_1,t}\,W_{4N}^{(4n_1+1)[k_1+(4t+1)k_2]}\right] \tag{7b}$$

In (7b) the inner sum is very similar to an N-point 1-D DCT, except that the term $[k_1+(4t+1)k_2]$ is not restricted to the range 0 to N-1. To remove this difference, consider the relation

$$[k_1+(4t+1)k_2] = aN + b,\qquad 0 \le b \le N-1$$

where a is an integer and b is an integer between 0 and N-1. Substituting this relation into (7b) gives (8):

$$U_{k_1,k_2} = \sum_{t=0}^{N-1} (-j)^a\left[\sum_{n_1=0}^{N-1} y_{n_1,t}\,W_{4N}^{(4n_1+1)b}\right] \tag{8}$$

If the result of the sum over n1 is denoted $U'_{t,b}$, then

$$U'_{t,b} = \sum_{n_1=0}^{N-1} y_{n_1,t}\,W_{4N}^{(4n_1+1)b} = \sum_{n_1=0}^{N-1} y_{n_1,t}\left[\cos\frac{2\pi(4n_1+1)b}{4N} - j\sin\frac{2\pi(4n_1+1)b}{4N}\right]$$

$U'_{t,b}$ is a complex number. From the relation above, however, its real part is simply an N-point 1-D DCT, and its imaginary part is related to the real part by

$$\operatorname{Im}\{U'_{t,b}\} = -\operatorname{Re}\{U'_{t,N-b}\}$$

Hence $U'_{t,b}$ can be obtained with a single N-point 1-D DCT. The remaining sum over t in (8) requires only additions and subtractions and no multiplications. The whole algorithm is summarized as follows:

1. Compute the DCT of N groups of N-point data, $y_{n_1,t}$: $\{y_{0,t}, y_{1,t}, \dots, y_{N-1,t}\}$, $0 \le t \le N-1$, to obtain $U'_{t,b}$.
2. Use (8) to compute the necessary subset of the $U_{k_1,k_2}$ values.
3. Use (5) to compute the $Y_{k_1,k_2}$ values.
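The three steps above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: it assumes $W_{4N} = e^{-j2\pi/4N}$, takes Eq. (5) in the form $\tfrac{1}{2}[\operatorname{Re}\,U_{k_1,k_2} - \operatorname{Im}\,U_{N-k_1,k_2}]$, evaluates every k2 rather than only the necessary subset, and uses a plain cosine sum where real hardware would use a fast 1-D DCT. All function names are hypothetical.

```python
import math

N = 8


def source_index(n):
    """Reordering of x into y: even indices first, then odd ones reversed."""
    return 2 * n if n < N // 2 else 2 * N - 2 * n - 1


def n2_of(n1, t):
    """Solve Eq. (6) for n2: 4*n2 + 1 = (4t+1)(4*n1+1) mod 4N."""
    return (((4 * t + 1) * (4 * n1 + 1)) % (4 * N) - 1) // 4


def cosine_sum(v, b):
    """Re{U'_{t,b}}: an N-point 1-D DCT of the reordered sequence v."""
    return sum(v[n] * math.cos(2 * math.pi * (4 * n + 1) * b / (4 * N))
               for n in range(N))


def fast_dct2(x):
    y = [[x[source_index(n1)][source_index(n2)] for n2 in range(N)]
         for n1 in range(N)]
    yt = [[y[n1][n2_of(n1, t)] for t in range(N)] for n1 in range(N)]

    # Step 1: N one-dimensional N-point DCTs, one for each t.
    c = [[cosine_sum([yt[n1][t] for n1 in range(N)], b) for b in range(N)]
         for t in range(N)]

    def u_prime(t, b):
        # Im{U'_{t,b}} = -Re{U'_{t,N-b}}; the imaginary part is 0 for b = 0.
        return complex(c[t][b], -c[t][N - b] if b else 0.0)

    # Step 2: Eq. (8); multiplying by (-j)^a only swaps and negates parts.
    rot = (1, -1j, -1, 1j)

    def u(k1, k2):
        total = 0j
        for t in range(N):
            a, b = divmod(k1 + (4 * t + 1) * k2, N)
            total += rot[a % 4] * u_prime(t, b)
        return total

    # Step 3: Eq. (5).
    return [[0.5 * (u(k1, k2).real - u(N - k1, k2).imag) for k2 in range(N)]
            for k1 in range(N)]


def dct2_direct(x):
    """Reference: unnormalised 2-D DCT of Eq. (2)."""
    return [[sum(x[n1][n2]
                 * math.cos(2 * math.pi * (2 * n1 + 1) * k1 / (4 * N))
                 * math.cos(2 * math.pi * (2 * n2 + 1) * k2 / (4 * N))
                 for n1 in range(N) for n2 in range(N))
             for k2 in range(N)]
            for k1 in range(N)]


# Arbitrary deterministic test block (hypothetical data).
x = [[(3 * n1 + 5 * n2 + n1 * n2) % 11 for n2 in range(N)] for n1 in range(N)]
ref, got = dct2_direct(x), fast_dct2(x)
max_err = max(abs(ref[i][j] - got[i][j]) for i in range(N) for j in range(N))
print(max_err < 1e-9)
```

Only Step 1 involves cosine multiplications, and it consists of N one-dimensional transforms, which is where the saving over the 2N transforms of the row-column method comes from.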
In total, the algorithm requires only N groups of N-point DCTs plus some additions and subtractions. The conventional row-column method requires 2N groups of N-point DCTs, so the amount of computation is reduced considerably.

Architecture of the direct 2-D DCT

Although the algorithm above needs fewer operations, its structure is rather irregular and unsuitable for hardware implementation, so a more regular circuit is proposed for it. For the 8x8 2-D DCT, a regular parallel architecture is proposed first. This parallel architecture is highly modular and needs only a low operating frequency, so it is well suited to low-power design. The parallel architecture is then folded appropriately to reduce its area. The resulting folded architecture is suitable for use in the various video standards.

Parallel architecture

When N = 8, only the $U_{k_1,k_2}$ values for k2 = 0, 1, 2, 4, 5 need to be computed. Equation (7a) is therefore expanded according to the value of k2 as follows:

$$U_{k_1,0} = \sum_{n_1=0}^{7} U_{n_1,0}\,W_{32}^{(4n_1+1)k_1},\qquad U_{n_1,0} = \sum_{t=0}^{7} y_{n_1,t} \tag{9a}$$

$$U_{k_1,4} = \sum_{n_1=0}^{7} U_{n_1,4}\,W_{32}^{(4n_1+1)(k_1+4)},\qquad U_{n_1,4} = \sum_{t=0}^{7} (-1)^t\,y_{n_1,t} \tag{9b}$$

$$U_{k_1,2} = \sum_{n_1=0}^{7} U_{n_1,2}\,W_{32}^{(4n_1+1)(k_1+2)},\qquad U_{n_1,2} = \sum_{t=0}^{7} (-j)^t\,y_{n_1,t} \tag{9c}$$

$$U_{k_1,1} = \sum_{n_1=0}^{7} U_{n_1,1}\,W_{32}^{(4n_1+1)(k_1+1)},\qquad U_{n_1,1} = \sum_{t=0}^{7} W_{8}^{(4n_1+1)t}\,y_{n_1,t} \tag{9d}$$

$$U_{k_1,5} = \sum_{n_1=0}^{7} U_{n_1,5}\,W_{32}^{(4n_1+1)(k_1+5)},\qquad U_{n_1,5} = \sum_{t=0}^{7} W_{8}^{5(4n_1+1)t}\,y_{n_1,t} \tag{9e}$$

Based on (9) and (5), the computation is divided into two steps:

Step 1: Compute $U_{n_1,0}$, $U_{n_1,4}$, $U_{n_1,2}$, $U_{n_1,1}$ and $U_{n_1,5}$ according to (9). The architecture of this step is shown in Figure 3. Because $U_{n_1,1}$ and $U_{n_1,5}$ contain a factor that depends on the parity of n1, the architecture differs slightly between odd and even n1.

Step 2: Compute $U_{k_1,k_2}$ for k2 = 0, 4, 2, 1, 5 according to (9), then compute $Y_{k_1,k_2}$ according to (5). For k2 = 2, 1, 5 (or 6, 7, 3) the architecture is shown in Figure 4(a); for k2 = 0, 4 it is shown in Figure 4(b). The difference is that Figure 4(b) needs one additional stage of additions.

Figure 5 shows the proposed parallel DCT architecture. The whole architecture is quite regular and modular. Global interconnection is needed only between the two stages. This routing problem still has to be dealt with, although that part is also regular. Compared with the 2-D DCT architectures proposed by P. Duhamel et al. in 1990 and by N. I. Cho et al. in 1991, the parallel architecture is clearly more regular and has fewer routing problems.
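The two-step split of the parallel architecture can be checked numerically. The sketch below assumes $W_n = e^{-j2\pi/n}$ and the factorisation $(4n_1+1)[k_1+(4t+1)k_2] = (4n_1+1)(k_1+k_2) + 4(4n_1+1)\,t\,k_2$, which separates an inner sum over t (Step 1) from an outer sum over n1 (Step 2). The helper names and the test data are hypothetical.

```python
import cmath
import math

N = 8


def w(n, e):
    """Twiddle factor W_n^e, assuming W_n = exp(-j*2*pi/n)."""
    return cmath.exp(-2j * math.pi * e / n)


def step1(yt_row, n1, k2):
    """Step 1: U_{n1,k2} = sum_t y_{n1,t} * W_N^{(4*n1+1)*t*k2}."""
    return sum(yt_row[t] * w(N, (4 * n1 + 1) * t * k2) for t in range(N))


def step2(u_n1, k1, k2):
    """Step 2: U_{k1,k2} = sum_n1 U_{n1,k2} * W_4N^{(4*n1+1)*(k1+k2)}."""
    return sum(u_n1[n1] * w(4 * N, (4 * n1 + 1) * (k1 + k2))
               for n1 in range(N))


def u_reference(yt, k1, k2):
    """Unfactored double sum of Eq. (7)."""
    return sum(yt[n1][t] * w(4 * N, (4 * n1 + 1) * (k1 + (4 * t + 1) * k2))
               for n1 in range(N) for t in range(N))


# Arbitrary stand-in for the remapped input y_{n1,t} (hypothetical data).
yt = [[(7 * n1 + 3 * t + n1 * t) % 13 for t in range(N)] for n1 in range(N)]

max_err = 0.0
for k2 in (0, 4, 2, 1, 5):
    u_n1 = [step1(yt[n1], n1, k2) for n1 in range(N)]
    for k1 in range(N):
        err = abs(step2(u_n1, k1, k2) - u_reference(yt, k1, k2))
        max_err = max(max_err, err)

# k2 = 2, 1, 5 also serve their partners N - k2 = 6, 7, 3.
covered = sorted({0, 4, 2, 1, 5} | {N - k2 for k2 in (2, 1, 5)})
print(max_err < 1e-9, covered)
```

The `covered` list shows why five values of k2 are enough for all eight output columns.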

Folded architecture

The parallel architecture above contains many identical sub-designs. A single module can therefore be built and shared among them by time sharing. This is something that the 2-D DCT designs published by P. Duhamel et al. (ICASSP '90, vol. 4, pp. 1515-1518, 1990) and by N. I. Cho et al. (IEEE Trans. Circuits Syst., vol. CAS-38, pp. 297-305, 1991) cannot do. In the following, the 8x8 2-D IDCT is used as an example to derive a folded architecture.

Reversing the data flow of the 8x8 2-D DCT architecture of Figure 5 yields the 8x8 2-D IDCT. The original parallel architecture is then folded to one quarter of its size, at which point three things must be noted:

1. When $Y_{k_1,0}$ and $Y_{k_1,4}$ are input, one additional stage of additions is needed; for the other inputs it is not. A multiplexer is therefore required to select between the two cases.

2. The computation from $U_{n_1,k_2}$ to $U_{k_1,k_2}$ in Figure 5 can be expressed as in (10):

$$U_{k_1,k_2} = \sum_{n_1=0}^{7} U_{n_1,k_2}\,W_{32}^{(4n_1+1)(k_1+k_2)},\qquad k_2 = 1, 2, 5 \tag{10}$$

Different values of k1 and k2 correspond to different values of a and b. The relationship between k1, k2 and a, b is listed in Table 1. Equation (10) can therefore be implemented as follows. First, compute the sums $\sum_{n_1=0}^{7} U_{n_1,k_2}\,W_{32}^{(4n_1+1)b}$. Next, multiply some of the outputs by (-j) according to the a column of Table 1. Finally, apply a circular shift to the 8 output values according to the b column of Table 1. The inverse operation, from $U_{k_1,k_2}$ back to $U_{n_1,k_2}$, can be implemented as a circular shift, a multiplication by (-j) and a summation, in that order.

3. The data interconnection between Step 1 and Step 2 is regular and can therefore be implemented with a transpose memory.

Figure 6 shows the folded architecture of the 8x8 2-D IDCT. The data is input in four groups and, after computation, is output in four groups. Substage 1 and Interconnection in the figure correspond to the architecture required by the row-column method. The remaining parts are butterfly adders and constant multipliers. Compared with the row-column method, the added hardware area is small, but the overall throughput is doubled.
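The multiply-and-shift implementation of Eq. (10) can be illustrated with a short Python sketch. It assumes $W_{32} = e^{-j2\pi/32}$ and $k_1 + k_2 = 8a + b$, so that $W_{32}^{(4n_1+1)(8a+b)} = (-j)^a\,W_{32}^{(4n_1+1)b}$. The function names and the input vector are hypothetical, and the shift and the (-j) scaling are merged into one line for brevity.

```python
import cmath
import math

N = 8


def w32(e):
    """Twiddle factor W_32^e, assuming W_32 = exp(-j*2*pi/32)."""
    return cmath.exp(-2j * math.pi * e / (4 * N))


def folded_step(u_n1, k2):
    """Eq. (10) as eight b-indexed sums, a (-j) scaling and a circular shift."""
    q = [sum(u_n1[n1] * w32((4 * n1 + 1) * b) for n1 in range(N))
         for b in range(N)]
    # Entries that wrap around (a = 1 in Table 1) are multiplied by -j;
    # the others (a = 0) pass through unchanged.
    return q[k2:] + [-1j * v for v in q[:k2]]


def direct_step(u_n1, k2):
    """Eq. (10) evaluated literally for k1 = 0..7."""
    return [sum(u_n1[n1] * w32((4 * n1 + 1) * (k1 + k2)) for n1 in range(N))
            for k1 in range(N)]


# Arbitrary complex input standing in for U_{n1,k2} (hypothetical data).
u_n1 = [complex(n1 + 1, 2 - n1) for n1 in range(N)]

max_err = 0.0
for k2 in (1, 2, 5):
    a_out = folded_step(u_n1, k2)
    b_out = direct_step(u_n1, k2)
    max_err = max(max_err, max(abs(p - r) for p, r in zip(a_out, b_out)))
print(max_err < 1e-9)
```

Because the eight b-indexed sums do not depend on k2, the same summation hardware can serve every k2, with only the shift amount and the (-j) pattern changing.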

Table 1: Relationship between k1, k2 and a, b

     |    k2 = 0     |    k2 = 1     |    k2 = 2     |    k2 = 5
 k1  | k1+k2  a  b   | k1+k2  a  b   | k1+k2  a  b   | k1+k2  a  b
 0   |   0    0  0   |   1    0  1   |   2    0  2   |   5    0  5
 1   |   1    0  1   |   2    0  2   |   3    0  3   |   6    0  6
 2   |   2    0  2   |   3    0  3   |   4    0  4   |   7    0  7
 3   |   3    0  3   |   4    0  4   |   5    0  5   |   8    1  0
 4   |   4    0  4   |   5    0  5   |   6    0  6   |   9    1  1
 5   |   5    0  5   |   6    0  6   |   7    0  7   |  10    1  2
 6   |   6    0  6   |   7    0  7   |   8    1  0   |  11    1  3
 7   |   7    0  7   |   8    1  0   |   9    1  1   |  12    1  4
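Every row of Table 1 satisfies $k_1 + k_2 = 8a + b$ with $0 \le b \le 7$, so the table can be regenerated with integer division. The short Python sketch below does this; `table_row` is a hypothetical helper name.

```python
N = 8


def table_row(k1, k2_values=(0, 1, 2, 5)):
    """Return (k1 + k2, a, b) for each k2, where k1 + k2 = a*N + b."""
    row = []
    for k2 in k2_values:
        a, b = divmod(k1 + k2, N)
        row.append((k1 + k2, a, b))
    return row


for k1 in range(N):
    print(k1, table_row(k1))
```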


Claims

1. A method of converting video data, comprising:
(a) converting a series of 8x8 blocks of original digital signal data, by means of the cosine transform, into 8x8 blocks of cosine-transformed digital data for transmission;
(b) converting the two-dimensional original digital signal data into one-dimensional data, which is rotated in a word-parallel, bit-serial manner;
(c) decomposing the complex DCT into two 8-point one-dimensional discrete cosine transforms (1-D DCT) and eight butterfly additions;
(d) combining the re-ordering circuit required for the output of the cosine-transformed digital data with the subsequent zig-zag circuit.

2. The method as described in item 1 of the scope of patent application, wherein the data is input from the original output terminal and the DCT is changed to an IDCT, so that a series of cosine-transformed digital data is converted, by means of the inverse cosine transform, into original digital signal data.

3. The method as described in item 1 of the scope of patent application, wherein the computing architecture is reduced to one quarter according to the mapping from $x_{n_1,n_2}$ to $y_{n_1,t}$, and wherein:
(a) the digital signal data is divided into four groups that are input sequentially, and the cosine-transformed digital data is likewise divided into four groups that are output sequentially;
(b) the global interconnection of the architecture is replaced by four 4x4 transpose memories.

4. The method as described in item 3 of the scope of patent application, wherein the data is input from the original output terminal and the DCT is changed to an IDCT, so that a series of cosine-transformed digital data is converted, by means of the inverse cosine transform, into original digital signal data.