TW299423B - Cyclic multiplication/addition processor - Google Patents

Cyclic multiplication/addition processor

Info

Publication number
TW299423B
Authority
TW
Taiwan
Prior art keywords
adder
clock signal
output
clock
shift register
Prior art date
Application number
TW82105070A
Other languages
Chinese (zh)
Inventor
Bor-Chuan Hwang
Jiing-Shyang Yang
Original Assignee
Ind Tech Res Inst
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ind Tech Res Inst
Priority to TW82105070A
Application granted
Publication of TW299423B

Landscapes

  • Complex Calculations (AREA)

Abstract

A multiply-add processor multiplies an input variable X by a coefficient C pre-stored in the processor, adds an input value Zi, and outputs the multiply-add value Zo. It comprises: (1) a time-segment controller, which receives a system clock signal and accelerates it into an operation clock signal running at N times the system clock rate, together with a corresponding load clock signal; (2) a load-and-shift register with modified Booth decoder, which loads the input variable X according to the load clock signal and, on each operation clock, takes three successive bits of X and computes the 3-bit Booth operation code for that operation clock; (3) a modified Booth selector, which receives the Booth operation code and the pre-stored coefficient C and computes the temporary partial product for that operation clock; (4) an adder, which receives the temporary partial product and the most significant bit of the Booth operation code and, after addition, outputs the partial sum for that operation clock; and (5) a dual-port load shift register with two inputs, Zi and the partial sum output by the adder, which selects its input according to the load clock signal and operates on the operation clock signal. (6) The adder and the dual-port load shift register together perform a cyclic addition. When the load clock signal is active, the register loads Zi and outputs it to the adder. When the load clock signal is inactive, the register loads the partial sum output by the adder, retains the two least significant bits as final output bits, and passes the remaining bits back to the adder through hardwired connections; the effect is the same as setting aside the two least significant bits and shifting the partial sum right by two bits. On every operation clock the adder adds the output of the dual-port load shift register, the temporary partial product and the most significant bit of the Booth operation code to produce the partial sum for that operation clock, which is fed back into the register. Therefore, when one system clock ends, the dual-port load shift register is ready to output the multiply-add value.
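The cyclic addition summarized in the abstract can be sketched in software. The Python model below is only a behavioural illustration under stated assumptions, not the patented circuit: X is taken to be an 8-bit two's-complement value, Python integers stand in for the adder (so no fixed adder width is modelled), and the names `booth_digits` and `cyclic_mac` are hypothetical.

```python
def booth_digits(x, bits=8):
    """Radix-4 modified Booth digits of a two's-complement multiplier.

    Returns one digit in {-2, -1, 0, 1, 2} per operation clock,
    least significant group first. Assumes an even bit count.
    """
    assert bits % 2 == 0, "sketch assumes an even number of multiplier bits"
    x &= (1 << bits) - 1      # keep only the raw bit pattern of X
    padded = x << 1           # implicit X(-1) = 0 below bit 0
    digits = []
    for i in range(0, bits, 2):
        low = (padded >> i) & 1          # X(2i-1)
        mid = (padded >> (i + 1)) & 1    # X(2i)
        high = (padded >> (i + 2)) & 1   # X(2i+1)
        digits.append(low + mid - 2 * high)
    return digits


def cyclic_mac(x, c, zi, bits=8):
    """Compute Zo = C*X + Zi with one addition per operation clock."""
    zr = zi        # load clock active: the register presents Zi to the adder
    kept = 0       # low-order bits already retained as final output bits
    for step, digit in enumerate(booth_digits(x, bits)):
        sa = zr + digit * c                   # partial sum of this operation clock
        kept |= (sa & 0b11) << (2 * step)     # two LSBs become final output bits
        zr = sa >> 2                          # remaining bits are fed back to the adder
    return (zr << bits) | kept


print(cyclic_mac(100, -37, 1000))  # -2700, i.e. -37 * 100 + 1000
```

With an 8-bit X the loop runs four times, which corresponds to the four operation clocks inside one system clock. The arithmetic right shift keeps the sign of the partial sum, so negative coefficients and inputs are handled in this model.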

Description

The present invention relates to a multiply-add processor for digital signal processing systems, and in particular to a computing architecture that combines the modified Booth algorithm with cyclic addition in order to process signals quickly and reduce the size of the hardware.

The multiply-add processor is a basic element of a digital signal processing system and is used especially often in digital filter designs. To perform the multiply and add functions, a conventional multiplier-adder usually has a multiplier and an adder as separate structures, so that multiplication and addition are two separate, independent operations. Figure 1 shows such a conventional multiplier-adder 10, which comprises a multiplier 12 and an adder 14. The multiplier 12 first multiplies the input data Xi 16 by the coefficient Cr 18. The resulting product is fed to the adder 14 and added to the adder input data A1 20 to obtain a sum, which is delayed by one period in the delay unit 22 and then passed to the adder of the next-stage multiplier-adder.

Because the multiplier-adder 10 of Figure 1 is slow and its hardware is large, the multiplier of Figure 1 is commonly improved by using cyclic accumulation. Figure 2 shows such an improved multiplier-adder 40. The multiplier-adder 40 operates cyclically by combining a decode adder 42 with a latch (D flip-flops) 44. On each clock the decode adder 42 takes in one bit of Cr and combines it with X to obtain a partial product, which is added to the partial sum of the previous clock held in the latch to form the partial sum of the current clock. This partial sum is stored in the latch and, after an appropriate shift, fed back for the next clock. In this way the product Cr*X is obtained after as many clocks as Cr has bits.

Although the architecture of Figure 2 solves the problem of the multiplier's hardware cost, its computation time is the product of the system clock period ck and the number of bits of the multiplier Cr. For high-speed operation the adder 42 of Figure 2 can be a carry-save adder (CSA), but this architecture needs many latches to hold the carry and sum values. Another way to improve speed is to use a carry look-ahead adder (CLA), but such an adder is too large in hardware to suit modern miniaturized electronic components. The prior art described above therefore still suffers from slow computation, oversized hardware, or too many components.

The object of the present invention is to overcome these difficulties and limitations of the prior art, improving the multiplier-adder in both speed and hardware so that it better meets the requirements of modern microelectronic components.

The invention uses the modified Booth algorithm to simplify the multiplication and a cyclic-accumulation architecture to save hardware, together with an operation clock running at several times the system clock rate to shorten the computation time, so that the multiply-add operation is completed within one system clock. This gives the most efficient use of both timing and hardware.

When the modified Booth algorithm multiplies an m-bit number by an n-bit number, the operand with fewer bits is taken as the multiplier X and the operand with more bits as the multiplicand C. Let t be the smaller of m and n, and let k be half of t (when t is odd, half of t plus 1). The product C*X then reduces to the sum of k partial products P(2i), each appropriately shifted. The partial products P(2i) are given in Table 1, where i runs over 0, 1, 2, ..., k-1.

Table 1. Modified Booth algorithm

X(2i+1)  X(2i)  X(2i-1)  Partial product P(2i)
0        0      0        0
0        0      1        C
0        1      0        C
0        1      1        2C
1        0      0        -2C
1        0      1        -C
1        1      0        -C
1        1      1        0

The multiply-add processor of the invention computes the multiplier X times the multiplicand C, plus Zi, to produce Zo. Because the modified Booth algorithm is used, the product C*X reduces to the sum of k appropriately shifted partial products P(2i). To obtain the best timing from the modified Booth algorithm, the processor includes a time-segment controller that accelerates the received system clock signal into a clock signal k times faster, referred to here as the operation clock signal. As long as the operation clock rate remains within the processing rate of the adder, the k cyclic additions performed jointly by the adder and the dual-port load shift register complete the multiply-add operation within one system clock. The hardware requires only one time-segment controller, one load-and-shift register with modified Booth decoder, one modified Booth selector, one adder, and one dual-port load shift register, which greatly reduces circuit area. The invention therefore both shrinks the hardware and increases the computation speed.

The invention uses a time-segment controller to accelerate the system clock into an operation clock signal and a load clock signal, which are sent to the load-and-shift register with modified Booth decoder and to the dual-port load shift register to control the operation of the processor. The operation clock signal runs at k times the system clock signal. Before the first operation clock of a system clock, the load clock signal becomes active so that the multiplier X and the input value Zi from the previous stage are loaded. During the other operation clocks of the system clock, the load clock signal is inactive.

The load-and-shift register with modified Booth decoder selects three bits of the multiplier X on each operation clock and decodes them into the 3-bit Booth operation code S (S2 S1 S0) according to Table 2. The modified Booth selector takes the multiplicand C and the Booth operation code S and selects the appropriate temporary partial product Cs according to Table 2, where Cs is one of 0, C, 2C, C' and 2C', and C' denotes the one's complement of C. The adder adds Cs, Zr and S2 to produce the partial sum Sa of the current operation clock, where Zr is the output of the dual-port load shift register. On the first operation clock of a system clock, Zr is Zi. On the other operation clocks, Zr is the partial sum Sa shifted right by two bits.

The dual-port load shift register has two inputs, Sa and Zi. When the load clock signal is active it loads Zi and outputs Zr equal to Zi. Otherwise it loads Sa, and the output Zr consists of the bits of Sa that remain after the two least significant bits are removed. This is done by hardwired connections and is equivalent to Sa shifted right by two bits (the two least significant bits of Sa are retained as part of the final output). The output Zr is passed to the adder for the accumulation of the next operation clock. The adder and the dual-port load shift register thus form a cyclic addition, and when the k-th operation clock ends, that is, when one system clock ends, the dual-port load shift register can output the multiply-add value Zo, equal to C times X plus Zi.

Table 2. Booth operation code lookup

X(2i+1)  X(2i)  X(2i-1)  S2  S1  S0  Cs
0        0      0        0   0   0   0
0        0      1        0   0   1   C
0        1      0        0   1   0   C
0        1      1        0   1   1   2C
1        0      0        1   1   1   2C'
1        0      1        1   1   0   C'
1        1      0        1   0   1   C'
1        1      1        1   0   0   0

Figure 3 shows an 8-bit by 8-bit multiply-add processor 100 designed according to the invention. It computes the input value X 102 multiplied by the coefficient C 104, plus the input value Zi 106 passed from the previous stage, to produce Zo. With the modified Booth algorithm, the 8-bit X requires only four additions of appropriately shifted partial products. To obtain the best timing from the modified Booth algorithm, the multiply-add processor 100 includes a time-segment controller 108, which receives the system clock signal CK 110 and generates an operation clock signal CKx4 running at four times the system clock, together with a load clock signal CKLoad.

The load clock signal CKLoad is sent from the controller 108 to the load-and-shift register with modified Booth decoder 112 and to the dual-port load shift register 116, so that the input value X 102 and the previous-stage input value Zi 106 are loaded into these two registers at the first operation clock at the start of each system clock CK. The operation clock CKx4 is sent to the load-and-shift register with modified Booth decoder 112 and to the dual-port load shift register 116 to control the operation of the processor. Following the operation clock CKx4, the decoder 112 takes three bits of X at a time (in order X1 X0 0, X3 X2 X1, X5 X4 X3 and X7 X6 X5) and decodes them into the 3-bit Booth operation code S (S2 S1 S0) according to Table 2. S and C are fed to the modified Booth selector 114, which produces the temporary partial product Cs according to Table 2. Cs, S2 and the output Zr of the dual-port load shift register are fed to an 18-bit adder 118, which yields the partial sum Sa of that operation clock. Sa is fed back into the dual-port load shift register 116.

The dual-port load shift register 116 has two inputs, Zi and Sa, selected by the load clock signal CKLoad. When CKLoad is active, the output Zr is Zi. When CKLoad is inactive, the output Zr is Sa shifted right by two bits. Zr is fed back to the 18-bit adder 118, where it is added to the temporary partial product Cs and the most significant bit S2 of the Booth operation code to produce the partial sum Sa of that operation clock. The 18-bit adder 118 and the dual-port load shift register 116 thus combine into a cyclic-addition architecture: every quarter of a system clock CK, that is, on every operation clock CKx4, an output Zr from the dual-port load shift register 116 is fed back to the 18-bit adder 118 and added to Cs and S2. After four such cyclic additions, at the end of the fourth operation clock CKx4 and therefore at the end of one system clock, the multiply-add value Zo, equal to Zi plus C*X, is produced and output to the multiplier-adder of the next stage.

Figure 4 is a timing reference diagram for the signals CK, CKx4, CKLoad, X, Zi, S, Cs, Zr, Sa and Zo. In it, S(1) denotes the Booth operation code of the first operation clock, Cs(1) the temporary partial product of the first operation clock, Sa(1) the partial sum of the first operation clock, and S.Sa(1) the partial sum of the first operation clock after the two-bit shift. This design is only one reference embodiment of the invention and does not limit its scope.

Brief description of the drawings:

Figure 1 is a block diagram of a conventional multiplier-adder.
Figure 2 is a block diagram of a conventional improved Booth multiplier.
Figure 3 is an embodiment of an 8-bit by 8-bit multiply-add processor according to the invention.
Figure 4 is a signal timing reference diagram for the 8-bit by 8-bit multiply-add processor embodiment of the invention.
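The relationship between Table 1 and Table 2 can be checked with a short sketch. The selector outputs either a magnitude or its one's complement, and adding the most significant code bit S2 in the adder turns the one's complement into a true negative. The Python below is a hedged illustration only: the names `BOOTH_CODE`, `TABLE1_FACTOR` and `select_cs` are hypothetical, and it assumes that for code 100 (listed in Table 2 with Cs equal to 0) the selector actually emits the one's complement of 0, so that adding S2 cancels it. The source does not state this detail.

```python
# Table 2: (X(2i+1), X(2i), X(2i-1)) -> Booth operation code (S2, S1, S0)
BOOTH_CODE = {
    (0, 0, 0): (0, 0, 0),
    (0, 0, 1): (0, 0, 1),
    (0, 1, 0): (0, 1, 0),
    (0, 1, 1): (0, 1, 1),
    (1, 0, 0): (1, 1, 1),
    (1, 0, 1): (1, 1, 0),
    (1, 1, 0): (1, 0, 1),
    (1, 1, 1): (1, 0, 0),
}

# Table 1: the same three bits -> signed multiple of C in the partial product
TABLE1_FACTOR = {
    (0, 0, 0): 0, (0, 0, 1): 1, (0, 1, 0): 1, (0, 1, 1): 2,
    (1, 0, 0): -2, (1, 0, 1): -1, (1, 1, 0): -1, (1, 1, 1): 0,
}


def select_cs(code, c):
    """Temporary partial product Cs for a Booth operation code.

    S1 S0 pick the magnitude (0, C or 2C); S2 requests the one's
    complement. Assumption: code 100 yields the complement of 0.
    """
    s2, s1, s0 = code
    magnitude = {(0, 0): 0, (0, 1): c, (1, 0): c, (1, 1): 2 * c}[(s1, s0)]
    return ~magnitude if s2 else magnitude   # ~m is the one's complement of m


c = 5
for bits, code in BOOTH_CODE.items():
    cs = select_cs(code, c)
    # Adding S2 as a carry-in completes the two's-complement negation.
    assert cs + code[0] == TABLE1_FACTOR[bits] * c
```

Splitting the negation this way means the selector needs only inverters, while the extra 1 is absorbed by the adder that is already present in the loop.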

Claims (1)

Scope of patent application

1. A multiply-add processing device capable of, within one system clock, multiplying an input variable X by a coefficient C pre-stored in the processing device and adding an input value Zi to output a multiply-add value Zo, the multiply-add processing device comprising:
a time-segment controller, which receives a system clock signal and accelerates the system clock signal into an operation clock signal at N times the system clock rate, together with a corresponding load clock signal;
a load-and-shift register with modified Booth decoding device, which loads the input variable X according to the load clock signal and, according to the operation clock signal, takes three bits of the variable X on each operation clock and computes the 3-bit Booth operation code of that operation clock;
a modified Booth selector, which receives the Booth operation code and the pre-stored coefficient C and computes the temporary partial product of that operation clock;
an adder, which receives the temporary partial product and the most significant bit of the Booth operation code and, after addition, outputs the partial sum of that operation clock; and
a dual-port load shift register having two inputs, one being Zi and the other the partial sum output by the adder, which selects its input signal according to the load clock signal and operates according to the operation clock signal;
wherein the adder and the dual-port load shift register form a cyclic addition: when the load clock signal is active, the dual-port load shift register loads Zi and outputs it to the adder; when the load clock signal is inactive, the dual-port load shift register loads the partial sum output by the adder, retains the two least significant bits as final output bits, and cyclically outputs the remaining bits to the adder through hardwired connections, the effect being the same as setting aside the two least significant bits and shifting the partial sum right by two bits before outputting it to the adder; on each operation clock the adder adds the output of the dual-port load shift register, the temporary partial product and the most significant bit of the Booth operation code to produce the partial sum of that operation clock, which is fed back into the dual-port load shift register; so that, when one system clock ends, the dual-port load shift register can output the multiply-add value.

2. The multiply-add processing device of claim 1, wherein the load-and-shift register with modified Booth decoding device processes three bits of the input variable X on each operation clock and decodes the 3-bit Booth operation code of that operation clock according to Table 2.

3. The multiply-add processing device of claim 1, wherein, for the operation clock at N times the system clock generated by the time-segment controller, N is half the number of bits of the input variable X when that number is even, and half of the number of bits plus 1 when that number is odd.

4. The multiply-add processing device of claim 1, wherein the load clock signal is active before the first operation clock within a system clock and inactive during the other operation clocks within the system clock.
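The clock relationships in the last two claims reduce to a small calculation. The Python sketch below is illustrative only: the function names are hypothetical, and the load clock is simplified to a flag that is set for the first operation-clock slot of each system clock, whereas the claim states that it is active before the first operation clock.

```python
def operation_clock_multiple(x_bits):
    """N: half the bit count of X, or half of (bit count + 1) when it is odd."""
    return (x_bits + 1) // 2


def load_clock_schedule(x_bits):
    """One flag per operation clock in a system clock: True where X and Zi are loaded."""
    n = operation_clock_multiple(x_bits)
    return [slot == 0 for slot in range(n)]


print(operation_clock_multiple(8))   # 4
print(operation_clock_multiple(7))   # 4
print(load_clock_schedule(8))        # [True, False, False, False]
```

For the 8-bit embodiment this gives N equal to 4, matching the CKx4 operation clock of the description.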
TW82105070A 1993-06-25 1993-06-25 Cyclic multiplication/addition processor TW299423B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW82105070A TW299423B (en) 1993-06-25 1993-06-25 Cyclic multiplication/addition processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW82105070A TW299423B (en) 1993-06-25 1993-06-25 Cyclic multiplication/addition processor

Publications (1)

Publication Number Publication Date
TW299423B true TW299423B (en) 1997-03-01

Family

ID=51565558

Family Applications (1)

Application Number Title Priority Date Filing Date
TW82105070A TW299423B (en) 1993-06-25 1993-06-25 Cyclic multiplication/addition processor

Country Status (1)

Country Link
TW (1) TW299423B (en)

Similar Documents

Publication Publication Date Title
TW448400B (en) Processor which can favorably execute a rounding process composed of positive conversion saturation calculation processing
US7725520B2 (en) Processor
TW378294B (en) Data processing unit and microprosessor
JPH053614B2 (en)
TW299423B (en) Cyclic multiplication/addition processor
US6230178B1 (en) Method for the production of an error correction parameter associated with the implementation of a modular operation according to the Montgomery method
JP3222313B2 (en) Arithmetic device and arithmetic method
JPS5841532B2 (en) Sekiwa Keisan Cairo
CN116781041B (en) Multi-rate conversion filter with high resource utilization rate
JP2734438B2 (en) Multiplier
JP3123060B2 (en) Digital arithmetic circuit
US5646874A (en) Multiplication/multiplication-accumulation method and computing device
WO2008077803A1 (en) Simd processor with reduction unit
JP2864598B2 (en) Digital arithmetic circuit
JPS6259828B2 (en)
JP3456450B2 (en) Fixed point multiplier and method
JPH08335167A (en) Electronic component capable of especially performing division of two numbers to base four
JP3851024B2 (en) Multiplier
SU1437857A1 (en) Device for dividing binary numbers in auxiliary code
JP3205020B2 (en) Arithmetic unit
JPH10260958A (en) Address generating circuit
JPH0863585A (en) Parallel processor
JP3526511B2 (en) Arithmetic device and arithmetic method
JPH05150951A (en) Division processing system
JPH0816366A (en) Divider and dividing method for the same