TW454144B

TW454144B - Method and apparatus for reducing the delay in generating saturation flags in an ALU

Info

Publication number: TW454144B
Application number: TW89101082A
Authority: TW
Inventors: Geoffrey Francis Burns; Alexander Goldovsky
Original assignee: Lucent Technologies Inc
Priority date: 1999-01-29
Filing date: 2000-01-24
Publication date: 2001-09-11
Also published as: JP2000222172A

Abstract

A method and apparatus are disclosed for generating a preprocessed saturation flag in a multiplier. According to one aspect of the invention, the preprocessed saturation flag is produced at substantially the same time as the reduced products of the more significant side of an array of partial products are selected. A conventional multiplier generates an array of partial products. The partial products are reduced in the more significant side of the array assuming a carry-out from the less significant side of the array as taking on a first state to produce a first set of reduced products. The partial products are also reduced in the more significant side of the array assuming a carry-out from the less significant side of the array as taking on a second state to produce a second set of reduced products. Both sets of reduced partial products are generated in parallel with the carry-out from the least significant side. The first set of reduced products are selected as the reduced products of the more significant side of the array when the carry-out from the less significant side of the array takes on a first state. The second set of reduced products are selected as the reduced products of the more significant side of the array when the carry-out from the less significant side of the array takes on a second state. A number of the most significant bits of the first set of reduced products are combined to produce a first saturation flag and a number of the most significant bits of the second set of reduced products are combined to produce a second saturation flag. The appropriate saturation flag is selected based on the carry received from the least significant side of the array. The saturation flag is selected substantially simultaneously with the selection of the reduced products, based on the carry received from the least significant side of the array.

Description

V. Description of the Invention

Field of the Invention

The present invention relates to data arithmetic units (DAUs) of digital signal processors and other microprocessors and, more particularly, to a method and apparatus for reducing the delay in generating a preprocessed saturation flag in the data arithmetic unit of a microprocessor.

Background of the Invention

The data arithmetic unit (DAU) is the main computational execution unit of a digital signal processor (DSP). One of the operations supported by the DAU is multiplication. A multiplier is a circuit in a microprocessor or digital signal processor that obtains the product of a multiplicand and a multiplier. In a typical multiplier, partial products are formed by multiplying a multiplicand in a binary representation, for example a two's complement number, by a multiplier in a binary representation. The partial products are then reduced to obtain the product.

The partial products can be reduced by any of a number of known methods. Conventionally, with the least significant bits (LSBs) of the multiplicand and the multiplier on the right, reduction proceeds from right to left (referred to as right-to-left or LSB-first), with carries shifted or propagated toward the left. Reduction starts at the top of the partial product array and proceeds toward the bottom of the array. The final partial product is produced in binary form in the least significant portion and in carry-save form in the most significant portion. A carry propagate adder is used to convert the most significant portion of the final partial product term from carry-save form to binary form, which completes the multiplication.
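As an illustration of how an array of partial products arises, the following minimal Python sketch forms one shifted copy of the multiplicand per multiplier bit. It assumes unsigned n-bit operands; the two's complement case mentioned above needs additional sign handling that is not shown. The function name `partial_products` is hypothetical and chosen only for this example.

```python
def partial_products(multiplicand, multiplier, n):
    """Return the n shifted partial products of two unsigned n-bit operands.

    Assumption for this sketch: operands are unsigned; two's complement
    operands would need sign correction that is not modeled here.
    """
    rows = []
    for i in range(n):
        bit = (multiplier >> i) & 1                  # i-th multiplier bit, LSB first
        rows.append((multiplicand << i) if bit else 0)
    return rows


rows = partial_products(13, 11, 4)
print(rows)       # [13, 26, 0, 104]
print(sum(rows))  # 143, which equals 13 * 11
```

Summing the rows directly, as in the last line, hides the carry handling; in hardware the rows are reduced by an adder array rather than by a single wide addition.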

After the multiplier cycle, the product is typically loaded into a register for preprocessing, which takes place before the resulting product is applied to an arithmetic logic unit (ALU) or an adder. The exact preprocessing is vendor specific, but it typically includes saturation or shifting the product right or left by k bits. A number of the most significant bits of the resulting product are used to generate the preprocessing control signals. For example, if the three most significant bits of a 32-bit resulting product are logic 1, the saturation preprocessing operation sets the remaining 29 least significant bits to 0 in a known manner. Because the preprocessing operation is performed sequentially, starting with selecting and obtaining the multiplier result and then loading the multiplier result into the register, a delay of three or four stages is introduced. The delay introduced in computing the preprocessing flags is regarded as a critical path of the DAU.

R. K. Kolagotla et al., "VLSI Implementation of a 200-MHz 16x16 Left-to-Right Carry-Free Multiplier in 0.35 µm CMOS Technology for Next-Generation DSPs," IEEE 1997 Custom Integrated Circuits Conf., pp. 469-72 (May 1997), incorporated herein by reference, discloses a multiplier that makes the most significant bits available earlier than known multipliers, thereby reducing the time required to complete a multiplication. As is apparent from the above shortcomings of conventional techniques for obtaining preprocessing flags, a need exists to evaluate the preprocessing flags as soon as the most significant bits are available.

Summary of the Invention

Generally, a method and apparatus are disclosed for generating a preprocessed saturation flag in a multiplier. According to one aspect of the invention, the preprocessed saturation flag is generated at substantially the same time as the reduced products of the most significant side of the partial product array are selected. A conventional multiplier generates an array of partial products. The partial products are reduced in the most significant side of the array, assuming that the carry-out from the least significant side of the array has a first state, to produce a first set of reduced products. The partial products are also reduced in the most significant side of the array, assuming that the carry-out from the least significant side of the array has a second state, to produce a second set of reduced products. Both sets of reduced products are generated in parallel with the carry-out from the least significant side. The first set of reduced products is selected as the reduced products of the most significant side of the array when the carry-out has the first state, and the second set is selected when the carry-out has the second state.

To generate the preprocessed saturation flag, a number of the most significant bits of the first set of reduced products are combined to produce a first saturation flag, and a number of the most significant bits of the second set of reduced products are combined to produce a second saturation flag. The appropriate saturation flag is selected based on the carry received from the least significant side of the array. The saturation flag is selected at substantially the same time as the reduced products, and both selections are based on the carry received from the least significant side of the array.

The invention allows the flag to be preprocessed early in the multiplication cycle, reducing the delay in generating the flag at the cost of additional area. When the multiplier product is selected for ALU preprocessing by way of the load register, the disclosed algorithm for computing the saturation flag saves three to four stages of flag generation logic.

A more complete understanding of the present invention, as well as further aspects and advantages of the invention, will be obtained by reference to the following detailed description and drawings.

Brief Description of the Drawings

FIG. 1 is a schematic diagram of a conventional radix-2 multiplier for generating and adding partial product terms and computing the product of a multiplier and a multiplicand;
FIG. 2 is a schematic diagram of an adder of FIG. 1;
FIG. 3 is a schematic diagram of an A cell of FIG. 1;
FIG. 4 is a schematic diagram of a B cell of FIG. 1;
FIG. 5 is a schematic diagram of a conventional radix-4 multiplier for generating and adding partial product terms and computing the product of a multiplier and a multiplicand;
FIG. 6 is a schematic diagram of a D cell of FIG. 5;
FIG. 7 is a schematic diagram of the conventional multiplier of FIG. 1 or FIG. 5 together with a conventional preprocessing control block that generates the various preprocessing operations; and
FIG. 8 is a schematic diagram of the conventional multiplier of FIG. 1 or FIG. 5 together with a preprocessing control block that performs the preprocessing saturation operation in accordance with the present invention.

Detailed Description

FIG. 1 is a schematic diagram of an illustrative embodiment of a conventional multiplier 10, which may be part of an integrated circuit (IC). The multiplier 10 may be, for example, part of the data arithmetic unit of a digital signal processor (DSP), a microcontroller, an application-specific integrated circuit, or a microprocessor. The multiplier 10 includes a carry propagate adder 12, a multiplexer 14, a converter 16, and adders 18 and 19. As shown in FIG. 2, each adder 18 has inputs 20 and outputs 22, and the adders 18 are interconnected in an array 24 in which an output of one adder 18 may be an input of another adder 18.

The multiplier 10 is shown in FIG. 1 as generating and adding partial product terms, thereby computing the product of an n-bit multiplier and an n-bit multiplicand to form a product having 2n bits.

The adders 18 are arranged in rows and columns and may be full adders, as known in the art. As shown in FIG. 2, each adder 18 receives the following inputs 20: (i) a carry-out from an adder 18 above; (ii) a sum from an adder 18 in the next less significant column; and (iii) a partial product bit, as known in the art. Each adder, except the adders 19 between dashed lines 30 and 32, adds a partial product bit to the incoming sum and carry-out signals. Each adder 18 combines the sum, the carry-out and the partial product bit to produce a sum bit and a carry-out bit, which propagate through the array 24 of adders 18, 19. The most significant bit side of the array 24 of adders 18, 19 is on the left of FIG. 1, and the least significant bit side 26 of the array 24 is on the right of FIG. 1. The complete reduction array is not shown.

The carry propagate adder 12 receives partial sum and carry-out signals from the adders 18 of the bottom row on the least significant side 26 of the array 24. The carry propagate adder 12 produces a multi-bit output 40 and at least one carry-out bit 42. In the illustrative embodiment, the multi-bit output 40 comprises (n+1) bits and the carry-out bit 42 is a single bit. The carry-out bit 42 provides the select input to multiplexer 14 to control selection of the multiplexer output, as discussed further below.

The adders 19 between dashed lines 30 and 32 interface the adders 18 of the array 24 with the converter 16. The adders 19 do not add a partial product bit. Instead, each adder 19 adds together three bits produced by the array 24 and produces two outputs to the converter 16.
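The adder array and the carry propagate adder on the least significant side can be modeled in software. The sketch below is a simplified behavioral model under stated assumptions, not the circuit of FIG. 1: it uses unsigned operands, applies a full adder at every bit position of three integers at once, and splits the result after the (n+1) least significant bits. All function names are hypothetical.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return sum_bit, carry_out


def carry_save_row(x, y, z):
    """Apply full_adder at every bit position of three non-negative integers.

    Returns (sum_vector, carry_vector) with
    sum_vector + carry_vector == x + y + z.
    """
    sum_vector, carry_vector = full_adder(x, y, z)  # bitwise ops cover all positions
    return sum_vector, carry_vector << 1            # carries move one position left


def reduce_partial_products(rows):
    """Reduce a list of partial products to carry-save form, top row first."""
    sum_vector, carry_vector = 0, 0
    for row in rows:
        sum_vector, carry_vector = carry_save_row(sum_vector, carry_vector, row)
    return sum_vector, carry_vector


def low_side_adder(sum_vector, carry_vector, n):
    """Carry propagate adder over the (n + 1) least significant bits.

    Returns the (n + 1)-bit output and a single carry-out bit.
    """
    mask = (1 << (n + 1)) - 1
    total = (sum_vector & mask) + (carry_vector & mask)
    return total & mask, total >> (n + 1)


rows = [13, 26, 0, 104]                 # partial products of 13 * 11 for n = 4
s, c = reduce_partial_products(rows)
assert s + c == 13 * 11                 # carry-save pair still represents the product
print(low_side_adder(s, c, 4))          # (15, 0)
```

The carry-out returned by `low_side_adder` plays the role of the select signal: it is the only piece of information the most significant side has to wait for.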

The converter 16 includes two types of cells, designated A and B. FIG. 3 is a schematic diagram of an A cell, showing inputs S (sum) and C (carry) and outputs A, G, Z0 and O0, which are logical combinations of the inputs. The equations indicating how the inputs are logically combined to form the outputs are shown on the right side of FIG. 3. As shown in FIG. 1, each A cell receives a carry input from the adder 19 in the same row between dashed lines 30 and 32. Each A cell also receives a sum from an adder between dashed lines 30 and 32. Reduction of the most significant bits is completed in the A and B cells, so that the inputs to multiplexer 14 are available before, or at the same time as, the carry-out 42 of carry propagate adder 12. In this manner, the sum 44 is available at the output of multiplexer 14 one multiplexer delay after the carry-out 42 of carry propagate adder 12 becomes available.

FIG. 4 is a schematic diagram of a B cell, showing inputs Z1 and O1 (received from the Z0 and O0 outputs of a neighboring A cell) as well as inputs A and G received from the A cell in the same row. Each B cell passes inputs A and G to the adjacent B cell and produces outputs Z0 and O0, which are logical combinations of the inputs to the B cell. The equations indicating how the inputs are logically combined to form the outputs of each B cell are shown on the right side of FIG. 4.

Based on the relationships between their inputs and outputs, the A and B cells compute and maintain two forms of the sum of the most significant bits of the product. Assuming that the carry-out 42 produced by the carry propagate adder has a first logic state, such as 0, the converter 16 computes one sum of the most significant bits, or reduced product. Assuming that the carry-out 42 has a second logic state, such as 1, the converter 16 also computes a second sum of the most significant bits, or reduced product.
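The idea of preparing both candidate sums before the carry-out is known can be sketched as follows. This is a behavioral model only: the cell equations of FIGS. 3 and 4 are not reproduced, and plain integer addition stands in for the A and B cells. The function name and the variables `z0` and `o0` (named after the Z0 and O0 outputs) are illustrative assumptions, as is the split into (n+1) low bits and (n-1) high bits taken from the illustrative embodiment.

```python
def carry_select_product(sum_vector, carry_vector, n):
    """Assemble a 2n-bit product from a carry-save pair.

    Assumes sum_vector + carry_vector is the product of two unsigned
    n-bit operands, so it fits in 2n bits.
    """
    low_bits, high_bits = n + 1, n - 1
    low_mask = (1 << low_bits) - 1
    high_mask = (1 << high_bits) - 1

    # Least significant side: carry propagate adder, yields the select signal.
    low_total = (sum_vector & low_mask) + (carry_vector & low_mask)
    low = low_total & low_mask
    carry_out = low_total >> low_bits            # always 0 or 1

    # Most significant side: both candidates, independent of carry_out.
    base = (sum_vector >> low_bits) + (carry_vector >> low_bits)
    z0 = base & high_mask                        # valid if carry_out == 0
    o0 = (base + 1) & high_mask                  # valid if carry_out == 1

    high = o0 if carry_out else z0               # one multiplexer delay
    return (high << low_bits) | low


print(carry_select_product(79, 64, 4))  # 143
```

Because `z0` and `o0` do not depend on `carry_out`, a hardware implementation can compute them while the low-side adder is still working; only the final selection waits for the carry.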

The outputs of the A and B cells in the bottom row provide the two sums as inputs to multiplexer 14. One of the sums is represented by the Z0 outputs and the second sum is represented by the O0 outputs.

When the carry propagate adder 12 has completed its computation and the carry-out 42 is available, the correct one of the two sums, or reduced products, computed by the converter 16 is selected by multiplexer 14. When the carry-out 42 produced by carry propagate adder 12 has the first state, the first sum computed by converter 16 is selected as the output of multiplexer 14. When the carry-out 42 has the second state, the second sum computed by converter 16 is selected as the output of multiplexer 14. For example, the first sum, represented by the Z0 outputs, may be one input to multiplexer 14, while the second sum, represented by the O0 outputs, may be the other input to multiplexer 14.

In the illustrative embodiment, multiplexer 14 provides (n-1) outputs 44 and carry propagate adder 12 provides (n+1) outputs 40, which together form the 2n-bit product. The outputs 44 produced by multiplexer 14 are the most significant bits of the product, and the outputs 40 produced by carry propagate adder 12 are the least significant bits of the product. One digit is reduced between dashed lines 30 and 32 in FIG. 1. For this reason, the carry propagate adder 12 outputs more bits than multiplexer 14. A digit is log2(radix) bits. The exact dividing line between the least significant side and the most significant side of the array may vary with the number of bits in the multiplier and the multiplicand.

FIG. 5 is a schematic diagram of a radix-4 multiplier 10'. Elements in FIG. 5 that provide functions similar to elements of the radix-2 embodiment of FIG. 1 are shown with the same reference numeral and a prime. In the radix-4 converter, each D cell produces two bits, compared with the one bit produced by an A cell of the radix-2 converter of FIG. 1.

Because two digits are reduced between dashed lines 30' and 32', the output 40' of carry propagate adder 12' has more bits than the output of multiplexer 14'. The logic performed by a D cell is shown in FIG. 6. Because the operands are radix-4, the A cells of FIG. 1 are replaced by the D-type cells of FIG. 5. Although the invention has been illustrated with the radix-2 multiplier of FIG. 1 and the radix-4 multiplier of FIG. 5, those skilled in the art will recognize that the invention is not limited to these embodiments.

The multipliers 10, 10' of FIGS. 1 and 5 can be fabricated in at least one integrated circuit using well-known VLSI processes. The invention is particularly advantageous in communication systems and equipment that employ integrated circuits incorporating this technique. Such systems and equipment gain an advantage in completing signal processing. The multipliers 10, 10' provide the partial products in a way that makes the most significant bits available earlier, thereby reducing the time required to complete a multiplication. The invention is advantageous with many reduction designs, multiplier types and radixes. The technique can be used with any radix-2 array multiplier, such as a Baugh-Wooley-Blankenship encoded two's complement multiplier, as well as with a Booth-MacSorley multiplier or any higher-radix multiplier.
Preprocessing Flag Detection

In FIG. 7, the multipliers 10, 10' of FIGS. 1 and 5 are partitioned into a multiplier array 24, a converter 16, a final multiplexer stage 14 and a carry propagate adder 12. (Elements in FIG. 7 that provide functions similar to elements of the illustrative radix-2 embodiment of FIG. 1 or the illustrative radix-4 embodiment of FIG. 5 are shown with the same reference numerals.) As shown in FIG. 7, the resulting product produced by multiplier 10 is placed in a load register 70. A preprocessing control block 72 then receives a predefined number of the most significant bits of the resulting product from register 70 and generates the appropriate select signals for performing the various preprocessing operations, such as saturation or a left or right shift by k bits.

The present invention is concerned with the saturation preprocessing operation.

As previously indicated, saturation preprocessing evaluates a number of the most significant bits of the resulting product and, if the saturation condition is present, saturates the resulting product. For example, the saturation condition may require that the three most significant bits have a value of logic 1. If the saturation condition is present, the saturation preprocessing saturates the resulting product, for example, by setting the remaining least significant bits to 0.

The critical delay path of the multiplication cycle in FIG. 7 is as follows: the select signal generated by carry propagate adder 12, the final multiplexer stage 14, loading the final multiplier product into load register 70, and evaluating the MSBs in preprocessing control block 72. Thus, when the multiplier product is selected for ALU preprocessing by way of the load register, the critical delay path includes three to four stages of flag generation logic.
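The conventional saturation step described above can be written out as a short Python sketch. It follows the example in the text (three most significant bits of a 32-bit product all at logic 1, remaining least significant bits cleared); the function names and the treatment of the product as an unsigned bit pattern are assumptions made for illustration.

```python
def saturation_flag(product, width=32, k=3):
    """Return 1 if the k most significant bits of a width-bit pattern are all 1."""
    top_bits = product >> (width - k)
    return int(top_bits == (1 << k) - 1)


def saturate(product, width=32, k=3):
    """Clear the (width - k) least significant bits when the flag is set."""
    if saturation_flag(product, width, k):
        keep_mask = ((1 << k) - 1) << (width - k)   # only the k MSBs survive
        return product & keep_mask
    return product


print(hex(saturate(0xE1234567)))  # 0xe0000000
print(hex(saturate(0x61234567)))  # 0x61234567
```

In the conventional arrangement of FIG. 7 the flag can only be evaluated after the full product has been selected and loaded, which is why it sits on the critical path.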
According to an aspect of the invention, the saturation preprocessing operation is performed as soon as the most significant bits are available. As previously indicated, reduction of the most significant bits is completed by the converter 16, 16' in the A and B cells, so that the inputs to multiplexer 14 are available before, or at the same time as, the carry-out 42 of carry propagate adder 12. In this manner, the sum 44 is available at the output of multiplexer 14 one multiplexer delay after the carry-out 42 of carry propagate adder 12 becomes available.

FIG. 8 illustrates a multiplier 10 having a preprocessing stage 100 in accordance with the invention. The preprocessing stage 100 introduces a pair of blocks 80, 82 for preprocessing the MSBs of the multiplier product.

Basically, the blocks 80, 82 determine whether a predefined number of the most significant bits are all equal to 1. The first block 80 preprocesses the MSBs associated with the first sum (Z0) computed by converter 16, and the second block 82 preprocesses the MSBs associated with the second sum (O0) computed by converter 16. In one implementation, each block 80, 82 ANDs together the predefined number of MSBs to determine whether all of the MSBs have a value of logic 1. A multiplexer 14'' operates in the same manner as multiplexer 14 of FIG. 1 and multiplexer 14' of FIG. 5, using the same select signal produced by carry propagate adder 12 to select, for one of the two logic states, between the preprocessed MSB bits produced by blocks 80, 82.

When the carry propagate adder 12 has completed its computation and the carry-out value 42 is available, the correct one of the two sums, or reduced products, computed by converter 16 is selected by multiplexer 14. In addition, the correct one of the preprocessed values produced by blocks 80, 82 is selected. When the carry-out 42 produced by carry propagate adder 12 has the first state, the first sum computed by converter 16 is selected as the output of multiplexer 14 and the first preprocessed saturation value is selected. When the carry-out 42 has the second state, the second sum computed by converter 16 is selected as the output of multiplexer 14 and the second preprocessed saturation value is selected. For example, the first sum, represented by the Z0 inputs, may be one input to multiplexer 14, while the second sum, represented by the O0 inputs, may be the other input to multiplexer 14. In this manner, the invention computes the preprocessing bit of the multiplier product (the saturation flag) in parallel with the MSB output of the multiplier product from multiplexer 14.

After the preprocessing flag has been generated by preprocessing stage 100, the flag is applied to preprocessing control block 72. If the saturation flag has been set, preprocessing block 72 saturates the multiplier product in the manner described above.
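A behavioral sketch of the preprocessing stage may make the parallel selection easier to follow. It assumes the two candidate sums are already available as integers and models blocks 80 and 82 as an AND over the k most significant bits; the function names and the 7-bit example width are hypothetical.

```python
def msbs_all_ones(value, width, k):
    """AND together the k most significant bits of a width-bit value."""
    return int((value >> (width - k)) == (1 << k) - 1)


def preprocessing_stage(z0, o0, carry_out, width, k):
    """Select the high-side sum and its saturation flag with one select signal.

    z0 and o0 are the candidate sums for carry_out == 0 and carry_out == 1.
    """
    flag_z = msbs_all_ones(z0, width, k)     # first block, first sum
    flag_o = msbs_all_ones(o0, width, k)     # second block, second sum

    # Both flags exist before carry_out arrives; the same select signal
    # drives the sum multiplexer and the flag multiplexer.
    high = o0 if carry_out else z0
    flag = flag_o if carry_out else flag_z
    return high, flag


z0 = 0b1101111                  # top three bits are 110
o0 = z0 + 1                     # 0b1110000, top three bits are 111
print(preprocessing_stage(z0, o0, 0, width=7, k=3))  # (111, 0)
print(preprocessing_stage(z0, o0, 1, width=7, k=3))  # (112, 1)
```

The example shows a case in which the carry from the least significant side changes the flag, which is why both flags have to be prepared if the selection is to cost only one multiplexer delay.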

Under the present invention, the saturation is performed earlier than with conventional techniques.

It is to be understood that the embodiments and variations shown and described herein are merely illustrative of the principles of this invention, and that various modifications may be made by those skilled in the art without departing from the scope and spirit of the invention.

Claims

6. Scope of Patent Application 1. A method for multiplication includes the following steps: generating a partial product matrix, assuming that the array is the lowest, sometimes the simplified array has the highest product; assuming that the array is the lowest, sometimes, the simplified array has the highest product; The first 纰 simplified product is the array with the least effective side product. The second set of simplified products is the combination of the first set of sum flags and the combination of the second and second saturation flags; and When in the second state, ^ if the scope of patent application is • substantially the same as the choice 3. If the scope of patent application 'ΐί No this policy — & or the pigeons and flags of 1 which generate a pre-processed saturated flag Method, column; the carry output produced by the effect side has a partial product of the first state effect side to produce a first set of carry outputs produced by the dare side has a partial product of the second state effect side to produce a second group g When the carry output generated on the side has the first state, it is the simplified product of the most significant side of the array, and when the bit output has the second state, the column with the highest status is selected. The simplified product of the effective side; the product of the most significant bit number of the product to produce the first saturation "and the simplified product of the most significant bit number to produce the carry from the first side When the wheel has the first state, t and when the least significant side of the array produces Carry the loser to select the second saturation flag. Method 1 1 ^ where the step 2 of the selection flag is performed concurrently. A method 'where the combination step produces the H-th and simplified product. The most significant bit I

Page 19 4 5 4 1 4 4

6. The scope of patent application 4. If the patent application covers the method of the first group or item, the step of the combination still includes the steps to determine the highest significant bit AND 5 of the simplified product of a = — * group For example, if you apply for a patent] and the most significant digit is 1. The method of item 1 of the effective bit side also includes simplifying the minimum array. 6. For example, the method of applying item 1 generated on the bit side of the output step I of applying for the production of anger, where the array has the highest The product of the simplified part of the effect: = The product is generated on the least significant bit side of the array. 7. If a patent application is applied, the method uses the simple 2 CU generated on the bit side, where the simplification generated by the array 歹 JL is the most efficient. The partial raid product is essentially the side with the least significant bit of the array 8.-Species are used in ΐϊίΠΓ. The method includes the following steps: a method for generating a pre-processed saturation flag, and generating a product array of partial parts; simplifying the hierarchical carry-in and burying 1, soil gg A, and the product to generate the first and last;: Π ::: The sum of the partial multiplications is based on the carry received by the least significant side of the train. The simplified partial product is combined to combine the first most significant bit of the simplified product with the sum flag and the second-saturated second-saturated flag is combined. ; And the number of the most significant bits of the product is saturated with the carry received by the least significant side of the array based on the production-generation basis. 9. If the method of the patent application for the fresh item% selects the saturation, the step of simplifying the product with the selection is essentially: At the same time: Choose. Steps of knowing 454144 VI. 
10. The method of claim 8, wherein the combining step produces a saturation flag indicating whether the most significant bits of the first or second set of reduced products are all ones.

11. The method of claim 10, wherein the combining step further comprises the step of ANDing the most significant bits of a set of reduced products to determine whether the most significant bits are all ones.

12. The method of claim 8, further comprising the step of reducing the less significant side of the array to produce a carry-out.

13. The method of claim 8, wherein the reduced partial products of the more significant side of the array are generated before the carry-out is produced by the less significant side of the array.

14. The method of claim 8, wherein the reduced partial products of the more significant side of the array are generated substantially simultaneously with the carry-out produced by the less significant side of the array.

15. A multiplier, comprising:
a first circuit for reducing the partial products in the more significant side of a partial product array to produce a first set of reduced products associated with a first carry state;
a second circuit for reducing the partial products in the more significant side of the partial product array to produce a second set of reduced products associated with a second carry state;
a first selector for selecting the first set of reduced products as the reduced products of the more significant side of the array when the carry-out produced by the less significant side of the array has the first carry state, and for selecting the second set of reduced products as the reduced products of the more significant side of the array when the carry-out has the second carry state;
a preprocessor for combining a number of the most significant bits of the first set of reduced products to produce a first saturation flag, and for combining a number of the most significant bits of the second set of reduced products to produce a second saturation flag; and
a second selector for selecting the first saturation flag when the carry-out produced by the less significant side of the array has the first carry state, and for selecting the second saturation flag when the carry-out has the second carry state.

16. The multiplier of claim 15, wherein the second selector selects the saturation flag substantially simultaneously with the first selector selecting the reduced products.

17. The multiplier of claim 15, wherein the preprocessor produces a saturation flag indicating whether the most significant bits of the first or second set of reduced products are all ones.

18. The multiplier of claim 17, wherein the preprocessor includes an AND gate for combining the most significant bits of the first or second set of reduced products to determine whether the most significant bits are all ones.

19. A multiplier, comprising:
at least one circuit, including a hierarchical carry-select adder, for reducing the partial products in the more significant side of a partial product array to produce first and second sets of reduced partial products;
a first selector for selecting a set of the reduced partial products based on a carry received from the less significant side of the array;
a preprocessor for combining a number of the most significant bits of the first set of reduced products to produce a first saturation flag, and for combining a number of the most significant bits of the second set of reduced products to produce a second saturation flag; and
a second selector for selecting a saturation flag based on the carry received from the less significant side of the array.

20. The multiplier of claim 19, wherein the second selector selects the saturation flag substantially simultaneously with the first selector selecting the reduced products.

21. The multiplier of claim 19, wherein the preprocessor produces a saturation flag indicating whether the most significant bits of the first or second set of reduced products are all ones.

22. The multiplier of claim 21, wherein the preprocessor includes an AND gate for combining the most significant bits of the first or second set of reduced products to determine whether the most significant bits are all ones.

23. An integrated circuit, comprising:
a multiplier, the multiplier further comprising:
a first circuit for reducing the partial products in the more significant side of a partial product array to produce a first set of reduced products associated with a first carry state;
a second circuit for reducing the partial products in the more significant side of the partial product array to produce a second set of reduced products associated with a second carry state;
a first selector for selecting the first set of reduced products as the reduced products of the more significant side of the array when the carry-out produced by the less significant side of the array has the first carry state, and for selecting the second set of reduced products as the reduced products of the more significant side of the array when the carry-out has the second carry state;
a preprocessor for combining a number of the most significant bits of the first set of reduced products to produce a first saturation flag, and for combining a number of the most significant bits of the second set of reduced products to produce a second saturation flag; and
a second selector for selecting the first saturation flag when the carry-out produced by the less significant side of the array has the first carry state, and for selecting the second saturation flag when the carry-out has the second carry state.

24. The integrated circuit of claim 23, wherein the selected saturation flag is a preprocessed saturation flag for a saturating arithmetic logic unit.

25. The integrated circuit of claim 23, wherein the integrated circuit is a digital signal processor (DSP).

26. The integrated circuit of claim 23, wherein the integrated circuit is a microprocessor.

27. The integrated circuit of claim 23, wherein the integrated circuit is a microcontroller.

28. The integrated circuit of claim 23, wherein the integrated circuit is an application-specific integrated circuit (ASIC).