WO2004095260A1 - Very high radix division - Google Patents

Very high radix division

Info

Publication number
WO2004095260A1
Authority
WO
WIPO (PCT)
Prior art keywords
denominator
scaled
division
current remainder
quotient
Prior art date
Application number
PCT/US2004/005664
Other languages
English (en)
Inventor
Ping Tang
Warren Ferguson
Original Assignee
Intel Corporation
Priority date
Filing date
Publication date
Application filed by Intel Corporation
Publication of WO2004095260A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/535 Dividing only

Definitions

  • Embodiments of the present invention relate generally to radix division, and more particularly to reducing a data path for implementations of radix division in microprocessor architectures.
  • Floating point performance is a key focus of modern microprocessor architecture.
  • Of addition, multiplication, and division, division is the most resource-intensive operation for microprocessor architectures.
  • By very high radix we mean that the number of quotient bits generated by each iteration of the algorithm is much larger than in typical traditional algorithms that yield 1 bit (radix-2), 2 bits (radix-4), 3 bits (radix-8), or 4 bits (radix-16).
  • $R_{j+1} = r \times R_j - q_{j+1} \times Y$.
  • R is the remainder
  • r is the radix
  • $q_{j+1}$ is the quotient digit
  • Y is the divisor (e.g., denominator).
  • the bulk of the work is in computing the product $q_{j+1} \times Y$.
  • the width of Y remains fixed while the width of $q_{j+1}$ grows with the radix.
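The recurrence can be sketched in a few lines of Python. This is a minimal illustration and not the patent's implementation: the function name `classical_digits`, the exact `Fraction` arithmetic, and digit selection by floor division are all assumptions made for readability. It shows that every step needs the full product of the quotient digit and Y.

```python
from fractions import Fraction


def classical_digits(x, y, radix, count):
    """Apply R_{j+1} = r * R_j - q_{j+1} * Y, starting from R_0 = x.

    Assumes 0 <= x < y so that every digit falls in 0 .. radix - 1.
    """
    y = Fraction(y)
    remainder = Fraction(x)
    digits = []
    for _ in range(count):
        shifted = radix * remainder      # r * R_j
        q = shifted // y                 # digit selection (assumed: exact floor)
        remainder = shifted - q * y      # full-width product q * Y
        digits.append(int(q))
    return digits, remainder
```

For example, `classical_digits(3, 7, 10, 6)` yields the digits `[4, 2, 8, 5, 7, 1]` (3/7 = 0.428571...) and returns to a remainder of 3.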
  • L is the data width of the precision in question
  • the depth of the multiplier is the number of additional quotient data bits the algorithm generates per iteration
  • the width of the multiplier is fixed at the data width of the precision in question.
  • the width of the multiplier grows as well.
  • This requirement is a direct outcome of a crucial "pre-scaling" step (e.g., divisor or denominator reciprocal, discussed in the Detailed Discussion Section below) in this class of algorithms that makes them practical to implement.
  • the growth of the width leads, nevertheless, to a number of drawbacks.
  • the more obvious drawbacks are increased space and increased power consumption.
  • the less obvious but increasingly important drawback is the need for a customized multiplier and/or adder rather than those most naturally found in standard cell libraries related to the precision width L in question.
  • Therefore, there is a need for improved implementations and techniques for radix division. These implementations and techniques should be as fast as some of the recent radix division algorithms, but capable of maintaining the width of the multiplier such that space usage and power consumption are minimized and thus reduced when compared
  • FIG. 1 is a flow diagram of a method for performing very high radix division, in accordance with one embodiment of the invention.
  • FIG. 2 is a diagram depicting a machine implementation for performing very high radix division, in accordance with one embodiment of the invention.
  • FIG. 3 is a diagram of a very high radix division system, in accordance with one embodiment of the invention.
  • Description of the Embodiments
  • FIG. 1 illustrates a flow diagram of one method 100 for performing very high radix division, in accordance with one embodiment of the invention.
  • the method 100 is implemented in microprocessor architectures.
  • the method 100 can be implemented in hardware, software, firmware, or in combinations of hardware, software, and firmware within or accessible to microprocessor architectures.
  • embodiments of method 100 permit very high radix division calculations to be decomposed in a novel manner that results in narrow data paths.
  • existing and conventional decompositions perform very high radix divisions where wide data paths are necessary.
  • microprocessor architectures implementing embodiments of method 100 experience a decrease in space utilization and accordingly a decrease in power consumption when performing very high radix division. This is achieved without adversely impacting processor throughput. This is a significant advantage over conventional techniques, since as microprocessor architectures become increasingly smaller in the industry, the ability to conserve power is a growing concern.
  • a numerator and denominator are received as operands for a very high radix division calculation.
  • the division calculation is pre-scaled (e.g., preprocessed) into an equivalent mathematical calculation before a quotient for the calculation is determined. This is achieved by attempting to make the denominator of the division calculation approximately 1. By making the denominator or divisor 1, the division calculation becomes straightforward, since any numerator divided by 1 will equal the numerator.
  • 1/N is the reciprocal of any number N
  • the reciprocal of a number N may not terminate (e.g., continues to infinity)
  • some desired and configurable precision of the approximate reciprocal can be used.
  • a factor representing a desired precision for an approximate reciprocal of the denominator is acquired.
  • an approximate factor is obtained by performing a lookup in a data structure (e.g., table or other data structure).
  • the data structure having the reciprocal value for the denominator need not store all number possibilities, since a subset of numbers can be used and then calculated to the proper reciprocal value.
  • a data structure may house only the approximate reciprocal values for number digits 1-9 (where the radix is decimal (base 10)). Of course, in actuality only 2-9 may be stored since it is readily known the reciprocal of 1 is 1.
  • Reciprocal values for all other number possibilities can be readily calculated based on this small data structure, because these remaining possibilities are only off by a factor of the radix (e.g., decimal (base 10)).
  • the reciprocal of 1 is 1
  • the reciprocal of 10 is 0.1
  • the reciprocal of 7 is approximately 0.14
  • the reciprocal of 71 is approximately 0.014.
  • any desired level of precision can be configured to the requirements or desires of the underlying microprocessor architecture.
  • the data structure could store reciprocals for numbers 2-99, 11-99, 10-99, or other subsets of numbers.
  • the number of significant digits for the stored reciprocal values can be tailored to the requirements or desires of the underlying architecture. Thus, if only digits 2-9 were stored in the data structure, the actual values can include more than 2 significant digits.
  • the reciprocal of 7 can be stored as 0.14285714 having 8 significant digits.
  • the original division calculation is pre-scaled (e.g., preprocessed or decomposed) to a new processing format where the original denominator is approximately 1. This is achieved by multiplying the original numerator and the original denominator by the acquired approximate denominator reciprocal factor.
  • the new processing format for the division calculation is recast, at 140, to a mathematical equivalent represented by a scaled numerator (e.g., original numerator × factor) divided by a scaled denominator (e.g., original denominator × factor).
  • the multiplication is applied to both the numerator and the denominator so the restated division calculation is mathematically equivalent to the original division calculation.
  • the first digit of the scaled numerator is a first portion of a quotient for results associated with the original division calculation. This is so because the scaled denominator is approximately 1, and any number divided by 1 is the number itself. Thus, at 150, the first portion of the quotient can be set as the first digit of the scaled numerator.
  • the reciprocal of 7 is approximately 0.14 (e.g., where our reciprocal data structure is accurate to approximately 2 significant digits).
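The lookup and pre-scaling steps described above can be sketched as follows. This is a hedged illustration under stated assumptions: the table contents (reciprocals of the digits 1-9 truncated to two decimal places), the names `RECIPROCAL_TABLE`, `approximate_reciprocal`, and `pre_scale`, and the use of exact `Fraction` values are hypothetical and are not taken from the patent. A table this coarse is only accurate enough for some denominators; a real design would pick the stored precision to suit the radix.

```python
from fractions import Fraction

# Hypothetical table: reciprocals of the digits 1-9, truncated to two decimal
# places (7 -> 0.14, as in the text). The entry for 1 is kept for convenience.
RECIPROCAL_TABLE = {digit: Fraction(100 // digit, 100) for digit in range(1, 10)}


def approximate_reciprocal(denominator, radix=10):
    """Look up the leading digit, then shift the result by powers of the radix."""
    value = Fraction(denominator)        # assumed positive
    shift = Fraction(1)
    while value >= radix:                # e.g. 71 -> 7.1 with shift 1/10
        value /= radix
        shift /= radix
    while value < 1:                     # e.g. 0.7 -> 7 with shift 10
        value *= radix
        shift *= radix
    return RECIPROCAL_TABLE[int(value)] * shift


def pre_scale(numerator, denominator, radix=10):
    """Multiply both operands by the same factor so the quotient is unchanged."""
    factor = approximate_reciprocal(denominator, radix)
    return Fraction(numerator) * factor, Fraction(denominator) * factor
```

With these assumptions, `approximate_reciprocal(7)` is 0.14, `approximate_reciprocal(71)` is 0.014, and `pre_scale(3, 7)` restates 3/7 as 0.42/0.98.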
  • a running variable defined as a current remainder is initially set to be the scaled numerator.
  • the quotient is iteratively assembled using a portion of the then current remainder. The first time into the processing loop, the processing at 170 does not need to occur since this was initially handled before entering the loop at 150. Alternatively, the processing at 150 can be removed and handled by the processing at 170.
  • a first product is a radix (number base being used, e.g., binary, decimal, hexadecimal, and others) multiplied by the current remainder.
  • This first product is then altered further at 174 by subtracting an integer portion of the current remainder.
  • the integer portion is the first digit of the current remainder and made a whole integer number. This can be done because our scaled denominator is nearly or approximately 1, and the division calculation can be decomposed in a novel manner.
  • N 4 4
  • the division calculation is restated as a scaled numerator divided by a scaled denominator (which is approximately 1); thus, N can be automatically selected as the first digit integer portion of the current remainder, which we did in our example of 0.42/0.98 (e.g., 3/7 restated to a mathematical equivalent).
  • our first digit integer value (0.4 for the present example) is readily obtained from the current remainder.
  • the original calculation used to solve for a quotient digit is presented as a second product subtracted from a first product.
  • the first product is the current remainder multiplied by the radix
  • the second product is some number N multiplied by our scaled denominator (e.g., divisor).
  • the second product is decomposed in a novel manner, which decreases the data width required to solve for a value of the second product, for microprocessor architectures that implement embodiments of the present invention.
  • This is achieved by decomposing the divisor (e.g., scaled denominator) to be represented as a mathematically equivalent expression.
  • the result of this subtraction will include one or more initial or leading digits that are 0 or 1.
  • Leading insignificant 1's can occur when the subtraction results in a negative number, such as when the scaled denominator was greater than 1 when using an approximate factor that was greater than 1.
  • the leading zeros or ones are insignificant numbers and will not occupy additional data width within microprocessor architectures. Therefore, the leading zeros or ones will decrease the data width of the second product, if the second product is also restated in a mathematically equivalent manner.
  • the equivalent expression for the divisor can be stated as 1 - Y', where Y' is some number that when subtracted from 1 equals the scaled denominator.
  • the second product can now appear as the equation of the first integer portion of the current remainder (since scaled denominator is approximately 1) multiplied by the sub equation represented by 1-Y' (this is how the scaled denominator is being expressed), where Y' was 1 - scaled denominator.
  • the second product can be further decomposed to another mathematically equivalent equation by distributing the first portion of the current remainder over the sub equation 1-Y'.
  • the second product will now be represented as the first integer portion of the current remainder (e.g., the first integer portion multiplied by 1, the first part of the sub-equation) minus the product of the first integer portion of the current remainder multiplied by Y'.
  • the product will include fewer significant digits and occupy less data width within microprocessor architectures than it would have if we had multiplied the first integer portion of the current remainder directly against the scaled denominator or divisor, because Y' includes one or more insignificant leading zeros or ones, whereas the raw scaled denominator did not include any insignificant leading zeros or ones.
  • the decomposition of the scaled denominator to be restated in a mathematically equivalent manner is space efficient.
  • the entire equation for solving for digits of a quotient can be expressed as subtracting the first integer portion of the current remainder from a first product represented as the first integer portion of the current remainder multiplied by the radix and then subtracting a product represented by multiplying the first integer portion of the current remainder against Y', and Y' is equivalent to 1 - scaled denominator.
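The decomposition described in words above can be written out as a short derivation. The symbols $Y_s$ (scaled denominator), $q$ (first integer portion of the first product), and $R_j$ (current remainder) are notation assumed here for illustration; $Y'$ and $r$ are as defined in the text.

```latex
\begin{align*}
Y'      &= 1 - Y_s \\
q \times Y_s &= q \times (1 - Y') = q - q \times Y' \\
R_{j+1} &= r \times R_j - q \times Y_s = (r \times R_j - q) + q \times Y'
\end{align*}
% Worked instance with the 3/7 example (r = 10, Y_s = 0.98, Y' = 0.02):
% r * R_0 = 10 * 0.42 = 4.2, q = 4, R_1 = (4.2 - 4) + 4 * 0.02 = 0.28
```

Only the narrow product $q \times Y'$ has to be formed by a multiplier; the term $r \times R_j - q$ is just the part of the first product that trails the integer portion.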
  • the first product in the iterative loop is the current remainder multiplied by the radix, as depicted at 172. Next, at 174, we subtract the first integer portion of the current remainder from the first product.
  • the embodiments of the present invention decompose the processing for determining the product of the scaled denominator multiplied by the first portion of the current remainder, such that the data width required in performing the division calculation is reduced.
  • Y' is represented as 1 - the scaled denominator (which is approximately 1), and thus Y' will include one or more insignificant leading digits that are 0 or 1. Therefore, when this multiplication is performed less data width is required to obtain the resulting product.
  • the data width of the multiplication is the data width of the desired precision.
  • microprocessor architectures that implement embodiments of the present invention can perform high radix division with less space and less power consumption than what has been conventionally required.
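The iterative loop of method 100 can be sketched in Python using the 3/7 example from the text. This is a sketch under stated assumptions rather than the patented implementation: the function name `narrow_path_divide`, exact `Fraction` arithmetic, truncation (no rounding) when taking the integer portion, and accumulating the quotient as a running sum are all choices made here for illustration.

```python
import math
from fractions import Fraction


def narrow_path_divide(scaled_num, scaled_den, radix, steps):
    """Iterate R <- (r * R - q) + q * Y', where Y' = 1 - scaled_den."""
    y_prime = 1 - Fraction(scaled_den)   # small, so its leading digits are insignificant
    remainder = Fraction(scaled_num)     # current remainder starts as the scaled numerator
    quotient = Fraction(0)
    digits = []
    for j in range(1, steps + 1):
        first_product = radix * remainder            # at 172: radix times current remainder
        q = math.floor(first_product)                # first integer portion (assumed: truncation)
        remainder = (first_product - q) + q * y_prime  # at 174 subtract q, then add q * Y'
        digits.append(q)
        quotient += Fraction(q, radix ** j)
    return digits, quotient, remainder
```

Calling `narrow_path_divide(Fraction(42, 100), Fraction(98, 100), 10, 6)` produces the digits `[4, 2, 8, 5, 7, 1]`, matching 3/7 = 0.428571..., while the only multiplication by a non-radix value is against Y' = 0.02.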
  • FIG. 2 illustrates a diagram 200 depicting one machine implementation for performing very high radix division, in accordance with one embodiment of the invention.
  • the machine architecture can include architectures associated with standard cell libraries for adders and multipliers when performing high radix division.
  • a first operand X having a bit width of L bits is received by multiplexer (Mux) 201. Simultaneously, in parallel, or subsequently Mux 202 receives a second operand Y having a data width of L bits. Operand X is associated with a numerator of a very high radix division, and operand Y is associated with a denominator of the very high radix division.
  • L is the number of significant digits used by the machine architecture to perform the very high radix division calculation at a desired precision.
  • the radix (r) at the machine level is a power of 2, where the power is represented as m (e.g., $2^m$).
  • the approximate reciprocal factor is provided to Mux 205.
  • the sum of the partial products of X and Y is produced by Mux or adder 206 (which can also be a multiplier), using the approximate reciprocal factor acquired from Mux 205.
  • Mux 205 feeds m bits of the approximate reciprocal at a time to Mux or adder 206 for processing.
  • Multiplier 206 (can be a Mux or adder as well) multiplies m bits of the approximate reciprocal factor against X and Y to restate the original division calculation as scaled X divided by scaled Y, where scaled Y is now approximately 1, since the approximate reciprocal of Y is now multiplied by Y.
  • Multiplier 206 is designed to handle calculations for an m by L bit width array.
  • the adder is designed to handle adding L bits plus L bits.
  • the architecture of diagram 200 can process very high radix division with standard architectures associated with cell multipliers and adders designed to process calculations having bit array dimensions of L by m or adding L bits plus L bits, respectively.
  • traditional architectures would need a multiplier capable of handling m by (m + L) bit width arrays or adders capable of handling adding data bit widths of m + L.
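The difference in multiplier and adder dimensions can be made concrete with a small calculation. The helper name `datapath_dimensions` and the sample values of m and L are hypothetical; the number of cells in the partial-product array is used here only as a rough proxy for area and is not a figure from the patent.

```python
def datapath_dimensions(m, L):
    """Compare the m-by-L design with the m-by-(m + L) design described in the text."""
    return {
        "narrow_multiplier_cells": m * L,        # m by L bit array (multiplier 206)
        "wide_multiplier_cells": m * (m + L),    # m by (m + L) bit array (traditional)
        "narrow_adder_bits": L,                  # L bits plus L bits
        "wide_adder_bits": m + L,                # m + L bit operands (traditional)
    }
```

For instance, with the assumed values m = 8 and L = 64, the narrow array has 512 cells against 576 for the wide one, and the adder operands are 64 bits instead of 72.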
  • X' is then multiplied in multiplier 206 by r (e.g., radix represented as $2^m$). This reduction in significant bits will be realized in subsequent iterations through diagram 200.
  • p can be thought of as 4 (the first integer portion) and R' can be thought of as 0.78.
  • the 0.78 can be rounded in Mux or adder 210 to be 1.
  • the adjusted value of R' is stored as d in register 211. This rounding is used to adjust p in adder 212 to a potentially rounded (although not always required or needed) value for a quotient digit q, which is stored in register 213.
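The digit selection with rounding described in the last three items can be sketched as below. The function name `select_digit` and the round-half-up rule are assumptions made for illustration; the sketch covers only the split into p, R', d, and q and does not model how adder 207 later compensates for the rounding.

```python
import math
from fractions import Fraction


def select_digit(value):
    """Split a value into p and R', round R' to d (0 or 1), and form q = p + d."""
    value = Fraction(value)
    p = math.floor(value)                            # first integer portion
    r_prime = value - p                              # trailing remainder R'
    d = 1 if r_prime >= Fraction(1, 2) else 0        # assumed rule: round half up
    q = p + d                                        # potentially rounded quotient digit
    return p, r_prime, d, q
```

With the value 4.78 from the example, `select_digit(Fraction("4.78"))` gives p = 4, R' = 0.78, d = 1, and q = 5.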
  • Mux or adder 206 then provides Y' and the previously selected digit q to adder 207.
  • Adder 207 also receives a previously potentially rounded representation of the trailing remainder R' represented in the diagram as d, and adder 207 receives the previous R'.
  • adder 207 compares the previous R' against d to see if a rounding to q has occurred, and if it has then q is adjusted back to a non-rounded version of q. This is done to ensure accuracy in the very high radix division for all iterations.
  • adder 207 assembles the current remainder as q + R' and multiplies this against the radix (r) to acquire the first product for the next cycle of the division. Adder 207 also multiplies q against Y' to form a second product. Next, q is subtracted from the first product and the result is added to the second product. At this point, we now have the information necessary for determining the next quotient digit $q_{j+1}$.
  • adder 207 strips the first few bits of the resulting calculation and stores this in register 208 as $p_{j+1}$; the trailing portion of the result becomes the next trailing remainder $R'_{j+1}$ and is provided to Mux 209.
  • the processing of diagram 200 can also be expressed in a formal notation to reflect processing logic of the various components of FIG. 2.
  • the quotient digit selection begins the main processing loop in diagram 200.
  • a quotient digit $q_{j+1}$ can be rounded; thus ⁇ reflects the adjustment that is being made in diagram 200 to quotient digits $q_{j+1}$.
  • FIG. 3 illustrates a diagram of a very high radix division system 300, in accordance with one embodiment of the invention.
  • the high radix division system includes a data structure 310 and a processor 320.
  • the processor includes logic 321, and space for housing a numerator 322, a denominator 323, and a quotient 324 associated with a high radix division calculation.
  • FIG. 3 is presented for purposes of illustration only and is not intended to be limiting, since it is readily apparent that in some embodiments the data structure 310 can also be included within a register or area of the processor 320. In fact, any embodiment where the data structure 310 is accessible to the logic 321 is intended to fall within the broad scope of the present invention.
  • the logic 321 can be hardware components, firmware components, software components, or a combination of hardware, firmware, and software.
  • the data structure 310 is an approximate reciprocal table for integer numbers. The number of significant digits represented in the data structure 310 is configurable. Moreover, as was discussed above with FIG. 1, the data structure 310 need not store all possible integers since factors can be readily derived based on the radix being used for the very high radix division. Thus, for a decimal radix only integers 2-9 (no need for 1 since the reciprocal of 1 is 1) need to be represented in the data structure 310, since all other integer possibilities can be derived from factors for these integers using the radix.
  • the data structure 310 can include integer possibilities for large sets of integers (e.g., 2-99, 10-99, 11-99, or other combinations for a decimal radix). Additionally, the data structure 310 need not be a table in all embodiments, since any data structure 310 that is accessible to the logic 321 can be used.
  • Initially, the logic 321 accesses the numerator 322 and the denominator 323. Next, the logic 321 accesses the data structure 310 to acquire an approximate reciprocal factor for the denominator 323.
  • the logic 321 may need to derive the approximate reciprocal factor from a value obtained from the data structure 310 by using a radix associated with the very high radix division, such as when the denominator is not directly represented in the data structure 310.
  • the logic 321 multiplies the numerator 322 and the denominator 323 by the factor in order to restate the division in a more efficient manner.
  • the numerator 322 is represented as a scaled numerator and the denominator is represented as a scaled denominator.
  • the scaled denominator is approximately 1, which simplifies the division.
  • the scaled denominator is decomposed into an equivalent expression that is used for processing the division.
  • the equivalent expression represents the fact that the division will be iteratively processed to solve for quotient 324 digits.
  • the division will be processed by the logic 321 to obtain new values for the current remainder, and a first integer portion of the current remainder will be a new or next quotient digit.
  • the iterative processing produces a new quotient 324 digit for each processing cycle.
  • Each new current remainder is determined by the logic 321 by evaluating the expression of a first product and subtracting the sum of a first integer portion of the current remainder minus a product represented by the first integer portion of the current remainder multiplied by 1 minus the scaled denominator.
  • the logic 321 in some embodiments for efficiency purposes, restates the scaled denominator as 1 minus the scaled denominator.
  • the logic 321 then stores this value within the processor 320 or memory accessible to the processor 320 during initialization. Since the revised scaled denominator is approximately 1, when the logic 321 stores the sum of 1 minus the scaled denominator one or more leading digits of the sum will be insignificant (e.g., zeros or ones by a factor of the radix). Thus, by performing this processing the required data width for calculating the high radix division is reduced from what has been conventionally achieved. This results in more efficient use of space within the processor 320 and reduces power consumption. Moreover, this permits microprocessor architectures to use standard cell libraries to perform very high radix division. Conversely, conventional architectures require specialized cell libraries to process very high radix division.
  • the logic 321 enters a processing loop to iteratively determine the quotient 324 for the very high radix division.
  • the first quotient digit is the first integer portion of the scaled numerator, since the scaled denominator is approximately 1.
  • the logic 321 can adjust the iteratively determined quotient digits by rounding them upward/downward based on digit values associated with the trailing portion of a current remainder (trailing portion of the scaled numerator for the first iteration).
  • the logic 321 determines a new value for a current remainder.
  • the logic 321 takes a first integer portion of the previous current remainder and multiplies it by the radix to acquire a first product.
  • the first integer portion of the previous current remainder is subtracted from the first product.
  • a second product is acquired by multiplying the saved and restated scaled denominator (1 - scaled denominator) by the first integer portion of the previous current remainder.
  • the second product is added to the first product to acquire a new value for the current remainder.
  • This new current remainder will then have its first integer portion used as a next quotient digit for the quotient 324 that is being iteratively assembled.
  • the new current remainder is recycled back in the logic 321 for another processing cycle where the new current remainder becomes the previous current remainder.
  • the logic 321 continues this processing until a desired precision is reached or until a current remainder value of 0 is obtained.
  • the first integer portion of the current remainder can be rounded upward/downward based on the trailing portion of the current remainder before providing the next quotient digit for the quotient 324.
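The interaction between the data structure 310 and the logic 321 can be sketched end to end as follows. Everything named here is a hypothetical stand-in: `TABLE` plays the role of data structure 310 (reciprocals of the integers 10-99 truncated to five decimal places), `lookup_factor` derives a factor for other denominators by shifting with the radix, and `divide` plays the role of the logic 321. Exact `Fraction` arithmetic and truncation instead of rounding are simplifying assumptions.

```python
import math
from fractions import Fraction

RADIX = 10
# Hypothetical stand-in for data structure 310: reciprocals of 10-99,
# truncated to five decimal places.
TABLE = {n: Fraction(10 ** 5 // n, 10 ** 5) for n in range(10, 100)}


def lookup_factor(denominator):
    """Derive a factor for any positive denominator by shifting with the radix."""
    value, shift = Fraction(denominator), Fraction(1)
    while value >= 100:
        value /= RADIX
        shift /= RADIX
    while value < 10:
        value *= RADIX
        shift *= RADIX
    return TABLE[math.floor(value)] * shift


def divide(numerator, denominator, max_digits):
    """Assemble a quotient digit by digit until max_digits or a zero remainder."""
    factor = lookup_factor(denominator)
    remainder = Fraction(numerator) * factor         # scaled numerator
    y_prime = 1 - Fraction(denominator) * factor     # 1 - scaled denominator, stored once
    quotient = Fraction(0)
    for j in range(1, max_digits + 1):
        first_product = RADIX * remainder
        q = math.floor(first_product)                # first integer portion
        remainder = (first_product - q) + q * y_prime
        quotient += Fraction(q, RADIX ** j)
        if remainder == 0:                           # exact result reached
            break
    return quotient
```

Under these assumptions `divide(3, 8, 10)` stops after three digits with exactly 3/8, because the stored reciprocal of 80 is exact, while `divide(3, 7, 6)` returns 0.428571, within one unit in the sixth decimal place of 3/7.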

Abstract

Methods, machines, and systems are provided for very high radix division using narrow data paths. A numerator and a denominator are received for a very high radix division calculation. An approximate reciprocal of the denominator is obtained from a data structure. The numerator and the denominator are pre-scaled by the reciprocal. The denominator is decomposed into an equivalent expression that yields a number of insignificant leading values. A quotient is iteratively assembled by modifying the remainder: a first product is formed and the equivalent expression is subtracted from it.
PCT/US2004/005664 2003-03-21 2004-02-25 Very high radix division WO2004095260A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/394,952 US7167891B2 (en) 2003-03-21 2003-03-21 Narrow data path for very high radix division
US10/394,952 2003-03-21

Publications (1)

Publication Number Publication Date
WO2004095260A1 (fr) 2004-11-04

Family

ID=32988505

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/005664 WO2004095260A1 (fr) 2003-03-21 2004-02-25 Very high radix division

Country Status (3)

Country Link
US (1) US7167891B2 (fr)
CN (1) CN1761938A (fr)
WO (1) WO2004095260A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7167891B2 (en) 2003-03-21 2007-01-23 Intel Corporation Narrow data path for very high radix division

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7499962B2 (en) * 2004-12-21 2009-03-03 Intel Corporation Enhanced fused multiply-add operation
US20060294177A1 (en) * 2005-06-27 2006-12-28 Simon Rubanovich Method, system and apparatus of performing division operations
US7830905B2 (en) * 2007-04-20 2010-11-09 Cray Inc. Speculative forwarding in a high-radix router
US8725786B2 (en) * 2009-04-29 2014-05-13 University Of Massachusetts Approximate SRT division method
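CN102156625B (zh) * 2011-03-31 2012-11-21 Peking University Method for performing division calculation using resistive switching devices
CN104731551B (zh) * 2013-12-23 2018-02-16 Zhejiang Dahua Technology Co., Ltd. Method and device for performing division operations based on an FPGA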
US10209959B2 (en) 2016-11-03 2019-02-19 Samsung Electronics Co., Ltd. High radix 16 square root estimate
US10447983B2 (en) * 2017-11-15 2019-10-15 Nxp Usa, Inc. Reciprocal approximation circuit
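CN111813372B (zh) * 2020-07-10 2021-05-18 Shanghai Qingkun Information Technology Co., Ltd. Method and device for implementing 32-bit integer division with high precision and low latency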

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4337519A (en) * 1979-02-01 1982-06-29 Tetsunori Nishimoto Multiple/divide unit
GB2296350A (en) * 1994-12-21 1996-06-26 Advanced Risc Mach Ltd Data processing divider enabling reduced interrupt latency

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5020017A (en) * 1989-04-10 1991-05-28 Motorola, Inc. Method and apparatus for obtaining the quotient of two numbers within one clock cycle
EP0530936B1 (fr) * 1991-09-05 2000-05-17 Cyrix Corporation Method and apparatus for performing prescaled division
US6240338B1 (en) * 1995-08-22 2001-05-29 Micron Technology, Inc. Seed ROM for reciprocal computation
US6078939A (en) 1997-09-30 2000-06-20 Intel Corporation Apparatus useful in floating point arithmetic
US6141670A (en) 1997-09-30 2000-10-31 Intel Corporation Apparatus and method useful for evaluating periodic functions
JP3551113B2 (ja) * 2000-02-07 2004-08-04 NEC Corporation Divider
US6782405B1 (en) * 2001-06-07 2004-08-24 Southern Methodist University Method and apparatus for performing division and square root functions using a multiplier and a multipartite table
FI20011610A0 (fi) * 2001-08-07 2001-08-07 Nokia Corp Method and apparatus for performing division
US7127483B2 (en) * 2001-12-26 2006-10-24 Hewlett-Packard Development Company, L.P. Method and system of a microprocessor subtraction-division floating point divider
US7167891B2 (en) 2003-03-21 2007-01-23 Intel Corporation Narrow data path for very high radix division

Also Published As

Publication number Publication date
CN1761938A (zh) 2006-04-19
US7167891B2 (en) 2007-01-23
US20040186873A1 (en) 2004-09-23

Similar Documents

Publication Publication Date Title
Ko et al. Design and application of faithfully rounded and truncated multipliers with combined deletion, reduction, truncation, and rounding
EP0411491B1 (fr) Method and apparatus for performing division using a rectangular aspect ratio multiplier
US20120259904A1 (en) Floating point format converter
US5307303A (en) Method and apparatus for performing division using a rectangular aspect ratio multiplier
US20040098440A1 (en) Multiplication of multi-precision numbers having a size of a power of two
US4949296A (en) Method and apparatus for computing square roots of binary numbers
US5951629A (en) Method and apparatus for log conversion with scaling
US6782405B1 (en) Method and apparatus for performing division and square root functions using a multiplier and a multipartite table
KR102581403B1 (ko) Shared hardware logic unit and method for reducing its die area
US7167891B2 (en) Narrow data path for very high radix division
US5060182A (en) Method and apparatus for performing the square root function using a rectangular aspect ratio multiplier
US7711764B2 (en) Pipelined real or complex ALU
Murillo et al. Energy-efficient MAC units for fused posit arithmetic
WO2021217034A1 (fr) Design of high-performance and scalable Montgomery modular multiplier circuits
US20040167956A1 (en) Method and apparatus for executing division
Rodriguez-Garcia et al. Fast fixed-point divider based on Newton-Raphson method and piecewise polynomial approximation
US6182100B1 (en) Method and system for performing a logarithmic estimation within a data processing system
KR100433131B1 (ko) Pipelined divider having a small-size lookup table, and operation method
CN108334304B (zh) Digit recurrence division
KR102332323B1 (ko) Radix-16 PD table implemented with radix-4 PD tables
Erdem et al. A less recursive variant of Karatsuba-Ofman algorithm for multiplying operands of size a power of two
Murillo et al. A suite of division algorithms for posit arithmetic
Patankar et al. Division algorithms-From Past to Present Chance to Improve Area Time and Complexity for Digital Applications
CN115827555A (zh) Data processing method, computer device, storage medium, and multiplier structure
Walke High sample-rate Givens rotations for recursive least squares

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 20048077421

Country of ref document: CN

122 Ep: pct application non-entry in european phase