CN1965292A - Complex logarithmic ALU

Info

Publication number: CN1965292A
Application number: CN 200580018166
Authority: CN
Inventors: Paul Wilkinson Dent
Original assignee: Telefonaktiebolaget LM Ericsson AB
Current assignee: Telefonaktiebolaget LM Ericsson AB
Priority date: 2004-06-04
Filing date: 2005-06-02
Publication date: 2007-05-16
Also published as: CN1965486B; CN1965486A

Abstract

The present invention relates to an ALU that implements logarithmic arithmetic, and describes a method and apparatus for performing logarithmic arithmetic with real and/or complex numbers represented in a logarithmic format. In one exemplary embodiment, an ALU implements logarithmic arithmetic on complex numbers represented in a logpolar format. According to this embodiment, memory in the ALU stores a look-up table used to determine logarithms of complex numbers, while a processor in the ALU generates an output logarithm based on complex input operands represented in logpolar format using the stored look-up table. In another exemplary embodiment, the ALU performs logarithmic arithmetic on real and complex numbers represented in logarithmic format. In this embodiment, the memory stores two look-up tables, one for determining logarithms of real numbers and one for determining logarithms of complex numbers, while the processor generates an output logarithm based on real or complex input operands represented in logarithmic format using the real or complex look-up tables, respectively.

Description

Complex logarithmic ALU

Technical field

The present invention relates generally to computing and digital signal processing, and more particularly to pipelined logarithmic arithmetic in an arithmetic logic unit (ALU).

Background technology

ALUs have traditionally been used to implement various arithmetic functions, such as addition, subtraction, multiplication, and division, on real and/or complex numbers. Conventional systems use either fixed-point or floating-point ALUs. ALUs that use real logarithms of limited precision are also known. See, for example, "Digital filtering using logarithmic arithmetic" (N. G. Kingsbury and P. J. W. Rayner, Electron. Lett. (Jan. 28, 1971), Vol. 7, No. 2, pp. 56-58). "Arithmetic on the European Logarithmic Microprocessor" (J. N. Coleman, E. I. Chester, C. I. Softley and J. Kadlec, IEEE Trans. Comput. (July 2000), Vol. 49, No. 7, pp. 702-715) provides another example of a high-precision (32-bit) logarithmic unit for real numbers.

Fixed-point programming places a mental burden on the programmer, in particular that of determining where the decimal point lies after a multiplication or division. For example, suppose an FIR filter involves the weighted addition of signal samples using weighting factors of -0.607, 1.035, -0.607, ..., which must be specified to a precision of one part in a thousand. In fixed-point arithmetic, 1.035 must be represented by, for example, 1035. As a result, multiplying a signal sample by this number extends the word length of the result by 10 bits. To store the result in a memory word of the same length, 10 bits must then be discarded. However, whether the MSBs (most significant bits), the LSBs (least significant bits), or some of each should be discarded depends on the signal data spectrum, and must be determined by simulation with realistic data. This makes verifying correct programming laborious.

Floating-point processors were introduced to overcome the inconvenience of locating the point, by determining it automatically by means of an "exponent" part associated with the "mantissa" part of each stored number. The IEEE standard floating-point format is:

SEEEEEEEE.MMMMMMMMMMMMMMMMMMMMMMM

where S is the sign of the value (0 = +, 1 = -), EEEEEEEE is an 8-bit exponent, and MMM...MM is a 23-bit mantissa. In the IEEE standard floating-point format, the 24th most significant bit of the mantissa is always 1 (except for true zero) and is therefore omitted. In the IEEE format, the actual value of the mantissa is thus:

1.MMMMMMMMMMMMMMMMMMMMMMM

For example, the number -1.40625 × 10^-2 = -1.8 × 2^-7 can be represented in the IEEE standard format as:

1 01111000.11001100110011001100110

Further, the zero exponent is 01111111, so the number +1.0 can be written as:

0 01111111.00000000000000000000000

Representing true zero would require a negative-infinity exponent, which is impractical. An artificial zero is therefore created by interpreting the all-zeros bit pattern as true zero rather than as 2^-127.
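The field layout described above can be checked with a short sketch. The following Python example is illustrative only and is not part of the patent: the helper names `ieee754_fields` and `ieee754_value` are hypothetical, and the decoder deliberately ignores denormals, infinities, and NaNs.

```python
import struct

def ieee754_fields(value):
    """Split a number, stored as a 32-bit IEEE float, into S, E and M bit strings."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    text = format(bits, "032b")
    return text[0], text[1:9], text[9:]

def ieee754_value(sign, exponent, mantissa):
    """Rebuild the value from the three fields, restoring the omitted leading 1.

    Simplified: denormals, infinities and NaNs are not handled.
    """
    e = int(exponent, 2)
    m = int(mantissa, 2)
    if e == 0 and m == 0:
        return 0.0  # the all-zeros pattern is interpreted as true zero
    significand = 1 + m / 2**23  # 1.MMM...M
    return (-1) ** int(sign) * significand * 2.0 ** (e - 127)

# -1.40625e-2 = -1.8 * 2**-7 and +1.0, as in the two examples above
print(" ".join(ieee754_fields(-0.0140625)))
print(" ".join(ieee754_fields(1.0)))
```

The exponent field of the first value is -7 + 127 = 120, and the mantissa field holds the leading 23 fractional bits of 1.8.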

To multiply two floating-point numbers, the mantissas, with their suppressed MSB 1's restored, are multiplied using a 24 × 24-bit fixed-point multiplier, which is logic of moderately high complexity and delay. In parallel, the exponents are added and one of the offsets of 127 is subtracted. The 48-bit product must then be truncated to 24 bits, and the most significant 1 deleted after left-justification. Multiplication is thus even more complicated for floating-point numbers than for fixed-point numbers.

To add two floating-point numbers, their exponents must first be subtracted to see whether their points are aligned. If the points are not aligned, the smaller number is right-shifted by a number of binary places equal to the exponent difference, so that the points are aligned before the mantissas, with the implied 1's restored, are added. To perform the shift quickly, a barrel shifter may be used, which is similar in structure and complexity to a fixed-point multiplier. After adding, and more particularly after subtracting, leading zeros must be left-shifted out of the mantissa and the exponent adjusted accordingly. Addition and subtraction are therefore also complicated operations in floating-point arithmetic.

In a purely linear format, addition and subtraction of fixed-point numbers are simple, but multiplication, division, squares, and square roots are more complicated. Multipliers are constructed as a sequence of "shift and conditionally add" circuits, which inherently have a large number of logic delays. Fast processors may use pipelining to overcome this delay, but this typically complicates programming. It is therefore of interest to minimize the pipelining delay in a fast processor.

It should be noted that the floating-point representation is a hybrid between logarithmic and linear representations. The exponent is the whole part of the base-2 logarithm of the number, while the mantissa is a linear fractional part. Because multiplication is complicated for linear representations and addition is complicated for logarithmic representations, both are complicated for the hybrid floating-point representation. To overcome this, some known systems, such as those cited above, have used a purely logarithmic representation. This solves the problem of locating the point and simplifies multiplication, leaving only addition complicated. In the prior art, logarithmic addition was performed using look-up tables. However, limitations on the size of the tables restricted this solution to limited word lengths, for example to the range of 0-24 bits. In the Coleman reference cited above, 32-bit precision was achieved with look-up tables of reasonable size by using an interpolation technique that requires a multiplier. The Coleman process therefore still includes the complexities associated with multiplication.

While the prior art describes various methods and apparatus for implementing real logarithmic arithmetic, it does not provide a look-up table solution for complex arithmetic, which is useful in radio signal processing. Further, the prior art does not provide an ALU having shared real and complex processing capabilities. Because radio signal processing typically requires both complex and real processing capabilities, a single ALU that implements both real and complex logarithmic arithmetic would be beneficial in wireless communication devices with size and/or power constraints.

Summary of the invention

The present invention relates to an arithmetic logic unit (ALU) that performs arithmetic operations on real and/or complex numbers represented in a logarithmic format. Representing numbers logarithmically simplifies multiplication and division, but makes addition and subtraction more difficult. However, the logarithm of the sum or difference of two input operands can be simplified using known algorithms, as discussed below. In the following discussion, it is assumed that a > b and that c = a + b. It can be shown that:

C = \log_q(c) = \log_q(a + b) = A + \log_q(1 + q^{-r}) \qquad (1)

where q is the base of the logarithm, r = A - B, A = log_q(a), and B = log_q(b). The operation represented by equation (1), referred to herein as logadd, allows the logarithm of the sum of a and b to be computed using only addition and subtraction, where the value of log_q(1 + q^-r) is determined using a look-up table.
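Equation (1) can be illustrated with a minimal Python sketch. This is not the patented implementation: the function name `logadd` is hypothetical, and the term log_q(1 + q^-r) is evaluated directly with the math library here, whereas the ALU described in this document would obtain it from a look-up table.

```python
import math

def logadd(A, B, q=2.0):
    """Return log_q(a + b), given A = log_q(a) and B = log_q(b)."""
    if A < B:
        A, B = B, A          # make A the logarithm of the larger operand
    r = A - B                # r >= 0
    # In hardware this correction term would come from a look-up table.
    return A + math.log(1 + q ** (-r), q)

# log10(3) and log10(2) in, log10(5) out
C = logadd(math.log10(3), math.log10(2), q=10.0)
print(C, math.log10(5))
```

Only one subtraction (to form r), one table access, and one addition are involved once the correction term is tabulated.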

In one exemplary embodiment, the present invention provides an ALU for performing logarithmic operations on complex input operands represented in a logpolar format. For example, A = log_q(a) = (R₁, θ₁) and B = log_q(b) = (R₂, θ₂), where R and θ represent a logmagnitude and a phase angle, respectively, as discussed further below. According to this embodiment, the ALU comprises memory and a processor. The memory stores a look-up table used to determine logarithms of complex numbers represented in the logpolar format, while the processor uses the stored look-up table to generate an output logarithm from the complex input operands represented in the logpolar format.

In another exemplary embodiment, the present invention provides an ALU for performing logarithmic operations on both real and complex numbers represented in a logarithmic format. An exemplary ALU according to this embodiment also comprises memory and a processor. The memory stores two look-up tables, one for determining logarithms of real numbers and one for determining logarithms of complex numbers. The processor comprises a shared processor that generates an output logarithm based on input operands represented in a logarithmic format, using the real look-up table for real input operands and the complex look-up table for complex input operands.

In either case, according to one exemplary embodiment of the present invention, the processor may comprise a butterfly circuit configured to simultaneously generate output logarithms for both logarithmic addition (logadd) and logarithmic subtraction (logsub) operations. According to another exemplary embodiment, the processor may comprise a look-up controller and an output accumulator, where the look-up controller computes one or more partial outputs based on the look-up table. The partial outputs may be determined during one or more iterations, or during one or more stages of a pipeline. The output accumulator generates the output logarithm based on the partial outputs.

Description of drawings

Fig. 1 illustrates a plot comparing the IEEE floating-point format with a true logarithmic format for real numbers.

Fig. 2 illustrates a chart comparing the IEEE floating-point format with true logarithmic formats for real numbers.

Fig. 3 illustrates a block diagram of a linear interpolator.

Fig. 4 illustrates a plot comparing the true F functions with an exponential approximation.

Figs. 5A and 5B illustrate the quantization regions for the logpolar and Cartesian representations, respectively.

Fig. 6 illustrates a block diagram of one exemplary ALU for performing logadd and logsub operations simultaneously.

Fig. 7 illustrates a 16-point FFT implemented using the ALU of Fig. 6.

Fig. 8 illustrates a block diagram of an exemplary ALU according to the present invention.

Fig. 9 illustrates a block diagram of an exemplary look-up controller for the ALU of Fig. 8.

Fig. 10 illustrates additional details of an exemplary ALU according to the present invention.

Figs. 11A-11C illustrate the distribution of differences for complex numbers relative to real numbers.

Embodiment

The present invention provides an ALU for performing logarithmic arithmetic on complex and/or real numbers in a logarithmic format. In one embodiment, the ALU performs logarithmic arithmetic on complex numbers represented in a logpolar format using one or more look-up tables. In another embodiment, the ALU performs logarithmic arithmetic on both complex and real numbers represented in a logarithmic format, using at least one complex look-up table and at least one real look-up table, respectively. To better explain the details and benefits of the invention, the following first provides details regarding number representation, conventional interpolation, iterative logarithmic operations, high-precision iterative logarithmic addition, high-precision iterative logarithmic subtraction, and exponential approximation.

Numeric representation

Logarithmic operations implemented in an ALU generally require a specific number format. As discussed above, conventional processors may format real or complex numbers in a fixed-point binary format or a floating-point format. The fixed-point format is a purely linear format, so addition and subtraction of fixed-point numbers are simple, while multiplication is more complicated. Floating-point numbers are a hybrid between logarithmic and linear representations, so addition, subtraction, multiplication, and division are all complicated in floating-point format. To overcome some of the difficulties associated with these formats, a purely logarithmic format may be used together with an appropriate algorithm that resolves the addition and subtraction problems associated with the logarithmic format. The following provides additional details on the purely logarithmic format as it may apply to the present invention.

A real number in a purely logarithmic format may be abbreviated as (S8.23) and expressed as:

S xxxxxxxx.xxxxxxxxxxxxxxxxxxxxxxx

Two such real numbers may be used as one way to represent a complex number. However, as described further below, a logpolar format may be a more advantageous way to represent complex numbers.

The base used for the logarithms is arbitrary. However, there can be advantages in choosing one base over another. Choosing base 2, for example, has a number of advantages. First, as shown in equation (2), a 32-bit purely logarithmic format then looks substantially identical to the (S8.23) IEEE floating-point representation.

Pure logarithmic: S xx...xx.xx...xx ↔ (-1)^S × 2^(-xx...xx.xx...xx)

IEEE: S EE...EE.MM...MM ↔ (-1)^S × (1 + 0.MM...MM) × 2^(-EE...EE)    (2)

The integer part of the base-2 logarithm may be offset by 127, as in the IEEE format, so that the value 1.0 is represented in either format by:

0 01111111.00000000000000000000000

Alternatively, an offset of 128 may be used, in which case 1.0 is represented by:

0 10000000.00000000000000000000000

Whether 127 or 128 is used as the preferred offset is a matter of implementation.

As with the IEEE floating-point format, the all-zeros pattern may be defined as an artificial true zero. In fact, if the same exponent offset (127) is used, the purely logarithmic format coincides with the IEEE format for all numbers that are a power of two, e.g., 4, 2, 1, 0.5, etc., and the mantissa parts differ only slightly between powers of two, as shown in Fig. 1.

For the purely logarithmic format, the largest representable value is:

0 11111111.11111111111111111111111

For base 2, this represents a logarithm of almost 256 minus the offset of 127, i.e., a number of almost 2^129 or 6.81 × 10^38.

The smallest representable value is:

0 00000000.00000000000000000000000

For base 2, this represents a logarithm equal to -127, i.e., a number of 5.88 × 10^-39. If desired, the all-zeros pattern may, as in the IEEE case, be reserved to represent an artificial true zero. In that case, the smallest representable number is:

0 00000000.00000000000000000000001

which is a base-2 logarithm equal to almost -127, and which still corresponds to approximately 5.88 × 10^-39.

The quantization accuracy of the IEEE mantissa, which has a value between 1 and 2, is the LSB value of 2^-23, i.e., a relative accuracy of between 2^-23 and 2^-24 (0.6 to 1.2 × 10^-7). The accuracy of representing a number x in the base-2 logarithmic format is a constant 2^-23 in the logarithm, which gives dx/x = log_e(2) × 2^-23 or 0.83 × 10^-7, slightly better than the average IEEE quantization accuracy.
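The relationship between the base-2 logarithmic format and the IEEE format can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions rather than the format as implemented in the ALU: it assumes an offset of 127 and an unsigned 31-bit logmagnitude, and the names `encode_log2`, `decode_log2`, and `ieee_bits` are hypothetical.

```python
import math
import struct

OFFSET = 127      # assumed exponent offset, as in the IEEE format
FRAC_BITS = 23    # bits to the right of the point in (S8.23)

def encode_log2(value):
    """Pack a non-zero number as sign bit plus offset base-2 logmagnitude."""
    if value == 0:
        raise ValueError("true zero needs a reserved bit pattern")
    sign = 1 if value < 0 else 0
    logmag = round((math.log2(abs(value)) + OFFSET) * 2**FRAC_BITS)
    if not 0 <= logmag < 2**31:
        raise OverflowError("value outside the representable range")
    return (sign << 31) | logmag

def decode_log2(word):
    """Recover the number represented by a 32-bit (S8.23) logarithmic word."""
    sign = -1.0 if word >> 31 else 1.0
    logmag = (word & 0x7FFFFFFF) / 2**FRAC_BITS - OFFSET
    return sign * 2.0 ** logmag

def ieee_bits(value):
    """Raw 32-bit pattern of value stored as an IEEE single-precision float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    return bits

for v in (4.0, 2.0, 1.0, 0.5):
    print(v, format(encode_log2(v), "032b"), encode_log2(v) == ieee_bits(v))

print(decode_log2(0x7FFFFFFF))       # largest value, just under 2**129
print(decode_log2(0x00000000))       # 2**-127 if the pattern is not reserved
print(math.log(2) * 2**-FRAC_BITS)   # relative step dx/x of the format
```

For powers of two the two bit patterns coincide, which is the property noted above for Fig. 1.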

In another implementation, logarithms to other bases, such as base e, may be used. For base e, real numbers may be stored in a 32-bit sign plus logmagnitude format denoted by:

S xxxxxxx.xxxxxxxxxxxxxxxxxxxxxxxx

or (S7.24) for short. Because the base is larger (e = 2.718), fewer bits to the left of the point suffice to give an adequate dynamic range, while one extra bit to the right of the point is needed for equivalent or better precision, as discussed further below.

The logmagnitude part may be a signed fixed-point quantity in which the leftmost bit is the sign bit, which is not to be confused with the sign S of the represented number. Alternatively, the logmagnitude part may be offset by +64 (or +63), so that the bit pattern:

0 1000000.000000000000000000000000

represents a logarithm of zero (number = 1.0). In the latter case, the largest representable number has the base-e logarithm:

0 1111111.111111111111111111111111

which is almost 128 minus the offset of 64, i.e., e^64 or 6.24 × 10^27, while its reciprocal is the smallest representable number. Equation (3) gives the quantization accuracy of the base-e logarithmic representation.

\frac{dx}{x} = 2^{-24} = 0.6 \times 10^{-7} \qquad (3)

Fig. 2 compares the IEEE floating-point format (with an offset of +127) with the base-e format (with an offset of +64) and the base-2 format (with an offset of +127).

Choosing the base is in effect equivalent to choosing a trade-off between dynamic range and precision within a fixed word length, and is equivalent to moving the point in steps of less than one whole bit. Choosing a base of 2, 4, or √2 (in general 2^(2^N), where N is a positive or negative integer) is equivalent to moving the point plus or minus N bit positions while giving identical performance. Choosing a base of 8, however, is not equivalent to moving the point a whole number of places, because it divides the logarithm by 3. In other words, selecting the logarithm base is mathematically equivalent to changing the split of bits between the right and the left of the binary point, which alters the compromise between accuracy and dynamic range. The point can only be shifted in whole steps, however, while the base can be varied continuously. In the case of a signed logmagnitude (as opposed to an unsigned, 127-offset logmagnitude), the sign bit is distinguished from the sign of the number (the S bit) by referring to it as the sign of the logmagnitude. To clarify this further, consider base-10 logarithms, where log₁₀(3) = 0.4771 and log₁₀(1/3) = -0.4771. To indicate a value of +3 in logarithmic notation, the signs of both the number and its logarithm are +, which may be written ++0.4771. The following table illustrates this notation.

Notation	Representation
++0.4771	+3 in log base 10
+-0.4771	+1/3 in log base 10
-+0.4771	-3 in log base 10
--0.4771	-1/3 in log base 10

To ensure that all logarithmic representations are positive, an offset representation may be used. For example, if quantities are instead represented by the logarithm of how many times larger they are than a selected number, e.g., 0.0001, the representation of 3 is log₁₀(3/0.0001) = 4.4771 and the representation of 1/3 is log₁₀(0.3333/0.0001) = 3.5229. Because of the offset, both are now positive. The representation of 0.0001 is log₁₀(0.0001/0.0001) = 0. The all-zeros bit pattern then represents the smallest possible quantity, 0.0001.

Traditional log tables require storing 10,000 numbers for logarithms between 0.0000 and 0.9999 to look up the antilogarithm, and a similar number to obtain the logarithm to the same precision. Logarithmic identities may be used to reduce the size of the look-up tables. For example, log₁₀(3) = 0.4771 and log₁₀(2) = 0.3010. From this it can be immediately deduced that:

\log_{10}(6) = \log_{10}(2 \times 3) = \log_{10}(3) + \log_{10}(2) = 0.4771 + 0.3010 = 0.7781

It can also be immediately deduced that:

\log_{10}(1.5) = \log_{10}(3/2) = \log_{10}(3) - \log_{10}(2) = 0.4771 - 0.3010 = 0.1761

However, the following cannot be immediately deduced by any simple manipulation of the given numbers 0.4771 and 0.3010:

\log_{10}(5) = \log_{10}(2 + 3) = 0.6990

Still less obvious is how to deduce the following from the logarithms of 3 and 2:

\log_{10}(1) = \log_{10}(3 - 2) = 0

To solve this problem, a look-up table based on a logadd function F_a may be used. For example, the logarithm of (2 + 3) may be obtained by adding the larger of log₁₀(3) and log₁₀(2), that is 0.4771, to a function of their difference, F_a[log₁₀(3) - log₁₀(2)] = F_a(0.1761), where for base 10:

F_a(x) = \log_{10}(1 + 10^{-x}) \qquad (4)

Similarly, the logarithm of 3 - 2 may be obtained by adding the function F_s(0.1761), which is negative, to the larger of log₁₀(3) and log₁₀(2), where for base 10, F_s(x) is:

F_s(x) = \log_{10}(1 - 10^{-x}) \qquad (5)

However, the look-up tables for F_a(x) and F_s(x) still require storing at least 10,000 numbers for each function.
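As a numerical check of equations (4) and (5), the following Python sketch evaluates both functions directly and applies them to the four-decimal logarithms of 3 and 2 used above. It is an illustration only: the functions are computed with the math library rather than read from a table, and the variable names are hypothetical.

```python
import math

def F_a(x):
    """Logadd function for base 10, equation (4)."""
    return math.log10(1 + 10 ** (-x))

def F_s(x):
    """Logsub function for base 10, equation (5); requires x > 0."""
    return math.log10(1 - 10 ** (-x))

log3, log2 = 0.4771, 0.3010      # four-decimal table values from the text
larger = max(log3, log2)
x = abs(log3 - log2)             # 0.1761

log_sum = larger + F_a(x)        # approximates log10(3 + 2) = 0.6990
log_diff = larger + F_s(x)       # approximates log10(3 - 2) = 0

print(round(log_sum, 4), round(log_diff, 4))
```

Because F_s(x) is negative for every x > 0, adding it lowers the result below the logarithm of the larger operand, as expected for a difference.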

Method of interpolation

Interpolation may be used to reduce the number of values that must be stored in the look-up table. To facilitate the later discussion, interpolation is examined in more detail below. Base e is used to simplify the illustration; however, it will be appreciated that other bases are equally applicable.

To compute the function F_a(x) = log_e(1 + e^-x) using a limited number of tabular values, denoted by x₀, a Taylor/Maclaurin expansion of the function F(x) about the tabular point x₀ gives:

F(x) = F(x_0) + (x - x_0) F'(x_0) + 0.5 (x - x_0)^2 F''(x_0) + \dots \qquad (6)

where ' signifies the first derivative, " signifies the second derivative, and so on. Based on this expansion, log_e(c) = log_e(a + b) may be computed as log_e(a) + F_a(x), where x = log_e(a) - log_e(b), and where the values at x₀ are provided in a table.

To use simple linear interpolation for the 32-bit base-e case, the second-order term involving the second derivative F" must be negligible to the 24th binary place, e.g., less than 2^-25. Differentiating F_a(x) = log_e(1 + e^-x) yields:

F_a'(x) = \frac{-e^{-x}}{1 + e^{-x}} \qquad (7)

F_a''(x) = \frac{e^{-x}}{(1 + e^{-x})^{2}}

F_a"(x) peaks at 0.25 when x = 0. The second-order term is therefore less than 2^-25 when (x - x₀) < 2^-11. To meet this requirement, the most significant bits address the tabular points x₀ in the format (5.11), i.e.,

xxxxx.xxxxxxxxxxx

so that the remainder dx = x - x₀ has the form:

0.00000000000xxxxxxxxxxxxx,

and is thus less than 2^-11. As such, dx is a 13-bit quantity and x₀ is a 16-bit quantity.

The accuracy of the linear interpolation term involving F_a'(x₀) must also be of the order of 2^-25. Because F_a'(x₀) is multiplied by dx, which is less than 2^-11, the accuracy of F_a'(x₀) must be 2^-14. An extra couple of LSBs may be provided in the table for F_a(x₀) to help reduce rounding errors, which suggests that a look-up table 5 bytes (40 bits) wide is required to store both F and F' for each x₀ value.

The tabular values therefore comprise 2^16 = 65,536 values of 26-bit F_a and the same number of corresponding 14-bit F_a' values. In addition, a 14 × 13-bit multiplier is required to form dx·F_a'. Such a multiplier inherently performs 13 shift-and-add operations, and thus includes approximately 13 logic delays. The complexity and delay of the multiplier may be reduced somewhat by using Booth's algorithm; however, the conventional multiplier may be used as a benchmark.
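The error bound derived above can be checked numerically. The following Python sketch is a rough illustration under simplifying assumptions: it evaluates F_a and F_a' in double precision at each tabular point instead of reading quantized 26-bit and 14-bit table entries, and the names `interp_F_a` and `worst_error` are hypothetical.

```python
import math

FRAC = 11  # x0 is addressed in (5.11) format, so dx = x - x0 < 2**-11

def F_a(x):
    """Logadd function for base e."""
    return math.log(1 + math.exp(-x))

def dF_a(x):
    """First derivative of F_a, equation (7)."""
    return -math.exp(-x) / (1 + math.exp(-x))

def interp_F_a(x):
    """First-order estimate of F_a(x) from the tabular point x0 just below x."""
    x0 = math.floor(x * 2**FRAC) / 2**FRAC
    dx = x - x0
    return F_a(x0) + dx * dF_a(x0)

# Sample the range covered by the 5-bit integer part and record the worst error.
samples = (k * 0.000977 for k in range(1, 30000))
worst_error = max(abs(interp_F_a(x) - F_a(x)) for x in samples)

print(worst_error, 2**-25)
print(2**16 * 40)  # table size in bits for 65,536 entries of 40 bits
```

The neglected second-order term is at most 0.5 × (2^-11)^2 × 0.25 = 2^-25, which is why the observed worst-case error stays within that bound.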

Fig. 3 illustrates an exemplary block diagram of a conventional ALU that implements the linear interpolation described above. The ALU of Fig. 3 estimates the value C = log_e(a + b) using subtracter 10, adder 20, F_a/F_a' look-up table 30, multiplier 40, and subtracter 50. As used in this example, A = log_e(a) and B = log_e(b). Because backward interpolation may be needed for subtraction to avoid the singularity discussed below, Fig. 3 illustrates interpolation from X_M, a value of x₀ that is one more than the most significant 16-bit part of x. The look-up table 30 for F_a contains the value of F_a at X_M + 1, and the stored value of F_a' may be the value at the midpoint of the interval, namely the value of F_a' computed at X_M + 0.5. Multiplier 40 multiplies the 14-bit F_a'(X_M) value by the 13-bit two's complement of the least significant 13 bits of x, and is configured so that the result is the 27-bit product of the two.

The LSB of the 27-bit product may be input as the borrow to subtracter 50, and the remaining 26 bits subtracted from the 26-bit F_a(X_M) value to yield the interpolated value to 26 bits. This value is then added to the larger of A and B in output adder 20, with the result C rounded up to 31 bits of logmagnitude by means of a carry-in bit of "1".

A practical 32-bit logadder based on linear interpolation therefore comprises a look-up table 30 of approximately 65,536 × 40 = 2.62 Mbits and a 13 × 14-bit multiplier 40. These components consume significant silicon area and offer no speed advantage in terms of logic delays. Moreover, addressing subtraction or complex arithmetic operations with the interpolation method requires substantial adjustments to the word lengths and the multiplier configuration.

For example, to implement subtraction using interpolation, function values are determined according to the subtraction function, given by:

F_s(x) = \log_e(1 - e^{-x}) \qquad (8)

The Taylor/Maclaurin expansion of F_s(x) involves the first derivative:

F_s'(x) = \frac{e^{-x}}{1 - e^{-x}} \qquad (9)

which tends to infinity as x tends to 0. To keep operations away from this singularity, the function may be interpolated backward from a tabular value one LSB greater than the actual value of x = log_e(a) - log_e(b) (when a > b), according to equation (10):

F_s(x) = F_s(x_0) - (x_0 - x) F_s'(x_0) \qquad (10)

This is the implementation illustrated in Fig. 3 for logadd. Then, when at least the most significant bits of x are zero, x₀ is one LSB greater in value, just avoiding the singularity.

With the same 16/13-bit split as for addition, the minimum value of x₀ is 2^-11, and the maximum magnitude of F_s' is then approximately 2,048. The value of F_s' is therefore 12 bits longer than its logadd counterpart, which increases the size of the multiplier that forms dx·F_s' to a 13 × 26-bit device.

In light of the above, the synergy between real addition and real subtraction, and with complex operations, is limited in ALUs that implement interpolation. Furthermore, interpolation requires both look-up tables and multiplications, which makes traditional interpolation undesirably complicated to implement in hardware logic.
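The growth of F_s' near the singularity, and the backward interpolation of equation (10), can be illustrated with a short Python sketch. It is a simplified illustration rather than the circuit of Fig. 3: functions are evaluated in double precision instead of being read from a table, and the names `dF_s` and `backward_interp_F_s` are hypothetical.

```python
import math

FRAC = 11  # same 16/13-bit split as for addition

def F_s(x):
    """Logsub function for base e, equation (8); requires x > 0."""
    return math.log(1 - math.exp(-x))

def dF_s(x):
    """First derivative of F_s, equation (9)."""
    return math.exp(-x) / (1 - math.exp(-x))

def backward_interp_F_s(x):
    """Interpolate backward from the tabular point one LSB above x, equation (10)."""
    x0 = (math.floor(x * 2**FRAC) + 1) / 2**FRAC   # always at least 2**-11
    return F_s(x0) - (x0 - x) * dF_s(x0)

print(dF_s(2**-FRAC))                      # close to 2048 at the smallest x0
print(backward_interp_F_s(0.3), F_s(0.3))  # interpolated versus direct value
```

Since the smallest tabular point is 2^-11, the derivative never has to be evaluated at x₀ = 0, and its largest magnitude is about 2^11, i.e., roughly 12 more bits than the logadd derivative, whose magnitude never exceeds 0.5.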

Iterative logarithmic arithmetic

As an alternative to the interpolation process described above, and to reduce the storage requirements, an iterative solution may be used. The iterative solution uses two relatively small look-up tables to compute a logarithmic output by an iterative process based on tabulated functions. To illustrate the iterative solution, a decimal example is provided that shows how log₁₀(5) = log₁₀(3 + 2) and log₁₀(1) = log₁₀(3 - 2) may be deduced from log₁₀(3) = 0.4771 and log₁₀(2) = 0.3010.

The logadd function table, also referred to herein as the F_a-table, stores 50 values based on equation (4) for base 10, for values of x between 0.0 and 4.9 in steps of 0.1. Another table, referred to herein as the correction table or G-table, stores 99 values for values of y between 0.001 and 0.099 in steps of 0.001, based on:

G(y) = -\log_{10}(1 - 10^{-y}) \qquad (11)

The following illustrates the two-table iterative process for the above log(5) = log(3 + 2) example using these two look-up tables. While the following is described in terms of base 10, those skilled in the art will appreciate that any base may be used. For embodiments using a base other than 10, it will be appreciated that while equations (4) and (11) define the function and correction tables, respectively, for base-10 calculations, equation (12) defines the function and correction tables generically for any base q.

F_a(x) = \log_q(1 + q^{-x})

G(y) = -\log_q(1 - q^{-y}) \qquad (12)

For the logarithm addition process, at first, independent variable x=A-B=log ₁₀(3)-log ₁₀(2)=0.1761 round-up is to nearest tenths 0.2.According to F with 50 values _aTable, we find F _a(0.2)=0.2124.With 0.2124 and 0.4771 addition, the result is at log ₁₀(2+3) first be approximately 0.6895.It is 0.0239 that x is rounded up to 0.2 error amount that causes from 0.1761.Therefore this error will never, use the correction tracing table G (y) of 99 values greater than 0.099.For modified value y=0.0239, round-up to 0.024, the G table provides modified value 1.2695.Merge G (y)=1.2695 and value F according to first tracing table _a(0.2)=(0.2124) and the original value (0.1761) of x, generates at F _aNew independent variable be x '=1.658.It will be appreciated by those skilled in the art that the apostrophe that limits x in this case do not represent differentiate.

When round-up during to nearest tenths, x '=1.7.F _a(1.7)=0.0086, its with at log ₁₀During (2+3) first approximate 0.6895 addition, provide second approximate 0.6981.Error 1.658 round-ups to 1.7 time is 0.042.In G table, search y=0.042, the value of providing 1.035, its with previous F _aValue 0.0086 and cause a new x value during with x '=1.658 additions, x "=2.7016.X " after the round-up to 2.8, is being used F _aTable generates F _a(2.8)=0.0007.With 0.0007 and second approximate (0.6981) addition, provide the 3rd and final approximate 0.6988, it is considered to enough to reach the F that only has 50 values utilizing near actual value 0.6990 _aDesired precision when tracing table and the G tracing table that only has 100 values.If desired, can carry out further iteration, to improve precision a little.Yet, for addition, usually need be more than three times iteration.Alternatively, if maximum iteration time is preset as 3, then can be F _aBe 2.7 at the independent variable round down of last iteration to tenths recently, but not round-up always.F _a(2.7)=0.0009, its with at log ₁₀During (3+2) second approximate number, 0.6981 addition, provide the log as a result of expectation ₁₀(5)=log ₁₀(3+2)=0.6990.

The two-table iterative process accepts a three-step process in exchange for avoiding multiplication and for a 100-fold reduction in look-up table size. In a hardware implementation, the total number of logic delays required for three iterations may in fact be less than the number of logic delays through the repeated add/shift structure of a multiplier. In any case, the reduction in look-up table size described above is useful when silicon area and/or precision are of primary importance.

The value of log10(3-2) can be calculated similarly. The starting approximation is the logarithm of the larger number, i.e. 0.4771. The F_s table used for subtraction stores, in steps of 0.1, the values:

F_s(x) = \log_{10}(1 - 10^{-x}) (for base 10) (13)

F_s(x) = \log_{q}(1 - q^{-x}) (for a general base q)

The G table is unchanged. The difference between log10(3) and log10(2), 0.1761, is rounded up to the nearest tenth, 0.2. Looking up 0.2 in the subtraction function table gives F_s(0.2) = -0.4329. Adding -0.4329 to the starting approximation of 0.4771 gives a first approximation to log10(1) of 0.0442.

As for addition, the error in rounding 0.1761 up to 0.2 is 0.0239. Addressing the G table defined above with 0.024 returns the value 1.2695. Adding 1.2695 to the previous F_s argument x = 0.1761 and to the previous F_s table value of -0.4329 generates a new F_s table argument x' = 1.0127. Rounding x' up to the nearest tenth, 1.1, and using the F_s table again gives F_s(1.1) = -0.0359. Adding -0.0359 to the first approximation (0.0442) gives a second approximation to log10(1) of 0.0083. The error in rounding 1.0127 up to 1.1 is 0.0873. Addressing the G table with the value 0.087 gives G(0.087) = 0.7410. Adding this to the previous unrounded F_s table argument 1.0127 and to the F_s table value -0.0359 generates a new F_s table argument x'' = 1.7178. Rounding x'' up to 1.8 gives F_s(1.8) = -0.0069 which, added to the second approximation 0.0083, gives a third approximation to log10(1) of 0.0014. The error in rounding 1.7178 up to 1.8 is 0.0822. Addressing the G table with 0.082 returns the value 0.7643. Adding this to the previous F_s table argument 1.7178 and to the previous F_s table value -0.0069 generates a new F_s table argument x''' = 2.4752. Rounding 2.4752 up to 2.5 gives the function value F_s(2.5) = -0.0014. Adding -0.0014 to the third approximation (0.0014) gives the expected result log10(1) = log10(3-2) = 0. Because the argument of F_s increases at each iteration, the corrections become smaller and smaller, so the algorithm converges.

The above process for subtraction is the same as the process for addition, except that the subtraction version of the F table is used. Addition and subtraction both use the same G table. Subtraction needed one more iteration than addition to give good precision. This is because the increment obtained when adding the F_s value is negative in the case of subtraction, so the argument of F_s rises slightly more slowly at each iteration, especially the first.
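By way of non-limiting illustration, the decimal two-table iteration described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the table entries are computed on the fly and rounded to four decimals rather than stored, and the function names (`f_add`, `f_sub`, `g_corr`, `two_table_decimal`) are hypothetical and do not appear in the original text.

```python
import math

def f_add(x):
    """F_a(x) = log10(1 + 10^-x), the addition table entry."""
    return math.log10(1 + 10 ** -x)

def f_sub(x):
    """F_s(x) = log10(1 - 10^-x), the subtraction table entry (x > 0)."""
    return math.log10(1 - 10 ** -x)

def g_corr(y):
    """Positive correction G(y) = -log10(1 - 10^-y), for y > 0."""
    return -math.log10(1 - 10 ** -y)

def two_table_decimal(A, B, op="add", iterations=3):
    """Approximate log10(a + b) or log10(|a - b|) from A = log10(a), B = log10(b).

    Assumption of this sketch: A != B when op is "sub".
    """
    table_f = f_add if op == "add" else f_sub
    c = max(A, B)                       # starting approximation: the larger logarithm
    x = abs(A - B)                      # first table argument
    for _ in range(iterations):
        x_up = math.ceil(x * 10) / 10   # round up to the nearest tenth
        f = round(table_f(x_up), 4)     # emulate a table holding four decimals
        c += f
        y = round(x_up - x, 3)          # rounding error addresses the G table
        if y == 0:                      # x lay on the table grid, so F was exact
            break
        x = x + f + round(g_corr(y), 4) # next argument x', x'', ...
    return c
```

Called as `two_table_decimal(0.4771, 0.3010, "add", 3)`, the sketch follows the three addition steps of the worked example and lands close to log10(5); with `"sub"` and four iterations it lands close to log10(1) = 0.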

High-precision logarithmic addition

In general, the logarithmic addition problem for logarithms to a general base q can be stated as follows:

Let A = log_q(a) and B = log_q(b), where a and b are positive numbers and q is the base.

Goal: find C = log_q(c), where c = a + b.

Thus, C = \log_{q}(a + b) = \log_{q}(q^{A} + q^{B}).

Let A be the larger of A and B.

Then, C = \log_{q}(q^{A}(1 + q^{-(A-B)}))

= A + \log_{q}(1 + q^{-(A-B)})

= A + \log_{q}(1 + q^{-r}), where r = A - B and is positive.

The problem is thus reduced to computing the function \log_{q}(1 + q^{-r}) of the single variable r.

If r has a limited word length, the function value can be obtained from a function look-up table. For example, for a 16-bit r value, the function look-up table must store 65536 words. Moreover, if r > 9 in the case of base q = e = 2.718, the value of the function differs from zero by less than 2^-13, which suggests that only a 4-bit integer part of r (ranging up to 15) and a 12-bit fractional part need be considered. For r > 9 the function value is then zero to 12 binary places after the point, so the look-up table is needed only for r values up to 9, giving 9 × 4096 = 36864 memory words.

Because the maximum value of the function is log_e(2) = 0.69 at r = 0, only a 12-bit fractional part need be stored, so the storage requirement is only 36864 12-bit words rather than 65536 16-bit words. In the case of base 2, the function is zero to 12 binary places for r > 13, so again only a 4-bit integer part of r need be considered. If one bit is used for the sign, the logmagnitude part is only 15 bits long, for example in 4.11 format or 5.10 format, and the above figures can be adjusted accordingly.

However, to obtain a precision much higher than 16 bits, for example using a 32-bit word length, a direct look-up table for the function is excessively large. For example, to give a precision and dynamic range comparable to the IEEE 32-bit floating-point standard, in the base e case A and B should each have a 7-bit integer part, a 24-bit fractional part and a sign bit. The value of r must now be greater than 25 log_e(2) = 17.32 before the function is zero to 24 binary places, and such values can be represented by a 5-bit positive integer part of r. Thus, a potential 29-bit r value of format 5.24 must be considered as the argument of the function F_a. For values of r between 0 and 18, a direct look-up needs a table size of 18 × 2^24, or about 302 million, 24-bit words. Indeed, substantially all research into logarithmic arithmetic is concerned with reducing these table sizes, with the ultimate aim of making 64-bit word lengths practical. The several techniques described here advance the art towards this goal.
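The table-size figures quoted above can be reproduced with a few lines of arithmetic. The following Python fragment is only an illustrative check; the variable names are hypothetical.

```python
import math

# Direct look-up for a 16-bit word: r up to 9 with a 12-bit fractional part.
words_16bit = 9 * 2 ** 12

# Base e with 24 binary places: r beyond about 25*ln(2) contributes nothing.
zero_threshold_24 = 25 * math.log(2)

# Direct look-up for r between 0 and 18 with a 24-bit fractional part.
words_32bit = 18 * 2 ** 24

print(words_16bit, round(zero_threshold_24, 2), words_32bit)
```

The last figure, roughly 302 million words, is what motivates splitting the argument into two parts that address much smaller tables.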

To reduce the size of the look-up table relative to the single large table required for a direct look-up of the logarithmic addition function F_a (which uses all the bits of r as the address), one implementation of the invention splits r into a most significant (MS) part r_M and a least significant (LS) part r_L. As described below, the MS and LS parts address two much smaller tables, F and G, respectively. The MS part represents a "rounded-up" version of the input value, and the LS part represents the difference between the rounded-up version and the original full argument value.

Let r_M be the most significant 14 bits of r (r < 32), and let r_L be the least significant 15 bits of r, as shown in Equation (14).

r_M = xxxxx.xxxxxxxxx

r_L = 00000.000000000xxxxxxxxxxxxxxx (14)

For simplicity, the lengths of r_M and r_L can be denoted (5.9) and (15) for short. Other splits of r into most significant and least significant parts are equally usable with obvious modifications to the method. Some considerations for preferring a particular split, discussed further below, concern the ability to reuse the same F and G tables for other word lengths (for example, 16 bits) or for complex operations.

Let r_M^+ be r_M augmented by the largest possible value of r_L (i.e. 00000.000000000111111111111111). It will be appreciated that this is just the original r value with its least significant 15 bits set to 1. In some implementations, r_M can alternatively be augmented by 0.000000001, i.e.

r_M^+ = xxxxx.xxxxxxxxx + 00000.000000001 (15)

Let the complementary value of r_L be denoted:

r_{L}^{-} = r_{M}^{+} - r \qquad (16)

This is either the complement or the two's complement of r_L, depending on which of the above two alternative augmentations of r_M is used, i.e.

r_L^- = 00000.000000000111111111111111 - 00000.000000000xxxxxxxxxxxxxxx (the complement of r_L), or

r_L^- = 00000.000000001000000000000000 - 00000.000000000xxxxxxxxxxxxxxx (the two's complement of r_L). The following result then holds for base e:

\log_{e}(1 + e^{-r}) = \log_{e}(1 + e^{-r_{M}^{+}} - e^{-r_{M}^{+}} + e^{-r})

= \log_{e}\left((1 + e^{-r_{M}^{+}})\left(1 + \frac{e^{-r} - e^{-r_{M}^{+}}}{1 + e^{-r_{M}^{+}}}\right)\right)

= \log_{e}(1 + e^{-r_{M}^{+}}) + \log_{e}(1 + e^{-r'}) \qquad (17)

where

r' = r + \log_{e}(1 + e^{-r_{M}^{+}}) - \log_{e}(1 - e^{-r_{L}^{-}}).
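As a numerical sanity check of Equation (17), the split of r and one expansion step can be sketched in Python. The sketch assumes a 5.24 fixed-point r held as an integer, the (5.9)/(15) split, and the "low bits set to 1" form of r_M^+; the names `split` and `expand_once` are hypothetical.

```python
import math

FRAC_BITS = 24                    # fractional bits of r (5.24 format)
LS_BITS = 15                      # width of r_L
LS_MASK = (1 << LS_BITS) - 1
SCALE = 1 << FRAC_BITS

def split(r_int):
    """Return (r_M^+, r_L^-) as real values for an integer-coded r."""
    r_m_plus = r_int | LS_MASK                 # r with its low 15 bits set to 1
    r_l_minus = LS_MASK - (r_int & LS_MASK)    # complement of r_L = r_M^+ - r
    return r_m_plus / SCALE, r_l_minus / SCALE

def expand_once(r_int):
    """One step of Equation (17): return F = log(1 + e^-r_M^+) and r'.

    Assumes r_L is not all ones, so that r_L^- is nonzero.
    """
    r = r_int / SCALE
    r_m_plus, r_l_minus = split(r_int)
    f = math.log1p(math.exp(-r_m_plus))
    # 1 - e^(-x) is evaluated as -expm1(-x) to keep precision for small x.
    r_prime = r + f - math.log(-math.expm1(-r_l_minus))
    return f, r_prime

r_int = round(1.2345678 * SCALE)
f, r_prime = expand_once(r_int)
lhs = math.log1p(math.exp(-r_int / SCALE))     # log_e(1 + e^-r)
rhs = f + math.log1p(math.exp(-r_prime))       # right-hand side of (17)
```

Here `lhs` and `rhs` agree to within floating-point rounding, and `r_prime` is larger than r by more than 6, which is what makes the repeated expansion converge quickly.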

Expanding \log_{e}(1 + e^{-r'}) in the same way gives:

\log_{e}(1 + e^{-r'}) = \log_{e}(1 + e^{-r_{M}^{' +}}) + \log_{e}(1 + e^{-r''}) \qquad (18)

where

r'' = r' + \log_{e}(1 + e^{-r_{M}^{' +}}) - \log_{e}(1 - e^{-r_{L}^{' -}}).

Iterating to a conclusion shows that the desired answer comprises the sum of the following functions, and so on:

\log_{e}(1 + e^{-r_{M}^{+}})

\log_{e}(1 + e^{-r_{M}^{' +}})

\log_{e}(1 + e^{-r_{M}^{'' +}}) \qquad (19)

These functions depend only on the most significant 14 bits of their respective r arguments, and can therefore be obtained from a look-up table of only 16384 words.

In Equations (17) to (19), the primes used to qualify the r values do not denote derivatives. Instead, the sequence of r values r, r', r'', etc. is derived by accumulating onto the preceding value the value just obtained from the logarithmic addition function look-up table (F_a), and adding a value that depends on the least significant 15 bits of r, namely the value

-\log_{e}(1 - e^{-r_{L}^{-}}),

which is given by a correction look-up table, i.e. the G table, which has 32768 words because r_L^- is a 15-bit value.

Although the stored values are computed from r_M^+ and r_L^-, the function look-up table and the correction look-up table can be addressed directly by r_M and r_L, respectively. Calling these look-up table functions F_a and G respectively, and noting that the correction values are always highly negative, positive correction values can be stored in the G table. The positive correction value is then added to the previous r argument, instead of a negative value being stored and subtracted. Further, the minimum correction value of the G table, or at least its integer part, can be subtracted from the stored values to reduce the number of bits stored, and added back whenever a value is pulled from the table. For base 2, a value of 8 is suitable for the minimum correction value, and in some implementations it does not even need to be added back. The iteration is then:

1. Initialize the output accumulator value C to the larger of A and B.

2. Initialize r to A-B if A is larger, or to B-A if B is larger.

3. Split r into r_M and r_L.

4. Look up F_a(r_M^+) and G(r_L^-), addressed by r_M and r_L respectively.

5. Accumulate F_a with C, and accumulate F_a + G with r.

6. If r < stop threshold (STOP_THRESHOLD, discussed further below), repeat from step 3.
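The six steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: base e, a 5.24 fixed-point r with the (5.9)/(15) split, a stop threshold of 18, table entries computed on demand instead of being read from ROM, and the threshold tested before the first pass rather than after it. The names `f_add`, `g_corr` and `logadd` are hypothetical.

```python
import math

FRAC_BITS = 24                    # fractional bits of r (5.24 format)
LS_BITS = 15                      # width of r_L; r_M keeps the top 14 bits (5.9)
LS_MASK = (1 << LS_BITS) - 1
SCALE = 1 << FRAC_BITS
STOP_THRESHOLD = 18.0             # base e, 24 binary places

def f_add(r_m):
    """Entry addressed by r_M, computed from r_M^+ (low 15 bits set to 1)."""
    r_m_plus = ((r_m << LS_BITS) | LS_MASK) / SCALE
    return math.log1p(math.exp(-r_m_plus))

def g_corr(r_l):
    """Positive entry addressed by r_L, computed from r_L^- (complement of r_L)."""
    if r_l == LS_MASK:            # r_L^- = 0: r already equals r_M^+, F is exact
        return math.inf
    r_l_minus = (LS_MASK - r_l) / SCALE
    return -math.log(-math.expm1(-r_l_minus))

def logadd(A, B):
    """Return (C, cycles) with C close to log_e(e^A + e^B)."""
    c = max(A, B)                                   # step 1
    r_int = int(abs(A - B) * SCALE)                 # step 2, truncated to 24 places
    cycles = 0
    while r_int < STOP_THRESHOLD * SCALE:           # step 6
        r_m, r_l = r_int >> LS_BITS, r_int & LS_MASK    # step 3
        f, g = f_add(r_m), g_corr(r_l)                  # step 4
        c += f                                          # step 5: accumulate F with C
        cycles += 1
        if math.isinf(g):
            break
        r_int = int((r_int / SCALE + f + g) * SCALE)    # step 5: F + G with r
    return c, cycles
```

For example, `logadd(math.log(3), math.log(2))` returns a value close to log_e(5), and because every G entry exceeds about 6.24 the loop runs at most three times.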

Those skilled in the art will appreciate that a few logic gates can be used to detect an r value greater than 18 using the logic b6.OR.(b5.AND.(b4.OR.b3.OR.b2)) (the 32 bit set, or the 16 bit set together with one of the 8, 4 or 2 bits), where the bit index indicates the bit position to the left of the point. The value of the function

G(r_{L}^{-}) = -\log_{e}(1 - e^{-r_{L}^{-}})

is always greater than approximately 6.24, so the iteration always terminates after 3 cycles or fewer. For base 2, the correction values are proportionally larger, so that for base 2 also, r always exceeds 25 after at most 3 cycles. In general, 3 cycles typically suffice for any base.

High-precision two-table logarithmic subtraction

If the signs S associated with A and B indicate that a and b have the same sign, the logarithmic addition algorithm described above, referred to here as "logadd", can be used. Otherwise a logarithmic subtraction algorithm, referred to here as "logsub", is needed. The following table shows when each algorithm is used:

sign(a)	sign(b)	To add	To subtract b from a
+	+	Use logadd(A, B)	Use logsub(A, B)
+	-	Use logsub(A, B)	Use logadd(A, B)
-	+	Use logsub(B, A)	Use logadd(A, B)
-	-	Use logadd(A, B)	Use logsub(A, B)
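The table reduces to a simple rule: invert the sign of the second operand when subtraction is requested, then use logadd if the two signs agree and logsub otherwise. A minimal Python sketch of that rule follows; the function name `select_algorithm` and the +1/-1 sign encoding are assumptions made for illustration, and the argument ordering shown in the table (for example logsub(B, A)) is not modelled.

```python
def select_algorithm(sign_a, sign_b, subtract):
    """Choose 'logadd' or 'logsub' for operands with signs +1 or -1.

    subtract is True when b is to be subtracted from a.
    """
    if subtract:
        sign_b = -sign_b          # subtraction is addition of the negated operand
    return "logadd" if sign_a == sign_b else "logsub"

# Reproduce the table row by row (True column = subtract b from a).
for sa in (+1, -1):
    for sb in (+1, -1):
        print(sa, sb, select_algorithm(sa, sb, False), select_algorithm(sa, sb, True))
```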

When the logadd algorithm is used, the sign of the result is always the sign associated with the larger logmagnitude.

The same holds for the logsub algorithm if the sign associated with the second argument is first inverted. When subtraction is desired, the sign of the second argument can be inverted as the second argument is applied to the input of the logarithmic arithmetic unit. The "logsub" algorithm is derived as follows. Suppose A = log(|a|) and B = log(|b|) are given. It is desired to find C = log(c), where c = |a| - |b|. Let A be the larger of A and B. Dropping the absolute value signs (| |) for clarity, and assuming that a and b are now both positive, we have:

C = \log_{e}(a - b) = \log_{e}(e^{A} - e^{B}) \qquad (20)

As with logadd, base e is used in this example for illustrative purposes only, and is therefore not limiting.

Because A is assumed to be greater than B:

C = \log_{e}(e^{A}(1 - e^{-(A-B)}))

= A + \log_{e}(1 - e^{-(A-B)}) \qquad (21)

= A + \log_{e}(1 - e^{-r})

where r = A - B and is positive. The problem is thus reduced to computing the function \log_{e}(1 - e^{-r}) of the single variable r. Let r_M, r_L, r_M^+ and r_L^- be defined as above. Then, for base e:

\log_{e}(1 - e^{-r}) = \log_{e}(1 - e^{-r_{M}^{+}} + e^{-r_{M}^{+}} - e^{-r})

= \log_{e}\left((1 - e^{-r_{M}^{+}})\left(1 - \frac{e^{-r} - e^{-r_{M}^{+}}}{1 - e^{-r_{M}^{+}}}\right)\right)

= \log_{e}(1 - e^{-r_{M}^{+}}) + \log_{e}(1 - e^{-r'}) \qquad (22)

where

r' = r + \log_{e}(1 - e^{-r_{M}^{+}}) - \log_{e}(1 - e^{-r_{L}^{-}}).

Expanding \log_{e}(1 - e^{-r'}) likewise gives:

\log_{e}(1 - e^{-r'}) = \log_{e}(1 - e^{-r_{M}^{' +}}) + \log_{e}(1 - e^{-r''}) \qquad (23)

where

r'' = r' + \log_{e}(1 - e^{-r_{M}^{' +}}) - \log_{e}(1 - e^{-r_{L}^{' -}}),

and so on. Iterating to a conclusion shows that the desired answer comprises the sum of the functions

\log_{e}(1 - e^{-r_{M}^{+}})

\log_{e}(1 - e^{-r_{M}^{' +}})

\log_{e}(1 - e^{-r_{M}^{'' +}}) \qquad (24)

and so forth, which depend only on the most significant 14 bits of the respective full-word-length r values and can be provided by a look-up table of only 16384 words.

As with logadd, although the stored values are computed from r_M^+ and r_L^-, the look-up tables for logsub can be constructed so as to be addressed directly by r_M and r_L. Also as with logadd, the primes used to qualify the r values do not denote derivatives.

Calling these look-up tables F_s and G respectively (G is the same look-up table as used for the logadd algorithm), and storing positive values of G as before, yields the F_s and G tables required for logsub. Because 1 - e^{-r} is always less than 1, F_s is always negative, so a positive magnitude can be stored and subtracted rather than added. Another method stores the negative value with its negative sign bit removed, and restores the sign outside the look-up table by appending a most significant "1" when subtraction is being performed. The preferred choice is the one that leads to the simplest logic and the greatest synergy of look-up table values between addition and subtraction, as discussed further below. In any case, the following steps summarize the "logsub" process:

1. Initialize the output accumulator value C to the larger of A and B.

2. Initialize r to A-B if A is larger, or to B-A if B is larger.

3. Split r into r_M and r_L.

4. Look up F_s(r_M^+) and G(r_L^-), addressed by r_M and r_L respectively.

5. Accumulate F_s with C, and accumulate F_s + G with r.

6. If r < stop threshold (discussed below), repeat from step 3.
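The logsub steps can be sketched in the same way as the logadd iteration. The following Python sketch again assumes base e, a 5.24 fixed-point r with the (5.9)/(15) split, a stop threshold of 18 and table entries computed on demand. It also adds two safeguards that are assumptions of the sketch rather than part of the description: a cap on the number of cycles, and an early return when the operands differ by less than one least significant bit. The names `f_sub`, `g_corr` and `logsub` are hypothetical.

```python
import math

FRAC_BITS, LS_BITS = 24, 15
LS_MASK = (1 << LS_BITS) - 1
SCALE = 1 << FRAC_BITS
STOP_THRESHOLD = 18.0             # base e, 24 binary places

def f_sub(r_m):
    """Negative entry addressed by r_M: log_e(1 - e^(-r_M^+))."""
    r_m_plus = ((r_m << LS_BITS) | LS_MASK) / SCALE
    return math.log(-math.expm1(-r_m_plus))

def g_corr(r_l):
    """Positive entry addressed by r_L: -log_e(1 - e^(-r_L^-))."""
    if r_l == LS_MASK:            # r_L^- = 0: r already equals r_M^+
        return math.inf
    return -math.log(-math.expm1(-(LS_MASK - r_l) / SCALE))

def logsub(A, B, max_cycles=8):
    """Return (C, cycles) with C close to log_e(|e^A - e^B|)."""
    c = max(A, B)                                    # step 1
    r_int = int(abs(A - B) * SCALE)                  # step 2
    if r_int == 0:                                   # difference below resolution
        return -math.inf, 0
    cycles = 0
    while r_int < STOP_THRESHOLD * SCALE and cycles < max_cycles:   # step 6
        r_m, r_l = r_int >> LS_BITS, r_int & LS_MASK    # step 3
        f, g = f_sub(r_m), g_corr(r_l)                  # step 4
        c += f                                          # step 5: F_s with C
        cycles += 1
        if math.isinf(g):
            break
        r_int = int((r_int / SCALE + f + g) * SCALE)    # step 5: F_s + G with r
    return c, cycles
```

For example, `logsub(math.log(3), math.log(2))` returns a value close to log_e(1) = 0. When the two operands are nearly equal, F_s almost cancels G and r grows slowly, so the cycle cap can be reached before the threshold; this is the difficult case that motivates addressing the tables directly in that situation.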

For both the logadd and logsub algorithms, the stop threshold is chosen so that any contribution from a further iteration is less than half an LSB. This occurs at 17.32 (18 can be used) for base e with 24 binary places after the point, or at 24 for base 2 with 23 binary places after the point. In principle, a base smaller than base 2 could be found that gives a stop threshold of 31, which would then use an F function defined over the whole address space addressable by the selected MSBs of r. Alternatively, a base greater than base e could be found that gives a stop threshold of 15, with the same property. However, the practical advantages of base 2 appear to outweigh any advantage of using the full address space of the F table. In general, for base 2, the stop threshold is simply 1 or 2 greater than the number of binary places after the point in the logarithmic representation.

As the decimal examples given above suggest, the precision after a limited number of iterations is improved if the final argument used to address the F table is rounded down rather than rounded up. If the two-table iterative process always performs a fixed number of iterations, or if the process otherwise identifies the final iteration, the argument of F can be rounded down on the final iteration. The final iteration can be identified, for example, by r being within a certain range of the stop threshold (about 6 for base e, or about 8 for base 2), which indicates that the next iteration is bound to exceed the stop threshold. When this method is used, the address to the F table can be decremented if the leftmost bit of r_L is zero on the final iteration. In the pipelined implementation to be described, the final F table contents are simply computed for a rounded-down argument.

The only difference between the logsub algorithm and the logadd algorithm is the use of the look-up table F_s rather than F_a. Because both have a size of 16384 words, they can be combined into a single function F table with an extra address bit to select the + or - version, denoted F(r_M, opcode), where the extra argument "opcode" is the extra address bit having the value 0 or 1 to indicate whether the logadd or the logsub algorithm is to be applied. Alternatively, because the peripheral logic (i.e. the input and output accumulators and the adder/subtractors) is small compared with the respective look-up tables, it costs little to duplicate the peripheral logic to form an independent adder and subtractor. Another possibility, considered below, is to exploit the similarity between the functions F_a and -F_s.

Exponential approximation

As described above, r_M^+ can comprise r_M augmented by the largest possible value of r_L (0.00000000011111111111111), or r_M augmented by 0.000000001. An advantage of choosing the augmentation of r_M to be 0.0000000001111111...1 rather than 0.000000001 is that the G table can be addressed by the complement of r_L during the iterative algorithm, or can be addressed by r_L directly (without complementing) to obtain the value of F in the case r_M = 0, so that a single iteration suffices for the otherwise difficult case of subtracting two nearly equal values. Making both the complemented and the non-complemented values available is simpler and faster than forming the two's complement, because no carries need to be propagated.

For logarithmic addition, the values of the F_a table can be defined as follows:

F_{a}(X_{M}) = \log_{2}(1 + 2^{-(X_{M} + d)}) \qquad (25)

where d represents the increment, which is preferably the largest possible value of X_L, i.e. all ones. This function can be constructed as a look-up table addressed by X_M. For subtraction, the values of the F_s table can be defined as follows:

F_{s}(X_{M}) = -\log_{2}(1 - 2^{-(X_{M} + d)}) \qquad (26)

For larger values of X_M, and for 32-bit arithmetic and an argument range between 16 and 24, F_a(X_M) ≈ F_s(X_M), and both can be adequately approximated by:

E = 2^{-X_{M1}} \cdot \left(\frac{2^{-0.X_{M2}}}{\log_{e}(2)}\right) \qquad (27)

where X_M1 is the integer part of X_M (the bits to the left of the point) and X_M2 is the fractional part, i.e. the bits to the right of the point. The function in brackets can be stored in a small exponential look-up table. A right shifter can implement the integer part, so that only the fractional bits are needed to address the exponential function, which reduces the table size.
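The exponential approximation of Equation (27) can be sketched in Python as a small table indexed by the fractional bits, followed by a right shift by the integer part. The sketch assumes a 5.9 format for X_M and, for simplicity, compares E with the true functions at the same argument, omitting the increment d of Equations (25) and (26). The names `EXP_TABLE`, `exp_approx`, `f_add_true` and `f_sub_true` are hypothetical.

```python
import math

FRAC_BITS = 9                                   # X_M assumed to be in 5.9 format
FRAC_MASK = (1 << FRAC_BITS) - 1

# One cycle of the exponential: 2^(-0.X_M2) / log_e(2), addressed by X_M2.
EXP_TABLE = [2.0 ** -(k / 2 ** FRAC_BITS) / math.log(2)
             for k in range(2 ** FRAC_BITS)]

def exp_approx(x_m):
    """E of Equation (27) for an integer-coded X_M."""
    whole = x_m >> FRAC_BITS                    # X_M1, used as a shift count
    frac = x_m & FRAC_MASK                      # X_M2, addresses the small table
    return math.ldexp(EXP_TABLE[frac], -whole)  # right shift by X_M1 places

def f_add_true(x):
    """log2(1 + 2^-x), evaluated directly."""
    return math.log1p(2.0 ** -x) / math.log(2)

def f_sub_true(x):
    """-log2(1 - 2^-x), evaluated directly."""
    return -math.log1p(-(2.0 ** -x)) / math.log(2)

x_m = (16 << FRAC_BITS) + 256                   # X_M = 16.5
print(exp_approx(x_m), f_add_true(16.5), f_sub_true(16.5))
```

Over the argument range 16 to 24 the three values differ by far less than 2^-24, which is why a table of only 512 entries plus a shifter can stand in for both F tables in that range.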

Fig. 4 illustrates the similarity between the exponential approximation (E) and the true function values (F_a, F_s). When the argument varies over the range 16 to 24, E is substantially equal to F_a and F_s. Fig. 4 also illustrates a further approximation:

E_{2} = \frac{2^{-2(X_{M} + d)}}{2 \log_{e}(2)} \qquad (28)

which also adequately approximates the differences between the exponential approximation and the true function values, dF_a = E - F_a and dF_s = F_s - E. Therefore, for X_M in the range 8 to 16, the exponential approximation E can be used when corrected by the small correction value E_2, whose length is less than or equal to 8 bits, as can be seen from Fig. 4. When 24 places after the binary point are needed, the resulting length is 17 bits.

Because the area under the E curve roughly approximates the silicon area needed to implement the exponential approximation, Fig. 4 also illustrates the approximate silicon area needed to implement the function tables for logadd and logsub. Using a base-2 logarithmic scale as the vertical scale means that the height represents the word length of a binary value. The horizontal scale represents the number of such values. The area under a curve therefore represents the number of bits of ROM needed to store the curve values. The exponential function E is periodic, however: its values repeat, apart from a right shift for each increment of 1 in the argument. Thus, only one period, addressed by the fractional part X_M2, needs to be stored, and the result is shifted by the number of places given by X_M1. The exponential function E therefore needs a very small table. Moreover, because the correction values dF or E_2 clearly have a smaller area under their curves than the original F_a and F_s functions, using the exponential approximation E and storing the corrections dF and E_2 requires smaller table sizes, and hence less silicon area, than storing F_a and F_s.

Equation (29) gives the G function for the least significant bits as follows:

G(X_{L}) = -\log_{2}(1 - 2^{-(d - X_{L})}) \qquad (29)

where, when d is all ones, (d - X_L) equals the complement of X_L. The minimum value of G(X_L) depends on how the 31-bit logmagnitude is split between X_M and X_L. If X_M has the format 5.8, then X_L has the form 0.00000000xxxxxxxxxxxxxxx and is less than 2^-8. The minimum value of G, which occurs at X_L = 0, is then 8.5. For X_M of format (5.7) the minimum value of G is 7.5, and for X_M of format (5.9) the minimum value of G is 9.5. Because the value of X increases by at least the value of G at each cycle, X will exceed 24 within three cycles provided that the three G values are on average greater than 8. In what follows, the assumption of 32-bit arithmetic is retained for illustrative purposes. When the minimum value of G is 8.5, a base value of 8 can be subtracted from the stored values.
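The minimum G values quoted above can be checked numerically from Equation (29). The Python sketch below assumes 23 binary places after the point in total (as implied by the X_L pattern shown for the 5.8 split) and evaluates G at X_L = 0 with d equal to the all-ones value of X_L; the name `g_min` is hypothetical.

```python
import math

TOTAL_FRAC_BITS = 23      # assumed number of binary places after the point

def g_min(ms_frac_bits):
    """Minimum of G (Equation 29), reached at X_L = 0, for a 5.n split of X_M."""
    d = 2.0 ** -ms_frac_bits - 2.0 ** -TOTAL_FRAC_BITS   # all-ones value of X_L
    # 1 - 2^(-d) is evaluated as -expm1(-d * ln 2) to keep precision for small d.
    return -math.log2(-math.expm1(-d * math.log(2)))

for n in (7, 8, 9):
    print(n, round(g_min(n), 2))
```

Each minimum is roughly n + 0.53, in line with the rounded figures 7.5, 8.5 and 9.5 given in the text.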

Logarithmic arithmetic on complex numbers

The various processes described above generally apply to logarithmic arithmetic on real numbers. However, radio communication signals can make use of both real and complex representations. For example, typical applications of real and complex signal processing include radio signal processing. In a radio system, the signal received at the antenna contains radio noise and can be represented by a sequence of complex samples. It is usually desirable to recover information using the weakest possible signal relative to the noise, so as to maximize range. The complex representation of samples collected from the antenna therefore does not need high-precision numbers, because a quantizing accuracy much better than the expected noise level is of little use. After the noisy complex signal has been processed to recover information and correct errors, however, it is hoped that the noise has been removed, and the resulting information may then need a higher-precision representation. For example, speech can be represented by a sequence of real samples, but because processing the original antenna signal has improved the signal-to-noise ratio of the speech, higher-precision numbers may be needed.

Accordingly, a signal processor that provides high-precision arithmetic on real numbers and lower-precision arithmetic on complex numbers is of interest in radio applications such as cellular telephones and cellular systems. Such a processor can comprise a memory for program storage, a data memory for storing the real and complex data being processed, a real and complex arithmetic/logic unit (ALU), and input and output arrangements that can include analog-to-digital and digital-to-analog converters. The data memory stores words of the same word length as that for which the ALU is designed; it is logical to use the same word length for real and complex numbers so that they can be stored in the same memory. However, the present invention does not require this.

Typically, 16-bit words suffice for speech processing. It is therefore of interest to determine whether a 16-bit complex representation provides sufficient dynamic range to represent the noisy signal received by an antenna. This indeed proved to be the case in the first generation of digital cellular telephones manufactured and sold between 1988 and 1997 by L.M. Ericsson in Europe and its US affiliate Ericsson-General Electric, which used a 15-16 bit logpolar representation comprising an 8-bit logmagnitude and a 7-bit phase. These products also used direct digitization of the radio signal into complex logpolar form according to the combination of US Patents 5048059, 5148373 and 5070303, which are incorporated herein by reference.

As for real numbers, any base can be used for the logarithm of the amplitude. If base e is used, the logmagnitude represents the instantaneous signal level in nepers. As is known in the art, 1 neper is approximately 8.686 decibels (dB), so an 8-bit logmagnitude of format xxxx.xxxx represents a signal level varying over the range 0 to 15 and 15/16 nepers (about 139 dB).

The quantization error is half the least significant bit, i.e. +/-1/32 of a neper or 0.27 dB, which is a percentage error of about 3.2%. In theory, this error is uniformly distributed between +/-3.2% and has an RMS value of 1/3 of the peak (i.e. about 1%). The quantizing noise is thus 1/100 of the signal level, i.e. 40 dB below the signal level, and it can be lower still if oversampling is used, i.e. sampling at more than the Nyquist rate of 1 sample per second per Hz of signal bandwidth.

An advantage of the logpolar representation is that this quantizing accuracy remains constant over the whole range of signal levels. A total dynamic range of 139 dB with quantizing noise 40 dB below the signal is considered more than adequate for most radio signal applications.

Fig. 5A illustrates how the logpolar representation divides the complex plane into cells, in contrast to the Cartesian representation of Fig. 5B. The white "hole" in the center of the logpolar diagram is where the signal level is less than 0000.0000 nepers, and the outer circle is the highest signal level, 1111.1111 nepers. If the lower limit 0000.0000 is chosen to be 10 dB below the radio noise level, this ensures that noise excursions are adequately represented and that the numeric representation does not unduly degrade the statistical properties of the noise. The outer circle then represents a signal level 129 dB above the noise, which is unlikely to be exceeded even by the strongest signal.

The limited number of bits used to represent the phase angle also gives rise to quantization error and noise. The RMS value of the noise contribution from phase quantization is 1/12 of the value of the least significant phase bit in radians. If 8 bits are used to represent the phase, the least significant phase bit has a value of 2π/256 radians, so the quantizing noise relative to the signal level is 2π/(12*256) = 0.002, or -53.8 dB. This is lower than the logmagnitude quantizing noise of -40 dB.

Allocating one more bit to the amplitude and one bit fewer to the phase makes the logmagnitude quantizing noise about -46 dB and the phase quantizing noise -47.8 dB. Thus, when a 16-bit word length is used, this suggests a logpolar format of xxxx.xxxxx for the logmagnitude and 0.xxxxxxx (modulo 2π) for the phase.

If base-2 logarithms are used to represent the logmagnitude, the quantizing noise for the format xxxx.xxxxx is reduced by the factor log_e(2), or 3.18 dB, to -49 dB. The dynamic range is reduced from 16 nepers, or 139 dB, to 16 × 6 dB = 96 dB, which is still adequate.

Logpolar numbers can be stored logmagnitude first, i.e. {xxxx.xxxxx; 0.xxxxxxx} = {log(r); θ}, or phase first, i.e. {0.xxxxxxx; xxxx.xxxxx} = {θ; log(r)}. It can be useful to regard the phase as an extension of the 1-bit "phase" or sign of a real number, extended to express angles other than just the two angles 0 degrees and 180 degrees in the complex case, so the "phase first" format provides a logical format for describing this. In complex arithmetic there may be little distinction between addition and subtraction, because combining numbers that differ by 0 degrees (i.e. adding) or by 180 degrees (i.e. subtracting) are just two points in the whole range of relative phase angles to be considered.

Using the logpolar format, the product of two complex numbers is obtained by fixed-point addition of the logmagnitude parts (taking account of underflow or overflow) and fixed-point addition of the phase parts ignoring overflow (because angles are computed modulo 2π). When the quantizing levels of the binary phase word are evenly spaced over the range 0 to 2π, the rollover on binary addition corresponds exactly to modulo-2π arithmetic, as required for phase computations. Likewise, the quotient of two logpolar complex numbers is obtained by fixed-point subtraction.
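By way of illustration, logpolar multiplication and division reduce to integer additions and subtractions, as the following Python sketch shows. It assumes the 16-bit split suggested above (a 9-bit logmagnitude in xxxx.xxxxx format, base e, and a 7-bit phase), saturates the logmagnitude on overflow or underflow, and lets the phase roll over. The helper names `multiply`, `divide` and `decode` are hypothetical.

```python
import cmath
import math

MAG_FRAC_BITS = 5                 # logmagnitude in xxxx.xxxxx format (nepers)
MAG_MAX = (1 << 9) - 1            # largest 9-bit logmagnitude code
PHASE_MASK = (1 << 7) - 1         # 7-bit phase, 128 levels spanning 0 to 2*pi

def multiply(a, b):
    """Product of two logpolar words given as (logmag_code, phase_code)."""
    mag = min(a[0] + b[0], MAG_MAX)           # fixed-point add, saturate on overflow
    phase = (a[1] + b[1]) & PHASE_MASK        # rollover is arithmetic modulo 2*pi
    return mag, phase

def divide(a, b):
    """Quotient of two logpolar words."""
    mag = max(a[0] - b[0], 0)                 # fixed-point subtract, clamp on underflow
    phase = (a[1] - b[1]) & PHASE_MASK
    return mag, phase

def decode(word):
    """Convert a logpolar word back to a Cartesian complex number."""
    magnitude = math.exp(word[0] / 2 ** MAG_FRAC_BITS)
    angle = 2 * math.pi * word[1] / (PHASE_MASK + 1)
    return cmath.rect(magnitude, angle)

z1, z2 = (64, 100), (32, 60)      # log|z1| = 2 nepers, log|z2| = 1 neper
product = multiply(z1, z2)
```

Here the phase codes 100 and 60 sum to 160, which rolls over to 32, and decoding `product` gives the same complex number as multiplying the decoded operands.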

Consideration is used same ALU at 16 logarithm real number form (logreal) computings and 16 log-polar computings.Can recognize, in addition or the unique difference in subtracting each other be, at the log-polar situation, do not allow from any carry of the addition of Logarithmic magnitude part or subtraction or the phase bit position that borrow is sent to totalizer or subtracter, if use Logarithmic magnitude, also be like this at preceding form.

To illustrate how logarithmic arithmetic can be performed on complex numbers represented in log-polar format, consider the following. Let Equation (30) represent two Cartesian complex numbers z₁ and z₂ in base-e log-polar form as Z₁ and Z₂:

Z_{1} = (R_{1}, \theta_{1}) = \log_{e}(z_{1})

Z_{2} = (R_{2}, \theta_{2}) = \log_{e}(z_{2}) \qquad (30)

To determine Z₃ = log_e(z₃), where z₃ = z₁ + z₂, a process similar to that described above for real numbers can be followed. First, note that:

Z_{3} = \log_{e}(z_{1} + z_{2}) = \log_{e}(e^{Z_{1}} + e^{Z_{2}}) \qquad (31)

Assuming that Z₁ has the larger logarithmic magnitude (R₁) of the two, and applying logic similar to that above, Z₃ can be expressed as:

Z_{3} = \log_{e}(e^{Z_{1}}(1 + e^{-(Z_{1} - Z_{2})}))

= Z_{1} + \log_{e}(1 + e^{-(Z_{1} - Z_{2})})

= Z_{1} + \log_{e}(1 + e^{-Z}) \qquad (32)

where Z = Z₁ - Z₂ has the positive real part R₁ - R₂, because R₁ > R₂, which guarantees that the magnitude of e^-Z is less than 1. The problem of computing Z₃ from Z₁ and Z₂ is thus reduced to computing the function log_e(1 + e^-Z) of the log-polar complex variable Z = (R + jθ), where R = R₁ - R₂ and θ = θ₁ - θ₂. Although the above example uses base e, those skilled in the art will appreciate that any base can be used.
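Equation (32) can be checked numerically with a minimal Python sketch. Here the function log_e(1 + e^-Z) is evaluated directly with the standard `cmath` module rather than from a look-up table, and the name `logpolar_add` is hypothetical. The sketch does not handle the case of equal magnitudes and opposite phases, where the sum is zero and has no logarithm.

```python
import cmath


def logpolar_add(Z1, Z2):
    """Return Z3 = log_e(e**Z1 + e**Z2), following Equation (32)."""
    if Z2.real > Z1.real:
        Z1, Z2 = Z2, Z1          # keep the larger log-magnitude as Z1
    Z = Z1 - Z2                  # real part R >= 0, so |e**-Z| <= 1
    return Z1 + cmath.log(1 + cmath.exp(-Z))
```

Calling `cmath.exp(logpolar_add(cmath.log(3 + 4j), cmath.log(1 - 2j)))` recovers 4+2j to within floating-point rounding.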

When R > 6, adding or subtracting the smaller value cannot affect the fifth binary place, and the result is simply the larger value. Therefore only 3 bits to the left of the binary point need be considered for R.

The function log_e(1 + e^-Z) can be computed by various means. For example, a process using a single table and a single iteration can be used. Although this is applicable to both low-precision and high-precision numbers, the single look-up table required for high-precision numbers may become prohibitively large. The look-up table can have an optimized structure. For example, for 16-bit log-polar arithmetic it may be useful to store in pairs the values at addresses whose θ components differ by π, giving a 16384 × 32 ROM, or a ROM of half that size if conjugate symmetry is exploited. Complex logarithmic addition and complex logarithmic subtraction of a pair of input values can then be performed simultaneously in one cycle.

Performing the addition and the subtraction of a pair of values simultaneously in one cycle is known as a butterfly operation, and is typically carried out in a butterfly circuit. Fig. 6 illustrates an exemplary ALU comprising a low-precision complex butterfly circuit 100. Butterfly circuit 100 comprises a magnitude accumulator 102, a phase accumulator 104, a selector 106, a look-up table 108, a sum combiner 110, and a difference combiner 112. When R₁ is greater than R₂, magnitude accumulator 102 computes the logarithmic magnitude difference R = R₁ - R₂, and phase accumulator 104 computes the phase angle difference θ = θ₁ - θ₂. Alternatively, when R₂ is greater than R₁, magnitude accumulator 102 computes R = R₂ - R₁ and phase accumulator 104 computes θ = θ₂ - θ₁. Magnitude accumulator 102 and phase accumulator 104 output the computed differences to look-up table 108.

Look-up table 108 contains logarithm values for complex numbers at all angles. The logarithmic magnitude difference and the phase difference address look-up table 108 to provide two log-polar values, F(Z) and F(Z+π). If desired, the table size can be halved by always using a positive angle argument and conjugating the output F(Z) values when the original angle address was negative.

Magnitude accumulator 102 also controls selector 106, according to which of R₁ and R₂ is greater, to select Z₁ or Z₂ as Z_L. Selector 106 supplies Z_L to sum combiner 110 and difference combiner 112. Combiners 110 and 112 add Z_L to the two look-up table outputs F(Z) and F(Z+π), respectively, to generate the output logarithm of the sum and the output logarithm of the difference associated with the two complex inputs, thereby performing a complex butterfly in a single operation.
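The data flow of butterfly circuit 100 can be sketched in Python as follows. This is a floating-point model under stated assumptions: `F` is computed directly instead of being read from look-up table 108, the function names are hypothetical, and the extra rotation by π applied to the difference output when the operands are swapped is an assumption of this sketch (it makes the second output the logarithm of z₁ - z₂ rather than of the larger operand minus the smaller).

```python
import cmath
import math


def F(Z):
    """Stand-in for look-up table 108: F(Z) = log_e(1 + e**-Z)."""
    return cmath.log(1 + cmath.exp(-Z))


def butterfly(Z1, Z2):
    """Return (log of z1 + z2, log of z1 - z2) for log-polar inputs Z1, Z2."""
    swapped = Z2.real > Z1.real
    ZL, ZS = (Z2, Z1) if swapped else (Z1, Z2)   # selector: ZL has larger magnitude
    Z = ZL - ZS                                  # magnitude and phase differences
    log_sum = ZL + F(Z)                          # sum combiner
    log_diff = ZL + F(Z + 1j * math.pi)          # difference combiner
    if swapped:
        log_diff += 1j * math.pi                 # assumption: z1 - z2 = -(z2 - z1)
    return log_sum, log_diff
```

The sketch does not treat the case z₁ = z₂, whose difference is zero and has no logarithm.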

Butterfly operations are commonly useful for the fast Fourier transforms (FFTs) required in various signal-processing operations, such as decoding orthogonal frequency-division multiplexing (OFDM) signals. For a base-2 FFT, phase angles are usually modified by multiples of 2π/2^N, where 2^N is the size of the FFT. In log-polar format these phase rotations, known as twiddles, are trivial and merely involve adding a multiple of an amount (e.g., 0.0001000) to the phase part. Because phase angles are easily modified in butterfly circuit 100, applying complex numbers represented in log-polar format to butterfly circuit 100 allows the butterfly and twiddle computations to be performed very efficiently, which is highly advantageous for FFTs. As long as the FFT is base 2 and N is less than or equal to the word length of θ, no rounding occurs in the twiddle computation. For FFTs of a base other than 2, a special log-polar format can be devised in which θ is represented using the same radix as the base of the FFT. The algorithms described herein can be used in such a device by modifying the look-up tables appropriately.

Fig. 7 illustrates an exemplary 16-point FFT implemented with a plurality of complex butterfly circuits 100, such as the complex butterfly circuit shown in Fig. 6. Each butterfly circuit 100 combines a pair of values selected eight apart in the 16-element array. The angle parts of selected sum and difference outputs are then modified to implement the complex rotations known as twiddles. The angles are modified by modulo-2π addition of the bit patterns illustrated. When a 7-bit angle part is used as shown, modulo-2π addition is simply modulo-128 addition. To compute a complete FFT fully in parallel in each machine cycle, the FFT can be implemented with 8 × 4 = 32 copies of butterfly circuit 100. Alternatively, the FFT can be implemented with a single column of eight butterfly circuits 100 used successively to perform each of the four column computations in turn. A single butterfly circuit 100 can also be reused 32 times to perform the FFT. These options depend on the desired trade-off between speed and size or cost.

The advantage of log-polar quantization over a Cartesian representation of complex values can be seen by considering the problem of representing a signal to a precision of, for example, 1% when the signal may lie anywhere within a dynamic range of more than 60 dB. This can arise in a receiver for burst-mode transmissions that give the receiver no advance warning of the wanted signal level. If the minimum signal level is about 1, representing the Cartesian parts to 1% precision requires a step size of at most about 1/64 (i.e., 6 bits to the right of the binary point). Representing signals over a range of more than 60 dB, however, requires representing signals 1000 times larger than this, which requires 10 bits to the left of the binary point. The real and imaginary parts therefore each need the format S10.6, for a total of 34 bits. As noted above, however, the same quantization precision and dynamic range are achieved in log-polar format with only 16 bits.
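The bit count above can be reproduced with a short Python calculation. The rule used for the fractional bits (the smallest number of bits whose half-step error does not exceed the stated precision) is an assumption chosen to match the 1/64 step quoted in the text, and the function name is hypothetical.

```python
import math


def cartesian_bits(precision, dynamic_range_db):
    """Return (fraction bits, integer bits, total bits for real plus imaginary)."""
    frac_bits = math.ceil(math.log2(1 / (2 * precision)))        # half-step <= precision
    int_bits = math.ceil(math.log2(10 ** (dynamic_range_db / 20)))
    per_part = 1 + int_bits + frac_bits                          # sign + S-format field
    return frac_bits, int_bits, 2 * per_part
```

With a precision of 0.01 and a 60 dB range this gives 6 fraction bits, 10 integer bits, and 34 bits in total.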

If higher precision is needed than a single look-up table of reasonable size can provide, the two-table iterative method described above for real numbers can be adapted to complex numbers. A complex format that fits within the same 32-bit word length as the high-precision real format is, for example, in phase-first form:

(0.xxxxxxxxxxxxxxx；xxxxx.xxxxxxxxxxxx)

or, abbreviated, (0.15; 5.12). Choosing the number of phase bits to be 2 or 3 more than the number of bits to the right of the binary point of the logarithmic magnitude gives similar quantization errors for phase and magnitude. The least significant bit of the 15-bit phase has the value 2π × 2^-15 = 6.28 × 2^-15. A change in the 12th binary place of R = log(r) gives d(log(r)) = dr/r = 2^-12 = 8 × 2^-15.

Thus the least significant bit of log(r) is a radial displacement slightly larger than the tangential displacement of the least significant bit of θ. With base 2, the least significant bit of log(r) is reduced by the factor log_e(2) = 0.69 to 5.54 × 2^-15, which is slightly smaller than the least significant bit of θ. If it matters, exactly equal radial and tangential quantization can be achieved with the special base e^(π/4) = 2.19328, which lies between 2 and e. Base 2 has implementation advantages, however, and is preferred. For example, with base 2, a logarithmic magnitude in format 5.12 represents signal levels varying over a dynamic range of 32 × 6 = 192 dB, which is twice the range of the 16-bit format. Moreover, the quantization noise is more than 80 dB below the signal level for all signal levels. This is more than sufficient for radio signal processing in normal use, and may be useful for simulations in which quantization effects are desired to be negligible, or for critical applications such as interference cancellation with a large disparity between a strong unwanted signal and a weak wanted signal.
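The step sizes compared above can be recomputed with a few lines of Python. The function names are hypothetical; the balanced base follows from setting ln(b) × 2^-12 equal to 2π × 2^-15.

```python
import math


def lsb_steps(phase_bits=15, mag_frac_bits=12):
    """Return (tangential step, radial step for base e, radial step for base 2)."""
    theta = 2 * math.pi * 2.0 ** -phase_bits
    radial_e = 2.0 ** -mag_frac_bits
    radial_2 = math.log(2) * 2.0 ** -mag_frac_bits
    return theta, radial_e, radial_2


def balanced_base(phase_bits=15, mag_frac_bits=12):
    """Base b for which radial and tangential steps are equal."""
    return math.exp(2 * math.pi * 2.0 ** (mag_frac_bits - phase_bits))
```

With the default widths, the tangential step lies between the base-2 and base-e radial steps, and `balanced_base()` evaluates to e^(π/4).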

When two log-polar values are logarithmically added or subtracted, and the difference in their logarithmic magnitudes is so large that the least significant bit of log(r) or θ is unaffected, the result is the value with the larger logarithmic magnitude. Thus, if R₁ and R₂ are the logarithmic magnitudes of two log-polar values Z₁ and Z₂, and R is the difference between R₁ and R₂, always taken as positive, then when R is greater than 13 log_e(2) = 9.011, the function

\log_{e}(1 + e^{-Z}) = \log_{e}(1 + e^{-(R + j\theta)})

is zero to 12 binary places.

Thus, for base e in the 32-bit log-polar format, only values of the logarithmic magnitude difference R between 0 and 9 need be considered. Similarly, for base 2, only values of the logarithmic magnitude difference between 0 and 13 need be considered as arguments of the complex logarithmic addition/subtraction function. Four bits to the left of the binary point therefore suffice to represent R, giving R the format 4.12.
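The cutoff can be checked numerically with a small Python sketch, assuming 12 fractional bits so that a correction smaller than 2^-13 rounds to zero. The names `cutoff` and `negligible` are hypothetical.

```python
import cmath
import math

FRAC_BITS = 12   # binary places of the logarithmic magnitude


def cutoff(base):
    """Magnitude difference (in units of log to `base`) beyond which F rounds to 0."""
    return (FRAC_BITS + 1) * math.log(2) / math.log(base)


def negligible(R, theta):
    """True if log_e(1 + e**-(R + j*theta)) is below half an LSB in magnitude."""
    F = cmath.log(1 + cmath.exp(-complex(R, theta)))
    return abs(F) < 2.0 ** -(FRAC_BITS + 1)
```

`cutoff(math.e)` is about 9.011 and `cutoff(2)` is 13, matching the ranges 0 to 9 and 0 to 13 stated above.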

Because the complex logarithmic addition/subtraction function for negative θ is the conjugate of the function for positive θ, θ can be restricted to the range from 0 to just less than π, and thus has the form 0.0xxxxxxxxxxxxxx with only 14 variable bits. During the research leading to the present invention, convergence problems with the complex iteration were largely solved by excluding the particular angle-difference value 0.10000000000.... This value corresponds exactly to real subtraction of the logarithmic magnitudes, with the angle of the result being one of the two input argument angles, and is best performed using the F_s function for real arithmetic.

As with real numbers, the iterative process for complex numbers begins by dividing the difference Z = (θ, R) = Z₁ - Z₂ of the two arguments Z₁ and Z₂ to be combined into a most significant part and a least significant part. As noted above, the value of Z in practice needs only 30 variable bits. For example, let Z_M be the most significant 7 of the 14 variable bits of θ together with the most significant 8 of the 16 bits of R, i.e., in phase-first notation, Z_M = (0.0xxxxxxx; xxxx.xxxx).

Z_L is then the remaining least significant 8 bits of R and the remaining least significant 7 bits of θ, in the form Z_L = (0.00000000xxxxxxx; 0000.0000xxxxxxxx). Next, define Z_M⁺ = Z_M + dz, where dz has a real part of 0.0001 or 0.000011111111 and an imaginary part of 0 or 0.111111111111111, the latter being 1 LSB less than 2π. Then define Z_L⁻ = Z_M⁺ - Z. With the first choice of dz, Z_L⁻ is the two's complement of the variable bits of Z_L; with the second choice, it is their one's complement. Because a one's complement is easier to form than a two's complement, the second choice of the real and imaginary parts of dz is preferred. Then,

\log_{e}(1 + e^{-Z}) = \log_{e}(1 + e^{-Z_{M}^{+}} - e^{-Z_{M}^{+}} + e^{-Z})

= \log_{e}(1 + e^{-Z_{M}^{+}}) + \log_{e}(1 + e^{-Z'}) \qquad (33)

where

Z' = Z + \log_{e}(1 + e^{-Z_{M}^{+}}) - \log_{e}(1 - e^{-Z_{L}^{-}}) \qquad (34)
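The decomposition of Equations (33) and (34) can be verified numerically with the following Python sketch. It takes the last term of Equation (34) as -log_e(1 - e^(-Z_L⁻)), which is what the algebra e^-Z - e^(-Z_M⁺) = e^-Z (1 - e^(-Z_L⁻)) requires, and it uses the first choice of dz (real part 0.0001, imaginary part 0). The step sizes and function names are assumptions of the sketch, and floating-point values stand in for the fixed-point words.

```python
import cmath
import math

R_STEP = 2.0 ** -4                      # weight of the last bit of R_M (xxxx.xxxx)
THETA_STEP = 2 * math.pi * 2.0 ** -8    # weight of the last bit of theta_M (0.0xxxxxxx)


def F(Z):
    return cmath.log(1 + cmath.exp(-Z))


def G(Z):
    return -cmath.log(1 - cmath.exp(-Z))


def split(Z):
    """Return (Z_M plus dz, Z_L minus) for the first choice of dz."""
    r_m = math.floor(Z.real / R_STEP) * R_STEP
    t_m = math.floor(Z.imag / THETA_STEP) * THETA_STEP
    zm_plus = complex(r_m + R_STEP, t_m)
    return zm_plus, zm_plus - Z


def z_prime(Z):
    """Equation (34): the argument passed to the next iteration."""
    zm_plus, zl_minus = split(Z)
    return Z + F(zm_plus) + G(zl_minus)
```

Because complex logarithms are defined only modulo 2πj, the check compares 1 + e^-Z with the product (1 + e^(-Z_M⁺))(1 + e^(-Z')) rather than the logarithms themselves.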

The function

\log_{e}(1 + e^{-Z_{M}^{+}})

depends only on the 8 most significant bits of R and the 7 most significant bits of θ, and can therefore be precomputed and stored in a 32768-word table addressed directly by Z_M. Z_M⁺ therefore need not be formed during processing.

The function

-\log_{e}(1 - e^{-Z_{L}^{-}})

depends only on the 8 LSBs of R and the 7 LSBs of θ, and can be precomputed and stored as a 32768-word look-up table, the G function for complex arithmetic. Because the desired result is the sum of the original argument Z₁ or Z₂ having the larger logarithmic magnitude and the successive values of the F function with arguments Z_M, Z'_M, Z''_M, etc., the G function is needed only to compute the successive values Z', Z'', Z''', etc. Research shows that complex logarithmic addition and subtraction iterations may need up to 6 iterations to converge. The worst case is when the angles of Z₁ and Z₂ differ by nearly 180 degrees and their magnitudes are nearly equal. The case of angles exactly 180 degrees apart is handled, as described above, by treating the operation as a real subtraction.
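A minimal floating-point sketch of the complex iteration follows. The F and G values are computed on the fly instead of being read from the 32768-word tables, the first choice of dz is used, the stop threshold 13 log_e(2) is taken from the earlier discussion, and the function name and the iteration cap are assumptions. The sketch makes no attempt to treat the difficult case of nearly equal magnitudes with angles nearly 180 degrees apart.

```python
import cmath
import math

R_STEP = 2.0 ** -4                      # last bit of R_M
THETA_STEP = 2 * math.pi * 2.0 ** -8    # last bit of theta_M
STOP = 13 * math.log(2)                 # beyond this, F rounds to zero


def F(Z):
    return cmath.log(1 + cmath.exp(-Z))


def G(Z):
    return -cmath.log(1 - cmath.exp(-Z))


def logpolar_add_iterative(Z1, Z2, max_iter=32):
    """Return (log_e(e**Z1 + e**Z2), number of iterations used)."""
    if Z2.real > Z1.real:
        Z1, Z2 = Z2, Z1
    out = Z1                            # start from the larger operand
    Z = Z1 - Z2
    for n in range(max_iter + 1):
        if Z.real > STOP:
            return out, n
        zm_plus = complex((math.floor(Z.real / R_STEP) + 1) * R_STEP,
                          math.floor(Z.imag / THETA_STEP) * THETA_STEP)
        f = F(zm_plus)
        out += f                        # accumulate the F value into the result
        Z += f + G(zm_plus - Z)         # Equation (34): next argument Z'
    raise RuntimeError("iteration did not reach the stop threshold")
```

The result is correct to within the residual log_e(1 + e^-Z) that is dropped when the threshold is crossed, and its imaginary part is defined modulo 2π.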

To let the same F table serve both real and complex arithmetic, two extra address bits can be provided to select among the table for real addition, the table for real subtraction, and the table for complex addition/subtraction. The function can be denoted F(r_M, opcode), where, in the complex case, r_M is 14 of the 15 bits of the argument and the 15th bit is part of the 2-bit opcode. The 2-bit opcode is thus allocated as shown in the following table:

Opcode	Operation
00	Real addition
01	Real subtraction
1x	Complex addition/subtraction, where x is the 15th bit of the main argument

Likewise, the function

\log_{e}(1 - e^{-Z_{L}^{-}})

depends only on the 15 bits of Z_L, and can therefore be precomputed and stored in a look-up table addressed directly by Z_L⁻. It is equivalent in size and function to the G table used for real arithmetic, and can be merged with the real G table into a 65536-word look-up table by introducing an "opcode" argument (0 for real, 1 for complex) that selects the appropriate 32768-word half.

By dividing the complex input into a most significant part and a least significant part, the same principles used for logarithmic arithmetic on real numbers with the two-table iterative process can also be applied to complex numbers represented in logarithmic format. In addition, by dividing the complex input into a most significant part and a least significant part, the multi-stage pipeline described in a co-pending U.S. patent application (Attorney Docket No. 4015-5287) can be applied to complex numbers represented in log-polar format. That co-pending application is incorporated herein by reference. In the pipeline of that application, the ALU stores a selected portion of the look-up table for each stage of the pipeline. At least one stage of the pipeline uses a stage input represented in log-polar format to access its selected portion of the look-up table and generate a partial output associated with that stage. The multi-stage pipeline generates the logarithmic output by combining the partial outputs.

When θ = π, the operation can be seen to be equivalent to real subtraction. The result in this case depends only on R, and a special look-up table can be used for it in a single-step (one-shot) operation. Alternatively, the existing look-up tables for real subtraction can be used. This can be done by performing the real subtraction algorithm with the 14 bits 0xxxx.xxxxxxxxx of R addressing the F_s part of the F table, and with the remaining three bits of R, extended with 12 zeros, forming the initial value of R_L. The real iteration is then carried out, except that only the desired bits of precision are accumulated in an output register corresponding to the reduced complex precision, and an earlier termination criterion than R > 18 is used. For example, R > 9 would suffice.

Shared ALU for real and complex logarithmic arithmetic

Real and complex numbers may both be used to represent various signals within a single system. A conventional processor may accordingly include separate ALUs, one for complex logarithmic arithmetic and another for real logarithmic arithmetic. Two separate ALUs, however, occupy considerable silicon area. Moreover, in some cases such ALUs may require prohibitively large look-up tables. It is therefore useful to employ a single ALU that can perform both real and complex logarithmic arithmetic using look-up tables of reasonable size.

Fig. 8 illustrates an exemplary ALU 200 for performing real and complex logarithmic arithmetic. ALU 200 comprises an input accumulator 210, a look-up controller 220, and an output accumulator 230. In general, input accumulator 210 computes the difference between two real or complex inputs, and look-up controller 220 and output accumulator 230 together generate the output logarithm from the real or complex output of input accumulator 210, using a real look-up table or a complex look-up table according to the inputs.

The two real or complex numbers A and B to be added or subtracted, represented in logarithmic format, are presented to input accumulator 210 in succession. When a strobe pulse occurs for the first time, ALU 200 loads the first number A into input accumulator 210 and output accumulator 230. The second number B, with its associated sign or angle part θ changed by 180 degrees for subtraction, is then presented to input accumulator 210.

When the strobe pulse occurs for the second time, input accumulator 210 performs A - B. If there is an underflow, indicating that the logarithmic magnitude of B is greater than that of A, input accumulator 210 stores and outputs the value X = B - A and sends a borrow pulse to output accumulator 230. The borrow pulse causes output accumulator 230 to load B, including its associated modified or unmodified sign (or angle, in the complex case), overwriting A. If there is no underflow, input accumulator 210 stores and outputs the value X = A - B. Output accumulator 230 thus holds the larger of A and B, while input accumulator 210 holds |A - B|. The quantity X equals the quantity r in the preceding formulas for real numbers, and the quantity Z in the preceding formulas for complex numbers.

Based on X, look-up controller 220 determines two outputs: a partial output L and a correction output Y. Look-up controller 220 outputs partial output L to output accumulator 230 together with an ADD pulse, causing L to be accumulated with the existing contents of output accumulator 230. Look-up controller 220 also outputs correction output Y to input accumulator 210 together with an ADD pulse, causing Y to be accumulated with the existing contents of input accumulator 210 and thereby producing a new value of X. This cycle repeats until Y meets or exceeds a predetermined value. Once Y meets or exceeds the predetermined value, the cycle terminates and look-up controller 220 generates a READY signal indicating that the desired answer is available as output C from output accumulator 230. ALU 200 then returns to its initial state, in which it awaits a new pair of input values A and B.

Fig. 9 provides additional detail of an exemplary look-up controller 220 for real or complex logarithmic arithmetic. Look-up controller 220 comprises an F table 222, a G table 224, a combiner 226, and a sequencer 228. F table 222 and G table 224 each contain a complex look-up table for determining logarithms of complex numbers and/or a real look-up table for determining logarithms of real numbers. Although Fig. 9 illustrates F table 222 and G table 224 as containing both complex and real look-up tables, those skilled in the art will appreciate that F table 222 and G table 224 may contain only one of the complex and real look-up tables.

When the first 32-bit quantity A is applied to accumulators 210 and 230, a first strobe pulse is applied to sequencer 228. Sequencer 228 provides a Load 1 pulse to input accumulator 210 and a Load 2 pulse to output accumulator 230, causing them to store the 32-bit quantity A. When the second 32-bit quantity B is applied to accumulators 210 and 230, a second strobe pulse is applied to sequencer 228.

Sequencer 228 then provides an accumulate pulse to input accumulator 210. If input accumulator 210 outputs a borrow pulse, indicating that the logarithmic magnitude of B is greater than that of A, sequencer 228 outputs another Load 2 pulse to output accumulator 230, so that the value B, including its sign or phase, is stored in output accumulator 230, overwriting A. For real numbers, the sign of the value with the larger logarithmic magnitude becomes the sign of the result C. Input accumulator 210 outputs the value X of the difference between the logarithmic magnitudes, where X = A - B if A is greater than B, and X = B - A if B is greater than A, so X is always positive. The most significant bits X_M of X are applied to F look-up table 222, and the least significant bits X_L of X are applied to G look-up table 224.

For real numbers, the sign logic of input accumulator 210 XORs the signs of the values A and B to determine whether the F_a part of look-up table 222 should be used (like signs imply addition) or the F_s part (unlike signs imply subtraction). The XOR of the signs thus forms an extra address bit for F table 222.

If the value X in input accumulator 210 does not exceed the stop threshold, no stop pulse is provided to sequencer 228. The sequencer then sends accumulate pulses to input accumulator 210 and output accumulator 230, causing the value F + G from combiner 226 to be accumulated into input accumulator 210, so that partial output L is accumulated in output accumulator 230 and the correction output is accumulated with the contents of input accumulator 210.

The above is repeated until input accumulator 210 indicates that its contents meet or exceed the stop threshold, whereupon the process is terminated by providing a "stop" pulse to sequencer 228. In response to the "stop" pulse, sequencer 228 generates a "ready" pulse, indicating that the value C in output accumulator 230 is the final result, and returns to its initial state.
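The control flow described for Figs. 8 and 9 can be modeled for real numbers with the following Python sketch. It is a behavioral model under stated assumptions, not the circuit itself: F and G are computed directly instead of being read from tables 222 and 224, the 5.9 format of X_M and the stop threshold of 18 are taken from values mentioned elsewhere in the text, base e is used, and the function name `log_alu` is hypothetical.

```python
import math

DX = 2.0 ** -9     # weight of the last bit of X_M (format 5.9)
STOP = 18.0        # assumed stop threshold for the real iteration


def F(x, subtract):
    """F_s for unlike signs, F_a for like signs."""
    return math.log(1 - math.exp(-x)) if subtract else math.log(1 + math.exp(-x))


def G(x):
    return -math.log(1 - math.exp(-x))


def log_alu(a, b, max_iter=64):
    """a and b are (sign, log-magnitude) pairs; returns the pair for a + b."""
    (sign_a, A), (sign_b, B) = a, b
    subtract = bool(sign_a ^ sign_b)      # XOR of signs selects F_s or F_a
    X = A - B                             # input accumulator
    if X < 0:                             # borrow: B overwrites A in the output
        X, out, sign = -X, B, sign_b
    else:
        out, sign = A, sign_a
    for _ in range(max_iter):
        if X > STOP:                      # "stop", then "ready"
            return sign, out
        x_m_plus = (math.floor(X / DX) + 1) * DX
        L = F(x_m_plus, subtract)         # partial output
        Y = L + G(x_m_plus - X)           # correction output F + G
        out += L                          # output accumulator
        X += Y                            # input accumulator
    raise RuntimeError("no convergence (for example, subtraction of equal values)")
```

To subtract, the caller flips the sign of the second operand before the call, mirroring the 180 degree change applied to B in the text.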

In the arrangement of Fig. 9, the F_s part of look-up table 222 stores negative values, which are accumulated appropriately in accumulators 210 and 230 without any separate indication of whether a logarithmic addition or a logarithmic subtraction is being performed. Further, to avoid storing the sign bit of every F_s value, when all F_a values are positive and all F_s values are negative, the sign bit can be omitted from the look-up table and a +/- bit supplied by the sign logic used instead. Storing the true negative values of F_s without the sign bit differs from storing the negated, positive values, which must then be subtracted in output accumulator 230 and combiner 226. The latter can be seen to have advantages when compression of the look-up table size is considered. Compression of the look-up table size is discussed further in a U.S. patent application (Attorney Docket No. 4015-5288), which is incorporated herein by reference.

Other implementation variations can be envisaged, including making the output value Y of combiner 226 the negative of F + G so that it can be subtracted from input accumulator 210, thereby eliminating the need for input accumulator 210 to distinguish between add and subtract commands. Because the negative is the one's complement plus 1, this can be done with the complemented output when the stored values of G table 224 are all reduced by one least significant bit. Preferably, however, the values of G table 224 are not modified in this way, so that G table 224 remains usable for other cases.

Fig. 10 shows further details of complex operation in ALU 200. For complex values, input accumulator 210 comprises two separate parts, an R part 210A and a θ part 210B. The same input accumulator 210 used for real arithmetic can be used for complex arithmetic if the carry output from θ part 210B of the input accumulator is prevented from propagating into R part 210A, or vice versa in the θ-first ordering.

It is significant that the bits addressing the complex F table 222 come partly from θ and partly from R. If θ occupies the bit positions that the LSBs of R occupy in the real case, the connections between the input accumulator and F table 222 must change for complex operation. The same is true for G table 224. This is a minor inconvenience, which can be handled with a set of selector switches (not shown) that select the appropriate bits from input accumulator 210, according to whether the operation is real or complex, for connection to the address inputs of the G and F tables. An alternative solution can also be envisaged: keeping the connections between input accumulator 210 and the F and G tables the same for real and complex operation, which requires interleaving the bit allocations of R and θ. In this implementation, the most significant bits of θ are swapped with the least significant bits of R, so that the most significant bits of R and θ occupy the bit positions occupied in the real case by the most significant bits of R, and the least significant bits of R and θ occupy the bit positions occupied in the real case by the least significant bits of R. To keep the R bits connected so as to form an R adder 226A, and similarly to keep the θ bits connected so as to form a separate θ adder 226B, the carry bits of three adder stages must be re-routed for the complex case relative to the real case. If this is done, switching of connections between real and complex operation is avoided, including for output accumulator 230 and combiner 226. This also ensures that the output bits of the F and G tables can remain connected to the same destinations in combiner 226 and accumulator 230.

The alternative solution just described is less practical when it is desired to use the real subtraction table F_s for the complex case of θ = π. In that case, it is desirable that all the bits of R be connected to the address inputs of F table 222, and that all the output bits be connected to the R adder parts of accumulator 230 and combiner 226. It is then difficult to avoid using re-routing switches. If the θ = π case requires no iteration, that is, if it is handled by a single look-up in the real F_s table 222, re-routing of the adder bits is avoided.

For the complex case of θ = π, another alignment issue that arises when using the real subtraction table is that R has one fewer bit to the left of the binary point for complex numbers (4) than for real numbers (5). Furthermore, the real iteration uses an F table 222 addressed by most significant bits in, for example, the format 5.9, whereas handling the θ = π case for complex numbers without iteration requires addressing F table 222 with all 16 bits of the difference R in format 4.12, which requires a table of a different size.

Fig. 11 shows different possible bit allocations for real and complex numbers. Fig. 11A shows a straightforward allocation of bits 1 to 32 for a real logarithmic value, starting with the sign bit S in bit 1, followed by a 31-bit logarithmic magnitude in format 8.23, and shows its division into most significant bits X_M in format 5.9 and least significant bits X_L in format 0.14. The lower part shows a straightforward allocation of bits 1 to 32 to a logarithmic magnitude in format 5.12 and a phase angle in format 0.15, with the logarithmic magnitude divided into a most significant part R_M in format 4.4 and a least significant 8 bits R_L, and the phase divided into most significant and least significant 7-bit parts, and shows the bit position corresponding to π. The misalignment between the real and complex formats is evident from Fig. 11A. For example, the binary points of the logarithmic magnitudes are not in the same position, and the bits that address F table 222 (X_M for real numbers and R_M, θ_M for complex numbers) are not in the same positions either.

Fig. 11B shows a bit allocation in which the binary points of the real and complex logarithmic magnitudes are aligned. This matters only if it is intended to reuse the real F_s table 222 for the complex case of θ = π and the real F_a table 222 for the complex case of θ = 0. The bits addressing the F and G tables, however, still differ between real and complex numbers.

Fig. 11C shows a bit allocation in which the same bits address F table 222 for real and complex numbers. For real arithmetic, the sign bit S and the most significant part X_M are placed contiguously to form a 15-bit address for F table 222, and in the complex case the same 15 bits comprise the 8 bits of R_M and the 7 bits of θ_M.

Similarly, the 15 bits formed by R_L and θ_L overlap the 14 bits of X_L, which address the G table 224 ROM. In the real case the G table is half the size of the complex table, so bit 2 is simply ignored when the G table is addressed for real numbers. Fig. 11C also shows that the bit order within the most significant and least significant parts is arbitrary; provided that the significance of the bits is kept unchanged between real and complex operation, the bit order can be chosen to maximize the number of carry connections from one adder stage to the next.

A simplified implementation does not attempt to merge the complex and real F tables 222 into one large table, which would require the same address bits to be used in both cases, but instead uses separate tables connected to the appropriate address bits, selected differently from input accumulator 210 for real and complex numbers. An alternative is to use multiple separate address decoders for real and complex numbers. Likewise, the G tables 224 for real and complex numbers can be different tables, or at least have different address decoders. The total size remains about the same as that of the merged table, apart from the θ = π case, which needs further consideration. The θ = π case becomes a problem only when the logarithmic magnitudes are nearly equal, i.e., when R is nearly zero. It therefore needs special treatment only for R values of the form 0000.xxxxxxxxxxxx, i.e., where the most significant 4 bits of the difference R are zero. This requires only a 4096-word table, which is worthwhile in order to avoid the complexity of re-routing bit lines so that the real F_s table 222 can be used. Given that the look-up tables account for the largest proportion of silicon chip area, and that the chip area occupied by adders, accumulators, and other peripheral logic is comparatively small, it can be concluded that separate implementations for real and complex numbers are logical, with the added benefit of increased processing speed because the processor can then perform real and complex operations simultaneously.

The present invention may, of course, be carried out in ways other than those specifically set forth herein without departing from its essential characteristics. The present embodiments are to be considered in all respects as illustrative and not restrictive, and all changes coming within the meaning and equivalency range of the appended claims are intended to be embraced therein.

Claims

1. An ALU for computing an output logarithm, the ALU comprising:

a memory storing a first look-up table and a second look-up table, the first look-up table being used to determine logarithms of real numbers and the second look-up table being used to determine logarithms of complex numbers; and

a shared processor that generates the output logarithm based on two input operands represented in a logarithmic format, using the first look-up table for real input operands and the second look-up table for complex input operands.

2. The ALU according to claim 1, wherein the output logarithm represents the logarithm of the sum of the input operands or the logarithm of the difference of the input operands.

3. The ALU according to claim 1, wherein the ALU comprises a butterfly circuit configured to use the first look-up table or the second look-up table to simultaneously generate the logarithm of the difference of the input operands and the logarithm of the sum of the input operands.

4. The ALU according to claim 3, wherein the butterfly circuit comprises:

a first combiner for combining a selected input operand with a difference value provided by the first look-up table or the second look-up table, to generate the logarithm of the difference of the input operands; and

a second combiner for combining the selected input operand with a sum value provided by the first look-up table or the second look-up table, to generate the logarithm of the sum of the input operands.

5. The ALU according to claim 1, wherein the shared processor comprises:

a look-up controller configured to compute one or more partial outputs based on the first look-up table or the second look-up table; and

an output accumulator configured to generate the output logarithm based on the partial outputs.

6. The ALU according to claim 5, wherein the number of partial outputs used to generate the output logarithm is based on a desired accuracy of the output logarithm.

7. The ALU according to claim 5, wherein the shared processor executes two or more iterations via the look-up controller to determine the output logarithm, and wherein each iteration generates one of the partial outputs.

8. The ALU according to claim 7, further comprising an input accumulator configured to generate a real or complex input for a current iteration based on the partial output generated during a previous iteration.

9. The ALU according to claim 7, wherein the output accumulator generates the output logarithm based on a selected input operand and the partial outputs generated during each iteration.

10. The ALU according to claim 9, wherein the shared processor further comprises a selection circuit configured to select the input operand having the largest magnitude.

11. The ALU according to claim 5, wherein the look-up controller comprises a multi-stage pipeline, and wherein each stage of the multi-stage pipeline generates one of the partial outputs.

12. The ALU according to claim 11, wherein each stage of the pipeline stores a selected portion of the first look-up table and a selected portion of the second look-up table.

13. The ALU according to claim 12, wherein at least one stage of the pipeline executes the selected portion of the first look-up table using a real stage input, or executes the selected portion of the second look-up table using a complex stage input, to generate one of the partial outputs.

14. The ALU according to claim 1, wherein the complex input operands each comprise a magnitude portion and a phase portion.

15. The ALU according to claim 14, further comprising an input accumulator, the input accumulator comprising:

a magnitude accumulator for generating a magnitude portion of a complex input based on the magnitude portions of the complex input operands; and

a phase accumulator for generating a phase portion of the complex input based on the phase portions of the complex input operands.

16. A method for computing an output logarithm in an ALU, the method comprising the following steps:

a first look-up table storing step of storing a first look-up table for determining logarithms of real numbers;

a second look-up table storing step of storing a second look-up table for determining logarithms of complex numbers; and

an output logarithm generating step of generating, in a shared processor, the output logarithm based on two input operands represented in a logarithmic format, using the first look-up table for real input operands and the second look-up table for complex input operands.

17. The method according to claim 16, wherein the output logarithm generating step comprises a step of generating the output logarithm based on the sum of the input operands or the difference of the input operands.

18. The method according to claim 16, wherein the output logarithm generating step comprises a simultaneous generating step that uses the first look-up table or the second look-up table to simultaneously generate the output logarithm of the difference of the input operands and the output logarithm of the sum of the input operands.

19. The method according to claim 18, wherein the simultaneous generating step comprises the following steps:

selecting an input operand based on a comparison between the input operands;

combining the selected input operand with a difference value provided by the first look-up table or the second look-up table, to generate the output logarithm of the difference of the input operands; and

combining the selected input operand with a sum value provided by the first look-up table or the second look-up table, to generate the output logarithm of the sum of the input operands.

20. The method according to claim 16, wherein the output logarithm generating step comprises the following steps:

computing one or more partial outputs based on the first look-up table or the second look-up table; and

generating the output logarithm based on the partial outputs.

21. The method according to claim 20, further comprising an iteration step of executing two or more iterations to determine the output logarithm, wherein each iteration generates one of the partial outputs.

22. The method according to claim 21, further comprising an input generating step of generating an input for a current iteration based on the partial output generated during a previous iteration.

23. The method according to claim 20, wherein the output logarithm generating step comprises generating the output logarithm based on the partial outputs generated in the stages of a multi-stage pipeline.

24. The method according to claim 23, further comprising a selected portion storing step of storing a selected portion of the first look-up table and a selected portion of the second look-up table in each stage of the multi-stage pipeline.

25. The method according to claim 24, further comprising a selected portion executing step of executing, in at least one stage of the pipeline, the selected portion of the first look-up table or the selected portion of the second look-up table based on a real stage input or a complex stage input, respectively, to generate one of the partial outputs.

26. The method according to claim 16, wherein the complex input operands each comprise a magnitude portion and a phase portion.

27. The method according to claim 26, further comprising the following steps:

a magnitude portion generating step of generating a magnitude portion of a complex input based on the magnitude portions of the complex input operands; and

a phase portion generating step of generating a phase portion of the complex input based on the phase portions of the complex input operands.

28. An ALU for computing an output logarithm of complex numbers, the ALU comprising:

a memory for storing a look-up table used to determine logarithms of complex numbers; and

a processor that uses the stored look-up table to generate an output logarithm of an arithmetic combination of complex input operands represented in a logpolar format.

29. The ALU according to claim 28, wherein the processor comprises a butterfly circuit configured to simultaneously compute, based on the look-up table, the output logarithm of the difference of the complex input operands and the output logarithm of the sum of the complex input operands.

30. The ALU according to claim 29, wherein the butterfly circuit comprises:

a first combiner for combining a selected input operand with a difference value provided by the look-up table, to generate the output logarithm of the difference of the complex input operands; and

a second combiner for combining the selected input operand with a sum value provided by the look-up table, to generate the output logarithm of the sum of the complex input operands.

31. The ALU according to claim 28, wherein the processor comprises:

a look-up controller configured to compute one or more partial outputs based on the look-up table; and

32. The ALU according to claim 31, wherein the processor executes two or more iterations via the look-up controller to generate the output logarithm, and wherein each iteration generates one of the partial outputs.

33. The ALU according to claim 32, further comprising an input accumulator configured to generate a complex input for a current iteration based on the partial output generated during a previous iteration.

34. The ALU according to claim 31, wherein the look-up controller comprises a multi-stage pipeline, and wherein each stage of the multi-stage pipeline generates one of the partial outputs.

35. The ALU according to claim 28, wherein the complex input operands comprise a magnitude portion and a phase portion.

36. The ALU according to claim 35, further comprising an input accumulator, the input accumulator comprising:

37. The ALU according to claim 35, wherein the phase portion comprises the most significant portion of the complex input, and wherein the magnitude portion comprises the least significant portion of the complex input.

38. The ALU according to claim 37, wherein the look-up table comprises a magnitude look-up table and a phase look-up table.

39. The ALU according to claim 38, wherein the most significant portion of the complex input addresses the phase look-up table, and wherein the least significant portion of the complex input addresses the magnitude look-up table.

40. A method of computing an output logarithm of complex numbers, the method comprising the following steps:

a look-up table storing step of storing a look-up table for determining logarithms of complex numbers represented in a logpolar format; and

an output logarithm generating step of using the stored look-up table to generate an output logarithm based on complex input operands represented in the logpolar format.

41. The method according to claim 40, wherein the output logarithm generating step comprises the following step: simultaneously computing, based on the look-up table, the output logarithm of the difference of the complex input operands and the output logarithm of the sum of the complex input operands.

42. The method according to claim 40, wherein the output logarithm generating step comprises the following steps:

computing one or more partial outputs based on the look-up table; and

generating the output logarithm based on the partial outputs.

43. The method according to claim 42, further comprising the following step: executing two or more iterations to generate the output logarithm, wherein each iteration generates one of the partial outputs.

44. The method according to claim 43, wherein the output logarithm generating step comprises the following step: executing a multi-stage pipeline to generate the output logarithm, wherein each stage of the multi-stage pipeline generates one of the partial outputs.

45. The method according to claim 40, further comprising the following steps:

46. The method according to claim 45, wherein the phase portion comprises the most significant portion of the complex input, and wherein the magnitude portion comprises the least significant portion of the complex input.

47. The method according to claim 46, wherein the look-up table comprises a magnitude look-up table and a phase look-up table.

48. The method according to claim 47, further comprising the following steps: addressing the phase look-up table using the most significant portion of the complex input, and addressing the magnitude look-up table using the least significant portion of the complex input.