KR20070018981A

KR20070018981A - Complex logarithmic ALU

Info

Publication number: KR20070018981A
Application number: KR1020067025542A
Authority: KR
Inventors: Paul Wilkinson Dent
Original assignee: Telefonaktiebolaget LM Ericsson (publ)
Priority date: 2004-06-04
Filing date: 2005-06-02
Publication date: 2007-02-14

Abstract

The present invention relates to a method and apparatus for performing logarithmic arithmetic on real and/or complex numbers represented in a logarithmic format. In one exemplary embodiment, an ALU implements logarithmic arithmetic on complex numbers represented in a logpolar format. According to this embodiment, memory in the ALU stores a look-up table used to determine logarithms of complex numbers, while a processor in the ALU uses the stored look-up table to generate an output logarithm based on complex input operands represented in logpolar format. In another exemplary embodiment, the ALU performs logarithmic arithmetic on both real and complex numbers represented in a logarithmic format. In this embodiment, the memory stores two look-up tables, one for determining logarithms of real numbers and one for determining logarithms of complex numbers, while the processor uses the real or complex look-up table, respectively, to generate an output logarithm based on real or complex input operands represented in logarithmic format.

ALU, look-up table, accumulator, memory, butterfly circuit

Description

Complex logarithmic ALU {COMPLEX LOGARITHMIC ALU}

The present invention relates generally to computing and digital signal processing, and more particularly to pipelined logarithmic arithmetic in an arithmetic logic unit (ALU).

ALUs have traditionally been used to implement various arithmetic functions, such as addition, subtraction, multiplication, and division, on real and/or complex numbers. Conventional systems use either fixed-point or floating-point ALUs. ALUs that use real logarithms of limited precision are also known. See, for example, "Digital filtering using logarithmic arithmetic" (N. G. Kingsbury and P. J. W. Rayner, Electron. Lett. (Jan. 28, 1971), Vol. 7, No. 2, pp. 56-58). "Arithmetic on the European Logarithmic Microprocessor" (J. N. Coleman, E. I. Chester, C. I. Softley and J. Kadlec, IEEE Trans. Comput. (July 2000), Vol. 49, No. 7, pp. 702-715) provides another example of a high-precision (32-bit) logarithmic unit for real numbers.

Fixed-point programming places on the programmer the mental burden of tracking the position of the decimal point, particularly after multiplication or division. For example, suppose an FIR filter involves a weighted addition of signal samples using weighting factors of -0.607, 1.035, -0.607, ..., which must be specified to an accuracy of 1 part in 1000. In fixed-point arithmetic, it is necessary to represent 1.035 by the integer 1035, for example. As a result, multiplying a signal sample by this number extends the word length of the result by 10 bits. To store the result in the same memory word length, 10 bits must then be discarded; whether the MSBs (most significant bits), the LSBs (least significant bits), or some of each are discarded depends on the signal data spectrum and must be determined by simulation using realistic data. This makes verification of correct programming difficult.
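The word-length growth described above can be checked with a few lines of Python. This is only an illustrative sketch: the scale factor of 1000, the helper name `scaled_multiply`, and the choice to discard the 10 least significant bits are assumptions made for the example, not part of the patent text.

```python
import math

SCALE = 1000                      # assumed scaling giving 1-part-in-1000 resolution
weights = [-0.607, 1.035, -0.607]

# Integer (fixed-point) versions of the weights: 1.035 becomes 1035.
fixed = [round(w * SCALE) for w in weights]

# Multiplying by a weight of magnitude 1035 adds about log2(1035) bits
# to the word length of the product.
extra_bits = math.log2(max(abs(w) for w in fixed))

def scaled_multiply(sample: int, weight: int, drop: int = 10) -> int:
    """Multiply, then discard `drop` LSBs to return to the original word length.

    Dropping LSBs is only one of the options mentioned in the text; which
    bits to discard depends on the signal data.
    """
    return (sample * weight) >> drop

print(fixed, round(extra_bits, 2), scaled_multiply(1024, fixed[1]))
```

The logarithm comes out just above 10, which is where the "10 bits" in the paragraph above comes from.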

Floating-point processors were introduced to avoid the mental burden of tracking the decimal point by tracking it automatically with the aid of an "exponent" part associated with the "mantissa" part of each stored number. The IEEE standard floating-point format is:

SEEEEEEEE.MMMMMMMMMMMMMMMMMMMMMMM,

where S is the sign of the value (0 = +; 1 = -), EEEEEEEE is the 8-bit exponent, and MMM...MM is the 23-bit mantissa. In the IEEE standard floating-point format, the 24th, most significant, bit of the mantissa is always 1 (except for true zero) and is therefore omitted. Thus, in the IEEE format, the actual value of the mantissa is:

1.MMMMMMMMMMMMMMMMMMMMMMM.

For example, the number -1.40625 × 10^-2 = -1.8 × 2^-7 may be represented in the IEEE standard format as:

1 01111000.11001100110011001100110.

Also, the zero exponent is 01111111, so the number +1.0 may be written as:

0 01111111.00000000000000000000000.

Representing true zero would require a negative infinite exponent, which is not practical, so an artificial zero is created by interpreting the all-zeros bit pattern as true zero instead of 2^-127.
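The bit patterns quoted above can be reproduced with Python's standard `struct` module. The helper name `ieee_fields` is hypothetical and exists only for this illustration; it prints the sign, exponent, and mantissa fields in the same "S EEEEEEEE.MMM...M" layout used in the text.

```python
import struct

def ieee_fields(value: float) -> str:
    """Return the IEEE 754 single-precision fields as 'S EEEEEEEE.MMM...M'."""
    # Pack as a big-endian 32-bit float, then reinterpret as an unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF      # 8-bit exponent, offset by 127
    mantissa = bits & 0x7FFFFF          # 23 stored bits; the leading 1 is implied
    return f"{sign} {exponent:08b}.{mantissa:023b}"

print(ieee_fields(-1.8 * 2**-7))   # the example value -1.40625e-2
print(ieee_fields(1.0))            # zero exponent stored as 01111111
print(ieee_fields(0.0))            # all-zeros pattern used as artificial zero
```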

To multiply two floating-point numbers, the mantissas, with their suppressed MSB 1's restored, are multiplied using a fixed-point 24×24-bit multiplier, which is logic of moderately high complexity and delay, while the exponents are added and one of the offsets of 127 is subtracted. The 48-bit result of the multiplication must then be truncated to 24 bits, and the most significant 1 is deleted after left-justification. Multiplication is therefore even more complicated for floating point than for fixed point.

To add two floating-point numbers, their exponents must first be subtracted to determine whether their decimal points are aligned. If they are not aligned, the smaller number is selected to be right-shifted by a number of binary places equal to the exponent difference, so as to align the points before the mantissas are added, with the implied 1's restored. To perform the shifting quickly, a barrel shifter may be used, which is similar in structure and complexity to a fixed-point multiplier. After adding, and more particularly after subtracting, leading zeros must be left-shifted out of the mantissa while decrementing the exponent accordingly. Thus, addition and subtraction are also complicated operations in floating-point arithmetic.
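The alignment and renormalization steps of floating-point addition can be sketched as follows. This is a simplified illustration restricted to positive operands with truncating shifts; the `(exponent, mantissa)` tuple layout and the names `fp_add` and `to_float` are assumptions made for the example and do not model rounding, subtraction, or the IEEE exponent offset.

```python
MANT_BITS = 24   # 23 stored bits plus the restored leading 1

def fp_add(a, b):
    """Add two positive values given as (exponent, mantissa) pairs.

    Each value equals mantissa * 2**(exponent - 23), with
    2**23 <= mantissa < 2**24 (the leading 1 already restored).
    """
    (ea, ma), (eb, mb) = a, b
    if ea < eb:                      # make `a` the operand with the larger exponent
        (ea, ma), (eb, mb) = (eb, mb), (ea, ma)
    mb >>= ea - eb                   # align the points (the barrel-shifter step)
    m, e = ma + mb, ea
    if m >> MANT_BITS:               # carry out of the top bit: renormalize
        m >>= 1
        e += 1
    return e, m

def to_float(v):
    """Convert an (exponent, mantissa) pair back to a Python float."""
    e, m = v
    return m * 2.0 ** (e - 23)

one_and_half = (0, 0xC00000)         # 1.5 * 2**0
quarter = (-2, 0x800000)             # 1.0 * 2**-2
print(to_float(fp_add(one_and_half, quarter)))
print(to_float(fp_add(one_and_half, one_and_half)))
```

Even in this reduced form the addition needs a comparison, a variable shift, an add, and a conditional renormalization, which is the complexity the paragraph above refers to.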

In a purely linear format, addition and subtraction with fixed-point numbers are simple, while multiplication, division, squares, and square roots are more complicated. Multipliers are constructed as a sequence of "shift and conditionally add" circuits that inherently have a large number of logic delays. Fast processors may use pipelining to overcome this delay, but this typically complicates programming. It is therefore of interest to minimize the pipelining delay in a fast processor.

It should be noted that the floating-point representation is a hybrid between logarithmic and linear representations. The exponent is the whole part of the logarithm to base 2 of the number, and the mantissa is a linear fractional part. Because multiplication is complicated for linear representations and addition is complicated for logarithmic representations, this explains why both multiplication and addition are complicated for the hybrid floating-point representation. To overcome this, some known systems, such as those cited above, have used a purely logarithmic representation. This solves the problem of tracking the decimal point and simplifies multiplication, leaving only addition complicated. Logarithmic addition was performed in the prior art using look-up tables. However, limitations on the size of the tables restrict this solution to limited word lengths, for example, in the 0-24 bit range. In the Coleman reference cited above, 32-bit precision was achieved with reasonably sized look-up tables by using an interpolation technique that requires a multiplier. As such, the Coleman process still includes the complexities associated with multiplication.

While the prior art describes various methods and apparatus for implementing real logarithmic arithmetic, it does not provide a look-up table solution for complex arithmetic, which would be useful in radio signal processing. Nor does the prior art provide an ALU having shared real and complex processing capabilities. Because radio signal processing often requires both complex and real processing capabilities, a single ALU that implements both real and complex logarithmic arithmetic would be beneficial in wireless communication devices having size and/or power concerns.

Summary of the Invention

The present invention relates to an arithmetic logic unit (ALU) that performs arithmetic computations with real and/or complex numbers represented in a logarithmic format. Using a logarithmic number representation simplifies multiplication and division operations, but makes addition and subtraction more difficult. However, the logarithm of the sum or difference of two input operands may be simplified using known algorithms, as discussed herein. In the following discussion, it is assumed that a > b and that c = a + b. It can be shown that:

C = log_q(c) = log_q(a+b) = A + log_q(1+q^-r)    (1)

where q is the base of the logarithm, r = A-B, A = log_q(a), and B = log_q(b). The operation represented by Equation (1), referred to herein as logadd, allows the logarithm of the sum of a and b to be computed using only addition and subtraction operations, where the value of log_q(1+q^-r) is determined using a look-up table.
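Equation (1) can be exercised numerically with a short Python sketch. The function name `logadd` follows the term used in the text, but the signature and the default base are assumptions for illustration, and the correction term is evaluated directly here where the ALU would read it from a look-up table.

```python
import math

def logadd(A: float, B: float, q: float = 2.0) -> float:
    """Return log_q(a + b) given A = log_q(a) and B = log_q(b), for a, b > 0."""
    if A < B:                # ensure A is the logarithm of the larger operand
        A, B = B, A
    r = A - B                # r >= 0, so 0 < q**(-r) <= 1
    # In hardware this term would be a table look-up indexed by r;
    # it is computed directly here purely for illustration.
    return A + math.log(1.0 + q ** (-r), q)

C = logadd(math.log2(3.0), math.log2(2.0))
print(C, math.log2(5.0))
```

Apart from the table look-up, the only operations on the logarithms are one subtraction (to form r) and one addition.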

In one exemplary embodiment, the present invention provides an ALU for performing logarithmic arithmetic on complex input operands represented in a logpolar format. For example, A = log_q(a) = (R₁, θ₁) and B = log_q(b) = (R₂, θ₂), where R and θ represent a logmagnitude and a phase angle, respectively, as discussed further below. According to this embodiment, the ALU comprises memory and a processor. The memory stores a look-up table used to determine logarithms of complex numbers in the logpolar format, while the processor uses the stored look-up table to generate an output logarithm of the complex input operands represented in logpolar format.

In another exemplary embodiment, the present invention provides an ALU for performing logarithmic arithmetic on both real and complex numbers represented in a logarithmic format. An exemplary ALU according to this embodiment also comprises memory and a processor. The memory stores two look-up tables, one for determining logarithms of real numbers and one for determining logarithms of complex numbers. The processor comprises a shared processor that generates an output logarithm based on input operands represented in a logarithmic format, using the real look-up table for real input operands and the complex look-up table for complex input operands.

In any event, according to one exemplary embodiment of the present invention, the processor may comprise a butterfly circuit configured to simultaneously generate output logarithms for both logadd and logsub operations. According to another exemplary embodiment, the processor may comprise a look-up controller and an output accumulator, where the look-up controller computes one or more partial outputs based on the look-up tables. The partial outputs may be determined during one or more iterations, or during one or more stages of a pipeline. The output accumulator generates the output logarithm based on the partial outputs.

FIG. 1 illustrates a plot comparison between the IEEE floating-point format and the true logarithmic format for real numbers.

FIG. 2 illustrates a chart comparison between the IEEE floating-point format and the true logarithmic format for real numbers.

FIG. 3 illustrates a block diagram of a linear interpolator.

FIG. 4 illustrates a plot comparison between the true F-functions and an exponential approximation.

FIGS. 5A and 5B illustrate quantization regions of the logpolar and Cartesian representations, respectively.

FIG. 6 illustrates a block diagram of an exemplary ALU for simultaneously performing logadd and logsub operations.

FIG. 7 illustrates an implementation of a 16-point FFT using the ALU of FIG. 6.

FIG. 8 illustrates a block diagram of an exemplary ALU according to the present invention.

FIG. 9 illustrates a block diagram of an exemplary look-up controller for the ALU of FIG. 8.

FIG. 10 illustrates additional details of an exemplary ALU according to the present invention.

FIGS. 11A to 11C illustrate different assignments of complex numbers for real numbers.

The present invention provides an ALU for performing logarithmic arithmetic on complex and/or real numbers in a logarithmic format. In one embodiment, the ALU performs logarithmic arithmetic on complex numbers represented in a logpolar format using one or more look-up tables. In another embodiment, the ALU performs logarithmic arithmetic on both complex and real numbers represented in a logarithmic format using at least one complex look-up table and at least one real look-up table, respectively. To better understand the details and benefits of the invention, the following first provides details regarding number representation, conventional interpolation, iterative logarithmic operations, high-precision iterative logarithmic addition, high-precision iterative logarithmic subtraction, and exponential approximation.

<Embodiments>

Number representation

Logarithmic operations implemented in an ALU generally require a specific number format. As discussed above, conventional processors may format real or complex numbers in a fixed-point binary format or a floating-point format. The fixed-point format is a purely linear format; addition and subtraction with fixed-point numbers are therefore simple, while multiplication is more complicated. Floating-point numbers are a hybrid between logarithmic and linear representations; addition, subtraction, multiplication, and division are therefore all complicated in floating-point format. To overcome some of the difficulties associated with these formats, a purely logarithmic format may be used together with an appropriate algorithm that solves the addition and subtraction problems associated with the logarithmic format. The following provides additional details associated with the purely logarithmic format as it may apply to the present invention.

Real numbers in a purely logarithmic format may be abbreviated as (S 8.23) and represented as:

S xxxxxxxx.xxxxxxxxxxxxxxxxxxxxxxx

Two such real numbers may be used as one way to represent complex numbers. However, as described further below, a logpolar format may be a more useful way to represent complex numbers.

The base used for the logarithms is open to choice. However, there are advantages to choosing one base over another. Choosing base 2, for example, has a number of advantages. First, as shown in Equation (2), a 32-bit pure logarithmic format then looks substantially identical to the (S8.23) IEEE floating-point representation.

Pure logarithmic: S xx...xx.xx...xx ↔ (-1)^S × 2^(-xx...xx.xx...xx)
IEEE: S EE...EE.MM...MM ↔ (-1)^S × (1+0.MM...MM) × 2^(-EE...EE)    (2)

Since the whole part of the logarithm to base 2 may be offset by 127 as in the IEEE format, the number 1.0 is represented in either format by:

0 01111111.00000000000000000000000.

Alternatively, an offset of 128 may be used, in which case 1.0 is represented by:

0 10000000.00000000000000000000000.

Whether 127 or 128 is used as the preferred offset is a matter of implementation.

The all-zeros pattern may be defined as an artificial true zero, as in the IEEE floating-point format. In fact, if the same exponent offset (127) is used, such a pure logarithmic format coincides with the IEEE format for all numbers that are a power of two, e.g., 4, 2, 1, 0.5, etc., and the mantissa parts of the two formats differ only slightly in between powers of two, as shown in FIG. 1.
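The coincidence of the two formats at powers of two can be checked with a small Python sketch. Both helper names (`ieee_fields`, `pure_log_fields`) are hypothetical, and the rounding of the logarithm to 23 fractional bits is an assumption made for the example.

```python
import math
import struct

def ieee_fields(value: float) -> str:
    """IEEE 754 single-precision fields as 'S EEEEEEEE.MMM...M'."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    return f"{bits >> 31} {(bits >> 23) & 0xFF:08b}.{bits & 0x7FFFFF:023b}"

def pure_log_fields(value: float, offset: int = 127) -> str:
    """Pure base-2 logarithmic (S8.23) fields for a nonzero value.

    The logarithm is offset by `offset` (127 assumed here, as in the IEEE
    format) and rounded to 23 fractional bits.
    """
    sign = 0 if value > 0 else 1
    code = round((math.log2(abs(value)) + offset) * 2**23)
    return f"{sign} {code >> 23:08b}.{code & 0x7FFFFF:023b}"

for v in (4.0, 2.0, 1.0, 0.5, 3.0):
    print(v, ieee_fields(v), pure_log_fields(v))
```

For 4, 2, 1, and 0.5 the two patterns are identical; for 3.0 the whole parts agree and only the fractional bits differ, because the IEEE mantissa is linear while the logarithmic fraction is not.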

For the pure logarithmic format, the maximum representable value is:

0 11111111.11111111111111111111111

which for base 2 represents a logarithm of almost 256 minus the offset of 127, i.e., a number of almost 2^129 or 6.81 × 10^38.

The smallest representable value is:

0 00000000.00000000000000000000000

which for base 2 represents a logarithm equal to -127, i.e., a number of 5.88 × 10^-39. If desired, this all-zeros format may, as in the IEEE case, be reserved to represent an artificial true zero. In this scenario, the smallest representable number is:

0 00000000.00000000000000000000001

which has a base-2 logarithm of almost -127 and still corresponds to approximately 5.88 × 10^-39.

The quantization accuracy of the IEEE mantissa, which has a value between 1 and 2, is the LSB value of 2^-23, i.e., a relative accuracy of between 2^-23 and 2^-24 (0.6 to 1.2 × 10^-7). The accuracy of representing a number x in base-2 logarithmic format is a constant 2^-23 in the logarithm, which gives dx/x = log_e(2) × 2^-23 or 0.83 × 10^-7, which is slightly better than the average of the IEEE quantization accuracy.
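The range and accuracy figures above follow directly from the (S8.23) layout and can be recomputed as a check. The variable names below are chosen for this illustration only.

```python
import math

OFFSET = 127
LSB = 2.0 ** -23                       # weight of the last fractional bit

# Largest code 11111111.111...1 is 256 - 2**-23; remove the offset.
max_log2 = (256 - LSB) - OFFSET        # just under 129
max_value = 2.0 ** max_log2            # almost 2**129
min_value = 2.0 ** (0 - OFFSET)        # all-zeros code: 2**-127

# Relative accuracy of the IEEE mantissa (value between 1 and 2).
ieee_rel_best, ieee_rel_worst = LSB / 2, LSB

# Relative accuracy of the base-2 logarithmic format: dx/x = ln(2) * 2**-23.
log_rel = math.log(2) * LSB

print(f"{max_value:.3e} {min_value:.3e}")
print(f"{ieee_rel_best:.2e} {ieee_rel_worst:.2e} {log_rel:.2e}")
```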

In other implementations, logarithms to other bases, such as base e, may be used. For base e, real numbers may be stored in a 32-bit sign plus logmagnitude format denoted by:

S xxxxxxx.xxxxxxxxxxxxxxxxxxxxxxxx

or (S7.24) for short. Due to the larger base (e = 2.718), a smaller number of bits to the left of the point suffices to give an adequate dynamic range, while an extra bit to the right of the point is needed for equivalent or better precision, as discussed further below.

The logmagnitude part may be a signed, fixed-point quantity whose leftmost bit is its sign bit, not to be confused with the sign S of the represented number. Alternatively, the logmagnitude part may be offset by +64 (or +63) so that the bit pattern:

0 1000000.000000000000000000000000

represents a zero logarithm (number = 1.0). In the latter case, the largest representable number has the base-e logarithm:

0 1111111.111111111111111111111111

which is almost 128 less the offset of 64, i.e., e^64 or 6.24 × 10^27, while the reciprocal represents the smallest representable number. Equation (3) gives the quantization accuracy of the base-e logarithmic representation:

dx/x = 2^-24 ≈ 0.6 × 10^-7    (3)

FIG. 2 compares the IEEE floating-point format (with +127 offset) with the base-e format (with +64 offset) and the base-2 format (with +127 offset).

Choosing the base is in fact equivalent to determining a trade-off between dynamic range and precision within the fixed word length, and is equivalent to moving the binary point in steps of less than one whole bit. Choosing the base to be 2 or 4 or √2 (in general 2^(2^N), where N is a positive or negative integer) is equivalent to moving the binary point plus or minus N bit positions while giving identical performance. Choosing a base of 8, however, is not equivalent to moving the point a whole number of places, as it divides the logarithm by 3. In other words, selecting the logarithm base is mathematically equivalent to changing the split of bits between the right and left sides of the binary point, which alters the compromise between accuracy and dynamic range. The point, however, may only be shifted in whole-bit steps, while the base may be varied continuously. In the case of a signed logmagnitude (as opposed to an unsigned, 127-offset logmagnitude), the sign bit is distinguished from the sign of the number (the S bit) by referring to it as the sign of the logmagnitude. To clarify this further, consider that in base-10 logarithms, log₁₀(3) = 0.4771, while log₁₀(1/3) = -0.4771. Thus, to indicate a value of +3 in logarithmic notation, the signs of both the number and its logarithm are +, which may be written ++0.4771. The following table illustrates this notation.
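The equivalence between a change of base and a shift of the binary point can be illustrated numerically. This sketch assumes the general base form 2^(2^N) discussed above; the helper name `log_base_2_pow_2_pow` is hypothetical.

```python
import math

def log_base_2_pow_2_pow(x: float, n: int) -> float:
    """Logarithm of x to the base 2**(2**n); n may be negative."""
    return math.log(x, 2.0 ** (2.0 ** n))

x = 10.0
log2_x = math.log2(x)

# Base 4 (n = 1) halves the base-2 logarithm: the point moves one bit position.
log4_x = log_base_2_pow_2_pow(x, 1)
# Base sqrt(2) (n = -1) doubles it: the point moves one position the other way.
log_sqrt2_x = log_base_2_pow_2_pow(x, -1)
# Base 8 divides the base-2 logarithm by 3, which is not a power-of-two scaling.
log8_x = math.log(x, 8.0)

print(log2_x, log4_x, log_sqrt2_x, log8_x)
```

Scaling a fixed-point value by a power of two leaves its bits unchanged and only relocates the point, which is why bases of this form give identical performance, whereas dividing by 3 changes the bits themselves.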

Notation | Representation
++0.4771 | log₁₀(+3)
+-0.4771 | log₁₀(+1/3)
-+0.4771 | log₁₀(-3)
--0.4771 | log₁₀(-1/3)

To ensure that all logarithmic representations are positive, an offset representation may be used. For example, if quantities are instead represented by the logarithm of how many times larger they are than a selected number, e.g., 0.0001, the representation of 3 would be log₁₀(3/0.0001) = 4.4771 and the representation of 1/3 would be log₁₀(0.3333/0.0001) = 3.5229. Due to the offset, both are now positive. The representation of 0.0001 would be log₁₀(0.0001/0.0001) = 0. The all-zeros bit pattern then represents the smallest possible quantity of 0.0001.
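The offset representation can be reproduced in a few lines. The reference value 0.0001 is the one used in the text; the function name `offset_log10` is an assumption for this illustration.

```python
import math

REFERENCE = 0.0001   # smallest representable quantity, as in the example above

def offset_log10(value: float) -> float:
    """Base-10 logarithm of how many times larger `value` is than REFERENCE."""
    return math.log10(value / REFERENCE)

print(round(offset_log10(3.0), 4))        # representation of 3
print(round(offset_log10(1.0 / 3.0), 4))  # representation of 1/3
print(offset_log10(REFERENCE))            # representation of 0.0001 itself
```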

Traditional log tables require storing 10,000 numbers for logarithms between 0.0000 and 0.9999 in order to look up the antilogarithm, and a similar amount to obtain the logarithm to the same precision. Logarithmic identities may be used to reduce the size of look-up tables. For example, log₁₀(3) = 0.4771 and log₁₀(2) = 0.3010. From this it can be immediately deduced that:

log₁₀(6) = log₁₀(2×3) = log₁₀(3) + log₁₀(2) = 0.4771 + 0.3010 = 0.7781

It can also be immediately deduced that:

log₁₀(1.5) = log₁₀(3/2) = log₁₀(3) - log₁₀(2) = 0.4771 - 0.3010 = 0.1761

However, it cannot be deduced by any simple manipulation of the given numbers 0.4771 and 0.3010 that:

log₁₀(5) = log₁₀(2+3) = 0.6990

Nor is it at all obvious how to deduce from the logarithms of 3 and 2 that:

log₁₀(1) = log₁₀(3-2) = 0

To address this problem, a look-up table based on a logadd function F_a may be used. For example, the logarithm of (2+3) may be obtained by adding the larger of log₁₀(3) and log₁₀(2), that is 0.4771, to a function of their difference, F_a[log₁₀(3) - log₁₀(2)] = F_a(0.1761), where for base 10:

F_a(X) = log₁₀(1+10^-X)

Similarly, the logarithm of (3-2) may be obtained by adding the function F_S(0.1761), whose value is negative, to the larger of log₁₀(3) and log₁₀(2), where for base 10:

F_S(X) = log₁₀(1-10^-X)

However, look-up tables for F_a(X) and F_S(X) still require storing at least 10,000 numbers for each function.
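The two functions can be evaluated directly to confirm the worked values for 3 and 2. This is a sketch only: a real implementation would read F_a and F_S from tables, and the Python names `F_a` and `F_s` are chosen for the example. Note that with F_S defined exactly as above, its value is negative for X > 0, so the sketch adds it to the larger logarithm.

```python
import math

def F_a(X: float) -> float:
    """Logadd correction for base 10: log10(1 + 10**-X)."""
    return math.log10(1.0 + 10.0 ** (-X))

def F_s(X: float) -> float:
    """Logsub correction for base 10: log10(1 - 10**-X), defined for X > 0."""
    return math.log10(1.0 - 10.0 ** (-X))

A, B = math.log10(3.0), math.log10(2.0)   # 0.4771 and 0.3010
r = A - B                                 # 0.1761

log_sum = A + F_a(r)    # log10(3 + 2)
log_diff = A + F_s(r)   # log10(3 - 2)

print(round(r, 4), round(log_sum, 4), round(abs(log_diff), 4))
```

Only the difference r is used as the table index, which is why a single-argument table suffices for either operation.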

Interpolation method

Interpolation may be used to reduce the number of values stored in the look-up table. To facilitate later discussion, interpolation is examined in more detail below. Base e is used for convenience. However, it will be appreciated that other bases are equally applicable.

To compute the function F_a(x) = log_e(1+e^-x) using a limited number of tabular values exemplified by x_o, a Taylor/Maclaurin expansion of the function F(x) about the tabular point x_o gives:

F(x) = F(x_o) + (x-x_o)F'(x_o) + 0.5(x-x_o)²F"(x_o) + ...

where ' signifies the first derivative and " signifies the second derivative. Based on this expansion, log_e(c) = log_e(a+b) may be computed as log_e(a) + F_a(x), where x = log_e(a) - log_e(b) and the values for x_o are provided in a table.

To use simple linear interpolation for the 32-bit base-e case, the second-order term involving the second derivative F" must be negligible to the 24th binary place, e.g., less than 2^-25. Differentiating F_a(x) = log_e(1+e^-x) gives:

F_a'(x) = -e^-x/(1+e^-x)

F_a"(x) = e^-x/(1+e^-x)²

F_a"(x) peaks at 0.25 when x = 0. The second-order term is therefore less than 2^-25 when (x - x_o) < 2^-11. To meet this requirement, the most significant bits address the tabular points x_o in the format (5.11), i.e.,

xxxxx.xxxxxxxxxxx

so that the remainder dx = x - x_o is of the form

0.00000000000xxxxxxxxxxxxx

and is thus less than 2^-11. As such, dx is a 13-bit quantity and x_o is a 16-bit quantity.

The accuracy of the linear interpolation term must also be of the order of 2^-25. Because F_a'(x_o) is multiplied by dx, which is less than 2^-11, F_a'(x_o) needs an accuracy of 2^-14. A couple of extra LSBs may be provided in the table for F_a(x_o) to help reduce rounding errors, which suggests that the lookup table must be 5 bytes (40 bits) wide to store both F and F' for each x_o value.

Thus, the tabulated values comprise 2^16 = 65,536 values of 26-bit F_a and the same number of corresponding 14-bit F_a' values. In addition, a 14×13-bit multiplier is required to form dx·F_a'. Such a multiplier inherently performs 13 shift-and-add operations and therefore includes approximately 13 logic delays. The complexity and delay of the multiplier may be reduced somewhat by using Booth's algorithm, but the conventional multiplier may be used as a benchmark.
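The word lengths above can be tried out with a short Python sketch. It is a minimal illustration under stated assumptions, not the circuit of any figure: x is held as an integer in 5.24 fixed point, `table_entry` is a hypothetical stand-in that computes one table row on demand instead of reading a stored 65,536-row table, and interpolation runs forward from x_o taken as the top 16 bits of x.

```python
import math

FRAC_BITS = 24             # x is held in 5.24 fixed point (assumption)
DX_BITS = 13               # low 13 bits form the interpolation remainder dx
F_BITS, DF_BITS = 26, 14   # stored fractional precision of F_a and F_a'

def table_entry(x_m: int):
    """Stand-in for one table row: quantized (F_a(x_o), F_a'(x_o))."""
    x_o = x_m / (1 << (FRAC_BITS - DX_BITS))   # tabular point in 5.11 format
    f = math.log1p(math.exp(-x_o))             # F_a(x_o) = ln(1 + e^-x_o)
    df = -1.0 / (1.0 + math.exp(x_o))          # F_a'(x_o) = -e^-x_o / (1 + e^-x_o)
    return round(f * (1 << F_BITS)), round(df * (1 << DF_BITS))

def f_a_interp(x_fixed: int) -> float:
    """Linear interpolation F_a(x_o) + dx * F_a'(x_o) from quantized entries."""
    x_m = x_fixed >> DX_BITS                   # 16-bit table address
    dx = x_fixed & ((1 << DX_BITS) - 1)        # 13-bit remainder
    f_q, df_q = table_entry(x_m)
    product = df_q * dx                        # the 14x13-bit multiplication
    return f_q / (1 << F_BITS) + product / (1 << (DF_BITS + FRAC_BITS))
```

Under these assumptions the worst-case error is the sum of the second-order term (at most 2^-25), the quantization of F_a' scaled by dx (at most 2^-26) and the quantization of F_a (at most 2^-27), which stays below 2^-24.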

FIG. 3 shows a block diagram of an example of a conventional ALU that implements the linear interpolation described above. The ALU of FIG. 3 uses a subtractor 10, an adder 20, an F_a/F_a' lookup table 30, a multiplier 40 and a subtractor 50 to estimate the value C = log_e(a+b). As used in this example, A = log_e(a) and B = log_e(b). Because, as described below, it may be necessary to perform backward interpolation for subtraction in order to avoid a singularity, FIG. 3 illustrates interpolating x from a tabular value x_o that is one more than X_M, the most significant 16-bit part of x. Accordingly, the lookup table 30 for F_a contains the value of F_a at X_M + 1, and the value stored for F_a' may be the value computed at the center of the interval, i.e., at X_M + 0.5. Multiplier 40 multiplies the 14-bit F_a'(X_M) value by the 13-bit two's complement of the least significant 13 bits of x. Multiplier 40 is further configured so that the result is the 27-bit product of F_a'(X_M) and that complemented value.

The LSB of the 27-bit product may be input to subtractor 50 as a borrow, and the remaining 26 bits are subtracted from the 26-bit F_a(X_M) value to yield the interpolated value to 26 bits. This value is then added to the larger of A and B in output adder 20, the result C being rounded to 31 bits of logmagnitude by means of a carry-in bit of '1'.

A practical 32-bit logadder based on linear interpolation therefore comprises a lookup table 30 of approximately 65,536 × 40 = 2.62 megabits and a 13×14-bit multiplier 40. These components consume significant silicon area and offer no speed advantage in terms of logic delays. Moreover, to handle subtraction or complex arithmetic with the interpolation method, substantial adjustments to word lengths and multiplier configuration are needed.

For example, to implement subtraction using interpolation, function values are determined according to the subtraction function:

F_s(x) = log_e(1 - e^-x)

The Taylor/Maclaurin expansion of F_s(x) involves the first derivative:

F_s'(x) = e^-x/(1 - e^-x)

which tends to infinity as x tends to 0. To keep operations away from this singularity, the function may be interpolated backward from a tabular value one LSB greater than the actual value of x = log_e(a) - log_e(b) (when a > b), according to Equation 10:

F_s(x) = F_s(x_o) - (x_o-x)F_s'(x_o)

This is the implementation illustrated for logadd in FIG. 3. Then, when at least the most significant bits of x are zero, x_o is one LSB greater in value, which avoids the singularity.

As with addition, for the same 16/13-bit split the minimum value of x_o is 2^-11, and the largest magnitude of F_s' is approximately 2,048. The F_s' value is therefore 12 bits longer than its logadd counterpart, which increases the size of the multiplier for forming dx·F_s' to a 13×26-bit device.

In view of the above, the synergy between real addition and real subtraction, as well as complex operations, is limited in an ALU that implements interpolation. Furthermore, interpolation requires both a lookup table and a multiplication, which is undesirably complex to implement in hardware logic.

Iterative Logarithmic Operations

As an alternative to the interpolation process described above, and to further reduce storage requirements, an iterative solution may be used. The iterative solution uses two relatively small tables to compute a logarithmic output by an iterative process based on tabulated functions. To illustrate the iterative solution, a decimal example is provided showing how log₁₀(5) = log₁₀(3+2) and log₁₀(1) = log₁₀(3-2) may be derived from log₁₀(3) = 0.4771 and log₁₀(2) = 0.3010.

The logadd function table, also referred to as the F_a-table, stores 50 values, based on Equation 4 for base 10, for values of x between 0.0 and 4.9 in steps of 0.1. The other table, also called the G-table or correction table, stores 99 values for values of y between 0.001 and 0.099 in steps of 0.001, based on Equation 11:

G(y) = -log₁₀(1-10^-y)

The two-table iterative process is described below for the log(5) = log(3+2) example using these two lookup tables. Although the following is described in base 10, those skilled in the art will appreciate that any base may be used. For embodiments using a base other than 10, it will be appreciated that while Equations 4 and 11 define the function and correction tables for base-10 calculations, Equation 12 generally defines the function and correction tables for any base q.

F_a(x) = log_q(1+q^-x)

G(y) = -log_q(1-q^-y)

For the logadd process, the argument x = A - B = log₁₀(3) - log₁₀(2) = 0.1761 is first rounded up to the nearest tenth, 0.2. From the F_a-table of 50 values, F_a(0.2) = 0.2124. Adding 0.2124 to 0.4771 gives a first approximation to log₁₀(2+3) of 0.6895. The error introduced by rounding x up from 0.1761 to 0.2 is 0.0239. This error is never more than 0.099, so the 99-value correction lookup table G(y) may be used. For the correction value y = 0.0239, rounded to 0.024, the G-table provides a correction value of 1.2695. Combining G(y) = 1.2695 with the value from the first lookup table, F_a(0.2) = 0.2124, and the original value of x, 0.1761, generates a new argument for F_a, x' = 1.658. Those skilled in the art will appreciate that the prime used to qualify x in this case does not denote differentiation.

When rounded up to the nearest tenth, x' = 1.7. F_a(1.7) = 0.0086, which, when added to the first approximation to log₁₀(2+3) of 0.6895, gives a second approximation of 0.6981. The error in rounding 1.658 up to 1.7 is 0.042. Looking up y = 0.042 in the G-table gives the value 1.035, which, when added to the previous F_a value of 0.0086 and to x' = 1.658, yields a new x-value, x" = 2.7016. After rounding x" up to 2.8, the F_a-table gives F_a(2.8) = 0.0007. Adding 0.0007 to the second approximation (0.6981) gives a final, third approximation of 0.6988, which is considered close enough to the actual value of 0.6990 to the precision expected when using an F_a lookup table of only 50 values and a G lookup table of only 100 values. If desired, a further iteration may be performed for a slight increase in precision. However, more than three iterations are generally not required for addition. Alternatively, if the maximum number of iterations is preset to three, the argument of F_a for the final iteration may be rounded to the nearest tenth, 2.7, instead of always being rounded up. F_a(2.7) = 0.0009, which, when added to the second approximation to log₁₀(3+2) of 0.6981, gives the expected result log₁₀(5) = log₁₀(3+2) = 0.6990.
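The decimal logadd walk-through can be reproduced with a short Python sketch. This is a minimal illustration, assuming tables quantized to four decimals and an argument that is always rounded up to the next tenth; the names `FA_TABLE`, `G_TABLE` and `logadd10_iter` are hypothetical and not taken from the source.

```python
import math

# F_a-table: 50 entries for x = 0.0 .. 4.9 in steps of 0.1 (key i means x = i/10).
FA_TABLE = {i: round(math.log10(1 + 10 ** (-i / 10)), 4) for i in range(50)}
# G-table: 99 entries for y = 0.001 .. 0.099 in steps of 0.001 (key j means y = j/1000).
G_TABLE = {j: round(-math.log10(1 - 10 ** (-j / 1000)), 4) for j in range(1, 100)}

def logadd10_iter(A: float, B: float, max_iter: int = 3) -> float:
    """Approximate log10(a + b) from A = log10(a), B = log10(b) by two-table iteration."""
    C = max(A, B)                        # start from the larger logarithm
    x = abs(A - B)                       # first F_a argument
    for _ in range(max_iter):
        i = math.ceil(round(x * 10, 6))  # round x up to the next multiple of 0.1
        if i > 49:                       # beyond the table: F_a is negligible
            break
        C += FA_TABLE[i]                 # accumulate the F_a value into the result
        j = min(round((i / 10 - x) * 1000), 99)  # rounding error in units of 0.001
        if j == 0:                       # x was already a tabular point
            break
        x += FA_TABLE[i] + G_TABLE[j]    # new argument x' = x + F_a + G
    return C

print(logadd10_iter(0.4771, 0.3010))
```

With the inputs 0.4771 and 0.3010 the three lookups are F_a(0.2), F_a(1.7) and F_a(2.8), so the sketch follows the always-round-up variant of the example rather than the variant that rounds the last argument to 2.7.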

The two-table iterative process involves accepting a three-step process in return for avoiding multiplications and for a 100-fold reduction in lookup-table size. In a hardware implementation, the total number of logic delays required for three iterations may actually be less than the number of logic delays through the repetitive add/shift structure of a multiplier. In any case, the reduction described above is beneficial when silicon area and/or precision are of primary importance.

The value of log₁₀(3-2) may be computed similarly. The starting approximation is the logarithm of the larger number, i.e., 0.4771. The F_s-table for subtraction stores the following values in steps of 0.1:

F_s(x) = log₁₀(1-10^-x) (for base 10)

F_s(x) = log_q(1-q^-x) (for a general base q)

The G-table remains the same. The difference between log₁₀(3) and log₁₀(2), 0.1761, is rounded up to the nearest tenth, 0.2. Looking up 0.2 in the subtraction function table gives F_s(0.2) = -0.4329. Adding -0.4329 to the starting approximation 0.4771 gives a first approximation to log₁₀(1) of 0.0442.

As with addition, the error in rounding 0.1761 up to 0.2 is 0.0239. Addressing the previously defined G-table with 0.024 returns 1.2695. Adding 1.2695 to the previous F_s argument of x = 0.1761 and to the previous F_s-table lookup value of -0.4329 generates a new F_s-table argument of x' = 1.0127. Rounding x' up to the nearest tenth, 1.1, and using the F_s-table again yields F_s(1.1) = -0.0359. Adding -0.0359 to the first approximation (0.0442) gives a second approximation to log₁₀(1) of 0.0083. The error in rounding 1.0127 up to 1.1 was 0.0873. Using the value 0.087 to address the G-table gives G(0.087) = 0.7410. When added to the previous unrounded F_s-table argument of 1.0127 and to the F_s-table lookup value of -0.0359, this generates a new F_s-table argument of x" = 1.7178. Rounding x" up to 1.8 results in F_s(1.8) = -0.0069, which is added to the second approximation of 0.0083 to obtain a third approximation to log₁₀(1) of 0.0014. The error in rounding 1.7178 up to 1.8 was 0.0822. Addressing the G-table with 0.082 returns the value 0.7643. Adding this value to the previous F_s-table argument of 1.7178 and to the previous F_s-table lookup value of -0.0069 generates a new F_s-table argument of x"' = 2.4752. Rounding 2.4752 up to 2.5 gives the function value F_s(2.5) = -0.0014. Adding -0.0014 to the third approximation (0.0014) gives log₁₀(1) = log₁₀(3-2) = 0, as expected. The algorithm converges because the argument of F_s increases at each iteration, resulting in smaller and smaller corrections.
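The subtraction walk-through follows the same loop with the F_s-table in place of the F_a-table. The Python sketch below is a minimal illustration under the same assumptions as the decimal example (four-decimal tables, argument always rounded up to the next tenth); `FS_TABLE`, `G_TABLE` and `logsub10_iter` are hypothetical names.

```python
import math

# F_s-table for x = 0.1 .. 4.9 in steps of 0.1 (F_s(0) is singular and is not stored).
FS_TABLE = {i: round(math.log10(1 - 10 ** (-i / 10)), 4) for i in range(1, 50)}
# The same G-table as for addition: y = 0.001 .. 0.099 in steps of 0.001.
G_TABLE = {j: round(-math.log10(1 - 10 ** (-j / 1000)), 4) for j in range(1, 100)}

def logsub10_iter(A: float, B: float, max_iter: int = 4) -> float:
    """Approximate log10(|a - b|) from A = log10(a), B = log10(b) by two-table iteration."""
    C = max(A, B)
    x = abs(A - B)
    for _ in range(max_iter):
        i = math.ceil(round(x * 10, 6))  # round x up to the next multiple of 0.1
        if i < 1:
            raise ValueError("equal operands: the difference has no logarithm")
        if i > 49:                       # beyond the table: F_s is negligible
            break
        C += FS_TABLE[i]                 # F_s is negative, so this lowers C
        j = min(round((i / 10 - x) * 1000), 99)
        if j == 0:                       # x was already a tabular point
            break
        x += FS_TABLE[i] + G_TABLE[j]    # new argument x' = x + F_s + G
    return C

print(logsub10_iter(0.4771, 0.3010))
```

Four iterations are used by default because, as in the text, the negative F_s values slow the growth of the argument compared with addition.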

The subtraction process described above was the same as for addition, apart from the use of the subtraction version of the F-table. Both addition and subtraction, however, use the same G-table. Subtraction also required one more iteration than addition to provide the desired precision. This is because the argument of F_s increases slightly less rapidly at each iteration, particularly the first, since the increment contributed by the F_s value is negative in the case of subtraction.

High-Accuracy Logadd

In general, the logadd problem to be solved for the more general base-q logarithm may be stated in the following steps:

Suppose A = log_q(a) and B = log_q(b), where a and b are positive numbers and q is the base.

Goal: find C = log_q(c), where c = a + b.

Thus, C = log_q(a + b) = log_q(q^A + q^B).

Let A be the larger of A and B.

Then C = log_q(q^A(1 + q^-(A-B)))

= A + log_q(1 + q^-(A-B))

= A + log_q(1 + q^-r), where r = A - B and is positive.

Thus, the problem has been reduced to computing the function log_q(1 + q^-r) of the single variable r.

If r has a limited word length, the function value may be obtained from a function lookup table. For example, for a 16-bit r-value, the function lookup table must store 65,536 words. Moreover, for base q = e = 2.718, if r > 9 the value of the function differs from zero by less than 2^-13, which suggests that only a 4-bit integer part need be considered, together with a 12-bit fractional part. Then, for r > 9, the function value is zero to 12 binary places after the point, so the lookup table is needed only for values of r up to 9, giving 9×4,096 = 36,864 words of memory.

Because the maximum value of the function is log_e(2) = 0.69, at r = 0, only a 12-bit fractional part needs to be stored, so the memory requirement is only 36,864 12-bit words rather than 65,536 16-bit words. For base 2, the function is zero to 12 binary places for r > 13, so again only a 4-bit integer part of r need be considered. If one bit is used for the sign, the logmagnitude part is only 15 bits long, for example in 4.11 format or 5.10 format, and the above figures may be adjusted accordingly.

However, to obtain much higher precision than 16 bits, for example using 32-bit word lengths, a direct lookup table for the function is excessively large. For example, to provide precision and dynamic range comparable to the IEEE 32-bit floating-point standard, A and B should each have a 7-bit integer part, a 24-bit fractional part and a sign bit in the base-e case. The value of r must now be greater than 25·log_e(2) = 17.32 before the function is zero to 24-bit precision, a value that can be represented by a 5-bit positive integer part of r. Thus, a potential 29-bit r-value of format 5.24 must be considered as the argument of the function F_a. A lookup-table size of 18×2^24, or 302 million 24-bit words, is required for direct lookup of r for values between 0 and 18. Research on logarithmic arithmetic is largely concerned with reducing this table size, with the ultimate aim of making word lengths of 64 bits practical. Several techniques described herein are directed toward that goal.
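The table sizes quoted so far follow from simple arithmetic, reproduced in the Python sketch below. The variable names are illustrative assumptions.

```python
import math

# Linear-interpolation table: 2**16 rows of 40 bits each.
interp_table_bits = 2 ** 16 * 40

# 16-bit direct lookup, base e: r from 0 to 9 with a 12-bit fraction.
direct16_words = 9 * 2 ** 12

# Value of r beyond which log_e(1 + e**-r) is treated as zero to 24 binary places.
r_limit = 25 * math.log(2)

# 32-bit direct lookup, base e: r from 0 to 18 with a 24-bit fraction.
direct32_words = 18 * 2 ** 24

print(interp_table_bits, direct16_words, r_limit, direct32_words)
```

The last figure, roughly 302 million words, is the direct-lookup cost that the two-table methods described next are meant to avoid.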

To reduce the size of the lookup table from the single large table that direct lookup of the logadd function F_a would require when all bits of r are used as the address, one implementation of the invention comprises splitting r into most significant (MS) and least significant (LS) parts, r_M and r_L, respectively. These MS and LS parts address two much smaller tables, F and G, respectively, as described below. The MS part represents a "rounded" version of the input value, while the LS part represents the difference between the rounded version and the original full argument value.

Let r_M be the most significant 14 bits of r < 32 and r_L the least significant 15 bits of r, as shown in Equation 14.

r_M = xxxxx.xxxxxxxxx

r_L = 00000.000000000xxxxxxxxxxxxxxx

For simplicity, the lengths of r_M and r_L may be denoted (5.9) and (15), respectively. Other splits of r into most and least significant bit parts are equally usable with obvious modifications to the method, and some considerations for preferring a particular split, discussed further below, relate to the ability to reuse the same F and G tables for other word lengths (e.g., 16 bits) or for complex operations.

Let r_M⁺ denote the value of r_M augmented by the greatest possible value of r_L, i.e., 00000.000000000111111111111111. It will be recognized that this is just the original r-value with its least significant 15 bits set to 1. In some implementations, r_M may alternatively be augmented by 0.000000001, i.e.:

r_M⁺ = xxxxx.xxxxxxxxx + 00000.000000001

Let the complement of r_L be:

r_L⁻ = r_M⁺ - r

which is either the complement or the two's complement of r_L, depending on which of the above two alternative augmentations of r_M is used, i.e., r_L⁻ = 00000.000000000111111111111111 - 00000.000000000xxxxxxxxxxxxxxx (the complement of r_L) or r_L⁻ = 00000.000000001000000000000000 - 00000.000000000xxxxxxxxxxxxxxx (the two's complement of r_L). Then, for base e:

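log_e(1 + e^-r) = log_e(1 + e^-(r_M⁺)) + log_e(1 + e^-r'),

where r' = r + log_e(1 + e^-(r_M⁺)) - log_e(1 - e^-(r_L⁻)). Expanding log_e(1 + e^-r') likewise results in:

log_e(1 + e^-r') = log_e(1 + e^-(r'_M⁺)) + log_e(1 + e^-r"),

where r" = r' + log_e(1 + e^-(r'_M⁺)) - log_e(1 - e^-(r'_L⁻)). Iterating shows that the desired answer comprises the sum of the following functions:

log_e(1 + e^-(r_M⁺)), log_e(1 + e^-(r'_M⁺)), log_e(1 + e^-(r"_M⁺)), ...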

These functions depend only on the most significant 14 bits of their respective r-arguments and may therefore be obtained from a lookup table of only 16,384 words.
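The split of r and a single step of the decomposition can be checked numerically. The Python sketch below is a minimal illustration, assuming r is held as an integer in 5.24 fixed point and that one step takes the form log_e(1 + e^-r) = F_a(r_M⁺) + log_e(1 + e^-r') with r' = r + F_a(r_M⁺) + G(r_L⁻), where F_a(x) = log_e(1 + e^-x) and G(y) = -log_e(1 - e^-y). The names `split` and `one_step` are hypothetical.

```python
import math

FRAC_BITS = 24            # r is held in 5.24 fixed point (29 bits)
LS_BITS = 15              # r_L is the least significant 15 bits
ONE = 1 << FRAC_BITS

def split(r: int, twos_complement: bool = True):
    """Return (r_M, r_L, r_M+, r_L-) as integers in units of 2**-24."""
    r_l = r & ((1 << LS_BITS) - 1)
    r_m = r - r_l                                  # MS 14 bits, LS bits cleared
    # Augment r_M by 2**-9 (two's complement case) or by the largest r_L.
    aug = (1 << LS_BITS) if twos_complement else (1 << LS_BITS) - 1
    r_m_plus = r_m + aug
    r_l_minus = r_m_plus - r                       # equals aug - r_L
    return r_m, r_l, r_m_plus, r_l_minus

def one_step(r: int):
    """Return (F_a(r_M+), r') as floats for one decomposition step, base e."""
    _, _, r_m_plus, r_l_minus = split(r)
    f = math.log1p(math.exp(-r_m_plus / ONE))      # F_a(r_M+)
    g = -math.log(-math.expm1(-r_l_minus / ONE))   # G(r_L-), always positive
    return f, r / ONE + f + g
```

The two's complement augmentation is used in `one_step` because it keeps r_L⁻ strictly positive, so G stays finite for every r_L.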

In the context of Equations 17 to 19, the prime(s) used to qualify the r-values do not denote a derivative. Instead, the succession of r-values r, r', r", etc., is derived by accumulating to the preceding value the value just obtained from the logadd function lookup table (F_a) and adding a value that depends on the least significant 15 bits of r, namely the value -log_e(1 - e^-(r_L⁻)). The latter value is provided by a correction lookup table, the G-table, which has 32,768 words because r_L⁻ is a 15-bit value.

Although the stored values are computed from r_M⁺ and r_L⁻, the function and correction lookup tables may be addressed directly by r_M and r_L, respectively. Calling these lookup-table functions F_a and G respectively, and noting that the quantity log_e(1 - e^-(r_L⁻)) is always highly negative, a positive correction value may be stored in the G-table. This positive correction value is added to the previous r-argument, instead of storing a negative value and subtracting it. In addition, the minimum correction value of the G-table, or at least its integer part, may be subtracted from the stored values to reduce the number of bits stored, and added back whenever a value is pulled from the table. For base 2, a value of 8 is appropriate for the minimum correction value, and in some implementations it does not need to be added back. The iteration is then as follows:

1. Initialize the output accumulator value C to the larger of A and B.

2. Initialize r to A-B if A is larger, or to B-A if B is larger.

3. Split r into r_M and r_L.

4. Look up F_a(r_M⁺) and G(r_L⁻), as addressed by r_M and r_L, respectively.

5. Accumulate F_a into C and F_a + G into r.

6. If r < STOP_THRESHOLD, repeat from step 3 (STOP_THRESHOLD is discussed below).
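The six steps above can be sketched in Python with 5.24 fixed-point integers. This is a minimal illustration under stated assumptions, not a definitive implementation: `f_a_lookup` and `g_lookup` are hypothetical stand-ins that compute table entries on demand instead of reading stored tables, the two's complement form of r_L⁻ is used, the stop value of 18 is assumed for base e, and the threshold is tested before the first lookup so that an out-of-range r never addresses the F-table.

```python
import math

FRAC_BITS = 24
LS_BITS = 15
ONE = 1 << FRAC_BITS
STOP_THRESHOLD = 18 * ONE   # assumed stop value for base e

def f_a_lookup(r_m: int) -> int:
    """Stand-in for the 16,384-word F-table: addressed by r_M, holds F_a(r_M+)."""
    r_m_plus = (r_m + 1) / (1 << (FRAC_BITS - LS_BITS))     # r_M + 2**-9
    return round(math.log1p(math.exp(-r_m_plus)) * ONE)

def g_lookup(r_l: int) -> int:
    """Stand-in for the 32,768-word G-table: addressed by r_L, holds G(r_L-) > 0."""
    r_l_minus = ((1 << LS_BITS) - r_l) / ONE                # two's complement of r_L
    return round(-math.log(-math.expm1(-r_l_minus)) * ONE)

def logadd(A: int, B: int) -> int:
    """Return C ~ ln(a + b) in 5.24 fixed point, given A ~ ln(a) and B ~ ln(b)."""
    C = max(A, B)                                           # step 1
    r = abs(A - B)                                          # step 2
    while r < STOP_THRESHOLD:                               # step 6 (tested up front)
        r_m = r >> LS_BITS                                  # step 3
        r_l = r & ((1 << LS_BITS) - 1)
        f, g = f_a_lookup(r_m), g_lookup(r_l)               # step 4
        C += f                                              # step 5
        r += f + g
    return C

A = round(math.log(3) * ONE)
B = round(math.log(2) * ONE)
print(logadd(A, B) / ONE)   # approximates ln(5)
```

Because every G value exceeds about 6.24, r passes 18 after at most three passes through the loop.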

Those skilled in the art will appreciate that a few logic gates may be used to detect an r-value greater than 18 using the logic b6.OR.(b5.AND.(b4.OR.b3.OR.b2)) (the 32 bit set, or the 16 bit set together with one of the 8, 4 or 2 bits set), where the bit index indicates the bit position to the left of the point. The value of the function G(r_L⁻) = -log_e(1 - e^-(r_L⁻)) is always greater than approximately 6.24, so the iteration always terminates in 3 cycles or fewer. Correction values are proportionally larger for base 2, so that r always exceeds 25 in at most 3 cycles for base 2. In general, 3 cycles typically suffice for any base.
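The gate expression can be checked exhaustively over the integer part of r. The Python sketch below assumes that bit index k counts positions to the left of the point starting at 1, so that b2, b3, b4, b5 and b6 carry the weights 2, 4, 8, 16 and 32; the function name `r_exceeds_threshold` is hypothetical.

```python
def r_exceeds_threshold(int_part: int) -> bool:
    """Evaluate b6 OR (b5 AND (b4 OR b3 OR b2)) on the integer part of r."""
    def b(k: int) -> int:
        # Bit k to the left of the point has weight 2**(k - 1).
        return (int_part >> (k - 1)) & 1
    return bool(b(6) | (b(5) & (b(4) | b(3) | b(2))))

# Integer parts for which the expression is true, out of 0..63.
print([n for n in range(64) if r_exceeds_threshold(n)][:3])
```

Under this bit numbering the expression is true exactly when the integer part of r is 18 or more.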

High-Accuracy Two-Table Logsub

If the signs S associated with A and B indicate that a and b have the same sign, the logarithmic addition algorithm described above, referred to as "logadd", may be used. Otherwise, a logarithmic subtraction algorithm, referred to as "logsub", is required. The following table indicates when each algorithm is used:

sign(a)   sign(b)   To add             To subtract b from a
+         +         use logadd(A,B)    use logsub(A,B)
+         -         use logsub(A,B)    use logadd(A,B)
-         +         use logsub(B,A)    use logadd(A,B)
-         -         use logadd(A,B)    use logsub(A,B)
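The selection table can be held as a small dispatch dictionary, as in the Python sketch below. The tuple layout and the names `DISPATCH` and `select_algorithm` are illustrative assumptions; only the table contents come from the text.

```python
# (sign of a, sign of b, requested operation) -> (algorithm, first operand, second operand)
DISPATCH = {
    ("+", "+", "add"): ("logadd", "A", "B"),
    ("+", "-", "add"): ("logsub", "A", "B"),
    ("-", "+", "add"): ("logsub", "B", "A"),
    ("-", "-", "add"): ("logadd", "A", "B"),
    ("+", "+", "sub"): ("logsub", "A", "B"),
    ("+", "-", "sub"): ("logadd", "A", "B"),
    ("-", "+", "sub"): ("logadd", "A", "B"),
    ("-", "-", "sub"): ("logsub", "A", "B"),
}

def select_algorithm(sign_a: str, sign_b: str, op: str):
    """Return which log-domain algorithm to run and in which operand order."""
    return DISPATCH[(sign_a, sign_b, op)]

print(select_algorithm("-", "+", "add"))
```

The table reduces to one rule: logadd is chosen when the two signs agree after the sign of b has been inverted for a subtraction, and logsub otherwise.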

결과의 부호는 항상 로그애드 알고리즘이 사용될 때보다 큰 로그크기와 관련된 부호이다.The sign of the result is always the sign associated with a larger log size than when the logad algorithm is used.

제2 인수와 관련된 부호가 먼저 반전되면, 로그서브 알고리즘에 대해서도 동일하게 참을 유지한다. 감산이 요구되면, 제2 인수를 로그 유닛의 입력에 적용할 때 제2 인수의 부호의 반전이 실행될 수 있다. "logsub" 알고리즘은 다음과 같이 유도된다: A = log(|a|) 및 B = log(|b|)라고 하자. C = log(c)를 찾도록 요구되는데, 여기서, c = |a| - |b| 이다. A는 A 및 B 중 더 큰 것으로 하자. 편의상 절대값을 (||)으로 표시하며, a 및 b 둘 다 양수라고 가정되면, 다음과 같다:If the sign associated with the second argument is inverted first, the same holds true for the logsub algorithm. If subtraction is required, inversion of the sign of the second argument can be performed when applying the second argument to the input of the log unit. The "logsub" algorithm is derived as follows: Let A = log (| a |) and B = log (| b |). C = log (c) is required to be found, where c = | a | -| b | to be. Let A be the larger of A and B. For convenience, the absolute value is denoted by (||), assuming both a and b are positive:

C = log_e(a-b) = log_e(e^A - e^B)C = log _e (ab) = log _e (e ^A -e ^B )

As with logadd, base e is used in this example for purposes of illustration only and is not limiting.

Because A is assumed to be larger than B:

$$\begin{aligned}
C &= \log_e\left(e^{A}\left(1 - e^{-(A-B)}\right)\right) \\
  &= A + \log_e\left(1 - e^{-(A-B)}\right) \\
  &= A + \log_e\left(1 - e^{-r}\right)
\end{aligned}$$

where r = A - B and is positive. Thus, the problem reduces to computing the function log(1 - e^-r) of the single variable r. Let r_M, r_L, r_M⁺ and r_L⁻ be as previously defined. Then, for base e:

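$$\log_e\left(1 - e^{-r}\right) = \log_e\left(1 - e^{-r_M^{+}}\right) + \log_e\left(1 - e^{-r'}\right)$$

where

$$r' = r + \log_e\left(1 - e^{-r_M^{+}}\right) - \log_e\left(1 - e^{-r_L^{-}}\right).$$

Expanding log(1 - e^-r') in the same way gives:

$$\log_e\left(1 - e^{-r'}\right) = \log_e\left(1 - e^{-r_M^{\prime +}}\right) + \log_e\left(1 - e^{-r''}\right)$$

where

$$r'' = r' + \log_e\left(1 - e^{-r_M^{\prime +}}\right) - \log_e\left(1 - e^{-r_L^{\prime -}}\right).$$

Iterating shows that the desired answer comprises the sum of the following functions:

$$\log_e\left(1 - e^{-r_M^{+}}\right),\quad \log_e\left(1 - e^{-r_M^{\prime +}}\right),\quad \log_e\left(1 - e^{-r_M^{\prime\prime +}}\right),\ \ldots$$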

These functions depend only on the most significant 14 bits of their respective full-wordlength r-values, so they can be provided by a look-up table of only 16,384 words.

As with logadd, the look-up tables for logsub can be constructed to be addressed directly by r_M and r_L, even though the stored values are computed from r_M⁺ and r_L⁻. Also as with logadd, the prime mark(s) used to qualify the r-values do not denote a derivative.

Calling these look-up tables F_s and G respectively (G is the same look-up table as for the logadd algorithm), and storing the positive value of G as before, produces the F_s and G tables required for logsub operations. Because 1 - e^-r is always less than 1, F_s is always negative, so a positive magnitude can be stored and subtracted rather than added. Another method stores the negative value stripped of its negative sign bit; the sign bit is then restored outside the look-up table by appending a most significant '1' when subtraction is in progress. The preferred choice is the one that leads to simplicity of logic and to maximum synergy of look-up table values between addition and subtraction, as discussed below. In any case, the following steps outline the "logsub" process:

3. Split r into r_M and r_L.

4. Look up F_s(r_M⁺) and G(r_L⁻), as addressed by r_M and r_L, respectively.

5. Accumulate F_s into C and F_s + G into r.
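The listed steps can be sketched in Python as follows. This is a minimal floating-point illustration, not the described hardware: the word format (24 fractional bits with a 15-bit r_L), the initialization of C and r, and the loop test against STOP_THRESHOLD are assumptions made here, and the functions `f_s` and `g` compute on the fly what the F_s and G tables would store.

```python
import math

FRAC_BITS = 24                 # fractional bits of r (assumed word format)
L_BITS = 15                    # low-order bits forming r_L (assumed split)
L_MASK = (1 << L_BITS) - 1     # integer code of d, the all-ones value of r_L
SCALE = 1 << FRAC_BITS
STOP_THRESHOLD = 17.32         # base-e value quoted in the text


def f_s(r_m):
    """Stand-in for the F_s table: log(1 - e^-(r_M + d)), always negative."""
    return math.log1p(-math.exp(-(r_m + L_MASK) / SCALE))


def g(r_l):
    """Stand-in for the G table: -log(1 - e^-(d - r_L)), a positive value."""
    x = (L_MASK - r_l) / SCALE
    return math.inf if x == 0.0 else -math.log1p(-math.exp(-x))


def logsub(a_log, b_log):
    """Return log(|e^A - e^B|) using the two-table iteration."""
    c = max(a_log, b_log)              # assumed: C starts at the larger input
    r = abs(a_log - b_log)             # assumed: r starts at |A - B|
    while r < STOP_THRESHOLD:          # assumed termination test
        r_fixed = int(r * SCALE)       # quantize r to the fixed-point grid
        if r_fixed == 0:
            return -math.inf           # operands equal at this resolution
        r_m = r_fixed & ~L_MASK        # step 3: split r into r_M ...
        r_l = r_fixed & L_MASK         # ... and r_L
        fs, gg = f_s(r_m), g(r_l)      # step 4: look up F_s and G
        c += fs                        # step 5: accumulate F_s into C
        r = r_fixed / SCALE + fs + gg  #         and F_s + G into r
    return c
```

With A = 3.0 and B = 2.5 the loop runs three times and the result agrees with log(e^3 - e^2.5) to well within 1e-6, which matches the statement that three cycles are usually sufficient.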

For both the LOGADD and LOGSUB algorithms, STOP_THRESHOLD is chosen so that any contribution from a further iteration is less than half an LSB. This occurs at 17.32 for base e (18 can be used) with 24 binary places after the point, or at 24 for base 2 with 23 binary places after the point. In principle, a base smaller than 2 could be found that gives a STOP_THRESHOLD of 31, which would use an F-function defined over the whole address space addressable by the selected MSBs of r. Likewise, a base greater than e could be found that gives a STOP_THRESHOLD of 15, with the same property. However, the practical advantages of base 2 appear to outweigh any advantage of using the whole address space for the F-tables. In general, for base 2, STOP_THRESHOLD is simply 1 or 2 greater than the number of binary places after the point in the log representation.
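As a rough check of the base-e figure, assume that the contribution of one further iteration is approximately e^-r (a small-argument approximation introduced here, not stated in the text). The half-LSB condition for 24 binary places then reads:

$$e^{-r} \le \tfrac{1}{2}\cdot 2^{-24} = 2^{-25} \quad\Longleftrightarrow\quad r \ge 25\ln 2 \approx 17.33$$

which agrees with the quoted value of 17.32 to within rounding.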

As suggested by the decimal examples given above, the accuracy after a finite number of iterations improves if the final argument used to address the F-table, for example r_M"'⁺, is rounded down rather than up from r_M"'. If the two-table iterative process always performs a fixed number of iterations, or if the process otherwise identifies the final iteration, the argument of F can be rounded down on the final iteration. The final iteration may be identified, for example, by r being within a certain range of STOP_THRESHOLD (~6 for base e, or ~8 for base 2), which indicates that the next iteration is bound to exceed STOP_THRESHOLD. When this method is used, the address to the F-table can be reduced by 1 if the leftmost bit of r_L is zero on the final iteration. In the pipelined implementation described, the final F-table contents are simply computed for a rounded-down argument.

The only difference between the LOGSUB and LOGADD algorithms is the use of the look-up table F_s rather than F_a. Because both are 16,384 words in size, they can be combined into a single function F-table with an extra address bit that selects the + or - version, denoted F(r_M, opcode), where the extra argument "opcode" is the extra address bit with value 0 or 1 indicating whether the LOGADD or the LOGSUB algorithm is to be applied. Alternatively, because the peripheral logic (i.e., the input and output accumulators and adders/subtractors) is small compared to the look-up tables, it costs little to duplicate the peripheral logic to form an independent adder and subtractor. The similarity between the functions F_a and -F_s is also discussed below.

Exponential Approximation

As discussed above, r_M⁺ may comprise r_M augmented by the largest possible value of r_L (0.00000000011111111111111), or r_M augmented by 0.000000001. The advantage of choosing 0.0000000001111111....1 rather than 0.000000001 as the augmentation of r_M is that the G table can be addressed by the complement of r_L during the iterative algorithm, or by r_L (not complemented) to obtain the value of F directly when r_M = 0, so that a single iteration suffices for the otherwise difficult case of subtracting two nearly equal values. Making both the complemented and the non-complemented values available is simpler and faster than forming the two's complement, because no carries need to propagate.
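The reason no carry propagation is needed can be illustrated with a short check: when the augmentation d is the all-ones value of r_L, the difference d - r_L equals the bitwise complement of r_L. The 15-bit width below is an assumption chosen only for the illustration.

```python
L_BITS = 15                  # assumed width of r_L for this illustration
D = (1 << L_BITS) - 1        # d as an integer code: all ones


def complement(r_l):
    """Bitwise (one's) complement of r_L within L_BITS bits."""
    return ~r_l & D


def r_l_minus(r_l):
    """r_L^- = d - r_L, computed arithmetically."""
    return D - r_l
```

Because every bit of d is 1, the subtraction never borrows, so each output bit is simply the inverse of the corresponding input bit.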

For logadd, the values of the F_a-table can be defined as follows:

where d is an increment that is preferably the largest possible value of X_L, i.e., all 1's. The function can be constructed as a look-up table addressed by X_M. For subtraction, the values of F_s can be defined as follows:

For large values of X_M, F_a(X_M) = F_s(X_M); for 32-bit arithmetic and an argument range between 16 and 24, both can be adequately approximated by:

where X_M1 is the whole part of X_M (the bits to the left of the point) and X_M2 is the fractional part (the bits to the right of the point). The function in brackets can be stored in a small exponential look-up table. A right shifter can implement the whole part, so that only the fractional bits need to address the exponential function, which reduces the table size.

FIG. 4 shows the similarities between the exponential approximation (E) and the true function values (F_a, F_s). For arguments in the range 16 to 24, E is substantially equal to F_a and F_s. FIG. 4 also shows how a further approximation

adequately approximates the difference between the exponential approximation and the true function values, dF_a = E - F_a and dF_s = F_s - E. Therefore, for X_M in the range 8 to 16, the exponential approximation E can be used when it is corrected by the small correction value E₂, which is no more than 8 bits long, as can be seen from FIG. 4. When 24 places after the binary point are required, the result is 17 bits long.

Because the area under the E curve roughly approximates the silicon area needed to implement the exponential approximation, FIG. 4 also illustrates the approximate silicon area needed to implement the function tables for logadd and logsub operations. Using a base-2 logarithmic vertical scale means that the height represents the wordlength of a binary value. The horizontal scale represents the number of such values. Thus, the area under a curve represents the number of bits of ROM needed to store the curve values. The exponential function E is cyclic, however: its values repeat for every increment of 1, apart from a right shift. Therefore, only one cycle, addressed by the fractional part X_M2, needs to be stored, and the result is shifted by the number of places given by X_M1. The exponential function E therefore requires very small tables. Further, because the correction values dF or E₂ clearly have a smaller area under their curves than the original F_a and F_s functions, using the exponential approximation E and storing the correction values dF and E₂ requires smaller tables, and hence less silicon area, than storing F_a and F_s.
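The cyclic property can be sketched as a one-cycle table followed by a right shift. The sketch below tabulates a plain 2^-x; any constant scale factor in the actual function E, the 9-bit table address, and the 24-bit output scaling are assumptions made here for illustration.

```python
ADDR_BITS = 9    # fractional bits of X_M addressing the table (assumed)
OUT_BITS = 24    # binary places of the stored values (assumed)

# One cycle only: 2^-f for fractional f, scaled by 2^OUT_BITS and rounded.
EXP_TABLE = [round(2.0 ** (OUT_BITS - k / (1 << ADDR_BITS)))
             for k in range(1 << ADDR_BITS)]


def exp_approx(x_m):
    """Return 2^-x scaled by 2^OUT_BITS, for x_m = x in fixed point
    with ADDR_BITS fractional bits."""
    whole = x_m >> ADDR_BITS                   # X_M1: drives the right shifter
    frac = x_m & ((1 << ADDR_BITS) - 1)        # X_M2: addresses the small table
    return EXP_TABLE[frac] >> whole
```

The table holds 512 entries regardless of the argument range, which is the saving the paragraph above describes; the shift truncates, so the result is within one unit of the last place.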

Equation 29 gives the G-function for the least significant bits:

where (d - X_L) is equal to the complement of X_L when d is all 1's. The minimum value of G(X_L) depends on how the 31-bit logmagnitude is split between X_M and X_L. If X_M is of the form 5.8, then X_L is of the form 0.00000000xxxxxxxxxxxxxxx and is less than 2^-8. The minimum value of G is then 8.5, when X_L = 0. If X_M is of the form (5.7), the minimum value of G is 7.5, and if X_M is of the form (5.9), the minimum value of G is 9.5. Because the value of X is increased by at least the value of G in each cycle, X exceeds 24 within 3 cycles as long as the G values are greater than 8 on average. In the following, 32-bit arithmetic is assumed for purposes of illustration. When the minimum value of G is 8.5, a base value of 8 can be subtracted from the stored values.
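The quoted minimum values can be checked numerically. Equation 29 itself is not reproduced in this text, so the sketch below assumes the base-2 form G(X_L) = -log2(1 - 2^-(d - X_L)), by analogy with the base-e G-function used for logsub; the function name `g_min` and the 23 fractional bits are likewise assumptions.

```python
import math


def g_min(split_bits, total_frac_bits=23):
    """Minimum of the assumed G(X_L) = -log2(1 - 2^-(d - X_L)), reached at X_L = 0.

    split_bits is the number of fractional bits kept in X_M, so d (the
    all-ones value of X_L) is just below 2^-split_bits.
    """
    d = 2.0 ** -split_bits - 2.0 ** -total_frac_bits
    return -math.log2(1.0 - 2.0 ** -d)
```

Under this assumption `g_min(8)` evaluates to about 8.53, and `g_min(7)` and `g_min(9)` to about 7.53 and 9.53, in line with the values 8.5, 7.5 and 9.5 stated above.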

Logarithmic Arithmetic for Complex Numbers

The various processes described above generally apply to logarithmic arithmetic on real numbers. However, wireless communication signals may use both real and complex representations. For example, typical applications for real and complex signal processing include radio signal processing. In a radio system, signals received at an antenna contain radio noise and can be represented by a sequence of complex samples. To maximize range, it is usually desirable to recover information using the weakest possible signals relative to the noise. The complex representation of the samples obtained at the antenna therefore does not require high-accuracy digitization, because there is no need for a quantization accuracy better than the expected noise levels. After the complex noisy signal has been processed to recover information and correct errors, the noise is removed; the resulting information may now require a higher-accuracy representation. For example, speech may be represented by a sequence of real samples, but because processing the raw antenna signal raises the fidelity of the signal to the signal-to-noise ratio of speech, a digital representation of higher accuracy may be required.

A signal processor that provides both high-accuracy arithmetic on real numbers and lower-accuracy arithmetic on complex numbers is of interest in wireless applications such as cell phones and cell phone systems. Such a processor may include memory for program storage, data memory for storing the real and complex data to be processed, a real and complex ALU, and input/output devices that may include AD and DA converters. The data memory stores words of the same length as the wordlength of the ALU; it is logical to use the same wordlength for real and complex numbers so that they can be stored in the same memory. However, the present invention does not require this.

Typically, 16-bit words are sufficient for speech processing. It is therefore of interest to determine whether a 16-bit complex representation provides adequate dynamic range for representing the noisy signals received by an antenna. This was demonstrated by the first digital cell phones manufactured and sold during 1988 to 1997 by L.M. Ericsson in Europe and its U.S. affiliate Ericsson-GE, which used a 15-16 bit logpolar representation comprising an 8-bit logamplitude and a 7-bit phase. These products embodied, in combination, direct digitization of the radio signal into complex logpolar form according to U.S. Pat. Nos. 5,048,059; 5,148,373 and 5,070,303, which are incorporated herein by reference.

As with real numbers, any base can be used for the logarithm of the amplitude. If base e is used, the logamplitude expresses the instantaneous signal level in Nepers. As is known in the art, 1 Neper is equivalent to approximately 8.686 decibels (dB), so an 8-bit logamplitude in the format xxxx.xxxx represents a signal level varying over a range of 0 to 15 and 15/16 Nepers, or ~139 dB.

The quantization error is +/- half a least significant bit, or +/-1/32 of a Neper, or 0.27 dB, which is a percentage error of approximately 3.2%. In theory, this error is uniformly distributed between +/-3.2% and has an RMS value of 1/3 of the peak, approximately 1%. The quantization noise is therefore 1/100 of the signal level, i.e., 40 dB below the signal level, and can be smaller if oversampling is used, that is, sampling at more than the Nyquist rate of 1 sample per second per Hz of signal bandwidth.

An advantage of the logpolar representation is that this quantization accuracy remains constant over the whole range of signal levels. Quantization noise of -40 dB with a total dynamic range of 139 dB is considered more than adequate for most radio signal applications.

FIG. 5A shows how the complex plane is segmented into elemental regions using the logpolar representation, in contrast to the Cartesian representation of FIG. 5B. The white "hole" in the center of the logpolar chart is the region where the signal level is less than 0000.0000 Nepers, and the outer circle is the highest signal level of 1111.1111 Nepers. If the lower limit of 0000.0000 is chosen to be 10 dB below the radio noise level, noise excursions are adequately represented and the statistics of the noise are not unduly altered by the number representation. The outer circle then represents a signal level 129 dB above the noise, which is not exceeded even by the strongest signals.

The finite number of bits used to represent the phase angle causes quantization error and noise. The noise contribution from phase quantization has an RMS value of 1/12 of the least significant phase bit value in radians. If 8 bits are used to represent phase, the least significant phase bit has a value of 2π/256 radians, and the quantization noise is 2π/(12*256) = 0.002, or -53.8 dB relative to the signal level. This is less than the -40 dB logamplitude quantization noise.

A bit allocation with one more bit of amplitude and one less bit of phase brings the logamplitude quantization noise to approximately -46 dB and the phase quantization noise to -47.8 dB. Therefore, when a 16-bit wordlength is used, a logpolar format of xxxx.xxxxx for the logamplitude and 0.xxxxxxx (modulo 2π) for the phase is suggested.

If base-2 logarithms are used to represent the logamplitude, the quantization noise of the xxxx.xxxxx format is reduced by a factor of log_e(2), or 3.18 dB, to -49 dB. The dynamic range is reduced from 16 Nepers, or 139 dB, to 16×6 dB = 96 dB, which is still adequate.

Logpolar numbers can be stored logamplitude first, i.e., {xxxx.xxxxx; 0.xxxxxxx} = {log(r); θ}, or phase first, i.e., {0.xxxxxxx; xxxx.xxxxx} = {θ; log(r)}. It may be useful to think of the phase as an extension of the 1-bit "phase", or sign, of a real number, so that in the complex case it represents more than just the two angles 0 and 180 degrees; the "phase-first" format provides a logical way to depict this. In complex arithmetic there may be little distinction between addition and subtraction, because combining numbers that differ by 0 degrees (i.e., adding) or by 180 degrees (i.e., subtracting) covers just two points in the whole range of relative phase angles to be considered.

Using the logpolar format, the product of two complex numbers is obtained by fixed-point addition of the logamplitude parts (taking care of underflow or overflow) and fixed-point addition of the phase parts ignoring overflow, because angles are computed modulo 2π. When the quantization levels of the binary phase word are evenly spaced over the range 0 to 2π, rollover on binary addition corresponds exactly to modulo-2π arithmetic, as required for phase computations. Similarly, the quotient of two logpolar complex numbers is obtained by fixed-point subtraction.
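A minimal sketch of the product and quotient follows, using integer codes for the suggested 16-bit format (9-bit logamplitude, 7-bit phase). The function names and the saturation policy for logamplitude overflow and underflow are assumptions made here; the text only says that overflow and underflow need care.

```python
PHASE_BITS = 7
PHASE_MOD = 1 << PHASE_BITS      # 128 phase codes span 0 to 2*pi
MAG_MAX = (1 << 9) - 1           # largest code of the xxxx.xxxxx logamplitude


def lp_mul(a, b):
    """Product of two logpolar numbers given as (logamplitude, phase) codes."""
    mag = min(a[0] + b[0], MAG_MAX)          # saturate on overflow (assumed policy)
    phase = (a[1] + b[1]) % PHASE_MOD        # rollover is modulo-2*pi arithmetic
    return mag, phase


def lp_div(a, b):
    """Quotient of two logpolar numbers given as (logamplitude, phase) codes."""
    mag = max(a[0] - b[0], 0)                # clamp on underflow (assumed policy)
    phase = (a[1] - b[1]) % PHASE_MOD
    return mag, phase
```

For example, `lp_mul((100, 100), (50, 60))` gives `(150, 32)`: the phase sum 160 rolls over to 32 with no explicit range reduction.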

When the same ALU is considered for both 16-bit logreal and 16-bit logpolar arithmetic, the only difference for addition or subtraction is that, in the logpolar case, a carry or borrow from the addition or subtraction of the logamplitude part is not allowed to propagate into the phase part of the adder or subtractor, and vice versa if the logamplitude-first format is used.

To explain how logarithmic arithmetic can be implemented for complex numbers expressed in logpolar format, consider the following. Let Equation 30 express two Cartesian complex numbers z₁ and z₂ in logpolar format to base e, as Z₁ and Z₂:

$$Z_1 = (R_1, \theta_1) = \log_e(z_1)$$

$$Z_2 = (R_2, \theta_2) = \log_e(z_2)$$

To determine Z₃ = log_e(z₃), where z₃ = z₁ + z₂, a procedure very similar to that described above for real numbers can be used. First, note that:

Assuming that Z₁ has a larger logmagnitude (R₁) than Z₂, and applying logic similar to that described above, Z₃ can be expressed as:

where Z = Z₁ - Z₂ has a positive real part R₁ - R₂ because R₁ > R₂, which ensures that e^-Z has a magnitude less than 1. Thus, the problem of computing Z₃ given Z₁ and Z₂ reduces to computing the function log_e(1+e^-Z) of the logpolar complex variable Z = (R + jθ), where R = R₁ - R₂ and θ = θ₁ - θ₂. Although this example uses base e, those skilled in the art will appreciate that any base may be used.
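For reference, one derivation consistent with the definitions in this passage (a sketch that mirrors the real-number case, not a reproduction of the numbered equations) is:

$$z_3 = z_1 + z_2 = e^{Z_1} + e^{Z_2} = e^{Z_1}\left(1 + e^{-(Z_1 - Z_2)}\right)$$

$$Z_3 = \log_e(z_3) = Z_1 + \log_e\left(1 + e^{-Z}\right), \qquad Z = Z_1 - Z_2 = (R_1 - R_2) + j(\theta_1 - \theta_2)$$

Here each logpolar value is read as the complex number R + jθ, so that e^Z = e^R e^{jθ}.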

When R > 6, adding or subtracting the smaller value does not affect the fifth binary place, and the result is simply the larger value. Therefore, only three bits to the left of the binary point need to be considered for R.

The function log_e(1+e^-Z) can be computed by a variety of means. For example, a single-table, single-iteration process can be used. Although this can be applied to both low-accuracy and high-accuracy numbers, the single look-up table required for high-accuracy numbers can be prohibitively large. The look-up table may have an optimal structure. For example, given a 16,384×32-bit ROM for 16-bit logpolar arithmetic, it may be useful to store the values for addresses that differ by π in the θ-component in pairs, or to store only half of them if conjugate symmetry is exploited. Complex logadd and complex logsub of the same pair of input values can then be performed simultaneously in one cycle.

The simultaneous addition and subtraction of a pair of values in one cycle is known as a butterfly operation and is typically performed in a butterfly circuit. FIG. 6 shows an exemplary ALU that includes a low-accuracy complex butterfly circuit 100. The butterfly circuit 100 includes a magnitude accumulator 102, a phase accumulator 104, a selector 106, a look-up table 108, a sum combiner 110 and a difference combiner 112. When R₁ is greater than R₂, the magnitude accumulator 102 computes the logmagnitude difference R = R₁ - R₂, and the phase accumulator 104 computes the phase angle difference θ = θ₁ - θ₂. When R₂ is greater than R₁, the magnitude accumulator 102 computes the logmagnitude difference R = R₂ - R₁, and the phase accumulator 104 computes the phase angle difference θ = θ₂ - θ₁. The magnitude accumulator 102 and the phase accumulator 104 output the computed differences to the look-up table 108.

The look-up table 108 contains the logarithms of complex numbers for all angles. The logmagnitude difference and the phase difference address the look-up table 108 to provide two logpolar values, F(Z) and F(Z+π). If desired, the table can be halved in size by always using a positive angle argument and conjugating the output F(Z) values when the original angle address is negative.

The magnitude accumulator 102 controls the selector 106 to select Z₁ or Z₂ as Z_L, according to which of R₁ and R₂ is larger. The selector 106 provides Z_L to the sum combiner 110 and the difference combiner 112. The combiners 110, 112 add Z_L to the two look-up table outputs F(Z) and F(Z+π) to produce the sum output logarithm and the difference output logarithm of the two input complex numbers, thereby performing a complex butterfly in one operation.
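The data flow of the butterfly can be sketched in Python as below. This is a floating-point illustration only: the function `F` computes log_e(1 + e^-Z) directly where circuit 100 would read look-up table 108, logpolar values are modeled as Python complex numbers R + jθ, and the function names are assumptions made here.

```python
import cmath


def F(Z):
    """Stand-in for look-up table 108: log_e(1 + e^-Z), computed directly."""
    return cmath.log(1 + cmath.exp(-Z))


def butterfly(Z1, Z2):
    """Return (log of sum, log of difference) of two logpolar inputs.

    The difference is larger-minus-smaller in magnitude, because Z_L is
    whichever input has the larger logmagnitude.
    """
    if Z1.real >= Z2.real:           # selector 106: pick the larger logmagnitude
        Z_L, Z = Z1, Z1 - Z2         # accumulators 102/104: R and theta differences
    else:
        Z_L, Z = Z2, Z2 - Z1
    log_sum = Z_L + F(Z)                     # sum combiner 110
    log_diff = Z_L + F(Z + 1j * cmath.pi)    # difference combiner 112
    return log_sum, log_diff
```

With inputs log(3+4j) and log(1-2j), exponentiating the two outputs recovers approximately 4+2j and 2+6j. Two inputs that are exactly equal make the difference zero, whose logarithm is undefined; the sketch does not handle that case.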

Butterfly operations are useful for performing the Fast Fourier Transform (FFT), which is often needed for signal processing operations such as Orthogonal Frequency Division Multiplex (OFDM) signal decoding. For base-2 FFT operations, it is common to change the phase angle by multiples of 2π/2^N, where 2^N is the size of the FFT. In logpolar format these phase rotation operations, known as twiddles, are trivial and involve only adding multiples of quantities such as 0.0001000 to the phase part. Because changing the phase angle in the butterfly circuit 100 is easy, very efficient butterflies and twiddles can be performed by applying complex numbers expressed in logpolar format to the butterfly circuit 100, which is highly beneficial for the FFT. No rounding occurs in the twiddle operation as long as the FFT is base 2 and N is less than or equal to the wordlength of θ. For FFTs with a base other than 2, a special logpolar format could be devised in which θ is represented in the same radix as the FFT base. The algorithms described herein can be used in such a device by suitably adapting the look-up tables.

FIG. 7 shows an exemplary implementation of a 16-point FFT using multiple complex butterfly circuits 100 as shown in FIG. 6. The butterfly circuits 100 combine 8 pairs of values selected from the 16-element array. Selected sum and difference outputs are modified in their angle parts to effect the complex rotations known as twiddles. The angles are modified by modulo-2π addition of the bit patterns shown. As shown, when 7-bit angle parts are used, modulo-2π addition is simply modulo-128 addition. The FFT can be implemented using 8×4 = 32 copies of the butterfly circuit 100 for fully parallel processing and computation of a complete FFT per machine cycle. Alternatively, the FFT can be implemented using a single column of 8 butterfly circuits 100 successively to implement each of the four columns of computation. Alternatively, a single butterfly circuit 100 can be used iteratively 32 times to execute the FFT. The choice among these options depends on the desired tradeoff between speed and size or cost.
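A twiddle on a 7-bit angle part can be sketched as a modulo-128 addition. The function name `twiddle`, the index k, and the direction of rotation are assumptions made for illustration; for a 16-point FFT the basic step is 128/16 = 8 phase codes, i.e., the bit pattern 0.0001000.

```python
PHASE_BITS = 7
PHASE_MOD = 1 << PHASE_BITS          # 128 codes span 0 to 2*pi


def twiddle(phase, k, fft_size=16):
    """Rotate a 7-bit phase code by k steps of 2*pi/fft_size."""
    step = PHASE_MOD // fft_size     # exact when fft_size is a power of 2 up to 128
    return (phase + k * step) % PHASE_MOD
```

For example, `twiddle(120, 3)` returns 16: the code 120 + 24 = 144 rolls over modulo 128, and no rounding is involved because the step is an exact integer number of phase codes.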

The advantage of logpolar quantization over a Cartesian representation of complex values can be appreciated by considering the problem of representing a signal to an accuracy of about 1% when the signal may lie anywhere within a 60 dB dynamic range. This can occur in receivers for burst-mode transmission, which presents the receiver with a signal without warning of the expected signal level. To represent the Cartesian parts to about 1% when the minimum signal level is approximately 1, a least step of approximately 1/64 is needed, which is 6 bits to the right of the binary point. However, to represent signals over 60 dB, it is necessary to represent signals 1000 times larger, which requires a further 10 bits to the left of the binary point. The real and imaginary parts each need the format S10.6, for a total of 34 bits. In contrast, as described above, the same quantization accuracy and dynamic range are achieved in logpolar format using only 16 bits.
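The bit count in the preceding paragraph can be checked with a few lines of arithmetic. This is only a sketch of the reasoning; the function name is hypothetical, and the 16-bit logpolar figure is simply the one quoted in the text.

```python
import math

def cartesian_bits(dynamic_range_db, frac_bits):
    """Total bits for a real/imaginary pair, each in sign + integer.fraction format."""
    amplitude_ratio = 10 ** (dynamic_range_db / 20)       # 60 dB is a factor of 1000
    int_bits = math.ceil(math.log2(amplitude_ratio))      # 10 bits left of the binary point
    per_part = 1 + int_bits + frac_bits                   # S10.6 is 17 bits
    return 2 * per_part

cartesian = cartesian_bits(60, 6)    # real part plus imaginary part
logpolar = 16                        # figure quoted in the text
```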

If higher accuracy is required than can be achieved with a single lookup table of reasonable size, the two-table iterative method described above for real numbers can be applied to complex numbers. A suitable complex format within the same 32-bit word length as the high-accuracy real format is, for example,

(0.xxxxxxxxxxxxxxx; xxxxx.xxxxxxxxxxxx)

or simply (0.15; 5.12) in phase-first notation. Choosing the number of phase bits to be 2 or 3 more than the number of bits to the right of the binary point of the logmagnitude results in smaller quantization errors for phase and amplitude. The least significant bit of the 15-bit phase has a value of 2π × 2^-15 = 6.28 × 2^-15. A change in the 12th binary place of R = log(r) gives d(log(r)) = dr/r = 2^-12 = 8 × 2^-15.
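As an illustration of the (0.15; 5.12) layout, the sketch below packs a phase and a logmagnitude into one 32-bit word, phase first. The field order within the word, the treatment of the logmagnitude as a non-negative base-2 value, and the function names are assumptions made for the example and are not specified in the text.

```python
PHASE_BITS, INT_BITS, FRAC_BITS = 15, 5, 12
MAG_BITS = INT_BITS + FRAC_BITS                 # 17-bit logmagnitude field

def pack(phase_turns, log2_mag):
    """Quantize to (0.15; 5.12) and pack phase-first into a 32-bit integer.

    phase_turns is the angle as a fraction of a full turn in [0, 1);
    log2_mag is assumed non-negative (sign/offset handling is omitted).
    """
    phase = int(round(phase_turns * (1 << PHASE_BITS))) % (1 << PHASE_BITS)
    mag = min(int(round(log2_mag * (1 << FRAC_BITS))), (1 << MAG_BITS) - 1)
    return (phase << MAG_BITS) | mag

def unpack(word):
    """Return (phase as a fraction of a turn, base-2 logmagnitude)."""
    phase = word >> MAG_BITS
    mag = word & ((1 << MAG_BITS) - 1)
    return phase / (1 << PHASE_BITS), mag / (1 << FRAC_BITS)

word = pack(0.25, 3.5)       # a quarter turn with magnitude 2**3.5
```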

Thus, one least significant bit of log(r) is a displacement in the radial direction that is slightly larger than the tangential displacement of one least significant bit of θ. Using base 2, the least significant bit of log(r) is reduced by the factor log_e(2) = 0.69 to 5.54 × 2^-15, which is slightly smaller than one least significant bit of θ. If it were important, exactly equal radial and tangential quantization could be achieved with the special base e^(π/4) = 2.19328, which lies between 2 and e. However, base 2 has implementation advantages and is preferred. For example, using base 2, a logmagnitude of format 5.12 represents signal levels over a dynamic range of 32 × 6 = 192 dB, which is twice the range of the 16-bit format. The quantization noise is more than 80 dB below the signal level at all signal levels. This is more than adequate for radio signal processing in normal applications. It is also useful in simulation, when it is required to guarantee that quantization effects can be ignored, and may be useful in demanding applications such as interference cancellation with extreme disparities between large unwanted signals and small wanted signals.
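The comparison of radial and tangential steps can be reproduced numerically. The sketch below expresses each step in units of 2^-15, as the text does; the function name `radial_lsb` is hypothetical.

```python
import math

UNIT = 2 ** -15
PHASE_LSB = 2 * math.pi * 2 ** -15        # tangential step as a fraction of r

def radial_lsb(base, frac_bits=12):
    """Relative radial step dr/r for one LSB of log_base(r)."""
    return math.log(base) * 2 ** -frac_bits

steps = {
    "phase": PHASE_LSB / UNIT,                                   # about 6.28
    "base e": radial_lsb(math.e) / UNIT,                         # 8
    "base 2": radial_lsb(2) / UNIT,                              # 8 * ln 2
    "base e^(pi/4)": radial_lsb(math.exp(math.pi / 4)) / UNIT,   # equals the phase step
}
```

With 15 phase bits and 12 fractional logmagnitude bits, equal steps require 8 ln(b) = 2π, that is, ln(b) = π/4, which is where the special base comes from.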

When two logpolar values are log-added or log-subtracted, the result is simply the value with the larger logmagnitude if the difference in logmagnitudes is so great that neither the least significant bit of log(r) nor that of θ is affected. Thus, if R₁ and R₂ are the logmagnitudes of two logpolar values Z₁ and Z₂, and R is the difference between R₁ and R₂, always taken positive, then when R is greater than 13 log_e(2) = 9.011, the function [expression not reproduced] is zero to 12 binary places.

Therefore, only values of the logmagnitude difference R between 0 and 9 need be considered in the base-e case for the 32-bit logpolar format. Similarly, in the base-2 case, only values of the logmagnitude difference between 0 and 13 need be considered as arguments of the complex logadd/sub function. Thus, 4 bits to the left of the binary point suffice to represent R, so that R is of the form 4.12.
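The limits of 9 and 13 can be checked as follows. The sketch assumes the criterion implied by "zero to 12 binary places", namely that base^-R has fallen to 2^-13 (half an LSB of a 12-bit fraction); the function name is hypothetical.

```python
import math

FRAC_BITS = 12

def max_useful_difference(base):
    """R (in log-base units) at which base**-R has fallen to 2**-(FRAC_BITS + 1)."""
    return (FRAC_BITS + 1) * math.log(2) / math.log(base)

limit_e = max_useful_difference(math.e)       # 13 * ln 2
limit_2 = max_useful_difference(2)            # 13
int_bits = math.ceil(math.log2(limit_2))      # bits needed left of the binary point
```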

Because the complex logadd/logsub function for negative θ is the conjugate of the complex logadd/logsub function for positive θ, θ can be restricted to the range from 0 to just less than π, and is then of the form 0.0xxxxxxxxxxxxxx with only 14 variable bits. During the research leading to the present invention, it was found that convergence problems with the complex iteration were greatly alleviated by excluding the special value π = 0.10000000000... of the angle difference. That case is exactly equivalent to real subtraction of the logmagnitudes, the resulting angle is one of the two input argument angles, and it is best performed in real arithmetic using the F_s function.
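The conjugate symmetry and the θ = π special case can be illustrated numerically. The sketch assumes the standard log-domain correction term log(1 + e^-Z) with Z = R + jθ, which follows from log(z₁ + z₂) = log(z₁) + log(1 + z₂/z₁); this form is an assumption here, since the expression itself is not reproduced in this passage.

```python
import cmath
import math

def complex_logadd_term(r, theta):
    """log(1 + e^-(r + j*theta)): assumed correction term for logpolar addition."""
    return cmath.log(1 + cmath.exp(-complex(r, theta)))

pos = complex_logadd_term(0.7, 1.1)
neg = complex_logadd_term(0.7, -1.1)              # conjugate of pos

# At theta = pi the term collapses to the real subtraction function log(1 - e^-r).
at_pi = complex_logadd_term(0.7, math.pi)
real_sub = math.log(1 - math.exp(-0.7))
```

Because of the symmetry, a table only needs entries for 0 ≤ θ < π; results for negative θ are obtained by negating the angle of the looked-up value.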

As with real numbers, the iterative process for complex numbers involves first splitting the difference Z = (θ, R) = Z₁ - Z₂ of the two arguments Z₁ and Z₂ to be combined into a most significant part and a least significant part. As described above, the value of Z actually requires only 30 variable bits. For example, let Z_M comprise the most significant 7 bits of the 14 variable bits of θ and the most significant 8 bits of the 16-bit R, that is, Z_M = (0.0xxxxxxx; xxxx.xxxx) in phase-first notation.

Z_L then comprises the remaining least significant 8 bits of R and the least significant 7 bits of θ, and has the format Z_L = (0.00000000xxxxxxx; 0000.0000xxxxxxxx). Next, define Z_M^+ = Z_M + dZ, where dZ has a real part of either 0.0001 or 0.000011111111 and an imaginary part of either 0 or 0.111111111111111, that is, one LSB less than 2π. Z_L^- is then defined as Z_M^+ - Z. With the former choice of dZ, Z_L^- is the two's complement of the variable bits of Z_L, and with the latter choice of dZ, it is the complement of those bits. Because complementing is easier than two's complementing, the latter choice of the real and imaginary parts of dZ is preferred.
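The difference between the two choices of increment can be seen on the real part alone. The sketch below treats R as a 4.12 fixed-point integer split into an 8-bit upper part and an 8-bit lower part; the helper names and the example value are illustrative.

```python
L_BITS = 8                               # least significant bits of R held in Z_L
L_MASK = (1 << L_BITS) - 1

def split(r):
    """Split a 4.12 fixed-point integer into (R_M, R_L), both in LSB units."""
    return r & ~L_MASK, r & L_MASK

def residual(r, d):
    """R_L^- = (R_M + d) - R for an increment d given in LSB units."""
    r_m, _ = split(r)
    return r_m + d - r

r = 0b0101_1010_0011_0110                # arbitrary example value
_, r_l = split(r)
twos = residual(r, 1 << L_BITS)          # d = 0.0001 (binary): two's complement of R_L
ones = residual(r, L_MASK)               # d = 0.000011111111 (binary): bitwise complement
```

The second choice needs only inverters on the low-order bits, with no carry chain, which is the reason given for preferring it.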

The function log_e(1 - [...]) depends only on the 8 most significant bits of R and the 7 most significant bits of θ, so it can be precomputed and stored in a 32,768-word table that is directly addressed by Z_M. Thus, there is no need to form Z_M^+ during processing.

The function -log_e(1 - [...]) depends only on the 7 most significant bits of R and the 8 most significant bits of θ, and can be precomputed and stored as a 32,768-word lookup table that serves as the G-function for complex arithmetic. The latter is needed only to compute the successive values Z', Z'', Z''' and so on. The desired result is the sum of the original argument having the larger logmagnitude, Z₁ or Z₂, and a series of F-function values with arguments Z_M, Z'_M, Z''_M and so on. Research has shown that the complex logadd/logsub iteration needs at most six iterations to converge, the worst case being when the angles of Z₁ and Z₂ are almost 180 degrees apart and their magnitudes are almost equal. As described above, the case of angles exactly 180 degrees apart is resolved by treating the operation as a real subtraction.

To accommodate both real and complex operations in the same F-table, two extra address bits may be provided to select among the table for real addition, the table for real subtraction and the table for complex addition/subtraction. The function may then be denoted F(r_M, opcode), where in the complex case r_M is 14 of the 15 bits of the argument and the 15th bit is part of the 2-bit opcode. The 2-bit opcode is assigned as shown in the following table:

Opcode  Operation
00      Real addition
01      Real subtraction
1x      Complex addition/subtraction, where x is the 15th bit of the main argument
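One way to picture the combined F-table addressing is sketched below. Placing the opcode in the two most significant address bits is an assumption made for the example, as are the names `f_table_address` and `OPCODES`; the text specifies only the opcode values.

```python
ARG_BITS = 14
OPCODES = {"real_add": 0b00, "real_sub": 0b01}

def f_table_address(arg, op, bit15=0):
    """Concatenate a 2-bit opcode with a 14-bit argument (hypothetical layout)."""
    if op == "complex":
        opcode = 0b10 | (bit15 & 1)          # 1x, where x is the 15th argument bit
    else:
        opcode = OPCODES[op]
    return (opcode << ARG_BITS) | (arg & ((1 << ARG_BITS) - 1))

table_words = 1 << (ARG_BITS + 2)            # 65,536 words in the combined table
```

Under this layout the complex table occupies the upper half of the address space (32,768 words) and the two real tables occupy one quarter each.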

Similarly, the function log_e(1 - [...]) depends only on the 15 bits of Z_L, so it can be precomputed and stored in a lookup table that is directly addressed by Z_L^-. Its size and function are similar to those of the G-table for real operations, and it can be combined with the real G-table in a 65,536-word lookup table by introducing an "opcode" argument, which is 0 for real and 1 for complex, to select the appropriate 32,768-word half.

By splitting the complex input into a most significant part and a least significant part, the same principles used to perform logarithmic arithmetic on real numbers with the two-table iterative process can be applied to complex numbers expressed in logarithmic format. In addition, by splitting the complex input into a most significant part and a least significant part, the multi-stage pipeline described in co-pending U.S. Patent Application Serial No. _______ (Attorney Docket No. 4015-5287) can be applied to complex numbers expressed in logpolar format. That co-pending application is incorporated herein by reference. In the pipeline of the co-pending application, the ALU stores a selected portion of the lookup table for each stage of the pipeline. At least one stage of the pipeline executes its selected portion of the lookup table using a stage input expressed in logpolar format to generate a partial output associated with that stage. By combining the partial outputs, the multi-stage pipeline generates the logarithmic output.

When θ = π, the operation is equivalent to real subtraction. In this case the result depends only on R, and a special lookup table can be used in a one-shot operation. Alternatively, the existing lookup table for real subtraction can be used. This can be achieved by using 14 bits of R, 0xxxx.xxxxxxxxx, to address the F_s part of the F-table, and running the real subtraction algorithm with the remaining 3 bits of R, extended with 12 zeros, as the initial value of R_L. The real iteration is then performed accumulating only the desired bits of precision in the output register, corresponding to the reduced complex precision, and using an earlier termination criterion than R > 18. For example, R > 9 may suffice.

Common ALU for Real and Complex Logarithmic Arithmetic

Complex and real numbers may both be used to represent various signals within a single system. Conventional processors may include separate ALUs, one for implementing complex logarithmic arithmetic and one for implementing real logarithmic arithmetic. However, two separate ALUs occupy considerable silicon area. In some instances, the ALUs may require prohibitively large lookup tables. It is therefore beneficial to have a single ALU that implements both real and complex logarithmic arithmetic with lookup tables of reasonable size.

FIG. 8 illustrates an exemplary ALU 200 for performing both real and complex logarithmic arithmetic. The ALU 200 includes an input accumulator 210, a lookup controller 220 and an output accumulator 230. In general, the input accumulator 210 computes the difference between two real or complex inputs, and the lookup controller 220 and output accumulator 230 together generate the output logarithm from the real or complex output of the input accumulator 210, using either the real lookup table or the complex lookup table depending on the input.

Two real or complex numbers A and B, expressed in logarithmic format and to be added or subtracted, are presented in succession to the input accumulator 210. On the first occurrence of the strobe pulse, the ALU 200 loads the first number A into the input accumulator 210 and the output accumulator 230. The second number B is then presented to the input accumulator 210, with its associated sign, or its angle part θ in the complex case, changed by 180 degrees for subtraction.

On the second occurrence of the strobe, the input accumulator 210 subtracts B from A. If there is an underflow, indicating that the logmagnitude of B is greater than the logmagnitude of A, the input accumulator 210 stores and outputs the value X = B - A and sends a borrow pulse to the output accumulator 230. The borrow pulse causes the output accumulator 230 to load B, including its associated sign (or angle, in the complex case), changed or unchanged, overwriting A. If there is no underflow, the input accumulator stores and outputs the value X = A - B. Thus, the output accumulator 230 holds the larger of A and B, and the input accumulator 210 holds |A - B|. The quantity X corresponds to the quantity r in the equations given above for real numbers and to the quantity Z in the equations given above for complex numbers.

Based on X, the lookup controller 220 determines two outputs, a partial output L and a correction output Y. The lookup controller 220 sends the partial output L to the output accumulator 230 together with an ADD pulse, so that L is accumulated with the existing contents of the output accumulator 230. The lookup controller 220 sends the correction output Y to the input accumulator 210 together with an ADD pulse, so that Y is accumulated with the existing contents of the input accumulator 210 to produce a new value of X. The cycle repeats until Y meets or exceeds a predetermined value. When Y meets or exceeds the predetermined value, the cycle stops, the lookup controller 220 generates a READY signal indicating that the desired answer is available as output C from the output accumulator 230, and the ALU 200 returns to its initial state to await a new pair of input values A and B.
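The accumulate-and-iterate cycle can be sketched for the real log-addition case as follows. This is a floating-point illustration under stated assumptions, not the specification's fixed-point implementation: the functions F(x_M⁺) = log(1 + e^-x_M⁺) and G(y) = -log(1 - e^-y) are assumed from the standard two-table decomposition, the tables are evaluated on the fly rather than stored, the stop test is applied to X, and `STEP`, `STOP` and the function names are illustrative choices. The sketch relies on the identity 1 + e^-x = (1 + e^-x_M⁺)(1 + e^-x'), where x' = x + F + G.

```python
import math

STEP = 2.0 ** -4      # granularity of the most significant part of X (assumed)
STOP = 40.0           # stop threshold on X (assumed); beyond it log(1 + e^-X) is negligible

def f_add(x_m_plus):
    """Assumed F-function for addition: log(1 + e^-x), evaluated at x_M^+."""
    return math.log1p(math.exp(-x_m_plus))

def g(y):
    """Assumed G-function: -log(1 - e^-y), evaluated at x_L^-."""
    return -math.log(-math.expm1(-y))

def log_add(a, b):
    """Return (log(e^a + e^b), iteration count) using only lookups and additions."""
    out_acc = max(a, b)                    # output accumulator holds the larger operand
    x = abs(a - b)                         # input accumulator holds |A - B|
    iterations = 0
    while x <= STOP:
        x_m_plus = math.floor(x / STEP) * STEP + STEP
        x_l_minus = x_m_plus - x           # always in (0, STEP]
        partial = f_add(x_m_plus)          # partial output L
        correction = partial + g(x_l_minus)  # correction output Y = F + G
        out_acc += partial
        x += correction
        iterations += 1
    return out_acc, iterations

result, count = log_add(1.0, 0.5)
```

In hardware, `f_add` and `g` would be table reads addressed by the upper and lower bits of X, so each pass through the loop costs two lookups and two additions.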

FIG. 9 illustrates further details of one exemplary lookup controller 220 for real or complex logarithmic arithmetic. The lookup controller 220 includes an F-table 222, a G-table 224, a combiner 226 and a sequencer 228. The F-table 222 and G-table 224 include a complex lookup table for determining logarithms of complex numbers and/or a real lookup table for determining logarithms of real numbers. Although the F-table 222 and G-table 224 shown in FIG. 9 include both complex and real lookup tables, those skilled in the art will appreciate that the F-table 222 and/or G-table 224 may include only one of the complex lookup table and the real lookup table.

As the first 32-bit logarithmic quantity A is applied to accumulators 210 and 230, a start strobe is applied to the sequencer 228. The sequencer 228 provides a load 1 pulse to the input accumulator 210 and a load 2 pulse to the output accumulator 230, causing them to store the 32-bit quantity A. As the second 32-bit logarithmic quantity B is applied to accumulators 210 and 230, a second strobe is applied to the sequencer 228.

The sequencer 228 provides an accumulate pulse to the input accumulator 210. If the input accumulator 210 outputs a "borrow" pulse, indicating that the logmagnitude of B is greater than the logmagnitude of A, the sequencer 228 outputs another load 2 pulse to the output accumulator 230, so that the value B, including the sign or phase of B, is stored in the output accumulator 230, overwriting A. For real numbers, the sign of the value with the larger logmagnitude becomes the sign of the result C. The input accumulator 210 outputs the value of the difference X between the logmagnitudes, where X = A - B if A is larger and X = B - A if B is larger. Thus, X is always positive. The most significant part of X, X_M, is applied to the F lookup table 222, and the least significant part of X, X_L, is applied to the G lookup table 224.

For real numbers, the sign logic part of the input accumulator 210 XORs the signs of the numbers A and B to determine whether the F_a part of the lookup table 222 is used (like signs imply addition) or the F_s part is used (unlike signs imply subtraction). The XOR of the signs forms an extra address bit for the F-table 222.
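The table selection by sign XOR can be written in a few lines. The sketch assumes sign bits of 0 for positive and 1 for negative, and that a requested subtraction is handled by first inverting the sign of B, as described for the input accumulator; the function name is hypothetical.

```python
def select_f_table(sign_a, sign_b, subtract=False):
    """Pick F_a or F_s from the operand sign bits (0 = positive, 1 = negative)."""
    if subtract:
        sign_b ^= 1                        # subtraction flips the sign of B first
    extra_address_bit = sign_a ^ sign_b    # 0 selects F_a, 1 selects F_s
    return "F_s" if extra_address_bit else "F_a"
```

For example, subtracting a negative number from a positive one yields like signs after the flip and therefore uses the addition table.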

If the value X in the input accumulator 210 does not exceed the stop threshold, no stop pulse is provided to the sequencer 228, and the sequencer continues to send accumulate pulses to the input accumulator 210 and the output accumulator 230. As a result, the value F + G from the combiner 226 is accumulated in the input accumulator 210 as the correction output, and the partial output L is accumulated in the output accumulator 230.

This is repeated until the output accumulator 230 indicates, by providing a "stop" pulse to the sequencer 228, that the value meets or exceeds the stop threshold. When the "stop" pulse is provided, the sequencer 228 generates a "ready" pulse indicating that the value C in the output accumulator 230 is the final result, and returns to the start state.

In the apparatus of FIG. 9, the F_s part of the lookup table 222 stores negative values that are accumulated appropriately in accumulators 210 and 230 without any need to indicate separately whether a log-addition or a log-subtraction operation is in progress. Alternatively, to save storing the sign bit of every F_s value, the sign bit may be omitted from the lookup table and the value of the +/- bit supplied by the sign logic may be used instead, since all F_a values are positive and all F_s values are negative. Storing the true negative value of F_s less its sign bit is different from negating the value and storing a positive value, which would later have to be subtracted in the output accumulator 230 and the combiner 226. The latter will be seen to be beneficial when lookup table size compression is considered. Lookup table compression is further described in U.S. Patent Application Serial No. _______ (Attorney Docket No. 4015-5288), which is incorporated herein by reference.

Another variation that may be considered during implementation is to make the output value Y of the combiner 226 the negative of F + G, so that it can be subtracted from the input accumulator 210, which then does not need to distinguish an add command from a subtract command. Because the negative of a number is its complement plus 1, this can be achieved by using complemented outputs and storing G-table 224 values that are all reduced by one least significant bit. However, so that the G-table 224 remains generally useful in other scenarios, it is preferable that the values of the G-table 224 not be altered.
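The complement trick can be verified directly. The sketch below shows one reading of the sentence above: if the stored G value is one LSB smaller, the bitwise complement of the combiner sum equals the two's complement negative of F + G. The 32-bit width and the function name are assumptions for the example.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1

def negated_sum(f, g):
    """Return -(f + g) modulo 2**WIDTH using only an add and a bitwise complement."""
    g_stored = g - 1                     # G-table entry reduced by one LSB
    return ~(f + g_stored) & MASK        # complement of the sum; no extra +1 needed

y = negated_sum(1234, 5678)
```

This works because ~x = -x - 1 in two's complement, so ~(f + g - 1) = -(f + g).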

FIG. 10 illustrates the complex operation of the ALU 200 in more detail. For complex values, the input accumulator 210 comprises two independent parts, an R-part 210A and a θ-part 210B. The same input accumulator 210 used for real arithmetic can be used for complex arithmetic if the carry out of the θ-part 210B is prevented from propagating into the R-part 210A, or vice versa in the case of θ-first bit order.

It is important that the bits addressing the complex F-table 222 come partly from θ and partly from R. If θ occupies the positions occupied by the LSBs of R in the real case, the connections between the input accumulator and the F-table 222 must be changed for complex arithmetic. The same is true for the G-table 224. This can be implemented without much inconvenience by a set of selector switches (not shown) that select the appropriate bits from the input accumulator 210 for connection to the address inputs of the G-table and F-table, depending on whether real or complex arithmetic is being performed. Another solution may be considered: the connections between the input accumulator 210 and the F and G tables may be kept the same for real and complex operation, which requires interleaving the assignment of bits to R and θ. Thus, the most significant bits of θ exchange positions with the least significant bits of R, so that the most significant bits of R and θ occupy the bit positions occupied by the most significant bits of R in the real case, and the least significant bits of R and θ occupy the bit positions occupied by the least significant bits of R in the real case. To keep the R bits connected so as to form the R-adder 226A and the θ bits connected so as to form the independent θ-adder 226B, the carry bits of three adder stages need to be rerouted for complex as compared to real. Once this is done, the output accumulator 230 and adder 226 are configured similarly, to avoid crossing connections between real and complex. This ensures that the output bits of the F and G tables remain connected to the same destinations in the adder 226 and the accumulator 230.

If it is desired to use the real subtraction table F_s for the complex case when θ = π, the alternative described above is less practical. In that case, all the bits of R need to be connected to the address inputs of the F-table 222 and, likewise, to the R-adder parts of the accumulator 230 and adder 226. It is then again difficult to avoid the use of rerouting switches. If the θ = π case is handled without iteration, that is, by a single lookup in the real F_s-table 222, rerouting of the adder bits is avoided.

Another bit-alignment issue in using the real subtraction table to handle the complex θ = π case is that the number of bits of R to the left of the binary point is one fewer for complex numbers (4 bits) than for real numbers (5 bits). In addition, the real iteration uses an F-table 222 addressed by most significant bits of the form 5.9, for example, whereas handling the complex θ = π case without iteration requires addressing the F-table 222 with all 16 bits of the difference value R in format 4.12, which calls for a table of a different size.

FIG. 11 illustrates different possible bit assignments for real and complex numbers. FIG. 11A shows a straightforward assignment of bits 1 to 32, beginning with the sign bit S in position 1 followed by a 31-bit logmagnitude of format 8.23, and shows its division into a most significant part X_M of format 5.9 and a least significant part X_L of format 0.14. Below this is shown a straightforward assignment of bits 1 to 32 to a logamplitude of format 5.12 and a phase angle of format 0.15, with the logamplitude divided into a most significant part R_M of format 4.4 and a least significant 8 bits R_L. The phase is divided into 7-bit most significant and least significant parts, with the bit corresponding to π shown separately. A number of misalignments between real and complex are apparent from FIG. 11A. For example, the binary points of the logmagnitudes are not in the same position, and the bits that address the F-table 222, namely X_M for real and R_M and θ_M for complex, are not the same bits.

FIG. 11B shows a bit allocation in which the binary points of the real and complex log-magnitudes are aligned. This matters only when attempting to reuse the real F_s-table 222 for the complex case θ = π and the real F_a-table 222 for the complex case θ = 0. The bits addressing the F and G tables are still different for real and complex numbers.

FIG. 11C shows a bit allocation in which the same bits address the F-table 222 for both real and complex numbers. The sign bit S and the most significant part X_M are placed contiguously to address the F_a and G tables for real operations, so that the 15 address bits are together; in the complex case, the same 15 bits comprise the 8-bit R_M and the 7-bit θ_M.

Similarly, the 15 bits consisting of R_L and θ_L overlap the 14 bits of X_L and address the G-table 224 ROM. In the real case, bit number 2 is ignored when addressing the real G-table 224, which is half the size of the complex table. FIG. 11C also shows that the bit order within the most significant and least significant parts is arbitrary, but may be chosen to maximize the number of carry connections from one adder stage to the next; it is important that these connections remain unchanged between real and complex operation.
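The shared addressing of FIG. 11C can be sketched as follows. The field widths come from the text above; the order in which fields are concatenated is an assumption (the text notes the bit order is arbitrary), and all function names are hypothetical.

```python
def _check(value: int, width: int, name: str) -> int:
    """Validate that a field fits in the given number of bits."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"{name} must fit in {width} bits")
    return value


def f_address_real(sign: int, x_m: int) -> int:
    """15-bit F address: sign bit S followed by the 14-bit X_M (format 5.9)."""
    return (_check(sign, 1, "sign") << 14) | _check(x_m, 14, "x_m")


def f_address_complex(r_m: int, theta_m: int) -> int:
    """15-bit F address: 8-bit R_M followed by 7-bit theta_M."""
    return (_check(r_m, 8, "r_m") << 7) | _check(theta_m, 7, "theta_m")


def g_address_complex(r_l: int, theta_l: int) -> int:
    """15-bit G address: 8-bit R_L followed by 7-bit theta_L."""
    return (_check(r_l, 8, "r_l") << 7) | _check(theta_l, 7, "theta_l")


def g_address_real(x_l: int) -> int:
    """14-bit G address from X_L; one of the 15 shared bits is unused,
    so the real G table is half the size of the complex one."""
    return _check(x_l, 14, "x_l")
```

Both F addresses span the same 15-bit range, so one set of address lines can serve either mode, while the real G address uses one bit fewer than the complex G address.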

A simpler solution is not to attempt to combine the complex and real F-tables 222 into one large table that must use the same address bits, but rather to use separate tables, each connected to the appropriate address bits selected from the input accumulator 210 differently for the real and complex cases. Separate address decoders may also be used for real and complex numbers. Likewise, the G-tables 224 for real and complex numbers may be different tables, or may at least have different address decoders. Apart from other considerations such as the θ = π case, the total size remains the same as for the combined tables. The θ = π case is problematic only when the log-amplitudes are nearly equal, that is, when R is nearly zero. It therefore needs to be treated as a special case only for R values of the form 0000.xxxxxxxxxxxx, or 0.12, that is, only when the most significant 4 bits of the difference R are zero. This requires only a 4096-word table, which may be justified in order to avoid the complexity of rerouting bit lines so that the real F_s-table 222 could be used. Given that the lookup tables occupy the largest proportion of the silicon chip area, and that the chip area occupied by accumulators, adders and other peripheral logic is small, the conclusion is that separate implementations of the real and complex algorithms may be logical, with the advantage that the resulting processor can execute real and complex operations simultaneously for increased processing speed.
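The special-case test for θ = π described above reduces to checking the top 4 bits of R. The sketch below assumes R is held as a 16-bit unsigned integer in format 4.12; the names `PI_TABLE_WORDS` and `pi_special_index` are hypothetical, and the table contents themselves are not modelled.

```python
from typing import Optional

# 12 fractional bits address the dedicated table: 2**12 = 4096 words.
PI_TABLE_WORDS = 1 << 12


def pi_special_index(r: int) -> Optional[int]:
    """Return the index into the 4096-word table, or None if R is not small.

    Assumption: r is the log-amplitude difference in format 4.12,
    stored as a 16-bit unsigned integer.
    """
    if not 0 <= r < (1 << 16):
        raise ValueError("r must fit in 16 bits (format 4.12)")
    if r >> 12:            # any of the 4 integer bits set: normal path
        return None
    return r & 0xFFF       # the 12 fractional bits select the table word
```

For example, `pi_special_index(0x0ABC)` yields `0xABC`, while `pi_special_index(0x1000)` yields `None` because an integer bit of R is set.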

The present invention may, of course, be carried out in ways other than those specifically set forth herein without departing from the essential characteristics of the invention. The present embodiments are to be considered in all respects as illustrative and not restrictive, and all changes that come within the meaning and range of equivalency of the appended claims are intended to be embraced therein.

Claims

An ALU for calculating an output log, the ALU comprising:

a memory for storing a first lookup table for determining logarithms of real numbers and a second lookup table for determining logarithms of complex numbers; and

a shared processor for generating an output log based on two input operands expressed in log format, using the first lookup table for real input operands and the second lookup table for complex input operands.

The ALU of claim 1,

wherein the output log represents a log of the sum or difference of the input operands.

The ALU of claim 1,

wherein the ALU comprises a butterfly circuit configured to simultaneously generate a log of the difference between the input operands and a log of the sum of the input operands using the first lookup table or the second lookup table.

The ALU of claim 3,

wherein the butterfly circuit comprises:

a first combiner for combining a selected input operand with a difference value provided by the first or second lookup table to produce a log of the difference between the input operands; and

a second combiner for combining the selected input operand with a sum value provided by the first or second lookup table to produce a log of the sum of the input operands.

The ALU of claim 1,

wherein the shared processor comprises:

a lookup controller configured to calculate one or more partial outputs based on the first or second lookup table; and

an output accumulator configured to generate the output log based on the partial outputs.

The ALU of claim 5,

wherein the number of partial outputs used to generate the output log is based on the desired accuracy of the output log.

The ALU of claim 5,

wherein the shared processor executes two or more iterations through the lookup controller to determine the output log, each iteration producing one of the partial outputs.

The ALU of claim 7,

further comprising an input accumulator configured to generate a real or complex input for the current iteration based on the partial output generated during a previous iteration.

The ALU of claim 7,

wherein the output accumulator generates the output log based on the partial outputs generated during each iteration and a selected input operand.

The ALU of claim 9,

wherein the shared processor further comprises a selection circuit configured to select the input operand having the maximum magnitude.

The ALU of claim 5,

wherein the lookup controller comprises a multistage pipeline, each stage of the multistage pipeline producing one of the partial outputs.

The ALU of claim 11,

wherein each stage of the pipeline stores a selected portion of the first and second lookup tables.

The ALU of claim 12,

wherein at least one stage of the pipeline executes a selected portion of the first lookup table using a real stage input, or a selected portion of the second lookup table using a complex stage input, to generate one of the partial outputs.

The ALU of claim 1,

wherein each of the complex input operands comprises a magnitude portion and a phase portion.

The ALU of claim 14,

further comprising an input accumulator,

wherein the input accumulator comprises:

a magnitude accumulator for generating a magnitude portion of a complex input based on the magnitude portions of the complex input operands; and

a phase accumulator for generating a phase portion of the complex input based on the phase portions of the complex input operands.

A method of calculating an output log in an ALU, the method comprising:

storing a first lookup table for determining logarithms of real numbers;

storing a second lookup table for determining logarithms of complex numbers; and

generating, in a shared processor, an output log based on two input operands expressed in log format, using the first lookup table for real input operands and the second lookup table for complex input operands.

The method of claim 16,

wherein generating the output log based on two input operands comprises generating the output log based on the sum or difference of the input operands.

The method of claim 16,

wherein generating the output log based on two input operands comprises simultaneously generating an output log of the difference between the input operands and an output log of the sum of the input operands using the first or second lookup table.

The method of claim 18,

wherein simultaneously generating the output logs comprises:

selecting an input operand based on a comparison between the input operands;

combining the selected input operand with a difference value provided by the first or second lookup table to produce the output log of the difference between the input operands; and

combining the selected input operand with a sum value provided by the first or second lookup table to produce the output log of the sum of the input operands.

The method of claim 16,

wherein generating the output log based on two input operands comprises:

calculating one or more partial outputs based on the first or second lookup table; and

generating the output log based on the partial outputs.

The method of claim 20,

further comprising executing at least two iterations to determine the output log, each iteration producing one of the partial outputs.

The method of claim 21,

further comprising generating an input for a current iteration based on the partial output generated during a previous iteration.

The method of claim 20,

wherein generating the output log based on the partial outputs comprises generating the output log based on partial outputs generated at each stage of a multi-stage pipeline.

The method of claim 23,

further comprising storing a selected portion of the first and second lookup tables at each stage of the multi-stage pipeline.

The method of claim 24,

wherein at least one stage of the pipeline executes a selected portion of the first or second lookup table based on a real or complex stage input, respectively, to produce one of the partial outputs.

The method of claim 16,

wherein the complex input operands each comprise a magnitude portion and a phase portion.

The method of claim 26, further comprising:

generating a magnitude portion of a complex input based on the magnitude portions of the complex input operands; and

generating a phase portion of the complex input based on the phase portions of the complex input operands.

An ALU for calculating a complex output log, the ALU comprising:

a memory for storing a lookup table for determining logarithms of complex numbers; and

a processor for generating an output log of an arithmetic combination of complex input operands expressed in logpolar format using the stored lookup table.

The ALU of claim 28,

wherein the processor includes a butterfly circuit configured to simultaneously generate an output log of the difference between the complex input operands and an output log of the sum of the complex input operands based on the lookup table.

The ALU of claim 29,

wherein the butterfly circuit comprises:

a first combiner for combining a selected input operand with a difference value provided by the lookup table to produce the output log of the difference between the complex input operands; and

a second combiner for combining the selected input operand with a sum value provided by the lookup table to produce the output log of the sum of the complex input operands.

The ALU of claim 28,

The processor,

A lookup controller configured to calculate one or more partial outputs based on the lookup table; And

ALU comprising a.

The ALU of claim 31,

wherein the processor executes at least two iterations through the lookup controller to generate the output log, each iteration generating one of the partial outputs.

The ALU of claim 32,

further comprising an input accumulator configured to generate a complex input for the current iteration based on the partial output generated during a previous iteration.

The ALU of claim 31,

wherein the lookup controller comprises a multistage pipeline, each stage of the multistage pipeline generating one of the partial outputs.

The ALU of claim 28,

wherein the complex input operands comprise a magnitude portion and a phase portion.

The ALU of claim 35,

further comprising an input accumulator,

The input accumulator is

The ALU of claim 35,

wherein the phase portion comprises a most significant portion of the complex input and the magnitude portion comprises a least significant portion of the complex input.

The ALU of claim 37,

wherein the lookup table comprises a magnitude lookup table and a phase lookup table.

The ALU of claim 38,

wherein the most significant portion of the complex input addresses the phase lookup table, and the least significant portion of the complex input addresses the magnitude lookup table.

A method of calculating a complex output log, the method comprising:

storing a lookup table for determining logarithms of complex numbers expressed in logpolar format; and

generating an output log based on complex input operands expressed in logpolar format using the stored lookup table.

The method of claim 40,

wherein generating the output log based on the complex input operands comprises simultaneously calculating an output log of the difference between the complex input operands and an output log of the sum of the complex input operands based on the lookup table.

The method of claim 40,

wherein generating the output log comprises:

calculating one or more partial outputs based on the lookup table; and

generating the output log based on the partial outputs.

The method of claim 42,

further comprising executing at least two iterations to generate the output log, wherein each iteration generates one of the partial outputs.

The method of claim 43,

wherein generating the output log comprises executing a multi-stage pipeline, each stage of the multi-stage pipeline generating one of the partial outputs.

The method of claim 40,

Output log calculation method further comprising.

The method of claim 45,

The method of claim 46, wherein

The method of claim 47,

Addressing the phase lookup table using the most significant portion of the complex input, and addressing the magnitude lookup table using the least significant portion of the complex input.