KR19990021771A

KR19990021771A - Digital signal processor

Info

Publication number: KR19990021771A
Application number: KR1019970045345A
Authority: KR
Inventors: 임일택; 반준호
Original assignee: 구자홍; 엘지전자 주식회사
Priority date: 1997-08-30
Filing date: 1997-08-30
Publication date: 1999-03-25
Also published as: KR100246472B1

Abstract

The present invention relates to a DSP that can process data quickly by avoiding the time consumed in rounding the data.

The DSP comprises: data input means for inputting N-bit data; rounding-bit adding means for appending r rounding bits, where r is smaller than N, to the N-bit data from the data input means; guard-bit adding means for adding g guard bits above the most significant bit of the data from the rounding-bit adding means; arithmetic means for operating on the data from the guard-bit adding means; rounding/saturation processing means for saturating and rounding the data from the arithmetic means; and rounding means for rounding the data from the arithmetic means.

Description

Digital signal processor

The present invention relates to a digital signal processor (hereinafter, DSP) that processes various digital signals in software.

A DSP is a kind of central processing unit (CPU) with arithmetic functions, and its development was originally led by Texas Instruments (hereinafter, TI), a United States electronics company. DSPs have recently been applied to video and audio signal processing, and their use is growing rapidly. In particular, DSPs are rapidly appearing that implement audio processing schemes such as Musicam, the audio compression algorithm adopted as an international standard by MPEG (Moving Picture Experts Group), and AC-3, Dolby's digital audio compression and decompression algorithm. Accordingly, leading companies around the world, particularly in Europe and America, are accelerating development of their own proprietary DSPs in order to increase their profits.

Such audio DSPs have a longer word length for internal operations than the TI fixed-point DSPs that have been in common use until now. For example, TI's TMS320C5x series has a 16-bit data word length and a 32-bit arithmetic logic unit (hereinafter, ALU), whereas nearly all recent DSPs developed to implement MPEG Musicam and Dolby AC-3 have a data word length of 20 bits or more and an ALU of 48 bits or more. More specifically, DSPs such as Yamaha's YSS243, Motorola's DSP5600x, Crystal Semiconductor's CS2923 and Medianix Semiconductor's MED25201 all have a 24-bit data word length and a 56-bit ALU for decoding Dolby AC-3 audio signals. DSPs such as Zoran's ZR38000 and Fujitsu's MB86342 all have a 20-bit data word length and a 48-bit ALU for the same purpose.

All DSPs configured in this way operate on two N-bit data words as shown in FIG. 1. The first and second data (10, 12) are read from memory and multiplied by a multiplier to produce third data (14), the product. The first and second data (10, 12) each have a length of N bits and consist of an integer part and a fractional part. The third data (14) can be up to 2N bits long and is therefore stored temporarily in a 2N-bit product register. The third data (14) is then shifted by a barrel shifter by the number of bits in the integer part of the first and second data (10, 12), for example R bits, yielding 2N-bit fourth data (16). The ALU operates on the fourth data (16) to produce fifth data (18). Overflow can occur in the fifth data (18), the result of the ALU operation. To handle this overflow, a DSP such as TI's TMS320C5x saturates the fifth data (18). In detail, when the fifth data (18) computed by the ALU is too long to be stored in the 2N-bit accumulation register, that is, when overflow occurs in the fifth data (18), the TMS320C5x saturates the fifth data (18) and stores the saturated value in the accumulation register, which reduces the error caused by the overflow. In contrast, DSPs such as Motorola's DSP5600x add 8 overflow-protection bits to the accumulation register, as shown in FIG. 1. In this case both the ALU and the accumulation register are 8+2N bits long. Applying these overflow-protection bits to the AC-3 DSPs listed above, DSPs with a 24-bit data word (N=24) contain a 56-bit accumulator, and DSPs with a 20-bit data word (N=20) contain a 48-bit accumulator. The fifth data (18), with the overflow-protection bits added, is then saturated and thereby converted into sixth data (20) having the same bit length as the first and second data (10, 12) stored in memory.
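The accumulator widths quoted above follow from 8+2N, and the final saturation step can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function names are hypothetical, the operands are assumed to be signed fractions with N-1 fractional bits, and Python's unbounded integers stand in for the guard bits. None of this is taken from the patent or from any vendor's documentation.

```python
GUARD_BITS = 8  # overflow-protection bits described for the DSP5600x-style accumulator


def accumulator_width(n_bits, guard_bits=GUARD_BITS):
    """Width of an ALU/accumulator that holds a 2N-bit product plus guard bits."""
    return guard_bits + 2 * n_bits


def saturate(value, n_bits):
    """Clamp a signed integer to the range of an n_bits two's-complement word."""
    max_val = (1 << (n_bits - 1)) - 1
    min_val = -(1 << (n_bits - 1))
    return max(min_val, min(max_val, value))


def mac_with_guard_bits(pairs, n_bits):
    """Sum products in a wide accumulator and saturate once, at the final store.

    Assumption: operands are signed fractions with n_bits - 1 fractional bits,
    so the sum is shifted right by n_bits - 1 before it is stored. Intermediate
    sums are never clamped, which is the role the guard bits play in hardware.
    """
    acc = 0
    for a, b in pairs:
        acc += a * b
    return saturate(acc >> (n_bits - 1), n_bits)
```

With these definitions, `accumulator_width(24)` and `accumulator_width(20)` reproduce the 56-bit and 48-bit accumulators mentioned in the text.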

To carry out this series of operations, most DSPs have had to include an ALU and accumulation registers with a large word length of 48 bits or more. Such wide ALUs and accumulators enlarge the die size of the DSP chip and also increase the propagation delay in the ALU. A wide ALU and accumulator therefore raise the manufacturing cost of the DSP chip and also lower the operating speed of the DSP.

Most DSPs also support a rounding operation ahead of saturation. For example, Motorola's DSP uses an instruction called rnd to convert the 8+2N-bit data stored in the accumulation register into 8+N-bit data. The rounded data is finally saturated to N bits and then stored in memory. For this rounding operation, most DSPs consume one additional instruction or clock cycle. When the rounding operation falls inside a code segment to which looping or block repeating is applied, the additional clock cycles become very numerous, and the overall computational load of the DSP increases greatly.
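As a rough illustration of the width reduction described above, the Python sketch below drops the low N bits of an accumulator value with rounding. It assumes simple round-half-up (add half an LSB, then shift); the rounding rule of any particular instruction, including rnd, may differ, and the function name is hypothetical.

```python
def round_accumulator(acc, n_bits):
    """Drop the low n_bits of a signed accumulator value, rounding to nearest.

    Assumption: round-half-up, implemented by adding half of the discarded
    range and then applying an arithmetic right shift.
    """
    half = 1 << (n_bits - 1)
    return (acc + half) >> n_bits
```

Executed as a separate step, this operation costs one extra instruction per result, which is the overhead inside loops that the text points out.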

In addition, a DSP such as TI's TMS320C5x uses a pre-scaling shifter located in front of the ALU to shift the binary data to be processed by the ALU left by 0 to 16 bits. Shifts of 0 to 15 bits are normally used to scale binary data, whereas the 16-bit shift is used when fixed-point rather than integer arithmetic is performed. This is because, in fixed-point arithmetic, 16-bit data read from memory must be aligned with the upper bits of the 32-bit accumulation register. The 0- to 15-bit shifts used for scaling may or may not be useful, depending on the algorithm and in particular on how it is coded. When an algorithm is implemented in code that makes no real use of pre-scaling, the scaling shifter contributes little while enlarging the die of the DSP chip and increasing the propagation delay. By way of example, the pre-scaling shifter in Motorola's DSP shifts data left or right by only 1 bit. Instead, Motorola's DSP provides separate multiply instructions for fixed-point and integer arithmetic, which absorbs the 16-bit left shift performed by TI's TMS320C5x.

Next, how quickly each instruction can be executed is regarded as important in a DSP. For fast execution, the instruction cycle of the DSP must be short and as many instructions as possible must be processed in parallel. Shortening the instruction cycle is a hardware matter, that is, a matter of circuit design: unnecessary blocks (for example, the pre-scaling shifter mentioned above) are kept out of the DSP as far as possible so that the circuit paths are short. The need for parallel instruction processing arises because DSPs are now used for video and audio decoding, which requires a great deal of computation, and the sequential instruction execution of the past cannot deliver the required computing power. DSPs are therefore being equipped with special instructions that allow parallel processing. For example, the ADSP21020, a DSP from Analog Devices, has special instructions that execute several compound expressions, such as the butterfly operation of an FFT, in only a few clock cycles. To this end, the ADSP21020 has a register file made up of at least 12 registers involved in the operations. Zoran's ZR38000 likewise executes several compound expressions, such as an FFT butterfly, in only a few clock cycles. To this end, the ZR38000 has a register file of eight registers and a structure that performs a multiplication and the addition or subtraction of its result together within one clock cycle.

An object of the present invention is to provide a DSP that can process data quickly by avoiding the time consumed in rounding data.

Another object of the present invention is to provide a DSP that can perform operations involving scaling of a value at high speed.

FIG. 1 is a diagram schematically showing the signal processing flow of a conventional DSP.

FIG. 2 is a block diagram of a DSP according to an embodiment of the present invention.

FIG. 3 is a diagram showing the part of FIG. 2 in which rounding of data is performed.

Explanation of reference numerals for the main parts of the drawings

30, 40, 44, 48, 64, 66, 76, 78: first to eighth registers

32, 34, 82, 84: first to fourth bit aligners

38: guard-bit adder

36, 42, 52, 54, 56, 62, 68, 70, 74, 80, 88: first to eleventh multiplexers

46: multiplier

38: bit adjuster

58, 72: first and second ALUs

60: barrel shifter

86: rounding/saturation processor

To achieve the above objects, a DSP according to the present invention comprises: data input means for inputting N-bit data; rounding-bit adding means for appending r rounding bits, where r is smaller than N, to the N-bit data from the data input means; guard-bit adding means for adding g guard bits above the most significant bit of the data from the rounding-bit adding means; arithmetic means for operating on the data from the guard-bit adding means; rounding/saturation processing means for saturating and rounding the data from the arithmetic means; and rounding means for rounding the data from the arithmetic means.

A DSP according to the present invention comprises: data input means for inputting N-bit data; rounding-bit adding means for appending r rounding bits, where r is smaller than N, to the N-bit data from the data input means; guard-bit adding means for adding g guard bits above the most significant bit of the data from the rounding-bit adding means; arithmetic logic means for operating on the data from the guard-bit adding means and data from a feedback loop; a memory that temporarily stores the data from the arithmetic logic means and is connected to the feedback loop; scaling means for scaling the data from the guard-bit adding means; selection means for selectively transferring the data from the scaling means or the data from the arithmetic logic means to the memory; rounding/saturation processing means for saturating and rounding the data from the memory; and rounding means for rounding the data from the memory.

Other objects and advantages of the present invention, in addition to the above objects, will become apparent from the detailed description of the embodiment with reference to the accompanying drawings.

An embodiment of the present invention is described in detail below with reference to the accompanying FIGS. 2 and 3.

Referring to FIG. 2, a DSP according to an embodiment of the present invention is shown, comprising a first register (30) for inputting N-bit data from a first external bus (31), and first and second bit aligners (32, 34) connected in parallel to the first register (30). The first external bus (31) consists of N data lines commonly connected to a working memory (not shown) and is used as an N-bit first read data bus. The first register (30) temporarily holds the N-bit data received from the memory over the first external bus (31). The first and second bit aligners (32, 34) shift the N-bit data from the first register (30) and thereby extend it from N bits to N+r bits. In detail, the first bit aligner (32) shifts the N-bit data left by r bits for fixed-point arithmetic. For this purpose, the first bit aligner (32) has wiring that connects the N output lines of the first register (30) to the upper N input terminals of the first input port of the first multiplexer (36), which consists of N+r terminals. The second bit aligner (34) shifts the N-bit data right by r bits for integer arithmetic. The upper r bits are then filled either with sign bits (positive or negative) or with zeros, depending on the sign-extension mode of the DSP. For this purpose, the second bit aligner (34) has wiring that connects the N output lines of the first register (30) to the lower N input terminals of the second input port of the first multiplexer (36), which consists of N+r terminals; the upper r bits are filled with the sign bit or with 0, as noted above. Depending on the type of operation, that is, fixed-point or integer, the first multiplexer (36) supplies either the N+r-bit data on its first input port or the N+r-bit data on its second input port to the first guard-bit adder (38). The first guard-bit adder (38) adds g guard bits to the N+r-bit data from the first multiplexer (36). The g guard bits are either all set to 0 or extend the sign of the data. In detail, when the sign-extension mode is set, all g guard bits take the same logic value as the sign bit of the data; when the sign-extension mode is reset, they all take the logic value 0. The sign-extension mode is set or reset by a specific instruction. To add the g guard bits to the N+r-bit data, the first guard-bit adder (38) has wiring that connects the output port of the first multiplexer (36), which consists of N+r terminals, to the lower N+r lines of the g+N+r lines that make up the first internal bus (35).
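The input alignment just described (placing the N-bit word on the upper or lower N of N+r lines, then adding g guard bits above it) can be modelled in a few lines of Python. This is a behavioural sketch only: in the patent the aligners are pure wiring, and the function name, argument names and the small bit widths used in the usage note are assumptions chosen for illustration.

```python
def align_input(word, n_bits, r_bits, g_bits, fixed_point=True, sign_extend=True):
    """Expand an n_bits word to g_bits + n_bits + r_bits.

    Returns the result as an unsigned bit pattern (a non-negative int).
    """
    word &= (1 << n_bits) - 1
    sign = (word >> (n_bits - 1)) & 1
    fill = sign if sign_extend else 0
    if fixed_point:
        # First bit aligner: data sits on the upper N of the N+r lines,
        # so the r low-order (rounding) bits are zero.
        body = word << r_bits
    else:
        # Second bit aligner: data sits on the lower N lines; the upper
        # r bits carry the sign (sign-extension mode) or zeros.
        upper = ((1 << r_bits) - 1) if fill else 0
        body = (upper << n_bits) | word
    # Guard-bit adder: g bits above the N+r-bit body, sign copies or zeros.
    guard = ((1 << g_bits) - 1) if fill else 0
    return (guard << (n_bits + r_bits)) | body
```

For example, with N=4, r=2 and g=2, the negative word `0b1010` becomes `0b11101000` on the fixed-point path and `0b11111010` on the integer path when sign extension is enabled.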

The DSP further includes a second register (40) and a multiplier (46) connected in series to the first external bus (31), and a second multiplexer (42) connected to the first and second external buses (31, 33). Like the first register (30), the second register (40) temporarily stores N-bit data received from the memory over the first external bus (31). The second external bus (33) consists of N lines commonly connected to a program memory and a CPU (not shown) and is used as a second read data bus. The second multiplexer (42) supplies to the third register (44) either the N-bit data from the first external bus (31) on its first input port or the N-bit data from the second external bus (33) on its second input port. The third register (44) temporarily stores the N-bit data from the second multiplexer (42). The multiplier (46) multiplies the two data words stored in the second and third registers (40, 44). The product can be 2N bits long, so the fourth register (48), which temporarily stores the data from the multiplier (46), is 2N bits long. A wiring adjuster (50) connected between the fourth register (48) and the second internal bus (37) converts the 2N-bit data stored in the fourth register (48) into N+r-bit data, where N+r is smaller than 2N, and adds g guard bits to the converted N+r-bit data.

The DSP also includes third to fifth multiplexers (52 to 56) commonly connected to the first internal bus (35). The third multiplexer (52) has first to third input ports that receive g+N+r-bit data from the first, second and fourth internal buses (35, 37, 43), respectively, and supplies one of these three data words to the first ALU (58). The fourth multiplexer (54) has a first input port (39) that receives a logic value of 0, and second and third input ports that receive g+N+r-bit data from the first and third internal buses (35, 41), respectively. The fourth multiplexer (54) supplies one of the three data words on its first to third input ports to the first ALU (58). The first ALU (58) operates on the two data words from the third and fourth multiplexers (52, 54) and supplies the result to the sixth multiplexer (62). The fifth multiplexer (56) selectively supplies either the g+N+r-bit data from the first internal bus (35) or the g+N+r-bit data from the third internal bus (41) to the barrel shifter (60). The barrel shifter (60) scales the value of the data from the fifth multiplexer (56) and supplies the scaled data to the sixth multiplexer (62). To scale the data, the barrel shifter (60) shifts the data from the fifth multiplexer (56) left or right by the number of bits corresponding to the scaling amount. Because the barrel shifter (60) is connected in parallel with the first ALU (58), the propagation delay of the data is minimized. The DSP can therefore perform arithmetic operations, scaling, and arithmetic operations that include scaling at high speed. The sixth multiplexer (62) selectively supplies either the g+N+r-bit arithmetic result from the first ALU (58) or the scaled data from the barrel shifter (60) to the fifth or sixth register (64 or 66). The data stored in the fifth or sixth register (64 or 66) is supplied to the third internal bus (41). The fifth and sixth registers (64, 66) are accumulation registers and, together with the first ALU (58), form a first accumulator. Both are g+N+r bits long so that they can temporarily store g+N+r-bit data, and the third internal bus (41) likewise consists of g+N+r lines.

The DSP also includes a seventh multiplexer (68) connected to the first internal bus (35) and the fourth internal bus (43), and an eighth multiplexer (70) connected to the first and second internal buses (35, 37). The seventh multiplexer (68) has a first input port (45) that receives a logic value of 0, and second and third input ports that receive g+N+r-bit data from the first and fourth internal buses (35, 43), respectively. The seventh multiplexer (68) supplies one of the three data words on its first to third input ports to the second ALU (72). The eighth multiplexer (70) has first and second input ports that receive g+N+r-bit data from the first and second internal buses (35, 37), respectively, and supplies one of these two data words to the second ALU (72). The second ALU (72) operates on the two data words from the seventh and eighth multiplexers (68, 70) and supplies the result to the ninth multiplexer (74). The ninth multiplexer (74) selectively supplies either the g+N+r-bit result from the second ALU (72) or the scaled data from the barrel shifter (60) to the seventh or eighth register (76 or 78). The seventh or eighth register (76 or 78) supplies the data from the ninth multiplexer (74) to the fourth internal bus (43). The seventh and eighth registers (76, 78) are accumulation registers and, together with the second ALU (72), form a second accumulator. The second accumulator is connected in parallel with the first accumulator so that two or more compound expressions can be evaluated in parallel. Both the seventh and eighth registers (76, 78) are g+N+r bits long so that they can temporarily store g+N+r-bit data. Together with the fifth and sixth registers (64, 66), the seventh and eighth registers (76, 78) allow several compound expressions to be evaluated quickly.

The DSP further includes a tenth multiplexer (80) for selecting between the two g+N+r-bit data words from the third and fourth internal buses (41, 43), together with third and fourth bit aligners (82, 84) and a rounding/saturation processor (86) that are commonly connected to the tenth multiplexer (80). The tenth multiplexer (80) supplies either the g+N+r-bit data from the fifth or sixth register (64, 66) via the third internal bus (41) or the g+N+r-bit data from the seventh or eighth register (76, 78) via the fourth internal bus (43) to the third and fourth bit aligners (82, 84) and to the rounding/saturation processor (86) in common. The third and fourth bit aligners (82, 84) each extract only N bits from the g+N+r-bit data from the tenth multiplexer (80) and supply the extracted N-bit data to the first and second input ports of the eleventh multiplexer (88), respectively. In detail, the third bit aligner (82) extracts N bits starting from the (g+1)-th most significant bit of the g+N+r-bit data from the tenth multiplexer (80) and supplies them to the first input port of the eleventh multiplexer (88). For this purpose, the third bit aligner (82) has wiring that connects N of the g+N+r output terminals of the tenth multiplexer (80), namely those remaining after the upper g terminals and the lower r terminals are excluded, to the N terminals of the first input port of the eleventh multiplexer (88). The third bit aligner (82) thus converts g+N+r-bit data into N-bit data by wiring alone. The fourth bit aligner (84) extracts only the lower N bits of the g+N+r-bit data from the tenth multiplexer (80) and supplies them to the second input port of the eleventh multiplexer (88). For this purpose, the fourth bit aligner (84) has wiring that connects the lower N of the g+N+r output terminals of the tenth multiplexer (80), that is, all terminals except the upper g+r, to the N terminals of the second input port of the eleventh multiplexer (88). Like the first to third bit aligners (32, 34, 82), the fourth bit aligner (84) is implemented by wiring alone and requires no separate circuit block. The first to fourth bit aligners (32, 34, 82, 84) therefore simplify the circuit configuration of the DSP and allow both fixed-point and integer operations to be performed at high speed.
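The two output selections described above amount to simple bit-field extraction from the g+N+r-bit word. A minimal Python sketch follows; the function names are hypothetical, and the model treats the accumulator contents as an unsigned bit pattern, whereas the patent implements both aligners purely as wiring.

```python
def extract_fixed_point(acc, n_bits, r_bits):
    """Third bit aligner: skip the lower r bits and keep the next N bits.

    The upper g guard bits are discarded by the N-bit mask.
    """
    return (acc >> r_bits) & ((1 << n_bits) - 1)


def extract_integer(acc, n_bits):
    """Fourth bit aligner: keep only the lower N bits (upper g + r dropped)."""
    return acc & ((1 << n_bits) - 1)
```

With N=4, r=2 and g=2, `extract_fixed_point(0b11101000, 4, 2)` and `extract_integer(0b11111010, 4)` both return `0b1010`. Neither path checks for overflow; that is left to the rounding/saturation processor (86).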

The rounding/saturation processor 86 operates in a rounding mode, a saturation mode, or a merge mode in response to a command from a controller (not shown). In the rounding mode, the rounding/saturation processor 86 selects N bits starting from the (r+1)-th least significant bit of the g+N+r-bit data from the tenth multiplexer 80 and supplies them to the third input port of the eleventh multiplexer 88. In the saturation mode, the rounding/saturation processor 86 processes the g+N+r-bit data from the tenth multiplexer 80 according to the logic values of its upper g+1 bits. Specifically, it determines whether an overflow has occurred by checking whether the upper g+1 bits (i.e., the g guard bits and the one sign bit) all have the same logic value. If the upper g+1 bits all have the same logic value, the rounding/saturation processor 86 supplies the N bits that remain after the upper g bits and the lower r bits are excluded to the third input port of the eleventh multiplexer 88 as the operation result. Conversely, if the upper g+1 bits do not all have the same logic value, the rounding/saturation processor 86 determines whether the most significant of the g guard bits is 0 or 1. If the most significant guard bit is 0, the rounding/saturation processor 86 regards the data from the tenth multiplexer 80 as positive (+) data in which an overflow has occurred and supplies the maximum-value N-bit data, in which only the most significant bit is 0 (i.e., 011…11), to the third input port of the eleventh multiplexer 88. If the most significant guard bit is 1, the rounding/saturation processor 86 regards the data from the tenth multiplexer 80 as negative (-) data in which an overflow has occurred and supplies N-bit data in which only the most significant bit is 1 (i.e., 100…00) to the third input port of the eleventh multiplexer 88. In this way, the rounding/saturation processor 86 correctly handles saturated values (i.e., values in which an overflow has occurred) of the data computed by the ALU 58 or 72, based on the logic values of the guard bits and the sign bit. Finally, in the merge mode, the rounding/saturation processor 86 performs the rounding described above on the g+N+r-bit data from the tenth multiplexer 80 and then performs saturation processing on the rounded g+N-bit data. The N-bit data thus rounded and saturated is supplied to the eleventh multiplexer 88. In this case, the rounding/saturation processor 86 rounds the lower bits first, but it performs the rounding and the saturation processing within a single clock cycle, so no separate rounding time is consumed. Accordingly, the rounding/saturation processor 86 offers the advantage of rounding fixed-point results without consuming extra time. The eleventh multiplexer 88 transfers one of the three items of N-bit data from the third and fourth bit aligners 82 and 84 and the rounding/saturation processor 86 to the third external bus 47. The third external bus 47 consists of N lines that form a write data bus and a write address bus.
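
The saturation decision and the merge mode described above can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the circuit of element 86: the function names are hypothetical, words are treated as unsigned bit patterns holding two's-complement values, and the rounding rule (add half of the discarded range, then truncate) is an assumption, since the text does not spell out the rounding arithmetic.

```python
def saturate(word: int, g: int, n: int, r: int) -> int:
    """Saturation mode on a (g+n+r)-bit two's-complement bit pattern."""
    assert 0 <= word < 1 << (g + n + r)
    top = word >> (n + r - 1)              # upper g+1 bits: guard bits plus sign bit
    if top == 0 or top == (1 << (g + 1)) - 1:
        # All g+1 bits agree: no overflow, pass the middle N bits through.
        return (word >> r) & ((1 << n) - 1)
    if (word >> (g + n + r - 1)) & 1 == 0:
        return (1 << (n - 1)) - 1          # positive overflow: 011...11
    return 1 << (n - 1)                    # negative overflow: 100...00


def round_and_saturate(word: int, g: int, n: int, r: int) -> int:
    """Merge mode: round away the r low bits, then saturate (requires r >= 1)."""
    width = g + n + r
    value = word - (1 << width) if word >> (width - 1) else word  # to signed
    rounded = (value + (1 << (r - 1))) >> r    # ASSUMPTION: round half up
    # Re-encode with one extra guard bit so a carry out of rounding cannot wrap.
    pattern = rounded & ((1 << (g + 1 + n)) - 1)
    return saturate(pattern, g + 1, n, 0)
```

With the assumed widths g = 2, N = 8, r = 2, the pattern `0b01_00000000_00` has unequal upper bits and a leading 0, so `saturate` returns `0x7F`; the pattern `0b00_01111111_10` passes through `saturate` as `0x7F` but also yields `0x7F` from `round_and_saturate`, because rounding it up would exceed the N-bit range.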

FIG. 3 shows the portion of FIG. 2 in which data rounding is performed. In FIG. 3, the g+N+r-bit first data D1 is computed by the ALU 58 or 72 and then temporarily stored in one of the accumulation registers 64, 66, 76, and 78. In the first data D1, g denotes the guard bits, N the source data from a memory (not shown), and r the rounding bits. The third bit aligner 82 converts the first data D1 into N-bit second data D2 through the wiring described with reference to FIG. 2 and therefore consumes no separate processing time. The rounding/saturation processor 86 rounds the first data D1 as described with reference to FIG. 2 and then saturates the rounded data, thereby converting it into N-bit second data D2. The eleventh multiplexer 88 selects either the output data of the rounding/saturation processor 86 or the output data of the third bit aligner 82 according to whether the first data D1 needs to be saturated, that is, whether an overflow has occurred in the first data D1. Specifically, when the first data D1 needs to be saturated, that is, when an overflow has occurred in the first data D1, the eleventh multiplexer 88 selects the output data of the rounding/saturation processor 86. Conversely, when the first data D1 does not need to be saturated, that is, when no overflow has occurred in the first data D1, the eleventh multiplexer 88 selects the output data of the third bit aligner 82. With the rounding/saturation processor 86, the third bit aligner 82, and the eleventh multiplexer 88 arranged in this way, the DSP spends no separate time (i.e., clock cycles) on rounding and consequently processes data at high speed.
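
The selection made by the eleventh multiplexer 88 in FIG. 3 can be illustrated with a short Python sketch. It is a simplified model, not the actual circuit: the name `select_output` is hypothetical, D1 is taken as an unsigned bit pattern, and the processor output on the overflow path is reduced to the two clamp values described for the saturation mode.

```python
def select_output(d1: int, g: int, n: int, r: int) -> int:
    """Model of multiplexer 88: pick the aligner tap or the saturated value."""
    assert 0 <= d1 < 1 << (g + n + r)
    top = d1 >> (n + r - 1)                         # g guard bits plus sign bit
    overflow = top not in (0, (1 << (g + 1)) - 1)   # bits disagree -> overflow

    aligner_out = (d1 >> r) & ((1 << n) - 1)        # wiring-only N-bit tap
    if not overflow:
        return aligner_out

    # Overflow path: the clamp depends on the most significant guard bit.
    leading_guard = (d1 >> (g + n + r - 1)) & 1
    return (1 << (n - 1)) - 1 if leading_guard == 0 else 1 << (n - 1)
```

Both candidate outputs depend only on D1, which is consistent with the text's point that choosing between them adds no extra clock cycle.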

As described above, the DSP of the present invention adjusts the number of bits (i.e., the word length) of the data supplied to the ALU by means of wiring, which allows the ALU and the accumulation registers to be made shorter. In addition, because the DSP according to the present invention rounds data at the same time as it saturates the data, it eliminates the time otherwise required for rounding. Accordingly, the DSP according to the present invention can speed up signal processing.

In addition, in the DSP according to the present invention, bit aligners implemented as wiring are placed before and after the accumulator, so that fixed-point and integer operations are performed quickly and the circuit configuration is simplified.

Furthermore, by connecting the barrel shifter in parallel with the ALU, the DSP according to the present invention can minimize the propagation delay and reduce the number of clock cycles required for an operation. Accordingly, the DSP according to the present invention can perform arithmetic operations, scaling, and operations that include scaling at high speed.

By computing two compound expressions in parallel on a pair of ALUs connected in parallel, the DSP according to the present invention can evaluate a large number of compound expressions at high speed.

From the foregoing description, those skilled in the art will appreciate that various changes and modifications can be made without departing from the technical spirit of the present invention. Therefore, the technical scope of the present invention is not limited to what is described in the embodiments but must be defined by the claims.

Claims

A digital signal processor comprising:

data input means for inputting N-bit data;

rounding bit adding means for adding r rounding bits, r being smaller than N, to the N-bit data from the data input means;

guard bit adding means for adding g guard bits to the upper bits of the data from the rounding bit adding means;

arithmetic means for operating on the data from the guard bit adding means;

rounding/saturation processing means for performing saturation processing and rounding on the data from the arithmetic means; and

rounding means for rounding the data from the arithmetic means.

The digital signal processor of claim 1,

wherein the rounding/saturation processing means selectively performs either the saturation processing alone or the combined rounding and saturation processing on the data from the arithmetic means.

The digital signal processor of claim 2,

wherein, during the saturation processing, the rounding/saturation processing means processes the data differently according to the logic values of the upper g+1 bits of the data from the arithmetic means.

The digital signal processor of claim 3,

wherein, when the logic values of the upper g+1 bits of the data from the arithmetic means differ from one another, the rounding/saturation processing means selectively generates N-bit data in which only the most significant bit has a logic value of 0 or N-bit data in which only the most significant bit has a logic value of 1.

The digital signal processor of claim 4,

wherein the rounding/saturation processing means generates N-bit data in which only the most significant bit has a logic value of 1 when the logic value of the most significant bit of the data from the arithmetic means is 1, and generates N-bit data in which only the most significant bit has a logic value of 0 when the logic value of that most significant bit is 0.

The digital signal processor according to any one of claims 1 to 5,

wherein the rounding means is implemented by wiring.

The digital signal processor according to any one of claims 1 to 5,

wherein the guard bit adding means is implemented by wiring.

A digital signal processor comprising:

data input means for inputting N-bit data;

arithmetic logic means for operating on the data from the guard bit adding means and data from a feedback loop;

a memory, connected to the feedback loop, for temporarily storing the data from the arithmetic logic means;

scaling means for scaling the data from the guard bit adding means;

selecting means for selectively transferring the data from the scaling means and the data from the arithmetic logic means to the memory;

rounding/saturation processing means for saturating and rounding the data from the memory; and

rounding means for rounding the data from the memory.

The digital signal processor of claim 8,

wherein the rounding/saturation processing means selectively performs either the saturation processing alone or the combined rounding and saturation processing on the data from the arithmetic means.

The digital signal processor of claim 9,

The digital signal processor of claim 10,

The digital signal processor according to any one of claims 8 to 11,

wherein the rounding means is implemented by wiring.

The digital signal processor according to any one of claims 8 to 11,

wherein the rounding bit adding means is implemented by wiring.