KR19990061575A

KR19990061575A - Method and apparatus for reducing instruction length in a central processing unit

Info

Publication number: KR19990061575A
Application number: KR1019970081848A
Authority: KR
Inventors: 이윤태
Original assignee: 윤종용; 삼성전자 주식회사
Priority date: 1997-12-31
Filing date: 1997-12-31
Publication date: 1999-07-26

Abstract

The present invention provides a method and apparatus for reducing instruction length in a central processing unit, so that the performance of the central processing unit can be maintained efficiently by shortening its instructions. To this end, the instruction length reduction method of the present invention, in an instruction processing method of a central processing unit that fetches N-bit instructions stored in an external storage means in units of M bits and executes them, comprises: a step of storing a plurality of N-bit instructions inside the central processing unit; an input step in which a programmer enters instructions as a program; a reading step of reading out the internally stored instructions using K (N/K)-bit compressed instructions fetched from the external storage means in accordance with the program; and a decoding step of decoding the instructions read in the reading step into executable instruction codes.
The instruction length reduction apparatus of the present invention, in a central processing unit that fetches N-bit instructions stored in an external storage means in units of M bits and executes them, comprises reduction means that stores internally a plurality of N-bit instructions used during execution by the central processing unit, reads out the internally stored instructions using (N/K)-bit compressed instructions fetched from the external storage means, and decodes them into executable instruction codes for output.

Description

Method and apparatus for reducing instruction length of central processing unit

The present invention relates to a central processing unit and, more particularly, to a method and apparatus for reducing the length of instructions in a central processing unit.

In general, in a 32-bit central processing unit whose instructions are 32 bits long, the data bus is 32 bits wide. When such a 32-bit central processing unit is used in a system whose external memory is 8 or 16 bits wide, four or two external memory access cycles, respectively, are needed to fetch a single 32-bit instruction.
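The cycle counts above follow from dividing the instruction length by the memory width. The following is a minimal sketch of that arithmetic; the function name `fetch_cycles` is hypothetical and is not taken from the specification, and one memory access per cycle is assumed.

```python
def fetch_cycles(instruction_bits: int, memory_bits: int) -> int:
    """External memory access cycles needed to fetch one instruction.

    Assumes one access transfers exactly `memory_bits` bits.
    """
    if instruction_bits <= 0 or memory_bits <= 0:
        raise ValueError("widths must be positive")
    # Ceiling division: a partly used memory word still costs a full access.
    return -(-instruction_bits // memory_bits)


for width in (8, 16, 32):
    print(f"{width}-bit memory: {fetch_cycles(32, width)} cycle(s) per 32-bit instruction")
```

The same function applies to shorter instructions, for example `fetch_cycles(16, 8)` for a 16-bit instruction on an 8-bit memory.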

Consequently, compared with a 32-bit-wide external memory, the more frequent instruction fetch cycles ultimately degrade the performance of the central processing unit, because the central processing unit stalls while an instruction is being fetched.

Two approaches are used to address this problem. The first is to keep the instruction length at 16 bits wherever possible, even though the central processing unit is capable of processing 32-bit instructions. The SH-series central processing units from Hitachi of Japan take this approach. Because it runs counter to the general principle that longer instructions allow the functions of a central processing unit to be defined more powerfully, its performance may be lower than that of a 32-bit central processing unit with 32-bit instructions.

The second is to define 16-bit instructions on the assumption that the architecture of the 32-bit central processing unit is not fully used; each 16-bit instruction is converted into a 32-bit instruction inside the central processing unit and then executed.

Here, not fully using the 32-bit central processing unit means that the 16-bit instructions are built under restrictions such as allowing only two internal registers to be specified in one instruction, whereas a 32-bit instruction can specify three, or limiting the immediate value field from 12 bits to 8 bits. ARM's Thumb central processing unit is an example of this approach, and its configuration is shown in FIG. 1.

That is, the Thumb central processing unit comprises a register (22), a first multiplexer (24), a Thumb instruction expander (26), a second multiplexer (28), a third multiplexer (32), and an instruction decoder (34).

The 32-bit register (22) receives instructions from the external memory map (10) via the data bus (D-BUS). The external memory map (10) consists of at least one ROM (10A) and at least one RAM (10B), and the instructions of the central processing unit are stored in the ROM (10A). The ROM (10A) and RAM (10B) may be 8, 16, or 32 bits wide.

The first multiplexer (24) alternately selects (toggles between) the lower 16 bits and the upper 16 bits of the 32-bit register (22) according to a selection control signal and outputs them. The Thumb central processing unit is designed so that normal 32-bit instruction codes (ARM code) and 16-bit Thumb instruction codes can be mixed.

The control signal (BX) shown in the figure is generated by the core of the central processing unit when the programmer enters in the program, and the unit executes, the special instruction BX, which toggles between ARM routines that process 32-bit instructions in normal mode (ARM mode) and Thumb routines that process 16-bit compressed instructions in Thumb mode. The signal is, for example, at logic level 0 while ARM instructions are executed and at logic level 1 while Thumb instructions are executed.

Accordingly, the first multiplexer (24) is disabled by the logic-0 control signal (BX) in normal mode (ARM mode), whereas in Thumb mode it is enabled by the logic-1 control signal (BX) and alternately selects and outputs the upper and lower 16 bits of the 32-bit register (22) according to a selection control signal (not shown).

In Thumb mode, the Thumb instruction expander (26) expands the 16-bit Thumb instruction received via the first multiplexer (24) into a 32-bit ARM instruction. For example, for the arithmetic instruction (ADD Rd, #Constant), the Thumb instruction is shown in FIG. 5(a), and the result of converting it into an ARM instruction is shown in FIG. 5(b).

That is, in the Thumb instruction shown in FIG. 5(a), the lower 8 bits are an immediate value, the 2-bit Rd designates the source (or destination) register, the value 10 in bits 10 and 11 is the minor operation code indicating the ADD instruction, and the value 1 in the upper 3 bits is the major operation code indicating the format for ADD with an immediate value. When this Thumb instruction passes through the Thumb instruction expander (26), it is expanded into a 32-bit ARM instruction as shown in FIG. 5(b). In the converted 32-bit ARM code, the upper 4 bits, 1110, are a condition code that is always fixed.
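The expansion described above can be sketched as a bit-field rearrangement. The sketch below is illustrative only: the function name is hypothetical, and the field positions (major opcode in bits 15-13, minor opcode in bits 12-11, a 3-bit Rd in bits 10-8, immediate in bits 7-0) are assumed from the publicly documented Thumb "add immediate" format, which may not match the bit numbering of FIG. 5 exactly. The ARM result is assumed to be `ADDS Rd, Rd, #imm8` with the fixed upper bits 1110.

```python
ARM_COND_FIXED = 0b1110  # fixed upper 4 bits of the expanded ARM code


def expand_thumb_add_imm(thumb: int) -> int:
    """Expand a 16-bit Thumb `ADD Rd, #imm8` into a 32-bit ARM instruction word."""
    if not 0 <= thumb <= 0xFFFF:
        raise ValueError("Thumb instruction must fit in 16 bits")
    major = (thumb >> 13) & 0b111  # assumed position of the major opcode
    minor = (thumb >> 11) & 0b11   # assumed position of the minor opcode
    if major != 0b001 or minor != 0b10:
        raise ValueError("not an ADD Rd, #imm8 instruction")
    rd = (thumb >> 8) & 0b111      # source and destination register
    imm8 = thumb & 0xFF            # 8-bit immediate value
    return (
        (ARM_COND_FIXED << 28)
        | (1 << 25)       # immediate-operand bit
        | (0b0100 << 21)  # ARM data-processing opcode for ADD
        | (1 << 20)       # bit 20 set for an expanded Thumb instruction
        | (rd << 16)      # first operand register (same as Rd)
        | (rd << 12)      # destination register
        | imm8            # Operand 2: immediate with zero shift
    )


print(hex(expand_thumb_add_imm(0x3105)))  # Thumb ADD R1, #5
```

Only this one instruction format is handled; a complete expander would need one such mapping per Thumb format.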

The second and third multiplexers (28, 32) select and output either the 32-bit instruction from the register (22) or the 32-bit expanded instruction from the Thumb instruction expander (26), according to a control signal (S).

The control signal (S) is generated according to the instruction status code assigned to a particular field of the instruction code, that is, according to whether the current instruction is an expanded Thumb instruction or an ARM instruction. It is, for example, at logic level 1 for an expanded Thumb instruction and at logic level 0 for an ARM instruction. Accordingly, the second and third multiplexers (28, 32) select the output of the Thumb instruction expander (26) when the control signal (S) is at logic level 1, and select the 32-bit ARM instruction from the register (22) when the control signal (S) is at logic level 0.

As an example of an ARM instruction, a data processing instruction is shown in FIG. 4. As shown in FIG. 4(a), the upper 4 bits are the condition field of the instruction, bit 25 indicates an immediate operand, and bit 20 (S) is the set-condition code, which is set to 0 for an ARM instruction and 1 for a Thumb instruction. The 4-bit Rn and Rd fields designate the first operand register and the destination register, respectively, and the 12-bit Operand 2 field designates the second operand. Operand 2 consists either of a lower 4 bits designating the second operand register and an upper 8 bits specifying the shift applied to that operand, or of a lower 8 bits holding an immediate value and an upper 4 bits specifying the shift applied to that immediate value. The lower part of FIG. 4(a) shows how the instruction is divided when it is stored in 8-bit-wide, 16-bit-wide, and 32-bit-wide memory. FIG. 4(b) shows, as an example of an arithmetic instruction, the Thumb instruction for "ADD R1 ← R2, R3" after expansion by the Thumb instruction expander (26), with bit 20 set to 1.
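The field layout described above can be made concrete with a small decoder. This is a sketch under assumptions: the names `DataProcessing`, `decode_data_processing`, and `split_operand2` are hypothetical, and the positions of the opcode (bits 24-21), Rn (bits 19-16), and Rd (bits 15-12) are assumed from the published ARM data-processing format rather than stated in the text.

```python
from typing import Dict, NamedTuple


class DataProcessing(NamedTuple):
    cond: int        # upper 4 bits: condition field
    immediate: bool  # bit 25: Operand 2 is an immediate value
    opcode: int      # bits 24-21 (assumed position)
    s_bit: bool      # bit 20: set-condition code
    rn: int          # first operand register
    rd: int          # destination register
    operand2: int    # 12-bit second operand


def decode_data_processing(word: int) -> DataProcessing:
    """Split a 32-bit data processing instruction into its fields."""
    return DataProcessing(
        cond=(word >> 28) & 0xF,
        immediate=bool((word >> 25) & 1),
        opcode=(word >> 21) & 0xF,
        s_bit=bool((word >> 20) & 1),
        rn=(word >> 16) & 0xF,
        rd=(word >> 12) & 0xF,
        operand2=word & 0xFFF,
    )


def split_operand2(fields: DataProcessing) -> Dict[str, int]:
    """Interpret Operand 2 as register + 8-bit shift or immediate + 4-bit shift."""
    if fields.immediate:
        return {"shift": fields.operand2 >> 8, "imm8": fields.operand2 & 0xFF}
    return {"shift": fields.operand2 >> 4, "rm": fields.operand2 & 0xF}


fields = decode_data_processing(0xE0821003)  # a register-form ADD with Rn=2, Rd=1, Rm=3
print(fields, split_operand2(fields))
```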

The instruction decoder (34) uses a codebook implemented as a look-up table to decode the instruction received via the third multiplexer (32) into executable code and outputs it.

The instruction processing operation of the conventional Thumb central processing unit configured as above is described with reference to FIGS. 2 to 5. It is assumed that the instruction stream shown in FIG. 2 is entered as the instruction stream programmed by the programmer.

First, when the instruction stream is entered, the corresponding instruction data stored in the ROM (10A) of the external memory map (10) are read out and fetched sequentially into the register (22). When the ROM (10A) is 8 bits wide, the instruction data are stored as shown in FIG. 3(a). In this case, fetching the 32-bit ARM instruction 1 (A_INT 1) takes four fetch cycles to bring the corresponding instruction data (A_INT 1-1, A_INT 1-2, A_INT 1-3, A_INT 1-4) into the register (22), while fetching the 16-bit Thumb instruction 5 (T_INT 1) takes two fetch cycles to bring the corresponding instruction data (T_INT 1-1, T_INT 1-2) into the register (22).

When the ROM (10A) is 16 bits wide, the instruction data are stored as shown in FIG. 3(b). In this case, fetching the 32-bit ARM instruction 1 (A_INT 1) takes two fetch cycles to bring the corresponding instruction data (A_INT 1-1, A_INT 1-2) into the register (22), while fetching the 16-bit Thumb instruction 5 (T_INT 1) takes one fetch cycle to bring the corresponding instruction data (T_INT 1) into the register (22).

When the ROM (10A) is 32 bits wide, the instruction data are stored as shown in FIG. 3(c). In this case, fetching the 32-bit ARM instruction 1 (A_INT 1) takes one fetch cycle to bring the corresponding instruction data (A_INT 1) into the register (22), and fetching the 16-bit Thumb instruction 5 (T_INT 1) likewise takes one fetch cycle to bring the corresponding instruction data (T_INT 1) into the register (22).

Next, because the BX instruction (BX1) defines the subsequent instructions as normal-mode (ARM-mode) instructions, instructions 1 to 4 (A_INT 1 to A_INT 4) of ARM routine (1) are output through the register (22) and the second multiplexer (28), and are also input to the instruction decoder (34) via the third multiplexer (32) and decoded.

Then, because the BX instruction (BX2) defines the subsequent instructions as Thumb-mode instructions, the 16-bit instructions 5 to 7 (T_INT 1 to T_INT 3) of Thumb routine (1) are expanded into 32-bit instructions as they pass through the register (22), the first multiplexer (24), and the Thumb instruction expander (26); they are then output through the second multiplexer (28), and are also input to the instruction decoder (34) via the third multiplexer (32) and decoded.

ARM routine (2) and Thumb routine (2) are then processed by repeating the same operations as for ARM routine (1) and Thumb routine (1).
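The routing of the instruction stream through the conventional datapath can be traced with a short model. This is only a sketch: the stream encoding (pairs of a kind and a value), the path strings, and the function name `route_stream` are hypothetical, and each BX entry is assumed to name its target mode explicitly rather than toggle it.

```python
from typing import List, Tuple

ARM_PATH = "register(22) -> multiplexers(28, 32) -> decoder(34)"
THUMB_PATH = (
    "register(22) -> multiplexer(24) -> expander(26) "
    "-> multiplexers(28, 32) -> decoder(34)"
)


def route_stream(stream: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return (instruction, path) pairs; ("BX", mode) entries set the mode."""
    bx_level = 0  # assumed: logic 0 selects ARM mode, logic 1 selects Thumb mode
    routed = []
    for kind, value in stream:
        if kind == "BX":
            bx_level = 1 if value == "thumb" else 0
        else:
            routed.append((value, THUMB_PATH if bx_level else ARM_PATH))
    return routed


stream = [
    ("BX", "arm"), ("INT", "A_INT 1"), ("INT", "A_INT 2"),
    ("BX", "thumb"), ("INT", "T_INT 1"),
]
for name, path in route_stream(stream):
    print(f"{name}: {path}")
```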

The Thumb central processing unit described above is similar in concept to the SH-series central processing unit from Hitachi of Japan, but because it can drive hardware with a 32-bit architecture internally, it may perform somewhat better than the SH-series central processing unit, which restricts the architecture.

However, because Thumb instructions are fixed at 16 bits, even the Thumb central processing unit cannot fully deliver 32-bit data processing performance in a system with an 8-bit or 16-bit bus. The problem is especially pronounced in a system with an 8-bit bus, because fetching a 16-bit Thumb instruction requires two fetch cycles.

The present invention has been made to solve the above problem. Its object is to provide a method and apparatus for reducing instruction length in a central processing unit that can efficiently maintain the performance of the central processing unit by shortening the instructions it uses.

Another object of the present invention is to provide a method and apparatus for reducing instruction length in a central processing unit in which frequently used instructions are stored in advance on the chip of the central processing unit and pointers to them are used as instructions, thereby shortening the instructions.

To achieve the above objects, an instruction length reduction method of a central processing unit according to the present invention, in an instruction processing method of a central processing unit that fetches N-bit instructions stored in an external storage means in units of M bits and executes them, comprises: a step of storing a plurality of N-bit instructions inside the central processing unit; an input step in which a programmer enters instructions as a program; a reading step of reading out the internally stored instructions using K (N/K)-bit compressed instructions fetched from the external storage means in accordance with the program; and a decoding step of decoding the instructions read in the reading step into executable instruction codes.

Here, the reading step includes a holding step of temporarily holding the fetched N-bit instruction, and uses each of the K held (N/K)-bit compressed instructions as an address signal to read out sequentially the corresponding N-bit instructions stored in the storing step.
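The reading step amounts to a table lookup in which each compressed instruction is the address of a full-length instruction. The following is a minimal sketch assuming N = 32 and K = 4 by default; the function names are hypothetical, and the order in which the K fields are taken from the fetched word (lowest-order field first) is an assumption, since the text does not specify it.

```python
from typing import List, Sequence


def split_compressed(word: int, n_bits: int = 32, k: int = 4) -> List[int]:
    """Split one fetched N-bit word into K compressed instructions of N/K bits."""
    if k <= 0 or n_bits % k != 0:
        raise ValueError("N must be a positive multiple of K")
    field_bits = n_bits // k
    mask = (1 << field_bits) - 1
    # Assumption: the lowest-order field comes first in program order.
    return [(word >> (i * field_bits)) & mask for i in range(k)]


def read_out(word: int, internal_memory: Sequence[int],
             n_bits: int = 32, k: int = 4) -> List[int]:
    """Use each compressed instruction as an address into the internal memory."""
    return [internal_memory[address] for address in split_compressed(word, n_bits, k)]


# Hypothetical internal memory holding three full 32-bit instructions.
internal_memory = [0xE2911005, 0xE0821003, 0xE1A00000]
print([hex(w) for w in read_out(0x00020100, internal_memory)])
```

With (N/K)-bit compressed instructions, the internal memory can hold at most 2^(N/K) entries, which is 256 for the 8-bit case.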

An instruction length reduction method of a central processing unit according to another aspect of the present invention for achieving the above objects, in an instruction processing method of a central processing unit that fetches N-bit instructions stored in an external storage means in units of M bits and executes them, comprises: a first storing step of storing, in the external storage means, a plurality of N-bit instructions and (N/K)-bit compressed instructions, which are compressed forms of a plurality of N-bit instructions frequently used by the central processing unit; a second storing step of storing the frequently used N-bit instructions inside the central processing unit; an input step in which a programmer enters instructions as a program; a step of determining, according to the instruction sequence of the program, whether the N-bit instructions input from the external storage means are normal instructions or compressed instructions; a first decoding step of decoding the input N-bit instructions into executable instruction codes if they are normal instructions; a reading step of, if the N-bit instruction consists of K compressed instructions, using those compressed instructions to read out the corresponding N-bit instructions stored internally; and a second decoding step of sequentially decoding the read N-bit instructions into executable instruction codes.

Here, the method further comprises a holding step of temporarily holding the fetched N-bit instruction, and a transfer step of passing the N-bit instruction held in the holding step to either the first decoding step or the reading step according to the result of the determining step. The reading step uses the K (N/K)-bit compressed instructions passed on by the transfer step as address signals and sequentially reads out the K N-bit instructions stored at those addresses.

An instruction length reduction apparatus of a central processing unit according to the present invention for achieving the above objects, in a central processing unit that fetches N-bit instructions stored in an external storage means in units of M bits and executes them, comprises instruction reduction means that stores internally a plurality of N-bit instructions used during execution by the central processing unit, reads out the internally stored instructions using (N/K)-bit compressed instructions fetched from the external storage means, and decodes them into executable instruction codes for output.

Here, the instruction reduction means comprises: holding means for temporarily holding the fetched N-bit instruction; internal storage means that stores the N-bit instructions corresponding to the compressed instructions, receives each of the K compressed instructions held in the holding means as an address signal, and sequentially reads out the corresponding N-bit instructions; and instruction decoding means for sequentially decoding the N-bit instructions read from the internal storage means.

An instruction length reduction apparatus of a central processing unit according to another aspect of the present invention for achieving the above objects, in a central processing unit that fetches N-bit instructions stored in an external storage means in units of M bits and executes them, is configured as follows. The external storage means stores a plurality of N-bit instructions and (N/K)-bit compressed instructions, which are compressed forms of a plurality of N-bit instructions frequently used by the central processing unit. The apparatus comprises instruction reduction means that stores the frequently used N-bit instructions internally; decodes the N-bit instructions input from the external storage means according to the instruction sequence directly into executable instruction codes for output when they are normal instructions rather than compressed instructions; and, when the input N-bit instruction consists of K compressed instructions, uses those compressed instructions to read out the corresponding internally stored N-bit instructions and then decodes them into executable instruction codes for output.

Here, the instruction reduction means comprises: holding means for temporarily holding the fetched N-bit instruction; demultiplexing means for switching the N-bit instruction held in the holding means to either a first output terminal or a second output terminal according to a control signal; internal storage means that stores the frequently used N-bit instructions and, using the K compressed instructions applied through the first output terminal of the demultiplexing means as address signals, sequentially reads out the K N-bit instructions stored at those addresses; multiplexing means for selecting and outputting, according to a control signal, either the second output terminal of the demultiplexing means or the instruction read from the internal storage means; and instruction decoding means for decoding the N-bit instruction output from the multiplexing means.
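The interplay of the demultiplexing means, internal storage means, and multiplexing means can be modeled as a single routing function. This is a behavioral sketch only, assuming N = 32 and K = 4; the function name, the boolean mode flag standing in for the control signals, the `decode` callback, and the low-byte-first field order are all assumptions not found in the text.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def reduction_datapath(
    held_word: int,
    compressed_mode: bool,
    internal_memory: Sequence[int],
    decode: Callable[[int], T],
) -> List[T]:
    """Route one held 32-bit word through demultiplexer, memory, and multiplexer."""
    if compressed_mode:
        # First demultiplexer output: four 8-bit compressed instructions are
        # applied as addresses (assumed order: lowest byte first).
        addresses = [(held_word >> shift) & 0xFF for shift in (0, 8, 16, 24)]
        # The multiplexer selects the words read from the internal memory.
        selected = [internal_memory[address] for address in addresses]
    else:
        # Second demultiplexer output: the normal instruction bypasses the
        # internal memory and the multiplexer passes it on directly.
        selected = [held_word]
    return [decode(word) for word in selected]


internal_memory = [0xE2911005, 0xE0821003]
print(reduction_datapath(0xE1A00000, False, internal_memory, hex))
print(reduction_datapath(0x01000100, True, internal_memory, hex))
```

One fetched word therefore yields one decoded instruction in normal mode and four in compressed mode.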

FIG. 1 is a circuit block diagram showing an example of a conventional instruction length reduction apparatus of a central processing unit;

FIG. 2 illustrates the instruction processing flow in the instruction length reduction apparatus shown in FIG. 1;

FIG. 3 illustrates how instructions are stored for each memory width of the external memory map shown in FIG. 1;

FIG. 4 illustrates an example of a 32-bit instruction used in the central processing unit shown in FIG. 1;

FIG. 5 illustrates the operation of the Thumb instruction expander shown in FIG. 1;

FIG. 6 is a schematic circuit block diagram of an instruction length reduction apparatus of a central processing unit according to the present invention;

FIG. 7 is a circuit block diagram showing the main part of FIG. 6 in detail;

FIG. 8 shows an instruction processing flow illustrating an instruction length reduction method that can be employed in the instruction length reduction apparatus of the central processing unit shown in FIG. 6; and

FIG. 9 illustrates how instructions are stored for each memory width of the external memory map shown in FIG. 6 and how instructions are stored in the internal memory.

* Explanation of reference numerals for the main parts of the drawings

100: external memory map    100A: ROM

100B: RAM    200: central processing unit

220: register    240: demultiplexer

260: internal memory    280: multiplexer

300: instruction decoder

Hereinafter, the method and apparatus for reducing instruction length in a central processing unit according to the present invention are described in detail with reference to the accompanying drawings.

FIG. 6 is a schematic circuit block diagram of an instruction length reduction apparatus of a central processing unit according to an embodiment of the present invention.

As shown in the figure, the instruction length reduction apparatus of a central processing unit according to an embodiment of the present invention comprises a register (220), a demultiplexer (240), an internal memory (260), a multiplexer (280), and an instruction decoder (300).

The 32-bit register (220) receives instructions from the external memory map (100) via the data bus (D-BUS). The external memory map (100) consists of at least one ROM (100A), which is a nonvolatile semiconductor memory, and at least one RAM (100B), which is a volatile semiconductor memory, and the instructions of the central processing unit are stored in the ROM (100A). The ROM (100A) may be a ROM with a memory width of 8, 16, or 32 bits, as shown in FIGS. 9(a), 9(b), and 9(c).

The present invention provides two modes: a normal mode, which uses N-bit instructions (32 bits in this embodiment), and a compression mode, which uses N/4-bit instructions (8 bits in this embodiment) obtained by compressing the N-bit instructions by a predetermined factor (1/4 in this embodiment).

Accordingly, as shown in FIG. 9(a), the 8-bit-wide ROM 100A stores the n 32-bit instructions used in the normal mode with each instruction divided into four parts; for example, one 32-bit instruction is stored separately as NR_INT 1-1, NR_INT 1-2, NR_INT 1-3, and NR_INT 1-4. The n 8-bit compressed instructions used in the compression mode are stored in the order COM_INT 1, COM_INT 2, …, COM_INT n.

Likewise, as shown in FIG. 9(b), the 16-bit-wide ROM 100A stores the n 32-bit instructions used in the normal mode with each instruction divided into two parts; for example, one 32-bit instruction is stored separately as NR_INT 1-1 and NR_INT 1-2. The n 8-bit compressed instructions used in the compression mode are stored two at a time, for example in the order COM_INT 1, COM_INT 2; COM_INT 3, COM_INT 4; …; COM_INT n-1, COM_INT n.

Further, as shown in FIG. 9(c), the 32-bit-wide ROM 100A stores the n 32-bit instructions used in the normal mode in the order, for example, NR_INT 1, NR_INT 2, …, NR_INT n. The n 8-bit compressed instructions used in the compression mode are stored four at a time, for example in the order COM_INT 1, COM_INT 2, COM_INT 3, COM_INT 4; …; COM_INT n-3, COM_INT n-2, COM_INT n-1, COM_INT n.
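
The three ROM layouts of FIG. 9(a) to 9(c) can be illustrated with a minimal Python sketch. The function names are hypothetical. The order in which the parts of an instruction are stored (most-significant part first) is an assumption, since the text does not state it.

```python
def split_for_rom(instruction: int, rom_width_bits: int) -> list[int]:
    """Split one 32-bit normal-mode instruction into ROM words of the given width."""
    if rom_width_bits not in (8, 16, 32):
        raise ValueError("ROM width must be 8, 16, or 32 bits")
    parts = 32 // rom_width_bits  # 4, 2, or 1 ROM words per instruction
    mask = (1 << rom_width_bits) - 1
    # Assumption: the most-significant part is stored first (as NR_INT 1-1).
    return [
        (instruction >> (rom_width_bits * (parts - 1 - i))) & mask
        for i in range(parts)
    ]


def compressed_per_rom_word(rom_width_bits: int, compressed_bits: int = 8) -> int:
    """Number of compressed instructions that fit in one ROM word."""
    return rom_width_bits // compressed_bits
```

Under these assumptions, an 8-bit ROM holds one compressed instruction per word, a 16-bit ROM holds two, and a 32-bit ROM holds four, which matches the grouping of COM_INT entries described for each figure.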

The demultiplexer 240 outputs the 32-bit instruction held in the 32-bit register 220 to one of its two output terminals according to a control signal (Mode). The control signal (Mode) takes complementary levels for the normal mode and the compression mode, for example logic 0 in the normal mode and logic 1 in the compression mode. Accordingly, the demultiplexer 240 switches its output to the multiplexer 280, described later, in the normal mode, and to the internal memory 260, described later, in the compression mode.

The internal memory 260 is a storage means with a memory width of N bits (32 bits in this embodiment) provided inside the chip of the central processing unit. It stores the frequently used 32-bit instructions that correspond to the N/4-bit (for example, 8-bit) instructions of the compression mode. The storage means may be a nonvolatile semiconductor memory such as a ROM, EPROM, EEPROM, or flash memory, or a volatile semiconductor memory such as a RAM. When the storage means is a nonvolatile memory, the frequently used 32-bit instructions are embedded at manufacture; when it is a volatile memory, the original instructions may instead be kept in an external system memory and loaded at system boot.

For example, when the compression-mode instructions are 8 bits long, the internal memory 260 stores 256 frequently used 32-bit instructions, as shown in FIG. 9(d). Here, each 8-bit instruction of the compression mode is the storage address of the corresponding 32-bit instruction in the internal memory 260.
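
Because an 8-bit compressed instruction is simply an address into the internal memory, expanding it amounts to a table lookup. The following minimal sketch models this; the names `build_internal_memory` and `expand` are hypothetical, and filling unused entries with 0 is an assumption made only so that the table always has 256 entries.

```python
TABLE_SIZE = 256  # 2**8 entries, one per possible 8-bit compressed instruction


def build_internal_memory(frequent_instructions: list[int]) -> list[int]:
    """Place the frequently used 32-bit instructions at consecutive addresses."""
    if len(frequent_instructions) > TABLE_SIZE:
        raise ValueError("at most 256 instructions fit an 8-bit address space")
    table = [0] * TABLE_SIZE  # Assumption: unused entries hold 0.
    for address, instruction in enumerate(frequent_instructions):
        table[address] = instruction & 0xFFFFFFFF
    return table


def expand(compressed: int, table: list[int]) -> int:
    """Return the 32-bit instruction stored at the address given by the compressed instruction."""
    if not 0 <= compressed < TABLE_SIZE:
        raise ValueError("compressed instruction must fit in 8 bits")
    return table[compressed]
```

With a table built from the placeholder values `[0x11111111, 0x22222222]`, `expand(1, table)` returns `0x22222222`.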

When the compression-mode instruction is an address in this way, a single 32-bit field can carry four instructions at once if 256 frequently used instructions are defined, or eight instructions at once if 16 are defined. Code density can therefore be improved by a factor of about two or four over the SH-series central processing unit and the Thumb-mode central processing unit described in the prior-art example. The instruction fetch frequency is reduced correspondingly, and because the advantages of the 32-bit central processing unit architecture are used effectively, performance can be improved considerably.
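
The arithmetic behind these figures can be checked with a short sketch. It assumes that the compared prior-art encodings use 16-bit instructions and that every instruction in the code falls within the set of frequently used instructions; the function names are hypothetical.

```python
def compressed_bits(dictionary_size: int) -> int:
    """Bits needed to address `dictionary_size` stored instructions."""
    if dictionary_size < 1:
        raise ValueError("dictionary must hold at least one instruction")
    return max(1, (dictionary_size - 1).bit_length())


def instructions_per_field(dictionary_size: int, field_bits: int = 32) -> int:
    """Compressed instructions carried by one fetched field."""
    return field_bits // compressed_bits(dictionary_size)


def density_gain(dictionary_size: int, baseline_bits: int = 16) -> float:
    """Code density relative to a fixed-length baseline (assumed 16 bits)."""
    return baseline_bits / compressed_bits(dictionary_size)
```

For 256 stored instructions this gives 8-bit addresses, four instructions per 32-bit field, and a gain of 2 over a 16-bit encoding; for 16 stored instructions it gives 4-bit addresses, eight instructions per field, and a gain of 4.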

The multiplexer 280 selects and outputs, according to the control signal (Mode), either the 32-bit instruction output from the demultiplexer 240 or the 32-bit instruction read out from the internal memory 260. As before, the control signal (Mode) takes complementary levels for the two modes, for example logic 0 in the normal mode and logic 1 in the compression mode. Accordingly, the multiplexer 280 selects the 32-bit instruction from the demultiplexer 240 in the normal mode and the 32-bit instruction from the internal memory 260 in the compression mode.
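
The routing performed by the demultiplexer 240 and the multiplexer 280 can be modeled as two small functions driven by the same Mode signal. This is only a behavioral sketch; the function names are hypothetical, and `None` is used here to represent an output path that is not driven.

```python
from typing import Optional, Tuple

NORMAL, COMPRESSED = 0, 1  # Mode levels given as examples in the text


def demultiplex(word: int, mode: int) -> Tuple[Optional[int], Optional[int]]:
    """Route the register contents to exactly one of two outputs.

    Returns (to_multiplexer, to_internal_memory).
    """
    if mode == NORMAL:
        return word, None
    return None, word


def multiplex(from_demux: Optional[int], from_memory: Optional[int], mode: int) -> Optional[int]:
    """Select the decoder input: the direct path in normal mode, the memory path otherwise."""
    return from_demux if mode == NORMAL else from_memory
```

Both elements share one control signal, so in each mode exactly one path from the register 220 to the instruction decoder 300 is active.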

The instruction decoder 300 uses a codebook consisting of a look-up table to decode the 32-bit instruction received via the multiplexer 280 into executable code, and outputs the result.

Hereinafter, the instruction length reduction apparatus of the central processing unit according to the present invention, configured as described above, and its method will be described in detail with reference to FIGS. 7 to 9.

FIG. 7 is a circuit block diagram illustrating the main part of FIG. 6 in detail. FIG. 8 shows an instruction processing flow for explaining an instruction length reduction method that can be employed in the instruction length reduction apparatus of the central processing unit shown in FIG. 6. FIG. 9 illustrates how instructions are stored in the external memory map of FIG. 6 for each memory width, and how instructions are stored in the internal memory.

The following description takes as an example the case in which 256 frequently used instructions are defined. These 256 instructions are already stored in the internal memory 260, which serves as the storage means inside the central processing unit.

First, in the normal mode [when the control signal (Mode) is, for example, logic 0], a 32-bit instruction is fetched from the ROM 100A of the external memory map 100 into the register 220. The instruction in the register 220 is applied directly to the multiplexer 280 through the demultiplexer 240, is selected by the multiplexer 280, and then enters the input of the instruction decoder 300. The instruction is thus executed as in an ordinary 32-bit central processing unit.

Next, in the compression mode [when the control signal (Mode) is, for example, logic 1], a 32-bit word is fetched from the ROM 100A of the external memory map 100 into the register 220, and the contents of the register 220 are applied to the internal memory 260 through the demultiplexer 240. Because this 32-bit word consists of four 8-bit compressed instructions, each of which is an address of the internal memory 260, it is divided into 8-bit units as shown in FIG. 7 and applied to the internal memory 260 as addresses. The 32-bit instructions stored at the addresses accessed by the four compressed instructions are read out and input sequentially to the multiplexer 280. The multiplexer 280 then selects the 32-bit instruction supplied from the internal memory 260 and outputs it to the instruction decoder 300. Each instruction is thus executed as in an ordinary 32-bit central processing unit.
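
The compression-mode path can be sketched as follows: one fetched 32-bit word is split into four 8-bit addresses, and each address selects a 32-bit instruction from the internal memory. The function names are hypothetical, and the order in which the four bytes are used (most-significant byte first) is an assumption that the text does not confirm.

```python
def split_addresses(word: int) -> list[int]:
    """Divide a fetched 32-bit word into four 8-bit internal-memory addresses."""
    # Assumption: the most-significant byte is the first compressed instruction.
    return [(word >> shift) & 0xFF for shift in (24, 16, 8, 0)]


def expand_word(word: int, internal_memory: list[int]) -> list[int]:
    """Read, in sequence, the four 32-bit instructions addressed by one fetched word."""
    return [internal_memory[address] for address in split_addresses(word)]
```

A single external fetch therefore supplies the decoder with four full-width instructions, which is where the reduction in fetch frequency comes from.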

Thus, in the compression mode, a single 32-bit field allows four compressed instructions to be executed at a time.

For the central processing unit to operate smoothly, the assembler must first be able to distinguish normal-mode instructions from compression-mode instructions.

Accordingly, as shown in FIG. 8, when writing the program the programmer inserts a normal mode command (NR 1) at the start of the normal-mode instructions [instruction 1 (NR_INT 1), instruction 2 (NR_INT 2), instruction 3 (NR_INT 3), instruction 4 (NR_INT 4)], so that execution then enters the sequence of normal-mode instructions [that is, normal mode routine (1)]. The syntax "normal mode command (NR 1)" corresponds to a 32-bit instruction, generated by the assembler, that switches the unit to the normal mode. When this instruction is executed, the central processing unit changes to a state in which it can execute normal-mode instructions.

Similarly, a compression mode command (COM 1) is inserted at the start of the compression-mode instructions [compressed instruction 1 (COM_INT 1), compressed instruction 2 (COM_INT 2), compressed instruction 3 (COM_INT 3), compressed instruction 4 (COM_INT 4)], so that execution then enters the sequence of compression-mode instructions [that is, compression mode routine (1)]. The syntax "compression mode command (COM 1)" corresponds to a 32-bit instruction, generated by the assembler, that switches the unit from the normal mode to the compression mode. When this instruction is executed, the central processing unit changes to a state in which it can execute compression-mode instructions.

The normal mode command (NR 2) and its normal mode routine (2), and the compression mode command (COM 2) and its compression mode routine (2), which are executed next, proceed in the same manner as described above.
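
The alternation of normal and compression mode routines in FIG. 8 can be illustrated at the level of the assembler's view of the program. In this sketch the mode commands are represented as separate `("NR", None)` and `("COM", None)` items rather than as encoded 32-bit words, because the text does not give their encoding; the tuple format, the function name `run`, and the choice of normal mode as the starting state are all assumptions.

```python
def run(program: list[tuple], internal_memory: list[int]) -> list[int]:
    """Return the 32-bit instructions that reach the decoder, in order."""
    mode = "normal"  # Assumption: execution starts in normal mode.
    decoded = []
    for kind, value in program:
        if kind == "NR":        # normal mode command: switch to normal mode
            mode = "normal"
        elif kind == "COM":     # compression mode command: switch to compression mode
            mode = "compressed"
        elif mode == "normal":  # value is a full 32-bit instruction
            decoded.append(value)
        else:                   # value is an 8-bit address into the internal memory
            decoded.append(internal_memory[value])
    return decoded
```

A program such as `[("NR", None), ("OP", 0x11111111), ("COM", None), ("OP", 2), ("OP", 3)]` passes the first instruction through unchanged and replaces the last two with the contents of internal-memory addresses 2 and 3.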

Here, the number of instructions executed in a compression-mode sequence is preferably a multiple of four, which simplifies the hardware with respect to the alignment of the data field.
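
One way an assembler could satisfy this alignment preference is to pad each compression-mode routine to a multiple of four before packing it into 32-bit words. The sketch below is illustrative only: it assumes that some internal-memory address (here address 0, a hypothetical choice) holds a no-operation instruction, and that bytes are packed most-significant first. Neither point is stated in the text.

```python
def pad_to_multiple_of_four(addresses: list[int], nop_address: int = 0) -> list[int]:
    """Append no-op addresses until the routine length is a multiple of four."""
    padding = (4 - len(addresses) % 4) % 4
    return list(addresses) + [nop_address] * padding


def pack_words(addresses: list[int]) -> list[int]:
    """Pack 8-bit compressed instructions into 32-bit words, four per word."""
    padded = pad_to_multiple_of_four(addresses)
    words = []
    for i in range(0, len(padded), 4):
        b0, b1, b2, b3 = padded[i:i + 4]
        # Assumption: the first compressed instruction occupies the top byte.
        words.append((b0 << 24) | (b1 << 16) | (b2 << 8) | b3)
    return words
```

With this padding, every fetched word in a compression-mode routine contains exactly four valid addresses, so the hardware never has to handle a partially filled word.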

The present invention is not limited to the specific embodiment described above and can be modified and altered in various ways without departing from the gist of the invention.

For example, although the instruction length of the central processing unit was limited to 32 bits in the embodiment described above, those of ordinary skill in the art will readily understand that the present invention also applies to other instruction lengths, such as 16 bits or 64 bits.

Likewise, although the length of the compressed instruction was limited to 8 bits in the embodiment described above, it may of course vary with the length of the instructions used by the central processing unit and with the number of frequently used instructions that are defined.

Furthermore, although the embodiment described above reduces the length of a predetermined number of instructions that the central processing unit uses frequently, the present invention may, if necessary, reduce the length of all instructions used by the central processing unit.

As described above, according to an embodiment of the present invention, frequently used instructions are stored in the internal memory of the central processing unit, and a pointer (that is, an address) that selects a stored instruction is defined as the instruction itself, so that code density can be improved markedly.

That is, for example, four instructions can be defined at once in a 32-bit field, and when 16 instructions are defined, eight instructions can be defined at once in a 32-bit field. Code density can therefore be improved by a factor of about two or four over the SH-series central processing unit and the Thumb-mode central processing unit described in the prior-art example.

In addition to the improvement in code density, the present invention likewise reduces the instruction fetch frequency, and because the advantages of the central processing unit architecture are used effectively, performance can be improved considerably.

Moreover, the present invention can achieve better performance with a much simpler configuration than ARM's Thumb-mode central processing unit.

Claims

1. An instruction length reduction apparatus of a central processing unit that fetches N-bit instructions stored in an external storage means in units of M bits and executes the instructions,

characterized by comprising an instruction reduction means that stores therein a plurality of N-bit instructions used during execution of the central processing unit, reads out the internally stored instructions using (N/K)-bit compressed instructions fetched from the external storage means, and decodes them into executable instruction codes for output.

2. The apparatus of claim 1, wherein the instruction reduction means comprises: storing means for temporarily storing the fetched N-bit instruction;

internal storage means for storing the N-bit instructions corresponding to the compressed instructions, receiving the K compressed instructions held in the storing means as address signals, and sequentially reading out the corresponding N-bit instructions; and

instruction decoding means for sequentially decoding the N-bit instructions read out from the internal storage means.

3. The apparatus of claim 2, wherein the external storage means is a memory having an 8-bit memory width, the internal storage means is a memory having a 32-bit width,

and N is 32 and K is 4.

4. The apparatus of claim 2, wherein the external storage means is a memory having a 16-bit memory width, the internal storage means is a memory having a 32-bit width,

and N is 32 and K is 4.

5. The apparatus according to any one of claims 1 to 4, wherein the external storage means and the internal storage means are nonvolatile semiconductor memories.

6. The apparatus according to any one of claims 1 to 4, wherein the internal storage means is a volatile semiconductor memory,

and the instructions stored in the internal storage means are loaded from an external system memory at system boot.

7. The apparatus of claim 1 or 2, wherein N is a multiple of 8 and K is a multiple of 4.

8. An instruction length reduction apparatus of a central processing unit, wherein the external storage means stores a plurality of N-bit instructions and (N/K)-bit compressed instructions obtained by compressing a plurality of N-bit instructions frequently used in the central processing unit,

the apparatus comprising an instruction reduction means that stores the frequently used N-bit instructions therein; that, when the N-bit instruction input from the external storage means according to the instruction sequence is a normal instruction rather than compressed instructions, decodes it directly into an executable instruction code; and that, when the input N-bit instruction consists of K compressed instructions, reads out the corresponding internally stored N-bit instructions using the compressed instructions and then decodes them into executable instruction codes for output.

9. The apparatus of claim 8, wherein the instruction reduction means comprises: storing means for temporarily storing the fetched N-bit instruction;

demultiplexing means for switching the N-bit instruction held in the storing means to either a first output terminal or a second output terminal according to a control signal;

internal storage means for storing the frequently used N-bit instructions and for sequentially reading out the K N-bit instructions stored at the corresponding addresses, using the K compressed instructions applied through the first output terminal of the demultiplexing means as address signals;

multiplexing means for selectively outputting, according to the control signal, either the output of the second output terminal of the demultiplexing means or the instruction read out from the internal storage means; and

instruction decoding means for decoding the N-bit instruction output from the multiplexing means.

10. The apparatus of claim 9, wherein the control signal is generated by a mode setting instruction with which the programmer designates the normal instructions and the compressed instructions in the instruction sequence.

11. The apparatus of claim 9 or 10, wherein the external storage means is a memory having an 8-bit memory width, the internal storage means is a memory having a 32-bit width,

and N is 32 and K is 4.

12. The apparatus of claim 9 or 10, wherein the external storage means is a memory having a 16-bit memory width, the internal storage means is a memory having a 32-bit width,

and N is 32 and K is 4.

13. The apparatus of claim 9 or 10, wherein the external storage means and the internal storage means are nonvolatile semiconductor memories.

14. The apparatus of claim 9 or 10, wherein the internal storage means is a volatile semiconductor memory,

and the instructions stored in the internal storage means are loaded from an external system memory at system boot.

15. The apparatus according to any one of claims 8 to 10, wherein N is a multiple of 8 and K is a multiple of 4.

16. An instruction processing method of a central processing unit that fetches N-bit instructions stored in an external storage means in units of M bits and executes the instructions, the method comprising:

a step of storing a plurality of N-bit instructions inside the central processing unit;

an input step in which a programmer inputs a program of instructions;

a reading step of reading out the internally stored instructions using K (N/K)-bit compressed instructions fetched from the external storage means according to the program; and

a decoding step of decoding the instructions read out in the reading step into executable instruction codes.

17. The method of claim 16, wherein the reading step comprises a storing step of temporarily storing the fetched N-bit instruction, and a step of sequentially reading out the corresponding N-bit instructions stored in the central processing unit using the stored K (N/K)-bit compressed instructions as respective address signals.

18. The method of claim 16 or 17, wherein N is a multiple of 8 and K is a multiple of four.

19. An instruction length reduction method of a central processing unit, comprising:

a first storing step of storing, in the external storage means, a plurality of N-bit instructions and (N/K)-bit compressed instructions obtained by compressing a plurality of N-bit instructions frequently used in the central processing unit;

a second storing step of storing the frequently used N-bit instructions in the central processing unit;

an input step in which a programmer inputs a program of instructions;

a determining step of determining, according to the instruction sequence of the program, whether the N-bit instructions input from the external storage means are normal instructions or compressed instructions;

a first decoding step of decoding the input N-bit instructions into executable instruction codes when they are normal instructions;

a reading step of reading out the corresponding internally stored N-bit instructions using the compressed instructions when the input N-bit instruction consists of K compressed instructions; and

a second decoding step of sequentially decoding the read-out N-bit instructions into executable instruction codes.

20. The method of claim 19, further comprising: a storing step of temporarily storing the fetched N-bit instruction; and

a transferring step of transferring the N-bit instruction stored in the storing step to either the first decoding step or the reading step according to the result of the determining step,

wherein the reading step sequentially reads out the K N-bit instructions stored at the corresponding addresses, using the K (N/K)-bit compressed instructions transferred in the transferring step as address signals.

21. The method of claim 19 or 20, wherein the determination in the determining step is made by a mode setting instruction with which the programmer designates the normal instructions and the compressed instructions in the instruction sequence.

22. The method of claim 21, wherein the external storage means and the internal storage means are nonvolatile semiconductor memories.

23. The method of claim 21, wherein the internal storage means is a volatile semiconductor memory,

and the second storing step loads the N-bit instructions from an external system memory at system boot.

24. The method of claim 19 or 20, wherein N is a multiple of 8 and K is a multiple of 4.