WO1997048041A9 - An apparatus and method for detecting and decompressing instructions from a variable-length compressed instruction set - Google Patents

An apparatus and method for detecting and decompressing instructions from a variable-length compressed instruction set

Info

Publication number
WO1997048041A9
Authority
WO
WIPO (PCT)
Prior art keywords
instruction
compressed
field
instructions
register
Prior art date
Application number
PCT/US1997/009984
Other languages
French (fr)
Other versions
WO1997048041A1 (en)
Filing date
Publication date
Priority claimed from US08/661,003 (US5896519A)
Priority claimed from US08/659,709 (US5794010A)
Priority claimed from US08/659,708 (US5905893A)
Priority to JP10501756A (JP2000512409A)
Priority to AU34808/97A (AU3480897A)
Priority to GB9825726A (GB2329495B)
Publication of WO1997048041A1
Publication of WO1997048041A9

Definitions

  • TITLE AN APPARATUS AND METHOD FOR DETECTING AND DECOMPRESSING
  • This invention relates to the field of microprocessors and, more particularly, to optimization of the instruction set of a microprocessor.
  • Microprocessor architectures may generally be classified as either complex instruction set computing (CISC) architectures or reduced instruction set computing (RISC) architectures.
  • CISC architectures specify an instruction set comprising high level, relatively complex instructions.
  • Microprocessors implementing CISC architectures decompose the complex instructions into multiple simpler operations which may be more readily implemented in hardware.
  • Microcoded routines stored in an on-chip read-only memory (ROM) have been successfully employed for providing the decomposed operations corresponding to an instruction. More recently, hardware decoders which separate the complex instructions into simpler operations have been adopted by certain CISC microprocessor designers.
  • The x86 microprocessor architecture is an example of a CISC architecture.
  • RISC architectures specify an instruction set comprising low level, relatively simple instructions. Typically, each instruction within the instruction set is directly implemented in hardware. Complexities associated with the CISC approach are removed, allowing for more advanced implementations to be designed. Additionally, high frequency designs may be achieved more easily since the hardware employed to execute the instructions is simpler.
  • An exemplary RISC architecture is the MIPS RISC architecture.
  • Variable-length instruction sets have often been associated with CISC architectures, while fixed-length instruction sets have been associated with RISC architectures.
  • Variable-length instruction sets use dissimilar numbers of bits to encode the various instructions within the set, as well as to specify addressing modes for the instructions, etc.
  • Variable-length instruction sets attempt to pack instruction information as efficiently as possible into the byte or bytes representing each instruction.
  • Fixed-length instruction sets employ the same number of bits for each instruction (the number of bits is typically a multiple of eight such that each instruction fully occupies a fixed number of bytes).
  • A small number of instruction formats comprising fixed fields of information are defined. Decoding each instruction is thereby simplified to routing bits corresponding to each fixed field to logic designed to decode that field.
  • Variable-length instructions lack the fixed field structure of fixed-length instructions. Decoding is further complicated by the lack of fixed fields.
  • RISC architectures employing fixed-length instruction sets suffer from problems not generally applicable to CISC architectures employing variable-length instruction sets. Because each instruction is fixed length, certain of the simplest instructions may effectively waste memory by occupying bytes which do not convey information concerning the instruction. For example, fields which are specified as "don't care" fields for a particular instruction or instructions in many fixed-length instruction sets waste memory. In contrast, variable-
  • RISC architectures do not include the more complex instructions employed by CISC architectures.
  • The number of instructions employed in a program coded with RISC instructions may be larger than the number of instructions employed in the same program coded with CISC instructions.
  • Each of the more complex instructions coded in the CISC version of the program is replaced by multiple instructions in the RISC version of the program. Therefore, the CISC version of a program often occupies significantly less memory than the RISC version of the program.
  • More bandwidth between devices storing the program, memory, and the microprocessor is needed for the RISC version of the program than for the CISC version of the program.
  • The problems outlined above are in large part solved by a microprocessor in accordance with the present invention.
  • The microprocessor is configured to fetch a compressed instruction set which comprises a subset of a corresponding non-compressed instruction set.
  • The non-compressed instruction set may be a RISC instruction set, such that the microprocessor may enjoy the high frequency operation and simpler execution resources typically associated with RISC architectures. Fetching the compressed instructions from memory and decompressing them within the microprocessor advantageously decreases the memory bandwidth required to achieve a given level of performance (e.g. instructions executed per second). Still further, the amount of memory occupied by the compressed instructions may be comparatively less than the corresponding non-compressed instructions may occupy.
  • The exemplary compressed instruction set described herein is a variable-length instruction set.
  • Two distinct instruction lengths are included: 16-bit and 32-bit instructions.
  • The 32-bit instructions are coded using an extend opcode, which indicates that the instruction being fetched is an extended (e.g. 32-bit) instruction. Instructions may be fetched as 16-bit quantities.
  • The succeeding 16-bit instruction is concatenated with the instruction having the extend opcode to form a 32-bit extended instruction. Extended instructions have enhanced capabilities with respect to non-extended instructions, further enhancing the flexibility and power of the compressed instruction set. Routines which employ the capabilities included in the extended instructions may thereby be coded using compressed instructions.
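The length detection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the value of `EXTEND_OPCODE` and the placement of the 5-bit opcode in the top bits of each 16-bit unit are hypothetical, since the text at this point does not give the encoding.

```python
EXTEND_OPCODE = 0b11110  # hypothetical value; the text does not state the encoding


def opcode_of(halfword: int) -> int:
    """Return the opcode, assumed to occupy the top 5 bits of a 16-bit unit."""
    return (halfword >> 11) & 0x1F


def split_instructions(halfwords):
    """Group a stream of 16-bit fetch units into (length_in_bits, value) pairs."""
    out = []
    i = 0
    while i < len(halfwords):
        first = halfwords[i] & 0xFFFF
        if opcode_of(first) == EXTEND_OPCODE:
            if i + 1 >= len(halfwords):
                raise ValueError("extend opcode without a succeeding 16-bit unit")
            second = halfwords[i + 1] & 0xFFFF
            # Concatenate the extend unit with the succeeding unit.
            out.append((32, (first << 16) | second))
            i += 2
        else:
            out.append((16, first))
            i += 1
    return out
```

For instance, `split_instructions([0x1234, 0xF000, 0xABCD])` treats the first unit as a 16-bit instruction and the last two as one 32-bit extended instruction.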
  • The compressed instruction set further includes multiple sets of register mappings from the compressed register fields to the decompressed register fields. Each value coded in the compressed register fields decompresses to a different register within the microprocessor.
  • The compressed register fields comprise three bits each. Therefore, eight registers are accessible to a particular instruction.
  • The select instructions are assigned two opcode encodings. One of the opcode encodings indicates a first mapping of register fields, while the second opcode encoding indicates a second mapping of register fields.
  • The compressed register fields may include relatively few bits, while select instructions for which access to additional registers is desired may be granted such access. Additionally, the register mappings are selected to minimize the logic employed to decompress register fields.
  • The compressed register field is directly copied into a portion of the decompressed register field, while the remaining portion of the decompressed register field is created using a small number of logic gates.
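A minimal sketch of such a register decompression follows, assuming a 3-bit compressed field and a 5-bit decompressed field. Both mappings shown here are hypothetical choices made for illustration; the text does not list the actual register assignments. In each case the three compressed bits are copied unchanged into the low bits, and the two high bits come from a single gate or a constant.

```python
def decompress_register(field: int, mapping: int) -> int:
    """Expand a 3-bit compressed register field into a 5-bit register number.

    The low three bits are copied directly; only the two high bits are
    generated. Both mappings are illustrative assumptions.
    """
    if not 0 <= field <= 7:
        raise ValueError("compressed register field must fit in 3 bits")
    b2 = (field >> 2) & 1
    b1 = (field >> 1) & 1
    if mapping == 0:
        # Hypothetical first mapping: bit 4 = NOR(b2, b1), bit 3 = 0.
        # Fields 0 and 1 select registers 16 and 17; fields 2..7 select 2..7.
        high = ((b2 | b1) ^ 1) << 4
    elif mapping == 1:
        # Hypothetical second mapping: bit 3 forced to 1, selecting 8..15.
        high = 1 << 3
    else:
        raise ValueError("unknown mapping")
    return high | field
```

Each mapping sends the eight field values to eight distinct registers, and an opcode that selects the second mapping reaches registers the first mapping cannot.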
  • The microprocessor supports programs having routines coded in compressed instructions and other routines coded in non-compressed instructions.
  • The subroutine call instruction within the compressed instruction set includes a compression mode which indicates whether or not the target routine is coded in compressed instructions.
  • The compression mode specified by the subroutine call instruction is captured by the microprocessor as the compression mode for the routine.
  • The compression mode is stored as one of the fetch address bits (stored in a program counter register within the microprocessor). Since the compression mode is part of the fetch address and the subroutine call instruction includes storing a return address for the subroutine, the compression mode of the calling routine is automatically stored upon execution of a subroutine call instruction. When a subroutine return instruction is executed, the compression mode of the calling routine is thereby automatically restored.
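The save-and-restore behavior described above can be modeled with a small Python class. This is a sketch under assumptions: the mode is taken to live in bit 0 of the fetch address with 1 meaning compressed, and `FetchState`, `call`, and `ret` are hypothetical names, not terms from the text.

```python
class FetchState:
    """Toy model of a program counter whose bit 0 carries the compression mode."""

    def __init__(self, pc: int, compressed: bool):
        self.pc = (pc & ~1) | int(compressed)
        self.return_address = 0  # stands in for the saved return address

    @property
    def compression_mode(self) -> int:
        # Assumption: 1 means compressed, 0 means non-compressed.
        return self.pc & 1

    def call(self, target: int, target_compressed: bool, return_pc: int) -> None:
        # The return address is a fetch address, so it keeps the caller's mode bit.
        self.return_address = (return_pc & ~1) | self.compression_mode
        # The call instruction supplies the mode of the target routine.
        self.pc = (target & ~1) | int(target_compressed)

    def ret(self) -> None:
        # Restoring the fetch address restores the caller's mode as a side effect.
        self.pc = self.return_address
```

No separate mode stack is needed in this model: the mode travels with the return address, which is the point the text makes.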
  • An additional feature of one embodiment of the microprocessor is the decompression of the immediate field used for load/store instructions having the global pointer register as a base register.
  • The immediate field is decompressed into a decompressed immediate field for which the most significant bit is set.
  • A subrange of addresses at the lower boundary of the global variable address space is thereby allocated for global variables of compressed instructions.
  • Non-compressed instructions may store global variables in the remainder of the global variable address space.
  • Global variable allocation between the compressed and non-compressed routines of a particular program may be relatively simple since the subranges are separate.
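To see why setting the most significant bit places these accesses at the bottom of the global variable address space, consider the following sketch. The field widths (an 8-bit compressed immediate and a 16-bit sign-extended decompressed immediate) and the example global pointer value are assumptions for illustration; any scaling of the immediate is ignored.

```python
def decompress_gp_immediate(imm: int, compressed_bits: int = 8,
                            decompressed_bits: int = 16) -> int:
    """Zero-fill the compressed immediate and set the most significant bit."""
    if not 0 <= imm < (1 << compressed_bits):
        raise ValueError("immediate does not fit the compressed field")
    return (1 << (decompressed_bits - 1)) | imm


def sign_extend(value: int, bits: int = 16) -> int:
    """Interpret a bits-wide value as a two's-complement signed number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def gp_address(gp: int, imm: int) -> int:
    """Effective address of a global-pointer-relative access from compressed code."""
    return gp + sign_extend(decompress_gp_immediate(imm))
```

With a hypothetical `gp` of `0x10008000`, `gp_address(gp, 0)` is `0x10000000`, the lowest address reachable with a 16-bit signed offset, and larger compressed immediates climb upward from there.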
  • The present invention contemplates an apparatus for executing instructions from a variable-length compressed instruction set, comprising an instruction decompressor.
  • The instruction decompressor is coupled to receive instructions which are members of the variable-length compressed instruction set, wherein the instruction decompressor is configured to examine an opcode field of a particular instruction.
  • The instruction decompressor is configured to determine that the particular instruction is an extended instruction having a first fixed length if the opcode field is coded as an extend opcode.
  • The instruction decompressor is configured to determine that the particular instruction is a non-extended instruction if the opcode field is coded as a second opcode different than the extend opcode.
  • The present invention further contemplates a method for expanding compressed instructions into decompressed instructions.
  • A compressed instruction is determined to be an extended instruction having a first fixed length if an opcode field of the compressed instruction is an extend opcode. If the opcode field of the compressed instruction is a second opcode different than the extend opcode, the compressed instruction is a non-extended instruction having a second fixed length.
  • The compressed instruction is decompressed into a decompressed instruction.
  • A number of bytes included in the compressed instruction is defined by the first fixed length if the compressed instruction is an extended instruction. Alternatively, the number of bytes is defined by the second fixed length if the compressed instruction is a non-extended instruction.
  • The present invention still further contemplates an apparatus for expanding compressed instructions into decompressed instructions, comprising a first determining means, a second determining means, and a decompressing means.
  • The first determining means determines that a compressed instruction is an extended instruction having a first fixed length if an op
  • The present invention yet further contemplates a method for executing a program including a first routine and a second routine in a microprocessor.
  • A subroutine call instruction is executed within the first routine, wherein the subroutine call instruction indicates that the second routine is to be executed via a target address of the subroutine call instruction.
  • An indication within the subroutine call instruction is examined. If the indication is in a first state, the second routine is determined to be coded using compressed instructions. The second routine is determined to be coded using non-compressed instructions if the indication is in a second state different than the first state.
  • The present invention contemplates an apparatus for executing a program including a first routine and a second routine in a microprocessor, comprising an executing means and an examining means.
  • The executing means executes a subroutine call instruction within the first routine.
  • The subroutine call instruction indicates that the second routine is to be executed via a target address of the subroutine call instruction.
  • The examining means examines an indication within the subroutine
  • The present invention still further contemplates an apparatus for fetching compressed and non-compressed instructions in a microprocessor, comprising a storage device and a mode detector.
  • The storage device stores a compression enable indicator. Coupled to the storage device, the mode detector is configured to detect a compression mode of a target routine upon fetch of a subroutine call instruction specifying the target routine.
  • The mode detector is configured to convey the compression mode to a processor core.
  • The processor core is configured to fetch compressed instructions if the compression mode indicates compressed. Additionally, the processor core is configured to fetch non-compressed instructions if the compression mode indicates non-compressed.
  • The present invention yet further contemplates a microprocessor comprising an instruction decompressor and a processor core.
  • The instruction decompressor is coupled to receive compressed instructions which are members of a variable-length compressed instruction set.
  • The instruction decompressor is configured to decompress each received compressed instruction into a corresponding decompressed instruction. Coupled to receive decompressed instructions, the processor core is configured to execute the decompressed instructions.
  • The present invention additionally contemplates a method for executing instruction code.
  • Compressed instructions are fetched, wherein the compressed instructions are members of a variable-length compressed instruction set.
  • The compressed instructions are decompressed in an instruction decompressor, thereby forming corresponding decompressed instructions.
  • The decompressed instructions are executed in a processor core.
  • The present invention still further contemplates an apparatus for executing instruction code, comprising a fetching means, a decompressing means, and an executing means.
  • The fetching means fetches compressed instructions which are members of a variable-length compressed instruction set.
  • The decompressing means decompresses the compressed instructions, thereby forming corresponding decompressed instructions.
  • The executing means executes the decompressed instructions.
  • The present invention contemplates an instruction decompressor configured to decompress compressed instructions. A first one of the compressed instructions is codable to access a first subset of registers defined for a corresponding non-compressed instruction set. Additionally, a second one of the compressed instructions is codable to access the first subset of registers and is further codable to access a second subset of registers.
  • The present invention further contemplates a method for decompressing compressed instructions. A particular compressed instruction having a first register field is decompressed using a first register mapping from compressed register indicators to decompressed register indicators if
  • The present invention contemplates an apparatus for decompressing a compressed register field of a compressed instruction into a decompressed register field of a decompressed instruction, comprising a first means and a second means.
  • The first means is for directly copying at least a portion of the compressed register field into a portion of the decompressed register field.
  • The first means is coupled to receive the compressed register field.
  • The second means is for logically operating upon the compressed register field to produce a remaining portion of the decompressed register field.
  • The present invention contemplates an instruction decompressor configured to decompress a compressed register field of a compressed instruction into a decompressed register field of a decompressed instruction.
  • The instruction decompressor forms a first portion of the decompressed register field by copying at least a portion of the compressed register field thereto.
  • The instruction decompressor includes a logic block which is configured to operate upon the compressed register field to produce a remaining portion of the decompressed register field.
  • Fig. 1 is a block diagram of one embodiment of a microprocessor.
  • Fig. 2 is a block diagram of a second embodiment of a microprocessor.
  • Fig. 3A is a first instruction format supported by one embodiment of the microprocessors shown in
  • Fig. 3B is a second instruction format supported by one embodiment of the microprocessors shown in Figs. 1 and 2.
  • Fig. 3C is a third instruction format supported by one embodiment of the microprocessors shown in
  • Fig. 3D is a fourth instruction format supported by one embodiment of the microprocessors shown in Figs. 1 and 2.
  • Fig. 4A is a fifth instruction format supported by one embodiment of the microprocessors shown in
  • Fig. 4B is a sixth instruction format supported by one embodiment of the microprocessors shown in Figs. 1 and 2.
  • Fig. 4C is a seventh instruction format supported by one embodiment of the microprocessors shown in
  • Fig. 4D is an eighth instruction format supported by one embodiment of the microprocessors shown in
  • Figs. 5A, 5B, 5C, 5D, and 5E are tables of exemplary instructions using the formats shown in Figs. 3A, 3B, 3C, and 3D.
  • Figs. 6A, 6B, 6C, 6D, 6E, and 6F are tables of exemplary instructions using the formats shown in
  • Fig. 7 is a diagram depicting offsets from an arbitrary register and a global pointer register, according to one embodiment of the microprocessors shown in Figs. 1 and 2.
  • Fig. 8 is a block diagram of exemplary hardware for expanding an immediate field from a compressed instruction to a decompressed instruction.
  • Fig. 9 is a diagram depicting decompressed offsets in accordance with one embodiment of the microprocessors shown in Figs. 1 and 2.
  • Fig. 10 is a flow chart depicting operation of a decompressor for immediate fields according to one embodiment of the microprocessors shown in Figs. 1 and 2.
  • Fig. 11 is a block diagram of exemplary hardware for generating fetch addresses according to one embodiment of the microprocessors shown in Figs. 1 and 2.
  • Fig. 12 is a block diagram showing register decompression logic employed in one embodiment of the microprocessors shown in Figs. 1 and 2.
  • Fig. 13 is a block diagram of an exemplary computer system including the microprocessor for which embodiments are shown in Figs. 1 and 2.
  • Microprocessor 10A includes an instruction decompressor 12A, an instruction cache 14A, and a processor core 16.
  • Instruction decompressor 12A is coupled to receive instruction bytes from a main memory subsystem (not shown).
  • Instruction decompressor 12A is further coupled to instruction cache 14A.
  • Instruction cache 14A is coupled to processor core 16.
  • Microprocessor 10A is configured to fetch compressed instructions from the main memory subsystem.
  • The compressed instructions are passed through instruction decompressor 12A, which expands the compressed instructions into decompressed instructions for storage within instruction cache 14A.
  • Many of the compressed instructions occupy fewer memory storage locations than the corresponding decompressed instructions, advantageously reducing the amount of memory required to store a particular program.
  • The bandwidth required to transport the compressed instructions from the main memory subsystem to microprocessor 10A is reduced.
  • Microprocessor 10A may be employed within a computer system having a relatively small main memory. Relatively large programs may be stored in the main memory due to the compression of instructions stored therein.
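The memory saving can be estimated with simple arithmetic. The sketch below assumes 16-bit compressed instructions, 32-bit extended instructions, and 32-bit non-compressed instructions, and it further assumes that each compressed instruction replaces exactly one non-compressed instruction; a real compressed routine may need more instructions than its non-compressed equivalent. The function names and the instruction counts used in the usage note are hypothetical.

```python
def compressed_bytes(n_16bit: int, n_extended: int) -> int:
    """Bytes occupied by a routine coded with compressed instructions."""
    return 2 * n_16bit + 4 * n_extended


def noncompressed_bytes(n_instructions: int) -> int:
    """Bytes occupied by the same routine coded with 32-bit instructions."""
    return 4 * n_instructions


def fraction_saved(n_16bit: int, n_extended: int) -> float:
    """Fraction of memory saved under the one-for-one replacement assumption."""
    baseline = noncompressed_bytes(n_16bit + n_extended)
    return 1.0 - compressed_bytes(n_16bit, n_extended) / baseline
```

For a hypothetical routine of 900 16-bit instructions and 100 extended instructions, the compressed form takes 2,200 bytes against 4,000 bytes non-compressed. Since every instruction must cross the memory interface, the same ratio applies to fetch bandwidth.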
  • Microprocessor 10A is configured to execute both compressed and non-compressed instructions on a routine-by-routine basis.
  • A routine may be coded using either compressed instructions or non-compressed instructions.
  • Routines which may not be efficiently coded in the compressed instruction set may be coded using non-compressed instructions.
  • Routines which are efficiently coded in the compressed instruction set are so coded.
  • Microprocessor 10A may support a particular decompression of the immediate field for load/store instructions using the global pointer register as a base register, in order to support mixing of compressed and non-compressed instructions. The particular decompression is detailed further below.
  • A compression mode is detected by instruction decompressor 12A. The compression mode identifies the instruction set in which a routine is coded: compressed or non-compressed.
  • Instruction compression is achieved in microprocessor 10A by imposing certain limitations upon the available instruction encodings.
  • Instruction field sizes may be reduced (i.e. the number of bits within an instruction field may be decreased).
  • The number of available registers may be reduced to form the compressed instruction set.
  • Instruction decompressor 12A expands the encoded register field into a decompressed register field.
  • The decompressed register field is included in the decompressed instruction.
  • The compressed instructions use the reduced instruction fields, thereby occupying less memory (i.e. fewer bits) than the original instruction encodings defined by the microprocessor architecture employed by processor core 16.
  • Instruction decompressor 12A is configured to accept compressed instructions and to decompress the instructions into the original instruction encodings. Each instruction field within a particular compressed instruction is expanded from the compressed field to a corresponding decompressed field within the corresponding decompressed instruction. The decompressed instruction is coded in the original instruction format supported by processor core 16.
  • Processor core 16 includes circuitry for fetching instructions from instruction cache 14A, decoding the instructions, and executing the instructions.
  • The instructions supported by processor core 16 are specified by the microprocessor architecture employed therein.
  • Processor core 16 employs the MIPS RISC architecture.
  • Processor core 16 may employ any microprocessor architecture. Since instruction decompressor 12A decompresses instructions into the original instruction format, processor core 16 may comprise a previously designed processing core. In other words, the processing core may not require substantial modification to be included within microprocessor 10A.
  • The MIPS RISC architecture specifies an instruction set comprising 32-bit fixed-length instructions.
  • A compressed instruction set is defined for microprocessor 10A which comprises variable-length instructions. Many of the compressed instructions comprise 16-bit instructions. Other compressed instructions comprise 32-bit instructions in conjunction with the extend instruction described below. Several 16-bit and 32-bit instruction formats are defined. It is understood that, although 16-bit and 32-bit compressed instructions are used in this embodiment, other embodiments may employ different instruction lengths.
  • The compressed instructions encode a subset of the non-compressed instructions. Instruction encodings supported within the compressed instruction set comprise many of the most commonly coded instructions as well as the most often used registers, such that many programs, or routines within the programs, may be coded using the compressed instructions.
  • Microprocessor 10A employs a compression mode. If the compression mode is active, then compressed instructions are being fetched and executed. Instruction decompressor 12A decompresses the instructions when they are transferred from main memory to instruction cache 14A.
  • The compression mode may be inactive. When the compression mode is inactive, non-compressed instructions are being fetched and executed. Instruction decompressor 12A is bypassed when the compression mode is inactive.
  • The compression mode is indicated by a bit within the fetch address (e.g. bit 0).
  • The current fetch address may be stored in a PC register 18 within processor core 16. Bit 0 of PC register 18 indicates the compression mode (CM) of microprocessor 10A.
  • Instruction cache 14A is a high speed cache memory configured to store decompressed and non-compressed instructions. Although any cache organization may be employed by instruction cache 14A, a set associative or direct mapped configuration may be suitable for the embodiment shown in Fig. 1.
  • Microprocessor 10B includes an instruction cache 14B coupled to receive instruction bytes from the main memory subsystem, an instruction decompressor 12B, and processor core 16. Instruction cache 14B is coupled to instruction decompressor 12B, which is further coupled to processor core 16.
  • Microprocessor 10B is configured with instruction decompressor 12B between instruction cache 14B and processor core 16.
  • Instruction cache 14B stores the compressed instructions transferred from the main memory subsystem. In this manner, instruction cache 14B may store a relatively larger number of instructions than a similarly sized instruction cache employed as instruction cache 14A in microprocessor 10A.
  • Instruction decompressor 12B receives fetch addresses corresponding to instruction fetch requests from processor core 16, and accesses instruction cache 14B in response to the fetch request. The corresponding compressed instructions are decompressed into decompressed instructions by instruction decompressor 12B. The decompressed instructions are transmitted to processor core 16.
  • Microprocessor 10B includes a compression mode in one embodiment. Instruction decompressor 12B is bypassed when non-compressed instructions are being fetched and executed.
  • Instruction cache 14B stores both compressed and non-compressed instructions.
  • Instruction cache 14B typically stores instruction bytes in fixed-size storage locations referred to as cache lines. Therefore, a particular cache line may be storing compressed or non-compressed instructions. In either case, a plurality of instruction bytes are stored. Therefore, instruction caches 14A and 14B may be of similar construction.
  • The compression mode at the time a cache line is accessed determines whether the instruction bytes are interpreted as compressed or non-compressed instructions.
  • An alternative configuration for microprocessor 10B is to include instruction decompressor 12B within the instruction decode logic of processor core 16. The compressed instructions may not actually be decompressed in such an embodiment. Instead, the compressed instructions may be decoded directly by the decode logic.
  • The decoded instructions may be similar to the decoded instructions generated for the non-compressed instructions which correspond to the compressed
  • microprocessor 10 which operates upon compressed instructions
  • Instruction cache 14 and instruction decompressor 12 will be used to refer to the corresponding elements of both Figs. 1 and 2 as well as other embodiments of the elements included in other implementations of microprocessor 10.
  • The term "compressed instruction" refers to an instruction which is stored in a compressed form in memory.
  • The compressed instruction is generally stored using fewer bits than the number of bits used to store the instruction when represented as defined in the microprocessor architecture employed by processor core 16.
  • The term "decompressed instruction" refers to the result of expanding a compressed instruction into the original encoding as defined in the microprocessor architecture employed by processor core 16.
  • The term "non-compressed instruction" refers to an instruction represented in the encoding defined by the microprocessor architecture employed by processor core 16.
  • Non-compressed instructions are also stored in memory in the same format (i.e. non-compressed instructions were never compressed).
  • The term "decompression" refers to the process of expanding a compressed instruction into the corresponding decompressed instruction. It is noted that instruction decompressors 12A and 12B may be configured to simultaneously decompress multiple compressed instructions. Such embodiment
  • Figs. 3A-3D and 4A-4D depict exemplary instruction formats for 16-bit and 32-bit compressed instructions, respectively, according to one specific embodiment of microprocessor 10 employing the MIPS RISC architecture.
  • Other instruction formats may be employed by other embodiments.
  • The instruction formats shown in Figs. 3A-3D each comprise 16 bits in this particular implementation.
  • The instruction formats shown in Figs. 4A-4D each comprise 32 bits in this particular implementation.
  • The compressed instructions encoded using the instruction formats are decompressed into instruction formats as defined by the MIPS RISC architecture for each instruction.
  • Fig. 3A depicts a first instruction format 20.
  • Instruction format 20 includes an opcode field 22, a first register field 24, a second register field 26, and a function field 28.
  • Opcode field 22 is used to identify the instruction.
  • Function field 28 is used in conjunction with certain particular encodings of opcode field 22 to identify the instruction. Effectively, function field 28 and opcode field 22 together form the opcode field for these instructions.
  • Function field 28 is used as an immediate field.
  • First register field 24 and second register field 26 identify destination and source registers for the instruction.
  • The destination register is also typically used as a source register for the instruction. In this manner, two source operands and one destination operand are specified via first register field 24 and second register field 26.
  • The notations "RT" and "RS" in first register field 24 and second register field 26 indicate the use of the fields in the instruction tables below. Either RT or RS may be a destination register, depending upon the encoding of the instruction.
  • Opcode field 22 comprises 5 bits, first register field 24 and second register field 26 comprise 3 bits each, and function field 28 comprises 5 bits.
  • First register field 24 is divided into two subfields (labeled RT1 and RT0).
  • RT1 comprises two bits in the present embodiment.
  • RT0 comprises one bit. RT1 is concatenated with RT0 to form first register field 24.
  • Subfield RT1 and second register field 26 are used in certain instructions encoded via instruction format 20 to indicate one of the 32 registers defined by the MIPS RISC architecture.
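The field widths given above add up to 16 bits (5 + 2 + 3 + 1 + 5), which the following sketch uses to pull a 16-bit word apart. The bit positions are an assumption: the widths come from the text, but the left-to-right order opcode, RT1, RS, RT0, function is inferred rather than stated, as is the choice of RT1 as the high part of the 5-bit register number. The names `Format20` and `decode_format20` are hypothetical.

```python
from typing import NamedTuple


class Format20(NamedTuple):
    opcode: int         # 5 bits
    rt: int             # 3 bits, RT1 concatenated with RT0
    rs: int             # 3 bits
    function: int       # 5 bits
    wide_register: int  # 5 bits, RT1 concatenated with RS (assumed order)


def decode_format20(instr: int) -> Format20:
    """Split a 16-bit word using an assumed layout: opcode|RT1|RS|RT0|function."""
    instr &= 0xFFFF
    opcode = (instr >> 11) & 0x1F
    rt1 = (instr >> 9) & 0x3
    rs = (instr >> 6) & 0x7
    rt0 = (instr >> 5) & 0x1
    function = instr & 0x1F
    rt = (rt1 << 1) | rt0         # first register field 24
    wide_register = (rt1 << 3) | rs  # one of 32 registers
    return Format20(opcode, rt, rs, function, wide_register)
```

The `wide_register` value illustrates how two bits of RT1 plus the three bits of the second register field are enough to name any of 32 registers.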
  • Fig. 3B depicts a second instruction format 30.
  • Instruction format 30 includes opcode field 22, first register field 24, and second register field 26. Additionally, a third register field 32 and a function field 34 are shown. Third register field 32 is generally used to identify the destination register for instructions using instruction format 30. Therefore, first register field 24 and second register field 26 comprise source registers for instruction format 30. Function field 34 is used similarly to function field 28. In the embodiment shown, third register field 32 comprises three bits and function field 34 comprises two bits.
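As a rough illustration of the three-register format, the sketch below packs and unpacks a 16-bit word with a 5-bit opcode, two 3-bit source fields, a 3-bit destination field, and a 2-bit function field. The bit positions are assumed: the first register field is taken to be split into RT1 and RT0 around RS in the same way as in format 20, with the destination and function fields in the low five bits. The helper names and the `rd` key are hypothetical.

```python
def encode_format30(opcode: int, rt: int, rs: int, rd: int, function: int) -> int:
    """Pack fields into 16 bits using an assumed layout: opcode|RT1|RS|RT0|RD|function."""
    for name, value, bits in (("opcode", opcode, 5), ("rt", rt, 3), ("rs", rs, 3),
                              ("rd", rd, 3), ("function", function, 2)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    rt1, rt0 = rt >> 1, rt & 1
    return (opcode << 11) | (rt1 << 9) | (rs << 6) | (rt0 << 5) | (rd << 2) | function


def decode_format30(instr: int) -> dict:
    """Inverse of encode_format30 under the same assumed layout."""
    instr &= 0xFFFF
    rt1 = (instr >> 9) & 0x3
    rt0 = (instr >> 5) & 0x1
    return {
        "opcode": (instr >> 11) & 0x1F,
        "rt": (rt1 << 1) | rt0,   # source register
        "rs": (instr >> 6) & 0x7,  # source register
        "rd": (instr >> 2) & 0x7,  # destination register
        "function": instr & 0x3,
    }
```

A round trip such as `decode_format30(encode_format30(3, 5, 2, 7, 1))` returns the original field values, which is a quick check that the assumed fields do not overlap.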
  • a third instruction format 40 is shown in Fig 3C
  • Instruction format 40 includes opcode field 22 and second register field 26, as well as an immediate field 42 Immediate field 42 is used to provide immediate data for the instruction specified by instruction format 40
  • Immediate data is an operand of the instruction, similar to the value stored in a register specified by first register field 24 or second register field 26
  • an add instruction which uses immediate data adds the immediate data to the value stored in the destination register, and stores the resulting sum into that destination register
  • immediate field 42 comprises eight bits. Immediate field 42 is divided into two subfields (IMM1 and IMM0) in the instruction format shown in Fig. 3C.
  • the subfields allow second register field 26 to be placed in the same bit positions within instruction format 40 as it is placed in instruction formats 20 and 30
  • second register field 26 is always found in the same position of 16-bit instructions in which it is used. Therefore, subfield IMM1 comprises 2 bits and subfield IMM0 comprises 6 bits. IMM1 is concatenated with
  • Fig. 3D depicts a fourth instruction format 50
  • Instruction format 50 includes opcode field 22 and an immediate field 52
  • Immediate field 52, similar to immediate field 42, is used as an operand of the instruction
  • immediate field 52 comprises 11 bits
  • Fig. 4A depicts a fifth instruction format 60
  • Instruction format 60 includes opcode field 22, which is coded as the extend instruction. Instruction decompressor 12 recognizes the extend instruction opcode within opcode field 22 and treats the current instruction as a 32-bit instruction (i.e. the 16 bits included in the instruction containing the extend opcode and the 16 bits which would otherwise comprise the next instruction in program order are concatenated to form a 32-bit instruction). Therefore, the compressed instruction set can be seen to be a variable-length instruction set comprising 16-bit instructions and 32-bit instructions.
  • Instruction format 60 further includes a zero field 62 comprising six bits (coded to all binary zeros), an immediate field 64, and a BR field 66. Instruction format 60 is used to code an extended form of the BR instruction (an unconditional branch instruction), and hence BR field 66 is an opcode field indicating the BR instruction.
  • the BR opcode is hexadecimal 02
  • the extended BR instruction has a larger immediate field than the non-extended BR instruction, and therefore may be coded with larger offsets than the non-extended BR instruction
  • the extended BR instruction may be used
  • branches to close instructions may use the non-extended BR instruction
  • Immediate field 64 comprises 16 bits which are used as an offset to be added to the address of the instruction following the BR instruction to create the target address of the branch instruction
  • the non-extended BR instruction, by contrast, includes an eleven bit offset (i.e. it is coded using instruction format 50)
  • Fig. 4B depicts an instruction format 70 which is an extended version of instruction format 40
  • Instruction format 70 includes opcode field 22 coded as the extend opcode, as well as an immediate field 72, a first register field 74, a second register field 76, and a second opcode field 78
  • First register field 74 and second register field 76 comprise five bits each in the embodiment shown. Therefore, any register defined by the MIPS RISC architecture may be accessed using instruction format 70.
  • Second opcode field 78 defines the instruction being executed, and comprises 5 bits (similar to opcode field 22)
  • immediate field 72 comprises 12 bits divided into a one bit IMM2 subfield, a five bit IMM1 subfield, and a six bit IMM0 subfield. Immediate field 72 is formed by concatenating IMM2 with IMM1 and further with IMM0 in the embodiment shown.
  • An extended instruction format corresponding to instruction format 30 is shown in Fig. 4C as an instruction format 80
  • Second opcode field 78 is coded to a particular value to identify instruction format 80 from instruction format 70
  • instruction format 80 is assumed by instruction decompressor 12
  • instruction format 70 is assumed by instruction decompressor 12
  • the particular value comprises hexadecimal 00
  • Instruction format 80 further includes a COP0 bit 86. COP0 bit 86, when set, indicates that certain coprocessor zero instructions (as defined in the MIPS RISC architecture) are being executed.
  • the tables of instructions below further define the instructions encoded by setting COP0 bit 86
  • routines may need to perform operations of which these instructions are incapable. While most of the instructions in the routine may be coded using instruction formats 20-50, several instructions may require additional encodings. For example, access to a register not included within the subset of available registers in formats 20-50 may be needed. Additional instructions not included in the instructions encoded using formats 20-50 may be needed. For these and other reasons, the extend opcode and extended instruction formats 60-80 are defined.
  • Instruction decompressor 12 examines opcode field 22 in order to detect the extend opcode
  • the extend opcode is one of the opcodes defined to use instruction format 50 in the present embodiment, although the bits included in immediate field 52 are assigned differing interpretations depending upon the extended instruction format coded for the particular extended instruction
  • the extended instruction formats include a second opcode field (e.g. fields 66 and 78) which identifies the particular extended instruction
  • Addition of the extend opcode and extended instruction formats allows for many instructions to be encoded using the narrower instruction formats 20-50, but still have the flexibility of the wider extended instruction formats when desired. Programs which occasionally make use of the functionality included in the extended instruction formats may still achieve a reduced memory footprint, since these programs may be encoded using compressed instructions and many of the compressed instructions may comprise 16-bit compressed instructions.
  • An embodiment of microprocessor 10 may handle the extended instructions by fetching 16-bit instruction portions and detecting the extend opcode. When the extend opcode is detected, a NOP may be transmitted to processor core 16 and the remaining 16-bit portion of the extended instruction may be fetched.
  • the extended instruction is decompressed and provided as the next instruction after the NOP
  • instruction decompressor 12 handles cases wherein a portion of the extended instruction is available while a second portion is unavailable. For example, two portions of the extended instruction may lie within two distinct cache lines within instruction cache 14. Therefore, one portion of the instruction may be fetched from instruction cache 14 while the other portion may not reside within instruction cache 14. The portion may then need to be stored within instruction decompressor 12 until the remaining portion is available.
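
The handling of extended instructions described above (fetch a 16-bit portion, detect the extend opcode, hold that portion until the second one arrives) can be sketched in Python as follows. This is a minimal model under stated assumptions, not the patented hardware: the numeric value `EXTEND_OPCODE`, the position of opcode field 22 in the top five bits, the placement of the first portion in the upper half of the 32-bit result, and every name are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple

EXTEND_OPCODE = 0x1E  # hypothetical value; the text does not give the encoding


def opcode_of(halfword: int) -> int:
    """Opcode field 22, assumed here to occupy the top 5 bits."""
    return (halfword >> 11) & 0x1F


class ExtendAssembler:
    """Accepts 16-bit portions one at a time and emits whole instructions."""

    def __init__(self) -> None:
        # First portion of an extended instruction, held until the rest arrives.
        self.pending: Optional[int] = None

    def push(self, halfword: int) -> Optional[Tuple[int, int]]:
        """Return (value, width_in_bits) for a complete instruction,
        or None while the second portion is still outstanding."""
        if self.pending is not None:
            first, self.pending = self.pending, None
            return ((first << 16) | halfword, 32)
        if opcode_of(halfword) == EXTEND_OPCODE:
            self.pending = halfword
            return None
        return (halfword, 16)


def split_stream(halfwords: Iterable[int]) -> Iterator[Tuple[int, int]]:
    """Group a stream of 16-bit portions into 16-bit and 32-bit instructions."""
    assembler = ExtendAssembler()
    for halfword in halfwords:
        result = assembler.push(halfword)
        if result is not None:
            yield result
```

A `None` result from `push` corresponds to the point at which the described embodiment would send a NOP to processor core 16. Because the held portion lives in the assembler object, it does not matter whether the second portion arrives immediately or only after a cache miss is serviced.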
  • Fig. 4D is an instruction format 90 used to explicitly expand the JAL instruction of the MIPS
  • the JAL instruction is often used as a subroutine call instruction
  • Subroutines may be stored in memory at a great distance (address-wise) from the calling routine Therefore, having the largest possible range of relative offsets (via an immediate field 92 comprising 26 bits) is important for the JAL instruction.
  • exchange bit 94 is included in the instruction encoding
  • the exchange bit is used to indicate the compressed/non-compressed nature of the instructions at the target address. If the bit is set, the target instructions are compressed instructions. If the bit is clear, the target instructions are non-compressed instructions.
  • the value of exchange bit 94 is copied into bit 0 of the program counter within processor core 16. Bit 0 of the program counter may always be assumed to be zero, since the sixteen bit and thirty-two bit instructions occupy at least two bytes each and instructions are stored at aligned addresses. Therefore, bit zero is a useful location for storing the compression mode of the current routine. Processor core 16 increments fetch addresses by 2 (instead of 4) when bit 0 is set, thereby fetching 16-bit compressed instructions through instruction decompressor 12.
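
The use of program counter bit 0 as a mode flag can be illustrated with a few lines of Python. The function names are hypothetical, and masking bit 0 off before presenting the address to memory is an assumption of this sketch that follows from the statement that the bit may always be assumed to be zero.

```python
def jal_target_pc(target_address: int, exchange_bit: int) -> int:
    """Copy exchange bit 94 into bit 0 of the new program counter value."""
    return (target_address & ~1) | (exchange_bit & 1)


def is_compressed(pc: int) -> bool:
    """Bit 0 set means the current routine uses compressed instructions."""
    return bool(pc & 1)


def next_fetch_pc(pc: int) -> int:
    """Advance by 2 bytes in compressed mode and by 4 otherwise.

    Both increments are even, so the mode held in bit 0 is preserved.
    """
    return pc + (2 if is_compressed(pc) else 4)


def fetch_address(pc: int) -> int:
    """Address presented to memory: the PC with the mode bit masked off
    (an assumption of this sketch)."""
    return pc & ~1
```

Since the increment never disturbs bit 0, the mode persists across sequential fetches without any separate mode register.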
  • Each instruction within the compressed instruction set employed by microprocessor 10 uses at least one of the instruction formats shown in Figs. 3A-3D and Figs. 4A-4D. It is noted that opcode field 22 is included in each instruction format, and is located in the same place within each instruction format. The coding of opcode field 22 determines which instruction format is used to interpret the remainder of the instruction. A first portion of the opcode field encodings is assigned to instruction format 20, a second portion of the opcode field encodings is assigned to instruction format 30, etc.
  • instruction field refers to one or more bits within an instruction which are grouped and assigned an interpretation as a group
  • opcode field 22 comprises a group of bits which are interpreted as the opcode of the instruction
  • first and second register fields 24 and 26 comprise register identifiers which identify a storage location within processor core 16 which stores operands of the instruction
  • immediate field refers to an instruction field in which immediate data is coded. Immediate data may provide an operand for an instruction. Alternatively, immediate data may be used as an offset to be added to a register value, thereby producing an address. Still further, immediate data may be used as an offset for a branch instruction.
  • Figs. 5A-6F are tables listing an exemplary compressed instruction set for use by one particular implementation of microprocessor 10
  • the particular implementation employs the MIPS RISC architecture within processor core 16. Therefore, the instruction mnemonics listed in an instruction column 100 of the tables correspond to instruction mnemonics defined in the MIPS RISC architecture (or defined for the instruction assembler, as described in "MIPS RISC Architecture" by Kane and Heinrich, Appendix D, Prentice Hall PTR, Upper Saddle River, New Jersey, 1992, incorporated herein by reference) with the following exceptions: CMPI, MOVEI, MOVE, NEG, NOT, and extend. These instructions translate to the following MIPS instructions (RS and RT refer to the 16-bit RS and RT):
  • MOVI: ADDIU RS, $0, simm8; MOV: ADD RS, $0, RT
  • first register field 24, second register field 26, and third register field 32 comprise three bits each.
  • Table 1 lists the mapping of the field encodings (listed in binary) to registers in the MIPS RISC architecture for these symbols. Other mappings are also contemplated, as shown further below. Names assigned according to MIPS assembler convention are also listed in Table 1.
  • Because each register field is three bits, only eight registers are available for a given opcode.
  • Instructions which may access all sixteen registers are assigned two opcodes in the instruction tables below.
  • Register selection is thereby a function of both a register field and opcode field 22.
  • register fields may be encoded using fewer bits while still providing select instructions which may access a large group of registers.
  • immediate field decompression for load/store instructions comprises right rotation of the immediate bits by one bit for halfwords and two bits for words, followed by shifting of the immediate bits left by one bit for halfwords and two bits for words.
  • a seven bit immediate field is provided for words and a six bit immediate field for halfwords (in the 16-bit instruction formats).
  • the MIPS RISC architecture defines that data addresses corresponding to load/store instructions are aligned for each instruction included in the exemplary compressed instruction set. Therefore, the least significant bit (for halfwords) and the second least significant bit (for words) may be set to zero. Bits in the compressed immediate field need not be used to specify these bits. Finally, "imm" is post-fixed with a number indicating the number of bits included in the immediate field.
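
The rotate-then-shift expansion of the five-bit load/store immediate can be written out directly. The Python sketch below follows the description above (no rotation for bytes, one bit for halfwords, two bits for words); the function names and the string keys used to select the operand size are illustrative assumptions.

```python
def rotate_right(value: int, amount: int, width: int) -> int:
    """Rotate a width-bit value right by amount bits."""
    amount %= width
    mask = (1 << width) - 1
    value &= mask
    return ((value >> amount) | (value << (width - amount))) & mask


# Number of bits rotated right and then shifted left for each operand size.
SHIFT_FOR_SIZE = {"byte": 0, "halfword": 1, "word": 2}


def decompress_offset(imm5: int, size: str) -> int:
    """Expand a 5-bit compressed load/store immediate into a byte offset.

    The result is 5 bits wide for bytes, 6 for halfwords, and 7 for words,
    with the low-order alignment bits always zero.
    """
    if not 0 <= imm5 < 32:
        raise ValueError("expected a 5-bit immediate")
    n = SHIFT_FOR_SIZE[size]
    return rotate_right(imm5, n, 5) << n
```

For word accesses the low two compressed bits end up in bits 6 and 5 of the offset and the upper three compressed bits stay in bits 4 through 2, so the 32 encodings reach every word-aligned offset from 0 through 124.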
  • Opcode field 22 and function field 28 are decompressed as well. More particularly, opcode field 22 and function field 28 identify the instruction within the MIPS RISC architecture, in accordance with the tables shown in Figs. 5A-6F. The opcode and function fields of the decompressed instructions are coded in accordance with the MIPS RISC architecture definition.
  • Figs. 5A and 5B depict a table 110 and a table 112, respectively. Tables 110 and 112 list instructions from the exemplary compressed instruction set which use instruction format 20 shown in Fig. 3A
  • Instruction column 100 and operands column 102 are included, as well as an opcode column 106 and a function column 104
  • Opcode column 106 and function column 104 include hexadecimal numbers, and correspond to opcode field 22 and function field 28, respectively
  • Table 110 includes several instructions which have an "imm5" coding in function column 104
  • the "imm5" coding appears for the load/store instructions within table 110, and indicates that function field 28 is used as an immediate field for these instructions
  • function field 28 is used in conjunction with opcode field 22 to identify a particular instruction within the compressed instruction set
  • opcode 1d is labeled as special in table 110
  • special instructions have a specific interpretation of function field 28. In particular, if the most significant bit of the function field is clear, then the instruction is defined to be
  • the low order two bits of the simm9 operand are set to zero
  • the destination of the SLT and SLTU instructions shown in table 110 is the t8 register (register $24) according to one embodiment
  • Table 112 shows an "imm3" and "imm6" operand for several instructions
  • the "imm3" operand is coded into second register field 26, and the "imm6" operand is coded into both second register field 26 and first register field 24
  • table 112 includes the jump register (JR) instruction, having second register field 26 as an operand.
  • JR jump register
  • subfield RT1 of first register field 24 is used in conjunction with second register field 26 to specify any of the MIPS RISC architecture registers for the JR instruction.
  • Table 114 includes instruction column 100, operands column 102, opcode column 106, and function column 104.
  • Table 114 lists instructions from the exemplary instruction set which use instruction format 30 shown in Fig. 3B. Certain instructions within table 114 have hardcoded destination registers (i.e. the destination registers cannot be selected by the programmer, other than by using a different opcode). For these instructions, third register field 32 is combined with function field 34 to store the function field encoding shown in function column 104. Additionally, an instruction is shown which has an immediate operand in function column 104 and operands column 102. This instruction uses second register field 26 in conjunction with function field 34 to code the corresponding immediate field used by the instruction.
  • Figs. 5D and 5E are tables 116 and 118 showing the instructions from the exemplary compressed instruction set which employ instruction formats 40 and 50, respectively. It is noted that the extend instruction is shown in table 118. However, the extend instruction actually indicates that the instruction is a 32-bit compressed instruction which uses one of instruction formats 60, 70, or 80.
  • Tables 120 and 122 depict those instructions from the exemplary compressed instruction set which are encoded using instruction format 70, shown in Fig. 4B.
  • Table 120 includes instruction column 100 and operands column 102, and further includes an opcode column 108.
  • Opcode column 108 is similar to opcode column 106, except that the opcode encodings shown in opcode column 108 correspond to opcode field 78.
  • Table 122 includes an RT column 109 which corresponds to first register field 74.
  • the coding of the RT field in the instructions shown in table 122 indicates which instruction is selected.
  • the instructions shown in table 122 share a specific encoding in opcode field 78. In one embodiment, the specific encoding is 00 (hexadecimal).
  • Figs. 6C, 6D, 6E, and 6F are tables 124, 126, 128, and 130 which depict instructions from the exemplary compressed instruction set which are encoded according to instruction format 80.
  • Tables 124, 126, and 130 include a function column 107 which corresponds to encodings of function field 84.
  • Table 128 includes an RS, RT column 105 which will be explained in more detail below.
  • Operands column 102 for table 124 includes immediate operands for certain instructions.
  • the "imm5" operand is coded into second register field 76.
  • the "imm15" operand is coded into a combination of first register field 74, second register field 76, and third register field 82.
  • the instructions listed in table 128 are identified via encodings of second register field 76, as shown in RS, RT column 105. Certain instructions are identified via second register field 76 in conjunction with first register field 74. Those instructions for which RS, RT column 105 includes an asterisk for the RT portion are identified via second register field 76, while those instructions for which RS, RT column 105 does not include an asterisk are identified by second register field 76 in conjunction with first register field 74. Instructions which are not identified via first register field 74 may use first register field 74 to encode an operand.
  • the instructions listed in tables 128 and 130 are instructions for which COP0 bit 86 is set, while instructions listed in tables 124 and 126 are encoded with COP0 bit 86 clear.
  • Certain instructions in table 128 include an "imm6" operand.
  • the "imm6" operand is coded into function field 84.
  • function field 84 is used to indicate the instructions shown in table 130 when second register field 76 is coded to 1x (hexadecimal), wherein "x" indicates that the low order bits are don't cares
  • a first addressing window 150 and a second addressing window 152 are shown according to one embodiment of microprocessor 10
  • the value of the base register identifies an address within the main memory subsystem. Addressing window 150 represents the range of addresses around the value of the base register which are accessible to a load/store instruction in the non-compressed instruction set according to one embodiment of the non-compressed instruction set.
  • the non-compressed instruction set specifies that load/store instructions form the address of a memory operand via the sum of a value stored in a base register and a sixteen bit signed immediate field
  • the range of addresses has an upper boundary of 32767 greater than the base register and a lower boundary of 32768 less than the base register
  • Other embodiments may include larger or smaller ranges
  • the term "base register" refers to a register which is specified by a load/store instruction as storing a base
  • load/store instructions within the 16-bit portion of the exemplary compressed instruction set include a five bit immediate field. This field is rotated right two bits and then shifted left two bits for word-sized memory operands, forming a seven bit immediate field (the largest of the immediate fields which may be formed using the five bits, according to one embodiment). The seven bit immediate field is then zero extended to form a positive offset from the base register in the corresponding decompressed instruction.
  • a subrange 154 of addresses is therefore available for access by compressed instructions. Within addressing window 150, subrange 154 has an upper boundary of 127 greater than the base register and a lower boundary of the base register.
  • subrange 154 may vary in size from embodiment to embodiment. While subrange 154 may work well for many load/store instructions, a different subrange may be employed for use with the global pointer register.
  • the global pointer register is a register assigned by software convention to locate an area of memory used for storing global variables
  • a global variable is a variable which is available
  • the area of memory around the global pointer register may therefore be viewed as a table of global variables
  • Each global variable is assigned an offset within the table
  • a 64 kilobyte table may be allocated for global variables as shown along the left side of addressing windows 150 and 152
  • the global variable table includes a section which is accessible to compressed instructions (corresponding to subrange 154) which is between two subranges 156 and 158 accessible to non-compressed instructions
  • microprocessor 10 may support programs in which some routines are coded with non-compressed instructions while other routines are coded with compressed instructions. Allocating global variables in a particular program is complicated by the division of the non-compressed global variable subranges 156 and 158 of addressing window 150.
  • Global variables may be allocated into subrange 158, for example, and then global variable allocation must continue in subrange 156 (for non-compressed instructions). In other words, subrange 154 must be bypassed for global variables accessible to non-compressed instructions.
  • Microprocessor 10 may employ a decompression of the compressed immediate field for load/store instructions using the global pointer (GP) register which leads to addressing window 152.
  • Addressing window 152 includes a subrange 160 accessible to compressed instructions and a subrange 162 accessible to non-compressed instructions.
  • subrange 162 is a contiguous block of memory. Global variables for access by non-compressed instructions may be allocated into subrange 162, while global variables for access by compressed instructions may be allocated into subrange 160.
  • subrange 160 and subrange 162 form distinct tables of global variables for access by compressed and non-compressed instructions, respectively.
  • Addressing window 152 is achieved by decompressing the compressed immediate field as described above, except that the most significant bit of the decompressed immediate field is set.
  • the decompressed immediate field is 8000 (in hexadecimal). Since the decompressed immediate field is interpreted as a signed field for load/store instructions, the 8000 value is the most negative number available in the decompressed immediate field. Other encodings of the compressed immediate field are decompressed into negative numbers which form subrange 160. Subrange 160 forms the lower boundary of the range of addresses represented by addressing window 152 as shown in the embodiment of Fig. 7.
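
The effect of setting the most significant bit can be checked with a short calculation. The Python sketch below takes an already-expanded offset (at most seven bits), forms the sixteen-bit decompressed immediate with or without the top bit set, and interprets it as a signed value. The function names and the sample global pointer value used in the usage note are illustrative assumptions.

```python
def sign_extend16(value: int) -> int:
    """Interpret a 16-bit field as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decompressed_immediate(offset: int, base_is_gp: bool) -> int:
    """Zero-extend the expanded offset to 16 bits and, for the global
    pointer register only, set the most significant bit."""
    field = offset & 0x7FFF
    return (field | 0x8000) if base_is_gp else field


def effective_address(base: int, offset: int, base_is_gp: bool) -> int:
    """Base register value plus the signed decompressed immediate."""
    return base + sign_extend16(decompressed_immediate(offset, base_is_gp))
```

With a hypothetical global pointer value of 0x10008000, word offsets 0 through 124 map to addresses 0x10000000 through 0x1000007C, which is the bottom of the 64 kilobyte window rather than a region in its middle. That is what leaves the remainder of the window contiguous for non-compressed instructions.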
  • memory operand refers to a value stored in a memory location within the main memory subsystem. Load instructions may be used to transfer the memory operand to a register within microprocessor 10. Conversely, store instructions may be used to transfer a value stored in a register into the memory operand storage location.
  • a memory operand may be of various sizes (i.e. numbers of bytes). In one embodiment, three sizes are available: byte, halfword, and word. A halfword comprises two bytes, and a word comprises four bytes. Other memory operand sizes are contemplated for other embodiments.
  • In Fig. 8, a block diagram of exemplary hardware within instruction decompressor 12 for decompressing the immediate field of a load/store instruction is shown. It is noted that multiple copies of the exemplary hardware shown in Fig. 8 may be employed to concurrently decompress multiple load/store instructions.
  • the exemplary hardware shown in Fig. 8 is described in terms of microprocessor 10B. However, similar hardware may be employed within microprocessor 10A.
  • the exemplary hardware includes an immediate field decompressor 170 and a register decoder 172.
  • a portion of the instruction comprising the compressed immediate field for load/store instructions is conveyed to immediate field decompressor 170 upon a compressed immediate bus 174.
  • the compressed immediate field comprises function field 28 shown in Fig. 3A.
  • the base register field for the compressed load/store instruction is conveyed upon a base register bus 176
  • For the exemplary instruction set shown in Figs. 3A-6F, the base register field comprises second register field 26.
  • Register decoder 172 decodes the register identified upon base register bus 176. If the base register is the global pointer register, register decoder 172 asserts a GP signal upon GP line 178 to immediate field decompressor 170. Otherwise, register decoder 172 deasserts the GP signal. Immediate field decompressor 170 decompresses the compressed immediate field in one of two ways, dependent upon the GP signal. If the GP signal is deasserted, then immediate field decompressor 170 clears the most significant bit of the decompressed immediate field. Conversely, immediate field decompressor 170 sets the most significant bit of the immediate field if the GP signal is asserted.
  • Immediate field decompressor 170 conveys the decompressed immediate field upon a decompressed immediate bus 180.
  • Fig. 9 illustrates the decompressed immediate field generated for load/store instructions according to one embodiment of the exemplary compressed instruction set.
  • the compressed immediate field of load/store instructions which do not employ the global pointer register as the base register is decompressed as indicated by reference number 182.
  • the decompression for bytes, halfwords, and words is shown separately, with each bit position of the decompressed immediate field (or offset) represented by a numerical digit or an "L".
  • Bits from the compressed immediate field are shown in the respective bit locations of the decompressed field via the numerical digits.
  • the least significant bit of the compressed immediate field is represented by the digit 0, and the most significant bit of the compressed immediate field is represented by a 4.
  • the letter "L" is used to indicate a bit position which is set to a binary zero.
  • Decompressed immediate fields corresponding to bytes, halfwords, and words for load/store instructions which use the global pointer register as a base register are indicated by reference number 184. Similar to the decompressed fields indicated by reference number 182, the decompressed fields indicated by reference number 184 depict numerals in bit positions which are filled with a bit from the compressed immediate field and the letter "L" is used to indicate a bit position which is set to a binary zero. Additionally, the most significant bit of each decompressed offset is set to a binary one (indicated by the letter "H").
  • In Fig. 10, a flow chart is shown depicting activities performed by instruction decompressor 12 in order to decompress instructions in accordance with the embodiment shown in Fig. 8. Although the steps shown in Fig. 10 are illustrated as serial in nature, it is understood that various steps may be performed in parallel.
  • Instruction decompressor 12 determines if a received instruction is a load/store instruction (decision block 190). If the instruction is not a load/store instruction, the instruction is expanded in accordance with a mapping between the compressed instructions (as illustrated in Figs. 3A-6F) and the corresponding decompressed instructions (step 192). If the instruction is a load/store instruction, then the base register specified by the instruction is examined (decision block 196). If the base register is the global pointer register, the immediate field is decompressed as indicated by reference number 184 in Fig. 9 (step 194). Alternatively, if the base register is not the global pointer register, the immediate field is decompressed as indicated by reference number 182 in Fig. 9 (step 192).
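
The decision sequence of the flow chart can be sketched as a small Python function. This is an illustrative model only: the `CompressedInsn` record, its field names, and the returned path labels are hypothetical, the table-driven expansion of non-load/store instructions is left out, and the base register is assumed to have already been mapped to its MIPS register number ($28 is the global pointer under the MIPS software convention).

```python
from dataclasses import dataclass
from typing import Optional, Tuple

GP_REGISTER = 28  # $gp under the MIPS software convention


@dataclass
class CompressedInsn:
    is_load_store: bool
    base_register: int = 0  # register number after register decompression
    offset: int = 0         # immediate after the rotate/shift expansion


def decompress(insn: CompressedInsn) -> Tuple[str, Optional[int]]:
    """Return (path_taken, 16-bit immediate or None)."""
    # Decision block 190: is this a load/store instruction?
    if not insn.is_load_store:
        # Table-driven expansion; not modeled in this sketch.
        return ("table", None)
    # Decision block 196: is the base register the global pointer?
    if insn.base_register == GP_REGISTER:
        # Form indicated by reference number 184: most significant bit set.
        return ("gp", 0x8000 | (insn.offset & 0x7FFF))
    # Form indicated by reference number 182: zero extended.
    return ("non_gp", insn.offset & 0x7FFF)
```

In hardware the two decisions can be evaluated at the same time, as the text notes; the serial form here is only for readability.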
  • In addition to decompressing load/store offsets in a different manner for the global pointer register, microprocessor 10 also supports a compression mode for indicating which type of instructions are being executed by microprocessor 10 (i.e. compressed or non-compressed).
  • Fig. 11 is a block diagram illustrating a portion of one embodiment of instruction decompressor 12. The illustrated portion determines the compression mode for each routine executed by microprocessor 10. The portion shown may be suitable for microprocessor 10B, and a similar portion may be employed by microprocessor 10A.
  • Fig. 11 depicts a mode detector 200.
  • Mode detector 200 detects when the jump and link (JAL) instruction is fetched, and further examines the exchange bit 94. If exchange bit 94 is set, the routine at the target address of the JAL instruction comprises compressed instructions. Therefore, the compression mode of the target routine is compressed. Alternatively, exchange bit 94 may be clear. In this case, the compression mode of the target routine is uncompressed.
  • the JAL instruction causes the address of the instruction following the JAL instruction to be stored into register $31 of the MIPS RISC architecture. This register may subsequently be used with the JR instruction to return from the target routine. Because the compression mode is stored as part of the address in this embodiment, the compression mode of the source routine is restored upon execution of the JR instruction.
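
Because the mode travels with the address, a call into a routine of the other type and the matching return need no extra bookkeeping. The toy Python model below illustrates this; the `Core` class and its methods are hypothetical, the JAL is assumed to occupy four bytes in either mode, and branch delay slots are ignored for simplicity.

```python
RA = 31  # register $31 receives the return address


class Core:
    """Toy model of a program counter whose bit 0 holds the compression mode."""

    def __init__(self, pc: int) -> None:
        self.pc = pc
        self.regs = [0] * 32

    @property
    def compressed(self) -> bool:
        return bool(self.pc & 1)

    def jal(self, target: int, exchange_bit: int, jal_size: int = 4) -> None:
        # Save the address of the instruction following the JAL. The even
        # increment leaves the caller's mode bit intact in the saved value.
        self.regs[RA] = self.pc + jal_size
        # Enter the target routine in the mode given by exchange bit 94.
        self.pc = (target & ~1) | (exchange_bit & 1)

    def jr(self, reg: int) -> None:
        # Jumping to the saved address restores the saved mode bit as well.
        self.pc = self.regs[reg]
```

For example, a core that starts at 0x400000 in non-compressed mode, calls a compressed routine, and then executes `jr(31)` ends up at 0x400004 with bit 0 clear, so it is back in non-compressed mode.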
  • routines encoded in compressed instructions may be intermixed with routines encoded in non-compressed instructions
  • the new compression mode is conveyed to processor core 16 upon a compression mode line 206
  • mode detector 200 may be included as a part of processor core 16 instead of instruction decompressor 12, in alternative embodiments
  • mode detector 200 shown in Fig. 11 includes a storage 204 for a compression enable bit. If compression is enabled, the compression enable bit is set. When instructions are fetched in compressed mode and compression is enabled, instruction decompressor 12 decompresses the instructions. If the enable bit is clear, instruction compression is disabled for microprocessor 10. Instruction decompressor 12 is bypassed when instruction decompression is disabled. Furthermore, mode detector 200 indicates that the compression mode is non-compressed when instruction compression is disabled.
  • a routine is an ordered set of instructions coded for execution by microprocessor 10
  • the routine may be coded in either compressed or non-compressed instructions, and is delimited by a subroutine call instruction and a return instruction
  • the delimiting subroutine call instruction is not included within the routine. Instead, the subroutine call instruction indicates the beginning of the routine via a target address included with the subroutine call instruction.
  • the first instruction of the routine is stored at the target address
  • the address of an instruction within the routine including the subroutine call instruction is saved so that a return instruction may be executed to return to the calling routine
  • the jal instruction may serve as a subroutine call instruction
  • the jalr instruction may serve as a subroutine call instruction
  • a routine ends with a return instruction, which causes subsequent instruction execution to return to the address saved when the corresponding subroutine call instruction is executed
  • the target address of the return instruction is the saved address
  • the jr instruction may serve as a return instruction
  • a target address is an address at which instruction fetching is to begin upon execution of the instruction corresponding to the target address
  • a block diagram of one embodiment of register field decompression is shown. Other embodiments of register field decompression are contemplated
  • the compressed register field corresponding to an instruction is conveyed upon compressed register field bus 210
  • a register decompressor block 212 receives the compressed register field. Additionally, at least a portion of the compressed register field is incorporated into the decompressed register field which is then conveyed upon decompressed register field bus 214
  • the decompressed register field is thereby formed by concatenating at least a portion of the compressed register field to the value generated by register decompressor block 212
  • the entire compressed register field is concatenated into the decompressed register field. Additionally, the remaining portion of the decompressed register field depends upon which register set the instruction accesses (e.g. xs vs rs and xt vs rt)
  • a set selector signal is received upon set selector bus 216 for each register, indicating whether the xs (xt) or the rs (rt) register set should be used. If the set selector signal is asserted, then xs (xt) is selected. Otherwise, rs (rt) is selected
  • the set selector signal is asserted or deasserted based upon the opcode of the instruction being decompressed, in accordance with the exemplary compressed instruction set shown in Figs. 5A-6F
  • the register mapping between compressed and decompressed registers shown in Table 1 may be employed
  • register decompressor 212 may employ the following logic, wherein DR represents the decompressed register field, CR represents the compressed
  • DR[4 3] ⁇ RH, (RH & CR[2]
  • DR[4 3] ⁇ RH, (RH & CR[2]
  • DR[4 3] ⁇ (RH & CR[2]
  • DRI4 3] ⁇ (RH & CR[2]
  • various registers are assigned to various functions by software convention
  • the MIPS assembler assigns the following meanings to registers
  • microprocessor 10 is incorporated onto a semiconductor substrate 224 along with multiple I/O interfaces 222A-222N
  • the I/O interfaces interface to I/O devices external to substrate 224
  • An exemplary I/O interface 222A may be a universal asynchronous receiver transmitter (UART)
  • Microprocessor 10 may be coupled to I/O interfaces 222 for communication therewith. Additionally, microprocessor 10 may be coupled to external interface logic 226, which further interfaces to one or more dynamic random access memory (DRAM) modules 228. DRAM modules 228 may store compressed and/or non-compressed instruction code, as well as data for use by the program represented by the compressed and/or non-compressed instruction code
  • DRAM dynamic random access memory
  • various signals. As used herein, a signal is “asserted” if it conveys a value indicative of a particular condition. Conversely, a signal is “deasserted” if it conveys a value indicative of a lack of a particular condition
  • a signal may be defined to be asserted when it conveys a logical zero value or, conversely, when it conveys a logical one value
  • Verilog listing describes exemplary logic for instruction decompressor 12. Many different embodiments of the logic are contemplated, although the Verilog listing shown is one suitable example:
  • wire spany ⁇ €xtjal & ci[15] & ⁇ ci[14] & ⁇ ci[ll]
  • wire splor2 spany & ⁇ ci[13]
  • wire special splor2 & ⁇ ci[12]
  • wire br brjal & ⁇ ci[ll]
  • wire word ⁇ [15]&( ⁇ ext
  • wire half ci[15] & ( ⁇ ext
  • a microprocessor has been described which executes instructions from both a compressed instruction set and a non-compressed instruction set
  • the microprocessor expands the compressed instructions into decompressed instructions or directly decodes the compressed instructions
  • routines coded using the compressed instruction set occupy a smaller amount of memory than the corresponding routines coded in non-compressed instructions. Memory formerly occupied by such routines may be freed for use by other routines or data operated upon by such routines.

Abstract

A microprocessor is configured to fetch a compressed instruction set which comprises a subset of a corresponding non-compressed instruction set. The compressed instruction set is a variable length instruction set including 16-bit and 32-bit instructions. The 32-bit instructions are coded using an extend opcode, which indicates that the instruction being fetched is an extended (e.g. 32-bit) instruction. The compressed instruction set further includes multiple sets of register mappings from the compressed register fields to the decompressed register fields. Certain select instructions are assigned two opcode encodings, one for each of two mappings of the corresponding register fields. The compressed register field is directly copied into a portion of the decompressed register field while the remaining portion of the decompressed register field is created using a small number of logic gates. The subroutine call instruction within the compressed instruction set includes a compression mode which indicates whether or not the target routine is coded in compressed instructions. The compression mode is stored in the program counter register. The decompression of the immediate field used for load/store instructions having the global pointer register as a base register is optimized for mixed compressed/non-compressed instruction execution. The immediate field is decompressed into a decompressed immediate field for which the most significant bit is set.

Description

TITLE: AN APPARATUS AND METHOD FOR DETECTING AND DECOMPRESSING
INSTRUCTIONS FROM A VARIABLE-LENGTH COMPRESSED INSTRUCTION SET
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to the field of microprocessors and, more particularly, to optimization of the instruction set of a microprocessor.
2. Description of the Relevant Art
Microprocessor architectures may generally be classified as either complex instruction set computing (CISC) architectures or reduced instruction set computing (RISC) architectures. CISC architectures specify an instruction set comprising high level, relatively complex instructions. Often, microprocessors implementing CISC architectures decompose the complex instructions into multiple simpler operations which may be more readily implemented in hardware. Microcoded routines stored in an on-chip read-only memory (ROM) have been successfully employed for providing the decomposed operations corresponding to an instruction. More recently, hardware decoders which separate the complex instructions into simpler operations have been adopted by certain CISC microprocessor designers. The x86 microprocessor architecture is an example of a CISC architecture.
Conversely, RISC architectures specify an instruction set comprising low level, relatively simple instructions. Typically, each instruction within the instruction set is directly implemented in hardware. Complexities associated with the CISC approach are removed, allowing for more advanced implementations to be designed. Additionally, high frequency designs may be achieved more easily since the hardware employed to execute the instructions is simpler. An exemplary RISC architecture is the MIPS RISC architecture.
Although not necessarily a defining feature, variable-length instruction sets have often been associated with CISC architectures while fixed-length instruction sets have been associated with RISC architectures. Variable-length instruction sets use dissimilar numbers of bits to encode the various instructions within the set as well as to specify addressing modes for the instructions, etc. Generally speaking, variable-length instruction sets attempt to pack instruction information as efficiently as possible into the byte or bytes representing each instruction. Conversely, fixed-length instruction sets employ the same number of bits for each instruction (the number of bits is typically a multiple of eight such that each instruction fully occupies a fixed number of bytes). Typically, a small number of instruction formats comprising fixed fields of information are defined. Decoding each instruction is thereby simplified to routing bits corresponding to each fixed field to logic designed to decode that field.
Because each instruction in a fixed-length instruction set comprises a fixed number of bytes, locating instructions is simplified as well. The location of numerous instructions subsequent to a particular instruction is implied by the location of the particular instruction (i.e. as fixed offsets from the location of the particular instruction). Conversely, locating a second variable-length instruction requires locating the end of the first variable-length instruction, locating a third variable-length instruction requires locating the end of the second variable-length instruction, etc. Still further, variable-length instructions lack the fixed field structure of fixed-length instructions. Decoding is further complicated by the lack of fixed fields.
Unfortunately, RISC architectures employing fixed-length instruction sets suffer from problems not generally applicable to CISC architectures employing variable-length instruction sets. Because each instruction is fixed length, certain of the simplest instructions may effectively waste memory by occupying bytes which do not convey information concerning the instruction. For example, fields which are specified as "don't care" fields for a particular instruction or instructions in many fixed-length instruction sets waste memory. In contrast, variable-length instruction sets pack the instruction information into a minimal number of bytes.
Still further, since RISC architectures do not include the more complex instructions employed by CISC architectures, the number of instructions employed in a program coded with RISC instructions may be larger than the number of instructions employed in the same program coded with CISC instructions. Each of the more complex instructions coded in the CISC version of the program is replaced by multiple instructions in the RISC version of the program. Therefore, the CISC version of a program often occupies significantly less memory than the RISC version of the program. Correspondingly, more bandwidth between devices storing the program, memory, and the microprocessor is needed for the RISC version of the program than for the CISC version of the program.
SUMMARY OF THE INVENTION
The problems outlined above are in large part solved by a microprocessor in accordance with the present invention. The microprocessor is configured to fetch a compressed instruction set which comprises a subset of a corresponding non-compressed instruction set. The non-compressed instruction set may be a RISC instruction set, such that the microprocessor may enjoy the high frequency operation and simpler execution resources typically associated with RISC architectures. Fetching the compressed instructions from memory and decompressing them within the microprocessor advantageously decreases the memory bandwidth required to achieve a given level of performance (e.g. instructions executed per second). Still further, the amount of memory occupied by the compressed instructions may be comparatively less than the corresponding non-compressed instructions may occupy.
The exemplary compressed instruction set described herein is a variable-length instruction set. According to one embodiment, two distinct instruction lengths are included: 16-bit and 32-bit instructions. The 32-bit instructions are coded using an extend opcode, which indicates that the instruction being fetched is an extended (e.g. 32-bit) instruction. Instructions may be fetched as 16-bit quantities. When a 16-bit instruction having the extend opcode is fetched, the succeeding 16-bit instruction is concatenated with the instruction having the extend opcode to form a 32-bit extended instruction. Extended instructions have enhanced capabilities with respect to non-extended instructions, further enhancing the flexibility and power of the compressed instruction set. Routines which employ the capabilities included in the extended instructions may thereby be coded using compressed instructions.
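The extend mechanism described above can be illustrated with a short behavioral model in Python. This is only a sketch under stated assumptions: the position of the opcode field (the top five bits of each 16-bit quantity), the value of `EXTEND_OPCODE`, and the placement of the extend halfword in the upper half of the 32-bit result are all hypothetical choices made for illustration, and the function name `split_instructions` is likewise invented.

```python
# Behavioral sketch of variable-length instruction detection.
# ASSUMPTIONS (not taken from the text): the opcode occupies the top 5 bits
# of each 16-bit halfword, EXTEND_OPCODE is an illustrative value, and the
# extend halfword forms the upper 16 bits of the extended instruction.
OPCODE_SHIFT = 11
EXTEND_OPCODE = 0b11110


def split_instructions(halfwords):
    """Group fetched 16-bit halfwords into (length_in_bits, value) pairs."""
    instructions = []
    i = 0
    while i < len(halfwords):
        first = halfwords[i] & 0xFFFF
        if (first >> OPCODE_SHIFT) == EXTEND_OPCODE:
            # Extended instruction: concatenate with the succeeding halfword.
            if i + 1 >= len(halfwords):
                raise ValueError("extend opcode without a following halfword")
            second = halfwords[i + 1] & 0xFFFF
            instructions.append((32, (first << 16) | second))
            i += 2
        else:
            # Non-extended instruction: a single 16-bit quantity.
            instructions.append((16, first))
            i += 1
    return instructions
```

Note that the length of each instruction is known as soon as its first halfword has been examined, so only two fixed lengths ever need to be distinguished.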
The compressed instruction set further includes multiple sets of register mappings from the compressed register fields to the decompressed register fields. Each value coded in the compressed register fields decompresses to a different register within the microprocessor. In one embodiment, the compressed register fields comprise three bits each. Therefore, eight registers are accessible to a particular instruction. In order to offer access to additional registers for certain select instructions, the select instructions are assigned two opcode encodings. One of the opcode encodings indicates a first mapping of register fields, while the second opcode encoding indicates a second mapping of register fields. Advantageously, the compressed register fields may include relatively few bits while select instructions for which access to additional registers is desired may be granted such access. Additionally, the register mappings are selected to minimize the logic employed to decompress register fields. In one embodiment, the compressed register field is directly copied into a portion of the decompressed register field while the remaining portion of the decompressed register field is created using a small number of logic gates.
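A rough Python model of this style of register decompression is given below. The specific mappings are hypothetical and chosen only to show the structure (low three bits copied unchanged, upper two bits produced by a little logic, and a second mapping selected by the opcode); they are not the mappings defined by the patent's tables. The name `decompress_register` and the `alternate_set` flag are illustrative.

```python
def decompress_register(cr, alternate_set=False):
    """Expand a 3-bit compressed register field into a 5-bit register number.

    The low three bits of the result are a direct copy of the compressed
    field; only the upper two bits are computed. Both mappings below are
    HYPOTHETICAL examples, not the mappings given in the patent.
    """
    if not 0 <= cr <= 7:
        raise ValueError("compressed register field must fit in three bits")
    if alternate_set:
        # Hypothetical second mapping (selected by the second opcode
        # encoding): all eight values land in registers 24 through 31.
        upper = 0b11
    else:
        # Hypothetical first mapping: values 0 and 1 map to registers 16
        # and 17, while values 2 through 7 map to registers 2 through 7.
        upper = 0b10 if cr < 2 else 0b00
    return (upper << 3) | cr
```

Because the low bits pass straight through, the only hardware cost in such a scheme is the logic producing the two upper bits, which is the property the paragraph above highlights.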
The microprocessor supports programs having routines coded in compressed instructions and other routines coded in non-compressed instructions. The subroutine call instruction within the compressed instruction set includes a compression mode which indicates whether or not the target routine is coded in compressed instructions. The compression mode specified by the subroutine call instruction is captured by the microprocessor as the compression mode for the routine. In one embodiment, the compression mode is stored as one of the fetch address bits (stored in a program counter register within the microprocessor). Since the compression mode is part of the fetch address and the subroutine call instruction includes storing a return address for the subroutine, the compression mode of the calling routine is automatically stored upon execution of a subroutine call instruction. When a subroutine return instruction is executed, the compression mode of the calling routine is thereby automatically restored.
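The way the compression mode travels with the fetch address across a call and return can be sketched as follows. The sketch assumes the mode occupies bit 0 of the fetch address; the class `FetchState`, its method names, and the single `return_address` slot standing in for a link register are hypothetical.

```python
COMPRESSED_BIT = 0x1  # ASSUMPTION: bit 0 of the fetch address holds the mode


class FetchState:
    """Minimal model of a program counter that carries the compression mode."""

    def __init__(self, pc):
        self.pc = pc                # fetch address, mode in bit 0
        self.return_address = None  # hypothetical stand-in for a link register

    @property
    def compressed(self):
        return bool(self.pc & COMPRESSED_BIT)

    def call(self, target, target_compressed, next_pc):
        # The saved return address keeps the caller's mode bit, so the
        # caller's compression mode is stored as a side effect of the call.
        self.return_address = (next_pc & ~COMPRESSED_BIT) | (self.pc & COMPRESSED_BIT)
        mode = COMPRESSED_BIT if target_compressed else 0
        self.pc = (target & ~COMPRESSED_BIT) | mode

    def ret(self):
        # Restoring the saved address also restores the caller's mode.
        self.pc = self.return_address
```

For example, a non-compressed caller at address 0x1000 that calls a compressed routine at 0x2000 would fetch in compressed mode until the return, after which the saved address places it back in non-compressed mode without any separate mode-restoring step.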
An additional feature of one embodiment of the microprocessor is the decompression of the immediate field used for load/store instructions having the global pointer register as a base register. The immediate field is decompressed into a decompressed immediate field for which the most significant bit is set. A subrange of addresses at the lower boundary of the global variable address space is thereby allocated for global variables of compressed instructions. Non-compressed instructions may store global variables in the remainder of the global variable address space. Advantageously, global variable allocation between the compressed and non-compressed routines of a particular program may be relatively simple since the subranges are separate.
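The effect of setting the most significant bit can be seen in a small numerical sketch. The field widths here are assumptions made for illustration: an 8-bit compressed immediate, scaled by four, expanded into a 16-bit immediate that is interpreted as a signed offset from the global pointer. The function names are hypothetical.

```python
def decompress_gp_immediate(imm, width=8, shift=2):
    """Expand a compressed immediate into a 16-bit field with bit 15 set.

    ASSUMPTIONS: the compressed field is `width` bits wide and is scaled by
    2**shift; neither value is taken from the text.
    """
    if not 0 <= imm < (1 << width):
        raise ValueError("immediate does not fit in the compressed field")
    return 0x8000 | (imm << shift)


def as_signed16(value):
    """Interpret a 16-bit value as a two's-complement signed offset."""
    return value - 0x10000 if value & 0x8000 else value
```

Under these assumptions every decompressed offset is negative and clustered at the most negative end of the 16-bit range (a compressed immediate of 0 yields an offset of -32768), which is why the globals of compressed routines land in a subrange at the lower boundary of the region addressable from the global pointer.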
Broadly speaking, the present invention contemplates an apparatus for executing instructions from a variable-length compressed instruction set, comprising an instruction decompressor. The instruction decompressor is coupled to receive instructions which are members of the variable-length compressed instruction set, wherein the instruction decompressor is configured to examine an opcode field of a particular instruction. The instruction decompressor is configured to determine that the particular instruction is an extended instruction having a first fixed length if the opcode field is coded as an extend opcode. Additionally, the instruction decompressor is configured to determine that the particular instruction is a non-extended instruction if the opcode field is coded as a second opcode different than the extend opcode.
The present invention further contemplates a method for expanding compressed instructions into decompressed instructions. A compressed instruction is determined to be an extended instruction having a first fixed length if an opcode field of the compressed instruction is an extend opcode. If the opcode field of the compressed instruction is a second opcode different than the extend opcode, the compressed instruction is a non-extended instruction having a second fixed length. The compressed instruction is decompressed into a decompressed instruction. A number of bytes included in the compressed instruction is defined by the first fixed length if the compressed instruction is an extended instruction. Alternatively, the number of bytes is defined by the second fixed length if the compressed instruction is a non-extended instruction.
The present invention still further contemplates an apparatus for expanding compressed instructions into decompressed instructions, comprising a first determining means, a second determining means, and a decompressing means. The first determining means determines that a compressed instruction is an extended instruction having a first fixed length if an opcode field of the compressed instruction is an extend opcode. The second determining means determines that the compressed instruction is a non-extended instruction having a second fixed length if the opcode field of the compressed instruction is a second opcode different than the extend opcode. The decompressing means decompresses the compressed instruction into a decompressed instruction. A number of bytes included in the compressed instruction is defined by the first fixed length if the compressed instruction is the extended instruction. Alternatively, the number of bytes is defined by the second fixed length if the compressed instruction is the non-extended instruction.
The present invention yet further contemplates a method for executing a program including a first routine and a second routine in a microprocessor. A subroutine call instruction is executed within the first routine, wherein the subroutine call instruction indicates that the second routine is to be executed via a target address of the subroutine call instruction. An indication within the subroutine call instruction is examined. If the indication is in a first state, the second routine is determined to be coded using compressed instructions. The second routine is determined to be coded using non-compressed instructions if the indication is in a second state different than the first state.
Furthermore, the present invention contemplates an apparatus for executing a program including a first routine and a second routine in a microprocessor, comprising an executing means and an examining means. The executing means executes a subroutine call instruction within the first routine. The subroutine call instruction indicates that the second routine is to be executed via a target address of the subroutine call instruction. The examining means examines an indication within the subroutine call instruction. The examining means determines that the second routine is coded using compressed instructions if the indication is in a first state. If the indication is in a second state, the examining means determines that the second routine is coded using non-compressed instructions.
The present invention still further contemplates an apparatus for fetching compressed and non-compressed instructions in a microprocessor, comprising a storage device and a mode detector. The storage device stores a compression enable indicator. Coupled to the storage device, the mode detector is configured to detect a compression mode of a target routine upon fetch of a subroutine call instruction specifying the target routine. The mode detector is configured to convey the compression mode to a processor core. The processor core is configured to fetch compressed instructions if the compression mode indicates compressed. Additionally, the processor core is configured to fetch non-compressed instructions if the compression mode indicates non-compressed.
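The interaction between the compression enable indicator and the detected mode can be modeled in a few lines of Python. This is a simplified sketch: the function names are hypothetical, the mode is represented as a string for readability, and the fetch granule sizes (16 bits in compressed mode, 32 bits otherwise) are assumed from the instruction lengths described for one embodiment.

```python
def detect_mode(compression_enabled, target_mode_bit):
    """Return the compression mode conveyed to the processor core.

    When the enable indicator is clear, the mode is reported as
    non-compressed regardless of what the call instruction specifies.
    """
    if compression_enabled and (target_mode_bit & 1):
        return "compressed"
    return "non-compressed"


def fetch_width_bits(mode):
    # ASSUMPTION: 16-bit fetch quantities in compressed mode and 32-bit
    # fixed-length instructions in non-compressed mode.
    return 16 if mode == "compressed" else 32
```

Gating the detected mode with the enable indicator means that a system with compression disabled behaves exactly like one that never fetches compressed code.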
The present invention yet further contemplates a microprocessor comprising an instruction decompressor and a processor core. The instruction decompressor is coupled to receive compressed instructions which are members of a variable-length compressed instruction set. The instruction decompressor is configured to decompress each received compressed instruction into a corresponding decompressed instruction. Coupled to receive decompressed instructions, the processor core is configured to execute the decompressed instructions.
The present invention additionally contemplates a method for executing instruction code. Compressed instructions are fetched, wherein the compressed instructions are members of a variable-length compressed instruction set. The compressed instructions are decompressed in an instruction decompressor, thereby forming corresponding decompressed instructions. The decompressed instructions are executed in a processor core.
The present invention still further contemplates an apparatus for executing instruction code, comprising a fetching means, a decompressing means, and an executing means. The fetching means fetches compressed instructions which are members of a variable-length compressed instruction set. The decompressing means decompresses the compressed instructions, thereby forming corresponding decompressed instructions. The executing means executes the decompressed instructions.
Furthermore, the present invention contemplates an instruction decompressor configured to decompress compressed instructions. A first one of the compressed instructions is codable to access a first subset of registers defined for a corresponding non-compressed instruction set. Additionally, a second one of the compressed instructions is codable to access the first subset of registers and is further codable to access a second subset of registers.
The present invention further contemplates a method for decompressing compressed instructions. A particular compressed instruction having a first register field is decompressed using a first register mapping from compressed register indicators to decompressed register indicators if the particular compressed instruction is encoded using a first opcode. Alternatively, the particular compressed instruction having the first register field is decompressed using a second register mapping from compressed register indicators to decompressed register indicators if the particular compressed instruction is encoded using a second opcode.
The present invention still further contemplates an apparatus for decompressing compressed instructions comprising a decompressing means. The decompressing means is configured to decompress a particular compressed instruction having a first register field using a first register mapping from compressed register indicators to decompressed register indicators if the particular compressed instruction is encoded using a first opcode. Additionally, the decompressing means is configured to decompress the particular compressed instruction using a second register mapping from compressed register indicators to decompressed register indicators if the particular compressed instruction is encoded using a second opcode.
The present invention yet further contemplates an instruction decompressor configured to decompress a compressed register field of a compressed instruction into a decompressed register field of a decompressed instruction. A decompression of the compressed register field is dependent upon a first value coded into the compressed register field and a second value coded into an opcode field of the compressed instruction.
The present invention additionally contemplates a method for decompressing a compressed register field of a compressed instruction into a decompressed register field of a decompressed instruction. At least a portion of the compressed register field is directly copied into a portion of the decompressed register field. The remaining portion of the decompressed register field is produced by logically operating upon the compressed register field.
Moreover, the present invention contemplates an apparatus for decompressing a compressed register field of a compressed instruction into a decompressed register field of a decompressed instruction, comprising a first means and a second means. The first means is for directly copying at least a portion of the compressed register field into a portion of the decompressed register field. The first means is coupled to receive the compressed register field. Similarly coupled to receive the compressed register field, the second means is for logically operating upon the compressed register field to produce a remaining portion of the decompressed register field.
Furthermore, the present invention contemplates an instruction decompressor configured to decompress a compressed register field of a compressed instruction into a decompressed register field of a decompressed instruction. The instruction decompressor forms a first portion of the decompressed register field by copying at least a portion of the compressed register field thereto. Additionally, the instruction decompressor includes a logic block which is configured to operate upon the compressed register field to produce a remaining portion of the decompressed register field.
BRIEF DESCRIPTION OF THE DRAWINGS
Other objects and advantages of the invention will become apparent upon reading the following detailed description and upon reference to the accompanying drawings in which:
Fig. 1 is a block diagram of one embodiment of a microprocessor.
Fig. 2 is a block diagram of a second embodiment of a microprocessor.
Fig. 3A is a first instruction format supported by one embodiment of the microprocessors shown in Figs. 1 and 2.
Fig. 3B is a second instruction format supported by one embodiment of the microprocessors shown in Figs. 1 and 2.
Fig. 3C is a third instruction format supported by one embodiment of the microprocessors shown in Figs. 1 and 2.
Fig. 3D is a fourth instruction format supported by one embodiment of the microprocessors shown in Figs. 1 and 2.
Fig. 4A is a fifth instruction format supported by one embodiment of the microprocessors shown in Figs. 1 and 2.
Fig. 4B is a sixth instruction format supported by one embodiment of the microprocessors shown in Figs. 1 and 2.
Fig. 4C is a seventh instruction format supported by one embodiment of the microprocessors shown in Figs. 1 and 2.
Fig. 4D is an eighth instruction format supported by one embodiment of the microprocessors shown in Figs. 1 and 2.
Figs. 5A, 5B, 5C, 5D, and 5E are tables of exemplary instructions using the formats shown in Figs. 3A, 3B, 3C, and 3D.
Figs. 6A, 6B, 6C, 6D, 6E, and 6F are tables of exemplary instructions using the formats shown in Figs. 4A, 4B, 4C, and 4D.
Fig. 7 is a diagram depicting offsets from an arbitrary register and a global pointer register, according to one embodiment of the microprocessors shown in Figs. 1 and 2.
Fig. 8 is a block diagram of exemplary hardware for expanding an immediate field from a compressed instruction to a decompressed instruction.
Fig. 9 is a diagram depicting decompressed offsets in accordance with one embodiment of the microprocessors shown in Figs. 1 and 2.
Fig. 10 is a flow chart depicting operation of a decompressor for immediate fields according to one embodiment of the microprocessors shown in Figs. 1 and 2.
Fig. 11 is a block diagram of exemplary hardware for generating fetch addresses according to one embodiment of the microprocessors shown in Figs. 1 and 2.
Fig. 12 is a block diagram showing register decompression logic employed in one embodiment of the microprocessors shown in Figs. 1 and 2.
Fig. 13 is a block diagram of an exemplary computer system including the microprocessor for which embodiments are shown in Figs. 1 and 2.
While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents and alternatives falling within the spirit and scope of the present invention as defined by the appended claims.
DETAILED DESCRIPTION OF THE INVENTION
Turning now to Fig. 1, a block diagram of a first embodiment of a microprocessor 10A is shown. Microprocessor 10A includes an instruction decompressor 12A, an instruction cache 14A, and a processor core 16. Instruction decompressor 12A is coupled to receive instruction bytes from a main memory subsystem (not shown). Instruction decompressor 12A is further coupled to instruction cache 14A. Instruction cache 14A is coupled to processor core 16.
Generally speaking, microprocessor 10A is configured to fetch compressed instructions from the main memory subsystem. The compressed instructions are passed through instruction decompressor 12A, which expands the compressed instructions into decompressed instructions for storage within instruction cache 14A. Many of the compressed instructions occupy fewer memory storage locations than the corresponding decompressed instructions, advantageously reducing the amount of memory required to store a particular program. Additionally, since instructions are decompressed within microprocessor 10A, the bandwidth required to transport the compressed instructions from the main memory subsystem to microprocessor 10A is reduced. Microprocessor 10A may be employed within a computer system having a relatively small main memory. Relatively large programs may be stored in the main memory due to the compression of instructions stored therein.
In one embodiment, microprocessor 10A is configured to execute both compressed and non-compressed instructions on a routine-by-routine basis. In other words, a routine may be coded using either compressed instructions or non-compressed instructions. Advantageously, routines which may not be efficiently coded in the compressed instruction set may be coded using non-compressed instructions, while routines which are efficiently coded in the compressed instruction set are so coded. Microprocessor 10A may support a particular decompression of the immediate field for load/store instructions using the global pointer register as a base register, in order to support mixing of compressed and non-compressed instructions. The particular decompression is detailed further below. Additionally, a compression mode is detected by instruction decompressor 12A. The compression mode identifies the instruction set in which a routine is coded: compressed or non-compressed.
Instruction compression is achieved in microprocessor 10A by imposing certain limitations upon the available instruction encodings. By limiting the instruction encodings, instruction field sizes may be reduced (i.e. the number of bits within an instruction field may be decreased). For example, the number of available registers may be reduced to form the compressed instruction set. Because fewer registers are available, a smaller field may be used to encode the registers used as source and destination operands for the instruction. Instruction decompressor 12A expands the encoded register field into a decompressed register field. The decompressed register field is included in the decompressed instruction. The compressed instructions use the reduced instruction fields, thereby occupying less memory (i.e. fewer bits) than the original instruction encodings defined by the microprocessor architecture employed by processor core 16.
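The saving from a smaller register file can be made concrete with a little arithmetic. The Python sketch below computes the field width needed to name one of a given number of registers; the function name is hypothetical, and the comparison of 32 registers against 8 is used only as an illustrative pair of sizes.

```python
def register_field_bits(register_count):
    """Number of bits needed to select one of `register_count` registers."""
    if register_count < 1:
        raise ValueError("at least one register is required")
    return (register_count - 1).bit_length()


# Illustrative comparison: 32 addressable registers versus 8.
full_bits = register_field_bits(32)      # 5 bits per register field
reduced_bits = register_field_bits(8)    # 3 bits per register field
saved_per_field = full_bits - reduced_bits
```

With these sizes each register field shrinks by two bits, so an instruction naming three registers would save six bits from its register fields alone.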
Instruction decompressor 12A is configured to accept compressed instructions and to decompress the instructions into the original instruction encodings. Each instruction field within a particular compressed instruction is expanded from the compressed field to a corresponding decompressed field within the corresponding decompressed instruction. The decompressed instruction is coded in the original instruction format supported by processor core 16.
Processor core 16 includes circuitry for fetching instructions from instruction cache 14A, decoding the instructions, and executing the instructions. The instructions supported by processor core 16 are specified by the microprocessor architecture employed therein. In one particular embodiment, processor core 16 employs the MIPS RISC architecture. However, it is understood that processor core 16 may employ any microprocessor architecture. Since instruction decompressor 12A decompresses instructions into the original instruction format, processor core 16 may comprise a previously designed processing core. In other words, the processing core may not require substantial modification to be included within microprocessor 10A. The MIPS RISC architecture specifies an instruction set comprising 32-bit fixed-length instructions.
A compressed instruction set is defined for miαoprocessor 10A which compnses vanable-length instructions Many of the compressed instructions compπse 16-bit instructions Other compressed instructions compnsed 32 bit instructions in conjunction with the extend instruction descnbed below Several 16-bit and 32-bit instruction formats are defined It is understood that, although 16-bit and 32-bit compressed instructions are used in this embodiment, other embodiments may employ different instruction lengths The compressed instructions encode a subset of the non-compressed instructions Instruction encodings supported within the compressed instruction set compnse many of the most commonly coded instructions as well as the most often used registers, such that many programs, or routines withm the programs, may be coded using the compressed instructions
In one embodiment, microprocessor 10A employs a compression mode. If the compression mode is active, then compressed instructions are being fetched and executed. Instruction decompressor 12A decompresses the instructions when they are transferred from main memory to instruction cache 14A.
Alternatively, the compression mode may be inactive. When the compression mode is inactive, non-compressed instructions are being fetched and executed. Instruction decompressor 12A is bypassed when the compression mode is inactive. In one particular embodiment, the compression mode is indicated by a bit within the fetch address (e.g. bit 0). The current fetch address may be stored in a PC register 18 within processor core 16. Bit 0 of PC register 18 indicates the compression mode (CM) of microprocessor 10A.
Instruction cache 14A is a high speed cache memory configured to store decompressed and non-compressed instructions. Although any cache organization may be employed by instruction cache 14A, a set associative or direct mapped configuration may be suitable for the embodiment shown in Fig. 1.
Turning next to Fig. 2, a second embodiment of a microprocessor 10B is shown. Microprocessor 10B includes an instruction cache 14B coupled to receive instruction bytes from the main memory subsystem, an instruction decompressor 12B, and processor core 16. Instruction cache 14B is coupled to instruction decompressor 12B, which is further coupled to processor core 16.
Microprocessor 10B is configured with instruction decompressor 12B between instruction cache 14B and processor core 16. Instruction cache 14B stores the compressed instructions transferred from the main memory subsystem. In this manner, instruction cache 14B may store a relatively larger number of instructions than a similarly sized instruction cache employed as instruction cache 14A in microprocessor 10A. Instruction decompressor 12B receives fetch addresses corresponding to instruction fetch requests from processor core 16, and accesses instruction cache 14B in response to the fetch request. The corresponding compressed instructions are decompressed into decompressed instructions by instruction decompressor 12B. The decompressed instructions are transmitted to processor core 16.
Similar to microprocessor 10A, microprocessor 10B includes a compression mode in one embodiment. Instruction decompressor 12B is bypassed when non-compressed instructions are being fetched and executed. For this embodiment, instruction cache 14B stores both compressed and non-compressed instructions. It is noted that instruction cache 14B typically stores instruction bytes in fixed-size storage locations referred to as cache lines. Therefore, a particular cache line may be storing compressed or non-compressed instructions. In either case, a plurality of instruction bytes are stored. Therefore, instruction caches 14A and 14B may be of similar construction. The compression mode at the time a cache line is accessed determines whether the instruction bytes are interpreted as compressed or non-compressed instructions.
An alternative configuration for microprocessor 10B is to include instruction decompressor 12B within the instruction decode logic of processor core 16. The compressed instructions may not actually be decompressed in such an embodiment. Instead, the compressed instructions may be decoded directly by the decode logic. The decoded instructions may be similar to the decoded instructions generated for the non-compressed instructions which correspond to the compressed instructions.
It is noted that microprocessors 10A and 10B are merely exemplary embodiments of a microprocessor 10 which operates upon compressed instructions. For the remainder of this discussion, microprocessor 10, instruction cache 14, and instruction decompressor 12 will be used to refer to the corresponding elements of both Figs. 1 and 2 as well as other embodiments of the elements included in other implementations of microprocessor 10.
The terms decompression, compressed instruction, decompressed instruction, and non-compressed instruction are used in the above discussion and may further be used below. As used herein, the term "compressed instruction" refers to an instruction which is stored in a compressed form in memory. The compressed instruction is generally stored using fewer bits than the number of bits used to store the instruction when represented as defined in the microprocessor architecture employed by processor core 16. The term "decompressed instruction" refers to the result of expanding a compressed instruction into the original encoding as defined in the microprocessor architecture employed by processor core 16. The term "non-compressed instruction" refers to an instruction represented in the encoding defined by the microprocessor architecture employed by processor core 16. Non-compressed instructions are also stored in memory in the same format (i.e. non-compressed instructions were never compressed). Finally, the term "decompression" refers to the process of expanding a compressed instruction into the corresponding decompressed instruction.
It is noted that instruction decompressors 12A and 12B may be configured to simultaneously decompress multiple compressed instructions. Such embodiments of instruction decompressors 12 may be employed with embodiments of processor core 16 which execute multiple instructions per clock cycle.
Figs. 3A-3D and 4A-4D depict exemplary instruction formats for 16-bit and 32-bit compressed instructions, respectively, according to one specific embodiment of microprocessor 10 employing the MIPS RISC architecture. Other instruction formats may be employed by other embodiments. The instruction formats shown in Figs. 3A-3D each comprise 16 bits in this particular implementation. Conversely, the instruction formats shown in Figs. 4A-4D each comprise 32 bits in this particular implementation. The compressed instructions encoded using the instruction formats are decompressed into instruction formats as defined by the MIPS RISC architecture for each instruction.
Fig. 3A depicts a first instruction format 20. Instruction format 20 includes an opcode field 22, a first register field 24, a second register field 26, and a function field 28. Opcode field 22 is used to identify the instruction. Additionally, function field 28 is used in conjunction with certain particular encodings of opcode field 22 to identify the instruction. Effectively, function field 28 and opcode field 22 together form the opcode field for these instructions. When opcode field 22 employs encodings other than the particular encodings, function field 28 is used as an immediate field. First register field 24 and second register field 26 identify destination and source registers for the instruction. The destination register is also typically used as a source register for the instruction. In this manner, two source operands and one destination operand are specified via first register field 24 and second register field 26. The notations "RT" and "RS" in first register field 24 and second register field 26 indicate the use of the fields in the instruction tables below. Either RT or RS may be a destination register, depending upon the encoding of the instruction.
In one embodiment, opcode field 22 comprises 5 bits, first register field 24 and second register field 26 comprise 3 bits each, and function field 28 comprises 5 bits. First register field 24 is divided into two subfields (labeled RT1 and RT0). RT1 comprises two bits in the present embodiment, while RT0 comprises one bit. RT1 is concatenated with RT0 to form first register field 24. Subfield RT1 and second register field 26 are used in certain instructions encoded via instruction format 20 to indicate one of the 32 registers defined by the MIPS RISC architecture.
Fig. 3B depicts a second instruction format 30. Instruction format 30 includes opcode field 22, first register field 24, and second register field 26. Additionally, a third register field 32 and a function field 34 are shown. Third register field 32 is generally used to identify the destination register for instructions using instruction format 30. Therefore, first register field 24 and second register field 26 identify source registers for instruction format 30. Function field 34 is used similarly to function field 28. In the embodiment shown, third register field 32 comprises three bits and function field 34 comprises two bits.
A third instruction format 40 is shown in Fig. 3C. Instruction format 40 includes opcode field 22 and second register field 26, as well as an immediate field 42. Immediate field 42 is used to provide immediate data for the instruction specified by instruction format 40. Immediate data is an operand of the instruction, similar to the value stored in a register specified by first register field 24 or second register field 26. For example, an add instruction which uses immediate data adds the immediate data to the value stored in the destination register, and stores the resulting sum into that destination register. In one embodiment, immediate field 42 comprises eight bits. Immediate field 42 is divided into two subfields (IMM1 and IMM0) in the instruction format shown in Fig. 3C. The subfields allow second register field 26 to be placed in the same bit positions within instruction format 40 as it is placed in instruction formats 20 and 30. Advantageously, second register field 26 is always found in the same position of 16-bit instructions in which it is used. Therefore, subfield IMM1 comprises 2 bits and subfield IMM0 comprises 6 bits. IMM1 is concatenated with IMM0 to form the immediate value.
Fig. 3D depicts a fourth instruction format 50. Instruction format 50 includes opcode field 22 and an immediate field 52. Immediate field 52, similar to immediate field 42, is used as an operand of the instruction. However, immediate field 52 comprises 11 bits.
Fig. 4A depicts a fifth instruction format 60. Instruction format 60 includes opcode field 22, which is coded as the extend instruction. Instruction decompressor 12 recognizes the extend instruction opcode within opcode field 22 and treats the current instruction as a 32-bit instruction (i.e. the 16 bits included in the instruction containing the extend opcode and the 16 bits which would otherwise comprise the next instruction in program order are concatenated to form a 32-bit instruction). Therefore, the compressed instruction set can be seen to be a variable-length instruction set comprising 16-bit instructions and 32-bit instructions. Instruction format 60 further includes a zero field 62 comprising six bits (coded to all binary zeros), an immediate field 64, and a BR field 66. Instruction format 60 is used to code an extended form of the BR instruction (an unconditional branch instruction), and hence BR field 66 is an opcode field indicating the BR instruction. In one embodiment, the BR opcode is hexadecimal 02.
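It is noted that the grouping of 16-bit portions into 16-bit and 32-bit compressed instructions may be illustrated by the following minimal sketch. The sketch assumes that opcode field 22 occupies the five most significant bits of each 16-bit portion, and uses a hypothetical extend opcode value of hexadecimal 1f; neither the bit positions nor the extend encoding is given in the text, and the function names are illustrative only.

```python
# Illustrative sketch only. EXTEND_OPCODE and the position of opcode
# field 22 are assumptions; the text does not specify them.
EXTEND_OPCODE = 0x1F


def opcode_of(halfword):
    """Return the assumed 5-bit opcode field 22 (top five bits)."""
    return (halfword >> 11) & 0x1F


def split_instructions(halfwords):
    """Group 16-bit portions into (length_in_bits, value) instructions."""
    instructions = []
    i = 0
    while i < len(halfwords):
        first = halfwords[i] & 0xFFFF
        if opcode_of(first) == EXTEND_OPCODE:
            if i + 1 >= len(halfwords):
                raise ValueError("extend opcode without a following 16-bit portion")
            second = halfwords[i + 1] & 0xFFFF
            # The two 16-bit portions are concatenated into one 32-bit instruction.
            instructions.append((32, (first << 16) | second))
            i += 2
        else:
            instructions.append((16, first))
            i += 1
    return instructions
```

For example, a stream containing one extended instruction between two 16-bit instructions yields three instructions, the middle one being 32 bits long.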
The extended BR instruction has a larger immediate field than the non-extended BR instruction, and therefore may be coded with larger offsets than the non-extended BR instruction. When a branch to an instruction distant from the branch instruction is desired, the extended BR instruction may be used. Alternatively, branches to close instructions may use the non-extended BR instruction. Immediate field 64 comprises 16 bits which are used as an offset to be added to the address of the instruction following the BR instruction to create the target address of the branch instruction. The non-extended BR instruction, by contrast, includes an eleven bit offset (i.e. it is coded using instruction format 50).
Fig. 4B depicts an instruction format 70 which is an extended version of instruction format 40. Instruction format 70 includes opcode field 22 coded as the extend opcode, as well as an immediate field 72, a first register field 74, a second register field 76, and a second opcode field 78. First register field 74 and second register field 76 comprise five bits each in the embodiment shown. Therefore, any register defined by the MIPS RISC architecture may be accessed using instruction format 70. Second opcode field 78 defines the instruction being executed, and comprises 5 bits (similar to opcode field 22). Finally, immediate field 72 comprises 12 bits divided into a one bit IMM2 subfield, a five bit IMM1 subfield, and a six bit IMM0 subfield. Immediate field 72 is formed by concatenating IMM2 with IMM1 and further with IMM0 in the embodiment shown.
An extended instruction format corresponding to instruction format 30 is shown in Fig. 4C as an instruction format 80. Instruction format 80 includes opcode field 22, first register field 74, second register field 76, and second opcode field 78, similar to instruction format 70. Additionally, instruction format 80 includes a third register field 82 and a function field 84. Third register field 82 is similar to third register field 32, except that third register field 82 comprises five bits. Therefore, any MIPS RISC architecture register may be specified by third register field 82. Function field 84 is similar to function fields 28 and 34, except that function field 84 comprises six bits.
Second opcode field 78 is coded to a particular value to distinguish instruction format 80 from instruction format 70. When second opcode field 78 is coded to the particular value, instruction format 80 is assumed by instruction decompressor 12. Conversely, when second opcode field 78 is coded to a value other than the particular value, instruction format 70 is assumed by instruction decompressor 12. In one embodiment, the particular value comprises hexadecimal 00.
Instruction format 80 further includes a COP0 bit 86. COP0 bit 86, when set, indicates that certain coprocessor zero instructions (as defined in the MIPS RISC architecture) are being executed. The tables of instructions below further define the instructions encoded by setting COP0 bit 86.
The instructions defined for instruction formats 20, 30, 40, and 50 are capable of performing many of the operations commonly performed in typical programs. However, routines may need to perform operations of which these instructions are incapable. While most of the instructions in the routine may be coded using instruction formats 20-50, several instructions may require additional encodings. For example, access to a register not included within the subset of available registers in formats 20-50 may be needed. Additional instructions not included in the instructions encoded using formats 20-50 may be needed. For these and other reasons, the extend opcode and extended instruction formats 60-80 are defined.
Instruction decompressor 12 examines opcode field 22 in order to detect the extend opcode. The extend opcode is one of the opcodes defined to use instruction format 50 in the present embodiment, although the bits included in immediate field 52 are assigned differing interpretations depending upon the extended instruction format coded for the particular extended instruction. The extended instruction formats include a second opcode field (e.g. fields 66 and 78) which identifies the particular extended instruction.
Addition of the extend opcode and extended instruction formats allows for many instructions to be encoded using the narrower instruction formats 20-50, but still have the flexibility of the wider extended instruction formats when desired. Programs which occasionally make use of the functionality included in the extended instruction formats may still achieve a reduced memory footprint, since these programs may be encoded using compressed instructions and many of the compressed instructions may comprise 16-bit compressed instructions.
An embodiment of microprocessor 10 may handle the extended instructions by fetching 16-bit instruction portions and detecting the extend opcode. When the extend opcode is detected, a NOP may be transmitted to processor core 16 and the remaining 16-bit portion of the extended instruction may be fetched. The extended instruction is decompressed and provided as the next instruction after the NOP.
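The NOP-based handling may be sketched as a small state machine which holds the first 16-bit portion until the second portion arrives. This is a minimal sketch under stated assumptions: the class name, the callables `decompress16` and `decompress32`, the position of opcode field 22 (top five bits), and the extend opcode value are all hypothetical. The NOP value of all zeros is the conventional MIPS encoding (SLL $0, $0, 0).

```python
MIPS_NOP = 0x00000000  # conventional MIPS NOP encoding (SLL $0, $0, 0)


class ExtendHandler:
    """Hypothetical model of NOP insertion for extended instructions."""

    def __init__(self, decompress16, decompress32, extend_opcode=0x1F):
        # decompress16/decompress32 are caller-supplied placeholders; the
        # extend opcode value is an assumption, not taken from the text.
        self.decompress16 = decompress16
        self.decompress32 = decompress32
        self.extend_opcode = extend_opcode
        self.pending = None  # first portion of an extended instruction

    def feed(self, halfword):
        """Accept one 16-bit portion; return what is sent to the core."""
        halfword &= 0xFFFF
        if self.pending is not None:
            first, self.pending = self.pending, None
            return self.decompress32((first << 16) | halfword)
        if (halfword >> 11) & 0x1F == self.extend_opcode:
            self.pending = halfword  # hold until the rest is fetched
            return MIPS_NOP
        return self.decompress16(halfword)
```

Because the first portion is retained in `pending`, the same structure tolerates an arbitrary delay before the second portion becomes available.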
Additionally, instruction decompressor 12 handles cases wherein a portion of the extended instruction is available while a second portion is unavailable. For example, two portions of the extended instruction may lie within two distinct cache lines within instruction cache 14. Therefore, one portion of the instruction may be fetched from instruction cache 14 while the other portion may not reside within instruction cache 14. The portion may then need to be stored within instruction decompressor 12 until the remaining portion is available.
Finally, Fig. 4D is an instruction format 90 used to explicitly expand the JAL instruction of the MIPS RISC instruction set. The JAL instruction is often used as a subroutine call instruction. Subroutines may be stored in memory at a great distance (address-wise) from the calling routine. Therefore, having the largest possible range of relative offsets (via an immediate field 92 comprising 26 bits) is important for the JAL instruction. Additionally, an exchange bit 94 is included in the instruction encoding. The exchange bit is used to indicate the compressed/non-compressed nature of the instructions at the target address. If the bit is set, the target instructions are compressed instructions. If the bit is clear, the target instructions are non-compressed instructions. The value of exchange bit 94 is copied into bit 0 of the program counter within processor core 16. Bit 0 of the program counter may always be assumed to be zero, since the sixteen bit and thirty-two bit instructions occupy at least two bytes each and instructions are stored at aligned addresses. Therefore, bit zero is a useful location for storing the compression mode of the current routine. Processor core 16 increments fetch addresses by 2 (instead of 4) when bit 0 is set, thereby fetching 16-bit compressed instructions through instruction decompressor 12.
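The use of bit 0 of the program counter as the compression mode may be sketched as follows. The function names are hypothetical, and the target address is taken as already computed, since the text does not detail how immediate field 92 is combined with the program counter.

```python
def jal_target_pc(target_address, exchange_bit):
    """Copy exchange bit 94 into bit 0 of the new program counter.

    Instructions are at least two-byte aligned, so bit 0 of the address
    itself is always zero and is free to hold the compression mode.
    """
    return (target_address & ~1) | (exchange_bit & 1)


def compression_mode(pc):
    """Return 1 if compressed instructions are being fetched, else 0."""
    return pc & 1


def next_fetch_pc(pc):
    """Advance by 2 bytes in compression mode and by 4 bytes otherwise."""
    step = 2 if compression_mode(pc) else 4
    return pc + step
```

Note that adding 2 to a program counter with bit 0 set leaves bit 0 set, so the compression mode persists across sequential fetches without separate state.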
Each instruction within the compressed instruction set employed by microprocessor 10 uses at least one of the instruction formats shown in Figs. 3A-3D and Figs. 4A-4D. It is noted that opcode field 22 is included in each instruction format, and is located in the same place within each instruction format. The coding of opcode field 22 determines which instruction format is used to interpret the remainder of the instruction. A first portion of the opcode field encodings is assigned to instruction format 20, a second portion of the opcode field encodings is assigned to instruction format 30, etc.
As used herein, the term "instruction field" refers to one or more bits within an instruction which are grouped and assigned an interpretation as a group. For example, opcode field 22 comprises a group of bits which are interpreted as the opcode of the instruction. Additionally, first and second register fields 24 and 26 comprise register identifiers which identify storage locations within processor core 16 which store operands of the instruction. Additionally, the term "immediate field" refers to an instruction field in which immediate data is coded. Immediate data may provide an operand for an instruction. Alternatively, immediate data may be used as an offset to be added to a register value, thereby producing an address. Still further, immediate data may be used as an offset for a branch instruction.
Figs. 5A-6F are tables listing an exemplary compressed instruction set for use by one particular implementation of microprocessor 10. The particular implementation employs the MIPS RISC architecture within processor core 16. Therefore, the instruction mnemonics listed in an instruction column 100 of the tables correspond to instruction mnemonics defined in the MIPS RISC architecture (or defined for the instruction assembler, as described in "MIPS RISC Architecture" by Kane and Heinrich, Appendix D, Prentice Hall PTR, Upper Saddle River, New Jersey, 1992, incorporated herein by reference) with the following exceptions: CMPI, MOVEI, MOVE, NEG, NOT, and extend. These instructions translate to the following MIPS instructions (RS and RT refer to the 16-bit RS and RT):
CMPI     XORI $24, RS, imm8
MOVEI    ADDIU RS, $0, simm8
MOVE     ADD RS, $0, RT
NEG      SUB RS, $0, RT
NOT      NOR RS, $0, RT
extend   (described above)
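The translations listed above may be expressed as a simple lookup, shown here as an illustrative sketch. The dictionary and function names are hypothetical, and the operands are handled as plain text for clarity rather than as encoded fields.

```python
# Mnemonic translations taken from the list above; names are illustrative.
TRANSLATIONS = {
    "CMPI": "XORI $24, {rs}, {imm}",
    "MOVEI": "ADDIU {rs}, $0, {imm}",
    "MOVE": "ADD {rs}, $0, {rt}",
    "NEG": "SUB {rs}, $0, {rt}",
    "NOT": "NOR {rs}, $0, {rt}",
}


def translate(mnemonic, rs="", rt="", imm=""):
    """Return the MIPS instruction text for a compressed-set mnemonic."""
    return TRANSLATIONS[mnemonic].format(rs=rs, rt=rt, imm=imm)
```

For instance, `translate("NEG", rs="$4", rt="$5")` produces the text `SUB $4, $0, $5`.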
Additionally, the instruction tables use several symbols. In an operands column 102, the symbols rs, rt, xs, xt and rd are used. Rs and xs refer to second register field 26 (or second register field 76), while rt and xt refer to first register field 24 (or first register field 74). Similarly, rd refers to third register field 32 (or third register field 82). As mentioned for one embodiment above, first register field 24, second register field 26, and third register field 32 comprise three bits each. Table 1 below lists the mapping of the field encodings (listed in binary) to registers in the MIPS RISC architecture for these symbols. Other mappings are also contemplated, as shown further below. Names assigned according to MIPS assembler convention are also listed in Table 1.
Table 1: Register Mappings
Field Encoding    RS, RT, RD    XS, XT
000               $8 (t0)       $24 (t8)
001               $1 (at)       $17 (s1)
010               $2 (v0)       $18 (s2)
011               $3 (v1)       $19 (s3)
100               $4 (a0)       $28 (gp)
101               $5 (a1)       $29 (sp)
110               $6 (a2)       $30 (s8)
111               $7 (a3)       $31 (ra)
As shown in Table 1, up to 16 registers are available for use in compressed instructions having register fields 24, 26, or 32. Because each register field is three bits, only eight registers are available for a given opcode. Instructions which may access all sixteen registers are assigned two opcodes in the instruction tables below. Register selection is thereby a function of both a register field and opcode field 22. Advantageously, register fields may be encoded using fewer bits while still providing select instructions which may access a large group of registers.
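The expansion of a three bit register field into a MIPS register number according to Table 1 may be sketched as a pair of lookup tables. The names below are hypothetical, and the boolean parameter stands in for the selection which, per the text, is made by the encoding of opcode field 22.

```python
# Register numbers from Table 1, indexed by the 3-bit field encoding.
RS_RT_RD = [8, 1, 2, 3, 4, 5, 6, 7]
XS_XT = [24, 17, 18, 19, 28, 29, 30, 31]


def expand_register(field, use_x_mapping=False):
    """Expand a 3-bit register field to a 5-bit MIPS register number.

    use_x_mapping models the choice made by the opcode: instructions
    able to reach all sixteen registers are assigned two opcodes, one
    per column of Table 1.
    """
    if not 0 <= field <= 7:
        raise ValueError("register field must be three bits")
    table = XS_XT if use_x_mapping else RS_RT_RD
    return table[field]
```

Encoding 101, for example, selects register $5 (a1) under the first mapping and register $29 (sp) under the second.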
Also listed in operands column 102 are symbols for the immediate fields 42, 52, 64, and 72. The symbol "imm" indicates an immediate field is included. If "imm" is preceded by an "s", the immediate field is signed and the decompression of the immediate field into the decompressed instruction is performed by sign extending the immediate field. If "imm" is not preceded by an "s", the immediate field is unsigned and immediate field decompression involves zero extending the immediate field. In one embodiment, immediate field decompression for load/store instructions comprises right rotation of the immediate bits by one bit for halfwords and two bits for words, followed by shifting of the immediate bits left by one bit for halfwords and two bits for words. Effectively, a seven bit immediate field is provided for words and a six bit immediate field for halfwords (in the 16-bit instruction formats). The MIPS RISC architecture defines that data addresses corresponding to load/store instructions are aligned for each instruction included in the exemplary compressed instruction set. Therefore, the least significant bit (for halfwords) and the two least significant bits (for words) may be set to zero. Bits in the compressed immediate field need not be used to specify these bits. Finally, "imm" is post-fixed with a number indicating the number of bits included in the immediate field.
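The immediate field decompression described above may be illustrated by the following sketch. The function names are hypothetical, and the treatment of byte-sized operands (no rotation or shift) is an assumption, since the text describes only halfwords and words.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a signed quantity."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def zero_extend(value, bits):
    """Interpret the low `bits` bits of value as an unsigned quantity."""
    return value & ((1 << bits) - 1)


def rotate_right(value, bits, count):
    """Rotate a `bits`-wide value right by `count` positions."""
    mask = (1 << bits) - 1
    value &= mask
    return ((value >> count) | (value << (bits - count))) & mask


def load_store_offset(imm5, size_log2):
    """Decompress a 5-bit load/store immediate.

    size_log2 is 1 for halfwords and 2 for words; 0 (bytes) is an
    assumed case in which the field is used unchanged.
    """
    return rotate_right(imm5, 5, size_log2) << size_log2
```

With word-sized operands the two low-order bits of the compressed field become the two high-order bits of a seven bit offset, so the offsets reachable are the multiples of 4 from 0 through 124.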
Opcode field 22 and function field 28 are decompressed as well. More particularly, opcode field 22 and function field 28 identify the instruction within the MIPS RISC architecture, in accordance with the tables shown in Figs. 5A-6F. The opcode and function fields of the decompressed instructions are coded in accordance with the MIPS RISC architecture definition.
Figs. 5A and 5B depict a table 110 and a table 112, respectively. Tables 110 and 112 list instructions from the exemplary compressed instruction set which use instruction format 20 shown in Fig. 3A. Instruction column 100 and operands column 102 are included, as well as an opcode column 106 and a function column 104. Opcode column 106 and function column 104 include hexadecimal numbers, and correspond to opcode field 22 and function field 28, respectively.
Table 110 includes several instructions which have an "imm5" coding in function column 104. The "imm5" coding appears for the load/store instructions within table 110, and indicates that function field 28 is used as an immediate field for these instructions. For other instructions, function field 28 is used in conjunction with opcode field 22 to identify a particular instruction within the compressed instruction set. Additionally, opcode 1d is labeled as special in table 110. The special instructions have a specific interpretation of function field 28. In particular, if the most significant bit of the function field is clear, then the instruction is defined to be
ADDIU rt, rs, simm4
wherein the "simm4" operand is formed from the remaining bits of function field 28. If the most significant bit of function field 28 is set, the instruction is defined to be
ADDIU xt, xs, simm4
except for two special cases. If second register field 26 is coded to a zero, then the instruction is
MOVEI xt, imm4
wherein again the imm4 operand is formed from the remainder of function field 28. Lastly, if second register field 26 is coded to 5 (hexadecimal), then the instruction is defined to be
ADDIU sp, simm9
wherein the simm9 operand is formed from the remaining bits of function field 28 and first register field 24. The low order two bits of the simm9 operand are set to zero.
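The interpretation of function field 28 for the special opcode may be sketched as the following dispatch. The function name is hypothetical. The sketch returns only the instruction form and the raw low-order four bits of the function field; the simm9 operand is not assembled, because the text does not state the order in which the bits of function field 28 and first register field 24 are combined.

```python
def decode_special(rt, rs, function):
    """Select the instruction form for the special opcode.

    rt and rs are the 3-bit first and second register fields; function
    is the 5-bit function field 28. Returns (form, raw_low_bits).
    """
    msb = (function >> 4) & 1
    low4 = function & 0xF
    if not msb:
        return ("ADDIU rt, rs, simm4", low4)
    if rs == 0:
        return ("MOVEI xt, imm4", low4)
    if rs == 5:
        # simm9 also draws on first register field 24; the bit order is
        # not given in the text, so the operand is left unassembled.
        return ("ADDIU sp, simm9", None)
    return ("ADDIU xt, xs, simm4", low4)
```

The two special cases are tested before the general xt, xs form, since they apply only when the most significant bit of the function field is set.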
It is noted that the destination of the SLT and SLTU instructions shown in table 110 is the t8 register (register $24) according to one embodiment.
Table 112 shows an "imm3" and "imm6" operand for several instructions. The "imm3" operand is coded into second register field 26, and the "imm6" operand is coded into both second register field 26 and first register field 24.
Additionally, table 112 includes the jump register (JR) instruction, having second register field 26 as an operand. However, it is noted that in one embodiment subfield RT1 of first register field 24 is used in conjunction with second register field 26 to specify any of the MIPS RISC architecture registers for the JR instruction.
Turning now to Fig. 5C, a table 114 is shown including instruction column 100, operands column 102, opcode column 106, and function column 104. Table 114 lists instructions from the exemplary instruction set which use instruction format 30 shown in Fig. 3B. Certain instructions within table 114 have hardcoded destination registers (i.e. the destination registers cannot be selected by the programmer, other than by using a different opcode). For these instructions, third register field 32 is combined with function field 34 to store the function field encoding shown in function column 104. Additionally, an instruction is shown which has an immediate operand in function column 104 and operands column 102. This instruction uses second register field 26 in conjunction with function field 34 to code the corresponding immediate field used by the instruction.
Figs. 5D and 5E are tables 116 and 118 showing the instructions from the exemplary compressed instruction set which employ instruction formats 40 and 50, respectively. It is noted that the extend instruction is shown in table 118. However, the extend instruction actually indicates that the instruction is a 32-bit compressed instruction which uses one of instruction formats 60, 70, or 80.
Turning now to Figs. 6A and 6B, a table 120 and a table 122 are shown. Tables 120 and 122 depict those instructions from the exemplary compressed instruction set which are encoded using instruction format 70, shown in Fig. 4B.
Table 120 includes instruction column 100 and operands column 102, and further includes an opcode column 108. Opcode column 108 is similar to opcode column 106, except that the opcode encodings shown in opcode column 108 correspond to opcode field 78.
Table 122 includes an RT column 109 which corresponds to first register field 74. The coding of the RT field in the instructions shown in table 122 indicates which instruction is selected. The instructions shown in table 122 share a specific encoding in opcode field 78. In one embodiment, the specific encoding is 00 (hexadecimal).
Figs. 6C, 6D, 6E, and 6F are tables 124, 126, 128, and 130 which depict instructions from the exemplary compressed instruction set which are encoded according to instruction format 80. Tables 124, 126, and 130 include a function column 107 which corresponds to encodings of function field 84. Table 128 includes an RS, RT column 105 which will be explained in more detail below.
Operands column 102 for table 124 includes immediate operands for certain instructions. The "imm5" operand is coded into second register field 76. The "imm15" operand is coded into a combination of first register field 74, second register field 76, and third register field 82.
The instructions listed in table 128 are identified via encodings of second register field 76, as shown in RS, RT column 105. Certain instructions are identified via second register field 76 in conjunction with first register field 74. Those instructions for which RS, RT column 105 includes an asterisk for the RT portion are identified via second register field 76, while those instructions for which RS, RT column 105 does not include an asterisk are identified by second register field 76 in conjunction with first register field 74. Instructions which are not identified via first register field 74 may use first register field 74 to encode an operand. The instructions listed in tables 128 and 130 are instructions for which COP0 bit 86 is set, while instructions listed in tables 124 and 126 are encoded with COP0 bit 86 clear.
Certain instructions in table 128 include an "imm6" operand. The "imm6" operand is coded into function field 84. Additionally, function field 84 is used to indicate the instructions shown in table 130 when second register field 76 is coded to 1x (hexadecimal), wherein "x" indicates that the low order bits are don't cares.
Turning now to Fig. 7, a first addressing window 150 and a second addressing window 152 are shown according to one embodiment of microprocessor 10. At the center of addressing window 150 is the value of a base register (represented as Reg on the left side of addressing window 150). The value of the base register identifies an address within the main memory subsystem. Addressing window 150 represents the range of addresses around the value of the base register which are accessible to a load/store instruction in the non-compressed instruction set according to one embodiment of the non-compressed instruction set. The non-compressed instruction set specifies that load/store instructions form the address of a memory operand via the sum of a value stored in a base register and a sixteen bit signed immediate field. In such an embodiment, the range of addresses has an upper boundary of 32767 greater than the base register and a lower boundary of 32768 less than the base register. Other embodiments may include larger or smaller ranges. As used herein, the term "base register" refers to a register which is specified by a load/store instruction as storing a base address, to which the signed immediate field is added to form the address of the memory operand operated upon by the instruction.
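The boundaries of addressing window 150 follow directly from the range of a sixteen bit signed immediate, as the following sketch illustrates. The function names are hypothetical, and the wrap to 32 bits in the address sum is an assumption of a 32-bit address space.

```python
def non_compressed_window(base):
    """Return (lowest, highest) address reachable from base with a
    sixteen bit signed immediate field."""
    return (base - 32768, base + 32767)


def effective_address(base, simm16):
    """Form the memory operand address as base plus signed immediate."""
    if not -32768 <= simm16 <= 32767:
        raise ValueError("immediate does not fit in sixteen signed bits")
    return (base + simm16) & 0xFFFFFFFF  # assumed 32-bit address space
```

For a base register value of hexadecimal 10008000, the window spans hexadecimal 10000000 through 1000FFFF, a total of 64 kilobytes.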
As shown in table 110, for example, load/store instructions within the 16-bit portion of the exemplary compressed instruction set include a five bit immediate field. This field is rotated right two bits and then shifted left two bits for word-sized memory operands, forming a seven bit immediate field (the largest of the immediate fields which may be formed using the five bits, according to one embodiment). The seven bit immediate field is then zero extended to form a positive offset from the base register in the corresponding decompressed instruction. A subrange 154 of addresses is therefore available for access by compressed instructions. Within addressing window 150, subrange 154 has an upper boundary of 127 greater than the base register and a lower boundary of the base register. However, subrange 154 may vary in size from embodiment to embodiment. While subrange 154 may work well for many load/store instructions, a different subrange may be employed for use with the global pointer register. The global pointer register is a register assigned by software convention to locate an area of memory used for storing global variables. A global variable is a variable which is available for access from any routine within a program. In contrast, a local variable is typically accessible only to a particular routine or group of routines. In the MIPS instruction set, for example, register $28 is often used as the global pointer register.
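The rotate-then-shift formation of the word offset can be sketched in Python as follows. Only the word-sized case stated above is modeled; the helper names are hypothetical, and the sketch is an illustration of the described bit manipulation, not the hardware itself.

```python
def rotate_right(value, amount, width):
    """Rotate the low `width` bits of value right by `amount` positions."""
    mask = (1 << width) - 1
    value &= mask
    amount %= width
    return ((value >> amount) | (value << (width - amount))) & mask


def decompress_word_offset(imm5):
    """Five bit compressed immediate -> seven bit, zero-extended word offset.

    Rotate right two bits, then shift left two bits, as described for
    word-sized memory operands.
    """
    return rotate_right(imm5, 2, 5) << 2


# Every result is word aligned and lies between 0 and 124, so a word
# access reaches at most byte 127 above the base register.
offsets = sorted(decompress_word_offset(i) for i in range(32))
print(offsets[0], offsets[-1])
```

Because the rotation is a permutation of the five bits, the 32 encodings map onto the 32 distinct word-aligned offsets in the subrange.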
The area of memory around the global pointer register may therefore be viewed as a table of global variables. Each global variable is assigned an offset within the table. The offset corresponds to a particular immediate field value which may be added to the global pointer register in order to locate the global variable. For the embodiment shown in Fig. 7, for example, a 64 kilobyte table may be allocated for global variables as shown along the left side of addressing windows 150 and 152.
If compressed immediate fields are decompressed as described for addressing window 150, then the global variable table includes a section which is accessible to compressed instructions (corresponding to subrange 154) which is between two subranges 156 and 158 accessible to non-compressed instructions. As noted above, microprocessor 10 may support programs in which some routines are coded with non-compressed instructions while other routines are coded with compressed instructions. Allocating global variables in a particular program is complicated by the division of the non-compressed global variable subranges 156 and 158 of addressing window 150. Global variables may be allocated into subrange 158, for example, and then global variable allocation must continue in subrange 156 (for non-compressed instructions). In other words, subrange 154 must be bypassed for global variables accessible to non-compressed instructions.
Microprocessor 10 may employ a decompression of the compressed immediate field for load/store instructions using the global pointer (GP) register which leads to addressing window 152. Addressing window 152 includes a subrange 160 accessible to compressed instructions and a subrange 162 accessible to non-compressed instructions. Advantageously, subrange 162 is a contiguous block of memory. Global variables for access by non-compressed instructions may be allocated into subrange 162, while global variables for access by compressed instructions may be allocated into subrange 160. Essentially, subrange 160 and subrange 162 form distinct tables of global variables for access by compressed and non-compressed instructions, respectively. Addressing window 152 is achieved by decompressing the compressed immediate field as described above, except that the most significant bit of the decompressed immediate field is set. If the compressed immediate field is coded with binary zeros, then the decompressed immediate field is 8000 (in hexadecimal). Since the decompressed immediate field is interpreted as a signed field for load/store instructions, the 8000 value is the most negative number available in the decompressed immediate field. Other encodings of the compressed immediate field are decompressed into negative numbers which form subrange 160. Subrange 160 forms the lower boundary of the range of addresses represented by addressing window 152 as shown in the embodiment of Fig. 7.
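To make the difference between the two windows concrete, the following Python sketch lists, as offsets relative to the base register, the region reachable by compressed instructions and the regions left over for variables intended for non-compressed access. The 128-byte span follows from the seven bit offset described above; the function names and the tuple layout are assumptions of the sketch.

```python
def sign_extend16(field):
    """Interpret a 16-bit field as a signed two's-complement value."""
    field &= 0xFFFF
    return field - 0x10000 if field & 0x8000 else field


COMPRESSED_SPAN = 128  # bytes covered by a seven bit offset (0 through 127)


def window_150_layout():
    """Zero-extended offsets: the compressed region sits inside the window."""
    compressed = (0, COMPRESSED_SPAN - 1)
    remaining = [(-32768, -1), (COMPRESSED_SPAN, 32767)]
    return compressed, remaining


def window_152_layout():
    """Most significant bit set: the compressed region sits at the bottom."""
    low = sign_extend16(0x8000 | 0)                       # field coded with zeros
    high = sign_extend16(0x8000 | (COMPRESSED_SPAN - 1))  # largest covered byte
    return (low, high), [(high + 1, 32767)]


print(window_150_layout())
print(window_152_layout())
```

In the first layout the remaining space is split into two pieces, whereas in the second it is a single contiguous range, which is the property the text attributes to subrange 162.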
As used herein, the term memory operand refers to a value stored in a memory location within the main memory subsystem. Load instructions may be used to transfer the memory operand to a register within microprocessor 10. Conversely, store instructions may be used to transfer a value stored in a register into the memory operand storage location. A memory operand may be of various sizes (i.e. numbers of bytes). In one embodiment, three sizes are available: byte, halfword, and word. A halfword comprises two bytes, and a word comprises four bytes. Other memory operand sizes are contemplated for other embodiments.
Turning to Fig. 8, a block diagram of exemplary hardware within instruction decompressor 12 for decompressing the immediate field of a load/store instruction is shown. It is noted that multiple copies of the exemplary hardware shown in Fig. 8 may be employed to concurrently decompress multiple load/store instructions. The exemplary hardware shown in Fig. 8 is described in terms of microprocessor 10B. However, similar hardware may be employed within microprocessor 10A. The exemplary hardware includes an immediate field decompressor 170 and a register decoder 172. When an instruction is conveyed to instruction decompressor 12B from instruction cache 14B, a portion of the instruction comprising the compressed immediate field for load/store instructions is conveyed to immediate field decompressor 170 upon a compressed immediate bus 174. For the exemplary instruction set described in Figs. 3A-6F, the compressed immediate field comprises function field 28 (shown in Fig. 3A). Additionally, the base register field for the compressed load/store instruction is conveyed upon a base register bus 176. For the exemplary instruction set shown in Figs. 3A-6F, the base register field comprises second register field 26.
Register decoder 172 decodes the register identified upon base register bus 176. If the base register is the global pointer register, register decoder 172 asserts a GP signal upon GP line 178 to immediate field decompressor 170. Otherwise, register decoder 172 deasserts the GP signal. Immediate field decompressor 170 decompresses the compressed immediate field in one of two ways, dependent upon the GP signal. If the GP signal is deasserted, then immediate field decompressor 170 clears the most significant bit of the decompressed immediate field. Conversely, immediate field decompressor 170 sets the most significant bit of the immediate field if the GP signal is asserted. Therefore, a positive offset is created when a register other than the global pointer register is used as the base register. A negative offset is created when the global pointer register is used as the base register. Immediate field decompressor 170 conveys the decompressed immediate field upon a decompressed immediate bus 180.
Fig. 9 illustrates the decompressed immediate field generated for load/store instructions according to one embodiment of the exemplary compressed instruction set. The compressed immediate fields of load/store instructions which do not employ the global pointer register as the base register are decompressed as indicated by reference number 182. The decompression for bytes, halfwords, and words is shown separately, with each bit position of the decompressed immediate field (or offset) represented by a numerical digit or an "L". Bits from the compressed immediate field are shown in the respective bit locations of the decompressed field via the numerical digits. The least significant bit of the compressed immediate field is represented by the digit 0, and the most significant bit of the compressed immediate field is represented by a 4. The letter "L" is used to indicate a bit position which is set to a binary zero.
Decompressed immediate fields corresponding to bytes, halfwords, and words for load/store instructions which use the global pointer register as a base register are indicated by reference number 184. Similar to the decompressed fields indicated by reference number 182, the decompressed fields indicated by reference number 184 depict numerals in bit positions which are filled with a bit from the compressed immediate field and the letter "L" is used to indicate a bit position which is set to a binary zero. Additionally, the most significant bit of each decompressed offset is set to a binary one (indicated by the letter "H").
Turning next to Fig. 10, a flow chart is shown depicting activities performed by instruction decompressor 12 in order to decompress instructions in accordance with the embodiment shown in Fig. 8. Although the steps shown in Fig. 10 are illustrated as serial in nature, it is understood that various steps may be performed in parallel.
Instruction decompressor 12 determines if a received instruction is a load/store instruction (decision block 190). If the instruction is not a load/store instruction, the instruction is expanded in accordance with a mapping between the compressed instructions (as illustrated in Figs. 3A-6F) and the corresponding decompressed instructions (step 192). If the instruction is a load/store instruction, then the base register specified by the instruction is examined (decision block 196). If the base register is the global pointer register, the immediate field is decompressed as indicated by reference number 184 in Fig. 9 (step 194). Alternatively, if the base register is not the global pointer register, the immediate field is decompressed as indicated by reference number 182 in Fig. 9 (step 192).
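The decision flow of Fig. 10, together with the split between register decoder 172 and immediate field decompressor 170, might be modeled along the following lines. This is a behavioral sketch only: the dictionary layout of an instruction, the `expand_other` callback, and the assumption that the offset has already been adjusted for operand size are all hypothetical conveniences, and register number 28 is used for the global pointer per the MIPS convention mentioned earlier.

```python
GP_REGISTER = 28  # register $28, the conventional MIPS global pointer


def register_decoder(base_register):
    """Return the GP signal: True when the base register is the global pointer."""
    return base_register == GP_REGISTER


def immediate_field_decompressor(offset, gp_signal):
    """Form the 16-bit decompressed immediate from a seven bit offset.

    The most significant bit is set when the GP signal is asserted and
    left clear otherwise.
    """
    offset &= 0x7F
    return (0x8000 | offset) if gp_signal else offset


def decompress(instruction, expand_other):
    """Follow the flow chart: non-load/store instructions use the general
    mapping; load/store instructions take one of two immediate paths."""
    if not instruction["is_load_store"]:
        return expand_other(instruction)
    gp_signal = register_decoder(instruction["base"])
    return immediate_field_decompressor(instruction["offset"], gp_signal)


print(hex(decompress({"is_load_store": True, "base": 28, "offset": 4}, None)))
print(hex(decompress({"is_load_store": True, "base": 29, "offset": 4}, None)))
```

Keeping the register decode separate from the immediate formation mirrors the two blocks of Fig. 8 and makes it easy to see that the base register number is the only input that selects between the two decompressed forms.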
In addition to decompressing load/store offsets in a different manner for the global pointer register, microprocessor 10 also supports a compression mode for indicating which type of instructions are being executed by microprocessor 10 (i.e. compressed or non-compressed). Fig. 11 is a block diagram illustrating a portion of one embodiment of instruction decompressor 12. The illustrated portion determines the compression mode for each routine executed by microprocessor 10. The portion shown may be suitable for microprocessor 10B, and a similar portion may be employed by microprocessor 10A. Fig. 11 depicts a mode detector 200.
When an instruction is fetched by processor core 16, the instruction is received upon an instruction bus 202 by mode detector 200. Mode detector 200 detects when the jump and link (JAL) instruction is fetched, and further examines the exchange bit 94. If exchange bit 94 is set, the routine at the target address of the JAL instruction comprises compressed instructions. Therefore, the compression mode of the target routine is compressed. Alternatively, exchange bit 94 may be clear. In this case, the compression mode of the target routine is uncompressed.
In addition to specifying the compression mode for the target routine, the JAL instruction causes the address of the instruction following the JAL instruction to be stored into register $31 of the MIPS RISC architecture. This register may subsequently be used with the JR instruction to return from the target routine. Because the compression mode is stored as part of the address in this embodiment, the compression mode of the source routine is restored upon execution of the JR instruction. Advantageously, routines encoded in compressed instructions may be intermixed with routines encoded in non-compressed instructions. The new compression mode is conveyed to processor core 16 upon a compression mode line 206. It is noted that mode detector 200 may be included as a part of processor core 16 instead of instruction decompressor 12, in alternative embodiments.
The embodiment of mode detector 200 shown in Fig. 11 includes a storage 204 for a compression enable bit. If compression is enabled, the compression enable bit is set. When instructions are fetched in compressed mode and compression is enabled, instruction decompressor 12 decompresses the instructions. If the enable bit is clear, instruction compression is disabled for microprocessor 10. Instruction decompressor 12 is bypassed when instruction decompression is disabled. Furthermore, mode detector 200 indicates that the compression mode is non-compressed when instruction compression is disabled.
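The interaction between the exchange bit, the saved return address, and the compression enable bit can be sketched behaviorally in Python. The sketch assumes that the mode is kept in the least significant bit of the address (the placement recited in claim 33) and models the link value as the address of the instruction following the JAL, as stated above; the function names and argument order are hypothetical.

```python
COMPRESSED = 1
NON_COMPRESSED = 0


def jal(next_address, source_mode, target_address, exchange_bit,
        compression_enabled=True):
    """Model a JAL: return (link_value, fetch_address, target_mode)."""
    # Link value: return address carrying the caller's mode in bit 0
    # (assumed bit position).
    link_value = (next_address & ~1) | source_mode
    # With compression disabled, the detector always reports non-compressed.
    if compression_enabled and exchange_bit:
        target_mode = COMPRESSED
    else:
        target_mode = NON_COMPRESSED
    return link_value, target_address, target_mode


def jr(link_value):
    """Model a JR through the link register: return (fetch_address, mode)."""
    return link_value & ~1, link_value & 1


# A non-compressed caller invokes a compressed routine and later returns.
link, pc, mode = jal(0x1004, NON_COMPRESSED, 0x2000, exchange_bit=1)
pc, mode = jr(link)
print(hex(pc), mode)
```

Carrying the mode inside the link value means the return instruction needs no separate indication of the caller's instruction set, which is what allows compressed and non-compressed routines to be intermixed.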
As used herein, a routine is an ordered set of instructions coded for execution by microprocessor 10. The routine may be coded in either compressed or non-compressed instructions, and is delimited by a subroutine call instruction and a return instruction. The delimiting subroutine call instruction is not included within the routine. Instead, the subroutine call instruction indicates the beginning of the routine via a target address included with the subroutine call instruction. The first instruction of the routine is stored at the target address. Additionally, the address of an instruction within the routine including the subroutine call instruction is saved so that a return instruction may be executed to return to the calling routine. In the exemplary compressed instruction set depicted in Figs. 3A-6F, the jal instruction may serve as a subroutine call instruction. Alternatively, the jalr instruction may serve as a subroutine call instruction.
A routine ends with a return instruction, which causes subsequent instruction execution to return to the address saved when the corresponding subroutine call instruction is executed. In other words, the target address of the return instruction is the saved address. For the exemplary compressed instruction set, the jr instruction may serve as a return instruction. Generally speaking, a target address is an address at which instruction fetching is to begin upon execution of the instruction corresponding to the target address.
Turning next to Fig. 12, a block diagram of one embodiment of register field decompression is shown. Other embodiments of register field decompression are contemplated. The compressed register field corresponding to an instruction is conveyed upon compressed register field bus 210. A register decompressor block 212 receives the compressed register field. Additionally, at least a portion of the compressed register field is incorporated into the decompressed register field which is then conveyed upon decompressed register field bus 214. The decompressed register field is thereby formed by concatenating at least a portion of the compressed register field to the value generated by register decompressor block 212.
In one embodiment, the entire compressed register field is concatenated into the decompressed register field. Additionally, the remaining portion of the decompressed register field depends upon which register set the instruction accesses (e.g. xs vs. rs and xt vs. rt). A set selector signal is received upon set selector bus 216 for each register, indicating whether the xs (xt) or the rs (rt) register set should be used. If the set selector signal is asserted, then xs (xt) is selected. Otherwise, rs (rt) is selected. The set selector signal is asserted or deasserted based upon the opcode of the instruction being decompressed, in accordance with the exemplary compressed instruction set shown in Figs. 5A-6F. For example, the register mapping between compressed and decompressed registers shown in Table 1 may be employed. For such an example, register decompressor 212 may employ the following logic, wherein DR represents the decompressed register field, CR represents the compressed register field, and RH represents the corresponding set selector signal value:
DR[4:3] = {RH, (RH & CR[2] | ~|CR[2:0])}
Several other register mappings are contemplated, examples of which are shown in tables 2-4 below, along with corresponding Verilog logic equations. It is noted that any register mapping may be employed by various embodiments of microprocessor 10.
Table 2: Second Exemplary Register Mappings
Field Encoding    RS, RT, RD    XS, XT
000               $8 (t0)       $24 (t8)
001               $9 (t1)       $25 (t9)
010               $2 (v0)       $18 (s2)
011               $3 (v1)       $19 (s3)
100               $4 (a0)       $28 (gp)
101               $5 (a1)       $29 (sp)
110               $6 (a2)       $30 (s8)
111               $7 (a3)       $31 (ra)
DR[4:3] = {RH, (RH & CR[2] | ~|CR[2:1])}
Table 3: Third Exemplary Register Mappings
Field Encoding    RS, RT, RD    XS, XT
000               $16 (s0)      $24 (t8)
001               $1 (at)       $9 (t1)
010               $2 (v0)       $10 (t2)
011               $3 (v1)       $11 (t3)
100               $4 (a0)       $28 (gp)
101               $5 (a1)       $29 (sp)
110               $6 (a2)       $30 (s8)
111               $7 (a3)       $31 (ra)
DR[4:3] = {(RH & CR[2] | ~|CR[2:0]), RH}
Table 4: Fourth Exemplary Register Mappings
Field Encoding    RS, RT, RD    XS, XT
000               $16 (s0)      $24 (t8)
001               $17 (s1)      $25 (t9)
010               $2 (v0)       $10 (t2)
011               $3 (v1)       $11 (t3)
100               $4 (a0)       $28 (gp)
101               $5 (a1)       $29 (sp)
110               $6 (a2)       $30 (s8)
111               $7 (a3)       $31 (ra)
DR[4:3] = {(RH & CR[2] | ~|CR[2:1]), RH}
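The four logic equations can be cross-checked against the tables with a short Python model. The sketch assumes that the garbled operator in the printed equations is the Verilog reduction NOR (`~|`), a reading that reproduces every row of tables 2-4; the function name and the `mapping` parameter (1 through 4, following the table numbering) are hypothetical.

```python
def decompress_register(cr, rh, mapping):
    """Expand a 3-bit compressed register field into a 5-bit register number.

    cr      -- compressed register field CR[2:0]
    rh      -- set selector value (1 selects the xs/xt set, 0 the rs/rt set)
    mapping -- which exemplary mapping to apply (1, 2, 3 or 4)
    """
    cr &= 0b111
    cr2 = (cr >> 2) & 1
    if mapping in (1, 3):
        nor = 0 if cr else 1          # ~|CR[2:0]
    else:
        nor = 0 if (cr >> 1) else 1   # ~|CR[2:1]
    generated = (rh & cr2) | nor
    if mapping in (1, 2):
        upper = (rh << 1) | generated   # DR[4:3] = {RH, generated}
    else:
        upper = (generated << 1) | rh   # DR[4:3] = {generated, RH}
    # DR[2:0] is the compressed field itself.
    return (upper << 3) | cr


for mapping in (2, 3, 4):
    rs_set = [decompress_register(cr, 0, mapping) for cr in range(8)]
    xs_set = [decompress_register(cr, 1, mapping) for cr in range(8)]
    print(mapping, rs_set, xs_set)
```

Each printed pair of lists can be read against the RS, RT, RD and XS, XT columns of the corresponding table, one entry per field encoding from 000 to 111.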
As indicated by the assembler assigned names shown in tables 1-4, various registers are assigned to various functions by software convention. For example, the MIPS assembler assigns the following meanings to registers:
Table 5: Software Convention for Register Names
Register             Software Name    Use
$0                   none             Hardwired to zero
$1                   $at              used by assembler
$2..$3               v0-v1            Function results or static link
$4..$7               a0-a3            arguments for a subroutine
$8..$15, $24..$25    t0-t9            Temporary registers, not saved between subroutine calls
$16..$23, $30        s0-s8            Saved between subroutine calls
$26..$27             k0-k1            Reserved for operating system
$28                  gp               Global Pointer
$29                  sp               Stack Pointer
$31                  ra               Return address
It is desirable to provide access to both temporary and saved registers to routines coded in compressed instructions. Additionally, access to v0-v1, a0-a3, gp, sp, and ra is needed to operate with existing software. The register mappings shown balance these qualities with the desire for register decompressor 212 to occupy a fairly small number of gates. Advantageously, a useful set of registers is selected from the MIPS register set while still maintaining a low gate count within register decompressor 212.
Turning now to Fig. 13, an exemplary computer system 220 including microprocessor 10 is shown. Many other computer systems employing microprocessor 10 are contemplated. Within computer system 220, microprocessor 10 is incorporated onto a semiconductor substrate 224 along with multiple I/O interfaces 222A-222N. The I/O interfaces interface to I/O devices external to substrate 224. An exemplary I/O interface 222A may be a universal asynchronous receiver transmitter (UART).
Microprocessor 10 may be coupled to I/O interfaces 222 for communication therewith. Additionally, microprocessor 10 may be coupled to external interface logic 226, which further interfaces to one or more dynamic random access memory (DRAM) modules 228. DRAM modules 228 may store compressed and/or non-compressed instruction code, as well as data for use by the program represented by the compressed and/or non-compressed instruction code.
It is noted that the present discussion may refer to the assertion of various signals. As used herein, a signal is "asserted" if it conveys a value indicative of a particular condition. Conversely, a signal is "deasserted" if it conveys a value indicative of a lack of a particular condition. A signal may be defined to be asserted when it conveys a logical zero value or, conversely, when it conveys a logical one value.
Although a specific example of a compressed instruction set is shown and described herein, multiple variations, extensions, and modifications may be made to the exemplary compressed instruction set. These variations, extensions, and modifications are contemplated.
The following Verilog listing describes exemplary logic for instruction decompressor 12. Many different embodiments of the logic are contemplated, although the Verilog listing shown is one suitable example:
`timescale 1 ns / 1 ns
module tinyrisc_dp( xo, ci, dojal, ext, x );
output[31:0] xo;    // expanded instruction out
input[15:0]  ci;    // compressed instruction in
input        dojal; // dojal
input        ext;   // extend
input[10:0]  x;     // extend bits
wire extjal = ext | dojal;
/* ci[15:11] decodes */
wire xsp      = ci[15:11] == 5'b00000;
wire spany    = ~extjal & ci[15] & ~ci[14] & ~ci[11];
wire sp1or2   = spany & ~ci[13];
wire special  = sp1or2 & ~ci[12];
wire brjal    = ci[15:12] == 4'b0001;
wire br       = brjal & ~ci[11];
wire word     = ci[15] & (~ext | ~ci[13]) & ci[12] & ci[11];
wire special3 = ~extjal & ci[15:11] == 5'b11101;
wire half     = ci[15] & (~ext | ~ci[13] | ~ci[12]) & ci[11] & ~special3;
wire opi      = ~ci[15] & ci[14] & ~ci[12];
wire opx      = ci[15:12] == 4'b0000 | ci[15] & ~ci[13] & ci[12] & ci[11];
wire sll      = ~extjal & ci[15:11] == 5'b10110;
wire rtx0     = ~extjal & ( ~ci[15] & ci[14] & ci[12] & ~(ci[13] & ci[11]) );
wire snx      = ~ci[15] & ~ci[14] & (ci[13] | ~ci[12]) | ci[15:11] == 5'b01001;
wire special2 = sp1or2 & ci[12];
wire ximm     = ~( xsp | br );
wire exti     = ext & ximm;
wire x2z      = spany & ci[12];
wire x4z      = spany | special3;
wire xn       = ci[15:11] == 5'b01111;
wire ill      = brjal | xn;
wire rsza     = ~extjal & ( ill | sll | ci[15:11] == 5'b01000 );
/* & decodes */
wire jr       = special & ci[4:1] == 4'b0100;
wire jalr     = jr & ci[0];
wire negnot   = special & ci[4] & ~ci[3] & ci[1] & ci[0];
wire rseq0    = ci[8:6] == 3'b000;
wire sp2x     = special2 & ~ci[0];
wire slt      = sp2x & ci[1];
wire sp3x     = special3 & ci[4];
wire sp3sp    = sp3x & ci[8:6] == 3'b101;
/* I decodes */
wire i8       = ~extjal & ~ci[15] | sp3sp;
wire i8s      = i8 & ~ill;
wire rdrs     = sp2x & ~ci[1] | special & ci[4];
wire rdrt     = special & ~(ci[4] | ci[3]) | jalr;
wire rdrd     = special2 & ci[0] | sll;
/* Figure imgf000026_0001: the remaining decodes appear only as an image in the source. Surviving fragments:
     == 4'b0000 & ci[4:2] == 3'b00 & (ext ? ~(ci[15]
     | sp3x & rseq0 | shrs
     & exti
*/
assign xo[31] = ~dojal & ci[15] & ( ext | ~x4z );
assign xo[30] = ext & xsp & x[10];
assign xo[29] = ~dojal & ci[14] & ~( ~ext & ci[15] & ~ci[12] & ~ci[11] );
assign xo[28] = ~dojal & ( ci[13] & ~( spany | ~ext & word | special3 ) | ~ext & ~ci[15] & ~ci[14] | br );
assign xo[27] = dojal | ci[12] & ( ext | ~( spany | ci[15] & ci[14] & ~ci[11] ) );
assign xo[26] = dojal | ci[11];
wire[4:0] rs = { rs5 ? ci[10:9] : {xs, (ci[8] & xs | rseq0)}, ci[8:6] };
assign xo[25:21] = rs & {5{~rsz}};
wire[4:0] rt = { xt, (ci[10] ? xt : ~(ci[9] | ci[5])), ci[10:9], ci[5] };
assign xo[20:16] = rt & {5{rtrt}}
                 | rs & {5{rtrs}}
                 | x[9:5] & {5{extjal}}
                 | {rtx0, rtx0, 3'b00};
wire[4:0] rd = { 1'b0, ~|ci[4:2], ci[4:2] };
assign xo[15:11] = rs & {5{rdrs}}
                 | rt & {5{rdrt}}
                 | rd & {5{rdrd}}
                 | x[4:0] & {5{ext & ~ximm | dojal}}
                 | { {2{slt | snh}}, snh, snl2, snl1 };
assign xo[10:6] = rs & {{2{shrs}}, {3{shrs|sll}}}
                | ci[15:11] & {5{dojal}}
                | x[4:0] & {5{exti}}
                | { ci[1]&sll, ci[0]&sll, 1'b0, ci[10]&i8s, ci[9]&i8s | ci[1]&~extjal&word }
                | { snm, snm, snm, snl, snl };
assign xo[5] = ci[5] & (extjal | i8) | ci[4] & special
             | ci[0] & ~extjal & half | special2 | snl;
assign xo[4] = ci[4] & ~x4z | ci[1] & sp3sp | snl;
assign xo[3] = sp3sp ? ci[0] : ci[3] & ~x2z | slt;
assign xo[2] = ci[2] & ~x2z;
assign xo[1] = ci[1] & ~( ~dojal & word | sll | sp3sp );
assign xo[0] = ci[0] & ~( ~dojal & half | sll | sp3sp ) | special2 & ci[2];
endmodule
In accordance with the above disclosure, a microprocessor has been described which executes instructions from both a compressed instruction set and a non-compressed instruction set. The microprocessor expands the compressed instructions into decompressed instructions or directly decodes the compressed instructions. Advantageously, routines coded using the compressed instruction set occupy a smaller amount of memory than the corresponding routines coded in non-compressed instructions. Memory formerly occupied by such routines may be freed for use by other routines or data operated upon by such routines. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.

Claims

WHAT IS CLAIMED IS:
1. An apparatus for executing instructions from a variable-length compressed instruction set, comprising:
an instruction decompressor coupled to receive instructions which are members of said variable-length compressed instruction set, wherein said instruction decompressor is configured to examine an opcode field of a particular instruction, and wherein said instruction decompressor is configured to determine that said particular instruction is an extended instruction having a first fixed length if said opcode field is coded as an extend opcode, and wherein said instruction decompressor is configured to determine that said particular instruction is a non-extended instruction if said opcode field is coded as a second opcode different than said extend opcode.
2. The apparatus as recited in claim 1 wherein said first fixed length is a first number of bytes which is greater than a second number of bytes corresponding to said second fixed length.
3. The apparatus as recited in claim 2 wherein said first number is greater than said second number by an integer factor.
4. The apparatus as recited in claim 3 wherein said integer factor is two.
5. The apparatus as recited in claim 1 wherein said extended instruction includes a second opcode field defining a particular extended instruction within a set of extended instructions.
6. The apparatus as recited in claim 5 wherein said instruction decompressor is configured to examine said second opcode field in order to determine how to decompress said particular extended instruction.
7. The apparatus as recited in claim 1 further comprising a processor core configured to execute decompressed instructions.
8. The apparatus as recited in claim 7 wherein each compressed instruction corresponds to one decompressed instruction.
9. A method for expanding compressed instructions into decompressed instructions, comprising:
determining that a compressed instruction is an extended instruction having a first fixed length if an opcode field of said compressed instruction is an extend opcode;
determining that said compressed instruction is a non-extended instruction having a second fixed length if said opcode field of said compressed instruction is a second opcode different than said extend opcode; and
decompressing said compressed instruction into a decompressed instruction, wherein a number of bytes included in said compressed instruction is defined by said first fixed length if said compressed instruction is said extended instruction, and wherein said number of bytes is defined by said second fixed length if said compressed instruction is said non-extended instruction.
10. The method as recited in claim 9 wherein said first fixed length is greater than said second fixed length.
11. The method as recited in claim 10 wherein said first fixed length is greater than said second fixed length by an integer number of bytes.
12. The method as recited in claim 11 wherein said integer number is two.
13. The method as recited in claim 11 further comprising fetching compressed instructions in quantities defined by said second fixed length.
14. The method as recited in claim 13 wherein said non-extended instruction, when detected, is formed from a first of said quantities in which said extend opcode is detected and a second of said quantities which is immediately subsequent to said first of said quantities.
15. The method as recited in claim 9 wherein said extended instruction includes a second opcode field identifying said extended instruction as a particular extended instruction.
16. The method as recited in claim 15 wherein said decompressing comprises identifying an extended instruction format corresponding to said particular extended instruction.
17. The method as recited in claim 16 wherein said decompressing further comprises interpreting bits within said particular extended instruction according to a plurality of instruction fields within said extended instruction format.
18. The method as recited in claim 9 wherein said decompressing comprises identifying a non-extended instruction format corresponding to said non-extended instruction from an encoding of said opcode field.
19. The method as recited in claim 18 wherein said decompressing further comprises interpreting bits from said non-extended instruction according to a second plurality of instruction fields within said non-extended instruction format.
20. An apparatus for expanding compressed instructions into decompressed instructions, comprising:
a first determining means for determining that a compressed instruction is an extended instruction having a first fixed length if an opcode field of said compressed instruction is an extend opcode;
a second determining means for determining that said compressed instruction is a non-extended instruction having a second fixed length if said opcode field of said compressed instruction is a second opcode different than said extend opcode; and
a decompressing means for decompressing said compressed instruction into a decompressed instruction, wherein a number of bytes included in said compressed instruction is defined by said first fixed length if said compressed instruction is said extended instruction, and wherein said number of bytes is defined by said second fixed length if said compressed instruction is said non-extended instruction.
21. A method for executing a program including a first routine and a second routine in a microprocessor, comprising:
executing a subroutine call instruction within said first routine, wherein said subroutine call instruction indicates that said second routine is to be executed via a target address of said subroutine call instruction; and
examining an indication within said subroutine call instruction, wherein said second routine is determined to be coded using compressed instructions if said indication is in a first state, and wherein said second routine is determined to be coded using non-compressed instructions if said indication is in a second state different than said first state.
22. The method as recited in claim 21 wherein said indication comprises a bit and said first state comprises said bit being set.
23. The method as recited in claim 22 wherein said second state comprises said bit being clear.
24. The method as recited in claim 21 further comprising storing said indication in a program counter within said microprocessor.
25. The method as recited in claim 21 wherein said indication serves as a compression mode for said second routine.
26. The method as recited in claim 25 further comprising decompressing instructions from said second routine if said compression mode indicates compressed.
27. The method as recited in claim 21 further comprising executing a return instruction at completion of said second routine, wherein said return instruction includes a second target address.
28. The method as recited in claim 27 further comprising examining a second bit within said second target address, wherein said first routine is determined to be coded using compressed instructions if said second bit is in a first state, and wherein said first routine is determined to be coded using non-compressed instructions if said second bit is in a second state different than said first state.
29. An apparatus for executing a program including a first routine and a second routine in a microprocessor, comprising:
an executing means for executing a subroutine call instruction within said first routine, wherein said subroutine call instruction indicates that said second routine is to be executed via a target address of said subroutine call instruction; and
an examining means for examining an indication within said subroutine call instruction, wherein said examining means determines that said second routine is coded using compressed instructions if said indication is in a first state, and wherein said examining means determines that said second routine is coded using non-compressed instructions if said indication is in a second state different than said first state.
30. An apparatus for fetching compressed and non-compressed instructions in a microprocessor, comprising:
a storage device configured to store a compression enable indicator;
a mode detector coupled to said storage device, wherein said mode detector is configured to detect a compression mode of a target routine upon fetch of a subroutine call instruction specifying said target routine, and wherein said mode detector is configured to convey said compression mode to a processor core; and
a processor core coupled to said mode detector, wherein said processor core is configured to fetch compressed instructions if said compression mode indicates compressed, and wherein said processor core is configured to fetch non-compressed instructions if said compression mode indicates non-compressed.
31. The apparatus as recited in claim 30 wherein a particular bit within said subroutine call instruction identifies said compression mode.
32. The apparatus as recited in claim 31 wherein said processor core comprises a PC register, and wherein said processor core is configured to store said compression mode within said PC register.
33. The apparatus as recited in claim 32 wherein said compression mode is stored as a least significant bit of a fetch address stored within said PC register.
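The mode-bit arrangement recited in claims 28 through 33 can be illustrated with a minimal sketch. The claims name only a "first state" and a "second state" and unspecified fixed increments, so the bit polarity (set means compressed), the step sizes of 2 and 4 bytes, and every function name below are assumptions made for illustration only.

```python
# Illustrative only: bit polarity and step sizes are assumptions, not taken from the claims.
COMPRESSED_STEP = 2      # assumed fixed increment in compressed mode, in bytes
NON_COMPRESSED_STEP = 4  # assumed fixed increment in non-compressed mode, in bytes


def is_compressed(pc: int) -> bool:
    """Read the compression mode from the least significant bit of the PC."""
    return (pc & 1) == 1  # assumption: a set bit means compressed


def fetch_address(pc: int) -> int:
    """Strip the mode bit to obtain the address actually sent to memory."""
    return pc & ~1


def next_pc(pc: int) -> int:
    """Advance the PC by the step for the current mode; even steps keep the mode bit."""
    return pc + (COMPRESSED_STEP if is_compressed(pc) else NON_COMPRESSED_STEP)


def subroutine_call(pc: int, target: int) -> tuple:
    """Return (new PC, return address).

    The low bit of the target sets the callee's mode, and the return address
    keeps the caller's mode bit so a later return restores the caller's mode.
    """
    return target, next_pc(pc)
```

Under these assumptions, calling from a non-compressed routine at 0x2000 to a target of 0x1001 switches fetching to compressed mode at address 0x1000, and the saved return address 0x2004 carries a clear low bit that restores non-compressed fetching on return.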
34. The apparatus as recited in claim 33 wherein said processor core increments said fetch address by a first fixed amount if said compression mode indicates compressed, and wherein said processor core increments said fetch address by a second fixed amount if said compression mode indicates non-compressed.
35. A microprocessor comprising:
an instruction decompressor coupled to receive compressed instructions which are members of a variable-length compressed instruction set, wherein said instruction decompressor is configured to decompress each received compressed instruction into a corresponding decompressed instruction; and
a processor core coupled to receive decompressed instructions, wherein said processor core is configured to execute said decompressed instructions.
36. The microprocessor as recited in claim 35 further comprising an instruction cache coupled between said instruction decompressor and said processor core, wherein said instruction cache is configured to store said corresponding decompressed instructions.
37. The microprocessor as recited in claim 35 further comprising an instruction cache coupled to said instruction decompressor, wherein said instruction cache is configured to store said compressed instructions.
38. The microprocessor as recited in claim 35 wherein said variable-length instruction set comprises a first plurality of instructions having a first fixed length and a second plurality of instructions having a second fixed length.
39. The microprocessor as recited in claim 38 wherein said first fixed length is greater than said second fixed length.
40. The microprocessor as recited in claim 39 wherein said first fixed length is greater than said second fixed length by an integer factor.
41. The microprocessor as recited in claim 40 wherein said integer factor is two.
42. The microprocessor as recited in claim 40 wherein said processor core is configured to fetch instructions according to said second fixed length.
43. The microprocessor as recited in claim 42 wherein said instruction decompressor is configured to detect a fetch of one of said first plurality of instructions.
44. The microprocessor as recited in claim 43 wherein said instruction decompressor is configured to convey a NOP upon detection of said one of said first plurality of instructions and to await a second portion of said one of said first plurality of instructions via said processor core fetching, thereby receiving said one of said plurality of instructions in portions.
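The portion-wise handling of longer instructions in claims 42 through 44 can be sketched as a small state machine. The sketch assumes a 16-bit short length and a 32-bit long length (an integer factor of two, as in claim 41) and a hypothetical 5-bit prefix that marks the first half of a long instruction; the claims do not specify how a long instruction is detected, so `EXTEND_PREFIX` and all other names are illustrative.

```python
NOP = "NOP"
EXTEND_PREFIX = 0b11110  # hypothetical marker in the top 5 bits of a 16-bit portion


def is_first_half_of_long(halfword: int) -> bool:
    """Detect the first portion of a long instruction (assumed encoding)."""
    return (halfword >> 11) == EXTEND_PREFIX


class Decompressor:
    """Receives 16-bit portions in fetch order and emits one result per fetch."""

    def __init__(self):
        self.pending = None  # first portion of a long instruction, if one is held

    def accept(self, halfword: int):
        if self.pending is not None:
            # Second portion arrived: join both halves into the full instruction.
            full = (self.pending << 16) | halfword
            self.pending = None
            return ("LONG", full)
        if is_first_half_of_long(halfword):
            # Hold the first portion and hand the core a NOP for this fetch.
            self.pending = halfword
            return NOP
        return ("SHORT", halfword)
```

The core keeps fetching at the short length throughout. The only visible effect of a long instruction in this sketch is a single NOP in the slot where its first portion was fetched.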
45. The microprocessor as recited in claim 35 wherein said processor core is further configured to execute non-compressed instructions.
46. The microprocessor as recited in claim 45 wherein said non-compressed instructions bypass said instruction decompressor.
47. A method for executing instruction code, comprising:
fetching compressed instructions which are members of a variable-length compressed instruction set;
decompressing said compressed instructions in an instruction decompressor, thereby forming corresponding decompressed instructions; and
executing said decompressed instructions in a processor core.
48. The method as recited in claim 47 wherein each compressed instruction corresponds to one decompressed instruction.
49. The method as recited in claim 48 wherein said variable-length instruction set comprises instructions having a first fixed length and other instructions having a second fixed length.
50. The method as recited in claim 49 wherein said first fixed length is greater than said second fixed length.
51. The method as recited in claim 47 further comprising fetching non-compressed instructions.
52. The method as recited in claim 51 further comprising executing said non-compressed instructions.
53. The method as recited in claim 52 further comprising bypassing said instruction decompressor when fetching said non-compressed instructions.
54. An apparatus for executing instruction code, comprising:
a fetching means for fetching compressed instructions which are members of a variable-length compressed instruction set;
a decompressing means for decompressing said compressed instructions, thereby forming corresponding decompressed instructions; and
an executing means for executing said decompressed instructions.
55. An instruction decompressor configured to decompress compressed instructions, wherein a first one of said compressed instructions is codable to access a first subset of registers defined for a corresponding non-compressed instruction set, and wherein a second one of said compressed instructions is codable to access said first subset of registers and is further codable to access a second subset of registers.
56. The instruction decompressor as recited in claim 55 wherein said second one of said compressed instructions is assigned a first opcode encoding and a second opcode encoding.
57. The instruction decompressor as recited in claim 56 wherein said first opcode encoding indicates that said second one of said compressed instructions is coded to access one of said first subset of registers.
58. The instruction decompressor as recited in claim 57 wherein said instruction decompressor decompresses said second one of said compressed instructions using a first mapping of compressed register codings to decompressed register codings, wherein said first mapping maps each compressed register coding to a decompressed register coding within said first subset.
59. The instruction decompressor as recited in claim 58 wherein said second opcode encoding indicates that said second one of said compressed instructions is coded to access one of said second subset of registers.
60. The instruction decompressor as recited in claim 59 wherein said instruction decompressor decompresses said second one of said compressed instructions using a second mapping of compressed register codings to decompressed register codings, wherein said second mapping maps each of said compressed register codings to a decompressed register coding within said second subset of registers.
61. The instruction decompressor as recited in claim 56 wherein said second one of said compressed instructions includes a first register field and a second register field.
62. The instruction decompressor as recited in claim 61 wherein said instruction decompressor decompresses said first register field according to said first subset of registers if said first opcode encoding is used.
63. The instruction decompressor as recited in claim 62 wherein said instruction decompressor decompresses said first register field according to said second subset of registers if said second opcode encoding is used.
64. The instruction decompressor as recited in claim 63 wherein said second one of said compressed instructions is assigned a third opcode encoding, wherein said decompressor decompresses said second register field using said second subset of registers if said third opcode encoding is used.
65. The instruction decompressor as recited in claim 64 wherein said second one of said compressed instructions is assigned a fourth opcode encoding, wherein said decompressor decompresses said first register field and said second register field using said second subset of registers if said fourth opcode encoding is used.
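The four opcode encodings of claims 61 through 65 amount to a per-field choice between two register mappings. The following sketch assumes 3-bit compressed register fields and two hypothetical subsets of eight registers each; the register numbers, the encoding labels, and the treatment of any field the claims leave unspecified (which defaults to the first subset here) are assumptions.

```python
# Hypothetical register subsets; the claims do not enumerate the registers.
FIRST_SUBSET = list(range(0, 8))    # compressed codes 0..7 -> registers 0..7
SECOND_SUBSET = list(range(8, 16))  # compressed codes 0..7 -> registers 8..15

# For each opcode encoding: (mapping for first field, mapping for second field).
# Fields not addressed by a claim are assumed to use the first subset.
MAPPINGS_BY_ENCODING = {
    "first": (FIRST_SUBSET, FIRST_SUBSET),
    "second": (SECOND_SUBSET, FIRST_SUBSET),
    "third": (FIRST_SUBSET, SECOND_SUBSET),
    "fourth": (SECOND_SUBSET, SECOND_SUBSET),
}


def decompress_registers(encoding: str, field1: int, field2: int) -> tuple:
    """Expand two 3-bit register fields using the mappings the opcode encoding selects."""
    map1, map2 = MAPPINGS_BY_ENCODING[encoding]
    return map1[field1], map2[field2]
```

In this arrangement the same 3-bit code reaches a different register depending only on which opcode encoding of the instruction was used, which is how a narrow register field can address more registers than its width allows.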
66. The instruction decompressor as recited in claim 56 wherein said first opcode encoding and said second opcode encoding differ in bits included in a function field of said second one of said compressed instructions.
67. A method for decompressing compressed instructions, comprising:
decompressing a particular compressed instruction having a first register field using a first register mapping from compressed register indicators to decompressed register indicators for decompressing said first register field if said particular compressed instruction is encoded using a first opcode; and
decompressing said particular compressed instruction having said first register field using a second register mapping from compressed register indicators to decompressed register indicators for decompressing said first register field if said particular compressed instruction is encoded using a second opcode.
68. The method as recited in claim 67 wherein said particular compressed instruction further includes a second register field.
69. The method as recited in claim 68 further comprising decompressing said second register field using said second register mapping if said particular compressed instruction is encoded using a third opcode.
70. The method as recited in claim 69 further comprising decompressing said second register field and said first register field using said second register mapping if said particular compressed instruction is encoded using a fourth opcode.
71. The method as recited in claim 67 further comprising decompressing a second particular compressed instruction using said first mapping.
72. The method as recited in claim 71 wherein said second particular compressed instruction includes one opcode encoding.
73. An apparatus for decompressing compressed instructions comprising a decompressing means, wherein said decompressing means is configured to decompress a particular compressed instruction having a first register field using a first register mapping from compressed register indicators to decompressed register indicators for decompressing said first register field if said particular compressed instruction is encoded using a first opcode, and wherein said decompressing means is further configured to decompress said particular compressed instruction having said first register field using a second register mapping from compressed register indicators to decompressed register indicators for decompressing said first register field if said particular compressed instruction is encoded using a second opcode.
74. An instruction decompressor configured to decompress a compressed register field of a compressed instruction into a decompressed register field of a decompressed instruction, wherein a decompression of said compressed register field is dependent upon a first value coded into said compressed register field and a second value coded into an opcode field of said compressed instruction.
75. The instruction decompressor as recited in claim 74 wherein said instruction decompressor is configured to select one of multiple mappings from compressed register field encodings to decompressed register field encodings in response to said second value.
76. The instruction decompressor as recited in claim 75 wherein said instruction decompressor selects one of said decompressed register encodings from said one of said multiple mappings in response to said first value.
77. The instruction decompressor as recited in claim 76 wherein a particular instruction is assigned a first opcode encoding and a second opcode encoding, and wherein said instruction decompressor, upon receipt of said particular instruction having said first opcode encoding, selects said one of said multiple mappings for decompressing said compressed register field.
78. The instruction decompressor as recited in claim 77 wherein said instruction decompressor, upon receipt of said particular instruction having said second opcode encoding, selects another one of said multiple mappings for decompressing said compressed register field.
79. A method for decompressing a compressed register field of a compressed instruction into a decompressed register field of a decompressed instruction, comprising:
directly copying at least a portion of said compressed register field into a portion of said decompressed register field; and
logically operating upon said compressed register field to produce a remaining portion of said decompressed register field.
80. The method as recited in claim 79 wherein said portion of said compressed register field comprises an entirety of said compressed register field.
81. The method as recited in claim 80 wherein said portion of said decompressed register field receiving said entirety of said compressed register field comprises a plurality of least significant bits of said decompressed register field.
82. The method as recited in claim 81 wherein said plurality of least significant bits is equal in number to a number of bits comprising said compressed instruction field.
83. The method as recited in claim 79 further comprising logically operating upon an opcode field of said compressed instruction to produce said remaining portion of said decompressed register field.
84. The method as recited in claim 83 wherein said logically operating upon said opcode field comprises selecting a first register mapping in response to a first opcode encoding in said opcode field.
85. The method as recited in claim 84 wherein said logically operating upon said opcode field further comprises selecting a second register mapping in response to a second opcode encoding in said opcode field.
86. The method as recited in claim 85 wherein said first opcode encoding and said second opcode encoding are assigned to a particular instruction.
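The copy-plus-logic decompression of claims 79 through 86 can be shown with a short worked example. It assumes a 3-bit compressed register field expanded to a 5-bit decompressed field: the 3 bits are copied unchanged into the least significant bits, and a small logic function of the field value and an opcode-derived select signal produces the upper 2 bits. The specific mappings (codes 0 and 1 reaching registers 16 and 17 in the first mapping, registers 8 through 15 in the second) are hypothetical.

```python
COMPRESSED_BITS = 3  # assumed width of the compressed register field


def decompress_register(field: int, use_second_mapping: bool = False) -> int:
    """Expand a 3-bit register code to 5 bits by copying it and deriving the upper bits."""
    low = field & 0b111  # copied directly into the least significant bits
    if use_second_mapping:
        high = 0b01  # hypothetical second mapping: registers 8..15
    else:
        # Hypothetical first mapping: codes 0 and 1 select registers 16 and 17,
        # codes 2..7 select registers 2..7.
        high = 0b10 if low < 2 else 0b00
    return (high << COMPRESSED_BITS) | low
```

Because the low bits pass straight through, only the upper bits need logic, so the register field can be expanded with very little gate delay. The `use_second_mapping` flag stands in for the opcode-derived selection of claims 83 through 85.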
87. An apparatus for decompressing a compressed register field of a compressed instruction into a decompressed register field of a decompressed instruction, comprising:
a first means for directly copying at least a portion of said compressed register field into a portion of said decompressed register field, wherein said first means is coupled to receive said compressed register field; and
a second means for logically operating upon said compressed register field to produce a remaining portion of said decompressed register field, wherein said second means is coupled to receive said compressed register field.
88. The apparatus as recited in claim 87 wherein said portion of said compressed register field comprises an entirety of said compressed register field.
89. The method as recited in claim 88 wherein said portion of said decompressed register field receiving said entirety of said compressed register field comprises a plurality of least significant bits of said decompressed register field.
90. An instruction decompressor configured to decompress a compressed register field of a compressed instruction into a decompressed register field of a decompressed instruction, wherein said instruction decompressor forms a first portion of said decompressed register field by copying at least a portion of said compressed register field thereto, and wherein said instruction decompressor includes a logic block which is configured to operate upon said compressed register field to produce a remaining portion of said decompressed register field.
91. The instruction decompressor as recited in claim 90 wherein said portion of said compressed register field comprises an entirety of said compressed register field.
92. The instruction decompressor as recited in claim 90 wherein said logic block selects a register mapping from compressed register field encodings to decompressed register field encodings in response to a signal received from said instruction decompressor.
93. The instruction decompressor as recited in claim 92 wherein said instruction decompressor is configured to assert said signal upon detection of a first opcode assigned to a particular instruction, and wherein said instruction decompressor is configured to deassert said signal upon detection of a second opcode assigned to said particular instruction.
94. The instruction decompressor as recited in claim 90 wherein said portion of said decompressed register field receiving said portion of said compressed register field comprises a plurality of least significant bits of said decompressed register field.
95. The instruction decompressor as recited in claim 94 wherein said plurality of least significant bits are equal in number to a number of bits comprising said compressed register field.
PCT/US1997/009984 1996-06-10 1997-06-10 An apparatus and method for detecting and decompressing instructions from a variable-length compressed instruction set WO1997048041A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP10501756A JP2000512409A (en) 1996-06-10 1997-06-10 Apparatus and method for detecting and decompressing instructions from a variable length compressed instruction set
AU34808/97A AU3480897A (en) 1996-06-10 1997-06-10 An apparatus and method for detecting and decompressing instructions from a variable-length compressed instruction set
GB9825726A GB2329495B (en) 1996-06-10 1997-06-10 An apparatus and method for detecting and decompressing instructions from a variable-length compressed instruction set

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
US66102796A 1996-06-10 1996-06-10
US08/659,708 1996-06-10
US08/659,709 1996-06-10
US08/661,003 US5896519A (en) 1996-06-10 1996-06-10 Apparatus for detecting instructions from a variable-length compressed instruction set having extended and non-extended instructions
US08/659,709 US5794010A (en) 1996-06-10 1996-06-10 Method and apparatus for allowing execution of both compressed instructions and decompressed instructions in a microprocessor
US08/659,708 US5905893A (en) 1996-06-10 1996-06-10 Microprocessor adapted for executing both a non-compressed fixed length instruction set and a compressed variable length instruction set
US08/661,027 1996-06-10
US08/661,003 1996-06-10

Publications (2)

Publication Number Publication Date
WO1997048041A1 WO1997048041A1 (en) 1997-12-18
WO1997048041A9 WO1997048041A9 (en) 1998-04-02

Family

ID=27505302

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1997/009984 WO1997048041A1 (en) 1996-06-10 1997-06-10 An apparatus and method for detecting and decompressing instructions from a variable-length compressed instruction set

Country Status (4)

Country Link
JP (1) JP2000512409A (en)
AU (1) AU3480897A (en)
GB (1) GB2329495B (en)
WO (1) WO1997048041A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6189090B1 (en) 1997-09-17 2001-02-13 Sony Corporation Digital signal processor with variable width instructions
SE9704476L (en) 1997-12-02 1999-06-23 Ericsson Telefon Ab L M Extended instruction decoding
EP0942357A3 (en) 1998-03-11 2000-03-22 Matsushita Electric Industrial Co., Ltd. Data processor compatible with a plurality of instruction formats
WO2000008554A1 (en) * 1998-08-07 2000-02-17 Koninklijke Philips Electronics N.V. Apparatus with program memory and processor
US7676653B2 (en) 2007-05-09 2010-03-09 Xmos Limited Compact instruction set encoding
US9710277B2 (en) * 2010-09-24 2017-07-18 Intel Corporation Processor power management based on class and content of instructions
US9361097B2 (en) * 2013-10-18 2016-06-07 Via Technologies, Inc. Selectively compressed microcode
US9372696B2 (en) 2013-10-18 2016-06-21 Via Technologies, Inc. Microprocessor with compressed and uncompressed microcode memories
GB2586258A (en) 2019-08-15 2021-02-17 1Inspiries Tech Ltd Efficient processor machine instruction handling
US11204768B2 (en) 2019-11-06 2021-12-21 Onnivation Llc Instruction length based parallel instruction demarcator

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4437149A (en) * 1980-11-17 1984-03-13 International Business Machines Corporation Cache memory architecture with decoding
JPH07121352A (en) * 1993-10-21 1995-05-12 Canon Inc Arithmetic processor
GB2284492B (en) * 1993-12-06 1998-05-13 Graeme Roy Smith Improvements to computer control units
GB2289353B (en) * 1994-05-03 1997-08-27 Advanced Risc Mach Ltd Data processing with multiple instruction sets

Similar Documents

Publication Publication Date Title
US5905893A (en) Microprocessor adapted for executing both a non-compressed fixed length instruction set and a compressed variable length instruction set
US5896519A (en) Apparatus for detecting instructions from a variable-length compressed instruction set having extended and non-extended instructions
US5867681A (en) Microprocessor having register dependent immediate decompression
US6412066B2 (en) Microprocessor employing branch instruction to set compression mode
US5794010A (en) Method and apparatus for allowing execution of both compressed instructions and decompressed instructions in a microprocessor
CN102207849B (en) Method and apparatus for performing logical compare operation
EP1320800B1 (en) Cpu accessing an extended register set in an extended register mode and corresponding method
CN102473093B (en) Packed data in multiple passages is decompressed
US6671797B1 (en) Microprocessor with expand instruction for forming a mask from one bit
JPH09512651A (en) Multiple instruction set mapping
GB2289353A (en) Data processing with multiple instruction sets.
EP1063586B1 (en) Apparatus and method for processing data with a plurality of flag groups
WO2001025900A1 (en) Risc processor using register codes for expanded instruction set
US5884071A (en) Method and apparatus for decoding enhancement instructions using alias encodings
US7865699B2 (en) Method and apparatus to extend the number of instruction bits in processors with fixed length instructions, in a manner compatible with existing code
US5682531A (en) Central processing unit
US7546442B1 (en) Fixed length memory to memory arithmetic and architecture for direct memory access using fixed length instructions
WO1997048041A9 (en) An apparatus and method for detecting and decompressing instructions from a variable-length compressed instruction set
JP3781519B2 (en) Instruction control mechanism of processor
WO1997048041A1 (en) An apparatus and method for detecting and decompressing instructions from a variable-length compressed instruction set
US6026486A (en) General purpose processor having a variable bitwidth
JP2682469B2 (en) Instruction code encoding method
US20040024992A1 (en) Decoding method for a multi-length-mode instruction set
US6681319B1 (en) Dual access instruction and compound memory access instruction with compatible address fields
GB2349252A (en) An apparatus and method for detecting and decompressing instructions from a variable length compressed instruction set