WO1994002894A2 - Data-processing system with a device for handling program loops - Google Patents

Data-processing system with a device for handling program loops

Info

Publication number
WO1994002894A2
WO1994002894A2 PCT/GB1993/001470 GB9301470W WO9402894A2 WO 1994002894 A2 WO1994002894 A2 WO 1994002894A2 GB 9301470 W GB9301470 W GB 9301470W WO 9402894 A2 WO9402894 A2 WO 9402894A2
Authority
WO
WIPO (PCT)
Prior art keywords
instruction
data
program
bit
register
Prior art date
Application number
PCT/GB1993/001470
Other languages
English (en)
Other versions
WO1994002894A3 (fr
Inventor
Gérard Chauvel
Yves Wenzinger
Peter Dent
Original Assignee
Texas Instruments France
Texas Instruments Incorporated
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from FR9208664A external-priority patent/FR2693571B1/fr
Priority claimed from FR9208667A external-priority patent/FR2693586B1/fr
Priority claimed from FR9208668A external-priority patent/FR2693572B1/fr
Priority claimed from FR9208669A external-priority patent/FR2693573B1/fr
Priority claimed from FR9208665A external-priority patent/FR2693576B1/fr
Application filed by Texas Instruments France, Texas Instruments Incorporated filed Critical Texas Instruments France
Priority to EP93916063A priority Critical patent/EP0650613A1/fr
Priority to KR1019950700141A priority patent/KR950702719A/ko
Priority to JP6503868A priority patent/JPH08509080A/ja
Publication of WO1994002894A2 publication Critical patent/WO1994002894A2/fr
Publication of WO1994002894A3 publication Critical patent/WO1994002894A3/fr

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/04 Addressing variable-length words or parts of words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/80 Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8007 Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors single instruction multiple data [SIMD] multiprocessors
    • G06F15/8015 One dimensional arrays, e.g. rings, linear arrays, buses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30094 Condition code generation, e.g. Carry, Zero flag
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30181 Instruction operation extension or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/32 Address formation of the next instruction, e.g. by incrementing the instruction counter
    • G06F9/322 Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address
    • G06F9/325 Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address for loops, e.g. loop detection or loop counter
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/34 Addressing or accessing the instruction operand or the result ; Formation of operand address; Addressing modes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802 Instruction prefetching
    • G06F9/3804 Instruction prefetching for branches, e.g. hedging, branch folding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3842 Speculative instruction execution
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C7/00 Arrangements for writing information into, or reading information out from, a digital store
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11C STATIC STORES
    • G11C8/00 Arrangements for selecting an address in a digital store

Definitions

  • The present invention concerns data-processing systems of the microprocessor type, and in particular a data-processing system comprising an improved device for handling program loops, of the type comprising a program-address-start register to record the number of the first instruction of the loop and a repeat counter initialized, at the time the first instruction is executed, with the number of repetitions to be made; the repeat counter is decremented at the end of each loop, and execution of the loop is repeated until the counter reaches zero.
  • The data-processing system may include a status register containing status parameters that are a function of the results of arithmetic and logic operations and the updating of which depends on the program.
  • the command program of the data-processing system may include instructions depending on parameters of the status register determined by the result of arithmetic and logic operations.
  • a device for selectively reading/writing data, byte by byte, in the data-processing system comprises a data memory in which the memory locations each contain a predetermined number of bytes.
  • A program of instructions for running a data-processing system such as a microprocessor often consists of an instruction sequence to be executed many times in order to achieve a given result. This is what is commonly called a loop.
  • the classical technique for handling loops consists of recording the number of repetitions of the instruction sequence or loops to be performed in a counter and recording the number of the last instruction in the loop in a register.
  • a comparison is made between the number of the instruction to be executed, given by a program counter, and the number of the last instruction recorded.
  • the repeat counter is decremented and the value of the program counter is replaced by the value of the first instruction of the loop, which is recorded in a program-address-start register, this being done as long as the repeat counter has not reached zero.
  • This comparison requires functional elements such as a register for recording the number of the last instruction of the loop and a comparator, as well as a comparison operation performed at each instruction. Loop handling is therefore performed at the cost of logic circuits, which evidently consume silicon and a significant amount of time.
  • One purpose of the invention is therefore to realize a data-processing system such as a microprocessor, of which the device for handling loops consists of a reduced number of logic circuits and permits a substantial gain in processing speed.
  • An instruction set for directing a classical data-processing system such as a microprocessor or a digital-signal processor (DSP) consists of instructions of the arithmetic and logic type and instructions for breaking the instruction sequence, such as conditional or unconditional branching or skipping instructions.
  • Instructions of the arithmetic and logic type direct arithmetic and logic operations performed by the arithmetic and logic unit (ALU).
  • The result of an operation of this type is that a certain number of parameters, called status parameters, which are generally stored in a status register, are modified. These parameters are, among others, parameters indicating that the result of the operation is zero (Z) or negative (N), or that the operation has generated a carry (C).
  • the status parameters in the status register are therefore updated by the results provided by the arithmetic and logic unit.
  • This automatic updating has the drawback that the programmer does not control the contents of the status register, since it is modified at each instruction. But most of all, in the case of a test that conditions a branching, it is not possible to insert instructions
  • Another purpose of this invention is therefore to realize a data-processing system in which the status parameters, resulting from arithmetic and logic operations, are not updated automatically.
  • A program of instructions for directing a data-processing system such as a microprocessor often includes instructions to be executed differently according to the results of preceding instructions. For example, this can be common in a specialized data-processing system such as a digital signal processor (DSP).
  • The result of a comparison determines whether an addition or subtraction is to be performed. According to the result of the comparison, the program flows in sequence if addition is involved or makes a branch that skips one or more instructions if subtraction is involved. Of course, a further skip of instructions must again be provided for if subtraction is involved.
  • a double set of instructions (addition and subtraction) must therefore be included in the program, and branching and skipping instructions must also be provided. This substantially increases the number of program instructions, and most of all, it consumes time, to the detriment of the processing speed of the set of operations.
  • Another purpose of the invention is therefore to realize a data-processing system such as a microprocessor comprising a
  • Another purpose of the invention is to realize a data-processing system including a single instruction in place of several alternative instructions, depending on the results of previous operations.
  • the elementary addressable unit in memory remains the byte, that is, the 8-bit word.
  • this capacity is always expressed in bytes. This comes from the fact that the group of 8 bits remains the most adequate unit of information for representing a character of alphanumeric data.
  • 16-bit processors have the possibility of reading (or writing) 8-bit words or 16-bit words in memory.
  • When 16-bit words are involved, there are two ways of reading. When reading is done in order of increasing addresses, it starts either from the high-order 8-bit portion of the 16-bit word or from the low-order 8-bit portion. This can be of great importance in numerous applications, and in particular in telecommunications, which requires transmission of data in blocks formed by series of bytes. In fact, these transmissions can be made by sending the most significant bit first (MSB mode) or by sending the least significant bit first (LSB mode); it is necessary to determine which of the two parts of the 16-bit word should be read first.
  • This type of application, in which a chain of 8-bit bytes is transmitted, mostly uses indirect addressing. That is, the instructions that direct the transfer of information have access to one of the general registers of the processor, in which the memory address of the word is located. Each instruction contains a group of two bits which directs incrementing of the address located in the register, or decrementing it, or neither of those. Incrementing is used when one desires to read a series of bytes in memory in the direction of increasing addresses. Decrementing is used when one desires to read a series of bytes in memory in the direction of decreasing addresses. And of course, neither of these commands is given when either the transmission operation is terminated or the address is loaded into the register by an operation that is neither incrementing nor decrementing. Consequently, two instruction bits must be available in order to provide for the three cases above.
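  • By way of illustration only, the three address-update cases described above can be sketched in Python. The two-bit encodings NONE, INC and DEC, the function name read_indirect, and the list-based memory and dictionary-based register file are assumptions made for this sketch; the specification does not give the actual bit values.

```python
# Hypothetical two-bit encodings of the address-update field (assumed values).
NONE, INC, DEC = 0b00, 0b01, 0b10


def read_indirect(memory, registers, reg, mode):
    """Read memory at the address held in registers[reg], then update
    that address according to the two-bit mode field."""
    address = registers[reg]
    value = memory[address]
    if mode == INC:
        registers[reg] = address + 1   # walk towards increasing addresses
    elif mode == DEC:
        registers[reg] = address - 1   # walk towards decreasing addresses
    # mode == NONE: the address register is left unchanged
    return value


memory = [10, 20, 30]
registers = {"R0": 1}
first = read_indirect(memory, registers, "R0", INC)   # reads address 1
second = read_indirect(memory, registers, "R0", DEC)  # reads address 2
third = read_indirect(memory, registers, "R0", NONE)  # reads address 1
```

  • Because three distinct behaviours must be distinguished, a single bit cannot encode the field in this model, which is the cost the passage above attributes to the prior technology.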
  • Another purpose of the invention is therefore to realize a device for reading/writing in the memory of a data-processing system that requires only one incrementer (or decrementer) for the operations of addressing a series of bytes in memory and not the incrementer-decrementer combination used in the systems of prior technology.
  • One object of the invention is a data-processing system in which one of the last instructions of the instruction sequence to be executed several times, whose distance from the last instruction of the sequence depends on the number of pipeline levels of the system, contains an end-of-loop code, which directs that, when the last instruction of the loop is executed, the program counter be loaded from the program-address-start register as long as the repeat counter is not at zero, so that the instruction sequence is repeated a number of times equal to the value contained in the repeat counter.
  • An object of the invention is a data-processing system of the type stated above, in which the arithmetic and logic instructions include an updating field and the status parameters stored in the status register are updated only when the updating field contains a predetermined value.
  • Another object of the invention is a data-processing system of the type comprising an arithmetic and logic unit for performing arithmetic and logic operations by the operation code of the instructions of a program and a status register containing status parameters, the value of which depends on arithmetic or logic operations performed by the arithmetic and logic unit, and including a circuit for decoding the parameters contained in the status register for modifying the operation code of an instruction transmitted to the arithmetic and logic unit as a function of the status parameters, so as to provide several operation codes equivalent to several alternative instructions, the execution of which depends on the status parameters.
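  • A minimal sketch of this idea, under stated assumptions, is given below in Python. The mnemonic ADDSUB, the choice of the N parameter as the deciding status parameter, and the function names are hypothetical; they only illustrate how a decoding circuit could turn one instruction into one of several alternative operation codes.

```python
def resolve_opcode(opcode, status_n):
    """Model of a decoder that modifies the operation code sent to the ALU
    as a function of a status parameter (here, hypothetically, N)."""
    if opcode == "ADDSUB":
        # Assumed convention: N set selects subtraction, N clear selects addition.
        return "SUB" if status_n else "ADD"
    return opcode  # all other opcodes pass through unmodified


def execute(opcode, a, b, status_n):
    """Execute one ALU instruction after status-dependent opcode resolution."""
    effective = resolve_opcode(opcode, status_n)
    if effective == "ADD":
        return a + b
    if effective == "SUB":
        return a - b
    raise ValueError(f"unsupported opcode: {effective}")


sum_result = execute("ADDSUB", 7, 3, status_n=False)   # behaves as ADD
diff_result = execute("ADDSUB", 7, 3, status_n=True)   # behaves as SUB
```

  • In this model a single ADDSUB instruction replaces the pair of alternative instructions and the branch or skip instructions that would otherwise select between them.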
  • Another object of the invention is a device for reading data byte by byte in a system of data comprising a data memory, of which the memory locations each contain a predetermined number of bytes, the read device comprising a selection means that determines the end of each memory location from which the first byte of the predetermined number of bytes should be read first in response to a selection bit located in the status register of the system.
  • Another object of the invention is a device for writing data byte by byte in a system of data comprising a data memory, of which the memory locations each contain a predetermined number of bytes, the write device comprising a selection means that determines the end of each memory location into which the first byte of the predetermined number of bytes should be written first in response to a selection bit located in the status register of the system.
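  • The effect of the selection bit can be sketched as follows in Python, assuming 16-bit memory locations holding two bytes each. The function name read_bytes, the parameter high_first standing in for the selection bit, and the use of an exclusive-OR on the byte index are assumptions of this sketch and are not claimed to be the circuit of the invention.

```python
def read_bytes(memory, start, count, high_first):
    """Read `count` bytes from 16-bit words in order of increasing addresses.

    `high_first` plays the role of the selection bit: when True the
    high-order byte of each word is delivered first, otherwise the
    low-order byte is delivered first.
    """
    out = []
    for i in range(count):
        word = memory[start + i // 2]
        # Conditionally complement the byte-index bit with the selection bit.
        select_low = (i & 1) ^ (0 if high_first else 1)
        if select_low:
            out.append(word & 0xFF)
        else:
            out.append((word >> 8) & 0xFF)
    return out


memory = [0x1234, 0xABCD]
msb_order = read_bytes(memory, 0, 4, high_first=True)
lsb_order = read_bytes(memory, 0, 4, high_first=False)
```

  • In both cases the word address only ever increases, so this model needs an incrementer but no decrementer.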
  • FIG. 1 shows the overall scheme of a data-processing system in which the invention has been applied
  • Figure 2 shows the overall scheme of a device for repetitive execution of an instruction sequence according to the prior technology
  • Figure 3 shows the overall scheme of a device for repetitive execution of an instruction sequence according to a preferred implementation mode of the invention
  • FIGS. 4A-4D show diagrams illustrating the operation sequence performed when the device shown in Figure 3 is applied to various steps of the repetitive execution of the instruction sequence
  • FIG. 5 is the overall scheme of a preferred application of the invention.
  • Figure 6 schematically illustrates a device applying the aspect of the invention, wherein the command program includes instructions depending on parameters of a status register;
  • Figure 7 is an overall scheme of a preferred implementation of the aspect of the invention shown in Figure 6;
  • FIG. 8 shows an overall scheme of a preferred embodiment of a device for reading data in memory according to another aspect of the invention.
  • FIG. 9 shows an overall scheme of a preferred embodiment of the device for writing data in memory in accordance with the aspect of the invention also illustrated in Figure 8;
  • FIGS. 10A and 10B show embodiments of the ones-complement circuit used in the read/write devices illustrated in Figures 8 and 9.
  • Figure 1 shows the organization of a data-processing system incorporating the invention.
  • A preferred data-processing system for applying the invention is a data processor of the digital-signal type or a microprocessor as described below.
  • the program instructions are recorded in a program memory 10.
  • The program memory 10 is generally a permanent memory or a read-only memory (ROM) of the programmable (PROM) or erasable (EPROM) type. However, memory 10 can also be a read-write memory (RAM).
  • Memory 10 communicates via the bus 12 with a decoding and control unit 14, which contains the essentials of the system logic, with the task of decoding the instructions coming from the program memory 10 and of controlling the progress of the decoded information in the instructions. This information can be used either to access data or for arithmetic and logic operations.
  • the decoding and control unit 14 is connected to the address bus 16.
  • Bus 16 is connected, on the one hand, to a block of processing registers 18, which contain the general registers used for instruction processing. These registers are designed to contain the operands contained in the instructions or coming from the data memory, or they can serve as index registers for indirect addressing. But these processing registers can have other functions assigned most of the time to the decoding and control unit.
  • The block of processing registers 18 can contain the program counter (PC), incrementation registers, shift registers used mostly in operations of multiplication and division, such as a Booth shift register, or else a status register, which contains a number of parameters determined after execution of an arithmetic or logic instruction with result zero, result negative, or with a carry.
  • Most of the registers of block 18 are addressable by means of the address bus 16.
  • The address bus 16 is likewise an input to an address-decoding unit 20, the function of which is to decode the addresses resulting from decoding of the instructions by the decoding and control unit 14 in order to access data stored in a data memory 22.
  • The data memory is generally a read-write memory (RAM), but it could also be a permanent memory or any other kind of memory.
  • the block of processing registers 18 is connected to the address bus 16 in such a way that the contents of certain registers of block 18 can be transmitted to the address-decoding unit 20 when this content is an address for accessing the data memory 22.
  • an arithmetic and logic unit (ALU) 24 is connected to the output of the block
  • The output of the arithmetic and logic unit 24 is either stored as data in the data memory 22 via the data bus 26 or supplied as a new address to one of the registers of the block of processing registers 18 via the results bus 28.
  • the data coming from the data memory 22, the address of which has been provided by the address-decoding unit 20, are transmitted to the input of the arithmetic and logic unit 24 by means of the memory-reading bus 30.
  • the block of processing registers 18 is connected to the control unit for peripheral devices 32 by means of the input/output bus 34.
  • The data-processing system that has just been described is of the Harvard type, that is, the memory 10 containing the instructions is completely separate from the memory 22 containing the data. But the invention could likewise be applied in data-processing systems of another type.
  • The data-processing system can be a microprocessor of universal application, but equally a specialized microprocessor such as a digital signal processor (DSP) used in devices for data transmission, and in particular a cellular radio device (GSM, DECT, or equivalent).
  • The instructions comprising the command program of a data-processing system such as that described above are generally processed in sequence, a program counter (PC) being incremented at each of the instructions. But it happens that an instruction sequence must be repeated a certain number of times in order to perform repetitive operations. In the classical manner, this repetition of an instruction sequence, called a loop, is applied in the manner described below.
  • the instruction number of the end of the loop is loaded into a program-address-end register (PAER) 42.
  • At each incrementation of the program counter 46, its contents are compared to the contents of the PAER register 42 by the comparator 58. If the contents are different, a 0 bit is transmitted to the AND circuit 60 on its input 62. Consequently, the output 64 of the AND circuit is at zero, which activates input 56 (incrementation of the program counter) of the multiplexer 54.
  • the comparator 58 transmits a 1 bit on line 62.
  • As the other input line 66 of the AND circuit 60 is at 1, because the RPTC counter 40 is not equal to zero, the AND circuit 60 is enabled, and a high signal is transmitted on its output line 64 to the multiplexer 54.
  • the value of the RPTC counter 40 is decremented by means of the decrementer 68.
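  • The classical mechanism of Figure 2 can be approximated by the following Python sketch. It is a behavioral model only: the function name run_classic_loop, the zero-based instruction numbering, and the choice to pass the initial RPTC value as a parameter are assumptions of this sketch. In this model the loop body runs rptc_init + 1 times.

```python
def run_classic_loop(num_instructions, rptc_init):
    """Behavioral model of the comparator-based loop device.

    Returns the list of executed instruction numbers and the number of
    PC-versus-PAER comparisons that were needed.
    """
    pasr = 0                      # program-address-start register
    paer = num_instructions - 1   # program-address-end register
    rptc = rptc_init              # repeat counter
    pc = pasr
    executed = []
    comparisons = 0
    while pc < num_instructions:
        executed.append(pc)
        comparisons += 1                     # comparator works at every instruction
        at_end = (pc == paer)                # comparator output
        take_loop = at_end and rptc != 0     # AND of comparator and "RPTC not zero"
        if take_loop:
            rptc -= 1                        # decrementer
            pc = pasr                        # multiplexer selects the start address
        else:
            pc += 1                          # multiplexer selects the incremented PC
    return executed, comparisons


executed, comparisons = run_classic_loop(num_instructions=3, rptc_init=2)
```

  • The comparison count equals the number of executed instructions, which illustrates the per-instruction cost that the invention sets out to remove.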
  • the purpose of the invention is to reduce the number of operations to be performed and the number of circuits used. Such a goal is realized by using the device illustrated by the overall scheme of Figure 3 in place of the device of Figure 2.
  • the number n - 1 is loaded into an RPTC repeat counter by means of the data bus 72.
  • the number 1, given by the program counter (PC) 74, is loaded into a program-address-start register (PASR) 76 by means of bus 78 under the command of an initialization signal on line 79.
  • N - 1 are then executed in sequence, the program counter 74 being incremented at each instruction thanks to an incrementer 80 and by means of a multiplexer 82, of which the active input is the bus 84, as will be explained below.
  • the RPTC counter 70 being positive (its contents were set at n - 1), its output 86 is at 1.
  • Line 86 is an input to an AND circuit 90, of which the other input 88 is the EOL signal. This input is low, that is, the EOL signal is at zero, during the whole execution of instructions 1 through N - 2, thus blocking the AND circuit 90, whose output 92 then makes the multiplexer 82 select its input 84, which carries the value previously incremented by the incrementer 80.
  • the essential characteristic of the invention consists of incorporating into the next-to-last instruction of the loop a coding indicating the end of the loop to permit realization of a reduced device with respect to the classical device, fulfilling the same function.
  • the device illustrated in Figure 3 no longer includes the program-address-end register (PAER) or a comparator for the purpose of comparing the contents of the program-address-end register with the program counter at each instruction.
  • the EOL code indicating the end of the loop is, in the preferred implementation mode of the invention, one of the instruction bits. When this bit is at 1 in an instruction, this means that the next instruction is the last instruction of an instruction sequence, and, as has been seen before, this 1 bit permits the number of the first instruction of the instruction sequence in the loop to be transferred to the program counter.
  • the EOL code may have a particular value in the instruction field and not just a single bit, in which case a decoding will be necessary.
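  • The behavior obtained with the EOL code can be sketched as follows in Python. This is an architectural-level model that ignores the cycle-by-cycle pipeline timing; the Instr class, its eol attribute, and the function run_loop are hypothetical names introduced for this sketch. As in the description, the repeat counter is loaded with n - 1 and the EOL bit is carried by the next-to-last instruction.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    eol: bool = False   # hypothetical end-of-loop bit of the instruction


def run_loop(body, repetitions):
    """Execute `body` `repetitions` times using only an EOL bit,
    a repeat counter (RPTC) and a start register (PASR)."""
    rptc = repetitions - 1   # RPTC is loaded with n - 1
    pasr = 0                 # index of the first instruction of the loop
    pc = pasr
    trace = []
    last_pending = False     # set once an instruction carrying EOL has been seen
    while pc < len(body):
        instr = body[pc]
        trace.append(instr.name)
        if last_pending:
            # The instruction just executed was the last one of the loop.
            last_pending = False
            if rptc != 0:
                rptc -= 1
                pc = pasr    # reload the program counter from PASR
                continue
        elif instr.eol:
            last_pending = True
        pc += 1              # normal sequential incrementation
    return trace


body = [Instr("i1"), Instr("i2"), Instr("i3", eol=True), Instr("i4")]
trace = run_loop(body, 3)
```

  • No end-address register and no per-instruction comparison appear in this model; the decision to loop back is taken only when an EOL-marked instruction has been encountered.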
  • The device of FIG. 3 has been applied in a data-processing system or a microprocessor whose functioning is of the pipeline type with two levels. That is, access and decoding of an instruction take place in the same cycle as execution of the preceding instruction.
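  • The two-level pipeline just described can be pictured with a small Python sketch that lists, for each cycle, which instruction is being accessed and decoded and which is being executed. The function name pipeline_schedule and the tuple layout are assumptions made for illustration.

```python
def pipeline_schedule(instructions):
    """Return (cycle, fetched_and_decoded, executed) rows for a two-level
    pipeline in which an instruction is fetched and decoded during the
    cycle in which its predecessor is executed."""
    rows = []
    for cycle in range(len(instructions) + 1):
        fetched = instructions[cycle] if cycle < len(instructions) else None
        executed = instructions[cycle - 1] if cycle >= 1 else None
        rows.append((cycle, fetched, executed))
    return rows


schedule = pipeline_schedule(["A", "B", "C"])
```

  • In such a schedule the next instruction address must already be chosen while the current instruction executes, which is consistent with placing the end-of-loop indication one instruction ahead of the last instruction of the loop.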
  • Figures 4A-4D, which show diagrams illustrating the progress of the operations applied in the device of the invention, will permit a better understanding of these particular points.
  • Diagram 4B schematically illustrates the end of the first loop.
  • decoding of instruction N indicates that the RPTC counter should be decremented.
  • An inherent advantage in the invention is that a precaution, which is taught to every beginning programmer, is no longer needed, that is, having to avoid conditional branches into the interior of a loop.
  • Branches are delicate to manipulate within a loop, because it is not always easy to return to the loop when several branches intervene, and it is therefore imperative to terminate at the end of the loop to the extent that the value at the end of the loop stored in a PAER register (see Figure 2) conditions this end.
  • the invention permits this obligation to be eliminated, because the programmer can incorporate one or more instructions involving the EOL code into his program without problem. He could thus also include branches to the interior of the instruction sequence without having the problems that programmers using the previous technology have.
  • the device that applies the invention includes a certain number of elements (program counter, incrementer, repeat counter, multiplexer, etc.) that are found typically incorporated into the decoding and control unit.
  • the program counter, the program-address-start register, the incrementer, and the multiplexer are registers in the block of processing registers.
  • Such an integration of registers having the same structure permits a gain in speed as well as in cost. But it is clear that one can envision such a device in which some of the elements are found in the decoding and control unit and others are in the block of registers, without leaving the scope of the invention.
  • The invention permits a reduction in the number of elements in the device employed (absence of the PAER register and the comparator).
  • This advantage appears concretely in a reduction in elementary circuits (transistors).
  • the invention leads to a saving of 600 transistors, or a saving of 10% in the whole microprocessor. But a gain in speed is equally possible, especially for short loops, since there is an economy
  • The block of data-processing registers includes a register with the role of storing the status parameters that result from arithmetic and logic operations performed by the arithmetic and logic unit (ALU).
  • the status register is a particular register that is not part of the general processing registers; its content is automatically updated at each arithmetic and logic operation by status parameters coming
  • the status register is a general register of the same kind as the other general registers that are part of the block of processing registers, and it therefore remains under control of the programmer. But in contrast to prior systems, the status register is only updated by results coming from the arithmetic and logic unit in execution of an instruction specifically directing this updating.
  • Figure 5 schematically shows the device that applies this invention.
  • Status register 140 contains the status parameters N, indicating a negative operation result, Z, a zero result, and C, an operation with a carry.
  • These three parameters are, in the preferred realization, those which are updated by the results of an arithmetic and logic operation performed by the ALU 24 using the device of Figure 5. But it is evident that this choice is not limiting, and that a greater number of status parameters could be updated.
  • Each program instruction contains, in the classical manner, a field 142 assigned to an operation code (OPCODE) and a condition-code field 144. Moreover, the instruction includes a field US 146 for updating the parameters N, Z, and C.
  • the US field is a single bit, but this is in no way limiting, and the US field could contain several bits without leaving the scope of the invention.
  • Bit 146 serves as an input to an AND circuit 148, the other input of which is the write signal, W, which functions in the classical way at each instruction cycle.
  • the AND circuit becomes passable, and the write signal, W, directs loading of new status parameters, N, Z, and C, by means of lines 150 coming from the arithmetic and logic unit 24 into status register 140.
  • the status parameters of status register 140 would be updated automatically by means of the lines 150 at each instruction when signal W has been activated.
  • the circuit that permits this updating has not been modified (it is part of the AND circuit 148); it is classical, and its description is not necessary for understanding this invention.
  • the invention accomplishes its full purpose when it is used in combination with conditional instructions.
  • the use of conditional instructions, as described in French Patent Application No. FR-A-9,107,985, consists of providing a condition-code field on which execution of the instruction depends.
  • DSP signal processors
  • conditional instructions such as the instruction to update the status register as envisioned by the invention, the instruction sequence becomes:
  • branching instructions such as those shown in the above example require several cycles, either 2, 3, or 4 cycles. It will therefore be seen that the time consumed in the above example without application of the invention, that is, with the branching instruction, is 4, 5, or 6 cycles in the most favorable case. In contrast, when the invention is applied, only 3 cycles are necessary, since the branching instruction that uses several cycles has disappeared.
  • the following example illustrates a sequence of instructions that could be encountered in a program executed in a data-processing system applying the invention.
  • Updating the status register only occurs at instruction 1 and instruction 36. This permits the programmer to control the flow of the program precisely. It is thus possible to place a conditional instruction that takes the values N, Z, and C into account (condition Cl) as the 6th instruction. This would not have been possible in a prior data-processing system, because the intermediate instructions would update the status register automatically. In contrast, to the extent that the programmer himself determines at which instructions the status register is updated, he has been able to use the status register as a processing register in instructions 2, 7, 14, 16, and 23, which would not, of course, have been possible in the prior technology.
  • application of the invention permits the programmer to retain control of the status register and to make use of it like any other processing register. Moreover, in combination with the use of conditional instructions, the invention permits avoiding numerous branches that consume many cycles. These two advantages contribute to a substantial reduction in the number of program instructions. In practice, application of the invention can permit a reduction of about 2 million instructions out of a total of about 10 million instructions required by programs of the ADPCM type used in a voice encoder-decoder, which represents an improvement in the performance of the processor on the order of 20%.
  • each instruction 240 contains a field 242 containing the operation code (OPCODE) of the instruction.
  • This operation code, which is decoded by the decoding and control unit 14 (already illustrated in Figure 1), is the code that directs the type of operation to be performed, such as Addition, Subtraction, Comparison, etc.
  • the signals on the output lines 244 then direct either addressing of processing registers or addressing of data in memory, as has been seen previously with reference to Figure 1.
  • a group 246 of output lines which can, moreover, be reduced to a single line as will be seen later, is connected to a decoding circuit 248.
  • lines 246 provide the signals produced by decoding the instructions used to apply the invention, while, for all the other instructions, the decoding signals are provided by lines 244.
  • the information provided on lines 246 for decoding the operation code of an instruction depending on the status parameters is used by the decoding circuit 248 in combination with certain status parameters found in the status register 250 of the block of processing registers 18.
  • the status parameters of status register 250 used within the framework of the invention are: N, a bit at 1 when the sign bit (MSB) of the result of the preceding operation is at 1, indicating that the result of this operation is negative; Z, a bit at 1 when the result of the preceding operation is zero; and C, a carry bit at 1 when the operation generates a carry.
  • the decoding circuit 248 provides a command code on lines 252 in response to the signals received on lines 246 coming from decoding of the operation code of an instruction depending on the status parameters.
  • This command code is transmitted directly to the arithmetic and logic unit 24, the operation of which will be different according to the code.
  • three lines 252 are shown, because eight different command codes can be provided in response to the eight possible combinations of the parameters N, Z, and C. But it goes without saying that the number of lines can be less than or greater than three without violating the principle of the invention.
  • Figure 7 illustrates a preferred implementation of this aspect of the invention, which demonstrates, through a real example, the reduction in the number of instructions made possible by application of the invention.
  • status parameter N of status register 250 is used.
  • the value of N is used as the first input of an EXCLUSIVE-OR circuit 260 (corresponding to the decoding circuit of Figure 6).
  • the second input of circuit 260 is an operation-code bit of an instruction depending on status parameters, of which there are four in the case illustrated. According to whether N has the value 0 or the value 1, depending on the result of the preceding comparison, this operation-code bit will or will not be inverted.
  • the instruction sequence only requires two instructions in place of five with a classical data-processing system.
  • application of this aspect of the invention permits a substantial gain of about 500,000 instructions out of the total of 8 million instructions necessary for a program of the ADPCM type used in a voice encoder-decoder, which represents an increase in the speed of the processor on the order of 6-7%.
  • the input code to the decoding circuit will comprise two bits, and the decoding circuit itself should comprise two EXCLUSIVE-OR circuits in series.
  • the four possible outputs of the decoding circuit could be used to direct the arithmetic and logic unit to perform one of four operations, such as addition, subtraction, setting to zero, and setting to one.
  • Block 27 between bus 26 and the memory 22 in the system as shown in Figure 1, represents the device for writing byte by byte in memory, according to another aspect of the invention, as will
  • Block 29 represents the device for reading byte by byte in memory according to this aspect of the invention, as will be explained in the following.
  • the address-bus lines 16 are divided into two parts.
  • the lines corresponding to the high-order bits 342 are decoded by an address-decoding unit 340 to access memory locations in the data memory 22. If each memory location 344 comprises two bytes, bus 342 comprises all the address-bus lines 16 except line A0, corresponding to the lowest-order bit. If each memory location comprises 4 bytes, bus 342 comprises all the lines of the address bus 16 except lines A0 and A1, corresponding to the two lowest-order bits of the address. And so forth.
  • the data bytes considered are octets, or 8 bits. This means that a memory location containing two bytes corresponds to 16 bits, a memory location containing four bytes corresponds to 32 bits, etc.
  • although processors currently use 8-bit words or bytes as units of elementary information, the invention can be applied likewise to bytes containing more or fewer than 8 bits.
  • the highest-order byte of the memory word may be the leftmost byte 344-1 in memory location 344; in this case the low-order byte is at the right in 344-2. Or else the lowest-order byte is the leftmost byte in 344-1; in this case the highest-order byte is at the right end of the memory location, or part 344-2.
  • when reading byte by byte, if we start with the highest-order byte in the leftmost part of the memory location, or part 344-1, the reading is said to be done in the "big endian" mode.
  • if we start with the lowest-order byte in the rightmost part 344-2 of the memory location, the reading is said to be done in the "little endian" mode.
  • the "little endian" mode corresponds to reading bits 7-0 first in 344-2.
  • incrementer or decrementer
  • Each instruction therefore has a bit which, when it is 1, directs incrementing of the contents of the register containing the address, for example register X of the register block 18 (see Figure 1), by the incrementer INC of the register block 18. If the processor works in the mode called "little endian," the lowest-order byte should be received before the highest-order byte, and conversely if the processor works in the "big endian" mode.
  • the invention is characterized by the possibility of the processor being able to operate equally well in the "little endian" and the "big endian" modes, while still only having one incrementer.
  • the lines of the address bus 16 corresponding to the low-order bits of the address, or A0, A1, etc., are connected to the input of a ones-complement circuit 346.
  • Another input line 348 of circuit 346 comes from a
  • the ones-complement circuit 346 has the effect of inverting each of the bits on the lines A0, A1, etc., if the value of the B/L bit is equal to 1 (and likewise has the effect of inverting the order of reading bytes in a memory location) and of having no action in the inverse case.
  • Lines A'0, A'1, etc., at the output of circuit 346 are connected to the command input of a multiplexer 350.
  • the multiplexer 350 receives as inputs the bytes contained in the memory location 344 addressed by the high-order bits on the lines comprising bus 342.
  • architecture permits an optimization (incrementing or decrementing) of resources in the case where the two addressing modes are not indispensable, because realization of two addressing modes has a higher cost than implementation of the invention.
  • Figure 9, which shows an overall scheme of a preferred embodiment of the write device according to this aspect of the invention, contains certain parts in common with Figure 8.
  • the address-bus lines are likewise divided into two parts.
  • the lines corresponding to high-order bits 342 of the address bus 16 are decoded by the address-decoding unit 340 for accessing memory locations in the data memory 22.
  • bus 342 comprises all the address-bus lines 16 except line A0, corresponding to the lowest-order bit, if each memory location 354 comprises two bytes
  • bus 342 comprises all the lines of bus 16 except lines A0 and A1, corresponding to the two lowest-order bits of the address, if each memory location comprises 4 bytes, and so forth.
  • each memory location is composed of a predetermined number of bytes and therefore has a capacity that is a multiple of 8 bits. Here too, it should be noted that this aspect of the invention can be applied to bytes or words containing more or fewer than 8 bits.
  • this aspect of the invention is characterized by the possibility of the processor being able to work both in the "little endian" and the "big endian" mode, while still having only one incrementer.
  • the bus lines 16 corresponding to low-order bits of the address, or A0, A1, etc., are connected to the input of a ones-complement circuit 356.
  • Another input line 358 of circuit 356 comes from the B/L selection bit of the STAT status register of the register block 18.
  • the ones-complement circuit 356 has the effect of inverting each of the bits on lines A0, A1, etc., if the value of the B/L bit is equal to 1 and of having no action in the inverse case.
  • Lines A'0, A'1, etc., at the output of circuit 356 are connected to the input of a demultiplexer or address decoder 360.
  • the demultiplexer 360 has as many output lines as there are bytes contained in memory location 354.
  • the two output lines 362 and 364 of the demultiplexer 360 drive, respectively, two AND circuits, 366 and 368.
  • the other input to the AND circuits 366 and 368 is a write line 370, which is active when data should be stored in memory 22 and the memory location is designated by the address located on bus 16.
  • the demultiplexer 360 therefore makes line 362 or 364 active according to the value of the B/L bit in the STAT status register and the value of the A0 bit.
  • When line 362 is active, AND circuit 366 conducts if the write signal is high on line 370, which permits storage of the byte present on data bus 26 in the left part 354-1 of memory location 354. This byte being the highest-order byte of the word, the memory arrangement corresponds to the "big endian" mode.
  • the memory arrangement corresponds to the "little endian" mode if the byte is stored in part 354-2. Even though only two output lines of the demultiplexer 360, two AND circuits, and two parts (each containing one byte) of the memory location have been shown in Figure 9, it is easy to conceive that there are as many demultiplexer output lines and AND circuits as there are bytes contained in memory location 354.
  • the B/L selection bit could be stored in another register than the STAT status register.
  • the write device represented in Figure 9 permits avoiding simultaneous utilization of an incrementer and a decrementer.
  • the write modes described above are inverted, that is, the "little endian" mode then corresponds to the highest-order byte being written first and the "big endian" mode to the lowest-order byte being written first.
  • Figures 10A and 10B show implementation examples of the ones-complement circuit used in this aspect of the invention.
  • Figure 10A corresponds to a memory in which each location contains 16 bits, consisting of a high-order byte and a low-order byte.
  • the ones-complement circuit is a simple EXCLUSIVE-OR circuit that has for the first input the B/L bit of the status register and for the second input the line of the low-order bit A0 of the address bus.
  • the output A'0 is the input A0 when the B/L bit is equal to 0 and is the inverse of A0 when the B/L bit is equal to 1.
  • the circuit shown in Figure 10B corresponds to memory locations containing 32 bits or 4 bytes of 8 bits.
  • the ones-complement circuit is formed by two EXCLUSIVE-OR circuits having as first inputs the B/L bit located in the status register and as second inputs line A0 or A1, corresponding respectively to the lines of the lowest-order bits of the address bus.
  • Output lines A'0 and A'1 are identical to inputs A0 and A1 when the B/L bit is equal to 0 and are the inverses of inputs A0 and A1 when the B/L bit is equal to 1.
  • the aspect of the invention that has just been described therefore permits using only one incrementer (or decrementer) instead of the incrementer-decrementer indispensable in prior-art systems.
  • This permits a non-negligible economy of circuits, since it represents about 400 transistors at least out of a total of about 8000 transistors comprising a processor of the DSP type, or an economy of about 5% in the silicon surface.
  • since each instruction has only one bit instead of two for defining incrementing in indirect addressing, the bit saved can be used for another purpose, which increases the efficiency of the system still further.
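
As an illustration of the gated status update described with reference to Figure 5, the following Python sketch models the US bit and the write signal W as the two inputs of an AND gate that decides whether N, Z, and C are loaded into the status register. The 16-bit width, the names `Status`, `alu_add`, and `execute`, and the choice of addition as the ALU operation are assumptions made for illustration only; the source does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Status:
    """Status parameters N (negative), Z (zero), C (carry)."""
    n: int = 0
    z: int = 0
    c: int = 0


def alu_add(a, b, width=16):
    """Hypothetical ALU addition returning the result and fresh N, Z, C."""
    mask = (1 << width) - 1
    total = a + b
    result = total & mask
    flags = Status(
        n=(result >> (width - 1)) & 1,  # sign bit (MSB) of the result
        z=int(result == 0),             # result is zero
        c=int(total > mask),            # carry out of the top bit
    )
    return result, flags


def execute(status, a, b, us_bit, write_signal=1):
    """Run one instruction; load N, Z, C only if US AND W is 1."""
    result, new_flags = alu_add(a, b, 16)
    if us_bit & write_signal:  # models the AND circuit on the load path
        status = new_flags
    return result, status
```

With `us_bit=0` the previous flags survive any number of intermediate instructions, which is what lets a later conditional instruction still test them.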
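
The status-dependent decoding described with reference to Figure 7 can be sketched in the same way: one operation-code bit is combined with N in an EXCLUSIVE-OR, so the same instruction directs a different ALU operation depending on the sign of the preceding result. The mapping of the decoded bit to addition (0) and subtraction (1), and all names below, are assumptions; the source only states that the operation-code bit is or is not inverted according to N.

```python
def decode_alu_select(opcode_bit, n_flag):
    """EXCLUSIVE-OR of the operation-code bit with status parameter N."""
    return (opcode_bit ^ n_flag) & 1


def execute_status_dependent(a, b, opcode_bit, n_flag, width=16):
    """Perform add or subtract according to the decoded select bit.

    Assumed mapping: select 0 -> addition, select 1 -> subtraction.
    """
    mask = (1 << width) - 1
    select = decode_alu_select(opcode_bit, n_flag)
    if select == 0:
        return (a + b) & mask
    return (a - b) & mask
```

A comparison that leaves N set therefore turns an "add" into a "subtract" without any branch, which is the source of the instruction savings claimed above.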
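
The byte-lane selection of Figures 8 to 10 can be sketched as follows. The low-order address bits are passed through a ones-complement stage controlled by the B/L bit, and the result selects one byte of the addressed memory location for reading (multiplexer) or writing (demultiplexer plus write signal). The memory model (a list of locations, each a list of bytes with index 0 standing for the leftmost part), the function names, and the restriction to a power-of-two number of bytes per location are assumptions for illustration.

```python
def lane_select(low_bits, bl_bit, n_low_bits):
    """Invert every low-order address bit when the B/L bit is 1."""
    mask = (1 << n_low_bits) - 1
    return (low_bits ^ (mask if bl_bit else 0)) & mask


def _split(address, bytes_per_location):
    """Split an address into (location index, low-order bits, bit count)."""
    n_low_bits = bytes_per_location.bit_length() - 1  # 2 -> 1, 4 -> 2
    return address >> n_low_bits, address & (bytes_per_location - 1), n_low_bits


def read_byte(memory, address, bl_bit, bytes_per_location=2):
    """Multiplexer: pick one byte of the addressed location."""
    index, low_bits, n_low_bits = _split(address, bytes_per_location)
    return memory[index][lane_select(low_bits, bl_bit, n_low_bits)]


def write_byte(memory, address, value, bl_bit, bytes_per_location=2, write_signal=1):
    """Demultiplexer plus AND gates: store one byte if the write signal is high."""
    index, low_bits, n_low_bits = _split(address, bytes_per_location)
    if write_signal:
        memory[index][lane_select(low_bits, bl_bit, n_low_bits)] = value & 0xFF
```

Because only the lane selection changes with the B/L bit, the address register can always be stepped in the same direction, which is why a single incrementer suffices.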

Abstract

The invention relates to a data-processing system of the microprocessor type with two pipeline stages, comprising a device that executes a sequence of instructions repetitively, a program counter (74), a program-start-address register (76) for storing the number of the first instruction of the instruction sequence, and a repetition counter (70) that is initialized when the first instruction of that sequence is executed. The penultimate instruction of the sequence to be repeated contains an end-of-loop code (EOL) which, once the last instruction of the loop has been executed, directs that the contents of the program-start-address register (76) be loaded into the program counter (74) as long as the repetition counter is not at zero. The end-of-loop code (EOL) makes it possible to use considerably fewer circuits and to increase the speed of loop processing.
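
The loop mechanism summarized in the abstract can be sketched as a small Python simulation. It is a minimal model under stated assumptions: the program is a list of `(name, eol_flag)` pairs, `repeat_count` is taken to be the number of additional passes through the loop, and the repetition counter is decremented on each jump back. None of these details, nor the function name `run_loop`, come from the source.

```python
def run_loop(program, start_index, repeat_count):
    """Simulate a hardware loop driven by an end-of-loop (EOL) code.

    program      -- list of (name, eol_flag); eol_flag marks the
                    penultimate instruction of the loop body
    start_index  -- index of the first instruction of the loop
    repeat_count -- assumed number of additional passes
    """
    trace = []
    pc = 0                 # program counter
    start_register = None  # program-start-address register
    counter = 0            # repetition counter
    eol_pending = False    # EOL seen on the previous instruction

    while pc < len(program):
        name, eol = program[pc]
        trace.append(name)

        # First execution of the loop's first instruction: latch the
        # start address and initialize the repetition counter.
        if pc == start_index and start_register is None:
            start_register = pc
            counter = repeat_count

        next_pc = pc + 1
        if eol_pending:
            # This was the last instruction of the loop.
            eol_pending = False
            if counter != 0:
                counter -= 1              # assumed decrement policy
                next_pc = start_register  # reload the program counter
        elif eol:
            eol_pending = True
        pc = next_pc

    return trace
```

Flagging the penultimate instruction gives the two-stage pipeline one instruction of notice, so the jump back needs no separate branch instruction.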
PCT/GB1993/001470 1992-07-13 1993-07-13 Systeme de traitement de donnees a dispositif de traitement de boucles de programme WO1994002894A2 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP93916063A EP0650613A1 (fr) 1992-07-13 1993-07-13 Systeme de traitement de donnees a dispositif de traitement de boucles de programme
KR1019950700141A KR950702719A (ko) 1992-07-13 1993-07-13 프로그램 루프를 처리하는 장치를 갖는 데이타-프로세싱 시스템(data-processing system with a device for handling program loops)
JP6503868A JPH08509080A (ja) 1992-07-13 1993-07-13 プログラムループ処理のためのデバイスを備えたデータ処理システム

Applications Claiming Priority (9)

Application Number Priority Date Filing Date Title
FR9208668 1992-07-13
FR9208669 1992-07-13
FR9208664 1992-07-13
FR9208664A FR2693571B1 (fr) 1992-07-13 1992-07-13 Système de traitement de données dont le programme de commande comporte des instructions dépendant de paramètres d'état.
FR9208667A FR2693586B1 (fr) 1992-07-13 1992-07-13 Dispositif de lecture/écriture de données en mode sélectif dans un système de traitement de données.
FR9208668A FR2693572B1 (fr) 1992-07-13 1992-07-13 Système de traitement de données comportant un dispositif amélioré de traitement des boucles de programme.
FR9208669A FR2693573B1 (fr) 1992-07-13 1992-07-13 Système de traitement de données à registre d'état dont la mise à jour dépend du programme.
FR9208665A FR2693576B1 (fr) 1992-07-13 1992-07-13 Système multiprocesseur à contrôle local.
FR9208667 1992-07-13

Publications (2)

Publication Number Publication Date
WO1994002894A2 (fr) 1994-02-03
WO1994002894A3 (fr) 1994-05-11

Family

ID=27515576

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB1993/001470 WO1994002894A2 (fr) 1992-07-13 1993-07-13 Systeme de traitement de donnees a dispositif de traitement de boucles de programme

Country Status (1)

Country Link
WO (1) WO1994002894A2 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4097920A (en) * 1976-12-13 1978-06-27 Rca Corporation Hardware control for repeating program loops in electronic computers
JPS57114950A (en) * 1981-01-08 1982-07-17 Nippon Telegr & Teleph Corp <Ntt> Loop processing system for program controller
EP0231928A2 (fr) * 1986-02-03 1987-08-12 Nec Corporation Circuit pour la commande par programme
EP0374419A2 (fr) * 1988-12-21 1990-06-27 International Business Machines Corporation Méthode et dispositif pour générer des boucles d'itération par matériel et microcode

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PATENT ABSTRACTS OF JAPAN vol. 006, no. 209 (P-150) 21 October 1982 & JP 57 114950 A 17 July 1982 *
See also references of EP0650613A1 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116028118A (zh) * 2023-01-31 2023-04-28 南京砺算科技有限公司 保障数据一致性的指令执行方法及图形处理器、介质
CN116028118B (zh) * 2023-01-31 2023-07-25 南京砺算科技有限公司 保障数据一致性的指令执行方法及图形处理器、介质
CN117132450A (zh) * 2023-10-24 2023-11-28 芯动微电子科技(武汉)有限公司 一种可实现数据共享的计算模块和图形处理器
CN117132450B (zh) * 2023-10-24 2024-02-20 芯动微电子科技(武汉)有限公司 一种可实现数据共享的计算装置和图形处理器

Also Published As

Publication number Publication date
WO1994002894A3 (fr) 1994-05-11

Similar Documents

Publication Publication Date Title
US5682531A (en) Central processing unit
KR100328162B1 (ko) 정보처리회로와마이크로컴퓨터와전자기기
US4740893A (en) Method for reducing the time for switching between programs
EP1063586B1 (fr) Méthode et dispositif pour le traitement de données, ayant plusieurs groupes des bits d'état
US4274138A (en) Stored program control system with switching between instruction word systems
JP2002366348A (ja) 多重命令セットによるデータ処理
JPS6339931B2 (fr)
US5249280A (en) Microcomputer having a memory bank switching apparatus for accessing a selected memory bank in an external memory
US7546442B1 (en) Fixed length memory to memory arithmetic and architecture for direct memory access using fixed length instructions
KR100320559B1 (ko) 디지탈신호처리프로세서및그것에관한컨디션코드를갱신하지않는조건을갖는데이타조작방법
US4945511A (en) Improved pipelined processor with two stage decoder for exchanging register values for similar operand instructions
US5938759A (en) Processor instruction control mechanism capable of decoding register instructions and immediate instructions with simple configuration
KR100322277B1 (ko) 확장 명령어를 가진 중앙처리장치
US5991872A (en) Processor
US5504923A (en) Parallel processing with improved instruction misalignment detection
WO1994002894A2 (fr) Systeme de traitement de donnees a dispositif de traitement de boucles de programme
US6223275B1 (en) Microprocessor with reduced instruction set limiting the address space to upper 2 Mbytes and executing a long type register branch instruction in three intermediate instructions
US6438680B1 (en) Microprocessor
EP0650613A1 (fr) Systeme de traitement de donnees a dispositif de traitement de boucles de programme
JP3504355B2 (ja) プロセッサ
US6005502A (en) Method for reducing the number of bits needed for the representation of constant values in a data processing device
EP0650614B1 (fr) Architecture de processeur de signal numerique
US4218741A (en) Paging mechanism
US5649229A (en) Pipeline data processor with arithmetic/logic unit capable of performing different kinds of calculations in a pipeline stage
JP3199603B2 (ja) コードサイズ縮小化マイクロプロセッサ

Legal Events

Date Code Title Description
AK Designated states; Kind code of ref document: A2; Designated state(s): JP KR US
AL Designated countries for regional patents; Kind code of ref document: A2; Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE
WPC Withdrawal of priority claims after completion of the technical preparations for international publication; Free format text: FR
121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase; Ref document number: 1993916063; Country of ref document: EP
WWP Wipo information: published in national office; Ref document number: 1993916063; Country of ref document: EP
WWW Wipo information: withdrawn in national office; Ref document number: 1993916063; Country of ref document: EP