US7636836B2 - Fetch and dispatch disassociation apparatus for multistreaming processors - Google Patents

Info

Publication number: US7636836B2
Authority: US
Grant status: Grant
Prior art keywords: instruction, instructions, fetch, dispatch, stage
Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Application number: US12173560
Other versions: US20080270757A1 (en)
Inventors: Mario D. Nemirovsky, Adolfo M. Nemirovsky, Narendra Sankar, Enrique Musoll
Current Assignee: Arm Finance Overseas Ltd (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Original Assignee: MIPS Technologies Inc
Priority date: 2000-11-03 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2008-07-15
Publication date: 2009-12-22

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38: Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836: Instruction issuing, e.g. dynamic instruction scheduling, out of order instruction execution
    • G06F9/3851: Instruction issuing, e.g. dynamic instruction scheduling, out of order instruction execution from multiple instruction streams, e.g. multistreaming

Abstract

A dynamic multistreaming processor has instruction queues, each instruction queue corresponding to an instruction stream, and execution units. The dynamic multistreaming processor also has a dispatch stage to select at least one instruction from one of the instruction queues and to dispatch the selected at least one instruction to one of the execution units. Lastly, the dynamic multistreaming processor has a queue counter, associated with each instruction queue, for indicating the number of instructions in each queue, and a fetch counter, associated with each instruction queue, for indicating an address from which to obtain instructions when the associated instruction queue is not full. The dynamic multistreaming processor might also have fetch counters for indicating a next instruction address from which to obtain at least one instruction when the associated instruction queue is not full. The dynamic multistreaming processor could also have a second counter for indicating a next instruction address.

Description

CROSS REFERENCE TO RELATED APPLICATION

This application is a continuation of U.S. application Ser. No. 11/539,322, filed Oct. 6, 2006, which is a continuation of U.S. application Ser. No. 09/706,154, filed Nov. 3, 2000 (now U.S. Pat. No. 7,139,898), all of which are incorporated by reference herein in their entirety.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention is in the field of digital processing and pertains more particularly to apparatus and methods for fetching and dispatching instructions in dynamic multistreaming processors.

2. Background

Conventional pipelined single-stream processors, like most conventional processors, incorporate fetch and dispatch pipeline stages. In the fetch stage of such a processor, one or more instructions are read from an instruction cache, and in the dispatch stage, one or more instructions are sent to execution units (EUs) for execution. These stages may be separated by one or more other stages, for example a decode stage. In such a processor the fetch and dispatch stages are coupled together such that the fetch stage generally fetches from the instruction stream in every cycle.

In multistreaming processors known to the present inventors, multiple instruction streams are provided, each having access to the execution units. Multiple fetch stages may be provided, one for each instruction stream, although one dispatch stage is employed. Thus, the fetch and dispatch stages are coupled to one another as in other conventional processors, and each instruction stream generally fetches instructions in each cycle. That is, if there are five instruction streams, each of the five fetches in each cycle, and there needs to be a port to the instruction cache for each stream, or a separate cache for each stream.

In a multistreaming processor multiple instruction streams share a common set of resources, for example execution units and/or access to memory resources. In such a processor, for example, there may be M instruction streams that share Q execution units in any given cycle. This means that a set of up to Q instructions is chosen from the M instruction streams to be delivered to the execution units in each cycle. In the following cycle a different set of up to Q instructions is chosen, and so forth. More than one instruction may be chosen from the same instruction stream, up to a maximum P, given that there are no dependencies between the instructions.
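The per-cycle choice described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `choose_for_cycle`, the `ready_counts` input (how many dependency-free instructions each stream offers), and the round-robin order are hypothetical and are not taken from the specification. Only the limits Q (execution units) and P (maximum per stream) come from the text.

```python
def choose_for_cycle(ready_counts, q_units, p_max):
    """Return how many instructions to take from each of the M streams.

    ready_counts[i] is the number of dependency-free instructions that
    stream i can offer this cycle (a hypothetical input).
    """
    chosen = [0] * len(ready_counts)
    remaining = q_units
    progress = True
    # Hand out one slot per stream per pass; round-robin is an assumption,
    # not a policy stated in the source.
    while remaining > 0 and progress:
        progress = False
        for i, ready in enumerate(ready_counts):
            if remaining == 0:
                break
            if chosen[i] < min(ready, p_max):
                chosen[i] += 1
                remaining -= 1
                progress = True
    return chosen
```

With four streams offering 5, 0, 1, and 3 ready instructions, Q = 6, and P = 2, the sketch selects only five instructions, which illustrates how the chosen set can fall short of Q.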

It is desirable in multistreaming processors to maximize the number of instructions executed in each cycle. This means that the set of up to Q instructions that is chosen in each cycle should be as close to Q as possible. Reasons that there may not be Q instructions available include flow dependencies, stalls due to memory operations, stalls due to branches, and instruction fetch latency.

What is clearly needed in the art is an apparatus and method to de-couple dispatch operations from fetch operations. The present invention, in several embodiments described in enabling detail below, provides a unique solution.

SUMMARY OF THE INVENTION

In a preferred embodiment of the present invention a pipelined multistreaming processor is provided, comprising an instruction source, a plurality of streams fetching instructions from the instruction source, a dispatch stage for selecting and dispatching instructions to a set of execution units, a set of instruction queues having one queue associated with each stream in the plurality of streams, and located in the pipeline between the instruction source and the dispatch stage, and a select system for selecting streams in each cycle to fetch instructions from the instruction source. The processor is characterized in that the number of streams selected for which to fetch instructions in each cycle is fewer than the number of streams in the plurality of streams.

In some embodiments the number of streams in the plurality of streams is eight, and the number of streams selected for which to fetch instructions in each cycle is two. Also in some embodiments the select system monitors a set of fetch program counters (FPC) having one FPC associated with each stream, and directs fetching of instructions beginning at addresses according to the program counters. In still other embodiments each stream selected to fetch is directed to fetch eight instructions from the instruction cache.

In some embodiments there is a set of execution units to which the dispatch stage dispatches instructions. In some embodiments the set of execution units comprises eight Arithmetic-Logic Units (ALUs), and two memory units.

In another aspect of the invention, in a pipelined multistreaming processor having an instruction source, a method for decoupling fetching from a dispatch stage is provided, comprising the steps of (a) placing a set of instruction queues, one for each stream, in the pipeline between the instruction source and the dispatch stage; and (b) selecting one or more streams, fewer than the number of streams in the multistreaming processor, for which to fetch instructions in each cycle from the instruction source.

In some embodiments of the method the number of streams in the plurality of streams is eight, and the number of streams selected for which to fetch instructions in each cycle is two. In some embodiments the select system monitors a set of fetch program counters (FPC) having one FPC associated with each stream, and directs fetching of instructions beginning at addresses according to the program counters. In other embodiments each stream selected to fetch is directed to fetch eight instructions from the instruction source. In preferred embodiments, also, the dispatch stage dispatches instructions to a set of execution units, which may comprise eight Arithmetic-Logic Units (ALUs), and two memory units.

In embodiments of the present invention, described in enabling detail below, for the first time apparatus and methods are provided for decoupling fetch and dispatch in processors, and particularly in multistreaming processors.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram depicting a pipelined structure for a processor in the prior art.

FIG. 2 is a block diagram depicting a pipelined structure for a multistreaming processor known to the present inventors.

FIG. 3 is a block diagram of a pipelined architecture for a multistreaming processor according to an embodiment of the present invention.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION

FIG. 1 is a block diagram depicting a pipelined structure for a processor in the prior art. In this prior art structure there is an instruction cache 11, wherein instructions await selection for execution, a fetch stage 13 which selects and fetches instructions into the pipeline, and a dispatch stage 15 which dispatches instructions to execution units (EUs) 17. In many conventional pipelined structures there are additional stages other than the exemplary stages illustrated here.

In the simple architecture illustrated in FIG. 1 everything works in lockstep. In each cycle an instruction is fetched and another previously fetched instruction is dispatched to one of the execution units.

FIG. 2 is a block diagram depicting a pipelined structure for a multistreaming processor known to the present inventors, wherein a single instruction cache 19 has ports for three separate streams, and a fetch is made per cycle by each of three fetch stages 21, 23, and 25 (one for each stream). In this particular case a single dispatch stage 27 selects instructions from a pool fed by the three streams and dispatches those instructions to one or another of three execution units 29. In this architecture the fetch and dispatch units are still directly coupled. It should be noted that the architecture of FIG. 2, while prior to the present invention, is not necessarily in the public domain, as it is an as-yet proprietary architecture known to the present inventors. In another example, there may be separate caches for separate streams, but this does not provide the desired de-coupling.

FIG. 3 is a block diagram depicting an architecture for a dynamic multistreaming (DMS) processor according to an embodiment of the present invention. In this DMS processor there are eight streams and ten functional units. Instruction cache 31 in this embodiment has two ports for providing instructions to fetch stage 33. Eight instructions may be fetched each cycle for each port, so 16 instructions may be fetched per cycle.

In a preferred embodiment of the present invention instruction queues 39 are provided, which effectively decouple fetch and dispatch stages in the pipeline. There are in this embodiment eight instruction queues, one for each stream. In the example of FIG. 3 the instruction queues are shown in a manner to illustrate that each queue may have a different number of instructions ready for transfer to a dispatch stage 41.

Referring again to instruction cache 31 and the two ports to fetch stage 33, it was described above that eight instructions may be fetched to stage 33 via each port. Typically the eight instructions for one port are eight instructions from a single thread for a single stream. For example, the eight instructions fetched by one port in a particular cycle will typically be sequential instructions for a thread associated with one stream.

Determination of the two threads associated with two streams to be accessed in each cycle is made by select logic 35. Select logic 35 monitors a set of fetch program counters 37, which maintain a program counter for each stream, indicating at what address to find the next instruction for that stream. Select logic 35 also monitors the state of each queue in set 39 of instruction queues. Based at least in part on the state of instruction queues 39, select logic 35 determines the two threads from which to fetch instructions in a particular cycle. For example, if the instruction queue in set 39 for a stream is full, the probability that eight additional instructions from the thread associated with that stream can be put to use in the pipeline is low. Conversely, if the instruction queue in set 39 for a stream is empty, the probability that eight additional instructions from the thread associated with that stream can be put to use is high.
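One possible reading of this selection step is sketched below in Python. It is a sketch under assumptions, not the logic of the embodiment: the emptiest-queue-first policy, the tie-break by stream index, the `queue_capacity` parameter, the partial fetch when a queue is nearly full, and the function names are all hypothetical. The two ports and the eight instructions per port are taken from the description.

```python
FETCH_WIDTH = 8  # instructions fetched per port per cycle (from the text)
NUM_PORTS = 2    # instruction-cache ports, so two streams per cycle (from the text)


def select_streams(queue_counts, queue_capacity, num_ports=NUM_PORTS):
    """Return the streams to fetch for this cycle, emptiest queue first."""
    # A full queue cannot accept instructions, so its stream is not a candidate.
    candidates = [s for s, count in enumerate(queue_counts) if count < queue_capacity]
    # Ties are broken by stream index; that tie-break is an assumption.
    candidates.sort(key=lambda s: (queue_counts[s], s))
    return candidates[:num_ports]


def fetch_cycle(queue_counts, fetch_pcs, queue_capacity):
    """Fetch for the selected streams, updating queue counts and fetch PCs in place."""
    selected = select_streams(queue_counts, queue_capacity)
    for s in selected:
        # Fetching fewer than eight when space is short is an assumption.
        fetched = min(FETCH_WIDTH, queue_capacity - queue_counts[s])
        queue_counts[s] += fetched
        # Assumes sequential code and one address unit per instruction.
        fetch_pcs[s] += fetched
    return selected
```

In this sketch the per-stream counts play the role of the queue state that the select logic monitors, and `fetch_pcs` stands in for the fetch program counters; branches and cache misses are ignored.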

In this embodiment, in each cycle, four instructions are made available to dispatch stage 41 from each instruction queue. In practice, dispatch logic is provided for selecting from which queues to dispatch instructions. The dispatch logic has knowledge of many parameters, typically including priorities, instruction dependencies, and the like, and is also aware of the number of instructions in each queue.

As described above, there are in this preferred embodiment ten execution units, which include two memory units 43 and eight arithmetic logic units (ALUs) 45. Thus, in each cycle up to ten instructions may be dispatched to execution units.
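The dispatch step can be illustrated with a small Python sketch. The unit counts (two memory units, eight ALUs) and the four-instruction window per queue come from the description; everything else is an assumption made for illustration: instructions are represented only by the hypothetical tags "mem" and "alu", the streams are visited in a caller-supplied priority order, program order is kept within a stream, and instruction dependencies are ignored.

```python
from collections import deque

WINDOW = 4                    # instructions each queue presents per cycle (from the text)
UNITS = {"mem": 2, "alu": 8}  # two memory units and eight ALUs (from the text)


def dispatch_cycle(queues, priority_order):
    """Dispatch up to ten instructions; return a list of (stream, kind) pairs."""
    free = dict(UNITS)
    dispatched = []
    for s in priority_order:
        queue = queues[s]
        for _ in range(min(WINDOW, len(queue))):
            kind = queue[0]
            if free[kind] == 0:
                # Stop at the first instruction that cannot get a unit so that
                # program order within the stream is kept (an assumption).
                break
            free[kind] -= 1
            dispatched.append((s, queue.popleft()))
    return dispatched
```

A real dispatch stage would also weigh thread priorities and dependencies, as the paragraph above notes; the sketch only shows how the unit mix and the per-queue window bound what can be sent in one cycle.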

In the system depicted by FIG. 3, the unique and novel set of instruction queues 39 provides decoupling of dispatch from fetch in the pipeline. The dispatch stage now has a larger pool of instructions from which to select to dispatch to execution units, and the efficiency of dispatch is improved. That is, the number of instructions that may be dispatched per cycle is maximized. This structure and operation allows a large number of streams of a DMS processor to execute instructions continually while permitting the fetch mechanism to fetch from a smaller number of streams in each cycle. Fetching from a smaller number of streams, in this case two, in each cycle is important, because the hardware and logic necessary to provide additional ports into the instruction cache is significant. As an added benefit, unified access to a single cache is provided.

Thus the instruction queue in the preferred embodiment allows fetched instructions to be buffered after fetch and before dispatch. The instruction queue read mechanism allows the head of the queue to be presented to dispatch in each cycle, allowing a variable number of instructions to be dispatched from each stream in each cycle. With the instruction queue, one can take advantage of instruction stream locality, while maximizing the efficiency of the fetch mechanism in the presence of stalls and branches. By providing a fetch mechanism that can support up to eight instructions from two streams, one can keep the instruction queues full while not having to replicate the fetch bandwidth across all streams.
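A toy cycle-by-cycle model can make the effect of the queues concrete. The following Python sketch is illustrative only. The eight streams, two ports, eight instructions per port, ten execution units, and four-instruction window are from the embodiment; the queue capacity of 16, the rotating dispatch priority, the emptiest-first fetch choice, and the absence of stalls, branches, and unit types are assumptions.

```python
def simulate(cycles, streams=8, ports=2, fetch_width=8,
             capacity=16, units=10, window=4):
    """Return the number of instructions dispatched in each cycle."""
    counts = [0] * streams  # instructions waiting in each queue
    per_cycle = []
    for cycle in range(cycles):
        # Dispatch: rotate the starting stream each cycle (assumed policy).
        free = units
        for k in range(streams):
            s = (cycle + k) % streams
            take = min(window, counts[s], free)
            counts[s] -= take
            free -= take
        per_cycle.append(units - free)
        # Fetch: refill only the `ports` emptiest queues (assumed policy).
        order = sorted(range(streams), key=lambda s: (counts[s], s))
        for s in order[:ports]:
            counts[s] += min(fetch_width, capacity - counts[s])
    return per_cycle
```

Under these assumptions the model dispatches nothing in the first cycle, eight instructions in the second, and then ten per cycle, even though only two of the eight streams are fetched in any one cycle.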

The skilled artisan will recognize that there are a number of alterations that might be made in embodiments of the invention described above without departing from the spirit and scope of the invention. For example, the number of instruction queues may vary, the number of ports into the instruction cache may vary, the fetch logic may be implemented in a variety of ways, and the dispatch logic may be implemented in a variety of ways, among other changes that may be made within the spirit and scope of the invention. For these and other reasons the invention should be afforded the broadest scope, and should be limited only by the claims that follow.

Claims (16)

1. A dynamic multistreaming processor, comprising:
a plurality of instruction queues, each instruction queue corresponding to an instruction stream;
a fetch stage configured to fetch at least one instruction from an instruction source and store the fetched instructions in a selected one of the plurality of instruction queues;
a plurality of execution units;
a dispatch stage configured to select at least one instruction from one of the plurality of instruction queues and to dispatch the selected at least one instruction to one of the plurality of execution units;
a queue counter, associated with each instruction queue, configured to indicate a number of instructions in each instruction queue and to indicate the capacity of each instruction queue to accept additional instructions from the fetch stage; and
a fetch program counter, associated with each instruction queue, configured to indicate an address from which to obtain instructions when the associated instruction queue indicates capacity to accept additional instructions.
2. The dynamic multistreaming processor of claim 1, wherein each of said instruction queues is associated with a thread.
3. The dynamic multistreaming processor of claim 2, wherein the dispatch stage comprises logic for determining thread priorities and instruction dependencies.
4. The dynamic multistreaming processor of claim 1, wherein the fetch stage is configured to fetch a sequential plurality of instructions from an instruction source and store the fetched instructions in at least one of the plurality of instruction queues.
5. The dynamic multistreaming processor of claim 1, further comprising:
a fetch stage configured to transfer instructions from an instruction source to a selected one of the plurality of instruction queues wherein the number of fetched instructions is dependent upon the number of instructions in the selected one of the plurality of instruction queues.
6. The dynamic multistreaming processor of claim 1, further comprising:
an instruction cache; and
a fetch stage configured to fetch at least one instruction from the instruction cache to a selected one of the plurality of instruction queues.
7. A dynamic multistreaming processor, comprising:
a plurality of instruction queues, each instruction queue corresponding to an instruction stream;
a plurality of execution units;
a dispatch stage configured to select at least one instruction from one of the plurality of instruction queues and configured to dispatch the selected at least one instruction to a corresponding one of the plurality of execution units;
a plurality of fetch program counters, one associated with each of the plurality of instruction queues, configured to indicate a next instruction address from which to obtain at least one instruction; and
a fetch stage configured to fetch the at least one instruction to a selected one of the instruction queues based at least in part on the plurality of fetch program counters.
8. The dynamic multistreaming processor of claim 7, wherein each of said plurality of instruction queues is associated with a thread.
9. The dynamic multistreaming processor of claim 7, wherein the dispatch stage comprises logic for determining thread priorities and instruction dependencies.
10. The dynamic multistreaming processor of claim 7, wherein the fetch stage is configured to fetch and store a number of instructions in a selected instruction queue at a rate that is independent from the rate that the dispatch stage is configured to dispatch instructions from the instruction queues.
11. The dynamic multistreaming processor of claim 7, further comprising:
an instruction source coupled to the fetch stage.
12. A dynamic multistreaming processor, comprising:
a plurality of instruction queues, each instruction queue corresponding to an instruction stream;
a fetch stage configured to fetch at least one instruction from an instruction source and store the fetched instructions in a selected one of the instruction queues;
a counter, associated with each instruction queue, configured to indicate a number of instructions in each instruction queue and to indicate the capacity of each instruction queue to accept additional instructions;
a second counter, associated with each instruction queue, configured to indicate a next instruction address in the instruction source from which to obtain at least one instruction when the associated instruction queue indicates capacity to accept additional instructions from the fetch stage;
a plurality of execution units; and
a dispatch stage configured to select at least one instruction from one of the instruction queues and to dispatch the selected at least one instruction to one of the execution units wherein the number of instructions dispatched by the dispatch stage to the execution units is different than the number of fetched instructions.
13. The dynamic multistreaming processor of claim 12, further comprising:
logic configured to determine how many instructions from a selected instruction queue should be dispatched to the execution units.
14. The dynamic multistreaming processor of claim 12, wherein the logic further comprises logic configured to determine dependencies between instructions.
15. The dynamic multistreaming processor of claim 12, wherein the fetch stage is configured to fetch a sequential plurality of instructions from the instruction source and store the fetched instructions in at least one of the instruction queues.
16. The dynamic multistreaming processor of claim 15, wherein the instruction source comprises a single instruction cache.
US12173560 2000-11-03 2008-07-15 Fetch and dispatch disassociation apparatus for multistreaming processors Active US7636836B2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US09706154 US7139898B1 (en) 2000-11-03 2000-11-03 Fetch and dispatch disassociation apparatus for multistreaming processors
US11539322 US7406586B2 (en) 2000-11-03 2006-10-06 Fetch and dispatch disassociation apparatus for multi-streaming processors
US12173560 US7636836B2 (en) 2000-11-03 2008-07-15 Fetch and dispatch disassociation apparatus for multistreaming processors

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US12173560 US7636836B2 (en) 2000-11-03 2008-07-15 Fetch and dispatch disassociation apparatus for multistreaming processors

Publications (2)

Publication Number Publication Date
US20080270757A1 (en) 2008-10-30
US7636836B2 (en) 2009-12-22

Family

ID=24836426

Family Applications (3)

Application Number Title Priority Date Filing Date
US09706154 Active 2022-09-21 US7139898B1 (en) 2000-11-03 2000-11-03 Fetch and dispatch disassociation apparatus for multistreaming processors
US11539322 Active US7406586B2 (en) 2000-11-03 2006-10-06 Fetch and dispatch disassociation apparatus for multi-streaming processors
US12173560 Active US7636836B2 (en) 2000-11-03 2008-07-15 Fetch and dispatch disassociation apparatus for multistreaming processors

Country Status (2)

Country Link
US (3) US7139898B1 (en)
WO (1) WO2002037269A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7139898B1 (en) 2000-11-03 2006-11-21 Mips Technologies, Inc. Fetch and dispatch disassociation apparatus for multistreaming processors
US7954102B2 (en) * 2002-11-13 2011-05-31 Fujitsu Limited Scheduling method in multithreading processor, and multithreading processor
US7191320B2 (en) * 2003-02-11 2007-03-13 Via Technologies, Inc. Apparatus and method for performing a detached load operation in a pipeline microprocessor
US7584216B2 (en) 2003-02-21 2009-09-01 Motionpoint Corporation Dynamic language translation of web site content
US7657891B2 (en) 2005-02-04 2010-02-02 Mips Technologies, Inc. Multithreading microprocessor with optimized thread scheduler for increasing pipeline utilization efficiency
US7490230B2 (en) 2005-02-04 2009-02-10 Mips Technologies, Inc. Fetch director employing barrel-incrementer-based round-robin apparatus for use in multithreading microprocessor
US7664936B2 (en) 2005-02-04 2010-02-16 Mips Technologies, Inc. Prioritizing thread selection partly based on stall likelihood providing status information of instruction operand register usage at pipeline stages
US7613904B2 (en) 2005-02-04 2009-11-03 Mips Technologies, Inc. Interfacing external thread prioritizing policy enforcing logic with customer modifiable register to processor internal scheduler
US20060229638A1 (en) * 2005-03-29 2006-10-12 Abrams Robert M Articulating retrieval device
US8726292B2 (en) * 2005-08-25 2014-05-13 Broadcom Corporation System and method for communication in a multithread processor
EP2680161A1 (en) 2010-07-13 2014-01-01 Motionpoint Corporation Uniform Resource Locator (URL) improvement methods
US9354884B2 (en) * 2013-03-13 2016-05-31 International Business Machines Corporation Processor with hybrid pipeline capable of operating in out-of-order and in-order modes

Citations (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3771138A (en) 1971-08-31 1973-11-06 Ibm Apparatus and method for serializing instructions from two independent instruction streams
US4916652A (en) 1987-09-30 1990-04-10 International Business Machines Corporation Dynamic multiple instruction stream multiple data multiple pipeline apparatus for floating-point single instruction stream single data architectures
US4924376A (en) * 1985-12-26 1990-05-08 Nec Corporation System for dynamically adjusting the accumulation of instructions in an instruction code prefetched pipelined computer
US5313600A (en) * 1988-12-02 1994-05-17 Mitsubishi Denki Kabushiki Kaisha System for controlling the number of data pieces in a queue memory
US5404469A (en) 1992-02-25 1995-04-04 Industrial Technology Research Institute Multi-threaded microprocessor architecture utilizing static interleaving
US5430851A (en) 1991-06-06 1995-07-04 Matsushita Electric Industrial Co., Ltd. Apparatus for simultaneously scheduling instruction from plural instruction streams into plural instruction execution units
US5574939A (en) 1993-05-14 1996-11-12 Massachusetts Institute Of Technology Multiprocessor coupling system with integrated compile and run time scheduling for parallelism
US5604909A (en) 1993-12-15 1997-02-18 Silicon Graphics Computer Systems, Inc. Apparatus for processing instructions in a computing system
US5699537A (en) 1995-12-22 1997-12-16 Intel Corporation Processor microarchitecture for efficient dynamic scheduling and execution of chains of dependent instructions
US5724565A (en) 1995-02-03 1998-03-03 International Business Machines Corporation Method and system for processing first and second sets of instructions by first and second types of processing systems
US5742782A (en) 1994-04-15 1998-04-21 Hitachi, Ltd. Processing apparatus for executing a plurality of VLIW threads in parallel
US5745725A (en) 1995-07-14 1998-04-28 Sgs-Thomson Microelectronics Limited Parallel instruction execution with operand availability check during execution
US5745778A (en) 1994-01-26 1998-04-28 Data General Corporation Apparatus and method for improved CPU affinity in a multiprocessor system
US5812811A (en) 1995-02-03 1998-09-22 International Business Machines Corporation Executing speculative parallel instructions threads with forking and inter-thread communication
US5848268A (en) * 1992-02-07 1998-12-08 Mitsubishi Denki Kabushiki Kaisha Data processor with branch target address generating unit
US5900025A (en) 1995-09-12 1999-05-04 Zsp Corporation Processor having a hierarchical control register file and methods for operating the same
US5907702A (en) * 1997-03-28 1999-05-25 International Business Machines Corporation Method and apparatus for decreasing thread switch latency in a multithread processor
US5913049A (en) 1997-07-31 1999-06-15 Texas Instruments Incorporated Multi-stream complex instruction set microprocessor
US5933627A (en) 1996-07-01 1999-08-03 Sun Microsystems Thread switch on blocked load or store using instruction thread field
US6092175A (en) 1998-04-02 2000-07-18 University Of Washington Shared register storage mechanisms for multithreaded computer systems with out-of-order execution
US6105127A (en) 1996-08-27 2000-08-15 Matsushita Electric Industrial Co., Ltd. Multithreaded processor for processing multiple instruction streams independently of each other by flexibly controlling throughput in each instruction stream
US6105053A (en) 1995-06-23 2000-08-15 Emc Corporation Operating system for a non-uniform memory access multiprocessor system
US6141746A (en) 1997-10-20 2000-10-31 Fujitsu Limited Information processor
US6219780B1 (en) 1998-10-27 2001-04-17 International Business Machines Corporation Circuit arrangement and method of dispatching instructions to multiple execution units
US6343348B1 (en) 1998-12-03 2002-01-29 Sun Microsystems, Inc. Apparatus and method for optimizing die utilization and speed performance by register file splitting
US6378063B2 (en) 1998-12-23 2002-04-23 Intel Corporation Method and apparatus for efficiently routing dependent instructions to clustered execution units
US6460130B1 (en) * 1999-02-19 2002-10-01 Advanced Micro Devices, Inc. Detecting full conditions in a queue
US6470443B1 (en) 1996-12-31 2002-10-22 Compaq Computer Corporation Pipelined multi-thread processor selecting thread instruction in inter-stage buffer based on count information
US6530042B1 (en) * 1999-11-08 2003-03-04 International Business Machines Corporation Method and apparatus for monitoring the performance of internal queues in a microprocessor
US6542991B1 (en) 1999-05-11 2003-04-01 Sun Microsystems, Inc. Multiple-thread processor with single-thread interface shared among threads
US6542987B1 (en) * 1999-02-01 2003-04-01 Hewlett-Packard Development Company L.P. Method and circuits for early detection of a full queue
US6622240B1 (en) 1999-06-18 2003-09-16 Intrinsity, Inc. Method and apparatus for pre-branch instruction
US6968444B1 (en) 2002-11-04 2005-11-22 Advanced Micro Devices, Inc. Microprocessor employing a fixed position dispatch unit
US7035998B1 (en) 2000-11-03 2006-04-25 Mips Technologies, Inc. Clustering stream and/or instruction queues for multi-streaming processors
US7046677B2 (en) 2002-11-27 2006-05-16 Rgb Networks, Inc. Method and apparatus for time-multiplexed processing of multiple digital video programs
US7139898B1 (en) 2000-11-03 2006-11-21 Mips Technologies, Inc. Fetch and dispatch disassociation apparatus for multistreaming processors

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5804909A (en) * 1997-04-04 1998-09-08 Motorola Inc. Edge emission field emission device

Patent Citations (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3771138A (en) 1971-08-31 1973-11-06 Ibm Apparatus and method for serializing instructions from two independent instruction streams
US4924376A (en) * 1985-12-26 1990-05-08 Nec Corporation System for dynamically adjusting the accumulation of instructions in an instruction code prefetched pipelined computer
US4916652A (en) 1987-09-30 1990-04-10 International Business Machines Corporation Dynamic multiple instruction stream multiple data multiple pipeline apparatus for floating-point single instruction stream single data architectures
US5313600A (en) * 1988-12-02 1994-05-17 Mitsubishi Denki Kabushiki Kaisha System for controlling the number of data pieces in a queue memory
US5430851A (en) 1991-06-06 1995-07-04 Matsushita Electric Industrial Co., Ltd. Apparatus for simultaneously scheduling instruction from plural instruction streams into plural instruction execution units
US5848268A (en) * 1992-02-07 1998-12-08 Mitsubishi Denki Kabushiki Kaisha Data processor with branch target address generating unit
US5404469A (en) 1992-02-25 1995-04-04 Industrial Technology Research Institute Multi-threaded microprocessor architecture utilizing static interleaving
US5574939A (en) 1993-05-14 1996-11-12 Massachusetts Institute Of Technology Multiprocessor coupling system with integrated compile and run time scheduling for parallelism
US6691221B2 (en) * 1993-12-15 2004-02-10 Mips Technologies, Inc. Loading previously dispatched slots in multiple instruction dispatch buffer before dispatching remaining slots for parallel execution
US5604909A (en) 1993-12-15 1997-02-18 Silicon Graphics Computer Systems, Inc. Apparatus for processing instructions in a computing system
US5745778A (en) 1994-01-26 1998-04-28 Data General Corporation Apparatus and method for improved CPU affinity in a multiprocessor system
US5742782A (en) 1994-04-15 1998-04-21 Hitachi, Ltd. Processing apparatus for executing a plurality of VLIW threads in parallel
US5812811A (en) 1995-02-03 1998-09-22 International Business Machines Corporation Executing speculative parallel instructions threads with forking and inter-thread communication
US5724565A (en) 1995-02-03 1998-03-03 International Business Machines Corporation Method and system for processing first and second sets of instructions by first and second types of processing systems
US6105053A (en) 1995-06-23 2000-08-15 Emc Corporation Operating system for a non-uniform memory access multiprocessor system
US5745725A (en) 1995-07-14 1998-04-28 Sgs-Thomson Microelectronics Limited Parallel instruction execution with operand availability check during execution
US5900025A (en) 1995-09-12 1999-05-04 Zsp Corporation Processor having a hierarchical control register file and methods for operating the same
US5699537A (en) 1995-12-22 1997-12-16 Intel Corporation Processor microarchitecture for efficient dynamic scheduling and execution of chains of dependent instructions
US5933627A (en) 1996-07-01 1999-08-03 Sun Microsystems Thread switch on blocked load or store using instruction thread field
US6105127A (en) 1996-08-27 2000-08-15 Matsushita Electric Industrial Co., Ltd. Multithreaded processor for processing multiple instruction streams independently of each other by flexibly controlling throughput in each instruction stream
US6470443B1 (en) 1996-12-31 2002-10-22 Compaq Computer Corporation Pipelined multi-thread processor selecting thread instruction in inter-stage buffer based on count information
US5907702A (en) * 1997-03-28 1999-05-25 International Business Machines Corporation Method and apparatus for decreasing thread switch latency in a multithread processor
US5913049A (en) 1997-07-31 1999-06-15 Texas Instruments Incorporated Multi-stream complex instruction set microprocessor
US6141746A (en) 1997-10-20 2000-10-31 Fujitsu Limited Information processor
US6092175A (en) 1998-04-02 2000-07-18 University Of Washington Shared register storage mechanisms for multithreaded computer systems with out-of-order execution
US6219780B1 (en) 1998-10-27 2001-04-17 International Business Machines Corporation Circuit arrangement and method of dispatching instructions to multiple execution units
US6343348B1 (en) 1998-12-03 2002-01-29 Sun Microsystems, Inc. Apparatus and method for optimizing die utilization and speed performance by register file splitting
US6378063B2 (en) 1998-12-23 2002-04-23 Intel Corporation Method and apparatus for efficiently routing dependent instructions to clustered execution units
US6542987B1 (en) * 1999-02-01 2003-04-01 Hewlett-Packard Development Company L.P. Method and circuits for early detection of a full queue
US6460130B1 (en) * 1999-02-19 2002-10-01 Advanced Micro Devices, Inc. Detecting full conditions in a queue
US6542991B1 (en) 1999-05-11 2003-04-01 Sun Microsystems, Inc. Multiple-thread processor with single-thread interface shared among threads
US6622240B1 (en) 1999-06-18 2003-09-16 Intrinsity, Inc. Method and apparatus for pre-branch instruction
US6530042B1 (en) * 1999-11-08 2003-03-04 International Business Machines Corporation Method and apparatus for monitoring the performance of internal queues in a microprocessor
US7035998B1 (en) 2000-11-03 2006-04-25 Mips Technologies, Inc. Clustering stream and/or instruction queues for multi-streaming processors
US7139898B1 (en) 2000-11-03 2006-11-21 Mips Technologies, Inc. Fetch and dispatch disassociation apparatus for multistreaming processors
US7406586B2 (en) 2000-11-03 2008-07-29 Mips Technologies, Inc. Fetch and dispatch disassociation apparatus for multi-streaming processors
US6968444B1 (en) 2002-11-04 2005-11-22 Advanced Micro Devices, Inc. Microprocessor employing a fixed position dispatch unit
US7046677B2 (en) 2002-11-27 2006-05-16 Rgb Networks, Inc. Method and apparatus for time-multiplexed processing of multiple digital video programs

Non-Patent Citations (19)

* Cited by examiner, † Cited by third party
Title
ARM Architecture Reference Manual. Prentice Hall. pp. 3-41, 3-42, 3-43, 3-67, 3-68 (1996).
Becker et al. The PowerPC 601 Microprocessor, Oct. 1993. pp. 54-68. IEEE Micro.
Diefendorff, Keith et al. "AltiVec Extension to PowerPC Accelerates Media Processing." IEEE Micro. vol. 20, No. 2, pp. 85-95 (Mar.-Apr. 2000).
Diefendorff, Keith et al. "Organization of the Motorola 88110 Superscalar RISC Microprocessor." IEEE Micro. vol. 12, No. 2, pp. 40-63 (1992).
Diefendorff, Keith. "Compaq Chooses SMT for Alpha." Microprocessor Report, http://www.mdronline.com/mpr/h/19991206/1131601.html Dec. 6, 1999.
Diefendorff, Keith. "Jalapeno Powers Cyrix's M3." Microprocessor Report. http://www.mdronline.com/mpr/h/19981116/121507.html, Nov. 16, 1998.
Diefendorff, Keith. "K7 Challenges Intel." Microprocessor Report. vol. 12, No. 14, 7 pages (Oct. 26, 1998).
Diefendorff, Keith. "Power4 Focuses on Memory Bandwidth." Microprocessor Report. vol. 13, No. 13, 13 pages (Oct. 6, 1999).
Diefendorff, Keith. "WinChip4 Thumbs Nose at ILP." Microprocessor Report, http://www.mdronline.com/mpr/h/19981207/121605.html, Dec. 7, 1998.
Eggers et al. "Simultaneous Multithreading: A Platform for Next-Generation Processors." Sep. 1998, pp. 12-19, IEEE Micro.
ESA/390 Principles of Operation. IBM Library Server, Table of Contents and Para.7.5.31 and 7.5.70 (1993). (available at http://publibz.boulder.ibm.com/cgi-bin/bookmgr-OS390/BOOK/DZ9AR001/CCONTENTS).
Gwennap, Linley. "Digital 21264 Sets New Standard." Microprocessor Report. vol. 20, No. 14. 11 Pages (Oct. 28, 1999).
Hirata, H. et al., An Elementary Processor Architecture with Simultaneous Instruction Issuing from Multiple Threads, 1992, ACM pp. 136-145. *
Kane, Gerry. PA-RISC 2.0 Architecture. Prentice Hall, New Jersey. pp. 7-106 and 7-107 (1996).
M.J. Potel, "Real-Time Playback in Animation Systems." Proceedings of the 4th Annual Conference on Computer Graphics and Interactive Techniques, San Jose, CA. pp. 72-77 (1977).
MC68020 32-Bit Microprocessor User's Manual. Third Edition. Prentice Hall, New Jersey. pp. 3-125, 3-126, and 3-127 (1989).
MC88110 Second Generation RISC Microprocessor User's Manual. Motorola, Inc., pp. 10-66, 10-67 and 10-71 (1991).
Michael Slater. "Rise Joins x86 Fray with mP6." Microprocessor Report. http:/www.mdronline.com/mpr/h/19981116/121501/html. Nov. 16, 1998.
The PowerPC Architecture: A Specification for a New Family of RISC Processors. Second Edition, Morgan Kaufmann. San Francisco. pp. 70-72. (May 1994).

Also Published As

Publication number Publication date Type
US20070260852A1 (en) 2007-11-08 application
US7139898B1 (en) 2006-11-21 grant
WO2002037269A1 (en) 2002-05-10 application
US20080270757A1 (en) 2008-10-30 application
US7406586B2 (en) 2008-07-29 grant

Similar Documents

Publication Publication Date Title
US6101595A (en) Fetching instructions from an instruction cache using sequential way prediction
US6366999B1 (en) Methods and apparatus to support conditional execution in a VLIW-based array processor with subword execution
US6725357B1 (en) Making available instructions in double slot FIFO queue coupled to execution units to third execution unit at substantially the same time
US5850543A (en) Microprocessor with speculative instruction pipelining storing a speculative register value within branch target buffer for use in speculatively executing instructions after a return
US5404552A (en) Pipeline RISC processing unit with improved efficiency when handling data dependency
US6216215B1 (en) Method and apparatus for senior loads
US5131086A (en) Method and system for executing pipelined three operand construct
US5430851A (en) Apparatus for simultaneously scheduling instruction from plural instruction streams into plural instruction execution units
US6671795B1 (en) Method and apparatus for pausing execution in a processor or the like
US6249862B1 (en) Dependency table for reducing dependency checking hardware
US5051885A (en) Data processing system for concurrent dispatch of instructions to multiple functional units
US5075840A (en) Tightly coupled multiprocessor instruction synchronization
US6839828B2 (en) SIMD datapath coupled to scalar/vector/address/conditional data register file with selective subpath scalar processing mode
US6170038B1 (en) Trace based instruction caching
US6446190B1 (en) Register file indexing methods and apparatus for providing indirect control of register addressing in a VLIW processor
US4992933A (en) SIMD array processor with global instruction control and reprogrammable instruction decoders
US20040216106A1 (en) Apparatus and method for adjusting instruction thread priority in a multi-thread processor
US6269439B1 (en) Signal processor having pipeline processing that supresses the deterioration of processing efficiency and method of the same
US5923862A (en) Processor that decodes a multi-cycle instruction into single-cycle micro-instructions and schedules execution of the micro-instructions
US6085311A (en) Instruction alignment unit employing dual instruction queues for high frequency instruction dispatch
US5974523A (en) Mechanism for efficiently overlapping multiple operand types in a microprocessor
US6272616B1 (en) Method and apparatus for executing multiple instruction streams in a digital processor with multiple data paths
US5794003A (en) Instruction cache associative crossbar switch system
US5881307A (en) Deferred store data read with simple anti-dependency pipeline inter-lock control in superscalar processor
US20040215944A1 (en) Method using a dispatch flush in a simultaneous multithread processor to resolve exception conditions

Legal Events

Date Code Title Description
AS Assignment

Owner name: X-STREAM LOGIC, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:NEMIROVSKY, MARIO;NEMIROVSKY, ADOLFO;SANKAR, NARENDRA;AND OTHERS;REEL/FRAME:023460/0973

Effective date: 20001031

Owner name: MIPS TECHNOLOGIES, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CLEARWATER NETWORKS, INC.;REEL/FRAME:023461/0415

Effective date: 20021217

Owner name: CLEARWATER NETWORKS, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:X-STREAM LOGIC, INC.;REEL/FRAME:023461/0274

Effective date: 20010718

AS Assignment

Owner name: BRIDGE CROSSING, LLC, NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MIPS TECHNOLOGIES, INC.;REEL/FRAME:030202/0440

Effective date: 20130206

FPAY Fee payment

Year of fee payment: 4

AS Assignment

Owner name: ARM FINANCE OVERSEAS LIMITED, GREAT BRITAIN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BRIDGE CROSSING, LLC;REEL/FRAME:033074/0058

Effective date: 20140131

FPAY Fee payment

Year of fee payment: 8