WO2007033203A2 - Multi-threaded processor architecture - Google Patents

Multi-threaded processor architecture Download PDF

Info

Publication number
WO2007033203A2
WO2007033203A2 · PCT/US2006/035541 · US2006035541W
Authority
WO
WIPO (PCT)
Prior art keywords
threaded processor
hardware contexts
context
processor
executing
Prior art date
Application number
PCT/US2006/035541
Other languages
French (fr)
Other versions
WO2007033203A3 (en)
Inventor
Michael A. Fischer
Original Assignee
Freescale Semiconductor Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US11/470,721 external-priority patent/US8046567B2/en
Application filed by Freescale Semiconductor Inc. filed Critical Freescale Semiconductor Inc.
Priority to KR1020087006004A priority Critical patent/KR101279343B1/en
Publication of WO2007033203A2 publication Critical patent/WO2007033203A2/en
Publication of WO2007033203A3 publication Critical patent/WO2007033203A3/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/461Saving or restoring of program or task context
    • G06F9/462Saving or restoring of program or task context with multiple register sets

Definitions


Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Power Sources (AREA)
  • Advance Control (AREA)
  • Multi Processors (AREA)
  • Hardware Redundancy (AREA)

Abstract

A multi-threaded processor (103) that is capable of responding to, and processing, multiple low-latency-tolerant events concurrently and while using relatively slow, low-power memories (105) is disclosed. The illustrative embodiment comprises a multi-threaded processor, which itself comprises a context controller (301) and a plurality of hardware contexts (302). Each hardware context is capable of storing the current state of one thread in a form that enables the processor to quickly switch to or from the execution of that thread. To enable the processor to be capable of responding to low-latency-tolerant events quickly, each thread - and, therefore, each hardware context - is prioritized depending on the latency tolerance of the thread responding to the event.

Description

Multi-Threaded Processor Architecture
Field of the Invention
[0001] The present invention relates to computer science in general, and, more particularly, to an architecture for a multi-threaded processor.
Background of the Invention
[0002] The trend in the design of consumer electronics is to build portable electronic appliances (e.g., personal digital assistants, cell phones, etc.) that are capable of communicating via two or more radio protocols (e.g., WiFi, Bluetooth, etc.) concurrently, and this mandates that the processor within the appliance be capable of responding to multiple low-latency-tolerant events.
[0003] The requirement that a processor be capable of responding to multiple low-latency-tolerant events has, in general, meant that the processor needed to be fast. A fast processor, however, consumes a great deal of power and, therefore, drains a portable electronic appliance's batteries quickly, which is, of course, most disadvantageous.
[0004] The need exists, therefore, for the development of a processor that is capable of responding to multiple low-latency-tolerant events concurrently and with moderate power consumption.
Summary of the Invention
[0005] The present invention enables the manufacturing and use of processors that are capable of responding to multiple low-latency-tolerant events concurrently and with moderate power consumption. For example, the illustrative embodiment is a multi-threaded processor that is capable of responding to, and processing, multiple low-latency-tolerant events concurrently and while using relatively slow, low-power memories.
[0006] In particular, the illustrative embodiment comprises a multi-threaded processor, which itself comprises a context controller and a plurality of hardware contexts. Each hardware context is capable of storing the current state of one thread in a form that enables the processor to quickly switch to or from the execution of that thread. To enable the processor to be capable of responding to low-latency-tolerant events quickly, each thread - and, therefore, each hardware context - is prioritized to reflect the latency tolerance of the event to which it responds.
[0007] The context controller switches contexts - and, therefore, access to the processing capability of the processor - among the highest priority threads on a time-sequenced basis (e.g., an instruction-by-instruction basis, etc.). In this way, the controller ensures that the processor exhibits both the low-latency benefit associated with coarse-grained multi-threaded architectures and the concurrent-processing benefit associated with fine-grained multi-threaded architectures in the prior art.
[0008] In accordance with the illustrative embodiment, the processor's memory is constructed of independent memory banks, and the instructions and data for each executing thread are stored in a different bank. This enables the illustrative embodiment to comprise relatively slow-speed program memory because the memory's access time need only be as fast as the amount of time needed by the processor to execute two or more successive instructions in a single thread, depending on parameters that are described in detail below.
[0009] The illustrative embodiment comprises: (a) H hardware contexts, each of which is capable of storing the execution state of one thread in a multi-threaded processor; and (b) a context controller for:
(i) activating each of A hardware contexts, wherein each of said A active hardware contexts has a priority,
(ii) maintaining the E highest priority of the A active hardware contexts as executing hardware contexts, wherein E equals the lesser of A and C, and wherein C equals the maximum number of concurrently executing hardware contexts in the multi-threaded processor, and
(iii) initiating a context switch in the multi-threaded processor among the E executing hardware contexts on a time-sequenced basis;
wherein C and H are positive integers and 2 < C < H; and wherein A and E are non-negative integers and E ≤ A ≤ H.
Brief Description of the Drawings
[0010] Figure 1 depicts a block diagram of the logical components of an electronic appliance in accordance with the illustrative embodiment of the present invention.
[0011] Figure 2 depicts a block diagram of the salient aspects of multi-threaded processor 103 in accordance with the illustrative embodiment of the present invention.
[0012] Figure 3 depicts a chart of the salient tasks associated with the operation of the illustrative embodiment.
Detailed Description
[0013] Figure 1 depicts a block diagram of the logical components of an electronic appliance in accordance with the illustrative embodiment of the present invention. Electronic appliance 100 comprises: radios 101-1 through 101-3, local bus controller 102, multithreaded processor 103, input/output 104, and memory 105, interrelated as shown.
[0014] Each of radios 101-1 through 101-3 is a standard radio as is well known to those skilled in the art that enables electronic appliance 100 to communicate with other electronic appliances wirelessly. Each of radios 101-1 through 101-3 operates in accordance with a different air-interface: radio 101-1 is IEEE 802.11 compliant, radio 101-2 is IS-95 compliant, and radio 101-3 is Bluetooth compliant. Each of radios 101-1 through 101-3 receives messages independently of the others, and those messages require low-latency responses. It will be clear to those skilled in the art how to make and use radios 101-1 through 101-3.
[0015] Local bus controller 102 acts as the interface between radios 101-1 through 101-3 and multi-threaded processor 103 in well-known fashion. For example, local bus controller 102 controls the interaction between radios 101-1 through 101-3 and multithreaded processor 103. It will be clear to those skilled in the art how to make and use local bus controller 102.
[0016] Multi-threaded processor 103 is a general-purpose processor that is capable of interacting with local bus controller 102, input/output 104, and memory 105 as described below and with respect to Figures 2 and 3. In particular, multi-threaded processor 103 is capable of executing a plurality of concurrent threads in the manner described below.
[0017] Input/output 104 is the non-radio interface for electronic appliance 100 and interacts with multi-threaded processor 103 in well-known fashion.
[0018] Memory 105 is the program memory for electronic appliance 100 and comprises C independent memory banks, wherein C is a positive integer greater than 2. The upper bound of C is described below with respect to Figure 2. The access time of memory 105 is equal to or less than the time required by multi-threaded processor 103 to execute C instructions. In accordance with the illustrative embodiment, the instructions and data for each executing thread are stored in a different bank in memory 105 so that there is no memory contention between threads for data in the same bank. It will be clear to those skilled in the art how to make and use memory 105. It will be clear to those skilled in the art, after reading this disclosure, how to make and use alternative embodiments of the present invention in which memory 105 is tightly-coupled memory or the cache of a multi-level memory hierarchy.
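By way of illustration, the following is a minimal Python sketch of the banked organization just described: each executing thread is given its own memory bank, so threads never contend for a bank. The bank count C and the thread names are hypothetical values chosen for the example, not values taken from the disclosure.

```python
# Illustrative sketch only: memory 105 modeled as C independent banks, with each
# executing thread assigned its own bank so that no two threads contend.
C = 4  # hypothetical number of independent memory banks

def assign_banks(executing_threads):
    """Map each executing thread to a distinct memory bank."""
    if len(executing_threads) > C:
        raise ValueError("more executing threads than memory banks")
    return {thread: bank for bank, thread in enumerate(executing_threads)}

print(assign_banks(["802.11-handler", "IS-95-handler", "bluetooth-handler"]))
# {'802.11-handler': 0, 'IS-95-handler': 1, 'bluetooth-handler': 2}
```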
[0019] Figure 2 depicts a block diagram of the salient aspects of multi-threaded processor 103 in accordance with the illustrative embodiment of the present invention. Multithreaded processor 103 comprises context controller 301 and H hardware contexts 301-1 through 301-H, wherein H is a positive integer greater than C. In accordance with the illustrative embodiment, H equals 8, but it will be clear to those skilled in the art, after reading this disclosure, how to make and use alternative embodiments of the present invention in which H has any integral value greater than C.
[0020] For the purposes of this specification, a "hardware context" is described as the hardware required to store the current state of a thread in a form that enables multithreaded processor 103 to switch to or from the execution of the thread.
[0021] Context controller 301 is capable of monitoring and regulating the population and execution of hardware contexts 301-1 through 301-H, in the manner described below and with respect to Figure 3. Furthermore, context controller 301 maintains a table which provides the following information for each hardware context: (i) is the hardware context vacant or populated? (ii) what is the priority of the hardware context? (iii) is the hardware context active or inactive? (iv) is the hardware context executing or not?
The answer to each of these questions can be stored in a vector that is part of the hardware context. It will be clear to those skilled in the art, after reading this disclosure, how to make and use alternative embodiments of the present invention in which the answers to these questions are stored in another way.
[0022] Figure 3 depicts a chart of the salient tasks associated with the operation of the illustrative embodiment. In accordance with the illustrative embodiment, tasks 301, 302, 303, and 304 run concurrently.
[0023] At task 301, context controller 201 populates a vacant hardware context of hardware contexts 202-1 through 202-H in response to the spawning of a thread. In accordance with the illustrative embodiment, the thread has a priority, and, therefore, context controller 201 associates the priority of that thread with the newly populated hardware context.
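By way of illustration, the following Python sketch models the per-context bookkeeping of paragraph [0021] together with the populating step of task 301. The field names, the value of H, the priority encoding (lower number means higher priority), and the thread name are assumptions made only for this example.

```python
# Illustrative sketch of the per-context state table and of task 301 (populating
# a vacant hardware context when a thread is spawned).
from dataclasses import dataclass
from typing import Optional

@dataclass
class HardwareContext:
    populated: bool = False          # (i)   vacant or populated?
    priority: Optional[int] = None   # (ii)  priority of the occupying thread
    active: bool = False             # (iii) ready to execute?
    executing: bool = False          # (iv)  currently being issued instructions?
    thread: Optional[str] = None

H = 8  # number of hardware contexts in the illustrative embodiment
contexts = [HardwareContext() for _ in range(H)]

def populate(contexts, thread, priority):
    """Task 301: claim the first vacant context for a newly spawned thread."""
    for ctx in contexts:
        if not ctx.populated:
            ctx.populated = True
            ctx.thread = thread
            ctx.priority = priority
            return ctx
    raise RuntimeError("no vacant hardware context available")

populate(contexts, thread="802.11-response", priority=1)
```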
[0024] It will be clear to those skilled in the art, after reading this disclosure, how to make and use alternative embodiments of the present invention in which hardware contexts 202-1 through 202-H are themselves prioritized and context controller 201 populates a vacant hardware context of hardware contexts 202-1 through 202-H in response to the spawning of a thread, which vacant hardware context has a priority commensurate with the priority of the thread.
[0025] As is well known to those skilled in the art, the normal execution of a thread usually, at some point, terminates, and at that point the hardware context for that thread is vacated and becomes available for repopulation. It will be clear to those skilled in the art, after reading this disclosure, how to make and use alternative embodiments of the present invention in which the hardware context for a thread is retained after the normal execution and becomes dormant after the thread has terminated.
[0026] At task 302, context controller 201 deems the hardware context populated in task 301 as "active," which hardware context was initially deemed "inactive." For the purposes of this specification, an "active hardware context" is defined as a context that is ready to execute and an "inactive hardware context" is defined as a context that is not ready to execute.
[0027] At any one instant, A of hardware contexts 202-1 through 202-H are deemed active, wherein A is a non-negative integer and A ≤ H. Over time, the value of A can fluctuate as new threads are spawned and completed threads are vacated. Furthermore, the value of A can fluctuate due to the occurrence of events upon which inactive threads are waiting and the voluntary inactivation of threads pending occurrence of expected future events. Furthermore, context controller 201 can deem an active hardware context as inactive when the context encounters a suspension state for whatever reason (e.g., a processor execution stall due to a cache miss, the need to wait for an external event, etc.) and can deem an inactive hardware context as active when the wait or block state has been overcome.
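By way of illustration, the following Python sketch shows one way the activations and deactivations described above might be modeled: a context is deemed inactive while its thread waits for an event and active again once that event occurs. The dict-based records, field names, and event names are assumptions made only for this example.

```python
# Illustrative sketch of deactivating a context that must wait and reactivating
# it when the awaited event occurs (per paragraph [0027]).
def deactivate(ctx, waiting_on):
    ctx["active"] = False
    ctx["waiting_on"] = waiting_on

def notify(contexts, event):
    """Reactivate every populated, inactive context that was waiting on this event."""
    for ctx in contexts:
        if ctx["populated"] and not ctx["active"] and ctx.get("waiting_on") == event:
            ctx["active"] = True
            ctx["waiting_on"] = None

ctx = dict(populated=True, thread="bluetooth-rx", priority=3,
           active=True, waiting_on=None)
deactivate(ctx, waiting_on="rx-frame")   # the thread blocks pending an expected event
notify([ctx], event="rx-frame")          # the event occurs; the context is active again
```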
[0028] At task 303, context controller 201 deems the E highest-priority of the A active hardware contexts as "executing," wherein E is a non-negative integer and equals the lesser of A and C. For the purposes of this specification, an "executing hardware context" is defined as an active hardware context that is given access to the processing capability of multi-threaded processor 103 and a "non-executing hardware context" is defined as an active hardware context that is not given access to the processing capability of multi-threaded processor 103. Over time, both the value of E and the membership of the set of executing hardware contexts can fluctuate: the former as the number of active hardware contexts changes, and the latter as the relative priorities of the active hardware contexts change. In other words, when E equals C and an inactive hardware context with a higher priority than at least one of the E executing hardware contexts is deemed active, context controller 201 deems the newly activated hardware context as an executing hardware context and deems the lowest priority of the E executing hardware contexts as non-executing. In this way, context controller 201 maintains the E highest-priority of the A active hardware contexts as executing hardware contexts.
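By way of illustration, the following Python sketch captures the selection rule of task 303: of the A active contexts, the E = min(A, C) highest-priority ones are deemed executing, and a newly activated higher-priority context displaces the lowest-priority executing one. The dict-based records, the value of C, and the convention that a lower number means a higher priority are assumptions made only for this example.

```python
# Illustrative sketch of task 303: keep the E = min(A, C) highest-priority
# active contexts executing.
C = 4  # hypothetical maximum number of concurrently executing hardware contexts

def select_executing(contexts):
    """Mark the E = min(A, C) highest-priority active contexts as executing."""
    active = [c for c in contexts if c["active"]]
    active.sort(key=lambda c: c["priority"])          # lower number = higher priority
    executing = active[:min(len(active), C)]
    for c in contexts:
        c["executing"] = c in executing
    return executing

contexts = [dict(thread=f"T{i}", priority=p, active=True, executing=False)
            for i, p in enumerate([2, 5, 1, 7, 4])]
print([c["thread"] for c in select_executing(contexts)])   # ['T2', 'T0', 'T4', 'T1']

# A newly activated, higher-priority context displaces the lowest-priority one:
contexts.append(dict(thread="T5", priority=0, active=True, executing=False))
print([c["thread"] for c in select_executing(contexts)])   # ['T5', 'T2', 'T0', 'T4']
```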
[0029] At task 304, context controller 201 initiates a context switch among the E executing hardware contexts on a time-sequenced basis. For the purposes of this specification, a "time-sequenced basis" is defined as a resource allocation system that allocates the processing capability of multi-threaded processor 103 across the executing hardware contexts based on time. The switching of contexts on a time-sequenced basis is common among many fine-grained multi-threaded processors.
[0030] One example of context switching on a time-sequenced basis is switching on an instruction-by-instruction basis. Another example of context switching on a time-sequenced basis is switching wherein N instructions in each set of N*E successively-executed instructions are from each of the E executing hardware contexts, and wherein N is a positive integer and N ≥ 1. When N equals 1, this is equivalent to switching on an instruction-by-instruction basis.
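By way of illustration, the following Python sketch generates the switching pattern of paragraph [0030]: in every window of N*E successively executed instructions, N come from each of the E executing contexts. The context labels and the values of N and E are assumptions made only for this example; with N equal to 1, the pattern reduces to instruction-by-instruction switching.

```python
# Illustrative sketch of time-sequenced switching: N instructions from each of
# the E executing contexts in every window of N*E instructions.
def interleave(executing, n, windows):
    """Yield (instruction_slot, context) pairs for the requested number of windows."""
    slot = 0
    for _ in range(windows):
        for ctx in executing:            # one burst of N instructions per context
            for _ in range(n):
                yield slot, ctx
                slot += 1

order = list(interleave(["T0", "T1", "T2"], n=1, windows=2))
print(order)   # with N = 1, each context is issued every E-th instruction slot
```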
[0031] In accordance with the illustrative embodiment, when there are fewer executing hardware contexts than multi-threaded processor 103 can concurrently handle (i.e., E < C), each of the E executing hardware contexts receives 1/Cth of the processing capability of multi-threaded processor 103 and (C-E)/C of the processing capability of multi-threaded processor 103 is not used by any of the E executing hardware contexts. This is advantageous because each thread achieves a uniform processing time, which is useful (1) in applications where externally relevant time intervals (e.g., network inter-frame spaces, etc.) are generated directly by the instruction sequence and (2) in low-power applications.
[0032] One advantage of context switching on an instruction-by-instruction basis and giving each of the E executing hardware contexts 1/Cth of the processing capability of multi-threaded processor 103 is that the instruction execution rate can be synchronized with the memory access time. For example, in the case of the illustrative embodiment, the memory is partitioned into C memory banks, and the data for each thread is stored in a different bank. In these cases, the access time of the memory need only be equal to or less than the time required by processor 103 to execute C instructions.
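By way of illustration, the following Python sketch shows the fixed 1/C slot allocation described in paragraphs [0031] and [0032] and the resulting memory-timing bound: because slot i of every round of C slots is bound to context (and bank) i mod C whether or not that context is executing, any single bank is accessed at most once every C processor cycles. The values of C, E, the cycle time, and the bank access time are hypothetical and chosen only for the example.

```python
# Illustrative sketch of the fixed 1/C allocation and the memory-access bound.
C = 4                 # hypothetical: memory banks = maximum concurrently executing contexts
E = 3                 # hypothetical: currently executing contexts (E <= C)
cycle_ns = 10         # hypothetical instruction cycle time
bank_access_ns = 35   # hypothetical memory bank access time

def slot_owner(slot):
    """Return the context/bank owning this instruction slot, or None if the slot is idle."""
    owner = slot % C
    return owner if owner < E else None   # (C-E)/C of the capability goes unused

print([slot_owner(s) for s in range(8)])  # [0, 1, 2, None, 0, 1, 2, None]

# A given bank is accessed at most once every C cycles, so its access time need
# only satisfy:
assert bank_access_ns <= C * cycle_ns
```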
[0033] It will be clear to those skilled in the art, however, after reading this disclosure, how to make and use alternative embodiments of the present invention in which each of the E executing hardware contexts receives 1/Eth of the processing capability of multi-threaded processor 103. This is advantageous because it achieves faster processing of each thread and lower response time to external events.
[0034] It is to be understood that the above-described embodiments are merely illustrative of the present invention and that many variations of the above-described embodiments can be devised by those skilled in the art without departing from the scope of the invention. It is therefore intended that such variations be included within the scope of the following claims and their equivalents.

Claims

What is claimed is:
1. An apparatus comprising:
(a) H hardware contexts, each of which is capable of storing the execution state of one thread in a multi-threaded processor; and
(b) a context controller for:
(i) activating each of A hardware contexts, wherein each of said A active hardware contexts has a priority,
(ii) maintaining the E highest priority of said A active hardware contexts as executing hardware contexts, wherein E equals the lesser of A and C, and wherein C equals the maximum number of concurrently executing hardware contexts in said multi-threaded processor, and
(iii) initiating a context switch in said multi-threaded processor among said E executing hardware contexts on a time-sequenced basis; wherein C and H are positive integers and 2 < C < H; and wherein A and E are non-negative integers and E ≤ A ≤ H.
2. The apparatus of claim 1 wherein said time-sequenced basis is an instruction-by- instruction basis.
3. The apparatus of claim 1 wherein N instructions in each set of N*E successively-executed instructions are from each of said E executing hardware contexts, and wherein N is a positive integer.
4. The apparatus of claim 3 wherein N = 1.
5. The apparatus of claim 1 wherein each of said E executing hardware contexts receives 1/Eth of the processing capability of said multi-threaded processor.
6. The apparatus of claim 1 wherein each of said E executing hardware contexts receives 1/Cth of the processing capability of said multi-threaded processor.
7. The apparatus of claim 1 further comprising:
(c) a memory configured as C memory banks, wherein the access time of said memory is equal to or less than the time required by said multi-threaded processor to execute C instructions.
8. An apparatus comprising: (i) H hardware contexts, each of which is capable of storing the execution state of one thread in a multi-threaded processor; and
(ii) a context controller for switching the context in said multi-threaded processor among the E highest priority of A active hardware contexts after each instruction, wherein E equals the lesser of A and C, and wherein C equals the maximum number of concurrently executing hardware contexts in said multi-threaded processor; wherein C and H are positive integers and 2 < C < H; and wherein A and E are non-negative integers and E ≤ A ≤ H.
9. The apparatus of claim 8 wherein N instructions in each set of N*E successively-executed instructions are from each of said E executing hardware contexts, and wherein N is a positive integer.
10. The apparatus of claim 9 wherein N = 1.
11. The apparatus of claim 8 wherein each of said E executing hardware contexts receives 1/Eth of the processing capability of said multi- threaded processor.
12. The apparatus of claim 8 wherein each of said E executing hardware contexts receives 1/Cth of the processing capability of said multi-threaded processor.
13. The apparatus of claim 8 further comprising:
(iii) a memory configured as C memory banks, wherein the access time of said memory is equal to or less than the time required by said multi-threaded processor to execute C instructions.
14. An apparatus comprising:
(i) H hardware contexts, each of which is capable of storing the execution state of one thread in a multi-threaded processor; and
(ii) a context controller for switching the context in said multi-threaded processor among the E executing hardware contexts on a time-sequenced basis; wherein each of said E executing hardware contexts receives 1/Cth of the processing capability of said multi-threaded processor; wherein (C-E)/C of the processing capability of said multi-threaded processor is not used by any of said E executing hardware contexts; wherein E equals the lesser of A and C, and wherein C equals the maximum number of concurrently executing hardware contexts in said multi-threaded processor; and wherein E is a non-negative integer, C and H are positive integers, and 2 ≤ E < C < H.
15. The apparatus of claim 14 wherein said time-sequenced basis is an instruction-by- instruction basis.
PCT/US2006/035541 2005-09-13 2006-09-12 Multi-threaded processor architecture WO2007033203A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020087006004A KR101279343B1 (en) 2005-09-13 2006-09-12 Multi-threaded processor architecture

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US71680605P 2005-09-13 2005-09-13
US60/716,806 2005-09-13
US11/470,721 US8046567B2 (en) 2005-09-13 2006-09-07 Multi-threaded processor architecture
US11/470,721 2006-09-07

Publications (2)

Publication Number Publication Date
WO2007033203A2 true WO2007033203A2 (en) 2007-03-22
WO2007033203A3 WO2007033203A3 (en) 2007-05-24

Family

ID=37865529

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2006/035541 WO2007033203A2 (en) 2005-09-13 2006-09-12 Multi-threaded processor architecture

Country Status (2)

Country Link
KR (1) KR101279343B1 (en)
WO (1) WO2007033203A2 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9798582B2 (en) 2015-10-22 2017-10-24 International Business Machines Corporation Low latency scheduling on simultaneous multi-threading cores
GB2606674A (en) * 2016-10-21 2022-11-16 Datarobot Inc System for predictive data analytics, and related methods and apparatus
US11922329B2 (en) 2014-05-23 2024-03-05 DataRobot, Inc. Systems for second-order predictive data analytics, and related methods and apparatus

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6205468B1 (en) * 1998-03-10 2001-03-20 Lucent Technologies, Inc. System for multitasking management employing context controller having event vector selection by priority encoding of contex events
US6986141B1 (en) * 1998-03-10 2006-01-10 Agere Systems Inc. Context controller having instruction-based time slice task switching capability and processor employing the same

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6968445B2 (en) 2001-12-20 2005-11-22 Sandbridge Technologies, Inc. Multithreaded processor with efficient processing for convergence device applications
US6925643B2 (en) 2002-10-11 2005-08-02 Sandbridge Technologies, Inc. Method and apparatus for thread-based memory access in a multithreaded processor

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6205468B1 (en) * 1998-03-10 2001-03-20 Lucent Technologies, Inc. System for multitasking management employing context controller having event vector selection by priority encoding of contex events
US6986141B1 (en) * 1998-03-10 2006-01-10 Agere Systems Inc. Context controller having instruction-based time slice task switching capability and processor employing the same

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
EGGERS S.J. ET AL.: 'Simultaneous Multithreading: A Platform for Next-Generation Processors' IEEE MICRO 1997, pages 12 - 19, XP002252719 *
PAREKH S. ET AL.: 'Thread-Sensitive Scheduling for SMT Processors' DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING, UNIVERSITY OF WASHINGTON 2000, pages 1 - 18, XP003013088 *
RAASCH S.E. ET AL.: 'Applications of Thread prioritization in SMT Processors' PROC. 1999 WORKSHOP ON MULTITHREADED EXECUTION AND COMPILATION January 1999, pages 1 - 9, XP003013089 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11922329B2 (en) 2014-05-23 2024-03-05 DataRobot, Inc. Systems for second-order predictive data analytics, and related methods and apparatus
US9798582B2 (en) 2015-10-22 2017-10-24 International Business Machines Corporation Low latency scheduling on simultaneous multi-threading cores
US9817696B2 (en) 2015-10-22 2017-11-14 International Business Machines Coroporation Low latency scheduling on simultaneous multi-threading cores
GB2606674A (en) * 2016-10-21 2022-11-16 Datarobot Inc System for predictive data analytics, and related methods and apparatus
GB2606674B (en) * 2016-10-21 2023-06-28 Datarobot Inc System for predictive data analytics, and related methods and apparatus

Also Published As

Publication number Publication date
KR20080043349A (en) 2008-05-16
KR101279343B1 (en) 2013-07-04
WO2007033203A3 (en) 2007-05-24

Similar Documents

Publication Publication Date Title
CN104915256B (en) A kind of the Real-Time Scheduling implementation method and its system of task
CN102341780B (en) Real-time multithread scheduler and dispatching method
JP5323828B2 (en) Virtual machine control device, virtual machine control program, and virtual machine control circuit
EP3245587B1 (en) Systems and methods for providing dynamic cache extension in a multi-cluster heterogeneous processor architecture
EP3259825B1 (en) Heterogeneous battery cell switching
CN104583900A (en) Dynamically switching a workload between heterogeneous cores of a processor
US20080268828A1 (en) Device that determines whether to launch an application locally or remotely as a webapp
US20110107426A1 (en) Computing system using single operating system to provide normal security services and high security services, and methods thereof
US9411649B2 (en) Resource allocation method
Reusing Comparison of operating systems tinyos and contiki
EP2580657B1 (en) Information processing device and method
Sabri et al. Comparison of IoT constrained devices operating systems: A survey
US20160350156A1 (en) Method for performing processor resource allocation in an electronic device, and associated apparatus
CN101790709A (en) Dynamic core switches
EP2458501A1 (en) Method of operating a communication device and related communication device
US20150301858A1 (en) Multiprocessors systems and processes scheduling methods thereof
EP2551768A1 (en) Multi-core system and start-up method
US20060155552A1 (en) Event handling mechanism
WO2007033203A2 (en) Multi-threaded processor architecture
CN114490123A (en) Task processing method and device, electronic equipment and storage medium
US20100305937A1 (en) Coprocessor support in a computing device
US8046567B2 (en) Multi-threaded processor architecture
CN101258465A (en) System and method of controlling multiple program threads within a multithreaded processor
CN116661907A (en) Method, device, equipment and medium for calling non-switching function under SGX single thread
GB2506169A (en) Limiting task context restore if a flag indicates task processing is disabled

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 1044/DELNP/2008

Country of ref document: IN

WWE Wipo information: entry into national phase

Ref document number: 1020087006004

Country of ref document: KR

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06803459

Country of ref document: EP

Kind code of ref document: A2