WO2003088036A1 - System and method for instruction level multithreading - Google Patents
System and method for instruction level multithreading
- Publication number
- WO2003088036A1 (PCT/IB2003/001234)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- state
- processor
- computer system
- processing pipeline
- sets
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3867—Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline or look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
- G06F9/3851—Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
Definitions
- The invention relates to a computer system as defined in the precharacterizing portion of claim 1.
- the invention further relates to a method for operating a computer system as defined in the precharacterizing portion of claim 6.
- Modern processors employed in computer systems use various techniques to improve their performance.
- One of them is multithreading.
- a multithreaded computer system may contain hardware support for multiple threads of execution.
- the threads can be independent programs, or related execution streams of a single parallel program.
- Multithreading enables a better performance of a computer system, if the system is configured to allow the processor to continue with another thread if a delay occurs in processing the current thread, for example because a cache miss occurs and the memory has to be accessed which generally has a relatively long latency. Otherwise thread switch means may be present which periodically select an active thread from a pool of available threads.
- a computer system and method referred to in the opening paragraph are known from WO 00/6878.
- the computer system allows a fast context switching between different states, by selectively coupling a different set of state saving elements (e.g. registers) to the processing pipeline. After a context switch another set is selected.
- A disadvantage of the known computer system is that it does not provide a solution for the case that more threads than sets of state saving elements are present. Hence, to guarantee that the system is also suitable for switching between a high number of threads, a relatively high number of sets of state saving elements is necessary. This, however, has the disadvantage that a significant number of state saving elements is unused when few threads are available.
- The state transfer unit may transfer a state between a set of state saving elements and the memory while the processing pipeline accesses the other state. If a thread switch has occurred from a first thread using a first one of the sets of state saving elements to a second thread using a second one of the sets of state saving elements, the state transfer unit may transfer the state from the first one of the sets of state saving elements to the memory, while the processing pipeline continues with the second thread. In this way an unlimited number of threads can be handled with only two sets of state saving elements.
- An embodiment comprising at least three sets is described in claim 2. This embodiment is advantageous if a thread is interrupted very early after its start of execution. In this circumstance there was not enough time available to complete a state transfer between a set of state saving elements and the memory during execution of said thread. Completion of a state transfer is understood to mean the saving of the 'old' state from the set to the memory and the restoring of a 'new' state from the memory to the set. By coupling a third of the sets, in which the state of a further thread is saved, the processing pipeline can immediately continue with the further thread.
- the transfer of a state takes place with a low bus priority as compared to communication between the processing pipeline and the memory. In this way, even when the processing pipeline and the state transfer unit share the same memory, the active thread can be executed without being significantly delayed by intervening accesses to the bus by the state transfer unit.
- the embodiment of claim 4 enables a further improvement of the efficiency of the processing pipeline.
- the processor serves as a translator for translating (converting) instruction code of a first type to instruction code of a second type suitable to be processed by a native processor.
- In this type of computer system it is particularly important to perform thread execution and state saving in parallel, because a complete state swap is essential for each thread switch in such a system. The reason is that a thread switch may occur in the middle of the translation of a code of the first type.
- the code of the first type to be translated is JAVA byte code. An overview thereof is discussed in "Implementing the JAVA Virtual Machine", by Brian Case, Microdesign Resources, March 25, 1996, pp.12-17.
- a typical translator requires some dozens of state saving elements for storing intermediate values, e.g. an index to the current byte code, and indexes to several tables and buffers used in the translator.
- the translator may comprise state saving elements, i.e. in the form of registers or stack locations for saving parameters which are operated on by the instruction codes.
- Such a translator essentially differs from a computing processor in that it converts instructions of a first type without carrying them out itself. As the number of generated instructions of the second type usually is greater than the number of instructions read from the memory, this leaves the state transfer unit ample time to access the bus to the memory, even when it is operating at a low priority.
- a processor acting as a translator for a further processor is described in WO 99/18484.
- this document describes a processor which is capable of refeeding a sequence of instructions after the further processor has been interrupted.
- The subject-matter of this PCT application is considered to be incorporated by reference herein.
- Instructions of the first type may be too complex to translate in dedicated hardware. Examples thereof are the JAVA bytecodes invokevirtual (search object classes for method and call), invokestatic (also method call), getfield (search object classes for field data and load an object's field value onto the stack), new (search object class and create new object accordingly).
- Such bytecodes may be passed to the further processor for processing by a dedicated subroutine.
- Figure 1 shows a first embodiment of a computer system according to the invention
- Figure 1A shows an organization of state saving elements in state saving units and in sets of state saving elements
- Figure 2A shows in more detail a portion of the computer system of Figure 1
- Figure 2B shows in more detail a portion of Figure 2A
- Figure 2C shows in more detail another portion of Figure 2A
- Figure 3 shows a flow chart of a state transfer
- Figure 4 shows the synchronization between the processing pipeline and the state transfer unit
- Figure 5A-D shows four different embodiments of a computer system according to the invention in which the processor serves as a preprocessor for converting instruction codes
- Figure 6A-D shows four examples of implementation of the embodiment shown in Figure 5B.
- FIG. 1 shows a first embodiment of a computer system according to the invention.
- The computer system shown comprises a processor 10 which is arranged for multi-thread processing.
- the processor comprises a processing pipeline 11, 12, 13 and at least a first and a second set 20, 30 of state saving elements.
- The processing pipeline comprises a first stage 11 for fetching instructions, a second stage 12 for translating the instructions, and a third stage 13 for providing the results of the second stage 12 to an output, such as a bus.
- the processor comprises an instruction cache 14 via which it is coupled to the memory 70 via a communication means 90 such as a bus or a point to point connection. Point-to-point connections enable a fast data transfer.
- the processing pipeline comprises several stages, between which sets 20, 30, 30', 30" of state saving elements are arranged.
- the stages of the pipeline 11, 12, 13 pass information to each other via a selected set e.g. 20 of the state saving elements. Additional state saving elements could be part of the sets. Although three stages 11, 12, 13 are shown any number of stages is possible.
- a first set 20 of state saving elements is indicated with a first F-shaped area bounded by a solid line
- a second set 30 as well as a third and a fourth set 30', 30" of state saving elements is indicated with an F-shaped area with a dashed boundary.
- Each of the sets of state saving elements 20, 30, 30', 30" may have a plurality of state saving elements between each pair of mutually coupled stages.
- The set of state saving elements 20 comprises the state saving elements 21_I, 21_II and 21_III between the pipeline stages 11 and 12.
- Corresponding state saving elements in the different sets of state saving elements form state saving units.
- State saving unit I comprises state saving elements 21_I, 31_I, 31'_I and 31"_I from sets 20, 30, 30' and 30" respectively.
- If the processing pipeline 11, 12, 13 has to interrupt processing a first thread, for example because of a cache miss, it can rapidly start processing another thread whose state is saved in another set, e.g. set 30, of state saving elements.
- the computer system shown comprises selection means for selectably coupling one of the sets 20, 30, 30', 30" to the processing pipeline 11, 12, 13.
- For clarity, a part of the processing pipeline, including the relevant part of the sets of state saving elements and the selection means, is shown in FIG. 2A.
- stages 11 and 12 of the processing pipeline are shown as well as state saving units I, II and III. Each of the state saving units is capable of containing state information.
- Each of the state saving units, e.g. unit I, comprises a plurality of state saving elements, as is shown in more detail in Figures 2B and 2C.
- One of these state saving elements is used during the processing of a current thread.
- the one or more other state saving elements are used to store information about currently inactive threads.
- The other state saving units II and III are preferably equivalent to state saving unit I in order to facilitate designing and manufacturing the device.
- The state saving units I, II and III have a first input I1 for receiving state information from a first stage 11 of the pipeline and a first output 13 to enable a second pipeline stage 12 to read out the information.
- The state saving units further have a second input I2 to enable restoring of information from the data memory 70 into the state saving unit and a second output 14 to enable saving of state information from the state saving unit to the data memory 70. Furthermore the state saving units have inputs for receiving the m-valued signals SelB1 and SelB2.
- The state saving units further comprise an input for receiving a unit selection signal R1, R2, R3 and an input for receiving a clock signal.
- The unit selection signals R1, ..., Rn identify the state saving unit which should receive the data which is loaded from memory 70 during state restoring.
- the computer system according to the invention further comprises a state transferring unit 60 for transferring a state between a set not coupled to the processing pipeline and a memory. The state transferring unit 60 is controlled by the controller 50.
- FIG. 2B shows a state saving unit I in more detail.
- The state saving unit I comprises a first input I1 for receiving state information from a processing stage 11.
- A demultiplexer 41 redirects this state information to one of a set of state saving elements 21_I, 31_I, ... in response to a bank select signal SelB1.
- A multiplexer 43 selects one of the output signals of the state saving elements 21_I, 31_I, ... as the output signal at output 13 in response to the same bank select signal SelB1. This output signal can be read by the next processing stage 12 in the pipeline.
- A second input I2 is coupled to the bus 90 for receiving information which is to be restored from the memory 70.
- A demultiplexer 42 redirects this state information to another one of the state saving elements 21_I, 31_I, ... in response to a second bank select signal SelB2.
- A multiplexer 44 selects another one of the output signals of the state saving elements 21_I, 31_I, ... as the output signal at output 14 in response to the same bank select signal SelB2. This output 14 is coupled to the memory 70 to enable saving of state information.
- A state saving element, e.g. 21_I, can read information either via the first demultiplexer 41 or via the second demultiplexer 42 when it is activated by an activation signal Cl_ij; e.g. state saving element 21_I is activated with signal Cl_11.
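The two-port behaviour of a state saving unit described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class and method names are hypothetical, banks are numbered from 0, and clocking and activation signals are ignored.

```python
class StateSavingUnit:
    """Hypothetical model of one state saving unit: one element per set (bank)."""

    def __init__(self, num_sets):
        # One storage element per set, e.g. the elements of sets 20, 30, 30', 30".
        self.elements = [0] * num_sets

    # Pipeline port: demultiplexer 41 (write) and multiplexer 43 (read),
    # both steered by the first bank select signal SelB1.
    def pipeline_write(self, sel_b1, value):
        self.elements[sel_b1] = value

    def pipeline_read(self, sel_b1):
        return self.elements[sel_b1]

    # Memory port: demultiplexer 42 (restore) and multiplexer 44 (save),
    # both steered by the second bank select signal SelB2.
    def memory_restore(self, sel_b2, value):
        self.elements[sel_b2] = value

    def memory_save(self, sel_b2):
        return self.elements[sel_b2]


unit = StateSavingUnit(num_sets=4)
unit.pipeline_write(sel_b1=0, value=0xCAFE)   # the active thread uses bank 0
unit.memory_restore(sel_b2=1, value=0xBEEF)   # an inactive bank is reloaded meanwhile
```

Because the two ports use independent bank select signals, the pipeline and the memory side can address different elements of the same unit without interfering with each other.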
- Fig. 2A and Fig. 2B show selection means in the form of multiplexers.
- Other ways well known to the skilled person are possible to perform the function of the multiplexers.
- The same function could be implemented by a 5-1 look-up table, having the 4 outputs of the state saving elements 21_I, 31_I, ... coupled to four of its inputs, while the fifth input receives the signal SelB1.
- the demultiplexers 41, 42, or any other logic function could be implemented in a look-up table.
- Figure 2C shows by way of example a circuit for generating the activation signals Cl_i1, Cl_i2 etc.
- The circuit comprises a first decoder 25 for decoding the first bank select signal SelB1 and a second decoder 26 for decoding the second bank select signal SelB2.
- The circuit comprises first combination means (here AND-gates 24, 34, ...) for combining the proper unit selection signal Ri with the output signals of the second decoder 26.
- the circuit further comprises second combination means (the OR-gates 23, 33, ...) for combining the output signals of the first combination means 24, 34 with the outputs of the first decoder 25.
- the circuit further comprises third combination means (the AND-gates 22, 32, ...) for combining the outputs of the second combination means 23, 33, ..
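The gate structure described above can be written out as boolean logic. The following is a minimal sketch, not the circuit of Figure 2C itself: the function name is hypothetical, banks are numbered from 0, and it is an assumption that the third combination means gate the result with the clock signal.

```python
def activation_signals(sel_b1, sel_b2, r_i, clock, num_sets=4):
    """Return one activation bit per state saving element of unit i."""
    dec1 = [j == sel_b1 for j in range(num_sets)]   # first decoder 25
    dec2 = [j == sel_b2 for j in range(num_sets)]   # second decoder 26
    signals = []
    for j in range(num_sets):
        restore = r_i and dec2[j]        # first combination means (AND-gates 24, 34, ...)
        selected = dec1[j] or restore    # second combination means (OR-gates 23, 33, ...)
        # Third combination means (AND-gates 22, 32, ...); gating with the
        # clock is an assumption made for this sketch.
        signals.append(clock and selected)
    return signals
```

With `sel_b1=0`, `sel_b2=1` and the unit selected for restoring, both the element used by the pipeline and the element being restored are activated on the clock pulse.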
- the processor 10 has four sets of state saving elements i.e. 20, 30, 30', 30".
- the control unit 60 is capable of coupling a third of the sets e.g. 30' to the processing pipeline 11, 12, 13 upon detecting that a thread using a first of the sets e.g. 20 finishes execution before the state transfer between a second e.g. 30 of the sets is complete.
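The choice of a further set when a state transfer is still in progress might be sketched as follows. The function name and the dictionary of completion flags are hypothetical and serve only to illustrate the selection rule.

```python
def next_set(active, transfer_done):
    """Pick a set to couple to the pipeline once the active thread stops.

    transfer_done maps a set name to True when that set holds a complete,
    restored state (hypothetical bookkeeping for this sketch).
    """
    for name, done in transfer_done.items():
        if name != active and done:
            return name
    return None  # no other set is ready, so the pipeline would have to wait


# Set 30 is still being transferred, so the third set 30' is selected.
chosen = next_set("20", {"20": True, "30": False, "30'": True})
```

With only two sets the same call returns `None` in this situation, which is the case the embodiment with at least three sets is meant to avoid.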
- the state saving unit comprises 4 state saving elements each forming part of a set of state saving elements.
- the computer system comprises a further processor 80.
- the processor 10 serves as a preprocessor for converting instruction code of a first type to instruction code of a second type suitable to be processed by the further processor 80.
- While the processing pipeline, coupled to a first set of state saving elements, e.g. set 20, processes a thread, the state transfer unit 60 performs a state transfer between the memory and another set, e.g. the second set 30 of state saving elements.
- In step S1 an internal counter 61 of the state transfer unit 60 is initialized.
- The count CNT registered in this counter 61 indicates a particular state saving element in a set of state saving elements.
- The counter 61 may for example be initialized at 1, corresponding to the first state saving element 21_I of a set 20 of state saving elements.
- In step S2 it is verified whether the bus 90 to the memory 70 is available. As long as this is not the case step S2 is repeated. As soon as the bus 90 becomes available, the address in the memory 70 corresponding to the selected state saving element of the set which is to be saved is provided to the bus in step S3.
- The address is calculated in this embodiment by adding the value CNT to an offset value which depends on the thread for which the state is saved.
- The offset value, i.e. the address indicating the beginning of the thread save space, is saved in a register by the processor.
- The signal of the counter CNT is provided to a multiplexer 62 in step S4 so that an input of the multiplexer 62 is enabled which is coupled to said first state saving element, e.g. 21_I of set 20.
- the data stored in the state saving element is written to the bus 90.
- the transfer unit waits until a bus acknowledge indicates that the data is stored in memory 70.
- In step S6 it is checked whether the content of each of the state saving elements has been saved to the memory 70. If this is not the case, the count CNT is incremented in step S6a and the loop is repeated from step S2.
- In step S8 the output signal CNT addresses a decoder 63 which is now enabled by a signal WR.
- One of the output signals, e.g. Ri, of the decoder 63 selects the state saving unit having the same index i. If for example the value of SelB2 is "1" and the signal Ri is activated, the signal Cl_i2 for the state saving element 31_i will be enabled at the next clock pulse Cl.
- In step S9 it is checked whether the bus is available.
- In step S10 an address is provided in the same way as in step S3.
- In step S10a the transfer unit waits until a bus acknowledge indicates that the memory 70 has the data available at the bus 90.
- In step S11 data available at the bus 90 is transferred via the demultiplexer 42 into the state saving element, e.g. 31_i, which is enabled by the signal Cl_i2 from the circuit shown in Figure 2C.
- In step S12 it is verified whether the state restore operation is complete. If this is not the case, the count CNT is incremented in step S13 and the loop is repeated from step S8. If the state restore operation is complete then the WR signal is reset in step S14.
- the state transfer unit 60 first saves the state stored in a set of state saving elements to the memory and subsequently loads a state corresponding to another thread to the set of state saving elements. Otherwise the state transfer unit 60 could load a new value in a state saving element before it starts to save the value of a next state saving element. Any order is allowed as long as a value in a state saving element is not overwritten before it is saved.
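The save and restore loops of the flow chart can be summarized in software form. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the memory is modelled as a dictionary, the counter starts at 0 rather than 1, and bus acknowledges are folded into the `bus_free` callback.

```python
def transfer_state(elements, memory, save_offset, restore_offset,
                   bus_free=lambda: True):
    """Save `elements` to memory, then reload them with another thread's state."""
    n = len(elements)

    # Save phase (steps S1 to S6a).
    cnt = 0                                  # S1: initialise counter 61
    while cnt < n:                           # S6: all elements saved?
        while not bus_free():                # S2: wait until bus 90 is available
            pass
        address = save_offset + cnt          # S3: address = thread offset + CNT
        memory[address] = elements[cnt]      # S4: select element CNT, write it to the bus
        cnt += 1                             # S6a: next element

    # Restore phase (steps S8 to S14).
    cnt = 0
    while cnt < n:                           # S12: restore complete?
        while not bus_free():                # S9: wait until the bus is available
            pass
        address = restore_offset + cnt       # S10: address computed as in the save phase
        elements[cnt] = memory[address]      # S11: load data into the selected element
        cnt += 1                             # S13: next element
    return elements


memory = {100 + i: 10 * i for i in range(3)}   # previously saved state of the next thread
regs = [7, 8, 9]                               # state of the thread being swapped out
transfer_state(regs, memory, save_offset=200, restore_offset=100)
```

Saving everything before restoring is the simplest order that satisfies the rule that no value is overwritten before it is saved; an interleaved order that saves and reloads one element at a time would satisfy it as well.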
- The memory 70 may be used exclusively by the state transfer unit 60.
- the memory 70 is shared by other modules. In the embodiment shown it shares the bus 90 to the memory 70 with the cache 14 of the processor 10. This allows for a smaller hardware implementation. It further allows for a flexible use of the memory 70. If for example an application has relatively few threads, then a relatively larger amount of the memory can be used by the other module, e.g the processor pipeline. In such an embodiment the transferring of a state preferably takes place with a low bus priority as compared to communication between the processing pipeline and the memory.
- the priority to access the memory 70 may be determined dynamically, on a per access basis, i.e. for each request to communicate with the memory. On the other hand the priority may have a fixed value for each bus agent.
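A fixed priority per bus agent could be modelled as below. The agent names and priority values are hypothetical; the only property taken from the description is that the state transfer unit has a lower priority than the processing pipeline.

```python
# Hypothetical fixed priorities: a higher number wins the bus.
PRIORITY = {"pipeline": 2, "state_transfer_unit": 1}


def grant(requests, priority=PRIORITY):
    """Return the requesting agent with the highest priority, or None."""
    active = [agent for agent, wants in requests.items() if wants]
    if not active:
        return None
    return max(active, key=lambda agent: priority[agent])


both = grant({"pipeline": True, "state_transfer_unit": True})
idle_pipeline = grant({"pipeline": False, "state_transfer_unit": True})
```

A dynamic scheme would replace the fixed table with a priority supplied with each request; the grant rule itself would stay the same.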
- the thread switch frequency is in the order of 1 kHz.
- The state is determined by about 25 state saving elements of 32 bits. Saving a state and restoring another state requires about 250 cycles. In a machine operating at 80 MHz this equals 6 µs.
- The minimum time required for state saving and restoring is usually a factor 2 to 3 higher because system buses and memory usually run at a reduced clock speed. In a computer system with only one set of state saving elements this required time would noticeably delay the conversion process.
- As the state transfer can take place while the processing pipeline is processing a thread, it does not hamper the latter process. It is even possible to allow the state transfer to take place at a pace which is about 50 times slower than at maximum transfer speed.
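The time budget behind the factor of about 50 can be checked with a short calculation. The sketch takes the figures as stated in the description (a switch frequency of about 1 kHz, about 6 µs per save and restore, a slowdown of 2 to 3 for buses and memory); the variable names are illustrative.

```python
switch_frequency_hz = 1_000    # thread switch frequency, in the order of 1 kHz
transfer_us = 6.0              # stated time to save one state and restore another
bus_slowdown = (2, 3)          # buses and memory run a factor 2 to 3 slower
throttle = 50                  # transfer paced about 50 times slower than maximum

# Time available between two thread switches, in microseconds.
period_us = 1_000_000 / switch_frequency_hz

# Duration of a throttled transfer for both slowdown factors.
budget_us = [transfer_us * factor * throttle for factor in bus_slowdown]

# The background transfer is harmless if it finishes within one period.
fits = all(t < period_us for t in budget_us)
```

Under these figures the throttled transfer takes 600 to 900 µs, which still fits within the 1000 µs between thread switches.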
- The controller 50 shown in Figure 1 may autonomously select a thread to be processed by the processor 10. Such an autonomous selection may be realized by a timer which selects a new thread each time a certain time interval has elapsed. Otherwise the controller 50 may initiate a thread switch in response to signals I1, I2 from outside, e.g. from the processing pipeline 11, 12, 13, for example a signal that the processing pipeline 11, 12, 13 is delayed by a cache miss.
- FIG. 4 shows an example wherein four threads T1, T2, T3, T4 are processed periodically.
- The activities of the state transfer unit 60 are schematically indicated by the lower bar referred to with STU in the figure.
- The activities of the processing pipeline 11, 12, 13 are symbolized with the upper bar referred to with PP.
- The processing pipeline starts processing thread T1 using set S1 (e.g. 20) of state saving elements.
- the state transfer unit 60 loads the state corresponding to thread T2 from memory section M2 to set S2 of state saving elements.
- the processing pipeline 11, 12, 13 continues with processing thread T2 using set S2 (e.g. 30) of state saving elements.
- In a time interval from t2 to t2' the state transfer unit 60 first saves the state stored in set S1 to memory section M1 assigned to thread T1. In the time interval from t2' to t3 it loads a state stored in memory section M3 assigned to thread T3 into the set S1. At t3 the processing pipeline 11, 12, 13 starts processing thread T3 using set S1 of state saving elements in which a previous state of thread T3 was restored in the time interval t2'-t3. In this way every thread is processed. At t5 the processing pipeline continues to process thread T1 again.
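The alternation between the two sets in this example can be reproduced with a short simulation. It is a sketch only: the function name and the dictionary layout are hypothetical, and each time slot is assumed to be long enough for the state transfer unit to finish both the save and the load.

```python
def figure4_schedule(threads, num_slots):
    """Round-robin schedule over two sets S1 and S2 of state saving elements."""
    sets = ("S1", "S2")
    steps = []
    for slot in range(num_slots):
        running = threads[slot % len(threads)]
        upcoming = threads[(slot + 1) % len(threads)]
        previous = threads[(slot - 1) % len(threads)] if slot > 0 else None
        steps.append({
            "pipeline": (running, sets[slot % 2]),   # thread and set used by the pipeline
            "idle_set": sets[(slot + 1) % 2],        # set handled by the state transfer unit
            "save": previous,                        # state saved from the idle set to memory
            "load": upcoming,                        # state loaded from memory into the idle set
        })
    return steps


steps = figure4_schedule(["T1", "T2", "T3", "T4"], 5)
```

In the second slot the pipeline runs T2 on S2 while the state of T1 is saved from S1 and the state of T3 is loaded into S1; in the fifth slot the pipeline is back at T1 on S1.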
- A plurality of the computer systems described herein may be combined to form a parallel computer system, in which threads are computed not only sequentially, but also in parallel.
- FIG. 5A-D show some examples of embodiments of the computer system according to the invention wherein the processor serves as a preprocessor for converting instruction code of a first type to instruction code of a second type suitable to be processed by the further processor.
- Figure 5A schematically again shows the architecture of Figure 1.
- the processor 10 directly communicates its translated instructions to the further processor 80. This is advantageous in that the bus 90 is not loaded with the translated instructions, and remains available for transfer of instructions of the first type and for state transfers.
- Figure 5B shows an alternative embodiment, wherein the processor 10 and the further processor 80 are integrated. This embodiment is described in some more detail in Figure 6A-D.
- In the embodiment of Figure 5C the processor 10 is attached as a peripheral to the bus 90 of the further processor 80.
- This has the advantage of a simple architecture.
- the bus 90 however is loaded with the translated instruction stream.
- the bus 90 should preferably be a fast on-chip bus.
- Figure 5D shows an embodiment in which the memory 70 is coupled to the bus 90 via the processor 10. This embodiment is less suitable in that for every memory read issued by the further processor 80, the processor 10 has to decide whether it has to generate the data itself, or retrieve the data from memory 70.
- Figure 6A-D shows in more detail four examples of an embodiment in which the processor 10 is integrated with a further processor 80.
- The further processor shown here is a RISC CPU with a pipeline 83, instruction cache 82, and bus interface 81 indicated by solid lines. Besides these modules, such CPUs also contain a data cache, a register file, a write back buffer, etc. Those components are not shown in Figure 6A-D, however, since they would unnecessarily complicate the Figures.
- The further processor 80 is coupled to the bus 90 via a bus wrapper 95, such as defined by the Virtual Socket Interface Alliance (VSIA) in its Virtual Component Interface (VCI) proposals.
- VCI introduces bus wrappers, in order to allow bus agents (such as CPUs) to abstract from the actual on-chip bus.
- the wrappers translate (proprietary) bus protocols to a standardised point-to-point (P2P) protocol.
- The P2P protocol reduces protocol overhead if no bus is placed in between (when connecting the processor 10 to the bus wrapper and the bus wrapper to the further processor 80).
- The examples shown in Figure 6A-D are arranged in increasing level of integration between the processor 10 and the further processor 80.
- the performance level is expected to increase as the integration level increases.
- Embodiments having a low integration level however, have the advantage of a low design complexity.
- FIG. 6B shows an example wherein the processor 10 is coupled to the bus interface 81 of the further processor 80.
- bus interface and bus wrapper will be integrated, so that this embodiment does not substantially differ from the embodiment of Figure 6 A.
- The embodiment of FIG. 6C, wherein the processor 10 is coupled to the bus 90 via the bus interface 81 of the further processor, is advantageous in that the processor 10 does not need its own bus interface.
- A particularly advantageous embodiment is shown in Figure 6D.
- The processor 10 retrieves the instruction code of the first type from the instruction cache 82 of the further processor 80 and writes the translated instructions directly to the pipeline 83 of the further processor 80.
- This embodiment is advantageous in that it allows the processor 10 to utilize the instruction cache 82 of the further processor 80.
- the scope of protection of the invention is not restricted to the embodiments described herein. Neither is the scope of protection of the invention restricted by the reference numerals in the claims.
- the word 'comprising' does not exclude other parts than those mentioned in a claim.
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2003215845A AU2003215845A1 (en) | 2002-04-12 | 2003-03-27 | System and method for instruction level multithreading |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP02290935.2 | 2002-04-12 | ||
EP02290935 | 2002-04-12 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2003088036A1 (fr) | 2003-10-23 |
Family
ID=29225734
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IB2003/001234 WO2003088036A1 (fr) | System and method for instruction level multithreading | 2002-04-12 | 2003-03-27 |
Country Status (2)
Country | Link |
---|---|
AU (1) | AU2003215845A1 (fr) |
WO (1) | WO2003088036A1 (fr) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5778243A (en) * | 1996-07-03 | 1998-07-07 | International Business Machines Corporation | Multi-threaded cell for a memory |
US6006293A (en) * | 1998-04-21 | 1999-12-21 | Comsat Corporation | Method and apparatus for zero overhead sharing for registered digital hardware |
WO2000045258A1 (fr) * | 1999-01-27 | 2000-08-03 | Xstream Logic, Inc. | Unite de transfert de registres pour processeur electronique |
-
2003
- 2003-03-27 AU AU2003215845A patent/AU2003215845A1/en not_active Abandoned
- 2003-03-27 WO PCT/IB2003/001234 patent/WO2003088036A1/fr not_active Application Discontinuation
Non-Patent Citations (1)
Title |
---|
HASKINS J W ET AL: "INEXPENSIVE THROUGHPUT ENHANCEMENT IN SMALL-SCALE EMBEDDED MICROPROCESSORS WITH BLOCK MULTITHREADING: EXTENSIONS, CHARACTERIZATION AND TRADEOFFS", CONFERENCE PROCEEDINGS OF THE 2001 IEEE INTERNATIONAL PERFORMANCE, COMPUTING, AND COMMUNICATIONS CONFERENCE. (IPCCC). PHOENIX, AZ, APRIL 4 - 6, 2001, IEEE INTERNATIONAL PERFORMANCE, COMPUTING AND COMMUNICATIONS CONFERENCE, NEW YORK, NY: IEEE, US, vol. CONF. 20, 4 April 2001 (2001-04-04), pages 319 - 328, XP001049966, ISBN: 0-7803-7001-5 * |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7747989B1 (en) | 2002-08-12 | 2010-06-29 | Mips Technologies, Inc. | Virtual machine coprocessor facilitating dynamic compilation |
US9207958B1 (en) | 2002-08-12 | 2015-12-08 | Arm Finance Overseas Limited | Virtual machine coprocessor for accelerating software execution |
US10055237B2 (en) | 2002-08-12 | 2018-08-21 | Arm Finance Overseas Limited | Virtual machine coprocessor for accelerating software execution |
US11422837B2 (en) | 2002-08-12 | 2022-08-23 | Arm Finance Overseas Limited | Virtual machine coprocessor for accelerating software execution |
FR2864660A1 (en) * | 2003-12-30 | 2005-07-01 | St Microelectronics Sa | Multipath processor with dedicated bus |
US7424638B2 (en) | 2003-12-30 | 2008-09-09 | Stmicroelectronics S.A. | Multipath processor with dedicated buses |
EP1703377A2 (en) * | 2005-03-18 | 2006-09-20 | Marvell World Trade Ltd | Multi-thread processor |
EP1703375A2 (en) * | 2005-03-18 | 2006-09-20 | Marvell World Trade Ltd | Real-time control apparatus having a multi-thread processor |
EP1703377A3 (en) * | 2005-03-18 | 2007-11-28 | Marvell World Trade Ltd | Multi-thread processor |
EP1703375A3 (en) * | 2005-03-18 | 2011-05-04 | Marvell World Trade Ltd. | Real-time control apparatus having a multi-thread processor |
US8195922B2 (en) | 2005-03-18 | 2012-06-05 | Marvell World Trade, Ltd. | System for dynamically allocating processing time to multiple threads |
US8468324B2 (en) | 2005-03-18 | 2013-06-18 | Marvell World Trade Ltd. | Dual thread processor |
Also Published As
Publication number | Publication date |
---|---|
AU2003215845A1 (en) | 2003-10-27 |
Similar Documents
Publication | Title |
---|---|
US4648034A (en) | Busy signal interface between master and slave processors in a computer system |
US7590774B2 (en) | Method and system for efficient context swapping |
US6963962B2 (en) | Memory system for supporting multiple parallel accesses at very high frequencies |
US20080270707A1 (en) | Data processor |
JPH0484253A (en) | Bus width control circuit |
WO2001035246A2 (en) | Cache memory system and method for a digital signal processor |
JP2007133456A (en) | Semiconductor device |
JP4226085B2 (en) | Microprocessor and multiprocessor system |
US6915414B2 (en) | Context switching pipelined microprocessor |
US20030177288A1 (en) | Multiprocessor system |
US6101589A (en) | High performance shared cache |
JP2001525568A (en) | Instruction decoder |
JPH0696008A (en) | Information processing device |
US20180293095A1 (en) | Semiconductor device |
WO2003088036A1 (en) | System and method for instruction-level multithreading |
US8402260B2 (en) | Data processing apparatus having address conversion circuit |
US4764866A (en) | Data processing system with pre-decoding of op codes |
US6732235B1 (en) | Cache memory system and method for a digital signal processor |
JP7468112B2 (en) | Interface circuit and method for controlling interface circuit |
US5677859A (en) | Central processing unit and an arithmetic operation processing unit |
US5327565A (en) | Data processing apparatus |
JPH06103223A (en) | Data processing device |
JPS6352240A (en) | Data processing device |
JP2682186B2 (en) | Microprocessor |
JP2696578B2 (en) | Data processing device |
Legal Events
Code | Title | Description |
---|---|---|
AK | Designated states | Kind code of ref document: A1. Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
122 | EP: PCT application non-entry in European phase | |
NENP | Non-entry into the national phase | Ref country code: JP |
WWW | WIPO information: withdrawn in national office | Country of ref document: JP |