CN101957744B - Hardware multithreading control method for microprocessor and device thereof - Google Patents

Hardware multithreading control method for microprocessor and device thereof

Info

Publication number
CN101957744B
Authority
CN
China
Prior art keywords
multithreading
hardware
thread
register
instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201010512737.3A
Other languages
Chinese (zh)
Other versions
CN101957744A (en
Inventor
齐悦
王磊
王惠娟
师立宁
王沁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology Beijing USTB
Original Assignee
University of Science and Technology Beijing USTB
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology Beijing USTB
Priority to CN201010512737.3A
Publication of CN101957744A
Application granted
Publication of CN101957744B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Advance Control (AREA)

Abstract

The invention relates to a hardware multithreading control method for a microprocessor and a device thereof, and belongs to the field of microprocessor architecture. The control method comprises the following steps: multithreaded instruction fetch, multithreaded decoding, multithreaded execution, multithreaded memory access, and multithreaded write-back. The device comprises a hardware multithreading instruction fetch device, a hardware multithreading decoding device, a hardware multithreading execution device, a hardware multithreading memory access device, a hardware multithreading write-back device, a hardware multithreading register set, and a multithreading control device. With this method, a software multithreaded program can be executed on the processor's hardware threads. During execution, memory access latency is effectively hidden, and the saving and restoring of thread state on a thread switch is eliminated, which reduces thread-switching overhead. By using pipelining, n threads can execute in parallel in the time originally needed to execute one thread. Hardware multithreading also effectively avoids the data hazards introduced by deep pipelining, which reduces system design complexity and improves execution efficiency at the hardware level.

Description

A hardware multithreading control method for a microprocessor and a device thereof
Technical field:
The present invention relates to the field of microprocessor architecture, and in particular to a hardware multithreading control method and a device thereof.
Background:
To further improve microprocessor performance, many novel architectures have been proposed, such as multi-core, multithreading, stream processing, PIM, reconfigurable, and polymorphic architectures. These architectures address the challenges of microprocessor development from different angles. Because of the characteristics of software programs and the physical limits of hardware technology, raising the clock frequency alone no longer yields large gains in system performance. The clear trend in architecture development is that multithreading and multi-core have become two key technical directions, and processors with multithreading or multi-core features keep emerging in every field. Pipelining is an important feature that distinguishes RISC processors from CISC processors. With deep pipelining, however, instruction dependences and jumps can greatly reduce pipeline performance. The present invention adopts deep pipelining on a RISC architecture to execute multiple hardware threads, which effectively avoids the performance loss caused by instruction dependences and improves microprocessor performance through multithreading.
Patent published 2008-7-16: CN101221493A, title: "Multithreaded execution in a parallel processor", inventors: D. Bornstein et al. This invention discloses a parallel hardware multithreaded processor. The processor comprises a general-purpose processor that coordinates system functions and a plurality of micro-engines that support multiple hardware threads. The processor also comprises a memory control system with a first memory controller and a second memory controller. The first memory controller sorts memory accesses according to whether they target an even or an odd memory bank; the second memory controller then optimizes memory accesses according to whether they are reads or writes.
Patent published 2007-8-22: CN101021801, title: "Method for mass data transfer between pipelined processes based on message queues", inventors: Xue Qingtong et al. This patent discloses a method for transferring mass data between pipelined processes based on message queues. In a billing-product business flow, the processing of a call detail record is divided into at least the following processes: formatting, normalization (also called sorting), duplicate removal, rating, and storage. The billing method combines single-step and whole-batch submission. Through configuration, different message queue types are used in different environments to allocate tasks automatically and manage load balancing; records are distributed to different message queues according to business logic and deployed in a user-defined manner. With this method, the billing system's mass call-record data is passed between processes entirely through message queues, so processing can be done in memory without system I/O overhead, which greatly increases speed. The message-queue-based pipelined parallel processing scheme markedly improves system processing efficiency. The patent states that its processing speed ranks among the best of domestic and foreign billing vendors.
Patent published 2006-1-25: CN1725176, title: "Method and apparatus for a multi-thread pipelined instruction decoder", inventors: J. P. Douglas et al. This patent discloses a method for a multi-thread pipelined instruction decoder, which clocks, clears, and stalls the instruction decode pipeline of a multithreaded machine so as to obtain the best performance at minimum power. A shadow pipeline mirrors the instruction decode pipeline and holds, for each pipeline stage of the instruction decoder, a thread identifier and an instruction-valid bit. The thread identifier and valid bit are used to control the clocking, clearing, and stalling of each pipeline stage of the instruction decoder. Instructions of one thread can be cleared from the decode pipeline without affecting the instructions of other threads, and in some cases the instructions of one thread can be stalled in the decode pipeline without affecting the instructions of other threads. In that invention, pipeline stages are clocked only when a valid instruction needs to advance, which saves power and minimizes latency.
Patent published 1999-9-15: CN1228557, title: "Multithreaded instruction-level parallelism technique for computer processors", inventors: Liu Yin et al. This patent discloses a multithreaded instruction-level parallelism technique applicable to computer processors. A processor using this technique fetches instructions alternately from the threads that are in the executing state, so that the instructions executing in parallel in the processor come from different threads and there is no inter-instruction dependence among them.
Patent published 2007-6-6: CN1975663, title: "Apparatus with asymmetric hardware multithreading support for different threads", inventor: David A. Kra. This patent provides asymmetric hardware support for a special class of threads. Preferably, the special class consists of high-priority, I/O-bound threads. In a first aspect, a multithreaded processor contains N register sets to support the concurrent execution of N threads. At least one register set is dedicated to threads of the special class and cannot be used by other threads, even when it is idle. In a second aspect, threads of the special class may fill only a limited portion of the cache, which reduces the cache flushing that might otherwise occur.
Patent published 2005-12-14: CN1707694, title: "Memory controller for a multithreaded pipelined bus system", inventors: Xu Yunfan et al. This patent discloses a memory controller for a multithreaded pipelined bus system. In the memory control method, addresses for multiple banks of a memory unit to be accessed are received from a host. For each of these banks, when a read/write command is output to the memory unit, it is determined whether an address corresponding to that bank has been input from the host. When the determination shows that an address corresponding to that bank has been input, a read/write command containing either open-page information or auto-precharge information is output to the memory unit.
Patent published 1999-10-20: CN1232219, title: "Pipelined multiprocessor system", inventor: Xiao Chi mediocre person. This patent discloses a pipelined multiprocessor system comprising a set of processor units, a set of buffers, and a debugging unit. The processor units process data in a pipeline; the buffers hold the input data and the results of each processor unit; the buffers and processor units are cascaded alternately between the data input and the data output; and the debugging unit selectively outputs the result of each processor unit externally for monitoring during debugging.
Summary of the invention:
The object of the invention is to provide a hardware multithreading control method for a microprocessor, designed for executing software multithreaded programs, and a corresponding hardware multithreading control device.
A hardware multithreading control method for a microprocessor, characterized in that the method comprises the following steps:
1) A multithreaded instruction fetch step, used to read the instructions of each thread and to generate the instruction address of each thread. It specifically comprises multithreaded instruction address control, multithreaded instruction address buffering, and multithreaded instruction fetch:
A) multithreaded instruction address control, used to generate the instruction address of each thread; when a thread blocks, only that thread's instruction address is held, and the instruction addresses of the other threads are updated normally;
B) multithreaded instruction address buffering, used to store the instruction addresses of the n threads;
C) multithreaded instruction fetch, in which the fetch logic is divided evenly into an n-stage pipeline and the software-thread instructions corresponding to the n hardware threads are fetched; when a thread blocks, only that thread's fetch is blocked, and the fetches of the other threads proceed normally.
2) A multithreaded decoding step, used to decode the instructions of each hardware thread and to prepare the register data needed by the multithreaded execution step. It specifically comprises multithreaded decoding, multithreaded register operand preparation, and decode-unit data bypass control:
A) multithreaded decoding, in which the decode logic is divided evenly into an n-stage pipeline and decodes both the multithreading initialization special instruction and conventional instructions. When a thread blocks, only that thread's instruction decoding is blocked, and the decoding of the other threads' instructions proceeds normally. The multithreading initialization special instruction comprises: an opcode field that identifies the instruction, an operand field for the destination operand, and an operand field for the source operand;
B) multithreaded register operand preparation, used to generate the addresses of the registers to be read and to read the operands required by the instruction from the n register groups;
C) decode-unit data bypass control, used to supply bypassed data to a given pipeline stage of instruction decoding;
3) A multithreaded execution step, used to execute the instructions of each thread. It specifically comprises execution of the multithreading initialization special instruction, thread number buffering, execution-unit data bypass control, and multithreaded execution of conventional instructions:
A) execution of the multithreading initialization special instruction, used to generate a new hardware thread number, which corresponds to the software thread executed by that hardware thread;
B) thread number buffering, used to buffer the new hardware thread number generated by executing the multithreading initialization special instruction, such that the position of a newly generated hardware thread number in the thread number register sequence is the same as the position of that hardware thread in the instruction address register sequence, the instruction fetch unit multithreading register sequence, the decode unit multithreading register sequence, the execution unit multithreading register sequence, the memory access unit multithreading register sequence, and the write-back unit multithreading register sequence;
C) execution-unit data bypass control, used to supply bypassed data to a given pipeline stage of instruction execution;
D) multithreaded execution of conventional instructions, in which the execution logic is divided evenly into an n-stage pipeline and executes the conventional instructions of the n hardware threads; when a thread blocks, only that thread's instruction execution is blocked, and the execution of the other threads' instructions proceeds normally.
4) A multithreaded memory access step, in which the memory access logic is divided evenly into an n-stage pipeline; the execution result of each thread is written to memory, or the data a thread needs is read from memory; when a thread blocks, only that thread's data memory access is blocked, and the memory accesses of the other threads proceed normally.
5) A multithreaded write-back step, in which the write-back logic is divided evenly into an n-stage pipeline and the execution result of each thread is written back to the corresponding register group; when a thread blocks, only that thread's write-back is blocked, and the write-backs of the other threads proceed normally. It specifically comprises multithreaded write-back data control and multithreaded write-back register address control:
A) multithreaded write-back data control, used to prepare and output the register data to be written back;
B) multithreaded write-back register address control, used to prepare and output the register address to be written back.
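The blocking rule that recurs in the steps above (a blocked thread is held while the other threads proceed) can be illustrated with a minimal Python sketch of the address-control part of the fetch step. The names `ThreadSlot`, `step_fetch`, and `run`, the round-robin order, the starting addresses, and the 4-byte instruction size are assumptions made for illustration and are not taken from the patent text.

```python
from dataclasses import dataclass

@dataclass
class ThreadSlot:
    pc: int = 0            # instruction address of this hardware thread
    blocked: bool = False  # True while the thread is blocked

def step_fetch(slots, cycle, instr_bytes=4):
    """Serve the thread whose turn it is in this cycle.

    Returns (thread_index, fetched_pc), or (thread_index, None) when that
    thread is blocked. Round-robin order is an assumed policy.
    """
    i = cycle % len(slots)
    slot = slots[i]
    if slot.blocked:
        return i, None          # only this thread's address is held
    fetched = slot.pc
    slot.pc += instr_bytes      # other threads keep updating normally
    return i, fetched

def run(n, cycles, blocked_thread=None):
    """Run `cycles` fetch turns over n threads, optionally blocking one thread."""
    slots = [ThreadSlot(pc=0x100 * t) for t in range(n)]  # hypothetical start addresses
    if blocked_thread is not None:
        slots[blocked_thread].blocked = True
    trace = [step_fetch(slots, c) for c in range(cycles)]
    return slots, trace
```

Calling `run(4, 8, blocked_thread=2)` leaves the address of thread 2 unchanged while the other three threads each advance by two instructions.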
A hardware multithreading control device for a microprocessor, which uses pipelining to support the parallel execution of n hardware threads, characterized in that it comprises the following components: a hardware multithreading instruction fetch device, a hardware multithreading decoding device, a hardware multithreading execution device, a hardware multithreading memory access device, a hardware multithreading write-back device, a hardware multithreading register set, and a multithreading control device.
1) The hardware multithreading instruction fetch device comprises: an instruction address control device, used to generate the instruction address of each thread; an instruction address register sequence, used to store the instruction addresses of the n threads; and an instruction fetch unit multithreading register sequence, used to hold the intermediate results of the n-stage fetch pipeline, where each register stage holds the partial fetch-logic output of one hardware thread;
2) The hardware multithreading decoding device comprises: a decode unit multithreading register sequence, used to hold the intermediate results of the n-stage decode pipeline, where each register stage holds the partial decode-logic output of one hardware thread; and a data bypass register sequence, used to store the intermediate execution results of the two preceding instructions of each thread;
3) The hardware multithreading execution device comprises: a multithreading initialization special instruction execution device, used to generate a hardware thread number, which corresponds to the software thread executed by that hardware thread; a thread number register sequence, used to buffer the generated hardware thread numbers; an execution unit multithreading register sequence, used to hold the intermediate results of the n-stage execution pipeline, where each register stage holds the partial execution-logic output of one hardware thread; and a data bypass register sequence, used to store the intermediate execution results of the two preceding instructions of each thread;
4) The hardware multithreading memory access device comprises: a memory access unit multithreading register sequence, used to hold the intermediate results of the n-stage memory access pipeline, where each register stage holds the partial memory-access-logic output of one hardware thread;
5) The hardware multithreading write-back device comprises: a write-back unit multithreading register sequence, used to hold the intermediate results of the n-stage write-back pipeline, where each register stage holds the partial write-back-logic output of one hardware thread;
6) The hardware multithreading register set comprises: a decoder, which, according to the register control signal generated by the multithreading control device and the address signal generated by the hardware multithreading decoding device or the hardware multithreading write-back device, decodes and outputs the register-group enable signal of the current thread and the address of the register currently being operated on; a multiplexer, which, according to the register control signal generated by the multithreading control device, selects the data of the current thread's register group and outputs it; and n register groups, one for each of the n threads and independent of one another, whose write data comes from the output of the hardware multithreading write-back device and whose read data goes to the hardware multithreading decoding device;
7) The multithreading control device is used to generate the following control signals: a fetch control signal, output to the hardware multithreading instruction fetch device; a decode control signal, output to the hardware multithreading decoding device; an execution control signal, output to the hardware multithreading execution device; a memory access control signal, output to the hardware multithreading memory access device; a write-back control signal, output to the hardware multithreading write-back device; and a register control signal, output to the hardware multithreading register set.
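The register set described in item 6 (a decoder that enables one thread's register group, a multiplexer that selects that group's read data, and n independent register groups) can be modeled behaviorally as follows. This is a rough sketch: the class name `MultithreadRegisterSet`, the 32-register group size, and the use of a thread index as the register control signal are assumptions for illustration only.

```python
class MultithreadRegisterSet:
    """Behavioral model of n independent register groups, one per hardware thread."""

    def __init__(self, n_threads, regs_per_group=32):  # group size is an assumption
        self.groups = [[0] * regs_per_group for _ in range(n_threads)]

    def _decode(self, thread_id, reg_addr):
        # Decoder: one-hot enable for the current thread's group, plus the
        # register address within that group.
        enables = [t == thread_id for t in range(len(self.groups))]
        return enables, reg_addr

    def write(self, thread_id, reg_addr, value):
        # Write path: data arrives from the write-back device.
        enables, addr = self._decode(thread_id, reg_addr)
        for group, enabled in zip(self.groups, enables):
            if enabled:
                group[addr] = value

    def read(self, thread_id, reg_addr):
        # Multiplexer: select the current thread's group; data goes to the decoding device.
        return self.groups[thread_id][reg_addr]
```

Because each thread has its own group, a write by one thread is invisible to the others, which is why no register state has to be saved or restored when execution moves from one thread to another.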
One advantage of the invention is that a software multithreaded program can be executed using the processor's hardware multithreading. Memory access latency is effectively hidden during execution, and the saving and restoring of thread state on a thread switch is eliminated, which reduces thread-switching overhead, improves execution efficiency, and lowers power consumption.
Another advantage of the invention is that, through pipelining, n threads can execute in parallel in the time originally needed to execute one thread, which improves execution efficiency at the hardware level.
A further advantage of the invention is that hardware multithreading effectively avoids the data hazards introduced by deep pipelining, which reduces system design complexity and improves system execution efficiency.
Description of the drawings
Fig. 1 is a diagram of a typical MIPS processor pipeline architecture.
Fig. 2 is a diagram of the hardware multithreading control device for a microprocessor.
Fig. 3 is a diagram of the hardware multithreading instruction fetch device within the hardware multithreading device.
Fig. 4 is a diagram of the hardware multithreading decoding device within the hardware multithreading device.
Fig. 5 is a diagram of the hardware multithreading execution device within the hardware multithreading device.
Fig. 6 is a diagram of the hardware multithreading memory access device within the hardware multithreading device.
Fig. 7 is a diagram of the hardware multithreading write-back device within the hardware multithreading device.
Fig. 8 is an explanatory diagram of the hardware multithreading register set within the hardware multithreading device.
Fig. 9 shows the encoding format of the multithreading initialization special instruction.
Fig. 10 is a block diagram of the hardware multithreading control method for a microprocessor.
Fig. 11 is a diagram of the hardware multithreading pipeline partitioning and clocks.
Fig. 12 is a timing diagram of hardware multithreaded execution.
Embodiment
The implementation of the invention is described in detail below with reference to the accompanying drawings.
Fig. 1 shows a typical MIPS processor pipeline architecture. The execution of an instruction is divided into a five-stage pipeline: instruction fetch (IF), decode (ID), execute (EX), memory access (MEM), and write-back (WB). This patent builds on this pipeline, deepening the number of pipeline stages to support hardware multithreaded execution.
Fig. 2 is a schematic diagram of the overall structure of the hardware multithreading control device for a microprocessor according to the invention. It comprises a hardware multithreading instruction fetch device 201, a hardware multithreading decoding device 202, a hardware multithreading execution device 203, a hardware multithreading memory access device 204, a hardware multithreading write-back device 205, a hardware multithreading register set 206, and a multithreading control device 207. Each device supports the parallel execution of n hardware threads, and the devices operate synchronously.
The hardware multithreading instruction fetch device 201, according to the fetch control signal output by the multithreading control device 207, updates the instruction addresses of the n hardware threads, stores the instruction addresses, performs the fetch operations of the n hardware threads, and outputs the instructions to the hardware multithreading decoding device 202. The hardware multithreading decoding device 202, according to the instructions of the n threads output by the instruction fetch device 201 and the decode control signal generated by the multithreading control device 207, decodes the instructions of the n hardware threads and outputs the control information and data information of the pending operations to the hardware multithreading execution device 203. The hardware multithreading execution device 203, according to the control information and data information output by the decoding device 202 and the execution control signal generated by the multithreading control device 207, executes the instructions of the n threads and outputs the data of the execution results to the hardware multithreading memory access device 204. The hardware multithreading memory access device 204, according to the memory access control signal generated by the multithreading control device 207, performs the memory access operations of the n hardware threads and outputs the data from the execution device 203 to the hardware multithreading write-back device 205. The hardware multithreading write-back device 205, according to the write-back control signal generated by the multithreading control device 207, outputs the data to be written back and the address of the register currently to be written back to the hardware multithreading register set 206. The hardware multithreading register set 206, according to the register control signal output by the multithreading control device 207, works with the decoding device 202 and the write-back device 205 to perform the register-group reads and writes of the n hardware threads. The multithreading control device 207 controls the operation of every device in the hardware multithreading device. Specifically, it generates the following control signals: a fetch control signal output to the instruction fetch device 201, a decode control signal output to the decoding device 202, an execution control signal output to the execution device 203, a memory access control signal output to the memory access device 204, a write-back control signal output to the write-back device 205, and a register control signal output to the register set 206.
Fig. 3 is a structural diagram of the hardware multithreading instruction fetch device 201 of Fig. 2. The hardware multithreading instruction fetch device 201 comprises an instruction address control device 301, an instruction address register sequence 302, and an instruction fetch unit multithreading register sequence 303. The instruction address control device 301 generates the instruction address of each thread; each time, it generates the next instruction address of the thread currently executing in fetch logic IF1 of Fig. 3. Suppose that at time t the thread executing in IF1 has number i. In that clock cycle the instruction address control device 301 generates the next sequential instruction address of thread i, and this address is output to pc_DFF_1 of sequence 302 in the next clock cycle, t+1. In cycle t+1, the value that pc_DFF_n held in cycle t (the address of the instruction currently being executed by thread i+1) is taken into the instruction address control device 301, which generates the next sequential instruction address of thread i+1. The instruction address register sequence 302 stores the instruction addresses of the n threads; from the pc_DFF_1 register to the pc_DFF_n register it holds, in order, the next instruction addresses of thread j, thread j-1, ..., thread 1, thread n, thread n-1, ..., thread j+1. The instruction fetch unit multithreading register sequence 303 holds the intermediate results of the n-stage fetch pipeline; each register stage stores the output of partial fetch logic IFj (j = 1, 2, ..., n). The thread numbers of the data held in registers IF_DFF_1 through IF_DFF_n of sequence 303 match the thread numbers of registers pc_DFF_1 through pc_DFF_n of the instruction address register sequence 302. That is, if the pc_DFF_k register holds the instruction address of thread i, then the IF_DFF_k register holds the IFk fetch-logic output of thread i.
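The rotation of instruction addresses through pc_DFF_1 to pc_DFF_n described for Fig. 3 can be sketched in Python as a circular shift register. This is a simplified model under stated assumptions: only sequential addresses are generated (jumps and blocking are left out), the instruction size is taken as 4 bytes, and the helper names `make_sequence` and `clock` are hypothetical.

```python
from collections import deque

def make_sequence(start_pcs):
    """Build pc_DFF_1..pc_DFF_n (index 0 is pc_DFF_1) for threads 1..n.

    Threads are laid out n, n-1, ..., 1, so pc_DFF_n holds thread 1.
    """
    n = len(start_pcs)
    return deque((t, start_pcs[t - 1]) for t in range(n, 0, -1))

def clock(seq, instr_bytes=4):
    """One clock cycle of the address control device.

    Takes the (thread, address) pair held in pc_DFF_n, computes the next
    sequential address, and shifts it into pc_DFF_1. Returns the pair
    that was taken out.
    """
    thread, pc = seq.pop()                      # value held in pc_DFF_n
    seq.appendleft((thread, pc + instr_bytes))  # arrives in pc_DFF_1 next cycle
    return thread, pc
```

After n calls to `clock` every thread has advanced by one instruction and the threads occupy the same register positions as before, which matches the ordering thread j, thread j-1, ..., thread j+1 given in the text.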
Fig. 4 is a structural diagram of the hardware multithreading decoding device 202 of Fig. 2. The hardware multithreading decoding device 202 comprises a decode unit multithreading register sequence 401 and a decode unit data bypass register sequence 402. The decode unit multithreading register sequence 401 holds the intermediate results of the n-stage decode pipeline; each register stage stores the output of partial decode logic IDj. The thread numbers of the data held in registers ID_DFF_1 through ID_DFF_n of sequence 401 match the thread numbers of registers IF_DFF_1 through IF_DFF_n of the instruction fetch unit multithreading register sequence 303. That is, if the IF_DFF_k register holds the IFk fetch-logic output of thread i, then the ID_DFF_k register holds the IDk decode-logic output of thread i. The decode unit data bypass register sequence 402 stores the intermediate execution results of the two preceding instructions of each of the n threads.
Fig. 5 is a structural diagram of the hardware multithreading execution device 203 of Fig. 2. The hardware multithreading execution device 203 comprises a multithreading initialization special instruction execution device 501, a thread number register sequence 502, an execution unit multithreading register sequence 503, and an execution unit data bypass register sequence 504. The multithreading initialization special instruction execution device 501 generates the hardware thread numbers. The thread number register sequence 502 is an n-stage register buffer, arranged so that the position Th_DFF_i of a generated hardware thread number in sequence 502 is the same as the position of that hardware thread in the instruction address register sequence (302), the instruction fetch unit multithreading register sequence (303), the decode unit multithreading register sequence (401), the execution unit multithreading register sequence (503), the memory access unit multithreading register sequence (601), and the write-back unit multithreading register sequence (701). The execution unit multithreading register sequence 503 holds the intermediate results of the n-stage execution pipeline; each register stage stores the output of partial execution logic Ej. The thread numbers of the data held in registers E_DFF_1 through E_DFF_n of sequence 503 match the thread numbers of registers IF_DFF_1 through IF_DFF_n of the instruction fetch unit multithreading register sequence 303. That is, if the IF_DFF_k register holds the IFk fetch-logic output of thread i, then the E_DFF_k register holds the Ek execution-logic output of thread i. The execution unit data bypass register sequence 504 stores the intermediate execution results of the two preceding instructions of each of the n threads.
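As a rough illustration of a data bypass register sequence that keeps the results of the two preceding instructions of each thread, the following Python sketch uses one two-entry buffer per thread. The patent text states only what is stored; the lookup policy (newest matching entry wins, otherwise fall back to the register value) and the names `BypassRegisters`, `record`, and `lookup` are assumptions.

```python
from collections import deque

class BypassRegisters:
    """Per-thread buffer of the results of the two most recent instructions."""

    def __init__(self, n_threads, depth=2):
        # One bounded buffer per hardware thread; the oldest entry drops out automatically.
        self.entries = [deque(maxlen=depth) for _ in range(n_threads)]

    def record(self, thread, dest_reg, value):
        # Store the newest result at the front of this thread's buffer.
        self.entries[thread].appendleft((dest_reg, value))

    def lookup(self, thread, src_reg, register_value):
        # Assumed policy: prefer the newest bypassed result for src_reg,
        # otherwise use the value read from the register group.
        for dest_reg, value in self.entries[thread]:
            if dest_reg == src_reg:
                return value
        return register_value
```

Keeping the buffers separate per thread means a result produced by one thread is never forwarded to an instruction of another thread.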
Fig. 6, read together with Fig. 2, is a structural diagram of the hardware multithreading memory access device 204 within the overall hardware multithreading control device for a microprocessor. The hardware multithreading memory access device 204 comprises a memory access unit multithreading register sequence 601. This sequence temporarily holds the intermediate results of the n-stage pipelined memory access logic, with each register stage storing the output of partial memory access logic Mj. The thread numbers associated with the data held in registers M_DFF_1 through M_DFF_n of sequence 601 match the thread numbers associated with registers IF_DFF_1 through IF_DFF_n of the instruction fetch unit multithreading register sequence 303; that is, if register IF_DFF_k holds the IFk fetch logic output of thread i, then register M_DFF_k holds the Mk memory access logic output of thread i.
Fig. 7, read together with Fig. 2, is a structural diagram of the hardware multithreading write-back device 205 within the overall hardware multithreading control device for a microprocessor. The hardware multithreading write-back device 205 comprises a write-back unit multithreading register sequence 701. This sequence temporarily holds the intermediate results of the n-stage pipelined write-back logic, with each register stage storing the output of partial write-back logic Wj. The thread numbers associated with the data held in registers W_DFF_1 through W_DFF_n of sequence 701 match the thread numbers associated with registers IF_DFF_1 through IF_DFF_n of the instruction fetch unit multithreading register sequence 303; that is, if register IF_DFF_k holds the IFk fetch logic output of thread i, then register W_DFF_k holds the Wk write-back logic output of thread i.
Fig. 8, read together with Fig. 2, is a structural diagram of the hardware multithreading register set 206 within the overall hardware multithreading control device for a microprocessor. The hardware multithreading register set 206 comprises a decoder 801, a multiplexer 802, and n register sets 803. The decoder 801 decodes the register control signal produced by the multithreading control device 207 together with the address signal produced by the hardware multithreading decoding device 202 or the hardware multithreading write-back device 205, and outputs an enable signal that activates the register set Regs_i of the current thread along with the register address of the current operation. The multiplexer 802 selects the data output of the current thread's register set according to the register control signal produced by the multithreading control device 207. The n register sets 803 are assigned one to each of the n threads and are independent of one another; write data comes from the output of the hardware multithreading write-back device 205, and read data is sent to the hardware multithreading decoding device 202.
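The roles of the decoder 801, the multiplexer 802 and the n register sets 803 can be sketched in software. This is a behavioral illustration only, not the circuit of Fig. 8: the class name MultithreadRegisterFile, its method names, the zero-based thread index, and the choice of 8 registers per set are all assumptions made for the example.

```python
class MultithreadRegisterFile:
    """Behavioral sketch of n independent register sets, one per hardware thread."""

    def __init__(self, n_threads, regs_per_set):
        # n register sets 803, each private to one thread.
        self.sets = [[0] * regs_per_set for _ in range(n_threads)]

    def decode(self, thread, addr):
        """Role of decoder 801: produce a one-hot enable for the current
        thread's register set plus the register address being operated on."""
        enables = [i == thread for i in range(len(self.sets))]
        return enables, addr

    def write(self, thread, addr, value):
        """Write-back path: only the enabled register set is updated."""
        enables, reg = self.decode(thread, addr)
        for enabled, regs in zip(enables, self.sets):
            if enabled:
                regs[reg] = value

    def read(self, thread, addr):
        """Role of multiplexer 802: select the output of the current
        thread's register set and pass it to the decoding stage."""
        return self.sets[thread][addr]


# Hypothetical sizes: 4 threads, 8 registers per set.
rf = MultithreadRegisterFile(n_threads=4, regs_per_set=8)
rf.write(thread=1, addr=3, value=42)
```

Reading register 3 for thread 1 returns the written value, while the same register address in any other thread's set is unaffected, which is the isolation property the independent register sets provide.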
The present invention enables hardware multithreaded execution on a microprocessor, and maps software multithreading onto hardware multithreading by providing a multithreading initialization special instruction. Fig. 9 shows the encoding format of this hardware multithreading initialization special instruction. It comprises: an opcode field 901, which identifies the instruction; an operand field 902, which specifies the destination operand, where the destination operand may be a register in the register set corresponding to this hardware thread and is used to identify the hardware thread number during the initialization phase; and an operand field 903, which specifies the source operand, where the source operand may be the newly generated thread number.
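As an illustration of a three-field encoding of this kind, the sketch below packs and unpacks an opcode field, a destination operand field and a source operand field. The patent text does not give field widths, bit positions or an opcode value, so the 32-bit word, the 6/5/5-bit widths, the bit offsets and the constant THREAD_INIT_OPCODE are all hypothetical choices made only for this example.

```python
# Hypothetical layout of a 32-bit instruction word (not taken from the patent):
#   bits 31..26  opcode field (901)
#   bits 25..21  destination operand field (902)
#   bits 20..16  source operand field (903)
OPCODE_SHIFT, OPCODE_MASK = 26, 0x3F
DST_SHIFT, DST_MASK = 21, 0x1F
SRC_SHIFT, SRC_MASK = 16, 0x1F

THREAD_INIT_OPCODE = 0b111111  # hypothetical opcode value


def encode(opcode, dst, src):
    """Pack the three fields into one instruction word."""
    if not (0 <= opcode <= OPCODE_MASK and 0 <= dst <= DST_MASK and 0 <= src <= SRC_MASK):
        raise ValueError("field value out of range")
    return (opcode << OPCODE_SHIFT) | (dst << DST_SHIFT) | (src << SRC_SHIFT)


def decode(word):
    """Unpack an instruction word into (opcode, dst, src)."""
    return (
        (word >> OPCODE_SHIFT) & OPCODE_MASK,
        (word >> DST_SHIFT) & DST_MASK,
        (word >> SRC_SHIFT) & SRC_MASK,
    )


word = encode(THREAD_INIT_OPCODE, dst=3, src=2)
fields = decode(word)
```

A decoder would compare the opcode field against the special instruction's opcode to distinguish it from conventional instructions before interpreting the two operand fields.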
Fig. 10 is a block diagram of the hardware multithreading control method. The multithreading instruction fetch step 1001 reads the instructions of each thread and generates the instruction address of each thread; it comprises multithreading instruction address control, multithreading instruction address buffering, and multithreading instruction fetch. The multithreading decoding step 1002 decodes the instructions of each hardware thread and prepares the register data required by the execution step; it comprises multithreading decoding, multithreading register operand preparation, and decoding unit data bypass control. The multithreading execution step 1003 executes the instructions of each thread; it comprises multithreading initialization special instruction execution, thread number buffering, execution unit data bypass control, and multithreading conventional instruction execution. The multithreading memory access step 1004 writes the execution results of each thread to memory or reads the data a thread requires from memory. The multithreading write-back step 1005 writes the execution results of each thread back to the registers; it comprises multithreading write-back data control and multithreading write-back register address control.
Taking four hardware threads (n = 4) as an example, the implementation timing of the present invention is described below.
Fig. 11 is a schematic diagram of the hardware multithreading pipeline division and clocks. Taking the instruction fetch logic 101 as an example, Fig. 11 divides the fetch logic IF (101) evenly into 4 pipeline stages, so that the delay of each stage becomes 1/4 of the delay of the original single-stage logic. If the clock period of the IF logic in Fig. 1 is T, corresponding to the clock clk1 in Fig. 11, then the clock for four hardware threads can be as shown by clk2, whose frequency is 4 times that of clk1. As a result, within the time in which the single-stage pipeline executes one thread, the four-thread hardware can execute 4 threads in parallel.
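The clock relationship described for Fig. 11 can be checked with simple arithmetic. The sketch below assumes, purely for illustration, a clk1 period of T = 8 ns; the patent gives no concrete value, and the helper names split_stage and frequency_mhz are hypothetical.

```python
def split_stage(period_ps: int, n: int) -> int:
    """Delay of each sub-stage when one stage with delay period_ps
    is divided evenly into n pipeline stages."""
    if n <= 0 or period_ps % n != 0:
        raise ValueError("period must divide evenly into n sub-stages")
    return period_ps // n


def frequency_mhz(period_ps: int) -> float:
    """Clock frequency in MHz for a period given in picoseconds."""
    return 1_000_000 / period_ps


T_PS = 8_000  # assumed clk1 period T = 8 ns, expressed in picoseconds
N = 4         # four hardware threads, as in the example

clk2_period_ps = split_stage(T_PS, N)     # each sub-stage takes T / 4
clk1_mhz = frequency_mhz(T_PS)
clk2_mhz = frequency_mhz(clk2_period_ps)  # 4 times the clk1 frequency

# Number of clk2 cycles that fit in one clk1 period; each cycle can hold
# a different thread in the first sub-stage of the divided logic.
slots_per_T = T_PS // clk2_period_ps
```

With these assumed numbers, clk1 runs at 125 MHz and clk2 at 500 MHz, so four thread slots fit in one original period T.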
Fig. 12 is a schematic diagram of the hardware multithreading execution timing. Taking n = 4 as an example, #1, #2, #3 and #4 correspond to 4 different hardware threads, and all hardware threads begin fetching and executing from the same instruction address. Suppose execution begins at time T1, when hardware thread #1 executes the first step IF1 of the fetch logic for the first instruction. At time T2, hardware thread #1 executes the second step IF2 of the fetch logic for the first instruction, while hardware thread #2 executes the first step IF1 of the fetch logic for the first instruction. Continuing in this way, at time T5 hardware thread #1 executes the second step ID2 of the decoding logic for the first instruction and the second step IF2 of the fetch logic for the second instruction, while hardware thread #2 executes the first step ID1 of the decoding logic for the first instruction and the first step IF1 of the fetch logic for the second instruction. Suppose the first instruction is the multithreading initialization special instruction. Then hardware thread #2 begins fetching the special instruction at time T2, begins decoding it at time T5, and begins executing it at time T6, producing the corresponding hardware thread number 2. At time T7 this hardware thread number 2 is passed through the data bypass to the execution and decoding stages of the second and third instructions of hardware thread #2. At time T8 hardware thread #2 begins the write-back logic of the initialization special instruction, and at time T9 the write-back logic completes.

Because the initialization special instruction has computed hardware thread number 2 by time T7, and this number can be passed through the data bypass to the execution and decoding stages of the second and third instructions of hardware thread #2, a conditional jump instruction in the third position begins decoding at time T7 and decides whether to jump by examining the hardware thread number from the data bypass, that is, by comparing the thread number with 2. For hardware thread #2, the hardware thread number equals 2 and the jump is taken; for the other threads, the hardware thread number is not equal to 2, so no jump is taken and execution continues sequentially. The threads are thereby separated. The conditional jump instruction need not be the third instruction; any instruction after the third will also work, because the hardware thread number has already been computed by time T7 and can be used by a conditional jump. Other cases of thread separation are described similarly. In summary, if the first instruction is the multithreading initialization special instruction, then the third instruction or any later instruction can separate the threads by means of a conditional jump.
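The thread separation described above can be sketched functionally. The toy interpreter below ignores pipeline timing and the data bypass entirely and only shows the control-flow effect: every hardware thread runs the same program, the first instruction records the thread's own number, and the third instruction is a conditional jump that compares that number with 2. The mnemonics tinit, nop, beq, trace and halt, the register name r1, and the function run_thread are all hypothetical and are not taken from the patent.

```python
def run_thread(hw_thread_number):
    """Run the shared program for one hardware thread and return its trace."""
    program = [
        ("tinit", "r1"),          # 1st instruction: write own thread number to r1
        ("nop",),                 # 2nd instruction
        ("beq", "r1", 2, 5),      # 3rd instruction: jump to index 5 if r1 == 2
        ("trace", "common path"),
        ("halt",),
        ("trace", "thread-2 path"),
        ("halt",),
    ]
    regs = {"r1": 0}
    trace = []
    pc = 0
    while True:
        op = program[pc]
        if op[0] == "tinit":
            regs[op[1]] = hw_thread_number
            pc += 1
        elif op[0] == "nop":
            pc += 1
        elif op[0] == "beq":
            # Jump is taken only by the thread whose number matches.
            pc = op[3] if regs[op[1]] == op[2] else pc + 1
        elif op[0] == "trace":
            trace.append(op[1])
            pc += 1
        elif op[0] == "halt":
            return trace


# All four threads start at the same instruction address (index 0).
traces = {n: run_thread(n) for n in (1, 2, 3, 4)}
```

Only the thread whose number equals 2 reaches the separate code path; the remaining threads fall through and continue sequentially, which mirrors the separation behavior described for the conditional jump.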

Claims (6)

1. A hardware multithreading control method for a microprocessor, characterized by comprising the following steps:
1) a multithreading instruction fetch step (1001), for reading the instructions of each thread and generating the instruction address of each thread;
wherein the multithreading instruction fetch step (1001) comprises multithreading instruction address control, multithreading instruction address buffering, and multithreading instruction fetch;
(1) multithreading instruction address control, for generating the instruction address of each thread; when a thread is blocked, only the instruction address update of that thread is blocked, and the instruction addresses of the other threads are updated normally;
(2) multithreading instruction address buffering, for storing the instruction addresses of the n threads;
(3) multithreading instruction fetch, for dividing the fetch logic (101) evenly into n pipeline stages and fetching the software thread instructions corresponding to the n hardware threads;
2) a multithreading decoding step (1002), for decoding the instructions of each hardware thread and preparing the register data required by the multithreading execution step (1003);
wherein the multithreading decoding step (1002) comprises multithreading decoding, multithreading register operand preparation, and decoding unit data bypass control;
(1) multithreading decoding, for dividing the decoding logic (102) evenly into n pipeline stages and performing decoding of the multithreading initialization special instruction and of conventional instructions; when a thread is blocked, only the instruction decoding of that thread is blocked, and the instruction decoding of the other threads proceeds normally;
the multithreading initialization special instruction comprises: an opcode field (901), for identifying the instruction; an operand field (902), for specifying the destination operand; and an operand field (903), for specifying the source operand;
(2) multithreading register operand preparation, for selecting the register set of the corresponding thread from the n register sets and reading the operands required by the instruction;
(3) decoding unit data bypass control, for supplying the data in the data bypass to a given pipeline stage of instruction decoding;
3) a multithreading execution step (1003), for executing the instructions of each thread;
wherein the multithreading execution step (1003) comprises multithreading initialization special instruction execution, thread number buffering, execution unit data bypass control, and multithreading conventional instruction execution;
(1) multithreading initialization special instruction execution, for generating a new hardware thread number, the new hardware thread number corresponding to the software thread executed by this hardware thread;
(2) thread number buffering, for buffering the new hardware thread number generated by execution of the multithreading initialization special instruction, so that the position of a newly generated hardware thread number in the thread number register sequence (502) is consistent with the position of that hardware thread in the instruction address register sequence (302), the instruction fetch unit multithreading register sequence (303), the decoding unit multithreading register sequence (401), the execution unit multithreading register sequence (503), the memory access unit multithreading register sequence (601), and the write-back unit multithreading register sequence (701);
(3) execution unit data bypass control, for supplying the data in the data bypass to a given pipeline stage of instruction execution;
(4) multithreading conventional instruction execution, for dividing the execution logic (103) evenly into n pipeline stages and executing the conventional instructions of the n hardware threads; when a thread is blocked, only the instruction execution of that thread is blocked, and the instruction execution of the other threads proceeds normally;
4) a multithreading memory access step (1004), for dividing the memory access logic (104) evenly into n pipeline stages and writing the execution results of each thread to memory or reading the data required by a thread from memory; when a thread is blocked, only the data memory access of that thread is blocked, and the data memory accesses of the other threads proceed normally;
5) a multithreading write-back step (1005), for writing the execution results of each thread back to the register set;
wherein the multithreading write-back step (1005) divides the write-back logic (105) evenly into n pipeline stages; when a thread is blocked, only the data write-back of that thread is blocked, and the data write-back of the other threads proceeds normally; and the multithreading write-back step (1005) comprises multithreading write-back data control and multithreading write-back register address control;
(1) multithreading write-back data control, for preparing and outputting the register data to be written back;
(2) multithreading write-back register address control, for preparing and outputting the register address to be written back.
2. A hardware multithreading control device for a microprocessor, which supports parallel execution of n hardware threads by means of pipelining, characterized in that the device comprises a hardware multithreading instruction fetch device (201), a hardware multithreading decoding device (202), a hardware multithreading execution device (203), a hardware multithreading memory access device (204), a hardware multithreading write-back device (205), a hardware multithreading register set (206), and a multithreading control device (207);
the hardware multithreading instruction fetch device (201) is used to perform the instruction fetch operations of the n hardware threads, output the instructions to the hardware multithreading decoding device (202), update the next instruction address of each of the n hardware threads according to the fetch control signal output by the multithreading control device (207), and store the instruction addresses;
the hardware multithreading decoding device (202) is used to receive the instructions of the n threads from the hardware multithreading instruction fetch device (201) and the decoding control signal from the multithreading control device (207), perform the decoding operations of the n hardware threads, and output the control information and data information of the operations to be performed to the hardware multithreading execution device (203);
the hardware multithreading execution device (203) is used to receive the control information and data information of the operations from the hardware multithreading decoding device (202) and the execution control signal from the multithreading control device (207), execute the instructions of the n threads, and output the data information of the execution results to the hardware multithreading memory access device (204);
the hardware multithreading memory access device (204) is used to perform the memory access operations of the n hardware threads; the hardware multithreading memory access device (204) comprises a memory access unit multithreading register sequence (601) for temporarily holding the intermediate results of the n-stage pipelined memory access logic, each register stage M_DFF1~M_DFFn holding the output of the memory access logic M1~Mn for the corresponding hardware thread, wherein the inputs of memory access logic M1 are the output of the hardware multithreading execution device (203) and the memory access control signal produced by the multithreading control device (207), and the output of register M_DFFn serves as the output of the hardware multithreading memory access device (204);
the hardware multithreading write-back device (205) is used to output the data information to be written back and the register address currently to be written back to the hardware multithreading register set (206); the hardware multithreading write-back device (205) comprises a write-back unit multithreading register sequence (701) for temporarily holding the intermediate results of the n-stage pipelined write-back logic, each register stage W_DFF1~W_DFFn holding the output of the write-back logic W1~Wn for the corresponding hardware thread, wherein the inputs of write-back logic W1 are the output of the hardware multithreading memory access device (204) and the write-back control signal produced by the multithreading control device (207), and the output of register W_DFFn serves as the output of the hardware multithreading write-back device (205);
the hardware multithreading register set (206), according to the register control signal output by the multithreading control device (207) and in cooperation with the hardware multithreading decoding device (202) and the hardware multithreading write-back device (205), performs the register set read and write operations of the n hardware threads;
the multithreading control device (207) outputs control signals that govern the operation of every device in the hardware multithreading control device, specifically: a fetch control signal output to the hardware multithreading instruction fetch device (201), a decoding control signal output to the hardware multithreading decoding device (202), an execution control signal output to the hardware multithreading execution device (203), a memory access control signal output to the hardware multithreading memory access device (204), a write-back control signal output to the hardware multithreading write-back device (205), and a register control signal output to the hardware multithreading register set (206).
3. The hardware multithreading control device as claimed in claim 2, characterized in that the hardware multithreading instruction fetch device (201) comprises the following components:
1) an instruction address control device (301), which, according to the fetch control signal (comprising an instruction address control signal and a jump address) produced by the multithreading control device (207) and the output of pc_DFFn, generates the next instruction address of the current thread and sends it to the instruction address register sequence (302);
2) an instruction address register sequence (302), for storing the instruction addresses of the n threads, the instruction address of each thread passing successively through the instruction address registers pc_DFF1~pc_DFFn;
3) an instruction fetch unit multithreading register sequence (303), for temporarily holding the intermediate results of the n-stage pipelined fetch logic, each register stage IF_DFF1~IF_DFFn holding the output of the fetch logic IF1~IFn for the corresponding hardware thread, wherein the inputs of fetch logic IF1 are the output of the instruction address control device (301) and the fetch control signal produced by the multithreading control device (207), and the output of register IF_DFFn is the output of the hardware multithreading instruction fetch device (201).
4. The hardware multithreading control device as claimed in claim 2, characterized in that the hardware multithreading decoding device (202) comprises:
1) a decoding unit multithreading register sequence (401), for temporarily holding the intermediate results of the n-stage pipelined decoding logic, each register stage ID_DFF1~ID_DFFn holding the output of the decoding logic ID1~IDn for the corresponding hardware thread, wherein the inputs of decoding logic ID1 are the output of the hardware multithreading instruction fetch device (201), the read data of the hardware multithreading register set (206), and the decoding control signal produced by the multithreading control device (207), and the output of register ID_DFFn is the output of the hardware multithreading decoding device (202);
2) a data bypass register sequence (402), for storing the intermediate execution results of the two preceding instructions of each thread, the stored data coming respectively from the output data of the hardware multithreading execution device (203) and the output data of the hardware multithreading memory access device (204); the input data of the register stages ID_Bypass_1~ID_Bypass_n-1 of this data bypass register sequence are fed to the corresponding decoding logic ID1~IDn-1 respectively, and the output data of register ID_Bypass_n-1 is fed to decoding logic IDn.
5. The hardware multithreading control device as claimed in claim 2, characterized in that the hardware multithreading execution device (203) comprises the following components:
1) a multithreading initialization special instruction execution device (501), for generating a hardware thread number and outputting the generated hardware thread number to the thread number register sequence (502), the hardware thread number corresponding to the software thread executed by this hardware thread;
2) a thread number register sequence (502), for buffering the generated hardware thread numbers, each thread number generated by the multithreading initialization special instruction execution device (501) passing successively through the thread number registers Th_DFF1~Th_DFFn;
3) an execution unit multithreading register sequence (503), for temporarily holding the intermediate results of the n-stage pipelined execution logic, each register stage E_DFF1~E_DFFn holding the output of the execution logic E1~En for the corresponding hardware thread, wherein the inputs of execution logic E1 are the output of the hardware multithreading decoding device (202) and the execution control signal produced by the multithreading control device (207), and the output of register E_DFFn or the output of register Th_DFFn serves as the output of the hardware multithreading execution device (203);
4) a data bypass register sequence (504), for storing the intermediate execution results of the two preceding instructions of each thread, the stored data coming respectively from the output data of the hardware multithreading execution device (203) and the output data of the hardware multithreading memory access device (204); the input data of the register stages E_Bypass_1~E_Bypass_n-1 of this data bypass register sequence are fed to the corresponding execution logic E1~En-1 respectively, and the output data of register E_Bypass_n-1 is fed to execution logic En.
6. The hardware multithreading control device as claimed in claim 2, characterized in that the hardware multithreading register set (206) comprises the following components:
1) a decoder (801), which decodes the register control signal produced by the multithreading control device (207) together with the address signal produced by the hardware multithreading decoding device (202) or the hardware multithreading write-back device (205), and outputs the register set enable signal of the current thread and the register address of the current operation;
2) a multiplexer (802), which, according to the register control signal produced by the multithreading control device (207), selects the data of the current thread's register set and outputs it;
3) n register sets (803), assigned one to each of the n threads and independent of one another, whose write data comes from the output data of the hardware multithreading write-back device (205) and whose read data is sent to the hardware multithreading decoding device (202).
CN201010512737.3A 2010-10-13 2010-10-13 Hardware multithreading control method for microprocessor and device thereof Expired - Fee Related CN101957744B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010512737.3A CN101957744B (en) 2010-10-13 2010-10-13 Hardware multithreading control method for microprocessor and device thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201010512737.3A CN101957744B (en) 2010-10-13 2010-10-13 Hardware multithreading control method for microprocessor and device thereof

Publications (2)

Publication Number Publication Date
CN101957744A CN101957744A (en) 2011-01-26
CN101957744B true CN101957744B (en) 2013-07-24

Family

ID=43485089

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010512737.3A Expired - Fee Related CN101957744B (en) 2010-10-13 2010-10-13 Hardware multithreading control method for microprocessor and device thereof

Country Status (1)

Country Link
CN (1) CN101957744B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107315575A (en) * 2016-04-26 2017-11-03 北京中科寒武纪科技有限公司 A kind of apparatus and method for performing vectorial union operation

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9021237B2 (en) * 2011-12-20 2015-04-28 International Business Machines Corporation Low latency variable transfer network communicating variable written to source processing core variable register allocated to destination thread to destination processing core variable register allocated to source thread
JP6017260B2 (en) * 2012-10-17 2016-10-26 ルネサスエレクトロニクス株式会社 Multithreaded processor
CN104778074B (en) * 2014-01-14 2019-02-26 腾讯科技(深圳)有限公司 A kind of calculating task processing method and processing device
CN104699463B (en) * 2015-03-20 2017-05-17 浪潮集团有限公司 Implementation method for assembly lines low in power consumption
CN104699465B (en) * 2015-03-26 2017-05-24 中国人民解放军国防科学技术大学 Vector access and storage device supporting SIMT in vector processor and control method
GB2540971B (en) * 2015-07-31 2018-03-14 Advanced Risc Mach Ltd Graphics processing systems
CN105808357B (en) * 2016-03-29 2021-07-27 沈阳航空航天大学 Multi-core multi-thread processor with accurately controllable performance
CN108255587B (en) * 2016-12-29 2021-08-24 展讯通信(上海)有限公司 Synchronous multi-thread processor
CN110018781B (en) * 2018-01-09 2022-06-21 阿里巴巴集团控股有限公司 Disk flow control method and device and electronic equipment
CN110647358B (en) * 2018-06-27 2021-11-23 展讯通信(上海)有限公司 Synchronous multithread processor
CN110647357B (en) * 2018-06-27 2021-12-03 展讯通信(上海)有限公司 Synchronous multithread processor
CN109597654B (en) * 2018-12-07 2022-01-11 湖南国科微电子股份有限公司 Register initialization method, basic configuration table generation method and embedded system
CN112713993A (en) * 2020-12-24 2021-04-27 天津国芯科技有限公司 Encryption algorithm module accelerator and high-speed data encryption method
CN112579278B (en) * 2020-12-24 2023-01-20 海光信息技术股份有限公司 Central processing unit, method, device and storage medium for simultaneous multithreading

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1760826A (en) * 2004-10-14 2006-04-19 国际商业机器公司 Method, processor and system for processing instructions

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070005942A1 (en) * 2002-01-14 2007-01-04 Gil Vinitzky Converting a processor into a compatible virtual multithreaded processor (VMP)

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A FPGA Implementation of a MIPS RISC Processor for Computer Architecture Education;Victor P. Rubio;《Master of Science New Mexico State University》;20040708;第16-35页 *
Victor P. Rubio.A FPGA Implementation of a MIPS RISC Processor for Computer Architecture Education.《Master of Science New Mexico State University》.2004,第16-35页.

Also Published As

Publication number Publication date
CN101957744A (en) 2011-01-26

Similar Documents

Publication Publication Date Title
CN101957744B (en) Hardware multithreading control method for microprocessor and device thereof
KR101594090B1 (en) Processors, methods, and systems to relax synchronization of accesses to shared memory
US7836276B2 (en) System and method for processing thread groups in a SIMD architecture
JP6043374B2 (en) Method and apparatus for implementing a dynamic out-of-order processor pipeline
JP2928695B2 (en) Multi-thread microprocessor using static interleave and instruction thread execution method in system including the same
KR102335194B1 (en) Opportunity multithreading in a multithreaded processor with instruction chaining capability
CN100461094C (en) Instruction control method aimed at stream processor
US9811340B2 (en) Method and apparatus for reconstructing real program order of instructions in multi-strand out-of-order processor
US20030154358A1 (en) Apparatus and method for dispatching very long instruction word having variable length
JPH04360234A (en) Information processor and instruction scheduling device
TW200403588A (en) Suspending execution of a thread in a multi-threaded processor
US20140075157A1 (en) Methods and Apparatus for Adapting Pipeline Stage Latency Based on Instruction Type
CN105426160A (en) Instruction classified multi-emitting method based on SPRAC V8 instruction set
CN106104481A (en) Certainty and opportunistic multithreading
JPH06318155A (en) Computer system
US8560813B2 (en) Multithreaded processor with fast and slow paths pipeline issuing instructions of differing complexity of different instruction set and avoiding collision
JPH1124929A (en) Arithmetic processing unit and its method
JPH03282958A (en) Electronic computer
US10268519B2 (en) Scheduling method and processing device for thread groups execution in a computing system
US20110022821A1 (en) System and Methods to Improve Efficiency of VLIW Processors
CN101763251A (en) Instruction decode buffer device of multithreading microprocessor
US11366669B2 (en) Apparatus for preventing rescheduling of a paused thread based on instruction classification
JPH02227730A (en) Data processing system
Iliakis et al. Repurposing GPU microarchitectures with light-weight out-of-order execution
KR20080008683A (en) Method and apparatus for processing according to multi-threading/out-of-order merged scheme

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130724

Termination date: 20211013
