CN105824605B - A kind of controlled dynamic multi-threading and processor - Google Patents

Info

Publication number
CN105824605B
Authority
CN
China
Prior art keywords
instruction
mark
processor
thread
instructions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610272367.8A
Other languages
Chinese (zh)
Other versions
CN105824605A (en)
Inventor
王生洪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to CN201610272367.8A priority Critical patent/CN105824605B/en
Publication of CN105824605A publication Critical patent/CN105824605A/en
Application granted granted Critical
Publication of CN105824605B publication Critical patent/CN105824605B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3885Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units
    • G06F9/3889Concurrent instruction execution, e.g. pipeline or look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Advance Control (AREA)

Abstract

The invention discloses a controllable dynamic multithreading method and a processor. For a processor with a pipeline structure, the method adds a mark to the instruction structure. The mark carries two pieces of information: the thread to which the corresponding instruction belongs and the priority of that instruction. The processor controls the corresponding instruction according to the mark, issuing and executing it according to the thread and priority information in the mark. The processor comprises at least an instruction system containing marks, a program execution control unit (Branch) that can recognize and track marks, an instruction decoding circuit that can recognize and decode marks, and an arithmetic unit that can recognize and decode marks together with its corresponding memory unit. The invention can dynamically schedule all the arithmetic hardware resources of a processor, improving its computing capability without adding much complex hardware.

Description

A controllable dynamic multithreading method and processor
Technical field
The present invention relates to the field of processors, and more particularly to a controllable dynamic multithreading (Dynamic Multi-threading) method and processor.
Background art
To improve the computing capability of processors, many parallel processing techniques have been developed, such as superscalar (Super-scalar), pipelining (Pipeline), very long instruction word (VLIW), and single instruction, multiple data (SIMD). However, because the instructions of a software program are processed sequentially, the instruction and data dependencies (dependencies) that arise during execution frequently leave the processor waiting, which limits how much efficiency these parallel processing techniques can deliver.
To overcome dependencies during instruction execution, techniques that improve instruction issue efficiency have been developed, such as out-of-order execution (Out-of-Order) and branch prediction (Branch Prediction), but these techniques have their limitations. Either their hardware is extremely complex, or their efficiency gains are limited and they are unsuitable for embedded systems. An embedded system, especially a mobile device such as a mobile communication device, in-vehicle equipment, or a wearable device, requires not only high computing capability from the processor but also low power consumption and strong real-time behavior.
Multithreaded parallel processing (Multi-Threading) can process two or more completely independent programs in parallel on the same processor, so it handles relatively well the limits on execution efficiency caused by control and data dependencies during instruction execution. Among these techniques, simultaneous multithreading (Simultaneous Multi-threading, SMT) and token-triggered multithreading (Token Triggered Multi-threading) have been applied successfully in some processor products. For example, Intel's Hyper-Threading, IBM's POWER5, Sun Microsystems' UltraSPARC T2, and MIPS MT all employ SMT. The Sandblaster DSP core from Sandbridge uses token-triggered multithreading.
Although SMT can resolve the dependency problem during program execution, it requires, in addition to a complete set of registers for each thread to execute its own program, thread-tracking logic at every pipeline stage, and it increases the size of shared resources such as the instruction cache and TLBs. The thread-tracking logic must not only track the progress of each thread but also check and determine whether the thread has finished executing. Because a large number of threads are in an executing or half-executed state, the CPU caches and TLB must be large enough to avoid unnecessary thrashing between threads. The hardware complexity grows substantially as the thread count increases, which makes SMT difficult to apply to embedded and low-power processor designs.
The following table shows a typical SMT multithreaded program execution process:
Token-triggered multithreading is a form of time-division multithreading. Because it can execute instructions from only one thread's program in each clock cycle, its hardware is much simpler than that of SMT, but its efficiency is correspondingly lower. Its characteristics are:
1. Only one thread issues instructions in each clock cycle;
2. All threads are started in sequence, as shown in Fig. 1, which simplifies the thread selection circuit;
3. Every thread has the same number of clock cycles for instruction execution, so no dependency checking or bypass hardware is needed;
4. A thread's result is guaranteed to be available before that thread's next execution.
The following table shows the program execution process of token-triggered multithreading:
1 Clock cycle i: thread T0 issues instructions j, j+1 and j+2
2 Clock cycle i+1: thread T1 issues instructions k and k+1
3 Clock cycle i+2: thread T2 issues instruction l
4 Clock cycle i+3: thread T3 issues instructions m, m+1 and m+2
5 Clock cycle i+4: thread T0 has an instruction miss; the processor waits
6 Clock cycle i+5: thread T1 issues instruction k+2
7 Clock cycle i+6: thread T2 issues instructions l+1 and l+2
8 Clock cycle i+7: thread T3 has an instruction miss; the processor waits
However, a token-triggered multithreaded processor can execute only the designated thread in each assigned clock cycle. If, in that clock cycle, the designated thread cannot issue an instruction because of an instruction or data miss (missing) or because of a dependency, the clock cycle is wasted. To overcome this defect of token-triggered multithreading, opportunity multithreading was developed.
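The wasted cycles can be illustrated with a minimal Python sketch of fixed round-robin slot ownership. The function name `token_triggered_schedule` and the `ready` table are hypothetical, introduced only for illustration; the two miss cycles (T0 in the fifth cycle, T3 in the eighth) are taken from the table above.

```python
from typing import Dict, List, Optional


def token_triggered_schedule(
    ready: Dict[int, List[bool]], num_threads: int, cycles: int
) -> List[Optional[int]]:
    """Return, per clock cycle, the issuing thread or None for a wasted slot.

    ready[t][c] is True when thread t has an issuable instruction in cycle c.
    """
    issued: List[Optional[int]] = []
    for c in range(cycles):
        owner = c % num_threads  # each cycle belongs to exactly one thread
        issued.append(owner if ready[owner][c] else None)
    return issued


# Four threads over eight cycles; T0 misses in cycle 4 and T3 in cycle 7.
ready = {t: [True] * 8 for t in range(4)}
ready[0][4] = False
ready[3][7] = False

schedule = token_triggered_schedule(ready, num_threads=4, cycles=8)
wasted = schedule.count(None)
print(schedule, "wasted cycles:", wasted)
```

In this model a slot whose owner is not ready is simply lost, even when other threads have instructions ready to issue.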
Opportunity multithreading allows a multithreaded processor, when a thread has no valid instruction in a given clock cycle, to hand that clock cycle to another thread that does have a valid instruction rather than holding (HOLD) the cycle. The clock cycle that would otherwise be wasted is given to another thread as an "opportunity".
In a multithreaded processor using this method, a thread is no longer limited to issuing instructions once per thread cycle. Any thread that can issue an instruction may take the "opportunity" in any clock cycle in which the originally scheduled thread has no valid instruction.
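The hand-off described above can be sketched in Python as follows. This is an illustrative model only: the name `opportunity_schedule`, the `ready` table, and the choice to search the other threads in round-robin order starting from the slot owner are assumptions, since the text does not specify how the replacement thread is picked.

```python
from typing import Dict, List, Optional


def opportunity_schedule(
    ready: Dict[int, List[bool]], num_threads: int, cycles: int
) -> List[Optional[int]]:
    """Return the issuing thread per cycle; None only if no thread is ready."""
    issued: List[Optional[int]] = []
    for c in range(cycles):
        owner = c % num_threads
        chosen: Optional[int] = None
        # Try the slot owner first, then the remaining threads in order
        # (assumed policy for choosing who takes the "opportunity").
        for offset in range(num_threads):
            t = (owner + offset) % num_threads
            if ready[t][c]:
                chosen = t
                break
        issued.append(chosen)
    return issued


# Same scenario as a fixed-slot schedule with two misses:
# thread 0 is not ready in cycle 4, thread 3 is not ready in cycle 7.
ready = {t: [True] * 8 for t in range(4)}
ready[0][4] = False
ready[3][7] = False

opportunistic = opportunity_schedule(ready, num_threads=4, cycles=8)
print(opportunistic)
```

Under these assumptions no cycle is wasted as long as at least one thread is ready, at the cost of the extra selection and tracking hardware.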
1. Like token-triggered multithreading, opportunity multithreading is a form of time-division multithreading: only one program can execute in each clock cycle. The number of threads it can execute is limited by the number of hardware threads.
2. Opportunity multithreading needs a branch prediction circuit. For a processor with a VLIW structure, the dependencies of every sub-instruction must be predicted, so the branch prediction circuit is quite complex.
3. Opportunity multithreading needs a set of two-dimensional thread identity (ID) registers to track the execution status of each thread's instructions at every pipeline stage, so that result data are not mixed up.
4. In practice, every arithmetic unit of a processor using opportunity multithreading must add, for each thread, a set of two-dimensional data registers that is fully independent of the other threads, to prevent data thrashing between threads in a half-executed state.
5. To issue an instruction in every processor clock cycle, the instruction memory of each thread must also run at the same clock frequency as the processor so that the thread can read instructions in time. As a result, one advantage of multithreading, reduced memory power consumption, is lost.
The analysis above shows that a processor using opportunity multithreading has considerably more complex hardware than one using token-triggered multithreading. Moreover, for every thread to be able to read an instruction in every clock cycle, the instruction memory clock frequency must equal the processor's main clock frequency, which noticeably increases the processor's power consumption. Opportunity multithreading is therefore not well suited to low-power embedded processor designs.
Fig. 2 is a schematic diagram of program execution under opportunity multithreading.
Summary of the invention
The technical problem to be solved by the invention is to provide a controllable dynamic multithreading method and processor that address the defects described in the background art.
The invention adopts the following technical solution to solve the above technical problem:
A controllable dynamic multithreading method: for a processor that uses a pipeline structure and has an I-cache, a mark is added to the instruction structure. The mark carries two pieces of information: the thread to which the corresponding instruction belongs and the priority information of that instruction. The priority information indicates the execution order of the instruction and its dependency on the instructions before and after it. The processor controls the corresponding instruction according to the mark, issuing and executing the instruction according to its priority information and the thread to which it belongs.
As a further refinement of the controllable dynamic multithreading method of the invention, the processor controls the corresponding instruction according to the mark, and issues and executes the instruction according to its priority information and thread, through the following steps:
Step 1), read an instruction according to the priority information in the marks of the instructions waiting to be executed;
Step 2), instruction decoding and dispatch:
The decoding circuit of the processor decodes the instruction read in step 1) into the mark and the individual sub-instructions, and the dispatch logic of the processor assigns each sub-instruction to an arithmetic unit for execution according to its function;
Step 3), instruction execution:
For each sub-instruction, the processor reads the data of the corresponding registers according to the thread information in the mark of the instruction to which it belongs, and stores the execution result in the registers of the corresponding thread;
Step 4), return to step 1).
Depending on the specific hardware implementation, steps 1) and 2) sometimes need several clock cycles and sometimes only one; step 3) needs n-1 clock cycles, where n is the number of pipeline stages of the processor's arithmetic unit.
As a further refinement of the controllable dynamic multithreading method of the invention, the detailed steps of step 1) are as follows:
Step 1.1), the instruction fetch circuit of the processor checks whether the I-Cache holds any instruction waiting to be executed, that is, whether any instruction is in the Valid state;
Step 1.1.1), if only one instruction is in the Valid state, that instruction is read;
Step 1.1.2), if two or more instructions are in the Valid state, the marks of the instructions are checked to determine which instruction has the higher priority;
Step 1.1.2.1), if one instruction has a higher priority than the others, that instruction is read;
Step 1.1.2.2), if no instruction has a higher priority than the others, it is determined whether there is a thread whose instruction was executed in the previous step;
Step 1.1.2.2.1), if there is such a thread, an instruction from a thread different from it is read, or an instruction is read according to thread order;
Step 1.1.2.2.2), if there is no such thread, an instruction is read according to thread order.
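The selection logic of step 1.1) can be sketched in Python as follows. This is a simplified software model under stated assumptions, not the patented circuit: the names `Entry` and `select_instruction` are hypothetical, a larger priority value is assumed to mean higher priority, "thread order" is assumed to mean ascending thread number, and when several instructions tie for the highest priority the tie-break of step 1.1.2.2) is applied among them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    thread: int    # thread field of the mark
    priority: int  # priority field of the mark (larger = higher, assumed)
    valid: bool    # True when the instruction is waiting to be executed


def select_instruction(
    entries: List[Entry], last_thread: Optional[int]
) -> Optional[Entry]:
    """Pick the next instruction to read, following steps 1.1.1 to 1.1.2.2.2."""
    valid = [e for e in entries if e.valid]
    if not valid:
        return None                      # nothing is waiting
    if len(valid) == 1:
        return valid[0]                  # step 1.1.1
    top = max(e.priority for e in valid)
    best = sorted((e for e in valid if e.priority == top), key=lambda e: e.thread)
    if len(best) == 1:
        return best[0]                   # step 1.1.2.1
    if last_thread is not None:          # step 1.1.2.2.1
        for e in best:
            if e.thread != last_thread:
                return e
    return best[0]                       # step 1.1.2.2.2: thread order


# Example: thread 1 carries the higher priority, so it is read first.
picked = select_instruction(
    [Entry(0, 0, True), Entry(1, 1, True), Entry(2, 0, False)], last_thread=0
)
print(picked)
```

When priorities are equal, preferring a thread other than the one executed in the previous step reduces the chance of issuing an instruction that depends on a result still in the pipeline.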
As a further refinement of the controllable dynamic multithreading method of the invention, the mark is written by software, or is written automatically by the compiler during compilation.
As a further refinement of the controllable dynamic multithreading method of the invention, the processor is a multiple-instruction-issue processor, and each of its instructions independently carries its own mark.
As a further refinement of the controllable dynamic multithreading method of the invention, the processor is a multiple-instruction-issue processor, and several instructions share one mark.
As a further refinement of the controllable dynamic multithreading method of the invention, the processor is a single-instruction-issue processor, and each of its instructions corresponds to one mark.
The invention also discloses a processor based on the controllable dynamic multithreading method, comprising at least an instruction system with marks, a program execution control unit that can recognize and track marks, an instruction decoding circuit that can recognize and decode marks, and an arithmetic unit that can recognize and decode marks together with its corresponding memory unit.
Compared with the prior art, the invention, by adopting the above technical solution, has the following technical effects:
1. The hardware resources of the processor can be scheduled efficiently, and the priority and dependencies of instructions determined effectively, without complex multithread control circuits or complex instruction validity prediction circuits;
2. Instructions are executed in the order of their priority, so there is no need to worry that a missing instruction or missing data will waste hardware resources or produce confused results;
3. The utilization of the processor's hardware resources is effectively improved, which in turn reduces power consumption.
Brief description of the drawings
Fig. 1 is a thread flow diagram of token-triggered multithreading with four threads;
Fig. 2 is a schematic diagram of program execution under opportunity multithreading;
Fig. 3 is a structure diagram of a single instruction with a mark;
Fig. 4 is a structure diagram of multiple instructions with a single mark;
Fig. 5 is a structure diagram of multiple instructions with multiple marks;
Fig. 6 is a multithreaded execution flow diagram with a 6-stage pipeline;
Fig. 7 is a block diagram of a processor with software-controllable dynamic multithreading.
Detailed description of the embodiments
The technical solution of the invention is described in further detail below with reference to the accompanying drawings:
The invention adds, to the instruction system of a processor that uses a multi-stage pipeline structure, a symbol (mark) that carries the thread identity and priority information of the corresponding instruction. While reading (Fetch) an instruction, the instruction system of the processor also obtains the mark, which identifies the thread that executes the instruction and gives its priority. The instruction control unit (Branch) of the processor arranges the processor's hardware resources and the execution order according to the information in the mark. The mark accompanies the instruction through every step of its execution, so that the execution of the instruction can be tracked, and its priority information indicates the dependency between this instruction and the instructions/data before and after it as well as the order of preferential execution.
The content of the mark can be set by the programmer at programming time, who chooses the thread that executes the program instruction and its execution priority according to the requirements of the application system. Alternatively, the compiler can set the thread automatically during compilation and set the priority after determining, from the computational function of the program, the dependency between the instruction and the instructions and data before and after it.
The execution thread of a program is set in software, and the priority of every instruction in the program, together with information on its dependency on the instructions executed before and after it, is attached to each instruction as an identifier (mark). The processor hardware only needs to recognize the information in these marks to schedule its hardware resources dynamically and execute multithreaded instructions efficiently.
Setting and managing the execution threads of a multithreaded processor's programs in software also frees the number of threads running at the same time from the limits imposed by the processor's issue width and number of pipeline stages. It also avoids the waste of clock cycles and hardware resources that occurs when there are fewer program threads than pipeline stages.
To implement the software-controllable dynamic multithreading method, the instruction system of the processor must add to the usual program instruction word an identifier containing the thread number and priority information, attached to the instruction word as a mark, as shown in Fig. 3. The mark in the figure is a binary number of at least 2 bits.
Take a 3-bit mark as an example:
Suppose mark = "000": the thread that executes the instruction is 0 and the priority is 0 (0 denotes low priority).
Suppose mark = "101": the thread that executes the instruction is 1 and the priority is 1 (1 denotes high priority).
The specific value of the mark can be set by the programmer at programming time, who chooses the execution thread and priority of the program segment according to system requirements, or it can be supplied automatically by the compiling system during compilation according to the function of the program.
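The 3-bit example can be made concrete with a short Python sketch. The bit layout is an assumption inferred from the two examples above (the most significant bit holds the priority and the low two bits hold the thread number); the text does not state the layout explicitly, and the function names `decode_mark` and `encode_mark` are hypothetical.

```python
from typing import Tuple


def decode_mark(mark: str, thread_bits: int = 2) -> Tuple[int, int]:
    """Split a binary mark string into (thread, priority).

    Assumed layout: high bits = priority, low `thread_bits` bits = thread.
    """
    if len(mark) <= thread_bits or set(mark) - {"0", "1"}:
        raise ValueError(f"invalid mark: {mark!r}")
    priority = int(mark[:-thread_bits], 2)
    thread = int(mark[-thread_bits:], 2)
    return thread, priority


def encode_mark(thread: int, priority: int, thread_bits: int = 2) -> str:
    """Build a mark string with a 1-bit priority field (assumed width)."""
    if not 0 <= thread < (1 << thread_bits) or priority not in (0, 1):
        raise ValueError("thread or priority out of range")
    return f"{priority:b}{thread:0{thread_bits}b}"


print(decode_mark("000"))  # thread 0, low priority
print(decode_mark("101"))  # thread 1, high priority
```

With this layout a 3-bit mark distinguishes four threads and two priority levels; a wider mark would allow more of either.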
The software-controllable dynamic multithreading method of the invention can be used for single-instruction-issue processors as well as multiple-instruction-issue processors.
For a multiple-instruction-issue processor, the instructions issued together can share one mark, or each instruction can carry its own mark.
Fig. 3 shows the instruction structure of a single instruction with a single mark.
Fig. 4 shows the instruction structure of multiple instructions with a single mark, where instruction words 1, 2, ..., n must be different instructions of the same thread's program. The single-mark instruction word structure can perform only time-division multithreading.
Fig. 5 shows the instruction structure of multiple instruction words with multiple marks; in the figure, M denotes a mark. Because each instruction word has its own mark, these instructions can belong to programs of different threads. This multi-mark instruction structure is suitable for simultaneous multithreading.
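The difference between the single-mark and multi-mark formats can be modeled with a small Python data structure. This is a hypothetical sketch: the class names `SubInstruction` and `InstructionWord` are not from the source, and for brevity each mark is reduced to its thread number, omitting the priority field.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class SubInstruction:
    opcode: str
    mark: Optional[int] = None  # own thread number (multi-mark format only)


@dataclass
class InstructionWord:
    subs: List[SubInstruction]
    shared_mark: Optional[int] = None  # one thread number for all sub-instructions

    def threads(self) -> Set[int]:
        """Return the set of threads that issue from this instruction word."""
        if self.shared_mark is not None:
            return {self.shared_mark}  # single-mark: one thread per word
        return {s.mark for s in self.subs if s.mark is not None}


# Single mark shared by all sub-instructions: time-division multithreading.
single = InstructionWord(
    [SubInstruction("add"), SubInstruction("mul")], shared_mark=2
)
# One mark per sub-instruction: different threads can issue in the same cycle.
multi = InstructionWord([SubInstruction("add", 0), SubInstruction("mul", 3)])

print(single.threads(), multi.threads())
```

The shared-mark word needs fewer mark bits per issue slot, while the per-instruction marks allow instructions of several threads to be issued together.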
The execution steps of the dynamic multithreading method of the invention are as follows:
Step 1 (clock cycle 0), instruction fetch: the I-Cache read control circuit of the processor checks whether any instruction is waiting to be executed (Valid). If two or more instructions are Valid (the I-Cache of a multithreaded processor should have at least two banks), it checks which instruction has the higher priority. If one has a higher priority, that instruction is read; if the priorities are equal, it reads an instruction from a thread different from the one executed in the previous step, or reads an instruction according to thread order;
Step 2 (clock cycle 1), instruction decoding and dispatch: the decoding circuit decodes instruction 1, instruction 2 and instruction 3, and the dispatch logic then assigns them to different arithmetic units for execution according to the functions of the decoded instructions;
Step 3 (clock cycles 2 to n+1), instruction execution: the processor reads the data of the corresponding registers according to the thread information in the mark and stores the execution result in the registers of the corresponding thread. Taking the instruction control circuit as an example, it executes program instructions in sequence from the contents of the PC register selected by the thread information of the mark, reads the data of the other function registers of the corresponding thread (such as the loop counter, jump and condition registers) as the instruction requires, and stores the result of executing the instruction back into those registers of the corresponding thread;
The value n in clock cycles 2 to n+1 is determined by the number of pipeline stages of the processor's arithmetic unit: n equals 4 for a 4-stage pipeline and 6 for a 6-stage pipeline;
After clock cycle n+1 of step 3 completes, execution returns to step 1.
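Step 3 relies on each thread having its own register set, selected by the thread information in the mark. A minimal Python sketch of that idea follows; the class `ThreadedRegisterFile`, the helper `execute_add`, and the register count are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, List


class ThreadedRegisterFile:
    """One independent bank of registers per thread, indexed by thread number."""

    def __init__(self, num_regs: int = 8) -> None:
        self._banks: DefaultDict[int, List[int]] = defaultdict(
            lambda: [0] * num_regs
        )

    def read(self, thread: int, reg: int) -> int:
        return self._banks[thread][reg]

    def write(self, thread: int, reg: int, value: int) -> None:
        self._banks[thread][reg] = value


def execute_add(
    regs: ThreadedRegisterFile, thread: int, dst: int, src_a: int, src_b: int
) -> None:
    """Read operands from, and write the result to, the bank of `thread`."""
    regs.write(thread, dst, regs.read(thread, src_a) + regs.read(thread, src_b))


regs = ThreadedRegisterFile()
regs.write(0, 1, 2)
regs.write(0, 2, 3)     # thread 0: r1 = 2, r2 = 3
regs.write(1, 1, 10)
regs.write(1, 2, 20)    # thread 1: r1 = 10, r2 = 20

# The same ADD r0 <- r1 + r2 executed on behalf of two different threads.
execute_add(regs, thread=0, dst=0, src_a=1, src_b=2)
execute_add(regs, thread=1, dst=0, src_a=1, src_b=2)
print(regs.read(0, 0), regs.read(1, 0))
```

Because every access is qualified by the thread number, interleaved instructions of different threads cannot overwrite each other's results.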
Because the single-mark, multiple-instruction dynamic multithreading structure of the invention is a time-division multithreading structure, when the current program reaches step 2 (clock cycle 1), the I-Cache read control circuit of the processor repeats step 1: it checks the validity (Valid) of the instructions that appear next and decides from the Valid information which thread's program instruction to read.
When the current program reaches step 3 (clock cycle 2), the I-Cache read control circuit again repeats step 1 and reads the third group of instructions according to the Valid information, while the decoding and dispatch circuit of the processor repeats step 2, decoding and dispatching the instructions of program 2; and so on, cycle after cycle.
Fig. 6 shows the execution flow of dynamic multithreading on a structure with a 6-stage pipeline (arithmetic unit). In the figure:
T: thread;
Y: thread number, y = 0, 1, 2, ..., n, denoting thread y; for example, T(2) denotes thread 2;
The value of Y is given by the mark in the instruction word;
i: the i-th issue of the same thread within the same instruction cycle; in this example one instruction cycle equals 6 clock cycles;
j: pipeline stage;
For example, T(3_2, 4) denotes the 2nd issue of thread 3, in the 4th pipeline stage.
The operating process in the flow diagram corresponds to step 3) described above with n equal to 6; that is, the processor has already fetched and decoded the instruction and dispatched it to the corresponding processing unit, and that processing unit has already received the thread and priority information.
The operating process of dynamic multithreading is as follows (assuming programs 0, 1, 2, ..., 5 are all independent threads):
Clock cycle 0 (C0) (clock cycle 0 here corresponds to clock cycle 2 in the steps above): the processing unit of the processor reads the mark portion of the decoded instruction word and obtains the thread Y of the current instruction. Assume Y = 0, that is, an instruction of the program of thread 0; the instruction is designated T(0_0, 0) and begins execution at pipeline stage 0;
Clock cycle 1 (C1): the processor reads the next instruction and decodes the mark to obtain Y = 1, which shows that the instruction belongs to the program of thread 1. The instruction is designated T(1_0, 0) and begins execution at the first pipeline stage, while the previous instruction has advanced to pipeline stage 1, so its state becomes T(0_0, 1);
Clock cycle 2 (C2): under normal circumstances the processor should read an instruction of the program of thread 2, that is, Y = 2. For some reason, however, the instruction of the program of thread 2 is missing, while an instruction of the program of thread 0 is ready. The processor can then read the mark of that instruction; if decoding yields Y = 0 and also a priority equal to 1 (no need to wait for the result of the previous instruction of the program of thread 0), the processor designates the instruction T(0_1, 0) and starts executing it. The states of the two earlier instructions become, in order, T(0_0, 2) and T(1_0, 1);
Clock cycle 3 (C3): the processor reads an instruction and decodes the mark to obtain Y = 3, designates the instruction T(3_0, 0) and starts executing it. The states of the earlier instructions become, in order, T(0_0, 3), T(1_0, 2) and T(0_1, 1);
Clock cycle 4 (C4): the processor reads an instruction and decodes the mark to obtain Y = 4, designates the instruction T(4_0, 0) and starts executing it. The states of the earlier instructions become, in order, T(0_0, 4), T(1_0, 3), T(0_1, 2) and T(3_0, 1);
Clock cycle 5 (C5): the processor reads an instruction and decodes the mark to obtain Y = 5, designates the instruction T(5_0, 0) and starts executing it. The states of the earlier instructions become, in order, T(0_0, 5), T(1_0, 4), T(0_1, 3), T(3_0, 2) and T(4_0, 1). At this point one instruction cycle ends, and the result of instruction T(0_0) is stored in the corresponding register.
As the analysis above shows, with software-controlled dynamic multithreading the processor only needs to track T(Y_i, j) for every instruction to schedule hardware resources effectively, and the configuration of multithreading can be adjusted flexibly, entirely from the needs of the system.
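The tracking of T(Y_i, j) across clock cycles can be reproduced with a short Python simulation. This is an illustrative sketch, not the hardware described in the patent: the function `track_pipeline` is hypothetical, each in-flight instruction is modeled as a tuple (thread Y, issue index i, pipeline stage j), and every in-flight instruction is assumed to advance exactly one stage per clock cycle.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

State = Tuple[int, int, int]  # (thread Y, issue index i, pipeline stage j)


def track_pipeline(issue_order: List[int], stages: int = 6) -> List[List[State]]:
    """Return the in-flight T(Y_i, j) states after each clock cycle."""
    issue_count: DefaultDict[int, int] = defaultdict(int)
    in_flight: List[State] = []
    history: List[List[State]] = []
    for thread in issue_order:
        # Advance earlier instructions by one stage; drop those that finished.
        in_flight = [(y, i, j + 1) for (y, i, j) in in_flight if j + 1 < stages]
        # Issue the new instruction into stage 0.
        in_flight.append((thread, issue_count[thread], 0))
        issue_count[thread] += 1
        history.append(list(in_flight))
    return history


# Issue order of the walkthrough: thread 2 misses in cycle 2,
# so thread 0 issues a second time in its place.
history = track_pipeline([0, 1, 0, 3, 4, 5])
for cycle, states in enumerate(history):
    print(f"C{cycle}:", ", ".join(f"T({y}_{i}, {j})" for y, i, j in states))
```

The only state carried per instruction is the triple (Y, i, j), which is the information the mark and the pipeline position provide.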
Fig. 7 is a logic block diagram of a Harvard-architecture dynamic controllable multithreaded processor whose threads are set in software. The instruction structure of the processor in the figure is a single-mark, three-instruction-word issue structure. As the figure shows, apart from the few mark bits added to the instruction word structure, the processor is almost identical to a typical processor structure. The mark information must be passed to all arithmetic units. The instruction control unit controls the reading and control of instructions and tracks the execution state of the threads according to the thread and priority information of the mark, while the arithmetic units use the mark information to ensure that the results of the instructions are not mixed up.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by a person of ordinary skill in the art to which the invention belongs. It should also be understood that terms such as those defined in general dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the prior art and, unless defined as herein, are not to be interpreted in an idealized or overly formal sense.
The specific embodiments above describe the purpose, technical solution and beneficial effects of the invention in further detail. It should be understood that the above are merely specific embodiments of the invention and do not limit it; any modification, equivalent substitution or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (6)

1. a kind of controlled dynamic multi-threading, it is characterised in that pipeline organization and the place with I-cache are used to one Device is managed, increases mark newly in its order structure, which includes two partial informations:Thread belonging to mark corresponding instructions and The precedence information of mark corresponding instructions, the precedence information be used for the execution sequence for indicating instruction and with its front and rear instruction Correlation;Processor controls its corresponding instruction according to mark, launches by the precedence information of the corresponding instruction and affiliated thread And the instruction is performed, comprise the following steps that:
Step 1), read an instruction according to the priority information in the marks of the instructions waiting to be executed;
Step 1.1), the instruction fetch circuit of the processor checks whether the I-Cache holds any instruction waiting to be executed, i.e., whether any instruction is in the Valid state;
Step 1.1.1), if only one instruction is in the Valid state, read that instruction;
Step 1.1.2), if two or more instructions are in the Valid state, check from their corresponding marks which instruction has the highest priority;
Step 1.1.2.1), if one instruction has a higher priority than the others, read that instruction;
Step 1.1.2.2), if no instruction has a higher priority than the others, determine whether there is an instruction thread that was executed in the previous step;
Step 1.1.2.2.1), if there is an instruction thread that was executed in the previous step, read an instruction from a thread different from that thread, or read an instruction according to the order of the instruction threads;
Step 1.1.2.2.2), if there is no instruction thread that was executed in the previous step, read an instruction according to the order of the instruction threads;
Step 2), instruction decoding and dispatch:
The decoding circuit of the processor decodes the instruction read in step 1) into its mark and its individual sub-instructions; the dispatch logic of the processor assigns each sub-instruction to a different arithmetic unit for execution according to its function;
Step 3), instruction execution:
For each sub-instruction, the processor reads the data in the registers of the thread to which the sub-instruction belongs, as given by the thread information in the instruction mark, and stores the execution result in the registers of that same thread;
Step 4), jump to step 1).
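The read-selection logic of steps 1.1 to 1.1.2.2.2 can be sketched in software as follows. This is a minimal illustration, not the claimed circuit. The names Instruction, select_instruction, and last_thread are hypothetical. It also assumes that a larger number means a higher priority, that "the order of the instruction threads" means ascending thread ID, and that when several instructions tie at the highest priority the thread rule is applied among the tied instructions only. None of these assumptions is stated in the claim.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instruction:
    thread: int      # thread ID carried in the mark
    priority: int    # priority carried in the mark (assumed: larger = higher)
    valid: bool      # True if the I-Cache entry is in the Valid state
    op: str = ""     # placeholder for the instruction body


def select_instruction(icache: List[Instruction],
                       last_thread: Optional[int]) -> Optional[Instruction]:
    """Choose the next instruction to read (steps 1.1 to 1.1.2.2.2)."""
    valid = [ins for ins in icache if ins.valid]
    if not valid:
        return None                          # step 1.1: nothing is waiting
    if len(valid) == 1:
        return valid[0]                      # step 1.1.1
    top = max(ins.priority for ins in valid)
    best = [ins for ins in valid if ins.priority == top]
    if len(best) == 1:
        return best[0]                       # step 1.1.2.1
    # Step 1.1.2.2: priorities tie, so fall back to thread information.
    ordered = sorted(best, key=lambda ins: ins.thread)
    if last_thread is not None:
        for ins in ordered:
            if ins.thread != last_thread:
                return ins                   # step 1.1.2.2.1: switch threads
    return ordered[0]                        # step 1.1.2.2.2: thread order


a = Instruction(thread=0, priority=1, valid=True)
b = Instruction(thread=1, priority=1, valid=True)
c = Instruction(thread=1, priority=3, valid=True)
d = Instruction(thread=2, priority=5, valid=False)
```

With these sample entries, c is selected whenever it is present because it alone has the highest priority among valid instructions, while a and b alternate depending on which thread ran in the previous step. Entry d is never selected because it is not in the Valid state.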
2. The controlled dynamic multithreading method according to claim 1, characterized in that the mark is written by software or written automatically by the compiler during compilation.
3. The controlled dynamic multithreading method according to claim 1, characterized in that the processor is a multiple-instruction-issue processor in which each instruction independently carries its own mark.
4. The controlled dynamic multithreading method according to claim 1, characterized in that the processor is a multiple-instruction-issue processor in which multiple instructions share one group of marks.
5. The controlled dynamic multithreading method according to claim 1, characterized in that the processor is a single-instruction-issue processor in which each instruction corresponds to one mark.
6. A processor for executing the controlled dynamic multithreading method according to claim 1, characterized in that it comprises at least an instruction system with marks, a program execution control unit capable of identifying and tracking marks, an instruction decoding circuit capable of identifying and decoding marks, an arithmetic operation unit capable of identifying and decoding marks, and corresponding memory units.
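As an illustration of the two-part mark in claim 1 and the thread-specific register access in step 3, the following sketch packs a thread ID and a priority into one integer and uses the thread field to select a register bank. The field widths (THREAD_BITS, PRIORITY_BITS), the bit layout, the register count, and all function and class names are assumptions made for this example. The claims do not specify an encoding.

```python
from typing import List, Tuple

THREAD_BITS = 2      # assumption: up to 4 hardware threads
PRIORITY_BITS = 3    # assumption: 8 priority levels


def pack_mark(thread: int, priority: int) -> int:
    """Pack the thread ID and priority into a single integer mark."""
    if not 0 <= thread < (1 << THREAD_BITS):
        raise ValueError("thread ID out of range")
    if not 0 <= priority < (1 << PRIORITY_BITS):
        raise ValueError("priority out of range")
    return (thread << PRIORITY_BITS) | priority


def unpack_mark(mark: int) -> Tuple[int, int]:
    """Return (thread, priority) decoded from a packed mark."""
    return mark >> PRIORITY_BITS, mark & ((1 << PRIORITY_BITS) - 1)


class RegisterFiles:
    """One private register bank per thread (hypothetical layout)."""

    def __init__(self, threads: int = 1 << THREAD_BITS,
                 regs_per_thread: int = 8) -> None:
        self.banks: List[List[int]] = [
            [0] * regs_per_thread for _ in range(threads)
        ]


def execute_add(regs: RegisterFiles, mark: int,
                dst: int, src1: int, src2: int) -> None:
    """Execute an add using only the registers of the marked thread."""
    thread, _priority = unpack_mark(mark)
    bank = regs.banks[thread]            # operands come from this thread
    bank[dst] = bank[src1] + bank[src2]  # result returns to the same thread


regs = RegisterFiles()
regs.banks[2][0], regs.banks[2][1] = 3, 4
execute_add(regs, pack_mark(thread=2, priority=5), dst=2, src1=0, src2=1)
```

Because the thread field alone selects the register bank, instructions from different threads can be interleaved without saving or restoring register state, which is the property step 3 relies on.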
CN201610272367.8A 2016-04-28 2016-04-28 A kind of controlled dynamic multi-threading and processor Active CN105824605B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610272367.8A CN105824605B (en) 2016-04-28 2016-04-28 A kind of controlled dynamic multi-threading and processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610272367.8A CN105824605B (en) 2016-04-28 2016-04-28 A kind of controlled dynamic multi-threading and processor

Publications (2)

Publication Number Publication Date
CN105824605A CN105824605A (en) 2016-08-03
CN105824605B true CN105824605B (en) 2018-04-13

Family

ID=56528841

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610272367.8A Active CN105824605B (en) 2016-04-28 2016-04-28 A kind of controlled dynamic multi-threading and processor

Country Status (1)

Country Link
CN (1) CN105824605B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5511182A (en) * 1994-08-31 1996-04-23 Motorola, Inc. Programmable pin configuration logic circuit for providing a chip select signal and related method
US7447887B2 (en) * 2005-10-14 2008-11-04 Hitachi, Ltd. Multithread processor
US7518993B1 (en) * 1999-11-19 2009-04-14 The United States Of America As Represented By The Secretary Of The Navy Prioritizing resource utilization in multi-thread computing system
CN101763285A (en) * 2010-01-15 2010-06-30 西安电子科技大学 Zero-overhead switching multithread processor and thread switching method thereof
CN101763251A (en) * 2010-01-05 2010-06-30 浙江大学 Instruction decode buffer device of multithreading microprocessor

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
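Speculation-aware thread scheduling for simultaneous multithreading; Kang et al.; Electronics Letters; 20041231; Vol. 40 (No. 5); pp. 790-795 *
Fetch policy for simultaneous multithreading processors based on multiple fetch priorities; Sun Caixia et al.; Acta Electronica Sinica; 20060531 (No. 5); pp. 296-298 *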

Also Published As

Publication number Publication date
CN105824605A (en) 2016-08-03

Similar Documents

Publication Publication Date Title
CN106104481B (en) System and method for performing deterministic and opportunistic multithreading
CN108027807B (en) Block-based processor core topology register
CN108027769B (en) Initiating instruction block execution using register access instructions
US20230106990A1 (en) Executing multiple programs simultaneously on a processor core
CN108027772B (en) Different system registers for a logical processor
EP3350686B1 (en) Debug support for block-based processor
KR102335194B1 (en) Opportunity multithreading in a multithreaded processor with instruction chaining capability
US5710902A (en) Instruction dependency chain indentifier
US10095519B2 (en) Instruction block address register
US20170371660A1 (en) Load-store queue for multiple processor cores
KR101594502B1 (en) Systems and methods for move elimination with bypass multiple instantiation table
CN113703834A (en) Block-based processor core composition register
US20160378491A1 (en) Determination of target location for transfer of processor control
KR20180021812A (en) Block-based architecture that executes contiguous blocks in parallel
US20180032344A1 (en) Out-of-order block-based processor
EP2782004B1 (en) Opportunistic multi-thread method and processor
WO2017223004A1 (en) Load-store queue for block-based processor
CN105824605B (en) A kind of controlled dynamic multi-threading and processor
CN205721743U (en) A kind of processor of controlled dynamic multi-threading

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20160803

Assignee: Suzhou Hongxin integrated circuit Co.,Ltd.

Assignor: Wang Shenghong

Contract record no.: X2023990000728

Denomination of invention: A controllable dynamic multithreading method and processor

Granted publication date: 20180413

License type: Exclusive License

Record date: 20230726
