CN104866458A - Pipeline reversible CPU design and simulation system

Info

Publication number: CN104866458A
Application number: CN201510242080.6A
Authority: CN
Inventors: 卫丽华; 朱鹏程
Original assignee: Nantong Institute of Technology
Current assignee: Shanghai Ruimi Information Technology Co ltd; Wuxi Xinsu Technology Co.,Ltd.
Priority date: 2015-05-13
Filing date: 2015-05-13
Publication date: 2015-08-26
Anticipated expiration: 2035-05-13
Also published as: CN104866458B

Abstract

The present invention discloses a pipeline reversible CPU design and simulation system. The pipeline reversible CPU is operated by using a variety of reversible instructions, and the reversible instructions are executed by using a plurality of parallel pipelines. The pipelines comprise a plurality of execution stages. The pipeline reversible CPU has the following advantages: heat dissipation performance is desirable, the program execution speed is fast, the program execution time is short, and the stability and accuracy of data processing is desirable.

Description

A pipelined reversible CPU design and simulation system

Technical field

The invention belongs to the field of computers and specifically relates to a pipelined reversible CPU design and simulation system.

Background

With advances in manufacturing processes, integrated circuits have reached ever higher integration densities: the number of transistors per square millimeter is now on the order of one million. Chip heat dissipation has become a key factor limiting the further development of integrated circuits, and reversible computing is one effective way to address it.

Unlike the traditional computation model, reversible computing is deterministic in both the forward and the backward direction. Every logic operation it contains is reversible, so no information is erased during computation. Landauer pointed out that, apart from the heat dissipated because of the process or the material, computation itself dissipates heat: specifically, erasing 1 bit of information produces kT ln 2 joules of heat (where k is the Boltzmann constant and T is the absolute temperature, here room temperature). Bennett pointed out that any computation can be converted into a reversible computation, so that in the ideal case the heat dissipated approaches zero. The advantages of reversible circuits in power consumption and heat dissipation have been demonstrated both in theory and in practice, and many researchers have studied the ALU, the overall architecture and the instruction set of reversible CPUs in depth. Pendulum is the earliest reversible CPU on record, and it provides a prototype and template for reversible CPU design. However, Pendulum and the reversible CPUs that followed it are all single-cycle CPUs: every action of an instruction completes within one clock cycle, and the next instruction can only start after the current one has finished. This lack of instruction-level parallelism clearly does not match the trends and characteristics of modern processors.

Therefore, there is an urgent need for a pipelined reversible CPU design and simulation system with good heat dissipation, faster program execution, shorter program execution time and improved data stability.
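To give a sense of scale for Landauer's bound, the following minimal sketch evaluates kT ln 2 numerically. The function name and the choice of 300 K as "room temperature" are assumptions made for illustration and do not come from the patent.

```python
import math

BOLTZMANN_K = 1.380649e-23  # Boltzmann constant in J/K (SI defined value)


def landauer_limit_joules(temperature_kelvin: float, bits: int = 1) -> float:
    """Minimum heat in joules released by erasing `bits` bits at this temperature."""
    return bits * BOLTZMANN_K * temperature_kelvin * math.log(2)


# Room temperature is assumed to be 300 K for this illustration.
energy_per_bit = landauer_limit_joules(300.0)
print(f"{energy_per_bit:.3e} J per erased bit")
```

At the assumed 300 K this is on the order of 3e-21 J per bit, which is why the bound only matters in aggregate, across very large numbers of irreversible bit operations.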

Summary of the invention

To solve the above technical problem, the present invention proposes a pipelined reversible CPU design and simulation system. The CPU dissipates heat well, executes programs quickly with short execution times, and processes data stably and accurately.

To achieve the above object, the technical solution of the present invention is as follows:

A pipelined reversible CPU design and simulation system, in which the pipelined reversible CPU operates using a plurality of reversible instructions, the reversible instructions are executed by a plurality of parallel pipelines, and each pipeline comprises a plurality of execution stages.

Unlike a single-cycle CPU, the pipelined reversible CPU design and simulation system of the present invention uses pipelining, which greatly speeds up program execution and shortens program execution time. Because the reversible CPU of the present invention executes operations using reversible instructions, it also has good heat dissipation behavior.

On the basis of the above technical solution, the following improvements can also be made:

As a preferred solution, the reversible instructions comprise:

Arithmetic/logic instructions, for performing arithmetic/logic operations;

Jump instructions, for performing instruction jumps;

Memory access instructions, for directly accessing the data memory of the pipelined reversible CPU.

With the above preferred solution, different instructions are used to perform different operations on the pipelined reversible CPU of the present invention.

As a preferred solution, the pipeline comprises seven execution stages, in order: instruction fetch, instruction decode, register read, execute/memory access, register write, instruction encode and instruction return.

With the above preferred solution, the execution of a reversible instruction is decomposed into multiple stages so that the steps of different reversible instructions overlap, which allows multiple reversible instructions to be processed in parallel and effectively speeds up program execution.

As a preferred solution, the execution stages of the pipeline are symmetric.

With the above preferred solution, the pipelined reversible CPU of the present invention can effectively execute reversible instructions both forward and in reverse.

As a preferred solution, the reversible CPU is provided with an inverse instruction mapping table; if the pipelined reversible CPU receives a reverse-execution signal, it looks up the opcode of the inverse instruction in the inverse instruction mapping table and generates the operation signals from that opcode.

With the above preferred solution, no dedicated arithmetic unit or transfer bus is needed for reverse execution, which reduces circuit complexity.

As a preferred solution, the pipelined reversible CPU comprises:

An arithmetic unit, for performing arithmetic operations;

A register file, for storing intermediate data during computation;

A data memory, for storing data;

An instruction memory, for storing the reversible instructions;

Control logic, for generating operation signals from the opcode;

Inverse control logic, for restoring the opcode from the operation signals;

A DIR direction register, for controlling the execution direction;

A PC program counter, for pointing to the current reversible instruction;

A PPC previous program counter, for pointing to the reversible instruction preceding the current reversible instruction;

A BR jump register, for implementing instruction jumps.

With the above preferred solution, the structure is simple and each component performs its own function.

As a preferred solution, the pipelined reversible CPU comprises:

An IF/ID pipeline register, for holding the data passed between the instruction fetch stage and the instruction decode stage;

An ID/RR pipeline register, for holding the data passed between the instruction decode stage and the register read stage;

An RR/EXE pipeline register, for holding the data passed between the register read stage and the execute/memory access stage;

An EXE/WR pipeline register, for holding the data passed between the execute/memory access stage and the register write stage;

A WR/IE pipeline register, for holding the data passed between the register write stage and the instruction encode stage;

An IE/IR pipeline register, for holding the data passed between the instruction encode stage and the instruction return stage.

With the above preferred solution, the pipeline registers placed between the pipeline stages effectively ensure that the data passed between stages is neither corrupted nor skewed.

As a preferred solution, the pipelined reversible CPU also comprises a data hazard detection and forwarding unit.

With the above preferred solution, the added data hazard detection and forwarding unit effectively improves data stability and execution efficiency.

As a preferred solution, the data hazard detection and forwarding unit comprises:

A Read-Read forwarding unit, for detecting whether the register number fields in the ID/RR pipeline register intersect the source operand register number fields of RR/EXE;

If they intersect, the data forwarded from the RR/EXE pipeline register is selected as the operand of this reversible execution stage;

If they do not intersect, the corresponding data from the register file is selected as the operand of this reversible execution stage;

A Write-Read forwarding unit, which determines whether any operand in the RR/EXE pipeline register and the destination operand in the EXE/WR pipeline register come from the same register;

If they come from the same register, the corresponding data stored in the EXE/WR pipeline register is forwarded to the input of the arithmetic unit;

If they do not come from the same register, the corresponding data stored in the RR/EXE pipeline register is forwarded to the input of the arithmetic unit.

With the above preferred solution, the Read-Read forwarding unit resolves Read-Read hazards and the Write-Read forwarding unit resolves Write-Read hazards.

Brief description of the drawings

Fig. 1 is a schematic diagram of the logical structure of the pipelined reversible CPU design and simulation system provided by an embodiment of the present invention.

Fig. 2 is a schematic diagram of the reversible instruction pipeline structure provided by an embodiment of the present invention.

Fig. 3 is a schematic diagram of the Read-Read hazard provided by an embodiment of the present invention.

Fig. 4 is the data path diagram of the Read-Read forwarding unit provided by an embodiment of the present invention.

Fig. 5 is a schematic diagram of the Write-Read hazard provided by an embodiment of the present invention.

Fig. 6 is the data path diagram of the Write-Read forwarding unit provided by an embodiment of the present invention.

Detailed description

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

To achieve the object of the present invention, in some embodiments of the pipelined reversible CPU design and simulation system,

As shown in Fig. 1, the pipelined reversible CPU of the present invention is based on the Harvard architecture, in which programs and data are stored separately, and each functional component is synthesized from reversible logic. The reversible logic synthesis rules require every component to have the same number of inputs and outputs, with no fan-in and no fan-out. The pipelined reversible CPU comprises: an arithmetic unit, a register file, a data memory, an instruction memory, control logic, inverse control logic, a DIR direction register, a PC program counter, a PPC previous program counter, a BR jump register, a Read-Read forwarding unit and a Write-Read forwarding unit.

The arithmetic unit (ALU) performs arithmetic operations;

The register file stores intermediate data during computation;

The data memory stores data;

The instruction memory stores the reversible instructions;

The control logic generates operation signals from the opcode;

The inverse control logic restores the opcode from the operation signals;

The DIR direction register controls the execution direction: execution is forward when DIR is 0 and reverse when DIR is 1;

The PC program counter points to the current reversible instruction;

The PPC previous program counter points to the reversible instruction preceding the current reversible instruction;

The BR jump register implements instruction jumps: a jump instruction updates the BR jump register and does not update the PC register directly;

The Read-Read forwarding unit and the Write-Read forwarding unit resolve data hazards.

In Fig. 1, Mux denotes a multiplexer; Signext denotes the functional unit that sign-extends the immediate in an instruction; IsZero denotes the functional unit that tests whether the value of the corresponding register is zero; PPCUpdate and PCUpdate denote the functional units that update the PPC and PC registers respectively; and Incrementer denotes the functional unit that performs the add-one operation.

The pipelined reversible CPU uses a RISC instruction set containing three classes of reversible instructions: arithmetic/logic instructions, jump instructions and memory access instructions. Every instruction in this instruction set is logically reversible, and every instruction has a corresponding inverse instruction. In addition, the instruction set follows the load/store model: apart from the dedicated memory access instructions, no instruction may access the data memory directly. Some of the instructions are shown in Table 1.

Table 1 Reversible instruction set

The instruction formats are shown in Table 2 and Table 3. The instruction word is 32 bits long. Because this instruction set contains few instructions, the opcode OPCode alone is enough to distinguish all of them, so the last 11 bits in Table 2 are unused for now and are reserved for later instruction extensions.

Table 2 Arithmetic/logic instruction and memory access instruction format

Field	OPCode	Rsd	Rs	Rt	0
Bits	31:26	25:21	20:16	15:11	10:0

Table 3 Jump instruction format

Field	OPCode	Rs	Offset
Bits	31:26	25:21	20:0
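The bit layouts of Table 2 and Table 3 can be illustrated with a small packing and unpacking sketch. The function names are hypothetical, and treating Offset as a signed 21-bit two's-complement value is an assumption (the description only says that immediates are sign-extended by Signext).

```python
def encode_alu(opcode: int, rsd: int, rs: int, rt: int) -> int:
    """Pack a Table 2 style word; bits 10:0 are left as zero (reserved)."""
    assert 0 <= opcode < 64 and all(0 <= r < 32 for r in (rsd, rs, rt))
    return (opcode << 26) | (rsd << 21) | (rs << 16) | (rt << 11)


def decode_alu(word: int) -> tuple:
    """Unpack (opcode, rsd, rs, rt) from a Table 2 style word."""
    return ((word >> 26) & 0x3F, (word >> 21) & 0x1F,
            (word >> 16) & 0x1F, (word >> 11) & 0x1F)


def encode_jump(opcode: int, rs: int, offset: int) -> int:
    """Pack a Table 3 style word; offset is assumed signed, 21 bits wide."""
    assert 0 <= opcode < 64 and 0 <= rs < 32
    assert -(1 << 20) <= offset < (1 << 20)
    return (opcode << 26) | (rs << 21) | (offset & 0x1FFFFF)


def decode_jump(word: int) -> tuple:
    """Unpack (opcode, rs, offset), sign-extending the 21-bit offset."""
    offset = word & 0x1FFFFF
    if offset & (1 << 20):      # sign bit of the 21-bit field is set
        offset -= 1 << 21
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, offset
```

Because every field has a fixed position, decoding followed by encoding reproduces the original word, which is the property the instruction encode stage relies on when it rebuilds the instruction.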

The execution of every reversible instruction can be divided into seven stages: instruction fetch, instruction decode, register read, execute/memory access, register write, instruction encode and instruction return.

(1) Instruction fetch. Unlike a conventional processor, there are two program counters, PPC and PC. On entering the fetch stage, PPC holds the address of the previous instruction and PC holds the address of the current instruction. After the instruction at PC has been read, PC is incremented by one, so that PC now holds the address of the next instruction. At the same time, the PPC counter is updated according to whether the BR register is zero; this counter is used in the instruction return stage to put the instruction back in its original location. If BR = 0 then PPC = PPC + 1, otherwise PPC = PPC + BR.

(2) Instruction decode. The control logic generates the various control signals from the instruction opcode.

(3) Register read. The registers are read according to their register numbers. To prevent data from being duplicated or information from being overwritten, register reads and writes are both based on exchange, i.e. the contents of the read/write data buffer and the register are swapped. At most three registers can be read simultaneously. In addition, a jump instruction's update of the PC register also takes place in this stage: a jump instruction does not update PC directly but updates BR, and PC is then updated according to whether BR is zero. If BR = 0, PC is not updated; otherwise PC = PC + BR.

(4) Execute/memory access. No instruction in this instruction set needs both the execute stage and the memory access stage, so the two stages can be merged. The ALU is implemented with reversible logic circuits and accepts at most three parameters. Memory access, like register access, is based on swap operations, i.e. overwriting and copying are not allowed.

(5) Register write. The ALU results are written to the registers. Because the contents of the three registers to be written were cleared in the register read stage, no information is erased.

(6) Instruction encode. The inverse control logic restores the opcode from the recovered operation signals, and the opcode and operands are then merged to regenerate the original instruction; the process is the reverse of instruction decode.

(7) Instruction return. The instruction is put back into the instruction register according to the value of the PPC counter.
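The counter update rules stated in stages (1) and (3) can be written out as a minimal sketch. The function names are hypothetical, and the sketch models only the arithmetic on PC, PPC and BR, not when BR is set or cleared.

```python
def fetch_update(pc: int, ppc: int, br: int) -> tuple:
    """Instruction fetch: PC advances by one; PPC advances by 1, or by BR if BR is non-zero."""
    new_pc = pc + 1
    new_ppc = ppc + 1 if br == 0 else ppc + br
    return new_pc, new_ppc


def register_read_update(pc: int, br: int) -> int:
    """Register read: PC is left alone when BR is zero, otherwise redirected by BR."""
    return pc if br == 0 else pc + br
```

Keeping the jump distance in BR rather than overwriting PC means the same distance is available to update PPC, so the address the instruction came from can still be reached in the instruction return stage.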

The instruction pipeline is shown in Fig. 2, from which it can be seen that the execution stages of a reversible instruction are symmetric. In Fig. 1, the IF/ID pipeline register holds the data passed between the instruction fetch stage and the instruction decode stage; the ID/RR pipeline register holds the data passed between the instruction decode stage and the register read stage; the RR/EXE pipeline register holds the data passed between the register read stage and the execute/memory access stage; the EXE/WR pipeline register holds the data passed between the execute/memory access stage and the register write stage; the WR/IE pipeline register holds the data passed between the register write stage and the instruction encode stage; and the IE/IR pipeline register holds the data passed between the instruction encode stage and the instruction return stage.

The instruction pipeline simulation logic of the pipelined reversible CPU is as follows:

Each loop iteration represents one clock cycle of the instruction pipeline. The seven methods in the loop each model one pipeline stage, and they are called in the reverse of the pipeline stage order. Reverse order is used because the temporary data of each stage of an instruction is held in the pipeline registers: if the stages were simulated in pipeline order, a subsequent instruction would overwrite the temporary data of the preceding instruction in the pipeline registers and corrupt its execution.
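A minimal sketch of such a reverse-order simulation loop is given below. It is an illustration under simplifying assumptions, not the patent's listing: instructions are opaque strings, each pipeline register holds at most one instruction, the stage work itself is omitted, and all names are hypothetical.

```python
STAGES = ["IF", "ID", "RR", "EXE", "WR", "IE", "IR"]


def run_pipeline(program: list) -> list:
    """Return (cycle, instruction) pairs in the order instructions leave the IR stage."""
    # One slot per pipeline register: IF/ID, ID/RR, RR/EXE, EXE/WR, WR/IE, IE/IR.
    latches = [None] * (len(STAGES) - 1)
    pending = list(program)
    retired = []
    cycle = 0
    while pending or any(slot is not None for slot in latches):
        cycle += 1
        # Last stage first: IR drains the IE/IR pipeline register.
        if latches[-1] is not None:
            retired.append((cycle, latches[-1]))
            latches[-1] = None
        # Middle stages, still back to front: each consumes its input register
        # only after the stage ahead of it has already emptied the output one.
        for i in range(len(latches) - 1, 0, -1):
            latches[i], latches[i - 1] = latches[i - 1], None
        # First stage last: IF fills the now empty IF/ID pipeline register.
        if pending:
            latches[0] = pending.pop(0)
    return retired
```

Processing the stages in pipeline order instead would let the fetch stage refill IF/ID before the decode stage had consumed it, which is the overwrite problem the description warns about.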

A reversible CPU can execute instructions forward and also in reverse. Since executing an instruction in reverse is equivalent to executing its inverse instruction forward, the pipelined reversible CPU of the present invention adds an inverse instruction mapping table: if the pipelined reversible CPU receives a reverse-execution signal, it looks up the opcode of the inverse instruction in this table and generates the operation signals from that opcode. No dedicated arithmetic unit or transfer bus is needed for reverse execution, which reduces circuit complexity. Therefore, when an instruction is executed in reverse, only the execute stage performs the inverse operation (for example, the ADD instruction adds when executed forward and subtracts when executed in reverse); the operations of all other stages are unchanged.
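The table lookup can be sketched as follows. Only the ADD/subtract pairing comes from the description; the mnemonics SUB and XOR, the in-place semantics Rsd = Rsd op Rs and the 32-bit wrap-around are assumptions made for illustration.

```python
# Hypothetical inverse instruction mapping table (self-inverse ops map to themselves).
INVERSE_OPCODE = {"ADD": "SUB", "SUB": "ADD", "XOR": "XOR"}
MASK32 = 0xFFFFFFFF


def effective_opcode(opcode: str, dir_flag: int) -> str:
    """DIR = 0 keeps the opcode; DIR = 1 swaps in the inverse instruction's opcode."""
    return opcode if dir_flag == 0 else INVERSE_OPCODE[opcode]


def execute(opcode: str, rsd: int, rs: int, dir_flag: int = 0) -> int:
    """Execute stage only: the same ALU paths serve both directions."""
    op = effective_opcode(opcode, dir_flag)
    if op == "ADD":
        return (rsd + rs) & MASK32
    if op == "SUB":
        return (rsd - rs) & MASK32
    if op == "XOR":
        return rsd ^ rs
    raise ValueError(f"unknown opcode: {op}")


forward = execute("ADD", 10, 5, dir_flag=0)        # forward ADD
restored = execute("ADD", forward, 5, dir_flag=1)  # same instruction, run in reverse
```

Running the same instruction with DIR = 1 undoes the forward result, and nothing outside the opcode lookup has to know which direction is active.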

In the microarchitecture of a CPU, certain problems that arise when the instructions in the pipeline execute out of order can lead to incorrect results. These problems are known as hazards, and there are three main kinds: structural hazards, data hazards and control hazards.

Structural hazards arise when different pipeline stages use the same physical resource. The pipelined reversible CPU of the present invention stipulates that the instruction memory and the register file are written in the first half of the clock cycle and read in the second half, which resolves the structural hazards caused by the instruction register and the register file.

A data hazard occurs when an execution stage needs data that depends on an instruction still in the pipeline. The data hazards of this reduced-instruction-set pipeline fall into two classes: Read-Read hazards and Write-Read hazards.

(1) Read-Read hazard

To ensure that exactly one copy of each datum exists, register reads in a reversible setting are exchange-based, i.e. a register becomes zero once its content has been read out. A Read-Read hazard occurs when an operand (source or destination) needed by an instruction is a source operand of the preceding instruction; this hazard does not arise in an ordinary pipeline. Fig. 3 shows the pipelined execution of the following instruction sequence.

ANDX R5, R2, R3

ANDX R4, R3, R2

Suppose that before the instructions execute, the values of R2 and R3 are 10 and 5 respectively. In clock cycle CC3 the first instruction reads registers R2 and R3, after which both R2 and R3 hold 0. In clock cycle CC4 the second instruction therefore cannot read the original contents of R2 and R3 unless it is stalled for one clock cycle. In Fig. 3, the thick line with a dot at each end marks this Read-Read hazard. The problem is solved by placing a Read-Read forwarding unit in the register read stage; the bold arrow in Fig. 3 represents forwarding.
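The effect of exchange-based reads on this instruction pair can be reproduced with a short sketch. The helper name and the dictionary model of the register file are assumptions; the values 10 and 5 are the ones used in the example above.

```python
def swap_read(registers: dict, name: str, buffer_value: int = 0) -> int:
    """Exchange a register with the read buffer: return the old value, leave buffer_value behind."""
    value = registers[name]
    registers[name] = buffer_value
    return value


regs = {"R2": 10, "R3": 5}
# CC3: the first ANDX reads R2 and R3.
first_operands = (swap_read(regs, "R2"), swap_read(regs, "R3"))
# CC4: the second ANDX reads R3 and R2 without any forwarding.
second_operands = (swap_read(regs, "R3"), swap_read(regs, "R2"))
```

The first read obtains the real values, while the second one sees only zeros because the registers were emptied by the exchange; this is the stale read that forwarding has to repair.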

The detection logic for this Read-Read hazard is as follows:

If ({ID/RR.Rsd, ID/RR.Rs, ID/RR.Rt} ∩ {RR/EXE.Rs, RR/EXE.Rt} ≠ ∅)

Copy data from RR/EXE to RR/EXE after a clock cycle

That is, when an operand in the ID/RR pipeline register and a source operand in the RR/EXE pipeline register come from the same register, the relevant data in the RR/EXE pipeline register is copied and, after a delay of one clock cycle, forwarded back to the RR/EXE pipeline register for use by the next instruction. As shown in Fig. 4, the Read-Read forwarding unit detects whether the register number fields in the ID/RR pipeline register (Rsd, Rs and Rt) intersect the source operand register number fields of RR/EXE (Rs and Rt). If they intersect, the IF signal is set to True, which controls the multiplexer Mux to select the data forwarded from the RR/EXE pipeline register as the operand of this reversible execution stage; if they do not intersect, the corresponding data from the register file is selected as the operand of this reversible execution stage. Although Fig. 4 shows only the multiplexer that selects the source of operand Rsd, Rs and Rt each have a corresponding multiplexer as well.
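The intersection test and the multiplexer choice can be sketched as below. The dictionary layout of the pipeline registers, the function names and the operand order ANDX Rsd, Rs, Rt are assumptions made for illustration.

```python
def read_read_hazards(id_rr: dict, rr_exe: dict) -> set:
    """Register numbers needed by the ID/RR instruction that RR/EXE has just read as sources."""
    needed = {id_rr["Rsd"], id_rr["Rs"], id_rr["Rt"]}
    just_read = {rr_exe["Rs"], rr_exe["Rt"]}
    return needed & just_read


def select_operand(reg: int, register_file: dict, rr_exe_values: dict, hazards: set) -> int:
    """Mux: take the value forwarded from RR/EXE on a hazard, else read the register file."""
    return rr_exe_values[reg] if reg in hazards else register_file[reg]


# ANDX R5, R2, R3 sits in RR/EXE; ANDX R4, R3, R2 sits in ID/RR.
rr_exe = {"Rsd": 5, "Rs": 2, "Rt": 3}
id_rr = {"Rsd": 4, "Rs": 3, "Rt": 2}
rr_exe_values = {2: 10, 3: 5}               # values the first instruction read out
register_file = {2: 0, 3: 0, 4: 0, 5: 0}    # R2 and R3 were emptied by that read

hazards = read_read_hazards(id_rr, rr_exe)
rs_value = select_operand(id_rr["Rs"], register_file, rr_exe_values, hazards)
rsd_value = select_operand(id_rr["Rsd"], register_file, rr_exe_values, hazards)
```

Here R2 and R3 are flagged, so the second instruction takes their values from RR/EXE, while R4 is unaffected and still comes from the register file.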

(2) Write-Read hazard

This hazard occurs when an operand (source or destination) needed by an instruction is the destination operand of the preceding instruction. Without forwarding, the instruction would be stalled for one clock cycle. For example:

ANDX R5, R2, R3

ANDX R4, R5, R6

In Fig. 5, the thick line with a dot at each end marks this Write-Read hazard. The problem is solved by placing a Write-Read forwarding unit in the execute/memory access stage; the bold arrow in Fig. 5 represents forwarding.

The forwarding logic for this Write-Read hazard is as follows:

If (EXE/WR.Rsd = RR/EXE.Rs or EXE/WR.Rsd = RR/EXE.Rt or EXE/WR.Rsd = RR/EXE.Rsd)

Copy data from EXE/WR to ALU's input // forward the data to the ALU

The Write-Read forwarding unit determines whether any operand in the RR/EXE pipeline register and the destination operand in the EXE/WR pipeline register come from the same register. If they do, the corresponding data stored in the EXE/WR pipeline register is forwarded to the input of the ALU; if they do not, the corresponding data stored in the RR/EXE pipeline register is forwarded to the input of the ALU.
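A minimal sketch of this selection, applied to the ANDX pair above, is shown below. The data layout (a register number and a latched value per operand), the function name and the result value 99 are hypothetical and serve only to make the forwarding visible.

```python
def forward_alu_inputs(rr_exe: dict, exe_wr: dict) -> dict:
    """Pick each ALU input from EXE/WR when its register matches the previous destination."""
    inputs = {}
    for field in ("Rsd", "Rs", "Rt"):
        reg, latched_value = rr_exe[field]
        if reg == exe_wr["Rsd"]:
            inputs[field] = exe_wr["result"]   # forwarded from EXE/WR
        else:
            inputs[field] = latched_value      # value read in the register read stage
    return inputs


# ANDX R5, R2, R3 has just produced a result for R5 (99 is a made-up value).
exe_wr = {"Rsd": 5, "result": 99}
# ANDX R4, R5, R6 read R5 too early, so its latched value (0 here) is stale.
rr_exe = {"Rsd": (4, 0), "Rs": (5, 0), "Rt": (6, 7)}
alu_inputs = forward_alu_inputs(rr_exe, exe_wr)
```

Only the Rs input is replaced, because R5 is the one register shared with the preceding instruction's destination; the other two inputs keep the values latched in RR/EXE.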

Because the execute and memory access stages overlap in the pipeline of the pipelined reversible CPU of the present invention, the Write-Read hazards that occur when a memory access instruction immediately follows an arithmetic/logic instruction, or an arithmetic/logic instruction immediately follows a memory access instruction, are handled in the same way. The data path of this Write-Read forwarding unit is shown in Fig. 6. Fig. 6 shows only one multiplexer Mux, but the other two operands of the ALU have similar multiplexers; each multiplexer selects the source of an ALU operand according to the hazard detection result.

A control hazard occurs when the instruction pipeline encounters a jump instruction. The address of the instruction following a jump can only be computed in the register write stage, so the pipeline would stall for four clock cycles whenever it encounters a conditional jump instruction. Because the computation involved in a jump instruction is relatively simple, dedicated circuitry can be used instead of the ALU, which moves the computation of the next instruction address forward to the register read stage; even so, the pipeline still stalls for two clock cycles.

To improve on this, the condition of a conditional jump instruction is always assumed not to hold: the instruction immediately following the jump in the instruction stream enters the pipeline at the next clock cycle regardless of how the jump turns out. If the register read stage confirms that the conditional jump is indeed not taken, the pipeline does not stall at all; if the jump is taken, the two instructions that have already entered the pipeline are wrong, which is equivalent to stalling the pipeline for two clock cycles.

The pipelined reversible CPU moves the condition evaluation and address computation of jump instructions forward to the register read stage, which mitigates control hazards to some extent.

Unlike a single-cycle CPU, the pipelined reversible CPU uses pipelining, which greatly speeds up program execution and shortens program execution time. Because the reversible CPU of the present invention executes operations using reversible instructions, it has good heat dissipation behavior, and the added data hazard detection and forwarding unit effectively resolves data hazards, ensuring stable data and accurate computation.

The above are only preferred embodiments of the present invention. It should be pointed out that a person of ordinary skill in the art can make modifications and improvements without departing from the inventive concept, and these all fall within the protection scope of the present invention.
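The cost of the assume-not-taken policy can be estimated with a back-of-the-envelope sketch. It assumes an otherwise ideal seven-stage pipeline with no data-hazard stalls, and the function names are hypothetical.

```python
def jump_penalty_cycles(taken: bool, flush_penalty: int = 2) -> int:
    """Assume-not-taken: a jump costs cycles only when it is actually taken."""
    return flush_penalty if taken else 0


def total_cycles(num_instructions: int, taken_jumps: int, stages: int = 7) -> int:
    """Fill the pipeline once, then one instruction per cycle, plus flushes for taken jumps."""
    if num_instructions == 0:
        return 0
    return stages + (num_instructions - 1) + taken_jumps * jump_penalty_cycles(True)


example = total_cycles(num_instructions=10, taken_jumps=2)
```

Under these assumptions ten instructions with two taken jumps need 7 + 9 + 4 = 20 cycles, against 16 cycles when no jump is taken.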

Claims

1. A pipelined reversible CPU design and simulation system, characterized in that the pipelined reversible CPU operates using a plurality of reversible instructions, the reversible instructions are executed by a plurality of parallel pipelines, and the pipeline comprises a plurality of execution stages.

2. The pipelined reversible CPU design and simulation system according to claim 1, characterized in that the reversible instructions comprise:

Jump instructions, for performing instruction jumps;

Memory access instructions, for directly accessing the data memory of the pipelined reversible CPU.

3. The pipelined reversible CPU design and simulation system according to claim 2, characterized in that the pipeline comprises seven execution stages, in order: instruction fetch, instruction decode, register read, execute/memory access, register write, instruction encode and instruction return.

4. The pipelined reversible CPU design and simulation system according to claim 3, characterized in that the execution stages of the pipeline are symmetric.

5. The pipelined reversible CPU design and simulation system according to any one of claims 1-4, characterized in that the reversible CPU is provided with an inverse instruction mapping table, and if the pipelined reversible CPU receives a reverse-execution signal, it looks up the opcode of the inverse instruction in the inverse instruction mapping table and generates operation signals from the inverse instruction opcode.

6. The pipelined reversible CPU design and simulation system according to claim 6, characterized in that the pipelined reversible CPU comprises:

An arithmetic unit, for performing arithmetic operations;

A register file, for storing intermediate data during computation;

A data memory, for storing data;

An instruction memory, for storing the reversible instructions;

Control logic, for generating operation signals from the opcode;

A DIR direction register, for controlling the execution direction;

A PC program counter, for pointing to the current reversible instruction;

A PPC previous program counter, for pointing to the reversible instruction preceding the current reversible instruction;

A BR jump register, for implementing instruction jumps.

7. The pipelined reversible CPU design and simulation system according to claim 6, characterized in that the pipelined reversible CPU comprises:

An IF/ID pipeline register, for holding the data passed between the instruction fetch stage and the instruction decode stage;

An ID/RR pipeline register, for holding the data passed between the instruction decode stage and the register read stage;

An RR/EXE pipeline register, for holding the data passed between the register read stage and the execute/memory access stage;

An EXE/WR pipeline register, for holding the data passed between the execute/memory access stage and the register write stage;

A WR/IE pipeline register, for holding the data passed between the register write stage and the instruction encode stage;

An IE/IR pipeline register, for holding the data passed between the instruction encode stage and the instruction return stage.

8. The pipelined reversible CPU design and simulation system according to claim 7, characterized in that the pipelined reversible CPU also comprises a data hazard detection and forwarding unit.

9. The pipelined reversible CPU design and simulation system according to claim 8, characterized in that the data hazard detection and forwarding unit comprises:

A Read-Read forwarding unit, for detecting whether the register number fields in the ID/RR pipeline register intersect the source operand register number fields of the RR/EXE;

If they intersect, the data forwarded from the RR/EXE pipeline register is selected as the operand of this reversible execution stage;

If they do not intersect, the corresponding data from the register file is selected as the operand of this reversible execution stage;

A Write-Read forwarding unit, which determines whether any operand in the RR/EXE pipeline register and the destination operand in the EXE/WR pipeline register come from the same register;

If they come from the same register, the corresponding data stored in the EXE/WR pipeline register is forwarded to the input of the arithmetic unit;

If they do not come from the same register, the corresponding data stored in the RR/EXE pipeline register is forwarded to the input of the arithmetic unit.