CN100552621C - An ALU implemented with an asynchronous circuit - Google Patents

An ALU implemented with an asynchronous circuit

Info

Publication number
CN100552621C
Authority
CN
China
Prior art keywords
data
unit
alu
receiving end
delay
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2008101144688A
Other languages
Chinese (zh)
Other versions
CN101303643A (en)
Inventor
高丽江
陈虹
陈弘毅
王志华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Research Institute Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CNB2008101144688A
Publication of CN101303643A
Application granted
Publication of CN100552621C
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 - Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3867 - Concurrent instruction execution, e.g. pipeline or look ahead using instruction pipelines
    • G06F9/3871 - Asynchronous instruction pipeline, e.g. using handshake signals between stages

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Logic Circuits (AREA)
  • Information Transfer Systems (AREA)

Abstract

The present invention relates to an ALU implemented with an asynchronous circuit, comprising: a functional unit, which receives operation control instructions and input data from a sending end and performs arithmetic and logic operations on the input data; a delay estimation unit, which estimates the delay from the type of operation control instruction received by the functional unit and the specific input data; a delay matching unit, which selects an operation delay matched to the functional unit; and a handshake protocol unit, which, when the sending end sends input data, controls the sending end to generate a request signal so that the input is synchronized, notifies the receiving end after the operation delay to take the output data of the functional unit, and controls the receiving end to generate an acknowledge signal so that the output is synchronized. The ALU of the invention offers high performance, avoiding the performance loss of synchronous circuits, which must be clocked for the worst-case delay; it also has low power consumption and saves circuit area.

Description

An ALU implemented with an asynchronous circuit
Technical field
The present invention relates to the field of microprocessor data paths, and in particular to an ALU implemented with an asynchronous circuit.
Background technology
The arithmetic-logic unit (ALU) is the execution unit of the central processing unit (CPU) and a core component of every CPU. Its main function is to operate on binary data. In a processor, the ALU typically performs arithmetic and logic operations such as addition, subtraction, comparison, shift, AND, OR and XOR. Because the arithmetic circuitry is relatively complex and the ALU is used very frequently in a microprocessor, the ALU often becomes the bottleneck when improving microprocessor performance, so its design deserves careful attention and continual refinement.
Integrated circuits can be divided into synchronous circuits and asynchronous circuits according to implementation style. In a synchronous circuit, the system uses a global clock to control every functional component and achieve the necessary synchronization. An asynchronous circuit uses a handshake protocol to achieve synchronization, communication and sequencing among functional components. Over the history of integrated circuits, synchronous circuits became the mainstream of circuit design because the approach is simple and the theory is mature.
In conventional circuits, the ALU is generally implemented as a synchronous circuit. As integrated circuits have entered the deep-submicron era and feature sizes have shrunk, synchronous design faces several problems. First, the clock period of a synchronous system is determined by the critical path (the longest path delay, here the delay of the most complex operation). Such a system adapts poorly: it cannot exploit best-case or average-case path delays, and so loses part of the achievable performance. In a typical ALU, arithmetic operations take the longest, while logic and shift operations take less time; in a synchronous circuit, however, arithmetic, logic and shift operations must all run at the clock period set by the longest path (the arithmetic operation), which wastes performance. Second, as feature sizes shrink, clock frequencies rise; according to the power formula (1):
$$P = 0.5\,\alpha f C V_{dd}^{2} \tag{1}$$
where P is the power consumption, α is the switching activity, f is the clock frequency, C is the circuit capacitance and V_dd is the supply voltage. As the clock frequency rises, power consumption rises with it, and power has become a major challenge for integrated circuit development. In addition, integrated circuits keep growing in scale, and for large designs the cost of clock distribution grows as well. The clock tree that distributes the clock takes up an increasing share of the total circuit, and the power it consumes accounts for a considerable share of total power. Third, another clock-related problem is clock skew: the difference in the arrival time of a clock transition at different locations in the integrated circuit. Clock skew strongly affects both the performance and the functionality of sequential systems.
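As a rough numerical illustration of formula (1), the sketch below evaluates the dynamic power at two clock frequencies. The function name and every numeric value are assumptions chosen only for illustration; none of them comes from the patent.

```python
def dynamic_power(alpha, f_hz, c_farad, vdd_volt):
    """Dynamic power P = 0.5 * alpha * f * C * Vdd^2, as in formula (1)."""
    return 0.5 * alpha * f_hz * c_farad * vdd_volt ** 2


# Hypothetical operating point: 10% switching activity, 1 nF of switched
# capacitance and a 1.8 V supply. Only the clock frequency differs.
p_100mhz = dynamic_power(0.1, 100e6, 1e-9, 1.8)
p_200mhz = dynamic_power(0.1, 200e6, 1e-9, 1.8)

# With everything else fixed, power scales linearly with clock frequency.
print(p_100mhz, p_200mhz, p_200mhz / p_100mhz)
```

With the other terms held constant, doubling f doubles P, which is the dependence the text relies on.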
Compared with synchronous circuits, asynchronous circuits have the following advantages.
First, they have the potential for high performance. The performance of an asynchronous circuit depends on the actual speed of each functional module: as soon as one module finishes, the next can start working, so the circuit achieves average-case performance, whereas the performance of a synchronous circuit is set by the longest path delay.
Second, they have low power consumption. A synchronous circuit works under a global clock whose frequency must satisfy the peak load, which wastes power. Synchronous clock gating can only exert coarse-grained control, so its power savings are limited. An asynchronous circuit is data-driven and consumes energy only when there is data to process, so it has the potential for low power. It can also switch rapidly between a zero-power idle state and its maximum-throughput state without any auxiliary mechanism, which suits applications that are frequently on standby.
Third, they avoid the clock skew problem. As systems on chip grow and interconnect delay takes a larger share of total circuit delay, clock skew in synchronous circuits becomes harder to control and design becomes more difficult. Asynchronous circuits eliminate the clock and so remove the clock skew problem at its root.
Fourth, they are highly modular, which gives flexibility when designing complex circuits. Asynchronous modules carry all timing and data information at their interfaces, whereas synchronous circuits carry only the required data information at the interface. Asynchronous modules can be connected as long as their interfaces match and they use the same handshake protocol, while synchronous circuits are constrained by factors such as clock mismatch. Modularity is a major advantage of asynchronous circuits: it makes asynchronous modules reusable and allows a slow module to be revised on its own, improving performance without affecting the rest of the design.
Fifth, they are insensitive to signal delay and adapt well to fine-linewidth processes. At deep-submicron linewidths, the signal delay caused by wire capacitance and routing exceeds the delay of the circuit cells and becomes dominant. Asynchronous circuits communicate through handshake signals, so circuit delay affects only operating speed, not circuit behavior, and the circuits are insensitive to process variation.
Sixth, they have good electromagnetic compatibility, because their radiated spectrum contains little energy and is well dispersed. An asynchronous circuit is not locked to a fixed frequency, so its radiated power is spread across a wide range rather than concentrated in a narrow band.
In known asynchronous ALUs, the handshake signals are all implemented with dual-rail encoding, in either its two-phase or its four-phase form, and both forms suffer from large area. See, for example, "An ALU Design using a Novel Asynchronous Pipeline Architecture" (Tin-Yau TANG, Chiu-Sing CHOY, Jan BUTAS, Cheong-Fat CHAN, ISCAS, Vol. 5, 2000, pp. 361-364) and "Asynchronous Design Methodology for an Efficient Implementation of Low Power ALU" (P. Manikandan, B. D. Liu, L. Y. Chiou, G. Sundar, C. R. Mandal, APCCAS, Dec 2006, pp. 590-593).
The dual-rail protocol represents each data bit with two wires and encodes the request signal into the data signals. Each data bit d uses two wires (d.t and d.f), which together carry both the data and the request. The request in every handshake cycle is produced jointly by d.t and d.f. Taken together, the two wires form a codeword {x.f, x.t}. {x.f, x.t} = {0, 1} and {x.f, x.t} = {1, 0} represent "valid data" (logic 0 and logic 1 respectively), and {x.f, x.t} = {0, 0} represents "no data" (also called "spacer", "null" or "empty"). The codeword {x.f, x.t} = {1, 1} is unused and is an illegal state. Because this protocol uses two wires per data bit, the circuit is necessarily larger than with a bundled-data protocol and so occupies more area. For example, a dual-rail AND gate needs 40 transistors, more than six times the 6 transistors of the standard CMOS AND gate used in a bundled-data structure.
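The codeword states described above can be sketched in a few lines of Python. This is only an illustrative model: the function names are hypothetical, the "incomplete" state for a partially arrived word is an added label, and the wire count ignores the acknowledge wire that both protocols need.

```python
def classify(f, t):
    """Classify one dual-rail wire pair {x.f, x.t}."""
    if (f, t) == (0, 0):
        return "empty"      # no data (spacer)
    if (f, t) == (1, 1):
        return "illegal"    # unused codeword
    return "valid"          # exactly one rail is high: a data bit is present


def word_state(pairs):
    """State of a whole word, given one (f, t) pair per bit."""
    states = {classify(f, t) for f, t in pairs}
    if "illegal" in states:
        return "illegal"
    if states == {"valid"}:
        return "valid"       # every bit has arrived: this acts as the request
    if states == {"empty"}:
        return "empty"
    return "incomplete"      # some bits valid, others still empty


def wire_count(bits, dual_rail):
    """Forward wires needed to carry `bits` data bits plus the request."""
    # Dual rail: two wires per bit, request embedded in the data.
    # Bundled data: one wire per bit plus one separate request wire.
    return 2 * bits if dual_rail else bits + 1


print(word_state([(0, 1), (1, 0)]), wire_count(16, True), wire_count(16, False))
```

For a 16-bit word this model needs 32 forward wires with dual-rail encoding against 17 with bundled data, which is the area argument the text makes at the wire level.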
Depending on the asynchronous implementation style, some asynchronous circuits instead use two-phase bundled data for their handshake signals. Two-phase bundled data requires components that are sensitive to signal transitions, such as event-controlled registers, and transition-sensitive components are more complex to design than ordinary level-sensitive ones. In addition, the conditional control logic that responds to signal transitions in a two-phase bundled-data circuit is also very complex.
Summary of the invention
The purpose of the present invention is to provide an ALU implemented with an asynchronous circuit that overcomes the shortcomings of synchronous ALUs while keeping the circuit structure simple and saving circuit area.
To achieve this purpose, the present invention adopts the following technical solution:
An ALU implemented with an asynchronous circuit, which receives data sent by a sending end, performs arithmetic and logic operations on the data and outputs the result to a receiving end, the ALU comprising: a functional unit, which receives operation control instructions and input data from the sending end and performs arithmetic and logic operations on the input data according to the operation control instruction; a delay estimation unit, which estimates the delay according to the type of operation control instruction received by the functional unit and the input data; a delay matching unit, which selects, according to the delay estimated by the delay estimation unit, an operation delay matched to the functional unit, and uses it to delay the output of the request signal generated by the sending end; and a handshake protocol unit, which, when the sending end sends input data, controls the sending end to generate a request signal so that the input is synchronized, outputs the request signal after the operation delay to notify the receiving end to take the output data of the functional unit, and controls the receiving end to generate an acknowledge signal so that the output is synchronized.
Wherein the sending end uses data lines to transmit the input data and a request signal line to transmit the request signal, and the receiving end uses data lines to receive the output data of the functional unit and an acknowledge signal line to transmit the acknowledge signal.
Wherein the handshake protocol unit synchronizes using the four-phase bundled-data handshake protocol and comprises: a first request signal unit, which, when the sending end sends input data, sets the request signal generated by the sending end high and, after the operation delay, makes the ALU output a high-level request signal; a first acknowledge signal unit, which, after the ALU outputs the high-level request signal, notifies the receiving end to take the output data of the functional unit and sets the acknowledge signal of the receiving end high; a second request signal unit, which, after the acknowledge signal of the receiving end goes high, sets the high-level request signal of the sending end low; and a second acknowledge signal unit, which, once the request signal of the sending end goes low, sets the high-level acknowledge signal of the receiving end low.
Wherein the functional unit comprises: a multiplexed arithmetic unit, whose input receives the input data from the sending end over the data lines, and which performs arithmetic and logic operations on the input data and outputs the results at its output; and a multiplexer, connected to the sending end and to the output of the multiplexed arithmetic unit, which receives the operation control instruction sent by the sending end and selects the output of the multiplexed arithmetic unit according to that instruction.
Wherein the multiplexed arithmetic unit comprises: an AND gate that executes the AND instruction; an XOR gate that executes the XOR instruction; and a carry lookahead adder that executes the add instruction. The inputs of the AND gate, the XOR gate and the carry lookahead adder each receive the input data from the sending end over the data lines, and their outputs are each connected to the multiplexer. The carry lookahead adder comprises a carry generate unit built from AND gates and a carry propagate unit built from XOR gates; the AND gate that executes the AND instruction reuses the AND gates in the carry lookahead adder, and the XOR gate that executes the XOR instruction reuses the XOR gates in the carry lookahead adder.
Wherein the multiplexed arithmetic unit further comprises: an OR gate that executes the OR instruction; a shifter that executes the shift instruction; and a subtracter that executes the subtract or compare instruction. The inputs of the OR gate and the shifter each receive the input data from the sending end over the data lines, and their outputs are each connected to the multiplexer. The subtracter is formed by connecting the carry lookahead adder with inverters: the input of the inverters receives the input data from the sending end over the data lines, the output of the inverters is connected to an input of a multiplexer, and the output of that multiplexer is connected to an input of the carry lookahead adder. This multiplexer receives the operation control instruction from the sending end and, when the instruction is a subtract instruction, selects the inverter output so that the inverted input data is sent to the carry lookahead adder for subtraction.
Wherein the functional unit further comprises an overflow detector connected to the carry lookahead adder, which gives an overflow indication when it detects that an addition or subtraction performed by the carry lookahead adder exceeds the representable range.
Wherein the overflow detector is implemented with an XOR gate circuit.
Wherein the delay matching unit increases the delay estimated by the delay estimation unit by 25% to 35% and uses the result as the operation delay matched to the functional unit.
Wherein, after the functional unit performs the arithmetic or logic operation, the handshake protocol unit enables a latch connected to the functional unit according to the operation delay, and the latch latches the data.
The ALU implementation of the present invention achieves high performance in the statistical sense, avoiding the performance loss of synchronous circuits, which must accommodate the worst-case delay. When this ALU is used in a data processing system, it also offers low power consumption because it is an asynchronous circuit. Reusing the internal modules of the carry lookahead adder (CLA) for the AND and XOR instructions further reduces circuit area.
Description of drawings
Fig. 1 is a schematic diagram of the bundled-data protocol;
Fig. 2 is a schematic diagram of the four-phase bundled-data handshake protocol;
Fig. 3 is a schematic structural diagram of the ALU implemented with an asynchronous circuit according to the present invention;
Fig. 4 is a schematic diagram of the ALU structure and signals in the embodiment;
Fig. 5 is a structural diagram of the functional unit in the embodiment;
Fig. 6 is a structural diagram of the carry lookahead adder;
Fig. 7 is a circuit diagram of the AND gate that forms the carry generate unit;
Fig. 8 is a circuit diagram of the XOR gate that forms the carry propagate unit;
Fig. 9 is a schematic diagram of a four-phase bundled-data pipeline;
Fig. 10 is a schematic diagram of a four-phase bundled-data pipeline with functional units.
Embodiment
The ALU implemented with an asynchronous circuit proposed by the present invention is described below with reference to the drawings and an embodiment.
According to implementation style, asynchronous circuits can be divided into two-phase bundled data, four-phase bundled data, two-phase dual-rail and four-phase dual-rail. Four-phase bundled data implements the handshake protocol with data lines kept separate from the request signal line, and uses four transitions on the request and acknowledge signals to complete one handshake cycle.
Bundled data means that the data signals are encoded with ordinary Boolean values and that a link is established between the data and the request and acknowledge signals. As shown in Fig. 1, separate request and acknowledge signals are bundled with the data signals: the request signal is generated when the input data arrives, and the acknowledge signal is generated when the data has been received. In the four-phase bundled-data handshake protocol, shown in Fig. 2, the request and acknowledge signals also use ordinary Boolean values to encode information, and "four-phase" refers to the number of communication actions. Following the dashed lines in Fig. 2: (1) when data has arrived and is ready to send, the sending end sets the request signal high to indicate that data is available; (2) after some delay, the receiving end receives the data and sets the acknowledge signal high to indicate that it has done so; (3) after some delay, the sending end responds by setting the request signal low (from this point the data is no longer required to be valid and may change); (4) the receiving end responds by setting the acknowledge signal low. The sending end can then begin the next communication cycle.
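The four communication actions can be traced with a small behavioral sketch. It is a sequential software model written for illustration, not a circuit description; the function name and the trace format are assumptions, and real hardware would have delays between the steps.

```python
def four_phase_transfer(data):
    """Run one four-phase bundled-data handshake; return (received, trace)."""
    req, ack = 0, 0
    trace = []

    bus = data                    # sender drives the data lines first
    req = 1                       # (1) sender raises request: data is valid
    trace.append(("req", req))

    received = bus                # receiver takes the data while req is high
    ack = 1                       # (2) receiver raises acknowledge
    trace.append(("ack", ack))

    req = 0                       # (3) sender lowers request
    trace.append(("req", req))
    bus = None                    # data no longer has to stay valid

    ack = 0                       # (4) receiver lowers acknowledge
    trace.append(("ack", ack))

    # Both signals are low again, so the next cycle may start.
    return received, trace


received, trace = four_phase_transfer(0x1234)
print(hex(received), trace)
```

One transfer therefore costs four signal transitions, two on the request line and two on the acknowledge line, and both lines return to low before the next cycle.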
This embodiment implements the asynchronous ALU in the four-phase bundled-data style. Unlike synchronous designs, it uses asynchronous timing control, which achieves high performance, enables low power consumption in practical applications, overcomes the clock skew problem and saves circuit area.
Embodiment
Fig. 3 shows the structure of the asynchronous ALU of the present invention, which comprises: a functional unit, which receives operation control instructions and input data from the sending end and performs arithmetic and logic operations on the input data according to the operation control instruction; a delay estimation unit, which estimates the delay according to the type of operation control instruction received by the functional unit and the specific input data; a delay matching unit, which selects an operation delay matched to the functional unit according to the delay estimated by the delay estimation unit; and a handshake protocol unit, which, when the sending end sends input data, generates a request signal at the sending end so that the input is synchronized, notifies the receiving end after the operation delay to take the output data of the functional unit, and generates an acknowledge signal at the receiving end so that the output is synchronized.
Fig. 4 shows the structure of the asynchronous ALU of this embodiment. The relevant signals are the request signal req_in, the acknowledge signal ack, the output request signal req_out (req_in after the operation delay set by the delay unit), the data input din and the data output dout. The parts work as follows. The functional unit executes the ALU instruction and selects among the arithmetic and logic operators. The delay matching unit contains delay subunits of different sizes (the large, medium and small delays shown in Fig. 3), which in this embodiment are 2.2 ns, 1.0 ns and 0.8 ns respectively. The delay estimation unit analyzes the input data and the type of operation control instruction to obtain the delay the functional unit needs to execute the instruction, and the delay matching unit then selects the delay subunit that matches it. The ALU operates as follows: when input data arrives, req_in goes high and the functional unit starts executing the ALU instruction; the delay estimation unit estimates the delay the functional unit needs for that instruction and selects the delay subunit of matching size in the delay matching unit, which produces req_out to tell the receiving end that this stage's operation is complete; after "taking" the ALU's output, the receiving end raises ack in reply, and ack then goes low after req_out goes low. In this embodiment the data lines and the request and acknowledge signals are kept separate: the sending end uses data lines to transmit the input data and a request signal line to transmit the request signal, and the receiving end uses data lines to receive the output data of the functional unit and an acknowledge signal line to transmit the acknowledge signal.
When the functional unit performs an addition, its delay varies with the input data because the length of carry propagation varies: a carry that propagates through 16 bits gives the maximum delay, while one that propagates through 8 bits gives roughly half the maximum delay. Addition, subtraction and comparison instructions need 1.7 ns to execute; the shift instruction needs 0.6 ns; and the AND, OR and XOR operations need 0.77 ns. The delay estimation unit estimates the delay from the input data, and the delay matching unit then matches the estimate. The delay matching unit makes the actual delay somewhat longer than the estimate so that the output is indicated correctly across process, voltage and temperature conditions; the margin chosen here is 30%, which prevents the receiving end from taking the data before the operation has finished.
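The figures in this embodiment fit together arithmetically: adding 30% to 1.7 ns, 0.77 ns and 0.6 ns gives about 2.21 ns, 1.00 ns and 0.78 ns, close to the 2.2 ns, 1.0 ns and 0.8 ns delay subunits. The sketch below reproduces that matching. The instruction names, the nearest-value selection rule and the carry-chain helper are assumptions made for illustration; the patent does not spell out how the estimation unit maps input data to a delay.

```python
# Execution times and delay subunits quoted in the embodiment (nanoseconds).
EXEC_DELAY_NS = {
    "add": 1.7, "sub": 1.7, "cmp": 1.7,
    "shift": 0.6,
    "and": 0.77, "or": 0.77, "xor": 0.77,
}
DELAY_SUBUNITS_NS = (0.8, 1.0, 2.2)   # small, medium, large
MARGIN = 0.30                         # within the claimed 25% to 35% range


def matched_delay_ns(op):
    """Pick the delay subunit closest to the estimate plus margin.

    Nearest-value selection is an assumption: the quoted subunit values
    look like the margined estimates after rounding.
    """
    target = EXEC_DELAY_NS[op] * (1 + MARGIN)
    return min(DELAY_SUBUNITS_NS, key=lambda d: abs(d - target))


def longest_carry_chain(a, b, width=16):
    """Longest run of bit positions that a single carry travels through.

    Hypothetical helper showing why add delay depends on the data:
    a carry starts where both bits are 1 and continues while exactly
    one of the two bits is 1.
    """
    longest = run = 0
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        if ai & bi:
            run = 1                   # generate: a new carry starts here
        elif (ai ^ bi) and run:
            run += 1                  # propagate: the carry keeps going
        else:
            run = 0                   # carry absorbed, or nothing to carry
        longest = max(longest, run)
    return longest


print(matched_delay_ns("add"), matched_delay_ns("xor"), matched_delay_ns("shift"))
print(longest_carry_chain(0xFFFF, 1), longest_carry_chain(0x00FF, 1))
```

The two carry-chain calls correspond to the 16-bit and 8-bit propagation cases mentioned in the text.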
Fig. 5 shows the internal structure of the functional unit of the ALU in this embodiment. It comprises a multiplexed arithmetic unit, whose input receives the input data from the sending end over the data lines and which performs arithmetic and logic operations on the input data and outputs the results at its output; and a multiplexer, connected to the sending end and to the output of the multiplexed arithmetic unit, which receives the operation control instruction sent by the sending end and selects the output of the multiplexed arithmetic unit according to that instruction. The multiplexed arithmetic unit comprises an AND gate that executes the AND instruction, an OR gate that executes the OR instruction, an XOR gate that executes the XOR instruction, a shifter that executes the shift instruction, a carry lookahead adder that executes the add instruction and a subtracter that executes the subtract or compare instruction. One input of the AND gate, OR gate, XOR gate, shifter and carry lookahead adder is connected to data line a of the sending end, their outputs are connected to the multiplexer, and their other input is connected to data line b through another multiplexer. In other words, the data is fed to all logic units, each performs its operation, and the multiplexer selects which of outputs 1, 2, 3 and 4 provides the result according to the operation control instruction.
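The "compute everything, then select" structure of the functional unit can be modeled as follows. This is a behavioral sketch under stated assumptions: the 16-bit width is inferred from the 16-bit carry propagation mentioned in the text, the instruction names are hypothetical, and the shift is assumed to be a logical left shift because the text does not give the direction.

```python
WIDTH = 16                     # assumed word width
MASK = (1 << WIDTH) - 1


def functional_unit(op, a, b):
    """Evaluate every operator on (a, b), then let the multiplexer pick one."""
    results = {
        "and": a & b,
        "or": a | b,
        "xor": a ^ b,
        # Assumption: logical left shift of a by b positions (modulo width).
        "shift": (a << (b % WIDTH)) & MASK,
        "add": (a + b) & MASK,
        # Subtraction reuses the adder: invert b and set the carry input to 1.
        "sub": (a + (~b & MASK) + 1) & MASK,
    }
    # The operation control instruction acts as the multiplexer select.
    return results[op]


print(functional_unit("and", 0b1100, 0b1010), functional_unit("sub", 3, 5))
```

In hardware all operators see the inputs at once, which is why the result dictionary is built in full before the selection; only the selected output is passed on.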
The add instruction is implemented with a carry-lookahead adder (CLA), a commonly used existing adder design. Fig. 6 shows the structure of the carry-lookahead adder. Its signal generation unit contains carry generation units and carry propagation units, which produce the carry generate signal and the carry propagate signal for each bit of the adder. The carry generation unit produces the carry output of the adder and the carry inputs for the three higher-order 4-bit adder units (the carry input of the lowest-order 4-bit adder unit is supplied by the carry input of the adder itself). The four 4-bit adder units derive each sum bit from the carry input and the outputs of the signal generation unit. Fig. 7 shows a carry generation unit formed by an AND gate: the AND gate receives the input data a and b and directly generates a carry when both are 1. Fig. 8 shows the carry propagation unit formed by an XOR gate. The implementation of the remaining parts is not described in detail here.

To save circuit area in this embodiment, the AND gate that executes the AND instruction reuses the AND gates in the carry-lookahead adder, and the XOR gate that executes the XOR instruction reuses the XOR gates in the carry-lookahead adder. That is, in addition to the connection between the carry-lookahead adder and the data lines that carry the input data from the sending end, the inputs of the AND gates and XOR gates inside the carry-lookahead adder are also connected through switches to the input data lines of the sending end. The multiplexer controls these switches according to the operation control instruction it receives and, when the corresponding operation is required, gates them so that the corresponding instruction is executed.

According to two's-complement arithmetic, subtraction can be implemented by two's-complement addition: the subtrahend is inverted and incremented by one, and the result is added to the minuend to obtain the difference. The subtractor is therefore obtained by adding an inverter to each bit of one addend input of the adder and setting the carry input of the least significant bit high (see Fig. 5). The comparator subtracts the two numbers to be compared and judges the result from the sign bit. When the sign bit of the difference is 1, the minuend input is less than the subtrahend input; otherwise the minuend input is greater than or equal to the subtrahend input. In this embodiment the subtractor or comparator is therefore formed by connecting the carry-lookahead adder and inverters. The input of the inverter receives the input data of the sending end through the data line, the output of the inverter is connected to an input of the multiplexer, and the output of the multiplexer is connected to the input of the carry-lookahead adder. The multiplexer receives the operation control instruction from the sending end. As shown in Fig. 5, the data from the sending end enter the multiplexer along two paths, one carrying the original input data and the other carrying the inverted input data. When the operation control instruction is a subtraction instruction, the multiplexer gates the inverter output and sends the inverted input data to the carry-lookahead adder to perform the subtraction. Reusing the adder in this way saves further circuit area.
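The adder and subtractor described above can be sketched in software as follows. This is only an illustrative model, not the patented circuit: the 16-bit width is inferred from the four 4-bit adder units, the function names are hypothetical, and the carry between 4-bit groups is passed sequentially here, whereas the hardware carry generation unit would compute the group carries in parallel.

```python
WIDTH = 16   # assumed word width: four 4-bit adder units
GROUP = 4    # bits per adder unit


def group_carries(g, p, c0):
    """Lookahead carries for one 4-bit group.

    g[i] is the generate signal (a AND b), p[i] the propagate signal
    (a XOR b). Every carry is written directly in terms of g, p and the
    group carry-in c0, which is the defining property of carry lookahead.
    """
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
          | (p[2] & p[1] & p[0] & c0))
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0])
          | (p[3] & p[2] & p[1] & p[0] & c0))
    return [c0, c1, c2, c3], c4


def cla_add(a, b, carry_in=0):
    """Return (sum, carry_out) of two WIDTH-bit words."""
    mask = (1 << WIDTH) - 1
    a, b = a & mask, b & mask
    result, carry = 0, carry_in
    for base in range(0, WIDTH, GROUP):
        g = [((a >> (base + i)) & (b >> (base + i))) & 1 for i in range(GROUP)]
        p = [((a >> (base + i)) ^ (b >> (base + i))) & 1 for i in range(GROUP)]
        carries, carry = group_carries(g, p, carry)
        for i in range(GROUP):
            # sum bit = propagate XOR carry into this bit
            result |= (p[i] ^ carries[i]) << (base + i)
    return result, carry


def cla_sub(a, b):
    """a - b computed as a + NOT(b) + 1 (inverters plus carry-in set high)."""
    mask = (1 << WIDTH) - 1
    return cla_add(a, ~b & mask, carry_in=1)


def less_than(a, b):
    """Compare by the sign bit of the difference (overflow is ignored here)."""
    diff, _ = cla_sub(a, b)
    return ((diff >> (WIDTH - 1)) & 1) == 1
```

For example, `cla_sub(5, 3)` returns the difference 2, and `less_than(3, 5)` is true because the sign bit of 3 - 5 is 1.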
In this embodiment, the OR instruction is implemented with ordinary OR gates, and the shift instruction is implemented with an ordinary logarithmic shifter. The overflow detection part is implemented with a separate XOR gate. Overflow occurs when the hardware cannot represent the result of an operation. For addition, overflow occurs when two positive numbers are added and the result is negative, or when two negative numbers are added and the result is positive. For subtraction, overflow occurs when a positive number minus a negative number gives a negative result, or when a negative number minus a positive number gives a positive result; this means that a 1 was borrowed from the sign bit during the subtraction. Overflow can be detected as follows: if the carry into the most significant bit is not equal to the carry out of the most significant bit, overflow has occurred. In other words, the carry output of the most significant bit is XORed with the carry output of the second most significant bit; a result of 0 indicates no overflow, and a result of 1 indicates overflow. The overflow detection circuit can therefore be implemented with a single XOR gate. The operation control logic consists of a multiplexer of the kind commonly used in the prior art.
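The overflow rule above (XOR of the carry into and the carry out of the most significant bit) can be checked with a small sketch. The 16-bit width and the function names are assumptions made for illustration; the carries are obtained here with ordinary integer arithmetic rather than from the adder's internal signals.

```python
WIDTH = 16  # assumed word width
MASK = (1 << WIDTH) - 1


def add_with_overflow(a, b, carry_in=0):
    """Return (result, overflow) for a two's-complement addition."""
    a, b = a & MASK, b & MASK
    low_mask = MASK >> 1
    # Carry into the most significant bit: carry out of the lower WIDTH-1 bits.
    carry_into_msb = ((a & low_mask) + (b & low_mask) + carry_in) >> (WIDTH - 1)
    full = a + b + carry_in
    # Carry out of the most significant bit.
    carry_out = full >> WIDTH
    # A single XOR gate: overflow when the two carries differ.
    return full & MASK, carry_into_msb ^ carry_out


def sub_with_overflow(a, b):
    """a - b as a + NOT(b) + 1, with the same overflow test."""
    return add_with_overflow(a, ~b & MASK, carry_in=1)
```

Adding 0x7FFF and 1 (two positive numbers) gives 0x8000 with the overflow flag set, while adding 0xFFFF and 0xFFFF (that is, -1 plus -1) gives 0xFFFE with no overflow.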
The results show that this ALU implementation achieves high performance in the statistical (average-case) sense, overcoming the performance loss of synchronous circuits, which can only run at the worst-case delay. When this ALU is used in a system, it also offers low power consumption because it is an asynchronous circuit. Reusing the internal modules of the CLA for the AND and XOR instructions also yields a smaller circuit area.
With the ALU provided by this embodiment, one stage of computation is completed through a data input followed by a data output, and data that pass through several stages of computation form a pipelined data path, as shown in Fig. 9. Because the ALU in this embodiment adopts the four-phase bundled-data protocol, it can be used in a four-phase bundled-data pipeline.
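The order of events in one four-phase bundled-data transfer through the ALU can be sketched as below. This is a simplified timeline under stated assumptions: the signal names `req_in`, `req_out` and `ack` and the function name are hypothetical, the estimated delay is an arbitrary number supplied by the caller, the default margin of 30% is just one value inside the 25% to 35% range that this patent gives for the matched delay, and all events other than the matched delay are treated as instantaneous.

```python
def four_phase_transfer(estimated_delay_ns, margin=0.30):
    """Return (matched_delay, trace) for one handshake cycle.

    trace is a list of (time_ns, signal, level) tuples in the order the
    four-phase protocol produces them.
    """
    if not 0.25 <= margin <= 0.35:
        raise ValueError("margin outside the 25%-35% range")
    matched_delay = estimated_delay_ns * (1.0 + margin)
    t = 0.0
    trace = []
    trace.append((t, "req_in", 1))    # sender places data and raises request
    t += matched_delay                # request passes through the matched delay
    trace.append((t, "req_out", 1))   # result is valid: notify the receiver
    trace.append((t, "ack", 1))       # receiver takes the data, raises acknowledge
    trace.append((t, "req_in", 0))    # sender returns request to zero
    trace.append((t, "ack", 0))       # receiver returns acknowledge to zero
    return matched_delay, trace
```

With an estimated delay of 10 ns and the default margin, the request reaches the receiver after about 13 ns, and the two return-to-zero events complete the cycle before the next data item may be sent.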
The application of the ALU in an asynchronous pipeline is described below. A Muller pipeline is used to generate local clock pulses. These clock pulses differ from the clock pulses of a synchronous circuit: they are aperiodic, local and distributed rather than globally unified. The clock pulse generated by each stage overlaps the clock pulses generated by its adjacent stages in a carefully controlled, interlocked manner. Fig. 9 shows a first-in first-out (FIFO) pipeline, in which C denotes a Muller C-element, the basic timing-control element that is widely used in asynchronous circuits; EN denotes the enable signal of a latch; and Latch denotes a latch. Each latch latches data according to its enable signal. Fig. 9 is a pipeline without data processing modules, and Fig. 10 shows how a combinational logic circuit (also called a functional module) is placed between two latches; "comb" in the figure is the functional circuit. The asynchronous ALU of this embodiment can be inserted between two stages of the pipeline to process the data as they flow through it.
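The behavior of the Muller C-element and of a chain of them can be illustrated with a short sketch. This is a behavioral model only: the function names and the three-stage length are assumptions, and the stages are updated one after another in a single sweep, which only approximates the truly concurrent behavior of the circuit.

```python
def c_element(a, b, previous):
    """Muller C-element: follow the inputs when they agree, otherwise hold."""
    return a if a == b else previous


def sweep(state, req_in, ack_out):
    """Update every stage of a Muller pipeline once, from input to output.

    state[i] is the output of stage i. Each stage combines the request from
    the previous stage with the inverted acknowledge from the next stage.
    """
    state = list(state)
    n = len(state)
    for i in range(n):
        request = req_in if i == 0 else state[i - 1]
        acknowledge = ack_out if i == n - 1 else state[i + 1]
        state[i] = c_element(request, 1 - acknowledge, state[i])
    return state
```

Starting from an empty three-stage pipeline `[0, 0, 0]`, raising the input request lets the request wave travel to the last stage; after the request is lowered again, the last stage holds its value until the receiver acknowledges it.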
The circuits of Fig. 9 and Fig. 10 can be viewed intuitively as a conventional synchronous data path made up of latches and combinational circuits (synchronized by distributed gated-clock drivers), or as an asynchronous data-flow structure made up of two kinds of handshake components (latches and functional modules). In this embodiment the handshake signals synchronize the input data, so latches are sufficient for the implementation: after the functional unit completes the arithmetic logic operation, the data are placed in the latch, latched, and passed to the next stage for processing.
The above embodiments only illustrate the present invention and do not limit it. A person of ordinary skill in the relevant technical field can make various changes and modifications without departing from the spirit and scope of the present invention; all equivalent technical solutions therefore also fall within the scope of the present invention, and the scope of patent protection of the present invention shall be defined by the claims.

Claims (10)

1. An ALU implemented with an asynchronous circuit, used to receive data sent by a sending end, perform an arithmetic logic operation on the data, and output the result to a receiving end, characterized in that the ALU comprises:
a functional unit, which receives an operation control instruction and input data from the sending end and performs the arithmetic logic operation on the input data according to the operation control instruction;
a delay estimation unit, which estimates the delay according to the type of the operation control instruction received by the functional unit and the input data;
a delay matching unit, which selects, according to the delay estimated by the delay estimation unit, an operation delay matched to the functional unit, the operation delay being used to delay the output of the request signal produced by the sending end;
a handshake protocol unit, which, when the sending end sends the input data, controls the sending end to produce a request signal so that the input is synchronized, outputs the request signal after it has undergone the operation delay to notify the receiving end to take the output data of the functional unit, and controls the receiving end to produce an acknowledge signal so that the output is synchronized.
2. The ALU implemented with an asynchronous circuit as claimed in claim 1, characterized in that the sending end uses data lines to transmit the input data and a request signal line to transmit the request signal, and the receiving end uses data lines to receive the output data of the functional unit and an acknowledge signal line to transmit the acknowledge signal.
3. The ALU implemented with an asynchronous circuit as claimed in claim 1, characterized in that the handshake protocol unit performs synchronization using a four-phase bundled-data handshake protocol and comprises:
a first request signal unit, used to set the request signal produced by the sending end to high level when the sending end sends the input data, and to control the ALU output to output the high-level request signal after the operation delay has elapsed;
a first acknowledge signal unit, used, after the ALU output has output the high-level request signal, to notify the receiving end to take the output data of the functional unit and to set the acknowledge signal of the receiving end to high level;
a second request signal unit, used to set the high-level request signal of the sending end to low level after the acknowledge signal of the receiving end has been set to high level;
a second acknowledge signal unit, used to set the high-level acknowledge signal of the receiving end to low level once the request signal of the sending end has been set low.
4. The ALU implemented with an asynchronous circuit as claimed in claim 1, characterized in that the functional unit comprises:
a multi-way arithmetic unit, whose input receives the input data of the sending end through data lines and whose output outputs the result of the arithmetic logic operation performed on the input data;
a multiplexer, connected to the sending end and to the output of the multi-way arithmetic unit, used to receive the operation control instruction sent by the sending end and to gate the output of the multi-way arithmetic unit according to the operation control instruction.
5. The ALU implemented with an asynchronous circuit as claimed in claim 4, characterized in that the multi-way arithmetic unit comprises: an AND gate that executes the AND instruction; an XOR gate that executes the XOR instruction; and a carry-lookahead adder that executes the add instruction, wherein the inputs of the AND gate, the XOR gate and the carry-lookahead adder each receive the input data of the sending end through data lines, and the outputs of the AND gate, the XOR gate and the carry-lookahead adder are each connected to the multiplexer, and wherein
the carry-lookahead adder comprises a carry generation unit formed by an AND gate and a carry propagation unit formed by an XOR gate;
the AND gate that executes the AND instruction reuses the AND gate in the carry-lookahead adder;
the XOR gate that executes the XOR instruction reuses the XOR gate in the carry-lookahead adder.
6. The ALU implemented with an asynchronous circuit as claimed in claim 5, characterized in that the multi-way arithmetic unit further comprises: an OR gate that executes the OR instruction; a shifter that executes the shift instruction; and a subtractor that executes the subtraction or comparison instruction, wherein the inputs of the OR gate and the shifter each receive the input data of the sending end through data lines, and the outputs of the OR gate and the shifter are each connected to the multiplexer, and wherein the subtractor is formed by connecting the carry-lookahead adder and an inverter, the input of the inverter receives the input data of the sending end through a data line, the output of the inverter is connected to an input of the multiplexer, the output of the multiplexer is connected to the input of the carry-lookahead adder, and the multiplexer receives the operation control instruction from the sending end and, when the operation control instruction is a subtraction instruction, gates the output of the inverter so that the inverted input data are sent to the carry-lookahead adder to perform the subtraction.
7. The ALU implemented with an asynchronous circuit as claimed in claim 6, characterized in that the functional unit further comprises an overflow detector connected to the carry-lookahead adder, used to give an overflow indication when it detects that an addition or subtraction performed by the carry-lookahead adder exceeds the representable range.
8. The ALU implemented with an asynchronous circuit as claimed in claim 7, characterized in that the overflow detector is implemented by an XOR gate circuit.
9. The ALU implemented with an asynchronous circuit as claimed in claim 1, characterized in that the delay matching unit increases the delay estimated by the delay estimation unit by 25% to 35% and uses the result as the operation delay matched to the functional unit.
10. The ALU implemented with an asynchronous circuit as claimed in claim 1, characterized in that the data produced by the functional unit after the arithmetic logic operation are latched by a latch connected to the functional unit, the latch being enabled by the handshake protocol unit according to the operation delay.
CNB2008101144688A 2008-06-06 2008-06-06 A kind of ALU that adopts asynchronous circuit to realize Expired - Fee Related CN100552621C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2008101144688A CN100552621C (en) 2008-06-06 2008-06-06 A kind of ALU that adopts asynchronous circuit to realize

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2008101144688A CN100552621C (en) 2008-06-06 2008-06-06 A kind of ALU that adopts asynchronous circuit to realize

Publications (2)

Publication Number Publication Date
CN101303643A CN101303643A (en) 2008-11-12
CN100552621C true CN100552621C (en) 2009-10-21

Family

ID=40113560

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2008101144688A Expired - Fee Related CN100552621C (en) 2008-06-06 2008-06-06 A kind of ALU that adopts asynchronous circuit to realize

Country Status (1)

Country Link
CN (1) CN100552621C (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102394633B (en) * 2011-08-31 2013-08-21 华南理工大学 Low power consumption asynchronous comparison gate for low density parity code (LDPC) decoder
CN105205274B (en) * 2015-10-09 2018-03-23 重庆大学 A kind of parallel computation asynchronous circuit
CN106940686B (en) * 2016-01-05 2020-04-07 佛山市顺德区顺达电脑厂有限公司 Method for synchronizing data between wearable device and electronic device
CN108108151A (en) * 2017-12-13 2018-06-01 中国科学院计算技术研究所 The arithmetic logic unit operation method and system of superconduction list flux quantum processor
CN109815619B (en) * 2019-02-18 2021-02-09 清华大学 Method for converting synchronous circuit into asynchronous circuit
CN109871611B (en) * 2019-02-18 2021-06-08 清华大学 Method for automatic delay matching of asynchronous circuit
KR102500860B1 (en) 2019-09-03 2023-02-16 선전 구딕스 테크놀로지 컴퍼니, 리미티드 Asynchronous sampling devices and chips
CN112817638A (en) * 2019-11-18 2021-05-18 北京希姆计算科技有限公司 Data processing device and method
CN116842880A (en) * 2022-03-24 2023-10-03 华为技术有限公司 Chip, signal processing method and electronic equipment
CN116384309B (en) * 2023-05-31 2023-08-11 华中科技大学 Four-phase latching asynchronous handshake circuit applied to low-power chip design
CN116866447B (en) * 2023-09-04 2023-11-10 深圳时识科技有限公司 Conversion device, chip and electronic equipment between four-phase binding and two-phase double-track protocol

Also Published As

Publication number Publication date
CN101303643A (en) 2008-11-12

Similar Documents

Publication Publication Date Title
CN100552621C (en) A kind of ALU that adopts asynchronous circuit to realize
Nowick et al. Asynchronous design—Part 1: Overview and recent advances
US20110169525A1 (en) Systems, pipeline stages, and computer readable media for advanced asynchronous pipeline circuits
EP0584265A4 (en) Null convention speed independent logic
CN101140511B (en) Cascaded carry binary adder
Renaudin et al. A new asynchronous pipeline scheme: application to the design of a self-timed ring divider
Huemer et al. Sorting network based full adders for QDI circuits
Kol et al. A doubly-latched asynchronous pipeline
US9633157B2 (en) Energy-efficient pipeline circuit templates for high-performance asynchronous circuits
CN102043604B (en) Parallel feedback carry adder (PFCA) and realization method thereof
Cortadella et al. SELF: Specification and design of synchronous elastic circuits
CN107092462B (en) 64-bit asynchronous multiplier based on FPGA
Manohar et al. Asynchronous signalling processes
CN106547514B (en) A kind of high energy efficiency binary adder based on clock stretching technique
Balasubramanian et al. Timing analysis of quasi-delay-insensitive ripple carry adders–a mathematical study
Sravani et al. Novel Asynchronous Pipeline Architectures for High-Throughput Applications
US6970017B2 (en) Logic circuit
CN111985174A (en) RT latch and latch method
CN101710271B (en) Mixed numerical system summator
Yeh et al. Designs of counters with near minimal counting/sampling period and hardware complexity
Jou et al. Low-power globally asynchronous locally synchronous design using self-timed circuit technology
Srivastava Completion Detection in Asynchronous Circuits: Toward Solution of Clock-Related Design Challenges
Amrutha et al. Implementation of ALU Using Asynchronous Design
CN115496219A (en) Multi-stage non-AND gate quantum dot cell automatic machine circuit and control method and conversion method thereof
Su et al. DCP: Improving the throughput of asynchronous pipeline by dual control path

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: SHENZHEN INSTITUTE OF STINGHUA UNIVERSITY

Free format text: FORMER OWNER: TSINGHUA UNIVERSITY

Effective date: 20120921

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 100084 HAIDIAN, BEIJING TO: 518000 SHENZHEN, GUANGDONG PROVINCE

TR01 Transfer of patent right

Effective date of registration: 20120921

Address after: 518000 Nanshan District hi tech Industrial Zone, Guangdong, China, Shenzhen

Patentee after: Shenzhen Institute of Stinghua University

Address before: 100084 Beijing Haidian District Tsinghua Yuan 100084-82 mailbox

Patentee before: Tsinghua University

ASS Succession or assignment of patent right

Owner name: ZHEJIANG XINGSHENG WULIAN TECHNOLOGY CO., LTD.

Free format text: FORMER OWNER: SHENZHEN INSTITUTE OF STINGHUA UNIVERSITY

Effective date: 20130207

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 518000 SHENZHEN, GUANGDONG PROVINCE TO: 311800 SHAOXING, ZHEJIANG PROVINCE

TR01 Transfer of patent right

Effective date of registration: 20130207

Address after: 311800 Zhejiang province Zhuji City West two street Tao road 288, foreign trade building twenty-four layer

Patentee after: Zhejiang Xingsheng IOT Technology Co., Ltd.

Address before: 518000 Nanshan District hi tech Industrial Zone, Guangdong, China, Shenzhen

Patentee before: Shenzhen Institute of Stinghua University

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20161031

Address after: 518057 Shenzhen Institute of technology, Nanshan District high tech Industrial Park, Guangdong,, Tsinghua University, A302

Patentee after: Shenzhen Institute of Stinghua University

Address before: 311800 Zhejiang province Zhuji City West two street Tao road 288, foreign trade building twenty-four layer

Patentee before: Zhejiang Xingsheng IOT Technology Co., Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20091021

Termination date: 20170606
