CN100504725C - Closing non-acting numeric value logical operation unit to save power - Google Patents

Publication number
CN100504725C
Authority
CN
China
Prior art keywords
logical operation
operation unit
enable signal
value logical
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CNB2006100736186A
Other languages
Chinese (zh)
Other versions
CN1838031A (en
Inventor
李察L·邓肯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Via Technologies Inc
Original Assignee
Via Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Via Technologies Inc filed Critical Via Technologies Inc
Publication of CN1838031A publication Critical patent/CN1838031A/en
Application granted granted Critical
Publication of CN100504725C publication Critical patent/CN100504725C/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention relates to a method, and a corresponding device, for shutting off idle value logical operation units (ALUs) in a processor to reduce power consumption. The execution unit of the processor contains multiple gated value logical operation units that perform numerical or logical operations. Each gated unit includes a logic unit that controls whether its value logical operation unit starts a numerical or logical operation. Only the units needed for an operation are enabled, and the remaining units are prevented from receiving input data, so the idle value logical operation units are shut off and power is saved.

Description

Closing non-acting numeric value logical operation units to save power
Technical field
The present invention relates to a processor, and in particular to a processor that saves power by shutting off inactive value logical operation units.
Background technology
As integrated-circuit fabrication processes continue to advance, the semiconductor devices integrated in these circuits keep shrinking. Circuits therefore become denser, and the shorter clock propagation delay between integrated devices permits higher clock speeds.
As devices shrink and clock rates rise, the performance of a circuit, and its operating speed, is increasingly determined by its circuit architecture. In a microprocessor, for example, the value logical operation unit (arithmetic logic unit, ALU) is a key functional unit that determines performance, because it carries out most of the computation in the microprocessor and because it mainly operates in a sequential or pipelined manner. Although only one value logical operation unit actually performs a useful operation in each clock cycle, all value logical operation units in current microprocessors remain enabled in every clock cycle. Conventional designs enable all value logical operation units, so additional logic is needed to switch the inactive units. The results computed by these inactive units are discarded, yet the computation itself wastes power. This power consumption shortens battery life without any performance benefit.
Summary of the invention
Advantages and features of the present invention will in part become apparent to those skilled in the art upon examination of this disclosure, or may be learned in whole or in part from it. The features and advantages of the invention can be understood from the following description, and in particular from the features pointed out in the claims.
One embodiment of the present invention provides a processor that saves power by shutting off value logical operation units, and thereby achieves goals that the conventional processors described above fail to reach.
In this embodiment, the invention provides a processor, characterized in that the processor comprises:
A decoding unit that produces an enable signal; and
An execution unit that receives the enable signal from the decoding unit, wherein the execution unit comprises:
A plurality of gated value logical operation units, wherein each gated value logical operation unit further comprises:
A logic unit that receives input data, a decoded instruction and the enable signal, and that sends out an output value of the logic unit when the input data, the decoded instruction and the enable signal are received at the same time; and
A value logical operation unit that receives the output value of the logic unit and sends a calculation result out of the gated value logical operation unit, wherein the output value of the logic unit determines whether the input data is passed to the value logical operation unit; and
A multiplexer that, according to a select signal, chooses one of the calculation results of the plurality of gated value logical operation units as an output result of the execution unit.
Wherein the logic unit blocks the input data when the value logical operation unit does not need to start an operation, and lets the input data enter the value logical operation unit when the unit needs to start an operation.
Wherein the decoding unit decodes a received instruction to produce the enable signal and the decoded instruction.
Wherein the enable signal produced by the decoding unit determines whether to start the plurality of value logical operation units.
Wherein the logic unit comprises an AND gate that receives the input data, the decoded instruction and the enable signal, and the output value of the AND gate is output to the value logical operation unit.
Wherein the logic unit comprises an OR gate that receives the input data, the decoded instruction and an inverted signal of the enable signal, and the output value of the OR gate is output to the value logical operation unit.
Wherein the logic unit comprises a multiplexer that takes the input data as a first input, the decoded instruction as a second input and the enable signal as a select signal, and the output value of the multiplexer is output to the value logical operation unit.
Wherein the logic unit comprises a latch that receives the input data, the decoded instruction and an inverted signal of the enable signal, and the output value of the latch is output to the value logical operation unit.
Another embodiment of the present invention provides another device that saves power by shutting off value logical operation units. A processor of the present invention is characterized in that the processor comprises:
A decoding unit that produces an early-clock enable signal; and
An execution unit that receives the early-clock enable signal from the decoding unit, wherein the execution unit comprises:
A plurality of gated value logical operation units, wherein each gated value logical operation unit further comprises:
A logic unit that receives input data, a decoded instruction and the early-clock enable signal, and that sends out an output value of the logic unit when the input data, the decoded instruction and the enable signal are received at the same time; and
A value logical operation unit that receives the output value of the logic unit and sends a calculation result out of the gated value logical operation unit, wherein the output value of the logic unit determines whether the input data is passed to the value logical operation unit; and
A first multiplexer that, according to a select signal, chooses one of the calculation results of the plurality of gated value logical operation units as an output result of the execution unit.
Wherein the logic unit blocks the input data when the value logical operation unit does not need to start an operation, and lets the input data enter the value logical operation unit when the unit needs to start an operation.
Wherein the decoding unit decodes a received instruction to produce the enable signal and the decoded instruction.
Wherein the early-clock enable signal produced by the decoding unit determines whether to start the value logical operation units, and the early-clock enable signal leads the original clock by one cycle.
Wherein the logic unit comprises:
An AND gate that receives a clock signal and the early-clock enable signal and sends out an output; and
A flip-flop that receives the input data, the decoded instruction and the output of the AND gate, the output value of the flip-flop being output to the value logical operation unit.
Wherein the logic unit comprises:
A second multiplexer that receives a feedback output signal as a first input, the input data as a second input and the decoded instruction as a third input, takes the early-clock enable signal as a select signal, and produces an output; and
A flip-flop that receives the output of the second multiplexer and a clock signal, the output value of the flip-flop being output to the value logical operation unit;
Wherein the feedback output signal received by the second multiplexer is a feedback of the output value of the flip-flop.
Another embodiment of the present invention further provides a method for reducing power consumption in a processor, characterized in that the method comprises:
Receiving, at each value logical operation unit of an execution unit, a respective enable signal from a decoding unit, to determine whether each value logical operation unit is turned on or off;
Judging whether the enable signal corresponding to each value logical operation unit is on;
If the enable signal corresponding to a value logical operation unit is off, blocking input data from entering that value logical operation unit; and if the enable signal corresponding to a value logical operation unit is on, allowing the input data to enter that value logical operation unit to carry out the desired numerical or logical operation; and
Selecting, according to a select signal, one of the calculation results of the plurality of value logical operation units as an output result of the execution unit.
Description of drawings
To further explain the technical content of the present invention, a detailed description is given below with reference to embodiments and the accompanying drawings, in which:
Fig. 1 is a block diagram of a pipelined processor in the prior art;
Fig. 2A is a block diagram of the decoding unit of a pipelined processor in the prior art;
Fig. 2B is a block diagram of a decoding unit of a pipelined processor according to the present invention;
Fig. 3A is a block diagram of an execution unit with a plurality of value logical operation units in a pipelined processor in the prior art;
Fig. 3B is a block diagram of an execution unit with a plurality of gated value logical operation units in a pipelined processor according to the present invention;
Fig. 4A is a block diagram of the gated value logical operation unit of a first embodiment according to the present invention;
Fig. 4B is a block diagram of the gated value logical operation unit of a second embodiment according to the present invention;
Fig. 4C is a block diagram of the gated value logical operation unit of a third embodiment according to the present invention;
Fig. 4D is a block diagram of the gated value logical operation unit of a fourth embodiment according to the present invention;
Fig. 4E is a block diagram of the gated value logical operation unit of a fifth embodiment according to the present invention;
Fig. 5A is a block diagram of the gated value logical operation unit of a sixth embodiment according to the present invention;
Fig. 5B is a block diagram of the gated value logical operation unit of a seventh embodiment according to the present invention;
Fig. 5C is a block diagram of the gated value logical operation unit of an eighth embodiment according to the present invention; and
Fig. 6 is a flowchart of the operation of the execution unit provided by the present invention.
Embodiment
The present invention is directed to a processor that saves power by shutting off value logical operation units. To give a thorough understanding of the invention, detailed steps and components are set out in the following description. The practice of the invention is not limited to specific details familiar to those skilled in the field of microprocessors. Well-known components or steps, on the other hand, are not described in detail, to avoid unnecessarily limiting the invention. Preferred embodiments of the invention are described in detail below. Beyond these descriptions, the invention can also be practiced widely in other embodiments, and the scope of the invention is not limited by them but is defined by the claims that follow.
A computer system can be reduced to at least three components: at least one processor, at least one memory unit and at least one input/output subsystem. Fig. 1 is a block diagram of the architecture of a processor that executes instructions in five pipeline stages. Other pipeline designs with a different configuration or a different number of pipeline stages are also consistent with the teachings and spirit of this disclosure. The architecture shown in Fig. 1 comprises an instruction fetch unit 110, a decoding unit 120, an execution unit 130, a memory access unit 140 and a register write-back unit 150. Except as disclosed in this specification, each of these units or logic blocks operates in the conventional manner known to those skilled in the art and is therefore not described in detail here.
As is known, the instruction fetch unit 110 performs the memory fetch of instructions. It determines the value or content of a program counter in a register file 160 for in-order instruction execution, as well as for exception vectors, branch and link instructions and the like. The instruction fetch unit 110 also determines the return address of all exceptions and branch instructions, and writes or stores this return address in an appropriate register of the register file 160.
The decoding unit 120 decodes the instructions passed from the instruction fetch unit 110 and produces the control signals that the execution unit 130 needs to execute a particular instruction. The specific architecture of the decoding unit 120 varies from processor to processor, but its operation and organization are known to those skilled in the art. Likewise, the architecture and operation of the execution unit 130 vary from processor to processor. In general, the execution unit 130 comprises circuitry for executing the instruction determined by the control signals that the decoding unit 120 produces.
The memory access unit 140 interfaces with external data memory to read and write data according to the instructions executed by the execution unit 130. Not every instruction requires a memory access, but for those that do, the memory access unit 140 performs the necessary access to external memory. Finally, the register write-back unit 150 stores or writes the result of instruction execution into the appropriate register of the register file 160.
Fig. 2A is a functional block diagram of the decoding unit 210 of a pipelined processor known in the prior art. The decoding unit 210 receives the data and instruction passed from the instruction fetch unit 110 of the previous stage, decodes the instruction, and produces the control signals that the execution unit 130 needs to execute the particular instruction. These receive, decode and send operations are shown as receive block 211, decode block 212 and send block 213. The decoded instruction and data are sent from the decoding unit 210 to the next stage of the pipelined processor, the execution unit 130, which executes the instruction.
The decoding unit 220 of the present invention is shown in Fig. 2B. The invention introduces an additional control signal, an enable signal, which controls which of the value logical operation units in the execution unit 130 are enabled or shut off. The data, decoded instruction and enable signal are sent to the execution unit 130 for execution of the instruction. The decoding unit 220 operates like that of Fig. 2A, with a receive block 221, a decode block 222 and a send block 223. The difference is an added send-enable-signal block 224 that transmits the enable signal.
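As an illustration of the extra enable output, the following is a minimal Python sketch of a decoder that emits one enable bit per value logical operation unit. The opcode names, the opcode-to-unit table and the function name `decode` are hypothetical; the patent does not specify an instruction encoding.

```python
from typing import List, Tuple

# Hypothetical mapping from opcode to unit index; not taken from the patent.
UNIT_FOR_OPCODE = {"ADD": 0, "SUB": 1, "RSB": 2, "SHIFT": 3, "MUL": 4}
NUM_UNITS = 5


def decode(opcode: str) -> Tuple[str, List[bool]]:
    """Return the decoded instruction and the enable signals EN1..ENn."""
    unit = UNIT_FOR_OPCODE.get(opcode)  # None when no unit is needed
    enables = [index == unit for index in range(NUM_UNITS)]
    return opcode, enables


decoded, enables = decode("ADD")
print(decoded, enables)
```

An opcode that is absent from the table, such as a hypothetical "NOP", yields all-false enables, which corresponds to the case where every unit is shut off.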
Fig. 3A is a block diagram of a conventional execution unit in a pipelined processor. Each value logical operation unit 311 receives data input and a decoded instruction (DI) from the decoding unit 210 of Fig. 2A so that it can perform a numerical calculation or logical operation. Although only one value logical operation unit 311 actually performs a useful numerical or logical operation, all value logical operation units 311 remain active. The calculation results (Output) of all value logical operation units 311 are gathered at a multiplexer 312, which selects the final output result Output_s. This output result is then sent to the next stage of the pipeline.
Fig. 3B is a block diagram of an execution unit of a pipelined processor according to the present invention, in which the execution unit 130 comprises several gated value logical operation units (gated ALUs) 321-1 to 321-n. As shown in Fig. 1, the execution unit 130 is the pipeline stage after the decoding unit 120 and before the memory access unit 140. The decoding unit 120 produces the control signals that the execution unit 130 needs to execute a particular instruction. The execution unit 130 comprises a plurality of gated value logical operation units 321-1 to 321-n: the first unit is 321-1, the second is 321-2, and so on. Each gated value logical operation unit 321-1 to 321-n is assigned an enable signal (EN) and a decoded instruction (DI). The enable signals EN1 to ENn output by the decoding unit 120 enable at least one of the gated value logical operation units 321-1 to 321-n and shut off the rest. In some cases no gated unit needs to be enabled, and all gated value logical operation units 321-1 to 321-n can then be shut off. For example, if a numerical or logical operation only requires the first gated value logical operation unit 321-1, the decoding unit 220 sends input data, an enable signal EN1 and a decoded instruction DI1 to the first gated value logical operation unit 321-1 to perform the specified operation. The calculation result of the first gated value logical operation unit 321-1 is designated the first output (Output-1). The remaining gated value logical operation units 321-2 to 321-n are shut off because they do not need to perform the requested operation.
The calculation results Output-1 to Output-n of the gated value logical operation units 321-1 to 321-n are all coupled to a multiplexer 324, which selects the required output result (Output-s) and sends it to the next stage, the memory access unit 140 of the pipelined processor. The select signal (SEL) of the multiplexer 324 is also provided by the decoding unit 120, which translates the instruction opcode into control signals for the execution unit 130. If the output result of the multiplexer 324 must be stored in the register file 160, it can be written to the register file 160 through the memory access unit 140 and the register write-back unit 150. Alternatively, the output result may need to be fed back to the execution unit 130 itself, for example when executing consecutive multiplications. Other logic circuits capable of selecting among the outputs Output-1 to Output-n of the gated value logical operation units 321-1 to 321-n can also replace the multiplexer 324 of Fig. 3B.
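The structure of Fig. 3B can be sketched behaviorally in Python as a list of gated units followed by a multiplexer. This is an illustrative software model, not the patent's circuit: the class names, the `evaluations` counter and the choice to model a shut-off unit's output as `None` are all assumptions made for the example.

```python
import operator
from typing import Callable, List, Optional


class GatedUnit:
    """A value logical operation unit behind a gate (illustrative model)."""

    def __init__(self, op: Callable[[int, int], int]) -> None:
        self.op = op
        self.evaluations = 0  # how many times the unit actually computed

    def step(self, enable: bool, a: int, b: int) -> Optional[int]:
        if not enable:
            return None  # inputs blocked: the unit does no work
        self.evaluations += 1
        return self.op(a, b)


class ExecutionUnit:
    def __init__(self, ops: List[Callable[[int, int], int]]) -> None:
        self.units = [GatedUnit(op) for op in ops]

    def execute(self, enables: List[bool], select: Optional[int],
                a: int, b: int) -> Optional[int]:
        outputs = [unit.step(en, a, b) for unit, en in zip(self.units, enables)]
        # The multiplexer forwards the output chosen by the select signal.
        return None if select is None else outputs[select]


eu = ExecutionUnit([operator.add, operator.sub, operator.mul])
result = eu.execute([True, False, False], 0, 7, 3)
print(result, [unit.evaluations for unit in eu.units])
```

The per-unit counter makes the saving visible in the model: only the enabled unit records an evaluation, while the others never run.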
A typical execution unit 130 comprises the following gated value logical operation units 321: an adder, a subtractor, a reverse subtractor, a shifter/rotator and a multiplier. Because every value logical operation unit 321 of the invention is gated, each is prefixed with "gated": gated adder, gated subtractor, gated reverse subtractor, gated shifter/rotator and gated multiplier. For example, when an addition is required, the decoding unit 220 translates the instruction opcode into control signals and sends an additional enable signal to the execution unit 130. This enable signal enables the gated adder to perform the addition, and the calculation result is sent to the multiplexer 324. Finally, the output result of the multiplexer 324 is sent to the memory access unit 140 or the register file 160, or fed back to the execution unit 130 itself. Apart from the gated adder, all remaining gated value logical operation units 321 are shut off to reduce the power consumed by the microprocessor.
Fig. 4A is a block diagram, according to the first embodiment of the present invention, of the structure of a gated value logical operation unit 411 in the execution unit of Fig. 3B. The gated value logical operation unit 411 comprises a value logical operation unit 413 and a logic unit 412 that gates the value logical operation unit 413. The logic unit 412 receives data input, an enable signal (EN-X) and a decoded instruction (DI-X) from the decoding unit 220, and lets the input data pass through the logic unit 412 when the enable signal is asserted. Figs. 4B to 4E each show a logic unit 412 that can gate an inactive value logical operation unit 413. In the second embodiment, shown in Fig. 4B, the logic unit comprises an AND gate 422 placed before the value logical operation unit 423 to block the inactive unit. When a numerical or logical operation is required, the decoding unit 220 sends an enable signal EN-X to the gated value logical operation unit 421. When the enable signal EN-X is asserted, the AND gate 422 receives the enable signal, the decoded instruction and the input data at the same time and passes the input data to the designated value logical operation unit 423, which is thereby enabled and performs the required numerical or logical operation. Conversely, when no numerical or logical operation is needed in a clock cycle, the enable signal for the corresponding value logical operation unit 423 is not provided, and the unit is shut off. The AND gate 422 blocks the input data from entering the value logical operation unit 423, so the unit performs no numerical or logical operation, which achieves the purpose of gating the inactive value logical operation unit 423. The calculation result (Output-X) of the gated value logical operation unit 421 is then sent to the multiplexer 324 of Fig. 3B.
The operation shown in Fig. 4B is only one of many embodiments within the scope and spirit of the invention. Adding an AND gate 422 is only one feasible way to shut off an inactive value logical operation unit 423. Other logic units or combinations can achieve the same function, for example a series of AND gates in place of the single AND gate. Unlike conventional designs, which keep all value logical operation units enabled, this approach enables only the units required by the numerical or logical operation to be performed. It addresses the power lost to switching logic in the value logical operation circuits of conventional microprocessors and the useless results produced in the execution unit. A processor built with the gated value logical operation unit 421 can therefore reduce power consumption and meet today's market demand for low-power processors.
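The AND-gate scheme of Fig. 4B can be illustrated at the bit level. The sketch below assumes a 32-bit operand and assumes that the enable bit is ANDed with every data bit. The width and the names `and_gate`, `WIDTH` and `MASK` are hypothetical, and the decoded-instruction bits, which the patent gates in the same way, are left out for brevity.

```python
WIDTH = 32                      # assumed operand width, not stated in the patent
MASK = (1 << WIDTH) - 1


def and_gate(data: int, enable: bool) -> int:
    """AND every data bit with the enable bit.

    With enable asserted the data passes unchanged; otherwise the
    unit behind the gate sees a constant all-zero operand.
    """
    enable_bits = MASK if enable else 0
    return data & enable_bits


print(hex(and_gate(0xDEADBEEF, True)))
print(hex(and_gate(0xDEADBEEF, False)))
```

Because a disabled gate always presents the same constant, the inputs of the unit behind it stop changing with the incoming data.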
Fig. 4C is a block diagram of a gated value logical operation unit 431 according to the third embodiment of the invention. An OR gate 432 can also be used to block the input data from entering the value logical operation unit 433 when the enable signal EN-X is off. The OR gate 432 takes the inverted enable signal as one of its inputs. Only when the enable signal is asserted can the required numerical or logical operation proceed in the value logical operation unit 433, and its result is output as calculation result Output-X.
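The OR-gate variant can be sketched the same way. The example assumes a 32-bit operand and assumes that the inverted enable bit is ORed with every data bit, so a disabled gate forces the operand to all ones. The width and the names used are hypothetical.

```python
WIDTH = 32                      # assumed operand width, not stated in the patent
MASK = (1 << WIDTH) - 1


def or_gate(data: int, enable: bool) -> int:
    """OR every data bit with the inverted enable bit.

    With enable asserted the inverted signal is 0 and the data passes;
    otherwise the output is forced to a constant all-ones value.
    """
    inverted_enable_bits = 0 if enable else MASK
    return (data | inverted_enable_bits) & MASK


print(hex(or_gate(0x1234, True)))
print(hex(or_gate(0x1234, False)))
```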
Fig. 4D is a block diagram according to the fourth embodiment of the invention, which uses a multiplexer 442 to form a gated value logical operation unit 441. The multiplexer 442 has a first input that receives input data and a second input that receives a decoded instruction. It also has a select signal EN-X, whose function is to block the input data when the particular value logical operation unit 443 does not need to perform a numerical or logical operation in the current clock cycle. When the select signal EN-X is asserted, the input data passes through the multiplexer 442 to the value logical operation unit 443, and the calculated result is Output-X. Conversely, when the select signal EN-X is off, meaning that the value logical operation unit 443 is not needed in this clock cycle, the multiplexer 442 blocks the input data and no numerical or logical operation is performed.
Besides the OR gate 432 of Fig. 4C and the multiplexer 442 of Fig. 4D, a latch 452 can also replace the AND gate 422 of Fig. 4B, as shown in Fig. 4E. As in the other embodiments above, in the fifth embodiment of Fig. 4E the enable signal of the latch 452 controls whether the input data is passed to the value logical operation unit 453 or blocked by the latch 452.
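A latch-based gate might be modeled as follows. The sketch assumes a conventional level-sensitive (transparent) latch, whose output follows the input while the enable is asserted and holds the last value otherwise. The patent only states that the latch passes or blocks the data, so the hold behavior, the initial value and the class name `Latch` are assumptions.

```python
class Latch:
    """Transparent latch: follows the input while enabled, else holds."""

    def __init__(self, initial: int = 0) -> None:
        self.q = initial  # value currently presented to the unit behind the latch

    def update(self, data: int, enable: bool) -> int:
        if enable:
            self.q = data  # transparent: new data reaches the unit
        return self.q      # blocked: the unit keeps seeing the old value


latch = Latch()
print(latch.update(5, True), latch.update(9, False), latch.update(9, True))
```

Under this assumption a blocked unit keeps seeing its previous operand rather than a forced constant, so its inputs do not change at all while it is shut off.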
Other forms of gated value logical operation unit are shown in Figs. 5A to 5C. The sixth embodiment, in Fig. 5A, is a gated value logical operation unit 511 with an additional clock signal and an early-clock enable signal. The early-clock enable signal leads the processor clock by one clock cycle, so that data and signals are determined one cycle in advance. The gated value logical operation unit 511 receives input data, a decoded instruction (DI_X), a clock signal and an early-clock enable signal (Pre_EN-X), and delivers a gated calculation result Output-X to the multiplexer 324 of Fig. 3B. The logic unit 512 in the gated value logical operation unit 511 blocks the input data when the value logical operation unit 513 does not need to be enabled, and holds the input data when the value logical operation unit 513 needs to perform a numerical or logical operation.
Compared with Figs. 4A to 4E, the embodiments of Figs. 5A to 5C all have the advantage of avoiding the case in which combinational logic turns a value logical operation unit on and then immediately turns it off again. The embodiments of the Fig. 5 series also avoid spurious transitions. Accordingly, when an early clock is available in the pipeline, the embodiments of the Fig. 5 series are preferable to those of the Fig. 4 series.
Fig. 5B is a block diagram of the seventh embodiment of the invention. The logic unit of this embodiment comprises a flip-flop 523 and an AND gate 522, where the flip-flop 523, the AND gate 522 and a value logical operation unit 524 are connected in sequence. The added flip-flop 523 holds an input state. An early-clock enable signal (Pre_EN-X), which leads the pipeline by one clock cycle, enters the AND gate 522 together with a clock signal (Clock). The gated value logical operation unit 521 can thus load new data into the flip-flop 523 one clock cycle ahead, so that the designated value logical operation unit 524 performs the required numerical or logical operation. This architecture allows zero, one or more value logical operation units 524 to be enabled in the same clock cycle, while the remaining value logical operation units 524 are shut off to save power.
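The clock-gating arrangement of Fig. 5B can be sketched as a flip-flop whose clock input is the AND of the clock and the early-clock enable. The model below assumes a rising-edge-triggered flip-flop and single-bit clock and enable values (0 or 1). The class and method names are hypothetical.

```python
class ClockGatedFlipFlop:
    """Flip-flop clocked by (Clock AND Pre_EN), assumed rising-edge triggered."""

    def __init__(self) -> None:
        self.q = 0                  # value presented to the unit behind the flip-flop
        self._prev_gated_clock = 0

    def tick(self, clock: int, pre_enable: int, data: int) -> int:
        gated_clock = clock & pre_enable           # the AND gate
        rising_edge = gated_clock == 1 and self._prev_gated_clock == 0
        if rising_edge:
            self.q = data                          # capture new operand
        self._prev_gated_clock = gated_clock
        return self.q


ff = ClockGatedFlipFlop()
trace = [ff.tick(0, 1, 42), ff.tick(1, 1, 42), ff.tick(0, 0, 99), ff.tick(1, 0, 99)]
print(trace)
```

With the enable low the flip-flop never sees a clock edge, so it keeps its stored operand and the unit behind it receives no new data.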
A variation of the embodiment of Fig. 5B, namely the gated value logical operation unit 531 of the eighth embodiment of the invention, is shown in Fig. 5C. A multiplexer 532 has a first input that receives the fed-back output of a flip-flop 533, a second input that receives input data, and a third input that receives a decoded instruction (DI_X). The selection signal of the multiplexer 532 is a clock-advance enable signal (Pre_EN-X) that leads the pipeline by one clock. Whether the input data can enter a value logical operation unit 534 depends on this clock-advance enable signal. The flip-flop 533 receives the output of the multiplexer 532. When the enable signal is on, the flip-flop 533 passes its output value to the value logical operation unit 534, which performs the arithmetic or logical operation; when the enable signal is off, the input data are blocked from entering the value logical operation unit 534.
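As a further non-limiting illustration, the multiplexer-plus-flip-flop arrangement of Fig. 5C can be modeled as a register that either loads new values or recirculates its own output. The names `mux_select` and `RecirculatingRegister` are hypothetical, and pairing the input data with the decoded instruction in a single tuple is an assumption made for brevity.

```python
from typing import Tuple

State = Tuple[int, str]  # (input data, decoded instruction), an assumed encoding


def mux_select(feedback: State, new_value: State, pre_en: bool) -> State:
    """Multiplexer: pass the new value when pre_en is on, else the feedback."""
    return new_value if pre_en else feedback


class RecirculatingRegister:
    """Hypothetical model of a flip-flop whose input comes from the multiplexer."""

    def __init__(self) -> None:
        self.q: State = (0, "NOP")  # flip-flop output, also fed back to the mux

    def clock_edge(self, data: int, decoded_instruction: str, pre_en: bool) -> State:
        # The flip-flop is clocked every cycle; gating happens through the
        # multiplexer, which re-loads the old output when pre_en is off.
        self.q = mux_select(self.q, (data, decoded_instruction), pre_en)
        return self.q


reg = RecirculatingRegister()
reg.clock_edge(5, "ADD", pre_en=True)    # new operands are loaded
reg.clock_edge(9, "SUB", pre_en=False)   # blocked: the old state recirculates
```

Compared with the sketch of gating the clock itself, this variant leaves the clock untouched and holds the state through the feedback path, so the value logical operation unit still sees unchanged inputs while it is not needed.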
The logic units described in the Fig. 4 series and the Fig. 5 series avoid unnecessary arithmetic or logical operations and thereby save a significant amount of power. Taking an execution unit with five value logical operation units as an example, keeping one value logical operation unit started and shutting off the other four saves up to eighty percent of the power compared with a conventional processor in which all five value logical operation units are started. If no value logical operation unit is needed for a calculation, all five can be shut off, which saves one hundred percent of the power compared with starting all five. Implementing gated value logical operation units in a processor is therefore clearly advantageous over the prior art.
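The figures in the preceding paragraph follow from simple arithmetic, sketched below. This is an idealized estimate, not a measurement: it assumes every value logical operation unit draws the same power when started and none when shut off, and the function name `idealized_saving` is hypothetical.

```python
def idealized_saving(total_units: int, active_units: int) -> float:
    """Fraction of unit power saved versus starting every unit.

    Assumes equal power per started unit and zero power per gated unit.
    """
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    if not 0 <= active_units <= total_units:
        raise ValueError("active_units must be between 0 and total_units")
    return (total_units - active_units) / total_units


one_of_five = idealized_saving(5, 1)   # four of five units shut off
none_of_five = idealized_saving(5, 0)  # all five units shut off
```

With five units, one started unit gives a fraction of 4/5 and no started units gives 5/5, matching the eighty percent and one hundred percent cases above. Real savings would be lower, since the gating logic itself and leakage consume power.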
Please refer to Fig. 6, which is a flow chart of the operation of the execution unit disclosed by the invention. The flow begins at step 610, in which each value logical operation unit in the execution unit receives an enable signal and input data from a decoding unit, so that each value logical operation unit can be turned on or off. Step 620 judges whether the enable signal received by each value logical operation unit is on. If it is off, step 630 blocks the input data from entering the corresponding value logical operation unit, so that this unit performs no arithmetic or logical operation. If the enable signal is on, then in step 640 the input data enter the corresponding value logical operation unit, which performs the required arithmetic or logical operation and sends its calculation result to a multiplexer. In step 650, the multiplexer selects among the calculation results delivered by the plurality of value logical operation units to produce the required output result, and this output result is sent to a memory access unit or a register file, or is fed back to the execution unit itself where appropriate.
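The flow of Fig. 6 can be summarized in a short behavioral sketch. It is offered only as an illustration under stated assumptions: the function name `run_execution_unit`, the dictionary-based interface, and the sample operations are hypothetical, and a blocked unit is represented by the result `None`.

```python
from typing import Callable, Dict, Optional, Tuple

Operation = Callable[[int, int], int]


def run_execution_unit(
    units: Dict[str, Operation],
    enables: Dict[str, bool],
    operands: Tuple[int, int],
    select: str,
) -> Tuple[Optional[int], Dict[str, Optional[int]]]:
    """Model steps 610 to 650 for one instruction."""
    results: Dict[str, Optional[int]] = {}
    for name, operation in units.items():          # step 610: each unit gets
        if enables.get(name, False):                # its enable; step 620: test it
            results[name] = operation(*operands)    # step 640: operands enter
        else:
            results[name] = None                    # step 630: operands blocked
    return results[select], results                 # step 650: multiplexer selects


units = {
    "add": lambda a, b: a + b,
    "and": lambda a, b: a & b,
    "xor": lambda a, b: a ^ b,
}
output, per_unit = run_execution_unit(units, {"add": True}, (6, 3), select="add")
# output is 9; the "and" and "xor" units were never evaluated.
```

Only the enabled unit is evaluated, which is the software analogue of keeping the inputs of the idle units from toggling.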
Although the foregoing embodiments generally use an AND gate to block input data from entering an inactive value logical operation unit, or use a flip-flop to hold its calculation result, other embodiments within the scope and spirit of the invention may also be used. For example, other logic units, such as a series of AND gates, may also be used to keep input data from entering an inactive value logical operation unit. Similarly, the flip-flop may be replaced by other logic circuits or by a combination of logic gates. The embodiments disclosed in Fig. 4 and Fig. 5 are therefore given only for convenience of explanation and represent only some of the possible embodiments of the gated value logical operation unit.
Evidently, many modifications and variations of the invention are possible in light of the embodiments described above. It should therefore be understood that, within the scope of the appended claims, the invention may be practiced in other embodiments besides those described in detail above. The foregoing are only preferred embodiments of the invention and are not intended to limit its claims; all equivalent changes or modifications made without departing from the disclosed spirit should be included in the following claims.

Claims (15)

1. A processor, characterized in that the processor comprises:
a decoding unit having an enable-signal transmitting module for generating an enable signal; and
an execution unit for receiving the enable signal from the decoding unit, wherein the execution unit comprises:
a plurality of gated value logical operation units, wherein each of the gated value logical operation units further comprises:
a logic unit for receiving input data, a decoded instruction, and the enable signal, and for sending out an output value of the logic unit when the input data, the decoded instruction, and the enable signal have been received at the same time; and
a value logical operation unit for receiving the output value of the logic unit and sending a calculation result out of the gated value logical operation unit when the enable signal is on, wherein the output value of the logic unit determines whether the input data are passed to the value logical operation unit; and
a first multiplexer for selecting, according to a selection signal provided by the decoding unit, one of the calculation results of the plurality of gated value logical operation units as an output result of the execution unit.
2. The processor according to claim 1, characterized in that the logic unit blocks the input data when the value logical operation unit need not be started, and lets the input data enter the value logical operation unit when the value logical operation unit needs to be started.
3. The processor according to claim 1, characterized in that the decoding unit decodes a received instruction to generate the enable signal and the decoded instruction.
4. The processor according to claim 1, characterized in that the enable signal generated by the decoding unit is used to decide whether to start the plurality of value logical operation units.
5. The processor according to claim 1, characterized in that the logic unit comprises an AND gate for receiving the input data, the decoded instruction, and the enable signal, and the output value of the AND gate is output to the value logical operation unit.
6. The processor according to claim 1, characterized in that the logic unit comprises an OR gate for receiving the input data, the decoded instruction, and an inverted signal of the enable signal, and the output value of the OR gate is output to the value logical operation unit.
7. The processor according to claim 1, characterized in that the logic unit comprises a second multiplexer, the second multiplexer being connected to the input data as a first input, to the decoded instruction as a second input, and to the enable signal as a selection signal, and the output value of the second multiplexer is output to the value logical operation unit.
8. The processor according to claim 1, characterized in that the logic unit comprises a latch for receiving the input data, the decoded instruction, and an inverted signal of the enable signal, and the output value of the latch is output to the value logical operation unit.
9. A processor, characterized in that the processor comprises:
a decoding unit having an enable-signal transmitting module for generating a clock-advance enable signal; and
an execution unit for receiving the clock-advance enable signal from the decoding unit, wherein the execution unit comprises:
a plurality of gated value logical operation units, wherein each of the gated value logical operation units further comprises:
a logic unit for receiving input data, a decoded instruction, and the clock-advance enable signal, and for sending out an output value of the logic unit when the input data, the decoded instruction, and the clock-advance enable signal have been received at the same time; and
a value logical operation unit for receiving the output value of the logic unit and sending a calculation result out of the gated value logical operation unit when the clock-advance enable signal is on, wherein the output value of the logic unit determines whether the input data are passed to the value logical operation unit; and
a first multiplexer for selecting, according to a selection signal provided by the decoding unit, one of the calculation results of the plurality of gated value logical operation units as an output result of the execution unit.
10. The processor according to claim 9, characterized in that the logic unit blocks the input data when the value logical operation unit need not be started, and lets the input data enter the value logical operation unit when the value logical operation unit needs to be started.
11. The processor according to claim 9, characterized in that the decoding unit decodes a received instruction to generate the enable signal and the decoded instruction.
12. The processor according to claim 9, characterized in that the clock-advance enable signal generated by the decoding unit is used to decide whether to start the value logical operation units, and the clock-advance enable signal leads the original clock by one cycle.
13. The processor according to claim 9, characterized in that the logic unit comprises:
an AND gate for receiving a clock signal and the clock-advance enable signal and for transmitting an output; and
a flip-flop for receiving the input data, the decoded instruction, and the output of the AND gate, the output value of the flip-flop being output to the value logical operation unit.
14. The processor according to claim 9, characterized in that the logic unit comprises:
a second multiplexer for receiving a feedback output signal as a first input, the input data as a second input, the decoded instruction as a third input, and the clock-advance enable signal as a selection signal, and for producing an output; and
a flip-flop for receiving the output of the second multiplexer and a clock signal, the output value of the flip-flop being output to the value logical operation unit;
wherein the feedback output signal received by the second multiplexer is a feedback signal of the output value of the flip-flop.
15. A method of reducing power consumption in a processor, characterized in that the method comprises:
receiving, at each value logical operation unit of an execution unit, a respective enable signal from a decoding unit, to determine whether each value logical operation unit is turned on or off, wherein input data, a decoded instruction, and the enable signal are delivered together to each value logical operation unit of the execution unit;
judging whether the enable signal corresponding to each value logical operation unit is on;
if the enable signal corresponding to the value logical operation unit is off, blocking the input data from entering the value logical operation unit, and if the enable signal corresponding to the value logical operation unit is on, allowing the input data to enter the value logical operation unit so that the desired arithmetic or logical operation is performed therein; and
selecting, according to a selection signal, an output result of the execution unit from among the calculation results of the plurality of value logical operation units.
CNB2006100736186A 2005-04-13 2006-04-13 Closing non-acting numeric value logical operation unit to save power Active CN100504725C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US67068005P 2005-04-13 2005-04-13
US60/670,680 2005-04-13

Publications (2)

Publication Number Publication Date
CN1838031A CN1838031A (en) 2006-09-27
CN100504725C true CN100504725C (en) 2009-06-24

Family

ID=37015445

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006100736186A Active CN100504725C (en) 2005-04-13 2006-04-13 Closing non-acting numeric value logical operation unit to save power

Country Status (2)

Country Link
CN (1) CN100504725C (en)
TW (1) TWI315489B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109102073A (en) 2017-06-21 2018-12-28 上海寒武纪信息科技有限公司 A kind of sparse training method
CN109086880B (en) * 2017-06-13 2021-06-29 上海寒武纪信息科技有限公司 Arithmetic device and method
WO2018228399A1 (en) 2017-06-13 2018-12-20 上海寒武纪信息科技有限公司 Computing device and method
CN109117455A (en) 2017-06-26 2019-01-01 上海寒武纪信息科技有限公司 Computing device and method

Also Published As

Publication number Publication date
TWI315489B (en) 2009-10-01
TW200636570A (en) 2006-10-16
CN1838031A (en) 2006-09-27

Similar Documents

Publication Publication Date Title
CN103150146B (en) Based on ASIP and its implementation of scalable processors framework
Trivedi et al. Design & analysis of 16 bit RISC processor using low power pipelining
CN100504725C (en) Closing non-acting numeric value logical operation unit to save power
AU618142B2 (en) Tightly coupled multiprocessor instruction synchronization
WO2001016710A1 (en) Data processor
CN101861585A (en) Method and apparatus for real time signal processing
Jafri et al. Energy-aware coarse-grained reconfigurable architectures using dynamically reconfigurable isolation cells
RU2182353C2 (en) Asynchronous data processing device
CN112486312A (en) Low-power-consumption processor
CN104461758A (en) Exception handling method and structure tolerant of missing cache and capable of emptying assembly line quickly
CN100451951C (en) 5+3 levels pipeline structure and method in RISC CPU
CN100373295C (en) Method for effecting the controlled shutdown of data processing units
CN101253480B (en) Computer having dynamically-changeable instruction set in real time
CN101833433B (en) Tri-valued, thermal-insulating and low-power multiplier unit and multiplier
US7788470B1 (en) Shadow pipeline in an auxiliary processor unit controller
CN111008042B (en) Efficient general processor execution method and system based on heterogeneous pipeline
US20040248353A1 (en) Processor and semiconductor integrated circuit
CN106547514B (en) A kind of high energy efficiency binary adder based on clock stretching technique
CN101923386A (en) Method and device for reducing CPU power consumption and low power consumption CPU
CN101930281B (en) Method and device for reducing power consumption of CPU and low-power CPU
Heysters et al. A reconfigurable function array architecture for 3G and 4G wireless terminals
Brackenbury An instruction buffer for a low-power DSP
CN1318959C (en) Method and system for forecasting conditional statement executive mode in processor
CN113407239B (en) Pipeline processor based on asynchronous monorail
Parker et al. Microprogramming: the challenges of VLSI

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant