CN110058840A - A low-power multiplier based on radix-4 Booth encoding - Google Patents

A low-power multiplier based on radix-4 Booth encoding

Info

Publication number
CN110058840A
Authority
CN
China
Prior art keywords
power gating
input
circuit
gating switch
input terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910238829.8A
Other languages
Chinese (zh)
Other versions
CN110058840B (en)
Inventor
余宁梅
马文恒
高钰迪
黄自力
张文东
刘和娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Technology
Original Assignee
Xian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Technology
Priority to CN201910238829.8A
Publication of CN110058840A
Application granted
Publication of CN110058840B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/523 Multiplying only

Abstract

The invention discloses a low-power multiplier based on radix-4 Booth encoding. It comprises an encoder group formed by at least two encoders connected in parallel. The input of the encoder group is connected to a bit selector, whose inputs are connected to the multiplier input port and the multiplicand input port respectively. A first power-gating switch is connected between the input of the bit selector and each of the multiplier and multiplicand input ports. The output of the encoder group is connected to the input of a compressor through a second power-gating switch, and the output of the compressor is connected to the input of a carry-lookahead adder through a third power-gating switch. The disclosed multiplier reduces power consumption while guaranteeing a correct result.

Description

A low-power multiplier based on radix-4 Booth encoding
Technical field
The invention belongs to the technical field of low-power multipliers, and in particular relates to a low-power multiplier based on radix-4 Booth encoding.
Background art
In chips such as high-speed digital signal processors (DSP), microprocessors (MCU) and RISC processors, the multiplier is an indispensable unit and often lies on the critical path, so the speed of the system frequently depends on the speed of the multiplier. For the pipeline to work properly, the multiplier in the execution unit must complete its operation within one clock cycle. Optimizing the multiplier design can therefore improve the computational efficiency and stability of the whole processor. Consequently, the design of high-speed, portable, low-power multipliers is an important and necessary part of system design for application-specific integrated circuits, digital signal processing and digital filtering.
One way to implement a high-speed, portable, low-power multiplier is to increase the amount of parallel computation and reduce the amount of subsequent computation. For an N-bit multiplication, a conventional multiplier generates N partial products and obtains the final result by accumulating them. Since the Booth encoding algorithm appeared, multiplier performance has improved considerably. Its basic principle is to simplify the operation by reducing the number of partial products; the more bits the multiplier and multiplicand have, the more pronounced the simplification becomes. Typical Booth encoding algorithms are radix-2, radix-4 and radix-8 Booth encoding. The radix-2 coding table is simple and the algorithm is easy to implement, but it does not simplify the operation. Radix-4 Booth encoding reduces the amount of computation by 1/2, and its coding circuit is easy to implement. Radix-8 Booth encoding reduces the amount of computation by 3/4, but its coding table contains a multiplication by (-3), which cannot be implemented with a simple shift and complement circuit. When a processor performs a multiplication, the multiplier and multiplicand are 32 bits wide. Since 2^32 = 4 294 967 296, the 64-bit product of two 32-bit numbers can be an enormous number. In practice such large numbers are rarely used; that is, the high-order bits of the multiplier B and the multiplicand A taking part in the operation are very likely to be filled with zeros. In this case, conventional encoding, compression and summation not only waste a great deal of time but also occupy many hardware resources and raise the power consumption of the whole system.
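As a rough illustration of why radix-4 Booth encoding halves the number of partial products, the following Python sketch recodes an n-bit two's-complement multiplier into n/2 digits drawn from {-2, -1, 0, 1, 2}. The names `booth4_digits` and `booth4_multiply` are illustrative assumptions and do not come from the patent; the recoding rule is the textbook one.

```python
def booth4_digits(b: int, n: int = 32) -> list[int]:
    """Recode an n-bit two's-complement integer b (n even) into n/2
    radix-4 Booth digits, least significant digit first."""
    if n % 2:
        raise ValueError("n must be even")
    bits = b & ((1 << n) - 1)              # two's-complement bit pattern of b
    digits = []
    prev = 0                               # b_{-1} is defined as 0
    for i in range(n // 2):
        b0 = (bits >> (2 * i)) & 1         # b_{2i}
        b1 = (bits >> (2 * i + 1)) & 1     # b_{2i+1}
        digits.append(-2 * b1 + b0 + prev)
        prev = b1                          # becomes b_{2i-1} of the next group
    return digits


def booth4_multiply(a: int, b: int, n: int = 32) -> int:
    """Sum the n/2 partial products E_i * A * 4^i."""
    return sum(e * a * 4**i for i, e in enumerate(booth4_digits(b, n)))
```

A 32-bit operand yields 16 digits here, against 32 partial products for bit-by-bit multiplication; each non-zero digit needs only a shift and, for negative digits, a complement.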
In addition, when the multiplier circuit performs a multiplication, the encoding unit, the compression unit and the carry-lookahead adder unit operate serially. While a preceding stage has not finished its operation, the following stages remain in a waiting state: they are powered on but take no part in the computation, which increases the power consumption of the system. Furthermore, while the partial products pass through the compression unit for summation, the carry signals and the partial-product signals are generated with different delays, so races and hazards may occur on entry to the next stage of the Wallace-tree compression circuit and produce an incorrect result.
Summary of the invention
The object of the invention is to provide a low-power multiplier based on radix-4 Booth encoding that reduces power consumption while guaranteeing a correct result.
The first technical solution adopted by the invention is a low-power multiplier based on radix-4 Booth encoding, comprising an encoder group formed by at least two encoders connected in parallel. The input of the encoder group is connected to a bit selector, whose inputs are connected to the multiplier input port and the multiplicand input port respectively. A first power-gating switch is connected between the input of the bit selector and each of the multiplier and multiplicand input ports; it turns the circuit on or off according to whether the input multiplier or multiplicand is zero. The encoder group controls the complement signal and outputs the partial products. The output of the encoder group is connected to the input of a compressor through a second power-gating switch, which turns the circuit on according to the maximum delay with which the encoder group generates the partial products. The output of the compressor is connected to the input of a carry-lookahead adder through a third power-gating switch, which turns the circuit on upon receiving the final pseudo-sum and carry signals output by the compressor. The output of the carry-lookahead adder delivers the product of the multiplicand and the multiplier.
The invention is further characterized in that:
Each encoder has three data inputs, all connected to the output of the bit selector, and the output of each encoder is connected to the second power-gating switch.
The encoder logic includes a three-input AND gate whose three inputs are the inputs of the encoder. The output of the three-input AND gate is connected to an input of a carry-save adder (CSA), and the three inputs of the AND gate are also connected to an input of the CSA through register I, which outputs the Ei corresponding to the output signals of the bit selector. The pseudo-sum output of the CSA is connected to the input of register II, and the output of register II is connected to the input of a shift register, which outputs the partial product. A complement-control circuit is connected between the output of the three-input AND gate and the shift register; it controls whether the complement takes part in forming the partial product.
The complement-control circuit includes register III. An inverter is connected between register III and the output of the three-input AND gate, with the input of the inverter connected to the output of the AND gate. The output of register III is connected to the input of the shift register through a 2-to-1 selector. The 2-to-1 selector is connected to a fourth power-gating switch, which is also connected to the 2-to-1 selector through a complement circuit. The complement circuit generates the partial products that are computed using the complement, and the fourth power-gating switch turns the complement circuit on and off.
The compressor is a Wallace-tree compressor built from multiple carry-save adders. Each stage of the compression circuit in the Wallace tree is connected to a fifth power-gating switch that turns that stage on and off. The fifth power-gating switch of each stage is connected in series with that of the next stage and turns the next stage's fifth power-gating switch on after the maximum computation delay of its own stage. Each CSA in the Wallace-tree compressor is also connected to its own sixth power-gating switch, which turns that CSA on and off.
The beneficial effects of the invention are:
(1) In the low-power multiplier based on radix-4 Booth encoding of the invention, the first power-gating switch controls the operation of the whole multiplier before the computation starts, the second power-gating switch is placed before the compressor, and the third power-gating switch is placed before the carry-lookahead adder, so that each circuit is turned on exactly when it is needed and unnecessary standby power is reduced;
(2) In the low-power multiplier based on radix-4 Booth encoding of the invention, multiple encoders operate in parallel and each encoder contains a three-input AND gate; according to the output level of the AND gate, the corresponding bits of the multiplier are handled either by a shift or by a complement partial-product operation, so the partial products are obtained quickly and power consumption is reduced;
(3) In the low-power multiplier based on radix-4 Booth encoding of the invention, the compressor uses a Wallace-tree compression structure built from carry-save adders. A logic unit turns the CSAs on and off according to the sum of the maximum computation delay and the decision time. In addition, a power-gating control switch on each CSA checks whether the three signals input to that CSA are zero; if all three inputs are zero, the CSA is turned off and zero is output directly. This achieves an accurate computation while reducing power consumption.
Description of the drawings
Fig. 1 is a structural schematic of the low-power multiplier based on radix-4 Booth encoding of the invention;
Fig. 2 is the encoder circuit diagram of the low-power multiplier based on radix-4 Booth encoding of the invention;
Fig. 3 is the compressor circuit diagram of the low-power multiplier based on radix-4 Booth encoding of the invention;
Fig. 4 is the radix-4 Booth coding table of the encoder in the low-power multiplier based on radix-4 Booth encoding of the invention;
Fig. 5 is the design timing diagram of the low-power multiplier based on radix-4 Booth encoding of the invention;
Fig. 6 is the partial-product arrangement diagram of the low-power multiplier based on radix-4 Booth encoding of the invention.
Specific embodiment
The following describes the present invention in detail with reference to the accompanying drawings and specific embodiments.
The specific structure of the low-power multiplier based on radix-4 Booth encoding of the invention is explained for a multiplier that handles 32-bit multiplication. In the drawings, PG1, PG2, PG3, PG4, PG5 and PG6 denote the first, second, third, fourth, fifth and sixth power-gating switches respectively, and CSA denotes a carry-save adder.
As shown in Fig. 1, the low-power multiplier based on radix-4 Booth encoding of the invention includes an encoder group formed by 17 encoders connected in parallel. The input of the encoder group is connected to a bit selector, whose inputs are connected to the multiplier input port and the multiplicand input port respectively. A first power-gating switch is connected between the input of the bit selector and each of the multiplier and multiplicand input ports; it turns the circuit on or off according to whether the input multiplier or multiplicand is zero. The encoder group controls the complement signal and outputs the partial products. The output of the encoder group is connected to the input of the compressor through a second power-gating switch, which turns the circuit on according to the maximum delay with which the encoder group generates the partial products. The output of the compressor is connected to the input of the carry-lookahead adder through a third power-gating switch, which turns the circuit on upon receiving the final pseudo-sum and carry signals output by the compressor. The output of the carry-lookahead adder delivers the product of the multiplicand and the multiplier.
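The top-level data path described above can be sketched as a behavioural model in Python. This is only a sketch under stated assumptions: the function names are hypothetical, the model uses 16 signed Booth digits rather than the 17 encoders of the embodiment, timing is ignored, and the `powered` list merely records which blocks the model treats as switched on.

```python
def csa(x: int, y: int, z: int) -> tuple[int, int]:
    """3-2 carry-save adder: returns (pseudo-sum, carry) with x + y + z == s + c."""
    return x ^ y ^ z, ((x & y) | (x & z) | (y & z)) << 1


def multiply_gated(a: int, b: int, n: int = 32) -> tuple[int, list[str]]:
    """Return (product, blocks switched on) for signed n-bit operands."""
    powered: list[str] = []
    if a == 0 or b == 0:                     # PG1: a zero operand gives 0 directly
        return 0, powered
    powered.append("encoder group")          # PG1 enables the encoders
    bits = b & ((1 << n) - 1)
    rows, prev = [], 0
    for i in range(n // 2):                  # radix-4 Booth partial products
        b0 = (bits >> (2 * i)) & 1
        b1 = (bits >> (2 * i + 1)) & 1
        rows.append(((-2 * b1 + b0 + prev) * a) << (2 * i))
        prev = b1
    powered.append("compressor")             # PG2 enables the compressor
    while len(rows) > 2:                     # reduce to pseudo-sum and carry
        s, c = csa(*rows[:3])
        rows = rows[3:] + [s, c]
    powered.append("carry-lookahead adder")  # PG3 enables the final adder
    return sum(rows), powered
```

With a zero operand the model returns before any block is enabled, which is the case the first power-gating switch is meant to catch.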
As shown in Fig. 2, each encoder has three data inputs, all connected to the output of the bit selector, and the output of each encoder is connected to the second power-gating switch.
The encoder logic includes a three-input AND gate whose three inputs are the inputs of the encoder. The output of the three-input AND gate is connected to an input of a carry-save adder (CSA), and the three inputs of the AND gate are also connected to an input of the CSA through register I, which outputs the Ei corresponding to the output signals of the bit selector. The pseudo-sum output of the CSA is connected to the input of register II, and the output of register II is connected to the input of a shift register, which outputs the partial product. A complement-control circuit is connected between the output of the three-input AND gate and the shift register; it controls whether the complement takes part in forming the partial product.
The complement-control circuit includes register III. An inverter is connected between register III and the output of the three-input AND gate, with the input of the inverter connected to the output of the AND gate. The output of register III is connected to the input of the shift register through a 2-to-1 selector. The 2-to-1 selector is connected to a fourth power-gating switch, which is also connected to the 2-to-1 selector through a complement circuit. The complement circuit generates the partial products that are computed using the complement, and the fourth power-gating switch turns the complement circuit on and off.
As shown in Fig. 3, the compressor is a Wallace-tree compressor built from multiple carry-save adders. Each stage of the compression circuit in the Wallace tree is connected to a fifth power-gating switch that turns that stage on and off. The fifth power-gating switch of each stage is connected in series with that of the next stage and turns the next stage's fifth power-gating switch on after the maximum computation delay of its own stage. Each CSA in the Wallace-tree compressor is also connected to its own sixth power-gating switch, which turns that CSA on and off.
All power-gating switches are connected to the power supply, which drives them.
Operating principle of the low-power multiplier based on radix-4 Booth encoding of the invention: the multiplier is a fast arithmetic unit used in the integer instruction processing of the execution module. This design is a low-power multiplier built on the radix-4 Booth encoding algorithm, and the CSAs in the multiplier are of the 3-2 type. Before a multiplication is performed, all logic in the multiplier is off. Before the multiplier starts computing, the first power-gating switch checks whether the multiplicand and the multiplier are non-zero. If at least one of them is zero, zero is output directly and passed to the write-back module. If neither the multiplier B nor the multiplicand A is zero, the logic of the encoder section is turned on. CSA1, CSA2, CSA3, CSA4, CSA5 and CSA6 are then turned on once the elapsed time reaches the sum of the maximum delay for generating the partial products and the processing time of the second power-gating switch. Because the CSAs of the first compression stage operate in parallel, they are turned on and off synchronously and only need to be connected to the same power-gating switch.
After CSA1 to CSA6 have been turned on simultaneously, the second-stage compression circuit (CSA7, CSA8, CSA9, CSA10) is turned on after waiting for the sum of the maximum delay of one CSA and the decision time of the first fifth power-gating switch, while the first-stage compression circuit and the encoding circuit are kept on. Likewise, after waiting for the sum of the maximum delay of one CSA and the decision time of the power gating, the third-stage compression circuit (CSA11, CSA12) is turned on, while the encoding circuit and the first- and second-stage compression circuits are kept on. After waiting for the same sum again, the fourth-stage compression circuit (CSA13, CSA14) is turned on, while the encoding circuit and the first three compression stages are kept on. After waiting for the same sum again, the fifth-stage compression circuit (CSA15) is turned on, while the encoding circuit and the first four compression stages are kept on. After waiting for the same sum once more, the sixth-stage compression circuit (CSA16) is turned on, while the encoding circuit and the first five compression stages are kept on. This yields the final carry signal and sum signal, which are added according to their weights to give the result of the multiplier. At this point all circuits in the multiplier are on.
After the multiplication is complete, the encoder group, the Wallace-tree compressor and the carry-lookahead adder of the multiplier are turned off to await the next multiplication. The timing diagram is shown in Fig. 5.
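The staged turn-on sequence described above can be sketched as a simple schedule calculation. The delay values are hypothetical placeholders in arbitrary time units, and the turn-on time assigned to the carry-lookahead adder (one further CSA delay plus one gating decision after the sixth stage) is an assumption, since the text only says that it is enabled once the final pseudo-sum and carry arrive.

```python
def enable_schedule(t_enc: int, t_pg: int, t_csa: int, stages: int = 6) -> dict[str, int]:
    """Return the turn-on time of each block, measured from the moment
    the first power-gating switch enables the encoder group.

    t_enc: maximum delay for generating the partial products
    t_pg:  decision/processing time of one power-gating switch
    t_csa: maximum delay of one carry-save adder
    """
    schedule = {"encoder group": 0}
    t = t_enc + t_pg                  # partial products ready plus PG2 processing
    for k in range(1, stages + 1):
        schedule[f"compression stage {k}"] = t
        t += t_csa + t_pg             # one CSA delay plus one gating decision
    schedule["carry-lookahead adder"] = t   # assumed: after the last stage settles
    return schedule


# Hypothetical delays, for illustration only.
for block, t_on in enable_schedule(t_enc=5, t_pg=1, t_csa=2).items():
    print(f"{block:24s} on at t = {t_on}")
```

Blocks already switched on stay on until the multiplication completes, so the schedule lists turn-on times only.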
The computation of the multiplier is divided into three steps: generation of the partial products, compression of the partial products, and addition of the carry and the pseudo-sum to obtain the final result. Specifically:
Let the multiplicand be A = a31a30…a0 and the multiplier be B = b31b30…b0, where a31 and b31 are the sign bits and P is the product; then:
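The formula introduced by "then:" is not reproduced in this text. As a point of reference only, the textbook radix-4 Booth recoding of a 32-bit two's-complement multiplier can be written as follows, with b_{-1} defined as 0. This is the standard identity and is not claimed to be the exact expression of the patent; in particular, the embodiment uses 17 encoders, so its own sum presumably runs over one more digit than the 16-digit signed form shown here.

```latex
\begin{aligned}
B &= -b_{31}\,2^{31} + \sum_{i=0}^{30} b_i\,2^{i}
   = \sum_{i=0}^{15} E_i\,4^{i},
\qquad E_i = -2\,b_{2i+1} + b_{2i} + b_{2i-1},\quad b_{-1} = 0,\\[4pt]
P &= A \times B = \sum_{i=0}^{15} E_i\,A\,4^{i},
\qquad E_i \in \{-2,\,-1,\,0,\,+1,\,+2\}.
\end{aligned}
```

Each term E_i · A · 4^i is one partial product, so the weight of partial product i is 4^i.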
The coding table is shown in Fig. 4. There are eight cases corresponding to five different operations: adding or subtracting one times the multiplicand, adding or subtracting two times the multiplicand, and two cases of adding zero times the multiplicand. In the original Booth encoding these two zero cases are (+0) and (-0). In the encoding stage of this multiplier, the (-0) case is given a "+0" treatment using a carry-save adder, i.e. a signal-inversion operation, so that it becomes (+0); the (+0) and (-0) entries of the coding table are thus both changed to (+0), that is, (-0) is processed as (+0). Changing the value of the Ei signal through the adder circuit in this way lowers the toggle rate of the Ei signals entering register II and register III, that is, it lowers the signal toggle rate during encoding and reduces power consumption. The circuit is implemented as follows: an AND circuit for the three signals b_{2i+1}, b_{2i}, b_{2i-1} is added to the original encoding circuit in the encoder. b_{2i+1}, b_{2i}, b_{2i-1} serve as the inputs of the register and of the three-input AND gate, and the AND gate outputs the result EN1. If EN1 is high, i.e. (b_{2i+1}, b_{2i}, b_{2i-1}) = (1,1,1), EN1 acts as the enable signal of the CSA and makes it perform an addition that applies the "+0" operation to Ei1. The source operand Ei1 of the addition is read from register I, which stores the Ei1 signal corresponding to (b_{2i+1}, b_{2i}, b_{2i-1}). After the addition, Ei1 is output as the sum signal S of the CSA, Ei2 is fetched through register II, and the partial product is finally output by the shift register. If EN1 is low, i.e. (b_{2i+1}, b_{2i}, b_{2i-1}) are not all 1, EN1 is inverted by the inverter to produce the EN2 signal, which enters register III to fetch Ei3. If Ei3 is (-2) or (-1), the fourth power-gating switch turns on the complement circuit and its path and passes the complement signal to the shift register, which outputs the partial product; that is, when the fourth power-gating switch is on, the 2-to-1 selector is switched to the complement circuit to generate the partial product. In all other cases Ei3 is sent directly to the shift register, the corresponding shift is performed, and the partial product is obtained. Register I, register II and register III each store the eight codes of (b_{2i+1}, b_{2i}, b_{2i-1}) and the Ei value corresponding to each code.
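The three routes described above can be summarised in a small behavioural sketch. The table below is the standard radix-4 Booth table with the (1,1,1) entry already written as 0; the route labels and the function name `encode` are illustrative assumptions, since Fig. 4 itself is not reproduced in the text.

```python
# Standard radix-4 Booth table: (b_{2i+1}, b_{2i}, b_{2i-1}) -> Ei
BOOTH4 = {
    (0, 0, 0): 0,  (0, 0, 1): 1,  (0, 1, 0): 1,  (0, 1, 1): 2,
    (1, 0, 0): -2, (1, 0, 1): -1, (1, 1, 0): -1, (1, 1, 1): 0,
}


def encode(b_hi: int, b_mid: int, b_lo: int) -> tuple[int, str]:
    """Return (Ei, route) for one group of three multiplier bits."""
    en1 = b_hi & b_mid & b_lo            # three-input AND gate
    if en1:
        # EN1 high: the CSA path turns the (-0) case into (+0).
        return 0, "csa_plus_zero"
    # EN1 low: EN2 = not EN1 selects the register III path.
    e = BOOTH4[(b_hi, b_mid, b_lo)]
    if e in (-1, -2):
        # Fourth power-gating switch on: selector takes the complement circuit.
        return e, "complement"
    return e, "shift"                    # plain shift, complement circuit off


for triple in BOOTH4:
    print(triple, encode(*triple))
```

Only three of the eight codes ever switch the complement circuit on, which is where the saving from the fourth power-gating switch comes from.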
In the partial-product generation circuit, this design takes every group b_{2i+1}, b_{2i}, b_{2i-1} used from the multiplier B and compares it bit by bit with the b_{2i+1}, b_{2i}, b_{2i-1} entries of the radix-4 Booth coding table. With 17 encoders working in parallel, the partial products are found quickly and fed into the compression module. To ensure that the partial products remain correct, the Booth encoding module stays on throughout the operation of the multiplier.
The coding table contains the operations of subtracting one times the multiplicand (-A) and subtracting two times the multiplicand (-2*A). In a real circuit the subtraction can be replaced by adding the complement, which reduces signal toggling and hence power. However, the complement of a negative number is obtained by inverting it bit by bit and then adding one, so every complement computation would introduce an additional adder, which not only increases the area of the system and lowers its speed but also brings extra power consumption. This design therefore adds a complement circuit to the encoding circuit: a single adder unit first computes, in turn, the complements of the two operands (-A) and (-2*A), and the two complements are passed to every partial-product generation unit over interconnects with power gating. If the values of b_{2i+1}, b_{2i}, b_{2i-1} are 1,0,0, the power gating turns on the complement signal of (-2*A); if they are 1,0,1 or 1,1,0, it turns on the complement signal of (-A); otherwise both complement signals are turned off by the power gating. That is, when the complement is not needed to generate a partial product, the connection of the 2-to-1 switch to the complement circuit is inactive, and the complement signal stays off and is not passed into the encoding unit. This reduces the extra power caused by signal toggling, reduces the number of adders and the area of the system, and, because the complements of the multiplicand are computed in advance, speeds up partial-product generation.
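The idea of computing the two complements once and then only selecting them per digit can be sketched as follows. The sketch works on 2n-bit two's-complement bit patterns; the function names are hypothetical, and 16 signed digits are used instead of the 17 encoders of the embodiment.

```python
def partial_products(a: int, b: int, n: int = 32) -> list[int]:
    """Partial products of signed n-bit a and b as 2n-bit two's-complement patterns."""
    mask = (1 << (2 * n)) - 1
    a_ext = a & mask                        # sign-extended multiplicand A
    # Both complements are computed once, before any digit is examined.
    neg_a = (~a_ext + 1) & mask             # -A: invert bit by bit, add one
    neg_2a = (~(a_ext << 1) + 1) & mask     # -2A
    bits = b & ((1 << n) - 1)
    prev, out = 0, []
    for i in range(n // 2):
        triple = ((bits >> (2 * i + 1)) & 1, (bits >> (2 * i)) & 1, prev)
        if triple == (1, 0, 0):
            pp = neg_2a                     # complement signal of -2A switched on
        elif triple in ((1, 0, 1), (1, 1, 0)):
            pp = neg_a                      # complement signal of -A switched on
        elif triple in ((0, 0, 1), (0, 1, 0)):
            pp = a_ext                      # +A, complement path off
        elif triple == (0, 1, 1):
            pp = (a_ext << 1) & mask        # +2A, complement path off
        else:
            pp = 0                          # (0,0,0) and (1,1,1)
        out.append((pp << (2 * i)) & mask)  # weight 4^i
        prev = triple[0]
    return out


def to_signed(x: int, width: int) -> int:
    """Interpret a width-bit pattern as a two's-complement integer."""
    return x - (1 << width) if (x >> (width - 1)) & 1 else x
```

However many digits call for (-A) or (-2*A), only two negations are ever performed; the per-digit work is a selection and a shift.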
The Wallace tree is a tree algorithm for reducing partial products. For the compressor, this design uses a Wallace-tree compression structure built from CSAs. For 32-bit multiplication the design needs six compression stages, and the total number of partial products and carries changes as 17 → 12 → 8 → 6 → 4 → 3 → 2. Because the carry signals and the partial-product signals are generated with different delays, races and hazards may occur when the carry signal C and the sum signal S leave one compression stage and enter the next stage of the Wallace-tree compression circuit. This design uses a logic circuit that, based on the maximum delay of partial-product generation, turns the CSAs on and off through the fifth and sixth power-gating switches, which reduces the power consumption of the system while guaranteeing that the carries and partial products are computed correctly.
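The stage counts quoted above follow from the fact that each 3-2 CSA turns three rows into two. A minimal sketch, assuming that rows left over after grouping in threes pass to the next stage unchanged (the embodiment instead pads a two-row group with a zero and still sends it through a CSA, which gives the same counts):

```python
def wallace_counts(n: int) -> list[int]:
    """Row counts after each stage of 3-2 compression, starting from n rows."""
    counts = [n]
    while n > 2:
        n = 2 * (n // 3) + n % 3   # every full group of 3 becomes 2; leftovers pass through
        counts.append(n)
    return counts


print(wallace_counts(17))   # [17, 12, 8, 6, 4, 3, 2]
```

The list has seven entries for 17 rows, i.e. six compression stages.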
The 17 partial products generated in the first step are the products of the multiplicand A with each encoded group of the multiplier B, and their weights differ as shown in Fig. 6. Before they enter the Wallace-tree compressor, their high and low bits must be extended according to the weight of each partial product, so that the three partial products entering the same CSA have the same weight; only then can the correct carry and sum signals be computed. The basic building block of this module is the CSA. How the CSAs are grouped determines the logic depth and complexity of the whole circuit and even affects the complexity of the interconnect routing, which has a significant effect on power consumption.
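The high- and low-bit extension mentioned above can be sketched as sign extension followed by a left shift by the weight of the partial product. The widths are assumptions made for illustration: a raw partial product is taken to be 34 bits wide (enough for ±2·A with a 32-bit A) and the aligned row 64 bits wide; Fig. 6 is not reproduced in the text, so the actual layout may differ.

```python
def align(pp: int, i: int, in_width: int = 34, out_width: int = 64) -> int:
    """Sign-extend an in_width-bit partial product to out_width bits and
    shift it to weight 4^i, filling the vacated low bits with zeros."""
    in_mask = (1 << in_width) - 1
    out_mask = (1 << out_width) - 1
    value = pp & in_mask
    if (value >> (in_width - 1)) & 1:      # negative: replicate the sign bit upward
        value |= out_mask & ~in_mask
    return (value << (2 * i)) & out_mask   # low-bit extension by 2*i zeros
```

After alignment all rows share the same bit positions, so any three of them can be fed to one CSA column by column.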
In the CSA arrangement of this design, the six 3-2 CSAs of the first stage are connected by interconnects to one fifth power-gating switch. According to the maximum delay with which the encoder group generates the partial products, that is, the maximum delay of the encoder that computes (-2*A), the fifth power-gating switch turns on all first-stage CSAs in parallel. Each 3-2 CSA is in addition connected to its own sixth power-gating switch, which checks whether the three inputs of that CSA are all zero and keeps the CSA off if they are. Once the first-stage CSAs whose inputs are not all zero have been turned on, the 17 generated partial products are fed, under the control of the power-gating array, into the Wallace-tree compressor for partial-product compression, since the encoder group remains on.
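The per-CSA zero check performed by the sixth power-gating switch can be sketched as follows. The function names and the `powered` flag are illustrative assumptions; rows are modelled as Python integers, and the first stage is grouped as in the embodiment, with the third input of CSA6 padded with zero.

```python
def gated_csa(x: int, y: int, z: int) -> tuple[int, int, bool]:
    """3-2 CSA behind a zero-detecting power gate: returns (sum, carry, powered)."""
    if x == 0 and y == 0 and z == 0:
        return 0, 0, False                      # CSA stays off, zero is output directly
    s = x ^ y ^ z                               # pseudo-sum
    c = ((x & y) | (x & z) | (y & z)) << 1      # carry, one bit position higher
    return s, c, True


def first_stage(partials: list[int]) -> tuple[list[tuple[int, int, bool]], int]:
    """Run CSA1..CSA6 on 17 aligned partial products; also count the CSAs powered on."""
    if len(partials) != 17:
        raise ValueError("expected 17 partial products")
    padded = list(partials) + [0]               # third input of CSA6 is zero
    results = [gated_csa(*padded[k:k + 3]) for k in range(0, 18, 3)]
    return results, sum(1 for _, _, on in results if on)
```

With small operands most high-order partial products are zero, so only a few of the six first-stage CSAs are switched on in this model.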
The specific connection structure and compression principle of the Wallace-tree compressor are as follows. Of the 17 partial products (P0~P16) output by the encoders, P0, P1, P2 are the three inputs of CSA1; P3, P4, P5 are the three inputs of CSA2; P6, P7, P8 are the three inputs of CSA3; P9, P10, P11 are the three inputs of CSA4; P12, P13, P14 are the three inputs of CSA5; and P15, P16 are two inputs of CSA6, whose remaining input is padded with zero. The outputs S1, C1 of CSA1 and the output S2 of CSA2 are the three inputs of CSA7; the output C2 of CSA2 and the outputs S3, C3 of CSA3 are the three inputs of CSA8; the outputs S4, C4 of CSA4 and the output S5 of CSA5 are the three inputs of CSA9; and the output C5 of CSA5 and the outputs S6, C6 of CSA6 are the three inputs of CSA10. The outputs S7, C7 of CSA7 and the output S8 of CSA8 are the three inputs of CSA11, and the output C8 of CSA8 and the outputs S9, C9 of CSA9 are the three inputs of CSA12. The outputs S11, C11 of CSA11 and the output S12 of CSA12 are the three inputs of CSA13, and the output C12 of CSA12 and the outputs S10, C10 of CSA10 are the three inputs of CSA14. The outputs S13, C13 of CSA13 and the output S14 of CSA14 are the three inputs of CSA15. The outputs S15, C15 of CSA15 and the output C14 of CSA14 are the three inputs of CSA16. The outputs S16, C16 of CSA16 are the two inputs of the carry-lookahead adder.
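The wiring listed above can be checked with a short behavioural model: if every CSA preserves the sum of its three inputs, then S16 + C16 must equal the sum of P0 to P16. The sketch below follows the connections exactly as listed; the function names are hypothetical, and rows are modelled as Python integers that are already aligned by weight, with each carry shifted one bit to the left.

```python
def csa(x: int, y: int, z: int) -> tuple[int, int]:
    """3-2 carry-save adder on whole rows: x + y + z == s + c."""
    return x ^ y ^ z, ((x & y) | (x & z) | (y & z)) << 1


def wallace16(p: list[int]) -> tuple[int, int]:
    """Reduce 17 aligned partial products to (S16, C16) with CSA1..CSA16."""
    if len(p) != 17:
        raise ValueError("expected 17 partial products")
    s: dict[int, int] = {}
    c: dict[int, int] = {}

    def run(k: int, x: int, y: int, z: int) -> None:
        s[k], c[k] = csa(x, y, z)

    for k in range(5):                          # stage 1: CSA1..CSA5
        run(k + 1, p[3 * k], p[3 * k + 1], p[3 * k + 2])
    run(6, p[15], p[16], 0)                     # CSA6, third input padded with zero
    run(7, s[1], c[1], s[2])                    # stage 2
    run(8, c[2], s[3], c[3])
    run(9, s[4], c[4], s[5])
    run(10, c[5], s[6], c[6])
    run(11, s[7], c[7], s[8])                   # stage 3
    run(12, c[8], s[9], c[9])
    run(13, s[11], c[11], s[12])                # stage 4
    run(14, c[12], s[10], c[10])
    run(15, s[13], c[13], s[14])                # stage 5
    run(16, s[15], c[15], c[14])                # stage 6
    return s[16], c[16]


rows = list(range(1, 18))
s16, c16 = wallace16(rows)
print(s16 + c16 == sum(rows))
```

The final addition of S16 and C16 is the job of the carry-lookahead adder; the model uses ordinary integer addition in its place.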
The 17 partial products (P0~P16) output by the encoders enter the CSAs in order of weight, from low to high. The first-stage CSAs (CSA1~CSA6) are connected in parallel, and together they are connected in series with the fifth power gating switch of the first stage. This switch turns all CSA compression units of the first stage on and off at the same time, so that the CSAs of the stage compress simultaneously. In addition, each CSA is individually connected to a sixth power gating switch that turns that CSA on and off: if all three inputs of the CSA are zero, the CSA circuit is turned off.
The output of the first-stage compression circuit is connected to the input of the second-stage compression circuit. The second stage consists of CSA7~CSA10 in parallel. The fifth power gating switch of the second stage is connected in series with CSA7~CSA10 and also with the fifth power gating switch of the first stage. It is on, and turns on the compression circuit of its stage, only when the fifth power gating switch of the first stage is on. Each of CSA7~CSA10 is independently connected to a sixth power gating switch, which judges whether the three inputs of the CSA are all zero and turns the CSA on or off accordingly.
The output of the second-stage compression circuit is connected to the input of the third-stage compression circuit. The third stage consists of CSA11 and CSA12 in parallel. The fifth power gating switch of the third stage is connected in series with CSA11~CSA12 and also with the fifth power gating switch of the second stage. It is on, and turns on the compression circuit of its stage, only when the fifth power gating switches of the first and second stages are both on. CSA11 and CSA12 are each independently connected to a sixth power gating switch, which judges whether the three inputs of the CSA are all zero and turns the CSA on or off accordingly.
The output of the third-stage compression circuit is connected to the input of the fourth-stage compression circuit. The fourth stage consists of CSA13 and CSA14 in parallel. The fifth power gating switch of the fourth stage is connected in series with CSA13~CSA14 and also with the fifth power gating switch of the third stage. It is on, and turns on the CSA13~CSA14 compression circuit of its stage, if and only if the fifth power gating switches of the first, second and third stages are all on. CSA13 and CSA14 are each independently connected to a sixth power gating switch, which judges whether the three inputs of the CSA are all zero and turns the CSA on or off accordingly.
The output of the fourth-stage compression circuit is connected to the input of the fifth-stage compression circuit. The fifth stage consists of CSA15, which is connected in series with the fifth power gating switch of the fifth stage. That switch is on, and turns on the CSA15 compression circuit of its stage, if and only if the fifth power gating switches of the first, second, third and fourth stages are all on. CSA15 is also connected to a sixth power gating switch, which judges whether the three inputs of the CSA are all zero and turns the CSA on or off accordingly.
The output of the fifth-stage compression circuit is connected to the input of the sixth-stage compression circuit. The sixth stage consists of CSA16, which is connected in series with the fifth power gating switch of the sixth stage. That switch is on, and turns on the CSA16 compression circuit of its stage, if and only if the fifth power gating switches of the first, second, third, fourth and fifth stages are all on. CSA16 is also connected to a sixth power gating switch, which judges whether the three inputs of the CSA are all zero and turns the CSA on or off accordingly.
This forms the Wallace tree compressor with a power gating switch array. Under the control of the power gating switch array, the power consumption of the compressor is considerably reduced.
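The gating rules described above can be sketched behaviourally. The Python model below is an illustration under stated assumptions, not the circuit itself: `stage_enables` and `gated_csa` are hypothetical names, a switched-off CSA is assumed to output zeros, and the all-zero check stands in for the sixth power gating switch.

```python
def stage_enables(switch_requests):
    """Model the series chain of fifth power gating switches.

    switch_requests[k] is True when stage k+1's own switch is asked to open.
    A stage is actually open only if its switch and every earlier stage's
    switch are open, because the switches are connected in series.
    """
    enables = []
    upstream_open = True
    for request in switch_requests:
        upstream_open = upstream_open and request
        enables.append(upstream_open)
    return enables


def gated_csa(a, b, c, stage_open=True):
    """Model one CSA behind its sixth power gating switch.

    Returns (pseudo-sum, carry, powered). The CSA is switched off when its
    stage is closed or when all three inputs are zero; in that case the
    outputs are assumed to be zero (a modelling assumption).
    """
    if not stage_open or (a == 0 and b == 0 and c == 0):
        return 0, 0, False
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry, True
```

With sparse operands many partial products are zero, so in this model the corresponding first-stage CSAs report `powered == False`, which is the saving the switch array is meant to deliver.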
The compression process is as follows. First, the partial products P0, P1, P2 input to CSA1 are extended at their high and low bits into partial sums of the same weight, that is, four zeros are appended after the lowest bit of P2 and two zeros after the lowest bit of P1. Next, the partial products P3, P4, P5 input to CSA2, P6, P7, P8 input to CSA3, P9, P10, P11 input to CSA4, P12, P13, P14 input to CSA5, and P15, P16 input to CSA6 are each extended in the same way as for CSA1 into partial products of the same weight, and the first-stage partial-sum compression is performed.
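The low-order zero padding follows from the fact that in radix-4 Booth multiplication the i-th partial product carries weight 4^i, so neighbouring partial products are offset by two bit positions. A minimal Python sketch of that padding for one CSA group is given below; `align_group` is a hypothetical helper, and the high-order sign extension mentioned in the text is deliberately not modelled.

```python
def align_group(group, bits_per_step=2):
    """Append trailing zeros so every partial product in one CSA group shares
    the weight of the group's first (lowest-weight) partial product.

    The k-th entry is shifted left by k * bits_per_step bits: two zeros for
    the second entry and four zeros for the third, as described for P1 and P2.
    """
    return [p << (k * bits_per_step) for k, p in enumerate(group)]


# Made-up bit patterns standing in for P0, P1, P2
aligned = align_group([0b1011, 0b0110, 0b0001])
```

The same helper applies to the two-input group P15, P16 of CSA6, where only the second entry receives two trailing zeros.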
After waiting for the detection time of the first-stage fifth power gating switch plus the maximum computation delay of a CSA, the second-stage CSA compression circuit is turned on by the first-stage fifth power gating switch. S1, C1, S2 output by CSA1 and CSA2 are fed into CSA7 as its three inputs; C2, S3, C3 output by CSA2 and CSA3 are fed into CSA8 as inputs; S4, C4, S5 output by CSA4 and CSA5 are fed into CSA9 as inputs; and C5, S6, C6 output by CSA5 and CSA6 are fed into CSA10 as inputs. Before compressing, each CSA uses its sixth power gating switch to detect whether its inputs are zero, and only then starts operation.
After waiting for the detection time of the second-stage fifth power gating switch plus the maximum computation delay of a CSA, the third-stage CSA compression circuit is turned on. S7, C7, S8 output by CSA7 and CSA8 are fed into CSA11 as its three inputs; C8, S9, C9 output by CSA8 and CSA9 are fed into CSA12 as inputs; and the outputs S10, C10 of CSA10 are passed on to the next-stage compression unit. Before compressing, each CSA uses its sixth power gating switch to detect whether its inputs are zero, and only then starts operation.
After waiting for the detection time of the third-stage power gating switch plus the maximum computation delay of a CSA, the fourth-stage CSA compression circuit is turned on. S11, C11, S12 output by CSA11 and CSA12 are fed into CSA13 as its three inputs; C12 output by CSA12 and S10, C10 output by CSA10 are fed into CSA14 as inputs. Before compressing, each CSA uses its sixth power gating switch to detect whether its inputs are zero, and only then starts operation.
After waiting for the detection time of the fourth-stage power gating switch plus the maximum computation delay of a CSA, the fifth-stage CSA compression circuit is turned on. S13, C13, S14 output by CSA13 and CSA14 are fed into CSA15 as its three inputs; the output C14 of CSA14 is passed on to the next-stage compression circuit. Before compressing, each CSA uses its sixth power gating switch to detect whether its inputs are zero, and only then starts operation.
After waiting for the detection time of the fifth-stage power gating switch plus the maximum computation delay of a CSA, the sixth-stage CSA compression circuit is turned on. S15, C15 output by CSA15 and the output C14 of CSA14 serve as the three inputs of CSA16, which produces the final pseudo-sum signal S and carry signal C. Before compressing, each CSA uses its sixth power gating switch to detect whether its inputs are zero, and only then starts operation.
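The stage-by-stage waiting rule amounts to a simple schedule: each stage opens one detection time plus one maximum CSA delay after the stage before it. The Python sketch below illustrates this under assumptions not stated in the source: all stages share the same detection time and CSA delay, time is measured from the moment the first stage opens, and the delay values and the name `stage_open_times` are hypothetical.

```python
def stage_open_times(n_stages, t_detect, t_csa_max):
    """Return the time at which each compression stage is switched on.

    Stage 1 opens at time 0; every later stage opens after the previous
    stage's switch detection time plus the maximum CSA computation delay.
    """
    step = t_detect + t_csa_max
    return [k * step for k in range(n_stages)]


# Hypothetical delays in arbitrary time units for the six-stage tree
schedule = stage_open_times(6, t_detect=2, t_csa_max=5)
```

In this model the pseudo-sum S and carry C become available one further `t_detect + t_csa_max` after the sixth stage opens.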
In summary, the 17 partial products generated by a 32-bit multiplication enter the first-stage CSAs, which are turned on by the first-stage fifth power gating switch, and are compressed in parallel into 6 partial products and 6 carry signals. The second-stage fifth power gating switch then turns on the second-stage CSAs, which compress the 6 partial products and 6 carry signals in parallel into 4 partial products and 4 carry signals. The third-stage fifth power gating switch then turns on the third-stage CSAs, which compress the 4 partial products and 4 carry signals into 3 partial products and 3 carry signals. The fourth-stage fifth power gating switch then turns on the fourth-stage CSAs, which compress the 3 partial products and 3 carry signals into 2 partial products and 2 carry signals. The fifth-stage fifth power gating switch then turns on the fifth-stage CSA, which compresses the 2 partial products and 2 carry signals into 1 partial product and 2 carry signals. Finally, the sixth-stage fifth power gating switch turns on the sixth-stage CSA, which compresses the 1 partial product and 2 carry signals into the pseudo-sum signal S and the carry signal C.
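The operand counts in this summary (12, 8, 6, 4, 3, then 2 signals) follow the usual 3:2 reduction rule: every full group of three operands becomes two, and leftovers are carried forward. The short Python sketch below reproduces the sequence; `reduction_schedule` is a hypothetical name. It passes leftover operands through unchanged, whereas the first stage described above feeds the two leftover partial products through CSA6 with a zero-padded input, which yields the same count.

```python
def reduction_schedule(n):
    """Number of operands remaining after each 3:2 compression stage.

    Each group of three operands is replaced by a pseudo-sum and a carry;
    operands that do not fill a group are carried to the next stage.
    The list starts with n and ends when two operands remain.
    """
    counts = [n]
    while n > 2:
        n = 2 * (n // 3) + n % 3
        counts.append(n)
    return counts


counts = reduction_schedule(17)   # [17, 12, 8, 6, 4, 3, 2]
stages = len(counts) - 1          # 6 compression stages
```

The six stages obtained here match the six CSA stages of the compressor.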
Finally, the third power gating switch turns on the carry lookahead adder, which adds the pseudo-sum signal S and the carry signal C to obtain the final result. Processing partial products of similar weight together in this way lets the number of sign-extension bits grow step by step to the number required. This reduces data transmission on the interconnect lines and, while guaranteeing a correct result, reduces the computation time and power consumption of the system.
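The source does not describe the internal structure of the carry lookahead adder. As one possible illustration, the Python sketch below adds S and C using generate and propagate signals combined in a parallel-prefix (Kogge-Stone style) network, which is a common way of computing all carries without rippling. The name `carry_lookahead_add` and the 64-bit default width (chosen for a 32×32-bit product) are assumptions.

```python
def carry_lookahead_add(a, b, width=64):
    """Add two non-negative integers modulo 2**width using generate/propagate.

    g marks bit positions that generate a carry, p marks positions that
    propagate one. Doubling the span d on each pass merges the signals so
    that after log2(width) passes g[i] is the carry out of bit i.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    g = a & b          # generate: both bits are 1
    p = a ^ b          # propagate: exactly one bit is 1
    half_sum = p       # per-bit sum before carries are applied
    d = 1
    while d < width:
        g = (g | (p & (g << d))) & mask   # carry generated here or passed up from d bits below
        p = (p & (p << d)) & mask         # whole span of 2*d bits propagates
        d <<= 1
    carries = (g << 1) & mask             # carry into bit i is the carry out of bit i-1
    return half_sum ^ carries
```

A call such as `carry_lookahead_add(S, C)` then plays the role of the final addition of the pseudo-sum and carry signals.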

Claims (5)

1. A low-power-consumption multiplier based on 4-Booth coding, characterized by comprising an encoder group composed of at least two encoders connected in parallel, wherein the input terminal of the encoder group is connected to a digit selector; the input terminals of the digit selector are connected to a multiplier input port and a multiplicand input port respectively; first power gating switches are connected between the input terminals of the digit selector and the multiplier input port and the multiplicand input port respectively, the first power gating switch opening or closing the circuit according to whether the input multiplier or multiplicand is zero; the encoder group outputs complement control signals and partial products; the output terminal of the encoder group is connected to the input terminal of a compressor through a second power gating switch, the second power gating switch opening the circuit according to the maximum delay with which the encoder group generates the partial products; the output terminal of the compressor is connected to the input terminal of a carry lookahead adder through a third power gating switch, the third power gating switch opening the circuit upon receiving the pseudo-sum and carry signals finally output by the compressor; and the output terminal of the carry lookahead adder outputs the product of the multiplicand and the multiplier.
2. The low-power-consumption multiplier based on 4-Booth coding as claimed in claim 1, characterized in that each encoder has three data input terminals, the three data input terminals of each encoder are connected to the output terminal of the digit selector, and the output terminal of each encoder is connected to the second power gating switch.
3. The low-power-consumption multiplier based on 4-Booth coding as claimed in claim 1, characterized in that the encoder logic circuit comprises a three-input AND gate whose three input terminals are the input terminals of the encoder; the output terminal of the three-input AND gate is connected to an input terminal of a carry save adder (CSA); the three input terminals of the three-input AND gate are connected to an input terminal of the carry save adder (CSA) through register I, and register I outputs the Ei corresponding to the output signal of the digit selector; the pseudo-sum signal output terminal of the carry save adder (CSA) is connected to the input terminal of register II; the output terminal of register II is connected to the input terminal of a shift register, and the shift register outputs the partial product; and a control-complement participation circuit is connected between the output terminal of the three-input AND gate and the shift register, the control-complement participation circuit controlling the participation of the complement in forming the partial product.
4. The low-power-consumption multiplier based on 4-Booth coding as claimed in claim 3, characterized in that the control-complement participation circuit comprises register III; an inverter is connected between register III and the output terminal of the three-input AND gate, the input terminal of the inverter being connected to the output terminal of the three-input AND gate; the output terminal of register III is connected to the input terminal of the shift register through a 2-to-1 selector; the 2-to-1 selector is connected to a fourth power gating switch; the fourth power gating switch is also connected to the 2-to-1 selector through a complement circuit; the complement circuit generates the partial products that are computed using complements; and the fourth power gating switch controls the turning on and gating of the complement circuit.
5. The low-power-consumption multiplier based on 4-Booth coding as claimed in claim 1, characterized in that the compressor is a Wallace tree compressor composed of multiple carry save adders (CSAs); each stage of compression circuit in the Wallace tree compressor is connected to a fifth power gating switch, which controls the turning on and off of the compression circuit of its stage; the fifth power gating switch of each stage is connected in series with the fifth power gating switch of the next stage; after the maximum computation delay of the compression circuit of its stage, the fifth power gating switch of each stage controls the fifth power gating switch of the next stage to turn on; and each carry save adder (CSA) in the Wallace tree compressor is connected to a sixth power gating switch, which controls the turning on and off of that carry save adder (CSA).
CN201910238829.8A 2019-03-27 2019-03-27 Low-power-consumption multiplier based on 4-Booth coding Active CN110058840B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910238829.8A CN110058840B (en) 2019-03-27 2019-03-27 Low-power-consumption multiplier based on 4-Booth coding

Publications (2)

Publication Number Publication Date
CN110058840A true CN110058840A (en) 2019-07-26
CN110058840B CN110058840B (en) 2022-11-25

Family

ID=67317466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910238829.8A Active CN110058840B (en) 2019-03-27 2019-03-27 Low-power-consumption multiplier based on 4-Booth coding

Country Status (1)

Country Link
CN (1) CN110058840B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110955403A (en) * 2019-11-29 2020-04-03 电子科技大学 Approximate base-8 Booth encoder and approximate binary multiplier of mixed Booth encoding
CN111831255A (en) * 2020-06-30 2020-10-27 深圳市永达电子信息股份有限公司 Processing method and computer readable storage medium for ultra-long digit multiplication
WO2021097765A1 (en) * 2019-11-21 2021-05-27 华为技术有限公司 Multiplier and operator circuit
CN113031913A (en) * 2019-12-24 2021-06-25 上海寒武纪信息科技有限公司 Multiplier, data processing method, device and chip
CN113031915A (en) * 2019-12-24 2021-06-25 上海寒武纪信息科技有限公司 Multiplier, data processing method, device and chip
WO2021196096A1 (en) * 2020-04-01 2021-10-07 华为技术有限公司 Multimode fusion multiplier
WO2022247194A1 (en) * 2021-05-22 2022-12-01 上海阵量智能科技有限公司 Multiplier, data processing method, chip, computer device and storage medium
CN116205244A (en) * 2023-05-06 2023-06-02 中科亿海微电子科技(苏州)有限公司 Digital signal processing structure

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5040139A (en) * 1990-04-16 1991-08-13 Tran Dzung J Transmission gate multiplexer (TGM) logic circuits and multiplier architectures
US20050080834A1 (en) * 2003-09-30 2005-04-14 International Business Machines Corporation Fused booth encoder multiplexer
CN101382882A (en) * 2008-09-28 2009-03-11 宁波大学 Booth encoder based on CTGAL and thermal insulation complement multiplier-accumulator
CN103092560A (en) * 2013-01-18 2013-05-08 中国科学院自动化研究所 Low-power consumption multiplying unit based on Bypass technology
CN109388373A (en) * 2018-10-12 2019-02-26 胡振波 Multiplier-divider for low-power consumption kernel

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
SOUMYAROOP ROY ET AL: "A Compiler Based Leakage Reduction Technique by Power-Gating Functional units in Embedded Microprocessors", 《20TH INTERNATIONAL CONFERENCE ON VLSI DESIGN HELD JOINTLY WITH 6TH INTERNATIONAL CONFERENCE ON EMBEDDED SYSTEMS》 *
崔晓平: "32×32-bit multiplier based on modified Booth encoding", 《电子测量技术》 *
张明英: "Design of a 32-bit low-power high-speed multiplier", 《微处理机》 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021097765A1 (en) * 2019-11-21 2021-05-27 华为技术有限公司 Multiplier and operator circuit
CN113227963A (en) * 2019-11-21 2021-08-06 华为技术有限公司 Multiplier and operator circuit
JP7371255B2 (en) 2019-11-21 2023-10-30 華為技術有限公司 Multiplier and operator circuits
US11855661B2 (en) 2019-11-21 2023-12-26 Huawei Technologies Co., Ltd. Multiplier and operator circuit
CN110955403A (en) * 2019-11-29 2020-04-03 电子科技大学 Approximate base-8 Booth encoder and approximate binary multiplier of mixed Booth encoding
CN113031913A (en) * 2019-12-24 2021-06-25 上海寒武纪信息科技有限公司 Multiplier, data processing method, device and chip
CN113031915A (en) * 2019-12-24 2021-06-25 上海寒武纪信息科技有限公司 Multiplier, data processing method, device and chip
WO2021196096A1 (en) * 2020-04-01 2021-10-07 华为技术有限公司 Multimode fusion multiplier
CN111831255A (en) * 2020-06-30 2020-10-27 深圳市永达电子信息股份有限公司 Processing method and computer readable storage medium for ultra-long digit multiplication
WO2022247194A1 (en) * 2021-05-22 2022-12-01 上海阵量智能科技有限公司 Multiplier, data processing method, chip, computer device and storage medium
CN116205244A (en) * 2023-05-06 2023-06-02 中科亿海微电子科技(苏州)有限公司 Digital signal processing structure
CN116205244B (en) * 2023-05-06 2023-08-11 中科亿海微电子科技(苏州)有限公司 Digital signal processing structure

Also Published As

Publication number Publication date
CN110058840B (en) 2022-11-25

Similar Documents

Publication Publication Date Title
CN110058840A (en) A kind of low-consumption multiplier based on 4-Booth coding
Chen et al. Minimization of switching activities of partial products for designing low-power multipliers
CN101685385A (en) Complex multiplier
CN101221490B (en) Floating point multiplier and adder unit with data forwarding structure
CN101739231A (en) Booth-Wallace tree multiplier
CN109542393A (en) A kind of approximation 4-2 compressor and approximate multiplier
CN109816105A (en) A kind of configurable neural network activation primitive realization device
CN110515589A (en) Multiplier, data processing method, chip and electronic equipment
CN102184161B (en) Matrix inversion device and method based on residue number system
Olivieri Design of synchronous and asynchronous variable-latency pipelined multipliers
CN106354473A (en) Divider and quotient and remainder solving method
CN104090737B (en) A kind of modified model part parallel framework multiplier and its processing method
CN105913118A (en) Artificial neural network hardware implementation device based on probability calculation
CN101840324B (en) 64-bit fixed and floating point multiplier unit supporting complex operation and subword parallelism
CN110531954A (en) Multiplier, data processing method, chip and electronic equipment
CN110515587A (en) Multiplier, data processing method, chip and electronic equipment
CN107092462B (en) 64-bit asynchronous multiplier based on FPGA
CN110515590A (en) Multiplier, data processing method, chip and electronic equipment
CN107368459A (en) The dispatching method of Reconfigurable Computation structure based on Arbitrary Dimensions matrix multiplication
CN109284085B (en) High-speed modular multiplication and modular exponentiation operation method and device based on FPGA
CN102270110B (en) Improved 16Booth-based coder
CN101349967B (en) CBSA hardware adder of addition and subtraction non-difference paralleling calculation and design method thereof
CN103699729B (en) Modulus multiplier
CN113791753A (en) FPGA-based programmable DSP supporting rapid division
Bokade et al. CLA based 32-bit signed pipelined multiplier

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant