CN101110016A - Subword paralleling integer multiplying unit - Google Patents

Publication number
CN101110016A
Authority
CN
China
Prior art keywords
product
modified value
low level
result
fffe
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2007100356514A
Other languages
Chinese (zh)
Inventor
张民选
董兰飞
李少青
陈吉华
赵振宇
陈怒兴
马剑武
徐炜遐
孙岩
乐大珩
贺鹏
刘婷
喻仁峰
何小威
郑东裕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CNA2007100356514A priority Critical patent/CN101110016A/en
Publication of CN101110016A publication Critical patent/CN101110016A/en
Pending legal-status Critical Current

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a subword-parallel integer multiplier. Its data preprocessing module extends the multiplicand and the multiplier according to the operation mode and the sign control signal, producing four groups of multiplicands and four groups of multipliers. The correction value selection module selects and merges correction values according to the operation mode and the sign bits of the product results. Each partial product generation module takes the four groups of multiplicands, the four groups of multipliers and the control signals as input and outputs partial products; each module consists of one group of Booth encoding units and one group of partial product selection units. The partial product compression tree module compresses the partial products together with the merged correction value. The invention has a simple structure, simplifies the algorithm and its implementation, and reduces the delay of the partial product compression unit, thereby improving the performance of the whole multiplier.

Description

Subword paralleling integer multiplying unit
Technical field
The present invention relates mainly to the design of 64-bit architecture microprocessors, and in particular to a subword-parallel integer multiplier.
Background technology
Advanced microprocessors and multimedia processing place ever higher demands on multiplication. Single instruction, multiple data (SIMD) processing is a principal feature of the multimedia extensions and digital signal processing found in modern multimedia processors and general-purpose processors. The performance of a subword-parallel multiplier that supports SIMD processing has therefore become a key factor in improving multimedia processor performance.
Fig. 1 shows the structure of an existing subword-parallel multiplier. The existing multiplier is built around mode selection: it adds mode-controlled multiplexers to the partial product generation module and inserts mode-based carry-chain break signals at each mode boundary of the partial product compression tree and of the final carry-propagate adder. After reconfiguration it can perform one 64×64, two 32×32, four 16×16 or eight 8×8 multiplications. In ordinary multiplication mode, once the two inputs A[63:0] and B[63:0] arrive, they first enter the partial product generation unit, which splits the multiplicand and the multiplier into four parts and generates partial products by Booth encoding. The partial products are then fed into the partial product compression tree (PPRT) for compression; the resulting pseudo-sum and pseudo-carry are fed into the carry-propagate adder (CPA) for the final addition, and the result is output.
In subword-parallel mode, once the two inputs A[63:0] and B[63:0] arrive, all operations are the same as in ordinary multiplication mode, except that a correction value selection module is added; it selects the merged correction value and sends it to the partial product compression tree (PPRT) to be compressed together with the partial products. Fig. 2 shows the carry-chain break logic in the partial product compression structure.
where:
$$\mathrm{sum} = (a \oplus b) \oplus (c_{in} \cdot \overline{kill}) \tag{1}$$
$$c_{out} = (a \cdot b) + [(a + b) \cdot (c_{in} \cdot \overline{kill})] \tag{2}$$
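The carry-chain break logic of equations (1) and (2) can be sketched as a full adder whose incoming carry is masked by a kill signal. This is only an illustration: it assumes that the garbled symbol in front of "kill" denotes logical NOT and that the sum is the XOR of the three terms. The function names and the `kill_mask` parameter are hypothetical and do not appear in the patent.

```python
def full_adder_with_kill(a, b, c_in, kill):
    """One-bit full adder; the incoming carry is ignored when kill == 1.

    Assumption: equations (1) and (2) gate c_in with NOT kill.
    """
    gated = c_in & (1 - kill)            # c_in AND NOT kill
    s = a ^ b ^ gated                    # equation (1)
    c_out = (a & b) | ((a | b) & gated)  # equation (2)
    return s, c_out


def ripple_add(x, y, width, kill_mask):
    """Ripple-carry adder; bit i of kill_mask breaks the carry entering bit i."""
    carry = 0
    result = 0
    for i in range(width):
        s, carry = full_adder_with_kill(
            (x >> i) & 1, (y >> i) & 1, carry, (kill_mask >> i) & 1
        )
        result |= s << i
    return result


# With no kill signal the carry crosses the byte boundary as usual.
print(hex(ripple_add(0x00FF, 0x0001, 16, 0)))
# With kill asserted at bit 8 the two bytes behave as independent adders.
print(hex(ripple_add(0x01FF, 0x0101, 16, 1 << 8)))
```

Every mode boundary needs such a kill input, which is where the fan-out and wiring cost discussed in the text comes from.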
However, the existing subword-parallel multiplier has the following drawbacks:
1. A parallel multiplier designed with this algorithm has one extra level of selectors in partial product generation, which adds one gate delay;
2. Mode-based carry-chain break signals must be inserted at the mode boundaries, which complicates the algorithm;
3. The carry-chain break signals have a large fan-out load and very long interconnect; in layout they span almost the entire partial product compression section. The large fan-out load and long wiring increase the partial product compression delay and complicate a full-custom layout;
This parallel multiplier must also insert mode-based carry-chain break signals at each mode boundary of the final carry-propagate adder, which adds delay and difficulty to the design of that adder and severely limits the generality of the design.
Summary of the invention
The problem to be solved by the invention is as follows: in view of the technical problems of the prior art, the invention provides a subword-parallel integer multiplier that has a simple structure, simplifies the algorithm and its implementation, reduces the delay of the partial product compression unit, and improves the performance of the whole multiplier.
To solve the above technical problem, the invention proposes a subword-parallel integer multiplier, characterized in that it comprises a data preprocessing module, four independent partial product generation modules, a correction value selection module and a partial product compression tree module. The data preprocessing module receives the multiplicand SRC1[63:0], the multiplier SRC2[63:0] and the control signals, and extends the multiplicand and the multiplier according to the operation mode and the sign control signal, producing the corresponding 4 groups of multiplicands and 4 groups of multipliers. The correction value selection module selects and merges the correction values according to the operation mode and the sign bits of the product results. The partial product generation modules take as input the 4 groups of multiplicands and 4 groups of multipliers produced by the data preprocessing module together with the control signals, and output partial products. Each partial product generation module consists of one group of Booth encoding units and one group of partial product selection units; the modules are functionally identical and operate in parallel. The first module generates the 9 partial products of the lowest multiplication, the second the 9 partial products of the second-lowest multiplication, the third the 9 partial products of the second-highest multiplication, and the fourth the 9 partial products of the highest multiplication. The partial product compression tree module compresses the partial products generated by the partial product generation modules together with the merged correction value.
In subword-parallel multiplication, the correction value selection module operates as follows:
(1) First, when performing subword-parallel multiplication, a correction value is added to the corresponding partial products according to the sign of each subword product:
1. When the sign of the lowest subword product is positive, the correction value added for this product is:
128'hffff_ffff_fffe_0000_0000_0000_0000_0000;
2. When the sign of the lowest subword product is negative, the correction value added for this product is:
128'hffff_ffff_fffe_0000_0000_0001_0000_0000;
3. When the sign of the second-lowest subword product is positive, the correction value added for this product is:
128'hffff_fffe_0000_0000_0000_0000_0000_0000;
4. When the sign of the second-lowest subword product is negative, the correction value added for this product is:
128'hffff_fffe_0000_0001_0000_0000_0000_0000;
5. When the sign of the second-highest subword product is positive, the correction value added for this product is:
128'hfffe_0000_0000_0000_0000_0000_0000_0000;
6. When the sign of the second-highest subword product is negative, the correction value added for this product is:
128'hfffe_0001_0000_0000_0000_0000_0000_0000.
(2) Before the three correction values are sent to the partial product accumulation unit, they are merged into one. The merging runs in parallel with the partial product generation units, and the merged value is finally sent together with the partial products to the partial product compression unit for accumulation. Depending on the sign of each subword product, the lowest, second-lowest and second-highest subwords each have two possible correction values, giving 8 combinations. The merged correction value for each combination is listed below, where 0 denotes a positive product and 1 a negative product:
1. Second-highest is 0, second-lowest is 0, lowest is 0; the merged correction value is:
128'hfffd_fffd_fffe_0000_0000_0000_0000_0000;
2. Second-highest is 0, second-lowest is 0, lowest is 1; the merged correction value is:
128'hfffd_fffd_fffe_0000_0000_0001_0000_0000;
3. Second-highest is 0, second-lowest is 1, lowest is 0; the merged correction value is:
128'hfffd_fffd_fffe_0001_0000_0000_0000_0000;
4. Second-highest is 0, second-lowest is 1, lowest is 1; the merged correction value is:
128'hfffd_fffd_fffe_0001_0000_0001_0000_0000;
5. Second-highest is 1, second-lowest is 0, lowest is 0; the merged correction value is:
128'hfffd_fffe_fffe_0000_0000_0000_0000_0000;
6. Second-highest is 1, second-lowest is 0, lowest is 1; the merged correction value is:
128'hfffd_fffe_fffe_0000_0000_0001_0000_0000;
7. Second-highest is 1, second-lowest is 1, lowest is 0; the merged correction value is:
128'hfffd_fffe_fffe_0001_0000_0000_0000_0000;
8. Second-highest is 1, second-lowest is 1, lowest is 1; the merged correction value is:
128'hfffd_fffe_fffe_0001_0000_0001_0000_0000.
When the correction value selection module merges the correction values, most correction bits are fixed. Only bits 97, 96, 64 and 32 change with the combination of subword product signs: bits 97 and 96 follow the sign of the second-highest subword product, bit 64 follows the sign of the second-lowest subword product, and bit 32 follows the sign of the lowest subword product. All other correction bits are fixed values. In subword-parallel multiplication, the merged value only needs to be sent together with the partial products to the partial product compression unit for accumulation.
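The eight merged values listed above are consistent with adding the three per-subword correction values (the "modified values" of the text) modulo 2^128. The following sketch checks that arithmetic; the dictionary and function names are hypothetical, and the constants are copied from the lists above.

```python
MASK128 = (1 << 128) - 1

# Per-subword correction values from step (1); key 0 = positive product,
# key 1 = negative product.
LOWEST = {
    0: 0xffff_ffff_fffe_0000_0000_0000_0000_0000,
    1: 0xffff_ffff_fffe_0000_0000_0001_0000_0000,
}
SECOND_LOWEST = {
    0: 0xffff_fffe_0000_0000_0000_0000_0000_0000,
    1: 0xffff_fffe_0000_0001_0000_0000_0000_0000,
}
SECOND_HIGHEST = {
    0: 0xfffe_0000_0000_0000_0000_0000_0000_0000,
    1: 0xfffe_0001_0000_0000_0000_0000_0000_0000,
}


def merged_correction(s_high2, s_low2, s_low):
    """Add the three correction values and keep the low 128 bits."""
    total = SECOND_HIGHEST[s_high2] + SECOND_LOWEST[s_low2] + LOWEST[s_low]
    return total & MASK128


def fmt(value):
    """Format a 128-bit value in the 128'h notation used in the text."""
    digits = f"{value:032x}"
    return "128'h" + "_".join(digits[i:i + 4] for i in range(0, 32, 4))


for s_high2 in (0, 1):
    for s_low2 in (0, 1):
        for s_low in (0, 1):
            print(s_high2, s_low2, s_low, fmt(merged_correction(s_high2, s_low2, s_low)))
```

The first printed row is `128'hfffd_fffd_fffe_0000_0000_0000_0000_0000`, matching combination 1. Going from a positive to a negative second-highest product changes the word at bits [111:96] from `fffd` to `fffe`, which is why both bit 97 and bit 96 toggle.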
Compared with the prior art, the advantages of the invention are:
1. The subword-parallel integer multiplier of the invention reduces the delay of the partial product compression unit and thereby improves the performance of the whole multiplier;
2. The subword-parallel integer multiplier of the invention does not need mode-based carry-chain break signals, which simplifies the algorithm and its implementation; it also avoids the long global interconnect of the carry-chain break signals, so the layout is simple and efficient;
3. With the subword-parallel integer multiplier of the invention, the 128-bit pseudo-sum and 128-bit pseudo-carry produced by partial product compression can be added by any carry-propagate adder to obtain the final product; no specially designed adder is needed;
4. The subword-parallel integer multiplier of the invention is widely applicable: besides the 64-bit integer multiplier with 4 groups of parallel subwords, after suitable extension it can also be applied to multiplication with an arbitrary number n of parallel subword groups;
5. The algorithm of the subword-parallel integer multiplier of the invention reduces hardware cost and overall delay and gives better performance.
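Advantage 3 relies on a general property of carry-save compression: the pseudo-sum and pseudo-carry always add up to the sum of all compressed operands, so an ordinary adder can finish the job. The sketch below shows a generic 3:2 carry-save reduction on 128-bit values. It is not the patent's compression tree, whose exact structure is not described here; `csa` and `compress` are hypothetical names.

```python
MASK128 = (1 << 128) - 1


def csa(x, y, z):
    """3:2 carry-save adder: three operands in, (sum, carry) out."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s & MASK128, c & MASK128


def compress(operands):
    """Reduce any number of operands to a pseudo-sum and a pseudo-carry."""
    ops = [v & MASK128 for v in operands]
    while len(ops) > 2:
        full = len(ops) - len(ops) % 3      # operands that form whole triples
        nxt = []
        for i in range(0, full, 3):
            nxt.extend(csa(*ops[i:i + 3]))  # each triple becomes two values
        nxt.extend(ops[full:])              # leftovers pass to the next level
        ops = nxt
    while len(ops) < 2:
        ops.append(0)
    return ops[0], ops[1]


# Partial products and a merged correction value would be compressed together;
# here arbitrary numbers stand in for them.
pseudo_sum, pseudo_carry = compress([5, 7, 9, 11, MASK128])
print((pseudo_sum + pseudo_carry) & MASK128)
```

Because no carry-chain break is needed inside the reduction, the final addition is a plain 128-bit add.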
Description of drawings
Fig. 1 is a structural diagram of a prior-art multiplier;
Fig. 2 is a schematic diagram of the carry-chain break logic;
Fig. 3 is a diagram of the partial products of the 64-bit parallel integer multiplier in parallel mode;
Fig. 4 is a structural diagram of the multiplier of the invention.
Embodiment
The invention is described in further detail below with reference to the drawings and a specific embodiment.
As shown in Fig. 4, the subword-parallel integer multiplier of the invention comprises a data preprocessing (DPrepare) module, four independent partial product generation (PPGenertor) modules, a correction value selection module and a partial product compression tree (PPRT). The data preprocessing module receives the multiplicand SRC1[63:0], the multiplier SRC2[63:0] and the control signals, and extends the multiplicand and the multiplier according to the operation mode and the sign control signal, producing the corresponding 4 groups of multiplicands and 4 groups of multipliers. The correction value selection module selects and merges the correction values according to the operation mode and the sign bits of the product results. The partial product generation modules take as input the 4 groups of multiplicands and 4 groups of multipliers produced by the data preprocessing module together with the control signals, and output partial products. Each partial product generation module consists of one group of Booth encoding units (Decode) and one group of partial product selection units (PPSelect); the modules are functionally identical and operate in parallel. In this embodiment, Decode1 and PPSelect1 form PPGenertor1, Decode2 and PPSelect2 form PPGenertor2, Decode3 and PPSelect3 form PPGenertor3, and Decode4 and PPSelect4 form PPGenertor4. The first module, PPGenertor1, generates the 9 partial products of the lowest multiplication; the second, PPGenertor2, the 9 partial products of the second-lowest multiplication; the third, PPGenertor3, the 9 partial products of the second-highest multiplication; and the fourth, PPGenertor4, the 9 partial products of the highest multiplication. The partial product compression tree module compresses the partial products generated by the partial product generation modules together with the merged correction value.
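The count of 9 partial products per 16-bit multiplication is what radix-4 Booth recoding yields when the 16-bit multiplier is first extended to 18 bits, so that signed and unsigned operands are handled uniformly. The sketch below illustrates this under that assumption; the patent does not state the extended width, and the function names are hypothetical.

```python
def booth_radix4_digits(multiplier, signed, width=16, ndigits=9):
    """Recode a 16-bit multiplier into 9 radix-4 Booth digits in {-2..2}.

    Assumption: the operand is sign- or zero-extended to 2 * ndigits = 18 bits.
    """
    ext_bits = 2 * ndigits
    x = multiplier & ((1 << width) - 1)
    if signed and (x >> (width - 1)) & 1:
        x |= ((1 << ext_bits) - 1) ^ ((1 << width) - 1)  # sign-extend
    digits = []
    prev = 0  # implicit bit to the right of bit 0
    for i in range(ndigits):
        b0 = (x >> (2 * i)) & 1
        b1 = (x >> (2 * i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(multiplicand, multiplier, signed):
    """Sum the 9 partial products; multiplicand is a plain Python integer."""
    digits = booth_radix4_digits(multiplier, signed)
    return sum((d * multiplicand) << (2 * i) for i, d in enumerate(digits))


print(booth_radix4_digits(0xFFFF, signed=True))   # encodes -1
print(booth_radix4_digits(0xFFFF, signed=False))  # encodes 65535
print(booth_multiply(-3, 0xFFFE, signed=True))    # -3 * -2
```

In hardware, each digit would drive a partial product selection unit that picks 0, ±1 or ±2 times the multiplicand.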
The 64-bit parallel integer multiplier of the invention selects its behaviour by operation mode. It can perform ordinary-mode operation, i.e. a signed or unsigned 64×64-bit integer multiplication, and it can perform subword-parallel operation, i.e. four parallel signed or unsigned 16×16-bit integer multiplications. For a 64-bit integer multiplication, the given multiplicand SRC1[63:0] and multiplier SRC2[63:0] are multiplied, and the 128-bit result is the product of the 64-bit multiplication. For four parallel 16-bit integer multiplications, the multiplicand subwords SRC1[63:48], SRC1[47:32], SRC1[31:16], SRC1[15:0] are multiplied by the corresponding multiplier subwords SRC2[63:48], SRC2[47:32], SRC2[31:16], SRC2[15:0], and bits [127:96], [95:64], [63:32], [31:0] of the result hold the four corresponding 16-bit products. The partial products of the 64-bit parallel integer multiplier in parallel mode use the Booth-2 (booth2) algorithm and are shown in lattice form in Fig. 3. The design goal is that adding the 128-bit pseudo-sum and the 128-bit pseudo-carry produced by partial product compression directly yields the correct final product. The compression result is clearly correct in ordinary-mode multiplication. In subword-parallel mode, however, the lower multiplicand subwords are sign-extended into the higher bit positions and the lower positions carry into the higher ones, so compressing the partial products directly would produce an error. This error arises from the influence of the lower positions on the higher positions (the higher positions do not affect the lower ones) and must be corrected to guarantee a correct result.
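The intended result of the subword-parallel mode can be written as a short behavioural reference model: four independent 16×16 products placed in the four 32-bit fields of the 128-bit result. This is only a functional sketch of what the hardware should compute, not of how it computes it; `subword_mul` and `to_signed` are hypothetical names.

```python
def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's-complement integer."""
    return value - (1 << bits) if (value >> (bits - 1)) & 1 else value


def subword_mul(src1, src2, signed=True):
    """Four parallel 16x16 multiplications on 64-bit operands.

    Lane k multiplies bits [16k+15:16k] of src1 and src2 and stores the
    32-bit product in bits [32k+31:32k] of the 128-bit result.
    """
    result = 0
    for lane in range(4):
        a = (src1 >> (16 * lane)) & 0xFFFF
        b = (src2 >> (16 * lane)) & 0xFFFF
        if signed:
            a, b = to_signed(a, 16), to_signed(b, 16)
        product = (a * b) & 0xFFFF_FFFF  # keep 32 bits, two's complement
        result |= product << (32 * lane)
    return result


src1 = 0x0001_0002_0003_FFFF
src2 = 0x0004_0005_0006_0002
print(f"{subword_mul(src1, src2, signed=True):032x}")
print(f"{subword_mul(src1, src2, signed=False):032x}")
```

In the signed case the lowest lane computes -1 × 2 and stores `fffffffe` without disturbing the lanes above it, which is exactly the isolation that the correction values have to restore in the shared partial product array.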
Therefore, in subword-parallel multiplication the correction value selection module operates as follows:
(1) First, when performing subword-parallel multiplication, a correction value is added to the corresponding partial products according to the sign of each subword product:
1. When the sign of the lowest subword product is positive, the correction value added for this product is:
128'hffff_ffff_fffe_0000_0000_0000_0000_0000;
2. When the sign of the lowest subword product is negative, the correction value added for this product is:
128'hffff_ffff_fffe_0000_0000_0001_0000_0000;
3. When the sign of the second-lowest subword product is positive, the correction value added for this product is:
128'hffff_fffe_0000_0000_0000_0000_0000_0000;
4. When the sign of the second-lowest subword product is negative, the correction value added for this product is:
128'hffff_fffe_0000_0001_0000_0000_0000_0000;
5. When the sign of the second-highest subword product is positive, the correction value added for this product is:
128'hfffe_0000_0000_0000_0000_0000_0000_0000;
6. When the sign of the second-highest subword product is negative, the correction value added for this product is:
128'hfffe_0001_0000_0000_0000_0000_0000_0000.
With the correction values, the multiplier can perform subword-parallel operation without inserting carry-chain break signals. With this step, four-subword parallel multiplication only requires adding the correction value that corresponds to the sign of each subword product and then compressing the correction values together with the partial products to obtain the correct product. However, simply accumulating three correction values with the partial products would increase the delay of the partial product compression module and degrade the performance of the multiplier, so the invention addresses this problem with step (2) below.
(2) Before the three correction values are sent to the partial product accumulation unit, they are merged into one. The merging runs in parallel with the partial product generation units, and the merged value is finally sent together with the partial products to the partial product compression unit for accumulation. Depending on the sign of each subword product, the lowest, second-lowest and second-highest subwords each have two possible correction values, giving 8 combinations. The merged correction value for each combination is listed below, where 0 denotes a positive product and 1 a negative product:
1. Second-highest is 0, second-lowest is 0, lowest is 0; the merged correction value is:
128'hfffd_fffd_fffe_0000_0000_0000_0000_0000;
2. Second-highest is 0, second-lowest is 0, lowest is 1; the merged correction value is:
128'hfffd_fffd_fffe_0000_0000_0001_0000_0000;
3. Second-highest is 0, second-lowest is 1, lowest is 0; the merged correction value is:
128'hfffd_fffd_fffe_0001_0000_0000_0000_0000;
4. Second-highest is 0, second-lowest is 1, lowest is 1; the merged correction value is:
128'hfffd_fffd_fffe_0001_0000_0001_0000_0000;
5. Second-highest is 1, second-lowest is 0, lowest is 0; the merged correction value is:
128'hfffd_fffe_fffe_0000_0000_0000_0000_0000;
6. Second-highest is 1, second-lowest is 0, lowest is 1; the merged correction value is:
128'hfffd_fffe_fffe_0000_0000_0001_0000_0000;
7. Second-highest is 1, second-lowest is 1, lowest is 0; the merged correction value is:
128'hfffd_fffe_fffe_0001_0000_0000_0000_0000;
8. Second-highest is 1, second-lowest is 1, lowest is 1; the merged correction value is:
128'hfffd_fffe_fffe_0001_0000_0001_0000_0000.
When the correction value selection module merges the correction values, most correction bits are fixed. Only bits 97, 96, 64 and 32 change with the combination of subword product signs: bits 97 and 96 follow the sign of the second-highest subword product, bit 64 follows the sign of the second-lowest subword product, and bit 32 follows the sign of the lowest subword product. All other correction bits are fixed values. In subword-parallel multiplication, the merged value only needs to be sent together with the partial products to the partial product compression unit for accumulation. Because the correction values are merged before they are sent to the partial product accumulation unit, in parallel with the partial product generation units, the delay of the partial product compression unit is reduced and the performance of the whole multiplier is improved.
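Since only four bits depend on the product signs, the merged correction value (the merged "modified value" of the text) can be produced by wiring the sign bits into a fixed 128-bit pattern rather than by adding three values. The sketch below assumes this reading of the bit rule; `FIXED_BITS` and `select_merged_correction` are hypothetical names, and `PATENT_TABLE` copies the eight merged values listed above.

```python
# All fixed correction bits, with the four sign-dependent bits
# (97, 96, 64, 32) cleared.
FIXED_BITS = 0xfffd_fffc_fffe_0000_0000_0000_0000_0000


def select_merged_correction(s_high2, s_low2, s_low):
    """Build the merged correction value from three product sign bits.

    s_high2, s_low2, s_low: 1 if the second-highest, second-lowest or
    lowest subword product is negative, else 0.
    """
    value = FIXED_BITS
    value |= s_high2 << 97        # bit 97 follows the second-highest sign
    value |= (1 - s_high2) << 96  # bit 96 is its complement
    value |= s_low2 << 64         # bit 64 follows the second-lowest sign
    value |= s_low << 32          # bit 32 follows the lowest sign
    return value


# Merged values as listed in the text, keyed by
# (second-highest, second-lowest, lowest) sign.
PATENT_TABLE = {
    (0, 0, 0): 0xfffd_fffd_fffe_0000_0000_0000_0000_0000,
    (0, 0, 1): 0xfffd_fffd_fffe_0000_0000_0001_0000_0000,
    (0, 1, 0): 0xfffd_fffd_fffe_0001_0000_0000_0000_0000,
    (0, 1, 1): 0xfffd_fffd_fffe_0001_0000_0001_0000_0000,
    (1, 0, 0): 0xfffd_fffe_fffe_0000_0000_0000_0000_0000,
    (1, 0, 1): 0xfffd_fffe_fffe_0000_0000_0001_0000_0000,
    (1, 1, 0): 0xfffd_fffe_fffe_0001_0000_0000_0000_0000,
    (1, 1, 1): 0xfffd_fffe_fffe_0001_0000_0001_0000_0000,
}

matches = all(select_merged_correction(*k) == v for k, v in PATENT_TABLE.items())
print("all eight combinations match:", matches)
```

In hardware this amounts to a handful of wires and one inverter, which is consistent with the claim that merging adds no delay to the compression path.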

Claims (3)

1. A subword-parallel integer multiplier, characterized in that: it comprises a data preprocessing module, four independent partial product generation modules, a correction value selection module and a partial product compression tree module; the data preprocessing module receives the multiplicand SRC1[63:0], the multiplier SRC2[63:0] and the control signals, and extends the multiplicand and the multiplier according to the operation mode and the sign control signal, producing the corresponding 4 groups of multiplicands and 4 groups of multipliers; the correction value selection module selects and merges the correction values according to the operation mode and the sign bits of the product results; the partial product generation modules take as input the 4 groups of multiplicands and 4 groups of multipliers produced by the data preprocessing module together with the control signals, and output partial products; each partial product generation module consists of one group of Booth encoding units and one group of partial product selection units, the modules being functionally identical and operating in parallel; the first partial product generation module generates the 9 partial products of the lowest multiplication, the second generates the 9 partial products of the second-lowest multiplication, the third generates the 9 partial products of the second-highest multiplication, and the fourth generates the 9 partial products of the highest multiplication; the partial product compression tree module compresses the partial products generated by the partial product generation modules together with the merged correction value.
2. The subword-parallel integer multiplier according to claim 1, characterized in that in subword-parallel multiplication the correction value selection module operates as follows:
(1) First, when performing subword-parallel multiplication, a correction value is added to the corresponding partial products according to the sign of each subword product:
1. When the sign of the lowest subword product is positive, the correction value added for this product is:
128'hffff_ffff_fffe_0000_0000_0000_0000_0000;
2. When the sign of the lowest subword product is negative, the correction value added for this product is:
128'hffff_ffff_fffe_0000_0000_0001_0000_0000;
3. When the sign of the second-lowest subword product is positive, the correction value added for this product is:
128'hffff_fffe_0000_0000_0000_0000_0000_0000;
4. When the sign of the second-lowest subword product is negative, the correction value added for this product is:
128'hffff_fffe_0000_0001_0000_0000_0000_0000;
5. When the sign of the second-highest subword product is positive, the correction value added for this product is:
128'hfffe_0000_0000_0000_0000_0000_0000_0000;
6. When the sign of the second-highest subword product is negative, the correction value added for this product is:
128'hfffe_0001_0000_0000_0000_0000_0000_0000.
(2) Before the three correction values are sent to the partial product accumulation unit, they are merged into one. The merging runs in parallel with the partial product generation units, and the merged value is finally sent together with the partial products to the partial product compression unit for accumulation. Depending on the sign of each subword product, the lowest, second-lowest and second-highest subwords each have two possible correction values, giving 8 combinations. The merged correction value for each combination is listed below, where 0 denotes a positive product and 1 a negative product:
1. Second-highest is 0, second-lowest is 0, lowest is 0; the merged correction value is:
128'hfffd_fffd_fffe_0000_0000_0000_0000_0000;
2. Second-highest is 0, second-lowest is 0, lowest is 1; the merged correction value is:
128'hfffd_fffd_fffe_0000_0000_0001_0000_0000;
3. Second-highest is 0, second-lowest is 1, lowest is 0; the merged correction value is:
128'hfffd_fffd_fffe_0001_0000_0000_0000_0000;
4. Second-highest is 0, second-lowest is 1, lowest is 1; the merged correction value is:
128'hfffd_fffd_fffe_0001_0000_0001_0000_0000;
5. Second-highest is 1, second-lowest is 0, lowest is 0; the merged correction value is:
128'hfffd_fffe_fffe_0000_0000_0000_0000_0000;
6. Second-highest is 1, second-lowest is 0, lowest is 1; the merged correction value is:
128'hfffd_fffe_fffe_0000_0000_0001_0000_0000;
7. Second-highest is 1, second-lowest is 1, lowest is 0; the merged correction value is:
128'hfffd_fffe_fffe_0001_0000_0000_0000_0000;
8. Second-highest is 1, second-lowest is 1, lowest is 1; the merged correction value is:
128'hfffd_fffe_fffe_0001_0000_0001_0000_0000.
3. The subword-parallel integer multiplier according to claim 2, characterized in that: when the correction value selection module merges the correction values, most correction bits are fixed; only bits 97, 96, 64 and 32 change with the combination of subword product signs, according to the following rule: bits 97 and 96 follow the sign of the second-highest subword product, bit 64 follows the sign of the second-lowest subword product, and bit 32 follows the sign of the lowest subword product; all other correction bits are fixed values; in subword-parallel multiplication, the merged value only needs to be sent together with the partial products to the partial product compression unit for accumulation.
CNA2007100356514A 2007-08-29 2007-08-29 Subword paralleling integer multiplying unit Pending CN101110016A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2007100356514A CN101110016A (en) 2007-08-29 2007-08-29 Subword paralleling integer multiplying unit

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNA2007100356514A CN101110016A (en) 2007-08-29 2007-08-29 Subword paralleling integer multiplying unit

Publications (1)

Publication Number Publication Date
CN101110016A true CN101110016A (en) 2008-01-23

Family

ID=39042105

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2007100356514A Pending CN101110016A (en) 2007-08-29 2007-08-29 Subword paralleling integer multiplying unit

Country Status (1)

Country Link
CN (1) CN101110016A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105739945A (en) * 2016-01-22 2016-07-06 南京航空航天大学 Modified Booth coding multiplier based on modified partial product array
CN105739945B (en) * 2016-01-22 2018-10-16 南京航空航天大学 A kind of amendment Booth encoded multipliers for accumulating array based on improvement part
CN107273089A (en) * 2016-03-30 2017-10-20 华邦电子股份有限公司 Non- modular multiplication device, method and computing device for non-modular multiplication
CN107273089B (en) * 2016-03-30 2020-09-29 华邦电子股份有限公司 Non-modulus multiplier, method for non-modulus multiplication and computing device
CN107423026A (en) * 2017-04-21 2017-12-01 中国人民解放军国防科学技术大学 The implementation method and device that a kind of sin cos functionses calculate
CN110515590A (en) * 2019-08-30 2019-11-29 上海寒武纪信息科技有限公司 Multiplier, data processing method, chip and electronic equipment
CN110780845A (en) * 2019-10-17 2020-02-11 浙江大学 Configurable approximate multiplier for quantization convolutional neural network and implementation method thereof

Similar Documents

Publication Publication Date Title
CN110688158B (en) Computing device and processing system of neural network
Trivedi et al. On-line algorithms for division and multiplication
Vázquez et al. A new family of high. performance parallel decimal multipliers
US20210349692A1 (en) Multiplier and multiplication method
US6601077B1 (en) DSP unit for multi-level global accumulation
JP4290202B2 (en) Booth multiplication apparatus and method
US11816448B2 (en) Compressing like-magnitude partial products in multiply accumulation
CN110362293B (en) Multiplier, data processing method, chip and electronic equipment
CN111045728B (en) Computing device and related product
Li et al. A stochastic reconfigurable architecture for fault-tolerant computation with sequential logic
Venkatachalam et al. Approximate sum-of-products designs based on distributed arithmetic
CN101110016A (en) Subword paralleling integer multiplying unit
CN116450217A (en) Multifunctional fixed-point multiplication and multiply-accumulate operation device and method
US20040267853A1 (en) Method and apparatus for implementing power of two floating point estimation
CN116661733A (en) Multiplier and microprocessor supporting multiple precision
CN110955403A (en) Approximate base-8 Booth encoder and approximate binary multiplier of mixed Booth encoding
CN111931441B (en) Method, device and medium for establishing FPGA fast carry chain time sequence model
EP3767454B1 (en) Apparatus and method for processing floating-point numbers
CN112712172B (en) Computing device, method, integrated circuit and apparatus for neural network operations
Asadi et al. Towards designing quantum reversible ternary multipliers
Gao et al. Efficient realization of bcd multipliers using fpgas
CN111258544B (en) Multiplier, data processing method, chip and electronic equipment
Kuang et al. Energy-efficient multiple-precision floating-point multiplier for embedded applications
Baba et al. Design and implementation of advanced modified booth encoding multiplier
Molahosseini et al. Efficient MRC-based residue to binary converters for the new moduli sets {2 2n, 2 n-1, 2 n+ 1-1} and {2 2n, 2 n-1, 2 n-1-1}

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication