CN116540977A - Modulo multiplier circuit, FPGA circuit and ASIC module - Google Patents

Publication number
CN116540977A
Authority
CN
China
Prior art keywords
multiplier
adder
circuit
outputs
output end
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310813349.6A
Other languages
Chinese (zh)
Other versions
CN116540977B (en
Inventor
Name withheld on request
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Real AI Technology Co Ltd
Original Assignee
Beijing Real AI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Real AI Technology Co Ltd filed Critical Beijing Real AI Technology Co Ltd
Priority to CN202310813349.6A priority Critical patent/CN116540977B/en
Publication of CN116540977A publication Critical patent/CN116540977A/en
Application granted granted Critical
Publication of CN116540977B publication Critical patent/CN116540977B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/60 Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers
    • G06F7/72 Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers using residue arithmetic
    • G06F7/722 Modular multiplication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/57 Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations
    • G06F7/575 Basic arithmetic logic units, i.e. devices selectable to perform either addition, subtraction or one of several logical operations, using, at least partially, the same circuitry
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a modulo multiplier circuit, an FPGA circuit and an ASIC module. The modulo multiplier circuit includes: a first multiplier, a second multiplier, a third multiplier, a first adder, a second adder, and a two-way multiplexer, wherein: the output end of the first multiplier outputs the product p1 of $x_0$ and $\mu'$; the output end of the second multiplier outputs the product p2 of p1 and m; the first input end of the third multiplier receives $x_0$, and the output end of the third multiplier outputs the product p3 of $x_0$ and $x_1$; the output end of the first adder outputs the difference t between p2 and p3; the output end of the second adder outputs the difference t-m between t and m; and the output end of the two-way multiplexer selects the result to be output according to the magnitude relation of t and m. The invention can reduce the hardware cost of the modulo multiplier circuit.

Description

Modulo multiplier circuit, FPGA circuit and ASIC module
Technical Field
The present invention relates to the field of circuit technologies, and in particular, to a modulo multiplier circuit, an FPGA circuit, and an ASIC module.
Background
Modular multiplication is an important operation in cryptography. It can be described by the formula $x_0 \times x_1 \bmod m$, where $x_0$ and $x_1$ are the operands of the modular multiplication, m is the modulus used in the modular multiplication, and n is the bit length of $x_0$ and $x_1$. To avoid expensive division operations, an algorithm such as Barrett's is typically used to implement large-number modular multiplication.
The Barrett Reduction algorithm is a modular reduction algorithm proposed by Barrett in 1986. The basic process of Barrett Reduction is described as follows:
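(1) $x = x_0 \times x_1$
(2) $\mu = \lfloor 2^{2n} / m \rfloor$
(3) $q = \lfloor x \times \mu / 2^{2n} \rfloor$
(4) $t = x - q \times m$
(5) if $t \ge m$ then
(6) $t = t - m$
(7) return $t$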
Since $2^{2n}$ is a power of 2, the division by $2^{2n}$ can be replaced by a simple bit operation (a right shift). It can be seen that Barrett Reduction uses only a limited number of multiplication operations and completely avoids expensive division operations. Steps (1) to (7) above implement the $x_0 \times x_1 \bmod m$ operation.
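The general Barrett procedure can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `barrett_modmul` is hypothetical, the shift amount is taken as 2n, and the operands are assumed to satisfy x0, x1 < m < 2^n so that a single conditional subtraction is enough.

```python
def barrett_modmul(x0: int, x1: int, m: int, n: int) -> int:
    """Sketch of general Barrett modular multiplication.

    Assumes 0 <= x0, x1 < m < 2**n (an assumption of this sketch).
    """
    k = 2 * n                # shift amount, so 2**k is a power of two
    mu = (1 << k) // m       # precomputed once per modulus; the only true division
    x = x0 * x1              # full 2n-bit intermediate product
    q = (x * mu) >> k        # division by 2**k done as a right shift
    t = x - q * m            # q underestimates x // m by at most 1 here, so 0 <= t < 2m
    if t >= m:               # one conditional subtraction brings t into [0, m)
        t -= m
    return t


if __name__ == "__main__":
    print(barrett_modmul(123, 200, 251, 8), (123 * 200) % 251)
```

In hardware the value `mu` would be held as a constant per modulus, so the run-time datapath contains only multiplications, a shift and subtractions.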
The inventors have found that circuit implementations of the Barrett Reduction modulo multiplier described above fall mainly into two approaches. One is a general-purpose implementation, which places no constraints on the inputs $x_0$, $x_1$ and m; it can meet the requirements of various scenarios, but the circuit structure is complex, the room for optimization is small, and the performance is difficult to improve. The other is a special-purpose implementation, which places special constraints on m; a highly optimized circuit can be produced from these constraints, but because the constraints are strong it is only suitable for certain specific application scenarios.
Meanwhile, the inventors further found that in both approaches the modulo multiplier circuit implements the multiplication operation and the modulo operation with two independent modules, that is, it first calculates the intermediate result $x = x_0 \times x_1$ and then calculates $x \bmod m$. As shown in FIG. 1, in the prior-art modulo multiplier architecture, the multiplier (MUL) module and the modulo operation (Barrett Reduction) module are two independent, separate modules. If the boundary between the two modules can be broken and the two modules fused, redundant calculations and unnecessary intermediate results can be reduced, thereby reducing the hardware cost and improving the calculation throughput.
Therefore, how to reduce the hardware cost of the modulo multiplier circuit is a technical problem to be solved in the art.
Disclosure of Invention
The invention aims to provide a modular multiplier circuit, an FPGA circuit and an ASIC module, which are used for solving the technical problems in the prior art.
In one aspect, the present invention provides a modular multiplier circuit for achieving the above object.
The modulo multiplier circuit includes: a first multiplier, a second multiplier, a third multiplier, a first adder, a second adder, and a two-way multiplexer, wherein: the first input end of the first multiplier receives $x_0$, the second input end of the first multiplier receives $\mu'$, and the output end of the first multiplier outputs the product p1 of $x_0$ and $\mu'$, wherein $x_0$ and $x_1$ are the operands of the modular multiplication, m is the modulus used for the modular multiplication, $\mu' = \lfloor x_1 \times 2^{n} / m \rfloor$, $x_1$ is constant, and n is the bit length of $x_0$ and $x_1$; the first input end of the second multiplier is connected with the output end of the first multiplier, the second input end of the second multiplier receives m, and the output end of the second multiplier outputs the product p2 of p1 and m; the first input end of the third multiplier receives $x_0$, the second input end of the third multiplier receives $x_1$, and the output end of the third multiplier outputs the product p3 of $x_0$ and $x_1$; a first input end of the first adder is connected with an output end of the second multiplier, a second input end of the first adder is connected with an output end of the third multiplier, and an output end of the first adder outputs a difference value t between p2 and p3; the first input end of the second adder is connected with the output end of the first adder, the second input end of the second adder receives m, and the output end of the second adder outputs a difference t-m between t and m; the first input end of the two-way multiplexer is connected with the output end of the first adder, the second input end of the two-way multiplexer is connected with the output end of the second adder, and the output end of the two-way multiplexer selects the result to be output according to the magnitude relation of t and m, wherein when t >= m, the two-way multiplexer outputs t-m, and when t < m, the two-way multiplexer outputs t.
Further, the first multiplier outputs the highest n bits of the product p1, the second multiplier outputs the lowest n bits of the product p2, and the third multiplier outputs the lowest n bits of the product p3.
Further, the first adder is configured to perform addition operation with carry 1 after inverting the output result of the second multiplier.
Further, the modulo multiplier circuit further comprises: a pre-calculation unit for calculating $\mu'$, wherein the second input end of the first multiplier is connected to the output end of the pre-calculation unit.
In another aspect, to achieve the above object, the present invention provides an FPGA circuit, which includes any of the modulo multiplier circuits provided by the present invention, wherein the multipliers in the modulo multiplier circuits are implemented by a DSP.
In a further aspect, to achieve the above object, the present invention provides an ASIC module comprising a Barrett circuit comprising any of the modular multiplier circuits provided by the present invention.
The modulo multiplier circuit, the FPGA circuit and the ASIC module provided by the invention target application scenarios in which one operand of the modular multiplication is constant. When the modulo multiplier is implemented in hardware, the boundary between the multiplication module and the modulo operation module is broken: the full product of x0 and x1 is not calculated, and the two modules are fused into a single module (namely, the modulo multiplier circuit provided by the invention) built from the first multiplier, the second multiplier, the third multiplier, the first adder, the second adder and the two-way multiplexer. This reduces redundant calculation and unnecessary intermediate results in the modulo multiplier, reduces the hardware cost of the modulo multiplier circuit, the FPGA circuit and the ASIC module, and improves the calculation throughput.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a block diagram of a prior art multiplier;
FIG. 2 is a circuit diagram of a modulo multiplier circuit according to a first embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The inventors have found that in computer systems, multipliers generally require a large circuit area and also dominate circuit power consumption. If the multiplier resources occupied by a modulo multiplier can be reduced, system throughput increases while power consumption and hardware area decrease. The focus of the invention is therefore to optimize the number and size of the multipliers used in the modulo multiplier circuit.
The inventors have further found that in the application scenarios of a modulo multiplier circuit, that is, scenarios in which a modular multiplication needs to be performed, one of the operands is typically a constant. For example: in a DNN inference task, the weights of the DNN model are constant; in a feature comparison task, the feature library is constant; in an encryption/decryption task, the key is constant; in the number-theoretic transform (NTT) or FFT, the twiddle factors are constant. The modular multiplication can therefore make full use of this property to optimize the hardware circuit.
Based on this, assuming that $x_1$ is constant, line (3) of the general Barrett Reduction algorithm can be rewritten as follows:
$q = \lfloor x_0 \times x_1 \times \mu / 2^{2n} \rfloor$
Obviously, $x_1 \times \mu$ is constant, so $x_1$ can be merged into line (2), and line (2) is rewritten as:
$\mu' = \lfloor x_1 \times 2^{2n} / m \rfloor$
Meanwhile, error analysis shows that the $2^{2n}$ value can be further rewritten as $2^{n}$, which decreases the bit width of $\mu'$ and effectively reduces the bit width required by the multiplier (the cost of a multiplier is proportional to the square of the operand bit width). The fused modulo multiplier algorithm can then be derived as follows:
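(8) input: $x_0$, $x_1$, $m$, with $x_1$ constant
(9) $\mu' = \lfloor x_1 \times 2^{n} / m \rfloor$ (pre-calculated)
(10) $p_1 = x_0 \times \mu'$
(11) $p_1' = \lfloor p_1 / 2^{n} \rfloor$
(12) $p_2 = (p_1' \times m) \bmod 2^{n}$
(13) $p_3 = (x_0 \times x_1) \bmod 2^{n}$
(14) $t = (p_3 - p_2) \bmod 2^{n}$
(15) return $t - m$ if $t \ge m$, otherwise $t$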
The above algorithm does not calculate the full product of x0 and x1; instead, it fuses the multiplication of x0 and x1 with the modulo-m operation, and is therefore called the fused modulo multiplier algorithm. The fused modulo multiplier algorithm implements the $x_0 \times x_1 \bmod m$ operation.
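The fused algorithm can be checked numerically with a short Python sketch. The function names `precompute_mu` and `fused_modmul` are hypothetical, and the formula used for the pre-calculated constant, floor(x1 · 2^n / m), is this sketch's reading of the algorithm. The sketch also assumes x1 < m and 2m <= 2^n, an assumption not stated in the text, so that the intermediate value t (which lies in [0, 2m)) fits in n bits.

```python
def precompute_mu(x1: int, m: int, n: int) -> int:
    """Pre-calculated constant for a fixed operand x1 (assumed form: floor(x1 * 2**n / m))."""
    return (x1 << n) // m


def fused_modmul(x0: int, x1: int, mu_p: int, m: int, n: int) -> int:
    """Sketch of the fused modular multiplication using only n-bit results.

    Assumes 0 <= x0 < 2**n, 0 <= x1 < m and 2*m <= 2**n (assumptions of this sketch).
    """
    mask = (1 << n) - 1
    p1 = x0 * mu_p               # n x n -> 2n product
    p1_hi = p1 >> n              # keep only the highest n bits
    p2 = (p1_hi * m) & mask      # lowest n bits of p1_hi * m
    p3 = (x0 * x1) & mask        # lowest n bits of x0 * x1
    t = (p3 - p2) & mask         # n-bit difference; true value lies in [0, 2m)
    return t - m if t >= m else t


if __name__ == "__main__":
    n, m, x1 = 8, 97, 35
    mu_p = precompute_mu(x1, m, n)
    print(fused_modmul(200, x1, mu_p, m, n), (200 * x1) % m)
```

Without the 2m <= 2^n assumption the n-bit truncation can lose information; for example, with n = 4, m = 15 and x0 = x1 = 14 the untruncated t equals 16, which does not fit in 4 bits.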
Comparing the fused modulo multiplier algorithm with a general Barrett Reduction algorithm:
For convenience, when describing how the operand bit widths of a multiplier change, we use n×n→m to denote that two n-bit input operands are multiplied to obtain an m-bit multiplication result.
When estimating multiplier cost (area or number of logic gates), we take the cost of an n×n→n multiplier as the unit cost 1.
Thus, it can be estimated that the cost of an n×n→2n multiplier is 2;
the cost of a 2n×2n→2n multiplier is 4;
the cost of a 2n×2n→4n multiplier is 8.
By analyzing the cost of three multipliers in a circuit, it can be seen that the bit width of the multipliers in a fused mode multiplier is significantly reduced.
For example, calculating $x = x_0 \times x_1$ in a general-purpose modulo multiplier requires an n×n→2n multiplier, whose cost is double that of the n×n→n multiplier used for the same operands in the fused modulo multiplier;
as another example, in a general-purpose modulo multiplier, since $x$ and $\mu$ are both 2n bits wide, calculating $q = x \times \mu$ requires a 2n×2n→4n multiplier whose cost is 8. In the fused modulo multiplier circuit, since $x_0$ and $\mu'$ are both n bits wide, calculating $p_1 = x_0 \times \mu'$ requires only an n×n→2n multiplier whose cost is 2, far lower than in a general-purpose modulo multiplier.
It can be seen that, based on the fused modulo multiplier algorithm, the relative cost of the multipliers (with the cost of an n×n→n multiplier taken as the unit cost) decreases from 12 (=2+8+2) to 4 (=1+2+1).
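The cost comparison above can be tallied with a few lines of Python. This is only a bookkeeping sketch: the table reuses the relative costs stated in the text, while the variable names and the assignment of the third general-purpose multiplier to the q × m product are assumptions of the sketch.

```python
# Relative multiplier costs from the text; keys are (input width, output width).
COST = {
    ("n", "n"): 1,     # n x n -> n (unit cost)
    ("n", "2n"): 2,    # n x n -> 2n
    ("2n", "2n"): 4,   # 2n x 2n -> 2n
    ("2n", "4n"): 8,   # 2n x 2n -> 4n
}

# General-purpose Barrett multiplier: x0*x1, x*mu, and (presumably) q*m.
general = [("n", "2n"), ("2n", "4n"), ("n", "2n")]
# Fused multiplier: low half of x0*x1, x0*mu', low half of p1'*m.
fused = [("n", "n"), ("n", "2n"), ("n", "n")]


def total_cost(multipliers):
    """Sum the relative cost of a list of multipliers."""
    return sum(COST[mult] for mult in multipliers)


general_cost = total_cost(general)
fused_cost = total_cost(fused)
saving = 1 - fused_cost / general_cost          # fraction of multiplier cost removed
instances_ratio = general_cost / fused_cost     # fused units per general-purpose unit

print(general_cost, fused_cost, round(saving, 3), instances_ratio)
```

Under this model the same multiplier budget holds three fused units for every general-purpose unit, which is where the factor-of-3 throughput figure comes from.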
The invention realizes a modular multiplier circuit, an FPGA circuit and an ASIC module based on the fused modular multiplier algorithm. Specific embodiments of the modular multiplier circuit, FPGA circuit and ASIC module provided by the present invention are described in detail below.
Example 1
The first embodiment of the invention provides a modular multiplier circuit, which breaks the limit of a multiplication operation module and a modular operation module when the modular multiplier is realized by hardware, so that the two modules are combined into one module, redundant calculation is reduced, unnecessary intermediate results are reduced, the hardware cost of the modular multiplier circuit is reduced, and the calculation throughput rate is improved. Fig. 2 is a circuit diagram of a modulo multiplier circuit according to a first embodiment of the present invention, as shown in fig. 2, the modulo multiplier circuit includes: the first multiplier U0, the second multiplier U1, the third multiplier U2, the first adder U3, the second adder U4 and the two-way multiplexer U5.
The first multiplier U0 is an n×n multiplier; its first input end receives $x_0$, its second input end receives $\mu'$, and its output end outputs the product p1 of $x_0$ and $\mu'$, wherein $x_0$ and $x_1$ are the operands of the modular multiplication, m is the modulus used for the modular multiplication, $\mu' = \lfloor x_1 \times 2^{n} / m \rfloor$, $x_1$ is constant, and n is the bit length of $x_0$ and $x_1$. Optionally, the first multiplier U0 outputs the highest n bits of the product p1. Optionally, in an encryption algorithm implemented based on the modulo multiplier circuit, $x_0$ and $x_1$ are the operands participating in the modular multiplication and have different physical meanings at different stages; for example, $x_0$ and $x_1$ may be plaintext, ciphertext, a public key and/or a private key, etc.
The second multiplier U1 is an n×n multiplier; its first input end is connected to the output end of the first multiplier U0, its second input end receives m, and its output end outputs the product p2 of p1 and m, that is, it calculates the product of the output of the first multiplier U0 and m. Optionally, the second multiplier U1 outputs the lowest n bits of the product p2.
The third multiplier U2 is an n×n multiplier; its first input end receives $x_0$, its second input end receives $x_1$, and its output end outputs the product p3 of $x_0$ and $x_1$. Optionally, the third multiplier U2 outputs the lowest n bits of the product p3.
The first adder U3 is an n-bit adder, a first input end of the first adder U3 is connected to an output end of the second multiplier U1, a second input end of the first adder U3 is connected to an output end of the third multiplier U2, and an output end of the first adder U3 outputs a difference t between p2 and p3, that is, calculates a difference between the outputs of the second multiplier U1 and the third multiplier U2, and optionally, the first adder U3 is configured to perform an addition operation with a carry 1 after inverting an output result of the second multiplier U1, so as to convert a subtraction operation into an addition operation.
The second adder U4 is an n-bit adder; its first input end is connected to the output end of the first adder U3, its second input end receives m, and its output end outputs the difference t-m between t and m. Optionally, like the first adder U3, the second adder U4 performs an addition with a carry-in of 1 after inverting the subtrahend (here m), so as to convert the subtraction operation into an addition operation.
The first input end of the two-way multiplexer U5 is connected with the output end of the first adder U3, the second input end of the two-way multiplexer U5 is connected with the output end of the second adder U4, and the output end of the two-way multiplexer U5 selects the result to be output according to the magnitude relation of t and m, wherein when t >= m, the two-way multiplexer U5 outputs t-m, namely, the output result of the second adder U4, and when t < m, the two-way multiplexer U5 outputs t, namely, the output result of the first adder U3.
The modulo multiplier circuit provided by this embodiment greatly reduces the hardware cost of the circuit. The hardware cost of a modulo multiplier circuit mainly depends on the number and cost of its multipliers. If the cost of an n×n→n multiplier is taken as the unit cost 1, the hardware cost of a conventional modulo multiplier circuit is 12 and that of the modulo multiplier circuit provided by the invention is 4, a reduction of about 66%. Because fewer and narrower multipliers are used, the circuit area and the overall power consumption of the circuit are also reduced. In addition, since the multiplier is the main factor limiting system throughput, reducing the circuit's consumption of multipliers means that, with the total amount of multiplier logic fixed, the system throughput is improved by a factor of 3. In a conventional modulo multiplier circuit, the module with the highest delay is the computation of $x \times \mu$, which is implemented by a 2n×2n→4n multiplier located on the critical path. Because of its high computational complexity, its circuit delay is high and the frequency is generally difficult to raise. In the present invention, the corresponding product is $x_0 \times \mu'$, for which the computational complexity is greatly reduced and the carry chain is greatly shortened, so the delay is significantly improved and a higher operating frequency is easier to reach.
Optionally, in one embodiment, the modulo multiplier circuit further comprises: a pre-calculation unit for calculating $\mu'$, wherein the second input end of the first multiplier U0 is connected to the output end of the pre-calculation unit, by which the calculation of $\mu'$ is completed.
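The role of the pre-calculation unit can be illustrated by a software analogue in which the division is performed once per constant and then reused for many variable operands. The class name `ConstantModMultiplier`, its methods and the example values are hypothetical; the constant is assumed to be floor(x1 · 2^n / m), and the sketch again assumes 2m <= 2^n.

```python
class ConstantModMultiplier:
    """Multiplies many x0 values by one fixed constant x1 modulo m (illustrative sketch)."""

    def __init__(self, x1: int, m: int, n: int):
        if not (0 <= x1 < m and 2 * m <= (1 << n)):
            raise ValueError("sketch assumes 0 <= x1 < m and 2*m <= 2**n")
        self.x1, self.m, self.n = x1, m, n
        self.mask = (1 << n) - 1
        # Pre-calculation step: the only division, done once per constant.
        self.mu_p = (x1 << n) // m

    def mul(self, x0: int) -> int:
        """Return (x0 * x1) mod m for an n-bit x0 without any division."""
        q = (x0 * self.mu_p) >> self.n                  # high half of x0 * mu'
        t = (x0 * self.x1 - q * self.m) & self.mask     # n-bit difference, in [0, 2m)
        return t - self.m if t >= self.m else t


if __name__ == "__main__":
    # Hypothetical example: one constant factor reused over a stream of inputs.
    unit = ConstantModMultiplier(x1=35, m=97, n=8)
    print([unit.mul(x0) for x0 in (1, 2, 96, 255)])
```

The design point is amortization: for constant weights, keys or twiddle factors, the cost of computing the constant is paid once, outside the per-operand datapath.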
The first multiplier U0 in FIG. 2 implements lines (10) and (11) of the fused modulo multiplier algorithm. U0 receives $x_0$ (n bits wide) and the pre-calculated $\mu'$ (n bits wide), calculates their product p1 (2n bits), discards the lowest n bits of the product, and keeps only the highest n bits (p1'), so U0 can be regarded as an MSB MUL (high-order multiplier). For convenience, we use n×n→2n to describe the input/output bit widths of this multiplier.
The second multiplier U1 implements line (12) of the fused modulo multiplier algorithm. It calculates the product of p1' (n bits) and m (n bits), but discards the highest n bits of the product and keeps the lowest n bits (p2). U1 may be referred to as an LSB MUL (low-order multiplier).
The third multiplier U2 is also an LSB MUL (low-order multiplier); it implements line (13) of the fused modulo multiplier algorithm, computing the product of the n-bit x0 and x1 and keeping the lowest n bits (p3) of the product.
U1 and U2 are both LSB MULs (low-order multipliers), whose input/output bit widths are described by n×n→n.
The first adder U3 is a full adder responsible for calculating the difference between p3 and p2 (line (14) of the algorithm). Since data in a computer is represented in two's complement, negating a number is equivalent to inverting its bits and adding 1; p3 - p2 can therefore be computed as p3 plus the bitwise inverse of p2 plus 1, that is, the first adder (U3) inverts the output result of the second multiplier (U1) and then performs an addition with a carry-in of 1.
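The invert-and-carry subtraction can be illustrated in Python. The function name `sub_via_invert` is hypothetical; the sketch models an n-bit adder with a carry-in of 1 and shows that it matches subtraction modulo 2^n.

```python
def sub_via_invert(p3: int, p2: int, n: int) -> int:
    """Compute (p3 - p2) mod 2**n as p3 + ~p2 + 1 on an n-bit adder model."""
    mask = (1 << n) - 1
    p2_inverted = ~p2 & mask              # bitwise inversion restricted to n bits
    return (p3 + p2_inverted + 1) & mask  # carry-in of 1; carry-out is dropped


if __name__ == "__main__":
    print(sub_via_invert(5, 9, 4), (5 - 9) % 16)
```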
The second adder U4 and the two-way multiplexer U5 implement line (15) of the fused modulo multiplier algorithm. U4 is responsible for calculating the difference between t and m, and U5 is a two-way multiplexer that selects according to the magnitude relation of t and m: it outputs t-m if t >= m, and otherwise outputs t.
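One possible way to realize the final selection is sketched below in Python. Using the carry-out of the subtracting adder as the multiplexer select signal is an assumption of this sketch (the text only states that the multiplexer selects according to the magnitude relation of t and m), and the function name `conditional_subtract` is hypothetical.

```python
def conditional_subtract(t: int, m: int, n: int) -> int:
    """Model of an adder plus 2-way multiplexer: return t - m if t >= m, else t.

    Assumes 0 <= t < 2**n and 0 <= m < 2**n.
    """
    mask = (1 << n) - 1
    s = t + (~m & mask) + 1      # t + ~m + 1 = t - m + 2**n, an (n+1)-bit sum
    carry_out = s >> n           # 1 exactly when t >= m (no borrow)
    diff = s & mask              # n-bit result t - m
    return diff if carry_out else t   # multiplexer driven by the carry-out


if __name__ == "__main__":
    print(conditional_subtract(120, 97, 8), conditional_subtract(50, 97, 8))
```

With this choice no separate comparator is needed, since the adder that produces t - m also indicates whether t >= m.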
Example two
The second embodiment of the present invention provides an FPGA circuit, that is, a digital integrated circuit, which can change the internal structure of the chip by programming, and the FPGA circuit includes any one of the modulo multiplier circuits provided in the first embodiment of the present invention, where the circuit unit for performing the modulo multiplication operation in the FPGA circuit employs the modulo multiplier circuit of the present invention, and the multiplier in the modulo multiplier circuit is implemented by a DSP.
With the FPGA circuit provided by this embodiment, the calculation performance of the FPGA circuit can be improved by about 67% with the number of DSPs fixed, due to the reduced number of multipliers required for the modulo multiplier circuit.
Example III
The third embodiment of the present invention provides an ASIC module, that is, an application-specific integrated circuit module, which includes a Barrett circuit. The Barrett circuit includes any one of the modulo multiplier circuits provided in the first embodiment of the present invention, that is, the circuit unit for performing the modular multiplication in the Barrett circuit employs the modulo multiplier circuit of the present invention.
With the ASIC module provided in this embodiment, more Barrett circuits can be instantiated on the chip with limited chip area or power consumption, thereby improving throughput by a factor of 3. Alternatively, in the case where the throughput index is satisfied, the chip area is reduced by about 66%, thereby reducing the chip cost and power consumption.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment.
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.

Claims (6)

1. A modulo multiplier circuit, comprising: a first multiplier (U0), a second multiplier (U1), a third multiplier (U2), a first adder (U3), a second adder (U4) and a two-way multiplexer (U5), wherein:
the first input end of the first multiplier (U0) receives $x_0$, the second input end of the first multiplier (U0) receives $\mu'$, and the output end of the first multiplier (U0) outputs the product p1 of $x_0$ and $\mu'$, wherein $x_0$ and $x_1$ are the operands of the modular multiplication, m is the modulus used for the modular multiplication, $\mu' = \lfloor x_1 \times 2^{n} / m \rfloor$, $x_1$ is constant, and n is the bit length of $x_0$ and $x_1$;
the first input end of the second multiplier (U1) is connected with the output end of the first multiplier (U0), the second input end of the second multiplier (U1) receives m, and the output end of the second multiplier (U1) outputs a product p2 of p1 and m;
the first input end of the third multiplier (U2) receives $x_0$, the second input end of the third multiplier (U2) receives $x_1$, and the output end of the third multiplier (U2) outputs the product p3 of $x_0$ and $x_1$;
a first input end of the first adder (U3) is connected with an output end of the second multiplier (U1), a second input end of the first adder (U3) is connected with an output end of the third multiplier (U2), and an output end of the first adder (U3) outputs a difference value t between p2 and p3;
the first input end of the second adder (U4) is connected with the output end of the first adder (U3), the second input end of the second adder (U4) receives m, and the output end of the second adder (U4) outputs a difference t-m between t and m;
the first input end of the two-way multiplexer (U5) is connected with the output end of the first adder (U3), the second input end of the two-way multiplexer (U5) is connected with the output end of the second adder (U4), and the output end of the two-way multiplexer (U5) selects the result to be output according to the magnitude relation of t and m, wherein when t >= m, the two-way multiplexer (U5) outputs t-m, and when t < m, the two-way multiplexer (U5) outputs t.
2. A modulo multiplier circuit according to claim 1, characterized in that the first multiplier (U0) outputs the highest n bits of the product p1, the second multiplier (U1) outputs the lowest n bits of the product p2, and the third multiplier (U2) outputs the lowest n bits of the product p3.
3. A modulo multiplier circuit according to claim 1, wherein said first adder (U3) is arranged to perform an addition operation with carry 1 after inverting the output of said second multiplier (U1).
4. The modulo multiplier circuit of claim 1, further comprising: a pre-calculation unit for calculating $\mu'$, wherein a second input end of the first multiplier (U0) is connected to an output end of the pre-calculation unit.
5. An FPGA circuit comprising a modulo multiplier circuit according to any of claims 1 to 4, wherein the multipliers in the modulo multiplier circuit are implemented by a DSP.
6. An ASIC module comprising a Barrett circuit comprising a modulo multiplier circuit according to any of claims 1 to 4.
CN202310813349.6A 2023-07-05 2023-07-05 Modulo multiplier circuit, FPGA circuit and ASIC module Active CN116540977B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310813349.6A CN116540977B (en) 2023-07-05 2023-07-05 Modulo multiplier circuit, FPGA circuit and ASIC module

Publications (2)

Publication Number Publication Date
CN116540977A true CN116540977A (en) 2023-08-04
CN116540977B CN116540977B (en) 2023-09-12

Family

ID=87447425

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310813349.6A Active CN116540977B (en) 2023-07-05 2023-07-05 Modulo multiplier circuit, FPGA circuit and ASIC module

Country Status (1)

Country Link
CN (1) CN116540977B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1310816A (en) * 1998-07-22 2001-08-29 摩托罗拉公司 Circuit and method of modulo multiplication
US20120233234A1 (en) * 2011-03-08 2012-09-13 Brooks Jeffrey S System and method of bypassing unrounded results in a multiply-add pipeline unit
CN104461136A (en) * 2014-12-03 2015-03-25 无锡华润矽科微电子有限公司 Dynamic threshold adjusting circuit in touch-control device
CN110908635A (en) * 2019-11-04 2020-03-24 南京大学 High-speed modular multiplier based on post-quantum cryptography of homologus curve and modular multiplication method thereof
CN114816328A (en) * 2022-04-08 2022-07-29 中山大学 Storage and computation combined multiplier and control method thereof
CN115167815A (en) * 2022-08-01 2022-10-11 九识(苏州)智能科技有限公司 Multiplier-adder circuit, chip and electronic equipment

Also Published As

Publication number Publication date
CN116540977B (en) 2023-09-12

Similar Documents

Publication Publication Date Title
JP4739020B2 (en) Small Galois field multiplier engine
US6601077B1 (en) DSP unit for multi-level global accumulation
Riese qMultiSum—a package for proving q-hypergeometric multiple summation identities
Thomsen et al. Optimized reversible binary-coded decimal adders
Gokhale et al. Design of area and delay efficient Vedic multiplier using Carry Select Adder
CN114968173A (en) Polynomial multiplication method and polynomial multiplier based on NTT and INTT structures
CN113794572A (en) Hardware implementation system and method for high-performance elliptic curve digital signature and signature verification
CN110543291A (en) Finite field large integer multiplier and implementation method of large integer multiplication based on SSA algorithm
CN116540977B (en) Modulo multiplier circuit, FPGA circuit and ASIC module
KR102401902B1 (en) Lossy arithmetic
Mazonka et al. Fast and compact interleaved modular multiplication based on carry save addition
Harish et al. Comparative performance analysis of Karatsuba Vedic multiplier with butterfly unit
Dalmia et al. Novel high speed vedic multiplier proposal incorporating adder based on quaternary signed digit number system
JPS6053907B2 (en) Binomial vector multiplication circuit
JPH09128213A (en) Block floating processing system/method
González-Pinto et al. On the starting algorithms for fully implicit Runge-Kutta methods
JP3221076B2 (en) Digital filter design method
RU2148270C1 (en) Device for multiplication
JP2008158855A (en) Correlation computing element and correlation computing method
Beheshti Fixed point performance of interpolation/extrapolation algorithms for resource constrained wireless sensors
Bardis et al. Accelerated modular multiplication algorithm of large word length numbers with a fixed module
CN117938347A (en) Secure encryption system and method for resisting side channel attack
Sun et al. Design of scalable hardware architecture for dual-field montgomery modular inverse computation
CN116940939A (en) Multi-scalar multiplication implementation method, device, terminal and storage medium
Weimerskirch Karatsuba Algorithm.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant