WO2020231353A1 - A low-latency redundant multiplier and method for the same - Google Patents

A low-latency redundant multiplier and method for the same

Info

Publication number
WO2020231353A1
WO2020231353A1 PCT/TR2019/050331 TR2019050331W WO2020231353A1 WO 2020231353 A1 WO2020231353 A1 WO 2020231353A1 TR 2019050331 W TR2019050331 W TR 2019050331W WO 2020231353 A1 WO2020231353 A1 WO 2020231353A1
Authority
WO
WIPO (PCT)
Prior art keywords
polynomial
integer
processing means
multiplication
modular multiplication
Prior art date
Application number
PCT/TR2019/050331
Other languages
French (fr)
Inventor
Erdinc Ozturk
Original Assignee
Sabanci Universitesi
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sabanci Universitesi
Priority to PCT/TR2019/050331
Publication of WO2020231353A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/60: Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers
    • G06F7/72: Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers using residue arithmetic
    • G06F7/722: Modular multiplication

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Error Detection And Correction (AREA)

Abstract

The present invention relates to a redundant/modular multiplication means with low latency and a method for the same. To establish an efficient and fast integer modular multiplication framework, integers are represented as polynomials in a way such that any n-bit integer is expressible by a k-degree polynomial, with k = n/d, where d is the digit length. Every integer for modular multiplication thus becomes a polynomial of specified digit lengths (8-bit, 16-bit etc.). The invention then computes the multiplication of two integers offered in the form of polynomials.

Description

A LOW-LATENCY REDUNDANT MULTIPLIER AND METHOD FOR THE SAME
Technical Field of the Present Invention
The invention presented hereby generally concerns methods enabling fast circuit implementation of modular multiplication operations. Disclosed invention more specifically falls within the technical area of shortening circuit depths for reduction/multiplication circuits such as Barrett and Montgomery as defined particularly in cryptography.
Prior Art/Background of the Present Invention
Efficient modular multiplication is a well-documented prerequisite for high-performance implementation of indispensable cryptographic processes. For the demanding hardware aspects (e.g. logic operators, FPGAs) of public key cryptography implementations such as classical RSA, Diffie-Hellman or (hyper-)elliptic curve algorithms, the art has mainly relied on the popular methods of Montgomery multiplication and regular long-integer multiplication in combination with Barrett's modular reduction technique. In particular, the modular multiplication of large numbers, and its many relatively slower incarnations, require optimizations for circuit depth and critical path. Solutions existing in the art mainly focus on throughput optimization for multiplication of large numbers, whereas a latency optimization is yet to be documented.
Publication document with number CN 107766032 (A) discloses a polynomial-based GF(2^n) multiplier. The multiplier is used for calculating a product of an element A and an element B in a polynomial ring, and comprises a quotient solving module, an intermediate modular multiplication calculation module and a summation module. The quotient solving module is used for calculating a quotient q obtained after the product of the polynomials A and B for modular multiplication is divided by an n-degree polynomial. The intermediate modular multiplication calculation module is used for calculating modular multiplication between the product AB of the polynomial A and the polynomial B and the polynomial to obtain an intermediate modular value (c+q). The input end of the summation module is connected with the output end of the intermediate modular multiplication calculation module and the output end of the quotient solving module, and the summation module is used for subtracting the quotient q from the intermediate modular value (c+q) to obtain a modular multiplication value c of the product AB of the polynomial A and the polynomial B relative to a polynomial f(x). Through the multiplier, a direct modular reduction step relative to the polynomial f(x) is avoided and fewer XOR gates and AND gates are needed on average, and therefore the space complexity of the multiplier is lowered under the condition that time complexity is not improved.
A document in the prior art, US 10 101 969 (B1), relates to a system including an integrated circuit (IC) configured to receive a multiplicand number, a multiplier number, and a modulus at one or more data inputs. The multiplicand and the multiplier numbers are partitioned into a plurality of multiplicand words with different specific widths. A plurality of outer loop iterations of an outer loop is performed to iterate through the plurality of the multiplicand words. Each outer loop iteration of the outer loop includes a plurality of inner loop iterations of an inner loop performed to iterate through the plurality of the multiplier words. A Montgomery product of the multiplicand number and the multiplier number with respect to the modulus is determined.
Objects of the Present Invention
Primary object of the disclosed invention is to present a low-latency redundant multiplier.
Another object of the disclosed invention is to present a low-latency modular multiplication means.
Another object of the disclosed invention is to present a method of modular multiplication marked by a very short circuit depth enabled by an optimal critical path.
Summary of the Present Invention
In the proposed invention, the primary focus of which is public key cryptography applications in decentralized systems such as randomness beacons, leader election in consensus protocols, proofs-of-replication and, more specifically, verifiable delay functions (VDFs), a computationally inexpensive architecture for modular multiplication is disclosed. Marked by a very low latency compared to the teachings and disclosures in the art, the present method is usable in exponentiation with a very high degree; it also obviates the need for full intermediate reduction and renders lazy reduction feasible.
To establish an efficient and fast integer modular multiplication framework, integers are represented as polynomials in a way such that any n-bit integer is expressible by a k-degree polynomial, with k = n/d, where d is the digit length. Every integer for modular multiplication thus becomes a polynomial of specified digit lengths (8-bit, 16-bit etc.). The invention then computes the multiplication of two integers offered in the form of polynomials. Polynomial coefficients may be one bit wider than the digit width, which makes the algorithm efficient. The critical path is therefore ultimately set by the digitwise polynomial coefficient computations, and hence by the chosen digit width.
To improve the critical path of the modular multiplication, disclosed invention offers a representation of modular subjects in the polynomial form that facilitates the modular multiplication operation. Polynomial multiplication centered architecture of the processing means is based on the width of the digits for conversion from very large integer to polynomial form, as digits pertain to coefficients of the polynomial form once conversion is computed. This induces a very low latency for redundant multiplication means compared to the state of the art, enabling great time advantage in public key cryptography based computation by alleviating the load.
Brief Description of the Figures of the Present Invention
Accompanying figures are given solely for the purpose of exemplifying a low latency redundant/modular multiplication architecture, whose advantages over prior art were outlined above and will be explained in brief hereinafter.
The figures are not meant to delimit the scope of protection as identified in the claims nor should they be referred to alone in an effort to interpret the scope identified in said claims without recourse to the technical disclosure in the description of the present invention.
Fig. 1 demonstrates the schoolbook multiplication algorithm for polynomials according to the disclosed invention.
Fig. 2 demonstrates the accumulation layout for 4 by 4 polynomial multiplier according to an embodiment of the disclosed invention.
Fig. 3 demonstrates the reduction of lower 8-bits of 16-bit digits of a polynomial with lookup tables according to an embodiment of the disclosed invention.
Fig. 4 demonstrates the reduction of higher 8-bits of 16-bit digits of a polynomial with lookup tables according to an embodiment of the disclosed invention.
Fig. 5 demonstrates the reduction of polynomial forms in accordance with Barrett procedure according to an embodiment of the disclosed invention.
Fig. 6 demonstrates the reduction of polynomial forms in accordance with Montgomery procedure according to an embodiment of the disclosed invention.
Detailed Description of the Present Invention
The present invention discloses a highly efficient and fast circuit implementation of a modular multiplication operation, with a method for the same. The disclosed invention is novel over the art in the sense that any very large integer that is subject to a modular multiplication operation is expressible in polynomial form, i.e. as an integer in d-bit digit form with every d-bit digit representing a polynomial coefficient. Polynomial multiplication therefore takes over the very-large-integer multiplication, improving the speed at which modular multiplication is handled.
In the disclosed invention, two n-bit integers A and B are accepted as input and the modular operation is executed in the following manner: C = A*B mod M (an n-bit integer). The algorithm in the disclosed invention represents the n-bit integer A as a k-degree polynomial A(x) and the n-bit integer B as a k-degree polynomial B(x) (k = n/d, where d is the digit length). Integers A and B are shown in Figure 1 in the form in which each is represented as a multiplicity of 16-bit digits. While in an embodiment of the disclosed invention digits are 16 bits in width, in other embodiments digits may be of a general d-bit width. It should be noted that, although the digits of an integer are d bits each, polynomial coefficients are allowed to grow to d+1 bits for efficiency reasons. This redundancy in representation is crucial to the method at hand, as will be explained hereinafter.
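The integer-to-polynomial step described above can be sketched in Python as follows. This is a minimal illustration and not the circuit of the disclosure: the function name `int_to_poly` is hypothetical, the coefficients are stored little-endian (index i holds the digit of weight 2^(d*i)), and k is taken as the number of d-bit digits.

```python
def int_to_poly(a: int, d: int, k: int) -> list[int]:
    """Split a non-negative integer into k little-endian d-bit digits.

    The digits are the coefficients of A(x), so that A(2**d) == a.
    (Hypothetical helper; the name and layout are assumptions.)
    """
    if a < 0 or a >> (d * k):
        raise ValueError("a does not fit in k digits of d bits")
    mask = (1 << d) - 1
    return [(a >> (d * i)) & mask for i in range(k)]


# Example: a 64-bit value split into four 16-bit digits (d = 16, k = 4).
coeffs = int_to_poly(0x0123456789ABCDEF, d=16, k=4)
print([hex(c) for c in coeffs])  # ['0xcdef', '0x89ab', '0x4567', '0x123']
```

Each coefficient here is strictly d bits wide; the extra (d+1)-th bit allowed by the redundant representation only comes into play for intermediate results.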
Conversion from redundant polynomial representation to integer representation is done as follows: a k-degree polynomial, denoted as C(x), with (d+1)-bit coefficients, and M, an n-bit modulus, are accepted as input. The output of this sub-algorithm is C, an n-bit integer. C is initialized to zero and, for each index i from zero to k (the degree of the polynomial C(x)), the coefficient at index i is shifted left by i*d bits, i.e. multiplied by 2^(i*d), and C is incremented by the shifted value. Thus, after index k, C mod M is obtained.
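A minimal Python sketch of this conversion is given below, assuming that the loop simply evaluates C(x) at x = 2^d and that a single final reduction modulo M is acceptable. The function name `poly_to_int_mod` is hypothetical.

```python
def poly_to_int_mod(coeffs: list[int], d: int, m: int) -> int:
    """Evaluate a little-endian coefficient list at x = 2**d, then reduce mod m.

    Coefficients may be wider than d bits (redundant form); the overlap is
    resolved by ordinary integer addition.
    """
    c = 0
    for i, c_i in enumerate(coeffs):
        c += c_i << (d * i)  # shift coefficient i to its weight 2**(d*i)
    return c % m


# Two different redundant representations of the same integer 0x2FFFF:
m = 1_000_003  # illustrative modulus, chosen arbitrarily
print(poly_to_int_mod([0x1FFFF, 0x1], d=16, m=m))  # 17-bit low coefficient
print(poly_to_int_mod([0x0FFFF, 0x2], d=16, m=m))  # same value, normalized
```

Both calls print the same residue, which illustrates why the representation is called redundant: several coefficient vectors map to one integer.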
A sub-algorithm called the schoolbook multiplication algorithm for polynomials accepts two such polynomials, A(x) and B(x), to be multiplied. Referring to Figure 2, both polynomials are shown in sigma notation from index i = 0 to k. The output is therefore C(x), the product of said two polynomials, itself a polynomial, straightforwardly computed. As an exemplary aspect of the disclosed invention, referring to Figure 2, a 4x4 polynomial multiplier is detailed in this section. In Figure 2, A3:A0 are 17-bit coefficients of a degree-3 polynomial A and B3:B0 are 17-bit coefficients of a degree-3 polynomial B. Each AiBj is a 34-bit number, which is the result of the multiplication Ai*Bj. Each column is accumulated together as shown in the Figure and there is no carry propagation between columns. Accumulation is realized using a carry-save adder tree. Each Si-Ci pair has a different length in theory. S0-C0 are 16-bit numbers, S1-C1 are 19-bit numbers, S2-C2 are 20-bit numbers, etc. For this specific example, 8 separate accumulation circuits are needed. Each of said accumulation circuits may have a different number of inputs.
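The column-wise accumulation without carry propagation between columns can be sketched in Python as below. This is an illustrative software model only: the names `schoolbook_columns` and `evaluate` are hypothetical, plain integer addition stands in for the carry-save adder tree, and the sketch returns len(a) + len(b) - 1 column sums without modelling how the accumulation circuits of Figure 2 are counted.

```python
def schoolbook_columns(a: list[int], b: list[int]) -> list[int]:
    """Schoolbook polynomial product: cols[t] = sum of a[i]*b[j] with i + j == t.

    Columns are kept independent: nothing is carried from one column
    into the next, so each column may exceed the digit width.
    """
    cols = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            cols[i + j] += a_i * b_j
    return cols


def evaluate(coeffs: list[int], d: int) -> int:
    """Integer value of a little-endian coefficient list at x = 2**d."""
    return sum(c << (d * i) for i, c in enumerate(coeffs))


# 4x4 example with 17-bit coefficients (values chosen arbitrarily).
a = [0x1FFFF, 0x00001, 0x12345, 0x0ABCD]
b = [0x10000, 0x0FFFF, 0x00002, 0x1BEEF]
cols = schoolbook_columns(a, b)
# The unnormalized columns still represent the integer product.
print(evaluate(cols, 16) == evaluate(a, 16) * evaluate(b, 16))  # True
```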
Reduction is also disclosed according to an embodiment of the present invention. Here, a polynomial W(x) of degree 2k is considered. A polynomial of degree 2k means that it has 2k coefficients, which calls for a reduction back to k coefficients. An n-bit number is representable with k coefficients; however, intermediate results are allowed to extend to k+1 coefficients to eliminate an extra level of reduction. Polynomial W(x) is converted back to an integer modulo M as follows:
$$W = \sum_{i=0}^{2k} W_i R^i \pmod{M}, \qquad R = 2^d$$
The coefficients with indices 0 to (k-1) do not need to be reduced, since they already exist in the (k+1)-coefficient redundant result. The coefficients with indices k to (2k+1) need to be reduced. According to an embodiment of the present invention, the modulo reduction of each coefficient is precomputed and stored in look-up tables (LUTs). Referring to Figure 3, the term for coefficient k is denoted as "W_k x^k". After integer conversion, the relation x^k = 2^(d*k) = 2^n holds. So, if the following expression is precomputed for each j in the range (0 : 2^(d+1) - 1):
$$P_k[j] = j \cdot 2^n \bmod M$$
a look-up table that consists of polynomials P_k[j](x) of degree k with d-bit coefficients at every index is obtained as follows:
[Figure imgf000010_0001: equation image defining P_k[j](x), not reproduced in the text]
Instead of utilizing look-up tables with (d+1)-bit input and n-bit output, the disclosed invention splits up each coefficient into 8-bit segments and reduces these segments separately. Figure 3 shows how the algorithm in the disclosed invention reduces the lower 8 bits. In one embodiment of the present invention, coefficients are arranged to be 17 bits and the polynomial is arranged to be of degree 128 (k=128, d=16).
Figure 4 shows how the higher 8-bit segments can be reduced using LUT structures identical to those for the lower 8-bit segments. This is an option for resource-limited environments. If there are enough resources, separate LUT structures can be built for reducing the higher 8-bit segments, decreasing the latency of the overall modular multiplication operation. The highest 1-bit segment of each coefficient can be reduced in the same manner as explained above.
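The segment-wise look-up table reduction can be sketched in Python as follows. Several simplifications are assumptions of this sketch rather than features of the disclosure: the table entries are stored as plain integers instead of polynomials, every segment position gets its own table (the "enough resources" option), the top 1-bit segment uses a full 256-entry table for uniformity, and the names `build_luts` and `lut_reduce` are hypothetical. The result is congruent to W modulo M but is not fully reduced below M.

```python
SEG = 8  # segment width in bits, matching the 8-bit split described above


def build_luts(m: int, d: int, k: int, n_coeffs: int, segs: int) -> dict:
    """luts[(i, s)][v] = (v * 2**(d*i + SEG*s)) mod m for coefficient index i >= k."""
    return {
        (i, s): [(v << (d * i + SEG * s)) % m for v in range(1 << SEG)]
        for i in range(k, n_coeffs)
        for s in range(segs)
    }


def lut_reduce(w: list[int], d: int, k: int, luts: dict, segs: int) -> int:
    """Return an integer congruent to W(2**d) modulo m (possibly larger than m)."""
    # Low coefficients are kept as they are.
    acc = sum(w_i << (d * i) for i, w_i in enumerate(w[:k]))
    # High coefficients are replaced, segment by segment, by table entries.
    for i in range(k, len(w)):
        assert w[i] < (1 << (SEG * segs)), "coefficient wider than the tables cover"
        for s in range(segs):
            v = (w[i] >> (SEG * s)) & ((1 << SEG) - 1)
            acc += luts[(i, s)][v]  # hardware would feed these to a carry-save tree
    return acc


# Illustrative parameters: d = 16, k = 4 (n = 64), seven 17-bit coefficients.
m, d, k = 2**64 - 59, 16, 4
w = [0x1ABCD, 0x00F0F, 0x12345, 0x0FFFF, 0x10001, 0x1FFFF, 0x0BEEF]
luts = build_luts(m, d, k, n_coeffs=len(w), segs=3)  # 3 segments cover 17 bits
r = lut_reduce(w, d, k, luts, segs=3)
exact = sum(w_i << (d * i) for i, w_i in enumerate(w)) % m
print(r % m == exact)  # True
```

Sharing one table structure between the lower and higher segments, as in the resource-limited option, would trade these extra tables for additional latency; that variant is not modelled here.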
Reduction, according to at least one embodiment of the disclosed invention, may be undertaken by way of Montgomery and Barrett reduction algorithms.
After multiplying each coefficient together, (n/d) subresults are obtained, which will be accumulated together. This accumulation may be done in redundant form, utilizing Wallace tree adder structures. The accumulation seen in Figure 2 is realized utilizing a Wallace tree adder, which dictates that the result is in carry-save redundant form. This enables very fast accumulation of a large number of numbers with very small circuit depth. Wallace tree adders provide a circuit depth of O(log n).
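The carry-save idea behind such adder trees can be illustrated with a short Python model. This is a simplified layered tree of 3:2 compressors and not an exact Wallace tree schedule; the names `csa` and `carry_save_tree` are hypothetical, and operands are assumed to be non-negative integers.

```python
def csa(x: int, y: int, z: int) -> tuple[int, int]:
    """3:2 compressor: returns (s, c) with s + c == x + y + z.

    Each bit position is handled independently, so no carry ripples
    across the word; the carries are simply shifted one position left.
    """
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def carry_save_tree(values: list[int]) -> tuple[int, int, int]:
    """Reduce many operands to a (sum, carry) pair; also report the layer count."""
    vals = list(values)
    levels = 0
    while len(vals) > 2:
        full = len(vals) - len(vals) % 3
        nxt = []
        for i in range(0, full, 3):       # compress each group of three into two
            nxt.extend(csa(vals[i], vals[i + 1], vals[i + 2]))
        nxt.extend(vals[full:])           # leftovers pass through to the next layer
        vals = nxt
        levels += 1
    vals += [0] * (2 - len(vals))
    return vals[0], vals[1], levels


operands = [0x3FFFF0001, 0x123456789, 0x0DEADBEEF, 0x2AAAAAAAA,
            0x155555555, 0x000000001, 0x3FFFFFFFF, 0x0F0F0F0F0]
s, c, levels = carry_save_tree(operands)
print(s + c == sum(operands))  # True; only this final addition propagates carries
```

Because each layer removes roughly a third of the operands, the number of layers grows logarithmically with the operand count, which is the property the paragraph above relies on.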
During reduction, after the results are retrieved from the look-up tables, multiple numbers need to be accumulated together. This accumulation can happen exactly as described hitherto, enabling very fast reduction of a large number of numbers with very small circuit depth.
According to one embodiment of the invention, instead of look-up table based reduction, Montgomery Reduction or Barrett Reduction algorithms may be utilized for the reduction of the 2n-bit number back to an n-bit number. Montgomery Reduction and Barrett Reduction algorithms may be applied to polynomials in the same manner as they are applied to integers. The Barrett and Montgomery algorithms are shown in Figures 5 and 6, respectively.
In a nutshell, the disclosed invention relates to a low-latency redundant multiplication method and modular multiplication means marked by efficient and fast implementation, where integers are represented as polynomials in a way such that any n-bit integer is expressible by a k-degree polynomial. Integers for modular multiplication are represented as polynomials of specified digit lengths (8-bit, 16-bit etc.), after which the multiplication of two integers offered in the form of polynomials is computed. The critical path of the modular multiplication is also greatly improved. The polynomial multiplication centered architecture of the processing means is based on the width of the digits for conversion from very large integer to polynomial form, as digits pertain to coefficients of the polynomial form once conversion is computed. This induces a very low latency for redundant multiplication means compared to the state of the art, enabling great time advantage in public key cryptography operations.
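For reference, the textbook integer forms of the two reduction algorithms named above can be sketched in Python as follows. These are the standard integer versions, not the polynomial variants of Figures 5 and 6, which are not reproduced in the text. The function names are hypothetical, the Montgomery variant assumes an odd modulus, and `pow(m, -1, r)` requires Python 3.8 or later.

```python
def barrett_setup(m: int) -> tuple[int, int]:
    """Precompute n = bit length of m and mu = floor(2**(2n) / m)."""
    n = m.bit_length()
    return n, (1 << (2 * n)) // m


def barrett_reduce(x: int, m: int, n: int, mu: int) -> int:
    """Return x mod m for 0 <= x < 2**(2n), using multiplications and shifts."""
    q = ((x >> (n - 1)) * mu) >> (n + 1)  # estimate of floor(x / m), never too large
    r = x - q * m
    while r >= m:                         # a small number of correction steps
        r -= m
    return r


def montgomery_setup(m: int, n: int) -> int:
    """Precompute m' = -m**(-1) mod 2**n (m must be odd)."""
    r = 1 << n
    return (-pow(m, -1, r)) % r


def montgomery_reduce(t: int, m: int, n: int, m_prime: int) -> int:
    """Return t * 2**(-n) mod m for 0 <= t < m * 2**n."""
    mask = (1 << n) - 1
    u = ((t & mask) * m_prime) & mask     # makes t + u*m divisible by 2**n
    r = (t + u * m) >> n
    return r - m if r >= m else r


m = 2**64 - 59                            # illustrative odd 64-bit modulus
a, b = 0x0123456789ABCDEF % m, 0xFEDCBA9876543210 % m
n, mu = barrett_setup(m)
print(barrett_reduce(a * b, m, n, mu) == (a * b) % m)  # True
m_prime = montgomery_setup(m, n)
expected = (a * b * pow(1 << n, -1, m)) % m
print(montgomery_reduce(a * b, m, n, m_prime) == expected)  # True
```

Note that Montgomery reduction returns the product scaled by 2^(-n), so in practice operands are kept in Montgomery form across a chain of multiplications.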
In one aspect of the present invention, a modular multiplication system for public key cryptography applications such as verifiable delay functions comprising a processing means is proposed.
In another aspect of the present invention, said processing means comprises at least one dedicated accumulation circuit configured for polynomial digitwise addition.
In another aspect of the present invention, said processing means further comprises a reduction mechanism configured for conversion to an integer form from a polynomial form.
In another aspect of the present invention, said processing means is configured to accept at least one integer.
In another aspect of the present invention, said processing means is configured to compute polynomial form of said at least one accepted integer according to a predetermined digit width.
In another aspect of the present invention, said processing means is configured to reduce lower half of polynomial digits according to a lookup table.
In another aspect of the present invention, said processing means is configured to reduce higher half of polynomial digits according to a lookup table.
In one aspect of the present invention, a modular multiplication method for public key cryptography applications such as verifiable delay functions is proposed.
In another aspect of the present invention, said method comprises a step of input accept, where at least one input for an integer for multiplication and one input for modulus are received.
In another aspect of the present invention, said method comprises a step of integer-to-polynomial, where said at least one input received for an integer are converted to polynomial representation according to a predetermined bit width.
In another aspect of the present invention, said method comprises a step of polynomial multiplication, where said at least one polynomial resulting from the previous step are multiplied based on polynomial digit addition.
In another aspect of the present invention, said method comprises a step of polynomial reduction, where end product of the previous step of multiplication of 2k digits is reduced to k digits.
In another aspect of the present invention, said method comprises a step of polynomial-to-integer, where the result of the reduction is converted back to an integer, modulo the said one input for modulus.

Claims

1) A modular multiplication system for public key cryptography applications such as verifiable delay functions comprising a processing means characterized in that;
said processing means comprises at least one dedicated accumulation circuit configured for polynomial digitwise addition; and,
said processing means further comprises a reduction mechanism configured for conversion to an integer form from a polynomial form.
2) A modular multiplication system for public key cryptography applications as set forth in Claim 1 characterized in that said processing means is configured to accept at least one integer.
3) A modular multiplication system for public key cryptography applications as set forth in Claim 1 characterized in that said processing means is configured to compute polynomial form of said at least one accepted integer according to a predetermined digit width.
4) A modular multiplication system for public key cryptography applications as set forth in any preceding Claim characterized in that said processing means is configured to reduce lower half of polynomial digits according to a lookup table.
5) A modular multiplication system for public key cryptography applications as set forth in any preceding Claim characterized in that said processing means is configured to reduce higher half of polynomial digits according to a lookup table.
6) A modular multiplication method for public key cryptography applications such as verifiable delay functions characterized in that said method comprises distinct steps of;
input accept, where at least one input for an integer for multiplication and one input for modulus are received;
integer-to-polynomial, where said at least one input received for an integer is converted to polynomial representation according to a predetermined bit width;
polynomial multiplication, where said at least one polynomial resulting from the previous step is multiplied based on polynomial digit addition;
polynomial reduction, where the end product of the previous step of multiplication, having 2k digits, is reduced to k digits; and,
polynomial-to-integer, where the result of the reduction is converted back to an integer, modulo the said one input for modulus.
PCT/TR2019/050331 2019-05-14 2019-05-14 A low-latency redundant multiplier and method for the same WO2020231353A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/TR2019/050331 WO2020231353A1 (en) 2019-05-14 2019-05-14 A low-latency redundant multiplier and method for the same

Publications (1)

Publication Number Publication Date
WO2020231353A1 (en) 2020-11-19

Family

ID=67145860

Country Status (1)

Country Link
WO (1) WO2020231353A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060164145A1 (en) * 2005-01-21 2006-07-27 Andrei Poskatcheev Method and apparatus for creating variable delay
US20140229716A1 (en) * 2012-05-30 2014-08-14 Intel Corporation Vector and scalar based modular exponentiation
CN107766032A (en) 2016-08-15 2018-03-06 清华大学 Polynomial basis GF (2^n) multiplier
US10101969B1 (en) 2016-03-21 2018-10-16 Xilinx, Inc. Montgomery multiplication devices

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
J-F DHEM: "Design of an efficient public-key cryptographic library for RISC-based smart cards", THESE SOUTENUE EN VUE DE L'OBTENTION DU GRADE DE DOCTEUR EN SCIENCES APPLIQUEES, UCL, FACULTÉ DES SCIENCES APPLIQUÉES, LOUVAIN-LA-NEUVE, BE, 1 May 1998 (1998-05-01), pages complete, XP002212065 *
PARHAMI B ED - SODERSTRAND M A ET AL: "Modular reduction by multi-level table lookup", CIRCUITS AND SYSTEMS, 1997. PROCEEDINGS OF THE 40TH MIDWEST SYMPOSIUM ON SACRAMENTO, CA, USA 3-6 AUG. 1997, NEW YORK, NY, USA, IEEE, US, vol. 1, 3 August 1997 (1997-08-03), pages 381 - 384, XP010272518, ISBN: 978-0-7803-3694-0, DOI: 10.1109/MWSCAS.1997.666114 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19735670; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 19735670; Country of ref document: EP; Kind code of ref document: A1)