CN113608718B - Method for realizing prime number domain large integer modular multiplication calculation acceleration - Google Patents


Publication number
CN113608718B
CN113608718B CN202110783676.2A CN202110783676A CN113608718B CN 113608718 B CN113608718 B CN 113608718B CN 202110783676 A CN202110783676 A CN 202110783676A CN 113608718 B CN113608718 B CN 113608718B
Authority
CN
China
Prior art keywords
segment
multiplicand
multiplier
multiplication
bits
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110783676.2A
Other languages
Chinese (zh)
Other versions
CN113608718A (en)
Inventor
郑昉昱
高莉莉
魏荣
马原
王跃武
范广
万立鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS
Priority to CN202110783676.2A
Publication of CN113608718A
Application granted
Publication of CN113608718B


Abstract

The invention discloses a method for accelerating large-integer modular multiplication over a prime field. The k-bit multiplicand and multiplier in the prime field are each divided into N segments, where each of the first (N-1) segments is w bits long, the N-th segment is r bits long, and w ≥ r. Each segment of the multiplicand and the multiplier is converted into a double-precision floating-point number; the converted segments are multiplied and added using fused multiply-add operations; 2N fixed-point numbers are initialized; the binary representations of the multiply-add results are accumulated into the initialized fixed-point numbers; and the fixed-point numbers are reduced in bit length to obtain the final modular multiplication result. The invention makes full use of the format of double-precision floating-point numbers and improves the efficiency of prime-field modular multiplication.

Description

Method for realizing prime number domain large integer modular multiplication calculation acceleration
Technical Field
The invention belongs to the field of computing technology and relates to a method for accelerating large-integer modular multiplication over a prime field.
Background
As computer technology develops rapidly, users place higher demands on privacy protection, and cryptography is widely applied in network communication. For example, Internet-based industries that serve large numbers of users, such as e-commerce and software distribution, achieve privacy protection and secure communication over the Internet through key agreement and digital signatures. Large-integer modular multiplication is the core computational load of many asymmetric cryptographic algorithms. The main computational load of ECC (Elliptic Curve Cryptography), a mainstream asymmetric cryptographic algorithm worldwide, is large-integer modular multiplication over a prime field. The speed of prime-field large-integer modular multiplication therefore directly affects the speed of key agreement and digital signatures, and research on high-performance implementations of it is very important.
GPUs (graphics processing units) are very efficient at computer graphics and image processing and are therefore well suited to floating-point arithmetic. The floating-point computing power of GPUs has increased more than tenfold over the past decade. In addition, the CUDA parallel computing framework developed by NVIDIA allows GPU computing resources, originally suited only to graphics processing, to be used to accelerate scientific computing as well. Many researchers have used GPU computing resources to accelerate mainstream cryptographic primitives. For example, Pan et al. accelerated ECDSA using the fixed-point computing power of GPUs, and Niall et al. accelerated RSA using the double-precision floating-point computing power of GPUs, with throughput reaching new peaks. To exploit the rapid growth of GPU floating-point capability, the invention combines fused multiply-add instructions on double-precision floating-point numbers with integer arithmetic instructions to accelerate large-integer modular multiplication over a prime field.
The basic data types of current computers have fixed word lengths, so a large integer cannot be represented directly by a single basic data type. Researchers generally split a large integer into parts, represent it with several values of a basic data type, and compute the large-integer modular multiplication using multi-precision arithmetic.
The double-precision floating-point format used by the invention conforms to the IEEE 754 floating-point standard. A floating-point number in IEEE 754 consists of a sign bit, an exponent and a mantissa, where the mantissa comprises 1 implicit bit and a fraction part of several bits. A double-precision floating-point number comprises a 1-bit sign, an 11-bit exponent, 1 implicit bit and a 52-bit fraction; the implicit bit is not stored in the computer. The sign bit and the exponent together occupy the top 12 bits of the 64-bit word.
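To make the layout concrete, the following minimal Python sketch unpacks the three stored fields of a double. The helper name `double_fields` is hypothetical and only serves this illustration; it relies on the standard `struct` module.

```python
import struct

def double_fields(x: float):
    """Return (sign, biased_exponent, fraction) of an IEEE 754 double."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63                    # 1 sign bit
    exponent = (bits >> 52) & 0x7FF      # 11-bit biased exponent (bias 1023)
    fraction = bits & ((1 << 52) - 1)    # 52 stored fraction bits
    return sign, exponent, fraction

# A value in [2^52, 2^53) has biased exponent 1023 + 52 = 0x433, and its
# fraction field is simply the integer offset from 2^52.
sign, exponent, fraction = double_fields(float(2**52 + 5))
print(sign, hex(exponent), fraction)     # 0 0x433 5
```

The constant 0x433 that appears later in the description is exactly this biased exponent.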
Disclosure of Invention
The invention provides a method for accelerating large-integer modular multiplication over a prime field, which makes full use of the double-precision floating-point capability of the computing resources and increases the speed of large-integer modular multiplication.
A method for accelerating prime-field large-integer modular multiplication comprises the following steps:
1) Dividing each of the k-bit large integers A and B defined on a prime field $F_p$ into N segments, wherein each of the first (N-1) segments is w bits, the N-th segment is r bits, and w ≥ r; wherein $p = 2^{k} - \sigma$ and σ is a prime number less than $2^{w}$;
2) Converting each segment of the multiplicand A and the multiplier B into a double-precision floating-point number; performing multiply-add operations on the converted multiplicand A and multiplier B using fused multiply-add, and converting the result into a fixed-point number R;
3) Dividing the fixed-point number R into 2N segments, and setting the segment length of the first (2N-1) segments of R to w bits while leaving the value of R unchanged; reducing R to an N-segment fixed-point number using multiplication and addition operations; then, using multiplication, addition and shift operations, subtracting the part of the reduced number that exceeds k bits, so that the reduced number becomes a k-bit fixed-point number;
4) Judging whether the reduced number is an integer in the selected prime field: if it is, the reduced number is the modular multiplication result of the large integers A and B; if it is not, the reduced number minus p is the modular multiplication result of the large integers A and B.
Here, "large integer" refers to an integer that cannot be represented by a single double-precision floating-point number.
Further, the number of segments of the multiplicand A and the multiplier B is $N = \lceil k/52 \rceil$, where 52 is the fraction length of the double-precision mantissa; the bit length w of the first (N-1) segments and the bit length r of the N-th segment satisfy the equation $(N-1) \times w + r = k$, and w - r is made as small as possible subject to 52 ≥ w ≥ r.
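The selection rule can be checked with a few lines of Python. This is a sketch under two assumptions: that the number of segments is ceil(k/52), and that "w - r as small as possible" means taking the smallest w with N·w ≥ k (since w - r = N·w - k). The function name `choose_params` is hypothetical.

```python
def choose_params(k: int, mantissa_bits: int = 52):
    """Pick (N, w, r) for a k-bit modulus: (N-1)*w + r = k, 52 >= w >= r."""
    n = -(-k // mantissa_bits)   # ceil(k / 52), assumed segment count
    w = -(-k // n)               # ceil(k / N): the smallest w with w >= r
    r = k - (n - 1) * w          # the last segment takes the remaining bits
    assert 0 < r <= w <= mantissa_bits, "no valid split for this k"
    return n, w, r

for k in (221, 222, 251, 255, 382, 383, 414):
    print(k, choose_params(k))
```

For the seven bit lengths listed in Table 1 of the description, this rule yields the same N, w and r as the table.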
Further, after segmenting the multiplicand and the multiplier, A[0:N-1] denotes the N segments, numbered 0 to (N-1), of the multiplicand A, and A'[0:N-1] is the floating-point form of A[0:N-1]; B[0:N-1] denotes the N segments, numbered 0 to (N-1), of the multiplier B, and B'[0:N-1] is the floating-point form of B[0:N-1].
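The segmentation and conversion can be sketched as follows. The sketch assumes that segment 0 holds the least significant w bits, which the text does not state explicitly, and the helper names `split_segments` and `join_segments` are hypothetical.

```python
def split_segments(x: int, n: int, w: int):
    """Split x into n segments; segments 0..n-2 hold w bits each (low first)."""
    mask = (1 << w) - 1
    low = [(x >> (w * t)) & mask for t in range(n - 1)]
    return low + [x >> (w * (n - 1))]   # the last segment keeps the top r bits

def join_segments(segs, w: int) -> int:
    """Inverse of split_segments."""
    return sum(s << (w * t) for t, s in enumerate(segs))

# Example with k = 221, N = 5, w = 45, r = 41.
a = (1 << 221) - 4
segs = split_segments(a, 5, 45)
floats = [float(s) for s in segs]       # exact: every segment is below 2^52
print(join_segments(segs, 45) == a, all(int(f) == s for f, s in zip(floats, segs)))
```

Because every segment is at most 52 bits wide, the conversion to a double loses no information.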
Further, performing the multiply-add operations on the converted multiplicand A and multiplier B using fused multiply-add comprises: first, initializing 2N fixed-point numbers, denoted R[0:2N-1]; second, following the segment-scanning order of the large-integer multiplication $\sum_{i,j} A'_i \cdot B'_j$, computing the multiply-add result $M_{ij}[0]$ of segment $A'_i$ of the multiplicand A', segment $B'_j$ of the multiplier B' and the addend C0, and then computing the multiply-add result $M_{ij}[1]$ of segment $A'_i$, segment $B'_j$ and the addend C1, where $0 \le i, j \le N-1$; letting conv_2_bin(x) denote the binary form of x, accumulating conv_2_bin($M_{ij}[0]$) into the fixed-point number R[i+j+1] and conv_2_bin($M_{ij}[1]$) into R[i+j].
Further, the 2N fixed-point numbers R[0:2N-1] are initialized as follows: R[t] = -[(t × (0x433 + w) + (t + 1) × 0x433) & 0xFFF] << 52 when t ∈ [0, N-1]; R[t] = -[((t + 1) × (0x433 + w) + t × 0x433) & 0xFFF] << 52 when t ∈ [N, 2N-1]. Here 0x433 is the hexadecimal form of the double-precision exponent bias 1023 plus 52, and 0xFFF is the hexadecimal form of $2^{12} - 1$.
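The purpose of the initialization can be illustrated with a small Python sketch. Each accumulated bit pattern carries an exponent field (0x433 + w for the high results, 0x433 for the low results) in its top 12 bits; starting each 64-bit accumulator at the negated sum of those fields, modulo $2^{64}$, leaves only the sum of the fraction fields after accumulation. The sketch counts the terms landing in each segment directly rather than using a closed form, and the names `init_accumulators` and `accumulate` are hypothetical.

```python
MASK64 = (1 << 64) - 1
EXP_LO = 0x433            # biased exponent of doubles in [2^52, 2^53)

def init_accumulators(n: int, w: int):
    """Pre-load 2n 64-bit accumulators with the negated exponent fields."""
    exp_hi = EXP_LO + w   # biased exponent of doubles in [2^(52+w), 2^(53+w))
    acc = []
    for t in range(2 * n):
        # Low results go to R[i+j], high results go to R[i+j+1].
        lo_terms = sum(1 for i in range(n) for j in range(n) if i + j == t)
        hi_terms = sum(1 for i in range(n) for j in range(n) if i + j + 1 == t)
        offset = (hi_terms * exp_hi + lo_terms * EXP_LO) & 0xFFF
        acc.append((-(offset << 52)) & MASK64)
    return acc

def accumulate(acc, t: int, pattern: int):
    """Add one 64-bit pattern to R[t], wrapping modulo 2^64."""
    acc[t] = (acc[t] + pattern) & MASK64
```

For t in [0, N-1] the counts are t high terms and t + 1 low terms, which agrees with the first expression in the text.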
Further, the value of the addend C0 is $2^{52+w}$, and the value of the addend C1 is $2^{52+w} + 2^{52} - M_{ij}[0]$.
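The effect of the two addends can be sketched in Python. Adding $2^{52+w}$ pushes the product into a binade whose unit in the last place is $2^{w}$, so the fraction field of $M_{ij}[0]$ holds the high part of the product; the second multiply-add then leaves the low part in the fraction field of $M_{ij}[1]$. The sketch assumes the fused multiply-add rounds toward zero (on CUDA this would presumably be the `__fma_rz` intrinsic), which the text does not state. It emulates that rounding with exact integers, and the names `fma_rz`, `bits` and `split_product` are hypothetical.

```python
import struct

def fma_rz(a: float, b: float, c: float) -> float:
    """Emulated a*b + c with one rounding toward zero (integer-valued inputs)."""
    exact = int(a) * int(b) + int(c)
    assert exact > 0
    shift = max(exact.bit_length() - 53, 0)   # keep 53 significant bits
    return float((exact >> shift) << shift)

def bits(x: float) -> int:
    """Binary form of a double as a 64-bit integer (conv_2_bin in the text)."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def split_product(a: int, b: int, w: int):
    """Return (hi, lo) with a*b == hi * 2^w + lo, for a, b < 2^w and w <= 52."""
    fa, fb = float(a), float(b)
    c0 = float(1 << (52 + w))
    m0 = fma_rz(fa, fb, c0)                   # fraction field = a*b >> w
    c1 = c0 + float(1 << 52) - m0             # exact in double arithmetic
    m1 = fma_rz(fa, fb, c1)                   # fraction field = a*b mod 2^w
    frac = (1 << 52) - 1
    return bits(m0) & frac, bits(m1) & frac

a, b = (1 << 45) - 1, (1 << 45) - 3
hi, lo = split_product(a, b, 45)
print(hi * (1 << 45) + lo == a * b)
```

In this sketch the exponent fields of the two results are 0x433 + w and 0x433 respectively, which are the constants used in the initialization of R.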
Further, the method for setting the segment length of the first (2N-1) segments of R to w bits is: $R_{t+1} = R_{t+1} + (R_t \gg w)$, t ∈ [0, 2N-2], where $R_t$ denotes the (t+1)-th segment of R and $R_{t+1}$ denotes the (t+2)-th segment of R.
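A minimal Python sketch of this carry propagation follows. The text gives only the update of $R_{t+1}$; masking $R_t$ to its low w bits afterwards is an assumption made here so that the value of R stays unchanged. The function name `normalize` is hypothetical.

```python
def normalize(segments, w: int):
    """Propagate carries so that every segment except the last fits in w bits."""
    r = list(segments)
    mask = (1 << w) - 1
    for t in range(len(r) - 1):
        r[t + 1] += r[t] >> w   # move the overflow of segment t upward
        r[t] &= mask            # assumed: keep only the low w bits of segment t
    return r

raw = [(1 << 60) + 5, 7, 1 << 50]
out = normalize(raw, 45)
value = lambda segs: sum(s << (45 * t) for t, s in enumerate(segs))
print(value(raw) == value(out), [s.bit_length() for s in out])
```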
Further, reducing R to an N-segment fixed-point number by multiplication and addition operations comprises: [formula missing in source]. After the reduction, the value range of the reduced number is $[0,\ 2^{k} + \sigma \cdot 2^{\mathrm{digit}-r})$, where digit is the bit length of a double-precision floating-point number. Because A and B are large integers, the bit length k is far greater than the bit length of a double-precision floating-point number, so [formula missing in source], i.e. [formula missing in source].
Further, the N segments, numbered 0 to (N-1), of the reduced number are considered, and [expression missing in source] is denoted carry; from the value range of the reduced number, the value of carry is 0 or 1. Using multiplication, addition and shift operations to subtract the part of the reduced number that exceeds k bits, so that it becomes a k-bit fixed-point number, comprises: first letting [formula missing in source], where $\mathrm{mask}_r = 2^{r} - 1$; then, for t ∈ [0, N-2], letting [formula missing in source]. The value range of the reduced number after the carry subtraction is as follows: when carry is 0, [range missing in source]; when carry is 1, [range missing in source]. Since σ is a small prime number and digit is much smaller than k, the value range after the carry subtraction can be unified as $[0,\ 2^{k} - 1]$.
Further, if the reduced number is less than the prime p, the reduced number is the result of multiplying the large integers A and B modulo p; if the reduced number is greater than the prime p, the reduced number minus p is the result of multiplying the large integers A and B modulo p.
Compared with the prior art, the invention has the following positive effects:
When computing large-integer modular multiplication over a prime field, the invention first splits the multiplicand and the multiplier and converts them into several double-precision floating-point values, making full use of the fraction part of the double-precision mantissa during conversion. The method implements prime-field large-integer modular multiplication with floating-point instructions; it is novel in concept and computationally efficient, makes maximal use of the computer's double-precision floating-point storage format, and increases the speed of large-integer modular multiplication.
Drawings
FIG. 1 is a flow chart of a method for accelerating the large integer modular multiplication calculation in the prime number domain by using floating point number calculation instructions.
Detailed Description
The technical scheme of the present invention will be described in detail, but the scope of the present invention is not limited to the embodiments.
For a given prime field $F_p$ with $p = 2^{221} - 3$, let A and B be large integers in $F_p$. When computing A multiplied by B modulo p, the method for accelerating prime-field large-integer modular multiplication using floating-point instructions mainly comprises the following steps:
1) Dividing the 221-bit multiplicand A and multiplier B into N segments each, where N = 5; each of the first 4 segments is 45 bits and the 5th segment is 41 bits;
2) After segmenting the multiplicand and the multiplier, A[0:4] denotes the 5 segments, numbered 0 to 4, of the multiplicand A, and B[0:4] denotes the 5 segments, numbered 0 to 4, of the multiplier B. Each segment of A[0:4] is converted to double-precision floating-point form, denoted A'[0:4], and each segment of B[0:4] is converted to double-precision floating-point form, denoted B'[0:4].
3) Following the segment-scanning order of the large-integer multiplication $\sum_{i,j} A'_i \cdot B'_j$, i, j ∈ [0, 4], first compute the multiply-add result $M_{ij}[0]$ of segment $A'_i$ of the multiplicand A', segment $B'_j$ of the multiplier B' and the addend C0, where $C0 = 2^{97}$; then compute the multiply-add result $M_{ij}[1]$ of segment $A'_i$, segment $B'_j$ and the addend C1, where $C1 = 2^{97} + 2^{52} - M_{ij}[0]$.
4) Initialize a fixed-point number R and divide it into 2N segments, denoted R[0:2N-1]. R[0:2N-1] is initialized as follows: [formula missing in source]
5) Let conv_2_bin(x) denote the binary form of x; accumulate conv_2_bin($M_{ij}[0]$) into the fixed-point number R[i+j+1] and conv_2_bin($M_{ij}[1]$) into R[i+j].
6) Set the segment length of the first 9 segments of R[0:9] to 45 bits, as follows:
$R_{t+1} = R_{t+1} + (R_t \gg 45)$, t ∈ [0, 8]
7) Reduce the 10-segment fixed-point number R to a 5-segment fixed-point number by multiplication and addition operations. The reduced number is calculated as follows: [formula missing in source], i.e. [formula missing in source].
After the reduction, the value range of the reduced number is $[0,\ 2^{221} + 3 \cdot 2^{23})$.
8) The N segments, numbered 0 to (N-1), of the reduced number are considered, and the upper 23 bits of [expression missing in source] are denoted carry; from the value range of the reduced number in step 7), the value of carry is 0 or 1. The N-segment fixed-point number is reduced to 221 bits using multiplication, addition and shift operations: let [formula missing in source]; then, for t ∈ [0, 3], let [formula missing in source]. After the carry operation, [range missing in source].
9) Judge whether the reduced number is smaller than the prime p: if it is smaller than p, the reduced number is the result of multiplying the large integers A and B modulo p; if it is greater than p, the reduced number minus p is the result of multiplying the large integers A and B modulo p.
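The overall data flow of steps 1) to 9) can be sketched in Python with exact integers standing in for the floating-point high/low split. This is an illustration under stated assumptions, not the patented procedure: the folding constant is derived here from $2^{Nw} = 2^{k} \cdot 2^{w-r} \equiv \sigma \cdot 2^{w-r} \pmod p$, the carry handling is collapsed into a loop on a single integer instead of segment-wise operations, and the function name `modmul_sketch` is hypothetical.

```python
def modmul_sketch(a: int, b: int, k: int, sigma: int, n: int, w: int) -> int:
    """Compute a*b mod (2^k - sigma) by segment-wise multiply, fold and reduce."""
    r_bits = k - (n - 1) * w
    p = (1 << k) - sigma
    mask_w = (1 << w) - 1

    def seg(x):
        # Steps 1-2: n segments, least significant first.
        return [(x >> (w * t)) & mask_w for t in range(n - 1)] + [x >> (w * (n - 1))]

    sa, sb = seg(a), seg(b)

    # Steps 3-5: accumulate the low half of each product into R[i+j] and the
    # high half into R[i+j+1] (integers stand in for the two FMA results).
    acc = [0] * (2 * n)
    for i in range(n):
        for j in range(n):
            prod = sa[i] * sb[j]
            acc[i + j] += prod & mask_w
            acc[i + j + 1] += prod >> w

    # Step 6: make the first 2n-1 segments exactly w bits wide.
    for t in range(2 * n - 1):
        acc[t + 1] += acc[t] >> w
        acc[t] &= mask_w

    # Step 7 (assumed form): fold the upper n segments onto the lower n.
    fold = sigma << (w - r_bits)
    red = [acc[t] + fold * acc[t + n] for t in range(n)]

    # Step 8 (simplified): remove everything above bit k using 2^k = sigma mod p.
    value = sum(s << (w * t) for t, s in enumerate(red))
    while value >> k:
        value = (value & ((1 << k) - 1)) + sigma * (value >> k)

    # Step 9: one conditional subtraction; >= also covers value == p.
    return value - p if value >= p else value

p = (1 << 221) - 3
print(modmul_sketch(p - 1, p - 2, 221, 3, 5, 45))   # 2
```

The result can be compared against Python's built-in `(a * b) % p` for any inputs below p.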
Finally, the relevant parameters of the method were calculated for 7 prime fields commonly used in cryptography, giving Table 1 below.
TABLE 1 Number of segments and segment lengths selected for commonly used prime fields
p            k    σ    N  w   r
2^221 - 3    221  3    5  45  41
2^222 - 117  222  117  5  45  42
2^251 - 9    251  9    5  51  47
2^255 - 19   255  19   5  51  51
2^382 - 105  382  105  8  48  46
2^383 - 187  383  187  8  48  47
2^414 - 17   414  17   8  52  50
Based on the same inventive concept, another embodiment of the present invention provides an asymmetric cryptographic method that includes prime-field large-integer modular multiplication, wherein the prime-field large-integer modular multiplication is computed using the method of the present invention.
Based on the same inventive concept, another embodiment of the present invention provides an electronic device (computer, server, smartphone, etc.) comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for performing the steps of the method of the invention.
Based on the same inventive concept, another embodiment of the present invention provides a computer readable storage medium (e.g., ROM/RAM, magnetic disk, optical disk) storing a computer program which, when executed by a computer, implements the steps of the inventive method.
The above-disclosed embodiments of the present invention are intended to aid in understanding the contents of the present invention and to enable the same to be carried into practice, and it will be understood by those of ordinary skill in the art that various alternatives, variations and modifications are possible without departing from the spirit and scope of the invention. The invention should not be limited to what has been disclosed in the examples of the specification, but rather by the scope of the invention as defined in the claims.

Claims (9)

1. An asymmetric cryptographic method comprising prime-field large-integer modular multiplication, wherein the prime-field large-integer modular multiplication is computed by the following steps:
1) A and B are large integers defined on a prime field $F_p$, where $p = 2^{k} - \sigma$ and σ is a prime number less than $2^{w}$; dividing the k-bit multiplicand A and multiplier B into N segments each, wherein each of the first (N-1) segments is w bits, the N-th segment is r bits, and w ≥ r;
2) Converting each segment of the multiplicand A and the multiplier B into a double-precision floating-point number; performing multiply-add operations on the converted multiplicand A and multiplier B using fused multiply-add, and converting the result into a fixed-point number R;
3) Dividing the fixed-point number R into 2N segments, and setting the segment length of the first (2N-1) segments of R to w bits while leaving the value of R unchanged; reducing R to an N-segment fixed-point number using multiplication and addition operations; then, using multiplication, addition and shift operations, subtracting the part of the reduced number that exceeds k bits, so that the reduced number becomes a k-bit fixed-point number;
4) Judging whether the reduced number is an integer in the selected prime field: if it is, the reduced number is the modular multiplication result of the large integers A and B; if it is not, the reduced number minus p is the modular multiplication result of the large integers A and B;
wherein the number of segments of the multiplicand A and the multiplier B is $N = \lceil k/52 \rceil$, where 52 is the fraction length of the double-precision mantissa; the bit length w of the first (N-1) segments and the bit length r of the N-th segment of the multiplicand A and the multiplier B satisfy the equation $(N-1) \times w + r = k$, and w - r is made as small as possible subject to 52 ≥ w ≥ r.
2. The method of claim 1, wherein, after segmenting the multiplicand and the multiplier, A[0:N-1] denotes the N segments, numbered 0 to (N-1), of the multiplicand A, and A'[0:N-1] is the floating-point form of A[0:N-1]; B[0:N-1] denotes the N segments, numbered 0 to (N-1), of the multiplier B, and B'[0:N-1] is the floating-point form of B[0:N-1].
3. The method of claim 2, wherein performing the multiply-add operations on the converted multiplicand A and multiplier B using fused multiply-add comprises: first, initializing a fixed-point number R and dividing it into 2N segments, denoted R[0:2N-1]; second, following the segment-scanning order of the large-integer multiplication $\sum_{i,j} A'_i \cdot B'_j$, computing the multiply-add result $M_{ij}[0]$ of segment $A'_i$ of the multiplicand A', segment $B'_j$ of the multiplier B' and the addend C0, and then computing the multiply-add result $M_{ij}[1]$ of segment $A'_i$, segment $B'_j$ and the addend C1, where $0 \le i, j \le N-1$; letting conv_2_bin(x) denote the binary form of x, accumulating conv_2_bin($M_{ij}[0]$) into the fixed-point number R[i+j+1] and conv_2_bin($M_{ij}[1]$) into R[i+j].
4. The method of claim 3, wherein initializing the fixed-point number R comprises: R[t] = -[(t × (0x433 + w) + (t + 1) × 0x433) & 0xFFF] << 52 when t ∈ [0, N-1]; R[t] = -[((t + 1) × (0x433 + w) + t × 0x433) & 0xFFF] << 52 when t ∈ [N, 2N-1].
5. The method of claim 3, wherein the addend C0 has the value $2^{52+w}$ and the addend C1 has the value $2^{52+w} + 2^{52} - M_{ij}[0]$.
6. The method according to claim 1 or 5, wherein the method for setting the segment length of the first (2N-1) segments of R to w bits is: $R_{t+1} = R_{t+1} + (R_t \gg w)$, t ∈ [0, 2N-2], where $R_t$ denotes the (t+1)-th segment of R and $R_{t+1}$ denotes the (t+2)-th segment of R.
7. The method of claim 6, wherein reducing R to an N-segment fixed-point number using multiplication and addition operations comprises: [formula missing in source]. After the reduction, the value range of the reduced number is $[0,\ 2^{k} + \sigma \cdot 2^{\mathrm{digit}-r})$, where digit is the bit length of a double-precision floating-point number; because A and B are large integers, the bit length k is far greater than the bit length of a double-precision floating-point number, so [formula missing in source], i.e. [formula missing in source].
8. The method of claim 7, wherein the N segments, numbered 0 to (N-1), of the reduced number are considered, and [expression missing in source] is denoted carry; from the value range of the reduced number, the value of carry is 0 or 1; using multiplication, addition and shift operations to subtract the part of the reduced number that exceeds k bits, so that it becomes a k-bit fixed-point number, comprises: first letting [formula missing in source], where $\mathrm{mask}_r = 2^{r} - 1$; then, for t ∈ [0, N-2], letting [formula missing in source]; the value range of the reduced number after the carry subtraction is as follows: when carry is 0, [range missing in source]; when carry is 1, [range missing in source]; since σ is a small prime number and digit is much smaller than k, the value range after the carry subtraction can be unified as $[0,\ 2^{k} - 1]$.
9. The method of claim 8, wherein, if the reduced number is less than the prime p, the reduced number is the result of multiplying the large integers A and B modulo p; if the reduced number is greater than the prime p, the reduced number minus p is the result of multiplying the large integers A and B modulo p.
CN202110783676.2A 2021-07-12 Method for realizing prime number domain large integer modular multiplication calculation acceleration Active CN113608718B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110783676.2A CN113608718B (en) 2021-07-12 Method for realizing prime number domain large integer modular multiplication calculation acceleration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110783676.2A CN113608718B (en) 2021-07-12 Method for realizing prime number domain large integer modular multiplication calculation acceleration

Publications (2)

Publication Number Publication Date
CN113608718A CN113608718A (en) 2021-11-05
CN113608718B true CN113608718B (en) 2024-06-25


Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2327924A1 (en) * 2000-12-08 2002-06-08 Ibm Canada Limited-Ibm Canada Limitee Processor design for extended-precision arithmetic

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"High-performance cryptographic computing based on GPU"; 郑昉昱; Journal of Information Security Research (《信息安全研究》); 20190131; Vol. 5 (No. 1); pp. 88-94 *

Similar Documents

Publication Publication Date Title
US9519460B1 (en) Universal single instruction multiple data multiplier and wide accumulator unit
EP3853713A1 (en) Multiply and accumulate circuit
US20210349692A1 (en) Multiplier and multiplication method
US10949168B2 (en) Compressing like-magnitude partial products in multiply accumulation
CN106951211A (en) A kind of restructural fixed and floating general purpose multipliers
US5796645A (en) Multiply accumulate computation unit
CN110888623A (en) Data conversion method, multiplier, adder, terminal device and storage medium
CN113608718B (en) Method for realizing prime number domain large integer modular multiplication calculation acceleration
CN117155572A (en) Method for realizing large integer multiplication in cryptographic technology based on GPU (graphics processing Unit) parallel
CN117420982A (en) Chip comprising a fused multiply-accumulator, device and control method for data operations
CN112558920A (en) Signed/unsigned multiply-accumulate device and method
Mahitha et al. A low power signed redundant binary vedic multiplier
CN113672196B (en) Double multiplication calculating device and method based on single digital signal processing unit
Hung et al. Fast RNS division algorithms for fixed divisors with application to RSA encryption
CN113608718A (en) Method for realizing acceleration of prime number domain large integer modular multiplication calculation
CN112906863B (en) Neuron acceleration processing method, device, equipment and readable storage medium
Kalaiselvi et al. A modular technique of Booth encoding and Vedic multiplier for low-area and high-speed applications
CN117908835B (en) Method for accelerating SM2 cryptographic algorithm based on floating point number computing capability
JPH0793134A (en) Multiplier
CN111630509A (en) Arithmetic circuit
CN114531241B (en) Data encryption method and device, electronic equipment using data encryption method and storage medium
CN116048455B (en) Insertion type approximate multiplication accumulator
CN113625989B (en) Data operation device, method, electronic device, and storage medium
US20230259581A1 (en) Method and apparatus for floating-point data type matrix multiplication based on outer product
CN111752532B (en) Method, system and device for realizing 32-bit integer division with high precision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant